Tech Showcase: HDC and SDM Information Retrieval with Corruption

At Transparent AI, we're constantly exploring approaches to make AI more explainable, efficient, and human-like in its capabilities. In this blog post, we dive into a fascinating model called Sparse Distributed Memory (SDM) and demonstrate how it can mimic human-like pattern recognition and recall abilities.

Introduction to Sparse Distributed Memory, Vector Symbolic Architectures, and Hyperdimensional Computing

Sparse Distributed Memory, first proposed by Pentti Kanerva in the 1980s, represents a fundamentally different approach to information storage and retrieval compared to traditional computing architectures. Rather than storing information in precise locations, SDM distributes data across thousands of "hard locations" in a high-dimensional space.

This approach falls within the broader frameworks of Vector Symbolic Architectures (VSA) and Hyperdimensional Computing (HDC), which use high-dimensional vectors (typically in the hundreds or thousands of dimensions) to represent and manipulate information. These models are particularly intriguing because they:

  1. Share properties with human memory, including graceful degradation, pattern completion, and association

  2. Allow for efficient parallel implementation in hardware

  3. Exhibit remarkable robustness to noise and partial information

The key insight is that in high-dimensional spaces, random points tend to be far apart from each other. This mathematical property allows for significant noise tolerance while maintaining the ability to recognize patterns and associations.
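You can see this concentration effect in a few lines of NumPy. The sketch below is purely illustrative (it is not the code behind the demos later in this post): it compares the Hamming distance between two unrelated random binary vectors with the distance between a vector and a 20%-corrupted copy of itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000  # dimensionality of the binary vectors

# Two unrelated random vectors differ in roughly half of their bits
a = rng.integers(0, 2, n)
b = rng.integers(0, 2, n)
print("unrelated:", np.sum(a != b) / n)          # ~0.50

# A 20%-corrupted copy stays much closer to the original
noisy = a.copy()
flip = rng.choice(n, size=int(0.2 * n), replace=False)
noisy[flip] ^= 1
print("20% noisy copy:", np.sum(a != noisy) / n)  # ~0.20
```

Because unrelated vectors cluster tightly around 50% disagreement, a pattern with 20% of its bits flipped is still unambiguously closer to its original than to anything else in memory.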

Benefits and Unique Characteristics of Sparse Distributed Memory

SDM has several remarkable characteristics that make it particularly suitable for cognitive models and certain AI applications:

Robustness to Noise: Perhaps the most compelling feature of SDM is its ability to handle noisy or corrupted data. Unlike traditional computing models where a single bit error can cause failure, SDM degrades gracefully as noise increases.

Pattern Completion: SDM can "fill in the blanks" when given partial information, similar to how humans can recognize a partially obscured face or complete a familiar saying from just a few words.

Distributed Representation: Information is stored across many locations rather than at a single address, creating redundancy that provides robustness.

Associative Retrieval: SDM allows for content-addressable memory, where patterns can be retrieved by presenting similar or associated patterns rather than exact addresses.

Biological Plausibility: The architecture of SDM shares remarkable similarities with structures in the brain, particularly the cerebellum, making it interesting for cognitive modeling.
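To make these mechanics concrete, here is a deliberately minimal SDM sketch in Python/NumPy. This is not the implementation behind the demonstrations below; the class name and parameters (number of hard locations, dimensionality, activation radius) are our own illustrative choices. It captures the core Kanerva recipe: fixed random hard-location addresses, an activation radius measured in Hamming distance, signed counters for storage, and a bitwise majority vote for readout.

```python
import numpy as np

class SparseDistributedMemory:
    """Minimal Kanerva-style SDM: binary addresses and data, counter storage."""

    def __init__(self, n_locations=1000, dim=256, activation_radius=111, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random hard-location addresses in {0, 1}^dim
        self.addresses = rng.integers(0, 2, (n_locations, dim), dtype=np.int8)
        # One signed counter per (location, bit)
        self.counters = np.zeros((n_locations, dim), dtype=np.int32)
        self.radius = activation_radius

    def _activated(self, address):
        # A hard location fires if its address is within the Hamming radius of the cue
        dists = np.sum(self.addresses != address, axis=1)
        return dists <= self.radius

    def write(self, address, data):
        # Add +1 for stored 1-bits and -1 for stored 0-bits at every active location
        active = self._activated(address)
        self.counters[active] += np.where(data == 1, 1, -1).astype(np.int32)

    def read(self, address):
        # Sum counters over the active locations, then take a bitwise majority vote
        active = self._activated(address)
        sums = self.counters[active].sum(axis=0)
        return (sums > 0).astype(np.int8), int(active.sum())
```

Writing a pattern autoassociatively (using the pattern as its own address) is what produces the prototype-formation and noise-cleanup behavior shown in the first demonstration below; writing one pattern under another's address gives the sequence behavior shown in the second.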

Replicating Kanerva's Original Demonstrations

To explore the capabilities of SDM, we first implemented two classic demonstrations from Kanerva's original work: prototype construction from noisy examples and sequence recall.

Prototype Construction (Circle Example)

In this demonstration, we stored nine noisy versions of a circle pattern, each with 20% of its bits randomly flipped. We then presented a new noisy pattern (also with 20% noise) as a retrieval cue.
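The listing below sketches how such an experiment might be set up using the SparseDistributedMemory class from the previous section. It is an illustrative reconstruction, not the code that produced the numbers that follow: the stand-in random "circle" pattern, the add_noise helper, and the parameters are assumptions for the sake of a compact example.

```python
import numpy as np

rng = np.random.default_rng(1)

def add_noise(pattern, fraction, rng):
    """Flip a given fraction of the bits of a binary pattern."""
    noisy = pattern.copy()
    flip = rng.choice(pattern.size, size=int(fraction * pattern.size), replace=False)
    noisy[flip] ^= 1
    return noisy

dim = 256
circle = rng.integers(0, 2, dim, dtype=np.int8)   # stand-in for the circle bitmap
sdm = SparseDistributedMemory(dim=dim)            # class sketched earlier

# Store nine noisy exemplars autoassociatively (address == data)
for _ in range(9):
    exemplar = add_noise(circle, 0.20, rng)
    sdm.write(exemplar, exemplar)

# Iterative retrieval: feed each read result back in as the next cue
cue = add_noise(circle, 0.20, rng)
for i in range(3):
    cue, n_active = sdm.read(cue)
    noise = np.mean(cue != circle)
    print(f"Iteration {i + 1}: activated {n_active} locations, {noise:.1%} noise")
```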

The results were striking:

Created a new noisy test pattern (20% noise)

Performing iterative retrieval...

Iteration 1: activated 32 locations, 5.1% noise

Iteration 2: activated 28 locations, 2.7% noise

Iteration 3: activated 35 locations, 1.6% noise

On the first retrieval, the noise level dropped dramatically from 20% to just 5.1%. After three iterations, the noise was reduced to a mere 1.6%, effectively reconstructing a clean prototype from noisy inputs. This demonstrates SDM's ability to form "concepts" from noisy examples, a key feature of human cognition. The system associated the noisy test cue with progressively cleaner patterns, iteratively searching the memory until it converged on a pattern almost identical to the original. This is much like human memory, where we start with a vague impression and the recollection becomes clearer the more we think about it.

Sequence Recall (Roman Numerals Example)

For our second demonstration, we encoded a sequence of Roman numerals (I through VI)* and linked them in order. When presented with a noisy version of numeral III (with 30% of its bits flipped), the system was able to recall the rest of the sequence perfectly:

Created noisy version of numeral 3 (30% noise) as retrieval cue

Retrieving the sequence...

Retrieved numeral 4: activated 40 locations, 0.0% noise

Retrieved numeral 5: activated 23 locations, 0.0% noise

Retrieved numeral 6: activated 16 locations, 0.0% noise

This showcases SDM's ability to store and recall sequences, similar to how humans can continue a song after hearing just a snippet of it or recite a memorized sequence after being prompted with the beginning.

*Here's an interesting artifact that we decided to leave in for educational purposes. We use an LLM to help with visualizations and with writing these blog posts. Here you can see that the IV, V, and VI Roman numerals are upside down. No matter how hard we tried, and even after multiple iterations, the LLM could not figure out how to make right-side-up Roman numerals. This is an interesting example of how, even though LLMs are powerful tools, they still have limitations in spatial and visual reasoning.

Starting with a very noisy Roman numeral III, the system was still able to finish the sequence, recovering each subsequent numeral in the chain with zero error.
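The sketch below shows how sequence storage differs from the autoassociative case: each numeral's pattern is written heteroassociatively under the previous numeral's pattern. It reuses the illustrative SparseDistributedMemory class and add_noise helper from the earlier sketches, with random stand-in patterns rather than the actual numeral bitmaps, so it is a reconstruction of the idea rather than our demo code.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 256
sdm = SparseDistributedMemory(dim=dim)

# Stand-in patterns for numerals I..VI (the real demo uses bitmap images)
numerals = [rng.integers(0, 2, dim, dtype=np.int8) for _ in range(6)]

# Heteroassociative storage: each numeral's pattern is written under
# the previous numeral's pattern as the address
for prev, nxt in zip(numerals[:-1], numerals[1:]):
    sdm.write(prev, nxt)

# Cue with a 30%-corrupted numeral III and walk the chain forward
cue = add_noise(numerals[2], 0.30, rng)   # add_noise from the previous sketch
for k in range(3, 6):
    cue, n_active = sdm.read(cue)
    noise = np.mean(cue != numerals[k])
    print(f"Retrieved numeral {k + 1}: activated {n_active} locations, {noise:.1%} noise")
```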

Extending to Natural Language: Phrase Completion

While the original demonstrations are powerful, we wanted to extend the concept to a more relatable example: phrase completion in natural language. Most of us have experienced the phenomenon of automatically completing a familiar phrase when given just the beginning. For example, when someone says "the quick brown fox...", many people will immediately think "...jumps over a lazy dog."

We implemented this by:

  1. Converting text fragments into binary patterns using a hash-based approach (sketched after this list)

  2. Storing associations between words, word pairs, and larger fragments

  3. Implementing a retrieval mechanism that could find the closest match to partially retrieved patterns
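A minimal version of the encoding in step 1 could look like the following. The helper names (encode_token, similarity) and the dimensionality are our own illustrative assumptions rather than the post's actual encoder, but the idea is the same: hash each token to seed a pseudo-random binary vector, so the same token always maps to the same high-dimensional pattern while different tokens map to nearly orthogonal ones.

```python
import hashlib
import numpy as np

def encode_token(token, dim=1024):
    """Map a text token to a reproducible pseudo-random binary vector."""
    # Derive a deterministic seed from the token's hash
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "little")
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2, dim, dtype=np.int8)

def similarity(a, b):
    """Fraction of matching bits (1.0 = identical, ~0.5 = unrelated)."""
    return float(np.mean(a == b))

print(similarity(encode_token("quick"), encode_token("quick")))  # 1.0
print(similarity(encode_token("quick"), encode_token("brown")))  # ~0.5
```

Word pairs and larger fragments can be encoded the same way by hashing the whole fragment, and the phrase itself can then be stored as a chain of heteroassociative writes from each fragment's pattern to the next word's pattern, just as in the Roman numeral demonstration.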

The results demonstrated SDM's remarkable ability to mimic human phrase completion:

Test Case 1: Starting with 'the quick'

Retrieved: 'brown' (similarity: 100.0%)

Retrieved: 'fox' (similarity: 100.0%)

Retrieved: 'jumps' (similarity: 100.0%)

Retrieved: 'over' (similarity: 100.0%)

Retrieved: 'a' (similarity: 100.0%)

Retrieved: 'lazy' (similarity: 100.0%)

Retrieved: 'dog' (similarity: 100.0%)

Reconstructed: "the quick brown fox jumps over a lazy dog"

Recovery: 100.0% of original phrase words

Starting with just "the quick", the system successfully retrieved the entire phrase with 100% accuracy. Even more impressively, the system could handle partial and noisy fragments:

Test Case 4: Starting with 'the qu..k bro.n'

(Interpreted as noisy version of 'the quick brown')

Retrieved: 'fox' (similarity: 100.0%)

Retrieved: 'jumps' (similarity: 100.0%)

Retrieved: 'over' (similarity: 100.0%)

Retrieved: 'a' (similarity: 100.0%)

Retrieved: 'lazy' (similarity: 100.0%)

Retrieved: 'dog' (similarity: 100.0%)

Reconstructed: "the qu..k bro.n fox jumps over a lazy dog"

Recovery: 77.8% of original phrase words

Despite the deliberately corrupted input ("the qu..k bro.n"), the system was able to continue the phrase correctly, recovering 77.8% of the original words. This mimics how humans can still understand and complete phrases even when they're partially obscured or contain typos.
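The cleanup behind step 3 can be as simple as comparing a retrieved, possibly noisy pattern against the encodings of every known vocabulary word and keeping the best match. Below is a hedged sketch of that idea, reusing the illustrative encode_token and similarity helpers from above; the closest_word helper is hypothetical, not part of our actual retrieval code.

```python
import numpy as np

def closest_word(pattern, vocabulary, dim=1024):
    """Return the vocabulary word whose encoding best matches the pattern."""
    best_word, best_sim = None, -1.0
    for word in vocabulary:
        sim = similarity(pattern, encode_token(word, dim))
        if sim > best_sim:
            best_word, best_sim = word, sim
    return best_word, best_sim

vocab = ["the", "quick", "brown", "fox", "jumps", "over", "a", "lazy", "dog"]

# Simulate a corrupted retrieval result: 20% of the bits of 'brown' flipped
rng = np.random.default_rng(3)
noisy = encode_token("brown").copy()
flip = rng.choice(noisy.size, size=noisy.size // 5, replace=False)
noisy[flip] ^= 1

print(closest_word(noisy, vocab))   # ('brown', ~0.8)
```

Because unrelated words sit near 50% similarity while a 20%-corrupted pattern stays near 80%, the nearest-match step recovers the intended word with a wide margin.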

Analysis of Results: What Do These Demonstrations Tell Us?

These demonstrations highlight several key capabilities of Sparse Distributed Memory:

  1. Generalization: In the circle example, SDM didn't just memorize the noisy patterns—it extracted the underlying "concept" of a circle from them. This is similar to how humans form abstract concepts from multiple examples.

  2. Sequence Learning: The Roman numeral demo shows that SDM can effectively store and recall sequences, with each item serving as a retrieval cue for the next. This ability is fundamental to many cognitive processes, from language to motor skills.

  3. Associative Memory: The phrase completion demonstration reveals how SDM can form complex networks of associations, allowing it to navigate from partial information to complete patterns, much like human associative memory.

  4. Noise Tolerance: Across all demonstrations, SDM showed remarkable resilience to noise. Even with 20-30% noise in the input patterns, it could still retrieve the correct information and progressively clean up the noise through iterative retrieval.

These capabilities arise naturally from the mathematical properties of high-dimensional spaces and the distributed nature of the storage mechanism, without requiring complex algorithms or extensive training procedures.

Applications and Future Directions

The capabilities demonstrated by SDM have numerous practical applications:

Edge Computing: The efficiency and robustness of SDM make it well-suited for edge devices with limited computational resources. Its ability to work with noisy or partial data is particularly valuable in real-world sensor environments.

Cognitive Modeling: SDM provides a computational framework for understanding human memory, potentially offering insights into memory disorders and cognitive processes.

Anomaly Detection: The model's ability to form "prototypes" from examples could be applied to detecting unusual patterns in data streams.

Robust Information Retrieval: In applications where data might be corrupted or incomplete, SDM's pattern completion abilities could improve retrieval accuracy.

Low-Power AI: The simple operations required by SDM (primarily bit comparisons and counting) could enable AI capabilities on ultra-low-power devices.

Future research directions might include:

  1. Developing optimized hardware implementations specifically designed for SDM operations

  2. Extending the text encoding mechanism to capture semantic relationships better

  3. Combining SDM with other AI approaches for hybrid systems that leverage the strengths of each

  4. Exploring applications in natural language understanding, particularly for handling ambiguous or noisy inputs

  5. Investigating how SDM might be used to model aspects of human memory impairment and cognitive decline

Conclusion

Sparse Distributed Memory represents a fascinating bridge between computer science and cognitive science. Its ability to mimic human-like pattern recognition, sequence learning, and associative recall makes it a valuable tool for both understanding human cognition and developing more robust AI systems.

At Transparent AI, we're particularly interested in approaches like SDM, VSA, and HDC because they offer inherent explainability and robustness while maintaining efficiency. The demonstrations presented here only scratch the surface of what's possible with these architectures.

As AI continues to evolve, we believe that drawing inspiration from human cognition—as SDM does—will lead to systems that are not only more capable but also more aligned with human thought processes and expectations. This alignment is crucial for building trustworthy AI that can work effectively alongside humans in solving complex problems.
