What are Embeddings?
Embeddings are dense vector representations of data that capture semantic meaning. They transform text, images, or other data into fixed-length arrays of numbers.
Text:
"The cat sat on the mat"
Embedding (simplified):
[0.23, -0.45, 0.89, 0.12, -0.67, 0.34, ...]
Typically 384 to 3072 dimensions, depending on the model
Why Embeddings Matter
Embeddings are useful because semantically similar inputs map to nearby vectors. This enables:
- Semantic search (find by meaning, not keywords)
- Recommendations (find similar items)
- Clustering (group related content)
- RAG (give LLMs relevant context)
How Embeddings Work
Neural networks learn embeddings by processing vast amounts of data. The network learns to place similar items close together in vector space.
Close in Vector Space
- "king" and "queen"
- "happy" and "joyful"
- "Paris" and "France"
Far in Vector Space
- "king" and "banana"
- "happy" and "table"
- "Paris" and "algorithm"
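"Close" and "far" are usually quantified with cosine similarity, which compares the direction of two vectors. The sketch below is only illustrative: the three-dimensional vectors are made up by hand (real models produce hundreds of dimensions), and `toy_vectors` is a hypothetical name.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made 3-dimensional vectors for illustration only (not model output).
toy_vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "banana": [0.1, 0.05, 0.95],
}

close = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
far = cosine_similarity(toy_vectors["king"], toy_vectors["banana"])

print(f"king vs queen:  {close:.3f}")   # near 1: vectors point the same way
print(f"king vs banana: {far:.3f}")     # much lower: vectors point apart
```

Scores near 1 mean the vectors point in nearly the same direction; scores near 0 mean they are unrelated.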
Types of Embeddings
Text Embeddings
The most common type. Examples include OpenAI's text-embedding-3 models, Cohere's embedding models, and the open-source Sentence-BERT family.
Image Embeddings
Convert images to vectors. CLIP can even align text and images in the same space.
Audio Embeddings
Represent audio/speech as vectors for search and classification.
Multi-modal Embeddings
Combine different types (text + image) in a shared vector space.
Using Embeddings
1. Generate Embeddings
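from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable
client = OpenAI()

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="Your text here",
)
# The response holds one embedding per input, in input order
embedding = response.data[0].embedding
# embedding is a list of floats, e.g. [0.023, -0.045, 0.089, ...]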
2. Store in Vector Database
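# Using Pinecone. Assumes an index already exists whose dimension matches
# the embedding model; "my-index" and the API key are placeholders.
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("my-index")

index.upsert(vectors=[
    {
        "id": "doc1",
        "values": embedding,
        "metadata": {"text": "Your text here"},
    }
])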
3. Search by Similarity
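# Embed the query with the same model used for the documents.
# get_embedding is a helper (not shown) that wraps the call from step 1.
query_embedding = get_embedding("similar text")

# Find the 5 nearest neighbors
results = index.query(
    vector=query_embedding,
    top_k=5,
    include_metadata=True,  # return the stored text with each match
)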
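Conceptually, a similarity query scores the query vector against every stored vector and returns the best matches. The sketch below does this by brute force in plain Python; it is not how Pinecone is implemented (vector databases rely on approximate indexes to avoid scanning everything), and `top_k_search`, `stored`, and the toy vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_search(query, stored, k=5):
    """Return the k (id, score) pairs most similar to `query`, best first."""
    scored = [
        (doc_id, cosine_similarity(query, vector))
        for doc_id, vector in stored.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 3-dimensional "embeddings" keyed by document id (made up for illustration).
stored = {
    "doc1": [0.9, 0.1, 0.0],
    "doc2": [0.1, 0.9, 0.0],
    "doc3": [0.8, 0.2, 0.1],
}

results = top_k_search([1.0, 0.0, 0.0], stored, k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

A full scan like this costs time proportional to the number of stored vectors, which is why dedicated vector databases become worthwhile at scale.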
About Dimensions
Embedding dimension is the length of the vector. Common sizes:
| Model | Dimensions | Use Case |
|---|---|---|
| text-embedding-3-small | 1536 | General purpose |
| text-embedding-3-large | 3072 | High accuracy |
| all-MiniLM-L6-v2 | 384 | Fast, lightweight |
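Dimension count translates directly into storage. As a rough back-of-the-envelope sketch, assuming each component is stored as a 4-byte float32 and ignoring index and metadata overhead (both assumptions, not figures from any particular database):

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32

def raw_storage_bytes(num_vectors, dimensions):
    """Bytes needed for the raw vector values only (no index, no metadata)."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32

model_dimensions = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "all-MiniLM-L6-v2": 384,
}

for name, dims in model_dimensions.items():
    gib = raw_storage_bytes(1_000_000, dims) / 1024**3
    print(f"{name}: {gib:.2f} GiB per million vectors")
```

Doubling the dimension doubles both the storage and the work per similarity comparison, so larger vectors are worth it only when the accuracy gain matters.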
Best Practices
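- Chunk text appropriately - Split long documents into passages before embedding; a single vector for an entire document blurs its distinct topics, and models have an input length limit
- Use the same model - Embed queries and documents with the same model; vectors from different models are not comparable
- Normalize vectors - Many models already output unit-length vectors; if yours does not, normalize before using the dot product as a similarity score
- Consider cost - API calls add up at scale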
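The first and third practices can be sketched in a few lines. This is a minimal illustration under stated assumptions: `chunk_words` splits on whitespace (production pipelines usually count model tokens instead), and the chunk size and overlap defaults are arbitrary placeholders, not recommendations.

```python
import math

def chunk_words(text, chunk_size=200, overlap=20):
    """Split text into overlapping chunks of at most `chunk_size` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks

def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        return list(vector)
    return [x / norm for x in vector]

sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(sample, chunk_size=4, overlap=1)
unit = l2_normalize([3.0, 4.0])
print(chunks)
print(unit)
```

The overlap repeats a few words between neighboring chunks so that a sentence straddling a boundary still appears intact in at least one of them.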
Next Steps
- Cosine Similarity - Measure embedding similarity
- Vector Databases - Store embeddings at scale
- RAG - Use embeddings with LLMs