Zineb Bendhiba created CAMEL-21587:
--------------------------------------
Summary: Langchain4j : add similarity search
Key: CAMEL-21587
URL: https://issues.apache.org/jira/browse/CAMEL-21587
Project: Camel
Issue Type: New Feature
Reporter: Zineb Bendhiba
Assignee: Zineb Bendhiba
+*Problem:*+
The Apache Camel Langchain4j components and the Camel vector database
components (Qdrant, Pinecone) cannot be combined to perform similarity search
for RAG. This prevents using Camel with vector databases and Langchain4j as the
retrieval step of a RAG pipeline.
While creating a RAG example with the Apache Camel Langchain4j components and
the Apache Camel vector database components, I realized that the vector
database components cannot handle similarity search for the Langchain4j
scenarios.
I tried both Qdrant and Pinecone, and in both cases I encountered issues when
attempting to perform RAG (Retrieval-Augmented Generation) with these databases.
+*Research*+
I've analyzed various existing issues related to search with Qdrant and
Pinecone. For example, the Camel Qdrant component doesn't provide similarity
search capabilities. In contrast, the Camel Pinecone component does allow
searching by similarity.
However, when trying to use Pinecone with Apache Camel, I encountered a
challenge: the text must first be converted to embeddings using the Camel
Langchain4j Embeddings component, and the resulting list of embeddings cannot
be used directly for RAG.
+*Proposed Solution*+
To address this limitation, I propose implementing a new feature (or a new
Camel AI component) that enables easy similarity search for RAG using Apache
Camel and vector databases. This could involve leveraging the embedding store
abstraction in Langchain4j to provide an easy way to perform similarity
searches.
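
For illustration only, here is a minimal sketch of what "similarity search"
means in this context: given a query embedding, return the stored text segments
whose embeddings are closest to it by cosine similarity. This is plain Java and
is not the Camel or Langchain4j API. The class SimilaritySearchSketch, its
records, and the maxResults/minScore parameters are hypothetical names chosen
for this example, and the in-memory list stands in for a real vector database
such as Qdrant or Pinecone.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory stand-in for a vector database (assumption, not a real API).
class SimilaritySearchSketch {

    /** A stored text segment together with its embedding vector. */
    record Entry(String text, float[] vector) {}

    /** A search hit: the matched text and its cosine similarity to the query. */
    record Match(String text, double score) {}

    private final List<Entry> entries = new ArrayList<>();

    /** Stores a text segment with an embedding computed elsewhere. */
    void add(String text, float[] vector) {
        entries.add(new Entry(text, vector));
    }

    /** Cosine similarity of two vectors; returns 0 if either vector has zero length. */
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns up to maxResults entries scoring at least minScore, best match first. */
    List<Match> search(float[] query, int maxResults, double minScore) {
        return entries.stream()
                .map(e -> new Match(e.text(), cosine(query, e.vector())))
                .filter(m -> m.score() >= minScore)
                .sorted(Comparator.comparingDouble(Match::score).reversed())
                .limit(maxResults)
                .toList();
    }
}
```

In a RAG route, the query vector would come from the embeddings step and the
returned texts would be passed to the chat model as context. A real
implementation would delegate the search to the vector database instead of
scanning a list.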
--
This message was sent by Atlassian Jira
(v8.20.10#820010)