ChuanFF commented on PR #12941: URL: https://github.com/apache/apisix/pull/12941#issuecomment-3832744063
> Hi @ChuanFF, could you explain these breaking changes? Is it necessary to introduce these changes?

@Baoyuantop

1. **Request format**: The previous plugin required an `ai_rag` field to be included in the request, which is not standard practice. Ideally, document retrieval should be performed on the user's question (usually the last one), which makes the plugin directly compatible with the `openai-api` completions interface. We can refer to other AI proxy projects such as Higress and Literm for this.
2. **Azure OpenAI key**: `azure_openai` was changed to `azure-openai` for consistency with the `ai-proxy` plugin's fields. Please let me know if you prefer to keep it as `azure_openai`.
3. **Context position**: The `ai-rag` information should be inserted before the user's question so the LLM stays focused on that question. Otherwise, in some scenarios, the LLM might treat the inserted documents as user questions, reducing the quality of its response. We can refer to other AI proxy projects for this as well.
4. **Vector search output**: When using `azure-ai-search` for document retrieval, the previous plugin neither filtered the fields nor parsed the response body to extract the document content. As a result, the RAG results passed to the LLM contained a large amount of information unrelated to the retrieved documents.
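To illustrate point 3, here is a minimal sketch of inserting retrieved documents before the last user message in an OpenAI-style `messages` array. This is not the plugin's actual Lua implementation; the function name, prompt wording, and document list are illustrative assumptions.

```python
def insert_rag_context(messages, documents):
    """Insert retrieved documents as a system message immediately
    before the last user message, so the final user turn remains
    the question the LLM is asked to answer."""
    # Index of the last user message (the question to answer).
    last_user = max(i for i, m in enumerate(messages) if m["role"] == "user")
    context = "Answer using the following documents:\n" + "\n".join(documents)
    # Splice the context in just before the last user message.
    return (messages[:last_user]
            + [{"role": "system", "content": context}]
            + messages[last_user:])

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is APISIX?"},
]
docs = ["APISIX is a cloud-native API gateway."]
result = insert_rag_context(messages, docs)
```

Appending the documents *after* the user message (as the previous behavior effectively did) risks the model reading them as the latest user turn; placing them before keeps the question last.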
