GitHub user Kryst4lDem0ni4s added a comment to the discussion: [Discussion] The
selection of Agentic/Taskflow frame
I looked further into what @chiruu12 suggested about not using off-the-shelf
agentic components that can prevent developers from understanding critical
behaviors, so that over time the behavior of the service doesn't go out of
control.
Indeed, it would be possible to write a custom HG_agentic library that borrows
only the necessary pieces, so HugeGraph can maintain control over the logic and
integration details.
How about combining all of our suggestions into a dual-mode, modular GraphRAG
system that integrates LlamaIndex, Pydantic-AI, CrewAI (CrewFlow), and Agno,
while avoiding the dependency hell @Aryankb warned about?
As a hybrid GraphRAG system supporting two modes, we can include:
• A beginner-friendly “agentic retriever” that is pre-fine-tuned with robust
LLM prompting for straightforward use cases, and
• A customizable mode for advanced developers who need to tailor retrieval,
orchestration, and validation mechanisms.
Key design principles so that everyone can get a good night's sleep:
• Modularity & Microservices: standalone services with clearly defined APIs.
• Dual-Mode Operation: ease of use and deep customization.
• Transparent Integration: extracting core functionalities and integrating them
in-house.
• Extensive Logging & Monitoring: via Prometheus (a minimal metrics sketch follows this list).
• Containerization: isolate dependencies.
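To make the monitoring principle concrete, here is a minimal sketch of how each microservice could expose metrics with prometheus_client; the metric names, labels, and the handle_query() stub are illustrative only, not part of any existing HugeGraph code:

```python
# Minimal sketch: per-service metrics via prometheus_client.
# Metric names and the handle_query() stub are hypothetical.
import time

from prometheus_client import Counter, Histogram, start_http_server

QUERY_TOTAL = Counter("hg_agentic_queries_total", "Total queries handled", ["layer"])
QUERY_LATENCY = Histogram("hg_agentic_query_seconds", "Query latency in seconds", ["layer"])

def handle_query(layer: str, query: str) -> str:
    """Placeholder for the real per-layer query handler."""
    QUERY_TOTAL.labels(layer=layer).inc()
    with QUERY_LATENCY.labels(layer=layer).time():
        time.sleep(0.01)          # simulate work
        return f"result for {query!r}"

if __name__ == "__main__":
    start_http_server(9108)       # Prometheus scrapes http://localhost:9108/metrics
    handle_query("l1", "g.V().limit(10)")
```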
Architectural Layers & Components:
A. Base Layer – Agno for L1 Queries
Handle high-frequency, low-latency queries (e.g., simple entity lookups) with
optimized parallel execution. Beyond this point we must also seek the correct
strategy to handle LN queries as well.
Key Features:
• Fast execution with low memory footprint.
• Built-in Gremlin-Cypher transpiler for hybrid query support.
• Integration with a hybrid caching layer that combines Agno shared memory and
RocksDB.
• Wrap Agno’s core query engine in a microservice that exposes an HTTP endpoint.
• Queries can be configured to pass through a lightweight pre-processing step
that selects between cache and live query execution (specific to L1).
This component, once abstracted into our own agentic library, will be the base
of all performance optimizations; a minimal service sketch follows.
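As a rough illustration of the points above, here is a minimal sketch of the L1 query microservice, assuming FastAPI. run_l1_query() is a placeholder for the Agno-backed engine (its real API is not shown here), and the in-process dict stands in for the hybrid Agno shared-memory/RocksDB cache:

```python
# Minimal sketch of the L1 query microservice, assuming FastAPI.
# run_l1_query() is a placeholder for the actual Agno-backed engine;
# the in-process dict stands in for the real hybrid cache.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
_cache: dict[str, str] = {}

class L1Query(BaseModel):
    gremlin: str
    use_cache: bool = True

def run_l1_query(gremlin: str) -> str:
    """Placeholder for the Agno-wrapped query engine."""
    return f"executed: {gremlin}"

@app.post("/query/l1")
def query_l1(body: L1Query) -> dict:
    # Lightweight pre-processing step: serve from cache when possible.
    if body.use_cache and body.gremlin in _cache:
        return {"result": _cache[body.gremlin], "cached": True}
    result = run_l1_query(body.gremlin)
    _cache[body.gremlin] = result
    return {"result": result, "cached": False}
```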
B. Orchestration Layer – CrewAI for Complex Workflows
This would help us manage multi-hop, dynamic queries and agent workflows that
require intent classification and asynchronous execution, while still allowing
customization.
Key Features:
• Dynamic intent classification powered by domain-specific embeddings
(integrated with HugeGraph).
• Event-driven workflow, where subtasks are dynamically generated from a user’s
plain-English prompt.
• Built-in support for sequencing (sequential/parallel) and conditional
delegation of agent tasks.
• Adapt core functionalities from CrewAI (CrewFlow) to create a custom
orchestration module.
• Define a clear API contract for submitting workflows, retrieving status, and
handling error/fallback logic.
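A minimal sketch of how the custom orchestration module could adapt CrewAI's Agent/Task/Crew interface. It assumes an LLM backend is already configured (e.g., via environment variables); the roles, goals, and example prompt are illustrative only:

```python
# Minimal sketch of the orchestration layer, assuming CrewAI's
# Agent/Task/Crew interface and an LLM backend configured via environment.
# Roles, goals, and the example prompt are illustrative only.
from crewai import Agent, Task, Crew, Process

intent_classifier = Agent(
    role="Intent Classifier",
    goal="Classify the user's graph question and plan retrieval subtasks",
    backstory="Understands HugeGraph schemas and common query patterns.",
)

graph_retriever = Agent(
    role="Graph Retriever",
    goal="Translate subtasks into Gremlin/Cypher queries and collect results",
    backstory="Specialises in multi-hop traversals over HugeGraph.",
)

def build_workflow(prompt: str) -> Crew:
    """Turn a plain-English prompt into a sequential two-step workflow."""
    classify = Task(
        description=f"Classify the intent of: {prompt}",
        expected_output="A list of retrieval subtasks",
        agent=intent_classifier,
    )
    retrieve = Task(
        description="Execute the planned subtasks against HugeGraph",
        expected_output="Aggregated query results",
        agent=graph_retriever,
    )
    return Crew(
        agents=[intent_classifier, graph_retriever],
        tasks=[classify, retrieve],
        process=Process.sequential,
    )

# Example usage:
# result = build_workflow("Which papers cite both A and B?").kickoff()
```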
C. Validation Layer – Pydantic
All general schema consistency and data integrity across all operations. THIS
distinction is necessary to understand that it's sole purpose here should be
for schema purposes only, not beyond it so far.
Key Features:
• Middleware to validate incoming queries and agent responses.
• Dev-friendly type hints and error reporting.
• Mechanisms to ensure that changes in one layer do not break API contracts.
• Wrap core endpoints of other layers with Pydantic models that perform
input/output validation.
• Integrate validation middleware as a separate microservice or as decorators
within the existing service codebase.
Note: This is the general usage of Pydantic, not its agentic tooling; the
latter is too unpredictable and unsuitable for production. A minimal
validation sketch follows.
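A minimal sketch of this kind of schema-only validation: plain Pydantic (v2) models plus a decorator that checks a handler's input and output. The model fields and the retrieve() stub are illustrative:

```python
# Minimal sketch of the validation middleware: plain Pydantic v2 models plus
# a decorator that schema-checks a handler's input and output.
# The model fields and the retrieve() stub are hypothetical.
from functools import wraps
from typing import Callable

from pydantic import BaseModel

class RetrieveRequest(BaseModel):
    query: str
    max_hops: int = 2

class RetrieveResponse(BaseModel):
    nodes: list[str]
    edges: list[tuple[str, str]]

def validated(request_model: type[BaseModel], response_model: type[BaseModel]) -> Callable:
    """Wrap a handler so both its input payload and output are schema-checked."""
    def decorator(fn: Callable) -> Callable:
        @wraps(fn)
        def wrapper(payload: dict) -> dict:
            # Raises pydantic.ValidationError on a malformed payload.
            request = request_model.model_validate(payload)
            result = fn(request)
            return response_model.model_validate(result).model_dump()
        return wrapper
    return decorator

@validated(RetrieveRequest, RetrieveResponse)
def retrieve(req: RetrieveRequest) -> dict:
    # Placeholder for the real retrieval call.
    return {"nodes": ["v1", "v2"], "edges": [("v1", "v2")]}

print(retrieve({"query": "neighbours of v1", "max_hops": 1}))
```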
D. Retrieval Enhancement Layer – LlamaIndex
Finally, this layer provides recursive, multi-hop retrieval enhanced by tiered
caching, ensuring that complex graph queries are answered effectively.
LlamaIndex is already compatible with CrewAI, so we'll look further into how
that compatibility is provided.
Key Features:
• Recursive retrieval strategies that work well with hierarchical graph caching.
• Integration with HugeGraph’s OLAP engine for analytical queries.
• Modular “runnables” inspired by LangChain that allow flexible composition of
retrieval steps.
• Expose LlamaIndex’s retrieval engine via an API that accepts complex,
multi-hop query parameters.
• Use a caching strategy that combines in-memory (for fast lookups) and
persistent (RocksDB) storage to accelerate repeated queries.
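A minimal sketch of the tiered cache described in the last point: an in-memory dict in front of a persistent RocksDB store. It assumes the python-rocksdb binding purely for illustration; any binding exposing get/put (rocksdict, for instance) would work the same way:

```python
# Minimal sketch of the tiered cache: an in-memory dict in front of a
# persistent RocksDB store. Assumes the `rocksdb` (python-rocksdb) binding;
# the cache-file path and value encoding are illustrative choices.
import json

import rocksdb

class TieredCache:
    def __init__(self, path: str = "hg_cache.db"):
        self._memory: dict[str, dict] = {}
        self._disk = rocksdb.DB(path, rocksdb.Options(create_if_missing=True))

    def get(self, key: str) -> dict | None:
        # 1) Fast path: in-memory lookup.
        if key in self._memory:
            return self._memory[key]
        # 2) Fall back to RocksDB, then promote the hit into memory.
        raw = self._disk.get(key.encode())
        if raw is not None:
            value = json.loads(raw)
            self._memory[key] = value
            return value
        return None

    def put(self, key: str, value: dict) -> None:
        self._memory[key] = value
        self._disk.put(key.encode(), json.dumps(value).encode())
```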
Summary of the plan, with general key points and implementation steps:
- RESTful API endpoints for query submission, workflow orchestration,
validation, and retrieval.
- A Python SDK (e.g., HG_agentic and HG_orchestrator) that abstracts away the
internal microservices and provides simple functions for (examples):
  > Creating agents via plain-English commands.
  > Configuring custom workflows (sequential, parallel, conditional).
  > Integrating with existing agent systems (AutoGen).
- Define API endpoints for each core service. For example:
> /query/l1 for Agno-based L1 queries.
> /workflow/submit for submitting orchestration tasks.
> /validate for schema checks.
> /retrieve for multi-hop retrieval.
- The Python SDK wraps these endpoints and provides high-level functions, error
handling, and logging (a minimal client sketch follows after this list).
- Pre-fine-tuned with robust LLMs using few-shot or one-shot prompting.
- Offers a simplified interface where users only need to provide a natural
language prompt.
- Customizable pipeline where developers can modify key components (LLM
selection, prompt configuration, integration with vector databases like
Pinecone, FAISS, Qdrant).
- Leverage the modular “runnables” design inspired by LangChain to allow easy
insertion or replacement of retrieval steps.
- Minimize latency by combining HugeGraph’s native caching (e.g., via RocksDB)
with Agno’s shared memory features.
- Develop a caching microservice that first checks an in-memory cache and then
falls back to RocksDB.
- Ensure that cached results are seamlessly used across L1 and multi-hop
retrieval layers.
- Package each architectural layer as its own Docker container.
- Use orchestration tools (e.g., Kubernetes).
- Define strict API contracts between services.
- Integrate Prometheus (or a similar tool) into each microservice to collect
metrics.
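As referenced above, here is a minimal sketch of the proposed SDK surface: a thin client wrapping the listed REST endpoints with requests. The base URL, paths, and payload shapes mirror the examples in this plan and are illustrative, not a finalised contract:

```python
# Minimal sketch of the proposed HG_agentic SDK surface: a thin client that
# wraps the REST endpoints listed above with `requests`. Base URL, paths, and
# payload shapes are assumptions mirroring the plan, not a finalised contract.
import requests

class HGAgenticClient:
    def __init__(self, base_url: str = "http://localhost:8080"):
        self.base_url = base_url.rstrip("/")

    def _post(self, path: str, payload: dict) -> dict:
        response = requests.post(f"{self.base_url}{path}", json=payload, timeout=30)
        response.raise_for_status()          # surface HTTP errors to the caller
        return response.json()

    def query_l1(self, gremlin: str) -> dict:
        return self._post("/query/l1", {"gremlin": gremlin})

    def submit_workflow(self, prompt: str, mode: str = "sequential") -> dict:
        return self._post("/workflow/submit", {"prompt": prompt, "mode": mode})

    def validate(self, payload: dict) -> dict:
        return self._post("/validate", payload)

    def retrieve(self, query: str, max_hops: int = 2) -> dict:
        return self._post("/retrieve", {"query": query, "max_hops": max_hops})

# Example usage:
# client = HGAgenticClient()
# answer = client.retrieve("Which vertices connect A and B within 3 hops?", max_hops=3)
```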
```mermaid
graph TD
    A["User Query Input"] --> B{HTTP API Gateway}
    B --> C["Agno L1 Query Service"]
    B --> D["CrewFlow Orchestrator"]
    D --> E["Dynamic Agent Creation"]
    E --> F["Workflow Execution"]
    F --> G["Pydantic Validation Middleware"]
    D --> H["Retrieve Request"]
    H --> I["LlamaIndex Recursive Retriever"]
    I --> J["Hybrid Caching Layer (RocksDB + Shared Memory)"]
    G & J --> K["Result Aggregator"]
    K --> L["HTTP API Gateway (Response)"]
```
What are your thoughts on this approach, @imbajin? I'd also like your thoughts
on what I mentioned regarding LN queries and how we'd go about handling them.
But I still stand by what I said: implementing this is a separate project in
itself and would require a lot of time and expertise before it can be put into
production, due to the added complexity of the architecture.
GitHub link:
https://github.com/apache/incubator-hugegraph-ai/discussions/203#discussioncomment-12666612