xephtar opened a new issue, #2187: URL: https://github.com/apache/age/issues/2187
### 🔍 Performance Optimization for Multi-Hop Traversal in Apache AGE

Context:

We are currently using Apache AGE and have the following graph structure:

```
(:A)-[:HAS_Y]->(:Y)
(:A)-[:HAS_Z]->(:Z)
(:A)-[:HAS_D]->(:D)
```

(similar for 7 relation types and 8 node types total)

Our typical traversal pattern in Neo4j was:

```
MATCH (n:Y {property_example: 123})-[r*..4]-(d:A)
RETURN d.property_found AS property_found
LIMIT 50
UNION ALL
MATCH (n:Z {property_example: 123})-[r*..4]-(d:A)
RETURN d.property_found AS property_found
LIMIT 50
```

We expect:

- ~500 million nodes
- 3–4x that number of relationships

### Question:

What kind of indexing strategy or query optimization would you recommend in Apache AGE for improving the performance of multi-hop traversal queries like [*..4]?

Any guidance or best practices on the following would be highly appreciated:

- Node property indexing
- Relationship indexing (e.g., start_id, end_id)
- Traversal optimizations

### Current Setup:

We currently have:

- Indexes on all relevant node properties
- start_id and end_id indexes on all relationships

Sample test data:

- ~27 million vertices
- ~23 million edges

Query example:

```
SELECT d
FROM ag_catalog.cypher('user_unification', $$
    MATCH (n:Y)
    WHERE n.value = 'a0de44c7fc8cb783'
    MATCH (n)-[*..2]-(d:A)
    RETURN d
$$) as (d ag_catalog.agtype);
```

Execution time:

- For [*..2]: ~30 seconds
- For [*..4]: >150 seconds (often fails to complete)

Expected execution time: ≤10 ms for [*..2]

Any suggestions or feedback from the AGE team would be incredibly helpful.

Thanks in advance!
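
For anyone trying to reproduce the setup described under "Current Setup", the indexes could be declared roughly as in the sketch below. It is not the exact DDL used in this report. It assumes AGE's usual storage layout, where each label is a table in the graph's schema (`user_unification` here), vertex tables have `id` and `properties` columns, and edge tables have `id`, `start_id`, `end_id` and `properties`. All index names are hypothetical. Whether the planner actually picks these indexes for a given Cypher predicate can vary by AGE version, so the final statement checks the plan rather than assuming it.

```sql
-- Sketch only, under the assumptions stated above; index names are made up.
LOAD 'age';
SET search_path = ag_catalog, "$user", public;

-- Expression index for equality lookups on the "value" property of :Y vertices.
CREATE INDEX IF NOT EXISTS y_value_idx
    ON user_unification."Y"
    USING btree (agtype_access_operator(VARIADIC ARRAY[properties, '"value"'::agtype]));

-- GIN index on the whole property map, intended for containment-style
-- matches such as MATCH (n:Y {value: '...'}).
CREATE INDEX IF NOT EXISTS y_properties_gin_idx
    ON user_unification."Y" USING gin (properties);

-- Endpoint indexes on one edge label; repeat for each relation type.
CREATE INDEX IF NOT EXISTS has_y_start_idx ON user_unification."HAS_Y" (start_id);
CREATE INDEX IF NOT EXISTS has_y_end_idx   ON user_unification."HAS_Y" (end_id);

-- Inspect the plan for the anchor lookup alone, before adding the
-- variable-length hop, to see whether an index scan is chosen.
EXPLAIN ANALYZE
SELECT n
FROM ag_catalog.cypher('user_unification', $$
    MATCH (n:Y)
    WHERE n.value = 'a0de44c7fc8cb783'
    RETURN n
$$) AS (n ag_catalog.agtype);
```

Timing the anchor lookup separately from the `[*..2]` expansion helps show whether the 30 seconds is spent finding the start vertex or walking the edges.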