alteck45 opened a new issue, #1522:
URL: https://github.com/apache/age/issues/1522

   We have been experimenting recently with Apache AGE specifically with the 
main focus on performance in comparison to the standard Postgres DB.
   
   Our very basic data model contains single vertex label `AccessPoint`. For 
the simplicity sake the only relevant property on `AccessPoint` label is the 
`id` value, which is also supposed to be frequently used by the application 
queries.
   
   The dataset contains 50_013 `AccessPoint` vertexes.
   
   We introduced several indices to see, which are actually being used by the 
queries of different type. For the completeness here are the indices we 
introduced on the `AccessPoint` table:
   ```sql
   CREATE UNIQUE INDEX ap_id ON "test-graph"."AccessPoint" USING btree 
(agtype_access_operator(properties, '"id"'));
   CREATE INDEX ap_gin_properties ON "test-graph"."AccessPoint" USING 
GIN(properties);
   CREATE UNIQUE INDEX ap_v_id ON "dcore-graph"."AccessPoint"(id);
   ```
   To sum up, we have BTREE indices on the `properties -> id` and on the vertex 
id, and we have a GIN index on `properties`.
   
   The problem we have observed is the lack of utilisation of any indices, for 
a simple query like:
   ```sql
   SELECT * FROM cypher(
     'test-graph', $$
       MATCH (ap:AccessPoint)
       WITH ap
       ORDER BY ap.id
       RETURN ap
     $$
   ) as (value agtype)
   ```
   The query plan for the query above is:
   ```sql
   Subquery Scan on _age_default_alias_previous_cypher_clause  
(cost=5631.49..9308.86 rows=29419 width=32) (actual time=1703.577..2113.010 
rows=50013 loops=1)
     ->  Gather Merge  (cost=5631.49..9014.67 rows=29419 width=64) (actual 
time=1703.567..2085.553 rows=50013 loops=1)
           Workers Planned: 1
           Workers Launched: 1
           ->  Sort  (cost=4631.48..4705.03 rows=29419 width=64) (actual 
time=1533.703..1700.085 rows=25006 loops=2)
                 Sort Key: (agtype_access_operator(VARIADIC 
ARRAY[_agtype_build_vertex(ap.id, _label_name('20505'::oid, ap.id), 
ap.properties), '""id""'::agtype]))"
                 Sort Method: external merge  Disk: 10848kB
                 Worker 0:  Sort Method: external merge  Disk: 8512kB
                 ->  Parallel Seq Scan on ""AccessPoint"" ap  
(cost=0.00..2447.93 rows=29419 width=64) (actual time=2.513..560.890 rows=25006 
loops=2)"
   Planning Time: 7.458 ms
   Execution Time: 2128.705 ms
   ```
   
   
   For the comparison a standard SQL query:
   ```sql
   SELECT * 
   FROM "test-graph"."AccessPoint" 
   ORDER BY agtype_access_operator(properties, '"id"');
   ```
   utilizes the `ap_id` btree index on `properties -> id`, the query plan:
   ```sql
   Index Scan using ap_id on "AccessPoint"  (cost=0.41..9687.64 rows=50013 
width=288) (actual time=0.610..347.862 rows=50013 loops=1)
   Planning Time: 1.683 ms
   Execution Time: 364.441 ms
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@age.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to