jrgemignani commented on issue #2194: URL: https://github.com/apache/age/issues/2194#issuecomment-3136961271
@serdarmicrosoft @MuhammadTahaNaveed From my local system with the above graph loaded (PG17, CentOS 10)- **Graph statistics** ``` psql-17.5-5432-pgsql=# select * from cypher('goodreads_graph_x', $$ return graph_stats('goodreads_graph_x') $$) as (results agtype); results ---------------------------------------------------------------------------------------------- {"graph": "goodreads_graph_x", "num_loaded_edges": 14897522, "num_loaded_vertices": 3151377} (1 row) psql-17.5-5432-pgsql=# ``` **Open end vertex** ``` psql-17.5-5432-pgsql=# SELECT * FROM cypher('goodreads_graph_x', $$ explain analyze MATCH (u:User)-[:HAS_INTERACTION]->() RETURN u.user_id, count(*) ORDER BY count(*) DESC LIMIT 10 $$) AS (user_id agtype, interaction_count agtype); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------- Limit (cost=34001551.37..34001551.40 rows=10 width=64) (actual time=108330.804..108352.939 rows=10 loops=1) -> Sort (cost=34001551.37..34003163.09 rows=644686 width=64) (actual time=108330.802..108352.936 rows=10 loops=1) Sort Key: ((count(*))::agtype) DESC Sort Method: top-N heapsort Memory: 27kB -> GroupAggregate (cost=1256126.22..33987619.94 rows=644686 width=64) (actual time=33750.560..108224.304 rows=132661 loops=1) Group Key: agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(u.id, _label_name('237668'::oid, u.id), u.properties), '"user_id"'::agtype] ) -> Nested Loop (cost=1256126.22..21661006.56 rows=2462743931 width=32) (actual time=33747.672..103070.495 rows=10000000 loops=1) -> Gather Merge (cost=1256126.21..2420813.48 rows=10000193 width=118) (actual time=33747.455..54797.290 rows=10000000 loops=1) Workers Planned: 2 Workers Launched: 2 -> Sort (cost=1255126.18..1265543.05 rows=4166747 width=118) (actual time=33671.073..45033.045 rows=3333333 loops=3) Sort Key: (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(u.id, _label_name('237668'::oid, u.id), u.properties), '" user_id"'::agtype])) Sort Method: external merge Disk: 549696kB Worker 0: Sort Method: external merge Disk: 566440kB Worker 1: Sort Method: external merge Disk: 557168kB -> Parallel Hash Join (cost=21619.93..284274.14 rows=4166747 width=118) (actual time=1721.984..15407.078 rows=3333333 loops=3 ) Hash Cond: (_age_default_alias_0.start_id = u.id) -> Parallel Seq Scan on "HAS_INTERACTION" _age_default_alias_0 (cost=0.00..184515.47 rows=4166747 width=16) (actual tim e=0.038..838.437 rows=3333333 loops=3) -> Parallel Hash (cost=13802.19..13802.19 rows=268619 width=110) (actual time=112.966..112.967 rows=214895 loops=3) Buckets: 65536 (originally 65536) Batches: 32 (originally 16) Memory Usage: 9792kB -> Parallel Seq Scan on "User" u (cost=0.00..13802.19 rows=268619 width=110) (actual time=0.048..41.160 rows=2148 95 loops=3) -> Memoize (cost=0.01..1.54 rows=5 width=8) (actual time=0.001..0.001 rows=1 loops=10000000) Cache Key: _age_default_alias_0.end_id Cache Mode: logical Hits: 9907870 Misses: 92130 Evictions: 17232 Overflows: 0 Memory Usage: 8193kB -> Append (cost=0.00..1.53 rows=5 width=8) (actual time=0.013..0.023 rows=1 loops=92130) -> Seq Scan on _ag_label_vertex _age_default_alias_1_1 (cost=0.00..0.00 rows=1 width=8) (actual time=0.000..0.000 rows=0 loop s=92130) Filter: (id = _age_default_alias_0.end_id) -> Index Only Scan using "User_id_idx" on "User" _age_default_alias_1_2 (cost=0.42..0.44 rows=1 width=8) (actual time=0.006.. 0.006 rows=0 loops=92130) Index Cond: (id = _age_default_alias_0.end_id) Heap Fetches: 0 -> Index Only Scan using "Book_id_idx" on "Book" _age_default_alias_1_3 (cost=0.29..0.31 rows=1 width=8) (actual time=0.005.. 0.005 rows=1 loops=92130) Index Cond: (id = _age_default_alias_0.end_id) Heap Fetches: 0 -> Index Only Scan using "Author_id_idx" on "Author" _age_default_alias_1_4 (cost=0.29..0.31 rows=1 width=8) (actual time=0.0 04..0.004 rows=0 loops=92130) Index Cond: (id = _age_default_alias_0.end_id) Heap Fetches: 0 -> Index Only Scan using "Review_id_idx" on "Review" _age_default_alias_1_5 (cost=0.43..0.45 rows=1 width=8) (actual time=0.0 05..0.005 rows=0 loops=92130) Index Cond: (id = _age_default_alias_0.end_id) Heap Fetches: 0 Planning Time: 1.962 ms Execution Time: 108387.898 ms (42 rows) psql-17.5-5432-pgsql=# ``` **Set end vertex** ``` psql-17.5-5432-pgsql=# SELECT * FROM cypher('goodreads_graph_x', $$ explain analyze MATCH (u:User)-[:HAS_INTERACTION]->(:Book) RETURN u.user_id, count(*) ORDER BY count(*) DESC LIMIT 10 $$) AS (user_id agtype, interaction_count agtype); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------------------------------------------------------- Limit (cost=1007580.65..1007580.67 rows=10 width=64) (actual time=21179.688..21199.448 rows=10 loops=1) -> Sort (cost=1007580.65..1009192.36 rows=644686 width=64) (actual time=21179.686..21199.445 rows=10 loops=1) Sort Key: ((count(*))::agtype) DESC Sort Method: top-N heapsort Memory: 27kB -> Finalize GroupAggregate (cost=823871.40..993649.21 rows=644686 width=64) (actual time=20646.968..21112.283 rows=132661 loops=1) Group Key: (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(u.id, _label_name('237668'::oid, u.id), u.properties), '"user_id"'::agtype ])) -> Gather Merge (cost=823871.40..974308.63 rows=1289372 width=40) (actual time=20646.945..20848.792 rows=132820 loops=1) Workers Planned: 2 Workers Launched: 2 -> Sort (cost=822871.37..824483.09 rows=644686 width=40) (actual time=20607.823..20674.302 rows=44273 loops=3) Sort Key: (agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(u.id, _label_name('237668'::oid, u.id), u.properties), '"user_i d"'::agtype])) Sort Method: quicksort Memory: 4030kB Worker 0: Sort Method: external merge Disk: 2896kB Worker 1: Sort Method: external merge Disk: 2928kB -> Partial HashAggregate (cost=674786.11..743035.36 rows=644686 width=40) (actual time=20149.603..20164.340 rows=44273 loops=3) Group Key: agtype_access_operator(VARIADIC ARRAY[_agtype_build_vertex(u.id, _label_name('237668'::oid, u.id), u.properties), '" user_id"'::agtype]) Planned Partitions: 16 Batches: 1 Memory Usage: 6417kB Worker 0: Batches: 1 Memory Usage: 6673kB Worker 1: Batches: 1 Memory Usage: 6673kB -> Parallel Hash Join (cost=23994.82..342748.46 rows=4166747 width=32) (actual time=2795.794..16327.732 rows=3333333 loops=3) Hash Cond: (_age_default_alias_0.start_id = u.id) -> Parallel Hash Join (cost=2374.89..219877.18 rows=4166747 width=8) (actual time=18.239..1938.707 rows=3333333 loops=3 ) Hash Cond: (_age_default_alias_0.end_id = _age_default_alias_1.id) -> Parallel Seq Scan on "HAS_INTERACTION" _age_default_alias_0 (cost=0.00..184515.47 rows=4166747 width=16) (actu al time=0.025..524.881 rows=3333333 loops=3) -> Parallel Hash (cost=1888.44..1888.44 rows=38916 width=8) (actual time=17.815..17.817 rows=31133 loops=3) Buckets: 131072 Batches: 1 Memory Usage: 4768kB -> Parallel Index Only Scan using "Book_id_idx" on "Book" _age_default_alias_1 (cost=0.29..1888.44 rows=389 16 width=8) (actual time=0.089..7.761 rows=31133 loops=3) Heap Fetches: 0 -> Parallel Hash (cost=13802.19..13802.19 rows=268619 width=110) (actual time=123.008..123.009 rows=214895 loops=3) Buckets: 65536 (originally 65536) Batches: 32 (originally 16) Memory Usage: 9792kB -> Parallel Seq Scan on "User" u (cost=0.00..13802.19 rows=268619 width=110) (actual time=0.045..44.468 rows=2148 95 loops=3) Planning Time: 6.800 ms Execution Time: 21202.833 ms (33 rows) psql-17.5-5432-pgsql=# ``` **SQL command** ``` psql-17.5-5432-pgsql=# explain analyze select count(*), u.id from goodreads_graph_x."User" u, goodreads_graph_x."HAS_INTERACTION" h, goodreads_graph_x."Book" b where u.id = h.start_id and b.id = h.end_id GROUP BY u.id ORDER BY 1 DESC LIMIT 10; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------- --------------------------------------- Limit (cost=828731.73..828731.75 rows=10 width=16) (actual time=5160.862..5183.487 rows=10 loops=1) -> Sort (cost=828731.73..830343.44 rows=644686 width=16) (actual time=5160.860..5183.484 rows=10 loops=1) Sort Key: (count(*)) DESC Sort Method: top-N heapsort Memory: 25kB -> Finalize GroupAggregate (cost=651469.34..814800.29 rows=644686 width=16) (actual time=5030.692..5154.455 rows=132661 loops=1) Group Key: u.id -> Gather Merge (cost=651469.34..801906.57 rows=1289372 width=16) (actual time=5030.671..5092.293 rows=135523 loops=1) Workers Planned: 2 Workers Launched: 2 -> Sort (cost=650469.31..652081.03 rows=644686 width=16) (actual time=5023.358..5031.229 rows=45174 loops=3) Sort Key: u.id Sort Method: quicksort Memory: 2980kB Worker 0: Sort Method: quicksort Memory: 2901kB Worker 1: Sort Method: quicksort Memory: 2963kB -> Partial HashAggregate (cost=538245.23..577244.80 rows=644686 width=16) (actual time=4987.979..5002.350 rows=45174 loops=3) Group Key: u.id Planned Partitions: 8 Batches: 1 Memory Usage: 6929kB Worker 0: Batches: 1 Memory Usage: 6673kB Worker 1: Batches: 1 Memory Usage: 6673kB -> Parallel Hash Join (cost=19772.67..303865.71 rows=4166747 width=8) (actual time=2829.474..3983.936 rows=3333333 loops=3) Hash Cond: (h.start_id = u.id) -> Parallel Hash Join (cost=2374.89..219877.18 rows=4166747 width=8) (actual time=17.717..1958.554 rows=3333333 loops=3 ) Hash Cond: (h.end_id = b.id) -> Parallel Seq Scan on "HAS_INTERACTION" h (cost=0.00..184515.47 rows=4166747 width=16) (actual time=0.057..527. 239 rows=3333333 loops=3) -> Parallel Hash (cost=1888.44..1888.44 rows=38916 width=8) (actual time=16.894..16.895 rows=31133 loops=3) Buckets: 131072 Batches: 1 Memory Usage: 4736kB -> Parallel Index Only Scan using "Book_id_idx" on "Book" b (cost=0.29..1888.44 rows=38916 width=8) (actual time=0.054..7.187 rows=31133 loops=3) Heap Fetches: 0 -> Parallel Hash (cost=12990.05..12990.05 rows=268619 width=8) (actual time=117.643..117.644 rows=214895 loops=3) Buckets: 262144 (originally 262144) Batches: 16 (originally 8) Memory Usage: 12352kB -> Parallel Index Only Scan using "User_id_idx" on "User" u (cost=0.42..12990.05 rows=268619 width=8) (actual tim e=0.080..52.500 rows=214895 loops=3) Heap Fetches: 0 Planning Time: 0.899 ms Execution Time: 5186.043 ms (34 rows) psql-17.5-5432-pgsql=# ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org