jrgemignani commented on issue #2194:
URL: https://github.com/apache/age/issues/2194#issuecomment-3136961271
@serdarmicrosoft @MuhammadTahaNaveed From my local system with the above
graph loaded (PG17, CentOS 10)-
**Graph statistics**
```
psql-17.5-5432-pgsql=# select * from cypher('goodreads_graph_x', $$ return
graph_stats('goodreads_graph_x') $$) as (results agtype);
results
----------------------------------------------------------------------------------------------
{"graph": "goodreads_graph_x", "num_loaded_edges": 14897522,
"num_loaded_vertices": 3151377}
(1 row)
psql-17.5-5432-pgsql=#
```
**Open end vertex**
```
psql-17.5-5432-pgsql=# SELECT * FROM cypher('goodreads_graph_x', $$ explain
analyze MATCH (u:User)-[:HAS_INTERACTION]->() RETURN u.user_id, count(*) ORDER
BY count(*) DESC LIMIT 10 $$) AS (user_id agtype, interaction_count agtype);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------
Limit (cost=34001551.37..34001551.40 rows=10 width=64) (actual
time=108330.804..108352.939 rows=10 loops=1)
-> Sort (cost=34001551.37..34003163.09 rows=644686 width=64) (actual
time=108330.802..108352.936 rows=10 loops=1)
Sort Key: ((count(*))::agtype) DESC
Sort Method: top-N heapsort Memory: 27kB
-> GroupAggregate (cost=1256126.22..33987619.94 rows=644686
width=64) (actual time=33750.560..108224.304 rows=132661 loops=1)
Group Key: agtype_access_operator(VARIADIC
ARRAY[_agtype_build_vertex(u.id, _label_name('237668'::oid, u.id),
u.properties), '"user_id"'::agtype]
)
-> Nested Loop (cost=1256126.22..21661006.56
rows=2462743931 width=32) (actual time=33747.672..103070.495 rows=10000000
loops=1)
-> Gather Merge (cost=1256126.21..2420813.48
rows=10000193 width=118) (actual time=33747.455..54797.290 rows=10000000
loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=1255126.18..1265543.05
rows=4166747 width=118) (actual time=33671.073..45033.045 rows=3333333 loops=3)
Sort Key: (agtype_access_operator(VARIADIC
ARRAY[_agtype_build_vertex(u.id, _label_name('237668'::oid, u.id),
u.properties), '"
user_id"'::agtype]))
Sort Method: external merge Disk: 549696kB
Worker 0: Sort Method: external merge
Disk: 566440kB
Worker 1: Sort Method: external merge
Disk: 557168kB
-> Parallel Hash Join
(cost=21619.93..284274.14 rows=4166747 width=118) (actual
time=1721.984..15407.078 rows=3333333 loops=3
)
Hash Cond:
(_age_default_alias_0.start_id = u.id)
-> Parallel Seq Scan on
"HAS_INTERACTION" _age_default_alias_0 (cost=0.00..184515.47 rows=4166747
width=16) (actual tim
e=0.038..838.437 rows=3333333 loops=3)
-> Parallel Hash
(cost=13802.19..13802.19 rows=268619 width=110) (actual time=112.966..112.967
rows=214895 loops=3)
Buckets: 65536 (originally
65536) Batches: 32 (originally 16) Memory Usage: 9792kB
-> Parallel Seq Scan on "User"
u (cost=0.00..13802.19 rows=268619 width=110) (actual time=0.048..41.160
rows=2148
95 loops=3)
-> Memoize (cost=0.01..1.54 rows=5 width=8) (actual
time=0.001..0.001 rows=1 loops=10000000)
Cache Key: _age_default_alias_0.end_id
Cache Mode: logical
Hits: 9907870 Misses: 92130 Evictions: 17232
Overflows: 0 Memory Usage: 8193kB
-> Append (cost=0.00..1.53 rows=5 width=8)
(actual time=0.013..0.023 rows=1 loops=92130)
-> Seq Scan on _ag_label_vertex
_age_default_alias_1_1 (cost=0.00..0.00 rows=1 width=8) (actual
time=0.000..0.000 rows=0 loop
s=92130)
Filter: (id =
_age_default_alias_0.end_id)
-> Index Only Scan using "User_id_idx" on
"User" _age_default_alias_1_2 (cost=0.42..0.44 rows=1 width=8) (actual
time=0.006..
0.006 rows=0 loops=92130)
Index Cond: (id =
_age_default_alias_0.end_id)
Heap Fetches: 0
-> Index Only Scan using "Book_id_idx" on
"Book" _age_default_alias_1_3 (cost=0.29..0.31 rows=1 width=8) (actual
time=0.005..
0.005 rows=1 loops=92130)
Index Cond: (id =
_age_default_alias_0.end_id)
Heap Fetches: 0
-> Index Only Scan using "Author_id_idx"
on "Author" _age_default_alias_1_4 (cost=0.29..0.31 rows=1 width=8) (actual
time=0.0
04..0.004 rows=0 loops=92130)
Index Cond: (id =
_age_default_alias_0.end_id)
Heap Fetches: 0
-> Index Only Scan using "Review_id_idx"
on "Review" _age_default_alias_1_5 (cost=0.43..0.45 rows=1 width=8) (actual
time=0.0
05..0.005 rows=0 loops=92130)
Index Cond: (id =
_age_default_alias_0.end_id)
Heap Fetches: 0
Planning Time: 1.962 ms
Execution Time: 108387.898 ms
(42 rows)
psql-17.5-5432-pgsql=#
```
**Set end vertex**
```
psql-17.5-5432-pgsql=# SELECT * FROM cypher('goodreads_graph_x', $$ explain
analyze MATCH (u:User)-[:HAS_INTERACTION]->(:Book) RETURN u.user_id, count(*)
ORDER BY count(*) DESC LIMIT 10 $$) AS (user_id agtype, interaction_count
agtype);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------
Limit (cost=1007580.65..1007580.67 rows=10 width=64) (actual
time=21179.688..21199.448 rows=10 loops=1)
-> Sort (cost=1007580.65..1009192.36 rows=644686 width=64) (actual
time=21179.686..21199.445 rows=10 loops=1)
Sort Key: ((count(*))::agtype) DESC
Sort Method: top-N heapsort Memory: 27kB
-> Finalize GroupAggregate (cost=823871.40..993649.21 rows=644686
width=64) (actual time=20646.968..21112.283 rows=132661 loops=1)
Group Key: (agtype_access_operator(VARIADIC
ARRAY[_agtype_build_vertex(u.id, _label_name('237668'::oid, u.id),
u.properties), '"user_id"'::agtype
]))
-> Gather Merge (cost=823871.40..974308.63 rows=1289372
width=40) (actual time=20646.945..20848.792 rows=132820 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=822871.37..824483.09 rows=644686
width=40) (actual time=20607.823..20674.302 rows=44273 loops=3)
Sort Key: (agtype_access_operator(VARIADIC
ARRAY[_agtype_build_vertex(u.id, _label_name('237668'::oid, u.id),
u.properties), '"user_i
d"'::agtype]))
Sort Method: quicksort Memory: 4030kB
Worker 0: Sort Method: external merge Disk:
2896kB
Worker 1: Sort Method: external merge Disk:
2928kB
-> Partial HashAggregate
(cost=674786.11..743035.36 rows=644686 width=40) (actual
time=20149.603..20164.340 rows=44273 loops=3)
Group Key: agtype_access_operator(VARIADIC
ARRAY[_agtype_build_vertex(u.id, _label_name('237668'::oid, u.id),
u.properties), '"
user_id"'::agtype])
Planned Partitions: 16 Batches: 1 Memory
Usage: 6417kB
Worker 0: Batches: 1 Memory Usage: 6673kB
Worker 1: Batches: 1 Memory Usage: 6673kB
-> Parallel Hash Join
(cost=23994.82..342748.46 rows=4166747 width=32) (actual
time=2795.794..16327.732 rows=3333333 loops=3)
Hash Cond:
(_age_default_alias_0.start_id = u.id)
-> Parallel Hash Join
(cost=2374.89..219877.18 rows=4166747 width=8) (actual time=18.239..1938.707
rows=3333333 loops=3
)
Hash Cond:
(_age_default_alias_0.end_id = _age_default_alias_1.id)
-> Parallel Seq Scan on
"HAS_INTERACTION" _age_default_alias_0 (cost=0.00..184515.47 rows=4166747
width=16) (actu
al time=0.025..524.881 rows=3333333 loops=3)
-> Parallel Hash
(cost=1888.44..1888.44 rows=38916 width=8) (actual time=17.815..17.817
rows=31133 loops=3)
Buckets: 131072 Batches:
1 Memory Usage: 4768kB
-> Parallel Index Only
Scan using "Book_id_idx" on "Book" _age_default_alias_1 (cost=0.29..1888.44
rows=389
16 width=8) (actual time=0.089..7.761 rows=31133 loops=3)
Heap Fetches: 0
-> Parallel Hash
(cost=13802.19..13802.19 rows=268619 width=110) (actual time=123.008..123.009
rows=214895 loops=3)
Buckets: 65536 (originally
65536) Batches: 32 (originally 16) Memory Usage: 9792kB
-> Parallel Seq Scan on "User"
u (cost=0.00..13802.19 rows=268619 width=110) (actual time=0.045..44.468
rows=2148
95 loops=3)
Planning Time: 6.800 ms
Execution Time: 21202.833 ms
(33 rows)
psql-17.5-5432-pgsql=#
```
**SQL command**
```
psql-17.5-5432-pgsql=# explain analyze select count(*), u.id from
goodreads_graph_x."User" u, goodreads_graph_x."HAS_INTERACTION" h,
goodreads_graph_x."Book" b where u.id = h.start_id and b.id = h.end_id GROUP BY
u.id ORDER BY 1 DESC LIMIT 10;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------
Limit (cost=828731.73..828731.75 rows=10 width=16) (actual
time=5160.862..5183.487 rows=10 loops=1)
-> Sort (cost=828731.73..830343.44 rows=644686 width=16) (actual
time=5160.860..5183.484 rows=10 loops=1)
Sort Key: (count(*)) DESC
Sort Method: top-N heapsort Memory: 25kB
-> Finalize GroupAggregate (cost=651469.34..814800.29 rows=644686
width=16) (actual time=5030.692..5154.455 rows=132661 loops=1)
Group Key: u.id
-> Gather Merge (cost=651469.34..801906.57 rows=1289372
width=16) (actual time=5030.671..5092.293 rows=135523 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=650469.31..652081.03 rows=644686
width=16) (actual time=5023.358..5031.229 rows=45174 loops=3)
Sort Key: u.id
Sort Method: quicksort Memory: 2980kB
Worker 0: Sort Method: quicksort Memory: 2901kB
Worker 1: Sort Method: quicksort Memory: 2963kB
-> Partial HashAggregate
(cost=538245.23..577244.80 rows=644686 width=16) (actual
time=4987.979..5002.350 rows=45174 loops=3)
Group Key: u.id
Planned Partitions: 8 Batches: 1 Memory
Usage: 6929kB
Worker 0: Batches: 1 Memory Usage: 6673kB
Worker 1: Batches: 1 Memory Usage: 6673kB
-> Parallel Hash Join
(cost=19772.67..303865.71 rows=4166747 width=8) (actual time=2829.474..3983.936
rows=3333333 loops=3)
Hash Cond: (h.start_id = u.id)
-> Parallel Hash Join
(cost=2374.89..219877.18 rows=4166747 width=8) (actual time=17.717..1958.554
rows=3333333 loops=3
)
Hash Cond: (h.end_id = b.id)
-> Parallel Seq Scan on
"HAS_INTERACTION" h (cost=0.00..184515.47 rows=4166747 width=16) (actual
time=0.057..527.
239 rows=3333333 loops=3)
-> Parallel Hash
(cost=1888.44..1888.44 rows=38916 width=8) (actual time=16.894..16.895
rows=31133 loops=3)
Buckets: 131072 Batches:
1 Memory Usage: 4736kB
-> Parallel Index Only
Scan using "Book_id_idx" on "Book" b (cost=0.29..1888.44 rows=38916 width=8)
(actual
time=0.054..7.187 rows=31133 loops=3)
Heap Fetches: 0
-> Parallel Hash
(cost=12990.05..12990.05 rows=268619 width=8) (actual time=117.643..117.644
rows=214895 loops=3)
Buckets: 262144 (originally
262144) Batches: 16 (originally 8) Memory Usage: 12352kB
-> Parallel Index Only Scan
using "User_id_idx" on "User" u (cost=0.42..12990.05 rows=268619 width=8)
(actual tim
e=0.080..52.500 rows=214895 loops=3)
Heap Fetches: 0
Planning Time: 0.899 ms
Execution Time: 5186.043 ms
(34 rows)
psql-17.5-5432-pgsql=#
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]