vladiksun opened a new issue, #2020: URL: https://github.com/apache/age/issues/2020
**Describe the bug**
Potential memory leak.

**How are you accessing AGE (Command line, driver, etc.)?**
- JDBC

**What data setup do we need to do?**
```pgsql
select ag_catalog.create_graph('test_graph');

--- create test labels ---
select ag_catalog.create_vlabel('test_graph','domain');
select ag_catalog.create_vlabel('test_graph','user');

--- create test relation ---
select ag_catalog.create_elabel('test_graph','edges');

--- generate test data, should take about 2 minutes ---
DO $do$
BEGIN
    FOR i IN 1..8000 LOOP
        EXECUTE format('
            select * from ag_catalog.cypher(''test_graph'', $$
                CREATE (Org_level_1_%s: domain {id:"Org_level_1_%s", type:"Organization", name: "Org_level_1_%s some real name that could looooooooooooooong" ,hidden:false })
                CREATE (Org_level_2_%s: domain {id:"Org_level_2_%s", type:"Organization", name: "Org_level_2_%s some real name that could looooooooooooooong" ,hidden:false })
                CREATE (Org_level_3_%s: domain {id:"Org_level_3_%s", type:"Organization", name: "Org_level_3_%s some real name that could looooooooooooooong" ,hidden:false })
                CREATE (Org_level_4_%s: domain {id:"Org_level_4_%s", type:"Organization", name: "Org_level_4_%s some real name that could looooooooooooooong" ,hidden:false })
                CREATE (Dep_level_5_%s: domain {id:"Dep_level_5_%s", type:"Department", name: "Dep_level_5_%s some real name that could looooooooooooooong" ,hidden:false })
                CREATE (User_%s: user { id:"User_%s", type:"User", name: "user_%s" ,hidden:false })
                CREATE (User_%s)-[_rel_user:edges { from: ''User_%s'', to: ''Dep_level_5_%s'', hidden: false, toCollection: ''domain'', fromCollection: ''user'', relation: ''user-domain'' }]->(Dep_level_5_%s)
                CREATE (Dep_level_5_%s)-[_rel1:edges { from: ''Dep_level_5_%s'', to: ''Org_level_4_%s'', hidden: false, toCollection: ''domain'', fromCollection: ''domain'', relation: ''parent'' }]->(Org_level_4_%s)
                CREATE (Org_level_4_%s)-[_rel2:edges { from: ''Org_level_4_%s'', to: ''Org_level_3_%s'', hidden: false, toCollection: ''domain'', fromCollection: ''domain'', relation: ''parent'' }]->(Org_level_3_%s)
                CREATE (Org_level_3_%s)-[_rel3:edges { from: ''Org_level_3_%s'', to: ''Org_level_2_%s'', hidden: false, toCollection: ''domain'', fromCollection: ''domain'', relation: ''parent'' }]->(Org_level_2_%s)
                CREATE (Org_level_2_%s)-[_rel4:edges { from: ''Org_level_2_%s'', to: ''Org_level_1_%s'', hidden: false, toCollection: ''domain'', fromCollection: ''domain'', relation: ''parent'' }]->(Org_level_1_%s)
            $$ ) as (any_vertex ag_catalog.agtype)
        ', VARIADIC ARRAY(SELECT array_fill('iter_' || i::text, '{100}')) );
    END LOOP;
END $do$;
```
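As an optional sanity check (not part of the original report), the generated fixture can be verified with a quick count before reproducing the crash; the 8000 loop iterations above should produce 8,000 `user` vertices and 40,000 `domain` vertices:

```pgsql
-- Optional sanity check: confirm the generated data before running the failing query.
-- Expected: 8000 user vertices (one per loop iteration).
select * from ag_catalog.cypher('test_graph', $$
    MATCH (u: user) RETURN count(u)
$$) as (user_count ag_catalog.agtype);

-- Expected: 40000 domain vertices (five per loop iteration).
select * from ag_catalog.cypher('test_graph', $$
    MATCH (d: domain) RETURN count(d)
$$) as (domain_count ag_catalog.agtype);
```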
**What is the command that caused the error?**
```pgsql
-- run all three scripts below

-- create some data
select * from ag_catalog.cypher('test_graph', $$
    CREATE (d: domain { id: "test1", type: "Organization", name: "test1" , hidden: false } )
    RETURN d
$$ ) as (vertex agtype);

-- delete some data
select * from ag_catalog.cypher('test_graph', $$
    MATCH (any_vertex: domain { id: "test1" } )
    DELETE any_vertex
    RETURN any_vertex
$$) as (catalog ag_catalog.agtype);

-- run query
select domain from ag_catalog.cypher('test_graph',$$
    MATCH (user_vertex: user { hidden: false, id: 'User_iter_100'})
        -[e1:edges {hidden: false, fromCollection: 'user', toCollection: 'domain', relation: 'user-domain'}]->(to1: domain {hidden: false})
        -[e2:edges*0..10 {hidden: false, fromCollection: 'domain', toCollection: 'domain', relation: 'parent'}]->(to2: domain {hidden: false})
    RETURN to2
$$) as domain(domain ag_catalog.agtype);
```
```
Database went into recovery mode, could be seen as an IO error on the client side.
```

**Expected behavior**
No server crash happens.

**Environment (please complete the following information):**
- Postgres 14 + Apache AGE 1.5.0

**Additional context**
It looks like any Cypher script's result is being cached after the first run. The second run and all subsequent runs are faster unless any data has been created/updated/deleted. If a create/update/delete happens, the script executes slower again. This leads to increased DB memory consumption that is never released. For our real data (about 20 vertex labels with different connectedness, over 3 million edges) a similar script, such as the one below, executes in around 5 seconds the first time and around 300 ms on subsequent runs, until a create/update/delete happens.

```pgsql
select domain from ag_catalog.cypher('test_graph',$$
    MATCH (user_vertex: user { hidden: false, id: 'User_iter_100'})
        -[e1:edges {hidden: false, fromCollection: 'user', toCollection: 'domain', relation: 'user-domain'}]->(to1: domain {hidden: false})
        -[e2:edges*0..10 {hidden: false, fromCollection: 'domain', toCollection: 'domain', relation: 'parent'}]->(to2: domain {hidden: false})
    RETURN to2
$$) as domain(domain ag_catalog.agtype);
```

Please take a look at the attached screenshot.
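One way to make the memory growth visible (a suggestion beyond the original report, using PostgreSQL 14's standard `pg_backend_memory_contexts` view rather than anything AGE-specific) is to query the backend's memory contexts from the same session before and after repeating the Cypher query:

```pgsql
-- Snapshot of the current backend's memory contexts (PostgreSQL 14+).
-- Running this from the same JDBC/psql session before and after several
-- executions of the Cypher query shows which contexts keep growing.
select name, parent, total_bytes, used_bytes
from pg_backend_memory_contexts
order by total_bytes desc
limit 20;
```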