Yashmitha1 opened a new issue, #2177: URL: https://github.com/apache/age/issues/2177
Hi, I am trying to insert/update around 70K edges into the graph using age. I created the following function: CREATE OR REPLACE FUNCTION create_update_edge(edge_list TEXT) RETURNS void AS $$ DECLARE query TEXT; BEGIN query := format($q$ SELECT * FROM ag_catalog.cypher( 'test_attribute_instance_graph', $cypher$ UNWIND %s AS edge MERGE (a:attribute_instance {name: edge.source_name, value: edge.source_value})-[r:cooccurs_with]-(b:attribute_instance {name: edge.target_name, value: edge.target_value}) SET r.count = edge.count SET r.first_seen = edge.first_seen SET r.last_seen = edge.last_seen $cypher$ ) AS (ignored agtype); $q$, edge_list); EXECUTE query; RAISE NOTICE 'Edges merged and properties set.'; END; $$ LANGUAGE plpgsql; These are some of the points I had to take care of : 1. Function signature of ag_catalog.cyphe is (name, cstring, agtype)) So, I was not directly able to pass JSONB format as a parameter, and had to change the list of edges(each data is mapped to a dict) to a Cypher-compatible map(keys are unquoted) before passing it to the above function. 2. Trying to use MERGE..ON CREATE/ON MATCH SET gave syntax error, so I simply used MERGE SET to do the upsertion of edges. 3. Trying to B-tree indexing was a failed attempt, as again I was coming across a syntax error using the standard CREATE INDEX for B-tree command, so I used GIN Index. Under this setup and batch unwind (batch size 1000 edges): I got the following results | Edges Inserted | Time (s) | Time (min) | |----------------|----------|------------| | 1,000 | 0.62 | 0.01 | | 2,000 | 16.35 | 0.27 | | 3,000 | 22.42 | 0.37 | | 4,000 | 69.63 | 1.16 | | 5,000 | 104.75 | 1.75 | | 6,000 | 177.12 | 2.95 | | 7,000 | 447.66 | 7.46 | | 8,000 | 598.38 | 9.97 | | 9,000 | 913.82 | 15.23 | | 10,000 | 1320.22 | 22.00 | Clearly, the performance is pretty bad, so I was wondering if this is a limitation of age or a wrong implementation on my end. If someone could give me insights on my method and any other ways I should try inserting the data/or optimise the current method, it'd be of great help! Thankyou!! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@age.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org