MatheusFarias03 opened a new issue, #1517:
URL: https://github.com/apache/age/issues/1517

   Hi folks! I've been working on a Python project that collects 3851 vertices 
and creates 12507 edges. I want to store this data in AGE, but I'm not quite 
sure about the best approach.
   
   The data collection is performed using Python's Beautiful Soup, a web 
scraping library. I then create objects representing vertices, edges, and the 
graph. Vertices and edges are stored in their respective arrays within the 
graph class.
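To make that layout concrete, here is a minimal sketch of such a structure. The class and field names (`Vertex`, `Edge`, `Graph`, `from_vertex`, `to_vertex`) are assumptions made for illustration and are not taken from the actual project:

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    # Label and properties later interpolated into the Cypher query.
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    # An edge keeps references to its two endpoint vertices.
    label: str
    from_vertex: Vertex
    to_vertex: Vertex


@dataclass
class Graph:
    # Vertices and edges are kept in plain lists on the graph object.
    name: str
    vertices: list = field(default_factory=list)
    edges: list = field(default_factory=list)
```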
   
   For creating vertices in AGE, I use the MERGE clause to check if a vertex 
with the same label and properties already exists in the graph before creating 
it:
   
   ```python
   query = f'''
   SELECT * FROM cypher('{graph_name}', $$ 
   MERGE (:{vertex.label} {properties}) 
   $$) AS (n agtype); 
   '''
   ```
   
   This works well for creating each vertex independently.
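The snippet above leaves open how `{properties}` is produced. One possible way, shown only as a sketch, is to render a Python dict as a Cypher map literal. The helper names `to_cypher_map` and `merge_vertex_query` are hypothetical, and the approach assumes that property keys are valid Cypher identifiers, that values are JSON-serializable, and that no value contains the `$$` delimiter:

```python
import json


def to_cypher_map(properties: dict) -> str:
    # Keys are emitted unquoted; values are rendered with json.dumps, which
    # yields double-quoted strings, numbers, true/false and null.
    parts = [f"{key}: {json.dumps(value)}" for key, value in properties.items()]
    return "{" + ", ".join(parts) + "}"


def merge_vertex_query(graph_name: str, label: str, properties: dict) -> str:
    # Same query shape as above, with the property map built from a dict.
    return (
        f"SELECT * FROM cypher('{graph_name}', $$ "
        f"MERGE (:{label} {to_cypher_map(properties)}) "
        f"$$) AS (n agtype);"
    )
```

Because the graph name and label are interpolated directly into the SQL text, they should come from trusted input only.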
   
   However, during testing with AGE, I noticed a problem when vertices have 
been created this way and the following query is then executed to add an edge:
   ```python
   query = f'''
   SELECT * FROM cypher('{graph_name}', $$
   MERGE (a:{from_v_label} {from_v_properties})
         -[e:{e_label}]->(b:{to_v_label} {to_v_properties})
   $$) AS (e agtype);
   '''
   ```
   
   If the vertices already exist, this query creates duplicates of them 
instead of reusing the existing ones.
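This matches how MERGE is defined in Cypher: the pattern is matched as a whole, and if the complete path is not found, the entire pattern is created, including vertices that already exist on their own. A commonly suggested workaround, given here only as a sketch that has not been benchmarked on AGE, is to MATCH the two vertices first and MERGE only the edge between the bound variables. The function name `merge_edge_query` is hypothetical, and the property arguments are assumed to be pre-formatted Cypher map strings, as in the queries above:

```python
def merge_edge_query(graph_name, from_v_label, from_v_properties,
                     e_label, to_v_label, to_v_properties):
    # MATCH binds the existing endpoints; MERGE then only creates the edge
    # if it is missing. If either vertex is absent, nothing is created.
    return f'''
SELECT * FROM cypher('{graph_name}', $$
MATCH (a:{from_v_label} {from_v_properties}), (b:{to_v_label} {to_v_properties})
MERGE (a)-[e:{e_label}]->(b)
$$) AS (e agtype);
'''
```

With this shape, the vertices would still need to be created beforehand, for example with the single-vertex MERGE shown earlier.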
   
   So my question is: what is the best way to create vertices and edges 
without duplicating vertices that may already exist, while keeping good 
performance? Thank you for taking the time to read my question.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@age.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
