MatheusFarias03 opened a new issue, #1517:
URL: https://github.com/apache/age/issues/1517
Hi folks! I've been working on a Python project that collects 3851 vertices
and creates 12507 edges. I want to store this data in AGE, but I'm not quite
sure about the best approach.
The data collection is performed using Python's Beautiful Soup, a web
scraping library. I then create objects representing vertices, edges, and the
graph. Vertices and edges are stored in their respective arrays within the
graph class.
For creating vertices in AGE, I use the MERGE clause to check if a vertex
with the same label and properties already exists in the graph before creating
it:
```python
query = f'''
SELECT * FROM cypher('{graph_name}', $$
MERGE (:{vertex.label} {properties})
$$) AS (n agtype);
'''
```
This works well for creating each vertex independently.
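For context, here is a minimal sketch of how the `properties` string in the query above might be built from a Python dict. The helper names `to_cypher_map` and `vertex_merge_query` are hypothetical (they are not part of AGE or of the project), and the sketch assumes property values are plain strings, numbers, booleans, or `None`:

```python
import json


def to_cypher_map(props: dict) -> str:
    """Render a dict as a Cypher map literal, e.g. {name: "Ada", age: 36}.

    Assumption: values are scalars only (str, int, float, bool, None).
    Caveat: a string value containing "$$" would end the dollar-quoted
    Cypher body early, so such values need separate handling.
    """
    parts = []
    for key, value in props.items():
        if not isinstance(value, (str, int, float, bool, type(None))):
            raise TypeError(f"unsupported value for {key!r}: {type(value).__name__}")
        # json.dumps yields a double-quoted, escaped string for str values
        # and bare literals (36, true, null) for the other scalar types.
        parts.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
    return "{" + ", ".join(parts) + "}"


def vertex_merge_query(graph_name: str, label: str, props: dict) -> str:
    """Build the per-vertex MERGE statement shown above."""
    return (
        f"SELECT * FROM cypher('{graph_name}', $$\n"
        f"MERGE (:{label} {to_cypher_map(props)})\n"
        f"$$) AS (n agtype);"
    )


# Example call with made-up data
print(vertex_merge_query("my_graph", "Person", {"name": "Ada", "age": 36}))
```

Building the map in one place keeps quoting consistent across all vertices, which matters because MERGE only treats two vertices as the same when label and properties match exactly.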
However, while testing with AGE, I noticed a problem when the vertices have
already been created this way and the following query is then executed to
create an edge between them:
```python
query = f'''
SELECT * FROM cypher('{graph_name}', $$
MERGE (a:{from_v_label} {from_v_properties})-[e:{e_label}]->(b:{to_v_label} {to_v_properties})
$$) AS (e agtype);
'''
```
If the vertices already exist, this query creates duplicates of them instead
of reusing the existing ones.
So my question is: what is the best way to create vertices and edges with good
performance, without duplicating vertices that may already exist?
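One direction that may be worth trying, sketched here under assumptions rather than as a confirmed answer: MATCH the two endpoint vertices first and MERGE only the edge between the bound variables, so the MERGE pattern no longer contains the vertex definitions. The function name `edge_merge_query` is hypothetical, the property arguments are assumed to be pre-rendered Cypher map strings as in the queries above, and I have not measured how this performs on the full data set:

```python
def edge_merge_query(
    graph_name: str,
    from_v_label: str,
    from_v_properties: str,
    e_label: str,
    to_v_label: str,
    to_v_properties: str,
) -> str:
    """Build a query that looks up both endpoints, then merges only the edge.

    Assumption: both vertices were created beforehand by the per-vertex
    MERGE. If either MATCH finds nothing, no rows reach the MERGE and no
    edge is created.
    """
    return (
        f"SELECT * FROM cypher('{graph_name}', $$\n"
        # MATCH binds a and b to existing vertices instead of describing new ones
        f"MATCH (a:{from_v_label} {from_v_properties}), "
        f"(b:{to_v_label} {to_v_properties})\n"
        # MERGE now covers only the relationship between the bound vertices
        f"MERGE (a)-[e:{e_label}]->(b)\n"
        f"$$) AS (e agtype);"
    )


# Example call with made-up data
print(edge_merge_query(
    "my_graph", "Person", "{name: 'Ada'}", "KNOWS", "Person", "{name: 'Bob'}"
))
```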
Thank you for taking the time to read my question.