rafsun42 commented on issue #1090:
URL: https://github.com/apache/age/issues/1090#issuecomment-1658810204

   @Zainab-Saad 
   I was thinking about an algorithm like the one that merges two sorted arrays; 
in our case, the vertex table and the edge CSV file. We would not actually merge 
them, only scan the vertex table to look up the new id.
   
   Sort the vertex table by `id`, and sort the edge CSV file by `start_id` and 
`end_id`. Set up a cursor to scan the vertex table row by row. As you 
read the edge CSV file, get the next vertex from the cursor instead of 
scanning the entire vertex table. Because both inputs are sorted, you need to 
scan the vertex table and the CSV file only once.
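
   A minimal Python sketch of the cursor idea, as an in-memory model rather than 
AGE's loader code. The names `resolve_column` and `remap_edges`, and the 
representation of the vertex table as `(csv_id, new_id)` pairs, are assumptions 
made for illustration. The sketch runs one forward pass per edge column, because 
a single sort order cannot be ascending in both `start_id` and `end_id`.

```python
def resolve_column(edges, vertices, col):
    """Look up the new id for one edge column in a single forward pass.

    edges:    list of (start_id, end_id) tuples holding CSV ids
    vertices: list of (csv_id, new_id) pairs, ascending by csv_id
    col:      0 for start_id, 1 for end_id
    """
    # Visit the edges in ascending order of the chosen column.
    order = sorted(range(len(edges)), key=lambda i: edges[i][col])
    resolved = [None] * len(edges)

    cursor = iter(vertices)          # stands in for a table cursor
    current = next(cursor, None)
    for i in order:
        key = edges[i][col]
        # Advance the cursor; it never moves backward.
        while current is not None and current[0] < key:
            current = next(cursor, None)
        if current is None or current[0] != key:
            raise KeyError(f"no vertex with CSV id {key}")
        resolved[i] = current[1]
    return resolved


def remap_edges(edges, vertices):
    """Return the edges with both endpoints translated to new ids."""
    vertices = sorted(vertices)      # sort the vertex table by CSV id
    starts = resolve_column(edges, vertices, 0)
    ends = resolve_column(edges, vertices, 1)
    return list(zip(starts, ends))


vertices = [(1, 101), (2, 102), (3, 103)]
edges = [(3, 1), (1, 2), (2, 2)]
print(remap_edges(edges, vertices))
```

   Each pass costs one sort of the edges plus one linear scan of the vertex 
table, instead of a full vertex-table scan per edge.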
   
   This is just an idea. One drawback is that sorting costs time, but it may be 
worth it for large CSV files.
   
   Another idea, suggested by Shoaib, is to use the IDs in the CSV file as the 
actual IDs (instead of getting them from the sequence) while importing vertices. 
Once imported, set the sequence to the max ID in the CSV + 1.
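
   A rough Python sketch of this second idea, again as an in-memory model. The 
`Sequence` class and `import_vertices` function are hypothetical stand-ins, not 
AGE or PostgreSQL APIs, and the sketch ignores how AGE actually encodes graph 
ids. Never moving the sequence backward is an extra assumption added here to 
cover the case where the sequence is already past the largest CSV id.

```python
class Sequence:
    """Hypothetical stand-in for a database sequence."""

    def __init__(self, start=1):
        self.next_value = start

    def nextval(self):
        value = self.next_value
        self.next_value += 1
        return value


def import_vertices(csv_ids, table, seq):
    """Insert vertices under their CSV ids, then move the sequence past them."""
    for csv_id in csv_ids:
        if csv_id in table:
            # Duplicate or pre-existing id: one edge case to think about.
            raise ValueError(f"id {csv_id} already exists")
        table[csv_id] = {"id": csv_id}
    if csv_ids:
        # Max CSV id + 1, but never lower than where the sequence already is.
        seq.next_value = max(seq.next_value, max(csv_ids) + 1)


table = {}
seq = Sequence()
import_vertices([5, 2, 9], table, seq)
print(seq.nextval())  # first id handed out after the import
```

   With this approach the edge file can reference vertices by their CSV ids 
directly, so no lookup from old id to new id is needed.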
   
   Both of these are just ideas; you may need to think about their correctness 
and any potential edge cases.
   


