rafsun42 commented on issue #1090: URL: https://github.com/apache/age/issues/1090#issuecomment-1658810204
@Zainab-Saad I was thinking about an algorithm like the one that merges two sorted arrays. In our case, the two inputs are the vertex table and the edge CSV file. We are not actually merging them; we are just scanning the vertex table to get the new ID.

Sort the vertex table by `id`. Sort the edge CSV file by `start_id` and `end_id`. You can set up a cursor to scan the vertex table row by row. As you read the edge CSV file, you can get the next vertex from the cursor instead of scanning the entire vertex table. Because both inputs are sorted, you need to scan the vertex table and the CSV file only once. This is just an idea. One drawback is that sorting will cost time, but it may be worth it for large CSV files.

Another idea that Shoaib suggested is to use the IDs in the CSV file as the actual IDs (instead of getting them from the sequence) while importing vertices. Once imported, set the sequence to reflect the max ID in the CSV + 1.

Both of these are just ideas; you may need to think about their correctness and any potential edge cases.
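
To make the sorted-scan idea concrete, here is a rough Python sketch. It is not AGE code: `resolve_ids` and `remap_edges` are hypothetical names, vertices are modeled as `(csv_id, new_id)` pairs, and edges as `(start_id, end_id)` pairs. One assumption of mine that goes beyond the comment: since a single sort order cannot keep both `start_id` and `end_id` ascending, the sketch resolves each column in its own sorted pass over the same vertex ordering.

```python
def resolve_ids(sorted_vertices, sorted_csv_ids):
    """Merge-style scan over two ascending inputs.

    sorted_vertices: (csv_id, new_id) pairs in ascending csv_id order.
    sorted_csv_ids:  ascending CSV ids taken from one edge column.
    Returns the new id for each CSV id, in the same order.
    """
    cursor = iter(sorted_vertices)
    current = next(cursor, None)
    resolved = []
    for csv_id in sorted_csv_ids:
        # Advance the cursor only forward; it never restarts.
        while current is not None and current[0] < csv_id:
            current = next(cursor, None)
        if current is None or current[0] != csv_id:
            raise KeyError(f"edge references unknown vertex id {csv_id}")
        # Do not advance on a match: several edges may share a vertex.
        resolved.append(current[1])
    return resolved


def remap_edges(vertices, edges):
    """Map (start_id, end_id) CSV pairs to new ids, keeping edge order.

    Assumption: one sorted pass per column (see lead-in).
    """
    vertices = sorted(vertices)
    by_start = sorted(range(len(edges)), key=lambda i: edges[i][0])
    by_end = sorted(range(len(edges)), key=lambda i: edges[i][1])
    new_start = resolve_ids(vertices, [edges[i][0] for i in by_start])
    new_end = resolve_ids(vertices, [edges[i][1] for i in by_end])

    result = [[None, None] for _ in edges]
    for pos, i in enumerate(by_start):
        result[i][0] = new_start[pos]
    for pos, i in enumerate(by_end):
        result[i][1] = new_end[pos]
    return [tuple(pair) for pair in result]


if __name__ == "__main__":
    vertices = [(3, 103), (1, 101), (2, 102)]   # (csv_id, new_id)
    edges = [(2, 1), (1, 3), (2, 3)]            # (start_id, end_id)
    print(remap_edges(vertices, edges))
```

Each pass touches every vertex and every edge at most once, so the cost is dominated by the sorts, which matches the trade-off mentioned above. In a real import, the sorting would more likely be done by the database or an external sort than in memory.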
