[GitHub] [age] Amr-Shams commented on issue #971: Is there any Good documentation how age load works

via GitHub Wed, 07 Jun 2023 01:20:25 -0700


Amr-Shams commented on issue #971:
URL: https://github.com/apache/age/issues/971#issuecomment-1580180448


   Parsing the data from the CSV file works as follows.
   
   
   ## 1. CSV file structure:
   
   - **First row**: Header row describing the content of each column.
   - **Subsequent rows**: Edge data with the following fields:
     1. Start node ID (integer)
     2. Start node label (string)
     3. End node ID (integer)
     4. End node label (string)
     5. Additional properties (optional)
   
   Here is more detail about each function used:

   **csv_edge_reader struct**: This structure holds the state of the CSV 
parser, including the fields, header, graph name, graph ID, object name, object 
ID, and other related information.
   
   **edge_field_cb()**: This is a callback function called for each field in 
the CSV file. It stores the field in the csv_edge_reader struct, reallocating 
memory as needed.
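
The "store the field, reallocating as needed" step can be sketched roughly as below. This is a minimal illustration, not the AGE source: `field_buffer`, `store_field()` and `reset_fields()` are hypothetical names, and the callback signature assumes a libcsv-style field callback (field bytes, length, user data).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for csv_edge_reader: only the state
 * needed to collect the fields of the current row. */
typedef struct {
    char  **fields;     /* heap copies of the fields seen so far */
    size_t  cur_field;  /* number of fields currently stored */
    size_t  alloc;      /* capacity of the fields array */
    int     error;      /* set to 1 if an allocation fails */
} field_buffer;

/* Same shape as a libcsv field callback: (field bytes, length, user data). */
static void store_field(void *field, size_t len, void *data)
{
    field_buffer *buf = (field_buffer *) data;
    char **grown;
    char *copy;

    assert(buf != NULL);
    if (buf->error)
        return;

    /* Double the capacity when the array is full (start with 4 slots). */
    if (buf->cur_field == buf->alloc) {
        size_t new_alloc = buf->alloc ? buf->alloc * 2 : 4;

        grown = (char **) realloc(buf->fields, new_alloc * sizeof(char *));
        if (grown == NULL) {
            buf->error = 1;
            return;
        }
        buf->fields = grown;
        buf->alloc = new_alloc;
    }

    /* Copy the bytes and NUL-terminate them, so the stored field does not
     * depend on the parser's own buffer. */
    copy = (char *) malloc(len + 1);
    if (copy == NULL) {
        buf->error = 1;
        return;
    }
    if (len > 0)
        memcpy(copy, field, len);
    copy[len] = '\0';
    buf->fields[buf->cur_field++] = copy;
}

/* Free every stored field so the buffer can be reused for the next row. */
static void reset_fields(field_buffer *buf)
{
    size_t i;

    for (i = 0; i < buf->cur_field; i++)
        free(buf->fields[i]);
    buf->cur_field = 0;
}
```

Keeping the array between rows and only resetting `cur_field` avoids reallocating for every row; whether AGE does exactly this is not something this sketch claims.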
   
   **edge_row_cb()**: This is a callback function called for each row in the 
CSV file. If the row is the first row (header row), it stores the header 
information. For other rows, it processes the fields to extract start and end 
nodes, properties, and other edge-related information. It then calls 
insert_edge_simple() to insert the edge into the graph.
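
The header-versus-data-row branching described for edge_row_cb() might look roughly like this. It is a sketch under assumptions: `row_state`, `handle_row()` and `record_edge()` are hypothetical names, `record_edge()` merely stands in for insert_edge_simple(), and the signature assumes a libcsv-style row callback (terminating character, user data).

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_FIELDS 16

/* Hypothetical, reduced reader state; the real csv_edge_reader holds more. */
typedef struct {
    const char *fields[MAX_FIELDS];  /* fields of the current row */
    size_t      cur_field;           /* how many fields the row has */
    const char *header[MAX_FIELDS];  /* column names from the first row */
    size_t      header_num;          /* number of header columns */
    size_t      row;                 /* rows completed so far */
    size_t      edges;               /* data rows turned into edges */
    long long   last_start_id;       /* endpoints of the last edge */
    long long   last_end_id;
    size_t      last_n_props;        /* property count of the last edge */
} row_state;

/* Stand-in for insert_edge_simple(): it only records what it was given. */
static void record_edge(row_state *st, long long start_id,
                        long long end_id, size_t n_props)
{
    st->last_start_id = start_id;
    st->last_end_id = end_id;
    st->last_n_props = n_props;
    st->edges++;
}

/* Same shape as a libcsv row callback: (terminating character, user data).
 * The caller must keep cur_field <= MAX_FIELDS. */
static void handle_row(int term, void *data)
{
    row_state *st = (row_state *) data;
    size_t i;

    (void) term;
    assert(st != NULL);

    if (st->row == 0) {
        /* First row: remember the column names. The pointers are only
         * borrowed here; a real reader would copy the strings. */
        st->header_num = st->cur_field;
        for (i = 0; i < st->cur_field; i++)
            st->header[i] = st->fields[i];
    } else if (st->cur_field >= 4) {
        /* Data row: fields 0..3 are start ID, start label, end ID and
         * end label; any remaining fields are properties. */
        long long start_id = strtoll(st->fields[0], NULL, 10);
        long long end_id = strtoll(st->fields[2], NULL, 10);

        record_edge(st, start_id, end_id, st->cur_field - 4);
    }

    /* Get ready for the next row. */
    st->cur_field = 0;
    st->row++;
}
```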
   
   **is_space() and is_term()**: These are utility functions used to customize 
the CSV parser's behavior when detecting space and line terminator characters, 
respectively.
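
As an illustration, predicates of this kind could be as small as the ones below. The exact character sets are an assumption, not copied from the AGE source, and `trimmed_length()` is a hypothetical helper added only to show the predicates in use. If the parser is libcsv, such functions can be registered with `csv_set_space_func()` and `csv_set_term_func()`.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed behavior: only space and tab count as padding around a field. */
static int is_space(unsigned char c)
{
    return c == ' ' || c == '\t';
}

/* Assumed behavior: carriage return and line feed both end a row. */
static int is_term(unsigned char c)
{
    return c == '\r' || c == '\n';
}

/* Hypothetical helper: length of s once trailing padding and line
 * terminators are ignored. */
static size_t trimmed_length(const char *s, size_t len)
{
    assert(s != NULL);
    while (len > 0 && (is_space((unsigned char) s[len - 1]) ||
                       is_term((unsigned char) s[len - 1])))
        len--;
    return len;
}
```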
   
   **create_edges_from_csv_file()**: This is the main function responsible for 
reading the CSV file and processing it using the CSV parser. It initializes the 
parser, reads the file in chunks, calls the appropriate callbacks 
(edge_field_cb() and edge_row_cb()), and cleans up memory after processing is 
complete.
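
The "read the file in chunks" loop can be sketched without the parser itself. In this illustration, `read_in_chunks()`, `chunk_consumer`, `chunk_stats` and `count_chunk()` are hypothetical names, and the 1024-byte buffer size is an arbitrary choice. The consumer returns the number of bytes it accepted, mirroring how libcsv's `csv_parse()` reports the bytes it processed, so a short count is treated as an error.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical consumer type: returns how many bytes it accepted. */
typedef size_t (*chunk_consumer)(const void *chunk, size_t len, void *data);

/* Feed the whole stream to the consumer, one buffer at a time.
 * Returns 0 on success and -1 on a read error or a short consume. */
static int read_in_chunks(FILE *fp, chunk_consumer consume, void *data)
{
    char   buf[1024];
    size_t n;

    assert(fp != NULL && consume != NULL);

    while ((n = fread(buf, 1, sizeof(buf), fp)) > 0) {
        if (consume(buf, n, data) != n)
            return -1;
    }
    return ferror(fp) ? -1 : 0;
}

/* Example consumer that only counts what it sees. */
typedef struct {
    size_t chunks;
    size_t bytes;
} chunk_stats;

static size_t count_chunk(const void *chunk, size_t len, void *data)
{
    chunk_stats *stats = (chunk_stats *) data;

    (void) chunk;
    stats->chunks++;
    stats->bytes += len;
    return len;
}
```

In a real loader the consumer would hand each chunk to the CSV parser, which in turn fires the field and row callbacks; cleanup (closing the file, freeing parser state) would follow the loop.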
   
   
   From reading the code, I found the following, which might be useful.
   The CSV file should follow these constraints:
   
   **Header row**: The CSV file must have a header row that describes the 
content of each column. The code assumes that the first row of the file is the 
header row.
   
   **Field values**: The start and end node IDs must be integers. The start and 
end node labels should be strings.
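
To check the integer constraint before loading a file, a strict parse along these lines could be used. `parse_node_id()` is a hypothetical helper, and the loader itself may validate IDs differently; this only shows one way to reject values such as `4x2` or an empty field.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Returns 1 and stores the value in *out if text is a complete base-10
 * integer that fits in a long long; returns 0 and leaves *out untouched
 * otherwise. Leading whitespace is accepted, as strtoll() allows it. */
static int parse_node_id(const char *text, long long *out)
{
    char     *end = NULL;
    long long value;

    assert(out != NULL);
    if (text == NULL || *text == '\0')
        return 0;

    errno = 0;
    value = strtoll(text, &end, 10);

    /* Reject overflow, input with no digits, and trailing characters. */
    if (errno == ERANGE || end == text || *end != '\0')
        return 0;

    *out = value;
    return 1;
}
```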
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

