Adding support for loading CSV into a graph using Gremlin's GraphReader will lower the entry barrier for new users. A lot of data is already in CSV format and a lot of existing databases/repositories allow users to export their data as CSV.
I'd like to add this capability to the Gremlin core as a new GraphReader implementation. Since CSV data doesn't map directly to vertices and edges, I'm planning to do the loading in two steps:

*Nodes*
The first step is to load a vertex CSV file. I'll create a vertex for every line in the CSV and a property for each column on that line. If the CSV has column headers, the column names become the names of the corresponding vertex properties. Otherwise, they'll be prop1, prop2, etc. (There are other ways to do it as well, but I'm just trying to show the general idea.)

*Edges*
The second step is to load the edge CSV file, which will be in the following format:

vertex1 prop name (source vertex), vertex2 prop name (destination vertex), bidirectional (TRUE/FALSE), prop1, prop2, prop3, etc.

For each line in the edge CSV file, the reader will search for a vertex with the vertex1 prop value to find the source vertex (the caller needs to ensure the value is unique), search for a vertex with the vertex2 prop value to find the destination vertex, and then create an edge that ties the two together. An edge property will be created for each additional column on the line.

Thoughts?

Alaa
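
P.S. To make the two steps concrete, here is a rough sketch in plain Java. It is not TinkerPop code and not a proposed API: `CsvGraphSketch`, `loadVertices`, `loadEdges`, `find` and the map-based vertices are made-up stand-ins for the real Graph/Vertex/Edge types. I'm also assuming that the edge file has a header row whose first two column names are the vertex property names used for the lookup, and that fields contain no quoted commas.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a CSV GraphReader; plain maps replace real vertices.
class CsvGraphSketch {
    // Hypothetical edge holder: endpoints, the TRUE/FALSE flag, and extra properties.
    static class Edge {
        final Map<String, String> source;
        final Map<String, String> target;
        final boolean bidirectional;
        final Map<String, String> props;

        Edge(Map<String, String> source, Map<String, String> target,
             boolean bidirectional, Map<String, String> props) {
            this.source = source;
            this.target = target;
            this.bidirectional = bidirectional;
            this.props = props;
        }
    }

    final List<Map<String, String>> vertices = new ArrayList<>();
    final List<Edge> edges = new ArrayList<>();

    // Naive split (assumption): no quoted fields or embedded commas.
    private static String[] cells(String line) {
        return line.split(",", -1);
    }

    // Step 1: one vertex per data line, one property per column.
    void loadVertices(List<String> lines, boolean hasHeader) {
        String[] names = hasHeader ? cells(lines.get(0)) : null;
        for (int i = hasHeader ? 1 : 0; i < lines.size(); i++) {
            String[] row = cells(lines.get(i));
            Map<String, String> vertex = new LinkedHashMap<>();
            for (int c = 0; c < row.length; c++) {
                // Fall back to prop1, prop2, ... when there is no header name.
                String key = (names != null && c < names.length)
                        ? names[c].trim() : "prop" + (c + 1);
                vertex.put(key, row[c].trim());
            }
            vertices.add(vertex);
        }
    }

    // Linear scan returning the first match; the caller must keep values unique.
    private Map<String, String> find(String key, String value) {
        for (Map<String, String> vertex : vertices) {
            if (value.equals(vertex.get(key))) {
                return vertex;
            }
        }
        return null;
    }

    // Step 2: resolve both endpoints by property value, then create the edge.
    void loadEdges(List<String> lines) {
        String[] header = cells(lines.get(0));
        String sourceKey = header[0].trim();
        String targetKey = header[1].trim();
        for (int i = 1; i < lines.size(); i++) {
            String[] row = cells(lines.get(i));
            Map<String, String> source = find(sourceKey, row[0].trim());
            Map<String, String> target = find(targetKey, row[1].trim());
            if (source == null || target == null) {
                throw new IllegalArgumentException("No matching vertex for edge line " + i);
            }
            // Columns after the bidirectional flag become edge properties.
            Map<String, String> props = new LinkedHashMap<>();
            for (int c = 3; c < row.length; c++) {
                String key = c < header.length ? header[c].trim() : "prop" + (c - 2);
                props.put(key, row[c].trim());
            }
            boolean bidirectional = Boolean.parseBoolean(row[2].trim());
            edges.add(new Edge(source, target, bidirectional, props));
        }
    }
}
```

With a vertex file of `name,age` / `alice,30` / `bob,25` and an edge file of `name,name,bidirectional,since` / `alice,bob,TRUE,2012`, this would give two vertices and one edge from alice to bob carrying a `since` property. The linear lookup in `find` is only there to keep the sketch short; a real reader would presumably use an index on the lookup property.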
