Thanks for bringing this up for discussion and offering to work on it. You don't mention how you will deal with data types. Will you have some way to give users fine-grained control over that?
On Tue, Dec 1, 2015 at 10:46 AM, Alaa Mahmoud <[email protected]> wrote:
> Adding support for loading CSV into a graph using Gremlin's GraphReader
> will lower the entry barrier for new users. A lot of data is already in CSV
> format and a lot of existing databases/repositories allow users to export
> their data as CSV.
>
> I'd like to add this capability to the gremlin core as a new GraphReader
> instance. Since the CSV data doesn't map directly to nodes and vertexes,
> I'm planning to do the loading on two steps:
>
> *Nodes*
> The first is to load a CSV as vertex CSV file. I'll create a node for every
> line in the csv and a property for each column on that line. If the csv has
> column headers, then the names of the columns will be the names of the
> corresponding vertex property. Otherwise, It'll be prop1, prop2, etc...
> (There are other ways to do it as well, but I'm just trying to show the
> general idea)
>
> *Edges*
> The second step is loading the edges csv file which will be in the
> following format
>
> vertex1 prop name (source vertex), vertex2 prop name (destination vertex),
> bidirectional (TRUE/FALSE), prop1,prop2,prop3,etc...
>
> For each line in the edge csv file, the reader will search for a vertex
> with the vertex1 prop value (caller need to ensure it's unique) to find the
> source vertex, search for a destination vertex with destination prop value
> and then create an edge that ties the two together. We will be creating an
> edge property for each additional property on the line.
>
> Thoughts?
>
> Alaa
>
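
To make the data type question concrete, here is a rough sketch of the two-step load with optional per-column converters. It is plain Python over in-memory dicts, not the TinkerPop GraphReader API. The function names, the `converters` argument, and the `source`/`target`/`bidirectional` header names are all assumptions made up for illustration; columns without a converter simply stay strings.

```python
import csv
import io


def load_vertices(csv_text, converters=None):
    """Step 1: one vertex per CSV row, one property per column."""
    converters = converters or {}
    vertices = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Columns without a converter stay strings.
        vertices.append({name: converters.get(name, str)(value)
                         for name, value in row.items()})
    return vertices


def load_edges(csv_text, vertices, key, converters=None):
    """Step 2: look up both endpoints by a unique vertex property."""
    converters = converters or {}
    # The caller must ensure the key property is unique across vertices.
    index = {str(v[key]): v for v in vertices}
    edges = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Hypothetical header names for the first three columns.
        source = index[row.pop("source")]
        target = index[row.pop("target")]
        bidirectional = row.pop("bidirectional").strip().upper() == "TRUE"
        # Every remaining column becomes an edge property.
        props = {name: converters.get(name, str)(value)
                 for name, value in row.items()}
        edges.append((source, target, props))
        if bidirectional:
            # Model a bidirectional edge as a second, reversed edge.
            edges.append((target, source, dict(props)))
    return edges


VERTEX_CSV = "name,age\nmarko,29\nvadas,27\n"
EDGE_CSV = "source,target,bidirectional,weight\nmarko,vadas,TRUE,0.5\n"

vertices = load_vertices(VERTEX_CSV, converters={"age": int})
edges = load_edges(EDGE_CSV, vertices, key="name",
                   converters={"weight": float})
```

Representing a bidirectional edge as two directed edges is only one possible reading of the TRUE/FALSE column; the proposal does not say which behavior is intended.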
