Re: [DISCUSS] Add native CSV loading support for gremlin (GraphReader)

Stephen Mallette Wed, 02 Dec 2015 03:55:37 -0800

Thanks for bringing this up for discussion and offering to work on it. You
don't mention how you will deal with data types. Will users have some way
to get fine-grained control over that?
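
To make the data-type question concrete, here is a minimal sketch of one
possible form of per-column control. It is plain Python over the standard
csv module, not TinkerPop or GraphReader API; the CONVERTERS mapping, the
read_typed_rows helper and the sample columns are all hypothetical names
invented for illustration.

```python
import csv
import io

# Hypothetical per-column converters supplied by the user.
# Columns that are not listed stay as plain strings.
CONVERTERS = {
    "age": int,
    "score": float,
    "active": lambda s: s.strip().upper() == "TRUE",
}

def read_typed_rows(text, converters):
    """Yield one dict per CSV row, applying converters[column] where given."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {name: converters.get(name, str)(value)
               for name, value in row.items()}

sample = "name,age,score,active\nmarko,29,0.5,TRUE\n"
rows = list(read_typed_rows(sample, CONVERTERS))
```

A mapping like this keeps the default simple (everything is a string) while
letting a user opt in to a type for individual columns.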

On Tue, Dec 1, 2015 at 10:46 AM, Alaa Mahmoud <[email protected]> wrote:

> Adding support for loading CSV into a graph using Gremlin's GraphReader
> will lower the entry barrier for new users. A lot of data is already in CSV
> format and a lot of existing databases/repositories allow users to export
> their data as CSV.
>
> I'd like to add this capability to the gremlin core as a new GraphReader
> instance. Since the CSV data doesn't map directly to vertices and edges,
> I'm planning to do the loading in two steps:
>
> *Nodes*
> The first is to load a vertex CSV file. I'll create a vertex for every
> line in the CSV and a property for each column on that line. If the CSV has
> column headers, then the names of the columns will be the names of the
> corresponding vertex properties. Otherwise, they'll be prop1, prop2, etc.
> (There are other ways to do it as well, but I'm just trying to show the
> general idea.)
>
> *Edges*
> The second step is loading the edge CSV file, which will be in the
> following format:
>
> vertex1 prop name (source vertex), vertex2 prop name (destination vertex),
> bidirectional (TRUE/FALSE), prop1,prop2,prop3,etc...
>
> For each line in the edge CSV file, the reader will search for a vertex
> with the vertex1 prop value (the caller needs to ensure it's unique) to find
> the source vertex, search for a destination vertex with the destination prop
> value, and then create an edge that ties the two together. The reader will
> also create an edge property for each additional column on the line.
>
> Thoughts?
>
> Alaa
>
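
The two-step loading described in the quoted proposal can be sketched
roughly as follows. This is only an illustration in plain Python with
vertices as dicts and edges as tuples; it does not use the real GraphReader
interface. It assumes that the header row of the edge file names the vertex
properties to match on in its first two cells, and that a bidirectional
edge is stored as a second, reversed edge. Neither detail is fixed by the
proposal, and load_vertices and load_edges are hypothetical names.

```python
import csv
import io

def load_vertices(text, has_header=True):
    """Step 1: create one vertex (a dict of properties) per CSV line."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return []
    if has_header:
        names, rows = rows[0], rows[1:]
    else:
        # No header: fall back to prop1, prop2, ... as in the proposal.
        names = [f"prop{i + 1}" for i in range(len(rows[0]))]
    return [dict(zip(names, row)) for row in rows]

def load_edges(text, vertices):
    """Step 2: create (source, destination, properties) edge tuples.

    Assumed layout: source prop name, destination prop name,
    bidirectional, then any number of edge property columns.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return []
    header, rows = rows[0], rows[1:]
    src_key, dst_key, prop_names = header[0], header[1], header[3:]
    # Index vertices by the lookup property; values are assumed unique,
    # which the proposal leaves to the caller to guarantee.
    by_src = {v[src_key]: i for i, v in enumerate(vertices) if src_key in v}
    by_dst = {v[dst_key]: i for i, v in enumerate(vertices) if dst_key in v}
    edges = []
    for row in rows:
        src, dst = by_src[row[0]], by_dst[row[1]]  # KeyError if no match
        props = dict(zip(prop_names, row[3:]))
        edges.append((src, dst, props))
        if row[2].strip().upper() == "TRUE":
            # Assumption: model "bidirectional" as an extra reversed edge.
            edges.append((dst, src, dict(props)))
    return edges

vertices = load_vertices("name,age\nmarko,29\nvadas,27\n")
edges = load_edges("name,name,bidirectional,weight\nmarko,vadas,TRUE,0.5\n",
                   vertices)
```

Here edges refer to vertices by their position in the list. All values stay
strings, which is where the data-type question raised in the reply comes in.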
