Dear Experts,

I'm looking to load some Enron scandal data from two tables in MySQL into
Neo4j.

The first table is People, which is a list of email
addresses, names, etc.  Just under 90K rows. 

The second table is mailgraph, a table of who emailed whom, with aggregate
totals. About 360K rows.

I decided to do a two-step load:
1. Load People into nodes, then check the data.
2. Load mailgraph into relationships, then check again.

It's similar to an RDBMS: you can't insert into the intersecting M:N table
until the rows it references already exist in the two lookup tables.
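
As a rough illustration of the "check" between the two loads, here is a
minimal Python sketch that looks for mailgraph rows whose endpoints are
missing from People. It assumes both tables have been exported to CSV; the
column names (email, name, sender, recipient, total) and the sample rows are
hypothetical and are not taken from the real schema.

```python
import csv
import io


def missing_endpoints(people_csv, mailgraph_csv):
    """Return mailgraph rows whose sender or recipient is not in People."""
    # Collect every known email address from the People export.
    known = {row["email"] for row in csv.DictReader(people_csv)}
    # Keep only the relationship rows that point at an unknown person.
    return [
        row
        for row in csv.DictReader(mailgraph_csv)
        if row["sender"] not in known or row["recipient"] not in known
    ]


# Tiny made-up sample standing in for the real CSV exports.
people = io.StringIO("email,name\na@enron.com,Alice\nb@enron.com,Bob\n")
mailgraph = io.StringIO(
    "sender,recipient,total\n"
    "a@enron.com,b@enron.com,12\n"
    "a@enron.com,c@enron.com,3\n"
)

orphans = missing_endpoints(people, mailgraph)
print(orphans)
```

Any row reported here would have no node to attach to, so it should be fixed
or dropped before the relationship load.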


To load the People table, I used Talend Big Data, following Yasser's blog post:
http://lucidwebdreams.wordpress.com/2014/07/24/import-data-into-neo4j-from-ms-sql-server-directly-using-talend/

However, the Talend insert speed was only 11 rows per second,
so inserting the 90K rows took over 2 hours!
And now I'm looking for a clear tutorial on how to insert
just the relationships with Talend.
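
To put the 11 rows per second figure in perspective, here is a small Python
sketch of the arithmetic. It assumes the rate stays constant and uses the
rounded row counts quoted above (90K and 360K), so the results are rough
estimates only.

```python
def load_hours(rows, rows_per_second):
    """Estimated load time in hours at a constant insert rate."""
    return rows / rows_per_second / 3600


# Rounded row counts from the post; the rate is the observed Talend speed.
people_hours = load_hours(90_000, 11)
mailgraph_hours = load_hours(360_000, 11)

print(f"People: {people_hours:.1f} h, mailgraph: {mailgraph_hours:.1f} h")
```

At that rate the People load works out to roughly 2.3 hours, and the 360K
mailgraph rows would need about 9 hours.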


I know there have been some improvements lately with Cypher and some new
Neo4j data loaders: LOAD CSV, neo4j-shell-tools, and batch-import.


I've no problem with CSV files. Which loader would you recommend for this
relatively simple job, done in two stages?


BTW, on my Red Hat server, yum does not list any available packages
for mvn or maven, so any loader that requires Maven in
the tech stack won't work for me.


Thanks a lot!

