Hi Michael, I tried with 2.3.2, started with a fresh db that had 10 nodes in it. I then ran the first command to import 5 million nodes from CSV. This took 12 minutes and when it finished it was using 1.6GB memory. Size on disk was 2.5GB.
I ran the second command and it created the 5 million edges in 8 minutes, after which it was using 1.8GB memory and size on disk was 3.32GB. A few minutes later memory usage went down to 1.3GB.

Next I ran the first command again on another CSV file which contained 5 million events too. It took 15 minutes to create the nodes, was using 2.2GB memory and size on disk was 5.9GB. When I ran the second command on this file it completed in 8 minutes and was still using 2.2GB memory. Size on disk was at 6.8GB.

After that I ran another command similar to the second one, which creates another edge for each node and it completed in 8 minutes and memory was at 2.3GB. So up to now it does seem to be a bit better in that it doesn't stall.

When I prefix the second command with EXPLAIN this is what I'm getting:

Compiler CYPHER 2.3

Planner RULE

Runtime INTERPRETED

+--------------+-----------------------+-------------------------------+
| Operator     | Identifiers           | Other                         |
+--------------+-----------------------+-------------------------------+
| +EmptyResult |                       |                               |
| |            +-----------------------+-------------------------------+
| +Merge(Into) | anon[167], e, f, line | (e)-[:FOR]->(f)               |
| |            +-----------------------+-------------------------------+
| +SchemaIndex | e, f, line            | line.eventID; :EVENT(eventID) |
| |            +-----------------------+-------------------------------+
| +SchemaIndex | f, line               | line.name; :Feature(name)     |
| |            +-----------------------+-------------------------------+
| +LoadCSV     | line                  |                               |
+--------------+-----------------------+-------------------------------+

Total database accesses: ?

Regards,
Arielle

On Wednesday, January 27, 2016 at 5:29:52 PM UTC+1, Michael Hunger wrote:
>
> Can you try it on 2.3.2 too?
> In general your code looks ok. Can you share your query plan?
> Prefix your query with EXPLAIN and remove the USING PERIODIC COMMIT to see
> the plan.
>
> How big is your neo4j store on disk?
>
> Michael
>
>
> On 27.01.2016 at 13:29, Arielle Bonnici <[email protected]> wrote:
>
> I'm currently running a test with Neo4j CE 2.3.1 on a Windows 7 machine
> with 4GB memory and trying to understand how to manage memory allocation
> when importing from CSV using the Neo4jShell.
>
> I am running these two commands, the first one to create the nodes and the
> second one to create edges (one edge for each node).
>
> USING PERIODIC COMMIT 10000
> LOAD CSV WITH HEADERS FROM 'file:///C:\\seq.csv' AS line
> CREATE (:EVENT { eventID: line.eventID, name: line.name, referrer:
> line.referrer, sessionID: toInt(line.sessionID), timestamp:
> toInt(line.timestamp), pID: toInt(line.pID)});
>
> USING PERIODIC COMMIT 10000
> LOAD CSV WITH HEADERS FROM 'file:///C:\\seq.csv' AS line
> MATCH (f:Feature)
> WHERE f.name = line.name
> MATCH (e:EVENT)
> WHERE e.eventID = line.eventID
> MERGE (e)-[:FOR]->(f);
>
> I have the following related indexes and constraints:
>
> Indexes
>   ON :EVENT(eventID) ONLINE (for uniqueness constraint)
>   ON :Feature(name) ONLINE (for uniqueness constraint)
>
> Constraints
>   ON (feature:Feature) ASSERT feature.name IS UNIQUE
>   ON (event:EVENT) ASSERT event.eventID IS UNIQUE
>
> When I have 5 million nodes in the db and try to load a CSV that has
> another 5 million nodes, it takes about 15 minutes to complete and gets to
> ~1.5GB memory usage. If I immediately run the second command to create the
> edges, the memory starts going up again and sometimes it will stall at some
> point. In order to make sure the second command works I have to restart
> Neo4j.
>
> I'm trying to understand if I can improve this by optimizing the commands
> somehow, or if specifying memory settings in the properties file might
> help...in which case how best to go about that?
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
