I've run into another problem. I'm trying to populate a graph with about 3300 links. My input is a CSV file with approximately 3300 records of the form (A, B, n), each specifying a link from A to B with weight n. I read each line of the file and create the nodes A and B if they don't already exist. If either node is new, a new link is created. If both A and B already exist, the link between them is updated with the new weight (or created, if there is no link yet).

Now here is the problem. Say I have a dataset as follows -

    A, B, 3
    C, D, 5
    D, E, 2
    E, A, 7

My algorithm is something like this -

    check whether node1 already exists or not, create if it doesn't exist;
    check whether node2 already exists or not, create if it doesn't exist;

    if (either node1 or node2 or both are new) {
        create a new relationship between them;
    }

    if (both node1 and node2 exist) {
        if (!node1.hasRelationship(RelTypes.KNOWS, Direction.OUTGOING)) {
            // there is no link between node1 and node2,
            // because node1 has no outgoing links
            create a new relationship;
        } else {
            // node1 has outgoing links, but none of them may link with node2
            check whether there is any link from node1 to node2;
            update if found, else create new;
        }
    }

The trouble seems to come from the "hasRelationship" branch. My data has many cases where node1 and node2 were created earlier, by separate records. So in many cases this "hasRelationship" check is done and a new link is created. It works fine with a small dataset, but for a dataset with many such cases (like the last line in the sample data: E, A, 7) I get an OutOfMemory error with "GC overhead limit exceeded".

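To make the intended "update if found, else create new" step concrete, here is a minimal in-memory sketch. It uses plain Java collections in place of the graph, so `LinkSketch`, `Link`, `createOrUpdate` and `linkCount` are hypothetical names and not Neo4j API. It also assumes that "update" means overwriting the old weight. The point it illustrates is that the loop only searches; the decision to create a link is taken once, after all of node1's outgoing links have been checked.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the graph (hypothetical, not Neo4j API).
class LinkSketch {
    // One outgoing link: target node name plus weight.
    static final class Link {
        final String to;
        long weight;
        Link(String to, long weight) { this.to = to; this.weight = weight; }
    }

    // Node name -> outgoing links of that node.
    private final Map<String, List<Link>> outgoing = new HashMap<>();

    Link createOrUpdate(String from, String to, long weight) {
        // "Create if it doesn't exist" for both nodes.
        List<Link> links = outgoing.computeIfAbsent(from, k -> new ArrayList<>());
        outgoing.computeIfAbsent(to, k -> new ArrayList<>());

        // Search only: nothing is created inside the loop.
        Link existing = null;
        for (Link link : links) {
            if (link.to.equals(to)) {
                existing = link;
                break;
            }
        }

        if (existing != null) {
            // Assumption: an update overwrites the old weight.
            existing.weight = weight;
            return existing;
        }

        // No link from -> to was found, so create exactly one.
        Link created = new Link(to, weight);
        links.add(created);
        return created;
    }

    // Total number of links in the whole graph.
    int linkCount() {
        int total = 0;
        for (List<Link> links : outgoing.values()) {
            total += links.size();
        }
        return total;
    }

    public static void main(String[] args) {
        LinkSketch graph = new LinkSketch();
        graph.createOrUpdate("A", "B", 3);
        graph.createOrUpdate("C", "D", 5);
        graph.createOrUpdate("D", "E", 2);
        graph.createOrUpdate("E", "A", 7);
        System.out.println("links: " + graph.linkCount());
    }
}
```

With the sample data above, each record adds one link, and repeating a record such as (A, B, 10) changes the weight of the existing A to B link without adding another one.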
This is the code -

    public CallInfo createOrUpdateCallInfo(final Subscriber callingParty,
            final Subscriber calledParty, final long count, final long duration) {
        CallInfo callInfo = null;
        if (callingParty == null) {
            throw new IllegalArgumentException("Null CallingParty");
        }
        if (calledParty == null) {
            throw new IllegalArgumentException("Null CalledParty");
        }

        final Node callingPartyNode = ((SubscriberImpl) callingParty).getUnderlyingNode();
        final Node calledPartyNode = ((SubscriberImpl) calledParty).getUnderlyingNode();

        if (!callingPartyNode.hasRelationship(RelTypes.CALLS, Direction.OUTGOING)) {
            // no outgoing relationships - bound to be a new one
            final Relationship rel =
                    callingPartyNode.createRelationshipTo(calledPartyNode, RelTypes.CALLS);
            callInfo = new CallInfoImpl(rel);
            callInfo.setCount(count);
            callInfo.setDuration(duration);
        } else {
            // could be an update or a create
            // get all the outgoing relationships and check if there is one which
            // ends with the calledParty node
            final Iterable<Relationship> currentRels =
                    callingPartyNode.getRelationships(RelTypes.CALLS, Direction.OUTGOING);
            for (Relationship rel : currentRels) {
                final boolean found = rel.getEndNode().equals(calledPartyNode);
                if (found) {
                    // update the existing relationship
                    callInfo = new CallInfoImpl(rel);
                    callInfo.updateCount(count);
                    callInfo.updateDuration(duration);
                    break;
                } else {
                    final Relationship newRel =
                            callingPartyNode.createRelationshipTo(calledPartyNode, RelTypes.CALLS);
                    callInfo = new CallInfoImpl(newRel);
                    callInfo.setCount(count);
                    callInfo.setDuration(duration);
                }
            }
        }
        return callInfo;
    }

I've tried to allocate 3GB to this application, but that didn't help. It crawls for 10-15 mins, then throws the OutOfMemory error...

Can anyone give me some pointers please?

Regards,
Arijit

-- 
"And when the night is cloudy,
There is still a light that shines on me,
Shine on until tomorrow, let it be."