Ran into another problem this time -

I'm trying to populate a graph with about 3300 links. My input is a CSV
file with approx 3300 records of the form (A, B, n), each specifying a
link from A to B with a weight n. I read the file line by line and
create the nodes A and B if they don't exist yet. If either node is new,
a new link is created. If both A and B already exist, the link between
them is updated with the new weight when there is one, and created
otherwise. Now here is the problem. Say I have a dataset as follows -

A, B, 3
C, D, 5
D, E, 2
E, A, 7

My algorithm is something like this -

check whether node1 already exists or not, create if it doesn't exist;
check whether node2 already exists or not, create if it doesn't exist;
if (either node1 or node2 or both are new) {
      create a new relationship between them;
}
if (both node1 and node2 exist) {
    if (! node1.hasRelationship(RelTypes.KNOWS, Direction.OUTGOING)) {
        // there is no link between node1 and node2 because node1 has
        // no outgoing links
        create a new relationship;
    } else {
        // node1 has outgoing links, but none of them may link with node2
        check whether there is any link from node1 to node2;
        update if found, else create new;
    }
}
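
To make the intended behaviour concrete without Neo4j, here is a minimal
sketch using plain Java maps. LinkGraphSketch and its methods are names
made up for illustration (they are not part of my application or of the
Neo4j API), and "update" here is assumed to simply overwrite the old
weight. It only shows the outcome I expect for the sample data: one
lookup per input line, followed by either one update or one new link.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the graph: source name -> (target name -> weight).
final class LinkGraphSketch {
    private final Map<String, Map<String, Long>> links = new LinkedHashMap<>();

    // Creates both nodes if needed, then creates or overwrites the link.
    // Returns true if a new link was created, false if an existing one was updated.
    boolean createOrUpdate(String from, String to, long weight) {
        links.computeIfAbsent(to, k -> new LinkedHashMap<>());
        Map<String, Long> outgoing = links.computeIfAbsent(from, k -> new LinkedHashMap<>());
        // put() returns null when there was no previous link from -> to
        return outgoing.put(to, weight) == null;
    }

    int nodeCount() {
        return links.size();
    }

    int linkCount() {
        int total = 0;
        for (Map<String, Long> outgoing : links.values()) {
            total += outgoing.size();
        }
        return total;
    }

    // Weight of the link from -> to, or null if there is no such link.
    Long weight(String from, String to) {
        Map<String, Long> outgoing = links.get(from);
        return outgoing == null ? null : outgoing.get(to);
    }

    public static void main(String[] args) {
        LinkGraphSketch graph = new LinkGraphSketch();
        String[] rows = {"A, B, 3", "C, D, 5", "D, E, 2", "E, A, 7"};
        for (String row : rows) {
            String[] fields = row.split("\\s*,\\s*");
            graph.createOrUpdate(fields[0], fields[1], Long.parseLong(fields[2]));
        }
        // prints "5 nodes, 4 links"
        System.out.println(graph.nodeCount() + " nodes, " + graph.linkCount() + " links");
    }
}
```

So for the four sample lines I expect five nodes and four links, and the
last line (E, A, 7) should add exactly one link even though both E and A
already exist.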

The trouble seems to come from the "hasRelationship" check. My data has
many cases where node1 and node2 were each created earlier by separate
lines - which means that in many cases this "hasRelationship" check is
done and a new link is created. It works fine with a small dataset, but
for a dataset with many such cases (like the last line in the sample
data: E, A, 7) I am getting an OutOfMemory error with "GC overhead
limit exceeded".

This is the code -

public CallInfo createOrUpdateCallInfo(final Subscriber callingParty,
        final Subscriber calledParty,
        final long count,
        final long duration) {
    CallInfo callInfo = null;
    if (callingParty == null) {
        throw new IllegalArgumentException("Null CallingParty");
    }
    if (calledParty == null) {
        throw new IllegalArgumentException("Null CalledParty");
    }
    final Node callingPartyNode =
            ((SubscriberImpl) callingParty).getUnderlyingNode();
    final Node calledPartyNode =
            ((SubscriberImpl) calledParty).getUnderlyingNode();
    if (!callingPartyNode.hasRelationship(RelTypes.CALLS, Direction.OUTGOING)) {
        // no outgoing relationships - bound to be a new one
        final Relationship rel =
                callingPartyNode.createRelationshipTo(calledPartyNode, RelTypes.CALLS);
        callInfo = new CallInfoImpl(rel);
        callInfo.setCount(count);
        callInfo.setDuration(duration);
    } else {
        // could be an update or a create
        // get all the outgoing relationships and check if there is one
        // which ends with the calledParty node
        final Iterable<Relationship> currentRels =
                callingPartyNode.getRelationships(RelTypes.CALLS, Direction.OUTGOING);
        for (Relationship rel : currentRels) {
            final boolean found = rel.getEndNode().equals(calledPartyNode);
            if (found) {
                // update the existing relationship
                callInfo = new CallInfoImpl(rel);
                callInfo.updateCount(count);
                callInfo.updateDuration(duration);
                break;
            } else {
                final Relationship newRel =
                        callingPartyNode.createRelationshipTo(calledPartyNode, RelTypes.CALLS);
                callInfo = new CallInfoImpl(newRel);
                callInfo.setCount(count);
                callInfo.setDuration(duration);
            }
        }
    }
    return callInfo;
}

I've tried to allocate 3GB to this application, but that didn't help.
It crawls for 10-15 mins, then throws the OutOfMemory error...

Can anyone give me some pointers please?

Regards
Arijit

-- 
"And when the night is cloudy,
There is still a light that shines on me,
Shine on until tomorrow, let it be."
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
