Hi all,

I've been doing preliminary evaluations of some Neo4j operations. One of them arises from a specific need in my application: my method gets a List of node ids (stored in the nodes' properties) and needs to retrieve exactly these nodes from the graph database. This should happen as fast as possible, of course. I used an index for the ids. My code is as follows:

    private static final int SAMPLE_SIZE = 100000;
    ...
    GraphDatabaseService graphDb = new EmbeddedGraphDatabase("tmp/graphdb");

    // Create the nodes and index each one by its "id" property.
    Transaction t = graphDb.beginTx();
    IndexManager im = graphDb.index();
    Index<Node> ni = im.forNodes("nodes");
    ( (LuceneIndex<Node>) ni ).setCacheCapacity( "nodes", 500000 );
    for (int i = 0; i < SAMPLE_SIZE; ++i) {
        Node n = graphDb.createNode();
        n.setProperty("id", i);
        ni.add(n, "id", n.getProperty("id"));
    }
    t.success();
    t.finish();

    // Time one index lookup per id.
    long time = System.currentTimeMillis();
    for (int i = 0; i < SAMPLE_SIZE; ++i) {
        Node n = ni.get("id", i).getSingle();
    }
    System.out.println(System.currentTimeMillis() - time);

It works, but it is rather slow. If I run the last loop a second time, the Lucene cache kicks in and cuts the required time in half, but that still leaves a noticeable amount (2000 ms on my machine). When I do the same thing with a HashMap, for example, the same loop (with the call Node n = ni.get("id", i).getSingle(); replaced by a map lookup) takes about 10 ms.

I know HashMaps have other drawbacks, such as memory consumption. For my use case this wouldn't be a problem, however, as I would only have to cache about 1M nodes, which is perfectly possible in a HashMap.

My main question is: Have I done something wrong in my usage of the Lucene index? Can it be sped up somehow? Or will I always get better performance from a HashMap in cases like this, where I have a large number of single-key queries?

Thank you and best regards,

Erik
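
For reference, the HashMap comparison mentioned above might look roughly like the sketch below. It is not the exact code that was measured. It assumes a hypothetical stand-in class FakeNode in place of a real Neo4j Node so that it runs without a database, and the names IdLookupSketch, buildCache and lookupAll are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IdLookupSketch {

    /** Hypothetical stand-in for a graph node; it only carries its id. */
    static final class FakeNode {
        final int id;

        FakeNode(int id) {
            this.id = id;
        }
    }

    /** Builds an id -> node map for the ids 0 .. size-1. */
    static Map<Integer, FakeNode> buildCache(int size) {
        Map<Integer, FakeNode> cache = new HashMap<>(size * 2);
        for (int i = 0; i < size; ++i) {
            cache.put(i, new FakeNode(i));
        }
        return cache;
    }

    /** Looks up the given ids in order; ids missing from the map are skipped. */
    static List<FakeNode> lookupAll(Map<Integer, FakeNode> cache, List<Integer> ids) {
        List<FakeNode> result = new ArrayList<>(ids.size());
        for (Integer id : ids) {
            FakeNode n = cache.get(id);
            if (n != null) {
                result.add(n);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int sampleSize = 100000;
        Map<Integer, FakeNode> cache = buildCache(sampleSize);

        List<Integer> ids = new ArrayList<>(sampleSize);
        for (int i = 0; i < sampleSize; ++i) {
            ids.add(i);
        }

        // Time one map lookup per id, mirroring the index lookup loop.
        long time = System.currentTimeMillis();
        List<FakeNode> found = lookupAll(cache, ids);
        System.out.println(found.size() + " nodes in "
                + (System.currentTimeMillis() - time) + " ms");
    }
}
```

In a real application the map values would be the Node objects (or their internal node ids) collected once at startup, which is where the memory cost mentioned above comes from.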