This is probably not news to anyone, but I might as well post about
it in case new users are wondering about the performance difference
between index-based lookups and lookups by node ID.

I have a test database of 750,000 nodes of type A.

The db also contains 90,000 nodes of types B and C, and roughly
4M relationships between A-B and A-C (so two different relationship
types). The size on disk is 4.7GB, of which the Lucene index takes
2.3GB or so.

Each node of type A has three properties. One of them is fulltext
indexed, and another is an ID-type property (a string) indexed with an
exact index. Let's call that property guid. The relationships and
other types of nodes also have indexed properties, each of which is
indexed in its own index. There are about 14M properties in the db.

To test the performance I generate a list of all node IDs and guid property
values, and perform 400,000 lookups using random entries from those
lists, and record the execution time of the 400,000 lookups.
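
For anyone who wants to repeat this, here is a rough sketch of the kind of
timing loop described above. It is not my actual test code: the class name
LookupBenchmark, the timeLookups helper and the HashMap stand-ins in main
are all made up for illustration, and in a real run you would pass in the
repository calls instead of the map lookups.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.function.Function;

public class LookupBenchmark {

    /**
     * Performs `lookups` lookups using keys picked at random from `keys`
     * and returns the elapsed wall-clock time in milliseconds.
     */
    public static <K> long timeLookups(List<K> keys,
                                       Function<? super K, ?> lookup,
                                       int lookups,
                                       long seed) {
        Random random = new Random(seed);
        int found = 0;
        long start = System.nanoTime();
        for (int i = 0; i < lookups; i++) {
            K key = keys.get(random.nextInt(keys.size()));
            // Count hits so the lookup result is actually used.
            if (lookup.apply(key) != null) {
                found++;
            }
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        if (found != lookups) {
            throw new IllegalStateException("missed " + (lookups - found) + " lookups");
        }
        return elapsedMs;
    }

    public static void main(String[] args) {
        // Hypothetical in-memory stand-ins for the graph db and its index.
        Map<Long, String> nodesById = new HashMap<>();
        Map<String, Long> nodeIdsByGuid = new HashMap<>();
        List<Long> ids = new ArrayList<>();
        List<String> guids = new ArrayList<>();
        for (long id = 0; id < 1_000; id++) {
            String guid = "guid-" + id;
            nodesById.put(id, guid);
            nodeIdsByGuid.put(guid, id);
            ids.add(id);
            guids.add(guid);
        }

        long idMs = timeLookups(ids, nodesById::get, 400_000, 42L);
        long guidMs = timeLookups(guids, nodeIdsByGuid::get, 400_000, 42L);
        System.out.println("id lookups:   " + idMs + " ms");
        System.out.println("guid lookups: " + guidMs + " ms");
    }
}
```

The fixed seed just makes the random key sequence repeatable between runs.
The map-based numbers mean nothing in themselves; only the loop structure
matters.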

This is on a box with 8GB of RAM, and the performance runs are nowhere
near using all that memory.

I'm using SDN 2.0.0 M1 to access the data. The node id lookups are
done with the findOne(Long id) method in the CRUDRepository class
and the guid property lookups are done with the
findByPropertyValue(String indexName, String property, Object value)
method in the NamedIndexRepository class.

I'm using the default settings for the graph db.

The node ID lookups run in about 12,700 ms.

The index-based guid property lookups run in about 123,000 ms.

So roughly a 10x performance difference.
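
Broken down per lookup, that is about 31.75 microseconds for a node ID
lookup and about 307.5 microseconds for an index lookup. The small sketch
below just redoes that arithmetic from the totals above; the class and
method names are made up for illustration.

```java
import java.util.Locale;

public class LookupStats {

    /** Average cost of one lookup in microseconds. */
    public static double microsPerLookup(long totalMs, long lookups) {
        return totalMs * 1000.0 / lookups;
    }

    /** How many times slower the slow run is compared to the fast run. */
    public static double slowdown(long slowMs, long fastMs) {
        return (double) slowMs / fastMs;
    }

    public static void main(String[] args) {
        long lookups = 400_000;
        long nodeIdMs = 12_700;   // measured total for node ID lookups
        long indexMs = 123_000;   // measured total for index-based guid lookups

        System.out.printf(Locale.ROOT, "node id: %.2f us per lookup%n",
                microsPerLookup(nodeIdMs, lookups));
        System.out.printf(Locale.ROOT, "index:   %.2f us per lookup%n",
                microsPerLookup(indexMs, lookups));
        System.out.printf(Locale.ROOT, "ratio:   %.2fx%n",
                slowdown(indexMs, nodeIdMs));
    }
}
```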

-TPP
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user