Lately I've been playing with some OpenStreetMap data. Many of the imported node properties take values from a small set (road type, point-of-interest type, colour, ...), but I don't know the set of values in advance (sometimes a new value becomes standard, sometimes an invalid value shows up). Other node properties are just unique text (address, url). To speed up the import process I tried to apply some kind of compression: I've seen that Neo4j encodes property names using a sequence of integers, so I tried to do the same for the values of all the properties that I know contain only a small set of values.
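In case it helps the discussion, here is a minimal sketch of the idea, i.e. dictionary-encoding the values of the enumerable keys at import time. The class and key names are purely illustrative (not Neo4j internals), and which keys count as "enumerable" is an assumption:

```python
class ValueDictionary:
    """Maps repeated string values to small integer codes, and back."""

    def __init__(self):
        self.value_to_code = {}   # value -> int code
        self.code_to_value = []   # int code -> value

    def encode(self, value):
        code = self.value_to_code.get(value)
        if code is None:
            code = len(self.code_to_value)
            self.value_to_code[value] = code
            self.code_to_value.append(value)
        return code

    def decode(self, code):
        return self.code_to_value[code]


# Illustrative OSM tag keys whose values repeat often; free-text keys
# (name, url, addr, ...) are left untouched.
ENUM_KEYS = {"highway", "amenity", "colour"}


def compress_properties(props, dictionary):
    """Replace values of enumerable keys with integer codes."""
    return {k: dictionary.encode(v) if k in ENUM_KEYS else v
            for k, v in props.items()}


d = ValueDictionary()
a = compress_properties({"highway": "residential", "name": "Storgatan"}, d)
b = compress_properties({"highway": "residential", "colour": "red"}, d)
# a == {"highway": 0, "name": "Storgatan"}
# b == {"highway": 0, "colour": 1}
```

The point is that each distinct value is stored once in the dictionary, and every node stores only a small integer, which is what shrinks the strings store.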
With this encoding the database is obviously much smaller. After importing sweden.osm the database dir is 552M:

100M neostore.propertystore.db
220M neostore.propertystore.db.arrays
227M neostore.propertystore.db.strings

With 'compression' on it is 344M:

100M neostore.propertystore.db
220M neostore.propertystore.db.arrays
 20M neostore.propertystore.db.strings

property value dictionary entries: 16286
property value dictionary size: 387378 bytes

I don't know if this is a common use case, but it would be cool to have this kind of compression out of the box! WDYT?

Regards,
--
Davide Savazzi
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user