Hi David,

Great to hear you're interested in using neo4j for the OSM model. I also
think it is a great match. However, you are right to assume there are a few
missing pieces. Two shortcomings that are relevant to your questions below
are:

   - *Scalability*. We only recently tried to load very large OSM files,
   starting with Germany. There are currently two problems with this: the
   graph model takes up too much disk space, and the load performance
   degrades. The problem here is that the OSM file, despite being XML, is to
   some extent just a sequential dump of a number of PostGIS tables, the
   first being the point nodes, and only later the ways with foreign key
   references to the nodes. So we need an independent way to look up the
   node ids when loading the ways, and we currently use a Lucene index for
   this. The index works, but like all tree structures it degrades in
   performance as the total index size increases. Peter has been
   investigating this, and is in the process of evaluating two options:
      - Switching off the batch-inserter. I refactored the OSMImporter to
      allow for importing with the normal GraphDatabaseService instead of the
      batch inserter, and Peter is trying this out for the performance of
      larger loads and incremental loads.
      - Using an index other than Lucene. Peter is currently evaluating the
      BDB database for its exact-match index, which might perform better than
      Lucene for the node-id lookup.
   - *Changesets*. We do not yet properly support changesets. In fact, the
   current code loads the OSM XML into a structure that still has some
   residual resemblance to the XML. For example, we store the user, uid and
   changeset as properties of the nodes and ways they were attributes of in
   the XML. I have started refactoring this, and the plan is to make a
   two-phase improvement:
      - Firstly structure users and changesets as a tree, with nodes and
      ways related to the changeset in the tree structure. This allows for
      analysis of the graph from the perspective of users and changesets. It
      also reduces the total disk-space used because the user, uid and
      changeset id are not duplicated in properties as they are today. I have
      already done part of this work on my computer, but not pushed it. I see
      database size reductions down to nearly 60% of previous, but I have not
      completed the new tree, so the size will go up again somewhat.
      - Secondly, once we have the changeset tree in place we can work on
      applying changes to the graph. As you requested in your email, we want
      to be able to apply the daily updates to an existing full OSM model.
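
To make the node-id lookup problem concrete, here is a minimal Python sketch of the load order described above. It is not the OSMImporter code: the sample data and the name load_osm are made up for illustration, a plain list stands in for the graph store, and a dict stands in for the exact-match index (Lucene or BDB in our case).

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample: nodes come first, ways reference them by id later on.
SAMPLE_OSM = """<osm version="0.6">
  <node id="101" lat="52.50" lon="13.40"/>
  <node id="102" lat="52.51" lon="13.41"/>
  <node id="103" lat="52.52" lon="13.42"/>
  <way id="900">
    <nd ref="101"/>
    <nd ref="103"/>
    <tag k="highway" v="residential"/>
  </way>
</osm>"""


def load_osm(xml_text):
    """Load nodes, then ways, resolving way->node references via an id index."""
    graph_nodes = []    # stand-in for the graph store; list position = internal id
    osm_id_index = {}   # exact-match index: OSM node id -> internal id
    ways = {}           # OSM way id -> list of internal node ids

    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "node":
            internal_id = len(graph_nodes)
            graph_nodes.append((float(elem.get("lat")), float(elem.get("lon"))))
            osm_id_index[int(elem.get("id"))] = internal_id
        elif elem.tag == "way":
            # Every <nd ref="..."> costs one index lookup; this is the step
            # whose cost grows with the size of the index.
            refs = [int(nd.get("ref")) for nd in elem.findall("nd")]
            ways[int(elem.get("id"))] = [osm_id_index[ref] for ref in refs]

    return graph_nodes, osm_id_index, ways
```

The in-memory dict only works while all node ids fit in RAM, which is exactly why a disk-backed index is needed for something the size of Germany or larger.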

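As a rough picture of the first phase of the changeset work, the following Python sketch groups elements under users and changesets instead of repeating user, uid and changeset on every element. The records and the name build_changeset_tree are hypothetical, and nested dicts stand in for the graph relationships; the real structure in the database may end up looking different.

```python
# Made-up flat records, as in the OSM XML: every element repeats
# user, uid and changeset as attributes.
ELEMENTS = [
    {"id": 101, "user": "alice", "uid": 1, "changeset": 5001},
    {"id": 102, "user": "alice", "uid": 1, "changeset": 5001},
    {"id": 103, "user": "bob", "uid": 2, "changeset": 5002},
    {"id": 900, "user": "alice", "uid": 1, "changeset": 5003},
]


def build_changeset_tree(records):
    """Return uid -> {"name": ..., "changesets": {changeset id: [element ids]}}."""
    users = {}
    for rec in records:
        # The user name is stored once per user, not once per element.
        user = users.setdefault(rec["uid"], {"name": rec["user"], "changesets": {}})
        # Each element is attached to its changeset, and each changeset to its user.
        user["changesets"].setdefault(rec["changeset"], []).append(rec["id"])
    return users
```

With this layout, questions like "which elements did this user touch in this changeset" become a walk down the tree rather than a scan over element properties.
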
So, we have definitely thought about your specific requirements, but due to
other priorities have not made much progress in completing these. I
certainly welcome your feedback, and even help, in completing this work. I
suggest we set up a Skype call to discuss this further.

Regards, Craig

On Thu, Feb 17, 2011 at 4:28 AM, David Winslow <cdwins...@gmail.com> wrote:

> Hi all,
>
> My organization (OpenGeo) is investigating options for generating and
> hosting map tiles based on OpenStreetMap data on Amazon AWS.  We are
> currently using OSM's osm2pgsql tool with a PostGIS database, GeoServer
> with SLD styles to render the data, and GeoWebCache to dice up the map
> into tiles and serve them from a filesystem cache.  I'm interested in
> investigating neo4j-spatial as an alternative to Postgres since the graph
> model seems to fit OSM's data more cleanly than an RDBMS.  To be clear,
> investigating neo4j is just a side project for me at present.  I've played
> with neo4j-spatial before, and I plan on getting my hands a bit dirty this
> weekend, but for now I have a few questions about it.
>
> 1) Has anyone attempted a full OSM planet import using neo4j-spatial?  Any
> tips on ensuring it goes smoothly (how much disk it is likely to require,
> whether the full planet dump will fit in a neo4j 1.2 database, etc)?
> 2) Is there any information available about neo4j performance on EC2?
> 3) The rendering process divides up the OSM data into several classes which
> are styled differently (roads/rivers/buildings/etc).  I am aware that
> neo4j-spatial can index sublayers based on property filters, but when I
> last checked the filter syntax used wasn't as flexible as I need for the
> stylesheet I'm using.  For my investigation this weekend I am thinking of
> replacing the existing filter system with one based on CQL[1] to serialize
> filters, does that seem like a bad idea?
> 4) Is there any support for applying OSM's daily or minutely patches?
> (From a look at the code, I think the answer is no, so if not - how tough
> would it be to add? Are there any design docs or notes written up about
> implementing that feature?)
>
> [1] CQL - http://docs.codehaus.org/display/GEOTOOLS/ECQL+Parser+Design
>
> Thanks in advance.
>
> --
> David Winslow
> OpenGeo - http://opengeo.org/
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>