I did some initial work on incremental imports back in 2010, but stopped due to some complications:
- We needed to mix Lucene reads and writes during the import (a read to check whether the node already exists, so that we don't import it twice), and this performs very badly in the batch inserter. We decided to first code a non-batch insert mode before re-starting the incremental import work. Peter and I did code a non-batch importer in early 2011, but never went back to complete the incremental import.
- We wanted to support both the case of importing multiple OSM files that could be stitched together by resolving overlaps, and the case of applying changesets to the existing OSM model. This increased the complexity of the work just enough to ensure it got dropped. In early 2011 we also added support for changesets in the model (but only as a data structure, not in terms of importing changesets), so we are one step closer to this as well.

Since we now have non-batch importing and changeset data structures, the opportunity to re-start the incremental import and the importing of changesets is there. It should not be too hard.

For incremental imports (stitching OSM files together), we would re-activate the old code that tests the Lucene index before adding nodes and relations. There might be some subtle edge cases to consider, but a set of tests with overlapping and non-overlapping OSM files should flush them out.

For applying changesets, more thinking is still required. Do we want to support history in the model, or only the latest version? Should we verify that only newer changesets are applied, and in the right order, or rely on the user to get it right?

I can say that we did some thinking this summer on the data structures required to support a complete change history. This relies on the fact that we already support multiple possible ways on the same nodes, so we can also, in principle, support multiple possible 'versions' of ways on the same nodes.
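To make the two cases a bit more concrete, here is a minimal Java sketch of the idea, not the actual OSMImporter code. It assumes a plain HashMap as a stand-in for the Lucene index, and the names IncrementalImportSketch, OsmNode, importNode and applyChange are all hypothetical. Stitching only adds a node when its OSM id is not yet indexed, and a change is only accepted when its version is strictly newer than what is stored.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Sketch only: a HashMap stands in for the Lucene index, and every name
 * here is hypothetical rather than taken from the real OSMImporter.
 */
class IncrementalImportSketch {

    /** Hypothetical minimal view of an OSM node. */
    static final class OsmNode {
        final long osmId;
        final int version;
        final double lat;
        final double lon;

        OsmNode(long osmId, int version, double lat, double lon) {
            this.osmId = osmId;
            this.version = version;
            this.lat = lat;
            this.lon = lon;
        }
    }

    // Stand-in for the index lookup by OSM id.
    private final Map<Long, OsmNode> index = new HashMap<>();

    /** Stitching case: add the node only if its OSM id is not indexed yet. */
    boolean importNode(OsmNode node) {
        return index.putIfAbsent(node.osmId, node) == null;
    }

    /** Changeset case: accept a change only if it is strictly newer. */
    boolean applyChange(OsmNode changed) {
        OsmNode existing = index.get(changed.osmId);
        if (existing != null && existing.version >= changed.version) {
            return false; // stale or repeated change, leave the stored node alone
        }
        index.put(changed.osmId, changed);
        return true;
    }

    int nodeCount() {
        return index.size();
    }

    OsmNode lookup(long osmId) {
        return index.get(osmId);
    }

    public static void main(String[] args) {
        IncrementalImportSketch db = new IncrementalImportSketch();

        // Two "files" that overlap on node 2.
        List<OsmNode> fileA = List.of(
                new OsmNode(1, 1, 55.60, 13.00),
                new OsmNode(2, 1, 55.61, 13.01));
        List<OsmNode> fileB = List.of(
                new OsmNode(2, 1, 55.61, 13.01),
                new OsmNode(3, 1, 55.62, 13.02));

        fileA.forEach(db::importNode);
        fileB.forEach(db::importNode);
        System.out.println("nodes after stitching: " + db.nodeCount()); // 3

        // A newer version is accepted, replaying the old one is rejected.
        System.out.println(db.applyChange(new OsmNode(2, 2, 55.70, 13.10))); // true
        System.out.println(db.applyChange(new OsmNode(2, 1, 55.61, 13.01))); // false
    }
}
```

Note that this sketch keeps only the latest version of each node. Keeping full history would mean storing every version instead of overwriting, which is exactly the open question for the changeset case.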
More thinking is required, but I have a suspicion that we should actually go ahead and do this properly with full history, because that might be the only way to make sure the user never messes things up by importing in the wrong order.

On Tue, Nov 22, 2011 at 9:58 AM, Peter Neubauer <peter.neuba...@neotechnology.com> wrote:

> Gregory,
> incremental loads (and thus, restarts of OSM imports) are a feature we
> want to add later on, but it's not in there yet. This would also mean
> we could stitch in other areas on demand, and support submitting
> changesets back to OSM or at least capture them, so you as an OSM
> based app can contribute to OSM automagically.
>
> I know it's much to ask, but help here would be greatly appreciated. I
> hope to lab with Michael Hunger on import of data into OSM (and
> others) this Friday and hope to get somewhere :)
>
> Cheers,
>
> /peter neubauer
>
> GTalk: neubauer.peter
> Skype peter.neubauer
> Phone +46 704 106975
> LinkedIn http://www.linkedin.com/in/neubauer
> Twitter http://twitter.com/peterneubauer
>
> http://www.neo4j.org - NOSQL for the Enterprise.
> http://startupbootcamp.org/ - Öresund - Innovation happens HERE.
>
> On Tue, Nov 22, 2011 at 7:15 AM, grimace <macegh...@gmail.com> wrote:
> > I've been playing with OSMImporter; tried batch and native java. I've had
> > mixed success trying to import the planet, but since it's of considerable
> > size, the job usually blows up or grinds to a halt about half way. I think
> > the most I've made it to is 651M nodes and that's not even the ways or
> > relations. I just don't know enough about it and thought I would ask
> > before I try to dive in to it, but what would I have to do to so that I
> > could restart the job ( where it left off ) when it blows?
> >
> > --
> > View this message in context:
> > http://neo4j-community-discussions.438527.n3.nabble.com/OSMImporter-Is-there-a-way-to-do-incremental-imports-tp3526941p3526941.html
> > Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
> > _______________________________________________
> > Neo4j mailing list
> > User@lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user