On Fri, Jul 03, 2015 at 07:23:54AM +0100, Simon Nuttall wrote:
> Now it is showing these again:
>
> Done 274 in 136 @ 2.014706 per second - Rank 26 ETA (seconds): 2467.854004
>
> Presumably this means it is now playing catchup relative to the
> original download data?

I would suppose so.

> How can I tell what date it has caught up to? (And thus get an idea of
> when it is likely to finish?)

Have a look at the import_osmosis_log table. It gives you a good idea of
how long the batches take.
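For example, something along these lines (just a sketch: the column names
are inferred from the INSERT statement your update script prints, quoted
further down, so check the actual table definition if yours differ):

    -- Most recent update batches: which OSM date each batch reached
    -- (batchend), which step it was, and how long that step took.
    SELECT batchend,
           event,
           endtime - starttime AS duration
      FROM import_osmosis_log
     ORDER BY endtime DESC
     LIMIT 20;

The batchend column is the OSM date you have caught up to, which answers
your first question directly.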
> Is it catching up by downloading minutely diffs or using larger
> intervals, then switching to minutely diffs when it is almost fully up
> to date?

That depends on how you have configured it. If it is set to the URL of
the minutelies, it will use minutely diffs but accumulate them into
batches of the size you have configured. When it has caught up, it will
just accumulate the latest minutelies, so the batches become smaller.

> This phase still seems very disk intensive, will that settle down and
> become much less demanding when it has eventually got up to date?

It will become less, but there is still IO going on. Given that your
initial import took about 10 times as long as the best time I've seen,
it will probably take a long time to catch up. You should consider
running with --index-instances 2 while catching up, and you should
really investigate where the bottleneck in the system is.

> Can the whole installed running Nominatim be copied to another
> machine? And set running?
>
> Presumably this is a database dump and copy - but how practical is that?

Yes, dump and restore is possible. You should be aware that indexes are
not dumped, so it still takes a day or two to restore the complete
database.

> Are there alternative ideas such as replication or backup?

For backup you can do partial dumps that contain only the tables needed
for querying the database. These dumps can be restored faster, but they
are not updateable, so they are more of an interim solution to install
on a spare emergency server while the main DB is reimported. The
dump/backup script used for the osm.org servers can be found here:

https://github.com/openstreetmap/chef/blob/master/cookbooks/nominatim/templates/default/backup-nominatim.erb

If you go down that road, I recommend actually trying the restore at
least once, so you get an idea of the time and space requirements.

Replication is possible as well. In fact, the two osm.org servers have
been running as master and slave with streaming replication for about
two weeks now. You should disable writing logs to the database.
Otherwise the setup is fairly standard, following largely this guide:

https://wiki.postgresql.org/wiki/Streaming_Replication
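Once the slave is attached, it is worth checking that it is actually
streaming. A quick sanity check (column names as in PostgreSQL 9.x):

    -- on the master (as a superuser): one row per connected standby
    SELECT client_addr, state, sent_location, replay_location
      FROM pg_stat_replication;

    -- on the slave: returns true while it is replaying from the master
    SELECT pg_is_in_recovery();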
> string(123) "INSERT INTO import_osmosis_log values
> ('2015-06-08T07:58:02Z',25816916,'2015-07-03 06:07:34','2015-07-03
> 06:44:10','index')"
> 2015-07-03 06:44:10 Completed index step for 2015-06-08T07:58:02Z in
> 36.6 minutes
> 2015-07-03 06:44:10 Completed all for 2015-06-08T07:58:02Z in 58.05 minutes
> 2015-07-03 06:44:10 Sleeping 0 seconds
> /usr/local/bin/osmosis --read-replication-interval
> workingDirectory=/home/nominatim/Nominatim/settings --simplify-change
> --write-xml-change /home/nominatim/Nominatim/data/osmosischange.osc
>
> Which presumably means it is updating June 8th? (What else can I read
> from this?)

See above: check out the import_osmosis_log. The important thing to take
away is how long it takes to update which interval. If on average the
import takes longer than real time, you are in trouble.
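You can read that rate straight off the log, for example (same caveat
about column names as above, and assuming batchend is stored as a
timestamp rather than text):

    -- Per batch: how much OSM time it covered versus how long it took.
    -- If "took" is regularly larger than "covers", you never catch up.
    SELECT batchend,
           batchend - lag(batchend) OVER (ORDER BY batchend) AS covers,
           max(endtime) - min(starttime) AS took
      FROM import_osmosis_log
     GROUP BY batchend
     ORDER BY batchend DESC
     LIMIT 20;

As long as "covers" stays well above "took", you will eventually catch up.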
> Also, at what point is it safe to expose the Nominatim as a live service?

As soon as the import is finished. Search queries might interfere with
the updates when your server gets swarmed with lots of parallel queries,
but I doubt that you have enough traffic for that. Just make sure to
keep the number of requests that can hit the database in parallel at a
moderate level. Use php-fpm with limited pools for that, and experiment
with the limits until you get the maximum performance.

Sarah

_______________________________________________
Geocoding mailing list
[email protected]
https://lists.openstreetmap.org/listinfo/geocoding
