Hi,

* Jeroen van Rijn <[email protected]> [2010-09-15 18:19:38]:
> 1. pgsql stores the database in e.g. /var/pgsql/ocitysmap
> 2. at 00:00, finish any current job if running, then shut down postgres
> 3. mv /var/pgsql/ocitysmap /var/pgsql/ocityold
> 4. mv /var/pgsql/ocitynew /var/pgsql/ocitysmap
> 5. restart pgsql and start processing jobs again
> 6. meanwhile at low io prio: rsync someu...@somehost:/somewhere
> /var/pgsql/ocityold
> 7. mv /var/pgsql/ocityold /var/pgsql/ocitynew
>
> The other machine 'somehost' can keep the database up to date, just making
> sure it's in a known good state by the time the public server comes for its
> rsync.
>
> That or you could have this other server be the initiator of the rsync
> update, putting a semaphore file on the maposmatic server when it's done;
> this public server would then upon seeing this file finish any running job,
> move those directories around, and restart postgres.
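For concreteness, the swap procedure quoted above could be sketched roughly as follows. The paths are the ones from the quoted steps; the postgres stop/start commands vary per setup, so they are only indicated in comments, and the pgsql root is passed as an argument so the rename logic stands on its own:

```shell
#!/bin/sh
# Sketch of the proposed nightly database swap (steps 2-7 above).
# Service control commands are setup-specific and left as comments.

# Steps 2-5: demote the live database and promote the freshly synced copy.
swap_databases() {
    root=$1
    # stop postgres here, once the current rendering job has finished
    mv "$root/ocitysmap" "$root/ocityold"   # step 3
    mv "$root/ocitynew"  "$root/ocitysmap"  # step 4
    # restart postgres here and resume processing jobs
}

# Steps 6-7: refresh the demoted copy at low I/O priority, then mark it
# as the next candidate for promotion.
refresh_candidate() {
    root=$1
    # ionice -c3 rsync -a someu...@somehost:/somewhere "$root/ocityold"
    mv "$root/ocityold" "$root/ocitynew"    # step 7
}
```

The appeal of this scheme is that the downtime window is just two renames; the expensive rsync happens in the background against the offline copy.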
Any solution based on using two databases has several problems:
- the planet OSM database already takes *a lot* of space, and is
  growing very fast. It will soon be hard to afford keeping two
  copies of it around at any point in time;
- swapping databases means a very low update rate, every couple of
  days for example, instead of our current daily updates and our goal
  of moving to hourly updates with Osmosis.
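For reference, an hourly update cycle with Osmosis would look roughly like this. This is an untested sketch: the replication URL, working directory, change file path, and database name are all illustrative, and only the flag names come from the Osmosis and osm2pgsql documentation:

```shell
# configuration.txt in the Osmosis working directory (illustrative values):
#   baseUrl=https://planet.openstreetmap.org/replication/hour
#   maxInterval=3600

# Fetch the changes accumulated since the last run...
osmosis --read-replication-interval workingDirectory=/var/lib/osmosis \
        --simplify-change --write-xml-change /tmp/hourly.osc.gz

# ...and apply them to the existing rendering database in append mode.
osm2pgsql --append --slim --database ocitysmap /tmp/hourly.osc.gz
```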
> Another thought is to ask the authors of the update tool if it can delay the
> actual updating to near the end of the process and write all the updates at
> once in a single transaction. The database would be doing mostly reads until
> then and keep more rows around in cache that any rendering job at the time
> might make use of as well.
This is already what osm2pgsql is doing IIRC, but committing the
transaction is very I/O intensive and not at all immediate.
> Also... have you thought to ask the guys at geofabrik how they keep things
> up to date? Is it solely throwing hardware at the problem, or did they do
> something clever with configuring both postgres and the update jobs too.
The guys at geofabrik threw *a lot* of hardware at the problem.
Fredrik's numbers came from a quad-Xeon machine with > 40GB of RAM and
SSDs (with that setup, a full planet import took less than 5
hours...).
So it is mainly a hardware problem: we need a server with fast disks
and a great deal of RAM to handle the planet import and the
hourly/daily updates, as well as enough disk space to accommodate the
database and its constant growth.
- Maxime
--
Maxime Petazzoni <http://www.bulix.org>
``One by one, the penguins took away my sanity.''
Linux kernel and software developer at MontaVista Software
