Hi,

* Jeroen van Rijn <[email protected]> [2010-09-15 18:19:38]:

> 1. pgsql stores the database in e.g. /var/pgsql/ocitysmap
> 2. at 00:00, finish any current job if running, then shut down postgres
> 3. mv /var/pgsql/ocitysmap /var/pgsql/ocityold
> 4. mv /var/pgsql/ocitynew /var/pgsql/ocitysmap
> 5. restart pgsql and start processing jobs again
> 6. meanwhile at low io prio: rsync someu...@somehost:/somewhere
> /var/pgsql/ocityold
> 7. mv /var/pgsql/ocityold /var/pgsql/ocitynew
> 
> The other machine 'somehost' can keep the database up to date, just making
> sure it's in a known good state by the time the public server comes for its
> rsync.
> 
> That or you could have this other server be the initiator of the rsync
> update, putting a semaphore file on the maposmatic server when it's done;
> this public server would then upon seeing this file finish any running job,
> move those directories around, and restart postgres.

Any solution based on using two databases has several problems:

  - the planet OSM database already takes *a lot* of space, and is
    growing very fast. It will soon be hard to afford keeping two of
    them around at any point in time;
  - swapping databases means a very low update rate, every couple of
    days for example, instead of our current daily updates, let alone
    our goal of moving to hourly updates with Osmosis.

> Another thought is to ask the authors of the update tool if it can delay the
> actual updating to near the end of the process and write all the updates at
> once in a single transaction. The database would be doing mostly reads until
> then and keep more rows around in cache that any rendering job at the time
> might make use of as well.

This is already what osm2pgsql does, IIRC, but committing the
transaction is very I/O intensive and far from instantaneous.

> Also... have you thought to ask the guys at geofabrik how they keep things
> up to date? Is it solely throwing hardware at the problem, or did they do
> something clever with configuring both postgres and the update jobs too.

The guys at geofabrik threw *a lot* of hardware at the problem.
Fredrik's numbers came from a quad-Xeon machine with > 40GB of RAM and
SSD drives (with that setup it could do a full planet import in
less than 5 hours...).


So it is mainly a hardware problem: we need a server with fast disks
and a great deal of RAM to handle the planet import and the
hourly/daily updates, as well as enough disk space to accommodate the
database and its constant growth.

- Maxime
-- 
Maxime Petazzoni <http://www.bulix.org>
 ``One by one, the penguins took away my sanity.''
Linux kernel and software developer at MontaVista Software
