> From: torstenm...@schleim.mac.com [mailto:torstenm...@schleim.mac.com]
> Subject: [OSM-talk] Size of installed Database?
>
> Hello,
>
> i'm about to install the latest planet.osm and i'd like to install it on
> an SSD.
>
> The bzipped XML file is approximately 31 GB of size, but how much space
> will i need on an SSD for the PostgreSQL database?
You haven't said you're using the XML file, but I'd suggest using the PBF
instead.
> I haven't bought an SSD yet and i'd like to make sure that the database
> will fit on the SSD.
>
> It would be very kind if somebody could tell me how much size an
> installation takes on disk and how old their installation is (if it is
> not actual).
I'm assuming you're asking about an osm2pgsql rendering database being
updated
with diffs. If not, some of the following concepts still apply, but numbers
will differ.
Because it's been a while since this has come up, I figure I might as well go
over osm2pgsql database size in detail. The disk space required is governed
by the size of the OSM data, the import options used, and database bloat.
1. Size of OSM data
More data = more space. PBF is faster to import and smaller on disk than
XML, but the input format doesn't change the resulting database size.
2. Import options used
If you're dealing with the entire planet, you should be using
--flat-nodes, as it's faster and saves space. The flat-nodes file uses
about the same space for an extract as for the full planet, so it's not
worth using for small extracts. Aside from that, there's not much you can
tweak to save space, except --drop. If you aren't planning on doing
updates, use --slim --drop to get rid of the slim tables and save lots
of space.
Using hstore or a bigger .style file will use more space.
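To make the options above concrete, here is a sketch of two import
invocations. The file paths and cache size are placeholders, not
recommendations; the flags themselves (--slim, --drop, --flat-nodes,
--cache, --hstore) are standard osm2pgsql options:

```shell
# Planet import when you plan to apply diffs later: keep the slim
# tables, and store node locations in a flat file rather than the DB.
osm2pgsql --slim \
    --flat-nodes /path/to/flat.nodes \
    --cache 24000 \
    --hstore \
    planet-latest.osm.pbf

# Planet import when you will never apply updates: --drop removes the
# slim tables at the end of the import, saving a lot of space.
osm2pgsql --slim --drop \
    --flat-nodes /path/to/flat.nodes \
    planet-latest.osm.pbf
```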
3. Database bloat
Over time, data gets updated or deleted in the database. Postgres
recovers this space with autovacuum, but the default autovacuum settings
are not aggressive enough. You want to lower
autovacuum_vacuum_scale_factor so it will autovacuum more frequently.
autovacuum_analyze_scale_factor may also be adjusted. I use 0.04 and
0.02 for these.
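In postgresql.conf, that tuning looks like this (the values are the ones
from this post; the stock defaults are 0.2 and 0.1):

```
# postgresql.conf -- more aggressive autovacuum for a frequently
# updated rendering database
autovacuum = on
autovacuum_vacuum_scale_factor = 0.04
autovacuum_analyze_scale_factor = 0.02
```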
A second problem is index bloat. If you can tolerate the database locks,
just do a REINDEX (or a CLUSTER). Otherwise, create a new index
concurrently and then use it to replace the old index. You can reindex
on an index-by-index basis.
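The concurrent swap looks roughly like the following; the index and
table names here are illustrative, so substitute your own. Note that
CREATE INDEX CONCURRENTLY avoids the long exclusive lock that a plain
REINDEX takes:

```sql
-- Build a replacement index without blocking reads or writes.
CREATE INDEX CONCURRENTLY planet_osm_polygon_way_idx_new
    ON planet_osm_polygon USING gist (way);

-- Swap it in for the bloated one (brief lock only).
DROP INDEX planet_osm_polygon_way_idx;
ALTER INDEX planet_osm_polygon_way_idx_new
    RENAME TO planet_osm_polygon_way_idx;
```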
Now, for numbers. It's important to remember that bloat changes size: if
you turn autovacuum off, import a year-old planet, and then update it,
you end up with a different size than if you import a fresh planet.
Yevaud had a fresh planet imported 3 weeks ago. It is currently 263GB
in-DB, with an additional ~20GB for flat-nodes. Of this, 60GB is slim
data, 105GB slim indexes, 76GB rendering data, 19GB rendering indexes,
and 1.6GB non-standard rendering indexes.
This gives you a total of 95GB for rendering, and an additional 190GB
for slim.
One additional caution: the import temporarily requires extra space to
perform the CLUSTER step.
This leads to a few conclusions:
- If you don't need updates, use --slim --drop --flat-nodes and delete
the flat nodes file to save lots of space, particularly during import
- You probably want to run reindex every 1-3 months
- If you want to update your database by reloading into a new DB and
replacing the old DB with it, the additional space requirements during
import vs. non-slim are modest (slim size - CLUSTER requirements for
non-slim tables = ~60GB)
- I'd recommend a 512GB-class SSD with a new server if you want
everything on an SSD. The OSMF rendering servers use 512GB 840 Pros. If
you have less space, use tablespace options to split it up, particularly
if you're only rendering and just need the slim tables for the import.
- Tune autovacuum aggressively if you're short on space
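As a sketch of the tablespace split mentioned above: osm2pgsql has
per-table-group tablespace flags, so you can put the rendering tables on
the SSD and the bulky slim tables on a spinning disk. The tablespace
names and paths below are placeholders, and the tablespaces must already
exist in Postgres:

```shell
# Hypothetical split: rendering data/indexes on the SSD tablespace,
# slim data/indexes on a cheaper HDD tablespace.
osm2pgsql --slim \
    --flat-nodes /ssd/flat.nodes \
    --tablespace-main-data ssd_space \
    --tablespace-main-index ssd_space \
    --tablespace-slim-data hdd_space \
    --tablespace-slim-index hdd_space \
    planet-latest.osm.pbf
```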
___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk