Re: [OSM-dev] Yevaud SSD Drive

2012-04-13 Thread John Perrin
Thanks for this great response, Kai; it's incredibly useful.

 

John

 


Re: [OSM-dev] Yevaud SSD Drive

2012-04-12 Thread Toby Murray
On Thu, Apr 12, 2012 at 4:33 PM, John Perrin john.d.per...@gmail.com wrote:
 Hi,

 I've posted this question on the OSM Q & A site as well; I'm not sure what the
 best forum for the question is, so please forgive the dual post if you also
 follow that site.

 Basically, I was just inquiring into the specific need for the SSD drive on
 the yevaud tile server.  I'm looking to run an OSM tile server that can
 handle roughly 200,000 - 400,000 map views a day and have taken this as a
 good benchmark for the server spec.  However, the SSD is half the cost of
 reproducing a server with that spec.  I was just wondering exactly what the
 disk was used for, and why it specifically needed an SSD drive. I can see
 the purchase logged in the server upgrade history, but I can't find any
 reason explaining why it was needed.

See the yearly graph here:
http://munin.openstreetmap.org/openstreetmap/yevaud.openstreetmap/renderd_queue.html
and here:
http://munin.openstreetmap.org/openstreetmap/yevaud.openstreetmap/renderd_processed.html

The second graph had a hiccup in August that makes the other values kind of
hard to read, but at the left edge you can see the render queue was full for
a lot of the time and render requests were being dropped frequently. Then in
May of last year the SSD was installed. Now I think the only time the render
queue fills up is during short spikes of peak load (when OSM hits Slashdot
and the like).

The SSD holds the PostGIS database that Mapnik executes queries against to
pull the data for rendering. Before the SSD, I believe yevaud was running a
fairly beefy RAID 10, so the SSD really did make a big difference.

Toby



Re: [OSM-dev] Yevaud SSD Drive

2012-04-12 Thread Paul Norman
I believe the SSD is used for the database. Before the SSD, the DB was on the
RAID10 array. I'm not sure four 300 GB 10k RPM drives are much cheaper than
an SSD.

 

You might find looking through munin for yevaud helpful -
http://munin.openstreetmap.org/openstreetmap/yevaud.openstreetmap/#disk

The SSD is sdd according to the wiki. 

 

How many tiles do you expect each map view to generate? I'd expect at least
50-100. This would give you an average of 200-500 requests/second. Just for
comparison, the caches in front of yevaud peak at about 3.5k requests/second.
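
(To make the arithmetic behind that estimate explicit, here is a quick
back-of-the-envelope calculation in Python. The views/day and tiles/view
figures are simply the ranges mentioned in this thread, and the result is
only a daily average; evening-heavy traffic will peak well above it.)

    # Average tile request rate implied by the numbers in this thread:
    # 200,000 - 400,000 map views/day at 50 - 100 tile requests per view.
    SECONDS_PER_DAY = 86400

    def requests_per_second(views_per_day, tiles_per_view):
        """Average tile requests per second for a given daily view count."""
        return views_per_day * tiles_per_view / SECONDS_PER_DAY

    print(requests_per_second(200000, 50))    # ~116 req/s at the low end
    print(requests_per_second(400000, 100))   # ~463 req/s at the high end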

 



Re: [OSM-dev] Yevaud SSD Drive

2012-04-12 Thread John Perrin
Thanks for the answers so far.

The standard map size will be 600 x 400, so given that OpenLayers might
request tiles around the visible map area, I had thought I would be getting
at least 16 tile requests per map view (see the sketch below). Of course this
doesn't take into account people panning around. Our current Google Maps
usage is about 200,000 views a day, but we can't tell from the data we have
how many people are panning around the maps, as Google only counts the
registering of the API as a hit, not how many tiles are served to the client.
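
(As a rough sketch of where a tiles-per-view figure like that comes from:
with standard 256x256 px tiles, a 600x400 viewport needs about a 4x3 tile
grid once you allow for the grid being offset, and more if the client
preloads a buffer ring around the visible area. The buffer behaviour below is
an assumption for illustration, not a description of any particular
OpenLayers configuration.)

    import math

    TILE_PX = 256  # standard slippy-map tile size

    def tiles_for_viewport(width_px, height_px, buffer_ring=0):
        """Tiles covering a viewport, plus an optional ring of preloaded tiles.

        The +1 allows for the tile grid being offset relative to the viewport,
        so a partially visible column/row can appear on both edges.
        """
        cols = math.ceil(width_px / TILE_PX) + 1 + 2 * buffer_ring
        rows = math.ceil(height_px / TILE_PX) + 1 + 2 * buffer_ring
        return cols * rows

    print(tiles_for_viewport(600, 400))                 # 12 tiles, visible area only
    print(tiles_for_viewport(600, 400, buffer_ring=1))  # 30 tiles with a one-tile buffer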

Our load profile is heavily weighted to the evening rather than spread out
evenly throughout the day, so we could hit a spike of ~1,000 requests a
second, which the yevaud spec seems to handle ably.

John





Re: [OSM-dev] Yevaud SSD Drive

2012-04-12 Thread Kai Krueger
There are two main components to the storage system of a tile server, each of
which can have different requirements depending on the circumstances:
1) Tile storage cache

For the tile storage one usually needs quite a bit of space, but performance
isn't quite as critical. For a general-purpose worldwide map you will likely
need somewhere on the order of 600 GB or more. The full worldwide tile set is
considerably larger than that, but rendering e.g. z18 ocean tiles on the fly
is usually possible without too many problems. I don't know the exact
scaling, but it seems that somewhere above 300 - 600 GB the cache hit rate
only increases slowly with the size of the cache.

Performance-wise, it appears that 1000 tiles/s will generate somewhere on the
order of 300 - 500 IOPS on the disk system, although that obviously depends
on the amount of RAM in the server and the distribution of areas served. This
is a level of performance that you can probably get out of a RAID array of a
few SATA disks. The performance requirement on this part of the disks likely
scales fairly linearly with the number of tiles served per second. Adding a
reverse proxy in front of the tile server can also help considerably to
distribute the load of tile serving. For most tile servers I have seen so
far, tile serving hasn't really been much of an issue, but at rates above
1000 tiles/s you probably do need to consider it as well.
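
(A minimal sketch of that scaling rule, using the 300 - 500 IOPS per 1000
tiles/s ratio above. The per-spindle figure is a generic assumption for
7,200 rpm SATA drives, not a measurement from any particular server.)

    import math

    def iops_needed(tiles_per_second, iops_per_tile=0.4):
        """Disk IOPS implied by a tile-serving rate (0.3 - 0.5 per tile, as above)."""
        return tiles_per_second * iops_per_tile

    def sata_spindles_required(tiles_per_second, iops_per_disk=90):
        """Roughly how many plain SATA spindles that corresponds to."""
        return math.ceil(iops_needed(tiles_per_second) / iops_per_disk)

    print(iops_needed(1000))             # ~400 IOPS at 1000 tiles/s
    print(sata_spindles_required(1000))  # about 5 spindles' worth of random reads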

2) Rendering database

The rendering database is where most people hit disk performance bottlenecks.
For the full planet, the PostGIS database with indexes is around 300 - 400 GB
in size. This, as others have pointed out, is where some people use SSDs.
Quite a bit of performance is consumed in keeping the database up to date
with minutely diffs from OSM. This load does not depend at all on how many
tiles you serve, but only on the rate of editing in OSM. From what I have
seen (and others might correct me), a 2 - 4 disk SATA RAID array might not be
able to keep up with edits during absolute peak editing times (e.g. Sunday
afternoon European time), but should catch back up during the night. On top
of that comes the actual rendering of tiles, as typically one doesn't
re-render tiles in advance (other than low-zoom tiles) but only once they are
actually viewed. Rendering performance does to some degree depend on the tile
serving performance. If it doesn't matter how up to date rendered tiles are,
rendering requests can be queued and rendered during quiet periods, which
considerably reduces the performance requirements on the database.
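
(For anyone unfamiliar with how those diffs get applied, below is a minimal
sketch of the usual osmosis + osm2pgsql update loop. The working directory,
change-file path and database name are placeholders, exact flags vary between
osm2pgsql versions, and real deployments typically use the stock update
scripts rather than rolling their own.)

    import subprocess

    # Placeholders -- adjust for your own setup.
    WORKDIR = "/var/lib/replicate"      # osmosis replication state directory
    CHANGE_FILE = "/tmp/changes.osc.gz"
    DATABASE = "gis"

    def apply_pending_diffs():
        """Fetch replication diffs accumulated since the last run and apply them."""
        # Download everything newer than the stored state into one change file.
        subprocess.run(
            ["osmosis",
             "--read-replication-interval", "workingDirectory=" + WORKDIR,
             "--simplify-change",
             "--write-xml-change", CHANGE_FILE],
            check=True,
        )
        # Apply the change file to the rendering database in slim/append mode.
        subprocess.run(
            ["osm2pgsql", "--append", "--slim", "-d", DATABASE, CHANGE_FILE],
            check=True,
        )

    if __name__ == "__main__":
        apply_pending_diffs()  # run from cron as often as you want to stay up to date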

So overall, whether you need an SSD for the database mostly depends on how up
to date you want to be with respect to OSM edits. If you want to be on the
order of minutes behind OSM, you probably will need an SSD. Given that fast
updates are important for mappers as a reward for their work, the SSD in
OSM's tile server has been a big win. If daily or less frequent updates are
fine, then you might not need one. Once you get down to monthly updates, you
are likely best off not using an updatable database but doing full reimports;
the database then typically shrinks to less than half the size.

It also depends on how spatially distributed your requests are. If, for
example, your site has a set of locations around which it displays local
maps, i.e. the same locations are shown over and over again, the rendering
load is much lower than if you offer downloading country-wide tiles for
offline use in a mobile app, even with the same amount of serving load.

If you don't need a worldwide map, then the hardware requirements also drop
considerably, and once you get down to a single country such as Italy or the
UK, you possibly don't really have to worry about the database at all, as any
modern hardware is probably sufficient.

Kai


Re: [OSM-dev] Yevaud SSD Drive

2012-04-12 Thread Ian Dees
This is a great writeup, Kai. I hope you throw it on the wiki or something.

I'll throw in my 2 cents: the OSM US tile server has a worldwide osm2pgsql
database set up for experimental rendering. It has almost no tile rendering
load (because it's not really doing anything notable yet) and I had it
updating every 2 hours before the diff location changed. It has four 600 GB
WD VelociRaptor disks in a RAID10 setup and was usually able to catch up
those 2 hours in 15-30 minutes.
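
(A quick check of what those figures imply: applying 2 hours of accumulated
diffs in 15 - 30 minutes means the array keeps up at roughly 4 - 8x real time
while it has essentially no rendering load.)

    # Catch-up ratio implied by the figures above.
    accumulated_minutes = 120          # 2 hours of diffs
    for apply_minutes in (15, 30):
        ratio = accumulated_minutes / apply_minutes
        print("%.0fx real time when applied in %d minutes" % (ratio, apply_minutes))
    # -> 8x and 4x real time, i.e. comfortable headroom with no rendering load.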
