Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-24 Thread Nathan Edgars II

On 2/16/2012 6:00 PM, Jochen Topf wrote:

Generic key names can be confusing, especially when one OSM object is used for
multiple things. Say there is a way tagged as railway and at the same time this
way is part of an area tagged as a generating station. Does power_source mean
the type of generating station or the type of power used by the railway
(overhead vs. third rail vs. unelectrified)? I hope this example is
hypothetical, but people do strange things in OSM...


A better example is a highway=service service=alley with a railway=rail 
service=spur down the middle.


___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-17 Thread Jochen Topf
On Fri, Feb 17, 2012 at 07:44:41AM +, Jukka Rahkonen wrote:
 Frederik Ramm frederik at remote.org writes:
 
  
  Hi,
  
  On 02/16/2012 07:25 PM, Graham Jones wrote:
   This reminded me of a question I have been meaning to ask for quite a
   while - is a database generated by osm2pgsql with an hstore expected to
   perform similarly to one without?
  
  I never ran one with hstore when I think of what this must mean for the 
  database engine, and storage space, then I shudder and would not be 
  surprised by the factor 5 you mentioned.
 
 How about doing it with relations? Let the importer program create four 
 tables 
 osm_point
 osm_line
 osm_polygon
 osm_relation
 
 These tables would each have two attributes: Geometry and osm_is. Geometries 
 in
 the osm_relation would be of type geometry collection, that is, a collection
 of whatever, and it could hold for example the tranport route relation with 
 the
 route and bus stops and everything in one PostGIS geometry object.
 
 Then there should be one or four tables for tags (everything in one table or
 split to suit the geometry tables). The three attributes would be osm_id, key,
 and value. Osm_id would be used as a foreing key for joining geometries and
 attributes. If osm_id can not be guaranteed to be unique then there should be
 point_id, line_id, polygon_id and relation_id added to suitable places.
 
 Every attribute would be indexed. For Mapnik use where all that is needed is 
 to
 do simple SQL selects.  I guess that this database would be faster to query 
 than
 hstore even it also contains all the tags. It might be faster for Mapnik than
 the current tables with extra wide attribute schema because now attributes are
 not indexed at all (or are they?). SQL queries according to tags and values or
 even with a part of the tag and value strings would be easier to do than from
 hstore. People could enhance attributes for some special uses by converting
 max_speeds and other values which are actually measures from strings to 
 integers
 or doubles etc.
 
 This option is so obvious that I believe that someone must have tried it
 already. It would be nice to hear about experiences.

Much too slow. Requiring joins on every query is not a good idea. And having
the right indexes is important. You can't just index everything and hope it
would do the right thing. (The most important index btw is the geometry index
not the attribute indexes, although having the right attribute indexes here
or there can help.)

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.remote.org/jochen/  +49-721-388298


___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-17 Thread Komяpa
 - is a database generated by osm2pgsql with an hstore expected to perform
 similarly to one without?

It performs at the same speed for me.

Of course you're not supposed to use hstore for every tag, but just
for those on highest zoom levels, where spatial indexes are used
mostly, not indexes for other columns.

-- 
Darafei Komяpa Praliaskouski
OSM BY Team - http://openstreetmap.by/
xmpp:m...@komzpa.net mailto:m...@komzpa.net

___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-17 Thread Stephan Knauss
Komяpa writes: 


- is a database generated by osm2pgsql with an hstore expected to perform
similarly to one without?


It performs at the same speed for me. 


Of course you're not supposed to use hstore for every tag, but just
for those on highest zoom levels, where spatial indexes are used
mostly, not indexes for other columns.


if you are using Postgres 9.1 you are advised to update to 9.1.2. There was 
a bug introduced that prevented index usage. For example testing for a tag 
like

SELECT * from table WHERE hstore ? tag
did not use an index and could make a query horribly slow. We experienced 
this on Toolserver. 



Stephan

___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-17 Thread Graham Jones
Thanks - I will give it another try - I have got a computer running Ubuntu
11.10, which has Postgres 9.1.2.   I will create a second database with an
hstore and see how it compares.

Graham.

On 17 February 2012 11:36, Stephan Knauss o...@stephans-server.de wrote:

 Komяpa writes:

 - is a database generated by osm2pgsql with an hstore expected to perform
 similarly to one without?


 It performs at the same speed for me.
 Of course you're not supposed to use hstore for every tag, but just
 for those on highest zoom levels, where spatial indexes are used
 mostly, not indexes for other columns.


 if you are using Postgres 9.1 you are advised to update to 9.1.2. There
 was a bug introduced that prevented index usage. For example testing for a
 tag like
 SELECT * from table WHERE hstore ? tag
 did not use an index and could make a query horribly slow. We experienced
 this on Toolserver.

 Stephan




-- 
Graham Jones
Hartlepool, UK.
___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-17 Thread Jukka Rahkonen
Jochen Topf jochen at remote.org writes:


 
 Much too slow. Requiring joins on every query is not a good idea. And having
 the right indexes is important. You can't just index everything and hope it
 would do the right thing. (The most important index btw is the geometry index
 not the attribute indexes, although having the right attribute indexes here
 or there can help.)
 
I sent by followup accidentally to osm-dev list. The main message was that
perhaps there is a need to do some tests before saying if hstore or joined
tables are faster.

Here is a Spatialite database for testing

http://latuviitta.org/documents/relation_test.zip

First trials suggest that joined tables with proper indexes are not at all
slow and compared with the standard tables created by osm2pgsql the query
times can be several times faster. This is not a fair comparison because
osm2pgsql tables are missing the attribute indexes but so they normally do
in the Mapnik chain. Query times vary, so try yourself. Data is osm_lines
from Berlin from some months ago.

By the way, recent development around GDAL suggests that Spatialite may be
faster with spatial and attribute queries than PostGIS. Improvements in GDAL
Spatialite driver made Mapserver about 20 times faster than it used to be and
now Spatialite is faster that PostGIS and even shapefiles.


query from osm_line from osm2pgsql
===
select geometry, highway from osm_line
where highway='primary'
0.531 seconds

query from joined tables
===
select * from line_geometry, tags
where tags.tag='highway' and tags.value='primary'
and tags.join_id=line_geometry.osm_id
0.059 seconds


-Jukka Rahkonen-


___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-17 Thread Jochen Topf
On Fri, Feb 17, 2012 at 02:02:26PM +, Jukka Rahkonen wrote:
 Jochen Topf jochen at remote.org writes:
 
 
  
  Much too slow. Requiring joins on every query is not a good idea. And having
  the right indexes is important. You can't just index everything and hope it
  would do the right thing. (The most important index btw is the geometry 
  index
  not the attribute indexes, although having the right attribute indexes here
  or there can help.)
  
 I sent by followup accidentally to osm-dev list. The main message was that
 perhaps there is a need to do some tests before saying if hstore or joined
 tables are faster.
 
 Here is a Spatialite database for testing
 
 http://latuviitta.org/documents/relation_test.zip
 
 First trials suggest that joined tables with proper indexes are not at all
 slow and compared with the standard tables created by osm2pgsql the query
 times can be several times faster. This is not a fair comparison because
 osm2pgsql tables are missing the attribute indexes but so they normally do
 in the Mapnik chain. Query times vary, so try yourself. Data is osm_lines
 from Berlin from some months ago.

Sorry, but that kind of test doesn't really tell you anything about real
performance on a real system. Do this test with data for the whole planet with
one process updating the database and multiple readers accessing the db at the
same time. And results you get with sqlite do not tell you anything about
results for postgresql.

 By the way, recent development around GDAL suggests that Spatialite may be
 faster with spatial and attribute queries than PostGIS. Improvements in GDAL
 Spatialite driver made Mapserver about 20 times faster than it used to be and
 now Spatialite is faster that PostGIS and even shapefiles.
 
 
 query from osm_line from osm2pgsql
 ===
 select geometry, highway from osm_line
 where highway='primary'
 0.531 seconds
 
 query from joined tables
 ===
 select * from line_geometry, tags
 where tags.tag='highway' and tags.value='primary'
 and tags.join_id=line_geometry.osm_id
 0.059 seconds

These numbers are also irrelevant for our discussion here because Mapnik
doesn't do this kinds of queries. Mapnik always queries with a bounding box.

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.remote.org/jochen/  +49-721-388298


___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-16 Thread Frederik Ramm

Hi,

On 02/16/2012 07:25 PM, Graham Jones wrote:

This reminded me of a question I have been meaning to ask for quite a
while - is a database generated by osm2pgsql with an hstore expected to
perform similarly to one without?


I never ran one with hstore when I think of what this must mean for the 
database engine, and storage space, then I shudder and would not be 
surprised by the factor 5 you mentioned.



I keep having to re-import my database to add new keys.


If you want to add a new key but avoid re-importing the full database, 
you can add the key to osm2pgsql's style file, somehow generate an .osm 
file that contains only the objects that have this tag, wrap this .osm 
file into a osmChangemodify.../modify/osmChange instead of 
osm.../osm and throw it at osm2pgsql in append mode.



power_source is much more generic so you could re-cycle
it for things like district heating, but generator:power_source is only
ever going to be used for generating stations, and needs a new column in
the database.


Yes, that's the plan I believe. People like keys that are understandable 
without context. Contrast this with a tag width=2 - where you have no 
chance in knowing what it means without looking at the whole thing.


Bye
Frederik

--
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09 E008°23'33

___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-16 Thread Graham Jones
On 16 February 2012 18:51, Frederik Ramm frede...@remote.org wrote:

 On 02/16/2012 07:25 PM, Graham Jones wrote:

 This reminded me of a question I have been meaning to ask for quite a
 while - is a database generated by osm2pgsql with an hstore expected to
 perform similarly to one without?


 I never ran one with hstore when I think of what this must mean for the
 database engine, and storage space, then I shudder and would not be
 surprised by the factor 5 you mentioned.


Thanks - I don't feel so bad about giving up so easily now!

If you want to add a new key but avoid re-importing the full database, you
 can add the key to osm2pgsql's style file, somehow generate an .osm file
 that contains only the objects that have this tag, wrap this .osm file into
 a osmChangemodify.../**modify/osmChange instead of
 osm.../osm and throw it at osm2pgsql in append mode.


That's a neat idea - will try that next time, thanks.


 People like keys that are understandable without context. Contrast this
 with a tag width=2 - where you have no chance in knowing what it means
 without looking at the whole thing.


Yes - I think it is just a style difference - I am quite happy to interpret
it from context for the sake of having a smaller number of unique keys to
think about.

Cheers

Graham

-- 
Graham Jones
Hartlepool, UK.
___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-16 Thread Lennard

On 16-2-2012 19:25, Graham Jones wrote:


grumble Why create a key generator:power_source rather than just use
power_source.  power_source is much more generic so you could re-cycle
it for things like district heating, but generator:power_source is only
ever going to be used for generating stations, and needs a new column in
the database. /grumble.   I think I just prefer more generic,
re-usable keys rather than trying to invent a new one for each situation


This is a core problem with the public_transport=* scheme as well. This 
tag in and of itself is fine, but the whole additional shebang that is 
train=yes/no, bus=yes/no, subway=yes/no, etc etc is so dreadful (in the 
context of the rendering chain) that every time I read #2798 I feel the 
urge to run away screaming.


http://trac.openstreetmap.org/ticket/2798

--
Lennard

___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-16 Thread Jochen Topf
On Thu, Feb 16, 2012 at 06:25:30PM +, Graham Jones wrote:
 grumble Why create a key generator:power_source rather than just use
 power_source.  power_source is much more generic so you could re-cycle it
 for things like district heating, but generator:power_source is only ever
 going to be used for generating stations, and needs a new column in the
 database. /grumble.   I think I just prefer more generic, re-usable keys
 rather than trying to invent a new one for each situation

Generic key names can be confusing, especially when one OSM object is used for
multiple things. Say there is a way tagged as railway and at the same time this
way is part of an area tagged as a generating station. Does power_source mean
the type of generating station or the type of power used by the railway
(overhead vs. third rail vs. unelectrified)? I hope this example is
hypothetical, but people do strange things in OSM...

Non-generic tags also make tools such as Taginfo more useful, because each tag
stands on its own. They make it immediately clear if the tagging is incomplete.
If the generator tag got lost, how do we know what the power_source tag
is supposed to mean?

So on the one side you have generic tags like type and class and id which
are very confusing. You never know what they belong to. On the other side you
have rather complex tagging schemas like in your generator:... example. Thats
sometimes hard to use, too. We still have to figure out where the sweet spot is,
but I tend to prefer the more explicit keys.

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.remote.org/jochen/  +49-721-388298


___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-16 Thread Martin Koppenhoefer
2012/2/17 Jochen Topf joc...@remote.org:
 On Thu, Feb 16, 2012 at 06:25:30PM +, Graham Jones wrote:
 Generic key names can be confusing, especially when one OSM object is used for
 multiple things.


+1.
E.g. an object tagged barrier=fence, height=2, landuse=forest. In this
example you could also use 2 objects: a closed way for the fence and a
multipolygon for the area. Linear objects we can create with route
relations (instead of overlapping ways). We don't have currently a
relation for nodes and maybe we don't need this utterly (you could
place another node nearby), but it would solve some cases where you
want to state the topology precisely (e.g. 2 objects at the same
pole). This way we can have things (expressed by a relation) which
have their geometry only as an attribute, instead of a geometry that
_is_ something.


 Non-generic tags also make tools such as Taginfo more useful, because each tag
 stands on its own. They make it immediately clear if the tagging is 
 incomplete.
 If the generator tag got lost, how do we know what the power_source tag
 is supposed to mean?


+1 for taginfo, but I don't think we need the namespaces for the case
something gets lost (there is the history for this).

Cheers,
Martin

___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-16 Thread Jukka Rahkonen
Frederik Ramm frederik at remote.org writes:

 
 Hi,
 
 On 02/16/2012 07:25 PM, Graham Jones wrote:
  This reminded me of a question I have been meaning to ask for quite a
  while - is a database generated by osm2pgsql with an hstore expected to
  perform similarly to one without?
 
 I never ran one with hstore when I think of what this must mean for the 
 database engine, and storage space, then I shudder and would not be 
 surprised by the factor 5 you mentioned.

How about doing it with relations? Let the importer program create four tables 
osm_point
osm_line
osm_polygon
osm_relation

These tables would each have two attributes: Geometry and osm_is. Geometries in
the osm_relation would be of type geometry collection, that is, a collection
of whatever, and it could hold for example the tranport route relation with the
route and bus stops and everything in one PostGIS geometry object.

Then there should be one or four tables for tags (everything in one table or
split to suit the geometry tables). The three attributes would be osm_id, key,
and value. Osm_id would be used as a foreing key for joining geometries and
attributes. If osm_id can not be guaranteed to be unique then there should be
point_id, line_id, polygon_id and relation_id added to suitable places.

Every attribute would be indexed. For Mapnik use where all that is needed is to
do simple SQL selects.  I guess that this database would be faster to query than
hstore even it also contains all the tags. It might be faster for Mapnik than
the current tables with extra wide attribute schema because now attributes are
not indexed at all (or are they?). SQL queries according to tags and values or
even with a part of the tag and value strings would be easier to do than from
hstore. People could enhance attributes for some special uses by converting
max_speeds and other values which are actually measures from strings to integers
or doubles etc.

This option is so obvious that I believe that someone must have tried it
already. It would be nice to hear about experiences.

-Jukka Rahkonen-


___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk