Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)
On 2/16/2012 6:00 PM, Jochen Topf wrote: Generic key names can be confusing, especially when one OSM object is used for multiple things. Say there is a way tagged as railway and at the same time this way is part of an area tagged as a generating station. Does "power_source" mean the type of generating station or the type of power used by the railway (overhead vs. third rail vs. unelectrified)? I hope this example is hypothetical, but people do strange things in OSM... A better example is a highway=service service=alley with a railway=rail service=spur down the middle. ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)
On Fri, Feb 17, 2012 at 02:02:26PM +, Jukka Rahkonen wrote: > Jochen Topf remote.org> writes: > > > > > > Much too slow. Requiring joins on every query is not a good idea. And having > > the right indexes is important. You can't just index everything and hope it > > would do the right thing. (The most important index btw is the geometry > > index > > not the attribute indexes, although having the right attribute indexes here > > or there can help.) > > I sent by followup accidentally to osm-dev list. The main message was that > perhaps there is a need to do some tests before saying if hstore or joined > tables are faster. > > Here is a Spatialite database for testing > > http://latuviitta.org/documents/relation_test.zip > > First trials suggest that joined tables with proper indexes are not at all > slow and compared with the standard tables created by osm2pgsql the query > times can be several times faster. This is not a fair comparison because > osm2pgsql tables are missing the attribute indexes but so they normally do > in the Mapnik chain. Query times vary, so try yourself. Data is osm_lines > from Berlin from some months ago. Sorry, but that kind of test doesn't really tell you anything about real performance on a real system. Do this test with data for the whole planet with one process updating the database and multiple readers accessing the db at the same time. And results you get with sqlite do not tell you anything about results for postgresql. > By the way, recent development around GDAL suggests that Spatialite may be > faster with spatial and attribute queries than PostGIS. Improvements in GDAL > Spatialite driver made Mapserver about 20 times faster than it used to be and > now Spatialite is faster that PostGIS and even shapefiles. > > > query from osm_line from osm2pgsql > === > select geometry, highway from osm_line > where highway='primary' > 0.531 seconds > > query from joined tables > === > select * from line_geometry, tags > where tags.tag='highway' and tags.value='primary' > and tags.join_id=line_geometry.osm_id > 0.059 seconds These numbers are also irrelevant for our discussion here because Mapnik doesn't do this kinds of queries. Mapnik always queries with a bounding box. Jochen -- Jochen Topf joc...@remote.org http://www.remote.org/jochen/ +49-721-388298 ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)
Jochen Topf remote.org> writes: > > Much too slow. Requiring joins on every query is not a good idea. And having > the right indexes is important. You can't just index everything and hope it > would do the right thing. (The most important index btw is the geometry index > not the attribute indexes, although having the right attribute indexes here > or there can help.) I sent by followup accidentally to osm-dev list. The main message was that perhaps there is a need to do some tests before saying if hstore or joined tables are faster. Here is a Spatialite database for testing http://latuviitta.org/documents/relation_test.zip First trials suggest that joined tables with proper indexes are not at all slow and compared with the standard tables created by osm2pgsql the query times can be several times faster. This is not a fair comparison because osm2pgsql tables are missing the attribute indexes but so they normally do in the Mapnik chain. Query times vary, so try yourself. Data is osm_lines from Berlin from some months ago. By the way, recent development around GDAL suggests that Spatialite may be faster with spatial and attribute queries than PostGIS. Improvements in GDAL Spatialite driver made Mapserver about 20 times faster than it used to be and now Spatialite is faster that PostGIS and even shapefiles. query from osm_line from osm2pgsql === select geometry, highway from osm_line where highway='primary' 0.531 seconds query from joined tables === select * from line_geometry, tags where tags.tag='highway' and tags.value='primary' and tags.join_id=line_geometry.osm_id 0.059 seconds -Jukka Rahkonen- ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)
Thanks - I will give it another try - I have got a computer running Ubuntu 11.10, which has Postgres 9.1.2. I will create a second database with an hstore and see how it compares. Graham. On 17 February 2012 11:36, Stephan Knauss wrote: > Komяpa writes: > >> - is a database generated by osm2pgsql with an hstore expected to perform >>> similarly to one without? >>> >> >> It performs at the same speed for me. >> Of course you're not supposed to use hstore for every tag, but just >> for those on highest zoom levels, where spatial indexes are used >> mostly, not indexes for other columns. >> > > if you are using Postgres 9.1 you are advised to update to 9.1.2. There > was a bug introduced that prevented index usage. For example testing for a > tag like > SELECT * from table WHERE hstore ? tag > did not use an index and could make a query horribly slow. We experienced > this on Toolserver. > > Stephan > -- Graham Jones Hartlepool, UK. ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)
Komяpa writes: - is a database generated by osm2pgsql with an hstore expected to perform similarly to one without? It performs at the same speed for me. Of course you're not supposed to use hstore for every tag, but just for those on highest zoom levels, where spatial indexes are used mostly, not indexes for other columns. if you are using Postgres 9.1 you are advised to update to 9.1.2. There was a bug introduced that prevented index usage. For example testing for a tag like SELECT * from table WHERE hstore ? tag did not use an index and could make a query horribly slow. We experienced this on Toolserver. Stephan ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)
> - is a database generated by osm2pgsql with an hstore expected to perform > similarly to one without? It performs at the same speed for me. Of course you're not supposed to use hstore for every tag, but just for those on highest zoom levels, where spatial indexes are used mostly, not indexes for other columns. -- Darafei "Komяpa" Praliaskouski OSM BY Team - http://openstreetmap.by/ xmpp:m...@komzpa.net mailto:m...@komzpa.net ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)
On Fri, Feb 17, 2012 at 07:44:41AM +, Jukka Rahkonen wrote: > Frederik Ramm remote.org> writes: > > > > > Hi, > > > > On 02/16/2012 07:25 PM, Graham Jones wrote: > > > This reminded me of a question I have been meaning to ask for quite a > > > while - is a database generated by osm2pgsql with an hstore expected to > > > perform similarly to one without? > > > > I never ran one with hstore when I think of what this must mean for the > > database engine, and storage space, then I shudder and would not be > > surprised by the factor 5 you mentioned. > > How about doing it with relations? Let the importer program create four > tables > osm_point > osm_line > osm_polygon > osm_relation > > These tables would each have two attributes: Geometry and osm_is. Geometries > in > the osm_relation would be of type "geometry collection", that is, a collection > of whatever, and it could hold for example the tranport route relation with > the > route and bus stops and everything in one PostGIS geometry object. > > Then there should be one or four tables for tags (everything in one table or > split to suit the geometry tables). The three attributes would be osm_id, key, > and value. Osm_id would be used as a foreing key for joining geometries and > attributes. If osm_id can not be guaranteed to be unique then there should be > point_id, line_id, polygon_id and relation_id added to suitable places. > > Every attribute would be indexed. For Mapnik use where all that is needed is > to > do simple SQL selects. I guess that this database would be faster to query > than > hstore even it also contains all the tags. It might be faster for Mapnik than > the current tables with extra wide attribute schema because now attributes are > not indexed at all (or are they?). SQL queries according to tags and values or > even with a part of the tag and value strings would be easier to do than from > hstore. People could enhance attributes for some special uses by converting > max_speeds and other values which are actually measures from strings to > integers > or doubles etc. > > This option is so obvious that I believe that someone must have tried it > already. It would be nice to hear about experiences. Much too slow. Requiring joins on every query is not a good idea. And having the right indexes is important. You can't just index everything and hope it would do the right thing. (The most important index btw is the geometry index not the attribute indexes, although having the right attribute indexes here or there can help.) Jochen -- Jochen Topf joc...@remote.org http://www.remote.org/jochen/ +49-721-388298 ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)
Frederik Ramm remote.org> writes: > > Hi, > > On 02/16/2012 07:25 PM, Graham Jones wrote: > > This reminded me of a question I have been meaning to ask for quite a > > while - is a database generated by osm2pgsql with an hstore expected to > > perform similarly to one without? > > I never ran one with hstore when I think of what this must mean for the > database engine, and storage space, then I shudder and would not be > surprised by the factor 5 you mentioned. How about doing it with relations? Let the importer program create four tables osm_point osm_line osm_polygon osm_relation These tables would each have two attributes: Geometry and osm_is. Geometries in the osm_relation would be of type "geometry collection", that is, a collection of whatever, and it could hold for example the tranport route relation with the route and bus stops and everything in one PostGIS geometry object. Then there should be one or four tables for tags (everything in one table or split to suit the geometry tables). The three attributes would be osm_id, key, and value. Osm_id would be used as a foreing key for joining geometries and attributes. If osm_id can not be guaranteed to be unique then there should be point_id, line_id, polygon_id and relation_id added to suitable places. Every attribute would be indexed. For Mapnik use where all that is needed is to do simple SQL selects. I guess that this database would be faster to query than hstore even it also contains all the tags. It might be faster for Mapnik than the current tables with extra wide attribute schema because now attributes are not indexed at all (or are they?). SQL queries according to tags and values or even with a part of the tag and value strings would be easier to do than from hstore. People could enhance attributes for some special uses by converting max_speeds and other values which are actually measures from strings to integers or doubles etc. This option is so obvious that I believe that someone must have tried it already. It would be nice to hear about experiences. -Jukka Rahkonen- ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)
2012/2/17 Jochen Topf : > On Thu, Feb 16, 2012 at 06:25:30PM +, Graham Jones wrote: > Generic key names can be confusing, especially when one OSM object is used for > multiple things. +1. E.g. an object tagged barrier=fence, height=2, landuse=forest. In this example you could also use 2 objects: a closed way for the fence and a multipolygon for the area. Linear objects we can create with route relations (instead of overlapping ways). We don't have currently a relation for nodes and maybe we don't need this utterly (you could place another node "nearby"), but it would solve some cases where you want to state the topology precisely (e.g. 2 objects at the same pole). This way we can have "things" (expressed by a relation) which have their geometry only as an attribute, instead of a geometry that _is_ something. > Non-generic tags also make tools such as Taginfo more useful, because each tag > stands on its own. They make it immediately clear if the tagging is > incomplete. > If the "generator" tag got lost, how do we know what the "power_source" tag > is supposed to mean? +1 for taginfo, but I don't think we need the namespaces for the case something gets "lost" (there is the history for this). Cheers, Martin ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)
On Thu, Feb 16, 2012 at 06:25:30PM +, Graham Jones wrote: > Why create a key generator:power_source rather than just use > power_source. power_source is much more generic so you could re-cycle it > for things like district heating, but generator:power_source is only ever > going to be used for generating stations, and needs a new column in the > database. . I think I just prefer more generic, re-usable keys > rather than trying to invent a new one for each situation Generic key names can be confusing, especially when one OSM object is used for multiple things. Say there is a way tagged as railway and at the same time this way is part of an area tagged as a generating station. Does "power_source" mean the type of generating station or the type of power used by the railway (overhead vs. third rail vs. unelectrified)? I hope this example is hypothetical, but people do strange things in OSM... Non-generic tags also make tools such as Taginfo more useful, because each tag stands on its own. They make it immediately clear if the tagging is incomplete. If the "generator" tag got lost, how do we know what the "power_source" tag is supposed to mean? So on the one side you have generic tags like "type" and "class" and "id" which are very confusing. You never know what they belong to. On the other side you have rather complex tagging schemas like in your generator:... example. Thats sometimes hard to use, too. We still have to figure out where the sweet spot is, but I tend to prefer the more explicit keys. Jochen -- Jochen Topf joc...@remote.org http://www.remote.org/jochen/ +49-721-388298 ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)
On 16-2-2012 19:25, Graham Jones wrote: Why create a key generator:power_source rather than just use power_source. power_source is much more generic so you could re-cycle it for things like district heating, but generator:power_source is only ever going to be used for generating stations, and needs a new column in the database. . I think I just prefer more generic, re-usable keys rather than trying to invent a new one for each situation This is a core problem with the public_transport=* scheme as well. This tag in and of itself is fine, but the whole additional shebang that is train=yes/no, bus=yes/no, subway=yes/no, etc etc is so dreadful (in the context of the rendering chain) that every time I read #2798 I feel the urge to run away screaming. http://trac.openstreetmap.org/ticket/2798 -- Lennard ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)
On 16 February 2012 18:51, Frederik Ramm wrote: > On 02/16/2012 07:25 PM, Graham Jones wrote: > >> This reminded me of a question I have been meaning to ask for quite a >> while - is a database generated by osm2pgsql with an hstore expected to >> perform similarly to one without? >> > > I never ran one with hstore when I think of what this must mean for the > database engine, and storage space, then I shudder and would not be > surprised by the factor 5 you mentioned. Thanks - I don't feel so bad about giving up so easily now! If you want to add a new key but avoid re-importing the full database, you > can add the key to osm2pgsql's style file, somehow generate an .osm file > that contains only the objects that have this tag, wrap this .osm file into > a "..." instead of > "..." and throw it at osm2pgsql in append mode. That's a neat idea - will try that next time, thanks. > People like keys that are understandable without context. Contrast this > with a tag "width=2" - where you have no chance in knowing what it means > without looking at the whole thing. Yes - I think it is just a style difference - I am quite happy to interpret it from context for the sake of having a smaller number of unique keys to think about. Cheers Graham -- Graham Jones Hartlepool, UK. ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk
Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)
Hi, On 02/16/2012 07:25 PM, Graham Jones wrote: This reminded me of a question I have been meaning to ask for quite a while - is a database generated by osm2pgsql with an hstore expected to perform similarly to one without? I never ran one with hstore when I think of what this must mean for the database engine, and storage space, then I shudder and would not be surprised by the factor 5 you mentioned. I keep having to re-import my database to add new keys. If you want to add a new key but avoid re-importing the full database, you can add the key to osm2pgsql's style file, somehow generate an .osm file that contains only the objects that have this tag, wrap this .osm file into a "..." instead of "..." and throw it at osm2pgsql in append mode. power_source is much more generic so you could re-cycle it for things like district heating, but generator:power_source is only ever going to be used for generating stations, and needs a new column in the database. Yes, that's the plan I believe. People like keys that are understandable without context. Contrast this with a tag "width=2" - where you have no chance in knowing what it means without looking at the whole thing. Bye Frederik -- Frederik Ramm ## eMail frede...@remote.org ## N49°00'09" E008°23'33" ___ talk mailing list talk@openstreetmap.org http://lists.openstreetmap.org/listinfo/talk