Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-24 Thread Nathan Edgars II

On 2/16/2012 6:00 PM, Jochen Topf wrote:

Generic key names can be confusing, especially when one OSM object is used for
multiple things. Say there is a way tagged as railway and at the same time this
way is part of an area tagged as a generating station. Does "power_source" mean
the type of generating station or the type of power used by the railway
(overhead vs. third rail vs. unelectrified)? I hope this example is
hypothetical, but people do strange things in OSM...


A better example is a highway=service service=alley with a railway=rail 
service=spur down the middle.


___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-17 Thread Jochen Topf
On Fri, Feb 17, 2012 at 02:02:26PM +, Jukka Rahkonen wrote:
> Jochen Topf  remote.org> writes:
> 
> 
> > 
> > Much too slow. Requiring joins on every query is not a good idea. And having
> > the right indexes is important. You can't just index everything and hope it
> > would do the right thing. (The most important index btw is the geometry 
> > index
> > not the attribute indexes, although having the right attribute indexes here
> > or there can help.)
>  
> I sent by followup accidentally to osm-dev list. The main message was that
> perhaps there is a need to do some tests before saying if hstore or joined
> tables are faster.
> 
> Here is a Spatialite database for testing
> 
> http://latuviitta.org/documents/relation_test.zip
> 
> First trials suggest that joined tables with proper indexes are not at all
> slow and compared with the standard tables created by osm2pgsql the query
> times can be several times faster. This is not a fair comparison because
> osm2pgsql tables are missing the attribute indexes but so they normally do
> in the Mapnik chain. Query times vary, so try yourself. Data is osm_lines
> from Berlin from some months ago.

Sorry, but that kind of test doesn't really tell you anything about real
performance on a real system. Do this test with data for the whole planet with
one process updating the database and multiple readers accessing the db at the
same time. And results you get with sqlite do not tell you anything about
results for postgresql.

> By the way, recent development around GDAL suggests that Spatialite may be
> faster with spatial and attribute queries than PostGIS. Improvements in GDAL
> Spatialite driver made Mapserver about 20 times faster than it used to be and
> now Spatialite is faster that PostGIS and even shapefiles.
> 
> 
> query from osm_line from osm2pgsql
> ===
> select geometry, highway from osm_line
> where highway='primary'
> 0.531 seconds
> 
> query from joined tables
> ===
> select * from line_geometry, tags
> where tags.tag='highway' and tags.value='primary'
> and tags.join_id=line_geometry.osm_id
> 0.059 seconds

These numbers are also irrelevant for our discussion here because Mapnik
doesn't do this kinds of queries. Mapnik always queries with a bounding box.

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.remote.org/jochen/  +49-721-388298


___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-17 Thread Jukka Rahkonen
Jochen Topf  remote.org> writes:


> 
> Much too slow. Requiring joins on every query is not a good idea. And having
> the right indexes is important. You can't just index everything and hope it
> would do the right thing. (The most important index btw is the geometry index
> not the attribute indexes, although having the right attribute indexes here
> or there can help.)
 
I sent by followup accidentally to osm-dev list. The main message was that
perhaps there is a need to do some tests before saying if hstore or joined
tables are faster.

Here is a Spatialite database for testing

http://latuviitta.org/documents/relation_test.zip

First trials suggest that joined tables with proper indexes are not at all
slow and compared with the standard tables created by osm2pgsql the query
times can be several times faster. This is not a fair comparison because
osm2pgsql tables are missing the attribute indexes but so they normally do
in the Mapnik chain. Query times vary, so try yourself. Data is osm_lines
from Berlin from some months ago.

By the way, recent development around GDAL suggests that Spatialite may be
faster with spatial and attribute queries than PostGIS. Improvements in GDAL
Spatialite driver made Mapserver about 20 times faster than it used to be and
now Spatialite is faster that PostGIS and even shapefiles.


query from osm_line from osm2pgsql
===
select geometry, highway from osm_line
where highway='primary'
0.531 seconds

query from joined tables
===
select * from line_geometry, tags
where tags.tag='highway' and tags.value='primary'
and tags.join_id=line_geometry.osm_id
0.059 seconds


-Jukka Rahkonen-


___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-17 Thread Graham Jones
Thanks - I will give it another try - I have got a computer running Ubuntu
11.10, which has Postgres 9.1.2.   I will create a second database with an
hstore and see how it compares.

Graham.

On 17 February 2012 11:36, Stephan Knauss  wrote:

> Komяpa writes:
>
>> - is a database generated by osm2pgsql with an hstore expected to perform
>>> similarly to one without?
>>>
>>
>> It performs at the same speed for me.
>> Of course you're not supposed to use hstore for every tag, but just
>> for those on highest zoom levels, where spatial indexes are used
>> mostly, not indexes for other columns.
>>
>
> if you are using Postgres 9.1 you are advised to update to 9.1.2. There
> was a bug introduced that prevented index usage. For example testing for a
> tag like
> SELECT * from table WHERE hstore ? tag
> did not use an index and could make a query horribly slow. We experienced
> this on Toolserver.
>
> Stephan
>



-- 
Graham Jones
Hartlepool, UK.
___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-17 Thread Stephan Knauss
Komяpa writes: 


- is a database generated by osm2pgsql with an hstore expected to perform
similarly to one without?


It performs at the same speed for me. 


Of course you're not supposed to use hstore for every tag, but just
for those on highest zoom levels, where spatial indexes are used
mostly, not indexes for other columns.


if you are using Postgres 9.1 you are advised to update to 9.1.2. There was 
a bug introduced that prevented index usage. For example testing for a tag 
like

SELECT * from table WHERE hstore ? tag
did not use an index and could make a query horribly slow. We experienced 
this on Toolserver. 



Stephan

___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-17 Thread Komяpa
> - is a database generated by osm2pgsql with an hstore expected to perform
> similarly to one without?

It performs at the same speed for me.

Of course you're not supposed to use hstore for every tag, but just
for those on highest zoom levels, where spatial indexes are used
mostly, not indexes for other columns.

-- 
Darafei "Komяpa" Praliaskouski
OSM BY Team - http://openstreetmap.by/
xmpp:m...@komzpa.net mailto:m...@komzpa.net

___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-17 Thread Jochen Topf
On Fri, Feb 17, 2012 at 07:44:41AM +, Jukka Rahkonen wrote:
> Frederik Ramm  remote.org> writes:
> 
> > 
> > Hi,
> > 
> > On 02/16/2012 07:25 PM, Graham Jones wrote:
> > > This reminded me of a question I have been meaning to ask for quite a
> > > while - is a database generated by osm2pgsql with an hstore expected to
> > > perform similarly to one without?
> > 
> > I never ran one with hstore when I think of what this must mean for the 
> > database engine, and storage space, then I shudder and would not be 
> > surprised by the factor 5 you mentioned.
> 
> How about doing it with relations? Let the importer program create four 
> tables 
> osm_point
> osm_line
> osm_polygon
> osm_relation
> 
> These tables would each have two attributes: Geometry and osm_is. Geometries 
> in
> the osm_relation would be of type "geometry collection", that is, a collection
> of whatever, and it could hold for example the tranport route relation with 
> the
> route and bus stops and everything in one PostGIS geometry object.
> 
> Then there should be one or four tables for tags (everything in one table or
> split to suit the geometry tables). The three attributes would be osm_id, key,
> and value. Osm_id would be used as a foreing key for joining geometries and
> attributes. If osm_id can not be guaranteed to be unique then there should be
> point_id, line_id, polygon_id and relation_id added to suitable places.
> 
> Every attribute would be indexed. For Mapnik use where all that is needed is 
> to
> do simple SQL selects.  I guess that this database would be faster to query 
> than
> hstore even it also contains all the tags. It might be faster for Mapnik than
> the current tables with extra wide attribute schema because now attributes are
> not indexed at all (or are they?). SQL queries according to tags and values or
> even with a part of the tag and value strings would be easier to do than from
> hstore. People could enhance attributes for some special uses by converting
> max_speeds and other values which are actually measures from strings to 
> integers
> or doubles etc.
> 
> This option is so obvious that I believe that someone must have tried it
> already. It would be nice to hear about experiences.

Much too slow. Requiring joins on every query is not a good idea. And having
the right indexes is important. You can't just index everything and hope it
would do the right thing. (The most important index btw is the geometry index
not the attribute indexes, although having the right attribute indexes here
or there can help.)

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.remote.org/jochen/  +49-721-388298


___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-16 Thread Jukka Rahkonen
Frederik Ramm  remote.org> writes:

> 
> Hi,
> 
> On 02/16/2012 07:25 PM, Graham Jones wrote:
> > This reminded me of a question I have been meaning to ask for quite a
> > while - is a database generated by osm2pgsql with an hstore expected to
> > perform similarly to one without?
> 
> I never ran one with hstore when I think of what this must mean for the 
> database engine, and storage space, then I shudder and would not be 
> surprised by the factor 5 you mentioned.

How about doing it with relations? Let the importer program create four tables 
osm_point
osm_line
osm_polygon
osm_relation

These tables would each have two attributes: Geometry and osm_is. Geometries in
the osm_relation would be of type "geometry collection", that is, a collection
of whatever, and it could hold for example the tranport route relation with the
route and bus stops and everything in one PostGIS geometry object.

Then there should be one or four tables for tags (everything in one table or
split to suit the geometry tables). The three attributes would be osm_id, key,
and value. Osm_id would be used as a foreing key for joining geometries and
attributes. If osm_id can not be guaranteed to be unique then there should be
point_id, line_id, polygon_id and relation_id added to suitable places.

Every attribute would be indexed. For Mapnik use where all that is needed is to
do simple SQL selects.  I guess that this database would be faster to query than
hstore even it also contains all the tags. It might be faster for Mapnik than
the current tables with extra wide attribute schema because now attributes are
not indexed at all (or are they?). SQL queries according to tags and values or
even with a part of the tag and value strings would be easier to do than from
hstore. People could enhance attributes for some special uses by converting
max_speeds and other values which are actually measures from strings to integers
or doubles etc.

This option is so obvious that I believe that someone must have tried it
already. It would be nice to hear about experiences.

-Jukka Rahkonen-


___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-16 Thread Martin Koppenhoefer
2012/2/17 Jochen Topf :
> On Thu, Feb 16, 2012 at 06:25:30PM +, Graham Jones wrote:
> Generic key names can be confusing, especially when one OSM object is used for
> multiple things.


+1.
E.g. an object tagged barrier=fence, height=2, landuse=forest. In this
example you could also use 2 objects: a closed way for the fence and a
multipolygon for the area. Linear objects we can create with route
relations (instead of overlapping ways). We don't have currently a
relation for nodes and maybe we don't need this utterly (you could
place another node "nearby"), but it would solve some cases where you
want to state the topology precisely (e.g. 2 objects at the same
pole). This way we can have "things" (expressed by a relation) which
have their geometry only as an attribute, instead of a geometry that
_is_ something.


> Non-generic tags also make tools such as Taginfo more useful, because each tag
> stands on its own. They make it immediately clear if the tagging is 
> incomplete.
> If the "generator" tag got lost, how do we know what the "power_source" tag
> is supposed to mean?


+1 for taginfo, but I don't think we need the namespaces for the case
something gets "lost" (there is the history for this).

Cheers,
Martin

___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-16 Thread Jochen Topf
On Thu, Feb 16, 2012 at 06:25:30PM +, Graham Jones wrote:
>  Why create a key generator:power_source rather than just use
> power_source.  power_source is much more generic so you could re-cycle it
> for things like district heating, but generator:power_source is only ever
> going to be used for generating stations, and needs a new column in the
> database. .   I think I just prefer more generic, re-usable keys
> rather than trying to invent a new one for each situation

Generic key names can be confusing, especially when one OSM object is used for
multiple things. Say there is a way tagged as railway and at the same time this
way is part of an area tagged as a generating station. Does "power_source" mean
the type of generating station or the type of power used by the railway
(overhead vs. third rail vs. unelectrified)? I hope this example is
hypothetical, but people do strange things in OSM...

Non-generic tags also make tools such as Taginfo more useful, because each tag
stands on its own. They make it immediately clear if the tagging is incomplete.
If the "generator" tag got lost, how do we know what the "power_source" tag
is supposed to mean?

So on the one side you have generic tags like "type" and "class" and "id" which
are very confusing. You never know what they belong to. On the other side you
have rather complex tagging schemas like in your generator:... example. Thats
sometimes hard to use, too. We still have to figure out where the sweet spot is,
but I tend to prefer the more explicit keys.

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.remote.org/jochen/  +49-721-388298


___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-16 Thread Lennard

On 16-2-2012 19:25, Graham Jones wrote:


 Why create a key generator:power_source rather than just use
power_source.  power_source is much more generic so you could re-cycle
it for things like district heating, but generator:power_source is only
ever going to be used for generating stations, and needs a new column in
the database. .   I think I just prefer more generic,
re-usable keys rather than trying to invent a new one for each situation


This is a core problem with the public_transport=* scheme as well. This 
tag in and of itself is fine, but the whole additional shebang that is 
train=yes/no, bus=yes/no, subway=yes/no, etc etc is so dreadful (in the 
context of the rendering chain) that every time I read #2798 I feel the 
urge to run away screaming.


http://trac.openstreetmap.org/ticket/2798

--
Lennard

___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-16 Thread Graham Jones
On 16 February 2012 18:51, Frederik Ramm  wrote:

> On 02/16/2012 07:25 PM, Graham Jones wrote:
>
>> This reminded me of a question I have been meaning to ask for quite a
>> while - is a database generated by osm2pgsql with an hstore expected to
>> perform similarly to one without?
>>
>
> I never ran one with hstore when I think of what this must mean for the
> database engine, and storage space, then I shudder and would not be
> surprised by the factor 5 you mentioned.


Thanks - I don't feel so bad about giving up so easily now!

If you want to add a new key but avoid re-importing the full database, you
> can add the key to osm2pgsql's style file, somehow generate an .osm file
> that contains only the objects that have this tag, wrap this .osm file into
> a "..." instead of
> "..." and throw it at osm2pgsql in append mode.


That's a neat idea - will try that next time, thanks.


> People like keys that are understandable without context. Contrast this
> with a tag "width=2" - where you have no chance in knowing what it means
> without looking at the whole thing.


Yes - I think it is just a style difference - I am quite happy to interpret
it from context for the sake of having a smaller number of unique keys to
think about.

Cheers

Graham

-- 
Graham Jones
Hartlepool, UK.
___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-16 Thread Frederik Ramm

Hi,

On 02/16/2012 07:25 PM, Graham Jones wrote:

This reminded me of a question I have been meaning to ask for quite a
while - is a database generated by osm2pgsql with an hstore expected to
perform similarly to one without?


I never ran one with hstore when I think of what this must mean for the 
database engine, and storage space, then I shudder and would not be 
surprised by the factor 5 you mentioned.



I keep having to re-import my database to add new keys.


If you want to add a new key but avoid re-importing the full database, 
you can add the key to osm2pgsql's style file, somehow generate an .osm 
file that contains only the objects that have this tag, wrap this .osm 
file into a "..." instead of 
"..." and throw it at osm2pgsql in append mode.



power_source is much more generic so you could re-cycle
it for things like district heating, but generator:power_source is only
ever going to be used for generating stations, and needs a new column in
the database.


Yes, that's the plan I believe. People like keys that are understandable 
without context. Contrast this with a tag "width=2" - where you have no 
chance in knowing what it means without looking at the whole thing.


Bye
Frederik

--
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"

___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


[OSM-talk] osm2pgsql hstore (was: Wind turbines no longer rendered on mapnik layer)

2012-02-16 Thread Graham Jones
On 16 February 2012 16:25, Frederik Ramm  wrote:

> Hi,
>
>
> On 02/16/2012 03:05 PM, Kay Drangmeister wrote:
>
>> Isn't there a hstore in the rendering-db?
>>
>
> No.
>
> This reminded me of a question I have been meaning to ask for quite a
while - is a database generated by osm2pgsql with an hstore expected to
perform similarly to one without?

The only time I tried to use hstore, the performance of the whole database
was dreadful (at least a factor of 5 slower, but I didn't do any proper
benchmarking - I just gave up and re-created it without hstore and the
problem went away).It may have been that something went wrong with the
import that I did not notice, and it was not an hstore issue.

Using hstore would be really good given the fashion for generating new keys
with colons in them, but not with that reduction in performance, so I keep
having to re-import my database to add new keys.

 Why create a key generator:power_source rather than just use
power_source.  power_source is much more generic so you could re-cycle it
for things like district heating, but generator:power_source is only ever
going to be used for generating stations, and needs a new column in the
database. .   I think I just prefer more generic, re-usable keys
rather than trying to invent a new one for each situation

Regards


Graham.

-- 
Graham Jones
Hartlepool, UK.
___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk