Hi David,

On Sat, Aug 7, 2010 at 10:10 PM, David Carmean <d...@halibut.com> wrote:

>
> Brett, et al:
>
> Please consider leaving the current "simple" schema intact (with
> the addition of the index clustering) and instead create a third
> schema for the HSTORE version.
>
> Losing the current schema would break at least two of my use cases:
>
> 1: I use postGIS as back-end storage for GIS client software, and
> use views and tables derived from queries against the OSM simple
> schema to create various layers of OSM data in the GIS client,
> e.g. roads, footpaths/trails/bike routes, hydrography, landuse,
> structures (buildings), etc.  This is such an obvious technique
> that I'd be very surprised not to find a lot of others doing the
> same.
>

It's great to hear you're finding this schema useful.  I don't hear a lot
from people using it, so getting feedback is always appreciated.

Would the existing 0.36 release of Osmosis be enough for this?  Or would
it be possible to change your queries to use the hstore data instead?

I'm hoping to avoid maintaining lots of different schemas if possible.


>
> 2: I am beginning a project to parallelize OSM data processing
> with Hadoop, and the postgreSQL copy-format output is perfect
> for loading into HDFS. (If this goes well, I'd want to discuss
> ideas for adapting Osmosis to talk to Hadoop, eventually.)
>
> I also think that the simple schema lends itself better to
> research uses, especially when the input is a country/state
> extract or smaller.
>
> That said, I can also see cases where it would be useful to have
> the option to create a "simple++" schema, with the separate
> tags table *and* the HSTORE columns as well.
>

I did think about adding the hstore columns as an optional add-on to the
existing schema.  I already have a couple of cases of this (i.e. the action
table, the way linestring column and the way bbox column).  But this change
is fairly invasive, and given that I have to re-work a large amount of
existing code, I'd end up having to implement everything twice.

If the only bit you need here is the COPY format, you may be better off with
a new task that creates a file in exactly the format you want.
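
To give a rough idea of how little such a task would need to do, here is a
minimal Python sketch of writing rows in PostgreSQL's COPY text format
(tab-delimited, \N for NULL, backslash escapes).  This is not Osmosis code;
the function names and the (way_id, k, v) example row are made up for
illustration.

```python
def copy_escape(value):
    """Render one field in PostgreSQL COPY text format."""
    if value is None:
        return "\\N"  # COPY's default NULL marker
    text = str(value)
    # Escape the backslash first so later escapes are not doubled.
    return (text.replace("\\", "\\\\")
                .replace("\t", "\\t")
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def copy_line(fields):
    """Join escaped fields with tabs and terminate the row with a newline."""
    return "\t".join(copy_escape(f) for f in fields) + "\n"


# Hypothetical tag row: (way_id, k, v)
row = copy_line([42, "name", "Main\tStreet"])
```

Each call produces one line of the file, so the output can be split on
newlines when it is loaded elsewhere.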


>
> The CLUSTER operation is indeed a big performance booster.  Also,
> creating the optional ways.bbox column is at least two orders of
> magnitude faster when performed against an indexed ways.linestring
> column rather than as-shipped, which uses the nodes geometry before
> that has been indexed.
>

The general intent is that you only create one of the linestring or bbox
columns.  If you already have a linestring column, is there any point in
creating the bbox column as well?
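
The reason I ask is that the bbox carries no information beyond the
coordinates already in the linestring.  As a rough Python sketch (not how
Osmosis or PostGIS implement it; the function name and the (lon, lat) tuple
layout are assumptions for illustration):

```python
def bbox_of(points):
    """Return (min_lon, min_lat, max_lon, max_lat) for (lon, lat) pairs."""
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    return (min(lons), min(lats), max(lons), max(lats))


# Hypothetical way made of three nodes
way_bbox = bbox_of([(1.0, 5.0), (3.0, 2.0), (2.0, 4.0)])
```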

As for using the node geometry before indexing, you shouldn't have to do
that.  The fastest way to create the linestring or bbox columns will be to
use the enableLinestringBuilder or enableBboxBuilder options on the various
write-pgsql tasks.

Cheers,
Brett
_______________________________________________
osmosis-dev mailing list
osmosis-dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/osmosis-dev
