I also had problems with gosmore needing too many disk seeks (or
too much memory) and missing the processor cache.

I solved these problems by doing less during the XML processing phase: just
categorize the data and write it into many temporary files. For example,
nodes with IDs in the range 0-10,000,000 go in the first file, nodes with IDs
in the range 10,000,000 to 20,000,000 in another, strings starting with AA
in one file, etc. Each of these files is under 200 MB and can be read
into memory completely and processed during subsequent phases.
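
To make the idea concrete, here is a rough sketch in Python (not gosmore's
actual code). The names partition and load_bucket, the nodes-NNNN.tmp file
names and the tab-separated record layout are all made up for illustration.
It assumes payloads contain no newlines and that an ID on a range boundary
(e.g. 10,000,000) goes into the higher bucket.

```python
import os

BUCKET_SIZE = 10_000_000  # IDs per temporary file, as in the ranges above


def bucket_path(out_dir, bucket):
    # Hypothetical naming scheme: nodes-0000.tmp, nodes-0001.tmp, ...
    return os.path.join(out_dir, f"nodes-{bucket:04d}.tmp")


def partition(records, out_dir, bucket_size=BUCKET_SIZE):
    """Phase 1: only categorize. Write each (node_id, payload) pair to the
    temporary file for its ID range; no lookups, no cross-referencing."""
    handles = {}
    try:
        for node_id, payload in records:
            bucket = node_id // bucket_size
            if bucket not in handles:
                handles[bucket] = open(
                    bucket_path(out_dir, bucket), "w", encoding="utf-8"
                )
            handles[bucket].write(f"{node_id}\t{payload}\n")
    finally:
        for handle in handles.values():
            handle.close()
    return sorted(handles)  # bucket numbers that received data


def load_bucket(out_dir, bucket):
    """Later phase: read one bucket file completely into memory so that
    all further work on it happens in RAM instead of via disk seeks."""
    table = {}
    with open(bucket_path(out_dir, bucket), encoding="utf-8") as f:
        for line in f:
            node_id, payload = line.rstrip("\n").split("\t", 1)
            table[int(node_id)] = payload
    return table
```

A later phase would then loop over the bucket numbers returned by partition,
call load_bucket on each one in turn, and throw the table away before moving
on to the next, so only one bucket is in memory at a time.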

So it adds another loop to an already highly nested piece of code.

On Mon, Aug 17, 2009 at 9:27 PM, Lars Francke <lars.fran...@gmail.com> wrote:

> > So please elaborate what combinations you need ;) Since the output is
> > csv /anything/ you imagine is there ;) and if you don't want to code,
> > just add what you want and run uniq -c on it.
>
> I'll need output in the following form:
> tag-key, number of changesets, nodes, relations and ways this key is
> used on, number of distinct values
> tag-value, the tag-key this value belongs to, number of changesets,
> nodes, relations and ways this value is used on
>
> Additionally the following information would be nice:
> key/key combinations and how often these two keys are used together
> on changesets, nodes, relations and ways
>
> To avoid hitting the database every time a tag is processed I cache
> the information in a Map in memory. Unfortunately I only have my
> (t)rusty FreeBSD box with a Dual Core Athlon and 3 GB of RAM. The
> cache-maps don't fit in memory so I'm looking for a solution to swap
> this to disk.
>
> Additional points for ideas on how to process the daily-diffs to keep
> the data current :)
> I need the old and new data for changesets, nodes, relations and ways
> to process deletions and changes in tags. I think I could write an
> Osmosis plugin for this...but that's a problem for another day :)
>
> Cheers,
> Lars
>
> _______________________________________________
> dev mailing list
> dev@openstreetmap.org
> http://lists.openstreetmap.org/listinfo/dev
>
