Re: [OSM-dev] Question running osmosis (node-key-value and way-key-value at the same time)
Christoph Eckert wrote:

> Hi,
>
> Well, I wanted to try to create a kind of worldwide base map for Navit to get a clue what file size this would trigger.
>
> OK, I meanwhile got a bit further thanks to the various hints. I use the following command:
>
> ./osmosis-0.31/bin/osmosis --read-xml file=planet.osm --way-key-value keyValueList=highway.motorway,highway.motorway_link,highway.motorway_junction,highway.trunk,highway.trunk_link,highway.primary,highway.primary_link,route.ferry,waterway.river,railway.rail,railway.narrow_gauge,landuse.forest,landuse.wood,natural.wood,natural.water,boundary.administrative,boundary.civil --used-node --write-xml file=basemap.osm
>
> This seems to work well on smaller files (tried 24.8MB), but I tried it twice on a true planet file, and osmosis will crash after a while (link to pastebin log is attached). It creates the temporary files in /temp, but it seems to fail as soon as it tries to stuff the temporary files' content into the destination file. After the crash, the latter only contains the XML header up to the maximum possible bounding box, but nothing else. The temporary files are removed.
>
> The failure seems to depend on the usage of --used-node. So my question is whether it won't work because of technical limitations on such huge files (data must fit into RAM), or whether it is intended to work and I can use some workaround.

As you've guessed, you are running into a memory limitation. A 32-bit java VM can use up to about 2GB if necessary when specified with the -Xmx option, but most times that isn't necessary. The complete set of nodes does need to be in RAM, however you can reduce the amount of RAM required by changing the method used to hold them. Try adding the idTrackerType=BitSet option to the --used-node task. That will use approximately 1/32 of the RAM that the default IdList implementation uses.

Both methods have their advantages and disadvantages, with BitSet being more memory efficient on very large data sets, and IdList being more memory efficient on smaller data sets with sparsely allocated ids. I hope the BitSet option solves the problem.

Brett

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev
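To make the trade-off between the two id trackers concrete, here is a rough back-of-the-envelope sketch in Python. It is not Osmosis code. The function names and the cost model are assumptions made for illustration: a list-style tracker is assumed to spend 4 bytes per stored id, and a bit-set tracker 1 bit per possible id, whether used or not. Under those assumptions the 1/32 figure appears when nearly every id in the range is in use.

```python
def id_list_bytes(used_ids: int, bytes_per_id: int = 4) -> int:
    """Assumed cost of a list-style tracker: one fixed-size slot per stored id."""
    return used_ids * bytes_per_id


def bit_set_bytes(id_range: int) -> int:
    """Assumed cost of a bit-set tracker: one bit per possible id, used or not."""
    return (id_range + 7) // 8


def cheaper_tracker(used_ids: int, id_range: int) -> str:
    """Name the tracker with the smaller estimated footprint."""
    if bit_set_bytes(id_range) < id_list_bytes(used_ids):
        return "BitSet"
    return "IdList"


# Hypothetical figures: a planet-sized extract where most ids are in use ...
dense = cheaper_tracker(used_ids=400_000_000, id_range=500_000_000)
# ... versus a small extract whose few ids are scattered over the same range.
sparse = cheaper_tracker(used_ids=100_000, id_range=500_000_000)
print(dense, sparse)
```

The numbers are invented. The point is only that the bit set's cost follows the size of the id range, while the list's cost follows the number of ids actually kept.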
Re: [OSM-dev] Question running osmosis (node-key-value and way-key-value at the same time)
On Wed, Jun 24, 2009 at 1:27 PM, Christoph Eckert c...@christeck.de wrote:

> Hi,
>
>> You may trigger a deadlock in this situation ... I've been waiting for somebody to try this out for a long time :-)
>
> he he, bull's eye, eh ;-) ?
>
> Well, I wanted to try to create a kind of worldwide base map for Navit to get a clue what file size this would trigger.
>
> Right now, I'm only extracting ways. I will try to extract nodes (cities and the like) in a second run. I'm curious however if it will be possible to merge both files, especially if possible duplicates of nodes and ways occur.
>
> I'm not that confident that the desired basemap will be of a reasonable size; the resulting ways file is already 40GB, and the job is not completed yet.
>
> Anyway, you helped me a lot. Thanks, guys!
>
> Cheers, ce

Yes, you can easily merge files. If there are duplicates, it will take the node/way/relation with the newest version by default (you can change that to choose the newest timestamp or always from one source). The files must be sorted for that to work. The command line would be something like:

osmosis --read-xml file1.osm --sort --read-xml file2.osm --sort --merge --write-xml merged.osm

Karl
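The merge rule described above (sorted inputs, newest version wins on a duplicate) can be sketched in Python as follows. This is an illustration and not the Osmosis implementation. The `Entity` tuple, its `source` field, the sort order (type first, then id) and the tie-break in favour of the first input are all assumptions made for the example.

```python
from typing import Iterable, Iterator, NamedTuple


class Entity(NamedTuple):
    kind: str     # "node", "way" or "relation"
    id: int
    version: int
    source: str   # hypothetical label, only used to show which input won


KIND_ORDER = {"node": 0, "way": 1, "relation": 2}


def sort_key(e: Entity) -> tuple:
    """Assumed sort order: by entity type first, then by id."""
    return (KIND_ORDER[e.kind], e.id)


def merge_newest(a: Iterable[Entity], b: Iterable[Entity]) -> Iterator[Entity]:
    """Merge two inputs already sorted by sort_key; a duplicate keeps the higher version."""
    it_a, it_b = iter(a), iter(b)
    ea, eb = next(it_a, None), next(it_b, None)
    while ea is not None or eb is not None:
        if eb is None or (ea is not None and sort_key(ea) < sort_key(eb)):
            yield ea
            ea = next(it_a, None)
        elif ea is None or sort_key(eb) < sort_key(ea):
            yield eb
            eb = next(it_b, None)
        else:
            # Same type and id on both inputs: keep the newer version
            # (ties go to the first input, an arbitrary choice for this sketch).
            yield ea if ea.version >= eb.version else eb
            ea, eb = next(it_a, None), next(it_b, None)


file1 = [Entity("node", 1, 1, "file1"), Entity("node", 2, 3, "file1"),
         Entity("way", 10, 1, "file1")]
file2 = [Entity("node", 2, 2, "file2"), Entity("way", 10, 2, "file2"),
         Entity("way", 11, 1, "file2")]

merged = list(merge_newest(file1, file2))
for entity in merged:
    print(entity)
```

The merge only ever looks at the current head of each input, which is why both files have to be sorted in the same order beforehand.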
Re: [OSM-dev] Question running osmosis (node-key-value and way-key-value at the same time)
Hi Brett,

> That type of error is usually because you're running java 1.5 or older. From your previous emails you seem to be running java 1.6 which should be okay. Can you double check to make sure you're still using 1.6? If you are then I'm not sure what's going on ...

there's a global Java 1.5 installation, and a local 1.6 installation in ~/bin/. I adjusted the osmosis shell script to use the latter one and yeah, it's up and running! Thanks a bunch for the help.

Of course it immediately triggers the next question :) . I try to extract some data from an osm file which shall only contain the base net of roads, railways and cities. E.g. I do:

./osmosis-0.31/bin/osmosis --read-xml file=planet.bz2 --node-key-value keyValueList=place.city --way-key-value keyValueList=highway.motorway,highway.motorway_link,highway.motorway_junction,highway.trunk,highway.trunk_link --write-xml file=basemap.osm

This only writes nodes, no ways at all. Removing the nodes, it will write ways:

./osmosis-0.31/bin/osmosis --read-xml file=planet.bz2 --way-key-value keyValueList=highway.motorway,highway.motorway_link,highway.motorway_junction,highway.trunk,highway.trunk_link --write-xml file=basemap.osm

So I thought I need to use the pipes as found in the documentation. So I tried:

./osmosis-0.31/bin/osmosis --read-xml file=planet.bz2 outPipe.0=readpipe --node-key-value keyValueList=place.city inPipe.0=readpipe outPipe.0=outpipe --way-key-value keyValueList=highway.motorway,highway.motorway_link,highway.motorway_junction inPipe.0=readpipe outPipe.0=outpipe --write-xml file=basemap.osm inPipe.0=outpipe

However, osmosis does not like my syntax. I'm obviously using the pipes in a wrong or at least unsupported :) manner.

Any hint is much appreciated. Do I need the pipes in this case? If so, what should I change? Or an alternative syntax?

Thanks best regards, ce
Re: [OSM-dev] Question running osmosis (node-key-value and way-key-value at the same time)
On Tue, Jun 23, 2009 at 3:56 PM, Christoph Eckert c...@christeck.de wrote:

> Hi Brett,
>
>> That type of error is usually because you're running java 1.5 or older. From your previous emails you seem to be running java 1.6 which should be okay. Can you double check to make sure you're still using 1.6? If you are then I'm not sure what's going on ...
>
> there's a global Java 1.5 installation, and a local 1.6 installation in ~/bin/. I adjusted the osmosis shell script to use the latter one and yeah, it's up and running! Thanks a bunch for the help.
>
> Of course it immediately triggers the next question :) . I try to extract some data from an osm file which shall only contain the base net of roads, railways and cities. E.g. I do:
>
> ./osmosis-0.31/bin/osmosis --read-xml file=planet.bz2 --node-key-value keyValueList=place.city --way-key-value keyValueList=highway.motorway,highway.motorway_link,highway.motorway_junction,highway.trunk,highway.trunk_link --write-xml file=basemap.osm
>
> This only writes nodes, no ways at all. Removing the nodes, it will write ways:
>
> ./osmosis-0.31/bin/osmosis --read-xml file=planet.bz2 --way-key-value keyValueList=highway.motorway,highway.motorway_link,highway.motorway_junction,highway.trunk,highway.trunk_link --write-xml file=basemap.osm
>
> So I thought I need to use the pipes as found in the documentation. So I tried:
>
> ./osmosis-0.31/bin/osmosis --read-xml file=planet.bz2 outPipe.0=readpipe --node-key-value keyValueList=place.city inPipe.0=readpipe outPipe.0=outpipe --way-key-value keyValueList=highway.motorway,highway.motorway_link,highway.motorway_junction inPipe.0=readpipe outPipe.0=outpipe --write-xml file=basemap.osm inPipe.0=outpipe
>
> However, osmosis does not like my syntax. I'm obviously using the pipes in a wrong or at least unsupported :) manner.
>
> Any hint is much appreciated. Do I need the pipes in this case? If so, what should I change? Or an alternative syntax?
>
> Thanks best regards, ce

What's happening there is that the node-key-value and way-key-value are ANDed together (which would leave you with only ways which match your tags and are composed of nodes tagged place=city), and you want an OR instead. You were sort of on the right track with the pipes, but what you need to do is use the tee function and apply the node-key-value filter to one leg of the tee, and apply the way-key-value filter to the other leg of the tee, then use the merge function to join the results. It would look something like this:

./osmosis-0.31/bin/osmosis --read-xml file=planet.bz2 --tee outputCount=2 outPipe.0=nodes outPipe.1=ways --node-key-value keyValueList=place.city inPipe.0=nodes --way-key-value keyValueList=highway.motorway,highway.motorway_link,highway.motorway_junction,highway.trunk,highway.trunk_link inPipe.0=ways --merge --write-xml file=basemap.osm

I'm not sure if that will work exactly as written. You may need to add outPipe arguments to the node-key-value and way-key-value filters and then reference them as inPipe.0 and inPipe.1 arguments to the merge task.

Karl
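The difference between chaining the two filters and running them on separate legs of a tee can be sketched with a toy model in Python. Nothing here is Osmosis code. `node_filter`, `way_filter` and `merge` are simplified stand-ins, and the way each one treats entities of the other type is an assumption chosen only to reproduce the behaviour reported in this thread (nodes written, no ways).

```python
KIND_ORDER = {"node": 0, "way": 1, "relation": 2}


def node_filter(entities):
    """Stand-in for --node-key-value: assumed to keep only nodes tagged place=city."""
    return [e for e in entities
            if e["type"] == "node" and e["tags"].get("place") == "city"]


def way_filter(entities):
    """Stand-in for --way-key-value: assumed to drop ways that are not motorways."""
    return [e for e in entities
            if e["type"] != "way" or e["tags"].get("highway") == "motorway"]


def merge(left, right):
    """Union of two legs, de-duplicated by (type, id), sorted by type and then id."""
    combined = {(e["type"], e["id"]): e for e in left + right}
    ordered = sorted(combined, key=lambda k: (KIND_ORDER[k[0]], k[1]))
    return [combined[k] for k in ordered]


data = [
    {"type": "node", "id": 1, "tags": {"place": "city"}},
    {"type": "way", "id": 10, "tags": {"highway": "motorway"}},
    {"type": "way", "id": 11, "tags": {"highway": "residential"}},
]

# Chained: an entity must survive both filters in sequence (AND).
chained = way_filter(node_filter(data))

# Tee + merge: each filter sees the full input and the results are joined (OR).
teed = merge(node_filter(data), way_filter(data))

print([(e["type"], e["id"]) for e in chained])
print([(e["type"], e["id"]) for e in teed])
```

The toy data deliberately contains no untagged nodes. How the real tasks treat entities of the other type is not spelled out in this thread, so the sketch should not be read as a statement about their exact output.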
Re: [OSM-dev] Question running osmosis (node-key-value and way-key-value at the same time)
On Tue, Jun 23, 2009 at 6:11 PM, Brett Henderson br...@bretth.com wrote:

> Karl Newman wrote:
>
>> What's happening there is that the node-key-value and way-key-value are ANDed together (which would leave you with only ways which match your tags and are composed of nodes tagged place=city), and you want an OR instead. You were sort of on the right track with the pipes, but what you need to do is use the tee function and apply the node-key-value filter to one leg of the tee, and apply the way-key-value filter to the other leg of the tee, then use the merge function to join the results. It would look something like this:
>>
>> ./osmosis-0.31/bin/osmosis --read-xml file=planet.bz2 --tee outputCount=2 outPipe.0=nodes outPipe.1=ways --node-key-value keyValueList=place.city inPipe.0=nodes --way-key-value keyValueList=highway.motorway,highway.motorway_link,highway.motorway_junction,highway.trunk,highway.trunk_link inPipe.0=ways --merge --write-xml file=basemap.osm
>>
>> I'm not sure if that will work exactly as written. You may need to add outPipe arguments to the node-key-value and way-key-value filters and then reference them as inPipe.0 and inPipe.1 arguments to the merge task.
>
> You may trigger a deadlock in this situation ... I've been waiting for somebody to try this out for a long time :-)
>
> While it's possible to construct a pipeline that tees a single dataset into multiple streams before merging them back together again, it is problematic from a thread synchronisation point of view because the same input thread is feeding two inputs of another thread. Using a --buffer task within both paths of the branch may help because it de-couples the threads somewhat with a buffer.
>
> The --read-xml task creates a thread which passes data into the --tee task. The --tee task doesn't create a thread, it just uses the existing thread to pass incoming data to all consumers. The --node-key-value and --way-key-value also use the existing thread to write to their destination which in both cases is the --merge task. The --merge task creates a new thread which reads the incoming data from both of its inputs, but both inputs are coming from a single thread (i.e. the original --read-xml thread). The --merge thread may read from one input, then start waiting for a specific value on the other input and never receive it.
>
> But if it works let me know. I'm curious :-)
>
> Brett

Hmm... I didn't look at the code too closely. I thought the tee created separate threads. I'm trying to see what might cause a deadlock--it looks like it would happen in DataPostBox if anywhere--but it's not obvious what might trigger it. I guess what could happen is if the merge task is trying to get entities from both pipelines to compare them, and there isn't anything available in one of the pipelines, that might deadlock it. I guess someone will have to try it and see!

Karl
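The suspected deadlock can be illustrated with a small single-threaded simulation in Python. It is a toy model and not a reproduction of Osmosis or its DataPostBox. It assumes bounded buffers in front of both merge inputs, a sorted input in which every node precedes the first way, and a merge that only makes progress when it can compare the heads of both inputs. A failed `put_nowait` stands in for the point where the real reader thread would block.

```python
import queue


def feed(stream, capacity):
    """Push a sorted stream into two bounded legs from a single 'reader'.

    Returns the entity the reader gets stuck on, or None if everything fits.
    """
    node_leg = queue.Queue(maxsize=capacity)  # merge input 0 (hypothetical)
    way_leg = queue.Queue(maxsize=capacity)   # merge input 1 (hypothetical)

    def merge_step():
        # The modelled merge compares the heads of both inputs,
        # so it cannot move while either input is empty.
        if node_leg.empty() or way_leg.empty():
            return False
        node_leg.get_nowait()  # nodes sort before ways, so consume the node
        return True

    for entity in stream:
        target = node_leg if entity[0] == "node" else way_leg
        while True:
            try:
                target.put_nowait(entity)
                break
            except queue.Full:
                # The reader would block here. If the merge cannot move
                # either, nobody is left to make progress.
                if not merge_step():
                    return entity
    return None


# Sorted input: all nodes first, then the ways.
stream = [("node", i) for i in range(1, 6)] + [("way", 10)]

stuck_small = feed(stream, capacity=2)  # small buffers
stuck_large = feed(stream, capacity=8)  # buffers larger than the node run
print(stuck_small, stuck_large)
```

In this model the reader stalls on the third node when the buffers hold two entities, because the way leg is still empty and the reader is the only thread that could fill it. With buffers larger than the run of nodes the stall disappears, which is at least consistent with the suggestion that a --buffer task may help. Whether the real tasks behave this way is exactly the open question in the thread.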