On Thu, Sep 16, 2010 at 10:23 AM, Scott Crosby <scro...@cs.rice.edu> wrote:
> > For now, for simplicity, I'm going to revert to the same metadata as
> > the XML format. Just a BBox and source field. I'll make them both
> > optional, making it easier to upgrade metadata features in the future.
> > When/if there is a consensus for additional metadata fields, support
> > for them can be added then.
> >
> > > I'll be releasing an rc2 at some point.
>
> This has been done. RC2 is in osmosis trunk. Changes are almost
> exclusively to the underlying osmbin.jar with no format
> incompatibilities. Changes include:
>

Sorry it's taken so long. Personal reasons have kept me away from this work.

I have committed the 1.0 version of the 'osmbin' jar to osmosis trunk. I have also increased the maximum size of a header or fileblock to 64 KB and 32 MB, respectively. (These limits are used to detect corrupt files.) I believe I have also fixed the two reported bugs: Frederick's bug with reporting the wrong error message, and the negative UserID bug.

The only things left are to rename 'osmbin' to 'osmpbf' to match the name of the format, put a copy of the source code into OSM's SVN server, and find a good home for the jar. (Any suggestions?) For now the jar lives in osmosis's SVN repository and the source is on github.

In osmosis, the important change is that the tasks have been renamed to match the *.pbf file extension and are now --write-pbf and --read-pbf. I am leaving behind the old task names --read-bin and --write-bin so that existing scripts will keep working, but please update your scripts.

I also made one small API change. The timestamps metadata field should have been an int64, not an int32. This does not affect format compatibility, but it may require minor changes to code using the protobuf definitions.

I am not sure when I will have time to update the wiki with the documentation of the pbf tasks. For now, I am attaching a description of all of the options.
Scott

///////////////
// --write-pbf Arguments:

file=<filename>
    Currently '-' representing stdout is not supported.

compress=deflate (default)
    Use deflate compression on each block.

compress=none
    Disable compression. About twice as fast to write and twice the size.

batchlimit=8000
    Block size used when compressing. This is a reasonable default.
    Batchlimits that are too big may cause files to exceed the defined
    filesize limits.

granularity=100
    The granularity or precision used to store coordinates. The default
    of 100 nanodegrees is the highest precision used by OSM,
    corresponding to about 1.1cm at the equator. In the current osmosis
    implementation, the granularity must be a multiple of 100. If map
    data is going to be exported to software that does not need the full
    precision, increasing the granularity to 10000 nanodegrees can save
    about 10% of the file size, while still having 1.1m precision.

omitmetadata=false (default)
omitmetadata=true
    Omit non-geographic metadata on OSM entities. This includes version
    number and timestamp of the last edit to the entity as well as the
    user name and id of the last modifier. Omitting this metadata can
    save 15% filesize when exporting to software that does not need this
    data.

usedense=true (default)
    Nodes can be represented in a regular format or a dense format. The
    dense format is about 30% smaller, but more complex. To make it
    easier to interoperate with (future) software that chooses to not
    implement the dense format, the dense format may be disabled.

// --read-pbf Arguments:

file=<filename>
    Currently '-' representing stdin is not supported.

// Usage tips:

The default options for reading and writing are the safe options and work efficiently and quickly.

Buffering can improve performance. The binary format processes data in batches: entities are queued until a limit is reached, then that batch is serialized and compressed. This serialization can run concurrently with other osmosis processing.
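
The batching behaviour described above can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the osmbin implementation: the function names `write_batches` and `_encode` are hypothetical, entities are plain strings, and the "serialization" is a newline join rather than the protobuf encoding the real format uses. Only the idea (queue up to a batch limit, then serialize and optionally deflate each batch) is taken from the description.

```python
import zlib


def _encode(batch, compress):
    # Stand-in serialization (assumption): the real format encodes each
    # batch as a protobuf message, not as newline-joined text.
    raw = "\n".join(batch).encode("utf-8")
    # compress=True loosely mirrors compress=deflate, False mirrors compress=none.
    return zlib.compress(raw) if compress else raw


def write_batches(entities, batch_limit=8000, compress=True):
    """Yield one encoded block per batch of at most batch_limit entities."""
    batch = []
    for entity in entities:
        batch.append(entity)
        if len(batch) >= batch_limit:
            yield _encode(batch, compress)
            batch = []
    if batch:
        # Flush the final, partially filled batch.
        yield _encode(batch, compress)


# 20000 toy entities with the default batch limit give three blocks.
blocks = list(write_batches(f"node {i}" for i in range(20000)))
```

Because each block is produced independently of the next, a loop like this is the kind of work that can overlap with the rest of a pipeline when a buffer sits next to it.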
With more than one core, writing throughput can be increased by about 60% by placing a buffer in the processing pipeline just before writing. Similarly, a buffer placed in the pipeline immediately after parsing can improve read concurrency. E.g.:

    osmosis --read-pbf file=XXX --b bufferCapacity=12000 ....

OR

    osmosis .... --b bufferCapacity=12000 --write-pbf file=XXX ...

When generating data for export to other applications, I suggest considering omitmetadata=true and granularity=10000. Each option reduces the size by about 1 GB. With both options, a full planet (in 2010), including all nodes, ways, and tags, fits in 5.5 GB.
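
The precision figures quoted for the granularity option (about 1.1cm at 100 nanodegrees, about 1.1m at 10000) can be checked with some quick arithmetic. The Python sketch below is illustrative only: the helper names are hypothetical, the Earth radius is the WGS84 equatorial value, and `quantize` ignores the coordinate offsets the real format also stores.

```python
import math

EARTH_EQUATORIAL_RADIUS_M = 6_378_137.0  # WGS84 semi-major axis


def granularity_to_metres(granularity_nanodeg):
    """Length along the equator spanned by one granularity step."""
    degrees = granularity_nanodeg * 1e-9
    return math.radians(degrees) * EARTH_EQUATORIAL_RADIUS_M


def quantize(coord_deg, granularity_nanodeg):
    """Round a coordinate to the nearest multiple of the granularity."""
    units = round(coord_deg * 1e9 / granularity_nanodeg)
    return units * granularity_nanodeg / 1e9


for g in (100, 10000):
    print(f"granularity={g}: {granularity_to_metres(g):.4f} m per step")
```

A coarser granularity means each stored coordinate is a smaller integer, which is where the file-size saving comes from.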
_______________________________________________
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev