[OSM-dev] Unsigned 32 bit node numbers in applications

2016-02-08 Thread Andrew Hain
Just a quick reminder: is there anyone still using unsigned 32 bit 
node numbers? The database is approaching node 4 × 10⁹.
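
For anyone unsure what actually breaks: once node IDs pass 2^32 - 1
(4,294,967,295), any unsigned 32-bit field silently wraps around. A minimal
sketch of the failure mode (Python, masking to simulate a uint32 field):

    UINT32_MAX = 2**32 - 1         # 4,294,967,295

    node_id = 4300000000           # a node ID just past the uint32 range
    stored = node_id & UINT32_MAX  # what an unsigned 32-bit field keeps

    print(stored)                  # 5032704 -- silently refers to the wrong node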

--
Andrew




[josm-dev] Latest josm seems to be crashing Xorg

2016-02-08 Thread ael
I am not sure what is happening here, but I downloaded the latest
josm version a few minutes ago, and attempting to run it crashes
Xorg with "Bus error at address 0x0".

I am on Debian testing, and using

java version "1.7.0_91"
OpenJDK Runtime Environment (IcedTea 2.6.3) (7u91-2.6.3-1)
OpenJDK 64-Bit Server VM (build 24.91-b01, mixed mode)

Running my usual "java -jar ~/mapping/josm-latest.jar &"
crashes X.

Here is the extract from Xorg.0.log:
[  3637.585] (EE) Backtrace:
[  3637.585] (EE) 0: /usr/lib/xorg/Xorg (xorg_backtrace+0x4e) [0x5642f568249e]
[  3637.585] (EE) 1: /usr/lib/xorg/Xorg (0x5642f54cd000+0x1b9819) [0x5642f5686819]
[  3637.585] (EE) 2: /lib/x86_64-linux-gnu/libc.so.6 (0x7f96cf8cd000+0x33590) [0x7f96cf900590]
[  3637.585] (EE) 3: /lib/x86_64-linux-gnu/libc.so.6 (0x7f96cf8cd000+0x785e8) [0x7f96cf9455e8]
[  3637.585] (EE) 4: /lib/x86_64-linux-gnu/libc.so.6 (__libc_calloc+0xd5) [0x7f96cf948395]
[  3637.585] (EE) 5: /usr/lib/xorg/Xorg (0x5642f54cd000+0x508b2) [0x5642f551d8b2]
[  3637.585] (EE) 6: /usr/lib/xorg/Xorg (0x5642f54cd000+0x53a3f) [0x5642f5520a3f]
[  3637.585] (EE) 7: /usr/lib/xorg/Xorg (0x5642f54cd000+0x57a53) [0x5642f5524a53]
[  3637.585] (EE) 8: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xf0) [0x7f96cf8ed870]
[  3637.585] (EE) 9: /usr/lib/xorg/Xorg (_start+0x29) [0x5642f550ede9]
[  3637.585] (EE) 
[  3637.585] (EE) Bus error at address 0x0
[  3637.585] (EE) 
Fatal server error:
[  3637.585] (EE) Caught signal 7 (Bus error). Server aborting
[  3637.585] (EE) 

I might try to open a ticket, but the last time I tried that, it seemed
to vanish :-(

ael




Re: [OSM-dev] changes in coastline are not rendered

2016-02-08 Thread Gmail
That's a particular itch, but hey, among the thousands of mappers active each 
month, there are a lot of itches.
Why not render these lakes anyway, or freeze this object?
Yves

On 8 February 2016 20:20:39 UTC+01:00, Christoph Hormann wrote:
>On Monday 08 February 2016, Gerd Petermann wrote:
>>
>> It was changed more than 3 days ago and the change is not rendered.
>>
>> I understand that coastline ways are special, but that seems too long
>> for me.
>>
>> Any hints why this takes so long?
>
>Coastline processing on openstreetmapdata.com has been stuck for more than a 
>month now, mostly because of a lot of back and forth with the tagging of 
>large lakes as coastline.
>
>For information: to avoid major disruptions of map rendering due to data 
>errors, the coastline is not updated if there are larger changes in the 
>geometry compared to the last time it was successfully processed.  Any 
>addition or removal of a lake with a coastline tag will require manual 
>intervention, and we do not have the time to manually check the data 
>every day because some mapper somewhere wants to scratch an itch and 
>decides to tag a lake outline with natural=coastline.
>
>-- 
>Christoph Hormann
>http://www.imagico.de/
>

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.


Re: [OSM-dev] changes in coastline are not rendered

2016-02-08 Thread Christoph Hormann
On Monday 08 February 2016, Gerd Petermann wrote:
>
> It was changed more than 3 days ago and the change is not rendered.
>
> I understand that coastline ways are special, but that seems too long
> for me.
>
> Any hints why this takes so long?

Coastline processing on openstreetmapdata.com has been stuck for more than a 
month now, mostly because of a lot of back and forth with the tagging of large 
lakes as coastline.

For information: to avoid major disruptions of map rendering due to data 
errors, the coastline is not updated if there are larger changes in the 
geometry compared to the last time it was successfully processed.  Any 
addition or removal of a lake with a coastline tag will require manual 
intervention, and we do not have the time to manually check the data 
every day because some mapper somewhere wants to scratch an itch and 
decides to tag a lake outline with natural=coastline.
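
For readers curious what such a guard can look like, here is a minimal sketch 
(the threshold, areas and function are invented for illustration -- this is 
not the actual openstreetmapdata.com tooling):

    # Refuse to publish a new coastline whose total area differs too much
    # from the last successfully processed run.
    MAX_RELATIVE_CHANGE = 0.001    # invented threshold

    def safe_to_publish(old_area_km2, new_area_km2):
        change = abs(new_area_km2 - old_area_km2) / old_area_km2
        return change <= MAX_RELATIVE_CHANGE

    last_good_area = 361900000.0             # rough ocean area in km^2
    candidate_area = last_good_area * 1.01   # e.g. a large lake tagged as coastline

    if not safe_to_publish(last_good_area, candidate_area):
        print("coastline change too large -- holding update for manual review")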

-- 
Christoph Hormann
http://www.imagico.de/



Re: [OSM-dev] changes in coastline are not rendered

2016-02-08 Thread Jochen Topf
On Mon, Feb 08, 2016 at 05:51:52 +, Gerd Petermann wrote:
> please see this way:
> 
> http://www.openstreetmap.org/way/396163254
> 
> 
> It was changed more than 3 days ago and the change is not rendered.
> 
> I understand that coastline ways are special, but that seems too long for me.
> 
> Any hints why this takes so long?

Because the update process hasn't produced new files for over a month. You can
see this here: http://openstreetmapdata.com/data/land-polygons (look for "Last
update" in the Download section).

This happens when the changes to the coastlines are larger than some cut-off, 
which can mean that the coastlines are so badly broken that an update could
lead to large-scale problems. It can also mean that the update process
isn't doing its job. Mostly it means that I haven't had the time to look into
it recently.

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.jochentopf.com/  +49-351-31778688



[OSM-dev] changes in coastline are not rendered

2016-02-08 Thread Gerd Petermann
Hi,


please see this way:

http://www.openstreetmap.org/way/396163254


It was changed more than 3 days ago and the change is not rendered.

I understand that coastline ways are special, but that seems too long for me.

Any hints why this takes so long?


ciao,

Gerd


Re: [OSM-dev] Simpler binary OSM formats

2016-02-08 Thread Colin Smale
On 2016-02-08 12:45, Andrew Byrd wrote:

> Hello Benjamin, 
> 
> I was aware of Cap'n Proto, but thanks for pointing out FlatBuffer. I've 
> studied this system and considered how it might be useful for OSM data 
> exchange. Here are my impressions: 
> 
> 1. Each FlatBuffer message does indirection through a table "to allow for 
> format evolution and optional fields". The basic OSM data model is quite 
> stable at this point and to my knowledge evolves only through the 
> introduction of different tag strings. Unlike existing formats, I'd like vex 
> to be extremely simple and non-extensible so developers can easily and 
> completely support reading or writing it. I would hesitate to devote space in 
> every serialized entity to unused extensibility features.

There are discussions going on which may change the underlying data
metamodel; I am thinking of support for polygons/areas as primitive
types and for multi-valued keys. Although the model has been stable since
API 0.6, it would not be prudent to preclude changes in the future. 

//colin


Re: [OSM-dev] Simpler binary OSM formats

2016-02-08 Thread Andrew Byrd
Hello Benjamin,

I was aware of Cap’n Proto, but thanks for pointing out FlatBuffer. I’ve 
studied this system and considered how it might be useful for OSM data 
exchange. Here are my impressions:

1. Each FlatBuffer message does indirection through a table “to allow for 
format evolution and optional fields”. The basic OSM data model is quite stable 
at this point and to my knowledge evolves only through the introduction of 
different tag strings. Unlike existing formats, I’d like vex to be extremely 
simple and non-extensible so developers can easily and completely support 
reading or writing it. I would hesitate to devote space in every serialized 
entity to unused extensibility features. 

2. FlatBuffer messages use fixed-width integers throughout, for both field 
values and vtable entries. OSM entity IDs are now 64 bits wide. Vtable entries 
are 32 bits wide and are used to refer to all strings and vectors, which are 
“never stored in-line”. The buffer will contain a very large proportion of 
zeros and repeated or unnecessary bytes (redundant fragments of coordinates and 
successive OSM entity references, offsets to strings and vectors). To get even 
remotely close to the file sizes we are accustomed to, the FlatBuffers would 
need to be inside compressed blocks. To achieve anything like comparable file 
sizes, we’d want to delta-code most numeric fields and probably apply 
variable-byte coding, i.e. pre-filter the data to assist the general purpose 
compression in its job. However, FlatBuffer inherently does not support 
variable-width integers.
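
To make the pre-filtering concrete, here is a rough sketch of delta coding 
plus variable-byte (LEB128-style) coding of a sorted ID sequence -- the 
general technique, not the vex or PBF wire format:

    def varint(n):
        # Encode a non-negative integer 7 bits at a time; the high bit of
        # each byte flags continuation.
        out = bytearray()
        while True:
            byte = n & 0x7F
            n >>= 7
            out.append(byte | (0x80 if n else 0))
            if not n:
                return bytes(out)

    ids = [4000000001, 4000000005, 4000000012]  # sorted 64-bit node IDs
    deltas = [ids[0]] + [b - a for a, b in zip(ids, ids[1:])]
    encoded = b"".join(varint(d) for d in deltas)

    print(len(encoded))  # 7 bytes, versus 24 bytes as three fixed 64-bit IDs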

3. Generally speaking, I can certainly see the appeal of using code generated 
from a schema to support a format quickly and reliably in several languages. 
But one of the main difficulties I encountered with OSM PBF is that it requires 
the developer to mix automatically generated Protobuf code with various bits of 
hand-rolled code to handle the block structure, compression, delta coding, 
string tables, etc., diminishing the appeal of code generation. In a well 
designed format, the code to parse each individual OSM entity (or interpret it 
in-place) could in fact be quite simple compared to this compression and 
block-handling code, and I’m not sure we gain much by generating it. To achieve 
reasonably compact file sizes, FlatBuffer would still require mixing custom 
code into and around generated code. This would defeat one of my major design 
goals.

4. FlatBuffer allows accessing buffer contents without parsing or dynamic 
allocations, which is a laudable goal. However, the vex format as it is 
currently defined would also allow iterative access to every entity with no 
dynamic allocations, requiring only an initial pass over each entity to 
determine the offsets of tags, references, etc. before use. You could refer to 
this as “parsing the entity” but I expect it would have a near zero impact on 
speed (and potentially zero impact considering that the data needs to be pulled 
into the processor cache for use anyway). Also, the file sizes we are 
accustomed to depend on delta coding, which is a cumulative process. While 
entire blocks may be skipped over, we must scan over all entities within a 
block to progressively decode coordinates or entity references. Random access 
within a block is not compatible with delta coding, nor do I see much use for 
it in a bulk data transfer and archiving format. So I think it’s a non-problem 
that we have to sequentially interpret the entities within each block.

Of course I may have misunderstood something about your suggestion or the use 
cases you had in mind. As always I’d welcome any reactions or discussion. My 
intent here is not to defend a specification set in stone, but to see if there 
is a technical consensus on what a next generation OSM format could look like.

Regards,
Andrew

> On 06 Feb 2016, at 23:47, Stadin, Benjamin 
>  wrote:
> 
> Hi Andrew,
> 
> Cap'n Proto (successor of ProtoBuffer, from the guy who invented ProtoBuffer) 
> and FlatBuffers (another ProtoBuffer successor, by Google) have gained a lot 
> of traction since last year. Both eliminate many of the shortcomings of the 
> original ProtoBuffer (they allow for random access, streaming, ...), and also 
> improve on performance.
> 
> https://github.com/google/flatbuffers 
> 
> Here is a comparison between ProtoBuffer competitors:
> https://capnproto.org/news/2014-06-17-capnproto-flatbuffers-sbe.html 
> 
> 
> In my opinion FlatBuffers is the most interesting. It seems to have very good 
> language and platform support, and has quite a high adoption rate already. 
> 
> I think it is well worth reconsidering whether to create your own file format 
> and parser, for several reasons. Your concept looks well thought out; it 
> should be possible to implement a lightweight parser using FlatBuffers for 
> your data scheme. 
> 
> Regards
> Ben 

Re: [OSM-dev] Simpler binary OSM formats

2016-02-08 Thread Andrew Byrd

> On 08 Feb 2016, at 10:57, Andrew Byrd  wrote:
> To me, it seems much more advantageous to provide a table of file offsets 
> stating where each entity type begins. I have already considered adding this 
> to vex after the basic format is established (like author metadata and map 
> layers). It seems appropriate to place such a table at the end of the vex 
> data, because this allows the writer to produce output as a stream (no 
> backward seeks) and a reader can only make effective use of this table if 
> it’s buffering the data and able to seek within the file.

On second thought, if the table is to be placed at the end of the file/stream, 
the writer would not even necessarily need to store it, because the reader can 
easily construct an equivalent table as it receives the data (or the first time 
it scans over the file).
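
Concretely, the reader-side construction could look something like this sketch 
(the block header layout -- one entity-type byte plus a 4-byte little-endian 
payload length -- is invented for illustration, not the vex spec):

    import io, struct

    def build_offset_table(f):
        offsets = {}  # entity type -> file offset of its first block
        while True:
            pos = f.tell()
            header = f.read(5)
            if len(header) < 5:
                return offsets
            etype = header[0]
            length = struct.unpack("<I", header[1:])[0]
            offsets.setdefault(etype, pos)
            f.seek(length, io.SEEK_CUR)  # skip the payload entirely

    # Two fake blocks: type 0 with a 3-byte payload, type 1 with 2 bytes.
    data = (bytes([0]) + struct.pack("<I", 3) + b"abc"
            + bytes([1]) + struct.pack("<I", 2) + b"xy")
    print(build_offset_table(io.BytesIO(data)))  # {0: 0, 1: 8}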

-Andrew


Re: [OSM-dev] Simpler binary OSM formats

2016-02-08 Thread Andrew Byrd

> On 07 Feb 2016, at 20:10, Дмитрий Киселев  wrote:
> 
> As for fixed sized blocks in vex, I did consider that option but couldn’t 
> come up with a compelling reason for it. I can see the case for a maximum 
> block size (so we know what the maximum size of allocation will be), but can 
> you give a concrete example of how fixed-size blocks would be advantageous in 
> practice? I would be very hesitant to split any entity across multiple blocks.
> 
> 
> When you need relations-ways-nodes read order, blocks will save you from 
> unnecessarily reading through the whole file (yes, you can skip decompression 
> for nodes/ways, but you must still read the whole file).

Let me rephrase the question: You specifically mentioned blocks of a 
predetermined, fixed size. How would having fixed-size blocks (as opposed to 
the current variable sized blocks) improve your ability to seek to different 
entity types within a file? Maybe you are thinking of doing a binary search 
through the file rather than a linear search for the blocks of interest. But 
that means the vex blocks would need to be a fixed size after compression, not 
before compression. It seems too complex to require writers to target an exact 
post-compression block size.

Also, I think your observation that “you must read the whole file” when seeking 
ahead to another entity type requires some additional nuance. You must only 
read the header of each block, at which point you know how long that block is 
and you can seek ahead to the next block. So indeed, you’d touch at least one 
page or disk block per vex block. Pages are typically 4 kbytes, so if your vex 
blocks are a few Mbytes in size, you would only access on the order of 1/1000 
of the pages while seeking ahead to a particular entity type. 

To me, it seems much more advantageous to provide a table of file offsets 
stating where each entity type begins. I have already considered adding this to 
vex after the basic format is established (like author metadata and map 
layers). It seems appropriate to place such a table at the end of the vex data, 
because this allows the writer to produce output as a stream (no backward 
seeks) and a reader can only make effective use of this table if it’s buffering 
the data and able to seek within the file.

> Second example: find something by id. If you have blocks it's easy to map a 
> whole block into memory and do a binary search with logN block reads instead 
> of seeking through the file all the time.

Unlike o5m I have not included any requirement that IDs be in a particular 
order, which means binary searches are not always possible. I see vex as a data 
interchange format usable in both streaming and disk-backed contexts, not as a 
replacement for an indexed database table. It’s an interesting idea to try to 
serve both purposes at once and be able to quickly seek to an ID within a flat 
data file, but I’m not sure if such capabilities are worth the extra 
complexity. Such a binary search, especially if performed repeatedly for 
different entities, would be touching (and decompressing) a lot of disk blocks 
/ memory pages because the IDs you’re searching through are mixed in with the 
rest of the data rather than in a separate index as they would be in a database.

Andrew



Re: [OSM-dev] Simpler binary OSM formats

2016-02-08 Thread Stadin, Benjamin
> When you need relations-ways-nodes read order, blocks will save you from 
> unnecessarily reading through the whole file (yes, you can skip decompression 
> for nodes/ways, but you must still read the whole file).

Or use mmap. Or use Cap'n Proto or FlatBuffers directly, which support this out 
of the box.
https://capnproto.org/cxx.html#using-mmap
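
For instance, a reader that memory-maps the file lets the OS page data in on 
demand instead of reading everything up front -- a minimal Python sketch (the 
file name is hypothetical):

    import mmap

    with open("planet.vex", "rb") as f:  # hypothetical file name
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        header = mm[:8]  # touches only the first page of the mapping
        mm.close()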

> Second example: find something by id. If you have blocks it's easy to map a 
> whole block into memory and do a binary search with logN block reads instead 
> of seeking through the file all the time.

I believe it will make the implementation unnecessarily complicated. In the 
case of Cap'n Proto you could set the initial position of your binary search 
to the array index; under the hood it would use mmap to seek to the location. 
You can also improve the lookup: for example, start a binary search for the 
way with id 999 at ways[999] instead of ways[waysCount/2].
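
A sketch of that clamped starting point (assuming a sorted array of unique 
positive IDs, in which the entry for id X can never sit above index X-1):

    def find(ids, target):
        # With sorted unique positive IDs, ids[i] >= i + 1, so target can
        # only appear at index <= target - 1; clamp the upper bound there.
        lo, hi = 0, min(target - 1, len(ids) - 1)
        while lo <= hi:
            mid = (lo + hi) // 2
            if ids[mid] == target:
                return mid
            if ids[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return -1

    ids = [1, 2, 5, 9, 10]  # sorted unique way IDs
    print(find(ids, 9))     # 3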

From: Дмитрий Киселев <dmitry.v.kise...@gmail.com>
Date: Sunday, 7 February 2016 at 20:10
To: Andrew Byrd <and...@fastmail.net>
Cc: Benjamin Stadin <benjamin.sta...@heidelberg-mobil.com>, 
"dev@openstreetmap.org" <dev@openstreetmap.org>
Subject: Re: [OSM-dev] Simpler binary OSM formats
Betreff: Re: [OSM-dev] Simpler binary OSM formats

As for fixed sized blocks in vex, I did consider that option but couldn’t come 
up with a compelling reason for it. I can see the case for a maximum block size 
(so we know what the maximum size of allocation will be), but can you give a 
concrete example of how fixed-size blocks would be advantageous in practice? I 
would be very hesitant to split any entity across multiple blocks.

When you need relations-ways-nodes read order, blocks will save you from 
unnecessarily reading through the whole file (yes, you can skip decompression 
for nodes/ways, but you must still read the whole file).

Second example: find something by id. If you have blocks it's easy to map a 
whole block into memory and do a binary search with logN block reads instead of 
seeking through the file all the time.