I think it would be best to make the tools we use (JOSM, Overpass API, iD,
etc.) Unicode-aware, so they can handle this correctly.
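
As a rough sketch of what "Unicode-aware" could mean when matching names
(in Python; the helper names fold_spaces, names_match and find_nbsp are
made up for illustration and are not part of JOSM, Overpass API or iD),
a tool could treat U+00A0 as an ordinary space for comparison only, and
make it visible to the mapper on request:

```python
import unicodedata

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE


def fold_spaces(value: str) -> str:
    """Return a comparison key: NFC-normalized, with NBSP folded to a space.

    The stored tag value is left untouched; only the key used for
    matching is changed.
    """
    return unicodedata.normalize("NFC", value).replace(NBSP, " ")


def names_match(a: str, b: str) -> bool:
    """Compare two names while ignoring the NBSP / space distinction."""
    return fold_spaces(a) == fold_spaces(b)


def find_nbsp(value: str) -> list[int]:
    """Return the indexes of all no-break spaces, so an editor could show them."""
    return [i for i, ch in enumerate(value) if ch == NBSP]


if __name__ == "__main__":
    stored = "V" + NBSP + "jámě"   # value as it might be stored in the data
    typed = "V jámě"               # value as typed into a query
    print(stored == typed)             # plain comparison
    print(names_match(stored, typed))  # comparison on folded keys
    print(find_nbsp(stored))           # where the NBSP sits
```

This only folds U+00A0; whether other space-like characters should be
folded as well is a separate decision.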

Polyglot

2018-01-26 16:50 GMT+01:00 Matej Lieskovský <lieskovsky.ma...@gmail.com>:

> @marc: I just realized - I'm not talking about breaking words between
> syllables but about breaking lines between words. It is not adding a
> character, just using a nonbreakable version of a space. Sorry if I'm
> not being clear.
>
> On 26 January 2018 at 16:47, Matej Lieskovský
> <lieskovsky.ma...@gmail.com> wrote:
> > In Czech, a nonbreakable space should follow any single-letter
> > preposition or conjunction and academic or military titles. A
> > nonbreakable space should also be used due to some common
> > contractions, between a number and a unit, and around some punctuation
> > marks.
> >
> > I noticed that some Overpass queries were not returning some elements
> > - that is how I found out that we actually have a rather large number
> > of nonbreakable spaces in the data.
> >
> > Nonbreakable spaces are currently quite troublesome - not all
> > consumers actually use Unicode collation, they are invisible in JOSM
> > and they are not exactly easy to input. Also, the chance that we
> > convince all contributors to use them correctly is exactly zero.
> > Since this could also be seen as "tagging for the renderer", there
> > are many calls for a mass-removal.
> >
> > On the other hand, there is software that actually handles Unicode
> > collation well and it does make the correct rendering of names an
> > order of magnitude easier. Leaving this up to the renderer sounds
> > logical, but imagine forcing every renderer to figure out what
> > language any given name is in and then running the appropriate
> > subprogram to fill in the nonbreakable spaces. This could require
> > semantic analysis due to the need to add a nonbreakable space after
> > the "V" in "V jámě" (preposition) but before the "V" in "Jiří V."
> > (Roman ordinal number) and after the "V." in "V. Špidla" (contraction
> > of a name; and yes, there are cases when you should use a contraction).
> >
> > Nonbreakable spaces are strange - you cannot reliably tell if they are
> > used OTG (but in some cases you can), official documents often ignore
> > them (leaving them up to the automated systems in office software, so
> > they do occur sometimes) and the rules governing them are older than
> > computers, so asking if they are a rule or a character is... dubious.
> >
> > And yes, we do have really long names of things. Names of POIs named
> > after people are a common use case.
> >
> > Matej
> >
> > On 26 January 2018 at 16:11, marc marc <marc_marc_...@hotmail.com> wrote:
> >> On 26. 01. 18 at 15:48, Matej Lieskovský wrote:
> >>> Several Slavic languages have rather formal rules about line breaks.
> >>
> >> It depends on whether it is a grammar rule or a "char".
> >> In French, there is a rule for how to break a word at the end of a line.
> >> Since it's a grammar rule, I don't see any point in adding a character
> >> between syllables to describe it. It's up to the renderer
> >> to know when it can do it, if people want this feature.
> >> I know nothing about your language, but I feel it looks like the same case.
> >> If my understanding is correct, I am in favour of not putting
> >> this "nonbreakable" information into a value and moving it to the app code
> >> that needs it (which one? Do you have values so long that they need to
> >> be broken across several lines?)
> >>
> >> Regards,
> >> Marc
> >> _______________________________________________
> >> Tagging mailing list
> >> Tagging@openstreetmap.org
> >> https://lists.openstreetmap.org/listinfo/tagging
>
