On Sat, Feb 14, 2015 at 03:57:21PM +0100, Colin Smale wrote:
> What about multi-lingual countries such as Belgium or Switzerland?
> Or multi-lingual cities, such as Montréal in Canada?
> But, is this really an issue? Street signs may be in two or more
> languages, saying "Foo Street" and "Rue Foo" for example. Can anyone
> name a multi-lingual area where a stopword in one language would be a
> non-stopword in the other language?
> For example, could there be a highway=* with name="Rue Street" in a
> French/English area? I would not think so.
For what it is worth, there are a lot of bilingual street signs in
Finland, using Finnish (name:fi), Swedish (name:sv) or in the north,
Sámi (name:se). Whether multiple languages are used depends on the share
of the minority population. The majority language appears first on the
signs, so it is usually either Finnish first, then Swedish, or Swedish
first, then Finnish. Some signs are in Finnish or Swedish only.
> How about this (sorry the abbreviations are wrong but it is only to
> illustrate my point):
> mkgmap:country=POL {set mkgmap:lang=polish;}
AFAIU, your suggestion wrongly assumes that only one language will be
used in a given region. And I think it should be based on administrative
regions, not necessarily countries.
How would you represent an area that has multiple official languages
that can appear on street signs? I think that the OSM convention would
be something like this:
{ set mkgmap:lang:fi=yes; set mkgmap:lang:sv=yes; }
or the (more tricky for our style rules)
{ set mkgmap:lang='fi;sv' }
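To illustrate why the second form is trickier: whoever consumes the tag
has to split the value before it can pick the stopword lists. A minimal
Python sketch, not mkgmap code; the function name parse_langs is made up
and mkgmap:lang is only the tag proposed in this thread:

```python
def parse_langs(tags):
    """Return the language codes from a semicolon-separated mkgmap:lang value."""
    # mkgmap:lang is the hypothetical tag discussed here, not an existing one.
    value = tags.get("mkgmap:lang", "")
    # Split on ';', trim whitespace, and drop empty entries (e.g. a trailing ';').
    return [code.strip() for code in value.split(";") if code.strip()]
```

With the first form (one tag per language) no such splitting is needed,
since each language can be tested with an ordinary tag match.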
> If the stopwords were also defined to be regular expressions, then it
> could also handle prefixes and suffixes as well as whole words.
I agree that defining stopwords as regular expressions would provide
some necessary flexibility. Like someone said, we do not want to omit
Straße (or other stopwords) at the start of a street name in languages
that usually put the stopword at the end of the name. But in French and
Spanish, the stopword is always at the start of the name. An anchored
regexp (\<Straße$ or ^Calle\>) would nicely express this.
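A rough sketch of the anchoring idea, in Python rather than in mkgmap
itself. Python's re module has no \< and \>, so \b stands in for both.
The table STOPWORDS and the function strip_stopword are made up for
illustration only:

```python
import re

# Hypothetical per-language stopword patterns, each anchored to one end.
STOPWORDS = {
    "de": re.compile(r"\s*\bStraße$"),  # whole word, only at the end
    "es": re.compile(r"^Calle\b\s*"),   # whole word, only at the start
}

def strip_stopword(name, lang):
    """Remove the anchored stopword for lang; otherwise return name unchanged."""
    pattern = STOPWORDS.get(lang)
    if pattern is None:
        return name
    stripped = pattern.sub("", name)
    # Keep the original if stripping would leave an empty index key.
    return stripped or name
```

Under these assumptions "Berliner Straße" would be indexed as "Berliner",
while "Straße des 17. Juni" is left alone because the stopword is not at
the end. Likewise "Calle Mayor" becomes "Mayor", but "Callejón del Gato"
is untouched because of the word boundary.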
Maybe the regexp could also facilitate a rewriting system for
abbreviating the index entries, such as replacing "street" with "st" in
English, "Straße" with "Str" in German, "puiestee" with "pst" in
Estonian, "katu" with "k" in Finnish and so on.
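The rewriting could be sketched in the same way, again in Python and
again with made-up names (ABBREVIATIONS, abbreviate). The choice between
whole-word and suffix matching per language is my assumption, not
something mkgmap does today:

```python
import re

# Hypothetical rewrite rules: language -> (anchored pattern, abbreviation).
ABBREVIATIONS = {
    "en": (re.compile(r"\bstreet$", re.IGNORECASE), "st"),  # whole word at end
    "de": (re.compile(r"\bStraße$"), "Str"),                # whole word at end
    "et": (re.compile(r"\bpuiestee$"), "pst"),              # whole word at end
    "fi": (re.compile(r"katu$"), "k"),                      # suffix of a compound
}

def abbreviate(name, lang):
    """Replace the trailing stopword with its abbreviation, if a rule exists."""
    rule = ABBREVIATIONS.get(lang)
    if rule is None:
        return name
    pattern, short = rule
    return pattern.sub(short, name)
```

For example, abbreviate("Aleksanterinkatu", "fi") would give
"Aleksanterink" and abbreviate("Kaarli puiestee", "et") would give
"Kaarli pst", while "Street of Dreams" stays as it is because the
pattern is anchored to the end.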
Marko