Hi Ticker

> Problem is that resources/sort/cp65001.txt doesn't give ordering to
> lots of characters; it looks like it covers only about 10,500 of the
> 1,112,064 possible code-points. Many of these non-ordered characters
> are being used by the names in the tile in question.

I used the program in extra/src/uk/me/parabola/util/CollationRules.java
to generate some of the tables. This uses the file "allkeys.txt", which
can be obtained from
https://www.unicode.org/Public/UCA/latest/allkeys.txt

The document explaining the Unicode collation rules that references that
file is:
http://www.unicode.org/reports/tr10/

It includes a section on programmatically deriving the weights for
characters that do not have explicit entries in the table.

> Assuming the actual ordering of unspecified code-points doesn't really
> matter, I propose to change the logic slightly so undefined Unicode is
> sorted on its 16-bit value after the range of known sorts.

I think that is a good initial approach to get things working.

Steve
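
For illustration, the fallback proposed in the quoted text could look
roughly like the sketch below. This is not mkgmap code: the class
FallbackSortKey, its methods, and the toy ordering string are all made up
for the example and stand in for whatever table is built from
resources/sort/cp65001.txt. Characters with an explicit entry keep their
table position; anything else is keyed on its 16-bit char value, offset so
that it always sorts after the known range.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not part of mkgmap. */
class FallbackSortKey {
    // Position of each explicitly ordered char (assumed stand-in for the
    // table that would be parsed from the sort description file).
    private final Map<Character, Integer> known = new HashMap<>();

    // First key handed out to chars that have no explicit entry.
    private final int firstFallback;

    /** @param orderedKnownChars the known chars, in their desired order */
    FallbackSortKey(String orderedKnownChars) {
        for (int i = 0; i < orderedKnownChars.length(); i++) {
            known.put(orderedKnownChars.charAt(i), i + 1);
        }
        firstFallback = orderedKnownChars.length() + 1;
    }

    /** Known chars get their table position; all others sort after them
     *  in order of their 16-bit value. */
    int keyFor(char c) {
        Integer pos = known.get(c);
        return pos != null ? pos : firstFallback + c;
    }

    /** Compares strings char by char using keyFor; shorter string first
     *  when one is a prefix of the other. */
    Comparator<String> comparator() {
        return (a, b) -> {
            int n = Math.min(a.length(), b.length());
            for (int i = 0; i < n; i++) {
                int diff = Integer.compare(keyFor(a.charAt(i)), keyFor(b.charAt(i)));
                if (diff != 0) {
                    return diff;
                }
            }
            return Integer.compare(a.length(), b.length());
        };
    }
}
```

With new FallbackSortKey("bac"), the three listed characters sort as
b, a, c, and any character not listed comes after all of them. Since a
Java char is a UTF-16 code unit, code points above U+FFFF would be compared
one surrogate at a time here, which fits the "16-bit value" wording but is
not the implicit-weight scheme described in tr10.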