Hi Ticker,

Please remove the unrelated changes. I think we discussed them in May
with the patch mdrSort.patch, subject "MDR building out-of-memory".

Gerd

________________________________________
From: mkgmap-dev <[email protected]> on behalf of Ticker 
Berkin <[email protected]>
Sent: Monday, 18 October 2021 16:36
To: Development list for mkgmap
Subject: Re: [mkgmap-dev] java.lang.AssertionError while building index from 
unicode tiles

Hi Gerd

Here is the first version of the changes to improve MDR unicode
handling and stop the crash.

It always provides a PRIMARY strength sort value, both in the key used
for sorting and in direct comparison when using the collator.
Previously neither of these would have anything for a unicode character
not mentioned in the sort/cp65001.txt file.

In an attempt to stop ordering clashes between the specified sort and
the ones fudged from the actual unicode value, it orders anything
unknown after the known values. Unfortunately these can then become
larger than 2 bytes - and, as this is all the space available without
re-structuring, they have to wrap onto the known sort region. I only
found 1 character that did this and I don't know if it conflicted with
an existing sort.
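
To illustrate the wrap problem described above, here is a minimal sketch.
It is not the code from the patch: the class name, the primaryFor method,
the known map and the "max known + 1 + code point" formula are all
hypothetical, chosen only to show how a fallback value that must fit in
2 bytes can land back on a value already used by the specified sort.

```java
import java.util.Map;

// Hypothetical sketch, not mkgmap code: derive a PRIMARY sort value for
// every character, falling back to the code point when the sort
// description does not mention the character.
class FallbackPrimarySketch {
    // Assumption: only 2 bytes are available for a primary value.
    static final int PRIMARY_MASK = 0xFFFF;

    static int primaryFor(int codePoint, Map<Integer, Integer> known, int maxKnown) {
        Integer specified = known.get(codePoint);
        if (specified != null) {
            return specified; // value taken from the sort description
        }
        // Place unknown characters after everything that is known ...
        int candidate = maxKnown + 1 + codePoint;
        // ... but keep only 2 bytes, so large values wrap around and can
        // collide with the known region.
        return candidate & PRIMARY_MASK;
    }
}
```

With known primaries 1 and 2 (so maxKnown is 2), U+4E2D maps to 20016,
safely after the known values, while U+FFFD wraps to 2 and clashes with
an existing primary.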

Regardless of the character set used, in all the places where sorting
is used for de-dupe, I've used the SECONDARY strength collator to
detect similar records instead of name.equals(lastName).
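
As a rough illustration of the idea, the sketch below de-dupes a sorted
list by comparing each name with the previous one using a SECONDARY
strength collator. It uses the standard java.text.Collator as a stand-in
for the mkgmap collator, and the class and method names are made up for
the example.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: java.text.Collator stands in for the mkgmap
// collator; the de-dupe loop is the part being illustrated.
class SecondaryDedupeSketch {
    static List<String> dedupe(List<String> names) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        // SECONDARY strength ignores tertiary differences such as case.
        collator.setStrength(Collator.SECONDARY);

        List<String> sorted = new ArrayList<>(names);
        sorted.sort(collator);

        List<String> result = new ArrayList<>();
        String lastName = null;
        for (String name : sorted) {
            // Previously: !name.equals(lastName), which treats names that
            // differ only in case as distinct records.
            if (lastName == null || collator.compare(name, lastName) != 0) {
                result.add(name);
                lastName = name;
            }
        }
        return result;
    }
}
```

Given "Main Street", "main street" and "Oak Road", this keeps two
entries, where a check based on equals() would keep all three.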

I also noticed that my source base included optimisation for
LargeListSorter, its use of a key cache and some tidy-up of this in
mdr7 & mdr11 so these are here as well.

Ticker

_______________________________________________
mkgmap-dev mailing list
[email protected]
https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
