Hi Gerd While looking at the mdr output from Mapinstall and the differences in the street repeat flags, mainly where there are shield/road-ref entries, I realised my change to Sort.java for undefined sortOrder went to far and it disrupts this ordering for Unicode.
Missing or 0 sortOrder (anything before the first "<" in sort/cp*.txt) means the character is ignored by the key sort and collator.compare(), so TERTIARY and EQUAL can give different answers. Attached is mdrUnicode_v9a.patch that generates a fake sortOrder only for rows where no sortOrders are defined, instead of for anything in Unicode with no/0 sortOrder. With this patch it becomes possible for Unicode to trigger the original problem again. Attached is mdrUnicode_v9b.patch that has the fix for this from the previous patch. Ticker On Tue, 2021-11-09 at 18:13 +0000, Ticker Berkin wrote: > Hi Gerd > > Sorry about the delay, I'm away at the moment without access to the > sources and other tools, but: > > The same pointer size (1-3 bytes) is used by Mdr5 and Mdr25. I can't > remember if this is determined by the actual number of Mdr5 entries > or > bigger than Mdr5, then this setting needs to be investigated. > > To see what happens when Mdr25 uses exact name dedupe and Mdr5 uses > TERTIARY collator, the best way I can think of is to remove some > characters from the sort/code-page such that Mdr5 will consider some > cities to be identical in Mdr5, then see what effect this has on > Find>Address. > > With Mdr, I can well understand unwillingness to change anything when > it > seems to be working. My attempts to switch to SECONDARY collator, > which > I think is what has to happen eventually, had confusing results > > Ticker > > > On 07/11/2021 09:16, Gerd Petermann wrote: > > Hi Ticker, > > > > > there is no logical change in behaviour to Mdr5 with this > > > patch. > > yes, I already wrote it's only refactoring. > > I still try to find some "real OSM" input where the changes to the > > other files show any difference in the mdr file. > > > > I found this very old svn log entry by Steve: > > https://www.mkgmap.org.uk/websvn/revision.php?repname=mkgmap&rev=3093 > > which says that "All cases where sorted names are compared with > > equals() need to be changed to use Collator.compare() with the > > SrtCollator from Sort." > > > > So, I really wonder why he didn't change all sources accordingly. > > Maybe it was only about cities, maybe not. > > I see only one way to find out: Have an input that has shows > > differences in the index and find out which version gives better > > results in the search in MapSource or - if testing is possible - on > > a device. > > > > Gerd > > > > ________________________________________ > > Von: mkgmap-dev <mkgmap-dev-boun...@lists.mkgmap.org.uk> im Auftrag > > von Ticker Berkin <rwb-mkg...@jagit.co.uk> > > Gesendet: Samstag, 6. November 2021 18:46 > > An: Development list for mkgmap > > Betreff: Re: [mkgmap-dev] [mkgmap-svn] Commit r4811: fix > > java.lang.AssertionError while building index from unicode tiles > > > > Hi Gerd > > > > I don't know if Mdr5 was correct before, but, for European names at > > least, I presume it has been used quite a lot and no complaints. > > > > If the default strength for a Collator is TERTIARY, there is no > > logical > > change in behaviour to Mdr5 with this patch. > > > > Ticker > > > > On Sat, 2021-11-06 at 14:33 +0000, Gerd Petermann wrote: > > > H Ticker, > > > > > > but how do you know that the code in Mdr5 is right (with or > > > without > > > the patch)? > > > > > > Gerd > > > > > > ________________________________________ > > > Von: mkgmap-dev <mkgmap-dev-boun...@lists.mkgmap.org.uk> im > > > Auftrag > > > von Ticker Berkin <rwb-mkg...@jagit.co.uk> > > > Gesendet: Samstag, 6. November 2021 15:29 > > > An: Development list for mkgmap > > > Betreff: Re: [mkgmap-dev] [mkgmap-svn] Commit r4811: fix > > > java.lang.AssertionError while building index from unicode tiles > > > > > > Hi Gerd > > > > > > I just hate the idea it is obviously wrong and trivially to make > > > correct. It is unlikely it will ever fail. While it is in this > > > inconsistent state it inhibits any addressing of the more > > > complicated > > > issue about how to tackle the case-differences in city names and > > > what > > > MdrCheck is reporting. > > > > > > Ticker > > > > > > On Sat, 2021-11-06 at 14:10 +0000, Gerd Petermann wrote: > > > > Hi Ticker, > > > > > > > > ok, so let's wait for that crash to get an example. I really > > > > have > > > > no > > > > idea how to force it :( > > > > According to MdrCheck the index is full of errors for the two > > > > tiles > > > > in China, but maybe it's MdrCheck which is wrong. > > > > > > > > Gerd > > > > > > > > ________________________________________ > > > > Von: mkgmap-dev <mkgmap-dev-boun...@lists.mkgmap.org.uk> im > > > > Auftrag > > > > von Ticker Berkin <rwb-mkg...@jagit.co.uk> > > > > Gesendet: Samstag, 6. November 2021 12:27 > > > > An: mkgmap-dev@lists.mkgmap.org.uk > > > > Betreff: Re: [mkgmap-dev] [mkgmap-svn] Commit r4811: fix > > > > java.lang.AssertionError while building index from unicode > > > > tiles > > > > > > > > Hi Gerd > > > > > > > > The crash in Carlos's original problem is due to the > > > > inconsistency > > > > in > > > > the dedups between Mdr5/Mdr25. > > > > > > > > This could be triggered with any --code-page where city names > > > > contain > > > > characters that exist in this character-set but are not given > > > > sort > > > > positions. > > > > > > > > My mistake with mdrUnicode_v1/2.patch was trying to tackle the > > > > case > > > > problem at the same time. This is going to be much more > > > > difficult. > > > > > > > > Ticker > > > > > > > > > > > > On Fri, 2021-11-05 at 11:00 +0000, Gerd Petermann wrote: > > > > > Hi Ticker, > > > > > > > > > > sorry for my reluctance. I simply have no test case that > > > > > shows an > > > > > error (search for xyz not working). If you have one please > > > > > share > > > > > it > > > > > so that I can understand the importance of the patch. > > > > > > > > > > I would also be happy if you could create a new branch to > > > > > test > > > > > further changes reg. --lower-case. > > > > > According to MdrCheck we also produce wrong data for mdr 27 > > > > > (cities > > > > > are not de-duped). > > > > > Found this also with Arndts *.img data but did not yet try to > > > > > find > > > > > out if MdrCheck is right. > > > > > > > > > > Gerd > > > > > > > > > > ________________________________________ > > > > > Von: mkgmap-dev <mkgmap-dev-boun...@lists.mkgmap.org.uk> im > > > > > Auftrag > > > > > von Ticker Berkin <rwb-mkg...@jagit.co.uk> > > > > > Gesendet: Freitag, 5. November 2021 11:34 > > > > > An: mkgmap-dev@lists.mkgmap.org.uk; > > > > > mkgmap-...@lists.mkgmap.org.uk > > > > > Betreff: Re: [mkgmap-dev] [mkgmap-svn] Commit r4811: fix > > > > > java.lang.AssertionError while building index from unicode > > > > > tiles > > > > > > > > > > Hi Gerd > > > > > > > > > > I really don't like the idea of the consistent dedup part of > > > > > this > > > > > patch > > > > > not being applied (Mdr5 & Mdr25). > > > > > The Mdr7 changes are a slight refactor and a useful comment, > > > > > but > > > > > has > > > > > no > > > > > logical effect. > > > > > The Mdr29 changes are an assert to detect inconsistency in > > > > > Mdr5/25 > > > > > index pointer lengths before these could cause a crash + > > > > > .equal()/collator.compare() change that could be removed > > > > > > > > > > Ticker > > > > > > > > > > On Fri, 2021-11-05 at 09:02 +0000, svn commit wrote: > > > > > > Version mkgmap-r4811 was committed by gerd on Fri, 05 Nov > > > > > > 2021 > > > > > > > > > > > > fix java.lang.AssertionError while building index from > > > > > > unicode > > > > > > tiles > > > > > > Changes extracted from mdrUnicode_v8.patch by Ticker Berkin > > > > > > > > > > > > http://www.mkgmap.org.uk/websvn/revision.php?repname=mkgmap&rev=4811 > > > > > > _______________________________________________ > > > > > > mkgmap-svn mailing list > > > > > > To unsubscribe send an mail to > > > > > > mkgmap-svn-le...@lists.mkgmap.org.uk > > > > > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-svn > > > > > > > > > > > > > > > _______________________________________________ > > > > > mkgmap-dev mailing list > > > > > mkgmap-dev@lists.mkgmap.org.uk > > > > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev > > > > > _______________________________________________ > > > > > mkgmap-dev mailing list > > > > > mkgmap-dev@lists.mkgmap.org.uk > > > > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev > > > > > > > > > > > > _______________________________________________ > > > > mkgmap-dev mailing list > > > > mkgmap-dev@lists.mkgmap.org.uk > > > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev > > > > _______________________________________________ > > > > mkgmap-dev mailing list > > > > mkgmap-dev@lists.mkgmap.org.uk > > > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev > > > > > > > > > _______________________________________________ > > > mkgmap-dev mailing list > > > mkgmap-dev@lists.mkgmap.org.uk > > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev > > > _______________________________________________ > > > mkgmap-dev mailing list > > > mkgmap-dev@lists.mkgmap.org.uk > > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev > > > > > > _______________________________________________ > > mkgmap-dev mailing list > > mkgmap-dev@lists.mkgmap.org.uk > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev > > _______________________________________________ > > mkgmap-dev mailing list > > mkgmap-dev@lists.mkgmap.org.uk > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev > > > _______________________________________________ > mkgmap-dev mailing list > mkgmap-dev@lists.mkgmap.org.uk > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
Index: src/uk/me/parabola/imgfmt/app/mdr/Mdr25.java =================================================================== --- src/uk/me/parabola/imgfmt/app/mdr/Mdr25.java (revision 4817) +++ src/uk/me/parabola/imgfmt/app/mdr/Mdr25.java (working copy) @@ -12,6 +12,7 @@ */ package uk.me.parabola.imgfmt.app.mdr; +import java.text.Collator; import java.util.ArrayList; import java.util.List; @@ -37,6 +38,8 @@ */ public void sortCities(List<Mdr5Record> list) { Sort sort = getConfig().getSort(); + Collator collator = sort.getCollator(); + collator.setStrength(Collator.TERTIARY); List<SortKey<Mdr5Record>> keys = new ArrayList<>(); for (Mdr5Record c : list) { @@ -52,9 +55,10 @@ for (SortKey<Mdr5Record> key : keys) { Mdr5Record city = key.getObject(); - if (lastCity == null || !city.getName().equals(lastCity.getName()) - || !city.getRegionName().equals(lastCity.getRegionName()) - || !city.getCountryName().equals(lastCity.getCountryName())) { + if (lastCity == null || + collator.compare(city.getName(), lastCity.getName()) != 0 || + collator.compare(city.getRegionName(), lastCity.getRegionName()) != 0 || + collator.compare(city.getCountryName(), lastCity.getCountryName()) != 0) { record++; // Record in the 29 index if there is one for this record @@ -61,8 +65,8 @@ Mdr14Record mdrCountry = city.getMdrCountry(); Mdr29Record mdr29 = mdrCountry.getMdr29(); String name = mdr29.getName(); - assert mdrCountry.getName().equals(name); - if (!name.equals(lastName)) { + assert collator.compare(mdrCountry.getName(), name) == 0; + if (lastName == null || collator.compare(name, lastName) != 0) { mdr29.setMdr25(record); lastName = name; } Index: src/uk/me/parabola/imgfmt/app/mdr/Mdr29.java =================================================================== --- src/uk/me/parabola/imgfmt/app/mdr/Mdr29.java (revision 4817) +++ src/uk/me/parabola/imgfmt/app/mdr/Mdr29.java (working copy) @@ -83,6 +83,8 @@ PointerSizes sizes = getSizes(); int size24 = sizes.getSize(24); int size22 = sizes.getSize(22); + assert sizes.getNumberOfItems(5) >= sizes.getNumberOfItems(25) + : "5:" + sizes.getNumberOfItems(5) + " 25:" + sizes.getNumberOfItems(25); int size25 = sizes.getSize(5); // NB appears to be size of 5 (cities), not 25 (cities with country). int size26 = has26? sizes.getSize(26): 0; int size17 = Utils.numberToPointerSize(max17); Index: src/uk/me/parabola/imgfmt/app/mdr/Mdr5.java =================================================================== --- src/uk/me/parabola/imgfmt/app/mdr/Mdr5.java (revision 4817) +++ src/uk/me/parabola/imgfmt/app/mdr/Mdr5.java (working copy) @@ -39,6 +39,8 @@ private int maxCityIndex; private int localCitySize; private int mdr20PointerSize = 0; // bytes for mdr20 pointer, or 0 if no mdr20 + private Sort sort; + private Collator collator; // We need a common area to save the mdr20 values, since there can be multiple // city records with the same global city index @@ -46,6 +48,9 @@ public Mdr5(MdrConfig config) { setConfig(config); + sort = config.getSort(); + collator = sort.getCollator(); + collator.setStrength(Collator.TERTIARY); } public void addCity(Mdr5Record record) { @@ -62,15 +67,14 @@ public void preWriteImpl() { allCities.trimToSize(); localCitySize = Utils.numberToPointerSize(maxCityIndex + 1); - Sort sort = getConfig().getSort(); - genCitiesAndMdr20s(sort); + genCitiesAndMdr20s(); // calculate positions for the different road indexes - calcMdr20SortPos(sort); - calcMdr21SortPos(sort); - calcMdr22SortPos(sort); + calcMdr20SortPos(); // TODO: Maybe this can be simplified away - see mdr5-simplify-v2.patch 02-Nov-2021 + calcMdr21SortPos(); + calcMdr22SortPos(); } - private void genCitiesAndMdr20s(Sort sort) { + private void genCitiesAndMdr20s() { List<SortKey<Mdr5Record>> sortKeys = new ArrayList<>(allCities.size()); Map<String, byte[]> regionCache = new HashMap<>(); Map<String, byte[]> countryCache = new HashMap<>(); @@ -90,7 +94,6 @@ sortKeys.sort(null); cities = new ArrayList<>(sortKeys.size()); - Collator collator = sort.getCollator(); int count = 0; Mdr5Record lastCity = null; @@ -122,7 +125,7 @@ /** * Calculate a position when sorting by name, region, and country- This is used for MDR20. */ - private void calcMdr20SortPos(Sort sort) { + private void calcMdr20SortPos() { List<SortKey<Mdr5Record>> sortKeys = new ArrayList<>(allCities.size()); Map<String, byte[]> regionCache = new HashMap<>(); Map<String, byte[]> countryCache = new HashMap<>(); @@ -155,7 +158,7 @@ /** * Calculate a position when sorting by region- This is used for MDR21. */ - private void calcMdr21SortPos(Sort sort) { + private void calcMdr21SortPos() { List<SortKey<Mdr5Record>> sortKeys = new ArrayList<>(allCities.size()); for (Mdr5Record m : allCities) { if (m.getRegionName() == null) @@ -181,7 +184,7 @@ * Calculate a position when sorting by country- This is used for MDR22. */ - private void calcMdr22SortPos(Sort sort) { + private void calcMdr22SortPos() { List<SortKey<Mdr5Record>> sortKeys = new ArrayList<>(allCities.size()); for (Mdr5Record m : allCities) { if (m.getCountryName() == null) @@ -208,7 +211,6 @@ Mdr5Record lastCity = null; boolean hasString = hasFlag(0x8); boolean hasRegion = hasFlag(0x4); - Collator collator = getConfig().getSort().getCollator(); for (Mdr5Record city : cities) { int gci = city.getGlobalCityIndex(); addIndexPointer(city.getMapIndex(), gci); @@ -284,7 +286,7 @@ int val = (localCitySize - 1); // String offset is only included for a mapsource index. if (isForDevice() ) { - if (!getConfig().getSort().isMulti()) + if (!sort.isMulti()) val |= 0x40; // mdr17 sub section present (not with unicode) } else { val |= 0x04; // region Index: src/uk/me/parabola/imgfmt/app/mdr/Mdr7.java =================================================================== --- src/uk/me/parabola/imgfmt/app/mdr/Mdr7.java (revision 4817) +++ src/uk/me/parabola/imgfmt/app/mdr/Mdr7.java (working copy) @@ -52,6 +52,7 @@ private int partialInfoSize; private Set<String> exclNames; private final Sort sort; + private Collator collator; private int maxPrefixCount; private int maxSuffixCount; private boolean magicIsValid; @@ -60,6 +61,8 @@ public Mdr7(MdrConfig config) { setConfig(config); sort = config.getSort(); + collator = sort.getCollator(); + collator.setStrength(Collator.SECONDARY); splitName = config.isSplitName(); exclNames = config.getMdr7Excl(); codepage = sort.getCodepage(); @@ -217,8 +220,6 @@ // list is now sorted by partial name only, we have to group by name and map index now String lastPartial = null; List<Mdr7Record> samePartial = new ArrayList<>(); - Collator collator = sort.getCollator(); - collator.setStrength(Collator.SECONDARY); for (int i = 0; i < sorted.size(); i++) { Mdr7Record r = sorted.get(i); String partial = r.getPartialName(); @@ -264,6 +265,12 @@ // per map for the same name. for (int i = 0; i < samePartial.size(); i++) { Mdr7Record r = samePartial.get(i); + //if (last != null && r.getMapIndex() == last.getMapIndex() && collator.compare(r.getName(), last.getName()) == 0) { + // 28Oct2021 Ticker: + // Maybe this should use above "collator.compare(x,y)==0" rather than following "x.equals(y)", but this causes + // significant differences to various MDR sections when there is the same street name but with different shields + // The eTrex Legend HXc supresses these duplicate entries in the list of streets but gives a result for each. + // cf. Mdr20 "Only save a single copy of each street name." if (last != null && r.getMapIndex() == last.getMapIndex() && r.getName().equals(last.getName())) { // This has the same name (and map number) as the previous one. // Save the pointer to that one @@ -283,8 +290,6 @@ public void writeSectData(ImgFileWriter writer) { boolean hasStrings = hasFlag(MDR7_HAS_STRING); boolean hasNameOffset = hasFlag(MDR7_HAS_NAME_OFFSET); - Collator collator = sort.getCollator(); - collator.setStrength(Collator.SECONDARY); Mdr7Record last = null; for (Mdr7Record s : streets) { addIndexPointer(s.getMapIndex(), s.getIndex());
Index: src/uk/me/parabola/imgfmt/app/srt/Sort.java =================================================================== --- src/uk/me/parabola/imgfmt/app/srt/Sort.java (revision 4817) +++ src/uk/me/parabola/imgfmt/app/srt/Sort.java (working copy) @@ -424,11 +424,7 @@ int exp = (getFlags(c) >> 4) & 0xf; if (exp == 0) { - int startIndex = index; index = writePos(type, c, outKey, index); - if (index == startIndex && isMulti() && type == Collator.PRIMARY) { // nothing got written but is needed - index = writeSort(type, c + maxPrimary, outKey, index); - } } else { // now have to redirect to a list of input chars, get the list via the primary value always. int idx = getPrimary(c); @@ -882,8 +878,6 @@ expPos = 0; } else { next = getPos(type, c); - if (next == 0 && isMulti() && type == Collator.PRIMARY) // like earlier - return (c + maxPrimary) & 0xffff; } } while (next == 0);
_______________________________________________ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev