Hi Gerd

While looking at the mdr output from Mapinstall and the differences in
the street repeat flags, mainly where there are shield/road-ref
entries, I realised my change to Sort.java for undefined sortOrder went
to far and it disrupts this ordering for Unicode.

Missing or 0 sortOrder (anything before the first "<" in sort/cp*.txt)
means the character is ignored by the key sort and collator.compare(),
so TERTIARY and EQUAL can give different answers.  

Attached is mdrUnicode_v9a.patch that generates a fake sortOrder only
for rows where no sortOrders are defined, instead of for anything in
Unicode with no/0 sortOrder.

With this patch it becomes possible for Unicode to trigger the original
problem again. Attached is mdrUnicode_v9b.patch that has the fix for
this from the previous patch.

Ticker

On Tue, 2021-11-09 at 18:13 +0000, Ticker Berkin wrote:
> Hi Gerd
> 
> Sorry about the delay, I'm away at the moment without access to the 
> sources and other tools, but:
> 
> The same pointer size (1-3 bytes) is used by Mdr5 and Mdr25. I can't 
> remember if this is determined by the actual number of Mdr5 entries
> or 
> bigger than Mdr5, then this setting needs to be investigated.
> 
> To see what happens when Mdr25 uses exact name dedupe and Mdr5 uses 
> TERTIARY collator, the best way I can think of is to remove some 
> characters from the sort/code-page such that Mdr5 will consider some 
> cities to be identical in Mdr5, then see what effect this has on 
> Find>Address.
> 
> With Mdr, I can well understand unwillingness to change anything when
> it 
> seems to be working. My attempts to switch to SECONDARY collator,
> which 
> I think is what has to happen eventually, had confusing results
> 
> Ticker
> 
> 
> On 07/11/2021 09:16, Gerd Petermann wrote:
> > Hi Ticker,
> > 
> > >   there is no logical change in behaviour to Mdr5 with this
> > > patch.
> > yes, I already wrote it's only refactoring.
> > I still try to find some "real OSM" input where the changes to the
> > other files show any difference in the mdr file.
> > 
> > I found this very old svn log entry by Steve:
> > https://www.mkgmap.org.uk/websvn/revision.php?repname=mkgmap&rev=3093
> > which says that "All cases where sorted names are compared with
> > equals() need to be changed to use Collator.compare() with the
> > SrtCollator from Sort."
> > 
> > So, I really wonder why he didn't change all  sources accordingly.
> > Maybe it was only about cities, maybe not.
> > I see only one way to find out: Have an input that has shows
> > differences in the index and find out which version gives better
> > results in the search in MapSource or - if testing is possible - on
> > a device.
> > 
> > Gerd
> > 
> > ________________________________________
> > Von: mkgmap-dev <mkgmap-dev-boun...@lists.mkgmap.org.uk> im Auftrag
> > von Ticker Berkin <rwb-mkg...@jagit.co.uk>
> > Gesendet: Samstag, 6. November 2021 18:46
> > An: Development list for mkgmap
> > Betreff: Re: [mkgmap-dev] [mkgmap-svn] Commit r4811: fix
> > java.lang.AssertionError while building index from unicode tiles
> > 
> > Hi Gerd
> > 
> > I don't know if Mdr5 was correct before, but, for European names at
> > least, I presume it has been used quite a lot and no complaints.
> > 
> > If the default strength for a Collator is TERTIARY, there is no
> > logical
> > change in behaviour to Mdr5 with this patch.
> > 
> > Ticker
> > 
> > On Sat, 2021-11-06 at 14:33 +0000, Gerd Petermann wrote:
> > > H Ticker,
> > > 
> > > but how do you know that the code in Mdr5 is right (with or
> > > without
> > > the patch)?
> > > 
> > > Gerd
> > > 
> > > ________________________________________
> > > Von: mkgmap-dev <mkgmap-dev-boun...@lists.mkgmap.org.uk> im
> > > Auftrag
> > > von Ticker Berkin <rwb-mkg...@jagit.co.uk>
> > > Gesendet: Samstag, 6. November 2021 15:29
> > > An: Development list for mkgmap
> > > Betreff: Re: [mkgmap-dev] [mkgmap-svn] Commit r4811: fix
> > > java.lang.AssertionError while building index from unicode tiles
> > > 
> > > Hi Gerd
> > > 
> > > I just hate the idea it is obviously wrong and trivially to make
> > > correct. It is unlikely it will ever fail. While it is in this
> > > inconsistent state it inhibits any addressing of the more
> > > complicated
> > > issue about how to tackle the case-differences in city names and
> > > what
> > > MdrCheck is reporting.
> > > 
> > > Ticker
> > > 
> > > On Sat, 2021-11-06 at 14:10 +0000, Gerd Petermann wrote:
> > > > Hi Ticker,
> > > > 
> > > > ok, so let's wait for that crash to get an example. I really
> > > > have
> > > > no
> > > > idea how to force it :(
> > > > According to MdrCheck the index is full of errors for the two
> > > > tiles
> > > > in China, but maybe it's MdrCheck which is wrong.
> > > > 
> > > > Gerd
> > > > 
> > > > ________________________________________
> > > > Von: mkgmap-dev <mkgmap-dev-boun...@lists.mkgmap.org.uk> im
> > > > Auftrag
> > > > von Ticker Berkin <rwb-mkg...@jagit.co.uk>
> > > > Gesendet: Samstag, 6. November 2021 12:27
> > > > An: mkgmap-dev@lists.mkgmap.org.uk
> > > > Betreff: Re: [mkgmap-dev] [mkgmap-svn] Commit r4811: fix
> > > > java.lang.AssertionError while building index from unicode
> > > > tiles
> > > > 
> > > > Hi Gerd
> > > > 
> > > > The crash in Carlos's original problem is due to the
> > > > inconsistency
> > > > in
> > > > the dedups between Mdr5/Mdr25.
> > > > 
> > > > This could be triggered with any --code-page where city names
> > > > contain
> > > > characters that exist in this character-set but are not given
> > > > sort
> > > > positions.
> > > > 
> > > > My mistake with mdrUnicode_v1/2.patch was trying to tackle the
> > > > case
> > > > problem at the same time. This is going to be much more
> > > > difficult.
> > > > 
> > > > Ticker
> > > > 
> > > > 
> > > > On Fri, 2021-11-05 at 11:00 +0000, Gerd Petermann wrote:
> > > > > Hi Ticker,
> > > > > 
> > > > > sorry for my reluctance. I simply have no test case that
> > > > > shows an
> > > > > error (search for xyz not working). If you have one please
> > > > > share
> > > > > it
> > > > > so that I can understand the importance of the patch.
> > > > > 
> > > > > I would also be happy if you could create a new branch to
> > > > > test
> > > > > further changes reg. --lower-case.
> > > > > According to MdrCheck we also produce wrong data for mdr 27
> > > > > (cities
> > > > > are not de-duped).
> > > > > Found this also with Arndts *.img data but did not yet try to
> > > > > find
> > > > > out if MdrCheck is right.
> > > > > 
> > > > > Gerd
> > > > > 
> > > > > ________________________________________
> > > > > Von: mkgmap-dev <mkgmap-dev-boun...@lists.mkgmap.org.uk> im
> > > > > Auftrag
> > > > > von Ticker Berkin <rwb-mkg...@jagit.co.uk>
> > > > > Gesendet: Freitag, 5. November 2021 11:34
> > > > > An: mkgmap-dev@lists.mkgmap.org.uk;
> > > > > mkgmap-...@lists.mkgmap.org.uk
> > > > > Betreff: Re: [mkgmap-dev] [mkgmap-svn] Commit r4811: fix
> > > > > java.lang.AssertionError while building index from unicode
> > > > > tiles
> > > > > 
> > > > > Hi Gerd
> > > > > 
> > > > > I really don't like the idea of the consistent dedup part of
> > > > > this
> > > > > patch
> > > > > not being applied (Mdr5 & Mdr25).
> > > > > The Mdr7 changes are a slight refactor and a useful comment,
> > > > > but
> > > > > has
> > > > > no
> > > > > logical effect.
> > > > > The Mdr29 changes are an assert to detect inconsistency in
> > > > > Mdr5/25
> > > > > index pointer lengths before these could cause a crash +
> > > > > .equal()/collator.compare() change that could be removed
> > > > > 
> > > > > Ticker
> > > > > 
> > > > > On Fri, 2021-11-05 at 09:02 +0000, svn commit wrote:
> > > > > > Version mkgmap-r4811 was committed by gerd on Fri, 05 Nov
> > > > > > 2021
> > > > > > 
> > > > > > fix java.lang.AssertionError while building index from
> > > > > > unicode
> > > > > > tiles
> > > > > > Changes extracted from mdrUnicode_v8.patch by Ticker Berkin
> > > > > > 
> > > > > > http://www.mkgmap.org.uk/websvn/revision.php?repname=mkgmap&rev=4811
> > > > > > _______________________________________________
> > > > > > mkgmap-svn mailing list
> > > > > > To unsubscribe send an mail to
> > > > > > mkgmap-svn-le...@lists.mkgmap.org.uk
> > > > > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-svn
> > > > > 
> > > > > 
> > > > > _______________________________________________
> > > > > mkgmap-dev mailing list
> > > > > mkgmap-dev@lists.mkgmap.org.uk
> > > > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
> > > > > _______________________________________________
> > > > > mkgmap-dev mailing list
> > > > > mkgmap-dev@lists.mkgmap.org.uk
> > > > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
> > > > 
> > > > 
> > > > _______________________________________________
> > > > mkgmap-dev mailing list
> > > > mkgmap-dev@lists.mkgmap.org.uk
> > > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
> > > > _______________________________________________
> > > > mkgmap-dev mailing list
> > > > mkgmap-dev@lists.mkgmap.org.uk
> > > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
> > > 
> > > 
> > > _______________________________________________
> > > mkgmap-dev mailing list
> > > mkgmap-dev@lists.mkgmap.org.uk
> > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
> > > _______________________________________________
> > > mkgmap-dev mailing list
> > > mkgmap-dev@lists.mkgmap.org.uk
> > > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
> > 
> > 
> > _______________________________________________
> > mkgmap-dev mailing list
> > mkgmap-dev@lists.mkgmap.org.uk
> > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
> > _______________________________________________
> > mkgmap-dev mailing list
> > mkgmap-dev@lists.mkgmap.org.uk
> > https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
> > 
> _______________________________________________
> mkgmap-dev mailing list
> mkgmap-dev@lists.mkgmap.org.uk
> https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Index: src/uk/me/parabola/imgfmt/app/mdr/Mdr25.java
===================================================================
--- src/uk/me/parabola/imgfmt/app/mdr/Mdr25.java	(revision 4817)
+++ src/uk/me/parabola/imgfmt/app/mdr/Mdr25.java	(working copy)
@@ -12,6 +12,7 @@
  */
 package uk.me.parabola.imgfmt.app.mdr;
 
+import java.text.Collator;
 import java.util.ArrayList;
 import java.util.List;
 
@@ -37,6 +38,8 @@
 	 */
 	public void sortCities(List<Mdr5Record> list) {
 		Sort sort = getConfig().getSort();
+		Collator collator = sort.getCollator();
+		collator.setStrength(Collator.TERTIARY);
 
 		List<SortKey<Mdr5Record>> keys = new ArrayList<>();
 		for (Mdr5Record c : list) {
@@ -52,9 +55,10 @@
 		for (SortKey<Mdr5Record> key : keys) {
 			Mdr5Record city = key.getObject();
 
-			if (lastCity == null || !city.getName().equals(lastCity.getName())
-					|| !city.getRegionName().equals(lastCity.getRegionName())
-					|| !city.getCountryName().equals(lastCity.getCountryName())) {
+			if (lastCity == null ||
+				collator.compare(city.getName(), lastCity.getName()) != 0 ||
+				collator.compare(city.getRegionName(), lastCity.getRegionName()) != 0 ||
+				collator.compare(city.getCountryName(), lastCity.getCountryName()) != 0) {
 				record++;
 
 				// Record in the 29 index if there is one for this record
@@ -61,8 +65,8 @@
 				Mdr14Record mdrCountry = city.getMdrCountry();
 				Mdr29Record mdr29 = mdrCountry.getMdr29();
 				String name = mdr29.getName();
-				assert mdrCountry.getName().equals(name);
-				if (!name.equals(lastName)) {
+				assert collator.compare(mdrCountry.getName(), name) == 0;
+				if (lastName == null || collator.compare(name, lastName) != 0) {
 					mdr29.setMdr25(record);
 					lastName = name;
 				}
Index: src/uk/me/parabola/imgfmt/app/mdr/Mdr29.java
===================================================================
--- src/uk/me/parabola/imgfmt/app/mdr/Mdr29.java	(revision 4817)
+++ src/uk/me/parabola/imgfmt/app/mdr/Mdr29.java	(working copy)
@@ -83,6 +83,8 @@
 		PointerSizes sizes = getSizes();
 		int size24 = sizes.getSize(24);
 		int size22 = sizes.getSize(22);
+		assert sizes.getNumberOfItems(5) >= sizes.getNumberOfItems(25)
+			: "5:" + sizes.getNumberOfItems(5) + " 25:" + sizes.getNumberOfItems(25);
 		int size25 = sizes.getSize(5);  // NB appears to be size of 5 (cities), not 25 (cities with country).
 		int size26 = has26? sizes.getSize(26): 0;
 		int size17 = Utils.numberToPointerSize(max17);
Index: src/uk/me/parabola/imgfmt/app/mdr/Mdr5.java
===================================================================
--- src/uk/me/parabola/imgfmt/app/mdr/Mdr5.java	(revision 4817)
+++ src/uk/me/parabola/imgfmt/app/mdr/Mdr5.java	(working copy)
@@ -39,6 +39,8 @@
 	private int maxCityIndex;
 	private int localCitySize;
 	private int mdr20PointerSize = 0; // bytes for mdr20 pointer, or 0 if no mdr20
+	private Sort sort;
+	private Collator collator;
 
 	// We need a common area to save the mdr20 values, since there can be multiple
 	// city records with the same global city index
@@ -46,6 +48,9 @@
 
 	public Mdr5(MdrConfig config) {
 		setConfig(config);
+		sort = config.getSort();
+		collator = sort.getCollator();
+		collator.setStrength(Collator.TERTIARY);
 	}
 
 	public void addCity(Mdr5Record record) {
@@ -62,15 +67,14 @@
 	public void preWriteImpl() {
 		allCities.trimToSize();
 		localCitySize = Utils.numberToPointerSize(maxCityIndex + 1);
-		Sort sort = getConfig().getSort();
-		genCitiesAndMdr20s(sort);
+		genCitiesAndMdr20s();
 		// calculate positions for the different road indexes
-		calcMdr20SortPos(sort);
-		calcMdr21SortPos(sort);
-		calcMdr22SortPos(sort);
+		calcMdr20SortPos();  // TODO: Maybe this can be simplified away - see mdr5-simplify-v2.patch 02-Nov-2021
+		calcMdr21SortPos();
+		calcMdr22SortPos();
 	}
 
-	private void genCitiesAndMdr20s(Sort sort) {
+	private void genCitiesAndMdr20s() {
 		List<SortKey<Mdr5Record>> sortKeys = new ArrayList<>(allCities.size());
 		Map<String, byte[]> regionCache = new HashMap<>();
 		Map<String, byte[]> countryCache = new HashMap<>();
@@ -90,7 +94,6 @@
 		sortKeys.sort(null);
 
 		cities = new ArrayList<>(sortKeys.size());
-		Collator collator = sort.getCollator();
 
 		int count = 0;
 		Mdr5Record lastCity = null;
@@ -122,7 +125,7 @@
 	/**
 	 * Calculate a position when sorting by name, region, and country- This is used for MDR20. 
 	 */
-	private void calcMdr20SortPos(Sort sort) {
+	private void calcMdr20SortPos() {
 		List<SortKey<Mdr5Record>> sortKeys = new ArrayList<>(allCities.size());
 		Map<String, byte[]> regionCache = new HashMap<>();
 		Map<String, byte[]> countryCache = new HashMap<>();
@@ -155,7 +158,7 @@
 	/**
 	 * Calculate a position when sorting by region- This is used for MDR21. 
 	 */
-	private void calcMdr21SortPos(Sort sort) {
+	private void calcMdr21SortPos() {
 		List<SortKey<Mdr5Record>> sortKeys = new ArrayList<>(allCities.size());
 		for (Mdr5Record m : allCities) {
 			if (m.getRegionName() == null) 
@@ -181,7 +184,7 @@
 	 * Calculate a position when sorting by country- This is used for MDR22. 
 	 */
 
-	private void calcMdr22SortPos(Sort sort) {
+	private void calcMdr22SortPos() {
 		List<SortKey<Mdr5Record>> sortKeys = new ArrayList<>(allCities.size());
 		for (Mdr5Record m : allCities) {
 			if (m.getCountryName() == null)
@@ -208,7 +211,6 @@
 		Mdr5Record lastCity = null;
 		boolean hasString = hasFlag(0x8);
 		boolean hasRegion = hasFlag(0x4);
-		Collator collator = getConfig().getSort().getCollator();
 		for (Mdr5Record city : cities) {
 			int gci = city.getGlobalCityIndex();
 			addIndexPointer(city.getMapIndex(), gci);
@@ -284,7 +286,7 @@
 		int val = (localCitySize - 1);
 		// String offset is only included for a mapsource index.
 		if (isForDevice() ) {
-			if (!getConfig().getSort().isMulti())
+			if (!sort.isMulti())
 				val |= 0x40; // mdr17 sub section present (not with unicode)
 		} else {
 			val |= 0x04;  // region
Index: src/uk/me/parabola/imgfmt/app/mdr/Mdr7.java
===================================================================
--- src/uk/me/parabola/imgfmt/app/mdr/Mdr7.java	(revision 4817)
+++ src/uk/me/parabola/imgfmt/app/mdr/Mdr7.java	(working copy)
@@ -52,6 +52,7 @@
 	private int partialInfoSize;
 	private Set<String> exclNames;
 	private final Sort sort;
+	private Collator collator;
 	private int maxPrefixCount;
 	private int maxSuffixCount;
 	private boolean magicIsValid;
@@ -60,6 +61,8 @@
 	public Mdr7(MdrConfig config) {
 		setConfig(config);
 		sort = config.getSort();
+		collator = sort.getCollator();
+		collator.setStrength(Collator.SECONDARY); 
 		splitName = config.isSplitName();
 		exclNames = config.getMdr7Excl();
 		codepage = sort.getCodepage();
@@ -217,8 +220,6 @@
 		// list is now sorted by partial name only, we have to group by name and map index now
 		String lastPartial = null;
 		List<Mdr7Record> samePartial = new ArrayList<>();
-		Collator collator = sort.getCollator();
-		collator.setStrength(Collator.SECONDARY);
 		for (int i = 0; i < sorted.size(); i++) {
 			Mdr7Record r = sorted.get(i);
 			String partial = r.getPartialName();
@@ -264,6 +265,12 @@
 		// per map for the same name.
 		for (int i = 0; i < samePartial.size(); i++) {
 			Mdr7Record r = samePartial.get(i);
+			//if (last != null && r.getMapIndex() == last.getMapIndex() && collator.compare(r.getName(), last.getName()) == 0) {
+			// 28Oct2021 Ticker:
+			// Maybe this should use above "collator.compare(x,y)==0" rather than following "x.equals(y)", but this causes
+			// significant differences to various MDR sections when there is the same street name but with different shields
+			// The eTrex Legend HXc supresses these duplicate entries in the list of streets but gives a result for each.
+			// cf. Mdr20 "Only save a single copy of each street name."
 			if (last != null && r.getMapIndex() == last.getMapIndex() && r.getName().equals(last.getName())) {
 				// This has the same name (and map number) as the previous one.
 				// Save the pointer to that one
@@ -283,8 +290,6 @@
 	public void writeSectData(ImgFileWriter writer) {
 		boolean hasStrings = hasFlag(MDR7_HAS_STRING);
 		boolean hasNameOffset = hasFlag(MDR7_HAS_NAME_OFFSET);
-		Collator collator = sort.getCollator();
-		collator.setStrength(Collator.SECONDARY); 
 		Mdr7Record last = null;
 		for (Mdr7Record s : streets) {
 			addIndexPointer(s.getMapIndex(), s.getIndex());
Index: src/uk/me/parabola/imgfmt/app/srt/Sort.java
===================================================================
--- src/uk/me/parabola/imgfmt/app/srt/Sort.java	(revision 4817)
+++ src/uk/me/parabola/imgfmt/app/srt/Sort.java	(working copy)
@@ -424,11 +424,7 @@
 
 			int exp = (getFlags(c) >> 4) & 0xf;
 			if (exp == 0) {
-				int startIndex = index;
 				index = writePos(type, c, outKey, index);
-				if (index == startIndex && isMulti() && type == Collator.PRIMARY) { // nothing got written but is needed
-					index = writeSort(type, c + maxPrimary, outKey, index);
-				}
 			} else {
 				// now have to redirect to a list of input chars, get the list via the primary value always.
 				int idx = getPrimary(c);
@@ -882,8 +878,6 @@
 								expPos = 0;
 						} else {
 							next = getPos(type, c);
-							if (next == 0 && isMulti() && type == Collator.PRIMARY) // like earlier
-								return (c + maxPrimary) & 0xffff;
 						}
 
 					} while (next == 0);
_______________________________________________
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
https://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Reply via email to