On Sun, Mar 22, 2020 at 11:46:52AM +0000, Rahul Reddy wrote:
> Thanks for the reply!
>
> I took some time to understand the utfasciitable.h and nominatim.c in the
> path module/. The entries are already such that ASCII 9-14 will be converted
> to space character.
>
> But the resultant string does not contain the character. This happens with
> other characters like @#+() etc.(these are irrelevant in search) too.
>
> I think the part
> '// assume lenngth 1, silently drop bogus characters'
> in nominatim.c is dropping these characters. Can anyone help me with this?

Are you sure that it ends up there? I would expect it to hit the first
'if ((*sourcedata & 0x80) == 0)', as these should be normal ASCII
characters in the 0-127 range. That last case is only hit when the input
is not valid UTF-8.

The juicy part where characters are skipped comes further below. Look out
for 'if (*(asciilookup + *wchardata) > 0)'.

Sarah

>
> PS: While trying to understand the table, I fixed issue
> #886 <https://github.com/openstreetmap/Nominatim/issues/886>. I'll write test
> cases and send a PR for that.
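
To make the two checks mentioned above concrete, here is a minimal sketch
of the pattern. It is not the code from nominatim.c: sketch_lookup() and
sketch_normalize() are hypothetical names, and the lookup rules are made
up for illustration (only the 9-14 to space mapping follows the
description earlier in this thread). The real table lives in
utfasciitable.h and may well differ.

```c
#include <stddef.h>

/* Hypothetical stand-in for the lookup table: returns the replacement
 * for an ASCII byte, or 0 when the character should be dropped. */
static unsigned char sketch_lookup(unsigned char c)
{
    if (c >= 9 && c <= 14)
        return ' ';                      /* range taken from the thread */
    if (c == ' ')
        return ' ';
    if (c >= 'A' && c <= 'Z')
        return (unsigned char)(c - 'A' + 'a');
    if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'))
        return c;
    return 0;                            /* e.g. @ # + ( ) have no entry */
}

/* Copies a normalized form of src into dst and returns its length.
 * dst must have room for at least strlen(src) + 1 bytes. */
static size_t sketch_normalize(const char *src, char *dst)
{
    const unsigned char *in = (const unsigned char *)src;
    size_t out = 0;

    while (*in) {
        if ((*in & 0x80) == 0) {
            /* High bit clear: a plain ASCII byte (0-127). */
            unsigned char mapped = sketch_lookup(*in);
            if (mapped > 0)
                dst[out++] = (char)mapped;
            /* mapped == 0: the character is skipped here, not in the
             * invalid-UTF-8 fallback. */
        }
        /* High bit set: part of a multi-byte sequence. This sketch just
         * skips the byte; real code would decode the sequence. */
        in++;
    }
    dst[out] = '\0';
    return out;
}
```

With these made-up rules, sketch_normalize("A\tb@#+()c", buf) leaves
"a bc" in buf: the tab becomes a space, and @#+() are dropped by the
lookup check rather than by the invalid-UTF-8 case.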

