On Sun, Mar 22, 2020 at 11:46:52AM +0000, Rahul Reddy wrote:
> Thanks for the reply!
> 
> I took some time to understand the utfasciitable.h and nominatim.c in the 
> path module/. The entries are already such that ASCII 9-14 will be converted 
> to space character.
> 
> But the resulting string does not contain the space character. This also 
> happens with other characters like @#+() etc. (these are irrelevant in search).
> 
> I think the part
> '// assume lenngth 1, silently drop bogus characters'
> in nominatim.c is dropping these characters. Can anyone help me with this?

Are you sure that it ends up there? I would expect it to hit the
first branch, 'if ( ((*sourcedata & 0x80) == 0)', as these should be
normal ASCII characters in the 0-127 range.

That last case is only hit when the input is not valid UTF-8.
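
To illustrate the branching, here is a minimal sketch that assumes
only the usual UTF-8 lead-byte rules. The function name
utf8_lead_length is hypothetical and is not taken from nominatim.c.
It only shows why bytes such as tab, '@' or '(' take the first (ASCII)
branch, and which bytes would fall through to a final "bogus" case.

```c
#include <stdio.h>

/* Hypothetical helper, NOT code from nominatim.c.
 * Returns the sequence length implied by a UTF-8 lead byte, or 0 when
 * the byte cannot start a sequence. It is deliberately simplified: it
 * does not reject overlong or out-of-range lead bytes such as 0xC0
 * or 0xF5. */
static int utf8_lead_length(unsigned char c)
{
    if ((c & 0x80) == 0x00) return 1;  /* 0xxxxxxx: plain ASCII, 0-127 */
    if ((c & 0xE0) == 0xC0) return 2;  /* 110xxxxx: two-byte sequence */
    if ((c & 0xF0) == 0xE0) return 3;  /* 1110xxxx: three-byte sequence */
    if ((c & 0xF8) == 0xF0) return 4;  /* 11110xxx: four-byte sequence */
    return 0;                          /* continuation or invalid byte */
}

/* Prints the classification of one byte, for quick experiments. */
static void utf8_print_lead(unsigned char c)
{
    printf("0x%02X -> %d\n", (unsigned)c, utf8_lead_length(c));
}
```

Calling utf8_print_lead('@') or utf8_print_lead('\t') reports length 1,
so under these assumptions such characters never reach the fallback.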

The juicy part where characters are skipped comes further below.
Look out for 'if (*(asciilookup + *wchardata) > 0)'.
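
A rough sketch of how such a lookup-based filter can drop characters,
assuming a table where entry 0 means "no replacement". The names
demo_lookup, demo_init and demo_filter are hypothetical, and the table
contents are invented for illustration; they are not the contents of
utfasciitable.h.

```c
#include <stddef.h>

/* Hypothetical miniature table, NOT the real asciilookup:
 * entry 0 means "skip this character", anything else is the output byte. */
static unsigned char demo_lookup[128];

/* Fill the demo table: letters map to lowercase, whitespace control
 * characters 9-13 and the space map to ' ', everything else stays 0. */
static void demo_init(void)
{
    for (int c = 'a'; c <= 'z'; c++) demo_lookup[c] = (unsigned char)c;
    for (int c = 'A'; c <= 'Z'; c++) demo_lookup[c] = (unsigned char)(c - 'A' + 'a');
    for (int c = 9; c <= 13; c++) demo_lookup[c] = ' ';
    demo_lookup[' '] = ' ';
}

/* Copy 'in' to 'out', keeping only bytes whose table entry is > 0.
 * Returns the number of bytes written, excluding the terminator. */
static size_t demo_filter(const unsigned char *in, char *out, size_t outsize)
{
    size_t n = 0;
    if (outsize == 0) return 0;
    for (; *in != '\0' && n + 1 < outsize; in++) {
        if (*in < 128 && demo_lookup[*in] > 0)
            out[n++] = (char)demo_lookup[*in];
        /* else: entry is 0 (or byte is non-ASCII), so it is skipped */
    }
    out[n] = '\0';
    return n;
}
```

With this demo table, '@', '(' and ')' have entry 0 and vanish from the
output, while a tab is replaced by a space. Checking the real table
entry for the character in question is the analogous step in the
actual code.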

Sarah

> 
> PS: While trying to understand the table, I fixed issue 
> #886<https://github.com/openstreetmap/Nominatim/issues/886>. I'll write test 
> cases and send a PR for that.

_______________________________________________
Geocoding mailing list
[email protected]
https://lists.openstreetmap.org/listinfo/geocoding
