Hi,
I have written test cases in test/bdd, but I found another problem while
doing so. The setNearPointFromQuery function, which detects LatLon pairs,
runs as a separate step. This causes the last two examples in the
following scenario to fail.
    Scenario Outline: Search with white space characters
        When sending json search query "<data>"
        Then exactly 1 result is returned

    Examples:
        | data |
        | amerlugalpe, N 47.15739° E 9.61264° |
        | amerlugalpe, N 47.15739° E 9.61264° |
        | amerlugalpe , N 47.15739° E 9.61264° |
        | amerlugalpe, N 47.15739° E 9.61264° |
        | amerlugalpe, N 47.15739° E 9.61264° |

This could be fixed either by using preg_replace inside the
setNearPointFromQuery function in SearchContext.php, or by applying
preg_replace to $sQuery. The former fixes LatLon detection, but the main
query string will still contain those characters.
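
To make the difference concrete, here is a minimal Python sketch of the second option (clean the whole query first, then detect the coordinates). It is only an illustration: the LATLON pattern is a simplified, hypothetical stand-in for what setNearPointFromQuery does, and the helper names are made up.

```python
import re

# Control whitespace in the range %09-%0d (tab, LF, VT, FF, CR).
CONTROL_WS = re.compile(r"[\x09-\x0d]")

# Hypothetical, simplified pattern for "N 47.15739° E 9.61264°".
# It only accepts plain spaces between the parts; the real
# setNearPointFromQuery supports many more formats.
LATLON = re.compile(r"([NS]) *(\d+\.\d+)°? *([EW]) *(\d+\.\d+)°?")


def normalize_whitespace(query):
    """Replace each control whitespace character with a plain space."""
    return CONTROL_WS.sub(" ", query)


def extract_near_point(query):
    """Return ((lat, lon), remaining query), or (None, query) if no pair is found."""
    m = LATLON.search(query)
    if m is None:
        return None, query
    lat = float(m.group(2)) * (1 if m.group(1) == "N" else -1)
    lon = float(m.group(4)) * (1 if m.group(3) == "E" else -1)
    rest = (query[:m.start()] + " " + query[m.end():]).strip()
    return (lat, lon), rest


raw = "amerlugalpe,\tN 47.15739°\x0bE 9.61264°"
point, rest = extract_near_point(normalize_whitespace(raw))
```

Because the cleanup runs on the whole query before detection, both the coordinate pair and the remaining search text end up free of those characters.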
Which approach should I follow? Or should I ignore this, since it only
affects the LatLon part, which would not normally contain other white
space characters?
Regards,
Rahul
On 01/04/20 11:42 am, Sarah Hoffmann wrote:
Hi Rahul,
On Wed, Apr 01, 2020 at 05:36:00AM +0530, K Rahul Reddy wrote:
For issue #967 <https://github.com/osm-search/Nominatim/issues/967>, these
are the points I have found so far:
In lookup() in Geocode.php:
1) sNormQuery is built using PHP's Transliterator.
2) The normalization function make_standard_name is applied to the phrases
in line 630. This is an SQL function which returns
trim(public.gettokenstring(public.transliteration(name))).
We need to replace the %09-%0d characters in the phrases. This can be done
simply by adding
$sPhrase = preg_replace('/[\x09-\x0d]/', ' ', $sPhrase);
before the normalization function is called.
3) The other solution would be to change the normalization (this breaks
the DB). transliteration() uses utfasciitable.h. Changing UTFASCIILOOKUP
so that the elements at positions 9-13 are replaced by '2' does the job.
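
For reference, the replacement from point 2 can be sketched in Python as below. This is only an illustration, not the Nominatim code, and the function name is made up. One detail worth noting: inside a character class `|` is a literal character, so a range is the safer way to write the class.

```python
import re


def replace_control_whitespace(phrase):
    """Replace tab, LF, VT, FF and CR (0x09-0x0d) with a plain space."""
    return re.sub(r"[\x09-\x0d]", " ", phrase)


cleaned = replace_control_whitespace("amerlugalpe,\tN 47.15739°\rE 9.61264°")

# A class that lists the characters separated by "|" also matches a literal pipe:
with_pipe = re.sub(r"[\x09|\x0a]", " ", "a|b")
```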
I have tested both approaches, and both seem to work as expected. What
should I do now?
Go for solution 3). It is true that it breaks the DB but only for places
that have characters %09-%0d in their name. That's basically data that is
broken in the OSM database already and should be fixed. Therefore it is
okay to make an exception to the rule not to change the normalization.
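
As a rough picture of what solution 3) amounts to, here is a small Python sketch of a lookup-table transliteration. Everything in it is hypothetical: the table layout, the output list and the assumption that entry 2 stands for a single space are made up for illustration, and the real definitions live in utfasciitable.h.

```python
# Hypothetical output table: entries 0 and 1 produce nothing, entry 2 is
# assumed to be a single space, the rest are lower-cased printable ASCII.
OUTPUTS = ["", "", " "] + [chr(c).lower() for c in range(0x21, 0x7F)]
SPACE_ENTRY = 2


def build_lookup():
    """Build a toy lookup table: code point -> index into OUTPUTS."""
    lookup = [0] * 128            # by default a character is dropped
    lookup[0x20] = SPACE_ENTRY    # a plain space stays a space
    for cp in range(0x21, 0x7F):
        lookup[cp] = cp - 0x21 + 3
    return lookup


def patch_control_whitespace(lookup):
    """Point positions 9-13 (tab, LF, VT, FF, CR) at the space entry."""
    patched = list(lookup)
    for cp in range(9, 14):
        patched[cp] = SPACE_ENTRY
    return patched


def transliterate(text, lookup):
    """Map each ASCII character through the table; drop everything else."""
    return "".join(OUTPUTS[lookup[ord(ch)]] for ch in text if ord(ch) < 128)


before = build_lookup()
after = patch_control_whitespace(before)
```

With the unpatched toy table the control characters simply vanish and the neighbouring words are glued together; with the patched table they become word separators.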
Cheers
Sarah
_______________________________________________
Geocoding mailing list
https://lists.openstreetmap.org/listinfo/geocoding