I originally posted this question to the OSM discussion forum, but someone there suggested that I ask it on the geocoding list instead. So, here goes:
I'm using Nominatim to reverse-geocode natural language location descriptions for a research project. I spent some time looking through the source code (in particular, website/search.php), but I can't make heads or tails of how the "importance" score is calculated. From what I can tell, there is some baseline calculation followed by numerous tweaks. One line, for example, says:

    $aResult['importance'] = $aResult['importance'] + ($iCountWords*0.1); // 0.1 is a completely arbitrary number but something in the range 0.1 to 0.5 would seem right

I also noticed in the documentation that Nominatim uses Wikipedia to improve the ranking of results, but once again I found nothing specific beyond "the importance value is calculated as log(totalcount)/log(max totalcount)." I assume that "totalcount" is the number of internal links to an article about a specific location in the result set, and that "max totalcount" is the maximum of that value across the entire result set. But this only tells me the scoring contribution from Wikipedia, not how the baseline score is calculated.

My question is: which properties of the OSM data go into the calculation, and how is the importance score actually calculated? What special tweaks and thresholds should I be aware of?

-Alex
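P.S. To make my current reading concrete, here is a minimal Python sketch of the two pieces I quoted above. It is only my interpretation, not Nominatim's actual code: the function names are made up for illustration, the meaning I give to "totalcount" is the assumption stated above, and the guard against counts below 1 is my own addition so that the logarithm stays defined.

```python
import math


def wikipedia_importance(totalcount, max_totalcount):
    """log(totalcount) / log(max totalcount), as quoted from the documentation.

    Assumption: totalcount is the number of links to the article and
    max_totalcount is the largest such count being compared against.
    """
    if totalcount < 1 or max_totalcount <= 1:
        # Guard added for this sketch: avoid log(0) and division by log(1) == 0.
        return 0.0
    return math.log(totalcount) / math.log(max_totalcount)


def adjusted_importance(importance, count_words, weight=0.1):
    """Mirror of the search.php line: importance + ($iCountWords * 0.1)."""
    return importance + count_words * weight


if __name__ == "__main__":
    base = wikipedia_importance(100, 10000)
    print(base)
    print(adjusted_importance(base, 2))
```

What I cannot reconstruct from the source is where the starting value of "importance" comes from when there is no Wikipedia match, or in what order the tweaks are applied, so the sketch deliberately leaves both out.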
_______________________________________________ Geocoding mailing list [email protected] http://lists.openstreetmap.org/listinfo/geocoding

