(I've been trying to write this for weeks and keep getting distracted.)

I have been working on a geographic search engine, and as I mentioned in
my earlier "features" message, I am getting fed up with my Perl robot, and
am trying to use htdig.

I have been somewhat successful and have a version without map
navigation at http://geotags.com/htdig/

My question is really how much support, if any, I might get from the htdig
community for this, either as mainstream htdig or from anyone else
interested in this kind of thing.

The basic idea is that, given a set of Web pages describing
physical objects such as restaurants, bridges, parks etc.,
the search engine is capable of finding the nearest one.
This requires metadata on the page explicitly giving the
position being described (though it can sometimes be guessed
from things like US zipcodes in the text), and a search algorithm that
can score according to a geographic distance.
The search algorithm is not a true geographic one (for example,
"show me all objects completely or partially inside this polygon") but
rather a modification of a text search, e.g. "pizza AND restaurant SORTBY
distance".
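
To make the "SORTBY distance" idea concrete, here is a rough sketch in
Python rather than htdig's own C++. It assumes each page that matched the
text query already has a stored latitude/longitude, and it uses the
haversine great-circle formula, which is my choice for illustration and not
necessarily what the htdig patch does. The function names, the example URLs
and the Earth radius constant are all made up for this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sort_by_distance(matches, origin):
    """Order text-search matches by distance from origin, a (lat, lon) pair."""
    return sorted(
        matches,
        key=lambda m: distance_km(origin[0], origin[1], m["lat"], m["lon"]),
    )


# Hypothetical pages that already matched "pizza AND restaurant".
matches = [
    {"url": "http://example.com/a", "lat": 49.28, "lon": -123.12},
    {"url": "http://example.com/b", "lat": 47.61, "lon": -122.33},
]
nearest_first = sort_by_distance(matches, origin=(47.60, -122.30))
```

The text match decides which pages are candidates at all; the distance only
decides their order. A real scorer might instead blend distance with the
existing word weights.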



The changes required (so far) are:

A database element to store a position for each page
  (I actually store a region code and placename too but don't use them)

An addition to the HTML parser to get the metadata

An addition to the CGI parser to get a requested position (map click)

A weighting algorithm to calculate geographic distance
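
As an illustration of the parser addition, the following Python sketch
pulls a position out of a page's META tags. It assumes the tag looks like
<meta name="geo.position" content="lat;long">; the tag name and the
semicolon-separated format are assumptions for this example, and the class
and function names are hypothetical. The real change would live in htdig's
C++ HTML parser.

```python
from html.parser import HTMLParser


class GeoMetaParser(HTMLParser):
    """Collects the first valid geo.position META tag (assumed format "lat;long")."""

    def __init__(self):
        super().__init__()
        self.position = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.position is not None:
            return
        a = dict(attrs)
        if (a.get("name") or "").lower() != "geo.position":
            return
        try:
            lat, lon = (float(x) for x in (a.get("content") or "").split(";"))
        except ValueError:
            return  # malformed content: treat the page as untagged
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            self.position = (lat, lon)


def extract_position(html):
    """Return (lat, lon) from the page's metadata, or None if there is none."""
    parser = GeoMetaParser()
    parser.feed(html)
    return parser.position
```

A page with no usable tag yields None, so the caller can tell tagged and
untagged pages apart.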

I also have a config item to essentially force a ROBOTS NONE if there
is no geographic tag on a page, so that I can refrain from indexing
untagged pages.

I am also trying to add support for position passed in an experimental
HTTP header, which allows one to dispense with the map and potentially
generate requests based on current position automatically, e.g.
using GPS.

This is essentially what is online at http://geotags.com/htdig/, using a
fixed map. I will probably create custom templates and
then try to run this version of htsearch using the original Perl as
a wrapper, to enable the map zoom/pan features, which requires
scaling the map clicks according to the currently displayed map.
I could either do the mapclick scaling externally, passing pure
Lat/Long to htsearch, or pass the current map extents to htsearch
and do it internally.
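
Either way, the scaling itself is small. A minimal sketch in Python,
assuming a simple equirectangular map (latitude and longitude vary linearly
with pixel position), pixel coordinates with the origin at the top left,
and a map that does not cross the 180 degree meridian. The function and
parameter names are hypothetical.

```python
def click_to_latlon(x, y, width, height, west, south, east, north):
    """Convert a pixel click on the displayed map to (lat, lon).

    width and height are the image size in pixels; west, south, east and
    north are the extents of the currently displayed map in degrees.
    """
    lon = west + (x / width) * (east - west)
    lat = north - (y / height) * (north - south)  # pixel y grows downwards
    return lat, lon


# A click in the centre of a 400x200 map covering 130W..110W, 40N..50N.
centre = click_to_latlon(200, 100, 400, 200, -130.0, 40.0, -110.0, 50.0)
```

Doing this in the wrapper keeps htsearch ignorant of the map; doing it
inside htsearch means passing the four extents along with the click.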

Doing the whole thing including the map manipulations and
graphical markup is somewhat more complicated, and might
require hardwiring some map-dependent code, unless it could be done in
templates. I haven't really looked at that yet.

-- 
Andrew Daviel
geotags.com etc.



