Tillman, James writes:
> Is someone already working on a perl interface to htsearch? I and a friend
> of mine are interested in doing the work, but don't want to duplicate anyone
> else's effort.
>
> What we really want is an XS module.
>
Hi,
I'm going to do that, in a way. Let me explain.
In the search/index methods there are a few different levels:
Data:
. The word database
. The document database
Functions:
a The word insertion/update/delete (indexing)
b The document parsing
c The search query parsing (building a query syntax tree)
d The query resolution (using the syntax tree to match words)
e The information retrieval (given top N matches for a query
retrieve the relevant document information)
f The information display
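To make function 'a' concrete, here is a minimal sketch of what word-database primitives could look like. It is only an in-memory illustration: the class name WordDB, its methods, and the use of integer document ids are all assumptions made for this example, not the htdig API or the planned XS interface.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hypothetical in-memory stand-in for the word database (not htdig code).
// It maps each word to the set of document ids that contain it.
class WordDB {
public:
    // Insert: record that `word` occurs in document `doc`.
    void insert(const std::string& word, int doc) {
        assert(!word.empty());  // assumption: empty words are never indexed
        postings_[word].insert(doc);
    }

    // Delete: drop one (word, doc) pair; returns false if it was not present.
    bool remove(const std::string& word, int doc) {
        auto it = postings_.find(word);
        if (it == postings_.end() || it->second.erase(doc) == 0) return false;
        if (it->second.empty()) postings_.erase(it);  // no documents left
        return true;
    }

    // Update: replace every word recorded for `doc` with the new word set.
    void update(int doc, const std::set<std::string>& words) {
        for (auto it = postings_.begin(); it != postings_.end();) {
            it->second.erase(doc);
            if (it->second.empty()) it = postings_.erase(it);
            else ++it;
        }
        for (const auto& w : words) insert(w, doc);
    }

    // Read access: documents containing `word` (empty set if unknown).
    std::set<int> lookup(const std::string& word) const {
        auto it = postings_.find(word);
        return it == postings_.end() ? std::set<int>() : it->second;
    }

    // Number of distinct words currently stored.
    std::size_t wordCount() const { return postings_.size(); }

private:
    std::map<std::string, std::set<int>> postings_;
};
```

A scripting-language binding would typically expose the same handful of calls (insert, remove, update, lookup) one to one, which is why keeping the primitive set small matters.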
I'm currently working hard on 'a' and will provide a perl XS interface
to it. It will define a set of primitives to access the word database.
I won't do anything (yet) concerning the document database. The next
step is to implement 'd'. This requires defining the syntax tree. At
present c/d are intermixed, which is very confusing. For one thing,
it prevents easy implementation of a new query syntax. Many people would
love to have AltaVista-like syntax :-)
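To illustrate why separating 'c' from 'd' helps, here is a rough sketch in which the parser only builds a syntax tree and the resolver only walks it. Everything here is hypothetical: the Node layout, the toy "and"/"or" query syntax, and the in-memory WordIndex are assumptions for the example and do not reflect the htdig or Text::Query tree structure.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <map>
#include <memory>
#include <set>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical types, not htdig's: a word index and a query syntax tree.
using DocSet = std::set<int>;
using WordIndex = std::map<std::string, DocSet>;

struct Node {
    enum Kind { Word, And, Or } kind;
    std::string word;                   // used when kind == Word
    std::unique_ptr<Node> left, right;  // used when kind is And or Or
};

std::unique_ptr<Node> makeWord(const std::string& w) {
    std::unique_ptr<Node> n(new Node());
    n->kind = Node::Word;
    n->word = w;
    return n;
}

std::unique_ptr<Node> makeOp(Node::Kind k, std::unique_ptr<Node> l,
                             std::unique_ptr<Node> r) {
    std::unique_ptr<Node> n(new Node());
    n->kind = k;
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

// Step 'c': a toy parser. Words are combined left to right, "and" / "or"
// select the operator, and adjacent words default to "and".
// Returns a null pointer for a query that contains no words.
std::unique_ptr<Node> parse(const std::string& query) {
    std::istringstream in(query);
    std::string tok;
    std::unique_ptr<Node> tree;
    Node::Kind pending = Node::And;
    while (in >> tok) {
        if (tok == "and") { pending = Node::And; continue; }
        if (tok == "or")  { pending = Node::Or;  continue; }
        std::unique_ptr<Node> leaf = makeWord(tok);
        if (tree) tree = makeOp(pending, std::move(tree), std::move(leaf));
        else      tree = std::move(leaf);
        pending = Node::And;
    }
    return tree;
}

// Step 'd': resolution only sees the tree, never the query text.
DocSet resolve(const Node& n, const WordIndex& idx) {
    if (n.kind == Node::Word) {
        auto it = idx.find(n.word);
        return it == idx.end() ? DocSet() : it->second;
    }
    assert(n.left && n.right);
    DocSet a = resolve(*n.left, idx), b = resolve(*n.right, idx), out;
    if (n.kind == Node::And)
        std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                              std::inserter(out, out.begin()));
    else
        std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                       std::inserter(out, out.begin()));
    return out;
}
```

With this split, supporting another query syntax means writing another parse function that produces the same Node tree; resolve stays untouched.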
I plan to release 'a' by Wednesday (including unit tests). Being a
co-author of the Text::Query CPAN module and author of the Text::Query-SQL
CPAN module, I already have a syntax tree structure in mind. My idea is to
be compatible with it in htdig so that searches through the Perl interface
have the same semantics as the htdig C++ search library.
If you could explain what you have in mind and what you need, we can work
together for the time needed to release the beast :-)
Cheers,
--
Loic Dachary
ECILA
100 av. du Gal Leclerc
93500 Pantin - France
Tel: 33 1 56 96 09 80, Fax: 33 1 56 96 09 61
e-mail: [EMAIL PROTECTED] URL: http://www.senga.org/