Loic:  I guess we're approaching this work from different viewpoints
(although this doesn't lessen my enthusiasm for working with you!).  I was
concentrating more on allowing htDig to do its thing with the indexing and
just providing Perl scripters with a module for querying.  The idea was to
come up with a compatible data structure for returning a "result set" from
the C++ querying class to Perl after a search had been run, and then
allowing Perl to request the desired document data at will.
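
To make the "result set" idea a bit more concrete, here is a rough sketch of
the C++ side.  Every name in it (ResultSet, Match, DocumentData, Loader) is
made up for illustration and is not an existing htDig class; the only point
is that the set carries lightweight matches and fetches document data only
when asked.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; none of these are existing htDig classes.

// Full document data, fetched only when the caller asks for it.
struct DocumentData {
    std::string url;
    std::string title;
    std::string excerpt;
};

// One hit in the result set: just an identifier and a relevance score.
struct Match {
    int doc_id;
    double score;
};

// A result set that holds lightweight matches and resolves document
// data lazily through a caller-supplied loader.
class ResultSet {
public:
    using Loader = std::function<DocumentData(int doc_id)>;

    ResultSet(std::vector<Match> matches, Loader loader)
        : matches_(std::move(matches)), loader_(std::move(loader)) {}

    std::size_t size() const { return matches_.size(); }

    // Access the i-th match without touching any document data.
    const Match& match(std::size_t i) const {
        assert(i < matches_.size());
        return matches_[i];
    }

    // Fetch the document behind the i-th match on demand.
    DocumentData document(std::size_t i) const {
        return loader_(match(i).doc_id);
    }

private:
    std::vector<Match> matches_;
    Loader loader_;
};
```

A Perl binding could then expose size(), match() and document() more or less
one-to-one, so a script would pay for document lookups only on the hits it
actually displays.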

Although I didn't have any plans to rework the query syntax or directly
access the htDig database, my current plans to "class out" the query parser
would make providing an interface to Perl users a snap.  We could simply
wrap the class in XS and let Perl access the query parser directly.  Again,
what I wanted to do was make use of what code htDig already had and get my
modifications rolled into the distribution.  That way, when changes are
made, I won't have as much work to do.  Internal processes change often,
but interfaces (i.e., public class methods and properties) should seldom
change.
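
As an illustration of what I mean by a stable interface, here is a minimal
sketch of a "classed out" parser.  It is not the htsearch parser and does not
follow its real query syntax; QueryParser, Token and TokenKind are
hypothetical names, and the and/or/not keywords are just an assumption for
the example.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch; not the actual htsearch parser or its syntax.
enum class TokenKind { Word, And, Or, Not };

struct Token {
    TokenKind kind;
    std::string text;
};

class QueryParser {
public:
    // Public interface: the part a wrapper (XS or otherwise) would bind to.
    std::vector<Token> parse(const std::string& query) const {
        std::vector<Token> tokens;
        std::istringstream in(query);
        std::string word;
        while (in >> word) {
            tokens.push_back(classify(lowercase(word)));
        }
        return tokens;
    }

private:
    // Internal helpers: free to change without affecting callers.
    static std::string lowercase(std::string s) {
        for (char& c : s) {
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        }
        return s;
    }

    static Token classify(const std::string& word) {
        assert(!word.empty());  // stream extraction never yields an empty word
        if (word == "and") return {TokenKind::And, word};
        if (word == "or") return {TokenKind::Or, word};
        if (word == "not") return {TokenKind::Not, word};
        return {TokenKind::Word, word};
    }
};
```

Only parse() is visible to callers, so the tokenizing details behind it can
be rewritten without breaking anything built on top of the class.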

After lunch, I should be able to give you a better breakdown of my current
status on the "classing out" of the htsearch system.  I also want to let
Geof know where I'm at on this, since he was pretty psyched about the
effort.

Oh, and on the subject of the database indexing: I had figured on the system
not really being able to handle the database fields, and was anticipating
some sort of <meta> tag usage for providing indexing capability on the data
fields.  What would really make things easier is the indexing of XML in
addition to HTML, which would make indexing individual records of data, and
enumerations of primary keys, very simple.  As long as your URL handler
returned valid XML, it could be indexed and searched and linked to the
primary key's "URL".  Much better, IMHO, than trying to retrofit the system
for database-specific structures.  This is a text indexer, after all.
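
A toy sketch of that idea, assuming a URL handler hands back one small XML
record per primary key: the text content is indexed word by word and every
word points back at the record's URL.  RecordIndex and the db:// URLs in the
usage note are invented for the example, and the tag handling is deliberately
naive (no entities, CDATA, comments or attributes), so treat it as an
illustration rather than a real XML parser.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <set>
#include <string>

// Hypothetical sketch: a toy inverted index keyed by word, where each
// record is addressed by the URL of its primary key.
class RecordIndex {
public:
    // Index the text content of one XML record under the given URL.
    void add(const std::string& url, const std::string& xml) {
        assert(!url.empty());
        std::string word;
        bool in_tag = false;
        for (char raw : xml) {
            unsigned char c = static_cast<unsigned char>(raw);
            if (c == '<') {
                flush(word, url);  // a tag ends the current word
                in_tag = true;
            } else if (c == '>') {
                in_tag = false;
            } else if (!in_tag && std::isalnum(c)) {
                word += static_cast<char>(std::tolower(c));
            } else if (!in_tag) {
                flush(word, url);  // whitespace or punctuation ends a word
            }
        }
        flush(word, url);
    }

    // URLs of all records whose text contains the (lowercase) word.
    std::set<std::string> search(const std::string& word) const {
        auto it = postings_.find(word);
        return it == postings_.end() ? std::set<std::string>{} : it->second;
    }

private:
    // Record the finished word, if any, and reset the buffer.
    void flush(std::string& word, const std::string& url) {
        if (!word.empty()) {
            postings_[word].insert(url);
            word.clear();
        }
    }

    std::map<std::string, std::set<std::string>> postings_;
};
```

For instance, adding "<record><name>Ada Lovelace</name></record>" under a
made-up URL such as "db://customers/42" makes search("ada") return that URL,
while tag names like "record" are never indexed.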

More later,

Jamie

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> Sent: Monday, September 27, 1999 9:14 AM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: RE: [htdig3-dev] Perl interface
> 
> 
> 
>  > 
>  > I hadn't looked yet at the possibility of letting perl actually do the
>  > document parsing and insertion of data.  I was mainly concentrating on a
>  > search interface, but I'm certainly interested in that!
>  > 
> 
>  Providing an interface to the word database would let a perl program
> parse and index data. I think it will more commonly be used to search
> for words instead.
> 
>  > has promised the use of loadable modules for document parsing, so we could
>  > probably provide some guidance on how best to allow Perl module developers
>  > to write their parsing modules as loadables.  Then you'd be able to hook
> 
>  Yes, that will be a great feature.
> 
>  > your doc parsers right into the current htdig "digger" without having to
>  > write your own "digger".  He's also promised URL handlers which would allow
>  > us to define our own handlers for certain "dig" locations, another
>  > possibility for perl, and even for DBI to index databases.
> 
>  This entirely depends on the versatility of the word database. At present
> htdig has a word database structure that does not fit the needs of database
> field indexing well. The abstract definition of the WordKey class is here
> to make it possible to mutate the word database schema without losing
> efficiency. It's a compile time mutation (examples in the test directory).
> 
>           Cheers,
> 
> -- 
>               Loic Dachary
> 
>               ECILA
>               100 av. du Gal Leclerc
>               93500 Pantin - France
>               Tel: 33 1 56 96 09 80, Fax: 33 1 56 96 09 61
>               e-mail: [EMAIL PROTECTED] URL: http://www.senga.org/
> 
> 
> ------------------------------------
> To unsubscribe from the htdig3-dev mailing list, send a message to
> [EMAIL PROTECTED] containing the single word "unsubscribe" in
> the SUBJECT of the message.
> 
