Geoff Hutchison wrote:
> * Document Excerpts moved to separate DB                        Incomplete (need compression turned on)
> * Word DB conversion                                            Incomplete (mostly in place, a few prob.)

These should each need only a few more lines of code. I'm having some
weird problems, but I expect that after taking a break from it, I'll get
to these quickly.

> * Regex fuzzy                                                   Incomplete
> * Speling fuzzy                                                 Incomplete
> * Transport rewrite                                             Incomplete
>    ExternalTransport                                            Not begun (need API)
> * Trigram fuzzy                                                 Not begun (short)
> * Generate a list of all documents                              Not begun (very short)
> * HtTools                                                       Not begun (medium)

None of these are particularly hard. It would be nice to work out an API
for external transport scripts. What *do* we need? Date and size,
certainly, but anything else?
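To make the question concrete, here is a rough sketch of what a minimal contract could look like, assuming the external script prints a few "name: value" header lines, a blank line, and then the document body. The header names ("size", "modified"), the date format, and the function and class names are all hypothetical; nothing here is an agreed API.

```python
from dataclasses import dataclass
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional


@dataclass
class TransportResult:
    size: int                      # document size in bytes
    modified: Optional[datetime]   # last-modified date, if the script gave one
    body: str                      # the document itself


def parse_transport_output(raw: str) -> TransportResult:
    """Parse the (hypothetical) output of an external transport script."""
    # Headers and body are separated by the first blank line.
    head, _, body = raw.partition("\n\n")

    fields = {}
    for line in head.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # ignore lines that are not "name: value"
            fields[name.strip().lower()] = value.strip()

    # Assumption: dates use the RFC 2822 style that HTTP servers send.
    modified = None
    if "modified" in fields:
        modified = parsedate_to_datetime(fields["modified"])

    # If the script does not report a size, fall back to measuring the body.
    size = int(fields.get("size", len(body.encode("utf-8"))))
    return TransportResult(size=size, modified=modified, body=body)
```

Anything else we decide we need (content type, a status code, and so on) would just be another header line in a scheme like this.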

> * UTF-8/Unicode support                                         ?

Unless I hear someone volunteer for this ASAP, I'm cutting this from the
list of 3.2 goals.

> * Character-Set translation                                     ?

This doesn't need to be hard--just use HtWordCodec to load in
translation tables. But it depends on the decision for the above...
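As an illustration of the table-driven idea only (this is not HtWordCodec's interface), the sketch below loads a byte-to-byte translation table from text lines and applies it. The table file format, one "source destination" pair of hex byte values per line, is an assumption made up for the example.

```python
def load_table(lines):
    """Build a 256-entry byte translation table from 'src dst' hex pairs."""
    # Start from the identity mapping so unlisted bytes pass through unchanged.
    table = bytearray(range(256))
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        src, dst = line.split()
        table[int(src, 16)] = int(dst, 16)
    return bytes(table)


def translate(data: bytes, table: bytes) -> bytes:
    """Apply the table to every byte of the input."""
    return data.translate(table)


# Example: fold Latin-1 e-acute (0xE9) to plain 'e' (0x65).
table = load_table(["0xE9 0x65  # e-acute -> e"])
print(translate(b"caf\xe9", table))
```

A single-byte table like this cannot express multi-byte encodings, which is why it depends on the UTF-8/Unicode decision.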

> * Detection of duplicate documents while indexing               Not begun (short)

I've volunteered to add the code for this, but 
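For reference, one simple way to approach this is to hash each document's content as it is indexed and remember the first URL seen for each digest. The sketch below assumes exact byte-for-byte duplicates and an in-memory table; the class and method names are hypothetical, and it is not the code planned for ht://Dig.

```python
import hashlib


class DuplicateDetector:
    """Remember a digest of every document seen so far."""

    def __init__(self):
        self._first_url = {}  # content digest -> first URL with that content

    def check(self, url, content: bytes):
        """Return the URL this content duplicates, or None if it is new."""
        digest = hashlib.sha256(content).hexdigest()
        # setdefault stores url only if the digest has not been seen before.
        first = self._first_url.setdefault(digest, url)
        return None if first == url else first


detector = DuplicateDetector()
print(detector.check("http://a.example/index.html", b"<html>same</html>"))
print(detector.check("http://a.example/", b"<html>same</html>"))
```

Near-duplicates (pages that differ only in a timestamp or a banner) would need a fuzzier comparison than a straight hash.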

> * External Decoders                                             ?
> * Documentation / Website changes                               ?
> * Distributed queries / Database collections                    ?
> * Configuration changes                                         ?
> * URL weighting factors (e.g. server A gets 'boost')            ?
>         indexing of URL text                                    ?
> * Search 'similar'                                              ?
> * Shared libraries for distinct functionalities                 ?

These are marked as '?' because I don't know the status on them.
Offhand, I'd assume they're not begun. I'll add in three I forgot:

* htsearch template enhancement to allow %(WORDS)                 ?
* Search based on date range                                      ?
* Order indexing based on server response time and hopcount       Not begun
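The last item could be sketched as a priority queue over pending URLs. The ordering used here (lowest hopcount first, then fastest server) is only one possible policy and is an assumption, as are the class name and the example URLs.

```python
import heapq
import itertools


class CrawlQueue:
    """Pending URLs ordered by hopcount, then by server response time."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so entries with equal keys come out in insertion order.
        self._counter = itertools.count()

    def push(self, url, hopcount, response_time):
        entry = (hopcount, response_time, next(self._counter), url)
        heapq.heappush(self._heap, entry)

    def pop(self):
        """Return the next URL to index."""
        return heapq.heappop(self._heap)[-1]

    def __len__(self):
        return len(self._heap)


queue = CrawlQueue()
queue.push("http://slow.example/a", hopcount=1, response_time=2.5)
queue.push("http://fast.example/b", hopcount=1, response_time=0.2)
queue.push("http://fast.example/", hopcount=0, response_time=0.9)
while queue:
    print(queue.pop())
```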

Obviously, I'd like to be surprised by 'hey we should add this feature,
I volunteer' or 'here's some code to do X.' But unless I hear some
status updates, I'm probably going to axe some of these from the list
for 3.2.

As for Loic's request about "Shared libraries for distinct
functionalities," I'm not sure whether it should go into 3.2. At this
point, I'll say this much: I don't have a timetable, but anything I
marked '?' is a project that I felt would be nice but that shouldn't
hold up the release.

-Geoff
