handling token created/deleted events in an Index

Mathieu Lecarme Mon, 16 Jun 2008 07:56:07 -0700

With LUCENE-1297, the SpellChecker will be able to choose how to estimate the distance between two words.
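To make the idea of a pluggable distance concrete, here is a minimal sketch. The names WordSimilarity and EditDistanceSimilarity are made up for illustration and are not the API from the LUCENE-1297 patch; the edit distance used here is just one possible strategy.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical strategy interface (an assumption, not Lucene's API).
interface WordSimilarity {
    // Returns a score in [0, 1]; 1 means the two words are identical.
    float similarity(String a, String b);
}

// One possible strategy: Levenshtein edit distance, normalized by the longer word.
class EditDistanceSimilarity implements WordSimilarity {
    public float similarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1f; // two empty words are identical
        }
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // cost of building b's prefix from an empty word
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; // the row just computed becomes the previous row
            prev = curr;
            curr = tmp;
        }
        return 1f - (float) prev[b.length()] / maxLen;
    }

    // Picks the candidate with the highest score under the given strategy, or null if none.
    static String closest(String word, List<String> candidates, WordSimilarity strategy) {
        String best = null;
        float bestScore = -1f;
        for (String candidate : candidates) {
            float score = strategy.similarity(word, candidate);
            if (score > bestScore) {
                bestScore = score;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        WordSimilarity strategy = new EditDistanceSimilarity();
        List<String> candidates = Arrays.asList("lucene", "luxury", "scene");
        System.out.println(closest("lucen", candidates, strategy));
    }
}
```

The caller only sees the interface, so an n-gram or phonetic implementation could be swapped in without touching the lookup code.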


Here are some other possible enhancements:

* The capacity to synchronize the main Index and the SpellChecker Index. Handling token creation is easy: a simple TokenFilter can do the work. Token deletion is a bit harder. Lazy deletion can be used if token popularity is checked in the main Index each time. That is a pull strategy; a push from the Directory should be lighter.
* Choosing the similarity strategy. For now, it's only an n-gram computation. Homophony would be nice, for example.
* The Spell Index can be used for dynamic similarity without disturbing the main Index. For example, Snowball is nice for grouping words by their roots, but it disturbs the Index if you want to run a "starts with" query.
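To illustrate the push strategy from the first point, here is a rough sketch that does not depend on Lucene at all. It assumes the indexing side can report the set of distinct tokens of each document it adds or deletes; TokenListener and TokenEventTracker are hypothetical names, not existing Lucene classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical callback that a spell index could implement (not a Lucene interface).
interface TokenListener {
    void tokenCreated(String token);
    void tokenDeleted(String token);
}

// Hypothetical helper: keeps a per-token document count and pushes an event
// only when a token appears for the first time or loses its last document.
class TokenEventTracker {
    private final Map<String, Integer> docCount = new HashMap<>();
    private final List<TokenListener> listeners = new ArrayList<>();

    void addListener(TokenListener listener) {
        listeners.add(listener);
    }

    // Call with the distinct tokens of a document added to the main index.
    void documentAdded(Set<String> tokens) {
        for (String token : tokens) {
            int count = docCount.merge(token, 1, Integer::sum);
            if (count == 1) { // first document containing this token
                for (TokenListener l : listeners) {
                    l.tokenCreated(token);
                }
            }
        }
    }

    // Call with the distinct tokens of a document removed from the main index.
    void documentDeleted(Set<String> tokens) {
        for (String token : tokens) {
            Integer count = docCount.get(token);
            if (count == null) {
                continue; // unknown token, nothing to undo
            }
            if (count == 1) { // last document containing this token
                docCount.remove(token);
                for (TokenListener l : listeners) {
                    l.tokenDeleted(token);
                }
            } else {
                docCount.put(token, count - 1);
            }
        }
    }

    public static void main(String[] args) {
        final Set<String> spellIndex = new TreeSet<>(); // stand-in for the spell index
        TokenEventTracker tracker = new TokenEventTracker();
        tracker.addListener(new TokenListener() {
            public void tokenCreated(String token) { spellIndex.add(token); }
            public void tokenDeleted(String token) { spellIndex.remove(token); }
        });
        tracker.documentAdded(new TreeSet<>(Arrays.asList("index", "lucene")));
        tracker.documentAdded(new TreeSet<>(Arrays.asList("lucene", "spell")));
        tracker.documentDeleted(new TreeSet<>(Arrays.asList("index", "lucene")));
        System.out.println(spellIndex);
    }
}
```

The spell index never has to query the main Index for token popularity here; the cost is that something has to keep the per-token counts, which is why a hook closer to the Directory would probably be the better place for it.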

Some time ago I suggested a patch, LUCENE-1190, but I guess it's too monolithic. A more modular approach should be better.


Any comments or suggestions?

M.
