heritrix.lucene wrote:
Thanks for your reply.

This analyzer creates combinations of words. I am looking for an analyzer
that breaks words up into their character n-grams. For example, the
2-grams of
google -> go, oo, og, gl, le
and so on.

This is also easy.  You can check out our
sample in Gospodnetic and Hatcher's Lucene
in Action book if you want to stream them out.
If you're willing to collect them and then push
them out, it's even easier.  (Oh, how I wish
we had the yielding iterator construct of Python
in Java.)
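
To make the collect-then-push-out option concrete, here is a minimal
sketch in plain Java.  It is not the sample from the book and it does
not touch the Lucene API; the class and method names are made up for
illustration, and it assumes you want character n-grams of one fixed
length.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Lucene or the book sample.
class CollectedNGrams {

    // Collects every contiguous substring of length n, left to right.
    static List<String> charNGrams(String text, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        List<String> grams = new ArrayList<>();
        // Stop as soon as a window of length n no longer fits in the text.
        for (int i = 0; i + n <= text.length(); i++) {
            grams.add(text.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(charNGrams("google", 2)); // prints [go, oo, og, gl, le]
    }
}
```

Once you have the list, you can hand each entry to whatever builds
your tokens.  The cost is holding all the n-grams in memory at once,
which is fine for single words.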

Our version allows you to specify minimum n-gram
length and maximum n-gram length.  You
might want to put each n-gram length in its
own field if you want weighting between the
lengths to be easy.
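
For the streaming route with a minimum and maximum length, a rough
sketch of a hand-written iterator follows.  Again this is plain Java
with hypothetical names (NGramIterator, minN, maxN), not our actual
code and not a Lucene class.  It assumes you are happy to get all the
n-grams of the shortest length first, then the next length, and so on.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical class for illustration; not part of Lucene or the book sample.
class NGramIterator implements Iterator<String> {
    private final String text;
    private final int maxN;
    private int n;      // length of the n-grams currently being produced
    private int start;  // start offset of the next n-gram

    NGramIterator(String text, int minN, int maxN) {
        if (minN < 1 || maxN < minN) {
            throw new IllegalArgumentException("need 1 <= minN <= maxN");
        }
        this.text = text;
        this.maxN = maxN;
        this.n = minN;
        this.start = 0;
    }

    @Override
    public boolean hasNext() {
        // start is reset to 0 whenever n grows, so if the window does not
        // fit here, no longer n-gram will fit either.
        return n <= maxN && start + n <= text.length();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String gram = text.substring(start, start + n);
        start++;
        // When the window runs off the end, move on to the next length.
        if (start + n > text.length()) {
            n++;
            start = 0;
        }
        return gram;
    }

    public static void main(String[] args) {
        Iterator<String> it = new NGramIterator("google", 2, 3);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

For "google" with lengths 2 to 3 this yields go, oo, og, gl, le, goo,
oog, ogl, gle.  Because each n-gram's length is just gram.length(), a
caller could use it to choose which field the n-gram goes into.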

- Bob Carpenter
  Alias-i

