Hi Raymond,

On 3/1/2009, Raymond Balmès wrote:
> I'm trying to index (& search later) documents that contain tri-grams
> however they have the following form:
> 
> <string> <2 digit> <2 digit>
> 
> Does the ShingleFilter work with numbers in the match ?

Yes, though it is the tokenizer and previous filters in the chain that will be 
the (potential) source of difficulties, not ShingleFilter.

> Another complication, in future features I'd like to add optional
> digits like
> 
> [<1 digit>] <string> <2 digit> <2 digit>
> 
> I suppose the ShingleFilter won't do it ?

ShingleFilter just pastes together the tokens produced by the previous 
component in the analysis chain, in a sliding window.  As currently written, it 
doesn't provide the sort of functionality you seem to be asking for.

> Any better advice ?

What do your documents look like?  What do you hope to accomplish using 
ShingleFilter?  It's tough to give advice without knowing what you want to do.

Steve


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to