Hi Raymond, On 3/1/2009, Raymond Balmès wrote: > I'm trying to index (& search later) documents that contain tri-grams > however they have the following form: > > <string> <2 digit> <2 digit> > > Does the ShingleFilter work with numbers in the match ?
Yes, though it is the tokenizer and previous filters in the chain that will be the (potential) source of difficulties, not ShingleFilter. > Another complication, in future features I'd like to add optional > digits like > > [<1 digit>] <string> <2 digit> <2 digit> > > I suppose the ShingleFilter won't do it ? ShingleFilter just pastes together the tokens produced by the previous component in the analysis chain, in a sliding window. As currently written, it doesn't provide the sort of functionality you seem to be asking for. > Any better advice ? What do your documents look like? What do you hope to accomplish using ShingleFilter? It's tough to give advice without knowing what you want to do. Steve --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org