How to get TokenStream from Field?

2003-12-16 Thread Karl Penney
Is there any way to get a TokenStream for a given Field of a Document (is that information even stored in the index)? I want to use the startOffset / endOffset information for hit highlighting. Do I have to tokenize the text value for the field again to get this information?
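
A minimal sketch of the re-tokenizing approach raised in the question, assuming the offsets cannot be read back from the index and the stored field text has to be tokenized again. It deliberately uses a plain regular expression as a stand-in for the analyzer, so `OffsetHighlighter`, `Tok`, `tokenize`, and `highlight` are hypothetical names for illustration, not Lucene classes or methods. In real code the same analyzer used at index time would produce the tokens, so that the offsets line up with the indexed terms.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OffsetHighlighter {
    /** A lower-cased term plus its character offsets in the original text (end is exclusive). */
    static final class Tok {
        final String text;
        final int start;
        final int end;

        Tok(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /** Stand-in tokenizer (an assumption): runs of ASCII letters and digits, lower-cased. */
    static List<Tok> tokenize(String text) {
        List<Tok> out = new ArrayList<>();
        Matcher m = Pattern.compile("[A-Za-z0-9]+").matcher(text);
        while (m.find()) {
            out.add(new Tok(m.group().toLowerCase(Locale.ROOT), m.start(), m.end()));
        }
        return out;
    }

    /** Wraps every token equal to queryTerm in <b>...</b>, using the token offsets. */
    static String highlight(String text, String queryTerm) {
        String wanted = queryTerm.toLowerCase(Locale.ROOT);
        StringBuilder sb = new StringBuilder();
        int last = 0; // end offset of the last span copied into the output
        for (Tok t : tokenize(text)) {
            if (t.text.equals(wanted)) {
                sb.append(text, last, t.start);
                sb.append("<b>").append(text, t.start, t.end).append("</b>");
                last = t.end;
            }
        }
        sb.append(text, last, text.length());
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(highlight("Lucene stores terms; Lucene is fast.", "lucene"));
    }
}
```

Because the original text is copied by offset rather than rebuilt from the tokens, punctuation and casing in the displayed text are preserved.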

Re: Disabling modifiers?

2003-12-16 Thread Karl Penney
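One of the token patterns defined by the StandardTokenizer.jj is this: <NUM: (<ALPHANUM> <P> <HAS_DIGIT> | <HAS_DIGIT> <P> <ALPHANUM> | <ALPHANUM> (<P> <HAS_DIGIT> <P> <ALPHANUM>)+ | <HAS_DIGIT> (<P> <ALPHANUM> <P> <HAS_DIGIT>)+ | <ALPHANUM> <P> <HAS_DIGIT> (<P> <ALPHANUM> <P> <HAS_DIGIT>)+ | <HAS_DIGIT> <P> <ALPHANUM> (<P> <HAS_DIGIT> <P> <ALPHANUM>)+ ) > So basically if you have some sequences of characters separated by a "-" character, sequences that contain a digit will be combined with sequences which are adjacent t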
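
The combining behavior described above can be approximated with an ordinary regular expression. This is only a rough sketch and not the tokenizer itself: the class name `NumPatternSketch` and the helpers `A`, `H`, and `P` are hypothetical, the character classes are simplified to ASCII, and the punctuation set (`-`, `_`, `/`, `.`, `,`) is an assumption about the grammar. The check is also a whole-string match, so it says nothing about how a longer input would be split into several tokens.

```java
import java.util.regex.Pattern;

class NumPatternSketch {
    // A: any run of letters/digits; H: a run containing at least one digit;
    // P: a single joining punctuation character (assumed set).
    static final String A = "[A-Za-z0-9]+";
    static final String H = "[A-Za-z0-9]*[0-9][A-Za-z0-9]*";
    static final String P = "[-_/.,]";

    // Six alternatives: segments alternate so that every other segment has a digit.
    static final Pattern NUM = Pattern.compile(
              A + P + H
        + "|" + H + P + A
        + "|" + A + "(?:" + P + H + P + A + ")+"
        + "|" + H + "(?:" + P + A + P + H + ")+"
        + "|" + A + P + H + "(?:" + P + A + P + H + ")+"
        + "|" + H + P + A + "(?:" + P + H + P + A + ")+");

    /** True if the whole string would fit the pattern as one combined token. */
    static boolean isSingleToken(String s) {
        return NUM.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"x-100", "wi-fi", "abc-12-def", "a-b-c"}) {
            System.out.println(s + " -> " + isSingleToken(s));
        }
    }
}
```

Under these assumptions, `x-100` and `abc-12-def` are kept together because a digit-bearing segment sits next to the hyphen, while `wi-fi` and `a-b-c` contain no digits and do not match as a whole.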

How to get list of unique field values for a subset of documents

2003-12-12 Thread Karl Penney
I'm looking for a fast way (execution-wise) to get a list of unique values for a field called "partno" for all documents which have a given value for a field called "type". This is for adding values to a drop-down list. What I have done so far is to build a list of document numbers for each val
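
One way to picture the goal is a precomputed map from each "type" value to the sorted set of "partno" values seen with it, so that filling the drop-down becomes a single lookup. The sketch below works on plain in-memory maps rather than an index; `UniqueFieldValues`, `doc`, and `partnosByType` are hypothetical names, and the sample data is made up for illustration. It is not the approach from the message, which is cut off above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

class UniqueFieldValues {
    /** Builds a stand-in "document" holding only the two fields of interest. */
    static Map<String, String> doc(String type, String partno) {
        Map<String, String> d = new HashMap<>();
        d.put("type", type);
        d.put("partno", partno);
        return d;
    }

    /** One pass over all documents: type -> sorted, de-duplicated partno values. */
    static Map<String, SortedSet<String>> partnosByType(List<Map<String, String>> docs) {
        Map<String, SortedSet<String>> byType = new HashMap<>();
        for (Map<String, String> d : docs) {
            String type = d.get("type");
            String partno = d.get("partno");
            if (type == null || partno == null) {
                continue; // skip documents missing either field
            }
            byType.computeIfAbsent(type, k -> new TreeSet<>()).add(partno);
        }
        return byType;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc("widget", "B-2"));
        docs.add(doc("widget", "A-1"));
        docs.add(doc("widget", "A-1")); // duplicate value, kept once
        docs.add(doc("gadget", "C-3"));
        System.out.println(partnosByType(docs).get("widget"));
    }
}
```

The cost of building the map is paid once; it would need rebuilding whenever the underlying documents change.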