Is there any way to get a TokenStream for a given Field of a Document (is that
information even stored in the index)? I want to use the startOffset / endOffset
information for hit highlighting. Do I have to tokenize the text value for the field
again to get this information?
One of the token patterns defined by the StandardTokenizer.jj is this:
|
| ( )+
| ( )+
|( )+
|( )+
)
So basically if you have some sequences of characters separated by a "-"
character, sequences that contain a digit will be combined with sequences
which are adjacent t
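
The joining behaviour described above can be tried out without Lucene. The sketch below is not the StandardTokenizer code: it imitates the shape of the pattern with java.util.regex and picks the longest match at each position, roughly as a JavaCC-generated tokenizer would. The class name NumTokenSketch, the simplified definitions of ALPHANUM, HAS_DIGIT and P, and the set of punctuation characters are all assumptions made for illustration, and every other token type of the real grammar is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumTokenSketch {
    // Simplified stand-ins for the grammar's building blocks (assumed definitions).
    private static final String ALPHANUM = "[\\p{L}\\p{Nd}]+";
    private static final String HAS_DIGIT = "[\\p{L}\\p{Nd}]*\\p{Nd}[\\p{L}\\p{Nd}]*";
    private static final String P = "[_\\-/.,]"; // assumed punctuation set

    // Alternatives in which every other segment must contain a digit,
    // plus a plain-word fallback as the last entry.
    private static final List<Pattern> ALTERNATIVES = List.of(
        Pattern.compile(ALPHANUM + P + HAS_DIGIT),
        Pattern.compile(HAS_DIGIT + P + ALPHANUM),
        Pattern.compile(ALPHANUM + "(?:" + P + HAS_DIGIT + P + ALPHANUM + ")+"),
        Pattern.compile(HAS_DIGIT + "(?:" + P + ALPHANUM + P + HAS_DIGIT + ")+"),
        Pattern.compile(ALPHANUM + P + HAS_DIGIT + "(?:" + P + ALPHANUM + P + HAS_DIGIT + ")+"),
        Pattern.compile(HAS_DIGIT + P + ALPHANUM + "(?:" + P + HAS_DIGIT + P + ALPHANUM + ")+"),
        Pattern.compile(ALPHANUM)
    );

    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            // Try every alternative at the current position and keep the longest match.
            int best = 0;
            for (Pattern p : ALTERNATIVES) {
                Matcher m = p.matcher(text).region(pos, text.length());
                if (m.lookingAt()) {
                    best = Math.max(best, m.end() - pos);
                }
            }
            if (best == 0) {
                pos++; // no token starts here (whitespace or stray punctuation)
                continue;
            }
            tokens.add(text.substring(pos, pos + best));
            pos += best;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("abc-def"));        // prints [abc, def]
        System.out.println(tokenize("abc-123"));        // prints [abc-123]
        System.out.println(tokenize("x12-abc-34 foo")); // prints [x12-abc-34, foo]
    }
}
```

In this sketch "abc-def" splits at the "-" because neither side contains a digit, while "abc-123" stays together as one token.
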
I'm looking for a fast way (execution-wise) to get a list of unique values for a field
called "partno" across all documents that have a given value for a field called "type".
The values are used to populate a drop-down list.
What I have done so far is to build a list of document numbers for each val