Hey Andre,
The reason the javadoc says the field should not be tokenized stems from
the issue you point out. What you want to do is possible of course, but
making the Lucene code change would complicate a process that can be
quite memory and cpu intensive on large collections. Done right, it
might make a good patch though.
A compromise that you can make outside of the Lucene code is to index a
separate field with the same contents but untokenized. Sorting on this
field instead, Lucene will treat "North Carolina" as one token and sort
as you'd expect. The downside to this approach is that you will have to
juggle the two fields in the future.
- Mark
Andre Rubin wrote:
Hi there!
I'm new to Lucene, so forgive any misconceptions on my part.
I created an Index and now I want to search on it based on a field.
The field is a String field and Field.Store.YES and
Field.Index.TOKENIZED. No problems with the search.
Now, I wanted to sort the results, and according to the Sort javadoc
the field "should not be tokenized". But I decided to try it anyway,
and it worked. However, the results showed that the tokens were
sorted, not the full string in the field.
Just to make myself more clear, here's an example. Let's say I have
these strings indexed:
"North Carolina"
"British Columbia"
"Canada"
Now I search (with sort) for the token 'c*'
The result I get is (sorted by the token found):
1) Canada
2) North Carolina
3) British Columbia
The result I wanted was (sorted by the whole String)"
1) British Columbia
2) Canada
3) North Carolina
Is there a way to do this?
Another option would be to sort the index itself, since this field is
the only field that we'd be searching on. But I'm just guessing here,
cause I have no idea if this is possible at all!
Thanks,
Andre
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]