Sounds like a good alternative. But how do I perform the search on the
tokenized field and sort on the untokenized one?

Thanks,


Andre
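
For readers following the thread, the two-field idea can be pictured
with a small plain-Java model. This is only a sketch and does not call
Lucene: the class TwoFieldSortSketch, the Doc type, and searchSorted are
hypothetical names made up for illustration. Each value is kept twice,
once split into lowercase tokens (standing in for the tokenized field
that the query runs against) and once whole (standing in for the
untokenized field that is only used for ordering). In Lucene 2.x the
equivalent would presumably be a second Field added with
Field.Index.UN_TOKENIZED, plus a Sort on that field's name passed to the
searcher's search(Query, Sort) method; those calls are not exercised
here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class TwoFieldSortSketch {

    /** Hypothetical stand-in for a document that stores one value in two forms. */
    static final class Doc {
        final String[] tokens;  // models the tokenized field (used for matching)
        final String sortKey;   // models the untokenized field (used for sorting)

        Doc(String value) {
            this.tokens = value.toLowerCase(Locale.ROOT).split("\\s+");
            this.sortKey = value;
        }
    }

    /** Matches a prefix against the tokens, then orders hits by the whole string. */
    public static List<String> searchSorted(List<String> values, String prefix) {
        String p = prefix.toLowerCase(Locale.ROOT);
        List<Doc> hits = new ArrayList<>();
        for (String v : values) {
            Doc d = new Doc(v);
            for (String t : d.tokens) {
                if (t.startsWith(p)) {   // the "search" looks only at tokens
                    hits.add(d);
                    break;
                }
            }
        }
        // The "sort" looks only at the untokenized copy of the value.
        hits.sort(Comparator.comparing((Doc d) -> d.sortKey));
        List<String> out = new ArrayList<>();
        for (Doc d : hits) {
            out.add(d.sortKey);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> values =
            Arrays.asList("North Carolina", "British Columbia", "Canada");
        // prints [British Columbia, Canada, North Carolina]
        System.out.println(searchSorted(values, "c"));
    }
}
```

The point of the model is that matching and ordering read from
different copies of the same value, so the query for c* still finds
"North Carolina" through its token while the ordering uses the full
string.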

On Tue, Aug 5, 2008 at 12:51 PM,  <[EMAIL PROTECTED]> wrote:
> This is what I did and it works fine.  My untokenized fields were named:
> "__AMSUNTOK__" + fieldName.
> Where fieldName was the name of the tokenized field.
>
>
> Bob Hastings
> Ancept Inc.
>
> Mark Miller <[EMAIL PROTECTED]> wrote on 08/05/2008 02:38 PM:
> Reply-To: java-user@lucene.apache.org
> To: java-user@lucene.apache.org
> Subject: Re: Sorting
>
> Hey Andre,
>
> The reason the javadoc says the field should not be tokenized stems from
> the issue you point out. What you want to do is possible of course, but
> making the Lucene code change would complicate a process that can be
> quite memory and cpu intensive on large collections. Done right, it
> might make a good patch though.
>
> A compromise that you can make outside of the Lucene code is to index a
> separate field with the same contents but untokenized. If you sort on
> this field instead, Lucene will treat "North Carolina" as one token and
> sort as you'd expect. The downside to this approach is that you will
> have to juggle the two fields in the future.
>
> - Mark
>
> Andre Rubin wrote:
>> Hi there!
>>
>> I'm new to Lucene, so forgive any misconceptions on my part.
>>
>> I created an Index and now I want to search on it based on a field.
>> The field is a String field indexed with Field.Store.YES and
>> Field.Index.TOKENIZED. No problems with the search.
>>
>> Now, I wanted to sort the results, and according to the Sort javadoc
>> the field "should not be tokenized". But I decided to try it anyway,
>> and it worked. However, the results showed that the tokens were
>> sorted, not the full string in the field.
>>
>> Just to make myself more clear, here's an example. Let's say I have
>> these strings indexed:
>>
>> "North Carolina"
>> "British Columbia"
>> "Canada"
>>
>> Now I search (with sort) for the token 'c*'
>>
>> The result I get is (sorted by the token found):
>>
>> 1) Canada
>> 2) North Carolina
>> 3) British Columbia
>>
>> The result I wanted was (sorted by the whole String):
>>
>> 1) British Columbia
>> 2) Canada
>> 3) North Carolina
>>
>> Is there a way to do this?
>>
>>
>> Another option would be to sort the index itself, since this field is
>> the only field that we'd be searching on. But I'm just guessing here,
>> cause I have no idea if this is possible at all!
>>
>> Thanks,
>>
>>
>> Andre
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]
>>
>>
>
>