Re: Grouping and tokens

Ramprakash Ramamoorthy Tue, 19 Feb 2013 21:58:02 -0800

On Tue, Feb 19, 2013 at 9:09 PM, Jack Krupansky <j...@basetechnology.com> wrote:


> Well, you don't need to "store" both copies since they will be the same.
> They both need to be "indexed" (string form for grouping, text form for
> keyword search), but only one needs to be "stored".
>
Yeah, yeah Jack, understood. That was what I meant.
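
For readers following the thread, the dual-indexing idea discussed here can be sketched in plain Java. This is a toy in-memory index and not Lucene API: the class DualFieldIndex and its methods are hypothetical names introduced only for illustration. Each title is kept once as a whole string (the grouping key) and once as lowercased tokens (for keyword search). In Lucene 4.x the matching field types would presumably be StringField and TextField, but that mapping is an assumption and is not exercised below.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy in-memory index (hypothetical, not Lucene API): every title is indexed twice.
class DualFieldIndex {
    // "Text" form: lowercased token -> ids of the documents containing it.
    private final Map<String, Set<Integer>> tokenIndex = new HashMap<>();
    // "String" form: document id -> the whole, untokenized title.
    private final List<String> rawTitles = new ArrayList<>();

    // Adds one document and returns its id.
    int add(String title) {
        int docId = rawTitles.size();
        rawTitles.add(title);
        for (String token : title.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                tokenIndex.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    // Keyword search runs on the tokenized form; hits are grouped by the raw form.
    Map<String, List<Integer>> searchGroupedByTitle(String keyword) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        Set<Integer> hits =
            tokenIndex.getOrDefault(keyword.toLowerCase(Locale.ROOT), new TreeSet<>());
        for (int docId : hits) {
            groups.computeIfAbsent(rawTitles.get(docId), k -> new ArrayList<>()).add(docId);
        }
        return groups;
    }
}
```

With titles such as "Fifty shades of grey" added to the index, a search for "shades" matches through the token map, while the group key stays the complete title. Only the raw title list would need to be kept for display, which mirrors the point that both forms are indexed but only one is stored.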

>
> -- Jack Krupansky
>
> -----Original Message----- From: Ramprakash Ramamoorthy
> Sent: Tuesday, February 19, 2013 1:07 AM
>
> To: java-user@lucene.apache.org
> Subject: Re: Grouping and tokens
>
> On Tue, Feb 19, 2013 at 12:57 PM, Jack Krupansky <j...@basetechnology.com> wrote:
>
>> Oops, sorry for the "Solr" answer. In Lucene you simply need to index the
>> same value twice: once as a raw string and once as a tokenized text
>> field. Grouping would use the raw string version of the data.
>
> Yeah, thanks Jack. I was just wondering whether there was a better alternative
> to storing the value twice, but I don't see one. Thanks again.
>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Jack Krupansky
>> Sent: Monday, February 18, 2013 11:21 PM
>>
>> To: java-user@lucene.apache.org
>> Subject: Re: Grouping and tokens
>>
>> Okay, so fields that would normally need to be tokenized must be stored
>> both as raw strings for grouping and as tokenized text for keyword search.
>> Simply use copyField to copy from one to the other.
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Ramprakash Ramamoorthy
>> Sent: Monday, February 18, 2013 11:13 PM
>> To: java-user@lucene.apache.org
>> Subject: Re: Grouping and tokens
>>
>> On Mon, Feb 18, 2013 at 9:47 PM, Jack Krupansky <j...@basetechnology.com> wrote:
>>
>>
>>> Please clarify exactly what you want to group by - give a specific example
>>> that makes it clear what terms should affect grouping and which shouldn't.
>>
>> Assume I am indexing library data. Say there are the following fields for
>> a particular book.
>> 1. Published
>> 2. Language
>> 3. Genre
>> 4. Author
>> 5. Title
>> 6. ISBN
>>
>>     At search time, the user can ask to group by any of the above
>> fields, which means none of them is supposed to be tokenized. As I
>> mentioned earlier, there is a book titled "Fifty shades of grey" and the
>> user searches for "shades". The result would turn up if the field were
>> tokenized, but here it doesn't, since the field isn't tokenized. Hope I am
>> clear?
>>
>>     In a nutshell, how do I group by a field that is also tokenized?
>>
>>
>>> -- Jack Krupansky
>>>
>>> -----Original Message----- From: Ramprakash Ramamoorthy
>>> Sent: Monday, February 18, 2013 6:12 AM
>>> To: java-user@lucene.apache.org
>>> Subject: Grouping and tokens
>>>
>>>
>>> Hello all,
>>>
>>>     From the grouping javadoc, I read that fields that are to be
>>> grouped on should not be tokenized. I have a use case where the user has
>>> the freedom to group by any field at search time.
>>>
>>>     Since only non-tokenized fields are eligible for grouping, this is
>>> creating an issue with my search. For instance, the book "*Fifty shades
>>> of grey*", when tokenized and searched for "*shades*", turns up in the
>>> result. However, this is not the case when I have it as a non-tokenized
>>> field (using StandardAnalyzer, version 4.1).
>>>
>>>     How do I go about this? Is indexing both a tokenized and a
>>> non-tokenized version of the same field the only way? I am afraid it's
>>> way too costly! Thanks in advance for your valuable inputs.
>>>
>>> --
>>> With Thanks and Regards,
>>> Ramprakash Ramamoorthy,
>>> India,
>>> +91 9626975420
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>>
>>>
>> --
>> With Thanks and Regards,
>> Ramprakash Ramamoorthy,
>> India.
>> +91 9626975420
>>
>>
>
> --
> With Thanks and Regards,
> Ramprakash Ramamoorthy,
> India,
>
> +91 9626975420
>


-- 
With Thanks and Regards,
Ramprakash Ramamoorthy,
Member Technical Staff,
Zoho Corporation.
+91 9626975420
