This is what Field Collapsing does. It is a complex feature and is not
in the Solr trunk yet.
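
To illustrate the idea outside Solr, here is a minimal Python sketch of what collapsing amounts to when there is one document per consultant/company pair: keep a single row per consultant, and count facet values per distinct consultant rather than per document. The `consultant_id` field and both helper names are assumptions made for illustration. This is client-side post-processing, not Solr's field collapsing implementation or its API.

```python
from collections import defaultdict

def collapse(docs, field):
    """Keep the first document seen for each distinct value of `field`."""
    seen = {}
    for doc in docs:
        seen.setdefault(doc[field], doc)
    return list(seen.values())

def facet_distinct(docs, group_field, facet_field):
    """Count distinct `group_field` values per `facet_field` value."""
    groups = defaultdict(set)
    for doc in docs:
        groups[doc[facet_field]].add(doc[group_field])
    return {value: len(ids) for value, ids in groups.items()}

# Hypothetical denormalized rows: one per consultant/company pair.
docs = [
    {"consultant_id": 1, "lastname": "Davis", "company": "AOL"},
    {"consultant_id": 1, "lastname": "Davis", "company": "Google"},
    {"consultant_id": 2, "lastname": "Anderson", "company": "Yahoo"},
    {"consultant_id": 2, "lastname": "Anderson", "company": "Google"},
]

unique_consultants = collapse(docs, "consultant_id")
lastname_counts = facet_distinct(docs, "consultant_id", "lastname")
company_counts = facet_distinct(docs, "consultant_id", "company")
```

Counting on a set of ids per facet value is what keeps a consultant with several company rows from being counted more than once.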

On Tue, Jun 8, 2010 at 9:15 AM, Moazzam Khan <moazz...@gmail.com> wrote:
> How would I do a facet search if I did this and not get duplicates?
>
> Thanks,
> Moazzam
>
> On Mon, Jun 7, 2010 at 10:07 AM, Israel Ekpo <israele...@gmail.com> wrote:
>> I think you need a 1:1 mapping between the consultant and the company.
>> Otherwise, how are you going to run your queries for, let's say, consultants
>> that worked for Google or AOL between March 1999 and August 2004?
>>
>> If the mapping is 1:1, your life would be easier and you would not need to
>> do extra parsing of the results you retrieved.
>>
>> Unfortunately, it looks like you are going to have a lot of records.
>>
>> With an RDBMS, it is easier to do joins but with Lucene and Solr you have to
>> denormalize all the relationships.
>>
>> Hence, in this particular scenario, if you have 5 consultants that each
>> worked for 4 distinct companies, you will have to send 20 documents to Solr.
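
The flattening described above can be sketched in Python as follows. The field names follow the schema quoted further down in this thread; the `denormalize` helper, the placeholder names, and the sample ids are assumptions for illustration only.

```python
def denormalize(consultants, companies_by_consultant):
    """Yield one flat document per (consultant, company) pair."""
    for cid, (first, last) in consultants.items():
        for company_id, company in companies_by_consultant.get(cid, []):
            yield {
                # Composite key: consultant id and company id joined by "_".
                "consultant_id_company_id": f"{cid}_{company_id}",
                "consultant_firstname": first,
                "consultant_lastname": last,
                "company": company,
            }

# Hypothetical data: 5 consultants, each with the same 4 companies.
consultants = {i: (f"First{i}", f"Last{i}") for i in range(1, 6)}
companies = [(1, "AOL"), (2, "Microsoft"), (3, "Yahoo"), (4, "Google")]
companies_by_consultant = {cid: companies for cid in consultants}

docs = list(denormalize(consultants, companies_by_consultant))
```

The document count is the number of consultant/company pairs, so it grows with every additional multi-valued relationship that gets flattened the same way.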
>>
>> On Mon, Jun 7, 2010 at 10:15 AM, Moazzam Khan <moazz...@gmail.com> wrote:
>>
>>> Thanks for the replies guys.
>>>
>>>
>>> I am currently storing consultants like this ..
>>>
>>> <doc>
>>>  <id>123</id>
>>>  <FirstName>tony</FirstName>
>>>  <LastName>marjo</LastName>
>>>  <Company>Google</Company>
>>>  <Company>AOL</Company>
>>> </doc>
>>>
>>> I have a few multi valued fields so if I do it the way Israel
>>> suggested it, I will have tons of records. Do you think it will be
>>> better if I did this instead ?
>>>
>>>
>>> <doc>
>>>  <id>123</id>
>>>  <FirstName>tony</FirstName>
>>>  <LastName>marjo</LastName>
>>>  <Company>Google_StartDate_EndDate</Company>
>>>  <Company>AOL_StartDate_EndDate</Company>
>>> </doc>
>>>
>>> Or is what you guys said better?
>>>
>>> Thanks for all the help.
>>>
>>> Moazzam
>>>
>>>
>>> On Mon, Jun 7, 2010 at 1:10 AM, Lance Norskog <goks...@gmail.com> wrote:
>>> > And for 'present', you would pick some time far in the future:
>>> > 2100-01-01T00:00:00Z
>>> >
>>> > On 6/5/10, Israel Ekpo <israele...@gmail.com> wrote:
>>> >> You need to make each document added to the index a 1 to 1 mapping for
>>> >> each company and consultant combo.
>>> >>
>>> >> <schema>
>>> >>
>>> >> <fields>
>>> >>     <!-- Concatenation of company and consultant id -->
>>> >>     <field name="consultant_id_company_id" type="string" indexed="true"
>>> >> stored="true" required="true"/>
>>> >>     <field name="consultant_firstname" type="string" indexed="true"
>>> >> stored="true" multiValued="false"/>
>>> >>     <field name="consultant_lastname" type="string" indexed="true"
>>> >> stored="true" multiValued="false"/>
>>> >>
>>> >>     <!-- The name of the company the consultant worked for -->
>>> >>     <field name="company" type="text" indexed="true" stored="true"
>>> >> multiValued="false"/>
>>> >>     <field name="start_date" type="tdate" indexed="true" stored="true"
>>> >> multiValued="false"/>
>>> >>     <field name="end_date" type="tdate" indexed="true" stored="true"
>>> >> multiValued="false"/>
>>> >>
>>> >>     <!-- Catch-all target for the copyField rules below -->
>>> >>     <field name="text" type="text" indexed="true" stored="false"
>>> >> multiValued="true"/>
>>> >> </fields>
>>> >>
>>> >> <defaultSearchField>text</defaultSearchField>
>>> >>
>>> >> <copyField source="consultant_firstname" dest="text"/>
>>> >> <copyField source="consultant_lastname" dest="text"/>
>>> >> <copyField source="company" dest="text"/>
>>> >>
>>> >> </schema>
>>> >>
>>> >> <!--
>>> >>
>>> >> So for instance, you have 2 consultants
>>> >>
>>> >> Michael Davis and Tom Anderson, who worked for AOL, Microsoft, Yahoo,
>>> >> Google and Facebook.
>>> >>
>>> >> Michael Davis = 1
>>> >> Tom Anderson = 2
>>> >>
>>> >> AOL = 1
>>> >> Microsoft = 2
>>> >> Yahoo = 3
>>> >> Google = 4
>>> >> Facebook = 5
>>> >>
>>> >> This is how you would add the documents to the index
>>> >>
>>> >> -->
>>> >>
>>> >> <doc>
>>> >>     <consultant_id_company_id>1_1</consultant_id_company_id>
>>> >>     <consultant_firstname>Michael</consultant_firstname>
>>> >>     <consultant_lastname>Davis</consultant_lastname>
>>> >>     <company>AOL</company>
>>> >>     <start_date>2006-02-13T15:26:37Z</start_date>
>>> >>     <end_date>2008-02-13T15:26:37Z</end_date>
>>> >> </doc>
>>> >>
>>> >> <doc>
>>> >>     <consultant_id_company_id>1_4</consultant_id_company_id>
>>> >>     <consultant_firstname>Michael</consultant_firstname>
>>> >>     <consultant_lastname>Davis</consultant_lastname>
>>> >>     <company>Google</company>
>>> >>     <start_date>2006-02-13T15:26:37Z</start_date>
>>> >>     <end_date>2009-02-13T15:26:37Z</end_date>
>>> >> </doc>
>>> >>
>>> >> <doc>
>>> >>     <consultant_id_company_id>2_3</consultant_id_company_id>
>>> >>     <consultant_firstname>Tom</consultant_firstname>
>>> >>     <consultant_lastname>Anderson</consultant_lastname>
>>> >>     <company>Yahoo</company>
>>> >>     <start_date>2001-01-13T15:26:37Z</start_date>
>>> >>     <end_date>2009-02-13T15:26:37Z</end_date>
>>> >> </doc>
>>> >>
>>> >> <doc>
>>> >>     <consultant_id_company_id>2_4</consultant_id_company_id>
>>> >>     <consultant_firstname>Tom</consultant_firstname>
>>> >>     <consultant_lastname>Anderson</consultant_lastname>
>>> >>     <company>Google</company>
>>> >>     <start_date>1999-02-13T15:26:37Z</start_date>
>>> >>     <end_date>2010-02-13T15:26:37Z</end_date>
>>> >> </doc>
>>> >>
>>> >>
>>> >> Then you can search as
>>> >>
>>> >> q=company:X AND start_date:[Y TO *] AND end_date:[* TO Z]
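
Note that a query of this shape matches only stints that lie entirely inside the window. If the goal is everyone who worked for the company at any point during the window (for example, Bear Stearns in July), the usual interval-overlap test is "started on or before the window end, and ended on or after the window start". Below is a minimal Python sketch that builds such a query string. It assumes the `company`, `start_date` and `end_date` fields from the schema above; the helper names and the year 2009 are made up for illustration.

```python
from datetime import datetime, timezone

def solr_date(dt):
    """Format an aware datetime as a UTC ISO 8601 string ending in Z."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

def overlap_query(company, window_start, window_end):
    """Build a query for stints that overlap [window_start, window_end].

    The company name is wrapped in double quotes as a phrase; names that
    themselves contain quotes are not escaped in this sketch.
    """
    return (
        f'company:"{company}"'
        f" AND start_date:[* TO {solr_date(window_end)}]"
        f" AND end_date:[{solr_date(window_start)} TO *]"
    )

# Hypothetical window: July 2009.
q = overlap_query(
    "Bear Stearns",
    datetime(2009, 7, 1, tzinfo=timezone.utc),
    datetime(2009, 7, 31, 23, 59, 59, tzinfo=timezone.utc),
)
```

This pairs naturally with storing a far-future end date for jobs that are still current, since such rows then satisfy the `end_date` clause for any window.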
>>> >>
>>> >> On Fri, Jun 4, 2010 at 4:58 PM, Moazzam Khan <moazz...@gmail.com> wrote:
>>> >>
>>> >>> Hi guys,
>>> >>>
>>> >>>
>>> >>> I have a list of consultants, and the users (people who work for the
>>> >>> company) are supposed to be able to search for consultants based on
>>> >>> the time frame in which they worked for a given company. For example, I
>>> >>> should be able to search for all consultants who worked for Bear Stearns
>>> >>> in the month of July. What is the best way of accomplishing this?
>>> >>>
>>> >>> I was thinking of formatting the document like this
>>> >>>
>>> >>> <company>
>>> >>>   <name> Bear Stearns</name>
>>> >>>   <startDate>2000-01-01</startDate>
>>> >>>   <endDate>present</endDate>
>>> >>> </company>
>>> >>> <company>
>>> >>>   <name> AIG</name>
>>> >>>   <startDate>1999-01-01</startDate>
>>> >>>   <endDate>2000-01-01</endDate>
>>> >>> </company>
>>> >>>
>>> >>> Is this possible?
>>> >>>
>>> >>> Thanks,
>>> >>>
>>> >>> Moazzam
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> "Good Enough" is not good enough.
>>> >> To give anything less than your best is to sacrifice the gift.
>>> >> Quality First. Measure Twice. Cut Once.
>>> >> http://www.israelekpo.com/
>>> >>
>>> >
>>> >
>>> > --
>>> > Lance Norskog
>>> > goks...@gmail.com
>>> >
>>>
>>
>>
>>
>> --
>> "Good Enough" is not good enough.
>> To give anything less than your best is to sacrifice the gift.
>> Quality First. Measure Twice. Cut Once.
>> http://www.israelekpo.com/
>>
>



-- 
Lance Norskog
goks...@gmail.com
