This is what Field Collapsing does. It is a complex feature and is not in the Solr trunk yet.
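
To illustrate the idea only: until something like that is available, you could collapse the denormalized hits on the consultant in client code before counting. The sketch below is not Solr's Field Collapsing implementation. The function names (collapse_by_consultant, facet_after_collapse) and the sample hits are made up for illustration, with field names borrowed from Israel's schema quoted below, and it assumes the whole result set fits in memory on the client.

```python
from collections import Counter, OrderedDict

# Hypothetical hits, one per (consultant, company) pair, shaped like the
# denormalized documents quoted further down in this thread.
hits = [
    {"consultant_id_company_id": "1_1", "consultant_lastname": "Davis", "company": "AOL"},
    {"consultant_id_company_id": "1_4", "consultant_lastname": "Davis", "company": "Google"},
    {"consultant_id_company_id": "2_3", "consultant_lastname": "Anderson", "company": "Yahoo"},
    {"consultant_id_company_id": "2_4", "consultant_lastname": "Anderson", "company": "Google"},
]


def collapse_by_consultant(docs):
    """Group documents by the consultant half of the composite key.

    Groups are kept in the order each consultant first appears.
    """
    groups = OrderedDict()
    for doc in docs:
        consultant_id = doc["consultant_id_company_id"].split("_", 1)[0]
        groups.setdefault(consultant_id, []).append(doc)
    return groups


def facet_after_collapse(groups, field):
    """Count each distinct value of `field` once per consultant."""
    counts = Counter()
    for docs in groups.values():
        for value in {doc[field] for doc in docs}:
            counts[value] += 1
    return dict(counts)


collapsed = collapse_by_consultant(hits)

# Counting per document inflates the numbers: every consultant is counted
# once for each company they worked for.
naive = dict(Counter(doc["consultant_lastname"] for doc in hits))

# Counting per collapsed group gives one count per consultant.
deduped = facet_after_collapse(collapsed, "consultant_lastname")
```

Here naive counts documents while deduped counts consultants, which is the difference Moazzam is asking about. Doing this on the client only works for small result sets, which is why a server-side feature is the real answer.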

On Tue, Jun 8, 2010 at 9:15 AM, Moazzam Khan <moazz...@gmail.com> wrote:
> How would I do a facet search if I did this and not get duplicates?
>
> Thanks,
> Moazzam
>
> On Mon, Jun 7, 2010 at 10:07 AM, Israel Ekpo <israele...@gmail.com> wrote:
>> I think you need a 1:1 mapping between the consultant and the company, else
>> how are you going to run your queries for, let's say, consultants that worked
>> for Google or AOL between March 1999 and August 2004?
>>
>> If the mapping is 1:1, your life would be easier and you would not need to
>> do extra parsing of the results you retrieved.
>>
>> Unfortunately, it looks like you are going to have a lot of records.
>>
>> With an RDBMS, it is easier to do joins but with Lucene and Solr you have to
>> denormalize all the relationships.
>>
>> Hence in this particular scenario, if you have 5 consultants that worked for
>> 4 distinct companies you will have to send 20 documents to Solr
>>
>> On Mon, Jun 7, 2010 at 10:15 AM, Moazzam Khan <moazz...@gmail.com> wrote:
>>
>>> Thanks for the replies guys.
>>>
>>>
>>> I am currently storing consultants like this ..
>>>
>>> <doc>
>>>   <id>123</id>
>>>   <FirstName>tony</FirstName>
>>>   <LastName>marjo</LastName>
>>>   <Company>Google</Company>
>>>   <Company>AOL</Company>
>>> </doc>
>>>
>>> I have a few multi valued fields so if I do it the way Israel
>>> suggested it, I will have tons of records. Do you think it will be
>>> better if I did this instead?
>>>
>>>
>>> <doc>
>>>   <id>123</id>
>>>   <FirstName>tony</FirstName>
>>>   <LastName>marjo</LastName>
>>>   <Company>Google_StartDate_EndDate</Company>
>>>   <Company>AOL_StartDate_EndDate</Company>
>>> </doc>
>>>
>>> Or is what you guys said better?
>>>
>>> Thanks for all the help.
>>>
>>> Moazzam
>>>
>>>
>>> On Mon, Jun 7, 2010 at 1:10 AM, Lance Norskog <goks...@gmail.com> wrote:
>>> > And for 'present', you would pick some time far in the future:
>>> > 2100-01-01T00:00:00Z
>>> >
>>> > On 6/5/10, Israel Ekpo <israele...@gmail.com> wrote:
>>> >> You need to make each document added to the index a 1 to 1 mapping for each
>>> >> company and consultant combo
>>> >>
>>> >> <schema>
>>> >>
>>> >> <fields>
>>> >>   <!-- Concatenation of consultant id and company id -->
>>> >>   <field name="consultant_id_company_id" type="string" indexed="true" stored="true" required="true"/>
>>> >>   <field name="consultant_firstname" type="string" indexed="true" stored="true" multiValued="false"/>
>>> >>   <field name="consultant_lastname" type="string" indexed="true" stored="true" multiValued="false"/>
>>> >>
>>> >>   <!-- The name of the company the consultant worked for -->
>>> >>   <field name="company" type="text" indexed="true" stored="true" multiValued="false"/>
>>> >>   <field name="start_date" type="tdate" indexed="true" stored="true" multiValued="false"/>
>>> >>   <field name="end_date" type="tdate" indexed="true" stored="true" multiValued="false"/>
>>> >>
>>> >>   <!-- Catch-all field that the copyField rules below write into -->
>>> >>   <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
>>> >> </fields>
>>> >>
>>> >> <defaultSearchField>text</defaultSearchField>
>>> >>
>>> >> <copyField source="consultant_firstname" dest="text"/>
>>> >> <copyField source="consultant_lastname" dest="text"/>
>>> >> <copyField source="company" dest="text"/>
>>> >>
>>> >> </schema>
>>> >>
>>> >> <!--
>>> >>
>>> >> So for instance, you have 2 consultants
>>> >>
>>> >> Michael Davis and Tom Anderson who worked for AOL and Microsoft, Yahoo,
>>> >> Google and Facebook.
>>> >>
>>> >> Michael Davis = 1
>>> >> Tom Anderson = 2
>>> >>
>>> >> AOL = 1
>>> >> Microsoft = 2
>>> >> Yahoo = 3
>>> >> Google = 4
>>> >> Facebook = 5
>>> >>
>>> >> This is how you would add the documents to the index
>>> >>
>>> >> -->
>>> >>
>>> >> <doc>
>>> >>   <consultant_id_company_id>1_1</consultant_id_company_id>
>>> >>   <consultant_firstname>Michael</consultant_firstname>
>>> >>   <consultant_lastname>Davis</consultant_lastname>
>>> >>   <company>AOL</company>
>>> >>   <start_date>2006-02-13T15:26:37Z</start_date>
>>> >>   <end_date>2008-02-13T15:26:37Z</end_date>
>>> >> </doc>
>>> >>
>>> >> <doc>
>>> >>   <consultant_id_company_id>1_4</consultant_id_company_id>
>>> >>   <consultant_firstname>Michael</consultant_firstname>
>>> >>   <consultant_lastname>Davis</consultant_lastname>
>>> >>   <company>Google</company>
>>> >>   <start_date>2006-02-13T15:26:37Z</start_date>
>>> >>   <end_date>2009-02-13T15:26:37Z</end_date>
>>> >> </doc>
>>> >>
>>> >> <doc>
>>> >>   <consultant_id_company_id>2_3</consultant_id_company_id>
>>> >>   <consultant_firstname>Tom</consultant_firstname>
>>> >>   <consultant_lastname>Anderson</consultant_lastname>
>>> >>   <company>Yahoo</company>
>>> >>   <start_date>2001-01-13T15:26:37Z</start_date>
>>> >>   <end_date>2009-02-13T15:26:37Z</end_date>
>>> >> </doc>
>>> >>
>>> >> <doc>
>>> >>   <consultant_id_company_id>2_4</consultant_id_company_id>
>>> >>   <consultant_firstname>Tom</consultant_firstname>
>>> >>   <consultant_lastname>Anderson</consultant_lastname>
>>> >>   <company>Google</company>
>>> >>   <start_date>1999-02-13T15:26:37Z</start_date>
>>> >>   <end_date>2010-02-13T15:26:37Z</end_date>
>>> >> </doc>
>>> >>
>>> >>
>>> >> Then you can search as
>>> >>
>>> >> q=company:X AND start_date:[Y TO *] AND end_date:[* TO Z]
>>> >>
>>> >> On Fri, Jun 4, 2010 at 4:58 PM, Moazzam Khan <moazz...@gmail.com> wrote:
>>> >>
>>> >>> Hi guys,
>>> >>>
>>> >>>
>>> >>> I have a list of consultants and the users (people who work for the
>>> >>> company) are supposed to be able to search for consultants based on
>>> >>> the time frame during which they worked for a company. For example, I
>>> >>> should be able to search for all consultants who worked for Bear
>>> >>> Stearns in the month of July. What is the best way of accomplishing this?
>>> >>>
>>> >>> I was thinking of formatting the document like this
>>> >>>
>>> >>> <company>
>>> >>>   <name>Bear Stearns</name>
>>> >>>   <startDate>2000-01-01</startDate>
>>> >>>   <endDate>present</endDate>
>>> >>> </company>
>>> >>> <company>
>>> >>>   <name>AIG</name>
>>> >>>   <startDate>1999-01-01</startDate>
>>> >>>   <endDate>2000-01-01</endDate>
>>> >>> </company>
>>> >>>
>>> >>> Is this possible?
>>> >>>
>>> >>> Thanks,
>>> >>>
>>> >>> Moazzam
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> "Good Enough" is not good enough.
>>> >> To give anything less than your best is to sacrifice the gift.
>>> >> Quality First. Measure Twice. Cut Once.
>>> >> http://www.israelekpo.com/
>>> >>
>>> >
>>> >
>>> > --
>>> > Lance Norskog
>>> > goks...@gmail.com
>>> >
>>>
>>
>>
>>
>> --
>> "Good Enough" is not good enough.
>> To give anything less than your best is to sacrifice the gift.
>> Quality First. Measure Twice. Cut Once.
>> http://www.israelekpo.com/
>>
>
--
Lance Norskog
goks...@gmail.com