btw, be careful with your delimiters: pic_url may contain a '-', etc.
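
To make the delimiter caveat concrete, here is a minimal client-side sketch in Python. Nothing in it is Solr-specific or prescribed by Solr: the helper names `pack_pics` / `unpack_pics`, the `|` and newline separators, and the example.com URLs are all assumptions made for illustration. The idea is to percent-encode every value before joining, so neither separator can occur inside a url, caption or description.

```python
from urllib.parse import quote, unquote

FIELD_SEP = "|"   # assumed separator between url / caption / description
PIC_SEP = "\n"    # assumed separator between one picture and the next


def pack_pics(pics):
    """Join (url, caption, description) tuples into one string for a single stored field."""
    # safe="" percent-encodes every reserved character, so "|", "%" and
    # newlines inside a value can never be mistaken for a separator.
    return PIC_SEP.join(
        FIELD_SEP.join(quote(value, safe="") for value in pic) for pic in pics
    )


def unpack_pics(packed):
    """Reverse pack_pics: split on the separators, then decode each value."""
    if not packed:
        return []
    return [
        tuple(unquote(value) for value in record.split(FIELD_SEP))
        for record in packed.split(PIC_SEP)
    ]


# Hypothetical data: a '-' in the URL, a '%' and a '|' in the caption.
pics = [
    ("http://example.com/a-b.jpg", "50% off | sale", "first picture"),
    ("http://example.com/c.jpg", "", "second picture"),
]
packed = pack_pics(pics)
restored = unpack_pics(packed)
```

With this, `restored` equals `pics` again no matter which characters the values contain. Storing a JSON string in the field would be another reasonable choice; the point is only that raw values should not be joined with a character they may themselves contain.
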
2010/6/26 Geert-Jan Brits <gbr...@gmail.com>

> >If I understand your suggestion correctly, you said that there's NO need
> >to have many Dynamic Fields; instead, we can have one definitive field
> >name, which can store a long string (concatenation of information about
> >tens of pictures), e.g., using "-" and "%" delimiters:
> >pic_url_value1-pic_caption_value1-pic_description_value1%pic_url_value2-pic_caption_value2-pic_description_value2%...
> >
> >I don't clearly see the reason of doing this. Is there a gain in terms of
> >performance? Or does this make programming on the client-side easier? Or
> >something else?
>
> I think you should ask the exact opposite question. If you don't do
> anything with these fields which Solr is particularly good at (searching /
> filtering / faceting / sorting), why go through the trouble of creating
> dynamic fields? (More fields means more overhead cost / tracking cost no
> matter how you look at it.)
>
> Moreover, indeed from a client view it's easier the way I suggested, since
> otherwise:
> - you would have to ask (through SolrJ) to include all dynamic fields to
>   be returned in the fl parameter
>   (http://wiki.apache.org/solr/CommonQueryParameters#fl). This is
>   difficult, because a priori you don't know how many dynamic fields to
>   query. In other words, you can't just ask Solr (through SolrJ, like you
>   asked) to return all dynamic fields beginning with pic_*. (afaik)
> - your client iteration code (looping over the pics) is a bit more
>   involved.
>
> HTH, Cheers,
>
> Geert-Jan
>
> 2010/6/26 Saïd Radhouani <r.steve....@gmail.com>
>
>> Thanks Geert-Jan for the detailed answer. Actually, I don't search at all
>> on these fields. I'm only filtering (w/ vs w/o pic) and sorting (based on
>> the number of pictures). Thus, your suggestion of adding an extra field
>> NrOfPics [0,N] would be the best solution.
>>
>> Regarding the other suggestion:
>>
>> > If you don't need search at all on these fields, the best thing imo is
>> > to store all pic-related info of all pics together by concatenating
>> > them with some delimiter which you know how to separate at the
>> > client-side.
>> > That or just store it in an external RDB since Solr is just sitting on
>> > the data and not doing anything intelligent with it.
>>
>> If I understand your suggestion correctly, you said that there's NO need
>> to have many Dynamic Fields; instead, we can have one definitive field
>> name, which can store a long string (concatenation of information about
>> tens of pictures), e.g., using "-" and "%" delimiters:
>> pic_url_value1-pic_caption_value1-pic_description_value1%pic_url_value2-pic_caption_value2-pic_description_value2%...
>>
>> I don't clearly see the reason of doing this. Is there a gain in terms of
>> performance? Or does this make programming on the client-side easier? Or
>> something else?
>>
>> My other question was: in case we use Dynamic Fields, is there any
>> documentation about using SolrJ for this purpose?
>>
>> Thanks
>> -Saïd
>>
>> On Jun 26, 2010, at 12:29 PM, Geert-Jan Brits wrote:
>>
>> > You can treat dynamic fields like any other field, so you can facet,
>> > sort, filter, etc. on these fields (afaik)
>> >
>> > I believe the confusion arises because the use case for dynamic fields
>> > sometimes seems to be ill-understood, i.e. being able to use them to do
>> > some kind of wildcard search, e.g. search for a value in any of the
>> > dynamic fields at once, like pic_url_*. This however is NOT possible.
>> >
>> > As far as your question goes:
>> >
>> >> Now, I'm trying to make facets on pictures: display doc w/ pic vs. doc
>> >> w/o pic
>> >> To the best of my knowledge, everyone is saying that faceting cannot
>> >> be done on dynamic fields (only on definitive field names). Thus, I
>> >> tried the following and it's working: I assume that the stored
>> >> pictures have a sequential number (_1, _2, etc.), i.e., if pic_url_1
>> >> exists in the index, it means that the underlying doc has at least one
>> >> picture:
>> >> ...&facet=on&facet.field=pic_url_1&facet.mincount=1&fq=pic_url_1:*
>> >> While this is working fine, I'm wondering whether there's a cleaner
>> >> way to do the same thing without assuming that pictures have a
>> >> sequential number.
>> >
>> > If I understand your question correctly: faceting on docs with and
>> > without pics could of course be done like you mention, however it would
>> > be more efficient to have an extra field defined: hasAtLeastOnePic with
>> > values (0 | 1), and use that to facet / filter on.
>> >
>> > You can extend this to NrOfPics [0,N) if you need to filter / facet on
>> > docs with a certain nr of pics.
>> >
>> > Also, I wondered what else you wanted to do with this pic-related info.
>> > Do you want to search on pic-description / pic-caption for instance? In
>> > that case the dynamic-fields approach may not be what you want: how
>> > would you know in which dynamic field to search for a particular term?
>> > Would it be pic_desc_1, or pic_desc_x? Of course you could OR over all
>> > dynamic fields, but you need to know an upper bound for the nr of pics
>> > and it really doesn't feel right, to me at least.
>> >
>> > If you need search on pic_description for instance, but don't mind
>> > which pic matches, you could create a single field pic_description and
>> > put in the concat of all pic-descriptions and search on that, or just
>> > make it a multi-valued field.
>> >
>> > If you don't need search at all on these fields, the best thing imo is
>> > to store all pic-related info of all pics together by concatenating
>> > them with some delimiter which you know how to separate at the
>> > client-side.
>> > That or just store it in an external RDB since Solr is just sitting on
>> > the data and not doing anything intelligent with it.
>> >
>> > I assume btw that you don't want to sort / facet on pic-desc /
>> > pic_caption / pic_url either (I have a hard time thinking of a useful
>> > use case for that).
>> >
>> > HTH,
>> >
>> > Geert-Jan
>> >
>> > 2010/6/26 Saïd Radhouani <r.steve....@gmail.com>
>> >
>> >> Thanks so much Otis. This is working great.
>> >>
>> >> Now, I'm trying to make facets on pictures: display doc w/ pic vs. doc
>> >> w/o pic
>> >>
>> >> To the best of my knowledge, everyone is saying that faceting cannot
>> >> be done on dynamic fields (only on definitive field names). Thus, I
>> >> tried the following and it's working: I assume that the stored
>> >> pictures have a sequential number (_1, _2, etc.), i.e., if pic_url_1
>> >> exists in the index, it means that the underlying doc has at least one
>> >> picture:
>> >>
>> >> ...&facet=on&facet.field=pic_url_1&facet.mincount=1&fq=pic_url_1:*
>> >>
>> >> While this is working fine, I'm wondering whether there's a cleaner
>> >> way to do the same thing without assuming that pictures have a
>> >> sequential number.
>> >>
>> >> Also, do you have any documentation about handling Dynamic Fields
>> >> using SolrJ? So far, I found only issues about that on JIRA, but no
>> >> documentation.
>> >>
>> >> Thanks a lot.
>> >>
>> >> -Saïd
>> >>
>> >> On Jun 26, 2010, at 1:18 AM, Otis Gospodnetic wrote:
>> >>
>> >>> Saïd,
>> >>>
>> >>> Dynamic fields could help here, for example imagine a doc with:
>> >>> id
>> >>> pic_url_*
>> >>> pic_caption_*
>> >>> pic_description_*
>> >>>
>> >>> See http://wiki.apache.org/solr/SchemaXml#Dynamic_fields
>> >>>
>> >>> So, for you:
>> >>>
>> >>> <dynamicField name="pic_url_*" type="string" indexed="true" stored="true"/>
>> >>> <dynamicField name="pic_caption_*" type="text" indexed="true" stored="true"/>
>> >>> <dynamicField name="pic_description_*" type="text" indexed="true" stored="true"/>
>> >>>
>> >>> Then you can add docs with unlimited number of
>> >>> pic_(url|caption|description)_* fields, e.g.
>> >>>
>> >>> id
>> >>> pic_url_1
>> >>> pic_caption_1
>> >>> pic_description_1
>> >>>
>> >>> id
>> >>> pic_url_2
>> >>> pic_caption_2
>> >>> pic_description_2
>> >>>
>> >>> Otis
>> >>> ----
>> >>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>> >>> Lucene ecosystem search :: http://search-lucene.com/
>> >>>
>> >>> ----- Original Message ----
>> >>>> From: Saïd Radhouani <r.steve....@gmail.com>
>> >>>> To: solr-user@lucene.apache.org
>> >>>> Sent: Fri, June 25, 2010 6:01:13 PM
>> >>>> Subject: Setting many properties for a multivalued field. Schema.xml ?
>> >>>> External file?
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> I'm trying to index data containing a multivalued field "picture",
>> >>>> that has three properties: url, caption and description:
>> >>>>
>> >>>> <picture/>
>> >>>> <url/>
>> >>>> <caption/>
>> >>>> <description/>
>> >>>>
>> >>>> Thus, each indexed document might have many pictures, each of them
>> >>>> has a url, a caption, and a description.
>> >>>>
>> >>>> I wonder whether it's possible to store this data using only
>> >>>> schema.xml. I couldn't figure it out so far. Instead, I'm thinking
>> >>>> of using an external file to store the properties of each picture,
>> >>>> but I haven't tried this solution yet, waiting for your
>> >>>> suggestions...
>> >>>>
>> >>>> Thanks,
>> >>>> -Saïd