btw, be careful with your delimiters: pic_url may contain a '-',
etc.
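
To illustrate, here is a minimal sketch in Python. The sample data and function names are made up for this example, and using JSON instead of hand-picked delimiters is only one possible option that I'm assuming would fit your client; nobody in this thread proposed it. It shows the naive "-" / "%" split going wrong and a JSON round trip surviving the same values:

```python
import json

# Hypothetical picture data: note the '-' in the first URL and the '%' in its caption.
pics = [
    {"url": "http://example.com/my-pic.jpg", "caption": "50% off", "description": "first"},
    {"url": "http://example.com/b.jpg", "caption": "plain", "description": "second"},
]

def encode_naive(pics):
    """Join fields with '-' and pictures with '%', as in the quoted proposal."""
    return "%".join(
        "-".join([p["url"], p["caption"], p["description"]]) for p in pics
    )

def decode_naive(value):
    """Split on the same delimiters; this breaks when a value contains '-' or '%'."""
    return [record.split("-") for record in value.split("%")]

def encode_json(pics):
    """Serialize all pictures into one string for a single stored field."""
    return json.dumps(pics)

def decode_json(value):
    """Recover the list of pictures on the client side."""
    return json.loads(value)

naive = decode_naive(encode_naive(pics))   # yields the wrong number of records
roundtrip = decode_json(encode_json(pics)) # yields the original two pictures
```

With the naive scheme the two pictures come back as three mangled records, while the JSON string decodes to exactly what went in. Escaping the delimiters yourself would also work; the point is only that raw '-' and '%' are not safe on their own.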

2010/6/26 Geert-Jan Brits <gbr...@gmail.com>

> > If I understand your suggestion correctly, you said that there's NO need
> > to have many Dynamic Fields; instead, we can have one definitive field name,
> > which can store a long string (concatenation of information about tens of
> > pictures), e.g., using "-" and "%" delimiters:
> > pic_url_value1-pic_caption_value1-pic_description_value1%pic_url_value2-pic_caption_value2-pic_description_value2%...
> > I don't clearly see the reason for doing this. Is there a gain in terms of
> > performance? Or does this make programming on the client-side easier? Or
> > something else?
>
> I think you should ask the exact opposite question. If you don't do
> anything with these fields that Solr is particularly good at (searching /
> filtering / faceting / sorting), why go through the trouble of creating
> dynamic fields? (More fields mean more overhead and tracking cost, no matter
> how you look at it.)
>
> Moreover, from the client's point of view it is indeed easier the way I
> suggested, since otherwise:
> - you would have to ask (through SolrJ) for all dynamic fields to be
> returned via the fl parameter (
> http://wiki.apache.org/solr/CommonQueryParameters#fl). This is difficult,
> because you don't know a priori how many dynamic fields to query. In
> other words, you can't just ask Solr (through SolrJ, like you asked) to
> return all dynamic fields beginning with pic_* (afaik).
> - your client-side iteration code (looping over the pics) is a bit more involved.
>
> HTH, Cheers,
>
> Geert-Jan
>
> 2010/6/26 Saïd Radhouani <r.steve....@gmail.com>
>
>> Thanks Geert-Jan for the detailed answer. Actually, I don't search at all
>> on these fields. I'm only filtering (w/ vs. w/o pic) and sorting (based on the
>> number of pictures). Thus, your suggestion of adding an extra field NrOfPics
>> [0,N] would be the best solution.
>>
>> Regarding the other suggestion:
>>
>> > If you don't need search at all on these fields, the best thing imo is to
>> > store all pic-related info of all pics together by concatenating them with
>> > some delimiter which you know how to separate at the client-side.
>> > That or just store it in an external RDB since solr is just sitting on the
>> > data and not doing anything intelligent with it.
>>
>> If I understand your suggestion correctly, you said that there's NO need
>> to have many Dynamic Fields; instead, we can have one definitive field name,
>> which can store a long string (concatenation of information about tens of
>> pictures), e.g., using "-" and "%" delimiters:
>> pic_url_value1-pic_caption_value1-pic_description_value1%pic_url_value2-pic_caption_value2-pic_description_value2%...
>>
>> I don't clearly see the reason for doing this. Is there a gain in terms of
>> performance? Or does this make programming on the client-side easier? Or
>> something else?
>>
>>
>> My other question was: in case we use Dynamic Fields, is there any
>> documentation about using SolrJ for this purpose?
>>
>> Thanks
>> -Saïd
>>
>> On Jun 26, 2010, at 12:29 PM, Geert-Jan Brits wrote:
>>
>> > You can treat dynamic fields like any other field, so you can facet, sort,
>> > filter, etc. on these fields (afaik).
>> >
>> > I believe the confusion arises because the use case for dynamic fields
>> > is sometimes misunderstood, i.e., people expect to be able to use them for
>> > some kind of wildcard search, e.g., searching for a value in any of the
>> > dynamic fields at once, like pic_url_*. This however is NOT possible.
>> >
>> > As far as your question goes:
>> >
>> >> Now, I'm trying to make facets on pictures: display doc w/ pic vs. doc w/o
>> >> pic
>> >> To the best of my knowledge, everyone is saying that faceting cannot be
>> >> done on dynamic fields (only on definitive field names). Thus, I tried the
>> >> following and it's working: I assume that the stored pictures have a
>> >> sequential number (_1, _2, etc.), i.e., if pic_url_1 exists in the index, it
>> >> means that the underlying doc has at least one picture:
>> >> ...&facet=on&facet.field=pic_url_1&facet.mincount=1&fq=pic_url_1:*
>> >> While this is working fine, I'm wondering whether there's a cleaner way to
>> >> do the same thing without assuming that pictures have a sequential number.
>> >
>> > If I understand your question correctly: faceting on docs with and without
>> > pics could of course be done like you mention; however, it would be more
>> > efficient to define an extra field, hasAtLeastOnePic, with values (0 | 1),
>> > and use that to facet / filter on.
>> >
>> > You can extend this to NrOfPics [0,N) if you need to filter / facet on docs
>> > with a certain nr of pics.
>> >
>> > Also, I wondered what else you wanted to do with this pic-related info. Do
>> > you want to search on pic-description / pic-caption, for instance? In that
>> > case the dynamic-fields approach may not be what you want: how would you
>> > know in which dynamic field to search for a particular term? Would it be
>> > pic_desc_1, or pic_desc_x? Of course you could OR over all dynamic fields,
>> > but then you need to know an upper bound for the nr of pics, and it
>> > really doesn't feel right, to me at least.
>> >
>> > If you need search on pic_description for instance, but don't mind which pic
>> > matches, you could create a single field pic_description and put in the
>> > concat of all pic-descriptions and search on that, or just make it a
>> > multi-valued field.
>> >
>> > If you don't need search at all on these fields, the best thing imo is to
>> > store all pic-related info of all pics together by concatenating them with
>> > some delimiter which you know how to separate at the client-side.
>> > That or just store it in an external RDB since solr is just sitting on the
>> > data and not doing anything intelligent with it.
>> >
>> > I assume btw that you don't want to sort / facet on pic-desc / pic_caption /
>> > pic_url either (I have a hard time thinking of a useful use case for that).
>> >
>> > HTH,
>> >
>> > Geert-Jan
>> >
>> >
>> >
>> > 2010/6/26 Saïd Radhouani <r.steve....@gmail.com>
>> >
>> >> Thanks so much Otis. This is working great.
>> >>
>> >> Now, I'm trying to make facets on pictures: display doc w/ pic vs. doc w/o
>> >> pic
>> >>
>> >> To the best of my knowledge, everyone is saying that faceting cannot be
>> >> done on dynamic fields (only on definitive field names). Thus, I tried the
>> >> following and it's working: I assume that the stored pictures have a
>> >> sequential number (_1, _2, etc.), i.e., if pic_url_1 exists in the index, it
>> >> means that the underlying doc has at least one picture:
>> >>
>> >> ...&facet=on&facet.field=pic_url_1&facet.mincount=1&fq=pic_url_1:*
>> >>
>> >> While this is working fine, I'm wondering whether there's a cleaner way to
>> >> do the same thing without assuming that pictures have a sequential number.
>> >>
>> >> Also, do you have any documentation about handling Dynamic Fields using
>> >> SolrJ? So far, I found only issues about that on JIRA, but no documentation.
>> >>
>> >> Thanks a lot.
>> >>
>> >> -Saïd
>> >>
>> >> On Jun 26, 2010, at 1:18 AM, Otis Gospodnetic wrote:
>> >>
>> >>> Saïd,
>> >>>
>> >>> Dynamic fields could help here, for example imagine a doc with:
>> >>> id
>> >>> pic_url_*
>> >>> pic_caption_*
>> >>> pic_description_*
>> >>>
>> >>> See http://wiki.apache.org/solr/SchemaXml#Dynamic_fields
>> >>>
>> >>> So, for you:
>> >>>
>> >>> <dynamicField name="pic_url_*"  type="string"  indexed="true" stored="true"/>
>> >>> <dynamicField name="pic_caption_*"  type="text"  indexed="true" stored="true"/>
>> >>> <dynamicField name="pic_description_*"  type="text"  indexed="true" stored="true"/>
>> >>>
>> >>> Then you can add docs with an unlimited number of
>> >>> pic_(url|caption|description)_* fields, e.g.
>> >>>
>> >>> id
>> >>> pic_url_1
>> >>> pic_caption_1
>> >>> pic_description_1
>> >>>
>> >>> id
>> >>> pic_url_2
>> >>> pic_caption_2
>> >>> pic_description_2
>> >>>
>> >>>
>> >>> Otis
>> >>> ----
>> >>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>> >>> Lucene ecosystem search :: http://search-lucene.com/
>> >>>
>> >>>
>> >>>
>> >>> ----- Original Message ----
>> >>>> From: Saïd Radhouani <r.steve....@gmail.com>
>> >>>> To: solr-user@lucene.apache.org
>> >>>> Sent: Fri, June 25, 2010 6:01:13 PM
>> >>>> Subject: Setting many properties for a multivalued field. Schema.xml ? External file?
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> I'm trying to index data containing a multivalued field "picture",
>> >>>> that has three properties: url, caption and description:
>> >>>>
>> >>>> <picture/>
>> >>>>   <url/>
>> >>>>   <caption/>
>> >>>>   <description/>
>> >>>>
>> >>>> Thus, each indexed document might have many pictures, each of them has a
>> >>>> url, a caption, and a description.
>> >>>>
>> >>>> I wonder whether it's possible to store this data using only schema.xml.
>> >>>> I couldn't figure it out so far. Instead, I'm thinking of using an
>> >>>> external file to store the properties of each picture, but I haven't
>> >>>> tried this solution yet, waiting for your suggestions...
>> >>>>
>> >>>> Thanks,
>> >>>> -Saïd
>> >>
>> >>
>>
>>
>
