Re: SOLR 3.3.0 multivalued field sort problem
On 13.08.2011 21:28 Erick Erickson wrote:

> Fair enough, but what's "first value in the list"?
> There's nothing special about "multiValued" fields,
> that is where the schema has "multiValued=true".
> Under the covers, this is no different than just
> concatenating all the values together and putting them
> in at one go, except for some games with the
> position between one term and another
> (positionIncrementGap). Part of my confusion is
> that the term multi-valued is sometimes used to
> refer to "multiValued=true" and sometimes used
> to refer to documents with more than one
> *token* in a particular field (often as the result
> of the analysis chain).

I guess, since multivalued fields are not really different under the hood, they should be treated the same. So, no matter whether the different values are the result of "multiValued=true" or of the analysis chain: if the whole thing starts with an "a", put it first; if it starts with a "z", put it last.

Example (multivalued field):
  Smith, Adam
  Duck, Dagobert
=> sort as "s" (or "S")

Example (tokenized field):
  This is a tokenized field
=> sort as "t" (or "T")

> The second case seems to be more in the
> grouping/field collapsing arena, although
> that doesn't work on fields with more than one
> value yet either. But that seems a more sensible
> place to put the second case rather than
> overloading sorting.

It depends on how you see the meaning of sorting:

1. Sort the records based on one single value per record (and return them in this order).
2. Sort the values of the field to sort on (and return the records belonging to the respective values).

As long as sorting is only allowed on single-valued fields, both are identical. As soon as you allow sorting on multivalued fields, the two interpretations mean something different, and I think both have their valid use cases.

But I don't want to stress this too far.

-Michael
Re: SOLR 3.3.0 multivalued field sort problem
Fair enough, but what's "first value in the list"? There's nothing special about "multiValued" fields, that is where the schema has "multiValued=true". Under the covers, this is no different than just concatenating all the values together and putting them in at one go, except for some games with the position between one term and another (positionIncrementGap). Part of my confusion is that the term multi-valued is sometimes used to refer to "multiValued=true" and sometimes used to refer to documents with more than one *token* in a particular field (often as the result of the analysis chain).

The second case seems to be more in the grouping/field collapsing arena, although that doesn't work on fields with more than one value yet either. But that seems a more sensible place to put the second case rather than overloading sorting.

I guess my take on the issue is that sorting has a pretty specific meaning, and that rather than overload sorting I'd rather see if the use cases are best served by another mechanism.

Best
Erick

On Sat, Aug 13, 2011 at 12:39 PM, Michael Lackhoff wrote:
> On 13.08.2011 18:03 Erick Erickson wrote:
>
>> The problem I've always had is that I don't quite know what
>> "sorting on multivalued fields" means. If your field had tokens
>> a and z, would sorting on that field put the doc
>> at the beginning or end of the list? Sure, you can define
>> rules (first token, last token, average of all tokens (whatever
>> that means)), but each solution would be wrong sometime,
>> somewhere, and/or completely useless.
>
> Of course it would need rules but I think it wouldn't be too hard to
> find rules that are at least far better than the current situation.
>
> My wish would include an option that decides if the field can be used
> just once or every value on its own. If the option is set to FALSE, only
> the first value would be used; if it is TRUE, every value of the field
> would get its place in the result list.
>
> So, if we have e.g.
>   record1: ccc and bbb
>   record2: aaa and zzz
> it would be either
>   record2 (aaa)
>   record1 (ccc)
> or
>   record2 (aaa)
>   record1 (bbb)
>   record1 (ccc)
>   record2 (zzz)
>
> I find these two outcomes most plausible so I would allow them if
> technically possible, but whatever rule looks more plausible to the
> experts: some solution is better than no solution.
>
> -Michael
Re: SOLR 3.3.0 multivalued field sort problem
On 13.08.2011 20:31 Martijn v Groningen wrote:

> The first solution would make sense to me. Some kind of a strategy
> mechanism for this would allow anyone to define their own rules.
> Duplicating results would be confusing to me.

That is why I would only activate it on request (by setting a special option). Example use case: a library catalogue with an author sort. All books of an author would be together, no matter how many co-authors the book has.

So I think it could be useful (as an option), but I have no idea how difficult it would be to implement.

As I said, it would be nice to have at least something. Any possible customization would be an extra bonus.

-Michael
Re: SOLR 3.3.0 multivalued field sort problem
I have a different use case. Consider a spatial multivalued field with lat/long values for addresses. I would want sort by geodist() to return the closest distance in each group. For example: find me the closest restaurant, with each doc being a chain name like Pizza Hut. Or doctors with multiple offices.

Bill Bell
Sent from mobile

On Aug 13, 2011, at 12:31 PM, Martijn v Groningen wrote:

> The first solution would make sense to me. Some kind of a strategy
> mechanism for this would allow anyone to define their own rules.
> Duplicating results would be confusing to me.
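To make the requested behaviour concrete, here is a minimal client-side sketch in Python. It does not use Solr or its geodist() function; it only illustrates the rule "sort each document by the nearest of its locations". The document layout, the field name `locations`, and all function names are assumptions made up for this illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(a: Point, b: Point) -> float:
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    lat1, lat2 = math.radians(a[0]), math.radians(b[0])
    dlat = lat2 - lat1
    dlon = math.radians(b[1] - a[1])
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(h))


def closest_km(origin: Point, locations: Sequence[Point]) -> float:
    """Distance from origin to the nearest of a document's locations."""
    return min(haversine_km(origin, loc) for loc in locations)


def sort_by_closest(origin: Point, docs: List[Dict]) -> List[Dict]:
    """Order documents by their nearest location (hypothetical 'locations' field)."""
    return sorted(docs, key=lambda d: closest_km(origin, d["locations"]))


# Hypothetical data: one document per chain, several branches each.
chains = [
    {"name": "chain_b", "locations": [(0.0, 5.0), (20.0, 20.0)]},
    {"name": "chain_a", "locations": [(10.0, 10.0), (0.0, 1.0)]},
]
ordered = sort_by_closest((0.0, 0.0), chains)
```

Here each chain appears once in the result, ranked by its nearest branch, which is the "one value per document, chosen by a rule" interpretation rather than duplicating documents.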
Re: SOLR 3.3.0 multivalued field sort problem
The first solution would make sense to me. Some kind of a strategy mechanism for this would allow anyone to define their own rules. Duplicating results would be confusing to me.

On 13 August 2011 18:39, Michael Lackhoff wrote:

> On 13.08.2011 18:03 Erick Erickson wrote:
>
> > The problem I've always had is that I don't quite know what
> > "sorting on multivalued fields" means. If your field had tokens
> > a and z, would sorting on that field put the doc
> > at the beginning or end of the list? Sure, you can define
> > rules (first token, last token, average of all tokens (whatever
> > that means)), but each solution would be wrong sometime,
> > somewhere, and/or completely useless.
>
> Of course it would need rules but I think it wouldn't be too hard to
> find rules that are at least far better than the current situation.
>
> My wish would include an option that decides if the field can be used
> just once or every value on its own. If the option is set to FALSE, only
> the first value would be used; if it is TRUE, every value of the field
> would get its place in the result list.
>
> So, if we have e.g.
>   record1: ccc and bbb
>   record2: aaa and zzz
> it would be either
>   record2 (aaa)
>   record1 (ccc)
> or
>   record2 (aaa)
>   record1 (bbb)
>   record1 (ccc)
>   record2 (zzz)
>
> I find these two outcomes most plausible so I would allow them if
> technically possible, but whatever rule looks more plausible to the
> experts: some solution is better than no solution.
>
> -Michael

--
Met vriendelijke groet,

Martijn van Groningen
Re: SOLR 3.3.0 multivalued field sort problem
On 13.08.2011 18:03 Erick Erickson wrote:

> The problem I've always had is that I don't quite know what
> "sorting on multivalued fields" means. If your field had tokens
> a and z, would sorting on that field put the doc
> at the beginning or end of the list? Sure, you can define
> rules (first token, last token, average of all tokens (whatever
> that means)), but each solution would be wrong sometime,
> somewhere, and/or completely useless.

Of course it would need rules but I think it wouldn't be too hard to find rules that are at least far better than the current situation.

My wish would include an option that decides if the field can be used just once or every value on its own. If the option is set to FALSE, only the first value would be used; if it is TRUE, every value of the field would get its place in the result list.

So, if we have e.g.
  record1: ccc and bbb
  record2: aaa and zzz
it would be either
  record2 (aaa)
  record1 (ccc)
or
  record2 (aaa)
  record1 (bbb)
  record1 (ccc)
  record2 (zzz)

I find these two outcomes most plausible so I would allow them if technically possible, but whatever rule looks more plausible to the experts: some solution is better than no solution.

-Michael
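The two outcomes described above can be sketched in a few lines of Python. This is only an illustration of the proposed semantics, not of anything Solr does; the function names and the in-memory `records` mapping are made up for the example.

```python
from typing import Dict, List, Tuple

# The two example records from the message, values in stored order.
records: Dict[str, List[str]] = {
    "record1": ["ccc", "bbb"],
    "record2": ["aaa", "zzz"],
}


def sort_by_first_value(recs: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Option FALSE: each record appears once, keyed by its first value."""
    pairs = [(name, values[0]) for name, values in recs.items() if values]
    return sorted(pairs, key=lambda pair: pair[1])


def sort_by_every_value(recs: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Option TRUE: each record appears once per value it contains."""
    pairs = [(name, value) for name, values in recs.items() for value in values]
    return sorted(pairs, key=lambda pair: pair[1])


once = sort_by_first_value(records)
repeated = sort_by_every_value(records)
```

With the first rule the result has exactly one entry per record; with the second, the result list grows to one entry per value, which is the "duplicated results" effect discussed elsewhere in the thread.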
Re: SOLR 3.3.0 multivalued field sort problem
The problem I've always had is that I don't quite know what "sorting on multivalued fields" means. If your field had tokens a and z, would sorting on that field put the doc at the beginning or end of the list? Sure, you can define rules (first token, last token, average of all tokens (whatever that means)), but each solution would be wrong sometime, somewhere, and/or completely useless.

I'd love to have a better answer.

Best
Erick

On Fri, Aug 12, 2011 at 11:32 AM, Martijn v Groningen wrote:
> Hi Johnny,
>
> Sorting on a multivalued field has never really worked in Solr.
> Solr versions <= 1.4.1 allowed it, but there was a chance that an error
> would occur or that the sort order would not be what you expect.
> From Solr 3.1 on, sorting on a multivalued field isn't allowed and an
> HTTP 400 is returned.
>
> Duplicating documents or fields (what Péter describes) is, as far as I
> know, the only option until Lucene supports sorting on multivalued
> fields properly.
>
> Martijn
Re: SOLR 3.3.0 multivalued field sort problem
Hi Johnny,

Sorting on a multivalued field has never really worked in Solr. Solr versions <= 1.4.1 allowed it, but there was a chance that an error would occur or that the sort order would not be what you expect. From Solr 3.1 on, sorting on a multivalued field isn't allowed and an HTTP 400 is returned.

Duplicating documents or fields (what Péter describes) is, as far as I know, the only option until Lucene supports sorting on multivalued fields properly.

Martijn

2011/8/12 Péter Király

> Hi,
>
> There is no direct solution, you have to create single value field(s)
> to create search. I am aware of two workarounds:
>
> - you can use a random or a given (e.g. the first) instance of the
>   multiple values of the field, and that would be your sortable field.
> - you can create two sortable fields: _min and _max, which
>   contain the minimal and maximal values of the given field values.
>
> At least, that's what I do. Probably there are other solutions as well.
>
> Péter
> --
> eXtensible Catalog
> http://drupal.org/project/xc

--
Met vriendelijke groet,

Martijn van Groningen
Re: SOLR 3.3.0 multivalued field sort problem
Hi,

There is no direct solution; you have to create single-valued field(s) to sort on. I am aware of two workarounds:

- you can use a random or a given (e.g. the first) instance of the multiple values of the field, and that would be your sortable field.
- you can create two sortable fields: _min and _max, which contain the minimal and maximal values of the given field values.

At least, that's what I do. Probably there are other solutions as well.

Péter
--
eXtensible Catalog
http://drupal.org/project/xc

2011/8/12 johnnyisrael :
> Hi,
>
> I am currently using SOLR 1.4.1. With this version, sorting works fine
> even on multivalued fields.
>
> Now I am planning to upgrade my SOLR version from 1.4.1 --> 3.3.0. In
> this latest version, sorting is not working on multivalued fields.
>
> So I am unable to upgrade my SOLR due to this drawback.
>
> Is there a workaround available to fix this problem?
>
> Thanks,
>
> Johnny
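Both workarounds above amount to deriving extra single-valued fields from the multivalued one before the document is sent to the index. A minimal Python sketch of that preprocessing step follows; the helper name `add_sort_fields` and the `_first` suffix are assumptions for illustration, and the derived fields would still have to be declared as single-valued, sortable fields in the schema.

```python
from typing import Any, Dict


def add_sort_fields(doc: Dict[str, Any], field: str) -> Dict[str, Any]:
    """Return a copy of doc with single-valued sort fields derived from a
    multivalued field: <field>_first, <field>_min and <field>_max.
    The suffixes are illustrative; any consistent naming would do."""
    enriched = dict(doc)
    values = doc.get(field) or []
    if values:
        enriched[f"{field}_first"] = values[0]   # workaround 1: a given instance
        enriched[f"{field}_min"] = min(values)   # workaround 2: minimal value
        enriched[f"{field}_max"] = max(values)   # workaround 2: maximal value
    return enriched


# Hypothetical catalogue record with two authors.
book = {"id": "1", "author": ["Smith, Adam", "Duck, Dagobert"]}
indexed = add_sort_fields(book, "author")
```

A query could then sort on one of the derived fields, for example `sort=author_min asc`, instead of on the multivalued `author` field itself.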
SOLR 3.3.0 multivalued field sort problem
Hi,

I am currently using SOLR 1.4.1. With this version, sorting works fine even on multivalued fields.

Now I am planning to upgrade my SOLR version from 1.4.1 --> 3.3.0. In this latest version, sorting is not working on multivalued fields.

So I am unable to upgrade my SOLR due to this drawback.

Is there a workaround available to fix this problem?

Thanks,

Johnny

--
View this message in context: http://lucene.472066.n3.nabble.com/SOLR-3-3-0-multivalued-field-sort-problem-tp3248778p3248778.html
Sent from the Solr - User mailing list archive at Nabble.com.