Re: SOLR grouped query sorting on numFound

2013-09-25 Thread Erick Erickson
Hmmm, just specifying sort= is _almost_ what you want,
except it sorts by the values of fields in the docs, not by numFound.

This shouldn't be hard to do on the client, though you'd
have to return all the groups...

FWIW,
Erick
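
A minimal sketch of the client-side sort mentioned above, assuming the grouped JSON response has already been fetched as text. The function name and the sample data are illustrative only (the sample reuses the three groups shown in the question), and the whole list of groups has to fit in memory for this to work.

```python
import json


def groups_by_num_found(response_text, group_field="rfp_stub"):
    """Return (groupValue, numFound) pairs, largest group first."""
    response = json.loads(response_text)
    groups = response["grouped"][group_field]["groups"]
    pairs = [(g["groupValue"], g["doclist"]["numFound"]) for g in groups]
    # sorted() is stable, so groups with equal counts keep Solr's order.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample mirroring the shape of the response in the question.
sample = json.dumps({
    "grouped": {
        "rfp_stub": {
            "matches": 18470,
            "groups": [
                {"groupValue": "java.util.UUID:a1871c9e-cd7f-4e87-971d-d8a44effc33e",
                 "doclist": {"numFound": 3, "start": 0, "docs": []}},
                {"groupValue": "java.util.UUID:0c2f1045-a32d-4a4d-9143-e09db45a20ce",
                 "doclist": {"numFound": 5, "start": 0, "docs": []}},
                {"groupValue": "java.util.UUID:a3e1d56b-4172-4594-87c2-8895c5e5f131",
                 "doclist": {"numFound": 6, "start": 0, "docs": []}},
            ],
        }
    }
})

for value, found in groups_by_num_found(sample):
    print(found, value)
```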



Re: SOLR grouped query sorting on numFound

2013-09-25 Thread Brent Ryan
Yeah, that's the problem... you can't sort by numFound, and it's not
feasible to do the sort on the client because the grouped result set is too
large.

Brent




Re: SOLR grouped query sorting on numFound

2013-09-25 Thread Erick Erickson
But if it's too large on the client, wouldn't it also be too large on
the server? After all, you have to hold the entire set of groups in
memory, since you can't know ahead of time which will be the largest.
Or at least the counts of them all. I suppose you could do some
two-pass process where you returned 1 doc/group with absolutely
minimal data (like score and ID) and then issued a second query that
got the data to display, if (and only if) that suited your use case.
Otherwise I'm afraid you're into custom Solr code.

Best,
Erick




SOLR grouped query sorting on numFound

2013-09-24 Thread Brent Ryan
We ran into one snag during development with SOLR and I thought I'd run it by
everyone to see if anyone had any slick ways to solve this issue.

Basically, we're performing a SOLR query with grouping and want to be able
to sort by the number of documents found within each group.

Our query response from SOLR looks something like this:

{
  "responseHeader":{
    "status":0,
    "QTime":17,
    "params":{
      "indent":"true",
      "q":"*:*",
      "group.limit":"0",
      "group.field":"rfp_stub",
      "group":"true",
      "wt":"json",
      "rows":"1000"}},
  "grouped":{
    "rfp_stub":{
      "matches":18470,
      "groups":[{
          "groupValue":"java.util.UUID:a1871c9e-cd7f-4e87-971d-d8a44effc33e",
          "doclist":{"numFound":3,"start":0,"docs":[]
          }},
        {
          "groupValue":"java.util.UUID:0c2f1045-a32d-4a4d-9143-e09db45a20ce",
          "doclist":{"numFound":5,"start":0,"docs":[]
          }},
        {
          "groupValue":"java.util.UUID:a3e1d56b-4172-4594-87c2-8895c5e5f131",
          "doclist":{"numFound":6,"start":0,"docs":[]
          }},
        …


The *numFound* shows the number of documents within that group.  Is there
any way to perform a sort on *numFound* in SOLR?  I don't believe this is
supported, but wondered if anyone there has come across this and if there
were any suggested workarounds, given that the dataset is really too large to
hold in memory on our app servers?