Re: Division with Stats Component when Grouping in Solr

2015-06-14 Thread kingofhypocrites
I think I have this just about working with the analytics component. It seems to
fill in all the gaps that the stats component and the JSON facet don't
support.

It solved the following problems for me:
- I am able to perform math on stats to form other stats, and then I can sort
on those as needed.
- When I perform math on stats, it uses the summed totals per group rather
than doing it per row.
- I am able to set an offset and a number of rows to handle paging.

I am confused why this module isn't built into Solr. This functionality is so
vital for any ad hoc querying on time series data. Pretty much any scenario
like the SQL query I provided would need all of these things.

The only thing I couldn't figure out is how to get the total number of buckets,
or in other words the distinct count of keywords. If anyone is able to help
with this, I could really use it in order to provide a total record count to
the user (e.g. "Showing records 1-10 of 2939").

Here is what I have in case this helps someone:
olap=true&o.r1.ff=keyword_s&o.r1.s.visits=sum(visits_i)&o.r1.s.bounces=sum(bounces_i)&o.r1.s.bounce_rate=div(sum(bounces_i),sum(visits_i))&o.r1.ff.keyword_s.sortstatistic=bounce_rate&o.r1.ff.keyword_s.sortdirection=desc&o.r1.ff.keyword_s.offset=0&o.r1.ff.keyword_s.limit=10
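
For anyone scripting this, the parameter list above can also be assembled programmatically so that the paging values are easy to change. The following is only a minimal Python sketch: the parameter names and values are copied from the query above, while the function name build_query and the choice to leave parentheses and commas unescaped are assumptions made for readability.

```python
from urllib.parse import urlencode


def build_query(offset, limit):
    """Assemble the analytics (olap) parameters from the post as a query string.

    build_query is a hypothetical helper; only the parameter names and
    values come from the original message.
    """
    params = [
        ("olap", "true"),
        ("o.r1.ff", "keyword_s"),                      # facet on keyword
        ("o.r1.s.visits", "sum(visits_i)"),
        ("o.r1.s.bounces", "sum(bounces_i)"),
        # Division is applied to the per-bucket sums, not to single rows.
        ("o.r1.s.bounce_rate", "div(sum(bounces_i),sum(visits_i))"),
        ("o.r1.ff.keyword_s.sortstatistic", "bounce_rate"),
        ("o.r1.ff.keyword_s.sortdirection", "desc"),
        ("o.r1.ff.keyword_s.offset", offset),          # paging: first bucket
        ("o.r1.ff.keyword_s.limit", limit),            # paging: page size
    ]
    # Keep "(", ")" and "," readable instead of percent-encoding them.
    return urlencode(params, safe="(),")


print(build_query(offset=0, limit=10))
```

Calling build_query(20, 10) would request the third page of ten keywords, assuming the offset counts buckets rather than pages.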

Also, if anyone has access to the original documentation from Bloomberg
mentioned in the stats component PDF, I'd love to have it :)
https://issues.apache.org/jira/secure/attachment/12606793/Search%20Analytics%20Component.pdf

All the links for detailed documentation are now broken.




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Division-with-Stats-Component-when-Grouping-in-Solr-tp4211402p4211751.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Division with Stats Component when Grouping in Solr

2015-06-14 Thread Erick Erickson
Why it isn't in core Solr... Because it doesn't (and probably can't)
support distributed mode.
The Streaming aggregation stuff, and the (in trunk Real Soon Now)
Parallel SQL support
are where the effort is going to support this kind of stuff.

https://issues.apache.org/jira/browse/SOLR-7560

https://issues.apache.org/jira/browse/SOLR-7082

Best,
Erick


Re: Division with Stats Component when Grouping in Solr

2015-06-13 Thread kingofhypocrites
I was able to get the new version of Solr installed. This query gets me
really close, but it is averaging the rows BEFORE the grouping, so it's not
totally accurate. I need it to sum the visits and bounces by keyword and
then perform the division. The avg here probably seems confusing and
pointless, but it wouldn't let me just put the div directly in the facet
without wrapping it in a function.

So instead of summing all the rows into one group and performing the division,
it is dividing each row one by one and then averaging the results together, which
creates skewed results since one day may have more data than another.

It seems dividing would be possible if only I could tell it to divide the
per-keyword grouped totals rather than dividing the individual rows and
averaging them together.

Here is what I have (granted, it's a simplified version for testing):
json.facet={
  keywords:{
    type:terms,
    limit:10,
    field:keyword,
    facet:{
      bounces_sum:sum(bounces),
      visits_sum:sum(visits),
      bounce_rate:avg(div(sum(bounces),sum(visits)))
    }
  }
}

What I really want is:
bounce_rate: div(bounces_sum, visits_sum)  ...  but this doesn't work.
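
To make the skew concrete, here is a small Python sketch comparing the two orders of operation. The two daily rows are made-up numbers for a single keyword, not data from this thread; one busy day and one quiet day are enough to show the difference.

```python
# Hypothetical daily rows for one keyword: (visits, bounces).
rows = [(1000, 500), (10, 1)]


def average_of_ratios(rows):
    """Divide each row first, then average the per-row results."""
    return sum(bounces / visits for visits, bounces in rows) / len(rows)


def ratio_of_sums(rows):
    """Sum each column over the whole group first, then divide once."""
    total_visits = sum(visits for visits, _ in rows)
    total_bounces = sum(bounces for _, bounces in rows)
    return total_bounces / total_visits


# The quiet day counts as much as the busy day in the first figure,
# but only in proportion to its visits in the second.
print(average_of_ratios(rows))
print(ratio_of_sums(rows))
```

The second function corresponds to what the SQL query computes with SUM(bounces) / SUM(visits) under GROUP BY.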





Re: Division with Stats Component when Grouping in Solr

2015-06-13 Thread kingofhypocrites
Not sure why but half of my posts are showing up as not accepted by the
mailing list. I've made a few replies to others that haven't gone through. I
am not sure if it's because I'm replying via email or what the issue is.





Re: Division with Stats Component when Grouping in Solr

2015-06-13 Thread Erick Erickson
kingofhypocrites:

Usually that's because your e-mail formats with html or some other
non-plain-text format. Try sending them as plain text.



Re: Division with Stats Component when Grouping in Solr

2015-06-13 Thread kingofhypocrites
@Billnbell What did you conclude with the Analytics component? It sounds like
you are saying it does the same thing as the stats component but it has
several other features that aren't supported by the stats library. I'd love
to have a talk with you offline if possible.





Re: Division with Stats Component when Grouping in Solr

2015-06-13 Thread kingofhypocrites
@Yonik, thanks for this! I was actually just looking at your blog earlier
today and thinking that the JSON facet feature may be just what I need. I'm
currently using Solr 4.3, as that is what comes with DataStax, so I'm trying
to create a new build with the latest Solr version so I can test this
feature. For the sort, I am assuming this would be sorting on sum(visits) for
the given keyword, correct? Also, can you confirm whether it's possible to do a
division in the facet? Something like facet: { bouncerate: div(sum(bounces),
sum(visits)) }. Because of the large number of results, I would need to
precalculate this (division operation) if they happen to sort on it. I don't
see anything like this mentioned in the API docs, so maybe it's not
possible.





Re: Division with Stats Component when Grouping in Solr

2015-06-13 Thread kingofhypocrites
This looks very promising if only I could get it to work:
https://issues.apache.org/jira/browse/SOLR-5302
https://issues.apache.org/jira/secure/attachment/12606793/Search%20Analytics%20Component.pdf

Various links it points to are broken now and I can't find anything about
it online, but the PDF indicates I can set olap=true to turn it on,
although this doesn't seem to do anything. The docs say it supports
limiting the results and doing math operations on statistics, which is
exactly what I need. I'm not clear whether I need to install this or whether
this component is even used anymore.

On Fri, Jun 12, 2015 at 12:00 PM Joel Bernstein [via Lucene] 
ml-node+s472066n4211422...@n3.nabble.com wrote:

 https://issues.apache.org/jira/browse/SOLR-7560, will almost support this
 in Solr 5.3. The compound function support won't be there yet though. But
 it will be there in the near future.



 Joel Bernstein
 http://joelsolr.blogspot.com/

 On Fri, Jun 12, 2015 at 9:30 AM, kingofhypocrites wrote:

  I am migrating a database from SQL Server to Cassandra. Currently I have a
  setup as follows:

  - Log data in Cassandra
  - Summarize data in Spark and put into Cassandra summary tables
  - Query data in Solr

  Everything fits beautifully until I need to do stats on groups. I am hoping
  to get this to work with Solr so I can stick to one database, but I am not
  sure it's possible.

  If I had it in SQL Server, I could do it like so:
  SELECT
      site_id,
      keyword,
      SUM(visits) as visits,
      CONVERT(DECIMAL(13, 3), SUM(bounces)) / SUM(visits) as bounce_rate,
      SUM(pageviews) as pageviews,
      CONVERT(DECIMAL(13, 3), SUM(pageviews)) / SUM(visits) as avg_pages_per_visit
  FROM
      report_all_keywords_daily
  WHERE
      site_id = 55 AND date_key >= '20150606' AND date_key <= '20150608'
  GROUP BY
      site_id, keyword
  ORDER BY visits DESC

  Now I need to replicate this in Solr. The closest I could get to this is by
  using the Stats component and then using field collapsing:

  group=true&group.field=keyword&stats=true&stats.field=visits&stats.facet=keyword

  And here are some results I get back:
  http://pastebin.com/raw.php?i=Fxhe2RA0

  However, I need to be able to divide certain metrics. I tried including
  functions in the stats.field such as div(sum(bounce_rate), sum(visits)), but
  it doesn't recognize the functions. Also, it seems to be ignoring the paging
  for the stats results and returns all groups regardless.

  Ultimately I'd like something like this, which is what I would get in SQL:
  http://lucene.472066.n3.nabble.com/file/n4211402/pic.png

  Is this possible, or do I have to give up on the prospect of using Solr? I
  have to query this data dynamically, so I can't pre-summarize all of it.

  To clarify, I am having the following two problems:
  - Paging is ignored for stats data
  - I can't figure out how to divide two stats together to get a third stat.
    Note: In some cases I would need to be able to sort on this combined stat
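
As a reference point for what any Solr-side solution has to reproduce, the report described in this message (filter, group by keyword, sum, divide the sums, sort, then page) can be sketched in a few lines of Python. This is only an in-memory illustration under assumptions: the sample rows and the keyword_report function are hypothetical, and only the column names come from the SQL in the message.

```python
from collections import defaultdict

# Hypothetical summary rows:
# (site_id, keyword, date_key, visits, bounces, pageviews)
rows = [
    (55, "solr", "20150606", 100, 40, 250),
    (55, "solr", "20150607", 300, 60, 900),
    (55, "lucene", "20150606", 50, 25, 75),
    (55, "lucene", "20150609", 999, 999, 999),  # outside the date range
    (77, "solr", "20150606", 10, 1, 10),        # different site
]


def keyword_report(rows, site_id, start, end, offset=0, limit=10):
    """Return (total_groups, page) with ratios computed from grouped sums."""
    totals = defaultdict(lambda: [0, 0, 0])  # keyword -> [visits, bounces, pageviews]
    for site, keyword, date_key, visits, bounces, pageviews in rows:
        if site == site_id and start <= date_key <= end:
            group = totals[keyword]
            group[0] += visits
            group[1] += bounces
            group[2] += pageviews

    report = [
        {
            "keyword": keyword,
            "visits": visits,
            "bounce_rate": bounces / visits,
            "avg_pages_per_visit": pageviews / visits,
        }
        for keyword, (visits, bounces, pageviews) in totals.items()
        if visits  # skip groups with zero visits to avoid dividing by zero
    ]
    report.sort(key=lambda r: r["visits"], reverse=True)  # ORDER BY visits DESC
    return len(report), report[offset:offset + limit]


total, page = keyword_report(rows, 55, "20150606", "20150608")
print(total, page)
```

Returning the group count alongside the page is what a "Showing records 1-10 of N" display would need; sorting on bounce_rate instead only requires changing the sort key.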
 
 
 

Re: Division with Stats Component when Grouping in Solr

2015-06-13 Thread William Bell
OK, more info:

<requestHandler name="standard" class="solr.StandardRequestHandler">
  <arr name="components">
    <str>query</str>
    <str>facet</str>
    <str>analytics</str>
    <str>highlight</str>
    <str>debug</str>
    <str>expand</str>
  </arr>
</requestHandler>

<searchComponent name="analytics"
    class="org.apache.solr.handler.component.AnalyticsComponent" />

I am going to try that after adding it to solrconfig.xml.




Re: Division with Stats Component when Grouping in Solr

2015-06-13 Thread William Bell
Note: you need to enable docValues to get the range stuff to work.

Set docValues="true" on the field.


Re: Division with Stats Component when Grouping in Solr

2015-06-13 Thread William Bell
OK. Kinda like pivoting stats...

http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true&olap=true&olap.req1.fieldfacet=overall_score&facet=true&facet.field=overall_score&olap.req1.statistic.count=count(overall_score)

Basically this does the same thing in olap and in facet.


{
  "response": {
    "numFound": 63061,
    "start": 0,
    "docs": []
  },
  "facet_counts": {
    "facet_queries": {},
    "facet_fields": {
      "overall_score": ["1", 40138, "5", 17487, "2", 2299, "4", 1810, "3", 1314]
    },
    "facet_dates": {},
    "facet_ranges": {},
    "facet_intervals": {},
    "facet_heatmaps": {}
  },
  "stats": [
    "req1",
    [
      "count", 63048,
      "fieldFacets",
      [
        "overall_score",
        [
          "1", ["count", 40138],
          "2", ["count", 2299],
          "3", ["count", 1314],
          "4", ["count", 1810],
          "5", ["count", 17487]
        ]
      ],
      "rangeFacets", [],
      "queryFacets", []
    ]
  ]
}
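
The stats section above comes back as flat lists of alternating names and values rather than as nested objects, so client code has to pair the entries up before it can look anything up. A minimal Python sketch of that step follows; the helper names are hypothetical, and it assumes the bucket names arrive as strings, which the pasted output does not show.

```python
def pairs_to_dict(flat):
    """Turn [name1, value1, name2, value2, ...] into {name1: value1, ...}."""
    return dict(zip(flat[0::2], flat[1::2]))


# Copy of the "stats" section from the response above
# (bucket names as strings is an assumption).
stats = [
    "req1",
    [
        "count", 63048,
        "fieldFacets",
        [
            "overall_score",
            [
                "1", ["count", 40138],
                "2", ["count", 2299],
                "3", ["count", 1314],
                "4", ["count", 1810],
                "5", ["count", 17487],
            ],
        ],
        "rangeFacets", [],
        "queryFacets", [],
    ],
]


def counts_by_bucket(stats, request, field):
    """Return {bucket: count} for one field facet of one analytics request."""
    req = pairs_to_dict(pairs_to_dict(stats)[request])
    buckets = pairs_to_dict(pairs_to_dict(req["fieldFacets"])[field])
    return {name: pairs_to_dict(values)["count"] for name, values in buckets.items()}


counts = counts_by_bucket(stats, "req1", "overall_score")
print(counts)
print(sum(counts.values()))  # compare with the overall "count" for req1
```

The per-bucket counts add up to the overall count of 63048 reported for req1, which is a quick sanity check that the facet covers every document that has a value in the field.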


On Sat, Jun 13, 2015 at 2:06 PM, William Bell billnb...@gmail.com wrote:

 Having a hard time getting this to work:

 http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true&olap=true&olap.req1.fieldfacet=overall_score

 and even tried... I made sure docValues was set for overall_score too.

 http://hgsolr2devmstr:8983/solr/survey/select?q=*%3A*&wt=json&indent=true&olap=true&olap.fieldfacet=overall_score

 <field name="overall_score" type="int" indexed="true" stored="true"
 docValues="true" />


Re: Division with Stats Component when Grouping in Solr

2015-06-13 Thread William Bell
Same here.

What do we need to add to solrconfig.xml to get it to work?


SOLR-5302: https://issues.apache.org/jira/browse/SOLR-5302






-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076


Re: Division with Stats Component when Grouping in Solr

2015-06-13 Thread William Bell
OK. That works with one more change.

<lib dir="../../../dist/" regex="solr-analytics-.*\.jar" />

<lib dir="../../../dist/" regex="solr-analysis-.*\.jar" />

http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true&stats=true&olap=true&olap.overall_score.statistic.sum=sum(overall_score)

On Sat, Jun 13, 2015 at 1:16 PM, William Bell billnb...@gmail.com wrote:

 OK more info

 <requestHandler name="standard" class="solr.StandardRequestHandler">
 <arr name="components">
   <str>query</str>
   <str>facet</str>
   <str>analytics</str>
   <str>highlight</str>
   <str>debug</str>
   <str>expand</str>
 </arr>
   </requestHandler>


 <searchComponent name="analytics"
 class="org.apache.solr.handler.component.AnalyticsComponent" />

 I am going to try that after adding it to solrconfig.xml.



 On Sat, Jun 13, 2015 at 1:11 PM, William Bell billnb...@gmail.com wrote:

 Same here.

 What do we need to add to solrconfig.xml to get it to work?


1. SOLR-5302 https://issues.apache.org/jira/browse/SOLR-5302
2.
3. Help/


 On Sat, Jun 13, 2015 at 8:34 AM, kingofhypocrites 
 kingofhypocri...@gmail.com wrote:

 This looks very promising if only I could get it to work:
 https://issues.apache.org/jira/browse/SOLR-5302

 https://issues.apache.org/jira/secure/attachment/12606793/Search%20Analytics%20Component.pdf

 Various links it points to are broken now and i can't find anything about
 it online, but the PDF indicates I can set olap=true to turn it on,
 although this doesn't seem to do anything. The docs say it supports
 limiting the results and doing math operations on statistics which is
 exactly what I need. I'm not clear if I need to install this or if this
 component is even used anymore.

 On Fri, Jun 12, 2015 at 12:00 PM Joel Bernstein [via Lucene] 
 ml-node+s472066n4211422...@n3.nabble.com wrote:

  https://issues.apache.org/jira/browse/SOLR-7560, will almost support
 this
  in Solr 5.3. The compound function support won't be there yet though.
 But
  it will be there in the near future.
 
 
 
  Joel Bernstein
  http://joelsolr.blogspot.com/
 
  On Fri, Jun 12, 2015 at 9:30 AM, kingofhypocrites 
  [hidden email]
  wrote:
 
   I am migrating a database from SQL Server to Cassandra. Currently I
 have
  a
   setup as follows:
  
   - Log data in Cassandra
   - Summarize data in Spark and put into Cassandra summary tables
   - Query data in Solr
  
   Everything fits beautifully until I need to do stats on groups. I am
  hoping
   to get this to work with Solr so I can stick to one database, but I
 am
  not
   sure it's possible.
  
   If I had it in SQL Server, I could do it like so:
   SELECT
   site_id,
   keyword,
   SUM(visits) as visits,
   CONVERT(DECIMAL(13, 3), SUM(bounces)) / SUM(visits) as
 bounce_rate,
   SUM(pageviews) as pageviews,
   CONVERT(DECIMAL(13, 3), SUM(pageviews)) / SUM(visits) as
   avg_pages_per_visit
   FROM
   report_all_keywords_daily
   WHERE
   site_id = 55 AND date_key >= '20150606' AND date_key <=
 '20150608'
   GROUP BY
   site_id, keyword
   ORDER BY visits DESC
  
   Now I need to replicate this in Solr. The closest I could get to
 this is
  by
   using the Stats component and then using field collapsing.
  
  
 
 group=true&group.field=keyword&stats=true&stats.field=visits&stats.facet=keyword
 
  
   And here are some results I get back:
   http://pastebin.com/raw.php?i=Fxhe2RA0
  
   However, I need to be able to divide certain metrics. I tried
 including
   functions in the stats.field such as div(sum(bounce_rate),
 (sum(visits))
   but
   it doesn't recognize the functions. Also it seems to be ignoring the
 paging
   for
   the stats results and returns all groups regardless.
  
   Ultimately I'd like something like this which is what I would get in
  SQL:
   http://lucene.472066.n3.nabble.com/file/n4211402/pic.png
  
   Is this possible or do I have to give up on the prospect of using
 Solr?
  I
   have to query this data dynamically so I can't pre-summarize all of
 it.
  
   To clarify, I am having the following two problems:
   - Paging is ignored for stats data
   - I can't figure out how to divide two stats together to get a third
  stat.
   Note: In some cases I would need to be able to sort on this combined
  stat
  
  
  
   --
   View this message in context:
  
 
 http://lucene.472066.n3.nabble.com/Division-with-Stats-Component-when-Grouping-in-Solr-tp4211402.html
   Sent from the Solr - User mailing list archive at Nabble.com.
  
 
 
 

Re: Division with Stats Component when Grouping in Solr

2015-06-13 Thread William Bell
Having a hard time getting this to work:

http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true&olap=true&olap.req1.fieldfacet=overall_score


and even tried... I made sure docValues was set for overall_score too.

http://hgsolr2devmstr:8983/solr/survey/select?q=*%3A*&wt=json&indent=true&olap=true&olap.fieldfacet=overall_score

<field name="overall_score" type="int" indexed="true" stored="true"
docValues="true" />

On Sat, Jun 13, 2015 at 2:02 PM, William Bell billnb...@gmail.com wrote:

 Note: you need to enable docValues to get range stuff to work.

 docValues=true on the field.

 On Sat, Jun 13, 2015 at 1:37 PM, William Bell billnb...@gmail.com wrote:

 OK. That works with one more change.

 <lib dir="../../../dist/" regex="solr-analytics-.*\.jar" />

  <lib dir="../../../dist/" regex="solr-analysis-.*\.jar" />


 http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true&stats=true&olap=true&olap.overall_score.statistic.sum=sum(overall_score)

 On Sat, Jun 13, 2015 at 1:16 PM, William Bell billnb...@gmail.com
 wrote:

 OK more info

 <requestHandler name="standard" class="solr.StandardRequestHandler">
 <arr name="components">
   <str>query</str>
   <str>facet</str>
   <str>analytics</str>
   <str>highlight</str>
   <str>debug</str>
   <str>expand</str>
 </arr>
   </requestHandler>


 <searchComponent name="analytics"
 class="org.apache.solr.handler.component.AnalyticsComponent" />

 I am going to try that after adding it to solrconfig.xml.



 On Sat, Jun 13, 2015 at 1:11 PM, William Bell billnb...@gmail.com
 wrote:

 Same here.

 What do we need to add to solrconfig.xml to get it to work?


1. SOLR-5302 https://issues.apache.org/jira/browse/SOLR-5302
2.
3. Help/


 On Sat, Jun 13, 2015 at 8:34 AM, kingofhypocrites 
 kingofhypocri...@gmail.com wrote:

 This looks very promising if only I could get it to work:
 https://issues.apache.org/jira/browse/SOLR-5302

 https://issues.apache.org/jira/secure/attachment/12606793/Search%20Analytics%20Component.pdf

 Various links it points to are broken now and i can't find anything
 about
 it online, but the PDF indicates I can set olap=true to turn it on,
 although this doesn't seem to do anything. The docs say it supports
 limiting the results and doing math operations on statistics which is
 exactly what I need. I'm not clear if I need to install this or if this
 component is even used anymore.

 On Fri, Jun 12, 2015 at 12:00 PM Joel Bernstein [via Lucene] 
 ml-node+s472066n4211422...@n3.nabble.com wrote:

  https://issues.apache.org/jira/browse/SOLR-7560, will almost
 support this
  in Solr 5.3. The compound function support won't be there yet
 though. But
  it will be there in the near future.
 
 
 
  Joel Bernstein
  http://joelsolr.blogspot.com/
 
  On Fri, Jun 12, 2015 at 9:30 AM, kingofhypocrites 
  [hidden email]
  wrote:
 
   I am migrating a database from SQL Server to Cassandra. Currently
 I have
  a
   setup as follows:
  
   - Log data in Cassandra
   - Summarize data in Spark and put into Cassandra summary tables
   - Query data in Solr
  
   Everything fits beautifully until I need to do stats on groups. I
 am
  hoping
   to get this to work with Solr so I can stick to one database, but
 I am
  not
   sure it's possible.
  
   If I had it in SQL Server, I could do it like so:
   SELECT
   site_id,
   keyword,
   SUM(visits) as visits,
   CONVERT(DECIMAL(13, 3), SUM(bounces)) / SUM(visits) as
 bounce_rate,
   SUM(pageviews) as pageviews,
   CONVERT(DECIMAL(13, 3), SUM(pageviews)) / SUM(visits) as
   avg_pages_per_visit
   FROM
   report_all_keywords_daily
   WHERE
   site_id = 55 AND date_key >= '20150606' AND date_key <=
 '20150608'
   GROUP BY
   site_id, keyword
   ORDER BY visits DESC
  
   Now I need to replicate this in Solr. The closest I could get to
 this is
  by
   using the Stats component and then using field collapsing.
  
  
 
 group=true&group.field=keyword&stats=true&stats.field=visits&stats.facet=keyword
 
  
   And here are some results I get back:
   http://pastebin.com/raw.php?i=Fxhe2RA0
  
   However, I need to be able to divide certain metrics. I tried
 including
   functions in the stats.field such as div(sum(bounce_rate),
 (sum(visits))
   but
   it doesn't recognize the functions. Also it seems to be ignoring the
 paging
   for
   the stats results and returns all groups regardless.
  
   Ultimately I'd like something like this which is what I would get
 in
  SQL:
   http://lucene.472066.n3.nabble.com/file/n4211402/pic.png
  
   Is this possible or do I have to give up on the prospect of using
 Solr?
  I
   have to query this data dynamically so I can't pre-summarize all
 of it.
  
   To clarify, I am having the following two problems:
   - Paging is ignored for stats data
   - I can't figure out how to divide two stats together to get a
 third
  stat.
   Note: In some cases I would need to be able to sort on this
 combined
  stat
  
  
  
   --
   View this message in 

Re: Division with Stats Component when Grouping in Solr

2015-06-13 Thread Yonik Seeley
On Fri, Jun 12, 2015 at 10:30 AM, kingofhypocrites
kingofhypocri...@gmail.com wrote:
 I am migrating a database from SQL Server to Cassandra. Currently I have a
 setup as follows:

 - Log data in Cassandra
 - Summarize data in Spark and put into Cassandra summary tables
 - Query data in Solr

 Everything fits beautifully until I need to do stats on groups. I am hoping
 to get this to work with Solr so I can stick to one database, but I am not
 sure it's possible.

 If I had it in SQL Server, I could do it like so:
 SELECT
 site_id,
 keyword,
 SUM(visits) as visits,
 CONVERT(DECIMAL(13, 3), SUM(bounces)) / SUM(visits) as bounce_rate,
 SUM(pageviews) as pageviews,
 CONVERT(DECIMAL(13, 3), SUM(pageviews)) / SUM(visits) as
 avg_pages_per_visit
 FROM
 report_all_keywords_daily
 WHERE
 site_id = 55 AND date_key >= '20150606' AND date_key <= '20150608'
 GROUP BY
 site_id, keyword
 ORDER BY visits DESC

This is the closest we can get with the JSON Facet API today:

json.facet={
  sites: {
    type : terms,
    field : site_id,
    sort : "visits desc",
    facet : {
      visits : "sum(visits)",
      bounces : "sum(bounces)",
      pageviews : "sum(pageviews)"
    }
  }
}

That doesn't take into account keyword when sorting the buckets.
You could nest a keyword facet inside a site facet and thus calculate
the stats for the top N keywords per site:

json.facet={
  sites: {
    type : terms,
    field : site_id,
    facet : {
      keywords: {
        type : terms,
        field : keyword,
        sort : "visits desc",
        facet : {
          visits : "sum(visits)",
          bounces : "sum(bounces)",
          pageviews : "sum(pageviews)"
        }
      }
    }
  }
}

More info here:  http://yonik.com/json-facet-api/
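
A rough Python sketch of how a client might build the nested facet request
and derive bounce_rate and avg_pages_per_visit from the returned buckets.
The response shape (facets -> sites -> buckets -> keywords -> buckets) and
the sample numbers are assumptions for illustration, not output from a real
index; flatten_buckets and build_facet_param are hypothetical helper names.

```python
import json


def build_facet_param():
    """Return the json.facet parameter value as a JSON string."""
    facet = {
        "sites": {
            "type": "terms",
            "field": "site_id",
            "facet": {
                "keywords": {
                    "type": "terms",
                    "field": "keyword",
                    "sort": "visits desc",
                    "facet": {
                        "visits": "sum(visits)",
                        "bounces": "sum(bounces)",
                        "pageviews": "sum(pageviews)",
                    },
                }
            },
        }
    }
    return json.dumps(facet)


def flatten_buckets(facets):
    """Turn nested site/keyword buckets into flat rows with derived ratios."""
    rows = []
    for site in facets.get("sites", {}).get("buckets", []):
        for kw in site.get("keywords", {}).get("buckets", []):
            visits = kw.get("visits", 0)
            rows.append({
                "site_id": site["val"],
                "keyword": kw["val"],
                "visits": visits,
                # Ratios use the summed totals per bucket, not per document.
                "bounce_rate": kw.get("bounces", 0) / visits if visits else None,
                "avg_pages_per_visit": kw.get("pageviews", 0) / visits if visits else None,
            })
    return rows


# Invented sample in the assumed response shape.
SAMPLE_FACETS = {
    "count": 3,
    "sites": {"buckets": [{
        "val": 55, "count": 3,
        "keywords": {"buckets": [
            {"val": "solr", "count": 2, "visits": 20, "bounces": 5, "pageviews": 50},
            {"val": "lucene", "count": 1, "visits": 0, "bounces": 0, "pageviews": 0},
        ]},
    }]},
}

if __name__ == "__main__":
    print(build_facet_param())
    for row in flatten_buckets(SAMPLE_FACETS):
        print(row)
```

The division is done per bucket on the summed values, which matches the
SUM(bounces) / SUM(visits) semantics of the SQL query rather than averaging
per-row ratios.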

-Yonik


Re: Division with Stats Component when Grouping in Solr

2015-06-13 Thread Bill Bell
It would be cool to be able to set 2 group by with facets 

 GROUP BY
site_id, keyword


Bill Bell
Sent from mobile


On Jun 13, 2015, at 2:28 PM, Yonik Seeley ysee...@gmail.com wrote:

 GROUP BY
site_id, keyword


Re: Division with Stats Component when Grouping in Solr

2015-06-12 Thread Joel Bernstein
https://issues.apache.org/jira/browse/SOLR-7560, will almost support this
in Solr 5.3. The compound function support won't be there yet though. But
it will be there in the near future.



Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, Jun 12, 2015 at 9:30 AM, kingofhypocrites 
kingofhypocri...@gmail.com wrote:

 I am migrating a database from SQL Server to Cassandra. Currently I have a
 setup as follows:

 - Log data in Cassandra
 - Summarize data in Spark and put into Cassandra summary tables
 - Query data in Solr

 Everything fits beautifully until I need to do stats on groups. I am hoping
 to get this to work with Solr so I can stick to one database, but I am not
 sure it's possible.

 If I had it in SQL Server, I could do it like so:
 SELECT
 site_id,
 keyword,
 SUM(visits) as visits,
 CONVERT(DECIMAL(13, 3), SUM(bounces)) / SUM(visits) as bounce_rate,
 SUM(pageviews) as pageviews,
 CONVERT(DECIMAL(13, 3), SUM(pageviews)) / SUM(visits) as
 avg_pages_per_visit
 FROM
 report_all_keywords_daily
 WHERE
 site_id = 55 AND date_key >= '20150606' AND date_key <= '20150608'
 GROUP BY
 site_id, keyword
 ORDER BY visits DESC

 Now I need to replicate this in Solr. The closest I could get to this is by
 using the Stats component and then using field collapsing.

 group=true&group.field=keyword&stats=true&stats.field=visits&stats.facet=keyword

 And here are some results I get back:
 http://pastebin.com/raw.php?i=Fxhe2RA0

 However, I need to be able to divide certain metrics. I tried including
 functions in the stats.field such as div(sum(bounce_rate), (sum(visits))
 but
 it doesn't recognize the functions. Also it seems to be ignoring the paging
 for
 the stats results and returns all groups regardless.

 Ultimately I'd like something like this which is what I would get in SQL:
 http://lucene.472066.n3.nabble.com/file/n4211402/pic.png

 Is this possible or do I have to give up on the prospect of using Solr? I
 have to query this data dynamically so I can't pre-summarize all of it.

 To clarify, I am having the following two problems:
 - Paging is ignored for stats data
 - I can't figure out how to divide two stats together to get a third stat.
 Note: In some cases I would need to be able to sort on this combined stat



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Division-with-Stats-Component-when-Grouping-in-Solr-tp4211402.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Division with Stats Component when Grouping in Solr

2015-06-12 Thread Joel Bernstein
If you are a java programmer you may want to look at plugging in your own
custom Streams into the Streaming API. The SQL stuff is built on top of the
Streaming API.

http://joelsolr.blogspot.com/2015/04/the-streaming-api-solrjio-basics.html

Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, Jun 12, 2015 at 11:00 AM, Joel Bernstein joels...@gmail.com wrote:

 https://issues.apache.org/jira/browse/SOLR-7560, will almost support this
 in Solr 5.3. The compound function support won't be there yet though. But
 it will be there in the near future.



 Joel Bernstein
 http://joelsolr.blogspot.com/

 On Fri, Jun 12, 2015 at 9:30 AM, kingofhypocrites 
 kingofhypocri...@gmail.com wrote:

 I am migrating a database from SQL Server to Cassandra. Currently I have a
 setup as follows:

 - Log data in Cassandra
 - Summarize data in Spark and put into Cassandra summary tables
 - Query data in Solr

 Everything fits beautifully until I need to do stats on groups. I am
 hoping
 to get this to work with Solr so I can stick to one database, but I am not
 sure it's possible.

 If I had it in SQL Server, I could do it like so:
 SELECT
 site_id,
 keyword,
 SUM(visits) as visits,
 CONVERT(DECIMAL(13, 3), SUM(bounces)) / SUM(visits) as bounce_rate,
 SUM(pageviews) as pageviews,
 CONVERT(DECIMAL(13, 3), SUM(pageviews)) / SUM(visits) as
 avg_pages_per_visit
 FROM
 report_all_keywords_daily
 WHERE
 site_id = 55 AND date_key >= '20150606' AND date_key <= '20150608'
 GROUP BY
 site_id, keyword
 ORDER BY visits DESC

 Now I need to replicate this in Solr. The closest I could get to this is
 by
 using the Stats component and then using field collapsing.

 group=true&group.field=keyword&stats=true&stats.field=visits&stats.facet=keyword

 And here are some results I get back:
 http://pastebin.com/raw.php?i=Fxhe2RA0

 However, I need to be able to divide certain metrics. I tried including
 functions in the stats.field such as div(sum(bounce_rate), (sum(visits))
 but
 it doesn't recognize the functions. Also it seems to be ignoring the paging
 for
 the stats results and returns all groups regardless.

 Ultimately I'd like something like this which is what I would get in SQL:
 http://lucene.472066.n3.nabble.com/file/n4211402/pic.png

 Is this possible or do I have to give up on the prospect of using Solr? I
 have to query this data dynamically so I can't pre-summarize all of it.

 To clarify, I am having the following two problems:
 - Paging is ignored for stats data
 - I can't figure out how to divide two stats together to get a third stat.
 Note: In some cases I would need to be able to sort on this combined stat



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Division-with-Stats-Component-when-Grouping-in-Solr-tp4211402.html
 Sent from the Solr - User mailing list archive at Nabble.com.





Re: Division with Stats Component when Grouping in Solr

2015-06-12 Thread Chris Hostetter

: However, I need to be able to divide certain metrics. I tried including
: functions in the stats.field such as div(sum(bounce_rate), (sum(visits)) but
: it doesn't recognize the functions. Also it seems to be ignoring the paging for
: the stats results and returns all groups regardless.

i'm lost on what your goal is regarding grouping and what you mean by
"ignoring the paging" but FWIW stats.field does support functions (or
query scores) -- you just need to use local params to make it clear that
you are passing in a function name and not a field name...

https://cwiki.apache.org/confluence/display/solr/The+Stats+Component

Example...

http://localhost:8983/solr/techproducts/select?q=*:*&stats=true&stats.field={!func}termfreq('text','memory')&stats.field=price&stats.field=popularity&rows=0&indent=true
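
Requests with a repeated parameter such as stats.field are easier to get
right when the query string is assembled programmatically. A small Python
sketch using only the standard library; the parameter values are the ones
from the example URL, while STATS_PARAMS and stats_query_string are
hypothetical names introduced here.

```python
from urllib.parse import urlencode, parse_qsl

# A list of pairs (not a dict) so stats.field can appear more than once.
STATS_PARAMS = [
    ("q", "*:*"),
    ("stats", "true"),
    ("stats.field", "{!func}termfreq('text','memory')"),
    ("stats.field", "price"),
    ("stats.field", "popularity"),
    ("rows", "0"),
    ("indent", "true"),
]


def stats_query_string(params=STATS_PARAMS):
    """Percent-encode the pairs and join them with '&'."""
    return urlencode(params)


if __name__ == "__main__":
    # Append this to .../solr/techproducts/select? to form the full URL.
    print(stats_query_string())
```

Encoding the local-params prefix ({!func}) this way avoids problems with
braces and quotes in hand-written URLs.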

: Ultimately I'd like something like this which is what I would get in SQL: 
: http://lucene.472066.n3.nabble.com/file/n4211402/pic.png 

at first glance, making some assumptions about your data, this looks like 
pivot faceting with some stats hanging 
off of it -- ie: 

facet.pivot={!stats=nest}site_id,keyword
stats.field={!tag=nest sum=true}visits
stats.field={!tag=nest sum=true}bounces
stats.field={!tag=nest sum=true}pageviews

https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-CombiningStatsComponentWithPivots

...that will give you the sum of each of the specified fields for each
top keyword (by doc count) for each top site_id (by doc count).
(Computing the bounce_rate and avg_pages_per_visit is simple client side
division)
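
A minimal sketch of that client-side division, in Python. The response
shape used here (pivot entries carrying stats -> stats_fields -> field ->
sum) is an assumption about how the stats hang off each pivot entry, and
the sample values are invented; pivot_rows and ratio are hypothetical
helper names.

```python
# Hypothetical excerpt of facet_counts -> facet_pivot -> "site_id,keyword".
# Shape and numbers are assumptions for illustration only.
SAMPLE_PIVOT = [
    {
        "field": "site_id", "value": 55, "count": 3,
        "pivot": [
            {
                "field": "keyword", "value": "solr", "count": 2,
                "stats": {"stats_fields": {
                    "visits": {"sum": 20.0},
                    "bounces": {"sum": 5.0},
                    "pageviews": {"sum": 50.0},
                }},
            },
        ],
    },
]


def ratio(numerator, denominator):
    """Divide two summed stats; return None when the denominator is zero."""
    return numerator / denominator if denominator else None


def pivot_rows(pivot_entries):
    """Flatten site_id/keyword pivots into rows with derived ratios."""
    rows = []
    for site in pivot_entries:
        for kw in site.get("pivot", []):
            fields = kw.get("stats", {}).get("stats_fields", {})
            sums = {name: stat.get("sum", 0.0) for name, stat in fields.items()}
            visits = sums.get("visits", 0.0)
            rows.append({
                "site_id": site["value"],
                "keyword": kw["value"],
                "visits": visits,
                "bounce_rate": ratio(sums.get("bounces", 0.0), visits),
                "avg_pages_per_visit": ratio(sums.get("pageviews", 0.0), visits),
            })
    return rows


if __name__ == "__main__":
    for row in pivot_rows(SAMPLE_PIVOT):
        print(row)
```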

: - Paging is ignored for stats data

How/Why exactly do you want/expect paging to affect stats computation?
stats are over entire result sets -- if you wanted stats just over a
single page that's trivial to do in the client.

: - I can't figure out how to divide two stats together to get a third stat.
: Note: In some cases I would need to be able to sort on this combined stat

Yeah, unfortunately sorting pivot facet results currently only works by
either the doc count or the term, not an arbitrary stat on the docs in the
pivot subset (that's a really hard problem to solve for arbitrary
functions in a distributed setup) ... the new JSON faceting stuff might do
what you want, but i don't really know enough about it to say...
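
If the number of groups is small enough to fetch in full, one workaround is
to sort and page on the derived stat in the client. A Python sketch under
that assumption; the rows are invented, and sort_and_page is a hypothetical
helper, not anything Solr provides.

```python
def sort_and_page(rows, key, offset=0, limit=10, descending=True):
    """Sort rows by a derived stat, then return one page plus the total count.

    Rows whose stat is None (for example, zero visits) are left out.
    """
    usable = [row for row in rows if row.get(key) is not None]
    ordered = sorted(usable, key=lambda row: row[key], reverse=descending)
    return ordered[offset:offset + limit], len(ordered)


# Invented per-keyword rows, as a client might have them after the division.
ROWS = [
    {"keyword": "a", "bounce_rate": 0.5},
    {"keyword": "b", "bounce_rate": 0.25},
    {"keyword": "c", "bounce_rate": 0.75},
    {"keyword": "d", "bounce_rate": None},
]

if __name__ == "__main__":
    page, total = sort_and_page(ROWS, "bounce_rate", offset=0, limit=2)
    print("Showing %d of %d" % (len(page), total))
    for row in page:
        print(row)
```

The total returned alongside the page is what a "Showing records 1-10 of N"
display would need; this only holds if every group was actually returned to
the client.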

https://cwiki.apache.org/confluence/display/solr/JSON+Request+API


-Hoss
http://www.lucidworks.com/