Pagination with grouping in solr
The structure of my Solr documents is shown below. I need to get the documents having event_name="product view" and group them by email so that no email is duplicated. When listing the emails, how can I paginate over the unique emails? The query returns the total number of documents, not the number of groups.

"docs":[
  {
    "id":"1",
    "email":"xxx...@gmail.com",
    "gender":"M",
    "location":["yyy"],
    "created":123444,
    "event_name":"product viewed",
    "event_property":"product",
    "event_value":"sun glassed",
    "version":1617201602734587904,
    "location_str":[""]
  },
  {
    "id":"4",
    "email":"xxx...@gmail.com",
    "gender":"F",
    "location":[""],
    "created":123447,
    "event_name":"Add To Cart",
    "event_property":"Name",
    "event_value":"sun glasses",
    "version":1617202784870858752,
    "location_str":[""]
  },
  {
    "id":"5",
    "email":"xxx...@gmail.com",
    "gender":"M",
    "location":["k"],
    "created":123464,
    "event_name":"Product Clicked",
    "event_property":"Category",
    "event_value":"Contact Lens",
    "version":1617202784871907328,
    "location_str":["l"]
  }
]

-- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
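One possible approach to the question above, sketched in Python: with result grouping enabled, start and rows count groups rather than documents, and group.ngroups=true asks Solr to report the number of groups, which can drive the pager. The helper names and the query value "product viewed" (taken from the sample document) are assumptions for illustration. In SolrCloud, the ngroups figure is only reliable when all documents of a group live on the same shard.

```python
import math


def grouped_page_params(page, page_size):
    """Build Solr request parameters for one page of email groups.

    `page` is 1-based. The function name and the query value are
    illustrative assumptions, not part of the original post.
    """
    return {
        "q": 'event_name:"product viewed"',
        "group": "true",
        "group.field": "email",
        "group.ngroups": "true",          # report the number of distinct groups
        "group.limit": 1,                 # one document per email is enough
        "start": (page - 1) * page_size,  # offset, counted in groups
        "rows": page_size,                # number of groups per page
        "wt": "json",
    }


def total_pages(response, page_size):
    """Derive the page count from the grouped response (default group.format)."""
    ngroups = response["grouped"]["email"]["ngroups"]
    return math.ceil(ngroups / page_size)
```

The dictionary can be passed as query parameters to the collection's /select handler with any HTTP client; "matches" in the same response block is still the document count, so the pager should read "ngroups" instead.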
Re: grouping in solr cloud shard replicas
On 1/6/2018 11:25 AM, SANJAY. wrote:
> Please let me know how to achieve group by in a Solr cloud environment. We tried grouping in Solr cloud shard replicas to fetch unique search results from Solr for a custom field. We are getting an exception saying unexpected docvalues type "SORTED_SET (expected SORTED)"

This typically happens when you index some data and then change the multiValued parameter on a field that has docValues, without deleting the index and starting over.

When making that kind of change to the schema, you must completely delete the index directory for all cores in the collection, then reload the collection or restart Solr, and reindex from scratch.

This rather extreme step is required because the Lucene index records certain kinds of information about the docValues for each field that has them, and if what is expected doesn't match what's actually in the index, a severe error is encountered.

Thanks,
Shawn
grouping in solr cloud shard replicas
Hi,

Please let me know how to achieve group by in a Solr cloud environment. We tried grouping in Solr cloud shard replicas to fetch unique search results from Solr for a custom field. We are getting an exception saying unexpected docvalues type "SORTED_SET (expected SORTED)"

We are using Solr cloud, and the collection has 2 shard replicas. We have created a custom field type which uses the solr.TextField class.

Please suggest the best possible way to fetch unique search results.
Re: using rank queries(rq) with grouping in solr cloud
Hi Tomerg,

1. Did you consider using the collapse component? https://lucene.apache.org/solr/guide/6_6/collapse-and-expand-results.html It is compatible with rq.

2. If you implement group reranking as a separate component you will end up with a lot of code duplicated from QueryComponent. You could use SOLR-8776; I'm going to update it to master soon. Another possible solution is to have a component that asks the shards for the top $rerank groups and then just does the reranking on top of them in the federator, but it could be expensive.

Cheers,
Diego

On Fri, Dec 15, 2017 at 9:46 PM, tomerg wrote:
> 1. there is some way to bypass this problem(or use some other existing
> features of solr to achieve similar results?
> 2. if there is no other way, i would like to implement a component to
> achieve this functionality(i don't want to patch the code of solr itself).
> do you have a suggestion what might be the best way to implement a rerank of
> groups in cloud mode?
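A minimal sketch of the first suggestion, assuming a single-valued group field and an already uploaded LTR model: the collapse query parser is applied as a filter query, and the rerank query is passed in rq. The function name, the field name and the model name are hypothetical. In SolrCloud, collapsing also expects all documents of a group to be on the same shard.

```python
def collapse_rerank_params(query, group_field, model, rerank_docs=100):
    """Build request parameters that collapse on `group_field` and rerank with LTR.

    `group_field` and `model` are placeholders; substitute the names
    used in your own schema and model store.
    """
    return {
        "q": query,
        # Collapse keeps one representative document per group value.
        "fq": "{!collapse field=%s}" % group_field,
        # Rerank the top collapsed documents with the named LTR model.
        "rq": "{!ltr model=%s reRankDocs=%d}" % (model, rerank_docs),
        # Optionally return the other members of each group as well.
        "expand": "true",
        "fl": "id,score",
    }
```

Because only the group heads reach the reranker, this matches the stated requirement that reranking one document per group is acceptable.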
using rank queries(rq) with grouping in solr cloud
Hey,

I'm using Solr 6.5.1 in SolrCloud mode. I use grouping for my results. I want to use a rank query (rq) in order to rerank the top groups (with LTR). It's OK for me to rerank the groups by reranking only one of the documents in each group. I saw in issue SOLR-8776 that rank queries don't support grouping (link here: https://issues.apache.org/jira/browse/SOLR-8776).

So I have a few questions:

1. Is there some way to bypass this problem, or to use some other existing features of Solr to achieve similar results?
2. If there is no other way, I would like to implement a component to achieve this functionality (I don't want to patch the code of Solr itself). Do you have a suggestion for the best way to implement a rerank of groups in cloud mode? Can I implement something that reranks the groups on every shard before merging, or is there a way to create a component that reranks only the merged result list from the shards?

Thanks,
tomerg
Re: Is it possible to do pivot grouping in SOLR?
Well, a quickly formulated query against some strange kind of endpoint...

Collapse and expand, with expand.sort. Look it up; it's in the ref guide.

On 11/17/2016 1:42 PM, bbarani wrote:
> Is there a way to do pivot grouping (group within a group) in SOLR?
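A rough sketch of how the collapse and expand suggestion could yield a group within a group: collapse on the outer field, request the remaining members with expand=true, and nest them by the inner field on the client. The field names category and color, the id field, and the helper name are assumptions. Note that expand.rows defaults to a small number, so it usually needs raising to see whole groups.

```python
from collections import defaultdict

# Hypothetical request parameters for the outer grouping.
PIVOT_PARAMS = {
    "q": "*:*",
    "fq": "{!collapse field=category}",
    "expand": "true",
    "expand.rows": 100,
    "expand.sort": "color asc",
}


def nest_by_second_field(response, first_field, second_field):
    """Return {first value: {second value: [doc ids]}} from a collapse/expand response.

    Group heads come from the main result list and the other members from
    the "expanded" section; both are merged and de-duplicated by id.
    """
    docs_by_group = defaultdict(dict)
    for doc in response["response"]["docs"]:
        docs_by_group[doc[first_field]][doc["id"]] = doc
    for value, group in response.get("expanded", {}).items():
        for doc in group["docs"]:
            docs_by_group[value][doc["id"]] = doc

    nested = {}
    for value, docs in docs_by_group.items():
        inner = defaultdict(list)
        for doc in docs.values():
            inner[doc.get(second_field)].append(doc["id"])
        nested[value] = dict(inner)
    return nested
```

The nesting is done client-side here; it only covers the groups on the current page and the rows returned by expand.rows.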
Is it possible to do pivot grouping in SOLR?
Is there a way to do pivot grouping (group within a group) in SOLR?

We initially group the results by category, and in turn we are trying to group the data under one category based on another field. Is there a way to do that?

Categories (group by)
|--Shop
|  |---Color (group by)
|--Support

-- View this message in context: http://lucene.472066.n3.nabble.com/Is-it-possible-to-do-pivot-grouping-in-SOLR-tp4306352.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Division with Stats Component when Grouping in Solr
I think I have this just about working with the analytics component. It seems to fill in all the gaps that the stats component and the JSON facet don't support. It solved the following problems for me:

- I am able to perform math on stats to form other stats. Then I can sort on those as needed.
- When I perform math on stats, it uses the summed totals per group rather than doing it per row.
- I am able to use offsets and a number of rows to handle paging.

I am confused why this module isn't built into Solr. This functionality is so vital for any ad hoc querying on time series data. Pretty much any scenario like the SQL query I provided would need all of these things.

The only thing I couldn't figure out is how to get the total number of buckets, or in other words the distinct count of keywords. If anyone is able to help with this, I could really use it in order to provide a total record count to the user (e.g. Showing records 1-10 of 2939).

Here is what I have in case this helps someone:

olap=true&o.r1.ff=keyword_s&o.r1.s.visits=sum(visits_i)&o.r1.s.bounces=sum(bounces_i)&o.r1.s.bounce_rate=div(sum(bounces_i),sum(visits_i))&o.r1.ff.keyword_s.sortstatistic=bounce_rate&o.r1.ff.keyword_s.sortdirection=desc&o.r1.ff.keyword_s.offset=0&o.r1.ff.keyword_s.limit=10

Also, if anyone has access to the original documentation from Bloomberg mentioned in the stats component PDF, I'd love to have it :) https://issues.apache.org/jira/secure/attachment/12606793/Search%20Analytics%20Component.pdf All the links for detailed documentation are now broken.

-- View this message in context: http://lucene.472066.n3.nabble.com/Division-with-Stats-Component-when-Grouping-in-Solr-tp4211402p4211751.html
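The parameter list from the message above can be assembled programmatically, which makes the paging values explicit. This is only a sketch: the parameter names are copied from the post, while the function name and its arguments are hypothetical, and nothing here verifies that a given Solr build accepts them.

```python
from urllib.parse import urlencode, parse_qsl


def analytics_query(field, sort_stat, offset=0, limit=10):
    """Build the analytics-component query string quoted in the post.

    `field` is the facet field (keyword_s in the post); `offset` and
    `limit` page through the facet buckets.
    """
    params = [
        ("olap", "true"),
        ("o.r1.ff", field),
        ("o.r1.s.visits", "sum(visits_i)"),
        ("o.r1.s.bounces", "sum(bounces_i)"),
        # Ratio of the summed totals, not an average of per-row ratios.
        ("o.r1.s.bounce_rate", "div(sum(bounces_i),sum(visits_i))"),
        (f"o.r1.ff.{field}.sortstatistic", sort_stat),
        (f"o.r1.ff.{field}.sortdirection", "desc"),
        (f"o.r1.ff.{field}.offset", str(offset)),
        (f"o.r1.ff.{field}.limit", str(limit)),
    ]
    # urlencode joins the pairs with "&" and percent-encodes the values.
    return urlencode(params)
```

For the second page of ten keywords, analytics_query("keyword_s", "bounce_rate", offset=10) changes only the offset parameter.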
Re: Division with Stats Component when Grouping in Solr
Why it isn't in core Solr... Because it doesn't (and probably can't) support distributed mode. The Streaming aggregation stuff, and the (in trunk Real Soon Now) Parallel SQL support are where the effort is going to support this kind of stuff.

https://issues.apache.org/jira/browse/SOLR-7560
https://issues.apache.org/jira/browse/SOLR-7082

Best,
Erick

On Sun, Jun 14, 2015 at 2:25 PM, kingofhypocrites <kingofhypocri...@gmail.com> wrote:
> I am confused why this module isn't built into Solr. This functionality is so vital for any ad hoc querying on time series data.
Re: Division with Stats Component when Grouping in Solr
I was able to get the new version of Solr installed. This query gets me really close, but it is averaging the rows BEFORE the grouping, so it's not totally accurate. I need it to sum the visits and bounces by keyword and then perform the division.

The avg here probably seems confusing and pointless, but it wouldn't let me just put the div directly in the facet without wrapping it with a function. So instead of summing all the rows into one group and performing the divide, it is dividing each row one by one and then averaging the results together, which creates skewed results since one day may have more data than another.

It seems dividing would be possible if only I could tell it to divide the result grouped by keyword, and not the individual rows, without having to average them together.

Here is what I have (granted, it's a simplified version for testing):

json.facet={
  keywords:{
    type:terms,
    limit:10,
    field:keyword,
    facet:{
      bounces_sum:sum(bounces),
      visits_sum:sum(visits),
      bounce_rate:avg(div(sum(bounces),sum(visits)))
    }
  }
}

What I really want is: bounce_rate: div(bounces_sum, visits_sum) ... but this doesn't work.

-- View this message in context: http://lucene.472066.n3.nabble.com/Division-with-Stats-Component-when-Grouping-in-Solr-tp4211402p4211639.html
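To make the skew described above concrete, here is a small worked example, followed by one possible workaround: request only bounces_sum and visits_sum in the facet and do the division on the client. The sample numbers and the helper name are invented for illustration; the bucket layout assumed is the usual JSON facet response ("facets" → facet name → "buckets").

```python
# Two days for one keyword: (bounces, visits). Invented sample data.
rows = [(5, 10), (10, 100)]

# Dividing row by row and then averaging weights both days equally.
avg_of_ratios = sum(b / v for b, v in rows) / len(rows)   # (0.5 + 0.1) / 2

# Summing first and dividing once weights each day by its traffic.
ratio_of_sums = sum(b for b, _ in rows) / sum(v for _, v in rows)  # 15 / 110


def add_bounce_rate(facet_response, facet_name="keywords"):
    """Attach bounce_rate = bounces_sum / visits_sum to every facet bucket."""
    result = []
    for bucket in facet_response["facets"][facet_name]["buckets"]:
        visits = bucket.get("visits_sum", 0)
        # Guard against keywords with no visits instead of dividing by zero.
        rate = bucket["bounces_sum"] / visits if visits else None
        result.append({**bucket, "bounce_rate": rate})
    return result
```

This only covers the buckets that were returned, so it does not help when the result has to be sorted by bounce rate across all keywords.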
Re: Division with Stats Component when Grouping in Solr
Not sure why, but half of my posts are showing up as not accepted by the mailing list. I've made a few replies to others that haven't gone through. I am not sure if it's because I'm replying via email or what the issue is.

-- View this message in context: http://lucene.472066.n3.nabble.com/Division-with-Stats-Component-when-Grouping-in-Solr-tp4211402p4211631.html
Re: Division with Stats Component when Grouping in Solr
kingofhypocrites:

Usually that's because your e-mail is formatted as HTML or some other non-plain-text format. Try sending them as plain text.

On Sat, Jun 13, 2015 at 5:26 PM, kingofhypocrites <kingofhypocri...@gmail.com> wrote:
> Not sure why but half of my posts are showing up as not accepted by the mailing list.
Re: Division with Stats Component when Grouping in Solr
@Billnbell What did you conclude with the Analytics component? It sounds like you are saying it does the same thing as the stats component, but it has several other features that aren't supported by the stats library. I'd love to have a talk with you offline if possible.

-- View this message in context: http://lucene.472066.n3.nabble.com/Division-with-Stats-Component-when-Grouping-in-Solr-tp4211402p4211635.html
Re: Division with Stats Component when Grouping in Solr
@Yonik, Thanks for this! I was actually just looking at your blog earlier today and thinking that the JSON facet feature may be just what I need. I'm currently using Solr 4.3, as that is what comes with DataStax, so I'm trying to create a new build with the latest Solr version so I can test this feature.

For the sort, I am assuming this would be sorting on sum(visits) for the given keyword, correct?

Also, can you confirm whether it's possible to do a division in the facet? Something like:

facet: { bouncerate: div(sum(bounces) / sum(visits)) }

Because of the large number of results, I would need to precalculate this (division operation) if they happen to sort on it. I don't see anything like this mentioned in the API docs, so maybe it's not possible.

-- View this message in context: http://lucene.472066.n3.nabble.com/Division-with-Stats-Component-when-Grouping-in-Solr-tp4211402p4211634.html
Re: Division with Stats Component when Grouping in Solr
This looks very promising if only I could get it to work:

https://issues.apache.org/jira/browse/SOLR-5302
https://issues.apache.org/jira/secure/attachment/12606793/Search%20Analytics%20Component.pdf

Various links it points to are broken now and I can't find anything about it online, but the PDF indicates I can set olap=true to turn it on, although this doesn't seem to do anything. The docs say it supports limiting the results and doing math operations on statistics, which is exactly what I need. I'm not clear if I need to install this or if this component is even used anymore.

On Fri, Jun 12, 2015 at 12:00 PM Joel Bernstein [via Lucene] <ml-node+s472066n4211422...@n3.nabble.com> wrote:
> https://issues.apache.org/jira/browse/SOLR-7560 will almost support this in Solr 5.3. The compound function support won't be there yet though. But it will be there in the near future.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Fri, Jun 12, 2015 at 9:30 AM, kingofhypocrites wrote:
> > I am migrating a database from SQL Server to Cassandra. Currently I have a setup as follows:
> >
> > - Log data in Cassandra
> > - Summarize data in Spark and put into Cassandra summary tables
> > - Query data in Solr
> >
> > Everything fits beautifully until I need to do stats on groups. I am hoping to get this to work with Solr so I can stick to one database, but I am not sure it's possible.
> >
> > If I had it in SQL Server, I could do it like so:
> >
> > SELECT site_id, keyword,
> >   SUM(visits) as visits,
> >   CONVERT(DECIMAL(13, 3), SUM(bounces)) / SUM(visits) as bounce_rate,
> >   SUM(pageviews) as pageviews,
> >   CONVERT(DECIMAL(13, 3), SUM(pageviews)) / SUM(visits) as avg_pages_per_visit
> > FROM report_all_keywords_daily
> > WHERE site_id = 55 AND date_key >= '20150606' AND date_key <= '20150608'
> > GROUP BY site_id, keyword
> > ORDER BY visits DESC
> >
> > Now I need to replicate this in Solr. The closest I could get to this is by using the Stats component and then using field collapsing.
> >
> > group=true&group.field=keyword&stats=true&stats.field=visits&stats.facet=keyword
> >
> > And here are some results I get back: http://pastebin.com/raw.php?i=Fxhe2RA0
> >
> > However, I need to be able to divide certain metrics. I tried including functions in the stats.field such as div(sum(bounce_rate), (sum(visits)) but it doesn't recognize the functions. Also it seems to be ignoring the paging for the stats results and returns all groups regardless.
> >
> > Ultimately I'd like something like this, which is what I would get in SQL: http://lucene.472066.n3.nabble.com/file/n4211402/pic.png
> >
> > Is this possible or do I have to give up on the prospect of using Solr? I have to query this data dynamically so I can't pre-summarize all of it.
> >
> > To clarify, I am having the following two problems:
> >
> > - Paging is ignored for stats data
> > - I can't figure out how to divide two stats together to get a third stat. Note: In some cases I would need to be able to sort on this combined stat
> >
> > -- View this message in context: http://lucene.472066.n3.nabble.com/Division-with-Stats-Component-when-Grouping-in-Solr-tp4211402.html

-- View this message in context: http://lucene.472066.n3.nabble.com/Division-with-Stats-Component-when-Grouping-in-Solr-tp4211402p4211525.html
Re: Division with Stats Component when Grouping in Solr
OK more info

<requestHandler name="standard" class="solr.StandardRequestHandler">
  <arr name="components">
    <str>query</str>
    <str>facet</str>
    <str>analytics</str>
    <str>highlight</str>
    <str>debug</str>
    <str>expand</str>
  </arr>
</requestHandler>
<searchComponent name="analytics" class="org.apache.solr.handler.component.AnalyticsComponent" />

I am going to try that after adding it to solrconfig.xml.

On Sat, Jun 13, 2015 at 1:11 PM, William Bell <billnb...@gmail.com> wrote:
> Same here. What do we need to add to solrconfig.xml to get it to work?
Re: Division with Stats Component when Grouping in Solr
Note: you need to enable docValues to get range stuff to work. Set docValues=true on the field.

On Sat, Jun 13, 2015 at 1:37 PM, William Bell <billnb...@gmail.com> wrote:
> OK. That works with one more change.
>
> <lib dir="../../../dist/" regex="solr-analytics-.*\.jar" />
> <lib dir="../../../dist/" regex="solr-analysis-.*\.jar" />
>
> http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true&stats=true&olap=true&olap.overall_score.statistic.sum=sum(overall_score)
Re: Division with Stats Component when Grouping in Solr
Same here. What do we need to add to solrconfig.xml to get it to work?

1. SOLR-5302 https://issues.apache.org/jira/browse/SOLR-5302

Help?

-- Bill Bell billnb...@gmail.com cell 720-256-8076

On Sat, Jun 13, 2015 at 8:34 AM, kingofhypocrites kingofhypocri...@gmail.com wrote:

This looks very promising, if only I could get it to work:
https://issues.apache.org/jira/browse/SOLR-5302
https://issues.apache.org/jira/secure/attachment/12606793/Search%20Analytics%20Component.pdf

Various links it points to are broken now and I can't find anything about it online, but the PDF indicates I can set olap=true to turn it on, although this doesn't seem to do anything. The docs say it supports limiting the results and doing math operations on statistics, which is exactly what I need. I'm not clear whether I need to install this or whether this component is even used anymore.

On Fri, Jun 12, 2015 at 12:00 PM Joel Bernstein [via Lucene] ml-node+s472066n4211422...@n3.nabble.com wrote:

https://issues.apache.org/jira/browse/SOLR-7560 will almost support this in Solr 5.3. The compound function support won't be there yet though. But it will be there in the near future.

Joel Bernstein http://joelsolr.blogspot.com/
Re: Division with Stats Component when Grouping in Solr
OK. That works with one more change:

  <lib dir="../../../dist/" regex="solr-analytics-.*\.jar" />
  <lib dir="../../../dist/" regex="solr-analysis-.*\.jar" />

http://localhost:8983/solr/select?q=*%3A*&wt=json&indent=true&stats=true&olap=true&olap.overall_score.statistic.sum=sum(overall_score)

On Sat, Jun 13, 2015 at 1:16 PM, William Bell billnb...@gmail.com wrote:

OK, more info:

  <requestHandler name="standard" class="solr.StandardRequestHandler">
    <arr name="components">
      <str>query</str>
      <str>facet</str>
      <str>analytics</str>
      <str>highlight</str>
      <str>debug</str>
      <str>expand</str>
    </arr>
  </requestHandler>

  <searchComponent name="analytics" class="org.apache.solr.handler.component.AnalyticsComponent" />

I am going to try that after adding it to solrconfig.xml.
Re: Division with Stats Component when Grouping in Solr
On Fri, Jun 12, 2015 at 10:30 AM, kingofhypocrites kingofhypocri...@gmail.com wrote:

I am migrating a database from SQL Server to Cassandra. Currently I have a setup as follows:
- Log data in Cassandra
- Summarize data in Spark and put into Cassandra summary tables
- Query data in Solr

Everything fits beautifully until I need to do stats on groups. I am hoping to get this to work with Solr so I can stick to one database, but I am not sure it's possible. If I had it in SQL Server, I could do it like so:

  SELECT site_id, keyword,
         SUM(visits) as visits,
         CONVERT(DECIMAL(13, 3), SUM(bounces)) / SUM(visits) as bounce_rate,
         SUM(pageviews) as pageviews,
         CONVERT(DECIMAL(13, 3), SUM(pageviews)) / SUM(visits) as avg_pages_per_visit
  FROM report_all_keywords_daily
  WHERE site_id = 55 AND date_key >= '20150606' AND date_key <= '20150608'
  GROUP BY site_id, keyword
  ORDER BY visits DESC

This is the closest we can get with the JSON Facet API today:

  json.facet={
    sites: {
      type : terms,
      field : site_id,
      sort : "visits desc",
      facet : {
        visits : "sum(visits)",
        bounces : "sum(bounces)",
        pageviews : "sum(pageviews)"
      }
    }
  }

That doesn't take keyword into account when sorting the buckets. You could nest a keyword facet inside a site facet and thus calculate the stats for the top N keywords per site:

  json.facet={
    sites: {
      type : terms,
      field : site_id,
      facet : {
        keywords: {
          type : terms,
          field : keyword,
          sort : "visits desc",
          facet : {
            visits : "sum(visits)",
            bounces : "sum(bounces)",
            pageviews : "sum(pageviews)"
          }
        }
      }
    }
  }

More info here: http://yonik.com/json-facet-api/

-Yonik
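
For anyone assembling the nested facet request above from client code, here is a minimal Python sketch. It only builds the request parameters and sends nothing to Solr. The field names come from the thread; the function name `build_facet_params` and the `limit` default are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode


def build_facet_params(site_field="site_id", keyword_field="keyword", limit=10):
    """Build request parameters for a site -> keyword nested terms facet.

    The helper name and the default limit are illustrative assumptions.
    """
    # Aggregations computed for every keyword bucket.
    metrics = {
        "visits": "sum(visits)",
        "bounces": "sum(bounces)",
        "pageviews": "sum(pageviews)",
    }
    facet = {
        "sites": {
            "type": "terms",
            "field": site_field,
            "facet": {
                "keywords": {
                    "type": "terms",
                    "field": keyword_field,
                    "limit": limit,
                    # Sort keyword buckets by the "visits" aggregation below.
                    "sort": "visits desc",
                    "facet": metrics,
                }
            },
        }
    }
    # rows=0 because only the facet buckets are of interest here.
    return {"q": "*:*", "rows": 0, "json.facet": json.dumps(facet)}


params = build_facet_params()
query_string = urlencode(params)  # append to .../select? on your own core
```

Serializing with `json.dumps` quotes every string, which avoids the ambiguity of an unquoted value such as `visits desc`.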
Re: Division with Stats Component when Grouping in Solr
It would be cool to be able to set 2 group-bys with facets:

  GROUP BY site_id, keyword

Bill Bell
Sent from mobile

On Jun 13, 2015, at 2:28 PM, Yonik Seeley ysee...@gmail.com wrote:

GROUP BY site_id, keyword
Re: Division with Stats Component when Grouping in Solr
https://issues.apache.org/jira/browse/SOLR-7560, will almost support this in Solr 5.3. The compound function support won't be there yet though. But it will be there in the near future.

Joel Bernstein http://joelsolr.blogspot.com/
Re: Division with Stats Component when Grouping in Solr
If you are a java programmer you may want to look at plugging in your own custom Streams into the Streaming API. The SQL stuff is built on top of the Streaming API.

http://joelsolr.blogspot.com/2015/04/the-streaming-api-solrjio-basics.html

Joel Bernstein http://joelsolr.blogspot.com/
Division with Stats Component when Grouping in Solr
I am migrating a database from SQL Server to Cassandra. Currently I have a setup as follows:
- Log data in Cassandra
- Summarize data in Spark and put into Cassandra summary tables
- Query data in Solr

Everything fits beautifully until I need to do stats on groups. I am hoping to get this to work with Solr so I can stick to one database, but I am not sure it's possible. If I had it in SQL Server, I could do it like so:

  SELECT site_id, keyword,
         SUM(visits) as visits,
         CONVERT(DECIMAL(13, 3), SUM(bounces)) / SUM(visits) as bounce_rate,
         SUM(pageviews) as pageviews,
         CONVERT(DECIMAL(13, 3), SUM(pageviews)) / SUM(visits) as avg_pages_per_visit
  FROM report_all_keywords_daily
  WHERE site_id = 55 AND date_key >= '20150606' AND date_key <= '20150608'
  GROUP BY site_id, keyword
  ORDER BY visits DESC

Now I need to replicate this in Solr. The closest I could get is by using the Stats component together with field collapsing:

  group=true&group.field=keyword&stats=true&stats.field=visits&stats.facet=keyword

And here are some results I get back: http://pastebin.com/raw.php?i=Fxhe2RA0

However, I need to be able to divide certain metrics. I tried including functions in the stats.field such as div(sum(bounce_rate), (sum(visits)) but it doesn't recognize the functions. It also seems to be ignoring the paging for the stats results and returns all groups regardless. Ultimately I'd like something like this, which is what I would get in SQL: http://lucene.472066.n3.nabble.com/file/n4211402/pic.png

Is this possible, or do I have to give up on the prospect of using Solr? I have to query this data dynamically, so I can't pre-summarize all of it.

To clarify, I have the following two problems:
- Paging is ignored for stats data
- I can't figure out how to divide two stats together to get a third stat. Note: In some cases I would need to be able to sort on this combined stat

-- View this message in context: http://lucene.472066.n3.nabble.com/Division-with-Stats-Component-when-Grouping-in-Solr-tp4211402.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Division with Stats Component when Grouping in Solr
: However, I need to do able to divide certain metrics. I tried including
: functions in the stats.field such as div(sum(bounce_rate), (sum(visits)) but
: it doesn't recognize the functions. Also it seems to ignoring the paging for
: the stats results and returns all groups regardless.

i'm lost on what your goal is regarding grouping and what you mean by "ignoring the paging", but FWIW stats.field does support functions (or query scores) -- you just need to use local params to make it clear that you are passing in a function name and not a field name...

https://cwiki.apache.org/confluence/display/solr/The+Stats+Component

Example...

http://localhost:8983/solr/techproducts/select?q=*:*&stats=true&stats.field={!func}termfreq('text','memory')&stats.field=price&stats.field=popularity&rows=0&indent=true

: Ultimately I'd like something like this which is what I would get in SQL:
: http://lucene.472066.n3.nabble.com/file/n4211402/pic.png

at first glance, making some assumptions about your data, this looks like pivot faceting with some stats hanging off of it -- ie:

  facet.pivot={!stats=nest}site_id,keyword
  stats.field={!tag=nest sum=true}visits
  stats.field={!tag=nest sum=true}bounces
  stats.field={!tag=nest sum=true}pageviews

https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-CombiningStatsComponentWithPivots

...that will give you the sum of each of the specified fields for each top keyword (by doc count) for each top site_id (by doc count). (Computing the bounce_rate and avg_pages_per_visit is simple client side division)

: - Paging is ignored for stats data

How/Why exactly do you want/expect paging to affect stats computation? stats are over entire result sets -- if you wanted stats just over a single page that's trivial to do in the client.

: - I can't figure out how to divide two stats together to get a third stat.
: Note: In some cases I would need to be able to sort on this combined stat

Yeah, unfortunately sorting pivot facet results currently only works by either the doc count or the term, not an arbitrary stat on the docs in the pivot subset (that's a really hard problem to solve for arbitrary functions in a distributed setup) ... the new JSON faceting stuff might do what you want, but i don't really know enough about it to say...

https://cwiki.apache.org/confluence/display/solr/JSON+Request+API

-Hoss http://www.lucidworks.com/
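
The client-side division, sorting, and paging mentioned above could look roughly like the Python sketch below. The shape of each pivot entry (a `pivot` list plus a `stats` object holding `stats_fields`) is my assumption about how the pivot response is laid out, the sample numbers are invented, and the helper names `keyword_rows` and `page` are hypothetical.

```python
def keyword_rows(pivot_entries):
    """Flatten site -> keyword pivot entries into rows with derived ratios."""
    rows = []
    for site in pivot_entries:
        for kw in site.get("pivot", []):
            fields = kw["stats"]["stats_fields"]
            sums = {name: stat["sum"] for name, stat in fields.items()}
            visits = sums["visits"]
            rows.append({
                "site_id": site["value"],
                "keyword": kw["value"],
                "visits": visits,
                "pageviews": sums["pageviews"],
                # Guard against division by zero for keywords without visits.
                "bounce_rate": sums["bounces"] / visits if visits else 0.0,
                "avg_pages_per_visit": sums["pageviews"] / visits if visits else 0.0,
            })
    return rows


def page(rows, sort_key, start=0, size=10):
    """Sort rows by any column, including a derived one, then take one page."""
    ordered = sorted(rows, key=lambda r: r[sort_key], reverse=True)
    return ordered[start:start + size]


def _stats(visits, bounces, pageviews):
    # Builds the assumed per-bucket stats structure for the sample data.
    return {"stats_fields": {"visits": {"sum": visits},
                             "bounces": {"sum": bounces},
                             "pageviews": {"sum": pageviews}}}


# Invented sample in the assumed response shape.
sample = [{
    "field": "site_id", "value": 55, "count": 3,
    "pivot": [
        {"field": "keyword", "value": "solr", "count": 2,
         "stats": _stats(40.0, 10.0, 100.0)},
        {"field": "keyword", "value": "lucene", "count": 1,
         "stats": _stats(10.0, 5.0, 20.0)},
    ],
}]

rows = keyword_rows(sample)
top = page(rows, "bounce_rate", size=1)
```

This only pages and sorts over the buckets Solr actually returned, so the pivot limits on the request still decide which keywords are candidates at all.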
Re: CollapsingQParserPlugin is slower than standard Solr field grouping in Solr 4.6.1
Hi Joe,

With 10,000 documents the CollapsingQParserPlugin will likely not have any performance advantages. The CollapsingQParserPlugin will be faster than standard grouping when you have a higher number of distinct groups and large result sets. For the scale you are working at you will be just fine using standard grouping.

Joel Bernstein
Search Engineer at Heliosearch
CollapsingQParserPlugin is slower than standard Solr field grouping in Solr 4.6.1
I notice that in Solr 4.6.1 CollapsingQParserPlugin is slower than standard Solr field grouping. I have a Solr index of 10,000 docs, with a signature field which is a Solr dedup field of the doc content. The majority of the signatures are unique.

  <field name="signature" type="string" stored="true" indexed="true" multiValued="false" />

With standard Solr field grouping,

http://localhost:4462/solr/collection1/select?q=*:*&group.ngroups=true&group=true&group.field=signature&group.main=true&rows=1&fl=id

I get an average QTime of 78 after Solr has warmed up. Using CollapsingQParserPlugin,

http://localhost:4462/solr/collection1/select?q=*:*&fq={!collapse%20field=signature}&rows=1&fl=id

I get an average QTime of 89.2. In fact, CollapsingQParserPlugin QTime is always slower than the standard Solr field grouping. How can I get CollapsingQParserPlugin to run faster?

Joe
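
To compare the two approaches without hand-editing query strings, the two requests can be generated with proper escaping. This is a sketch only: the host, port, and collection are the ones from this post, and the helper names `grouping_url` and `collapse_url` are made up for illustration.

```python
from urllib.parse import urlencode

# Base URL taken from the post; adjust for your own installation.
BASE = "http://localhost:4462/solr/collection1/select"


def grouping_url(field, rows=1):
    """Request using standard result grouping on `field`."""
    params = [
        ("q", "*:*"),
        ("group", "true"),
        ("group.field", field),
        ("group.ngroups", "true"),
        ("group.main", "true"),
        ("rows", rows),
        ("fl", "id"),
    ]
    return BASE + "?" + urlencode(params)


def collapse_url(field, rows=1):
    """Request using the collapse query parser as a filter query."""
    params = [
        ("q", "*:*"),
        ("fq", "{!collapse field=%s}" % field),
        ("rows", rows),
        ("fl", "id"),
    ]
    return BASE + "?" + urlencode(params)


group_request = grouping_url("signature")
collapse_request = collapse_url("signature")
```

`urlencode` escapes the braces, the exclamation mark, and the space inside the local-params filter, so the parameters stay separated by `&` however the URL is copied around.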
Re: Doubts in Result Grouping in solr 3.6.1
Grouping isn't defined for tokenized fields, I don't think. See http://wiki.apache.org/solr/FieldCollapsing, where it says for group.field: "...The field must currently be single-valued..."

Are you sure you don't want faceting?

Best
Erick
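
To see why the grpValue field in the question is not single-valued from the index's point of view, here is a rough Python imitation of the analysis chain described there (split on whitespace or "|", then lowercase, then drop stop words). It is not Solr's code: the function name `analyze` and the empty default stop word set are assumptions, and the real StopFilterFactory reads its list from stopwords_new.txt.

```python
import re

# Same pattern as in the fieldType from the question: whitespace runs or "|".
TOKEN_PATTERN = re.compile(r"\s+|\|")


def analyze(value, stopwords=frozenset()):
    """Approximate the pattern tokenizer, lowercase filter, and stop filter."""
    tokens = [t.lower() for t in TOKEN_PATTERN.split(value) if t]
    return [t for t in tokens if t not in stopwords]


tokens = analyze("Name XYZ|Company")
```

One stored value turns into three indexed terms, while grouping expects a single term per document. A common way around this is to copy the part you want to group on into a separate untokenized (string) field and group on that.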
Doubts in Result Grouping in solr 3.6.1
Hi,

I am currently using Solr 3.6.1, and for indexing data I am using the data import handler for 3.5 because of the reason posted in the following forum link: http://lucene.472066.n3.nabble.com/Dataimport-Handler-in-solr-3-6-1-td4001149.html

I am trying to achieve result grouping based on a field grpValue which has a value like this: "Name XYZ|Company". In total, 359 docs were indexed, and the field grpValue in all 359 docs contains the word Company in its value. I gave the following in my schema.xml for splitting the words while indexing and querying:

  <fieldType name="groupField" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.PatternTokenizerFactory" pattern="\s+|\|"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_new.txt" enablePositionIncrements="true" />
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.PatternTokenizerFactory" pattern="\s+|\|"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_new.txt" enablePositionIncrements="true" />
    </analyzer>
  </fieldType>

I am trying to split the words on a single space or a "|" symbol in my data by using pattern="\s+|\|" in PatternTokenizerFactory. When I used the analysis option in Solr, the sample value was split into 3 words (Name, XYZ, Company) by both my index and query analyzers.

When I requested the following URL

http://localhost:8080/solr/core1/select/?q=*%3A*&version=2.2&start=0&rows=359&indent=on&group=true&group.field=grpValue&group.limit=0

I noticed that I have a group called Company with numFound 73, although the field grpValue has the word Company in its value in all 359 docs. Ideally, I should have got 359 docs as numFound under my group:

  <lst name="grouped">
    <lst name="grpValue">
      <int name="matches">359</int>
      <arr name="groups">
        <lst>
          <str name="groupValue">Company</str>
          <result name="doclist" numFound="73" start="0" />
        </lst>

Could someone please explain why only 73 docs are present in that group instead of 359? I also noticed that when I added up the numFound of all the groups, it totalled 359. I am not sure what I am missing. Please let me know in case more details are needed. Thanks in advance.

-- View this message in context: http://lucene.472066.n3.nabble.com/Doubts-in-Result-Grouping-in-solr-3-6-1-tp4005239.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Multivalued attibute grouping in SOLR
I came across a problem where one of my columns is multivalued, e.g. the value can be (11,22), (11,33), (11,55), (22,44), (22,99). I want to perform a grouping operation that will yield:
* 11 : count 3
* 22 : count 3
* 33 : 1
* 44 : 1
* 55 : 1
* 99 : 1

According to the wiki, "Support for grouping on a multi-valued field has not yet been implemented."
http://wiki.apache.org/solr/FieldCollapsing#Known_Limitations
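
If only the per-value document counts are needed, those are what field faceting returns for a multivalued field (facet=true&facet.field=<your field>; the field name is whatever the column is called in your schema). The Python sketch below just reproduces the expected numbers from the question to show the counting rule. The helper name `value_counts` is hypothetical.

```python
from collections import Counter

# The five multivalued documents from the question.
docs = [(11, 22), (11, 33), (11, 55), (22, 44), (22, 99)]


def value_counts(documents):
    """Count, for each distinct value, how many documents contain it."""
    counts = Counter()
    for values in documents:
        # set() so a value repeated inside one document is counted once.
        counts.update(set(values))
    return counts


counts = value_counts(docs)
```

Faceting only gives the counts. It does not return the documents organized under each value the way result grouping does, so this applies only when the counts are the goal.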
Multivalued attibute grouping in SOLR
I came across a problem where one of my columns is multivalued, e.g. the value can be (11,22), (11,33), (11,55), (22,44), (22,99). I want to perform a grouping operation that will yield:
* 11 : count 3
* 22 : count 3
* 33 : 1
* 44 : 1
* 55 : 1
* 99 : 1

-- View this message in context: http://lucene.472066.n3.nabble.com/Multivalued-attibute-grouping-in-SOLR-tp3994785.html Sent from the Solr - User mailing list archive at Nabble.com.
Field Collapsing and Grouping in Solr 3.2
Hello. Does anybody know whether Field Collapsing and Grouping is available in Solr 3.2? I mean directly available, not as a patch. I have read conflicting statements about it... Thanks a lot!

Sergio Martín Cantero
playence KG
eMail: sergio.mar...@playence.com
Web: www.playence.com
Re: Field Collapsing and Grouping in Solr 3.2
Alas, no, not yet... grouping/field collapse has had a long history with Solr.

There were many iterations on SOLR-236, but that impl was never committed. Instead, SOLR-1682 was committed, but committed only to trunk (never backported to 3.x despite requests).

Then, a new grouping module was factored out of Solr's trunk implementation, and was backported to 3.x.

Finally, there is now an effort to cut over Solr trunk (SOLR-2564) and Solr 3.x (SOLR-2524) to the new grouping module, which looks like it's close to being done! So hopefully for 3.3, but no promises! This is open-source...

Mike McCandless http://blog.mikemccandless.com
RE: Field Collapsing and Grouping in Solr 3.2
Mike, thanks a lot for your quick and precise answer! Sergio Martín Cantero, playence KG -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Thursday, June 16, 2011 12:51 To: solr-user@lucene.apache.org Subject: Re: Field Collapsing and Grouping in Solr 3.2
RE: Grouping in solr ?
I'm really sorry - thank you for the note. -Original Message- From: Chris Hostetter [mailto:hossman_luc...@fucit.org] Sent: Tuesday, September 28, 2010 05:12 To: solr-user@lucene.apache.org Subject: Re: Grouping in solr ?
: References: abcc5d9ce0798544a169c584b8f1447d230313c...@exchange01.toolbox.local
: In-Reply-To: abcc5d9ce0798544a169c584b8f1447d230313c...@exchange01.toolbox.local
: Subject: Grouping in solr ?
http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists: When starting a new discussion on a mailing list, please do not reply to an existing message; instead, start a fresh email. Even if you change the subject line of your email, other mail headers still track which thread you replied to, and your question is hidden in that thread and gets less attention. It makes following discussions in the mailing list archives particularly difficult. See also: http://en.wikipedia.org/wiki/User:DonDiego/Thread_hijacking -Hoss
Grouping in solr ?
Hi all, is it possible somehow to group documents? I have services as documents, and I would like to show the filtered services grouped by company. So I filter services by the given criteria, but I show the results grouped by company. If I get 1000 services, maybe I need to show just 100 companies (this will affect pagination as well). And how could I get the company info? Should I store the company info in each service (I don't need the company info to be indexed)? regards, Rich
RE: Grouping in solr ?
http://wiki.apache.org/solr/FieldCollapsing https://issues.apache.org/jira/browse/SOLR-236 -Original message- From: Papp Richard ccode...@gmail.com Sent: Thu 23-09-2010 21:29 To: solr-user@lucene.apache.org; Subject: Grouping in solr ?
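For readers following the FieldCollapsing link, here is a minimal sketch of how the "1000 services, 100 companies" pagination could look from a client. It is an illustration, not something posted in this thread. It assumes a Solr build that supports result grouping and the group.ngroups parameter, a hypothetical single-valued field named company_id on each service document, and that the company details are stored (not necessarily indexed) on each service. With grouping enabled, start and rows count groups rather than documents, so the page arithmetic is done on ngroups.

```python
from math import ceil
from urllib.parse import urlencode


def grouped_query_params(query, group_field, page, per_page, docs_per_group=1):
    """Build Solr request parameters for one page of groups (pages start at 1)."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be >= 1")
    return {
        "q": query,
        "group": "true",
        "group.field": group_field,       # hypothetical field, e.g. company_id
        "group.limit": docs_per_group,    # documents returned inside each group
        "group.ngroups": "true",          # ask Solr for the number of groups
        "start": (page - 1) * per_page,   # offset counted in groups
        "rows": per_page,                 # groups per page
        "wt": "json",
    }


def page_count(ngroups, per_page):
    """Number of pages needed to list every group."""
    return ceil(ngroups / per_page)


def companies_on_page(response, group_field):
    """Return (ngroups, [(group value, first document or None), ...])
    from a parsed JSON response in Solr's default grouped format."""
    grouped = response["grouped"][group_field]
    companies = []
    for group in grouped["groups"]:
        docs = group["doclist"]["docs"]
        companies.append((group["groupValue"], docs[0] if docs else None))
    return grouped["ngroups"], companies


if __name__ == "__main__":
    params = grouped_query_params("category:cleaning", "company_id", page=2, per_page=10)
    # Append this query string to your own core's /select handler URL.
    print(urlencode(params))
```

The company info is read from the first document of each group, which is why the sketch assumes those fields are stored on every service. In a sharded setup the group count may only be accurate when all documents of a group live on the same shard, so treat ngroups with care there.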
RE: Grouping in solr ?
Thank you! This is really helpful. I just tried it and it's amazing. Do you know how reliable a nightly build (Solr 4) is? Rich -Original Message- From: Markus Jelsma [mailto:markus.jel...@buyways.nl] Sent: Thursday, September 23, 2010 22:38 To: solr-user@lucene.apache.org Subject: RE: Grouping in solr ?