[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209147#comment-17209147 ]

Aroop commented on SOLR-14916:
--
Thanks Joel

> Add split parameter to timeseries Streaming Expression
> --
>
>                 Key: SOLR-14916
>                 URL: https://issues.apache.org/jira/browse/SOLR-14916
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Joel Bernstein
>            Priority: Major
>
> Currently the time series function only supports aggregations across the time
> dimension. This ticket will add the *split* parameter which will add a top
> level split by categorical field, to produce time lines for each split. The
> split-limit and split-sort parameters will also be added to control the
> number and order of values in the split field result.
> Sample syntax:
> {code}
> timeseries(collection1,
>            q="*:*",
>            split="company",
>            split-limit=10,
>            split-sort="avg(price_f) desc",
>            field="timefield",
>            gap="+1DAY",
>            format="-dd-MM" ,
>            avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or
> clustered like the output of the *facet2D* function. The *diff* and
> *minMaxScale* functions already support operations over matrix rows so it's
> very easy to perform clustering etc. on this output.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
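The ticket says the split output "can be easily pivoted into a matrix". A minimal Python sketch of that pivot follows. It assumes, hypothetically, that each result tuple carries the split field, the time bucket and the metric under the keys `company`, `timefield` and `avg(price_f)`; the ticket does not specify the tuple shape, and filling missing buckets with 0.0 is also only an illustrative choice.

```python
from collections import defaultdict


def pivot_timeseries(tuples, split_field, time_field, metric_field):
    """Pivot flat (split, time, metric) tuples into one matrix row per split value."""
    # Column order: every distinct time bucket, sorted.
    times = sorted({t[time_field] for t in tuples})
    column = {ts: i for i, ts in enumerate(times)}
    # Each row starts as zeros; buckets missing for a split stay 0.0 (assumption).
    rows = defaultdict(lambda: [0.0] * len(times))
    for t in tuples:
        rows[t[split_field]][column[t[time_field]]] = t[metric_field]
    labels = sorted(rows)
    return labels, times, [rows[label] for label in labels]


# Hypothetical tuples, shaped as assumed above.
sample = [
    {"company": "acme", "timefield": "2020-10-01", "avg(price_f)": 10.0},
    {"company": "acme", "timefield": "2020-10-02", "avg(price_f)": 12.0},
    {"company": "globex", "timefield": "2020-10-02", "avg(price_f)": 7.5},
]
labels, times, matrix = pivot_timeseries(sample, "company", "timefield", "avg(price_f)")
print(labels, times, matrix)
```

Each row of `matrix` is then one time line per split value, which is the shape that row-wise operations such as differencing or scaling expect.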
[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209108#comment-17209108 ]

Aroop commented on SOLR-14916:
--
[~jbernste] which values of "gap" will be supported, and will the valid values for "format" be documented, or defined in an enum/constants file?
[jira] [Commented] (SOLR-14916) Add split parameter to timeseries Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17209051#comment-17209051 ]

Aroop commented on SOLR-14916:
--
[~jbernste] this looks very neat!

> Add split parameter to timeseries Streaming Expression
> --
>
>                 Key: SOLR-14916
>                 URL: https://issues.apache.org/jira/browse/SOLR-14916
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Joel Bernstein
>            Priority: Major
>
> Currently the time series function only supports aggregations across the time
> dimension. This ticket will add the *split* parameter which will add a top
> level split by categorical field, to produce time lines for each split. The
> split-limit and split-sort parameters will also be added to control number
> and order of values in the split field result.
> Sample syntax:
> {code}
> timeseries(collection1,
>            q="*:*",
>            split="company",
>            split-limit=10,
>            split-sort="avg(price_f) desc",
>            field="timefield",
>            gap="+1DAY",
>            format="-dd-MM" ,
>            avg(price_f))
> {code}
> The output of this can be easily pivoted into a matrix and correlated or
> clustered like the output of the *facet2D* function. The *diff* function
> already supports the serial differencing of matrix columns so it's very easy
> to perform clustering etc. on this output.
[jira] [Commented] (SOLR-14660) Migrating HDFS into a package
[ https://issues.apache.org/jira/browse/SOLR-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17167457#comment-17167457 ]

Aroop commented on SOLR-14660:
--
[~warper] this is a great start.

I have a few questions regarding the codebase for the HDFS backup/restore; these, as you may know, are collection APIs. They use the HdfsBackupRepository bindings, which you found being configured via solr.xml (optionally, for those who need it).

Have you foreseen any disruption to that call due to this move? I am assuming the collection API handler for that call will now need to use a different import for the new path?

> Migrating HDFS into a package
> -
>
>                 Key: SOLR-14660
>                 URL: https://issues.apache.org/jira/browse/SOLR-14660
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Ishan Chattopadhyaya
>            Priority: Major
>
> Following up on the deprecation of HDFS (SOLR-14021), we need to work on
> isolating it away from Solr core and making a package for this. This issue is
> to track the efforts for that.
[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aroop updated SOLR-14614:
-
    Description:
For Data Analytics, the standard use case is:
 # Find a pattern
 # Then aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # Look at the top-n

This functionality has been available through many different interfaces in Solr in the past, but only streaming expressions can deliver results in a scalable, performant and stable manner on systems holding data at big-data scale.

However, one barrier to entry is the query interface, which is not simple enough in streaming expressions.

To give an example of how involved the corresponding streaming expression can get when it has to work on large-scale systems: {color:#4c9aff} _find top 10 cities where someone named Alex works with the respective counts_{color}
{code:java}
qt=/stream&aggregationMode=facet&expr=
select(
 top(
  rollup(sort(by%3D"city+asc",
   +plist(
    select(facet(collection1,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa),
    select(facet(collection2,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa)
   )),
   +over%3D"city",+sum(Nj3bXa)),
  +n%3D"10",+sort%3D"sum(Nj3bXa)+desc"),
 +city,+sum(Nj3bXa)+as+Nj3bXa)
{code}
This is a query on an alias with 2 collections behind it, representing 2 data partitions, which is a requirement of sorts in big data systems.

This is one of the only ways to get information from billions of records in a matter of seconds. This is awesome in terms of capability and performance. But one can see how involved this syntax can be in the current scheme, and it is a barrier to entry for new adopters.

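The expression above is shown URL-encoded (`%3D` for `=`, `+` for a space), which adds to the visual noise. As a small aid for reading it, here is a Python sketch that decodes one of the inner `facet(...)` calls with the standard library; only the decoding step is illustrated, nothing Solr-specific is assumed.

```python
from urllib.parse import unquote_plus

# One inner facet() call, copied from the URL-encoded expression in the ticket.
encoded = (
    'facet(collection1,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",'
    '+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*))'
)

# unquote_plus turns "+" into a space and "%3D" into "=".
decoded = unquote_plus(encoded)
print(decoded)
```

Decoded, the call reads as a plain facet over `city` for documents matching `name:alex`, sorted by descending count.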
This Jira is to track the work of creating a simplified analytics endpoint augmenting streaming expressions.

A starting proposal is to have the endpoint take these query parameters:
{code:java}
/analytics?action=aggregate&q=*:*&fq=name:alex&dimensions=city&metrics=count&sort=count&sortOrder=desc&limit=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select city, count(*) from collection where name = 'alex' group by city order by count(*) desc limit 10;{code}
On the Solr side this would get translated to the best possible streaming expression using *rollups, top, sort, plist* etc., all done transparently to the user.

Here's to making the power of streaming expressions simpler to use for all.
[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aroop updated SOLR-14614:
-
    Description:
For Data Analytics, the standard use case is:
 # Find a pattern
 # Then aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # Look at the top-n

This functionality has been available through many different interfaces in Solr in the past, but only streaming expressions can deliver results in a scalable, performant and stable manner on systems holding data at big-data scale.

However, one barrier to entry is the query interface, which is not simple enough in streaming expressions.

To give an example of how involved the corresponding streaming expression can get when it has to work on large-scale systems: _find me top 10 cities where someone named Alex works with the respective counts_
{code:java}
qt=/stream&aggregationMode=facet&expr=
select(
 top(
  rollup(sort(by%3D"city+asc",
   +plist(
    select(facet(collection1,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa),
    select(facet(collection2,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa)
   )),
   +over%3D"city",+sum(Nj3bXa)),
  +n%3D"10",+sort%3D"sum(Nj3bXa)+desc"),
 +city,+sum(Nj3bXa)+as+Nj3bXa)
{code}
This is a query on an alias with 2 collections behind it, representing 2 data partitions, which is a requirement of sorts in big data systems.

This is one of the only ways to get information from billions of records in a matter of seconds. But one can see how involved this syntax can be in the current scheme, and it is a barrier to entry for new adopters.

This Jira is to track the work of creating a simplified analytics endpoint augmenting streaming expressions.

A starting proposal is to have the endpoint take these query parameters:
{code:java}
/analytics?action=aggregate&q=*:*&fq=name:alex&dimensions=city&metrics=count&sort=count&sortOrder=desc&limit=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select city, count(*) from collection where name = 'alex' group by city order by count(*) desc limit 10;{code}
On the Solr side this would get translated to the best possible streaming expression using *rollups, top, sort, plist* etc., all done transparently to the user.

Here's to making the power of streaming expressions simpler to use for all.
[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aroop updated SOLR-14614:
-
    Description:
For Data Analytics, the standard use case is:
 # Find a pattern
 # Then aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # Look at the top-n

This functionality has been available through many different interfaces in Solr in the past, but only streaming expressions can deliver results in a scalable, performant and stable manner on systems holding data at big-data scale.

However, one barrier to entry is the query interface, which is not simple enough in streaming expressions.

To give an example of how involved the corresponding streaming expression can get when it has to work on large-scale systems:
{code:java}
qt=/stream&aggregationMode=facet&expr=
select(
 top(
  rollup(sort(by%3D"city+asc",
   +plist(
    select(facet(collection1,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa),
    select(facet(collection2,+q%3D"(*:*+AND+name:alex)",+buckets%3D"city",+bucketSizeLimit%3D"2010",+bucketSorts%3D"count(*)+desc",+count(*)),+city,+count(*)+as+Nj3bXa)
   )),
   +over%3D"city",+sum(Nj3bXa)),
  +n%3D"10",+sort%3D"sum(Nj3bXa)+desc"),
 +city,+sum(Nj3bXa)+as+Nj3bXa)
{code}
This is a query on an alias with 2 collections behind it, representing 2 data partitions, which is a requirement of sorts in big data systems.

This is one of the only ways to get information from billions of records in a matter of seconds. But one can see how involved this syntax can be in the current scheme, and it is a barrier to entry for new adopters.

This Jira is to track the work of creating a simplified analytics endpoint augmenting streaming expressions.

A starting proposal is to have the endpoint take these query parameters:
{code:java}
/analytics?action=aggregate&q=*:*&fq=name:alex&dimensions=city&metrics=count&sort=count&sortOrder=desc&limit=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select city, count(*) from collection where name = 'alex' group by city order by count(*) desc limit 10;{code}
On the Solr side this would get translated to the best possible streaming expression using *rollups, top, sort, plist* etc., all done transparently to the user.

Here's to making the power of streaming expressions simpler to use for all.

> Add Simplified Aggregation Interface to Streaming Expression
>
>
>                 Key: SOLR-14614
>                 URL: https://issues.apache.org/jira/browse/SOLR-14614
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: query, query parsers, streaming expressions
>    Affects Versions: 7.7.2, 8.4.1
>            Reporter: Aroop
>            Priority: Major
>
> For the Data Analytics use cases the standard use case is:
> # Find a pattern
> # Then Aggregate by certain dimensions
> # Then compute metrics (like count, sum, avg)
> # Sort by a dimension or metric
> # look at top-n
> This functionality has been available over many different interfaces in the
> past on solr, but only streaming expressions have the ability to deliver
> results in a scalable, performant and stable manner for systems that have
> large data to the tune of Big data systems.
> However, one barrier to entry is the query i
[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aroop updated SOLR-14614:
-
    Description:
For Data Analytics, the standard use case is:
 # Find a pattern
 # Then aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # Look at the top-n

This functionality has been available through many different interfaces in Solr in the past, but only streaming expressions can deliver results in a scalable, performant and stable manner on systems holding data at big-data scale.

However, one barrier to entry is the query interface, which is not simple enough in streaming expressions.

This Jira is to track the work of creating a simplified analytics endpoint augmenting streaming expressions.

A starting proposal is to have the endpoint take these query parameters:
{code:java}
/analytics?action=aggregate&q=*:*&fq=name:alex*&dimensions=age,city&metrics=count&sort=count&sortOrder=desc&limit=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%' group by age, city order by age desc limit 10;{code}
On the Solr side this would get translated to the best possible streaming expression using *rollups, top, sort, plist* etc., all done transparently to the user.
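The translation step described above, from simple parameters to a streaming expression, could look roughly like the following Python sketch. Everything in it is hypothetical: `build_aggregate_expr`, its arguments, the `cnt` alias and the default bucket size are illustrative names and values, not part of Solr or of this ticket. It covers only a single dimension with a `count` metric, following the shape of the hand-written expression quoted elsewhere in this issue (one `facet` per collection, merged with `plist`, `rollup` and `top`).

```python
def build_aggregate_expr(collections, q, fq, dimension, limit, bucket_size_limit=2010):
    """Build a top-n count expression over several collections (illustrative only)."""
    alias = "cnt"  # hypothetical alias for the per-collection count
    query = f"({q} AND {fq})"
    # One select(facet(...)) per collection behind the alias.
    facets = ",\n      ".join(
        f'select(facet({c}, q="{query}", buckets="{dimension}", '
        f'bucketSizeLimit="{bucket_size_limit}", bucketSorts="count(*) desc", count(*)), '
        f"{dimension}, count(*) as {alias})"
        for c in collections
    )
    # Merge the per-collection counts, sum them per dimension value, keep the top n.
    return (
        "select(\n"
        "  top(\n"
        f'    rollup(sort(by="{dimension} asc", plist(\n      {facets}\n    )),\n'
        f'      over="{dimension}", sum({alias})),\n'
        f'    n="{limit}", sort="sum({alias}) desc"),\n'
        f"  {dimension}, sum({alias}) as {alias})"
    )


expr = build_aggregate_expr(["collection1", "collection2"], "*:*", "name:alex", "city", 10)
print(expr)
```

A real implementation would also need to handle several dimensions, other metrics and the sort options, and would have to URL-encode the result before sending it to `/stream`.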
[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aroop updated SOLR-14614:
-
    Description:
For Data Analytics, the standard use case is:
 # Find a pattern
 # Then aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # Look at the top-n

This functionality has been available through many different interfaces in Solr in the past, but only streaming expressions can deliver results in a scalable, performant and stable manner on systems holding data at big-data scale.

However, one barrier to entry is the query interface, which is not simple enough in streaming expressions.

This Jira is to track the work of creating a simplified analytics endpoint augmenting streaming expressions.

A starting proposal is to have the endpoint take these query parameters:
{code:java}
/analytics?q=*:*&fq=name:alex*&dimensions=age,city&metrics=count&sort=count&sortOrder=desc&limit=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%' group by age, city order by age desc limit 10;{code}
On the Solr side this would get translated to the best possible streaming expression using _*rollups, top, sort, plist* etc.; b_ut all done transparently to the user.

  was:
For Data Analytics, the standard use case is:
 # Find a pattern
 # Then aggregate by certain dimensions
 # Then compute metrics (like count, sum, avg)
 # Sort by a dimension or metric
 # Look at the top-n

This functionality has been available through many different interfaces in Solr in the past, but only streaming expressions can deliver results in a scalable, performant and stable manner on systems holding data at big-data scale.

However, one barrier to entry is the query interface, which is not simple enough in streaming expressions.

This Jira is to track the work of creating a simplified analytics endpoint augmenting streaming expressions.

A starting proposal is to have the endpoint take these query parameters:
{code:java}
/analytics&q=*:*&fq=name:alex*&dimensions=age,city&metrics=count&sort=count&sortOrder=desc&limit=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%' group by age, city order by age desc limit 10;{code}
On the Solr side this would get translated to the best possible streaming expression using _*rollups, top, sort, plist* etc.; b_ut all done transparently to the user.
[jira] [Updated] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression
[ https://issues.apache.org/jira/browse/SOLR-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aroop updated SOLR-14614:

Description:
For data analytics use cases, the standard workflow is:
# Find a pattern
# Aggregate by certain dimensions
# Compute metrics (like count, sum, avg)
# Sort by a dimension or metric
# Look at the top-n

This functionality has been available over many different interfaces in Solr in the past, but only streaming expressions can deliver results in a scalable, performant and stable manner for systems holding data at big-data scale.

However, one barrier to entry is that the streaming expression query interface is not simple enough.

This Jira is to track the work of creating a simplified analytics endpoint augmenting streaming expressions.

A starting proposal is for the endpoint to accept these query parameters:
{code:java}
/analytics?q=*:*&fq=name:alex*&dimensions=age,city&metrics=count&sort=count&sortOrder=desc&limit=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%'
group by age, city order by count(*) desc limit 10;{code}
On the Solr side this would get translated to the best possible streaming expression using *rollups, top, sort, plist* etc.; but all done transparently to the user.

was:
For data analytics use cases, the standard workflow is:
# Find a pattern
# Aggregate by certain dimensions
# Compute metrics (like count, sum, avg)
# Sort by a dimension or metric
# Look at the top-n

This functionality has been available over many different interfaces in Solr in the past, but only streaming expressions can deliver results in a scalable, performant and stable manner for systems holding data at big-data scale.

However, one barrier to entry is that the streaming expression query interface is not simple enough.

This Jira is to track the work of creating a simplified analytics endpoint augmenting streaming expressions.

A starting proposal is for the endpoint to accept these query parameters:
{code:java}
/analytics?q=*:*&fq=name:alex*&dimensions=age,city&metrics=count&sort=count&sortOrder=desc&limit=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%'
group by age, city order by count(*) desc limit 10;{code}
On the Solr side this would get translated to the best possible streaming expression using _*rollups, top, sort, plist* etc.; b_ut all done transparently to the user.

> Add Simplified Aggregation Interface to Streaming Expression
>
> Key: SOLR-14614
> URL: https://issues.apache.org/jira/browse/SOLR-14614
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: query, query parsers, streaming expressions
> Affects Versions: 7.7.2, 8.4.1
> Reporter: Aroop
> Priority: Major
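The translation proposed in SOLR-14614 might look roughly like the sketch below. It is only an illustration: the class `AnalyticsTranslator` and its method are hypothetical names, no `/analytics` endpoint is assumed to exist, and the sketch handles only `metrics=count` with `sort=count`, following the `sort=count&sortOrder=desc` parameters of the proposed URL. It assumes the dimension fields can be served by the `/export` handler and does no escaping of the query strings. The output combines the existing `search`, `rollup` and `top` streaming expressions.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: not part of Solr, shown only to illustrate the proposal.
final class AnalyticsTranslator {

    /**
     * Builds a streaming expression that counts documents per combination of
     * dimensions and keeps the top `limit` groups ordered by that count.
     */
    static String toStreamingExpression(String collection, String q, String fq,
                                        List<String> dimensions, String sortOrder, int limit) {
        String over = String.join(",", dimensions);

        // rollup() expects its input stream to be sorted by the grouping fields.
        String inputSort = dimensions.stream()
                .map(d -> d + " asc")
                .collect(Collectors.joining(","));

        String search = "search(" + collection
                + ",q=\"" + q + "\""
                + ",fq=\"" + fq + "\""
                + ",fl=\"" + over + "\""
                + ",sort=\"" + inputSort + "\""
                + ",qt=\"/export\")";

        // Group by the dimensions and count each group.
        String rollup = "rollup(" + search + ",over=\"" + over + "\",count(*))";

        // Re-sort the groups by the metric and keep the first `limit` of them.
        return "top(n=" + limit + "," + rollup + ",sort=\"count(*) " + sortOrder + "\")";
    }

    public static void main(String[] args) {
        System.out.println(toStreamingExpression(
                "collection", "*:*", "name:alex*", List.of("age", "city"), "desc", 10));
    }
}
```

A real implementation would presumably also choose between alternatives such as `facet` or a parallel `plist` depending on cardinality, which is the "best possible streaming expression" part of the proposal and is not attempted here.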
[jira] [Created] (SOLR-14614) Add Simplified Aggregation Interface to Streaming Expression
Aroop created SOLR-14614:

Summary: Add Simplified Aggregation Interface to Streaming Expression
Key: SOLR-14614
URL: https://issues.apache.org/jira/browse/SOLR-14614
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Components: query, query parsers, streaming expressions
Affects Versions: 8.4.1, 7.7.2
Reporter: Aroop

For data analytics use cases, the standard workflow is:
# Find a pattern
# Aggregate by certain dimensions
# Compute metrics (like count, sum, avg)
# Sort by a dimension or metric
# Look at the top-n

This functionality has been available over many different interfaces in Solr in the past, but only streaming expressions can deliver results in a scalable, performant and stable manner for systems holding data at big-data scale.

However, one barrier to entry is that the streaming expression query interface is not simple enough.

This Jira is to track the work of creating a simplified analytics endpoint augmenting streaming expressions.

A starting proposal is for the endpoint to accept these query parameters:
{code:java}
/analytics?q=*:*&fq=name:alex*&dimensions=age,city&metrics=count&sort=count&sortOrder=desc&limit=10{code}
This is equivalent to the SQL that an analyst would write:
{code:java}
select age, city, count(*) from collection where name like 'alex%'
group by age, city order by count(*) desc limit 10;{code}
On the Solr side this would get translated to the best possible streaming expression using *rollups, top, sort, plist* etc.; but all done transparently to the user.
[jira] [Updated] (SOLR-14316) Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
[ https://issues.apache.org/jira/browse/SOLR-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aroop updated SOLR-14316:

Description:
There is an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method.

This change removes that warning by using a checked conversion, and also adds tests for a previously untested API.

was: There is an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method.

> Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
>
> Key: SOLR-14316
> URL: https://issues.apache.org/jira/browse/SOLR-14316
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrJ
> Affects Versions: 7.7.2, 8.4.1
> Reporter: Aroop
> Priority: Minor
> Labels: patch
> Time Spent: 0.5h
> Remaining Estimate: 0h
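To illustrate the kind of change SOLR-14316 describes, here is a minimal sketch of an `equals()` on a map entry that avoids an unchecked cast. It is not the actual JavaBinCodec code or the attached patch: `CheckedEntryEquals` and `SimpleEntry` are hypothetical stand-ins. The idea is to test and cast against the wildcard type `Map.Entry<?, ?>`, which the compiler can verify, instead of casting to a parameterized type such as `Map.Entry<Object, Object>`.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the entry object; not the real JavaBinCodec class.
final class CheckedEntryEquals {

    static final class SimpleEntry implements Map.Entry<Object, Object> {
        private final Object key;
        private final Object value;

        SimpleEntry(Object key, Object value) {
            this.key = key;
            this.value = value;
        }

        @Override
        public Object getKey() {
            return key;
        }

        @Override
        public Object getValue() {
            return value;
        }

        @Override
        public Object setValue(Object newValue) {
            throw new UnsupportedOperationException("read-only entry");
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            // Map.Entry<?, ?> is a reifiable type, so both the instanceof test
            // and the cast below compile without an unchecked warning.
            if (!(obj instanceof Map.Entry<?, ?>)) {
                return false;
            }
            Map.Entry<?, ?> other = (Map.Entry<?, ?>) obj;
            return Objects.equals(key, other.getKey())
                    && Objects.equals(value, other.getValue());
        }

        @Override
        public int hashCode() {
            // Same formula as the Map.Entry contract, so equal entries hash alike.
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }
    }
}
```

Because the comparison only needs `getKey()` and `getValue()` as `Object`, the wildcard type loses nothing here, and no `@SuppressWarnings("unchecked")` is required.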
[jira] [Updated] (SOLR-14316) Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
[ https://issues.apache.org/jira/browse/SOLR-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aroop updated SOLR-14316:

Attachment: SOLR-14316.patch

> Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
>
> Key: SOLR-14316
> URL: https://issues.apache.org/jira/browse/SOLR-14316
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrJ
> Affects Versions: 7.7.2, 8.4.1
> Reporter: Aroop
> Priority: Minor
> Labels: patch
>
> There is an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method.
[jira] [Updated] (SOLR-14316) Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
[ https://issues.apache.org/jira/browse/SOLR-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aroop updated SOLR-14316:

Attachment: (was: SOLR-14316.patch)

> Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
>
> Key: SOLR-14316
> URL: https://issues.apache.org/jira/browse/SOLR-14316
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrJ
> Affects Versions: 7.7.2, 8.4.1
> Reporter: Aroop
> Priority: Minor
> Labels: patch
>
> There is an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method.
[jira] [Updated] (SOLR-14316) Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
[ https://issues.apache.org/jira/browse/SOLR-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aroop updated SOLR-14316:

Labels: patch (was: patch warnings)

> Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
>
> Key: SOLR-14316
> URL: https://issues.apache.org/jira/browse/SOLR-14316
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrJ
> Affects Versions: 7.7.2, 8.4.1
> Reporter: Aroop
> Priority: Minor
> Labels: patch
>
> There is an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method.
[jira] [Updated] (SOLR-14316) Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
[ https://issues.apache.org/jira/browse/SOLR-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aroop updated SOLR-14316:

Labels: patch warnings (was: warnings)

> Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
>
> Key: SOLR-14316
> URL: https://issues.apache.org/jira/browse/SOLR-14316
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrJ
> Affects Versions: 7.7.2, 8.4.1
> Reporter: Aroop
> Priority: Minor
> Labels: patch, warnings
>
> There is an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method.
[jira] [Updated] (SOLR-14316) Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
[ https://issues.apache.org/jira/browse/SOLR-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aroop updated SOLR-14316:

Summary: Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method (was: Remove there was an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method)

> Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
>
> Key: SOLR-14316
> URL: https://issues.apache.org/jira/browse/SOLR-14316
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrJ
> Affects Versions: 7.7.2, 8.4.1
> Reporter: Aroop
> Priority: Minor
> Labels: warnings
>
> There is an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method.
[jira] [Created] (SOLR-14316) Remove there was an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
Aroop created SOLR-14316:

Summary: Remove there was an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
Key: SOLR-14316
URL: https://issues.apache.org/jira/browse/SOLR-14316
Project: Solr
Issue Type: Improvement
Security Level: Public (Default Security Level. Issues are Public)
Components: SolrJ
Affects Versions: 8.4.1, 7.7.2
Reporter: Aroop

There is an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method.