[ https://issues.apache.org/jira/browse/SOLR-8530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803191#comment-15803191 ]
ASF subversion and git services commented on SOLR-8530:
-------------------------------------------------------

Commit b32cd82318f5c8817a8383e1be7534c772e6fa13 in lucene-solr's branch refs/heads/master from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=b32cd82 ]

SOLR-8530: Add support for aggregate HAVING comparisons without single quotes

> Add HavingStream to Streaming API and StreamingExpressions
> ----------------------------------------------------------
>
>                 Key: SOLR-8530
>                 URL: https://issues.apache.org/jira/browse/SOLR-8530
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrJ
>    Affects Versions: 6.0
>            Reporter: Dennis Gove
>            Priority: Minor
>             Fix For: master (7.0), 6.4
>
>         Attachments: SOLR-8530.patch
>
>
> The goal here is to support something similar to SQL's HAVING clause, where
> one can filter documents based on data that is not available in the index.
> For example, filter the output of a reduce(....) based on the calculated
> metrics.
> {code}
> having(
>   reduce(
>     search(.....),
>     sum(cost),
>     on=customerId
>   ),
>   q="sum(cost):[500 TO *]"
> )
> {code}
> This example would return all tuples where the total spent by each distinct
> customer is >= 500. The total spent is calculated via the sum(cost) metric
> in the reduce stream.
> The intent is for the filters in the having(...) clause to support the full
> query syntax of a search(...) clause. I see this being possible in one of
> two ways.
> 1. Use Lucene's MemoryIndex: as each tuple is read out of the underlying
> stream, create an instance of MemoryIndex and apply the query to it. If the
> result of that is >0 then the tuple should be returned from the HavingStream.
> 2. Create an in-memory Solr index via something like RAMDirectory, read all
> tuples into that in-memory index using the UpdateStream, and then stream out
> of that all the matching tuples from the query.
> There are benefits to each approach, but I think the easiest and most
> direct one is the MemoryIndex approach. With MemoryIndex it isn't necessary
> to read all incoming tuples before returning a single tuple. With a
> MemoryIndex there is a need to parse the Solr query parameters and create a
> valid Lucene query, but I suspect that can be done using existing QParser
> implementations.
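
The per-tuple behaviour described for the MemoryIndex approach (test each tuple as it arrives and emit it immediately if it matches) can be illustrated with a small language-neutral sketch. This is not the HavingStream implementation and does not use Lucene or Solr; the names `having`, `at_least`, and `reduced_tuples`, and the sample data, are assumptions made up for illustration. A plain Python predicate stands in for the parsed Lucene query.

```python
from typing import Callable, Dict, Iterable, Iterator

# A tuple here is just a dict of field name -> value (hypothetical stand-in
# for the tuples that flow through a streaming expression).
Record = Dict[str, object]


def having(tuples: Iterable[Record],
           predicate: Callable[[Record], bool]) -> Iterator[Record]:
    """Lazily yield only the tuples that satisfy the predicate.

    Each tuple is tested as soon as it is read, so the first match can be
    returned without consuming the rest of the underlying stream.
    """
    for record in tuples:
        if predicate(record):
            yield record


def at_least(field: str, threshold: float) -> Callable[[Record], bool]:
    """Build a predicate equivalent in spirit to field:[threshold TO *]."""
    def test(record: Record) -> bool:
        value = record.get(field)
        # Tuples without the field never match.
        return value is not None and value >= threshold
    return test


def reduced_tuples() -> Iterator[Record]:
    """Hypothetical output of reduce(search(.....), sum(cost), on=customerId)."""
    yield {"customerId": "c1", "sum(cost)": 750.0}
    yield {"customerId": "c2", "sum(cost)": 120.0}
    yield {"customerId": "c3", "sum(cost)": 500.0}


if __name__ == "__main__":
    for record in having(reduced_tuples(), at_least("sum(cost)", 500)):
        print(record["customerId"], record["sum(cost)"])
```

Because `having` is a generator, it pulls from the underlying stream one tuple at a time, which mirrors the advantage claimed for the first option over loading everything into an in-memory index before streaming results back out.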