[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

Cao Manh Dat (JIRA) Sun, 13 Nov 2016 15:42:19 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15662327#comment-15662327
 ]


Cao Manh Dat commented on SOLR-8593:
------------------------------------

Thanks [~julianhyde], that make sense, but will Calcite have an option for that 
in the future? Because along with JDBC API, we also support query through REST 
API.

For example 
{code}
curl --data-urlencode 'stmt=SELECT to, count(*) FROM collection4 GROUP BY to 
ORDER BY count(*) desc LIMIT 10' 
  http://localhost:8983/solr/collection4/sql?aggregationMode=facet
{code}

Below is sample result set:
{code}
{"result-set":{"docs":[
  {"count(*)":9158,"to":"pete.da...@enron.com"},
  {"count(*)":6244,"to":"tana.jo...@enron.com"},
  {"count(*)":5874,"to":"jeff.dasov...@enron.com"},
  {"count(*)":5867,"to":"sara.shackle...@enron.com"},
  {"count(*)":5595,"to":"steven.k...@enron.com"},
  {"count(*)":4904,"to":"vkamin...@aol.com"},
  {"count(*)":4622,"to":"mark.tay...@enron.com"},
  {"count(*)":3819,"to":"kay.m...@enron.com"},
  {"count(*)":3678,"to":"richard.shap...@enron.com"},
  {"count(*)":3653,"to":"kate.sy...@enron.com"},
  {"EOF":"true","RESPONSE_TIME":10}]}
}
{code}
So "EXPR$1" is kinda weird as a field name.

I also want to ask another question, right now we rely on Calcite in memory 
filter to filter expr in having clause ( as {{ having sum(field_i) = 19 }} in 
below query
{code}
select str_s, count(*) as myCount, sum(field_i), min(field_i), max(field_i), 
cast(avg(1.0 * field_i) as float) from collection1 where text='XXXX' group by 
str_s having sum(field_i) = 19 order by sum(field_i) asc
{code}

So I created a filter predicate for convert LogicalFilter to SolrFilter like 
this
{code}
private static final boolean isNotFilterByExpr (List<RexNode> rexNodes, 
List<String> fieldNames) {
  // We dont have a way to filter by result of aggregator now
  boolean result = true;
  for (RexNode rexNode : rexNodes) {
  if (rexNode instanceof RexCall) {
  result = result && isNotFilterByExpr(((RexCall) rexNode).getOperands(), 
fieldNames);
  } else if (rexNode instanceof RexInputRef) {
  result = result && !fieldNames.get(((RexInputRef) 
rexNode).getIndex()).startsWith("EXPR$");
  }
  }
  return result;
}

private static final Predicate<RelNode> FILTER_PREDICATE = relNode -> {
  List<RexNode> filterOperands = ((RexCall) ((LogicalFilter) 
relNode).getCondition()).getOperands();
  return isNotFilterByExpr(filterOperands, 
SolrRules.solrFieldNames(relNode.getRowType()));
};
{code}

But the above code is not works when I add alias for sum(field_i) like this
{code}
select str_s, count(*) as myCount, sum(field_i) as mySum, min(field_i), 
max(field_i), cast(avg(1.0 * field_i) as float) from collection1 where 
text='XXXX' group by str_s having sum(field_i) = 19 order by sum(field_i) asc
{code}

So do Calcite have a way to check for RexInputRef is an alias or not?


> Integrate Apache Calcite into the SQLHandler
> --------------------------------------------
>
>                 Key: SOLR-8593
>                 URL: https://issues.apache.org/jira/browse/SOLR-8593
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>         Attachments: SOLR-8593.patch
>
>
>    The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

Reply via email to