A SqlCall to $HARD_IN will (by SqlToRelConverter) become a RexCall to
$HARD_IN, and then (by RelToSqlConverter) become a SqlCall to
$HARD_IN. $HARD_IN(x, v1, v2) would become (by SqlWriter) the SQL "x
IN (v1, v2)".

At any point in this lifecycle, you could intercept and and simplify.

On Mon, Oct 7, 2019 at 2:34 PM Haisheng Yuan <h.y...@alibaba-inc.com> wrote:
>
> Will the filter condition with “$HARD_IN” internal function be able to pushed 
> down and be recognized by the source SQL system, like Peter mentioned?
>
> If not, we have to translate the internal function back to IN during Rel2Sql 
> phase. Otherwise, the data read from the source table can be much larger.
>
> - Haisheng
>
> ------------------------------------------------------------------
> 发件人:Julian Hyde<jh...@apache.org>
> 日 期:2019年10月08日 04:53:11
> 收件人:dev<dev@calcite.apache.org>
> 主 题:Re: [EXT] SqlRexConvertlet that Replicates "IN" Conversion Logic
>
> In 
> https://issues.apache.org/jira/browse/CALCITE-2792?focusedCommentId=16946209&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16946209
>  
> <https://issues.apache.org/jira/browse/CALCITE-2792?focusedCommentId=16946209&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16946209>
>  I floated the idea of a “$HARD_IN” internal function that has the same 
> semantics as IN but is not expanded to ‘… = OR … = …’.
>
> I think it would be a useful tool, if used judiciously.
>
> Julian
>
>
> > On Oct 4, 2019, at 7:08 PM, Haisheng Yuan <h.y...@alibaba-inc.com> wrote:
> >
> > As a workaround, you can modify you SqlRexConverlet, create a RexCall with 
> > balanced binary tree, e.g. (a=1 or a=2) or (a=3 or a=4), instead of a flat 
> > RexCall with multiple operands, e.g. a=1 or a=2 or a=3 or a=4.
> > Because every OR RexCall has exactly 2 operands, it won't transform into 
> > SqlCall with left deep tree.
> >
> > Let me know it works for you or not.
> >
> > - Haisheng
> >
> > ------------------------------------------------------------------
> > 发件人:Haisheng Yuan<h.y...@alibaba-inc.com>
> > 日 期:2019年10月05日 07:37:04
> > 收件人:Peter Wicks (pwicks)<pwi...@micron.com>; 
> > dev@calcite.apache.org<dev@calcite.apache.org>
> > 主 题:Re: RE: [EXT] Re: SqlRexConvertlet that Replicates "IN" Conversion Logic
> >
> > If you want to push the filter down to the source SQL sytem, then 
> > transforming to a join won't help you either.
> >
> > The reason of stackoverflow for large ORs is the left deep binary tree, we 
> > need to change it to balanced binary tree, to reduce the depth of the call.
> >
> > I will open a pull request later.
> >
> > - Haisheng
> >
> > ------------------------------------------------------------------
> > 发件人:Peter Wicks (pwicks)<pwi...@micron.com>
> > 日 期:2019年10月04日 21:32:25
> > 收件人:dev@calcite.apache.org<dev@calcite.apache.org>
> > 主 题:RE: [EXT] Re: SqlRexConvertlet that Replicates "IN" Conversion Logic
> >
> > Zoltan,
> >
> > Thanks for the suggestion. I actually tried doing a UDF first, and it was 
> > also successful, sorry for not sharing those details earlier.
> > The problem with the UDF is that the predicates are not pushed down to the 
> > source SQL system (by design), and this can result in a 100x increase in 
> > the amount of data returned from the database. This data will be correctly 
> > filtered by the UDF, but returning 100x the data makes it a lot slower. So 
> > I was trying to push it down to the source server instead.
> >
> > What do you mean by, "I guess Calcite might probably won't be able to do 
> > much with these ORs anyway..."? From my experiments I've seen two results 
> > from passing in this many OR's:
> >
> > - If no other predicates are included in the query, then Calcite succeeds! 
> > It leaves the OR's flat, (a=1 OR a=2 OR a=3 OR a=4)
> > - If additional predicates are included, then Calcite nests the OR 
> > statements, leading to a stackoverflow for very large OR's, which is 
> > CALCITE-2792, ((((a=1) OR a=2) OR a=3) OR a=4)
> >
> > Thanks,
> > Peter
> >
> > -----Original Message-----
> > From: Zoltan Haindrich <k...@rxd.hu>
> > Sent: Friday, October 4, 2019 12:38 AM
> > To: dev@calcite.apache.org; Haisheng Yuan <h.y...@alibaba-inc.com>; Peter 
> > Wicks (pwicks) <pwi...@micron.com>
> > Subject: Re: [EXT] Re: SqlRexConvertlet that Replicates "IN" Conversion 
> > Logic
> >
> >
> > I think you might try another approach: introduce some UDF and use your 
> > translation logic to call that - as the UDF will be opaque for calcite it 
> > will be left alone.
> > I guess Calcite might probably won't be able to do much with these ORs 
> > anyway...
> >
> >
> > On 10/3/19 11:26 PM, Haisheng Yuan wrote:
> >> I don't think this can be done in SqlRexConvertlet, which converts SqlNode 
> >> to RexNode.
> >> You might need to massage the SqlToRelConverter to create the RelNode that 
> >> you want.
> >>
> >> BTW, I still think we need RexNode for IN/ANY.
> > I also feel that there is some need for IN nodes; but there are some good 
> > sides of not having it as well: like simplification handles them better.
> >
> >
> >>
> >> - Haisheng
> >>
> >> ------------------------------------------------------------------
> >> 发件人:Peter Wicks (pwicks)<pwi...@micron.com>
> >> 日 期:2019年10月04日 04:03:51
> >> 收件人:dev@calcite.apache.org<dev@calcite.apache.org>
> >> 主 题:RE: [EXT] Re: SqlRexConvertlet that Replicates "IN" Conversion Logic
> >>
> >> Haisheng,
> >>
> >> Yes, that is what I would like to do. Unfortunately, I’m not sure how to 
> >> proceed to actually do that. I was hoping for a pointer to an example that 
> >> is similar?
> >>
> >> Thanks!
> >> Peter
> >>
> >> From: Haisheng Yuan <h.y...@alibaba-inc.com>
> >> Sent: Thursday, October 3, 2019 1:35 PM
> >> To: Peter Wicks (pwicks) <pwi...@micron.com>; dev@calcite.apache.org
> >> Subject: [EXT] Re: SqlRexConvertlet that Replicates "IN" Conversion Logic
> >>
> >> Currently Calcite doesn't have IN RexNode, only has IN SqlNode, 
> >> unfortunately.
> >>
> >> You can create a Values node with these authorization data, and make a 
> >> semi join with the table and Values you created.
> >>
> >> - Haisheng
> >>
> >> ------------------------------------------------------------------
> >> 发件人:Peter Wicks (pwicks)<pwi...@micron.com<mailto:pwi...@micron.com>>
> >> 日 期:2019年10月04日 02:34:02
> >> 收件人:dev@calcite.apache.org<dev@calcite.apache.org<mailto:dev@calcite.apache.org%3c...@calcite.apache.org>>
> >> 主 题:SqlRexConvertlet that Replicates "IN" Conversion Logic
> >>
> >> A little detail about what I'm trying to do:
> >>
> >> I have an external API that contains authorization information on a per 
> >> user basis. I want users to be able to include an operation in their query 
> >> that will filter data based on this authorization data.
> >>
> >> Using Calcite v1.16 / Java 1.8 / RHEL7, I built a class that implements 
> >> SqlRexConvertlet, and I am able to get this working. The user includes in 
> >> their predicate statement `custom_authorize(column)`, my convertlet 
> >> queries the API, gets the authorization rules, builds an OR statement, and 
> >> the results come back. This works sometimes, but other times the OR 
> >> condition becomes too large, and I run into CALCITE-2792: 
> >> https://nam01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissues.apache.org%2Fjira%2Fbrowse%2FCALCITE-2792&amp;data=02%7C01%7Cpwicks%40micron.com%7C1920339cebed45b0bd5b08d748955d24%7Cf38a5ecd28134862b11bac1d563c806f%7C0%7C1%7C637057678693693555&amp;sdata=0tuXcWcjTHXMQQr%2BmKXBjrSTAW%2BqBG%2Fp3PsK8df2tfk%3D&amp;reserved=0,
> >>  which causes a stackoverflow and my query dies.
> >>
> >> So I tried converting to an IN statement, having read that IN statements 
> >> are automatically converted to a sub query join when the default limit of 
> >> 20 is exceeded. The problem is that this appears only to be true for IN 
> >> statements that are included in the initial query. IN statements created 
> >> as the result of a convertlet do not get modified, and are sent as an IN 
> >> statement, which results in a failure to parse the query. I looked at how 
> >> Calcite normally does this translation from IN to exists using a join, but 
> >> it depends on a lot of classes/instances that aren't available in the 
> >> SqlRexContext space. Is it possible to rewrite my IN statement to a 
> >> join/exists query like Calcite normally does?
> >>
> >> Also, am I doing things all wrong? Is there a better way to go about this?
> >>
> >> Code Sample below is for the OR version, the commented code can be swapped 
> >> in to see how I was building the IN statement.
> >>
> >> @Override
> >> public RexNode convertCall(SqlRexContext cx, SqlCall call) {
> >> HashSet<String> keyList = null;
> >> try {
> >> keyList = new 
> >> Manager().getAllowedIDs(getContextInformation().getQueryUser());
> >> } catch (SQLException e) {
> >> e.printStackTrace();
> >> }
> >>
> >> final RexBuilder rexBuilder = cx.getRexBuilder();
> >> final RexNode column = cx.convertExpression(call.operand(0));
> >>
> >> final List<RexNode> nodes = new ArrayList<>();
> >> for(String s: keyList) {
> >> nodes.add(rexBuilder.makeCall(EQUALS, column, rexBuilder.makeLiteral(s)));
> >> //nodes.add(rexBuilder.makeLiteral(s));
> >> }
> >>
> >> final RexNode in = rexBuilder.makeCall(SqlStdOperatorTable.OR, nodes);
> >> //final RexNode in = inBuilder(rexBuilder, column, nodes.toArray(new 
> >> RexNode[0]));
> >>
> >> return in;
> >> }
> >>
> >> protected RexNode inBuilder(RexBuilder rexBuilder, RexNode node, 
> >> RexNode... nodes) {
> >> return rexBuilder.makeCall(SqlStdOperatorTable.IN,
> >> ImmutableList.<RexNode>builder().add(node).add(nodes).build());
> >> }
> >>
> >>
> >> Thanks,
> >> Peter
> >>
> >
> >
>
>

Reply via email to