Hi all,

This is a rough idea; I'd like to hear what the community thinks about it.

RexListCmp extends RexNode / RexCall {
  public final SqlOperator op;
  public final RexNode left;
  public final ImmutableList<RexNode> list;
  public final RexQuantifier quantifier;
  public final RelDataType type;
}

enum RexQuantifier {
  ALL,
  ANY
}

Background:

It is not uncommon for a query to contain a large constant IN list, 
e.g.
1) SELECT * FROM foo WHERE a NOT IN (1, 2, 3, ...., 10000);
2) SELECT * FROM bar WHERE b IN (1, 2, 3, ...., 10000);

Currently, Calcite either translates it into a join or expands it into OR/AND, 
which is inefficient and may cause problems.

With RexListCmp, the predicate in query 1) will be represented as:
RexListCmp {
  op = "<>",
  left = "a",
  list = "1,2,3...10000",
  quantifier = "ALL"
}

The predicate in query 2) will be represented as:
RexListCmp {
  op = "=",
  left = "b",
  list = "1,2,3...10000",
  quantifier = "ANY"
}
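
To make the intended semantics of the two representations concrete, here is a minimal standalone sketch. It is not Calcite code: the class ListCmpSketch, its Quantifier enum and the eval method are hypothetical names used only for illustration, and it deliberately ignores SQL NULL (three-valued) semantics.

```java
import java.util.List;
import java.util.function.BiPredicate;

public class ListCmpSketch {
  /** Hypothetical stand-in for the proposed RexQuantifier. */
  enum Quantifier { ALL, ANY }

  /**
   * Applies op(left, v) to every list element v and combines the results
   * according to the quantifier. Assumption: no NULL handling.
   */
  static <T> boolean eval(BiPredicate<T, T> op, T left, List<T> list,
      Quantifier quantifier) {
    switch (quantifier) {
      case ALL:
        return list.stream().allMatch(v -> op.test(left, v));
      case ANY:
        return list.stream().anyMatch(v -> op.test(left, v));
      default:
        throw new AssertionError(quantifier);
    }
  }

  public static void main(String[] args) {
    List<Integer> list = List.of(1, 2, 3);
    BiPredicate<Integer, Integer> eq = (x, y) -> x.equals(y);
    BiPredicate<Integer, Integer> ne = eq.negate();

    // a NOT IN (1, 2, 3)  is  a <> ALL (1, 2, 3)
    System.out.println(eval(ne, 4, list, Quantifier.ALL)); // true
    System.out.println(eval(ne, 2, list, Quantifier.ALL)); // false

    // b IN (1, 2, 3)  is  b = ANY (1, 2, 3)
    System.out.println(eval(eq, 2, list, Quantifier.ANY)); // true
    System.out.println(eval(eq, 4, list, Quantifier.ANY)); // false
  }
}
```

The point of the sketch is that one node holding the operator, the left operand, the list and the quantifier is enough to describe both IN and NOT IN, without expanding the list into a chain of OR/AND terms.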

It may also be used to represent the predicate in the following query:

SELECT * FROM bar WHERE (a,b) IN / NOT IN ((1,1), (2,2), (3,3), ... (1000, 1000));

Furthermore, it is extensible. The op is not limited to equals or not 
equals; it can also be >, <, >=, <=, IDF (IS DISTINCT FROM), INDF (IS NOT 
DISTINCT FROM), or even a custom SQL operator such as the geospatial 
intersection operator:
boolean &&( geometry A , geometry B )
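
As a rough illustration of that extensibility, the standalone sketch below plugs two different operators into the same "operator + list + quantifier" shape. Everything in it is an assumption made for the example: ExtensibleOpSketch, Interval, intersects, any and all are hypothetical names, and a one-dimensional interval stands in for a geometry value. NULL handling is again ignored.

```java
import java.util.List;
import java.util.function.BiPredicate;

public class ExtensibleOpSketch {
  /** Hypothetical closed interval [lo, hi], standing in for a geometry. */
  static final class Interval {
    final int lo;
    final int hi;

    Interval(int lo, int hi) {
      this.lo = lo;
      this.hi = hi;
    }
  }

  /** Stand-in for a geospatial "&&" operator: true if the intervals overlap. */
  static boolean intersects(Interval a, Interval b) {
    return a.lo <= b.hi && b.lo <= a.hi;
  }

  /** op(left, v) holds for at least one list element v. */
  static <T> boolean any(BiPredicate<T, T> op, T left, List<T> list) {
    return list.stream().anyMatch(v -> op.test(left, v));
  }

  /** op(left, v) holds for every list element v. */
  static <T> boolean all(BiPredicate<T, T> op, T left, List<T> list) {
    return list.stream().allMatch(v -> op.test(left, v));
  }

  public static void main(String[] args) {
    // a > ALL (1, 2, 3)
    BiPredicate<Integer, Integer> gt = (x, y) -> x > y;
    System.out.println(all(gt, 10, List.of(1, 2, 3))); // true
    System.out.println(all(gt, 2, List.of(1, 2, 3)));  // false

    // g && ANY (region1, region2), using the custom operator
    List<Interval> regions = List.of(new Interval(0, 5), new Interval(20, 30));
    BiPredicate<Interval, Interval> overlap = ExtensibleOpSketch::intersects;
    System.out.println(any(overlap, new Interval(4, 8), regions));  // true
    System.out.println(any(overlap, new Interval(6, 10), regions)); // false
  }
}
```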

Thoughts?

Thanks,
Haisheng Yuan

