mayankshriv commented on pull request #5316:
URL: https://github.com/apache/incubator-pinot/pull/5316#issuecomment-625868385
> > > > can we simplify the way we define the expression?
> > > > for e.g.
> > > > `select distinctCountThetaSketch(tsCol, "dim1='foo'", "dim2='bar'", "dim1='foo' and dim2='bar'")`
> > > > to
> > > > `select distinctCountThetaSketch(tsCol, "dim1='foo'", "dim2='bar'", "1 AND 2")`
> > >
> > >
> > > Yes, agreed, this is a good idea, especially when predicates are
> > > long strings. Will address that in follow-up PRs.
> >
> >
> > Is it possible to do `select distinctCountThetaSketch(tsCol, ("dim1=foo" AND "dim2=bar"))`
> > and parse it to get what we want? Why is that a problem?
>
> One issue with auto-deriving is that we can only derive the lowest-level
> predicates, and not a combination if that was already applied. However, I am
> unsure we can get to that, so I am already trying to see if I can get rid of
> p1/p2/p3... in this PR (will update).
An advantage of explicitly specifying predicates is that complex predicates
(with and/or) can be specified and applied at once to the filtered
docs, which improves performance. For example, p1 can be of the form `(col1 = 1 and
col2 = 'x')`, as opposed to two separate predicates `col1 = 1` and `col2 = 'x'`.
In the future, if we support accepting the first form, it would probably be faster
to apply it at once, as opposed to applying the predicates individually and then
intersecting the results using theta-sketches.
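
To make the two strategies concrete, here is a rough sketch. It is not Pinot code: plain Python sets stand in for theta-sketches, and the rows and function names are hypothetical.

```python
# Hypothetical rows of (tsCol, col1, col2); exact sets stand in for theta-sketches.
rows = [
    ("a", 1, "x"),
    ("b", 1, "y"),
    ("c", 2, "x"),
    ("a", 1, "x"),
    ("d", 1, "x"),
]

def distinct_single_pass(rows):
    """Apply the compound predicate (col1 = 1 and col2 = 'x') once per row."""
    return {ts for ts, col1, col2 in rows if col1 == 1 and col2 == "x"}

def distinct_via_intersection(rows):
    """Build one set per leaf predicate, then intersect the two sets."""
    p1 = {ts for ts, col1, _ in rows if col1 == 1}
    p2 = {ts for ts, _, col2 in rows if col2 == "x"}
    return p1 & p2

print(sorted(distinct_single_pass(rows)))
print(sorted(distinct_via_intersection(rows)))
```

On this sample data both functions return the same set, but the second one materializes two intermediate sets before intersecting them. Note that the intersection works on values, not rows, so the two results can differ when the same `tsCol` value satisfies each predicate in different rows.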
BTW, I have updated the PR to make `p1, p2, p3...` optional, i.e. when omitted
they are derived from `postAggregationExpression`.
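
As a minimal sketch of what deriving predicates from the expression could look like (this is not the parser used in the PR; `derive_leaf_predicates` is a hypothetical helper, and it assumes string literals contain no and/or keywords and that predicates do not start or end with parentheses):

```python
import re
from typing import List

def derive_leaf_predicates(expression: str) -> List[str]:
    """Return the distinct lowest-level predicates of a boolean expression.

    Assumption: literals never contain ' and ' / ' or ', and leaf predicates
    do not themselves begin or end with parentheses.
    """
    # Split on the boolean operators, ignoring case.
    parts = re.split(r"\s+(?:and|or)\s+", expression, flags=re.IGNORECASE)
    leaves: List[str] = []
    for part in parts:
        # Drop grouping parentheses and surrounding whitespace.
        leaf = part.strip().strip("()").strip()
        if leaf and leaf not in leaves:
            leaves.append(leaf)
    return leaves

print(derive_leaf_predicates("dim1='foo' AND (dim2='bar' OR dim1='foo')"))
```

This only recovers the lowest-level predicates; a grouped combination such as `(col1 = 1 and col2 = 'x')` is split into its two leaves rather than kept as a single predicate.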