Re: Drill query planning error

2017-07-26 Thread Aman Sinha
[Since this is Drill specific, I put dev@calcite on BCC].

If you have two aggregates, Count(distinct a) and Count(distinct b), the
Calcite logical plan consists of a cartesian join of two subqueries, each of
which first does a group-by on the distinct column and then a count
aggregate. By default, Drill only processes a cartesian join if one input
of the join is known to be scalar (single row). It sounds like after you
did the transformation to use the cache, that scalar property somehow did
not get propagated.
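
To make the plan shape described above concrete, here is a minimal sketch of the equivalent SQL, run against SQLite through Python's sqlite3 module rather than against Drill. The table t, its columns, and the sample rows are hypothetical, and the rewritten query only approximates the logical shape Calcite produces; it is not Drill's actual plan output.

```python
import sqlite3

# Hypothetical sample data: column a has 3 distinct values, column b has 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 10), (1, 20), (2, 20), (3, 20)],
)

# The query as the user writes it.
direct = conn.execute(
    "SELECT COUNT(DISTINCT a), COUNT(DISTINCT b) FROM t"
).fetchone()

# Approximate shape after the rewrite: each side groups by the distinct
# column and then counts, producing exactly one row. The two single-row
# results are combined with a cartesian (cross) join.
rewritten = conn.execute(
    """
    SELECT ca.cnt, cb.cnt
    FROM (SELECT COUNT(a) AS cnt
          FROM (SELECT a FROM t GROUP BY a) AS ga) AS ca
    CROSS JOIN
         (SELECT COUNT(b) AS cnt
          FROM (SELECT b FROM t GROUP BY b) AS gb) AS cb
    """
).fetchone()

print(direct, rewritten)
```

Each side of the cross join is an aggregate with no group-by key, which is what makes it provably single-row. If a rule swaps one side for a cache scan, the planner may no longer be able to derive that property for that input.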
You can override this behavior with a session option. It makes the planner
use a nested loop join even if the inputs are not provably scalar, so it
should be set only for the specific query:
   alter session set `planner.enable_nljoin_for_scalar_only` = false;
For a more general solution, I believe you may have to create an
enhancement JIRA with appropriate details.

On Wed, Jul 26, 2017 at 4:14 AM, weijie tong wrote:

> Hi all,
>
>   I materialize the count distinct query result to a cache; when a user
> then runs a count distinct query, a specific rule translates the query to
> use the cache. This works correctly when the query has only one
> count(distinct) operator, but when it has two count(distinct) operators it
> causes an error. The error info is here:
> https://gist.github.com/weijietong/1b8ed12db9490bf006e8b3fe0ee52269
>
>
> Best Regards.
>


Drill query planning error

2017-07-26 Thread weijie tong
Hi all,

  I materialize the count distinct query result to a cache; when a user
then runs a count distinct query, a specific rule translates the query to
use the cache. This works correctly when the query has only one
count(distinct) operator, but when it has two count(distinct) operators it
causes an error. The error info is here:
https://gist.github.com/weijietong/1b8ed12db9490bf006e8b3fe0ee52269


Best Regards.