Re: Optimizing SUM(1) query

Julian Hyde Fri, 19 Feb 2016 12:12:19 -0800

And indeed COUNT(*) is equivalent to COUNT(1). COUNT(*) is the same as
COUNT(e) where e is any expression that is never null.


I would argue that SUM(1) should be optimized to COUNT(*). Or,
generalizing a bit, that SUM(c) should be optimized to COUNT(*) * c.
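
To make the equivalence concrete, here is a minimal sketch using Python's
built-in sqlite3 module rather than Drill or Calcite. The table t and its
columns g and x are made up for illustration. It also shows one case a
rewrite rule would need to handle: over an empty input with no GROUP BY,
SUM returns NULL while COUNT(*) returns 0.

```python
import sqlite3

# Hypothetical table, used only to compare the aggregates.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (g TEXT, x INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 10), ("a", None), ("b", 30)],
)


def one(sql):
    """Run a query and return the single value in its first row."""
    return conn.execute(sql).fetchone()[0]


count_star = one("SELECT COUNT(*) FROM t")
count_one = one("SELECT COUNT(1) FROM t")
count_x = one("SELECT COUNT(x) FROM t")   # x is nullable, so NULLs are skipped
sum_one = one("SELECT SUM(1) FROM t")
sum_five = one("SELECT SUM(5) FROM t")    # same as COUNT(*) * 5 on non-empty input

# With GROUP BY every group has at least one row, so the two always agree.
grouped = conn.execute(
    "SELECT g, SUM(1), COUNT(*) FROM t GROUP BY g ORDER BY g"
).fetchall()

# Without GROUP BY, an empty input is where the two differ.
sum_empty = one("SELECT SUM(1) FROM t WHERE 1 = 0")      # NULL -> None
count_empty = one("SELECT COUNT(*) FROM t WHERE 1 = 0")  # 0

print(count_star, count_one, count_x, sum_one, sum_five)
print(grouped)
print(sum_empty, count_empty)
```

So a rewrite of SUM(c) to COUNT(*) * c is safe under GROUP BY, but for a
global aggregate it would need to map a zero count back to NULL.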

IIRC, Hive performs that optimization. It's a bit tricky, because in
Calcite the expression will be in a Project below (hopefully directly
below) the Aggregate, and the Aggregate just sees a column. But using
RelMdPredicates you can see that the column is always equal to 1.
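
The shape of that rewrite can be sketched with a toy plan model. This is
not Calcite code: Scan, Project, Aggregate, AggCall, constant_of and
rewrite_sum_of_constant are all hypothetical names, and constant_of only
inspects the Project directly below the Aggregate, standing in for the
"this column is always equal to 1" fact that predicate metadata would
provide. The sketch ignores the empty-input case, where SUM yields NULL
and COUNT(*) yields 0.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Literal:
    value: int


@dataclass(frozen=True)
class ColumnRef:
    name: str


@dataclass(frozen=True)
class Project:
    input: Scan
    exprs: Tuple[Union[Literal, ColumnRef], ...]


@dataclass(frozen=True)
class AggCall:
    func: str            # "SUM" or "COUNT_STAR" in this toy model
    arg: Optional[int]   # index into the input's output columns
    scale: int = 1       # result is multiplied by this constant


@dataclass(frozen=True)
class Aggregate:
    input: Project
    calls: Tuple[AggCall, ...]


def constant_of(project: Project, index: int) -> Optional[int]:
    """Return the constant value of an output column, or None if unknown."""
    expr = project.exprs[index]
    return expr.value if isinstance(expr, Literal) else None


def rewrite_sum_of_constant(agg: Aggregate) -> Aggregate:
    """Replace SUM over a constant column c with COUNT(*) scaled by c."""
    new_calls = []
    for call in agg.calls:
        c = None
        if call.func == "SUM" and call.arg is not None:
            c = constant_of(agg.input, call.arg)
        if c is not None:
            new_calls.append(AggCall("COUNT_STAR", None, scale=c))
        else:
            new_calls.append(call)
    return Aggregate(agg.input, tuple(new_calls))


# SELECT SUM(1), SUM(x) FROM t, with the literal 1 living in the Project.
plan = Aggregate(
    Project(Scan("t"), (Literal(1), ColumnRef("x"))),
    (AggCall("SUM", 0), AggCall("SUM", 1)),
)
rewritten = rewrite_sum_of_constant(plan)
print(rewritten.calls)
```

Only the first call is rewritten; SUM(x) is left alone because nothing
is known about x.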

Julian


On Fri, Feb 19, 2016 at 11:44 AM, Aman Sinha <[email protected]> wrote:
> For #records, why would Tableau generate SUM(1) instead of COUNT(1)?
> Drill does not have a specific optimization for SUM(1). It does have an
> optimization for COUNT on Parquet data.
>
> Aman
>
> On Fri, Feb 19, 2016 at 10:16 AM, Sudip Mukherjee <[email protected]>
> wrote:
>
>> Hi,
>>
>> Has anyone tried optimizing SUM(1) queries in Drill? Or is it already
>> implemented? I am getting these queries while using Tableau. Most
>> probably it is trying to figure out NUMBER_OF_RECORDS.
>>
>> Thanks,
>> Sudip
