Re: Optimizing SUM(1) query

Julian Hyde Fri, 19 Feb 2016 12:15:06 -0800

PS I did recall correctly:
https://issues.apache.org/jira/browse/HIVE-6192. But it's not
implemented using Calcite, sadly.


On Fri, Feb 19, 2016 at 12:11 PM, Julian Hyde <jh...@apache.org> wrote:
> And indeed COUNT(*) is equivalent to COUNT(1). COUNT(*) is the same as
> COUNT(e) where e is any not-null value.
>
> I would argue that SUM(1) should be optimized to COUNT(*). Or,
> generalizing a bit, that SUM(c), for a constant c, should be optimized
> to COUNT(*) * c.
>
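
To make the equivalences above concrete, here is a minimal sketch that
checks them against SQLite (a stand-in chosen only because it ships with
Python; it is not Drill or Calcite). The table t and column x are
hypothetical. It also shows one edge case a rewrite would presumably have
to handle: with no input rows, SUM returns NULL while COUNT(*) returns 0.

```python
import sqlite3

# In-memory database with a hypothetical table; one row holds a NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(10,), (None,), (30,)])


def one_row(sql):
    """Run a query that yields a single row and return it as a tuple."""
    return conn.execute(sql).fetchone()


# COUNT(*) and COUNT(1) both count every row; COUNT(x) skips the NULL.
counts = one_row("SELECT COUNT(*), COUNT(1), COUNT(x) FROM t")

# SUM of a constant c compared with COUNT(*) * c on a non-empty input.
sums = one_row("SELECT SUM(1), COUNT(*), SUM(5), COUNT(*) * 5 FROM t")

# Edge case: the filter removes every row, so SUM is NULL but COUNT(*) is 0.
empty = one_row("SELECT SUM(1), COUNT(*) FROM t WHERE x > 100")

print(counts)
print(sums)
print(empty)
```

The empty-input difference only matters for an aggregate without GROUP BY,
since a group never contains zero rows.
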
> IIRC, Hive performs that optimization. It's a bit tricky, because in
> Calcite the expression will be in a Project below (hopefully directly
> below) the Aggregate, and the Aggregate just sees a column. But using
> RelMdPredicates you can see that the column is always equal to 1.
>
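
The shape of that rewrite can be sketched with a toy plan model. This is
not the Calcite API: Project, Aggregate, AggCall, Literal, ColumnRef and
rewrite_constant_sums are all hypothetical names, and instead of asking
RelMdPredicates whether a column is constant, the sketch simply looks for
a literal in the Project directly below the Aggregate.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Literal:
    value: int


@dataclass(frozen=True)
class ColumnRef:
    name: str


@dataclass(frozen=True)
class Project:
    # One expression per output column of the projection.
    exprs: Tuple[Union[Literal, ColumnRef], ...]


@dataclass(frozen=True)
class AggCall:
    func: str            # "SUM" or "COUNT"
    arg: Optional[int]   # index into the Project output; None means COUNT(*)
    scale: int = 1       # multiplier applied to the result, as in COUNT(*) * c


@dataclass(frozen=True)
class Aggregate:
    input: Project
    calls: Tuple[AggCall, ...]


def rewrite_constant_sums(agg: Aggregate) -> Aggregate:
    """Replace SUM over a literal column c with COUNT(*) scaled by c.

    Simplification: ignores the empty-input case, where SUM would be NULL.
    """
    new_calls = []
    for call in agg.calls:
        expr = agg.input.exprs[call.arg] if call.arg is not None else None
        if call.func == "SUM" and isinstance(expr, Literal):
            new_calls.append(AggCall("COUNT", None, scale=expr.value))
        else:
            new_calls.append(call)
    return Aggregate(agg.input, tuple(new_calls))


# SELECT SUM(1), SUM(x): only the first call sees a constant column.
plan = Aggregate(
    Project((Literal(1), ColumnRef("x"))),
    (AggCall("SUM", 0), AggCall("SUM", 1)),
)
rewritten = rewrite_constant_sums(plan)
print(rewritten.calls)
```

A real rule would also have to cope with the Project not being directly
below the Aggregate, which is where predicate metadata would come in.
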
> Julian
>
>
> On Fri, Feb 19, 2016 at 11:44 AM, Aman Sinha <amansi...@apache.org> wrote:
>> For #records, why would Tableau generate SUM(1) instead of COUNT(1)?
>> Drill does not have a specific optimization for SUM(1). It does have an
>> optimization for COUNT on Parquet data.
>>
>> Aman
>>
>> On Fri, Feb 19, 2016 at 10:16 AM, Sudip Mukherjee <smukher...@commvault.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Has anyone tried optimizing SUM(1) queries in Drill? Or is it already
>>> implemented? I am getting these queries while using Tableau. Most
>>> probably it is trying to figure out NUMBER_OF_RECORDS.
>>>
>>> Thanks,
>>> Sudip
