And indeed COUNT(*) is equivalent to COUNT(1). More generally, COUNT(*) is the same as COUNT(e) where e is any expression that is never null.
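A quick way to check the COUNT equivalence is to run it against a small table. The sketch below uses SQLite through Python's sqlite3 module purely as a convenient stand-in engine (neither Drill nor Calcite is involved); the table t and its nullable column x are made up for illustration.

```python
import sqlite3

# In-memory database with one nullable column; table and column names are
# hypothetical and exist only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [(10,), (None,), (30,)])

# COUNT(*) and COUNT(1) count every row; COUNT(x) skips rows where x is NULL.
count_star, count_one, count_x = conn.execute(
    "SELECT COUNT(*), COUNT(1), COUNT(x) FROM t"
).fetchone()
conn.close()

print(count_star, count_one, count_x)  # prints: 3 3 2
```

COUNT(x) differs only because x can be null here; with a NOT NULL column it would match COUNT(*) as well.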
I would argue that SUM(1) should be optimized to COUNT(*). Or, generalizing a bit, that SUM(c), for a constant c, should be optimized to COUNT(*) * c. IIRC, Hive performs that optimization. It's a bit tricky, because in Calcite the expression will be in a Project below (hopefully directly below) the Aggregate, and the Aggregate just sees a column. But using RelMdPredicates you can see that the column is always equal to 1.

Julian

On Fri, Feb 19, 2016 at 11:44 AM, Aman Sinha <amansi...@apache.org> wrote:
> For #records, why would Tableau generate sum(1) instead of count(1)?
> Drill does not have a specific optimization for sum(1). It does have an
> optimization for count for Parquet data.
>
> Aman
>
> On Fri, Feb 19, 2016 at 10:16 AM, Sudip Mukherjee <smukher...@commvault.com>
> wrote:
>
>> Hi,
>>
>> Has anyone tried optimizing SUM(1) queries in Drill? Or is it implemented?
>> I am getting these queries while using Tableau. Most probably it is trying
>> to figure out NUMBER_OF_RECORDS.
>>
>> Thanks,
>> Sudip
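As a footnote to the SUM(c) rewrite proposed at the top of this message, here is a minimal sketch comparing SUM(c) with COUNT(*) * c. It again uses SQLite via Python's sqlite3 as a stand-in engine, not Drill, Hive, or Calcite; the helper sum_vs_count and the table t are hypothetical names introduced only for this example.

```python
import sqlite3

def sum_vs_count(rows, c):
    """Return (SUM(c), COUNT(*) * c) over a hypothetical table t with the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.executemany("INSERT INTO t (x) VALUES (?)", [(r,) for r in rows])
    # c is inlined as an integer literal, mimicking a constant in the query text.
    query = f"SELECT SUM({int(c)}), COUNT(*) * {int(c)} FROM t"
    result = conn.execute(query).fetchone()
    conn.close()
    return result

ones = sum_vs_count([10, None, 30], 1)    # (3, 3)
scaled = sum_vs_count([10, None, 30], 5)  # (15, 15)
empty = sum_vs_count([], 1)               # (None, 0)
print(ones, scaled, empty)
```

The two forms agree whenever there is at least one input row. Over an empty input with no GROUP BY, SUM returns NULL while COUNT(*) * c returns 0, so a rewrite rule would presumably need to guard that case (for example with a CASE on COUNT(*) = 0) or apply only to grouped aggregates, where every group has at least one row.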