Hi,
I've been looking lately at the possibility of writing a custom UDAF, and I
noticed that the function interface supports only sequential aggregation of
all results into a single final value. While the COUNT operator is
internally planned as a composition of two aggregation stages, other
aggregati
uld easily add something
> onto
> > > > Drill like this, it'd be hugely beneficial.
> > > >
> > > > On Wed, Apr 8, 2015 at 8:25 AM, Ted Dunning
> > > wrote:
> > > >
> > > > > Marcin,
> > > > >
> > >
t that your data is already known to be sorted and
> > thus the sort step should be omitted?
> >
> >
> > On Tue, Apr 7, 2015 at 3:21 PM, Marcin Karpinski
> > wrote:
> >
> > > @Jacques, thanks for the information - I'm definitely going to check
> out
wrote:
> >
> > > Drill already does most of this type of transformation. If you do an
> > > 'EXPLAIN PLAN FOR '
> > > you will see that it first does a grouping on the column and then
> applies
> > > the COUNT(column). The first level grouping can be
That would be great - I'm all ears :)
On Tue, Apr 7, 2015 at 7:22 PM, Ted Dunning wrote:
> On Tue, Apr 7, 2015 at 9:19 AM, Marcin Karpinski
> wrote:
>
> > @Ted, ideally, I'd like to get exact results, but in case of real
> > problems, we could perhaps set
grouping on the column and then applies
> > the COUNT(column). The first level grouping can be done either based on
> > sorting or hashing and this is configurable through a system option.
> >
> > Aman
> >
> > On Tue, Apr 7, 2015 at 3:30 AM, Marcin Karpinski
>
> 'EXPLAIN PLAN FOR '
> you will see that it first does a grouping on the column and then applies
> the COUNT(column). The first level grouping can be done either based on
> sorting or hashing and this is configurable through a system option.
>
> Aman
>
> On Tu
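The two-step plan described in the quoted reply above (group on the column first, then count the groups, with the grouping done by either hashing or sorting) can be illustrated with a small standalone sketch. This is not Drill code and does not use any Drill API; the function names are hypothetical and chosen only for this example. It also assumes the usual SQL behaviour that NULLs are not counted.

```python
from itertools import groupby


def count_distinct_hashed(values):
    """Hash-based grouping: collect each distinct value once, then count the groups.

    Memory grows with the number of distinct values.
    """
    groups = set()
    for v in values:
        if v is not None:  # assumption: NULLs are ignored, as in standard SQL
            groups.add(v)
    return len(groups)


def count_distinct_sorted(sorted_values):
    """Sort-based (streaming) grouping: the input must already be sorted.

    Equal values are adjacent, so each run of equal values is one group
    and only the current value has to be kept in memory.
    """
    count = 0
    for key, _run in groupby(sorted_values):
        if key is not None:
            count += 1
    return count


def merge_partial_sets(partials):
    """Combine per-partition sets of distinct values into one final count.

    This mimics a two-stage aggregation: each partition produces a partial
    result, and a final stage merges them.
    """
    return len(set().union(*partials))


if __name__ == "__main__":
    data = ["a", "b", "a", "c", "b"]
    print(count_distinct_hashed(data))
    print(count_distinct_sorted(sorted(data)))
    print(merge_partial_sets([{"a", "b"}, {"b", "c"}]))
```

The sorted variant is the one relevant to pre-sorted data: it needs constant extra memory per stream, whereas the hashed variant has to hold every distinct value at once.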
Hi Guys,
I have a specific use case for Drill in which I'd like to count the
unique values in columns with tens of millions of distinct values. COUNT
DISTINCT, unfortunately, does not scale well in either time or memory,
and the idea is to sort the data beforehand by the values of that