The other big missing piece is that the working variables can't hold complex
data.
That means that I can't write:
- my own form of count unique
- approximate aggregates like HyperLogLog or t-digest
- anything that constructs complex output, like list_aggregate
This is just as bad as the lack of two-
Merging is the main missing thing. Drill supports building custom aggregate
functions. However, those currently run in a single thread per
grouping. Generally, it is much better to do a two-phase aggregate for
custom functions, but the interface doesn't yet support that.
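
To make the two-phase idea concrete, here is a rough sketch in plain Java of
a count-unique aggregate with a merge step. This is not Drill's aggregate
function interface; the class and method names (TwoPhaseCountDistinct,
accumulate, merge, output) are hypothetical and only illustrate what a
mergeable aggregate with a complex working variable (a set) might look like.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: this is NOT Drill's UDAF interface.
public class TwoPhaseCountDistinct {

    // Phase 1: each worker builds a partial state from its own rows.
    // The state is a set, i.e. a "complex" working variable.
    static Set<String> accumulate(List<String> rows) {
        Set<String> state = new HashSet<>();
        for (String row : rows) {
            state.add(row);
        }
        return state;
    }

    // Phase 2: combine two partial states without mutating either input.
    static Set<String> merge(Set<String> left, Set<String> right) {
        Set<String> merged = new HashSet<>(left);
        merged.addAll(right);
        return merged;
    }

    // Final step: turn the merged state into the aggregate's result.
    static long output(Set<String> state) {
        return state.size();
    }

    public static void main(String[] args) {
        // Two partial aggregates, as if computed by separate threads.
        Set<String> a = accumulate(List.of("x", "y", "x"));
        Set<String> b = accumulate(List.of("y", "z"));
        System.out.println(output(merge(a, b))); // prints 3
    }
}
```

Without a merge step, all rows for a grouping have to pass through a single
accumulator, which matches the single-thread-per-grouping behavior described
above.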
--
https://drill.apache.org/docs/developing-an-aggregate-function/
I see that custom aggregate functions are marked as alpha and for experimental
usage only.
What features or aspects are missing to make this a 'ready to deploy in
production' capability?
I'd appreciate a response.
thanks
-Neeraja