Ted Dunning created DRILL-3462:
----------------------------------
Summary: There appears to be no way to have complex intermediate state
Key: DRILL-3462
URL: https://issues.apache.org/jira/browse/DRILL-3462
Project: Apache Drill
Issue Type: Bug
Reporter: Ted Dunning
After spending several frustrating days on the problem (see also DRILL-3461),
it appears that there is no viable idiom for building an aggregator that has
internal state that is anything more than a scalar.
What is needed is:
1) The ability to allocate a Repeated* type for use in Workspace variables.
Currently, new works to get the basic structure, but there is no good way to
allocate the corresponding vector.
2) The ability to use and to allocate a ComplexWriter in the Workspace
variables.
3) The ability to write a UDAF that supports multi-phase aggregation. It would
be just fine if I simply had to write a combine method on my UDAF class. I
don't think that there is any way to infer such a combiner from the parameters
and workspace variables. An alternative API would be to have a form of the
output function that is given an Iterable<OutputClass>, but that is probably
much less efficient than simply having a combine method that is called
repeatedly.
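To make the request in (3) concrete, here is a minimal sketch of what an explicit combine method could look like. It is plain Java and does not use any Drill API: the interface CombinableAggregator, its method names, and the MedianAggregator example are all hypothetical and chosen only for illustration. The state is a list rather than a scalar, which is the kind of intermediate state the issue is about.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical interface (NOT part of Drill): an aggregator whose partial
// states can be merged, which is what multi-phase aggregation needs.
interface CombinableAggregator<I, S, O> {
    S setup();                        // create an empty intermediate state
    void add(S state, I input);       // phase 1: fold one input into a partial state
    void combine(S target, S other);  // phase 2: merge another partial state into target
    O output(S state);                // produce the final result
}

// Illustrative aggregator with non-scalar state: it keeps every value seen
// and reports the median. A sum or count would not need a list at all.
final class MedianAggregator
        implements CombinableAggregator<Double, List<Double>, Double> {

    @Override
    public List<Double> setup() {
        return new ArrayList<>();
    }

    @Override
    public void add(List<Double> state, Double input) {
        state.add(input);
    }

    @Override
    public void combine(List<Double> target, List<Double> other) {
        // Called once per partial state, so no Iterable of states is required.
        target.addAll(other);
    }

    @Override
    public Double output(List<Double> state) {
        if (state.isEmpty()) {
            return Double.NaN;
        }
        List<Double> sorted = new ArrayList<>(state);
        Collections.sort(sorted);
        int n = sorted.size();
        return n % 2 == 1
                ? sorted.get(n / 2)
                : (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
    }
}
```

In this sketch each first-phase fragment would call setup and add on its own state, and the second phase would call combine repeatedly before a single call to output. How such a method would be wired into Drill's UDAF code generation is not addressed here.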
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)