Just to close out this thread...

I got my final UDFs to work. I ended up with two: one to build an array of
values and one to calculate a simple linear regression. The test data set
was a simple y = x line (slope 1), so predicting for 22356 should return 22356:

SELECT MyLinearRegression2(xValues, yValues, CAST(22356 AS BIGINT)) AS xPerdict
FROM (SELECT MyList(test_field1) AS xValues, MyList(test_field2) AS yValues
      FROM (SELECT test_field1, test_field2
            FROM `hive.default`.`my_hive_table` LIMIT 10));
+-----------+
| xPerdict  |
+-----------+
| 22356.0   |
+-----------+
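
For anyone who finds this thread later, the arithmetic behind the regression
step can be sketched in plain Java. This is not the UDF source (that was not
posted here). The class and method names below are made up for illustration,
and the sketch uses no Drill types at all, just ordinary least squares over
two arrays, assuming the list-building UDF has already collected the values.

```java
import java.util.stream.LongStream;

/** Ordinary least-squares fit of y = intercept + slope * x (hypothetical helper, no Drill types). */
final class LinearRegressionSketch {

    /** Predicts y at xQuery from paired samples; xs and ys must have equal length of at least 2. */
    static double predict(long[] xs, long[] ys, long xQuery) {
        if (xs.length != ys.length || xs.length < 2) {
            throw new IllegalArgumentException("need at least two paired samples");
        }
        double meanX = LongStream.of(xs).average().orElse(0.0);
        double meanY = LongStream.of(ys).average().orElse(0.0);

        double sxy = 0.0; // sum of (x - meanX) * (y - meanY)
        double sxx = 0.0; // sum of (x - meanX)^2
        for (int i = 0; i < xs.length; i++) {
            double dx = xs[i] - meanX;
            sxy += dx * (ys[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("all x values are identical; slope is undefined");
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return intercept + slope * xQuery;
    }

    public static void main(String[] args) {
        // Stand-in for the y = x test data; the real table contents were not posted.
        long[] xs = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        long[] ys = xs.clone();
        System.out.println(predict(xs, ys, 22356L));
    }
}
```

With y = x input the fitted slope is 1 and the intercept is 0, so the
prediction simply echoes the query value, which matches the query result above.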


On Sun, Jul 5, 2015 at 4:10 PM, Jacques Nadeau <jacq...@apache.org> wrote:

> You're right.  You're off the beaten path. I think everyone here would love
> to have more documentation and more comments. Of course, all of these take
> time.
>
> If you have time to volunteer to help improve these things, that would be
> great.
>
> With regards to the question about the jira, describe your use case and
> what functionality you couldn't find or make work. The active developers on
> the project can then do their best to help shape the Jira into better docs,
> javadocs and/or new functionality as time allows.
>
> On Jul 5, 2015 1:37 PM, "Ted Dunning" <ted.dunn...@gmail.com> wrote:
>
> > Uh... actually, I think that it isn't obvious because there is absolutely
> > no documentation and there are no comments in the code.
> >
> > And what should the JIRA say?  We can't even tell what's missing, if
> > anything, because we can't tell how it is supposed to work.
> >
> >
> >
> >
> > On Sun, Jul 5, 2015 at 11:50 AM, Jacques Nadeau <jacq...@apache.org>
> > wrote:
> >
> > > It isn't obvious because you shouldn't do it.  Please file a JIRA to
> add
> > > real support for this type of output.
> > >
> > > Your current function would leak large amounts of memory that would
> > > ultimately crash the node.
> > >
> > > Realistically, there are very few internal Drill APIs that you should
> > > access via a UDF (injectables, holders, complexwriter, fieldreader and
> > > helpers).  A post 1.0 goal was to provide a UDF interface JAR to ensure
> > > people don't accidentally reach into Drill's internals.  (A later
> > > possibility is bytecode weaving to completely protect against it).
> > >
> > > J
> > >
> > > On Sun, Jul 5, 2015 at 11:36 AM, Ted Dunning <ted.dunn...@gmail.com>
> > > wrote:
> > >
> > > > That was impressively non-obvious.
> > > >
> > > >
> > > >
> > > > On Sat, Jul 4, 2015 at 6:40 PM, Jim Bates <jba...@maprtech.com>
> wrote:
> > > >
> > > > > I did get a new RepeatedBigIntHolder built and added a BigIntVector
> > > added
> > > > > to it. I'll try it in the UDF tomorrow and see if there is a
> > difference
> > > > in
> > > > > the ways I found to get a BufferAllocator.
> > > > >
> > > > > .
> > > > > .
> > > > > .
> > > > > @Inject DrillBuf buffer;
> > > > > @Workspace RepeatedBigIntHolder yList;
> > > > > .
> > > > > .
> > > > > .
> > > > > @Override
> > > > > public void setup() {
> > > > > .
> > > > > .
> > > > > .
> > > > > > //org.apache.drill.exec.memory.BufferAllocator allocator =
> > > > > > //    buffer.getAllocator();
> > > > > > org.apache.drill.exec.memory.BufferAllocator allocator =
> > > > > >     new org.apache.drill.exec.memory.TopLevelAllocator();
> > > > > > yList = new RepeatedBigIntHolder();
> > > > > > yList.vector = new org.apache.drill.exec.vector.BigIntVector(
> > > > > >     org.apache.drill.exec.record.MaterializedField.create(
> > > > > >         new org.apache.drill.common.expression.SchemaPath("bigints",
> > > > > >             org.apache.drill.common.expression.ExpressionPosition.UNKNOWN),
> > > > > >         org.apache.drill.common.types.Types.optional(
> > > > > >             org.apache.drill.common.types.TypeProtos.MinorType.BIGINT)),
> > > > > >     allocator);
> > > > > .
> > > > > .
> > > > > .
> > > > > }
> > > > >
> > > > >
> > > > >
> > > > > On Sat, Jul 4, 2015 at 7:39 PM, Jim Bates <jba...@maprtech.com>
> > wrote:
> > > > >
> > > > > > I still have issues finding the correct way to create and use a
> > > > > > RepeatedHolder and Writers are a non starter for Workspace
> values.
> > I
> > > > can
> > > > > > make do with creating a concatenated string in a VarCharHolder
> for
> > > > small
> > > > > > data sets to get past this in the short term and finish testing
> the
> > > > > output
> > > > > > values I expect but won't be able to do any scale till I figure
> out
> > > how
> > > > > to
> > > > > > make a repeated list.
> > > > > >
> > > > > > On Sat, Jul 4, 2015 at 7:12 PM, Jim Bates <jba...@maprtech.com>
> > > wrote:
> > > > > >
> > > > > > >> Well... Converting from string to integers anyway... Too many 4th
> > of
> > > > July
> > > > > >> Hot Dogs. going into nitrate overload. :)
> > > > > >>
> > > > > >> I am pulling an array of string values from json data. The
> string
> > > > values
> > > > > >> are actually integers. I am converting to integers and summing
> > each
> > > > > >> array entry to the final tally.
> > > > > >>
> > > > > >> On Sat, Jul 4, 2015 at 7:04 PM, Jim Bates <jba...@maprtech.com>
> > > > wrote:
> > > > > >>
> > > > > >>> Ted,
> > > > > >>>
> > > > > >>> Yes, I started out just getting a basic count to work. I am
> > trying
> > > to
> > > > > >>> keep the workflow as close to a basic user as possible. As
> such,
> > I
> > > am
> > > > > >>> building and using the MapR Apache Drill sandbox to test.
> > > > > >>>
> > > > > >>>
> > > > > > >>>    1. Always look at the drillbit.log file to see if Drill had
> > any
> > > > > >>>    issues loading your UDF. That was where I learned that all
> > > > > workspace values
> > > > > >>>    needed to be holders
> > > > > >>>       -
> > > > > >>>       - WARN  o.a.d.exec.expr.fn.FunctionConverter - Failure
> > > loading
> > > > > >>>       function class
> > > > > >>>
> > > > >
> com.mapr.example.udfs.drill.MyDrillAggFunctions$MyLinearRegression1,
> > > > field
> > > > > >>>       xList. Aggregate function 'MyLinearRegression1' workspace
> > > > > variable 'xList'
> > > > > >>>       is of type 'interface
> > > > > >>>
> > > > >
> > org.apache.drill.exec.vector.complex.writer.BaseWriter$ComplexWriter'.
> > > > > >>>       Please change it to Holder type.
> > > > > >>>    2. Error messages:
> > > > > >>>       - If you get an error in this format it means that Drill
> > can
> > > > not
> > > > > >>>       find your function so it probably didn't load it. back to
> > > step
> > > > 1:
> > > > > >>>          -
> > > > > >>>          - PARSE ERROR: From line 1, column 8 to line 1, column
> > 44:
> > > > No
> > > > > >>>          match found for function signature
> MyFunctionName(<ANY>)
> > > > > >>>       - If you get an error in this format it means that the
> > > function
> > > > > >>>       is there but Drill could not find a signature that
> matched
> > > the
> > > > > param types
> > > > > >>>       or param numbers you were passing it. The exact wording
> > will
> > > > > change but
> > > > > >>>       the Missing function implementation is the key phrase to
> > look
> > > > > for:
> > > > > >>>          -
> > > > > >>>          - Error: SYSTEM ERROR:
> > > > > >>>          org.apache.drill.exec.exception.SchemaChangeException:
> > > > > Failure while trying
> > > > > >>>          to materialize incoming schema.  Errors:
> > > > > >>>          - Error in expression at index -1.  Error: Missing
> > > function
> > > > > >>>          implementation: [castBIGINT(VARCHAR-REPEATED)].  Full
> > > > > expression: --UNKNOWN
> > > > > >>>          EXPRESSION--
> > > > > >>>       3. In your function definition for aggregate functions
> you
> > > need
> > > > > >>>    to set null processing to internal and your isRandom to
> false.
> > > > > Example
> > > > > >>>    below:
> > > > > >>>       -
> > > > > >>>       - @FunctionTemplate(name = "MyFunctionName", scope =
> > > > > >>>       FunctionTemplate.FunctionScope.POINT_AGGREGATE, nulls =
> > > > > >>>       FunctionTemplate.NullHandling.INTERNAL, isRandom = false,
> > > > > >>>       isBinaryCommutative = false, costCategory =
> > > > > >>>       FunctionTemplate.FunctionCostCategory.COMPLEX)
> > > > > >>>
> > > > > >>> Below is an example from the Apache Drill tutorial data sets
> > > > contained
> > > > > > >>> in the MapR Apache Drill sandbox. I am pulling an array of
> string
> > > > > values
> > > > > >>> from json data. The string values are actually integers. I am
> > > > > converting to
> > > > > >>> string and summing each array entry to the final tally. This in
> > no
> > > > way
> > > > > >>> represents what this data was for but it did become a handy way
> > for
> > > > me
> > > > > to
> > > > > >>> peck out the "correct" way to build an aggregation UDF function
> > > > > >>>
> > > > > >>> @FunctionTemplate(name = "MyArraySum", scope =
> > > > > >>> FunctionTemplate.FunctionScope.POINT_AGGREGATE, nulls =
> > > > > >>> FunctionTemplate.NullHandling.INTERNAL, isRandom = false,
> > > > > >>> isBinaryCommutative = false, costCategory =
> > > > > >>> FunctionTemplate.FunctionCostCategory.COMPLEX)
> > > > > >>> public static class MyArraySum implements DrillAggFunc {
> > > > > >>>
> > > > > >>> @Param RepeatedVarCharHolder listToSearch;
> > > > > >>> @Workspace NullableBigIntHolder count;
> > > > > >>> @Workspace NullableBigIntHolder sum;
> > > > > >>> @Workspace NullableVarCharHolder vc;
> > > > > >>> @Output BigIntHolder out;
> > > > > >>>
> > > > > >>> @Override
> > > > > >>> public void setup() {
> > > > > >>> count.value=0;
> > > > > >>> sum.value = 0;
> > > > > >>> }
> > > > > >>>
> > > > > >>> @Override
> > > > > >>> public void add() {
> > > > > >>> int c = listToSearch.end - listToSearch.start;
> > > > > >>> int val = 0;
> > > > > >>> try {
> > > > > >>> for(int i=0; i<c; i++){
> > > > > > >>> listToSearch.vector.getAccessor().get(listToSearch.start + i, vc);
> > > > > > >>> String inputStr =
> > > > > > >>>     org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.toStringFromUTF8(
> > > > > > >>>         vc.start, vc.end, vc.buffer);
> > > > > >>> val = Integer.parseInt(inputStr);
> > > > > >>> sum.value = sum.value + val;
> > > > > >>> }
> > > > > >>> } catch (Exception e) {
> > > > > >>> val = 0;
> > > > > >>> }
> > > > > >>> count.value = count.value + 1;
> > > > > >>> }
> > > > > >>>
> > > > > >>> Example select statement:
> > > > > >>> SELECT MyArraySum(my_arrays) FROM (SELECT t.trans_info.prod_id
> as
> > > > > >>> my_arrays FROM `dfs.clicks`.`./clicks/clicks.campaign.json` t
> > limit
> > > > 5);
> > > > > >>>
> > > > > >>> On Sat, Jul 4, 2015 at 6:22 PM, Ted Dunning <
> > ted.dunn...@gmail.com
> > > >
> > > > > >>> wrote:
> > > > > >>>
> > > > > >>>> Jim,
> > > > > >>>>
> > > > > >>>> I think that you may be having trouble with aggregators in
> > > general.
> > > > > >>>>
> > > > > >>>> Have you been able to build *any* aggregator of anything?  I
> > > > haven't.
> > > > > >>>>
> > > > > >>>> When I try to build an aggregator of int's or doubles, I get a
> > > very
> > > > > >>>> persistent problem with Drill even seeing my aggregates:
> > > > > >>>>
> > > > > >>>> 0: jdbc:drill:zk=local> *select sum_int(employee_id) from
> > > > > >>>> cp.`employee.json`;*
> > > > > >>>>
> > > > > >>>> Jul 04, 2015 4:19:35 PM
> > > > > >>>> org.apache.calcite.sql.validate.SqlValidatorException <init>
> > > > > >>>>
> > > > > >>>> SEVERE: org.apache.calcite.sql.validate.SqlValidatorException:
> > No
> > > > > match
> > > > > >>>> found for function signature sum_int(<ANY>)
> > > > > >>>>
> > > > > >>>> Jul 04, 2015 4:19:35 PM
> > > org.apache.calcite.runtime.CalciteException
> > > > > >>>> <init>
> > > > > >>>>
> > > > > >>>> SEVERE: org.apache.calcite.runtime.CalciteContextException:
> From
> > > > line
> > > > > 1,
> > > > > >>>> column 8 to line 1, column 27: No match found for function
> > > signature
> > > > > >>>> sum_int(<ANY>)
> > > > > >>>>
> > > > > >>>> *Error: PARSE ERROR: From line 1, column 8 to line 1, column
> 27:
> > > No
> > > > > >>>> match
> > > > > >>>> found for function signature sum_int(<ANY>)*
> > > > > >>>>
> > > > > >>>> *[Error Id: 91b78fa6-6dd1-4214-a85f-c2bf2c393145 on
> > > 10.0.1.2:31010
> > > > > >>>> <http://10.0.1.2:31010>] (state=,code=0)*
> > > > > >>>>
> > > > > >>>> 0: jdbc:drill:zk=local> *select sum_int(cast(employee_id as
> > int))
> > > > from
> > > > > >>>> cp.`employee.json`*;
> > > > > >>>>
> > > > > >>>> Jul 04, 2015 4:19:45 PM
> > > > > >>>> org.apache.calcite.sql.validate.SqlValidatorException <init>
> > > > > >>>>
> > > > > >>>> SEVERE: org.apache.calcite.sql.validate.SqlValidatorException:
> > No
> > > > > match
> > > > > >>>> found for function signature sum_int(<NUMERIC>)
> > > > > >>>>
> > > > > >>>> Jul 04, 2015 4:19:45 PM
> > > org.apache.calcite.runtime.CalciteException
> > > > > >>>> <init>
> > > > > >>>>
> > > > > >>>> SEVERE: org.apache.calcite.runtime.CalciteContextException:
> From
> > > > line
> > > > > 1,
> > > > > >>>> column 8 to line 1, column 40: No match found for function
> > > signature
> > > > > >>>> sum_int(<NUMERIC>)
> > > > > >>>>
> > > > > >>>> *Error: PARSE ERROR: From line 1, column 8 to line 1, column
> 40:
> > > No
> > > > > >>>> match
> > > > > >>>> found for function signature sum_int(<NUMERIC>)*
> > > > > >>>>
> > > > > >>>> *[Error Id: f649fc85-6b6a-4468-9a4f-bfef0b23d06b on
> > > 10.0.1.2:31010
> > > > > >>>> <http://10.0.1.2:31010>] (state=,code=0)*
> > > > > >>>>
> > > > > >>>> 0: jdbc:drill:zk=local>
> > > > > >>>>
> > > > > >>>>
> > > > > >>>> It looks like there is some undocumented subtlety about how to
> > > > > register
> > > > > >>>> an
> > > > > >>>> aggregator.
> > > > > >>>>
> > > > > >>>> On Sat, Jul 4, 2015 at 4:08 PM, Jim Bates <
> jba...@maprtech.com>
> > > > > wrote:
> > > > > >>>>
> > > > > >>>> > I'm working on the same thing. I want to aggregate a list of
> > > > values.
> > > > > >>>> It has
> > > > > >>>> > been a search and guess game for the most part. I'm still
> > stuck
> > > in
> > > > > the
> > > > > >>>> > process of getting the values all into a list. The writers
> > look
> > > > > >>>> interesting
> > > > > >>>> > but for aggregation functions  it looks like the input is
> the
> > > > param
> > > > > >>>> and
> > > > > >>>> > output objects can't hold the aggregations steps. The
> > Workspace
> > > is
> > > > > >>>> where
> > > > > >>>> > that happens. If I try and use a Writer in a workspace it
> > won't
> > > > load
> > > > > >>>> and
> > > > > >>>> > tells me to change it to Holders which was why I was using
> > them
> > > to
> > > > > >>>> start
> > > > > >>>> > with. Maybe I'm missing the architecture of the agg
> function.
> > It
> > > > > >>>> looked
> > > > > >>>> > like it was....
> > > > > >>>> >
> > > > > >>>> > @Param comes in -> initialize @Workspace vars in setup ->
> > > process
> > > > > data
> > > > > >>>> > through @Workspace vars in add -> finalize @Output in
> output.
> > > > > >>>> >
> > > > > >>>> > So I'm back to trying to figure out how to create a
> > > > > >>>> RepeatedBigIntHolder or
> > > > > >>>> > a RepeatedVarCharHolder...
> > > > > >>>> >
> > > > > >>>> >
> > > > > >>>> >
> > > > > >>>> > On Sat, Jul 4, 2015 at 4:53 PM, Ted Dunning <
> > > > ted.dunn...@gmail.com>
> > > > > >>>> wrote:
> > > > > >>>> >
> > > > > >>>> > > I am working on trying to build any kind of list
> > constructing
> > > > > >>>> aggregator
> > > > > >>>> > > and having absolute fits.
> > > > > >>>> > >
> > > > > >>>> > > To simplify life, I decided to just build a generic list
> > > builder
> > > > > >>>> that is
> > > > > >>>> > a
> > > > > >>>> > > scalar function that returns a list containing its
> argument.
> > > > Thus
> > > > > >>>> > zoop(3)
> > > > > > >>>> > > => [3], zoop('abc') => ['abc'] and zoop([1,2,3]) =>
> [[1,2,3]].
> > > > > >>>> > >
> > > > > >>>> > > The ComplexWriter looks like the place to go. As usual,
> the
> > > > > >>>> complete lack
> > > > > >>>> > > of comments in most of Drill makes this very hard since I
> > have
> > > > to
> > > > > >>>> guess
> > > > > >>>> > > what works and what doesn't.
> > > > > >>>> > >
> > > > > >>>> > > In my code, I note that ComplexWriter has a nice
> > rootAsList()
> > > > > >>>> method.  I
> > > > > >>>> > > used this in zip and it works nicely to construct lists
> for
> > > > > >>>> output.  I
> > > > > >>>> > note
> > > > > >>>> > > that the resulting ListWriter has a method
> > > > copyReader(FieldReader
> > > > > >>>> var1)
> > > > > >>>> > > which looks really good.
> > > > > >>>> > >
> > > > > >>>> > > Unfortunately, the only implementation of copyReader() is
> in
> > > > > > >>>> > > AbstractFieldWriter and it looks like this:
> > > > > >>>> > >
> > > > > >>>> > > public void copyReader(FieldReader reader) {
> > > > > >>>> > >     this.fail("Copy FieldReader");
> > > > > >>>> > > }
> > > > > >>>> > >
> > > > > >>>> > > I would like to formally say at this point "WTF"?
> > > > > >>>> > >
> > > > > >>>> > > In digging in further, I see other methods that look handy
> > > like
> > > > > >>>> > >
> > > > > >>>> > > public void write(IntHolder holder) {
> > > > > >>>> > >     this.fail("Int");
> > > > > >>>> > > }
> > > > > >>>> > >
> > > > > >>>> > > And then in looking at implementations, it looks like
> there
> > > is a
> > > > > >>>> > > combinatorial explosion because every type seems to need a
> > > write
> > > > > >>>> method
> > > > > >>>> > for
> > > > > >>>> > > every other type.
> > > > > >>>> > >
> > > > > >>>> > > What is the thought here?  How can I copy an arbitrary
> value
> > > > into
> > > > > a
> > > > > >>>> list?
> > > > > >>>> > >
> > > > > >>>> > > My next thought was to build code that dispatches on type.
> > > > There
> > > > > >>>> is a
> > > > > >>>> > > method called getType() on the FieldReader.
> Unfortunately,
> > > that
> > > > > >>>> drives
> > > > > >>>> > > into code generated by protoc and I see no way to dispatch
> > on
> > > > the
> > > > > >>>> type of
> > > > > >>>> > > an incoming value.
> > > > > >>>> > >
> > > > > >>>> > >
> > > > > >>>> > > How is this supposed to work?
> > > > > >>>> > >
> > > > > >>>> > >
> > > > > >>>> > >
> > > > > >>>> > >
> > > > > >>>> > > On Sat, Jul 4, 2015 at 2:14 PM, mehant baid <
> > > > > baid.meh...@gmail.com>
> > > > > >>>> > wrote:
> > > > > >>>> > >
> > > > > > >>>> > > > For a detailed example on using ComplexWriter interface you can take a look
> > > > > > >>>> > > > at the Mappify
> > > > > > >>>> > > > <https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/Mappify.java>
> > > > > > >>>> > > > (kvgen) function. The function itself is very simple however it makes use
> > > > > > >>>> > > > of the utility methods in MappifyUtility
> > > > > > >>>> > > > <https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/MappifyUtility.java>
> > > > > > >>>> > > > and MapUtility
> > > > > > >>>> > > > <https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/MapUtility.java>
> > > > > > >>>> > > > which perform most of the work.
> > > > > >>>> > > >
> > > > > >>>> > > > Currently we don't have a generic infrastructure to
> handle
> > > > > errors
> > > > > >>>> > coming
> > > > > >>>> > > > out of functions. However there is UserException, which
> > when
> > > > > >>>> raised
> > > > > >>>> > will
> > > > > >>>> > > > make sure that Drill does not gobble up the error
> message
> > in
> > > > > that
> > > > > >>>> > > > exception. So you can probably throw a UserException
> with
> > > the
> > > > > >>>> failing
> > > > > >>>> > > input
> > > > > >>>> > > > in your function to make sure it propagates to the user.
> > > > > >>>> > > >
> > > > > >>>> > > > Thanks
> > > > > >>>> > > > Mehant
> > > > > >>>> > > >
> > > > > >>>> > > > On Sat, Jul 4, 2015 at 1:48 PM, Jacques Nadeau <
> > > > > >>>> jacq...@apache.org>
> > > > > >>>> > > wrote:
> > > > > >>>> > > >
> > > > > >>>> > > > > *Holders are for both input and output.  You can also
> > use
> > > > > > >>>> > ComplexWriter
> > > > > >>>> > > > for
> > > > > >>>> > > > > output and FieldReader for input if you want to write
> or
> > > > read
> > > > > a
> > > > > >>>> > complex
> > > > > >>>> > > > > value.
> > > > > >>>> > > > >
> > > > > >>>> > > > > I don't think we've provided a really clean way to
> > > > construct a
> > > > > >>>> > > > > Repeated*Holder for output purposes.  You can probably
> > do
> > > it
> > > > > by
> > > > > >>>> > > reaching
> > > > > >>>> > > > > into a bunch of internal interfaces in Drill.
> However,
> > I
> > > > > would
> > > > > >>>> > > recommend
> > > > > >>>> > > > > using the ComplexWriter output pattern for now.  This
> > will
> > > > be
> > > > > a
> > > > > >>>> > little
> > > > > >>>> > > > less
> > > > > >>>> > > > > efficient but substantially less brittle.  I suggest
> you
> > > > open
> > > > > >>>> up a
> > > > > >>>> > jira
> > > > > >>>> > > > for
> > > > > >>>> > > > > using a Repeated*Holder as an output.
> > > > > >>>> > > > >
> > > > > >>>> > > > > On Sat, Jul 4, 2015 at 1:38 PM, Ted Dunning <
> > > > > >>>> ted.dunn...@gmail.com>
> > > > > >>>> > > > wrote:
> > > > > >>>> > > > >
> > > > > >>>> > > > > > Holders are for input, I think.
> > > > > >>>> > > > > >
> > > > > >>>> > > > > > Try the different kinds of writers.
> > > > > >>>> > > > > >
> > > > > >>>> > > > > >
> > > > > >>>> > > > > >
> > > > > >>>> > > > > > On Sat, Jul 4, 2015 at 12:49 PM, Jim Bates <
> > > > > >>>> jba...@maprtech.com>
> > > > > >>>> > > > wrote:
> > > > > >>>> > > > > >
> > > > > >>>> > > > > > > Using a repeatedholder as a @param I've got
> > working. I
> > > > was
> > > > > >>>> > working
> > > > > >>>> > > > on a
> > > > > >>>> > > > > > > custom aggregator function using DrillAggFunc. In
> > > this I
> > > > > >>>> can do
> > > > > >>>> > > > simple
> > > > > > >>>> > > > > > > things but if I want to build a list of values and do
> > > > > >>>> something with
> > > > > >>>> > > it
> > > > > >>>> > > > in
> > > > > >>>> > > > > > the
> > > > > >>>> > > > > > > final output method I think I need to use
> > > > RepeatedHolders
> > > > > >>>> in the
> > > > > >>>> > > > > > > @Workspace. To do that I need to create a new one
> in
> > > the
> > > > > >>>> setup
> > > > > >>>> > > > method.
> > > > > >>>> > > > > I
> > > > > >>>> > > > > > > can't get one built. They all require a
> > > BufferAllocator
> > > > to
> > > > > >>>> be
> > > > > >>>> > > passed
> > > > > >>>> > > > in
> > > > > >>>> > > > > > to
> > > > > >>>> > > > > > > build it. I have not found a way to get an
> allocator
> > > > yet.
> > > > > >>>> Any
> > > > > >>>> > > > > > suggestions?
> > > > > >>>> > > > > > >
> > > > > >>>> > > > > > > On Sat, Jul 4, 2015 at 1:37 PM, Ted Dunning <
> > > > > >>>> > ted.dunn...@gmail.com
> > > > > >>>> > > >
> > > > > >>>> > > > > > wrote:
> > > > > >>>> > > > > > >
> > > > > >>>> > > > > > > > If you look at the zip function in
> > > > > >>>> > > > > > > >
> > > https://github.com/mapr-demos/simple-drill-functions
> > > > > you
> > > > > >>>> can
> > > > > >>>> > > have
> > > > > >>>> > > > an
> > > > > >>>> > > > > > > > example of building a structure.
> > > > > >>>> > > > > > > >
> > > > > >>>> > > > > > > > The basic idea is that your output is denoted as
> > > > > >>>> > > > > > > >
> > > > > >>>> > > > > > > >         @Output
> > > > > >>>> > > > > > > >         BaseWriter.ComplexWriter writer;
> > > > > >>>> > > > > > > >
> > > > > >>>> > > > > > > > The pattern for building a list of lists of
> > integers
> > > > is
> > > > > >>>> like
> > > > > >>>> > > this:
> > > > > >>>> > > > > > > >
> > > > > >>>> > > > > > > >         writer.setValueCount(n);
> > > > > >>>> > > > > > > >         ...
> > > > > >>>> > > > > > > >         BaseWriter.ListWriter outer =
> > > > > writer.rootAsList();
> > > > > >>>> > > > > > > >         outer.start(); // [ outer list
> > > > > >>>> > > > > > > >         ...
> > > > > >>>> > > > > > > >         // for each inner list
> > > > > >>>> > > > > > > >             BaseWriter.ListWriter inner =
> > > > outer.list();
> > > > > >>>> > > > > > > >             inner.start();
> > > > > >>>> > > > > > > >             // for each inner list element
> > > > > >>>> > > > > > > >
> > > > >  inner.integer().writeInt(accessor.get(i));
> > > > > >>>> > > > > > > >             }
> > > > > >>>> > > > > > > >             inner.end();   // ] inner list
> > > > > >>>> > > > > > > >         }
> > > > > >>>> > > > > > > >         outer.end(); // ] outer list
> > > > > >>>> > > > > > > >
> > > > > >>>> > > > > > > >
> > > > > >>>> > > > > > > >
> > > > > >>>> > > > > > > > On Sat, Jul 4, 2015 at 10:29 AM, Jim Bates <
> > > > > >>>> > jba...@maprtech.com>
> > > > > >>>> > > > > > wrote:
> > > > > >>>> > > > > > > >
> > > > > >>>> > > > > > > > > I have working aggregation and simple UDFs.
> I've
> > > > been
> > > > > >>>> trying
> > > > > >>>> > to
> > > > > >>>> > > > > > > document
> > > > > >>>> > > > > > > > > and understand each of the options available
> in
> > a
> > > > > Drill
> > > > > >>>> UDF.
> > > > > >>>> > > > > > > > Understanding
> > > > > >>>> > > > > > > > > the different FunctionScope's, the ones that
> are
> > > > > >>>> allowed, the
> > > > > >>>> > > > ones
> > > > > >>>> > > > > > that
> > > > > >>>> > > > > > > > are
> > > > > >>>> > > > > > > > > not. The impact of different cost categories.
> > The
> > > > > >>>> different
> > > > > >>>> > > > steps
> > > > > >>>> > > > > > > needed
> > > > > >>>> > > > > > > > > to understand handling any of the supported
> data
> > > > types
> > > > > >>>> and
> > > > > >>>> > > > > > structures
> > > > > >>>> > > > > > > in
> > > > > >>>> > > > > > > > > drill.
> > > > > >>>> > > > > > > > >
> > > > > >>>> > > > > > > > > Here are a few of my current road blocks. Any
> > > > pointers
> > > > > >>>> would
> > > > > >>>> > be
> > > > > >>>> > > > > > greatly
> > > > > >>>> > > > > > > > > appreciated.
> > > > > >>>> > > > > > > > >
> > > > > >>>> > > > > > > > >
> > > > > >>>> > > > > > > > >    1. I've been trying to understand how to
> > > > correctly
> > > > > >>>> use
> > > > > >>>> > > > > > > RepeatedHolders
> > > > > >>>> > > > > > > > >    of whatever type. For this discussion lets
> > > start
> > > > > >>>> with a
> > > > > >>>> > > > > > > > >    RepeatedBigIntHolder. I'm trying to figure
> > out
> > > > the
> > > > > >>>> best
> > > > > >>>> > way
> > > > > >>>> > > to
> > > > > >>>> > > > > > > create
> > > > > >>>> > > > > > > > a
> > > > > >>>> > > > > > > > > new
> > > > > >>>> > > > > > > > >    one. I have not figured out where in the
> > > existing
> > > > > >>>> drill
> > > > > >>>> > code
> > > > > >>>> > > > > > someone
> > > > > >>>> > > > > > > > > does
> > > > > >>>> > > > > > > > >    this. If I use a  RepeatedBigIntHolder as a
> > > > > Workspace
> > > > > >>>> > object
> > > > > > >>>> > > > it
> > > > > >>>> > > > > is
> > > > > >>>> > > > > > > > null
> > > > > >>>> > > > > > > > > to
> > > > > >>>> > > > > > > > >    start with. I created a new one in the
> > startup
> > > > > >>>> section of
> > > > > >>>> > > the
> > > > > >>>> > > > > udf
> > > > > >>>> > > > > > > but
> > > > > >>>> > > > > > > > > the
> > > > > >>>> > > > > > > > >    vector was null. I can find no reference in
> > > > > creating
> > > > > >>>> a new
> > > > > >>>> > > > > > > > BigIntVector.
> > > > > >>>> > > > > > > > >    There is a way to create a BigIntVector
> and I
> > > did
> > > > > >>>> find an
> > > > > >>>> > > > > example
> > > > > >>>> > > > > > of
> > > > > >>>> > > > > > > > >    creating a new VarCharVector but I can't do
> > > that
> > > > > >>>> using the
> > > > > >>>> > > > drill
> > > > > >>>> > > > > > jar
> > > > > >>>> > > > > > > > > files
> > > > > >>>> > > > > > > > >    from 1.0. The
> > > > > >>>> org.apache.drill.common.types.TypeProtos and
> > > > > >>>> > > > > > > > >    the
> > > > > >>>> org.apache.drill.common.types.TypeProtos.MinorType
> > > > > >>>> > > classes
> > > > > >>>> > > > > do
> > > > > >>>> > > > > > > not
> > > > > >>>> > > > > > > > >    appear to be accessible from the drill jar
> > > files.
> > > > > >>>> > > > > > > > >    2. What is the best way to close out a UDF
> in
> > > the
> > > > > >>>> event it
> > > > > >>>> > > > > > generates
> > > > > >>>> > > > > > > > an
> > > > > >>>> > > > > > > > >    exception? Are there specific steps one
> > should
> > > > > >>>> follow to
> > > > > >>>> > > make
> > > > > >>>> > > > a
> > > > > >>>> > > > > > > clean
> > > > > >>>> > > > > > > > > exit
> > > > > >>>> > > > > > > > >    in a catch block that are beneficial to
> > Drill?
> > > > > >>>> > > > > > > > >
> > > > > >>>> > > > > > > >
> > > > > >>>> > > > > > >
> > > > > >>>> > > > > >
> > > > > >>>> > > > >
> > > > > >>>> > > >
> > > > > >>>> > >
> > > > > >>>> >
> > > > > >>>>
> > > > > >>>
> > > > > >>>
> > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>
