Here's some background on what Gautam is trying to do:  Currently, SQL does
not have a standard way to do a DISTINCT on a subset of the columns in the
SELECT list.  Suppose there are 2 columns:
  a:  INTEGER
  b:  MAP
Suppose I want to only do DISTINCT on 'a' and I don't really care about the
column 'b'; I just want the first or any value of 'b' within a single
group of 'a'. Postgres actually has a 'SELECT DISTINCT ON (a) a, b' syntax,
but based on our discussion on the Calcite mailing list, we want to avoid
that syntax. So, there's an alternative proposal to do the following:

    SELECT a, ANY_VALUE(b) FROM table GROUP BY a

This means, ANY_VALUE will essentially be treated as an Aggregate function
and from a code-gen perspective, we want to read 1 item (a MapHolder) from
the incoming MapVector and write it to a particular index in the output
MapVector.    This is where  it would be useful to  have
MapVector.setSafe()  since the StreamingAgg and HashAgg both generate
setSafe()  for normal aggregate functions.
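
To make the intended semantics concrete, here is a minimal sketch in plain
Java (not Drill code) of what ANY_VALUE(b) ... GROUP BY a should produce: one
output row per distinct 'a', carrying the first 'b' seen in that group. The
class and method names (AnyValueSketch, Row, anyValuePerGroup) are made up
for illustration, and "first seen" is just one valid choice of "any".

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AnyValueSketch {
    /** One input row: grouping key 'a' and the complex (map) column 'b'. Hypothetical type. */
    static final class Row {
        final int a;
        final Map<String, Object> b;

        Row(int a, Map<String, Object> b) {
            this.a = a;
            this.b = b;
        }
    }

    /** Keeps the first 'b' seen for each distinct 'a'; later rows of the group are ignored. */
    static Map<Integer, Map<String, Object>> anyValuePerGroup(List<Row> rows) {
        Map<Integer, Map<String, Object>> out = new LinkedHashMap<>();
        for (Row r : rows) {
            out.putIfAbsent(r.a, r.b);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
            new Row(1, Map.of("x", 10)),
            new Row(1, Map.of("x", 11)),
            new Row(2, Map.of("x", 20)));
        // One entry per distinct 'a', each with a single map value taken from that group.
        System.out.println(anyValuePerGroup(rows));
    }
}
```

Note that nothing is combined across rows here: the "aggregate" only picks
one whole map per group, which is why it needs to copy a complete map value
into the output.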

However, it seems the better (or perhaps only) way to do this is through
the MapOrListWriter (ComplexWriter) as long as there's a way to instruct
the writer to write to a specific output index (the output index is needed
because there are several groups in the output container and we want to
write to a specific one).
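
As a rough illustration of "position the writer at an output index, then
copy one map into it", here is a toy model in plain Java. It does not use
Drill's real vector or writer classes: ToyMapVector, ToyMapWriter,
setPosition() and copyFrom() are all hypothetical names, and the map is
modeled as a flat container of named child columns, in line with Paul's
point that a map holds no values of its own.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class IndexedMapWriterSketch {
    /** Toy stand-in for a map vector: a container of named child columns. */
    static final class ToyMapVector {
        final Map<String, List<Object>> children = new LinkedHashMap<>();

        /** Returns the child column, creating it on first use. */
        List<Object> child(String name) {
            return children.computeIfAbsent(name, k -> new ArrayList<>());
        }

        /** Reads one child value, or null if the slot was never written. */
        Object get(String name, int index) {
            List<Object> col = children.get(name);
            return (col == null || index >= col.size()) ? null : col.get(index);
        }
    }

    /** Toy writer: positioned at an output index first, then fed one map. */
    static final class ToyMapWriter {
        private final ToyMapVector target;
        private int position;

        ToyMapWriter(ToyMapVector target) {
            this.target = target;
        }

        void setPosition(int index) {
            this.position = index;
        }

        /** Copies every child value of source[srcIndex] into target[position]. */
        void copyFrom(ToyMapVector source, int srcIndex) {
            for (String name : source.children.keySet()) {
                List<Object> col = target.child(name);
                while (col.size() <= position) {
                    col.add(null); // grow on demand, similar in spirit to setSafe()
                }
                col.set(position, source.get(name, srcIndex));
            }
        }
    }

    public static void main(String[] args) {
        ToyMapVector in = new ToyMapVector();
        in.child("x").add(10);
        in.child("x").add(11);
        in.child("y").add("p");
        in.child("y").add("q");

        ToyMapVector out = new ToyMapVector();
        ToyMapWriter writer = new ToyMapWriter(out);
        writer.setPosition(2);  // the output slot for the third group
        writer.copyFrom(in, 1); // any row of that group will do

        System.out.println(out.get("x", 2) + " " + out.get("y", 2));
    }
}
```

The sketch only handles one level of scalar children; nested maps or lists
would need the copy to recurse into each child, which is the part a real
complex writer would have to take care of.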

-Aman


On Wed, Apr 11, 2018 at 2:13 PM, Paul Rogers <par0...@yahoo.com.invalid>
wrote:

> What semantics are wanted? SetSafe sets a single value in a vector. What
> does it mean to set a single map or array value? What would we pass as an
> argument?
> For non-simple types, something needs to iterate over the values: be they
> elements of a map, elements in an array, elements of an array of maps, then
> over the map members, etc.
> I believe that you are hitting a fundamental difference between simple
> scalar values and complex (composite) values.
> This is for an aggregate. There is no meaningful aggregate of a map or an
> array. One could aggregate over a scalar that is a member of a map to
> produce a scalar result. Or, one could iterate over the members of an array
> to produce, say, an average or sum.
> You are dealing with aggregate UDFs (even built in functions implement the
> UDAF protocol.) A quick check of the source code does not find a
> "AnyValueComplexFunctions" class, so this may perhaps be something new you
> are developing. What are the desired semantics?
> The UDAF protocol can include a complex writer for maps. I've not played
> with that yet. But, it does not seem meaningful to aggregate a map to
> produce a map or to aggregate an array to produce an array. The idea is
> that the UDAF figures out what to do with maps, then uses the complex
> writer to produce the desired result. This makes sense since there is no
> way to store a map as a simple value passed to setSafe().
> Can you provide additional details of what you are trying to do?
> Thanks,
> - Paul
>
>
>
>     On Wednesday, April 11, 2018, 1:53:12 PM PDT, Padma Penumarthy <
> ppenumar...@mapr.com> wrote:
>
>  I guess you can add a setSafe method which recursively does setSafe for
> all children.
>
> Thanks
> Padma
>
>
> > On Apr 11, 2018, at 1:19 PM, Gautam Parai <gpa...@mapr.com> wrote:
> >
> > Hi Paul/Padma,
> >
> >
> > Thank you so much for the responses. This function is supposed to return
> > `any value` from the batch of incoming rows. Hence the need to handle
> > maps/lists.
> >
> >
> > This codegen is for the StreamingAggregator for complex types (e.g. maps)
> > in the incoming batch. It is trying to assign the values in the
> > ComplexHolder to the outgoing MapVector.
> >
> >
> > MapVector vv9; // Output VV of StreamingAgg
> >
> > ....
> >
> >    public void outputRecordValues(int outIndex)
> >        throws SchemaChangeException
> >    {
> >        {
> >            ComplexHolder out8;
> >            {
> >                final ComplexHolder out = new ComplexHolder();
> >                FieldReader fr = work0;
> >                MapHolder value = work1;
> >                BigIntHolder nonNullCount = work2;
> >
> >                AnyValueComplexFunctions$MapAnyValue_output: {
> >                    out.reader = fr;
> >                }
> >
> >                work0 = fr;
> >                work1 = value;
> >                work2 = nonNullCount;
> >                out8 = out;
> >            }
> >            // Don't have setSafe for MapVector
> >            vv9.getMutator().setSafe((outIndex), out8);
> >        }
> >    }
> >
> >
> > Please let me know your thoughts.
> >
> >
> > Gautam
> >
> >
> >
> > ________________________________
> > From: Paul Rogers <par0...@yahoo.com.INVALID>
> > Sent: Wednesday, April 11, 2018 12:40:15 PM
> > To: dev@drill.apache.org
> > Subject: Re: [DISCUSS] Regarding mutator interface
> >
> > Note that, for maps and lists, there is nothing to set. Maps are purely
> containers for other vectors. Lists (you didn't mention whether "repeated"
> or "non-repeated") are also containers. Non-repeated lists are containers
> for unions, repeated-lists are containers for arrays.
> > Any setting should be done on the contained vectors. For lists, only the
> offset vector is updated.
> > So, another question is: what is the generated code trying to set?
> >
> > Thanks,
> > - Paul
> >
> >
> >
> >    On Wednesday, April 11, 2018, 12:33:52 PM PDT, Padma Penumarthy <
> ppenumar...@mapr.com> wrote:
> >
> > Can you explain how aggregation on complex types works (or is supposed
> > to work)?
> >
> > Thanks
> > Padma
> >
> >
> >> On Apr 11, 2018, at 12:15 PM, Gautam Parai <gpa...@mapr.com> wrote:
> >>
> >> Hi all,
> >>
> >>
> >> I am implementing a new aggregate function which also handles Complex
> types (map and list). However, the codegen barfs with
> >>
> >>
> >> CompileException: Line 104, Column 39: A method named "setSafe" is not
> declared in any enclosing class nor any supertype, nor through a static
> import
> >>
> >>
> >> It looks like we do not have set()/ setSafe() methods for
> MapVector/ListVector mutators.
> >>
> >>
> >> Should we add these methods to the Mutator interface to ensure all
> >> mutators implement them? Is there a reason we chose not to do so?
> >>
> >>
> >> Please let me know your thoughts. Thanks!
> >>
> >>
> >> Gautam
> >
>
>
