You can look at RecordBatchMemoryManager.java and follow the code of one of
the operators (such as flatten) to see how this was done.

Thanks
Padma


On Wed, Apr 24, 2019 at 12:00 PM Paul Rogers <par0...@yahoo.com.invalid>
wrote:

> Hi Igor,
>
> Thanks for the recap. You asked about vector allocation. Here is where I
> think things stand. Others can fill in details that I may miss.
>
> We have several ways to size value vectors; but no single standard. As you
> note, the most common way is simply to accept the cost of letting the
> vector double in size multiple times.
>
> One way to pre-allocate vectors is to use the "sizer" along with its
> associated allocation helper. This was always meant to be a quick & dirty
> temporary solution, but has turned out, I believe, to be the primary vector
> size management solution in most operators.
>
> Another is the new row set framework: vector size (in terms of number of
> items and estimated item size) is expressed in metadata, then is used to
> allocate each new batch to the desired size.
>
> You can also just do the work yourself: pick a number, and, when
> allocating a vector, tell it to use that size. You then take on the task of
> estimating average width, picking a good target number of rows for your
> batch, working out the number of items in arrays, etc. (This is, in fact,
> what the other two methods mentioned above actually do.)
>
> The key problem with the ad-hoc techniques is that they can't limit
> maximum vector size to 16 MB (to avoid Netty fragmentation) nor limit
> overall batch size to some reasonable number. The ad-hoc techniques can
> also lead to internal fragmentation (excessive unused space within each
> vector.) Solving these problems is what the row set framework was designed
> to do.
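
As a rough illustration of the do-it-yourself arithmetic described above
(estimate average widths, then pick a row count that respects both a batch
budget and the 16 MB per-vector limit), here is a minimal Java sketch. It
uses no Drill classes. The class name VectorSizeSketch, the method names,
the sample column widths, and the 8 MB batch budget are all hypothetical;
only the 16 MB cap comes from this thread.

```java
public final class VectorSizeSketch {

    /** Per-vector cap mentioned in the thread (16 MB). */
    static final long MAX_VECTOR_BYTES = 16L * 1024 * 1024;

    /**
     * Hypothetical helper: choose a row count so that the whole batch stays
     * within batchBudgetBytes and no single column exceeds MAX_VECTOR_BYTES.
     * avgColumnWidths holds the estimated average width of each column in bytes.
     */
    static int targetRowCount(long batchBudgetBytes, int[] avgColumnWidths) {
        long rowWidth = 0;
        long widest = 1;
        for (int w : avgColumnWidths) {
            rowWidth += w;
            widest = Math.max(widest, w);
        }
        rowWidth = Math.max(rowWidth, 1);

        long byBatch = batchBudgetBytes / rowWidth;   // limit from the batch budget
        long byVector = MAX_VECTOR_BYTES / widest;    // limit from the widest column
        long rows = Math.max(1, Math.min(byBatch, byVector));
        return (int) Math.min(rows, Integer.MAX_VALUE);
    }

    /** Bytes to request up front for one column at the chosen row count. */
    static long vectorBytes(int rowCount, int avgWidth) {
        return (long) rowCount * avgWidth;
    }

    public static void main(String[] args) {
        int[] widths = {4, 8, 50};                    // assumed average widths
        int rows = targetRowCount(8L * 1024 * 1024, widths);
        System.out.println("rows per batch: " + rows);
        for (int w : widths) {
            System.out.println("width " + w + " -> " + vectorBytes(rows, w) + " bytes");
        }
    }
}
```

Sizing each vector once from such an estimate avoids the repeated doubling
mentioned earlier, but the estimate can still be wrong for skewed data, which
is the gap the sizer and the row set framework try to close.
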
>
> Thanks,
> - Paul
>
>
>
>     On Wednesday, April 24, 2019, 10:48:44 AM PDT, Igor Guzenko <
> ihor.huzenko....@gmail.com> wrote:
>
>  Hello Everyone,
>
> Sorry for the late reply. Here are the presentations on:
>
> Map<K,V> vector    -
>
> https://docs.google.com/presentation/d/1FG4swOrkFIRL7qjiP7PSOPy8a1vnxs5Z9PM3ZfRPRYo/edit#slide=id.p
> Hive complex types  -
>
> https://docs.google.com/presentation/d/1nc0ID5aju-qj-7hjquFpH-TwGjeReWTYogsExuOe8ZA/edit?usp=sharing
> .
>
> Discussion results for Map<K,V> new vector:
> - Need to eliminate possibility of key duplication;
> - Need to check Hive behavior when ORDER BY is performed for Map
> complex type column;
> - Need to describe design and all use cases for the vector in design
> document.
>
> Discussion results for Hive complex types:
> - Aman Sinha made a few great suggestions. The first is that the Hive
> writers may be created once per table scan; the second is that this
> would be a good moment to calculate vector sizes and allocate early.
> We need to provide a few examples describing how allocation will work
> for complex types.
> - Need to describe the suggested approach in the design document and
> continue the discussion there.
>
> A question from my side: have we already implemented predictive
> allocation of value vectors somewhere? Any example would be useful,
> because as far as I can see, our existing vector writers usually use the
> mutator's setSafe(...) methods, inside which the buffer may be grown
> when necessary.
>
> The future design document will be located at
>
> https://docs.google.com/document/d/1yEcaJi9dyksfMs4w5_GsZCQH_Pffe-HLeLVNNKsV7CA/edit?usp=sharing
> .
> Please feel free to leave your comments and suggestions in the
> document and presentations.
>
> Thanks,
> Igor Guzenko
>
>
> On Wed, Apr 17, 2019 at 3:04 AM Jyothsna Reddy <jyothsna....@gmail.com>
> wrote:
> >
> > Hi All,
> > The hangout will start at 9:30 AM PST instead of 10 AM PST on 04-18-2019.
> >
> >
> > Thank you,
> > Jyothsna
> >
> >
> >
> >
> > On Tue, Apr 16, 2019 at 2:00 PM Jyothsna Reddy <jyothsna....@gmail.com>
> > wrote:
> >
> > > Hi Charles,
> > > Yes, sure!! Probably we can start with your discussion first and Hive
> > > complex types later since there will be some discussion around the
> > > latter topic.
> > >
> > > Thank you,
> > > Jyothsna
> > >
> > >
> > >
> > >
> > > On Tue, Apr 16, 2019 at 1:40 PM Charles Givre <cgi...@gmail.com>
> wrote:
> > >
> > >> Hi Jyothsna,
> > >> Could I get a few minutes on the next Hangout to promote the Drill day
> > >> at ApacheCon?
> > >> Thanks
> > >>
> > >> > On Apr 16, 2019, at 16:38, Jyothsna Reddy <jyothsna....@gmail.com>
> > >> wrote:
> > >> >
> > >> > Hi Everyone,
> > >> >
> > >> > Here are some key points of today's hangout discussion:
> > >> >
> > >> > Sorabh mentioned that there are some regressions in TPC-DS queries,
> > >> > which are a blocker for the 1.16 release.
> > >> >
> > >> > Bohdan presented their proposal for Hive complex types support. Here
> > >> > are some of the important points:
> > >> >
> > >> >  - Structure of MapVector: keys are of primitive type, whereas values
> > >> >  can be of either primitive or complex type.
> > >> >  - MapReader and MapWriter are used to read from and write to the
> > >> >  MapVector.
> > >> >  - MapWriter tracks the current row/length and is used to calculate the
> > >> >  write position and offset.
> > >> >
> > >> > Following are some of the questions from the audience:
> > >> >
> > >> >  - Will the types be implicitly cast, since Calcite supports keys of
> > >> >  type int and string?
> > >> >  - Future improvements include sorting the keys for better lookup. Is
> > >> >  it per row or across all the rows?
> > >> >
> > >> > Since there is more to discuss, there will be a hangout session on
> > >> > 04-18-2019 at 10 AM PST (link
> > >> > http://meet.google.com/yki-iqdf-tai).
> > >> >
> > >> > Thank you,
> > >> > Jyothsna
> > >> >
> > >> >
> > >> >
> > >> > On Mon, Apr 15, 2019 at 11:48 AM Bohdan Kazydub <
> > >> bohdan.kazy...@gmail.com>
> > >> > wrote:
> > >> >
> > >> >> Hello,
> > >> >> Igor and I would like to discuss Hive Complex types support.
> > >> >>
> > >> >> Thanks,
> > >> >> Bohdan
> > >> >>
> > >> >> On Mon, Apr 15, 2019 at 8:47 PM Charles Givre <cgi...@gmail.com>
> > >> wrote:
> > >> >>
> > >> >>> I’d like to promote the Drill track for ApacheCon.
> > >> >>>
> > >> >>> Sent from my iPhone
> > >> >>>
> > >> >>>> On Apr 15, 2019, at 13:09, Jyothsna Reddy <
> jyothsna....@gmail.com>
> > >> >>> wrote:
> > >> >>>>
> > >> >>>> Hello Everyone,
> > >> >>>> Does anyone have any topics for tomorrow's hangout?
> > >> >>>>
> > >> >>>> We will start the hangout at 10 AM PST (link
> > >> >>>> http://meet.google.com/yki-iqdf-tai).
> > >> >>>>
> > >> >>>> Thank you,
> > >> >>>> Jyothsna
> > >> >>>
> > >> >>
> > >>
> > >>
