Re: [Agenda] Drill developer meetup 2018

Hanumanth Maduri Wed, 14 Nov 2018 08:33:02 -0800


Hello Drillers,


Remote attendees can join via Webex at
https://mapr.webex.com/mapr/j.phpMTID=ma05d8b5406acdb6292d5b81c79240a38

Thanks


> On Nov 2, 2018, at 11:25 AM, Abhishek Girish <[email protected]> wrote:
> 
> Charles, I'm sure we'll have a link for remote folks to join - will share
> it closer to the day.
> 
>> On Thu, Nov 1, 2018 at 1:58 PM hanu mapr <[email protected]> wrote:
>> 
>> Hello All,
>> 
>> There was a typo in the year in the mail. It should be 2018 instead of 2019.
>> Thanks Aman for correcting it.
>> 
>> Regards,
>> -Hanu
>> 
>>> On Thu, Nov 1, 2018 at 6:30 AM Charles Givre <[email protected]> wrote:
>>> 
>>> Hi Hanumath,
>>> This looks great!!  Will you be streaming the event for those of us not
>>> in the Bay Area?
>>> Thx,
>>> — C
>>> 
>>>> On Nov 1, 2018, at 00:10, Hanumath Rao Maduri <[email protected]> wrote:
>>>> 
>>>> Drill Developers,
>>>> 
>>>> 
>>>> I am quite excited to announce the details of the Drill developers day
>>>> 2018. I have consolidated the topics from our earlier discussions and
>>>> prioritized them according to the votes.
>>>> 
>>>> 
>>>> MapR has offered to host it on Nov 14th in the Training Room downstairs.
>>>> 
>>>> 
>>>> Here is the exact location:
>>>> 
>>>> 
>>>> Training Room at
>>>> 
>>>> 4555 Great America Pkwy, Suite 201, Santa Clara, CA, 95054.
>>>> 
>>>> 
>>>> Please find the agenda for the meetup.
>>>> 
>>>> 
>>>> 
>>>> *Lunch starts at 12:00PM.*
>>>> 
>>>> 
>>>> *[12:25 - 12:40] Welcome*
>>>> 
>>>>  - Recap on last year's activities
>>>>  - Preview of this year's focus
>>>> 
>>>> *[12:40 - 1:00] Storage plugins*
>>>> 
>>>> 
>>>> 
>>>>  - Adding new storage plugins for the following:
>>>>     - Netflix Iceberg, Kudu (some code already exists), Cassandra,
>>>>     Elasticsearch, CarbonData, ORC/XML file formats, Spark
>>>>     RDD/DataFrames/Datasets, graph databases & more
>>>>  - Improving documentation related to Storage plugins
>>>> 
>>>> 
>>>> *[1:00 - 1:45] Schema discovery & Evolution*
>>>> 
>>>> 
>>>> 
>>>>  - Creation and management of schemas
>>>>  - Handling schema changes in certain common cases
>>>>  - Handling NULL values elegantly
>>>>  - Schema learning (similar to MSGpack plugin)
>>>>  - Query hints
>>>> 
>>>> *[1:45 - 2:30] Metadata Management*
>>>> 
>>>> 
>>>> 
>>>>  - Defining an abstraction layer for various types of metadata: views,
>>>>  schema, statistics, security
>>>>  - Underlying storage for metadata: what are the options and their
>>>>  trade-offs?
>>>>  - Hive metastore
>>>>  - Parquet metadata cache (Parquet-specific, for row group metadata)
>>>>  - Ease of using the Parquet files generated by other engines (like
>>>>  Spark)
>>>> 
>>>> 
>>>> *[2:30 - 2:45] Break*
>>>> 
>>>> 
>>>> *[2:45 - 4:00] Resource management*
>>>> 
>>>> 
>>>> 
>>>>  - Resource limits per query
>>>>  - Optimal memory assignment for blocking operators based on stats
>>>>  - Enhancing the blocking and exchange operators to live within memory
>>>>  limits
>>>>  - Aligning with admission control/queueing (YARN concepts)
>>>>  - Query scheduling based on queues using tagging and costing
>>>>  - Drill on Kubernetes
>>>> 
>>>> 
>>>> *[4:00 - 4:20] Apache Arrow*
>>>> 
>>>>  - Benefits of integrating Apache Drill with Apache Arrow
>>>>  - Possible trade-offs & implementation hurdles
>>>> 
>>>> *[4:20 - 4:40] Performance Improvements*
>>>> 
>>>>  - Efficient handling of Broadcast/Semi/Anti Semi join
>>>>  - Drill Statistics handling
>>>>  - Optimizing complex Parquet reader
>>>> 
>>>> Thanks,
>>>> -Hanu
>>> 
>>> 
>>
