Attendance:
- Criteo: Mickael (working on Hive SerDe)
- Apache Drill: Parth (MapR)
- Cloudera: Ryan
- Netflix: Dan, Tonjie, Zhengxiao, Nezih (working on Presto)
- Twitter: Julien
Notes:
- Dealing with Lists and Maps containing nulls.
  - In the SerDe, Map of array and array of Map have been fixed.
  - Mickael is currently working on HIVE-6994 => null inside array.
  - Lists or arrays are modeled with a 3-level representation:
    - One optional field for the list itself, which can be null
    - One repeated field for the items
    - One optional field to allow storing nulls in the list
  - Ryan to send a PR for standardizing the representation of lists.
  - We need a permissive model for backward compatibility.
  - We need to make sure there's no ambiguity between user-defined one-field groups and the synthetic extra layers that represent nulls in lists.
- Vectorized execution.
  - Netflix and Drill teams are working together. The proposed API is based on Presto. People interested should review (Drill, Hive, Spark).
  - Parth: we should be able to pass in an allocator (init and cleanup).
  - See PARQUET-8[7-8].
  - Possibly we should use [Byte,...]Buffers instead of arrays.
- Jobs with significant setup time: what is being done to speed them up.
  - PARQUET-100: HCatalog => write one file per partition; increasing default parallelism. Needs to be reviewed.
- Java 8 support: Tom from Cloudera is working on it.
- Parquet release:
  - We need to add license headers.
  - Plan: release, rename packages, merge byte buffer APIs, merge 2.0-related JIRAs.
  - See PARQUET-111: plan for release to review.
  - Encoding fallback: Julien to add description in PR.
  - New PRs for Parquet 2.0:
    - encoding fallback
    - new page formats
    - predicate push down on dictionary

Next sync up: Tuesday, Nov 18, 2014, 10:30 am PST.
If you want a reminder, send an email.

On Tue, Oct 28, 2014 at 10:31 AM, Julien Le Dem <[email protected]> wrote:
> Happening now:
> https://plus.google.com/events/c2qu63kvjn2m31gnlq9hcrounh8
>
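For reference, here is a rough sketch of why the 3-level list representation from the notes can tell a null list, an empty list, and a null item apart. It is an illustration only, not the Parquet implementation: the field names (my_list, items, element) and the functions shred and assemble are hypothetical, and the layout Ryan's PR settles on may differ. It assumes Dremel-style repetition and definition levels, where each optional or repeated layer that is present adds one to the definition level.

```python
from typing import List, Optional, Tuple

# Hypothetical schema; the field names are illustrative, not from the notes:
#   optional group my_list {      <- layer 1: the list itself may be null
#     repeated group items {      <- layer 2: one entry per item
#       optional int32 element;   <- layer 3: the item may be null
#     }
#   }

Row = Optional[List[Optional[int]]]
Triple = Tuple[int, int, Optional[int]]  # (repetition level, definition level, value)


def shred(rows: List[Row]) -> List[Triple]:
    """Flatten rows into (repetition, definition, value) triples."""
    out: List[Triple] = []
    for row in rows:
        if row is None:
            out.append((0, 0, None))      # list is null: no layer defined
        elif not row:
            out.append((0, 1, None))      # list present but empty
        else:
            for i, item in enumerate(row):
                rep = 0 if i == 0 else 1  # 0 starts a new row, 1 continues it
                if item is None:
                    out.append((rep, 2, None))  # item slot present, value null
                else:
                    out.append((rep, 3, item))  # all three layers defined
    return out


def assemble(triples: List[Triple]) -> List[Row]:
    """Rebuild the rows from the triples produced by shred."""
    rows: List[Row] = []
    for rep, definition, value in triples:
        if rep == 0:
            if definition == 0:
                rows.append(None)         # null list
                continue
            rows.append([])
            if definition == 1:
                continue                  # empty list, nothing to add
        rows[-1].append(value if definition == 3 else None)
    return rows


if __name__ == "__main__":
    rows = [None, [], [1, None, 2]]
    for triple in shred(rows):
        print(triple)
    print(assemble(shred(rows)) == rows)
```

With only two layers (no optional field under the repeated one), the null item in the last row would have no definition level of its own, which is the case HIVE-6994 is about.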
