Hi all,
These are the meeting notes I took. Feel free to add to or correct them if
something is missing or wrong.
9/19/2019
Attendees:
Xinli Shang (Uber)
Gidon Gershinsky (IBM)
Jim Apple (Netflix)
Nandor Kollar, Gabor, and several other Cloudera folks
Julien Le Dem (WeWork)
Deepak (Vertica)
Please add yourself if you were missed.
Topics:
Column Encryption
Parquet-format has the specification merged.
One PR is merged into parquet-mr, the second is being reviewed.
For parquet-cpp, we still have some integration errors.
Xinli backported the encryption code to parquet 1.10.1 to mitigate the
risk. We can wait for the 1.11.0 release before deciding whether the public
community should do that.
Bloom filter
The spec has been checked in to parquet-format.
We will continue validating correctness on parquet-mr (feature branch) and
parquet-cpp (master branch? Some code, such as the reader/writer, is not in
the master branch yet).
Netflix has done enough testing on performance. The remaining tests are
mainly for correctness.
There are unit tests and integration tests that cover this.
Parquet-format 2.7.0
Releases of parquet-format are slow now. We need the release before
checking into parquet-mr master.
There are several options. We prefer option 3, which is to release bloom
filter and parquet encryption together in 2.7.0.
Three PMC members voted +1 for option 3 in this meeting.
Ryan can help with the release, signing keys, etc.
Remove old Parquet modules
Hive modules - sounds good
Scrooge - Julien will reach out to Twitter
Tools - undecided - Cloudera may still use parquet-tools, according to
Gabor.
Cascading - undecided
We can mark the modules as deprecated in their descriptions.
1.11.0 Release
Column index validation - Need Ryan to review it.
Someone is proposing a byte_stream_split encoding on the mailing list.
Ryan made a proposal, and the owner replied that they will try it and get
back.
Merge Parquet and ORC
Ryan and Owen had a talk at ApacheCon regarding merging ORC and Parquet.
There are a lot of benefits to doing that but also a lot of work. Overall,
people in this meeting support this effort.
Ryan can start socializing this effort.
Xinli Shang (Uber)
Title: Parquet Sync
Hi all,
This is an invitation for the next occasion of the regular sync meeting of
the Parquet community.
Xinli Shang

Join Zoom Meeting
https://uber.zoom.us/j/112318682

One tap mobile
+16699006833,,112318682# US (San Jose)
+16468769923,,112318682# US (New York)

Dial by your location
+1 669 900 6833 US (San Jose)
+1 646 876 9923 US (New York)
855 880 1246 US Toll-free
877 369 0926 US Toll-free
Meeting ID: 112 318 682
Find your local number: https://zoom.us/u/aZKZunOZ9

Join by [email protected]

Join by H.323
162.255.37.11 (US West)
162.255.36.11 (US East)
221.122.88.195 (China)
115.114.131.7 (India)
213.19.144.110 (EMEA)
103.122.166.55 (Australia)
209.9.211.110 (Hong Kong)
64.211.144.160 (Brazil)
69.174.57.160 (Canada)
207.226.132.110 (Japan)
Meeting ID: 112 318 682
When: Thu Sep 19, 2019 9am – 10am Pacific Time - Los Angeles
Where: https://uber.zoom.us/j/112318682
Who:
* [email protected] - organizer
* [email protected]
* Daniel Weeks
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* Lars Volker
* Mohit Sabharwal
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* Julien Le Dem
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* Xu, Cheng A
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* Ryan Blue
* Wei Han
* [email protected]
* Zoltan Ivanfi
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* Pavi Subenderan
* Reynold Xin
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* Parth Chandra
* [email protected]
* [email protected]
* Mohammad Islam
* [email protected]
* Sergio Pena
* [email protected]
* [email protected]
* [email protected]
* [email protected]
* [email protected]