Hello,

Iceberg's code was pushed to GitHub yesterday: https://github.com/Netflix/iceberg
A draft of the spec has also been published: https://docs.google.com/document/d/1Q-zL5lSCle6NEEdyfiYsXYzX_Q8Qf0ctMyGBKslOswA/edit?usp=sharing

--
Joel

On Sat, Dec 9, 2017 at 4:18 AM, Paul Rogers <[email protected]> wrote:

> Very cool indeed.
>
> Iceberg, if I understand the post, is a file container format: it
> identifies the set of files (in this case, Parquet files) that make up a
> “table.” Since Iceberg mentions Hive, it presumably would work for any file
> format (since it is just a file container.)
>
> This would be a great way to solve our Parquet metadata problem: it allows
> us to identify the set of files that define Drill’s Parquet metadata, and
> to make changes to that metadata in a transactional way.
>
> I wonder if Iceberg could be augmented to include both data and metadata
> in the same container? That way, Drill views and/or Parquet metadata could
> be managed as a unit with the files they describe.
>
> Further, it would seem that Iceberg might be extended to support MVCC by
> listing the files that make up each version (or, equivalently, by listing
> the deltas between versions.)
>
> This is definitely something to watch. Thanks, Parth for bringing it to
> our attention (and thanks to Netflix for open sourcing the format).
>
> - Paul
>
>
> > On Dec 8, 2017, at 9:37 AM, Parth Chandra <[email protected]> wrote:
> >
> > FYI
> >
> > The Parquet is working on introducing new table format called 'Iceberg' [1]
> > that has interesting and useful features.
> >
> > Take a look at the initial post.
> >
> >
> > [1]
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.apache.org_thread.html_f90ac1c268dea4077e358df1df8dd48f3766db8d4db476c3e0d9baa8-40-253Cdev.parquet.apache.org-253E&d=DwIBaQ&c=cskdkSMqhcnjZxdQVpwTXg&r=Dz59a-Un_5n3KbQ2RYN0KA&m=PHcXxx7w-DDoW_UM90WJzI_LGQAhQAGqF1z-3z9xIRc&s=NkV7as3K7vfi7rObvLdSmeDv58FdehfcqxhfRqSBjTU&e=
