Re: [DISCUSS] Implementation strategies for supporting Iceberg tables in Hive

2019-08-08 Thread Ryan Blue
> With an Iceberg raw store, I suspect that you might not need a storage handler and could go straight to an input/output format. You would probably need an input and output format for each of the storage formats: Iceberg{Orc,Parquet,Avro}{Input,Output}Format. I don't think that would work because

Re: [DISCUSS] Implementation strategies for supporting Iceberg tables in Hive

2019-08-07 Thread Owen O'Malley
> On Jul 24, 2019, at 22:52, Adrien Guillo wrote: Hi Iceberg folks, In the last few months, we (the data infrastructure team at Airbnb) have been closely following the project. We are currently evaluating potential strategies to migrate our data warehouse to Iceberg.

Re: [DISCUSS] Implementation strategies for supporting Iceberg tables in Hive

2019-07-29 Thread Daniel Weeks
Owen or Carl, Do you have any thoughts on this approach? We had previously discussed this but now that we've looked into it more closely there are a few areas that are unclear. HiveMetaHook looks like a good entry point for DDL (though as Adrien pointed out, it doesn't cover all operations).

Re: [DISCUSS] Implementation strategies for supporting Iceberg tables in Hive

2019-07-25 Thread RD
Hi Adrien, We at LinkedIn went through a similar thought process, but given that our Hive deployment is not that large, we are considering deprecating Hive and asking our users to move to Spark [since Spark supports HiveQL]. I'm guessing we'd have to invest in Spark's catalog