> With an Iceberg raw store, I suspect that you might not need a storage
> handler and could go straight to an input/output format. You would probably
> need an input and output format for each of the storage formats:
> Iceberg{Orc,Parquet,Avro}{Input,Output}Format.
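Concretely, that naming scheme implies one input/output class pair per underlying file format. A minimal sketch of how a table's file format might be mapped to such a pair — the class names below are illustrative only, following the `Iceberg{Orc,Parquet,Avro}{Input,Output}Format` pattern; none of them are real Iceberg classes:

```java
import java.util.Map;

// Hypothetical sketch: without a storage handler, each file format needs its
// own Hive input/output format class pair. The names here are placeholders.
public class FormatNaming {
    static final Map<String, String[]> FORMATS = Map.of(
        "orc",     new String[]{"IcebergOrcInputFormat", "IcebergOrcOutputFormat"},
        "parquet", new String[]{"IcebergParquetInputFormat", "IcebergParquetOutputFormat"},
        "avro",    new String[]{"IcebergAvroInputFormat", "IcebergAvroOutputFormat"});

    public static void main(String[] args) {
        // A table declared with file format "parquet" would be wired to:
        String[] pair = FORMATS.get("parquet");
        System.out.println(pair[0] + " / " + pair[1]);
    }
}
```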
I don't think that would work because
> On Jul 24, 2019, at 22:52, Adrien Guillo wrote:
>
> Hi Iceberg folks,
>
> In the last few months, we (the data infrastructure team at Airbnb) have been
> closely following the project. We are currently evaluating potential
> strategies to migrate our data warehouse to Iceberg.
Owen or Carl,
Do you have any thoughts on this approach? We had previously discussed
this, but now that we've looked into it more closely, there are a few areas
that are unclear.
HiveMetaHook looks like a good entry point for DDL (though, as Adrien
pointed out, it doesn't cover all operations).
Hi Adrien,
We at LinkedIn went through a similar thought process, but given that our
Hive deployment is not that large, we are considering deprecating Hive and
asking our users to move to Spark [since Spark supports HiveQL].
I'm guessing we'd have to invest in Spark's catalog