convert existing Parquet table using Hive/MR

2019-05-22 Thread Shangzhong zhu
Hi All, Do we have any examples of how to convert an existing Parquet table to a Hudi-managed table using Hive/MR? It seems the provided HDFSParquetImporter tool only runs on Spark. Regards, Shanzhong

Re: convert existing Parquet table using Hive/MR

2019-05-22 Thread Vinoth Chandar
Hi, Unfortunately not. Hudi writing is a Spark job. With just MR/Hive, we were unable to implement features like file sizing and index lookup, which need shuffling of data. The Hive OutputFormat, for example, cannot be extended in this fashion. Thanks Vinoth On Wed, May 22, 2019 at 12:55 AM Shangzh
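To illustrate why file sizing needs a shuffle, here is a toy sketch in plain Python. It is not Hudi, Spark, or Hive code: the function names, the (partition, payload) record shape, and the max_records_per_file parameter are all hypothetical, chosen only to show the idea. A map-only writer emits one file per task per partition, while a writer that first groups records by partition across tasks (the shuffle) can pack them into files of a chosen size.

```python
from collections import defaultdict


def map_only_write(splits):
    """Hypothetical map-only writer: each task writes its own file per
    partition, so the file count grows with tasks x partitions."""
    files = []
    for records in splits:  # one iteration per task / input split
        by_partition = defaultdict(list)
        for partition, payload in records:
            by_partition[partition].append(payload)
        for partition, payloads in by_partition.items():
            files.append((partition, payloads))
    return files


def shuffled_write(splits, max_records_per_file):
    """Hypothetical shuffle-based writer: bring all records of a partition
    together first, then cut them into files of a bounded size."""
    by_partition = defaultdict(list)
    for records in splits:  # grouping across tasks stands in for the shuffle
        for partition, payload in records:
            by_partition[partition].append(payload)
    files = []
    for partition, payloads in by_partition.items():
        for start in range(0, len(payloads), max_records_per_file):
            chunk = payloads[start:start + max_records_per_file]
            files.append((partition, chunk))
    return files


# Three input splits, each holding records for the same two partitions.
splits = [
    [("2019/05/21", "a"), ("2019/05/22", "b")],
    [("2019/05/21", "c"), ("2019/05/22", "d")],
    [("2019/05/21", "e"), ("2019/05/22", "f")],
]

small_files = map_only_write(splits)
sized_files = shuffled_write(splits, max_records_per_file=4)
print(len(small_files), len(sized_files))
```

Without the grouping step, no single task sees all the records for a partition, so it cannot decide how large the output files should be.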

Re: convert existing Parquet table using Hive/MR

2019-05-22 Thread Shangzhong zhu
Thanks, Vinoth! Regards, Shanzhong > On May 22, 2019, at 7:53 AM, Vinoth Chandar wrote: > Hi, > Unfortunately not. Hudi writing is a spark job. With just MR/Hive, we were unable to implement features like file sizing/index lookup which need shuffling of data. > Hive OutputFormat for e