Here's the update:
https://github.com/julienledem/incubator-parquet-mr/commit/ccdd08c7030d9d4579b6beef2b93981e327637c5
This should be backward compatible. Ideally, Hive should implement the same change.

On Thu, Sep 4, 2014 at 3:20 PM, Julien Le Dem <[email protected]> wrote:

> Actually, now that I'm looking into this in more details I may come up
> with a backward compatible solution.
> I'll send an update soon.
>
>
> On Thu, Sep 4, 2014 at 2:46 PM, Brock Noland <[email protected]> wrote:
>
>> Sounds good. There is a hive release coming up so we should try and get
>> this in soon.
>> On Sep 4, 2014 2:27 PM, "Julien Le Dem" <[email protected]> wrote:
>>
>>> I just realized the parquet-hive and Hive have their own implementation
>>> of InputFormat now. Basically it is a plain FileInputFormat and the
>>> metadata is read on the tasks.
>>> However it does reuse the ParquetInputSplit that I am changing for
>>> Parquet-84 https://github.com/apache/incubator-parquet-mr/pull/45
>>> So I filed https://issues.apache.org/jira/browse/HIVE-7986
>>> I'm going to make a change in parquet-hive as part of #45 This will
>>> break users who put the latest Parquet with a current Hive unless HIVE-7986
>>> is implemented.
>>>
>>
>
