OK Julien! I will start looking at it on Monday (a few things need to be done first on my side).
--
Mickaël Lacour
Senior Software Engineer
Analytics Infrastructure team @Scalability
________________________________
From: Julien Le Dem <[email protected]>
Sent: Friday, September 5, 2014 02:10
To: Brock Noland
Cc: Justin Coffey; Mickaël Lacour; [email protected]; Remy Pecqueur
Subject: Re: Hive depending on ParquetInputSplit constructor

Here's the update:
https://github.com/julienledem/incubator-parquet-mr/commit/ccdd08c7030d9d4579b6beef2b93981e327637c5
This should be backward compatible. Ideally, Hive should implement the same.

On Thu, Sep 4, 2014 at 3:20 PM, Julien Le Dem <[email protected]> wrote:

Actually, now that I'm looking into this in more detail, I may come up with a backward compatible solution. I'll send an update soon.

On Thu, Sep 4, 2014 at 2:46 PM, Brock Noland <[email protected]> wrote:

Sounds good. There is a Hive release coming up, so we should try to get this in soon.

On Sep 4, 2014 2:27 PM, "Julien Le Dem" <[email protected]> wrote:

I just realized that parquet-hive and Hive have their own implementation of InputFormat now. Basically it is a plain FileInputFormat and the metadata is read on the tasks. However, it does reuse the ParquetInputSplit that I am changing for Parquet-84:
https://github.com/apache/incubator-parquet-mr/pull/45
So I filed https://issues.apache.org/jira/browse/HIVE-7986
I'm going to make a change in parquet-hive as part of #45.
This will break users who put the latest Parquet with a current Hive unless HIVE-7986 is implemented.
