I haven't been following this closely, but I'm aware that there are
some tricky compatibility problems between Avro and Parquet, both of
which are used in Spark. That's made it pretty hard to update in 2.x.
master/3.0 is on Parquet 1.10.1 and Avro 1.8.2. Just a general
question: is that the best combo going forward? Now would be about the
right time to update for Spark 3. Backporting to 2.x is pretty
unlikely, though.
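
(For reference, a minimal sbt sketch of how an application build could pin
that same combo against the 3.0.0-preview artifacts; this is illustrative
only, and the artifact list is just a representative subset, not a
recommendation.)

  // build.sbt sketch: compile against Spark 3.0.0-preview and pin Parquet/Avro
  // to the versions currently on master so the app classpath matches Spark's.
  libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.0.0-preview" % Provided
  dependencyOverrides ++= Seq(
    "org.apache.parquet" % "parquet-column" % "1.10.1",
    "org.apache.parquet" % "parquet-hadoop" % "1.10.1",
    "org.apache.avro"    % "avro"           % "1.8.2"
  )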

On Fri, Nov 22, 2019 at 12:45 PM Michael Heuer <heue...@gmail.com> wrote:
>
> Hello,
>
> I am sorry for asking a somewhat inappropriate question.
>
> For context, our projects depend on a fix that is in Parquet master but not 
> yet released.  Parquet 1.11.0 is in the release-candidate phase.  It looks 
> like we can't build against the Parquet 1.11.0 RC to pick up the fix and run 
> successfully on Spark 2.4.x, which bundles Parquet 1.10.1, without various 
> classpath workarounds.
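>
> (As a concrete illustration of the kind of classpath workaround meant here, 
> only a sketch under assumptions and not necessarily what we do: one option 
> is to shade the newer Parquet classes into the application jar with 
> sbt-assembly, so they cannot clash with the 1.10.1 classes bundled in the 
> Spark 2.4.x distribution.)
>
>   // build.sbt sketch; "1.11.0" is a placeholder since the RC is not released.
>   libraryDependencies += "org.apache.parquet" % "parquet-avro" % "1.11.0"
>   // Relocate our Parquet classes away from org.apache.parquet so that Spark's
>   // own Parquet 1.10.1 classes are left untouched at runtime.
>   assemblyShadeRules in assembly := Seq(
>     ShadeRule.rename("org.apache.parquet.**" -> "shaded.parquet.@1").inAll
>   )
>
> The other common workaround is spark-submit with --conf 
> spark.driver.userClassPathFirst=true and --conf 
> spark.executor.userClassPathFirst=true plus the newer jars on --jars; 
> neither approach is pleasant.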
>
> I see now that Spark policy requires the Avro upgrade to wait until Spark 
> 3.0, and since the Parquet 1.11.0 RC currently depends on Avro 1.9.1, it may 
> also have to wait.  I'll continue to think on this within the Parquet 
> community.
>
> Thank you for the clarification,
>
>    michael
>
>
> On Nov 22, 2019, at 12:07 PM, Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
>
> Hi, Michael.
>
> I'm not sure Apache Spark is in a state close to what you want.
>
> First, both Apache Spark 3.0.0-preview and Apache Spark 2.4 are using Avro 
> 1.8.2, and so do the `master` and `branch-2.4` branches. Cutting new releases 
> would not give you what you want.
>
> Do we have a PR on the master branch? If not, before starting to discuss 
> releases, could you open a PR against the master branch first? The same 
> applies to Parquet.
>
> Second, we want Apache Spark 3.0.0 to be as compatible as possible. An 
> incompatible change could be a reason for rejection even on the `master` 
> branch for Apache Spark 3.0.0.
>
> Lastly, we may consider backporting once it lands on the `master` branch for 
> 3.0. However, as Nan Zhu said, a dependency-upgrade backport PR is -1 by 
> default; usually it's allowed only for serious cases such as security issues 
> or production outages.
>
> Bests,
> Dongjoon.
>
>
> On Fri, Nov 22, 2019 at 9:00 AM Ryan Blue <rb...@netflix.com.invalid> wrote:
>>
>> Just to clarify, I don't think that Parquet 1.10.1 to 1.11.0 is a 
>> runtime-incompatible change. The example mixed 1.11.0 and 1.10.1 in the same 
>> execution.
>>
>> Michael, please be more careful about announcing compatibility problems in 
>> other communities. If you've observed problems, let's find out the root 
>> cause first.
>>
>> rb
>>
>> On Fri, Nov 22, 2019 at 8:56 AM Michael Heuer <heue...@gmail.com> wrote:
>>>
>>> Hello,
>>>
>>> Avro 1.8.2 to 1.9.1 is a binary-incompatible update, and it appears that 
>>> Parquet 1.10.1 to 1.11 will be a runtime-incompatible update (see the 
>>> thread on dev@parquet).
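>>>
>>> (To illustrate the kind of break, assuming the relevant change is Avro 
>>> 1.9's removal of org.codehaus.jackson types from its public API: code 
>>> compiled against the 1.8.x Schema.Field constructor that takes a Jackson 
>>> JsonNode default can fail with NoSuchMethodError on a 1.9.1 classpath. 
>>> The 1.9-style call passes a plain object instead, as in this sketch:)
>>>
>>>   import org.apache.avro.Schema
>>>   // Avro 1.9.x: the default value is a plain Object; the 1.8.x overload
>>>   // taking org.codehaus.jackson.JsonNode no longer exists.
>>>   val count = new Schema.Field("count", Schema.create(Schema.Type.INT),
>>>     "a counter", Integer.valueOf(0))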
>>>
>>> Might there be any desire to cut a Spark 2.4.5 release so that users can 
>>> pick up these changes independently of all the other changes in Spark 3.0?
>>>
>>> Thank you in advance,
>>>
>>>    michael
>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>
>

