I would personally love to see us provide a gentle migration path to Spark 3, especially if much of the work is already going to happen anyway.
Maybe giving it a different name (e.g. something like Spark-2-to-3-transitional) would make its intended purpose clearer and encourage folks to move to 3 when they can?

On Mon, Sep 23, 2019 at 9:17 AM Ryan Blue <rb...@netflix.com.invalid> wrote:

My understanding is that 3.0-preview is not going to be a production-ready release. For those of us that have been using backports of DSv2 in production, that doesn't help.

It also doesn't help as a stepping stone, because users would need to handle all of the incompatible changes in 3.0. Using 3.0-preview would mean an unstable release with breaking changes instead of a stable release without the breaking changes.

I'm offering to help build a stable release without breaking changes. But if there is no community interest in it, I'm happy to drop this.

On Sun, Sep 22, 2019 at 6:39 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:

+1 for Matei's as well.

On Sun, 22 Sep 2019, 14:59 Marco Gaido <marcogaid...@gmail.com> wrote:

I agree with Matei too.

Thanks,
Marco

On Sun, Sep 22, 2019 at 03:44 Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

+1 for Matei's suggestion!

Bests,
Dongjoon.

On Sat, Sep 21, 2019 at 5:44 PM Matei Zaharia <matei.zaha...@gmail.com> wrote:

If the goal is to get people to try the DSv2 API and build DSv2 data sources, can we recommend the 3.0-preview release for this? That would get people shifting to 3.0 faster, which is probably better overall compared to maintaining two major versions. There's not that much else changing in 3.0 if you already want to update your Java version.

On Sep 21, 2019, at 2:45 PM, Ryan Blue <rb...@netflix.com.INVALID> wrote:

> If you insist we shouldn't change the unstable temporary API in 3.x . . .

Not what I'm saying at all.
I said we should carefully consider whether a breaking change is the right decision in the 3.x line.

All I'm suggesting is that we can make a 2.5 release with the feature and an API that is the same as the one in 3.0.

> I also don't get this backporting a giant feature to 2.x line

I am planning to do this so we can use DSv2 before 3.0 is released. Then we can have a source implementation that works in both 2.x and 3.0 to make the transition easier. Since I'm already doing the work, I'm offering to share it with the community.

On Sat, Sep 21, 2019 at 2:36 PM Reynold Xin <r...@databricks.com> wrote:

Because, for example, we'd need to move the location of InternalRow, breaking the package name. If you insist we shouldn't change the unstable temporary API in 3.x to maintain compatibility with 3.0, which is totally different from my understanding of the situation when you exposed it, then I'd say we should gate 3.0 on having a stable row interface.

I also don't get this backporting of a giant feature to the 2.x line. As suggested by others in the thread, DSv2 would be one of the main reasons people upgrade to 3.0. What's so special about DSv2 that we are doing this? Why not abandon 3.0 entirely and backport all the features to 2.x?

On Sat, Sep 21, 2019 at 2:31 PM Ryan Blue <rb...@netflix.com> wrote:

Why would that require an incompatible change?

We *could* make an incompatible change and remove support for InternalRow, but I think we would want to carefully consider whether that is the right decision. And in any case, we would be able to keep 2.5 and 3.0 compatible, which is the main goal.
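One way a single source implementation could tolerate the kind of package move Reynold describes (relocating a class like InternalRow between releases) is a small reflection shim that probes each known location at runtime. This is only a sketch of the idea; the two Spark package names in main() are hypothetical illustrations, not actual or planned Spark locations:

```java
import java.util.List;

// Minimal sketch of a cross-version class shim. The Spark package names used
// in main() are hypothetical examples, not real or planned Spark locations.
public final class ClassShim {

    // Returns the first class that loads from the candidate names, or null if none do.
    public static Class<?> loadFirstAvailable(List<String> candidates) {
        for (String name : candidates) {
            try {
                return Class.forName(name);
            } catch (ClassNotFoundException e) {
                // Not on this version's classpath; try the next candidate.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // A source built against two release lines would probe the locations it knows about.
        Class<?> rowClass = loadFirstAvailable(List.of(
                "org.apache.spark.sql.connector.InternalRow",   // hypothetical relocated name
                "org.apache.spark.sql.catalyst.InternalRow"));  // hypothetical original name
        System.out.println(rowClass == null ? "InternalRow not on classpath" : rowClass.getName());
    }
}
```

Shims like this add complexity and only help at the classloading layer, which is part of why keeping the API identical between a 2.5 and 3.0 release, as proposed above, is attractive.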
On Sat, Sep 21, 2019 at 2:28 PM Reynold Xin <r...@databricks.com> wrote:

How would you not make incompatible changes in 3.x? As discussed, the InternalRow API is not stable and needs to change.

On Sat, Sep 21, 2019 at 2:27 PM Ryan Blue <rb...@netflix.com> wrote:

> Making downstream implementations diverge heavily between minor versions (say, 2.4 vs. 2.5) wouldn't be a good experience

You're right that the API has been evolving in the 2.x line. But it is now reasonably stable with respect to the current feature set, and we should not need to break compatibility in the 3.x line. Because we have reached our goals for the 3.0 release, we can backport at least those features to 2.x and confidently have an API that works in a 2.x release and is compatible with 3.0, if not 3.1 and later releases as well.

> I'd rather say preparation of Spark 2.5 should be started after Spark 3.0 is officially released

The reason I'm suggesting this is that I'm already going to do the work to backport the 3.0 release features to 2.4. I've been asked by several people when DSv2 will be released, so I know there is a lot of interest in making this available sooner than 3.0. If I'm already doing the work, then I'd be happy to share that with the community.

I don't see why 2.5 and 3.0 are mutually exclusive. We can work on 2.5 while preparing the 3.0 preview and fixing bugs. For DSv2, the work is about complete, so we can easily release the same set of features and API in 2.5 and 3.0.
If we decide for some reason to wait until after 3.0 is released, I don't know that there is much value in a 2.5. The purpose is to be a step toward 3.0, and releasing that step after 3.0 doesn't seem helpful to me. It also wouldn't get these features out any sooner than 3.0, as a 2.5 release probably would, given the work needed to validate the incompatible changes in 3.0.

> the DSv2 change would be the major backward incompatibility that makes Spark 2.x users hesitate to upgrade

As I pointed out, DSv2 has been changing in the 2.x line, so this is expected. I don't think it will need incompatible changes in the 3.x line.

On Fri, Sep 20, 2019 at 9:25 PM Jungtaek Lim <kabh...@gmail.com> wrote:

Just 2 cents. I haven't tracked the changes to DSv2 (though I needed to deal with them, as the changes caused confusion on my PRs...), but my bet is that DSv2 has already changed in incompatible ways, at least for anyone who maintains a custom DataSource. Making downstream implementations diverge heavily between minor versions (say, 2.4 vs. 2.5) wouldn't be a good experience, especially since we haven't completely closed off the chance of further modifying DSv2, and those changes could be backward incompatible.

If we really want to bring the DSv2 changes to the 2.x version line so that end users aren't forced to upgrade to Spark 3.x to enjoy the new DSv2, I'd rather say preparation of Spark 2.5 should start after Spark 3.0 is officially released, honestly even later than that, say, after getting some reports from Spark 3.0 users about DSv2 so that we feel DSv2 is OK.
I hope we don't make Spark 2.5 a kind of "tech preview" that Spark 2.4 users may be frustrated to upgrade to as the next minor version.

Btw, do we have any specific target users for this? Personally, I'd expect the DSv2 change to be the major backward incompatibility that makes Spark 2.x users hesitate to upgrade, so anyone prepared to migrate to the new DSv2 might already be prepared to migrate to Spark 3.0.

On Sat, Sep 21, 2019 at 12:46 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

Do you mean you want to have a breaking API change between 3.0 and 3.1? I believe we follow Semantic Versioning (https://spark.apache.org/versioning-policy.html).

> We just won't add any breaking changes before 3.1.

Bests,
Dongjoon.

On Fri, Sep 20, 2019 at 11:48 AM Ryan Blue <rb...@netflix.com.invalid> wrote:

> I don't think we need to gate a 3.0 release on making a more stable version of InternalRow

Sounds like we agree, then. We will use it for 3.0, but there are known problems with it.

> Thinking we'd have dsv2 working in both 3.x (which will change and progress towards more stable, but will have to break certain APIs) and 2.x seems like a false premise.

Why do you think we will need to break certain APIs before 3.0?

I'm only suggesting that we release the same support in a 2.5 release that we do in 3.0. Since we are nearly finished with the 3.0 goals, it seems like we can certainly do that.
We just won't add any breaking changes before 3.1.

On Fri, Sep 20, 2019 at 11:39 AM Reynold Xin <r...@databricks.com> wrote:

I don't think we need to gate a 3.0 release on making a more stable version of InternalRow, but thinking we'd have dsv2 working in both 3.x (which will change and progress towards more stable, but will have to break certain APIs) and 2.x seems like a false premise.

To point out some problems with InternalRow that you think are already pragmatic and stable:

The class is in catalyst, which states:
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/package.scala

  /**
   * Catalyst is a library for manipulating relational query plans. All classes in catalyst are
   * considered an internal API to Spark SQL and are subject to change between minor releases.
   */

There is not even an annotation on the interface.

The entire dependency chain was created to be private and tightly coupled with internal implementations. For example:

https://github.com/apache/spark/blob/master/common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java

  /**
   * A UTF-8 String for internal Spark use.
   * <p>
   * A String encoded in UTF-8 as an Array[Byte], which can be used for comparison,
   * search, see http://en.wikipedia.org/wiki/UTF-8 for details.
   * <p>
   * Note: This is not designed for general use cases, should not be used outside SQL.
   */

https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayData.scala

(which again is in the catalyst package)

If you want to argue this way, you might as well argue we should make the entire catalyst package public to be pragmatic and not allow any changes.

On Fri, Sep 20, 2019 at 11:32 AM Ryan Blue <rb...@netflix.com> wrote:

> When you created the PR to make InternalRow public

This isn't quite accurate. The change I made was to use InternalRow instead of UnsafeRow, which is a specific implementation of InternalRow. Exposing this API has always been a part of DSv2, and while both you and I did some work to avoid this, we are still in the phase of starting with that API.

Note that any change to InternalRow would be very costly to implement because this interface is widely used. That is why I think we can certainly consider it stable enough to use here, and that's probably why UnsafeRow was part of the original proposal.

In any case, the goal for 3.0 was not to replace the use of InternalRow; it was to get the majority of SQL working on top of the interface added after 2.4. That's done and stable, so I think a 2.5 release with it is also reasonable.
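The distinction drawn above between InternalRow (the abstract interface) and UnsafeRow (one concrete implementation) can be illustrated with simplified stand-ins. These are not Spark's real classes, just a sketch of why a source written against the abstract type leaves Spark free to evolve or swap the concrete row implementations behind it:

```java
import java.util.Arrays;

// Simplified stand-in for an abstract row interface like Spark's InternalRow.
abstract class Row {
    abstract int numFields();
    abstract Object get(int ordinal);
}

// One concrete implementation; in Spark, UnsafeRow (a binary row format)
// plays this role. Other implementations can exist behind the same interface.
final class SimpleRow extends Row {
    private final Object[] values;
    SimpleRow(Object... values) { this.values = values; }
    int numFields() { return values.length; }
    Object get(int ordinal) { return values[ordinal]; }
}

public final class SourceDemo {
    // A data source written against the abstract Row type works with any implementation.
    static String describe(Row row) {
        Object[] copy = new Object[row.numFields()];
        for (int i = 0; i < copy.length; i++) copy[i] = row.get(i);
        return Arrays.toString(copy);
    }

    public static void main(String[] args) {
        System.out.println(describe(new SimpleRow(1, "a")));  // prints "[1, a]"
    }
}
```

Code written against the concrete type, by contrast, breaks whenever that one implementation changes, which is the cost of exposing UnsafeRow directly.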
On Fri, Sep 20, 2019 at 11:23 AM Reynold Xin <r...@databricks.com> wrote:

To push back: while I agree we should not drastically change InternalRow, there are a lot of changes that need to happen to make it stable. For example, none of the publicly exposed interfaces should be in the catalyst package or the unsafe package. External implementations should be decoupled from the internal implementations, with cheap ways to convert back and forth.

When you created the PR to make InternalRow public, the understanding was to work towards making it stable in the future, assuming we would start with an unstable API temporarily. You can't just make a bunch of internal APIs that are tightly coupled with other internal pieces public and stable and call it a day, just because they happen to satisfy some use cases temporarily, assuming the rest of Spark doesn't change.

On Fri, Sep 20, 2019 at 11:19 AM Ryan Blue <rb...@netflix.com> wrote:

> DSv2 is far from stable right?

No, I think it is reasonably stable and very close to being ready for a release.

> All the actual data types are unstable and you guys have completely ignored that.

I think what you're referring to is the use of `InternalRow`. That's a stable API, and there has been no work to avoid using it.
In any case, I don't think that anyone is suggesting that we delay 3.0 until a replacement for `InternalRow` is added, right?

While I understand the motivation for a better solution here, I think the pragmatic solution is to continue using `InternalRow`.

> If the goal is to make DSv2 work across 3.x and 2.x, that seems too invasive of a change to backport once you consider the parts needed to make dsv2 stable.

I believe that those of us working on DSv2 are confident about its current stability. We set goals for what to get into the 3.0 release months ago and have very nearly reached the point where we are ready for that release.

I don't think instability would be a problem in maintaining compatibility between the 2.5 version and the 3.0 version. If we find that we need to make API changes (other than additions), then we can make those in the 3.1 release. Because the goals we set for the 3.0 release have been reached with the current API, if we are ready to release 3.0, we can release a 2.5 with the same API.

On Fri, Sep 20, 2019 at 11:05 AM Reynold Xin <r...@databricks.com> wrote:

DSv2 is far from stable, right? All the actual data types are unstable and you guys have completely ignored that. We'd need to work on that, and that will be a breaking change.
If the goal is to make DSv2 work across 3.x and 2.x, that seems too invasive of a change to backport once you consider the parts needed to make dsv2 stable.

On Fri, Sep 20, 2019 at 10:47 AM Ryan Blue <rb...@netflix.com.invalid> wrote:

Hi everyone,

In the DSv2 sync this week, we talked about a possible Spark 2.5 release based on the latest Spark 2.4, but with DSv2 and Java 11 support added.

A Spark 2.5 release with these two additions would help people migrate to Spark 3.0 when it is released, because they would be able to use a single implementation for DSv2 sources that works in both 2.5 and 3.0. Similarly, upgrading to 3.0 wouldn't also require updating to Java 11, because users could update to Java 11 with the 2.5 release and have fewer major changes at once.

Another reason to consider a 2.5 release is that many people are interested in a release with the latest DSv2 API and support for DSv2 SQL. I'm already going to be backporting DSv2 support to the Spark 2.4 line, so it makes sense to share this work with the community.

This release line would just consist of backports, like DSv2 and Java 11, that assist compatibility, to keep the scope of the release small.
The purpose is to assist people moving to 3.0 and not distract from the 3.0 release.

Would a Spark 2.5 release help anyone else? Are there any concerns about this plan?

rb

--
Ryan Blue
Software Engineer
Netflix

--
Name : Jungtaek Lim
Blog : http://medium.com/@heartsavior
Twitter : http://twitter.com/heartsavior
LinkedIn : http://www.linkedin.com/in/heartsavior

--
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9
YouTube Live Streams: https://www.youtube.com/user/holdenkarau