Re: code freeze and branch cut for Apache Spark 2.4

Xingbo Jiang Wed, 01 Aug 2018 11:13:53 -0700

Speaking of the code from the hydrogen PRs, we didn't remove any of the
existing logic, and I tried my best to hide almost all of the newly added
logic behind an `isBarrier` tag (or something similar). I had to add some
new variables and new methods to the core code paths, but I think they
should not be hit if you are not running barrier workloads.
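
A minimal sketch of the gating pattern described above, written in Python
for brevity rather than taken from Spark's Scala code. The `Stage` class,
the `submit_stage` function and the step names are hypothetical; only the
idea of an `isBarrier` tag comes from this message.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stage:
    """Hypothetical stand-in for a scheduler stage; not Spark's class."""
    stage_id: int
    is_barrier: bool = False  # mirrors the `isBarrier` tag mentioned above


def submit_stage(stage: Stage) -> List[str]:
    """Return the steps taken to submit a stage (illustrative only)."""
    steps = ["resolve_parents"]
    if stage.is_barrier:
        # New logic lives only on this branch, so regular stages never reach it.
        steps.append("check_all_slots_available")
    steps.append("launch_tasks")
    return steps


regular = submit_stage(Stage(stage_id=1))
barrier = submit_stage(Stage(stage_id=2, is_barrier=True))
```

In this sketch a regular stage follows exactly the steps it would have
followed before the flag existed, which is the property I was aiming for.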


The only significant change I can think of is that I swapped the order of
failure handling in DAGScheduler, moving the `case FetchFailed` block
before the `case Resubmitted` block. Again, I don't think this should
affect a regular workload, because you can only have one failure type anyway.
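
To illustrate why the ordering should not matter, here is a minimal sketch
in Python (not Spark's Scala code). The classes are hypothetical stand-ins
that only reuse the `FetchFailed` and `Resubmitted` names, and the returned
strings are placeholders. It assumes a failure reason matches exactly one
branch, so both orderings give the same answer.

```python
class FetchFailed:
    """Hypothetical stand-in for a fetch-failure reason."""


class Resubmitted:
    """Hypothetical stand-in for a resubmitted-task reason."""


def handle_old_order(reason) -> str:
    # Original ordering: Resubmitted is checked first.
    if isinstance(reason, Resubmitted):
        return "resubmit task"
    if isinstance(reason, FetchFailed):
        return "handle fetch failure"
    return "other"


def handle_new_order(reason) -> str:
    # Swapped ordering: FetchFailed is checked first.
    if isinstance(reason, FetchFailed):
        return "handle fetch failure"
    if isinstance(reason, Resubmitted):
        return "resubmit task"
    return "other"


# The two types are unrelated, so no reason can match both branches.
reasons = [FetchFailed(), Resubmitted(), object()]
same = all(handle_old_order(r) == handle_new_order(r) for r in reasons)
```

The argument only holds while the branches stay mutually exclusive; if one
case ever became a subtype of the other, the order would start to matter.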

I also reviewed the previous PRs adding Spark on K8s support, and I feel
they are a good example of how to add new features to a project without
breaking existing workloads. I'm trying to follow that approach in adding
barrier execution mode support.

I really appreciate any attention to the hydrogen PRs and welcome comments
to help improve the feature, thanks!

2018-08-01 4:19 GMT+08:00 Reynold Xin <[email protected]>:

> I actually totally agree that we should make sure it has no impact
> on existing code if the feature is not used.
>
>
> On Tue, Jul 31, 2018 at 1:18 PM Erik Erlandson <[email protected]>
> wrote:
>
>> I don't have a comprehensive knowledge of the project hydrogen PRs,
>> however I've perused them, and they make substantial modifications to
>> Spark's core DAG scheduler code.
>>
>> What I'm wondering is: how high is the confidence level that the
>> "traditional" code paths are still stable. Put another way, is it even
>> possible to "turn off" or "opt out" of this experimental feature? This
>> analogy isn't perfect, but for example the k8s back-end is a major body of
>> code, but it has a very small impact on any *core* code paths, and so if
>> you opt out of it, it is well understood that you aren't running any
>> experimental code.
>>
>> Looking at the project hydrogen code, I'm less sure the same is true.
>> However, maybe there is a clear way to show how it is true.
>>
>>
>> On Tue, Jul 31, 2018 at 12:03 PM, Mark Hamstra <[email protected]>
>> wrote:
>>
>>> No reasonable amount of time is likely going to be sufficient to fully
>>> vet the code as a PR. I'm not entirely happy with the design and code as
>>> they currently are (and I'm still trying to find the time to more publicly
>>> express my thoughts and concerns), but I'm fine with them going into 2.4
>>> much as they are as long as they go in with proper stability annotations
>>> and are understood not to be cast-in-stone final implementations, but
>>> rather as a way to get people using them and generating the feedback that
>>> is necessary to get us to something more like a final design and
>>> implementation.
>>>
>>> On Tue, Jul 31, 2018 at 11:54 AM Erik Erlandson <[email protected]>
>>> wrote:
>>>
>>>>
>>>> Barrier mode seems like a high impact feature on Spark's core code: is
>>>> one additional week enough time to properly vet this feature?
>>>>
>>>> On Tue, Jul 31, 2018 at 7:10 AM, Joseph Torres <
>>>> [email protected]> wrote:
>>>>
>>>>> Full continuous processing aggregation support ran into unanticipated
>>>>> scalability and scheduling problems. We’re planning to overcome those by
>>>>> using some of the barrier execution machinery, but since barrier execution
>>>>> itself is still in progress the full support isn’t going to make it into
>>>>> 2.4.
>>>>>
>>>>> Jose
>>>>>
>>>>> On Tue, Jul 31, 2018 at 6:07 AM Tomasz Gawęda <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> what is the status of Continuous Processing + Aggregations? As far as I
>>>>>> remember, Jose Torres said it should be easy to perform aggregations if
>>>>>> coalesce(1) works. IIRC it's already merged to master.
>>>>>>
>>>>>> Is this work in progress? If yes, it would be great to have full
>>>>>> aggregation/join support in Spark 2.4 in CP.
>>>>>>
>>>>>> Pozdrawiam / Best regards,
>>>>>>
>>>>>> Tomek
>>>>>>
>>>>>>
>>>>>> On 2018-07-31 10:43, Petar Zečević wrote:
>>>>>> > This one is important to us:
>>>>>> > https://issues.apache.org/jira/browse/SPARK-24020 (Sort-merge join inner range optimization)
>>>>>> > but I think it could be useful to others too.
>>>>>> >
>>>>>> > It is finished and is ready to be merged (was ready a month ago at
>>>>>> least).
>>>>>> >
>>>>>> > Do you think you could consider including it in 2.4?
>>>>>> >
>>>>>> > Petar
>>>>>> >
>>>>>> >
>>>>>> > Wenchen Fan @ 1970-01-01 01:00 CET:
>>>>>> >
>>>>>> >> I went through the open JIRA tickets and here is a list that we
>>>>>> should consider for Spark 2.4:
>>>>>> >>
>>>>>> >> High Priority:
>>>>>> >> SPARK-24374: Support Barrier Execution Mode in Apache Spark
>>>>>> >> This one is critical to the Spark ecosystem for deep learning. It
>>>>>> only has a little work remaining and I think we should have it in Spark
>>>>>> 2.4.
>>>>>> >>
>>>>>> >> Middle Priority:
>>>>>> >> SPARK-23899: Built-in SQL Function Improvement
>>>>>> >> We've already added a lot of built-in functions in this release,
>>>>>> but there are a few useful higher-order functions in progress, like
>>>>>> `array_except`, `transform`, etc. It would be great if we can get them in
>>>>>> Spark 2.4.
>>>>>> >>
>>>>>> >> SPARK-14220: Build and test Spark against Scala 2.12
>>>>>> >> Very close to finishing, great to have it in Spark 2.4.
>>>>>> >>
>>>>>> >> SPARK-4502: Spark SQL reads unnecessary nested fields from Parquet
>>>>>> >> This one has been there for years (thanks for your patience Michael!),
>>>>>> and is also close to finishing. Great to have it in 2.4.
>>>>>> >>
>>>>>> >> SPARK-24882: data source v2 API improvement
>>>>>> >> This is to improve the data source v2 API based on what we learned
>>>>>> during this release. From the migration of existing sources and design of
>>>>>> new features, we found some problems in the API and want to address
>>>>>> them. I believe this should be
>>>>>> >> the last significant API change to data source v2, so great to
>>>>>> have in Spark 2.4. I'll send a discuss email about it later.
>>>>>> >>
>>>>>> >> SPARK-24252: Add catalog support in Data Source V2
>>>>>> >> This is a very important feature for data source v2, and is
>>>>>> currently being discussed in the dev list.
>>>>>> >>
>>>>>> >> SPARK-24768: Have a built-in AVRO data source implementation
>>>>>> >> Most of it is done, but date/timestamp support is still missing.
>>>>>> Great to have in 2.4.
>>>>>> >>
>>>>>> >> SPARK-23243: Shuffle+Repartition on an RDD could lead to incorrect
>>>>>> answers
>>>>>> >> This is a long-standing correctness bug, great to have in 2.4.
>>>>>> >>
>>>>>> >> There are some other important features like the adaptive
>>>>>> execution, streaming SQL, etc., not in the list, since I think we are not
>>>>>> able to finish them before 2.4.
>>>>>> >>
>>>>>> >> Feel free to add more things if you think they are important to
>>>>>> Spark 2.4 by replying to this email.
>>>>>> >>
>>>>>> >> Thanks,
>>>>>> >> Wenchen
>>>>>> >>
>>>>>> >> On Mon, Jul 30, 2018 at 11:00 PM Sean Owen <[email protected]>
>>>>>> wrote:
>>>>>> >>
>>>>>> >>   In theory releases happen on a time-based cadence, so it's
>>>>>> pretty much wrap up what's ready by the code freeze and ship it. In
>>>>>> practice, the cadence slips frequently, and it's very much a negotiation
>>>>>> about what features should push the
>>>>>> >>   code freeze out a few weeks every time. So, kind of a hybrid
>>>>>> approach here that works OK.
>>>>>> >>
>>>>>> >>   Certainly speak up if you think there's something that really
>>>>>> needs to get into 2.4. This is that discuss thread.
>>>>>> >>
>>>>>> >>   (BTW I updated the page you mention just yesterday, to reflect
>>>>>> the plan suggested in this thread.)
>>>>>> >>
>>>>>> >>   On Mon, Jul 30, 2018 at 9:51 AM Tom Graves
>>>>>> <[email protected]> wrote:
>>>>>> >>
>>>>>> >>   Shouldn't this be a discuss thread?
>>>>>> >>
>>>>>> >>   I'm also happy to see more release managers and agree the time
>>>>>> is getting close, but we should see what features are in progress and see
>>>>>> how close things are and propose a date based on that.  Cutting a branch
>>>>>> too soon just creates
>>>>>> >>   more work for committers to push to more branches.
>>>>>> >>
>>>>>> >>    http://spark.apache.org/versioning-policy.html mentioned the
>>>>>> code freeze and release branch cut mid-august.
>>>>>> >>
>>>>>> >>   Tom
>>>>>> >
>>>>>> >
>>>>>>
>>>>>>
>>>>
>>
