The root cause for a case where the closure cleaner is involved is described here: https://github.com/apache/spark/pull/22004/files#r207753682 but I am also waiting for feedback from Lukas Rytz on why this even worked in 2.11. If it is something that needs a fix and can be fixed, we will fix it and add test cases for sure. I do understand the UX issue, and that is why I mentioned it in the first place; it is my concern too. Meanwhile, adoption sometimes requires changes. In the best case only the implementation changes; in the worst case the way you use something changes as well. Not to mention that this is not a common scenario that fails, and the user has options. I wouldn't say it is detrimental, but anyway. I propose we move the discussion to https://issues.apache.org/jira/browse/SPARK-25029 as that is the umbrella JIRA for this and related issues. In any case, we are looking into this and also the Janino issue.
Stavros On Mon, Aug 6, 2018 at 1:18 PM, Mridul Muralidharan <mri...@gmail.com> wrote: > > A Spark user’s expectation would be that any closure which worked in 2.11 > will continue to work in 2.12 (exhibiting the same behavior wrt functionality, > serializability, etc). > If there are behavioral changes, we will need to understand what they are > - but the expectation would be that there are minimal (if any) source changes for > users/libraries - requiring otherwise would be very detrimental to adoption. > > Do we know the root cause here? I am not sure how well we test the > corner cases in the cleaner - if this was not caught by the suite, perhaps we should > augment it ... > > Regards > Mridul > > On Mon, Aug 6, 2018 at 1:08 AM Stavros Kontopoulos <stavros.kontopoulos@ > lightbend.com> wrote: > >> The closure cleaner's initial purpose AFAIK is to clean up the dependencies >> brought in via outer pointers (a compiler side effect). With LMFs in >> Scala 2.12 there are no outer pointers, which is why in the new design >> document we kept the implementation minimal, focusing on the return >> statements (it was intentional). Also, the majority of the generated >> closures AFAIK are of type LMF. >> Regarding references in the LMF body, that was not part of the doc, since >> we expect the user not to point to non-serializable objects etc. >> In all these cases you know you are adding references you shouldn't. >> If users were used to another UX we can try to fix it; I am not sure how well >> this worked in the past, though, or whether it covered all cases. >> >> Regards, >> Stavros >> >> On Mon, Aug 6, 2018 at 8:36 AM, Mridul Muralidharan <mri...@gmail.com> >> wrote: >> >>> I agree, we should not work around the test case but rather understand >>> and fix the root cause. >>> The closure cleaner should have nulled out the references and allowed it >>> to be serialized. 
>>> >>> Regards, >>> Mridul >>> >>> On Sun, Aug 5, 2018 at 8:38 PM Wenchen Fan <cloud0...@gmail.com> wrote: >>> > >>> > It seems to me that the closure cleaner fails to clean up something. >>> The failed test case defines a serializable class inside the test case, and >>> the class doesn't refer to anything in the outer class. Ideally it could be >>> serialized after cleaning up the closure. >>> > >>> > This is a somewhat weird way to define a class, though, so I'm not sure >>> how serious the problem is. >>> > >>> > On Mon, Aug 6, 2018 at 3:41 AM Stavros Kontopoulos < >>> stavros.kontopou...@lightbend.com> wrote: >>> >> >>> >> Makes sense; I am not sure whether closure cleaning is related to the last one, >>> for example, or to the others. The last one is a bit weird, unless I am missing >>> something about the LegacyAccumulatorWrapper logic. >>> >> >>> >> Stavros >>> >> >>> >> On Sun, Aug 5, 2018 at 10:23 PM, Sean Owen <sro...@gmail.com> wrote: >>> >>> >>> >>> Yep, that's what I did. There are more failures with different >>> resolutions. I'll open a JIRA and PR and ping you, to make sure that the >>> changes are all reasonable, and not an artifact of missing something about >>> closure cleaning in 2.12. >>> >>> >>> >>> In the meantime, having a 2.12 build up and running for master will >>> help catch these things. 
>>> >>> >>> >>> On Sun, Aug 5, 2018 at 2:16 PM Stavros Kontopoulos < >>> stavros.kontopou...@lightbend.com> wrote: >>> >>>> >>> >>>> Hi Sean, >>> >>>> >>> >>>> I ran a quick build, and the failing tests seem to be: >>> >>>> >>> >>>> - SPARK-17644: After one stage is aborted for too many failed >>> attempts, subsequent stages still behave correctly on fetch failures *** >>> FAILED *** >>> >>>> A job with one fetch failure should eventually succeed >>> (DAGSchedulerSuite.scala:2422) >>> >>>> >>> >>>> >>> >>>> - LegacyAccumulatorWrapper with AccumulatorParam that has no >>> equals/hashCode *** FAILED *** >>> >>>> java.io.NotSerializableException: org.scalatest.Assertions$AssertionsHelper >>> >>>> Serialization stack: >>> >>>> - object not serializable (class: >>> >>>> org.scalatest.Assertions$AssertionsHelper, >>> value: org.scalatest.Assertions$AssertionsHelper@3bc5fc8f) >>> >>>> >>> >>>> >>> >>>> The last one can be fixed easily if you define class `MyData(val i: Int) extends Serializable` >>> outside of the test suite. For some reason outers (not removed) are capturing >>> >>>> the ScalaTest stuff in 2.12. >>> >>>> >>> >>>> Let me know if we see the same failures. >>> >>>> >>> >>>> Stavros >>> >>>> >>> >>>> On Sun, Aug 5, 2018 at 5:10 PM, Sean Owen <sro...@gmail.com> wrote: >>> >>>>> >>> >>>>> Shane et al - could we get a test job in Jenkins to test the Scala >>> 2.12 build? I don't think I have the access or expertise for it, though I >>> could probably copy and paste a job. I think we just need to clone the, >>> say, master Maven Hadoop 2.7 job, and add two steps: run >>> "./dev/change-scala-version.sh 2.12" first, then add "-Pscala-2.12" to the >>> profiles that are enabled. >>> >>>>> >>> >>>>> I can already see two test failures for the 2.12 build right now >>> and will try to fix those, but this should help verify whether the failures >>> are 'real' and detect them going forward. >>> >>>>> >>> >>>>> >>> >>>> >>> >> >>> >> >>> >> >>> >> >> >> >>
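[Editor's note] The outer-pointer behavior discussed in this thread can be reproduced without Spark or ScalaTest. The sketch below is in plain Java, since the capture mechanics are a JVM/compiler matter shared by Scala 2.12's LambdaMetafactory-compiled closures; all class names (`Suite`, `MyData`, `SerFn`) are hypothetical stand-ins, not code from the failing test. It shows why an inner class defined inside a non-serializable enclosing class fails to serialize (the hidden pointer to the enclosing instance drags it in), why moving the class to the top level fixes it, and why a lambda that captures only a local value has no such outer pointer:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class OuterPointerDemo {

    // Stand-in for a test suite: NOT serializable, like the
    // org.scalatest.Assertions$AssertionsHelper in the failure above.
    static class Suite {
        // Inner (non-static) class: the compiler adds a hidden reference to
        // the enclosing Suite instance, so serializing MyData drags in Suite.
        class MyData implements Serializable {
            final int i;
            MyData(int i) { this.i = i; }
        }

        // Serializable functional interface, analogous to an LMF closure.
        interface SerFn extends Serializable { int apply(int x); }

        // A lambda capturing only a local: no outer pointer is generated.
        SerFn fn(int k) { return x -> x + k; }
    }

    // Top-level equivalent of MyData: no hidden outer pointer.
    static class MyDataTop implements Serializable {
        final int i;
        MyDataTop(int i) { this.i = i; }
    }

    // Attempt Java serialization and report whether it succeeded.
    static boolean serializes(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Suite suite = new Suite();
        System.out.println(serializes(suite.new MyData(1)));  // false: outer Suite captured
        System.out.println(serializes(new MyDataTop(1)));     // true: no outer pointer
        System.out.println(serializes(suite.fn(10)));         // true: lambda captures only k
    }
}
```

This mirrors both observations in the thread: defining `MyData` outside the suite removes the outer pointer, and LMF-style lambdas only capture what they actually reference, which is why the 2.12 cleaner design could stay minimal for them.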