For me a +1 on upgrading Jackson as well. This has been long overdue. There are some behavioural changes regarding handling null/None. This is also described in the PR: https://github.com/apache/spark/pull/21596
Also it has a positive impact on the performance. Cheers, Fokko Op vr 19 apr. 2019 om 19:16 schreef Arun Mahadevan <ar...@apache.org> > +1 to upgrade Jackson. It has come up multiple times due to CVEs and the > back port has worked out but it may be good to include if its not going to > delay the release. > > On Thu, 18 Apr 2019 at 19:53, Wenchen Fan <cloud0...@gmail.com> wrote: > >> I've cut RC1. If people think we must upgrade Jackson in 2.4, I can cut >> RC2 shortly. >> >> Thanks, >> Wenchen >> >> On Fri, Apr 19, 2019 at 3:32 AM Felix Cheung <felixcheun...@hotmail.com> >> wrote: >> >>> Re shading - same argument I’ve made earlier today in a PR... >>> >>> (Context- in many cases Spark has light or indirect dependencies but >>> bringing them into the process breaks users code easily) >>> >>> >>> ------------------------------ >>> *From:* Michael Heuer <heue...@gmail.com> >>> *Sent:* Thursday, April 18, 2019 6:41 AM >>> *To:* Reynold Xin >>> *Cc:* Sean Owen; Michael Armbrust; Ryan Blue; Spark Dev List; Wenchen >>> Fan; Xiao Li >>> *Subject:* Re: Spark 2.4.2 >>> >>> +100 >>> >>> >>> On Apr 18, 2019, at 1:48 AM, Reynold Xin <r...@databricks.com> wrote: >>> >>> We should have shaded all Spark’s dependencies :( >>> >>> On Wed, Apr 17, 2019 at 11:47 PM Sean Owen <sro...@gmail.com> wrote: >>> >>>> For users that would inherit Jackson and use it directly, or whose >>>> dependencies do. Spark itself (with modifications) should be OK with >>>> the change. >>>> It's risky and normally wouldn't backport, except that I've heard a >>>> few times about concerns about CVEs affecting Databind, so wondering >>>> who else out there might have an opinion. I'm not pushing for it >>>> necessarily. >>>> >>>> On Wed, Apr 17, 2019 at 6:18 PM Reynold Xin <r...@databricks.com> >>>> wrote: >>>> > >>>> > For Jackson - are you worrying about JSON parsing for users or >>>> internal Spark functionality breaking? >>>> > >>>> > On Wed, Apr 17, 2019 at 6:02 PM Sean Owen <sro...@gmail.com> wrote: >>>> >> >>>> >> There's only one other item on my radar, which is considering >>>> updating >>>> >> Jackson to 2.9 in branch-2.4 to get security fixes. Pros: it's come >>>> up >>>> >> a few times now that there are a number of CVEs open for 2.6.7. Cons: >>>> >> not clear they affect Spark, and Jackson 2.6->2.9 does change Jackson >>>> >> behavior non-trivially. That said back-porting the update PR to 2.4 >>>> >> worked out OK locally. Any strong opinions on this one? >>>> >> >>>> >> On Wed, Apr 17, 2019 at 7:49 PM Wenchen Fan <cloud0...@gmail.com> >>>> wrote: >>>> >> > >>>> >> > I volunteer to be the release manager for 2.4.2, as I was also >>>> going to propose 2.4.2 because of the reverting of SPARK-25250. Is there >>>> any other ongoing bug fixes we want to include in 2.4.2? If no I'd like to >>>> start the release process today (CST). >>>> >> > >>>> >> > Thanks, >>>> >> > Wenchen >>>> >> > >>>> >> > On Thu, Apr 18, 2019 at 3:44 AM Sean Owen <sro...@gmail.com> >>>> wrote: >>>> >> >> >>>> >> >> I think the 'only backport bug fixes to branches' principle >>>> remains sound. But what's a bug fix? Something that changes behavior to >>>> match what is explicitly supposed to happen, or implicitly supposed to >>>> happen -- implied by what other similar things do, by reasonable user >>>> expectations, or simply how it worked previously. >>>> >> >> >>>> >> >> Is this a bug fix? I guess the criteria that matches is that >>>> behavior doesn't match reasonable user expectations? I don't know enough to >>>> have a strong opinion. I also don't think there is currently an objection >>>> to backporting it, whatever it's called. >>>> >> >> >>>> >> >> >>>> >> >> Is the question whether this needs a new release? There's no harm >>>> in another point release, other than needing a volunteer release manager. >>>> One could say, wait a bit longer to see what more info comes in about >>>> 2.4.1. But given that 2.4.1 took like 2 months, it's reasonable to move >>>> towards a release cycle again. I don't see objection to that either (?) >>>> >> >> >>>> >> >> >>>> >> >> The meta question remains: is a 'bug fix' definition even agreed, >>>> and being consistently applied? There aren't correct answers, only best >>>> guesses from each person's own experience, judgment and priorities. These >>>> can differ even when applied in good faith. >>>> >> >> >>>> >> >> Sometimes the variance of opinion comes because people have >>>> different info that needs to be surfaced. Here, maybe it's best to share >>>> what about that offline conversation was convincing, for example. >>>> >> >> >>>> >> >> I'd say it's also important to separate what one would prefer >>>> from what one can't live with(out). Assuming one trusts the intent and >>>> experience of the handful of others with an opinion, I'd defer to someone >>>> who wants X and will own it, even if I'm moderately against it. Otherwise >>>> we'd get little done. >>>> >> >> >>>> >> >> In that light, it seems like both of the PRs at issue here are >>>> not _wrong_ to backport. This is a good pair that highlights why, when >>>> there isn't a clear reason to do / not do something (e.g. obvious errors, >>>> breaking public APIs) we give benefit-of-the-doubt in order to get it >>>> later. >>>> >> >> >>>> >> >> >>>> >> >> On Wed, Apr 17, 2019 at 12:09 PM Ryan Blue < >>>> rb...@netflix.com.invalid> wrote: >>>> >> >>> >>>> >> >>> Sorry, I should be more clear about what I'm trying to say here. >>>> >> >>> >>>> >> >>> In the past, Xiao has taken the opposite stance. A good example >>>> is PR #21060 that was a very similar situation: behavior didn't match what >>>> was expected and there was low risk. There was a long argument and the >>>> patch didn't make it into 2.3 (to my knowledge). >>>> >> >>> >>>> >> >>> What we call these low-risk behavior fixes doesn't matter. I >>>> called it a bug on #21060 but I'm applying Xiao's previous definition here >>>> to make a point. Whatever term we use, we clearly have times when we want >>>> to allow a patch because it is low risk and helps someone. Let's just be >>>> clear that that's perfectly fine. >>>> >> >>> >>>> >> >>> On Wed, Apr 17, 2019 at 9:34 AM Ryan Blue <rb...@netflix.com> >>>> wrote: >>>> >> >>>> >>>> >> >>>> How is this a bug fix? >>>> >> >>>> >>>> >> >>>> On Wed, Apr 17, 2019 at 9:30 AM Xiao Li <lix...@databricks.com> >>>> wrote: >>>> >> >>>>> >>>> >> >>>>> Michael and I had an offline discussion about this PR >>>> https://github.com/apache/spark/pull/24365. He convinced me that this >>>> is a bug fix. The code changes of this bug fix are very tiny and the risk >>>> is very low. To avoid any behavior change in the patch releases, this PR >>>> also added a legacy flag whose default value is off. >>>> >> >>>>> >>>> >> >>>> >> --------------------------------------------------------------------- >>>> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org >>>> >> >>>> >>> >>>