The "malicious release manager" is an interesting attack, one that the ASF
"we trust the community" model doesn't defend against. The risk here is that
someone generates a set of malicious artifacts (maybe just publishes them
to Maven) while the source code is safe.

To help defend against this, here's some code which does a bytecode-level
diff between JARs, ignoring debug information, generated metadata, etc.
Enjoy.

https://github.com/steveloughran/auditor
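
For a rough idea of what such a comparison involves, here is a minimal
sketch. It is not the code in the repository above: it compares JAR entries
by SHA-256 digest instead of parsing bytecode, so it is coarser and will also
flag classes that differ only in debug information. The function names and
the ignored-prefix list are assumptions made for illustration.

```python
import hashlib
import zipfile

# Entries that routinely differ between two honest builds.
# Assumption: only META-INF/ is skipped here; extend for your own build.
IGNORED_PREFIXES = ("META-INF/",)


def jar_digests(path):
    """Map each non-ignored file entry in a JAR to its SHA-256 hex digest."""
    digests = {}
    with zipfile.ZipFile(path) as jar:
        for info in jar.infolist():
            if info.is_dir() or info.filename.startswith(IGNORED_PREFIXES):
                continue
            digests[info.filename] = hashlib.sha256(jar.read(info)).hexdigest()
    return digests


def compare_jars(local_jar, published_jar):
    """Return (only_local, only_published, changed) as sorted entry names."""
    local = jar_digests(local_jar)
    published = jar_digests(published_jar)
    only_local = sorted(set(local) - set(published))
    only_published = sorted(set(published) - set(local))
    changed = sorted(
        name for name in set(local) & set(published)
        if local[name] != published[name]
    )
    return only_local, only_published, changed
```

An entry that appears only in the published JAR, or a class whose digest
differs, is a starting point for a closer look rather than proof of
tampering, since compiler and JDK differences also change class files.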

This wouldn't defend against someone adding a malicious dependency to the
artifacts they publish on Maven, so really the tool should audit that too.
But you can at least check out a Spark branch, build the binaries, and then
audit the RC's artifacts against them to look for tangible variations.
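
The dependency check mentioned above could be sketched along these lines:
compare the direct dependencies declared in the locally built POM with those
in the published POM. This is a hypothetical illustration, not a feature of
the auditor tool. It assumes both POMs use the standard Maven 4.0.0 namespace,
and it does not resolve properties, parent POMs, or transitive dependencies.

```python
import xml.etree.ElementTree as ET

POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def declared_dependencies(pom_xml):
    """Return {(groupId, artifactId): version} for a POM's direct dependencies."""
    root = ET.fromstring(pom_xml)
    deps = {}
    # Only <project>/<dependencies>; <dependencyManagement> is not inspected.
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        group = dep.findtext("m:groupId", default="", namespaces=POM_NS).strip()
        artifact = dep.findtext("m:artifactId", default="", namespaces=POM_NS).strip()
        version = dep.findtext("m:version", default="", namespaces=POM_NS).strip()
        deps[(group, artifact)] = version
    return deps


def dependency_drift(local_pom_xml, published_pom_xml):
    """Return (added, removed, changed) relative to the locally built POM."""
    local = declared_dependencies(local_pom_xml)
    published = declared_dependencies(published_pom_xml)
    added = {k: v for k, v in published.items() if k not in local}
    removed = {k: v for k, v in local.items() if k not in published}
    changed = {
        k: (local[k], published[k])
        for k in local.keys() & published.keys()
        if local[k] != published[k]
    }
    return added, removed, changed
```

Anything in `added` or `changed` is a dependency the source tree did not
declare in that form, which is the case worth investigating.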

On Wed, 22 Apr 2026 at 23:53, Tian Gao via dev <[email protected]> wrote:

> So are you suggesting that we don't enforce this 1-week buffer for all
> Apache projects? I agree that a legitimate Apache project release is
> well-vetted and generally safe, but there could be situations where a
> release is maliciously executed by stealing identities of people who have
> access to make releases - that's where many supply chain attacks occur.
> Moreover, it would be more difficult to enforce this policy (whether by an
> LLM or by a human) if we treat Apache projects differently. Also, I think a
> 7-day delay in accepting an Apache project release is not a big deal for us.
>
> Regarding the Spark-related projects, we don't need to enforce the policy
> for them.
>
> I think for supply chain attacks, we are defending ourselves not only
> against package developers, but more importantly, we are defending
> ourselves against potential loopholes in the release process. We must
> assume that there could be something wrong during the release process of
> any project.
>
> Tian
>
> On Wed, Apr 22, 2026 at 3:32 PM Dongjoon Hyun <[email protected]> wrote:
>
>> To be clear, this discussion should apply to the main Apache Spark
>> repository only.
>>
>> https://github.com/apache/spark
>>
>> That's because subprojects need to consume Apache Spark releases ASAP. For
>> example, Apache Spark K8s Operator will upgrade its dependency on the same
>> day as an Apache Spark release because we trust our release process
>> (including the vote).
>>
>> In addition, we may want to extend our exceptions to include all ASF
>> project releases (Apache Hadoop, Avro, Parquet, ORC, Kafka, ...) which
>> have an established community vote process.
>>
>> Dongjoon.
>>
>> On 2026/04/22 22:21:41 Dongjoon Hyun wrote:
>> > Thank you for the suggestion.
>> >
>> > +1, the general predefined (1-week) grace-period policy sounds good to
>> > me.
>> >
>> > For the exception cases, I believe we can let the PMC members make the
>> > final decision on merge timing, just as the PMC members already decide
>> > the `Blocker`-level priority of JIRA issues.
>> >
>> > If we have a voted policy, it would be great to add it to AGENTS.md
>> > explicitly so that the policy is applied at the PR step.
>> >
>> > Best,
>> > Dongjoon.
>> >
>> > On 2026/04/22 20:47:24 Steve Loughran wrote:
>> > > 7 days is long enough to catch most (all?) malicious attacks.
>> > >
>> > > Regarding developers, there's a strong case to be made for only doing
>> > > builds and especially tests in isolated containers, even though
>> > > artifacts will leak across shared containers through a shared maven
>> > > repo. It still limits the damage malicious binaries can do.
>> > >
>> > > On Tue, 21 Apr 2026 at 23:58, Jungtaek Lim <[email protected]> wrote:
>> > >
>> > > > +1
>> > > >
>> > > > We tend to consider that merging to the master branch gives some
>> > > > time to bake before releasing. But we (Spark devs) are people who
>> > > > build Spark and run some tests against the master branch almost
>> > > > day to day. For us, there is literally no time for these library
>> > > > upgrades to be baked - we are exposed to any kind of potential CVE
>> > > > from these library upgrades.
>> > > >
>> > > > It's arguable whether we should stay up to date with the most recent
>> > > > release version of dependencies, but it'd probably be hard to reach
>> > > > consensus on that; there is a clear trade-off. The current proposal
>> > > > sounds like a good compromise to me - IMHO delaying by 2 weeks
>> > > > (14 days) seems reasonable, but a strict 1 week (7 days) is better
>> > > > than nothing if anyone is concerned that 2 weeks is too long.
>> > > >
>> > > > On Tue, Apr 21, 2026 at 9:45 PM Szehon Ho <[email protected]> wrote:
>> > > >
>> > > >> +1, makes sense to me as well. We should of course be fast for
>> > > >> security upgrades, but it makes sense to avoid such eager upgrades
>> > > >> for the rest of the hundreds of Spark dependencies, given the
>> > > >> increased supply chain attack risks in the ecosystem.
>> > > >>
>> > > >> Thanks
>> > > >> Szehon
>> > > >>
>> > > >> On Tue, Apr 21, 2026 at 3:32 AM Wenchen Fan <[email protected]> wrote:
>> > > >>
>> > > >>> Thanks for starting this discussion! I did a data analysis a while
>> > > >>> ago but didn't have time to act on it. The analysis shows:
>> > > >>>
>> > > >>> *58* maven dep upgrades in the last 3 months.
>> > > >>> *47%* (27/58) within 7 days of release
>> > > >>> ≤7d      : 27 / 58  (47%)
>> > > >>> 8d–30d   : 12 / 58  (21%)
>> > > >>> >30d     : 19 / 58  (32%)
>> > > >>>
>> > > >>> You can find the raw data in the attached file. This does look a
>> > > >>> bit aggressive. I build Spark locally every day, and I believe I'm
>> > > >>> not the only one. Having a couple of weeks as buffer time is a good
>> > > >>> idea to protect developers like me from potential supply chain
>> > > >>> attacks.
>> > > >>>
>> > > >>> On Tue, Apr 21, 2026 at 6:24 AM Hyukjin Kwon <[email protected]> wrote:
>> > > >>>
>> > > >>>> SGTM. I think it's good practice to wait a couple of weeks before
>> > > >>>> the upgrade.
>> > > >>>>
>> > > >>>> On Tue, 21 Apr 2026 at 07:13, Tian Gao via dev <[email protected]> wrote:
>> > > >>>>
>> > > >>>>> Hi, I want to start a discussion about our dependency upgrade
>> > > >>>>> policy for active development.
>> > > >>>>>
>> > > >>>>> Our current dependency upgrade process (mostly for Java, but
>> > > >>>>> Python should be included too) is a bit ad hoc. People find that
>> > > >>>>> a dependency has a new version available and we just do the
>> > > >>>>> upgrade.
>> > > >>>>>
>> > > >>>>> This raises concerns about potential supply chain attacks. We
>> > > >>>>> have already established a few sets of rules (including pinning
>> > > >>>>> the GitHub Action versions) to guard against supply chain
>> > > >>>>> attacks, but manually upgrading dependency versions too eagerly
>> > > >>>>> could also be risky.
>> > > >>>>>
>> > > >>>>> It normally takes time for a bad release to be recognized, so I
>> > > >>>>> think we should set a buffer time before upgrading to the latest
>> > > >>>>> version. For example, we can wait a week or two after the latest
>> > > >>>>> release before we set our development dependency to it. This
>> > > >>>>> could reduce the possibility of being impacted by malicious
>> > > >>>>> releases, or just give the maintainers enough time to fix their
>> > > >>>>> own severe bugs.
>> > > >>>>>
>> > > >>>>> The cost of this policy is very low - it barely impacts us if we
>> > > >>>>> can’t use the “latest” version of dependencies.
>> > > >>>>>
>> > > >>>>> Of course, there should be exceptions when dependency upgrades
>> > > >>>>> include security fixes for known vulnerabilities; in those cases
>> > > >>>>> we should upgrade as fast as possible.
>> > > >>>>>
>> > > >>>>> Tian
>> > > >>>>>
>> > > >>>>
>> > > >>>
>> ---------------------------------------------------------------------
>> > > >>> To unsubscribe e-mail: [email protected]
>> > > >>
>> > > >>
>> > >
