Hi All,

Splitting Commons Lang is a -1 for me. Maven already has a solution for projects that only want one class: shading. You decide what you want, done. Everyone who wants this type of slicing and dicing will have different needs: "Oh, can you add this one class to this one module?", on and on.

Splitting Commons Compress is on the to-do list, based on archive and compressor type, not individual classes. The advantage here is that all dependencies a specific module has can be required instead of optional. Similar work has been ongoing for Commons VFS. Commons Net provides FTP in a separate JAR in a non-standard (for Maven) manner, and the build's a bit of a mess because of it.

On Thu, Oct 30, 2025 at 1:50 AM Vladimir Sitnikov <[email protected]> wrote:
>
> Hi all,
>
> Following the “Branch protection rules (CTR-style)” thread,
> I’d like to spin off a separate discussion about micro-modularizing some
> Commons libraries
> to reduce CVE blast radius and dependency weight.

These are bogus reasons IMO and FUD to boot. A bug only affects code you actually use, as was mentioned several times in the other thread. A JVM only loads a class when running code reaches a reference to it. Maven allows you to specify the scope of a dependency precisely.

>
> Motivation (real-world pain):
>
> As Sebb noted, unused classes shouldn’t affect runtime, however
> vulnerability scanners flag artifacts,
> not “used classes”.

There is a lot of "project management by checklists" out there: "Run scanners, and do what they say", regardless of whether any issue can actually happen, never mind the false positives. I've actually seen bugs and inefficiencies introduced because a scanner raised a flag. I'll spare you the stories, I'm on the guilty list ;-)

> In practice teams must upgrade/patch even when only a tiny part is
> affected; proving non-impact is often harder than bumping or excluding.
>
> Mere presence of a vulnerable class on the classpath can widen attack
> surface

"can", and most times not; as usual, "it depends".

> (e.g., unsafe deserialization paths + a vulnerable helper available to the
> attacker).
>
> Recent examples show cross-bleed: projects that depend on
> commons-compress:1.25.0 saw multiple CVEs
> (CVE-2024-26308 Pack200 OOM, CVE-2024-25710 DUMP DoS) and also pulled in
> commons-lang3 where ClassUtils CVE-2025-48924 then arrives transitively.
> A modular layout like commons-pack200, commons-dump, commons-stringutils,
> commons-arrayutils, etc.,
> would let consumers pick only what they need and limit exposure.
>
> Concrete proposal (small, testable):
>
> Pilot a commons-stringutils4 artifact containing only StringUtils and
> Strings (and minimal shared internals if any).
> Use org.apache.commons.stringutils4 package so it could co-exist with the
> current commons-lang3.

-1, shade the class if that's what you want.

>
> The existing commons-lang3 could depend on commons-stringutils4 so
> lang3.StringUtils could delegate all the methods to
> stringutils4.StringUtils.

-1, what a horror.

Gary

>
> This would keep full backward compatibility for commons-lang3, and it would
> avoid code duplication.
> It would give users the ability to pull only StringUtils.
>
> Questions for the community:
>
> Are folks open to a pilot micro-module (commons-stringutils) released from
> the lang repo?
> Any hard blockers you see?
>
> Success criteria: adoption by projects that currently shade/extract
> StringUtils; fewer CVE flags for users that don’t pull the rest of lang3.
> For instance, even commons-compress runtime seems to
> require just stringutils and arrayutils.
>
> If there’s interest, I can draft a PR with commons-stringutils4.
>
> Thanks,
> Vladimir
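
P.S. On the point above that unused classes do not take part at runtime: here is a minimal Java sketch of that behavior. It relies only on the language rule that a class's static initializer runs on first active use. The class names `LazyInitDemo`, `UsedHelper`, and `UnusedHelper` are hypothetical stand-ins, not Commons classes, and the demo shows initialization only; it says nothing about what a scanner reports.

```java
import java.util.ArrayList;
import java.util.List;

public class LazyInitDemo {
    // Records events in order so the behavior can be inspected afterwards.
    static final List<String> EVENTS = new ArrayList<>();

    // Hypothetical stand-in for a class that ships in the JAR but is never called.
    static class UnusedHelper {
        static { EVENTS.add("UnusedHelper initialized"); }
        static String run() { return "unused"; }
    }

    // Hypothetical stand-in for the one class the application actually uses.
    static class UsedHelper {
        static { EVENTS.add("UsedHelper initialized"); }
        static String run() { return "used"; }
    }

    // Calls only UsedHelper; UnusedHelper is compiled in but never referenced at runtime.
    static List<String> runDemo() {
        EVENTS.add("start");
        UsedHelper.run();
        EVENTS.add("end");
        return EVENTS;
    }

    public static void main(String[] args) {
        // The static initializer of UnusedHelper never runs, so it leaves no event.
        System.out.println(runDemo());
    }
}
```

Both helper classes are present in the compiled output, yet only the one that running code touches is ever initialized. Whether that is enough for a given security review is a separate question, which is the "it depends" above.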
