In this case the json.org classes are lurking inside the AWS JARs, so it's not swappable, and it's not immediately obvious there's a problem. You need something to scan all the JARs for forbidden .class files.
Oddly enough, Java ships with a tool to scan all the JARs for specific .class files: we call it "the classloader". Someone could write a parameter-driven test suite which attempted a getResource() of each forbidden class, failing a test if one was there. Subclass this suite into the various separate modules at the end of the DAG (hadoop-aws, hadoop-azure), and we can use JUnit to implement the work.

On 7 Nov 2016, at 19:14, Andrew Wang <andrew.w...@cloudera.com> wrote:

Have we looked into swapping in the Android cleanroom implementation of json.org? The issue with Jackson bumps is always the classpath clashes with downstream projects.

https://wiki.debian.org/qa.debian.org/jsonevil
https://android.googlesource.com/platform/libcore/+/master/json/

Maybe we need to build it ourselves, but it's still better than bumping the Jackson version.

I'm wondering if we can't just produce our own shaded derivative of the AWS jar: merge in the AWS artifacts unshaded, shade in its jackson dependency. This would let us use it in 2.7+ without worrying about jackson versions. I'd still avoid it for 2.6.x, because I doubt new versions will be compatible with Java 6; it's not worth worrying about.

I think I might give a lightning talk at Apachecon Big Data next week, "just because it's a project's right to use an incompatible version of jackson, doesn't mean it's a duty". I can reminisce fondly about the Elder Days when Xerces didn't come with the JVM; every project bundled Xerces and Xalan on the CP, but at least they were single JAR releases with stable APIs.

On Mon, Nov 7, 2016 at 10:14 AM, Steve Loughran <ste...@hortonworks.com> wrote:

https://issues.apache.org/jira/browse/HADOOP-13794: JSON.org license is now forbidden by the ASF from distribution.
Which means we can't make any Hadoop releases with the AWS SDK JARs <= 1.11.0 in them, meaning https://issues.apache.org/jira/browse/HADOOP-13050 has moved up from a minor issue to a blocker, and we are going to have to worry about the older branches.

1. The latest amazon-AWS SDKs absolutely do not work with the shipping jackson version: they even reference artifacts that don't appear until Jackson 2.3.3, and need to be on a later version than that to actually work.
2. AWS SDK updates have generally needed code changes (example: HADOOP-12269).

For 2.8.x we can increment the AWS SDK, and take this as a time to increment jackson, which an XEE vulnerability was hinting at anyway (https://issues.apache.org/jira/browse/HADOOP-12705). I know this has a risk of problems, but Sean Mackrory has done the due diligence to show that Jackson 2.7.8 doesn't break existing API use in Hadoop; after that jackson goes incompatible (again).

For branch 2.6.x we may just want to take the easy way out, and not bundle the (very dated) AWS JAR; just strip it out of the final set of artifacts to include in the project dist, and tell people that if they want to use s3a in 2.6.x (which I think people should really avoid, given it took until 2.7.1 to stabilize), then they need to manually install it.

Which leaves Hadoop 2.7.x, doesn't it? What to do? People are using s3a, it's working well, and pulling the AWS JARs is going to cause problems. But pushing up a Jackson update in a 2.7.x update is going to be traumatic.

-Steve
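
For anyone picking up the classloader idea from the top of this thread, here is a rough sketch of what such a check might look like. It is not code from Hadoop: the class name ForbiddenClassScan, the helper names and the list of forbidden classes are all made up for illustration. The only real API it relies on is ClassLoader.getResource(), which returns null when a resource is not on the classpath. A real version would presumably be a parameterized JUnit test subclassed in hadoop-aws, hadoop-azure and friends, so that each module's full dependency set is on the classpath when it runs.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: the class, method names and list below are hypothetical. */
public class ForbiddenClassScan {

  /** Assumed examples of json.org classes; extend as needed. */
  static final List<String> FORBIDDEN =
      Arrays.asList("org.json.JSONObject", "org.json.JSONArray");

  /** Convert a class name to the resource path the classloader looks up. */
  static String toResourcePath(String className) {
    return className.replace('.', '/') + ".class";
  }

  /** Return "class -> location" for every listed class the loader can see. */
  static List<String> findOnClasspath(ClassLoader loader, List<String> classNames) {
    List<String> found = new ArrayList<>();
    for (String name : classNames) {
      URL url = loader.getResource(toResourcePath(name));
      if (url != null) {
        // The URL names the JAR (or directory) that supplied the class.
        found.add(name + " -> " + url);
      }
    }
    return found;
  }

  public static void main(String[] args) {
    List<String> found =
        findOnClasspath(ForbiddenClassScan.class.getClassLoader(), FORBIDDEN);
    if (!found.isEmpty()) {
      throw new AssertionError("Forbidden classes on classpath: " + found);
    }
    System.out.println("No forbidden classes found");
  }
}
```

Because the lookup reports the URL of whatever supplied the class, a failure also tells you which JAR the copy is hiding in, which is the part that isn't obvious when the classes are bundled inside another artifact. Note that it would not catch classes that have been relocated to a different package by shading.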