In this case the json.org classes are lurking inside the AWS JARs, so they're 
not swappable, and it's not immediately obvious there's a problem. You need 
something to scan all the JARs for forbidden .class files.

Oddly enough, Java ships with a tool to scan all the JARs for specific .class 
files; we call it "the classloader". It would be possible for someone to write 
a parameter-driven test suite which attempted a getResource() of the 
forbidden classes, failing a test if one was there. Subclass this suite in 
the various separate modules at the end of the DAG (hadoop-aws, hadoop-azure), 
and we can use JUnit to do the work.
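
A minimal sketch of that classloader check, in plain Java rather than as a 
JUnit suite so it runs standalone. The class name ForbiddenClassScan, its 
method names, and the two org.json entries are illustrative assumptions, not 
existing Hadoop code; a real suite would take the list as a parameter and 
fail a test instead of exiting.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ForbiddenClassScan {

    // Illustrative list only (assumption): extend with whatever is forbidden.
    public static final List<String> FORBIDDEN = Arrays.asList(
        "org.json.JSONObject",
        "org.json.JSONArray");

    /** Convert a class name into the resource path a classloader resolves. */
    public static String toResourceName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Return one "class -> location" entry per forbidden class found. */
    public static List<String> findForbidden(ClassLoader loader,
                                             List<String> classNames) {
        List<String> found = new ArrayList<>();
        for (String name : classNames) {
            // getResource() locates the .class file without loading or
            // initializing the class; it returns null if nothing matches.
            URL location = loader.getResource(toResourceName(name));
            if (location != null) {
                found.add(name + " -> " + location);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> found = findForbidden(
            ForbiddenClassScan.class.getClassLoader(), FORBIDDEN);
        for (String entry : found) {
            System.err.println("Forbidden class on classpath: " + entry);
        }
        // Non-zero exit stands in for a failed test assertion.
        System.exit(found.isEmpty() ? 0 : 1);
    }
}
```

Because the URL returned by getResource() names the JAR holding the class, 
the failure message also points at the artifact that needs replacing. Run it 
with the classpath of the module under test (for example that of hadoop-aws) 
so that the transitive dependencies are the ones being scanned.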



On 7 Nov 2016, at 19:14, Andrew Wang <andrew.w...@cloudera.com> wrote:

Have we looked into swapping in the Android cleanroom implementation of 
json.org? The issue with Jackson bumps is always the classpath clashes with 
downstream projects.

https://wiki.debian.org/qa.debian.org/jsonevil
https://android.googlesource.com/platform/libcore/+/master/json/

Maybe we need to build it ourselves, but it's still better than bumping the 
Jackson version.


I'm wondering if we can't just produce our own shaded derivative of the AWS 
jar: merge in the AWS artifacts unshaded, shade in its jackson dependency. This 
would let us use it in 2.7+ without worrying about jackson versions.

I'd still avoid it for 2.6.x, because I doubt new versions will be compatible 
with Java 6; it's not worth worrying about.

I think I might give a lightning talk at ApacheCon Big Data next week, "just 
because it's a project's right to use an incompatible version of jackson, 
doesn't mean it's a duty".  I can reminisce fondly about the Elder Days when 
Xerces didn't come with the JVM; every project bundled Xerces and Xalan on the 
CP, but at least they were single JAR releases with stable APIs.


On Mon, Nov 7, 2016 at 10:14 AM, Steve Loughran <ste...@hortonworks.com> wrote:

https://issues.apache.org/jira/browse/HADOOP-13794: the JSON.org license is 
now forbidden by the ASF from distribution.


Which means we can't make any Hadoop releases with the AWS SDK JARs <= 1.11.0 
in them, meaning https://issues.apache.org/jira/browse/HADOOP-13050 has moved 
up from a minor issue to a blocker, and we are going to have to worry about 
the older branches.

1. The latest Amazon AWS SDKs absolutely do not work with the shipping Jackson 
version: the SDK even references artifacts that don't appear until Jackson 
2.3.3, and needs to be on a later version than that to actually work.
2. AWS SDK updates have generally needed code changes (example: HADOOP-12269)

For 2.8.x we can increment the AWS SDK, and take this as a time to increment 
jackson, which an XEE vulnerability was hinting at anyway 
(https://issues.apache.org/jira/browse/HADOOP-12705). I know this has a risk of 
problems, but Sean Mackrory has done the due diligence to show that Jackson 
2.7.8 doesn't break existing API use in Hadoop; after that jackson goes 
incompatible (again).


For branch 2.6.x we may just want to take the easy way out, and not bundle the 
(very dated) AWS JAR; just strip it out of the final set of artifacts to 
include in the project dist, and tell people that if they want to use s3a in 
2.6.x (which I think people should really avoid, given it took until 2.7.1 to 
stabilize), then they need to manually install it.


Which leaves Hadoop 2.7.x, doesn't it? What to do? People are using s3a, it's 
working well, and pulling the AWS JARs is going to cause problems. But pushing 
up a Jackson update in a 2.7.x update is going to be traumatic.

-Steve

