GitHub user steveloughran commented on the issue: https://github.com/apache/spark/pull/22081

I've just pushed up my PR, which is roughly in sync with this one; I'll close that one now and this can be the one to use.

Assume: Kinesis uses bouncy castle somewhere. There are some hints in the AWS docs. [Encrypt and Decrypt Amazon Kinesis Records Using AWS KMS](https://aws.amazon.com/blogs/big-data/encrypt-and-decrypt-amazon-kinesis-records-using-aws-kms/) covers end-to-end encryption of Kinesis records. For this you need the AWS encryption SDK, whose docs [say you need bouncy castle](https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/java.html). And it looks like the AWS encryption SDK does explicitly [depend on bouncy castle](http://mvnrepository.com/artifact/com.amazonaws/aws-encryption-sdk-java/1.3.5).

Imagine if *somehow* the removal of bouncy castle as a Java crypto provider was stopping that round trip from working, with some of the encrypt/decrypt not happening. In that case, adding bouncy castle should fix things. It worked before because jets3t in spark-core added bouncy castle, and the last bouncy-castle version update brought it in sync with Kinesis (and broke jets3t, but nobody has noticed...)

But

* There are no refs to javax.crypto, the AWS crypto libs, or calls to the class `KinesisEncryptionUtils` referenced in the blog post anywhere in the Spark Kinesis module (it's not in the latest SDKs either).
* There's no build-time dependency on the aws-sdk encryption library, which would transitively pull in the bouncy castle stuff.
* Looking through the aws-sdk-bundle: no refs to javax.crypto in the Kinesis code; encryption refs are limited to the PUT request, where you can request server-side encryption with a given KMS key.
* Nor is there any `com.aws.encryptionsdk` in that bundle, or a shaded bouncy castle (which is good, as otherwise I'd have to deal with the fact that some ASF projects were unknowingly shipping a shaded version of it).

It could just be that a strong Java crypto provider is needed, and in the absence of the unlimited Java crypto JAR in the JDK lib dir (where it's needed for Kerberos to work), bouncy-castle needs to be on the CP.

What to do?

1. You can remove jets3t independently of the bouncy castle changes, because Kinesis isn't going to be using jets3t. The aws-s3 module significantly supersedes the jets3t client's functionality, and is the only one you'd expect the other parts of the AWS SDK to pick up.
1. The bouncy-castle dependency could be upgraded to a later version in the Kinesis module(s) alone, and explicitly added to kinesis-asl.
1. Someone needs to do some experiments with what happens to the test suite with/without the full JCE and bouncy castle, maybe including more details on what's not matching up in the round trip tests.
1. Maybe include some new test which somehow explores what encryption algorithms/keys you get with/without the BC and JCE-unlimited JARs.
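
As a starting point for that last kind of experiment, here is a minimal sketch of a probe that reports what the running JVM offers. It is not taken from the PR: the class name `CryptoProbe` and its method names are hypothetical, and it assumes the bouncy castle provider class is `org.bouncycastle.jce.provider.BouncyCastleProvider`, registered under the name `BC`. It uses only standard JCA/JCE calls (`Security.getProviders`, `Security.getProvider`, `Cipher.getMaxAllowedKeyLength`).

```java
import java.security.NoSuchAlgorithmException;
import java.security.Provider;
import java.security.Security;
import java.util.ArrayList;
import java.util.List;
import javax.crypto.Cipher;

/** Hypothetical probe: reports which crypto providers and key sizes this JVM offers. */
public class CryptoProbe {

    /** Names of the JCA providers currently registered, in preference order. */
    static List<String> providerNames() {
        List<String> names = new ArrayList<>();
        for (Provider p : Security.getProviders()) {
            names.add(p.getName());
        }
        return names;
    }

    /**
     * Maximum AES key length the installed policy files allow.
     * 128 indicates the limited-strength policy; Integer.MAX_VALUE means unlimited.
     */
    static int maxAesKeyLength() throws NoSuchAlgorithmException {
        return Cipher.getMaxAllowedKeyLength("AES");
    }

    /** True if the bouncy castle provider class can be loaded from the classpath. */
    static boolean bouncyCastleOnClasspath() {
        try {
            // Assumed class name; load without initializing the class.
            Class.forName("org.bouncycastle.jce.provider.BouncyCastleProvider",
                    false, CryptoProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** True if a provider named "BC" is actually registered, not just present as a JAR. */
    static boolean bouncyCastleRegistered() {
        return Security.getProvider("BC") != null;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        System.out.println("Providers:        " + providerNames());
        System.out.println("Max AES key bits: " + maxAesKeyLength());
        System.out.println("BC on classpath:  " + bouncyCastleOnClasspath());
        System.out.println("BC registered:    " + bouncyCastleRegistered());
    }
}
```

Running this once with and once without the BC and JCE-unlimited JARs, and diffing the output, would show whether the Kinesis tests are seeing a different provider set or a capped key length. It does not by itself explain the round trip failure.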