The situation is this:
There is client-side encrypted data on S3, and an EMR cluster reads it
through EMRFS. The EMR client calls a custom Java class to decrypt the data,
using the envelope encryption method documented by AWS:
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-cse.html
My question is whether I can use the custom Java module that I already have
(aka EncryptionProvider) with Drill, so that I get the same kind of envelope
decryption that EMR does. Or do I have to write a completely new UDF that in
turn calls a custom Java module to decrypt this data? Apologies if my message
is confusing.
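
To make concrete what I mean by envelope decryption, here is a minimal sketch
using only the JDK's javax.crypto. It is not the EMRFS or AWS SDK code, and it
does not touch S3 or Drill. The class name EnvelopeSketch, the Envelope holder,
and the cipher choices (AESWrap for the data key, AES/CBC for the payload) are
assumptions I made up for illustration. The idea is just that each object has
its own data key, and that data key is stored next to the object, encrypted
under the master key that my module supplies.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

public class EnvelopeSketch {

    /** Hypothetical holder for what would be stored alongside one object. */
    static final class Envelope {
        final byte[] wrappedDataKey; // per-object data key, encrypted under the master key
        final byte[] iv;             // IV used for the payload cipher
        final byte[] cipherText;     // payload encrypted under the data key

        Envelope(byte[] wrappedDataKey, byte[] iv, byte[] cipherText) {
            this.wrappedDataKey = wrappedDataKey;
            this.iv = iv;
            this.cipherText = cipherText;
        }
    }

    /** Writer side: generate a fresh data key, encrypt the payload, wrap the data key. */
    static Envelope encrypt(SecretKey masterKey, byte[] plainText)
            throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        SecretKey dataKey = generator.generateKey();

        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);

        Cipher payloadCipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        payloadCipher.init(Cipher.ENCRYPT_MODE, dataKey, new IvParameterSpec(iv));
        byte[] cipherText = payloadCipher.doFinal(plainText);

        Cipher keyCipher = Cipher.getInstance("AESWrap");
        keyCipher.init(Cipher.WRAP_MODE, masterKey);
        byte[] wrappedDataKey = keyCipher.wrap(dataKey);

        return new Envelope(wrappedDataKey, iv, cipherText);
    }

    /** Reader side: unwrap the data key with the master key, then decrypt the payload. */
    static byte[] decrypt(SecretKey masterKey, Envelope envelope)
            throws GeneralSecurityException {
        Cipher keyCipher = Cipher.getInstance("AESWrap");
        keyCipher.init(Cipher.UNWRAP_MODE, masterKey);
        SecretKey dataKey = (SecretKey) keyCipher.unwrap(
                envelope.wrappedDataKey, "AES", Cipher.SECRET_KEY);

        Cipher payloadCipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        payloadCipher.init(Cipher.DECRYPT_MODE, dataKey, new IvParameterSpec(envelope.iv));
        return payloadCipher.doFinal(envelope.cipherText);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        // Stand-in for the master key that a custom provider class would return.
        SecretKey masterKey = new SecretKeySpec(new byte[16], "AES");
        byte[] original = "id,value\n1,42\n".getBytes(StandardCharsets.UTF_8);

        Envelope envelope = encrypt(masterKey, original);
        byte[] recovered = decrypt(masterKey, envelope);
        System.out.println("round trip ok: " + Arrays.equals(original, recovered));
    }
}
```

What I am after is having Drill run the equivalent of decrypt() when it reads
from S3, with the master key coming from my existing module.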
-Ganesh
> Subject: Re: Drill to query Client-side encrypted data from S3
> From: dtuc...@maprtech.com
> Date: Tue, 7 Apr 2015 14:47:39 -0700
> To: user@drill.apache.org
>
> Ganesh,
>
> When you say the keys are “custom controlled”, does that mean that only
> special logic within your Java application allows the data to be properly
> accessed? There are several mechanisms within the S3 API such that
> encryption/decryption occur transparently to the application. If your data
> is accessible in that manner, it’s likely that simply setting the correct
> properties and jar files for your Drill environment will allow your queries
> to access the data.
>
> — David
>
> On Apr 7, 2015, at 2:41 PM, Ganesha Muthuraman <mganesh...@outlook.com> wrote:
>
> > I am trying to use Drill to read from Amazon S3 where the data is
> > client-side encrypted, meaning the keys to decrypt the data are custom
> > controlled. Is there a way I can use Drill with this data, given that I
> > have a Java module that can be called to provide the master key for
> > decrypting the data on the fly?
> > My situation: a lot of our use cases would work well with the new
> > approach of S3 client-side encryption, except for using Drill to explore
> > that data. So any pointers/help here will be much appreciated.
> > Thanks!
> > -Ganesh
>