All,

In case others are in the same situation as I am, here is how I solved this. 
After a LOT of digging through source code, I discovered the following facts:
• Drill uses Hadoop’s FileSystem API to support S3 queries, so any 
  configuration property that works for Hadoop’s FileSystem also works if you 
  place it in Drill’s core-site.xml file.
• The hadoop-aws jar uses these classes to get credentials:
  o S3AFileSystem
  o S3AUtils
  o [Default]S3ClientFactory
• If you configure nothing, credentials are searched for in this order:
  o BasicAWSCredentialsProvider – looks for the access and secret keys in 
    core-site.xml.
  o EnvironmentVariableCredentialsProvider – looks for the access and secret 
    keys in environment variables.
  o SharedInstanceProfileCredentialsProvider – tries to get credentials from 
    the instance metadata. THIS IS THE ONE THAT CAN FIND IAM CREDENTIALS!
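
As a possible alternative to relying on the default search order, the Hadoop S3A documentation describes an fs.s3a.aws.credentials.provider property that names the provider class explicitly. The fragment below is only a sketch and was not part of the setup described in this message. It assumes that the hadoop-aws version bundled with your Drill build honours this property, which you should verify before relying on it.

```xml
<!-- Sketch only: assumes the bundled hadoop-aws supports this property. -->
<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <!-- AWS SDK provider that reads credentials from EC2 instance metadata -->
  <value>com.amazonaws.auth.InstanceProfileCredentialsProvider</value>
</property>
```

If the property is not supported, leaving the access and secret keys unset, as described in the steps in this message, should still let the default chain reach the instance profile.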

So to solve this problem I had to do these steps:
1. Make sure that core-site.xml DOES NOT set the access and secret keys.
2. Make sure that your S3 storage plugin configuration (Storage tab of the 
Apache Drill web UI) DOES NOT set the access and secret keys either.
3. In my case, I also needed server-side encryption. There is a property you 
can add to core-site.xml for that.

Here is what my core-site.xml file eventually looked like:

<configuration>
  <property>
    <name>fs.s3a.server-side-encryption-algorithm</name>
    <value>YOUR_VALUE_HERE</value>
  </property>
  <property>
    <name>fs.s3a.connection.maximum</name>
    <value>100</value>
  </property>
</configuration>
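
The original value of the encryption property is not stated here. As an illustration only, AES256 is one value the Hadoop S3A documentation lists for this property (S3-managed keys). The sketch below assumes that is the algorithm your bucket policy requires:

```xml
<!-- Sketch only: AES256 is an assumed example value, not the author's. -->
<property>
  <name>fs.s3a.server-side-encryption-algorithm</name>
  <value>AES256</value>
</property>
```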

When you query from Drill, the format should look like this:
SELECT * FROM s3.`s3a://my-bucket/drill/nation.parquet` LIMIT 3;

Also, if somebody needs to troubleshoot this, add these loggers to logback.xml:

<logger name="com.amazonaws.services.s3" additivity="false">
  <level value="trace"/>
  <appender-ref ref="FILE" />
</logger>

<logger name="org.apache.drill.exec.store.dfs" additivity="false">
  <level value="trace"/>
  <appender-ref ref="FILE" />
</logger>

Then you can see the corresponding log entries in drillbit.log.
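
Since S3AFileSystem and S3AUtils live in the org.apache.hadoop.fs.s3a package, a logger for that package may also help when tracing how credentials are resolved. This is a sketch that was not part of the original configuration. It assumes the same FILE appender as the loggers above is defined in your logback.xml.

```xml
<!-- Sketch only: assumes the FILE appender is already defined. -->
<logger name="org.apache.hadoop.fs.s3a" additivity="false">
  <level value="debug"/>
  <appender-ref ref="FILE" />
</logger>
```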

I hope this helps other people who need to use IAM roles and/or server-side 
encryption with Drill.

I also hope that somebody will update the Drill documentation to explain how to 
do this. It could have saved me a day of work!

Michael Knapp



On 4/3/17, 1:13 PM, "Knapp, Michael" <michael.kn...@capitalone.com> wrote:

    Drill Developers,
    
    I am using IAM roles on EC2 instances. Your documentation here:
    https://drill.apache.org/docs/s3-storage-plugin/
    
    instructs me to provide an access key and secret key, which I do not have 
    since I am using IAM roles.
    
    I have been reviewing the source code for a few hours now and still have 
    not found the point in the code where you connect with S3.  I was surprised 
    to find that you do not use the AWS SDK.
    
    Can you please tell me:
    
    1. Does Drill support using IAM roles to provide credentials for S3 access?
    
    2. Where in the code does Drill establish a connection with S3?
    
    Michael Knapp
    ________________________________________________________
    
    The information contained in this e-mail is confidential and/or proprietary 
to Capital One and/or its affiliates and may only be used solely in performance 
of work or services for Capital One. The information transmitted herewith is 
intended only for use by the individual or entity to which it is addressed. If 
the reader of this message is not the intended recipient, you are hereby 
notified that any review, retransmission, dissemination, distribution, copying 
or other use of, or taking of any action in reliance upon this information is 
strictly prohibited. If you have received this communication in error, please 
contact the sender and delete the material from your computer.
    

