You may be better off downloading the NYC bike data set locally and converting
it to Parquet.
Converting from csv.zip to Parquet will result in large improvements in
performance if you run various queries on the data set.
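
As a rough illustration of the first step (unpacking a downloaded csv.zip
archive into a local directory that Drill can read), here is a minimal Python
sketch. The function name and the paths are hypothetical, and it assumes the
archive has already been downloaded to local disk.

```python
import zipfile
from pathlib import Path


def extract_csv_members(zip_path, out_dir):
    """Extract every .csv member of zip_path into out_dir; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Skip directory entries and anything that is not a CSV file.
            if name.endswith("/") or not name.lower().endswith(".csv"):
                continue
            # Flatten any folder structure inside the archive.
            target = out / Path(name).name
            target.write_bytes(archive.read(name))
            extracted.append(target)
    return extracted
```

Once the CSV files are on local disk, one way to do the Parquet conversion
would be a Drill CREATE TABLE AS statement against the dfs plugin, since
Parquet is Drill's default output format (store.format).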

--Andries

On 6/11/17, 10:48 PM, "Abhishek Girish" <agir...@apache.org> wrote:

    Drill connects to S3 buckets (AWS) via the S3a library. And the storage
    plugin configuration requires the access & secret keys [1].
    
    I'm not sure if Drill can access S3 without the credentials. It might be
    possible via custom authenticators [2]. Hopefully others who have tried
    this will comment.
    
    
    [1] https://drill.apache.org/docs/s3-storage-plugin/
    [2] http://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html
    
    On Wed, Jun 7, 2017 at 3:02 PM, Jack Ingoldsby <jack.ingold...@gmail.com>
    wrote:
    
    > Hi,
    > I'm trying to access the NYC Citibike S3 bucket, which seems to be publicly
    > available
    >
    > https://s3.amazonaws.com/tripdata/index.html
    > If I leave the Access Key & Secret Key empty, I get the following message
    >
    > 0: jdbc:drill:zk=local> !tables
    > Error: Failure getting metadata: Unable to load AWS credentials from any
    > provider in the chain (state=,code=0)
    >
    > If I try entering random numbers as keys, I get the following message
    >
    > Error: Failure getting metadata: Status Code: 403, AWS Service: Amazon S3,
    > AWS Request ID: 1C888A3A21D79F87, AWS Error Code: InvalidAccessKeyId, AWS
    > Error Message: The AWS Access Key Id you provided does not exist in our
    > records. (state=,code=0)
    >
    > Is it possible to connect to a data source that does not seem to require a
    > key?
    >
    > Thanks,
    > Jack
    >
    
