Connecting to S3 bucket which does not seem to require a key

2017-06-08 Thread Jack Ingoldsby
Hi,
I'm trying to access the NYC Citibike S3 bucket, which seems to be publicly
available:

https://s3.amazonaws.com/tripdata/index.html
If I leave the Access Key & Secret Key empty, I get the following message

0: jdbc:drill:zk=local> !tables
Error: Failure getting metadata: Unable to load AWS credentials from any
provider in the chain (state=,code=0)

If I try entering random numbers as keys, I get the following message

Error: Failure getting metadata: Status Code: 403, AWS Service: Amazon S3,
AWS Request ID: 1C888A3A21D79F87, AWS Error Code: InvalidAccessKeyId, AWS
Error Message: The AWS Access Key Id you provided does not exist in our
records. (state=,code=0)

Is it possible to connect to a data source that does not seem to require a
key?

Thanks,
Jack
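
For reference, Drill can usually read public buckets if the S3 storage plugin
is pointed at Hadoop's anonymous credentials provider instead of access keys.
A minimal sketch of such a plugin definition (the workspace and format entries
are illustrative, not from this thread):

```
{
  "type": "file",
  "connection": "s3a://tripdata",
  "config": {
    "fs.s3a.aws.credentials.provider": "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider"
  },
  "workspaces": {
    "root": { "location": "/", "writable": false }
  },
  "formats": {
    "csv": { "type": "text", "extensions": ["csv"] }
  },
  "enabled": true
}
```

With this in place, no access or secret key is needed for buckets that allow
anonymous reads.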


Re: Accessing json fields within CSV file

2017-06-08 Thread Andries Engelbrecht
You can use convert_from and JSON data type.

0: jdbc:drill:> select t.col1, t.col2, t.conv.key1 as key1, t.conv.key2 as 
key2, t.col4 from
. . . . . . . > (select columns[0] as col1, columns[1] as col2,
convert_from(columns[2], 'JSON') as conv, columns[3] as col4 from
`/flat/psv-json/json.tbl`) t;
+-------+-------+---------+---------+-------+
| col1  | col2  |  key1   |  key2   | col4  |
+-------+-------+---------+---------+-------+
| 1     | xyz   | value1  | value2  | abc   |
+-------+-------+---------+---------+-------+




If you want to use functions like flatten, you will need to make sure the JSON
is represented as an array,
i.e. [{"key":1, "value": 1},{"key":2, "value":2}]

0: jdbc:drill:> select t.col1, t.col2, t.conv.key as key, t.conv.`value` as 
`value`, t.col4 from
. . . . . . . > (select columns[0] as col1, columns[1] as col2,
flatten((convert_from(columns[2],'JSON'))) as conv,  columns[3] as col4 from 
`/flat/psv-json/json.tbl`) t;
+-------+-------+------+--------+-------+
| col1  | col2  | key  | value  | col4  |
+-------+-------+------+--------+-------+
| 1     | xyz   | 1    | 1      | abc   |
| 1     | xyz   | 2    | 2      | abc   |
+-------+-------+------+--------+-------+
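
Tying this back to the CTAS in the original question, the same convert_from
subquery can feed a CREATE TABLE so that key1 and key2 land in their own
parquet columns. A sketch, with paths and column names taken from the post
(untested here):

```
CREATE TABLE dfs.data.`/logs/logsp/` AS
SELECT CAST(t.col1 AS INT)  AS `id`,
       t.col2               AS `name`,
       t.conv.key1          AS `key1`,
       t.conv.key2          AS `key2`,
       t.col4               AS `description`
FROM (SELECT columns[0] AS col1,
             columns[1] AS col2,
             convert_from(columns[2], 'JSON') AS conv,
             columns[3] AS col4
      FROM dfs.data.`logs/events.tbl`) t;
```

This assumes every row carries the same key1/key2 keys; rows with different
keys would need the flatten approach instead.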



--Andries




On 6/8/17, 2:22 AM, "ankit jain"  wrote:

Hi,
I have a few psv files in which some of the columns are a JSON key-value map.
Example:

> 1|xyz|{"key1":"value1", "key2":"value2"}|abc|


I am converting these files to parquet format but want to split the JSON
keys and values into separate columns. How can I do that?

end product being:
id name key1 key2 description
1 xyz value1 value2 abc

Right now I am doing something like this, but the JSON column won't explode:

CREATE TABLE dfs.data.`/logs/logsp/`  AS SELECT
> CAST(columns[0] AS INT)  `id`,
> columns[1] AS `name`,
> columns[2] AS `json_column`,
> columns[3] AS `description`
> FROM dfs.data.`logs/events.tbl`;


And this is what I get

id name json_column description
1 xyz {"key1":"value1", "key2":"value2"} abc

Thanks in advance,
Ankit Jain




Accessing json fields within CSV file

2017-06-08 Thread ankit jain
Hi,
I have a few psv files in which some of the columns are a JSON key-value map.
Example:

> 1|xyz|{"key1":"value1", "key2":"value2"}|abc|


I am converting these files to parquet format but want to split the JSON
keys and values into separate columns. How can I do that?

end product being:
id name key1 key2 description
1 xyz value1 value2 abc

Right now I am doing something like this, but the JSON column won't explode:

CREATE TABLE dfs.data.`/logs/logsp/`  AS SELECT
> CAST(columns[0] AS INT)  `id`,
> columns[1] AS `name`,
> columns[2] AS `json_column`,
> columns[3] AS `description`
> FROM dfs.data.`logs/events.tbl`;


And this is what I get

id name json_column description
1 xyz {"key1":"value1", "key2":"value2"} abc

Thanks in advance,
Ankit Jain


Re: Column aliases are ignored when Storage Plugin is enabled

2017-06-08 Thread Kunal Khatua
It could be related to these as well:

https://issues.apache.org/jira/browse/DRILL-5537

https://issues.apache.org/jira/browse/DRILL-5538


Please go ahead and file a bug. If it is related, they'll be linked and 
resolved together.


~ Kunal


From: Rahul Raj 
Sent: Thursday, June 8, 2017 12:12:47 AM
To: user@drill.apache.org
Subject: Column aliases are ignored when Storage Plugin is enabled

Drill ignores column aliases when a JDBC storage plugin is enabled.

If I execute 'select destination as x from ...some.csv', the column name
appears as 'destination' instead of 'x' while the JDBC storage plugin is
enabled. On disabling the storage plugin, Drill returns the results with
the aliased name 'x'.

This could be related to https://issues.apache.org/jira/browse/DRILL-4903,
where results return the implicit columns (fqn, filepath, etc.) as well.

Should I go ahead and raise a JIRA on this?

Regards,
Rahul

--
 This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom it is
addressed. If you are not the named addressee then you should not
disseminate, distribute or copy this e-mail. Please notify the sender
immediately and delete this e-mail from your system.


Re: CTAS to wait till the time table is created

2017-06-08 Thread Ted Dunning
On Thu, Jun 8, 2017 at 7:07 AM, Sing, Jasbir 
wrote:

> I am using the CTAS command to copy one parquet file from another. But my
> threads are not waiting for the task to complete and are moving forward. I
> want my thread to wait until my parquet file is created.
> How can I achieve this?
>

What threads?

How are you invoking the CTAS command?

Are you calling Drill via JDBC? Or what?
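
For what it's worth, a plain JDBC call is synchronous: Statement.execute()
does not return until the statement has finished, so the calling thread
waits for the CTAS automatically. A sketch (assumes the Drill JDBC driver
is on the classpath and a local Drillbit; the connection URL and table
names are illustrative):

```
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CtasWait {
    public static void main(String[] args) throws Exception {
        try (Connection conn =
                 DriverManager.getConnection("jdbc:drill:zk=local");
             Statement stmt = conn.createStatement()) {
            // execute() blocks until Drill finishes writing the new table,
            // so code after this call only runs once the files exist.
            stmt.execute(
                "CREATE TABLE dfs.tmp.`copy` AS "
                + "SELECT * FROM dfs.tmp.`source`");
        }
    }
}
```

If threads are moving on before the file appears, the CTAS is probably
being submitted asynchronously elsewhere in the application, not by JDBC
itself.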


Column aliases are ignored when Storage Plugin is enabled

2017-06-08 Thread Rahul Raj
Drill ignores column aliases when a JDBC storage plugin is enabled.

If I execute 'select destination as x from ...some.csv', the column name
appears as 'destination' instead of 'x' while the JDBC storage plugin is
enabled. On disabling the storage plugin, Drill returns the results with
the aliased name 'x'.

This could be related to https://issues.apache.org/jira/browse/DRILL-4903,
where results return the implicit columns (fqn, filepath, etc.) as well.

Should I go ahead and raise a JIRA on this?

Regards,
Rahul
