[jira] [Commented] (DRILL-4877) max(dir0), max(dir1) query against parquet data slower by 2X

2016-09-30 Thread Khurram Faraaz (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15535287#comment-15535287
 ] 

Khurram Faraaz commented on DRILL-4877:
---

 The test was run on Drill 1.9.0 (git commit ID: f3c26e34) on a 4-node CentOS 
cluster. Before running the query below, the REFRESH TABLE METADATA command 
was run on 1.9.0 against the table used in the query.

{noformat}
Query => select max(dir0), max(dir1), max(dir2) from `DRILL_4589`;

Execution times:
Run 1 : 58.57 seconds
Run 2 : 53.54 seconds
Run 3 : 49.05 seconds
Run 4 : 43.51 seconds
{noformat}

[~dgu-atmapr] Can you please run the test on the latest 1.9.0 master on the 
performance cluster and confirm the numbers before we mark this JIRA as 
verified and closed?

The nproc command returned 24 on each of the 4 nodes in the cluster.

> max(dir0), max(dir1) query against parquet data slower by 2X
> 
>
> Key: DRILL-4877
> URL: https://issues.apache.org/jira/browse/DRILL-4877
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.9.0
> Environment: 4 node cluster centos
>Reporter: Khurram Faraaz
>Assignee: Aman Sinha
>Priority: Critical
> Fix For: 1.9.0
>
>
> max(dir0), max(dir1) query against parquet data slower by 2X
> test was run with meta data cache on both 1.7.0 and 1.9.0
> there is a difference in query plan and also execution time on 1.9.0 is close 
> to 2X that on 1.7.0 
> Test from Drill 1.9.0 git commit id: 28d315bb
> on 4 node Centos cluster
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> select max(dir0), max(dir1), max(dir2) from 
> `DRILL_4589`;
> +---------+---------+---------+
> | EXPR$0  | EXPR$1  | EXPR$2  |
> +---------+---------+---------+
> | 2015    | Q4      | null    |
> +---------+---------+---------+
> 1 row selected (70.644 seconds)
> {noformat}
> Query plan for the above query. Note that in Drill 1.9.0 usedMetadataFile is 
> not available in the query plan text.
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select max(dir0), max(dir1), 
> max(dir2) from `DRILL_4589`;
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2])
> 00-02        StreamAgg(group=[{}], EXPR$0=[MAX($0)], EXPR$1=[MAX($1)], EXPR$2=[MAX($2)])
> 00-03          UnionExchange
> 01-01            StreamAgg(group=[{}], EXPR$0=[MAX($0)], EXPR$1=[MAX($1)], EXPR$2=[MAX($2)])
> 01-02              Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=/tmp/DRILL_4589/1990/Q1/f672.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2011/Q4/f162.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2000/Q2/f1101.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1996/Q2/f110.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2006/Q3/f1192.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1999/Q2/f174.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2006/Q4/f885.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2001/Q3/f1720.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2001/Q1/f1779.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1991/Q2/f629.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2003/Q4/f821.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2015/Q3/f896.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2002/Q2/f1458.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2004/Q4/f1756.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2001/Q2/f1490.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2003/Q3/f1137.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2013/Q1/f561.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1990/Q3/f1562.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2003/Q1/f1445.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2006/Q1/f236.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1992/Q4/f1209.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2014/Q2/f518.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1993/Q4/f1598.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2008/Q1/f780.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1999/Q1/f1763.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1990/Q4/f381.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1990/Q1/f1870.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2014/Q1/f915.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2001/Q2/f673.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1998/Q1/f736.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2013/Q2/f749.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2007/Q3/f111.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1993/Q3/f776.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2002/Q1/f403.parquet], 
> ReadEn

[jira] [Updated] (DRILL-4373) Drill and Hive have incompatible timestamp representations in parquet

2016-09-30 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-4373:

Labels: doc-impacting  (was: )

> Drill and Hive have incompatible timestamp representations in parquet
> -
>
> Key: DRILL-4373
> URL: https://issues.apache.org/jira/browse/DRILL-4373
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive, Storage - Parquet
>Reporter: Rahul Challapalli
>Assignee: Vitalii Diravka
>  Labels: doc-impacting
> Fix For: 1.9.0
>
>
> git.commit.id.abbrev=83d460c
> I created a parquet file with a timestamp type using Drill. Now if I define a 
> hive table on top of the parquet file and use "timestamp" as the column type, 
> drill fails to read the hive table through the hive storage plugin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4373) Drill and Hive have incompatible timestamp representations in parquet

2016-09-30 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-4373:

Fix Version/s: 1.9.0

> Drill and Hive have incompatible timestamp representations in parquet
> -
>
> Key: DRILL-4373
> URL: https://issues.apache.org/jira/browse/DRILL-4373
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive, Storage - Parquet
>Reporter: Rahul Challapalli
>Assignee: Vitalii Diravka
>  Labels: doc-impacting
> Fix For: 1.9.0
>
>
> git.commit.id.abbrev=83d460c
> I created a parquet file with a timestamp type using Drill. Now if I define a 
> hive table on top of the parquet file and use "timestamp" as the column type, 
> drill fails to read the hive table through the hive storage plugin





[jira] [Commented] (DRILL-4373) Drill and Hive have incompatible timestamp representations in parquet

2016-09-30 Thread Vitalii Diravka (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15535895#comment-15535895
 ] 

Vitalii Diravka commented on DRILL-4373:


I added an int96-to-timestamp converter for both parquet readers; it is 
controlled by the system/session option "store.parquet.int96_as_timestamp". 
The option is false by default so that old query scripts that use the 
"convert_from TIMESTAMP_IMPALA" function keep working. 

When the option is true, that function is unnecessary and can cause the 
query to fail. 
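
For readers unfamiliar with the format: INT96 timestamps as written by Hive and 
Impala are commonly described as 12 bytes, eight bytes of nanoseconds within 
the day followed by a four-byte Julian day number, both little-endian. The 
Python sketch below decodes one value under that assumption. It is only an 
illustration and not the converter added in the patch; the function name and 
the sample bytes are hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

# Julian day number that corresponds to 1970-01-01 (the Unix epoch).
JULIAN_DAY_OF_UNIX_EPOCH = 2440588


def int96_to_datetime(raw: bytes) -> datetime:
    """Decode a 12-byte INT96 value (assumed layout: nanos of day, Julian day)."""
    if len(raw) != 12:
        raise ValueError("INT96 values are exactly 12 bytes long")
    # "<qi" = little-endian signed 64-bit integer followed by signed 32-bit integer.
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days_since_epoch = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # datetime has microsecond resolution, so sub-microsecond digits are dropped.
    return epoch + timedelta(days=days_since_epoch,
                             microseconds=nanos_of_day // 1000)


# Hypothetical sample: one hour past midnight on the Unix epoch day.
sample = struct.pack("<qi", 3_600 * 10**9, JULIAN_DAY_OF_UNIX_EPOCH)
print(int96_to_datetime(sample).isoformat())
```

The sketch interprets the value as UTC; whether a given writer stored local 
time or UTC is a separate question that it does not address.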


> Drill and Hive have incompatible timestamp representations in parquet
> -
>
> Key: DRILL-4373
> URL: https://issues.apache.org/jira/browse/DRILL-4373
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive, Storage - Parquet
>Reporter: Rahul Challapalli
>Assignee: Vitalii Diravka
>  Labels: doc-impacting
> Fix For: 1.9.0
>
>
> git.commit.id.abbrev=83d460c
> I created a parquet file with a timestamp type using Drill. Now if I define a 
> hive table on top of the parquet file and use "timestamp" as the column type, 
> drill fails to read the hive table through the hive storage plugin





[jira] [Updated] (DRILL-4373) Drill and Hive have incompatible timestamp representations in parquet

2016-09-30 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-4373:

Issue Type: Improvement  (was: Bug)

> Drill and Hive have incompatible timestamp representations in parquet
> -
>
> Key: DRILL-4373
> URL: https://issues.apache.org/jira/browse/DRILL-4373
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Hive, Storage - Parquet
>Affects Versions: 1.8.0
>Reporter: Rahul Challapalli
>Assignee: Vitalii Diravka
>  Labels: doc-impacting
> Fix For: 1.9.0
>
>
> git.commit.id.abbrev=83d460c
> I created a parquet file with a timestamp type using Drill. Now if I define a 
> hive table on top of the parquet file and use "timestamp" as the column type, 
> drill fails to read the hive table through the hive storage plugin





[jira] [Updated] (DRILL-4373) Drill and Hive have incompatible timestamp representations in parquet

2016-09-30 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-4373:

Affects Version/s: 1.8.0

> Drill and Hive have incompatible timestamp representations in parquet
> -
>
> Key: DRILL-4373
> URL: https://issues.apache.org/jira/browse/DRILL-4373
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive, Storage - Parquet
>Affects Versions: 1.8.0
>Reporter: Rahul Challapalli
>Assignee: Vitalii Diravka
>  Labels: doc-impacting
> Fix For: 1.9.0
>
>
> git.commit.id.abbrev=83d460c
> I created a parquet file with a timestamp type using Drill. Now if I define a 
> hive table on top of the parquet file and use "timestamp" as the column type, 
> drill fails to read the hive table through the hive storage plugin





[jira] [Created] (DRILL-4919) select count(1) on csv with header no longer works

2016-09-30 Thread JIRA
F Méthot created DRILL-4919:
---

 Summary: select count(1) on csv with header no longer works
 Key: DRILL-4919
 URL: https://issues.apache.org/jira/browse/DRILL-4919
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.8.0
Reporter: F Méthot
Priority: Minor


Dataset (I used extended char for display purpose) test.csvh:

a,b,c,d\n
1,2,3,4\n
5,6,7,8\n

Storage config:
"csvh": {
  "type": "text",
  "extensions" : [
  "csvh"
   ],
   "extractHeader": true,
   "delimiter": ","
  }

select count(1) from dfs.`test.csvh`

Error: UNSUPPORTED_OPERATION ERROR: With extractHeader enabled, only header 
names are supported
coumn name columns
column index
Fragment 0:0
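
The report does not state the expected result. Presumably, with the header row 
extracted, the count over this dataset should be 2. The Python sketch below 
uses the standard csv module, not Drill, purely to illustrate that expectation 
for the three-line file shown above.

```python
import csv
import io

# The dataset from the report; the first line is the header.
data = "a,b,c,d\n1,2,3,4\n5,6,7,8\n"

# DictReader consumes the first line as column names, which is analogous to
# (but not the same implementation as) extractHeader = true.
reader = csv.DictReader(io.StringIO(data), delimiter=",")
row_count = sum(1 for _ in reader)

print(reader.fieldnames, row_count)
```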








[jira] [Updated] (DRILL-4919) select count(1) on csv with header no longer works

2016-09-30 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/DRILL-4919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

F Méthot updated DRILL-4919:

Description: 
This is new in  1.8

Dataset (I used extended char for display purpose) test.csvh:

a,b,c,d\n
1,2,3,4\n
5,6,7,8\n

Storage config:
"csvh": {
  "type": "text",
  "extensions" : [
  "csvh"
   ],
   "extractHeader": true,
   "delimiter": ","
  }

select count(1) from dfs.`test.csvh`

Error: UNSUPPORTED_OPERATION ERROR: With extractHeader enabled, only header 
names are supported
coumn name columns
column index
Fragment 0:0




  was:
Dataset (I used extended char for display purpose) test.csvh:

a,b,c,d\n
1,2,3,4\n
5,6,7,8\n

Storage config:
"csvh": {
  "type": "text",
  "extensions" : [
  "csvh"
   ],
   "extractHeader": true,
   "delimiter": ","
  }

select count(1) from dfs.`test.csvh`

Error: UNSUPPORTED_OPERATION ERROR: With extractHeader enabled, only header 
names are supported
coumn name columns
column index
Fragment 0:0





> select count(1) on csv with header no longer works
> --
>
> Key: DRILL-4919
> URL: https://issues.apache.org/jira/browse/DRILL-4919
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.8.0
>Reporter: F Méthot
>Priority: Minor
>
> This is new in  1.8
> Dataset (I used extended char for display purpose) test.csvh:
> a,b,c,d\n
> 1,2,3,4\n
> 5,6,7,8\n
> Storage config:
> "csvh": {
>   "type": "text",
>   "extensions" : [
>   "csvh"
>],
>"extractHeader": true,
>"delimiter": ","
>   }
> select count(1) from dfs.`test.csvh`
> Error: UNSUPPORTED_OPERATION ERROR: With extractHeader enabled, only header 
> names are supported
> coumn name columns
> column index
> Fragment 0:0





[jira] [Updated] (DRILL-4919) select count(1) on csv with header no longer works

2016-09-30 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/DRILL-4919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

F Méthot updated DRILL-4919:

Description: 
This happens since  1.8

Dataset (I used extended char for display purpose) test.csvh:

a,b,c,d\n
1,2,3,4\n
5,6,7,8\n

Storage config:
"csvh": {
  "type": "text",
  "extensions" : [
  "csvh"
   ],
   "extractHeader": true,
   "delimiter": ","
  }

select count(1) from dfs.`test.csvh`

Error: UNSUPPORTED_OPERATION ERROR: With extractHeader enabled, only header 
names are supported
coumn name columns
column index
Fragment 0:0




  was:
This is new in  1.8

Dataset (I used extended char for display purpose) test.csvh:

a,b,c,d\n
1,2,3,4\n
5,6,7,8\n

Storage config:
"csvh": {
  "type": "text",
  "extensions" : [
  "csvh"
   ],
   "extractHeader": true,
   "delimiter": ","
  }

select count(1) from dfs.`test.csvh`

Error: UNSUPPORTED_OPERATION ERROR: With extractHeader enabled, only header 
names are supported
coumn name columns
column index
Fragment 0:0





> select count(1) on csv with header no longer works
> --
>
> Key: DRILL-4919
> URL: https://issues.apache.org/jira/browse/DRILL-4919
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.8.0
>Reporter: F Méthot
>Priority: Minor
>
> This happens since  1.8
> Dataset (I used extended char for display purpose) test.csvh:
> a,b,c,d\n
> 1,2,3,4\n
> 5,6,7,8\n
> Storage config:
> "csvh": {
>   "type": "text",
>   "extensions" : [
>   "csvh"
>],
>"extractHeader": true,
>"delimiter": ","
>   }
> select count(1) from dfs.`test.csvh`
> Error: UNSUPPORTED_OPERATION ERROR: With extractHeader enabled, only header 
> names are supported
> coumn name columns
> column index
> Fragment 0:0





[jira] [Updated] (DRILL-4203) Parquet File : Date is stored wrongly

2016-09-30 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-4203:

Fix Version/s: 1.9.0

> Parquet File : Date is stored wrongly
> -
>
> Key: DRILL-4203
> URL: https://issues.apache.org/jira/browse/DRILL-4203
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.4.0
>Reporter: Stéphane Trou
>Assignee: Vitalii Diravka
>Priority: Critical
> Fix For: 1.9.0
>
>
> Hello,
> I have some problems when I try to read parquet files produced by Drill with 
> Spark: all dates are corrupted.
> I think the problem comes from Drill :)
> {code}
> cat /tmp/date_parquet.csv 
> Epoch,1970-01-01
> {code}
> {code}
> 0: jdbc:drill:zk=local> select columns[0] as name, cast(columns[1] as date) 
> as epoch_date from dfs.tmp.`date_parquet.csv`;
> +--------+-------------+
> |  name  | epoch_date  |
> +--------+-------------+
> | Epoch  | 1970-01-01  |
> +--------+-------------+
> {code}
> {code}
> 0: jdbc:drill:zk=local> create table dfs.tmp.`buggy_parquet`as select 
> columns[0] as name, cast(columns[1] as date) as epoch_date from 
> dfs.tmp.`date_parquet.csv`;
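> +-----------+----------------------------+
> | Fragment  | Number of records written  |
> +-----------+----------------------------+
> | 0_0       | 1                          |
> +-----------+----------------------------+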
> {code}
> When I read the file with parquet tools, I found:
> {code}
> java -jar parquet-tools-1.8.1.jar head /tmp/buggy_parquet/
> name = Epoch
> epoch_date = 4881176
> {code}
> According to 
> [https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md#date], 
> epoch_date should be equal to 0.
> Meta : 
> {code}
> java -jar parquet-tools-1.8.1.jar meta /tmp/buggy_parquet/
> file:file:/tmp/buggy_parquet/0_0_0.parquet 
> creator: parquet-mr version 1.8.1-drill-r0 (build 
> 6b605a4ea05b66e1a6bf843353abcb4834a4ced8) 
> extra:   drill.version = 1.4.0 
> file schema: root 
> 
> name:OPTIONAL BINARY O:UTF8 R:0 D:1
> epoch_date:  OPTIONAL INT32 O:DATE R:0 D:1
> row group 1: RC:1 TS:93 OFFSET:4 
> 
> name: BINARY SNAPPY DO:0 FPO:4 SZ:52/50/0,96 VC:1 
> ENC:RLE,BIT_PACKED,PLAIN
> epoch_date:   INT32 SNAPPY DO:0 FPO:56 SZ:45/43/0,96 VC:1 
> ENC:RLE,BIT_PACKED,PLAIN
> {code}
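
An observation on the number in the report above: 4881176 is exactly twice 
2440588, the Julian day number of 1970-01-01. The Python sketch below checks 
that arithmetic and shows that subtracting this shift yields the expected value 
of 0. Treating the shift as the cause of the corruption is an assumption made 
for illustration; the report itself does not say so.

```python
from datetime import date, timedelta

# Julian day number of 1970-01-01 (the Unix epoch).
JULIAN_DAY_OF_UNIX_EPOCH = 2440588

# epoch_date as printed by parquet-tools in the report.
stored_value = 4881176

# Assumed shift: twice the Julian day number of the epoch.
shift = 2 * JULIAN_DAY_OF_UNIX_EPOCH

# The Parquet DATE type counts days since the Unix epoch.
corrected_days = stored_value - shift
corrected_date = date(1970, 1, 1) + timedelta(days=corrected_days)

print(shift, corrected_days, corrected_date)
```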





[jira] [Created] (DRILL-4920) Connect to json file on web (http/https)

2016-09-30 Thread Michael Rans (JIRA)
Michael Rans created DRILL-4920:
---

 Summary: Connect to json file on web (http/https)
 Key: DRILL-4920
 URL: https://issues.apache.org/jira/browse/DRILL-4920
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - JSON
Affects Versions: 1.8.0
Reporter: Michael Rans


I have not been able to set up Drill to connect to a JSON file at url:
https://data.humdata.org/api/3/action/current_package_list_with_resources?limit=1

I can connect to files locally.

It is not clear to me from the documentation whether or not this feature 
exists. If it doesn't, it should.





[jira] [Updated] (DRILL-4539) Add support for Null Equality Joins

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong updated DRILL-4539:

Assignee: Roman  (was: Venki Korukanti)

> Add support for Null Equality Joins
> ---
>
> Key: DRILL-4539
> URL: https://issues.apache.org/jira/browse/DRILL-4539
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Jacques Nadeau
>Assignee: Roman
>
> Tableau frequently generates queries similar to this:
> {code}
> SELECT `t0`.`city` AS `city`,
>   `t2`.`X_measure__B` AS `max_Calculation_DFIDBHHAIIECCJFDAG_ok`,
>   `t0`.`state` AS `state`,
>   `t0`.`sum_stars_ok` AS `sum_stars_ok`
> FROM (
>   SELECT `business`.`city` AS `city`,
> `business`.`state` AS `state`,
> SUM(`business`.`stars`) AS `sum_stars_ok`
>   FROM `mongo.academic`.`business` `business`
>   GROUP BY `business`.`city`,
> `business`.`state`
> ) `t0`
>   INNER JOIN (
>   SELECT MAX(`t1`.`X_measure__A`) AS `X_measure__B`,
> `t1`.`city` AS `city`,
> `t1`.`state` AS `state`
>   FROM (
> SELECT `business`.`city` AS `city`,
>   `business`.`state` AS `state`,
>   `business`.`business_id` AS `business_id`,
>   SUM(`business`.`stars`) AS `X_measure__A`
> FROM `mongo.academic`.`business` `business`
> GROUP BY `business`.`city`,
>   `business`.`state`,
>   `business`.`business_id`
>   ) `t1`
>   GROUP BY `t1`.`city`,
> `t1`.`state`
> ) `t2` ON (((`t0`.`city` = `t2`.`city`) OR ((`t0`.`city` IS NULL) AND 
> (`t2`.`city` IS NULL))) AND ((`t0`.`state` = `t2`.`state`) OR ((`t0`.`state` 
> IS NULL) AND (`t2`.`state` IS NULL))))
> {code}
> If you look at the join condition, you'll note that the join condition is an 
> equality condition which also allows null=null. We should add a planning 
> rewrite rule and execution join option to allow null equality so that we 
> don't treat this as a cartesian join.
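
To make the idea concrete, here is a minimal Python sketch of a hash join whose 
key comparison treats null as equal to null, so the condition above can be 
evaluated as an equi-join instead of a cartesian product. It is not Drill's 
planner rule or join operator; the function name and the sample rows are 
hypothetical.

```python
from collections import defaultdict


def null_safe_hash_join(left, right, keys):
    """Inner-join two lists of dicts, treating None == None as a key match."""
    # Build side: index the right rows by their key tuple. Python tuples that
    # contain None hash and compare equal, which gives null-safe equality.
    index = defaultdict(list)
    for row in right:
        index[tuple(row[k] for k in keys)].append(row)

    # Probe side: look up each left row's key tuple in the index.
    joined = []
    for row in left:
        for match in index.get(tuple(row[k] for k in keys), []):
            joined.append((row, match))
    return joined


# Hypothetical rows shaped like the t0 and t2 subqueries above.
t0 = [
    {"city": "Phoenix", "state": "AZ", "sum_stars_ok": 10},
    {"city": None, "state": "AZ", "sum_stars_ok": 3},
]
t2 = [
    {"city": "Phoenix", "state": "AZ", "X_measure__B": 5},
    {"city": None, "state": "AZ", "X_measure__B": 2},
    {"city": None, "state": "NV", "X_measure__B": 1},
]

pairs = null_safe_hash_join(t0, t2, ["city", "state"])
print(len(pairs))
```

Each left row is probed once against the index, so the work grows with the 
number of rows plus the number of matches rather than with the product of the 
two inputs.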





[jira] [Commented] (DRILL-4373) Drill and Hive have incompatible timestamp representations in parquet

2016-09-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15536285#comment-15536285
 ] 

ASF GitHub Bot commented on DRILL-4373:
---

GitHub user vdiravka opened a pull request:

https://github.com/apache/drill/pull/600

DRILL-4373: Drill and Hive have incompatible timestamp representations in 
parquet

- added sys/sess option "store.parquet.int96_as_timestamp";
- added int96 to timestamp converter for both readers;
- added unit tests;

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vdiravka/drill DRILL-4373

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/600.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #600


commit 0d768c42f7c732360cafcacc91e29b67ae44fca4
Author: Vitalii Diravka 
Date:   2016-09-02T21:43:50Z

DRILL-4373: Drill and Hive have incompatible timestamp representations in 
parquet
- added sys/sess option "store.parquet.int96_as_timestamp";
- added int96 to timestamp converter for both readers;
- added unit tests;




> Drill and Hive have incompatible timestamp representations in parquet
> -
>
> Key: DRILL-4373
> URL: https://issues.apache.org/jira/browse/DRILL-4373
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Hive, Storage - Parquet
>Affects Versions: 1.8.0
>Reporter: Rahul Challapalli
>Assignee: Vitalii Diravka
>  Labels: doc-impacting
> Fix For: 1.9.0
>
>
> git.commit.id.abbrev=83d460c
> I created a parquet file with a timestamp type using Drill. Now if I define a 
> hive table on top of the parquet file and use "timestamp" as the column type, 
> drill fails to read the hive table through the hive storage plugin





[jira] [Commented] (DRILL-4877) max(dir0), max(dir1) query against parquet data slower by 2X

2016-09-30 Thread Dechang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15536289#comment-15536289
 ] 

Dechang Gu commented on DRILL-4877:
---

I ran the query on the latest 1.9.0 (git ID: 2295715). The query times for 3 
runs were:
Run 1: TOTAL TIME : 33291 msec
Run 2: TOTAL TIME : 24709 msec
Run 3: TOTAL TIME : 26906 msec

These show no regression.

[~khfaraaz] So the JIRA is verified and can be closed.
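
As a quick summary of the three runs above, the Python sketch below computes 
their mean, minimum and maximum. Only the three timings come from the comment; 
the variable names are illustrative, and the snippet draws no conclusion beyond 
the arithmetic.

```python
from statistics import mean

# TOTAL TIME values (msec) from the three runs reported above.
run_times_msec = [33291, 24709, 26906]

average_msec = mean(run_times_msec)
fastest_msec = min(run_times_msec)
slowest_msec = max(run_times_msec)

print(f"mean {average_msec / 1000:.2f} s, "
      f"min {fastest_msec / 1000:.2f} s, "
      f"max {slowest_msec / 1000:.2f} s")
```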

> max(dir0), max(dir1) query against parquet data slower by 2X
> 
>
> Key: DRILL-4877
> URL: https://issues.apache.org/jira/browse/DRILL-4877
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.9.0
> Environment: 4 node cluster centos
>Reporter: Khurram Faraaz
>Assignee: Aman Sinha
>Priority: Critical
> Fix For: 1.9.0
>
>
> max(dir0), max(dir1) query against parquet data slower by 2X
> test was run with meta data cache on both 1.7.0 and 1.9.0
> there is a difference in query plan and also execution time on 1.9.0 is close 
> to 2X that on 1.7.0 
> Test from Drill 1.9.0 git commit id: 28d315bb
> on 4 node Centos cluster
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> select max(dir0), max(dir1), max(dir2) from 
> `DRILL_4589`;
> +---------+---------+---------+
> | EXPR$0  | EXPR$1  | EXPR$2  |
> +---------+---------+---------+
> | 2015    | Q4      | null    |
> +---------+---------+---------+
> 1 row selected (70.644 seconds)
> {noformat}
> Query plan for the above query. Note that in Drill 1.9.0 usedMetadataFile is 
> not available in the query plan text.
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select max(dir0), max(dir1), 
> max(dir2) from `DRILL_4589`;
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2])
> 00-02        StreamAgg(group=[{}], EXPR$0=[MAX($0)], EXPR$1=[MAX($1)], EXPR$2=[MAX($2)])
> 00-03          UnionExchange
> 01-01            StreamAgg(group=[{}], EXPR$0=[MAX($0)], EXPR$1=[MAX($1)], EXPR$2=[MAX($2)])
> 01-02              Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=/tmp/DRILL_4589/1990/Q1/f672.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2011/Q4/f162.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2000/Q2/f1101.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1996/Q2/f110.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2006/Q3/f1192.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1999/Q2/f174.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2006/Q4/f885.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2001/Q3/f1720.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2001/Q1/f1779.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1991/Q2/f629.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2003/Q4/f821.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2015/Q3/f896.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2002/Q2/f1458.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2004/Q4/f1756.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2001/Q2/f1490.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2003/Q3/f1137.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2013/Q1/f561.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1990/Q3/f1562.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2003/Q1/f1445.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2006/Q1/f236.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1992/Q4/f1209.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2014/Q2/f518.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1993/Q4/f1598.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2008/Q1/f780.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1999/Q1/f1763.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1990/Q4/f381.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1990/Q1/f1870.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2014/Q1/f915.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2001/Q2/f673.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1998/Q1/f736.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2013/Q2/f749.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2007/Q3/f111.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1993/Q3/f776.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2002/Q1/f403.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2005/Q2/f904.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2000/Q4/f944.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1994/Q2/f506.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1994/Q4/f612.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1991/Q1/f1838.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_458

[jira] [Closed] (DRILL-4877) max(dir0), max(dir1) query against parquet data slower by 2X

2016-09-30 Thread Khurram Faraaz (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz closed DRILL-4877.
-

Verified and test added to performance tests.

> max(dir0), max(dir1) query against parquet data slower by 2X
> 
>
> Key: DRILL-4877
> URL: https://issues.apache.org/jira/browse/DRILL-4877
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.9.0
> Environment: 4 node cluster centos
>Reporter: Khurram Faraaz
>Assignee: Aman Sinha
>Priority: Critical
> Fix For: 1.9.0
>
>
> max(dir0), max(dir1) query against parquet data slower by 2X
> test was run with meta data cache on both 1.7.0 and 1.9.0
> there is a difference in query plan and also execution time on 1.9.0 is close 
> to 2X that on 1.7.0 
> Test from Drill 1.9.0 git commit id: 28d315bb
> on 4 node Centos cluster
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> select max(dir0), max(dir1), max(dir2) from 
> `DRILL_4589`;
> +---------+---------+---------+
> | EXPR$0  | EXPR$1  | EXPR$2  |
> +---------+---------+---------+
> | 2015    | Q4      | null    |
> +---------+---------+---------+
> 1 row selected (70.644 seconds)
> {noformat}
> Query plan for the above query. Note that in Drill 1.9.0 usedMetadataFile is 
> not available in the query plan text.
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> explain plan for select max(dir0), max(dir1), 
> max(dir2) from `DRILL_4589`;
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2])
> 00-02        StreamAgg(group=[{}], EXPR$0=[MAX($0)], EXPR$1=[MAX($1)], EXPR$2=[MAX($2)])
> 00-03          UnionExchange
> 01-01            StreamAgg(group=[{}], EXPR$0=[MAX($0)], EXPR$1=[MAX($1)], EXPR$2=[MAX($2)])
> 01-02              Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=/tmp/DRILL_4589/1990/Q1/f672.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2011/Q4/f162.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2000/Q2/f1101.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1996/Q2/f110.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2006/Q3/f1192.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1999/Q2/f174.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2006/Q4/f885.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2001/Q3/f1720.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2001/Q1/f1779.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1991/Q2/f629.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2003/Q4/f821.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2015/Q3/f896.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2002/Q2/f1458.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2004/Q4/f1756.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2001/Q2/f1490.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2003/Q3/f1137.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2013/Q1/f561.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1990/Q3/f1562.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2003/Q1/f1445.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2006/Q1/f236.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1992/Q4/f1209.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2014/Q2/f518.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1993/Q4/f1598.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2008/Q1/f780.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1999/Q1/f1763.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1990/Q4/f381.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1990/Q1/f1870.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2014/Q1/f915.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2001/Q2/f673.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1998/Q1/f736.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2013/Q2/f749.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2007/Q3/f111.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1993/Q3/f776.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2002/Q1/f403.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2005/Q2/f904.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2000/Q4/f944.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1994/Q2/f506.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1994/Q4/f612.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1991/Q1/f1838.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2012/Q2/f1764.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2010/Q1/f684.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2005/Q4/f176.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/1991/Q4/f150.parquet], 
> ReadEntryWithPath [path=/tmp/DRILL_4589/2012/Q3/

[jira] [Commented] (DRILL-4203) Parquet File : Date is stored wrongly

2016-09-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15536722#comment-15536722
 ] 

ASF GitHub Bot commented on DRILL-4203:
---

Github user jaltekruse commented on a diff in the pull request:

https://github.com/apache/drill/pull/595#discussion_r81396800
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/Metadata.java 
---
@@ -918,18 +916,22 @@ public void setMax(Object max) {
 @JsonProperty public ConcurrentHashMap columnTypeInfo;
 @JsonProperty List files;
 @JsonProperty List directories;
-@JsonProperty String drillVersion;
--- End diff --

I had intentionally added the Drill version here, assuming that it would be 
good information to have around if a similar issue ever comes up in the 
future, and that it would give us all of the info we need for an explicit flag 
that the dates have become correct. For this to work completely, this commit 
should be the last commit right before a release (it could be a point release). 
Is there any particular reason we would not want to write it into the file?


> Parquet File : Date is stored wrongly
> -
>
> Key: DRILL-4203
> URL: https://issues.apache.org/jira/browse/DRILL-4203
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.4.0
>Reporter: Stéphane Trou
>Assignee: Vitalii Diravka
>Priority: Critical
> Fix For: 1.9.0
>
>
> Hello,
> I have a problem when I try to read Parquet files produced by Drill with 
> Spark: all dates are corrupted.
> I think the problem comes from Drill :)
> {code}
> cat /tmp/date_parquet.csv 
> Epoch,1970-01-01
> {code}
> {code}
> 0: jdbc:drill:zk=local> select columns[0] as name, cast(columns[1] as date) 
> as epoch_date from dfs.tmp.`date_parquet.csv`;
> +--------+-------------+
> |  name  | epoch_date  |
> +--------+-------------+
> | Epoch  | 1970-01-01  |
> +--------+-------------+
> {code}
> {code}
> 0: jdbc:drill:zk=local> create table dfs.tmp.`buggy_parquet`as select 
> columns[0] as name, cast(columns[1] as date) as epoch_date from 
> dfs.tmp.`date_parquet.csv`;
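> +-----------+----------------------------+
> | Fragment  | Number of records written  |
> +-----------+----------------------------+
> | 0_0       | 1                          |
> +-----------+----------------------------+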
> {code}
> When I read the file with parquet-tools, I found:
> {code}
> java -jar parquet-tools-1.8.1.jar head /tmp/buggy_parquet/
> name = Epoch
> epoch_date = 4881176
> {code}
> According to 
> [https://github.com/Parquet/parquet-format/blob/master/LogicalTypes.md#date], 
> epoch_date should be equal to 0.
> Meta : 
> {code}
> java -jar parquet-tools-1.8.1.jar meta /tmp/buggy_parquet/
> file:file:/tmp/buggy_parquet/0_0_0.parquet 
> creator: parquet-mr version 1.8.1-drill-r0 (build 
> 6b605a4ea05b66e1a6bf843353abcb4834a4ced8) 
> extra:   drill.version = 1.4.0 
> file schema: root 
> 
> name:OPTIONAL BINARY O:UTF8 R:0 D:1
> epoch_date:  OPTIONAL INT32 O:DATE R:0 D:1
> row group 1: RC:1 TS:93 OFFSET:4 
> 
> name: BINARY SNAPPY DO:0 FPO:4 SZ:52/50/0,96 VC:1 
> ENC:RLE,BIT_PACKED,PLAIN
> epoch_date:   INT32 SNAPPY DO:0 FPO:56 SZ:45/43/0,96 VC:1 
> ENC:RLE,BIT_PACKED,PLAIN
> {code}
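
For reference, the Parquet DATE logical type stores the number of days since the Unix epoch (1970-01-01) as an INT32, so the expected value for the row above is 0. The following is a minimal Python sketch of that encoding; the helper name `parquet_date_value` is hypothetical and is not part of Drill or parquet-tools. The last lines only record an arithmetic observation: the stored value 4881176 is exactly twice 2440588, the Julian Day Number of 1970-01-01. The report itself does not state the cause, so treat that as a hint rather than a diagnosis.

```python
from datetime import date

UNIX_EPOCH = date(1970, 1, 1)

# Julian Day Number of 1970-01-01; brought in only for comparison (assumption:
# it is relevant to the corrupted value, which the report does not confirm).
JULIAN_DAY_OF_EPOCH = 2440588


def parquet_date_value(d: date) -> int:
    """Days since 1970-01-01, which is what the Parquet DATE type expects."""
    return (d - UNIX_EPOCH).days


expected = parquet_date_value(date(1970, 1, 1))
observed = 4881176  # value printed by parquet-tools in the report above

print("expected:", expected)
print("observed:", observed)
print("observed == 2 * JDN(1970-01-01):", observed == 2 * JULIAN_DAY_OF_EPOCH)
```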



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-4876) Remain disconnected connection

2016-09-30 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-4876:
---
Assignee: Sorabh Hamirwasia

> Remain disconnected connection
> --
>
> Key: DRILL-4876
> URL: https://issues.apache.org/jira/browse/DRILL-4876
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.6.0, 1.7.0, 1.8.0
> Environment: CentOS 7
>Reporter: Takuya Kojima
>Assignee: Sorabh Hamirwasia
>Priority: Minor
> Attachments: 1_normal.png, 2_after_restart.png, 
> 3_try_to_connect_after_restart.png, 
> 4_disconnected_after_minEvictableIdleTimeMillis.png, 
> 5_after_disconnected.png, drill-connection-pool.txt
>
>
> I'm using Drill from a Java application on Tomcat through the JDBC driver.
> I found that a disconnected connection is not released when a drillbit is restarted.
> After the drillbit restarts, the JDBC client keeps trying to use the 
> connection that was established before the restart.
> The expected behavior is that the connection is released and re-established when the 
> drillbit is restarted, but in fact the connection is released only after the 
> "minEvictableIdleTimeMillis" interval has elapsed.
> As a result, the application cannot connect in the meantime even though the drillbit 
> is active.
> I don't think this is a major issue, but the Postgres and Vertica JDBC drivers 
> work well in the same situation. I spent a lot of time identifying the 
> cause, so I am filing a new issue for it.
> Attached are the log and the JMX monitor graphs with the 1.6.0 JDBC driver; I 
> see the same behavior with 1.7.0 and 1.8.0.





[jira] [Commented] (DRILL-4821) Column type into POST /query.json response

2016-09-30 Thread Fabrizio Spataro (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537017#comment-15537017
 ] 

Fabrizio Spataro commented on DRILL-4821:
-

+1

> Column type into POST /query.json response
> --
>
> Key: DRILL-4821
> URL: https://issues.apache.org/jira/browse/DRILL-4821
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Fabrizio Spataro
>Priority: Minor
>
> Hello everyone,
> I am using the POST /query.json REST API. According to the documentation, the 
> service returns JSON with the column names and the data.
> {code}
>  {
>"columns" : [ "id", "type", "name", "ppu", "sales", "batters", "topping", 
> "filling" ],
>"rows" : [ {
>...
>  }]
> }
> {code}
> It would be very helpful to have the type of each column 
> (String/Numeric/) for example:
> {code}
>  {
>"columns" : {
> "id": "string", 
> "type": "numeric", 
> "name": "string" 
> },
>"rows" : [ {
>...
>  }]
> }
> {code}





[jira] [Created] (DRILL-4921) Scripts drill_config.sh, drillbit.sh, and drill-embedded fail when accessed via a symbolic link

2016-09-30 Thread Boaz Ben-Zvi (JIRA)
Boaz Ben-Zvi created DRILL-4921:
---

 Summary: Scripts drill_config.sh,  drillbit.sh, and drill-embedded 
fail when accessed via a symbolic link
 Key: DRILL-4921
 URL: https://issues.apache.org/jira/browse/DRILL-4921
 Project: Apache Drill
  Issue Type: Bug
  Components:  Server
Affects Versions: 1.8.0
 Environment: The drill-embedded on the Mac; the other files on Linux
Reporter: Boaz Ben-Zvi
Priority: Minor
 Fix For: 1.9.0


  Several of the drill... scripts under $DRILL_HOME/bin use "pwd" to produce 
the local path of that script. However "pwd" defaults to "logical" (i.e. the 
same as "pwd -L"); so if accessed via a symbolic link, that link is used 
verbatim in the path, which can produce wrong paths (e.g., when followed by "cd 
..").

For example, creating a symbolic link and using it (on the Mac):
$  cd ~/drill
$  ln -s $DRILL_HOME/bin 
$  bin/drill-embedded
ERROR: Drill config file missing: 
/Users/boazben-zvi/drill/conf/drill-override.conf -- Wrong config dir?

Similarly on Linux the CLASS_PATH gets set wrong (when running "drillbit.sh 
start" via a symlink).

Solution: need to replace all the "pwd" in all the scripts with "pwd -P" which 
produces the Physical path. (Or replace a preceding "cd" with "cd -P" which 
does the same).

Relevant scripts:
=
$ cd bin; grep pwd *
drillbit.sh:bin=`cd "$bin">/dev/null; pwd`
drillbit.sh:  echo "cwd:" `pwd`
drill-conf:bin=`cd "$bin">/dev/null; pwd`
drill-config.sh:home=`cd "$bin/..">/dev/null; pwd`
drill-config.sh:  DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
drill-config.sh:JAVA_HOME="$( cd -P "$( dirname "$SOURCE" )" && cd .. && 
pwd )"
drill-embedded:bin=`cd "$bin">/dev/null; pwd`
drill-localhost:bin=`cd "$bin">/dev/null; pwd`
submit_plan:bin=`cd "$bin">/dev/null; pwd`
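
The difference between the logical and the physical path can be sketched in Python under stated assumptions. `os.path.normpath` resolves ".." textually, which mirrors what a logical `pwd` followed by `cd ..` yields, while `os.path.realpath` resolves the symbolic link first, which mirrors `pwd -P`. The directory names (`install`, `work`) and the helper `parent_dirs` are hypothetical and chosen only for this illustration; this is not code from the Drill scripts.

```python
import os
import tempfile


def parent_dirs(script_dir):
    """Return (logical_parent, physical_parent) of a possibly symlinked dir."""
    # Logical: ".." is removed textually, so the symlink's own parent is used.
    logical = os.path.normpath(os.path.join(script_dir, ".."))
    # Physical: the symlink is resolved first, then the parent is taken.
    physical = os.path.dirname(os.path.realpath(script_dir))
    return logical, physical


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.realpath(tmp)
        real_bin = os.path.join(root, "install", "bin")  # stands in for $DRILL_HOME/bin
        os.makedirs(real_bin)
        os.makedirs(os.path.join(root, "work"))          # stands in for ~/drill
        link = os.path.join(root, "work", "bin")
        os.symlink(real_bin, link)                       # like: ln -s $DRILL_HOME/bin
        logical, physical = parent_dirs(link)
        return os.path.relpath(logical, root), os.path.relpath(physical, root)


print(demo())
```

In this sketch the logical parent is the directory that holds the link, which is where the scripts would wrongly look for `conf/`, while the physical parent is the real installation directory.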
 





[jira] [Commented] (DRILL-4921) Scripts drill_config.sh, drillbit.sh, and drill-embedded fail when accessed via a symbolic link

2016-09-30 Thread Boaz Ben-Zvi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537059#comment-15537059
 ] 

Boaz Ben-Zvi commented on DRILL-4921:
-

To save some time/effort: maybe this work can be lumped together with 
DRILL-4870 (fix JAVA_HOME in drill_config.sh) (note [~Paul.Rogers])



> Scripts drill_config.sh,  drillbit.sh, and drill-embedded fail when accessed 
> via a symbolic link
> 
>
> Key: DRILL-4921
> URL: https://issues.apache.org/jira/browse/DRILL-4921
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Affects Versions: 1.8.0
> Environment: The drill-embedded on the Mac; the other files on Linux
>Reporter: Boaz Ben-Zvi
>Priority: Minor
> Fix For: 1.9.0
>
>
>   Several of the drill... scripts under $DRILL_HOME/bin use "pwd" to produce 
> the local path of that script. However "pwd" defaults to "logical" (i.e. the 
> same as "pwd -L"); so if accessed via a symbolic link, that link is used 
> verbatim in the path, which can produce wrong paths (e.g., when followed by 
> "cd ..").
> For example, creating a symbolic link and using it (on the Mac):
> $  cd ~/drill
> $  ln -s $DRILL_HOME/bin 
> $  bin/drill-embedded
> ERROR: Drill config file missing: 
> /Users/boazben-zvi/drill/conf/drill-override.conf -- Wrong config dir?
> Similarly on Linux the CLASS_PATH gets set wrong (when running "drillbit.sh 
> start" via a symlink).
> Solution: need to replace all the "pwd" in all the scripts with "pwd -P" 
> which produces the Physical path. (Or replace a preceding "cd" with "cd -P" 
> which does the same).
> Relevant scripts:
> =
> $ cd bin; grep pwd *
> drillbit.sh:bin=`cd "$bin">/dev/null; pwd`
> drillbit.sh:  echo "cwd:" `pwd`
> drill-conf:bin=`cd "$bin">/dev/null; pwd`
> drill-config.sh:home=`cd "$bin/..">/dev/null; pwd`
> drill-config.sh:  DIR="$( cd -P "$( dirname "$SOURCE" )" && pwd )"
> drill-config.sh:JAVA_HOME="$( cd -P "$( dirname "$SOURCE" )" && cd .. && 
> pwd )"
> drill-embedded:bin=`cd "$bin">/dev/null; pwd`
> drill-localhost:bin=`cd "$bin">/dev/null; pwd`
> submit_plan:bin=`cd "$bin">/dev/null; pwd`
>  





[jira] [Assigned] (DRILL-4280) Kerberos Authentication

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4280:
---

Assignee: Chunhui Shi  (was: Sudheesh Katkam)

Assigning to [~cshi] to code review.

> Kerberos Authentication
> ---
>
> Key: DRILL-4280
> URL: https://issues.apache.org/jira/browse/DRILL-4280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Keys Botzum
>Assignee: Chunhui Shi
>  Labels: security
>
> Drill should support Kerberos based authentication from clients. This means 
> that both the ODBC and JDBC drivers as well as the web/REST interfaces should 
> support inbound Kerberos. For Web this would most likely be SPNEGO while for 
> ODBC and JDBC this will be more generic Kerberos.
> Since Hive and much of Hadoop supports Kerberos there is a potential for a 
> lot of reuse of ideas if not implementation.
> Note that this is related to but not the same as 
> https://issues.apache.org/jira/browse/DRILL-3584 





[jira] [Assigned] (DRILL-4726) Dynamic UDFs support

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4726:
---

Assignee: Paul Rogers  (was: Arina Ielchiieva)

Assigning to [~Paul.Rogers] for code review.

> Dynamic UDFs support
> 
>
> Key: DRILL-4726
> URL: https://issues.apache.org/jira/browse/DRILL-4726
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.6.0
>Reporter: Arina Ielchiieva
>Assignee: Paul Rogers
> Fix For: Future
>
>
> Allow registering UDFs without restarting Drillbits.
> Design is described in document below:
> https://docs.google.com/document/d/1FfyJtWae5TLuyheHCfldYUpCdeIezR2RlNsrOTYyAB4/edit?usp=sharing
>  





[jira] [Assigned] (DRILL-4373) Drill and Hive have incompatible timestamp representations in parquet

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4373:
---

Assignee: Karthikeyan Manivannan  (was: Vitalii Diravka)

Assigning to Karthik for code review.

> Drill and Hive have incompatible timestamp representations in parquet
> -
>
> Key: DRILL-4373
> URL: https://issues.apache.org/jira/browse/DRILL-4373
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Hive, Storage - Parquet
>Affects Versions: 1.8.0
>Reporter: Rahul Challapalli
>Assignee: Karthikeyan Manivannan
>  Labels: doc-impacting
> Fix For: 1.9.0
>
>
> git.commit.id.abbrev=83d460c
> I created a parquet file with a timestamp type using Drill. Now if I define a 
> hive table on top of the parquet file and use "timestamp" as the column type, 
> drill fails to read the hive table through the hive storage plugin





[jira] [Assigned] (DRILL-4906) CASE Expression with constant generates class exception

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4906:
---

Assignee: Aman Sinha  (was: Serhii Harnyk)

Assigning to [~amansinha100] for code review/checkin.

> CASE Expression with constant generates class exception
> ---
>
> Key: DRILL-4906
> URL: https://issues.apache.org/jira/browse/DRILL-4906
> Project: Apache Drill
>  Issue Type: Bug
>  Components: SQL Parser
>Affects Versions: 1.6.0, 1.8.0
>Reporter: Serhii Harnyk
>Assignee: Aman Sinha
> Fix For: 1.9.0
>
>
> How to reproduce:
> select (case when (true) then 1 end) from (values(1));
> Error
> Error: SYSTEM ERROR: ClassCastException: 
> org.apache.drill.exec.expr.holders.NullableVarCharHolder cannot be cast to 
> org.apache.drill.exec.expr.holders.VarCharHolder





[jira] [Assigned] (DRILL-4842) SELECT * on JSON data results in NumberFormatException

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4842:
---

Assignee: Chunhui Shi  (was: Serhii Harnyk)

Assigning to [~cshi] for code review.

> SELECT * on JSON data results in NumberFormatException
> --
>
> Key: DRILL-4842
> URL: https://issues.apache.org/jira/browse/DRILL-4842
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
>Assignee: Chunhui Shi
> Attachments: tooManyNulls.json
>
>
> Note that doing SELECT c1 returns correct results, the failure is seen when 
> we do SELECT star. json.all_text_mode was set to true.
> JSON file tooManyNulls.json has one key c1 with 4096 nulls as its value and 
> the 4097th key c1 has the value "Hello World"
> git commit ID : aaf220ff
> MapR Drill 1.8.0 RPM
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> alter session set 
> `store.json.all_text_mode`=true;
> +---++
> |  ok   |  summary   |
> +---++
> | true  | store.json.all_text_mode updated.  |
> +---++
> 1 row selected (0.27 seconds)
> 0: jdbc:drill:schema=dfs.tmp> SELECT c1 FROM `tooManyNulls.json` WHERE c1 IN 
> ('Hello World');
> +--+
> |  c1  |
> +--+
> | Hello World  |
> +--+
> 1 row selected (0.243 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select * FROM `tooManyNulls.json` WHERE c1 IN 
> ('Hello World');
> Error: SYSTEM ERROR: NumberFormatException: Hello World
> Fragment 0:0
> [Error Id: 9cafb3f9-3d5c-478a-b55c-900602b8765e on centos-01.qa.lab:31010]
>  (java.lang.NumberFormatException) Hello World
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeI():95
> 
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.varTypesToInt():120
> org.apache.drill.exec.test.generated.FiltererGen1169.doSetup():45
> org.apache.drill.exec.test.generated.FiltererGen1169.setup():54
> 
> org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.generateSV2Filterer():195
> 
> org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.setupNewSchema():107
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():78
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():94
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():257
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():251
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():415
> org.apache.hadoop.security.UserGroupInformation.doAs():1595
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():251
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():745 (state=,code=0)
> 0: jdbc:drill:schema=dfs.tmp>
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> Caused by: java.lang.NumberFormatException: Hello World
> at 
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeI(StringFunctionHelpers.java:95)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.varTypesToInt(StringFunctionHelpers.java:120)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-S

[jira] [Assigned] (DRILL-4826) Query against INFORMATION_SCHEMA.TABLES degrades as the number of views increases

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4826:
---

Assignee: Padma Penumarthy  (was: Parth Chandra)

Assigning to [~ppenumarthy] for code review.

> Query against INFORMATION_SCHEMA.TABLES degrades as the number of views 
> increases
> -
>
> Key: DRILL-4826
> URL: https://issues.apache.org/jira/browse/DRILL-4826
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Parth Chandra
>Assignee: Padma Penumarthy
>
> Queries against INFORMATION_SCHEMA.TABLES and INFORMATION_SCHEMA.VIEWS slow 
> down as the number of views increases. 
> BI tools like Tableau issue a query like the following at connection time:
> {code}
> select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from 
> INFORMATION_SCHEMA.`TABLES` WHERE TABLE_CATALOG LIKE 'DRILL' ESCAPE '\' AND 
> TABLE_SCHEMA <> 'sys' AND TABLE_SCHEMA <> 'INFORMATION_SCHEMA' ORDER BY 
> TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME
> {code}
> The time to query the information schema tables degrades as the number of 
> views increases. On a test system:
> || Views || Time(secs) ||
> |500 | 6 |
> |1000 | 19 |
> |1500 | 33 |
> This can result in a single connection taking more than a minute to establish.
> The problem occurs because we read the view file for every view and this 
> appears to take most of the time.
> Querying information_schema.tables does not, in fact, need to open the view 
> file at all, it merely needs to get a listing of the view files. Eliminating 
> the view file read will speed up the query tremendously.
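
The idea of answering the query from a directory listing alone can be sketched as follows. This is a minimal Python illustration, not Drill's implementation: it assumes view definitions are stored one per file with a `.view.drill` suffix, and the helper name `list_view_names` is hypothetical. The point is that the view names come from the file names, so no view file has to be opened or parsed.

```python
import os
import tempfile

VIEW_SUFFIX = ".view.drill"  # assumed naming convention for view files


def list_view_names(workspace_dir):
    """Return view names using only the directory listing (no file reads)."""
    return sorted(
        name[: -len(VIEW_SUFFIX)]
        for name in os.listdir(workspace_dir)
        if name.endswith(VIEW_SUFFIX)
    )


def demo():
    with tempfile.TemporaryDirectory() as workspace:
        for file_name in ("v1.view.drill", "v2.view.drill", "data.parquet"):
            # The file contents are irrelevant here; they are never read back.
            with open(os.path.join(workspace, file_name), "w") as handle:
                handle.write("{}")
        return list_view_names(workspace)


print(demo())
```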





[jira] [Assigned] (DRILL-4864) Add ANSI format for date/time functions

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4864:
---

Assignee: Gautam Kumar Parai  (was: Serhii Harnyk)

Assigning to [~gparai] for code review.

> Add ANSI format for date/time functions
> ---
>
> Key: DRILL-4864
> URL: https://issues.apache.org/jira/browse/DRILL-4864
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.8.0
>Reporter: Serhii Harnyk
>Assignee: Gautam Kumar Parai
>  Labels: doc-impacting
> Fix For: 1.9.0
>
>
> TO_DATE() exposes the Joda string formatting conventions to the SQL 
> layer. This does not follow the SQL conventions used by ANSI and many other 
> database engines on the market.
> Add a new UDF "ansi_to_joda(string)" that takes a string representing an ANSI 
> datetime format and returns a string representing the equivalent Joda format.
> Add a new session option "drill.exec.fn.to_date_format" that can take one of two 
> values: "JODA" (default) and "ANSI".
> If the option is set to "JODA", queries with the to_date() function work in 
> the usual way.
> If the option is set to "ANSI", the second argument is wrapped with the 
> ansi_to_joda() function, which allows the user to use an ANSI datetime format.
> The wrapping is applied in the to_date(), to_time() and to_timestamp() functions.
> Table of Joda and ANSI patterns that may be replaced:
> ||Pattern name||Ansi format||JodaTime format||
> |Full name of day|day| |
> |Day of year|ddd|D|
> |Day of month|dd|d|
> |Day of week|d|e|
> |Name of month|month| |
> |Abr name of month|mon|MMM|
> |Full era name|ee|G|
> |Name of day|dy|E|
> |Time zone|tz|TZ|
> |Hour 12|hh|h|
> |Hour 12|hh12|h|
> |Hour 24|hh24|H|
> |Minute of hour|mi|m|
> |Second of minute|ss|s|
> |Millisecond of minute|ms|S|
> |Week of year|ww|w|
> |Month|mm|MM|
> |Halfday am|am|aa|
> |Halfday pm|pm|aa|
> |ref.|https://www.postgresql.org/docs/8.2/static/functions-formatting.html|http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html|
> Table of ansi pattern modifiers, which may be deleted from string
> ||Description||Pattern||
> |fill mode (suppress padding blanks and zeroes)|fm|
> |fixed format global option (see usage notes)|fx|
> |translation mode (print localized day and month names based on lc_messages)|tm|
> |spell mode (not yet implemented)|sp|
> |ref.|https://www.postgresql.org/docs/8.2/static/functions-formatting.html|
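
The replacement described by the two tables above can be sketched as a longest-match token scan. The Python function below is only an illustration of the idea and is not the proposed Drill UDF: it uses just the rows whose JodaTime value is visible in this message, raises an error for the two rows whose JodaTime value is missing here ("day" and "month"), drops the listed modifiers, and copies every other character unchanged. It does not handle quoted literals.

```python
# ANSI token -> Joda token, taken from the rows of the table above that show
# both values.
REPLACEMENTS = {
    "ddd": "D", "dd": "d", "d": "e", "mon": "MMM", "ee": "G", "dy": "E",
    "tz": "TZ", "hh": "h", "hh12": "h", "hh24": "H", "mi": "m", "ss": "s",
    "ms": "S", "ww": "w", "mm": "MM", "am": "aa", "pm": "aa",
}
# Rows whose JodaTime value is not visible in this message; left unmapped.
UNMAPPED = {"day", "month"}
# Modifiers that are deleted from the format string.
MODIFIERS = {"fm", "fx", "tm", "sp"}

# Try longer tokens first so that "hh24" wins over "hh" and "ddd" over "dd".
TOKENS = sorted(set(REPLACEMENTS) | UNMAPPED | MODIFIERS, key=len, reverse=True)


def ansi_to_joda(fmt):
    """Translate an ANSI-style datetime pattern into a Joda-style pattern."""
    lowered = fmt.lower()
    out = []
    i = 0
    while i < len(fmt):
        for token in TOKENS:
            if lowered.startswith(token, i):
                if token in UNMAPPED:
                    raise ValueError("no Joda mapping known for %r" % token)
                out.append(REPLACEMENTS.get(token, ""))  # modifiers become ""
                i += len(token)
                break
        else:
            out.append(fmt[i])  # separators and unknown characters pass through
            i += 1
    return "".join(out)


print(ansi_to_joda("dd-mon hh24:mi:ss"))
print(ansi_to_joda("fmdd/mm hh12:mi am"))
```

Sorting the tokens by length is the main design choice here: a plain sequence of string replacements would rewrite "dd" inside "ddd" or "hh" inside "hh24" and produce the wrong pattern.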





[jira] [Assigned] (DRILL-4824) JSON with complex nested data produces incorrect output with missing fields

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4824:
---

Assignee: Paul Rogers  (was: Roman)

Assigning to [~Paul.Rogers] for code review.

> JSON with complex nested data produces incorrect output with missing fields
> ---
>
> Key: DRILL-4824
> URL: https://issues.apache.org/jira/browse/DRILL-4824
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.0.0
>Reporter: Roman
>Assignee: Paul Rogers
> Fix For: 1.9.0
>
>
> There is incorrect output in case of JSON file with complex nested data.
> _JSON:_
> {code:none|title=example.json|borderStyle=solid}
> {
> "Field1" : {
> }
> }
> {
> "Field1" : {
> "InnerField1": {"key1":"value1"},
> "InnerField2": {"key2":"value2"}
> }
> }
> {
> "Field1" : {
> "InnerField3" : ["value3", "value4"],
> "InnerField4" : ["value5", "value6"]
> }
> }
> {code}
> _Query:_
> {code:sql}
> select Field1 from dfs.`/tmp/example.json`
> {code}
> _Incorrect result:_
> {code:none}
> +---+
> |  Field1   |
> +---+
> {"InnerField1":{},"InnerField2":{},"InnerField3":[],"InnerField4":[]}
> {"InnerField1":{"key1":"value1"},"InnerField2" 
> {"key2":"value2"},"InnerField3":[],"InnerField4":[]}
> {"InnerField1":{},"InnerField2":{},"InnerField3":["value3","value4"],"InnerField4":["value5","value6"]}
> +--+
> {code}
> There is no need to output missing fields. With a deeply nested 
> structure, the result becomes unreadable for the user.
> _Correct result:_
> {code:none}
> +--+
> | Field1   |
> +--+
> |{} 
> {"InnerField1":{"key1":"value1"},"InnerField2":{"key2":"value2"}}
> {"InnerField3":["value3","value4"],"InnerField4":["value5","value6"]}
> +--+
> {code}
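
The relationship between the incorrect and the expected output can be illustrated with a small post-processing sketch in Python. This is not the Drill fix, which would belong in the JSON reader; the helper `drop_empty` is hypothetical and simply removes map entries whose value is an empty map or an empty list. One limitation of this approach is that it cannot tell a field that was genuinely empty in the input from one that was missing.

```python
def drop_empty(value):
    """Recursively remove dict entries whose value is an empty dict or list."""
    if isinstance(value, dict):
        cleaned = {key: drop_empty(item) for key, item in value.items()}
        return {key: item for key, item in cleaned.items() if item not in ({}, [])}
    if isinstance(value, list):
        return [drop_empty(item) for item in value]
    return value


# The three rows of the "incorrect result" shown above.
incorrect_rows = [
    {"InnerField1": {}, "InnerField2": {}, "InnerField3": [], "InnerField4": []},
    {"InnerField1": {"key1": "value1"}, "InnerField2": {"key2": "value2"},
     "InnerField3": [], "InnerField4": []},
    {"InnerField1": {}, "InnerField2": {},
     "InnerField3": ["value3", "value4"], "InnerField4": ["value5", "value6"]},
]

for row in incorrect_rows:
    print(drop_empty(row))
```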





[jira] [Assigned] (DRILL-4841) Use user server event loop group for web clients

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4841:
---

Assignee: Sorabh Hamirwasia  (was: Sudheesh Katkam)

Assigning to [~shamirwasia] for code review.

> Use user server event loop group for web clients
> 
>
> Key: DRILL-4841
> URL: https://issues.apache.org/jira/browse/DRILL-4841
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - HTTP
>Reporter: Sudheesh Katkam
>Assignee: Sorabh Hamirwasia
>Priority: Minor
>
> Currently we spawn an event loop group for handling requests from clients. 
> This group should also be used to handle responses (from the server) for web 
> clients.





[jira] [Assigned] (DRILL-4823) Fix OOM while trying to prune partitions with reasonable data size

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4823:
---

Assignee: Boaz Ben-Zvi  (was: Arina Ielchiieva)

Assigning to [~ben-zvi] for code review.

> Fix OOM while trying to prune partitions with reasonable data size
> --
>
> Key: DRILL-4823
> URL: https://issues.apache.org/jira/browse/DRILL-4823
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill, Query Planning & Optimization
>Affects Versions: 1.6.0
>Reporter: Arina Ielchiieva
>Assignee: Boaz Ben-Zvi
> Fix For: 1.9.0
>
>
> _Example query:_
> {code:sql}
> select  '/'||dir0||'/'||dir1||'/'||dir2||'/'||dir3 , count(*)
>   FROM dfs.`/path/to/parquet/files`
>   WHERE ('/'||dir0||'/'||dir1||'/'||dir2||'/'||dir3)  IN ( 
>   '/2015/11/30', 
>   '//2015/09/01',
>   '/2015/09/02', 
>   '/2015/09/03',
>   '/2015/09/04',
>   '/2015/09/09',  
>   '/2016/03/30'
>   )
>   group by   '/'||dir0||'/'||dir1||'/'||dir2||'/'||dir3
>   order by 1;
> {code}
> _Error:_
> OOM while trying to prune partitions:
> {noformat}
> org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate 
> buffer of size 256 due to memory limit. Current allocation: 5242880
>   at 
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:216) 
> ~[drill-memory-base-1.6.0.jar:1.6.0]
>   at 
> org.apache.drill.exec.ops.BufferManagerImpl.getManagedBuffer(BufferManagerImpl.java:60)
>  ~[drill-java-exec-1.6.0.jar:1.6.0]
>   at 
> org.apache.drill.exec.ops.BufferManagerImpl.getManagedBuffer(BufferManagerImpl.java:56)
>  ~[drill-java-exec-1.6.0.jar:1.6.0]
>   at 
> org.apache.drill.exec.ops.QueryContext.getManagedBuffer(QueryContext.java:241)
>  ~[drill-java-exec-1.6.0.jar:1.6.0]
>   at 
> org.apache.drill.exec.expr.fn.interpreter.InterpreterEvaluator$EvalVisitor.getManagedBufferIfAvailable(InterpreterEvaluator.java:158)
>  ~[drill-java-exec-1.6.0.jar:1.6.0]
> {noformat}
> _Cause:_
> The interpreter always asks for a new buffer to hold a varchar/varbinary or decimal 
> constant value. As a result, the memory required is proportional to the number 
> of constant expressions multiplied by the number of input rows (partitions). This is 
> different from evaluation in run-time generated code, where a constant expression 
> is evaluated once and uses only one buffer per value.
> _Fix:_
> Use one buffer for each unique constant value in the query.
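
The effect of the fix described above can be sketched with a small cache keyed by constant value. This Python sketch is an illustration under assumptions, not Drill's code: `CountingAllocator` and `ConstantBufferCache` are hypothetical stand-ins, and the partition count of 1000 is made up. It only compares how many buffers are requested with and without reuse for the seven constants in the example query.

```python
class CountingAllocator:
    """Stand-in for a memory allocator; it only counts buffer requests."""

    def __init__(self):
        self.allocations = 0

    def buffer(self, size):
        self.allocations += 1
        return bytearray(size)


class ConstantBufferCache:
    """Hands out one buffer per unique constant value."""

    def __init__(self, allocator):
        self._allocator = allocator
        self._buffers = {}

    def get(self, value):
        buf = self._buffers.get(value)
        if buf is None:
            data = value.encode("utf-8")
            buf = self._allocator.buffer(len(data))
            buf[:] = data
            self._buffers[value] = buf
        return buf


CONSTANTS = ["/2015/11/30", "//2015/09/01", "/2015/09/02", "/2015/09/03",
             "/2015/09/04", "/2015/09/09", "/2016/03/30"]


def count_allocations(constants, partitions, reuse):
    """Count buffer requests when every partition evaluates every constant."""
    allocator = CountingAllocator()
    cache = ConstantBufferCache(allocator)
    for _ in range(partitions):
        for value in constants:
            if reuse:
                cache.get(value)
            else:
                data = value.encode("utf-8")
                fresh = allocator.buffer(len(data))  # new buffer on every evaluation
                fresh[:] = data
    return allocator.allocations


print("without reuse:", count_allocations(CONSTANTS, 1000, reuse=False))
print("with reuse:   ", count_allocations(CONSTANTS, 1000, reuse=True))
```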





[jira] [Assigned] (DRILL-4792) Include session options used for a query as part of the profile

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4792:
---

Assignee: Sudheesh Katkam  (was: Arina Ielchiieva)

Assigning to [~sudheeshkatkam] for code review.

> Include session options used for a query as part of the profile
> ---
>
> Key: DRILL-4792
> URL: https://issues.apache.org/jira/browse/DRILL-4792
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.7.0
>Reporter: Arina Ielchiieva
>Assignee: Sudheesh Katkam
>Priority: Minor
> Fix For: 1.9.0
>
> Attachments: no_session_options.JPG, session_options_block.JPG, 
> session_options_collapsed.JPG, session_options_json.JPG
>
>
> Include session options used for a query as part of the profile.
> This will be very useful for debugging/diagnostics.





[jira] [Assigned] (DRILL-4699) Add Description Column in sys.options

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4699:
---

Assignee: Paul Rogers  (was: Sudheesh Katkam)

Assigning to [~Paul.Rogers] for code review.

> Add Description Column in sys.options
> -
>
> Key: DRILL-4699
> URL: https://issues.apache.org/jira/browse/DRILL-4699
> Project: Apache Drill
>  Issue Type: Improvement
>  Components:  Server, Documentation
>Affects Versions: 1.6.0
>Reporter: John Omernik
>Assignee: Paul Rogers
>
> select * from sys.options gives a user a strong understanding of which 
> options are available in Drill. These options are not well documented. Some 
> options are "experimental"; other options have a function only in specific 
> cases (writers vs. readers, for example). If we had a large text field for a 
> description, we could enforce documentation of the settings at option 
> creation time, and the description of a setting could change as the 
> versions change (i.e. when an option graduates from being experimental to being 
> supported, the description would change in the version the user is using). That is, when 
> users run select * from sys.options, they know the exact state of each option 
> every time they query. It could also facilitate better self-documentation via 
> QA on pull requests: "Did you update the sys.options.desc?" This makes 
> Drill easier for users and admins to use in an enterprise.
> The first step is adding the field, and then going back and filling in the 
> desc for each option. (Another JIRA after the option is available)





[jira] [Assigned] (DRILL-3510) Add ANSI_QUOTES option so that Drill's SQL Parser will recognize ANSI_SQL identifiers

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-3510:
---

Assignee: Sudheesh Katkam  (was: Vitalii Diravka)

Assigning to [~sudheeshkatkam] for code review.

> Add ANSI_QUOTES option so that Drill's SQL Parser will recognize ANSI_SQL 
> identifiers 
> --
>
> Key: DRILL-3510
> URL: https://issues.apache.org/jira/browse/DRILL-3510
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: SQL Parser
>Reporter: Jinfeng Ni
>Assignee: Sudheesh Katkam
> Fix For: Future
>
> Attachments: DRILL-3510.patch, DRILL-3510.patch
>
>
> Currently Drill's SQL parser uses the backtick as its identifier quote 
> character, as MySQL does. However, this differs from the ANSI SQL 
> specification, which uses the double quote for quoted identifiers.  
> MySQL has an "ANSI_QUOTES" option that the user can switch on or off. 
> Drill should offer the same, so that Drill users do not have to rewrite 
> their existing queries if those queries use double quotes. 
> {code}
> SET sql_mode='ANSI_QUOTES';
> {code}
>
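
As an illustration of what such an option would spare users from doing by hand, here is a minimal Java sketch of a client-side rewrite that turns double-quoted identifiers into backtick-quoted ones. It is not part of Drill and not what the attached patch does; the class and method names are hypothetical, and the scan deliberately ignores comments, escaped double quotes inside identifiers, and identifiers that themselves contain backticks.

```java
final class AnsiQuotesSketch {

  /**
   * Replaces double quotes that delimit identifiers with backticks.
   * Double quotes inside single-quoted string literals are left untouched.
   */
  static String toBacktickIdentifiers(String sql) {
    StringBuilder out = new StringBuilder(sql.length());
    boolean inStringLiteral = false;
    for (int i = 0; i < sql.length(); i++) {
      char c = sql.charAt(i);
      if (c == '\'') {
        // A doubled '' inside a literal toggles twice, so the state stays correct.
        inStringLiteral = !inStringLiteral;
        out.append(c);
      } else if (c == '"' && !inStringLiteral) {
        out.append('`');
      } else {
        out.append(c);
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(toBacktickIdentifiers(
        "select \"first name\" from t where note = 'say \"hi\"'"));
  }
}
```

A parser-level option avoids these blind spots because the parser already knows where identifiers, literals and comments begin and end.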





[jira] [Assigned] (DRILL-4618) random numbers generator function broken

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4618:
---

Assignee: Boaz Ben-Zvi  (was: Chunhui Shi)

Assigning to [~ben-zvi] for code review

> random numbers generator function broken
> 
>
> Key: DRILL-4618
> URL: https://issues.apache.org/jira/browse/DRILL-4618
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Chunhui Shi
>Assignee: Boaz Ben-Zvi
>
> Filing this JIRA for the record, based on the bug description from Ted's 
> email and the discussion on the dev mailing list:
> I am trying to generate some random numbers. I have a large base file (foo)
> this is what I get:
> 0: jdbc:drill:>  select floor(1000*random()) as x, floor(1000*random()) as
> y, floor(1000*rand()) as z from (select * from maprfs.tdunning.foo) a limit
> 20;
> +--------+--------+--------+
> |   x    |   y    |   z    |
> +--------+--------+--------+
> | 556.0  | 556.0  | 618.0  |
> | 564.0  | 564.0  | 618.0  |
> | 129.0  | 129.0  | 618.0  |
> | 48.0   | 48.0   | 618.0  |
> | 696.0  | 696.0  | 618.0  |
> | 642.0  | 642.0  | 618.0  |
> | 535.0  | 535.0  | 618.0  |
> | 440.0  | 440.0  | 618.0  |
> | 894.0  | 894.0  | 618.0  |
> | 24.0   | 24.0   | 618.0  |
> | 508.0  | 508.0  | 618.0  |
> | 28.0   | 28.0   | 618.0  |
> | 816.0  | 816.0  | 618.0  |
> | 717.0  | 717.0  | 618.0  |
> | 334.0  | 334.0  | 618.0  |
> | 978.0  | 978.0  | 618.0  |
> | 646.0  | 646.0  | 618.0  |
> | 787.0  | 787.0  | 618.0  |
> | 260.0  | 260.0  | 618.0  |
> | 711.0  | 711.0  | 618.0  |
> +--------+--------+--------+
> On this page, https://drill.apache.org/docs/math-and-trig/, the rand
> function is described and random() is not. But it appears that rand()
> delivers a constant instead (although a different constant each time the
> query is run) and it appears that random() delivers the same value when
> used multiple times in each returned value.
> This seems very, very wrong.
> The fault does not seem to be related to my querying a table:
> 0: jdbc:drill:> select rand(), random(), random() from (values (1),(2),(3))
> x;
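> +---------------------+-----------------------+-----------------------+
> |       EXPR$0        |        EXPR$1         |        EXPR$2         |
> +---------------------+-----------------------+-----------------------+
> | 0.1347749257216052  | 0.36724556209765014   | 0.36724556209765014   |
> | 0.1347749257216052  | 0.006087161689924625  | 0.006087161689924625  |
> | 0.1347749257216052  | 0.09417099142512142   | 0.09417099142512142   |
> +---------------------+-----------------------+-----------------------+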
> For reference, postgres doesn't have rand() and does the right thing with
> random().
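
Purely as an illustrative aid, the three behaviours discussed in this report can be modelled in plain Java with java.util.Random. This is not Drill code and makes no claim about the actual cause of the bug; the class and method names are hypothetical.

```java
import java.util.Random;

final class RandomCallSiteSketch {

  /** Expected behaviour: every call site on every row draws a fresh value. */
  static double[][] freshPerCall(Random rng, int rows) {
    double[][] out = new double[rows][2];
    for (int r = 0; r < rows; r++) {
      out[r][0] = rng.nextDouble();
      out[r][1] = rng.nextDouble();
    }
    return out;
  }

  /** Symptom reported for random(): one draw per row, shared by both call sites. */
  static double[][] sharedPerRow(Random rng, int rows) {
    double[][] out = new double[rows][2];
    for (int r = 0; r < rows; r++) {
      double once = rng.nextDouble();
      out[r][0] = once;
      out[r][1] = once;
    }
    return out;
  }

  /** Symptom reported for rand(): one draw per query, shared by every row. */
  static double[] sharedPerQuery(Random rng, int rows) {
    double once = rng.nextDouble();
    double[] out = new double[rows];
    for (int r = 0; r < rows; r++) {
      out[r] = once;
    }
    return out;
  }

  public static void main(String[] args) {
    for (double[] row : freshPerCall(new Random(), 3)) {
      System.out.println(row[0] + " | " + row[1]);
    }
  }
}
```

In the output quoted above, columns x and y follow the sharedPerRow shape and column z follows the sharedPerQuery shape, whereas a user would expect freshPerCall for all three.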





[jira] [Assigned] (DRILL-4642) Let RexBuilder.ensureType() mechanism take place during Rex conversion.

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4642:
---

Assignee: Jinfeng Ni  (was: Sean Hsuan-Yi Chu)

Assigning to [~jni] for code review.

> Let RexBuilder.ensureType() mechanism take place during Rex conversion.
> ---
>
> Key: DRILL-4642
> URL: https://issues.apache.org/jira/browse/DRILL-4642
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Jinfeng Ni
> Fix For: Future
>
>
> In DRILL-4372, the logic that ensured matching types was removed because, in 
> some cases such as the one below, an undesirable cast function would be added 
> and cause a failure.
> {code}
> SELECT * 
> FROM T 
> WHERE (cast(col1 as timestamp)  - to_timestamp(col2,'yyyy-MM-dd HH:mm:ss') < 
> interval 'X XX:XX:XX' day to second)
> {code}
> The root cause of this behavior lies in Drill-Calcite [1], where the 
> SqlNode for WHERE is expanded into a new object that is not passed to the 
> validation step.
> [1] 
> https://github.com/mapr/incubator-calcite/blob/DrillCalcite1.4.0-mapr-1.4.0/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L3362





[jira] [Assigned] (DRILL-4604) Generate warning on Web UI if drillbits version mismatch is detected

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4604:
---

Assignee: Sudheesh Katkam  (was: Arina Ielchiieva)

Assigning to [~sudheeshkatkam] for code review.

> Generate warning on Web UI if drillbits version mismatch is detected
> 
>
> Key: DRILL-4604
> URL: https://issues.apache.org/jira/browse/DRILL-4604
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.6.0
>Reporter: Arina Ielchiieva
>Assignee: Sudheesh Katkam
> Fix For: Future
>
> Attachments: index_page.JPG, index_page_mismatch.JPG, 
> screenshots_with_different_states.docx
>
>
> Display the drillbit version on the web UI. If any drillbit's version doesn't 
> match that of the current drillbit, generate a warning.
> Screenshots - screenshots_with_different_states.docx.





[jira] [Commented] (DRILL-4606) Create DrillClient.Builder class

2016-09-30 Thread Zelaine Fong (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537472#comment-15537472
 ] 

Zelaine Fong commented on DRILL-4606:
-

[~sudheeshkatkam] - [~parthc] has +1'd this.  Anything more needed on this one?

> Create DrillClient.Builder class
> 
>
> Key: DRILL-4606
> URL: https://issues.apache.org/jira/browse/DRILL-4606
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Sudheesh Katkam
>Assignee: Sudheesh Katkam
>
> + Create a helper class to build DrillClient instances, and deprecate 
> DrillClient constructors
> + Allow DrillClient to specify an event loop group (so user event loop can be 
> used for queries from Web API calls)





[jira] [Assigned] (DRILL-4596) Drill should do version check among drillbits

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4596:
---

Assignee: Paul Rogers  (was: Arina Ielchiieva)

Assigning to [~Paul.Rogers] for code review.

> Drill should do version check among drillbits
> -
>
> Key: DRILL-4596
> URL: https://issues.apache.org/jira/browse/DRILL-4596
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.6.0
>Reporter: Arina Ielchiieva
>Assignee: Paul Rogers
> Fix For: Future
>
>
> Before registering a new drillbit in ZooKeeper, we should do a version check 
> to make sure all the running drillbits are on the same version.
> Using drillbits of different versions can lead to unexpected results.
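
A minimal Java sketch of the proposed check follows. It assumes, hypothetically, that the versions of the already-registered drillbits can be read as plain strings before registration; how Drill would actually obtain them from ZooKeeper is not described in this issue, and the class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.Collection;

final class DrillbitVersionCheckSketch {

  /**
   * Returns true if the candidate's version equals the version of every
   * drillbit that is already running. An empty cluster accepts any version.
   */
  static boolean canRegister(String candidateVersion, Collection<String> runningVersions) {
    for (String running : runningVersions) {
      if (!running.equals(candidateVersion)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(canRegister("1.9.0", Arrays.asList("1.9.0", "1.9.0")));
    System.out.println(canRegister("1.9.0", Arrays.asList("1.9.0", "1.8.0")));
  }
}
```

Whether a mismatch should block registration outright or only raise a warning is a design choice the issue leaves open.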





[jira] [Assigned] (DRILL-4504) Create an event loop for each of [user, control, data] RPC components

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4504:
---

Assignee: Sorabh Hamirwasia  (was: Sudheesh Katkam)

Assigning to [~shamirwasia] for code review

> Create an event loop for each of [user, control, data] RPC components
> -
>
> Key: DRILL-4504
> URL: https://issues.apache.org/jira/browse/DRILL-4504
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - RPC
>Reporter: Sudheesh Katkam
>Assignee: Sorabh Hamirwasia
>
> + Create an event loop group for each client-server pair (data, control and 
> user)
> Miscellaneous:
> + Move WorkEventBus from exec/rpc/control to exec/work



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (DRILL-4503) Schema change exception even with all_text_mode enabled

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4503:
---

Assignee: Padma Penumarthy  (was: Aman Sinha)

Assigning to [~ppenumarthy] for code review.

> Schema change exception even with all_text_mode enabled
> ---
>
> Key: DRILL-4503
> URL: https://issues.apache.org/jira/browse/DRILL-4503
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.6.0
>Reporter: Aman Sinha
>Assignee: Padma Penumarthy
> Fix For: Future
>
> Attachments: mostlynulls_1.json
>
>
> Both HashAggregate and StreamingAggregate encounter a schema change error when 
> querying a JSON file with non-null values for column 'a' and many null values 
> for column 'c'.
> This occurs even when all_text_mode is enabled, which seems counterintuitive 
> since once all_text_mode is enabled, everything (including nulls) should be 
> treated as varchar and one would expect no schema change errors.  
> Here are some example queries that encounter this error: 
> {noformat}
> 0: jdbc:drill:zk=local> select a, c from dfs.`mostlynulls_1.json` group by a, 
> c;
> Error: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema 
> changes
> 0: jdbc:drill:zk=local> alter session set `store.json.all_text_mode` = true;
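> +-------+------------------------------------+
> |  ok   |              summary               |
> +-------+------------------------------------+
> | true  | store.json.all_text_mode updated.  |
> +-------+------------------------------------+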
> 1 row selected (0.15 seconds)
> 0: jdbc:drill:zk=local> select a, c from dfs.`mostlynulls_1.json` group by a, 
> c;
> Error: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema 
> changes
> 0: jdbc:drill:zk=local> select min(a), min(c) from dfs.`mostlynulls_1.json`;
> Error: UNSUPPORTED_OPERATION ERROR: Streaming aggregate does not support 
> schema changes
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (DRILL-4491) FormatPluginOptionsDescriptor requires FormatPluginConfig fields to be public

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-4491:
---

Assignee: Parth Chandra  (was: Aditya Kishore)

Assigning to [~parthc] for code review

> FormatPluginOptionsDescriptor requires FormatPluginConfig fields to be public
> -
>
> Key: DRILL-4491
> URL: https://issues.apache.org/jira/browse/DRILL-4491
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Aditya Kishore
>Assignee: Parth Chandra
>Priority: Minor
> Fix For: Future
>
>
> The code uses {{getField()}}, which returns only public fields, instead of 
> {{getDeclaredField()}}.
> {code:title=FormatPluginOptionsDescriptor.java:165|borderStyle=solid}
> Field field = pluginConfigClass.getField(paramDef.name);
> {code}
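
The difference between the two reflection calls can be seen in a small standalone Java example. SampleFormatConfig is a hypothetical stand-in for a FormatPluginConfig and is not Drill code; note also that getDeclaredField() only sees fields declared directly on the class, not inherited ones.

```java
import java.lang.reflect.Field;

final class FieldLookupSketch {

  // Hypothetical config class with one public and one private field.
  static class SampleFormatConfig {
    public String extension = "csv";
    private String delimiter = ",";
  }

  /** getField() only finds public fields (including inherited public ones). */
  static boolean foundByGetField(Class<?> type, String name) {
    try {
      type.getField(name);
      return true;
    } catch (NoSuchFieldException e) {
      return false;
    }
  }

  /** getDeclaredField() finds fields of any access level declared on the class itself. */
  static boolean foundByGetDeclaredField(Class<?> type, String name) {
    try {
      type.getDeclaredField(name);
      return true;
    } catch (NoSuchFieldException e) {
      return false;
    }
  }

  /** Reads a declared field, making it accessible first in case it is not public. */
  static Object readDeclared(Object target, String name) {
    try {
      Field field = target.getClass().getDeclaredField(name);
      field.setAccessible(true);
      return field.get(target);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot read field " + name, e);
    }
  }

  public static void main(String[] args) {
    System.out.println(foundByGetField(SampleFormatConfig.class, "delimiter"));
    System.out.println(foundByGetDeclaredField(SampleFormatConfig.class, "delimiter"));
    System.out.println(readDeclared(new SampleFormatConfig(), "delimiter"));
  }
}
```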





[jira] [Updated] (DRILL-4069) Enable RPC Thread Offload by default

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong updated DRILL-4069:

Assignee: Sudheesh Katkam

> Enable RPC Thread Offload by default
> 
>
> Key: DRILL-4069
> URL: https://issues.apache.org/jira/browse/DRILL-4069
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - RPC
>Reporter: Jacques Nadeau
>Assignee: Sudheesh Katkam
>
> Once we enabled RPC thread offload, we saw concurrency regressions DRILL-4041 
> and DRILL-4057. The regressions appear to be unrelated to, but exposed by the 
> thread offload. To simplify things, we disabled the RPC thread offload by 
> default. This is the tracking issue to fix the underlying concurrency bug(s).





[jira] [Assigned] (DRILL-3754) Remove redundancy in run-time generated code for common column references.

2016-09-30 Thread Zelaine Fong (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zelaine Fong reassigned DRILL-3754:
---

Assignee: Chunhui Shi  (was: Jinfeng Ni)

Assigning to [~cshi] for code review

> Remove redundancy in run-time generated code for common column references. 
> ---
>
> Key: DRILL-3754
> URL: https://issues.apache.org/jira/browse/DRILL-3754
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Codegen
>Affects Versions: 1.1.0
>Reporter: Jinfeng Ni
>Assignee: Chunhui Shi
> Fix For: Future
>
> Attachments: 
> 0002-DRILL-3754-Reduce-redundancy-in-run-time-generated-c.patch
>
>
> When an operator (Filter, Project) has an expression that refers to the same 
> field multiple times, Drill initializes a value vector and performs a value 
> holder assignment for each field reference in the run-time generated code. The 
> redundancy can slow expression evaluation when the compiled code is 
> executed over a large number of incoming rows.
> This was seen in a recent performance issue reported on the Drill user 
> list, where the query contained multiple IN-list filter conditions. 
> In this JIRA, we'll remove the redundancy for common field references, so 
> that only one initialization and assignment happens in the run-time generated 
> code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4864) Add ANSI format for date/time functions

2016-09-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537657#comment-15537657
 ] 

ASF GitHub Bot commented on DRILL-4864:
---

Github user gparai commented on a diff in the pull request:

https://github.com/apache/drill/pull/581#discussion_r81438363
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/AnsiToJoda.java 
---
@@ -0,0 +1,58 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to you under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+* http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+package org.apache.drill.exec.expr.fn.impl;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.holders.VarCharHolder;
+
+import javax.inject.Inject;
+
+/**
+ * Replaces all ansi patterns to joda equivalents.
+ */
+@FunctionTemplate(name = "ansi_to_joda",
+  scope = FunctionTemplate.FunctionScope.SIMPLE,
+  nulls= FunctionTemplate.NullHandling.NULL_IF_NULL)
--- End diff --

nulls =


> Add ANSI format for date/time functions
> ---
>
> Key: DRILL-4864
> URL: https://issues.apache.org/jira/browse/DRILL-4864
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.8.0
>Reporter: Serhii Harnyk
>Assignee: Gautam Kumar Parai
>  Labels: doc-impacting
> Fix For: 1.9.0
>
>
> The TO_DATE() is exposing the Joda string formatting conventions into the SQL 
> layer. This is not following SQL conventions used by ANSI and many other 
> database engines on the market.
> Add new UDF "ansi_to_joda(string)", that takes string that represents ANSI 
> datetime format and returns string that represents equal Joda format.
> Add new session option "drill.exec.fn.to_date_format" that can be one of two 
> values - "JODA"(default) and "ANSI".
> If option is set to "JODA" queries with to_date() function would work in 
> usual way.
> If option is set to "ANSI" second argument would be wrapped with 
> ansi_to_joda() function, that allows user to use ANSI datetime format
> Wrapping is used in to_date(), to_time() and to_timestamp() functions.
> Table of joda and ansi patterns which may be replaced
> || Pattern name || Ansi format || JodaTime format ||
> | Full name of day | day | EEEE |
> | Day of year | ddd | D |
> | Day of month | dd | d |
> | Day of week | d | e |
> | Name of month | month | MMMM |
> | Abr name of month | mon | MMM |
> | Full era name | ee | G |
> | Name of day | dy | E |
> | Time zone | tz | TZ |
> | Hour 12 | hh | h |
> | Hour 12 | hh12 | h |
> | Hour 24 | hh24 | H |
> | Minute of hour | mi | m |
> | Second of minute | ss | s |
> | Millisecond of minute | ms | S |
> | Week of year | ww | w |
> | Month | mm | MM |
> | Halfday am | am | aa |
> | Halfday pm | pm | aa |
> | ref. | https://www.postgresql.org/docs/8.2/static/functions-formatting.html | http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html |
> Table of ansi pattern modifiers, which may be deleted from string
> ||Description ||  Pattern ||
> | fill mode (suppress padding blanks and zeroes)  |   fm  |
> | fixed format global option (see usage notes)|   fx  |
> | translation mode (print localized day and month names based on 
> lc_messages) |   tm  |
> | sp

[jira] [Commented] (DRILL-4864) Add ANSI format for date/time functions

2016-09-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537656#comment-15537656
 ] 

ASF GitHub Bot commented on DRILL-4864:
---

Github user gparai commented on a diff in the pull request:

https://github.com/apache/drill/pull/581#discussion_r81440360
  
--- Diff: 
logical/src/main/java/org/apache/drill/common/expression/fn/JodaDateValidator.java
 ---
@@ -0,0 +1,213 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to you under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+* http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+package org.apache.drill.common.expression.fn;
+
+import com.google.common.collect.Sets;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.drill.common.map.CaseInsensitiveMap;
+
+import java.util.Comparator;
+import java.util.Set;
+
+public class JodaDateValidator {
+
+  private static final Set<String> ansiValuesForDeleting = 
Sets.newTreeSet(new LengthDescComparator());
+  private static final CaseInsensitiveMap<String> ansiToJodaMap = 
CaseInsensitiveMap.newTreeMap(new LengthDescComparator());
+
+  //tokens for deleting
+  public static final String SUFFIX_SP = "sp";
+  public static final String PREFIX_FM = "fm";
+  public static final String PREFIX_FX = "fx";
+  public static final String PREFIX_TM = "tm";
+
+  //ansi patterns
+  public static final String ANSI_FULL_NAME_OF_DAY = "day";
+  public static final String ANSI_DAY_OF_YEAR = "ddd";
+  public static final String ANSI_DAY_OF_MONTH = "dd";
+  public static final String ANSI_DAY_OF_WEEK = "d";
+  public static final String ANSI_NAME_OF_MONTH = "month";
+  public static final String ANSI_ABR_NAME_OF_MONTH = "mon";
+  public static final String ANSI_FULL_ERA_NAME = "ee";
+  public static final String ANSI_NAME_OF_DAY = "dy";
+  public static final String ANSI_TIME_ZONE_NAME = "tz";
+  public static final String ANSI_HOUR_12_NAME = "hh";
+  public static final String ANSI_HOUR_12_OTHER_NAME = "hh12";
+  public static final String ANSI_HOUR_24_NAME = "hh24";
+  public static final String ANSI_MINUTE_OF_HOUR_NAME = "mi";
+  public static final String ANSI_SECOND_OF_MINUTE_NAME = "ss";
+  public static final String ANSI_MILLISECOND_OF_MINUTE_NAME = "ms";
+  public static final String ANSI_WEEK_OF_YEAR = "ww";
+  public static final String ANSI_MONTH = "mm";
+  public static final String ANSI_HALFDAY_AM = "am";
+  public static final String ANSI_HALFDAY_PM = "pm";
+
+  //jodaTime patterns
+  public static final String JODA_FULL_NAME_OF_DAY = "EEEE";
+  public static final String JODA_DAY_OF_YEAR = "D";
+  public static final String JODA_DAY_OF_MONTH = "d";
+  public static final String JODA_DAY_OF_WEEK = "e";
+  public static final String JODA_NAME_OF_MONTH = "MMMM";
+  public static final String JODA_ABR_NAME_OF_MONTH = "MMM";
+  public static final String JODA_FULL_ERA_NAME = "G";
+  public static final String JODA_NAME_OF_DAY = "E";
+  public static final String JODA_TIME_ZONE_NAME = "TZ";
+  public static final String JODA_HOUR_12_NAME = "h";
+  public static final String JODA_HOUR_12_OTHER_NAME = "h";
+  public static final String JODA_HOUR_24_NAME = "H";
+  public static final String JODA_MINUTE_OF_HOUR_NAME = "m";
+  public static final String JODA_SECOND_OF_MINUTE_NAME = "s";
+  public static final String JODA_MILLISECOND_OF_MINUTE_NAME = "S";
+  public static final String JODA_WEEK_OF_YEAR = "w";
+  public static final String JODA_MONTH = "MM";
+  public static final String JODA_HALFDAY = "aa";
+
+  static {
+ansiToJodaMap.put(ANSI_FULL_NAME_OF_DAY, JODA_FULL_NAME_OF_DAY);
+ansiToJodaMap.put(ANSI_DAY_OF_YEAR, JODA_DAY_OF_YEAR);
+ansiToJodaMap.put(ANSI_DAY_OF_MONTH, JODA_DAY_OF_MONTH);
+ansiToJodaMap.put(ANSI_DAY_OF_WEEK, JODA_DAY_OF_WEEK);
+ansiToJodaMap.put(ANSI_NAME_OF_MONTH, JODA_NAME_OF_MONTH);
+ansiToJodaMap.put(ANSI_ABR_NAME_OF_MONTH, JODA_ABR_NAME_OF_MONTH);
+ansiToJodaMap.put(ANSI_FULL_ERA_

[jira] [Commented] (DRILL-4864) Add ANSI format for date/time functions

2016-09-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537658#comment-15537658
 ] 

ASF GitHub Bot commented on DRILL-4864:
---

Github user gparai commented on a diff in the pull request:

https://github.com/apache/drill/pull/581#discussion_r81438806
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillOptiq.java
 ---
@@ -408,6 +411,12 @@ private LogicalExpression 
getDrillFunctionFromOptiqCall(RexCall call) {
 
   return first;
 }
+  } else if (functionName.equals("to_date") || 
functionName.equals("to_time") || functionName.equals("to_timestamp")) {
+// convert ansi date format string to joda according to session 
option
+OptionManager om = this.context.getPlannerSettings().getOptions();
+
if(ToDateFormats.valueOf(om.getOption(ExecConstants.TO_DATE_FORMAT).string_val.toUpperCase()).equals(ToDateFormats.ANSI))
 {
+  args.set(1, FunctionCallFactory.createExpression("ansi_to_joda", 
Arrays.asList(args.get(1))));
--- End diff --

What would happen if 
drill.exec.fn.to_date_format = 'ansi'  
query: select to_date(1234545, ansi_to_joda('dd-MM-yyyy')) from emp;

Would we get select to_date(1234545, 
ansi_to_joda(ansi_to_joda('dd-MM-yyyy'))) from emp;?


> Add ANSI format for date/time functions
> ---
>
> Key: DRILL-4864
> URL: https://issues.apache.org/jira/browse/DRILL-4864
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.8.0
>Reporter: Serhii Harnyk
>Assignee: Gautam Kumar Parai
>  Labels: doc-impacting
> Fix For: 1.9.0
>
>
> The TO_DATE() is exposing the Joda string formatting conventions into the SQL 
> layer. This is not following SQL conventions used by ANSI and many other 
> database engines on the market.
> Add new UDF "ansi_to_joda(string)", that takes string that represents ANSI 
> datetime format and returns string that represents equal Joda format.
> Add new session option "drill.exec.fn.to_date_format" that can be one of two 
> values - "JODA"(default) and "ANSI".
> If option is set to "JODA" queries with to_date() function would work in 
> usual way.
> If option is set to "ANSI" second argument would be wrapped with 
> ansi_to_joda() function, that allows user to use ANSI datetime format
> Wrapping is used in to_date(), to_time() and to_timestamp() functions.
> Table of joda and ansi patterns which may be replaced
> || Pattern name || Ansi format || JodaTime format ||
> | Full name of day | day | EEEE |
> | Day of year | ddd | D |
> | Day of month | dd | d |
> | Day of week | d | e |
> | Name of month | month | MMMM |
> | Abr name of month | mon | MMM |
> | Full era name | ee | G |
> | Name of day | dy | E |
> | Time zone | tz | TZ |
> | Hour 12 | hh | h |
> | Hour 12 | hh12 | h |
> | Hour 24 | hh24 | H |
> | Minute of hour | mi | m |
> | Second of minute | ss | s |
> | Millisecond of minute | ms | S |
> | Week of year | ww | w |
> | Month | mm | MM |
> | Halfday am | am | aa |
> | Halfday pm | pm | aa |
> | ref. | https://www.postgresql.org/docs/8.2/static/functions-formatting.html | http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html |
> Table of ansi pattern modifiers, which may be deleted from string
> ||Description ||  Pattern ||
> | fill mode (suppress padding blanks and zeroes)  |   fm  |
> | fixed format global option (see usage notes)|   fx  |
> | translation mode (print localized day and month names based on 
> lc_messages) |   tm  |
> | spell mode (not yet implemented)|   sp  |
> | ref.|   
> https://www.postgresql.org/docs/8.2/static/functions-formatting.html|
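
The replacement table above contains overlapping tokens (ddd, dd and d; hh24, hh12 and hh), so the order in which tokens are tried matters. The Java sketch below assumes a longest-token-first scan, which is what a length-descending comparator such as the LengthDescComparator named in the patch suggests. It is not the patch's implementation: it covers only a subset of the table, ignores the deletable modifiers, and uses hypothetical class and method names.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

final class AnsiToJodaSketch {

  // Longer tokens sort first so that "ddd" is tried before "dd" and "d".
  private static final Comparator<String> LENGTH_DESC = (a, b) -> {
    int byLength = Integer.compare(b.length(), a.length());
    return byLength != 0 ? byLength : a.compareTo(b);
  };

  // Subset of the ANSI-to-Joda table from the issue description.
  private static final Map<String, String> ANSI_TO_JODA = new TreeMap<>(LENGTH_DESC);

  static {
    ANSI_TO_JODA.put("ddd", "D");
    ANSI_TO_JODA.put("dd", "d");
    ANSI_TO_JODA.put("d", "e");
    ANSI_TO_JODA.put("mon", "MMM");
    ANSI_TO_JODA.put("mm", "MM");
    ANSI_TO_JODA.put("hh24", "H");
    ANSI_TO_JODA.put("hh12", "h");
    ANSI_TO_JODA.put("hh", "h");
    ANSI_TO_JODA.put("mi", "m");
    ANSI_TO_JODA.put("ss", "s");
  }

  /** Scans left to right, replacing the longest matching token at each position. */
  static String convert(String ansiPattern) {
    StringBuilder joda = new StringBuilder();
    int pos = 0;
    while (pos < ansiPattern.length()) {
      String matched = null;
      for (String token : ANSI_TO_JODA.keySet()) {
        // Case-insensitive comparison of the token against the text at pos.
        if (ansiPattern.regionMatches(true, pos, token, 0, token.length())) {
          matched = token;
          break;
        }
      }
      if (matched == null) {
        joda.append(ansiPattern.charAt(pos));
        pos++;
      } else {
        joda.append(ANSI_TO_JODA.get(matched));
        pos += matched.length();
      }
    }
    return joda.toString();
  }

  public static void main(String[] args) {
    System.out.println(convert("dd-mm hh24:mi:ss"));
  }
}
```

Trying short tokens first would turn "ddd" into three day-of-week symbols instead of one day-of-year symbol, which is why the ordering matters.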





[jira] [Commented] (DRILL-4864) Add ANSI format for date/time functions

2016-09-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537660#comment-15537660
 ] 

ASF GitHub Bot commented on DRILL-4864:
---

Github user gparai commented on a diff in the pull request:

https://github.com/apache/drill/pull/581#discussion_r81439112
  
--- Diff: 
logical/src/main/java/org/apache/drill/common/expression/fn/JodaDateValidator.java
 ---
@@ -0,0 +1,213 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to you under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+* http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+package org.apache.drill.common.expression.fn;
+
+import com.google.common.collect.Sets;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.drill.common.map.CaseInsensitiveMap;
+
+import java.util.Comparator;
+import java.util.Set;
+
+public class JodaDateValidator {
+
+  private static final Set<String> ansiValuesForDeleting = 
Sets.newTreeSet(new LengthDescComparator());
+  private static final CaseInsensitiveMap<String> ansiToJodaMap = 
CaseInsensitiveMap.newTreeMap(new LengthDescComparator());
+
+  //tokens for deleting
+  public static final String SUFFIX_SP = "sp";
+  public static final String PREFIX_FM = "fm";
+  public static final String PREFIX_FX = "fx";
+  public static final String PREFIX_TM = "tm";
+
+  //ansi patterns
+  public static final String ANSI_FULL_NAME_OF_DAY = "day";
+  public static final String ANSI_DAY_OF_YEAR = "ddd";
+  public static final String ANSI_DAY_OF_MONTH = "dd";
+  public static final String ANSI_DAY_OF_WEEK = "d";
+  public static final String ANSI_NAME_OF_MONTH = "month";
+  public static final String ANSI_ABR_NAME_OF_MONTH = "mon";
+  public static final String ANSI_FULL_ERA_NAME = "ee";
+  public static final String ANSI_NAME_OF_DAY = "dy";
+  public static final String ANSI_TIME_ZONE_NAME = "tz";
+  public static final String ANSI_HOUR_12_NAME = "hh";
+  public static final String ANSI_HOUR_12_OTHER_NAME = "hh12";
+  public static final String ANSI_HOUR_24_NAME = "hh24";
+  public static final String ANSI_MINUTE_OF_HOUR_NAME = "mi";
+  public static final String ANSI_SECOND_OF_MINUTE_NAME = "ss";
+  public static final String ANSI_MILLISECOND_OF_MINUTE_NAME = "ms";
+  public static final String ANSI_WEEK_OF_YEAR = "ww";
+  public static final String ANSI_MONTH = "mm";
+  public static final String ANSI_HALFDAY_AM = "am";
+  public static final String ANSI_HALFDAY_PM = "pm";
+
+  //jodaTime patterns
+  public static final String JODA_FULL_NAME_OF_DAY = "EEEE";
+  public static final String JODA_DAY_OF_YEAR = "D";
+  public static final String JODA_DAY_OF_MONTH = "d";
+  public static final String JODA_DAY_OF_WEEK = "e";
+  public static final String JODA_NAME_OF_MONTH = "MMMM";
+  public static final String JODA_ABR_NAME_OF_MONTH = "MMM";
+  public static final String JODA_FULL_ERA_NAME = "G";
+  public static final String JODA_NAME_OF_DAY = "E";
+  public static final String JODA_TIME_ZONE_NAME = "TZ";
+  public static final String JODA_HOUR_12_NAME = "h";
+  public static final String JODA_HOUR_12_OTHER_NAME = "h";
+  public static final String JODA_HOUR_24_NAME = "H";
+  public static final String JODA_MINUTE_OF_HOUR_NAME = "m";
+  public static final String JODA_SECOND_OF_MINUTE_NAME = "s";
+  public static final String JODA_MILLISECOND_OF_MINUTE_NAME = "S";
+  public static final String JODA_WEEK_OF_YEAR = "w";
+  public static final String JODA_MONTH = "MM";
+  public static final String JODA_HALFDAY = "aa";
+
+  static {
+ansiToJodaMap.put(ANSI_FULL_NAME_OF_DAY, JODA_FULL_NAME_OF_DAY);
+ansiToJodaMap.put(ANSI_DAY_OF_YEAR, JODA_DAY_OF_YEAR);
+ansiToJodaMap.put(ANSI_DAY_OF_MONTH, JODA_DAY_OF_MONTH);
+ansiToJodaMap.put(ANSI_DAY_OF_WEEK, JODA_DAY_OF_WEEK);
+ansiToJodaMap.put(ANSI_NAME_OF_MONTH, JODA_NAME_OF_MONTH);
+ansiToJodaMap.put(ANSI_ABR_NAME_OF_MONTH, JODA_ABR_NAME_OF_MONTH);
+ansiToJodaMap.put(ANSI_FULL_ERA_

[jira] [Commented] (DRILL-4864) Add ANSI format for date/time functions

2016-09-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537661#comment-15537661
 ] 

ASF GitHub Bot commented on DRILL-4864:
---

Github user gparai commented on a diff in the pull request:

https://github.com/apache/drill/pull/581#discussion_r81440134
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/AnsiToJoda.java ---
@@ -0,0 +1,58 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to you under the Apache License, Version 2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+* http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+package org.apache.drill.exec.expr.fn.impl;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.holders.VarCharHolder;
+
+import javax.inject.Inject;
+
+/**
+ * Replaces all ansi patterns to joda equivalents.
+ */
+@FunctionTemplate(name = "ansi_to_joda",
+  scope = FunctionTemplate.FunctionScope.SIMPLE,
+  nulls= FunctionTemplate.NullHandling.NULL_IF_NULL)
+public class AnsiToJoda implements DrillSimpleFunc {
+
+  @Param
+  VarCharHolder in;
+
+  @Output
+  VarCharHolder out;
+
+  @Inject
+  DrillBuf buffer;
+
+  @Override
+  public void setup() {
+  }
+
+  @Override
+  public void eval() {
+String pattern = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.toStringFromUTF8(in.start, in.end, in.buffer);
--- End diff --

Would it be good to validate the ANSI pattern prior to converting it to 
JODA?
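
One way such a validation step could look is sketched below. It is only an illustration, not code from the pull request: the class name `AnsiPatternValidator` and its method are hypothetical, the token list is taken from the tables in the issue description, and accepting `yyyy`/`yy` is an assumption because year patterns are not listed there. Quoted literal text inside a pattern is not handled.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the pull request.
final class AnsiPatternValidator {

  // Tokens from the issue's pattern and modifier tables, ordered longest first
  // so that the first hit is also the longest one.
  // "yyyy" and "yy" are an assumption: year patterns are not in the tables.
  private static final List<String> TOKENS = Arrays.asList(
      "month",
      "hh24", "hh12", "yyyy",
      "day", "ddd", "mon",
      "dd", "ee", "dy", "tz", "hh", "mi", "ss", "ms", "ww", "mm", "am", "pm",
      "yy", "fm", "fx", "tm", "sp",
      "d");

  /** Returns the index of the first unrecognized letter, or -1 if none is found. */
  static int firstInvalidIndex(String pattern) {
    int i = 0;
    while (i < pattern.length()) {
      if (!Character.isLetter(pattern.charAt(i))) {
        i++; // separators, digits and whitespace are passed over
        continue;
      }
      int matched = 0;
      for (String token : TOKENS) {
        // case-insensitive comparison at position i
        if (pattern.regionMatches(true, i, token, 0, token.length())) {
          matched = token.length();
          break;
        }
      }
      if (matched == 0) {
        return i; // a letter that starts no known token
      }
      i += matched;
    }
    return -1;
  }
}
```

A caller could reject the pattern with a user-facing error when the result is not -1, instead of handing a partly converted string to Joda.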



[jira] [Commented] (DRILL-4864) Add ANSI format for date/time functions

2016-09-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537659#comment-15537659
 ] 

ASF GitHub Bot commented on DRILL-4864:
---

Github user gparai commented on a diff in the pull request:

https://github.com/apache/drill/pull/581#discussion_r81438265
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillOptiq.java ---
@@ -408,6 +411,12 @@ private LogicalExpression getDrillFunctionFromOptiqCall(RexCall call) {
 
   return first;
 }
+  } else if (functionName.equals("to_date") || functionName.equals("to_time") || functionName.equals("to_timestamp")) {
--- End diff --

Is `equalsIgnoreCase` needed here?
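
Whether `functionName` is already normalized at this point is not visible in the quoted hunk. If it is not, a case-insensitive check could be sketched as follows; `DateFunctionNames` and `needsAnsiWrapping` are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the pull request.
final class DateFunctionNames {

  // The three functions the diff wraps with ansi_to_joda().
  private static final Set<String> WRAPPED =
      new HashSet<>(Arrays.asList("to_date", "to_time", "to_timestamp"));

  /** True if the function name matches one of the wrapped functions, ignoring case. */
  static boolean needsAnsiWrapping(String functionName) {
    return functionName != null
        && WRAPPED.contains(functionName.toLowerCase(Locale.ROOT));
  }
}
```

Lowercasing once and looking the name up in a set also avoids repeating three separate `equals` calls in the `else if` condition.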


> Add ANSI format for date/time functions
> ---
>
> Key: DRILL-4864
> URL: https://issues.apache.org/jira/browse/DRILL-4864
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.8.0
> Reporter: Serhii Harnyk
> Assignee: Gautam Kumar Parai
> Labels: doc-impacting
> Fix For: 1.9.0
>
>
> The TO_DATE() function exposes the Joda string formatting conventions in the 
> SQL layer. This does not follow the SQL conventions used by ANSI and many 
> other database engines on the market.
> Add a new UDF "ansi_to_joda(string)" that takes a string representing an ANSI 
> datetime format and returns a string representing the equivalent Joda format.
> Add a new session option "drill.exec.fn.to_date_format" that can take one of 
> two values: "JODA" (default) and "ANSI".
> If the option is set to "JODA", queries with the to_date() function work in 
> the usual way.
> If the option is set to "ANSI", the second argument is wrapped with the 
> ansi_to_joda() function, which allows the user to use an ANSI datetime format.
> Wrapping is used in the to_date(), to_time() and to_timestamp() functions.
> Table of Joda and ANSI patterns which may be replaced:
> || Pattern name || ANSI format || JodaTime format ||
> | Full name of day | day | |
> | Day of year | ddd | D |
> | Day of month | dd | d |
> | Day of week | d | e |
> | Name of month | month | |
> | Abr name of month | mon | MMM |
> | Full era name | ee | G |
> | Name of day | dy | E |
> | Time zone | tz | TZ |
> | Hour 12 | hh | h |
> | Hour 12 | hh12 | h |
> | Hour 24 | hh24 | H |
> | Minute of hour | mi | m |
> | Second of minute | ss | s |
> | Millisecond of minute | ms | S |
> | Week of year | ww | w |
> | Month | mm | MM |
> | Halfday am | am | aa |
> | Halfday pm | pm | aa |
> | ref. | https://www.postgresql.org/docs/8.2/static/functions-formatting.html | http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html |
> Table of ANSI pattern modifiers, which may be deleted from the string:
> || Description || Pattern ||
> | fill mode (suppress padding blanks and zeroes) | fm |
> | fixed format global option (see usage notes) | fx |
> | translation mode (print localized day and month names based on lc_messages) | tm |
> | spell mode (not yet implemented) | sp |
> | ref. | https://www.postgresql.org/docs/8.2/static/functions-formatting.html |
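
The replacement and modifier tables quoted above can be read as a single left-to-right, longest-match substitution. The sketch below illustrates that reading only. It is not the implementation from the pull request, and `AnsiToJodaSketch` is a hypothetical name. The JodaTime cells for `day` and `month` are blank in the archived table, so `EEEE` and `MMMM` are assumptions here, based on Joda's rule that four or more pattern letters select the full text form. Characters that start no known token, such as `yyyy` or separators, are copied through unchanged.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the code under review.
final class AnsiToJodaSketch {

  // Insertion order is longest token first, so the first hit is the longest match.
  private static final Map<String, String> ANSI_TO_JODA = new LinkedHashMap<>();

  static {
    ANSI_TO_JODA.put("month", "MMMM"); // assumption: cell is blank in the table
    ANSI_TO_JODA.put("hh24", "H");
    ANSI_TO_JODA.put("hh12", "h");
    ANSI_TO_JODA.put("day", "EEEE");   // assumption: cell is blank in the table
    ANSI_TO_JODA.put("ddd", "D");
    ANSI_TO_JODA.put("mon", "MMM");
    ANSI_TO_JODA.put("dd", "d");
    ANSI_TO_JODA.put("ee", "G");
    ANSI_TO_JODA.put("dy", "E");
    ANSI_TO_JODA.put("tz", "TZ");
    ANSI_TO_JODA.put("hh", "h");
    ANSI_TO_JODA.put("mi", "m");
    ANSI_TO_JODA.put("ss", "s");
    ANSI_TO_JODA.put("ms", "S");
    ANSI_TO_JODA.put("ww", "w");
    ANSI_TO_JODA.put("mm", "MM");
    ANSI_TO_JODA.put("am", "aa");
    ANSI_TO_JODA.put("pm", "aa");
    // Modifiers from the second table are deleted from the string.
    ANSI_TO_JODA.put("fm", "");
    ANSI_TO_JODA.put("fx", "");
    ANSI_TO_JODA.put("tm", "");
    ANSI_TO_JODA.put("sp", "");
    ANSI_TO_JODA.put("d", "e");
  }

  /** Converts an ANSI-style datetime pattern to a Joda-style pattern. */
  static String convert(String ansi) {
    StringBuilder joda = new StringBuilder();
    int i = 0;
    while (i < ansi.length()) {
      String hit = null;
      for (String token : ANSI_TO_JODA.keySet()) {
        // case-insensitive comparison at position i
        if (ansi.regionMatches(true, i, token, 0, token.length())) {
          hit = token;
          break;
        }
      }
      if (hit == null) {
        joda.append(ansi.charAt(i)); // not a known token: copy unchanged
        i++;
      } else {
        joda.append(ANSI_TO_JODA.get(hit));
        i += hit.length();
      }
    }
    return joda.toString();
  }
}
```

With this sketch, `convert("yyyy-mm-dd hh24:mi:ss")` yields `yyyy-MM-d H:m:s`. Matching the longest token first matters because several tokens are prefixes of others (`d`, `dd`, `ddd`; `hh`, `hh12`, `hh24`; `mon`, `month`).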



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-4864) Add ANSI format for date/time functions

2016-09-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15537662#comment-15537662
 ] 

ASF GitHub Bot commented on DRILL-4864:
---

Github user gparai commented on a diff in the pull request:

https://github.com/apache/drill/pull/581#discussion_r81438289
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillOptiq.java ---
@@ -408,6 +411,12 @@ private LogicalExpression getDrillFunctionFromOptiqCall(RexCall call) {
 
   return first;
 }
+  } else if (functionName.equals("to_date") || functionName.equals("to_time") || functionName.equals("to_timestamp")) {
+// convert ansi date format string to joda according to session option
+OptionManager om = this.context.getPlannerSettings().getOptions();
+if(ToDateFormats.valueOf(om.getOption(ExecConstants.TO_DATE_FORMAT).string_val.toUpperCase()).equals(ToDateFormats.ANSI)) {
--- End diff --

Nit: add a space after `if`, i.e. `if (`.




