[jira] [Updated] (HIVE-12673) Orcfiledump throws NPE when no files are available

2015-12-14 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-12673:

Attachment: HIVE-12673.2.patch

Exiting early from printJsonMetaData when files are not present.

> Orcfiledump throws NPE when no files are available
> --
>
> Key: HIVE-12673
> URL: https://issues.apache.org/jira/browse/HIVE-12673
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-12673.1.patch, HIVE-12673.2.patch
>
>
> {noformat}
> Exception in thread "main" java.lang.NullPointerException
>   at org.codehaus.jettison.json.JSONTokener.more(JSONTokener.java:106)
>   at org.codehaus.jettison.json.JSONTokener.next(JSONTokener.java:116)
>   at 
> org.codehaus.jettison.json.JSONTokener.nextClean(JSONTokener.java:170)
>   at org.codehaus.jettison.json.JSONObject.(JSONObject.java:185)
>   at org.codehaus.jettison.json.JSONObject.(JSONObject.java:293)
>   at 
> org.apache.hadoop.hive.ql.io.orc.JsonFileDump.printJsonMetaData(JsonFileDump.java:197)
>   at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:107)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> hive --orcfiledump -j -p /tmp/orc/inventory/inv_date_sk=2452654



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants

2015-12-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11927:
---
Attachment: HIVE-11927.12.patch

> Implement/Enable constant related optimization rules in Calcite: enable 
> HiveReduceExpressionsRule to fold constants
> ---
>
> Key: HIVE-11927
> URL: https://issues.apache.org/jira/browse/HIVE-11927
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, 
> HIVE-11927.03.patch, HIVE-11927.04.patch, HIVE-11927.05.patch, 
> HIVE-11927.06.patch, HIVE-11927.07.patch, HIVE-11927.08.patch, 
> HIVE-11927.09.patch, HIVE-11927.10.patch, HIVE-11927.11.patch, 
> HIVE-11927.12.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12675) PerfLogger should log performance metrics at debug level

2015-12-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12675:
-
Attachment: HIVE-12675.1.patch

cc-ing [~jpullokkaran] and [~ashutoshc] for review.

> PerfLogger should log performance metrics at debug level
> 
>
> Key: HIVE-12675
> URL: https://issues.apache.org/jira/browse/HIVE-12675
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12675.1.patch
>
>
> As more and more subcomponents of Hive (Tez, Optimizer) etc are using 
> PerfLogger to track the performance metrics, it will be more meaningful to 
> set the PerfLogger logging level to DEBUG. Otherwise, we will print the 
> performance metrics unnecessarily for each and every query if the underlying 
> subcomponent does not control the PerfLogging via a parameter on its own.
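For illustration, a minimal sketch of what logging a metric at DEBUG behind a guard looks like; the logger setup and the PERFLOG message format here are assumptions, not the actual Hive PerfLogger code:
{code}
// Illustrative sketch only; logger name and message format are assumptions.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class PerfLogSketch {
  private static final Logger LOG = LoggerFactory.getLogger(PerfLogSketch.class);

  void perfLogEnd(String method, long startTimeMs) {
    long endTimeMs = System.currentTimeMillis();
    if (LOG.isDebugEnabled()) {
      // Emitted only when DEBUG is enabled, so per-query metrics no longer
      // appear unconditionally in the logs.
      LOG.debug("</PERFLOG method={} start={} end={} duration={}>",
          method, startTimeMs, endTimeMs, endTimeMs - startTimeMs);
    }
  }
}
{code}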



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-12673) Orcfiledump throws NPE when no files are available

2015-12-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056982#comment-15056982
 ] 

Prasanth Jayachandran edited comment on HIVE-12673 at 12/14/15 11:38 PM:
-

The problem is that the JSONObject is left in an incomplete state. I think we should fix that.

1) If the files list is empty or null, return at the beginning of the function.
2) At line 208, add an else condition and call writer.endObject() to finish the 
object as 'done'.



was (Author: prasanth_j):
The problem is JSONObject is in complete state. I think we should fix that. 

1) If the files list is empty or null return at the beginning of the function
2) line:208. Add else condition and do writer.endObject() to finish the object 
as 'done'
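For illustration, a minimal, self-contained sketch of the early-exit guard described above; the method shape and the string building are hypothetical, not the actual JsonFileDump code:
{code}
// Hypothetical sketch of the guard described above.
import java.util.List;

class JsonDumpSketch {
  static String printJsonMetaData(List<String> files) {
    if (files == null || files.isEmpty()) {
      // Exit early and still emit a complete (empty) JSON object, so no
      // half-written output reaches the JSON tokener downstream.
      return "{}";
    }
    StringBuilder sb = new StringBuilder("{\"files\":[");
    for (int i = 0; i < files.size(); i++) {
      if (i > 0) sb.append(',');
      sb.append('"').append(files.get(i)).append('"');
    }
    // Always close the object so it is never left in an incomplete state.
    return sb.append("]}").toString();
  }
}
{code}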


> Orcfiledump throws NPE when no files are available
> --
>
> Key: HIVE-12673
> URL: https://issues.apache.org/jira/browse/HIVE-12673
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-12673.1.patch
>
>
> {noformat}
> Exception in thread "main" java.lang.NullPointerException
>   at org.codehaus.jettison.json.JSONTokener.more(JSONTokener.java:106)
>   at org.codehaus.jettison.json.JSONTokener.next(JSONTokener.java:116)
>   at 
> org.codehaus.jettison.json.JSONTokener.nextClean(JSONTokener.java:170)
>   at org.codehaus.jettison.json.JSONObject.(JSONObject.java:185)
>   at org.codehaus.jettison.json.JSONObject.(JSONObject.java:293)
>   at 
> org.apache.hadoop.hive.ql.io.orc.JsonFileDump.printJsonMetaData(JsonFileDump.java:197)
>   at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:107)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> hive --orcfiledump -j -p /tmp/orc/inventory/inv_date_sk=2452654



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12673) Orcfiledump throws NPE when no files are available

2015-12-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056982#comment-15056982
 ] 

Prasanth Jayachandran commented on HIVE-12673:
--

The problem is JSONObject is in complete state. I think we should fix that. 

1) If the files list is empty or null return at the beginning of the function
2) line:208. Add else condition and do writer.endObject() to finish the object 
as 'done'


> Orcfiledump throws NPE when no files are available
> --
>
> Key: HIVE-12673
> URL: https://issues.apache.org/jira/browse/HIVE-12673
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-12673.1.patch
>
>
> {noformat}
> Exception in thread "main" java.lang.NullPointerException
>   at org.codehaus.jettison.json.JSONTokener.more(JSONTokener.java:106)
>   at org.codehaus.jettison.json.JSONTokener.next(JSONTokener.java:116)
>   at 
> org.codehaus.jettison.json.JSONTokener.nextClean(JSONTokener.java:170)
>   at org.codehaus.jettison.json.JSONObject.(JSONObject.java:185)
>   at org.codehaus.jettison.json.JSONObject.(JSONObject.java:293)
>   at 
> org.apache.hadoop.hive.ql.io.orc.JsonFileDump.printJsonMetaData(JsonFileDump.java:197)
>   at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:107)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> hive --orcfiledump -j -p /tmp/orc/inventory/inv_date_sk=2452654



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12666) PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes dynamic partition pruner generated synthetic join predicates.

2015-12-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12666:
-
Attachment: (was: HIVE-12666.1.patch)

> PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes 
> dynamic partition pruner generated synthetic join predicates.
> 
>
> Key: HIVE-12666
> URL: https://issues.apache.org/jira/browse/HIVE-12666
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
>Priority: Blocker
> Attachments: HIVE-12666.1.patch
>
>
> Introduced by HIVE-11634. The original idea in HIVE-11634 was to remove the 
> IN partition conditions from the predicate list since the static dynamic 
> partitioning would kick in and push these predicates down to metastore. 
> However, the check is too aggressive and removes events such as below :
> {code}
> -Select Operator
> -  expressions: UDFToDouble(UDFToInteger((hr / 2))) 
> (type: double)
> -  outputColumnNames: _col0
> -  Statistics: Num rows: 1 Data size: 7 Basic stats: 
> COMPLETE Column stats: NONE
> -  Group By Operator
> -keys: _col0 (type: double)
> -mode: hash
> -outputColumnNames: _col0
> -Statistics: Num rows: 1 Data size: 7 Basic stats: 
> COMPLETE Column stats: NONE
> -Dynamic Partitioning Event Operator
> -  Target Input: srcpart
> -  Partition key expr: UDFToDouble(hr)
> -  Statistics: Num rows: 1 Data size: 7 Basic stats: 
> COMPLETE Column stats: NONE
> -  Target column: hr
> -  Target Vertex: Map 1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12666) PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes dynamic partition pruner generated synthetic join predicates.

2015-12-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12666:
-
Attachment: HIVE-12666.1.patch

> PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes 
> dynamic partition pruner generated synthetic join predicates.
> 
>
> Key: HIVE-12666
> URL: https://issues.apache.org/jira/browse/HIVE-12666
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
>Priority: Blocker
> Attachments: HIVE-12666.1.patch
>
>
> Introduced by HIVE-11634. The original idea in HIVE-11634 was to remove the 
> IN partition conditions from the predicate list since the static dynamic 
> partitioning would kick in and push these predicates down to metastore. 
> However, the check is too aggressive and removes events such as below :
> {code}
> -Select Operator
> -  expressions: UDFToDouble(UDFToInteger((hr / 2))) 
> (type: double)
> -  outputColumnNames: _col0
> -  Statistics: Num rows: 1 Data size: 7 Basic stats: 
> COMPLETE Column stats: NONE
> -  Group By Operator
> -keys: _col0 (type: double)
> -mode: hash
> -outputColumnNames: _col0
> -Statistics: Num rows: 1 Data size: 7 Basic stats: 
> COMPLETE Column stats: NONE
> -Dynamic Partitioning Event Operator
> -  Target Input: srcpart
> -  Partition key expr: UDFToDouble(hr)
> -  Statistics: Num rows: 1 Data size: 7 Basic stats: 
> COMPLETE Column stats: NONE
> -  Target column: hr
> -  Target Vertex: Map 1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12666) PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes dynamic partition pruner generated synthetic join predicates.

2015-12-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057061#comment-15057061
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-12666:
--

This includes:
1. Revert HIVE-12462 (change in 
2. Correct fix made in PcrExprProcFactory.
3. Merged diff files.

cc-ing [~hagleitn], [~sershe], [~gopalv]

[~jpullokkaran] Can you please review this patch?

Thanks
Hari

> PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes 
> dynamic partition pruner generated synthetic join predicates.
> 
>
> Key: HIVE-12666
> URL: https://issues.apache.org/jira/browse/HIVE-12666
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
>Priority: Blocker
> Attachments: HIVE-12666.1.patch
>
>
> Introduced by HIVE-11634. The original idea in HIVE-11634 was to remove the 
> IN partition conditions from the predicate list since the static dynamic 
> partitioning would kick in and push these predicates down to metastore. 
> However, the check is too aggressive and removes events such as below :
> {code}
> -Select Operator
> -  expressions: UDFToDouble(UDFToInteger((hr / 2))) 
> (type: double)
> -  outputColumnNames: _col0
> -  Statistics: Num rows: 1 Data size: 7 Basic stats: 
> COMPLETE Column stats: NONE
> -  Group By Operator
> -keys: _col0 (type: double)
> -mode: hash
> -outputColumnNames: _col0
> -Statistics: Num rows: 1 Data size: 7 Basic stats: 
> COMPLETE Column stats: NONE
> -Dynamic Partitioning Event Operator
> -  Target Input: srcpart
> -  Partition key expr: UDFToDouble(hr)
> -  Statistics: Num rows: 1 Data size: 7 Basic stats: 
> COMPLETE Column stats: NONE
> -  Target column: hr
> -  Target Vertex: Map 1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL

2015-12-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057120#comment-15057120
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-12462:
--

I've uploaded a patch with the potential fix for this issue in HIVE-12666; it 
will be great if someone can review it. I am also reverting this patch as part 
of the fix.

Thanks
Hari

> DPP: DPP optimizers need to run on the TS predicate not FIL 
> 
>
> Key: HIVE-12462
> URL: https://issues.apache.org/jira/browse/HIVE-12462
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch
>
>
> HIVE-11398 + HIVE-11791, the partition-condition-remover became more 
> effective.
> This removes predicates from the FilterExpression which involve partition 
> columns, causing a miss for dynamic-partition pruning if the DPP relies on 
> FilterDesc.
> The TS desc will have the correct predicate in that condition.
> {code}
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> Filter Operator (FIL_20)
>   predicate: ((account_id = 22) and year(dt) is not null) (type: boolean)
>   Select Operator (SEL_4)
> expressions: dt (type: date)
> outputColumnNames: _col1
> Reduce Output Operator (RS_8)
>   key expressions: year(_col1) (type: int)
>   sort order: +
>   Map-reduce partition columns: year(_col1) (type: int)
>   Join Operator (JOIN_9)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 year(_col1) (type: int)
>   1 year(_col1) (type: int)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12658) Task rejection by an llap daemon spams the log with RejectedExecutionExceptions

2015-12-14 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12658:
-
Attachment: HIVE-12658.1.patch

> Task rejection by an llap daemon spams the log with 
> RejectedExecutionExceptions
> ---
>
> Key: HIVE-12658
> URL: https://issues.apache.org/jira/browse/HIVE-12658
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12658.1.patch
>
>
> The execution queue throws a RejectedExecutionException - which is logged by 
> the hadoop IPC layer.
> Instead of relying on an Exception in the protocol - move to sending back an 
> explicit response to indicate a rejected fragment.
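For illustration, a hedged sketch of the approach described above; the types and names are illustrative, not the actual LLAP daemon API. The idea is to convert the queue's RejectedExecutionException into an explicit response code instead of letting it propagate through the IPC layer:
{code}
// Illustrative sketch only; submission API and response type are placeholders.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;

class SubmissionSketch {
  enum SubmissionState { ACCEPTED, REJECTED }

  SubmissionState submitFragment(ExecutorService executor, Runnable fragment) {
    try {
      executor.execute(fragment);
      return SubmissionState.ACCEPTED;
    } catch (RejectedExecutionException e) {
      // Report rejection in the response instead of letting the exception
      // surface through the IPC layer and spam the logs.
      return SubmissionState.REJECTED;
    }
  }
}
{code}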



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12675) PerfLogger should log performance metrics at debug level

2015-12-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12675:
-
Description: As more and more subcomponents of Hive (Tez, Optimizer) etc 
are using PerfLogger to track the performance metrics, it will be more 
meaningful to set the PerfLogger logging level to DEBUG. Otherwise, we will 
print the performance metrics unnecessarily for each and every query if the 
underlying subcomponent does not control the PerfLogging via a parameter on its 
own.  (was: As more and more subcomponents are Hive (Tez, Optimizer) etc are 
using PerfLogger to track the performance metrics, it will be more meaningful 
to set the PerfLogger logging level to DEBUG. Otherwise, we will print the 
performance metrics unnecessarily for each and every query if the underlying 
subcomponent does not control the PerfLogging via a parameter.)

> PerfLogger should log performance metrics at debug level
> 
>
> Key: HIVE-12675
> URL: https://issues.apache.org/jira/browse/HIVE-12675
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
>
> As more and more subcomponents of Hive (Tez, Optimizer) etc are using 
> PerfLogger to track the performance metrics, it will be more meaningful to 
> set the PerfLogger logging level to DEBUG. Otherwise, we will print the 
> performance metrics unnecessarily for each and every query if the underlying 
> subcomponent does not control the PerfLogging via a parameter on its own.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9907) insert into table values() when UTF-8 character is not correct

2015-12-14 Thread niklaus xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057197#comment-15057197
 ] 

niklaus xiao commented on HIVE-9907:


The attached patch solves this issue, please check.

> insert into table values()   when UTF-8 character is not correct
> 
>
> Key: HIVE-9907
> URL: https://issues.apache.org/jira/browse/HIVE-9907
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Clients, JDBC
>Affects Versions: 0.14.0, 0.13.1, 1.0.0
> Environment: centos 6   LANG=zh_CN.UTF-8
> hadoop 2.6
> hive 1.1.0
>Reporter: Fanhong Li
>Priority: Critical
> Attachments: HIVE-9907.1.patch
>
>
> insert into table test_acid partition(pt='pt_2')
> values( 2, '中文_2' , 'city_2' )
> ;
> hive> select *
> > from test_acid 
> > ;
> OK
> 2 -�_2city_2  pt_2
> Time taken: 0.237 seconds, Fetched: 1 row(s)
> hive> 
> CREATE TABLE test_acid(id INT, 
> name STRING, 
> city STRING) 
> PARTITIONED BY (pt STRING)
> clustered by (id) into 1 buckets
> stored as ORCFILE
> TBLPROPERTIES('transactional'='true')
> ;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9907) insert into table values() when UTF-8 character is not correct

2015-12-14 Thread niklaus xiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niklaus xiao updated HIVE-9907:
---
Attachment: HIVE-9907.1.patch

> insert into table values()   when UTF-8 character is not correct
> 
>
> Key: HIVE-9907
> URL: https://issues.apache.org/jira/browse/HIVE-9907
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Clients, JDBC
>Affects Versions: 0.14.0, 0.13.1, 1.0.0
> Environment: centos 6   LANG=zh_CN.UTF-8
> hadoop 2.6
> hive 1.1.0
>Reporter: Fanhong Li
>Priority: Critical
> Attachments: HIVE-9907.1.patch
>
>
> insert into table test_acid partition(pt='pt_2')
> values( 2, '中文_2' , 'city_2' )
> ;
> hive> select *
> > from test_acid 
> > ;
> OK
> 2 -�_2city_2  pt_2
> Time taken: 0.237 seconds, Fetched: 1 row(s)
> hive> 
> CREATE TABLE test_acid(id INT, 
> name STRING, 
> city STRING) 
> PARTITIONED BY (pt STRING)
> clustered by (id) into 1 buckets
> stored as ORCFILE
> TBLPROPERTIES('transactional'='true')
> ;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12616) NullPointerException when spark session is reused to run a mapjoin

2015-12-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057239#comment-15057239
 ] 

Xuefu Zhang commented on HIVE-12616:


+1

[~nemon], could you create a follow-up JIRA that covers the problems that didn't 
get addressed by the patch here, based on the discussion in this JIRA? Thanks.

> NullPointerException when spark session is reused to run a mapjoin
> --
>
> Key: HIVE-12616
> URL: https://issues.apache.org/jira/browse/HIVE-12616
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.3.0
>Reporter: Nemon Lou
>Assignee: Nemon Lou
> Attachments: HIVE-12616.1.patch, HIVE-12616.2.patch, HIVE-12616.patch
>
>
> The way to reproduce:
> {noformat}
> set hive.execution.engine=spark;
> create table if not exists test(id int);
> create table if not exists test1(id int);
> insert into test values(1);
> insert into test1 values(1);
> select max(a.id) from test a ,test1 b
> where a.id = b.id;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12658) Task rejection by an llap daemon spams the log with RejectedExecutionExceptions

2015-12-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057255#comment-15057255
 ] 

Prasanth Jayachandran commented on HIVE-12658:
--

[~sseth] Could you take a look please?

> Task rejection by an llap daemon spams the log with 
> RejectedExecutionExceptions
> ---
>
> Key: HIVE-12658
> URL: https://issues.apache.org/jira/browse/HIVE-12658
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-12658.1.patch
>
>
> The execution queue throws a RejectedExecutionException - which is logged by 
> the hadoop IPC layer.
> Instead of relying on an Exception in the protocol - move to sending back an 
> explicit response to indicate a rejected fragment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12673) Orcfiledump throws NPE when no files are available

2015-12-14 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-12673:
-
Assignee: Rajesh Balamohan  (was: Prasanth Jayachandran)

> Orcfiledump throws NPE when no files are available
> --
>
> Key: HIVE-12673
> URL: https://issues.apache.org/jira/browse/HIVE-12673
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-12673.1.patch
>
>
> {noformat}
> Exception in thread "main" java.lang.NullPointerException
>   at org.codehaus.jettison.json.JSONTokener.more(JSONTokener.java:106)
>   at org.codehaus.jettison.json.JSONTokener.next(JSONTokener.java:116)
>   at 
> org.codehaus.jettison.json.JSONTokener.nextClean(JSONTokener.java:170)
>   at org.codehaus.jettison.json.JSONObject.(JSONObject.java:185)
>   at org.codehaus.jettison.json.JSONObject.(JSONObject.java:293)
>   at 
> org.apache.hadoop.hive.ql.io.orc.JsonFileDump.printJsonMetaData(JsonFileDump.java:197)
>   at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:107)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> hive --orcfiledump -j -p /tmp/orc/inventory/inv_date_sk=2452654



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12664) Bug in reduce deduplication optimization causing ArrayOutOfBoundException

2015-12-14 Thread Johan Gustavsson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Gustavsson updated HIVE-12664:

Attachment: HIVE-12664-1.patch

The original patch wasn't against trunk... sorry about that.


> Bug in reduce deduplication optimization causing ArrayOutOfBoundException
> -
>
> Key: HIVE-12664
> URL: https://issues.apache.org/jira/browse/HIVE-12664
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.1, 1.2.1
>Reporter: Johan Gustavsson
>Assignee: Johan Gustavsson
> Attachments: HIVE-12664-1.patch, HIVE-12664.patch
>
>
> The optimisation check for reduce deduplication only checks the first child 
> node for join -and the check itself also contains a major bug- causing 
> ArrayOutOfBoundException no matter what.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10982) Customizable the value of java.sql.statement.setFetchSize in Hive JDBC Driver

2015-12-14 Thread Bing Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057168#comment-15057168
 ] 

Bing Li commented on HIVE-10982:


[~alangates] and [~thejas], thank you for your review!

> Customizable the value of  java.sql.statement.setFetchSize in Hive JDBC Driver
> --
>
> Key: HIVE-10982
> URL: https://issues.apache.org/jira/browse/HIVE-10982
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Bing Li
>Assignee: Bing Li
>Priority: Critical
> Fix For: 2.1.0
>
> Attachments: HIVE-10982.1.patch, HIVE-10982.2.patch, 
> HIVE-10982.3.patch
>
>
> The current JDBC driver for Hive hard-codes the value of setFetchSize to 50, 
> which can be a performance bottleneck.
> Pentaho filed this issue as  http://jira.pentaho.com/browse/PDI-11511, whose 
> status is open.
> Also it has discussion in 
> http://forums.pentaho.com/showthread.php?158381-Hive-JDBC-Query-too-slow-too-many-fetches-after-query-execution-Kettle-Xform
> http://mail-archives.apache.org/mod_mbox/hive-user/201307.mbox/%3ccacq46vevgrfqg5rwxnr1psgyz7dcf07mvlo8mm2qit3anm1...@mail.gmail.com%3E
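For illustration, a minimal JDBC sketch showing where a configurable fetch size is applied on the client side; the connection URL, table, and fetch size below are placeholders:
{code}
// Placeholder URL/table/size; assumes the Hive JDBC driver is on the classpath.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FetchSizeSketch {
  public static void main(String[] args) throws Exception {
    try (Connection conn =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement stmt = conn.createStatement()) {
      stmt.setFetchSize(1000);  // request larger batches instead of the old hard-coded 50
      try (ResultSet rs = stmt.executeQuery("select * from src")) {
        while (rs.next()) {
          // process each row
        }
      }
    }
  }
}
{code}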



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12570) Incorrect error message Expression not in GROUP BY key thrown instead of Invalid function

2015-12-14 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057001#comment-15057001
 ] 

Laljo John Pullokkaran commented on HIVE-12570:
---

+1

> Incorrect error message Expression not in GROUP BY key thrown instead of 
> Invalid function
> -
>
> Key: HIVE-12570
> URL: https://issues.apache.org/jira/browse/HIVE-12570
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12570.1.patch, HIVE-12570.2.patch, 
> HIVE-12570.3.patch, HIVE-12570.4.patch, HIVE-12570.5.patch
>
>
> {code}
> explain create table avg_salary_by_supervisor3 as select average(key) as 
> key_avg from src group by value;
> {code}
> We get the following stack trace :
> {code}
> FAILED: SemanticException [Error 10025]: Line 1:57 Expression not in GROUP BY 
> key 'key'
> ERROR ql.Driver: FAILED: SemanticException [Error 10025]: Line 1:57 
> Expression not in GROUP BY key 'key'
> org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:57 Expression not 
> in GROUP BY key 'key'
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10484)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10432)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3824)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3603)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8862)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8817)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9668)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9561)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10053)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:345)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10064)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:222)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:462)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1227)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1276)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1152)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1140)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:778)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:717)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {code}
> Instead of the above error message, it be more appropriate to throw the below 
> error :
> ERROR ql.Driver: FAILED: SemanticException [Error 10011]: Line 1:58 Invalid 
> function 'average'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-11687) TaskExecutorService can reject work even if capacity is available

2015-12-14 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-11687:


Assignee: Prasanth Jayachandran

> TaskExecutorService can reject work even if capacity is available
> -
>
> Key: HIVE-11687
> URL: https://issues.apache.org/jira/browse/HIVE-11687
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Affects Versions: llap
>Reporter: Siddharth Seth
>Assignee: Prasanth Jayachandran
> Fix For: llap
>
>
> The waitQueue has a fixed capacity, which is the wait queue size. Addition 
> of new work does not factor in the capacity available to execute work. This 
> ends up being left to the race between work getting scheduled for execution 
> and work being added to the waitQueue.
> cc [~prasanth_j]
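For illustration only, a hedged sketch (not the actual TaskExecutorService code) of an admission check that considers free executor capacity as well as wait-queue occupancy:
{code}
// Hypothetical admission check; the real scheduler state and locking are not shown.
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;

class AdmissionSketch {
  /** Accept new work if either an executor slot or a wait-queue slot is free. */
  boolean canAccept(ThreadPoolExecutor executor, BlockingQueue<Runnable> waitQueue) {
    boolean executorSlotFree = executor.getActiveCount() < executor.getMaximumPoolSize();
    boolean waitSlotFree = waitQueue.remainingCapacity() > 0;
    return executorSlotFree || waitSlotFree;
  }
}
{code}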



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12673) Orcfiledump throws NPE when no files are available

2015-12-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057069#comment-15057069
 ] 

Prasanth Jayachandran commented on HIVE-12673:
--

+1

> Orcfiledump throws NPE when no files are available
> --
>
> Key: HIVE-12673
> URL: https://issues.apache.org/jira/browse/HIVE-12673
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
> Attachments: HIVE-12673.1.patch, HIVE-12673.2.patch
>
>
> {noformat}
> Exception in thread "main" java.lang.NullPointerException
>   at org.codehaus.jettison.json.JSONTokener.more(JSONTokener.java:106)
>   at org.codehaus.jettison.json.JSONTokener.next(JSONTokener.java:116)
>   at 
> org.codehaus.jettison.json.JSONTokener.nextClean(JSONTokener.java:170)
>   at org.codehaus.jettison.json.JSONObject.(JSONObject.java:185)
>   at org.codehaus.jettison.json.JSONObject.(JSONObject.java:293)
>   at 
> org.apache.hadoop.hive.ql.io.orc.JsonFileDump.printJsonMetaData(JsonFileDump.java:197)
>   at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:107)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {noformat}
> hive --orcfiledump -j -p /tmp/orc/inventory/inv_date_sk=2452654



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10632) Make sure TXN_COMPONENTS gets cleaned up if table is dropped before compaction.

2015-12-14 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057188#comment-15057188
 ] 

Eugene Koifman commented on HIVE-10632:
---

The right way to do this is to add a MetaStoreListener to clean up ACID-related 
metastore tables on dropTable/dropPartition.
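For illustration, a hedged sketch of such a listener, assuming the standard MetaStoreEventListener hooks; the cleanup bodies are placeholders since the exact ACID cleanup is not spelled out here:
{code}
// Sketch only: the drop hooks are listener methods, the cleanup is a placeholder.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.metastore.MetaStoreEventListener;
import org.apache.hadoop.hive.metastore.api.MetaException;
import org.apache.hadoop.hive.metastore.events.DropPartitionEvent;
import org.apache.hadoop.hive.metastore.events.DropTableEvent;

public class AcidCleanupListenerSketch extends MetaStoreEventListener {
  public AcidCleanupListenerSketch(Configuration conf) {
    super(conf);
  }

  @Override
  public void onDropTable(DropTableEvent tableEvent) throws MetaException {
    // Placeholder: remove TXN_COMPONENTS / COMPLETED_TXN_COMPONENTS entries
    // that reference the dropped table.
  }

  @Override
  public void onDropPartition(DropPartitionEvent partitionEvent) throws MetaException {
    // Placeholder: the same cleanup scoped to the dropped partition.
  }
}
{code}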

> Make sure TXN_COMPONENTS gets cleaned up if table is dropped before 
> compaction.
> ---
>
> Key: HIVE-10632
> URL: https://issues.apache.org/jira/browse/HIVE-10632
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore, Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
>
> The compaction process will clean up entries in  TXNS, 
> COMPLETED_TXN_COMPONENTS, TXN_COMPONENTS.  If the table/partition is dropped 
> before compaction is complete there will be data left in these tables.  Need 
> to investigate if there are other situations where this may happen and 
> address it.
> see HIVE-10595 for additional info



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12577) NPE in LlapTaskCommunicator when unregistering containers

2015-12-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057199#comment-15057199
 ] 

Sergey Shelukhin commented on HIVE-12577:
-

Tracking time using the currentMilliseconds call is fraught with peril; the 
machine clock can move and cause weird behavior.

Nits: the caller of BiMap 
getContainerAttemptMapForNode(String hostname, int port) creates a nodeId but 
passes the name and port to the method, which also creates an id; getContext() 
is called twice; there appear to be some indentation issues.
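A small illustration of the first point: System.nanoTime() is monotonic and unaffected by wall-clock adjustments, so it is the safer choice for measuring durations (illustrative snippet, not the patch code):
{code}
// Measure elapsed time with a monotonic clock instead of System.currentTimeMillis().
class ElapsedTimeSketch {
  static long runTimedMillis(Runnable work) {
    long start = System.nanoTime();
    work.run();
    return (System.nanoTime() - start) / 1_000_000L;  // nanoseconds to milliseconds
  }
}
{code}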



> NPE in LlapTaskCommunicator when unregistering containers
> -
>
> Key: HIVE-12577
> URL: https://issues.apache.org/jira/browse/HIVE-12577
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.0.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Critical
> Attachments: HIVE-12577.1.review.txt, HIVE-12577.1.txt, 
> HIVE-12577.1.wip.txt
>
>
> {code}
> 2015-12-02 13:29:00,160 [ERROR] [Dispatcher thread {Central}] 
> |common.AsyncDispatcher|: Error in dispatcher thread
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator$EntityTracker.unregisterContainer(LlapTaskCommunicator.java:586)
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator.registerContainerEnd(LlapTaskCommunicator.java:188)
> at 
> org.apache.tez.dag.app.TaskCommunicatorManager.unregisterRunningContainer(TaskCommunicatorManager.java:389)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.unregisterFromTAListener(AMContainerImpl.java:1121)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtLaunchingTransition.transition(AMContainerImpl.java:699)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtIdleTransition.transition(AMContainerImpl.java:805)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:892)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:887)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:415)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:72)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:60)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:36)
> at 
> org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
> at 
> org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
> at java.lang.Thread.run(Thread.java:745)
> 2015-12-02 13:29:00,167 [ERROR] [Dispatcher thread {Central}] 
> |common.AsyncDispatcher|: Error in dispatcher thread
> java.lang.NullPointerException
> at 
> org.apache.tez.dag.app.TaskCommunicatorManager.unregisterRunningContainer(TaskCommunicatorManager.java:386)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.unregisterFromTAListener(AMContainerImpl.java:1121)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtLaunchingTransition.transition(AMContainerImpl.java:699)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtIdleTransition.transition(AMContainerImpl.java:805)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:892)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:887)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> 

[jira] [Updated] (HIVE-12666) PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes dynamic partition pruner generated synthetic join predicates.

2015-12-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12666:
-
Attachment: HIVE-12666.1.patch

> PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes 
> dynamic partition pruner generated synthetic join predicates.
> 
>
> Key: HIVE-12666
> URL: https://issues.apache.org/jira/browse/HIVE-12666
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
>Priority: Blocker
> Attachments: HIVE-12666.1.patch
>
>
> Introduced by HIVE-11634. The original idea in HIVE-11634 was to remove the 
> IN partition conditions from the predicate list since the static dynamic 
> partitioning would kick in and push these predicates down to metastore. 
> However, the check is too aggressive and removes events such as below :
> {code}
> -Select Operator
> -  expressions: UDFToDouble(UDFToInteger((hr / 2))) 
> (type: double)
> -  outputColumnNames: _col0
> -  Statistics: Num rows: 1 Data size: 7 Basic stats: 
> COMPLETE Column stats: NONE
> -  Group By Operator
> -keys: _col0 (type: double)
> -mode: hash
> -outputColumnNames: _col0
> -Statistics: Num rows: 1 Data size: 7 Basic stats: 
> COMPLETE Column stats: NONE
> -Dynamic Partitioning Event Operator
> -  Target Input: srcpart
> -  Partition key expr: UDFToDouble(hr)
> -  Statistics: Num rows: 1 Data size: 7 Basic stats: 
> COMPLETE Column stats: NONE
> -  Target column: hr
> -  Target Vertex: Map 1
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-11736) while creating this hcatalog table then getting this error

2015-12-14 Thread niklaus xiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niklaus xiao resolved HIVE-11736.
-
Resolution: Not A Problem

> while creating this hcatalog  table then getting this error 
> 
>
> Key: HIVE-11736
> URL: https://issues.apache.org/jira/browse/HIVE-11736
> Project: Hive
>  Issue Type: Bug
>Reporter: Sadeek Mohammad
>Priority: Blocker
>
> HCatClient error on create table: {"statement":"use default; create table 
> batting_data(`playerid` string, `yearid` int, `stint` bigint, `teamid` 
> string, `lgid` string, `g` bigint, `g_batting` bigint, `ab` bigint, `r` 
> bigint, `h` bigint, `2b` bigint, `3b` bigint, `hr` bigint, `rbi` bigint, `sb` 
> bigint, `cs` bigint, `bb` bigint, `so` bigint, `ibb` bigint, `hbp` bigint, 
> `sh` bigint, `sf` bigint, `gidp` bigint, `g_old` bigint) row format delimited 
> fields terminated by ',';","error":"unable to create table: 
> batting_data","exec":{"stdout":"","stderr":"which: no 
> /usr/hdp/2.2.4.2-2//hadoop/bin/hadoop.distro in ((null))\ndirname: missing 
> operand\nTry `dirname --help' for more information.\nSLF4J: Class path 
> contains multiple SLF4J bindings.\nSLF4J: Found binding in 
> [jar:file:/usr/hdp/2.2.4.2-2/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]\nSLF4J:
>  Found binding in 
> [jar:file:/usr/hdp/2.2.4.2-2/hive/lib/hive-jdbc-0.14.0.2.2.4.2-2-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]\nSLF4J:
>  See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.\nSLF4J: Actual binding is of type 
> [org.slf4j.impl.Log4jLoggerFactory]\n Command  was terminated due to 
> timeout(6ms).  See templeton.exec.timeout property","exitcode":143}} 
> (error 500)
> any help is appreciated 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12366) Refactor Heartbeater logic for transaction

2015-12-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057154#comment-15057154
 ] 

Hive QA commented on HIVE-12366:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777551/HIVE-12366.8.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6352/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6352/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6352/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-6352/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
From https://github.com/apache/hive
   f14b3c6..e2c8bfa  branch-1   -> origin/branch-1
   866b236..d8ee05a  branch-2.0 -> origin/branch-2.0
   23f78cc..c5b2c0e  master -> origin/master
+ git reset --hard HEAD
HEAD is now at 23f78cc HIVE-12435 SELECT COUNT(CASE WHEN...) GROUPBY returns 1 
for 'NULL' in a case of ORC and vectorization is enabled. (Matt McCline, 
reviewed by Prasanth J)
+ git clean -f -d
Removing ql/src/test/queries/clientnegative/invalid_select_fn.q
Removing ql/src/test/results/clientnegative/invalid_select_fn.q.out
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 3 commits, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at c5b2c0e HIVE-12526 : PerfLogger for hive compiler and optimizer 
(Hari Subramaniyan, reviewed by Jesus Camacho Rodriguez)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777551 - PreCommit-HIVE-TRUNK-Build

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, 
> HIVE-12366.3.patch, HIVE-12366.4.patch, HIVE-12366.5.patch, 
> HIVE-12366.6.patch, HIVE-12366.7.patch, HIVE-12366.8.patch
>
>
> Currently there is a gap between the time of lock acquisition and the first 
> heartbeat being sent out. Normally the gap is negligible, but when it is big 
> it will cause the query to fail, since the locks have timed out by the time 
> the heartbeat is sent.
> We need to remove this gap.
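For illustration, a hedged sketch (not the actual Heartbeater code) of closing the gap by scheduling the first heartbeat with zero initial delay right after the locks are acquired:
{code}
// Hypothetical sketch; lock acquisition and the heartbeat RPC are placeholders.
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class HeartbeatSketch {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  void startHeartbeat(Runnable sendHeartbeat, long intervalMs) {
    // initialDelay = 0: the first heartbeat goes out immediately rather than
    // one full interval after the locks were acquired.
    scheduler.scheduleAtFixedRate(sendHeartbeat, 0, intervalMs, TimeUnit.MILLISECONDS);
  }
}
{code}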



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12675) PerfLogger should log performance metrics at debug level

2015-12-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057213#comment-15057213
 ] 

Sergey Shelukhin commented on HIVE-12675:
-

I don't know if this is a good idea. At DEBUG level, the amount of other 
logging will increase dramatically, which affects perf a lot with log4j. Many 
people run clusters at WARN level because even INFO may be too much for perf.
Without crafting a special logging configuration, it would only be possible to 
see PerfLogger output when it's obscured by the perf loss from DEBUG logging.
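One way to get the PerfLogger output without paying for global DEBUG, sketched below under two assumptions (a log4j 1.x backend, and the usual PerfLogger class name as the logger category), is to raise only that one category to DEBUG:
{code}
// Assumes a log4j 1.x backend; the PerfLogger category name is an assumption.
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

class PerfLoggerLevelSketch {
  static void enablePerfLoggerOnly() {
    Logger.getRootLogger().setLevel(Level.WARN);  // keep everything else quiet
    Logger.getLogger("org.apache.hadoop.hive.ql.log.PerfLogger").setLevel(Level.DEBUG);
  }
}
{code}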

> PerfLogger should log performance metrics at debug level
> 
>
> Key: HIVE-12675
> URL: https://issues.apache.org/jira/browse/HIVE-12675
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12675.1.patch
>
>
> As more and more subcomponents of Hive (Tez, Optimizer) etc are using 
> PerfLogger to track the performance metrics, it will be more meaningful to 
> set the PerfLogger logging level to DEBUG. Otherwise, we will print the 
> performance metrics unnecessarily for each and every query if the underlying 
> subcomponent does not control the PerfLogging via a parameter on its own.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12633) LLAP: package included serde jars

2015-12-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057056#comment-15057056
 ] 

Sergey Shelukhin commented on HIVE-12633:
-

[~vikram.dixit] can you take a look?

> LLAP: package included serde jars
> -
>
> Key: HIVE-12633
> URL: https://issues.apache.org/jira/browse/HIVE-12633
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12633.01.patch, HIVE-12633.02.patch, 
> HIVE-12633.patch
>
>
> Some SerDes like JSONSerde are not packaged with LLAP. One cannot localize 
> jars on the daemon (due to security consideration if nothing else), so we 
> should package them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12447) Fix LlapTaskReporter post TEZ-808 changes

2015-12-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057115#comment-15057115
 ] 

Sergey Shelukhin commented on HIVE-12447:
-

+1

> Fix LlapTaskReporter post TEZ-808 changes
> -
>
> Key: HIVE-12447
> URL: https://issues.apache.org/jira/browse/HIVE-12447
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Affects Versions: 2.0.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-12447.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12570) Incorrect error message Expression not in GROUP BY key thrown instead of Invalid function

2015-12-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057151#comment-15057151
 ] 

Hive QA commented on HIVE-12570:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777536/HIVE-12570.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 9870 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vectorized_parquet.q-orc_merge6.q-vector_outer_join0.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6351/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6351/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6351/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777536 - PreCommit-HIVE-TRUNK-Build

> Incorrect error message Expression not in GROUP BY key thrown instead of 
> Invalid function
> -
>
> Key: HIVE-12570
> URL: https://issues.apache.org/jira/browse/HIVE-12570
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Fix For: 2.0.0
>
> Attachments: HIVE-12570.1.patch, HIVE-12570.2.patch, 
> HIVE-12570.3.patch, HIVE-12570.4.patch, HIVE-12570.5.patch
>
>
> {code}
> explain create table avg_salary_by_supervisor3 as select average(key) as 
> key_avg from src group by value;
> {code}
> We get the following stack trace :
> {code}
> FAILED: SemanticException [Error 10025]: Line 1:57 Expression not in GROUP BY 
> key 'key'
> ERROR ql.Driver: FAILED: SemanticException [Error 10025]: Line 1:57 
> Expression not in GROUP BY key 'key'
> org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:57 Expression not 
> in GROUP BY key 'key'
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10484)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10432)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3824)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3603)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8862)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8817)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9668)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9561)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10053)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:345)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10064)
>   at 
> 

[jira] [Updated] (HIVE-12664) Bug in reduce deduplication optimization causing ArrayOutOfBoundException

2015-12-14 Thread Johan Gustavsson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Gustavsson updated HIVE-12664:

Description: The optimisation check for reduce deduplication only checks 
the first child node for join -and the check itself also contains a major bug- 
causing ArrayOutOfBoundException no matter what.  (was: The optimisation check 
for reduce deduplication only checks the first child node for join and the 
check itself also contains a major bug causing ArrayOutOfBoundException no 
matter what.)

> Bug in reduce deduplication optimization causing ArrayOutOfBoundException
> -
>
> Key: HIVE-12664
> URL: https://issues.apache.org/jira/browse/HIVE-12664
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.1, 1.2.1
>Reporter: Johan Gustavsson
>Assignee: Johan Gustavsson
> Attachments: HIVE-12664.patch
>
>
> The optimisation check for reduce deduplication only checks the first child 
> node for join -and the check itself also contains a major bug- causing 
> ArrayOutOfBoundException no matter what.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS

2015-12-14 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057153#comment-15057153
 ] 

Ashutosh Chauhan commented on HIVE-11107:
-

* We only need the rawDS and numRows fields. All the extra fields aren't needed.
* I don't see much value in TestPerfCliDriver.vm. We can achieve its effect from 
TestCliDriver, either by passing a mode parameter or by creating a mapping in 
pom.xml.

+1 for the existing patch. We should take up these improvements in a follow-up.

> Support for Performance regression test suite with TPCDS
> 
>
> Key: HIVE-11107
> URL: https://issues.apache.org/jira/browse/HIVE-11107
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-11107.1.patch, HIVE-11107.2.patch, 
> HIVE-11107.3.patch, HIVE-11107.4.patch, HIVE-11107.5.patch, 
> HIVE-11107.6.patch, HIVE-11107.7.patch
>
>
> Support to add TPCDS queries to the performance regression test suite with 
> Hive CBO turned on.
> This benchmark is intended to make sure that subsequent changes to the 
> optimizer or any Hive code do not yield unexpected plan changes; i.e., the 
> intention is not to run the entire TPCDS query set, but just "explain 
> plan" for the TPCDS queries.
> As part of this jira, we will manually verify that the expected Hive 
> optimizations kick in for the queries (for the given stats/dataset). If there 
> is a difference in plan within this test suite due to a future commit, it 
> needs to be analyzed, and we need to make sure that it is not a regression.
> The test suite can be run in master branch from itests by 
> {code}
> mvn test -Dtest=TestPerfCliDriver 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12526) PerfLogger for hive compiler and optimizer

2015-12-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057166#comment-15057166
 ] 

Sergey Shelukhin commented on HIVE-12526:
-

Master is 2.1.0; to have a 2.0.0 fix version, it needs to be committed to 
branch-2.0

> PerfLogger for hive compiler and optimizer
> --
>
> Key: HIVE-12526
> URL: https://issues.apache.org/jira/browse/HIVE-12526
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Fix For: 2.1.0
>
> Attachments: HIVE-12526.1.patch, HIVE-12526.2.patch, 
> HIVE-12526.3.patch, HIVE-12526.4.patch
>
>
> This jira is intended to use the perflogger to track compilation times and 
> optimization times (calcite, tez compiler, physical compiler) etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12526) PerfLogger for hive compiler and optimizer

2015-12-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12526:

Fix Version/s: (was: 2.0.0)
   2.1.0

> PerfLogger for hive compiler and optimizer
> --
>
> Key: HIVE-12526
> URL: https://issues.apache.org/jira/browse/HIVE-12526
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Fix For: 2.1.0
>
> Attachments: HIVE-12526.1.patch, HIVE-12526.2.patch, 
> HIVE-12526.3.patch, HIVE-12526.4.patch
>
>
> This jira is intended to use the perflogger to track compilation times and 
> optimization times (calcite, tez compiler, physical compiler) etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12570) Incorrect error message Expression not in GROUP BY key thrown instead of Invalid function

2015-12-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12570:
-
Fix Version/s: (was: 2.0.0)
   2.1.0

> Incorrect error message Expression not in GROUP BY key thrown instead of 
> Invalid function
> -
>
> Key: HIVE-12570
> URL: https://issues.apache.org/jira/browse/HIVE-12570
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Fix For: 2.1.0
>
> Attachments: HIVE-12570.1.patch, HIVE-12570.2.patch, 
> HIVE-12570.3.patch, HIVE-12570.4.patch, HIVE-12570.5.patch
>
>
> {code}
> explain create table avg_salary_by_supervisor3 as select average(key) as 
> key_avg from src group by value;
> {code}
> We get the following stack trace :
> {code}
> FAILED: SemanticException [Error 10025]: Line 1:57 Expression not in GROUP BY 
> key 'key'
> ERROR ql.Driver: FAILED: SemanticException [Error 10025]: Line 1:57 
> Expression not in GROUP BY key 'key'
> org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:57 Expression not 
> in GROUP BY key 'key'
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10484)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10432)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3824)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3603)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8862)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8817)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9668)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9561)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10053)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:345)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10064)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:222)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:462)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1227)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1276)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1152)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1140)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:778)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:717)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {code}
> Instead of the above error message, it be more appropriate to throw the below 
> error :
> ERROR ql.Driver: FAILED: SemanticException [Error 10011]: Line 1:58 Invalid 
> function 'average'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12064) prevent transactional=false

2015-12-14 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057183#comment-15057183
 ] 

Eugene Koifman commented on HIVE-12064:
---

The reason that testTransactionalValidation() fails in the 4 cases above is 
that they are all instances of TestRemoteHiveMetaStore; it passes in 
TestEmbeddedHiveMetaStore.  The patch adds an EventListener that throws an 
exception, which is not propagated to the client in the Remote case but is 
propagated in the Embedded case.

I've modified 
AuthorizationPreEventListener.authorizeCreateDatabase(PreCreateDatabaseEvent 
context) to throw:
if (true) {
  throw new MetaException("Oops");
}
When I then look at TestAuthorizationPreEventListener.testListener(), it does 
"driver.run("create database " + dbName);" but no exception is surfaced, even 
though hive.log definitely has the "Oops" exception.


cc [~sushanth]

> prevent transactional=false
> ---
>
> Key: HIVE-12064
> URL: https://issues.apache.org/jira/browse/HIVE-12064
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-12064.2.patch, HIVE-12064.patch
>
>
> Currently the table property transactional=true must be set to make a table 
> behave in an ACID-compliant way.
> This is misleading in that it seems like changing it to transactional=false 
> makes the table non-ACID, but the on-disk layout of an ACID table is different 
> from that of a plain table. So changing this property may cause wrong data to 
> be returned.
> Should prevent transactional=false.
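
For illustration only (the table name and DDL below are hypothetical, not taken 
from the issue): the kind of sequence this change is meant to reject.
{code}
-- An ACID table (in the Hive versions discussed here, transactional=true also
-- requires ORC storage and a bucketed table).
CREATE TABLE acid_demo (id INT, val STRING)
  CLUSTERED BY (id) INTO 2 BUCKETS
  STORED AS ORC
  TBLPROPERTIES ('transactional'='true');

-- After this fix the following ALTER should fail, instead of silently leaving
-- ACID-layout files (delta directories) behind a table Hive now treats as plain.
ALTER TABLE acid_demo SET TBLPROPERTIES ('transactional'='false');
{code}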



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12666) PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes dynamic partition pruner generated synthetic join predicates.

2015-12-14 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057194#comment-15057194
 ] 

Laljo John Pullokkaran commented on HIVE-12666:
---

+1 conditional on QA run.

> PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes 
> dynamic partition pruner generated synthetic join predicates.
> 
>
> Key: HIVE-12666
> URL: https://issues.apache.org/jira/browse/HIVE-12666
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
>Priority: Blocker
> Attachments: HIVE-12666.1.patch
>
>
> Introduced by HIVE-11634. The original idea in HIVE-11634 was to remove the 
> IN partition conditions from the predicate list, since the static dynamic 
> partitioning would kick in and push these predicates down to the metastore. 
> However, the check is too aggressive and removes events such as the one below:
> {code}
> -Select Operator
> -  expressions: UDFToDouble(UDFToInteger((hr / 2))) 
> (type: double)
> -  outputColumnNames: _col0
> -  Statistics: Num rows: 1 Data size: 7 Basic stats: 
> COMPLETE Column stats: NONE
> -  Group By Operator
> -keys: _col0 (type: double)
> -mode: hash
> -outputColumnNames: _col0
> -Statistics: Num rows: 1 Data size: 7 Basic stats: 
> COMPLETE Column stats: NONE
> -Dynamic Partitioning Event Operator
> -  Target Input: srcpart
> -  Partition key expr: UDFToDouble(hr)
> -  Statistics: Num rows: 1 Data size: 7 Basic stats: 
> COMPLETE Column stats: NONE
> -  Target column: hr
> -  Target Vertex: Map 1
> {code}
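
For context, a sketch (in HiveQL) of the kind of query that produces such a 
synthetic predicate; the table and column names below are illustrative rather 
than taken from the issue.
{code}
-- The join key is an expression over the partition column hr, so dynamic
-- partition pruning emits a synthetic join predicate (the Dynamic Partitioning
-- Event Operator shown in the plan fragment above) to prune srcpart at runtime.
SELECT count(*)
FROM srcpart s
JOIN dim_hours d
  ON cast(s.hr AS double) = cast(cast(d.hr / 2 AS int) AS double);
{code}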



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work

2015-12-14 Thread yangfang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yangfang updated HIVE-12653:

Attachment: HIVE-12653.3.patch

> The property  "serialization.encoding" in the class 
> "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
> ---
>
> Key: HIVE-12653
> URL: https://issues.apache.org/jira/browse/HIVE-12653
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 1.2.1
>Reporter: yangfang
>Assignee: yangfang
> Attachments: HIVE-12653.2.patch, HIVE-12653.3.patch, 
> HIVE-12653.patch, HIVE-12653.patch
>
>
> When I create a table with ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load some files 
> containing Chinese text encoded in GBK:
> create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
> string, 
> num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string  ) 
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
> WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');
> load data local inpath 
> '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table 
> PersonInfo;
> I found garbled Chinese characters in the table, so 'serialization.encoding' 
> does not work; the garbled data is shown below:
> | 
> 
> 9999�ϴ���  
> 0624624002��ʱ��   
>   
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work

2015-12-14 Thread wangwenli (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055804#comment-15055804
 ] 

wangwenli commented on HIVE-12653:
--

It seems you forgot to add super.initialize(conf, tbl) in initialize().

> The property  "serialization.encoding" in the class 
> "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
> ---
>
> Key: HIVE-12653
> URL: https://issues.apache.org/jira/browse/HIVE-12653
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 1.2.1
>Reporter: yangfang
>Assignee: yangfang
> Attachments: HIVE-12653.2.patch, HIVE-12653.patch, HIVE-12653.patch
>
>
> When I create a table with ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load some files 
> containing Chinese text encoded in GBK:
> create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
> string, 
> num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string  ) 
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
> WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');
> load data local inpath 
> '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table 
> PersonInfo;
> I found garbled Chinese characters in the table, so 'serialization.encoding' 
> does not work; the garbled data is shown below:
> | 
> 
> 9999�ϴ���  
> 0624624002��ʱ��   
>   
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work

2015-12-14 Thread yangfang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055851#comment-15055851
 ] 

yangfang commented on HIVE-12653:
-

Thanks very much, I have repackaged the patch.

> The property  "serialization.encoding" in the class 
> "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
> ---
>
> Key: HIVE-12653
> URL: https://issues.apache.org/jira/browse/HIVE-12653
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 1.2.1
>Reporter: yangfang
>Assignee: yangfang
> Attachments: HIVE-12653.2.patch, HIVE-12653.3.patch, 
> HIVE-12653.patch, HIVE-12653.patch
>
>
> When I create a table with ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load some files 
> containing Chinese text encoded in GBK:
> create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
> string, 
> num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string  ) 
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
> WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');
> load data local inpath 
> '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table 
> PersonInfo;
> I found garbled Chinese characters in the table, so 'serialization.encoding' 
> does not work; the garbled data is shown below:
> | 
> 
> 9999�ϴ���  
> 0624624002��ʱ��   
>   
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work

2015-12-14 Thread yangfang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055852#comment-15055852
 ] 

yangfang commented on HIVE-12653:
-

Thanks very much, I have repackaged the patch.

> The property  "serialization.encoding" in the class 
> "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
> ---
>
> Key: HIVE-12653
> URL: https://issues.apache.org/jira/browse/HIVE-12653
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 1.2.1
>Reporter: yangfang
>Assignee: yangfang
> Attachments: HIVE-12653.2.patch, HIVE-12653.3.patch, 
> HIVE-12653.patch, HIVE-12653.patch
>
>
> When I create a table with ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load some files 
> containing Chinese text encoded in GBK:
> create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
> string, 
> num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string  ) 
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
> WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');
> load data local inpath 
> '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table 
> PersonInfo;
> I found garbled Chinese characters in the table, so 'serialization.encoding' 
> does not work; the garbled data is shown below:
> | 
> 
> 9999�ϴ���  
> 0624624002��ʱ��   
>   
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12616) NullPointerException when spark session is reused to run a mapjoin

2015-12-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055934#comment-15055934
 ] 

Hive QA commented on HIVE-12616:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777428/HIVE-12616.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 33 failed/errored test(s), 9881 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_stats
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_correlationoptimizer1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_where_non_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_join_nullsafe
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_ppd_basic
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_bmj_schema_evolution
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_transform2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_6
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_interval_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_limit
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_timestamp_ints_casts
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6346/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6346/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6346/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 33 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777428 - PreCommit-HIVE-TRUNK-Build

> NullPointerException when spark session is reused to run a mapjoin
> --
>
> Key: HIVE-12616
> URL: https://issues.apache.org/jira/browse/HIVE-12616
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.3.0
>Reporter: Nemon Lou
>Assignee: Nemon Lou
> Attachments: HIVE-12616.1.patch, HIVE-12616.2.patch, HIVE-12616.patch
>
>
> The way to reproduce:
> {noformat}
> set hive.execution.engine=spark;
> create table if not exists test(id int);
> create table if not exists test1(id int);
> insert into test values(1);
> insert into test1 values(1);
> select max(a.id) from test a ,test1 b
> where a.id = b.id;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12663) Support quoted table names/columns when ACID is on

2015-12-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057496#comment-15057496
 ] 

Hive QA commented on HIVE-12663:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777563/HIVE-12663.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 9887 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_char_mapjoin1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_shufflejoin
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6355/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6355/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6355/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777563 - PreCommit-HIVE-TRUNK-Build

> Support quoted table names/columns when ACID is on
> --
>
> Key: HIVE-12663
> URL: https://issues.apache.org/jira/browse/HIVE-12663
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch, 
> HIVE-12663.03.patch
>
>
> Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support 
> quoted names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work

2015-12-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056098#comment-15056098
 ] 

Hive QA commented on HIVE-12653:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777452/HIVE-12653.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 9896 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_hybridgrace_hashjoin_2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6347/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6347/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6347/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777452 - PreCommit-HIVE-TRUNK-Build

> The property  "serialization.encoding" in the class 
> "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
> ---
>
> Key: HIVE-12653
> URL: https://issues.apache.org/jira/browse/HIVE-12653
> Project: Hive
>  Issue Type: Improvement
>  Components: Contrib
>Affects Versions: 1.2.1
>Reporter: yangfang
>Assignee: yangfang
> Attachments: HIVE-12653.2.patch, HIVE-12653.3.patch, 
> HIVE-12653.patch, HIVE-12653.patch
>
>
> When I create a table with ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load some files 
> containing Chinese text encoded in GBK:
> create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr 
> string, 
> num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string  ) 
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' 
> WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK');
> load data local inpath 
> '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table 
> PersonInfo;
> I found garbled Chinese characters in the table, so 'serialization.encoding' 
> does not work; the garbled data is shown below:
> | 
> 
> 9999�ϴ���  
> 0624624002��ʱ��   
>   
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.

2015-12-14 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056185#comment-15056185
 ] 

Matt McCline commented on HIVE-12435:
-

Patch #3 also disappeared down the rabbit hole.

> SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and 
> vectorization is enabled.
> --
>
> Key: HIVE-12435
> URL: https://issues.apache.org/jira/browse/HIVE-12435
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Takahiko Saito
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, 
> HIVE-12435.03.patch
>
>
> Run the following query:
> {noformat}
> create table count_case_groupby (key string, bool boolean) STORED AS orc;
> insert into table count_case_groupby values ('key1', true),('key2', 
> false),('key3', NULL),('key4', false),('key5',NULL);
> {noformat}
> The table contains the following:
> {noformat}
> key1  true
> key2  false
> key3  NULL
> key4  false
> key5  NULL
> {noformat}
> The below query returns:
> {noformat}
> SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) 
> AS cnt_bool0_ok FROM count_case_groupby GROUP BY key;
> key1  1
> key2  1
> key3  1
> key4  1
> key5  1
> {noformat}
> while the expected results are:
> {noformat}
> key1  1
> key2  1
> key3  0
> key4  1
> key5  0
> {noformat}
> The query works with Hive 1.2. It also works when the table is not in ORC 
> format, and even for an ORC table the query works when vectorization is 
> disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.

2015-12-14 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056198#comment-15056198
 ] 

Matt McCline commented on HIVE-12435:
-

Started Hive QA for #4 as 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6348/

> SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and 
> vectorization is enabled.
> --
>
> Key: HIVE-12435
> URL: https://issues.apache.org/jira/browse/HIVE-12435
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Takahiko Saito
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, 
> HIVE-12435.03.patch, HIVE-12435.04.patch
>
>
> Run the following query:
> {noformat}
> create table count_case_groupby (key string, bool boolean) STORED AS orc;
> insert into table count_case_groupby values ('key1', true),('key2', 
> false),('key3', NULL),('key4', false),('key5',NULL);
> {noformat}
> The table contains the following:
> {noformat}
> key1  true
> key2  false
> key3  NULL
> key4  false
> key5  NULL
> {noformat}
> The below query returns:
> {noformat}
> SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) 
> AS cnt_bool0_ok FROM count_case_groupby GROUP BY key;
> key1  1
> key2  1
> key3  1
> key4  1
> key5  1
> {noformat}
> while the expected results are:
> {noformat}
> key1  1
> key2  1
> key3  0
> key4  1
> key5  0
> {noformat}
> The query works with Hive 1.2. It also works when the table is not in ORC 
> format, and even for an ORC table the query works when vectorization is 
> disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.

2015-12-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12435:

Attachment: HIVE-12435.04.patch

> SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and 
> vectorization is enabled.
> --
>
> Key: HIVE-12435
> URL: https://issues.apache.org/jira/browse/HIVE-12435
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Takahiko Saito
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, 
> HIVE-12435.03.patch, HIVE-12435.04.patch
>
>
> Run the following query:
> {noformat}
> create table count_case_groupby (key string, bool boolean) STORED AS orc;
> insert into table count_case_groupby values ('key1', true),('key2', 
> false),('key3', NULL),('key4', false),('key5',NULL);
> {noformat}
> The table contains the following:
> {noformat}
> key1  true
> key2  false
> key3  NULL
> key4  false
> key5  NULL
> {noformat}
> The below query returns:
> {noformat}
> SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) 
> AS cnt_bool0_ok FROM count_case_groupby GROUP BY key;
> key1  1
> key2  1
> key3  1
> key4  1
> key5  1
> {noformat}
> while the expected results are:
> {noformat}
> key1  1
> key2  1
> key3  0
> key4  1
> key5  0
> {noformat}
> The query works with Hive 1.2. It also works when the table is not in ORC 
> format, and even for an ORC table the query works when vectorization is 
> disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-11775) Implement limit push down through union all in CBO

2015-12-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-11775:
---
Attachment: HIVE-11775.09.patch

> Implement limit push down through union all in CBO
> --
>
> Key: HIVE-11775
> URL: https://issues.apache.org/jira/browse/HIVE-11775
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-11775.01.patch, HIVE-11775.02.patch, 
> HIVE-11775.03.patch, HIVE-11775.04.patch, HIVE-11775.05.patch, 
> HIVE-11775.06.patch, HIVE-11775.07.patch, HIVE-11775.08.patch, 
> HIVE-11775.09.patch
>
>
> Inspired by HIVE-11684 (kudos to [~jcamachorodriguez]), we can actually push 
> the limit down through union all, which reduces the number of intermediate 
> rows in the union branches.
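
A minimal sketch of the idea in HiveQL (the table names src1 and src2 are made 
up for illustration):
{code}
-- A limit above a UNION ALL ...
SELECT key FROM (
  SELECT key FROM src1
  UNION ALL
  SELECT key FROM src2
) u
LIMIT 10;

-- ... can be evaluated as if the limit were also applied inside each branch,
-- since no branch ever needs to produce more than 10 rows:
--   (SELECT key FROM src1 LIMIT 10) UNION ALL (SELECT key FROM src2 LIMIT 10)
-- followed by the final LIMIT 10.
{code}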



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction

2015-12-14 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12366:
-
Attachment: HIVE-12366.9.patch

Upload rebased patch 9

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, 
> HIVE-12366.3.patch, HIVE-12366.4.patch, HIVE-12366.5.patch, 
> HIVE-12366.6.patch, HIVE-12366.7.patch, HIVE-12366.8.patch, HIVE-12366.9.patch
>
>
> Currently there is a gap between the time of lock acquisition and the first 
> heartbeat being sent out. Normally the gap is negligible, but when it is large 
> it causes the query to fail, since the locks have already timed out by the 
> time the heartbeat is sent.
> We need to remove this gap.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12667) Proper fix for HIVE-12473

2015-12-14 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-12667:
--
Description: 
HIVE-12473 has added an incorrect comment and also lacks a test case.

Benefits of this fix:

   * Does not say: "Probably doesn't work"
   * Does not use grammar like "subquery columns and such"
   * Adds test cases that let you verify the fix
   * Doesn't rely on certain structure of key expr, just takes the type at 
compile time
   * Doesn't require an additional walk of each key expression
   * Shows the type used in explain

  was:HIVE-12473 has added an incorrect comment and also lacks a test case.


> Proper fix for HIVE-12473
> -
>
> Key: HIVE-12667
> URL: https://issues.apache.org/jira/browse/HIVE-12667
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
>
> HIVE-12473 has added an incorrect comment and also lacks a test case.
> Benefits of this fix:
>* Does not say: "Probably doesn't work"
>* Does not use grammar like "subquery columns and such"
>* Adds test cases that let you verify the fix
>* Doesn't rely on certain structure of key expr, just takes the type at 
> compile time
>* Doesn't require an additional walk of each key expression
>* Shows the type used in explain



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12597) LLAP - allow using elevator without cache

2015-12-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057325#comment-15057325
 ] 

Hive QA commented on HIVE-12597:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777554/HIVE-12597.02.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 9885 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_interval_2.q-bucket3.q-vectorization_7.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6354/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6354/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6354/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777554 - PreCommit-HIVE-TRUNK-Build

> LLAP - allow using elevator without cache
> -
>
> Key: HIVE-12597
> URL: https://issues.apache.org/jira/browse/HIVE-12597
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12597.01.patch, HIVE-12597.02.patch, 
> HIVE-12597.patch
>
>
> Elevator is currently tied up with cache due to the way the memory is 
> allocated. We should allow using elevator with the cache disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12667) Proper fix for HIVE-12473

2015-12-14 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057447#comment-15057447
 ] 

Gunther Hagleitner commented on HIVE-12667:
---

[~vikram.dixit]/[~wzheng] can you take a look please?

> Proper fix for HIVE-12473
> -
>
> Key: HIVE-12667
> URL: https://issues.apache.org/jira/browse/HIVE-12667
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Attachments: HIVE-12667.1.patch
>
>
> HIVE-12473 has added an incorrect comment and also lacks a test case.
> Benefits of this fix:
>* Does not say: "Probably doesn't work"
>* Does not use grammar like "subquery columns and such"
>* Adds test cases, that let you verify the fix
>* Doesn't rely on certain structure of key expr, just takes the type at 
> compile time
>* Doesn't require an additional walk of each key expression
>* Shows the type used in explain



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12667) Proper fix for HIVE-12473

2015-12-14 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-12667:
--
Target Version/s: 2.0.0

> Proper fix for HIVE-12473
> -
>
> Key: HIVE-12667
> URL: https://issues.apache.org/jira/browse/HIVE-12667
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
>
> HIVE-12473 has added an incorrect comment and also lacks a test case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12667) Proper fix for HIVE-12473

2015-12-14 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-12667:
--
Summary: Proper fix for HIVE-12473  (was: Add test case for HIVE-12473)

> Proper fix for HIVE-12473
> -
>
> Key: HIVE-12667
> URL: https://issues.apache.org/jira/browse/HIVE-12667
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
>
> HIVE-12473 has added an incorrect comment and also lacks a test case.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12663) Support quoted table names/columns when ACID is on

2015-12-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-12663:
--
Component/s: Transactions

> Support quoted table names/columns when ACID is on
> --
>
> Key: HIVE-12663
> URL: https://issues.apache.org/jira/browse/HIVE-12663
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12663.01.patch
>
>
> Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support 
> quoted names.
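
For illustration only (the table and column names below are invented): the kind 
of statements whose rewrite must keep the backquoted identifiers intact.
{code}
-- UPDATE/DELETE on an ACID table using quoted (backtick) identifiers; the
-- rewrite in UpdateDeleteSemanticAnalyzer needs to reproduce the quoting.
UPDATE `acid_tbl` SET `value` = 0 WHERE `key` > 10;
DELETE FROM `acid_tbl` WHERE `key` IS NULL;
{code}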



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-9544) Error dropping fully qualified partitioned table - Internal error processing get_partition_names

2015-12-14 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang resolved HIVE-9544.
---
   Resolution: Cannot Reproduce
Fix Version/s: 2.0.0

> Error dropping fully qualified partitioned table - Internal error processing 
> get_partition_names
> 
>
> Key: HIVE-9544
> URL: https://issues.apache.org/jira/browse/HIVE-9544
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.14.0
> Environment: HDP 2.2
>Reporter: Hari Sekhon
>Assignee: Chaoyu Tang
>Priority: Minor
> Fix For: 2.0.0
>
>
> When attempting to drop a partitioned table using a fully qualified name I 
> get this error:
> {code}
> hive -e 'drop table myDB.my_table_name;'
> Logging initialized using configuration in 
> file:/etc/hive/conf/hive-log4j.properties
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.2.0.0-2041/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.2.0.0-2041/hive/lib/hive-jdbc-0.14.0.2.2.0.0-2041-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.DDLTask. 
> org.apache.thrift.TApplicationException: Internal error processing 
> get_partition_names
> {code}
> It succeeds if I instead do:
> {code}hive -e 'use myDB; drop table my_table_name;'{code}
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12663) Support quoted table names/columns when ACID is on

2015-12-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-12663:
--
Target Version/s: 1.3.0, 2.1.0

> Support quoted table names/columns when ACID is on
> --
>
> Key: HIVE-12663
> URL: https://issues.apache.org/jira/browse/HIVE-12663
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12663.01.patch
>
>
> Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support 
> quoted names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12663) Support quoted table names/columns when ACID is on

2015-12-14 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056247#comment-15056247
 ] 

Eugene Koifman commented on HIVE-12663:
---

It seems like there should be a quoteName(String colName) method in some util 
class (hopefully the same method used for non-ACID tables).

> Support quoted table names/columns when ACID is on
> --
>
> Key: HIVE-12663
> URL: https://issues.apache.org/jira/browse/HIVE-12663
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12663.01.patch
>
>
> Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support 
> quoted names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12663) Support quoted table names/columns when ACID is on

2015-12-14 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-12663:
--
Fix Version/s: (was: 2.0.0)

> Support quoted table names/columns when ACID is on
> --
>
> Key: HIVE-12663
> URL: https://issues.apache.org/jira/browse/HIVE-12663
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12663.01.patch
>
>
> Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support 
> quoted names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12055) Create row-by-row shims for the write path

2015-12-14 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-12055:
-
Attachment: HIVE-12055.patch

Fixed two test case output files.

> Create row-by-row shims for the write path 
> ---
>
> Key: HIVE-12055
> URL: https://issues.apache.org/jira/browse/HIVE-12055
> Project: Hive
>  Issue Type: Sub-task
>  Components: ORC, Shims
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, 
> HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, 
> HIVE-12055.patch
>
>
> As part of removing the row-by-row writer, we'll need to shim out the higher 
> level API (OrcSerde and OrcOutputFormat) so that we maintain backwards 
> compatibility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12658) Task rejection by an llap daemon spams the log with RejectedExecutionExceptions

2015-12-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056759#comment-15056759
 ] 

Prasanth Jayachandran commented on HIVE-12658:
--

[~sseth] Can I take over this issue, if you haven't already started on it? 
IIUC, RejectedExecutionException should be caught by 
LlapDaemonProtocolServerImpl and the response should contain the rejected 
FragmentSpecProto or fragment identifier. Is that correct?

> Task rejection by an llap daemon spams the log with 
> RejectedExecutionExceptions
> ---
>
> Key: HIVE-12658
> URL: https://issues.apache.org/jira/browse/HIVE-12658
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>
> The execution queue throws a RejectedExecutionException - which is logged by 
> the hadoop IPC layer.
> Instead of relying on an Exception in the protocol - move to sending back an 
> explicit response to indicate a rejected fragment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12597) LLAP - allow using elevator without cache

2015-12-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12597:

Attachment: HIVE-12597.02.patch

The rebased patch.

> LLAP - allow using elevator without cache
> -
>
> Key: HIVE-12597
> URL: https://issues.apache.org/jira/browse/HIVE-12597
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12597.01.patch, HIVE-12597.02.patch, 
> HIVE-12597.patch
>
>
> Elevator is currently tied up with cache due to the way the memory is 
> allocated. We should allow using elevator with the cache disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12663) Support quoted table names/columns when ACID is on

2015-12-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12663:
---
Attachment: HIVE-12663.02.patch

> Support quoted table names/columns when ACID is on
> --
>
> Key: HIVE-12663
> URL: https://issues.apache.org/jira/browse/HIVE-12663
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch
>
>
> Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support 
> quoted names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12663) Support quoted table names/columns when ACID is on

2015-12-14 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056609#comment-15056609
 ] 

Pengcheng Xiong commented on HIVE-12663:


[~ekoifman], thanks for your comments. I have addressed this in the new patch. 
Could you take another look? Thanks.

> Support quoted table names/columns when ACID is on
> --
>
> Key: HIVE-12663
> URL: https://issues.apache.org/jira/browse/HIVE-12663
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch
>
>
> Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support 
> quoted names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12055) Create row-by-row shims for the write path

2015-12-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056613#comment-15056613
 ] 

Hive QA commented on HIVE-12055:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777507/HIVE-12055.patch

{color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 9895 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6349/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6349/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6349/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777507 - PreCommit-HIVE-TRUNK-Build

> Create row-by-row shims for the write path 
> ---
>
> Key: HIVE-12055
> URL: https://issues.apache.org/jira/browse/HIVE-12055
> Project: Hive
>  Issue Type: Sub-task
>  Components: ORC, Shims
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Attachments: HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, 
> HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, 
> HIVE-12055.patch
>
>
> As part of removing the row-by-row writer, we'll need to shim out the higher 
> level API (OrcSerde and OrcOutputFormat) so that we maintain backwards 
> compatibility.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12663) Support quoted table names/columns when ACID is on

2015-12-14 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056633#comment-15056633
 ] 

Eugene Koifman commented on HIVE-12663:
---

SemanticAnalyzer uses unparseIdentifier(String identifier, Configuration conf). 
Why did you choose to use unparseIdentifier(String identifier)?

> Support quoted table names/columns when ACID is on
> --
>
> Key: HIVE-12663
> URL: https://issues.apache.org/jira/browse/HIVE-12663
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch
>
>
> Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support 
> quoted names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12663) Support quoted table names/columns when ACID is on

2015-12-14 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056654#comment-15056654
 ] 

Eugene Koifman commented on HIVE-12663:
---


+1 pending tests. Could you commit to branch-1 as well, please?

> Support quoted table names/columns when ACID is on
> --
>
> Key: HIVE-12663
> URL: https://issues.apache.org/jira/browse/HIVE-12663
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch, 
> HIVE-12663.03.patch
>
>
> Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support 
> quoted names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12663) Support quoted table names/columns when ACID is on

2015-12-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12663:
---
Attachment: HIVE-12663.03.patch

> Support quoted table names/columns when ACID is on
> --
>
> Key: HIVE-12663
> URL: https://issues.apache.org/jira/browse/HIVE-12663
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch, 
> HIVE-12663.03.patch
>
>
> Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support 
> quoted names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12663) Support quoted table names/columns when ACID is on

2015-12-14 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056639#comment-15056639
 ] 

Pengcheng Xiong commented on HIVE-12663:


[~ekoifman], sorry, I used it in some places but not in all of them. I have 
changed that. Thanks.

> Support quoted table names/columns when ACID is on
> --
>
> Key: HIVE-12663
> URL: https://issues.apache.org/jira/browse/HIVE-12663
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch, 
> HIVE-12663.03.patch
>
>
> Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support 
> quoted names.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly

2015-12-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055703#comment-15055703
 ] 

Hive QA commented on HIVE-12661:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777420/HIVE-12661.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 124 failed/errored test(s), 9896 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin7
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_bucketed_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_merge
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_parallel_orderby
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket4
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_many
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin7
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_disable_merge_for_bucketing
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_bucketed_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_dyn_part
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_merge
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_num_buckets
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_list_bucket_dml_10
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_parallel_orderby
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_spark1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_spark2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_spark3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_spark4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin5
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin7
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin8
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin9
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin_negative
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_map_ppr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_map_ppr_multi_distinct
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_sort_skew_1_23
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input_part2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join17

[jira] [Commented] (HIVE-12577) NPE in LlapTaskCommunicator when unregistering containers

2015-12-14 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056438#comment-15056438
 ] 

Siddharth Seth commented on HIVE-12577:
---

EntityTracker tracks the relationship between containers and tasks, along with 
the nodes they run on. This is used for various bits of accounting, including 
telling unknown fragments to die and processing heartbeats for fragments that are 
in the wait queue of an LLAP instance.

There were some discrepancies in this tracking, the most important one being the 
missing null check that leads to the exception. The patch fixes these and adds 
some unit tests for verification.
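
For illustration, a minimal sketch of the defensive unregister path (the class 
and map below are hypothetical stand-ins, not the actual EntityTracker code):
{code}
// Hypothetical sketch of a defensive unregister; the class name and map are
// illustrative stand-ins, not the actual EntityTracker implementation.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ContainerTrackerSketch {
  private final ConcurrentMap<String, String> containerToNode = new ConcurrentHashMap<>();

  void registerContainer(String containerId, String node) {
    containerToNode.put(containerId, node);
  }

  void unregisterContainer(String containerId) {
    // The container may never have been registered, or may already have been
    // cleaned up, so a missing entry is tolerated instead of being dereferenced.
    String node = containerToNode.remove(containerId);
    if (node == null) {
      return; // nothing tracked for this container; avoids the NPE path
    }
    // ... per-node cleanup of the remaining accounting state would go here ...
  }
}
{code}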

> NPE in LlapTaskCommunicator when unregistering containers
> -
>
> Key: HIVE-12577
> URL: https://issues.apache.org/jira/browse/HIVE-12577
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.0.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Critical
> Attachments: HIVE-12577.1.review.txt, HIVE-12577.1.txt, 
> HIVE-12577.1.wip.txt
>
>
> {code}
> 2015-12-02 13:29:00,160 [ERROR] [Dispatcher thread {Central}] 
> |common.AsyncDispatcher|: Error in dispatcher thread
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator$EntityTracker.unregisterContainer(LlapTaskCommunicator.java:586)
> at 
> org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator.registerContainerEnd(LlapTaskCommunicator.java:188)
> at 
> org.apache.tez.dag.app.TaskCommunicatorManager.unregisterRunningContainer(TaskCommunicatorManager.java:389)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.unregisterFromTAListener(AMContainerImpl.java:1121)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtLaunchingTransition.transition(AMContainerImpl.java:699)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtIdleTransition.transition(AMContainerImpl.java:805)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:892)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:887)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:415)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:72)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:60)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:36)
> at 
> org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
> at 
> org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
> at java.lang.Thread.run(Thread.java:745)
> 2015-12-02 13:29:00,167 [ERROR] [Dispatcher thread {Central}] 
> |common.AsyncDispatcher|: Error in dispatcher thread
> java.lang.NullPointerException
> at 
> org.apache.tez.dag.app.TaskCommunicatorManager.unregisterRunningContainer(TaskCommunicatorManager.java:386)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl.unregisterFromTAListener(AMContainerImpl.java:1121)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtLaunchingTransition.transition(AMContainerImpl.java:699)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtIdleTransition.transition(AMContainerImpl.java:805)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:892)
> at 
> org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:887)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at 
> 

[jira] [Commented] (HIVE-12448) Change to tracking of dag status via dagIdentifier instead of dag name

2015-12-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056465#comment-15056465
 ] 

Sergey Shelukhin commented on HIVE-12448:
-

+1 pending tests

> Change to tracking of dag status via dagIdentifier instead of dag name
> --
>
> Key: HIVE-12448
> URL: https://issues.apache.org/jira/browse/HIVE-12448
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Affects Versions: 2.0.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-12448.1.txt, HIVE-12448.2.txt, HIVE-12448.3.txt, 
> HIVE-12448.4.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly

2015-12-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056499#comment-15056499
 ] 

Sergey Shelukhin commented on HIVE-12473:
-

The comment is not actually correct, as I realized later; we only receive one 
expression currently, so there's no chance of getting a wrong value.
The reason it does the right thing now is that the top-level expression does not 
need to be cast in the general case; Hive already takes care of comparing 
correctly.
What needs to be cast is the partition column string. It needs to be cast to the 
argument type of whatever it's passed to. Most of the time the partition column is 
the top-level expression and is passed into UDFCompareBlahBlah, but that's not 
always the case; it's different if it's wrapped in, and passed into, most UDFs 
(e.g. YEAR(date)). The patch changes the code to take the type of the argument 
instead of the type of the top-level expression.
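
As a condensed illustration of that idea, reusing the classes already quoted in 
the description (this is a sketch, not the actual patch; argTypeName stands in 
for however the argument type is obtained):
{code}
// Sketch: build the converter from the type of the UDF argument that the
// partition column feeds into, rather than the top-level expression's type.
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.Converter;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoFactory;

class PartitionValueCastSketch {
  static Object convert(String partColValue, String argTypeName) {
    // e.g. argTypeName = "date" for year(dt), so the string partition value is
    // converted to a date before year() evaluates it.
    ObjectInspector targetOi = PrimitiveObjectInspectorFactory
        .getPrimitiveWritableObjectInspector(
            TypeInfoFactory.getPrimitiveTypeInfo(argTypeName));
    Converter converter = ObjectInspectorConverters.getConverter(
        PrimitiveObjectInspectorFactory.javaStringObjectInspector, targetOi);
    return converter.convert(partColValue);
  }
}
{code}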


> DPP: UDFs on the partition column side does not evaluate correctly
> --
>
> Key: HIVE-12473
> URL: https://issues.apache.org/jira/browse/HIVE-12473
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12473.patch
>
>
> Related to HIVE-12462
> {code}
> select count(1) from accounts a, transactions t where year(a.dt) = year(t.dt) 
> and account_id = 22;
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> {code}
> Ends up being evaluated as {{year(cast(dt as int))}} because the pruner only 
> checks for final type, not the column type.
> {code}
> ObjectInspector oi =
> 
> PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(TypeInfoFactory
> .getPrimitiveTypeInfo(si.fieldInspector.getTypeName()));
> Converter converter =
> ObjectInspectorConverters.getConverter(
> PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12573) some DPP tests are broken

2015-12-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12573:

Priority: Blocker  (was: Major)

> some DPP tests are broken
> -
>
> Key: HIVE-12573
> URL: https://issues.apache.org/jira/browse/HIVE-12573
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Attachments: HIVE-12573.patch
>
>
> -It looks like LLAP out files were not updated in some DPP JIRA because the 
> test was entirely broken in HiveQA at the time- actually looks like out files 
> have explain output with a glitch



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction

2015-12-14 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12366:
-
Attachment: HIVE-12366.7.patch

Upload patch 7

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, 
> HIVE-12366.3.patch, HIVE-12366.4.patch, HIVE-12366.5.patch, 
> HIVE-12366.6.patch, HIVE-12366.7.patch
>
>
> Currently there is a gap between the time of lock acquisition and the first 
> heartbeat being sent out. Normally the gap is negligible, but when it's big 
> it will cause the query to fail since the locks are timed out by the time the 
> heartbeat is sent.
> Need to remove this gap.
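
A minimal sketch of one way to close that gap (the lock-acquisition and heartbeat 
calls below are placeholders, not Hive's actual Heartbeater code): schedule the 
first heartbeat with zero initial delay, immediately after the locks are taken.
{code}
// Sketch only: acquireLocks and heartbeat are stand-ins passed in by the caller.
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class HeartbeaterSketch {
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  void acquireLocksAndHeartbeat(Runnable acquireLocks, Runnable heartbeat, long intervalMs) {
    acquireLocks.run();
    // Initial delay of 0 removes the window in which locks could time out
    // before the first heartbeat is sent.
    scheduler.scheduleAtFixedRate(heartbeat, 0L, intervalMs, TimeUnit.MILLISECONDS);
  }
}
{code}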



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL

2015-12-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056518#comment-15056518
 ] 

Sergey Shelukhin commented on HIVE-12462:
-

I think the reason for this patch from [~gopalv]'s query runs was precisely 
that the TS predicate was a superset of the FIL predicate. I was assuming that's 
by design.

> DPP: DPP optimizers need to run on the TS predicate not FIL 
> 
>
> Key: HIVE-12462
> URL: https://issues.apache.org/jira/browse/HIVE-12462
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch
>
>
> HIVE-11398 + HIVE-11791, the partition-condition-remover became more 
> effective.
> This removes predicates from the FilterExpression which involve partition 
> columns, causing a miss for dynamic-partition pruning if the DPP relies on 
> FilterDesc.
> The TS desc will have the correct predicate in that condition.
> {code}
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> Filter Operator (FIL_20)
>   predicate: ((account_id = 22) and year(dt) is not null) (type: boolean)
>   Select Operator (SEL_4)
> expressions: dt (type: date)
> outputColumnNames: _col1
> Reduce Output Operator (RS_8)
>   key expressions: year(_col1) (type: int)
>   sort order: +
>   Map-reduce partition columns: year(_col1) (type: int)
>   Join Operator (JOIN_9)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 year(_col1) (type: int)
>   1 year(_col1) (type: int)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10982) Customizable the value of java.sql.statement.setFetchSize in Hive JDBC Driver

2015-12-14 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-10982:
--
Attachment: HIVE-10982.3.patch

I made one small change to the patch before committing. You had changed one of 
the constructors in HiveStatement rather than adding a completely new one. I was 
afraid this could break backwards compatibility, so I changed it to add a new 
constructor with all five arguments and leave the existing four-argument 
constructor in place. I've attached patch 3 with my change for completeness.
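
The compatibility pattern, sketched with a hypothetical parameter list rather 
than HiveStatement's actual signature:
{code}
// Illustrative sketch of the overload pattern only; the parameters are
// hypothetical, not HiveStatement's real constructor arguments.
class StatementSketch {
  private static final int DEFAULT_FETCH_SIZE = 50;
  private final int fetchSize;

  // Existing four-argument constructor stays, so current callers keep working.
  StatementSketch(Object client, Object sessHandle, boolean isScrollable, int queryTimeout) {
    this(client, sessHandle, isScrollable, queryTimeout, DEFAULT_FETCH_SIZE);
  }

  // New five-argument constructor adds the configurable fetch size.
  StatementSketch(Object client, Object sessHandle, boolean isScrollable, int queryTimeout,
      int fetchSize) {
    this.fetchSize = fetchSize;
  }

  int getFetchSize() {
    return fetchSize;
  }
}
{code}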

> Customizable the value of  java.sql.statement.setFetchSize in Hive JDBC Driver
> --
>
> Key: HIVE-10982
> URL: https://issues.apache.org/jira/browse/HIVE-10982
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Bing Li
>Assignee: Bing Li
>Priority: Critical
> Attachments: HIVE-10982.1.patch, HIVE-10982.2.patch, 
> HIVE-10982.3.patch
>
>
> The current JDBC driver for Hive hard-codes the value of setFetchSize to 50, 
> which can be a performance bottleneck.
> Pentaho filed this issue as  http://jira.pentaho.com/browse/PDI-11511, whose 
> status is open.
> Also it has discussion in 
> http://forums.pentaho.com/showthread.php?158381-Hive-JDBC-Query-too-slow-too-many-fetches-after-query-execution-Kettle-Xform
> http://mail-archives.apache.org/mod_mbox/hive-user/201307.mbox/%3ccacq46vevgrfqg5rwxnr1psgyz7dcf07mvlo8mm2qit3anm1...@mail.gmail.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.

2015-12-14 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056486#comment-15056486
 ] 

Matt McCline commented on HIVE-12435:
-

None of the failures look related.

> SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and 
> vectorization is enabled.
> --
>
> Key: HIVE-12435
> URL: https://issues.apache.org/jira/browse/HIVE-12435
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Takahiko Saito
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, 
> HIVE-12435.03.patch, HIVE-12435.04.patch
>
>
> Run the following query:
> {noformat}
> create table count_case_groupby (key string, bool boolean) STORED AS orc;
> insert into table count_case_groupby values ('key1', true),('key2', 
> false),('key3', NULL),('key4', false),('key5',NULL);
> {noformat}
> The table contains the following:
> {noformat}
> key1  true
> key2  false
> key3  NULL
> key4  false
> key5  NULL
> {noformat}
> The below query returns:
> {noformat}
> SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) 
> AS cnt_bool0_ok FROM count_case_groupby GROUP BY key;
> key1  1
> key2  1
> key3  1
> key4  1
> key5  1
> {noformat}
> while it expects the following results:
> {noformat}
> key1  1
> key2  1
> key3  0
> key4  1
> key5  0
> {noformat}
> The query works with Hive 1.2. It also works when the table is not in ORC 
> format.
> Even for an ORC table, the query works when vectorization is disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12570) Incorrect error message Expression not in GROUP BY key thrown instead of Invalid function

2015-12-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12570:
-
Attachment: HIVE-12570.5.patch

> Incorrect error message Expression not in GROUP BY key thrown instead of 
> Invalid function
> -
>
> Key: HIVE-12570
> URL: https://issues.apache.org/jira/browse/HIVE-12570
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12570.1.patch, HIVE-12570.2.patch, 
> HIVE-12570.3.patch, HIVE-12570.4.patch, HIVE-12570.5.patch
>
>
> {code}
> explain create table avg_salary_by_supervisor3 as select average(key) as 
> key_avg from src group by value;
> {code}
> We get the following stack trace :
> {code}
> FAILED: SemanticException [Error 10025]: Line 1:57 Expression not in GROUP BY 
> key 'key'
> ERROR ql.Driver: FAILED: SemanticException [Error 10025]: Line 1:57 
> Expression not in GROUP BY key 'key'
> org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:57 Expression not 
> in GROUP BY key 'key'
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10484)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10432)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3824)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3603)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8862)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8817)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9668)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9561)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10053)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:345)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10064)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:222)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:462)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1227)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1276)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1152)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1140)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:778)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:717)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {code}
> Instead of the above error message, it would be more appropriate to throw the 
> error below:
> ERROR ql.Driver: FAILED: SemanticException [Error 10011]: Line 1:58 Invalid 
> function 'average'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction

2015-12-14 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12366:
-
Attachment: (was: HIVE-12366.7.patch)

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, 
> HIVE-12366.3.patch, HIVE-12366.4.patch, HIVE-12366.5.patch, 
> HIVE-12366.6.patch, HIVE-12366.7.patch
>
>
> Currently there is a gap between the time of lock acquisition and the first 
> heartbeat being sent out. Normally the gap is negligible, but when it's big 
> it will cause the query to fail since the locks are timed out by the time the 
> heartbeat is sent.
> Need to remove this gap.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction

2015-12-14 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12366:
-
Attachment: HIVE-12366.7.patch

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, 
> HIVE-12366.3.patch, HIVE-12366.4.patch, HIVE-12366.5.patch, 
> HIVE-12366.6.patch, HIVE-12366.7.patch
>
>
> Currently there is a gap between the time of lock acquisition and the first 
> heartbeat being sent out. Normally the gap is negligible, but when it's big 
> it will cause the query to fail since the locks are timed out by the time the 
> heartbeat is sent.
> Need to remove this gap.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL

2015-12-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056516#comment-15056516
 ] 

Sergey Shelukhin commented on HIVE-12462:
-

We should figure out the right fix for the above issue before reverting this. 
Do you want to track it in HIVE-12462?

> DPP: DPP optimizers need to run on the TS predicate not FIL 
> 
>
> Key: HIVE-12462
> URL: https://issues.apache.org/jira/browse/HIVE-12462
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Gopal V
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch
>
>
> HIVE-11398 + HIVE-11791, the partition-condition-remover became more 
> effective.
> This removes predicates from the FilterExpression which involve partition 
> columns, causing a miss for dynamic-partition pruning if the DPP relies on 
> FilterDesc.
> The TS desc will have the correct predicate in that condition.
> {code}
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> Filter Operator (FIL_20)
>   predicate: ((account_id = 22) and year(dt) is not null) (type: boolean)
>   Select Operator (SEL_4)
> expressions: dt (type: date)
> outputColumnNames: _col1
> Reduce Output Operator (RS_8)
>   key expressions: year(_col1) (type: int)
>   sort order: +
>   Map-reduce partition columns: year(_col1) (type: int)
>   Join Operator (JOIN_9)
> condition map:
>  Inner Join 0 to 1
> keys:
>   0 year(_col1) (type: int)
>   1 year(_col1) (type: int)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly

2015-12-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-12473.
-
Resolution: Fixed

> DPP: UDFs on the partition column side does not evaluate correctly
> --
>
> Key: HIVE-12473
> URL: https://issues.apache.org/jira/browse/HIVE-12473
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Affects Versions: 1.3.0, 1.2.1, 2.0.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-12473.patch
>
>
> Related to HIVE-12462
> {code}
> select count(1) from accounts a, transactions t where year(a.dt) = year(t.dt) 
> and account_id = 22;
> $hdt$_0:$hdt$_1:a
>   TableScan (TS_2)
> alias: a
> filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) 
> IN (RS[6])) (type: boolean)
> {code}
> Ends up being evaluated as {{year(cast(dt as int))}} because the pruner only 
> checks for final type, not the column type.
> {code}
> ObjectInspector oi =
> 
> PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(TypeInfoFactory
> .getPrimitiveTypeInfo(si.fieldInspector.getTypeName()));
> Converter converter =
> ObjectInspectorConverters.getConverter(
> PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction

2015-12-14 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-12366:
-
Attachment: HIVE-12366.8.patch

Upload patch 8

> Refactor Heartbeater logic for transaction
> --
>
> Key: HIVE-12366
> URL: https://issues.apache.org/jira/browse/HIVE-12366
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, 
> HIVE-12366.3.patch, HIVE-12366.4.patch, HIVE-12366.5.patch, 
> HIVE-12366.6.patch, HIVE-12366.7.patch, HIVE-12366.8.patch
>
>
> Currently there is a gap between the time of lock acquisition and the first 
> heartbeat being sent out. Normally the gap is negligible, but when it's big 
> it will cause the query to fail since the locks are timed out by the time the 
> heartbeat is sent.
> Need to remove this gap.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly

2015-12-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12661:
---
Attachment: HIVE-12661.03.patch

> StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly
> ---
>
> Key: HIVE-12661
> URL: https://issues.apache.org/jira/browse/HIVE-12661
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12661.01.patch, HIVE-12661.02.patch, 
> HIVE-12661.03.patch
>
>
> PROBLEM:
> Hive stats are autogathered properly until an 'analyze table [tablename] 
> compute statistics for columns' is run. Then the stats are not auto-updated 
> until the command is run again. Repro:
> {code}
> set hive.stats.autogather=true; 
> set hive.stats.atomic=false ; 
> set hive.stats.collect.rawdatasize=true ; 
> set hive.stats.collect.scancols=false ; 
> set hive.stats.collect.tablekeys=false ; 
> set hive.stats.fetch.column.stats=true; 
> set hive.stats.fetch.partition.stats=true ; 
> set hive.stats.reliable=false ; 
> set hive.compute.query.using.stats=true; 
> CREATE TABLE `default`.`calendar` (`year` int) ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ( 
> 'orc.compress'='NONE') ; 
> insert into calendar values (2010), (2011), (2012); 
> select * from calendar; 
> ++--+ 
> | calendar.year | 
> ++--+ 
> | 2010 | 
> | 2011 | 
> | 2012 | 
> ++--+ 
> select max(year) from calendar; 
> | 2012 | 
> insert into calendar values (2013); 
> select * from calendar; 
> ++--+ 
> | calendar.year | 
> ++--+ 
> | 2010 | 
> | 2011 | 
> | 2012 | 
> | 2013 | 
> ++--+ 
> select max(year) from calendar; 
> | 2013 | 
> insert into calendar values (2014); 
> select max(year) from calendar; 
> | 2014 |
> analyze table calendar compute statistics for columns;
> insert into calendar values (2015);
> select max(year) from calendar;
> | 2014 |
> insert into calendar values (2016), (2017), (2018);
> select max(year) from calendar;
> | 2014  |
> analyze table calendar compute statistics for columns;
> select max(year) from calendar;
> | 2018  |
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12448) Change to tracking of dag status via dagIdentifier instead of dag name

2015-12-14 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-12448:
--
Attachment: HIVE-12448.4.txt

Rebased patch after some recent conflicting changes.

> Change to tracking of dag status via dagIdentifier instead of dag name
> --
>
> Key: HIVE-12448
> URL: https://issues.apache.org/jira/browse/HIVE-12448
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Affects Versions: 2.0.0
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-12448.1.txt, HIVE-12448.2.txt, HIVE-12448.3.txt, 
> HIVE-12448.4.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly

2015-12-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-12661:
---
Attachment: (was: HIVE-12661.03.patch)

> StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly
> ---
>
> Key: HIVE-12661
> URL: https://issues.apache.org/jira/browse/HIVE-12661
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12661.01.patch, HIVE-12661.02.patch, 
> HIVE-12661.03.patch
>
>
> PROBLEM:
> Hive stats are autogathered properly until an 'analyze table [tablename] 
> compute statistics for columns' is run. Then the stats are not auto-updated 
> until the command is run again. Repro:
> {code}
> set hive.stats.autogather=true; 
> set hive.stats.atomic=false ; 
> set hive.stats.collect.rawdatasize=true ; 
> set hive.stats.collect.scancols=false ; 
> set hive.stats.collect.tablekeys=false ; 
> set hive.stats.fetch.column.stats=true; 
> set hive.stats.fetch.partition.stats=true ; 
> set hive.stats.reliable=false ; 
> set hive.compute.query.using.stats=true; 
> CREATE TABLE `default`.`calendar` (`year` int) ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 
> 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ( 
> 'orc.compress'='NONE') ; 
> insert into calendar values (2010), (2011), (2012); 
> select * from calendar; 
> ++--+ 
> | calendar.year | 
> ++--+ 
> | 2010 | 
> | 2011 | 
> | 2012 | 
> ++--+ 
> select max(year) from calendar; 
> | 2012 | 
> insert into calendar values (2013); 
> select * from calendar; 
> ++--+ 
> | calendar.year | 
> ++--+ 
> | 2010 | 
> | 2011 | 
> | 2012 | 
> | 2013 | 
> ++--+ 
> select max(year) from calendar; 
> | 2013 | 
> insert into calendar values (2014); 
> select max(year) from calendar; 
> | 2014 |
> analyze table calendar compute statistics for columns;
> insert into calendar values (2015);
> select max(year) from calendar;
> | 2014 |
> insert into calendar values (2016), (2017), (2018);
> select max(year) from calendar;
> | 2014  |
> analyze table calendar compute statistics for columns;
> select max(year) from calendar;
> | 2018  |
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12640) Allow StatsOptimizer to optimize the query for Constant GroupBy keys

2015-12-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12640:
-
Attachment: (was: HIVE-12640.1.patch)

> Allow StatsOptimizer to optimize the query for Constant GroupBy keys 
> -
>
> Key: HIVE-12640
> URL: https://issues.apache.org/jira/browse/HIVE-12640
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12640.1.patch, HIVE-12640.2.patch
>
>
> {code}
> hive> select count('1') from src group by '1';
> {code}
> In the above query, while performing StatsOptimizer optimization we can 
> safely ignore the group by on the constant key '1' since the above query will 
> return the same result as "select count('1') from src".
> Exception:
> If src is empty, according to the SQL standard,
> {code}
>  select count('1') from src group by '1'
> {code}
> and
> {code}
>  select count('1') from src
> {code}
> should produce 1 and 0 rows respectively.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12640) Allow StatsOptimizer to optimize the query for Constant GroupBy keys

2015-12-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12640:
-
Attachment: HIVE-12640.1.patch

> Allow StatsOptimizer to optimize the query for Constant GroupBy keys 
> -
>
> Key: HIVE-12640
> URL: https://issues.apache.org/jira/browse/HIVE-12640
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12640.1.patch, HIVE-12640.2.patch
>
>
> {code}
> hive> select count('1') from src group by '1';
> {code}
> In the above query, while performing StatsOptimizer optimization we can 
> safely ignore the group by on the constant key '1' since the above query will 
> return the same result as "select count('1') from src".
> Exception:
> If src is empty, according to the SQL standard,
> {code}
>  select count('1') from src group by '1'
> {code}
> and
> {code}
>  select count('1') from src
> {code}
> should produce 1 and 0 rows respectively.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.

2015-12-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056419#comment-15056419
 ] 

Hive QA commented on HIVE-12435:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777494/HIVE-12435.04.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 9897 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6348/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6348/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6348/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 16 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777494 - PreCommit-HIVE-TRUNK-Build

> SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and 
> vectorization is enabled.
> --
>
> Key: HIVE-12435
> URL: https://issues.apache.org/jira/browse/HIVE-12435
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Takahiko Saito
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, 
> HIVE-12435.03.patch, HIVE-12435.04.patch
>
>
> Run the following query:
> {noformat}
> create table count_case_groupby (key string, bool boolean) STORED AS orc;
> insert into table count_case_groupby values ('key1', true),('key2', 
> false),('key3', NULL),('key4', false),('key5',NULL);
> {noformat}
> The table contains the following:
> {noformat}
> key1  true
> key2  false
> key3  NULL
> key4  false
> key5  NULL
> {noformat}
> The below query returns:
> {noformat}
> SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) 
> AS cnt_bool0_ok FROM count_case_groupby GROUP BY key;
> key1  1
> key2  1
> key3  1
> key4  1
> key5  1
> {noformat}
> while it expects the following results:
> {noformat}
> key1  1
> key2  1
> key3  0
> key4  1
> key5  0
> {noformat}
> The query works with Hive 1.2. It also works when the table is not in ORC 
> format.
> Even for an ORC table, the query works when vectorization is disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12528) don't start HS2 Tez sessions in a single thread

2015-12-14 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056441#comment-15056441
 ] 

Siddharth Seth commented on HIVE-12528:
---

NPE where?
Some of the variables being set up may not be visible in the threads that make 
use of them. Making some of them final would be ideal.
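
A minimal sketch of the parallel startup with safely published final fields (the 
session startup below is a placeholder, not the actual HS2/Tez session pool code):
{code}
// Sketch: shared state read by the startup threads lives in final fields.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSessionStarterSketch {
  private final int numSessions;   // final: safely visible to startup threads
  private final String queueName;  // final: safely visible to startup threads

  ParallelSessionStarterSketch(int numSessions, String queueName) {
    this.numSessions = numSessions;
    this.queueName = queueName;
  }

  List<Future<String>> startAll() {
    ExecutorService pool = Executors.newFixedThreadPool(numSessions);
    List<Future<String>> sessions = new ArrayList<>();
    for (int i = 0; i < numSessions; i++) {
      final int id = i;
      sessions.add(pool.submit(new Callable<String>() {
        @Override
        public String call() {
          // A real implementation would open a Tez session here.
          return "session-" + id + " on queue " + queueName;
        }
      }));
    }
    pool.shutdown();
    return sessions;
  }
}
{code}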

> don't start HS2 Tez sessions in a single thread
> ---
>
> Key: HIVE-12528
> URL: https://issues.apache.org/jira/browse/HIVE-12528
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12528.patch
>
>
> Starting sessions in parallel would improve the startup time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12640) Allow StatsOptimizer to optimize the query for Constant GroupBy keys

2015-12-14 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-12640:
-
Attachment: HIVE-12640.2.patch

> Allow StatsOptimizer to optimize the query for Constant GroupBy keys 
> -
>
> Key: HIVE-12640
> URL: https://issues.apache.org/jira/browse/HIVE-12640
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12640.1.patch, HIVE-12640.2.patch
>
>
> {code}
> hive> select count('1') from src group by '1';
> {code}
> In the above query, while performing StatsOptimizer optimization we can 
> safely ignore the group by on the constant key '1' since the above query will 
> return the same result as "select count('1') from src".
> Exception:
> If src is empty, according to the SQL standard,
> {code}
>  select count('1') from src group by '1'
> {code}
> and
> {code}
>  select count('1') from src
> {code}
> should produce 1 and 0 rows respectively.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10982) Customizable the value of java.sql.statement.setFetchSize in Hive JDBC Driver

2015-12-14 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056478#comment-15056478
 ] 

Alan Gates commented on HIVE-10982:
---

+1

> Customizable the value of  java.sql.statement.setFetchSize in Hive JDBC Driver
> --
>
> Key: HIVE-10982
> URL: https://issues.apache.org/jira/browse/HIVE-10982
> Project: Hive
>  Issue Type: Improvement
>  Components: JDBC
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Bing Li
>Assignee: Bing Li
>Priority: Critical
> Attachments: HIVE-10982.1.patch, HIVE-10982.2.patch
>
>
> The current JDBC driver for Hive hard-codes the value of setFetchSize to 50, 
> which can be a performance bottleneck.
> Pentaho filed this issue as  http://jira.pentaho.com/browse/PDI-11511, whose 
> status is open.
> Also it has discussion in 
> http://forums.pentaho.com/showthread.php?158381-Hive-JDBC-Query-too-slow-too-many-fetches-after-query-execution-Kettle-Xform
> http://mail-archives.apache.org/mod_mbox/hive-user/201307.mbox/%3ccacq46vevgrfqg5rwxnr1psgyz7dcf07mvlo8mm2qit3anm1...@mail.gmail.com%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12668) package script for LLAP was broken by recent config changes

2015-12-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12668:

Target Version/s: 2.0.0

> package script for LLAP was broken by recent config changes
> ---
>
> Key: HIVE-12668
> URL: https://issues.apache.org/jira/browse/HIVE-12668
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> I didn't realize that was part of Hive... the script needs to be updated.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.

2015-12-14 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056798#comment-15056798
 ] 

Matt McCline commented on HIVE-12435:
-

Committed to master.

> SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and 
> vectorization is enabled.
> --
>
> Key: HIVE-12435
> URL: https://issues.apache.org/jira/browse/HIVE-12435
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Takahiko Saito
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.1.0
>
> Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, 
> HIVE-12435.03.patch, HIVE-12435.04.patch
>
>
> Run the following query:
> {noformat}
> create table count_case_groupby (key string, bool boolean) STORED AS orc;
> insert into table count_case_groupby values ('key1', true),('key2', 
> false),('key3', NULL),('key4', false),('key5',NULL);
> {noformat}
> The table contains the following:
> {noformat}
> key1  true
> key2  false
> key3  NULL
> key4  false
> key5  NULL
> {noformat}
> The below query returns:
> {noformat}
> SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) 
> AS cnt_bool0_ok FROM count_case_groupby GROUP BY key;
> key1  1
> key2  1
> key3  1
> key4  1
> key5  1
> {noformat}
> while it expects the following results:
> {noformat}
> key1  1
> key2  1
> key3  0
> key4  1
> key5  0
> {noformat}
> The query works with Hive 1.2. It also works when the table is not in ORC 
> format.
> Even for an ORC table, the query works when vectorization is disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12658) Task rejection by an llap daemon spams the log with RejectedExecutionExceptions

2015-12-14 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056815#comment-15056815
 ] 

Siddharth Seth commented on HIVE-12658:
---

Something along those lines. I think it'll be better to catch the 
RejectedExecutionException as early as possible and set a status in 
SubmitWorkResponse, rather than leaving the logic up to 
LlapDaemonProtocolServerImpl, which is meant to be a proxy layer over the 
protocol.
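
A minimal sketch of that approach (SubmissionState and the response class below 
are illustrative stand-ins, not the actual LLAP protocol types):
{code}
// Sketch: catch the rejection at the submission point and report it explicitly.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;

enum SubmissionState { ACCEPTED, REJECTED }

class SubmitWorkResponseSketch {
  final SubmissionState state;
  SubmitWorkResponseSketch(SubmissionState state) { this.state = state; }
}

class FragmentSubmitterSketch {
  private final ExecutorService executionQueue;

  FragmentSubmitterSketch(ExecutorService executionQueue) {
    this.executionQueue = executionQueue;
  }

  SubmitWorkResponseSketch submitWork(Runnable fragment) {
    try {
      executionQueue.execute(fragment);
      return new SubmitWorkResponseSketch(SubmissionState.ACCEPTED);
    } catch (RejectedExecutionException e) {
      // The caller gets an explicit REJECTED status instead of an exception
      // propagating through, and being logged by, the IPC layer.
      return new SubmitWorkResponseSketch(SubmissionState.REJECTED);
    }
  }
}
{code}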

> Task rejection by an llap daemon spams the log with 
> RejectedExecutionExceptions
> ---
>
> Key: HIVE-12658
> URL: https://issues.apache.org/jira/browse/HIVE-12658
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>
> The execution queue throws a RejectedExecutionException - which is logged by 
> the hadoop IPC layer.
> Instead of relying on an Exception in the protocol - move to sending back an 
> explicit response to indicate a rejected fragment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.

2015-12-14 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056818#comment-15056818
 ] 

Matt McCline commented on HIVE-12435:
-

Committed to branch-1.

> SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and 
> vectorization is enabled.
> --
>
> Key: HIVE-12435
> URL: https://issues.apache.org/jira/browse/HIVE-12435
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Takahiko Saito
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 1.3.0, 2.1.0
>
> Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, 
> HIVE-12435.03.patch, HIVE-12435.04.patch
>
>
> Run the following query:
> {noformat}
> create table count_case_groupby (key string, bool boolean) STORED AS orc;
> insert into table count_case_groupby values ('key1', true),('key2', 
> false),('key3', NULL),('key4', false),('key5',NULL);
> {noformat}
> The table contains the following:
> {noformat}
> key1  true
> key2  false
> key3  NULL
> key4  false
> key5  NULL
> {noformat}
> The below query returns:
> {noformat}
> SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) 
> AS cnt_bool0_ok FROM count_case_groupby GROUP BY key;
> key1  1
> key2  1
> key3  1
> key4  1
> key5  1
> {noformat}
> while it expects the following results:
> {noformat}
> key1  1
> key2  1
> key3  0
> key4  1
> key5  0
> {noformat}
> The query works with Hive 1.2. It also works when the table is not stored as 
> ORC, and even with an ORC table the query works when vectorization is 
> disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-12658) Task rejection by an llap daemon spams the log with RejectedExecutionExceptions

2015-12-14 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-12658:


Assignee: Prasanth Jayachandran  (was: Siddharth Seth)

> Task rejection by an llap daemon spams the log with 
> RejectedExecutionExceptions
> ---
>
> Key: HIVE-12658
> URL: https://issues.apache.org/jira/browse/HIVE-12658
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>Assignee: Prasanth Jayachandran
>
> The execution queue throws a RejectedExecutionException - which is logged by 
> the hadoop IPC layer.
> Instead of relying on an Exception in the protocol - move to sending back an 
> explicit response to indicate a rejected fragment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12640) Allow StatsOptimizer to optimize the query for Constant GroupBy keys

2015-12-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056866#comment-15056866
 ] 

Hive QA commented on HIVE-12640:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777529/HIVE-12640.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 912 failed/errored test(s), 9882 tests 
executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_partition_diff_num_cols.q-tez_joins_explain.q-vector_decimal_aggregate.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_allcolref_in_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_change_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_cascade
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_invalidate_column_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ambiguitycheck
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_excludeHadoop20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1_sql_std
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_admin_almighty2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_cli_nonsql
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_update
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_update_own_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avrocountemptytbl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_constant
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketpruning1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cast1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cast_tinyint_to_double
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cast_to_int
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_gby_empty
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_cross_product_check_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_gby_empty

[jira] [Updated] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.

2015-12-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-12435:

Fix Version/s: 2.1.0

> SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and 
> vectorization is enabled.
> --
>
> Key: HIVE-12435
> URL: https://issues.apache.org/jira/browse/HIVE-12435
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.0.0
>Reporter: Takahiko Saito
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.1.0
>
> Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, 
> HIVE-12435.03.patch, HIVE-12435.04.patch
>
>
> Run the following query:
> {noformat}
> create table count_case_groupby (key string, bool boolean) STORED AS orc;
> insert into table count_case_groupby values ('key1', true),('key2', 
> false),('key3', NULL),('key4', false),('key5',NULL);
> {noformat}
> The table contains the following:
> {noformat}
> key1  true
> key2  false
> key3  NULL
> key4  false
> key5  NULL
> {noformat}
> The below query returns:
> {noformat}
> SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) 
> AS cnt_bool0_ok FROM count_case_groupby GROUP BY key;
> key1  1
> key2  1
> key3  1
> key4  1
> key5  1
> {noformat}
> while it expects the following results:
> {noformat}
> key1  1
> key2  1
> key3  0
> key4  1
> key5  0
> {noformat}
> The query works with Hive 1.2. It also works when the table is not stored as 
> ORC, and even with an ORC table the query works when vectorization is 
> disabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

