[jira] [Updated] (HIVE-12673) Orcfiledump throws NPE when no files are available
[ https://issues.apache.org/jira/browse/HIVE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HIVE-12673: Attachment: HIVE-12673.2.patch Exiting early from printJsonMetaData when files are not present. > Orcfiledump throws NPE when no files are available > -- > > Key: HIVE-12673 > URL: https://issues.apache.org/jira/browse/HIVE-12673 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HIVE-12673.1.patch, HIVE-12673.2.patch > > > {noformat} > Exception in thread "main" java.lang.NullPointerException > at org.codehaus.jettison.json.JSONTokener.more(JSONTokener.java:106) > at org.codehaus.jettison.json.JSONTokener.next(JSONTokener.java:116) > at > org.codehaus.jettison.json.JSONTokener.nextClean(JSONTokener.java:170) > at org.codehaus.jettison.json.JSONObject.(JSONObject.java:185) > at org.codehaus.jettison.json.JSONObject.(JSONObject.java:293) > at > org.apache.hadoop.hive.ql.io.orc.JsonFileDump.printJsonMetaData(JsonFileDump.java:197) > at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:107) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {noformat} > hive --orcfiledump -j -p /tmp/orc/inventory/inv_date_sk=2452654 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants
[ https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11927: --- Attachment: HIVE-11927.12.patch > Implement/Enable constant related optimization rules in Calcite: enable > HiveReduceExpressionsRule to fold constants > --- > > Key: HIVE-11927 > URL: https://issues.apache.org/jira/browse/HIVE-11927 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, > HIVE-11927.03.patch, HIVE-11927.04.patch, HIVE-11927.05.patch, > HIVE-11927.06.patch, HIVE-11927.07.patch, HIVE-11927.08.patch, > HIVE-11927.09.patch, HIVE-11927.10.patch, HIVE-11927.11.patch, > HIVE-11927.12.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12675) PerfLogger should log performance metrics at debug level
[ https://issues.apache.org/jira/browse/HIVE-12675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12675: - Attachment: HIVE-12675.1.patch cc-ing [~jpullokkaran] and [~ashutoshc] for review. > PerfLogger should log performance metrics at debug level > > > Key: HIVE-12675 > URL: https://issues.apache.org/jira/browse/HIVE-12675 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12675.1.patch > > > As more and more subcomponents of Hive (Tez, Optimizer, etc.) are using > PerfLogger to track the performance metrics, it will be more meaningful to > set the PerfLogger logging level to DEBUG. Otherwise, we will print the > performance metrics unnecessarily for each and every query if the underlying > subcomponent does not control the PerfLogging via a parameter on its own. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
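The behavior requested above can be illustrated with java.util.logging standing in for Hive's logging facade, with FINE playing the role of DEBUG. The class and method names here are hypothetical, not Hive's PerfLogger API:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch: emit performance metrics only when the logger is at debug level,
// so routine queries at the default (INFO) level do not print them.
class PerfLogSketch {

  // Returns true if the metric was actually logged.
  public static boolean logPerfMetric(Logger log, String metric, long millis) {
    if (!log.isLoggable(Level.FINE)) {
      return false;                 // skipped at the default level
    }
    log.fine(metric + " took " + millis + " ms");
    return true;
  }
}
```

With this guard, a subcomponent that calls the perf logger on every query only produces output when an operator has deliberately raised the log level.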
[jira] [Comment Edited] (HIVE-12673) Orcfiledump throws NPE when no files are available
[ https://issues.apache.org/jira/browse/HIVE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056982#comment-15056982 ] Prasanth Jayachandran edited comment on HIVE-12673 at 12/14/15 11:38 PM: - The problem is JSONObject is in incomplete state. I think we should fix that. 1) If the files list is empty or null return at the beginning of the function 2) line:208. Add else condition and do writer.endObject() to finish the object as 'done' was (Author: prasanth_j): The problem is JSONObject is in complete state. I think we should fix that. 1) If the files list is empty or null return at the beginning of the function 2) line:208. Add else condition and do writer.endObject() to finish the object as 'done' > Orcfiledump throws NPE when no files are available > -- > > Key: HIVE-12673 > URL: https://issues.apache.org/jira/browse/HIVE-12673 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HIVE-12673.1.patch > > > {noformat} > Exception in thread "main" java.lang.NullPointerException > at org.codehaus.jettison.json.JSONTokener.more(JSONTokener.java:106) > at org.codehaus.jettison.json.JSONTokener.next(JSONTokener.java:116) > at > org.codehaus.jettison.json.JSONTokener.nextClean(JSONTokener.java:170) > at org.codehaus.jettison.json.JSONObject.(JSONObject.java:185) > at org.codehaus.jettison.json.JSONObject.(JSONObject.java:293) > at > org.apache.hadoop.hive.ql.io.orc.JsonFileDump.printJsonMetaData(JsonFileDump.java:197) > at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:107) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > 
{noformat} > hive --orcfiledump -j -p /tmp/orc/inventory/inv_date_sk=2452654 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12673) Orcfiledump throws NPE when no files are available
[ https://issues.apache.org/jira/browse/HIVE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056982#comment-15056982 ] Prasanth Jayachandran commented on HIVE-12673: -- The problem is that the JSONObject is in an incomplete state. I think we should fix that. 1) If the files list is empty or null, return at the beginning of the function. 2) At line 208, add an else condition and call writer.endObject() to finish the object as 'done'. > Orcfiledump throws NPE when no files are available > -- > > Key: HIVE-12673 > URL: https://issues.apache.org/jira/browse/HIVE-12673 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HIVE-12673.1.patch > > > {noformat} > Exception in thread "main" java.lang.NullPointerException > at org.codehaus.jettison.json.JSONTokener.more(JSONTokener.java:106) > at org.codehaus.jettison.json.JSONTokener.next(JSONTokener.java:116) > at > org.codehaus.jettison.json.JSONTokener.nextClean(JSONTokener.java:170) > at org.codehaus.jettison.json.JSONObject.(JSONObject.java:185) > at org.codehaus.jettison.json.JSONObject.(JSONObject.java:293) > at > org.apache.hadoop.hive.ql.io.orc.JsonFileDump.printJsonMetaData(JsonFileDump.java:197) > at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:107) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {noformat} > hive --orcfiledump -j -p /tmp/orc/inventory/inv_date_sk=2452654 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
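The two-step fix described in the comment above (exit early on an empty file list, and always close the JSON object) can be sketched with a small, self-contained stand-in. Everything here is illustrative: the class name, the string-based writer, and the JSON layout are assumptions, not the actual JsonFileDump code:

```java
import java.util.List;

// Hypothetical, simplified stand-in for JsonFileDump.printJsonMetaData.
class JsonDumpSketch {

  // Returns a well-formed JSON object even when no input files exist.
  public static String printJsonMetaData(List<String> files) {
    StringBuilder writer = new StringBuilder("{");
    // Fix (1): return early when the file list is null or empty, closing
    // the object so the output is never left in an incomplete state.
    if (files == null || files.isEmpty()) {
      return writer.append("}").toString();
    }
    writer.append("\"fileCount\":").append(files.size());
    // Fix (2): always close the object before returning ("done").
    return writer.append("}").toString();
  }
}
```

The original NPE arose because the incomplete output was later re-parsed as JSON; guaranteeing a closed object on every path removes that failure mode.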
[jira] [Updated] (HIVE-12666) PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes dynamic partition pruner generated synthetic join predicates.
[ https://issues.apache.org/jira/browse/HIVE-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12666: - Attachment: (was: HIVE-12666.1.patch) > PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes > dynamic partition pruner generated synthetic join predicates. > > > Key: HIVE-12666 > URL: https://issues.apache.org/jira/browse/HIVE-12666 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Blocker > Attachments: HIVE-12666.1.patch > > > Introduced by HIVE-11634. The original idea in HIVE-11634 was to remove the > IN partition conditions from the predicate list since the static dynamic > partitioning would kick in and push these predicates down to metastore. > However, the check is too aggressive and removes events such as below : > {code} > -Select Operator > - expressions: UDFToDouble(UDFToInteger((hr / 2))) > (type: double) > - outputColumnNames: _col0 > - Statistics: Num rows: 1 Data size: 7 Basic stats: > COMPLETE Column stats: NONE > - Group By Operator > -keys: _col0 (type: double) > -mode: hash > -outputColumnNames: _col0 > -Statistics: Num rows: 1 Data size: 7 Basic stats: > COMPLETE Column stats: NONE > -Dynamic Partitioning Event Operator > - Target Input: srcpart > - Partition key expr: UDFToDouble(hr) > - Statistics: Num rows: 1 Data size: 7 Basic stats: > COMPLETE Column stats: NONE > - Target column: hr > - Target Vertex: Map 1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12666) PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes dynamic partition pruner generated synthetic join predicates.
[ https://issues.apache.org/jira/browse/HIVE-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12666: - Attachment: HIVE-12666.1.patch > PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes > dynamic partition pruner generated synthetic join predicates. > > > Key: HIVE-12666 > URL: https://issues.apache.org/jira/browse/HIVE-12666 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Blocker > Attachments: HIVE-12666.1.patch > > > Introduced by HIVE-11634. The original idea in HIVE-11634 was to remove the > IN partition conditions from the predicate list since the static dynamic > partitioning would kick in and push these predicates down to metastore. > However, the check is too aggressive and removes events such as below : > {code} > -Select Operator > - expressions: UDFToDouble(UDFToInteger((hr / 2))) > (type: double) > - outputColumnNames: _col0 > - Statistics: Num rows: 1 Data size: 7 Basic stats: > COMPLETE Column stats: NONE > - Group By Operator > -keys: _col0 (type: double) > -mode: hash > -outputColumnNames: _col0 > -Statistics: Num rows: 1 Data size: 7 Basic stats: > COMPLETE Column stats: NONE > -Dynamic Partitioning Event Operator > - Target Input: srcpart > - Partition key expr: UDFToDouble(hr) > - Statistics: Num rows: 1 Data size: 7 Basic stats: > COMPLETE Column stats: NONE > - Target column: hr > - Target Vertex: Map 1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12666) PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes dynamic partition pruner generated synthetic join predicates.
[ https://issues.apache.org/jira/browse/HIVE-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057061#comment-15057061 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-12666: -- This includes: 1. Revert HIVE-12462 (change in 2. Correct fix made in PcrExprProcFactory. 3. Merged diff files. cc-ing [~hagleitn], [~sershe], [~gopalv], [~jpullokkaran]: can you please review this patch? Thanks, Hari > PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes > dynamic partition pruner generated synthetic join predicates. > > > Key: HIVE-12666 > URL: https://issues.apache.org/jira/browse/HIVE-12666 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Blocker > Attachments: HIVE-12666.1.patch > > > Introduced by HIVE-11634. The original idea in HIVE-11634 was to remove the > IN partition conditions from the predicate list since the static dynamic > partitioning would kick in and push these predicates down to metastore. > However, the check is too aggressive and removes events such as below : > {code} > -Select Operator > - expressions: UDFToDouble(UDFToInteger((hr / 2))) > (type: double) > - outputColumnNames: _col0 > - Statistics: Num rows: 1 Data size: 7 Basic stats: > COMPLETE Column stats: NONE > - Group By Operator > -keys: _col0 (type: double) > -mode: hash > -outputColumnNames: _col0 > -Statistics: Num rows: 1 Data size: 7 Basic stats: > COMPLETE Column stats: NONE > -Dynamic Partitioning Event Operator > - Target Input: srcpart > - Partition key expr: UDFToDouble(hr) > - Statistics: Num rows: 1 Data size: 7 Basic stats: > COMPLETE Column stats: NONE > - Target column: hr > - Target Vertex: Map 1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL
[ https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057120#comment-15057120 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-12462: -- I've uploaded the patch with the potential fix for this issue in HIVE-12666; it would be great if someone could review it. I am also reverting this patch as part of the fix. Thanks, Hari > DPP: DPP optimizers need to run on the TS predicate not FIL > > > Key: HIVE-12462 > URL: https://issues.apache.org/jira/browse/HIVE-12462 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Blocker > Fix For: 2.0.0 > > Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch > > > With HIVE-11398 + HIVE-11791, the partition-condition-remover became more > effective. > This removes predicates from the FilterExpression which involve partition > columns, causing a miss for dynamic-partition pruning if the DPP relies on > FilterDesc. > The TS desc will have the correct predicate in that condition. > {code} > $hdt$_0:$hdt$_1:a > TableScan (TS_2) > alias: a > filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) > IN (RS[6])) (type: boolean) > Filter Operator (FIL_20) > predicate: ((account_id = 22) and year(dt) is not null) (type: boolean) > Select Operator (SEL_4) > expressions: dt (type: date) > outputColumnNames: _col1 > Reduce Output Operator (RS_8) > key expressions: year(_col1) (type: int) > sort order: + > Map-reduce partition columns: year(_col1) (type: int) > Join Operator (JOIN_9) > condition map: > Inner Join 0 to 1 > keys: > 0 year(_col1) (type: int) > 1 year(_col1) (type: int) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12658) Task rejection by an llap daemon spams the log with RejectedExecutionExceptions
[ https://issues.apache.org/jira/browse/HIVE-12658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12658: - Attachment: HIVE-12658.1.patch > Task rejection by an llap daemon spams the log with > RejectedExecutionExceptions > --- > > Key: HIVE-12658 > URL: https://issues.apache.org/jira/browse/HIVE-12658 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Prasanth Jayachandran > Attachments: HIVE-12658.1.patch > > > The execution queue throws a RejectedExecutionException - which is logged by > the hadoop IPC layer. > Instead of relying on an Exception in the protocol - move to sending back an > explicit response to indicate a rejected fragment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
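The description above suggests replacing the exception-based protocol with an explicit response. A minimal sketch of that idea follows; the SubmissionState enum and submitWork method are hypothetical names, not the actual LLAP API:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;

// Sketch: translate the executor's RejectedExecutionException into an
// explicit submission state, so the caller receives a normal response and
// the IPC layer never sees (and logs) the exception.
class RejectionSketch {
  public enum SubmissionState { ACCEPTED, REJECTED }

  private final ExecutorService executor;

  RejectionSketch(ExecutorService executor) {
    this.executor = executor;
  }

  public SubmissionState submitWork(Runnable fragment) {
    try {
      executor.execute(fragment);
      return SubmissionState.ACCEPTED;
    } catch (RejectedExecutionException e) {
      // Swallow the exception locally; the caller decides how to handle
      // a rejected fragment (e.g. retry on another daemon).
      return SubmissionState.REJECTED;
    }
  }
}
```

The design point is that rejection under load is an expected outcome, so it belongs in the response payload rather than in an exception that gets stack-traced into the logs.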
[jira] [Updated] (HIVE-12675) PerfLogger should log performance metrics at debug level
[ https://issues.apache.org/jira/browse/HIVE-12675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12675: - Description: As more and more subcomponents of Hive (Tez, Optimizer) etc are using PerfLogger to track the performance metrics, it will be more meaningful to set the PerfLogger logging level to DEBUG. Otherwise, we will print the performance metrics unnecessarily for each and every query if the underlying subcomponent does not control the PerfLogging via a parameter on its own. (was: As more and more subcomponents are Hive (Tez, Optimizer) etc are using PerfLogger to track the performance metrics, it will be more meaningful to set the PerfLogger logging level to DEBUG. Otherwise, we will print the performance metrics unnecessarily for each and every query if the underlying subcomponent does not control the PerfLogging via a parameter.) > PerfLogger should log performance metrics at debug level > > > Key: HIVE-12675 > URL: https://issues.apache.org/jira/browse/HIVE-12675 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > > As more and more subcomponents of Hive (Tez, Optimizer) etc are using > PerfLogger to track the performance metrics, it will be more meaningful to > set the PerfLogger logging level to DEBUG. Otherwise, we will print the > performance metrics unnecessarily for each and every query if the underlying > subcomponent does not control the PerfLogging via a parameter on its own. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9907) insert into table values() when UTF-8 character is not correct
[ https://issues.apache.org/jira/browse/HIVE-9907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057197#comment-15057197 ] niklaus xiao commented on HIVE-9907: The attached patch solves this issue; please check. > insert into table values() when UTF-8 character is not correct > > > Key: HIVE-9907 > URL: https://issues.apache.org/jira/browse/HIVE-9907 > Project: Hive > Issue Type: Bug > Components: CLI, Clients, JDBC >Affects Versions: 0.14.0, 0.13.1, 1.0.0 > Environment: centos 6 LANG=zh_CN.UTF-8 > hadoop 2.6 > hive 1.1.0 >Reporter: Fanhong Li >Priority: Critical > Attachments: HIVE-9907.1.patch > > > insert into table test_acid partition(pt='pt_2') > values( 2, '中文_2' , 'city_2' ) > ; > hive> select * > > from test_acid > > ; > OK > 2 -�_2city_2 pt_2 > Time taken: 0.237 seconds, Fetched: 1 row(s) > hive> > CREATE TABLE test_acid(id INT, > name STRING, > city STRING) > PARTITIONED BY (pt STRING) > clustered by (id) into 1 buckets > stored as ORCFILE > TBLPROPERTIES('transactional'='true') > ; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9907) insert into table values() when UTF-8 character is not correct
[ https://issues.apache.org/jira/browse/HIVE-9907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niklaus xiao updated HIVE-9907: --- Attachment: HIVE-9907.1.patch > insert into table values() when UTF-8 character is not correct > > > Key: HIVE-9907 > URL: https://issues.apache.org/jira/browse/HIVE-9907 > Project: Hive > Issue Type: Bug > Components: CLI, Clients, JDBC >Affects Versions: 0.14.0, 0.13.1, 1.0.0 > Environment: centos 6 LANG=zh_CN.UTF-8 > hadoop 2.6 > hive 1.1.0 >Reporter: Fanhong Li >Priority: Critical > Attachments: HIVE-9907.1.patch > > > insert into table test_acid partition(pt='pt_2') > values( 2, '中文_2' , 'city_2' ) > ; > hive> select * > > from test_acid > > ; > OK > 2 -�_2city_2 pt_2 > Time taken: 0.237 seconds, Fetched: 1 row(s) > hive> > CREATE TABLE test_acid(id INT, > name STRING, > city STRING) > PARTITIONED BY (pt STRING) > clustered by (id) into 1 buckets > stored as ORCFILE > TBLPROPERTIES('transactional'='true') > ; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
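The garbled query output in the report above is the classic symptom of UTF-8 bytes being decoded with a single-byte charset somewhere in the write or read path. A minimal, Hive-independent demonstration of the failure mode (the class name is illustrative):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Sketch: encode a string as UTF-8, then decode it with an arbitrary
// charset. Decoding with anything other than UTF-8 garbles multi-byte
// characters such as '中文'.
class Utf8Sketch {
  public static String roundTrip(String s, Charset decodeAs) {
    return new String(s.getBytes(StandardCharsets.UTF_8), decodeAs);
  }
}
```

A fix for bugs of this shape is to pin the charset explicitly (UTF-8 on both sides) instead of relying on the platform default, which on a misconfigured host may not match the client's LANG setting.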
[jira] [Commented] (HIVE-12616) NullPointerException when spark session is reused to run a mapjoin
[ https://issues.apache.org/jira/browse/HIVE-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057239#comment-15057239 ] Xuefu Zhang commented on HIVE-12616: +1. [~nemon], could you create a follow-up JIRA, based on the discussion here, to cover the problems that this patch didn't address? Thanks. > NullPointerException when spark session is reused to run a mapjoin > -- > > Key: HIVE-12616 > URL: https://issues.apache.org/jira/browse/HIVE-12616 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 1.3.0 >Reporter: Nemon Lou >Assignee: Nemon Lou > Attachments: HIVE-12616.1.patch, HIVE-12616.2.patch, HIVE-12616.patch > > > The way to reproduce: > {noformat} > set hive.execution.engine=spark; > create table if not exists test(id int); > create table if not exists test1(id int); > insert into test values(1); > insert into test1 values(1); > select max(a.id) from test a ,test1 b > where a.id = b.id; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12658) Task rejection by an llap daemon spams the log with RejectedExecutionExceptions
[ https://issues.apache.org/jira/browse/HIVE-12658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057255#comment-15057255 ] Prasanth Jayachandran commented on HIVE-12658: -- [~sseth] Could you take a look please? > Task rejection by an llap daemon spams the log with > RejectedExecutionExceptions > --- > > Key: HIVE-12658 > URL: https://issues.apache.org/jira/browse/HIVE-12658 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Prasanth Jayachandran > Attachments: HIVE-12658.1.patch > > > The execution queue throws a RejectedExecutionException - which is logged by > the hadoop IPC layer. > Instead of relying on an Exception in the protocol - move to sending back an > explicit response to indicate a rejected fragment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12673) Orcfiledump throws NPE when no files are available
[ https://issues.apache.org/jira/browse/HIVE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12673: - Assignee: Rajesh Balamohan (was: Prasanth Jayachandran) > Orcfiledump throws NPE when no files are available > -- > > Key: HIVE-12673 > URL: https://issues.apache.org/jira/browse/HIVE-12673 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HIVE-12673.1.patch > > > {noformat} > Exception in thread "main" java.lang.NullPointerException > at org.codehaus.jettison.json.JSONTokener.more(JSONTokener.java:106) > at org.codehaus.jettison.json.JSONTokener.next(JSONTokener.java:116) > at > org.codehaus.jettison.json.JSONTokener.nextClean(JSONTokener.java:170) > at org.codehaus.jettison.json.JSONObject.(JSONObject.java:185) > at org.codehaus.jettison.json.JSONObject.(JSONObject.java:293) > at > org.apache.hadoop.hive.ql.io.orc.JsonFileDump.printJsonMetaData(JsonFileDump.java:197) > at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:107) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {noformat} > hive --orcfiledump -j -p /tmp/orc/inventory/inv_date_sk=2452654 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12664) Bug in reduce deduplication optimization causing ArrayOutOfBoundException
[ https://issues.apache.org/jira/browse/HIVE-12664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Johan Gustavsson updated HIVE-12664: Attachment: HIVE-12664-1.patch The original patch wasn't against trunk... sorry about that. > Bug in reduce deduplication optimization causing ArrayOutOfBoundException > - > > Key: HIVE-12664 > URL: https://issues.apache.org/jira/browse/HIVE-12664 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.1.1, 1.2.1 >Reporter: Johan Gustavsson >Assignee: Johan Gustavsson > Attachments: HIVE-12664-1.patch, HIVE-12664.patch > > > The optimisation check for reduce deduplication only checks the first child > node for a join (and the check itself also contains a major bug), causing an > ArrayOutOfBoundException no matter what. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
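Going only by the one-line description above (the actual change is in the attached patches), the shape of the fix would be to scan every child node for a join, with a null guard, rather than inspecting only the first child. A hypothetical sketch with operator types reduced to strings:

```java
import java.util.List;

// Illustrative stand-in for the child-node check; the real Hive operator
// tree and the committed patch differ from this.
class DedupCheckSketch {

  // Returns true if any child operator is a join, instead of testing
  // only children.get(0) (which also throws when the list is empty).
  public static boolean anyChildIsJoin(List<String> childOperatorTypes) {
    if (childOperatorTypes == null) {
      return false;                 // no children: nothing to check
    }
    for (String type : childOperatorTypes) {
      if ("JOIN".equals(type)) {
        return true;
      }
    }
    return false;
  }
}
```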
[jira] [Commented] (HIVE-10982) Customizable the value of java.sql.statement.setFetchSize in Hive JDBC Driver
[ https://issues.apache.org/jira/browse/HIVE-10982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057168#comment-15057168 ] Bing Li commented on HIVE-10982: [~alangates] and [~thejas], thank you for the review! > Customizable the value of java.sql.statement.setFetchSize in Hive JDBC Driver > -- > > Key: HIVE-10982 > URL: https://issues.apache.org/jira/browse/HIVE-10982 > Project: Hive > Issue Type: Improvement > Components: JDBC >Affects Versions: 1.2.0, 1.2.1 >Reporter: Bing Li >Assignee: Bing Li >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-10982.1.patch, HIVE-10982.2.patch, > HIVE-10982.3.patch > > > The current JDBC driver for Hive hard-codes the value of setFetchSize to 50, > which will be a bottleneck for performance. > Pentaho filed this issue as http://jira.pentaho.com/browse/PDI-11511, whose > status is open. > There is also discussion in > http://forums.pentaho.com/showthread.php?158381-Hive-JDBC-Query-too-slow-too-many-fetches-after-query-execution-Kettle-Xform > http://mail-archives.apache.org/mod_mbox/hive-user/201307.mbox/%3ccacq46vevgrfqg5rwxnr1psgyz7dcf07mvlo8mm2qit3anm1...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
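The improvement described above can be sketched as reading the fetch size from a connection property and falling back to the old hard-coded 50. The property name "fetchSize" and the validation below are assumptions for illustration, not necessarily what the committed patch uses:

```java
import java.util.Properties;

// Sketch: resolve the JDBC statement fetch size from connection
// parameters, preserving the historical default when unset.
class FetchSizeSketch {
  static final int DEFAULT_FETCH_SIZE = 50;

  public static int resolveFetchSize(Properties connParams) {
    String v = connParams.getProperty("fetchSize");
    if (v == null) {
      return DEFAULT_FETCH_SIZE;    // old behavior by default
    }
    int size = Integer.parseInt(v.trim());
    if (size <= 0) {
      throw new IllegalArgumentException("fetchSize must be positive: " + size);
    }
    return size;
  }
}
```

The resolved value would then be passed to java.sql.Statement.setFetchSize, letting clients trade memory for fewer round trips on large result sets.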
[jira] [Commented] (HIVE-12570) Incorrect error message Expression not in GROUP BY key thrown instead of Invalid function
[ https://issues.apache.org/jira/browse/HIVE-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057001#comment-15057001 ] Laljo John Pullokkaran commented on HIVE-12570: --- +1 > Incorrect error message Expression not in GROUP BY key thrown instead of > Invalid function > - > > Key: HIVE-12570 > URL: https://issues.apache.org/jira/browse/HIVE-12570 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12570.1.patch, HIVE-12570.2.patch, > HIVE-12570.3.patch, HIVE-12570.4.patch, HIVE-12570.5.patch > > > {code} > explain create table avg_salary_by_supervisor3 as select average(key) as > key_avg from src group by value; > {code} > We get the following stack trace : > {code} > FAILED: SemanticException [Error 10025]: Line 1:57 Expression not in GROUP BY > key 'key' > ERROR ql.Driver: FAILED: SemanticException [Error 10025]: Line 1:57 > Expression not in GROUP BY key 'key' > org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:57 Expression not > in GROUP BY key 'key' > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10484) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10432) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3824) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3603) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8862) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8817) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9668) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9561) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10053) > at > 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:345) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10064) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:222) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:462) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1227) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1276) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1152) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1140) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:778) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:717) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {code} > Instead of the above error message, it would be more appropriate to throw the below > error : > ERROR ql.Driver: FAILED: SemanticException [Error 10011]: Line 1:58 Invalid
> function 'average' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-11687) TaskExecutorService can reject work even if capacity is available
[ https://issues.apache.org/jira/browse/HIVE-11687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reassigned HIVE-11687: Assignee: Prasanth Jayachandran > TaskExecutorService can reject work even if capacity is available > - > > Key: HIVE-11687 > URL: https://issues.apache.org/jira/browse/HIVE-11687 > Project: Hive > Issue Type: Sub-task > Components: llap >Affects Versions: llap >Reporter: Siddharth Seth >Assignee: Prasanth Jayachandran > Fix For: llap > > > The waitQueue has a fixed capacity - which is the wait queue size. Addition > of new work doe snot factor in the capacity available to execute work. This > ends up being left to the race between work getting scheduled for execution > and added to the waitQueue. > cc [~prasanth_j] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
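The admission check implied by the description above would consider free executor slots in addition to wait-queue slots, rather than testing the wait queue alone. A hedged sketch; the parameter names are illustrative and the real TaskExecutorService logic (scheduling race included) is more involved:

```java
// Sketch: accept new work if either an executor is idle or the wait queue
// has room, so available execution capacity is factored into admission.
class CapacitySketch {
  public static boolean canAccept(int runningTasks, int queuedTasks,
                                  int numExecutors, int waitQueueSize) {
    return runningTasks < numExecutors || queuedTasks < waitQueueSize;
  }
}
```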
[jira] [Commented] (HIVE-12673) Orcfiledump throws NPE when no files are available
[ https://issues.apache.org/jira/browse/HIVE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057069#comment-15057069 ] Prasanth Jayachandran commented on HIVE-12673: -- +1 > Orcfiledump throws NPE when no files are available > -- > > Key: HIVE-12673 > URL: https://issues.apache.org/jira/browse/HIVE-12673 > Project: Hive > Issue Type: Bug >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan > Attachments: HIVE-12673.1.patch, HIVE-12673.2.patch > > > {noformat} > Exception in thread "main" java.lang.NullPointerException > at org.codehaus.jettison.json.JSONTokener.more(JSONTokener.java:106) > at org.codehaus.jettison.json.JSONTokener.next(JSONTokener.java:116) > at > org.codehaus.jettison.json.JSONTokener.nextClean(JSONTokener.java:170) > at org.codehaus.jettison.json.JSONObject.(JSONObject.java:185) > at org.codehaus.jettison.json.JSONObject.(JSONObject.java:293) > at > org.apache.hadoop.hive.ql.io.orc.JsonFileDump.printJsonMetaData(JsonFileDump.java:197) > at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:107) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {noformat} > hive --orcfiledump -j -p /tmp/orc/inventory/inv_date_sk=2452654 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10632) Make sure TXN_COMPONENTS gets cleaned up if table is dropped before compaction.
[ https://issues.apache.org/jira/browse/HIVE-10632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057188#comment-15057188 ] Eugene Koifman commented on HIVE-10632: --- The right way to do this is to add a MetaStoreListener to clean up Acid related metastore tables on dropTable/Partition. > Make sure TXN_COMPONENTS gets cleaned up if table is dropped before > compaction. > --- > > Key: HIVE-10632 > URL: https://issues.apache.org/jira/browse/HIVE-10632 > Project: Hive > Issue Type: Bug > Components: Metastore, Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > > The compaction process will clean up entries in TXNS, > COMPLETED_TXN_COMPONENTS, TXN_COMPONENTS. If the table/partition is dropped > before compaction is complete there will be data left in these tables. Need > to investigate if there are other situations where this may happen and > address it. > see HIVE-10595 for additional info -- This message was sent by Atlassian JIRA (v6.3.4#6332)
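Eugene's suggested fix is essentially the metastore listener pattern: a callback fired on dropTable/dropPartition that deletes the matching ACID bookkeeping rows. A minimal self-contained sketch of that shape, where TxnComponentsStore, AcidCleanupListener, and onDropTable are stand-in names rather than the real Hive metastore API:

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the ACID bookkeeping tables (TXN_COMPONENTS etc.):
// maps a table name to its pending compaction/txn rows.
class TxnComponentsStore {
    final Map<String, Integer> pendingRows = new HashMap<>();
}

// Hypothetical listener: invoked by the metastore after a table is
// dropped, so ACID metadata never outlives the table it describes.
class AcidCleanupListener {
    private final TxnComponentsStore store;

    AcidCleanupListener(TxnComponentsStore store) {
        this.store = store;
    }

    void onDropTable(String tableName) {
        store.pendingRows.remove(tableName);
    }
}
```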
[jira] [Commented] (HIVE-12577) NPE in LlapTaskCommunicator when unregistering containers
[ https://issues.apache.org/jira/browse/HIVE-12577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057199#comment-15057199 ] Sergey Shelukhin commented on HIVE-12577: - Tracking time using the currentMilliseconds call is fraught with peril: the machine clock can move and cause weird behavior. Nits: the caller of BiMap getContainerAttemptMapForNode(String hostname, int port) creates a NodeId but passes the name and port to the method, which also creates an id; getContext() is called twice; there appear to be some indentation issues. > NPE in LlapTaskCommunicator when unregistering containers > - > > Key: HIVE-12577 > URL: https://issues.apache.org/jira/browse/HIVE-12577 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.0.0 >Reporter: Siddharth Seth >Assignee: Siddharth Seth >Priority: Critical > Attachments: HIVE-12577.1.review.txt, HIVE-12577.1.txt, > HIVE-12577.1.wip.txt > > > {code} > 2015-12-02 13:29:00,160 [ERROR] [Dispatcher thread {Central}] > |common.AsyncDispatcher|: Error in dispatcher thread > java.lang.NullPointerException > at > org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator$EntityTracker.unregisterContainer(LlapTaskCommunicator.java:586) > at > org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator.registerContainerEnd(LlapTaskCommunicator.java:188) > at > org.apache.tez.dag.app.TaskCommunicatorManager.unregisterRunningContainer(TaskCommunicatorManager.java:389) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl.unregisterFromTAListener(AMContainerImpl.java:1121) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtLaunchingTransition.transition(AMContainerImpl.java:699) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtIdleTransition.transition(AMContainerImpl.java:805) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:892) > at > 
org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:887) > at > org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:415) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:72) > at > org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:60) > at > org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:36) > at > org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) > at > org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114) > at java.lang.Thread.run(Thread.java:745) > 2015-12-02 13:29:00,167 [ERROR] [Dispatcher thread {Central}] > |common.AsyncDispatcher|: Error in dispatcher thread > java.lang.NullPointerException > at > org.apache.tez.dag.app.TaskCommunicatorManager.unregisterRunningContainer(TaskCommunicatorManager.java:386) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl.unregisterFromTAListener(AMContainerImpl.java:1121) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtLaunchingTransition.transition(AMContainerImpl.java:699) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtIdleTransition.transition(AMContainerImpl.java:805) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:892) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:887) > 
at > org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at >
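Sergey's review note about currentTimeMillis can be illustrated with a small sketch: measure elapsed time with the monotonic System.nanoTime() rather than the wall clock, which can jump backwards or forwards when the machine clock is adjusted. The Stopwatch name below is illustrative, not code from the patch.

```java
import java.util.concurrent.TimeUnit;

// Monotonic elapsed-time tracking: nanoTime deltas stay valid even if
// the wall clock (System.currentTimeMillis) is adjusted mid-measurement.
class Stopwatch {
    private final long startNanos = System.nanoTime();

    long elapsedMillis() {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
    }
}
```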
[jira] [Updated] (HIVE-12666) PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes dynamic partition pruner generated synthetic join predicates.
[ https://issues.apache.org/jira/browse/HIVE-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12666: - Attachment: HIVE-12666.1.patch > PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes > dynamic partition pruner generated synthetic join predicates. > > > Key: HIVE-12666 > URL: https://issues.apache.org/jira/browse/HIVE-12666 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Blocker > Attachments: HIVE-12666.1.patch > > > Introduced by HIVE-11634. The original idea in HIVE-11634 was to remove the > IN partition conditions from the predicate list since the static dynamic > partitioning would kick in and push these predicates down to metastore. > However, the check is too aggressive and removes events such as below : > {code} > -Select Operator > - expressions: UDFToDouble(UDFToInteger((hr / 2))) > (type: double) > - outputColumnNames: _col0 > - Statistics: Num rows: 1 Data size: 7 Basic stats: > COMPLETE Column stats: NONE > - Group By Operator > -keys: _col0 (type: double) > -mode: hash > -outputColumnNames: _col0 > -Statistics: Num rows: 1 Data size: 7 Basic stats: > COMPLETE Column stats: NONE > -Dynamic Partitioning Event Operator > - Target Input: srcpart > - Partition key expr: UDFToDouble(hr) > - Statistics: Num rows: 1 Data size: 7 Basic stats: > COMPLETE Column stats: NONE > - Target column: hr > - Target Vertex: Map 1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11736) while creating this hcatalog table then getting this error
[ https://issues.apache.org/jira/browse/HIVE-11736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] niklaus xiao resolved HIVE-11736. - Resolution: Not A Problem > while creating this hcatalog table then getting this error > > > Key: HIVE-11736 > URL: https://issues.apache.org/jira/browse/HIVE-11736 > Project: Hive > Issue Type: Bug >Reporter: Sadeek Mohammad >Priority: Blocker > > HCatClient error on create table: {"statement":"use default; create table > batting_data(`playerid` string, `yearid` int, `stint` bigint, `teamid` > string, `lgid` string, `g` bigint, `g_batting` bigint, `ab` bigint, `r` > bigint, `h` bigint, `2b` bigint, `3b` bigint, `hr` bigint, `rbi` bigint, `sb` > bigint, `cs` bigint, `bb` bigint, `so` bigint, `ibb` bigint, `hbp` bigint, > `sh` bigint, `sf` bigint, `gidp` bigint, `g_old` bigint) row format delimited > fields terminated by ',';","error":"unable to create table: > batting_data","exec":{"stdout":"","stderr":"which: no > /usr/hdp/2.2.4.2-2//hadoop/bin/hadoop.distro in ((null))\ndirname: missing > operand\nTry `dirname --help' for more information.\nSLF4J: Class path > contains multiple SLF4J bindings.\nSLF4J: Found binding in > [jar:file:/usr/hdp/2.2.4.2-2/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]\nSLF4J: > Found binding in > [jar:file:/usr/hdp/2.2.4.2-2/hive/lib/hive-jdbc-0.14.0.2.2.4.2-2-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]\nSLF4J: > See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation.\nSLF4J: Actual binding is of type > [org.slf4j.impl.Log4jLoggerFactory]\n Command was terminated due to > timeout(6ms). See templeton.exec.timeout property","exitcode":143}} > (error 500) > any help is appreciated -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12366) Refactor Heartbeater logic for transaction
[ https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057154#comment-15057154 ] Hive QA commented on HIVE-12366: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777551/HIVE-12366.8.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6352/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6352/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6352/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-6352/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p 
maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin >From https://github.com/apache/hive f14b3c6..e2c8bfa branch-1 -> origin/branch-1 866b236..d8ee05a branch-2.0 -> origin/branch-2.0 23f78cc..c5b2c0e master -> origin/master + git reset --hard HEAD HEAD is now at 23f78cc HIVE-12435 SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled. (Matt McCline, reviewed by Prasanth J) + git clean -f -d Removing ql/src/test/queries/clientnegative/invalid_select_fn.q Removing ql/src/test/results/clientnegative/invalid_select_fn.q.out + git checkout master Already on 'master' Your branch is behind 'origin/master' by 3 commits, and can be fast-forwarded. + git reset --hard origin/master HEAD is now at c5b2c0e HIVE-12526 : PerfLogger for hive compiler and optimizer (Hari Subramaniyan, reviewed by Jesus Camacho Rodriguez) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12777551 - PreCommit-HIVE-TRUNK-Build > Refactor Heartbeater logic for transaction > -- > > Key: HIVE-12366 > URL: https://issues.apache.org/jira/browse/HIVE-12366 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, > HIVE-12366.3.patch, HIVE-12366.4.patch, HIVE-12366.5.patch, > HIVE-12366.6.patch, HIVE-12366.7.patch, HIVE-12366.8.patch > > > Currently there is a gap between the time locks acquisition and the first > heartbeat being sent out. Normally the gap is negligible, but when it's big > it will cause query fail since the locks are timed out by the time the > heartbeat is sent. > Need to remove this gap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
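The gap HIVE-12366 describes can be sketched by scheduling the first heartbeat with zero initial delay at lock-acquisition time, so the transaction is heartbeated immediately rather than one interval later. This is an illustration of the idea only; Heartbeater.start and the interval choice (half the transaction timeout) are assumptions, not the actual Hive implementation.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: first heartbeat fires immediately (initialDelay = 0),
// then repeats every txnTimeout/2, closing the lock-acquisition gap.
class Heartbeater {
    static ScheduledExecutorService start(Runnable sendHeartbeat, long txnTimeoutMs) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(sendHeartbeat, 0, txnTimeoutMs / 2, TimeUnit.MILLISECONDS);
        return ses;
    }
}
```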
[jira] [Commented] (HIVE-12675) PerfLogger should log performance metrics at debug level
[ https://issues.apache.org/jira/browse/HIVE-12675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057213#comment-15057213 ] Sergey Shelukhin commented on HIVE-12675: - I don't know if this is a good idea. With DEBUG level, the amount of other logging will increase dramatically, which affects perf a lot with log4j. Many people run clusters at WARN level because even INFO may be too much for perf. Without crafting a special logging configuration, it would only be possible to see PerfLogger output at the cost of the perf loss from DEBUG logging. > PerfLogger should log performance metrics at debug level > > > Key: HIVE-12675 > URL: https://issues.apache.org/jira/browse/HIVE-12675 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12675.1.patch > > > As more and more subcomponents of Hive (Tez, Optimizer) etc are using > PerfLogger to track the performance metrics, it will be more meaningful to > set the PerfLogger logging level to DEBUG. Otherwise, we will print the > performance metrics unnecessarily for each and every query if the underlying > subcomponent does not control the PerfLogging via a parameter on its own. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
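The "special logging configuration" Sergey mentions would look roughly like the fragment below, assuming a log4j 1.x properties file and the `org.apache.hadoop.hive.ql.log.PerfLogger` logger name; both are assumptions, and the exact file name and appender depend on the deployment. It raises only the PerfLogger to DEBUG while the rest of the cluster stays at WARN, avoiding the blanket perf loss of a global DEBUG level.

```properties
# Hypothetical hive-log4j.properties fragment: only the PerfLogger
# logs at DEBUG; everything else stays at WARN.
log4j.rootLogger=WARN,console
log4j.logger.org.apache.hadoop.hive.ql.log.PerfLogger=DEBUG
```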
[jira] [Commented] (HIVE-12633) LLAP: package included serde jars
[ https://issues.apache.org/jira/browse/HIVE-12633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057056#comment-15057056 ] Sergey Shelukhin commented on HIVE-12633: - [~vikram.dixit] can you take a look? > LLAP: package included serde jars > - > > Key: HIVE-12633 > URL: https://issues.apache.org/jira/browse/HIVE-12633 > Project: Hive > Issue Type: Bug >Reporter: Takahiko Saito >Assignee: Sergey Shelukhin > Attachments: HIVE-12633.01.patch, HIVE-12633.02.patch, > HIVE-12633.patch > > > Some SerDes like JSONSerde are not packaged with LLAP. One cannot localize > jars on the daemon (due to security consideration if nothing else), so we > should package them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12447) Fix LlapTaskReporter post TEZ-808 changes
[ https://issues.apache.org/jira/browse/HIVE-12447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057115#comment-15057115 ] Sergey Shelukhin commented on HIVE-12447: - +1 > Fix LlapTaskReporter post TEZ-808 changes > - > > Key: HIVE-12447 > URL: https://issues.apache.org/jira/browse/HIVE-12447 > Project: Hive > Issue Type: Sub-task > Components: llap >Affects Versions: 2.0.0 >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-12447.1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12570) Incorrect error message Expression not in GROUP BY key thrown instead of Invalid function
[ https://issues.apache.org/jira/browse/HIVE-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057151#comment-15057151 ] Hive QA commented on HIVE-12570: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777536/HIVE-12570.5.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 9870 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-vectorized_parquet.q-orc_merge6.q-vector_outer_join0.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6351/testReport 
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6351/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6351/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 18 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12777536 - PreCommit-HIVE-TRUNK-Build > Incorrect error message Expression not in GROUP BY key thrown instead of > Invalid function > - > > Key: HIVE-12570 > URL: https://issues.apache.org/jira/browse/HIVE-12570 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Fix For: 2.0.0 > > Attachments: HIVE-12570.1.patch, HIVE-12570.2.patch, > HIVE-12570.3.patch, HIVE-12570.4.patch, HIVE-12570.5.patch > > > {code} > explain create table avg_salary_by_supervisor3 as select average(key) as > key_avg from src group by value; > {code} > We get the following stack trace : > {code} > FAILED: SemanticException [Error 10025]: Line 1:57 Expression not in GROUP BY > key 'key' > ERROR ql.Driver: FAILED: SemanticException [Error 10025]: Line 1:57 > Expression not in GROUP BY key 'key' > org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:57 Expression not > in GROUP BY key 'key' > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10484) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10432) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3824) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3603) > at > 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8862) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8817) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9668) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9561) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10053) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:345) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10064) > at >
[jira] [Updated] (HIVE-12664) Bug in reduce deduplication optimization causing ArrayOutOfBoundException
[ https://issues.apache.org/jira/browse/HIVE-12664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Johan Gustavsson updated HIVE-12664: Description: The optimisation check for reduce deduplication only checks the first child node for join -and the check itself also contains a major bug- causing ArrayOutOfBoundException no matter what. (was: The optimisation check for reduce deduplication only checks the first child node for join and the check itself also contains a major bug causing ArrayOutOfBoundException no matter what.) > Bug in reduce deduplication optimization causing ArrayOutOfBoundException > - > > Key: HIVE-12664 > URL: https://issues.apache.org/jira/browse/HIVE-12664 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.1.1, 1.2.1 >Reporter: Johan Gustavsson >Assignee: Johan Gustavsson > Attachments: HIVE-12664.patch > > > The optimisation check for reduce deduplication only checks the first child > node for join -and the check itself also contains a major bug- causing > ArrayOutOfBoundException no matter what. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057153#comment-15057153 ] Ashutosh Chauhan commented on HIVE-11107: - * We only need rawDS and numRows fields. All extra fields aren't needed. * I don't see much value in TestPerfCliDriver.vm. We can achieve its effect from TestCliDriver either by passing a mode parameter or via creating a mapping in pom.xml. +1 for the existing patch. We should take up these improvements in follow-up. > Support for Performance regression test suite with TPCDS > > > Key: HIVE-11107 > URL: https://issues.apache.org/jira/browse/HIVE-11107 > Project: Hive > Issue Type: Sub-task >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-11107.1.patch, HIVE-11107.2.patch, > HIVE-11107.3.patch, HIVE-11107.4.patch, HIVE-11107.5.patch, > HIVE-11107.6.patch, HIVE-11107.7.patch > > > Support to add TPCDS queries to the performance regression test suite with > Hive CBO turned on. > This benchmark is intended to make sure that subsequent changes to the > optimizer or any hive code do not yield any unexpected plan changes. i.e. > the intention is to not run the entire TPCDS query set, but just "explain > plan" for the TPCDS queries. > As part of this jira, we will manually verify that expected hive > optimizations kick in for the queries (for given stats/dataset). If there is > a difference in plan within this test suite due to a future commit, it needs > to be analyzed and we need to make sure that it is not a regression. > The test suite can be run in master branch from itests by > {code} > mvn test -Dtest=TestPerfCliDriver > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12526) PerfLogger for hive compiler and optimizer
[ https://issues.apache.org/jira/browse/HIVE-12526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057166#comment-15057166 ] Sergey Shelukhin commented on HIVE-12526: - Master is 2.1.0; to have a 2.0.0 fix version, it needs to be committed to branch-2.0 > PerfLogger for hive compiler and optimizer > -- > > Key: HIVE-12526 > URL: https://issues.apache.org/jira/browse/HIVE-12526 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Fix For: 2.1.0 > > Attachments: HIVE-12526.1.patch, HIVE-12526.2.patch, > HIVE-12526.3.patch, HIVE-12526.4.patch > > > This jira is intended to use the perflogger to track compilation times and > optimization times (calcite, tez compiler, physical compiler) etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12526) PerfLogger for hive compiler and optimizer
[ https://issues.apache.org/jira/browse/HIVE-12526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12526: Fix Version/s: (was: 2.0.0) 2.1.0 > PerfLogger for hive compiler and optimizer > -- > > Key: HIVE-12526 > URL: https://issues.apache.org/jira/browse/HIVE-12526 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Fix For: 2.1.0 > > Attachments: HIVE-12526.1.patch, HIVE-12526.2.patch, > HIVE-12526.3.patch, HIVE-12526.4.patch > > > This jira is intended to use the perflogger to track compilation times and > optimization times (calcite, tez compiler, physical compiler) etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12570) Incorrect error message Expression not in GROUP BY key thrown instead of Invalid function
[ https://issues.apache.org/jira/browse/HIVE-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12570: - Fix Version/s: (was: 2.0.0) 2.1.0 > Incorrect error message Expression not in GROUP BY key thrown instead of > Invalid function > - > > Key: HIVE-12570 > URL: https://issues.apache.org/jira/browse/HIVE-12570 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Fix For: 2.1.0 > > Attachments: HIVE-12570.1.patch, HIVE-12570.2.patch, > HIVE-12570.3.patch, HIVE-12570.4.patch, HIVE-12570.5.patch > > > {code} > explain create table avg_salary_by_supervisor3 as select average(key) as > key_avg from src group by value; > {code} > We get the following stack trace : > {code} > FAILED: SemanticException [Error 10025]: Line 1:57 Expression not in GROUP BY > key 'key' > ERROR ql.Driver: FAILED: SemanticException [Error 10025]: Line 1:57 > Expression not in GROUP BY key 'key' > org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:57 Expression not > in GROUP BY key 'key' > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10484) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10432) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3824) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3603) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8862) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8817) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9668) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9561) > at > 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10053) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:345) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10064) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:222) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:462) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1227) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1276) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1152) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1140) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:778) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:717) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {code} > Instead of the above error message, it be more appropriate to throw the 
below > error : > ERROR ql.Driver: FAILED: SemanticException [Error 10011]: Line 1:58 Invalid > function 'average' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12064) prevent transactional=false
[ https://issues.apache.org/jira/browse/HIVE-12064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057183#comment-15057183 ] Eugene Koifman commented on HIVE-12064: --- The reason that testTransactionalValidation() fails in the 4 cases above is that these are all instances of TestRemoteHiveMetaStore. It passes in TestEmbeddedHiveMetaStore. The patch adds an EventListener that throws an exception which is not propagated to the client in Remote case but is propagated with Embedded. I've modified AuthorizationPreEventListener.authorizeCreateDatabase(PreCreateDatabaseEvent context) to throw if(true) { throw new MetaException("Oops"); } then when I look at TestAuthorizationPreEventListener.testListener() it does "driver.run("create database " + dbName);" but no exception is surfaced but hive.log definitely has the "Oops" exception cc [~sushanth] > prevent transactional=false > --- > > Key: HIVE-12064 > URL: https://issues.apache.org/jira/browse/HIVE-12064 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 1.0.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-12064.2.patch, HIVE-12064.patch > > > currently a tblproperty transactional=true must be set to make a table behave > in ACID compliant way. > This is misleading in that it seems like changing it to transactional=false > makes the table non-acid but on disk layout of acid table is different than > plain tables. So changing this property may cause wrong data to be returned. > Should prevent transactional=false. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12666) PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes dynamic partition pruner generated synthetic join predicates.
[ https://issues.apache.org/jira/browse/HIVE-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057194#comment-15057194 ] Laljo John Pullokkaran commented on HIVE-12666: --- +1 conditional on QA run. > PCRExprProcFactory.GenericFuncExprProcessor.process() aggressively removes > dynamic partition pruner generated synthetic join predicates. > > > Key: HIVE-12666 > URL: https://issues.apache.org/jira/browse/HIVE-12666 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan >Priority: Blocker > Attachments: HIVE-12666.1.patch > > > Introduced by HIVE-11634. The original idea in HIVE-11634 was to remove the > IN partition conditions from the predicate list since the static dynamic > partitioning would kick in and push these predicates down to metastore. > However, the check is too aggressive and removes events such as below : > {code} > -Select Operator > - expressions: UDFToDouble(UDFToInteger((hr / 2))) > (type: double) > - outputColumnNames: _col0 > - Statistics: Num rows: 1 Data size: 7 Basic stats: > COMPLETE Column stats: NONE > - Group By Operator > -keys: _col0 (type: double) > -mode: hash > -outputColumnNames: _col0 > -Statistics: Num rows: 1 Data size: 7 Basic stats: > COMPLETE Column stats: NONE > -Dynamic Partitioning Event Operator > - Target Input: srcpart > - Partition key expr: UDFToDouble(hr) > - Statistics: Num rows: 1 Data size: 7 Basic stats: > COMPLETE Column stats: NONE > - Target column: hr > - Target Vertex: Map 1 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
[ https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yangfang updated HIVE-12653: Attachment: HIVE-12653.3.patch > The property "serialization.encoding" in the class > "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work > --- > > Key: HIVE-12653 > URL: https://issues.apache.org/jira/browse/HIVE-12653 > Project: Hive > Issue Type: Improvement > Components: Contrib >Affects Versions: 1.2.1 >Reporter: yangfang >Assignee: yangfang > Attachments: HIVE-12653.2.patch, HIVE-12653.3.patch, > HIVE-12653.patch, HIVE-12653.patch > > > when I create table with ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load some files > with chinese encoded by GBK: > create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr > string, > num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string ) > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' > WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK'); > load data local inpath > '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table > PersonInfo; > I found chinese disorder code in the table and 'serialization.encoding' > does not work, the chinese disorder data list as below: > | > > 9999�ϴ��� > 0624624002��ʱ�� > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
[ https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055804#comment-15055804 ] wangwenli commented on HIVE-12653: -- seems you forgot to add super.initialize(conf, tbl) in initialize() > The property "serialization.encoding" in the class > "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work > --- > > Key: HIVE-12653 > URL: https://issues.apache.org/jira/browse/HIVE-12653 > Project: Hive > Issue Type: Improvement > Components: Contrib >Affects Versions: 1.2.1 >Reporter: yangfang >Assignee: yangfang > Attachments: HIVE-12653.2.patch, HIVE-12653.patch, HIVE-12653.patch > > > when I create table with ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load some files > with chinese encoded by GBK: > create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr > string, > num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string ) > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' > WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK'); > load data local inpath > '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table > PersonInfo; > I found chinese disorder code in the table and 'serialization.encoding' > does not work, the chinese disorder data list as below: > | > > 9999�ϴ��� > 0624624002��ʱ�� > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
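The failure mode wangwenli points at can be shown with a toy class pair. This is a simplified sketch, not the real Hive SerDe hierarchy: the base class reads serialization.encoding in initialize(), so a subclass that overrides initialize() without chaining to super silently keeps the default charset, which is exactly how GBK data ends up as mojibake:

```java
import java.nio.charset.Charset;
import java.util.Properties;

// Toy stand-in for an encoding-aware SerDe base class: the charset is
// only picked up if initialize() actually runs.
class EncodingAwareBase {
    protected Charset charset = Charset.forName("UTF-8");

    public void initialize(Properties tbl) {
        String enc = tbl.getProperty("serialization.encoding");
        if (enc != null) {
            charset = Charset.forName(enc);
        }
    }

    public Charset getCharset() { return charset; }
}

class BrokenSerDe extends EncodingAwareBase {
    @Override
    public void initialize(Properties tbl) {
        // BUG: no super.initialize(tbl), so serialization.encoding=GBK
        // from SERDEPROPERTIES is silently ignored.
    }
}

class FixedSerDe extends EncodingAwareBase {
    @Override
    public void initialize(Properties tbl) {
        super.initialize(tbl); // charset now honors serialization.encoding
    }
}
```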
[jira] [Commented] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
[ https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055851#comment-15055851 ] yangfang commented on HIVE-12653: - Thanks very much, I have re-packaged the patch. > The property "serialization.encoding" in the class > "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work > --- > > Key: HIVE-12653 > URL: https://issues.apache.org/jira/browse/HIVE-12653 > Project: Hive > Issue Type: Improvement > Components: Contrib >Affects Versions: 1.2.1 >Reporter: yangfang >Assignee: yangfang > Attachments: HIVE-12653.2.patch, HIVE-12653.3.patch, > HIVE-12653.patch, HIVE-12653.patch > > > when I create table with ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load some files > with chinese encoded by GBK: > create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr > string, > num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string ) > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' > WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK'); > load data local inpath > '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table > PersonInfo; > I found chinese disorder code in the table and 'serialization.encoding' > does not work, the chinese disorder data list as below: > | > > 9999�ϴ��� > 0624624002��ʱ�� > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12616) NullPointerException when spark session is reused to run a mapjoin
[ https://issues.apache.org/jira/browse/HIVE-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055934#comment-15055934 ] Hive QA commented on HIVE-12616: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777428/HIVE-12616.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 33 failed/errored test(s), 9881 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_stats org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_correlationoptimizer1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_where_non_partitioned org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_join_nullsafe org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge5 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_ppd_basic org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_bmj_schema_evolution org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_transform2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_5 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_decimal_6 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_interval_1 
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_limit org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_timestamp_ints_casts org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6346/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6346/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6346/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 33 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12777428 - PreCommit-HIVE-TRUNK-Build > NullPointerException when spark session is reused to run a mapjoin > -- > > Key: HIVE-12616 > URL: https://issues.apache.org/jira/browse/HIVE-12616 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 1.3.0 >Reporter: Nemon Lou >Assignee: Nemon Lou > Attachments: HIVE-12616.1.patch, HIVE-12616.2.patch, HIVE-12616.patch > > > The way to reproduce: > {noformat} > set hive.execution.engine=spark; > create table if not exists test(id int); > create table if not exists test1(id int); > insert into test values(1); > insert into test1 values(1); > select max(a.id) from test a ,test1 b > where a.id = b.id; > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12663) Support quoted table names/columns when ACID is on
[ https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057496#comment-15057496 ] Hive QA commented on HIVE-12663: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777563/HIVE-12663.03.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 9887 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_char_mapjoin1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_shufflejoin org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient 
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6355/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6355/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6355/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 21 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12777563 - PreCommit-HIVE-TRUNK-Build > Support quoted table names/columns when ACID is on > -- > > Key: HIVE-12663 > URL: https://issues.apache.org/jira/browse/HIVE-12663 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch, > HIVE-12663.03.patch > > > Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support > quoted names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12653) The property "serialization.encoding" in the class "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work
[ https://issues.apache.org/jira/browse/HIVE-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056098#comment-15056098 ] Hive QA commented on HIVE-12653: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777452/HIVE-12653.3.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 9896 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_join org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_hybridgrace_hashjoin_2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6347/testReport Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6347/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6347/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 18 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12777452 - PreCommit-HIVE-TRUNK-Build > The property "serialization.encoding" in the class > "org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe" does not work > --- > > Key: HIVE-12653 > URL: https://issues.apache.org/jira/browse/HIVE-12653 > Project: Hive > Issue Type: Improvement > Components: Contrib >Affects Versions: 1.2.1 >Reporter: yangfang >Assignee: yangfang > Attachments: HIVE-12653.2.patch, HIVE-12653.3.patch, > HIVE-12653.patch, HIVE-12653.patch > > > when I create table with ROW FORMAT SERDE > 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' and load some files > with chinese encoded by GBK: > create table PersonInfo (cod_fn_ent string, num_seq_trc_form string, date_tr > string, > num_jrn_no string, cod_trc_form_typ string,id_intl_ip string, name string ) > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' > WITH SERDEPROPERTIES ("field.delim"="|!","serialization.encoding"='GBK'); > load data local inpath > '/home/mr/hive/99-BoEing-IF_PMT_NOTE-2G-20151019-0' overwrite into table > PersonInfo; > I found chinese disorder code in the table and 'serialization.encoding' > does not work, the chinese disorder data list as below: > | > > 9999�ϴ��� > 0624624002��ʱ�� > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.
[ https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056185#comment-15056185 ] Matt McCline commented on HIVE-12435: - Patch #3 also disappeared down the rabbit hole. > SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and > vectorization is enabled. > -- > > Key: HIVE-12435 > URL: https://issues.apache.org/jira/browse/HIVE-12435 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.0.0 >Reporter: Takahiko Saito >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, > HIVE-12435.03.patch > > > Run the following query: > {noformat} > create table count_case_groupby (key string, bool boolean) STORED AS orc; > insert into table count_case_groupby values ('key1', true),('key2', > false),('key3', NULL),('key4', false),('key5',NULL); > {noformat} > The table contains the following: > {noformat} > key1 true > key2 false > key3 NULL > key4 false > key5 NULL > {noformat} > The below query returns: > {noformat} > SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) > AS cnt_bool0_ok FROM count_case_groupby GROUP BY key; > key1 1 > key2 1 > key3 1 > key4 1 > key5 1 > {noformat} > while it expects the following results: > {noformat} > key1 1 > key2 1 > key3 0 > key4 1 > key5 0 > {noformat} > The query works with hive ver 1.2. Also it works when a table is not orc > format. > Also even if it's an orc table, when vectorization is disabled, the query > works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
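For reference, the SQL contract this bug violates: COUNT(expr) counts only non-NULL values, so the ELSE NULL arm of the CASE must contribute nothing. A minimal sketch of that contract in plain Java (standing in for the vectorized aggregation, not Hive code):

```java
import java.util.Arrays;
import java.util.List;

// Models COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END)
// over one group: rows whose bool is NULL yield a NULL CASE result, and
// COUNT must skip NULLs, so such rows contribute 0 to the count.
class CountCaseSemantics {
    static long countCase(List<Boolean> bools) {
        long count = 0;
        for (Boolean b : bools) {
            Integer caseResult = (b == null) ? null : (b ? 1 : 0);
            if (caseResult != null) { // COUNT ignores NULL values
                count++;
            }
        }
        return count;
    }
}
```

Under this contract the groups key3 and key5 (bool is NULL) must count 0, which matches the expected results quoted above and not the ones the vectorized ORC path returns.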
[jira] [Commented] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.
[ https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056198#comment-15056198 ] Matt McCline commented on HIVE-12435: - Started Hive QA for #4 as http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6348/ > SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and > vectorization is enabled. > -- > > Key: HIVE-12435 > URL: https://issues.apache.org/jira/browse/HIVE-12435 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.0.0 >Reporter: Takahiko Saito >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, > HIVE-12435.03.patch, HIVE-12435.04.patch > > > Run the following query: > {noformat} > create table count_case_groupby (key string, bool boolean) STORED AS orc; > insert into table count_case_groupby values ('key1', true),('key2', > false),('key3', NULL),('key4', false),('key5',NULL); > {noformat} > The table contains the following: > {noformat} > key1 true > key2 false > key3 NULL > key4 false > key5 NULL > {noformat} > The below query returns: > {noformat} > SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) > AS cnt_bool0_ok FROM count_case_groupby GROUP BY key; > key1 1 > key2 1 > key3 1 > key4 1 > key5 1 > {noformat} > while it expects the following results: > {noformat} > key1 1 > key2 1 > key3 0 > key4 1 > key5 0 > {noformat} > The query works with hive ver 1.2. Also it works when a table is not orc > format. > Also even if it's an orc table, when vectorization is disabled, the query > works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.
[ https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-12435: Attachment: HIVE-12435.04.patch > SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and > vectorization is enabled. > -- > > Key: HIVE-12435 > URL: https://issues.apache.org/jira/browse/HIVE-12435 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.0.0 >Reporter: Takahiko Saito >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, > HIVE-12435.03.patch, HIVE-12435.04.patch > > > Run the following query: > {noformat} > create table count_case_groupby (key string, bool boolean) STORED AS orc; > insert into table count_case_groupby values ('key1', true),('key2', > false),('key3', NULL),('key4', false),('key5',NULL); > {noformat} > The table contains the following: > {noformat} > key1 true > key2 false > key3 NULL > key4 false > key5 NULL > {noformat} > The below query returns: > {noformat} > SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) > AS cnt_bool0_ok FROM count_case_groupby GROUP BY key; > key1 1 > key2 1 > key3 1 > key4 1 > key5 1 > {noformat} > while it expects the following results: > {noformat} > key1 1 > key2 1 > key3 0 > key4 1 > key5 0 > {noformat} > The query works with hive ver 1.2. Also it works when a table is not orc > format. > Also even if it's an orc table, when vectorization is disabled, the query > works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11775) Implement limit push down through union all in CBO
[ https://issues.apache.org/jira/browse/HIVE-11775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-11775: --- Attachment: HIVE-11775.09.patch > Implement limit push down through union all in CBO > -- > > Key: HIVE-11775 > URL: https://issues.apache.org/jira/browse/HIVE-11775 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11775.01.patch, HIVE-11775.02.patch, > HIVE-11775.03.patch, HIVE-11775.04.patch, HIVE-11775.05.patch, > HIVE-11775.06.patch, HIVE-11775.07.patch, HIVE-11775.08.patch, > HIVE-11775.09.patch > > > Enlightened by HIVE-11684 (Kudos to [~jcamachorodriguez]), we can actually > push limit down through union all, which reduces the intermediate number of > rows in union branches. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction
[ https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-12366: - Attachment: HIVE-12366.9.patch Upload rebased patch 9 > Refactor Heartbeater logic for transaction > -- > > Key: HIVE-12366 > URL: https://issues.apache.org/jira/browse/HIVE-12366 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, > HIVE-12366.3.patch, HIVE-12366.4.patch, HIVE-12366.5.patch, > HIVE-12366.6.patch, HIVE-12366.7.patch, HIVE-12366.8.patch, HIVE-12366.9.patch > > > Currently there is a gap between the time locks acquisition and the first > heartbeat being sent out. Normally the gap is negligible, but when it's big > it will cause query fail since the locks are timed out by the time the > heartbeat is sent. > Need to remove this gap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
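The gap described above is a classic initial-delay pitfall: if the first heartbeat fires one full interval after lock acquisition, the locks can time out before it ever goes out. A hedged sketch of the safer pattern (initial delay of zero, so one heartbeat is sent immediately — illustrative only, not the actual patch):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Closes the gap between lock acquisition and the first heartbeat by
// scheduling with initialDelay = 0: one beat goes out right away, then
// the task repeats at the normal interval.
class ImmediateHeartbeater {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    final AtomicInteger beats = new AtomicInteger();

    void start(long intervalMillis) {
        // initialDelay = 0, not intervalMillis: the first heartbeat fires
        // immediately instead of a full interval after the locks are taken.
        scheduler.scheduleAtFixedRate(beats::incrementAndGet,
                0, intervalMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

With a long interval (e.g. minutes) the immediate first beat is what keeps freshly acquired locks from expiring before any heartbeat arrives.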
[jira] [Updated] (HIVE-12667) Proper fix for HIVE-12473
[ https://issues.apache.org/jira/browse/HIVE-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-12667: -- Description: HIVE-12473 has added an incorrect comment and also lacks a test case. Benefits of this fix: * Does not say: "Probably doesn't work" * Does not use grammar like "subquery columns and such" * Adds test cases, that let you verify the fix * Doesn't rely on certain structure of key expr, just takes the type at compile time * Doesn't require an additional walk of each key expression * Shows the type used in explain was:HIVE-12473 has added an incorrect comment and also lacks a test case. > Proper fix for HIVE-12473 > - > > Key: HIVE-12667 > URL: https://issues.apache.org/jira/browse/HIVE-12667 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > > HIVE-12473 has added an incorrect comment and also lacks a test case. > Benefits of this fix: >* Does not say: "Probably doesn't work" >* Does not use grammar like "subquery columns and such" >* Adds test cases, that let you verify the fix >* Doesn't rely on certain structure of key expr, just takes the type at > compile time >* Doesn't require an additional walk of each key expression >* Shows the type used in explain -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12597) LLAP - allow using elevator without cache
[ https://issues.apache.org/jira/browse/HIVE-12597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057325#comment-15057325 ] Hive QA commented on HIVE-12597: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777554/HIVE-12597.02.patch {color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 9885 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_interval_2.q-bucket3.q-vectorization_7.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6354/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6354/console Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6354/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 17 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12777554 - PreCommit-HIVE-TRUNK-Build > LLAP - allow using elevator without cache > - > > Key: HIVE-12597 > URL: https://issues.apache.org/jira/browse/HIVE-12597 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12597.01.patch, HIVE-12597.02.patch, > HIVE-12597.patch > > > Elevator is currently tied up with cache due to the way the memory is > allocated. We should allow using elevator with the cache disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12667) Proper fix for HIVE-12473
[ https://issues.apache.org/jira/browse/HIVE-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057447#comment-15057447 ] Gunther Hagleitner commented on HIVE-12667: --- [~vikram.dixit]/[~wzheng] can you take a look please? > Proper fix for HIVE-12473 > - > > Key: HIVE-12667 > URL: https://issues.apache.org/jira/browse/HIVE-12667 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-12667.1.patch > > > HIVE-12473 has added an incorrect comment and also lacks a test case. > Benefits of this fix: >* Does not say: "Probably doesn't work" >* Does not use grammar like "subquery columns and such" >* Adds test cases, that let you verify the fix >* Doesn't rely on certain structure of key expr, just takes the type at > compile time >* Doesn't require an additional walk of each key expression >* Shows the type used in explain -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12667) Proper fix for HIVE-12473
[ https://issues.apache.org/jira/browse/HIVE-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-12667: -- Target Version/s: 2.0.0 > Proper fix for HIVE-12473 > - > > Key: HIVE-12667 > URL: https://issues.apache.org/jira/browse/HIVE-12667 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > > HIVE-12473 has added an incorrect comment and also lacks a test case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12667) Proper fix for HIVE-12473
[ https://issues.apache.org/jira/browse/HIVE-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-12667: -- Summary: Proper fix for HIVE-12473 (was: Add test case for HIVE-12473) > Proper fix for HIVE-12473 > - > > Key: HIVE-12667 > URL: https://issues.apache.org/jira/browse/HIVE-12667 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > > HIVE-12473 has added an incorrect comment and also lacks a test case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12663) Support quoted table names/columns when ACID is on
[ https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-12663: -- Component/s: Transactions > Support quoted table names/columns when ACID is on > -- > > Key: HIVE-12663 > URL: https://issues.apache.org/jira/browse/HIVE-12663 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12663.01.patch > > > Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support > quoted names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-9544) Error dropping fully qualified partitioned table - Internal error processing get_partition_names
[ https://issues.apache.org/jira/browse/HIVE-9544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang resolved HIVE-9544. --- Resolution: Cannot Reproduce Fix Version/s: 2.0.0 > Error dropping fully qualified partitioned table - Internal error processing > get_partition_names > > > Key: HIVE-9544 > URL: https://issues.apache.org/jira/browse/HIVE-9544 > Project: Hive > Issue Type: Bug >Affects Versions: 0.14.0 > Environment: HDP 2.2 >Reporter: Hari Sekhon >Assignee: Chaoyu Tang >Priority: Minor > Fix For: 2.0.0 > > > When attempting to drop a partitioned table using a fully qualified name I > get this error: > {code} > hive -e 'drop table myDB.my_table_name;' > Logging initialized using configuration in > file:/etc/hive/conf/hive-log4j.properties > SLF4J: Class path contains multiple SLF4J bindings. > SLF4J: Found binding in > [jar:file:/usr/hdp/2.2.0.0-2041/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: Found binding in > [jar:file:/usr/hdp/2.2.0.0-2041/hive/lib/hive-jdbc-0.14.0.2.2.0.0-2041-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class] > SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an > explanation. > SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask. > org.apache.thrift.TApplicationException: Internal error processing > get_partition_names > {code} > It succeeds if I instead do: > {code}hive -e 'use myDB; drop table my_table_name;'{code} > Regards, > Hari Sekhon > http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12663) Support quoted table names/columns when ACID is on
[ https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-12663: -- Target Version/s: 1.3.0, 2.1.0 > Support quoted table names/columns when ACID is on > -- > > Key: HIVE-12663 > URL: https://issues.apache.org/jira/browse/HIVE-12663 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12663.01.patch > > > Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support > quoted names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12663) Support quoted table names/columns when ACID is on
[ https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056247#comment-15056247 ] Eugene Koifman commented on HIVE-12663: --- Seems like there should be a quoteName(String colName) method in some util class (hopefully same method used for non-acid tables) > Support quoted table names/columns when ACID is on > -- > > Key: HIVE-12663 > URL: https://issues.apache.org/jira/browse/HIVE-12663 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12663.01.patch > > > Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support > quoted names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
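A minimal sketch of the shared helper suggested in the comment above, assuming the usual backtick-quoting convention for Hive identifiers (wrap in backticks, double any embedded backtick so the name round-trips through the parser). The class name is illustrative; Hive's own helper lives elsewhere (e.g. an unparseIdentifier-style utility):

```java
// Illustrative standalone version of a quoteName(String colName) utility.
public final class IdentifierQuoting {
  private IdentifierQuoting() {}

  // Wrap the identifier in backticks and escape embedded backticks by doubling them.
  public static String quoteName(String colName) {
    return "`" + colName.replace("`", "``") + "`";
  }

  public static void main(String[] args) {
    System.out.println(quoteName("order"));    // `order`
    System.out.println(quoteName("odd`name")); // `odd``name`
  }
}
```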
[jira] [Updated] (HIVE-12663) Support quoted table names/columns when ACID is on
[ https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-12663: -- Fix Version/s: (was: 2.0.0) > Support quoted table names/columns when ACID is on > -- > > Key: HIVE-12663 > URL: https://issues.apache.org/jira/browse/HIVE-12663 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12663.01.patch > > > Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support > quoted names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12055) Create row-by-row shims for the write path
[ https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-12055: - Attachment: HIVE-12055.patch Fixed two test case output files. > Create row-by-row shims for the write path > --- > > Key: HIVE-12055 > URL: https://issues.apache.org/jira/browse/HIVE-12055 > Project: Hive > Issue Type: Sub-task > Components: ORC, Shims >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, > HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, > HIVE-12055.patch > > > As part of removing the row-by-row writer, we'll need to shim out the higher > level API (OrcSerde and OrcOutputFormat) so that we maintain backwards > compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12658) Task rejection by an llap daemon spams the log with RejectedExecutionExceptions
[ https://issues.apache.org/jira/browse/HIVE-12658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056759#comment-15056759 ] Prasanth Jayachandran commented on HIVE-12658: -- [~sseth] Can I take over this issue? If you haven't already started on it.. IIUC, RejectedExecutionException should be caught by LlapDaemonProtocolServerImpl and the response should contain the rejected FragmentSpecProto or fragment identifier. Is that correct? > Task rejection by an llap daemon spams the log with > RejectedExecutionExceptions > --- > > Key: HIVE-12658 > URL: https://issues.apache.org/jira/browse/HIVE-12658 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > > The execution queue throws a RejectedExecutionException - which is logged by > the hadoop IPC layer. > Instead of relying on an Exception in the protocol - move to sending back an > explicit response to indicate a rejected fragment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
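The change discussed above — catching the rejection at the submission boundary and reporting it in an explicit response rather than letting the RejectedExecutionException escape through (and be logged by) the IPC layer — can be sketched as follows. `SubmissionResult` and `submitFragment` are illustrative names, not Hive's actual API:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectionToResponse {
  enum SubmissionResult { ACCEPTED, REJECTED }

  // Convert a full execution queue into an explicit response instead of an exception.
  static SubmissionResult submitFragment(ExecutorService executor, Runnable fragment) {
    try {
      executor.execute(fragment);
      return SubmissionResult.ACCEPTED;
    } catch (RejectedExecutionException e) {
      // Queue full: report the rejection in the response; nothing propagates to the RPC layer.
      return SubmissionResult.REJECTED;
    }
  }

  static void sleep(long ms) {
    try { Thread.sleep(ms); } catch (InterruptedException ignored) {}
  }

  public static void main(String[] args) {
    // One worker, no queue capacity, abort policy: a second concurrent task is rejected.
    ThreadPoolExecutor executor = new ThreadPoolExecutor(1, 1, 0, TimeUnit.SECONDS,
        new SynchronousQueue<>(), new ThreadPoolExecutor.AbortPolicy());
    System.out.println(submitFragment(executor, () -> sleep(200))); // ACCEPTED
    System.out.println(submitFragment(executor, () -> {}));         // REJECTED
    executor.shutdown();
  }
}
```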
[jira] [Updated] (HIVE-12597) LLAP - allow using elevator without cache
[ https://issues.apache.org/jira/browse/HIVE-12597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12597: Attachment: HIVE-12597.02.patch The rebased patch. > LLAP - allow using elevator without cache > - > > Key: HIVE-12597 > URL: https://issues.apache.org/jira/browse/HIVE-12597 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12597.01.patch, HIVE-12597.02.patch, > HIVE-12597.patch > > > Elevator is currently tied up with cache due to the way the memory is > allocated. We should allow using elevator with the cache disabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12663) Support quoted table names/columns when ACID is on
[ https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12663: --- Attachment: HIVE-12663.02.patch > Support quoted table names/columns when ACID is on > -- > > Key: HIVE-12663 > URL: https://issues.apache.org/jira/browse/HIVE-12663 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch > > > Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support > quoted names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12663) Support quoted table names/columns when ACID is on
[ https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056609#comment-15056609 ] Pengcheng Xiong commented on HIVE-12663: [~ekoifman], thanks for your comments. I have addressed it in the new patch. Could you take another look? Thanks. > Support quoted table names/columns when ACID is on > -- > > Key: HIVE-12663 > URL: https://issues.apache.org/jira/browse/HIVE-12663 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch > > > Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support > quoted names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12055) Create row-by-row shims for the write path
[ https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056613#comment-15056613 ] Hive QA commented on HIVE-12055: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777507/HIVE-12055.patch {color:green}SUCCESS:{color} +1 due to 6 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 9895 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testFetchingPartitionsWithDifferentSchemas org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6349/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6349/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6349/ Messages: 
{noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 16 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12777507 - PreCommit-HIVE-TRUNK-Build > Create row-by-row shims for the write path > --- > > Key: HIVE-12055 > URL: https://issues.apache.org/jira/browse/HIVE-12055 > Project: Hive > Issue Type: Sub-task > Components: ORC, Shims >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, > HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch, > HIVE-12055.patch > > > As part of removing the row-by-row writer, we'll need to shim out the higher > level API (OrcSerde and OrcOutputFormat) so that we maintain backwards > compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12663) Support quoted table names/columns when ACID is on
[ https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056633#comment-15056633 ] Eugene Koifman commented on HIVE-12663: --- SemanticAnalyzer uses unparseIdentifier(String identifier, Configuration conf). Why did you choose to use unparseIdentifier(String identifier) ? > Support quoted table names/columns when ACID is on > -- > > Key: HIVE-12663 > URL: https://issues.apache.org/jira/browse/HIVE-12663 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch > > > Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support > quoted names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12663) Support quoted table names/columns when ACID is on
[ https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056654#comment-15056654 ] Eugene Koifman commented on HIVE-12663: --- +1 pending tests. Could you commit to branch-1 as well please > Support quoted table names/columns when ACID is on > -- > > Key: HIVE-12663 > URL: https://issues.apache.org/jira/browse/HIVE-12663 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch, > HIVE-12663.03.patch > > > Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support > quoted names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12663) Support quoted table names/columns when ACID is on
[ https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12663: --- Attachment: HIVE-12663.03.patch > Support quoted table names/columns when ACID is on > -- > > Key: HIVE-12663 > URL: https://issues.apache.org/jira/browse/HIVE-12663 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch, > HIVE-12663.03.patch > > > Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support > quoted names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12663) Support quoted table names/columns when ACID is on
[ https://issues.apache.org/jira/browse/HIVE-12663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056639#comment-15056639 ] Pengcheng Xiong commented on HIVE-12663: [~ekoifman], sorry, I used it in some places but not in all the places. I have changed that. Thanks. > Support quoted table names/columns when ACID is on > -- > > Key: HIVE-12663 > URL: https://issues.apache.org/jira/browse/HIVE-12663 > Project: Hive > Issue Type: Sub-task > Components: Transactions >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12663.01.patch, HIVE-12663.02.patch, > HIVE-12663.03.patch > > > Right now the rewrite part in UpdateDeleteSemanticAnalyzer does not support > quoted names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly
[ https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15055703#comment-15055703 ] Hive QA commented on HIVE-12661: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777420/HIVE-12661.03.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 124 failed/errored test(s), 9896 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_with_different_encryption_keys org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin7 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_bucketed_table org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_merge org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_parallel_orderby org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate 
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket4 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket5 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_many org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin7 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_disable_merge_for_bucketing org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_bucketed_table org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_dyn_part org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_map_operators org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_merge org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_num_buckets org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_list_bucket_dml_10 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_parallel_orderby org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_reduce_deduplicate org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_spark1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_spark2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_spark3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_spark4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin2 
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin3 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin4 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin5 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin7 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin8 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin9 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin_negative org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_filter_join_breaktask org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_map_ppr org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_map_ppr_multi_distinct org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_sort_skew_1_23 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input_part2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join17
[jira] [Commented] (HIVE-12577) NPE in LlapTaskCommunicator when unregistering containers
[ https://issues.apache.org/jira/browse/HIVE-12577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056438#comment-15056438 ] Siddharth Seth commented on HIVE-12577: --- EntityTracker tracks the relationship between containers and tasks, along with the nodes they run on. This is used for various bits of accounting - including telling unknown fragments to die, processing heartbeats for fragments which are in the wait queue of an llap instance. There were some discrepancies in this tracking, the most important one being the null check which causes the exception. The patch fixes these and adds some unit tests for verification. > NPE in LlapTaskCommunicator when unregistering containers > - > > Key: HIVE-12577 > URL: https://issues.apache.org/jira/browse/HIVE-12577 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.0.0 >Reporter: Siddharth Seth >Assignee: Siddharth Seth >Priority: Critical > Attachments: HIVE-12577.1.review.txt, HIVE-12577.1.txt, > HIVE-12577.1.wip.txt > > > {code} > 2015-12-02 13:29:00,160 [ERROR] [Dispatcher thread {Central}] > |common.AsyncDispatcher|: Error in dispatcher thread > java.lang.NullPointerException > at > org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator$EntityTracker.unregisterContainer(LlapTaskCommunicator.java:586) > at > org.apache.hadoop.hive.llap.tezplugins.LlapTaskCommunicator.registerContainerEnd(LlapTaskCommunicator.java:188) > at > org.apache.tez.dag.app.TaskCommunicatorManager.unregisterRunningContainer(TaskCommunicatorManager.java:389) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl.unregisterFromTAListener(AMContainerImpl.java:1121) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtLaunchingTransition.transition(AMContainerImpl.java:699) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtIdleTransition.transition(AMContainerImpl.java:805) > at > 
org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:892) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:887) > at > org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:415) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl.handle(AMContainerImpl.java:72) > at > org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:60) > at > org.apache.tez.dag.app.rm.container.AMContainerMap.handle(AMContainerMap.java:36) > at > org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) > at > org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114) > at java.lang.Thread.run(Thread.java:745) > 2015-12-02 13:29:00,167 [ERROR] [Dispatcher thread {Central}] > |common.AsyncDispatcher|: Error in dispatcher thread > java.lang.NullPointerException > at > org.apache.tez.dag.app.TaskCommunicatorManager.unregisterRunningContainer(TaskCommunicatorManager.java:386) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl.unregisterFromTAListener(AMContainerImpl.java:1121) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtLaunchingTransition.transition(AMContainerImpl.java:699) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtIdleTransition.transition(AMContainerImpl.java:805) > at > org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:892) > 
at > org.apache.tez.dag.app.rm.container.AMContainerImpl$StopRequestAtRunningTransition.transition(AMContainerImpl.java:887) > at > org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at >
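The kind of null-safe unregistration the comment above describes can be illustrated with a toy tracker: unregistering a container the tracker never saw (or already removed) must return a "not found" result instead of dereferencing a missing entry. The class and method names here are illustrative, not Hive's actual EntityTracker:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy container-to-node tracker demonstrating null-tolerant unregistration.
class EntityTrackerSketch {
  private final Map<String, String> containerToNode = new ConcurrentHashMap<>();

  void registerContainer(String containerId, String node) {
    containerToNode.put(containerId, node);
  }

  // Returns the node the container ran on, or null if the container was
  // unknown — the null check that prevents an NPE like the one traced above.
  String unregisterContainer(String containerId) {
    return containerToNode.remove(containerId); // remove() of an absent key is safe
  }

  public static void main(String[] args) {
    EntityTrackerSketch tracker = new EntityTrackerSketch();
    tracker.registerContainer("c1", "node1");
    System.out.println(tracker.unregisterContainer("c1")); // node1
    System.out.println(tracker.unregisterContainer("c2")); // null (no exception)
  }
}
```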
[jira] [Commented] (HIVE-12448) Change to tracking of dag status via dagIdentifier instead of dag name
[ https://issues.apache.org/jira/browse/HIVE-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056465#comment-15056465 ] Sergey Shelukhin commented on HIVE-12448: - +1 pending tests > Change to tracking of dag status via dagIdentifier instead of dag name > -- > > Key: HIVE-12448 > URL: https://issues.apache.org/jira/browse/HIVE-12448 > Project: Hive > Issue Type: Sub-task > Components: llap >Affects Versions: 2.0.0 >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-12448.1.txt, HIVE-12448.2.txt, HIVE-12448.3.txt, > HIVE-12448.4.txt > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly
[ https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056499#comment-15056499 ] Sergey Shelukhin commented on HIVE-12473: - The comment is not actually correct, as I realized later; we only receive one expression currently, so there's no chance of getting a wrong value. The reason it does the right thing now is that the top-level expression does not need to be cast in general case, Hive already takes care of comparing correctly. What needs to be cast is the partition column string. It needs to be cast to argument type of whatever it's passed to. Most of the time the partition column is the top-level expression and is passed into UDFCompareBlahBlah, but it's not always the case; it's different if it's wrapped in, and passed into, most UDFs (e.g. YEAR(date)). The patch changes the code to take the type of the argument, instead of taking the type of the top-level expression. > DPP: UDFs on the partition column side does not evaluate correctly > -- > > Key: HIVE-12473 > URL: https://issues.apache.org/jira/browse/HIVE-12473 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 1.3.0, 1.2.1, 2.0.0 >Reporter: Gopal V >Assignee: Sergey Shelukhin >Priority: Blocker > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-12473.patch > > > Related to HIVE-12462 > {code} > select count(1) from accounts a, transactions t where year(a.dt) = year(t.dt) > and account_id = 22; > $hdt$_0:$hdt$_1:a > TableScan (TS_2) > alias: a > filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) > IN (RS[6])) (type: boolean) > {code} > Ends up being evaluated as {{year(cast(dt as int))}} because the pruner only > checks for final type, not the column type. 
> {code} > ObjectInspector oi = > > PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(TypeInfoFactory > .getPrimitiveTypeInfo(si.fieldInspector.getTypeName())); > Converter converter = > ObjectInspectorConverters.getConverter( > PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
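The distinction in the comment above — converting the partition column's string value to the UDF's *argument* type rather than the top-level expression's *result* type — can be shown with a simplified, self-contained stand-in for Hive's ObjectInspector machinery. `year(LocalDate)` here plays the role of the year() UDF; its argument type is a date while its result type is int:

```java
import java.time.LocalDate;

public class PartitionPruneCast {
  // Stand-in for the year() UDF: argument type LocalDate, result type int.
  static int year(LocalDate d) {
    return d.getYear();
  }

  public static void main(String[] args) {
    String partValue = "2015-12-14"; // partition column values arrive as strings

    // Wrong: casting the string to the expression's *result* type (int) is
    // the equivalent of evaluating year(cast(dt as int)) and fails:
    //   int bad = Integer.parseInt(partValue); // NumberFormatException

    // Right: convert to the *argument* type first, then evaluate the UDF.
    LocalDate asDate = LocalDate.parse(partValue);
    System.out.println(year(asDate)); // 2015
  }
}
```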
[jira] [Updated] (HIVE-12573) some DPP tests are broken
[ https://issues.apache.org/jira/browse/HIVE-12573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12573: Priority: Blocker (was: Major) > some DPP tests are broken > - > > Key: HIVE-12573 > URL: https://issues.apache.org/jira/browse/HIVE-12573 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin >Priority: Blocker > Attachments: HIVE-12573.patch > > > -It looks like LLAP out files were not updated in some DPP JIRA because the > test was entirely broken in HiveQA at the time- actually looks like out files > have explain output with a glitch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction
[ https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-12366: - Attachment: HIVE-12366.7.patch Upload patch 7 > Refactor Heartbeater logic for transaction > -- > > Key: HIVE-12366 > URL: https://issues.apache.org/jira/browse/HIVE-12366 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, > HIVE-12366.3.patch, HIVE-12366.4.patch, HIVE-12366.5.patch, > HIVE-12366.6.patch, HIVE-12366.7.patch > > > Currently there is a gap between the time locks acquisition and the first > heartbeat being sent out. Normally the gap is negligible, but when it's big > it will cause query fail since the locks are timed out by the time the > heartbeat is sent. > Need to remove this gap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
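One generic way to close the gap the issue above describes is to schedule the first heartbeat with an initial delay of zero, so it fires as soon as the locks are acquired instead of one full interval later. This is a pattern sketch, not Hive's actual Heartbeater implementation:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class HeartbeatSketch {
  // Starts heartbeating immediately and reports how many beats fired within windowMs.
  static int beatsWithin(long windowMs, long intervalMs) throws InterruptedException {
    AtomicInteger beats = new AtomicInteger();
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    // initialDelay = 0: the first heartbeat is sent right at lock acquisition,
    // closing the gap before the first full interval elapses.
    scheduler.scheduleAtFixedRate(beats::incrementAndGet, 0, intervalMs, TimeUnit.MILLISECONDS);
    Thread.sleep(windowMs);
    scheduler.shutdownNow();
    return beats.get();
  }

  public static void main(String[] args) throws InterruptedException {
    // Even before one full interval (100 ms) has passed, at least one
    // heartbeat has already gone out.
    System.out.println(beatsWithin(50, 100) >= 1); // true
  }
}
```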
[jira] [Commented] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL
[ https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056518#comment-15056518 ] Sergey Shelukhin commented on HIVE-12462: - I think the reason for this patch from [~gopalv]'s query runs was precisely that TS predicate was a superset of FIL predicate. I was assuming that's by design. > DPP: DPP optimizers need to run on the TS predicate not FIL > > > Key: HIVE-12462 > URL: https://issues.apache.org/jira/browse/HIVE-12462 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Blocker > Fix For: 2.0.0 > > Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch > > > HIVE-11398 + HIVE-11791, the partition-condition-remover became more > effective. > This removes predicates from the FilterExpression which involve partition > columns, causing a miss for dynamic-partition pruning if the DPP relies on > FilterDesc. > The TS desc will have the correct predicate in that condition. > {code} > $hdt$_0:$hdt$_1:a > TableScan (TS_2) > alias: a > filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) > IN (RS[6])) (type: boolean) > Filter Operator (FIL_20) > predicate: ((account_id = 22) and year(dt) is not null) (type: boolean) > Select Operator (SEL_4) > expressions: dt (type: date) > outputColumnNames: _col1 > Reduce Output Operator (RS_8) > key expressions: year(_col1) (type: int) > sort order: + > Map-reduce partition columns: year(_col1) (type: int) > Join Operator (JOIN_9) > condition map: > Inner Join 0 to 1 > keys: > 0 year(_col1) (type: int) > 1 year(_col1) (type: int) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10982) Customizable the value of java.sql.statement.setFetchSize in Hive JDBC Driver
[ https://issues.apache.org/jira/browse/HIVE-10982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-10982: -- Attachment: HIVE-10982.3.patch I made one small change to the patch before committing. You had changed one of the constructors in HiveStatement rather than adding a completely new one. I was afraid this could break backwards compatibility, so I changed it to add a new constructor with all five arguments and leave the existing four argument constructor in place. I've attached patch 3 with my change for completeness. > Customizable the value of java.sql.statement.setFetchSize in Hive JDBC Driver > -- > > Key: HIVE-10982 > URL: https://issues.apache.org/jira/browse/HIVE-10982 > Project: Hive > Issue Type: Improvement > Components: JDBC >Affects Versions: 1.2.0, 1.2.1 >Reporter: Bing Li >Assignee: Bing Li >Priority: Critical > Attachments: HIVE-10982.1.patch, HIVE-10982.2.patch, > HIVE-10982.3.patch > > > The current JDBC driver for Hive hard-code the value of setFetchSize to 50, > which will be a bottleneck for performance. > Pentaho filed this issue as http://jira.pentaho.com/browse/PDI-11511, whose > status is open. > Also it has discussion in > http://forums.pentaho.com/showthread.php?158381-Hive-JDBC-Query-too-slow-too-many-fetches-after-query-execution-Kettle-Xform > http://mail-archives.apache.org/mod_mbox/hive-user/201307.mbox/%3ccacq46vevgrfqg5rwxnr1psgyz7dcf07mvlo8mm2qit3anm1...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
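The compatibility concern Alan raises — add a new five-argument constructor rather than changing the existing four-argument one — is a standard overloading pattern: the old constructor stays in place and delegates to the new one with the previous hard-coded default. The sketch below uses hypothetical names and mirrors only the pattern, not the actual HiveStatement source; the value 50 is the hard-coded fetch size mentioned in the issue.

```java
// Hypothetical sketch of the backward-compatible constructor overloading
// described in the comment above. Not the real HiveStatement class.
public class StatementSketch {
    // The previously hard-coded fetch size the patch makes configurable.
    static final int DEFAULT_FETCH_SIZE = 50;

    private final int fetchSize;

    // Existing constructor kept intact: callers compiled against the old
    // signature continue to work, and get the old default behavior.
    public StatementSketch(String host, int port, String db, boolean scrollable) {
        this(host, port, db, scrollable, DEFAULT_FETCH_SIZE);
    }

    // New overload adds the configurable fetch size as a fifth argument.
    public StatementSketch(String host, int port, String db, boolean scrollable,
                           int fetchSize) {
        this.fetchSize = fetchSize;
    }

    public int getFetchSize() {
        return fetchSize;
    }
}
```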
[jira] [Commented] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.
[ https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056486#comment-15056486 ] Matt McCline commented on HIVE-12435: - None of the failures look related. > SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and > vectorization is enabled. > -- > > Key: HIVE-12435 > URL: https://issues.apache.org/jira/browse/HIVE-12435 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.0.0 >Reporter: Takahiko Saito >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, > HIVE-12435.03.patch, HIVE-12435.04.patch > > > Run the following query: > {noformat} > create table count_case_groupby (key string, bool boolean) STORED AS orc; > insert into table count_case_groupby values ('key1', true),('key2', > false),('key3', NULL),('key4', false),('key5',NULL); > {noformat} > The table contains the following: > {noformat} > key1 true > key2 false > key3 NULL > key4 false > key5 NULL > {noformat} > The below query returns: > {noformat} > SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) > AS cnt_bool0_ok FROM count_case_groupby GROUP BY key; > key1 1 > key2 1 > key3 1 > key4 1 > key5 1 > {noformat} > while it expects the following results: > {noformat} > key1 1 > key2 1 > key3 0 > key4 1 > key5 0 > {noformat} > The query works with hive ver 1.2. Also it works when a table is not orc > format. > Also even if it's an orc table, when vectorization is disabled, the query > works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12570) Incorrect error message Expression not in GROUP BY key thrown instead of Invalid function
[ https://issues.apache.org/jira/browse/HIVE-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12570: - Attachment: HIVE-12570.5.patch > Incorrect error message Expression not in GROUP BY key thrown instead of > Invalid function > - > > Key: HIVE-12570 > URL: https://issues.apache.org/jira/browse/HIVE-12570 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12570.1.patch, HIVE-12570.2.patch, > HIVE-12570.3.patch, HIVE-12570.4.patch, HIVE-12570.5.patch > > > {code} > explain create table avg_salary_by_supervisor3 as select average(key) as > key_avg from src group by value; > {code} > We get the following stack trace : > {code} > FAILED: SemanticException [Error 10025]: Line 1:57 Expression not in GROUP BY > key 'key' > ERROR ql.Driver: FAILED: SemanticException [Error 10025]: Line 1:57 > Expression not in GROUP BY key 'key' > org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:57 Expression not > in GROUP BY key 'key' > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10484) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10432) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3824) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3603) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8862) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8817) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9668) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9561) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10053) > at > 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:345) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10064) > at > org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:222) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at > org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:237) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:462) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1227) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1276) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1152) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1140) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:778) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:717) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {code} > Instead of the above error message, it be more appropriate to throw the below > error : > ERROR ql.Driver: FAILED: SemanticException [Error 10011]: Line 1:58 Invalid 
> function 'average' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction
[ https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-12366: - Attachment: (was: HIVE-12366.7.patch) > Refactor Heartbeater logic for transaction > -- > > Key: HIVE-12366 > URL: https://issues.apache.org/jira/browse/HIVE-12366 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, > HIVE-12366.3.patch, HIVE-12366.4.patch, HIVE-12366.5.patch, > HIVE-12366.6.patch, HIVE-12366.7.patch > > > Currently there is a gap between the time locks acquisition and the first > heartbeat being sent out. Normally the gap is negligible, but when it's big > it will cause query fail since the locks are timed out by the time the > heartbeat is sent. > Need to remove this gap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction
[ https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-12366: - Attachment: HIVE-12366.7.patch > Refactor Heartbeater logic for transaction > -- > > Key: HIVE-12366 > URL: https://issues.apache.org/jira/browse/HIVE-12366 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, > HIVE-12366.3.patch, HIVE-12366.4.patch, HIVE-12366.5.patch, > HIVE-12366.6.patch, HIVE-12366.7.patch > > > Currently there is a gap between the time locks acquisition and the first > heartbeat being sent out. Normally the gap is negligible, but when it's big > it will cause query fail since the locks are timed out by the time the > heartbeat is sent. > Need to remove this gap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12462) DPP: DPP optimizers need to run on the TS predicate not FIL
[ https://issues.apache.org/jira/browse/HIVE-12462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056516#comment-15056516 ] Sergey Shelukhin commented on HIVE-12462: - We should figure out the right fix for the above issue before reverting this. Do you want to track it in HIVE-12462? > DPP: DPP optimizers need to run on the TS predicate not FIL > > > Key: HIVE-12462 > URL: https://issues.apache.org/jira/browse/HIVE-12462 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 2.0.0 >Reporter: Gopal V >Assignee: Gopal V >Priority: Blocker > Fix For: 2.0.0 > > Attachments: HIVE-12462.02.patch, HIVE-12462.1.patch > > > HIVE-11398 + HIVE-11791, the partition-condition-remover became more > effective. > This removes predicates from the FilterExpression which involve partition > columns, causing a miss for dynamic-partition pruning if the DPP relies on > FilterDesc. > The TS desc will have the correct predicate in that condition. > {code} > $hdt$_0:$hdt$_1:a > TableScan (TS_2) > alias: a > filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) > IN (RS[6])) (type: boolean) > Filter Operator (FIL_20) > predicate: ((account_id = 22) and year(dt) is not null) (type: boolean) > Select Operator (SEL_4) > expressions: dt (type: date) > outputColumnNames: _col1 > Reduce Output Operator (RS_8) > key expressions: year(_col1) (type: int) > sort order: + > Map-reduce partition columns: year(_col1) (type: int) > Join Operator (JOIN_9) > condition map: > Inner Join 0 to 1 > keys: > 0 year(_col1) (type: int) > 1 year(_col1) (type: int) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-12473) DPP: UDFs on the partition column side does not evaluate correctly
[ https://issues.apache.org/jira/browse/HIVE-12473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-12473. - Resolution: Fixed > DPP: UDFs on the partition column side does not evaluate correctly > -- > > Key: HIVE-12473 > URL: https://issues.apache.org/jira/browse/HIVE-12473 > Project: Hive > Issue Type: Bug > Components: Tez >Affects Versions: 1.3.0, 1.2.1, 2.0.0 >Reporter: Gopal V >Assignee: Sergey Shelukhin >Priority: Blocker > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-12473.patch > > > Related to HIVE-12462 > {code} > select count(1) from accounts a, transactions t where year(a.dt) = year(t.dt) > and account_id = 22; > $hdt$_0:$hdt$_1:a > TableScan (TS_2) > alias: a > filterExpr: (((account_id = 22) and year(dt) is not null) and (year(dt)) > IN (RS[6])) (type: boolean) > {code} > Ends up being evaluated as {{year(cast(dt as int))}} because the pruner only > checks for final type, not the column type. > {code} > ObjectInspector oi = > > PrimitiveObjectInspectorFactory.getPrimitiveWritableObjectInspector(TypeInfoFactory > .getPrimitiveTypeInfo(si.fieldInspector.getTypeName())); > Converter converter = > ObjectInspectorConverters.getConverter( > PrimitiveObjectInspectorFactory.javaStringObjectInspector, oi); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12366) Refactor Heartbeater logic for transaction
[ https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-12366: - Attachment: HIVE-12366.8.patch Upload patch 8 > Refactor Heartbeater logic for transaction > -- > > Key: HIVE-12366 > URL: https://issues.apache.org/jira/browse/HIVE-12366 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, > HIVE-12366.3.patch, HIVE-12366.4.patch, HIVE-12366.5.patch, > HIVE-12366.6.patch, HIVE-12366.7.patch, HIVE-12366.8.patch > > > Currently there is a gap between the time locks acquisition and the first > heartbeat being sent out. Normally the gap is negligible, but when it's big > it will cause query fail since the locks are timed out by the time the > heartbeat is sent. > Need to remove this gap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly
[ https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12661: --- Attachment: HIVE-12661.03.patch > StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly > --- > > Key: HIVE-12661 > URL: https://issues.apache.org/jira/browse/HIVE-12661 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12661.01.patch, HIVE-12661.02.patch, > HIVE-12661.03.patch > > > PROBLEM: > Hive stats are autogathered properly till an 'analyze table [tablename] > compute statistics for columns' is run. Then it does not auto-update the > stats till the command is run again. repo: > {code} > set hive.stats.autogather=true; > set hive.stats.atomic=false ; > set hive.stats.collect.rawdatasize=true ; > set hive.stats.collect.scancols=false ; > set hive.stats.collect.tablekeys=false ; > set hive.stats.fetch.column.stats=true; > set hive.stats.fetch.partition.stats=true ; > set hive.stats.reliable=false ; > set hive.compute.query.using.stats=true; > CREATE TABLE `default`.`calendar` (`year` int) ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ( > 'orc.compress'='NONE') ; > insert into calendar values (2010), (2011), (2012); > select * from calendar; > ++--+ > | calendar.year | > ++--+ > | 2010 | > | 2011 | > | 2012 | > ++--+ > select max(year) from calendar; > | 2012 | > insert into calendar values (2013); > select * from calendar; > ++--+ > | calendar.year | > ++--+ > | 2010 | > | 2011 | > | 2012 | > | 2013 | > ++--+ > select max(year) from calendar; > | 2013 | > insert into calendar values (2014); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > insert into calendar values (2015); > select max(year) from calendar; > | 2014 | > 
insert into calendar values (2016), (2017), (2018); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > select max(year) from calendar; > | 2018 | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
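The failure mode in the repro above is that `max(year)` keeps being answered from column stats (returning 2014) even after later inserts made those stats stale. The invariant being violated can be sketched as a simple guard: only answer from stats while the accuracy flag is set, and fall back to a scan otherwise. The names below are hypothetical and this is not Hive's actual StatsSetupConst/StatsOptimizer logic; only the `COLUMN_STATS_ACCURATE` key comes from the issue.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the invariant behind this bug report: a query may
// be answered from cached column stats only while COLUMN_STATS_ACCURATE is
// true; any insert after "analyze ... for columns" must clear the flag so
// a stale max is never served. Not the real Hive implementation.
public class StatsGuardSketch {
    public static Optional<Long> maxFromStats(Map<String, String> tableParams,
                                              long statsMax) {
        // Only trust the cached max when stats are flagged accurate.
        if (Boolean.parseBoolean(
                tableParams.getOrDefault("COLUMN_STATS_ACCURATE", "false"))) {
            return Optional.of(statsMax);
        }
        return Optional.empty(); // fall back to scanning the data
    }
}
```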
[jira] [Updated] (HIVE-12448) Change to tracking of dag status via dagIdentifier instead of dag name
[ https://issues.apache.org/jira/browse/HIVE-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-12448: -- Attachment: HIVE-12448.4.txt Rebased the patch after some recent conflicting changes. > Change to tracking of dag status via dagIdentifier instead of dag name > -- > > Key: HIVE-12448 > URL: https://issues.apache.org/jira/browse/HIVE-12448 > Project: Hive > Issue Type: Sub-task > Components: llap >Affects Versions: 2.0.0 >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-12448.1.txt, HIVE-12448.2.txt, HIVE-12448.3.txt, > HIVE-12448.4.txt > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12661) StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly
[ https://issues.apache.org/jira/browse/HIVE-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-12661: --- Attachment: (was: HIVE-12661.03.patch) > StatsSetupConst.COLUMN_STATS_ACCURATE is not used correctly > --- > > Key: HIVE-12661 > URL: https://issues.apache.org/jira/browse/HIVE-12661 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-12661.01.patch, HIVE-12661.02.patch, > HIVE-12661.03.patch > > > PROBLEM: > Hive stats are autogathered properly till an 'analyze table [tablename] > compute statistics for columns' is run. Then it does not auto-update the > stats till the command is run again. repo: > {code} > set hive.stats.autogather=true; > set hive.stats.atomic=false ; > set hive.stats.collect.rawdatasize=true ; > set hive.stats.collect.scancols=false ; > set hive.stats.collect.tablekeys=false ; > set hive.stats.fetch.column.stats=true; > set hive.stats.fetch.partition.stats=true ; > set hive.stats.reliable=false ; > set hive.compute.query.using.stats=true; > CREATE TABLE `default`.`calendar` (`year` int) ROW FORMAT SERDE > 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' TBLPROPERTIES ( > 'orc.compress'='NONE') ; > insert into calendar values (2010), (2011), (2012); > select * from calendar; > ++--+ > | calendar.year | > ++--+ > | 2010 | > | 2011 | > | 2012 | > ++--+ > select max(year) from calendar; > | 2012 | > insert into calendar values (2013); > select * from calendar; > ++--+ > | calendar.year | > ++--+ > | 2010 | > | 2011 | > | 2012 | > | 2013 | > ++--+ > select max(year) from calendar; > | 2013 | > insert into calendar values (2014); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > insert into calendar values (2015); > select max(year) from calendar; > | 
2014 | > insert into calendar values (2016), (2017), (2018); > select max(year) from calendar; > | 2014 | > analyze table calendar compute statistics for columns; > select max(year) from calendar; > | 2018 | > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12640) Allow StatsOptimizer to optimize the query for Constant GroupBy keys
[ https://issues.apache.org/jira/browse/HIVE-12640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12640: - Attachment: (was: HIVE-12640.1.patch) > Allow StatsOptimizer to optimize the query for Constant GroupBy keys > - > > Key: HIVE-12640 > URL: https://issues.apache.org/jira/browse/HIVE-12640 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12640.1.patch, HIVE-12640.2.patch > > > {code} > hive> select count('1') from src group by '1'; > {code} > In the above query, while performing StatsOptimizer optimization we can > safely ignore the group by on the constant key '1' since the above query will > return the same result as "select count('1') from src". > Exception: > If src is empty, according to the SQL standard, > {code} > select count('1') from src group by '1' > {code} > and > {code} > select count('1') from src > {code} > should produce 1 and 0 rows respectively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12640) Allow StatsOptimizer to optimize the query for Constant GroupBy keys
[ https://issues.apache.org/jira/browse/HIVE-12640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12640: - Attachment: HIVE-12640.1.patch > Allow StatsOptimizer to optimize the query for Constant GroupBy keys > - > > Key: HIVE-12640 > URL: https://issues.apache.org/jira/browse/HIVE-12640 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12640.1.patch, HIVE-12640.2.patch > > > {code} > hive> select count('1') from src group by '1'; > {code} > In the above query, while performing StatsOptimizer optimization we can > safely ignore the group by on the constant key '1' since the above query will > return the same result as "select count('1') from src". > Exception: > If src is empty, according to the SQL standard, > {code} > select count('1') from src group by '1' > {code} > and > {code} > select count('1') from src > {code} > should produce 1 and 0 rows respectively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.
[ https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056419#comment-15056419 ] Hive QA commented on HIVE-12435: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777494/HIVE-12435.04.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 9897 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles org.apache.hive.spark.client.TestSparkClient.testCounters org.apache.hive.spark.client.TestSparkClient.testErrorJob org.apache.hive.spark.client.TestSparkClient.testJobSubmission org.apache.hive.spark.client.TestSparkClient.testMetricsCollection org.apache.hive.spark.client.TestSparkClient.testRemoteClient org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob org.apache.hive.spark.client.TestSparkClient.testSyncRpc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6348/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6348/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6348/ Messages: {noformat} Executing 
org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 16 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12777494 - PreCommit-HIVE-TRUNK-Build > SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and > vectorization is enabled. > -- > > Key: HIVE-12435 > URL: https://issues.apache.org/jira/browse/HIVE-12435 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.0.0 >Reporter: Takahiko Saito >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, > HIVE-12435.03.patch, HIVE-12435.04.patch > > > Run the following query: > {noformat} > create table count_case_groupby (key string, bool boolean) STORED AS orc; > insert into table count_case_groupby values ('key1', true),('key2', > false),('key3', NULL),('key4', false),('key5',NULL); > {noformat} > The table contains the following: > {noformat} > key1 true > key2 false > key3 NULL > key4 false > key5 NULL > {noformat} > The below query returns: > {noformat} > SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) > AS cnt_bool0_ok FROM count_case_groupby GROUP BY key; > key1 1 > key2 1 > key3 1 > key4 1 > key5 1 > {noformat} > while it expects the following results: > {noformat} > key1 1 > key2 1 > key3 0 > key4 1 > key5 0 > {noformat} > The query works with hive ver 1.2. Also it works when a table is not orc > format. > Also even if it's an orc table, when vectorization is disabled, the query > works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12528) don't start HS2 Tez sessions in a single thread
[ https://issues.apache.org/jira/browse/HIVE-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056441#comment-15056441 ] Siddharth Seth commented on HIVE-12528: --- NPE where? Some of the variables being set up may not be visible in the threads that make use of them. Making some of them final would be ideal. > don't start HS2 Tez sessions in a single thread > --- > > Key: HIVE-12528 > URL: https://issues.apache.org/jira/browse/HIVE-12528 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12528.patch > > > Starting sessions in parallel would improve the startup time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12640) Allow StatsOptimizer to optimize the query for Constant GroupBy keys
[ https://issues.apache.org/jira/browse/HIVE-12640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-12640: - Attachment: HIVE-12640.2.patch > Allow StatsOptimizer to optimize the query for Constant GroupBy keys > - > > Key: HIVE-12640 > URL: https://issues.apache.org/jira/browse/HIVE-12640 > Project: Hive > Issue Type: Bug >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-12640.1.patch, HIVE-12640.2.patch > > > {code} > hive> select count('1') from src group by '1'; > {code} > In the above query, while performing StatsOptimizer optimization we can > safely ignore the group by on the constant key '1' since the above query will > return the same result as "select count('1') from src". > Exception: > If src is empty, according to the SQL standard, > {code} > select count('1') from src group by '1' > {code} > and > {code} > select count('1') from src > {code} > should produce 1 and 0 rows respectively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10982) Customizable the value of java.sql.statement.setFetchSize in Hive JDBC Driver
[ https://issues.apache.org/jira/browse/HIVE-10982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056478#comment-15056478 ] Alan Gates commented on HIVE-10982: --- +1 > Customizable the value of java.sql.statement.setFetchSize in Hive JDBC Driver > -- > > Key: HIVE-10982 > URL: https://issues.apache.org/jira/browse/HIVE-10982 > Project: Hive > Issue Type: Improvement > Components: JDBC >Affects Versions: 1.2.0, 1.2.1 >Reporter: Bing Li >Assignee: Bing Li >Priority: Critical > Attachments: HIVE-10982.1.patch, HIVE-10982.2.patch > > > The current JDBC driver for Hive hard-code the value of setFetchSize to 50, > which will be a bottleneck for performance. > Pentaho filed this issue as http://jira.pentaho.com/browse/PDI-11511, whose > status is open. > Also it has discussion in > http://forums.pentaho.com/showthread.php?158381-Hive-JDBC-Query-too-slow-too-many-fetches-after-query-execution-Kettle-Xform > http://mail-archives.apache.org/mod_mbox/hive-user/201307.mbox/%3ccacq46vevgrfqg5rwxnr1psgyz7dcf07mvlo8mm2qit3anm1...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12668) package script for LLAP was broken by recent config changes
[ https://issues.apache.org/jira/browse/HIVE-12668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12668: Target Version/s: 2.0.0 > package script for LLAP was broken by recent config changes > --- > > Key: HIVE-12668 > URL: https://issues.apache.org/jira/browse/HIVE-12668 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > > I didn't realize that was part of Hive... the script needs to be updated. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.
[ https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056798#comment-15056798 ] Matt McCline commented on HIVE-12435: - Committed to master. > SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and > vectorization is enabled. > -- > > Key: HIVE-12435 > URL: https://issues.apache.org/jira/browse/HIVE-12435 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.0.0 >Reporter: Takahiko Saito >Assignee: Matt McCline >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, > HIVE-12435.03.patch, HIVE-12435.04.patch > > > Run the following query: > {noformat} > create table count_case_groupby (key string, bool boolean) STORED AS orc; > insert into table count_case_groupby values ('key1', true),('key2', > false),('key3', NULL),('key4', false),('key5',NULL); > {noformat} > The table contains the following: > {noformat} > key1 true > key2 false > key3 NULL > key4 false > key5 NULL > {noformat} > The below query returns: > {noformat} > SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) > AS cnt_bool0_ok FROM count_case_groupby GROUP BY key; > key1 1 > key2 1 > key3 1 > key4 1 > key5 1 > {noformat} > while it expects the following results: > {noformat} > key1 1 > key2 1 > key3 0 > key4 1 > key5 0 > {noformat} > The query works with hive ver 1.2. Also it works when a table is not orc > format. > Also even if it's an orc table, when vectorization is disabled, the query > works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12658) Task rejection by an llap daemon spams the log with RejectedExecutionExceptions
[ https://issues.apache.org/jira/browse/HIVE-12658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056815#comment-15056815 ] Siddharth Seth commented on HIVE-12658: --- Something along those lines. I think it'll be better to catch the RejectedExecutionException as early as possible - and set a status in SubmitWorkResponse, rather than leaving the logic up to LlapDaemonProtocolServerImpl - which is meant to be a proxy layer over the protocol. > Task rejection by an llap daemon spams the log with > RejectedExecutionExceptions > --- > > Key: HIVE-12658 > URL: https://issues.apache.org/jira/browse/HIVE-12658 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > > The execution queue throws a RejectedExecutionException - which is logged by > the hadoop IPC layer. > Instead of relying on an Exception in the protocol - move to sending back an > explicit response to indicate a rejected fragment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
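The idea discussed above - catching the RejectedExecutionException at the submit point and turning it into an explicit response status, so the hadoop IPC layer never logs the exception - can be sketched as follows. This is an illustration only; the class, enum, and method names here are hypothetical stand-ins, not the actual LLAP daemon API:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectDemo {
    // Stand-in for an explicit status carried in a response message
    // (the ticket's SubmitWorkResponse), instead of a thrown exception.
    enum SubmitResult { ACCEPTED, REJECTED }

    static SubmitResult submit(ExecutorService executor, Runnable work) {
        try {
            executor.submit(work);
            return SubmitResult.ACCEPTED;
        } catch (RejectedExecutionException e) {
            // Caught right where the queue rejects the fragment: the caller
            // receives a status, and no exception propagates to the RPC layer
            // where it would be logged on every rejection.
            return SubmitResult.REJECTED;
        }
    }

    public static void main(String[] args) {
        // One worker thread and a zero-capacity hand-off queue: while the
        // first task occupies the worker, a second submit is rejected.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new SynchronousQueue<>());
        CountDownLatch block = new CountDownLatch(1);
        System.out.println(submit(executor, () -> {
            try { block.await(); } catch (InterruptedException ignored) { }
        }));
        System.out.println(submit(executor, () -> { }));
        block.countDown();
        executor.shutdown();
    }
}
```

The design point is that rejection is an expected overload signal, not an error: modeling it as a response status keeps the proxy layer (LlapDaemonProtocolServerImpl in the discussion) free of queueing logic.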
[jira] [Commented] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.
[ https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056818#comment-15056818 ] Matt McCline commented on HIVE-12435: - Committed to branch-1 > SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and > vectorization is enabled. > -- > > Key: HIVE-12435 > URL: https://issues.apache.org/jira/browse/HIVE-12435 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.0.0 >Reporter: Takahiko Saito >Assignee: Matt McCline >Priority: Critical > Fix For: 1.3.0, 2.1.0 > > Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, > HIVE-12435.03.patch, HIVE-12435.04.patch > > > Run the following query: > {noformat} > create table count_case_groupby (key string, bool boolean) STORED AS orc; > insert into table count_case_groupby values ('key1', true),('key2', > false),('key3', NULL),('key4', false),('key5',NULL); > {noformat} > The table contains the following: > {noformat} > key1 true > key2 false > key3 NULL > key4 false > key5 NULL > {noformat} > The below query returns: > {noformat} > SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) > AS cnt_bool0_ok FROM count_case_groupby GROUP BY key; > key1 1 > key2 1 > key3 1 > key4 1 > key5 1 > {noformat} > while it expects the following results: > {noformat} > key1 1 > key2 1 > key3 0 > key4 1 > key5 0 > {noformat} > The query works with hive ver 1.2. Also it works when a table is not orc > format. > Also even if it's an orc table, when vectorization is disabled, the query > works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-12658) Task rejection by an llap daemon spams the log with RejectedExecutionExceptions
[ https://issues.apache.org/jira/browse/HIVE-12658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran reassigned HIVE-12658: Assignee: Prasanth Jayachandran (was: Siddharth Seth) > Task rejection by an llap daemon spams the log with > RejectedExecutionExceptions > --- > > Key: HIVE-12658 > URL: https://issues.apache.org/jira/browse/HIVE-12658 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth >Assignee: Prasanth Jayachandran > > The execution queue throws a RejectedExecutionException - which is logged by > the hadoop IPC layer. > Instead of relying on an Exception in the protocol - move to sending back an > explicit response to indicate a rejected fragment. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12640) Allow StatsOptimizer to optimize the query for Constant GroupBy keys
[ https://issues.apache.org/jira/browse/HIVE-12640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15056866#comment-15056866 ] Hive QA commented on HIVE-12640: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12777529/HIVE-12640.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 912 failed/errored test(s), 9882 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_partition_diff_num_cols.q-tez_joins_explain.q-vector_decimal_aggregate.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_allcolref_in_udf org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_change_col org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_cascade org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_table_invalidate_column_stats org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ambiguitycheck org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_excludeHadoop20 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_1_sql_std org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_admin_almighty2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_cli_nonsql org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_update org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_update_own_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_reordering_values org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_partitioned_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_evolution_native org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avrocountemptytbl org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binary_constant org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_groupby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketpruning1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_2 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cast1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cast_tinyint_to_double org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cast_to_int org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_gby_empty org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_join org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_cross_product_check_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_gby_empty
[jira] [Updated] (HIVE-12435) SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and vectorization is enabled.
[ https://issues.apache.org/jira/browse/HIVE-12435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-12435: Fix Version/s: 2.1.0 > SELECT COUNT(CASE WHEN...) GROUPBY returns 1 for 'NULL' in a case of ORC and > vectorization is enabled. > -- > > Key: HIVE-12435 > URL: https://issues.apache.org/jira/browse/HIVE-12435 > Project: Hive > Issue Type: Bug > Components: Vectorization >Affects Versions: 2.0.0 >Reporter: Takahiko Saito >Assignee: Matt McCline >Priority: Critical > Fix For: 2.1.0 > > Attachments: HIVE-12435.01.patch, HIVE-12435.02.patch, > HIVE-12435.03.patch, HIVE-12435.04.patch > > > Run the following query: > {noformat} > create table count_case_groupby (key string, bool boolean) STORED AS orc; > insert into table count_case_groupby values ('key1', true),('key2', > false),('key3', NULL),('key4', false),('key5',NULL); > {noformat} > The table contains the following: > {noformat} > key1 true > key2 false > key3 NULL > key4 false > key5 NULL > {noformat} > The below query returns: > {noformat} > SELECT key, COUNT(CASE WHEN bool THEN 1 WHEN NOT bool THEN 0 ELSE NULL END) > AS cnt_bool0_ok FROM count_case_groupby GROUP BY key; > key1 1 > key2 1 > key3 1 > key4 1 > key5 1 > {noformat} > while it expects the following results: > {noformat} > key1 1 > key2 1 > key3 0 > key4 1 > key5 0 > {noformat} > The query works with hive ver 1.2. Also it works when a table is not orc > format. > Also even if it's an orc table, when vectorization is disabled, the query > works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)