[jira] [Updated] (HIVE-9069) Simplify filter predicates for CBO
[ https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9069:
--
Attachment: HIVE-9069.12.patch

Simplify filter predicates for CBO
--
Key: HIVE-9069
URL: https://issues.apache.org/jira/browse/HIVE-9069
Project: Hive
Issue Type: Bug
Components: CBO
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Jesus Camacho Rodriguez
Fix For: 0.14.1
Attachments: HIVE-9069.01.patch, HIVE-9069.02.patch, HIVE-9069.03.patch, HIVE-9069.04.patch, HIVE-9069.05.patch, HIVE-9069.06.patch, HIVE-9069.07.patch, HIVE-9069.08.patch, HIVE-9069.08.patch, HIVE-9069.09.patch, HIVE-9069.10.patch, HIVE-9069.11.patch, HIVE-9069.12.patch, HIVE-9069.patch

Simplify disjunctive predicates so that they can get pushed down to the scan. Looks like this is still an issue; some of the filters can be pushed down to the scan.
{code}
set hive.cbo.enable=true;
set hive.stats.fetch.column.stats=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.tez.auto.reducer.parallelism=true;
set hive.auto.convert.join.noconditionaltask.size=32000;
set hive.exec.reducers.bytes.per.reducer=1;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager;
set hive.support.concurrency=false;
set hive.tez.exec.print.summary=true;

explain
select substr(r_reason_desc,1,20) as r
      ,avg(ws_quantity) wq
      ,avg(wr_refunded_cash) ref
      ,avg(wr_fee) fee
from web_sales, web_returns, web_page, customer_demographics cd1,
     customer_demographics cd2, customer_address, date_dim, reason
where web_sales.ws_web_page_sk = web_page.wp_web_page_sk
  and web_sales.ws_item_sk = web_returns.wr_item_sk
  and web_sales.ws_order_number = web_returns.wr_order_number
  and web_sales.ws_sold_date_sk = date_dim.d_date_sk and d_year = 1998
  and cd1.cd_demo_sk = web_returns.wr_refunded_cdemo_sk
  and cd2.cd_demo_sk = web_returns.wr_returning_cdemo_sk
  and customer_address.ca_address_sk = web_returns.wr_refunded_addr_sk
  and reason.r_reason_sk =
web_returns.wr_reason_sk
  and ( ( cd1.cd_marital_status = 'M' and cd1.cd_marital_status = cd2.cd_marital_status and cd1.cd_education_status = '4 yr Degree' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 100.00 and 150.00 )
     or ( cd1.cd_marital_status = 'D' and cd1.cd_marital_status = cd2.cd_marital_status and cd1.cd_education_status = 'Primary' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 50.00 and 100.00 )
     or ( cd1.cd_marital_status = 'U' and cd1.cd_marital_status = cd2.cd_marital_status and cd1.cd_education_status = 'Advanced Degree' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 150.00 and 200.00 ) )
  and ( ( ca_country = 'United States' and ca_state in ('KY', 'GA', 'NM') and ws_net_profit between 100 and 200 )
     or ( ca_country = 'United States' and ca_state in ('MT', 'OR', 'IN') and ws_net_profit between 150 and 300 )
     or ( ca_country = 'United States' and ca_state in ('WI', 'MO', 'WV') and ws_net_profit between 50 and 250 ) )
group by r_reason_desc
order by r, wq, ref, fee
limit 100
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1
STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Map 9 <- Map 1 (BROADCAST_EDGE)
        Reducer 3 <- Map 13 (SIMPLE_EDGE), Map 2 (SIMPLE_EDGE)
        Reducer 4 <- Map 9 (SIMPLE_EDGE), Reducer 3 (SIMPLE_EDGE)
        Reducer 5 <- Map 14 (SIMPLE_EDGE), Reducer 4 (SIMPLE_EDGE)
        Reducer 6 <- Map 10 (SIMPLE_EDGE), Map 11 (BROADCAST_EDGE), Map 12 (BROADCAST_EDGE), Reducer 5 (SIMPLE_EDGE)
        Reducer 7 <- Reducer 6 (SIMPLE_EDGE)
        Reducer 8 <- Reducer 7 (SIMPLE_EDGE)
      DagName: mmokhtar_2014161818_f5fd23ba-d783-4b13-8507-7faa65851798:1
      Vertices:
        Map 1
          Map Operator Tree:
              TableScan
                alias: web_page
                filterExpr: wp_web_page_sk is not null (type: boolean)
                Statistics: Num rows: 4602 Data size: 2696178 Basic stats: COMPLETE Column stats: COMPLETE
                Filter Operator
                  predicate: wp_web_page_sk is not null (type: boolean)
                  Statistics: Num rows: 4602 Data size: 18408 Basic stats: COMPLETE Column stats: COMPLETE
                  Select Operator
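The transformation this issue asks for can be pictured outside Hive: conjuncts shared by every branch of an OR can be factored out in front of the disjunction, and only the factored part is pushable to the TableScan. A minimal Python sketch of the idea (illustrative only; Hive's actual rewrite is Java code in the CBO layer, and predicates there are expression trees, not strings):

```python
# Illustrative sketch, not Hive's implementation: factor conjuncts that are
# common to every branch of an OR out of the disjunction, so the common part
# can be pushed down to the table scan on its own.

def factor_common_conjuncts(disjuncts):
    """disjuncts: list of sets of atomic predicates. Each inner set is ANDed;
    the outer list is ORed. Returns (common, residual_disjuncts)."""
    common = set.intersection(*disjuncts)      # conjuncts present in every branch
    residual = [d - common for d in disjuncts]  # what remains under the OR
    return common, residual

# the ca_country/ca_state disjunction from the query above, as strings
branches = [
    {"ca_country = 'United States'", "ca_state in ('KY','GA','NM')"},
    {"ca_country = 'United States'", "ca_state in ('MT','OR','IN')"},
    {"ca_country = 'United States'", "ca_state in ('WI','MO','WV')"},
]
common, residual = factor_common_conjuncts(branches)
# common holds the pushable conjunct ca_country = 'United States'
```

With the common conjunct factored out, the scan can filter on it directly even though the rest of the predicate stays disjunctive.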
[jira] [Commented] (HIVE-10809) HCat FileOutputCommitterContainer leaves behind empty _SCRATCH directories
[ https://issues.apache.org/jira/browse/HIVE-10809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557543#comment-14557543 ] Hive QA commented on HIVE-10809:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12734987/HIVE-10809.1.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 8973 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4021/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4021/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4021/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12734987 - PreCommit-HIVE-TRUNK-Build

HCat FileOutputCommitterContainer leaves behind empty _SCRATCH directories
--
Key: HIVE-10809
URL: https://issues.apache.org/jira/browse/HIVE-10809
Project: Hive
Issue Type: Bug
Components: HCatalog
Affects Versions: 1.2.0
Reporter: Selina Zhang
Assignee: Selina Zhang
Attachments: HIVE-10809.1.patch

When a static partition is added through HCatStorer or HCatWriter
{code}
JoinedData = LOAD '/user/selinaz/data/part-r-0' USING JsonLoader();
STORE JoinedData INTO 'selina.joined_events_e' USING org.apache.hive.hcatalog.pig.HCatStorer('author=selina');
{code}
the table directory looks like
{noformat}
drwx------   - selinaz users          0 2015-05-22 21:19 /user/selinaz/joined_events_e/_SCRATCH0.9157208938193798
drwx------   - selinaz users          0 2015-05-22 21:19 /user/selinaz/joined_events_e/author=selina
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
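For context, the cleanup this issue wants can be sketched independently of HCatalog: after the committer publishes the partition data, any now-empty `_SCRATCH*` leftovers under the table directory should be swept away. A hedged Python sketch (the helper name and local-filesystem layout are illustrative; the real FileOutputCommitterContainer works against HDFS in Java):

```python
# Illustrative sketch, not the actual HCatalog committer logic: delete empty
# _SCRATCH* directories left under a table directory after a commit.
import os
import tempfile

def remove_empty_scratch_dirs(table_dir):
    removed = []
    for name in sorted(os.listdir(table_dir)):
        path = os.path.join(table_dir, name)
        # only empty scratch directories are touched; real data is left alone
        if name.startswith("_SCRATCH") and os.path.isdir(path) and not os.listdir(path):
            os.rmdir(path)
            removed.append(name)
    return removed

# demo layout mimicking the report above
table = tempfile.mkdtemp()
os.mkdir(os.path.join(table, "_SCRATCH0.9157208938193798"))
os.makedirs(os.path.join(table, "author=selina"))
open(os.path.join(table, "author=selina", "part-r-0"), "w").close()
removed = remove_empty_scratch_dirs(table)
```

The partition directory `author=selina` is non-empty and survives; only the empty scratch directory is removed.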
[jira] [Commented] (HIVE-10684) Fix the unit test failures for HIVE-7553 after HIVE-10674 removed the binary jar files
[ https://issues.apache.org/jira/browse/HIVE-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557495#comment-14557495 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-10684:
--
[~Ferd] Is it possible to introduce Ant tasks instead of a bash script? Can you see whether this is possible via the tasks described at https://ant.apache.org/manual/Tasks/move.html ; I believe this should support portability as well. If the above method is not possible, it should be possible to write an equivalent Windows script. But before the Windows equivalent script can be checked in, it has to be tested on Windows, and you would need an OS-based Maven profile for this purpose. That can be done as a separate jira; for the purpose of this jira, you can introduce a separate profile through which the end user can provide a Maven option (using -P or -D) to skip executing the bash script on Windows, so that the Maven command does not fail there.

Thanks
Hari

Fix the unit test failures for HIVE-7553 after HIVE-10674 removed the binary jar files
--
Key: HIVE-10684
URL: https://issues.apache.org/jira/browse/HIVE-10684
Project: Hive
Issue Type: Bug
Components: Tests
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
Attachments: HIVE-10684.patch

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10677) hive.exec.parallel=true has problem when it is used for analyze table column stats
[ https://issues.apache.org/jira/browse/HIVE-10677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10677:
Affects Version/s: 1.1.0, 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.0

hive.exec.parallel=true has problem when it is used for analyze table column stats
--
Key: HIVE-10677
URL: https://issues.apache.org/jira/browse/HIVE-10677
Project: Hive
Issue Type: Bug
Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
Fix For: 1.2.1
Attachments: HIVE-10677.01.patch, HIVE-10677.02.patch

To reproduce it, in q tests:
{code}
hive> set hive.exec.parallel;
hive.exec.parallel=true
hive> analyze table src compute statistics for columns;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.ColumnStatsTask
java.lang.RuntimeException: Error caching map.xml: java.io.IOException: java.lang.InterruptedException
    at org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:747)
    at org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:682)
    at org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:674)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:375)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:75)
Caused by: java.io.IOException: java.lang.InterruptedException
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:541)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:791)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:774)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:646)
    at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:472)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:460)
    at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:426)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:784)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:773)
    at org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:715)
    ... 7 more
hive>
Job Submission failed with exception 'java.lang.RuntimeException(Error caching map.xml: java.io.IOException: java.lang.InterruptedException)'
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
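To make the race concrete: the InterruptedException above surfaces when parallel task threads serialize plan state to a shared cached path. A toy Python sketch of one way to avoid the collision (this is an illustration with made-up names, not the actual Hive fix): give each task its own plan file instead of one shared map.xml.

```python
# Illustrative sketch of the task-scoped-path idea, not Hive's code: two
# parallel tasks writing their serialized plans to distinct files cannot
# clobber or interrupt each other the way a single shared map.xml can.
import os
import tempfile
import threading

def write_plan(scratch_dir, task_id, plan_bytes):
    # task-scoped file name instead of a single shared map.xml
    path = os.path.join(scratch_dir, "%s-map.xml" % task_id)
    with open(path, "wb") as f:
        f.write(plan_bytes)
    return path

scratch = tempfile.mkdtemp()
threads = [threading.Thread(target=write_plan, args=(scratch, "task%d" % i, b"<plan/>"))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Both writes land intact because the threads never touch the same path.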
[jira] [Commented] (HIVE-10809) HCat FileOutputCommitterContainer leaves behind empty _SCRATCH directories
[ https://issues.apache.org/jira/browse/HIVE-10809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557498#comment-14557498 ] Selina Zhang commented on HIVE-10809:
-
Thanks! Will do!

HCat FileOutputCommitterContainer leaves behind empty _SCRATCH directories
--
Key: HIVE-10809
URL: https://issues.apache.org/jira/browse/HIVE-10809
Project: Hive
Issue Type: Bug
Components: HCatalog
Affects Versions: 1.2.0
Reporter: Selina Zhang
Assignee: Selina Zhang
Attachments: HIVE-10809.1.patch

When a static partition is added through HCatStorer or HCatWriter
{code}
JoinedData = LOAD '/user/selinaz/data/part-r-0' USING JsonLoader();
STORE JoinedData INTO 'selina.joined_events_e' USING org.apache.hive.hcatalog.pig.HCatStorer('author=selina');
{code}
the table directory looks like
{noformat}
drwx------   - selinaz users          0 2015-05-22 21:19 /user/selinaz/joined_events_e/_SCRATCH0.9157208938193798
drwx------   - selinaz users          0 2015-05-22 21:19 /user/selinaz/joined_events_e/author=selina
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10808) Inner join on Null throwing Cast Exception
[ https://issues.apache.org/jira/browse/HIVE-10808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557507#comment-14557507 ] Hive QA commented on HIVE-10808:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12734978/HIVE-10808.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8972 tests executed

*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4020/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4020/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4020/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12734978 - PreCommit-HIVE-TRUNK-Build

Inner join on Null throwing Cast Exception
--
Key: HIVE-10808
URL: https://issues.apache.org/jira/browse/HIVE-10808
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 0.13.1
Reporter: Naveen Gangam
Assignee: Naveen Gangam
Priority: Critical
Attachments: HIVE-10808.patch

{code}
select a.col1, a.col2, a.col3, a.col4
from tab1 a
inner join ( select max(x) as x from tab1 where x > 20130327 ) r on a.x = r.x
where a.col1 = 'F' and a.col3 in ('A', 'S', 'G');
{code}
Failed Task log snippet:
{noformat}
2015-05-18 19:22:17,372 INFO [main] org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring retrieval request: __MAP_PLAN__
2015-05-18 19:22:17,372 INFO [main] org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring cache key: __MAP_PLAN__
2015-05-18 19:22:17,457 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
    ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
    ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
    ... 17 more
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:157)
    ... 22 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.NullStructSerDe$NullStructSerDeObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
    at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:334) at
[jira] [Resolved] (HIVE-10107) Union All : Vertex missing stats resulting in OOM and inefficient plans
[ https://issues.apache.org/jira/browse/HIVE-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong resolved HIVE-10107.
Resolution: Fixed

Resolved following HIVE-8769.

Union All : Vertex missing stats resulting in OOM and inefficient plans
Key: HIVE-10107
URL: https://issues.apache.org/jira/browse/HIVE-10107
Project: Hive
Issue Type: Bug
Components: Physical Optimizer
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Pengcheng Xiong

Reducer vertices sending data to a Union All edge are missing statistics, and as a result we either use very few reducers in the UNION ALL edge or decide to broadcast the results of the UNION ALL.

Query
{code}
select count(*) rowcount
from (select ss_item_sk, ss_ticket_number, ss_store_sk
      from store_sales a, store_returns b
      where a.ss_item_sk = b.sr_item_sk and a.ss_ticket_number = b.sr_ticket_number
      union all
      select ss_item_sk, ss_ticket_number, ss_store_sk
      from store_sales c, store_returns d
      where c.ss_item_sk = d.sr_item_sk and c.ss_ticket_number = d.sr_ticket_number) t
group by t.ss_store_sk, t.ss_item_sk, t.ss_ticket_number
having rowcount > 1;
{code}

Plan snippet
{code}
Edges:
  Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 5 (SIMPLE_EDGE), Union 3 (CONTAINS)
  Reducer 4 <- Union 3 (SIMPLE_EDGE)
  Reducer 7 <- Map 6 (SIMPLE_EDGE), Map 8 (SIMPLE_EDGE), Union 3 (CONTAINS)
Reducer 4
  Reduce Operator Tree:
    Group By Operator
      aggregations: count(VALUE._col0)
      keys: KEY._col0 (type: int), KEY._col1 (type: int), KEY._col2 (type: int)
      mode: mergepartial
      outputColumnNames: _col0, _col1, _col2, _col3
      Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
      Filter Operator
        predicate: (_col3 > 1) (type: boolean)
        Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: COMPLETE
        Select Operator
          expressions: _col3 (type: bigint)
          outputColumnNames: _col0
          Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: COMPLETE
          File Output Operator
            compressed: false
            Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: COMPLETE
            table:
                input format: org.apache.hadoop.mapred.TextInputFormat
                output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Reducer 7
  Reduce Operator Tree:
    Merge Join Operator
      condition map:
           Inner Join 0 to 1
      keys:
        0 ss_item_sk (type: int), ss_ticket_number (type: int)
        1 sr_item_sk (type: int), sr_ticket_number (type: int)
      outputColumnNames: _col1, _col6, _col8, _col27, _col34
      Filter Operator
        predicate: ((_col1 = _col27) and (_col8 = _col34)) (type: boolean)
        Select Operator
          expressions: _col1 (type: int), _col8 (type: int), _col6 (type: int)
          outputColumnNames: _col0, _col1, _col2
          Group By Operator
            aggregations: count()
            keys: _col2 (type: int), _col0 (type: int), _col1 (type: int)
            mode: hash
            outputColumnNames: _col0, _col1, _col2, _col3
            Reduce Output Operator
              key expressions: _col0 (type: int), _col1 (type: int), _col2 (type: int)
              sort order: +++
              Map-reduce partition columns: _col0 (type: int), _col1 (type: int), _col2 (type: int)
              value expressions: _col3 (type: bigint)
{code}

The full explain plan
{code}
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1
STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 5 (SIMPLE_EDGE), Union 3 (CONTAINS)
        Reducer 4 <- Union 3 (SIMPLE_EDGE)
        Reducer 7 <- Map 6 (SIMPLE_EDGE), Map 8 (SIMPLE_EDGE), Union 3 (CONTAINS)
      DagName: mmokhtar_20150214132727_95878ea1-ee6a-4b7e-bc86-843abd5cf664:7
      Vertices:
        Map 1
          Map Operator Tree:
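The missing-stats symptom above ("Num rows: 0" downstream of the union) can be summarized with a tiny stand-in, purely illustrative and not Hive's annotation code: a union-all vertex's row estimate should be the sum of its inputs' estimates, never zero, so downstream reducer parallelism and join strategy are sized sensibly.

```python
# Illustrative sketch only: aggregate child vertex estimates for a union-all
# vertex instead of reporting zero rows.

def union_stats(child_stats):
    """child_stats: list of {"rows": int, "size": int} per input vertex."""
    rows = sum(c["rows"] for c in child_stats)
    size = sum(c["size"] for c in child_stats)
    # clamp to at least one row so downstream estimates never collapse to 0
    return {"rows": max(rows, 1), "size": size}
```

With both join branches contributing, the union reports their combined cardinality rather than the zero that led to too few reducers or an unwanted broadcast.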
[jira] [Commented] (HIVE-10812) Scaling PK/FK's selectivity for stats annotation
[ https://issues.apache.org/jira/browse/HIVE-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557530#comment-14557530 ] Pengcheng Xiong commented on HIVE-10812:
[~jpullokkaran], [~ashutoshc] and [~mmokhtar], we will address the PK/FK selectivity scaling problem in this patch. It will also address [~ashutoshc]'s previous comments regarding the SerDe.

Scaling PK/FK's selectivity for stats annotation
Key: HIVE-10812
URL: https://issues.apache.org/jira/browse/HIVE-10812
Project: Hive
Issue Type: Bug
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong

Right now, the computation of the selectivity of the FK side based on the PK side does not take into consideration the range of the FK and the range of the PK.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
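The range-aware scaling the description asks for can be illustrated with a toy formula; this is an assumption for illustration, not the formula the patch implements. The classic PK/FK estimate scales FK rows by the fraction of PK rows surviving the PK-side filter; the refinement additionally scales by how much of the FK key range the PK key range actually overlaps.

```python
# Illustrative sketch only (not Hive's stats-annotation code): scale the
# classic PK/FK selectivity by the overlap of the two key ranges.

def fk_selectivity(pk_filtered, pk_total, pk_range, fk_range):
    """pk_filtered/pk_total: PK rows after/before its local filter.
    pk_range, fk_range: (lo, hi) value ranges of the join keys."""
    base = pk_filtered / pk_total            # classic PK/FK selectivity
    lo = max(pk_range[0], fk_range[0])       # intersection of key ranges
    hi = min(pk_range[1], fk_range[1])
    overlap = max(0.0, hi - lo) / (fk_range[1] - fk_range[0])
    return base * min(1.0, overlap)
```

For example, if the PK covers only half of the FK's key range, the naive estimate is halved again, which is exactly the kind of correction the issue describes.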
[jira] [Commented] (HIVE-10623) Implement hive cli options using beeline functionality
[ https://issues.apache.org/jira/browse/HIVE-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557566#comment-14557566 ] Lefty Leverenz commented on HIVE-10623: --- Thanks [~xuefuz], I added a link. Implement hive cli options using beeline functionality -- Key: HIVE-10623 URL: https://issues.apache.org/jira/browse/HIVE-10623 Project: Hive Issue Type: Sub-task Components: CLI Reporter: Ferdinand Xu Assignee: Ferdinand Xu Fix For: beeline-cli-branch Attachments: HIVE-10623.1.patch, HIVE-10623.2.patch, HIVE-10623.3.patch, HIVE-10623.4.patch, HIVE-10623.patch We need to support the original hive cli options for the purpose of backwards compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10812) Scaling PK/FK's selectivity for stats annotation
[ https://issues.apache.org/jira/browse/HIVE-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-10812:
---
Attachment: HIVE-10812.01.patch

Scaling PK/FK's selectivity for stats annotation
Key: HIVE-10812
URL: https://issues.apache.org/jira/browse/HIVE-10812
Project: Hive
Issue Type: Bug
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
Attachments: HIVE-10812.01.patch

Right now, the computation of the selectivity of the FK side based on the PK side does not take into consideration the range of the FK and the range of the PK.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10805) OOM in vectorized reduce
[ https://issues.apache.org/jira/browse/HIVE-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557602#comment-14557602 ] Hive QA commented on HIVE-10805:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12734959/HIVE-10805.01.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 8972 tests executed

*Failed tests:*
{noformat}
TestHs2Hooks - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4024/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4024/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4024/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12734959 - PreCommit-HIVE-TRUNK-Build

OOM in vectorized reduce
Key: HIVE-10805
URL: https://issues.apache.org/jira/browse/HIVE-10805
Project: Hive
Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Blocker
Fix For: 1.2.1
Attachments: HIVE-10805.01.patch

Vectorized reduce does not release scratch byte space in BytesColumnVectors and runs out of memory.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
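The leak pattern named in the description — scratch byte space that only grows across batches and is never released — can be shown with a small Python stand-in. BytesColumnVector itself is Java; this class, its capacity constant, and its reset policy are illustrative assumptions, not the patch's actual behavior.

```python
# Illustrative stand-in for a grow-only scratch buffer: without the reset(),
# the largest batch ever seen pins that much memory for the life of the task.

class ScratchBytes:
    INITIAL = 1024  # illustrative starting capacity

    def __init__(self):
        self.buf = bytearray(self.INITIAL)
        self.used = 0

    def add(self, data):
        if self.used + len(data) > len(self.buf):
            # grow the scratch space to fit the incoming value
            self.buf.extend(bytearray(max(len(data), len(self.buf))))
        self.buf[self.used:self.used + len(data)] = data
        self.used += len(data)

    def reset(self):
        # release oversized scratch space between batches instead of
        # keeping the high-water-mark allocation forever
        if len(self.buf) > self.INITIAL:
            self.buf = bytearray(self.INITIAL)
        self.used = 0
```

Calling reset() between batches bounds the footprint at one batch's worth of data plus the small initial capacity.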
[jira] [Resolved] (HIVE-9072) Swap the main MiniDrivers and the Vectorized MiniDrivers so Vectorization is on by default and Non-Vectorized tests are still executed
[ https://issues.apache.org/jira/browse/HIVE-9072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline resolved HIVE-9072.
Resolution: Won't Fix

Try different approach later.

Swap the main MiniDrivers and the Vectorized MiniDrivers so Vectorization is on by default and Non-Vectorized tests are still executed
--
Key: HIVE-9072
URL: https://issues.apache.org/jira/browse/HIVE-9072
Project: Hive
Issue Type: Sub-task
Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical

The 3rd and final step in turning Vectorization on by default. Swap the MiniDrivers so Vectorization is on by default in the main MiniDrivers (TestCliDriver, TestNegativeCliDriver, TestMiniTezCliDriver) and run as Non-Vectorized in newly named MiniDrivers (TestNonVecCliDriver, TestNonVecNegativeCliDriver, TestNonVecMiniTezCliDriver).

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-9071) Enhance Vectorization to read non-ORC tables for better test coverage
[ https://issues.apache.org/jira/browse/HIVE-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline resolved HIVE-9071. Resolution: Won't Fix Try different approach later. Enhance Vectorization to read non-ORC tables for better test coverage - Key: HIVE-9071 URL: https://issues.apache.org/jira/browse/HIVE-9071 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical By enabling Vectorization to read non-ORC tables, many more query unit tests will execute in vectorization mode. This is the 2nd step towards turning Vectorization on by default. Goal: 80% of MapWork tasks vectorize in new MiniDrivers (TestVecCliDriver, TestVecNegativeCliDriver, TestVecMiniTezCliDriver). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9071) Enhance Vectorization to read non-ORC tables for better test coverage
[ https://issues.apache.org/jira/browse/HIVE-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-9071: --- Issue Type: Bug (was: Sub-task) Parent: (was: HIVE-5538) Enhance Vectorization to read non-ORC tables for better test coverage - Key: HIVE-9071 URL: https://issues.apache.org/jira/browse/HIVE-9071 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical By enabling Vectorization to read non-ORC tables, many more query unit tests will execute in vectorization mode. This is the 2nd step towards turning Vectorization on by default. Goal: 80% of MapWork tasks vectorize in new MiniDrivers (TestVecCliDriver, TestVecNegativeCliDriver, TestVecMiniTezCliDriver). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10789) union distinct query with NULL constant on both the sides throws Unsuported vector output type: void error
[ https://issues.apache.org/jira/browse/HIVE-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557197#comment-14557197 ] Matt McCline commented on HIVE-10789: - Committed to trunk and 1.2 branch. union distinct query with NULL constant on both the sides throws Unsuported vector output type: void error Key: HIVE-10789 URL: https://issues.apache.org/jira/browse/HIVE-10789 Project: Hive Issue Type: Bug Components: Hive Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Fix For: 1.2.1 Attachments: HIVE-10789.01.patch A NULL expression in the SELECT projection list causes exception to be thrown instead of not vectorizing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9070) Add new MiniDrivers to test Vectorization
[ https://issues.apache.org/jira/browse/HIVE-9070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-9070: --- Issue Type: Bug (was: Sub-task) Parent: (was: HIVE-5538) Add new MiniDrivers to test Vectorization - Key: HIVE-9070 URL: https://issues.apache.org/jira/browse/HIVE-9070 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Matt McCline Assignee: Matt McCline Priority: Critical Begin the multi-step move towards having Vectorization being on by default by creating a new set of MiniDrivers that run the query unit tests with vectorization turned on. New drivers: TestVecCliDriver TestVecNegativeCliDriver TestVecMiniTezCliDriver -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8769) Physical optimizer : Incorrect CE results in a shuffle join instead of a Map join (PK/FK pattern not detected)
[ https://issues.apache.org/jira/browse/HIVE-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-8769:
--
Attachment: HIVE-8769.05.patch

[~jpullokkaran], according to your comments, I have rebased the patch.

Physical optimizer : Incorrect CE results in a shuffle join instead of a Map join (PK/FK pattern not detected)
--
Key: HIVE-8769
URL: https://issues.apache.org/jira/browse/HIVE-8769
Project: Hive
Issue Type: Bug
Components: Physical Optimizer
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Pengcheng Xiong
Attachments: HIVE-8769.01.patch, HIVE-8769.02.patch, HIVE-8769.03.patch, HIVE-8769.04.patch, HIVE-8769.05.patch

TPC-DS Q82 is running slower than Hive 13 because the join type is not correct. The estimate for item x inventory x date_dim is 227 million rows while the actual is 3K rows. Hive 13 finishes in 753 seconds; Hive 14 finishes in 1,267 seconds; Hive 14 + forced map join finishes in 431 seconds.

Query
{code}
select i_item_id
      ,i_item_desc
      ,i_current_price
from item, inventory, date_dim, store_sales
where i_current_price between 30 and 30+30
  and inv_item_sk = i_item_sk
  and d_date_sk = inv_date_sk
  and d_date between '2002-05-30' and '2002-07-30'
  and i_manufact_id in (437,129,727,663)
  and inv_quantity_on_hand between 100 and 500
  and ss_item_sk = i_item_sk
group by i_item_id, i_item_desc, i_current_price
order by i_item_id
limit 100
{code}

Plan
{code}
STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Map 7 <- Map 1 (BROADCAST_EDGE), Map 2 (BROADCAST_EDGE)
        Reducer 4 <- Map 3 (SIMPLE_EDGE), Map 7 (SIMPLE_EDGE)
        Reducer 5 <- Reducer 4 (SIMPLE_EDGE)
        Reducer 6 <- Reducer 5 (SIMPLE_EDGE)
      DagName: mmokhtar_20141106005353_7a2eb8df-12ff-4fe9-89b4-30f1e4e3fb90:1
      Vertices:
        Map 1
          Map Operator Tree:
              TableScan
                alias: item
                filterExpr: ((i_current_price BETWEEN 30 AND 60 and (i_manufact_id) IN (437, 129, 727, 663)) and i_item_sk is not null) (type: boolean)
                Statistics: Num rows: 462000 Data size: 663862160 Basic stats: COMPLETE
Column stats: COMPLETE Filter Operator predicate: ((i_current_price BETWEEN 30 AND 60 and (i_manufact_id) IN (437, 129, 727, 663)) and i_item_sk is not null) (type: boolean) Statistics: Num rows: 115500 Data size: 34185680 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: i_item_sk (type: int), i_item_id (type: string), i_item_desc (type: string), i_current_price (type: float) outputColumnNames: _col0, _col1, _col2, _col3 Statistics: Num rows: 115500 Data size: 33724832 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 115500 Data size: 33724832 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col1 (type: string), _col2 (type: string), _col3 (type: float) Execution mode: vectorized Map 2 Map Operator Tree: TableScan alias: date_dim filterExpr: (d_date BETWEEN '2002-05-30' AND '2002-07-30' and d_date_sk is not null) (type: boolean) Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (d_date BETWEEN '2002-05-30' AND '2002-07-30' and d_date_sk is not null) (type: boolean) Statistics: Num rows: 36524 Data size: 3579352 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: d_date_sk (type: int) outputColumnNames: _col0 Statistics: Num rows: 36524 Data size: 146096 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 36524 Data size: 146096 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: _col0 (type: int)
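The cardinality-estimation gap described above (227 million estimated vs. 3K actual) comes from not detecting the PK/FK pattern. A toy Python estimator, an illustration of the idea rather than Hive's actual estimator, shows why detection matters: once one side looks like a primary key, the join output is bounded by the FK side scaled by the PK side's filter selectivity.

```python
# Illustrative sketch only, not Hive's physical-optimizer code.

def join_estimate(fk_rows, pk_rows, pk_total, pk_ndv):
    """fk_rows: rows on the (presumed) FK side entering the join.
    pk_rows/pk_total: PK-side rows after/before its local filter.
    pk_ndv: distinct join-key values on the PK side."""
    is_pk = pk_ndv >= 0.95 * pk_total        # heuristic PK detection
    if is_pk:
        # PK/FK pattern: each FK row matches at most one PK row,
        # so the join output is FK rows scaled by the PK filter
        return fk_rows * (pk_rows / pk_total)
    # fallback: a generic |R|*|S|/ndv style estimate, which can
    # wildly overestimate when the PK pattern goes undetected
    return (fk_rows * pk_rows) / max(pk_ndv, 1)
```

Using the item-table numbers from the plan above (462000 total rows, 115500 after the filter) against a hypothetical one-million-row FK side, the PK/FK path estimates 250K output rows rather than a shuffle-sized blowup, which is what tips the planner toward a map join.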
[jira] [Commented] (HIVE-10659) Beeline command which contains semi-colon as a non-command terminator will fail
[ https://issues.apache.org/jira/browse/HIVE-10659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557254#comment-14557254 ] Lefty Leverenz commented on HIVE-10659: --- This should be documented in the wiki. I'm not adding a TODOC label because we don't have one for 1.2.1, but we could use the TODOC1.2 label. * [HiveServer2 Clients -- Beeline | https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-Beeline–NewCommandLineShell] Beeline command which contains semi-colon as a non-command terminator will fail --- Key: HIVE-10659 URL: https://issues.apache.org/jira/browse/HIVE-10659 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.2.1 Attachments: HIVE-10659.1.patch Consider a scenario where Beeline is used to connect to a MySQL server. The commands executed via Beeline can include stored procedures. For example, the following command, used to create a stored procedure, is valid: {code} CREATE PROCEDURE RM_TLBS_LINKID() BEGIN IF EXISTS (SELECT * FROM `INFORMATION_SCHEMA`.`COLUMNS` WHERE `TABLE_NAME` = 'TBLS' AND `COLUMN_NAME` = 'LINK_TARGET_ID') THEN ALTER TABLE `TBLS` DROP FOREIGN KEY `TBLS_FK3` ; ALTER TABLE `TBLS` DROP KEY `TBLS_N51` ; ALTER TABLE `TBLS` DROP COLUMN `LINK_TARGET_ID` ; END IF; END {code} MySQL stored procedures have semi-colon ( ; ) as the statement terminator. Since this coincides with Beeline's only available command terminator, semi-colon, Beeline will not be able to execute the above command successfully; i.e., Beeline tries to execute the partial command below instead of the complete command shown above.
{code} CREATE PROCEDURE RM_TLBS_LINKID() BEGIN IF EXISTS (SELECT * FROM `INFORMATION_SCHEMA`.`COLUMNS` WHERE `TABLE_NAME` = 'TBLS' AND `COLUMN_NAME` = 'LINK_TARGET_ID') THEN ALTER TABLE `TBLS` DROP FOREIGN KEY `TBLS_FK3` ; {code} The above situation can actually happen within Hive when Hive SchemaTool is used to upgrade a MySQL metastore db and the scripts used for the upgrade process contain stored procedures (such as the one introduced initially by HIVE-7018). As of now, we cannot have any stored procedure as part of the MySQL metastore db upgrade scripts because SchemaTool uses Beeline to connect to MySQL, and Beeline fails to execute any CREATE PROCEDURE command or similar command containing a semi-colon. This is a serious limitation; it needs to be fixed by giving the end user an option to make Beeline use the newline character as the command delimiter instead of the semi-colon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
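The proposed fix is to let the user choose a delimiter other than the semi-colon. A minimal sketch of that idea (hypothetical, not Beeline's actual implementation) splits the input on a configurable delimiter, so a stored-procedure body full of semi-colons survives as one statement when the delimiter is the newline:

```python
# Hypothetical sketch: split a script into statements on a configurable
# delimiter instead of the hard-coded semi-colon (not Beeline's code).

def split_statements(script: str, delimiter: str = ";"):
    """Return the non-empty statements obtained by splitting on `delimiter`."""
    return [s.strip() for s in script.split(delimiter) if s.strip()]

proc = "CREATE PROCEDURE P() BEGIN SELECT 1 ; SELECT 2 ; END"

# With the default ';' delimiter the procedure is broken into fragments...
fragments = split_statements(proc)
# ...but with a newline delimiter it survives as a single statement.
whole = split_statements(proc, delimiter="\n")
```

The real fix would also have to respect quoted strings and comments, which this sketch ignores.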
[jira] [Commented] (HIVE-10800) CBO (Calcite Return Path): Setup correct information if CBO succeeds
[ https://issues.apache.org/jira/browse/HIVE-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557189#comment-14557189 ] Hive QA commented on HIVE-10800: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12734835/HIVE-10800.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8970 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4012/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4012/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4012/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12734835 - PreCommit-HIVE-TRUNK-Build CBO (Calcite Return Path): Setup correct information if CBO succeeds Key: HIVE-10800 URL: https://issues.apache.org/jira/browse/HIVE-10800 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Attachments: HIVE-10800.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10684) Fix the unit test failures for HIVE-7553 after HIVE-10674 removed the binary jar files
[ https://issues.apache.org/jira/browse/HIVE-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557257#comment-14557257 ] Ferdinand Xu commented on HIVE-10684: - Based on the current solution, it is easy to provide a cmd script for Windows alongside the bash script. Any thoughts about the cross-platform issue? Also, is there any suggestion about replacing the maven-antrun-plugin? The maven-jar-plugin is for the purpose of building a jar from the source code; I am wondering whether we can replace the maven-antrun-plugin with the maven-jar-plugin. Any thoughts about it? Thank you! Ferd Fix the unit test failures for HIVE-7553 after HIVE-10674 removed the binary jar files -- Key: HIVE-10684 URL: https://issues.apache.org/jira/browse/HIVE-10684 Project: Hive Issue Type: Bug Components: Tests Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-10684.patch
[jira] [Commented] (HIVE-10553) Remove hardcoded Parquet references from SearchArgumentImpl
[ https://issues.apache.org/jira/browse/HIVE-10553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557264#comment-14557264 ] Hive QA commented on HIVE-10553: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12734917/HIVE-10553.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8970 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4014/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4014/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4014/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12734917 - PreCommit-HIVE-TRUNK-Build Remove hardcoded Parquet references from SearchArgumentImpl --- Key: HIVE-10553 URL: https://issues.apache.org/jira/browse/HIVE-10553 Project: Hive Issue Type: Sub-task Reporter: Gopal V Assignee: Owen O'Malley Attachments: HIVE-10553.patch, HIVE-10553.patch SARGs currently depend on Parquet code, which causes a tight coupling between parquet releases and storage-api versions. Move Parquet code out to its own RecordReader, similar to ORC's SargApplier implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10623) Implement hive cli options using beeline functionality
[ https://issues.apache.org/jira/browse/HIVE-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557281#comment-14557281 ] Lefty Leverenz commented on HIVE-10623: --- Does this need to be documented in the wiki (when beeline-cli-branch gets merged to master)? If so, we should either add a TODOC-BEECLI label (or some such) or create a JIRA issue for Beeline-CLI issues that need to be documented. I favor the latter. For examples, see HIVE-9850 and HIVE-9752. Implement hive cli options using beeline functionality -- Key: HIVE-10623 URL: https://issues.apache.org/jira/browse/HIVE-10623 Project: Hive Issue Type: Sub-task Components: CLI Reporter: Ferdinand Xu Assignee: Ferdinand Xu Fix For: beeline-cli-branch Attachments: HIVE-10623.1.patch, HIVE-10623.2.patch, HIVE-10623.3.patch, HIVE-10623.4.patch, HIVE-10623.patch We need to support the original hive cli options for the purpose of backwards compatibility.
[jira] [Updated] (HIVE-10690) ArrayIndexOutOfBounds exception in MetaStoreDirectSql.aggrColStatsForPartitions()
[ https://issues.apache.org/jira/browse/HIVE-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-10690: -- Fix Version/s: 1.2.1 ArrayIndexOutOfBounds exception in MetaStoreDirectSql.aggrColStatsForPartitions() - Key: HIVE-10690 URL: https://issues.apache.org/jira/browse/HIVE-10690 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 1.2.0 Reporter: Jason Dere Assignee: Vaibhav Gumashta Fix For: 1.3.0, 1.2.1 Attachments: HIVE-10690.1.patch Noticed a bunch of these stack traces in hive.log while running some unit tests: {noformat} 2015-05-11 21:18:59,371 WARN [main]: metastore.ObjectStore (ObjectStore.java:handleDirectSqlError(2420)) - Direct SQL failed java.lang.IndexOutOfBoundsException: Index: 0, Size: 0 at java.util.ArrayList.rangeCheck(ArrayList.java:635) at java.util.ArrayList.get(ArrayList.java:411) at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.aggrColStatsForPartitions(MetaStoreDirectSql.java:1132) at org.apache.hadoop.hive.metastore.ObjectStore$8.getSqlResult(ObjectStore.java:6162) at org.apache.hadoop.hive.metastore.ObjectStore$8.getSqlResult(ObjectStore.java:6158) at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2385) at org.apache.hadoop.hive.metastore.ObjectStore.get_aggr_stats_for(ObjectStore.java:6158) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114) at com.sun.proxy.$Proxy84.get_aggr_stats_for(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_aggr_stats_for(HiveMetaStore.java:5662) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) at com.sun.proxy.$Proxy86.get_aggr_stats_for(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAggrColStatsFor(HiveMetaStoreClient.java:2064) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156) at com.sun.proxy.$Proxy87.getAggrColStatsFor(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.getAggrColStatsFor(Hive.java:3110) at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:245) at org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.updateColStats(RelOptHiveTable.java:329) at org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:399) at org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:392) at org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan.getColStat(HiveTableScan.java:150) at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:77) at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:64) at sun.reflect.GeneratedMethodAccessor296.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at 
org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$1$1.invoke(ReflectiveRelMetadataProvider.java:182) at com.sun.proxy.$Proxy108.getDistinctRowCount(Unknown Source) at sun.reflect.GeneratedMethodAccessor234.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at
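The trace shows `ArrayList.get(0)` failing inside `MetaStoreDirectSql.aggrColStatsForPartitions()` when the underlying query returned no rows (`Index: 0, Size: 0`). The bug class and its usual fix can be sketched as follows (illustrative only, not the actual metastore code):

```python
# Illustrative sketch of the defect: indexing row 0 of a result set that
# may be empty (not the actual MetaStoreDirectSql code).

def partition_count_buggy(query_result):
    # Raises IndexError (Java: IndexOutOfBoundsException) on an empty result.
    return query_result[0]

def partition_count_fixed(query_result):
    # Guard against the empty result and fall back to a sensible default.
    return query_result[0] if query_result else 0
```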
[jira] [Commented] (HIVE-10801) 'drop view' fails throwing java.lang.NullPointerException
[ https://issues.apache.org/jira/browse/HIVE-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557288#comment-14557288 ] Hive QA commented on HIVE-10801: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12734946/HIVE-10801.2.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8973 tests executed *Failed tests:* {noformat} org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4015/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4015/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4015/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12734946 - PreCommit-HIVE-TRUNK-Build 'drop view' fails throwing java.lang.NullPointerException - Key: HIVE-10801 URL: https://issues.apache.org/jira/browse/HIVE-10801 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10801.1.patch, HIVE-10801.2.patch When trying to drop a view, hive log shows: {code} 2015-05-21 11:53:06,126 ERROR [HiveServer2-Background-Pool: Thread-197]: hdfs.KeyProviderCache (KeyProviderCache.java:createKeyProviderURI(87)) - Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !! 
2015-05-21 11:53:06,134 ERROR [HiveServer2-Background-Pool: Thread-197]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(155)) - MetaException(message:java.lang.NullPointerException) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5379) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:1734) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) at com.sun.proxy.$Proxy7.drop_table_with_environment_context(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.drop_table_with_environment_context(HiveMetaStoreClient.java:2056) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.drop_table_with_environment_context(SessionHiveMetaStoreClient.java:118) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:968) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:904) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156) at com.sun.proxy.$Proxy8.dropTable(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:1035) at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:972) at org.apache.hadoop.hive.ql.exec.DDLTask.dropTable(DDLTask.java:3836) at 
org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3692) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:331) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1650) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1409) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154) at
[jira] [Commented] (HIVE-10650) Improve sum() function over windowing to support additional range formats
[ https://issues.apache.org/jira/browse/HIVE-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557289#comment-14557289 ] Lefty Leverenz commented on HIVE-10650: --- Doc note: This needs to be documented in the wiki for the 1.3.0 release. * [Windowing and Analytics -- WINDOW clause | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics#LanguageManualWindowingAndAnalytics-WINDOWclause] Improve sum() function over windowing to support additional range formats - Key: HIVE-10650 URL: https://issues.apache.org/jira/browse/HIVE-10650 Project: Hive Issue Type: Sub-task Components: PTF-Windowing Reporter: Aihua Xu Assignee: Aihua Xu Labels: TODOC1.3 Fix For: 1.3.0 Attachments: HIVE-10650.patch Support the following windowing frames: {{x preceding and y preceding}} and {{x following and y following}}. E.g. {noformat} select sum(value) over (partition by key order by value rows between 2 preceding and 1 preceding) from tbl1; select sum(value) over (partition by key order by value rows between unbounded preceding and 1 preceding) from tbl1; {noformat}
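The frame semantics being added, e.g. `rows between 2 preceding and 1 preceding`, can be mimicked outside Hive to sanity-check expected results. A small sketch (assuming rows are already ordered within a partition; this is not Hive's PTF implementation):

```python
# Sketch of ROWS BETWEEN x PRECEDING AND y PRECEDING framing for sum(),
# illustrating the semantics this issue adds (not Hive's implementation).

def windowed_sum(values, start_preceding, end_preceding):
    """For each row i, sum values[i-start_preceding .. i-end_preceding]."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - start_preceding)
        hi = i - end_preceding + 1  # exclusive upper bound
        # An empty frame (e.g. the first row) yields NULL, modeled as None.
        out.append(sum(values[lo:hi]) if hi > lo else None)
    return out

# rows between 2 preceding and 1 preceding over an ordered partition
result = windowed_sum([10, 20, 30, 40], 2, 1)
```

For the sample partition `[10, 20, 30, 40]`, row 0 has an empty frame, row 1 sums only the first row, and later rows sum the two rows ending one before the current row.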
[jira] [Commented] (HIVE-6867) Bucketized Table feature fails in some cases
[ https://issues.apache.org/jira/browse/HIVE-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557373#comment-14557373 ] Hive QA commented on HIVE-6867: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12734942/HIVE-6867.04.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8972 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.ql.exec.tez.TestDynamicPartitionPruner.testSingleSourceMultipleFiltersOrdering1 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4017/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4017/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4017/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12734942 - PreCommit-HIVE-TRUNK-Build Bucketized Table feature fails in some cases Key: HIVE-6867 URL: https://issues.apache.org/jira/browse/HIVE-6867 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Laljo John Pullokkaran Assignee: Pengcheng Xiong Attachments: HIVE-6867.01.patch, HIVE-6867.02.patch, HIVE-6867.03.patch, HIVE-6867.04.patch Bucketized Table feature fails in some cases. if src destination is bucketed on same key, and if actual data in the src is not bucketed (because data got loaded using LOAD DATA LOCAL INPATH ) then the data won't be bucketed while writing to destination. 
Example -- CREATE TABLE P1(key STRING, val STRING) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE P1; -- perform an insert to make sure there are 2 files INSERT OVERWRITE TABLE P1 select key, val from P1; -- This is not a regression; it has never worked, and was only discovered due to Hadoop2 changes. In Hadoop1, in local mode, the number of reducers is always 1, regardless of what the application requests. Hadoop2 now honors the number-of-reducers setting in local mode (by spawning threads). The long-term solution seems to be to prevent LOAD DATA for bucketed tables.
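The core of the problem is that LOAD DATA copies files verbatim, while INSERT routes each row to a bucket file via hash(key) % numBuckets, which is the contract a CLUSTERED BY table declares. A toy sketch of that contract (the hash function here is an illustrative stand-in; Hive hashes values through its ObjectInspectors):

```python
# Toy illustration of the bucketing contract of a CLUSTERED BY ... INTO 2
# BUCKETS table: each row belongs to bucket hash(key) % numBuckets.
# LOAD DATA copies files verbatim and so violates this contract; only an
# INSERT actually routes rows through the hash.

NUM_BUCKETS = 2

def bucket_for(key: str) -> int:
    # Deterministic stand-in hash (Hive's real hashing differs).
    return sum(key.encode()) % NUM_BUCKETS

def bucketize(rows):
    """What INSERT OVERWRITE does conceptually: route rows to bucket files."""
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for key, val in rows:
        buckets[bucket_for(key)].append((key, val))
    return buckets

buckets = bucketize([("a", "1"), ("b", "2"), ("c", "3")])
```

A reader downstream (e.g. a bucket map join) assumes every row in bucket file N satisfies hash(key) % numBuckets == N, which is exactly what verbatim-loaded files cannot guarantee.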
[jira] [Commented] (HIVE-10806) Incorrect example for exploding map function in hive wiki
[ https://issues.apache.org/jira/browse/HIVE-10806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557443#comment-14557443 ] Gabor Liptak commented on HIVE-10806: - This is the section referenced: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-explode I do not seem to have permissions to edit the wiki ... Incorrect example for exploding map function in hive wiki - Key: HIVE-10806 URL: https://issues.apache.org/jira/browse/HIVE-10806 Project: Hive Issue Type: Bug Components: Documentation Affects Versions: 0.10.0 Reporter: anup b Priority: Trivial In the Hive wiki, the example for exploding a map is wrong; it doesn't work in Hive 0.10. The example given in the wiki, which doesn't work: SELECT explode(myMap) AS myMapKey, myMapValue FROM myMapTable; It should be updated to: SELECT explode(myMap) AS (myMapKey, myMapValue) FROM myMapTable; Link: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-explode
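The corrected syntax, `explode(myMap) AS (myMapKey, myMapValue)`, produces one two-column row per map entry. The same semantics mirrored outside Hive (purely illustrative):

```python
# Semantics of Hive's explode() on a map, mirrored in plain Python:
# one (key, value) output row per map entry.

def explode_map(my_map: dict):
    return [(k, v) for k, v in my_map.items()]

rows = explode_map({"x": 1, "y": 2})
```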
[jira] [Commented] (HIVE-10803) document jdbc url format properly
[ https://issues.apache.org/jira/browse/HIVE-10803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557444#comment-14557444 ] Gabor Liptak commented on HIVE-10803: - [~thejas] How does one get access to edit the wiki? document jdbc url format properly - Key: HIVE-10803 URL: https://issues.apache.org/jira/browse/HIVE-10803 Project: Hive Issue Type: Bug Components: Documentation, HiveServer2 Reporter: Thejas M Nair This is the format of the HS2 connection string; it needs to be documented in the wiki (taken from jdbc.Utils.java): jdbc:hive2://host1:port1,host2:port2/dbName;sess_var_list?hive_conf_list#hive_var_list
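The URL format to be documented, `jdbc:hive2://host1:port1,host2:port2/dbName;sess_var_list?hive_conf_list#hive_var_list`, decomposes into a host list, a database name, and three parameter sections. A rough illustrative parser (the authoritative logic lives in jdbc.Utils.java; this sketch ignores edge cases):

```python
# Rough sketch of how the documented HS2 JDBC URL decomposes into parts.
# Illustrative only; the authoritative parser is jdbc.Utils.java.

def parse_hive2_url(url: str) -> dict:
    assert url.startswith("jdbc:hive2://")
    rest = url[len("jdbc:hive2://"):]
    rest, _, hive_vars = rest.partition("#")   # #hive_var_list
    rest, _, hive_conf = rest.partition("?")   # ?hive_conf_list
    rest, _, sess_vars = rest.partition(";")   # ;sess_var_list
    hosts, _, db = rest.partition("/")         # host1:port1,host2:port2/dbName
    return {
        "hosts": hosts.split(","),
        "db": db or "default",
        "session_vars": sess_vars,
        "hive_conf": hive_conf,
        "hive_vars": hive_vars,
    }

parsed = parse_hive2_url(
    "jdbc:hive2://h1:10000,h2:10000/mydb;ssl=true?hive.exec.parallel=true#a=1")
```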
[jira] [Commented] (HIVE-10807) Invalidate basic stats for insert queries if autogather=false
[ https://issues.apache.org/jira/browse/HIVE-10807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557458#comment-14557458 ] Hive QA commented on HIVE-10807: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12734976/HIVE-10807.patch {color:red}ERROR:{color} -1 due to 134 failed/errored test(s), 8972 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_filters org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_nulls org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin_negative org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin_negative2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin_negative3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_tbllvl org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_display_colstats_tbllvl org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nulls org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_serde org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_temp_table_display_colstats_tbllvl org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_list_bucket org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin7 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join_filters org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_11 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_12 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_3
[jira] [Updated] (HIVE-8769) Physical optimizer : Incorrect CE results in a shuffle join instead of a Map join (PK/FK pattern not detected)
[ https://issues.apache.org/jira/browse/HIVE-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8769: --- Affects Version/s: 1.1.0 1.0.0 1.2.0 Physical optimizer : Incorrect CE results in a shuffle join instead of a Map join (PK/FK pattern not detected) -- Key: HIVE-8769 URL: https://issues.apache.org/jira/browse/HIVE-8769 Project: Hive Issue Type: Bug Components: Physical Optimizer Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0 Reporter: Mostafa Mokhtar Assignee: Pengcheng Xiong Fix For: 1.2.1 Attachments: HIVE-8769.01.patch, HIVE-8769.02.patch, HIVE-8769.03.patch, HIVE-8769.04.patch, HIVE-8769.05.patch TPC-DS Q82 is running slower than hive 13 because the join type is not correct. The estimate for item x inventory x date_dim is 227 Million rows while the actual is 3K rows. Hive 13 finishes in 753 seconds. Hive 14 finishes in 1,267 seconds. Hive 14 + force map join finished in 431 seconds. Query {code} select i_item_id ,i_item_desc ,i_current_price from item, inventory, date_dim, store_sales where i_current_price between 30 and 30+30 and inv_item_sk = i_item_sk and d_date_sk=inv_date_sk and d_date between '2002-05-30' and '2002-07-30' and i_manufact_id in (437,129,727,663) and inv_quantity_on_hand between 100 and 500 and ss_item_sk = i_item_sk group by i_item_id,i_item_desc,i_current_price order by i_item_id limit 100 {code} Plan {code} STAGE PLANS: Stage: Stage-1 Tez Edges: Map 7 - Map 1 (BROADCAST_EDGE), Map 2 (BROADCAST_EDGE) Reducer 4 - Map 3 (SIMPLE_EDGE), Map 7 (SIMPLE_EDGE) Reducer 5 - Reducer 4 (SIMPLE_EDGE) Reducer 6 - Reducer 5 (SIMPLE_EDGE) DagName: mmokhtar_20141106005353_7a2eb8df-12ff-4fe9-89b4-30f1e4e3fb90:1 Vertices: Map 1 Map Operator Tree: TableScan alias: item filterExpr: ((i_current_price BETWEEN 30 AND 60 and (i_manufact_id) IN (437, 129, 727, 663)) and i_item_sk is not null) (type: boolean) Statistics: Num rows: 462000 Data size: 663862160 Basic stats: COMPLETE Column stats: COMPLETE Filter 
Operator predicate: ((i_current_price BETWEEN 30 AND 60 and (i_manufact_id) IN (437, 129, 727, 663)) and i_item_sk is not null) (type: boolean) Statistics: Num rows: 115500 Data size: 34185680 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: i_item_sk (type: int), i_item_id (type: string), i_item_desc (type: string), i_current_price (type: float) outputColumnNames: _col0, _col1, _col2, _col3 Statistics: Num rows: 115500 Data size: 33724832 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 115500 Data size: 33724832 Basic stats: COMPLETE Column stats: COMPLETE value expressions: _col1 (type: string), _col2 (type: string), _col3 (type: float) Execution mode: vectorized Map 2 Map Operator Tree: TableScan alias: date_dim filterExpr: (d_date BETWEEN '2002-05-30' AND '2002-07-30' and d_date_sk is not null) (type: boolean) Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: (d_date BETWEEN '2002-05-30' AND '2002-07-30' and d_date_sk is not null) (type: boolean) Statistics: Num rows: 36524 Data size: 3579352 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: d_date_sk (type: int) outputColumnNames: _col0 Statistics: Num rows: 36524 Data size: 146096 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 36524 Data size: 146096 Basic stats: COMPLETE Column stats: COMPLETE Select Operator
[jira] [Updated] (HIVE-8769) Physical optimizer : Incorrect CE results in a shuffle join instead of a Map join (PK/FK pattern not detected)
[ https://issues.apache.org/jira/browse/HIVE-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-8769:
---
Issue Type: Improvement (was: Bug)
[jira] [Updated] (HIVE-10811) RelFieldTrimmer throws NoSuchElementException in some cases
[ https://issues.apache.org/jira/browse/HIVE-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-10811:
---
Attachment: HIVE-10811.patch

RelFieldTrimmer throws NoSuchElementException in some cases
---
Key: HIVE-10811
URL: https://issues.apache.org/jira/browse/HIVE-10811
Project: Hive
Issue Type: Bug
Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
Attachments: HIVE-10811.patch

RelFieldTrimmer runs into NoSuchElementException in some cases. Stack trace:
{noformat}
Exception in thread "main" java.lang.AssertionError: Internal error: While invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)'
	at org.apache.calcite.util.Util.newInternal(Util.java:743)
	at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543)
	at org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269)
	at org.apache.calcite.sql2rel.RelFieldTrimmer.trim(RelFieldTrimmer.java:175)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:947)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:820)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:768)
	at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:109)
	at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:730)
	at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:145)
	at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:105)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:607)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:244)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10048)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:207)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:536)
	... 32 more
Caused by: java.lang.AssertionError: Internal error: While invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)'
	at org.apache.calcite.util.Util.newInternal(Util.java:743)
	at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543)
	at org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269)
	at
[jira] [Commented] (HIVE-10803) document jdbc url format properly
[ https://issues.apache.org/jira/browse/HIVE-10803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557485#comment-14557485 ]

Thejas M Nair commented on HIVE-10803:
--
How to get permission to edit:
1. Create a Confluence account if you don't already have one.
2. Sign up for the user mailing list by sending a message to user-subscr...@hive.apache.org.
3. Send a message to u...@hive.apache.org requesting write access to the Hive wiki, and provide your Confluence username.
(from: https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki)

document jdbc url format properly
-
Key: HIVE-10803
URL: https://issues.apache.org/jira/browse/HIVE-10803
Project: Hive
Issue Type: Bug
Components: Documentation, HiveServer2
Reporter: Thejas M Nair

This is the format of the HS2 connection string; it needs to be documented in the wiki (taken from jdbc.Utils.java):
jdbc:hive2://host1:port1,host2:port2/dbName;sess_var_list?hive_conf_list#hive_var_list

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
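The URL shape quoted above can be exercised with a small parser. The sketch below follows the documented pattern and assumes each of the three variable lists is a ";"-separated sequence of key=value pairs; it is illustrative only and does not mirror Hive's actual jdbc.Utils code:

```python
# Hypothetical parser for the documented hive2 JDBC URL shape:
#   jdbc:hive2://host1:port1,host2:port2/dbName;sess_var_list?hive_conf_list#hive_var_list
# Illustrative sketch; Hive's real parser lives in org.apache.hive.jdbc.Utils.
import re

def parse_hive2_url(url):
    m = re.match(
        r"jdbc:hive2://(?P<hosts>[^/;?#]+)"  # host1:port1,host2:port2
        r"(?:/(?P<db>[^;?#]*))?"             # dbName
        r"(?:;(?P<sess>[^?#]*))?"            # session variable list
        r"(?:\?(?P<conf>[^#]*))?"            # hive conf list
        r"(?:#(?P<var>.*))?$",               # hive variable list
        url,
    )
    if m is None:
        raise ValueError("not a hive2 JDBC URL: " + url)

    def kvlist(s):
        # Assumption: each list is ';'-separated key=value pairs.
        return dict(kv.split("=", 1) for kv in s.split(";")) if s else {}

    return {
        "hosts": m.group("hosts").split(","),
        "db": m.group("db") or "default",
        "sess_vars": kvlist(m.group("sess")),
        "hive_conf": kvlist(m.group("conf")),
        "hive_vars": kvlist(m.group("var")),
    }

u = parse_hive2_url(
    "jdbc:hive2://h1:10000,h2:10000/sales;transportMode=http?hive.exec.parallel=true#a=1"
)
print(u["hosts"], u["db"])
```

The host list before the first "/" is what allows multiple HiveServer2 instances in one URL; everything after it is per-connection configuration.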
[jira] [Updated] (HIVE-10811) RelFieldTrimmer throws NoSuchElementException in some cases
[ https://issues.apache.org/jira/browse/HIVE-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-10811:
---
Component/s: CBO
[jira] [Commented] (HIVE-9069) Simplify filter predicates for CBO
[ https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557619#comment-14557619 ]

Hive QA commented on HIVE-9069:
---
{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735036/HIVE-9069.12.patch

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 8969 tests executed

*Failed tests:*
{noformat}
TestContribNegativeCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_auto_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_logical
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_oneskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_gby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_gby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_vc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqualcolumnrefs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4025/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4025/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4025/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12735036 - PreCommit-HIVE-TRUNK-Build

Simplify filter predicates for CBO
--
Key: HIVE-9069
URL: https://issues.apache.org/jira/browse/HIVE-9069
Project: Hive
Issue Type: Bug
Components: CBO
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Jesus Camacho Rodriguez
Fix For: 0.14.1
Attachments: HIVE-9069.01.patch, HIVE-9069.02.patch, HIVE-9069.03.patch, HIVE-9069.04.patch, HIVE-9069.05.patch, HIVE-9069.06.patch, HIVE-9069.07.patch, HIVE-9069.08.patch, HIVE-9069.08.patch, HIVE-9069.09.patch, HIVE-9069.10.patch, HIVE-9069.11.patch, HIVE-9069.12.patch, HIVE-9069.patch

Simplify disjunctive predicates so that they can get pushed down to the scan. Looks like this is still an issue; some of the filters can be pushed down to the scan.
{code}
set hive.cbo.enable=true
set hive.stats.fetch.column.stats=true
set hive.exec.dynamic.partition.mode=nonstrict
set hive.tez.auto.reducer.parallelism=true
set hive.auto.convert.join.noconditionaltask.size=32000
set hive.exec.reducers.bytes.per.reducer=1
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager
set hive.support.concurrency=false
set hive.tez.exec.print.summary=true

explain
select substr(r_reason_desc,1,20) as r
      ,avg(ws_quantity) wq
      ,avg(wr_refunded_cash) ref
      ,avg(wr_fee) fee
from web_sales, web_returns, web_page, customer_demographics cd1,
     customer_demographics cd2, customer_address, date_dim, reason
where web_sales.ws_web_page_sk = web_page.wp_web_page_sk
  and web_sales.ws_item_sk = web_returns.wr_item_sk
  and web_sales.ws_order_number = web_returns.wr_order_number
  and web_sales.ws_sold_date_sk = date_dim.d_date_sk and d_year = 1998
  and cd1.cd_demo_sk = web_returns.wr_refunded_cdemo_sk
  and cd2.cd_demo_sk = web_returns.wr_returning_cdemo_sk
  and customer_address.ca_address_sk = web_returns.wr_refunded_addr_sk
  and reason.r_reason_sk = web_returns.wr_reason_sk
  and
  (
   (
    cd1.cd_marital_status = 'M'
    and cd1.cd_marital_status = cd2.cd_marital_status
    and cd1.cd_education_status = '4 yr Degree'
    and
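The rewrite the issue asks for is the classic factoring of a disjunction: conjuncts that occur in every OR branch can be pulled out, i.e. (A and X) or (A and Y) becomes A and (X or Y), and the extracted A can then be pushed down to the table scan. A sketch over predicates modelled as sets of atomic condition strings (illustrative only; not Hive's CBO representation, and the branch contents below are hypothetical):

```python
# Illustrative factoring of common conjuncts out of a disjunction; not
# Hive's actual rewrite. Each OR branch is a frozenset of atomic conditions.

def factor_common(disjuncts):
    """(A and X) or (A and Y)  ==>  A and (X or Y).

    Returns (common_conjuncts, residual_disjuncts).
    """
    common = frozenset.intersection(*disjuncts)
    residual = [d - common for d in disjuncts]
    return common, residual

# Hypothetical branches: both happen to test the same year predicate.
branches = [
    frozenset({"d_year = 1998", "cd1.cd_marital_status = 'M'"}),
    frozenset({"d_year = 1998", "cd1.cd_marital_status = 'S'"}),
]
pushable, rest = factor_common(branches)
print(sorted(pushable))  # -> ['d_year = 1998'], safe to evaluate at the scan
```

The factored-out conjuncts are exactly the ones a predicate-pushdown pass can move into the scan's filterExpr; the residual disjunction stays in the join-side filter.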
[jira] [Commented] (HIVE-10812) Scaling PK/FK's selectivity for stats annotation
[ https://issues.apache.org/jira/browse/HIVE-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557635#comment-14557635 ]

Hive QA commented on HIVE-10812:
---
{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735040/HIVE-10812.01.patch

{color:red}ERROR:{color} -1 due to 44 failed/errored test(s), 8973 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketizedhiveinputformat
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin7
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_empty_dir_in_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_external_table_with_space_in_location_path
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap_auto
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_bucketed_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_merge
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_leftsemijoin_mr
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_parallel_orderby
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_quotedid_smb
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_remote_script
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_schemeAuthority2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_scriptfile1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_smb_mapjoin_8
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_stats_counter
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_truncate_column_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_uber_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_annotate_stats_join
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4026/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4026/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4026/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 44 tests failed
{noformat}

This message is automatically generated.