[jira] [Updated] (HIVE-9069) Simplify filter predicates for CBO

2015-05-23 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-9069:
--
Attachment: HIVE-9069.12.patch

 Simplify filter predicates for CBO
 --

 Key: HIVE-9069
 URL: https://issues.apache.org/jira/browse/HIVE-9069
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Jesus Camacho Rodriguez
 Fix For: 0.14.1

 Attachments: HIVE-9069.01.patch, HIVE-9069.02.patch, 
 HIVE-9069.03.patch, HIVE-9069.04.patch, HIVE-9069.05.patch, 
 HIVE-9069.06.patch, HIVE-9069.07.patch, HIVE-9069.08.patch, 
 HIVE-9069.08.patch, HIVE-9069.09.patch, HIVE-9069.10.patch, 
 HIVE-9069.11.patch, HIVE-9069.12.patch, HIVE-9069.patch


 Simplify predicates for disjunctive predicates so that they can get pushed down 
 to the scan.
 Looks like this is still an issue; some of the filters could be pushed down to 
 the scan but are not.
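 As an illustration only (not taken from the patch), the second big disjunction in 
 the query below repeats {{ca_country = 'United States'}} in every branch; factoring 
 that common conjunct out of the OR is the kind of simplification that would let it 
 be pushed down to the customer_address scan:
 {code}
 -- before: the common conjunct is trapped inside the OR
 (   (ca_country = 'United States' and ca_state in ('KY','GA','NM') and ws_net_profit between 100 and 200)
  or (ca_country = 'United States' and ca_state in ('MT','OR','IN') and ws_net_profit between 150 and 300)
  or (ca_country = 'United States' and ca_state in ('WI','MO','WV') and ws_net_profit between 50 and 250))

 -- after: ca_country = 'United States' stands alone and can reach the scan
 ca_country = 'United States'
 and (   (ca_state in ('KY','GA','NM') and ws_net_profit between 100 and 200)
      or (ca_state in ('MT','OR','IN') and ws_net_profit between 150 and 300)
      or (ca_state in ('WI','MO','WV') and ws_net_profit between 50 and 250))
 {code}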
 {code}
 set hive.cbo.enable=true
 set hive.stats.fetch.column.stats=true
 set hive.exec.dynamic.partition.mode=nonstrict
 set hive.tez.auto.reducer.parallelism=true
 set hive.auto.convert.join.noconditionaltask.size=32000
 set hive.exec.reducers.bytes.per.reducer=1
 set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager
 set hive.support.concurrency=false
 set hive.tez.exec.print.summary=true
 explain  
 select  substr(r_reason_desc,1,20) as r
,avg(ws_quantity) wq
,avg(wr_refunded_cash) ref
,avg(wr_fee) fee
  from web_sales, web_returns, web_page, customer_demographics cd1,
   customer_demographics cd2, customer_address, date_dim, reason 
  where web_sales.ws_web_page_sk = web_page.wp_web_page_sk
and web_sales.ws_item_sk = web_returns.wr_item_sk
and web_sales.ws_order_number = web_returns.wr_order_number
and web_sales.ws_sold_date_sk = date_dim.d_date_sk and d_year = 1998
and cd1.cd_demo_sk = web_returns.wr_refunded_cdemo_sk 
and cd2.cd_demo_sk = web_returns.wr_returning_cdemo_sk
and customer_address.ca_address_sk = web_returns.wr_refunded_addr_sk
and reason.r_reason_sk = web_returns.wr_reason_sk
and
(
 (
  cd1.cd_marital_status = 'M'
  and
  cd1.cd_marital_status = cd2.cd_marital_status
  and
  cd1.cd_education_status = '4 yr Degree'
  and 
  cd1.cd_education_status = cd2.cd_education_status
  and
  ws_sales_price between 100.00 and 150.00
 )
or
 (
  cd1.cd_marital_status = 'D'
  and
  cd1.cd_marital_status = cd2.cd_marital_status
  and
  cd1.cd_education_status = 'Primary' 
  and
  cd1.cd_education_status = cd2.cd_education_status
  and
  ws_sales_price between 50.00 and 100.00
 )
or
 (
  cd1.cd_marital_status = 'U'
  and
  cd1.cd_marital_status = cd2.cd_marital_status
  and
  cd1.cd_education_status = 'Advanced Degree'
  and
  cd1.cd_education_status = cd2.cd_education_status
  and
  ws_sales_price between 150.00 and 200.00
 )
)
and
(
 (
  ca_country = 'United States'
  and
  ca_state in ('KY', 'GA', 'NM')
  and ws_net_profit between 100 and 200  
 )
 or
 (
  ca_country = 'United States'
  and
  ca_state in ('MT', 'OR', 'IN')
  and ws_net_profit between 150 and 300  
 )
 or
 (
  ca_country = 'United States'
  and
  ca_state in ('WI', 'MO', 'WV')
  and ws_net_profit between 50 and 250  
 )
)
 group by r_reason_desc
 order by r, wq, ref, fee
 limit 100
 OK
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 depends on stages: Stage-1
 STAGE PLANS:
   Stage: Stage-1
 Tez
   Edges:
 Map 9 <- Map 1 (BROADCAST_EDGE)
 Reducer 3 <- Map 13 (SIMPLE_EDGE), Map 2 (SIMPLE_EDGE)
 Reducer 4 <- Map 9 (SIMPLE_EDGE), Reducer 3 (SIMPLE_EDGE)
 Reducer 5 <- Map 14 (SIMPLE_EDGE), Reducer 4 (SIMPLE_EDGE)
 Reducer 6 <- Map 10 (SIMPLE_EDGE), Map 11 (BROADCAST_EDGE), Map 12 
 (BROADCAST_EDGE), Reducer 5 (SIMPLE_EDGE)
 Reducer 7 <- Reducer 6 (SIMPLE_EDGE)
 Reducer 8 <- Reducer 7 (SIMPLE_EDGE)
   DagName: mmokhtar_2014161818_f5fd23ba-d783-4b13-8507-7faa65851798:1
   Vertices:
 Map 1 
 Map Operator Tree:
 TableScan
   alias: web_page
   filterExpr: wp_web_page_sk is not null (type: boolean)
   Statistics: Num rows: 4602 Data size: 2696178 Basic stats: 
 COMPLETE Column stats: COMPLETE
   Filter Operator
 predicate: wp_web_page_sk is not null (type: boolean)
 Statistics: Num rows: 4602 Data size: 18408 Basic stats: 
 COMPLETE Column stats: COMPLETE
 Select Operator

[jira] [Commented] (HIVE-10809) HCat FileOutputCommitterContainer leaves behind empty _SCRATCH directories

2015-05-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557543#comment-14557543
 ] 

Hive QA commented on HIVE-10809:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12734987/HIVE-10809.1.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 8973 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4021/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4021/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4021/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12734987 - PreCommit-HIVE-TRUNK-Build

 HCat FileOutputCommitterContainer leaves behind empty _SCRATCH directories
 --

 Key: HIVE-10809
 URL: https://issues.apache.org/jira/browse/HIVE-10809
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 1.2.0
Reporter: Selina Zhang
Assignee: Selina Zhang
 Attachments: HIVE-10809.1.patch


 When a static partition is added through HCatStorer or HCatWriter, e.g.
 {code}
 JoinedData = LOAD '/user/selinaz/data/part-r-0' USING JsonLoader();
 STORE JoinedData INTO 'selina.joined_events_e' USING 
 org.apache.hive.hcatalog.pig.HCatStorer('author=selina');
 {code}
 The table directory looks like
 {noformat}
 drwx--   - selinaz users  0 2015-05-22 21:19 
 /user/selinaz/joined_events_e/_SCRATCH0.9157208938193798
 drwx--   - selinaz users  0 2015-05-22 21:19 
 /user/selinaz/joined_events_e/author=selina
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10684) Fix the unit test failures for HIVE-7553 after HIVE-10674 removed the binary jar files

2015-05-23 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557495#comment-14557495
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-10684:
--

[~Ferd] Is it possible to introduce Ant tasks instead of a bash script? Can you 
see if this is possible via the tasks mentioned in 
https://ant.apache.org/manual/Tasks/move.html ? I believe this should support 
portability as well.

If the above method is not possible, it should be possible to get an equivalent 
Windows script as well. But before the Windows equivalent script can be checked 
in, it has to be tested on Windows, and you need an OS-based Maven profile for 
that purpose. This can be done as a separate jira; for the purpose of this jira, 
you can introduce a separate profile through which the end user can provide a 
Maven option (using -P or a -D option) to skip executing the bash script on 
Windows so that the Maven command would not fail on Windows.
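A sketch of what that could look like on the command line (the profile and 
property names here are hypothetical, not taken from any patch):
{noformat}
# activate a hypothetical profile that skips the shell-script step
mvn clean install -Pskip-shell-scripts

# or drive the same switch through a hypothetical property
mvn clean install -Dskip.shell.scripts=true
{noformat}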

Thanks
Hari

 Fix the unit test failures for HIVE-7553 after HIVE-10674 removed the binary 
 jar files
 --

 Key: HIVE-10684
 URL: https://issues.apache.org/jira/browse/HIVE-10684
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Attachments: HIVE-10684.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10677) hive.exec.parallel=true has problem when it is used for analyze table column stats

2015-05-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-10677:

Affects Version/s: 1.1.0
   0.12.0
   0.13.0
   0.14.0
   1.0.0
   1.2.0

 hive.exec.parallel=true has problem when it is used for analyze table column 
 stats
 --

 Key: HIVE-10677
 URL: https://issues.apache.org/jira/browse/HIVE-10677
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Fix For: 1.2.1

 Attachments: HIVE-10677.01.patch, HIVE-10677.02.patch


 To reproduce it in q tests:
 {code}
 hive> set hive.exec.parallel;
 hive.exec.parallel=true
 hive> analyze table src compute statistics for columns;
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.ColumnStatsTask
 java.lang.RuntimeException: Error caching map.xml: java.io.IOException: 
 java.lang.InterruptedException
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:747)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:682)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:674)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:375)
   at 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:75)
 Caused by: java.io.IOException: java.lang.InterruptedException
   at org.apache.hadoop.util.Shell.runCommand(Shell.java:541)
   at org.apache.hadoop.util.Shell.run(Shell.java:455)
   at 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
   at org.apache.hadoop.util.Shell.execCommand(Shell.java:791)
   at org.apache.hadoop.util.Shell.execCommand(Shell.java:774)
   at 
 org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:646)
   at 
 org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:472)
   at 
 org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:460)
   at 
 org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:426)
   at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
   at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887)
   at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:784)
   at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:773)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:715)
   ... 7 more
 hive> Job Submission failed with exception 'java.lang.RuntimeException(Error 
 caching map.xml: java.io.IOException: java.lang.InterruptedException)'
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10809) HCat FileOutputCommitterContainer leaves behind empty _SCRATCH directories

2015-05-23 Thread Selina Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557498#comment-14557498
 ] 

Selina Zhang commented on HIVE-10809:
-

Thanks! Will do!

 HCat FileOutputCommitterContainer leaves behind empty _SCRATCH directories
 --

 Key: HIVE-10809
 URL: https://issues.apache.org/jira/browse/HIVE-10809
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 1.2.0
Reporter: Selina Zhang
Assignee: Selina Zhang
 Attachments: HIVE-10809.1.patch


 When a static partition is added through HCatStorer or HCatWriter, e.g.
 {code}
 JoinedData = LOAD '/user/selinaz/data/part-r-0' USING JsonLoader();
 STORE JoinedData INTO 'selina.joined_events_e' USING 
 org.apache.hive.hcatalog.pig.HCatStorer('author=selina');
 {code}
 The table directory looks like
 {noformat}
 drwx--   - selinaz users  0 2015-05-22 21:19 
 /user/selinaz/joined_events_e/_SCRATCH0.9157208938193798
 drwx--   - selinaz users  0 2015-05-22 21:19 
 /user/selinaz/joined_events_e/author=selina
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10808) Inner join on Null throwing Cast Exception

2015-05-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557507#comment-14557507
 ] 

Hive QA commented on HIVE-10808:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12734978/HIVE-10808.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8972 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4020/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4020/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4020/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12734978 - PreCommit-HIVE-TRUNK-Build

 Inner join on Null throwing Cast Exception
 --

 Key: HIVE-10808
 URL: https://issues.apache.org/jira/browse/HIVE-10808
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.13.1
Reporter: Naveen Gangam
Assignee: Naveen Gangam
Priority: Critical
 Attachments: HIVE-10808.patch


 select
  a.col1,
  a.col2,
  a.col3,
  a.col4
  from
  tab1 a
  inner join
  (
  select
  max(x) as x
  from
  tab1
  where
  x  20130327
  ) r
  on
  a.x = r.x
  where
  a.col1 = 'F'
  and a.col3 in ('A', 'S', 'G');
 Failed Task log snippet:
 2015-05-18 19:22:17,372 INFO [main] 
 org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring retrieval request: 
 __MAP_PLAN__
 2015-05-18 19:22:17,372 INFO [main] 
 org.apache.hadoop.hive.ql.exec.mr.ObjectCache: Ignoring cache key: 
 __MAP_PLAN__
 2015-05-18 19:22:17,457 WARN [main] org.apache.hadoop.mapred.YarnChild: 
 Exception running child : java.lang.RuntimeException: Error in configuring 
 object
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:446)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
 Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
 ... 9 more
 Caused by: java.lang.RuntimeException: Error in configuring object
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
 at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
 at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
 ... 14 more
 Caused by: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
 ... 17 more
 Caused by: java.lang.RuntimeException: Map operator initialization failed
 at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:157)
 ... 22 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.ClassCastException: 
 org.apache.hadoop.hive.serde2.NullStructSerDe$NullStructSerDeObjectInspector 
 cannot be cast to 
 org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
 at 
 org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:334)
 at 
 

[jira] [Resolved] (HIVE-10107) Union All : Vertex missing stats resulting in OOM and in-efficient plans

2015-05-23 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong resolved HIVE-10107.

Resolution: Fixed

Resolved following HIVE-8769.

 Union All : Vertex missing stats resulting in OOM and in-efficient plans
 

 Key: HIVE-10107
 URL: https://issues.apache.org/jira/browse/HIVE-10107
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Pengcheng Xiong

 Reducer Vertices sending data to a Union all edge are missing statistics and 
 as a result we either use very few reducers in the UNION ALL edge or decide 
 to broadcast the results of UNION ALL.
 Query
 {code}
 select 
 count(*) rowcount
 from
 (select 
 ss_item_sk, ss_ticket_number, ss_store_sk
 from
 store_sales a, store_returns b
 where
 a.ss_item_sk = b.sr_item_sk
 and a.ss_ticket_number = b.sr_ticket_number union all select 
 ss_item_sk, ss_ticket_number, ss_store_sk
 from
 store_sales c, store_returns d
 where
 c.ss_item_sk = d.sr_item_sk
 and c.ss_ticket_number = d.sr_ticket_number) t
 group by t.ss_store_sk , t.ss_item_sk , t.ss_ticket_number
 having rowcount > 1;
 {code}
 Plan snippet 
 {code}
  Edges:
 Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 5 (SIMPLE_EDGE), Union 3 
 (CONTAINS)
 Reducer 4 <- Union 3 (SIMPLE_EDGE)
 Reducer 7 <- Map 6 (SIMPLE_EDGE), Map 8 (SIMPLE_EDGE), Union 3 
 (CONTAINS)
   Reducer 4
 Reduce Operator Tree:
   Group By Operator
 aggregations: count(VALUE._col0)
 keys: KEY._col0 (type: int), KEY._col1 (type: int), KEY._col2 
 (type: int)
 mode: mergepartial
 outputColumnNames: _col0, _col1, _col2, _col3
 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
 Column stats: COMPLETE
 Filter Operator
 predicate: (_col3 > 1) (type: boolean)
   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE 
 Column stats: COMPLETE
   Select Operator
 expressions: _col3 (type: bigint)
 outputColumnNames: _col0
 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE 
 Column stats: COMPLETE
 File Output Operator
   compressed: false
   Statistics: Num rows: 0 Data size: 0 Basic stats: NONE 
 Column stats: COMPLETE
   table:
   input format: 
 org.apache.hadoop.mapred.TextInputFormat
   output format: 
 org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
   serde: 
 org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
 Reducer 7
 Reduce Operator Tree:
   Merge Join Operator
 condition map:
  Inner Join 0 to 1
 keys:
   0 ss_item_sk (type: int), ss_ticket_number (type: int)
   1 sr_item_sk (type: int), sr_ticket_number (type: int)
 outputColumnNames: _col1, _col6, _col8, _col27, _col34
 Filter Operator
   predicate: ((_col1 = _col27) and (_col8 = _col34)) (type: 
 boolean)
   Select Operator
 expressions: _col1 (type: int), _col8 (type: int), _col6 
 (type: int)
 outputColumnNames: _col0, _col1, _col2
 Group By Operator
   aggregations: count()
   keys: _col2 (type: int), _col0 (type: int), _col1 
 (type: int)
   mode: hash
   outputColumnNames: _col0, _col1, _col2, _col3
   Reduce Output Operator
 key expressions: _col0 (type: int), _col1 (type: 
 int), _col2 (type: int)
 sort order: +++
 Map-reduce partition columns: _col0 (type: int), 
 _col1 (type: int), _col2 (type: int)
 value expressions: _col3 (type: bigint)
 {code}
 The full explain plan 
 {code}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 depends on stages: Stage-1
 STAGE PLANS:
   Stage: Stage-1
 Tez
   Edges:
 Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 5 (SIMPLE_EDGE), Union 3 
 (CONTAINS)
 Reducer 4 <- Union 3 (SIMPLE_EDGE)
 Reducer 7 <- Map 6 (SIMPLE_EDGE), Map 8 (SIMPLE_EDGE), Union 3 
 (CONTAINS)
   DagName: mmokhtar_20150214132727_95878ea1-ee6a-4b7e-bc86-843abd5cf664:7
   Vertices:
 Map 1
 Map Operator Tree:
 

[jira] [Commented] (HIVE-10812) Scaling PK/FK's selectivity for stats annotation

2015-05-23 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557530#comment-14557530
 ] 

Pengcheng Xiong commented on HIVE-10812:


[~jpullokkaran], [~ashutoshc] and [~mmokhtar], we will address the PK/FK 
selectivity scaling problem in this patch. It will also address 
[~ashutoshc]'s previous comments regarding the SerDe.

 Scaling PK/FK's selectivity for stats annotation
 

 Key: HIVE-10812
 URL: https://issues.apache.org/jira/browse/HIVE-10812
 Project: Hive
  Issue Type: Bug
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong

 Right now, the computation of the selectivity of the FK side based on the PK 
 side does not take into consideration the range of the FK and the range of the PK.
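 A purely illustrative sketch of why the ranges matter (numbers and column names 
 are made up, and this is not necessarily the formula used in the patch):
 {noformat}
 PK side: date_dim.d_date_sk covers [1, 73049]; a filter keeps 10% of its rows.
 FK side: store_sales.ss_sold_date_sk only covers [36525, 73049], half the PK range.

 Ignoring ranges, the FK side is simply scaled by the PK selectivity (0.1).
 A range-aware estimate would also weigh in the overlap of the two ranges,
 since PK values that fall outside the FK range can never match any FK row.
 {noformat}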



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10623) Implement hive cli options using beeline functionality

2015-05-23 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557566#comment-14557566
 ] 

Lefty Leverenz commented on HIVE-10623:
---

Thanks [~xuefuz], I added a link.

 Implement hive cli options using beeline functionality
 --

 Key: HIVE-10623
 URL: https://issues.apache.org/jira/browse/HIVE-10623
 Project: Hive
  Issue Type: Sub-task
  Components: CLI
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Fix For: beeline-cli-branch

 Attachments: HIVE-10623.1.patch, HIVE-10623.2.patch, 
 HIVE-10623.3.patch, HIVE-10623.4.patch, HIVE-10623.patch


 We need to support the original hive cli options for the purpose of backwards 
 compatibility. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10812) Scaling PK/FK's selectivity for stats annotation

2015-05-23 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10812:
---
Attachment: HIVE-10812.01.patch

 Scaling PK/FK's selectivity for stats annotation
 

 Key: HIVE-10812
 URL: https://issues.apache.org/jira/browse/HIVE-10812
 Project: Hive
  Issue Type: Bug
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-10812.01.patch


 Right now, the computation of the selectivity of the FK side based on the PK 
 side does not take into consideration the range of the FK and the range of the PK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10805) OOM in vectorized reduce

2015-05-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557602#comment-14557602
 ] 

Hive QA commented on HIVE-10805:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12734959/HIVE-10805.01.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 8972 tests executed
*Failed tests:*
{noformat}
TestHs2Hooks - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4024/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4024/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4024/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12734959 - PreCommit-HIVE-TRUNK-Build

 OOM in vectorized reduce
 

 Key: HIVE-10805
 URL: https://issues.apache.org/jira/browse/HIVE-10805
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Blocker
 Fix For: 1.2.1

 Attachments: HIVE-10805.01.patch


 Vectorized reduce does not release scratch byte space in BytesColumnVectors 
 and runs out of memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-9072) Swap the main MiniDrivers and the Vectorized MiniDrivers so Vectorization is on by default and Non-Vectorized tests are still executed

2015-05-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline resolved HIVE-9072.

Resolution: Won't Fix

Try different approach later.

 Swap the main MiniDrivers and the Vectorized MiniDrivers so Vectorization is 
 on by default and Non-Vectorized tests are still executed
 --

 Key: HIVE-9072
 URL: https://issues.apache.org/jira/browse/HIVE-9072
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical

 The 3rd and final step in turning Vectorization on by default.
 Swap the MiniDrivers so Vectorization is on by default in the main 
 MiniDrivers (TestCliDriver, TestNegativeCliDriver, TestMiniTezCliDriver) and 
 run as Non-Vectorized in newly named MiniDrivers (TestNonVecCliDriver, 
 TestNonVecNegativeCliDriver, TestNonVecMiniTezCliDriver).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-9071) Enhance Vectorization to read non-ORC tables for better test coverage

2015-05-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline resolved HIVE-9071.

Resolution: Won't Fix

Try different approach later.

 Enhance Vectorization to read non-ORC tables for better test coverage
 -

 Key: HIVE-9071
 URL: https://issues.apache.org/jira/browse/HIVE-9071
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical

 By enabling Vectorization to read non-ORC tables, many more query unit tests 
 will execute in vectorization mode.
 This is the 2nd step towards turning Vectorization on by default.
 Goal: 80% of MapWork tasks vectorize in new MiniDrivers (TestVecCliDriver, 
 TestVecNegativeCliDriver, TestVecMiniTezCliDriver).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9071) Enhance Vectorization to read non-ORC tables for better test coverage

2015-05-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-9071:
---
Issue Type: Bug  (was: Sub-task)
Parent: (was: HIVE-5538)

 Enhance Vectorization to read non-ORC tables for better test coverage
 -

 Key: HIVE-9071
 URL: https://issues.apache.org/jira/browse/HIVE-9071
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical

 By enabling Vectorization to read non-ORC tables, many more query unit tests 
 will execute in vectorization mode.
 This is the 2nd step towards turning Vectorization on by default.
 Goal: 80% of MapWork tasks vectorize in new MiniDrivers (TestVecCliDriver, 
 TestVecNegativeCliDriver, TestVecMiniTezCliDriver).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10789) union distinct query with NULL constant on both the sides throws Unsuported vector output type: void error

2015-05-23 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557197#comment-14557197
 ] 

Matt McCline commented on HIVE-10789:
-

Committed to trunk and 1.2 branch.

 union distinct query with NULL constant on both the sides throws Unsuported 
 vector output type: void error
 

 Key: HIVE-10789
 URL: https://issues.apache.org/jira/browse/HIVE-10789
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Fix For: 1.2.1

 Attachments: HIVE-10789.01.patch


 A NULL expression in the SELECT projection list causes an exception to be thrown 
 instead of simply falling back to non-vectorized execution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9070) Add new MiniDrivers to test Vectorization

2015-05-23 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-9070:
---
Issue Type: Bug  (was: Sub-task)
Parent: (was: HIVE-5538)

 Add new MiniDrivers to test Vectorization
 -

 Key: HIVE-9070
 URL: https://issues.apache.org/jira/browse/HIVE-9070
 Project: Hive
  Issue Type: Bug
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical

 Begin the multi-step move towards having Vectorization on by default by 
 creating a new set of MiniDrivers that run the query unit tests with 
 vectorization turned on.
 New drivers:
 TestVecCliDriver
 TestVecNegativeCliDriver
 TestVecMiniTezCliDriver



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8769) Physical optimizer : Incorrect CE results in a shuffle join instead of a Map join (PK/FK pattern not detected)

2015-05-23 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8769:
--
Attachment: HIVE-8769.05.patch

[~jpullokkaran], following your comments, I have rebased the patch.

 Physical optimizer : Incorrect CE results in a shuffle join instead of a Map 
 join (PK/FK pattern not detected)
 --

 Key: HIVE-8769
 URL: https://issues.apache.org/jira/browse/HIVE-8769
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Pengcheng Xiong
 Attachments: HIVE-8769.01.patch, HIVE-8769.02.patch, 
 HIVE-8769.03.patch, HIVE-8769.04.patch, HIVE-8769.05.patch


 TPC-DS Q82 is running slower than Hive 13 because the join type is not 
 correct.
 The estimate for item x inventory x date_dim is 227 million rows while the 
 actual is  3K rows.
 Hive 13 finishes in 753 seconds.
 Hive 14 finishes in 1,267 seconds.
 Hive 14 + force map join finishes in 431 seconds.
 Query
 {code}
 select  i_item_id
,i_item_desc
,i_current_price
  from item, inventory, date_dim, store_sales
  where i_current_price between 30 and 30+30
  and inv_item_sk = i_item_sk
  and d_date_sk=inv_date_sk
  and d_date between '2002-05-30' and '2002-07-30'
  and i_manufact_id in (437,129,727,663)
  and inv_quantity_on_hand between 100 and 500
  and ss_item_sk = i_item_sk
  group by i_item_id,i_item_desc,i_current_price
  order by i_item_id
  limit 100
 {code}
 Plan 
 {code}
 STAGE PLANS:
   Stage: Stage-1
 Tez
   Edges:
 Map 7 <- Map 1 (BROADCAST_EDGE), Map 2 (BROADCAST_EDGE)
 Reducer 4 <- Map 3 (SIMPLE_EDGE), Map 7 (SIMPLE_EDGE)
 Reducer 5 <- Reducer 4 (SIMPLE_EDGE)
 Reducer 6 <- Reducer 5 (SIMPLE_EDGE)
   DagName: mmokhtar_20141106005353_7a2eb8df-12ff-4fe9-89b4-30f1e4e3fb90:1
   Vertices:
 Map 1 
 Map Operator Tree:
 TableScan
   alias: item
   filterExpr: ((i_current_price BETWEEN 30 AND 60 and 
 (i_manufact_id) IN (437, 129, 727, 663)) and i_item_sk is not null) (type: 
 boolean)
   Statistics: Num rows: 462000 Data size: 663862160 Basic 
 stats: COMPLETE Column stats: COMPLETE
   Filter Operator
 predicate: ((i_current_price BETWEEN 30 AND 60 and 
 (i_manufact_id) IN (437, 129, 727, 663)) and i_item_sk is not null) (type: 
 boolean)
 Statistics: Num rows: 115500 Data size: 34185680 Basic 
 stats: COMPLETE Column stats: COMPLETE
 Select Operator
   expressions: i_item_sk (type: int), i_item_id (type: 
 string), i_item_desc (type: string), i_current_price (type: float)
   outputColumnNames: _col0, _col1, _col2, _col3
   Statistics: Num rows: 115500 Data size: 33724832 Basic 
 stats: COMPLETE Column stats: COMPLETE
   Reduce Output Operator
 key expressions: _col0 (type: int)
 sort order: +
 Map-reduce partition columns: _col0 (type: int)
 Statistics: Num rows: 115500 Data size: 33724832 
 Basic stats: COMPLETE Column stats: COMPLETE
 value expressions: _col1 (type: string), _col2 (type: 
 string), _col3 (type: float)
 Execution mode: vectorized
 Map 2 
 Map Operator Tree:
 TableScan
   alias: date_dim
   filterExpr: (d_date BETWEEN '2002-05-30' AND '2002-07-30' 
 and d_date_sk is not null) (type: boolean)
   Statistics: Num rows: 73049 Data size: 81741831 Basic 
 stats: COMPLETE Column stats: COMPLETE
   Filter Operator
 predicate: (d_date BETWEEN '2002-05-30' AND '2002-07-30' 
 and d_date_sk is not null) (type: boolean)
 Statistics: Num rows: 36524 Data size: 3579352 Basic 
 stats: COMPLETE Column stats: COMPLETE
 Select Operator
   expressions: d_date_sk (type: int)
   outputColumnNames: _col0
   Statistics: Num rows: 36524 Data size: 146096 Basic 
 stats: COMPLETE Column stats: COMPLETE
   Reduce Output Operator
 key expressions: _col0 (type: int)
 sort order: +
 Map-reduce partition columns: _col0 (type: int)
 Statistics: Num rows: 36524 Data size: 146096 Basic 
 stats: COMPLETE Column stats: COMPLETE
   Select Operator
 expressions: _col0 (type: int)
 

[jira] [Commented] (HIVE-10659) Beeline command which contains semi-colon as a non-command terminator will fail

2015-05-23 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557254#comment-14557254
 ] 

Lefty Leverenz commented on HIVE-10659:
---

This should be documented in the wiki.  I'm not adding a TODOC label because we 
don't have one for 1.2.1, but we could use the TODOC1.2 label.

* [HiveServer2 Clients -- Beeline | 
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-Beeline–NewCommandLineShell]

 Beeline command which contains semi-colon as a non-command terminator will 
 fail
 ---

 Key: HIVE-10659
 URL: https://issues.apache.org/jira/browse/HIVE-10659
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Fix For: 1.2.1

 Attachments: HIVE-10659.1.patch


 Consider a scenario where beeline is used to connect to a mysql server. The 
 commands executed via beeline can include stored procedures. For e.g. the 
 following command used to create a stored procedure is a valid command :
 {code}
 CREATE PROCEDURE RM_TLBS_LINKID() BEGIN IF EXISTS (SELECT * FROM 
 `INFORMATION_SCHEMA`.`COLUMNS` WHERE `TABLE_NAME` = 'TBLS' AND `COLUMN_NAME` 
 = 'LINK_TARGET_ID') THEN ALTER TABLE `TBLS` DROP FOREIGN KEY `TBLS_FK3` ; 
 ALTER TABLE `TBLS` DROP KEY `TBLS_N51` ; ALTER TABLE `TBLS` DROP COLUMN 
 `LINK_TARGET_ID` ; END IF; END
 {code}
 MySQL stored procedures have the semi-colon ( ; ) as the statement terminator. 
 Since this coincides with beeline's only available command terminator, the 
 semi-colon, beeline is not able to execute the above command successfully; 
 i.e., beeline tries to execute the partial command below instead of the 
 complete command shown above.
 {code}
 CREATE PROCEDURE RM_TLBS_LINKID() BEGIN IF EXISTS (SELECT * FROM 
 `INFORMATION_SCHEMA`.`COLUMNS` WHERE `TABLE_NAME` = 'TBLS' AND `COLUMN_NAME` 
 = 'LINK_TARGET_ID') THEN ALTER TABLE `TBLS` DROP FOREIGN KEY `TBLS_FK3` ; 
 {code} 
 The above situation can actually happen within Hive when Hive SchemaTool is 
 used to upgrade a MySQL metastore db and the scripts used for the upgrade 
 process contain stored procedures (such as the one introduced initially by 
 HIVE-7018). As of now, we cannot have any stored procedure as part of the MySQL 
 metastore db upgrade scripts because SchemaTool uses beeline to connect to 
 MySQL, and beeline fails to execute any create procedure command or similar 
 command containing a semi-colon. This is a serious limitation; it needs to be 
 fixed by allowing the end user to provide an option to beeline to not use the 
 semi-colon as the command delimiter and instead use the newline character as 
 the command delimiter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10800) CBO (Calcite Return Path): Setup correct information if CBO succeeds

2015-05-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557189#comment-14557189
 ] 

Hive QA commented on HIVE-10800:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12734835/HIVE-10800.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8970 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4012/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4012/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4012/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12734835 - PreCommit-HIVE-TRUNK-Build

 CBO (Calcite Return Path): Setup correct information if CBO succeeds
 

 Key: HIVE-10800
 URL: https://issues.apache.org/jira/browse/HIVE-10800
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-10800.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10684) Fix the unit test failures for HIVE-7553 after HIVE-10674 removed the binary jar files

2015-05-23 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557257#comment-14557257
 ] 

Ferdinand Xu commented on HIVE-10684:
-

It's easy to add a cmd script for Windows alongside the bash script in the 
current solution. Any thoughts about the cross-platform issue? Also, is there 
any suggestion for replacing the maven-antrun-plugin? The maven-jar-plugin is 
for the purpose of building a jar for the source code; I am wondering whether 
we can replace the maven-antrun-plugin with the maven-jar-plugin. Any thoughts 
about it?

Thank you! 
Ferd

 Fix the unit test failures for HIVE-7553 after HIVE-10674 removed the binary 
 jar files
 --

 Key: HIVE-10684
 URL: https://issues.apache.org/jira/browse/HIVE-10684
 Project: Hive
  Issue Type: Bug
  Components: Tests
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Attachments: HIVE-10684.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10553) Remove hardcoded Parquet references from SearchArgumentImpl

2015-05-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557264#comment-14557264
 ] 

Hive QA commented on HIVE-10553:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12734917/HIVE-10553.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8970 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4014/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4014/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4014/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12734917 - PreCommit-HIVE-TRUNK-Build

 Remove hardcoded Parquet references from SearchArgumentImpl
 ---

 Key: HIVE-10553
 URL: https://issues.apache.org/jira/browse/HIVE-10553
 Project: Hive
  Issue Type: Sub-task
Reporter: Gopal V
Assignee: Owen O'Malley
 Attachments: HIVE-10553.patch, HIVE-10553.patch


 SARGs currently depend on Parquet code, which causes a tight coupling between 
 parquet releases and storage-api versions.
 Move Parquet code out to its own RecordReader, similar to ORC's SargApplier 
 implementation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10623) Implement hive cli options using beeline functionality

2015-05-23 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557281#comment-14557281
 ] 

Lefty Leverenz commented on HIVE-10623:
---

Does this need to be documented in the wiki (when beeline-cli-branch gets 
merged to master)?

If so, we should either add a TODOC-BEECLI label (or some such) or create a 
JIRA issue for Beeline-CLI issues that need to be documented.  I favor the 
latter.  For examples, see HIVE-9850 and HIVE-9752.

 Implement hive cli options using beeline functionality
 --

 Key: HIVE-10623
 URL: https://issues.apache.org/jira/browse/HIVE-10623
 Project: Hive
  Issue Type: Sub-task
  Components: CLI
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
 Fix For: beeline-cli-branch

 Attachments: HIVE-10623.1.patch, HIVE-10623.2.patch, 
 HIVE-10623.3.patch, HIVE-10623.4.patch, HIVE-10623.patch


 We need to support the original hive cli options for the purpose of backwards 
 compatibility. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10690) ArrayIndexOutOfBounds exception in MetaStoreDirectSql.aggrColStatsForPartitions()

2015-05-23 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-10690:
--
Fix Version/s: 1.2.1

 ArrayIndexOutOfBounds exception in 
 MetaStoreDirectSql.aggrColStatsForPartitions()
 -

 Key: HIVE-10690
 URL: https://issues.apache.org/jira/browse/HIVE-10690
 Project: Hive
  Issue Type: Bug
  Components: Statistics
Affects Versions: 1.2.0
Reporter: Jason Dere
Assignee: Vaibhav Gumashta
 Fix For: 1.3.0, 1.2.1

 Attachments: HIVE-10690.1.patch


 Noticed a bunch of these stack traces in hive.log while running some unit 
 tests:
 {noformat}
 2015-05-11 21:18:59,371 WARN  [main]: metastore.ObjectStore 
 (ObjectStore.java:handleDirectSqlError(2420)) - Direct SQL failed
 java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
 at java.util.ArrayList.rangeCheck(ArrayList.java:635)
 at java.util.ArrayList.get(ArrayList.java:411)
 at 
 org.apache.hadoop.hive.metastore.MetaStoreDirectSql.aggrColStatsForPartitions(MetaStoreDirectSql.java:1132)
 at 
 org.apache.hadoop.hive.metastore.ObjectStore$8.getSqlResult(ObjectStore.java:6162)
 at 
 org.apache.hadoop.hive.metastore.ObjectStore$8.getSqlResult(ObjectStore.java:6158)
 at 
 org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2385)
 at 
 org.apache.hadoop.hive.metastore.ObjectStore.get_aggr_stats_for(ObjectStore.java:6158)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
 at com.sun.proxy.$Proxy84.get_aggr_stats_for(Unknown Source)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_aggr_stats_for(HiveMetaStore.java:5662)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
 at com.sun.proxy.$Proxy86.get_aggr_stats_for(Unknown Source)
 at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAggrColStatsFor(HiveMetaStoreClient.java:2064)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
 at com.sun.proxy.$Proxy87.getAggrColStatsFor(Unknown Source)
 at 
 org.apache.hadoop.hive.ql.metadata.Hive.getAggrColStatsFor(Hive.java:3110)
 at 
 org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:245)
 at 
 org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.updateColStats(RelOptHiveTable.java:329)
 at 
 org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:399)
 at 
 org.apache.hadoop.hive.ql.optimizer.calcite.RelOptHiveTable.getColStat(RelOptHiveTable.java:392)
 at 
 org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveTableScan.getColStat(HiveTableScan.java:150)
 at 
 org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:77)
 at 
 org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdDistinctRowCount.getDistinctRowCount(HiveRelMdDistinctRowCount.java:64)
 at sun.reflect.GeneratedMethodAccessor296.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at 
 org.apache.calcite.rel.metadata.ReflectiveRelMetadataProvider$1$1.invoke(ReflectiveRelMetadataProvider.java:182)
 at com.sun.proxy.$Proxy108.getDistinctRowCount(Unknown Source)
 at sun.reflect.GeneratedMethodAccessor234.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at 

[jira] [Commented] (HIVE-10801) 'drop view' fails throwing java.lang.NullPointerException

2015-05-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557288#comment-14557288
 ] 

Hive QA commented on HIVE-10801:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12734946/HIVE-10801.2.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8973 tests executed
*Failed tests:*
{noformat}
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4015/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4015/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4015/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12734946 - PreCommit-HIVE-TRUNK-Build

 'drop view' fails throwing java.lang.NullPointerException
 -

 Key: HIVE-10801
 URL: https://issues.apache.org/jira/browse/HIVE-10801
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10801.1.patch, HIVE-10801.2.patch


 When trying to drop a view, the hive log shows:
 {code}
 2015-05-21 11:53:06,126 ERROR [HiveServer2-Background-Pool: Thread-197]: 
 hdfs.KeyProviderCache (KeyProviderCache.java:createKeyProviderURI(87)) - 
 Could not find uri with key [dfs.encryption.key.provider.uri] to create a 
 keyProvider !!
 2015-05-21 11:53:06,134 ERROR [HiveServer2-Background-Pool: Thread-197]: 
 metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(155)) - 
 MetaException(message:java.lang.NullPointerException)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5379)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:1734)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
   at com.sun.proxy.$Proxy7.drop_table_with_environment_context(Unknown 
 Source)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.drop_table_with_environment_context(HiveMetaStoreClient.java:2056)
   at 
 org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.drop_table_with_environment_context(SessionHiveMetaStoreClient.java:118)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:968)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropTable(HiveMetaStoreClient.java:904)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
   at com.sun.proxy.$Proxy8.dropTable(Unknown Source)
   at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:1035)
   at org.apache.hadoop.hive.ql.metadata.Hive.dropTable(Hive.java:972)
   at org.apache.hadoop.hive.ql.exec.DDLTask.dropTable(DDLTask.java:3836)
   at 
 org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3692)
   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:331)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1650)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1409)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
   at 
 org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154)
   at 
 

[jira] [Commented] (HIVE-10650) Improve sum() function over windowing to support additional range formats

2015-05-23 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557289#comment-14557289
 ] 

Lefty Leverenz commented on HIVE-10650:
---

Doc note:  This needs to be documented in the wiki for the 1.3.0 release.

* [Windowing and Analytics -- WINDOW clause | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics#LanguageManualWindowingAndAnalytics-WINDOWclause]

 Improve sum() function over windowing to support additional range formats
 -

 Key: HIVE-10650
 URL: https://issues.apache.org/jira/browse/HIVE-10650
 Project: Hive
  Issue Type: Sub-task
  Components: PTF-Windowing
Reporter: Aihua Xu
Assignee: Aihua Xu
  Labels: TODOC1.3
 Fix For: 1.3.0

 Attachments: HIVE-10650.patch


 Support the following windowing range formats: {{x preceding and y preceding}} and 
 {{x following and y following}}.
 e.g.
 {noformat} 
 select sum(value) over (partition by key order by value rows between 2 
 preceding and 1 preceding) from tbl1;
 select sum(value) over (partition by key order by value rows between 
 unbounded preceding and 1 preceding) from tbl1;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6867) Bucketized Table feature fails in some cases

2015-05-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557373#comment-14557373
 ] 

Hive QA commented on HIVE-6867:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12734942/HIVE-6867.04.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8972 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.exec.tez.TestDynamicPartitionPruner.testSingleSourceMultipleFiltersOrdering1
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4017/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4017/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4017/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12734942 - PreCommit-HIVE-TRUNK-Build

 Bucketized Table feature fails in some cases
 

 Key: HIVE-6867
 URL: https://issues.apache.org/jira/browse/HIVE-6867
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.12.0
Reporter: Laljo John Pullokkaran
Assignee: Pengcheng Xiong
 Attachments: HIVE-6867.01.patch, HIVE-6867.02.patch, 
 HIVE-6867.03.patch, HIVE-6867.04.patch


 Bucketized Table feature fails in some cases. If src and destination are 
 bucketed on the same key, and the actual data in src is not bucketed (because 
 the data got loaded using LOAD DATA LOCAL INPATH), then the data won't be 
 bucketed while writing to the destination.
 Example
 --
 CREATE TABLE P1(key STRING, val STRING)
 CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
 LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE 
 P1;
 -- perform an insert to make sure there are 2 files
 INSERT OVERWRITE TABLE P1 select key, val from P1;
 --
 This is not a regression; it has never worked.
 It was only discovered due to Hadoop2 changes.
 In Hadoop1, in local mode, the number of reducers is always 1, regardless of 
 what the app requests. Hadoop2 now honors the requested number of reducers in 
 local mode (by spawning threads).
 The long-term solution seems to be to prevent LOAD DATA for bucketed tables.
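 As a general workaround sketch only (not from this report; the file path and the 
 staging table are hypothetical), a bucketed table can be populated through an 
 INSERT with bucketing enforcement enabled instead of LOAD DATA:
 {code}
 set hive.enforce.bucketing=true;
 -- hypothetical unbucketed staging table holding the raw file
 CREATE TABLE P1_STAGE(key STRING, val STRING) STORED AS TEXTFILE;
 LOAD DATA LOCAL INPATH '/path/to/P1.txt' INTO TABLE P1_STAGE;
 -- the insert shuffles the rows so they land in the 2 buckets of P1
 INSERT OVERWRITE TABLE P1 SELECT key, val FROM P1_STAGE;
 {code}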



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10806) Incorrect example for exploding map function in hive wiki

2015-05-23 Thread Gabor Liptak (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557443#comment-14557443
 ] 

Gabor Liptak commented on HIVE-10806:
-

This is the section referenced:

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-explode

I do not seem to have permissions to edit the wiki ...

 Incorrect example for exploding map function in hive wiki
 -

 Key: HIVE-10806
 URL: https://issues.apache.org/jira/browse/HIVE-10806
 Project: Hive
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.10.0
Reporter: anup b
Priority: Trivial

 In the Hive wiki, the example for exploding a map is wrong; it doesn't work in 
 Hive 0.10.
 Example given in the wiki which doesn't work:
 SELECT explode(myMap) AS myMapKey, myMapValue FROM myMapTable;
 It should be updated to:
 SELECT explode(myMap) AS (myMapKey, myMapValue) FROM myMapTable;
 Link : 
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-explode
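 For context, a minimal end-to-end sketch of the corrected syntax (the table and 
 data below are hypothetical, not from the report):
 {code}
 CREATE TABLE myMapTable (myMap map<string,int>);
 -- assume myMap holds, e.g., {'a':1, 'b':2}
 SELECT explode(myMap) AS (myMapKey, myMapValue) FROM myMapTable;
 -- returns one row per map entry: (a, 1) and (b, 2)
 {code}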



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10803) document jdbc url format properly

2015-05-23 Thread Gabor Liptak (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557444#comment-14557444
 ] 

Gabor Liptak commented on HIVE-10803:
-

[~thejas] How does one get access to edit the wiki?

 document jdbc url format properly
 -

 Key: HIVE-10803
 URL: https://issues.apache.org/jira/browse/HIVE-10803
 Project: Hive
  Issue Type: Bug
  Components: Documentation, HiveServer2
Reporter: Thejas M Nair

 This is the format of the HS2 connection string; it needs to be documented in 
 the wiki doc (taken from jdbc.Utils.java):
  
 jdbc:hive2://host1:port1,host2:port2/dbName;sess_var_list?hive_conf_list#hive_var_list
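 As an illustration only (the hosts, database, and variables below are made up, 
 not taken from the issue), a URL following that format could look like:
 jdbc:hive2://hs2node1:10000,hs2node2:10000/default;ssl=true;transportMode=binary?hive.execution.engine=tez#mydate=2015-05-23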



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10807) Invalidate basic stats for insert queries if autogather=false

2015-05-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557458#comment-14557458
 ] 

Hive QA commented on HIVE-10807:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12734976/HIVE-10807.patch

{color:red}ERROR:{color} -1 due to 134 failed/errored test(s), 8972 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_spark3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin_negative
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin_negative2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin_negative3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_tbllvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_display_colstats_tbllvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_serde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_temp_table_display_colstats_tbllvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_list_bucket
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_join_unencrypted_tbl
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin7
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join_filters
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_3

[jira] [Updated] (HIVE-8769) Physical optimizer : Incorrect CE results in a shuffle join instead of a Map join (PK/FK pattern not detected)

2015-05-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8769:
---
Affects Version/s: 1.1.0
   1.0.0
   1.2.0

 Physical optimizer : Incorrect CE results in a shuffle join instead of a Map 
 join (PK/FK pattern not detected)
 --

 Key: HIVE-8769
 URL: https://issues.apache.org/jira/browse/HIVE-8769
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0
Reporter: Mostafa Mokhtar
Assignee: Pengcheng Xiong
 Fix For: 1.2.1

 Attachments: HIVE-8769.01.patch, HIVE-8769.02.patch, 
 HIVE-8769.03.patch, HIVE-8769.04.patch, HIVE-8769.05.patch


 TPC-DS Q82 runs slower than Hive 13 because the join type is not correct.
 The estimate for item x inventory x date_dim is 227 million rows while the 
 actual is 3K rows.
 Hive 13 finishes in 753 seconds.
 Hive 14 finishes in 1,267 seconds.
 Hive 14 + forced map join finishes in 431 seconds.
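 For reference, one common way to force the map join mentioned above is via the 
 auto-convert join settings; whether this matches the benchmark configuration is 
 not stated here, and the threshold value below is an arbitrary illustration:
 {code}
 set hive.auto.convert.join=true;
 set hive.auto.convert.join.noconditionaltask=true;
 -- raise the size threshold so the dimension-table side qualifies for a map join
 set hive.auto.convert.join.noconditionaltask.size=1000000000;
 {code}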
 Query
 {code}
 select  i_item_id
,i_item_desc
,i_current_price
  from item, inventory, date_dim, store_sales
  where i_current_price between 30 and 30+30
  and inv_item_sk = i_item_sk
  and d_date_sk=inv_date_sk
  and d_date between '2002-05-30' and '2002-07-30'
  and i_manufact_id in (437,129,727,663)
  and inv_quantity_on_hand between 100 and 500
  and ss_item_sk = i_item_sk
  group by i_item_id,i_item_desc,i_current_price
  order by i_item_id
  limit 100
 {code}
 Plan 
 {code}
 STAGE PLANS:
   Stage: Stage-1
 Tez
   Edges:
 Map 7 - Map 1 (BROADCAST_EDGE), Map 2 (BROADCAST_EDGE)
 Reducer 4 - Map 3 (SIMPLE_EDGE), Map 7 (SIMPLE_EDGE)
 Reducer 5 - Reducer 4 (SIMPLE_EDGE)
 Reducer 6 - Reducer 5 (SIMPLE_EDGE)
   DagName: mmokhtar_20141106005353_7a2eb8df-12ff-4fe9-89b4-30f1e4e3fb90:1
   Vertices:
 Map 1 
 Map Operator Tree:
 TableScan
   alias: item
   filterExpr: ((i_current_price BETWEEN 30 AND 60 and 
 (i_manufact_id) IN (437, 129, 727, 663)) and i_item_sk is not null) (type: 
 boolean)
   Statistics: Num rows: 462000 Data size: 663862160 Basic 
 stats: COMPLETE Column stats: COMPLETE
   Filter Operator
 predicate: ((i_current_price BETWEEN 30 AND 60 and 
 (i_manufact_id) IN (437, 129, 727, 663)) and i_item_sk is not null) (type: 
 boolean)
 Statistics: Num rows: 115500 Data size: 34185680 Basic 
 stats: COMPLETE Column stats: COMPLETE
 Select Operator
   expressions: i_item_sk (type: int), i_item_id (type: 
 string), i_item_desc (type: string), i_current_price (type: float)
   outputColumnNames: _col0, _col1, _col2, _col3
   Statistics: Num rows: 115500 Data size: 33724832 Basic 
 stats: COMPLETE Column stats: COMPLETE
   Reduce Output Operator
 key expressions: _col0 (type: int)
 sort order: +
 Map-reduce partition columns: _col0 (type: int)
 Statistics: Num rows: 115500 Data size: 33724832 
 Basic stats: COMPLETE Column stats: COMPLETE
 value expressions: _col1 (type: string), _col2 (type: 
 string), _col3 (type: float)
 Execution mode: vectorized
 Map 2 
 Map Operator Tree:
 TableScan
   alias: date_dim
   filterExpr: (d_date BETWEEN '2002-05-30' AND '2002-07-30' 
 and d_date_sk is not null) (type: boolean)
   Statistics: Num rows: 73049 Data size: 81741831 Basic 
 stats: COMPLETE Column stats: COMPLETE
   Filter Operator
 predicate: (d_date BETWEEN '2002-05-30' AND '2002-07-30' 
 and d_date_sk is not null) (type: boolean)
 Statistics: Num rows: 36524 Data size: 3579352 Basic 
 stats: COMPLETE Column stats: COMPLETE
 Select Operator
   expressions: d_date_sk (type: int)
   outputColumnNames: _col0
   Statistics: Num rows: 36524 Data size: 146096 Basic 
 stats: COMPLETE Column stats: COMPLETE
   Reduce Output Operator
 key expressions: _col0 (type: int)
 sort order: +
 Map-reduce partition columns: _col0 (type: int)
 Statistics: Num rows: 36524 Data size: 146096 Basic 
 stats: COMPLETE Column stats: COMPLETE
   Select Operator
 

[jira] [Updated] (HIVE-8769) Physical optimizer : Incorrect CE results in a shuffle join instead of a Map join (PK/FK pattern not detected)

2015-05-23 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8769:
---
Issue Type: Improvement  (was: Bug)

 Physical optimizer : Incorrect CE results in a shuffle join instead of a Map 
 join (PK/FK pattern not detected)
 --

 Key: HIVE-8769
 URL: https://issues.apache.org/jira/browse/HIVE-8769
 Project: Hive
  Issue Type: Improvement
  Components: Physical Optimizer
Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0
Reporter: Mostafa Mokhtar
Assignee: Pengcheng Xiong
 Fix For: 1.2.1

 Attachments: HIVE-8769.01.patch, HIVE-8769.02.patch, 
 HIVE-8769.03.patch, HIVE-8769.04.patch, HIVE-8769.05.patch


 TPC-DS Q82 runs slower than Hive 13 because the join type is not correct.
 The estimate for item x inventory x date_dim is 227 million rows while the 
 actual is 3K rows.
 Hive 13 finishes in 753 seconds.
 Hive 14 finishes in 1,267 seconds.
 Hive 14 + forced map join finishes in 431 seconds.
 Query
 {code}
 select  i_item_id
,i_item_desc
,i_current_price
  from item, inventory, date_dim, store_sales
  where i_current_price between 30 and 30+30
  and inv_item_sk = i_item_sk
  and d_date_sk=inv_date_sk
  and d_date between '2002-05-30' and '2002-07-30'
  and i_manufact_id in (437,129,727,663)
  and inv_quantity_on_hand between 100 and 500
  and ss_item_sk = i_item_sk
  group by i_item_id,i_item_desc,i_current_price
  order by i_item_id
  limit 100
 {code}
 Plan 
 {code}
 STAGE PLANS:
   Stage: Stage-1
 Tez
   Edges:
 Map 7 - Map 1 (BROADCAST_EDGE), Map 2 (BROADCAST_EDGE)
 Reducer 4 - Map 3 (SIMPLE_EDGE), Map 7 (SIMPLE_EDGE)
 Reducer 5 - Reducer 4 (SIMPLE_EDGE)
 Reducer 6 - Reducer 5 (SIMPLE_EDGE)
   DagName: mmokhtar_20141106005353_7a2eb8df-12ff-4fe9-89b4-30f1e4e3fb90:1
   Vertices:
 Map 1 
 Map Operator Tree:
 TableScan
   alias: item
   filterExpr: ((i_current_price BETWEEN 30 AND 60 and 
 (i_manufact_id) IN (437, 129, 727, 663)) and i_item_sk is not null) (type: 
 boolean)
   Statistics: Num rows: 462000 Data size: 663862160 Basic 
 stats: COMPLETE Column stats: COMPLETE
   Filter Operator
 predicate: ((i_current_price BETWEEN 30 AND 60 and 
 (i_manufact_id) IN (437, 129, 727, 663)) and i_item_sk is not null) (type: 
 boolean)
 Statistics: Num rows: 115500 Data size: 34185680 Basic 
 stats: COMPLETE Column stats: COMPLETE
 Select Operator
   expressions: i_item_sk (type: int), i_item_id (type: 
 string), i_item_desc (type: string), i_current_price (type: float)
   outputColumnNames: _col0, _col1, _col2, _col3
   Statistics: Num rows: 115500 Data size: 33724832 Basic 
 stats: COMPLETE Column stats: COMPLETE
   Reduce Output Operator
 key expressions: _col0 (type: int)
 sort order: +
 Map-reduce partition columns: _col0 (type: int)
 Statistics: Num rows: 115500 Data size: 33724832 
 Basic stats: COMPLETE Column stats: COMPLETE
 value expressions: _col1 (type: string), _col2 (type: 
 string), _col3 (type: float)
 Execution mode: vectorized
 Map 2 
 Map Operator Tree:
 TableScan
   alias: date_dim
   filterExpr: (d_date BETWEEN '2002-05-30' AND '2002-07-30' 
 and d_date_sk is not null) (type: boolean)
   Statistics: Num rows: 73049 Data size: 81741831 Basic 
 stats: COMPLETE Column stats: COMPLETE
   Filter Operator
 predicate: (d_date BETWEEN '2002-05-30' AND '2002-07-30' 
 and d_date_sk is not null) (type: boolean)
 Statistics: Num rows: 36524 Data size: 3579352 Basic 
 stats: COMPLETE Column stats: COMPLETE
 Select Operator
   expressions: d_date_sk (type: int)
   outputColumnNames: _col0
   Statistics: Num rows: 36524 Data size: 146096 Basic 
 stats: COMPLETE Column stats: COMPLETE
   Reduce Output Operator
 key expressions: _col0 (type: int)
 sort order: +
 Map-reduce partition columns: _col0 (type: int)
 Statistics: Num rows: 36524 Data size: 146096 Basic 
 stats: COMPLETE Column stats: COMPLETE
   Select Operator
 expressions: _col0 (type: int)

[jira] [Updated] (HIVE-10811) RelFieldTrimmer throws NoSuchElementException in some cases

2015-05-23 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10811:
---
Attachment: HIVE-10811.patch

 RelFieldTrimmer throws NoSuchElementException in some cases
 ---

 Key: HIVE-10811
 URL: https://issues.apache.org/jira/browse/HIVE-10811
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Attachments: HIVE-10811.patch


 RelFieldTrimmer runs into NoSuchElementException in some cases.
 Stack trace:
 {noformat}
 Exception in thread main java.lang.AssertionError: Internal error: While 
 invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)'
   at org.apache.calcite.util.Util.newInternal(Util.java:743)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trim(RelFieldTrimmer.java:175)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:947)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:820)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:768)
   at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:109)
   at 
 org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:730)
   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:145)
   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:105)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:607)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:244)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10048)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:207)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:536)
   ... 32 more
 Caused by: java.lang.AssertionError: Internal error: While invoking method 
 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)'
   at org.apache.calcite.util.Util.newInternal(Util.java:743)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269)
   at 
 

[jira] [Commented] (HIVE-10803) document jdbc url format properly

2015-05-23 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557485#comment-14557485
 ] 

Thejas M Nair commented on HIVE-10803:
--

How to get permission to edit:

1. Create a Confluence account if you don't already have one.
2. Sign up for the user mailing list by sending a message to 
user-subscr...@hive.apache.org.
3. Send a message to u...@hive.apache.org requesting write access to the Hive 
wiki, and provide your Confluence username.

(from: https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki)


 document jdbc url format properly
 -

 Key: HIVE-10803
 URL: https://issues.apache.org/jira/browse/HIVE-10803
 Project: Hive
  Issue Type: Bug
  Components: Documentation, HiveServer2
Reporter: Thejas M Nair

 This is the format of the HS2 connection string; it needs to be documented in 
 the wiki doc (taken from jdbc.Utils.java):
  
 jdbc:hive2://host1:port1,host2:port2/dbName;sess_var_list?hive_conf_list#hive_var_list



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10811) RelFieldTrimmer throws NoSuchElementException in some cases

2015-05-23 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10811:
---
Component/s: CBO

 RelFieldTrimmer throws NoSuchElementException in some cases
 ---

 Key: HIVE-10811
 URL: https://issues.apache.org/jira/browse/HIVE-10811
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez

 RelFieldTrimmer runs into NoSuchElementException in some cases.
 Stack trace:
 {noformat}
 Exception in thread main java.lang.AssertionError: Internal error: While 
 invoking method 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)'
   at org.apache.calcite.util.Util.newInternal(Util.java:743)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trim(RelFieldTrimmer.java:175)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:947)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:820)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:768)
   at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:109)
   at 
 org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:730)
   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:145)
   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:105)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:607)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:244)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10048)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:207)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
 Caused by: java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:536)
   ... 32 more
 Caused by: java.lang.AssertionError: Internal error: While invoking method 
 'public org.apache.calcite.sql2rel.RelFieldTrimmer$TrimResult 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trimFields(org.apache.calcite.rel.core.Sort,org.apache.calcite.util.ImmutableBitSet,java.util.Set)'
   at org.apache.calcite.util.Util.newInternal(Util.java:743)
   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:543)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.dispatchTrimFields(RelFieldTrimmer.java:269)
   at 
 org.apache.calcite.sql2rel.RelFieldTrimmer.trimChild(RelFieldTrimmer.java:210)
   at 
 

[jira] [Commented] (HIVE-9069) Simplify filter predicates for CBO

2015-05-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557619#comment-14557619
 ] 

Hive QA commented on HIVE-9069:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735036/HIVE-9069.12.patch

{color:red}ERROR:{color} -1 due to 21 failed/errored test(s), 8969 tests 
executed
*Failed tests:*
{noformat}
TestContribNegativeCliDriver - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_auto_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_explain_logical
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_oneskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_gby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_gby2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_col
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_vc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqualcolumnrefs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4025/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4025/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4025/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 21 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12735036 - PreCommit-HIVE-TRUNK-Build

 Simplify filter predicates for CBO
 --

 Key: HIVE-9069
 URL: https://issues.apache.org/jira/browse/HIVE-9069
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Jesus Camacho Rodriguez
 Fix For: 0.14.1

 Attachments: HIVE-9069.01.patch, HIVE-9069.02.patch, 
 HIVE-9069.03.patch, HIVE-9069.04.patch, HIVE-9069.05.patch, 
 HIVE-9069.06.patch, HIVE-9069.07.patch, HIVE-9069.08.patch, 
 HIVE-9069.08.patch, HIVE-9069.09.patch, HIVE-9069.10.patch, 
 HIVE-9069.11.patch, HIVE-9069.12.patch, HIVE-9069.patch


 Simplify predicates for disjunctive predicates so that can get pushed down to 
 the scan.
 Looks like this is still an issue, some of the filters can be pushed down to 
 the scan.
 {code}
 set hive.cbo.enable=true
 set hive.stats.fetch.column.stats=true
 set hive.exec.dynamic.partition.mode=nonstrict
 set hive.tez.auto.reducer.parallelism=true
 set hive.auto.convert.join.noconditionaltask.size=32000
 set hive.exec.reducers.bytes.per.reducer=1
 set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager
 set hive.support.concurrency=false
 set hive.tez.exec.print.summary=true
 explain  
 select  substr(r_reason_desc,1,20) as r
,avg(ws_quantity) wq
,avg(wr_refunded_cash) ref
,avg(wr_fee) fee
  from web_sales, web_returns, web_page, customer_demographics cd1,
   customer_demographics cd2, customer_address, date_dim, reason 
  where web_sales.ws_web_page_sk = web_page.wp_web_page_sk
and web_sales.ws_item_sk = web_returns.wr_item_sk
and web_sales.ws_order_number = web_returns.wr_order_number
and web_sales.ws_sold_date_sk = date_dim.d_date_sk and d_year = 1998
and cd1.cd_demo_sk = web_returns.wr_refunded_cdemo_sk 
and cd2.cd_demo_sk = web_returns.wr_returning_cdemo_sk
and customer_address.ca_address_sk = web_returns.wr_refunded_addr_sk
and reason.r_reason_sk = web_returns.wr_reason_sk
and
(
 (
  cd1.cd_marital_status = 'M'
  and
  cd1.cd_marital_status = cd2.cd_marital_status
  and
  cd1.cd_education_status = '4 yr Degree'
  and 
  

[jira] [Commented] (HIVE-10812) Scaling PK/FK's selectivity for stats annotation

2015-05-23 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557635#comment-14557635
 ] 

Hive QA commented on HIVE-10812:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735040/HIVE-10812.01.patch

{color:red}ERROR:{color} -1 due to 44 failed/errored test(s), 8973 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_crc32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_sha1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join30
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket5
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucket6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketizedhiveinputformat
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin6
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_bucketmapjoin7
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_disable_merge_for_bucketing
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_empty_dir_in_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_external_table_with_space_in_location_path
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap_auto
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_bucketed_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_merge
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_num_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_infer_bucket_sort_reducers_power_two
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_leftsemijoin_mr
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_list_bucket_dml_10
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_parallel_orderby
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_quotedid_smb
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_remote_script
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_schemeAuthority2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_scriptfile1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_smb_mapjoin_8
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_stats_counter
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_truncate_column_buckets
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_uber_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_annotate_stats_join
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4026/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4026/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4026/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 44 tests failed
{noformat}

This message is automatically generated.