[jira] [Created] (HIVE-27083) Hive build fails when unit tests are enabled

2023-02-15 Thread yutiantian (Jira)
yutiantian created HIVE-27083:
-

 Summary: Hive build fails when unit tests are enabled
 Key: HIVE-27083
 URL: https://issues.apache.org/jira/browse/HIVE-27083
 Project: Hive
  Issue Type: Bug
Reporter: yutiantian


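Environment: Linux

Hive version: 2.3.7

Hive was built with its unit tests enabled.

Command: mvn clean package -Phadoop-2 -Pdist -Dtar -Dmaven.test.failure.ignore=true

A large number of errors are reported during the build. The error output is as follows: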

[DEBUG] Forking command line: /bin/sh -c cd /home/yutiantian/src/spark-hive/ql && /home/yutiantian/software/jdk1.8.0_181/jre/bin/java -Xmx1024m -XX:MaxPermSize=256M -jar /home/yutiantian/src/spark-hive/ql/target/surefire/surefirebooter3681176817358000819.jar /home/yutiantian/src/spark-hive/ql/target/surefire 2023-02-15T10-46-58_934-jvmRun1 surefire7006618945989804747tmp surefire_2009499396449910tmp
[DEBUG] Fork Channel [1] connected to the client.
[ERROR] Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=256M; support was removed in 8.0
[INFO] Running org.apache.hadoop.hive.ql.exec.TestOperators
[ERROR] Tests run: 7, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 67.938 s <<< FAILURE! - in org.apache.hadoop.hive.ql.exec.TestOperators
[ERROR] org.apache.hadoop.hive.ql.exec.TestOperators.testScriptOperator  Time elapsed: 24.728 s  <<< ERROR!
java.lang.ExceptionInInitializerError
    at org.apache.hadoop.hive.ql.exec.TestOperators.testScriptOperator(TestOperators.java:216)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at junit.framework.TestCase.runTest(TestCase.java:176)
    at junit.framework.TestCase.runBare(TestCase.java:141)
    at junit.framework.TestResult$1.protect(TestResult.java:122)
    at junit.framework.TestResult.runProtected(TestResult.java:142)
    at junit.framework.TestResult.run(TestResult.java:125)
    at junit.framework.TestCase.run(TestCase.java:129)
    at junit.framework.TestSuite.runTest(TestSuite.java:255)
    at junit.framework.TestSuite.run(TestSuite.java:250)
    at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:364)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:272)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:237)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:158)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:428)
    at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:162)
    at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:562)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:548)
Caused by: java.lang.RuntimeException: Encountered throwable
    at org.apache.hadoop.hive.ql.exec.TestExecDriver.(TestExecDriver.java:149)
    ... 22 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:236)
    at org.apache.hadoop.hive.ql.metadata.Hive.(Hive.java:395)
    at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:339)
    at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:319)
    at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:288)
    at org.apache.hadoop.hive.ql.exec.TestExecDriver.(TestExecDriver.java:135)
    ... 22 more
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1742)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:83)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:133)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3607)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3659)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3639)
    at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3901)
    at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:248)
    at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:231)
    ... 27 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at 
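
The trace above nests several "Caused by:" sections, and the innermost one is usually the most informative. As a rough aid for reading such output, the following Python sketch pulls the exception class names out of a trace in order. The helper name cause_chain and the SAMPLE excerpt are illustrative only and are not part of Hive, Maven, or Surefire. The trace in this message is cut off, so the last entry found here is not necessarily the real root cause.

```python
import re

# Matches the exception class name at the start of a "Caused by:" line.
CAUSED_BY = re.compile(r"^Caused by: ([\w.$]+)")

def cause_chain(trace: str) -> list[str]:
    """Return exception class names from 'Caused by:' lines, outermost first."""
    chain = []
    for line in trace.splitlines():
        match = CAUSED_BY.match(line.strip())
        if match:
            chain.append(match.group(1))
    return chain

# Abbreviated excerpt of the trace in this message (stack frames omitted).
SAMPLE = """\
java.lang.ExceptionInInitializerError
Caused by: java.lang.RuntimeException: Encountered throwable
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Caused by: java.lang.reflect.InvocationTargetException
"""

if __name__ == "__main__":
    for depth, name in enumerate(cause_chain(SAMPLE), start=1):
        print(depth, name)
```

Running this against the full, untruncated Surefire report would show whatever exception sits below InvocationTargetException, which this message does not include.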

[jira] [Created] (HIVE-27084) Iceberg: Stats are not populated correctly during query compilation

2023-02-15 Thread Rajesh Balamohan (Jira)
Rajesh Balamohan created HIVE-27084:
---

 Summary: Iceberg: Stats are not populated correctly during query compilation
 Key: HIVE-27084
 URL: https://issues.apache.org/jira/browse/HIVE-27084
 Project: Hive
  Issue Type: Improvement
  Components: Iceberg integration
Reporter: Rajesh Balamohan


- Table stats are not used/computed correctly during the query compilation phase.
 - Here is an example: the query with the additional filter is estimated to scan more data than the query without it.

This is just an example; real-world queries can end up with bad query plans because of this.

{{303658262936 with the additional filter, vs 10470974584 without it}}
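To put the two TableScan estimates side by side, here is a small Python sketch that derives the implied bytes per row from the figures in the plans in this message. The constant and function names are illustrative only. Reading the gap as a difference in estimated row width is an interpretation of the numbers, not something Hive reports directly.

```python
# Figures copied from the TableScan "Statistics" lines of the two plans.
NUM_ROWS = 2_755_519_629
SIZE_TWO_PREDICATES = 303_658_262_936  # ss_sold_date_sk = ... and ss_wholesale_cost > 0
SIZE_ONE_PREDICATE = 10_470_974_584    # ss_sold_date_sk = ... only

def bytes_per_row(data_size: int, num_rows: int) -> float:
    """Average row width implied by a plan's Data size and Num rows."""
    return data_size / num_rows

if __name__ == "__main__":
    wide = bytes_per_row(SIZE_TWO_PREDICATES, NUM_ROWS)
    narrow = bytes_per_row(SIZE_ONE_PREDICATE, NUM_ROWS)
    print(f"two predicates: {wide:.1f} bytes/row")
    print(f"one predicate:  {narrow:.1f} bytes/row")
    print(f"size ratio:     {SIZE_TWO_PREDICATES / SIZE_ONE_PREDICATE:.1f}x")
```

Both plans report the same Num rows, so the entire difference comes from the Data size estimate: the scan with the extra predicate is estimated at exactly 29 times the size of the scan without it.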

{noformat}
explain select count(*) from store_sales where ss_sold_date_sk=2450822 and ss_wholesale_cost > 0.0

Explain
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Tez
      DagId: hive_20230216065808_80d68e3f-3a6b-422b-9265-50bc707ae3c6:48
      Edges:
        Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
      DagName: hive_20230216065808_80d68e3f-3a6b-422b-9265-50bc707ae3c6:48
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: store_sales
                  filterExpr: ((ss_sold_date_sk = 2450822) and (ss_wholesale_cost > 0)) (type: boolean)
                  Statistics: Num rows: 2755519629 Data size: 303658262936 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: ((ss_sold_date_sk = 2450822) and (ss_wholesale_cost > 0)) (type: boolean)
                    Statistics: Num rows: 5 Data size: 550 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      Statistics: Num rows: 5 Data size: 550 Basic stats: COMPLETE Column stats: NONE
                      Group By Operator
                        aggregations: count()
                        minReductionHashAggr: 0.99
                        mode: hash
                        outputColumnNames: _col0
                        Statistics: Num rows: 1 Data size: 124 Basic stats: COMPLETE Column stats: NONE
                        Reduce Output Operator
                          null sort order:
                          sort order:
                          Statistics: Num rows: 1 Data size: 124 Basic stats: COMPLETE Column stats: NONE
                          value expressions: _col0 (type: bigint)
            Execution mode: vectorized, llap
            LLAP IO: all inputs (cache only)
        Reducer 2
            Execution mode: vectorized, llap
            Reduce Operator Tree:
              Group By Operator
                aggregations: count(VALUE._col0)
                mode: mergepartial
                outputColumnNames: _col0
                Statistics: Num rows: 1 Data size: 124 Basic stats: COMPLETE Column stats: NONE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 1 Data size: 124 Basic stats: COMPLETE Column stats: NONE
                  table:
                      input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink

58 rows selected (0.73 seconds)



explain select count(*) from store_sales where ss_sold_date_sk=2450822
INFO  : Starting task [Stage-3:EXPLAIN] in serial mode
INFO  : Completed executing command(queryId=hive_20230216065813_e51482a2-1c9a-41a7-b1b3-9aec2fba9ba7); Time taken: 0.061 seconds
INFO  : OK
Explain
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Tez
      DagId: hive_20230216065813_e51482a2-1c9a-41a7-b1b3-9aec2fba9ba7:49
      Edges:
        Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
      DagName: hive_20230216065813_e51482a2-1c9a-41a7-b1b3-9aec2fba9ba7:49
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: store_sales
                  filterExpr: (ss_sold_date_sk = 2450822) (type: boolean)
                  Statistics: Num rows: 2755519629 Data size: 10470974584 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (ss_sold_date_sk = 2450822) (type: boolean)
                    Statistics: Num rows: 5 Data size: 18 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      Statistics: Num rows: 5 Data size: 18 Basic stats: COMPLETE Column stats: NONE
                      Group By Operator
                        aggregations: count()
                        minReductionHashAggr: 0.99