[jira] [Commented] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060440#comment-16060440 ] Hive QA commented on HIVE-11297: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12874190/HIVE-11297.8.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 10846 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=238) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=238) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_create] (batchId=83) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=99) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=233) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union24] (batchId=125) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178) {noformat} Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/5739/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5739/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5739/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12874190 - PreCommit-HIVE-Build > Combine op trees for partition info generating tasks [Spark branch] > --- > > Key: HIVE-11297 > URL: https://issues.apache.org/jira/browse/HIVE-11297 > Project: Hive > Issue Type: Bug >Affects Versions: spark-branch >Reporter: Chao Sun >Assignee: liyunzhang_intel > Attachments: HIVE-11297.1.patch, HIVE-11297.2.patch, > HIVE-11297.3.patch, HIVE-11297.4.patch, HIVE-11297.5.patch, > HIVE-11297.6.patch, HIVE-11297.7.patch, HIVE-11297.8.patch, hive-site.xml > > > Currently, for dynamic partition pruning in Spark, if a small table generates > partition info for more than one partition column, multiple operator trees > are created, which all start from the same table scan op but have different > spark partition pruning sinks. > As an optimization, we can combine these op trees so that the table scan does not have to be done multiple times. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
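The combine-op-trees idea above can be sketched with a toy operator model. The classes below are illustrative only, not Hive's actual Operator API: instead of keeping N trees that each begin with their own scan of the small table, we keep a single scan node and hang every partition-pruning branch off it, so the table is read once.

```java
import java.util.ArrayList;
import java.util.List;

// Toy operator-tree model; illustrative only, not Hive's real
// Operator/TableScanOperator classes.
public class CombineOpTrees {
    static class Op {
        final String name;
        final List<Op> children = new ArrayList<>();
        Op(String name) { this.name = name; }
        Op add(Op child) { children.add(child); return this; }
    }

    // Merge several trees whose roots scan the same table into one tree:
    // keep the first root and move every other root's branches under it.
    static Op merge(List<Op> roots) {
        Op combined = roots.get(0);
        for (Op other : roots.subList(1, roots.size())) {
            combined.children.addAll(other.children);
        }
        return combined;
    }

    public static void main(String[] args) {
        // Two trees, each scanning the same small table but feeding a
        // different partition pruning sink (one per partition column).
        Op tree1 = new Op("TS[small_table]");
        tree1.add(new Op("Spark Partition Pruning Sink[ds]"));
        Op tree2 = new Op("TS[small_table]");
        tree2.add(new Op("Spark Partition Pruning Sink[hr]"));

        Op combined = merge(List.of(tree1, tree2));
        // prints: TS[small_table] now feeds 2 pruning sinks
        System.out.println(combined.name + " now feeds "
            + combined.children.size() + " pruning sinks");
    }
}
```

After the merge there is one table scan feeding both pruning sinks, which is the effect the optimization describes.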
[jira] [Commented] (HIVE-16943) MoveTask should separate src FileSystem from dest FileSystem
[ https://issues.apache.org/jira/browse/HIVE-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060405#comment-16060405 ] Hive QA commented on HIVE-16943: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12874172/HIVE-16943.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 10730 tests executed *Failed tests:* {noformat} TestCleaner2 - did not produce a TEST-*.xml file (likely timed out) (batchId=258) TestConvertAstToSearchArg - did not produce a TEST-*.xml file (likely timed out) (batchId=258) TestIOContextMap - did not produce a TEST-*.xml file (likely timed out) (batchId=258) TestInitiator - did not produce a TEST-*.xml file (likely timed out) (batchId=258) TestRecordIdentifier - did not produce a TEST-*.xml file (likely timed out) (batchId=258) TestSearchArgumentImpl - did not produce a TEST-*.xml file (likely timed out) (batchId=258) TestWorker - did not produce a TEST-*.xml file (likely timed out) (batchId=258) TestWorker2 - did not produce a TEST-*.xml file (likely timed out) (batchId=258) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=146) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=233) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union24] (batchId=125) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217) 
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5738/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5738/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5738/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 20 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12874172 - PreCommit-HIVE-Build > MoveTask should separate src FileSystem from dest FileSystem > - > > Key: HIVE-16943 > URL: https://issues.apache.org/jira/browse/HIVE-16943 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 3.0.0 >Reporter: Fei Hui >Assignee: Fei Hui > Attachments: HIVE-16943.1.patch > > > {code:title=MoveTask.java|borderStyle=solid} > private void moveFileInDfs (Path sourcePath, Path targetPath, FileSystem fs) > throws HiveException, IOException { > // if source exists, rename. 
Otherwise, create an empty directory
> if (fs.exists(sourcePath)) {
>   Path deletePath = null;
>   // If multiple levels of folders are there, fs.rename fails, so first
>   // create targetPath.getParent() if it does not exist
>   if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_INSERT_INTO_MULTILEVEL_DIRS)) {
>     deletePath = createTargetPath(targetPath, fs);
>   }
>   Hive.clearDestForSubDirSrc(conf, targetPath, sourcePath, false);
>   if (!Hive.moveFile(conf, sourcePath, targetPath, true, false)) {
>     try {
>       if (deletePath != null) {
>         fs.delete(deletePath, true);
>       }
>     } catch (IOException e) {
>       LOG.info("Unable to delete the path created for facilitating rename: " + deletePath);
>     }
>     throw new HiveException("Unable to rename: " + sourcePath + " to: " + targetPath);
>   }
> } else if (!fs.mkdirs(targetPath)) {
>   throw new HiveException("Unable to make directory: " + targetPath);
> }
> }
> {code}
> sourcePath and targetPath may come from different filesystems, so we should resolve a FileSystem for each of them separately.
> I see that HIVE-11568 had already done this in Hive.java --
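The direction the issue proposes, resolving a FileSystem per path instead of reusing one fs for both, comes down to checking whether the two paths even live on the same filesystem. A minimal pure-JDK sketch of that check (illustrative only, not code from the patch or from Hive.java):

```java
import java.net.URI;
import java.util.Objects;

// Illustrative sketch: decide whether two paths live on the same
// filesystem (a plain rename can work) or on different ones (a
// copy followed by delete is required). Not the actual MoveTask code.
public class FsMoveCheck {
    /** True when both URIs share scheme and authority, i.e. one FileSystem serves both. */
    public static boolean sameFileSystem(URI src, URI dest) {
        return Objects.equals(src.getScheme(), dest.getScheme())
            && Objects.equals(src.getAuthority(), dest.getAuthority());
    }

    public static String plan(URI src, URI dest) {
        return sameFileSystem(src, dest) ? "rename" : "copy-then-delete";
    }

    public static void main(String[] args) {
        URI hdfsSrc  = URI.create("hdfs://nn1:8020/warehouse/staging/part-0");
        URI hdfsDest = URI.create("hdfs://nn1:8020/warehouse/t/part-0");
        URI s3Dest   = URI.create("s3a://bucket/warehouse/t/part-0");
        System.out.println(plan(hdfsSrc, hdfsDest)); // rename
        System.out.println(plan(hdfsSrc, s3Dest));   // copy-then-delete
    }
}
```

In real Hadoop code the per-path resolution is done with Path.getFileSystem(conf), which returns the FileSystem instance for that path's own scheme and authority rather than a single shared fs.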
[jira] [Updated] (HIVE-16832) duplicate ROW__ID possible in multi insert into transactional table
[ https://issues.apache.org/jira/browse/HIVE-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-16832: -- Attachment: HIVE-16832.09.patch > duplicate ROW__ID possible in multi insert into transactional table > --- > > Key: HIVE-16832 > URL: https://issues.apache.org/jira/browse/HIVE-16832 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-16832.01.patch, HIVE-16832.03.patch, > HIVE-16832.04.patch, HIVE-16832.05.patch, HIVE-16832.06.patch, > HIVE-16832.08.patch, HIVE-16832.09.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel updated HIVE-11297: Attachment: HIVE-11297.8.patch some minor changes to spark_partition_pruning.q.out > Combine op trees for partition info generating tasks [Spark branch] > --- > > Key: HIVE-11297 > URL: https://issues.apache.org/jira/browse/HIVE-11297 > Project: Hive > Issue Type: Bug >Affects Versions: spark-branch >Reporter: Chao Sun >Assignee: liyunzhang_intel > Attachments: HIVE-11297.1.patch, HIVE-11297.2.patch, > HIVE-11297.3.patch, HIVE-11297.4.patch, HIVE-11297.5.patch, > HIVE-11297.6.patch, HIVE-11297.7.patch, HIVE-11297.8.patch, hive-site.xml > > > Currently, for dynamic partition pruning in Spark, if a small table generates > partition info for more than one partition column, multiple operator trees > are created, which all start from the same table scan op but have different > spark partition pruning sinks. > As an optimization, we can combine these op trees so that the table scan does not have to be done multiple times. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16929) User-defined UDF functions can be registered as invariant functions
[ https://issues.apache.org/jira/browse/HIVE-16929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZhangBing Lin updated HIVE-16929: - Status: Patch Available (was: Open) > User-defined UDF functions can be registered as invariant functions > --- > > Key: HIVE-16929 > URL: https://issues.apache.org/jira/browse/HIVE-16929 > Project: Hive > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: ZhangBing Lin >Assignee: ZhangBing Lin > Attachments: HIVE-16929.1.patch, HIVE-16929.2.patch > > > Add a configuration item "hive.aux.udf.package.name.list" in hive-site.xml. Hive scans the jar packages under the $HIVE_HOME/auxlib/ directory and registers the classes whose package name appears in the configured list as constant (built-in) functions.
> Such as,
> {code:java}
> <property>
>   <name>hive.aux.udf.package.name.list</name>
>   <value>com.sample.udf,com.test.udf</value>
> </property>
> {code}
> Instructions:
> 1. Upload your jar file to $HIVE_HOME/auxlib.
> 2. Configure the package that contains your UDF classes in the following configuration parameter:
> {code:java}
> <property>
>   <name>hive.aux.udf.package.name.list</name>
>   <value>com.sample.udf</value>
> </property>
> {code}
> 3. The configuration item needs to be placed in the hive-site.xml file.
> 4. Restart the Hive service to take effect.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
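The mechanism the description proposes can be sketched as follows. This is a hypothetical illustration: the class and method names are invented, and the real patch would hook into Hive's function registry and jar scanning. It keeps, from the class names found in the auxlib jars, those whose package is listed in hive.aux.udf.package.name.list, and derives a function name for each.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed mechanism: filter classes found
// in $HIVE_HOME/auxlib jars by the configured package list. Names here
// are invented; this is not Hive's FunctionRegistry API.
public class AuxUdfFilter {
    /** Keep classes whose package appears in the comma-separated package list. */
    public static List<String> selectUdfClasses(String packageList,
                                                List<String> classesInAuxJars) {
        List<String> packages = Arrays.asList(packageList.split("\\s*,\\s*"));
        List<String> selected = new ArrayList<>();
        for (String cls : classesInAuxJars) {
            int dot = cls.lastIndexOf('.');
            String pkg = dot < 0 ? "" : cls.substring(0, dot);
            if (packages.contains(pkg)) {
                selected.add(cls);
            }
        }
        return selected;
    }

    /** Function name derived from the simple class name, lower-cased. */
    public static String functionName(String className) {
        return className.substring(className.lastIndexOf('.') + 1).toLowerCase();
    }

    public static void main(String[] args) {
        // In a real implementation this list would come from scanning the jars.
        List<String> found = Arrays.asList(
            "com.sample.udf.MyUpper", "com.other.Helper", "com.test.udf.GeoHash");
        List<String> udfs = selectUdfClasses("com.sample.udf,com.test.udf", found);
        System.out.println(udfs);                      // [com.sample.udf.MyUpper, com.test.udf.GeoHash]
        System.out.println(functionName(udfs.get(0))); // myupper
    }
}
```

Each selected class would then be registered under its derived name at session startup, which is what makes the functions behave like built-ins after a service restart.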
[jira] [Updated] (HIVE-16929) User-defined UDF functions can be registered as invariant functions
[ https://issues.apache.org/jira/browse/HIVE-16929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZhangBing Lin updated HIVE-16929: - Attachment: HIVE-16929.2.patch > User-defined UDF functions can be registered as invariant functions > --- > > Key: HIVE-16929 > URL: https://issues.apache.org/jira/browse/HIVE-16929 > Project: Hive > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: ZhangBing Lin >Assignee: ZhangBing Lin > Attachments: HIVE-16929.1.patch, HIVE-16929.2.patch > > > Add a configuration item "hive.aux.udf.package.name.list" in hive-site.xml. Hive scans the jar packages under the $HIVE_HOME/auxlib/ directory and registers the classes whose package name appears in the configured list as constant (built-in) functions.
> Such as,
> {code:java}
> <property>
>   <name>hive.aux.udf.package.name.list</name>
>   <value>com.sample.udf,com.test.udf</value>
> </property>
> {code}
> Instructions:
> 1. Upload your jar file to $HIVE_HOME/auxlib.
> 2. Configure the package that contains your UDF classes in the following configuration parameter:
> {code:java}
> <property>
>   <name>hive.aux.udf.package.name.list</name>
>   <value>com.sample.udf</value>
> </property>
> {code}
> 3. The configuration item needs to be placed in the hive-site.xml file.
> 4. Restart the Hive service to take effect.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16929) User-defined UDF functions can be registered as invariant functions
[ https://issues.apache.org/jira/browse/HIVE-16929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZhangBing Lin updated HIVE-16929: - Attachment: (was: HIVE-16929.2.patch) > User-defined UDF functions can be registered as invariant functions > --- > > Key: HIVE-16929 > URL: https://issues.apache.org/jira/browse/HIVE-16929 > Project: Hive > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: ZhangBing Lin >Assignee: ZhangBing Lin > Attachments: HIVE-16929.1.patch > > > Add a configuration item "hive.aux.udf.package.name.list" in hive-site.xml. Hive scans the jar packages under the $HIVE_HOME/auxlib/ directory and registers the classes whose package name appears in the configured list as constant (built-in) functions.
> Such as,
> {code:java}
> <property>
>   <name>hive.aux.udf.package.name.list</name>
>   <value>com.sample.udf,com.test.udf</value>
> </property>
> {code}
> Instructions:
> 1. Upload your jar file to $HIVE_HOME/auxlib.
> 2. Configure the package that contains your UDF classes in the following configuration parameter:
> {code:java}
> <property>
>   <name>hive.aux.udf.package.name.list</name>
>   <value>com.sample.udf</value>
> </property>
> {code}
> 3. The configuration item needs to be placed in the hive-site.xml file.
> 4. Restart the Hive service to take effect.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16929) User-defined UDF functions can be registered as invariant functions
[ https://issues.apache.org/jira/browse/HIVE-16929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZhangBing Lin updated HIVE-16929: - Attachment: HIVE-16929.2.patch > User-defined UDF functions can be registered as invariant functions > --- > > Key: HIVE-16929 > URL: https://issues.apache.org/jira/browse/HIVE-16929 > Project: Hive > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: ZhangBing Lin >Assignee: ZhangBing Lin > Attachments: HIVE-16929.1.patch, HIVE-16929.2.patch > > > Add a configuration item "hive.aux.udf.package.name.list" in hive-site.xml. Hive scans the jar packages under the $HIVE_HOME/auxlib/ directory and registers the classes whose package name appears in the configured list as constant (built-in) functions.
> Such as,
> {code:java}
> <property>
>   <name>hive.aux.udf.package.name.list</name>
>   <value>com.sample.udf,com.test.udf</value>
> </property>
> {code}
> Instructions:
> 1. Upload your jar file to $HIVE_HOME/auxlib.
> 2. Configure the package that contains your UDF classes in the following configuration parameter:
> {code:java}
> <property>
>   <name>hive.aux.udf.package.name.list</name>
>   <value>com.sample.udf</value>
> </property>
> {code}
> 3. The configuration item needs to be placed in the hive-site.xml file.
> 4. Restart the Hive service to take effect.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16929) User-defined UDF functions can be registered as invariant functions
[ https://issues.apache.org/jira/browse/HIVE-16929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZhangBing Lin updated HIVE-16929: - Status: Open (was: Patch Available) > User-defined UDF functions can be registered as invariant functions > --- > > Key: HIVE-16929 > URL: https://issues.apache.org/jira/browse/HIVE-16929 > Project: Hive > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: ZhangBing Lin >Assignee: ZhangBing Lin > Attachments: HIVE-16929.1.patch, HIVE-16929.2.patch > > > Add a configuration item "hive.aux.udf.package.name.list" in hive-site.xml. Hive scans the jar packages under the $HIVE_HOME/auxlib/ directory and registers the classes whose package name appears in the configured list as constant (built-in) functions.
> Such as,
> {code:java}
> <property>
>   <name>hive.aux.udf.package.name.list</name>
>   <value>com.sample.udf,com.test.udf</value>
> </property>
> {code}
> Instructions:
> 1. Upload your jar file to $HIVE_HOME/auxlib.
> 2. Configure the package that contains your UDF classes in the following configuration parameter:
> {code:java}
> <property>
>   <name>hive.aux.udf.package.name.list</name>
>   <value>com.sample.udf</value>
> </property>
> {code}
> 3. The configuration item needs to be placed in the hive-site.xml file.
> 4. Restart the Hive service to take effect.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16929) User-defined UDF functions can be registered as invariant functions
[ https://issues.apache.org/jira/browse/HIVE-16929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZhangBing Lin updated HIVE-16929: - Attachment: (was: HIVE-16929.2.patch) > User-defined UDF functions can be registered as invariant functions > --- > > Key: HIVE-16929 > URL: https://issues.apache.org/jira/browse/HIVE-16929 > Project: Hive > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: ZhangBing Lin >Assignee: ZhangBing Lin > Attachments: HIVE-16929.1.patch > > > Add a configuration item "hive.aux.udf.package.name.list" in hive-site.xml. Hive scans the jar packages under the $HIVE_HOME/auxlib/ directory and registers the classes whose package name appears in the configured list as constant (built-in) functions.
> Such as,
> {code:java}
> <property>
>   <name>hive.aux.udf.package.name.list</name>
>   <value>com.sample.udf,com.test.udf</value>
> </property>
> {code}
> Instructions:
> 1. Upload your jar file to $HIVE_HOME/auxlib.
> 2. Configure the package that contains your UDF classes in the following configuration parameter:
> {code:java}
> <property>
>   <name>hive.aux.udf.package.name.list</name>
>   <value>com.sample.udf</value>
> </property>
> {code}
> 3. The configuration item needs to be placed in the hive-site.xml file.
> 4. Restart the Hive service to take effect.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16943) MoveTask should separate src FileSystem from dest FileSystem
[ https://issues.apache.org/jira/browse/HIVE-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060380#comment-16060380 ] Ashutosh Chauhan commented on HIVE-16943: - +1 pending tests > MoveTask should separate src FileSystem from dest FileSystem > - > > Key: HIVE-16943 > URL: https://issues.apache.org/jira/browse/HIVE-16943 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 3.0.0 >Reporter: Fei Hui >Assignee: Fei Hui > Attachments: HIVE-16943.1.patch > > >
> {code:title=MoveTask.java|borderStyle=solid}
> private void moveFileInDfs(Path sourcePath, Path targetPath, FileSystem fs)
>     throws HiveException, IOException {
>   // if source exists, rename. Otherwise, create an empty directory
>   if (fs.exists(sourcePath)) {
>     Path deletePath = null;
>     // If multiple levels of folders are there, fs.rename fails, so first
>     // create targetPath.getParent() if it does not exist
>     if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_INSERT_INTO_MULTILEVEL_DIRS)) {
>       deletePath = createTargetPath(targetPath, fs);
>     }
>     Hive.clearDestForSubDirSrc(conf, targetPath, sourcePath, false);
>     if (!Hive.moveFile(conf, sourcePath, targetPath, true, false)) {
>       try {
>         if (deletePath != null) {
>           fs.delete(deletePath, true);
>         }
>       } catch (IOException e) {
>         LOG.info("Unable to delete the path created for facilitating rename: " + deletePath);
>       }
>       throw new HiveException("Unable to rename: " + sourcePath + " to: " + targetPath);
>     }
>   } else if (!fs.mkdirs(targetPath)) {
>     throw new HiveException("Unable to make directory: " + targetPath);
>   }
> }
> {code}
> sourcePath and targetPath may come from different filesystems, so we should resolve a FileSystem for each of them separately.
> I see that HIVE-11568 had already done this in Hive.java
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16948) Invalid explain when running dynamic partition pruning query in HOS
[ https://issues.apache.org/jira/browse/HIVE-16948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060377#comment-16060377 ] Pengcheng Xiong commented on HIVE-16948: thanks. :) > Invalid explain when running dynamic partition pruning query in HOS > --- > > Key: HIVE-16948 > URL: https://issues.apache.org/jira/browse/HIVE-16948 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > > union_subquery.q > {code} > set hive.optimize.ppd=true; > set hive.ppd.remove.duplicatefilters=true; > set hive.spark.dynamic.partition.pruning=true; > set hive.optimize.metadataonly=false; > set hive.optimize.index.filter=true; > set hive.strict.checks.cartesian.product=false; > explain select ds from (select distinct(ds) as ds from srcpart union all > select distinct(ds) as ds from srcpart) s where s.ds in (select > max(srcpart.ds) from srcpart union all select min(srcpart.ds) from srcpart); > {code} > explain > {code} > STAGE DEPENDENCIES: > Stage-2 is a root stage > Stage-1 depends on stages: Stage-2 > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-2 > Spark > Edges: > Reducer 11 <- Map 10 (GROUP, 1) > Reducer 13 <- Map 12 (GROUP, 1) > DagName: root_20170622231525_20a777e5-e659-4138-b605-65f8395e18e2:2 > Vertices: > Map 10 > Map Operator Tree: > TableScan > alias: srcpart > Statistics: Num rows: 1 Data size: 23248 Basic stats: > PARTIAL Column stats: NONE > Select Operator > expressions: ds (type: string) > outputColumnNames: ds > Statistics: Num rows: 1 Data size: 23248 Basic stats: > PARTIAL Column stats: NONE > Group By Operator > aggregations: max(ds) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > value expressions: _col0 (type: string) > Map 12 > Map Operator Tree: > TableScan > alias: srcpart > 
Statistics: Num rows: 1 Data size: 23248 Basic stats: > PARTIAL Column stats: NONE > Select Operator > expressions: ds (type: string) > outputColumnNames: ds > Statistics: Num rows: 1 Data size: 23248 Basic stats: > PARTIAL Column stats: NONE > Group By Operator > aggregations: min(ds) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > value expressions: _col0 (type: string) > Reducer 11 > Reduce Operator Tree: > Group By Operator > aggregations: max(VALUE._col0) > mode: mergepartial > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE > Column stats: NONE > Filter Operator > predicate: _col0 is not null (type: boolean) > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > Group By Operator > keys: _col0 (type: string) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 2 Data size: 368 Basic stats: > COMPLETE Column stats: NONE > Select Operator > expressions: _col0 (type: string) > outputColumnNames: _col0 > Statistics: Num rows: 2 Data size: 368 Basic stats: > COMPLETE Column stats: NONE > Group By Operator > keys: _col0 (type: string) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 2 Data size: 368 Basic stats: > COMPLETE Column stats: NONE > Spark Partition Pruning Sink Operator > partition key expr: ds >
[jira] [Commented] (HIVE-16832) duplicate ROW__ID possible in multi insert into transactional table
[ https://issues.apache.org/jira/browse/HIVE-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060379#comment-16060379 ] Hive QA commented on HIVE-16832: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12874171/HIVE-16832.08.patch {color:green}SUCCESS:{color} +1 due to 16 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 76 failed/errored test(s), 10858 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_subquery] (batchId=37) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_table_stats] (batchId=50) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4] (batchId=12) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=60) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[lateral_view_explode2] (batchId=80) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[lateral_view_noalias] (batchId=36) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_7] (batchId=42) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_8] (batchId=7) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_9] (batchId=75) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_acid_no_masking] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[row__id] (batchId=74) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udtf_stack] (batchId=36) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning] (batchId=140) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_3] (batchId=159) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization_acid] (batchId=154) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] (batchId=152) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lateral_view] (batchId=160) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ptf] (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ptf_streaming] (batchId=155) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sqlmerge] (batchId=160) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in] (batchId=157) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_notin] (batchId=158) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=146) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[windowing] (batchId=155) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=98) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[invalid_cast_from_binary_1] (batchId=88) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_assert_true2] (batchId=89) org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[udf_assert_true] (batchId=89) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=233) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[lateral_view_explode2] (batchId=136) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union24] (batchId=125) org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testCombinationInputFormatWithAcid (batchId=262) org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testNewBaseAndDelta (batchId=262) 
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderIncompleteDelta (batchId=262) org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderNewBaseAndDelta (batchId=262) org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderOldBaseAndDelta (batchId=262) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217) org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.majorCompactAfterAbort (batchId=215) org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.majorCompactWhileStreaming (batchId=215) org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.majorCompactWhileStreamingForSplitUpdate (batchId=215)
[jira] [Assigned] (HIVE-16948) Invalid explain when running dynamic partition pruning query in HOS
[ https://issues.apache.org/jira/browse/HIVE-16948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel reassigned HIVE-16948: --- Assignee: liyunzhang_intel > Invalid explain when running dynamic partition pruning query in HOS > --- > > Key: HIVE-16948 > URL: https://issues.apache.org/jira/browse/HIVE-16948 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > > union_subquery.q > {code} > set hive.optimize.ppd=true; > set hive.ppd.remove.duplicatefilters=true; > set hive.spark.dynamic.partition.pruning=true; > set hive.optimize.metadataonly=false; > set hive.optimize.index.filter=true; > set hive.strict.checks.cartesian.product=false; > explain select ds from (select distinct(ds) as ds from srcpart union all > select distinct(ds) as ds from srcpart) s where s.ds in (select > max(srcpart.ds) from srcpart union all select min(srcpart.ds) from srcpart); > {code} > explain > {code} > STAGE DEPENDENCIES: > Stage-2 is a root stage > Stage-1 depends on stages: Stage-2 > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-2 > Spark > Edges: > Reducer 11 <- Map 10 (GROUP, 1) > Reducer 13 <- Map 12 (GROUP, 1) > DagName: root_20170622231525_20a777e5-e659-4138-b605-65f8395e18e2:2 > Vertices: > Map 10 > Map Operator Tree: > TableScan > alias: srcpart > Statistics: Num rows: 1 Data size: 23248 Basic stats: > PARTIAL Column stats: NONE > Select Operator > expressions: ds (type: string) > outputColumnNames: ds > Statistics: Num rows: 1 Data size: 23248 Basic stats: > PARTIAL Column stats: NONE > Group By Operator > aggregations: max(ds) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > value expressions: _col0 (type: string) > Map 12 > Map Operator Tree: > TableScan > alias: srcpart > Statistics: Num 
rows: 1 Data size: 23248 Basic stats: > PARTIAL Column stats: NONE > Select Operator > expressions: ds (type: string) > outputColumnNames: ds > Statistics: Num rows: 1 Data size: 23248 Basic stats: > PARTIAL Column stats: NONE > Group By Operator > aggregations: min(ds) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > value expressions: _col0 (type: string) > Reducer 11 > Reduce Operator Tree: > Group By Operator > aggregations: max(VALUE._col0) > mode: mergepartial > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE > Column stats: NONE > Filter Operator > predicate: _col0 is not null (type: boolean) > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > Group By Operator > keys: _col0 (type: string) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 2 Data size: 368 Basic stats: > COMPLETE Column stats: NONE > Select Operator > expressions: _col0 (type: string) > outputColumnNames: _col0 > Statistics: Num rows: 2 Data size: 368 Basic stats: > COMPLETE Column stats: NONE > Group By Operator > keys: _col0 (type: string) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 2 Data size: 368 Basic stats: > COMPLETE Column stats: NONE > Spark Partition Pruning Sink Operator > partition key expr: ds >
[jira] [Updated] (HIVE-16948) Invalid explain when running dynamic partition pruning query in HOS
[ https://issues.apache.org/jira/browse/HIVE-16948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel updated HIVE-16948: Summary: Invalid explain when running dynamic partition pruning query in HOS (was: Invalid explain when running dynamic partition pruning query)
[jira] [Commented] (HIVE-16948) Invalid explain when running dynamic partition pruning query
[ https://issues.apache.org/jira/browse/HIVE-16948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060374#comment-16060374 ] liyunzhang_intel commented on HIVE-16948: - [~pxiong]: i found it in HoS, will modify the description soon.
[jira] [Commented] (HIVE-16948) Invalid explain when running dynamic partition pruning query
[ https://issues.apache.org/jira/browse/HIVE-16948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060371#comment-16060371 ] Pengcheng Xiong commented on HIVE-16948: HoS? Hive on Spark?
[jira] [Commented] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060369#comment-16060369 ] liyunzhang_intel commented on HIVE-11297: - [~csun]: for the second query you mentioned in RB, filed HIVE-16948 to track it > Combine op trees for partition info generating tasks [Spark branch] > --- > > Key: HIVE-11297 > URL: https://issues.apache.org/jira/browse/HIVE-11297 > Project: Hive > Issue Type: Bug >Affects Versions: spark-branch >Reporter: Chao Sun >Assignee: liyunzhang_intel > Attachments: HIVE-11297.1.patch, HIVE-11297.2.patch, > HIVE-11297.3.patch, HIVE-11297.4.patch, HIVE-11297.5.patch, > HIVE-11297.6.patch, HIVE-11297.7.patch, hive-site.xml > > > Currently, for dynamic partition pruning in Spark, if a small table generates > partition info for more than one partition column, multiple operator trees > are created, which all start from the same table scan op but have different > spark partition pruning sinks. > As an optimization, we can combine these op trees so that we don't have to do > the table scan multiple times. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
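The combining described above can be pictured as a small tree merge: several branches that each begin with an equivalent table scan are re-parented under one shared scan, so the scan runs once while every pruning sink is kept. The sketch below is a hypothetical, simplified model; `OpNode`, `TS`, and `SPPS` are illustrative names, not Hive's actual operator classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative operator node; not Hive's Operator class.
class OpNode {
    final String name;
    final List<OpNode> children = new ArrayList<>();
    OpNode(String name) { this.name = name; }
}

public class CombineOpTrees {
    // Merge branches whose roots are equivalent table scans into a single
    // tree: one shared scan node adopts every branch's child operators.
    static OpNode combine(List<OpNode> branches) {
        OpNode shared = new OpNode(branches.get(0).name);
        for (OpNode branch : branches) {
            shared.children.addAll(branch.children); // re-parent the ops
        }
        return shared;
    }

    public static void main(String[] args) {
        OpNode scanA = new OpNode("TS[srcpart_date]");
        scanA.children.add(new OpNode("SPPS[ds]")); // pruning sink for ds
        OpNode scanB = new OpNode("TS[srcpart_date]");
        scanB.children.add(new OpNode("SPPS[hr]")); // pruning sink for hr
        OpNode combined = combine(Arrays.asList(scanA, scanB));
        System.out.println(combined.children.size()); // one scan, two sinks
    }
}
```

After the merge there is a single scan node with both pruning sinks beneath it, which is the shape the optimization aims for.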
[jira] [Updated] (HIVE-16947) Semijoin Reduction : Task cycle created due to multiple semijoins in conjunction with hashjoin
[ https://issues.apache.org/jira/browse/HIVE-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-16947: -- Status: Patch Available (was: In Progress) > Semijoin Reduction : Task cycle created due to multiple semijoins in > conjunction with hashjoin > -- > > Key: HIVE-16947 > URL: https://issues.apache.org/jira/browse/HIVE-16947 > Project: Hive > Issue Type: Bug >Reporter: Deepak Jaiswal >Assignee: Deepak Jaiswal > Attachments: HIVE-16947.1.patch > > > Typically a semijoin branch and a mapjoin may create a cycle when on the same > operator tree. This is already handled; however, a semijoin branch can serve > more than one filter, and the cycle detection logic currently only handles > the first one, causing cycles that prevent the queries from running.
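The fix described above amounts to running the cycle check for every filter served by the semijoin branch rather than only the first. A minimal, self-contained sketch of such a check (plain DFS over a string-keyed graph; the vertex names and graph shape are illustrative assumptions, not Hive's actual task graph):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CycleCheck {
    // Plain DFS cycle detection over a directed graph of vertex names.
    static boolean hasCycle(Map<String, List<String>> graph) {
        Set<String> done = new HashSet<>();
        Set<String> onStack = new HashSet<>();
        for (String v : graph.keySet()) {
            if (dfs(v, graph, done, onStack)) return true;
        }
        return false;
    }

    private static boolean dfs(String v, Map<String, List<String>> graph,
                               Set<String> done, Set<String> onStack) {
        if (onStack.contains(v)) return true;  // back edge -> cycle
        if (done.contains(v)) return false;
        onStack.add(v);
        for (String w : graph.getOrDefault(v, Collections.emptyList())) {
            if (dfs(w, graph, done, onStack)) return true;
        }
        onStack.remove(v);
        done.add(v);
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = new HashMap<>();
        g.put("Map1", List.of("Reducer2"));            // hashjoin edge
        // The semijoin branch serves two targets; only the SECOND one
        // (back to Map1) closes a cycle, which a first-only check misses.
        g.put("Reducer2", List.of("Filter1", "Map1"));
        System.out.println(hasCycle(g)); // true
    }
}
```

The example graph is acyclic if only the first served target (`Filter1`) is considered, which is exactly why checking a single filter per branch lets cycles slip through.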
[jira] [Updated] (HIVE-16947) Semijoin Reduction : Task cycle created due to multiple semijoins in conjunction with hashjoin
[ https://issues.apache.org/jira/browse/HIVE-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal updated HIVE-16947: -- Attachment: HIVE-16947.1.patch Initial patch
[jira] [Work started] (HIVE-16947) Semijoin Reduction : Task cycle created due to multiple semijoins in conjunction with hashjoin
[ https://issues.apache.org/jira/browse/HIVE-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-16947 started by Deepak Jaiswal. -
[jira] [Assigned] (HIVE-16947) Semijoin Reduction : Task cycle created due to multiple semijoins in conjunction with hashjoin
[ https://issues.apache.org/jira/browse/HIVE-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Jaiswal reassigned HIVE-16947: -
[jira] [Updated] (HIVE-16929) User-defined UDF functions can be registered as invariant functions
[ https://issues.apache.org/jira/browse/HIVE-16929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZhangBing Lin updated HIVE-16929: - Attachment: HIVE-16929.2.patch > User-defined UDF functions can be registered as invariant functions > --- > > Key: HIVE-16929 > URL: https://issues.apache.org/jira/browse/HIVE-16929 > Project: Hive > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: ZhangBing Lin >Assignee: ZhangBing Lin > Attachments: HIVE-16929.1.patch, HIVE-16929.2.patch > > > Add a configuration item "hive.aux.udf.package.name.list" to hive-site.xml: Hive scans the jars under the $HIVE_HOME/auxlib/ directory and registers every UDF class whose package matches one of the configured package names as an invariant (built-in) function. > For example: > {code:xml} > <property> > <name>hive.aux.udf.package.name.list</name> > <value>com.sample.udf,com.test.udf</value> > </property> > {code} > Instructions: > 1. Upload your jar file to $HIVE_HOME/auxlib. > 2. Add the package containing your UDF classes to the following configuration parameter: > {code:xml} > <property> > <name>hive.aux.udf.package.name.list</name> > <value>com.sample.udf</value> > </property> > {code} > 3. The configuration item needs to be placed in the hive-site.xml file. > 4. Restart the Hive service for it to take effect.
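A sketch of how the configured package list might be parsed and matched against classes found in auxlib jars. The helper names are hypothetical; the patch's actual registration path through Hive's function registry is not shown.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class AuxUdfPackages {
    // Parse the comma-separated value of hive.aux.udf.package.name.list.
    static Set<String> parsePackages(String value) {
        Set<String> pkgs = new LinkedHashSet<>();
        if (value != null) {
            for (String p : value.split(",")) {
                String trimmed = p.trim();
                if (!trimmed.isEmpty()) pkgs.add(trimmed);
            }
        }
        return pkgs;
    }

    // A class discovered in an auxlib jar qualifies for registration only
    // if its package is one of the configured package names.
    static boolean shouldRegister(String className, Set<String> packages) {
        int dot = className.lastIndexOf('.');
        return dot > 0 && packages.contains(className.substring(0, dot));
    }

    public static void main(String[] args) {
        Set<String> pkgs = parsePackages("com.sample.udf,com.test.udf");
        System.out.println(shouldRegister("com.sample.udf.MyUpper", pkgs)); // true
        System.out.println(shouldRegister("org.other.Foo", pkgs));          // false
    }
}
```

Matching on the exact package (rather than a prefix) keeps an unrelated class such as `com.sample.udf.internal.Helper` from being registered accidentally.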
[jira] [Updated] (HIVE-16929) User-defined UDF functions can be registered as invariant functions
[ https://issues.apache.org/jira/browse/HIVE-16929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZhangBing Lin updated HIVE-16929: - Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-16929) User-defined UDF functions can be registered as invariant functions
[ https://issues.apache.org/jira/browse/HIVE-16929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ZhangBing Lin updated HIVE-16929: - Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel updated HIVE-11297: Attachment: hive-site.xml
[jira] [Commented] (HIVE-11297) Combine op trees for partition info generating tasks [Spark branch]
[ https://issues.apache.org/jira/browse/HIVE-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060336#comment-16060336 ] liyunzhang_intel commented on HIVE-11297: - [~csun]: about the questions you mentioned in RB. there are two queries are different. explain query1( please use the attached hive-site.xml to verify, without the configuration in hive-site.xml, i can not reproduce following explain) {code} set hive.execution.engine=spark; set hive.spark.dynamic.partition.pruning=true; set hive.optimize.ppd=true; set hive.ppd.remove.duplicatefilters=true; set hive.optimize.metadataonly=false; set hive.optimize.index.filter=true; set hive.strict.checks.cartesian.product=false; explain select count(*) from srcpart join srcpart_date on (srcpart.ds = srcpart_date.ds) join srcpart_hour on (srcpart.hr = srcpart_hour.hr) where srcpart_date.`date` = '2008-04-08' and srcpart_hour.hour = 11 and srcpart.hr = 11 {code} previous explain {code} STAGE DEPENDENCIES: Stage-2 is a root stage Stage-1 depends on stages: Stage-2 Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-2 Spark DagName: root_20170622213734_eb4c35e8-952a-4c4d-8972-ba5381bf51a3:2 Vertices: Map 7 Map Operator Tree: TableScan alias: srcpart_date filterExpr: ((date = '2008-04-08') and ds is not null) (type: boolean) Statistics: Num rows: 2 Data size: 42 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: ((date = '2008-04-08') and ds is not null) (type: boolean) Statistics: Num rows: 1 Data size: 21 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: ds (type: string) outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 21 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: string) outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 21 Basic stats: COMPLETE Column stats: NONE Group By Operator keys: _col0 (type: string) mode: hash outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 21 Basic 
stats: COMPLETE Column stats: NONE Spark Partition Pruning Sink Operator partition key expr: ds Statistics: Num rows: 1 Data size: 21 Basic stats: COMPLETE Column stats: NONE target column name: ds target work: Map 1 Stage: Stage-1 Spark Edges: Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 2), Map 5 (PARTITION-LEVEL SORT, 2) Reducer 3 <- Map 6 (PARTITION-LEVEL SORT, 2), Reducer 2 (PARTITION-LEVEL SORT, 2) Reducer 4 <- Reducer 3 (GROUP, 1) DagName: root_20170622213734_eb4c35e8-952a-4c4d-8972-ba5381bf51a3:1 Vertices: Map 1 Map Operator Tree: TableScan alias: srcpart Statistics: Num rows: 1 Data size: 11624 Basic stats: PARTIAL Column stats: NONE Select Operator expressions: ds (type: string), hr (type: string) outputColumnNames: _col0, _col1 Statistics: Num rows: 1 Data size: 11624 Basic stats: PARTIAL Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 1 Data size: 11624 Basic stats: PARTIAL Column stats: NONE value expressions: _col1 (type: string) Map 5 Map Operator Tree: TableScan alias: srcpart_date filterExpr: ((date = '2008-04-08') and ds is not null) (type: boolean) Statistics: Num rows: 2 Data size: 42 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: ((date = '2008-04-08') and ds is not null) (type: boolean) Statistics: Num rows: 1 Data size: 21 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: ds (type: string) outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 21 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: +
[jira] [Updated] (HIVE-16943) MoveTask should separate src FileSystem from dest FileSystem
[ https://issues.apache.org/jira/browse/HIVE-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HIVE-16943: --- Status: Patch Available (was: Open) > MoveTask should separate src FileSystem from dest FileSystem > - > > Key: HIVE-16943 > URL: https://issues.apache.org/jira/browse/HIVE-16943 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 3.0.0 >Reporter: Fei Hui >Assignee: Fei Hui > Attachments: HIVE-16943.1.patch > > > {code:title=MoveTask.java|borderStyle=solid} > private void moveFileInDfs (Path sourcePath, Path targetPath, FileSystem fs) > throws HiveException, IOException { > // if source exists, rename. Otherwise, create an empty directory > if (fs.exists(sourcePath)) { > Path deletePath = null; > // If multiple levels of folders are there, fs.rename fails, so first > // create targetPath.getParent() if it does not exist > if (HiveConf.getBoolVar(conf, > HiveConf.ConfVars.HIVE_INSERT_INTO_MULTILEVEL_DIRS)) { > deletePath = createTargetPath(targetPath, fs); > } > Hive.clearDestForSubDirSrc(conf, targetPath, sourcePath, false); > if (!Hive.moveFile(conf, sourcePath, targetPath, true, false)) { > try { > if (deletePath != null) { > fs.delete(deletePath, true); > } > } catch (IOException e) { > LOG.info("Unable to delete the path created for facilitating rename " > + deletePath); > } > throw new HiveException("Unable to rename: " + sourcePath > + " to: " + targetPath); > } > } else if (!fs.mkdirs(targetPath)) { > throw new HiveException("Unable to make directory: " + targetPath); > } > } > {code} > sourcePath and targetPath may come from different filesystems, so we should resolve a FileSystem for each of them separately. > I see that HIVE-11568 already did this in Hive.java
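In Hadoop, each Path resolves its own FileSystem via path.getFileSystem(conf), and a rename is only valid within a single filesystem, which is why the single `fs` parameter above is a problem. The sketch below models the distinction with plain java.net.URI; it is a simplification of how Hadoop identifies a filesystem (by the URI's scheme and authority), not the actual patch.

```java
import java.net.URI;
import java.util.Objects;

public class FsCompare {
    // Hadoop identifies a FileSystem by a URI's scheme and authority; a
    // rename is only valid when source and target resolve to the same one.
    // (Simplified model; real code calls path.getFileSystem(conf) per path.)
    static boolean sameFileSystem(URI src, URI dest) {
        return Objects.equals(src.getScheme(), dest.getScheme())
            && Objects.equals(src.getAuthority(), dest.getAuthority());
    }

    public static void main(String[] args) {
        URI a = URI.create("hdfs://nn1/warehouse/t1");
        URI b = URI.create("hdfs://nn1/tmp/staging");
        URI c = URI.create("s3a://bucket/warehouse/t1");
        System.out.println(sameFileSystem(a, b)); // true: same HDFS
        System.out.println(sameFileSystem(a, c)); // false: must copy instead
    }
}
```

When the two paths resolve to different filesystems, a move has to fall back to copy-plus-delete rather than a rename, which is the behavior the patch separates out.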
[jira] [Updated] (HIVE-16943) MoveTask should separate src FileSystem from dest FileSystem
[ https://issues.apache.org/jira/browse/HIVE-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HIVE-16943: --- Attachment: HIVE-16943.1.patch
[jira] [Updated] (HIVE-16943) MoveTask should separate src FileSystem from dest FileSystem
[ https://issues.apache.org/jira/browse/HIVE-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HIVE-16943: --- Attachment: (was: HIVE-16943.patch)
[jira] [Updated] (HIVE-16832) duplicate ROW__ID possible in multi insert into transactional table
[ https://issues.apache.org/jira/browse/HIVE-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-16832: -- Attachment: HIVE-16832.08.patch > duplicate ROW__ID possible in multi insert into transactional table > --- > > Key: HIVE-16832 > URL: https://issues.apache.org/jira/browse/HIVE-16832 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.2.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman >Priority: Critical > Attachments: HIVE-16832.01.patch, HIVE-16832.03.patch, > HIVE-16832.04.patch, HIVE-16832.05.patch, HIVE-16832.06.patch, > HIVE-16832.08.patch > >
[jira] [Commented] (HIVE-16943) MoveTask should separate src FileSystem from dest FileSystem
[ https://issues.apache.org/jira/browse/HIVE-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060249#comment-16060249 ] Fei Hui commented on HIVE-16943: CC [~alangates] [~Ferd]
[jira] [Commented] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE
[ https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060201#comment-16060201 ] Matt McCline commented on HIVE-16589: - Committed to master. > Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and > COMPLETE for AVG, VARIANCE > --- > > Key: HIVE-16589 > URL: https://issues.apache.org/jira/browse/HIVE-16589 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, > HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, > HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, > HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, > HIVE-16589.094.patch, HIVE-16589.095.patch, HIVE-16589.096.patch, > HIVE-16589.097.patch, HIVE-16589.098.patch, HIVE-16589.0991.patch, > HIVE-16589.0992.patch, HIVE-16589.0993.patch, HIVE-16589.0994.patch, > HIVE-16589.0995.patch, HIVE-16589.099.patch, HIVE-16589.09.patch > > > Allow Complex Types to be vectorized (since HIVE-16207: "Add support for > Complex Types in Fast SerDe" was committed). > Add more classes we vectorize AVG in preparation for fully supporting AVG > GroupBy. In particular, the PARTIAL2 and FINAL groupby modes that take in > the AVG struct as input. And, add the COMPLETE mode that takes in the > Original data and produces the Full Aggregation for completeness, so to speak. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE
[ https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-16589: Resolution: Fixed Status: Resolved (was: Patch Available) > Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and > COMPLETE for AVG, VARIANCE > --- > > Key: HIVE-16589 > URL: https://issues.apache.org/jira/browse/HIVE-16589 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, > HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, > HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, > HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, > HIVE-16589.094.patch, HIVE-16589.095.patch, HIVE-16589.096.patch, > HIVE-16589.097.patch, HIVE-16589.098.patch, HIVE-16589.0991.patch, > HIVE-16589.0992.patch, HIVE-16589.0993.patch, HIVE-16589.0994.patch, > HIVE-16589.0995.patch, HIVE-16589.099.patch, HIVE-16589.09.patch > > > Allow Complex Types to be vectorized (since HIVE-16207: "Add support for > Complex Types in Fast SerDe" was committed). > Add more classes we vectorize AVG in preparation for fully supporting AVG > GroupBy. In particular, the PARTIAL2 and FINAL groupby modes that take in > the AVG struct as input. And, add the COMPLETE mode that takes in the > Original data and produces the Full Aggregation for completeness, so to speak. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE
[ https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-16589: Fix Version/s: 3.0.0 > Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and > COMPLETE for AVG, VARIANCE > --- > > Key: HIVE-16589 > URL: https://issues.apache.org/jira/browse/HIVE-16589 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, > HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, > HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, > HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, > HIVE-16589.094.patch, HIVE-16589.095.patch, HIVE-16589.096.patch, > HIVE-16589.097.patch, HIVE-16589.098.patch, HIVE-16589.0991.patch, > HIVE-16589.0992.patch, HIVE-16589.0993.patch, HIVE-16589.0994.patch, > HIVE-16589.0995.patch, HIVE-16589.099.patch, HIVE-16589.09.patch > > > Allow Complex Types to be vectorized (since HIVE-16207: "Add support for > Complex Types in Fast SerDe" was committed). > Add more classes we vectorize AVG in preparation for fully supporting AVG > GroupBy. In particular, the PARTIAL2 and FINAL groupby modes that take in > the AVG struct as input. And, add the COMPLETE mode that takes in the > Original data and produces the Full Aggregation for completeness, so to speak. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16778) LLAP IO: better refcount management
[ https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060181#comment-16060181 ] Sergey Shelukhin commented on HIVE-16778: - Also committed to branch-2 > LLAP IO: better refcount management > --- > > Key: HIVE-16778 > URL: https://issues.apache.org/jira/browse/HIVE-16778 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-16778.patch, HIVE-16778.patch > > > Looks like task cancellation can close the UGI, causing the background thread > to die with an exception, leaving a bunch of unreleased cache buffers. > Overall, it's probably better to modify how refcounts are handled - if > there's some bug in the code we don't want to leak them. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16778) LLAP IO: better refcount management
[ https://issues.apache.org/jira/browse/HIVE-16778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16778: Fix Version/s: 2.4.0 > LLAP IO: better refcount management > --- > > Key: HIVE-16778 > URL: https://issues.apache.org/jira/browse/HIVE-16778 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-16778.patch, HIVE-16778.patch > > > Looks like task cancellation can close the UGI, causing the background thread > to die with an exception, leaving a bunch of unreleased cache buffers. > Overall, it's probably better to modify how refcounts are handled - if > there's some bug in the code we don't want to leak them. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16939) metastore error: 'export: -Dproc_metastore : not a valid identifier'
[ https://issues.apache.org/jira/browse/HIVE-16939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu updated HIVE-16939: Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Committed upstream. Thanks [~ferhui] for the contribution. > metastore error: 'export: -Dproc_metastore : not a valid identifier' > > > Key: HIVE-16939 > URL: https://issues.apache.org/jira/browse/HIVE-16939 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Fei Hui >Assignee: Fei Hui > Fix For: 3.0.0 > > Attachments: HIVE-16939.patch > > > When I run metastore, it reports errors as below > {quote} > bin/ext/metastore.sh: line 29: export: ` -Dproc_metastore ': not a valid > identifier > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
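The quoted error can be reproduced with a short bash experiment (a sketch of the likely cause, not the actual contents of metastore.sh): `export` treats each argument as a variable name, so a space-padded value such as ` -Dproc_metastore ` handed to it directly is rejected as an invalid identifier.

```shell
#!/usr/bin/env bash
opts=" -Dproc_metastore "

# Passing the value itself to export makes bash parse the whole
# space-padded string as a variable name, which is invalid:
if export "$opts" 2>/dev/null; then
    result="accepted"
else
    # bash: export: ` -Dproc_metastore ': not a valid identifier
    result="rejected"
fi
echo "$result"

# The fix is to export a *named* variable that holds the options:
export METASTORE_OPTS="$opts"
echo "METASTORE_OPTS=[$METASTORE_OPTS]"
```

The variable names `opts` and `METASTORE_OPTS` are illustrative; the point is only that the JVM flag must appear on the right-hand side of a `NAME=value` assignment, never as `export`'s bare argument.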
[jira] [Commented] (HIVE-14688) Hive drop call fails in presence of TDE
[ https://issues.apache.org/jira/browse/HIVE-14688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060102#comment-16060102 ] Hive QA commented on HIVE-14688: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12844114/HIVE-14688.4.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5736/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5736/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5736/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-06-22 22:15:46.299 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-5736/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-06-22 22:15:46.301 + cd apache-github-source-source + git fetch origin >From https://github.com/apache/hive 7819cd3..b47736f master -> origin/master 3298e7f..f4a8fef branch-2 -> origin/branch-2 + git reset --hard HEAD HEAD is now at 7819cd3 HIVE-16867: Extend shared scan optimizer to reuse computation from other operators (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan) + git clean -f -d Removing ql/src/test/queries/clientpositive/llap_smb.q Removing ql/src/test/results/clientpositive/llap/llap_smb.q.out + git checkout master Already on 'master' Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded. (use "git pull" to update your local branch) + git reset --hard origin/master HEAD is now at b47736f HIVE-16930: HoS should verify the value of Kerberos principal and keytab file before adding them to spark-submit command parameters (Yibing Shi via Chaoyu Tang) + git merge --ff-only origin/master Already up-to-date. 
+ date '+%Y-%m-%d %T.%3N' 2017-06-22 22:15:52.164 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: patch failed: itests/src/test/resources/testconfiguration.properties:710 error: itests/src/test/resources/testconfiguration.properties: patch does not apply error: patch failed: metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:1786 error: metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java: patch does not apply error: patch failed: ql/src/test/results/clientpositive/encrypted/encryption_drop_partition.q.out:111 error: ql/src/test/results/clientpositive/encrypted/encryption_drop_partition.q.out: patch does not apply error: patch failed: ql/src/test/results/clientpositive/encrypted/encryption_drop_table.q.out:67 error: ql/src/test/results/clientpositive/encrypted/encryption_drop_table.q.out: patch does not apply The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12844114 - PreCommit-HIVE-Build > Hive drop call fails in presence of TDE > --- > > Key: HIVE-14688 > URL: https://issues.apache.org/jira/browse/HIVE-14688 > Project: Hive > Issue Type: Bug > Components: Security >Affects Versions: 1.2.1, 2.0.0 >Reporter: Deepesh Khandelwal >Assignee: Wei Zheng > Attachments: HIVE-14688.1.patch, HIVE-14688.2.patch, > HIVE-14688.3.patch, HIVE-14688.4.patch > > > This should be committed to when Hive moves to Hadoop 2.8 > In Hadoop 2.8.0 TDE trash collection was fixed through HDFS-8831. This > enables us to make drop table calls for Hive managed tables where Hive > metastore warehouse directory is in encrypted zone. 
However, even with the > feature in HDFS, Hive drop table currently fails:
[jira] [Commented] (HIVE-16874) query fails when trying to read a file from remote HDFS
[ https://issues.apache.org/jira/browse/HIVE-16874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060066#comment-16060066 ] Thejas M Nair commented on HIVE-16874: -- This might be fixed via changes in HIVE-14380 > query fails when trying to read a file from remote HDFS > - > > Key: HIVE-16874 > URL: https://issues.apache.org/jira/browse/HIVE-16874 > Project: Hive > Issue Type: Bug > Components: Clients >Affects Versions: 1.2.1 >Reporter: Yunjian Zhang > Attachments: HIVE-6.ext.patch > > > As an extension of the issue in HIVE-6, table join and insert on remote HDFS > storage will fail with the same issue. > The attached patch, based on > https://issues.apache.org/jira/secure/attachment/12820392/HIVE-6.1.patch, > fixes the issues mentioned here. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-14380) Queries on tables with remote HDFS paths fail in "encryption" checks.
[ https://issues.apache.org/jira/browse/HIVE-14380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060064#comment-16060064 ] Thejas M Nair commented on HIVE-14380: -- Thanks for checking about the metastore fix [~mithun]! > Queries on tables with remote HDFS paths fail in "encryption" checks. > - > > Key: HIVE-14380 > URL: https://issues.apache.org/jira/browse/HIVE-14380 > Project: Hive > Issue Type: Bug > Components: Encryption >Reporter: Mithun Radhakrishnan >Assignee: Mithun Radhakrishnan > Fix For: 2.2.0 > > Attachments: HIVE-14380.1.patch > > > If a table has table/partition locations set to remote HDFS paths, querying > them will cause the following IAException: > {noformat} > 2016-07-26 01:16:27,471 ERROR parse.CalcitePlanner > (SemanticAnalyzer.java:getMetaData(1867)) - > org.apache.hadoop.hive.ql.metadata.HiveException: Unable to determine if > hdfs://foo.ygrid.yahoo.com:8020/projects/my_db/my_table is encrypted: > java.lang.IllegalArgumentException: Wrong FS: > hdfs://foo.ygrid.yahoo.com:8020/projects/my_db/my_table, expected: > hdfs://bar.ygrid.yahoo.com:8020 > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.isPathEncrypted(SemanticAnalyzer.java:2204) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getStrongestEncryptedTablePath(SemanticAnalyzer.java:2274) > ... 
> {noformat} > This is because of the following code in {{SessionState}}: > {code:title=SessionState.java|borderStyle=solid} > public HadoopShims.HdfsEncryptionShim getHdfsEncryptionShim() throws > HiveException { > if (hdfsEncryptionShim == null) { > try { > FileSystem fs = FileSystem.get(sessionConf); > if ("hdfs".equals(fs.getUri().getScheme())) { > hdfsEncryptionShim = > ShimLoader.getHadoopShims().createHdfsEncryptionShim(fs, sessionConf); > } else { > LOG.debug("Could not get hdfsEncryptionShim, it is only applicable > to hdfs filesystem."); > } > } catch (Exception e) { > throw new HiveException(e); > } > } > return hdfsEncryptionShim; > } > {code} > When the {{FileSystem}} instance is created, using the {{sessionConf}} > implies that the current HDFS is going to be used. This call should instead > fetch the {{FileSystem}} instance corresponding to the path being checked. > A fix is forthcoming... -- This message was sent by Atlassian JIRA (v6.4.14#64029)
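The fix described above — fetching the shim for the filesystem that owns the path being checked, rather than a single shim built from `FileSystem.get(sessionConf)` — can be sketched as a per-filesystem cache. The names below are hypothetical; the real code would cache a `HadoopShims.HdfsEncryptionShim` per `FileSystem` instance rather than a string.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class ShimPerFsSketch {
    // Hypothetical cache: one "shim" per filesystem (scheme + authority),
    // so a path on a remote HDFS no longer reuses the shim built for the
    // session's default filesystem (the cause of the "Wrong FS" error).
    private final Map<URI, String> shims = new HashMap<>();

    String shimFor(URI path) {
        URI fsKey = URI.create(path.getScheme() + "://" + path.getAuthority());
        return shims.computeIfAbsent(fsKey, k -> "shim:" + k);
    }

    public static void main(String[] args) {
        ShimPerFsSketch s = new ShimPerFsSketch();
        String local  = s.shimFor(URI.create("hdfs://bar.ygrid.yahoo.com:8020/projects/db"));
        String remote = s.shimFor(URI.create("hdfs://foo.ygrid.yahoo.com:8020/projects/my_db/my_table"));
        System.out.println(local.equals(remote));  // false: different namenodes
    }
}
```

In Hadoop itself the per-path lookup is simply `path.getFileSystem(sessionConf)`, which consults the scheme and authority of the path instead of `fs.defaultFS`.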
[jira] [Commented] (HIVE-16761) LLAP IO: SMB joins fail elevator
[ https://issues.apache.org/jira/browse/HIVE-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060060#comment-16060060 ] Sergey Shelukhin commented on HIVE-16761: - Interesting... the results have changed. Need to investigate > LLAP IO: SMB joins fail elevator > - > > Key: HIVE-16761 > URL: https://issues.apache.org/jira/browse/HIVE-16761 > Project: Hive > Issue Type: Bug >Reporter: Gopal V >Assignee: Sergey Shelukhin > Attachments: HIVE-16761.01.patch, HIVE-16761.02.patch, > HIVE-16761.patch > > > {code} > Caused by: java.io.IOException: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:153) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:78) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ... 26 more > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.nextString(BatchToRowReader.java:334) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.nextValue(BatchToRowReader.java:602) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:149) > ... 
28 more > {code} > {code} > set hive.enforce.sortmergebucketmapjoin=false; > set hive.optimize.bucketmapjoin=true; > set hive.optimize.bucketmapjoin.sortedmerge=true; > set hive.auto.convert.sortmerge.join=true; > set hive.auto.convert.join=true; > set hive.auto.convert.join.noconditionaltask.size=500; > select year,quarter,count(*) from transactions_raw_orc_200 a join > customer_accounts_orc_200 b on a.account_id=b.account_id group by > year,quarter; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16761) LLAP IO: SMB joins fail elevator
[ https://issues.apache.org/jira/browse/HIVE-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16060058#comment-16060058 ] Hive QA commented on HIVE-16761: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12874131/HIVE-16761.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 10847 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=238) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[llap_smb] (batchId=144) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=233) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union24] (batchId=125) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5735/testReport Console output: 
https://builds.apache.org/job/PreCommit-HIVE-Build/5735/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5735/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12874131 - PreCommit-HIVE-Build > LLAP IO: SMB joins fail elevator > - > > Key: HIVE-16761 > URL: https://issues.apache.org/jira/browse/HIVE-16761 > Project: Hive > Issue Type: Bug >Reporter: Gopal V >Assignee: Sergey Shelukhin > Attachments: HIVE-16761.01.patch, HIVE-16761.02.patch, > HIVE-16761.patch > > > {code} > Caused by: java.io.IOException: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:153) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:78) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ... 26 more > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.nextString(BatchToRowReader.java:334) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.nextValue(BatchToRowReader.java:602) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:149) > ... 
28 more > {code} > {code} > set hive.enforce.sortmergebucketmapjoin=false; > set hive.optimize.bucketmapjoin=true; > set hive.optimize.bucketmapjoin.sortedmerge=true; > set hive.auto.convert.sortmerge.join=true; > set hive.auto.convert.join=true; > set hive.auto.convert.join.noconditionaltask.size=500; > select year,quarter,count(*) from transactions_raw_orc_200 a join > customer_accounts_orc_200 b on a.account_id=b.account_id group by > year,quarter; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16930) HoS should verify the value of Kerberos principal and keytab file before adding them to spark-submit command parameters
[ https://issues.apache.org/jira/browse/HIVE-16930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-16930: --- Resolution: Fixed Fix Version/s: 2.4.0 3.0.0 Status: Resolved (was: Patch Available) Committed to 3.0.0 and 2.4.0. Thanks [~Yibing] for the patch. > HoS should verify the value of Kerberos principal and keytab file before > adding them to spark-submit command parameters > --- > > Key: HIVE-16930 > URL: https://issues.apache.org/jira/browse/HIVE-16930 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Yibing Shi >Assignee: Yibing Shi > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-16930.1.patch > > > When Kerberos is enabled, Hive CLI fails to run Hive on Spark queries: > {noformat} > >hive -e "set hive.execution.engine=spark; create table if not exists test(a > >int); select count(*) from test" --hiveconf hive.root.logger=INFO,console > > >/var/tmp/hive_log.txt > /var/tmp/hive_log_2.txt > 17/06/16 16:13:13 [main]: ERROR client.SparkClientImpl: Error while waiting > for client to connect. > java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel > client 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. 
Error: Child process exited > before connecting back with error log Error: Cannot load main class from JAR > file:/tmp/spark-submit.7196051517706529285.properties > Run with --help for usage help or --verbose for debug output > at > io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) > at > org.apache.hive.spark.client.SparkClientImpl.(SparkClientImpl.java:107) > at > org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80) > > at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:100) > > at > org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.(RemoteHiveSparkClient.java:96) > > at > org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:66) > > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:62) > > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114) > > at > org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:111) > > at > org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:97) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1972) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1685) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1421) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1205) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1195) > at > org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220) > at > org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383) > at > 
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:318) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:720) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > Caused by: java.lang.RuntimeException: Cancel client > 'a5de85d1-6933-43e7-986f-5f8e5c001b5f'. Error: Child process exited before > connecting back with error log Error: Cannot load main class from JAR > file:/tmp/spark-submit.7196051517706529285.properties > Run with --help for usage help or --verbose for debug output > at > org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:179) > at > org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:490) > at
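The guard named in the issue title can be sketched as follows (a hypothetical helper, not the committed patch): verify that the principal is non-blank and the keytab file actually exists before appending either flag to the spark-submit argument list, so spark-submit never receives a dangling or empty parameter.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class KerberosArgsSketch {
    // Hypothetical guard: only forward --principal/--keytab when both
    // values are usable; otherwise contribute nothing to the command line.
    static List<String> kerberosArgs(String principal, String keytab) {
        boolean usable = principal != null && !principal.trim().isEmpty()
            && keytab != null && new File(keytab).isFile();
        if (!usable) {
            return Collections.emptyList();
        }
        List<String> args = new ArrayList<>();
        args.add("--principal");
        args.add(principal);
        args.add("--keytab");
        args.add(keytab);
        return args;
    }

    public static void main(String[] args) {
        // A missing keytab file yields no Kerberos arguments at all.
        System.out.println(kerberosArgs("hive/_HOST@EXAMPLE.COM", "/no/such/keytab"));  // []
        System.out.println(kerberosArgs(null, null));  // []
    }
}
```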
[jira] [Updated] (HIVE-16761) LLAP IO: SMB joins fail elevator
[ https://issues.apache.org/jira/browse/HIVE-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-16761: Attachment: HIVE-16761.02.patch Added a test. Verified that it fails with the path error with the new code commented out. > LLAP IO: SMB joins fail elevator > - > > Key: HIVE-16761 > URL: https://issues.apache.org/jira/browse/HIVE-16761 > Project: Hive > Issue Type: Bug >Reporter: Gopal V >Assignee: Sergey Shelukhin > Attachments: HIVE-16761.01.patch, HIVE-16761.02.patch, > HIVE-16761.patch > > > {code} > Caused by: java.io.IOException: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:153) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:78) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ... 26 more > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.nextString(BatchToRowReader.java:334) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.nextValue(BatchToRowReader.java:602) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:149) > ... 
28 more > {code} > {code} > set hive.enforce.sortmergebucketmapjoin=false; > set hive.optimize.bucketmapjoin=true; > set hive.optimize.bucketmapjoin.sortedmerge=true; > set hive.auto.convert.sortmerge.join=true; > set hive.auto.convert.join=true; > set hive.auto.convert.join.noconditionaltask.size=500; > select year,quarter,count(*) from transactions_raw_orc_200 a join > customer_accounts_orc_200 b on a.account_id=b.account_id group by > year,quarter; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16589) Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE
[ https://issues.apache.org/jira/browse/HIVE-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059884#comment-16059884 ] Matt McCline commented on HIVE-16589: - Thank you [~jdere] for your diligent and careful code review. > Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and > COMPLETE for AVG, VARIANCE > --- > > Key: HIVE-16589 > URL: https://issues.apache.org/jira/browse/HIVE-16589 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-16589.01.patch, HIVE-16589.02.patch, > HIVE-16589.03.patch, HIVE-16589.04.patch, HIVE-16589.05.patch, > HIVE-16589.06.patch, HIVE-16589.07.patch, HIVE-16589.08.patch, > HIVE-16589.091.patch, HIVE-16589.092.patch, HIVE-16589.093.patch, > HIVE-16589.094.patch, HIVE-16589.095.patch, HIVE-16589.096.patch, > HIVE-16589.097.patch, HIVE-16589.098.patch, HIVE-16589.0991.patch, > HIVE-16589.0992.patch, HIVE-16589.0993.patch, HIVE-16589.0994.patch, > HIVE-16589.0995.patch, HIVE-16589.099.patch, HIVE-16589.09.patch > > > Allow Complex Types to be vectorized (since HIVE-16207: "Add support for > Complex Types in Fast SerDe" was committed). > Add more classes we vectorize AVG in preparation for fully supporting AVG > GroupBy. In particular, the PARTIAL2 and FINAL groupby modes that take in > the AVG struct as input. And, add the COMPLETE mode that takes in the > Original data and produces the Full Aggregation for completeness, so to speak. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-15665) LLAP: OrcFileMetadata objects in cache can impact heap usage
[ https://issues.apache.org/jira/browse/HIVE-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059874#comment-16059874 ] Hive QA commented on HIVE-15665: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12874119/HIVE-15665.04.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 10846 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=238) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_llap_counters] (batchId=144) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=142) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=233) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union24] (batchId=125) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178) 
{noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5734/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5734/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5734/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 15 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12874119 - PreCommit-HIVE-Build > LLAP: OrcFileMetadata objects in cache can impact heap usage > > > Key: HIVE-15665 > URL: https://issues.apache.org/jira/browse/HIVE-15665 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Rajesh Balamohan >Assignee: Sergey Shelukhin > Attachments: HIVE-15665.01.patch, HIVE-15665.02.patch, > HIVE-15665.03.patch, HIVE-15665.04.patch, HIVE-15665.patch > > > OrcFileMetadata internally has filestats, stripestats etc which are allocated > in heap. On large data sets, this could have an impact on the heap usage and > the memory usage by different executors in LLAP. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16761) LLAP IO: SMB joins fail elevator
[ https://issues.apache.org/jira/browse/HIVE-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059844#comment-16059844 ] Deepak Jaiswal commented on HIVE-16761: --- +1 > LLAP IO: SMB joins fail elevator > - > > Key: HIVE-16761 > URL: https://issues.apache.org/jira/browse/HIVE-16761 > Project: Hive > Issue Type: Bug >Reporter: Gopal V >Assignee: Sergey Shelukhin > Attachments: HIVE-16761.01.patch, HIVE-16761.patch > > > {code} > Caused by: java.io.IOException: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:153) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:78) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ... 26 more > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.nextString(BatchToRowReader.java:334) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.nextValue(BatchToRowReader.java:602) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:149) > ... 28 more > {code} > {code} > set hive.enforce.sortmergebucketmapjoin=false; > set hive.optimize.bucketmapjoin=true; > set hive.optimize.bucketmapjoin.sortedmerge=true; > set hive.auto.convert.sortmerge.join=true; > set hive.auto.convert.join=true; > set hive.auto.convert.join.noconditionaltask.size=500; > select year,quarter,count(*) from transactions_raw_orc_200 a join > customer_accounts_orc_200 b on a.account_id=b.account_id group by > year,quarter; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16932) incorrect predicate evaluation
[ https://issues.apache.org/jira/browse/HIVE-16932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059805#comment-16059805 ] Jesus Camacho Rodriguez commented on HIVE-16932: [~hopperjim], I have run this with multiple versions and it does not seem to be a problem (for the given example I get 75000). What is the result that you get? > incorrect predicate evaluation > -- > > Key: HIVE-16932 > URL: https://issues.apache.org/jira/browse/HIVE-16932 > Project: Hive > Issue Type: Bug > Components: CLI, Hive, ORC >Affects Versions: 1.2.1 > Environment: CentOS, HDP 2.6 >Reporter: Jim Hopper > > Hive returns an incorrect number of rows when the BETWEEN and NOT BETWEEN operators > are used in the WHERE clause while querying a table that uses ORC as the storage > format. > Script to replicate the issue on HDP 2.6: > {code} > SET hive.exec.compress.output=false; > SET hive.vectorized.execution.enabled=false; > SET hive.optimize.ppd=true; > SET hive.optimize.ppd.storage=true; > SET N=10; > SET TTT=default.tmp_tbl_text; > SET TTO=default.tmp_tbl_orc; > DROP TABLE IF EXISTS ${hiveconf:TTT}; > DROP TABLE IF EXISTS ${hiveconf:TTO}; > create table ${hiveconf:TTT} > stored as textfile > as > select pos as c > from ( > select posexplode(split(repeat(',', ${hiveconf:N}), ',')) > ) as t; > create table ${hiveconf:TTO} > stored as orc > as > select c > from ${hiveconf:TTT}; > SELECT count(c) as cnt > FROM ${hiveconf:TTT} > WHERE > c between 0 and ${hiveconf:N} > and c not between ${hiveconf:N} div 4 and ${hiveconf:N} div 2 > ; > SELECT count(c) as cnt > FROM ${hiveconf:TTO} > WHERE > c between 0 and ${hiveconf:N} > and c not between ${hiveconf:N} div 4 and ${hiveconf:N} div 2 > ; > DROP TABLE IF EXISTS ${hiveconf:TTT}; > DROP TABLE IF EXISTS ${hiveconf:TTO}; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
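The arithmetic behind the 75000 figure quoted in the comment can be checked independently of Hive. The sketch below (plain Java, not Hive code) counts the values of `c` in `0..N` that the combined predicate keeps, mirroring Hive's integer-division `div` with Java's `/` on longs; the 75000 result would be consistent with an N of 100000 (an assumption on our part, since the `SET N=10;` in the quoted script gives a much smaller table):

```java
class PredicateCount {
    // Counts values of c in 0..n satisfying:
    //   c between 0 and n  AND  c not between (n div 4) and (n div 2)
    // Hive's `div` is integer division, mirrored here by Java's `/` on longs.
    static long keptRows(long n) {
        long kept = 0;
        for (long c = 0; c <= n; c++) {
            boolean between = c >= 0 && c <= n;          // always true for this range
            boolean excluded = c >= n / 4 && c <= n / 2; // the NOT BETWEEN window
            if (between && !excluded) {
                kept++;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(keptRows(100000)); // 75000: 100001 rows minus the 25001 excluded
        System.out.println(keptRows(10));     // 7: pos values 0..10 minus 2..5
    }
}
```

If the text-format and ORC tables both return this reference count, predicate evaluation agrees; a different count from the ORC table only would point at the predicate-pushdown path.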
[jira] [Updated] (HIVE-15665) LLAP: OrcFileMetadata objects in cache can impact heap usage
[ https://issues.apache.org/jira/browse/HIVE-15665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-15665: Attachment: HIVE-15665.04.patch Fixing issues > LLAP: OrcFileMetadata objects in cache can impact heap usage > > > Key: HIVE-15665 > URL: https://issues.apache.org/jira/browse/HIVE-15665 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Rajesh Balamohan >Assignee: Sergey Shelukhin > Attachments: HIVE-15665.01.patch, HIVE-15665.02.patch, > HIVE-15665.03.patch, HIVE-15665.04.patch, HIVE-15665.patch > > > OrcFileMetadata internally has filestats, stripestats etc which are allocated > in heap. On large data sets, this could have an impact on the heap usage and > the memory usage by different executors in LLAP. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-16946) Information Schema Improvements
[ https://issues.apache.org/jira/browse/HIVE-16946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner reassigned HIVE-16946: - > Information Schema Improvements > --- > > Key: HIVE-16946 > URL: https://issues.apache.org/jira/browse/HIVE-16946 > Project: Hive > Issue Type: Improvement >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > > Collection of requested enhancements and fixes for the info schema. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16761) LLAP IO: SMB joins fail elevator
[ https://issues.apache.org/jira/browse/HIVE-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059738#comment-16059738 ] Deepak Jaiswal commented on HIVE-16761: --- Sure. > LLAP IO: SMB joins fail elevator > - > > Key: HIVE-16761 > URL: https://issues.apache.org/jira/browse/HIVE-16761 > Project: Hive > Issue Type: Bug >Reporter: Gopal V >Assignee: Sergey Shelukhin > Attachments: HIVE-16761.01.patch, HIVE-16761.patch > > > {code} > Caused by: java.io.IOException: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:153) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:78) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ... 26 more > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.nextString(BatchToRowReader.java:334) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.nextValue(BatchToRowReader.java:602) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:149) > ... 28 more > {code} > {code} > set hive.enforce.sortmergebucketmapjoin=false; > set hive.optimize.bucketmapjoin=true; > set hive.optimize.bucketmapjoin.sortedmerge=true; > set hive.auto.convert.sortmerge.join=true; > set hive.auto.convert.join=true; > set hive.auto.convert.join.noconditionaltask.size=500; > select year,quarter,count(*) from transactions_raw_orc_200 a join > customer_accounts_orc_200 b on a.account_id=b.account_id group by > year,quarter; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16761) LLAP IO: SMB joins fail elevator
[ https://issues.apache.org/jira/browse/HIVE-16761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059718#comment-16059718 ] Gunther Hagleitner commented on HIVE-16761: --- Patch looks good, but needs a test before commit. [~djaiswal] can you also take a look? > LLAP IO: SMB joins fail elevator > - > > Key: HIVE-16761 > URL: https://issues.apache.org/jira/browse/HIVE-16761 > Project: Hive > Issue Type: Bug >Reporter: Gopal V >Assignee: Sergey Shelukhin > Attachments: HIVE-16761.01.patch, HIVE-16761.patch > > > {code} > Caused by: java.io.IOException: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:153) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:78) > at > org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360) > ... 26 more > Caused by: java.lang.ClassCastException: > org.apache.hadoop.hive.ql.exec.vector.LongColumnVector cannot be cast to > org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.nextString(BatchToRowReader.java:334) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.nextValue(BatchToRowReader.java:602) > at > org.apache.hadoop.hive.ql.io.BatchToRowReader.next(BatchToRowReader.java:149) > ... 
28 more > {code} > {code} > set hive.enforce.sortmergebucketmapjoin=false; > set hive.optimize.bucketmapjoin=true; > set hive.optimize.bucketmapjoin.sortedmerge=true; > set hive.auto.convert.sortmerge.join=true; > set hive.auto.convert.join=true; > set hive.auto.convert.join.noconditionaltask.size=500; > select year,quarter,count(*) from transactions_raw_orc_200 a join > customer_accounts_orc_200 b on a.account_id=b.account_id group by > year,quarter; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
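The stack trace above is a plain type-confusion failure: BatchToRowReader selects its per-column reader method from the *declared* schema type (string, hence nextString), then unconditionally casts the column vector it was handed. The toy model below reproduces the mechanism; these are stand-in classes, not the real ones from org.apache.hadoop.hive.ql.exec.vector:

```java
// Stand-in classes for illustration only.
class ColumnVector {}
class LongColumnVector extends ColumnVector { long[] vector = new long[1024]; }
class BytesColumnVector extends ColumnVector { byte[][] vector = new byte[1024][]; }

class CastDemo {
    // Sketch of the nextString pattern: the cast target is chosen from the
    // declared schema type, so a column that was actually produced as longs
    // (as happens on the SMB path here) fails with ClassCastException.
    static byte[] nextString(ColumnVector col, int row) {
        BytesColumnVector bytes = (BytesColumnVector) col; // throws on mismatch
        return bytes.vector[row];
    }

    public static void main(String[] args) {
        ColumnVector mismatched = new LongColumnVector();
        try {
            nextString(mismatched, 0);
        } catch (ClassCastException e) {
            System.out.println("declared type (string) != produced vector type (long)");
        }
    }
}
```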
[jira] [Commented] (HIVE-16888) Upgrade Calcite to 1.13 and Avatica to 1.10
[ https://issues.apache.org/jira/browse/HIVE-16888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059708#comment-16059708 ] Remus Rusanu commented on HIVE-16888: - I'm going through the safe golden-file updates (i.e. better-reduced predicates) and I'll put up a new patch soon so that only the more problematic diffs remain (result diffs and some Tez graph diffs) > Upgrade Calcite to 1.13 and Avatica to 1.10 > --- > > Key: HIVE-16888 > URL: https://issues.apache.org/jira/browse/HIVE-16888 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 3.0.0 >Reporter: Remus Rusanu >Assignee: Remus Rusanu > Attachments: HIVE-16888.01.patch, HIVE-16888.02.patch, > HIVE-16888.03.patch > > > I'm creating this early to be able to ptest the current Calcite > 1.13.0-SNAPSHOT -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16908) Failures in TestHcatClient due to HIVE-16844
[ https://issues.apache.org/jira/browse/HIVE-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059651#comment-16059651 ] Ratandeep Ratti commented on HIVE-16908: [~sbeeram] . The tests look OK to me. > Failures in TestHcatClient due to HIVE-16844 > > > Key: HIVE-16908 > URL: https://issues.apache.org/jira/browse/HIVE-16908 > Project: Hive > Issue Type: Bug >Reporter: Sunitha Beeram >Assignee: Sunitha Beeram > Attachments: HIVE-16908.1.patch, HIVE-16908.2.patch > > > Some of the tests in TestHCatClient.java, for ex: > {noformat} > org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema > (batchId=177) > org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema > (batchId=177) > org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation > (batchId=177) > {noformat} > are failing due to HIVE-16844. HIVE-16844 fixes a connection leak when a new > configuration object is set on the ObjectStore. TestHCatClient fires up a > second instance of metastore thread with a different conf object that results > in the PersistenceMangaerFactory closure and hence tests fail. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16905) Add zookeeper ACL for hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-16905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059585#comment-16059585 ] Hive QA commented on HIVE-16905: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12874089/HIVE%20ACL%20FOR%20HIVESERVER2.pdf {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5733/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5733/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5733/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-06-22 16:02:18.335 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-5733/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-06-22 16:02:18.338 + cd apache-github-source-source + git fetch origin >From https://github.com/apache/hive 71f52d8..7819cd3 master -> origin/master + git reset --hard HEAD HEAD is now at 71f52d8 HIVE-16875: Query against view with partitioned child on HoS fails with privilege exception. (Yongzhi Chen, reviewed by Aihua Xu) + git clean -f -d + git checkout master Already on 'master' Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded. (use "git pull" to update your local branch) + git reset --hard origin/master HEAD is now at 7819cd3 HIVE-16867: Extend shared scan optimizer to reuse computation from other operators (Jesus Camacho Rodriguez, reviewed by Ashutosh Chauhan) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-06-22 16:02:21.598 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch patch: Only garbage was found in the patch input. patch: Only garbage was found in the patch input. patch: Only garbage was found in the patch input. fatal: unrecognized input The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12874089 - PreCommit-HIVE-Build > Add zookeeper ACL for hiveserver2 > - > > Key: HIVE-16905 > URL: https://issues.apache.org/jira/browse/HIVE-16905 > Project: Hive > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: Saijin Huang >Assignee: Saijin Huang > Attachments: HIVE-16905.1.patch, HIVE ACL FOR HIVESERVER2.pdf > > > Add zookeeper ACL for hiveserver2 is necessary for hive to protect the znode > of hiveserver2 deleted by accident. 
> -- > case: > When I made beeline connections through Hive HA with ZooKeeper, I suddenly > found that beeline could not connect to HiveServer2. The cause of the problem was > that someone had deleted /hiveserver2 by mistake, so the beeline > connection failed and could not read the configs from ZooKeeper. > - > Because the ACL of /hiveserver2 is set to world:anyone:cdrwa, > anyone can easily delete /hiveserver2 and its znodes at any time. This is > unsafe, so it is necessary to protect the znode /hiveserver2. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16888) Upgrade Calcite to 1.13 and Avatica to 1.10
[ https://issues.apache.org/jira/browse/HIVE-16888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059564#comment-16059564 ] Hive QA commented on HIVE-16888: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12874085/HIVE-16888.03.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 155 failed/errored test(s), 10846 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_index] (batchId=229) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_predicate_pushdown] (batchId=229) org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver[accumulo_queries] (batchId=229) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=238) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[escape_comments] (batchId=238) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=238) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[mapjoin2] (batchId=238) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite] (batchId=238) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_7] (batchId=238) org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[udf_unix_timestamp] (batchId=238) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join12] (batchId=23) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join16] (batchId=38) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join4] (batchId=67) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join5] (batchId=69) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join8] (batchId=81) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_date] (batchId=9) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cast_on_constant] (batchId=23) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_annotate_stats_groupby] (batchId=80) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join17] (batchId=24) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_cross_product_check_2] (batchId=19) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_gby2_map_multi_distinct] (batchId=78) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_groupby3_noskew_multi_distinct] (batchId=37) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_join0] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_outer_join_ppr] (batchId=7) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[char_cast] (batchId=84) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas_date] (batchId=1) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[date_1] (batchId=76) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[date_4] (batchId=46) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[date_udf] (batchId=30) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_basic2] (batchId=11) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_intervals] (batchId=22) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_timeseries] (batchId=56) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_topn] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[extrapolate_part_stats_date] (batchId=20) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[filter_union] (batchId=15) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[fold_eq_with_case_when] (batchId=77) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[fouter_join_ppr] (batchId=31) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_position] (batchId=38) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_ppr_multi_distinct] (batchId=55) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_mult_tables] (batchId=81) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_mult_tables_compact] (batchId=34) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[interval_3] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[interval_alt] (batchId=4) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[interval_arithmetic] (batchId=45) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join12] (batchId=39) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join16] (batchId=29) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join4] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join5] (batchId=66) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join8] (batchId=45)
[jira] [Commented] (HIVE-16944) schematool -dbType hive should give some more feedback/assistance
[ https://issues.apache.org/jira/browse/HIVE-16944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059560#comment-16059560 ] Carter Shanklin commented on HIVE-16944: Also [~vihangk1] if you're interested in setting up INFORMATION_SCHEMA there's a full how-to in HIVE-16941 > schematool -dbType hive should give some more feedback/assistance > - > > Key: HIVE-16944 > URL: https://issues.apache.org/jira/browse/HIVE-16944 > Project: Hive > Issue Type: Bug >Reporter: Carter Shanklin > > Given the other ways schematool is used, the most obvious guess I would have > for initializing the Hive schema is: > {code} > schematool -metaDbType mysql -dbType hive -initSchema > {code} > Unfortunately that fails with this NPE: > {code} > Exception in thread "main" java.lang.NullPointerException > at > org.apache.hadoop.hive.metastore.tools.HiveSchemaHelper.getDbCommandParser(HiveSchemaHelper.java:570) > at > org.apache.hadoop.hive.metastore.tools.HiveSchemaHelper.getDbCommandParser(HiveSchemaHelper.java:564) > at > org.apache.hadoop.hive.metastore.tools.HiveSchemaHelper.getDbCommandParser(HiveSchemaHelper.java:560) > at > org.apache.hadoop.hive.metastore.tools.HiveSchemaHelper$HiveCommandParser.(HiveSchemaHelper.java:373) > at > org.apache.hadoop.hive.metastore.tools.HiveSchemaHelper.getDbCommandParser(HiveSchemaHelper.java:573) > at > org.apache.hive.beeline.HiveSchemaTool.getDbCommandParser(HiveSchemaTool.java:165) > at > org.apache.hive.beeline.HiveSchemaTool.(HiveSchemaTool.java:101) > at org.apache.hive.beeline.HiveSchemaTool.(HiveSchemaTool.java:90) > at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1166) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at 
org.apache.hadoop.util.RunJar.run(RunJar.java:233) > at org.apache.hadoop.util.RunJar.main(RunJar.java:148) > {code} > Two additional arguments are needed: > -url jdbc:hive2://localhost:1/default -driver > org.apache.hive.jdbc.HiveDriver > If the user does not supply these for dbType hive, schematool should detect > and error out appropriately, plus give an example of what it's looking for. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16944) schematool -dbType hive should give some more feedback/assistance
[ https://issues.apache.org/jira/browse/HIVE-16944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059528#comment-16059528 ] Vihang Karajgaonkar commented on HIVE-16944: I can take a look at this [~cartershanklin]. Curious to understand what is the difference between {{metaDbType}} and {{dbType}} > schematool -dbType hive should give some more feedback/assistance > - > > Key: HIVE-16944 > URL: https://issues.apache.org/jira/browse/HIVE-16944 > Project: Hive > Issue Type: Bug >Reporter: Carter Shanklin > > Given the other ways schematool is used, the most obvious guess I would have > for initializing the Hive schema is: > {code} > schematool -metaDbType mysql -dbType hive -initSchema > {code} > Unfortunately that fails with this NPE: > {code} > Exception in thread "main" java.lang.NullPointerException > at > org.apache.hadoop.hive.metastore.tools.HiveSchemaHelper.getDbCommandParser(HiveSchemaHelper.java:570) > at > org.apache.hadoop.hive.metastore.tools.HiveSchemaHelper.getDbCommandParser(HiveSchemaHelper.java:564) > at > org.apache.hadoop.hive.metastore.tools.HiveSchemaHelper.getDbCommandParser(HiveSchemaHelper.java:560) > at > org.apache.hadoop.hive.metastore.tools.HiveSchemaHelper$HiveCommandParser.(HiveSchemaHelper.java:373) > at > org.apache.hadoop.hive.metastore.tools.HiveSchemaHelper.getDbCommandParser(HiveSchemaHelper.java:573) > at > org.apache.hive.beeline.HiveSchemaTool.getDbCommandParser(HiveSchemaTool.java:165) > at > org.apache.hive.beeline.HiveSchemaTool.(HiveSchemaTool.java:101) > at org.apache.hive.beeline.HiveSchemaTool.(HiveSchemaTool.java:90) > at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1166) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at 
org.apache.hadoop.util.RunJar.run(RunJar.java:233) > at org.apache.hadoop.util.RunJar.main(RunJar.java:148) > {code} > Two additional arguments are needed: > -url jdbc:hive2://localhost:1/default -driver > org.apache.hive.jdbc.HiveDriver > If the user does not supply these for dbType hive, schematool should detect > and error out appropriately, plus give an example of what it's looking for. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
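The behavior the report asks for, detecting the missing arguments up front instead of dying in an NPE, can be sketched as a small pre-flight check. This is a hypothetical helper, not code from HiveSchemaTool; the flag names and values are the ones quoted in the report:

```java
class SchemaToolArgCheck {
    // Hypothetical validation sketch: when dbType is "hive", -url and -driver
    // are mandatory, so fail fast with a usage hint instead of an NPE.
    static String validate(String dbType, String url, String driver) {
        if ("hive".equals(dbType) && (url == null || driver == null)) {
            return "dbType 'hive' requires -url and -driver, e.g. "
                 + "-url jdbc:hive2://localhost:1/default "
                 + "-driver org.apache.hive.jdbc.HiveDriver";
        }
        return null; // arguments look complete for this dbType
    }

    public static void main(String[] args) {
        System.out.println(validate("hive", null, null));  // prints the usage hint
        System.out.println(validate("mysql", null, null)); // null: no extra args required
    }
}
```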
[jira] [Updated] (HIVE-16867) Extend shared scan optimizer to reuse computation from other operators
[ https://issues.apache.org/jira/browse/HIVE-16867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-16867: --- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Pushed to master, thanks for reviewing [~ashutoshc]! > Extend shared scan optimizer to reuse computation from other operators > -- > > Key: HIVE-16867 > URL: https://issues.apache.org/jira/browse/HIVE-16867 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Fix For: 3.0.0 > > Attachments: HIVE-16867.01.patch, HIVE-16867.02.patch, > HIVE-16867.03.patch, HIVE-16867.04.patch, HIVE-16867.patch > > > Follow-up of the work in HIVE-16602. > HIVE-16602 introduced an optimization that identifies scans on input tables > that can be merged so the data is read only once. > This extension to that rule allows to reuse the computation that is done in > the work containing those scans. In particular, we traverse both parts of the > plan upstream and reuse the operators if possible. > Currently, the optimizer will not go beyond the output edge(s) of that work. > Follow-up extensions might remove this limitation. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16648) Allow select distinct with group by
[ https://issues.apache.org/jira/browse/HIVE-16648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059504#comment-16059504 ] Carter Shanklin commented on HIVE-16648: To clarify I wasn't looking for a workaround but this case does come up when porting SQL from other DBs to Hive, most mature SQL engines are smart enough to ignore the distinct clause in this case. Looks like HIVE-16924 is going to tackle the more complex case when aggregates are also present but that should cover this one as well. > Allow select distinct with group by > --- > > Key: HIVE-16648 > URL: https://issues.apache.org/jira/browse/HIVE-16648 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Carter Shanklin >Assignee: Anshuman > > Although there are very few legitimate reasons to have both "select distinct" > and "group by" in the same query, it is still used from time to time and > other systems support it. > Illustrating the issue: > {code} > hive> create table test (c1 integer); > OK > Time taken: 0.073 seconds > hive> select distinct c1 from test group by c1; > FAILED: SemanticException 1:38 SELECT DISTINCT and GROUP BY can not be in the > same query. Error encountered near token 'c1' > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16943) MoveTask should separate src FileSystem from dest FileSystem
[ https://issues.apache.org/jira/browse/HIVE-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059499#comment-16059499 ] Fei Hui commented on HIVE-16943: [~sershe] could you please take a look? Thanks. I see you made a similar change in HIVE-11568 > MoveTask should separate src FileSystem from dest FileSystem > - > > Key: HIVE-16943 > URL: https://issues.apache.org/jira/browse/HIVE-16943 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 3.0.0 >Reporter: Fei Hui > Attachments: HIVE-16943.patch > > > {code:title=MoveTask.java|borderStyle=solid} > private void moveFileInDfs (Path sourcePath, Path targetPath, FileSystem fs) > throws HiveException, IOException { > // if source exists, rename. Otherwise, create a empty directory > if (fs.exists(sourcePath)) { > Path deletePath = null; > // If it multiple level of folder are there fs.rename is failing so > first > // create the targetpath.getParent() if it not exist > if (HiveConf.getBoolVar(conf, > HiveConf.ConfVars.HIVE_INSERT_INTO_MULTILEVEL_DIRS)) { > deletePath = createTargetPath(targetPath, fs); > } > Hive.clearDestForSubDirSrc(conf, targetPath, sourcePath, false); > if (!Hive.moveFile(conf, sourcePath, targetPath, true, false)) { > try { > if (deletePath != null) { > fs.delete(deletePath, true); > } > } catch (IOException e) { > LOG.info("Unable to delete the path created for facilitating rename" > + deletePath); > } > throw new HiveException("Unable to rename: " + sourcePath > + " to: " + targetPath); > } > } else if (!fs.mkdirs(targetPath)) { > throw new HiveException("Unable to make directory: " + targetPath); > } > } > {code} > sourcePath and targetPath may come from different filesystems, so we should > separate them. > I see that HIVE-11568 already did this in Hive.java -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-16943) MoveTask should separate src FileSystem from dest FileSystem
[ https://issues.apache.org/jira/browse/HIVE-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui reassigned HIVE-16943: -- Assignee: Fei Hui > MoveTask should separate src FileSystem from dest FileSystem > - > > Key: HIVE-16943 > URL: https://issues.apache.org/jira/browse/HIVE-16943 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 3.0.0 >Reporter: Fei Hui >Assignee: Fei Hui > Attachments: HIVE-16943.patch > > > {code:title=MoveTask.java|borderStyle=solid} > private void moveFileInDfs (Path sourcePath, Path targetPath, FileSystem fs) > throws HiveException, IOException { > // if source exists, rename. Otherwise, create a empty directory > if (fs.exists(sourcePath)) { > Path deletePath = null; > // If it multiple level of folder are there fs.rename is failing so > first > // create the targetpath.getParent() if it not exist > if (HiveConf.getBoolVar(conf, > HiveConf.ConfVars.HIVE_INSERT_INTO_MULTILEVEL_DIRS)) { > deletePath = createTargetPath(targetPath, fs); > } > Hive.clearDestForSubDirSrc(conf, targetPath, sourcePath, false); > if (!Hive.moveFile(conf, sourcePath, targetPath, true, false)) { > try { > if (deletePath != null) { > fs.delete(deletePath, true); > } > } catch (IOException e) { > LOG.info("Unable to delete the path created for facilitating rename" > + deletePath); > } > throw new HiveException("Unable to rename: " + sourcePath > + " to: " + targetPath); > } > } else if (!fs.mkdirs(targetPath)) { > throw new HiveException("Unable to make directory: " + targetPath); > } > } > {code} > sourcePath and targetPath may come from different filesystems, so we should > separate them. > I see that HIVE-11568 already did this in Hive.java -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16943) MoveTask should separate src FileSystem from dest FileSystem
[ https://issues.apache.org/jira/browse/HIVE-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HIVE-16943: --- Attachment: HIVE-16943.patch Patch uploaded. > MoveTask should separate src FileSystem from dest FileSystem > - > > Key: HIVE-16943 > URL: https://issues.apache.org/jira/browse/HIVE-16943 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 3.0.0 >Reporter: Fei Hui > Attachments: HIVE-16943.patch > > > {code:title=MoveTask.java|borderStyle=solid} > private void moveFileInDfs (Path sourcePath, Path targetPath, FileSystem fs) > throws HiveException, IOException { > // if source exists, rename. Otherwise, create a empty directory > if (fs.exists(sourcePath)) { > Path deletePath = null; > // If it multiple level of folder are there fs.rename is failing so > first > // create the targetpath.getParent() if it not exist > if (HiveConf.getBoolVar(conf, > HiveConf.ConfVars.HIVE_INSERT_INTO_MULTILEVEL_DIRS)) { > deletePath = createTargetPath(targetPath, fs); > } > Hive.clearDestForSubDirSrc(conf, targetPath, sourcePath, false); > if (!Hive.moveFile(conf, sourcePath, targetPath, true, false)) { > try { > if (deletePath != null) { > fs.delete(deletePath, true); > } > } catch (IOException e) { > LOG.info("Unable to delete the path created for facilitating rename" > + deletePath); > } > throw new HiveException("Unable to rename: " + sourcePath > + " to: " + targetPath); > } > } else if (!fs.mkdirs(targetPath)) { > throw new HiveException("Unable to make directory: " + targetPath); > } > } > {code} > sourcePath and targetPath may come from different filesystems, so we should > separate them. > I see that HIVE-11568 already did this in Hive.java -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16905) Add zookeeper ACL for hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-16905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saijin Huang updated HIVE-16905: Attachment: HIVE ACL FOR HIVESERVER2.pdf > Add zookeeper ACL for hiveserver2 > - > > Key: HIVE-16905 > URL: https://issues.apache.org/jira/browse/HIVE-16905 > Project: Hive > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: Saijin Huang >Assignee: Saijin Huang > Attachments: HIVE-16905.1.patch, HIVE ACL FOR HIVESERVER2.pdf > > > Adding a ZooKeeper ACL for HiveServer2 is necessary so that Hive can protect > the /hiveserver2 znode from accidental deletion. > -- > Case: > When I make Beeline connections through Hive HA with ZooKeeper, Beeline > sometimes cannot connect to HiveServer2. The cause is that someone deleted > /hiveserver2 by mistake, so the Beeline connection fails and cannot read the > configs from ZooKeeper. > - > Because the ACL of /hiveserver2 is set to world:anyone:cdrwa, anyone can > easily delete /hiveserver2 and its child znodes at any time. This is unsafe, > so the /hiveserver2 znode must be protected. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16905) Add zookeeper ACL for hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-16905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059453#comment-16059453 ] Saijin Huang commented on HIVE-16905: - The doc is updated. > Add zookeeper ACL for hiveserver2 > - > > Key: HIVE-16905 > URL: https://issues.apache.org/jira/browse/HIVE-16905 > Project: Hive > Issue Type: New Feature >Affects Versions: 3.0.0 >Reporter: Saijin Huang >Assignee: Saijin Huang > Attachments: HIVE-16905.1.patch, HIVE ACL FOR HIVESERVER2.pdf > > > Adding a ZooKeeper ACL for HiveServer2 is necessary so that Hive can protect > the /hiveserver2 znode from accidental deletion. > -- > Case: > When I make Beeline connections through Hive HA with ZooKeeper, Beeline > sometimes cannot connect to HiveServer2. The cause is that someone deleted > /hiveserver2 by mistake, so the Beeline connection fails and cannot read the > configs from ZooKeeper. > - > Because the ACL of /hiveserver2 is set to world:anyone:cdrwa, anyone can > easily delete /hiveserver2 and its child znodes at any time. This is unsafe, > so the /hiveserver2 znode must be protected. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
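For readers unfamiliar with ZooKeeper ACL notation: an ACL is written `scheme:id:perms`, and the permission letters in `world:anyone:cdrwa` grant create, delete, read, write, and admin to every client, which is exactly the exposure described above. A toy decoder (Python sketch, not ZooKeeper client code) showing what each letter means:

```python
# Decode a ZooKeeper ACL string such as the "world:anyone:cdrwa" in this
# issue (toy parser for illustration, not ZooKeeper client code).
PERMS = {"c": "create", "d": "delete", "r": "read", "w": "write", "a": "admin"}

def decode_acl(acl):
    scheme, ident, perms = acl.split(":")
    return scheme, ident, [PERMS[p] for p in perms]

scheme, ident, perms = decode_acl("world:anyone:cdrwa")
assert scheme == "world" and ident == "anyone"
assert "delete" in perms  # anyone may delete /hiveserver2 -- the reported risk
```

Tightening the ACL would mean removing at least the delete/write/admin bits for the `world` scheme on /hiveserver2.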
[jira] [Updated] (HIVE-16875) Query against view with partitioned child on HoS fails with privilege exception.
[ https://issues.apache.org/jira/browse/HIVE-16875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-16875: Resolution: Fixed Fix Version/s: 2.4.0 3.0.0 Status: Resolved (was: Patch Available) Committed the fix to master and branch-2. Thanks [~aihuaxu] for reviewing the code. > Query against view with partitioned child on HoS fails with privilege > exception. > > > Key: HIVE-16875 > URL: https://issues.apache.org/jira/browse/HIVE-16875 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 1.0.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-16875.1.patch, HIVE-16875.2.patch, > HIVE-16875.3.patch > > > Query against view with child table that has partitions fails with privilege > exception even with correct privileges. > Reproduce: > {noformat} > create table jsamp1 (a string) partitioned by (b int); > insert into table jsamp1 partition (b=1) values ("hello"); > create view jview as select * from jsamp1; > create role viewtester; > grant all on table jview to role viewtester; > grant role viewtester to group testers; > Use MR, the select will succeed: > set hive.execution.engine=mr; > select count(*) from jview; > while use spark: > set hive.execution.engine=spark; > select count(*) from jview; > it fails with: > Error: Error while compiling statement: FAILED: SemanticException No valid > privileges > User tester does not have privileges for QUERY > The required privileges: > Server=server1->Db=default->Table=j1part->action=select; > (state=42000,code=4) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16934) Transform COUNT(x) into COUNT() when x is not nullable
[ https://issues.apache.org/jira/browse/HIVE-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059437#comment-16059437 ] Hive QA commented on HIVE-16934: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12874082/HIVE-16934.01.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 10846 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=238) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=142) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnstats_part_coltype] (batchId=158) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=99) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=233) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=233) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_1_23] (batchId=134) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby_sort_skew_1_23] (batchId=104) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join35] (batchId=127) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union24] (batchId=125) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217) 
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5731/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5731/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5731/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 19 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12874082 - PreCommit-HIVE-Build > Transform COUNT(x) into COUNT() when x is not nullable > -- > > Key: HIVE-16934 > URL: https://issues.apache.org/jira/browse/HIVE-16934 > Project: Hive > Issue Type: Improvement > Components: Logical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-16934.01.patch, HIVE-16934.patch > > > Add a rule to simplify COUNT aggregation function if possible, removing > expressions that cannot be nullable from its parameters. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16888) Upgrade Calcite to 1.13 and Avatica to 1.10
[ https://issues.apache.org/jira/browse/HIVE-16888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-16888: Attachment: HIVE-16888.03.patch Patch.03 considers VARCHAR(HiveTypeSystemImpl.MAX_VARCHAR_PRECISION) as TOK_STRING in TypeConverter.hiveToken. The Calcite 1.13 literals now come as a CAST(... AS VARCHAR()) and the existing type conversion only considered TOK_STRING for VARCHAR(Integer.MAX_VALUE). > Upgrade Calcite to 1.13 and Avatica to 1.10 > --- > > Key: HIVE-16888 > URL: https://issues.apache.org/jira/browse/HIVE-16888 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 3.0.0 >Reporter: Remus Rusanu >Assignee: Remus Rusanu > Attachments: HIVE-16888.01.patch, HIVE-16888.02.patch, > HIVE-16888.03.patch > > > I'm creating this early to be able to ptest the current Calcite > 1.13.0-SNAPSHOT -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16934) Transform COUNT(x) into COUNT() when x is not nullable
[ https://issues.apache.org/jira/browse/HIVE-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-16934: --- Attachment: HIVE-16934.01.patch > Transform COUNT(x) into COUNT() when x is not nullable > -- > > Key: HIVE-16934 > URL: https://issues.apache.org/jira/browse/HIVE-16934 > Project: Hive > Issue Type: Improvement > Components: Logical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-16934.01.patch, HIVE-16934.patch > > > Add a rule to simplify COUNT aggregation function if possible, removing > expressions that cannot be nullable from its parameters. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16939) metastore error: 'export: -Dproc_metastore : not a valid identifier'
[ https://issues.apache.org/jira/browse/HIVE-16939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059086#comment-16059086 ] Fei Hui commented on HIVE-16939: Failed tests are unrelated. > metastore error: 'export: -Dproc_metastore : not a valid identifier' > > > Key: HIVE-16939 > URL: https://issues.apache.org/jira/browse/HIVE-16939 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Fei Hui >Assignee: Fei Hui > Attachments: HIVE-16939.patch > > > When I run the metastore, it reports the error below > {quote} > bin/ext/metastore.sh: line 29: export: ` -Dproc_metastore ': not a valid > identifier > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16934) Transform COUNT(x) into COUNT() when x is not nullable
[ https://issues.apache.org/jira/browse/HIVE-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059024#comment-16059024 ] Jesus Camacho Rodriguez commented on HIVE-16934: [~vgarg], there are multiple benefits that I can think of. First, at the execution side we will not have to access/evaluate any expression when calculating the COUNT. Further, by removing expressions that are referenced by the aggregate call, we might be able to further prune columns in the operator plan. Another benefit is that this might lead to some aggregate calls not being computed twice, e.g., {{COUNT(x)}} and {{COUNT(y)}} if _x_ and _y_ are not nullable. Finally, as a side effect, we might be able to recognize more equivalent expressions in MVs rewriting or SharedWorkOptimizer, and push more computation to Druid, since currently Druid is only capable of executing {{count(*)}}. > Transform COUNT(x) into COUNT() when x is not nullable > -- > > Key: HIVE-16934 > URL: https://issues.apache.org/jira/browse/HIVE-16934 > Project: Hive > Issue Type: Improvement > Components: Logical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-16934.patch > > > Add a rule to simplify COUNT aggregation function if possible, removing > expressions that cannot be nullable from its parameters. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-16934) Transform COUNT(x) into COUNT() when x is not nullable
[ https://issues.apache.org/jira/browse/HIVE-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059024#comment-16059024 ] Jesus Camacho Rodriguez edited comment on HIVE-16934 at 6/22/17 9:14 AM: - [~vgarg], there are multiple benefits that I can think of. First, at the execution side we will not have to access/evaluate any expression when calculating the COUNT. Further, by removing expressions that are referenced by the aggregate call, we might be able to further prune columns in the operator plan. Another benefit is that this might lead to some aggregate calls not being computed twice, e.g., {{COUNT\(x\)}} and {{COUNT\(y\)}} if _x_ and _y_ are not nullable. Finally, as a side effect, we might be able to recognize more equivalent expressions in MVs rewriting or SharedWorkOptimizer, and push more computation to Druid, since currently Druid is only capable of executing {{COUNT\(*\)}}. was (Author: jcamachorodriguez): [~vgarg], there are multiple benefits that I can think of. First, at the execution side we will not have to access/evaluate any expression when calculating the COUNT. Further, by removing expressions that are referenced by the aggregate call, we might be able to further prune columns in the operator plan. Another benefit is that this might lead to some aggregate calls not being computed twice, e.g., {{COUNT(x)}} and {{COUNT(y)}} if _x_ and _y_ are not nullable. Finally, as a side effect, we might be able to recognize more equivalent expressions in MVs rewriting or SharedWorkOptimizer, and push more computation to Druid, since currently Druid is only capable of executing {{count(*)}}. 
> Transform COUNT(x) into COUNT() when x is not nullable > -- > > Key: HIVE-16934 > URL: https://issues.apache.org/jira/browse/HIVE-16934 > Project: Hive > Issue Type: Improvement > Components: Logical Optimizer >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-16934.patch > > > Add a rule to simplify COUNT aggregation function if possible, removing > expressions that cannot be nullable from its parameters. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
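The semantics behind the rewrite discussed above: SQL's COUNT(x) counts only rows where x IS NOT NULL, while COUNT(*) counts all rows, so the two coincide exactly when x cannot be NULL. A toy demonstration (Python, not Hive code):

```python
# SQL COUNT(x) counts rows where x IS NOT NULL; COUNT(*) counts all rows.
# When column x cannot be NULL the two are equal, which is what justifies
# rewriting COUNT(x) to COUNT() (toy demonstration, not Hive code).
def count_col(rows):   # behaves like COUNT(x)
    return sum(1 for x in rows if x is not None)

def count_star(rows):  # behaves like COUNT(*)
    return len(rows)

non_nullable = [3, 1, 4, 1, 5]
assert count_col(non_nullable) == count_star(non_nullable) == 5

nullable = [3, None, 4]
assert count_col(nullable) == 2 and count_star(nullable) == 3  # rewrite invalid here
```

This also shows why the rule helps deduplication: once COUNT(x) and COUNT(y) over non-nullable x and y are both rewritten to COUNT(), they become one aggregate call.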
[jira] [Commented] (HIVE-13567) Auto-gather column stats - phase 2
[ https://issues.apache.org/jira/browse/HIVE-13567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058992#comment-16058992 ] Hive QA commented on HIVE-13567: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12874016/HIVE-13567.17.patch {color:green}SUCCESS:{color} +1 due to 20 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 138 failed/errored test(s), 10846 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite] (batchId=237) org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver (batchId=57) org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter2] (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_merge_stats] (batchId=57) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table2_h23] (batchId=14) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_numbuckets_partitioned_table_h23] (batchId=65) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_change_col] (batchId=24) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_partition_coltype] (batchId=25) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_rename_partition_authorization] (batchId=57) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[alter_table_serde2] (batchId=25) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_6] (batchId=44) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_explain] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_2] (batchId=57) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[authorization_view_disable_cbo_3] (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_comments] (batchId=57) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_date] (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[avro_partitioned] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ba_table3] (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin10] (batchId=49) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin11] (batchId=68) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin12] (batchId=34) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin13] (batchId=38) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin8] (batchId=12) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketmapjoin9] (batchId=16) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[bucketsortoptimize_insert_1] (batchId=57) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_like_view] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_or_replace_view] (batchId=37) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[database_drop] (batchId=57) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_whole_partition] (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[describe_table] (batchId=41) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[display_colstats_tbllvl] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[druid_topn] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_04_evolved_parts] (batchId=30) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[extract] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[filter_cond_pushdown] (batchId=57) 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby2_limit] (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby7_noskew_multi_single_reducer] (batchId=57) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_grouping_sets_grouping] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[hook_order] (batchId=3) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[implicit_cast1] (batchId=57) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_auto_self_join] (batchId=57) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_compact_binary_search] (batchId=57) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input0] (batchId=9) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input33] (batchId=57) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input9] (batchId=57) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input_columnarserde] (batchId=57)
[jira] [Commented] (HIVE-16940) Residual predicates in join operator prevent vectorization
[ https://issues.apache.org/jira/browse/HIVE-16940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058934#comment-16058934 ] Gopal V commented on HIVE-16940: There's the planning part which throws out vectorization and then all the specialized map-joins need to replace all forward() calls with forwardToFilter() and bounce the output VRBs through any filters if they exist. > Residual predicates in join operator prevent vectorization > -- > > Key: HIVE-16940 > URL: https://issues.apache.org/jira/browse/HIVE-16940 > Project: Hive > Issue Type: Improvement > Components: Vectorization >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez > > With HIVE-16885, filter predicates in ON clause for INNER joins are pushed > within the join operator (residual predicates in the operator). Previously, > residual predicates were only used for OUTER join operators. > Currently, vectorization does not support the evaluation of residual > predicates that are within an INNER join, and thus, it gets disabled if the > filter expression is pushed within the join. We should implement the > vectorization of INNER join in the presence of residual predicates. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
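To make the "bounce the output VRBs through any filters" idea concrete: a residual predicate is a non-equi condition that the key-based join match cannot enforce, so each joined row must pass through it before being forwarded downstream. A hedged row-at-a-time sketch (Python toy, not the vectorized Hive implementation; `forwardToFilter` is only named in Gopal V's comment):

```python
# Toy inner join with a residual (non-equi) predicate: rows match on keys,
# then the residual filter residual(left, right) is applied to each joined
# row before it is emitted. Not Hive code; illustrates the semantics only.
def join_with_residual(left, right, key_l, key_r, residual):
    out = []
    for l in left:
        for r in right:
            if l[key_l] == r[key_r] and residual(l, r):
                out.append((l, r))
    return out

left = [{"k": 1, "size": 10}, {"k": 1, "size": 2}]
right = [{"k": 1, "limit": 5}]
rows = join_with_residual(left, right, "k", "k",
                          lambda l, r: l["size"] > r["limit"])
assert rows == [({"k": 1, "size": 10}, {"k": 1, "limit": 5})]
```

The vectorization work described in the issue is about doing this filtering step on whole vectorized row batches instead of per row.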
[jira] [Commented] (HIVE-16939) metastore error: 'export: -Dproc_metastore : not a valid identifier'
[ https://issues.apache.org/jira/browse/HIVE-16939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058911#comment-16058911 ] Hive QA commented on HIVE-16939: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12874004/HIVE-16939.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 10841 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=237) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=232) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=232) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=232) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=216) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=216) org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=216) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=177) org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=177) org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery (batchId=226) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5729/testReport Console output: 
https://builds.apache.org/job/PreCommit-HIVE-Build/5729/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5729/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12874004 - PreCommit-HIVE-Build > metastore error: 'export: -Dproc_metastore : not a valid identifier' > > > Key: HIVE-16939 > URL: https://issues.apache.org/jira/browse/HIVE-16939 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Fei Hui >Assignee: Fei Hui > Attachments: HIVE-16939.patch > > > When I run the metastore, it reports the error below > {quote} > bin/ext/metastore.sh: line 29: export: ` -Dproc_metastore ': not a valid > identifier > {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16940) Residual predicates in join operator prevent vectorization
[ https://issues.apache.org/jira/browse/HIVE-16940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058899#comment-16058899 ] Jesus Camacho Rodriguez commented on HIVE-16940: Cc [~mmccline] [~ashutoshc] [~gopalv] > Residual predicates in join operator prevent vectorization > -- > > Key: HIVE-16940 > URL: https://issues.apache.org/jira/browse/HIVE-16940 > Project: Hive > Issue Type: Improvement > Components: Vectorization >Affects Versions: 3.0.0 >Reporter: Jesus Camacho Rodriguez > > With HIVE-16885, filter predicates in ON clause for INNER joins are pushed > within the join operator (residual predicates in the operator). Previously, > residual predicates were only used for OUTER join operators. > Currently, vectorization does not support the evaluation of residual > predicates that are within an INNER join, and thus, it gets disabled if the > filter expression is pushed within the join. We should implement the > vectorization of INNER join in the presence of residual predicates. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16885) Non-equi Joins: Filter clauses should be pushed into the ON clause
[ https://issues.apache.org/jira/browse/HIVE-16885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-16885: --- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Pushed to master, thanks for reviewing [~ashutoshc]! > Non-equi Joins: Filter clauses should be pushed into the ON clause > -- > > Key: HIVE-16885 > URL: https://issues.apache.org/jira/browse/HIVE-16885 > Project: Hive > Issue Type: Improvement > Components: Physical Optimizer >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Jesus Camacho Rodriguez > Fix For: 3.0.0 > > Attachments: HIVE-16885.01.patch, HIVE-16885.02.patch, > HIVE-16885.03.patch, HIVE-16885.patch > > > FIL_24 -> MAPJOIN_23 > {code} > hive> explain select * from part where p_size > (select max(p_size) from > part group by p_type); > Warning: Map Join MAPJOIN[14][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 3 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_26] > Select Operator [SEL_25] (rows=110 width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_24] (rows=110 width=625) > predicate:(_col5 > _col9) > Map Join Operator [MAPJOIN_23] (rows=330 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9"] > <-Reducer 3 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_21] > Select Operator [SEL_20] (rows=165 width=4) > Output:["_col0"] > Group By Operator [GBY_19] (rows=165 width=109) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_18] > PartitionCols:_col0 > Group By Operator [GBY_17] (rows=14190 width=109) > > 
Output:["_col0","_col1"],aggregations:["max(p_size)"],keys:p_type > Select Operator [SEL_16] (rows=2 width=109) > Output:["p_type","p_size"] > TableScan [TS_2] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Select Operator [SEL_22] (rows=2 width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > TableScan [TS_0] (rows=2 width=621) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_partkey","p_name","p_mfgr","p_brand","p_type","p_size","p_container","p_retailprice","p_comment"] > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
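The transformation behind HIVE-16885 in the plan above, FIL_24 sitting on top of MAPJOIN_23, relies on a simple equivalence: for an INNER join, filtering the join output with a predicate over both sides produces the same rows as evaluating that predicate as a residual inside the join. A toy check (Python, not Hive code; the tuples loosely mimic the cross-product plan shown):

```python
# For an INNER join, a filter above the join and the same predicate pushed
# inside the join as a residual yield identical results. Toy check only.
from itertools import product

part = [(1, 10), (2, 3)]   # (p_partkey, p_size)
maxes = [(5,)]             # one max(p_size) value from the subquery side

cross = list(product(part, maxes))
filtered_after = [(p, m) for (p, m) in cross if p[1] > m[0]]          # FIL above join
residual_inside = [(p, m) for p in part for m in maxes if p[1] > m[0]]  # residual in join

assert filtered_after == residual_inside == [((1, 10), (5,))]
```

This equivalence does not hold for OUTER joins, which is why residual predicates were previously restricted to them and needed separate handling.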
[jira] [Commented] (HIVE-16832) duplicate ROW__ID possible in multi insert into transactional table
[ https://issues.apache.org/jira/browse/HIVE-16832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058842#comment-16058842 ] Hive QA commented on HIVE-16832: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12873989/HIVE-16832.06.patch {color:green}SUCCESS:{color} +1 due to 15 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 188 failed/errored test(s), 10839 tests executed *Failed tests:* {noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite] (batchId=237)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_subquery] (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_vectorization] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[autoColumnStats_4] (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_all_non_partitioned] (batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_all_partitioned] (batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_orig_table] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_tmp_table] (batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_where_non_partitioned] (batchId=37)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_where_partitioned] (batchId=38)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[delete_whole_partition] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_update_delete] (batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[lateral_view_explode2] (batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[lateral_view_noalias] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_acid] (batchId=76)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_7] (batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_8] (batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_9] (batchId=75)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[masking_acid_no_masking] (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udtf_stack] (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_after_multiple_inserts] (batchId=65)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_after_multiple_inserts_special_characters] (batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_all_non_partitioned] (batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_all_partitioned] (batchId=49)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_all_types] (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_orig_table] (batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_tmp_table] (batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_two_cols] (batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_where_non_partitioned] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[update_where_partitioned] (batchId=59)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_all_non_partitioned] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_all_partitioned] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_tmp_table] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_where_non_partitioned] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_where_partitioned] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_whole_partition] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_semijoin_reduction_3] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization_acid] (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_update_delete] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lateral_view] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ptf] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ptf_streaming] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_part_update] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_table_update] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acidvec_part_update] (batchId=146)
[jira] [Commented] (HIVE-16939) metastore error: 'export: -Dproc_metastore : not a valid identifier'
[ https://issues.apache.org/jira/browse/HIVE-16939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16058818#comment-16058818 ] Ferdinand Xu commented on HIVE-16939: LGTM +1
> metastore error: 'export: -Dproc_metastore : not a valid identifier'
> --
> Key: HIVE-16939
> URL: https://issues.apache.org/jira/browse/HIVE-16939
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Affects Versions: 3.0.0
> Reporter: Fei Hui
> Assignee: Fei Hui
> Attachments: HIVE-16939.patch
>
> When I run the metastore, it reports the error below:
> {quote}
> bin/ext/metastore.sh: line 29: export: ` -Dproc_metastore ': not a valid identifier
> {quote}
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
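The "not a valid identifier" message is the classic symptom of an unquoted variable expansion in an `export` statement: when the expanded value contains spaces, word splitting turns the extra words into separate arguments, and the shell rejects each of them as a variable name. The actual line 29 of bin/ext/metastore.sh is not shown in this thread, so the sketch below is a hypothetical reconstruction of the failure mode and the usual fix; the variable names are illustrative, not taken from the script.

```shell
#!/bin/sh
# Hypothetical reconstruction of the quoting pitfall behind HIVE-16939.
# base_opts stands in for a pre-existing options variable containing spaces.
base_opts="-Xmx1g -XX:+UseG1GC"

# Buggy pattern (left commented out): the unquoted $base_opts word-splits,
# so export receives "-XX:+UseG1GC -Dproc_metastore " as a second argument
# and fails with "export: ...: not a valid identifier".
# export METASTORE_OPTS=$base_opts" -Dproc_metastore "

# Fix: quote the entire right-hand side so it stays a single assignment word.
export METASTORE_OPTS="$base_opts -Dproc_metastore"
echo "$METASTORE_OPTS"
```

The same rule applies to any `VAR=$OTHER" suffix"` construction in POSIX shell scripts: quoting only the literal suffix does not protect the expansion that precedes it.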
[jira] [Updated] (HIVE-13567) Auto-gather column stats - phase 2
[ https://issues.apache.org/jira/browse/HIVE-13567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13567: Attachment: HIVE-13567.17.patch
> Auto-gather column stats - phase 2
> --
> Key: HIVE-13567
> URL: https://issues.apache.org/jira/browse/HIVE-13567
> Project: Hive
> Issue Type: Sub-task
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Attachments: HIVE-13567.01.patch, HIVE-13567.02.patch, HIVE-13567.03.patch, HIVE-13567.04.patch, HIVE-13567.05.patch, HIVE-13567.06.patch, HIVE-13567.07.patch, HIVE-13567.08.patch, HIVE-13567.09.patch, HIVE-13567.10.patch, HIVE-13567.11.patch, HIVE-13567.12.patch, HIVE-13567.13.patch, HIVE-13567.14.patch, HIVE-13567.15.patch, HIVE-13567.16.patch, HIVE-13567.17.patch
>
> In phase 2, we are going to turn auto-gather column stats on by default. This requires updating the golden files.
[jira] [Updated] (HIVE-13567) Auto-gather column stats - phase 2
[ https://issues.apache.org/jira/browse/HIVE-13567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13567: Status: Open (was: Patch Available)
> Auto-gather column stats - phase 2
> --
> Key: HIVE-13567
> URL: https://issues.apache.org/jira/browse/HIVE-13567
> Project: Hive
> Issue Type: Sub-task
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Attachments: HIVE-13567.01.patch, HIVE-13567.02.patch, HIVE-13567.03.patch, HIVE-13567.04.patch, HIVE-13567.05.patch, HIVE-13567.06.patch, HIVE-13567.07.patch, HIVE-13567.08.patch, HIVE-13567.09.patch, HIVE-13567.10.patch, HIVE-13567.11.patch, HIVE-13567.12.patch, HIVE-13567.13.patch, HIVE-13567.14.patch, HIVE-13567.15.patch, HIVE-13567.16.patch, HIVE-13567.17.patch
>
> In phase 2, we are going to turn auto-gather column stats on by default. This requires updating the golden files.
[jira] [Updated] (HIVE-13567) Auto-gather column stats - phase 2
[ https://issues.apache.org/jira/browse/HIVE-13567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13567: Status: Patch Available (was: Open)
> Auto-gather column stats - phase 2
> --
> Key: HIVE-13567
> URL: https://issues.apache.org/jira/browse/HIVE-13567
> Project: Hive
> Issue Type: Sub-task
> Reporter: Pengcheng Xiong
> Assignee: Pengcheng Xiong
> Attachments: HIVE-13567.01.patch, HIVE-13567.02.patch, HIVE-13567.03.patch, HIVE-13567.04.patch, HIVE-13567.05.patch, HIVE-13567.06.patch, HIVE-13567.07.patch, HIVE-13567.08.patch, HIVE-13567.09.patch, HIVE-13567.10.patch, HIVE-13567.11.patch, HIVE-13567.12.patch, HIVE-13567.13.patch, HIVE-13567.14.patch, HIVE-13567.15.patch, HIVE-13567.16.patch, HIVE-13567.17.patch
>
> In phase 2, we are going to turn auto-gather column stats on by default. This requires updating the golden files.