[jira] [Comment Edited] (HIVE-17193) HoS: don't combine map works that are targets of different DPPs
[ https://issues.apache.org/jira/browse/HIVE-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214737#comment-16214737 ] Rui Li edited comment on HIVE-17193 at 10/23/17 6:57 AM: - Hi [~kellyzly], bq. how to compare the result of dpp work in the period of physical plan? We can compare the DPP works the same way as we compare other works, i.e. if two works have the same operator tree and each operator has an equivalent counterpart, then the two works can be combined. was (Author: lirui): Hi [~kellyzly], bq. how to compare the result of dpp work in the period of physical plan? We can compare the DPP works the same way as we compare other works, i.e. if two works have the same operator tree and all the each operator has an equivalent counterpart, then the two works can be combined. > HoS: don't combine map works that are targets of different DPPs > --- > > Key: HIVE-17193 > URL: https://issues.apache.org/jira/browse/HIVE-17193 > Project: Hive > Issue Type: Bug >Reporter: Rui Li >Assignee: Rui Li > > Suppose {{srcpart}} is partitioned by {{ds}}. The following query can trigger > the issue: > {code} > explain > select * from > (select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.key) > a > join > (select srcpart.ds,srcpart.key from srcpart join src on > srcpart.ds=src.value) b > on a.key=b.key; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17193) HoS: don't combine map works that are targets of different DPPs
[ https://issues.apache.org/jira/browse/HIVE-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214737#comment-16214737 ] Rui Li commented on HIVE-17193: --- Hi [~kellyzly], bq. how to compare the result of dpp work in the period of physical plan? We can compare the DPP works the same way as we compare other works, i.e. if two works have the same operator tree and each operator has an equivalent counterpart, then the two works can be combined. > HoS: don't combine map works that are targets of different DPPs > --- > > Key: HIVE-17193 > URL: https://issues.apache.org/jira/browse/HIVE-17193 > Project: Hive > Issue Type: Bug >Reporter: Rui Li >Assignee: Rui Li > > Suppose {{srcpart}} is partitioned by {{ds}}. The following query can trigger > the issue: > {code} > explain > select * from > (select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.key) > a > join > (select srcpart.ds,srcpart.key from srcpart join src on > srcpart.ds=src.value) b > on a.key=b.key; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
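The equivalence rule Rui Li states above can be sketched as a recursive walk over the two operator trees. This is a minimal illustrative sketch: the {{Op}} class and its string {{signature}} are hypothetical stand-ins, not Hive's actual Operator API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a work's operator tree; NOT Hive's real
// Operator API, just a sketch of the comparison described above.
class Op {
    final String signature;              // operator type plus its relevant config
    final List<Op> children = new ArrayList<>();

    Op(String signature) { this.signature = signature; }

    Op child(Op c) { children.add(c); return this; }

    // Two works can be combined when their trees match node for node, i.e.
    // every operator has an equivalent counterpart in the other tree.
    static boolean equivalent(Op a, Op b) {
        if (!a.signature.equals(b.signature)
                || a.children.size() != b.children.size()) {
            return false;
        }
        for (int i = 0; i < a.children.size(); i++) {
            if (!equivalent(a.children.get(i), b.children.get(i))) {
                return false;
            }
        }
        return true;
    }
}
```

With this rule, two DPP works scanning the same table with the same filter compare equal, while works whose filters differ (e.g. {{key is not null}} vs {{value is not null}}) do not.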
[jira] [Commented] (HIVE-17830) dbnotification fails to work with rdbms other than postgres
[ https://issues.apache.org/jira/browse/HIVE-17830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214731#comment-16214731 ] Daniel Dai commented on HIVE-17830: --- OK, so "SET @@session.sql_mode=ANSI_QUOTES" will be required, right? Last time I read the code, it seemed prepareTxn would be invoked every time we created a new ObjectStore. However, I must be missing something, as otherwise we would never hit the SQL syntax error. > dbnotification fails to work with rdbms other than postgres > --- > > Key: HIVE-17830 > URL: https://issues.apache.org/jira/browse/HIVE-17830 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: anishek >Assignee: Daniel Dai >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-17830.0.patch, HIVE-17830.1.patch > > > As part of HIVE-17721 we changed the direct SQL to acquire the lock for > postgres as > {code} > select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update; > {code} > However, this breaks other databases, and we have to use different SQL > statements for different databases. > For postgres use > {code} > select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update; > {code} > For SQLServer > {code} > select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" with (updlock); > {code} > For other databases > {code} > select NEXT_EVENT_ID from NOTIFICATION_SEQUENCE for update; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
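The per-database branching described in this issue could look roughly like the following. The SQL strings are taken from the issue itself; the enum and class names are illustrative, not Hive's actual DatabaseProduct handling.

```java
// Sketch of picking the row-lock statement per backing RDBMS, as proposed
// in this issue. DbType and NotificationLockSql are illustrative names.
enum DbType { POSTGRES, SQLSERVER, MYSQL, ORACLE, DERBY }

class NotificationLockSql {
    static String lockStatement(DbType db) {
        switch (db) {
            case POSTGRES:
                // Postgres: quoted identifiers, FOR UPDATE row lock.
                return "select \"NEXT_EVENT_ID\" from \"NOTIFICATION_SEQUENCE\" for update";
            case SQLSERVER:
                // SQL Server: UPDLOCK table hint instead of FOR UPDATE.
                return "select \"NEXT_EVENT_ID\" from \"NOTIFICATION_SEQUENCE\" with (updlock)";
            default:
                // MySQL and the rest: unquoted identifiers, FOR UPDATE.
                // (For MySQL, the double-quoted form only parses under ANSI_QUOTES,
                // which is exactly the session-mode concern raised in this thread.)
                return "select NEXT_EVENT_ID from NOTIFICATION_SEQUENCE for update";
        }
    }
}
```

Centralizing the statement choice like this keeps the "which dialect am I talking to" decision in one place instead of scattering quoted/unquoted SQL through the callers.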
[jira] [Commented] (HIVE-17193) HoS: don't combine map works that are targets of different DPPs
[ https://issues.apache.org/jira/browse/HIVE-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214728#comment-16214728 ] liyunzhang commented on HIVE-17193: --- [~lirui]: {quote} 1. The simplest solution is, if the DPP works' IDs (tracked by the target map works) are different, then we consider the target map works are different and don't combine them. 2. Another solution is we walk the parent tasks first, and combine equivalent DPP works. Two DPP works can be considered equivalent as long as they output same records. {quote} For #1, it can be implemented based on the current code. For #2, how to compare the result of dpp work in the period of physical plan? Do you mean directly comparing the estimated data size (Statistics: Num rows: 58 Data size: 5812)? {code} Map 9 Map Operator Tree: TableScan alias: src Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: value is not null (type: boolean) Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: value (type: string) outputColumnNames: _col0 Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: string) outputColumnNames: _col0 Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Group By Operator keys: _col0 (type: string) mode: hash outputColumnNames: _col0 Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Spark Partition Pruning Sink Operator Target column: ds (string) partition key expr: ds Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE target work: Map 5 {code} {code} Map 8 Map Operator Tree: TableScan alias: src Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: key is not null (type: boolean) Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: string) outputColumnNames: _col0 Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Group By Operator keys: _col0 (type: string) mode: hash outputColumnNames: _col0 Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Spark Partition Pruning Sink Operator Target column: ds (string) partition key expr: ds Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE target work: Map 1 {code} > HoS: don't combine map works that are targets of different DPPs > --- > > Key: HIVE-17193 > URL: https://issues.apache.org/jira/browse/HIVE-17193 > Project: Hive > Issue Type: Bug >Reporter: Rui Li >Assignee: Rui Li > > Suppose {{srcpart}} is partitioned by {{ds}}. The following query can trigger > the issue: > {code} > explain > select * from > (select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.key) > a > join > (select srcpart.ds,srcpart.key from srcpart join src on > srcpart.ds=src.value) b > on a.key=b.key; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17830) dbnotification fails to work with rdbms other than postgres
[ https://issues.apache.org/jira/browse/HIVE-17830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214723#comment-16214723 ] anishek commented on HIVE-17830: Thanks [~daijy] for this patch. A quick question: looking at the code which sets ANSI_QUOTES, it's in *MetastoreDirectSql.java* {code} public void prepareTxn() throws MetaException { if (dbType != DatabaseProduct.MYSQL) return; try { assert pm.currentTransaction().isActive(); // must be inside tx together with queries executeNoResult("SET @@session.sql_mode=ANSI_QUOTES"); } catch (SQLException sqlEx) { throw new MetaException("Error setting ansi quotes: " + sqlEx.getMessage()); } } {code} Here we are setting the sql_mode only for the *session* and not *globally*. I just ran the below on a MySQL server without modifying the sql_mode {code} mysql> select "NEXT_EVENT_ID" from NOTIFICATION_SEQUENCE; +---+ | NEXT_EVENT_ID | +---+ | NEXT_EVENT_ID | +---+ 1 row in set (0.00 sec) {code} Since we use connection pooling, depending on which connection is used to execute the above statement we will get different results, won't we? Maybe I am missing something here. 
cc [~thejas] > dbnotification fails to work with rdbms other than postgres > --- > > Key: HIVE-17830 > URL: https://issues.apache.org/jira/browse/HIVE-17830 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: anishek >Assignee: Daniel Dai >Priority: Critical > Fix For: 3.0.0 > > Attachments: HIVE-17830.0.patch, HIVE-17830.1.patch > > > As part of HIVE-17721 we changed the direct SQL to acquire the lock for > postgres as > {code} > select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update; > {code} > However, this breaks other databases, and we have to use different SQL > statements for different databases. > For postgres use > {code} > select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update; > {code} > For SQLServer > {code} > select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" with (updlock); > {code} > For other databases > {code} > select NEXT_EVENT_ID from NOTIFICATION_SEQUENCE for update; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-16198) Vectorize GenericUDFIndex for ARRAY
[ https://issues.apache.org/jira/browse/HIVE-16198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214720#comment-16214720 ] Colin Ma commented on HIVE-16198: - Hi [~teddy.choi], [~mmccline], because of the problem in HIVE-17133, I rebased the patch on HIVE-2.3.0 with some minor changes. To evaluate the performance improvement, the following table is used: {code} hive> describe temperature_orc_5g; t_date string city string temperatures array hive> show tblproperties temperature_orc_5g; COLUMN_STATS_ACCURATE {"BASIC_STATS":"true"} numFiles 20 numRows 1 rawDataSize 241 totalSize 1793960785 {code} Tested with Hive on Spark, with the SQL {color:#59afe1}select city, avg(temperatures\[0\]), avg(temperatures\[5\]) from temperature_orc_5g where temperatures\[2\] > 20 group by city limit 10{color}, the results are as follows: || ||Disable vectorization||Enable vectorization|| |execution time|{color:#d04437}34s{color}|{color:#14892c}26s{color}| Specifically, the detailed time costs for the same task, which processes 15154763 rows, are in the following table: || ||Disable vectorization||Enable vectorization|| |Time with RecordReader|{color:#d04437}8.9s{color}|{color:#14892c}5.9s{color}| |Time with filter operator|{color:#d04437}3.1s{color}|{color:#14892c}0.1s{color}| |Time with groupBy and followup operators|10.8s|11.5s| I think the improvement is obvious. Do you know why the patch hasn't been committed yet? Thanks. > Vectorize GenericUDFIndex for ARRAY > --- > > Key: HIVE-16198 > URL: https://issues.apache.org/jira/browse/HIVE-16198 > Project: Hive > Issue Type: Sub-task > Components: UDF, Vectorization >Reporter: Teddy Choi >Assignee: Teddy Choi > Attachments: HIVE-16198.1.patch, HIVE-16198.2.patch, > HIVE-16198.3.patch > > > Vectorize GenericUDFIndex for array data type. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
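The vectorized index lookup being benchmarked above boils down to a tight loop over a batch of lists. Below is a minimal sketch using a simplified flat offset/length layout; Hive's ListColumnVector is organized along similar lines, but this is not its actual API.

```java
// Simplified model of a vectorized list column: each row's list is an
// offset/length slice of one flat child array. Extracting list[index] for
// every row in the batch is then a single branch-light loop, which is the
// kind of work the vectorized GenericUDFIndex enables. Illustrative only.
class ListBatch {
    long[] offsets;   // start of each row's list within `child`
    long[] lengths;   // number of elements in each row's list
    long[] child;     // flattened element values for the whole batch

    // out[i] = list[i][index], or nullMarker when the index is out of bounds.
    static long[] indexColumn(ListBatch batch, int index, int rows, long nullMarker) {
        long[] out = new long[rows];
        for (int i = 0; i < rows; i++) {
            if (index >= 0 && index < batch.lengths[i]) {
                out[i] = batch.child[(int) (batch.offsets[i] + index)];
            } else {
                out[i] = nullMarker;
            }
        }
        return out;
    }
}
```

Compared with the row-mode path, there is no per-row object inspection inside the loop, which is consistent with the filter-operator speedup (3.1s to 0.1s) reported above.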
[jira] [Commented] (HIVE-17874) Parquet vectorization fails on tables with complex columns when there are no projected columns
[ https://issues.apache.org/jira/browse/HIVE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214719#comment-16214719 ] Ferdinand Xu commented on HIVE-17874: - Hi [~vihangk1], can you help check the failed test cases? > Parquet vectorization fails on tables with complex columns when there are no > projected columns > -- > > Key: HIVE-17874 > URL: https://issues.apache.org/jira/browse/HIVE-17874 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17874.01-branch-2.patch, HIVE-17874.01.patch, > HIVE-17874.02.patch > > > When a parquet table contains an unsupported type like {{Map}}, {{LIST}} or > {{UNION}}, simple queries like {{select count(*) from table}} fail with > {{unsupported type exception}} even though the vectorized reader doesn't really > need to read the complex type into batches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17874) Parquet vectorization fails on tables with complex columns when there are no projected columns
[ https://issues.apache.org/jira/browse/HIVE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214690#comment-16214690 ] Hive QA commented on HIVE-17874: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12893482/HIVE-17874.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 11317 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_parquet_projection] (batchId=42) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=158) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_parquet_projection] (batchId=121) org.apache.hadoop.hive.cli.TestSparkPerfCliDriver.testCliDriver[query39] (batchId=243) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hadoop.hive.ql.io.parquet.TestVectorizedColumnReader.testNullSplitForParquetReader (batchId=262) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=221) org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch (batchId=269) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7441/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7441/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7441/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically 
generated. ATTACHMENT ID: 12893482 - PreCommit-HIVE-Build > Parquet vectorization fails on tables with complex columns when there are no > projected columns > -- > > Key: HIVE-17874 > URL: https://issues.apache.org/jira/browse/HIVE-17874 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17874.01-branch-2.patch, HIVE-17874.01.patch, > HIVE-17874.02.patch > > > When a parquet table contains an unsupported type like {{Map}}, {{LIST}} or > {{UNION}}, simple queries like {{select count(*) from table}} fail with > {{unsupported type exception}} even though the vectorized reader doesn't really > need to read the complex type into batches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17874) Parquet vectorization fails on tables with complex columns when there are no projected columns
[ https://issues.apache.org/jira/browse/HIVE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214670#comment-16214670 ] Ferdinand Xu commented on HIVE-17874: - LGTM +1, pending the pre-commit tests. > Parquet vectorization fails on tables with complex columns when there are no > projected columns > -- > > Key: HIVE-17874 > URL: https://issues.apache.org/jira/browse/HIVE-17874 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17874.01-branch-2.patch, HIVE-17874.01.patch, > HIVE-17874.02.patch > > > When a parquet table contains an unsupported type like {{Map}}, {{LIST}} or > {{UNION}}, simple queries like {{select count(*) from table}} fail with > {{unsupported type exception}} even though the vectorized reader doesn't really > need to read the complex type into batches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17193) HoS: don't combine map works that are targets of different DPPs
[ https://issues.apache.org/jira/browse/HIVE-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214644#comment-16214644 ] liyunzhang_intel edited comment on HIVE-17193 at 10/23/17 5:24 AM: --- I can reproduce after disabling cbo {code} set hive.explain.user=false; set hive.spark.dynamic.partition.pruning=true; set hive.tez.dynamic.partition.pruning=true; set hive.auto.convert.join=false; set hive.cbo.enable=false; explain select * from (select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.key) a join (select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.value) b on a.key=b.key; {code} the explain {code} STAGE DEPENDENCIES: Stage-2 is a root stage Stage-1 depends on stages: Stage-2 Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-2 Spark DagName: root_20171023004308_4b3c304e-3deb-4193-846d-12cf9e6a50ab:2 Vertices: Map 8 Map Operator Tree: TableScan alias: src Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: key is not null (type: boolean) Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Group By Operator keys: _col0 (type: string) mode: hash outputColumnNames: _col0 Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Spark Partition Pruning Sink Operator Target column: ds (string) partition key expr: ds Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE target work: Map 1 Stage: Stage-1 Spark Edges: Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 4 (PARTITION-LEVEL SORT, 1) Reducer 3 <- Reducer 2 (PARTITION-LEVEL SORT, 1), Reducer 6 (PARTITION-LEVEL SORT, 1) Reducer 6 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 7 (PARTITION-LEVEL SORT, 1) DagName: 
root_20171023004308_4b3c304e-3deb-4193-846d-12cf9e6a50ab:1 Vertices: Map 1 Map Operator Tree: TableScan alias: srcpart Statistics: Num rows: 232 Data size: 23248 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: key is not null (type: boolean) Statistics: Num rows: 232 Data size: 23248 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: ds (type: string) sort order: + Map-reduce partition columns: ds (type: string) Statistics: Num rows: 232 Data size: 23248 Basic stats: COMPLETE Column stats: NONE value expressions: key (type: string) Map 4 Map Operator Tree: TableScan alias: src Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: key is not null (type: boolean) Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: key (type: string) sort order: + Map-reduce partition columns: key (type: string) Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Map 7 Map Operator Tree: TableScan alias: src Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: value is not null (type: boolean) Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: value (type: string) sort order: + Map-reduce partition columns: value (type: string) Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Reducer 2 Reduce Operator Tree: Join Operator condition map:
[jira] [Updated] (HIVE-16948) Invalid explain when running dynamic partition pruning query in Hive On Spark
[ https://issues.apache.org/jira/browse/HIVE-16948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel updated HIVE-16948: Attachment: 17193_compare_RS_in_Map_5_1.PNG > Invalid explain when running dynamic partition pruning query in Hive On Spark > - > > Key: HIVE-16948 > URL: https://issues.apache.org/jira/browse/HIVE-16948 > Project: Hive > Issue Type: Bug >Reporter: liyunzhang_intel >Assignee: liyunzhang_intel > Fix For: 3.0.0 > > Attachments: 17193_compare_RS_in_Map_5_1.PNG, HIVE-16948.2.patch, > HIVE-16948.5.patch, HIVE-16948.6.patch, HIVE-16948.7.patch, HIVE-16948.patch, > HIVE-16948_1.patch > > > in > [union_subquery.q|https://github.com/apache/hive/blob/master/ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning.q#L107] > in spark_dynamic_partition_pruning.q > {code} > set hive.optimize.ppd=true; > set hive.ppd.remove.duplicatefilters=true; > set hive.spark.dynamic.partition.pruning=true; > set hive.optimize.metadataonly=false; > set hive.optimize.index.filter=true; > set hive.strict.checks.cartesian.product=false; > explain select ds from (select distinct(ds) as ds from srcpart union all > select distinct(ds) as ds from srcpart) s where s.ds in (select > max(srcpart.ds) from srcpart union all select min(srcpart.ds) from srcpart); > {code} > explain > {code} > STAGE DEPENDENCIES: > Stage-2 is a root stage > Stage-1 depends on stages: Stage-2 > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-2 > Spark > Edges: > Reducer 11 <- Map 10 (GROUP, 1) > Reducer 13 <- Map 12 (GROUP, 1) > DagName: root_20170622231525_20a777e5-e659-4138-b605-65f8395e18e2:2 > Vertices: > Map 10 > Map Operator Tree: > TableScan > alias: srcpart > Statistics: Num rows: 1 Data size: 23248 Basic stats: > PARTIAL Column stats: NONE > Select Operator > expressions: ds (type: string) > outputColumnNames: ds > Statistics: Num rows: 1 Data size: 23248 Basic stats: > PARTIAL Column stats: NONE > Group By Operator > aggregations: 
max(ds) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > value expressions: _col0 (type: string) > Map 12 > Map Operator Tree: > TableScan > alias: srcpart > Statistics: Num rows: 1 Data size: 23248 Basic stats: > PARTIAL Column stats: NONE > Select Operator > expressions: ds (type: string) > outputColumnNames: ds > Statistics: Num rows: 1 Data size: 23248 Basic stats: > PARTIAL Column stats: NONE > Group By Operator > aggregations: min(ds) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > Reduce Output Operator > sort order: > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > value expressions: _col0 (type: string) > Reducer 11 > Reduce Operator Tree: > Group By Operator > aggregations: max(VALUE._col0) > mode: mergepartial > outputColumnNames: _col0 > Statistics: Num rows: 1 Data size: 184 Basic stats: COMPLETE > Column stats: NONE > Filter Operator > predicate: _col0 is not null (type: boolean) > Statistics: Num rows: 1 Data size: 184 Basic stats: > COMPLETE Column stats: NONE > Group By Operator > keys: _col0 (type: string) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 2 Data size: 368 Basic stats: > COMPLETE Column stats: NONE > Select Operator > expressions: _col0 (type: string) > outputColumnNames: _col0 > Statistics: Num rows: 2 Data size: 368 Basic stats: > COMPLETE Column stats: NONE > Group
[jira] [Commented] (HIVE-17193) HoS: don't combine map works that are targets of different DPPs
[ https://issues.apache.org/jira/browse/HIVE-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214644#comment-16214644 ] liyunzhang_intel commented on HIVE-17193: - I can reproduce after disabling cbo {code} set hive.explain.user=false; set hive.spark.dynamic.partition.pruning=true; set hive.tez.dynamic.partition.pruning=true; set hive.auto.convert.join=false; set hive.cbo.enable=false; explain select * from (select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.key) a join (select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.value) b on a.key=b.key; {code} the explain {code} STAGE DEPENDENCIES: Stage-2 is a root stage Stage-1 depends on stages: Stage-2 Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-2 Spark DagName: root_20171023004308_4b3c304e-3deb-4193-846d-12cf9e6a50ab:2 Vertices: Map 8 Map Operator Tree: TableScan alias: src Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: key is not null (type: boolean) Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Group By Operator keys: _col0 (type: string) mode: hash outputColumnNames: _col0 Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Spark Partition Pruning Sink Operator Target column: ds (string) partition key expr: ds Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE target work: Map 1 Stage: Stage-1 Spark Edges: Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 4 (PARTITION-LEVEL SORT, 1) Reducer 3 <- Reducer 2 (PARTITION-LEVEL SORT, 1), Reducer 6 (PARTITION-LEVEL SORT, 1) Reducer 6 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 7 (PARTITION-LEVEL SORT, 1) DagName: 
root_20171023004308_4b3c304e-3deb-4193-846d-12cf9e6a50ab:1 Vertices: Map 1 Map Operator Tree: TableScan alias: srcpart Statistics: Num rows: 232 Data size: 23248 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: key is not null (type: boolean) Statistics: Num rows: 232 Data size: 23248 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: ds (type: string) sort order: + Map-reduce partition columns: ds (type: string) Statistics: Num rows: 232 Data size: 23248 Basic stats: COMPLETE Column stats: NONE value expressions: key (type: string) Map 4 Map Operator Tree: TableScan alias: src Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: key is not null (type: boolean) Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: key (type: string) sort order: + Map-reduce partition columns: key (type: string) Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Map 7 Map Operator Tree: TableScan alias: src Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: value is not null (type: boolean) Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: value (type: string) sort order: + Map-reduce partition columns: value (type: string) Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE Reducer 2 Reduce Operator Tree: Join Operator condition map: Inner Join 0 to 1
[jira] [Updated] (HIVE-17874) Parquet vectorization fails on tables with complex columns when there are no projected columns
[ https://issues.apache.org/jira/browse/HIVE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-17874: --- Attachment: HIVE-17874.02.patch > Parquet vectorization fails on tables with complex columns when there are no > projected columns > -- > > Key: HIVE-17874 > URL: https://issues.apache.org/jira/browse/HIVE-17874 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17874.01-branch-2.patch, HIVE-17874.01.patch, > HIVE-17874.02.patch > > > When a parquet table contains an unsupported type like {{Map}}, {{LIST}} or > {{UNION}}, simple queries like {{select count(*) from table}} fail with > {{unsupported type exception}} even though the vectorized reader doesn't really > need to read the complex type into batches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17874) Parquet vectorization fails on tables with complex columns when there are no projected columns
[ https://issues.apache.org/jira/browse/HIVE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214620#comment-16214620 ] Vihang Karajgaonkar commented on HIVE-17874: Thanks for the review [~Ferd]. I made the changes as you suggested. I moved {{colsToInclude = ColumnProjectionUtils.getReadColumnIDs(conf);}} into the {{initialize}} method because I got rid of the unnecessary field {{indexColumnsWanted}} and reused {{colsToInclude}} instead. I have moved {{rbCtx = Utilities.getVectorizedRowBatchCtx(conf);}} into the {{initialize}} method as well, as you suggested. Also updated the comment and removed the unnecessary diff. Feel free to let me know if you want me to publish the patch on RB as well. > Parquet vectorization fails on tables with complex columns when there are no > projected columns > -- > > Key: HIVE-17874 > URL: https://issues.apache.org/jira/browse/HIVE-17874 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17874.01-branch-2.patch, HIVE-17874.01.patch, > HIVE-17874.02.patch > > > When a parquet table contains an unsupported type like {{Map}}, {{LIST}} or > {{UNION}}, simple queries like {{select count(*) from table}} fail with > {{unsupported type exception}} even though the vectorized reader doesn't really > need to read the complex type into batches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
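The projection check being discussed can be sketched as a simple guard: only columns present in the projection list get vectorized readers, so an unprojected complex column is never read into batches. The class and method names below are illustrative, not the patch's actual code.

```java
import java.util.List;

// Sketch of the guard this patch is about: build a vectorized reader only
// for columns that appear in the projection list, so an unprojected complex
// column (map/list/union) never has to be decoded into batches.
// ProjectionGuard and needsReader are hypothetical names.
class ProjectionGuard {
    static boolean needsReader(int colId, List<Integer> colsToInclude) {
        // select count(*) projects no columns at all, so nothing is read
        // and no "unsupported type" check should ever fire.
        return colsToInclude != null && colsToInclude.contains(colId);
    }
}
```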
[jira] [Commented] (HIVE-17193) HoS: don't combine map works that are targets of different DPPs
[ https://issues.apache.org/jira/browse/HIVE-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214617#comment-16214617 ] Rui Li commented on HIVE-17193: --- [~kellyzly], the problem is that the map works for {{srcpart}} (in your case Map 1 and Map 5) are combined, while they shouldn't be, because they're targets of different DPPs and are therefore likely to output different results. I think you can disable CBO to see if the issue can be reproduced. Another way is to change the outer query into a union instead of a join. > HoS: don't combine map works that are targets of different DPPs > --- > > Key: HIVE-17193 > URL: https://issues.apache.org/jira/browse/HIVE-17193 > Project: Hive > Issue Type: Bug >Reporter: Rui Li >Assignee: Rui Li > > Suppose {{srcpart}} is partitioned by {{ds}}. The following query can trigger > the issue: > {code} > explain > select * from > (select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.key) > a > join > (select srcpart.ds,srcpart.key from srcpart join src on > srcpart.ds=src.value) b > on a.key=b.key; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
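Solution #1 from the discussion above (treat map works as non-combinable when the sets of DPP works targeting them differ) can be sketched as a set comparison keyed by work ID; all names here are illustrative, not Hive's actual classes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of the simpler fix discussed in this issue: before combining two
// equivalent map works, also require that they are targets of the same set
// of DPP works. Map works pruned by different DPPs may scan different
// partitions, so combining them is unsafe even when their operator trees
// match. DppGuard is a hypothetical name.
class DppGuard {
    // map work name -> IDs of the DPP works that target it
    final Map<String, Set<String>> dppSources = new HashMap<>();

    void addDppSource(String mapWork, String dppWorkId) {
        dppSources.computeIfAbsent(mapWork, k -> new HashSet<>()).add(dppWorkId);
    }

    boolean combinable(String workA, String workB) {
        Set<String> a = dppSources.getOrDefault(workA, new HashSet<>());
        Set<String> b = dppSources.getOrDefault(workB, new HashSet<>());
        return a.equals(b);
    }
}
```

In the query from this issue, Map 1 and Map 5 each receive pruning output from a different DPP work (one keyed on src.key, one on src.value), so a guard like this would refuse to combine them.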
[jira] [Commented] (HIVE-17193) HoS: don't combine map works that are targets of different DPPs
[ https://issues.apache.org/jira/browse/HIVE-17193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214603#comment-16214603 ] liyunzhang_intel commented on HIVE-17193: - [~lirui]: I remember this problem from when I developed HIVE-16948, but I cannot reproduce it on Hive (commit a51ae9c) now:
{code}
set hive.explain.user=false;
set hive.spark.dynamic.partition.pruning=true;
set hive.tez.dynamic.partition.pruning=true;
set hive.auto.convert.join=false;
explain
select * from
(select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.key) a
join
(select srcpart.ds,srcpart.key from srcpart join src on srcpart.ds=src.value) b
on a.key=b.key;
{code}
The explain output:
{code}
STAGE DEPENDENCIES:
  Stage-2 is a root stage
  Stage-1 depends on stages: Stage-2
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-2
    Spark
      DagName: root_20171022233200_990c146c-b49f-49b9-9a5b-a0028e34f200:2
      Vertices:
        Map 8
            Map Operator Tree:
                TableScan
                  alias: src
                  Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: key is not null (type: boolean)
                    Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: key (type: string)
                      outputColumnNames: _col0
                      Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
                      Select Operator
                        expressions: _col0 (type: string)
                        outputColumnNames: _col0
                        Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
                        Group By Operator
                          keys: _col0 (type: string)
                          mode: hash
                          outputColumnNames: _col0
                          Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
                          Spark Partition Pruning Sink Operator
                            Target column: ds (string)
                            partition key expr: ds
                            Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
                            target work: Map 1
        Map 9
            Map Operator Tree:
                TableScan
                  alias: src
                  Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: value is not null (type: boolean)
                    Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: value (type: string)
                      outputColumnNames: _col0
                      Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
                      Select Operator
                        expressions: _col0 (type: string)
                        outputColumnNames: _col0
                        Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
                        Group By Operator
                          keys: _col0 (type: string)
                          mode: hash
                          outputColumnNames: _col0
                          Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
                          Spark Partition Pruning Sink Operator
                            Target column: ds (string)
                            partition key expr: ds
                            Statistics: Num rows: 58 Data size: 5812 Basic stats: COMPLETE Column stats: NONE
                            target work: Map 5

  Stage: Stage-1
    Spark
      Edges:
        Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1), Map 4 (PARTITION-LEVEL SORT, 1)
        Reducer 3 <- Reducer 2 (PARTITION-LEVEL SORT, 1), Reducer 6 (PARTITION-LEVEL SORT, 1)
        Reducer 6 <- Map 5 (PARTITION-LEVEL SORT, 1), Map 7 (PARTITION-LEVEL SORT, 1)
      DagName: root_20171022233200_990c146c-b49f-49b9-9a5b-a0028e34f200:1
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: srcpart
                  Statistics: Num rows: 232 Data size: 23248 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: key is not null (type: boolean)
                    Statistics: Num rows: 232 Data size: 23248 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: key (type: string), ds (type: string)
[jira] [Commented] (HIVE-17259) Hive JDBC does not recognize UNIONTYPE columns
[ https://issues.apache.org/jira/browse/HIVE-17259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214577#comment-16214577 ] Ashutosh Chauhan commented on HIVE-17259: - [~pvillard] Patch didn't apply cleanly. You need to rebase the patch and upload again. > Hive JDBC does not recognize UNIONTYPE columns > -- > > Key: HIVE-17259 > URL: https://issues.apache.org/jira/browse/HIVE-17259 > Project: Hive > Issue Type: Bug > Components: Beeline, JDBC > Environment: Hive 1.2.1000.2.6.1.0-129 > Beeline version 1.2.1000.2.6.1.0-129 by Apache Hive >Reporter: Pierre Villard >Assignee: Pierre Villard > Attachments: HIVE-17259.patch > > > Hive JDBC does not recognize UNIONTYPE columns. > I've an external table backed by an avro schema containing a union type field. > {noformat} > "name" : "value", > "type" : [ "int", "string", "null" ] > {noformat} > When describing the table I've: > {noformat} > describe test_table; > +---+---+--+--+ > | col_name | data_type > | comment | > +---+---+--+--+ > | description | string > | | > | name | string > | | > | value | uniontype > | | > +---+---+--+--+ > {noformat} > When doing a select query over the data using the Hive CLI, it works: > {noformat} > hive> select value from test_table; > OK > {0:10} > {0:10} > {0:9} > {0:9} > ... > {noformat} > But when using beeline, it fails: > {noformat} > 0: jdbc:hive2://> select * from test_table; > Error: Unrecognized column type: UNIONTYPE (state=,code=0) > {noformat} > By applying the patch provided with this JIRA, the command succeeds and > return the expected output. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17259) Hive JDBC does not recognize UNIONTYPE columns
[ https://issues.apache.org/jira/browse/HIVE-17259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-17259: Status: Open (was: Patch Available) > Hive JDBC does not recognize UNIONTYPE columns > -- > > Key: HIVE-17259 > URL: https://issues.apache.org/jira/browse/HIVE-17259 > Project: Hive > Issue Type: Bug > Components: Beeline, JDBC > Environment: Hive 1.2.1000.2.6.1.0-129 > Beeline version 1.2.1000.2.6.1.0-129 by Apache Hive >Reporter: Pierre Villard >Assignee: Pierre Villard > Attachments: HIVE-17259.patch > > > Hive JDBC does not recognize UNIONTYPE columns. > I've an external table backed by an avro schema containing a union type field. > {noformat} > "name" : "value", > "type" : [ "int", "string", "null" ] > {noformat} > When describing the table I've: > {noformat} > describe test_table; > +---+---+--+--+ > | col_name | data_type > | comment | > +---+---+--+--+ > | description | string > | | > | name | string > | | > | value | uniontype > | | > +---+---+--+--+ > {noformat} > When doing a select query over the data using the Hive CLI, it works: > {noformat} > hive> select value from test_table; > OK > {0:10} > {0:10} > {0:9} > {0:9} > ... > {noformat} > But when using beeline, it fails: > {noformat} > 0: jdbc:hive2://> select * from test_table; > Error: Unrecognized column type: UNIONTYPE (state=,code=0) > {noformat} > By applying the patch provided with this JIRA, the command succeeds and > return the expected output. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17696) Vectorized reader does not seem to be pushing down projection columns in certain code paths
[ https://issues.apache.org/jira/browse/HIVE-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214566#comment-16214566 ] Hive QA commented on HIVE-17696: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12892984/HIVE-17696.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 11315 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=156) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=158) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_partitioned_date_time] (batchId=163) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=221) org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch (batchId=269) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=228) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7440/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7440/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7440/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12892984 - PreCommit-HIVE-Build
> Vectorized reader does not seem to be pushing down projection columns in certain code paths
> ---
>
> Key: HIVE-17696
> URL: https://issues.apache.org/jira/browse/HIVE-17696
> Project: Hive
> Issue Type: Sub-task
> Reporter: Vihang Karajgaonkar
> Assignee: Ferdinand Xu
> Attachments: HIVE-17696.patch
>
> This is the code snippet from {{VectorizedParquetRecordReader.java}}:
> {noformat}
> MessageType tableSchema;
> if (indexAccess) {
>   List<Integer> indexSequence = new ArrayList<>();
>   // Generates a sequence list of indexes
>   for (int i = 0; i < columnNamesList.size(); i++) {
>     indexSequence.add(i);
>   }
>   tableSchema = DataWritableReadSupport.getSchemaByIndex(fileSchema, columnNamesList,
>       indexSequence);
> } else {
>   tableSchema = DataWritableReadSupport.getSchemaByName(fileSchema, columnNamesList,
>       columnTypesList);
> }
> indexColumnsWanted = ColumnProjectionUtils.getReadColumnIDs(configuration);
> if (!ColumnProjectionUtils.isReadAllColumns(configuration) && !indexColumnsWanted.isEmpty()) {
>   requestedSchema = DataWritableReadSupport.getSchemaByIndex(tableSchema, columnNamesList,
>       indexColumnsWanted);
> } else {
>   requestedSchema = fileSchema;
> }
> this.reader = new ParquetFileReader(
>     configuration, footer.getFileMetaData(), file, blocks, requestedSchema.getColumns());
> {noformat}
> A couple of things to notice here:
> Most of this code is duplicated from the {{DataWritableReadSupport.init()}} method.
> The else condition passes in fileSchema instead of using tableSchema like we do in the DataWritableReadSupport.init() method. Does this cause projection columns to be missed when we read parquet files? We should probably just reuse the ReadContext returned from the {{DataWritableReadSupport.init()}} method here.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
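The projection logic in the snippet above can be approximated with a toy sketch. This is an illustrative Python model only, assuming a schema is just a list of (name, type) pairs; the real code operates on Parquet's MessageType, and the helpers here are loose stand-ins for getSchemaByIndex and the requestedSchema selection, not Hive's actual methods.

```python
# Toy model of index-based schema pruning: keep only the columns whose
# indexes were requested, in request order.

def get_schema_by_index(table_schema, column_names, wanted_indexes):
    fields = dict(table_schema)
    return [(column_names[i], fields[column_names[i]]) for i in wanted_indexes]

def requested_schema(table_schema, file_schema, column_names,
                     wanted_indexes, read_all_columns):
    # The concern raised in this issue: when no projection applies, the
    # reader falls back to fileSchema rather than tableSchema.
    if not read_all_columns and wanted_indexes:
        return get_schema_by_index(table_schema, column_names, wanted_indexes)
    return file_schema

# Hypothetical three-column table used for illustration.
table = [("a", "int"), ("b", "string"), ("c", "map<int,string>")]
```

With a projection of index 1 only column b survives; with read-all the fallback schema is returned unchanged, which is exactly where the fileSchema-vs-tableSchema question above matters.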
[jira] [Commented] (HIVE-17873) External LLAP client: allow same handleID to be used more than once
[ https://issues.apache.org/jira/browse/HIVE-17873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214549#comment-16214549 ] Gunther Hagleitner commented on HIVE-17873: --- LGTM +1 > External LLAP client: allow same handleID to be used more than once > --- > > Key: HIVE-17873 > URL: https://issues.apache.org/jira/browse/HIVE-17873 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17873.1.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17874) Parquet vectorization fails on tables with complex columns when there are no projected columns
[ https://issues.apache.org/jira/browse/HIVE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214519#comment-16214519 ] Hive QA commented on HIVE-17874: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12893472/HIVE-17874.01.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 11317 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=47) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_parquet_projection] (batchId=42) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=156) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=158) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=101) org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[vectorization_parquet_projection] (batchId=121) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=221) org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch (batchId=269) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7439/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7439/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7439/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited 
with: TestsFailedException: 10 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12893472 - PreCommit-HIVE-Build > Parquet vectorization fails on tables with complex columns when there are no > projected columns > -- > > Key: HIVE-17874 > URL: https://issues.apache.org/jira/browse/HIVE-17874 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17874.01-branch-2.patch, HIVE-17874.01.patch > > > When a parquet table contains an unsupported type like {{Map}}, {{LIST}} or > {{UNION}} simple queries like {{select count(*) from table}} fails with > {{unsupported type exception}} even though vectorized reader doesn't really > need read the complex type into batches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17874) Parquet vectorization fails on tables with complex columns when there are no projected columns
[ https://issues.apache.org/jira/browse/HIVE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214512#comment-16214512 ] Ferdinand Xu commented on HIVE-17874: - Thank you for the patch. Just a few minor comments. Is the last line of the comment not needed, or is it half done?
{code:java}
+    //if there are colsToInclude initialize each columnReader
{code}
I see the following is moved from the constructor to the initialize method. Is it just to clean up the code? If so, I'm not sure whether we can move {{rbCtx = Utilities.getVectorizedRowBatchCtx(conf);}} as well.
{code:java}
colsToInclude = ColumnProjectionUtils.getReadColumnIDs(conf);
{code}
Unnecessary change in the following line.
{code:java}
+  private VectorizedColumnReader buildVectorizedParquetReader(
{code}
> Parquet vectorization fails on tables with complex columns when there are no projected columns
> --
>
> Key: HIVE-17874
> URL: https://issues.apache.org/jira/browse/HIVE-17874
> Project: Hive
> Issue Type: Sub-task
> Affects Versions: 2.2.0
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17874.01-branch-2.patch, HIVE-17874.01.patch
>
> When a parquet table contains an unsupported type like {{Map}}, {{LIST}} or {{UNION}}, simple queries like {{select count(*) from table}} fail with {{unsupported type exception}} even though the vectorized reader doesn't really need to read the complex type into batches.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17876) row.serde.deserialize broken for non-vectorized file inputformats
[ https://issues.apache.org/jira/browse/HIVE-17876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214500#comment-16214500 ] Vihang Karajgaonkar commented on HIVE-17876: CC: [~mmccline] > row.serde.deserialize broken for non-vectorized file inputformats > - > > Key: HIVE-17876 > URL: https://issues.apache.org/jira/browse/HIVE-17876 > Project: Hive > Issue Type: Bug >Affects Versions: 3.0.0, 2.4.0 >Reporter: Vihang Karajgaonkar > > Vectorization using {{hive.vectorized.use.row.serde.deserialize}} errors out > for both Orc and Parquet input format. > Steps to reproduce: > {noformat} > set hive.fetch.task.conversion=none; > set hive.vectorized.use.row.serde.deserialize=true; > set > hive.vectorized.input.format.excludes=org.apache.hadoop.hive.ql.io.orc.OrcInputFormat; > set hive.vectorized.execution.enabled=true; > explain vectorization select * from alltypesorc where cint = 528534767 limit > 10; > ++ > | Explain | > ++ > | PLAN VECTORIZATION:| > | enabled: true| > | enabledConditionsMet: [hive.vectorized.execution.enabled IS true] | > || > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Map Reduce | > | Map Operator Tree: | > | TableScan| > | alias: alltypesorc | > | Statistics: Num rows: 12288 Data size: 2641964 Basic stats: > COMPLETE Column stats: NONE | > | Filter Operator| > | predicate: (cint = 528534767) (type: boolean) | > | Statistics: Num rows: 6144 Data size: 1320982 Basic stats: > COMPLETE Column stats: NONE | > | Select Operator | > | expressions: ctinyint (type: tinyint), csmallint (type: > smallint), 528534767 (type: int), cbigint (type: bigint), cfloat (type: > float), cdouble (type: double), cstring1 (type: string), cstring2 (type: > string), ctimestamp1 (type: timestamp), ctimestamp2 (type: timestamp), > cboolean1 (type: boolean), cboolean2 (type: boolean) | > | outputColumnNames: _col0, _col1, _col2, _col3, _col4, 
> _col5, _col6, _col7, _col8, _col9, _col10, _col11 | > | Statistics: Num rows: 6144 Data size: 1320982 Basic stats: > COMPLETE Column stats: NONE | > | Limit | > | Number of rows: 10 | > | Statistics: Num rows: 10 Data size: 2150 Basic stats: > COMPLETE Column stats: NONE | > | File Output Operator | > | compressed: false | > | Statistics: Num rows: 10 Data size: 2150 Basic stats: > COMPLETE Column stats: NONE | > | table: | > | input format: > org.apache.hadoop.mapred.SequenceFileInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat | > | serde: > org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | > | Execution mode: vectorized | > | Map Vectorization: | > | enabled: true| > | enabledConditionsMet: hive.vectorized.use.row.serde.deserialize > IS true | > | groupByVectorOutput: true| > | inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > | > | allNative: false | > | usesVectorUDFAdaptor: false | > | vectorized: true | > || > | Stage: Stage-0 | > | Fetch Operator | > | limit: 10| > | Processor Tree: | > | ListSink | > || > ++ > 48 rows selected (0.742 seconds) > 0: jdbc:
[jira] [Assigned] (HIVE-17875) Vectorization support for complex types breaks parquet vectorization
[ https://issues.apache.org/jira/browse/HIVE-17875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar reassigned HIVE-17875: -- > Vectorization support for complex types breaks parquet vectorization > > > Key: HIVE-17875 > URL: https://issues.apache.org/jira/browse/HIVE-17875 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > > HIVE-16589 introduced support for complex types for vectorized execution. It > introduces two new configs {{hive.vectorized.complex.types.enabled}} and > {{hive.vectorized.groupby.complex.types.enabled}} which default to true and > control whether {{Vectorizer}} creates a vectorized execution plan for > queries using complex types. Since Parquet fileformat does not support > vectorization for complex types yet, any query running on parquet tables with > complex types current fails with a RuntimeException complaining that the > complex type is not supported. We should improve the logic in Vectorizer to > check if the FileinputFormat supports complex types and if not it should not > vectorize the query plan. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
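The improvement proposed in HIVE-17875 — have the Vectorizer check whether the file input format supports complex types before producing a vectorized plan — can be sketched as below. This is a hypothetical Python illustration with invented names; the real decision lives inside Hive's Vectorizer and involves far more conditions.

```python
# Toy guard: vectorize only when either no complex-typed column is read, or
# the input format is known to support vectorized complex types.

COMPLEX_TYPES = {"map", "list", "union", "struct"}

def supports_vectorization(input_format, column_types, complex_ok_formats):
    # "map<int,string>" -> base type "map"
    has_complex = any(t.split("<")[0] in COMPLEX_TYPES for t in column_types)
    # Fall back to the non-vectorized plan instead of failing at runtime.
    return not has_complex or input_format in complex_ok_formats
```

Under this sketch a parquet table with a map column would simply get a non-vectorized plan, rather than a RuntimeException, while the same schema on a format that handles complex types stays vectorized.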
[jira] [Updated] (HIVE-17368) DBTokenStore fails to connect in Kerberos enabled remote HMS environment
[ https://issues.apache.org/jira/browse/HIVE-17368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-17368: --- Resolution: Fixed Fix Version/s: 2.4.0 3.0.0 Status: Resolved (was: Patch Available) > DBTokenStore fails to connect in Kerberos enabled remote HMS environment > > > Key: HIVE-17368 > URL: https://issues.apache.org/jira/browse/HIVE-17368 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.0, 2.0.0, 2.1.0, 2.2.0 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-17368.01-branch-2.patch, HIVE-17368.01.patch, > HIVE-17368.02-branch-2.patch, HIVE-17368.02.patch, > HIVE-17368.03-branch-2.patch, HIVE-17368.04-branch-2.patch, > HIVE-17368.05-branch-2.patch, HIVE-17368.06-branch-2.patch > > > In setups where HMS is running as a remote process secured using Kerberos, > and when {{DBTokenStore}} is configured as the token store, the HS2 Thrift > API call {{GetDelegationToken}} fail with exception trace seen below. HS2 is > not able to invoke HMS APIs needed to add/remove/renew tokens from the DB > since it is possible that the user which is issue the {{GetDelegationToken}} > is not kerberos enabled. > Eg. Oozie submits a job on behalf of user "Joe". When Oozie opens a session > with HS2 it uses Oozie's principal and creates a proxy UGI with Hive. This > principal can establish a transport authenticated using Kerberos. It stores > the HMS delegation token string in the sessionConf and sessionToken. Now, > lets say Oozie issues a {{GetDelegationToken}} which has {{Joe}} as the owner > and {{oozie}} as the renewer in {{GetDelegationTokenReq}}. This API call > cannot instantiate a HMSClient and open transport to HMS using the HMSToken > string available in the sessionConf, since DBTokenStore uses server HiveConf > instead of sessionConf. It tries to establish transport using Kerberos and it > fails since user Joe is not Kerberos enabled. 
> I see the following exception trace in HS2 logs. > {noformat} > 2017-08-21T18:07:19,644 ERROR [HiveServer2-Handler-Pool: Thread-61] > transport.TSaslTransport: SASL negotiation failure > javax.security.sasl.SaslException: GSS initiate failed > at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) > ~[?:1.8.0_121] > at > org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) > ~[libthrift-0.9.3.jar:0.9.3] > at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) > [libthrift-0.9.3.jar:0.9.3] > at > org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) > [libthrift-0.9.3.jar:0.9.3] > at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) > [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) > [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at java.security.AccessController.doPrivileged(Native Method) > ~[?:1.8.0_121] > at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_121] > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > [hadoop-common-2.7.2.jar:?] 
> at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) > [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:488) > [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:255) > [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:70) > [hive-exec-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) ~[?:1.8.0_121] > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > [?:1.8.0_121] > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > [?:1.8.0_121] > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > [?:1.8.0_121] > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1699) > [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(Retrying
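The root cause described in HIVE-17368 — DBTokenStore resolving credentials from the server HiveConf, where no session delegation token exists, so the HMS client falls back to Kerberos for a proxy user like "Joe" who has no Kerberos credentials — can be illustrated with a toy sketch. All names here are hypothetical; this is not Hive's API, just a minimal model of the conf-selection bug.

```python
# Toy model: the client uses a delegation token when the conf carries one,
# otherwise it attempts a Kerberos handshake.

def open_hms_transport(conf):
    token = conf.get("hms.delegation.token")
    if token:
        return ("token", token)
    return ("kerberos", conf.get("client.principal"))

# Bug: the server conf has no session token, so a lookup against it forces
# Kerberos, which fails for the non-kerberized proxy user.
server_conf = {"client.principal": None}
# Fix: resolve from the session conf, where the delegation token lives.
session_conf = {"hms.delegation.token": "hms-token-for-joe"}
```

The sketch makes the failure mode visible: handing the server conf to the transport yields the Kerberos path (and hence the GSS initiate failure seen in the log above), while the session conf yields the token path.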
[jira] [Commented] (HIVE-17368) DBTokenStore fails to connect in Kerberos enabled remote HMS environment
[ https://issues.apache.org/jira/browse/HIVE-17368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214492#comment-16214492 ] Vihang Karajgaonkar commented on HIVE-17368: Patch merged to master > DBTokenStore fails to connect in Kerberos enabled remote HMS environment > > > Key: HIVE-17368 > URL: https://issues.apache.org/jira/browse/HIVE-17368 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.0, 2.0.0, 2.1.0, 2.2.0 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Fix For: 3.0.0, 2.4.0 > > Attachments: HIVE-17368.01-branch-2.patch, HIVE-17368.01.patch, > HIVE-17368.02-branch-2.patch, HIVE-17368.02.patch, > HIVE-17368.03-branch-2.patch, HIVE-17368.04-branch-2.patch, > HIVE-17368.05-branch-2.patch, HIVE-17368.06-branch-2.patch > > > In setups where HMS is running as a remote process secured using Kerberos, > and when {{DBTokenStore}} is configured as the token store, the HS2 Thrift > API call {{GetDelegationToken}} fail with exception trace seen below. HS2 is > not able to invoke HMS APIs needed to add/remove/renew tokens from the DB > since it is possible that the user which is issue the {{GetDelegationToken}} > is not kerberos enabled. > Eg. Oozie submits a job on behalf of user "Joe". When Oozie opens a session > with HS2 it uses Oozie's principal and creates a proxy UGI with Hive. This > principal can establish a transport authenticated using Kerberos. It stores > the HMS delegation token string in the sessionConf and sessionToken. Now, > lets say Oozie issues a {{GetDelegationToken}} which has {{Joe}} as the owner > and {{oozie}} as the renewer in {{GetDelegationTokenReq}}. This API call > cannot instantiate a HMSClient and open transport to HMS using the HMSToken > string available in the sessionConf, since DBTokenStore uses server HiveConf > instead of sessionConf. It tries to establish transport using Kerberos and it > fails since user Joe is not Kerberos enabled. 
> I see the following exception trace in HS2 logs. > {noformat} > 2017-08-21T18:07:19,644 ERROR [HiveServer2-Handler-Pool: Thread-61] > transport.TSaslTransport: SASL negotiation failure > javax.security.sasl.SaslException: GSS initiate failed > at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) > ~[?:1.8.0_121] > at > org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) > ~[libthrift-0.9.3.jar:0.9.3] > at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) > [libthrift-0.9.3.jar:0.9.3] > at > org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) > [libthrift-0.9.3.jar:0.9.3] > at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) > [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) > [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at java.security.AccessController.doPrivileged(Native Method) > ~[?:1.8.0_121] > at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_121] > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > [hadoop-common-2.7.2.jar:?] 
> at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) > [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:488) > [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:255) > [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:70) > [hive-exec-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) ~[?:1.8.0_121] > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > [?:1.8.0_121] > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > [?:1.8.0_121] > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > [?:1.8.0_121] > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1699) > [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:83) > [hive-metastore-2.
[jira] [Updated] (HIVE-17874) Parquet vectorization fails on tables with complex columns when there are no projected columns
[ https://issues.apache.org/jira/browse/HIVE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-17874: --- Status: Patch Available (was: Open) > Parquet vectorization fails on tables with complex columns when there are no > projected columns > -- > > Key: HIVE-17874 > URL: https://issues.apache.org/jira/browse/HIVE-17874 > Project: Hive > Issue Type: Sub-task >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17874.01-branch-2.patch, HIVE-17874.01.patch > > > When a parquet table contains an unsupported type like {{Map}}, {{LIST}} or > {{UNION}} simple queries like {{select count(*) from table}} fails with > {{unsupported type exception}} even though vectorized reader doesn't really > need read the complex type into batches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17874) Parquet vectorization fails on tables with complex columns when there are no projected columns
[ https://issues.apache.org/jira/browse/HIVE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-17874: --- Attachment: HIVE-17874.01.patch HIVE-17874.01-branch-2.patch > Parquet vectorization fails on tables with complex columns when there are no > projected columns > -- > > Key: HIVE-17874 > URL: https://issues.apache.org/jira/browse/HIVE-17874 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17874.01-branch-2.patch, HIVE-17874.01.patch > > > When a parquet table contains an unsupported type like {{Map}}, {{LIST}} or > {{UNION}} simple queries like {{select count(*) from table}} fails with > {{unsupported type exception}} even though vectorized reader doesn't really > need read the complex type into batches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17874) Parquet vectorization fails on tables with complex columns when there are no projected columns
[ https://issues.apache.org/jira/browse/HIVE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar updated HIVE-17874: --- Affects Version/s: 2.2.0 > Parquet vectorization fails on tables with complex columns when there are no > projected columns > -- > > Key: HIVE-17874 > URL: https://issues.apache.org/jira/browse/HIVE-17874 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17874.01-branch-2.patch, HIVE-17874.01.patch > > > When a parquet table contains an unsupported type like {{Map}}, {{LIST}} or > {{UNION}}, simple queries like {{select count(*) from table}} fail with an > {{unsupported type exception}} even though the vectorized reader doesn't really > need to read the complex type into batches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17368) DBTokenStore fails to connect in Kerberos enabled remote HMS environment
[ https://issues.apache.org/jira/browse/HIVE-17368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214485#comment-16214485 ] Vihang Karajgaonkar commented on HIVE-17368: test failures are unrelated. > DBTokenStore fails to connect in Kerberos enabled remote HMS environment > > > Key: HIVE-17368 > URL: https://issues.apache.org/jira/browse/HIVE-17368 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.0, 2.0.0, 2.1.0, 2.2.0 >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > Attachments: HIVE-17368.01-branch-2.patch, HIVE-17368.01.patch, > HIVE-17368.02-branch-2.patch, HIVE-17368.02.patch, > HIVE-17368.03-branch-2.patch, HIVE-17368.04-branch-2.patch, > HIVE-17368.05-branch-2.patch, HIVE-17368.06-branch-2.patch > > > In setups where HMS is running as a remote process secured using Kerberos, > and when {{DBTokenStore}} is configured as the token store, the HS2 Thrift > API call {{GetDelegationToken}} fails with the exception trace seen below. HS2 is > not able to invoke the HMS APIs needed to add/remove/renew tokens from the DB, > since it is possible that the user issuing the {{GetDelegationToken}} > is not Kerberos-enabled. > E.g., Oozie submits a job on behalf of user "Joe". When Oozie opens a session > with HS2 it uses Oozie's principal and creates a proxy UGI with Hive. This > principal can establish a transport authenticated using Kerberos. It stores > the HMS delegation token string in the sessionConf and sessionToken. Now, > let's say Oozie issues a {{GetDelegationToken}} which has {{Joe}} as the owner > and {{oozie}} as the renewer in {{GetDelegationTokenReq}}. This API call > cannot instantiate an HMSClient and open a transport to HMS using the HMSToken > string available in the sessionConf, since DBTokenStore uses the server HiveConf > instead of the sessionConf. It tries to establish the transport using Kerberos and it > fails since user Joe is not Kerberos-enabled. > I see the following exception trace in HS2 logs. 
> {noformat} > 2017-08-21T18:07:19,644 ERROR [HiveServer2-Handler-Pool: Thread-61] > transport.TSaslTransport: SASL negotiation failure > javax.security.sasl.SaslException: GSS initiate failed > at > com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211) > ~[?:1.8.0_121] > at > org.apache.thrift.transport.TSaslClientTransport.handleSaslStartMessage(TSaslClientTransport.java:94) > ~[libthrift-0.9.3.jar:0.9.3] > at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271) > [libthrift-0.9.3.jar:0.9.3] > at > org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) > [libthrift-0.9.3.jar:0.9.3] > at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52) > [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49) > [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at java.security.AccessController.doPrivileged(Native Method) > ~[?:1.8.0_121] > at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_121] > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > [hadoop-common-2.7.2.jar:?] 
> at > org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49) > [hive-shims-common-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:488) > [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.(HiveMetaStoreClient.java:255) > [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at > org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.(SessionHiveMetaStoreClient.java:70) > [hive-exec-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native > Method) ~[?:1.8.0_121] > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > [?:1.8.0_121] > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > [?:1.8.0_121] > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > [?:1.8.0_121] > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1699) > [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT] > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:83) > [hive-metastore-2.3.0-SNAPSHOT.jar:2.3.0-SNAPSHOT]
[jira] [Commented] (HIVE-17873) External LLAP client: allow same handleID to be used more than once
[ https://issues.apache.org/jira/browse/HIVE-17873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214483#comment-16214483 ] Hive QA commented on HIVE-17873: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12893470/HIVE-17873.1.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11315 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[optimize_nullscan] (batchId=163) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[resourceplan] (batchId=158) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=101) org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=204) org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=221) org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch (batchId=269) org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=228) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7438/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7438/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7438/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12893470 - PreCommit-HIVE-Build > External LLAP client: allow same handleID to be used more than once > --- > > Key: HIVE-17873 > URL: https://issues.apache.org/jira/browse/HIVE-17873 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17873.1.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17874) Parquet vectorization fails on tables with complex columns when there are no projected columns
[ https://issues.apache.org/jira/browse/HIVE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vihang Karajgaonkar reassigned HIVE-17874: -- > Parquet vectorization fails on tables with complex columns when there are no > projected columns > -- > > Key: HIVE-17874 > URL: https://issues.apache.org/jira/browse/HIVE-17874 > Project: Hive > Issue Type: Sub-task >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar > > When a parquet table contains an unsupported type like {{Map}}, {{LIST}} or > {{UNION}} simple queries like {{select count(*) from table}} fails with > {{unsupported type exception}} even though vectorized reader doesn't really > need read the complex type into batches. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17737) ObjectStore.getNotificationEventsCount may cause NPE
[ https://issues.apache.org/jira/browse/HIVE-17737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasad Nagaraj Subramanya reassigned HIVE-17737: Assignee: Prasad Nagaraj Subramanya > ObjectStore.getNotificationEventsCount may cause NPE > > > Key: HIVE-17737 > URL: https://issues.apache.org/jira/browse/HIVE-17737 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.3.0, 3.0.0 >Reporter: Alexander Kolbasov >Assignee: Prasad Nagaraj Subramanya > > In ObjectStore.getNotificationEventsCount(): > {code} > public NotificationEventsCountResponse > getNotificationEventsCount(NotificationEventsCountRequest rqst) { > Long result = 0L; > try { > openTransaction(); > long fromEventId = rqst.getFromEventId(); > String inputDbName = rqst.getDbName(); > String queryStr = "select count(eventId) from " + > MNotificationLog.class.getName() > + " where eventId > fromEventId && dbName == inputDbName"; > query = pm.newQuery(queryStr); > query.declareParameters("java.lang.Long fromEventId, java.lang.String > inputDbName"); > result = (Long) query.execute(fromEventId, inputDbName); // <- Here > commited = commitTransaction(); > return new NotificationEventsCountResponse(result.longValue()); > } > } > {code} > It is possible that query.execute will return null, in which case > result.longValue() may throw an NPE. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
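The null-safe pattern the report calls for can be sketched in isolation. This is an illustrative helper, not the actual patch; the class name `NotificationCount` and the method `toCount` are hypothetical. JDO's `query.execute` can return null, so the aggregate result should be unwrapped defensively before `longValue()` is called:

```java
public class NotificationCount {
    // Hypothetical sketch of the null-safe unwrap: treat a null
    // aggregate result as zero events instead of dereferencing it.
    static long toCount(Object queryResult) {
        Long result = (Long) queryResult;
        return result == null ? 0L : result.longValue();
    }

    public static void main(String[] args) {
        System.out.println(toCount(null)); // 0 instead of an NPE
        System.out.println(toCount(5L));   // 5
    }
}
```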
[jira] [Updated] (HIVE-17873) External LLAP client: allow same handleID to be used more than once
[ https://issues.apache.org/jira/browse/HIVE-17873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-17873: -- Status: Patch Available (was: Open) > External LLAP client: allow same handleID to be used more than once > --- > > Key: HIVE-17873 > URL: https://issues.apache.org/jira/browse/HIVE-17873 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17873.1.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17873) External LLAP client: allow same handleID to be used more than once
[ https://issues.apache.org/jira/browse/HIVE-17873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-17873: -- Attachment: HIVE-17873.1.patch > External LLAP client: allow same handleID to be used more than once > --- > > Key: HIVE-17873 > URL: https://issues.apache.org/jira/browse/HIVE-17873 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > Attachments: HIVE-17873.1.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17873) External LLAP client: allow same handleID to be used more than once
[ https://issues.apache.org/jira/browse/HIVE-17873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere reassigned HIVE-17873: - > External LLAP client: allow same handleID to be used more than once > --- > > Key: HIVE-17873 > URL: https://issues.apache.org/jira/browse/HIVE-17873 > Project: Hive > Issue Type: Bug > Components: llap >Reporter: Jason Dere >Assignee: Jason Dere > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17870) Update NoDeleteRollingFileAppender to use Log4j2 api
[ https://issues.apache.org/jira/browse/HIVE-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214435#comment-16214435 ] Prasad Nagaraj Subramanya commented on HIVE-17870: -- [~aihuaxu] It looks like NoDeleteRollingFileAppender is never used. Should we just remove the class instead? > Update NoDeleteRollingFileAppender to use Log4j2 api > > > Key: HIVE-17870 > URL: https://issues.apache.org/jira/browse/HIVE-17870 > Project: Hive > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Aihua Xu > > NoDeleteRollingFileAppender is still using the Log4j v1 API. Since we have already > moved to Log4j2 in Hive, we should update it to use the Log4j v2 API as well. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
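For reference, the "roll but never delete" behavior can be expressed in Log4j2 configuration alone, which may make a custom appender class unnecessary. This is a sketch, not Hive's actual configuration; the appender name, file paths, and layout pattern are illustrative. Log4j2's RollingFile appender only removes old files when a Delete action or an indexed rollover strategy is configured, so a plain time-based policy preserves every rolled file:

```xml
<!-- Sketch: Log4j2 RollingFile that rolls daily and never deletes old
     logs. Names and paths are illustrative, not Hive's actual config. -->
<RollingFile name="noDeleteAppender"
             fileName="logs/hive.log"
             filePattern="logs/hive.log.%d{yyyy-MM-dd}">
  <PatternLayout pattern="%d{ISO8601} %-5p [%t] %c{2}: %m%n"/>
  <Policies>
    <!-- Roll once per day; with no Delete action configured,
         rolled files accumulate indefinitely -->
    <TimeBasedTriggeringPolicy/>
  </Policies>
</RollingFile>
```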