[jira] [Work logged] (HIVE-24524) LLAP ShuffleHandler: upgrade to netty4
[ https://issues.apache.org/jira/browse/HIVE-24524?focusedWorklogId=566659&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566659 ]

ASF GitHub Bot logged work on HIVE-24524:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 16/Mar/21 00:50
Start Date: 16/Mar/21 00:50
Worklog Time Spent: 10m
Work Description: github-actions[bot] closed pull request #1778:
URL: https://github.com/apache/hive/pull/1778

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 566659)
Time Spent: 0.5h (was: 20m)

> LLAP ShuffleHandler: upgrade to netty4
> --------------------------------------
>
> Key: HIVE-24524
> URL: https://issues.apache.org/jira/browse/HIVE-24524
> Project: Hive
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Tez already has a WIP patch for upgrading its shuffle handler to netty4.
> Netty4 is said to offer a possible performance improvement over Netty3.
> However, the refactor is not trivial; TEZ-4157 covers that more or less (the
> code bases are very similar).
> Background:
> netty4 migration guideline:
> https://netty.io/wiki/new-and-noteworthy-in-4.0.html
> articles on possible performance improvement:
> https://blog.twitter.com/engineering/en_us/a/2013/netty-4-at-twitter-reduced-gc-overhead.html
> https://developer.squareup.com/blog/upgrading-a-reverse-proxy-from-netty-3-to-4/
> some other notes: Netty3 has been EOL since 2016:
> https://netty.io/news/2016/06/29/3-10-6-Final.html

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-21737) Upgrade Avro to version 1.10.1
[ https://issues.apache.org/jira/browse/HIVE-21737?focusedWorklogId=566657&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566657 ] ASF GitHub Bot logged work on HIVE-21737: - Author: ASF GitHub Bot Created on: 16/Mar/21 00:50 Start Date: 16/Mar/21 00:50 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #1806: URL: https://github.com/apache/hive/pull/1806 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 566657) Time Spent: 9h (was: 8h 50m) > Upgrade Avro to version 1.10.1 > -- > > Key: HIVE-21737 > URL: https://issues.apache.org/jira/browse/HIVE-21737 > Project: Hive > Issue Type: Improvement > Components: Hive >Reporter: Ismaël Mejía >Assignee: Fokko Driesprong >Priority: Major > Labels: pull-request-available > Attachments: > 0001-HIVE-21737-Make-Avro-use-in-Hive-compatible-with-Avr.patch > > Time Spent: 9h > Remaining Estimate: 0h > > Avro >= 1.9.x bring a lot of fixes including a leaner version of Avro without > Jackson in the public API and Guava as a dependency. Worth the update. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24594) results_cache_invalidation2.q is flaky
[ https://issues.apache.org/jira/browse/HIVE-24594?focusedWorklogId=566655&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566655 ]

ASF GitHub Bot logged work on HIVE-24594:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 16/Mar/21 00:49
Start Date: 16/Mar/21 00:49
Worklog Time Spent: 10m
Work Description: github-actions[bot] closed pull request #1837:
URL: https://github.com/apache/hive/pull/1837

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 566655)
Time Spent: 0.5h (was: 20m)

> results_cache_invalidation2.q is flaky
> --------------------------------------
>
> Key: HIVE-24594
> URL: https://issues.apache.org/jira/browse/HIVE-24594
> Project: Hive
> Issue Type: Test
> Reporter: Vihang Karajgaonkar
> Assignee: Vihang Karajgaonkar
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> results_cache_invalidation2.q failed for me a couple of times on an unrelated
> PR. Here is the error log.
> {noformat}
> ---
> Test set: org.apache.hadoop.hive.cli.split19.TestMiniLlapLocalCliDriver
> ---
> Tests run: 90, Failures: 1, Errors: 0, Skipped: 6, Time elapsed: 450.54 s <<<
> FAILURE! - in org.apache.hadoop.hive.cli.split19.TestMiniLlapLocalCliDriver
> org.apache.hadoop.hive.cli.split19.TestMiniLlapLocalCliDriver.testCliDriver[results_cache_invalidation2]
> Time elapsed: 15.087 s <<< FAILURE!
> java.lang.AssertionError:
> Client Execution succeeded but contained differences (error code = 1) after
> executing results_cache_invalidation2.q
> 266a267
> > A masked pattern was here
> 271a273
> > A masked pattern was here
> 273c275,276
> < Stage-0 is a root stage
> ---
> > Stage-1 is a root stage
> > Stage-0 depends on stages: Stage-1
> 275a279,365
> > Stage: Stage-1
> > Tez
> > A masked pattern was here
> > Edges:
> > Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
> > Reducer 3 <- Reducer 2 (CUSTOM_SIMPLE_EDGE)
> > A masked pattern was here
> > Vertices:
> > Map 1
> > Map Operator Tree:
> > TableScan
> > alias: tab1
> > filterExpr: key is not null (type: boolean)
> > Statistics: Num rows: 1500 Data size: 130500 Basic stats:
> > COMPLETE Column stats: COMPLETE
> > Filter Operator
> > predicate: key is not null (type: boolean)
> > Statistics: Num rows: 1500 Data size: 130500 Basic
> > stats: COMPLETE Column stats: COMPLETE
> > Select Operator
> > expressions: key (type: string)
> > outputColumnNames: _col0
> > Statistics: Num rows: 1500 Data size: 130500 Basic
> > stats: COMPLETE Column stats: COMPLETE
> > Reduce Output Operator
> > key expressions: _col0 (type: string)
> > null sort order: z
> > sort order: +
> > Map-reduce partition columns: _col0 (type: string)
> > Statistics: Num rows: 1500 Data size: 130500 Basic
> > stats: COMPLETE Column stats: COMPLETE
> > Execution mode: vectorized, llap
> > LLAP IO: all inputs
> > Map 4
> > Map Operator Tree:
> > TableScan
> > alias: tab2
> > filterExpr: key is not null (type: boolean)
> > Statistics: Num rows: 500 Data size: 43500 Basic stats:
> > COMPLETE Column stats: COMPLETE
> > Fil
> {noformat}
> The test works for me locally. In fact, the same PR had a successful run of
> this test in a previous commit. I think we should disable this test and
> re-enable it after fixing the flakiness.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-24595) Vectorization causing incorrect results for scalar subquery
[ https://issues.apache.org/jira/browse/HIVE-24595?focusedWorklogId=566656&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566656 ]

ASF GitHub Bot logged work on HIVE-24595:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 16/Mar/21 00:49
Start Date: 16/Mar/21 00:49
Worklog Time Spent: 10m
Work Description: github-actions[bot] commented on pull request #1867:
URL: https://github.com/apache/hive/pull/1867#issuecomment-799860346

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 566656)
Time Spent: 20m (was: 10m)

> Vectorization causing incorrect results for scalar subquery
> -----------------------------------------------------------
>
> Key: HIVE-24595
> URL: https://issues.apache.org/jira/browse/HIVE-24595
> Project: Hive
> Issue Type: Bug
> Components: Vectorization
> Affects Versions: 3.0.0
> Reporter: Vineet Garg
> Assignee: Mustafa İman
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> *Repro*
> {code:sql}
> CREATE EXTERNAL TABLE `alltypessmall`(
>   `id` int,
>   `bool_col` boolean,
>   `tinyint_col` tinyint,
>   `smallint_col` smallint,
>   `int_col` int,
>   `bigint_col` bigint,
>   `float_col` float,
>   `double_col` double,
>   `date_string_col` string,
>   `string_col` string,
>   `timestamp_col` timestamp)
> PARTITIONED BY (
>   `year` int,
>   `month` int)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> WITH SERDEPROPERTIES (
>   'escape.delim'='\\',
>   'field.delim'=',',
>   'serialization.format'=',')
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> TBLPROPERTIES (
>   'DO_NOT_UPDATE_STATS'='true',
>   'OBJCAPABILITIES'='EXTREAD,EXTWRITE',
>   'STATS_GENERATED'='TASK',
>   'impala.lastComputeStatsTime'='1608312793',
>   'transient_lastDdlTime'='1608310442');
> insert into alltypessmall partition(year=2002,month=1) values(1, true,
> 3,3,4,3434,5.4,44.3,'str1','str2', '01-01-2001');
> insert into alltypessmall partition(year=2002,month=1) values(1, true,
> 3,3,4,3434,5.4,44.3,'str1','str2', '01-01-2001');
> insert into alltypessmall partition(year=2002,month=1) values(1, true,
> 3,3,40,3434,5.4,44.3,'str1','str2', '01-01-2001');
> {code}
> The following query should fail (the scalar subquery returns more than one
> row), but it succeeds:
> {code:sql}
> SELECT id FROM alltypessmall
> WHERE int_col =
>   (SELECT int_col
>    FROM alltypessmall)
> ORDER BY id;
> {code}
> *Explain plan*
> {code:java}
> STAGE DEPENDENCIES:
> Stage-1 is a root stage
> Stage-0 depends on stages: Stage-1
> STAGE PLANS:
> Stage: Stage-1
> Tez
> DagId: vgarg_20210106115838_3fe73bf6-66c2-4281-92e8-fd75fd8ad400:17
> Edges:
> Map 1 <- Map 3 (BROADCAST_EDGE), Reducer 4 (BROADCAST_EDGE)
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Reducer 4 <- Map 3 (CUSTOM_SIMPLE_EDGE)
> DagName: vgarg_20210106115838_3fe73bf6-66c2-4281-92e8-fd75fd8ad400:17
> Vertices:
> Map 1
> Map Operator Tree:
> TableScan
> alias: alltypessmall
> filterExpr: int_col is not null (type: boolean)
> Statistics: Num rows: 3 Data size: 24 Basic stats: COMPLETE
> Column stats: COMPLETE
> Filter Operator
> predicate: int_col is not null (type: bool
[jira] [Commented] (HIVE-24718) Moving to file based iteration for copying data
[ https://issues.apache.org/jira/browse/HIVE-24718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302080#comment-17302080 ] Pravin Sinha commented on HIVE-24718: - +1 > Moving to file based iteration for copying data > --- > > Key: HIVE-24718 > URL: https://issues.apache.org/jira/browse/HIVE-24718 > Project: Hive > Issue Type: Bug >Reporter: Arko Sharma >Assignee: Arko Sharma >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24718.01.patch, HIVE-24718.02.patch, > HIVE-24718.04.patch, HIVE-24718.05.patch, HIVE-24718.06.patch > > Time Spent: 6.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24887) getDatabase() to call translation code even if client has no capabilities
[ https://issues.apache.org/jira/browse/HIVE-24887?focusedWorklogId=566593&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566593 ]

ASF GitHub Bot logged work on HIVE-24887:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 15/Mar/21 22:33
Start Date: 15/Mar/21 22:33
Worklog Time Spent: 10m
Work Description: nrg4878 opened a new pull request #2076:
URL: https://github.com/apache/hive/pull/2076

…nslation (Naveen Gangam)

### What changes were proposed in this pull request?
A minor change to have getDatabase() call the translation layer even when the client does not set the capabilities.
Another change gives checkDeletePermissions in the storage-based authorizer package visibility, much like the other check* methods on this class. This allows the subclasses to use this method as well.

### Why are the changes needed?
Mostly consistency with other methods like createTable() etc.

### Does this PR introduce _any_ user-facing change?
Potentially, clients can see a different locationUri if their original database object had a location from the managed warehouse. This old location will now be set as managedLocationUri.

### How was this patch tested?
Manually. Failed unit tests.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 566593)
Remaining Estimate: 0h
Time Spent: 10m

> getDatabase() to call translation code even if client has no capabilities
> -------------------------------------------------------------------------
>
> Key: HIVE-24887
> URL: https://issues.apache.org/jira/browse/HIVE-24887
> Project: Hive
> Issue Type: Sub-task
> Reporter: Naveen Gangam
> Assignee: Naveen Gangam
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> We do this for other calls that go thru translation layer. For some reason,
> the current code only calls it when the client sets the capabilities.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Updated] (HIVE-24887) getDatabase() to call translation code even if client has no capabilities
[ https://issues.apache.org/jira/browse/HIVE-24887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24887: -- Labels: pull-request-available (was: ) > getDatabase() to call translation code even if client has no capabilities > - > > Key: HIVE-24887 > URL: https://issues.apache.org/jira/browse/HIVE-24887 > Project: Hive > Issue Type: Sub-task >Reporter: Naveen Gangam >Assignee: Naveen Gangam >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > We do this for other calls that go thru translation layer. For some reason, > the current code only calls it when the client sets the capabilities. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24887) getDatabase() to call translation code even if client has no capabilities
[ https://issues.apache.org/jira/browse/HIVE-24887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam reassigned HIVE-24887: > getDatabase() to call translation code even if client has no capabilities > - > > Key: HIVE-24887 > URL: https://issues.apache.org/jira/browse/HIVE-24887 > Project: Hive > Issue Type: Sub-task >Reporter: Naveen Gangam >Assignee: Naveen Gangam >Priority: Major > > We do this for other calls that go thru translation layer. For some reason, > the current code only calls it when the client sets the capabilities. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23779) BasicStatsTask Info is not getting printed in beeline console
[ https://issues.apache.org/jira/browse/HIVE-23779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naresh P R updated HIVE-23779: -- Fix Version/s: 4.0.0 Resolution: Fixed Status: Resolved (was: Patch Available) > BasicStatsTask Info is not getting printed in beeline console > - > > Key: HIVE-23779 > URL: https://issues.apache.org/jira/browse/HIVE-23779 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > After HIVE-16061, partition basic stats are not getting printed in beeline > console. > {code:java} > INFO : Partition {dt=2020-06-29} stats: [numFiles=21, numRows=22, > totalSize=14607, rawDataSize=0]{code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-23779) BasicStatsTask Info is not getting printed in beeline console
[ https://issues.apache.org/jira/browse/HIVE-23779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302014#comment-17302014 ] Naresh P R commented on HIVE-23779: --- Thanks for the review & merge [~mgergely] > BasicStatsTask Info is not getting printed in beeline console > - > > Key: HIVE-23779 > URL: https://issues.apache.org/jira/browse/HIVE-23779 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > After HIVE-16061, partition basic stats are not getting printed in beeline > console. > {code:java} > INFO : Partition {dt=2020-06-29} stats: [numFiles=21, numRows=22, > totalSize=14607, rawDataSize=0]{code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-23779) BasicStatsTask Info is not getting printed in beeline console
[ https://issues.apache.org/jira/browse/HIVE-23779?focusedWorklogId=566553&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566553 ] ASF GitHub Bot logged work on HIVE-23779: - Author: ASF GitHub Bot Created on: 15/Mar/21 21:17 Start Date: 15/Mar/21 21:17 Worklog Time Spent: 10m Work Description: miklosgergely merged pull request #2064: URL: https://github.com/apache/hive/pull/2064 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 566553) Time Spent: 1h 50m (was: 1h 40m) > BasicStatsTask Info is not getting printed in beeline console > - > > Key: HIVE-23779 > URL: https://issues.apache.org/jira/browse/HIVE-23779 > Project: Hive > Issue Type: Bug >Reporter: Naresh P R >Assignee: Naresh P R >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > After HIVE-16061, partition basic stats are not getting printed in beeline > console. > {code:java} > INFO : Partition {dt=2020-06-29} stats: [numFiles=21, numRows=22, > totalSize=14607, rawDataSize=0]{code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-24883) Add support for array type columns in Hive Joins
[ https://issues.apache.org/jira/browse/HIVE-24883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301925#comment-17301925 ]

Suprith commented on HIVE-24883:
--------------------------------

It looks like this addresses a subset (the array type) of the problem reported in https://issues.apache.org/jira/browse/HIVE-20962

> Add support for array type columns in Hive Joins
> ------------------------------------------------
>
> Key: HIVE-24883
> URL: https://issues.apache.org/jira/browse/HIVE-24883
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 4.0.0
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Hive fails to execute joins on array type columns because the comparison
> functions cannot handle array type columns.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-24816) Upgrade jackson to 2.10.5.1 or 2.11.0+ due to CVE-2020-25649
[ https://issues.apache.org/jira/browse/HIVE-24816?focusedWorklogId=566514&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566514 ] ASF GitHub Bot logged work on HIVE-24816: - Author: ASF GitHub Bot Created on: 15/Mar/21 19:50 Start Date: 15/Mar/21 19:50 Worklog Time Spent: 10m Work Description: saihemanth-cloudera opened a new pull request #2075: URL: https://github.com/apache/hive/pull/2075 ### What changes were proposed in this pull request? Jackson version changed to 2.11.0 in the pom files. ### Why are the changes needed? To avoid CVE-2020-25649 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Local machine. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 566514) Time Spent: 20m (was: 10m) > Upgrade jackson to 2.10.5.1 or 2.11.0+ due to CVE-2020-25649 > > > Key: HIVE-24816 > URL: https://issues.apache.org/jira/browse/HIVE-24816 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Sai Hemanth Gantasala >Assignee: Sai Hemanth Gantasala >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Currently, hive is pulling Jackson 2.10.5 version jar. Please upgrade to > 2.10.5.1 or 2.11.0+ due to CVE-2020-25649. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24853) HMS leaks queries in case of timeout
[ https://issues.apache.org/jira/browse/HIVE-24853?focusedWorklogId=566472&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566472 ]

ASF GitHub Bot logged work on HIVE-24853:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 15/Mar/21 18:58
Start Date: 15/Mar/21 18:58
Worklog Time Spent: 10m
Work Description: ayushtkn commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r594605627

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java ##
@@ -1783,12 +1783,16 @@ private long partsFoundForPartitions(
       MetastoreDirectSqlUtils.timingTrace(doTrace, queryText, start, end);
       List list = MetastoreDirectSqlUtils.ensureList(qResult);
       List colStats = new ArrayList(list.size());
-      for (Object[] row : list) {
-        colStats.add(prepareCSObjWithAdjustedNDV(row, 0, useDensityFunctionForNDVEstimation, ndvTuner));
-        Deadline.checkTimeout();
+        for (Object[] row : list) {

Review comment:
Done

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 566472)
Time Spent: 3h 20m (was: 3h 10m)

> HMS leaks queries in case of timeout
> ------------------------------------
>
> Key: HIVE-24853
> URL: https://issues.apache.org/jira/browse/HIVE-24853
> Project: Hive
> Issue Type: Bug
> Reporter: Ayush Saxena
> Assignee: Ayush Saxena
> Priority: Major
> Labels: pull-request-available
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> The queries aren't closed in case of timeout.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HIVE-24853) HMS leaks queries in case of timeout
[ https://issues.apache.org/jira/browse/HIVE-24853?focusedWorklogId=566470&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566470 ]

ASF GitHub Bot logged work on HIVE-24853:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 15/Mar/21 18:57
Start Date: 15/Mar/21 18:57
Worklog Time Spent: 10m
Work Description: ayushtkn commented on a change in pull request #2044:
URL: https://github.com/apache/hive/pull/2044#discussion_r594604694

## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java ##
@@ -1783,12 +1783,16 @@ private long partsFoundForPartitions(
       MetastoreDirectSqlUtils.timingTrace(doTrace, queryText, start, end);
       List list = MetastoreDirectSqlUtils.ensureList(qResult);
       List colStats = new ArrayList(list.size());
-      for (Object[] row : list) {
-        colStats.add(prepareCSObjWithAdjustedNDV(row, 0, useDensityFunctionForNDVEstimation, ndvTuner));
-        Deadline.checkTimeout();
+        for (Object[] row : list) {
+          colStats.add(prepareCSObjWithAdjustedNDV(row, 0,
+              useDensityFunctionForNDVEstimation, ndvTuner));
+          Deadline.checkTimeout();
+        }
+        return colStats;
+      } catch (Exception e) {
+        throwMetaOrRuntimeException(e);
+        return Collections.emptyList();

Review comment:
Compiler wants it. :-(

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 566470)
Time Spent: 3h 10m (was: 3h)

> HMS leaks queries in case of timeout
> ------------------------------------
>
> Key: HIVE-24853
> URL: https://issues.apache.org/jira/browse/HIVE-24853
> Project: Hive
> Issue Type: Bug
> Reporter: Ayush Saxena
> Assignee: Ayush Saxena
> Priority: Major
> Labels: pull-request-available
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> The queries aren't closed in case of timeout.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
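The leak described in HIVE-24853 follows a general pattern: when a timeout check throws while results are still being consumed, any cleanup placed after the loop never runs. Below is a minimal sketch of the try-with-resources shape that avoids this. It is not the code from pull request #2044: `FakeQuery`, `checkTimeout` and `collect` are hypothetical names, standing in for the real metastore query handle and `Deadline.checkTimeout()`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CloseOnTimeoutSketch {

  /** Hypothetical stand-in for a query handle that must always be closed. */
  static final class FakeQuery implements AutoCloseable {
    boolean closed = false;

    List<String> execute() {
      return Arrays.asList("a", "b", "c");
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Hypothetical deadline check: throws once more than maxRows rows were seen. */
  static void checkTimeout(int rowsSeen, int maxRows) {
    if (rowsSeen > maxRows) {
      throw new IllegalStateException("timeout after " + maxRows + " rows");
    }
  }

  /** Consumes the query; try-with-resources closes it even if checkTimeout throws. */
  static List<String> collect(FakeQuery query, int maxRows) {
    try (FakeQuery q = query) {
      List<String> out = new ArrayList<>();
      for (String row : q.execute()) {
        out.add(row);
        checkTimeout(out.size(), maxRows);
      }
      return out;
    }
  }

  public static void main(String[] args) {
    FakeQuery ok = new FakeQuery();
    System.out.println(collect(ok, 10) + " closed=" + ok.closed);

    FakeQuery slow = new FakeQuery();
    try {
      collect(slow, 1);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage() + " closed=" + slow.closed);
    }
  }
}
```

In both the normal and the timeout path the handle ends up closed, which is the property the issue asks for; how the real patch achieves it may differ.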
[jira] [Work logged] (HIVE-24828) [HMS] Provide new HMS API to return latest committed compaction record for a given table
[ https://issues.apache.org/jira/browse/HIVE-24828?focusedWorklogId=566358&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566358 ] ASF GitHub Bot logged work on HIVE-24828: - Author: ASF GitHub Bot Created on: 15/Mar/21 16:15 Start Date: 15/Mar/21 16:15 Worklog Time Spent: 10m Work Description: hsnusonic commented on pull request #2073: URL: https://github.com/apache/hive/pull/2073#issuecomment-799547557 @kishendas @pvargacl Could you please review the changes? Thanks This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 566358) Time Spent: 1h (was: 50m) > [HMS] Provide new HMS API to return latest committed compaction record for a > given table > > > Key: HIVE-24828 > URL: https://issues.apache.org/jira/browse/HIVE-24828 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Kishen Das >Assignee: Yu-Wen Lai >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > We need a new HMS API to return the latest committed compaction record for a > given table. This can be used by a remote cache to decide whether a given > table's file metadata has been compacted or not, in order to decide whether > file metadata has to be refreshed from the file system before serving or it > can serve the current data from the cache. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24879) Create new metric about ACID metadata size
[ https://issues.apache.org/jira/browse/HIVE-24879?focusedWorklogId=566339&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566339 ] ASF GitHub Bot logged work on HIVE-24879: - Author: ASF GitHub Bot Created on: 15/Mar/21 15:59 Start Date: 15/Mar/21 15:59 Worklog Time Spent: 10m Work Description: pvargacl commented on a change in pull request #2074: URL: https://github.com/apache/hive/pull/2074#discussion_r594465250 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/MetricsInfo.java ## @@ -0,0 +1,26 @@ +package org.apache.hadoop.hive.metastore.txn; Review comment: Missing apache licence This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 566339) Time Spent: 20m (was: 10m) > Create new metric about ACID metadata size > -- > > Key: HIVE-24879 > URL: https://issues.apache.org/jira/browse/HIVE-24879 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > 2 new metrics should be created: > * Number of rows in txn_to_writeid table > * Number of rows in completed_txns table -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24879) Create new metric about ACID metadata size
[ https://issues.apache.org/jira/browse/HIVE-24879?focusedWorklogId=566314&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566314 ] ASF GitHub Bot logged work on HIVE-24879: - Author: ASF GitHub Bot Created on: 15/Mar/21 15:38 Start Date: 15/Mar/21 15:38 Worklog Time Spent: 10m Work Description: deniskuzZ opened a new pull request #2074: URL: https://github.com/apache/hive/pull/2074 ### What changes were proposed in this pull request? Introduced ACID metadata related metrics ### Why are the changes needed? Compaction observability ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Unit tests This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 566314) Remaining Estimate: 0h Time Spent: 10m > Create new metric about ACID metadata size > -- > > Key: HIVE-24879 > URL: https://issues.apache.org/jira/browse/HIVE-24879 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > 2 new metrics should be created: > * Number of rows in txn_to_writeid table > * Number of rows in completed_txns table -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-24879) Create new metric about ACID metadata size
[ https://issues.apache.org/jira/browse/HIVE-24879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-24879: -- Labels: pull-request-available (was: ) > Create new metric about ACID metadata size > -- > > Key: HIVE-24879 > URL: https://issues.apache.org/jira/browse/HIVE-24879 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > 2 new metrics should be created: > * Number of rows in txn_to_writeid table > * Number of rows in completed_txns table -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-24874) Worker performance metric
[ https://issues.apache.org/jira/browse/HIVE-24874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-24874 started by Denys Kuzmenko.
---------------------------------------------

> Worker performance metric
> -------------------------
>
> Key: HIVE-24874
> URL: https://issues.apache.org/jira/browse/HIVE-24874
> Project: Hive
> Issue Type: Sub-task
> Reporter: Denys Kuzmenko
> Assignee: Denys Kuzmenko
> Priority: Major
>
> Wrap the Compaction Worker with PerformanceLogger.
> Major and minor compactions should be measured with separate metrics.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
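The requirement in HIVE-24874 (time the worker, and keep major and minor compactions apart) can be sketched roughly as follows. This does not use Hive's PerformanceLogger; the class `CompactionTimingSketch`, its `timed` method and the `CompactionType` enum are hypothetical names chosen only to illustrate keeping one timer per compaction type.

```java
import java.util.EnumMap;
import java.util.Map;

public class CompactionTimingSketch {

  /** Hypothetical compaction kinds; each one gets its own metric. */
  enum CompactionType { MAJOR, MINOR }

  private final Map<CompactionType, Long> totalNanos = new EnumMap<>(CompactionType.class);
  private final Map<CompactionType, Integer> runs = new EnumMap<>(CompactionType.class);

  /** Runs the work and records its duration under the metric for the given type. */
  void timed(CompactionType type, Runnable work) {
    long start = System.nanoTime();
    try {
      work.run();
    } finally {
      // Record the sample even if the compaction fails.
      totalNanos.merge(type, System.nanoTime() - start, Long::sum);
      runs.merge(type, 1, Integer::sum);
    }
  }

  int runs(CompactionType type) {
    return runs.getOrDefault(type, 0);
  }

  long totalNanos(CompactionType type) {
    return totalNanos.getOrDefault(type, 0L);
  }

  public static void main(String[] args) {
    CompactionTimingSketch metrics = new CompactionTimingSketch();
    metrics.timed(CompactionType.MAJOR, () -> { });
    metrics.timed(CompactionType.MINOR, () -> { });
    System.out.println("major runs=" + metrics.runs(CompactionType.MAJOR)
        + ", minor runs=" + metrics.runs(CompactionType.MINOR));
  }
}
```

Recording in a `finally` block is a design choice of this sketch: failed compactions still contribute a timing sample, which may or may not match what the eventual patch does.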
[jira] [Work started] (HIVE-24879) Create new metric about ACID metadata size
[ https://issues.apache.org/jira/browse/HIVE-24879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-24879 started by Denys Kuzmenko. - > Create new metric about ACID metadata size > -- > > Key: HIVE-24879 > URL: https://issues.apache.org/jira/browse/HIVE-24879 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > > 2 new metrics should be created: > * Number of rows in txn_to_writeid table > * Number of rows in completed_txns table -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24879) Create new metric about ACID metadata size
[ https://issues.apache.org/jira/browse/HIVE-24879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denys Kuzmenko reassigned HIVE-24879: - Assignee: Denys Kuzmenko > Create new metric about ACID metadata size > -- > > Key: HIVE-24879 > URL: https://issues.apache.org/jira/browse/HIVE-24879 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > > 2 new metrics should be created: > * Number of rows in txn_to_writeid table > * Number of rows in completed_txns table -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-24871) Initiator / Cleaner performance metrics
[ https://issues.apache.org/jira/browse/HIVE-24871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denys Kuzmenko resolved HIVE-24871. --- Resolution: Fixed > Initiator / Cleaner performance metrics > --- > > Key: HIVE-24871 > URL: https://issues.apache.org/jira/browse/HIVE-24871 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > The PerformanceLogger should be used in the Initiator and Cleaner services. > * One cycle of the Initiator should be measured, ignoring the time spent > waiting on the lock for the AUX table > * One compaction cleanup should be measured in the Cleaner (using a different > metric for major and minor compaction cleanup) > Important note: the PerformanceLogger implementation from the metastore should be > used (not the ql one); otherwise the metric won't be published in HMS. -- This message was sent by Atlassian Jira (v8.3.4#803005)
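Measuring one Initiator cycle while ignoring the time spent waiting on a lock can be sketched as below. This is only an illustration under stated assumptions: `CycleTimerSketch` and its method names are hypothetical, it is not the metastore PerformanceLogger, and the clock is injected so the arithmetic is easy to check.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch: times a cycle and subtracts the time spent between
// beginLockWait() and endLockWait().
public final class CycleTimerSketch {
    private final LongSupplier clock; // e.g. System::nanoTime in real use
    private long cycleStart;
    private long waitStart;
    private long waited;

    public CycleTimerSketch(LongSupplier clock) {
        this.clock = clock;
    }

    public void beginCycle() {
        cycleStart = clock.getAsLong();
        waited = 0L;
    }

    // Call immediately before blocking on the lock.
    public void beginLockWait() {
        waitStart = clock.getAsLong();
    }

    // Call immediately after the lock is acquired.
    public void endLockWait() {
        waited += clock.getAsLong() - waitStart;
    }

    // Returns the cycle duration with the accumulated lock-wait time excluded.
    public long endCycle() {
        return clock.getAsLong() - cycleStart - waited;
    }
}
```

In real code the timer would be constructed with `System::nanoTime`; a fake clock is only useful for tests.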
[jira] [Assigned] (HIVE-24874) Worker performance metric
[ https://issues.apache.org/jira/browse/HIVE-24874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denys Kuzmenko reassigned HIVE-24874: - Assignee: Denys Kuzmenko > Worker performance metric > - > > Key: HIVE-24874 > URL: https://issues.apache.org/jira/browse/HIVE-24874 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > > Wrap the Compaction Worker with PerformanceLogger. > Major and minor compactions should be measured as separate metrics. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24871) Initiator / Cleaner performance metrics
[ https://issues.apache.org/jira/browse/HIVE-24871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denys Kuzmenko reassigned HIVE-24871: - Assignee: Denys Kuzmenko > Initiator / Cleaner performance metrics > --- > > Key: HIVE-24871 > URL: https://issues.apache.org/jira/browse/HIVE-24871 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > The PerformanceLogger should be used in the Initiator and Cleaner services. > * One cycle of the Initiator should be measured, ignoring the time spent > waiting on the lock for the AUX table > * One compaction cleanup should be measured in the Cleaner (using a different > metric for major and minor compaction cleanup) > Important note: the PerformanceLogger implementation from the metastore should be > used (not the ql one); otherwise the metric won't be published in HMS. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-24886) Support simple equality operations between MAP data types
[ https://issues.apache.org/jira/browse/HIVE-24886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stamatis Zampetakis reassigned HIVE-24886: -- > Support simple equality operations between MAP data types > - > > Key: HIVE-24886 > URL: https://issues.apache.org/jira/browse/HIVE-24886 > Project: Hive > Issue Type: New Feature > Components: Query Planning, Query Processor >Reporter: Stamatis Zampetakis >Assignee: Stamatis Zampetakis >Priority: Major > > Currently equality operations between MAP data types work in some very > limited cases e.g: > {code:sql} > create table table_map_types (id int, c1 map, c2 map); > select id from table_map_types where map(1,1) IN (map(1,1), map(1,2), > map(1,3)); > {code} > but this feature was never introduced explicitly (zero tests & JIRAs around > the subject) and the vast majority of queries involving comparisons between > MAP types now fail at compile time. > The goal of this issue is to support simple equality operations: > * EQUALS(=) > * NOT_EQUALS(<>), > * IN, > * IS DISTINCT FROM, > * IS NOT DISTINCT FROM > between MAP data types when the compared (MAP) types are identical. -- This message was sent by Atlassian Jira (v8.3.4#803005)
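The five operators listed above can be sketched in plain Java to make the intended results concrete. This is an assumption-laden illustration, not Hive's implementation: `MapEqualitySketch` and its methods are hypothetical, SQL NULL is modeled as Java `null` (with a `null` result meaning "unknown"), and NULL keys or values inside a map are not modeled.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch of SQL-style equality between two maps of the same type.
public final class MapEqualitySketch {
    private MapEqualitySketch() {}

    // "=": unknown (null) if either side is NULL, otherwise entry-wise equality.
    public static Boolean sqlEquals(Map<?, ?> a, Map<?, ?> b) {
        if (a == null || b == null) {
            return null;
        }
        return a.equals(b);
    }

    // "<>": negation of "=", still unknown when "=" is unknown.
    public static Boolean sqlNotEquals(Map<?, ?> a, Map<?, ?> b) {
        Boolean eq = sqlEquals(a, b);
        return eq == null ? null : Boolean.valueOf(!eq);
    }

    // IS NOT DISTINCT FROM: never unknown; two NULLs compare as equal.
    public static boolean isNotDistinctFrom(Map<?, ?> a, Map<?, ?> b) {
        return Objects.equals(a, b);
    }

    // IS DISTINCT FROM: negation of IS NOT DISTINCT FROM.
    public static boolean isDistinctFrom(Map<?, ?> a, Map<?, ?> b) {
        return !Objects.equals(a, b);
    }

    // IN: true on any match; unknown if no match but some comparison was unknown.
    public static Boolean in(Map<?, ?> needle, List<? extends Map<?, ?>> candidates) {
        boolean sawUnknown = false;
        for (Map<?, ?> candidate : candidates) {
            Boolean eq = sqlEquals(needle, candidate);
            if (eq == null) {
                sawUnknown = true;
            } else if (eq) {
                return Boolean.TRUE;
            }
        }
        return sawUnknown ? null : Boolean.FALSE;
    }
}
```

Under these assumptions, the query from the description corresponds to `in(Map.of(1, 1), List.of(Map.of(1, 1), Map.of(1, 2), Map.of(1, 3)))`, which matches on the first candidate.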
[jira] [Updated] (HIVE-24885) The state of unset low or high value in LongColumnStatsData can not be retrieved
[ https://issues.apache.org/jira/browse/HIVE-24885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen updated HIVE-24885: -- Description: During the work to improve Impala column stats to compute min/max for columns, it is found that the state of unset low or high value in LongColumnStatsData can not be retrieved back. This is illustrated in the following Impala test case added to MetastoreEventsProcessorTest. {code:java} @Test public void testUnsetAndCheckUnsetLowHighValue() throws CatalogException { try (MetaStoreClient msClient = catalog_.getMetaStoreClient()) { List colNames = new ArrayList(); colNames.add("id"); colNames.add("int_col"); colNames.add("bigint_col"); List colStatsObjs = msClient.getHiveClient().getTableColumnStatistics( "unique_database", "alltypes", colNames, "impala"); for (ColumnStatisticsObj colStatsObj : colStatsObjs) { ColumnStatisticsData colStatsData = colStatsObj.getStatsData(); LongColumnStatsData longColStatsData = colStatsData.getLongStats(); longColStatsData.unsetLowValue(); longColStatsData.unsetHighValue(); colStatsData.setLongStats(longColStatsData); } assertTrue("All good!", true); colStatsObjs = msClient.getHiveClient().getTableColumnStatistics( "unique_database", "alltypes", colNames, "impala"); for (ColumnStatisticsObj colStatsObj : colStatsObjs) { ColumnStatisticsData colStatsData = colStatsObj.getStatsData(); LongColumnStatsData longColStatsData = colStatsData.getLongStats(); assertFalse("isSetLowValue() should be false", longColStatsData.isSetLowValue()); assertFalse( "isSetHighValue() should be false", longColStatsData.isSetHighValue()); } assertTrue("All good!", true); } catch (NoSuchObjectException e) { assertFalse(String.format("No such object exception: %s", e), false); } catch (MetaException e) { assertFalse(String.format("Metadata exception: %s", e), false); } catch (TException e) { assertFalse(String.format("TException: %s", e), false); } } {code} The assertion on isSetLowValue() or 
isSetHighValue() should be false, since longColStatsData.unsetLowValue() is called in the first loop. To build the test, {code:java} mvn -f $IMPALA_HOME/fe/pom.xml test -e -Djava.compiler=NONE -ff -Dtest=MetastoreEventsProcessorTest#testUnsetAndCheckUnsetLowHighValue {code} Table unique_database.alltypes is defined as follows. {code:java} CREATE EXTERNAL TABLE unique_database.alltypes ( id INT, bool_col BOOLEAN, tinyint_col TINYINT, smallint_col SMALLINT, int_col INT, bigint_col BIGINT,
[jira] [Updated] (HIVE-24885) The state of unset low or high value in LongColumnStatsData can not be retrieved
[ https://issues.apache.org/jira/browse/HIVE-24885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen updated HIVE-24885: -- Description: During the work to improve Impala column stats to compute min/max for columns, it is found that the state of unset low or high value in LongColumnStatsData can not be retrieved back. This is illustrated in the following Impala test case added to MetastoreEventsProcessorTest. {code:java} @Test public void testUnsetAndCheckUnsetLowHighValue() throws CatalogException { try (MetaStoreClient msClient = catalog_.getMetaStoreClient()) { List colNames = new ArrayList(); colNames.add("id"); colNames.add("int_col"); colNames.add("bigint_col"); List colStatsObjs = msClient.getHiveClient().getTableColumnStatistics( "unique_database", "alltypes", colNames, "impala"); for (ColumnStatisticsObj colStatsObj : colStatsObjs) { ColumnStatisticsData colStatsData = colStatsObj.getStatsData(); LongColumnStatsData longColStatsData = colStatsData.getLongStats(); longColStatsData.unsetLowValue(); longColStatsData.unsetHighValue(); colStatsData.setLongStats(longColStatsData); } assertTrue("All good!", true); colStatsObjs = msClient.getHiveClient().getTableColumnStatistics( "unique_database", "alltypes", colNames, "impala"); for (ColumnStatisticsObj colStatsObj : colStatsObjs) { ColumnStatisticsData colStatsData = colStatsObj.getStatsData(); LongColumnStatsData longColStatsData = colStatsData.getLongStats(); assertFalse("isSetLowValue() should be false", longColStatsData.isSetLowValue()); assertFalse( "isSetHighValue() should be false", longColStatsData.isSetHighValue()); } assertTrue("All good!", true); } catch (NoSuchObjectException e) { assertFalse(String.format("No such object exception: %s", e), false); } catch (MetaException e) { assertFalse(String.format("Metadata exception: %s", e), false); } catch (TException e) { assertFalse(String.format("TException: %s", e), false); } } {code} The assertion on isSetLowValue() or 
isSetHighValue() should be false, since longColStatsData.unsetLowValue() is called in the first loop. To build the test, mvn -f $IMPALA_HOME/fe/pom.xml test -e -Djava.compiler=NONE -ff -Dtest=MetastoreEventsProcessorTest#testUnsetAndCheckUnsetLowHighValue Table unique_database.alltypes is defined as follows. CREATE EXTERNAL TABLE unique_database.alltypes ( id INT, bool_col BOOLEAN, tinyint_col TINYINT, smallint_col SMALLINT, int_col INT, bigint_col BIGINT, float_col FLOAT,
[jira] [Updated] (HIVE-24885) The state of unset low or high value in LongColumnStatsData can not be retrieved
[ https://issues.apache.org/jira/browse/HIVE-24885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen updated HIVE-24885: -- Description: During the work to improve Impala column stats to compute min/max for columns, it is found that the state of unset low or high value in LongColumnStatsData can not be retrieved back. This is illustrated in the following Impala test case added to MetastoreEventsProcessorTest. {code:java} @Test public void testUnsetAndCheckUnsetLowHighValue() throws CatalogException { try (MetaStoreClient msClient = catalog_.getMetaStoreClient()) { List colNames = new ArrayList(); colNames.add("id"); colNames.add("int_col"); colNames.add("bigint_col"); List colStatsObjs = msClient.getHiveClient().getTableColumnStatistics( "unique_database", "alltypes", colNames, "impala"); for (ColumnStatisticsObj colStatsObj : colStatsObjs) { ColumnStatisticsData colStatsData = colStatsObj.getStatsData(); LongColumnStatsData longColStatsData = colStatsData.getLongStats(); longColStatsData.unsetLowValue(); longColStatsData.unsetHighValue(); colStatsData.setLongStats(longColStatsData); } assertTrue("All good!", true); colStatsObjs = msClient.getHiveClient().getTableColumnStatistics( "unique_database", "alltypes", colNames, "impala"); for (ColumnStatisticsObj colStatsObj : colStatsObjs) { ColumnStatisticsData colStatsData = colStatsObj.getStatsData(); LongColumnStatsData longColStatsData = colStatsData.getLongStats(); assertFalse("isSetLowValue() should be false", longColStatsData.isSetLowValue()); assertFalse( "isSetHighValue() should be false", longColStatsData.isSetHighValue()); } assertTrue("All good!", true); } catch (NoSuchObjectException e) { assertFalse(String.format("No such object exception: %s", e), false); } catch (MetaException e) { assertFalse(String.format("Metadata exception: %s", e), false); } catch (TException e) { assertFalse(String.format("TException: %s", e), false); } } {code:java} The assertion on isSetLowValue() or 
isSetHighValue() should be false, since longColStatsData.unsetLowValue() is called in the first loop. To build the test, mvn -f $IMPALA_HOME/fe/pom.xml test -e -Djava.compiler=NONE -ff -Dtest=MetastoreEventsProcessorTest#testUnsetAndCheckUnsetLowHighValue Table unique_database.alltypes is defined as follows. CREATE EXTERNAL TABLE unique_database.alltypes ( id INT, bool_col BOOLEAN, tinyint_col TINYINT, smallint_col SMALLINT, int_col INT, bigint_col BIGINT, float_col FLOAT,
[jira] [Updated] (HIVE-24885) The state of unset low or high value in LongColumnStatsData can not be retrieved
[ https://issues.apache.org/jira/browse/HIVE-24885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen updated HIVE-24885: -- Description: During the work to improve Impala column stats to compute min/max for columns, it is found that the state of unset low or high value in LongColumnStatsData can not be retrieved back. This is illustrated in the following Impala test case added to MetastoreEventsProcessorTest. @Test public void testUnsetAndCheckUnsetLowHighValue() throws CatalogException { try (MetaStoreClient msClient = catalog_.getMetaStoreClient()) { List colNames = new ArrayList(); colNames.add("id"); colNames.add("int_col"); colNames.add("bigint_col"); List colStatsObjs = msClient.getHiveClient().getTableColumnStatistics( "unique_database", "alltypes", colNames, "impala"); for (ColumnStatisticsObj colStatsObj : colStatsObjs) { ColumnStatisticsData colStatsData = colStatsObj.getStatsData(); LongColumnStatsData longColStatsData = colStatsData.getLongStats(); longColStatsData.unsetLowValue(); longColStatsData.unsetHighValue(); colStatsData.setLongStats(longColStatsData); } assertTrue("All good!", true); colStatsObjs = msClient.getHiveClient().getTableColumnStatistics( "unique_database", "alltypes", colNames, "impala"); for (ColumnStatisticsObj colStatsObj : colStatsObjs) { ColumnStatisticsData colStatsData = colStatsObj.getStatsData(); LongColumnStatsData longColStatsData = colStatsData.getLongStats(); assertFalse("isSetLowValue() should be false", longColStatsData.isSetLowValue()); assertFalse( "isSetHighValue() should be false", longColStatsData.isSetHighValue()); } assertTrue("All good!", true); } catch (NoSuchObjectException e) { assertFalse(String.format("No such object exception: %s", e), false); } catch (MetaException e) { assertFalse(String.format("Metadata exception: %s", e), false); } catch (TException e) { assertFalse(String.format("TException: %s", e), false); } } The assertion on isSetLowValue() or isSetHighValue() should 
be false, since longColStatsData.unsetLowValue() is called in the first loop. To build the test, mvn -f $IMPALA_HOME/fe/pom.xml test -e -Djava.compiler=NONE -ff -Dtest=MetastoreEventsProcessorTest#testUnsetAndCheckUnsetLowHighValue Table unique_database.alltypes is defined as follows. CREATE EXTERNAL TABLE unique_database.alltypes ( id INT, bool_col BOOLEAN, tinyint_col TINYINT, smallint_col SMALLINT, int_col INT, bigint_col BIGINT, float_col FLOAT,
[jira] [Updated] (HIVE-24885) The state of unset low or high value in LongColumnStatsData can not be retrieved
[ https://issues.apache.org/jira/browse/HIVE-24885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qifan Chen updated HIVE-24885: -- Description: During the work to improve Impala column stats to compute min/max for columns, it is found that the state of unset low or high value in LongColumnStatsData can not be retrieved back. This is illustrated in the following Impala test case added to MetastoreEventsProcessorTest. /** * Unset the low and the high value first and then check. */ @Test public void testUnsetAndCheckUnsetLowHighValue() throws CatalogException { try (MetaStoreClient msClient = catalog_.getMetaStoreClient()) { List colNames = new ArrayList(); colNames.add("id"); colNames.add("int_col"); colNames.add("bigint_col"); List colStatsObjs = msClient.getHiveClient().getTableColumnStatistics( "unique_database", "alltypes", colNames, "impala"); for (ColumnStatisticsObj colStatsObj : colStatsObjs) { ColumnStatisticsData colStatsData = colStatsObj.getStatsData(); LongColumnStatsData longColStatsData = colStatsData.getLongStats(); longColStatsData.unsetLowValue(); longColStatsData.unsetHighValue(); colStatsData.setLongStats(longColStatsData); } assertTrue("All good!", true); colStatsObjs = msClient.getHiveClient().getTableColumnStatistics( "unique_database", "alltypes", colNames, "impala"); for (ColumnStatisticsObj colStatsObj : colStatsObjs) { ColumnStatisticsData colStatsData = colStatsObj.getStatsData(); LongColumnStatsData longColStatsData = colStatsData.getLongStats(); assertFalse("isSetLowValue() should be false", longColStatsData.isSetLowValue()); assertFalse( "isSetHighValue() should be false", longColStatsData.isSetHighValue()); } assertTrue("All good!", true); } catch (NoSuchObjectException e) { assertFalse(String.format("No such object exception: %s", e), false); } catch (MetaException e) { assertFalse(String.format("Metadata exception: %s", e), false); } catch (TException e) { assertFalse(String.format("TException: %s", e), false); 
} } The assertion on isSetLowValue() or isSetHighValue() should be false, since longColStatsData.unsetLowValue() is called in the first loop. To build the test, mvn -f $IMPALA_HOME/fe/pom.xml test -e -Djava.compiler=NONE -ff -Dtest=MetastoreEventsProcessorTest#testUnsetAndCheckUnsetLowHighValue Table unique_database.alltypes is defined as follows. CREATE EXTERNAL TABLE unique_database.alltypes ( id INT, bool_col BOOLEAN, tinyint_col TINYINT, smallint_col SMALLINT, int_col INT,
[jira] [Work logged] (HIVE-24590) Operation Logging still leaks the log4j Appenders
[ https://issues.apache.org/jira/browse/HIVE-24590?focusedWorklogId=566165&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566165 ] ASF GitHub Bot logged work on HIVE-24590: - Author: ASF GitHub Bot Created on: 15/Mar/21 12:54 Start Date: 15/Mar/21 12:54 Worklog Time Spent: 10m Work Description: EugeneChung edited a comment on pull request #1849: URL: https://github.com/apache/hive/pull/1849#issuecomment-799393602 @zabetak Yes, I forgot to let you know. It worked well, but in my company's repo. I chose to clear all the log4j MDC. The leak and the incorrect operation logging have gone away. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 566165) Time Spent: 1.5h (was: 1h 20m) > Operation Logging still leaks the log4j Appenders > - > > Key: HIVE-24590 > URL: https://issues.apache.org/jira/browse/HIVE-24590 > Project: Hive > Issue Type: Bug > Components: Logging >Reporter: Eugene Chung >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot > 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen > Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, > Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch > > Time Spent: 1.5h > Remaining Estimate: 0h > > I'm using Hive 3.1.2 with options below. > * hive.server2.logging.operation.enabled=true > * hive.server2.logging.operation.level=VERBOSE > * hive.async.log.enabled=false > I already know the ticket, https://issues.apache.org/jira/browse/HIVE-17128 > but HS2 still leaks log4j RandomAccessFileManager. > !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197! 
> I checked the operation log file which is not closed/deleted properly. > !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272! > Then there's the log, > {code:java} > client.TezClient: Shutting down Tez Session, sessionName= {code} > !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24590) Operation Logging still leaks the log4j Appenders
[ https://issues.apache.org/jira/browse/HIVE-24590?focusedWorklogId=566163&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566163 ] ASF GitHub Bot logged work on HIVE-24590: - Author: ASF GitHub Bot Created on: 15/Mar/21 12:51 Start Date: 15/Mar/21 12:51 Worklog Time Spent: 10m Work Description: EugeneChung commented on pull request #1849: URL: https://github.com/apache/hive/pull/1849#issuecomment-799393602 @zabetak Yes, I forgot to let you know. It worked well, but in my company's repo. I chose to clear all the log4j MDC. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 566163) Time Spent: 1h 20m (was: 1h 10m) > Operation Logging still leaks the log4j Appenders > - > > Key: HIVE-24590 > URL: https://issues.apache.org/jira/browse/HIVE-24590 > Project: Hive > Issue Type: Bug > Components: Logging >Reporter: Eugene Chung >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot > 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen > Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, > Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch > > Time Spent: 1h 20m > Remaining Estimate: 0h > > I'm using Hive 3.1.2 with options below. > * hive.server2.logging.operation.enabled=true > * hive.server2.logging.operation.level=VERBOSE > * hive.async.log.enabled=false > I already know the ticket, https://issues.apache.org/jira/browse/HIVE-17128 > but HS2 still leaks log4j RandomAccessFileManager. > !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197! > I checked the operation log file which is not closed/deleted properly. 
> !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272! > Then there's the log, > {code:java} > client.TezClient: Shutting down Tez Session, sessionName= {code} > !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24590) Operation Logging still leaks the log4j Appenders
[ https://issues.apache.org/jira/browse/HIVE-24590?focusedWorklogId=566128&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566128 ] ASF GitHub Bot logged work on HIVE-24590: - Author: ASF GitHub Bot Created on: 15/Mar/21 11:16 Start Date: 15/Mar/21 11:16 Worklog Time Spent: 10m Work Description: zabetak commented on pull request #1849: URL: https://github.com/apache/hive/pull/1849#issuecomment-799336894 Hey @EugeneChung did you have a change to try out this fix? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 566128) Time Spent: 1h (was: 50m) > Operation Logging still leaks the log4j Appenders > - > > Key: HIVE-24590 > URL: https://issues.apache.org/jira/browse/HIVE-24590 > Project: Hive > Issue Type: Bug > Components: Logging >Reporter: Eugene Chung >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot > 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen > Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, > Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch > > Time Spent: 1h > Remaining Estimate: 0h > > I'm using Hive 3.1.2 with options below. > * hive.server2.logging.operation.enabled=true > * hive.server2.logging.operation.level=VERBOSE > * hive.async.log.enabled=false > I already know the ticket, https://issues.apache.org/jira/browse/HIVE-17128 > but HS2 still leaks log4j RandomAccessFileManager. > !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197! > I checked the operation log file which is not closed/deleted properly. > !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272! 
> Then there's the log, > {code:java} > client.TezClient: Shutting down Tez Session, sessionName= {code} > !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24590) Operation Logging still leaks the log4j Appenders
[ https://issues.apache.org/jira/browse/HIVE-24590?focusedWorklogId=566129&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566129 ] ASF GitHub Bot logged work on HIVE-24590: - Author: ASF GitHub Bot Created on: 15/Mar/21 11:16 Start Date: 15/Mar/21 11:16 Worklog Time Spent: 10m Work Description: zabetak edited a comment on pull request #1849: URL: https://github.com/apache/hive/pull/1849#issuecomment-799336894 Hey @EugeneChung did you have a chance to try out this fix? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 566129) Time Spent: 1h 10m (was: 1h) > Operation Logging still leaks the log4j Appenders > - > > Key: HIVE-24590 > URL: https://issues.apache.org/jira/browse/HIVE-24590 > Project: Hive > Issue Type: Bug > Components: Logging >Reporter: Eugene Chung >Assignee: Stamatis Zampetakis >Priority: Major > Labels: pull-request-available > Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot > 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen > Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, > Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch > > Time Spent: 1h 10m > Remaining Estimate: 0h > > I'm using Hive 3.1.2 with options below. > * hive.server2.logging.operation.enabled=true > * hive.server2.logging.operation.level=VERBOSE > * hive.async.log.enabled=false > I already know the ticket, https://issues.apache.org/jira/browse/HIVE-17128 > but HS2 still leaks log4j RandomAccessFileManager. > !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197! > I checked the operation log file which is not closed/deleted properly. > !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272! 
> Then there's the log, > {code:java} > client.TezClient: Shutting down Tez Session, sessionName= {code} > !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24871) Initiator / Cleaner performance metrics
[ https://issues.apache.org/jira/browse/HIVE-24871?focusedWorklogId=566057&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566057 ] ASF GitHub Bot logged work on HIVE-24871: - Author: ASF GitHub Bot Created on: 15/Mar/21 08:01 Start Date: 15/Mar/21 08:01 Worklog Time Spent: 10m Work Description: deniskuzZ merged pull request #2061: URL: https://github.com/apache/hive/pull/2061 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 566057) Time Spent: 1h (was: 50m) > Initiator / Cleaner performance metrics > --- > > Key: HIVE-24871 > URL: https://issues.apache.org/jira/browse/HIVE-24871 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > The PerformanceLogger should be used in the Initiator and Cleaner services. > * One cycle of the Initiator should be measured, ignoring the time spent > waiting on the lock for the AUX table > * One compaction cleanup should be measured in the Cleaner (using a different > metric for major and minor compaction cleanup) > Important note: the PerformanceLogger implementation from the metastore should be > used (not the ql one); otherwise the metric won't be published in HMS. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24718) Moving to file based iteration for copying data
[ https://issues.apache.org/jira/browse/HIVE-24718?focusedWorklogId=566051&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566051 ] ASF GitHub Bot logged work on HIVE-24718: - Author: ASF GitHub Bot Created on: 15/Mar/21 07:39 Start Date: 15/Mar/21 07:39 Worklog Time Spent: 10m Work Description: ArkoSharma commented on a change in pull request #1936: URL: https://github.com/apache/hive/pull/1936#discussion_r594106033 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcrossInstances.java ## @@ -2225,17 +2224,11 @@ private void setupUDFJarOnHDFS(Path identityUdfLocalPath, Path identityUdfHdfsPa /* * Method used from TestReplicationScenariosExclusiveReplica */ - private void assertExternalFileInfo(List expected, String dumplocation, boolean isIncremental, + private void assertExternalFileList(List expected, String dumplocation, WarehouseInstance warehouseInstance) throws IOException { Path hivePath = new Path(dumplocation, ReplUtils.REPL_HIVE_BASE_DIR); -Path metadataPath = new Path(hivePath, EximUtil.METADATA_PATH_NAME); -Path externalTableInfoFile; -if (isIncremental) { - externalTableInfoFile = new Path(hivePath, FILE_NAME); -} else { - externalTableInfoFile = new Path(metadataPath, primaryDbName.toLowerCase() + File.separator + FILE_NAME); -} -ReplicationTestUtils.assertExternalFileInfo(warehouseInstance, expected, externalTableInfoFile); +Path externalTblFileList = new Path(hivePath, EximUtil.FILE_LIST_EXTERNAL); Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 566051) Time Spent: 6.5h (was: 6h 20m) > Moving to file based iteration for copying data > --- > > Key: HIVE-24718 > URL: https://issues.apache.org/jira/browse/HIVE-24718 > Project: Hive > Issue Type: Bug >Reporter: Arko Sharma >Assignee: Arko Sharma >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24718.01.patch, HIVE-24718.02.patch, > HIVE-24718.04.patch, HIVE-24718.05.patch, HIVE-24718.06.patch > > Time Spent: 6.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24718) Moving to file based iteration for copying data
[ https://issues.apache.org/jira/browse/HIVE-24718?focusedWorklogId=566050&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566050 ] ASF GitHub Bot logged work on HIVE-24718: - Author: ASF GitHub Bot Created on: 15/Mar/21 07:39 Start Date: 15/Mar/21 07:39 Worklog Time Spent: 10m Work Description: ArkoSharma commented on a change in pull request #1936: URL: https://github.com/apache/hive/pull/1936#discussion_r594105835 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTablesMetaDataOnly.java ## @@ -639,9 +629,11 @@ public void testIncrementalDumpEmptyDumpDirectory() throws Throwable { .verifyResult(inc2Tuple.lastReplicationId); } - private void assertFalseExternalFileList(Path externalTableFileList) - throws IOException { + private void assertFalseExternalFileList(String dumpLocation) Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 566050) Time Spent: 6h 20m (was: 6h 10m) > Moving to file based iteration for copying data > --- > > Key: HIVE-24718 > URL: https://issues.apache.org/jira/browse/HIVE-24718 > Project: Hive > Issue Type: Bug >Reporter: Arko Sharma >Assignee: Arko Sharma >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24718.01.patch, HIVE-24718.02.patch, > HIVE-24718.04.patch, HIVE-24718.05.patch, HIVE-24718.06.patch > > Time Spent: 6h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24718) Moving to file based iteration for copying data
[ https://issues.apache.org/jira/browse/HIVE-24718?focusedWorklogId=566049&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566049 ] ASF GitHub Bot logged work on HIVE-24718: - Author: ASF GitHub Bot Created on: 15/Mar/21 07:38 Start Date: 15/Mar/21 07:38 Worklog Time Spent: 10m Work Description: ArkoSharma commented on a change in pull request #1936: URL: https://github.com/apache/hive/pull/1936#discussion_r594105742 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosExternalTablesMetaDataOnly.java ## @@ -639,9 +629,11 @@ public void testIncrementalDumpEmptyDumpDirectory() throws Throwable { .verifyResult(inc2Tuple.lastReplicationId); } - private void assertFalseExternalFileList(Path externalTableFileList) - throws IOException { + private void assertFalseExternalFileList(String dumpLocation) + throws IOException { Review comment: getFileSystem() call throws IOException. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 566049) Time Spent: 6h 10m (was: 6h) > Moving to file based iteration for copying data > --- > > Key: HIVE-24718 > URL: https://issues.apache.org/jira/browse/HIVE-24718 > Project: Hive > Issue Type: Bug >Reporter: Arko Sharma >Assignee: Arko Sharma >Priority: Major > Labels: pull-request-available > Attachments: HIVE-24718.01.patch, HIVE-24718.02.patch, > HIVE-24718.04.patch, HIVE-24718.05.patch, HIVE-24718.06.patch > > Time Spent: 6h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)