[jira] [Commented] (HIVE-22081) Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
[ https://issues.apache.org/jira/browse/HIVE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973063#comment-16973063 ]

Vipin Vishvkarma commented on HIVE-22081:

[~Rajkumar Singh] Will there be any performance improvement with this change? I don't see changes related to point 2 of the description in the final change, and we have used stream(), which is sequential in nature. I may be missing something here; can you please confirm?

> Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
> --
>
> Key: HIVE-22081
> URL: https://issues.apache.org/jira/browse/HIVE-22081
> Project: Hive
> Issue Type: Improvement
> Components: Transactions
> Affects Versions: 3.1.1
> Reporter: Rajkumar Singh
> Assignee: Rajkumar Singh
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-21917.01.patch, HIVE-21917.02.patch, HIVE-21917.03.patch, HIVE-22081.04.patch, HIVE-22081.patch
>
>
> If automatic compaction is turned on, the Initiator thread checks for tables/partitions that are potentially eligible for compaction and runs some checks in a for loop before requesting compaction for the eligible ones. Though the Initiator thread is configured to run at a 5 min interval by default, with many objects it keeps on running, as these checks are IO intensive and hog CPU.
> In the proposed changes, I am planning to do the following:
> 1. Pass fewer objects to the for loop by filtering out objects based on the conditions that are checked within the loop.
> 2. Make an async call using a Future to determine the compaction type (this is where we do FileSystem calls).

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
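The two points in the issue description, and the question about stream() being sequential, can be illustrated with a small sketch. It is not the code from any attached patch: `InitiatorSketch`, `Candidate`, `filterEligible`, `determineCompactionType` and `determineTypesAsync` are hypothetical names, and the "expensive" check is a placeholder for the FileSystem calls the real Initiator makes. The sketch only shows the general shape: a cheap filter first (point 1), then the costly check submitted to a thread pool through futures (point 2), as opposed to running it inside a sequential stream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class InitiatorSketch {
    // Hypothetical stand-in for a compaction candidate; not Hive's CompactionInfo.
    static final class Candidate {
        final String name;
        final boolean compactionDisabled;

        Candidate(String name, boolean compactionDisabled) {
            this.name = name;
            this.compactionDisabled = compactionDisabled;
        }
    }

    // Point 1: apply the cheap, in-memory conditions before the expensive loop.
    static List<Candidate> filterEligible(List<Candidate> all) {
        return all.stream()
                .filter(c -> !c.compactionDisabled)
                .collect(Collectors.toList());
    }

    // Placeholder for the IO-heavy decision (assumption: the real one does FileSystem calls).
    static String determineCompactionType(Candidate c) {
        return c.name.length() % 2 == 0 ? "MAJOR" : "MINOR";
    }

    // Point 2: submit every check to a pool, then collect results in submission order.
    static List<String> determineTypesAsync(List<Candidate> eligible, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<String>> futures = new ArrayList<>();
            for (Candidate c : eligible) {
                futures.add(CompletableFuture.supplyAsync(
                        () -> c.name + ":" + determineCompactionType(c), pool));
            }
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> f : futures) {
                results.add(f.join()); // blocks until that check has finished
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Candidate> all = new ArrayList<>();
        all.add(new Candidate("t1", false));
        all.add(new Candidate("t2", true));
        all.add(new Candidate("tbl", false));
        System.out.println(determineTypesAsync(filterEligible(all), 2));
    }
}
```

With a sequential stream, each check waits for the previous one, so only the filter in point 1 reduces the work. The futures variant lets several IO-bound checks overlap, at the cost of extra concurrent load on the file system.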
[jira] [Commented] (HIVE-22081) Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
[ https://issues.apache.org/jira/browse/HIVE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908606#comment-16908606 ]

Hive QA commented on HIVE-22081:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12977728/HIVE-22081.04.patch

ERROR: -1 due to no test(s) being added or modified.

SUCCESS: +1 due to 16740 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/18353/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18353/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18353/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12977728 - PreCommit-HIVE-Build

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
[jira] [Commented] (HIVE-22081) Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
[ https://issues.apache.org/jira/browse/HIVE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908588#comment-16908588 ]

Hive QA commented on HIVE-22081:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 8m 45s | master passed |
| +1 | compile | 1m 11s | master passed |
| +1 | checkstyle | 0m 44s | master passed |
| 0 | findbugs | 4m 11s | ql in master has 2251 extant Findbugs warnings. |
| +1 | javadoc | 1m 0s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 28s | the patch passed |
| +1 | compile | 1m 8s | the patch passed |
| +1 | javac | 1m 8s | the patch passed |
| +1 | checkstyle | 0m 43s | ql: The patch generated 0 new + 24 unchanged - 1 fixed = 24 total (was 25) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 4m 20s | the patch passed |
| +1 | javadoc | 1m 0s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 25m 17s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-18353/dev-support/hive-personality.sh |
| git revision | master / 28f2340 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-18353/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-22081) Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
[ https://issues.apache.org/jira/browse/HIVE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908484#comment-16908484 ]

Hive QA commented on HIVE-22081:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12977725/HIVE-21917.03.patch

ERROR: -1 due to no test(s) being added or modified.

SUCCESS: +1 due to 16740 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/18351/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18351/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18351/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12977725 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22081) Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
[ https://issues.apache.org/jira/browse/HIVE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908456#comment-16908456 ]

Hive QA commented on HIVE-22081:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 8m 50s | master passed |
| +1 | compile | 1m 9s | master passed |
| +1 | checkstyle | 0m 40s | master passed |
| 0 | findbugs | 4m 8s | ql in master has 2251 extant Findbugs warnings. |
| +1 | javadoc | 1m 2s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 32s | the patch passed |
| +1 | compile | 1m 9s | the patch passed |
| +1 | javac | 1m 9s | the patch passed |
| -1 | checkstyle | 0m 40s | ql: The patch generated 1 new + 24 unchanged - 1 fixed = 25 total (was 25) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 4m 22s | the patch passed |
| +1 | javadoc | 1m 1s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 25m 20s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-18351/dev-support/hive-personality.sh |
| git revision | master / 28f2340 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-18351/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-18351/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-22081) Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
[ https://issues.apache.org/jira/browse/HIVE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908407#comment-16908407 ]

Rajkumar Singh commented on HIVE-22081:

Thanks [~pvary], I have uploaded the fresh patch with the suggested changes for a clean run.
[jira] [Commented] (HIVE-22081) Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
[ https://issues.apache.org/jira/browse/HIVE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907870#comment-16907870 ]

Peter Vary commented on HIVE-22081:

[~Rajkumar Singh]: Mostly only nits, but having the same style for code is the first step to better code:
* Please fix checkstyle errors
* Every if should look like this (space before and after the parentheses):
{code:java}
if (isCompactDisabled) {
{code}
* Let me backpedal on my previous ask, and set this back to INFO (as this was info before):
{code:java}
LOG.debug("Compaction is disabled for table " + tbl.getTableName());
{code}
* This should be private, since nobody uses it, and static, since it does not use any member variables:
{code:java}
public boolean checkDynPartitioning(Table t, CompactionInfo ci){
{code}
* Please add spaces around + when concatenating strings:
{code:java}
LOG.error("Caught Exception while checking compactiton eligibility "+StringUtils.stringifyException(e));
{code}

Otherwise +1 LGTM
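As an illustration of Peter Vary's review comments in this message, the following minimal sketch applies them to a toy class. Everything in it is an assumption made for the example: `StyleSketch`, the nested `Table` and `CompactionInfo` stand-ins, the `shouldSkip` helper, and the body of `checkDynPartitioning` are hypothetical and are not taken from the Hive source or from any of the attached patches. `java.util.logging` is used only to keep the example self-contained.

```java
import java.util.logging.Logger;

class StyleSketch {
    private static final Logger LOG = Logger.getLogger(StyleSketch.class.getName());

    // Hypothetical stand-ins; the real Hive classes carry far more state.
    static final class Table {
        final String name;
        final boolean partitioned;

        Table(String name, boolean partitioned) {
            this.name = name;
            this.partitioned = partitioned;
        }
    }

    static final class CompactionInfo {
        final String partName;

        CompactionInfo(String partName) {
            this.partName = partName;
        }
    }

    // Review comment: private (no outside callers) and static (no member state used).
    // The condition itself is an assumption made for this sketch.
    private static boolean checkDynPartitioning(Table t, CompactionInfo ci) {
        return t.partitioned && ci.partName == null;
    }

    static boolean shouldSkip(Table t, CompactionInfo ci, boolean isCompactDisabled) {
        // Review comment: a space before and after the parentheses of every if.
        if (isCompactDisabled) {
            // Review comments: INFO level, and spaces around + in concatenation.
            LOG.info("Compaction is disabled for table " + t.name);
            return true;
        }
        return checkDynPartitioning(t, ci);
    }
}
```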
[jira] [Commented] (HIVE-22081) Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
[ https://issues.apache.org/jira/browse/HIVE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907679#comment-16907679 ]

Hive QA commented on HIVE-22081:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12977629/HIVE-21917.02.patch

ERROR: -1 due to no test(s) being added or modified.

SUCCESS: +1 due to 16739 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/18340/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18340/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18340/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12977629 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22081) Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
[ https://issues.apache.org/jira/browse/HIVE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907661#comment-16907661 ]

Hive QA commented on HIVE-22081:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 9m 7s | master passed |
| +1 | compile | 1m 9s | master passed |
| +1 | checkstyle | 0m 42s | master passed |
| 0 | findbugs | 4m 8s | ql in master has 2251 extant Findbugs warnings. |
| +1 | javadoc | 1m 3s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 1m 29s | the patch passed |
| +1 | compile | 1m 13s | the patch passed |
| +1 | javac | 1m 13s | the patch passed |
| -1 | checkstyle | 0m 42s | ql: The patch generated 4 new + 24 unchanged - 1 fixed = 28 total (was 25) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 4m 20s | the patch passed |
| +1 | javadoc | 0m 59s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 25m 40s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-18340/dev-support/hive-personality.sh |
| git revision | master / 71605e6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-18340/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-18340/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-22081) Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
[ https://issues.apache.org/jira/browse/HIVE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900922#comment-16900922 ]

Hive QA commented on HIVE-22081:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12976762/HIVE-21917.01.patch

ERROR: -1 due to no test(s) being added or modified.

SUCCESS: +1 due to 16723 tests passed

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/18269/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18269/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18269/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12976762 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22081) Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
[ https://issues.apache.org/jira/browse/HIVE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900893#comment-16900893 ]

Hive QA commented on HIVE-22081:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
|| || || || master Compile Tests ||
| 0 | mvndep | 1m 48s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 5s | master passed |
| +1 | compile | 1m 27s | master passed |
| +1 | checkstyle | 0m 53s | master passed |
| 0 | findbugs | 0m 41s | serde in master has 193 extant Findbugs warnings. |
| 0 | findbugs | 4m 1s | ql in master has 2250 extant Findbugs warnings. |
| +1 | javadoc | 1m 19s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 27s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 44s | the patch passed |
| +1 | compile | 1m 27s | the patch passed |
| +1 | javac | 1m 27s | the patch passed |
| -1 | checkstyle | 0m 39s | ql: The patch generated 12 new + 23 unchanged - 2 fixed = 35 total (was 25) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 5m 12s | the patch passed |
| +1 | javadoc | 1m 16s | the patch passed |
|| || || || Other Tests ||
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 29m 11s | |

|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-18269/dev-support/hive-personality.sh |
| git revision | master / 4510efd |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-18269/yetus/diff-checkstyle-ql.txt |
| modules | C: serde ql U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-18269/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HIVE-22081) Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
[ https://issues.apache.org/jira/browse/HIVE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899688#comment-16899688 ]

Rajkumar Singh commented on HIVE-22081:
---

{quote}Is this for cases where the automatic compaction was turned off for a while, and then someone turns that on later?{quote}

Yes, that's right. In addition, starting with Hive 3, managed tables are ACID by default, so users who upgrade to Hive 3 will see many more managed ACID tables. Currently, org.apache.hadoop.hive.ql.txn.compactor.Initiator#checkForCompaction performs many blocking HDFS operations, which is time-consuming. Per your suggestion, I will review which objects/results can be cached to make it more efficient. I will upload a new patch that addresses the checkstyle warnings and the test failures.

Thanks

-- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (HIVE-22081) Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
[ https://issues.apache.org/jira/browse/HIVE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899672#comment-16899672 ]

Peter Vary commented on HIVE-22081:
---

[~Rajkumar Singh]: Is this for cases where the automatic compaction was turned off for a while, and then someone turns that on later? So we have a large number of tables because changes accumulated before automatic compaction was turned on. In this case, splitting the jobs across multiple threads is really useful.

On the other hand, if there are so many changes within 5 minutes that it takes more than 5 minutes to check whether compaction is needed, then we might need to consider some other way to calculate or cache the check results. Splitting the tasks across multiple threads could help, but the work is still a CPU hog and IO intensive.

Also, please consider fixing the checkstyle warnings.

Thanks,
Peter
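
For illustration only, the two steps proposed in the issue description (filter out ineligible objects first, then run the file-system-bound check asynchronously) could be sketched roughly as below. This is not the code from any attached patch: the class InitiatorSketch, the Candidate type, the determineType method, and the delta thresholds are all hypothetical stand-ins, and the real check in Initiator#checkForCompaction is considerably more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

// Hypothetical sketch; none of these names come from the Hive code base.
class InitiatorSketch {
  enum CompactionType { NONE, MINOR, MAJOR }

  /** Stand-in for a table or partition that may need compaction. */
  static final class Candidate {
    final String name;
    final boolean noAutoCompact; // cheap, in-memory eligibility flag
    final int deltaCount;        // stands in for what a file-system scan would report

    Candidate(String name, boolean noAutoCompact, int deltaCount) {
      this.name = name;
      this.noAutoCompact = noAutoCompact;
      this.deltaCount = deltaCount;
    }
  }

  /** Stand-in for the expensive, blocking check; thresholds are arbitrary. */
  static CompactionType determineType(Candidate c) {
    if (c.deltaCount >= 10) {
      return CompactionType.MAJOR;
    }
    if (c.deltaCount >= 3) {
      return CompactionType.MINOR;
    }
    return CompactionType.NONE;
  }

  static List<String> requestCompactions(List<Candidate> candidates, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // Step 1: drop candidates that a cheap check already rules out.
      List<Candidate> eligible = candidates.stream()
          .filter(c -> !c.noAutoCompact)
          .collect(Collectors.toList());

      // Step 2: run the expensive check for each remaining candidate on the pool.
      List<CompletableFuture<String>> futures = new ArrayList<>();
      for (Candidate c : eligible) {
        futures.add(CompletableFuture.supplyAsync(
            () -> c.name + ":" + determineType(c), pool));
      }

      // Collect results in submission order, skipping candidates that need nothing.
      List<String> requests = new ArrayList<>();
      for (CompletableFuture<String> f : futures) {
        String result = f.join();
        if (!result.endsWith(":" + CompactionType.NONE)) {
          requests.add(result);
        }
      }
      return requests;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<Candidate> input = Arrays.asList(
        new Candidate("db.t1", false, 12),
        new Candidate("db.t2", true, 50),
        new Candidate("db.t3", false, 4),
        new Candidate("db.t4", false, 0));
    System.out.println(requestCompactions(input, 4));
  }
}
```

A bounded pool spreads the blocking calls over several threads, but, as noted in the comment above, the total CPU and IO cost stays the same; only the wall-clock time of one Initiator cycle is likely to shrink.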
[jira] [Commented] (HIVE-22081) Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
[ https://issues.apache.org/jira/browse/HIVE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899371#comment-16899371 ]

Hive QA commented on HIVE-22081:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12976591/HIVE-22081.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 16723 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.TestTxnCommands2.testInitiatorWithMultipleFailedCompactions (batchId=331)
org.apache.hadoop.hive.ql.TestTxnCommands2WithSplitUpdateAndVectorization.testInitiatorWithMultipleFailedCompactions (batchId=345)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.dynamicPartitioningDelete (batchId=246)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.dynamicPartitioningInsert (batchId=246)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.dynamicPartitioningUpdate (batchId=246)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.schemaEvolutionAddColDynamicPartitioningInsert (batchId=246)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.schemaEvolutionAddColDynamicPartitioningUpdate (batchId=246)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.testDisableCompactionDuringReplLoad (batchId=246)
org.apache.hadoop.hive.ql.txn.compactor.TestCompactor.testTableProperties (batchId=246)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.chooseMajorOverMinorWhenBothValid (batchId=318)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.compactPartitionHighDeltaPct (batchId=318)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.compactPartitionTooManyDeltas (batchId=318)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.compactTableHighDeltaPct (batchId=318)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.compactTableTooManyDeltas (batchId=318)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.enoughDeltasNoBase (batchId=318)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.majorCompactOnPartitionTooManyAborts (batchId=318)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.majorCompactOnTableTooManyAborts (batchId=318)
org.apache.hadoop.hive.ql.txn.compactor.TestInitiator.twoTxnsOnSamePartitionGenerateOneCompactionRequest (batchId=318)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/18250/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/18250/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-18250/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12976591 - PreCommit-HIVE-Build
[jira] [Commented] (HIVE-22081) Hivemetastore Performance: Compaction Initiator Thread overwhelmed if there are too many Table/partitions are eligible for compaction
[ https://issues.apache.org/jira/browse/HIVE-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16899363#comment-16899363 ]

Hive QA commented on HIVE-22081:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 9s{color} | {color:blue} ql in master has 2250 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 41s{color} | {color:red} ql: The patch generated 15 new + 23 unchanged - 2 fixed = 38 total (was 25) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 0s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-18250/dev-support/hive-personality.sh |
| git revision | master / d7475aa |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-18250/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-18250/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.