[jira] [Commented] (HADOOP-15598) DataChecksum calculate checksum is contented on hashtable synchronization
[ https://issues.apache.org/jira/browse/HADOOP-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545626#comment-16545626 ]

Hudson commented on HADOOP-15598:
---------------------------------

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #14582 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14582/])
HADOOP-15598. DataChecksum calculate checksum is contented on hashtable (weichiu: rev 0c7a578927032d5d1ef3469283d7d1fb7dee2a56)
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/NativeCrc32.java

> DataChecksum calculate checksum is contented on hashtable synchronization
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-15598
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15598
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common
>    Affects Versions: 3.2.0, 3.1.1
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Major
>             Fix For: 3.2.0, 3.1.1, 3.0.4
>
>         Attachments: HADOOP-15598.1.patch, HADOOP-15598.1.patch, Screen Shot
> 2018-07-11 at 1.45.06 AM.png, Screen Shot 2018-07-11 at 2.01.54 AM.png,
> hadoop-sync-contention.svg
>
>
> When profiling a multi-threaded hive streaming ingest, observed lock
> contention on java.util.Properties getProperty() to check if os.arch is
> "sparc". java.util.Properties internally uses Hashtable. Hashtable.get() is
> a synchronized method. In the test application, on a 30s profile with 64
> threads, ~40% of CPU time is spent on getProperty() contention. See attached
> snapshot.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
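The contention described in the issue comes from looking up a system property on a hot path. A minimal sketch of the general remedy, reading the property once into a static final field, is shown below. The class name `SparcCheckSketch`, its method names, and the exact form of the architecture test are assumptions made for illustration; this is not the code committed to NativeCrc32.java.

```java
import java.util.Locale;

/** Illustrative only: a hypothetical class, not the code from NativeCrc32.java. */
final class SparcCheckSketch {

  // Evaluated once, when the class is initialized. Later calls read a
  // plain static final field and never touch the system Properties object.
  private static final boolean IS_SPARC = readIsSparc();

  private SparcCheckSketch() {}

  /** Looks the property up on every call; this is the pattern that can contend. */
  static boolean isSparcUncached() {
    return readIsSparc();
  }

  /** Returns the value captured at class-initialization time. */
  static boolean isSparcCached() {
    return IS_SPARC;
  }

  // Assumption: the architecture test is a case-insensitive prefix match
  // on the "os.arch" system property.
  private static boolean readIsSparc() {
    return System.getProperty("os.arch", "")
        .toLowerCase(Locale.ROOT)
        .startsWith("sparc");
  }

  public static void main(String[] args) {
    System.out.println("uncached: " + isSparcUncached());
    System.out.println("cached:   " + isSparcCached());
  }
}
```

The trade-off of this approach is that a later call to `System.setProperty("os.arch", ...)` is not observed by the cached field, which is usually acceptable for a value that describes the host.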
[jira] [Commented] (HADOOP-15598) DataChecksum calculate checksum is contented on hashtable synchronization
[ https://issues.apache.org/jira/browse/HADOOP-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540593#comment-16540593 ]

Wei-Chiu Chuang commented on HADOOP-15598:
------------------------------------------

LGTM.

Linked HADOOP-12925 to this Jira. From git blame, that looks like the change that caused it.

Reset priority to Major. This is at least a major-level issue.

I did a quick search of "System.property" in the codebase, and most usages are for static variables or in initialization methods.

+1 to have a follow-up Jira to cover similar usage, if Steve agrees.
[jira] [Commented] (HADOOP-15598) DataChecksum calculate checksum is contented on hashtable synchronization
[ https://issues.apache.org/jira/browse/HADOOP-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540513#comment-16540513 ]

Prasanth Jayachandran commented on HADOOP-15598:
------------------------------------------------

[~ste...@apache.org] Yeah. There will definitely be more places for this optimization. I did not look at anything beyond the checksum bottleneck. Is it OK to handle the wider optimization in a follow-up?
[jira] [Commented] (HADOOP-15598) DataChecksum calculate checksum is contented on hashtable synchronization
[ https://issues.apache.org/jira/browse/HADOOP-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540434#comment-16540434 ]

genericqa commented on HADOOP-15598:
------------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 24s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 27m 34s | trunk passed |
| +1 | compile | 30m 10s | trunk passed |
| +1 | checkstyle | 0m 22s | trunk passed |
| +1 | mvnsite | 1m 13s | trunk passed |
| +1 | shadedclient | 12m 49s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 36s | trunk passed |
| +1 | javadoc | 0m 58s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 48s | the patch passed |
| +1 | compile | 28m 21s | the patch passed |
| +1 | javac | 28m 21s | the patch passed |
| +1 | checkstyle | 0m 22s | the patch passed |
| +1 | mvnsite | 1m 9s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 21s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 42s | the patch passed |
| +1 | javadoc | 0m 59s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 9m 11s | hadoop-common in the patch passed. |
| +1 | asflicense | 0m 41s | The patch does not generate ASF License warnings. |
| | | 129m 51s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HADOOP-15598 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931180/HADOOP-15598.1.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 6371425f3c90 3.13.0-144-generic #193-Ubuntu SMP Thu Mar 15 17:03:53 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d36ed94 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/14879/testReport/ |
| Max. process+thread count | 1717 (vs. ulimit of 1) |
| modules | C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/14879/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HADOOP-15598) DataChecksum calculate checksum is contented on hashtable synchronization
[ https://issues.apache.org/jira/browse/HADOOP-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540359#comment-16540359 ]

Steve Loughran commented on HADOOP-15598:
-----------------------------------------

Added.

FWIW, we might want to add a class which reads the property once and is then referenced everywhere: I see a number of uses of "os.type". We could also think about scanning for uses of common properties (I'm thinking user.name, classpath, ...) and, again, having static fields somewhere, because they'll end up being used where it's expensive.

Thanks for this insight; I hadn't realised it was so expensive to use.
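The suggestion above, a single class that reads commonly used properties once and exposes them as static fields, could look roughly like the following sketch. The class name `CachedSystemProperties` and the choice of fields are hypothetical; the property keys used here are standard JVM keys picked for illustration and are not taken from the Hadoop codebase.

```java
/** Hypothetical holder class; the name and the set of fields are illustrative. */
final class CachedSystemProperties {

  // Each property is read exactly once, during class initialization.
  // Callers then read an immutable static field instead of calling
  // System.getProperty() on a hot path.
  public static final String OS_ARCH = System.getProperty("os.arch");
  public static final String OS_NAME = System.getProperty("os.name");
  public static final String USER_NAME = System.getProperty("user.name");
  public static final String JAVA_CLASS_PATH = System.getProperty("java.class.path");

  private CachedSystemProperties() {}

  public static void main(String[] args) {
    System.out.println("os.arch   = " + OS_ARCH);
    System.out.println("os.name   = " + OS_NAME);
    System.out.println("user.name = " + USER_NAME);
  }
}
```

A holder like this only suits properties that are not expected to change after JVM startup, because values set later through `System.setProperty` would not be reflected in the cached fields.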
[jira] [Commented] (HADOOP-15598) DataChecksum calculate checksum is contented on hashtable synchronization
[ https://issues.apache.org/jira/browse/HADOOP-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540270#comment-16540270 ]

Prasanth Jayachandran commented on HADOOP-15598:
------------------------------------------------

Thanks [~ste...@apache.org] for the review! Yes, it affects branch-3 as well; I have updated the field. Also, can you please add me as a contributor (as I cannot assign tickets)?
[jira] [Commented] (HADOOP-15598) DataChecksum calculate checksum is contented on hashtable synchronization
[ https://issues.apache.org/jira/browse/HADOOP-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540234#comment-16540234 ]

Steve Loughran commented on HADOOP-15598:
-----------------------------------------

BTW, does this affect branch-3.1 too? If so, change the affects-version field and we can plan to commit it there too. Thanks.
[jira] [Commented] (HADOOP-15598) DataChecksum calculate checksum is contented on hashtable synchronization
[ https://issues.apache.org/jira/browse/HADOOP-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540233#comment-16540233 ]

Steve Loughran commented on HADOOP-15598:
-----------------------------------------

Looks good, but we'll wait for Yetus to do a test run. Jenkins has been playing up today, so I'll reattach and resubmit.
[jira] [Commented] (HADOOP-15598) DataChecksum calculate checksum is contented on hashtable synchronization
[ https://issues.apache.org/jira/browse/HADOOP-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16539756#comment-16539756 ]

Prasanth Jayachandran commented on HADOOP-15598:
------------------------------------------------

Another profiler snapshot attached, in which ~94% of the samples are in getProperty() contention.
[jira] [Commented] (HADOOP-15598) DataChecksum calculate checksum is contented on hashtable synchronization
[ https://issues.apache.org/jira/browse/HADOOP-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16539753#comment-16539753 ]

Prasanth Jayachandran commented on HADOOP-15598:
------------------------------------------------

[~gopalv]/[~vinodkv] Can someone please review this patch?
[jira] [Commented] (HADOOP-15598) DataChecksum calculate checksum is contented on hashtable synchronization
[ https://issues.apache.org/jira/browse/HADOOP-15598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16539750#comment-16539750 ]

Prasanth Jayachandran commented on HADOOP-15598:
------------------------------------------------

Results from a JMH microbenchmark. For byte buffers of size >128 there is no significant benefit, as most of the time is spent looping over the chunks. For smaller sizes, the synchronization has a significant impact.

{noformat}
Benchmark                                               (size)  Mode  Cnt      Score     Error  Units
ComputeChecksumBenchmark.benchmarkChecksumWithoutPatch       8  avgt   10     66.970 ±   1.247  ns/op
ComputeChecksumBenchmark.benchmarkChecksumWithoutPatch      16  avgt   10     91.524 ±   2.141  ns/op
ComputeChecksumBenchmark.benchmarkChecksumWithoutPatch      32  avgt   10    127.155 ±   8.037  ns/op
ComputeChecksumBenchmark.benchmarkChecksumWithoutPatch      64  avgt   10    233.941 ±  14.327  ns/op
ComputeChecksumBenchmark.benchmarkChecksumWithoutPatch     128  avgt   10    382.898 ±   4.316  ns/op
ComputeChecksumBenchmark.benchmarkChecksumWithoutPatch    1024  avgt   10   2707.026 ±  53.266  ns/op
ComputeChecksumBenchmark.benchmarkChecksumWithoutPatch    4096  avgt   10  11411.330 ± 181.900  ns/op

Benchmark                                            (size)  Mode  Cnt      Score     Error  Units
ComputeChecksumBenchmark.benchmarkChecksumWithPatch       8  avgt   10     27.948 ±   0.507  ns/op
ComputeChecksumBenchmark.benchmarkChecksumWithPatch      16  avgt   10     49.336 ±   1.212  ns/op
ComputeChecksumBenchmark.benchmarkChecksumWithPatch      32  avgt   10     96.344 ±   1.360  ns/op
ComputeChecksumBenchmark.benchmarkChecksumWithPatch      64  avgt   10    182.513 ±   2.553  ns/op
ComputeChecksumBenchmark.benchmarkChecksumWithPatch     128  avgt   10    356.205 ±   4.282  ns/op
ComputeChecksumBenchmark.benchmarkChecksumWithPatch    1024  avgt   10   2825.526 ±  31.177  ns/op
ComputeChecksumBenchmark.benchmarkChecksumWithPatch    4096  avgt   10  12078.055 ± 262.343  ns/op
{noformat}
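To make the benchmark comparison above easier to read, the following sketch computes the per-size ratio and the absolute time difference from the reported average scores. The class name `ChecksumSpeedup` and its helper methods are hypothetical; only the numbers are taken from the JMH output, and the error margins are ignored.

```java
import java.util.Locale;

/** Hypothetical helper that post-processes the reported JMH averages. */
final class ChecksumSpeedup {

  // Values copied from the benchmark output; scores are in ns/op.
  static final int[] SIZES = {8, 16, 32, 64, 128, 1024, 4096};
  static final double[] WITHOUT_PATCH_NS =
      {66.970, 91.524, 127.155, 233.941, 382.898, 2707.026, 11411.330};
  static final double[] WITH_PATCH_NS =
      {27.948, 49.336, 96.344, 182.513, 356.205, 2825.526, 12078.055};

  private ChecksumSpeedup() {}

  /** Ratio of unpatched to patched time; values above 1 favor the patch. */
  static double speedup(int i) {
    return WITHOUT_PATCH_NS[i] / WITH_PATCH_NS[i];
  }

  /** Nanoseconds saved per operation; negative if the patched run was slower. */
  static double savedNs(int i) {
    return WITHOUT_PATCH_NS[i] - WITH_PATCH_NS[i];
  }

  public static void main(String[] args) {
    for (int i = 0; i < SIZES.length; i++) {
      System.out.printf(Locale.ROOT, "size %5d: %.2fx, %8.1f ns/op saved%n",
          SIZES[i], speedup(i), savedNs(i));
    }
  }
}
```

With these figures the ratio falls from about 2.4x at size 8 to about 1.07x at size 128, while the absolute saving stays between roughly 27 and 51 ns per call, which is consistent with the patch removing a fixed per-call cost. At sizes 1024 and 4096 the ratio is slightly below 1.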