[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263398#comment-14263398 ]

Hudson commented on HBASE-5699:
-------------------------------

SUCCESS: Integrated in HBase-1.0 #627 (See [https://builds.apache.org/job/HBase-1.0/627/])
HBASE-5699 Adds multiple WALs per Region Server based on groups of regions. (enis: rev 2ebeddfc4276898e42e9d7cadb8092e9f72ed421)
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/RegionGroupingProvider.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALFactory.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/BoundedRegionGroupingProvider.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/wal/TestBoundedRegionGroupingProvider.java

> Run with > 1 WAL in HRegionServer
> ---------------------------------
>
>                 Key: HBASE-5699
>                 URL: https://issues.apache.org/jira/browse/HBASE-5699
>             Project: HBase
>          Issue Type: Improvement
>          Components: Performance, wal
>            Reporter: binlijin
>            Assignee: Sean Busbey
>            Priority: Critical
>             Fix For: 1.0.0, 2.0.0, 1.1.0
>
>         Attachments: HBASE-5699.3.patch.txt, HBASE-5699.4.patch.txt,
> HBASE-5699_#workers_vs_MiB_per_s_1x1col_512Bval_wal_count_1,2,4.tiff,
> HBASE-5699_disabled_and_regular_#workers_vs_MiB_per_s_1x1col_512Bval_wal_count_1,2,4.tiff,
> HBASE-5699_write_iops_multiwal-1_1_to_200_threads.tiff,
> HBASE-5699_write_iops_multiwal-2_10,50,120,190,260,330,400_threads.tiff,
> HBASE-5699_write_iops_multiwal-4_10,50,120,190,260,330,400_threads.tiff,
> HBASE-5699_write_iops_multiwal-6_10,50,120,190,260,330,400_threads.tiff,
> HBASE-5699_write_iops_upstream_1_to_200_threads.tiff, PerfHbase.txt,
> hbase-5699_multiwal_400-threads_stats_sync_heavy.txt,
> hbase-5699_total_throughput_sync_heavy.txt,
> results-hbase5699-upstream.txt.bz2, results-hbase5699-wals-1.txt.bz2,
> results-updated-hbase5699-wals-2.txt.bz2,
> results-updated-hbase5699-wals-4.txt.bz2,
> results-updated-hbase5699-wals-6.txt.bz2
>
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263355#comment-14263355 ]

Enis Soztutar commented on HBASE-5699:
--------------------------------------

Pushed to 1.0.0 as well. All new code, should not affect default stability.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257413#comment-14257413 ]

stack commented on HBASE-5699:
------------------------------

[~enis] this for 1.0?
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253936#comment-14253936 ]

Sean Busbey commented on HBASE-5699:
------------------------------------

{quote}
bq. Altering the WAL provider used by a particular RegionServer requires restarting that instance.

It requires restarting the cluster, right? Otherwise when an RS dies there might be another up still with a different wal provider.
{quote}

In general, yes. In the case of the providers we currently allow via configuration parameters (as opposed to a user's custom FQCN), they are all compatible on the recovery side. So it doesn't matter if an RS dies while there are different providers around. That's why a rolling restart can be used to change from default to multiwal.

In the release note I was focused on the specific case of default vs multiwal. Should I note the general case?
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253918#comment-14253918 ]

Lars Hofhansl commented on HBASE-5699:
--------------------------------------

bq. Altering the WAL provider used by a particular RegionServer requires restarting that instance.

It requires restarting the cluster, right? Otherwise when an RS dies there might be another up still with a different wal provider.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253791#comment-14253791 ]

Hudson commented on HBASE-5699:
-------------------------------

SUCCESS: Integrated in HBase-TRUNK #5949 (See [https://builds.apache.org/job/HBase-TRUNK/5949/])
HBASE-5699 Adds multiple WALs per Region Server based on groups of regions. (busbey: rev f1c41e307e4e55e7849f35e09c3de2fcd4dbbd2b)
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALFactory.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/RegionGroupingProvider.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/wal/TestBoundedRegionGroupingProvider.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/BoundedRegionGroupingProvider.java
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253694#comment-14253694 ]

Hudson commented on HBASE-5699:
-------------------------------

FAILURE: Integrated in HBase-1.1 #9 (See [https://builds.apache.org/job/HBase-1.1/9/])
HBASE-5699 Adds multiple WALs per Region Server based on groups of regions. (busbey: rev 2b94aa8c8599d66858e6a2b5198bd21ded2334ca)
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/RegionGroupingProvider.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/wal/TestBoundedRegionGroupingProvider.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALFactory.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/BoundedRegionGroupingProvider.java
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252858#comment-14252858 ]

Sean Busbey commented on HBASE-5699:
------------------------------------

sounds great. I'll leave the default wal provider as is (a single wal to the filesystem) and make it so that if you configure multiwal you start at 2.

[~enis], you fine with me pushing this to branch-1.0 (either now or after you cut RC0)? or just stick with branch-1? It changes nothing by default but allows some alternate wal configurations.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252687#comment-14252687 ]

Elliott Clark commented on HBASE-5699:
--------------------------------------

Yeah +1 on getting it into 1.0. For me I think that probably means with only one wal configured by default.

For what it's worth, we've seen larger improvements from running multi wals but we also have more disks per machine. So could be cluster dependent.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252635#comment-14252635 ]

stack commented on HBASE-5699:
------------------------------

bq. I'm inclined to leave the default alone for 1.0 and change it for later once we have some more stats on e.g. impact on recovery.

Grand. I gave this +1 over on rb if you want to commit. Fat release note I'd say (but you were probably going to do that anyways)
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252622#comment-14252622 ]

Sean Busbey commented on HBASE-5699:
------------------------------------

I don't have any ideas for region grouping besides identity; I think it's a fine starting point.

The current configuration makes "multiwal" the bounded version, so there are only N configurable wals (the current patch says 1, but 2 might make more sense). It doesn't give an option yet for "wal per region", so anyone who wants to use that would have to manually specify the fully qualified class name. If we wanted to give them assurances about compatibility we'd need to give them a config name.

I'm split on whether it makes sense to make it the default given the modest improvement. I'm inclined to leave the default alone for 1.0 and change it for later once we have some more stats on e.g. impact on recovery. My intuition says wal recovery should be roughly the same.
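To illustrate the "bounded" idea described in the comment above (regions keyed by identity, but spread over only N WALs), here is a minimal sketch. It is not the code from the patch: the class name, the method names, and the round-robin assignment policy are all assumptions made for illustration only.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch, not HBase code: maps each region onto one of N WAL slots. */
final class BoundedWalGroupingSketch {
    private final int numGroups;                        // N, the upper bound on WAL count
    private final AtomicInteger next = new AtomicInteger();
    private final ConcurrentMap<String, Integer> assigned = new ConcurrentHashMap<>();

    BoundedWalGroupingSketch(int numGroups) {
        if (numGroups < 1) {
            throw new IllegalArgumentException("numGroups must be >= 1");
        }
        this.numGroups = numGroups;
    }

    /** "Identity" grouping: the region name itself is the group key. */
    static String identityGroup(String encodedRegionName) {
        return encodedRegionName;
    }

    /**
     * Returns the WAL slot for a region. The first lookup for a region takes the
     * next slot round-robin (an assumed policy); later lookups reuse that slot.
     */
    int walIndexFor(String encodedRegionName) {
        return assigned.computeIfAbsent(identityGroup(encodedRegionName),
            key -> Math.floorMod(next.getAndIncrement(), numGroups));
    }

    public static void main(String[] args) {
        BoundedWalGroupingSketch wals = new BoundedWalGroupingSketch(2);
        for (String region : new String[] {"r1", "r2", "r3", "r1"}) {
            System.out.println(region + " -> wal " + wals.walIndexFor(region));
        }
    }
}
```

With N = 1 this degenerates to the single-WAL behaviour. A "wal per region" variant would skip the modulo and give every group key its own slot, which is the case the comment says still needs a fully qualified class name.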
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252554#comment-14252554 ]

stack commented on HBASE-5699:
------------------------------

What you thinking here [~busbey] ? So, we'd commit this but default is 'identity' as I read it? i.e. a log per region. Is that so? Would be cool if multiwal were on by default or does it need to show better numbers to be on by default?
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14248660#comment-14248660 ]

stack commented on HBASE-5699:
------------------------------

Thanks [~busbey]

Looks like writing datanodes puts a bit of friction on our write path. I wonder how much the ringbuffer+grouping is costing us? Looking at the graph, you'd think adding extra WALs would make a bigger difference given the large gap between no-friction and one WAL.

Good stuff.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247315#comment-14247315 ]

Elliott Clark commented on HBASE-5699:
--------------------------------------

[~busbey] HBASE-11283 did something like that for 0.89-fb. Basically any sync that took longer than 1 second caused a roll of the HLog. The thought being that we could get a new write pipeline that might avoid a dead/dying disk or datanode.
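The roll-on-slow-sync rule described above can be sketched as a single threshold check. This is only an illustration of the rule as stated in the comment; the class, constant, and method names are hypothetical and are not taken from HBASE-11283 or from 0.89-fb.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: decide whether a slow sync should trigger a log roll. */
final class SlowSyncRollSketch {
    // Threshold from the comment: a sync taking longer than 1 second causes a roll.
    static final long SLOW_SYNC_THRESHOLD_NANOS = TimeUnit.SECONDS.toNanos(1);

    /** Returns true when the measured sync took strictly longer than the threshold. */
    static boolean shouldRoll(long syncDurationNanos) {
        return syncDurationNanos > SLOW_SYNC_THRESHOLD_NANOS;
    }

    public static void main(String[] args) {
        long fastSync = TimeUnit.MILLISECONDS.toNanos(200);
        long slowSync = TimeUnit.MILLISECONDS.toNanos(1500);
        System.out.println("200 ms sync rolls?  " + shouldRoll(fastSync));
        System.out.println("1500 ms sync rolls? " + shouldRoll(slowSync));
    }
}
```

Rolling gives the writer a fresh pipeline, which is the point made in the comment: a new set of datanodes may avoid the slow disk or node.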
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247308#comment-14247308 ]

stack commented on HBASE-5699:
------------------------------

bq. I could run this as a baseline with the DisabledWALProvider.

Could be informative, learning the upper bound on how many writes/sec we can drive.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247173#comment-14247173 ]

Sean Busbey commented on HBASE-5699:
------------------------------------

bq. I was also surprised about the boost, since my initial thinking was that we'd be constrained in the single wal case by writing to one pipeline. However, once I think through things more it makes a little more sense. For one thing we only use hflush and not hsync, so even for network flushes we're still largely in memory. That also means those datanodes can keep handling the write to disk as we get the next pipeline. On the far end of this chart for 128MB blocks, that should be happening every ~1.5 seconds for the single wal. That allows us to keep more than just 3 disks busy even in the single wal case.

One other note about the rate at which we are rolling pipelines already. This had me thinking about the improvements in HBASE-10278. If we're rolling that often under load, I wonder if we'd be better off just forcing a roll at whatever the "pipeline sync is slow" threshold is rather than maintaining the state to do a switch. A question better explored on that ticket, I suppose.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247143#comment-14247143 ]

Sean Busbey commented on HBASE-5699:
------------------------------------

bq. What would the charts look like if no disk friction at all, i.e. a mocked WAL? Are we using all available i/o or are we blocked internally – cpu/locks/context switching?

I could run this as a baseline with the DisabledWALProvider. IIRC all it does is increment metric counts.

bq. Nice graphs Sean. How'd you make them/run the tests?

The tests are runs of the WALPerformanceEval tool using the command up above under "Overview". To alter the WAL count I made multiple conf dirs with that setting switched in hbase-site.xml and then exported an appropriate HBASE_CONF_DIR before each run. The tests I ran over the weekend (but haven't gotten to chart yet) are the same but with more options configured.

The chart itself is just the output from the log of the test, filtered for the append byte counts that happen every 30 seconds, then put into a google doc to do deltas and avg / stddev. I was considering using the same data to plot the IQR instead of average +- stddev, but I wasn't sure that would be as consumable.
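The charting steps described above (byte counts logged every 30 seconds, then deltas, average, and standard deviation) can be sketched as follows. This is a hedged reconstruction, not the spreadsheet that was actually used: it assumes the logged counts are cumulative, it assumes the sample (n-1) standard deviation, and the input numbers in `main` are made up purely to exercise the arithmetic.

```java
/** Hypothetical sketch of the delta / avg / stddev post-processing. */
final class AppendThroughputSketch {
    static final double INTERVAL_SECONDS = 30.0;          // logging interval from the comment
    static final double BYTES_PER_MIB = 1024.0 * 1024.0;

    /** Turns cumulative byte counts into one MiB/s rate per 30-second interval. */
    static double[] ratesMiBPerSec(long[] cumulativeBytes) {
        double[] rates = new double[Math.max(0, cumulativeBytes.length - 1)];
        for (int i = 1; i < cumulativeBytes.length; i++) {
            long delta = cumulativeBytes[i] - cumulativeBytes[i - 1];
            rates[i - 1] = delta / BYTES_PER_MIB / INTERVAL_SECONDS;
        }
        return rates;
    }

    /** Arithmetic mean; rejects an empty input rather than returning NaN. */
    static double mean(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Sample standard deviation (n-1 denominator); 0.0 for fewer than two samples. */
    static double sampleStdDev(double[] values) {
        if (values.length < 2) {
            return 0.0;
        }
        double m = mean(values);
        double squares = 0.0;
        for (double v : values) {
            squares += (v - m) * (v - m);
        }
        return Math.sqrt(squares / (values.length - 1));
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long[] counts = {0L, 30 * mib, 90 * mib, 180 * mib}; // made-up sample data
        double[] rates = ratesMiBPerSec(counts);
        System.out.printf("avg=%.2f MiB/s stddev=%.2f MiB/s%n", mean(rates), sampleStdDev(rates));
    }
}
```

Plotting the IQR instead, as considered in the comment, would replace the last two helpers with quartile calculations over the same per-interval rates.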
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247132#comment-14247132 ] Sean Busbey commented on HBASE-5699: These are nodes with 5 data drives. There are 6 physical disks in the machines and one is set aside for OS. I was also surprised about the boost, since my initial thinking was that we'd be constrained in the single wal case by writing to one pipeline. However, once I think through things more it makes a little more sense. For one thing we only use hflush and not hsync, so even for network flushes we're still largely in memory. That also means those datanodes can keep handling the write to disk as we get the next pipeline. On the far end of this chart for 128MB blocks, that should be happening every ~1.5 seconds for the single wal. That allows us to keep more than just 3 disks busy even in the single wal case. It's possible the gain shown by the perf eval will be bigger once I get HBASE-12339 in place. As is, we're paying the overhead of new block allocation instead of the overhead of new file allocation. I don't have enough info to know if that delta matters though. 
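As a rough check on the "new 128MB block every ~1.5 seconds" figure above, the relation between block size, write rate, and how often a new pipeline is picked is simple division. The sketch below only restates that arithmetic; the function names are hypothetical and the throughput it prints is implied by the two numbers in the comment, not a separately measured value.

```python
def seconds_per_block(block_size_mib, throughput_mib_s):
    """How long a single WAL takes to fill one HDFS block at a given rate."""
    return block_size_mib / throughput_mib_s


def implied_throughput(block_size_mib, seconds_between_blocks):
    """Write rate implied by rolling to a new block every N seconds."""
    return block_size_mib / seconds_between_blocks


BLOCK_MIB = 128          # block size quoted in the comment
ROLL_EVERY_S = 1.5       # "every ~1.5 seconds" from the comment

rate = implied_throughput(BLOCK_MIB, ROLL_EVERY_S)
print(f"implied single-WAL rate: {rate:.1f} MiB/s")
print(f"check: {seconds_per_block(BLOCK_MIB, rate):.1f} s per block")
```

At that rate each new block gets a fresh three-node pipeline, which is consistent with the comment's point that a single WAL can still touch more than three disks over time.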
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247111#comment-14247111 ] stack commented on HBASE-5699: -- What would the charts look like with no disk friction at all, i.e. with a mocked WAL? Are we using all available i/o or are we blocked internally -- cpu/locks/context switching? Nice graphs Sean. How'd you make them/run the tests? > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement > Components: Performance, wal >Reporter: binlijin >Assignee: Sean Busbey >Priority: Critical > Attachments: HBASE-5699.3.patch.txt, HBASE-5699.4.patch.txt, > HBASE-5699_#workers_vs_MiB_per_s_1x1col_512Bval_wal_count_1,2,4.tiff, > HBASE-5699_write_iops_multiwal-1_1_to_200_threads.tiff, > HBASE-5699_write_iops_multiwal-2_10,50,120,190,260,330,400_threads.tiff, > HBASE-5699_write_iops_multiwal-4_10,50,120,190,260,330,400_threads.tiff, > HBASE-5699_write_iops_multiwal-6_10,50,120,190,260,330,400_threads.tiff, > HBASE-5699_write_iops_upstream_1_to_200_threads.tiff, PerfHbase.txt, > hbase-5699_multiwal_400-threads_stats_sync_heavy.txt, > hbase-5699_total_throughput_sync_heavy.txt, > results-hbase5699-upstream.txt.bz2, results-hbase5699-wals-1.txt.bz2, > results-updated-hbase5699-wals-2.txt.bz2, > results-updated-hbase5699-wals-4.txt.bz2, > results-updated-hbase5699-wals-6.txt.bz2 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247057#comment-14247057 ] Jonathan Hsieh commented on HBASE-5699: --- I like the new graph -- it summarizes a lot and is still simple to follow. I'm trying to make sense of the 20-25% boost (hoped for more!). These are 5 disk machines -- are these boxes configured so that one disk per machine is set aside for the os and 4 disks are "data" drives? > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement > Components: Performance, wal >Reporter: binlijin >Assignee: Sean Busbey >Priority: Critical > Attachments: HBASE-5699.3.patch.txt, HBASE-5699.4.patch.txt, > HBASE-5699_#workers_vs_MiB_per_s_1x1col_512Bval_wal_count_1,2,4.tiff, > HBASE-5699_write_iops_multiwal-1_1_to_200_threads.tiff, > HBASE-5699_write_iops_multiwal-2_10,50,120,190,260,330,400_threads.tiff, > HBASE-5699_write_iops_multiwal-4_10,50,120,190,260,330,400_threads.tiff, > HBASE-5699_write_iops_multiwal-6_10,50,120,190,260,330,400_threads.tiff, > HBASE-5699_write_iops_upstream_1_to_200_threads.tiff, PerfHbase.txt, > hbase-5699_multiwal_400-threads_stats_sync_heavy.txt, > hbase-5699_total_throughput_sync_heavy.txt, > results-hbase5699-upstream.txt.bz2, results-hbase5699-wals-1.txt.bz2, > results-updated-hbase5699-wals-2.txt.bz2, > results-updated-hbase5699-wals-4.txt.bz2, > results-updated-hbase5699-wals-6.txt.bz2 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246776#comment-14246776 ] Hadoop QA commented on HBASE-5699: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12687232/HBASE-5699.4.patch.txt against master branch at commit db873f0886ec43e2e5b3bdcb56399b3bceb4dcaa. ATTACHMENT ID: 12687232 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12084//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12084//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12084//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12084//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12084//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12084//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12084//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12084//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12084//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12084//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12084//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12084//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12084//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12084//console This message is automatically generated. 
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14243394#comment-14243394 ] Sean Busbey commented on HBASE-5699: I'm running some more tests. I redid the single wal test up to 400 concurrent sync heavy writers so I could make the comparison charts you asked for. In doing so I got more throughput in that case; it looks like we're pushing enough data then to require 1 - 1.5 new blocks each second, so we're effectively spreading across more disks. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement > Components: Performance, wal >Reporter: binlijin >Assignee: Sean Busbey >Priority: Critical > Attachments: HBASE-5699.3.patch.txt, > HBASE-5699_write_iops_multiwal-1_1_to_200_threads.tiff, > HBASE-5699_write_iops_multiwal-2_10,50,120,190,260,330,400_threads.tiff, > HBASE-5699_write_iops_multiwal-4_10,50,120,190,260,330,400_threads.tiff, > HBASE-5699_write_iops_multiwal-6_10,50,120,190,260,330,400_threads.tiff, > HBASE-5699_write_iops_upstream_1_to_200_threads.tiff, PerfHbase.txt, > hbase-5699_multiwal_400-threads_stats_sync_heavy.txt, > hbase-5699_total_throughput_sync_heavy.txt, > results-hbase5699-upstream.txt.bz2, results-hbase5699-wals-1.txt.bz2, > results-updated-hbase5699-wals-2.txt.bz2, > results-updated-hbase5699-wals-4.txt.bz2, > results-updated-hbase5699-wals-6.txt.bz2 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242007#comment-14242007 ] Jonathan Hsieh commented on HBASE-5699: --- Nice results! A few questions/thoughts. How do iops/s translate to MB/s? Are they reasonably close to max disk speed/3? Can you combine a few graphs so that you can see the main jump from 1 pipeline to 2 pipelines? The graphs currently don't quite line up. Instead of having time on the x axis, use the # threads on x and show avg/std iop/s for each of the thread settings and # of pipeline settings? > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement > Components: Performance, wal >Reporter: binlijin >Assignee: Sean Busbey >Priority: Critical > Attachments: HBASE-5699.3.patch.txt, > HBASE-5699_write_iops_multiwal-1_1_to_200_threads.tiff, > HBASE-5699_write_iops_multiwal-2_10,50,120,190,260,330,400_threads.tiff, > HBASE-5699_write_iops_multiwal-4_10,50,120,190,260,330,400_threads.tiff, > HBASE-5699_write_iops_multiwal-6_10,50,120,190,260,330,400_threads.tiff, > HBASE-5699_write_iops_upstream_1_to_200_threads.tiff, PerfHbase.txt, > hbase-5699_multiwal_400-threads_stats_sync_heavy.txt, > hbase-5699_total_throughput_sync_heavy.txt, > results-hbase5699-upstream.txt.bz2, results-hbase5699-wals-1.txt.bz2, > results-updated-hbase5699-wals-2.txt.bz2, > results-updated-hbase5699-wals-4.txt.bz2, > results-updated-hbase5699-wals-6.txt.bz2 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
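The conversion asked about above is a multiplication of operations per second by bytes per operation. A minimal sketch follows, with hypothetical function names and purely illustrative numbers; the 512-byte payload ignores key and WAL-entry overhead, and the disk speed is an assumed figure, not one from the test cluster.

```python
MIB = 1024 * 1024


def ops_to_mib_per_s(ops_per_s, bytes_per_op):
    """Convert an append rate into MiB/s for a fixed payload size."""
    return ops_per_s * bytes_per_op / MIB


def fraction_of_budget(mib_per_s, max_disk_mib_s, replicas=3):
    """Compare a rate against the 'max disk speed / 3' yardstick from the comment."""
    return mib_per_s / (max_disk_mib_s / replicas)


# Illustrative inputs only (assumptions, not measurements).
rate = ops_to_mib_per_s(ops_per_s=20_480, bytes_per_op=512)
print(f"{rate:.1f} MiB/s")
print(f"{fraction_of_budget(rate, max_disk_mib_s=120):.2f} of disk speed / 3")
```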
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14213867#comment-14213867 ] Hadoop QA commented on HBASE-5699: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12530779/PerfHbase.txt against trunk revision . ATTACHMENT ID: 12530779 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/11693//console This message is automatically generated. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement > Components: Performance, wal >Reporter: binlijin >Assignee: Sean Busbey >Priority: Critical > Attachments: PerfHbase.txt > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181895#comment-14181895 ] Sean Busbey commented on HBASE-5699: I have a patch implementing this on top of the refactoring in HBASE-10378. Any objections to me taking over the issue and posting it? > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement > Components: Performance, wal >Reporter: binlijin >Assignee: Li Pi >Priority: Critical > Attachments: PerfHbase.txt > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089495#comment-14089495 ] Anoop Sam John commented on HBASE-5699: --- Yes, the grouping logic should be a pluggable module. The grouping can be done per table's regions or across all regions. It should be in line with the balancing strategy (per table or not). bq.To replay locally we should avoid the RPC itself totally? Is it possible in the new distributed log replay? I have not checked the code deeply. These are only high-level thoughts. We can check more. If we can avoid RPCs in the replay that would be great IMO. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement > Components: Performance >Reporter: binlijin >Assignee: Li Pi >Priority: Critical > Attachments: PerfHbase.txt > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089487#comment-14089487 ] ramkrishna.s.vasudevan commented on HBASE-5699: --- Grouping based on region should in itself be a pluggable module, because a simple implementation could be based on a fixed factor (like grouping every 5 regions) or on names. To start with we could do simple grouping. bq.we can try max to allocate all those regions to same RS on crash. So this RS can read this WAL and replay locally. To replay locally we should avoid the RPC itself totally? Is it possible in the new distributed log replay? It tries to do table.batchmutate(), right? Need to see the code to confirm this once. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement > Components: Performance >Reporter: binlijin >Assignee: Li Pi >Priority: Critical > Attachments: PerfHbase.txt > > -- This message was sent by Atlassian JIRA (v6.2#6252)
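The "pluggable grouping" idea above (group every N regions, or group by name) could look something like the sketch below. This is a language-neutral illustration in Python, not HBase code: every name here is hypothetical, and the "table,startkey,id" region-name shape is a simplified assumption.

```python
from typing import Callable, Dict, List

# A strategy maps a region name to the name of the WAL group it belongs to.
GroupingStrategy = Callable[[str], str]


def fixed_size_grouping(group_size: int) -> GroupingStrategy:
    """Group every `group_size` regions together, in order of first appearance."""
    assigned: Dict[str, str] = {}

    def strategy(region: str) -> str:
        if region not in assigned:
            assigned[region] = f"wal-{len(assigned) // group_size}"
        return assigned[region]

    return strategy


def table_name_grouping(region: str) -> str:
    """Group by table, assuming a simplified 'table,startkey,id' region name."""
    return "wal-" + region.split(",", 1)[0]


def group_regions(regions: List[str], strategy: GroupingStrategy) -> Dict[str, List[str]]:
    """Apply whichever strategy was plugged in and collect regions per WAL group."""
    groups: Dict[str, List[str]] = {}
    for region in regions:
        groups.setdefault(strategy(region), []).append(region)
    return groups


regions = ["t1,a,1", "t1,b,2", "t2,a,3"]
print(group_regions(regions, table_name_grouping))
print(group_regions(regions, fixed_size_grouping(2)))
```

Keeping the strategy as a plain function from region to group name means a per-table policy and a fixed-size policy can be swapped without touching the code that owns the WALs.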
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089464#comment-14089464 ] Anoop Sam John commented on HBASE-5699: --- Ya collective ideas around MultiWAL > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement > Components: Performance >Reporter: binlijin >Assignee: Li Pi >Priority: Critical > Attachments: PerfHbase.txt > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089439#comment-14089439 ] Sean Busbey commented on HBASE-5699: [~anoop.hbase], that sounds like a combination of this and the ideas in HBASE-8610? > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement > Components: Performance >Reporter: binlijin >Assignee: Li Pi >Priority: Critical > Attachments: PerfHbase.txt > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089418#comment-14089418 ] Anoop Sam John commented on HBASE-5699: --- My idea is to make a multi WAL impl which helps write throughput as well as MTTR, the latter when we are in distributed log replay mode. If we make sure to have a region grouping policy for selecting the regions for each WAL in the multi WAL area, we can try as far as possible to allocate all those regions to the same RS on a crash. So this RS can read this WAL and replay locally. The distributed log replay batch calls would not have to go over RPC. Lots of Qs and corner cases there. But we can discuss more and try to make it better. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement > Components: Performance >Reporter: binlijin >Assignee: Li Pi >Priority: Critical > Attachments: PerfHbase.txt > > -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13912374#comment-13912374 ] stack commented on HBASE-5699: -- This issue adds switching between WALs > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement > Components: Performance >Reporter: binlijin >Assignee: Li Pi >Priority: Critical > Attachments: PerfHbase.txt > > -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821649#comment-13821649 ] Ted Yu commented on HBASE-5699: --- bq. we will have many more leases to recover At the beginning of recovery, the master can send lease recovery requests for outstanding WAL files using a thread pool. Each split worker would first check whether the WAL file it processes is closed. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement > Components: Performance >Reporter: binlijin >Assignee: Li Pi >Priority: Critical > Attachments: PerfHbase.txt > > -- This message was sent by Atlassian JIRA (v6.1#6144)
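The suggestion above (issue lease recovery for all outstanding WAL files in parallel, then let each split worker check that its file is closed) might be sketched like this. It is only an outline of the control flow: `recover_lease` is a hypothetical stand-in that does no real HDFS work, and none of these names come from HBase or HDFS.

```python
from concurrent.futures import ThreadPoolExecutor


def recover_lease(wal_path):
    """Hypothetical stand-in for asking the filesystem to recover one WAL's lease.

    A real implementation would call into HDFS and might need to retry;
    here we simply pretend recovery succeeded and the file is now closed.
    """
    return wal_path, True


def recover_all(wal_paths, max_workers=4):
    """Fire lease recovery for every WAL concurrently instead of one after the other."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(recover_lease, wal_paths))


def split_worker_can_start(wal_path, closed):
    """A split worker first checks that the WAL it was handed is closed."""
    return closed.get(wal_path, False)


closed = recover_all(["wal-a", "wal-b", "wal-c"])
print(split_worker_can_start("wal-a", closed))
```

With N WALs per server, recovery time in this shape is bounded by the slowest lease rather than the sum of all of them, which is the concern raised about small clusters.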
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13631631#comment-13631631 ] Anoop Sam John commented on HBASE-5699: --- Maybe we need to combine efforts here with HBASE-7835. [~jeffreyz] is working on HBASE-7835, where he tries to do WAL replay with HTable#put(). I had added the comment below to HBASE-7835: {quote} I was thinking on this area. We have different JIRAs now related to HLOG and its split and replay. This one + HBASE-6772 + multi WAL... Can we think on all together. For multi WAL if we have a fixed set of regions for one WAL approach, during one RS down Master can assign those regions(Try max) to another one RS [Region groups in RS]. If the corresponding HLog file also assigned to that RS, then for the replay it can directly do puts on the region rather than IPC. {quote} If we can do all these I think MTTR also can be improved. I will start working on this JIRA (along with Ram) from next week. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi > Attachments: PerfHbase.txt > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13631596#comment-13631596 ] Nicolas Liochon commented on HBASE-5699: If we implement this, we should test the impact on MTTR as well. My fear is that we will have many more leases to recover, and the way it's written today (one after the other), it could make failure recovery much slower on a small cluster. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi > Attachments: PerfHbase.txt > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13468087#comment-13468087 ] Hudson commented on HBASE-5699: --- Integrated in HBase-TRUNK #3408 (See [https://builds.apache.org/job/HBase-TRUNK/3408/]) HBASE-5699 Refactor HLog into an interface (Revision 1393126) Result = FAILURE stack : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/backup/example/LongTermArchivingHFileCleaner.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HLogInputFormat.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/WALPlayer.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/CleanerChore.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileCleaner.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/LogCleaner.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/Compressor.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogFactory.java * 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogMetrics.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogPrettyPrinter.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtil.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALActionsListener.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCoprocessorHost.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HFileArchiveUtil.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HMerge.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/MetaUtils.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/backup/example/TestZooKeeperTableArchiveClient.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRowProcessorEndpoint.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestWALObserver.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/fs/TestBlockReorder.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHLogRecordReader.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDistributedLogSplitting.java * 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCacheOnWriteInSchema.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactSelection.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransaction.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/FaultySequenceFileLogReader.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogPerformanceEvaluation.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogUtilsForTests.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLo
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400559#comment-13400559 ] Lars Hofhansl commented on HBASE-5699: -- Assuming Datanodes and RegionServers are colocated, no more bits will have to cross the (aggregate) "wires". Further assuming good load balancing within HBase, the net bandwidth is still spread over the cluster (but with lower latency at each RegionServer). So I do not believe that HBASE-6116 will actually hurt performance. The key question is whether WAL writing is mostly bound by latency or bandwidth (and I do not know). Do we get 35-40 MB/s of throughput from writing the WAL? If not, it is likely bound by latency. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi > Attachments: PerfHbase.txt > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293853#comment-13293853 ] Todd Lipcon commented on HBASE-5699: bq. I think we should wait for test result with HBASE-6116 before we invest more time in this. HBASE-6116 seems like it would improve latency but hurt throughput -- on a typical gbit link, the parallel writes would limit us to 50M/sec for 3 replicas, whereas pipelined writes could give us 100M+. The other main advantage of this JIRA is that the speed of the WAL is currently limited to the minimum speed of the 3 disks chosen in the pipeline. Given that disks can be heavily loaded, the probability of getting even a full disk's worth of throughput is low -- the likelihood is that at least one of those disks is also being written to or read from by at least one other client. So typically any single HDFS stream is limited to 35-40MB/sec in my experience. Given that gbit is much faster than this, we can get better throughput by adding parallel WALs, so as to stripe across disks and dynamically push writes to less-loaded disks. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi > Attachments: PerfHbase.txt > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
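The argument above has two parts: a single pipeline runs at the speed of its slowest disk, and several WALs striped over different pipelines add up until the network link is the limit. A back-of-the-envelope model follows; it is a sketch under simple assumptions (independent disks, a hard network cap, one replica local so parallel writes send two remote copies), and all function names and disk speeds are hypothetical.

```python
def pipeline_throughput(disk_mb_s):
    """A replication pipeline is bounded by the slowest disk in it."""
    return min(disk_mb_s)


def striped_throughput(pipelines, network_cap_mb_s):
    """Several WALs on separate pipelines add up, until the link saturates."""
    return min(sum(pipeline_throughput(p) for p in pipelines), network_cap_mb_s)


def prob_all_idle(p_busy, replicas=3):
    """Chance that none of the pipeline's disks is busy, assuming independence."""
    return (1 - p_busy) ** replicas


def parallel_write_limit(link_mb_s, remote_copies=2):
    """One reading of the 50M/sec figure: the client link is shared by each remote copy."""
    return link_mb_s / remote_copies


# Hypothetical per-disk speeds in MB/s, not measurements.
one_wal = [[80, 38, 75]]
two_wals = [[80, 38, 75], [70, 60, 65]]
print(striped_throughput(one_wal, network_cap_mb_s=100))
print(striped_throughput(two_wals, network_cap_mb_s=100))
print(round(prob_all_idle(0.3), 3))
print(parallel_write_limit(100))
```

Even a modest per-disk chance of contention makes it unlikely that all three disks in one pipeline are idle, which is the intuition behind striping across more than one WAL.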
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293517#comment-13293517 ] Zhihong Ted Yu commented on HBASE-5699: As I mentioned in HBASE-6055 @ 04/Jun/12 17:47, one of the benefits of this feature is for each HLog file to receive edits for one single table.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293471#comment-13293471 ] Lars Hofhansl commented on HBASE-5699: I think we should wait for test results with HBASE-6116 before we invest more time in this. My gut feeling tells me that this is something that is better handled at the HDFS level.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289370#comment-13289370 ] ramkrishna.s.vasudevan commented on HBASE-5699: @Ted We will get the ycsb report tomorrow; the environment is busy today. @Lars We will try to check HBASE-6116 as well, but I am not sure we can do it in the next couple of days. We will try anyway.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289161#comment-13289161 ] stack commented on HBASE-5699: What's the high-level summary of the perf numbers? Do more WALs help? How much? Thanks.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288699#comment-13288699 ] Lars Hofhansl commented on HBASE-5699: @Ted or @Ram: If you have any chance to test HBASE-6116 as well, that'd be really cool (although it would be more effort, as it only works against Hadoop trunk - and soon Hadoop 2.0-alpha). Andy said he might test against EC2.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288592#comment-13288592 ] Zhihong Ted Yu commented on HBASE-5699: Can you run ycsb with a 50% insert and 50% update load? The performance numbers in the attachment match what I got based on my implementation. Thanks
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279912#comment-13279912 ] Lars Hofhansl commented on HBASE-5699: Good point. I was referring to the general feature, not necessarily WAL/Region. It's a trade-off: batching vs. parallel writes (just to state the obvious). Do we batch beyond a region normally, though? Maybe during cache flush. Yeah, WAL/Region with sync is probably not a good idea; there just won't be enough spindles in the HDFS cluster to absorb that. So what's a good heuristic for the number of WALs? Maybe (assuming good block distribution and that HBase is the only user of the cluster) it should be around #spindles/#replicas...?
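The #spindles/#replicas heuristic floated above could be written down as a one-liner. This is only a sketch of the suggestion, not an existing HBase setting; the class and method names are hypothetical, and rounding down with a floor of one WAL is an added assumption. Passing the spindles of a single node gives a per-RegionServer count, provided there is one RegionServer per node.

```java
final class WalCountHeuristic {
    // Suggested number of WALs: roughly one per group of disks that a
    // replicated stream keeps busy. Never returns less than 1.
    static int suggestedWalCount(int spindles, int replicas) {
        if (spindles <= 0 || replicas <= 0) {
            throw new IllegalArgumentException("spindles and replicas must be positive");
        }
        return Math.max(1, spindles / replicas);
    }

    public static void main(String[] args) {
        System.out.println(suggestedWalCount(12, 3));
    }
}
```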
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279667#comment-13279667 ] Todd Lipcon commented on HBASE-5699: I think with durable sync, having a WAL-per-region would be even less feasible than it is today -- we currently depend on batching in order to get good throughput. If a server has 50 regions, then you'd get 50x less batching opportunity and write throughput would grind to a halt. Imagine a fan-out write to all of the regions -- it would generate 50 disk seeks instead of just 1.
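The fan-out example above can be made concrete by counting how many distinct logs one batch touches, since each touched log needs its own sync. This is a toy model under the assumption that one sync costs roughly one seek; the class, the method, and the region-to-WAL mapping functions are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

final class FanOutSyncCount {
    // Number of distinct WALs a batch touches, i.e. the number of syncs
    // needed before the whole batch is durable.
    static int syncsNeeded(List<String> regionsInBatch, Function<String, Integer> walOf) {
        Set<Integer> touched = new HashSet<>();
        for (String region : regionsInBatch) {
            touched.add(walOf.apply(region));
        }
        return touched.size();
    }

    public static void main(String[] args) {
        List<String> regions = new ArrayList<>();
        for (int i = 0; i < 50; i++) {
            regions.add("region-" + i);
        }
        // One shared WAL: every region maps to WAL 0.
        System.out.println(syncsNeeded(regions, r -> 0));
        // WAL per region: every region maps to its own WAL.
        System.out.println(syncsNeeded(regions, r -> Integer.parseInt(r.substring(7))));
    }
}
```

A bounded number of WALs sits between the two extremes: the sync count per fan-out write is capped by the WAL count rather than by the region count.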
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279661#comment-13279661 ] Lars Hofhansl commented on HBASE-5699: I suspect this will become more important when people eventually turn on HBASE-5954 (durable sync; they may not in data centers with backup power supplies). bq. There would be many regions in a cluster. They may not receive even write load. Is that necessarily a problem? Just saying that while we are exploring this, we might as well explore this option too. I for one would be happy if a region's edits were tied to that region and log splitting could just go away (well, almost; we would still need to split if the region is split).
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279327#comment-13279327 ] Li Pi commented on HBASE-5699: By the way, I have finals and other things coming up, so it might be a while before I finish my implementation. If anyone else wants to have a go at it, that is cool.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279289#comment-13279289 ] Li Pi commented on HBASE-5699: My design is a bit different. I'll upload a patch soon. I'm doing any region to any HLog. Distributed log splitting and replication do not work yet.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279125#comment-13279125 ] Zhihong Yu commented on HBASE-5699: There would be many regions in a cluster, and they may not receive an even write load. We should add a configuration parameter that governs the maximum number of concurrent WALs on each region server.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13279078#comment-13279078 ] Lars Hofhansl commented on HBASE-5699: Should we explore a WAL per Region? It would be a lot of open files, but if it worked, we wouldn't need log splitting anymore.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278927#comment-13278927 ] Zhihong Yu commented on HBASE-5699: @Ramkrishna: Your numbers look better than mine, though the mix in my case was 50% updates and 50% puts. Can you publish latency numbers as well?
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278615#comment-13278615 ] ramkrishna.s.vasudevan commented on HBASE-5699: We are also interested in this. We worked on a prototype that has one HLog instance with multiple writer instances underneath. Each region is assigned to one of the writer instances and writes to the HLog through that instance. On log rolling, the writer instance mapped to each region is updated, and the region continues to use its mapping. Without the patch: ~53K puts/sec. With the patch: ~78-80K puts/sec. This was on a 3-node cluster with records of 1k each and 2800 regions, using 3 writer instances by default. I was able to pass the test cases in TestHLog and TestDistributedLogSplitting, but TestMasterReplication was not passing. Replication needs some changes for this, which I have not worked on much. The pendingWrites list that we use is now a map from each writer to its list of pending writes. Please provide your suggestions on this. BTW, Li Pi, any progress on this? I would love to help you with it. Maybe I can prepare a more formal patch and upload it here.
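A minimal sketch of the shape described in the comment above: a fixed set of writers, a sticky region-to-writer mapping, and pendingWrites kept as a map from writer to its list. This is not the prototype's code. The class and method names are hypothetical, edits are plain strings for brevity, and hashing the region name is an assumed assignment policy, since the comment only says each region gets "any one" of the writers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class WriterAssignmentSketch {
    private final int writerCount;
    // Sticky mapping: once a region has a writer, it keeps using it.
    private final Map<String, Integer> regionToWriter = new ConcurrentHashMap<>();
    // Pending writes per writer, replacing a single shared pendingWrites list.
    private final Map<Integer, List<String>> pendingWrites = new HashMap<>();

    WriterAssignmentSketch(int writerCount) {
        if (writerCount <= 0) {
            throw new IllegalArgumentException("writerCount must be positive");
        }
        this.writerCount = writerCount;
        for (int i = 0; i < writerCount; i++) {
            pendingWrites.put(i, new ArrayList<>());
        }
    }

    // Assumed policy: hash the encoded region name onto a writer index.
    int writerFor(String encodedRegionName) {
        return regionToWriter.computeIfAbsent(
            encodedRegionName, name -> Math.floorMod(name.hashCode(), writerCount));
    }

    synchronized void append(String encodedRegionName, String edit) {
        pendingWrites.get(writerFor(encodedRegionName)).add(edit);
    }

    // Hands back and clears everything queued for one writer, as a sync would.
    synchronized List<String> drain(int writer) {
        List<String> queued = pendingWrites.get(writer);
        List<String> batch = new ArrayList<>(queued);
        queued.clear();
        return batch;
    }
}
```

Keeping the mapping sticky means all edits of one region stay in one stream, in order, which keeps per-region sequence accounting simple.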
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268789#comment-13268789 ] Zhihong Yu commented on HBASE-5699: bq. If you have multiple hlogs do you use a different hlog in different regions? Correct. I have to go through legal procedure at my employer before disclosing my patch.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268786#comment-13268786 ] Jonathan Hsieh commented on HBASE-5699: Zhihong, I'm curious to learn about the approach you have taken in the prototype that you have. Is it on github somewhere perhaps? If you have multiple hlogs do you use a different hlog in different regions? Do you have a shim that looks like an hlog but has two hlogs inside it (as opposed to hdfs file handles)?
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268697#comment-13268697 ] Zhihong Yu commented on HBASE-5699: bq. there is still a single "HLog" class, but underneath it would be multiple "SequenceFileLogWriters" My approach is different from the above. The new interface should be general enough that multi-HLog can be implemented without the requirement that HLog have multiple writers.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268216#comment-13268216 ] Li Pi commented on HBASE-5699: I thought about the compression bit already. I was going to compress each separate log individually. Yeah, I probably should have written up what I was going to do before hacking stuff up. I will switch gears and work on that a bit instead.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268196#comment-13268196 ] Jonathan Hsieh commented on HBASE-5699: Li, if you want to undertake this I'll help. Let's chat and then write a one- to two-page summary of what our goals are, what our assumptions are, what our intended mechanisms are, and how we are going to test this, and then loop back here with a design/plan to get feedback. Another "feature" that may come into play is HLog compression.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268110#comment-13268110 ] stack commented on HBASE-5699: @Li You should be able to work wherever you like if you make up a harness for running hlog implementations apart from hbase. This should be the first order of business (unless you are a masochist). It should be easy enough, if it's not possible already, especially after you make it pluggable. Regarding "I'm assuming we don't need to guarantee HLog edit sequencing. If we do, this becomes a bit harder." -- well, that's the way it is currently, so the onus will be on you to come up with a reason why it could be otherwise. In-order makes it easier to reason about whether or not all edits up to a particular sequence id have been sync'd. And don't forget the other side of the moon, the (distributed) log splitting story. That needs to work too after you are done.
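To illustrate why in-order sequencing makes the "synced up to sequence id X" question easy, and what changes with several logs, here is a small sketch. It assumes a single shared sequence-id generator and that each WAL can report the lowest id it has appended but not yet synced; the class, constant, and method names are hypothetical and none of this is existing HBase code.

```java
final class SyncWatermark {
    // Marker for a WAL with no appended-but-unsynced edits.
    static final long NOTHING_PENDING = Long.MAX_VALUE;

    // Highest sequence id X such that every edit with id <= X is synced.
    // With one WAL this is just "last synced id"; with several, it is bounded
    // by the WAL that lags furthest behind.
    static long syncedUpTo(long highestAssigned, long[] lowestUnsyncedPerWal) {
        long watermark = highestAssigned;
        for (long lowest : lowestUnsyncedPerWal) {
            if (lowest != NOTHING_PENDING) {
                watermark = Math.min(watermark, lowest - 1);
            }
        }
        return watermark;
    }

    public static void main(String[] args) {
        long[] pending = {NOTHING_PENDING, 7L, 9L};
        System.out.println(syncedUpTo(10L, pending));
    }
}
```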
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268103#comment-13268103 ] Zhihong Yu commented on HBASE-5699: This feature requires validation in a real cluster. @Jonathan: Are you able to help Li in this regard? From my experience in the past three weeks, development involves coding -> running test suite -> discovering defects through failed unit tests -> bug fixing -> validation through ycsb -> ...
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268101#comment-13268101 ] Li Pi commented on HBASE-5699: While performance numbers will change, you can simply test with multiple HLogs on and with multiple HLogs off. I don't think we're going to move everyone over to multiple HLogs immediately. I will take a look at those tests.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268093#comment-13268093 ] Zhihong Yu commented on HBASE-5699: Here're the key unit tests that must pass: TestDistributedLogSplitting, TestReplication, TestMasterReplication, TestMultiSlaveReplication, TestHLog, TestHLogSplit, TestLogRollAbort, TestLogRolling
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268055#comment-13268055 ] Zhihong Yu commented on HBASE-5699: Using trunk has the drawback that performance numbers (without this feature) gathered on day N may be obsolete by day N + 5, considering the amount of changes going into trunk. I would suggest tackling replication as first priority. Dictionary WAL compression brought unexpected complexities w.r.t. replication. We shouldn't make replication any harder. w.r.t. refactoring HLog into an interface, I tend to think that the interface should make different implementations possible. If we only have one implementation, it is not easy to evaluate the effectiveness of the refactoring.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268048#comment-13268048 ] Li Pi commented on HBASE-5699: Replication is the failure point. I haven't really worked on that yet. I talked to Jon about the dev process. I'll create a separate JIRA for refactoring HLog into an interface. I'll probably continue to work within trunk. A separate JIRA should make things easier though.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268039#comment-13268039 ] Zhihong Yu commented on HBASE-5699: Are the replication-related unit tests passing? Since the review process would take at least a month, I think developing against a branch would be good practice.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268010#comment-13268010 ] Li Pi commented on HBASE-5699: I'm assuming we don't need to guarantee HLog edit sequencing. If we do, this becomes a bit harder.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265648#comment-13265648 ] ramkrishna.s.vasudevan commented on HBASE-5699: Do we need to guarantee HLog edit sequencing even with multiple WALs? Just referring to Stack's comment in https://issues.apache.org/jira/browse/HBASE-5782?focusedCommentId=13255344&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13255344
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265620#comment-13265620 ] Zhihong Yu commented on HBASE-5699: Currently we maintain one sequence number per region per HLog. From append(): {code} this.lastSeqWritten.putIfAbsent(regionInfo.getEncodedNameAsBytes(), Long.valueOf(seqNum)); {code} If the WALEdits from a particular region can be spread across multiple streams, the accounting would be more complex.
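A sketch of the extra accounting mentioned above, for the case where one region's edits may land in several streams. It keeps one map per stream, mirroring the putIfAbsent call quoted from append(), and has to scan every stream to answer a per-region question. The class and method names are hypothetical, region names are plain strings rather than byte arrays, and the reading of lastSeqWritten as "oldest sequence number not yet flushed" is an interpretation of the snippet rather than something the comment states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class LastSeqWrittenSketch {
    // One region -> sequence-number map per underlying stream.
    private final List<ConcurrentMap<String, Long>> perStream = new ArrayList<>();

    LastSeqWrittenSketch(int streams) {
        for (int i = 0; i < streams; i++) {
            perStream.add(new ConcurrentHashMap<>());
        }
    }

    // Same idiom as the quoted append(): only the first sequence number seen
    // for a region on a given stream is kept.
    void recordAppend(int stream, String encodedRegionName, long seqNum) {
        perStream.get(stream).putIfAbsent(encodedRegionName, seqNum);
    }

    // With edits spread over streams, the answer for a region is the minimum
    // over all streams; returns -1 if the region has nothing recorded.
    long oldestUnflushed(String encodedRegionName) {
        long oldest = -1L;
        for (ConcurrentMap<String, Long> stream : perStream) {
            Long seq = stream.get(encodedRegionName);
            if (seq != null && (oldest == -1L || seq < oldest)) {
                oldest = seq;
            }
        }
        return oldest;
    }
}
```

If each region is pinned to a single stream instead, the scan collapses to one lookup, which is one argument for a sticky region-to-WAL mapping.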
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265606#comment-13265606 ] Zhihong Yu commented on HBASE-5699: bq. to one "HLog" object, which might have more than one underlying stream. The above can be a (sub-)task by itself.
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265596#comment-13265596 ] Todd Lipcon commented on HBASE-5699:
bq. Currently each HRegion has reference to the HLog it uses. If requests can be freely redirected to the HLog instance having fewer outstanding requests, the reference would be to that of the region server.
Sorry, I should be less free-wheeling with my terminology. My thought was that there is still a single "HLog" class, but underneath it would be multiple "SequenceFileLogWriters", most likely. Though maybe the correct implementation is to make HLog an interface, and then have a MultiHLog which wraps N other HLogs or something. Either way, any region would only have a reference to one "HLog" object, which might have more than one underlying stream.
> Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi
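The "interface plus wrapper" shape described above might look roughly like the following. This is a minimal sketch under stated assumptions, not the design that was eventually committed: `SimpleLog`, `InMemoryLog`, and `MultiLog` are hypothetical names, and routing by a hash of the region name is an assumption made here so that one region's edits stay on one underlying stream.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal log interface; stands in for an "HLog" interface. */
interface SimpleLog {
    void append(String regionName, String edit);
}

/** Hypothetical single-stream log that just keeps entries in memory. */
final class InMemoryLog implements SimpleLog {
    final List<String> entries = new ArrayList<>();

    @Override
    public void append(String regionName, String edit) {
        entries.add(regionName + ":" + edit);
    }
}

/** Hypothetical "MultiHLog": callers hold one reference, N streams sit underneath. */
final class MultiLog implements SimpleLog {
    private final List<SimpleLog> delegates;

    MultiLog(List<? extends SimpleLog> delegates) {
        if (delegates.isEmpty()) {
            throw new IllegalArgumentException("need at least one underlying log");
        }
        this.delegates = new ArrayList<SimpleLog>(delegates);
    }

    @Override
    public void append(String regionName, String edit) {
        // Assumption: route by region name so a region always maps to the same stream.
        int index = Math.floorMod(regionName.hashCode(), delegates.size());
        delegates.get(index).append(regionName, edit);
    }
}
```

A region would be handed the `MultiLog` as its only log reference, which is the property the comment asks for. A load-based routing policy could replace the hash without changing the caller.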
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265567#comment-13265567 ] Jonathan Hsieh commented on HBASE-5699: --- The argument here is mostly aimed at read latency, but a similar idea could be used for write latency as well. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265565#comment-13265565 ] Jonathan Hsieh commented on HBASE-5699: --- Part of the motivation for multiple WALs can be found in this tech talk (most relevant to HBase is backup requests, starting at slide 39): http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/people/jeff/Berkeley-Latency-Mar2012.pdf > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265521#comment-13265521 ] Zhihong Yu commented on HBASE-5699: --- Trying to understand the implications of Todd's suggestion above. Currently each HRegion holds a reference to the HLog it uses. If requests can be freely redirected to the HLog instance with fewer outstanding requests, the reference would instead be to the region server's. This means additional logic on the region server for dispatching write requests. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265426#comment-13265426 ] Li Pi commented on HBASE-5699: -- Agree with Todd on the implementation details. The switching of logs should also serve to help balance our log writes. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265421#comment-13265421 ] Todd Lipcon commented on HBASE-5699:
bq. I disagree, considering that most of the deployments have rep=3 you're using three spindles not one
That said, most of our customers are deploying 6 disks if not 12 :) IMO the other big gain we can get from multiple WALs is to automatically switch between WALs when one gets "slow". IMO we should maintain a count of outstanding requests (probably by size) for each WAL, and submit writes to whichever has fewer outstanding requests. That way if one is faster, it will take more of the load. Then simultaneously measure trailing latency stats on each WAL, and if one is significantly slower than the other for some period of time, have it roll (to try to get a new set of disks/nodes).
> Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi
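The dispatch rule proposed above (track outstanding work per WAL by size, send each write to the least-loaded one) can be sketched as follows. All names here (`WalSlot`, `LeastLoadedWalPicker`, `submit`, `complete`) are hypothetical and nothing below is taken from an HBase patch. The latency-based roll is left out, and the pick-then-charge step is not atomic, which a real implementation would have to address.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for one underlying WAL stream. */
final class WalSlot {
    final String name;
    // Bytes submitted to this WAL that have not yet been acknowledged.
    final AtomicLong outstandingBytes = new AtomicLong();

    WalSlot(String name) {
        this.name = name;
    }
}

/** Hypothetical dispatcher that favors the WAL with the fewest outstanding bytes. */
final class LeastLoadedWalPicker {
    private final List<WalSlot> slots = new ArrayList<>();

    LeastLoadedWalPicker(int walCount) {
        if (walCount < 1) {
            throw new IllegalArgumentException("need at least one WAL");
        }
        for (int i = 0; i < walCount; i++) {
            slots.add(new WalSlot("wal-" + i));
        }
    }

    /** Picks the least-loaded slot (ties go to the lowest index) and charges the edit to it. */
    WalSlot submit(long editBytes) {
        WalSlot best = slots.get(0);
        for (WalSlot slot : slots) {
            if (slot.outstandingBytes.get() < best.outstandingBytes.get()) {
                best = slot;
            }
        }
        best.outstandingBytes.addAndGet(editBytes);
        return best;
    }

    /** Releases the charge once the write has been acknowledged. */
    void complete(WalSlot slot, long editBytes) {
        slot.outstandingBytes.addAndGet(-editBytes);
    }
}
```

Because a faster WAL completes writes sooner, its outstanding count drops sooner and it naturally receives more of the load, which is the behavior the comment describes.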
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265167#comment-13265167 ] Jean-Daniel Cryans commented on HBASE-5699: ---
bq. Intuitively it seems like the number of WAL's that are used should be related to the number of spindles available to hbase.
I disagree, considering that most of the deployments have rep=3 you're using three spindles not one. The multiplying effect could generate a lot of disk seeks since the WALs are competing like that (plus flushing, compacting, etc).
> Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265128#comment-13265128 ] Zhihong Yu commented on HBASE-5699: --- Currently I use the following knob for the maximum number of WALs on an individual region server:
{code}
+int totalInstances = conf.getInt("hbase.regionserver.hlog.total", DEFAULT_MAX_HLOG_INSTANCES);
{code}
> Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi
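For readers without the prototype patch at hand, the knob above amounts to "read an integer setting, fall back to a default". The sketch below imitates that with `java.util.Properties` as a stand-in for the Hadoop `Configuration` object so that it runs on its own. The class `WalCountConfig` is hypothetical, the default is passed in by the caller because the value of `DEFAULT_MAX_HLOG_INSTANCES` is not given in the thread, and the lower-bound check is an added assumption.

```java
import java.util.Properties;

/** Hypothetical helper; Properties stands in for the Hadoop Configuration object. */
final class WalCountConfig {
    // Key taken from the prototype snippet quoted above.
    static final String KEY = "hbase.regionserver.hlog.total";

    /** Returns the configured WAL count, or defaultValue when the key is unset. */
    static int totalInstances(Properties conf, int defaultValue) {
        String raw = conf.getProperty(KEY);
        if (raw == null) {
            return defaultValue;
        }
        int parsed = Integer.parseInt(raw.trim());
        // Assumption: a region server needs at least one WAL.
        if (parsed < 1) {
            throw new IllegalArgumentException(KEY + " must be >= 1, got " + parsed);
        }
        return parsed;
    }
}
```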
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265112#comment-13265112 ] Elliott Clark commented on HBASE-5699: -- Intuitively it seems like the number of WALs that are used should be related to the number of spindles available to HBase. So maybe this should be either a configurable number or something that is derived from the number of mount points HDFS is hosted on? > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13265076#comment-13265076 ] Zhihong Yu commented on HBASE-5699: --- Playing with a prototype of this feature using ycsb (half insert, half update) on a 5-node cluster where usertable has 13 regions on each region server.
Without this feature:
{code}
10 sec: 99965 operations; 9996.5 current ops/sec; [UPDATE AverageLatency(us)=258.68] [INSERT AverageLatency(us)=610.28]
20 sec: 99965 operations; 0 current ops/sec;
25 sec: 0 operations; 4.3 current ops/sec; [UPDATE AverageLatency(us)=2594303.62] [INSERT AverageLatency(us)=1240495.41]
[OVERALL], RunTime(ms), 25844.0
[OVERALL], Throughput(ops/sec), 3868.9831295465096
[UPDATE], Operations, 49935
[UPDATE], AverageLatency(us), 674.2635626314209
{code}
With this feature:
{code}
10 sec: 99952 operations; 9994.2 current ops/sec; [UPDATE AverageLatency(us)=178.7] [INSERT AverageLatency(us)=584.76]
20 sec: 0 operations; 3.8 current ops/sec; [UPDATE AverageLatency(us)=10.88] [INSERT AverageLatency(us)=679174.27]
20 sec: 0 operations; 0 current ops/sec;
[OVERALL], RunTime(ms), 20867.0
[OVERALL], Throughput(ops/sec), 4791.776489193463
[UPDATE], Operations, 49992
[UPDATE], AverageLatency(us), 178.6439030244839
{code}
> Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi
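To put the two runs side by side: using only the [OVERALL] throughput and [UPDATE] average latency figures reported above, overall throughput rises by roughly 24% and average UPDATE latency falls by roughly 73%. The helper below, a hypothetical class that is not part of ycsb or HBase, shows the arithmetic. These are single runs on a small cluster, so the percentages should be read as indicative only.

```java
/** Hypothetical helper for comparing two benchmark figures. */
final class YcsbComparison {
    /** Relative change from baseline to candidate; 0.25 means +25%, -0.5 means -50%. */
    static double relativeChange(double baseline, double candidate) {
        return (candidate - baseline) / baseline;
    }

    /** Formats a relative change as a signed percentage with one decimal place. */
    static String asPercent(double change) {
        return String.format("%+.1f%%", change * 100.0);
    }
}
```

For example, `YcsbComparison.relativeChange(3868.9831295465096, 4791.776489193463)` gives the throughput change, and `YcsbComparison.relativeChange(674.2635626314209, 178.6439030244839)` gives the UPDATE latency change.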
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260020#comment-13260020 ] stack commented on HBASE-5699: -- @Ted Would suggest you just leave it. When you delete, we all get a message in our mailbox about the delete transaction. Then we start to wonder... > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260008#comment-13260008 ] Zhihong Yu commented on HBASE-5699: --- It was a duplicate message. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260005#comment-13260005 ] stack commented on HBASE-5699: -- @Ted Why delete a comment, especially someone else's? > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin >Assignee: Li Pi
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250471#comment-13250471 ] Li Pi commented on HBASE-5699: -- This seems interesting. I'll take a look at doing this. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250470#comment-13250470 ] Li Pi commented on HBASE-5699: -- This seems interesting. I'll take a look at doing this. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250374#comment-13250374 ] binlijin commented on HBASE-5699: - I only ran a write test; I did not test recovery or anything else. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249902#comment-13249902 ] stack commented on HBASE-5699: -- @binlijin What Chunhui says. I'd think that if it were a bigger cluster you'd see a more marked improvement. What about recovery? How does log splitting work with all the extra WALs? > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249686#comment-13249686 ] chunhui shen commented on HBASE-5699: - I think the number of datanodes in the test is a little low. With twice as many HLog writers in the RS, write performance should nearly double unless it is limited by HDFS. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249664#comment-13249664 ] binlijin commented on HBASE-5699: - @stack, I ran a test on version 0.90 with 10 writers and 3 nodes. Sometimes write performance doubled, though the test may not have been very good. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin
[jira] [Commented] (HBASE-5699) Run with > 1 WAL in HRegionServer
[ https://issues.apache.org/jira/browse/HBASE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246971#comment-13246971 ] Juhani Connolly commented on HBASE-5699: Since we have had a similar experience, I am posting it here: We are finding most of the IPC threads in our region servers locked in HWal.append (42 out of 50; of those, 20 are in sync and one is actually working, as is to be expected). We presumed that the WAL synchronisation mechanisms were holding things up and decided to try running multiple RS per node, since we had a significant amount of free CPU and memory resources as well as many barely active hard disks. By running 3 RS per node, we saw our application-specific throughput go from 7k events to 18k. Each event is made up of roughly 2 writes and 2 increments, plus some reads/scans which shouldn't be touching the WAL. This situation is partly due to a very high spec per node. I don't think it would be necessary on more "commodity" type servers, but the option to use multiple WALs on each region server may well give significant throughput gains for some hardware setups. > Run with > 1 WAL in HRegionServer > - > > Key: HBASE-5699 > URL: https://issues.apache.org/jira/browse/HBASE-5699 > Project: HBase > Issue Type: Improvement >Reporter: binlijin