[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839024#comment-16839024 ] HBase QA commented on HBASE-12636: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 2s{color} | {color:blue} The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/in-progress/precommit-patchnames for instructions. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} HBASE-12636 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/in-progress/precommit-patchnames for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HBASE-12636 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12685233/HBASE-12635-v2.diff | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/308/console | | Powered by | Apache Yetus 0.9.0 http://yetus.apache.org | This message was automatically generated. > Avoid too many write operations on zookeeper in replication > --- > > Key: HBASE-12636 > URL: https://issues.apache.org/jira/browse/HBASE-12636 > Project: HBase > Issue Type: Improvement > Components: Replication >Affects Versions: 0.94.11 >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Major > Labels: replication > Attachments: HBASE-12635-v2.diff, HBASE-12636-v1.diff > > > In our production cluster, we found there are about over 1k write operations > per second on zookeeper from hbase replication. The reason is that the > replication source will write the log position to zookeeper for every edit > shipping. If the current replicating WAL is just the WAL that regionserver is > writing to, each skipping will be very small but the frequency is very high, > which causes many write operations on zookeeper. > A simple solution is that writing log position to zookeeper when position > diff or skipped edit number is larger than a threshold, not every edit > shipping. > Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16112050#comment-16112050 ] Hadoop QA commented on HBASE-12636: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 4s{color} | {color:blue} The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.4.0/precommit-patchnames for instructions. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s{color} | {color:red} HBASE-12636 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HBASE-12636 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12685233/HBASE-12635-v2.diff | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7903/console | | Powered by | Apache Yetus 0.4.0 http://yetus.apache.org | This message was automatically generated. > Avoid too many write operations on zookeeper in replication > --- > > Key: HBASE-12636 > URL: https://issues.apache.org/jira/browse/HBASE-12636 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.94.11 >Reporter: Liu Shaohui >Assignee: Liu Shaohui > Labels: replication > Attachments: HBASE-12635-v2.diff, HBASE-12636-v1.diff > > > In our production cluster, we found there are about over 1k write operations > per second on zookeeper from hbase replication. The reason is that the > replication source will write the log position to zookeeper for every edit > shipping. If the current replicating WAL is just the WAL that regionserver is > writing to, each skipping will be very small but the frequency is very high, > which causes many write operations on zookeeper. > A simple solution is that writing log position to zookeeper when position > diff or skipped edit number is larger than a threshold, not every edit > shipping. > Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16111982#comment-16111982 ] stack commented on HBASE-12636: --- My guess is this is still a problem. Unscheduling though till someone takes it up in context of current hbase. > Avoid too many write operations on zookeeper in replication > --- > > Key: HBASE-12636 > URL: https://issues.apache.org/jira/browse/HBASE-12636 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.94.11 >Reporter: Liu Shaohui >Assignee: Liu Shaohui > Labels: replication > Attachments: HBASE-12635-v2.diff, HBASE-12636-v1.diff > > > In our production cluster, we found there are about over 1k write operations > per second on zookeeper from hbase replication. The reason is that the > replication source will write the log position to zookeeper for every edit > shipping. If the current replicating WAL is just the WAL that regionserver is > writing to, each skipping will be very small but the frequency is very high, > which causes many write operations on zookeeper. > A simple solution is that writing log position to zookeeper when position > diff or skipped edit number is larger than a threshold, not every edit > shipping. > Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14598568#comment-14598568 ] Lars Hofhansl commented on HBASE-12636: --- Sorry... Missed this. The patch's approach is to perform the replication and simply not log every advance. If I understand the issue correctly this is for the case we write into the cluster at a medium pace and hence each check on whether there's something to replicate would only find a few edits. Hence another approach is to simply slow down replication a bit. I.e. if we only have a few edits we wait a bit longer to edits to come in, only then we replicate the data and advance the pointer in ZK. That would also improve replication efficiency. Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 2.0.0 Attachments: HBASE-12635-v2.diff, HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543131#comment-14543131 ] Liu Shaohui commented on HBASE-12636: - [~stack] [~lhofhansl] [~apurtell] Any more suggestions about this patch? Could you help to push this? Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 2.0.0, 1.2.0 Attachments: HBASE-12635-v2.diff, HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14366894#comment-14366894 ] Liu Shaohui commented on HBASE-12636: - [~lhofhansl] [~stack] Any suggestions about this patch? The write operations to zookeeper from replication decrease to several hundreds from 5 thousands per second in our cluster with this patch. Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 1.1.0 Attachments: HBASE-12635-v2.diff, HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248100#comment-14248100 ] Liu Shaohui commented on HBASE-12636: - [~lhofhansl] [~stack] Repeated replication data will happen in many cases in current codebase. - The network problem which make master cluster did not get response of replication but the replication data has been written into in slave cluster. - A moving region in salve cluster which make the replication failed, but part of replication data has been into in slave cluster. This patch just make repeated replication data more frequently. We may open another issue to check if the replication operation is idempotent? Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 1.0.0 Attachments: HBASE-12635-v2.diff, HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14246330#comment-14246330 ] stack commented on HBASE-12636: --- bq. if the frequency is too high, just check a little less often and let more data accumulate. That'd keep current 'style'. Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 1.0.0 Attachments: HBASE-12635-v2.diff, HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14245673#comment-14245673 ] Hadoop QA commented on HBASE-12636: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12685233/HBASE-12635-v2.diff against master branch at commit 65830b096b6f540b7e49ef590dac1ebe2491c126. ATTACHMENT ID: 12685233 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/12072//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12072//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12072//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12072//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12072//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12072//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12072//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12072//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12072//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12072//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12072//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/12072//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/12072//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12072//console This message is automatically generated. Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 1.0.0 Attachments: HBASE-12635-v2.diff, HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14245817#comment-14245817 ] Lars Hofhansl commented on HBASE-12636: --- Is it really OK to double replicate data? It's not doing that now, right? It's especially bad when two clusters are setup in master-master replication and both are active. If you all feel that's OK, let's commit... -0 from me. Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 1.0.0 Attachments: HBASE-12635-v2.diff, HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14245821#comment-14245821 ] stack commented on HBASE-12636: --- bq. Is it really OK to double replicate data? It's not doing that now, right? The duplicate will just overwrite. Counters will be slightly off. I was thinking this only repercussion. Better than buckling your local zk ensemble with updates I'd say. Fine holding on commit till gets a bit more attention. Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 1.0.0 Attachments: HBASE-12635-v2.diff, HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14245824#comment-14245824 ] Lars Hofhansl commented on HBASE-12636: --- Was thinking that if both cluster are active a write that is repeated much later will might cause issues. We can also drive down the frequency of the checks... Maybe? I.e. if the frequency is too high, just check a little less often and let more data accumulate. Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 1.0.0 Attachments: HBASE-12635-v2.diff, HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234418#comment-14234418 ] Lars Hofhansl commented on HBASE-12636: --- Means in the event of failure we could double replicate some entries, right? Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 1.0.0 Attachments: HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234440#comment-14234440 ] Andrew Purtell commented on HBASE-12636: What about recording this information into a system table instead? We can start the migration of replication state out of ZooKeeper. Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 1.0.0 Attachments: HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234663#comment-14234663 ] stack commented on HBASE-12636: --- bq. Means in the event of failure we could double replicate some entries, right? Yes. Thats ok though, right? bq. What about recording this information into a system table instead? We can start the migration of replication state out of ZooKeeper. Yes. In a separate JIRA? Patch LGTM (The background code looks to be duplicated) Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 1.0.0 Attachments: HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234677#comment-14234677 ] Andrew Purtell commented on HBASE-12636: bq. Yes. In a separate JIRA? I'd say not ideally. Nobody's going to work on it. We keep putting it off. Now we're putting band aids on the problem. Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 1.0.0 Attachments: HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234968#comment-14234968 ] Enis Soztutar commented on HBASE-12636: --- unLogPositionEntries is always 0 in this patch? Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 1.0.0 Attachments: HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234974#comment-14234974 ] Liu Shaohui commented on HBASE-12636: - [~apurtell] {quote} What about recording this information into a system table instead? We can start the migration of replication state out of ZooKeeper. {quote} Agree. We may start to work on it in next month. Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 1.0.0 Attachments: HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12636) Avoid too many write operations on zookeeper in replication
[ https://issues.apache.org/jira/browse/HBASE-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234977#comment-14234977 ] Liu Shaohui commented on HBASE-12636: - [~enis] {quote} unLogPositionEntries is always 0 in this patch? {quote} Yes. I forgot to add the currentSize to unLogPositionEntries after edits are skippped. Avoid too many write operations on zookeeper in replication --- Key: HBASE-12636 URL: https://issues.apache.org/jira/browse/HBASE-12636 Project: HBase Issue Type: Improvement Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Labels: replication Fix For: 1.0.0 Attachments: HBASE-12636-v1.diff In our production cluster, we found there are about over 1k write operations per second on zookeeper from hbase replication. The reason is that the replication source will write the log position to zookeeper for every edit shipping. If the current replicating WAL is just the WAL that regionserver is writing to, each skipping will be very small but the frequency is very high, which causes many write operations on zookeeper. A simple solution is that writing log position to zookeeper when position diff or skipped edit number is larger than a threshold, not every edit shipping. Suggestions are welcomed, thx~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)