[jira] [Created] (HBASE-6379) [0.90 branch] Backport HBASE-6334 to 0.90
Gregory Chanan created HBASE-6379:
---

Summary: [0.90 branch] Backport HBASE-6334 to 0.90
Key: HBASE-6379
URL: https://issues.apache.org/jira/browse/HBASE-6379
Project: HBase
Issue Type: Task
Components: test
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Fix For: 0.90.7

See HBASE-6334 for details.

The issue is that HBASE-6334 detects both HBASE-4195 (which should be backported to 0.90 -- I'll file another JIRA for that) and HBASE-2856 (which is a known issue in 0.90 that won't be fixed because it requires a change to the HFile format). So in 0.90, we need a way to only catch HBASE-4195 failures and ignore HBASE-2856 failures. Luckily, HBASE-4195 only occurs *within* a column family, while HBASE-2856 occurs *between* column families, so we just need to add a little to the backport to differentiate.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6239) [replication] ReplicationSink uses the ts of the first KV for the other KVs in the same row
[ https://issues.apache.org/jira/browse/HBASE-6239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412564#comment-13412564 ]

Benoit Sigoure commented on HBASE-6239:
---

This means HBase replication will still corrupt timestamps in 0.90.7, which in many cases makes replication useless. Are you sure?

> [replication] ReplicationSink uses the ts of the first KV for the other KVs in the same row
> ---
>
> Key: HBASE-6239
> URL: https://issues.apache.org/jira/browse/HBASE-6239
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.6, 0.92.1
> Reporter: Jean-Daniel Cryans
> Assignee: Jean-Daniel Cryans
> Priority: Critical
> Labels: corruption
> Fix For: 0.92.2, 0.90.8
> Attachments: HBASE-6239-0.92-v1.patch
>
> ReplicationSink assumes that all the KVs for the same row inside a WALEdit will have the same timestamp, which is not necessarily the case.
> This only affects 0.90 and 0.92 since HBASE-5203 fixes it in 0.94
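The timestamp problem described in HBASE-6239 can be illustrated with a small model. This is only a sketch under stated assumptions: `Cell`, `applyWithFirstTimestampPerRow` and `applyPreservingTimestamps` are hypothetical names invented here, not ReplicationSink code, and the sketch does not claim to show how HBASE-5203 actually resolved the issue in 0.94.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the behaviour described in HBASE-6239; not HBase code. */
final class ReplicationTimestampSketch {

    /** Minimal stand-in for a KeyValue: row, qualifier and timestamp only. */
    static final class Cell {
        final String row;
        final String qualifier;
        final long ts;

        Cell(String row, String qualifier, long ts) {
            this.row = row;
            this.qualifier = qualifier;
            this.ts = ts;
        }
    }

    /** Reported behaviour: every cell of a row is stamped with that row's first timestamp. */
    static List<Cell> applyWithFirstTimestampPerRow(List<Cell> edit) {
        Map<String, Long> firstTsByRow = new HashMap<>();
        List<Cell> out = new ArrayList<>();
        for (Cell c : edit) {
            long ts = firstTsByRow.computeIfAbsent(c.row, r -> c.ts);
            out.add(new Cell(c.row, c.qualifier, ts));
        }
        return out;
    }

    /** Expected behaviour: each cell keeps the timestamp it was written with. */
    static List<Cell> applyPreservingTimestamps(List<Cell> edit) {
        List<Cell> out = new ArrayList<>();
        for (Cell c : edit) {
            out.add(new Cell(c.row, c.qualifier, c.ts));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Cell> edit = new ArrayList<>();
        edit.add(new Cell("r1", "a", 100L));
        edit.add(new Cell("r1", "b", 200L));
        for (Cell c : applyWithFirstTimestampPerRow(edit)) {
            System.out.println(c.row + "/" + c.qualifier + " @ " + c.ts);
        }
    }
}
```

In this model the second cell of a row loses its own timestamp in the first method, which is the kind of corruption the comment above is worried about shipping in 0.90.7.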
[jira] [Updated] (HBASE-6317) Master clean start up and Partially enabled tables make region assignment inconsistent.
[ https://issues.apache.org/jira/browse/HBASE-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

rajeshbabu updated HBASE-6317:
---
Attachment: HBASE-6317_94_3.patch

> Master clean start up and Partially enabled tables make region assignment inconsistent.
> ---
>
> Key: HBASE-6317
> URL: https://issues.apache.org/jira/browse/HBASE-6317
> Project: HBase
> Issue Type: Bug
> Reporter: ramkrishna.s.vasudevan
> Assignee: rajeshbabu
> Fix For: 0.92.2, 0.96.0, 0.94.1
> Attachments: HBASE-6317_94.patch, HBASE-6317_94_3.patch
>
> If we have a table in partially enabled state (ENABLING) then on HMaster restart we treat it as a clean cluster start up and do a bulk assign. Currently in 0.94 bulk assign will not handle ALREADY_OPENED scenarios and it leads to region assignment problems. Analysing this further, we found that we have a better way to handle these scenarios.
> {code}
> if (false == checkIfRegionBelongsToDisabled(regionInfo)
>     && false == checkIfRegionsBelongsToEnabling(regionInfo)) {
>   synchronized (this.regions) {
>     regions.put(regionInfo, regionLocation);
>     addToServers(regionLocation, regionInfo);
>   }
> {code}
> We don't add to the regions map so that the enable table handler can handle it. But as nothing is added to the regions map, we treat it as a clean cluster start up.
> Will come up with a patch tomorrow.
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412550#comment-13412550 ] Hadoop QA commented on HBASE-4050: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12536174/HBASE-4050-2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 7 javac compiler warnings (more than the trunk's current 4 warnings). -1 findbugs. The patch appears to introduce 12 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2367//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2367//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2367//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2367//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2367//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2367//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2367//console This message is automatically generated. 
> Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6370) Add compression codec test at HMaster when createTable/modifyColumn/modifyTable
[ https://issues.apache.org/jira/browse/HBASE-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ShiXing updated HBASE-6370:
---
Attachment: HBASE-6370-trunk-V2.patch

Yes, I think the configuration is more acceptable for a heterogeneous environment between master and regionservers. I set the configuration hbase.master.check.compression to default to true.

> Add compression codec test at HMaster when createTable/modifyColumn/modifyTable
> ---
>
> Key: HBASE-6370
> URL: https://issues.apache.org/jira/browse/HBASE-6370
> Project: HBase
> Issue Type: Improvement
> Reporter: ShiXing
> Assignee: ShiXing
> Priority: Minor
> Attachments: HBASE-6370-trunk-V1.patch, HBASE-6370-trunk-V2.patch
>
> We deployed a cluster in which none of the regionservers supports a compression codec such as "lzo", but the cluster user/client does not know this, and he specifies the family's compression codec by
> HColumnDescriptor.setCompressionType(Compression.Algorithm.LZO);
> Because HBaseAdmin's createTable is async, the client waits forever for all the regions of the table to come online. And the client does not know why the regions are not online until the HBase administrator finds this problem.
> Indeed, all of the regions are being assigned by the master, but the regionserver's openHRegion always fails.
> In my opinion, we can suppose all the cluster's environments are the same, meaning that if the master has some lib deployed, the regionservers should have it deployed too.
> Of course the above is just a supposition; in a real deployment, the HBase DBA may deploy the lib only on the regionservers or only on the master.
> So I think this failure can be found earlier, before the master creates the CreateTableHandler thread, and we can quickly tell the client that we don't support this compression codec type.
> I will upload the patch later.
[jira] [Commented] (HBASE-5498) Secure Bulk Load
[ https://issues.apache.org/jira/browse/HBASE-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412542#comment-13412542 ] Francis Liu commented on HBASE-5498: Hi Laxman, Looks right to me apart from the delegation token. You need to pass an hdfs delegation token because we'd like to impersonate the user when changing permissions on hdfs. Also the path doesn't need to be the full URI. Getting the token should be something like this: FileSystem fs = FileSystem.get(conf); Token token = fs.getDelegationToken("renewer"); Let me know how things go. -Francis > Secure Bulk Load > > > Key: HBASE-5498 > URL: https://issues.apache.org/jira/browse/HBASE-5498 > Project: HBase > Issue Type: Improvement > Components: mapred, security >Reporter: Francis Liu >Assignee: Francis Liu > Fix For: 0.96.0 > > Attachments: HBASE-5498_draft.patch > > > Design doc: > https://cwiki.apache.org/confluence/display/HCATALOG/HBase+Secure+Bulk+Load > Short summary: > Security as it stands does not cover the bulkLoadHFiles() feature. Users > calling this method will bypass ACLs. Also loading is made more cumbersome in > a secure setting because of hdfs privileges. bulkLoadHFiles() moves the data > from user's directory to the hbase directory, which would require certain > write access privileges set. > Our solution is to create a coprocessor which makes use of AuthManager to > verify if a user has write access to the table. If so, launches a MR job as > the hbase user to do the importing (ie rewrite from text to hfiles). One > tricky part this job will have to do is impersonate the calling user when > reading the input files. We can do this by expecting the user to pass an hdfs > delegation token as part of the secureBulkLoad() coprocessor call and extend > an inputformat to make use of that token. The output is written to a > temporary directory accessible only by hbase and then bulkloadHFiles() is > called. 
[jira] [Commented] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction
[ https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412538#comment-13412538 ]

Lars Hofhansl commented on HBASE-6377:
---

Perhaps we can have "Get" and "Update" metrics. "Updates" would include Put, Delete, ICV, etc.
But maybe that would require more discussion, so short term (0.94.1 at least), we could remove the Put/Delete/Get metrics as you suggest.

> HBASE-5533 metrics miss all operations submitted via MultiAction
> ---
>
> Key: HBASE-6377
> URL: https://issues.apache.org/jira/browse/HBASE-6377
> Project: HBase
> Issue Type: Bug
> Components: metrics, regionserver
> Affects Versions: 0.96.0, 0.94.1
> Reporter: Andrew Purtell
>
> A client application (LoadTestTool) calls put() on HTables. Internally to the HBase client those puts are batched into MultiActions. The total number of put operations shown in the RegionServer's put metrics histogram never increases from 0 even though millions of such operations are made. Needless to say the latency for those operations is not measured either. The value of HBASE-5533 metrics is suspect given the client will batch put and delete ops like this.
> I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction processing in HRegionServer would distinguish between puts and deletes and dispatch them separately. It was easy to account for the time for them. Now both puts and deletes are submitted in batch together as mutations.
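The accounting gap described in HBASE-6377 can be modelled in a few lines. This is a hedged sketch, not RegionServer code: `BatchMetricsSketch`, `OpType`, `multiUncounted` and `multiCounted` are hypothetical names, and plain counters stand in for the HBASE-5533 histograms.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of per-operation counters; not HBase code. */
final class BatchMetricsSketch {

    enum OpType { PUT, DELETE }

    private final Map<OpType, Long> counts = new EnumMap<>(OpType.class);

    /** Single-operation path: the only place a counter is bumped in the reported scenario. */
    void put() {
        increment(OpType.PUT);
    }

    /** Batched path with no accounting, mirroring the reported symptom. */
    void multiUncounted(List<OpType> batch) {
        // The mutations would be applied here, but nothing is recorded.
    }

    /** Batched path that records every mutation in the batch by its type. */
    void multiCounted(List<OpType> batch) {
        for (OpType op : batch) {
            increment(op);
        }
    }

    long count(OpType type) {
        return counts.getOrDefault(type, 0L);
    }

    private void increment(OpType type) {
        counts.merge(type, 1L, Long::sum);
    }

    public static void main(String[] args) {
        BatchMetricsSketch metrics = new BatchMetricsSketch();
        metrics.multiUncounted(Arrays.asList(OpType.PUT, OpType.PUT, OpType.DELETE));
        System.out.println("puts after uncounted batch: " + metrics.count(OpType.PUT));
        metrics.multiCounted(Arrays.asList(OpType.PUT, OpType.PUT, OpType.DELETE));
        System.out.println("puts after counted batch: " + metrics.count(OpType.PUT));
    }
}
```

Counting inside the batch loop only recovers operation counts; attributing latency per operation type is harder once puts and deletes are applied together, which is the concern raised in the thread.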
[jira] [Updated] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-4050: - Attachment: HBASE-4050-2.patch After code comments. > Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction
[ https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412519#comment-13412519 ] Andrew Purtell edited comment on HBASE-6377 at 7/12/12 5:10 AM: Perhaps not a full revert, but the "put" and "delete" metrics are not useful in a basic LoadTestTool test scenario, so consider dropping those. The distinction is increasingly only valid client side. Perhaps also remove the "get" one as well so we're not in effect special casing a metric only into HRI.get(). Edit: But even after the above, the histograms remain for FS level ops, so there's benefit to a partial revert only. was (Author: apurtell): Perhaps not a full revert, but the "put" and "delete" metrics are not useful in a basic LoadTestTool test scenario, so consider dropping those. The distinction is increasingly only valid client side. Perhaps also remove the "get" one as well so we're not in effect special casing a metric only into HRI.get(). > HBASE-5533 metrics miss all operations submitted via MultiAction > > > Key: HBASE-6377 > URL: https://issues.apache.org/jira/browse/HBASE-6377 > Project: HBase > Issue Type: Bug > Components: metrics, regionserver >Affects Versions: 0.96.0, 0.94.1 >Reporter: Andrew Purtell > > A client application (LoadTestTool) calls put() on HTables. Internally to the > HBase client those puts are batched into MultiActions. The total number of > put operations shown in the RegionServer's put metrics histogram never > increases from 0 even though millions of such operations are made. Needless > to say the latency for those operations are not measured either. The value of > HBASE-5533 metrics are suspect given the client will batch put and delete ops > like this. > I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction > processing in HRegionServer would distingush between puts and deletes and > dispatch them separately. It was easy to account for the time for them. 
Now > both puts and deletes are submitted in batch together as mutations.
[jira] [Commented] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction
[ https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412519#comment-13412519 ] Andrew Purtell commented on HBASE-6377: --- Perhaps not a full revert, but the "put" and "delete" metrics are not useful in a basic LoadTestTool test scenario, so consider dropping those. The distinction is increasingly only valid client side. Perhaps also remove the "get" one as well so we're not in effect special casing a metric only into HRI.get(). > HBASE-5533 metrics miss all operations submitted via MultiAction > > > Key: HBASE-6377 > URL: https://issues.apache.org/jira/browse/HBASE-6377 > Project: HBase > Issue Type: Bug > Components: metrics, regionserver >Affects Versions: 0.96.0, 0.94.1 >Reporter: Andrew Purtell > > A client application (LoadTestTool) calls put() on HTables. Internally to the > HBase client those puts are batched into MultiActions. The total number of > put operations shown in the RegionServer's put metrics histogram never > increases from 0 even though millions of such operations are made. Needless > to say the latency for those operations are not measured either. The value of > HBASE-5533 metrics are suspect given the client will batch put and delete ops > like this. > I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction > processing in HRegionServer would distingush between puts and deletes and > dispatch them separately. It was easy to account for the time for them. Now > both puts and deletes are submitted in batch together as mutations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412515#comment-13412515 ] Hadoop QA commented on HBASE-4050: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12536167/HBASE-4050-1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 7 javac compiler warnings (more than the trunk's current 4 warnings). -1 findbugs. The patch appears to introduce 11 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2366//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2366//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2366//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2366//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2366//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2366//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2366//console This message is automatically generated. 
> Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412513#comment-13412513 ] Elliott Clark commented on HBASE-4050: -- bq.ResourceFinder seems to have a few nice properties but do we actually require them? Not 100%, we can get around ServiceLoader's short comings by having the ServiceLoader create factories that take in arguments and pass them to a constructor, however it would be cleaner . bq.why did you pick this particular implementation of ResourceFinder from xbeans-3.7 A friend sent me to that exact link, saying they were using it. I can pull a newer one. > Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412509#comment-13412509 ] Elliott Clark commented on HBASE-4050: -- bq.ReplicationMetricsSource javadoc is to be filled. bq.And some catch clauses have boilerplate code Agreed. I'll get a patch up soon. bq.I thought author name shouldn't appear in the file header: I was trying to keep the source as close to the original as possible. I'm open for whatever; I was just trying to make sure that the people who wrote it got credit. bq.Consider using uppercase M in the string below Hadoop's metrics2 uses all lowercase for context. https://github.com/apache/hadoop-common/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java#L68 bq.Do we need to check that delta is non-negative ? Nope. Hadoop doesn't so I followed suit. bq.Maybe give the assembly file a more descriptive name ? Sure > Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6336) Split point should not be equal with start row or end row
[ https://issues.apache.org/jira/browse/HBASE-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412504#comment-13412504 ]

ramkrishna.s.vasudevan commented on HBASE-6336:
---

@Stack
bq. If so, why we write a flush file if no KVs?
Yes, we are writing an empty one now. In case of compaction we create an empty file. Once the region is split we compact, so an empty file is created for an empty region.
See HBASE-6059 - Replaying recovered edits would make deleted data exist again
There I had a concern about creating an empty store file, but it was needed. So do you see any problem there, Stack?

> Split point should not be equal with start row or end row
> ---
>
> Key: HBASE-6336
> URL: https://issues.apache.org/jira/browse/HBASE-6336
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.96.0
> Attachments: HBASE-6336.patch
>
> Should we allow split point equal with region's start row or end row?
> {code}
> // if the midkey is the same as the first and last keys, then we cannot
> // (ever) split this region.
> if (this.comparator.compareRows(mk, firstKey) == 0 &&
>     this.comparator.compareRows(mk, lastKey) == 0) {
>   if (LOG.isDebugEnabled()) {
>     LOG.debug("cannot split because midkey is the same as first or " +
>       "last row");
>   }
> {code}
> Here, I think it is a mistake.
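The difference between the quoted `&&` condition and the "first or last row" wording of its log message can be seen in a standalone sketch. Everything here is an assumption for illustration: `SplitPointCheck` and its methods are hypothetical, unsigned lexicographic byte comparison stands in for `compareRows`, and the `||` variant is only one reading of the report, not necessarily what the attached patch does.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical stand-in for the split check discussed in HBASE-6336; not HBase code. */
final class SplitPointCheck {

    /** Unsigned lexicographic comparison, assumed here as a substitute for compareRows(). */
    static int compareRows(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    /** Check as quoted in the issue: refuses only when midkey equals BOTH boundaries. */
    static boolean canSplitWithAnd(byte[] mk, byte[] first, byte[] last) {
        return !(compareRows(mk, first) == 0 && compareRows(mk, last) == 0);
    }

    /** Variant matching the log message: refuses when midkey equals EITHER boundary. */
    static boolean canSplitWithOr(byte[] mk, byte[] first, byte[] last) {
        return !(compareRows(mk, first) == 0 || compareRows(mk, last) == 0);
    }

    public static void main(String[] args) {
        byte[] first = "row-a".getBytes(StandardCharsets.UTF_8);
        byte[] last = "row-z".getBytes(StandardCharsets.UTF_8);
        // The midkey coincides with the first row only.
        System.out.println("and-check allows split: " + canSplitWithAnd(first, first, last));
        System.out.println("or-check allows split: " + canSplitWithOr(first, first, last));
    }
}
```

With the `&&` form, a midkey equal to only one boundary still passes the check, so a split could be attempted at the region's own start or end row.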
[jira] [Resolved] (HBASE-5798) NPE running hbck on 0.94 out of reportTablesInFlux
[ https://issues.apache.org/jira/browse/HBASE-5798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John resolved HBASE-5798.
---
Resolution: Duplicate

The issue with NPE is fixed as part of HBASE-5928.

> NPE running hbck on 0.94 out of reportTablesInFlux
> ---
>
> Key: HBASE-5798
> URL: https://issues.apache.org/jira/browse/HBASE-5798
> Project: HBase
> Issue Type: Bug
> Components: hbck
> Affects Versions: 0.94.0, 0.96.0
> Reporter: stack
> Assignee: Anoop Sam John
> Attachments: HBASE-5798_94.patch, HBASE-5798_trunk.patch
>
> Got this playing w/ hbck going against the 0.94RC:
> {code}
> 12/04/16 17:03:14 INFO util.HBaseFsck: getHTableDescriptors == tableNames => []
> Exception in thread "main" java.lang.NullPointerException
>   at org.apache.hadoop.hbase.util.HBaseFsck.reportTablesInFlux(HBaseFsck.java:553)
>   at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:344)
>   at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:380)
>   at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3033)
> {code}
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412503#comment-13412503 ] Luke Lu commented on HBASE-4050: ResourceFinder seems to have a few nice properties but do we actually *require* them?, it's > 1KLOC, 1/3 of the whole patch. According to a comment in http://goo.gl/LmsXp it doesn't support comments etc in service definitions and that it doesn't offer perceivable performance improvement over ServiceLoader. Also, why did you pick this particular implementation of ResourceFinder from xbeans-3.7 (the current release is 3.11.1)? > Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5516) GZip leading to memory leak in 0.90. Fix similar to HBASE-5387 needed for 0.90.
[ https://issues.apache.org/jira/browse/HBASE-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412500#comment-13412500 ]

ramkrishna.s.vasudevan commented on HBASE-5516:
---

@Jon
Currently I am not working on 0.90, so I may not find time for that. But could you take a look at the patch? Actually, in our 0.90 cluster, while using GZip compression we frequently found memory leaks, and they occurred due to GZip streams.
Thanks Jon.

> GZip leading to memory leak in 0.90. Fix similar to HBASE-5387 needed for 0.90.
> ---
>
> Key: HBASE-5516
> URL: https://issues.apache.org/jira/browse/HBASE-5516
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.5
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.7
> Attachments: HBASE-5516_2_0.90.patch, HBASE-5516_3_0.90.patch
>
> Usage of GZip is leading to resident memory leak in 0.90.
> We need to have something similar to HBASE-5387 in 0.90.
[jira] [Commented] (HBASE-6378) the javadoc of setEnabledTable maybe not describe accurately
[ https://issues.apache.org/jira/browse/HBASE-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412499#comment-13412499 ]

zhou wenjian commented on HBASE-6378:
-------------------------------------

In 0.90 and 0.92 this function deleted the node in ZooKeeper. Has that changed since 0.94?
The javadoc puzzled me, because I found that the node in ZooKeeper still exists after the table has been created.

> the javadoc of setEnabledTable maybe not describe accurately
> ------------------------------------------------------------
>
> Key: HBASE-6378
> URL: https://issues.apache.org/jira/browse/HBASE-6378
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.94.0
> Reporter: zhou wenjian
> Fix For: 0.94.1
>
>
> /**
>  * Sets the ENABLED state in the cache and deletes the zookeeper node. Fails
>  * silently if the node is not in enabled in zookeeper
>  *
>  * @param tableName
>  * @throws KeeperException
>  */
> public void setEnabledTable(final String tableName) throws KeeperException {
>   setTableState(tableName, TableState.ENABLED);
> }
> When setEnabledTable is called, it updates the cache and the zookeeper node rather than deleting the zk node.
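To make the reported mismatch concrete, here is a minimal sketch. It is not HBase code: `TableStateSketch`, its two maps, and `setEnabledByDeletingNode` are hypothetical stand-ins for the in-memory cache and the per-table znodes. It contrasts what the existing javadoc describes (delete the node) with what the quoted 0.94 method body is reported to do (set the state), and shows one possible javadoc wording for the latter.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the table-state tracker; not the real HBase class. */
class TableStateSketch {
  enum TableState { ENABLED, DISABLED, ENABLING, DISABLING }

  // Stand-ins for the in-memory cache and the per-table ZooKeeper znodes.
  final Map<String, TableState> cache = new HashMap<>();
  final Map<String, TableState> znodes = new HashMap<>();

  /** What the existing javadoc describes: cache becomes ENABLED, znode is deleted. */
  void setEnabledByDeletingNode(String tableName) {
    cache.put(tableName, TableState.ENABLED);
    znodes.remove(tableName);
  }

  /**
   * Sets the ENABLED state in the cache and in the table's znode.
   * The znode is updated, not deleted. (Suggested wording only.)
   */
  void setEnabledTable(String tableName) {
    setTableState(tableName, TableState.ENABLED);
  }

  private void setTableState(String tableName, TableState state) {
    znodes.put(tableName, state);
    cache.put(tableName, state);
  }
}
```

With the second variant the znode survives table creation, which matches the observation in the comment above.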
[jira] [Commented] (HBASE-6375) Master may be using a stale list of region servers for creating assignment plan during startup
[ https://issues.apache.org/jira/browse/HBASE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412496#comment-13412496 ]

Zhihong Ted Yu commented on HBASE-6375:
---------------------------------------

I ran the test above and it passed:
{code}
Running org.apache.hadoop.hbase.master.TestHMasterRPCException
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.295 sec

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

[INFO]
[INFO] --- maven-surefire-plugin:2.10:test (secondPartTestsExecution) @ hbase-server ---
[INFO] Tests are skipped.
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 20.332s
{code}
Will wait for one day for further comments.

> Master may be using a stale list of region servers for creating assignment plan during startup
> ----------------------------------------------------------------------------------------------
>
> Key: HBASE-6375
> URL: https://issues.apache.org/jira/browse/HBASE-6375
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
> Environment: All
> Reporter: Aditya Kishore
> Assignee: Aditya Kishore
> Fix For: 0.96.0
>
> Attachments: HBASE-6375_trunk.patch
>
>
> While investigating an Out of Memory issue, I had an interesting observation where the master tries to assign all regions to a single region server even though 7 others had already registered with it.
> As the cluster had MSLAB enabled, this resulted in OOM on the RS when it tried to open all of them.
> *From master's log (edited for brevity):* > {quote} > 55,468 Waiting on regionserver(s) to checkin > 56,968 Waiting on regionserver(s) to checkin > 58,468 Waiting on regionserver(s) to checkin > 59,968 Waiting on regionserver(s) to checkin > 01,242 Registering server=srv109.datacenter,60020,1338673920529,regionCount=0,userLoad=false > 01,469 Waiting on regionserver(s) count to settle; currently=1 > 02,969 Finished waiting for regionserver count to settle; count=1,sleptFor=46500 > 02,969 Exiting wait on regionserver(s) to checkin; count=1, stopped=false,count of regions out on cluster=0 > 03,010 Processing region \-ROOT\-,,0.70236052 in state M_ZK_REGION_OFFLINE > 03,220 \-ROOT\- assigned=0, rit=true, location=srv109.datacenter:60020 > 03,221 Processing region .META.,,1.1028785192 in state M_ZK_REGION_OFFLINE > 03,336 Detected completed assignment of META, notifying catalog tracker > 03,350 .META. assigned=0, rit=true, location=srv109.datacenter:60020 > 03,350 Master startup proceeding: cluster startup > 04,006 Registering server=srv111.datacenter,60020,1338673923399,regionCount=0,userLoad=false > 04,012 Registering server=srv113.datacenter,60020,1338673923532,regionCount=0,userLoad=false > 04,269 Registering server=srv115.datacenter,60020,1338673923471,regionCount=0,userLoad=false > 04,363 Registering server=srv117.datacenter,60020,1338673923928,regionCount=0,userLoad=false > 04,599 Registering server=srv127.datacenter,60020,1338673924067,regionCount=0,userLoad=false > 04,606 Registering server=srv119.datacenter,60020,1338673923953,regionCount=0,userLoad=false > 04,804 Registering server=srv129.datacenter,60020,1338673924339,regionCount=0,userLoad=false > 05,126 Bulk assigning 1252 region(s) across 1 server(s), retainAssignment=true > 05,546 hd109.datacenter,60020,1338673920529 unassigned znodes=207 of > {quote} > *A peek at AssignmentManager code offer some explanation:* > {code} > public void assignAllUserRegions() throws IOException, InterruptedException > { > 
// Get all available servers > List servers = serverManager.getOnlineServersList(); > // Scan META for all user regions, skipping any disabled tables > Map allRegions = > MetaReader.fullScan(catalogTracker, this.zkTable.getDisabledTables(), > true); > if (allRegions == null || allRegions.isEmpty()) return; > // Determine what type of assignment to do on startup > boolean retainAssignment = master.getConfiguration(). > getBoolean("hbase.master.startup.retainassign", true); > Map> bulkPlan = null; > if (retainAssignment) { > // Reuse existing assignment info > bulkPlan = LoadBalancer.retainAssignment(allRegions, servers); > } else { > // assign regions in round-robin fashion > bulkPlan = LoadBalancer.roundRobinAssignment(new > ArrayList(allRegions.keySet()), servers); > } > LOG.info("Bulk assigning " + allRegions.size() + " region(s) across " + > servers.size() + " server(s), retainAssignment=" + retainAssignment); > ... > {code} > In the function assignAllUserRegions(), listed above, AM fetches the server > list from ServerManager long before
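The quoted description is cut off at this point, so the following is only one plausible reading of it, sketched with hypothetical names (`StaleServerListSketch`, `Registry`, and the two `plan...` methods are not HBase classes, and the attached patch is not shown here). It illustrates why a server list snapshotted before a slow step can miss servers that register while that step runs, and how fetching the list just before building the plan avoids that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class StaleServerListSketch {
  /** Hypothetical stand-in for a server manager: servers register over time. */
  static class Registry {
    private final List<String> online = new CopyOnWriteArrayList<>();

    void register(String server) {
      online.add(server);
    }

    /** Returns a snapshot copy, so later registrations are not reflected in it. */
    List<String> getOnlineServersList() {
      return new ArrayList<>(online);
    }
  }

  /** Early snapshot: servers that register during the slow step are missed. */
  static int planWithEarlySnapshot(Registry registry, Runnable slowStep) {
    List<String> servers = registry.getOnlineServersList();
    slowStep.run(); // e.g. a long catalog scan
    return servers.size(); // number of servers the plan would be spread across
  }

  /** Late snapshot: the list is fetched right before the plan is built. */
  static int planWithLateSnapshot(Registry registry, Runnable slowStep) {
    slowStep.run();
    List<String> servers = registry.getOnlineServersList();
    return servers.size();
  }
}
```

In the log above, one server was registered when the wait ended and seven more registered within the next two seconds, which is the situation the early-snapshot variant models.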
[jira] [Commented] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction
[ https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412492#comment-13412492 ] Lars Hofhansl commented on HBASE-6377: -- I'd be supportive of reverting HBASE-5533 until we work out a clear strategy of what we're measuring and when. > HBASE-5533 metrics miss all operations submitted via MultiAction > > > Key: HBASE-6377 > URL: https://issues.apache.org/jira/browse/HBASE-6377 > Project: HBase > Issue Type: Bug > Components: metrics, regionserver >Affects Versions: 0.96.0, 0.94.1 >Reporter: Andrew Purtell > > A client application (LoadTestTool) calls put() on HTables. Internally to the > HBase client those puts are batched into MultiActions. The total number of > put operations shown in the RegionServer's put metrics histogram never > increases from 0 even though millions of such operations are made. Needless > to say the latency for those operations are not measured either. The value of > HBASE-5533 metrics are suspect given the client will batch put and delete ops > like this. > I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction > processing in HRegionServer would distingush between puts and deletes and > dispatch them separately. It was easy to account for the time for them. Now > both puts and deletes are submitted in batch together as mutations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction
[ https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412491#comment-13412491 ] Lars Hofhansl commented on HBASE-6377: -- I "knew" there would be issues somewhere with HBASE-6284 :( What do you say Andrew, this seems bad enough to delay 0.94.1? Does a "Put" metric still make sense? Should it be a "Mutation" metric which includes Deletes? > HBASE-5533 metrics miss all operations submitted via MultiAction > > > Key: HBASE-6377 > URL: https://issues.apache.org/jira/browse/HBASE-6377 > Project: HBase > Issue Type: Bug > Components: metrics, regionserver >Affects Versions: 0.96.0, 0.94.1 >Reporter: Andrew Purtell > > A client application (LoadTestTool) calls put() on HTables. Internally to the > HBase client those puts are batched into MultiActions. The total number of > put operations shown in the RegionServer's put metrics histogram never > increases from 0 even though millions of such operations are made. Needless > to say the latency for those operations are not measured either. The value of > HBASE-5533 metrics are suspect given the client will batch put and delete ops > like this. > I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction > processing in HRegionServer would distingush between puts and deletes and > dispatch them separately. It was easy to account for the time for them. Now > both puts and deletes are submitted in batch together as mutations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5711) Tests are failing with incorrect data directory permissions.
[ https://issues.apache.org/jira/browse/HBASE-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412487#comment-13412487 ]

Dave Revell commented on HBASE-5711:
------------------------------------

Here's a workaround for people running into permission problems while embedding a minicluster.

{noformat}
hbaseTestUtil = new HBaseTestingUtility();

// Workaround for HBASE-5711, we need to set config value dfs.datanode.data.dir.perm
// equal to the permissions of the temp dirs on the filesystem. These temp dirs were
// probably created using this process' umask. So we guess the temp dir permissions as
// 0777 & ~umask, and use that to set the config value.
try {
  Process process = Runtime.getRuntime().exec("/bin/sh -c umask");
  BufferedReader br = new BufferedReader(new InputStreamReader(process.getInputStream()));
  int rc = process.waitFor();
  if(rc == 0) {
    String umask = br.readLine();

    int umaskBits = Integer.parseInt(umask, 8);
    int permBits = 0777 & ~umaskBits;
    String perms = Integer.toString(permBits, 8);

    log.info("Setting dfs.datanode.data.dir.perm to " + perms);
    hbaseTestUtil.getConfiguration().set("dfs.datanode.data.dir.perm", perms);
  } else {
    log.warn("Failed running umask command in a shell, nonzero return value");
  }
} catch (Exception e) {
  // ignore errors, we might not be running on POSIX, or "sh" might not be on the path
  log.warn("Couldn't get umask", e);
}

hbaseTestUtil.startMiniCluster();
{noformat}

> Tests are failing with incorrect data directory permissions.
> ------------------------------------------------------------
>
> Key: HBASE-5711
> URL: https://issues.apache.org/jira/browse/HBASE-5711
> Project: HBase
> Issue Type: Bug
> Components: test
> Reporter: Uma Maheswara Rao G
> Assignee: Uma Maheswara Rao G
> Fix For: 0.92.3
>
> Attachments: HBASE-5711.patch
>
>
> When we run some tests in Hbase (TestAdmin), it is failing with following error.
> {quote} > Starting DataNode 0 with dfs.data.dir: > E:\Repositories\Hbase\target\test-data\5ff23198-892e-4f1c-8022-b3d9969fcf0b\dfscluster_0ecc6984-1925-4870-ac7c-439fceede4cb\dfs\data\data1,E:\Repositories\Hbase\target\test-data\5ff23198-892e-4f1c-8022-b3d9969fcf0b\dfscluster_0ecc6984-1925-4870-ac7c-439fceede4cb\dfs\data\data2 > 2012-04-04 18:04:51,036 WARN [main] impl.MetricsSystemImpl(137): Metrics > system not started: Cannot locate configuration: tried > hadoop-metrics2-datanode.properties, hadoop-metrics2.properties > 2012-04-04 18:04:51,255 WARN [main] datanode.DataNode(1548): Invalid > directory in dfs.data.dir: Incorrect permission for > E:/Repositories/Hbase/target/test-data/5ff23198-892e-4f1c-8022-b3d9969fcf0b/dfscluster_0ecc6984-1925-4870-ac7c-439fceede4cb/dfs/data/data1, > expected: rwxr-xr-x, while actual: rwx-- > 2012-04-04 18:04:51,411 WARN [main] datanode.DataNode(1548): Invalid > directory in dfs.data.dir: Incorrect permission for > E:/Repositories/Hbase/target/test-data/5ff23198-892e-4f1c-8022-b3d9969fcf0b/dfscluster_0ecc6984-1925-4870-ac7c-439fceede4cb/dfs/data/data2, > expected: rwxr-xr-x, while actual: rwx-- > 2012-04-04 18:04:51,411 ERROR [main] datanode.DataNode(1554): All directories > in dfs.data.dir are invalid. > 2012-04-04 18:04:51,411 INFO [main] hbase.HBaseTestingUtility(684): Shutting > down minicluster > 2012-04-04 18:04:51,646 WARN [main] hbase.HBaseTestingUtility(696): Failed > delete of > E:\Repositories\Hbase\target\test-data\5ff23198-892e-4f1c-8022-b3d9969fcf0b\dfscluster_0ecc6984-1925-4870-ac7c-439fceede4cb > 2012-04-04 18:04:51,646 INFO [main] hbase.HBaseTestingUtility(700): > Minicluster is down > {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
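The arithmetic at the heart of the workaround above (`0777 & ~umask`) can be checked in isolation. The helper below is a small sketch with a hypothetical name (`UmaskPermsSketch.permsForUmask`). It only reproduces the bit manipulation from the comment and does not start a shell or touch any Hadoop configuration.

```java
class UmaskPermsSketch {
  /**
   * Given a umask as an octal string (as printed by the shell's umask builtin),
   * returns the octal permission string for 0777 & ~umask.
   */
  static String permsForUmask(String umaskOctal) {
    int umaskBits = Integer.parseInt(umaskOctal.trim(), 8);
    int permBits = 0777 & ~umaskBits; // clear every bit the umask masks out
    return Integer.toString(permBits, 8);
  }

  public static void main(String[] args) {
    System.out.println(permsForUmask("0022")); // prints 755 (rwxr-xr-x)
    System.out.println(permsForUmask("0077")); // prints 700 (rwx------)
  }
}
```

A umask of 0022 gives 755, the value the DataNode expects in the log above. A more restrictive umask such as 0077 gives 700, which is why the workaround passes the computed value to `dfs.datanode.data.dir.perm` rather than changing the directories.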
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412484#comment-13412484 ]

Zhihong Ted Yu commented on HBASE-4050:
---------------------------------------

Amazing work!

ReplicationMetricsSource javadoc is to be filled.

And some catch clauses have boilerplate code:
{code}
+    } catch (IOException e) {
+      e.printStackTrace();  //To change body of catch statement use File | Settings | File Templates.
{code}
I thought author name shouldn't appear in the file header:
{code}
+ * author David Blevins
+ * version $Rev$ $Date$
{code}
Consider using uppercase M in the string below:
{code}
+  private static final String METRICS_CONTEXT = "replicationmetrics";
{code}
Do we need to check that delta is non-negative?
{code}
+    gaugeInt.decr(delta);
{code}
Maybe give the assembly file a more descriptive name?
{code}
+src/assembly/two.xml
{code}
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceMetrics.java is removed.
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/metrics2/ReplicationSourceMetrics.java is added.
What about metrics1?

> Update HBase metrics framework to metrics2 framework
> ----------------------------------------------------
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
> Issue Type: New Feature
> Components: metrics
> Affects Versions: 0.90.4
> Environment: Java 6
> Reporter: Eric Yang
> Assignee: Alex Baranau
> Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, and it might get removed in future Hadoop release. Hence, HBase needs to revise the dependency of MetricsContext to use Metrics2 framework.
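On the question of whether `delta` should be checked before `gaugeInt.decr(delta)`: the sketch below shows one way such a guard could look. `IntGauge` and `decrChecked` are hypothetical and stand in for whatever gauge type the patch uses; this is not the metrics2 API. The reason for a guard is that a negative delta passed to a decrement silently becomes an increment.

```java
import java.util.concurrent.atomic.AtomicInteger;

class GaugeGuardSketch {
  /** Hypothetical minimal gauge; not the Hadoop metrics2 class. */
  static class IntGauge {
    private final AtomicInteger value = new AtomicInteger();

    void incr(int delta) {
      value.addAndGet(delta);
    }

    void decr(int delta) {
      value.addAndGet(-delta);
    }

    int value() {
      return value.get();
    }
  }

  /** Decrements only for a non-negative delta; rejects negative values loudly. */
  static void decrChecked(IntGauge gauge, int delta) {
    if (delta < 0) {
      throw new IllegalArgumentException("delta must be non-negative: " + delta);
    }
    gauge.decr(delta);
  }
}
```

Whether to throw, log, or clamp to zero is a design choice. Throwing is used here only because it makes the misuse visible in tests.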
[jira] [Updated] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-4050: - Attachment: HBASE-4050-1.patch I missed one part in ReplicationSourceMetrics. This should fix the tests. > Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6378) the javadoc of setEnabledTable maybe not describe accurately
zhou wenjian created HBASE-6378:
-----------------------------------

Summary: the javadoc of setEnabledTable maybe not describe accurately
Key: HBASE-6378
URL: https://issues.apache.org/jira/browse/HBASE-6378
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.94.0
Reporter: zhou wenjian
Fix For: 0.94.1

/**
 * Sets the ENABLED state in the cache and deletes the zookeeper node. Fails
 * silently if the node is not in enabled in zookeeper
 *
 * @param tableName
 * @throws KeeperException
 */
public void setEnabledTable(final String tableName) throws KeeperException {
  setTableState(tableName, TableState.ENABLED);
}

When setEnabledTable is called, it updates the cache and the zookeeper node rather than deleting the zk node.
[jira] [Commented] (HBASE-5997) Fix concerns raised in HBASE-5922 related to HalfStoreFileReader
[ https://issues.apache.org/jira/browse/HBASE-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412480#comment-13412480 ]

Anoop Sam John commented on HBASE-5997:
---------------------------------------

bq. Is it possible if the file is empty say that we'll seek on every invocation of getFirstKey?
Yes, because the check is against firstKey being non-null. Maybe we can use a check based on a boolean variable instead. Since this case is rare and the half file normally does not exist for long, I thought the current check might be okay. What do you say, Stack? If you feel it is needed, I can change this.

bq. This patch does not do you your compare of row only rather than compare of full key. Is it supposed to?
Yes. You can see that the comparison is now against the first key in the file rather than the split key:
{code}
-      if (getComparator().compare(key, offset, length, splitkey, 0,
-          splitkey.length) < 0) {
+      byte[] fk = getFirstKey();
+      // This will be null when the file is empty in which we can not seekBefore to any key
+      if (fk == null) return false;
+      if (getComparator().compare(key, offset, length, fk, 0,
+          fk.length) <= 0) {
{code}
So it is okay to use the full-key comparison (not the row key alone).

> Fix concerns raised in HBASE-5922 related to HalfStoreFileReader
> ----------------------------------------------------------------
>
> Key: HBASE-5997
> URL: https://issues.apache.org/jira/browse/HBASE-5997
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: Anoop Sam John
> Fix For: 0.94.2
>
> Attachments: HBASE-5997_0.94.patch, HBASE-5997_94 V2.patch, Testcase.patch.txt
>
>
> Pls refer to the comment https://issues.apache.org/jira/browse/HBASE-5922?focusedCommentId=13269346&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13269346.
> Raised this issue to solve that comment, just in case we forget it.
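The two points in this comment (a boolean flag so that an empty file is probed only once, and comparing against the first key of the file with `<=`) can be sketched as follows. This is an illustration under assumptions, not the patch: `FirstKeyCheckSketch`, `KeySource`, `firstKeySeeked`, and `canSeekBefore` are hypothetical names, and a plain unsigned byte-wise comparison stands in for the reader's comparator.

```java
class FirstKeyCheckSketch {
  /** Hypothetical source of the first key; returns null when the file is empty. */
  interface KeySource {
    byte[] readFirstKey();
  }

  private final KeySource source;
  private byte[] firstKey;
  private boolean firstKeySeeked = false; // flag, so a null first key is also cached
  int seeks = 0; // counts how often the underlying source was consulted

  FirstKeyCheckSketch(KeySource source) {
    this.source = source;
  }

  byte[] getFirstKey() {
    if (!firstKeySeeked) {
      firstKey = source.readFirstKey();
      firstKeySeeked = true;
      seeks++;
    }
    return firstKey;
  }

  /** A seekBefore can only succeed if the key sorts strictly after the first key. */
  boolean canSeekBefore(byte[] key) {
    byte[] fk = getFirstKey();
    if (fk == null) return false; // empty file: nothing to seek before
    return compare(key, fk) > 0;  // key <= first key means no earlier entry exists
  }

  /** Unsigned lexicographic comparison of two byte arrays. */
  static int compare(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int x = a[i] & 0xff;
      int y = b[i] & 0xff;
      if (x != y) return x - y;
    }
    return a.length - b.length;
  }
}
```

A null check alone cannot tell "not yet read" apart from "read, and the file is empty", which is why an empty file would otherwise be probed on every call.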
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412479#comment-13412479 ] Hadoop QA commented on HBASE-4050: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12536162/HBASE-4050-0.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 7 javac compiler warnings (more than the trunk's current 4 warnings). -1 findbugs. The patch appears to introduce 11 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.replication.TestReplication org.apache.hadoop.hbase.replication.TestMultiSlaveReplication org.apache.hadoop.hbase.replication.TestMasterReplication Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2365//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2365//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2365//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2365//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2365//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2365//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console 
output: https://builds.apache.org/job/PreCommit-HBASE-Build/2365//console This message is automatically generated. > Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050-0.patch, HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()
[ https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412476#comment-13412476 ] Anoop Sam John commented on HBASE-6284: --- I mean public methods in HRegion can be called from co processors at RS side. > Introduce HRegion#doMiniBatchMutation() > --- > > Key: HBASE-6284 > URL: https://issues.apache.org/jira/browse/HBASE-6284 > Project: HBase > Issue Type: Bug > Components: performance, regionserver >Reporter: Zhihong Ted Yu >Assignee: Anoop Sam John > Fix For: 0.96.0, 0.94.1 > > Attachments: 6284_Trunk-Addendum.patch, 6284_Trunk-V3.patch, > HBASE-6284_94.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, > HBASE-6284_Trunk.patch > > > From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion': > The HTable#delete(List) groups the Deletes for the same RS and make > one n/w call only. But within the RS, there will be N number of delete calls > on the region one by one. This will include N number of HLog write and sync. > If this also can be grouped can we get better performance for the multi row > delete. > I have made the new miniBatchDelete () and made the > HTable#delete(List) to call this new batch delete. > Just tested initially with the one node cluster. In that itself I am getting > a performance boost which is very much promising. > Only one CF and qualifier. > 10K total rows delete with a batch of 100 deletes. Only deletes happening on > the table from one thread. > With the new way the net time taken is reduced by more than 1/10 > Will test in a 4 node cluster also. I think it will worth doing this change. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6375) Master may be using a stale list of region servers for creating assignment plan during startup
[ https://issues.apache.org/jira/browse/HBASE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412469#comment-13412469 ] Hadoop QA commented on HBASE-6375: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12536140/HBASE-6375_trunk.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The applied patch generated 5 javac compiler warnings (more than the trunk's current 4 warnings). -1 findbugs. The patch appears to introduce 8 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.master.TestHMasterRPCException Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2364//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2364//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2364//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2364//console This message is automatically generated. 
[jira] [Commented] (HBASE-6272) In-memory region state is inconsistent
[ https://issues.apache.org/jira/browse/HBASE-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412464#comment-13412464 ] Jimmy Xiang commented on HBASE-6272: Patch version 2 was uploaded to RB: https://reviews.apache.org/r/5717/. > In-memory region state is inconsistent > -- > > Key: HBASE-6272 > URL: https://issues.apache.org/jira/browse/HBASE-6272 > Project: HBase > Issue Type: Bug >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang > > AssignmentManger stores region state related information in several places: > regionsInTransition, regions (region info to server name map), and servers > (server name to region info set map). However the access to these places is > not coordinated properly. It leads to inconsistent in-memory region state > information. Sometimes, some region could even be offline, and not in > transition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction
[ https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412461#comment-13412461 ] Andrew Purtell commented on HBASE-6377: --- Other means for applying puts and deletes (RowMutations etc.) are not covered either, neither are Increments, or Appends. Perhaps this is an argument for reverting HBASE-5533. > HBASE-5533 metrics miss all operations submitted via MultiAction > > > Key: HBASE-6377 > URL: https://issues.apache.org/jira/browse/HBASE-6377 > Project: HBase > Issue Type: Bug > Components: metrics, regionserver >Affects Versions: 0.96.0, 0.94.1 >Reporter: Andrew Purtell > > A client application (LoadTestTool) calls put() on HTables. Internally to the > HBase client those puts are batched into MultiActions. The total number of > put operations shown in the RegionServer's put metrics histogram never > increases from 0 even though millions of such operations are made. Needless > to say the latency for those operations are not measured either. The value of > HBASE-5533 metrics are suspect given the client will batch put and delete ops > like this. > I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction > processing in HRegionServer would distingush between puts and deletes and > dispatch them separately. It was easy to account for the time for them. Now > both puts and deletes are submitted in batch together as mutations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction
[ https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6377: -- Component/s: regionserver metrics Affects Version/s: 0.94.1 0.96.0 > HBASE-5533 metrics miss all operations submitted via MultiAction > > > Key: HBASE-6377 > URL: https://issues.apache.org/jira/browse/HBASE-6377 > Project: HBase > Issue Type: Bug > Components: metrics, regionserver >Affects Versions: 0.96.0, 0.94.1 >Reporter: Andrew Purtell > > A client application (LoadTestTool) calls put() on HTables. Internally to the > HBase client those puts are batched into MultiActions. The total number of > put operations shown in the RegionServer's put metrics histogram never > increases from 0 even though millions of such operations are made. Needless > to say the latency of those operations is not measured either. The value of > HBASE-5533 metrics is suspect given the client will batch put and delete ops > like this. > I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction > processing in HRegionServer would distinguish between puts and deletes and > dispatch them separately. It was easy to account for the time for them. Now > both puts and deletes are submitted in batch together as mutations.
[jira] [Created] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction
Andrew Purtell created HBASE-6377: - Summary: HBASE-5533 metrics miss all operations submitted via MultiAction Key: HBASE-6377 URL: https://issues.apache.org/jira/browse/HBASE-6377 Project: HBase Issue Type: Bug Reporter: Andrew Purtell A client application (LoadTestTool) calls put() on HTables. Internally to the HBase client those puts are batched into MultiActions. The total number of put operations shown in the RegionServer's put metrics histogram never increases from 0 even though millions of such operations are made. Needless to say the latency of those operations is not measured either. The value of HBASE-5533 metrics is suspect given the client will batch put and delete ops like this. I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction processing in HRegionServer would distinguish between puts and deletes and dispatch them separately. It was easy to account for the time for them. Now both puts and deletes are submitted in batch together as mutations.
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412457#comment-13412457 ] Alex Baranau commented on HBASE-4050: - "was about to commit" read as "was about to provide patch" ;) > Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050-0.patch, HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412456#comment-13412456 ] Alex Baranau commented on HBASE-4050: - Heh, was about to commit example with ServiceLoader (and same extra modules based on your schema above), but looks like it makes sense to use ResourceFinder. Will use your patch and add metrics sources for RegionServer and Master to it (almost empty, as per discussion above) tomorrow and we can think about closing the issue (after review, etc.). Thank you, Elliott! > Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050-0.patch, HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-4050: - Attachment: HBASE-4050-0.patch Here's a patch that adds hadoop compatibility shims. I needed something to prototype and test with so I used my implementation of HBASE-6323 as an example. hbase-hadoop-compat contains the factory and the interface. The factory uses ResourceFinder from the geronimo project. It's much more flexible than ServiceLoader (allows different locations easily and most importantly it allows constructor arguments). I didn't want to add the whole geronimo project as a dep so the code is copied in. I tried to give as much credit as I could. I can go back to using ServiceLoader if people object to having hbase-hadoop1-compat and hbase-hadoop2-compat add the actual implementation of the class whose interface is defined in hbase-hadoop-compat. I don't have an hbase-hadoop23-compat. Right now, depending upon which profile is building, the hbase-server module gets one of the above as a dependency. In addition, when building, assembly files only contain the hbase-hadoop{1,2}-compat directory needed. It's possible to keep the old assembly file the way it was and change the shell scripts to only load the one. But I didn't get to that. I tested it in place and locally after building tar.gz's on both * hadoop 1.0.3 * hadoop 2.0.0-alpha In place scripts still work though I'm not really sure of why or how. I need to investigate that later.
> Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050-0.patch, HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-2315) BookKeeper for write-ahead logging
[ https://issues.apache.org/jira/browse/HBASE-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412441#comment-13412441 ] Zhihong Ted Yu commented on HBASE-2315: --- The ctor of HLog takes a FileSystem parameter. Since the FileSystem isn't important to bookkeeper, my feeling is that the approach in previous patch makes sense. You can remodel that patch by introducing a new hbase module. Thanks > BookKeeper for write-ahead logging > -- > > Key: HBASE-2315 > URL: https://issues.apache.org/jira/browse/HBASE-2315 > Project: HBase > Issue Type: New Feature > Components: regionserver >Reporter: Flavio Junqueira > Attachments: HBASE-2315.patch, bookkeeperOverview.pdf, > zookeeper-dev-bookkeeper.jar > > > BookKeeper, a contrib of the ZooKeeper project, is a fault tolerant and high > throughput write-ahead logging service. This issue provides an implementation > of write-ahead logging for hbase using BookKeeper. Apart from expected > throughput improvements, BookKeeper also has stronger durability guarantees > compared to the implementation currently used by hbase. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6317) Master clean start up and Partially enabled tables make region assignment inconsistent.
[ https://issues.apache.org/jira/browse/HBASE-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412437#comment-13412437 ] rajeshbabu commented on HBASE-6317: --- @Lars I will upload a patch addressing some of his comments and write a test case for that. Will upload by afternoon. > Master clean start up and Partially enabled tables make region assignment > inconsistent. > --- > > Key: HBASE-6317 > URL: https://issues.apache.org/jira/browse/HBASE-6317 > Project: HBase > Issue Type: Bug >Reporter: ramkrishna.s.vasudevan >Assignee: rajeshbabu > Fix For: 0.92.2, 0.96.0, 0.94.1 > > Attachments: HBASE-6317_94.patch > > > If we have a table in partially enabled state (ENABLING) then on HMaster > restart we treat it as a clean cluster start up and do a bulk assign. > Currently in 0.94 bulk assign will not handle ALREADY_OPENED scenarios and it > leads to region assignment problems. Analysing more on this we found that we > have a better way to handle these scenarios. > {code} > if (false == checkIfRegionBelongsToDisabled(regionInfo) > && false == checkIfRegionsBelongsToEnabling(regionInfo)) { > synchronized (this.regions) { > regions.put(regionInfo, regionLocation); > addToServers(regionLocation, regionInfo); > } > {code} > We don't add to the regions map so that the enable table handler can handle it. But > as nothing is added to the regions map we treat it as a clean cluster start up. > Will come up with a patch tomorrow.
[jira] [Created] (HBASE-6376) bin/hbase command doesn't seem to be working
Devaraj Das created HBASE-6376: -- Summary: bin/hbase command doesn't seem to be working Key: HBASE-6376 URL: https://issues.apache.org/jira/browse/HBASE-6376 Project: HBase Issue Type: Bug Affects Versions: 0.96.0 Reporter: Devaraj Das Priority: Blocker Fix For: 0.96.0 I noticed that commands like "bin/hbase shell" don't work. The exception trace is: {noformat} bin/hbase shell Exception in thread "main" java.lang.NoClassDefFoundError: org/jruby/Main Caused by: java.lang.ClassNotFoundException: org.jruby.Main at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) {noformat} This is a trunk build (mvn package -DskipTests=true) and then I am trying to run the bin/hbase command from the root directory. (Am I missing something?)
[jira] [Updated] (HBASE-6319) ReplicationSource can call terminate on itself and deadlock
[ https://issues.apache.org/jira/browse/HBASE-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6319: -- Fix Version/s: (was: 0.90.7) 0.90.8 Bumping to 0.90.8 -- I'm personally not concerned about replication in 0.90.7 > ReplicationSource can call terminate on itself and deadlock > --- > > Key: HBASE-6319 > URL: https://issues.apache.org/jira/browse/HBASE-6319 > Project: HBase > Issue Type: Bug >Affects Versions: 0.90.6, 0.92.1, 0.94.0 >Reporter: Jean-Daniel Cryans >Assignee: Jean-Daniel Cryans > Fix For: 0.92.2, 0.94.2, 0.90.8 > > Attachments: HBASE-6319-0.92.patch > > > In a few places the ReplicationSource code calls terminate on itself, which > is a problem since in terminate() we wait on that thread to die.
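The self-termination hazard described in HBASE-6319 can be sketched in isolation. The class below is a hypothetical stand-in (SelfAwareWorker is not ReplicationSource, and this is not necessarily how the attached patch fixes it): terminate() skips the join when it is invoked from the worker's own thread, because a thread that joins itself waits for its own death.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical worker thread; a stand-in for the pattern, not ReplicationSource itself.
class SelfAwareWorker extends Thread {
    private final AtomicBoolean running = new AtomicBoolean(true);
    private final boolean failInternally;

    SelfAwareWorker(boolean failInternally) {
        this.failInternally = failInternally;
    }

    @Override
    public void run() {
        while (running.get()) {
            if (failInternally) {
                // The worker decides to stop itself from inside its own run loop.
                terminate();
                return;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                return; // interrupted by terminate(): leave the loop
            }
        }
    }

    void terminate() {
        running.set(false);
        if (Thread.currentThread() == this) {
            // Joining our own thread would wait for ourselves to die, so skip it.
            return;
        }
        interrupt();
        try {
            join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns true if the worker thread has died within the given timeout.
    static boolean stopsWithin(SelfAwareWorker worker, long millis) {
        try {
            worker.join(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }
}
```

The same terminate() method then serves both callers: an external thread still waits for the worker to exit, while the worker itself only flips the flag and returns from run().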
[jira] [Updated] (HBASE-6239) [replication] ReplicationSink uses the ts of the first KV for the other KVs in the same row
[ https://issues.apache.org/jira/browse/HBASE-6239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6239: -- Fix Version/s: (was: 0.90.7) 0.90.8 Bumping to 0.90.8 -- I'm personally not concerned about replication in 0.90.7 > [replication] ReplicationSink uses the ts of the first KV for the other KVs > in the same row > --- > > Key: HBASE-6239 > URL: https://issues.apache.org/jira/browse/HBASE-6239 > Project: HBase > Issue Type: Bug >Affects Versions: 0.90.6, 0.92.1 >Reporter: Jean-Daniel Cryans >Assignee: Jean-Daniel Cryans >Priority: Critical > Labels: corruption > Fix For: 0.92.2, 0.90.8 > > Attachments: HBASE-6239-0.92-v1.patch > > > ReplicationSink assumes that all the KVs for the same row inside a WALEdit > will have the same timestamp, which is not necessarily the case. > This only affects 0.90 and 0.92 since HBASE-5203 fixes it in 0.94 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6325) [replication] Race in ReplicationSourceManager.init can initiate a failover even if the node is alive
[ https://issues.apache.org/jira/browse/HBASE-6325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6325: -- Fix Version/s: (was: 0.90.7) 0.90.8 Bumping to 0.90.8 -- I'm personally not concerned about replication in 0.90.7. > [replication] Race in ReplicationSourceManager.init can initiate a failover > even if the node is alive > - > > Key: HBASE-6325 > URL: https://issues.apache.org/jira/browse/HBASE-6325 > Project: HBase > Issue Type: Bug >Affects Versions: 0.90.6, 0.92.1, 0.94.0 >Reporter: Jean-Daniel Cryans >Assignee: Jean-Daniel Cryans > Fix For: 0.92.2, 0.96.0, 0.94.2, 0.90.8 > > Attachments: HBASE-6325-0.92-v2.patch, HBASE-6325-0.92.patch > > > Yet another bug found during the leap second madness, it's possible to miss > the registration of new region servers so that in > ReplicationSourceManager.init we start the failover of a live and replicating > region server. I don't think there's data loss but the RS that's being failed > over will die on: > {noformat} > 2012-07-01 06:25:15,604 FATAL > org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server > sv4r23s48,10304,1341112194623: Writing replication status > org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = > NoNode for > /hbase/replication/rs/sv4r23s48,10304,1341112194623/4/sv4r23s48%2C10304%2C1341112194623.1341112195369 > at > org.apache.zookeeper.KeeperException.create(KeeperException.java:111) > at > org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246) > at > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:372) > at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:655) > at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:697) > at > org.apache.hadoop.hbase.replication.ReplicationZookeeper.writeReplicationStatus(ReplicationZookeeper.java:470) > at > 
org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:154) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:607) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:368) > {noformat} > It seems to me that just refreshing {{otherRegionServers}} after getting the > list of {{currentReplicators}} would be enough to fix this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6347) -ROOT- and .META. are stale in table.jsp if they moved
[ https://issues.apache.org/jira/browse/HBASE-6347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6347: -- Labels: noob (was: ) > -ROOT- and .META. are stale in table.jsp if they moved > -- > > Key: HBASE-6347 > URL: https://issues.apache.org/jira/browse/HBASE-6347 > Project: HBase > Issue Type: Bug >Affects Versions: 0.90.6, 0.92.1, 0.94.0 >Reporter: Jean-Daniel Cryans > Labels: noob > Fix For: 0.92.2, 0.94.2, 0.90.8 > > > table.jsp uses a lookup method on {{CatalogTracker}} that does not > force a refresh of the cache, thus it can get a stale location if -ROOT- or > .META. moved and the master hasn't tried to access them yet. > Should just be a matter of using waitForRoot/Meta.
[jira] [Updated] (HBASE-6347) -ROOT- and .META. are stale in table.jsp if they moved
[ https://issues.apache.org/jira/browse/HBASE-6347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6347: -- Fix Version/s: (was: 0.90.7) 0.90.8 > -ROOT- and .META. are stale in table.jsp if they moved > -- > > Key: HBASE-6347 > URL: https://issues.apache.org/jira/browse/HBASE-6347 > Project: HBase > Issue Type: Bug >Affects Versions: 0.90.6, 0.92.1, 0.94.0 >Reporter: Jean-Daniel Cryans > Labels: noob > Fix For: 0.92.2, 0.94.2, 0.90.8 > > > table.jsp uses a lookup method on {{CatalogTracker}} that does not > force a refresh of the cache, thus it can get a stale location if -ROOT- or > .META. moved and the master hasn't tried to access them yet. > Should just be a matter of using waitForRoot/Meta.
[jira] [Commented] (HBASE-6347) -ROOT- and .META. are stale in table.jsp if they moved
[ https://issues.apache.org/jira/browse/HBASE-6347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412396#comment-13412396 ] Jonathan Hsieh commented on HBASE-6347: --- Seems minor for 0.90.7, bumping. > -ROOT- and .META. are stale in table.jsp if they moved > -- > > Key: HBASE-6347 > URL: https://issues.apache.org/jira/browse/HBASE-6347 > Project: HBase > Issue Type: Bug >Affects Versions: 0.90.6, 0.92.1, 0.94.0 >Reporter: Jean-Daniel Cryans > Labels: noob > Fix For: 0.92.2, 0.94.2, 0.90.8 > > > table.jsp uses a lookup method on {{CatalogTracker}} that does not > force a refresh of the cache, thus it can get a stale location if -ROOT- or > .META. moved and the master hasn't tried to access them yet. > Should just be a matter of using waitForRoot/Meta.
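The difference between a cached lookup and a refreshing lookup, which is the heart of HBASE-6347, can be shown with a small model. This is a sketch under assumptions: CachedLocation, peek and refreshAndGet are hypothetical names and do not reflect the real CatalogTracker API.

```java
import java.util.function.Supplier;

// Hypothetical cache wrapper; these names are not CatalogTracker's API.
class CachedLocation {
    private final Supplier<String> source;
    private String cached;

    CachedLocation(Supplier<String> source) {
        this.source = source;
        this.cached = source.get();
    }

    // Cheap lookup: may return a location that is no longer current.
    String peek() {
        return cached;
    }

    // Lookup that re-reads the authoritative source before answering.
    String refreshAndGet() {
        cached = source.get();
        return cached;
    }
}
```

A status page built on peek() keeps showing the old server after a move until something else triggers a refresh; one built on refreshAndGet() pays for an extra read on every call but never shows a location older than that read.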
[jira] [Resolved] (HBASE-5376) Add more logging to triage HBASE-5312: Closed parent region present in Hlog.lastSeqWritten
[ https://issues.apache.org/jira/browse/HBASE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang resolved HBASE-5376. Resolution: Later Assignee: Jimmy Xiang Close it for now. We haven't seen such a problem for quite a long time. > Add more logging to triage HBASE-5312: Closed parent region present in > Hlog.lastSeqWritten > -- > > Key: HBASE-5376 > URL: https://issues.apache.org/jira/browse/HBASE-5376 > Project: HBase > Issue Type: Sub-task >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Minor > Fix For: 0.90.7 > > Attachments: hbase-5376.txt > > > It is hard to find out what exactly caused HBASE-5312. Some logging will > help shed some light.
[jira] [Commented] (HBASE-5516) GZip leading to memory leak in 0.90. Fix similar to HBASE-5387 needed for 0.90.
[ https://issues.apache.org/jira/browse/HBASE-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412389#comment-13412389 ] Jonathan Hsieh commented on HBASE-5516: --- Hm.. no tests, going to bump to 0.90.8 unless action taken. > GZip leading to memory leak in 0.90. Fix similar to HBASE-5387 needed for > 0.90. > > > Key: HBASE-5516 > URL: https://issues.apache.org/jira/browse/HBASE-5516 > Project: HBase > Issue Type: Bug >Affects Versions: 0.90.5 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > Fix For: 0.90.7 > > Attachments: HBASE-5516_2_0.90.patch, HBASE-5516_3_0.90.patch > > > Usage of GZip is leading to resident memory leak in 0.90. > We need to have something similar to HBASE-5387 in 0.90. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6317) Master clean start up and Partially enabled tables make region assignment inconsistent.
[ https://issues.apache.org/jira/browse/HBASE-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412390#comment-13412390 ] Lars Hofhansl commented on HBASE-6317: -- @rajeshbabu: could you reply to Stack's comments? Need to push 0.94.1. > Master clean start up and Partially enabled tables make region assignment > inconsistent. > --- > > Key: HBASE-6317 > URL: https://issues.apache.org/jira/browse/HBASE-6317 > Project: HBase > Issue Type: Bug >Reporter: ramkrishna.s.vasudevan >Assignee: rajeshbabu > Fix For: 0.92.2, 0.96.0, 0.94.1 > > Attachments: HBASE-6317_94.patch > > > If we have a table in partially enabled state (ENABLING) then on HMaster > restart we treat it as a clean cluster start up and do a bulk assign. > Currently in 0.94 bulk assign will not handle ALREADY_OPENED scenarios and it > leads to region assignment problems. Analysing more on this we found that we > have a better way to handle these scenarios. > {code} > if (false == checkIfRegionBelongsToDisabled(regionInfo) > && false == checkIfRegionsBelongsToEnabling(regionInfo)) { > synchronized (this.regions) { > regions.put(regionInfo, regionLocation); > addToServers(regionLocation, regionInfo); > } > {code} > We don't add to the regions map so that the enable table handler can handle it. But > as nothing is added to the regions map we treat it as a clean cluster start up. > Will come up with a patch tomorrow.
[jira] [Updated] (HBASE-6375) Master may be using a stale list of region servers for creating assignment plan during startup
[ https://issues.apache.org/jira/browse/HBASE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Ted Yu updated HBASE-6375: -- Summary: Master may be using a stale list of region servers for creating assignment plan during startup (was: Master could possibly be using a stale list of region servers for creating assignment plan during startup) > Master may be using a stale list of region servers for creating assignment > plan during startup > -- > > Key: HBASE-6375 > URL: https://issues.apache.org/jira/browse/HBASE-6375 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0 > Environment: All >Reporter: Aditya Kishore >Assignee: Aditya Kishore > Fix For: 0.96.0 > > Attachments: HBASE-6375_trunk.patch > > > While investigating an Out of Memory issue, I had an interesting observation > where the master tries to assign all regions to a single region server even > though 7 others had already registered with it. > As the cluster had MSLAB enabled, this resulted in OOM on the RS when it > tried to open all of them. 
> *From master's log (edited for brevity):* > {quote} > 55,468 Waiting on regionserver(s) to checkin > 56,968 Waiting on regionserver(s) to checkin > 58,468 Waiting on regionserver(s) to checkin > 59,968 Waiting on regionserver(s) to checkin > 01,242 Registering server=srv109.datacenter,60020,1338673920529,regionCount=0,userLoad=false > 01,469 Waiting on regionserver(s) count to settle; currently=1 > 02,969 Finished waiting for regionserver count to settle; count=1,sleptFor=46500 > 02,969 Exiting wait on regionserver(s) to checkin; count=1, stopped=false,count of regions out on cluster=0 > 03,010 Processing region \-ROOT\-,,0.70236052 in state M_ZK_REGION_OFFLINE > 03,220 \-ROOT\- assigned=0, rit=true, location=srv109.datacenter:60020 > 03,221 Processing region .META.,,1.1028785192 in state M_ZK_REGION_OFFLINE > 03,336 Detected completed assignment of META, notifying catalog tracker > 03,350 .META. assigned=0, rit=true, location=srv109.datacenter:60020 > 03,350 Master startup proceeding: cluster startup > 04,006 Registering server=srv111.datacenter,60020,1338673923399,regionCount=0,userLoad=false > 04,012 Registering server=srv113.datacenter,60020,1338673923532,regionCount=0,userLoad=false > 04,269 Registering server=srv115.datacenter,60020,1338673923471,regionCount=0,userLoad=false > 04,363 Registering server=srv117.datacenter,60020,1338673923928,regionCount=0,userLoad=false > 04,599 Registering server=srv127.datacenter,60020,1338673924067,regionCount=0,userLoad=false > 04,606 Registering server=srv119.datacenter,60020,1338673923953,regionCount=0,userLoad=false > 04,804 Registering server=srv129.datacenter,60020,1338673924339,regionCount=0,userLoad=false > 05,126 Bulk assigning 1252 region(s) across 1 server(s), retainAssignment=true > 05,546 hd109.datacenter,60020,1338673920529 unassigned znodes=207 of > {quote} > *A peek at AssignmentManager code offer some explanation:* > {code} > public void assignAllUserRegions() throws IOException, InterruptedException > { > 
// Get all available servers > List servers = serverManager.getOnlineServersList(); > // Scan META for all user regions, skipping any disabled tables > Map allRegions = > MetaReader.fullScan(catalogTracker, this.zkTable.getDisabledTables(), > true); > if (allRegions == null || allRegions.isEmpty()) return; > // Determine what type of assignment to do on startup > boolean retainAssignment = master.getConfiguration(). > getBoolean("hbase.master.startup.retainassign", true); > Map> bulkPlan = null; > if (retainAssignment) { > // Reuse existing assignment info > bulkPlan = LoadBalancer.retainAssignment(allRegions, servers); > } else { > // assign regions in round-robin fashion > bulkPlan = LoadBalancer.roundRobinAssignment(new > ArrayList(allRegions.keySet()), servers); > } > LOG.info("Bulk assigning " + allRegions.size() + " region(s) across " + > servers.size() + " server(s), retainAssignment=" + retainAssignment); > ... > {code} > In the function assignAllUserRegions(), listed above, AM fetches the server > list from ServerManager long before it actually uses it to create the assignment > plan. > In between these, it performs a full scan of META to create an assignment map > of regions. So even if additional RSes have registered in the meantime (as > happened in this case), AM still has the old list of just one server. > This code snippet is from 0.90.6 but the same issue exists in 0.92, 0.94 and > trunk. Since MSLAB is enabled by default in 0.92 onwards, any large cluster > can hit this issue upon clust
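The ordering problem described in HBASE-6375 can be modelled without any HBase classes. The sketch below is an assumption-laden illustration (StartupPlanner and its methods are hypothetical, and reordering is only one possible remedy, not necessarily what HBASE-6375_trunk.patch does): taking the server snapshot before a slow scan misses servers that register while the scan runs, whereas taking it afterwards sees them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical model of the ordering problem; not the AssignmentManager code.
class StartupPlanner {
    private final List<String> onlineServers = new CopyOnWriteArrayList<>();

    void register(String server) {
        onlineServers.add(server);
    }

    // Snapshot taken BEFORE the slow scan: misses servers registered during it.
    int serversSeenSnapshotFirst(Runnable slowScan) {
        List<String> servers = new ArrayList<>(onlineServers);
        slowScan.run();
        return servers.size();
    }

    // Snapshot taken AFTER the slow scan: includes late registrations.
    int serversSeenScanFirst(Runnable slowScan) {
        slowScan.run();
        List<String> servers = new ArrayList<>(onlineServers);
        return servers.size();
    }
}
```

In this model the Runnable plays the role of the full META scan; passing one that registers additional servers reproduces the "1252 region(s) across 1 server(s)" shape of the log above with the first method and avoids it with the second.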
[jira] [Commented] (HBASE-6375) Master could possibly be using a stale list of region servers for creating assignment plan during startup
[ https://issues.apache.org/jira/browse/HBASE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412385#comment-13412385 ] Zhihong Ted Yu commented on HBASE-6375: --- Interesting discovery. > Master could possibly be using a stale list of region servers for creating > assignment plan during startup > - > > Key: HBASE-6375 > URL: https://issues.apache.org/jira/browse/HBASE-6375 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0 > Environment: All >Reporter: Aditya Kishore >Assignee: Aditya Kishore > Fix For: 0.96.0 > > Attachments: HBASE-6375_trunk.patch > > > While investigating an Out of Memory issue, I had an interesting observation > where the master tries to assign all regions to a single region server even > though 7 other had already registered with it. > As the cluster had MSLAB enabled, this resulted in OOM on the RS when it > tired to open all of them. > *From master's log (edited for brevity):* > {quote} > 55,468 Waiting on regionserver(s) to checkin > 56,968 Waiting on regionserver(s) to checkin > 58,468 Waiting on regionserver(s) to checkin > 59,968 Waiting on regionserver(s) to checkin > 01,242 Registering server=srv109.datacenter,60020,1338673920529,regionCount=0,userLoad=false > 01,469 Waiting on regionserver(s) count to settle; currently=1 > 02,969 Finished waiting for regionserver count to settle; count=1,sleptFor=46500 > 02,969 Exiting wait on regionserver(s) to checkin; count=1, stopped=false,count of regions out on cluster=0 > 03,010 Processing region \-ROOT\-,,0.70236052 in state M_ZK_REGION_OFFLINE > 03,220 \-ROOT\- assigned=0, rit=true, location=srv109.datacenter:60020 > 03,221 Processing region .META.,,1.1028785192 in state M_ZK_REGION_OFFLINE > 03,336 Detected completed assignment of META, notifying catalog tracker > 03,350 .META. 
assigned=0, rit=true, location=srv109.datacenter:60020 > 03,350 Master startup proceeding: cluster startup > 04,006 Registering server=srv111.datacenter,60020,1338673923399,regionCount=0,userLoad=false > 04,012 Registering server=srv113.datacenter,60020,1338673923532,regionCount=0,userLoad=false > 04,269 Registering server=srv115.datacenter,60020,1338673923471,regionCount=0,userLoad=false > 04,363 Registering server=srv117.datacenter,60020,1338673923928,regionCount=0,userLoad=false > 04,599 Registering server=srv127.datacenter,60020,1338673924067,regionCount=0,userLoad=false > 04,606 Registering server=srv119.datacenter,60020,1338673923953,regionCount=0,userLoad=false > 04,804 Registering server=srv129.datacenter,60020,1338673924339,regionCount=0,userLoad=false > 05,126 Bulk assigning 1252 region(s) across 1 server(s), retainAssignment=true > 05,546 hd109.datacenter,60020,1338673920529 unassigned znodes=207 of > {quote} > *A peek at AssignmentManager code offer some explanation:* > {code} > public void assignAllUserRegions() throws IOException, InterruptedException > { > // Get all available servers > List servers = serverManager.getOnlineServersList(); > // Scan META for all user regions, skipping any disabled tables > Map allRegions = > MetaReader.fullScan(catalogTracker, this.zkTable.getDisabledTables(), > true); > if (allRegions == null || allRegions.isEmpty()) return; > // Determine what type of assignment to do on startup > boolean retainAssignment = master.getConfiguration(). 
> getBoolean("hbase.master.startup.retainassign", true); > Map> bulkPlan = null; > if (retainAssignment) { > // Reuse existing assignment info > bulkPlan = LoadBalancer.retainAssignment(allRegions, servers); > } else { > // assign regions in round-robin fashion > bulkPlan = LoadBalancer.roundRobinAssignment(new > ArrayList(allRegions.keySet()), servers); > } > LOG.info("Bulk assigning " + allRegions.size() + " region(s) across " + > servers.size() + " server(s), retainAssignment=" + retainAssignment); > ... > {code} > In the function assignAllUserRegions(), listed above, AM fetches the server > list from ServerManager long before it actually uses it to create the assignment > plan. > In between, it performs a full scan of META to create an assignment map > of regions. So even if additional RSes have registered in the meantime (as > happened in this case), AM still has the old list of just one server. > This code snippet is from 0.90.6, but the same issue exists in 0.92, 0.94 and > trunk. Since MSLAB is enabled by default from 0.92 onwards, any large cluster > can hit this issue on cluster start-up when the following sequence holds > true. > # The Master starts long before the RSes (by default this
[jira] [Updated] (HBASE-6331) Problem with HBCK mergeOverlaps
[ https://issues.apache.org/jira/browse/HBASE-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6331: - Fix Version/s: (was: 0.94.1) 0.94.2 Need to release 0.94.1RC soon, pushing for now. > Problem with HBCK mergeOverlaps > --- > > Key: HBASE-6331 > URL: https://issues.apache.org/jira/browse/HBASE-6331 > Project: HBase > Issue Type: Bug > Components: hbck >Reporter: Anoop Sam John >Assignee: Anoop Sam John > Fix For: 0.96.0, 0.94.2 > > Attachments: HBASE-6331_94.patch, HBASE-6331_Trunk.patch > > > In HDFSIntegrityFixer#mergeOverlaps(), there is logic to compute the final > range of the region after the overlap is merged. > I can see one issue with this code: > {code} > if (RegionSplitCalculator.BYTES_COMPARATOR > .compare(hi.getEndKey(), range.getSecond()) > 0) { > range.setSecond(hi.getEndKey()); > } > {code} > Suppose the overlapping regions include the last region of the table, whose endKey is > empty. The merged range should then end with an empty byte[] as its endKey. > But with the above logic, any other key compares greater than the > empty byte[] and overwrites it. > As a result, the newly created region does not get an empty byte[] as its endKey.
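One way to handle the empty end key described in HBASE-6331 is to compare end keys with a helper that treats an empty byte[] as "end of table", that is, greater than every other key. The following is only a minimal sketch under that assumption. `EndKeyMerge`, `compareEndKeys` and `mergedEndKey` are hypothetical names, not HBase APIs, and the attached patches may solve the problem differently.

```java
class EndKeyMerge {
    // Plain lexicographic comparison of unsigned bytes (hypothetical stand-in
    // for the comparator used in the issue).
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff, y = b[i] & 0xff;
            if (x != y) return x - y;
        }
        return a.length - b.length;
    }

    // End-key comparison: an empty key means "end of table" and sorts last.
    static int compareEndKeys(byte[] a, byte[] b) {
        boolean aLast = a.length == 0;
        boolean bLast = b.length == 0;
        if (aLast || bLast) return Boolean.compare(aLast, bLast);
        return compareBytes(a, b);
    }

    // Returns the larger of the two end keys under the end-key ordering.
    static byte[] mergedEndKey(byte[] current, byte[] candidate) {
        return compareEndKeys(candidate, current) > 0 ? candidate : current;
    }

    public static void main(String[] args) {
        byte[] merged = mergedEndKey(new byte[0], new byte[] {'m'});
        System.out.println("merged end key length = " + merged.length);
    }
}
```

With this ordering the empty end key survives the merge regardless of which region is visited first.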
[jira] [Updated] (HBASE-6291) Don't retry increments on an invalid cell
[ https://issues.apache.org/jira/browse/HBASE-6291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6291: -- Fix Version/s: (was: 0.90.7) 0.90.8 bumping from 0.90.7 > Don't retry increments on an invalid cell > - > > Key: HBASE-6291 > URL: https://issues.apache.org/jira/browse/HBASE-6291 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.90.6, 0.92.1, 0.94.0 >Reporter: Jean-Daniel Cryans > Fix For: 0.92.2, 0.94.2, 0.90.8 > > > This says it all: > {noformat} > ERROR: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after > attempts=7, exceptions: > Thu Jun 28 18:34:44 UTC 2012, > org.apache.hadoop.hbase.client.HTable$8@4eabaf8c, java.io.IOException: > java.io.IOException: Attempted to increment field that isn't 64 bits wide > {noformat} > {{HRegion}} should be modified here to send a DoNotRetryIOException: > {code} > if (wrongLength) { > throw new DoNotRetryIOException( > "Attempted to increment field that isn't 64 bits wide"); > } > {code}
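To illustrate why the exception type matters in HBASE-6291, here is a small self-contained sketch of a retry loop that gives up immediately on a "do not retry" error. Everything in it is an assumption for illustration: `RetryDemo`, `IoCall`, `callWithRetries` and the local `DoNotRetryIOException` are stand-ins written for this example, not the HBase client classes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

class RetryDemo {
    // Local stand-in for a non-retriable error (not HBase's own class).
    static class DoNotRetryIOException extends IOException {
        DoNotRetryIOException(String msg) { super(msg); }
    }

    // Hypothetical callable that may throw IOException.
    interface IoCall<T> { T call() throws IOException; }

    static <T> T callWithRetries(IoCall<T> op, int maxAttempts) throws IOException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (DoNotRetryIOException e) {
                throw e;          // retrying cannot help, fail fast
            } catch (IOException e) {
                last = e;         // possibly transient, try again
            }
        }
        throw new IOException("Failed after attempts=" + maxAttempts, last);
    }

    // Increment only makes sense on an 8-byte (64-bit) cell value.
    static long increment(byte[] cell, long amount) throws IOException {
        if (cell.length != Long.BYTES) {
            throw new DoNotRetryIOException(
                "Attempted to increment field that isn't 64 bits wide");
        }
        return ByteBuffer.wrap(cell).getLong() + amount;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        try {
            callWithRetries(() -> { calls[0]++; return increment(new byte[4], 1L); }, 7);
        } catch (IOException e) {
            System.out.println("calls=" + calls[0] + ", error=" + e.getMessage());
        }
    }
}
```

With a plain IOException the same loop would run all seven attempts before failing, which matches the "attempts=7" seen in the report.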
[jira] [Updated] (HBASE-6375) Master could possibly be using a stale list of region servers for creating assignment plan during startup
[ https://issues.apache.org/jira/browse/HBASE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aditya Kishore updated HBASE-6375: -- Description: While investigating an Out of Memory issue, I had an interesting observation where the master tries to assign all regions to a single region server even though 7 others had already registered with it. As the cluster had MSLAB enabled, this resulted in OOM on the RS when it tried to open all of them. *From master's log (edited for brevity):* {quote} 55,468 Waiting on regionserver(s) to checkin 56,968 Waiting on regionserver(s) to checkin 58,468 Waiting on regionserver(s) to checkin 59,968 Waiting on regionserver(s) to checkin 01,242 Registering server=srv109.datacenter,60020,1338673920529,regionCount=0,userLoad=false 01,469 Waiting on regionserver(s) count to settle; currently=1 02,969 Finished waiting for regionserver count to settle; count=1,sleptFor=46500 02,969 Exiting wait on regionserver(s) to checkin; count=1, stopped=false,count of regions out on cluster=0 03,010 Processing region \-ROOT\-,,0.70236052 in state M_ZK_REGION_OFFLINE 03,220 \-ROOT\- assigned=0, rit=true, location=srv109.datacenter:60020 03,221 Processing region .META.,,1.1028785192 in state M_ZK_REGION_OFFLINE 03,336 Detected completed assignment of META, notifying catalog tracker 03,350 .META.
assigned=0, rit=true, location=srv109.datacenter:60020 03,350 Master startup proceeding: cluster startup 04,006 Registering server=srv111.datacenter,60020,1338673923399,regionCount=0,userLoad=false 04,012 Registering server=srv113.datacenter,60020,1338673923532,regionCount=0,userLoad=false 04,269 Registering server=srv115.datacenter,60020,1338673923471,regionCount=0,userLoad=false 04,363 Registering server=srv117.datacenter,60020,1338673923928,regionCount=0,userLoad=false 04,599 Registering server=srv127.datacenter,60020,1338673924067,regionCount=0,userLoad=false 04,606 Registering server=srv119.datacenter,60020,1338673923953,regionCount=0,userLoad=false 04,804 Registering server=srv129.datacenter,60020,1338673924339,regionCount=0,userLoad=false 05,126 Bulk assigning 1252 region(s) across 1 server(s), retainAssignment=true 05,546 hd109.datacenter,60020,1338673920529 unassigned znodes=207 of {quote} *A peek at AssignmentManager code offer some explanation:* {code} public void assignAllUserRegions() throws IOException, InterruptedException { // Get all available servers List servers = serverManager.getOnlineServersList(); // Scan META for all user regions, skipping any disabled tables Map allRegions = MetaReader.fullScan(catalogTracker, this.zkTable.getDisabledTables(), true); if (allRegions == null || allRegions.isEmpty()) return; // Determine what type of assignment to do on startup boolean retainAssignment = master.getConfiguration(). getBoolean("hbase.master.startup.retainassign", true); Map> bulkPlan = null; if (retainAssignment) { // Reuse existing assignment info bulkPlan = LoadBalancer.retainAssignment(allRegions, servers); } else { // assign regions in round-robin fashion bulkPlan = LoadBalancer.roundRobinAssignment(new ArrayList(allRegions.keySet()), servers); } LOG.info("Bulk assigning " + allRegions.size() + " region(s) across " + servers.size() + " server(s), retainAssignment=" + retainAssignment); ... 
{code} In the function assignAllUserRegions(), listed above, AM fetches the server list from ServerManager long before it actually uses it to create the assignment plan. In between, it performs a full scan of META to create an assignment map of regions. So even if additional RSes have registered in the meantime (as happened in this case), AM still has the old list of just one server. This code snippet is from 0.90.6, but the same issue exists in 0.92, 0.94 and trunk. Since MSLAB is enabled by default from 0.92 onwards, any large cluster can hit this issue on cluster start-up when the following sequence holds true. # The Master starts long before the RSes (by default, "long" ~= 4.5 seconds) # All the RSes start together, but one wins the race to register with the Master by a few seconds. I am attaching a patch for trunk that moves the code which fetches the RS list from the beginning of the function to where it is first used. Apart from this change, one other HBase setting that now becomes important is "hbase.master.wait.on.regionservers.mintostart", because MSLAB is enabled by default. Large clusters that keep MSLAB enabled must now raise "hbase.master.wait.on.regionservers.mintostart" from its default of 1 to a suitable number, so that the master waits for a quorum of RSes sufficient to open all the regions among themselves. I'll create a separate JIRA for the documentation change. was: While investigating an Out of Memory issue, I had an interesting observation where the master
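The ordering problem described in HBASE-6375 can be reproduced in miniature. The sketch below is a toy model, not the AssignmentManager code: `StaleServerListDemo`, `serversSeenEarly` and `serversSeenLate` are hypothetical names, the "META scan" is just a Runnable during which more servers register, and the server names other than srv109 are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class StaleServerListDemo {
    // Snapshot of the server list is taken first, then the slow scan runs.
    // Servers that register during the scan are missing from the plan.
    static int serversSeenEarly(Supplier<List<String>> onlineServers, Runnable metaScan) {
        List<String> servers = onlineServers.get();
        metaScan.run();
        return servers.size();
    }

    // Snapshot is taken right before it is used, after the slow scan.
    static int serversSeenLate(Supplier<List<String>> onlineServers, Runnable metaScan) {
        metaScan.run();
        List<String> servers = onlineServers.get();
        return servers.size();
    }

    public static void main(String[] args) {
        List<String> live = new ArrayList<>();
        live.add("srv109");
        // Seven more region servers register while "META" is being scanned.
        Runnable scan = () -> { for (int i = 0; i < 7; i++) live.add("rs" + i); };
        int early = serversSeenEarly(() -> new ArrayList<>(live), scan);
        System.out.println("plan built from early snapshot uses " + early + " server(s)");
    }
}
```

Moving the snapshot next to its first use narrows the window but does not remove it, which is why the description also points at "hbase.master.wait.on.regionservers.mintostart".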
[jira] [Commented] (HBASE-6321) ReplicationSource dies reading the peer's id
[ https://issues.apache.org/jira/browse/HBASE-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412377#comment-13412377 ] Jonathan Hsieh commented on HBASE-6321: --- Bumping from 0.90.7 to 0.90.8 > ReplicationSource dies reading the peer's id > > > Key: HBASE-6321 > URL: https://issues.apache.org/jira/browse/HBASE-6321 > Project: HBase > Issue Type: Bug >Affects Versions: 0.92.1, 0.94.0 >Reporter: Jean-Daniel Cryans > Fix For: 0.92.2, 0.96.0, 0.94.2, 0.90.8 > > > This is what I saw: > {noformat} > 2012-07-01 05:04:01,638 ERROR > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Closing > source 8 because an error occurred: Could not read peer's cluster id > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /va1-backup/hbaseid > at > org.apache.zookeeper.KeeperException.create(KeeperException.java:127) > at > org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021) > at > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154) > at > org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:259) > at > org.apache.hadoop.hbase.zookeeper.ClusterId.readClusterIdZNode(ClusterId.java:61) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:253) > {noformat} > The session should just be reopened.
[jira] [Updated] (HBASE-6321) ReplicationSource dies reading the peer's id
[ https://issues.apache.org/jira/browse/HBASE-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-6321: -- Fix Version/s: (was: 0.90.7) 0.90.8 > ReplicationSource dies reading the peer's id > > > Key: HBASE-6321 > URL: https://issues.apache.org/jira/browse/HBASE-6321 > Project: HBase > Issue Type: Bug >Affects Versions: 0.92.1, 0.94.0 >Reporter: Jean-Daniel Cryans > Fix For: 0.92.2, 0.96.0, 0.94.2, 0.90.8 > > > This is what I saw: > {noformat} > 2012-07-01 05:04:01,638 ERROR > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Closing > source 8 because an error occurred: Could not read peer's cluster id > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode > = Session expired for /va1-backup/hbaseid > at > org.apache.zookeeper.KeeperException.create(KeeperException.java:127) > at > org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021) > at > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154) > at > org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:259) > at > org.apache.hadoop.hbase.zookeeper.ClusterId.readClusterIdZNode(ClusterId.java:61) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:253) > {noformat} > The session should just be reopened.
[jira] [Updated] (HBASE-5157) Backport HBASE-4880- Region is on service before openRegionHandler completes, may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-5157: -- Fix Version/s: (was: 0.90.8) 0.90.7 Actually, since this is a data loss bug, considering it. > Backport HBASE-4880- Region is on service before openRegionHandler completes, > may cause data loss > - > > Key: HBASE-5157 > URL: https://issues.apache.org/jira/browse/HBASE-5157 > Project: HBase > Issue Type: Bug >Reporter: ramkrishna.s.vasudevan > Fix For: 0.90.7 > > Attachments: HBASE-4880_branch90_1.patch > > > Backporting to 0.90.6 considering the importance of the issue.
[jira] [Updated] (HBASE-4083) If Enable table is not completed and is partial, then scanning of the table is not working
[ https://issues.apache.org/jira/browse/HBASE-4083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-4083: -- Resolution: Fixed Fix Version/s: (was: 0.90.7) 0.94.0 Status: Resolved (was: Patch Available) Removed from 0.90, was committed to trunk/0.94.0 back in July 2011. Please file a new issue if you want to get it into 0.90. > If Enable table is not completed and is partial, then scanning of the table > is not working > --- > > Key: HBASE-4083 > URL: https://issues.apache.org/jira/browse/HBASE-4083 > Project: HBase > Issue Type: Bug >Affects Versions: 0.90.3 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > Fix For: 0.94.0, 0.92.0 > > Attachments: HBASE-4083-1.patch, HBASE-4083_0.90.patch, > HBASE-4083_0.90_1.patch, HBASE-4083_trunk.patch, HBASE-4083_trunk_1.patch > > > Consider the following scenario > Start the Master, Backup master and RegionServer. > Create a table which in turn creates a region. > Disable the table. > Enable the table again. > Kill the Active master exactly at the point before the actual region > assignment is started. > Restart or switch master. > Scan the table. > NotServingRegionException is thrown.
[jira] [Updated] (HBASE-5157) Backport HBASE-4880- Region is on service before openRegionHandler completes, may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-5157: -- Fix Version/s: (was: 0.90.7) 0.90.8 No activity in 6 months. Bumping from 0.90.7 > Backport HBASE-4880- Region is on service before openRegionHandler completes, > may cause data loss > - > > Key: HBASE-5157 > URL: https://issues.apache.org/jira/browse/HBASE-5157 > Project: HBase > Issue Type: Bug >Reporter: ramkrishna.s.vasudevan > Fix For: 0.90.8 > > Attachments: HBASE-4880_branch90_1.patch > > > Backporting to 0.90.6 considering the importance of the issue.
[jira] [Updated] (HBASE-4462) Properly treating SocketTimeoutException
[ https://issues.apache.org/jira/browse/HBASE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-4462: -- Fix Version/s: (was: 0.90.7) 0.90.8 Looks like ram bumped this from 0.90.6 and it is assigned to him, so I'm bumping it from 0.90.7 > Properly treating SocketTimeoutException > > > Key: HBASE-4462 > URL: https://issues.apache.org/jira/browse/HBASE-4462 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.90.4 >Reporter: Jean-Daniel Cryans >Assignee: ramkrishna.s.vasudevan > Fix For: 0.90.8 > > Attachments: HBASE-4462_0.90.x.patch > > > SocketTimeoutException is currently treated like any IOE inside of > HCM.getRegionServerWithRetries and I think this is a problem. This method > should only do retries in cases where we are pretty sure the operation will > complete, but with STE we already waited for (by default) 60 seconds and > nothing happened. > I found this while debugging Douglas Campbell's problem on the mailing list > where it seemed like he was using the same scanner from multiple threads, but > actually it was just the same client doing retries while the first run didn't > even finish yet (that's another problem). You could see the first scanner, > then up to two other handlers waiting for it to finish in order to run > (because of the synchronization on RegionScanner). > So what should we do? We could treat STE as a DoNotRetryException and let the > client deal with it, or we could retry only once. > There's also the option of having a different behavior for get/put/icv/scan; > the issue with operations that modify a cell is that you don't know if the > operation completed or not (same when a RS dies hard after completing let's > say a Put but just before returning to the client).
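The options floated in HBASE-4462 can be combined into a small decision function. The sketch below shows one possible combination, chosen purely for illustration and not the fix that was adopted: reads are retried once after a SocketTimeoutException, mutations are not retried because their outcome is unknown, and other IOExceptions keep the normal retry budget. `TimeoutRetryPolicy`, `Op` and `shouldRetry` are hypothetical names.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

class TimeoutRetryPolicy {
    enum Op { GET, SCAN, PUT, INCREMENT }

    // attemptsSoFar counts the attempts that have already failed (1 = first try failed).
    static boolean shouldRetry(IOException e, Op op, int attemptsSoFar, int maxAttempts) {
        if (attemptsSoFar >= maxAttempts) {
            return false; // retry budget exhausted
        }
        if (e instanceof SocketTimeoutException) {
            // A timeout already cost a full RPC timeout. For a mutation we cannot
            // tell whether it was applied, so leave the decision to the caller.
            boolean readOnly = (op == Op.GET || op == Op.SCAN);
            return readOnly && attemptsSoFar < 2; // reads: retry only once
        }
        return true; // other IOExceptions: keep retrying within the budget
    }

    public static void main(String[] args) {
        IOException timeout = new SocketTimeoutException("60s elapsed");
        System.out.println("retry GET after first timeout? "
            + shouldRetry(timeout, Op.GET, 1, 10));
        System.out.println("retry PUT after first timeout? "
            + shouldRetry(timeout, Op.PUT, 1, 10));
    }
}
```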
[jira] [Updated] (HBASE-5323) Need to handle assertion error while splitting log through ServerShutDownHandler by shutting down the master
[ https://issues.apache.org/jira/browse/HBASE-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-5323: -- Fix Version/s: (was: 0.90.7) 0.90.8 > Need to handle assertion error while splitting log through > ServerShutDownHandler by shutting down the master > > > Key: HBASE-5323 > URL: https://issues.apache.org/jira/browse/HBASE-5323 > Project: HBase > Issue Type: Bug >Affects Versions: 0.90.5 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > Fix For: 0.94.2, 0.90.8 > > Attachments: HBASE-5323.patch, HBASE-5323.patch > > > We know that while parsing the HLog we expect the proper length from HDFS. > In WALReaderFSDataInputStream > {code} > assert(realLength >= this.length); > {code} > We are trying to come out if the above condition is not satisfied. But if > SSH.splitLog() gets this problem then it lands in the run method of > EventHandler. This kills the SSH thread and so further assignment does not > happen. If ROOT and META are to be assigned they cannot be. > I think in this condition we should abort the master by catching such exceptions. > Please do suggest.
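The failure mode in HBASE-5323 comes from the fact that a failed `assert` raises an AssertionError, which is an Error and therefore slips past any `catch (Exception e)` in the handler. A minimal sketch of the suggested direction follows, assuming a hypothetical `Abortable` callback and `runGuarded` wrapper that are not the HBase EventHandler API. Note that `assert` statements only fire when the JVM runs with assertions enabled (`-ea`), so the demo throws the error explicitly.

```java
class GuardedHandler {
    // Hypothetical hook that shuts the master down in an orderly way.
    interface Abortable {
        void abort(String why, Throwable cause);
    }

    // Runs the work and converts any Throwable, including AssertionError,
    // into an abort instead of letting the handler thread die silently.
    static boolean runGuarded(Runnable work, Abortable master) {
        try {
            work.run();
            return true;
        } catch (Throwable t) {
            master.abort("Unexpected error while splitting logs", t);
            return false;
        }
    }

    public static void main(String[] args) {
        boolean ok = runGuarded(
            () -> { throw new AssertionError("realLength < length"); },
            (why, cause) -> System.out.println("ABORT: " + why + " (" + cause + ")"));
        System.out.println("handler completed normally? " + ok);
    }
}
```

Catching Throwable this broadly is a deliberate choice for a last-resort guard; narrower handling may be preferable where the error can be recovered from.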
[jira] [Commented] (HBASE-5323) Need to handle assertion error while splitting log through ServerShutDownHandler by shutting down the master
[ https://issues.apache.org/jira/browse/HBASE-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412366#comment-13412366 ] Jonathan Hsieh commented on HBASE-5323: --- If it isn't making 0.94.1 then I'm going to bump it from 0.90.7. > Need to handle assertion error while splitting log through > ServerShutDownHandler by shutting down the master > > > Key: HBASE-5323 > URL: https://issues.apache.org/jira/browse/HBASE-5323 > Project: HBase > Issue Type: Bug >Affects Versions: 0.90.5 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > Fix For: 0.94.2, 0.90.8 > > Attachments: HBASE-5323.patch, HBASE-5323.patch > > > We know that while parsing the HLog we expect the proper length from HDFS. > In WALReaderFSDataInputStream > {code} > assert(realLength >= this.length); > {code} > We are trying to come out if the above condition is not satisfied. But if > SSH.splitLog() gets this problem then it lands in the run method of > EventHandler. This kills the SSH thread and so further assignment does not > happen. If ROOT and META are to be assigned they cannot be. > I think in this condition we should abort the master by catching such exceptions. > Please do suggest.
[jira] [Commented] (HBASE-4064) Two concurrent unassigning of the same region caused the endless loop of "Region has been PENDING_CLOSE for too long..."
[ https://issues.apache.org/jira/browse/HBASE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412363#comment-13412363 ] Jonathan Hsieh commented on HBASE-4064: --- There was some recent activity here, anyone planning on finishing this guy? (its been bumped a few times considering bumping it for the 0.90.7 release). > Two concurrent unassigning of the same region caused the endless loop of > "Region has been PENDING_CLOSE for too long..." > > > Key: HBASE-4064 > URL: https://issues.apache.org/jira/browse/HBASE-4064 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.90.3 >Reporter: Jieshan Bean > Fix For: 0.90.7 > > Attachments: HBASE-4064-v1.patch, HBASE-4064_branch90V2.patch, > disableflow.png > > > 1. If there is a "rubbish" RegionState object with "PENDING_CLOSE" in > regionsInTransition(The RegionState was remained by some exception which > should be removed, that's why I called it as "rubbish" object), but the > region is not currently assigned anywhere, TimeoutMonitor will fall into an > endless loop: > 2011-06-27 10:32:21,326 INFO > org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed > out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. > state=PENDING_CLOSE, ts=1309141555301 > 2011-06-27 10:32:21,326 INFO > org.apache.hadoop.hbase.master.AssignmentManager: Region has been > PENDING_CLOSE for too long, running forced unassign again on > region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. > 2011-06-27 10:32:21,438 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of > region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. > (offlining) > 2011-06-27 10:32:21,441 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign > region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
but it is > not currently assigned anywhere > 2011-06-27 10:32:31,207 INFO > org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed > out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. > state=PENDING_CLOSE, ts=1309141555301 > 2011-06-27 10:32:31,207 INFO > org.apache.hadoop.hbase.master.AssignmentManager: Region has been > PENDING_CLOSE for too long, running forced unassign again on > region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. > 2011-06-27 10:32:31,215 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of > region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. > (offlining) > 2011-06-27 10:32:31,215 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign > region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is > not currently assigned anywhere > 2011-06-27 10:32:41,164 INFO > org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed > out: test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. > state=PENDING_CLOSE, ts=1309141555301 > 2011-06-27 10:32:41,164 INFO > org.apache.hadoop.hbase.master.AssignmentManager: Region has been > PENDING_CLOSE for too long, running forced unassign again on > region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. > 2011-06-27 10:32:41,172 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of > region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. > (offlining) > 2011-06-27 10:32:41,172 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign > region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is > not currently assigned anywhere > . 
> 2 In the following scenario, two concurrent unassigning call of the same > region may lead to the above problem: > the first unassign call send rpc call success, the master watched the event > of "RS_ZK_REGION_CLOSED", process this event, will create a > ClosedRegionHandler to remove the state of the region in master.eg. > while ClosedRegionHandler is running in > "hbase.master.executor.closeregion.threads" thread (A), another unassign call > of same region run in another thread(B). > while thread B run "if (!regions.containsKey(region))", this.regions have > the region info, now cpu switch to thread A. > The thread A will remove the region from the sets of "this.regions" and > "regionsInTransition", then switch to thread B. the thread B run continue, > will throw an exception with the msg of "Server null returned > java.lang.NullPointerException: Passed server is null for > 9a6e26d40293663a79523c58315b930f", but without removing
[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception
[ https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412360#comment-13412360 ] Jonathan Hsieh commented on HBASE-5883: --- @Jieshan since this was committed a long time ago (5/3/12) I'd suggest creating a new issue to clean it up. I'll close this after it is done. > Backup master is going down due to connection refused exception > --- > > Key: HBASE-5883 > URL: https://issues.apache.org/jira/browse/HBASE-5883 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.90.6, 0.92.1, 0.94.0 >Reporter: Gopinathan A >Assignee: Jieshan Bean > Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.2 > > Attachments: 90-addendum.patch, 92-addendum.patch, 94-addendum.patch, > HBASE-5883-90.patch, HBASE-5883-92.patch, HBASE-5883-94.patch, > HBASE-5883-trunk.patch, trunk-addendum.patch > > > The active master node's network was down for some time (this node contains > Master,DN,ZK,RS). The backup node got the > notification and started to become active. Immediately, the backup node > aborted with the exception below. > {noformat} > 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: > finished splitting (more than or equal to) 861248320 bytes in 4 log files in > [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting] > in 26374ms > 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master > server abort: loaded coprocessors are: [] > 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: > Unhandled exception. Starting shutdown.
> java.io.IOException: java.net.ConnectException: Connection refused > at > org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375) > at > org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045) > at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897) > at > org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150) > at $Proxy13.getProtocolVersion(Unknown Source) > at > org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183) > at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303) > at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280) > at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332) > at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236) > at > org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276) > at > org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233) > at > org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220) > at > org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569) > at > org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369) > at > org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353) > at > org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660) > at > org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616) > at > org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540) > at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363) > at java.lang.Thread.run(Thread.java:662) > Caused by: java.net.ConnectException: Connection refused > at 
sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567) > at > org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) > at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488) > at > org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328) > at > org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362) > ... 20 more > 2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting > 2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: > Stopping service threads > {noformat}
[jira] [Commented] (HBASE-5376) Add more logging to triage HBASE-5312: Closed parent region present in Hlog.lastSeqWritten
[ https://issues.apache.org/jira/browse/HBASE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412358#comment-13412358 ]

Jonathan Hsieh commented on HBASE-5376:
---

Jimmy, this patch looks trivial. Do you want to commit it to the 0.90 branch? Other branches?

> Add more logging to triage HBASE-5312: Closed parent region present in Hlog.lastSeqWritten
> --
>
> Key: HBASE-5376
> URL: https://issues.apache.org/jira/browse/HBASE-5376
> Project: HBase
> Issue Type: Sub-task
> Reporter: Jimmy Xiang
> Priority: Minor
> Fix For: 0.90.7
>
> Attachments: hbase-5376.txt
>
>
> It is hard to find out what exactly caused HBASE-5312. Some logging will
> help shed some light.
[jira] [Updated] (HBASE-3834) Store ignores checksum errors when opening files
[ https://issues.apache.org/jira/browse/HBASE-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-3834: -- Fix Version/s: (was: 0.90.7) 0.90.8 No activity, moving to 0.90.8 > Store ignores checksum errors when opening files > > > Key: HBASE-3834 > URL: https://issues.apache.org/jira/browse/HBASE-3834 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 0.90.2 >Reporter: Todd Lipcon >Priority: Critical > Fix For: 0.90.8 > > > If you corrupt one of the storefiles in a region (eg using vim to muck up > some bytes), the region will still open, but that storefile will just be > ignored with a log message. We should probably not do this in general - > better to keep that region unassigned and force an admin to make a decision > to remove the bad storefile. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3834) Store ignores checksum errors when opening files
[ https://issues.apache.org/jira/browse/HBASE-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-3834: -- No activity, moving to 0.90.8 > Store ignores checksum errors when opening files > > > Key: HBASE-3834 > URL: https://issues.apache.org/jira/browse/HBASE-3834 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 0.90.2 >Reporter: Todd Lipcon >Priority: Critical > Fix For: 0.90.8 > > > If you corrupt one of the storefiles in a region (eg using vim to muck up > some bytes), the region will still open, but that storefile will just be > ignored with a log message. We should probably not do this in general - > better to keep that region unassigned and force an admin to make a decision > to remove the bad storefile. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6375) Master could possibly be using a stale list of region servers for creating assignment plan during startup
[ https://issues.apache.org/jira/browse/HBASE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aditya Kishore updated HBASE-6375: -- Fix Version/s: 0.96.0 Status: Patch Available (was: Open) As described in the summary, the patch modifies the AM code to fetch the list of RSes just before it needs it. > Master could possibly be using a stale list of region servers for creating > assignment plan during startup > - > > Key: HBASE-6375 > URL: https://issues.apache.org/jira/browse/HBASE-6375 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.94.0, 0.92.1, 0.90.6, 0.96.0 > Environment: All >Reporter: Aditya Kishore >Assignee: Aditya Kishore > Fix For: 0.96.0 > > Attachments: HBASE-6375_trunk.patch > > > While investigating an Out of Memory issue, I had an interesting observation > where the master tries to assign all regions to a single region server even > though 7 other had already registered with it. > As the cluster had MSLAB enabled, this resulted in OOM on the RS when it > tired to open all of them. > *From master's log (edited for brevity):* > {quote} > 55,468 Waiting on regionserver(s) to checkin > 56,968 Waiting on regionserver(s) to checkin > 58,468 Waiting on regionserver(s) to checkin > 59,968 Waiting on regionserver(s) to checkin > 01,242 Registering server=srv109.datacenter,60020,1338673920529,regionCount=0,userLoad=false > 01,469 Waiting on regionserver(s) count to settle; currently=1 > 02,969 Finished waiting for regionserver count to settle; count=1,sleptFor=46500 > 02,969 Exiting wait on regionserver(s) to checkin; count=1, stopped=false,count of regions out on cluster=0 > 03,010 Processing region \-ROOT\-,,0.70236052 in state M_ZK_REGION_OFFLINE > 03,220 \-ROOT\- assigned=0, rit=true, location=srv109.datacenter:60020 > 03,221 Processing region .META.,,1.1028785192 in state M_ZK_REGION_OFFLINE > 03,336 Detected completed assignment of META, notifying catalog tracker > 03,350 .META. 
assigned=0, rit=true, location=srv109.datacenter:60020 > 03,350 Master startup proceeding: cluster startup > 04,006 Registering server=srv111.datacenter,60020,1338673923399,regionCount=0,userLoad=false > 04,012 Registering server=srv113.datacenter,60020,1338673923532,regionCount=0,userLoad=false > 04,269 Registering server=srv115.datacenter,60020,1338673923471,regionCount=0,userLoad=false > 04,363 Registering server=srv117.datacenter,60020,1338673923928,regionCount=0,userLoad=false > 04,599 Registering server=srv127.datacenter,60020,1338673924067,regionCount=0,userLoad=false > 04,606 Registering server=srv119.datacenter,60020,1338673923953,regionCount=0,userLoad=false > 04,804 Registering server=srv129.datacenter,60020,1338673924339,regionCount=0,userLoad=false > 05,126 Bulk assigning 1252 region(s) across 1 server(s), retainAssignment=true > 05,546 hd109.datacenter,60020,1338673920529 unassigned znodes=207 of > {quote} > *A peek at AssignmentManager code offer some explanation:* > {code} > public void assignAllUserRegions() throws IOException, InterruptedException > { > // Get all available servers > List servers = serverManager.getOnlineServersList(); > // Scan META for all user regions, skipping any disabled tables > Map allRegions = > MetaReader.fullScan(catalogTracker, this.zkTable.getDisabledTables(), > true); > if (allRegions == null || allRegions.isEmpty()) return; > // Determine what type of assignment to do on startup > boolean retainAssignment = master.getConfiguration(). 
> getBoolean("hbase.master.startup.retainassign", true); > Map> bulkPlan = null; > if (retainAssignment) { > // Reuse existing assignment info > bulkPlan = LoadBalancer.retainAssignment(allRegions, servers); > } else { > // assign regions in round-robin fashion > bulkPlan = LoadBalancer.roundRobinAssignment(new > ArrayList(allRegions.keySet()), servers); > } > LOG.info("Bulk assigning " + allRegions.size() + " region(s) across " + > servers.size() + " server(s), retainAssignment=" + retainAssignment); > ... > {code} > In the function assignAllUserRegions(), listed above, AM fetches the server > list from ServerManager long before it actually use it to create assignment > plan. > In between these, it performs a full scan of META to create an assignment map > of regions. So even if additional RSes have registered in the meantime (as > happened in this case), AM still has the old list of just one server. > This code snippet is from 0.90.6 but the same issue exists in 0.92, 0.94 and > trunk. Since MSLAB is enabled by default in 0.92 onwards, any large cluster > can hit this issue upon cluster
[jira] [Updated] (HBASE-6375) Master could possibly be using a stale list of region servers for creating assignment plan during startup
[ https://issues.apache.org/jira/browse/HBASE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aditya Kishore updated HBASE-6375: -- Attachment: HBASE-6375_trunk.patch Patch for trunk > Master could possibly be using a stale list of region servers for creating > assignment plan during startup > - > > Key: HBASE-6375 > URL: https://issues.apache.org/jira/browse/HBASE-6375 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0 > Environment: All >Reporter: Aditya Kishore >Assignee: Aditya Kishore > Attachments: HBASE-6375_trunk.patch > > > While investigating an Out of Memory issue, I had an interesting observation > where the master tries to assign all regions to a single region server even > though 7 other had already registered with it. > As the cluster had MSLAB enabled, this resulted in OOM on the RS when it > tired to open all of them. > *From master's log (edited for brevity):* > {quote} > 55,468 Waiting on regionserver(s) to checkin > 56,968 Waiting on regionserver(s) to checkin > 58,468 Waiting on regionserver(s) to checkin > 59,968 Waiting on regionserver(s) to checkin > 01,242 Registering server=srv109.datacenter,60020,1338673920529,regionCount=0,userLoad=false > 01,469 Waiting on regionserver(s) count to settle; currently=1 > 02,969 Finished waiting for regionserver count to settle; count=1,sleptFor=46500 > 02,969 Exiting wait on regionserver(s) to checkin; count=1, stopped=false,count of regions out on cluster=0 > 03,010 Processing region \-ROOT\-,,0.70236052 in state M_ZK_REGION_OFFLINE > 03,220 \-ROOT\- assigned=0, rit=true, location=srv109.datacenter:60020 > 03,221 Processing region .META.,,1.1028785192 in state M_ZK_REGION_OFFLINE > 03,336 Detected completed assignment of META, notifying catalog tracker > 03,350 .META. 
assigned=0, rit=true, location=srv109.datacenter:60020 > 03,350 Master startup proceeding: cluster startup > 04,006 Registering server=srv111.datacenter,60020,1338673923399,regionCount=0,userLoad=false > 04,012 Registering server=srv113.datacenter,60020,1338673923532,regionCount=0,userLoad=false > 04,269 Registering server=srv115.datacenter,60020,1338673923471,regionCount=0,userLoad=false > 04,363 Registering server=srv117.datacenter,60020,1338673923928,regionCount=0,userLoad=false > 04,599 Registering server=srv127.datacenter,60020,1338673924067,regionCount=0,userLoad=false > 04,606 Registering server=srv119.datacenter,60020,1338673923953,regionCount=0,userLoad=false > 04,804 Registering server=srv129.datacenter,60020,1338673924339,regionCount=0,userLoad=false > 05,126 Bulk assigning 1252 region(s) across 1 server(s), retainAssignment=true > 05,546 hd109.datacenter,60020,1338673920529 unassigned znodes=207 of > {quote} > *A peek at AssignmentManager code offer some explanation:* > {code} > public void assignAllUserRegions() throws IOException, InterruptedException > { > // Get all available servers > List servers = serverManager.getOnlineServersList(); > // Scan META for all user regions, skipping any disabled tables > Map allRegions = > MetaReader.fullScan(catalogTracker, this.zkTable.getDisabledTables(), > true); > if (allRegions == null || allRegions.isEmpty()) return; > // Determine what type of assignment to do on startup > boolean retainAssignment = master.getConfiguration(). 
> getBoolean("hbase.master.startup.retainassign", true); > Map> bulkPlan = null; > if (retainAssignment) { > // Reuse existing assignment info > bulkPlan = LoadBalancer.retainAssignment(allRegions, servers); > } else { > // assign regions in round-robin fashion > bulkPlan = LoadBalancer.roundRobinAssignment(new > ArrayList(allRegions.keySet()), servers); > } > LOG.info("Bulk assigning " + allRegions.size() + " region(s) across " + > servers.size() + " server(s), retainAssignment=" + retainAssignment); > ... > {code} > In the function assignAllUserRegions(), listed above, AM fetches the server > list from ServerManager long before it actually use it to create assignment > plan. > In between these, it performs a full scan of META to create an assignment map > of regions. So even if additional RSes have registered in the meantime (as > happened in this case), AM still has the old list of just one server. > This code snippet is from 0.90.6 but the same issue exists in 0.92, 0.94 and > trunk. Since MSLAB is enabled by default in 0.92 onwards, any large cluster > can hit this issue upon cluster start-up when the following sequence holds > true. > # Master start long before the RSes (by default this long ~= 4.5 seconds) > # All the RSes start togather but
[jira] [Created] (HBASE-6375) Master could possibly be using a stale list of region servers for creating assignment plan during startup
Aditya Kishore created HBASE-6375:
-

Summary: Master could possibly be using a stale list of region servers for creating assignment plan during startup
Key: HBASE-6375
URL: https://issues.apache.org/jira/browse/HBASE-6375
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.94.0, 0.92.1, 0.90.6, 0.96.0
Environment: All
Reporter: Aditya Kishore
Assignee: Aditya Kishore

While investigating an Out of Memory issue, I observed the master trying to assign all regions to a single region server even though 7 others had already registered with it.

As the cluster had MSLAB enabled, this resulted in an OOM on the RS when it tried to open all of them.

*From master's log (edited for brevity):*
{quote}
55,468 Waiting on regionserver(s) to checkin
56,968 Waiting on regionserver(s) to checkin
58,468 Waiting on regionserver(s) to checkin
59,968 Waiting on regionserver(s) to checkin
01,242 Registering server=srv109.datacenter,60020,1338673920529,regionCount=0,userLoad=false
01,469 Waiting on regionserver(s) count to settle; currently=1
02,969 Finished waiting for regionserver count to settle; count=1,sleptFor=46500
02,969 Exiting wait on regionserver(s) to checkin; count=1, stopped=false,count of regions out on cluster=0
03,010 Processing region \-ROOT\-,,0.70236052 in state M_ZK_REGION_OFFLINE
03,220 \-ROOT\- assigned=0, rit=true, location=srv109.datacenter:60020
03,221 Processing region .META.,,1.1028785192 in state M_ZK_REGION_OFFLINE
03,336 Detected completed assignment of META, notifying catalog tracker
03,350 .META. assigned=0, rit=true, location=srv109.datacenter:60020
03,350 Master startup proceeding: cluster startup
04,006 Registering server=srv111.datacenter,60020,1338673923399,regionCount=0,userLoad=false
04,012 Registering server=srv113.datacenter,60020,1338673923532,regionCount=0,userLoad=false
04,269 Registering server=srv115.datacenter,60020,1338673923471,regionCount=0,userLoad=false
04,363 Registering server=srv117.datacenter,60020,1338673923928,regionCount=0,userLoad=false
04,599 Registering server=srv127.datacenter,60020,1338673924067,regionCount=0,userLoad=false
04,606 Registering server=srv119.datacenter,60020,1338673923953,regionCount=0,userLoad=false
04,804 Registering server=srv129.datacenter,60020,1338673924339,regionCount=0,userLoad=false
05,126 Bulk assigning 1252 region(s) across 1 server(s), retainAssignment=true
05,546 hd109.datacenter,60020,1338673920529 unassigned znodes=207 of
{quote}

*A peek at the AssignmentManager code offers some explanation:*
{code}
public void assignAllUserRegions() throws IOException, InterruptedException {
  // Get all available servers
  List servers = serverManager.getOnlineServersList();

  // Scan META for all user regions, skipping any disabled tables
  Map allRegions =
    MetaReader.fullScan(catalogTracker, this.zkTable.getDisabledTables(), true);
  if (allRegions == null || allRegions.isEmpty()) return;

  // Determine what type of assignment to do on startup
  boolean retainAssignment = master.getConfiguration().
    getBoolean("hbase.master.startup.retainassign", true);

  Map> bulkPlan = null;
  if (retainAssignment) {
    // Reuse existing assignment info
    bulkPlan = LoadBalancer.retainAssignment(allRegions, servers);
  } else {
    // assign regions in round-robin fashion
    bulkPlan = LoadBalancer.roundRobinAssignment(new ArrayList(allRegions.keySet()), servers);
  }
  LOG.info("Bulk assigning " + allRegions.size() + " region(s) across " +
    servers.size() + " server(s), retainAssignment=" + retainAssignment);
  ...
{code}

In the function assignAllUserRegions(), listed above, AM fetches the server list from ServerManager long before it actually uses it to create the assignment plan.

In between, it performs a full scan of META to create an assignment map of regions. So even if additional RSes have registered in the meantime (as happened in this case), AM still has the old list of just one server.

This code snippet is from 0.90.6, but the same issue exists in 0.92, 0.94 and trunk. Since MSLAB is enabled by default from 0.92 onwards, any large cluster can hit this issue on cluster start-up when the following sequence holds true.

# The Master starts long before the RSes (by default this "long" ~= 4.5 seconds)
# All the RSes start together, but one wins the race of registering with the Master by a few seconds.

I am attaching a patch for trunk which moves the code that fetches the RS list from the beginning of the function to where it is first used.

Apart from this change, one additional HBase setting that now becomes important is "hbase.master.wait.on.regionservers.mintostart", because MSLAB is enabled by default.

Large clusters that keep it enabled must now modify "hbase.master.wait.on.regionser
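
A note on the ordering problem described in this report: the following is a minimal, self-contained sketch, not the attached patch and not HBase code. The class `Registry`, the method names, and the server names are hypothetical stand-ins (for ServerManager, the META scan, and the region servers in the log). It only illustrates why taking the server snapshot before a slow step yields a stale list, while taking it just before planning does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class StaleServerListSketch {

    /** Hypothetical stand-in for ServerManager: tracks registered servers. */
    public static class Registry {
        private final List<String> online = new CopyOnWriteArrayList<>();

        public void register(String server) {
            online.add(server);
        }

        /** Returns a point-in-time copy, like a snapshot of online servers. */
        public List<String> getOnlineServersList() {
            return new ArrayList<>(online);
        }
    }

    /** Stand-in for the slow META scan; more servers register while it runs. */
    static void slowScan(Registry registry, List<String> lateServers) {
        for (String s : lateServers) {
            registry.register(s);
        }
    }

    /** Ordering from the report: snapshot first, slow scan second. */
    public static int serversSeenWithEarlyFetch(Registry registry, List<String> lateServers) {
        List<String> servers = registry.getOnlineServersList();
        slowScan(registry, lateServers);
        return servers.size(); // the snapshot misses the late registrations
    }

    /** Reordered: slow scan first, snapshot just before it is needed. */
    public static int serversSeenWithLateFetch(Registry registry, List<String> lateServers) {
        slowScan(registry, lateServers);
        List<String> servers = registry.getOnlineServersList();
        return servers.size(); // the snapshot includes the late registrations
    }

    public static void main(String[] args) {
        List<String> late = Arrays.asList("srv111", "srv113");

        Registry a = new Registry();
        a.register("srv109");
        System.out.println("early fetch sees " + serversSeenWithEarlyFetch(a, late));

        Registry b = new Registry();
        b.register("srv109");
        System.out.println("late fetch sees " + serversSeenWithLateFetch(b, late));
    }
}
```

Moving the fetch later narrows the window but does not close it; servers that register after the snapshot are still missed, which is presumably why the report also points at the minimum-servers-to-start setting.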
[jira] [Commented] (HBASE-6368) Upgrade Guava for critical performance bug fix
[ https://issues.apache.org/jira/browse/HBASE-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412279#comment-13412279 ] Andrew Purtell commented on HBASE-6368: --- This issue and patch just rolls right over a bunch of previous discussion like it never happened. > Upgrade Guava for critical performance bug fix > -- > > Key: HBASE-6368 > URL: https://issues.apache.org/jira/browse/HBASE-6368 > Project: HBase > Issue Type: Bug >Affects Versions: 0.96.0 >Reporter: Zhihong Ted Yu >Assignee: Zhihong Ted Yu >Priority: Critical > Attachments: 6368-trunk.txt > > > The bug is http://code.google.com/p/guava-libraries/issues/detail?id=1055 > See discussion under 'Upgrade to Guava 12.0.1: Performance bug in > CacheBuilder/LoadingCache fixed!' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-5334) Pluggable Compaction Algorithms
[ https://issues.apache.org/jira/browse/HBASE-5334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg reassigned HBASE-5334:
--

Assignee: Akashnil (was: Nicolas Spiegelberg)

Assigning to Akashnil, an intern with us who will be working on a size-based compaction algorithm (similar to the BigTable strategy).

> Pluggable Compaction Algorithms
> ---
>
> Key: HBASE-5334
> URL: https://issues.apache.org/jira/browse/HBASE-5334
> Project: HBase
> Issue Type: Improvement
> Reporter: Nicolas Spiegelberg
> Assignee: Akashnil
> Priority: Minor
> Labels: compaction, regionserver
>
> It would be good to create a set of common compaction algorithms so that we
> can tune this on a per-CF basis. In order to accomplish this, we need to
> refactor the current algorithm for pluggability.
[jira] [Commented] (HBASE-6331) Problem with HBCK mergeOverlaps
[ https://issues.apache.org/jira/browse/HBASE-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412261#comment-13412261 ]

Jimmy Xiang commented on HBASE-6331:

I don't think this is a bug. In hbck, the last end key is not the normal empty byte[]. Instead, it is changed to null, so we have a special comparator.

> Problem with HBCK mergeOverlaps
> ---
>
> Key: HBASE-6331
> URL: https://issues.apache.org/jira/browse/HBASE-6331
> Project: HBase
> Issue Type: Bug
> Components: hbck
> Reporter: Anoop Sam John
> Assignee: Anoop Sam John
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6331_94.patch, HBASE-6331_Trunk.patch
>
>
> In HDFSIntegrityFixer#mergeOverlaps(), there is logic to create the final
> range of the region after the overlap.
> I can see one issue with this code:
> {code}
> if (RegionSplitCalculator.BYTES_COMPARATOR
>     .compare(hi.getEndKey(), range.getSecond()) > 0) {
>   range.setSecond(hi.getEndKey());
> }
> {code}
> Suppose the regions include the last region, whose end key is empty. The
> final range should then have an empty byte[] as its end key.
> But the logic above treats any other key as greater than the empty byte[]
> and sets that key instead.
> As a result, the new region does not get an empty byte[] as its end key.
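
To make the disagreement in this thread concrete, here is a small sketch of the two behaviours being discussed. It is an illustration under stated assumptions, not the HBase source: `NULL_LAST`, `PLAIN`, and `widerEnd` are names invented for this example. `PLAIN` orders keys lexicographically, so an empty byte[] sorts before everything (the situation the reporter worries about); `NULL_LAST` treats null as "end of table" and sorts it after every real key (the special comparator the comment refers to).

```java
import java.util.Comparator;

public class EndKeyComparatorSketch {

    /** Plain unsigned lexicographic order: the empty key sorts first. */
    public static final Comparator<byte[]> PLAIN = EndKeyComparatorSketch::compareUnsigned;

    /** null stands for "end of table" and sorts after every real key. */
    public static final Comparator<byte[]> NULL_LAST = (l, r) -> {
        if (l == null && r == null) return 0;
        if (l == null) return 1;
        if (r == null) return -1;
        return compareUnsigned(l, r);
    };

    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    /** Mirrors the quoted merge step: keep whichever end key compares greater. */
    public static byte[] widerEnd(Comparator<byte[]> cmp, byte[] current, byte[] candidate) {
        return cmp.compare(candidate, current) > 0 ? candidate : current;
    }

    public static void main(String[] args) {
        byte[] m = new byte[] {'m'};
        // With null as the last end key, the merged range keeps the open end.
        System.out.println(widerEnd(NULL_LAST, m, null) == null);
        // With an empty byte[] and a plain comparator, "m" would replace it.
        System.out.println(widerEnd(PLAIN, new byte[0], m) == m);
    }
}
```

Whether hbck actually hands null or an empty byte[] to the merge step is the point under discussion above; the sketch only shows that the quoted `compare(...) > 0` check is correct under the first convention and wrong under the second.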
[jira] [Commented] (HBASE-6331) Problem with HBCK mergeOverlaps
[ https://issues.apache.org/jira/browse/HBASE-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412249#comment-13412249 ]

Jonathan Hsieh commented on HBASE-6331:
---

Lars, let's go ahead and bump it to 0.94.2 -- I don't want to hold up a release, and though it is a bug, we've gotten by without the fix for a while now.

> Problem with HBCK mergeOverlaps
> ---
>
> Key: HBASE-6331
> URL: https://issues.apache.org/jira/browse/HBASE-6331
> Project: HBase
> Issue Type: Bug
> Components: hbck
> Reporter: Anoop Sam John
> Assignee: Anoop Sam John
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6331_94.patch, HBASE-6331_Trunk.patch
>
>
> In HDFSIntegrityFixer#mergeOverlaps(), there is logic to create the final
> range of the region after the overlap.
> I can see one issue with this code:
> {code}
> if (RegionSplitCalculator.BYTES_COMPARATOR
>     .compare(hi.getEndKey(), range.getSecond()) > 0) {
>   range.setSecond(hi.getEndKey());
> }
> {code}
> Suppose the regions include the last region, whose end key is empty. The
> final range should then have an empty byte[] as its end key.
> But the logic above treats any other key as greater than the empty byte[]
> and sets that key instead.
> As a result, the new region does not get an empty byte[] as its end key.
[jira] [Commented] (HBASE-6220) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics
[ https://issues.apache.org/jira/browse/HBASE-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412230#comment-13412230 ]

Paul Cavallaro commented on HBASE-6220:
---

Sorry, I've been away this week with intermittent internet connectivity. I'll try to reply to all of this soon.

> PersistentMetricsTimeVaryingRate gets used for non-time-based metrics
> -
>
> Key: HBASE-6220
> URL: https://issues.apache.org/jira/browse/HBASE-6220
> Project: HBase
> Issue Type: Bug
> Components: metrics
> Affects Versions: 0.96.0
> Reporter: David S. Wang
> Assignee: Paul Cavallaro
> Priority: Minor
> Labels: noob
> Attachments: ServerMetrics_HBASE_6220.patch
>
>
> PersistentMetricsTimeVaryingRate gets used for metrics that are not
> time-based, leading to confusing names such as "avg_time" for compaction
> size, etc. You have to read the code to understand that this is
> actually referring to bytes, not seconds.
[jira] [Commented] (HBASE-6336) Split point should not be equal with start row or end row
[ https://issues.apache.org/jira/browse/HBASE-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412069#comment-13412069 ]

stack commented on HBASE-6336:
--

+1 on the patch.

@Ram, when you say 'But here in the first region there are no kvs at all and hence we flush an empty file.', what flush are you talking about? The close of the region on split? If so, why do we write a flush file if there are no KVs? Do we? That doesn't seem right.

> Split point should not be equal with start row or end row
> -
>
> Key: HBASE-6336
> URL: https://issues.apache.org/jira/browse/HBASE-6336
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: chunhui shen
> Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: HBASE-6336.patch
>
>
> Should we allow a split point equal to the region's start row or end row?
> {code}
> // if the midkey is the same as the first and last keys, then we cannot
> // (ever) split this region.
> if (this.comparator.compareRows(mk, firstKey) == 0 &&
>     this.comparator.compareRows(mk, lastKey) == 0) {
>   if (LOG.isDebugEnabled()) {
>     LOG.debug("cannot split because midkey is the same as first or " +
>       "last row");
>   }
> {code}
> Here, I think it is a mistake.
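
One reading of the report above is that the quoted check uses `&&` (midkey equals both the first and the last row) while the log message and the issue title talk about the midkey equalling either one. The sketch below only contrasts those two conditions; it is not the attached patch, the method names are invented for illustration, and the assumption that `||` is the intended behaviour is mine, taken from the issue title.

```java
import java.util.Arrays;

public class SplitPointCheckSketch {

    static boolean sameRow(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    /** Condition as quoted: refuse to split only if midkey equals BOTH ends. */
    public static boolean refusesWithAnd(byte[] mid, byte[] first, byte[] last) {
        return sameRow(mid, first) && sameRow(mid, last);
    }

    /** Condition implied by the issue title: refuse if midkey equals EITHER end. */
    public static boolean refusesWithOr(byte[] mid, byte[] first, byte[] last) {
        return sameRow(mid, first) || sameRow(mid, last);
    }

    public static void main(String[] args) {
        byte[] first = {'a'};
        byte[] last = {'z'};
        byte[] mid = {'a'}; // midkey coincides with the start row only

        System.out.println("&& refuses split: " + refusesWithAnd(mid, first, last));
        System.out.println("|| refuses split: " + refusesWithOr(mid, first, last));
    }
}
```

With the `&&` form, a midkey equal to the start row alone still passes the check, so a split at that point would leave one daughter with no rows, which appears to be the empty-region case the comment above asks about.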
[jira] [Commented] (HBASE-2315) BookKeeper for write-ahead logging
[ https://issues.apache.org/jira/browse/HBASE-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412065#comment-13412065 ]

Flavio Junqueira commented on HBASE-2315:
-

Hi Ted, fs is part of the issue I was discussing before. We don't have a filesystem implementation for bookkeeper, so we can't use the filesystem instance passed. About the reader and the writer, I was configuring them in the hbase-default configuration file:

{noformat}
<property>
  <name>hbase.regionserver.hlog.reader.impl</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.BookKeeperLogReader</value>
  <description>The HLog file reader implementation.</description>
</property>
<property>
  <name>hbase.regionserver.hlog.writer.impl</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.BookKeeperLogWriter</value>
  <description>The HLog file writer implementation.</description>
</property>
{noformat}

I assumed previously that HLog was instantiated elsewhere.

> BookKeeper for write-ahead logging
> --
>
> Key: HBASE-2315
> URL: https://issues.apache.org/jira/browse/HBASE-2315
> Project: HBase
> Issue Type: New Feature
> Components: regionserver
> Reporter: Flavio Junqueira
> Attachments: HBASE-2315.patch, bookkeeperOverview.pdf, zookeeper-dev-bookkeeper.jar
>
>
> BookKeeper, a contrib of the ZooKeeper project, is a fault tolerant and high
> throughput write-ahead logging service. This issue provides an implementation
> of write-ahead logging for hbase using BookKeeper. Apart from expected
> throughput improvements, BookKeeper also has stronger durability guarantees
> compared to the implementation currently used by hbase.
[jira] [Commented] (HBASE-6362) Enhance test-patch.sh script to recognize images / non-trunk patches
[ https://issues.apache.org/jira/browse/HBASE-6362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412063#comment-13412063 ] stack commented on HBASE-6362: -- I'm with Andrew. The barrier to contribution should be as low as possible. The argument above for requiring versioning doesn't fly given hadoopqa picks up the latest whatever the name (Regards hadoopqa not posting back if no compile, lets fix that) > Enhance test-patch.sh script to recognize images / non-trunk patches > > > Key: HBASE-6362 > URL: https://issues.apache.org/jira/browse/HBASE-6362 > Project: HBase > Issue Type: Bug >Reporter: Zhihong Ted Yu > > When user uploads logs / images / non-trunk patches, Hadoop QA would complain > that the file couldn't be applied as a patch (for trunk). > We should make this script smarter by recognizing image files and non-trunk > patches. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6338) Cache Method in RPC handler
[ https://issues.apache.org/jira/browse/HBASE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412061#comment-13412061 ]

stack commented on HBASE-6338:
--

Why javadoc a private method (especially a method named getMethod that returns a Method)?

> Cache Method in RPC handler
> ---
>
> Key: HBASE-6338
> URL: https://issues.apache.org/jira/browse/HBASE-6338
> Project: HBase
> Issue Type: Improvement
> Reporter: binlijin
> Attachments: HBASE-6338-90.patch, HBASE-6338-92.patch, HBASE-6338-94.patch, HBASE-6338-trunk.patch
>
>
> For every call in the RPC handler, a Method object is created. Caching the
> Method improves this a little.
> I tested with 0.90: on average, Class.getMethod(String name, Class...
> parameterTypes) costs 4780 ns; with the cache, the lookup costs 2620 ns.
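
For readers unfamiliar with the idea in this issue, the following is a minimal sketch of caching reflective `Method` lookups. It is not taken from any of the attached patches; the class name `MethodCacheSketch`, the string cache key, and the choice of `ConcurrentHashMap` are assumptions made for illustration. `Class.getMethod` returns a fresh `Method` copy on each call, so the cache returns the same instance on repeated lookups.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class MethodCacheSketch {

    // Key: declaring class name + method name + parameter types (hypothetical scheme).
    private static final ConcurrentMap<String, Method> CACHE = new ConcurrentHashMap<>();

    /** Looks up a public method, reusing a cached Method when one exists. */
    public static Method getMethod(Class<?> protocol, String name, Class<?>... parameterTypes) {
        String key = protocol.getName() + "#" + name + Arrays.toString(parameterTypes);
        Method cached = CACHE.get(key);
        if (cached != null) {
            return cached;
        }
        try {
            Method found = protocol.getMethod(name, parameterTypes);
            // If another thread won the race, keep its instance.
            Method previous = CACHE.putIfAbsent(key, found);
            return previous != null ? previous : found;
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + key, e);
        }
    }

    public static void main(String[] args) {
        Method first = getMethod(String.class, "length");
        Method second = getMethod(String.class, "length");
        System.out.println("same instance: " + (first == second));
    }
}
```

Keying by class name keeps the sketch short but would conflate classes of the same name loaded by different class loaders; a real implementation would need to decide whether that matters for its protocols.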
[jira] [Commented] (HBASE-6362) Enhance test-patch.sh script to recognize images / non-trunk patches
[ https://issues.apache.org/jira/browse/HBASE-6362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412057#comment-13412057 ] Zhihong Ted Yu commented on HBASE-6362: --- See https://issues.apache.org/jira/browse/HBASE-5151?focusedCommentId=13412056&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13412056 for one reason we need versioning in patch filenames. When versioning is in play, the above regex wouldn't pick up all the patches. > Enhance test-patch.sh script to recognize images / non-trunk patches > > > Key: HBASE-6362 > URL: https://issues.apache.org/jira/browse/HBASE-6362 > Project: HBase > Issue Type: Bug >Reporter: Zhihong Ted Yu > > When user uploads logs / images / non-trunk patches, Hadoop QA would complain > that the file couldn't be applied as a patch (for trunk). > We should make this script smarter by recognizing image files and non-trunk > patches. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5151) Rename "hbase.skip.errors" in HRegion as it is too general-sounding.
[ https://issues.apache.org/jira/browse/HBASE-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412056#comment-13412056 ]

Zhihong Ted Yu commented on HBASE-5151:
---

It turns out that the first patch was syntactically correct. Harsh added something in patch v2 that doesn't compile.
Currently, Hadoop QA doesn't post back when there is a compilation error.
However, Stack wasn't aware of this and integrated patch v2.
This is another reason we need versioning in patch filenames, so that such mistakes can be more easily avoided.

> Rename "hbase.skip.errors" in HRegion as it is too general-sounding.
>
> Key: HBASE-5151
> URL: https://issues.apache.org/jira/browse/HBASE-5151
> Project: HBase
> Issue Type: Sub-task
> Components: documentation
> Affects Versions: 0.94.0
> Reporter: Harsh J
> Assignee: Harsh J
> Fix For: 0.96.0
> Attachments: HBASE-5151.amend.patch, HBASE-5151.amend.wrapped.patch, HBASE-5151.patch, HBASE-5151.patch, HBASE-5151.patch
>
> We should rename "hbase.skip.errors", used in HRegion.java for skipping errors when replaying edits. It should probably be something more like "hbase.hregion.edits.replay.skip.errors" or so.
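One way a renamed key like the one proposed in HBASE-5151 might be read is sketched below. This is an assumption-laden illustration, not the committed patch: the thread only proposes the new name "or so" and does not say whether the old key stays supported, so the fallback to the old key, the warning text, and the use of java.util.Properties as a stand-in for the Hadoop Configuration are all hypothetical.

```java
import java.util.Properties;

/** Hypothetical sketch of reading a renamed setting with a fallback to the old name. */
final class RenamedKeySketch {
    static final String OLD_KEY = "hbase.skip.errors";
    // Name proposed in the issue description; the final name may differ.
    static final String NEW_KEY = "hbase.hregion.edits.replay.skip.errors";

    /** Prefers the new key, then the old key, then the supplied default. */
    static boolean skipErrors(Properties conf, boolean defaultValue) {
        String value = conf.getProperty(NEW_KEY);
        if (value == null) {
            value = conf.getProperty(OLD_KEY);
            if (value != null) {
                // Assumed behaviour: warn so operators migrate to the new key.
                System.err.println("'" + OLD_KEY + "' is deprecated; use '" + NEW_KEY + "'");
            }
        }
        return value == null ? defaultValue : Boolean.parseBoolean(value.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(OLD_KEY, "true");
        System.out.println("old key only: " + skipErrors(conf, false));
        conf.setProperty(NEW_KEY, "false");
        System.out.println("both keys set: " + skipErrors(conf, false));
    }
}
```

Letting the new key win when both are set keeps behaviour predictable during a migration.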
[jira] [Commented] (HBASE-6368) Upgrade Guava for critical performance bug fix
[ https://issues.apache.org/jira/browse/HBASE-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412044#comment-13412044 ] Hudson commented on HBASE-6368: --- Integrated in HBase-TRUNK #3119 (See [https://builds.apache.org/job/HBase-TRUNK/3119/]) HBASE-6368 Upgrade Guava for critical performance bug fix (Revision 1360386) Result = FAILURE tedyu : Files : * /hbase/trunk/pom.xml > Upgrade Guava for critical performance bug fix > -- > > Key: HBASE-6368 > URL: https://issues.apache.org/jira/browse/HBASE-6368 > Project: HBase > Issue Type: Bug >Affects Versions: 0.96.0 >Reporter: Zhihong Ted Yu >Assignee: Zhihong Ted Yu >Priority: Critical > Attachments: 6368-trunk.txt > > > The bug is http://code.google.com/p/guava-libraries/issues/detail?id=1055 > See discussion under 'Upgrade to Guava 12.0.1: Performance bug in > CacheBuilder/LoadingCache fixed!' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5151) Rename "hbase.skip.errors" in HRegion as it is too general-sounding.
[ https://issues.apache.org/jira/browse/HBASE-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412043#comment-13412043 ] Hudson commented on HBASE-5151: --- Integrated in HBase-TRUNK #3119 (See [https://builds.apache.org/job/HBase-TRUNK/3119/]) HBASE-5151 Rename hbase.skip.errors in HRegion as it is too general-sounding (Revision 1360384) Result = FAILURE stack : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java > Rename "hbase.skip.errors" in HRegion as it is too general-sounding. > > > Key: HBASE-5151 > URL: https://issues.apache.org/jira/browse/HBASE-5151 > Project: HBase > Issue Type: Sub-task > Components: documentation >Affects Versions: 0.94.0 >Reporter: Harsh J >Assignee: Harsh J > Fix For: 0.96.0 > > Attachments: HBASE-5151.amend.patch, HBASE-5151.amend.wrapped.patch, > HBASE-5151.patch, HBASE-5151.patch, HBASE-5151.patch > > > We should rename "hbase.skip.errors", used in HRegion.java for skipping > errors when replaying edits. It should probably be something more like > "hbase.hregion.edits.replay.skip.errors" or so. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception
[ https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412041#comment-13412041 ]

stack commented on HBASE-5883:
--

@Jieshan So, what do we need to do to close this issue out? What do we need to apply? Thanks.

> Backup master is going down due to connection refused exception
> ---
>
> Key: HBASE-5883
> URL: https://issues.apache.org/jira/browse/HBASE-5883
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.90.6, 0.92.1, 0.94.0
> Reporter: Gopinathan A
> Assignee: Jieshan Bean
> Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.2
> Attachments: 90-addendum.patch, 92-addendum.patch, 94-addendum.patch, HBASE-5883-90.patch, HBASE-5883-92.patch, HBASE-5883-94.patch, HBASE-5883-trunk.patch, trunk-addendum.patch
>
> The active master node's network was down for some time (this node hosts Master, DN, ZK, and RS). The backup node got the notification and started to become active. Immediately, the backup node aborted with the exception below.
> {noformat}
> 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: finished splitting (more than or equal to) 861248320 bytes in 4 log files in [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting] in 26374ms
> 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: []
> 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
> java.io.IOException: java.net.ConnectException: Connection refused
>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
>     at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
>     at $Proxy13.getProtocolVersion(Unknown Source)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
>     at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
>     at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
>     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
>     ... 20 more
> 2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads
> {noformat}
[jira] [Updated] (HBASE-5883) Backup master is going down due to connection refused exception
[ https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gregory Chanan updated HBASE-5883:
--
    Fix Version/s: 0.92.2
                   0.90.7

Adding 0.92.2 and 0.90.7 to the fix version, as this was originally checked in under those versions. I'm also unclear what needs to be done to get this resolved, but it should be done for 0.90.7 and 0.92.2 as well.

> Backup master is going down due to connection refused exception
> ---
>
> Key: HBASE-5883
> URL: https://issues.apache.org/jira/browse/HBASE-5883
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.90.6, 0.92.1, 0.94.0
> Reporter: Gopinathan A
> Assignee: Jieshan Bean
> Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.2
> Attachments: 90-addendum.patch, 92-addendum.patch, 94-addendum.patch, HBASE-5883-90.patch, HBASE-5883-92.patch, HBASE-5883-94.patch, HBASE-5883-trunk.patch, trunk-addendum.patch
>
> The active master node's network was down for some time (this node hosts Master, DN, ZK, and RS). The backup node got the notification and started to become active. Immediately, the backup node aborted with the exception below.
> {noformat}
> 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: finished splitting (more than or equal to) 861248320 bytes in 4 log files in [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting] in 26374ms
> 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: []
> 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
> java.io.IOException: java.net.ConnectException: Connection refused
>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
>     at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
>     at $Proxy13.getProtocolVersion(Unknown Source)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
>     at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
>     at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
>     at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
>     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
>     ... 20 more
> 2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads
> {noformat}
[jira] [Commented] (HBASE-6368) Upgrade Guava for critical performance bug fix
[ https://issues.apache.org/jira/browse/HBASE-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412026#comment-13412026 ] Zhihong Ted Yu commented on HBASE-6368: --- The following step passed Hadoop QA: {code} == Checking against hadoop 2.0 build == {code} > Upgrade Guava for critical performance bug fix > -- > > Key: HBASE-6368 > URL: https://issues.apache.org/jira/browse/HBASE-6368 > Project: HBase > Issue Type: Bug >Affects Versions: 0.96.0 >Reporter: Zhihong Ted Yu >Assignee: Zhihong Ted Yu >Priority: Critical > Attachments: 6368-trunk.txt > > > The bug is http://code.google.com/p/guava-libraries/issues/detail?id=1055 > See discussion under 'Upgrade to Guava 12.0.1: Performance bug in > CacheBuilder/LoadingCache fixed!' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6368) Upgrade Guava for critical performance bug fix
[ https://issues.apache.org/jira/browse/HBASE-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412025#comment-13412025 ] stack commented on HBASE-6368: -- Will this undo the work done over in HBASE-5955? Does this break our compiling against hadoop-2.0.x? > Upgrade Guava for critical performance bug fix > -- > > Key: HBASE-6368 > URL: https://issues.apache.org/jira/browse/HBASE-6368 > Project: HBase > Issue Type: Bug >Affects Versions: 0.96.0 >Reporter: Zhihong Ted Yu >Assignee: Zhihong Ted Yu >Priority: Critical > Attachments: 6368-trunk.txt > > > The bug is http://code.google.com/p/guava-libraries/issues/detail?id=1055 > See discussion under 'Upgrade to Guava 12.0.1: Performance bug in > CacheBuilder/LoadingCache fixed!' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6374) [89-fb] Unify the multi-put/get/delete path so there is only one call to each RS, instead of one call per region
[ https://issues.apache.org/jira/browse/HBASE-6374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Ted Yu updated HBASE-6374:
--
    Fix Version/s: 0.89-fb
          Summary: [89-fb] Unify the multi-put/get/delete path so there is only one call to each RS, instead of one call per region (was: integrate the multi-put/get/delete path so there is only one call to each RS, instead of one call per R)

> [89-fb] Unify the multi-put/get/delete path so there is only one call to each RS, instead of one call per region
>
> Key: HBASE-6374
> URL: https://issues.apache.org/jira/browse/HBASE-6374
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 0.89-fb
> Reporter: Amitanand Aiyer
> Assignee: Amitanand Aiyer
> Priority: Minor
> Fix For: 0.89-fb
>
> This is a feature similar to the batch feature in trunk.
> We have an optimisation for the put path, where we batch puts by regionserver, but for gets and deletes we batch only per HRegion. So, if there are 20 regions on a regionserver, we make 20 RPCs when we could potentially batch them together in 1 call.
[jira] [Created] (HBASE-6374) integrate the multi-put/get/delete path so there is only one call to each RS, instead of one call per R
Amitanand Aiyer created HBASE-6374:
--

             Summary: integrate the multi-put/get/delete path so there is only one call to each RS, instead of one call per R
                 Key: HBASE-6374
                 URL: https://issues.apache.org/jira/browse/HBASE-6374
             Project: HBase
          Issue Type: Improvement
    Affects Versions: 0.89-fb
            Reporter: Amitanand Aiyer
            Assignee: Amitanand Aiyer
            Priority: Minor

This is a feature similar to the batch feature in trunk.

We have an optimisation for the put path, where we batch puts by regionserver, but for gets and deletes we batch only per HRegion. So, if there are 20 regions on a regionserver, we make 20 RPCs when we could potentially batch them together in 1 call.
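The difference between per-region and per-regionserver batching described in HBASE-6374 can be illustrated with a small grouping sketch. Everything here is hypothetical: the Op class, the server and region names, and the two helpers are stand-ins invented for the example and are not the 0.89-fb client API. The number of groups stands for the number of RPCs that would be issued.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: counts RPC batches when grouping by region vs. by regionserver. */
final class BatchByServerSketch {
    /** Stand-in for a Get or Delete that has already been routed to a region on a server. */
    static final class Op {
        final String server;
        final String region;
        final String row;

        Op(String server, String region, String row) {
            this.server = server;
            this.region = region;
            this.row = row;
        }
    }

    /** One batch (one RPC) per region: the behaviour described for gets and deletes. */
    static Map<String, List<Op>> groupByRegion(List<Op> ops) {
        Map<String, List<Op>> batches = new LinkedHashMap<>();
        for (Op op : ops) {
            batches.computeIfAbsent(op.server + "/" + op.region, k -> new ArrayList<>()).add(op);
        }
        return batches;
    }

    /** One batch (one RPC) per regionserver: the behaviour the issue asks for. */
    static Map<String, List<Op>> groupByServer(List<Op> ops) {
        Map<String, List<Op>> batches = new LinkedHashMap<>();
        for (Op op : ops) {
            batches.computeIfAbsent(op.server, k -> new ArrayList<>()).add(op);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Op> gets = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            gets.add(new Op("rs1", "region-" + i, "row-" + i));
        }
        System.out.println("RPCs when batching per region: " + groupByRegion(gets).size());
        System.out.println("RPCs when batching per server: " + groupByServer(gets).size());
    }
}
```

With 20 regions on one regionserver, as in the issue's example, the first grouping yields 20 batches and the second yields 1.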
[jira] [Commented] (HBASE-6368) Upgrade Guava for critical performance bug fix
[ https://issues.apache.org/jira/browse/HBASE-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412003#comment-13412003 ] Zhihong Ted Yu commented on HBASE-6368: --- Thanks for the reminder. I logged MAPREDUCE-4429 > Upgrade Guava for critical performance bug fix > -- > > Key: HBASE-6368 > URL: https://issues.apache.org/jira/browse/HBASE-6368 > Project: HBase > Issue Type: Bug >Affects Versions: 0.96.0 >Reporter: Zhihong Ted Yu >Assignee: Zhihong Ted Yu >Priority: Critical > Attachments: 6368-trunk.txt > > > The bug is http://code.google.com/p/guava-libraries/issues/detail?id=1055 > See discussion under 'Upgrade to Guava 12.0.1: Performance bug in > CacheBuilder/LoadingCache fixed!' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-2315) BookKeeper for write-ahead logging
[ https://issues.apache.org/jira/browse/HBASE-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411998#comment-13411998 ] Zhihong Ted Yu commented on HBASE-2315: --- @Flavio: Looking at the attached patch: {code} +public void init(FileSystem fs, Path path, Configuration conf){ {code} Parameter fs isn't used. Further, you implemented HLog.Reader and HLog.Writer. I don't see where HLog is constructed. Thanks > BookKeeper for write-ahead logging > -- > > Key: HBASE-2315 > URL: https://issues.apache.org/jira/browse/HBASE-2315 > Project: HBase > Issue Type: New Feature > Components: regionserver >Reporter: Flavio Junqueira > Attachments: HBASE-2315.patch, bookkeeperOverview.pdf, > zookeeper-dev-bookkeeper.jar > > > BookKeeper, a contrib of the ZooKeeper project, is a fault tolerant and high > throughput write-ahead logging service. This issue provides an implementation > of write-ahead logging for hbase using BookKeeper. Apart from expected > throughput improvements, BookKeeper also has stronger durability guarantees > compared to the implementation currently used by hbase. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6368) Upgrade Guava for critical performance bug fix
[ https://issues.apache.org/jira/browse/HBASE-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411990#comment-13411990 ] Lars Hofhansl commented on HBASE-6368: -- How does this mingle with newer versions of Hadoop (0.22+, which have Guava 11.0.2)? > Upgrade Guava for critical performance bug fix > -- > > Key: HBASE-6368 > URL: https://issues.apache.org/jira/browse/HBASE-6368 > Project: HBase > Issue Type: Bug >Affects Versions: 0.96.0 >Reporter: Zhihong Ted Yu >Assignee: Zhihong Ted Yu >Priority: Critical > Attachments: 6368-trunk.txt > > > The bug is http://code.google.com/p/guava-libraries/issues/detail?id=1055 > See discussion under 'Upgrade to Guava 12.0.1: Performance bug in > CacheBuilder/LoadingCache fixed!' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6368) Upgrade Guava for critical performance bug fix
[ https://issues.apache.org/jira/browse/HBASE-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411978#comment-13411978 ] Zhihong Ted Yu commented on HBASE-6368: --- Integrated to trunk. > Upgrade Guava for critical performance bug fix > -- > > Key: HBASE-6368 > URL: https://issues.apache.org/jira/browse/HBASE-6368 > Project: HBase > Issue Type: Bug >Affects Versions: 0.96.0 >Reporter: Zhihong Ted Yu >Assignee: Zhihong Ted Yu >Priority: Critical > Attachments: 6368-trunk.txt > > > The bug is http://code.google.com/p/guava-libraries/issues/detail?id=1055 > See discussion under 'Upgrade to Guava 12.0.1: Performance bug in > CacheBuilder/LoadingCache fixed!' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411979#comment-13411979 ] Elliott Clark commented on HBASE-4050: -- @Luke Yes. We have per region metrics, per replication stream metrics, and per schema metrics. There might be others that I'm missing but those are the ones I have touched. In addition we will probably be implementing our own metrics classes. Right now we have MetricsHistogram which is based on metrics1. > Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411975#comment-13411975 ] Luke Lu commented on HBASE-4050: @Elliott: MetricsBuilder/Collector is only needed for creating metrics dynamically and for implementing new Mutable metrics. Does HBase need to define new metrics at run time? > Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5151) Rename "hbase.skip.errors" in HRegion as it is too general-sounding.
[ https://issues.apache.org/jira/browse/HBASE-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411966#comment-13411966 ] stack commented on HBASE-5151: -- I applied the amendment. Thanks Harsh. > Rename "hbase.skip.errors" in HRegion as it is too general-sounding. > > > Key: HBASE-5151 > URL: https://issues.apache.org/jira/browse/HBASE-5151 > Project: HBase > Issue Type: Sub-task > Components: documentation >Affects Versions: 0.94.0 >Reporter: Harsh J >Assignee: Harsh J > Fix For: 0.96.0 > > Attachments: HBASE-5151.amend.patch, HBASE-5151.amend.wrapped.patch, > HBASE-5151.patch, HBASE-5151.patch, HBASE-5151.patch > > > We should rename "hbase.skip.errors", used in HRegion.java for skipping > errors when replaying edits. It should probably be something more like > "hbase.hregion.edits.replay.skip.errors" or so. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411963#comment-13411963 ] Elliott Clark commented on HBASE-4050: -- @Jonathan Thanks. @Luke Lu Unfortunately we need to shim a little bit more since the actual MetricsBuilder has also changed. I'm trying to have something ready to show in a little bit. > Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5151) Rename "hbase.skip.errors" in HRegion as it is too general-sounding.
[ https://issues.apache.org/jira/browse/HBASE-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411959#comment-13411959 ] stack commented on HBASE-5151: -- @Harsh No need to apologize. Thanks for fast turn around. Applying the amendment. > Rename "hbase.skip.errors" in HRegion as it is too general-sounding. > > > Key: HBASE-5151 > URL: https://issues.apache.org/jira/browse/HBASE-5151 > Project: HBase > Issue Type: Sub-task > Components: documentation >Affects Versions: 0.94.0 >Reporter: Harsh J >Assignee: Harsh J > Fix For: 0.96.0 > > Attachments: HBASE-5151.amend.patch, HBASE-5151.amend.wrapped.patch, > HBASE-5151.patch, HBASE-5151.patch, HBASE-5151.patch > > > We should rename "hbase.skip.errors", used in HRegion.java for skipping > errors when replaying edits. It should probably be something more like > "hbase.hregion.edits.replay.skip.errors" or so. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6373) Add more context information to audit log messages
[ https://issues.apache.org/jira/browse/HBASE-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated HBASE-6373: -- Attachment: accesscontroller.patch Updated patch (empty string instead of null if remote address not available). > Add more context information to audit log messages > -- > > Key: HBASE-6373 > URL: https://issues.apache.org/jira/browse/HBASE-6373 > Project: HBase > Issue Type: Improvement > Components: security >Reporter: Marcelo Vanzin >Priority: Minor > Attachments: accesscontroller.patch, accesscontroller.patch > > > The attached patch adds more information to the audit log messages; namely, > it includes the IP address where the request originated, if it's available. > The patch is against trunk, but I've tested it against the 0.92 branch. I > didn't find any unit test for this code, please let me know if I missed > something. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
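The null handling mentioned in the HBASE-6373 update (an empty string instead of null when the remote address is unavailable) might look roughly like the sketch below. The message layout, the class name AuditMessageSketch, and both helper methods are assumptions for illustration only; the actual patch to the access controller is not shown in this thread.

```java
import java.net.InetAddress;

/** Hypothetical sketch of an audit line that includes the caller's address when known. */
final class AuditMessageSketch {
    /** Returns the textual IP address, or an empty string when no address is available. */
    static String remoteAddressOrEmpty(InetAddress remote) {
        return remote == null ? "" : remote.getHostAddress();
    }

    /** Assumed message layout; the real audit log format may differ. */
    static String auditLine(String user, String action, InetAddress remote) {
        return "user=" + user
            + "; remote address=" + remoteAddressOrEmpty(remote)
            + "; action=" + action;
    }

    public static void main(String[] args) {
        System.out.println(auditLine("alice", "scan", InetAddress.getLoopbackAddress()));
        System.out.println(auditLine("alice", "scan", null));
    }
}
```

Using an empty string keeps the literal text "null" out of the log when a request has no associated connection.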
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411952#comment-13411952 ] Luke Lu commented on HBASE-4050: +1 on Alex's option 2 and ServiceLoader. For HBase we only need to implement shims for metrics sources, so interface for registry and *Mutable* classes would suffice. > Update HBase metrics framework to metrics2 framework > > > Key: HBASE-4050 > URL: https://issues.apache.org/jira/browse/HBASE-4050 > Project: HBase > Issue Type: New Feature > Components: metrics >Affects Versions: 0.90.4 > Environment: Java 6 >Reporter: Eric Yang >Assignee: Alex Baranau >Priority: Critical > Fix For: 0.96.0 > > Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, > HBASE-4050.patch > > > Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, > and it might get removed in future Hadoop release. Hence, HBase needs to > revise the dependency of MetricsContext to use Metrics2 framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework
[ https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411949#comment-13411949 ]

Jonathan Hsieh commented on HBASE-4050:
---

bq. bq. hbase needs to be recompiled to run against hadoop 2.0 hdfs
bq. When did this happen? In the pom all that changes for the hadoop 2.0 compile are that some dependencies are changed. I just tried on a local machine only and just changing the libs dir I was able to run either version (stand alone only so not really an exhaustive test I know)

Here's one reason: https://issues.apache.org/jira/browse/HBASE-5861?focusedCommentId=13259785&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13259785

Here's another: HDFS-1620/HDFS-2412

> Update HBase metrics framework to metrics2 framework
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
> Issue Type: New Feature
> Components: metrics
> Affects Versions: 0.90.4
> Environment: Java 6
> Reporter: Eric Yang
> Assignee: Alex Baranau
> Priority: Critical
> Fix For: 0.96.0
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, HBASE-4050.patch
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, and it might get removed in future Hadoop release. Hence, HBase needs to revise the dependency of MetricsContext to use Metrics2 framework.