[jira] [Created] (HBASE-6379) [0.90 branch] Backport HBASE-6334 to 0.90

2012-07-11 Thread Gregory Chanan (JIRA)
Gregory Chanan created HBASE-6379:
-

 Summary: [0.90 branch] Backport HBASE-6334 to 0.90
 Key: HBASE-6379
 URL: https://issues.apache.org/jira/browse/HBASE-6379
 Project: HBase
  Issue Type: Task
  Components: test
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Fix For: 0.90.7


See HBASE-6334 for details.

The issue is that HBASE-6334 detects both HBASE-4195 (which should be 
backported to 0.90 -- I'll file another JIRA for that) and HBASE-2856 (which is 
a known issue in 0.90 that won't be fixed because it requires a change to the 
HFile format).  So in 0.90, we need a way to only catch HBASE-4195 failures and 
ignore HBASE-2856 failures.

Luckily, HBASE-4195 only occurs *within* a column family, while HBASE-2856 
occurs *between* column families, so we just need to add a little to the 
backport to differentiate.
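
A minimal sketch of how such a differentiation could look, assuming the test writes the same value to every cell of a row so that any disagreement within one read indicates a non-atomic view. The class and method names are hypothetical and are not taken from HBASE-6334 or its backport:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from HBASE-6334 or its backport.
final class AtomicityFailureSketch {
    enum Kind { NONE, WITHIN_FAMILY, BETWEEN_FAMILIES }

    /** One cell read back from a row: the column family it belongs to and the value seen. */
    static final class Cell {
        final String family;
        final String value;

        Cell(String family, String value) {
            this.family = family;
            this.value = value;
        }
    }

    /**
     * Assumes the writer stores the same value in every cell of a row, so any
     * disagreement among the cells of one read signals a non-atomic view.
     */
    static Kind classify(List<Cell> row) {
        Map<String, String> firstValuePerFamily = new LinkedHashMap<>();
        for (Cell cell : row) {
            String seen = firstValuePerFamily.putIfAbsent(cell.family, cell.value);
            if (seen != null && !seen.equals(cell.value)) {
                // Two cells of the same family disagree (the HBASE-4195-style case).
                return Kind.WITHIN_FAMILY;
            }
        }
        // Each family is internally consistent; check whether families disagree with each other.
        long distinctValues = firstValuePerFamily.values().stream().distinct().count();
        return distinctValues > 1 ? Kind.BETWEEN_FAMILIES : Kind.NONE;
    }
}
```

Under these assumptions, a 0.90 test would fail only on WITHIN_FAMILY and tolerate BETWEEN_FAMILIES.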

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6239) [replication] ReplicationSink uses the ts of the first KV for the other KVs in the same row

2012-07-11 Thread Benoit Sigoure (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412564#comment-13412564
 ] 

Benoit Sigoure commented on HBASE-6239:
---

This means HBase replication will still corrupt timestamps in 0.90.7, which in 
many cases makes replication useless.  Are you sure?

> [replication] ReplicationSink uses the ts of the first KV for the other KVs 
> in the same row
> ---
>
> Key: HBASE-6239
> URL: https://issues.apache.org/jira/browse/HBASE-6239
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6, 0.92.1
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
>Priority: Critical
>  Labels: corruption
> Fix For: 0.92.2, 0.90.8
>
> Attachments: HBASE-6239-0.92-v1.patch
>
>
> ReplicationSink assumes that all the KVs for the same row inside a WALEdit 
> will have the same timestamp, which is not necessarily the case.
> This only affects 0.90 and 0.92 since HBASE-5203 fixes it in 0.94
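
The effect described above can be illustrated with a small sketch. This is not the actual ReplicationSink code; the `Kv` class and both methods are hypothetical stand-ins that only model "reuse the first timestamp per row" versus "keep each KV's own timestamp":

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the problem; not the real ReplicationSink implementation.
final class ReplicationTimestampSketch {
    /** Simplified stand-in for a KeyValue: row, qualifier and timestamp only. */
    static final class Kv {
        final String row;
        final String qualifier;
        final long timestamp;

        Kv(String row, String qualifier, long timestamp) {
            this.row = row;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
        }
    }

    /** Models the reported behaviour: every KV of a row gets the first KV's timestamp. */
    static List<Kv> applyWithFirstTimestamp(List<Kv> edit) {
        Map<String, Long> firstTsPerRow = new HashMap<>();
        List<Kv> applied = new ArrayList<>();
        for (Kv kv : edit) {
            long ts = firstTsPerRow.computeIfAbsent(kv.row, row -> kv.timestamp);
            applied.add(new Kv(kv.row, kv.qualifier, ts));
        }
        return applied;
    }

    /** Models the expected behaviour: each KV keeps its own timestamp. */
    static List<Kv> applyPreservingTimestamps(List<Kv> edit) {
        return new ArrayList<>(edit);
    }
}
```

With two KVs for the same row at timestamps 100 and 200, the first method writes both at 100, while the second keeps 100 and 200.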





[jira] [Updated] (HBASE-6317) Master clean start up and Partially enabled tables make region assignment inconsistent.

2012-07-11 Thread rajeshbabu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rajeshbabu updated HBASE-6317:
--

Attachment: HBASE-6317_94_3.patch

> Master clean start up and Partially enabled tables make region assignment 
> inconsistent.
> ---
>
> Key: HBASE-6317
> URL: https://issues.apache.org/jira/browse/HBASE-6317
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: rajeshbabu
> Fix For: 0.92.2, 0.96.0, 0.94.1
>
> Attachments: HBASE-6317_94.patch, HBASE-6317_94_3.patch
>
>
> If a table is in a partially enabled state (ENABLING), then on HMaster 
> restart we treat it as a clean cluster startup and do a bulk assign.  
> Currently in 0.94, bulk assign does not handle ALREADY_OPENED scenarios, which 
> leads to region assignment problems.  Analysing this further, we found that 
> there is a better way to handle these scenarios.
> {code}
> if (!checkIfRegionBelongsToDisabled(regionInfo)
>     && !checkIfRegionsBelongsToEnabling(regionInfo)) {
>   synchronized (this.regions) {
>     regions.put(regionInfo, regionLocation);
>     addToServers(regionLocation, regionInfo);
>   }
> }
> {code}
> We don't add such regions to the regions map so that the enable table handler 
> can handle them.  But because nothing is added to the regions map, we treat the 
> restart as a clean cluster startup.
> Will come up with a patch tomorrow.
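
A small sketch of the failure mode, assuming (as the description suggests) that an empty regions map is what makes the master conclude it is a clean startup. All names here are hypothetical and the real AssignmentManager logic is more involved:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the situation described above; not actual HBase master code.
final class CleanStartupSketch {
    /** A region that is already deployed on some regionserver. */
    static final class DeployedRegion {
        final String region;
        final String table;
        final String server;

        DeployedRegion(String region, String table, String server) {
            this.region = region;
            this.table = table;
            this.server = server;
        }
    }

    /** Rebuilds the regions map, skipping regions of DISABLED or ENABLING tables. */
    static Map<String, String> rebuildRegions(List<DeployedRegion> deployed,
                                              Set<String> disabledTables,
                                              Set<String> enablingTables) {
        Map<String, String> regions = new HashMap<>();
        for (DeployedRegion r : deployed) {
            if (!disabledTables.contains(r.table) && !enablingTables.contains(r.table)) {
                regions.put(r.region, r.server);
            }
        }
        return regions;
    }

    /** Assumption for illustration: an empty map is read as "nothing was deployed". */
    static boolean looksLikeCleanStartup(Map<String, String> regions) {
        return regions.isEmpty();
    }
}
```

If the only deployed regions belong to an ENABLING table, the map stays empty and the restart is misread as a clean startup, even though regions are already open.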





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412550#comment-13412550
 ] 

Hadoop QA commented on HBASE-4050:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536174/HBASE-4050-2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified tests.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 7 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 12 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2367//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2367//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2367//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2367//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2367//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2367//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2367//console

This message is automatically generated.

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Updated] (HBASE-6370) Add compression codec test at HMaster when createTable/modifyColumn/modifyTable

2012-07-11 Thread ShiXing (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ShiXing updated HBASE-6370:
---

Attachment: HBASE-6370-trunk-V2.patch

Yes, I think a configuration option is more acceptable for heterogeneous 
environments between the master and regionservers.

I set the configuration hbase.master.check.compression to default to true.

> Add compression codec test at HMaster when 
> createTable/modifyColumn/modifyTable
> ---
>
> Key: HBASE-6370
> URL: https://issues.apache.org/jira/browse/HBASE-6370
> Project: HBase
>  Issue Type: Improvement
>Reporter: ShiXing
>Assignee: ShiXing
>Priority: Minor
> Attachments: HBASE-6370-trunk-V1.patch, HBASE-6370-trunk-V2.patch
>
>
> We deployed a cluster in which none of the regionservers supports a given 
> compression codec, such as "lzo", but the cluster user/client does not know 
> this and specifies the family's compression codec with 
> HColumnDescriptor.setCompressionType(Compression.Algorithm.LZO);
> Because HBaseAdmin's createTable is async, the client waits forever for all 
> the regions of the table to come online, and does not know why the regions 
> are not online until the HBase administrator finds the problem.
> Indeed, all of the regions are being assigned by the master, but the 
> regionserver's openHRegion always fails.
> In my opinion, we can suppose that the environment is the same across the 
> cluster, meaning that if a lib is deployed on the master, it should also be 
> deployed on the regionservers. 
> Of course this is just an assumption; in a real deployment, the HBase DBA may 
> deploy the lib only on the regionservers or only on the master.
> So I think this failure can be detected earlier, before the master creates the 
> CreateTableHandler thread, and we can quickly tell the client that we don't 
> support this compression codec type.
> I will upload the patch later.
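
The fail-fast idea can be sketched roughly as follows. This is an illustration only: the class, the set of loadable codecs and the boolean switch are hypothetical stand-ins for whatever the patch actually does on the master:

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of a master-side pre-check; not the code from the attached patch.
final class CompressionCheckSketch {
    private final boolean checkEnabled;       // stands in for a configuration switch
    private final Set<String> loadableCodecs; // codecs this process can load, lower-case

    CompressionCheckSketch(boolean checkEnabled, Set<String> loadableCodecs) {
        this.checkEnabled = checkEnabled;
        this.loadableCodecs = loadableCodecs;
    }

    /** Rejects the request up front instead of letting region opens fail later. */
    void checkFamilies(String... familyCodecs) {
        if (!checkEnabled) {
            return; // heterogeneous deployments can switch the check off
        }
        for (String codec : familyCodecs) {
            String name = codec.toLowerCase(Locale.ROOT);
            if (!name.equals("none") && !loadableCodecs.contains(name)) {
                throw new IllegalArgumentException(
                    "Compression codec not available on master: " + codec);
            }
        }
    }
}
```

Calling `checkFamilies("LZO")` on an instance that only knows "gz" throws immediately, so the client gets an error instead of waiting for regions that never come online.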





[jira] [Commented] (HBASE-5498) Secure Bulk Load

2012-07-11 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412542#comment-13412542
 ] 

Francis Liu commented on HBASE-5498:


Hi Laxman,

Looks right to me apart from the delegation token. You need to pass an hdfs 
delegation token because we'd like to impersonate the user when changing 
permissions on hdfs. Also the path doesn't need to be the full URI.

Getting the token should be something like this:

FileSystem fs = FileSystem.get(conf);
Token<?> token = fs.getDelegationToken("renewer");

Let me know how things go.

-Francis




> Secure Bulk Load
> 
>
> Key: HBASE-5498
> URL: https://issues.apache.org/jira/browse/HBASE-5498
> Project: HBase
>  Issue Type: Improvement
>  Components: mapred, security
>Reporter: Francis Liu
>Assignee: Francis Liu
> Fix For: 0.96.0
>
> Attachments: HBASE-5498_draft.patch
>
>
> Design doc: 
> https://cwiki.apache.org/confluence/display/HCATALOG/HBase+Secure+Bulk+Load
> Short summary:
> Security as it stands does not cover the bulkLoadHFiles() feature. Users 
> calling this method will bypass ACLs. Also loading is made more cumbersome in 
> a secure setting because of hdfs privileges. bulkLoadHFiles() moves the data 
> from user's directory to the hbase directory, which would require certain 
> write access privileges set.
> Our solution is to create a coprocessor which makes use of AuthManager to 
> verify that a user has write access to the table. If so, it launches an MR job 
> as the hbase user to do the importing (i.e. rewrite from text to hfiles). One 
> tricky thing this job will have to do is impersonate the calling user when 
> reading the input files. We can do this by expecting the user to pass an hdfs 
> delegation token as part of the secureBulkLoad() coprocessor call and extending 
> an inputformat to make use of that token. The output is written to a 
> temporary directory accessible only by hbase and then bulkLoadHFiles() is 
> called.





[jira] [Commented] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction

2012-07-11 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412538#comment-13412538
 ] 

Lars Hofhansl commented on HBASE-6377:
--

Perhaps we can have "Get" and "Update" metrics. "Updates" would include Put, 
Delete, ICV, etc.
But maybe that would require more discussion, so short term (0.94.1 at least), 
we could remove the Put/Delete/Get metrics as you suggest.


> HBASE-5533 metrics miss all operations submitted via MultiAction
> 
>
> Key: HBASE-6377
> URL: https://issues.apache.org/jira/browse/HBASE-6377
> Project: HBase
>  Issue Type: Bug
>  Components: metrics, regionserver
>Affects Versions: 0.96.0, 0.94.1
>Reporter: Andrew Purtell
>
> A client application (LoadTestTool) calls put() on HTables. Internally to the 
> HBase client those puts are batched into MultiActions. The total number of 
> put operations shown in the RegionServer's put metrics histogram never 
> increases from 0 even though millions of such operations are made. Needless 
> to say, the latency of those operations is not measured either. The value of 
> the HBASE-5533 metrics is suspect given the client will batch put and delete 
> ops like this.
> I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction 
> processing in HRegionServer would distinguish between puts and deletes and 
> dispatch them separately. It was easy to account for the time for them. Now 
> both puts and deletes are submitted in batch together as mutations.
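
One way to picture the accounting problem is a sketch that counts operations per type while walking a mixed batch, rather than relying on the single-operation call paths. The names below are hypothetical and this is not the HRegionServer code:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of per-type counting inside a mixed batch; not HBase code.
final class BatchMetricsSketch {
    enum OpType { PUT, DELETE }

    final AtomicLong putCount = new AtomicLong();
    final AtomicLong deleteCount = new AtomicLong();

    /** Counts every mutation in the batch by type, so batched puts are no longer invisible. */
    void recordBatch(List<OpType> batch) {
        for (OpType op : batch) {
            if (op == OpType.PUT) {
                putCount.incrementAndGet();
            } else {
                deleteCount.incrementAndGet();
            }
        }
    }
}
```

This recovers operation counts, but not per-type latency: once puts and deletes are applied together as one batch, only the time for the whole batch is observable, which is the difficulty the description points out.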





[jira] [Updated] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-4050:
-

Attachment: HBASE-4050-2.patch

After code comments.

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Comment Edited] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction

2012-07-11 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412519#comment-13412519
 ] 

Andrew Purtell edited comment on HBASE-6377 at 7/12/12 5:10 AM:


Perhaps not a full revert, but the "put" and "delete" metrics are not useful in 
a basic LoadTestTool test scenario, so consider dropping those. The distinction 
is increasingly only valid client side. Perhaps also remove the "get" one as 
well so we're not in effect special casing a metric only into HRI.get().

Edit: But even after the above, the histograms remain for FS level ops, so 
there's benefit to a partial revert only.

  was (Author: apurtell):
Perhaps not a full revert, but the "put" and "delete" metrics are not 
useful in a basic LoadTestTool test scenario, so consider dropping those. The 
distinction is increasingly only valid client side. Perhaps also remove the 
"get" one as well so we're not in effect special casing a metric only into 
HRI.get().
  
> HBASE-5533 metrics miss all operations submitted via MultiAction
> 
>
> Key: HBASE-6377
> URL: https://issues.apache.org/jira/browse/HBASE-6377
> Project: HBase
>  Issue Type: Bug
>  Components: metrics, regionserver
>Affects Versions: 0.96.0, 0.94.1
>Reporter: Andrew Purtell
>
> A client application (LoadTestTool) calls put() on HTables. Internally to the 
> HBase client those puts are batched into MultiActions. The total number of 
> put operations shown in the RegionServer's put metrics histogram never 
> increases from 0 even though millions of such operations are made. Needless 
> to say, the latency of those operations is not measured either. The value of 
> the HBASE-5533 metrics is suspect given the client will batch put and delete 
> ops like this.
> I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction 
> processing in HRegionServer would distinguish between puts and deletes and 
> dispatch them separately. It was easy to account for the time for them. Now 
> both puts and deletes are submitted in batch together as mutations.





[jira] [Commented] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction

2012-07-11 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412519#comment-13412519
 ] 

Andrew Purtell commented on HBASE-6377:
---

Perhaps not a full revert, but the "put" and "delete" metrics are not useful in 
a basic LoadTestTool test scenario, so consider dropping those. The distinction 
is increasingly only valid client side. Perhaps also remove the "get" one as 
well so we're not in effect special casing a metric only into HRI.get().

> HBASE-5533 metrics miss all operations submitted via MultiAction
> 
>
> Key: HBASE-6377
> URL: https://issues.apache.org/jira/browse/HBASE-6377
> Project: HBase
>  Issue Type: Bug
>  Components: metrics, regionserver
>Affects Versions: 0.96.0, 0.94.1
>Reporter: Andrew Purtell
>
> A client application (LoadTestTool) calls put() on HTables. Internally to the 
> HBase client those puts are batched into MultiActions. The total number of 
> put operations shown in the RegionServer's put metrics histogram never 
> increases from 0 even though millions of such operations are made. Needless 
> to say, the latency of those operations is not measured either. The value of 
> the HBASE-5533 metrics is suspect given the client will batch put and delete 
> ops like this.
> I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction 
> processing in HRegionServer would distinguish between puts and deletes and 
> dispatch them separately. It was easy to account for the time for them. Now 
> both puts and deletes are submitted in batch together as mutations.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412515#comment-13412515
 ] 

Hadoop QA commented on HBASE-4050:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536167/HBASE-4050-1.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified tests.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 7 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 11 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2366//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2366//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2366//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2366//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2366//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2366//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2366//console

This message is automatically generated.

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412513#comment-13412513
 ] 

Elliott Clark commented on HBASE-4050:
--

bq. ResourceFinder seems to have a few nice properties but do we actually 
require them?
Not 100%. We can get around ServiceLoader's shortcomings by having the 
ServiceLoader create factories that take in arguments and pass them to a 
constructor; however, it would be cleaner.

bq. why did you pick this particular implementation of ResourceFinder from 
xbeans-3.7
A friend sent me to that exact link, saying they were using it.  I can pull a 
newer one. 
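
The ServiceLoader workaround mentioned above could look roughly like this. `java.util.ServiceLoader` can only instantiate providers through a no-argument constructor, so the loaded provider is a factory that then forwards arguments to a real constructor. The interfaces and the fallback below are hypothetical and not part of the patch:

```java
import java.util.Iterator;
import java.util.ServiceLoader;

// Hypothetical interfaces for illustration; the patch's real types may differ.
interface SketchSource {
    String describe();
}

/** Loaded by ServiceLoader via its no-arg constructor; forwards arguments to the real object. */
interface SketchSourceFactory {
    SketchSource create(String name, String description);
}

final class ServiceLoaderFactorySketch {
    static SketchSource load(String name, String description) {
        Iterator<SketchSourceFactory> factories =
            ServiceLoader.load(SketchSourceFactory.class).iterator();
        if (factories.hasNext()) {
            // A provider registered under META-INF/services supplies the implementation.
            return factories.next().create(name, description);
        }
        // No provider registered: fall back to a trivial default for this sketch.
        return () -> name + ": " + description + " (default)";
    }
}
```

The cost is one extra factory type per service, which may be the extra indirection the comment refers to.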

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412509#comment-13412509
 ] 

Elliott Clark commented on HBASE-4050:
--

bq. ReplicationMetricsSource javadoc is to be filled.
bq. And some catch clauses have boilerplate code

Agreed. I'll get a patch up soon.

bq. I thought author name shouldn't appear in the file header:
I was trying to keep the source as close to the original as possible.  I'm open 
for whatever; I was just trying to make sure that the people who wrote it got 
credit.

bq. Consider using uppercase M in the string below
Hadoop's metrics2 uses all lowercase for context. 
https://github.com/apache/hadoop-common/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java#L68

bq. Do we need to check that delta is non-negative ?
Nope.  Hadoop doesn't so I followed suit.

bq. Maybe give the assembly file a more descriptive name ?
Sure

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-6336) Split point should not be equal with start row or end row

2012-07-11 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412504#comment-13412504
 ] 

ramkrishna.s.vasudevan commented on HBASE-6336:
---

@Stack
bq. If so, why we write a flush file if no KVs?
Yes, we are writing an empty file now.  In the case of compaction we create an 
empty file.  Once the region is split we compact, so an empty file is 
created for an empty region.
See HBASE-6059 - Replaying recovered edits would make deleted data exist again
There I had a concern about creating an empty store file, but it was needed.  Do 
you see any problem there, Stack?

> Split point should not be equal with start row or end row
> -
>
> Key: HBASE-6336
> URL: https://issues.apache.org/jira/browse/HBASE-6336
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: HBASE-6336.patch
>
>
> Should we allow a split point equal to the region's start row or end row?
> {code}
> // if the midkey is the same as the first and last keys, then we cannot
> // (ever) split this region.
> if (this.comparator.compareRows(mk, firstKey) == 0 &&
>     this.comparator.compareRows(mk, lastKey) == 0) {
>   if (LOG.isDebugEnabled()) {
>     LOG.debug("cannot split because midkey is the same as first or " +
>       "last row");
>   }
> {code}
> Here, I think it is a mistake.
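
One reading of the concern is that the check should reject a midkey that equals either boundary row, not only one that equals both. A minimal sketch under that assumption, with hypothetical names and a hand-written unsigned byte comparison standing in for the real comparator:

```java
// Hypothetical sketch of a stricter split-point check; not the code from the attached patch.
final class SplitPointCheckSketch {
    /** Lexicographic comparison that treats bytes as unsigned values. */
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** A midkey is usable only if it differs from both the first and the last row. */
    static boolean isUsableSplitPoint(byte[] midKey, byte[] firstKey, byte[] lastKey) {
        return compareRows(midKey, firstKey) != 0
            && compareRows(midKey, lastKey) != 0;
    }
}
```

With a midkey equal to the first row but different from the last row, the quoted `&&` condition does not trigger, whereas this check reports the midkey as unusable.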





[jira] [Resolved] (HBASE-5798) NPE running hbck on 0.94 out of reportTablesInFlux

2012-07-11 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John resolved HBASE-5798.
---

Resolution: Duplicate

The issue with NPE is fixed as part of HBASE-5928.

> NPE running hbck on 0.94 out of reportTablesInFlux
> --
>
> Key: HBASE-5798
> URL: https://issues.apache.org/jira/browse/HBASE-5798
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.94.0, 0.96.0
>Reporter: stack
>Assignee: Anoop Sam John
> Attachments: HBASE-5798_94.patch, HBASE-5798_trunk.patch
>
>
> Got this playing w/ hbck going against the 0.94RC:
> {code}
> 12/04/16 17:03:14 INFO util.HBaseFsck: getHTableDescriptors == tableNames => []
> Exception in thread "main" java.lang.NullPointerException
>     at org.apache.hadoop.hbase.util.HBaseFsck.reportTablesInFlux(HBaseFsck.java:553)
>     at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:344)
>     at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:380)
>     at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3033)
> {code}





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412503#comment-13412503
 ] 

Luke Lu commented on HBASE-4050:


ResourceFinder seems to have a few nice properties, but do we actually *require* 
them? It's > 1KLOC, 1/3 of the whole patch. According to a comment in 
http://goo.gl/LmsXp it doesn't support comments etc. in service definitions and 
doesn't offer a perceivable performance improvement over ServiceLoader. 
Also, why did you pick this particular implementation of ResourceFinder from 
xbeans-3.7 (the current release is 3.11.1)?

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-5516) GZip leading to memory leak in 0.90. Fix similar to HBASE-5387 needed for 0.90.

2012-07-11 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412500#comment-13412500
 ] 

ramkrishna.s.vasudevan commented on HBASE-5516:
---

@Jon
I'm not currently working on 0.90, so I may not find time for that.  But could 
you take a look at the patch? In our 0.90 cluster, while using GZip compression, 
we frequently saw memory leaks, and they were caused by the GZip streams.
Thanks Jon.

> GZip leading to memory leak in 0.90.  Fix similar to HBASE-5387 needed for 
> 0.90.
> 
>
> Key: HBASE-5516
> URL: https://issues.apache.org/jira/browse/HBASE-5516
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.7
>
> Attachments: HBASE-5516_2_0.90.patch, HBASE-5516_3_0.90.patch
>
>
> Usage of GZip is leading to resident memory leak in 0.90.
> We need to have something similar to HBASE-5387 in 0.90. 





[jira] [Commented] (HBASE-6378) the javadoc of setEnabledTable maybe not describe accurately

2012-07-11 Thread zhou wenjian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412499#comment-13412499
 ] 

zhou wenjian commented on HBASE-6378:
-

In 0.90 and 0.92 this function deleted the node in ZK, but has that changed since 0.94?

The javadoc puzzled me, because I found that the node in ZK still exists after 
table creation is done.

> the javadoc of  setEnabledTable maybe not describe accurately 
> --
>
> Key: HBASE-6378
> URL: https://issues.apache.org/jira/browse/HBASE-6378
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.0
>Reporter: zhou wenjian
> Fix For: 0.94.1
>
>
>   /**
>* Sets the ENABLED state in the cache and deletes the zookeeper node. Fails
>* silently if the node is not in enabled in zookeeper
>* 
>* @param tableName
>* @throws KeeperException
>*/
>   public void setEnabledTable(final String tableName) throws KeeperException {
> setTableState(tableName, TableState.ENABLED);
>   }
> When setEnabledTable is called, it updates the cache and the zookeeper 
> node, rather than deleting the zk node.





[jira] [Commented] (HBASE-6375) Master may be using a stale list of region servers for creating assignment plan during startup

2012-07-11 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412496#comment-13412496
 ] 

Zhihong Ted Yu commented on HBASE-6375:
---

I ran the test above and it passed:
{code}
Running org.apache.hadoop.hbase.master.TestHMasterRPCException
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.295 sec

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

[INFO] 
[INFO] --- maven-surefire-plugin:2.10:test (secondPartTestsExecution) @ 
hbase-server ---
[INFO] Tests are skipped.
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 20.332s
{code}

Will wait for one day for further comments.

> Master may be using a stale list of region servers for creating assignment 
> plan during startup
> --
>
> Key: HBASE-6375
> URL: https://issues.apache.org/jira/browse/HBASE-6375
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
> Environment: All
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
> Fix For: 0.96.0
>
> Attachments: HBASE-6375_trunk.patch
>
>
> While investigating an Out of Memory issue, I had an interesting observation 
> where the master tries to assign all regions to a single region server even 
> though 7 others had already registered with it.
> As the cluster had MSLAB enabled, this resulted in OOM on the RS when it 
> tried to open all of them.
> *From master's log (edited for brevity):*
> {quote}
> 55,468 Waiting on regionserver(s) to checkin
> 56,968 Waiting on regionserver(s) to checkin
> 58,468 Waiting on regionserver(s) to checkin
> 59,968 Waiting on regionserver(s) to checkin
> 01,242 Registering server=srv109.datacenter,60020,1338673920529,regionCount=0,userLoad=false
> 01,469 Waiting on regionserver(s) count to settle; currently=1
> 02,969 Finished waiting for regionserver count to settle; count=1,sleptFor=46500
> 02,969 Exiting wait on regionserver(s) to checkin; count=1, stopped=false,count of regions out on cluster=0
> 03,010 Processing region \-ROOT\-,,0.70236052 in state M_ZK_REGION_OFFLINE
> 03,220 \-ROOT\- assigned=0, rit=true, location=srv109.datacenter:60020
> 03,221 Processing region .META.,,1.1028785192 in state M_ZK_REGION_OFFLINE
> 03,336 Detected completed assignment of META, notifying catalog tracker
> 03,350 .META. assigned=0, rit=true, location=srv109.datacenter:60020
> 03,350 Master startup proceeding: cluster startup
> 04,006 Registering server=srv111.datacenter,60020,1338673923399,regionCount=0,userLoad=false
> 04,012 Registering server=srv113.datacenter,60020,1338673923532,regionCount=0,userLoad=false
> 04,269 Registering server=srv115.datacenter,60020,1338673923471,regionCount=0,userLoad=false
> 04,363 Registering server=srv117.datacenter,60020,1338673923928,regionCount=0,userLoad=false
> 04,599 Registering server=srv127.datacenter,60020,1338673924067,regionCount=0,userLoad=false
> 04,606 Registering server=srv119.datacenter,60020,1338673923953,regionCount=0,userLoad=false
> 04,804 Registering server=srv129.datacenter,60020,1338673924339,regionCount=0,userLoad=false
> 05,126 Bulk assigning 1252 region(s) across 1 server(s), retainAssignment=true
> 05,546 hd109.datacenter,60020,1338673920529 unassigned znodes=207 of
> {quote}
> *A peek at the AssignmentManager code offers some explanation:*
> {code}
>   public void assignAllUserRegions() throws IOException, InterruptedException {
>     // Get all available servers
>     List<ServerName> servers = serverManager.getOnlineServersList();
>     // Scan META for all user regions, skipping any disabled tables
>     Map<HRegionInfo, ServerName> allRegions =
>       MetaReader.fullScan(catalogTracker, this.zkTable.getDisabledTables(), true);
>     if (allRegions == null || allRegions.isEmpty()) return;
>     // Determine what type of assignment to do on startup
>     boolean retainAssignment = master.getConfiguration().
>       getBoolean("hbase.master.startup.retainassign", true);
>     Map<ServerName, List<HRegionInfo>> bulkPlan = null;
>     if (retainAssignment) {
>       // Reuse existing assignment info
>       bulkPlan = LoadBalancer.retainAssignment(allRegions, servers);
>     } else {
>       // assign regions in round-robin fashion
>       bulkPlan = LoadBalancer.roundRobinAssignment(
>         new ArrayList<HRegionInfo>(allRegions.keySet()), servers);
>     }
>     LOG.info("Bulk assigning " + allRegions.size() + " region(s) across " +
>       servers.size() + " server(s), retainAssignment=" + retainAssignment);
>     ...
> {code}
> In the function assignAllUserRegions(), listed above, AM fetches the server 
> list from ServerManager long before

[jira] [Commented] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction

2012-07-11 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412492#comment-13412492
 ] 

Lars Hofhansl commented on HBASE-6377:
--

I'd be supportive of reverting HBASE-5533 until we work out a clear strategy of 
what we're measuring and when.


> HBASE-5533 metrics miss all operations submitted via MultiAction
> 
>
> Key: HBASE-6377
> URL: https://issues.apache.org/jira/browse/HBASE-6377
> Project: HBase
>  Issue Type: Bug
>  Components: metrics, regionserver
>Affects Versions: 0.96.0, 0.94.1
>Reporter: Andrew Purtell
>
> A client application (LoadTestTool) calls put() on HTables. Internally to the 
> HBase client those puts are batched into MultiActions. The total number of 
> put operations shown in the RegionServer's put metrics histogram never 
> increases from 0 even though millions of such operations are made. Needless 
> to say the latency for those operations is not measured either. The value of 
> HBASE-5533 metrics is suspect given the client will batch put and delete ops 
> like this.
> I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction 
> processing in HRegionServer would distinguish between puts and deletes and 
> dispatch them separately. It was easy to account for the time for them. Now 
> both puts and deletes are submitted in batch together as mutations.





[jira] [Commented] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction

2012-07-11 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412491#comment-13412491
 ] 

Lars Hofhansl commented on HBASE-6377:
--

I "knew" there would be issues somewhere with HBASE-6284 :(
What do you say Andrew, this seems bad enough to delay 0.94.1?

Does a "Put" metric still make sense? Should it be a "Mutation" metric which 
includes Deletes?
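
One way to picture the "Mutation" metric question is a counter that walks a mixed batch and keeps per-type counts alongside a combined total. The following is only a sketch under stated assumptions: BatchOpCounter and OpType are hypothetical names invented for illustration, not HBase classes, and the real metrics code would also need latency accounting and thread safety. It uses Map.merge and getOrDefault, so it assumes Java 8 or later.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not an HBase class.
final class BatchOpCounter {
    // Stand-in for the two mutation kinds discussed in the thread.
    enum OpType { PUT, DELETE }

    private final Map<OpType, Long> counts = new EnumMap<OpType, Long>(OpType.class);

    // Walk a mixed batch once and bump the counter for each operation's type.
    void recordBatch(List<OpType> batch) {
        for (OpType op : batch) {
            counts.merge(op, 1L, Long::sum);
        }
    }

    // Per-type view, e.g. what a "Put" metric would report.
    long count(OpType op) {
        return counts.getOrDefault(op, 0L);
    }

    // Combined view, e.g. what a "Mutation" metric would report.
    long mutations() {
        long total = 0;
        for (long c : counts.values()) {
            total += c;
        }
        return total;
    }

    public static void main(String[] args) {
        BatchOpCounter counter = new BatchOpCounter();
        counter.recordBatch(Arrays.asList(OpType.PUT, OpType.PUT, OpType.DELETE));
        System.out.println("puts=" + counter.count(OpType.PUT)
            + " deletes=" + counter.count(OpType.DELETE)
            + " mutations=" + counter.mutations());
    }
}
```

Counting per type inside the batch loop would keep a "Put" number available even when puts and deletes arrive together, while the total gives the combined "Mutation" figure; which of the two is worth exposing is the open question in this thread.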


> HBASE-5533 metrics miss all operations submitted via MultiAction
> 
>
> Key: HBASE-6377
> URL: https://issues.apache.org/jira/browse/HBASE-6377
> Project: HBase
>  Issue Type: Bug
>  Components: metrics, regionserver
>Affects Versions: 0.96.0, 0.94.1
>Reporter: Andrew Purtell
>
> A client application (LoadTestTool) calls put() on HTables. Internally to the 
> HBase client those puts are batched into MultiActions. The total number of 
> put operations shown in the RegionServer's put metrics histogram never 
> increases from 0 even though millions of such operations are made. Needless 
> to say the latency for those operations is not measured either. The value of 
> HBASE-5533 metrics is suspect given the client will batch put and delete ops 
> like this.
> I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction 
> processing in HRegionServer would distinguish between puts and deletes and 
> dispatch them separately. It was easy to account for the time for them. Now 
> both puts and deletes are submitted in batch together as mutations.





[jira] [Commented] (HBASE-5711) Tests are failing with incorrect data directory permissions.

2012-07-11 Thread Dave Revell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412487#comment-13412487
 ] 

Dave Revell commented on HBASE-5711:


Here's a workaround for people running into permission problems while embedding 
a minicluster.

{noformat}
hbaseTestUtil = new HBaseTestingUtility();

// Workaround for HBASE-5711: we need to set the config value
// dfs.datanode.data.dir.perm equal to the permissions of the temp dirs on
// the filesystem. These temp dirs were probably created using this process'
// umask, so we guess the temp dir permissions as 0777 & ~umask and use that
// to set the config value.
try {
    Process process = Runtime.getRuntime().exec("/bin/sh -c umask");
    BufferedReader br = new BufferedReader(
            new InputStreamReader(process.getInputStream()));
    int rc = process.waitFor();
    if (rc == 0) {
        String umask = br.readLine();

        int umaskBits = Integer.parseInt(umask, 8);
        int permBits = 0777 & ~umaskBits;
        String perms = Integer.toString(permBits, 8);

        log.info("Setting dfs.datanode.data.dir.perm to " + perms);

        hbaseTestUtil.getConfiguration().set("dfs.datanode.data.dir.perm", perms);
    } else {
        log.warn("Failed running umask command in a shell, nonzero return value");
    }
} catch (Exception e) {
    // Ignore errors; we might not be running on POSIX, or "sh" might not be
    // on the path.
    log.warn("Couldn't get umask", e);
}

hbaseTestUtil.startMiniCluster();
{noformat}
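
The permission arithmetic in the workaround can be checked on its own, without starting a shell or a minicluster. This is a minimal sketch, assuming a hypothetical helper class UmaskPerms (not part of HBase or Hadoop) that takes the octal umask string the shell would print:

```java
// Hypothetical helper for illustration; not part of HBase or Hadoop.
final class UmaskPerms {
    private UmaskPerms() {}

    // Converts an octal umask string (as printed by `umask`) into the octal
    // permission string for a directory created under that umask.
    static String dirPermsFor(String umaskOctal) {
        int umaskBits = Integer.parseInt(umaskOctal.trim(), 8);
        int permBits = 0777 & ~umaskBits;   // clear the bits the umask masks out
        return Integer.toString(permBits, 8);
    }

    public static void main(String[] args) {
        System.out.println(dirPermsFor("0022")); // 755
        System.out.println(dirPermsFor("0077")); // 700
    }
}
```

A umask of 0022 yields 755, which is rwxr-xr-x, the value the DataNode log in this issue reports as expected. A stricter umask such as 0077 yields 700, and that is the case where the config override matters.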

> Tests are failing with incorrect data directory permissions.
> 
>
> Key: HBASE-5711
> URL: https://issues.apache.org/jira/browse/HBASE-5711
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
> Fix For: 0.92.3
>
> Attachments: HBASE-5711.patch
>
>
> When we run some tests in HBase (TestAdmin), they fail with the following 
> error.
> {quote}
> Starting DataNode 0 with dfs.data.dir: 
> E:\Repositories\Hbase\target\test-data\5ff23198-892e-4f1c-8022-b3d9969fcf0b\dfscluster_0ecc6984-1925-4870-ac7c-439fceede4cb\dfs\data\data1,E:\Repositories\Hbase\target\test-data\5ff23198-892e-4f1c-8022-b3d9969fcf0b\dfscluster_0ecc6984-1925-4870-ac7c-439fceede4cb\dfs\data\data2
> 2012-04-04 18:04:51,036 WARN  [main] impl.MetricsSystemImpl(137): Metrics 
> system not started: Cannot locate configuration: tried 
> hadoop-metrics2-datanode.properties, hadoop-metrics2.properties
> 2012-04-04 18:04:51,255 WARN  [main] datanode.DataNode(1548): Invalid 
> directory in dfs.data.dir: Incorrect permission for 
> E:/Repositories/Hbase/target/test-data/5ff23198-892e-4f1c-8022-b3d9969fcf0b/dfscluster_0ecc6984-1925-4870-ac7c-439fceede4cb/dfs/data/data1,
>  expected: rwxr-xr-x, while actual: rwx--
> 2012-04-04 18:04:51,411 WARN  [main] datanode.DataNode(1548): Invalid 
> directory in dfs.data.dir: Incorrect permission for 
> E:/Repositories/Hbase/target/test-data/5ff23198-892e-4f1c-8022-b3d9969fcf0b/dfscluster_0ecc6984-1925-4870-ac7c-439fceede4cb/dfs/data/data2,
>  expected: rwxr-xr-x, while actual: rwx--
> 2012-04-04 18:04:51,411 ERROR [main] datanode.DataNode(1554): All directories 
> in dfs.data.dir are invalid.
> 2012-04-04 18:04:51,411 INFO  [main] hbase.HBaseTestingUtility(684): Shutting 
> down minicluster
> 2012-04-04 18:04:51,646 WARN  [main] hbase.HBaseTestingUtility(696): Failed 
> delete of 
> E:\Repositories\Hbase\target\test-data\5ff23198-892e-4f1c-8022-b3d9969fcf0b\dfscluster_0ecc6984-1925-4870-ac7c-439fceede4cb
> 2012-04-04 18:04:51,646 INFO  [main] hbase.HBaseTestingUtility(700): 
> Minicluster is down
> {quote}





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412484#comment-13412484
 ] 

Zhihong Ted Yu commented on HBASE-4050:
---

Amazing work !

ReplicationMetricsSource javadoc is to be filled.
And some catch clauses have boilerplate code:
{code}
+  } catch (IOException e) {
+e.printStackTrace();  //To change body of catch statement use File | Settings | File Templates.
{code}
I thought author name shouldn't appear in the file header:
{code}
+ * author David Blevins
+ * version $Rev$ $Date$
{code}
Consider using uppercase M in the string below:
{code}
+  private static final String METRICS_CONTEXT = "replicationmetrics";
{code}
Do we need to check that delta is non-negative ?
{code}
+gaugeInt.decr(delta);
{code}
Maybe give the assembly file a more descriptive name ?
{code}
+src/assembly/two.xml
{code}
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceMetrics.java
 is removed.
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/metrics2/ReplicationSourceMetrics.java
 is added.
What about metrics1 ?
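
On the question above of whether delta should be checked before gaugeInt.decr(delta): one option is to reject negative deltas at the gauge itself. The sketch below is an assumption-laden illustration, with GuardedGauge as a hypothetical stand-in class rather than the gauge type used in the patch or in Hadoop metrics2.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a gauge; not the metrics2 class from the patch.
final class GuardedGauge {
    private final AtomicInteger value = new AtomicInteger();

    // Raise the gauge by a non-negative amount.
    void incr(int delta) {
        value.addAndGet(requireNonNegative(delta));
    }

    // Lower the gauge by a non-negative amount; a negative delta is rejected
    // instead of silently turning the decrement into an increment.
    void decr(int delta) {
        value.addAndGet(-requireNonNegative(delta));
    }

    int get() {
        return value.get();
    }

    private static int requireNonNegative(int delta) {
        if (delta < 0) {
            throw new IllegalArgumentException("delta must be non-negative: " + delta);
        }
        return delta;
    }
}
```

Whether to throw, clamp, or just log is a design choice; throwing makes a caller bug visible early, but may be too strict for a metrics path that should never fail the operation being measured.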

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Updated] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-4050:
-

Attachment: HBASE-4050-1.patch

I missed one part in ReplicationSourceMetrics.  This should fix the tests.

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Created] (HBASE-6378) the javadoc of setEnabledTable maybe not describe accurately

2012-07-11 Thread zhou wenjian (JIRA)
zhou wenjian created HBASE-6378:
---

 Summary: the javadoc of  setEnabledTable maybe not describe 
accurately 
 Key: HBASE-6378
 URL: https://issues.apache.org/jira/browse/HBASE-6378
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0
Reporter: zhou wenjian
 Fix For: 0.94.1


  /**
   * Sets the ENABLED state in the cache and deletes the zookeeper node. Fails
   * silently if the node is not in enabled in zookeeper
   * 
   * @param tableName
   * @throws KeeperException
   */
  public void setEnabledTable(final String tableName) throws KeeperException {
setTableState(tableName, TableState.ENABLED);
  }

When setEnabledTable is called, it updates the cache and the zookeeper 
node, rather than deleting the zk node.






[jira] [Commented] (HBASE-5997) Fix concerns raised in HBASE-5922 related to HalfStoreFileReader

2012-07-11 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412480#comment-13412480
 ] 

Anoop Sam John commented on HBASE-5997:
---

bq. Is it possible if the file is empty say that we'll seek on every invocation 
of getFirstKey?
Yes, as the check is against firstKey being not null. Maybe we can have a 
boolean-variable-based check.
As this is a rare case and the half file normally won't exist for long 
either, I thought it may be okay. What do you say, Stack? If you feel so, I can 
change this.

bq.This patch does not do you your compare of row only rather than compare of 
full key. Is it supposed to?
Yes. You can see that the comparison now is against the first key in the file 
rather than the split key
{code}
-  if (getComparator().compare(key, offset, length, splitkey, 0,
-  splitkey.length) < 0) {
+  byte[] fk = getFirstKey();
+  // This will be null when the file is empty in which we can not seekBefore to any key
+  if (fk == null) return false;
+  if (getComparator().compare(key, offset, length, fk, 0,
+  fk.length) <= 0) {
{code}
So it is okay to have the full-key-based compare [not the row key alone].
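
The guard in the diff above can be sketched in isolation. This is a simplified illustration under several assumptions: SeekBeforeGuard is a hypothetical class, keys are compared as plain unsigned byte strings instead of with the HBase key comparator, and the branch guarded by the `<= 0` comparison is assumed to return false (the diff shows only the condition).

```java
// Hypothetical illustration; the real code uses the HFile key comparator.
final class SeekBeforeGuard {
    private SeekBeforeGuard() {}

    // Lexicographic comparison of two byte arrays, treating bytes as unsigned.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Returns true only if some key in the file could sort before `key`.
    // firstKey == null models an empty file, where no seekBefore can succeed.
    static boolean canSeekBefore(byte[] firstKey, byte[] key) {
        if (firstKey == null) {
            return false;
        }
        // key <= firstKey: nothing in the file sorts strictly before key.
        return compareUnsigned(key, firstKey) > 0;
    }
}
```

Comparing against the first key of the file rather than the split key means the check reflects what is actually stored, and the null test covers the empty-file case discussed above.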

> Fix concerns raised in HBASE-5922 related to HalfStoreFileReader
> 
>
> Key: HBASE-5997
> URL: https://issues.apache.org/jira/browse/HBASE-5997
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: Anoop Sam John
> Fix For: 0.94.2
>
> Attachments: HBASE-5997_0.94.patch, HBASE-5997_94 V2.patch, 
> Testcase.patch.txt
>
>
> Pls refer to the comment
> https://issues.apache.org/jira/browse/HBASE-5922?focusedCommentId=13269346&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13269346.
> Raised this issue to solve that comment. Just incase we don't forget it.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412479#comment-13412479
 ] 

Hadoop QA commented on HBASE-4050:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536162/HBASE-4050-0.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 1 new or modified tests.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 7 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 11 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.replication.TestMultiSlaveReplication
  org.apache.hadoop.hbase.replication.TestMasterReplication

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2365//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2365//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2365//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2365//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2365//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2365//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2365//console

This message is automatically generated.

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-6284) Introduce HRegion#doMiniBatchMutation()

2012-07-11 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412476#comment-13412476
 ] 

Anoop Sam John commented on HBASE-6284:
---

I mean public methods in HRegion can be called from coprocessors on the RS side.


> Introduce HRegion#doMiniBatchMutation()
> ---
>
> Key: HBASE-6284
> URL: https://issues.apache.org/jira/browse/HBASE-6284
> Project: HBase
>  Issue Type: Bug
>  Components: performance, regionserver
>Reporter: Zhihong Ted Yu
>Assignee: Anoop Sam John
> Fix For: 0.96.0, 0.94.1
>
> Attachments: 6284_Trunk-Addendum.patch, 6284_Trunk-V3.patch, 
> HBASE-6284_94.patch, HBASE-6284_Trunk-V2.patch, HBASE-6284_Trunk-V3.patch, 
> HBASE-6284_Trunk.patch
>
>
> From Anoop under thread 'Can there be a doMiniBatchDelete in HRegion':
> The HTable#delete(List<Delete>) groups the Deletes for the same RS and makes 
> only one n/w call. But within the RS, there will be N delete calls 
> on the region, one by one. This will include N HLog writes and syncs. 
> If these can also be grouped, we can get better performance for multi-row 
> deletes.
> I have made the new miniBatchDelete() and made 
> HTable#delete(List<Delete>) call this new batch delete.
> Just tested initially with a one-node cluster.  Even there I am getting 
> a very promising performance boost.
> Only one CF and qualifier.
> 10K total rows deleted in batches of 100 deletes. Only deletes happening on 
> the table, from one thread.
> With the new way the net time taken is reduced by more than 1/10
> Will test in a 4 node cluster also. I think it will be worth doing this change.





[jira] [Commented] (HBASE-6375) Master may be using a stale list of region servers for creating assignment plan during startup

2012-07-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412469#comment-13412469
 ] 

Hadoop QA commented on HBASE-6375:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12536140/HBASE-6375_trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 5 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 8 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestHMasterRPCException

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2364//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2364//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2364//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2364//console

This message is automatically generated.

> Master may be using a stale list of region servers for creating assignment 
> plan during startup
> --
>
> Key: HBASE-6375
> URL: https://issues.apache.org/jira/browse/HBASE-6375
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
> Environment: All
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
> Fix For: 0.96.0
>
> Attachments: HBASE-6375_trunk.patch
>
>
> While investigating an Out of Memory issue, I had an interesting observation 
> where the master tries to assign all regions to a single region server even 
> though 7 others had already registered with it.
> As the cluster had MSLAB enabled, this resulted in OOM on the RS when it 
> tried to open all of them.
> *From master's log (edited for brevity):*
> {quote}
> 55,468 Waiting on regionserver(s) to checkin
> 56,968 Waiting on regionserver(s) to checkin
> 58,468 Waiting on regionserver(s) to checkin
> 59,968 Waiting on regionserver(s) to checkin
> 01,242 Registering server=srv109.datacenter,60020,1338673920529,regionCount=0,userLoad=false
> 01,469 Waiting on regionserver(s) count to settle; currently=1
> 02,969 Finished waiting for regionserver count to settle; count=1,sleptFor=46500
> 02,969 Exiting wait on regionserver(s) to checkin; count=1, stopped=false,count of regions out on cluster=0
> 03,010 Processing region \-ROOT\-,,0.70236052 in state M_ZK_REGION_OFFLINE
> 03,220 \-ROOT\- assigned=0, rit=true, location=srv109.datacenter:60020
> 03,221 Processing region .META.,,1.1028785192 in state M_ZK_REGION_OFFLINE
> 03,336 Detected completed assignment of META, notifying catalog tracker
> 03,350 .META. assigned=0, rit=true, location=srv109.datacenter:60020
> 03,350 Master startup proceeding: cluster startup
> 04,006 Registering server=srv111.datacenter,60020,1338673923399,regionCount=0,userLoad=false
> 04,012 Registering server=srv113.datacenter,60020,1338673923532,regionCount=0,userLoad=false
> 04,269 Registering server=srv115.datacenter,60020,1338673923471,regionCount=0,userLoad=false
> 04,363 Registering server=srv117.datacenter,60020,1338673923928,regionCount=0,userLoad=false
> 04,599 Registering server=srv127.datacenter,60020,1338673924067,regionCount=0,userLoad=false
> 04,606 Registering server=srv119.datacenter,60020,1338673923953,regionCount=0,userLoad=false
> 04,804 Registering server=srv129.datacenter,60020,1338673924339,regionCount=0,userLoad=false
> 05,126 Bulk assigning 1252 region(s) across 1 server(s), retainAssignment=true
> 05,546 hd109.datacenter,60020,1338673920529 unassigned znodes=207 of
> {quote}
> *A peek at AssignmentManager code offers some explanation:*
> {code}
>   public void assignAllUserRegions() throws IOException, InterruptedException 
> {
> // Get all available servers
> List servers = serverManager.getOnlineServersList();
> // Scan META for all user regions, skipping any disabled tables
> Map allRegio

[jira] [Commented] (HBASE-6272) In-memory region state is inconsistent

2012-07-11 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412464#comment-13412464
 ] 

Jimmy Xiang commented on HBASE-6272:


Patch version 2 was uploaded to RB: https://reviews.apache.org/r/5717/.


> In-memory region state is inconsistent
> --
>
> Key: HBASE-6272
> URL: https://issues.apache.org/jira/browse/HBASE-6272
> Project: HBase
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>
> AssignmentManager stores region state related information in several places: 
> regionsInTransition, regions (region info to server name map), and servers 
> (server name to region info set map).  However the access to these places is 
> not coordinated properly.  It leads to inconsistent in-memory region state 
> information.  Sometimes, some region could even be offline, and not in 
> transition.
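
The coordination problem described above can be sketched in a few lines. This is only an illustration, assuming three structures that mirror regionsInTransition, regions and servers; the class and method names (`RegionStatesSketch`, `regionOnline`, and so on) are hypothetical and are not taken from the HBase code or from the patch on RB. The idea shown is simply that all three structures are updated under one lock, so no reader sees a region that is neither assigned nor in transition halfway through a move.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not the AssignmentManager implementation.
class RegionStatesSketch {
    private final Set<String> regionsInTransition = new HashSet<>();
    private final Map<String, String> regions = new HashMap<>();        // region -> server
    private final Map<String, Set<String>> servers = new HashMap<>();   // server -> regions

    synchronized void startTransition(String region) {
        regionsInTransition.add(region);
    }

    // Marks a region as online on a server, updating all three
    // structures while holding the same lock.
    synchronized void regionOnline(String region, String server) {
        String previous = regions.put(region, server);
        if (previous != null) {
            servers.get(previous).remove(region);
        }
        servers.computeIfAbsent(server, k -> new HashSet<>()).add(region);
        regionsInTransition.remove(region);
    }

    // The inconsistent state mentioned in the issue: offline and not in transition.
    synchronized boolean isOfflineAndNotInTransition(String region) {
        return !regions.containsKey(region) && !regionsInTransition.contains(region);
    }

    synchronized int regionCount(String server) {
        Set<String> hosted = servers.get(server);
        return hosted == null ? 0 : hosted.size();
    }
}
```

A single coarse lock is the simplest way to show the invariant; whether the real fix uses one lock or a finer-grained scheme is not stated in this message.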





[jira] [Commented] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction

2012-07-11 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412461#comment-13412461
 ] 

Andrew Purtell commented on HBASE-6377:
---

Other means of applying puts and deletes (RowMutations etc.) are not covered 
either, nor are Increments or Appends. Perhaps this is an argument for 
reverting HBASE-5533. 

> HBASE-5533 metrics miss all operations submitted via MultiAction
> 
>
> Key: HBASE-6377
> URL: https://issues.apache.org/jira/browse/HBASE-6377
> Project: HBase
>  Issue Type: Bug
>  Components: metrics, regionserver
>Affects Versions: 0.96.0, 0.94.1
>Reporter: Andrew Purtell
>
> A client application (LoadTestTool) calls put() on HTables. Internally to the 
> HBase client those puts are batched into MultiActions. The total number of 
> put operations shown in the RegionServer's put metrics histogram never 
> increases from 0 even though millions of such operations are made. Needless 
> to say the latency for those operations is not measured either. The value of 
> HBASE-5533 metrics is suspect given the client will batch put and delete ops 
> like this.
> I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction 
> processing in HRegionServer would distinguish between puts and deletes and 
> dispatch them separately. It was easy to account for the time for them. Now 
> both puts and deletes are submitted in batch together as mutations.





[jira] [Updated] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction

2012-07-11 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-6377:
--

  Component/s: regionserver
   metrics
Affects Version/s: 0.94.1
   0.96.0

> HBASE-5533 metrics miss all operations submitted via MultiAction
> 
>
> Key: HBASE-6377
> URL: https://issues.apache.org/jira/browse/HBASE-6377
> Project: HBase
>  Issue Type: Bug
>  Components: metrics, regionserver
>Affects Versions: 0.96.0, 0.94.1
>Reporter: Andrew Purtell
>
> A client application (LoadTestTool) calls put() on HTables. Internally to the 
> HBase client those puts are batched into MultiActions. The total number of 
> put operations shown in the RegionServer's put metrics histogram never 
> increases from 0 even though millions of such operations are made. Needless 
> to say the latency for those operations is not measured either. The value of 
> HBASE-5533 metrics is suspect given the client will batch put and delete ops 
> like this.
> I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction 
> processing in HRegionServer would distinguish between puts and deletes and 
> dispatch them separately. It was easy to account for the time for them. Now 
> both puts and deletes are submitted in batch together as mutations.





[jira] [Created] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction

2012-07-11 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-6377:
-

 Summary: HBASE-5533 metrics miss all operations submitted via 
MultiAction
 Key: HBASE-6377
 URL: https://issues.apache.org/jira/browse/HBASE-6377
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell


A client application (LoadTestTool) calls put() on HTables. Internally to the 
HBase client those puts are batched into MultiActions. The total number of put 
operations shown in the RegionServer's put metrics histogram never increases 
from 0 even though millions of such operations are made. Needless to say the 
latency for those operations is not measured either. The value of HBASE-5533 
metrics is suspect given the client will batch put and delete ops like this.

I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction 
processing in HRegionServer would distinguish between puts and deletes and 
dispatch them separately. It was easy to account for the time for them. Now 
both puts and deletes are submitted in batch together as mutations.
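
To make the accounting problem concrete, here is a minimal sketch of counting operations by type inside a mixed batch. The names (`BatchMetricsSketch`, `OpType`, `countByType`) are hypothetical and do not correspond to HBase classes; the sketch only assumes that each mutation in a batch can be classified as a put or a delete, so that per-type counters could still be updated when both arrive together.

```java
import java.util.List;

// Hypothetical sketch; not the HRegionServer batch path.
class BatchMetricsSketch {
    enum OpType { PUT, DELETE }

    // Returns {putCount, deleteCount} for a mixed batch of mutations.
    static long[] countByType(List<OpType> batch) {
        long puts = 0;
        long deletes = 0;
        for (OpType op : batch) {
            if (op == OpType.PUT) {
                puts++;
            } else {
                deletes++;
            }
        }
        return new long[] { puts, deletes };
    }
}
```

Counting is the easy half. How to attribute the elapsed time of a single mixed batch to puts versus deletes is left open here, as it is in the description above.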





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Alex Baranau (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412457#comment-13412457
 ] 

Alex Baranau commented on HBASE-4050:
-

"was about to commit" read as "was about to provide patch" ;)

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Alex Baranau (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412456#comment-13412456
 ] 

Alex Baranau commented on HBASE-4050:
-

Heh, was about to commit example with ServiceLoader (and same extra modules 
based on your schema above), but looks like it makes sense to use 
ResourceFinder. Will use your patch and add metrics sources for RegionServer 
and Master to it (almost empty, as per discussion above) tomorrow and we can 
think about closing the issue (after review, etc.).

Thank you, Elliott!

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Updated] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-4050:
-

Attachment: HBASE-4050-0.patch

Here's a patch that adds hadoop compatibility shims.  I needed something to 
prototype and test with so I used my implementation of HBASE-6323 as an example.

hbase-hadoop-compat contains the factory and the interface.  The factory uses 
ResourceFinder from the geronimo project.  It's much more flexible than 
ServiceLoader (allows different locations easily and most importantly it allows 
constructor arguments).  I didn't want to add the whole geronimo project as a 
dep so the code is copied in.  I tried to give as much credit as I could.  I 
can go back to using ServiceLoader if people object to having 

hbase-hadoop1-compat and hbase-hadoop2-compat add the actual implementation of 
the class whose interface is defined in hbase-hadoop-compat.  I don't have a 
hbase-hadoop23-compat

Right now, depending upon which profile is building, the hbase-server module 
gets one of the above as a dependency.

In addition, when building, assembly files only contain the 
hbase-hadoop{1,2}-compat directory needed.  It's possible to keep the old 
assembly file the way it was and change the shell scripts to only load the one, 
but I didn't get to that.

I tested it in place and locally after building tar.gz's on both 
* hadoop 1.0.3
* hadoop 2.0.0-alpha

In place scripts still work though I'm not really sure of why or how.  I need 
to investigate that later.
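
The point about constructor arguments can be illustrated with a small sketch. `java.util.ServiceLoader` instantiates providers through a public no-argument constructor, so a factory that needs to pass arguments has to locate the class some other way and call the constructor itself. The code below is not ResourceFinder and not the code in the patch; `CompatFactorySketch`, `MetricsSource` and `Hadoop1MetricsSource` are hypothetical names, and the lookup is reduced to plain reflection on a class name.

```java
import java.lang.reflect.Constructor;

// Hypothetical sketch; not the hbase-hadoop-compat factory.
class CompatFactorySketch {
    // Stand-in for an interface defined in a shared compat module.
    interface MetricsSource {
        String describe();
    }

    // Stand-in for an implementation shipped in a version-specific module.
    static class Hadoop1MetricsSource implements MetricsSource {
        private final String name;

        public Hadoop1MetricsSource(String name) {
            this.name = name;
        }

        @Override
        public String describe() {
            return "hadoop1:" + name;
        }
    }

    // Loads the implementation by class name and passes a constructor
    // argument, which a no-arg provider mechanism cannot do directly.
    static MetricsSource create(String implClassName, String name) {
        try {
            Class<?> clazz = Class.forName(implClassName);
            Constructor<?> ctor = clazz.getConstructor(String.class);
            return (MetricsSource) ctor.newInstance(name);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load " + implClassName, e);
        }
    }
}
```

In this sketch the class name is passed in by the caller; in a real build it would presumably come from whichever compat module is on the classpath.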

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-2315) BookKeeper for write-ahead logging

2012-07-11 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412441#comment-13412441
 ] 

Zhihong Ted Yu commented on HBASE-2315:
---

The ctor of HLog takes a FileSystem parameter. Since the FileSystem isn't 
important to bookkeeper, my feeling is that the approach in previous patch 
makes sense.

You can remodel that patch by introducing a new hbase module.

Thanks

> BookKeeper for write-ahead logging
> --
>
> Key: HBASE-2315
> URL: https://issues.apache.org/jira/browse/HBASE-2315
> Project: HBase
>  Issue Type: New Feature
>  Components: regionserver
>Reporter: Flavio Junqueira
> Attachments: HBASE-2315.patch, bookkeeperOverview.pdf, 
> zookeeper-dev-bookkeeper.jar
>
>
> BookKeeper, a contrib of the ZooKeeper project, is a fault tolerant and high 
> throughput write-ahead logging service. This issue provides an implementation 
> of write-ahead logging for hbase using BookKeeper. Apart from expected 
> throughput improvements, BookKeeper also has stronger durability guarantees 
> compared to the implementation currently used by hbase.





[jira] [Commented] (HBASE-6317) Master clean start up and Partially enabled tables make region assignment inconsistent.

2012-07-11 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412437#comment-13412437
 ] 

rajeshbabu commented on HBASE-6317:
---

@Lars
I will upload a patch addressing some of his comments, with a test case for 
it. I will upload it by the afternoon. 

> Master clean start up and Partially enabled tables make region assignment 
> inconsistent.
> ---
>
> Key: HBASE-6317
> URL: https://issues.apache.org/jira/browse/HBASE-6317
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: rajeshbabu
> Fix For: 0.92.2, 0.96.0, 0.94.1
>
> Attachments: HBASE-6317_94.patch
>
>
> If we have a table in partially enabled state (ENABLING) then on HMaster 
> restart we treat it as a clean cluster start up and do a bulk assign.  
> Currently in 0.94 bulk assign will not handle ALREADY_OPENED scenarios and it 
> leads to region assignment problems.  Analysing more on this we found that we 
> have a better way to handle these scenarios.
> {code}
> if (!checkIfRegionBelongsToDisabled(regionInfo)
>     && !checkIfRegionsBelongsToEnabling(regionInfo)) {
>   synchronized (this.regions) {
>     regions.put(regionInfo, regionLocation);
>     addToServers(regionLocation, regionInfo);
>   }
> }
> {code}
> We don't add to the regions map so that the enable table handler can handle it.  But 
> as nothing is added to the regions map, we treat it as a clean cluster start up.
> Will come up with a patch tomorrow.





[jira] [Created] (HBASE-6376) bin/hbase command doesn't seem to be working

2012-07-11 Thread Devaraj Das (JIRA)
Devaraj Das created HBASE-6376:
--

 Summary: bin/hbase command doesn't seem to be working
 Key: HBASE-6376
 URL: https://issues.apache.org/jira/browse/HBASE-6376
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Devaraj Das
Priority: Blocker
 Fix For: 0.96.0


I noticed that commands like "bin/hbase shell" don't work. The exception 
trace is:
{noformat}
bin/hbase shell
Exception in thread "main" java.lang.NoClassDefFoundError: org/jruby/Main
Caused by: java.lang.ClassNotFoundException: org.jruby.Main
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
{noformat}
This is a trunk build (mvn package -DskipTests=true) and then I am trying to 
run the bin/hbase command from the root directory. (Am I missing something?)





[jira] [Updated] (HBASE-6319) ReplicationSource can call terminate on itself and deadlock

2012-07-11 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-6319:
--

Fix Version/s: (was: 0.90.7)
   0.90.8

Bumping to 0.90.8 -- I'm personally not concerned about replication in 0.90.7

> ReplicationSource can call terminate on itself and deadlock
> ---
>
> Key: HBASE-6319
> URL: https://issues.apache.org/jira/browse/HBASE-6319
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6, 0.92.1, 0.94.0
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
> Fix For: 0.92.2, 0.94.2, 0.90.8
>
> Attachments: HBASE-6319-0.92.patch
>
>
> In a few places the ReplicationSource code calls terminate on itself, which 
> is a problem since in terminate() we wait on that thread to die.
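
The deadlock described above can be reproduced in miniature: a thread that joins itself waits forever. The sketch below is not the ReplicationSource code and not the attached patch; `SelfTerminatingWorker` and its methods are hypothetical names, and the guard shown (skip the join when the caller is the worker thread itself) is just one possible way to avoid the self-wait.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; not the ReplicationSource class.
class SelfTerminatingWorker extends Thread {
    private final AtomicBoolean running = new AtomicBoolean(true);
    private final AtomicBoolean returnedFromTerminate = new AtomicBoolean(false);

    @Override
    public void run() {
        // Simulates an error path in which the worker calls
        // terminate() from its own thread.
        terminate();
        returnedFromTerminate.set(true);
    }

    void terminate() {
        running.set(false);
        // Guard: only wait for the thread to die when the caller is a
        // different thread. Without this check the join never returns.
        if (Thread.currentThread() != this) {
            try {
                join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Starts a worker, waits up to timeoutMs, and reports whether it exited.
    static boolean runDemo(long timeoutMs) {
        SelfTerminatingWorker worker = new SelfTerminatingWorker();
        worker.start();
        try {
            worker.join(timeoutMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return worker.returnedFromTerminate.get() && !worker.isAlive();
    }
}
```

Removing the `Thread.currentThread() != this` check makes `runDemo` return false after the timeout, because the worker is stuck in `join()` on itself.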





[jira] [Updated] (HBASE-6239) [replication] ReplicationSink uses the ts of the first KV for the other KVs in the same row

2012-07-11 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-6239:
--

Fix Version/s: (was: 0.90.7)
   0.90.8

Bumping to 0.90.8 -- I'm personally not concerned about replication in 0.90.7

> [replication] ReplicationSink uses the ts of the first KV for the other KVs 
> in the same row
> ---
>
> Key: HBASE-6239
> URL: https://issues.apache.org/jira/browse/HBASE-6239
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6, 0.92.1
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
>Priority: Critical
>  Labels: corruption
> Fix For: 0.92.2, 0.90.8
>
> Attachments: HBASE-6239-0.92-v1.patch
>
>
> ReplicationSink assumes that all the KVs for the same row inside a WALEdit 
> will have the same timestamp, which is not necessarily the case.
> This only affects 0.90 and 0.92 since HBASE-5203 fixes it in 0.94
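
A small sketch may help show what goes wrong. It assumes, as the issue title says, that the sink applies the timestamp of the first KV in a row to the other KVs of that row. `ReplicatedCellSketch` is a hypothetical stand-in for a KeyValue and is not an HBase class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a KeyValue; not the real HBase class.
class ReplicatedCellSketch {
    final String row;
    final String qualifier;
    final long timestamp;

    ReplicatedCellSketch(String row, String qualifier, long timestamp) {
        this.row = row;
        this.qualifier = qualifier;
        this.timestamp = timestamp;
    }

    // Problematic grouping: every cell in a row inherits the timestamp
    // of the first cell seen for that row.
    static List<ReplicatedCellSketch> withFirstTimestamp(List<ReplicatedCellSketch> edit) {
        List<ReplicatedCellSketch> out = new ArrayList<>();
        String currentRow = null;
        long rowTimestamp = 0L;
        for (ReplicatedCellSketch cell : edit) {
            if (!cell.row.equals(currentRow)) {
                currentRow = cell.row;
                rowTimestamp = cell.timestamp;
            }
            out.add(new ReplicatedCellSketch(cell.row, cell.qualifier, rowTimestamp));
        }
        return out;
    }

    // Intended behaviour: each cell keeps the timestamp it was written with.
    static List<ReplicatedCellSketch> withOwnTimestamps(List<ReplicatedCellSketch> edit) {
        List<ReplicatedCellSketch> out = new ArrayList<>();
        for (ReplicatedCellSketch cell : edit) {
            out.add(new ReplicatedCellSketch(cell.row, cell.qualifier, cell.timestamp));
        }
        return out;
    }
}
```

With two cells in row r1 written at timestamps 100 and 200, the first method replicates both at 100, while the second preserves 200 for the later cell.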





[jira] [Updated] (HBASE-6325) [replication] Race in ReplicationSourceManager.init can initiate a failover even if the node is alive

2012-07-11 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-6325:
--

Fix Version/s: (was: 0.90.7)
   0.90.8

Bumping to 0.90.8 -- I'm personally not concerned about replication in 0.90.7.  

> [replication] Race in ReplicationSourceManager.init can initiate a failover 
> even if the node is alive
> -
>
> Key: HBASE-6325
> URL: https://issues.apache.org/jira/browse/HBASE-6325
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6, 0.92.1, 0.94.0
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
> Fix For: 0.92.2, 0.96.0, 0.94.2, 0.90.8
>
> Attachments: HBASE-6325-0.92-v2.patch, HBASE-6325-0.92.patch
>
>
> Yet another bug found during the leap second madness: it's possible to miss 
> the registration of new region servers so that in 
> ReplicationSourceManager.init we start the failover of a live and replicating 
> region server. I don't think there's data loss but the RS that's being failed 
> over will die on:
> {noformat}
> 2012-07-01 06:25:15,604 FATAL 
> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
> sv4r23s48,10304,1341112194623: Writing replication status
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for 
> /hbase/replication/rs/sv4r23s48,10304,1341112194623/4/sv4r23s48%2C10304%2C1341112194623.1341112195369
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:1246)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.setData(RecoverableZooKeeper.java:372)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:655)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.setData(ZKUtil.java:697)
> at 
> org.apache.hadoop.hbase.replication.ReplicationZookeeper.writeReplicationStatus(ReplicationZookeeper.java:470)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:154)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:607)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:368)
> {noformat}
> It seems to me that just refreshing {{otherRegionServers}} after getting the 
> list of {{currentReplicators}} would be enough to fix this.
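
The ordering issue behind the proposed fix can be sketched as follows. This is a simplified model, assuming the race is between two reads: a snapshot of live region servers taken first, and a list of replicators read afterwards. The names (`FailoverCheckSketch`, `findDead`, the simulated `registry`) are hypothetical and do not mirror the ReplicationSourceManager code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the ReplicationSourceManager.init code.
class FailoverCheckSketch {
    // Replicators missing from the live-server snapshot are treated as dead.
    static List<String> findDead(List<String> currentReplicators, Set<String> liveServers) {
        List<String> dead = new ArrayList<>();
        for (String rs : currentReplicators) {
            if (!liveServers.contains(rs)) {
                dead.add(rs);
            }
        }
        return dead;
    }

    // Simulates the race: rs2 registers between the two reads, so the
    // stale snapshot makes a live server look dead.
    static List<String> deadWithStaleSnapshot() {
        Set<String> registry = new HashSet<>();
        registry.add("rs1");
        Set<String> staleLive = new HashSet<>(registry);       // live servers read first
        registry.add("rs2");                                   // rs2 comes up in between
        List<String> replicators = new ArrayList<>(registry);  // replicators read second
        return findDead(replicators, staleLive);
    }

    // Same sequence, but the live-server snapshot is refreshed after
    // the replicator list has been read.
    static List<String> deadWithRefreshedSnapshot() {
        Set<String> registry = new HashSet<>();
        registry.add("rs1");
        registry.add("rs2");
        List<String> replicators = new ArrayList<>(registry);
        Set<String> freshLive = new HashSet<>(registry);
        return findDead(replicators, freshLive);
    }
}
```

In the stale case the live server rs2 is reported as dead and would be failed over; with the refreshed snapshot nothing is reported.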





[jira] [Updated] (HBASE-6347) -ROOT- and .META. are stale in table.jsp if they moved

2012-07-11 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-6347:
--

Labels: noob  (was: )

> -ROOT- and .META. are stale in table.jsp if they moved
> --
>
> Key: HBASE-6347
> URL: https://issues.apache.org/jira/browse/HBASE-6347
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6, 0.92.1, 0.94.0
>Reporter: Jean-Daniel Cryans
>  Labels: noob
> Fix For: 0.92.2, 0.94.2, 0.90.8
>
>
> table.jsp uses a lookup method on {{CatalogTracker}} that does not 
> force a refresh of the cache, thus it can get a stale location if -ROOT- or 
> .META. moved and the master hasn't tried to access them yet.
> Should just be a matter of using waitForRoot/Meta.





[jira] [Updated] (HBASE-6347) -ROOT- and .META. are stale in table.jsp if they moved

2012-07-11 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-6347:
--

Fix Version/s: (was: 0.90.7)
   0.90.8

> -ROOT- and .META. are stale in table.jsp if they moved
> --
>
> Key: HBASE-6347
> URL: https://issues.apache.org/jira/browse/HBASE-6347
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6, 0.92.1, 0.94.0
>Reporter: Jean-Daniel Cryans
>  Labels: noob
> Fix For: 0.92.2, 0.94.2, 0.90.8
>
>
> table.jsp uses a lookup method on {{CatalogTracker}} that does not 
> force a refresh of the cache, thus it can get a stale location if -ROOT- or 
> .META. moved and the master hasn't tried to access them yet.
> Should just be a matter of using waitForRoot/Meta.





[jira] [Commented] (HBASE-6347) -ROOT- and .META. are stale in table.jsp if they moved

2012-07-11 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412396#comment-13412396
 ] 

Jonathan Hsieh commented on HBASE-6347:
---

Seems minor for 0.90.7, bumping.

> -ROOT- and .META. are stale in table.jsp if they moved
> --
>
> Key: HBASE-6347
> URL: https://issues.apache.org/jira/browse/HBASE-6347
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6, 0.92.1, 0.94.0
>Reporter: Jean-Daniel Cryans
>  Labels: noob
> Fix For: 0.92.2, 0.94.2, 0.90.8
>
>
> table.jsp uses a lookup method on {{CatalogTracker}} that does not 
> force a refresh of the cache, thus it can get a stale location if -ROOT- or 
> .META. moved and the master hasn't tried to access them yet.
> Should just be a matter of using waitForRoot/Meta.





[jira] [Resolved] (HBASE-5376) Add more logging to triage HBASE-5312: Closed parent region present in Hlog.lastSeqWritten

2012-07-11 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang resolved HBASE-5376.


Resolution: Later
  Assignee: Jimmy Xiang

Close it for now.  We haven't seen such a problem for quite a long time.

> Add more logging to triage HBASE-5312: Closed parent region present in 
> Hlog.lastSeqWritten
> --
>
> Key: HBASE-5376
> URL: https://issues.apache.org/jira/browse/HBASE-5376
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 0.90.7
>
> Attachments: hbase-5376.txt
>
>
> It is hard to find out what exactly caused HBASE-5312.  Some logging will be 
> helpful to shed some light.





[jira] [Commented] (HBASE-5516) GZip leading to memory leak in 0.90. Fix similar to HBASE-5387 needed for 0.90.

2012-07-11 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412389#comment-13412389
 ] 

Jonathan Hsieh commented on HBASE-5516:
---

Hm.. no tests, going to bump to 0.90.8 unless action taken. 

> GZip leading to memory leak in 0.90.  Fix similar to HBASE-5387 needed for 
> 0.90.
> 
>
> Key: HBASE-5516
> URL: https://issues.apache.org/jira/browse/HBASE-5516
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.7
>
> Attachments: HBASE-5516_2_0.90.patch, HBASE-5516_3_0.90.patch
>
>
> Usage of GZip is leading to resident memory leak in 0.90.
> We need to have something similar to HBASE-5387 in 0.90. 





[jira] [Commented] (HBASE-6317) Master clean start up and Partially enabled tables make region assignment inconsistent.

2012-07-11 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412390#comment-13412390
 ] 

Lars Hofhansl commented on HBASE-6317:
--

@rajeshbabu: could you reply to Stack's comments?
Need to push 0.94.1.

> Master clean start up and Partially enabled tables make region assignment 
> inconsistent.
> ---
>
> Key: HBASE-6317
> URL: https://issues.apache.org/jira/browse/HBASE-6317
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: rajeshbabu
> Fix For: 0.92.2, 0.96.0, 0.94.1
>
> Attachments: HBASE-6317_94.patch
>
>
> If we have a table in partially enabled state (ENABLING) then on HMaster 
> restart we treat it as a clean cluster start up and do a bulk assign.  
> Currently in 0.94 bulk assign will not handle ALREADY_OPENED scenarios and it 
> leads to region assignment problems.  Analysing more on this we found that we 
> have a better way to handle these scenarios.
> {code}
> if (!checkIfRegionBelongsToDisabled(regionInfo)
>     && !checkIfRegionsBelongsToEnabling(regionInfo)) {
>   synchronized (this.regions) {
>     regions.put(regionInfo, regionLocation);
>     addToServers(regionLocation, regionInfo);
>   }
> }
> {code}
> We don't add to the regions map so that the enable table handler can handle it.  But 
> as nothing is added to the regions map, we treat it as a clean cluster start up.
> Will come up with a patch tomorrow.





[jira] [Updated] (HBASE-6375) Master may be using a stale list of region servers for creating assignment plan during startup

2012-07-11 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6375:
--

Summary: Master may be using a stale list of region servers for creating 
assignment plan during startup  (was: Master could possibly be using a stale 
list of region servers for creating assignment plan during startup)

> Master may be using a stale list of region servers for creating assignment 
> plan during startup
> --
>
> Key: HBASE-6375
> URL: https://issues.apache.org/jira/browse/HBASE-6375
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
> Environment: All
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
> Fix For: 0.96.0
>
> Attachments: HBASE-6375_trunk.patch
>
>
> While investigating an Out of Memory issue, I had an interesting observation 
> where the master tries to assign all regions to a single region server even 
> though 7 others had already registered with it.
> As the cluster had MSLAB enabled, this resulted in OOM on the RS when it 
> tried to open all of them.
> *From master's log (edited for brevity):*
> {quote}
> 55,468 Waiting on regionserver(s) to checkin
> 56,968 Waiting on regionserver(s) to checkin
> 58,468 Waiting on regionserver(s) to checkin
> 59,968 Waiting on regionserver(s) to checkin
> 01,242 Registering server=srv109.datacenter,60020,1338673920529,regionCount=0,userLoad=false
> 01,469 Waiting on regionserver(s) count to settle; currently=1
> 02,969 Finished waiting for regionserver count to settle; count=1,sleptFor=46500
> 02,969 Exiting wait on regionserver(s) to checkin; count=1, stopped=false,count of regions out on cluster=0
> 03,010 Processing region \-ROOT\-,,0.70236052 in state M_ZK_REGION_OFFLINE
> 03,220 \-ROOT\- assigned=0, rit=true, location=srv109.datacenter:60020
> 03,221 Processing region .META.,,1.1028785192 in state M_ZK_REGION_OFFLINE
> 03,336 Detected completed assignment of META, notifying catalog tracker
> 03,350 .META. assigned=0, rit=true, location=srv109.datacenter:60020
> 03,350 Master startup proceeding: cluster startup
> 04,006 Registering server=srv111.datacenter,60020,1338673923399,regionCount=0,userLoad=false
> 04,012 Registering server=srv113.datacenter,60020,1338673923532,regionCount=0,userLoad=false
> 04,269 Registering server=srv115.datacenter,60020,1338673923471,regionCount=0,userLoad=false
> 04,363 Registering server=srv117.datacenter,60020,1338673923928,regionCount=0,userLoad=false
> 04,599 Registering server=srv127.datacenter,60020,1338673924067,regionCount=0,userLoad=false
> 04,606 Registering server=srv119.datacenter,60020,1338673923953,regionCount=0,userLoad=false
> 04,804 Registering server=srv129.datacenter,60020,1338673924339,regionCount=0,userLoad=false
> 05,126 Bulk assigning 1252 region(s) across 1 server(s), retainAssignment=true
> 05,546 hd109.datacenter,60020,1338673920529 unassigned znodes=207 of
> {quote}
> *A peek at the AssignmentManager code offers some explanation:*
> {code}
>   public void assignAllUserRegions() throws IOException, InterruptedException {
>     // Get all available servers
>     List servers = serverManager.getOnlineServersList();
>     // Scan META for all user regions, skipping any disabled tables
>     Map allRegions =
>       MetaReader.fullScan(catalogTracker, this.zkTable.getDisabledTables(), true);
>     if (allRegions == null || allRegions.isEmpty()) return;
>     // Determine what type of assignment to do on startup
>     boolean retainAssignment = master.getConfiguration().
>       getBoolean("hbase.master.startup.retainassign", true);
>     Map> bulkPlan = null;
>     if (retainAssignment) {
>       // Reuse existing assignment info
>       bulkPlan = LoadBalancer.retainAssignment(allRegions, servers);
>     } else {
>       // assign regions in round-robin fashion
>       bulkPlan = LoadBalancer.roundRobinAssignment(
>         new ArrayList(allRegions.keySet()), servers);
>     }
>     LOG.info("Bulk assigning " + allRegions.size() + " region(s) across " +
>       servers.size() + " server(s), retainAssignment=" + retainAssignment);
>     ...
> {code}
> In the function assignAllUserRegions(), listed above, the AM fetches the server 
> list from ServerManager long before it actually uses it to create the assignment 
> plan.
> In between, it performs a full scan of META to create an assignment map 
> of regions. So even if additional RSes have registered in the meantime (as 
> happened in this case), the AM still has the old list of just one server.
> This code snippet is from 0.90.6 but the same issue exists in 0.92, 0.94 and 
> trunk. Since MSLAB is enabled by default in 0.92 onwards, any large cluster 
> can hit this issue upon clust

[jira] [Commented] (HBASE-6375) Master could possibly be using a stale list of region servers for creating assignment plan during startup

2012-07-11 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412385#comment-13412385
 ] 

Zhihong Ted Yu commented on HBASE-6375:
---

Interesting discovery.

> Master could possibly be using a stale list of region servers for creating 
> assignment plan during startup
> -
>
> Key: HBASE-6375
> URL: https://issues.apache.org/jira/browse/HBASE-6375
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
> Environment: All
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
> Fix For: 0.96.0
>
> Attachments: HBASE-6375_trunk.patch
>
>

[jira] [Updated] (HBASE-6331) Problem with HBCK mergeOverlaps

2012-07-11 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6331:
-

Fix Version/s: (was: 0.94.1)
   0.94.2

Need to release 0.94.1RC soon, pushing for now.

> Problem with HBCK mergeOverlaps
> ---
>
> Key: HBASE-6331
> URL: https://issues.apache.org/jira/browse/HBASE-6331
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 0.96.0, 0.94.2
>
> Attachments: HBASE-6331_94.patch, HBASE-6331_Trunk.patch
>
>
> In HDFSIntegrityFixer#mergeOverlaps(), there is logic to compute the final 
> range of the region after the overlap.
> I can see one issue with this code:
> {code}
> if (RegionSplitCalculator.BYTES_COMPARATOR
>     .compare(hi.getEndKey(), range.getSecond()) > 0) {
>   range.setSecond(hi.getEndKey());
> }
> {code}
> Suppose the regions include the last region, for which the endKey is 
> empty. The final range should then have an empty byte[] as its end key.
> But under the above logic, any other key compares greater than the 
> empty byte[] and replaces it.
> As a result, the new region created will not get an empty byte[] as its end key.
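
A minimal sketch of one way to handle the empty end key, assuming (as the description states) that an empty byte[] marks the end of the table. The class and method names are hypothetical, and the comparison is a plain unsigned lexicographic compare written out here rather than HBase's own comparator. This is not the attached patch.

```java
/** Hypothetical helper for illustration; not the code from the attached patch. */
class EndKeyMerge {
  /** Unsigned lexicographic comparison of two byte arrays. */
  static int compareBytes(byte[] a, byte[] b) {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) return diff;
    }
    return a.length - b.length;
  }

  /**
   * Returns the larger of two region end keys, treating an empty key as
   * "end of table", i.e. greater than every non-empty key.
   */
  static byte[] maxEndKey(byte[] current, byte[] candidate) {
    if (current.length == 0) return current;     // range is already unbounded
    if (candidate.length == 0) return candidate; // last region: becomes unbounded
    return compareBytes(candidate, current) > 0 ? candidate : current;
  }

  public static void main(String[] args) {
    byte[] merged = maxEndKey(new byte[0], new byte[] {'m'});
    System.out.println("merged end key length: " + merged.length);
  }
}
```

With a plain byte comparison alone, the empty key sorts lowest, which is the behavior the report describes. The two explicit length checks are what keep the merged range open-ended.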





[jira] [Updated] (HBASE-6291) Don't retry increments on an invalid cell

2012-07-11 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-6291:
--

Fix Version/s: (was: 0.90.7)
   0.90.8

bumping from 0.90.7

> Don't retry increments on an invalid cell
> -
>
> Key: HBASE-6291
> URL: https://issues.apache.org/jira/browse/HBASE-6291
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.6, 0.92.1, 0.94.0
>Reporter: Jean-Daniel Cryans
> Fix For: 0.92.2, 0.94.2, 0.90.8
>
>
> This says it all:
> {noformat}
> ERROR: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
> attempts=7, exceptions:
> Thu Jun 28 18:34:44 UTC 2012, 
> org.apache.hadoop.hbase.client.HTable$8@4eabaf8c, java.io.IOException: 
> java.io.IOException: Attempted to increment field that isn't 64 bits wide
> {noformat}
> {{HRegion}} should be modified here to send a DoNotRetryIOException:
> {code}
> if (wrongLength) {
>   throw new DoNotRetryIOException(
>       "Attempted to increment field that isn't 64 bits wide");
> }
> {code}
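
To illustrate why the exception type matters, here is a minimal sketch of a client-side retry loop that gives up immediately on a "do not retry" error. NoRetryIOException and RetryingCaller are hypothetical stand-ins written for this example; they are not the HBase client classes, and the real retry logic may differ.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Stand-in for HBase's DoNotRetryIOException (hypothetical, for illustration). */
class NoRetryIOException extends IOException {
  NoRetryIOException(String msg) { super(msg); }
}

/** Hypothetical retry helper; not the HBase client code. */
class RetryingCaller {
  static <T> T callWithRetries(Callable<T> op, int maxAttempts) throws Exception {
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return op.call();
      } catch (NoRetryIOException e) {
        throw e;  // permanent failure: retrying cannot help
      } catch (IOException e) {
        last = e; // possibly transient: try again
      }
    }
    throw new IOException("Failed after attempts=" + maxAttempts, last);
  }
}
```

Under this sketch, an increment on a cell that is not 64 bits wide would fail on the first attempt instead of exhausting all retries with the same error.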





[jira] [Updated] (HBASE-6375) Master could possibly be using a stale list of region servers for creating assignment plan during startup

2012-07-11 Thread Aditya Kishore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Kishore updated HBASE-6375:
--

Description: 
While investigating an Out of Memory issue, I made an interesting observation: 
the master tried to assign all regions to a single region server even 
though 7 others had already registered with it.

As the cluster had MSLAB enabled, this resulted in an OOM on the RS when it tried 
to open all of them.

*From master's log (edited for brevity):*
{quote}
55,468 Waiting on regionserver(s) to checkin
56,968 Waiting on regionserver(s) to checkin
58,468 Waiting on regionserver(s) to checkin
59,968 Waiting on regionserver(s) to checkin
01,242 Registering server=srv109.datacenter,60020,1338673920529,regionCount=0,userLoad=false
01,469 Waiting on regionserver(s) count to settle; currently=1
02,969 Finished waiting for regionserver count to settle; count=1,sleptFor=46500
02,969 Exiting wait on regionserver(s) to checkin; count=1, stopped=false,count of regions out on cluster=0
03,010 Processing region \-ROOT\-,,0.70236052 in state M_ZK_REGION_OFFLINE
03,220 \-ROOT\- assigned=0, rit=true, location=srv109.datacenter:60020
03,221 Processing region .META.,,1.1028785192 in state M_ZK_REGION_OFFLINE
03,336 Detected completed assignment of META, notifying catalog tracker
03,350 .META. assigned=0, rit=true, location=srv109.datacenter:60020
03,350 Master startup proceeding: cluster startup
04,006 Registering server=srv111.datacenter,60020,1338673923399,regionCount=0,userLoad=false
04,012 Registering server=srv113.datacenter,60020,1338673923532,regionCount=0,userLoad=false
04,269 Registering server=srv115.datacenter,60020,1338673923471,regionCount=0,userLoad=false
04,363 Registering server=srv117.datacenter,60020,1338673923928,regionCount=0,userLoad=false
04,599 Registering server=srv127.datacenter,60020,1338673924067,regionCount=0,userLoad=false
04,606 Registering server=srv119.datacenter,60020,1338673923953,regionCount=0,userLoad=false
04,804 Registering server=srv129.datacenter,60020,1338673924339,regionCount=0,userLoad=false
05,126 Bulk assigning 1252 region(s) across 1 server(s), retainAssignment=true
05,546 hd109.datacenter,60020,1338673920529 unassigned znodes=207 of
{quote}

*A peek at the AssignmentManager code offers some explanation:*
{code}
  public void assignAllUserRegions() throws IOException, InterruptedException {
    // Get all available servers
    List servers = serverManager.getOnlineServersList();

    // Scan META for all user regions, skipping any disabled tables
    Map allRegions =
      MetaReader.fullScan(catalogTracker, this.zkTable.getDisabledTables(), true);
    if (allRegions == null || allRegions.isEmpty()) return;

    // Determine what type of assignment to do on startup
    boolean retainAssignment = master.getConfiguration().
      getBoolean("hbase.master.startup.retainassign", true);

    Map> bulkPlan = null;
    if (retainAssignment) {
      // Reuse existing assignment info
      bulkPlan = LoadBalancer.retainAssignment(allRegions, servers);
    } else {
      // assign regions in round-robin fashion
      bulkPlan = LoadBalancer.roundRobinAssignment(
        new ArrayList(allRegions.keySet()), servers);
    }
    LOG.info("Bulk assigning " + allRegions.size() + " region(s) across " +
      servers.size() + " server(s), retainAssignment=" + retainAssignment);
    ...
{code}

In the function assignAllUserRegions(), listed above, the AM fetches the server 
list from ServerManager long before it actually uses it to create the assignment 
plan.

In between, it performs a full scan of META to create an assignment map 
of regions. So even if additional RSes have registered in the meantime (as 
happened in this case), the AM still has the old list of just one server.

This code snippet is from 0.90.6, but the same issue exists in 0.92, 0.94 and 
trunk. Since MSLAB is enabled by default from 0.92 onwards, any large cluster can 
hit this issue on cluster start-up when the following sequence holds:

# The Master starts long before the RSes (by default, "long" here is ~4.5 seconds).
# All the RSes start together, but one wins the race to register with the Master 
by a few seconds.

I am attaching a patch for trunk which moves the code that fetches the RS 
list from the beginning of the function to where it is first used.
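
The effect of the reordering can be shown with a toy model. This is a sketch only: it assumes the server list call returns a snapshot copy, and every name in it is hypothetical rather than taken from HBase or from the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the ordering problem; all names here are hypothetical. */
class StaleServerListDemo {
  /** Stand-in for the slow META scan; servers may register while it runs. */
  static void scanMeta(Runnable whileScanning) {
    whileScanning.run();
  }

  /** Original order: copy the server list first, then scan. */
  static int serversSeenFetchingEarly(List<String> online, Runnable whileScanning) {
    List<String> servers = new ArrayList<>(online); // snapshot taken too soon
    scanMeta(whileScanning);
    return servers.size();
  }

  /** Reordered: scan first, copy the server list right before it is used. */
  static int serversSeenFetchingLate(List<String> online, Runnable whileScanning) {
    scanMeta(whileScanning);
    List<String> servers = new ArrayList<>(online); // snapshot taken at point of use
    return servers.size();
  }

  public static void main(String[] args) {
    List<String> online = new ArrayList<>();
    online.add("srv109");
    Runnable lateArrivals = () -> { online.add("srv111"); online.add("srv113"); };
    System.out.println("early fetch sees "
        + serversSeenFetchingEarly(online, lateArrivals) + " server(s)");
  }
}
```

The late fetch narrows the window but does not close it: servers that register after the plan is built are still left out of that plan.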

Apart from this change, one other HBase setting that now becomes important, 
because MSLAB is enabled by default, is 
"hbase.master.wait.on.regionservers.mintostart".

Large clusters that keep MSLAB enabled must now raise 
"hbase.master.wait.on.regionservers.mintostart" from the 
default of 1 to a suitable number, so that the master waits for a quorum of RSes 
sufficient to open all the regions among themselves. I'll create a separate 
JIRA for the documentation change.
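
As a rough illustration of what a minimum-to-start threshold buys, the sketch below polls a server count until it reaches the threshold or a timeout passes. The class, method and parameter names are hypothetical. The master's real startup wait also waits for the count to settle, as the log above shows, and that part is omitted here.

```java
import java.util.function.IntSupplier;

/** Hypothetical sketch of a "minimum servers to start" wait; not HBase code. */
class WaitForServers {
  /**
   * Polls the registered-server count until it reaches minToStart or the
   * timeout elapses, and returns the last count observed.
   */
  static int waitForMin(IntSupplier registeredCount, int minToStart,
                        long intervalMs, long timeoutMs) throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    int count = registeredCount.getAsInt();
    while (count < minToStart && System.currentTimeMillis() < deadline) {
      Thread.sleep(intervalMs);
      count = registeredCount.getAsInt();
    }
    return count;
  }
}
```

With a threshold of 1, the wait ends as soon as the first RS registers, which matches the single-server bulk assignment in the log.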

  was:
While investigating an Out of Memory issue, I had an interesting observation 
where the master

[jira] [Commented] (HBASE-6321) ReplicationSource dies reading the peer's id

2012-07-11 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412377#comment-13412377
 ] 

Jonathan Hsieh commented on HBASE-6321:
---

Bumping from 0.90.7 to 0.90.8

> ReplicationSource dies reading the peer's id
> 
>
> Key: HBASE-6321
> URL: https://issues.apache.org/jira/browse/HBASE-6321
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1, 0.94.0
>Reporter: Jean-Daniel Cryans
> Fix For: 0.92.2, 0.96.0, 0.94.2, 0.90.8
>
>
> This is what I saw:
> {noformat}
> 2012-07-01 05:04:01,638 ERROR 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Closing 
> source 8 because an error occurred: Could not read peer's cluster id
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /va1-backup/hbaseid
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:154)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:259)
> at 
> org.apache.hadoop.hbase.zookeeper.ClusterId.readClusterIdZNode(ClusterId.java:61)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:253)
> {noformat}
> The session should just be reopened.
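
A minimal sketch of the "reopen and retry" idea, assuming an expired session cannot be reused and a fresh one has to be created. SessionExpired, PeerSession and ReopeningReader are hypothetical stand-ins for this example; they are not ZooKeeper or HBase classes, and this is not a proposed patch.

```java
import java.util.function.Supplier;

/** Stand-in for a session-expired error (hypothetical). */
class SessionExpired extends Exception {}

/** Minimal stand-in for a client session that can read a znode. */
interface PeerSession {
  String read(String path) throws SessionExpired;
}

/** Hypothetical helper that reopens the session instead of giving up. */
class ReopeningReader {
  static String readWithReopen(Supplier<PeerSession> connect, String path,
                               int maxReopens) throws SessionExpired {
    PeerSession session = connect.get();
    for (int reopened = 0; ; reopened++) {
      try {
        return session.read(path);
      } catch (SessionExpired e) {
        if (reopened >= maxReopens) throw e; // give up after the allowed reopens
        session = connect.get();             // open a fresh session and retry
      }
    }
  }
}
```

The bound on reopens keeps a permanently unreachable peer from turning into an endless loop.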





[jira] [Updated] (HBASE-6321) ReplicationSource dies reading the peer's id

2012-07-11 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-6321:
--

Fix Version/s: (was: 0.90.7)
   0.90.8

> ReplicationSource dies reading the peer's id
> 
>
> Key: HBASE-6321
> URL: https://issues.apache.org/jira/browse/HBASE-6321
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.1, 0.94.0
>Reporter: Jean-Daniel Cryans
> Fix For: 0.92.2, 0.96.0, 0.94.2, 0.90.8
>
>





[jira] [Updated] (HBASE-5157) Backport HBASE-4880- Region is on service before openRegionHandler completes, may cause data loss

2012-07-11 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5157:
--

Fix Version/s: (was: 0.90.8)
   0.90.7

Actually, since this is a data loss bug, considering it.

> Backport HBASE-4880- Region is on service before openRegionHandler completes, 
> may cause data loss
> -
>
> Key: HBASE-5157
> URL: https://issues.apache.org/jira/browse/HBASE-5157
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
> Fix For: 0.90.7
>
> Attachments: HBASE-4880_branch90_1.patch
>
>
> Backporting to 0.90.6 considering the importance of the issue.





[jira] [Updated] (HBASE-4083) If Enable table is not completed and is partial, then scanning of the table is not working

2012-07-11 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-4083:
--

   Resolution: Fixed
Fix Version/s: (was: 0.90.7)
   0.94.0
   Status: Resolved  (was: Patch Available)

Removed from 0.90; this was committed to trunk/0.94.0 back in July 2011.  Please 
file a new issue if you want to get it into 0.90.

> If Enable table is not completed and is partial, then scanning of the table 
> is not working 
> ---
>
> Key: HBASE-4083
> URL: https://issues.apache.org/jira/browse/HBASE-4083
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.94.0, 0.92.0
>
> Attachments: HBASE-4083-1.patch, HBASE-4083_0.90.patch, 
> HBASE-4083_0.90_1.patch, HBASE-4083_trunk.patch, HBASE-4083_trunk_1.patch
>
>
> Consider the following scenario
> Start the Master, Backup master and RegionServer.
> Create a table which in turn creates a region.
> Disable the table.
> Enable the table again. 
> Kill the Active master exactly at the point before the actual region 
> assignment is started.
> Restart or switch master.
> Scan the table.
> NotServingRegionException is thrown.





[jira] [Updated] (HBASE-5157) Backport HBASE-4880- Region is on service before openRegionHandler completes, may cause data loss

2012-07-11 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5157:
--

Fix Version/s: (was: 0.90.7)
   0.90.8

No activity in 6 months.  Bumping from 0.90.7

> Backport HBASE-4880- Region is on service before openRegionHandler completes, 
> may cause data loss
> -
>
> Key: HBASE-5157
> URL: https://issues.apache.org/jira/browse/HBASE-5157
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
> Fix For: 0.90.8
>
> Attachments: HBASE-4880_branch90_1.patch
>
>
> Backporting to 0.90.6 considering the importance of the issue.





[jira] [Updated] (HBASE-4462) Properly treating SocketTimeoutException

2012-07-11 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-4462:
--

Fix Version/s: (was: 0.90.7)
   0.90.8

Looks like ram bumped this from 0.90.6 and it is assigned to him, so I'm bumping it 
from 0.90.7.

> Properly treating SocketTimeoutException
> 
>
> Key: HBASE-4462
> URL: https://issues.apache.org/jira/browse/HBASE-4462
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.4
>Reporter: Jean-Daniel Cryans
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.8
>
> Attachments: HBASE-4462_0.90.x.patch
>
>
> SocketTimeoutException is currently treated like any IOE inside of 
> HCM.getRegionServerWithRetries and I think this is a problem. This method 
> should only do retries in cases where we are pretty sure the operation will 
> complete, but with STE we already waited for (by default) 60 seconds and 
> nothing happened.
> I found this while debugging Douglas Campbell's problem on the mailing list 
> where it seemed like he was using the same scanner from multiple threads, but 
> actually it was just the same client doing retries while the first run didn't 
> even finish yet (that's another problem). You could see the first scanner, 
> then up to two other handlers waiting for it to finish in order to run 
> (because of the synchronization on RegionScanner).
> So what should we do? We could treat STE as a DoNotRetryException and let the 
> client deal with it, or we could retry only once.
> There's also the option of having different behavior for get/put/icv/scan. 
> The issue with operations that modify a cell is that you don't know whether the 
> operation completed or not (the same is true when a RS dies hard after completing, 
> say, a Put but just before returning to the client).
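
One possible combination of the options discussed above, as a sketch: never retry a cell-modifying operation after a socket timeout, and retry a read at most once. The class, method and parameter names are hypothetical. This illustrates the trade-off and is not a decision recorded in the issue.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

/** Hypothetical retry policy; not the HCM.getRegionServerWithRetries logic. */
class TimeoutAwareRetry {
  /**
   * Decides whether to try again after a failed attempt.
   *
   * @param failure       the exception from the last attempt
   * @param modifiesCell  true for put/icv style operations
   * @param attemptsSoFar number of attempts already made (at least 1)
   * @param maxAttempts   overall retry budget
   */
  static boolean shouldRetry(IOException failure, boolean modifiesCell,
                             int attemptsSoFar, int maxAttempts) {
    if (attemptsSoFar >= maxAttempts) return false;
    if (failure instanceof SocketTimeoutException) {
      // The first request may still be running on the server, so a blind
      // retry can pile up behind it; a mutation may even be applied twice.
      return !modifiesCell && attemptsSoFar < 2;
    }
    return true; // other IOExceptions keep the existing retry behavior
  }
}
```

A caller that gets false for a timed-out mutation still does not know whether the write landed, so it has to check or reconcile on its own.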

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5323) Need to handle assertion error while splitting log through ServerShutDownHandler by shutting down the master

2012-07-11 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5323:
--

Fix Version/s: (was: 0.90.7)
   0.90.8

> Need to handle assertion error while splitting log through 
> ServerShutDownHandler by shutting down the master
> 
>
> Key: HBASE-5323
> URL: https://issues.apache.org/jira/browse/HBASE-5323
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.94.2, 0.90.8
>
> Attachments: HBASE-5323.patch, HBASE-5323.patch
>
>
> We know that while parsing the HLog we expect the proper length from HDFS.
> In WALReaderFSDataInputStream
> {code}
>   assert(realLength >= this.length);
> {code}
> We try to bail out if the above condition is not satisfied.  But if 
> SSH.splitLog() hits this problem, the error surfaces in the run method of 
> EventHandler.  This kills the SSH thread, so further assignment does not 
> happen.  If ROOT and META need to be assigned, they cannot be.
> I think in this case we should abort the master by catching such errors.
> Please share your suggestions.
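
A minimal sketch of the proposed handling, assuming the handler's work can be wrapped and that the master exposes an abort hook. AbortHook and GuardedHandler are hypothetical names for this example; this is not the attached patch.

```java
/** Hypothetical stand-in for the master's abort hook. */
interface AbortHook {
  void abort(String why, Throwable cause);
}

/** Hypothetical wrapper around a handler's work; not the EventHandler code. */
class GuardedHandler implements Runnable {
  private final Runnable work;
  private final AbortHook master;

  GuardedHandler(Runnable work, AbortHook master) {
    this.work = work;
    this.master = master;
  }

  @Override
  public void run() {
    try {
      work.run();
    } catch (Throwable t) {
      // AssertionError is an Error, not an Exception, so a plain
      // catch (Exception e) would let it escape and kill the thread.
      master.abort("Unexpected failure while splitting logs", t);
    }
  }
}
```

Catching Throwable is deliberate here: the failing check is an assert, so the failure arrives as an AssertionError when assertions are enabled.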





[jira] [Commented] (HBASE-5323) Need to handle assertion error while splitting log through ServerShutDownHandler by shutting down the master

2012-07-11 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412366#comment-13412366
 ] 

Jonathan Hsieh commented on HBASE-5323:
---

If it isn't making 0.94.1 then I'm going to bump it from 0.90.7.

> Need to handle assertion error while splitting log through 
> ServerShutDownHandler by shutting down the master
> 
>
> Key: HBASE-5323
> URL: https://issues.apache.org/jira/browse/HBASE-5323
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.94.2, 0.90.8
>
> Attachments: HBASE-5323.patch, HBASE-5323.patch
>
>





[jira] [Commented] (HBASE-4064) Two concurrent unassigning of the same region caused the endless loop of "Region has been PENDING_CLOSE for too long..."

2012-07-11 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412363#comment-13412363
 ] 

Jonathan Hsieh commented on HBASE-4064:
---

There was some recent activity here. Is anyone planning on finishing this one? 
(It's been bumped a few times; I'm considering bumping it for the 0.90.7 release.)


> Two concurrent unassigning of the same region caused the endless loop of 
> "Region has been PENDING_CLOSE for too long..."
> 
>
> Key: HBASE-4064
> URL: https://issues.apache.org/jira/browse/HBASE-4064
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.3
>Reporter: Jieshan Bean
> Fix For: 0.90.7
>
> Attachments: HBASE-4064-v1.patch, HBASE-4064_branch90V2.patch, 
> disableflow.png
>
>
> 1. If there is a "rubbish" RegionState object with "PENDING_CLOSE" in 
> regionsInTransition (a RegionState left behind by some exception that 
> should have been removed, which is why I call it a "rubbish" object), but the 
> region is not currently assigned anywhere, TimeoutMonitor will fall into an 
> endless loop:
> 2011-06-27 10:32:21,326 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
> out:  test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
> state=PENDING_CLOSE, ts=1309141555301
> 2011-06-27 10:32:21,326 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
> PENDING_CLOSE for too long, running forced unassign again on 
> region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
> 2011-06-27 10:32:21,438 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
> region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
> (offlining)
> 2011-06-27 10:32:21,441 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign 
> region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is 
> not currently assigned anywhere
> 2011-06-27 10:32:31,207 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
> out:  test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
> state=PENDING_CLOSE, ts=1309141555301
> 2011-06-27 10:32:31,207 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
> PENDING_CLOSE for too long, running forced unassign again on 
> region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
> 2011-06-27 10:32:31,215 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
> region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
> (offlining)
> 2011-06-27 10:32:31,215 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign 
> region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is 
> not currently assigned anywhere
> 2011-06-27 10:32:41,164 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
> out:  test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
> state=PENDING_CLOSE, ts=1309141555301
> 2011-06-27 10:32:41,164 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
> PENDING_CLOSE for too long, running forced unassign again on 
> region=test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f.
> 2011-06-27 10:32:41,172 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
> region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. 
> (offlining)
> 2011-06-27 10:32:41,172 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Attempted to unassign 
> region test2,070712,1308971310309.9a6e26d40293663a79523c58315b930f. but it is 
> not currently assigned anywhere
> .
> 2. In the following scenario, two concurrent unassign calls on the same 
> region may lead to the above problem:
> The first unassign call sends its RPC successfully. The master sees the 
> "RS_ZK_REGION_CLOSED" event and, to process it, creates a 
> ClosedRegionHandler to remove the region's state in the master.
> While the ClosedRegionHandler is running in a 
> "hbase.master.executor.closeregion.threads" thread (A), another unassign call 
> on the same region runs in another thread (B).
> When thread B runs "if (!regions.containsKey(region))", this.regions still 
> has the region info; the CPU then switches to thread A.
> Thread A removes the region from both "this.regions" and 
> "regionsInTransition", then the CPU switches back to thread B. Thread B 
> continues and throws an exception with the message "Server null returned 
> java.lang.NullPointerException: Passed server is null for 
> 9a6e26d40293663a79523c58315b930f", but without removing 
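
The interleaving described above is a check-then-act race: thread B tests whether the region is present and then acts on that answer after thread A has already removed it. The following is a minimal standalone sketch of that pattern, not code from AssignmentManager or from the attached patches. The names `RegionTracker`, `unassignRacy`, and `unassignGuarded` are hypothetical, and the `Runnable` hook is only there to simulate the context switch deterministically.

```java
import java.util.HashMap;
import java.util.Map;

final class RegionTracker {
    // region -> hosting server; stands in for "this.regions"
    private final Map<String, String> regions = new HashMap<>();
    // region -> state; stands in for "regionsInTransition"
    private final Map<String, String> regionsInTransition = new HashMap<>();

    synchronized void assign(String region, String server) {
        regions.put(region, server);
    }

    // What the closed-region handler (thread A) does.
    synchronized void regionClosed(String region) {
        regions.remove(region);
        regionsInTransition.remove(region);
    }

    synchronized boolean inTransition(String region) {
        return regionsInTransition.containsKey(region);
    }

    // Racy variant: the check and the action are separate critical sections.
    // The hook simulates the context switch to thread A between them.
    boolean unassignRacy(String region, Runnable contextSwitch) {
        synchronized (this) {
            if (!regions.containsKey(region)) return false;
        }
        contextSwitch.run();
        synchronized (this) {
            regionsInTransition.put(region, "PENDING_CLOSE");
            // With no hosting server the close request cannot be sent, and
            // the PENDING_CLOSE entry is left behind.
            return regions.get(region) != null;
        }
    }

    // Guarded variant: check and act under one lock, and drop any stale
    // in-transition state when the region is not assigned anywhere.
    synchronized boolean unassignGuarded(String region) {
        if (!regions.containsKey(region)) {
            regionsInTransition.remove(region);
            return false;
        }
        regionsInTransition.put(region, "PENDING_CLOSE");
        return true;
    }

    public static void main(String[] args) {
        RegionTracker racy = new RegionTracker();
        racy.assign("r1", "rs1");
        boolean sent = racy.unassignRacy("r1", () -> racy.regionClosed("r1"));
        System.out.println("racy: sent=" + sent + ", stale=" + racy.inTransition("r1"));

        RegionTracker guarded = new RegionTracker();
        guarded.assign("r1", "rs1");
        guarded.regionClosed("r1");
        boolean sent2 = guarded.unassignGuarded("r1");
        System.out.println("guarded: sent=" + sent2 + ", stale=" + guarded.inTransition("r1"));
    }
}
```

In the racy variant the leftover PENDING_CLOSE entry is the kind of "rubbish" state that a timeout monitor would keep retrying; the guarded variant never leaves it behind.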

[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception

2012-07-11 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412360#comment-13412360
 ] 

Jonathan Hsieh commented on HBASE-5883:
---

@Jieshan, since this was committed a long time ago (5/3/12), I'd suggest creating 
a new issue to clean it up.  I'll close this after it is done.

> Backup master is going down due to connection refused exception
> ---
>
> Key: HBASE-5883
> URL: https://issues.apache.org/jira/browse/HBASE-5883
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.6, 0.92.1, 0.94.0
>Reporter: Gopinathan A
>Assignee: Jieshan Bean
> Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.2
>
> Attachments: 90-addendum.patch, 92-addendum.patch, 94-addendum.patch, 
> HBASE-5883-90.patch, HBASE-5883-92.patch, HBASE-5883-94.patch, 
> HBASE-5883-trunk.patch, trunk-addendum.patch
>
>
> The active master node's network was down for some time (this node hosts the 
> Master, DN, ZK, and RS). The backup node got the notification and started to 
> become active. Immediately, the backup node aborted with the exception below.
> {noformat}
> 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
> finished splitting (more than or equal to) 861248320 bytes in 4 log files in 
> [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting]
>  in 26374ms
> 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
> server abort: loaded coprocessors are: []
> 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unhandled exception. Starting shutdown.
> java.io.IOException: java.net.ConnectException: Connection refused
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
>   at $Proxy13.getProtocolVersion(Unknown Source)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
>   at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
>   at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
>   at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
>   at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
>   at 
> org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
>   at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
>   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
>   ... 20 more
> 2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: 
> Stopping service threads
> {noformat}
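
The trace above shows a `ConnectException`, wrapped in an `IOException`, surfacing from `verifyRootRegionLocation` and aborting master initialization. One way a caller could tolerate this is to treat a refused connection as "location not valid" rather than as a fatal error. The sketch below is a standalone illustration under that assumption; `LocationVerifier` and `Probe` are hypothetical names, and this is not claimed to be what the attached patches do.

```java
import java.io.IOException;
import java.net.ConnectException;

final class LocationVerifier {
    /** Hypothetical stand-in for an RPC to the server hosting a catalog region. */
    interface Probe {
        void ping() throws IOException;
    }

    /**
     * Returns true if the server answered, false if the connection was refused
     * (directly or as the cause of a wrapping IOException). Any other I/O
     * error is rethrown to the caller.
     */
    static boolean verify(Probe probe) throws IOException {
        try {
            probe.ping();
            return true;
        } catch (IOException e) {
            // Walk the cause chain looking for a refused connection.
            for (Throwable t = e; t != null; t = t.getCause()) {
                if (t instanceof ConnectException) return false;
            }
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        Probe refused = () -> {
            throw new IOException(new ConnectException("Connection refused"));
        };
        System.out.println("verified=" + verify(refused));
    }
}
```

A caller that gets `false` back can then reassign the region instead of shutting down.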


[jira] [Commented] (HBASE-5376) Add more logging to triage HBASE-5312: Closed parent region present in Hlog.lastSeqWritten

2012-07-11 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412358#comment-13412358
 ] 

Jonathan Hsieh commented on HBASE-5376:
---

Jimmy, this patch looks trivial. Do you want to commit this to the 0.90 branch?  
Other branches?

> Add more logging to triage HBASE-5312: Closed parent region present in 
> Hlog.lastSeqWritten
> --
>
> Key: HBASE-5376
> URL: https://issues.apache.org/jira/browse/HBASE-5376
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jimmy Xiang
>Priority: Minor
> Fix For: 0.90.7
>
> Attachments: hbase-5376.txt
>
>
> It is hard to find out what exactly caused HBASE-5312.  Some logging would 
> help shed some light.





[jira] [Updated] (HBASE-3834) Store ignores checksum errors when opening files

2012-07-11 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-3834:
--

Fix Version/s: (was: 0.90.7)
   0.90.8

No activity, moving to 0.90.8

> Store ignores checksum errors when opening files
> 
>
> Key: HBASE-3834
> URL: https://issues.apache.org/jira/browse/HBASE-3834
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.2
>Reporter: Todd Lipcon
>Priority: Critical
> Fix For: 0.90.8
>
>
> If you corrupt one of the storefiles in a region (eg using vim to muck up 
> some bytes), the region will still open, but that storefile will just be 
> ignored with a log message. We should probably not do this in general - 
> better to keep that region unassigned and force an admin to make a decision 
> to remove the bad storefile.





[jira] [Updated] (HBASE-6375) Master could possibly be using a stale list of region servers for creating assignment plan during startup

2012-07-11 Thread Aditya Kishore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Kishore updated HBASE-6375:
--

Fix Version/s: 0.96.0
   Status: Patch Available  (was: Open)

As described in the summary, the patch modifies the AM code to fetch the list 
of RSes just before it needs it.

> Master could possibly be using a stale list of region servers for creating 
> assignment plan during startup
> -
>
> Key: HBASE-6375
> URL: https://issues.apache.org/jira/browse/HBASE-6375
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.0, 0.92.1, 0.90.6, 0.96.0
> Environment: All
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
> Fix For: 0.96.0
>
> Attachments: HBASE-6375_trunk.patch
>
>
> While investigating an Out of Memory issue, I made an interesting observation: 
> the master tried to assign all regions to a single region server even though 
> 7 others had already registered with it.
> As the cluster had MSLAB enabled, this resulted in an OOM on the RS when it 
> tried to open all of them.
> *From master's log (edited for brevity):*
> {quote}
> 55,468 Waiting on regionserver(s) to checkin
> 56,968 Waiting on regionserver(s) to checkin
> 58,468 Waiting on regionserver(s) to checkin
> 59,968 Waiting on regionserver(s) to checkin
> 01,242 Registering server=srv109.datacenter,60020,1338673920529,regionCount=0,userLoad=false
> 01,469 Waiting on regionserver(s) count to settle; currently=1
> 02,969 Finished waiting for regionserver count to settle; count=1,sleptFor=46500
> 02,969 Exiting wait on regionserver(s) to checkin; count=1, stopped=false,count of regions out on cluster=0
> 03,010 Processing region \-ROOT\-,,0.70236052 in state M_ZK_REGION_OFFLINE
> 03,220 \-ROOT\- assigned=0, rit=true, location=srv109.datacenter:60020
> 03,221 Processing region .META.,,1.1028785192 in state M_ZK_REGION_OFFLINE
> 03,336 Detected completed assignment of META, notifying catalog tracker
> 03,350 .META. assigned=0, rit=true, location=srv109.datacenter:60020
> 03,350 Master startup proceeding: cluster startup
> 04,006 Registering server=srv111.datacenter,60020,1338673923399,regionCount=0,userLoad=false
> 04,012 Registering server=srv113.datacenter,60020,1338673923532,regionCount=0,userLoad=false
> 04,269 Registering server=srv115.datacenter,60020,1338673923471,regionCount=0,userLoad=false
> 04,363 Registering server=srv117.datacenter,60020,1338673923928,regionCount=0,userLoad=false
> 04,599 Registering server=srv127.datacenter,60020,1338673924067,regionCount=0,userLoad=false
> 04,606 Registering server=srv119.datacenter,60020,1338673923953,regionCount=0,userLoad=false
> 04,804 Registering server=srv129.datacenter,60020,1338673924339,regionCount=0,userLoad=false
> 05,126 Bulk assigning 1252 region(s) across 1 server(s), retainAssignment=true
> 05,546 hd109.datacenter,60020,1338673920529 unassigned znodes=207 of
> {quote}
> *A peek at the AssignmentManager code offers some explanation:*
> {code}
>   public void assignAllUserRegions() throws IOException, InterruptedException {
>     // Get all available servers
>     List<HServerInfo> servers = serverManager.getOnlineServersList();
>     // Scan META for all user regions, skipping any disabled tables
>     Map<HRegionInfo, HServerAddress> allRegions =
>       MetaReader.fullScan(catalogTracker, this.zkTable.getDisabledTables(), true);
>     if (allRegions == null || allRegions.isEmpty()) return;
>     // Determine what type of assignment to do on startup
>     boolean retainAssignment = master.getConfiguration().
>       getBoolean("hbase.master.startup.retainassign", true);
>     Map<HServerInfo, List<HRegionInfo>> bulkPlan = null;
>     if (retainAssignment) {
>       // Reuse existing assignment info
>       bulkPlan = LoadBalancer.retainAssignment(allRegions, servers);
>     } else {
>       // assign regions in round-robin fashion
>       bulkPlan = LoadBalancer.roundRobinAssignment(
>         new ArrayList<HRegionInfo>(allRegions.keySet()), servers);
>     }
>     LOG.info("Bulk assigning " + allRegions.size() + " region(s) across " +
>       servers.size() + " server(s), retainAssignment=" + retainAssignment);
>     ...
> {code}
> In the function assignAllUserRegions(), listed above, AM fetches the server 
> list from ServerManager long before it actually uses it to create the 
> assignment plan.
> In between these, it performs a full scan of META to create an assignment map 
> of regions. So even if additional RSes have registered in the meantime (as 
> happened in this case), AM still has the old list of just one server.
> This code snippet is from 0.90.6 but the same issue exists in 0.92, 0.94 and 
> trunk. Since MSLAB is enabled by default in 0.92 onwards, any large cluster 
> can hit this issue upon cluster 

[jira] [Updated] (HBASE-6375) Master could possibly be using a stale list of region servers for creating assignment plan during startup

2012-07-11 Thread Aditya Kishore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Kishore updated HBASE-6375:
--

Attachment: HBASE-6375_trunk.patch

Patch for trunk

> Master could possibly be using a stale list of region servers for creating 
> assignment plan during startup
> -
>
> Key: HBASE-6375
> URL: https://issues.apache.org/jira/browse/HBASE-6375
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
> Environment: All
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
> Attachments: HBASE-6375_trunk.patch
>
>
> While investigating an Out of Memory issue, I made an interesting observation: 
> the master tried to assign all regions to a single region server even though 
> 7 others had already registered with it.
> As the cluster had MSLAB enabled, this resulted in an OOM on the RS when it 
> tried to open all of them.
> *From master's log (edited for brevity):*
> {quote}
> 55,468 Waiting on regionserver(s) to checkin
> 56,968 Waiting on regionserver(s) to checkin
> 58,468 Waiting on regionserver(s) to checkin
> 59,968 Waiting on regionserver(s) to checkin
> 01,242 Registering server=srv109.datacenter,60020,1338673920529,regionCount=0,userLoad=false
> 01,469 Waiting on regionserver(s) count to settle; currently=1
> 02,969 Finished waiting for regionserver count to settle; count=1,sleptFor=46500
> 02,969 Exiting wait on regionserver(s) to checkin; count=1, stopped=false,count of regions out on cluster=0
> 03,010 Processing region \-ROOT\-,,0.70236052 in state M_ZK_REGION_OFFLINE
> 03,220 \-ROOT\- assigned=0, rit=true, location=srv109.datacenter:60020
> 03,221 Processing region .META.,,1.1028785192 in state M_ZK_REGION_OFFLINE
> 03,336 Detected completed assignment of META, notifying catalog tracker
> 03,350 .META. assigned=0, rit=true, location=srv109.datacenter:60020
> 03,350 Master startup proceeding: cluster startup
> 04,006 Registering server=srv111.datacenter,60020,1338673923399,regionCount=0,userLoad=false
> 04,012 Registering server=srv113.datacenter,60020,1338673923532,regionCount=0,userLoad=false
> 04,269 Registering server=srv115.datacenter,60020,1338673923471,regionCount=0,userLoad=false
> 04,363 Registering server=srv117.datacenter,60020,1338673923928,regionCount=0,userLoad=false
> 04,599 Registering server=srv127.datacenter,60020,1338673924067,regionCount=0,userLoad=false
> 04,606 Registering server=srv119.datacenter,60020,1338673923953,regionCount=0,userLoad=false
> 04,804 Registering server=srv129.datacenter,60020,1338673924339,regionCount=0,userLoad=false
> 05,126 Bulk assigning 1252 region(s) across 1 server(s), retainAssignment=true
> 05,546 hd109.datacenter,60020,1338673920529 unassigned znodes=207 of
> {quote}
> *A peek at the AssignmentManager code offers some explanation:*
> {code}
>   public void assignAllUserRegions() throws IOException, InterruptedException {
>     // Get all available servers
>     List<HServerInfo> servers = serverManager.getOnlineServersList();
>     // Scan META for all user regions, skipping any disabled tables
>     Map<HRegionInfo, HServerAddress> allRegions =
>       MetaReader.fullScan(catalogTracker, this.zkTable.getDisabledTables(), true);
>     if (allRegions == null || allRegions.isEmpty()) return;
>     // Determine what type of assignment to do on startup
>     boolean retainAssignment = master.getConfiguration().
>       getBoolean("hbase.master.startup.retainassign", true);
>     Map<HServerInfo, List<HRegionInfo>> bulkPlan = null;
>     if (retainAssignment) {
>       // Reuse existing assignment info
>       bulkPlan = LoadBalancer.retainAssignment(allRegions, servers);
>     } else {
>       // assign regions in round-robin fashion
>       bulkPlan = LoadBalancer.roundRobinAssignment(
>         new ArrayList<HRegionInfo>(allRegions.keySet()), servers);
>     }
>     LOG.info("Bulk assigning " + allRegions.size() + " region(s) across " +
>       servers.size() + " server(s), retainAssignment=" + retainAssignment);
>     ...
> {code}
> In the function assignAllUserRegions(), listed above, AM fetches the server 
> list from ServerManager long before it actually uses it to create the 
> assignment plan.
> In between these, it performs a full scan of META to create an assignment map 
> of regions. So even if additional RSes have registered in the meantime (as 
> happened in this case), AM still has the old list of just one server.
> This code snippet is from 0.90.6 but the same issue exists in 0.92, 0.94 and 
> trunk. Since MSLAB is enabled by default in 0.92 onwards, any large cluster 
> can hit this issue upon cluster start-up when the following sequence holds 
> true.
> # Master starts long before the RSes (by default, "long" here is ~4.5 seconds)
> # All the RSes start together but

[jira] [Created] (HBASE-6375) Master could possibly be using a stale list of region servers for creating assignment plan during startup

2012-07-11 Thread Aditya Kishore (JIRA)
Aditya Kishore created HBASE-6375:
-

 Summary: Master could possibly be using a stale list of region 
servers for creating assignment plan during startup
 Key: HBASE-6375
 URL: https://issues.apache.org/jira/browse/HBASE-6375
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.94.0, 0.92.1, 0.90.6, 0.96.0
 Environment: All
Reporter: Aditya Kishore
Assignee: Aditya Kishore


While investigating an Out of Memory issue, I made an interesting observation: 
the master tried to assign all regions to a single region server even though 
7 others had already registered with it.

As the cluster had MSLAB enabled, this resulted in an OOM on the RS when it 
tried to open all of them.

*From master's log (edited for brevity):*
{quote}
55,468 Waiting on regionserver(s) to checkin
56,968 Waiting on regionserver(s) to checkin
58,468 Waiting on regionserver(s) to checkin
59,968 Waiting on regionserver(s) to checkin
01,242 Registering server=srv109.datacenter,60020,1338673920529,regionCount=0,userLoad=false
01,469 Waiting on regionserver(s) count to settle; currently=1
02,969 Finished waiting for regionserver count to settle; count=1,sleptFor=46500
02,969 Exiting wait on regionserver(s) to checkin; count=1, stopped=false,count of regions out on cluster=0
03,010 Processing region \-ROOT\-,,0.70236052 in state M_ZK_REGION_OFFLINE
03,220 \-ROOT\- assigned=0, rit=true, location=srv109.datacenter:60020
03,221 Processing region .META.,,1.1028785192 in state M_ZK_REGION_OFFLINE
03,336 Detected completed assignment of META, notifying catalog tracker
03,350 .META. assigned=0, rit=true, location=srv109.datacenter:60020
03,350 Master startup proceeding: cluster startup
04,006 Registering server=srv111.datacenter,60020,1338673923399,regionCount=0,userLoad=false
04,012 Registering server=srv113.datacenter,60020,1338673923532,regionCount=0,userLoad=false
04,269 Registering server=srv115.datacenter,60020,1338673923471,regionCount=0,userLoad=false
04,363 Registering server=srv117.datacenter,60020,1338673923928,regionCount=0,userLoad=false
04,599 Registering server=srv127.datacenter,60020,1338673924067,regionCount=0,userLoad=false
04,606 Registering server=srv119.datacenter,60020,1338673923953,regionCount=0,userLoad=false
04,804 Registering server=srv129.datacenter,60020,1338673924339,regionCount=0,userLoad=false
05,126 Bulk assigning 1252 region(s) across 1 server(s), retainAssignment=true
05,546 hd109.datacenter,60020,1338673920529 unassigned znodes=207 of
{quote}

*A peek at the AssignmentManager code offers some explanation:*
{code}
  public void assignAllUserRegions() throws IOException, InterruptedException {
    // Get all available servers
    List<HServerInfo> servers = serverManager.getOnlineServersList();

    // Scan META for all user regions, skipping any disabled tables
    Map<HRegionInfo, HServerAddress> allRegions =
      MetaReader.fullScan(catalogTracker, this.zkTable.getDisabledTables(), true);
    if (allRegions == null || allRegions.isEmpty()) return;

    // Determine what type of assignment to do on startup
    boolean retainAssignment = master.getConfiguration().
      getBoolean("hbase.master.startup.retainassign", true);

    Map<HServerInfo, List<HRegionInfo>> bulkPlan = null;
    if (retainAssignment) {
      // Reuse existing assignment info
      bulkPlan = LoadBalancer.retainAssignment(allRegions, servers);
    } else {
      // assign regions in round-robin fashion
      bulkPlan = LoadBalancer.roundRobinAssignment(
        new ArrayList<HRegionInfo>(allRegions.keySet()), servers);
    }
    LOG.info("Bulk assigning " + allRegions.size() + " region(s) across " +
      servers.size() + " server(s), retainAssignment=" + retainAssignment);
    ...
{code}

In the function assignAllUserRegions(), listed above, AM fetches the server 
list from ServerManager long before it actually uses it to create the 
assignment plan.

In between these, it performs a full scan of META to create an assignment map 
of regions. So even if additional RSes have registered in the meantime (as 
happened in this case), AM still has the old list of just one server.

This code snippet is from 0.90.6 but the same issue exists in 0.92, 0.94 and 
trunk. Since MSLAB is enabled by default in 0.92 onwards, any large cluster can 
hit this issue upon cluster start-up when the following sequence holds true.

# Master starts long before the RSes (by default, "long" here is ~4.5 seconds)
# All the RSes start together, but one wins the race to register with the Master 
by a few seconds.

I am attaching a patch for trunk that moves the code which fetches the RS 
list from the beginning of the function to where it is first used.
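
The effect of that reordering can be illustrated with a small standalone sketch. It is not the attached patch: `StartupPlanner`, `planStale`, and `planFresh` are hypothetical names, the `Supplier` stands in for `serverManager.getOnlineServersList()`, and the `Runnable` stands in for the slow META scan during which more region servers register.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class StartupPlanner {
    // Snapshot the server list first, then do the slow scan: servers that
    // register during the scan are missing from the plan.
    static int planStale(Supplier<List<String>> onlineServers, Runnable metaScan) {
        List<String> servers = onlineServers.get();
        metaScan.run();
        return servers.size();
    }

    // Do the slow scan first and fetch the server list only when it is
    // needed: late registrations are included in the plan.
    static int planFresh(Supplier<List<String>> onlineServers, Runnable metaScan) {
        metaScan.run();
        List<String> servers = onlineServers.get();
        return servers.size();
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>();
        a.add("srv109");
        int stale = planStale(() -> new ArrayList<>(a), () -> a.add("srv111"));

        List<String> b = new ArrayList<>();
        b.add("srv109");
        int fresh = planFresh(() -> new ArrayList<>(b), () -> b.add("srv111"));

        System.out.println("stale plan sees " + stale
            + " server(s), fresh plan sees " + fresh);
    }
}
```

Fetching late narrows the window but does not close it; servers that register after the list is fetched are still missed, which is where a minimum-servers-to-start setting helps.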

Apart from this change, one additional HBase setting that now becomes 
important is "hbase.master.wait.on.regionservers.mintostart", because MSLAB is 
now enabled by default.

Large clusters that keep it enabled must now modify 
"hbase.master.wait.on.regionser

[jira] [Commented] (HBASE-6368) Upgrade Guava for critical performance bug fix

2012-07-11 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412279#comment-13412279
 ] 

Andrew Purtell commented on HBASE-6368:
---

This issue and patch just roll right over a bunch of previous discussion like 
it never happened.

> Upgrade Guava for critical performance bug fix
> --
>
> Key: HBASE-6368
> URL: https://issues.apache.org/jira/browse/HBASE-6368
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Zhihong Ted Yu
>Assignee: Zhihong Ted Yu
>Priority: Critical
> Attachments: 6368-trunk.txt
>
>
> The bug is http://code.google.com/p/guava-libraries/issues/detail?id=1055
> See discussion under 'Upgrade to Guava 12.0.1: Performance bug in 
> CacheBuilder/LoadingCache fixed!'





[jira] [Assigned] (HBASE-5334) Pluggable Compaction Algorithms

2012-07-11 Thread Nicolas Spiegelberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Spiegelberg reassigned HBASE-5334:
--

Assignee: Akashnil  (was: Nicolas Spiegelberg)

Assigning to Akashnil, an intern with us who will be working on a 
size-based compaction algorithm (similar to the BigTable strategy).

> Pluggable Compaction Algorithms
> ---
>
> Key: HBASE-5334
> URL: https://issues.apache.org/jira/browse/HBASE-5334
> Project: HBase
>  Issue Type: Improvement
>Reporter: Nicolas Spiegelberg
>Assignee: Akashnil
>Priority: Minor
>  Labels: compaction, regionserver
>
> It would be good to create a set of common compaction algorithms so that we 
> can tune this on a per-CF basis.  In order to accomplish this, we need to 
> refactor the current algorithm for pluggability.





[jira] [Commented] (HBASE-6331) Problem with HBCK mergeOverlaps

2012-07-11 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412261#comment-13412261
 ] 

Jimmy Xiang commented on HBASE-6331:


I don't think this is a bug.  In hbck, the last endkey is not the normal empty 
byte[].
Instead, it is changed to null. So we have a special comparator.
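
To make the point concrete, here is a minimal sketch of a comparator that treats `null` as the greatest key, so that a `null` end key (the last region) always wins a "greater than" test like the one in `mergeOverlaps()`. `EndKeyComparator` is a hypothetical name and the byte comparison is a plain unsigned lexicographic one; this is an illustration of the idea, not the source of `RegionSplitCalculator.BYTES_COMPARATOR`.

```java
import java.util.Comparator;

final class EndKeyComparator implements Comparator<byte[]> {
    @Override
    public int compare(byte[] l, byte[] r) {
        if (l == null && r == null) return 0;
        if (l == null) return 1;    // null end key sorts after every real key
        if (r == null) return -1;
        int n = Math.min(l.length, r.length);
        for (int i = 0; i < n; i++) {
            int d = (l[i] & 0xff) - (r[i] & 0xff);  // unsigned byte comparison
            if (d != 0) return d;
        }
        return l.length - r.length;
    }

    public static void main(String[] args) {
        EndKeyComparator c = new EndKeyComparator();
        byte[] empty = new byte[0];
        byte[] key = {'m'};
        // With an empty byte[] as the last end key, a real key compares greater.
        System.out.println("key > empty: " + (c.compare(key, empty) > 0));
        // With null as the last end key, the last end key compares greater.
        System.out.println("null > key:  " + (c.compare(null, key) > 0));
    }
}
```

Under this ordering, a merged range that includes the last region keeps the "end of table" end key, provided that key has already been converted from the empty byte[] to `null`.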

> Problem with HBCK mergeOverlaps
> ---
>
> Key: HBASE-6331
> URL: https://issues.apache.org/jira/browse/HBASE-6331
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6331_94.patch, HBASE-6331_Trunk.patch
>
>
> In HDFSIntegrityFixer#mergeOverlaps(), there is logic to compute the final 
> range of the region after the overlap.
> I can see one issue with this code:
> {code}
> if (RegionSplitCalculator.BYTES_COMPARATOR
>     .compare(hi.getEndKey(), range.getSecond()) > 0) {
>   range.setSecond(hi.getEndKey());
> }
> {code}
> Suppose the regions include the last region, whose endKey is empty. The 
> final range should then have an empty byte[] as its end key.
> But with the above logic, any other key compares greater than the empty 
> byte[] and will be set instead.
> As a result, the new region that is created will not get an empty byte[] as 
> its end key.





[jira] [Commented] (HBASE-6331) Problem with HBCK mergeOverlaps

2012-07-11 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412249#comment-13412249
 ] 

Jonathan Hsieh commented on HBASE-6331:
---

Lars, let's go ahead and bump it to 0.94.2 -- I don't want to hold up a 
release, and though it is a bug, we've gotten by without it for a while now. 



> Problem with HBCK mergeOverlaps
> ---
>
> Key: HBASE-6331
> URL: https://issues.apache.org/jira/browse/HBASE-6331
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6331_94.patch, HBASE-6331_Trunk.patch
>
>
> In HDFSIntegrityFixer#mergeOverlaps(), there is logic to compute the final 
> range of the region after the overlap.
> I can see one issue with this code:
> {code}
> if (RegionSplitCalculator.BYTES_COMPARATOR
>     .compare(hi.getEndKey(), range.getSecond()) > 0) {
>   range.setSecond(hi.getEndKey());
> }
> {code}
> Suppose the regions include the last region, whose endKey is empty. The 
> final range should then have an empty byte[] as its end key.
> But with the above logic, any other key compares greater than the empty 
> byte[] and will be set instead.
> As a result, the new region that is created will not get an empty byte[] as 
> its end key.





[jira] [Commented] (HBASE-6220) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics

2012-07-11 Thread Paul Cavallaro (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412230#comment-13412230
 ] 

Paul Cavallaro commented on HBASE-6220:
---

Sorry I've been away this week with intermittent internet connectivity. I'll 
try to reply to all of this soon.

> PersistentMetricsTimeVaryingRate gets used for non-time-based metrics
> -
>
> Key: HBASE-6220
> URL: https://issues.apache.org/jira/browse/HBASE-6220
> Project: HBase
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 0.96.0
>Reporter: David S. Wang
>Assignee: Paul Cavallaro
>Priority: Minor
>  Labels: noob
> Attachments: ServerMetrics_HBASE_6220.patch
>
>
> PersistentMetricsTimeVaryingRate gets used for metrics that are not 
> time-based, leading to confusing names such as "avg_time" for compaction 
> size, etc.  You have to read the code in order to understand that this is 
> actually referring to bytes, not seconds.





[jira] [Commented] (HBASE-6336) Split point should not be equal with start row or end row

2012-07-11 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412069#comment-13412069
 ] 

stack commented on HBASE-6336:
--

+1 on the patch.

@Ram when you say 'But here in the first region there are no kvs at all and 
hence we flush an empty file.'

Which flush do you mean?  The one on region close during the split?  If so, 
why would we write a flush file if there are no KVs?  Do we?  That doesn't 
seem right.


> Split point should not be equal with start row or end row
> -
>
> Key: HBASE-6336
> URL: https://issues.apache.org/jira/browse/HBASE-6336
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.96.0
>
> Attachments: HBASE-6336.patch
>
>
> Should we allow a split point equal to the region's start row or end row?
> {code}
> // if the midkey is the same as the first and last keys, then we cannot
> // (ever) split this region.
> if (this.comparator.compareRows(mk, firstKey) == 0 &&
>     this.comparator.compareRows(mk, lastKey) == 0) {
>   if (LOG.isDebugEnabled()) {
>     LOG.debug("cannot split because midkey is the same as first or " +
>       "last row");
>   }
> {code}
> Here, I think it is a mistake.
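
One reading of the report, sketched under assumptions: the quoted condition uses && while its log message says "first or last row", so a midkey equal to only one boundary still passes the check. The class and method names below are hypothetical, and Arrays.equals on row bytes only stands in for comparator.compareRows(...) == 0; the attached patch may differ.

```java
import java.util.Arrays;

final class SplitPointCheck {
    // The condition as quoted: refuses the split only when the midkey row
    // equals BOTH the first row and the last row.
    static boolean refusesWithAnd(byte[] mid, byte[] first, byte[] last) {
        return Arrays.equals(mid, first) && Arrays.equals(mid, last);
    }

    // The condition the log message and the issue title suggest: refuses
    // the split when the midkey row equals EITHER boundary row.
    static boolean refusesWithOr(byte[] mid, byte[] first, byte[] last) {
        return Arrays.equals(mid, first) || Arrays.equals(mid, last);
    }
}
```

For a midkey row equal to the first row but not the last, refusesWithAnd returns false (the split would proceed) while refusesWithOr returns true.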





[jira] [Commented] (HBASE-2315) BookKeeper for write-ahead logging

2012-07-11 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412065#comment-13412065
 ] 

Flavio Junqueira commented on HBASE-2315:
-

Hi Ted, fs is part of the issue I was discussing before. We don't have a 
filesystem implementation for bookkeeper, so we can't use the filesystem 
instance passed.

About the reader and the writer, I was configuring them in the hbase-default 
configuration file:

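{noformat}
<property>
  <name>hbase.regionserver.hlog.reader.impl</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.BookKeeperLogReader</value>
  <description>The HLog file reader implementation.</description>
</property>
<property>
  <name>hbase.regionserver.hlog.writer.impl</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.BookKeeperLogWriter</value>
  <description>The HLog file writer implementation.</description>
</property>
{noformat}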

I assumed previously that HLog was instantiated elsewhere.

> BookKeeper for write-ahead logging
> --
>
> Key: HBASE-2315
> URL: https://issues.apache.org/jira/browse/HBASE-2315
> Project: HBase
>  Issue Type: New Feature
>  Components: regionserver
>Reporter: Flavio Junqueira
> Attachments: HBASE-2315.patch, bookkeeperOverview.pdf, 
> zookeeper-dev-bookkeeper.jar
>
>
> BookKeeper, a contrib of the ZooKeeper project, is a fault tolerant and high 
> throughput write-ahead logging service. This issue provides an implementation 
> of write-ahead logging for hbase using BookKeeper. Apart from expected 
> throughput improvements, BookKeeper also has stronger durability guarantees 
> compared to the implementation currently used by hbase.





[jira] [Commented] (HBASE-6362) Enhance test-patch.sh script to recognize images / non-trunk patches

2012-07-11 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412063#comment-13412063
 ] 

stack commented on HBASE-6362:
--

I'm with Andrew.  The barrier to contribution should be as low as possible.  
The argument above for requiring versioning doesn't fly given hadoopqa picks up 
the latest attachment whatever its name.  (As for hadoopqa not posting back 
when a patch does not compile, let's fix that.)

> Enhance test-patch.sh script to recognize images / non-trunk patches
> 
>
> Key: HBASE-6362
> URL: https://issues.apache.org/jira/browse/HBASE-6362
> Project: HBase
>  Issue Type: Bug
>Reporter: Zhihong Ted Yu
>
> When user uploads logs / images / non-trunk patches, Hadoop QA would complain 
> that the file couldn't be applied as a patch (for trunk).
> We should make this script smarter by recognizing image files and non-trunk 
> patches.





[jira] [Commented] (HBASE-6338) Cache Method in RPC handler

2012-07-11 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412061#comment-13412061
 ] 

stack commented on HBASE-6338:
--

Why javadoc a private method (especially a method named getMethod that returns 
a Method)?

> Cache Method in RPC handler
> ---
>
> Key: HBASE-6338
> URL: https://issues.apache.org/jira/browse/HBASE-6338
> Project: HBase
>  Issue Type: Improvement
>Reporter: binlijin
> Attachments: HBASE-6338-90.patch, HBASE-6338-92.patch, 
> HBASE-6338-94.patch, HBASE-6338-trunk.patch
>
>
> On every call, the rpc handler looks up a new Method; caching the Method 
> improves performance a little.
> I tested with 0.90: on average, Class.getMethod(String name, Class... 
> parameterTypes) costs 4780 ns; with caching it costs 2620 ns.
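
The caching idea described above could look roughly like this. It is a minimal sketch, not the attached patch: MethodCache and its key format are hypothetical, and only the standard Class.getMethod and ConcurrentHashMap calls are assumed.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class MethodCache {
    // Key: protocol class name, method name and parameter types.
    private final ConcurrentMap<String, Method> cache =
        new ConcurrentHashMap<String, Method>();

    Method get(Class<?> protocol, String name, Class<?>... parameterTypes) {
        String key = protocol.getName() + '#' + name
            + Arrays.toString(parameterTypes);
        Method method = cache.get(key);
        if (method == null) {
            try {
                // Reflective lookup happens only on a cache miss.
                method = protocol.getMethod(name, parameterTypes);
            } catch (NoSuchMethodException e) {
                throw new IllegalArgumentException(e);
            }
            // If another thread raced us, keep the instance it stored.
            Method previous = cache.putIfAbsent(key, method);
            if (previous != null) {
                method = previous;
            }
        }
        return method;
    }
}
```

Class.getMethod hands back a fresh Method object on each call, so getting the same instance from two lookups shows the cache was hit. Whether the saving matters in practice depends on the rest of the call path; the numbers quoted above are the reporter's.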





[jira] [Commented] (HBASE-6362) Enhance test-patch.sh script to recognize images / non-trunk patches

2012-07-11 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412057#comment-13412057
 ] 

Zhihong Ted Yu commented on HBASE-6362:
---

See 
https://issues.apache.org/jira/browse/HBASE-5151?focusedCommentId=13412056&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13412056
 for one reason we need versioning in patch filenames.

When versioning is in play, the above regex wouldn't pick up all the patches.

> Enhance test-patch.sh script to recognize images / non-trunk patches
> 
>
> Key: HBASE-6362
> URL: https://issues.apache.org/jira/browse/HBASE-6362
> Project: HBase
>  Issue Type: Bug
>Reporter: Zhihong Ted Yu
>
> When user uploads logs / images / non-trunk patches, Hadoop QA would complain 
> that the file couldn't be applied as a patch (for trunk).
> We should make this script smarter by recognizing image files and non-trunk 
> patches.





[jira] [Commented] (HBASE-5151) Rename "hbase.skip.errors" in HRegion as it is too general-sounding.

2012-07-11 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412056#comment-13412056
 ] 

Zhihong Ted Yu commented on HBASE-5151:
---

It turns out that the first patch was syntactically correct.
Harsh added something in patch v2 which wouldn't pass compilation.

Currently Hadoop QA doesn't post back if there is a compilation error.
However, Stack wasn't aware of the above and integrated patch v2.

This is another reason we need versioning in patch filenames so that such 
mistakes can be more easily avoided.

> Rename "hbase.skip.errors" in HRegion as it is too general-sounding.
> 
>
> Key: HBASE-5151
> URL: https://issues.apache.org/jira/browse/HBASE-5151
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 0.94.0
>Reporter: Harsh J
>Assignee: Harsh J
> Fix For: 0.96.0
>
> Attachments: HBASE-5151.amend.patch, HBASE-5151.amend.wrapped.patch, 
> HBASE-5151.patch, HBASE-5151.patch, HBASE-5151.patch
>
>
> We should rename "hbase.skip.errors", used in HRegion.java for skipping 
> errors when replaying edits. It should probably be something more like 
> "hbase.hregion.edits.replay.skip.errors" or so.





[jira] [Commented] (HBASE-6368) Upgrade Guava for critical performance bug fix

2012-07-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412044#comment-13412044
 ] 

Hudson commented on HBASE-6368:
---

Integrated in HBase-TRUNK #3119 (See 
[https://builds.apache.org/job/HBase-TRUNK/3119/])
HBASE-6368 Upgrade Guava for critical performance bug fix (Revision 1360386)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/trunk/pom.xml


> Upgrade Guava for critical performance bug fix
> --
>
> Key: HBASE-6368
> URL: https://issues.apache.org/jira/browse/HBASE-6368
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Zhihong Ted Yu
>Assignee: Zhihong Ted Yu
>Priority: Critical
> Attachments: 6368-trunk.txt
>
>
> The bug is http://code.google.com/p/guava-libraries/issues/detail?id=1055
> See discussion under 'Upgrade to Guava 12.0.1: Performance bug in 
> CacheBuilder/LoadingCache fixed!'





[jira] [Commented] (HBASE-5151) Rename "hbase.skip.errors" in HRegion as it is too general-sounding.

2012-07-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412043#comment-13412043
 ] 

Hudson commented on HBASE-5151:
---

Integrated in HBase-TRUNK #3119 (See 
[https://builds.apache.org/job/HBase-TRUNK/3119/])
HBASE-5151 Rename hbase.skip.errors in HRegion as it is too 
general-sounding (Revision 1360384)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java


> Rename "hbase.skip.errors" in HRegion as it is too general-sounding.
> 
>
> Key: HBASE-5151
> URL: https://issues.apache.org/jira/browse/HBASE-5151
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 0.94.0
>Reporter: Harsh J
>Assignee: Harsh J
> Fix For: 0.96.0
>
> Attachments: HBASE-5151.amend.patch, HBASE-5151.amend.wrapped.patch, 
> HBASE-5151.patch, HBASE-5151.patch, HBASE-5151.patch
>
>
> We should rename "hbase.skip.errors", used in HRegion.java for skipping 
> errors when replaying edits. It should probably be something more like 
> "hbase.hregion.edits.replay.skip.errors" or so.





[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception

2012-07-11 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412041#comment-13412041
 ] 

stack commented on HBASE-5883:
--

@Jieshan So, what do we need to do to close this issue out?  What do we need to 
apply?  Thanks.

> Backup master is going down due to connection refused exception
> ---
>
> Key: HBASE-5883
> URL: https://issues.apache.org/jira/browse/HBASE-5883
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.6, 0.92.1, 0.94.0
>Reporter: Gopinathan A
>Assignee: Jieshan Bean
> Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.2
>
> Attachments: 90-addendum.patch, 92-addendum.patch, 94-addendum.patch, 
> HBASE-5883-90.patch, HBASE-5883-92.patch, HBASE-5883-94.patch, 
> HBASE-5883-trunk.patch, trunk-addendum.patch
>
>
> The active master node's network was down for some time (this node hosts 
> Master, DN, ZK, RS). The backup node got the notification and started to 
> become active. Immediately afterwards, the backup node aborted with the 
> exception below.
> {noformat}
> 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
> finished splitting (more than or equal to) 861248320 bytes in 4 log files in 
> [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting]
>  in 26374ms
> 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
> server abort: loaded coprocessors are: []
> 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unhandled exception. Starting shutdown.
> java.io.IOException: java.net.ConnectException: Connection refused
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
>   at $Proxy13.getProtocolVersion(Unknown Source)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
>   at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
>   at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
>   at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
>   at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
>   at 
> org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
>   at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
>   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
>   ... 20 more
> 2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: 
> Stopping service threads
> {noformat}





[jira] [Updated] (HBASE-5883) Backup master is going down due to connection refused exception

2012-07-11 Thread Gregory Chanan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gregory Chanan updated HBASE-5883:
--

Fix Version/s: 0.92.2
   0.90.7

Adding 0.92.2 and 0.90.7 to fix version, as this was originally checked in 
under those versions.  I'm also unclear what needs to be done to get this 
resolved, but it should be done to 0.90.7 and 0.92.2 as well.

> Backup master is going down due to connection refused exception
> ---
>
> Key: HBASE-5883
> URL: https://issues.apache.org/jira/browse/HBASE-5883
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.6, 0.92.1, 0.94.0
>Reporter: Gopinathan A
>Assignee: Jieshan Bean
> Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.2
>
> Attachments: 90-addendum.patch, 92-addendum.patch, 94-addendum.patch, 
> HBASE-5883-90.patch, HBASE-5883-92.patch, HBASE-5883-94.patch, 
> HBASE-5883-trunk.patch, trunk-addendum.patch
>
>
> The active master node's network was down for some time (this node hosts 
> Master, DN, ZK, RS). The backup node got the notification and started to 
> become active. Immediately afterwards, the backup node aborted with the 
> exception below.
> {noformat}
> 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
> finished splitting (more than or equal to) 861248320 bytes in 4 log files in 
> [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting]
>  in 26374ms
> 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
> server abort: loaded coprocessors are: []
> 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: 
> Unhandled exception. Starting shutdown.
> java.io.IOException: java.net.ConnectException: Connection refused
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
>   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
>   at $Proxy13.getProtocolVersion(Unknown Source)
>   at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
>   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
>   at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
>   at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
>   at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
>   at 
> org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
>   at 
> org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
>   at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
>   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
>   at java.lang.Thread.run(Thread.java:662)
> Caused by: java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
>   ... 20 more
> 2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: 
> Stopping service threads
> {noformat}


[jira] [Commented] (HBASE-6368) Upgrade Guava for critical performance bug fix

2012-07-11 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412026#comment-13412026
 ] 

Zhihong Ted Yu commented on HBASE-6368:
---

The following step passed Hadoop QA:
{code}
==
Checking against hadoop 2.0 build
==
{code}

> Upgrade Guava for critical performance bug fix
> --
>
> Key: HBASE-6368
> URL: https://issues.apache.org/jira/browse/HBASE-6368
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Zhihong Ted Yu
>Assignee: Zhihong Ted Yu
>Priority: Critical
> Attachments: 6368-trunk.txt
>
>
> The bug is http://code.google.com/p/guava-libraries/issues/detail?id=1055
> See discussion under 'Upgrade to Guava 12.0.1: Performance bug in 
> CacheBuilder/LoadingCache fixed!'





[jira] [Commented] (HBASE-6368) Upgrade Guava for critical performance bug fix

2012-07-11 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412025#comment-13412025
 ] 

stack commented on HBASE-6368:
--

Will this undo the work done over in HBASE-5955?  Does this break our compiling 
against hadoop-2.0.x?

> Upgrade Guava for critical performance bug fix
> --
>
> Key: HBASE-6368
> URL: https://issues.apache.org/jira/browse/HBASE-6368
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Zhihong Ted Yu
>Assignee: Zhihong Ted Yu
>Priority: Critical
> Attachments: 6368-trunk.txt
>
>
> The bug is http://code.google.com/p/guava-libraries/issues/detail?id=1055
> See discussion under 'Upgrade to Guava 12.0.1: Performance bug in 
> CacheBuilder/LoadingCache fixed!'





[jira] [Updated] (HBASE-6374) [89-fb] Unify the multi-put/get/delete path so there is only one call to each RS, instead of one call per region

2012-07-11 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu updated HBASE-6374:
--

Fix Version/s: 0.89-fb
  Summary: [89-fb] Unify the multi-put/get/delete path so there is only 
one call to each RS, instead of one call per region  (was: integrate the 
multi-put/get/delete path so there is only one call to each RS, instead of one 
call per R)

> [89-fb] Unify the multi-put/get/delete path so there is only one call to each 
> RS, instead of one call per region
> 
>
> Key: HBASE-6374
> URL: https://issues.apache.org/jira/browse/HBASE-6374
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.89-fb
>Reporter: Amitanand Aiyer
>Assignee: Amitanand Aiyer
>Priority: Minor
> Fix For: 0.89-fb
>
>
> This is a feature similar to the batch feature in trunk. 
> We have an optimisation for the put path where we batch puts by 
> regionserver, but for gets and deletes we batch only per hregion. So, if 
> there are 20 regions on a regionserver, we would be doing 20 RPCs when we 
> could potentially batch them together in 1 call.





[jira] [Created] (HBASE-6374) integrate the multi-put/get/delete path so there is only one call to each RS, instead of one call per R

2012-07-11 Thread Amitanand Aiyer (JIRA)
Amitanand Aiyer created HBASE-6374:
--

 Summary: integrate the multi-put/get/delete path so there is only 
one call to each RS, instead of one call per R
 Key: HBASE-6374
 URL: https://issues.apache.org/jira/browse/HBASE-6374
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.89-fb
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Minor


This is a feature similar to the batch feature in trunk. 

We have an optimisation for the put path where we batch puts by regionserver, 
but for gets and deletes we batch only per hregion. So, if there are 20 
regions on a regionserver, we would be doing 20 RPCs when we could potentially 
batch them together in 1 call.
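
The difference between the two batching strategies can be sketched as below. This is only an illustration under assumptions: BatchByServer and Op are hypothetical types, server and region are plain strings, and each bucket is taken to correspond to one RPC. It does not show the 89-fb client code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BatchByServer {
    // A get or delete tagged with the (hypothetical) server and region
    // names it is routed to.
    static final class Op {
        final String server;
        final String region;
        final String row;

        Op(String server, String region, String row) {
            this.server = server;
            this.region = region;
            this.row = row;
        }
    }

    // Current behaviour for gets and deletes: one bucket (one RPC) per region.
    static Map<String, List<Op>> groupByRegion(List<Op> ops) {
        Map<String, List<Op>> buckets = new LinkedHashMap<String, List<Op>>();
        for (Op op : ops) {
            add(buckets, op.region, op);
        }
        return buckets;
    }

    // Proposed behaviour: one bucket (one RPC) per regionserver.
    static Map<String, List<Op>> groupByServer(List<Op> ops) {
        Map<String, List<Op>> buckets = new LinkedHashMap<String, List<Op>>();
        for (Op op : ops) {
            add(buckets, op.server, op);
        }
        return buckets;
    }

    private static void add(Map<String, List<Op>> buckets, String key, Op op) {
        List<Op> bucket = buckets.get(key);
        if (bucket == null) {
            bucket = new ArrayList<Op>();
            buckets.put(key, bucket);
        }
        bucket.add(op);
    }
}
```

For 20 operations that touch 20 regions hosted on one regionserver, groupByRegion yields 20 buckets and groupByServer yields 1, which matches the 20-versus-1 RPC count in the description.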







[jira] [Commented] (HBASE-6368) Upgrade Guava for critical performance bug fix

2012-07-11 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412003#comment-13412003
 ] 

Zhihong Ted Yu commented on HBASE-6368:
---

Thanks for the reminder.
I logged MAPREDUCE-4429

> Upgrade Guava for critical performance bug fix
> --
>
> Key: HBASE-6368
> URL: https://issues.apache.org/jira/browse/HBASE-6368
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Zhihong Ted Yu
>Assignee: Zhihong Ted Yu
>Priority: Critical
> Attachments: 6368-trunk.txt
>
>
> The bug is http://code.google.com/p/guava-libraries/issues/detail?id=1055
> See discussion under 'Upgrade to Guava 12.0.1: Performance bug in 
> CacheBuilder/LoadingCache fixed!'





[jira] [Commented] (HBASE-2315) BookKeeper for write-ahead logging

2012-07-11 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411998#comment-13411998
 ] 

Zhihong Ted Yu commented on HBASE-2315:
---

@Flavio:
Looking at the attached patch:
{code}
+public void init(FileSystem fs, Path path, Configuration conf){
{code}
Parameter fs isn't used.

Further, you implemented HLog.Reader and HLog.Writer. I don't see where HLog is 
constructed.

Thanks

> BookKeeper for write-ahead logging
> --
>
> Key: HBASE-2315
> URL: https://issues.apache.org/jira/browse/HBASE-2315
> Project: HBase
>  Issue Type: New Feature
>  Components: regionserver
>Reporter: Flavio Junqueira
> Attachments: HBASE-2315.patch, bookkeeperOverview.pdf, 
> zookeeper-dev-bookkeeper.jar
>
>
> BookKeeper, a contrib of the ZooKeeper project, is a fault tolerant and high 
> throughput write-ahead logging service. This issue provides an implementation 
> of write-ahead logging for hbase using BookKeeper. Apart from expected 
> throughput improvements, BookKeeper also has stronger durability guarantees 
> compared to the implementation currently used by hbase.





[jira] [Commented] (HBASE-6368) Upgrade Guava for critical performance bug fix

2012-07-11 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411990#comment-13411990
 ] 

Lars Hofhansl commented on HBASE-6368:
--

How does this mingle with newer versions of Hadoop (0.22+, which have Guava 
11.0.2)?

> Upgrade Guava for critical performance bug fix
> --
>
> Key: HBASE-6368
> URL: https://issues.apache.org/jira/browse/HBASE-6368
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Zhihong Ted Yu
>Assignee: Zhihong Ted Yu
>Priority: Critical
> Attachments: 6368-trunk.txt
>
>
> The bug is http://code.google.com/p/guava-libraries/issues/detail?id=1055
> See discussion under 'Upgrade to Guava 12.0.1: Performance bug in 
> CacheBuilder/LoadingCache fixed!'





[jira] [Commented] (HBASE-6368) Upgrade Guava for critical performance bug fix

2012-07-11 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411978#comment-13411978
 ] 

Zhihong Ted Yu commented on HBASE-6368:
---

Integrated to trunk.

> Upgrade Guava for critical performance bug fix
> --
>
> Key: HBASE-6368
> URL: https://issues.apache.org/jira/browse/HBASE-6368
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Zhihong Ted Yu
>Assignee: Zhihong Ted Yu
>Priority: Critical
> Attachments: 6368-trunk.txt
>
>
> The bug is http://code.google.com/p/guava-libraries/issues/detail?id=1055
> See discussion under 'Upgrade to Guava 12.0.1: Performance bug in 
> CacheBuilder/LoadingCache fixed!'





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411979#comment-13411979
 ] 

Elliott Clark commented on HBASE-4050:
--

@Luke
Yes.  We have per-region metrics, per-replication-stream metrics, and 
per-schema metrics.  There might be others that I'm missing, but those are the 
ones I have touched.
In addition, we will probably be implementing our own metrics classes.  Right 
now we have MetricsHistogram, which is based on metrics1.

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411975#comment-13411975
 ] 

Luke Lu commented on HBASE-4050:


@Elliott: MetricsBuilder/Collector is only needed for creating metrics 
dynamically and for implementing new Mutable metrics. Does HBase need to define 
new metrics at run time?

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-5151) Rename "hbase.skip.errors" in HRegion as it is too general-sounding.

2012-07-11 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411966#comment-13411966
 ] 

stack commented on HBASE-5151:
--

I applied the amendment.  Thanks Harsh.

> Rename "hbase.skip.errors" in HRegion as it is too general-sounding.
> 
>
> Key: HBASE-5151
> URL: https://issues.apache.org/jira/browse/HBASE-5151
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 0.94.0
>Reporter: Harsh J
>Assignee: Harsh J
> Fix For: 0.96.0
>
> Attachments: HBASE-5151.amend.patch, HBASE-5151.amend.wrapped.patch, 
> HBASE-5151.patch, HBASE-5151.patch, HBASE-5151.patch
>
>
> We should rename "hbase.skip.errors", used in HRegion.java for skipping 
> errors when replaying edits. It should probably be something more like 
> "hbase.hregion.edits.replay.skip.errors" or so.

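A rename like this usually raises the question of what happens to sites that 
still set the old key. The following is only a minimal sketch of one way to 
handle that, not what the attached patches do: it assumes a fallback to the 
old key is wanted, uses java.util.Properties as a stand-in for the HBase 
configuration object, and the class and method names are hypothetical. The new 
key name is the one the issue description floats as a suggestion.

```java
import java.util.Properties;

// Hypothetical helper, not HBase code: resolves a renamed boolean setting,
// preferring the new key and falling back to the old one if only it is set.
final class SkipErrorsConfigSketch {
    static final String OLD_KEY = "hbase.skip.errors";
    // The issue only says the name should be "something more like" this.
    static final String NEW_KEY = "hbase.hregion.edits.replay.skip.errors";

    static boolean skipErrors(Properties conf, boolean defaultValue) {
        String value = conf.getProperty(NEW_KEY);
        if (value == null) {
            // Assumption: honor the old key so existing configs keep working.
            value = conf.getProperty(OLD_KEY);
        }
        return value == null ? defaultValue : Boolean.parseBoolean(value.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(OLD_KEY, "true");
        System.out.println(skipErrors(conf, false)); // prints "true"
    }
}
```

If both keys are present, the new one wins in this sketch; whether the old key 
should be honored at all is a policy choice the issue does not settle.
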




[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411963#comment-13411963
 ] 

Elliott Clark commented on HBASE-4050:
--

@Jonathan
Thanks.

@Luke Lu
Unfortunately we need to shim a little bit more since the actual MetricsBuilder 
has also changed.  I'm trying to have something ready to show in a little bit.

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-5151) Rename "hbase.skip.errors" in HRegion as it is too general-sounding.

2012-07-11 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411959#comment-13411959
 ] 

stack commented on HBASE-5151:
--

@Harsh No need to apologize.  Thanks for the fast turnaround.  Applying the 
amendment.

> Rename "hbase.skip.errors" in HRegion as it is too general-sounding.
> 
>
> Key: HBASE-5151
> URL: https://issues.apache.org/jira/browse/HBASE-5151
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 0.94.0
>Reporter: Harsh J
>Assignee: Harsh J
> Fix For: 0.96.0
>
> Attachments: HBASE-5151.amend.patch, HBASE-5151.amend.wrapped.patch, 
> HBASE-5151.patch, HBASE-5151.patch, HBASE-5151.patch
>
>
> We should rename "hbase.skip.errors", used in HRegion.java for skipping 
> errors when replaying edits. It should probably be something more like 
> "hbase.hregion.edits.replay.skip.errors" or so.





[jira] [Updated] (HBASE-6373) Add more context information to audit log messages

2012-07-11 Thread Marcelo Vanzin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcelo Vanzin updated HBASE-6373:
--

Attachment: accesscontroller.patch

Updated patch (empty string instead of null if remote address not available).
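
The null-versus-empty-string choice can be illustrated with a small sketch. 
This is not the code in accesscontroller.patch: the class name, method names, 
and message layout below are hypothetical, and it assumes only that the remote 
address arrives as a java.net.InetAddress that may be null.

```java
import java.net.InetAddress;

// Hypothetical sketch, not the attached patch: builds an audit line that
// carries the caller's IP address when one is known.
final class AuditLogSketch {
    // Returns the textual IP address, or "" when no remote address is known,
    // so the literal string "null" never ends up in the log.
    static String remoteAddress(InetAddress addr) {
        return addr == null ? "" : addr.getHostAddress();
    }

    // The "key=value" layout is an assumption made for this example.
    static String auditMessage(String user, String action, InetAddress remote) {
        return "user=" + user + " action=" + action
                + " remote=" + remoteAddress(remote);
    }

    public static void main(String[] args) {
        System.out.println(auditMessage("alice", "get", null));
        System.out.println(auditMessage("alice", "get", InetAddress.getLoopbackAddress()));
    }
}
```

An empty field keeps the line easy to parse when the request has no remote 
address, for example a call made from inside the server process.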

> Add more context information to audit log messages
> --
>
> Key: HBASE-6373
> URL: https://issues.apache.org/jira/browse/HBASE-6373
> Project: HBase
>  Issue Type: Improvement
>  Components: security
>Reporter: Marcelo Vanzin
>Priority: Minor
> Attachments: accesscontroller.patch, accesscontroller.patch
>
>
> The attached patch adds more information to the audit log messages; namely, 
> it includes the IP address where the request originated, if it's available.
> The patch is against trunk, but I've tested it against the 0.92 branch. I 
> didn't find any unit test for this code, please let me know if I missed 
> something.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411952#comment-13411952
 ] 

Luke Lu commented on HBASE-4050:


+1 on Alex's option 2 and ServiceLoader. For HBase we only need to implement 
shims for metrics sources, so an interface for the registry and the *Mutable* 
classes would suffice.
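
As a rough illustration of the ServiceLoader approach, here is a minimal 
sketch. All type names in it (MetricsRegistryShim, InMemoryRegistryShim, 
MetricsShimLoader) are hypothetical and are not HBase or Hadoop classes; the 
only real API used is java.util.ServiceLoader. It assumes each Hadoop-specific 
shim jar would register its implementation under META-INF/services, and it 
falls back to an in-memory registry when no provider is found.

```java
import java.util.Iterator;
import java.util.ServiceLoader;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical shim interface: the part of a metrics registry that a
// metrics source would code against.
interface MetricsRegistryShim {
    void incrementCounter(String name, long delta);
    long counterValue(String name);
}

// Fallback used when no provider is registered on the classpath.
final class InMemoryRegistryShim implements MetricsRegistryShim {
    private final ConcurrentHashMap<String, AtomicLong> counters =
            new ConcurrentHashMap<>();

    @Override
    public void incrementCounter(String name, long delta) {
        counters.computeIfAbsent(name, k -> new AtomicLong()).addAndGet(delta);
    }

    @Override
    public long counterValue(String name) {
        AtomicLong counter = counters.get(name);
        return counter == null ? 0L : counter.get();
    }
}

final class MetricsShimLoader {
    // Picks the first implementation that ServiceLoader discovers, or the
    // in-memory fallback when none is registered.
    static MetricsRegistryShim load() {
        Iterator<MetricsRegistryShim> providers =
                ServiceLoader.load(MetricsRegistryShim.class).iterator();
        return providers.hasNext() ? providers.next() : new InMemoryRegistryShim();
    }
}
```

With this shape, the calling code depends only on the interface, and swapping 
the backing metrics framework becomes a matter of which shim jar is on the 
classpath.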


> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-11 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411949#comment-13411949
 ] 

Jonathan Hsieh commented on HBASE-4050:
---

bq. bq. hbase needs to be recompiled to run against hadoop 2.0 hdfs
bq. When did this happen? In the pom, all that changes for the hadoop 2.0 
compile is that some dependencies are changed. I just tried on a local machine 
only, and just by changing the libs dir I was able to run either version 
(standalone only, so not really an exhaustive test, I know)

Here's one reason:
https://issues.apache.org/jira/browse/HBASE-5861?focusedCommentId=13259785&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13259785
Here's another: HDFS-1620/HDFS-2412


> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.




