[jira] [Commented] (HBASE-9047) Tool to handle finishing replication when the cluster is offline

2013-12-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842122#comment-13842122
 ] 

Lars Hofhansl commented on HBASE-9047:
--

Looks good (this is actually very clever I think). The only concern is the 
potential cost of this:
{code}
+  // In the case of disaster/recovery, HMaster may be shutdown/crashed before flush data
+  // from .logs to .oldlogs. Loop into .logs folders and check whether a match exists
+  FileStatus[] rss = fs.listStatus(manager.getLogDir());
+  for (FileStatus rs : rss) {
+    Path p = rs.getPath();
+    FileStatus[] logs = fs.listStatus(p);
...
{code}

Any way we can restrict this to the case when we run this tool? Or maybe it's 
not a problem since we only get here when we did not find a log to begin with...?
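
One way to read the question above is to keep the cheap lookup first and put the directory walk behind an explicit flag. The following is only a sketch in plain Java under stated assumptions: LogLocator, the offlineTool flag, and the in-memory collections standing in for .oldlogs and the per-region-server .logs folders are all hypothetical and are not taken from the patch.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration; none of these names come from the HBASE-9047 patch.
final class LogLocator {
  /**
   * Looks for logName in the archived logs first. The more expensive walk over
   * every region server folder runs only on a miss, and only when the caller
   * says it is the offline tool.
   */
  static Optional<String> locate(String logName,
                                 List<String> archivedLogs,
                                 Map<String, List<String>> logsByRegionServer,
                                 boolean offlineTool) {
    if (archivedLogs.contains(logName)) {
      return Optional.of(".oldlogs/" + logName);
    }
    if (!offlineTool) {
      return Optional.empty(); // normal operation: skip the per-server scan
    }
    for (Map.Entry<String, List<String>> rs : logsByRegionServer.entrySet()) {
      if (rs.getValue().contains(logName)) {
        return Optional.of(".logs/" + rs.getKey() + "/" + logName);
      }
    }
    return Optional.empty();
  }
}
```

With this shape the extra listing cost is paid only by the tool, and only for logs that were not found in the archive.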

> Tool to handle finishing replication when the cluster is offline
> 
>
> Key: HBASE-9047
> URL: https://issues.apache.org/jira/browse/HBASE-9047
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.96.0
>Reporter: Jean-Daniel Cryans
>Assignee: Demai Ni
> Fix For: 0.98.0
>
> Attachments: HBASE-9047-0.94.9-v0.PATCH, HBASE-9047-trunk-v0.patch, 
> HBASE-9047-trunk-v1.patch, HBASE-9047-trunk-v2.patch, 
> HBASE-9047-trunk-v3.patch, HBASE-9047-trunk-v4.patch, 
> HBASE-9047-trunk-v4.patch, HBASE-9047-trunk-v5.patch
>
>
> We're having a discussion on the mailing list about replicating the data on a 
> cluster that was shut down in an offline fashion. The motivation could be 
> that you don't want to bring HBase back up but still need that data on the 
> slave.
> So I have this idea of a tool that would be running on the master cluster 
> while it is down, although it could also run at any time. Basically it would 
> be able to read the replication state of each master region server, finish 
> replicating what's missing to all the slaves, and then clear that state in 
> zookeeper.
> The code that handles replication does most of that already, see 
> ReplicationSourceManager and ReplicationSource. Basically when 
> ReplicationSourceManager.init() is called, it will check all the queues in ZK 
> and try to grab those that aren't attached to a region server. If the whole 
> cluster is down, it will grab all of them.
> The beautiful thing here is that you could start that tool on all your 
> machines and the load will be spread out, but that might not be a big concern 
> if replication wasn't lagging since it would take a few seconds to finish 
> replicating the missing data for each region server.
> I'm guessing when starting ReplicationSourceManager you'd give it a fake 
> region server ID, and you'd tell it not to start its own source.
> FWIW the main difference in how replication is handled between Apache's HBase 
> and Facebook's is that the latter is always done separately from HBase itself. 
> This jira isn't about doing that.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Resolved] (HBASE-10093) Unregister ReplicationSource metric bean when the replication source thread is terminated

2013-12-06 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-10093.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

Committed to 0.94. Thanks for the patch.

> Unregister ReplicationSource metric bean when the replication source thread 
> is terminated 
> --
>
> Key: HBASE-10093
> URL: https://issues.apache.org/jira/browse/HBASE-10093
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Affects Versions: 0.94.14
>Reporter: cuijianwei
>Assignee: cuijianwei
> Fix For: 0.94.15
>
> Attachments: HBASE-10093-0.94-v1.patch
>
>
> Each replication source thread registers a metric bean to show its 
> statistics. The source threads are terminated when the region server exits, 
> and the metric beans are removed. However, a replication source thread may 
> also be terminated when the user removes the peer explicitly, or when it has 
> taken over a recovered queue and finished replicating the queued HLogs. In 
> these situations the metric bean is not unregistered, and the user may be 
> confused by still seeing statistics from terminated replication source 
> threads. It may be clearer to remove the metric bean after the replication 
> source thread terminates. Then the statistics will come only from active 
> replication sources.
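
The behaviour described above can be sketched with a try/finally around the source's work, so the bean is removed on every termination path. This is a minimal illustration, not the committed patch: MetricRegistry and SourceTask are hypothetical stand-ins for the real metrics registry and ReplicationSource.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the metrics/MBean registry; not an HBase API.
final class MetricRegistry {
  private final Map<String, Object> beans = new ConcurrentHashMap<>();

  void register(String name, Object bean) { beans.put(name, bean); }
  void unregister(String name) { beans.remove(name); }
  boolean isRegistered(String name) { return beans.containsKey(name); }
}

// Hypothetical replication source: registers its bean on start and always
// unregisters it when run() returns or throws.
final class SourceTask implements Runnable {
  private final MetricRegistry registry;
  private final String beanName;
  private final Runnable work;

  SourceTask(MetricRegistry registry, String peerId, Runnable work) {
    this.registry = registry;
    this.beanName = "ReplicationSource for " + peerId;
    this.work = work;
  }

  String beanName() { return beanName; }

  @Override
  public void run() {
    registry.register(beanName, new Object());
    try {
      // Returns when the peer is removed or the recovered queue is drained.
      work.run();
    } finally {
      // Runs on normal return and on exceptions alike.
      registry.unregister(beanName);
    }
  }
}
```

A caller would start it with new Thread(task).start(); once the thread ends, only beans of still-running sources remain in the registry.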





[jira] [Commented] (HBASE-10097) Remove a region name string creation in HRegion#nextInternal

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842115#comment-13842115
 ] 

Hudson commented on HBASE-10097:


FAILURE: Integrated in hbase-0.96 #217 (See 
[https://builds.apache.org/job/hbase-0.96/217/])
HBASE-10097 Remove a region name string creation in HRegion#nextInternal 
(nkeywal: rev 1548712)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcCallContext.java
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java


> Remove a region name string creation in HRegion#nextInternal
> 
>
> Key: HBASE-10097
> URL: https://issues.apache.org/jira/browse/HBASE-10097
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.98.0, 0.96.1, 0.99.0
>Reporter: Nicolas Liochon
>Assignee: Nicolas Liochon
>Priority: Critical
> Fix For: 0.96.1, 0.98.1, 0.99.0
>
> Attachments: 10097.v1.patch
>
>
> We're creating a String in each "nextInternal". Before HBASE-9983 this was 
> cached, but it's not the case anymore...
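
For illustration, the general idea of not rebuilding a String on every call can be sketched as a lazily cached conversion. This is an assumption-laden sketch, not what the patch necessarily does: RegionNameHolder and nameAsString are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical holder; shows caching the String form of a byte[] name so that
// a hot path does not allocate a new String on every call.
final class RegionNameHolder {
  private final byte[] regionName;
  private volatile String cachedName; // computed on first use

  RegionNameHolder(byte[] regionName) {
    this.regionName = regionName.clone();
  }

  String nameAsString() {
    String name = cachedName;
    if (name == null) {
      // A benign race: two threads may both build the String, but the
      // results are equal, so either one may be kept.
      name = new String(regionName, StandardCharsets.UTF_8);
      cachedName = name;
    }
    return name;
  }
}
```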





[jira] [Commented] (HBASE-10094) Add batching to HLogPerformanceEvaluation

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842116#comment-13842116
 ] 

Hudson commented on HBASE-10094:


FAILURE: Integrated in hbase-0.96 #217 (See 
[https://builds.apache.org/job/hbase-0.96/217/])
HBASE-10094 Add batching to HLogPerformanceEvaluation (stack: rev 1548754)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* 
/hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogPerformanceEvaluation.java


> Add batching to HLogPerformanceEvaluation
> -
>
> Key: HBASE-10094
> URL: https://issues.apache.org/jira/browse/HBASE-10094
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance, wal
>Reporter: stack
>Assignee: Himanshu Vashishtha
> Fix For: 0.98.0, 0.96.1, 0.99.0
>
> Attachments: 10094v2.txt
>
>
> As Himanshu points out in the parent issue, HLogPE is using an unorthodox 
> API for appending edits to the WAL; it is using an API that is meant for tests 
> only that does an append immediately followed by a sync call.
> In normal deploy, WAL appends are done as a bunch of appends followed by a 
> sync on the tail of the transaction -- not a sync per append.
> This issue is about changing HLogPE to use append and then sync.  It also 
> adds an argument so you can specify batching of a set of appends before 
> the sync is called.  The latter lets HLogPE mimic multi puts that use the 
> minibatch... which appends, appends, appends.. and then syncs.
> Assigning to Himanshu for review.
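
The difference between a sync per append and a sync per batch can be sketched with a toy log that only counts sync calls. Everything here is hypothetical (ToyWal, BatchedWriter, the batchSize parameter); it is not the HLog API or the HLogPE code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory log; it counts sync calls so the two write patterns
// can be compared.
final class ToyWal {
  private final List<String> buffered = new ArrayList<>();
  private final List<String> durable = new ArrayList<>();
  private int syncCalls;

  void append(String edit) { buffered.add(edit); }

  void sync() {
    durable.addAll(buffered);
    buffered.clear();
    syncCalls++;
  }

  int syncCalls() { return syncCalls; }
  int durableEdits() { return durable.size(); }
}

final class BatchedWriter {
  // Appends `total` edits, syncing once per `batchSize` appends and once more
  // for any remainder. batchSize == 1 is the sync-per-append pattern.
  static ToyWal write(int total, int batchSize) {
    if (batchSize < 1) {
      throw new IllegalArgumentException("batchSize must be >= 1");
    }
    ToyWal wal = new ToyWal();
    int pending = 0;
    for (int i = 0; i < total; i++) {
      wal.append("edit-" + i);
      if (++pending == batchSize) {
        wal.sync();
        pending = 0;
      }
    }
    if (pending > 0) {
      wal.sync();
    }
    return wal;
  }
}
```

In this toy model, ten edits with a batch size of one cost ten syncs, while a batch size of four costs three; all ten edits are durable in both cases.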





[jira] [Commented] (HBASE-10061) TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842118#comment-13842118
 ] 

Hudson commented on HBASE-10061:


FAILURE: Integrated in hbase-0.96 #217 (See 
[https://builds.apache.org/job/hbase-0.96/217/])
HBASE-10061 TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) 
resulting in thrown NPE (Amit Sela) (ndimiduk: rev 1548749)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java


> TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in 
> thrown NPE
> --
>
> Key: HBASE-10061
> URL: https://issues.apache.org/jira/browse/HBASE-10061
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.94.12
>Reporter: Amit Sela
>Assignee: Amit Sela
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0
>
> Attachments: 10061-trunk.txt, 10061-trunk.txt, HBASE-10061.patch
>
>
> TableMapReduceUtil.findOrCreateJar line 596:
> jar = getJar(my_class);
> updateMap(jar, packagedClasses);
> In case getJar returns null, updateMap will throw NPE.
> Should check null==jar before calling updateMap.
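
The suggested guard can be sketched as follows. The getJar and updateMap methods below are simplified, hypothetical stand-ins with invented signatures (the real ones in TableMapReduceUtil differ); only the null check before the call reflects the description above.

```java
import java.util.Map;

final class JarLookup {
  // Hypothetical stand-in for getJar: returns null when no jar is known for the class.
  static String getJar(Class<?> clazz, Map<String, String> knownJars) {
    return knownJars.get(clazz.getName());
  }

  // Hypothetical stand-in for updateMap: it dereferences jar, so passing null
  // would throw a NullPointerException.
  static void updateMap(String jar, Map<String, String> packagedClasses) {
    packagedClasses.put(jar.toLowerCase(), jar);
  }

  // Calls updateMap only when getJar found something.
  static String findJar(Class<?> clazz,
                        Map<String, String> knownJars,
                        Map<String, String> packagedClasses) {
    String jar = getJar(clazz, knownJars);
    if (jar != null) {
      updateMap(jar, packagedClasses);
    }
    return jar;
  }
}
```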





[jira] [Commented] (HBASE-10085) Some regions aren't re-assigned after a cluster restarts

2013-12-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842117#comment-13842117
 ] 

Hudson commented on HBASE-10085:


FAILURE: Integrated in hbase-0.96 #217 (See 
[https://builds.apache.org/job/hbase-0.96/217/])
HBASE-10085: Some regions aren't re-assigned after a master restarts (jeffreyz: 
rev 1548728)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
* 
/hbase/branches/0.96/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java


> Some regions aren't re-assigned after a cluster restarts
> 
>
> Key: HBASE-10085
> URL: https://issues.apache.org/jira/browse/HBASE-10085
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.96.1
>Reporter: Jeffrey Zhong
>Assignee: Jeffrey Zhong
> Fix For: 0.98.0, 0.96.1
>
> Attachments: hbase-10085.patch
>
>
> We saw this issue during a cluster restart:
> 1) When shutting down a cluster, some regions are in offline state because no 
> region servers are available (the RSs are stopped first, then the Master).
> 2) When the cluster restarts, the offlined regions are forced offline 
> again and SSH skips re-assigning them in AM.processServerShutdown, as 
> shown below.
> {code}
> 2013-12-03 10:41:56,686 INFO  
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE
> 2013-12-03 10:41:56,686 DEBUG 
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on 
> deadserver; forcing offline
> ...
> 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force 
> region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, 
> ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
> ...
> 2013-12-03 10:41:57,223 WARN  
> [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] 
> master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected 
> {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
>  
> {code}





[jira] [Commented] (HBASE-10089) Metrics intern table names cause eventual permgen OOM in 0.94

2013-12-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842112#comment-13842112
 ] 

Lars Hofhansl commented on HBASE-10089:
---

I would still question how much memory we are actually saving. The number of 
live tables will be relatively small (certainly not more than 1000) and the 
number of metrics per table per CF is not too large either.

> Metrics intern table names cause eventual permgen OOM in 0.94
> -
>
> Key: HBASE-10089
> URL: https://issues.apache.org/jira/browse/HBASE-10089
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0, 0.94.14
>Reporter: Dave Latham
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 0.94.15
>
> Attachments: 10089-0.94.txt
>
>
> As part of the metrics system introduced in HBASE-4768 there are two places 
> that hbase uses String interning ( SchemaConfigured and SchemaMetrics ).  
> This includes interning table names.  We have long running environment where 
> we run regular integration tests on our application using hbase.  Those tests 
> create and drop tables with new names regularly.  These leads to filling up 
> the permgen with interned table names.  Workaround is to periodically restart 
> the region servers.
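
On older HotSpot JVMs (Java 6 and earlier), strings passed to String.intern() are stored in the permanent generation, which is why ever-new table names can fill it up. One possible alternative, shown here only as a sketch and not as the attached patch, is a heap-based canonicalizing map whose entries can be dropped when a table goes away. NameCache and its methods are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical heap-based replacement for String.intern(): equal names map to
// one shared instance, and entries can be removed explicitly.
final class NameCache {
  private final ConcurrentMap<String, String> canonical = new ConcurrentHashMap<>();

  // Returns the shared instance for this name, registering it on first use.
  String canonicalize(String name) {
    String previous = canonical.putIfAbsent(name, name);
    return previous != null ? previous : name;
  }

  // Drops a name, for example after its table has been deleted.
  void forget(String name) {
    canonical.remove(name);
  }

  int size() {
    return canonical.size();
  }
}
```

Whether the deduplication is worth having at all is the question raised in the comment above; the sketch only shows that it does not have to rely on intern().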





[jira] [Commented] (HBASE-10000) Initiate lease recovery for outstanding WAL files at the very beginning of recovery

2013-12-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842104#comment-13842104
 ] 

Ted Yu commented on HBASE-10000:


bq.  the expected gain in case num_log_files > available split log workers.
Right - this is what the change targets.

> Initiate lease recovery for outstanding WAL files at the very beginning of 
> recovery
> ---
>
> Key: HBASE-10000
> URL: https://issues.apache.org/jira/browse/HBASE-10000
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 0.98.1
>
> Attachments: 10000-recover-ts-with-pb-2.txt, 
> 10000-recover-ts-with-pb-3.txt, 10000-recover-ts-with-pb-4.txt, 10000-v1.txt, 
> 10000-v4.txt, 10000-v5.txt, 10000-v6.txt
>
>
> At the beginning of recovery, master can send lease recovery requests 
> concurrently for outstanding WAL files using a thread pool.
> Each split worker would first check whether the WAL file it processes is 
> closed.
> Thanks to Nicolas Liochon and Jeffrey, whose discussion gave rise to this 
> idea. 
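
The idea of issuing the lease recovery requests concurrently from a thread pool can be sketched like this. It is a hypothetical illustration only: LeaseRecoverySketch, recoverAll, and the recoverLease predicate (true meaning the file is closed) are assumptions and are not the master's or HDFS's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

final class LeaseRecoverySketch {
  /**
   * Submits one lease recovery request per WAL file to a fixed-size pool and
   * returns the files that are not yet reported closed. recoverLease is a
   * hypothetical callback that returns true when the file is closed.
   */
  static List<String> recoverAll(List<String> walFiles,
                                 Predicate<String> recoverLease,
                                 int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Boolean>> results = new ArrayList<>();
      for (String wal : walFiles) {
        Callable<Boolean> task = () -> recoverLease.test(wal);
        results.add(pool.submit(task));
      }
      List<String> stillOpen = new ArrayList<>();
      for (int i = 0; i < results.size(); i++) {
        try {
          if (!results.get(i).get()) {
            stillOpen.add(walFiles.get(i));
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          stillOpen.add(walFiles.get(i));
        } catch (ExecutionException e) {
          // A failed request is treated as "not closed yet".
          stillOpen.add(walFiles.get(i));
        }
      }
      return stillOpen;
    } finally {
      pool.shutdown();
    }
  }
}
```

A split worker would then still check whether its own file is closed before processing it, as the description says; files returned here are the ones where that check is expected to matter.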





[jira] [Commented] (HBASE-10048) Add hlog number metric in regionserver

2013-12-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842084#comment-13842084
 ] 

stack commented on HBASE-10048:
---

[~lhofhansl] I added it to trunk.  Will add to 0.94 after I shoehorn it into 
0.96 (morrow)

> Add hlog number metric in regionserver
> --
>
> Key: HBASE-10048
> URL: https://issues.apache.org/jira/browse/HBASE-10048
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.99.0
>
> Attachments: HBASE-10048-0.94-v1.diff, HBASE-10048-0.94-v2.diff, 
> HBASE-10048-trunk-v1.diff, HBASE-10048-trunk-v2.diff, 
> HBASE-10048-trunk-v3.diff, HBASE-10048-trunk-v4.diff
>
>
> Add hlog number metric in regionserver. 
> We can use this metric to alert about memstore flush because of too many 
> hlogs.





[jira] [Commented] (HBASE-10061) TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE

2013-12-06 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842081#comment-13842081
 ] 

Anoop Sam John commented on HBASE-10061:


Oh I am late. Still +1...  Thanks Amit and Nick!

> TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in 
> thrown NPE
> --
>
> Key: HBASE-10061
> URL: https://issues.apache.org/jira/browse/HBASE-10061
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.94.12
>Reporter: Amit Sela
>Assignee: Amit Sela
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0
>
> Attachments: 10061-trunk.txt, 10061-trunk.txt, HBASE-10061.patch
>
>
> TableMapReduceUtil.findOrCreateJar line 596:
> jar = getJar(my_class);
> updateMap(jar, packagedClasses);
> In case getJar returns null, updateMap will throw NPE.
> Should check null==jar before calling updateMap.





[jira] [Commented] (HBASE-10090) Master could hang in assigning meta

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842080#comment-13842080
 ] 

Hadoop QA commented on HBASE-10090:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617518/trunk-10090_v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8084//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8084//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8084//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8084//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8084//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8084//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8084//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8084//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8084//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8084//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8084//console

This message is automatically generated.

> Master could hang in assigning meta
> ---
>
> Key: HBASE-10090
> URL: https://issues.apache.org/jira/browse/HBASE-10090
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: trunk-10090.patch, trunk-10090_v2.patch
>
>
> Under very rare scenario, master could hang waiting for meta to be assigned 
> while the meta server is dead.





[jira] [Updated] (HBASE-10101) testOfflineRegionReAssginedAfterMasterRestart times out sometimes.

2013-12-06 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-10101:


Attachment: test.log

Here is the right log.

> testOfflineRegionReAssginedAfterMasterRestart times out sometimes.
> --
>
> Key: HBASE-10101
> URL: https://issues.apache.org/jira/browse/HBASE-10101
> Project: HBase
>  Issue Type: Test
>Reporter: Jimmy Xiang
>Priority: Minor
> Attachments: test.log
>
>
> Sometimes this test times out. The log is attached. It could be 
> because the new cluster takes a while to process the dead server, or to 
> assign meta.





[jira] [Commented] (HBASE-10059) TestSplitLogWorker#testMultipleTasks fails occasionally

2013-12-06 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842072#comment-13842072
 ] 

Jimmy Xiang commented on HBASE-10059:
-

+1

> TestSplitLogWorker#testMultipleTasks fails occasionally
> ---
>
> Key: HBASE-10059
> URL: https://issues.apache.org/jira/browse/HBASE-10059
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Jeffrey Zhong
> Attachments: hbase-10059.patch
>
>
> From 
> https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/857/testReport/junit/org.apache.hadoop.hbase.regionserver/TestSplitLogWorker/testMultipleTasks/
>  :
> {code}
> 2013-11-30 01:13:23,022 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): 
> before: regionserver.TestSplitLogWorker#testMultipleTasks Thread=16, 
> OpenFileDescriptor=157, MaxFileDescriptor=4, SystemLoadAverage=338, 
> ProcessCount=144, AvailableMemoryMB=1474, ConnectionCount=0
> 2013-11-30 01:13:23,026 INFO  [pool-1-thread-1] 
> zookeeper.MiniZooKeeperCluster(200): Started MiniZK Cluster and connect 1 ZK 
> server on client port: 53800
> 2013-11-30 01:13:23,029 INFO  [pool-1-thread-1] 
> zookeeper.RecoverableZooKeeper(120): Process 
> identifier=split-log-worker-tests connecting to ZooKeeper 
> ensemble=localhost:53800
> 2013-11-30 01:13:23,249 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(310): split-log-worker-tests, 
> quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, type=None, 
> state=SyncConnected, path=null
> 2013-11-30 01:13:23,251 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(387): split-log-worker-tests-0x142a6913350 
> connected
> 2013-11-30 01:13:23,261 DEBUG [pool-1-thread-1] 
> regionserver.TestSplitLogWorker(105): /hbase created
> 2013-11-30 01:13:23,270 DEBUG [pool-1-thread-1] 
> regionserver.TestSplitLogWorker(108): /hbase/splitWAL created
> 2013-11-30 01:13:23,278 DEBUG [pool-1-thread-1] executor.ExecutorService(99): 
> Starting executor service name=RS_LOG_REPLAY_OPS-TestSplitLogWorker, 
> corePoolSize=10, maxPoolSize=10
> 2013-11-30 01:13:23,278 INFO  [pool-1-thread-1] 
> regionserver.TestSplitLogWorker(246): testMultipleTasks
> 2013-11-30 01:13:23,280 INFO  [SplitLogWorker-tmt_svr,1,1] 
> regionserver.SplitLogWorker(175): SplitLogWorker tmt_svr,1,1 starting
> 2013-11-30 01:13:23,380 INFO  [pool-1-thread-1] hbase.Waiter(174): Waiting up 
> to [1,500] milli-secs(wait.for.ratio=[1])
> 2013-11-30 01:13:23,394 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(310): split-log-worker-tests-0x142a6913350, 
> quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, 
> type=NodeChildrenChanged, state=SyncConnected, path=/hbase/splitWAL
> 2013-11-30 01:13:23,394 DEBUG [pool-1-thread-1-EventThread] 
> regionserver.SplitLogWorker(595): tasks arrived or departed
> 2013-11-30 01:13:23,394 INFO  [pool-1-thread-1] hbase.Waiter(174): Waiting up 
> to [1,500] milli-secs(wait.for.ratio=[1])
> 2013-11-30 01:13:23,402 INFO  [SplitLogWorker-tmt_svr,1,1] 
> regionserver.SplitLogWorker(363): worker tmt_svr,1,1 acquired task 
> /hbase/splitWAL/tmt_task
> 2013-11-30 01:13:23,410 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(310): split-log-worker-tests-0x142a6913350, 
> quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, 
> type=NodeChildrenChanged, state=SyncConnected, path=/hbase/splitWAL
> 2013-11-30 01:13:23,410 DEBUG [pool-1-thread-1-EventThread] 
> regionserver.SplitLogWorker(595): tasks arrived or departed
> 2013-11-30 01:13:23,418 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(310): split-log-worker-tests-0x142a6913350, 
> quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, 
> type=NodeDataChanged, state=SyncConnected, path=/hbase/splitWAL/tmt_task
> 2013-11-30 01:13:23,419 INFO  [pool-1-thread-1] hbase.Waiter(174): Waiting up 
> to [1,500] milli-secs(wait.for.ratio=[1])
> 2013-11-30 01:13:23,420 INFO  [pool-1-thread-1-EventThread] 
> regionserver.SplitLogWorker(522): task /hbase/splitWAL/tmt_task preempted 
> from tmt_svr,1,1, current task state and owner=OWNED another-worker,1,1
> 2013-11-30 01:13:23,420 INFO  [pool-1-thread-1-EventThread] 
> regionserver.SplitLogWorker(608): Sending interrupt to stop the worker thread
> 2013-11-30 01:13:23,420 WARN  [SplitLogWorker-tmt_svr,1,1] 
> regionserver.SplitLogWorker(374): Interrupted while yielding for other region 
> servers
> java.lang.InterruptedException: sleep interrupted
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:372)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:251)
>   at 
> org.apache.hadoop.hbase.regionserve

[jira] [Commented] (HBASE-10103) TestNodeHealthCheckChore#testRSHealthChore: Stoppable must have been stopped

2013-12-06 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842059#comment-13842059
 ] 

Andrew Purtell commented on HBASE-10103:


Test output, will come back later to look at cause, maybe RS initialization 
doesn't happen fast enough for this test now that we are using Hadoop 2 as 
default:

{noformat}
2013-12-07 02:22:11,094 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): before: TestNodeHealthCheckChore#testRSHealthChore Thread=114, OpenFileDescriptor=796, MaxFileDescriptor=65536, SystemLoadAverage=119, ProcessCount=73, AvailableMemoryMB=1744, ConnectionCount=2
2013-12-07 02:22:11,096 INFO  [pool-1-thread-1] hbase.TestNodeHealthCheckChore(129): Created /data/src/hbase/hbase-server/target/test-data/bc5c27a4-678b-4600-9279-e471aebe046f/HealthScript9595afc4-7f2a-4054-be8a-6af3a2993645.sh, executable=true
2013-12-07 02:22:11,097 INFO  [pool-1-thread-1] hbase.HealthCheckChore(42): Health Check Chore runs every 0sec
2013-12-07 02:22:11,097 INFO  [pool-1-thread-1] hbase.HealthChecker(68): HealthChecker initialized with script at /data/src/hbase/hbase-server/target/test-data/bc5c27a4-678b-4600-9279-e471aebe046f/HealthScript9595afc4-7f2a-4054-be8a-6af3a2993645.sh, timeout=2000
2013-12-07 02:22:11,446 INFO  [pool-1-thread-1] hbase.HealthCheckChore(65): Health status at 385106hrs, 22mins, 11sec : ERROR
Server not healthy
2013-12-07 02:22:11,793 INFO  [pool-1-thread-1] hbase.HealthCheckChore(65): Health status at 385106hrs, 22mins, 11sec : ERROR
Server not healthy
2013-12-07 02:22:12,141 INFO  [pool-1-thread-1] hbase.HealthCheckChore(65): Health status at 385106hrs, 22mins, 12sec : ERROR
Server not healthy
2013-12-07 02:22:12,142 DEBUG [pool-1-thread-1] fs.HFileSystem(214): The file system is not a DistributedFileSystem. Skipping on block location reordering
2013-12-07 02:22:12,497 INFO  [pool-1-thread-1] hbase.ResourceChecker(171): after: TestNodeHealthCheckChore#testRSHealthChore Thread=114 (was 114), OpenFileDescriptor=798 (was 796) - OpenFileDescriptor LEAK? -, MaxFileDescriptor=65536 (was 65536), SystemLoadAverage=119 (was 119), ProcessCount=73 (was 73), AvailableMemoryMB=1744 (was 1744), ConnectionCount=2 (was 2)
{noformat}

> TestNodeHealthCheckChore#testRSHealthChore: Stoppable must have been stopped
> 
>
> Key: HBASE-10103
> URL: https://issues.apache.org/jira/browse/HBASE-10103
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Andrew Purtell
> Fix For: 0.98.0, 0.99.0
>
>
> {noformat}
> Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 623.639 sec 
> <<< FAILURE!
> testRSHealthChore(org.apache.hadoop.hbase.TestNodeHealthCheckChore)  Time 
> elapsed: 0.001 sec  <<< FAILURE!
> java.lang.AssertionError: Stoppable must have been stopped.
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at 
> org.apache.hadoop.hbase.TestNodeHealthCheckChore.testRSHealthChore(TestNodeHealthCheckChore.java:108)
> {noformat}





[jira] [Created] (HBASE-10103) TestNodeHealthCheckChore#testRSHealthChore: Stoppable must have been stopped

2013-12-06 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-10103:
--

 Summary: TestNodeHealthCheckChore#testRSHealthChore: Stoppable 
must have been stopped
 Key: HBASE-10103
 URL: https://issues.apache.org/jira/browse/HBASE-10103
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Andrew Purtell
 Fix For: 0.98.0, 0.99.0


{noformat}
Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 623.639 sec <<< 
FAILURE!
testRSHealthChore(org.apache.hadoop.hbase.TestNodeHealthCheckChore)  Time 
elapsed: 0.001 sec  <<< FAILURE!
java.lang.AssertionError: Stoppable must have been stopped.
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.hbase.TestNodeHealthCheckChore.testRSHealthChore(TestNodeHealthCheckChore.java:108)
{noformat}





[jira] [Commented] (HBASE-10059) TestSplitLogWorker#testMultipleTasks fails occasionally

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842049#comment-13842049
 ] 

Hadoop QA commented on HBASE-10059:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617522/hbase-10059.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8083//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8083//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8083//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8083//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8083//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8083//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8083//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8083//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8083//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8083//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8083//console

This message is automatically generated.

> TestSplitLogWorker#testMultipleTasks fails occasionally
> ---
>
> Key: HBASE-10059
> URL: https://issues.apache.org/jira/browse/HBASE-10059
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Jeffrey Zhong
> Attachments: hbase-10059.patch
>
>
> From 
> https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/857/testReport/junit/org.apache.hadoop.hbase.regionserver/TestSplitLogWorker/testMultipleTasks/
>  :
> {code}
> 2013-11-30 01:13:23,022 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): 
> before: regionserver.TestSplitLogWorker#testMultipleTasks Thread=16, 
> OpenFileDescriptor=157, MaxFileDescriptor=4, SystemLoadAverage=338, 
> ProcessCount=144, AvailableMemoryMB=1474, ConnectionCount=0
> 2013-11-30 01:13:23,026 INFO  [pool-1-thread-1] 
> zookeeper.MiniZooKeeperCluster(200): Started MiniZK Cluster and connect 1 ZK 
> server on client port: 53800
> 2013-11-30 01:13:23,029 INFO  [pool-1-thread-1] 
> zookeeper.RecoverableZooKeeper(120): Process 
> identifier=split-log-worker-tests connecting to ZooKeeper 
> ensemble=localhost:53800
> 2013-11-30 01:13:23,249 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(310): split-log-worker-tests, 
> quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, type=None, 
> state=SyncConnected, path=null
> 2013-11-30 01:13:23,251 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(387): split-log-worker-tests-0x142a6913350 
> connected
> 2013-11-30 01:13:23,261 DEBUG [pool-1-thread-1] 
> regionserver.TestSplitLogWorker(105): /hbase created
> 2013-11-30 01:13:23,270 DEBUG [pool-1-thread-1] 
> regionserver.TestSplitLogWorker(108): /hbase/splitWAL created
> 2013-11-30 01:13:23,278 DEBUG [pool-1-thread-1] executor.ExecutorService(99): 
> Starting executor se

[jira] [Commented] (HBASE-10000) Initiate lease recovery for outstanding WAL files at the very beginning of recovery

2013-12-06 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842031#comment-13842031
 ] 

Enis Soztutar commented on HBASE-10000:
---

Skimmed the patch. It looks OK to me. 
Some slowdown is expected: submitting the recoverLease() RPCs to the NN with 
limited parallelism (default 6) might be a bit slower than doing this from the 
RSs in parallel when 6 < num_log_files < num_region_servers. I guess we can 
live with it, given the expected gain when num_log_files > available split log 
workers.
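
The bounded parallelism discussed here can be sketched as follows. This is only an illustration, not the patch: the class name, recoverAll, and the recoverLease predicate (a stand-in for the real NameNode RPC) are hypothetical. Only the default of 6 comes from the comment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Predicate;

class ParallelLeaseRecovery {
    // Sends one lease-recovery request per WAL path through a fixed-size pool
    // and returns how many requests reported success.
    // recoverLease is a hypothetical stand-in for the actual RPC to the NN.
    static int recoverAll(List<String> walPaths, int parallelism, Predicate<String> recoverLease) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<CompletableFuture<Boolean>> pending = new ArrayList<>();
            for (String path : walPaths) {
                pending.add(CompletableFuture.supplyAsync(() -> recoverLease.test(path), pool));
            }
            int recovered = 0;
            for (CompletableFuture<Boolean> f : pending) {
                try {
                    if (f.join()) {
                        recovered++;
                    }
                } catch (CompletionException e) {
                    // Assumption: a failed request is not fatal here, since each split
                    // worker still checks whether its WAL file is closed.
                }
            }
            return recovered;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> wals = Arrays.asList("wal-1", "wal-2", "wal-3");
        // Pretend the request for wal-2 reports that recovery is not yet complete.
        int n = recoverAll(wals, 6, path -> !path.equals("wal-2"));
        System.out.println(n + " of " + wals.size() + " leases recovered");
    }
}
```

With more WAL files than pool threads, requests queue behind the pool, which is the slowdown mentioned above.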

> Initiate lease recovery for outstanding WAL files at the very beginning of 
> recovery
> ---
>
> Key: HBASE-10000
> URL: https://issues.apache.org/jira/browse/HBASE-10000
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 0.98.1
>
> Attachments: 1-recover-ts-with-pb-2.txt, 
> 1-recover-ts-with-pb-3.txt, 1-recover-ts-with-pb-4.txt, 1-v1.txt, 
> 1-v4.txt, 1-v5.txt, 1-v6.txt
>
>
> At the beginning of recovery, master can send lease recovery requests 
> concurrently for outstanding WAL files using a thread pool.
> Each split worker would first check whether the WAL file it processes is 
> closed.
> Thanks to Nicolas Liochon and Jeffery, discussions with whom gave rise to this 
> idea. 





[jira] [Updated] (HBASE-10102) CF.VERSIONS is not enforced with timerange scans

2013-12-06 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-10102:
--

Attachment: 10102-0.94-POC.txt

POC patch. Just need to park it somewhere. Not tested.

> CF.VERSIONS is not enforced with timerange scans
> 
>
> Key: HBASE-10102
> URL: https://issues.apache.org/jira/browse/HBASE-10102
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
> Attachments: 10102-0.94-POC.txt
>
>
> Example brought up by Niels Basjes on the user list:
> If I run the following commands in the hbase shell
> {code}
> create 't1', {NAME => 'c1', VERSIONS => 1}
> put 't1', 'r1', 'c1', 'One', 1000
> put 't1', 'r1', 'c1', 'Two', 2000
> put 't1', 'r1', 'c1', 'Three', 3000
> get 't1', 'r1'
> get 't1', 'r1' , {TIMERANGE => [0,1500]}
> the result is this:
> get 't1', 'r1'
> COLUMN CELL
>  c1:   timestamp=3000, value=Three
> 1 row(s) in 0.0780 seconds
> get 't1', 'r1' , {TIMERANGE => [0,1500]}
> COLUMN CELL
>  c1:   timestamp=1000, value=One
> 1 row(s) in 0.1390 seconds
> {code}





[jira] [Commented] (HBASE-10102) CF.VERSIONS is not enforced with timerange scans

2013-12-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842019#comment-13842019
 ] 

Lars Hofhansl commented on HBASE-10102:
---

Currently the workflow in ScanQueryMatcher is something like this:

#  = min(, )
# filter by timerange
# filter out columns (i.e. columns not specified in the scan)
# apply custom filters
# filter by 

Every KV is passed through this filtering process.

What we should do is this:

# filter by 
# filter by timerange
# filter out columns (i.e. columns not specified in the scan)
# apply custom filters
# filter by 

I have a POC patch that does this. It does not slow scanning in a measurable 
way.
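
To illustrate why the order of the steps matters, here is a toy model of the example from the issue description. It is not ScanQueryMatcher code: the class and method names are hypothetical, cells are reduced to bare timestamps, and it assumes the version limit in question is the column family's VERSIONS => 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class VersionsVsTimeRange {
    // Keeps at most maxVersions cells; input is sorted newest first.
    static List<Long> limitVersions(List<Long> newestFirst, int maxVersions) {
        int n = Math.min(maxVersions, newestFirst.size());
        return new ArrayList<>(newestFirst.subList(0, n));
    }

    // Keeps cells with minTs <= ts < maxTs (the upper bound is exclusive).
    static List<Long> inTimeRange(List<Long> cells, long minTs, long maxTs) {
        List<Long> kept = new ArrayList<>();
        for (long ts : cells) {
            if (ts >= minTs && ts < maxTs) {
                kept.add(ts);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Timestamps of the three puts to r1/c1, newest first.
        List<Long> cells = Arrays.asList(3000L, 2000L, 1000L);

        // Timerange first, version limit last: the cell at 1000 is returned,
        // matching the shell output in the issue description.
        System.out.println(limitVersions(inTimeRange(cells, 0, 1500), 1));

        // Version limit first, then timerange: only the cell at 3000 is
        // retained, so nothing falls inside [0, 1500).
        System.out.println(inTimeRange(limitVersions(cells, 1), 0, 1500));
    }
}
```

The first line printed is [1000] and the second is [].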

> CF.VERSIONS is not enforced with timerange scans
> 
>
> Key: HBASE-10102
> URL: https://issues.apache.org/jira/browse/HBASE-10102
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>
> Example brought up by Niels Basjes on the user list:
> If I run the following commands in the hbase shell
> {code}
> create 't1', {NAME => 'c1', VERSIONS => 1}
> put 't1', 'r1', 'c1', 'One', 1000
> put 't1', 'r1', 'c1', 'Two', 2000
> put 't1', 'r1', 'c1', 'Three', 3000
> get 't1', 'r1'
> get 't1', 'r1' , {TIMERANGE => [0,1500]}
> the result is this:
> get 't1', 'r1'
> COLUMN CELL
>  c1:   timestamp=3000, value=Three
> 1 row(s) in 0.0780 seconds
> get 't1', 'r1' , {TIMERANGE => [0,1500]}
> COLUMN CELL
>  c1:   timestamp=1000, value=One
> 1 row(s) in 0.1390 seconds
> {code}





[jira] [Created] (HBASE-10102) CF.VERSIONS is not enforced with timerange scans

2013-12-06 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-10102:
-

 Summary: CF.VERSIONS is not enforced with timerange scans
 Key: HBASE-10102
 URL: https://issues.apache.org/jira/browse/HBASE-10102
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


Example brought up by Niels Basjes on the user list:
If I run the following commands in the hbase shell
{code}
create 't1', {NAME => 'c1', VERSIONS => 1}
put 't1', 'r1', 'c1', 'One', 1000
put 't1', 'r1', 'c1', 'Two', 2000
put 't1', 'r1', 'c1', 'Three', 3000
get 't1', 'r1'
get 't1', 'r1' , {TIMERANGE => [0,1500]}

the result is this:

get 't1', 'r1'
COLUMN CELL
 c1:   timestamp=3000, value=Three
1 row(s) in 0.0780 seconds

get 't1', 'r1' , {TIMERANGE => [0,1500]}
COLUMN CELL
 c1:   timestamp=1000, value=One
1 row(s) in 0.1390 seconds
{code}






[jira] [Commented] (HBASE-10089) Metrics intern table names cause eventual permgen OOM in 0.94

2013-12-06 Thread Mikhail Bautin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842008#comment-13842008
 ] 

Mikhail Bautin commented on HBASE-10089:


[~xieliang007]: yes, you are right. In that case, this problem is specific to 
JDK 6 with low permgen size.

> Metrics intern table names cause eventual permgen OOM in 0.94
> -
>
> Key: HBASE-10089
> URL: https://issues.apache.org/jira/browse/HBASE-10089
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0, 0.94.14
>Reporter: Dave Latham
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 0.94.15
>
> Attachments: 10089-0.94.txt
>
>
> As part of the metrics system introduced in HBASE-4768 there are two places 
> that hbase uses String interning ( SchemaConfigured and SchemaMetrics ).  
> This includes interning table names.  We have a long-running environment where 
> we run regular integration tests on our application using hbase.  Those tests 
> create and drop tables with new names regularly.  This leads to filling up 
> the permgen with interned table names.  The workaround is to periodically restart 
> the region servers.





[jira] [Commented] (HBASE-7572) move metadata settings that duplicate xml config settings to CF/table config in a backward-compatible manner

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842009#comment-13842009
 ] 

Hadoop QA commented on HBASE-7572:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617506/HBASE-7572-v4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 27 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestShell

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8081//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8081//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8081//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8081//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8081//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8081//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8081//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8081//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8081//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8081//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8081//console

This message is automatically generated.

> move metadata settings that duplicate xml config settings to CF/table config 
> in a backward-compatible manner
> 
>
> Key: HBASE-7572
> URL: https://issues.apache.org/jira/browse/HBASE-7572
> Project: HBase
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7572-v0.patch, HBASE-7572-v1.patch, 
> HBASE-7572-v2.patch, HBASE-7572-v3.patch, HBASE-7572-v4.patch
>
>
> 2nd part of splitting HBASE-7236





[jira] [Commented] (HBASE-10089) Metrics intern table names cause eventual permgen OOM in 0.94

2013-12-06 Thread Liang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841993#comment-13841993
 ] 

Liang Xie commented on HBASE-10089:
---

[~mikhail], IMHO, that link is not correct; interned strings can definitely be 
garbage collected...

> Metrics intern table names cause eventual permgen OOM in 0.94
> -
>
> Key: HBASE-10089
> URL: https://issues.apache.org/jira/browse/HBASE-10089
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0, 0.94.14
>Reporter: Dave Latham
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 0.94.15
>
> Attachments: 10089-0.94.txt
>
>
> As part of the metrics system introduced in HBASE-4768 there are two places 
> that hbase uses String interning ( SchemaConfigured and SchemaMetrics ).  
> This includes interning table names.  We have a long-running environment where 
> we run regular integration tests on our application using hbase.  Those tests 
> create and drop tables with new names regularly.  This leads to filling up 
> the permgen with interned table names.  The workaround is to periodically restart 
> the region servers.





[jira] [Commented] (HBASE-9892) Add info port to ServerName to support multi instances in a node

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841971#comment-13841971
 ] 

Hadoop QA commented on HBASE-9892:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12617512/HBASE-9892-trunk-v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.regionserver.TestMetricsRegionServerSourceImpl

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8082//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8082//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8082//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8082//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8082//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8082//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8082//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8082//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8082//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8082//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8082//console

This message is automatically generated.

> Add info port to ServerName to support multi instances in a node
> 
>
> Key: HBASE-9892
> URL: https://issues.apache.org/jira/browse/HBASE-9892
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.99.0
>
> Attachments: HBASE-9892-0.94-v1.diff, HBASE-9892-0.94-v2.diff, 
> HBASE-9892-0.94-v3.diff, HBASE-9892-0.94-v4.diff, HBASE-9892-0.94-v5.diff, 
> HBASE-9892-trunk-v1.diff, HBASE-9892-trunk-v1.patch, 
> HBASE-9892-trunk-v1.patch, HBASE-9892-v5.txt
>
>
> The full GC time of a regionserver with a big heap (> 30G) usually cannot be 
> kept under 30s. At the same time, servers with 64G of memory are common. 
> So we try to deploy multiple RS instances (2-3) on a single node, with the heap 
> of each RS at about 20G ~ 24G.
> Most things work fine, except the HBase web UI. The master gets the RS 
> info port from conf, which is not suitable for this situation of multiple RS 
> instances on a node. So we add the info port to ServerName.
> a. At startup, the RS reports its info port to HMaster.
> b. For the root region, the RS writes the servername with info port to the 
> zookeeper root-region-server node.
> c. For meta regions, the RS writes the servername with info port to the root region.
> d. For user regions, the RS writes the servername with info port to the meta regions.
> So HMaster and clients can get the info port from the servername.
> To test this feature, I change the rs num from 1 to

[jira] [Updated] (HBASE-10101) testOfflineRegionReAssginedAfterMasterRestart times out sometimes.

2013-12-06 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-10101:


Attachment: (was: test.log)

> testOfflineRegionReAssginedAfterMasterRestart times out sometimes.
> --
>
> Key: HBASE-10101
> URL: https://issues.apache.org/jira/browse/HBASE-10101
> Project: HBase
>  Issue Type: Test
>Reporter: Jimmy Xiang
>Priority: Minor
>
> Sometimes this test times out. The log is attached. It could be 
> because the new cluster takes a while to process the dead server, or assign 
> meta.





[jira] [Commented] (HBASE-10101) testOfflineRegionReAssginedAfterMasterRestart times out sometimes.

2013-12-06 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841968#comment-13841968
 ] 

Jimmy Xiang commented on HBASE-10101:
-

Attached a wrong log.  Let me get the right one.

> testOfflineRegionReAssginedAfterMasterRestart times out sometimes.
> --
>
> Key: HBASE-10101
> URL: https://issues.apache.org/jira/browse/HBASE-10101
> Project: HBase
>  Issue Type: Test
>Reporter: Jimmy Xiang
>Priority: Minor
>
> Sometimes this test times out. The log is attached. It could be 
> because the new cluster takes a while to process the dead server, or assign 
> meta.





[jira] [Commented] (HBASE-10085) Some regions aren't re-assigned after a cluster restarts

2013-12-06 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841969#comment-13841969
 ] 

Jeffrey Zhong commented on HBASE-10085:
---

ok. Let me check it. Thanks.

> Some regions aren't re-assigned after a cluster restarts
> 
>
> Key: HBASE-10085
> URL: https://issues.apache.org/jira/browse/HBASE-10085
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.96.1
>Reporter: Jeffrey Zhong
>Assignee: Jeffrey Zhong
> Fix For: 0.98.0, 0.96.1
>
> Attachments: hbase-10085.patch
>
>
> We saw this issue during a cluster restart:
> 1) When a cluster is shut down, some regions are in the offline state because no 
> region servers are available (the RSs are stopped first, then the Master).
> 2) When the cluster restarts, the offlined regions are forced offline 
> again and SSH skips re-assigning them by function AM.processServerShutdown, as 
> shown below.
> {code}
> 2013-12-03 10:41:56,686 INFO  
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE
> 2013-12-03 10:41:56,686 DEBUG 
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on 
> deadserver; forcing offline
> ...
> 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force 
> region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, 
> ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
> ...
> 2013-12-03 10:41:57,223 WARN  
> [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] 
> master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected 
> {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
>  
> {code}





[jira] [Commented] (HBASE-10085) Some regions aren't re-assigned after a cluster restarts

2013-12-06 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841967#comment-13841967
 ] 

Jimmy Xiang commented on HBASE-10085:
-

[~jeffreyz], I filed an issue on the test: HBASE-10101. Do you want to take a 
look?

> Some regions aren't re-assigned after a cluster restarts
> 
>
> Key: HBASE-10085
> URL: https://issues.apache.org/jira/browse/HBASE-10085
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.96.1
>Reporter: Jeffrey Zhong
>Assignee: Jeffrey Zhong
> Fix For: 0.98.0, 0.96.1
>
> Attachments: hbase-10085.patch
>
>
> We saw this issue during a cluster restart:
> 1) When a cluster is shut down, some regions are in the offline state because no 
> region servers are available (the RSs are stopped first, then the Master).
> 2) When the cluster restarts, the offlined regions are forced offline 
> again and SSH skips re-assigning them by function AM.processServerShutdown, as 
> shown below.
> {code}
> 2013-12-03 10:41:56,686 INFO  
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE
> 2013-12-03 10:41:56,686 DEBUG 
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on 
> deadserver; forcing offline
> ...
> 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force 
> region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, 
> ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
> ...
> 2013-12-03 10:41:57,223 WARN  
> [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] 
> master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected 
> {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
>  
> {code}





[jira] [Updated] (HBASE-10059) TestSplitLogWorker#testMultipleTasks fails occasionally

2013-12-06 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-10059:
--

Status: Patch Available  (was: Open)

> TestSplitLogWorker#testMultipleTasks fails occasionally
> ---
>
> Key: HBASE-10059
> URL: https://issues.apache.org/jira/browse/HBASE-10059
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Jeffrey Zhong
> Attachments: hbase-10059.patch
>
>
> From 
> https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/857/testReport/junit/org.apache.hadoop.hbase.regionserver/TestSplitLogWorker/testMultipleTasks/
>  :
> {code}
> 2013-11-30 01:13:23,022 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): 
> before: regionserver.TestSplitLogWorker#testMultipleTasks Thread=16, 
> OpenFileDescriptor=157, MaxFileDescriptor=4, SystemLoadAverage=338, 
> ProcessCount=144, AvailableMemoryMB=1474, ConnectionCount=0
> 2013-11-30 01:13:23,026 INFO  [pool-1-thread-1] 
> zookeeper.MiniZooKeeperCluster(200): Started MiniZK Cluster and connect 1 ZK 
> server on client port: 53800
> 2013-11-30 01:13:23,029 INFO  [pool-1-thread-1] 
> zookeeper.RecoverableZooKeeper(120): Process 
> identifier=split-log-worker-tests connecting to ZooKeeper 
> ensemble=localhost:53800
> 2013-11-30 01:13:23,249 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(310): split-log-worker-tests, 
> quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, type=None, 
> state=SyncConnected, path=null
> 2013-11-30 01:13:23,251 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(387): split-log-worker-tests-0x142a6913350 
> connected
> 2013-11-30 01:13:23,261 DEBUG [pool-1-thread-1] 
> regionserver.TestSplitLogWorker(105): /hbase created
> 2013-11-30 01:13:23,270 DEBUG [pool-1-thread-1] 
> regionserver.TestSplitLogWorker(108): /hbase/splitWAL created
> 2013-11-30 01:13:23,278 DEBUG [pool-1-thread-1] executor.ExecutorService(99): 
> Starting executor service name=RS_LOG_REPLAY_OPS-TestSplitLogWorker, 
> corePoolSize=10, maxPoolSize=10
> 2013-11-30 01:13:23,278 INFO  [pool-1-thread-1] 
> regionserver.TestSplitLogWorker(246): testMultipleTasks
> 2013-11-30 01:13:23,280 INFO  [SplitLogWorker-tmt_svr,1,1] 
> regionserver.SplitLogWorker(175): SplitLogWorker tmt_svr,1,1 starting
> 2013-11-30 01:13:23,380 INFO  [pool-1-thread-1] hbase.Waiter(174): Waiting up 
> to [1,500] milli-secs(wait.for.ratio=[1])
> 2013-11-30 01:13:23,394 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(310): split-log-worker-tests-0x142a6913350, 
> quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, 
> type=NodeChildrenChanged, state=SyncConnected, path=/hbase/splitWAL
> 2013-11-30 01:13:23,394 DEBUG [pool-1-thread-1-EventThread] 
> regionserver.SplitLogWorker(595): tasks arrived or departed
> 2013-11-30 01:13:23,394 INFO  [pool-1-thread-1] hbase.Waiter(174): Waiting up 
> to [1,500] milli-secs(wait.for.ratio=[1])
> 2013-11-30 01:13:23,402 INFO  [SplitLogWorker-tmt_svr,1,1] 
> regionserver.SplitLogWorker(363): worker tmt_svr,1,1 acquired task 
> /hbase/splitWAL/tmt_task
> 2013-11-30 01:13:23,410 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(310): split-log-worker-tests-0x142a6913350, 
> quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, 
> type=NodeChildrenChanged, state=SyncConnected, path=/hbase/splitWAL
> 2013-11-30 01:13:23,410 DEBUG [pool-1-thread-1-EventThread] 
> regionserver.SplitLogWorker(595): tasks arrived or departed
> 2013-11-30 01:13:23,418 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(310): split-log-worker-tests-0x142a6913350, 
> quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, 
> type=NodeDataChanged, state=SyncConnected, path=/hbase/splitWAL/tmt_task
> 2013-11-30 01:13:23,419 INFO  [pool-1-thread-1] hbase.Waiter(174): Waiting up 
> to [1,500] milli-secs(wait.for.ratio=[1])
> 2013-11-30 01:13:23,420 INFO  [pool-1-thread-1-EventThread] 
> regionserver.SplitLogWorker(522): task /hbase/splitWAL/tmt_task preempted 
> from tmt_svr,1,1, current task state and owner=OWNED another-worker,1,1
> 2013-11-30 01:13:23,420 INFO  [pool-1-thread-1-EventThread] 
> regionserver.SplitLogWorker(608): Sending interrupt to stop the worker thread
> 2013-11-30 01:13:23,420 WARN  [SplitLogWorker-tmt_svr,1,1] 
> regionserver.SplitLogWorker(374): Interrupted while yielding for other region 
> servers
> java.lang.InterruptedException: sleep interrupted
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:372)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:251)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitLogWorke

[jira] [Commented] (HBASE-10048) Add hlog number metric in regionserver

2013-12-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841963#comment-13841963
 ] 

Lars Hofhansl commented on HBASE-10048:
---

That is nice. And something we would monitor. +1 for 0.94.
Needs to be in trunk (0.99 now) as well, right?

> Add hlog number metric in regionserver
> --
>
> Key: HBASE-10048
> URL: https://issues.apache.org/jira/browse/HBASE-10048
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.99.0
>
> Attachments: HBASE-10048-0.94-v1.diff, HBASE-10048-0.94-v2.diff, 
> HBASE-10048-trunk-v1.diff, HBASE-10048-trunk-v2.diff, 
> HBASE-10048-trunk-v3.diff, HBASE-10048-trunk-v4.diff
>
>
> Add hlog number metric in regionserver. 
> We can use this metric to alert about memstore flush because of too many 
> hlogs.





[jira] [Updated] (HBASE-10101) testOfflineRegionReAssginedAfterMasterRestart times out sometimes.

2013-12-06 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-10101:


Attachment: test.log

> testOfflineRegionReAssginedAfterMasterRestart times out sometimes.
> --
>
> Key: HBASE-10101
> URL: https://issues.apache.org/jira/browse/HBASE-10101
> Project: HBase
>  Issue Type: Test
>Reporter: Jimmy Xiang
>Priority: Minor
> Attachments: test.log
>
>
> Sometimes this test times out. The log is attached. It could be 
> because the new cluster takes a while to process the dead server or to assign 
> meta.





[jira] [Updated] (HBASE-10059) TestSplitLogWorker#testMultipleTasks fails occasionally

2013-12-06 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-10059:
--

Attachment: hbase-10059.patch

Increased the timeout from 1.5 secs to 5 secs.

> TestSplitLogWorker#testMultipleTasks fails occasionally
> ---
>
> Key: HBASE-10059
> URL: https://issues.apache.org/jira/browse/HBASE-10059
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Jeffrey Zhong
> Attachments: hbase-10059.patch
>
>
> From 
> https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/857/testReport/junit/org.apache.hadoop.hbase.regionserver/TestSplitLogWorker/testMultipleTasks/
>  :
> {code}
> 2013-11-30 01:13:23,022 INFO  [pool-1-thread-1] hbase.ResourceChecker(147): 
> before: regionserver.TestSplitLogWorker#testMultipleTasks Thread=16, 
> OpenFileDescriptor=157, MaxFileDescriptor=4, SystemLoadAverage=338, 
> ProcessCount=144, AvailableMemoryMB=1474, ConnectionCount=0
> 2013-11-30 01:13:23,026 INFO  [pool-1-thread-1] 
> zookeeper.MiniZooKeeperCluster(200): Started MiniZK Cluster and connect 1 ZK 
> server on client port: 53800
> 2013-11-30 01:13:23,029 INFO  [pool-1-thread-1] 
> zookeeper.RecoverableZooKeeper(120): Process 
> identifier=split-log-worker-tests connecting to ZooKeeper 
> ensemble=localhost:53800
> 2013-11-30 01:13:23,249 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(310): split-log-worker-tests, 
> quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, type=None, 
> state=SyncConnected, path=null
> 2013-11-30 01:13:23,251 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(387): split-log-worker-tests-0x142a6913350 
> connected
> 2013-11-30 01:13:23,261 DEBUG [pool-1-thread-1] 
> regionserver.TestSplitLogWorker(105): /hbase created
> 2013-11-30 01:13:23,270 DEBUG [pool-1-thread-1] 
> regionserver.TestSplitLogWorker(108): /hbase/splitWAL created
> 2013-11-30 01:13:23,278 DEBUG [pool-1-thread-1] executor.ExecutorService(99): 
> Starting executor service name=RS_LOG_REPLAY_OPS-TestSplitLogWorker, 
> corePoolSize=10, maxPoolSize=10
> 2013-11-30 01:13:23,278 INFO  [pool-1-thread-1] 
> regionserver.TestSplitLogWorker(246): testMultipleTasks
> 2013-11-30 01:13:23,280 INFO  [SplitLogWorker-tmt_svr,1,1] 
> regionserver.SplitLogWorker(175): SplitLogWorker tmt_svr,1,1 starting
> 2013-11-30 01:13:23,380 INFO  [pool-1-thread-1] hbase.Waiter(174): Waiting up 
> to [1,500] milli-secs(wait.for.ratio=[1])
> 2013-11-30 01:13:23,394 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(310): split-log-worker-tests-0x142a6913350, 
> quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, 
> type=NodeChildrenChanged, state=SyncConnected, path=/hbase/splitWAL
> 2013-11-30 01:13:23,394 DEBUG [pool-1-thread-1-EventThread] 
> regionserver.SplitLogWorker(595): tasks arrived or departed
> 2013-11-30 01:13:23,394 INFO  [pool-1-thread-1] hbase.Waiter(174): Waiting up 
> to [1,500] milli-secs(wait.for.ratio=[1])
> 2013-11-30 01:13:23,402 INFO  [SplitLogWorker-tmt_svr,1,1] 
> regionserver.SplitLogWorker(363): worker tmt_svr,1,1 acquired task 
> /hbase/splitWAL/tmt_task
> 2013-11-30 01:13:23,410 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(310): split-log-worker-tests-0x142a6913350, 
> quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, 
> type=NodeChildrenChanged, state=SyncConnected, path=/hbase/splitWAL
> 2013-11-30 01:13:23,410 DEBUG [pool-1-thread-1-EventThread] 
> regionserver.SplitLogWorker(595): tasks arrived or departed
> 2013-11-30 01:13:23,418 DEBUG [pool-1-thread-1-EventThread] 
> zookeeper.ZooKeeperWatcher(310): split-log-worker-tests-0x142a6913350, 
> quorum=localhost:53800, baseZNode=/hbase Received ZooKeeper Event, 
> type=NodeDataChanged, state=SyncConnected, path=/hbase/splitWAL/tmt_task
> 2013-11-30 01:13:23,419 INFO  [pool-1-thread-1] hbase.Waiter(174): Waiting up 
> to [1,500] milli-secs(wait.for.ratio=[1])
> 2013-11-30 01:13:23,420 INFO  [pool-1-thread-1-EventThread] 
> regionserver.SplitLogWorker(522): task /hbase/splitWAL/tmt_task preempted 
> from tmt_svr,1,1, current task state and owner=OWNED another-worker,1,1
> 2013-11-30 01:13:23,420 INFO  [pool-1-thread-1-EventThread] 
> regionserver.SplitLogWorker(608): Sending interrupt to stop the worker thread
> 2013-11-30 01:13:23,420 WARN  [SplitLogWorker-tmt_svr,1,1] 
> regionserver.SplitLogWorker(374): Interrupted while yielding for other region 
> servers
> java.lang.InterruptedException: sleep interrupted
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:372)
>   at 
> org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:251)
>   at 
> org.apach

[jira] [Created] (HBASE-10101) testOfflineRegionReAssginedAfterMasterRestart times out sometimes.

2013-12-06 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HBASE-10101:
---

 Summary: testOfflineRegionReAssginedAfterMasterRestart times out 
sometimes.
 Key: HBASE-10101
 URL: https://issues.apache.org/jira/browse/HBASE-10101
 Project: HBase
  Issue Type: Test
Reporter: Jimmy Xiang
Priority: Minor
 Attachments: test.log

Sometimes this test times out. The log is attached. It could be because 
the new cluster takes a while to process the dead server or to assign meta.





[jira] [Updated] (HBASE-10090) Master could hang in assigning meta

2013-12-06 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-10090:


Status: Patch Available  (was: Open)

Incorporated the latest patch for HBASE-10085.

> Master could hang in assigning meta
> ---
>
> Key: HBASE-10090
> URL: https://issues.apache.org/jira/browse/HBASE-10090
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: trunk-10090.patch, trunk-10090_v2.patch
>
>
> In a very rare scenario, the master could hang waiting for meta to be assigned 
> while the meta server is dead.





[jira] [Updated] (HBASE-10090) Master could hang in assigning meta

2013-12-06 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-10090:


Attachment: trunk-10090_v2.patch

> Master could hang in assigning meta
> ---
>
> Key: HBASE-10090
> URL: https://issues.apache.org/jira/browse/HBASE-10090
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Attachments: trunk-10090.patch, trunk-10090_v2.patch
>
>
> In a very rare scenario, the master could hang waiting for meta to be assigned 
> while the meta server is dead.





[jira] [Created] (HBASE-10100) Hbase replication cluster can have varying peers under certain conditions

2013-12-06 Thread churro morales (JIRA)
churro morales created HBASE-10100:
--

 Summary: Hbase replication cluster can have varying peers under 
certain conditions
 Key: HBASE-10100
 URL: https://issues.apache.org/jira/browse/HBASE-10100
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0, 0.95.0, 0.94.5
Reporter: churro morales


We were recently trying to replicate hbase data over to a new datacenter.  
After we turned on replication and ran our copy tables, we noticed that 
verify replication had discrepancies.  

We ran a list_peers and it returned both peers: the original datacenter we 
were replicating to and the new datacenter (this was correct).  

When grepping through the logs for a few regionservers we noticed that a few 
regionservers had the following entry in their logs:

2013-09-26 10:55:46,907 ERROR 
org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: 
Error while adding a new peer java.net.UnknownHostException: xxx.xxx.flurry.com 
(this was due to a transient dns issue)

Thus a very small subset of our regionservers were not replicating to this new 
cluster while most were. 

We probably don't want to abort if this type of issue comes up: it could 
be fatal if someone does an "add_peer" operation with a typo, which 
could potentially shut down the cluster. 

One solution I can think of is keeping a boolean flag in ReplicationSourceManager 
that tracks whether there was an error adding a peer (errorAddingPeer).  
Then in logPositionAndCleanOldLogs we can do something like:

{code}
if (errorAddingPeer) {
  LOG.error("There was an error adding a peer, logs will not be marked for deletion");
  return;
}
{code}

That way we do not delete these logs from the queue.  You will notice the 
replication queue growing on certain machines, and you can still replay the logs, 
thus avoiding a lengthy copy table. 

I have a patch (with unit test) for the above proposal, if everyone thinks that 
is an okay solution.

An additional idea would be to add some retry logic inside the PeersWatcher 
class for the nodeChildrenChanged method.  Thus if there happens to be some 
issue we could sort it out without having to bounce that particular 
regionserver.  
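
A minimal sketch of how the two ideas above could fit together, under stated assumptions: `PeerAddSketch`, `PeerAdder`, `addPeerWithRetry` and `shouldCleanOldLogs` are hypothetical names for illustration and are not part of ReplicationSourceManager or PeersWatcher. It retries a failing add-peer a bounded number of times with a doubling pause, and remembers which peers still failed so that log cleanup can be skipped.

```java
import java.net.UnknownHostException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names are HBase APIs.
final class PeerAddSketch {
  /** Stand-in for whatever actually connects to a peer cluster. */
  interface PeerAdder {
    void addPeer(String peerId) throws UnknownHostException;
  }

  // Peers that could not be added; consulted before old logs are cleaned.
  private final Set<String> failedPeers = ConcurrentHashMap.newKeySet();

  /** Tries to add the peer up to maxAttempts times, doubling the pause after each failure. */
  boolean addPeerWithRetry(PeerAdder adder, String peerId, int maxAttempts, long pauseMs) {
    long pause = pauseMs;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        adder.addPeer(peerId);
        failedPeers.remove(peerId);
        return true;
      } catch (UnknownHostException e) {
        // Possibly transient (for example a DNS hiccup): back off, then retry.
        if (attempt == maxAttempts) {
          break;
        }
        try {
          Thread.sleep(pause);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          break;
        }
        pause *= 2;
      }
    }
    failedPeers.add(peerId);
    return false;
  }

  /** Old logs are eligible for cleanup only while no add-peer failure is outstanding. */
  boolean shouldCleanOldLogs() {
    return failedPeers.isEmpty();
  }
}
```

Tracking failures per peer rather than in a single boolean is a choice made for this sketch, so that one peer succeeding does not hide another peer's failure.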

Would love to hear everyone's thoughts.











[jira] [Commented] (HBASE-10048) Add hlog number metric in regionserver

2013-12-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841948#comment-13841948
 ] 

stack commented on HBASE-10048:
---

Applied to trunk and to 0.98.  0.96 not going in.  Will fix it later.  
[~lhofhansl] You want this?  It's nice.

> Add hlog number metric in regionserver
> --
>
> Key: HBASE-10048
> URL: https://issues.apache.org/jira/browse/HBASE-10048
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.99.0
>
> Attachments: HBASE-10048-0.94-v1.diff, HBASE-10048-0.94-v2.diff, 
> HBASE-10048-trunk-v1.diff, HBASE-10048-trunk-v2.diff, 
> HBASE-10048-trunk-v3.diff, HBASE-10048-trunk-v4.diff
>
>
> Add hlog number metric in regionserver. 
> We can use this metric to alert about memstore flush because of too many 
> hlogs.





[jira] [Commented] (HBASE-10000) Initiate lease recovery for outstanding WAL files at the very beginning of recovery

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841941#comment-13841941
 ] 

Hadoop QA commented on HBASE-10000:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12617497/1-recover-ts-with-pb-4.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 25 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8079//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8079//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8079//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8079//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8079//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8079//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8079//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8079//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8079//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8079//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8079//console

This message is automatically generated.

> Initiate lease recovery for outstanding WAL files at the very beginning of 
> recovery
> ---
>
> Key: HBASE-10000
> URL: https://issues.apache.org/jira/browse/HBASE-10000
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 0.98.1
>
> Attachments: 1-recover-ts-with-pb-2.txt, 
> 1-recover-ts-with-pb-3.txt, 1-recover-ts-with-pb-4.txt, 1-v1.txt, 
> 1-v4.txt, 1-v5.txt, 1-v6.txt
>
>
> At the beginning of recovery, master can send lease recovery requests 
> concurrently for outstanding WAL files using a thread pool.
> Each split worker would first check whether the WAL file it processes is 
> closed.
> Thanks to Nicolas Liochon and Jeffery, discussions with whom gave rise to this 
> idea. 
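
A minimal sketch of the thread-pool idea in the description, assuming a hypothetical `LeaseRecoverer` interface as a stand-in for the real lease recovery call (for example HDFS's `recoverLease`); the class and method names here are illustrative and not taken from the patch. Requests go out concurrently, and the paths that were not confirmed closed are returned so that a split worker can check them again.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; not the code from the attached patches.
final class LeaseRecoverySketch {
  /** Stand-in for a lease recovery call; returns true once the file is closed. */
  interface LeaseRecoverer {
    boolean recoverLease(String walPath) throws Exception;
  }

  /** Sends recovery requests concurrently; returns the paths not confirmed closed. */
  static List<String> recoverAll(List<String> walPaths, LeaseRecoverer recoverer, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Callable<Boolean>> tasks = new ArrayList<>();
      for (String path : walPaths) {
        tasks.add(() -> recoverer.recoverLease(path));
      }
      // invokeAll blocks until every task finishes and keeps the input order.
      List<Future<Boolean>> results = pool.invokeAll(tasks);
      List<String> notClosed = new ArrayList<>();
      for (int i = 0; i < results.size(); i++) {
        boolean closed;
        try {
          closed = results.get(i).get();
        } catch (ExecutionException e) {
          closed = false; // treat a failed request as "still open"
        }
        if (!closed) {
          notClosed.add(walPaths.get(i));
        }
      }
      return notClosed;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return new ArrayList<>(walPaths); // nothing confirmed
    } finally {
      pool.shutdown();
    }
  }
}
```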





[jira] [Commented] (HBASE-10048) Add hlog number metric in regionserver

2013-12-06 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841937#comment-13841937
 ] 

Elliott Clark commented on HBASE-10048:
---

+1 thanks

> Add hlog number metric in regionserver
> --
>
> Key: HBASE-10048
> URL: https://issues.apache.org/jira/browse/HBASE-10048
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.99.0
>
> Attachments: HBASE-10048-0.94-v1.diff, HBASE-10048-0.94-v2.diff, 
> HBASE-10048-trunk-v1.diff, HBASE-10048-trunk-v2.diff, 
> HBASE-10048-trunk-v3.diff, HBASE-10048-trunk-v4.diff
>
>
> Add hlog number metric in regionserver. 
> We can use this metric to alert about memstore flush because of too many 
> hlogs.





[jira] [Commented] (HBASE-10089) Metrics intern table names cause eventual permgen OOM in 0.94

2013-12-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841933#comment-13841933
 ] 

Lars Hofhansl commented on HBASE-10089:
---

Need to perf test this as well.
Things like {{cfName.equals(other.cfName)}} will be sped up if both strings are 
interned, no? In that case equals can just compare the references.
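
To make the interning point concrete, here is a minimal sketch using only the JDK; `InternSketch` and `sameReference` are illustrative names, not HBase code. `intern()` maps equal strings to one canonical instance, and `String.equals` begins with a reference check, so two interned, equal strings are matched without comparing characters.

```java
// Illustrative only; not HBase code.
final class InternSketch {
  /** True when both arguments are the same object, the case equals() can short-circuit. */
  static boolean sameReference(String a, String b) {
    return a == b;
  }

  public static void main(String[] args) {
    String a = new String("cf1"); // distinct objects with the same contents
    String b = new String("cf1");
    System.out.println(sameReference(a, b));                   // false: two objects
    System.out.println(a.equals(b));                           // true: contents compared
    System.out.println(sameReference(a.intern(), b.intern())); // true: one canonical instance
  }
}
```

The shortcut only helps when the strings are equal; for unequal strings equals still has to look past the reference check, so a perf test is the right way to see whether it matters here.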


> Metrics intern table names cause eventual permgen OOM in 0.94
> -
>
> Key: HBASE-10089
> URL: https://issues.apache.org/jira/browse/HBASE-10089
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0, 0.94.14
>Reporter: Dave Latham
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 0.94.15
>
> Attachments: 10089-0.94.txt
>
>
> As part of the metrics system introduced in HBASE-4768 there are two places 
> that hbase uses String interning ( SchemaConfigured and SchemaMetrics ).  
> This includes interning table names.  We have a long-running environment where 
> we run regular integration tests on our application using hbase.  Those tests 
> create and drop tables with new names regularly.  This leads to filling up 
> the permgen with interned table names.  The workaround is to periodically restart 
> the region servers.





[jira] [Updated] (HBASE-9892) Add info port to ServerName to support multi instances in a node

2013-12-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-9892:
-

Attachment: HBASE-9892-trunk-v1.patch

> Add info port to ServerName to support multi instances in a node
> 
>
> Key: HBASE-9892
> URL: https://issues.apache.org/jira/browse/HBASE-9892
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.99.0
>
> Attachments: HBASE-9892-0.94-v1.diff, HBASE-9892-0.94-v2.diff, 
> HBASE-9892-0.94-v3.diff, HBASE-9892-0.94-v4.diff, HBASE-9892-0.94-v5.diff, 
> HBASE-9892-trunk-v1.diff, HBASE-9892-trunk-v1.patch, 
> HBASE-9892-trunk-v1.patch, HBASE-9892-v5.txt
>
>
> The full GC time of a regionserver with a big heap (> 30G) usually cannot be 
> kept under 30s. At the same time, servers with 64G of memory are common. 
> So we try to deploy multiple rs instances (2-3) on a single node, with the heap of 
> each rs at about 20G ~ 24G.
> Most things work fine, except the hbase web ui. The master gets the RS 
> info port from conf, which is not suitable for this situation of multiple rs 
> instances on a node. So we add the info port to ServerName.
> a. At startup, the rs reports its info port to the HMaster.
> b. For the root region, the rs writes the servername with info port to the zookeeper 
> root-region-server node.
> c. For meta regions, the rs writes the servername with info port to the root region.
> d. For user regions, the rs writes the servername with info port to the meta regions.
> So the hmaster and clients can get the info port from the servername.
> To test this feature, I changed the rs num from 1 to 3 in standalone mode, so 
> we can test it in standalone mode.
> I think Hoya (hbase on yarn) will encounter the same problem.  Does anyone know 
> how Hoya handles this problem?
> PS: There are different formats for servername in the zk node and the meta table; I 
> think we need to unify them and refactor the code.





[jira] [Commented] (HBASE-9892) Add info port to ServerName to support multi instances in a node

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841919#comment-13841919
 ] 

Hadoop QA commented on HBASE-9892:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12617503/HBASE-9892-trunk-v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.regionserver.TestMetricsRegionServerSourceImpl

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8080//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8080//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8080//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8080//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8080//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8080//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8080//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8080//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8080//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8080//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8080//console

This message is automatically generated.

> Add info port to ServerName to support multi instances in a node
> 
>
> Key: HBASE-9892
> URL: https://issues.apache.org/jira/browse/HBASE-9892
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.99.0
>
> Attachments: HBASE-9892-0.94-v1.diff, HBASE-9892-0.94-v2.diff, 
> HBASE-9892-0.94-v3.diff, HBASE-9892-0.94-v4.diff, HBASE-9892-0.94-v5.diff, 
> HBASE-9892-trunk-v1.diff, HBASE-9892-trunk-v1.patch, HBASE-9892-v5.txt
>
>
> The full GC time of a regionserver with a big heap (> 30G) usually cannot be 
> kept under 30s. At the same time, servers with 64G of memory are common. 
> So we try to deploy multiple rs instances (2-3) on a single node, with the heap of 
> each rs at about 20G ~ 24G.
> Most things work fine, except the hbase web ui. The master gets the RS 
> info port from conf, which is not suitable for this situation of multiple rs 
> instances on a node. So we add the info port to ServerName.
> a. At startup, the rs reports its info port to the HMaster.
> b. For the root region, the rs writes the servername with info port to the zookeeper 
> root-region-server node.
> c. For meta regions, the rs writes the servername with info port to the root region.
> d. For user regions, the rs writes the servername with info port to the meta regions.
> So the hmaster and clients can get the info port from the servername.
> To test this feature, I changed the rs num from 1 to 3 in standalone mode, so 
> w

[jira] [Commented] (HBASE-10071) support data type for get/scan in hbase shell

2013-12-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841914#comment-13841914
 ] 

stack commented on HBASE-10071:
---

[~cuijianwei] Nice.  What the lads said about it coming in via trunk, but one 
question: is this different from the formatting that is already present?  See 
the scan help in the shell:

{code}
Besides the default 'toStringBinary' format, 'scan' supports custom formatting
by column.  A user can define a FORMATTER by adding it to the column name in
the scan specification.  The FORMATTER can be stipulated:

 1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, 
toString)
 2. or as a custom class followed by method name: e.g. 
'c(MyFormatterClass).format'.

Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
  hbase> scan 't1', {COLUMNS => ['cf:qualifier1:toInt',
'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }

Note that you can specify a FORMATTER by column only (cf:qualifer).  You cannot
specify a FORMATTER for all columns of a column family.
{code}

> support data type for get/scan in hbase shell
> -
>
> Key: HBASE-10071
> URL: https://issues.apache.org/jira/browse/HBASE-10071
> Project: HBase
>  Issue Type: Improvement
>  Components: Client
>Affects Versions: 0.94.14
>Reporter: cuijianwei
> Attachments: HBASE-10071-0.94-v1.patch
>
>
> Users tend to run hbase shell to query hbase quickly. The result will be 
> shown as binary format which may not look clear enough when users write 
> columns using specified types, such as long/int/short. Therefore, it may be 
> helpful if the results could be shown as specified format. We make a patch to 
> extend get/scan in hbase shell in which user could specify the data type in 
> get/scan for each column as:
> {code}
> scan 'table', {COLUMNS=>['CF:QF:long']}
> get 'table', 'r0', {COLUMN=>'CF:QF:long'}
> {code}
> Then, the result will be shown as Long type. The result of above get will be:
> {code}
> COLUMN                CELL
>  CF:QF                timestamp=24311261, value=24311229
> {code}
> This extended format is compatible with the previous format: if users do not 
> specify the data type, the command will still work and output binary format.
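
For context on why a type hint changes what the shell shows, here is a minimal sketch using only the JDK. It assumes the cell value was written as an 8-byte big-endian long (the layout `Bytes.toBytes(long)` produces); `CellFormatSketch`, `asLong` and `escapeBinary` are hypothetical names, and `escapeBinary` is a simplified stand-in for the default binary rendering, not the HBase implementation.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; not the code from the attached patch.
final class CellFormatSketch {
  /** Decodes an 8-byte big-endian value as a long. */
  static long asLong(byte[] value) {
    if (value.length != Long.BYTES) {
      throw new IllegalArgumentException("expected 8 bytes, got " + value.length);
    }
    return ByteBuffer.wrap(value).getLong(); // ByteBuffer defaults to big-endian
  }

  /** Simplified binary rendering: printable ASCII as-is, everything else as \xNN. */
  static String escapeBinary(byte[] value) {
    StringBuilder sb = new StringBuilder();
    for (byte b : value) {
      int ch = b & 0xFF;
      if (ch >= 0x20 && ch < 0x7F && ch != '\\') {
        sb.append((char) ch);
      } else {
        sb.append(String.format("\\x%02X", ch));
      }
    }
    return sb.toString();
  }
}
```

The same eight bytes are hard to read when escaped and immediately readable as a number, which is the motivation given in the description.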





[jira] [Resolved] (HBASE-4163) Create Split Strategy for YCSB Benchmark

2013-12-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-4163.
--

   Resolution: Fixed
Fix Version/s: 0.99.0
 Assignee: Luke Lu  (was: Lars George)

Closing as fixed.  I also added a note to our little ycsb section in the doc that 
will show up the next time I push the site; it points at Luke's little script.

> Create Split Strategy for YCSB Benchmark
> 
>
> Key: HBASE-4163
> URL: https://issues.apache.org/jira/browse/HBASE-4163
> Project: HBase
>  Issue Type: Improvement
>  Components: util
>Affects Versions: 0.90.3, 0.92.0
>Reporter: Nicolas Spiegelberg
>Assignee: Luke Lu
>Priority: Minor
>  Labels: benchmark
> Fix For: 0.99.0
>
>
> Talked with Lars about how we can make it easier for users to run the YCSB 
> benchmarks against HBase & get realistic results.  Currently, HBase is 
> optimized for the random/uniform read/write case, which is the YCSB load.  
> The main reason why we perform badly when users test against us is that 
> they do not presplit regions & have the split ratio really low.  We need a 
> one-line way for a user to create a table that is pre-split to 200 regions 
> (or some decent number) by default & disable splitting.  Realistically, this 
> is how a uniform load cluster should scale, so it's not a hack.  This will 
> also give us a good use case to point to for how users should pre-split 
> regions.
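
To illustrate what "pre-split to 200 regions" means for a uniform load, here is a minimal sketch that computes evenly spaced split keys. It assumes row keys begin with a uniformly distributed 8-digit lowercase hex prefix, which is an assumption made for illustration and not something YCSB or HBase guarantees; `PreSplitSketch` and `hexSplitKeys` are hypothetical names and this is not Luke's script.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; assumes keys start with a uniform 8-digit hex prefix.
final class PreSplitSketch {
  /** Returns regions - 1 split keys that cut the 32-bit hex keyspace into equal ranges. */
  static List<String> hexSplitKeys(int regions) {
    if (regions < 2) {
      throw new IllegalArgumentException("need at least 2 regions");
    }
    long space = 1L << 32; // number of distinct 8-digit hex prefixes
    List<String> keys = new ArrayList<>(regions - 1);
    for (int i = 1; i < regions; i++) {
      long boundary = space * i / regions;
      keys.add(String.format("%08x", boundary));
    }
    return keys;
  }
}
```

With 4 regions the boundaries fall at 40000000, 80000000 and c0000000; with 200 regions there are 199 boundaries. The resulting keys would then be supplied as split points when the table is created.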





[jira] [Updated] (HBASE-7572) move metadata settings that duplicate xml config settings to CF/table config in a backward-compatible manner

2013-12-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7572:


Attachment: HBASE-7572-v4.patch

I noticed this got stuck... rebased the patch.

[~enis] I noticed the durability changes added another setting to HTD instead of a 
config setting... Did you add it intentionally, or did you not know about config 
overrides? Just checking. With this patch (maybe I will add a comment about it to 
HTD on review iteration/commit), hopefully we can start using configuration in 
preference to HTD custom fields.

> move metadata settings that duplicate xml config settings to CF/table config 
> in a backward-compatible manner
> 
>
> Key: HBASE-7572
> URL: https://issues.apache.org/jira/browse/HBASE-7572
> Project: HBase
>  Issue Type: Improvement
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HBASE-7572-v0.patch, HBASE-7572-v1.patch, 
> HBASE-7572-v2.patch, HBASE-7572-v3.patch, HBASE-7572-v4.patch
>
>
> 2nd part of splitting HBASE-7236





[jira] [Updated] (HBASE-10048) Add hlog number metric in regionserver

2013-12-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10048:
--

  Component/s: metrics
Fix Version/s: 0.99.0
   0.96.1
   0.98.0

> Add hlog number metric in regionserver
> --
>
> Key: HBASE-10048
> URL: https://issues.apache.org/jira/browse/HBASE-10048
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.99.0
>
> Attachments: HBASE-10048-0.94-v1.diff, HBASE-10048-0.94-v2.diff, 
> HBASE-10048-trunk-v1.diff, HBASE-10048-trunk-v2.diff, 
> HBASE-10048-trunk-v3.diff, HBASE-10048-trunk-v4.diff
>
>
> Add hlog number metric in regionserver. 
> We can use this metric to alert about memstore flush because of too many 
> hlogs.





[jira] [Commented] (HBASE-10048) Add hlog number metric in regionserver

2013-12-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841904#comment-13841904
 ] 

stack commented on HBASE-10048:
---

+1 Patch is great.  [~eclark] Please bless and then I'll commit.

> Add hlog number metric in regionserver
> --
>
> Key: HBASE-10048
> URL: https://issues.apache.org/jira/browse/HBASE-10048
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.99.0
>
> Attachments: HBASE-10048-0.94-v1.diff, HBASE-10048-0.94-v2.diff, 
> HBASE-10048-trunk-v1.diff, HBASE-10048-trunk-v2.diff, 
> HBASE-10048-trunk-v3.diff, HBASE-10048-trunk-v4.diff
>
>
> Add hlog number metric in regionserver. 
> We can use this metric to alert about memstore flush because of too many 
> hlogs.





[jira] [Updated] (HBASE-9892) Add info port to ServerName to support multi instances in a node

2013-12-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-9892:
-

Fix Version/s: 0.99.0
   0.96.1
   0.98.0

> Add info port to ServerName to support multi instances in a node
> 
>
> Key: HBASE-9892
> URL: https://issues.apache.org/jira/browse/HBASE-9892
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.99.0
>
> Attachments: HBASE-9892-0.94-v1.diff, HBASE-9892-0.94-v2.diff, 
> HBASE-9892-0.94-v3.diff, HBASE-9892-0.94-v4.diff, HBASE-9892-0.94-v5.diff, 
> HBASE-9892-trunk-v1.diff, HBASE-9892-trunk-v1.patch, HBASE-9892-v5.txt
>
>
> The full GC time of a regionserver with a big heap (> 30G) usually cannot be 
> kept under 30s. At the same time, servers with 64G of memory are common. 
> So we try to deploy multiple RS instances (2-3) on a single node, with the heap of 
> each RS at about 20G ~ 24G.
> Most things work fine, except the HBase web UI. The master gets the RS 
> info port from conf, which is not suitable for this situation of multiple RS 
> instances on a node. So we add the info port to ServerName.
> a. At startup, the RS reports its info port to HMaster.
> b. For the root region, the RS writes the servername with info port to the zookeeper 
> root-region-server node.
> c. For meta regions, the RS writes the servername with info port to the root region.
> d. For user regions, the RS writes the servername with info port to the meta regions.
> So HMaster and clients can get the info port from the servername.
> To test this feature, I changed the RS num from 1 to 3 in standalone mode, so 
> we can test it in standalone mode.
> I think Hoya (HBase on YARN) will encounter the same problem. Does anyone know 
> how Hoya handles this problem?
> PS: There are different formats for servername in the zk node and meta table; I 
> think we need to unify them and refactor the code.





[jira] [Commented] (HBASE-9892) Add info port to ServerName to support multi instances in a node

2013-12-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841897#comment-13841897
 ] 

stack commented on HBASE-9892:
--

[~enis] You good w/ patch as is?  I am.  Wouldn't mind getting it into 0.96.1.

> Add info port to ServerName to support multi instances in a node
> 
>
> Key: HBASE-9892
> URL: https://issues.apache.org/jira/browse/HBASE-9892
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.99.0
>
> Attachments: HBASE-9892-0.94-v1.diff, HBASE-9892-0.94-v2.diff, 
> HBASE-9892-0.94-v3.diff, HBASE-9892-0.94-v4.diff, HBASE-9892-0.94-v5.diff, 
> HBASE-9892-trunk-v1.diff, HBASE-9892-trunk-v1.patch, HBASE-9892-v5.txt
>
>
> The full GC time of a regionserver with a big heap (> 30G) usually cannot be 
> kept under 30s. At the same time, servers with 64G of memory are common. 
> So we try to deploy multiple RS instances (2-3) on a single node, with the heap of 
> each RS at about 20G ~ 24G.
> Most things work fine, except the HBase web UI. The master gets the RS 
> info port from conf, which is not suitable for this situation of multiple RS 
> instances on a node. So we add the info port to ServerName.
> a. At startup, the RS reports its info port to HMaster.
> b. For the root region, the RS writes the servername with info port to the zookeeper 
> root-region-server node.
> c. For meta regions, the RS writes the servername with info port to the root region.
> d. For user regions, the RS writes the servername with info port to the meta regions.
> So HMaster and clients can get the info port from the servername.
> To test this feature, I changed the RS num from 1 to 3 in standalone mode, so 
> we can test it in standalone mode.
> I think Hoya (HBase on YARN) will encounter the same problem. Does anyone know 
> how Hoya handles this problem?
> PS: There are different formats for servername in the zk node and meta table; I 
> think we need to unify them and refactor the code.





[jira] [Updated] (HBASE-9892) Add info port to ServerName to support multi instances in a node

2013-12-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-9892:
-

Attachment: HBASE-9892-trunk-v1.patch

Retry

> Add info port to ServerName to support multi instances in a node
> 
>
> Key: HBASE-9892
> URL: https://issues.apache.org/jira/browse/HBASE-9892
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Attachments: HBASE-9892-0.94-v1.diff, HBASE-9892-0.94-v2.diff, 
> HBASE-9892-0.94-v3.diff, HBASE-9892-0.94-v4.diff, HBASE-9892-0.94-v5.diff, 
> HBASE-9892-trunk-v1.diff, HBASE-9892-trunk-v1.patch, HBASE-9892-v5.txt
>
>
> The full GC time of a regionserver with a big heap (> 30G) usually cannot be 
> kept under 30s. At the same time, servers with 64G of memory are common. 
> So we try to deploy multiple RS instances (2-3) on a single node, with the heap of 
> each RS at about 20G ~ 24G.
> Most things work fine, except the HBase web UI. The master gets the RS 
> info port from conf, which is not suitable for this situation of multiple RS 
> instances on a node. So we add the info port to ServerName.
> a. At startup, the RS reports its info port to HMaster.
> b. For the root region, the RS writes the servername with info port to the zookeeper 
> root-region-server node.
> c. For meta regions, the RS writes the servername with info port to the root region.
> d. For user regions, the RS writes the servername with info port to the meta regions.
> So HMaster and clients can get the info port from the servername.
> To test this feature, I changed the RS num from 1 to 3 in standalone mode, so 
> we can test it in standalone mode.
> I think Hoya (HBase on YARN) will encounter the same problem. Does anyone know 
> how Hoya handles this problem?
> PS: There are different formats for servername in the zk node and meta table; I 
> think we need to unify them and refactor the code.





[jira] [Commented] (HBASE-10092) Move up on to log4j2

2013-12-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841892#comment-13841892
 ] 

stack commented on HBASE-10092:
---

Make sure zkcli works when I am done...see if I can fix the eclipse issue over 
in HBASE-10073 while I am at it.

> Move up on to log4j2
> 
>
> Key: HBASE-10092
> URL: https://issues.apache.org/jira/browse/HBASE-10092
> Project: HBase
>  Issue Type: Task
>Reporter: stack
> Attachments: 10092.txt
>
>
> Allows logging with less friction.  See http://logging.apache.org/log4j/2.x/  
> This rather radical transition can be done w/ minor change given they have an 
> adapter for apache's logging, the one we use.  They also have an adapter for 
> slf4j so we likely can remove at least some of the 4 versions of this module 
> our dependencies make use of.
> I made a start in attached patch but am currently stuck in maven dependency 
> resolve hell courtesy of our slf4j.  Fixing will take some concentration and 
> a good net connection, an item I currently lack.  Other TODOs are that we will 
> need to fix our little log level setting jsp page -- will likely have to undo 
> our use of hadoop's tool here -- and the config system changes a little.
> I will return to this project soon.  Will bring numbers.
>  





[jira] [Commented] (HBASE-10073) Revert HBASE-9718 (Add a test scope dependency on org.slf4j:slf4j-api to hbase-client)

2013-12-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841890#comment-13841890
 ] 

stack commented on HBASE-10073:
---

Ugh.  Yeah, this is awful.  I was trying to update the logging lib in hbase and 
noticed that there are four versions of slf4j being pulled in by our dependencies 
and that these four versions cannot be replaced by a single one as they are 
incompatible.  slf4j is also doing us the favor of spamming the console w/ a 
message if it finds more than one version of slf4j on the classpath.  Nice.  
Let me link this issue to my log4j update issue to be sure I don't break it.  
At the moment I cannot upgrade because of this issue in slf4j.

> Revert HBASE-9718 (Add a test scope dependency on org.slf4j:slf4j-api to 
> hbase-client)
> --
>
> Key: HBASE-10073
> URL: https://issues.apache.org/jira/browse/HBASE-10073
> Project: HBase
>  Issue Type: Bug
>  Components: Zookeeper
>Affects Versions: 0.96.1
> Environment: Centos6, sun-jdk-64bit-1.7.0.25
>Reporter: Aleksandr Shulman
>Assignee: Andrew Purtell
> Fix For: 0.98.0, 0.96.1, 0.99.0
>
> Attachments: 10073.patch
>
>
> Observed behavior:
> In my automation, I have a call to hbase zkcli. That call recently broke with 
> this checkin: 
> https://github.com/apache/hbase/commit/5af0a60efed91ac2084f25f13edb21db0f510e7c
> The error that is reported is:
> {code}++ ./hbase zkcli
> 11:19:58  Warning: $HADOOP_HOME is deprecated.
> 11:19:58  
> 11:20:00  Exception in thread "main" java.lang.IllegalAccessError: tried to 
> access field org.slf4j.impl.StaticLoggerBinder.SINGLETON from class 
> org.slf4j.LoggerFactory
> 11:20:00  at org.slf4j.LoggerFactory.<clinit>(LoggerFactory.java:60)
> 11:20:00  at 
> org.apache.zookeeper.ZooKeeperMain.<clinit>(ZooKeeperMain.java:50)
> 11:20:00  at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperMainServer.main(ZooKeeperMainServer.java:78)
> 11:20:00  Build step 'Execute shell' marked build as failure{code}
> That said, this checkin is perfectly valid as each component should be 
> allowed to specify its own dependencies.
> The issue is a deeper one of dependency mismatches.
> Note: This issue only affects hadoop1, not hadoop2. It also appears in trunk, 
> where there is a similar checkin, but since trunk is not required to work 
> against hadoop1, this is not an issue for trunk.





[jira] [Updated] (HBASE-10094) Add batching to HLogPerformanceEvaluation

2013-12-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-10094:
--

   Resolution: Fixed
Fix Version/s: 0.99.0
   0.96.1
   0.98.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks Himanshu.  Yeah, gives some insight on sync rates.  Let me see if I can 
get better reporting, reporting that will expose clumping of syncs or syncing 
even though only a small amount of data has been written.  This would be good 
to know too.

> Add batching to HLogPerformanceEvaluation
> -
>
> Key: HBASE-10094
> URL: https://issues.apache.org/jira/browse/HBASE-10094
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance, wal
>Reporter: stack
>Assignee: Himanshu Vashishtha
> Fix For: 0.98.0, 0.96.1, 0.99.0
>
> Attachments: 10094v2.txt
>
>
> As Himanshu points out in the parent issue, HLogPE is using an unorthodox 
> API for appending edits to the WAL; it is using an API that is meant for tests 
> only that does an append immediately followed by a sync call.
> In normal deploy, WAL appends are done as a bunch of appends followed by a 
> sync on the tail of the transaction -- not a sync per append.
> This issue is about changing HLogPE to use append and then sync.  It also 
> adds an argument so you can specify batching of a set of appends before  
> the sync is called.  The latter lets HLogPE mimic multi puts that use the 
> minibatch... which appends, appends, appends.. and then syncs.
> Assigning to Himanshu for review.
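
For readers of the archive, the difference between sync-per-append and batched appends can be sketched roughly as below. This is only an illustration under stated assumptions: BatchedWalSketch, its methods, and the batchSize parameter are hypothetical names, and it uses an in-memory list rather than HLogPE or a real WAL.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory log (not the HBase WAL); it only counts sync calls
// so the two write patterns can be compared.
class BatchedWalSketch {
    private final List<String> edits = new ArrayList<>();
    private int syncCalls = 0;

    void append(String edit) { edits.add(edit); }
    void sync() { syncCalls++; }
    int syncCalls() { return syncCalls; }
    int size() { return edits.size(); }

    // Appends numEdits edits, calling sync once after every batchSize appends
    // and once more if a partial batch is left over at the end.
    static BatchedWalSketch run(int numEdits, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        BatchedWalSketch wal = new BatchedWalSketch();
        int pending = 0;
        for (int i = 0; i < numEdits; i++) {
            wal.append("edit-" + i);
            pending++;
            if (pending == batchSize) {
                wal.sync();
                pending = 0;
            }
        }
        if (pending > 0) {
            wal.sync();
        }
        return wal;
    }
}
```

With a batch size of 1 this degenerates to the append-then-sync-every-time pattern the description calls unorthodox; larger batch sizes mimic the minibatch path.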





[jira] [Updated] (HBASE-10061) TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE

2013-12-06 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-10061:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to all four branches. Thanks for the patch Amit!

> TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in 
> thrown NPE
> --
>
> Key: HBASE-10061
> URL: https://issues.apache.org/jira/browse/HBASE-10061
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.94.12
>Reporter: Amit Sela
>Assignee: Amit Sela
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0
>
> Attachments: 10061-trunk.txt, 10061-trunk.txt, HBASE-10061.patch
>
>
> TableMapReduceUtil.findOrCreateJar line 596:
> jar = getJar(my_class);
> updateMap(jar, packagedClasses);
> In case getJar returns null, updateMap will throw NPE.
> Should check null==jar before calling updateMap.
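
The guard suggested above might look roughly like the following. This is a sketch, not the committed patch: JarLookupSketch and its simplified getJar/updateMap signatures are hypothetical stand-ins for the real TableMapReduceUtil methods, which are not reproduced here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the jar lookup logic; not the real TableMapReduceUtil.
class JarLookupSketch {
    // Simulates getJar: returns null when the class is not found in any known jar.
    static String getJar(Class<?> clazz, Map<String, String> knownJars) {
        return knownJars.get(clazz.getName());
    }

    // Simulates updateMap: dereferences jar, so a null argument would throw an NPE.
    static void updateMap(String jar, Map<String, String> packagedClasses) {
        packagedClasses.put(jar.trim(), jar);
    }

    // Looks up the jar and only updates the map when a jar was actually found.
    static String findOrCreateJar(Class<?> clazz,
                                  Map<String, String> knownJars,
                                  Map<String, String> packagedClasses) {
        String jar = getJar(clazz, knownJars);
        if (jar != null) {
            updateMap(jar, packagedClasses);
        }
        return jar;
    }
}
```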





[jira] [Commented] (HBASE-10061) TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE

2013-12-06 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841858#comment-13841858
 ] 

Nick Dimiduk commented on HBASE-10061:
--

Patch applies cleanly on all 4 branches. Locally ran -Dtest=TestTableMapReduce 
vs default hadoop profile on each branch, all passed.

> TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in 
> thrown NPE
> --
>
> Key: HBASE-10061
> URL: https://issues.apache.org/jira/browse/HBASE-10061
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.94.12
>Reporter: Amit Sela
>Assignee: Amit Sela
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0
>
> Attachments: 10061-trunk.txt, 10061-trunk.txt, HBASE-10061.patch
>
>
> TableMapReduceUtil.findOrCreateJar line 596:
> jar = getJar(my_class);
> updateMap(jar, packagedClasses);
> In case getJar returns null, updateMap will throw NPE.
> Should check null==jar before calling updateMap.





[jira] [Commented] (HBASE-10000) Initiate lease recovery for outstanding WAL files at the very beginning of recovery

2013-12-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841843#comment-13841843
 ] 

Ted Yu commented on HBASE-10000:


Nicolas: thanks for the valuable comments.

Patch v4 should address all your comments. Let me answer the last several here 
since they're important.

bq. may be it should be outside of the loop
Done.

bq. the previous versions was doing multiple calls.
I modified the condition so that calls are made when nbAttempt > 0.

bq. but ifFileClosed is not available, we will never succeed.
The behavior is not changed: we would rely on recoverLease() to be successful.

I modified the condition for the initial sleep according to your comments above.

bq. Could we say that 'isFileClosed' will be mandatory there?
There hasn't been consensus as to which release would have such a prerequisite. 
So I want the changes to cover all supported hadoop releases.

bq. I looked especially at FSHDFSUtils
This is the part where more attention should be paid.

> Initiate lease recovery for outstanding WAL files at the very beginning of 
> recovery
> ---
>
> Key: HBASE-10000
> URL: https://issues.apache.org/jira/browse/HBASE-10000
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 0.98.1
>
> Attachments: 1-recover-ts-with-pb-2.txt, 
> 1-recover-ts-with-pb-3.txt, 1-recover-ts-with-pb-4.txt, 1-v1.txt, 
> 1-v4.txt, 1-v5.txt, 1-v6.txt
>
>
> At the beginning of recovery, master can send lease recovery requests 
> concurrently for outstanding WAL files using a thread pool.
> Each split worker would first check whether the WAL file it processes is 
> closed.
> Thanks to Nicolas Liochon and Jeffery; discussion with them gave rise to this 
> idea. 
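
As a rough illustration of the idea in the description, concurrent lease recovery over a thread pool could be sketched as below. The names are assumptions: LeaseRecoverySketch and the LeaseRecoverer callback are hypothetical, and the callback merely stands in for whatever filesystem call the patch actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class LeaseRecoverySketch {
    // Hypothetical callback standing in for a filesystem lease recovery call;
    // returns true when the file is closed after the request.
    interface LeaseRecoverer {
        boolean recoverLease(String walPath) throws Exception;
    }

    // Sends one recovery request per WAL path concurrently and returns the
    // paths that are still not closed (or whose request failed).
    static List<String> recoverAll(List<String> walPaths, LeaseRecoverer recoverer, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (String path : walPaths) {
                tasks.add(() -> recoverer.recoverLease(path));
            }
            // invokeAll returns futures in the same order as the submitted tasks.
            List<Future<Boolean>> results = pool.invokeAll(tasks);
            List<String> notClosed = new ArrayList<>();
            for (int i = 0; i < results.size(); i++) {
                boolean closed;
                try {
                    closed = results.get(i).get();
                } catch (ExecutionException e) {
                    closed = false; // treat a failed request as "not yet closed"
                }
                if (!closed) {
                    notClosed.add(walPaths.get(i));
                }
            }
            return notClosed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new ArrayList<>(walPaths); // nothing confirmed closed
        } finally {
            pool.shutdown();
        }
    }
}
```

A split worker would then check whether its file is in the returned list before reading it, which corresponds to the "first check whether the WAL file it processes is closed" step above.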





[jira] [Updated] (HBASE-10000) Initiate lease recovery for outstanding WAL files at the very beginning of recovery

2013-12-06 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10000:
---

Attachment: 1-recover-ts-with-pb-4.txt

> Initiate lease recovery for outstanding WAL files at the very beginning of 
> recovery
> ---
>
> Key: HBASE-10000
> URL: https://issues.apache.org/jira/browse/HBASE-10000
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 0.98.1
>
> Attachments: 1-recover-ts-with-pb-2.txt, 
> 1-recover-ts-with-pb-3.txt, 1-recover-ts-with-pb-4.txt, 1-v1.txt, 
> 1-v4.txt, 1-v5.txt, 1-v6.txt
>
>
> At the beginning of recovery, master can send lease recovery requests 
> concurrently for outstanding WAL files using a thread pool.
> Each split worker would first check whether the WAL file it processes is 
> closed.
> Thanks to Nicolas Liochon and Jeffery; discussion with them gave rise to this 
> idea. 





[jira] [Updated] (HBASE-10061) TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE

2013-12-06 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-10061:
-

Fix Version/s: 0.99.0

> TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in 
> thrown NPE
> --
>
> Key: HBASE-10061
> URL: https://issues.apache.org/jira/browse/HBASE-10061
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.94.12
>Reporter: Amit Sela
>Assignee: Amit Sela
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.94.15, 0.99.0
>
> Attachments: 10061-trunk.txt, 10061-trunk.txt, HBASE-10061.patch
>
>
> TableMapReduceUtil.findOrCreateJar line 596:
> jar = getJar(my_class);
> updateMap(jar, packagedClasses);
> In case getJar returns null, updateMap will throw NPE.
> Should check null==jar before calling updateMap.





[jira] [Commented] (HBASE-10085) Some regions aren't re-assigned after a cluster restarts

2013-12-06 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841827#comment-13841827
 ] 

Jimmy Xiang commented on HBASE-10085:
-

bq. The reason to restart whole cluster is that I need to trigger SSH on both 
old RSs(source RS and dst RS in a region assignment) to repro the exact issue 
to verify the fix.
That's my understanding too. But sometimes, it takes a while to restart the 
cluster. Let me think about it.

> Some regions aren't re-assigned after a cluster restarts
> 
>
> Key: HBASE-10085
> URL: https://issues.apache.org/jira/browse/HBASE-10085
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.96.1
>Reporter: Jeffrey Zhong
>Assignee: Jeffrey Zhong
> Fix For: 0.98.0, 0.96.1
>
> Attachments: hbase-10085.patch
>
>
> We saw this issue happen during a cluster restart:
> 1) When shutting down a cluster, some regions are in offline state because no 
> region servers are available (stop RS and then Master).
> 2) When the cluster restarts, the offlined regions are forced to be offline 
> again and SSH skips re-assigning them in function AM.processServerShutdown, as 
> shown below.
> {code}
> 2013-12-03 10:41:56,686 INFO  
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE
> 2013-12-03 10:41:56,686 DEBUG 
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on 
> deadserver; forcing offline
> ...
> 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force 
> region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, 
> ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
> ...
> 2013-12-03 10:41:57,223 WARN  
> [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] 
> master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected 
> {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
>  
> {code}





[jira] [Commented] (HBASE-10098) [WINDOWS] pass in native library directory from hadoop for unit tests

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841828#comment-13841828
 ] 

Hadoop QA commented on HBASE-10098:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617488/hbase-10098_v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8078//console

This message is automatically generated.

> [WINDOWS] pass in native library directory from hadoop for unit tests
> -
>
> Key: HBASE-10098
> URL: https://issues.apache.org/jira/browse/HBASE-10098
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: hbase-10098_v1.patch, hbase-10098_v2.patch
>
>
> On Windows, Hadoop depends on native libraries for doing its job. The bin 
> scripts already handle finding hadoop's native libs and adding them to 
> java.library.path, but for running HBase's unit tests, we need to pass them 
> in. 





[jira] [Commented] (HBASE-10094) Add batching to HLogPerformanceEvaluation

2013-12-06 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841821#comment-13841821
 ] 

Himanshu Vashishtha commented on HBASE-10094:
-

Errr, just ignore my last nit... TestHLog doesn't use default iterations. It's 
good to go. Thanks.

> Add batching to HLogPerformanceEvaluation
> -
>
> Key: HBASE-10094
> URL: https://issues.apache.org/jira/browse/HBASE-10094
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance, wal
>Reporter: stack
>Assignee: Himanshu Vashishtha
> Attachments: 10094v2.txt
>
>
> As Himanshu points out in the parent issue, HLogPE is using an unorthodox 
> API for appending edits to the WAL; it is using an API that is meant for tests 
> only that does an append immediately followed by a sync call.
> In normal deploy, WAL appends are done as a bunch of appends followed by a 
> sync on the tail of the transaction -- not a sync per append.
> This issue is about changing HLogPE to use append and then sync.  It also 
> adds an argument so you can specify batching of a set of appends before  
> the sync is called.  The latter lets HLogPE mimic multi puts that use the 
> minibatch... which appends, appends, appends.. and then syncs.
> Assigning to Himanshu for review.





[jira] [Commented] (HBASE-10098) [WINDOWS] pass in native library directory from hadoop for unit tests

2013-12-06 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841820#comment-13841820
 ] 

Nick Dimiduk commented on HBASE-10098:
--

lgtm.

> [WINDOWS] pass in native library directory from hadoop for unit tests
> -
>
> Key: HBASE-10098
> URL: https://issues.apache.org/jira/browse/HBASE-10098
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: hbase-10098_v1.patch, hbase-10098_v2.patch
>
>
> On Windows, Hadoop depends on native libraries for doing its job. The bin 
> scripts already handle finding hadoop's native libs and adding them to 
> java.library.path, but for running HBase's unit tests, we need to pass them 
> in. 





[jira] [Commented] (HBASE-10094) Add batching to HLogPerformanceEvaluation

2013-12-06 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841816#comment-13841816
 ] 

Himanshu Vashishtha commented on HBASE-10094:
-

+1. This would give more insights when comparing various schemes on batching 
sync calls.

minor nit:
-long numIterations = 1;
+long numIterations = 100;

I don't think we need this. It will make TestHLog run longer by almost 90 sec.

> Add batching to HLogPerformanceEvaluation
> -
>
> Key: HBASE-10094
> URL: https://issues.apache.org/jira/browse/HBASE-10094
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance, wal
>Reporter: stack
>Assignee: Himanshu Vashishtha
> Attachments: 10094v2.txt
>
>
> As Himanshu points out in the parent issue, HLogPE is using an unorthodox 
> API for appending edits to the WAL; it is using an API that is meant for tests 
> only that does an append immediately followed by a sync call.
> In normal deploy, WAL appends are done as a bunch of appends followed by a 
> sync on the tail of the transaction -- not a sync per append.
> This issue is about changing HLogPE to use append and then sync.  It also 
> adds an argument so you can specify batching of a set of appends before  
> the sync is called.  The latter lets HLogPE mimic multi puts that use the 
> minibatch... which appends, appends, appends.. and then syncs.
> Assigning to Himanshu for review.





[jira] [Commented] (HBASE-10099) javadoc warning introduced by LabelExpander 188: warning - @return tag has no arguments

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841810#comment-13841810
 ] 

Hadoop QA commented on HBASE-10099:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12617482/HBASE-10099-trunk-v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8077//console

This message is automatically generated.

> javadoc warning introduced by LabelExpander 188: warning - @return tag has no 
> arguments 
> 
>
> Key: HBASE-10099
> URL: https://issues.apache.org/jira/browse/HBASE-10099
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Demai Ni
>Assignee: Demai Ni
>Priority: Trivial
> Fix For: 0.98.0
>
> Attachments: HBASE-10099-trunk-v0.patch, HBASE-10099-trunk-v1.patch, 
> HBASE-10099-trunk-v2.patch
>
>
> src/main/java/org/apache/hadoop/hbase/mapreduce/LabelExpander.java:188: 
> warning - @return tag has no arguments





[jira] [Commented] (HBASE-10098) [WINDOWS] pass in native library directory from hadoop for unit tests

2013-12-06 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841809#comment-13841809
 ] 

Enis Soztutar commented on HBASE-10098:
---

bq. Okay. Would it be useful to propagate the parent process arguments in 
addition to what you have here?
Makes sense. v2 patch adds that. 

> [WINDOWS] pass in native library directory from hadoop for unit tests
> -
>
> Key: HBASE-10098
> URL: https://issues.apache.org/jira/browse/HBASE-10098
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: hbase-10098_v1.patch, hbase-10098_v2.patch
>
>
> On Windows, Hadoop depends on native libraries for doing its job. The bin 
> scripts already handle finding hadoop's native libs and adding them to 
> java.library.path, but for running HBase's unit tests, we need to pass them 
> in. 





[jira] [Commented] (HBASE-5487) Generic framework for Master-coordinated tasks

2013-12-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841811#comment-13841811
 ] 

Sergey Shelukhin commented on HBASE-5487:
-

Ah well, I never got to part 2. Did you guys make progress on this? I may have 
time to resurrect this again soon.

> Generic framework for Master-coordinated tasks
> --
>
> Key: HBASE-5487
> URL: https://issues.apache.org/jira/browse/HBASE-5487
> Project: HBase
>  Issue Type: New Feature
>  Components: master, regionserver, Zookeeper
>Affects Versions: 0.94.0
>Reporter: Mubarak Seyed
>Assignee: Sergey Shelukhin
>Priority: Critical
> Attachments: Entity management in Master - part 1.pdf, Entity 
> management in Master - part 1.pdf, Is the FATE of Assignment Manager 
> FATE.pdf, Region management in Master.pdf, Region management in Master5.docx, 
> hbckMasterV2-long.pdf, hbckMasterV2b-long.pdf
>
>
> Need a framework to execute master-coordinated tasks in a fault-tolerant 
> manner. 
> Master-coordinated tasks such as online schema change and delete-range 
> (deleting region(s) based on start/end key) can make use of this framework.
> The advantages of the framework are:
> 1. Eliminate repeated code in Master, ZooKeeper tracker and Region-server for 
> master-coordinated tasks
> 2. Ability to abstract the common functions across Master -> ZK and RS -> ZK
> 3. Easy to plug in new master-coordinated tasks without adding code to core 
> components





[jira] [Updated] (HBASE-10098) [WINDOWS] pass in native library directory from hadoop for unit tests

2013-12-06 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-10098:
--

Attachment: hbase-10098_v2.patch

> [WINDOWS] pass in native library directory from hadoop for unit tests
> -
>
> Key: HBASE-10098
> URL: https://issues.apache.org/jira/browse/HBASE-10098
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: hbase-10098_v1.patch, hbase-10098_v2.patch
>
>
> On Windows, Hadoop depends on native libraries for doing its job. The bin 
> scripts already handle finding hadoop's native libs and adding them to 
> java.library.path, but for running HBase's unit tests, we need to pass them 
> in. 





[jira] [Updated] (HBASE-9955) Make hadoop2 the default and deprecate hadoop1

2013-12-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-9955:
-

Attachment: 9955v5.098.txt

What I applied to 0.98 (includes little addendum)

> Make hadoop2 the default and deprecate hadoop1
> --
>
> Key: HBASE-9955
> URL: https://issues.apache.org/jira/browse/HBASE-9955
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: stack
> Fix For: 0.98.0
>
> Attachments: 9955.txt, 9955v2.txt, 9955v3.txt, 9955v4.txt, 
> 9955v5.098.txt, 9955v5.txt, addendum.txt, addendum.txt
>
>
> See "Hadoop version trunk dependency?" on the dev mailing list.  Consensus 
> seems to be forming to do the subject line (Recheck the mail thread before 
> going ahead).





[jira] [Updated] (HBASE-10061) TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE

2013-12-06 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-10061:
-

Assignee: Amit Sela

> TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in 
> thrown NPE
> --
>
> Key: HBASE-10061
> URL: https://issues.apache.org/jira/browse/HBASE-10061
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.94.12
>Reporter: Amit Sela
>Assignee: Amit Sela
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.94.15
>
> Attachments: 10061-trunk.txt, 10061-trunk.txt, HBASE-10061.patch
>
>
> TableMapReduceUtil.findOrCreateJar line 596:
> jar = getJar(my_class);
> updateMap(jar, packagedClasses);
> In case getJar returns null, updateMap will throw NPE.
> Should check null==jar before calling updateMap.
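
A minimal sketch of the null check suggested above. The getJar and updateMap bodies here are hypothetical stand-ins (the real TableMapReduceUtil helpers have different signatures and logic); only the guard in findOrCreateJar illustrates the proposal, and it is not necessarily what the attached patch does.

```java
import java.util.HashMap;
import java.util.Map;

class NullSafeJarLookupSketch {
  // Hypothetical stand-in for getJar(): returns null when no jar is known for the class.
  static String getJar(Class<?> clazz, Map<String, String> jarsByClass) {
    return jarsByClass.get(clazz.getName());
  }

  // Hypothetical stand-in for updateMap(): like the call in the report, it rejects a null jar.
  static void updateMap(String jar, Map<String, String> packagedClasses) {
    if (jar == null) {
      throw new NullPointerException("jar must not be null");
    }
    packagedClasses.put(jar, jar);
  }

  // The guard: only call updateMap when getJar actually found something.
  static String findOrCreateJar(Class<?> clazz, Map<String, String> jarsByClass,
      Map<String, String> packagedClasses) {
    String jar = getJar(clazz, jarsByClass);
    if (jar != null) {
      updateMap(jar, packagedClasses);
    }
    return jar;
  }

  public static void main(String[] args) {
    Map<String, String> jarsByClass = new HashMap<>();
    jarsByClass.put(String.class.getName(), "some.jar");
    Map<String, String> packaged = new HashMap<>();
    System.out.println(findOrCreateJar(String.class, jarsByClass, packaged));
    System.out.println(findOrCreateJar(Integer.class, jarsByClass, packaged));
  }
}
```

With the guard in place, a class that cannot be traced back to a jar yields null for the caller to handle instead of an NPE inside updateMap.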





[jira] [Resolved] (HBASE-7667) Support stripe compaction

2013-12-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HBASE-7667.
-

   Resolution: Fixed
Fix Version/s: 0.99.0, 0.98.0

All the pertinent patches have been committed for some time (before 98 was 
branched).

> Support stripe compaction
> -
>
> Key: HBASE-7667
> URL: https://issues.apache.org/jira/browse/HBASE-7667
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.98.0, 0.99.0
>
> Attachments: Stripe compaction perf evaluation.pdf, Stripe compaction 
> perf evaluation.pdf, Stripe compaction perf evaluation.pdf, Stripe 
> compactions.pdf, Stripe compactions.pdf, Stripe compactions.pdf, Stripe 
> compactions.pdf, Using stripe compactions.pdf, Using stripe compactions.pdf, 
> Using stripe compactions.pdf, stripe-cdf.pdf
>
>
> So I was thinking about having many regions as the way to make compactions 
> more manageable, and writing the level db doc about how level db range 
> overlap and data mixing breaks seqNum sorting, and discussing it with Jimmy, 
> Matteo and Ted, and thinking about how to avoid Level DB I/O multiplication 
> factor.
> And I suggest the following idea, let's call it stripe compactions. It's a 
> mix between level db ideas and having many small regions.
> It allows us to have a subset of benefits of many regions (wrt reads and 
> compactions) without many of the drawbacks (managing and current 
> memstore/etc. limitation).
> It also doesn't break seqNum-based file sorting for any one key.
> It works like this.
> The region key space is separated into a configurable number of fixed-boundary 
> stripes (determined the first time we stripe the data, see below).
> All the data from memstores is written to normal files with all keys present 
> (not striped), similar to L0 in LevelDb, or current files.
> Compaction policy does 3 types of compactions.
> First is L0 compaction, which takes all L0 files and breaks them down by 
> stripe. It may be optimized by adding more small files from different 
> stripes, but the main logical outcome is that there are no more L0 files and 
> all data is striped.
> Second is exactly like the current compaction, but compacting one single 
> stripe. In the future, nothing prevents us from applying compaction rules and 
> compacting part of the stripe (e.g. similar to the current policy with ratios 
> and stuff, tiers, whatever), but for the first cut I'd argue let it "major 
> compact" the entire stripe. Or just have the ratio and no more complexity.
> Finally, the third addresses the concern of the fixed boundaries causing 
> stripes to be very unbalanced.
> It's exactly like the 2nd, except it takes 2+ adjacent stripes and writes the 
> results out with different boundaries.
> There's a tradeoff here - if we always take 2 adjacent stripes, compactions 
> will be smaller but rebalancing will take a ridiculous amount of I/O.
> If we take many stripes we are essentially getting into the 
> epic-major-compaction problem again. Some heuristics will have to be in place.
> In general, if we let L0 grow before determining the stripes, we will get 
> better boundaries.
> Also, unless the imbalance is really large, we don't really need to rebalance.
> Obviously this scheme (as well as level) is not applicable for all scenarios, 
> e.g. if timestamp is your key it completely falls apart.
> The end result:
> - many small compactions that can be spread out in time.
> - reads still read from a small number of files (one stripe + L0).
> - region splits become marvelously simple (if we could move files between 
> regions, no references would be needed).
> Main advantage over Level (for HBase) is that default store can still open 
> the files and get correct results - there are no range overlap shenanigans.
> It also needs no metadata, although we may record some for convenience.
> It also would appear to not cause as much I/O.
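
The L0 compaction described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the committed implementation: row keys are plain strings, the stripe boundaries are a sorted array of split points, and the names stripeFor and compactL0 are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class StripeCompactionSketch {
  // boundaries are the sorted split points between stripes:
  // stripe 0 holds keys below boundaries[0], stripe i holds keys in
  // [boundaries[i-1], boundaries[i]), and the last stripe holds the rest.
  static int stripeFor(String[] boundaries, String rowKey) {
    int pos = Arrays.binarySearch(boundaries, rowKey);
    // A key equal to a boundary starts the next stripe; otherwise use the insertion point.
    return pos >= 0 ? pos + 1 : -(pos + 1);
  }

  // "L0 compaction": take keys from unstriped files and break them down by stripe.
  static List<List<String>> compactL0(String[] boundaries, List<String> l0Keys) {
    List<List<String>> stripes = new ArrayList<>();
    for (int i = 0; i <= boundaries.length; i++) {
      stripes.add(new ArrayList<>());
    }
    for (String key : l0Keys) {
      stripes.get(stripeFor(boundaries, key)).add(key);
    }
    return stripes;
  }

  public static void main(String[] args) {
    String[] boundaries = {"g", "p"};
    System.out.println(compactL0(boundaries, Arrays.asList("a", "g", "h", "z")));
  }
}
```

Because each key maps to exactly one stripe, a read for that key only has to consult that stripe's files plus whatever is still in L0, which is the property the description relies on.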





[jira] [Commented] (HBASE-10085) Some regions aren't re-assigned after a master restarts

2013-12-06 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841805#comment-13841805
 ] 

Jeffrey Zhong commented on HBASE-10085:
---

I checked in at the same time as your comments. The committed patch has the 
updated format, which comes from our Apache template auto-formatting. 

The reason to restart the whole cluster is that I need to trigger SSH on both 
old RSs (the source RS and the destination RS in a region assignment) to 
reproduce the exact issue and verify the fix.

> Some regions aren't re-assigned after a master restarts
> ---
>
> Key: HBASE-10085
> URL: https://issues.apache.org/jira/browse/HBASE-10085
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.96.1
>Reporter: Jeffrey Zhong
>Assignee: Jeffrey Zhong
> Fix For: 0.98.0, 0.96.1
>
> Attachments: hbase-10085.patch
>
>
> We saw this issue happen during a cluster restart:
> 1) When shutting down a cluster, some regions are in offline state because no 
> region servers are available (stop RS and then Master).
> 2) When the cluster restarts, the offlined regions are forced offline 
> again and SSH skips re-assigning them in function AM.processServerShutdown, as 
> shown below.
> {code}
> 2013-12-03 10:41:56,686 INFO  
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE
> 2013-12-03 10:41:56,686 DEBUG 
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on 
> deadserver; forcing offline
> ...
> 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force 
> region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, 
> ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
> ...
> 2013-12-03 10:41:57,223 WARN  
> [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] 
> master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected 
> {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
>  
> {code}





[jira] [Updated] (HBASE-9955) Make hadoop2 the default and deprecate hadoop1

2013-12-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-9955:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Resolving.

> Make hadoop2 the default and deprecate hadoop1
> --
>
> Key: HBASE-9955
> URL: https://issues.apache.org/jira/browse/HBASE-9955
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: stack
> Fix For: 0.98.0
>
> Attachments: 9955.txt, 9955v2.txt, 9955v3.txt, 9955v4.txt, 
> 9955v5.098.txt, 9955v5.txt, addendum.txt, addendum.txt
>
>
> See "Hadoop version trunk dependency?" on the dev mailing list.  Consensus 
> seems to be forming to do the subject line (Recheck the mail thread before 
> going ahead).





[jira] [Updated] (HBASE-10085) Some regions aren't re-assigned after a cluster restarts

2013-12-06 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-10085:
--

Summary: Some regions aren't re-assigned after a cluster restarts  (was: 
Some regions aren't re-assigned after a master restarts)

> Some regions aren't re-assigned after a cluster restarts
> 
>
> Key: HBASE-10085
> URL: https://issues.apache.org/jira/browse/HBASE-10085
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.96.1
>Reporter: Jeffrey Zhong
>Assignee: Jeffrey Zhong
> Fix For: 0.98.0, 0.96.1
>
> Attachments: hbase-10085.patch
>
>
> We saw this issue happen during a cluster restart:
> 1) When shutting down a cluster, some regions are in offline state because no 
> region servers are available (stop RS and then Master).
> 2) When the cluster restarts, the offlined regions are forced offline 
> again and SSH skips re-assigning them in function AM.processServerShutdown, as 
> shown below.
> {code}
> 2013-12-03 10:41:56,686 INFO  
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE
> 2013-12-03 10:41:56,686 DEBUG 
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on 
> deadserver; forcing offline
> ...
> 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force 
> region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, 
> ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
> ...
> 2013-12-03 10:41:57,223 WARN  
> [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] 
> master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected 
> {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
>  
> {code}





[jira] [Commented] (HBASE-9955) Make hadoop2 the default and deprecate hadoop1

2013-12-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841803#comment-13841803
 ] 

stack commented on HBASE-9955:
--

I applied the addendum to trunk (comment edit in pom.xml).  Now let me backport 
to 0.98.

> Make hadoop2 the default and deprecate hadoop1
> --
>
> Key: HBASE-9955
> URL: https://issues.apache.org/jira/browse/HBASE-9955
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: stack
> Fix For: 0.98.0
>
> Attachments: 9955.txt, 9955v2.txt, 9955v3.txt, 9955v4.txt, 
> 9955v5.txt, addendum.txt, addendum.txt
>
>
> See "Hadoop version trunk dependency?" on the dev mailing list.  Consensus 
> seems to be forming to do the subject line (Recheck the mail thread before 
> going ahead).





[jira] [Commented] (HBASE-10061) TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE

2013-12-06 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841802#comment-13841802
 ] 

Nick Dimiduk commented on HBASE-10061:
--

Sounds good guys, I'll get this committed this afternoon.

This isn't the first bug I've seen related to OSGi classloaders. [~amitsela] 
any chance you could dream up a unit or integration test that will 
realistically exercise this scenario? Have a look at TestTableMapReduce and 
IntegrationTestTableMapReduceUtil for examples. I'm not very familiar with this 
environment, so I appreciate any advice you can provide.

> TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in 
> thrown NPE
> --
>
> Key: HBASE-10061
> URL: https://issues.apache.org/jira/browse/HBASE-10061
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.94.12
>Reporter: Amit Sela
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.94.15
>
> Attachments: 10061-trunk.txt, 10061-trunk.txt, HBASE-10061.patch
>
>
> TableMapReduceUtil.findOrCreateJar line 596:
> jar = getJar(my_class);
> updateMap(jar, packagedClasses);
> In case getJar returns null, updateMap will throw NPE.
> Should check null==jar before calling updateMap.





[jira] [Commented] (HBASE-10094) Add batching to HLogPerformanceEvaluation

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841801#comment-13841801
 ] 

Hadoop QA commented on HBASE-10094:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617453/10094v2.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8075//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8075//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8075//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8075//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8075//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8075//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8075//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8075//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8075//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8075//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8075//console

This message is automatically generated.

> Add batching to HLogPerformanceEvaluation
> -
>
> Key: HBASE-10094
> URL: https://issues.apache.org/jira/browse/HBASE-10094
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance, wal
>Reporter: stack
>Assignee: Himanshu Vashishtha
> Attachments: 10094v2.txt
>
>
> As Himanshu points out in the parent issue, HLogPE is using an unorthodox 
> API for appending edits to the WAL; it is using an API that is meant for tests 
> only that does an append immediately followed by a sync call.
> In a normal deployment, WAL appends are done as a bunch of appends followed by 
> a sync on the tail of the transaction -- not a sync per append.
> This issue is about changing HLogPE to use append and then sync.  It also 
> adds an argument so you can specify batching of a set of appends before 
> the sync is called.  The latter lets HLogPE mimic multi puts that use the 
> minibatch... which appends, appends, appends.. and then syncs.
> Assigning to Himanshu for review.
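
The append-then-sync batching described above might look roughly like the sketch below. The Wal interface, CountingWal and writeEdits are hypothetical names invented for illustration; they are not the HLog API and not the code in the attached patch.

```java
class BatchedAppendSketch {
  // Hypothetical, minimal WAL abstraction: append buffers an edit, sync makes it durable.
  interface Wal {
    void append(String edit);
    void sync();
  }

  // Test double that only counts calls, so the effect of batching is visible.
  static class CountingWal implements Wal {
    int appends;
    int syncs;
    public void append(String edit) { appends++; }
    public void sync() { syncs++; }
  }

  // Append totalEdits edits, syncing once per batchSize appends rather than once per append.
  static void writeEdits(Wal wal, int totalEdits, int batchSize) {
    int pending = 0;
    for (int i = 0; i < totalEdits; i++) {
      wal.append("edit-" + i);
      pending++;
      if (pending == batchSize) {
        wal.sync();
        pending = 0;
      }
    }
    if (pending > 0) {
      wal.sync(); // flush the final, partially filled batch
    }
  }

  public static void main(String[] args) {
    CountingWal wal = new CountingWal();
    writeEdits(wal, 10, 4);
    System.out.println(wal.appends + " appends, " + wal.syncs + " syncs");
  }
}
```

A batch size of 1 reproduces the old sync-per-append behaviour, so a single argument is enough to compare the two modes.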





[jira] [Updated] (HBASE-10061) TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE

2013-12-06 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-10061:
-

Fix Version/s: 0.94.15

> TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in 
> thrown NPE
> --
>
> Key: HBASE-10061
> URL: https://issues.apache.org/jira/browse/HBASE-10061
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.94.12
>Reporter: Amit Sela
>Priority: Minor
> Fix For: 0.98.0, 0.96.1, 0.94.15
>
> Attachments: 10061-trunk.txt, 10061-trunk.txt, HBASE-10061.patch
>
>
> TableMapReduceUtil.findOrCreateJar line 596:
> jar = getJar(my_class);
> updateMap(jar, packagedClasses);
> In case getJar returns null, updateMap will throw NPE.
> Should check null==jar before calling updateMap.





[jira] [Commented] (HBASE-10085) Some regions aren't re-assigned after a master restarts

2013-12-06 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841798#comment-13841798
 ] 

Jimmy Xiang commented on HBASE-10085:
-

Never mind about my previous comment. I can address it in HBASE-10090. Thanks. 

> Some regions aren't re-assigned after a master restarts
> ---
>
> Key: HBASE-10085
> URL: https://issues.apache.org/jira/browse/HBASE-10085
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.96.1
>Reporter: Jeffrey Zhong
>Assignee: Jeffrey Zhong
> Fix For: 0.98.0, 0.96.1
>
> Attachments: hbase-10085.patch
>
>
> We saw this issue happen during a cluster restart:
> 1) When shutting down a cluster, some regions are in offline state because no 
> region servers are available (stop RS and then Master).
> 2) When the cluster restarts, the offlined regions are forced offline 
> again and SSH skips re-assigning them in function AM.processServerShutdown, as 
> shown below.
> {code}
> 2013-12-03 10:41:56,686 INFO  
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE
> 2013-12-03 10:41:56,686 DEBUG 
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on 
> deadserver; forcing offline
> ...
> 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force 
> region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, 
> ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
> ...
> 2013-12-03 10:41:57,223 WARN  
> [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] 
> master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected 
> {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
>  
> {code}





[jira] [Commented] (HBASE-10098) [WINDOWS] pass in native library directory from hadoop for unit tests

2013-12-06 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841796#comment-13841796
 ] 

Nick Dimiduk commented on HBASE-10098:
--

Okay. Would it be useful to propagate the parent process arguments in addition 
to what you have here?

+1.

> [WINDOWS] pass in native library directory from hadoop for unit tests
> -
>
> Key: HBASE-10098
> URL: https://issues.apache.org/jira/browse/HBASE-10098
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: hbase-10098_v1.patch
>
>
> On Windows, Hadoop depends on native libraries for doing its job. The bin 
> scripts already handle finding hadoop's native libs and adding them to 
> java.library.path, but for running HBase's unit tests, we need to pass them 
> in. 





[jira] [Commented] (HBASE-10000) Initiate lease recovery for outstanding WAL files at the very beginning of recovery

2013-12-06 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841792#comment-13841792
 ] 

Nicolas Liochon commented on HBASE-10000:
-

{code}
+  if (!exceptions.isEmpty()) {
+LOG.debug("Encountered " + exceptions.size() + " exceptions");
+throw exceptions.get(0);
+  }
{code}
=> Should be info or warning


{code}
+  public static final int LEASE_RECOVERY_UNREQUESTED = 0;
{code}
=> should be a long, no?


{code}
+if (findIsFileClosedMeth) {
+  try {
+isFileClosedMeth = dfs.getClass().getMethod("isFileClosed",
+  new Class[]{ Path.class });
+  } catch (NoSuchMethodException nsme) {
+LOG.debug("isFileClosed not available");
+  } finally {
+findIsFileClosedMeth = false;
+  }
+}
{code}

=> findIsFileClosedMeth seems to always be true at the beginning, and is not 
read later (i.e. the finally clause is not needed).
The code in this method is very complex to read (that's not your fault :-) ); 
I think if you change it you need to restructure it as well. 
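
One possible shape for the simplification suggested above, as a sketch only: do the reflective lookup in a small helper that returns null when the method is absent, so no separate flag or finally clause is needed. The class and helper names are hypothetical, and the example probes java.lang.String rather than a real DistributedFileSystem so that it runs standalone.

```java
import java.lang.reflect.Method;

class IsFileClosedLookupSketch {
  // Look the method up once; null means "not available on this class".
  static Method findMethod(Object target, String name, Class<?>... parameterTypes) {
    try {
      return target.getClass().getMethod(name, parameterTypes);
    } catch (NoSuchMethodException nsme) {
      return null;
    }
  }

  public static void main(String[] args) {
    // String has isEmpty() but no isFileClosed(String), so the second lookup yields null.
    System.out.println(findMethod("abc", "isEmpty") != null);
    System.out.println(findMethod("abc", "isFileClosed", String.class) != null);
  }
}
```

The caller can store the result in a local variable before entering the retry loop and simply test it for null on each iteration.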


{code}
+  if (ts == HConstants.LEASE_RECOVERY_UNREQUESTED) {
+startWaiting = EnvironmentEdgeManager.currentTimeMillis();
+recovered = recoverLease(dfs, nbAttempt, p, startWaiting);
+  } else {
+startWaiting = ts;
+  }
{code}
=> this seems wrong (on each iteration of the loop, we will reset 
"startWaiting = ts") or maybe it should be outside of the loop

I'm not sure about the previous version, but I think we must be ready to make 
multiple calls to recoverLease, in case the namenode crashed at the wrong time 
or something similar. With this version, it seems that won't be the case if the 
call was made by the master. If I read correctly, the previous version was 
making multiple calls.

Also, if I'm not wrong, if the master initiated the recovery but 
isFileClosed is not available, we will never succeed. If this case is 
intentionally not covered, this should be documented.
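
To make the two points above concrete, here is a rough sketch of a loop that fixes startWaiting once, before iterating, and still retries the recovery call on every pass. Everything in it is an assumption for illustration: the recover method, the injected clock and the BooleanSupplier standing in for recoverLease are not the FSHDFSUtils code, and the pauses between attempts are left out.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

class LeaseRecoveryLoopSketch {
  // A long, as suggested in the review, since it is compared against timestamps.
  static final long LEASE_RECOVERY_UNREQUESTED = 0L;

  // Returns the number of attempts needed, or -1 if the timeout elapsed first.
  static int recover(long ts, LongSupplier clock, long timeoutMs, BooleanSupplier recoverLease) {
    // Decided once, outside the loop: reuse the master's timestamp if recovery was requested.
    long startWaiting = (ts == LEASE_RECOVERY_UNREQUESTED) ? clock.getAsLong() : ts;
    int attempts = 0;
    while (clock.getAsLong() - startWaiting < timeoutMs) {
      attempts++;
      // Retried on every pass, whoever made the first call.
      if (recoverLease.getAsBoolean()) {
        return attempts;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    long[] now = {0};
    int[] calls = {0};
    // Fake clock advancing 10 ms per reading; recovery succeeds on the third call.
    System.out.println(recover(0L, () -> now[0] += 10, 1000L, () -> ++calls[0] >= 3));
  }
}
```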


{code}
// On the first time through wait the short 'firstPause'.
if (nbAttempt == 0) {
  Thread.sleep(firstPause);
{code}
=> Should this be changed if the master initiated the recoverLease? No need to 
wait 4s.
=> This code is not needed when isFileClosed is available (as it's cheap, we 
don't want to wait: we prefer to do the call sooner)

What's the target version? Could we say that 'isFileClosed' will be mandatory 
there? This would simplify the code.

(I haven't reviewed everything in detail, I looked especially at FSHDFSUtils). 

> Initiate lease recovery for outstanding WAL files at the very beginning of 
> recovery
> ---
>
> Key: HBASE-10000
> URL: https://issues.apache.org/jira/browse/HBASE-10000
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 0.98.1
>
> Attachments: 1-recover-ts-with-pb-2.txt, 
> 1-recover-ts-with-pb-3.txt, 1-v1.txt, 1-v4.txt, 1-v5.txt, 
> 1-v6.txt
>
>
> At the beginning of recovery, master can send lease recovery requests 
> concurrently for outstanding WAL files using a thread pool.
> Each split worker would first check whether the WAL file it processes is 
> closed.
> Thanks to Nicolas Liochon and Jeffrey, discussion with whom gave rise to this 
> idea. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HBASE-10085) Some regions aren't re-assigned after a master restarts

2013-12-06 Thread Jeffrey Zhong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeffrey Zhong updated HBASE-10085:
--

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks [~jxiang] for the reviews! I've integrated the fix into trunk, 0.98 and 
0.96 branch. The javadoc and findbug warnings are not related to this patch. 
Thanks. 

> Some regions aren't re-assigned after a master restarts
> ---
>
> Key: HBASE-10085
> URL: https://issues.apache.org/jira/browse/HBASE-10085
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.96.1
>Reporter: Jeffrey Zhong
>Assignee: Jeffrey Zhong
> Fix For: 0.98.0, 0.96.1
>
> Attachments: hbase-10085.patch
>
>
> We saw this issue happen during a cluster restart:
> 1) When shutting down a cluster, some regions are in offline state because no 
> region servers are available (stop RS and then Master).
> 2) When the cluster restarts, the offlined regions are forced offline 
> again and SSH skips re-assigning them in function AM.processServerShutdown, as 
> shown below.
> {code}
> 2013-12-03 10:41:56,686 INFO  
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE
> 2013-12-03 10:41:56,686 DEBUG 
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on 
> deadserver; forcing offline
> ...
> 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force 
> region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, 
> ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
> ...
> 2013-12-03 10:41:57,223 WARN  
> [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] 
> master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected 
> {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
>  
> {code}





[jira] [Commented] (HBASE-10085) Some regions aren't re-assigned after a master restarts

2013-12-06 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841793#comment-13841793
 ] 

Jimmy Xiang commented on HBASE-10085:
-

[~jeffreyz], before you commit it, could you fix the format a little bit? For 
example:
{noformat}
+|| !(regionState.isFailedClose() || 
regionState.isPendingOpenOrOpening() || regionState
+.isOffline())) {
{noformat}
to something like
{noformat}
+|| !(regionState.isFailedClose() || 
regionState.isPendingOpenOrOpening()
+ || regionState.isOffline())) {
{noformat}

By the way, the new test is a little flaky.  Instead of restarting the whole 
mini cluster, can we just restart the master? Also increase the timeout a 
little?

Thanks.

> Some regions aren't re-assigned after a master restarts
> ---
>
> Key: HBASE-10085
> URL: https://issues.apache.org/jira/browse/HBASE-10085
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.96.1
>Reporter: Jeffrey Zhong
>Assignee: Jeffrey Zhong
> Fix For: 0.98.0, 0.96.1
>
> Attachments: hbase-10085.patch
>
>
> We saw this issue happen during a cluster restart:
> 1) When shutting down a cluster, some regions are in offline state because no 
> region servers are available (stop RS and then Master).
> 2) When the cluster restarts, the offlined regions are forced offline 
> again and SSH skips re-assigning them in function AM.processServerShutdown, as 
> shown below.
> {code}
> 2013-12-03 10:41:56,686 INFO  
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> Processing 873dbd8c269f44d0aefb0f66c5b53537 in state: M_ZK_REGION_OFFLINE
> 2013-12-03 10:41:56,686 DEBUG 
> [master:h2-ubuntu12-sec-1386048659-hbase-8:6] master.AssignmentManager: 
> RIT 873dbd8c269f44d0aefb0f66c5b53537 in state=M_ZK_REGION_OFFLINE was on 
> deadserver; forcing offline
> ...
> 2013-12-03 10:41:56,739 DEBUG [AM.-pool1-t8] master.AssignmentManager: Force 
> region state offline {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, 
> ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
> ...
> 2013-12-03 10:41:57,223 WARN  
> [MASTER_SERVER_OPERATIONS-h2-ubuntu12-sec-1386048659-hbase-8:6-3] 
> master.RegionStates: THIS SHOULD NOT HAPPEN: unexpected 
> {873dbd8c269f44d0aefb0f66c5b53537 state=OFFLINE, ts=1386067316737, 
> server=h2-ubuntu12-sec-1386048659-hbase-6.cs1cloud.internal,60020,1386066968696}
>  
> {code}





[jira] [Updated] (HBASE-3787) Increment is non-idempotent but client retries RPC

2013-12-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-3787:


   Resolution: Fixed
Fix Version/s: 0.99.0, 0.98.0
       Status: Resolved  (was: Patch Available)

This was actually committed some time ago (before branching 0.98 I think)

> Increment is non-idempotent but client retries RPC
> --
>
> Key: HBASE-3787
> URL: https://issues.apache.org/jira/browse/HBASE-3787
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.94.4, 0.95.2
>Reporter: dhruba borthakur
>Assignee: Sergey Shelukhin
>Priority: Blocker
> Fix For: 0.98.0, 0.99.0
>
> Attachments: HBASE-3787-partial.patch, HBASE-3787-v0.patch, 
> HBASE-3787-v1.patch, HBASE-3787-v10.patch, HBASE-3787-v11.patch, 
> HBASE-3787-v12.patch, HBASE-3787-v2.patch, HBASE-3787-v3.patch, 
> HBASE-3787-v4.patch, HBASE-3787-v5.patch, HBASE-3787-v5.patch, 
> HBASE-3787-v6.patch, HBASE-3787-v7.patch, HBASE-3787-v8.patch, 
> HBASE-3787-v9.patch
>
>
> The HTable.increment() operation is non-idempotent. The client retries the 
> increment RPC a few times (as specified by configuration) before throwing an 
> error to the application. This makes it possible for the same increment call 
> to be applied twice at the server.
> For increment operations, is it better to use 
> HConnectionManager.getRegionServerWithoutRetries()? Another option would be 
> to enhance the IPC module so that the RPC server correctly identifies whether 
> an RPC is a retry attempt and handles it accordingly.
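
The second option in the description (letting the server recognize a retried RPC) can be illustrated with a small sketch. This is not the committed HBASE-3787 patch; the class name `NonceIncrementServer`, the client-supplied `nonce` parameter, and the in-memory map are all assumptions made for illustration. A real server would also need to expire old nonces.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: a server that deduplicates retried increments by nonce. */
class NonceIncrementServer {
  private final AtomicLong counter = new AtomicLong();

  // Maps each client-supplied nonce to the result of its first application.
  // Assumption: the client reuses the same nonce for every retry of one call.
  private final ConcurrentHashMap<Long, Long> applied = new ConcurrentHashMap<>();

  /** Applies the increment once per nonce; a retry returns the stored result. */
  long increment(long nonce, long amount) {
    // computeIfAbsent runs the mapping function at most once per key.
    return applied.computeIfAbsent(nonce, n -> counter.addAndGet(amount));
  }

  /** Current counter value, for inspection. */
  long current() {
    return counter.get();
  }
}
```

With this shape, a client that times out and resends `increment(42, 5)` gets the original result back, and the counter moves only once.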





[jira] [Commented] (HBASE-10098) [WINDOWS] pass in native library directory from hadoop for unit tests

2013-12-06 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841789#comment-13841789
 ] 

Enis Soztutar commented on HBASE-10098:
---

These are the args that we pass to Maven's child process, which is forked to run 
the unit tests. I don't think Maven passes its own arguments to the child task. 

> [WINDOWS] pass in native library directory from hadoop for unit tests
> -
>
> Key: HBASE-10098
> URL: https://issues.apache.org/jira/browse/HBASE-10098
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 0.98.0, 0.96.2, 0.99.0
>
> Attachments: hbase-10098_v1.patch
>
>
> On Windows, Hadoop depends on native libraries to do its job. The bin 
> scripts already handle finding hadoop's native libs and adding them to 
> java.library.path, but for running HBase's unit tests, we need to pass them 
> in. 





[jira] [Commented] (HBASE-9955) Make hadoop2 the default and deprecate hadoop1

2013-12-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841787#comment-13841787
 ] 

Hadoop QA commented on HBASE-9955:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12617460/addendum.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8076//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8076//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8076//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8076//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8076//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8076//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8076//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8076//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8076//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8076//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8076//console

This message is automatically generated.

> Make hadoop2 the default and deprecate hadoop1
> --
>
> Key: HBASE-9955
> URL: https://issues.apache.org/jira/browse/HBASE-9955
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: stack
> Fix For: 0.98.0
>
> Attachments: 9955.txt, 9955v2.txt, 9955v3.txt, 9955v4.txt, 
> 9955v5.txt, addendum.txt, addendum.txt
>
>
> See "Hadoop version trunk dependency?" on the dev mailing list.  Consensus 
> seems to be forming to do the subject line (Recheck the mail thread before 
> going ahead).





[jira] [Commented] (HBASE-9829) make the compaction logging less confusing

2013-12-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841782#comment-13841782
 ] 

stack commented on HBASE-9829:
--

Fine by me Sergey.  You can address on commit.   Above are just nits.  Patch is 
nice.

> make the compaction logging less confusing
> --
>
> Key: HBASE-9829
> URL: https://issues.apache.org/jira/browse/HBASE-9829
> Project: HBase
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Attachments: HBASE-9829.patch
>
>
> 1) One of the most popular questions from HBase users has got to be "I have 
> scheduled major compactions to run once per week, why are there so many?"
> We need to somehow tell the user, wherever we log that there is a "major" 
> compaction, whether it's a major compaction because that's what was in the 
> request (from regular major compaction or user request), or whether it was 
> just promoted because it took all files. Especially the latter should be clear.
> 2) Small vs. large compaction threads and minor vs. major compactions are 
> confusing. Maybe the threads can be named short and long compactions.
> We





[jira] [Commented] (HBASE-10010) eliminate the put latency spike on the new log file beginning

2013-12-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841777#comment-13841777
 ] 

stack commented on HBASE-10010:
---

[~xieliang007] What Himanshu said; otherwise looks good to me.  If the test 
changes were mistakenly included, just say, and I'll exclude them from the 
commit.

> eliminate the put latency spike on the new log file beginning
> -
>
> Key: HBASE-10010
> URL: https://issues.apache.org/jira/browse/HBASE-10010
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 0.94.13
>Reporter: Liang Xie
>Assignee: Liang Xie
> Attachments: HBase-10010-0.94-v2.txt, HBase-10010-0.94-v3.txt, 
> HBase-10010-0.94.txt, HBase-10010-trunk-v2.txt, HBase-10010-trunk.txt
>
>
> Indeed, the original finding came from fb, see HBASE-6813 for detailed 
> discussion.
> Though this improvement isn't expected to show an obvious gain on 95th or 99th 
> percentile latency, it could still make the response time more stable, I think.





[jira] [Commented] (HBASE-9829) make the compaction logging less confusing

2013-12-06 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841779#comment-13841779
 ] 

Sergey Shelukhin commented on HBASE-9829:
-

There are periodic questions on the thread about why major compactions run when 
they are disabled, or not according to the timetable... this tries to clarify. In 
a similar vein, "large" and "small" compaction threads get confused with 
"major" and "minor" compactions, so people assume large thread == major 
compaction. They are chosen by size, but the reason behind the choice is to 
remove potentially long-blocking compactions from the thread where many small 
ones may run (by default, there's only one of each thread, so large compaction 
would block them), so I think the naming is allowable.

Time in the thread name is actually not very useful, other than for grepping by 
number. Maybe these threads can just be numbered when they are started? 
"longCompactions-1" is still greppable
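
The numbering idea could look roughly like the sketch below. It is only an illustration: `NumberedThreadFactory` is a hypothetical class, not something taken from the HBASE-9829 patch, and the daemon flag is an assumption about how such pool threads would be configured.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: names pool threads "<prefix>-<n>" instead of using a timestamp. */
class NumberedThreadFactory implements ThreadFactory {
  private final String prefix;
  // Next sequence number to hand out; the first thread gets 1.
  private final AtomicInteger next = new AtomicInteger(1);

  NumberedThreadFactory(String prefix) {
    this.prefix = prefix;
  }

  @Override
  public Thread newThread(Runnable r) {
    Thread t = new Thread(r, prefix + "-" + next.getAndIncrement());
    t.setDaemon(true); // assumption: pool threads should not block JVM exit
    return t;
  }
}
```

A factory built with the prefix "longCompactions" yields threads named "longCompactions-1", "longCompactions-2", and so on, which stays greppable without embedding a start time.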

> make the compaction logging less confusing
> --
>
> Key: HBASE-9829
> URL: https://issues.apache.org/jira/browse/HBASE-9829
> Project: HBase
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Minor
> Attachments: HBASE-9829.patch
>
>
> 1) One of the most popular questions from HBase users has got to be "I have 
> scheduled major compactions to run once per week, why are there so many?"
> We need to somehow tell the user, wherever we log that there is a "major" 
> compaction, whether it's a major compaction because that's what was in the 
> request (from regular major compaction or user request), or whether it was 
> just promoted because it took all files. Especially the latter should be clear.
> 2) Small vs. large compaction threads and minor vs. major compactions are 
> confusing. Maybe the threads can be named short and long compactions.
> We





[jira] [Commented] (HBASE-10099) javadoc warning introduced by LabelExpander 188: warning - @return tag has no arguments

2013-12-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841774#comment-13841774
 ] 

Ted Yu commented on HBASE-10099:


Integrated to 0.98 and trunk.

Thanks for the patch, Demai.

> javadoc warning introduced by LabelExpander 188: warning - @return tag has no 
> arguments 
> 
>
> Key: HBASE-10099
> URL: https://issues.apache.org/jira/browse/HBASE-10099
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Demai Ni
>Assignee: Demai Ni
>Priority: Trivial
> Fix For: 0.98.0
>
> Attachments: HBASE-10099-trunk-v0.patch, HBASE-10099-trunk-v1.patch, 
> HBASE-10099-trunk-v2.patch
>
>
> src/main/java/org/apache/hadoop/hbase/mapreduce/LabelExpander.java:188: 
> warning - @return tag has no arguments





[jira] [Commented] (HBASE-9718) Add a test scope dependency on org.slf4j:slf4j-api to hbase-client

2013-12-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841773#comment-13841773
 ] 

stack commented on HBASE-9718:
--

Commit Andrew.  We may later find it a problem when some downstream context 
tries to do something we can only imagine now.  There is no 'nice' way for us 
to edit the dependencies that our dependencies include.

> Add a test scope dependency on org.slf4j:slf4j-api to hbase-client
> --
>
> Key: HBASE-9718
> URL: https://issues.apache.org/jira/browse/HBASE-9718
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.98.0
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 0.98.0, 0.96.1
>
> Attachments: 9718.patch
>
>
> hbase-client needs a test scope dependency on org.slf4j:slf4j-api in its POM. 
> Without this change at least Eclipse cannot resolve org.slf4j.Logger from 
> RecoverableZooKeeper - the ZooKeeper classes use it - and so the 
> 'hbase-client' project will not build. 





[jira] [Commented] (HBASE-10061) TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in thrown NPE

2013-12-06 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841769#comment-13841769
 ] 

Lars Hofhansl commented on HBASE-10061:
---

Same here :)

> TableMapReduceUtil.findOrCreateJar calls updateMap(null, ) resulting in 
> thrown NPE
> --
>
> Key: HBASE-10061
> URL: https://issues.apache.org/jira/browse/HBASE-10061
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.94.12
>Reporter: Amit Sela
>Priority: Minor
> Fix For: 0.98.0, 0.96.1
>
> Attachments: 10061-trunk.txt, 10061-trunk.txt, HBASE-10061.patch
>
>
> TableMapReduceUtil.findOrCreateJar line 596:
> jar = getJar(my_class);
> updateMap(jar, packagedClasses);
> In case getJar returns null, updateMap will throw NPE.
> Should check null==jar before calling updateMap.
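
The suggested guard can be sketched as follows. The helpers below are simplified stand-ins, not the real TableMapReduceUtil methods: their signatures, the `knownJars` lookup map, and the class name `FindOrCreateJarSketch` are assumptions chosen only to make the null check runnable in isolation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the null check proposed in the description. */
class FindOrCreateJarSketch {
  /** Stand-in for getJar: returns null when no jar is known for the class. */
  static String getJar(Class<?> myClass, Map<String, String> knownJars) {
    return knownJars.get(myClass.getName());
  }

  /** Stand-in for updateMap: dereferences jar, so a null jar would throw NPE. */
  static void updateMap(String jar, Map<String, String> packagedClasses) {
    String fileName = jar.substring(jar.lastIndexOf('/') + 1);
    packagedClasses.put(fileName, jar);
  }

  /** Looks up the jar and only updates the map when one was found. */
  static String findJar(Class<?> myClass, Map<String, String> knownJars,
      Map<String, String> packagedClasses) {
    String jar = getJar(myClass, knownJars);
    if (jar != null) { // the check the report asks for
      updateMap(jar, packagedClasses);
    }
    return jar;
  }
}
```

The caller still has to handle a null return, but the lookup itself no longer fails when the class is not packaged in any jar.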





[jira] [Updated] (HBASE-10099) javadoc warning introduced by LabelExpander 188: warning - @return tag has no arguments

2013-12-06 Thread Demai Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Demai Ni updated HBASE-10099:
-

Attachment: HBASE-10099-trunk-v2.patch

change to 'from'. thanks... Demai

> javadoc warning introduced by LabelExpander 188: warning - @return tag has no 
> arguments 
> 
>
> Key: HBASE-10099
> URL: https://issues.apache.org/jira/browse/HBASE-10099
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Demai Ni
>Assignee: Demai Ni
>Priority: Trivial
> Fix For: 0.98.0
>
> Attachments: HBASE-10099-trunk-v0.patch, HBASE-10099-trunk-v1.patch, 
> HBASE-10099-trunk-v2.patch
>
>
> src/main/java/org/apache/hadoop/hbase/mapreduce/LabelExpander.java:188: 
> warning - @return tag has no arguments





[jira] [Commented] (HBASE-10099) javadoc warning introduced by LabelExpander 188: warning - @return tag has no arguments

2013-12-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841768#comment-13841768
 ] 

stack commented on HBASE-10099:
---

[~ted_yu] Just fix it on commit rather than have [~nidmhbase] go another cycle.

> javadoc warning introduced by LabelExpander 188: warning - @return tag has no 
> arguments 
> 
>
> Key: HBASE-10099
> URL: https://issues.apache.org/jira/browse/HBASE-10099
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Demai Ni
>Assignee: Demai Ni
>Priority: Trivial
> Fix For: 0.98.0
>
> Attachments: HBASE-10099-trunk-v0.patch, HBASE-10099-trunk-v1.patch
>
>
> src/main/java/org/apache/hadoop/hbase/mapreduce/LabelExpander.java:188: 
> warning - @return tag has no arguments



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-10099) javadoc warning introduced by LabelExpander 188: warning - @return tag has no arguments

2013-12-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841767#comment-13841767
 ] 

Ted Yu commented on HBASE-10099:


+1
nit can be addressed on commit.

> javadoc warning introduced by LabelExpander 188: warning - @return tag has no 
> arguments 
> 
>
> Key: HBASE-10099
> URL: https://issues.apache.org/jira/browse/HBASE-10099
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Demai Ni
>Assignee: Demai Ni
>Priority: Trivial
> Fix For: 0.98.0
>
> Attachments: HBASE-10099-trunk-v0.patch, HBASE-10099-trunk-v1.patch
>
>
> src/main/java/org/apache/hadoop/hbase/mapreduce/LabelExpander.java:188: 
> warning - @return tag has no arguments





[jira] [Commented] (HBASE-10099) javadoc warning introduced by LabelExpander 188: warning - @return tag has no arguments

2013-12-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841763#comment-13841763
 ] 

Ted Yu commented on HBASE-10099:


{code}
+   * @return KeyValue of the cell visibility expr
{code}
nit: 'of ' -> 'from'

> javadoc warning introduced by LabelExpander 188: warning - @return tag has no 
> arguments 
> 
>
> Key: HBASE-10099
> URL: https://issues.apache.org/jira/browse/HBASE-10099
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Demai Ni
>Assignee: Demai Ni
>Priority: Trivial
> Fix For: 0.98.0
>
> Attachments: HBASE-10099-trunk-v0.patch, HBASE-10099-trunk-v1.patch
>
>
> src/main/java/org/apache/hadoop/hbase/mapreduce/LabelExpander.java:188: 
> warning - @return tag has no arguments





[jira] [Updated] (HBASE-10099) javadoc warning introduced by LabelExpander 188: warning - @return tag has no arguments

2013-12-06 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10099:
---

Summary: javadoc warning introduced by LabelExpander 188: warning - @return 
tag has no arguments   (was: javadoc warning instroduced by LabelExpander 188: 
warning - @return tag has no arguments )

> javadoc warning introduced by LabelExpander 188: warning - @return tag has no 
> arguments 
> 
>
> Key: HBASE-10099
> URL: https://issues.apache.org/jira/browse/HBASE-10099
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Demai Ni
>Assignee: Demai Ni
>Priority: Trivial
> Fix For: 0.98.0
>
> Attachments: HBASE-10099-trunk-v0.patch, HBASE-10099-trunk-v1.patch
>
>
> src/main/java/org/apache/hadoop/hbase/mapreduce/LabelExpander.java:188: 
> warning - @return tag has no arguments





[jira] [Updated] (HBASE-10099) javadoc warning instroduced by LabelExpander 188: warning - @return tag has no arguments

2013-12-06 Thread Demai Ni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Demai Ni updated HBASE-10099:
-

Attachment: HBASE-10099-trunk-v1.patch

[~yuzhih...@gmail.com], many thanks. I should have paid more attention... Demai

> javadoc warning instroduced by LabelExpander 188: warning - @return tag has 
> no arguments 
> -
>
> Key: HBASE-10099
> URL: https://issues.apache.org/jira/browse/HBASE-10099
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0
>Reporter: Demai Ni
>Assignee: Demai Ni
>Priority: Trivial
> Fix For: 0.98.0
>
> Attachments: HBASE-10099-trunk-v0.patch, HBASE-10099-trunk-v1.patch
>
>
> src/main/java/org/apache/hadoop/hbase/mapreduce/LabelExpander.java:188: 
> warning - @return tag has no arguments





[jira] [Updated] (HBASE-10089) Metrics intern table names cause eventual permgen OOM in 0.94

2013-12-06 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10089:
---

Status: Patch Available  (was: Open)

> Metrics intern table names cause eventual permgen OOM in 0.94
> -
>
> Key: HBASE-10089
> URL: https://issues.apache.org/jira/browse/HBASE-10089
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.14, 0.94.0
>Reporter: Dave Latham
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 0.94.15
>
> Attachments: 10089-0.94.txt
>
>
> As part of the metrics system introduced in HBASE-4768 there are two places 
> that hbase uses String interning ( SchemaConfigured and SchemaMetrics ).  
> This includes interning table names.  We have a long-running environment where 
> we run regular integration tests on our application using hbase.  Those tests 
> create and drop tables with new names regularly.  This leads to filling up 
> the permgen with interned table names.  The workaround is to periodically 
> restart the region servers.
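
One general way to avoid unbounded growth from `String.intern()` is to canonicalize strings through a bounded heap cache instead. The sketch below shows that technique only; it is not a description of what the attached 10089-0.94.txt patch does, and the class name `BoundedInterner` and the LRU eviction policy are assumptions.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: a bounded, heap-based replacement for String.intern(). */
class BoundedInterner {
  private final Map<String, String> cache;

  BoundedInterner(final int maxEntries) {
    // Access-ordered LinkedHashMap that drops the least recently used entry
    // once the cache grows past maxEntries; wrapped for thread safety.
    this.cache = Collections.synchronizedMap(
        new LinkedHashMap<String, String>(16, 0.75f, true) {
          @Override
          protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
            return size() > maxEntries;
          }
        });
  }

  /** Returns the cached instance equal to s, caching s itself if none exists. */
  String intern(String s) {
    String existing = cache.putIfAbsent(s, s);
    return existing != null ? existing : s;
  }

  /** Number of strings currently cached. */
  int size() {
    return cache.size();
  }
}
```

Because evicted strings live on the ordinary heap, names of dropped tables can be garbage collected once nothing else references them, rather than accumulating in permgen.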





[jira] [Commented] (HBASE-10089) Metrics intern table names cause eventual permgen OOM in 0.94

2013-12-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841745#comment-13841745
 ] 

Ted Yu commented on HBASE-10089:


*Schema* tests passed with the patch applied:
{code}
Running org.apache.hadoop.hbase.rest.TestSchemaResource
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 21.774 sec
Running org.apache.hadoop.hbase.rest.model.TestColumnSchemaModel
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.299 sec
Running org.apache.hadoop.hbase.rest.model.TestTableSchemaModel
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.427 sec
Running org.apache.hadoop.hbase.regionserver.metrics.TestSchemaConfigured
Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.388 sec
Running org.apache.hadoop.hbase.regionserver.metrics.TestSchemaMetrics
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.338 sec
Running org.apache.hadoop.hbase.regionserver.TestCacheOnWriteInSchema
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.668 sec
{code}

> Metrics intern table names cause eventual permgen OOM in 0.94
> -
>
> Key: HBASE-10089
> URL: https://issues.apache.org/jira/browse/HBASE-10089
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0, 0.94.14
>Reporter: Dave Latham
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 0.94.15
>
> Attachments: 10089-0.94.txt
>
>
> As part of the metrics system introduced in HBASE-4768 there are two places 
> that hbase uses String interning ( SchemaConfigured and SchemaMetrics ).  
> This includes interning table names.  We have a long-running environment where 
> we run regular integration tests on our application using hbase.  Those tests 
> create and drop tables with new names regularly.  This leads to filling up 
> the permgen with interned table names.  The workaround is to periodically 
> restart the region servers.





[jira] [Updated] (HBASE-10089) Metrics intern table names cause eventual permgen OOM in 0.94

2013-12-06 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-10089:
---

Attachment: 10089-0.94.txt

> Metrics intern table names cause eventual permgen OOM in 0.94
> -
>
> Key: HBASE-10089
> URL: https://issues.apache.org/jira/browse/HBASE-10089
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0, 0.94.14
>Reporter: Dave Latham
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 0.94.15
>
> Attachments: 10089-0.94.txt
>
>
> As part of the metrics system introduced in HBASE-4768 there are two places 
> that hbase uses String interning ( SchemaConfigured and SchemaMetrics ).  
> This includes interning table names.  We have a long-running environment where 
> we run regular integration tests on our application using hbase.  Those tests 
> create and drop tables with new names regularly.  This leads to filling up 
> the permgen with interned table names.  The workaround is to periodically 
> restart the region servers.




