[jira] [Commented] (HBASE-12801) Failed to truncate a table while maintaining binary region boundaries

2015-01-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263774#comment-14263774
 ] 

Hadoop QA commented on HBASE-12801:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12689962/HBASE-12801-0.94-v1.diff
  against master branch at commit ac95cc1fbb951bb9db96f2738f621d1d7cd45739.
  ATTACHMENT ID: 12689962

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/12296//console

This message is automatically generated.

> Failed to truncate a table while maintaining binary region boundaries
> ---
>
> Key: HBASE-12801
> URL: https://issues.apache.org/jira/browse/HBASE-12801
> Project: HBase
> Issue Type: Bug
> Components: shell
> Affects Versions: 0.94.11
> Reporter: Liu Shaohui
> Assignee: Liu Shaohui
> Priority: Minor
> Fix For: 2.0.0, 0.94.28
>
> Attachments: HBASE-12801-0.94-v1.diff, HBASE-12801-trunk-v1.diff
>
>
> Binary region boundaries are corrupted when truncate_preserve in admin.rb 
> converts each byte array to a normal string and then back to a byte array, 
> which causes the truncation of the table to fail.
> See: truncate_preserve method in admin.rb
> {code}
>  splits = h_table.getRegionLocations().keys().map{|i| Bytes.toString(i.getStartKey)}.delete_if{|k| k == ""}.to_java :String
>  splits = org.apache.hadoop.hbase.util.Bytes.toByteArrays(splits)
> {code}
> e.g.:
> {code}
> \xFA\x00\x00\x00\x00\x00\x00\x00 -> \xEF\xBF\xBD\x00\x00\x00\x00\x00\x00\x00
> \xFC\x00\x00\x00\x00\x00\x00\x00 -> \xEF\xBF\xBD\x00\x00\x00\x00\x00\x00\x00
> {code}
> A simple patch is to use a binary string instead of a normal string.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12801) Failed to truncate a table while maintaining binary region boundaries

2015-01-03 Thread Liu Shaohui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liu Shaohui updated HBASE-12801:

    Fix Version/s: 0.94.28
                   2.0.0
           Status: Patch Available  (was: Open)



[jira] [Updated] (HBASE-12801) Failed to truncate a table while maintaining binary region boundaries

2015-01-03 Thread Liu Shaohui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liu Shaohui updated HBASE-12801:

Attachment: HBASE-12801-0.94-v1.diff

Patch for 0.94. [~lhofhansl]


> Failed to truncate a table while maintaining binary region boundaries
> ---
>
> Key: HBASE-12801
> URL: https://issues.apache.org/jira/browse/HBASE-12801
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 0.94.11
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Attachments: HBASE-12801-0.94-v1.diff, HBASE-12801-trunk-v1.diff


[jira] [Updated] (HBASE-12801) Failed to truncate a table while maintaining binary region boundaries

2015-01-03 Thread Liu Shaohui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liu Shaohui updated HBASE-12801:

Attachment: HBASE-12801-trunk-v1.diff

Patch for trunk.
Tested in a local cluster with a master that lacks the truncateTable API.

An HMaster with the truncateTable API did not have this issue.


> Failed to truncate a table while maintaining binary region boundaries
> ---
>
> Key: HBASE-12801
> URL: https://issues.apache.org/jira/browse/HBASE-12801
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 0.94.11
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Attachments: HBASE-12801-trunk-v1.diff


[jira] [Updated] (HBASE-12800) HBase Regionserver very easy to die

2015-01-03 Thread gao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gao updated HBASE-12800:

Description: 
Hi:
I am seeing constant stability problems with the HBase RegionServer; it dies 
randomly every day or every other day. It normally dies shortly after printing 
the following:

2014-12-30 23:06:17,091 ERROR [regionserver60020.logRoller] wal.ProtobufLogWriter: Got IOException while writing trailer
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hbase/WALs/zjdx107,60020,1418269148759/zjdx107%2C60020%2C1418269148759.1419977176935 could only be replicated to 0 nodes instead of minReplication (=1).  There are 12 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1430)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2659)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:569)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)

    at org.apache.hadoop.ipc.Client.call(Client.java:1409)
    at org.apache.hadoop.ipc.Client.call(Client.java:1362)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy13.addBlock(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy13.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361)
    at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266)
    at com.sun.proxy.$Proxy14.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1437)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1260)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)
2014-12-30 23:06:17,092 ERROR [regionserver60020.logRoller] wal.FSHLog: Failed close of HLog writer
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hbase/WALs/zjdx107,60020,1418269148759/zjdx107%2C60020%2C1418269148759.1419977176935 could only be replicated to 0 nodes instead of minReplication (=1).  There are 12 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1430)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2659)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:569)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
  

[jira] [Created] (HBASE-12800) HBase Regionserver very easy to die

2015-01-03 Thread gao (JIRA)
gao created HBASE-12800:
---

 Summary: HBase Regionserver very easy to die
 Key: HBASE-12800
 URL: https://issues.apache.org/jira/browse/HBASE-12800
 Project: HBase
  Issue Type: Bug
 Environment: hbase 0.96.1.1
Reporter: gao


Hi:
I am seeing constant stability problems with the HBase RegionServer; it dies 
randomly every day or every other day. It normally dies shortly after printing 
the following:

2014-12-30 23:06:17,091 ERROR [regionserver60020.logRoller] wal.ProtobufLogWriter: Got IOException while writing trailer
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hbase/WALs/zjdx107,60020,1418269148759/zjdx107%2C60020%2C1418269148759.1419977176935 could only be replicated to 0 nodes instead of minReplication (=1).  There are 12 datanode(s) running and no node(s) are excluded in this operation.





[jira] [Created] (HBASE-12801) Failed to truncate a table while maintaining binary region boundaries

2015-01-03 Thread Liu Shaohui (JIRA)
Liu Shaohui created HBASE-12801:
---

 Summary: Failed to truncate a table while maintaining binary region boundaries
 Key: HBASE-12801
 URL: https://issues.apache.org/jira/browse/HBASE-12801
 Project: HBase
  Issue Type: Bug
  Components: shell
Affects Versions: 0.94.11
Reporter: Liu Shaohui
Assignee: Liu Shaohui
Priority: Minor


Binary region boundaries are corrupted when truncate_preserve in admin.rb 
converts each byte array to a normal string and then back to a byte array, 
which causes the truncation of the table to fail.

See: truncate_preserve method in admin.rb
{code}
 splits = h_table.getRegionLocations().keys().map{|i| Bytes.toString(i.getStartKey)}.delete_if{|k| k == ""}.to_java :String
 splits = org.apache.hadoop.hbase.util.Bytes.toByteArrays(splits)
{code}
e.g.:
{code}
\xFA\x00\x00\x00\x00\x00\x00\x00 -> \xEF\xBF\xBD\x00\x00\x00\x00\x00\x00\x00
\xFC\x00\x00\x00\x00\x00\x00\x00 -> \xEF\xBF\xBD\x00\x00\x00\x00\x00\x00\x00
{code}

A simple patch is to use a binary string instead of a normal string.
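
The corruption in the example above can be reproduced with the JDK alone. The following is a minimal sketch, assuming that the string conversion in question decodes and re-encodes the key as UTF-8; the class name SplitKeyRoundTrip and the helpers toEscaped and fromEscaped are hypothetical and written only for this illustration, they are not HBase code and not the attached patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SplitKeyRoundTrip {

    // Lossy path: decode the key as UTF-8 text, then encode it again.
    // A byte that is not valid UTF-8 (such as a lone 0xFA) is replaced by
    // U+FFFD while decoding, and U+FFFD encodes back as EF BF BD.
    static byte[] viaUtf8String(byte[] key) {
        String text = new String(key, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Lossless path, step 1 (hypothetical helper): render every byte as \xNN.
    static String toEscaped(byte[] key) {
        StringBuilder sb = new StringBuilder(key.length * 4);
        for (byte b : key) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Lossless path, step 2 (hypothetical helper): parse \xNN groups back to bytes.
    // Only accepts strings produced by toEscaped above.
    static byte[] fromEscaped(String text) {
        byte[] key = new byte[text.length() / 4];
        for (int i = 0; i < key.length; i++) {
            String hex = text.substring(i * 4 + 2, i * 4 + 4);
            key[i] = (byte) Integer.parseInt(hex, 16);
        }
        return key;
    }

    public static void main(String[] args) {
        byte[] fa = {(byte) 0xFA, 0, 0, 0, 0, 0, 0, 0};
        byte[] fc = {(byte) 0xFC, 0, 0, 0, 0, 0, 0, 0};

        // Print each key before and after the UTF-8 round trip.
        System.out.println(toEscaped(fa) + " -> " + toEscaped(viaUtf8String(fa)));
        System.out.println(toEscaped(fc) + " -> " + toEscaped(viaUtf8String(fc)));

        // The escaped representation survives a round trip unchanged.
        System.out.println("escaped round trip intact: "
            + Arrays.equals(fa, fromEscaped(toEscaped(fa))));
    }
}
```

Note that the two distinct start keys collapse to the same bytes after the UTF-8 round trip, so the recreated table would no longer get the original split points. HBase's Bytes class offers toStringBinary and toBytesBinary for an escaped representation of this kind; whether the attached patches use exactly those calls is not stated in this message.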





[jira] [Commented] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles

2015-01-03 Thread Jiajia Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263725#comment-14263725
 ] 

Jiajia Li commented on HBASE-12332:
---

Hi [~j...@cloudera.com], will we now use the filelink when resolving mob files? 
If so, I will try to add the unit test in HBASE-12670.

> [mob] use filelink instead of retry when resolving mobfiles
> ---
>
> Key: HBASE-12332
> URL: https://issues.apache.org/jira/browse/HBASE-12332
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jonathan Hsieh
> Fix For: hbase-11339
>
> Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, 
> HBASE-12332-V3.patch, hbase-12332.link.v4.patch, hbase-12332.patch
>
>
> in the snapshot code, hmobstore was modified to traverse an hfile link to a 
> mob.   Ideally this should use the transparent filelink code to read the data.
> Also there will likely be some issues with the mob file cache with these 
> links.





[jira] [Commented] (HBASE-12577) Disable distributed log replay by default

2015-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263631#comment-14263631
 ] 

Hudson commented on HBASE-12577:


SUCCESS: Integrated in HBase-1.0 #629 (See [https://builds.apache.org/job/HBase-1.0/629/])
Revert "HBASE-12746 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient) (Jeffrey Zhong)" (stack: rev d5f63f122131a45b24b5e3839d4b59a578652b2d)
* hbase-common/src/main/resources/hbase-default.xml
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java
HBASE-12746 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient) (Jeffrey Zhong) (stack: rev a32f485fecbcf7e1aa7159ac2fa9851c1bf1ae48)
* hbase-common/src/main/resources/hbase-default.xml
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java


> Disable distributed log replay by default
> -
>
> Key: HBASE-12577
> URL: https://issues.apache.org/jira/browse/HBASE-12577
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Jeffrey Zhong
>Priority: Critical
> Fix For: 0.99.2
>
> Attachments: HBASE-12567.patch, HBASE-12567.patch
>
>
> Distributed log replay is an awesome feature, but due to HBASE-11094, the 
> rolling upgrade story from 0.98 is hard to explain / enforce. 
> The fix for HBASE-11094 only went into 0.98.4, meaning rolling upgrades from 
> 0.98.4- might lose data during the upgrade. 
> I feel no matter how much documentation / warning we do, we cannot prevent 
> users from doing rolling upgrades from 0.98.4- to 1.0. And we do not want to 
> inconvenience the user by requiring a two step rolling upgrade.  
> Thus I think we should disable dist log replay for 1.0, and re-enable it 
> again for 1.1 (if rolling upgrade from 0.98 is not supported). 
> i.e. undo: HBASE-10888





[jira] [Commented] (HBASE-12746) [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient)

2015-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263632#comment-14263632
 ] 

Hudson commented on HBASE-12746:


SUCCESS: Integrated in HBase-1.0 #629 (See [https://builds.apache.org/job/HBase-1.0/629/])
Revert "HBASE-12746 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient) (Jeffrey Zhong)" (stack: rev d5f63f122131a45b24b5e3839d4b59a578652b2d)
* hbase-common/src/main/resources/hbase-default.xml
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java
HBASE-12746 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient) (Jeffrey Zhong) (stack: rev a32f485fecbcf7e1aa7159ac2fa9851c1bf1ae48)
* hbase-common/src/main/resources/hbase-default.xml
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java


> [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient)
> --
>
> Key: HBASE-12746
> URL: https://issues.apache.org/jira/browse/HBASE-12746
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.0
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 1.0.0
>
> Attachments: 12746-v2.patch, 12746.txt, 12746.txt
>
>
> Testing the 1.0.0RC0 candidate, I noticed DLR was on (because I was bumping 
> into HBASE-12743). I thought it was my environment, but apparently not.
> If I add this to HMaster
> diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
> index a85c2e7..d745f94 100644
> --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
> +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
> @@ -416,6 +416,10 @@ public class HMaster extends HRegionServer implements MasterServices, Server {
>throw new IOException("Failed to start redirecting jetty server", e);
>  }
>  masterInfoPort = connector.getPort();
> + boolean dlr =
> + conf.getBoolean(HConstants.DISTRIBUTED_LOG_REPLAY_KEY,
> +  HConstants.DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG);
> +  LOG.info("Distributed log replay=" + dlr);
>}
> It says DLR is on.  HBASE-12577 was not enough, it seems.  The 
> hbase-default.xml still has DLR as true.
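
The lookup pattern in the quoted diff, where a default passed in code is only a fallback for a missing key, can be sketched as follows. This is a minimal illustration that uses java.util.Properties as a stand-in for the real configuration object; the class DefaultPrecedenceSketch, its getBoolean helper, and the literal key string are assumptions made for this example, not HBase code.

```java
import java.util.Properties;

final class DefaultPrecedenceSketch {

    // Assumed key name, used here for illustration only.
    static final String DLR_KEY = "hbase.master.distributed.log.replay";

    // Stand-in for a constant such as DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG.
    static final boolean CODE_DEFAULT = false;

    // Returns the configured value when the key is present and falls back
    // to the code-level default only when the key is absent.
    static boolean getBoolean(Properties conf, String key, boolean codeDefault) {
        String raw = conf.getProperty(key);
        return raw == null ? codeDefault : Boolean.parseBoolean(raw.trim());
    }

    public static void main(String[] args) {
        // A shipped defaults file that still sets the key to true.
        Properties shippedDefaults = new Properties();
        shippedDefaults.setProperty(DLR_KEY, "true");

        System.out.println("key set in shipped defaults: "
            + getBoolean(shippedDefaults, DLR_KEY, CODE_DEFAULT));
        System.out.println("key absent: "
            + getBoolean(new Properties(), DLR_KEY, CODE_DEFAULT));
    }
}
```

Under these assumptions, changing only the code-level constant has no effect while a shipped defaults file still carries the old value, which is consistent with the observation that hbase-default.xml had to change as well.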





[jira] [Commented] (HBASE-12763) Make it so there must be WALs for a server to be marked dead

2015-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263633#comment-14263633
 ] 

Hudson commented on HBASE-12763:


SUCCESS: Integrated in HBase-1.0 #629 (See [https://builds.apache.org/job/HBase-1.0/629/])
HBASE-12763 Make it so there must be WALs for a server to be marked dead (stack: rev c7b6e10ec7c154c7f22d3231027b20a9d2904025)
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java


> Make it so there must be WALs for a server to be marked dead
> 
>
> Key: HBASE-12763
> URL: https://issues.apache.org/jira/browse/HBASE-12763
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: stack
>Assignee: stack
> Fix For: 1.0.0, 2.0.0, 1.1.0
>
> Attachments: 12746-v2-master-and-098.patch
>
>
> The patch for this issue is a subset of the patch attached to the parent.  
> The parent solves a 1.0.0-specific issue but part of the patch needs applying 
> to 0.98 and to master to fix an issue where Master on startup would think it 
> was joining a cluster rather than undergoing a fresh start just because it 
> came across a directory named for a server that was once running (the patch 
> checks if the dir has WALs and if none, does not think the server a dead 
> server).





[jira] [Commented] (HBASE-12763) Make it so there must be WALs for a server to be marked dead

2015-01-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263595#comment-14263595
 ] 

stack commented on HBASE-12763:
---

Ok. Fixed. Applied to branch-1.0 (after removing the extra commit by revert and 
then reapply of cleaned HBASE-12746):

commit c7b6e10ec7c154c7f22d3231027b20a9d2904025
Author: stack 
Date:   Sat Jan 3 11:06:47 2015 -0800

HBASE-12763 Make it so there must be WALs for a server to be marked dead





[jira] [Commented] (HBASE-12746) [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient)

2015-01-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263594#comment-14263594
 ] 

stack commented on HBASE-12746:
---

Changed my mind. Fixing my messed-up commit.  Here is what I did (revert, then 
reapply absent the extraneous HBASE-12763 stuff):

commit a32f485fecbcf7e1aa7159ac2fa9851c1bf1ae48
Author: stack 
Date:   Sat Jan 3 11:03:30 2015 -0800

HBASE-12746 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was 
insufficient) (Jeffrey Zhong)
Reapply after fixing up an overcommit.

commit d5f63f122131a45b24b5e3839d4b59a578652b2d
Author: stack 
Date:   Sat Jan 3 10:59:58 2015 -0800

Revert "HBASE-12746 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 
was insufficient) (Jeffrey Zhong)"
Overcommitted. Reverting so can apply intended patch

This reverts commit 2df74fbd4a85de1e5325ba6ad8595f2081238e29.



[jira] [Commented] (HBASE-12746) [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient)

2015-01-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263586#comment-14263586
 ] 

stack commented on HBASE-12746:
---

I overcommitted adding the small fix for HBASE-12763 when I committed this 
patch.  Leaving as is since small potatoes.



[jira] [Commented] (HBASE-12763) Make it so there must be WALs for a server to be marked dead

2015-01-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263585#comment-14263585
 ] 

stack commented on HBASE-12763:
---

[~enis] It was committed mistakenly as part of the commit below.

commit 2df74fbd4a85de1e5325ba6ad8595f2081238e29
Author: stack 
Date:   Sat Dec 27 09:41:44 2014 -0800

    HBASE-12746 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient) (Jeffrey Zhong)

I'll leave it as is unless you want me to do it differently.  Let me go make a comment 
too over on HBASE-12746.



[jira] [Updated] (HBASE-12763) Make it so there must be WALs for a server to be marked dead

2015-01-03 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-12763:
--
Fix Version/s: 1.0.0



[jira] [Commented] (HBASE-12270) A bug in the bucket cache, with cache blocks on write enabled

2015-01-03 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263583#comment-14263583
 ] 

stack commented on HBASE-12270:
---

[~ted_yu] The fails are obviously being tracked yet you feel the need to add a 
totally useless dup of info already freely available up on our build box? You 
are adding no value.

> A bug in the bucket cache, with cache blocks on write enabled
> -
>
> Key: HBASE-12270
> URL: https://issues.apache.org/jira/browse/HBASE-12270
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.11, 0.98.6.1
> Environment: I can reproduce it on a simple 2 node cluster, one 
> running the master and another running a RS. I was testing on ec2.
> I used the following configurations for the cluster. 
> hbase-env:HBASE_REGIONSERVER_OPTS=-Xmx2G -XX:MaxDirectMemorySize=5G 
> -XX:CMSInitiatingOccupancyFraction=88 -XX:+AggressiveOpts -verbose:gc 
> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xlog 
> gc:/tmp/hbase-regionserver-gc.log
> hbase-site:
> hbase.bucketcache.ioengine=offheap
> hbase.bucketcache.size=4196
> hbase.rs.cacheblocksonwrite=true
> hfile.block.index.cacheonwrite=true
> hfile.block.bloom.cacheonwrite=true
>Reporter: Khaled Elmeleegy
>Assignee: Liu Shaohui
>Priority: Critical
> Fix For: 1.0.0, 2.0.0, 0.98.10, 1.1.0
>
> Attachments: HBASE-12270-v1.diff, HBASE-12270-v2.diff, 
> HBASE-12270-v2.patch, TestHBase.java, TestKey.java
>
>
> In my experiments, I have writers streaming their output to HBase. The reader 
> powers a web page and does this scatter/gather, where it reads 1000 keys 
> written last and passes them to the front end. With this workload, I get the 
> exception below at the region server. Again, I am using HBase (0.98.6.1). Any 
> help is appreciated.
> 2014-10-10 15:06:44,173 ERROR [B.DefaultRpcServer.handler=62,queue=2,port=60020] ipc.RpcServer: Unexpected throwable object
> java.lang.IllegalArgumentException
>   at java.nio.Buffer.position(Buffer.java:236)
>   at org.apache.hadoop.hbase.util.ByteBufferUtils.skip(ByteBufferUtils.java:434)
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.readKeyValueLen(HFileReaderV2.java:849)
>   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:760)
>   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:248)
>   at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:152)
>   at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:317)
>   at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:176)
>   at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:1780)
>   at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:3758)
>   at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1950)
>   at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1936)
>   at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1913)
>   at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3157)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
>   at java.lang.Thread.run(Thread.java:744)
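
The top frames of the trace show java.nio.Buffer.position rejecting the position requested from ByteBufferUtils.skip. The following is a minimal Java sketch of that failure mode only. The skip helper and the class name BufferSkipDemo are hypothetical stand-ins (a helper that advances the position by a length is assumed), and the sketch says nothing about why the cached block produced a bad length in the first place.

```java
import java.nio.ByteBuffer;

public class BufferSkipDemo {
    // Hypothetical stand-in for a "skip" helper: advance the position by length bytes.
    static void skip(ByteBuffer buffer, int length) {
        buffer.position(buffer.position() + length);
    }

    // Returns true when the buffer rejects the skip with IllegalArgumentException.
    static boolean skipFails(int capacity, int start, int length) {
        ByteBuffer buffer = ByteBuffer.allocate(capacity);
        buffer.position(start);
        try {
            skip(buffer, length);
            return false;
        } catch (IllegalArgumentException e) {
            // Thrown when the new position is negative or larger than the limit.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(skipFails(16, 4, 8));   // false: position 12 is within the limit
        System.out.println(skipFails(16, 4, 100)); // true: position 104 is past the limit
        System.out.println(skipFails(16, 4, -8));  // true: position -4 is negative
    }
}
```

A length field read from corrupted or misaligned block bytes could plausibly produce either of the two failing cases, which would match an exception raised while reading a key/value length.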



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12783) Create efficient RegionLocator implementation

2015-01-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263571#comment-14263571
 ] 

Hudson commented on HBASE-12783:
---

FAILURE: Integrated in HBase-1.1 #49 (See [https://builds.apache.org/job/HBase-1.1/49/])
HBASE-12783 Revert - two tests in TestAssignmentManager fail (tedyu: rev 173eba815bd7d97d15be69893d4b0836a08cf42b)
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionSizeCalculator.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/MetaTableLocator.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormatBase.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTableUtil.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultiTableInputFormat.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/WALPlayer.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFilesSplitRecovery.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionManager.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/HRegionLocator.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* hbase-it/src/test/java/org/apache/hadoop/hbase/mapreduce/IntegrationTestBulkLoad.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/RegionServerCallable.java


> Create efficient RegionLocator implementation
> -
>
> Key: HBASE-12783
> URL: https://issues.apache.org/jira/browse/HBASE-12783
> Project: HBase
> Issue Type: Task
> Affects Versions: 1.0.0, 2.0.0
> Reporter: Solomon Duskis
> Assignee: Solomon Duskis
> Fix For: 2.0.0
>
> Attachments: 12783-10.patch, 12783-11.patch, HBASE-12783.patch, 
> HBASE-12783E.patch, HBASE-12783F.patch, HBASE-12783G.patch, 
> HBASE-12783H.patch, HBASE-12783I.patch, HBASE-12783J.patch, 
> HBASE-12783K.patch, HBASE-12873B.patch, HBASE-12873C.patch, HBASE-12873D.patch
>
>
> A new HRegionLocator that only implements RegionLocator functionality will be 
> more efficient to instantiate than a full HTable. 





[jira] [Updated] (HBASE-12783) Create efficient RegionLocator implementation

2015-01-03 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-12783:
---
Fix Version/s: (was: 1.1.0)

Reverted from branch-1 due to two failing tests in TestAssignmentManager.






[jira] [Commented] (HBASE-12751) Allow RowLock to be reader writer

2015-01-03 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263468#comment-14263468
 ] 

Elliott Clark commented on HBASE-12751:
---

Yeah, that's my thought. Puts for the same cell are already ordered by sequence 
number, so the order in which they are applied to the memstore shouldn't matter 
as long as the memstore correctly orders them. There will be a slight 
change in behavior: right now it's not possible for two puts to come in and 
have timestamps generated on the regionserver that don't order the same as the 
sequence id. However, as long as the memstore correctly orders cells by 
timestamp and then by sequence id, it shouldn't be a behavioral change. We've 
never made any guarantees about the relationship of TS to seqid, since TS is 
user settable.
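
To illustrate the ordering argument above, here is a small Java sketch in which versions of one cell sort by timestamp and then by sequence id, so the order in which they arrive does not affect the result. The Version class, the CellOrderSketch name, and the newest-first direction are assumptions made for illustration; this is not the memstore's actual comparator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class CellOrderSketch {
    // Hypothetical, simplified cell version holding only the two fields discussed.
    static final class Version {
        final long timestamp;
        final long seqId;

        Version(long timestamp, long seqId) {
            this.timestamp = timestamp;
            this.seqId = seqId;
        }

        @Override
        public String toString() {
            return "ts=" + timestamp + "/seq=" + seqId;
        }
    }

    // Assumed direction: newest timestamp first, ties broken by highest sequence id first.
    static final Comparator<Version> NEWEST_FIRST =
        Comparator.comparingLong((Version v) -> v.timestamp).reversed()
            .thenComparing(Comparator.comparingLong((Version v) -> v.seqId).reversed());

    // Returns a sorted copy, leaving the arrival order untouched.
    static List<Version> sorted(List<Version> arrivalOrder) {
        List<Version> copy = new ArrayList<>(arrivalOrder);
        copy.sort(NEWEST_FIRST);
        return copy;
    }

    public static void main(String[] args) {
        List<Version> first = Arrays.asList(
            new Version(100, 1), new Version(100, 2), new Version(99, 3));
        List<Version> second = Arrays.asList(
            new Version(99, 3), new Version(100, 2), new Version(100, 1));
        // Both arrival orders print [ts=100/seq=2, ts=100/seq=1, ts=99/seq=3]
        System.out.println(sorted(first));
        System.out.println(sorted(second));
    }
}
```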

Cells that are in the same row but are totally different cells can't have 
ordering issues. So they aren't a problem; however, right now they pay the same 
cost as if they could. It's a reasonable trade-off in order to keep down the 
number of bytes that need to be compared to get a lock.

However, any check-and-mutate type action still needs to be able to guarantee 
that nothing else changes the value concurrently. For that, a write lock is needed.
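
A rough Java sketch of the reader/writer idea, under stated assumptions: plain puts take the shared (read) side of a per-row lock and can run in parallel, while check-and-put takes the exclusive (write) side. The RowLockSketch class, its method names, and the string-keyed map are hypothetical and only illustrate the locking split. The sketch does not model sequence ids, mvcc, or anything else in the region server.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RowLockSketch {
    // One reader/writer lock per row, created on first use.
    private final ConcurrentMap<String, ReentrantReadWriteLock> locks = new ConcurrentHashMap<>();
    // Toy cell store keyed by "row/column".
    private final ConcurrentMap<String, String> cells = new ConcurrentHashMap<>();

    private ReentrantReadWriteLock lockFor(String row) {
        return locks.computeIfAbsent(row, r -> new ReentrantReadWriteLock());
    }

    // Plain put: shared lock, so several puts to the same row can proceed together.
    public void put(String row, String column, String value) {
        ReentrantReadWriteLock lock = lockFor(row);
        lock.readLock().lock();
        try {
            cells.put(row + "/" + column, value);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Check-and-put: exclusive lock, so nothing changes between the check and the write.
    public boolean checkAndPut(String row, String column, String expected, String value) {
        ReentrantReadWriteLock lock = lockFor(row);
        lock.writeLock().lock();
        try {
            String key = row + "/" + column;
            if (!Objects.equals(cells.get(key), expected)) {
                return false;
            }
            cells.put(key, value);
            return true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    public String get(String row, String column) {
        return cells.get(row + "/" + column);
    }

    public static void main(String[] args) {
        RowLockSketch table = new RowLockSketch();
        table.put("r1", "a", "1");
        System.out.println(table.checkAndPut("r1", "a", "1", "2")); // true: value matched
        System.out.println(table.checkAndPut("r1", "a", "1", "3")); // false: value is now "2"
        System.out.println(table.get("r1", "a"));                   // 2
    }
}
```

The naming is inverted relative to intuition: "readers" of the lock are ordinary writers to the row, and the lock's "writer" is any operation that must read and then modify atomically.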

It seems row locks were initially put in to be exposed to the client 
(HBASE-798). That was a bad idea because of region movement, and so they were 
removed. However, by that time incrementColumnValue had been committed and it 
needed row locks; so did check and put and the later check and mutate.

> Allow RowLock to be reader writer
> -
>
> Key: HBASE-12751
> URL: https://issues.apache.org/jira/browse/HBASE-12751
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: Elliott Clark
>
> Right now every write operation grabs a row lock. This is to prevent values 
> from changing during a read modify write operation (increment or check and 
> put). However it limits parallelism in several different scenarios.
> If there are several puts to the same row but to different columns or stores, 
> then this is very limiting.
> If there are puts to the same column, then the mvcc number should ensure a 
> consistent ordering, so locking is not needed.
> However, locking for check and put or increment is still needed.


