[jira] [Commented] (HBASE-12801) Failed to truncate a table while maintaing binary region boundaries
[ https://issues.apache.org/jira/browse/HBASE-12801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263774#comment-14263774 ] Hadoop QA commented on HBASE-12801: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12689962/HBASE-12801-0.94-v1.diff against master branch at commit ac95cc1fbb951bb9db96f2738f621d1d7cd45739. ATTACHMENT ID: 12689962 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/12296//console This message is automatically generated. > Failed to truncate a table while maintaing binary region boundaries > --- > > Key: HBASE-12801 > URL: https://issues.apache.org/jira/browse/HBASE-12801 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 0.94.11 >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0, 0.94.28 > > Attachments: HBASE-12801-0.94-v1.diff, HBASE-12801-trunk-v1.diff > > > Binary region boundaries become wrong during > converting byte array to normal string, and back to byte array in > truncate_preserve of admin.rb, which makes the truncation of table failed. 
> See: truncate_preserve method in admin.rb > {code} > splits = h_table.getRegionLocations().keys().map{|i| > Bytes.toString(i.getStartKey)}.delete_if{|k| k == ""}.to_java :String > splits = org.apache.hadoop.hbase.util.Bytes.toByteArrays(splits) > {code} > eg: > {code} > \xFA\x00\x00\x00\x00\x00\x00\x00 -> \xEF\xBF\xBD\x00\x00\x00\x00\x00\x00\x00 > \xFC\x00\x00\x00\x00\x00\x00\x00 -> \xEF\xBF\xBD\x00\x00\x00\x00\x00\x00\x00 > {code} > Simple patch is using binary string instead of normal string. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12801) Failed to truncate a table while maintaing binary region boundaries
[ https://issues.apache.org/jira/browse/HBASE-12801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-12801: Fix Version/s: 0.94.28 2.0.0 Status: Patch Available (was: Open) > Failed to truncate a table while maintaing binary region boundaries > --- > > Key: HBASE-12801 > URL: https://issues.apache.org/jira/browse/HBASE-12801 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 0.94.11 >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0, 0.94.28 > > Attachments: HBASE-12801-0.94-v1.diff, HBASE-12801-trunk-v1.diff > > > Binary region boundaries become wrong during > converting byte array to normal string, and back to byte array in > truncate_preserve of admin.rb, which makes the truncation of table failed. > See: truncate_preserve method in admin.rb > {code} > splits = h_table.getRegionLocations().keys().map{|i| > Bytes.toString(i.getStartKey)}.delete_if{|k| k == ""}.to_java :String > splits = org.apache.hadoop.hbase.util.Bytes.toByteArrays(splits) > {code} > eg: > {code} > \xFA\x00\x00\x00\x00\x00\x00\x00 -> \xEF\xBF\xBD\x00\x00\x00\x00\x00\x00\x00 > \xFC\x00\x00\x00\x00\x00\x00\x00 -> \xEF\xBF\xBD\x00\x00\x00\x00\x00\x00\x00 > {code} > Simple patch is using binary string instead of normal string. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12801) Failed to truncate a table while maintaing binary region boundaries
[ https://issues.apache.org/jira/browse/HBASE-12801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-12801: Attachment: HBASE-12801-0.94-v1.diff Patch for 0.94. [~lhofhansl] > Failed to truncate a table while maintaing binary region boundaries > --- > > Key: HBASE-12801 > URL: https://issues.apache.org/jira/browse/HBASE-12801 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 0.94.11 >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Attachments: HBASE-12801-0.94-v1.diff, HBASE-12801-trunk-v1.diff > > > Binary region boundaries become wrong during > converting byte array to normal string, and back to byte array in > truncate_preserve of admin.rb, which makes the truncation of table failed. > See: truncate_preserve method in admin.rb > {code} > splits = h_table.getRegionLocations().keys().map{|i| > Bytes.toString(i.getStartKey)}.delete_if{|k| k == ""}.to_java :String > splits = org.apache.hadoop.hbase.util.Bytes.toByteArrays(splits) > {code} > eg: > {code} > \xFA\x00\x00\x00\x00\x00\x00\x00 -> \xEF\xBF\xBD\x00\x00\x00\x00\x00\x00\x00 > \xFC\x00\x00\x00\x00\x00\x00\x00 -> \xEF\xBF\xBD\x00\x00\x00\x00\x00\x00\x00 > {code} > Simple patch is using binary string instead of normal string. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12801) Failed to truncate a table while maintaing binary region boundaries
[ https://issues.apache.org/jira/browse/HBASE-12801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liu Shaohui updated HBASE-12801: Attachment: HBASE-12801-trunk-v1.diff Patch for trunk. Tested in a local cluster with a master without the truncateTable API. An HMaster with the truncateTable API did not have this issue. > Failed to truncate a table while maintaing binary region boundaries > --- > > Key: HBASE-12801 > URL: https://issues.apache.org/jira/browse/HBASE-12801 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 0.94.11 >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Attachments: HBASE-12801-trunk-v1.diff > > > Binary region boundaries become wrong during > converting byte array to normal string, and back to byte array in > truncate_preserve of admin.rb, which makes the truncation of table failed. > See: truncate_preserve method in admin.rb > {code} > splits = h_table.getRegionLocations().keys().map{|i| > Bytes.toString(i.getStartKey)}.delete_if{|k| k == ""}.to_java :String > splits = org.apache.hadoop.hbase.util.Bytes.toByteArrays(splits) > {code} > eg: > {code} > \xFA\x00\x00\x00\x00\x00\x00\x00 -> \xEF\xBF\xBD\x00\x00\x00\x00\x00\x00\x00 > \xFC\x00\x00\x00\x00\x00\x00\x00 -> \xEF\xBF\xBD\x00\x00\x00\x00\x00\x00\x00 > {code} > Simple patch is using binary string instead of normal string. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12800) HBase Regionserver very easy to die
[ https://issues.apache.org/jira/browse/HBASE-12800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gao updated HBASE-12800: Description: Hi: I am getting constant stability problems with the HBase Regionserver, it dies randomly everyday or every other day. It normally dies shortly after printing the following: 2014-12-30 23:06:17,091 ERROR [regionserver60020.logRoller] wal.ProtobufLogWriter: Got IOException while writing trailer org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hbase/WALs/zjdx107,60020,1418269148759/zjdx107%2C60020%2C1418269148759.1419977176935 could only be replicated to 0 nodes instead of minReplication (=1). There are 12 datanode(s) running and no node(s) are excluded in this operation. at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1430) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2659) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:569) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980) at org.apache.hadoop.ipc.Client.call(Client.java:1409) at 
org.apache.hadoop.ipc.Client.call(Client.java:1362) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) at com.sun.proxy.$Proxy13.addBlock(Unknown Source) at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) at com.sun.proxy.$Proxy13.addBlock(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361) at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:266) at com.sun.proxy.$Proxy14.addBlock(Unknown Source) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1437) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1260) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525) 2014-12-30 23:06:17,092 ERROR [regionserver60020.logRoller] wal.FSHLog: Failed close of HLog writer org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hbase/WALs/zjdx107,60020,1418269148759/zjdx107%2C60020%2C1418269148759.1419977176935 could only be replicated to 0 nodes instead of minReplication (=1). There are 12 datanode(s) running and no node(s) are excluded in this operation. 
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1430) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2659) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:569) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
[jira] [Created] (HBASE-12800) HBase Regionserver very easy to die
gao created HBASE-12800: --- Summary: HBase Regionserver very easy to die Key: HBASE-12800 URL: https://issues.apache.org/jira/browse/HBASE-12800 Project: HBase Issue Type: Bug Environment: hbase 0.96.1.1 Reporter: gao Hi: I am getting constant stability problems with the HBase Regionserver, it dies randomly everyday or every other day. It normally dies shortly after printing the following: 2014-12-30 23:06:17,091 ERROR [regionserver60020.logRoller] wal.ProtobufLogWriter: Got IOException while writing trailer org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hbase/WALs/zjdx107,60020,1418269148759/zjdx107%2C60020%2C1418269148759.1419977176935 could only be replicated to 0 nodes instead of minReplication (=1). There are 12 datanode(s) running and no node(s) are excluded in this operation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-12801) Failed to truncate a table while maintaing binary region boundaries
Liu Shaohui created HBASE-12801: --- Summary: Failed to truncate a table while maintaing binary region boundaries Key: HBASE-12801 URL: https://issues.apache.org/jira/browse/HBASE-12801 Project: HBase Issue Type: Bug Components: shell Affects Versions: 0.94.11 Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor

Binary region boundaries are corrupted when truncate_preserve in admin.rb converts each region start key from a byte array to a normal string and back to a byte array, which makes the table truncation fail.

See the truncate_preserve method in admin.rb:
{code}
splits = h_table.getRegionLocations().keys().map{|i| Bytes.toString(i.getStartKey)}.delete_if{|k| k == ""}.to_java :String
splits = org.apache.hadoop.hbase.util.Bytes.toByteArrays(splits)
{code}
For example, two distinct start keys both come back with the same leading bytes:
{code}
\xFA\x00\x00\x00\x00\x00\x00\x00 -> \xEF\xBF\xBD\x00\x00\x00\x00\x00\x00\x00
\xFC\x00\x00\x00\x00\x00\x00\x00 -> \xEF\xBF\xBD\x00\x00\x00\x00\x00\x00\x00
{code}
A simple patch is to use a binary string instead of a normal string.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
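The corruption reported in HBASE-12801 can be reproduced with the JDK alone. The sketch below assumes that Bytes.toString decodes as UTF-8, which the \xEF\xBF\xBD bytes in the report suggest (they are the UTF-8 encoding of the replacement character U+FFFD). The class and method names are made up for illustration, and the ISO-8859-1 path is only a stand-in for a byte-preserving conversion, not the attached patch.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of HBase.
class SplitKeyRoundTrip {
    // Lossy path: bytes that are not valid UTF-8 are decoded as U+FFFD,
    // so re-encoding cannot recover the original key.
    static byte[] viaUtf8(byte[] key) {
        return new String(key, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    // Byte-preserving path: ISO-8859-1 maps every byte 0x00-0xFF to exactly one char.
    static byte[] viaLatin1(byte[] key) {
        return new String(key, StandardCharsets.ISO_8859_1).getBytes(StandardCharsets.ISO_8859_1);
    }

    // Formats bytes in the \xNN notation used in the issue description.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] key = {(byte) 0xFA, 0, 0, 0, 0, 0, 0, 0};
        System.out.println("UTF-8 round trip:      " + hex(key) + " -> " + hex(viaUtf8(key)));
        System.out.println("ISO-8859-1 round trip: " + hex(key) + " -> " + hex(viaLatin1(key)));
    }
}
```

Because \xFA and \xFC both decode to the same replacement character, two different split keys collapse into one after the UTF-8 round trip, which is consistent with the failed truncation described above.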
[jira] [Commented] (HBASE-12332) [mob] use filelink instead of retry when resolving mobfiles
[ https://issues.apache.org/jira/browse/HBASE-12332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263725#comment-14263725 ] Jiajia Li commented on HBASE-12332: --- Hi [~j...@cloudera.com], will we now use the filelink when resolving mob files? If so, I will try to add the unit test in HBASE-12670. > [mob] use filelink instead of retry when resolving mobfiles > --- > > Key: HBASE-12332 > URL: https://issues.apache.org/jira/browse/HBASE-12332 > Project: HBase > Issue Type: Sub-task > Components: mob >Affects Versions: hbase-11339 >Reporter: Jonathan Hsieh > Fix For: hbase-11339 > > Attachments: HBASE-12332-V1.diff, HBASE-12332-V2.patch, > HBASE-12332-V3.patch, hbase-12332.link.v4.patch, hbase-12332.patch > > > in the snapshot code, hmobstore was modified to traverse an hfile link to a > mob. Ideally this should use the transparent filelink code to read the data. > Also there will likely be some issues with the mob file cache with these > links. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12577) Disable distributed log replay by default
[ https://issues.apache.org/jira/browse/HBASE-12577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263631#comment-14263631 ] Hudson commented on HBASE-12577: SUCCESS: Integrated in HBase-1.0 #629 (See [https://builds.apache.org/job/HBase-1.0/629/]) Revert "HBASE-12746 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient) (Jeffrey Zhong)" (stack: rev d5f63f122131a45b24b5e3839d4b59a578652b2d) * hbase-common/src/main/resources/hbase-default.xml * hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java HBASE-12746 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient) (Jeffrey Zhong) (stack: rev a32f485fecbcf7e1aa7159ac2fa9851c1bf1ae48) * hbase-common/src/main/resources/hbase-default.xml * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java > Disable distributed log replay by default > - > > Key: HBASE-12577 > URL: https://issues.apache.org/jira/browse/HBASE-12577 > Project: HBase > Issue Type: Sub-task >Reporter: Enis Soztutar >Assignee: Jeffrey Zhong >Priority: Critical > Fix For: 0.99.2 > > Attachments: HBASE-12567.patch, HBASE-12567.patch > > > Distributed log replay is an awesome feature, but because of HBASE-11094, the > rolling upgrade story from 0.98 is hard to explain / enforce. > The fix for HBASE-11094 only went into 0.98.4, meaning rolling upgrades from > 0.98.4- might lose data during the upgrade. > I feel no matter how much documentation / warning we do, we cannot prevent > users from doing rolling upgrades from 0.98.4- to 1.0. And we do not want to > inconvenience the user by requiring a two step rolling upgrade. > Thus I think we should disable dist log replay for 1.0, and re-enable it > again for 1.1 (if rolling upgrade from 0.98 is not supported). > i.e. undo: HBASE-10888 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12746) [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient)
[ https://issues.apache.org/jira/browse/HBASE-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263632#comment-14263632 ] Hudson commented on HBASE-12746: SUCCESS: Integrated in HBase-1.0 #629 (See [https://builds.apache.org/job/HBase-1.0/629/]) Revert "HBASE-12746 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient) (Jeffrey Zhong)" (stack: rev d5f63f122131a45b24b5e3839d4b59a578652b2d) * hbase-common/src/main/resources/hbase-default.xml * hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java HBASE-12746 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient) (Jeffrey Zhong) (stack: rev a32f485fecbcf7e1aa7159ac2fa9851c1bf1ae48) * hbase-common/src/main/resources/hbase-default.xml * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java > [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient) > -- > > Key: HBASE-12746 > URL: https://issues.apache.org/jira/browse/HBASE-12746 > Project: HBase > Issue Type: Bug > Components: wal >Affects Versions: 1.0.0 >Reporter: stack >Assignee: stack >Priority: Critical > Fix For: 1.0.0 > > Attachments: 12746-v2.patch, 12746.txt, 12746.txt > > > Testing the 1.0.0RC0 candidate, I noticed DLR was on (because I was bumping > into HBASE-12743) I thought it my environment but apparently not. 
> If I add this to HMaster > diff --git > a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java > b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java > index a85c2e7..d745f94 100644 > --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java > +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java > @@ -416,6 +416,10 @@ public class HMaster extends HRegionServer implements > MasterServices, Server { >throw new IOException("Failed to start redirecting jetty server", e); > } > masterInfoPort = connector.getPort(); > + boolean dlr = > + conf.getBoolean(HConstants.DISTRIBUTED_LOG_REPLAY_KEY, > + HConstants.DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG); > + LOG.info("Distributed log replay=" + dlr); >} > It says DLR is on. HBASE-12577 was not enough it seems. The > hbase-default.xml still has DLR as true. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
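The observation in HBASE-12746 (a code-level default is ignored whenever the shipped defaults file still sets the key) can be illustrated with plain java.util.Properties. This is a rough sketch under stated assumptions, not HBase code: the key string, the CODE_DEFAULT constant, and the getBoolean helper are stand-ins for HConstants.DISTRIBUTED_LOG_REPLAY_KEY, HConstants.DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG, and conf.getBoolean, and the layered Properties object only imitates hbase-default.xml.

```java
import java.util.Properties;

// Hypothetical demo class; not part of HBase.
class DlrDefaultCheck {
    // Assumed key name, used here only for illustration.
    static final String DLR_KEY = "hbase.master.distributed.log.replay";
    // Assumed code-level default after the earlier change (DLR off).
    static final boolean CODE_DEFAULT = false;

    // Mimics a getBoolean(key, fallback) lookup: the fallback applies
    // only when no configuration layer defines the key at all.
    static boolean getBoolean(Properties conf, String key, boolean fallback) {
        String raw = conf.getProperty(key);
        return raw == null ? fallback : Boolean.parseBoolean(raw.trim());
    }

    public static void main(String[] args) {
        // Plays the role of the shipped defaults file that still says "true".
        Properties shippedDefaults = new Properties();
        shippedDefaults.setProperty(DLR_KEY, "true");

        // Effective configuration layered on top of the shipped defaults.
        Properties conf = new Properties(shippedDefaults);
        System.out.println("Distributed log replay=" + getBoolean(conf, DLR_KEY, CODE_DEFAULT));

        // With no entry in any layer, the code-level default finally applies.
        System.out.println("Without shipped entry=" + getBoolean(new Properties(), DLR_KEY, CODE_DEFAULT));
    }
}
```

Under these assumptions the first lookup reports true even though the code default is false, which matches the report that changing the constant alone was not enough while hbase-default.xml still had DLR as true.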
[jira] [Commented] (HBASE-12763) Make it so there must be WALs for a server to be marked dead
[ https://issues.apache.org/jira/browse/HBASE-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263633#comment-14263633 ] Hudson commented on HBASE-12763: SUCCESS: Integrated in HBase-1.0 #629 (See [https://builds.apache.org/job/HBase-1.0/629/]) HBASE-12763 Make it so there must be WALs for a server to be marked dead (stack: rev c7b6e10ec7c154c7f22d3231027b20a9d2904025) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java > Make it so there must be WALs for a server to be marked dead > > > Key: HBASE-12763 > URL: https://issues.apache.org/jira/browse/HBASE-12763 > Project: HBase > Issue Type: Sub-task > Components: wal >Reporter: stack >Assignee: stack > Fix For: 1.0.0, 2.0.0, 1.1.0 > > Attachments: 12746-v2-master-and-098.patch > > > The patch for this issue is a subset of the patch attached to the parent. > The parent solves a 1.0.0-specific issue but part of the patch needs applying > to 0.98 and to master to fix an issue where Master on startup would think it > was joining a cluster rather than undergoing a fresh start just because it > came across a directory named for a server that was once running (the patch > checks if the dir has WALs and if none, does not think the server a dead > server). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
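The check described in HBASE-12763 (a leftover server directory counts as a dead server only if it still holds WALs) might look roughly like the sketch below. It runs against the local filesystem with java.nio.file as a stand-in for HDFS; the class name, method name, and directory layout are hypothetical and are not taken from the MasterFileSystem patch.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class; not part of HBase.
class DeadServerScan {
    // Returns the names of per-server directories under walRoot that still
    // contain at least one regular file. Empty directories are skipped, so
    // they do not make a fresh start look like a cluster with dead servers.
    static List<String> serversWithWals(Path walRoot) throws IOException {
        List<String> result = new ArrayList<>();
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(walRoot, p -> Files.isDirectory(p))) {
            for (Path dir : dirs) {
                try (Stream<Path> entries = Files.list(dir)) {
                    if (entries.anyMatch(p -> Files.isRegularFile(p))) {
                        result.add(dir.getFileName().toString());
                    }
                }
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("wals-demo");
        Files.createDirectory(root.resolve("serverA"));            // stale, empty directory
        Path serverB = Files.createDirectory(root.resolve("serverB"));
        Files.createFile(serverB.resolve("wal.0001"));             // directory with a WAL
        System.out.println("Servers with WALs: " + serversWithWals(root));
    }
}
```

In this sketch only serverB would be treated as a candidate dead server; serverA, whose directory exists but is empty, is ignored.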
[jira] [Commented] (HBASE-12763) Make it so there must be WALs for a server to be marked dead
[ https://issues.apache.org/jira/browse/HBASE-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263595#comment-14263595 ] stack commented on HBASE-12763: --- Ok. Fixed. Applied to branch-1.0 (after removing the extra commit by revert and then reapply of cleaned HBASE-12746): commit c7b6e10ec7c154c7f22d3231027b20a9d2904025 Author: stack Date: Sat Jan 3 11:06:47 2015 -0800 HBASE-12763 Make it so there must be WALs for a server to be marked dead > Make it so there must be WALs for a server to be marked dead > > > Key: HBASE-12763 > URL: https://issues.apache.org/jira/browse/HBASE-12763 > Project: HBase > Issue Type: Sub-task > Components: wal >Reporter: stack >Assignee: stack > Fix For: 1.0.0, 2.0.0, 1.1.0 > > Attachments: 12746-v2-master-and-098.patch > > > The patch for this issue is a subset of the patch attached to the parent. > The parent solves a 1.0.0-specific issue but part of the patch needs applying > to 0.98 and to master to fix an issue where Master on startup would think it > was joining a cluster rather than undergoing a fresh start just because it > came across a directory named for a server that was once running (the patch > checks if the dir has WALs and if none, does not think the server a dead > server). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12746) [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient)
[ https://issues.apache.org/jira/browse/HBASE-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263594#comment-14263594 ] stack commented on HBASE-12746: --- Changed my mind. Fixing my messedup commit. Here is what I did (revert, then reapply absent the extraneous HBASE-12763 stuff): commit a32f485fecbcf7e1aa7159ac2fa9851c1bf1ae48 Author: stack Date: Sat Jan 3 11:03:30 2015 -0800 HBASE-12746 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient) (Jeffrey Zhong) Reapply after fixing up an overcommit. commit d5f63f122131a45b24b5e3839d4b59a578652b2d Author: stack Date: Sat Jan 3 10:59:58 2015 -0800 Revert "HBASE-12746 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient) (Jeffrey Zhong)" Overcommitted. Reverting so can apply intended patch This reverts commit 2df74fbd4a85de1e5325ba6ad8595f2081238e29. > [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient) > -- > > Key: HBASE-12746 > URL: https://issues.apache.org/jira/browse/HBASE-12746 > Project: HBase > Issue Type: Bug > Components: wal >Affects Versions: 1.0.0 >Reporter: stack >Assignee: stack >Priority: Critical > Fix For: 1.0.0 > > Attachments: 12746-v2.patch, 12746.txt, 12746.txt > > > Testing the 1.0.0RC0 candidate, I noticed DLR was on (because I was bumping > into HBASE-12743) I thought it my environment but apparently not. 
> If I add this to HMaster > diff --git > a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java > b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java > index a85c2e7..d745f94 100644 > --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java > +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java > @@ -416,6 +416,10 @@ public class HMaster extends HRegionServer implements > MasterServices, Server { >throw new IOException("Failed to start redirecting jetty server", e); > } > masterInfoPort = connector.getPort(); > + boolean dlr = > + conf.getBoolean(HConstants.DISTRIBUTED_LOG_REPLAY_KEY, > + HConstants.DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG); > + LOG.info("Distributed log replay=" + dlr); >} > It says DLR is on. HBASE-12577 was not enough it seems. The > hbase-default.xml still has DLR as true. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12746) [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient)
[ https://issues.apache.org/jira/browse/HBASE-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263586#comment-14263586 ] stack commented on HBASE-12746: --- I overcommitted adding the small fix for HBASE-12763 when I committed this patch. Leaving as is since small potatoes. > [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient) > -- > > Key: HBASE-12746 > URL: https://issues.apache.org/jira/browse/HBASE-12746 > Project: HBase > Issue Type: Bug > Components: wal >Affects Versions: 1.0.0 >Reporter: stack >Assignee: stack >Priority: Critical > Fix For: 1.0.0 > > Attachments: 12746-v2.patch, 12746.txt, 12746.txt > > > Testing the 1.0.0RC0 candidate, I noticed DLR was on (because I was bumping > into HBASE-12743) I thought it my environment but apparently not. > If I add this to HMaster > diff --git > a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java > b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java > index a85c2e7..d745f94 100644 > --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java > +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java > @@ -416,6 +416,10 @@ public class HMaster extends HRegionServer implements > MasterServices, Server { >throw new IOException("Failed to start redirecting jetty server", e); > } > masterInfoPort = connector.getPort(); > + boolean dlr = > + conf.getBoolean(HConstants.DISTRIBUTED_LOG_REPLAY_KEY, > + HConstants.DEFAULT_DISTRIBUTED_LOG_REPLAY_CONFIG); > + LOG.info("Distributed log replay=" + dlr); >} > It says DLR is on. HBASE-12577 was not enough it seems. The > hbase-default.xml still has DLR as true. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12763) Make it so there must be WALs for a server to be marked dead
[ https://issues.apache.org/jira/browse/HBASE-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263585#comment-14263585 ] stack commented on HBASE-12763: --- [~enis] It was committed mistakenly as part of the below. commit 2df74fbd4a85de1e5325ba6ad8595f2081238e29 Author: stack Date: Sat Dec 27 09:41:44 2014 -0800 HBASE-12746 [1.0.0RC0] Distributed Log Replay is on (HBASE-12577 was insufficient) (Jeffrey Zhong) I'll leave it as is unless you want me to do it differently. Let me go make a comment too over on HBASE-12746. > Make it so there must be WALs for a server to be marked dead > > > Key: HBASE-12763 > URL: https://issues.apache.org/jira/browse/HBASE-12763 > Project: HBase > Issue Type: Sub-task > Components: wal >Reporter: stack >Assignee: stack > Fix For: 1.0.0, 2.0.0, 1.1.0 > > Attachments: 12746-v2-master-and-098.patch > > > The patch for this issue is a subset of the patch attached to the parent. > The parent solves a 1.0.0-specific issue but part of the patch needs applying > to 0.98 and to master to fix an issue where Master on startup would think it > was joining a cluster rather than undergoing a fresh start just because it > came across a directory named for a server that was once running (the patch > checks if the dir has WALs and if none, does not think the server a dead > server). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12763) Make it so there must be WALs for a server to be marked dead
[ https://issues.apache.org/jira/browse/HBASE-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-12763: -- Fix Version/s: 1.0.0 > Make it so there must be WALs for a server to be marked dead > > > Key: HBASE-12763 > URL: https://issues.apache.org/jira/browse/HBASE-12763 > Project: HBase > Issue Type: Sub-task > Components: wal >Reporter: stack >Assignee: stack > Fix For: 1.0.0, 2.0.0, 1.1.0 > > Attachments: 12746-v2-master-and-098.patch > > > The patch for this issue is a subset of the patch attached to the parent. > The parent solves a 1.0.0-specific issue but part of the patch needs applying > to 0.98 and to master to fix an issue where Master on startup would think it > was joining a cluster rather than undergoing a fresh start just because it > came across a directory named for a server that was once running (the patch > checks if the dir has WALs and if none, does not think the server a dead > server). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12270) A bug in the bucket cache, with cache blocks on write enabled
[ https://issues.apache.org/jira/browse/HBASE-12270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263583#comment-14263583 ] stack commented on HBASE-12270: --- [~ted_yu] The fails are obviously being tracked yet you feel the need to add a totally useless dup of info already freely available up on our build box? You are adding no value. > A bug in the bucket cache, with cache blocks on write enabled > - > > Key: HBASE-12270 > URL: https://issues.apache.org/jira/browse/HBASE-12270 > Project: HBase > Issue Type: Bug >Affects Versions: 0.94.11, 0.98.6.1 > Environment: I can reproduce it on a simple 2 node cluster, one > running the master and another running a RS. I was testing on ec2. > I used the following configurations for the cluster. > hbase-env:HBASE_REGIONSERVER_OPTS=-Xmx2G -XX:MaxDirectMemorySize=5G > -XX:CMSInitiatingOccupancyFraction=88 -XX:+AggressiveOpts -verbose:gc > -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xlog > gc:/tmp/hbase-regionserver-gc.log > hbase-site: > hbase.bucketcache.ioengine=offheap > hbase.bucketcache.size=4196 > hbase.rs.cacheblocksonwrite=true > hfile.block.index.cacheonwrite=true > hfile.block.bloom.cacheonwrite=true >Reporter: Khaled Elmeleegy >Assignee: Liu Shaohui >Priority: Critical > Fix For: 1.0.0, 2.0.0, 0.98.10, 1.1.0 > > Attachments: HBASE-12270-v1.diff, HBASE-12270-v2.diff, > HBASE-12270-v2.patch, TestHBase.java, TestKey.java > > > In my experiments, I have writers streaming their output to HBase. The reader > powers a web page and does this scatter/gather, where it reads 1000 keys > written last and passes them to the front end. With this workload, I get the > exception below at the region server. Again, I am using HBase (0.98.6.1). Any > help is appreciated. 
> 2014-10-10 15:06:44,173 ERROR > [B.DefaultRpcServer.handler=62,queue=2,port=60020] ipc.RpcServer: Unexpected > throwable object > java.lang.IllegalArgumentException > at java.nio.Buffer.position(Buffer.java:236) > at > org.apache.hadoop.hbase.util.ByteBufferUtils.skip(ByteBufferUtils.java:434) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.readKeyValueLen(HFileReaderV2.java:849) > at > org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:760) > at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:248) >at > org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:152) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:317) > at > org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:176) > at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:1780) > at > org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.(HRegion.java:3758) > at > org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1950) > at > org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1936) > at > org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1913) > at > org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3157) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587) >at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108) > at > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114) >at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94) > at java.lang.Thread.run(Thread.java:744) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12783) Create efficient RegionLocator implementation
[ https://issues.apache.org/jira/browse/HBASE-12783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263571#comment-14263571 ] Hudson commented on HBASE-12783: FAILURE: Integrated in HBase-1.1 #49 (See [https://builds.apache.org/job/HBase-1.1/49/]) HBASE-12783 Revert - two tests in TestAssignmentManager fail (tedyu: rev 173eba815bd7d97d15be69893d4b0836a08cf42b)
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/util/RegionSizeCalculator.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/MetaTableLocator.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/MultiTableInputFormatBase.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTableUtil.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultiTableInputFormat.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/WALPlayer.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALSplitter.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestLoadIncrementalHFilesSplitRecovery.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionManager.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/HRegionLocator.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* hbase-it/src/test/java/org/apache/hadoop/hbase/mapreduce/IntegrationTestBulkLoad.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/RegionServerCallable.java
> Create efficient RegionLocator implementation > - > > Key: HBASE-12783 > URL: https://issues.apache.org/jira/browse/HBASE-12783 > Project: HBase > Issue Type: Task >Affects Versions: 1.0.0, 2.0.0 >Reporter: Solomon Duskis >Assignee: Solomon Duskis > Fix For: 2.0.0 > > Attachments: 12783-10.patch, 12783-11.patch, HBASE-12783.patch, > HBASE-12783E.patch, HBASE-12783F.patch, HBASE-12783G.patch, > HBASE-12783H.patch, HBASE-12783I.patch, HBASE-12783J.patch, > HBASE-12783K.patch, HBASE-12873B.patch, HBASE-12873C.patch, HBASE-12873D.patch > > > A new HRegionLocator that only implements RegionLocator functionality will be > more efficient to instantiate than a full HTable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-12783) Create efficient RegionLocator implementation
[ https://issues.apache.org/jira/browse/HBASE-12783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-12783: --- Fix Version/s: (was: 1.1.0) Reverted from branch-1 due to two failing tests in TestAssignmentManager. > Create efficient RegionLocator implementation > - > > Key: HBASE-12783 > URL: https://issues.apache.org/jira/browse/HBASE-12783 > Project: HBase > Issue Type: Task >Affects Versions: 1.0.0, 2.0.0 >Reporter: Solomon Duskis >Assignee: Solomon Duskis > Fix For: 2.0.0 > > Attachments: 12783-10.patch, 12783-11.patch, HBASE-12783.patch, > HBASE-12783E.patch, HBASE-12783F.patch, HBASE-12783G.patch, > HBASE-12783H.patch, HBASE-12783I.patch, HBASE-12783J.patch, > HBASE-12783K.patch, HBASE-12873B.patch, HBASE-12873C.patch, HBASE-12873D.patch > > > A new HRegionLocator that only implements RegionLocator functionality will be > more efficient to instantiate than a full HTable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12751) Allow RowLock to be reader writer
[ https://issues.apache.org/jira/browse/HBASE-12751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263468#comment-14263468 ] Elliott Clark commented on HBASE-12751: --- Yeah, that's my thought. Puts for the same cell are already ordered by sequence number, so the order in which they are applied to the memstore shouldn't matter as long as the memstore orders them correctly. There will be a slight change in behavior. Right now it's not possible for two puts to come in and have timestamps generated on the regionserver that don't order the same as the sequence id. However, as long as the memstore correctly orders cells by timestamp and then by sequence id, it shouldn't be a behavioral change. We've never made any guarantees about the relationship of TS to seqid, since TS is user-settable. Cells that are in the same row but are totally different cells can't have ordering issues, so they aren't a problem; however, right now they pay the same cost as if they could. It's a reasonable trade-off to keep down the number of bytes that need to be compared to get a lock. However, any check-and-mutate type action still needs to be able to say that nothing else changes the value concurrently. For that, a write lock is needed. It seems row locks were initially put in to be exposed to the client (HBASE-798). That was a bad idea because of region movement, and so it was removed. However, by that time incrementColumnValue had been committed and it needed row locks; so did check-and-put and the later check-and-mutate. > Allow RowLock to be reader writer > - > > Key: HBASE-12751 > URL: https://issues.apache.org/jira/browse/HBASE-12751 > Project: HBase > Issue Type: Bug > Components: regionserver >Reporter: Elliott Clark > > Right now every write operation grabs a row lock. This is to prevent values > from changing during a read modify write operation (increment or check and > put). However it limits parallelism in several different scenarios. 
> If there are several puts to the same row but different columns or stores > then this is very limiting. > If there are puts to the same column then mvcc number should ensure a > consistent ordering. So locking is not needed. > However locking for check and put or increment is still needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
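The reader/writer split discussed in HBASE-12751 above can be sketched with the JDK's ReentrantReadWriteLock. This is a minimal illustration under stated assumptions, not the HBase implementation: the class RowLockSketch, its methods, and the String-keyed maps are all hypothetical, and the ordering of concurrent puts to the same column (which the issue leaves to the MVCC/sequence number) is not modeled here. Plain puts take the shared (read) side of the row's lock so they can run in parallel, while a check-and-put takes the exclusive (write) side so that nothing changes the value between the check and the mutation.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RowLockSketch {
    // One reader-writer lock per row key (hypothetical structure, not HRegion's).
    private final ConcurrentHashMap<String, ReentrantReadWriteLock> rowLocks =
            new ConcurrentHashMap<>();
    // row -> (column -> value); thread-safe maps because puts run under a shared lock.
    private final ConcurrentHashMap<String, ConcurrentHashMap<String, String>> rows =
            new ConcurrentHashMap<>();

    private ReentrantReadWriteLock lockFor(String row) {
        return rowLocks.computeIfAbsent(row, r -> new ReentrantReadWriteLock());
    }

    // Plain put: shared lock, so puts to the same row do not block each other.
    public void put(String row, String column, String value) {
        ReentrantReadWriteLock lock = lockFor(row);
        lock.readLock().lock();
        try {
            rows.computeIfAbsent(row, r -> new ConcurrentHashMap<>()).put(column, value);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Read-modify-write: exclusive lock, so no put can slip in between check and write.
    // An `expected` of null means "the column must currently be absent".
    public boolean checkAndPut(String row, String column, String expected, String value) {
        ReentrantReadWriteLock lock = lockFor(row);
        lock.writeLock().lock();
        try {
            ConcurrentHashMap<String, String> cells =
                    rows.computeIfAbsent(row, r -> new ConcurrentHashMap<>());
            if (!Objects.equals(cells.get(column), expected)) {
                return false;
            }
            cells.put(column, value);
            return true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Lock-free read of the current value, or null if the cell does not exist.
    public String get(String row, String column) {
        ConcurrentHashMap<String, String> cells = rows.get(row);
        return cells == null ? null : cells.get(column);
    }

    public static void main(String[] args) {
        RowLockSketch table = new RowLockSketch();
        table.put("row1", "cf:a", "1");
        System.out.println(table.checkAndPut("row1", "cf:a", "0", "2")); // check fails
        System.out.println(table.checkAndPut("row1", "cf:a", "1", "2")); // check passes
        System.out.println(table.get("row1", "cf:a"));
    }
}
```

The design choice mirrors the comment thread: only operations that must observe a stable value pay for exclusivity, and everything else shares the lock.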