[jira] [Assigned] (HBASE-5777) MiniHBaseCluster cannot start multiple region servers
     [ https://issues.apache.org/jira/browse/HBASE-5777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang reassigned HBASE-5777:
----------------------------------

    Assignee: Jimmy Xiang

> MiniHBaseCluster cannot start multiple region servers
> -----------------------------------------------------
>
>                 Key: HBASE-5777
>                 URL: https://issues.apache.org/jira/browse/HBASE-5777
>             Project: HBase
>          Issue Type: Test
>            Reporter: Jimmy Xiang
>            Assignee: Jimmy Xiang
>         Attachments: hbase-5777.patch
>
>
> MiniHBaseCluster can try to start multiple region servers, but all of them
> except one will die while putting up the web UI because of a BindException,
> since HConstants.REGIONSERVER_INFO_PORT_AUTO is set to false by default.
> This issue makes many unit tests that depend on multiple region servers
> flaky, such as TestAdmin.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
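The failure mode described in HBASE-5777 can be sketched with plain java.net.ServerSocket. This is only an illustration, not HBase's info server code: the class InfoPortDemo, its methods, and the "probe the next port" behaviour of the auto flag are assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.BindException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the info (web UI) listeners of several region servers. */
final class InfoPortDemo {

    /** Binds to the given port; in auto mode a conflict makes it probe the next port. */
    static ServerSocket bind(int port, boolean auto) throws IOException {
        while (true) {
            try {
                return new ServerSocket(port, 50, InetAddress.getLoopbackAddress());
            } catch (BindException e) {
                if (!auto || port >= 65535) {
                    throw e; // fixed port: this server cannot come up
                }
                port++; // auto mode: try the next port
            }
        }
    }

    /** Starts count listeners that share one configured port; returns how many came up. */
    static int startAll(int count, boolean auto) {
        List<ServerSocket> up = new ArrayList<>();
        try {
            // Let the OS pick a free port once, then treat it as the configured info port.
            up.add(bind(0, false));
            int configuredPort = up.get(0).getLocalPort();
            for (int i = 1; i < count; i++) {
                try {
                    up.add(bind(configuredPort, auto));
                } catch (BindException e) {
                    // This "region server" dies while putting up its web UI.
                }
            }
            return up.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            for (ServerSocket s : up) {
                try {
                    s.close();
                } catch (IOException ignored) {
                    // best-effort cleanup
                }
            }
        }
    }
}
```

With the fixed port, InfoPortDemo.startAll(3, false) leaves a single listener running; with the auto flag set, all three come up on neighbouring ports.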
[jira] [Assigned] (HBASE-4403) Adopt interface stability/audience classifications from Hadoop
     [ https://issues.apache.org/jira/browse/HBASE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang reassigned HBASE-4403:
----------------------------------

    Assignee: Jimmy Xiang

> Adopt interface stability/audience classifications from Hadoop
> --------------------------------------------------------------
>
>                 Key: HBASE-4403
>                 URL: https://issues.apache.org/jira/browse/HBASE-4403
>             Project: HBase
>          Issue Type: Task
>    Affects Versions: 0.90.5, 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Jimmy Xiang
>         Attachments: hbase-4403-nowhere-near-done.txt
>
>
> As HBase gets more widely used, we need to be more explicit about which APIs
> are stable and not expected to break between versions, which APIs are still
> evolving, etc. We also have many public classes that are really internal to
> the RS or Master and not meant to be used by users. Hadoop has adopted a
> classification scheme for audience (public, private, or limited-private) as
> well as stability (stable, evolving, unstable). I think we should copy these
> annotations to HBase and start to classify our public classes.
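As a rough picture of what such a classification scheme looks like in code, here is a self-contained sketch. The Audience and Stability annotation holders and the two example classes are local stand-ins invented for this illustration; they are not the Hadoop annotations themselves and not what an HBase patch would necessarily contain.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

/** Stand-in audience classifications: public, limited-private, private. */
final class Audience {
    @Documented @Retention(RetentionPolicy.RUNTIME) @interface Public {}
    @Documented @Retention(RetentionPolicy.RUNTIME) @interface LimitedPrivate { String[] value(); }
    @Documented @Retention(RetentionPolicy.RUNTIME) @interface Private {}
    private Audience() {}
}

/** Stand-in stability classifications: stable, evolving, unstable. */
final class Stability {
    @Documented @Retention(RetentionPolicy.RUNTIME) @interface Stable {}
    @Documented @Retention(RetentionPolicy.RUNTIME) @interface Evolving {}
    @Documented @Retention(RetentionPolicy.RUNTIME) @interface Unstable {}
    private Stability() {}
}

/** Hypothetical client-facing class: meant for users, not expected to break. */
@Audience.Public
@Stability.Stable
class ExampleClientApi {}

/** Hypothetical class that is public in Java terms but internal to the RS or Master. */
@Audience.Private
@Stability.Unstable
class ExampleRegionServerInternals {}

final class ClassificationDemo {
    /** True when the class is marked as intended for end users. */
    static boolean isPublicApi(Class<?> type) {
        return type.isAnnotationPresent(Audience.Public.class);
    }
}
```

Because the stand-in annotations are retained at runtime, a tool or test could walk the public classes and report any that carry no audience marker at all.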
[jira] [Assigned] (HBASE-5394) Add ability to include Protobufs in HbaseObjectWritable
     [ https://issues.apache.org/jira/browse/HBASE-5394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang reassigned HBASE-5394:
----------------------------------

    Assignee: Jimmy Xiang

> Add ability to include Protobufs in HbaseObjectWritable
> -------------------------------------------------------
>
>                 Key: HBASE-5394
>                 URL: https://issues.apache.org/jira/browse/HBASE-5394
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Zhihong Yu
>            Assignee: Jimmy Xiang
>
> This is a port of HADOOP-7379.
> This is to add the cases to HbaseObjectWritable to handle subclasses of
> Message, the superclass of codegenned protobufs.
[jira] [Assigned] (HBASE-5327) Print a message when an invalid hbase.rootdir is passed
     [ https://issues.apache.org/jira/browse/HBASE-5327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang reassigned HBASE-5327:
----------------------------------

    Assignee: Jimmy Xiang

> Print a message when an invalid hbase.rootdir is passed
> -------------------------------------------------------
>
>                 Key: HBASE-5327
>                 URL: https://issues.apache.org/jira/browse/HBASE-5327
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.5
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jimmy Xiang
>             Fix For: 0.94.0, 0.90.7, 0.92.1
>
>         Attachments: hbase-5327.txt
>
>
> As seen on the mailing list:
> http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/24124
> If hbase.rootdir doesn't specify a folder on hdfs we crash while opening a
> path to .oldlogs:
> {noformat}
> 2012-02-02 23:07:26,292 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
> java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: hdfs://sv4r11s38:9100.oldlogs
>         at org.apache.hadoop.fs.Path.initialize(Path.java:148)
>         at org.apache.hadoop.fs.Path.<init>(Path.java:71)
>         at org.apache.hadoop.fs.Path.<init>(Path.java:50)
>         at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:112)
>         at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:448)
>         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:326)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.net.URISyntaxException: Relative path in absolute URI: hdfs://sv4r11s38:9100.oldlogs
>         at java.net.URI.checkPath(URI.java:1787)
>         at java.net.URI.<init>(URI.java:735)
>         at org.apache.hadoop.fs.Path.initialize(Path.java:145)
>         ... 6 more
> {noformat}
> It could also crash anywhere else; this just happens to be the first place we
> use hbase.rootdir. We need to verify that it's an actual folder.
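The exception in the HBASE-5327 trace can be reproduced with java.net.URI alone, and the same class is enough for a basic up-front check. The sketch below is an assumption about what such a check might look like; RootDirCheck and its messages are made up for illustration and are not taken from the attached patch. It only inspects the URI string and does not contact a filesystem.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical up-front sanity check for an hbase.rootdir value. */
final class RootDirCheck {

    /** Returns null when rootDir names a path, otherwise a readable error message. */
    static String validate(String rootDir) {
        URI uri;
        try {
            uri = new URI(rootDir);
        } catch (URISyntaxException e) {
            return "hbase.rootdir is not a valid URI: " + e.getMessage();
        }
        String path = uri.getPath();
        if (uri.getScheme() != null && (path == null || path.isEmpty())) {
            // e.g. hdfs://host:port with no folder after the authority
            return "hbase.rootdir must name a folder, e.g. " + rootDir + "/hbase";
        }
        return null;
    }

    /** Rebuilds the failing URI from the trace and returns the reason java.net.URI reports. */
    static String reproduce() {
        try {
            // Scheme and authority are present, but the path does not start with '/'.
            new URI("hdfs", "sv4r11s38:9100", ".oldlogs", null, null);
            return null;
        } catch (URISyntaxException e) {
            return e.getReason();
        }
    }
}
```

A value with no folder after the host and port is rejected with a message the operator can act on, instead of surfacing later as an unhandled exception in the master.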
[jira] [Assigned] (HBASE-5221) bin/hbase script doesn't look for Hadoop jars in the right place in trunk layout
     [ https://issues.apache.org/jira/browse/HBASE-5221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang reassigned HBASE-5221:
----------------------------------

    Assignee: Jimmy Xiang

> bin/hbase script doesn't look for Hadoop jars in the right place in trunk layout
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-5221
>                 URL: https://issues.apache.org/jira/browse/HBASE-5221
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Jimmy Xiang
>
> Running against a 0.24.0-SNAPSHOT hadoop:
> ls: cannot access /home/todd/ha-demo/hadoop-0.24.0-SNAPSHOT/hadoop-common*.jar: No such file or directory
> ls: cannot access /home/todd/ha-demo/hadoop-0.24.0-SNAPSHOT/hadoop-hdfs*.jar: No such file or directory
> ls: cannot access /home/todd/ha-demo/hadoop-0.24.0-SNAPSHOT/hadoop-mapred*.jar: No such file or directory
> The jars are rooted deeper in the hierarchy.
[jira] [Assigned] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root r
     [ https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang reassigned HBASE-5099:
----------------------------------

    Assignee: Jimmy Xiang

> ZK event thread waiting for root region while server shutdown handler waiting
> for event thread to finish distributed log splitting to recover the region
> sever the root region is on
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-5099
>                 URL: https://issues.apache.org/jira/browse/HBASE-5099
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: Jimmy Xiang
>            Assignee: Jimmy Xiang
>         Attachments: ZK-event-thread-waiting-for-root.png,
> distributed-log-splitting-hangs.png, hbase-5099.patch
>
>
> A RS died. The ServerShutdownHandler kicked in and started the log splitting.
> SplitLogManager installed the tasks asynchronously, then started to wait for
> them to complete. The task znodes were not actually created; the requests
> were just queued.
> At this time, the zookeeper connection expired. HMaster tried to recover the
> expired ZK session.
> During the recovery, a new zookeeper connection was created. However, this
> master became the new master again. It tried to assign root and meta.
> Because the dead RS held the old root region, the master needs to wait for
> the log splitting to complete.
> This wait holds the zookeeper event thread, so the async create of the split
> task is never retried: there is only one event thread, and it is waiting for
> the root region to be assigned.
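The hang in HBASE-5099 comes down to one thread waiting for work that can only run on that same thread. A minimal sketch of that shape, using a java.util.concurrent executor as a stand-in for the ZooKeeper event thread, is below; EventThreadDemo and its latches are assumptions for illustration, and a timeout replaces the indefinite wait so the example terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical model: a handler on the event thread waits for a later event. */
final class EventThreadDemo {

    /** Returns true if "log splitting" finished while the handler was waiting. */
    static boolean splittingCompletes(int eventThreads, long waitMillis) {
        ExecutorService events = Executors.newFixedThreadPool(eventThreads);
        CountDownLatch splitDone = new CountDownLatch(1);
        CountDownLatch handlerStarted = new CountDownLatch(1);
        try {
            // Event 1: the recovery handler blocks until splitting is reported done.
            Future<Boolean> handler = events.submit(() -> {
                handlerStarted.countDown();
                return splitDone.await(waitMillis, TimeUnit.MILLISECONDS);
            });
            handlerStarted.await();
            // Event 2: the retry of the queued create request, which lets splitting finish.
            events.execute(splitDone::countDown);
            return handler.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            events.shutdownNow();
        }
    }
}
```

With a single event thread the second event stays queued behind the waiting handler, so the wait can only end by timing out; with a second thread it completes promptly.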
[jira] [Assigned] (HBASE-5081) Distributed log splitting deleteNode races againsth splitLog retry
     [ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang reassigned HBASE-5081:
----------------------------------

    Assignee: Prakash Khemani  (was: Jimmy Xiang)

Prakash is going to put up a patch to fix createIfAbsent, which is a great alternative.

> Distributed log splitting deleteNode races againsth splitLog retry
> -------------------------------------------------------------------
>
>                 Key: HBASE-5081
>                 URL: https://issues.apache.org/jira/browse/HBASE-5081
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: Jimmy Xiang
>            Assignee: Prakash Khemani
>             Fix For: 0.92.0
>
>         Attachments: distributed-log-splitting-screenshot.png,
> hbase-5081-patch-v6.txt, hbase-5081-patch-v7.txt,
> hbase-5081_patch_for_92_v4.txt, hbase-5081_patch_v5.txt, patch_for_92.txt,
> patch_for_92_v2.txt, patch_for_92_v3.txt
>
>
> Recently, during 0.92 rc testing, we found that distributed log splitting
> hangs forever. Please see the attached screenshot.
> I looked into it and here is what I think happened:
> 1. One RS died; the ServerShutdownHandler detected it and started the
> distributed log splitting.
> 2. All three tasks failed, so the three tasks were deleted, asynchronously.
> 3. The ServerShutdownHandler retried the log splitting.
> 4. During the retry, it created these three tasks again and put them in a
> hashmap (tasks).
> 5. The asynchronous deletion from step 2 finally happened for one task; in
> the callback, it removed one task from the hashmap.
> 6. The zookeeper watcher of one of the newly submitted tasks found that the
> task was unassigned and not in the hashmap, so it created a new orphan task.
> 7. All three tasks failed, but the task created in step 6 is an orphan, so
> the batch.err counter was one short. The log splitting therefore hangs,
> waiting for the last task to finish, which is never going to happen.
> So I think the problem is step 2. The fix is to make the deletion sync
> instead of async, so that the retry will have a clean start.
> Async deleteNode will interfere with the split log retry. In an extreme
> situation, if the async deleteNode doesn't happen soon enough, a node
> created during the retry could be deleted.
> deleteNode should be sync.
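The interleaving in steps 2 to 5 can be replayed deterministically without ZooKeeper. The sketch below is a simplified model under stated assumptions: SplitRetryRace, the task map keyed by log name, and the callback queue are all invented for illustration and do not mirror SplitLogManager's actual data structures.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical replay of a stale delete callback hitting the retry's bookkeeping. */
final class SplitRetryRace {

    /** Returns how many of the retry's three tasks are still tracked afterwards. */
    static int trackedAfterRetry(boolean syncDelete) {
        Map<String, Integer> tasks = new HashMap<>(); // log name -> attempt number
        Deque<Runnable> pendingCallbacks = new ArrayDeque<>();
        List<String> logs = Arrays.asList("log-1", "log-2", "log-3");

        // Attempt 1: all three tasks are submitted, fail, and are then deleted.
        for (String log : logs) {
            tasks.put(log, 1);
        }
        for (String log : logs) {
            Runnable deleteCallback = () -> tasks.remove(log);
            if (syncDelete) {
                deleteCallback.run(); // finishes before the retry starts
            } else {
                pendingCallbacks.add(deleteCallback); // fires at some later point
            }
        }

        // Attempt 2 (the retry): the same names go back into the map.
        for (String log : logs) {
            tasks.put(log, 2);
        }

        // One late callback from attempt 1 fires now and removes a retry task.
        Runnable late = pendingCallbacks.poll();
        if (late != null) {
            late.run();
        }

        int tracked = 0;
        for (String log : logs) {
            if (Integer.valueOf(2).equals(tasks.get(log))) {
                tracked++;
            }
        }
        return tracked;
    }
}
```

In the async case only two of the three retry tasks remain tracked, which matches the counter being one short in step 7; with a synchronous delete the retry keeps all three.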
[jira] [Assigned] (HBASE-4797) [availability] Give recovered.edits files better names, ones that include first and last sequence id so we can skip files with edits we know older than current region ha
     [ https://issues.apache.org/jira/browse/HBASE-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang reassigned HBASE-4797:
----------------------------------

    Assignee: Jimmy Xiang

> [availability] Give recovered.edits files better names, ones that include
> first and last sequence id so we can skip files with edits we know older than
> current region has
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-4797
>                 URL: https://issues.apache.org/jira/browse/HBASE-4797
>             Project: HBase
>          Issue Type: Bug
>          Components: performance
>            Reporter: stack
>            Assignee: Jimmy Xiang
>            Priority: Critical
>              Labels: noob
>
> Testing 0.92, I crashed all servers out. Another bug makes it so WALs are
> not getting cleaned, so I had 7000 regions to replay. The distributed split
> code did a nice job and the cluster came back, but what is interesting is
> that some hot regions ended up having loads of recovered.edits files -- tens
> if not hundreds -- to replay against the region (can we bulk load
> recovered.edits instead of replaying them?). Each recovered.edits file is
> taking about a second to process (though it seems there are only about 30
> edits per file). The region is unavailable during this time.
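To make the naming idea in the HBASE-4797 title concrete, here is a small sketch. The file name layout (two zero-padded sequence ids joined by a dash) and the class RecoveredEditsNames are assumptions chosen for the example; the issue does not specify a format.

```java
/** Hypothetical naming scheme that encodes the sequence id range of a file. */
final class RecoveredEditsNames {

    /** Builds a name from the first and last sequence id, zero-padded so names sort by id. */
    static String fileName(long firstSeqId, long lastSeqId) {
        return String.format("%019d-%019d", firstSeqId, lastSeqId);
    }

    /** True when every edit in the file is no newer than what the region already has. */
    static boolean canSkip(String fileName, long regionSeqId) {
        int dash = fileName.indexOf('-');
        if (dash < 0) {
            throw new IllegalArgumentException("not a <first>-<last> name: " + fileName);
        }
        long lastSeqId = Long.parseLong(fileName.substring(dash + 1));
        return lastSeqId <= regionSeqId;
    }
}
```

With names like these, a region whose current sequence id is 200 could skip a file covering ids 100 to 150 by looking at the name alone, without opening the file.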
[jira] [Assigned] (HBASE-4820) Distributed log splitting coding enhancement to make it easier to understand, no semantics change
     [ https://issues.apache.org/jira/browse/HBASE-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang reassigned HBASE-4820:
----------------------------------

    Assignee: Jimmy Xiang

> Distributed log splitting coding enhancement to make it easier to understand,
> no semantics change
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-4820
>                 URL: https://issues.apache.org/jira/browse/HBASE-4820
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>    Affects Versions: 0.94.0
>            Reporter: Jimmy Xiang
>            Assignee: Jimmy Xiang
>            Priority: Minor
>              Labels: newbie
>             Fix For: 0.94.0
>
>
> In reviewing the distributed log splitting feature, we found some cosmetic
> issues. They make the code hard to understand.
> It would be great to fix them. For this issue, there should be no semantic
> change.