[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2
[ https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489278#comment-13489278 ] Hadoop QA commented on HBASE-6410: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12551821/HBASE-6410-16.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 164 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 87 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 7 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3213//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3213//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3213//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3213//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3213//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3213//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3213//console This message is automatically generated. > Move RegionServer Metrics to metrics2 > - > > Key: HBASE-6410 > URL: https://issues.apache.org/jira/browse/HBASE-6410 > Project: HBase > Issue Type: Sub-task > Components: metrics >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Blocker > Attachments: HBASE-6410-13.patch, HBASE-6410-15.patch, > HBASE-6410-16.patch, HBASE-6410-1.patch, HBASE-6410-2.patch, > HBASE-6410-3.patch, HBASE-6410-4.patch, HBASE-6410-5.patch, > HBASE-6410-6.patch, HBASE-6410.patch > > > Move RegionServer Metrics to metrics2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6410) Move RegionServer Metrics to metrics2
[ https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-6410: - Attachment: HBASE-6410-16.patch Fix for the test failure. I changed one of the hadoop shims but forgot to change the other. That will be something to watch out for on metrics changes from now on. > Move RegionServer Metrics to metrics2 > - > > Key: HBASE-6410 > URL: https://issues.apache.org/jira/browse/HBASE-6410 > Project: HBase > Issue Type: Sub-task > Components: metrics >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Blocker > Attachments: HBASE-6410-13.patch, HBASE-6410-15.patch, > HBASE-6410-16.patch, HBASE-6410-1.patch, HBASE-6410-2.patch, > HBASE-6410-3.patch, HBASE-6410-4.patch, HBASE-6410-5.patch, > HBASE-6410-6.patch, HBASE-6410.patch > > > Move RegionServer Metrics to metrics2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-2645) HLog writer can do 1-2 sync operations after lease has been recovered for split process.
[ https://issues.apache.org/jira/browse/HBASE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489256#comment-13489256 ] Ted Yu commented on HBASE-2645: --- Thanks for the verification, Stack. Appreciate it. > HLog writer can do 1-2 sync operations after lease has been recovered for > split process. > > > Key: HBASE-2645 > URL: https://issues.apache.org/jira/browse/HBASE-2645 > Project: HBase > Issue Type: Bug > Components: Filters >Affects Versions: 0.90.4 >Reporter: Cosmin Lehene >Assignee: Todd Lipcon >Priority: Blocker > Fix For: 0.96.0 > > Attachments: 2645.txt, 2645v2.txt, 2645v3.txt, > org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit-output.txt > > > TestHLogSplit.testLogCannotBeWrittenOnceParsed is failing. > This test starts a thread that writes one edit to the log, syncs and counts. > During this, a HLog.splitLog operation is started. splitLog recovers the log > lease before reading the log, so that the original regionserver could not > wake up and write after the split process started. > The test compares the number of edits reported by the split process and by > the writer thread. Writer thread (called zombie in the test) should report <= > than the splitLog (sync() might raise after the last edit gets written and > the edit won't get counted by zombie thread). However it appears that the > zombie counts 1-2 more edits. So it looks like it can sync without a lease. > This might be a hdfs-0.20 related issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-2645) HLog writer can do 1-2 sync operations after lease has been recovered for split process.
[ https://issues.apache.org/jira/browse/HBASE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489252#comment-13489252 ] stack commented on HBASE-2645: -- It fails for me too locally if I keep running it. If I do the below, it usually passes but it can also fail: {code} $ for i in `seq 0 9`; do echo "here=$i"; mvn test -Dtest=TestHLogSplit#testLogCannotBeWrittenOnceParsed -PlocalTests; mv hbase-server/target/surefire-reports hbase-server/target/$i; done &> /tmp/test.txt & {code} I tried w/ 1.0.4 and it also fails. Will revert for now. If a bug in HDFS, will file an issue. > HLog writer can do 1-2 sync operations after lease has been recovered for > split process. > > > Key: HBASE-2645 > URL: https://issues.apache.org/jira/browse/HBASE-2645 > Project: HBase > Issue Type: Bug > Components: Filters >Affects Versions: 0.90.4 >Reporter: Cosmin Lehene >Assignee: Todd Lipcon >Priority: Blocker > Fix For: 0.96.0 > > Attachments: 2645.txt, 2645v2.txt, 2645v3.txt, > org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit-output.txt > > > TestHLogSplit.testLogCannotBeWrittenOnceParsed is failing. > This test starts a thread that writes one edit to the log, syncs and counts. > During this, a HLog.splitLog operation is started. splitLog recovers the log > lease before reading the log, so that the original regionserver could not > wake up and write after the split process started. > The test compares the number of edits reported by the split process and by > the writer thread. Writer thread (called zombie in the test) should report <= > than the splitLog (sync() might raise after the last edit gets written and > the edit won't get counted by zombie thread). However it appears that the > zombie counts 1-2 more edits. So it looks like it can sync without a lease. > This might be a hdfs-0.20 related issue. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter
[ https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489247#comment-13489247 ] stack commented on HBASE-7083: -- bq. As to forcing reassign, it is needed since the region state in memory may show the missing daughter is open. [~jxiang] When would the above case be true? It was not in .META. so its edit failed? How then could it be open in the master given it's a new region? > SSH#fixupDaughter should force re-assign missing daughter > - > > Key: HBASE-7083 > URL: https://issues.apache.org/jira/browse/HBASE-7083 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Minor > Fix For: 0.96.0 > > Attachments: trunk-7083.patch, trunk-7083.patch > > > In looking into flaky test > TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a > missing daughter is not assigned by SSH properly. It could be open on the > dead server. We need to force re-assign it.
[jira] [Commented] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter
[ https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489248#comment-13489248 ] stack commented on HBASE-7083: -- Also, if open already in master, could it be open on a regionserver? So this assign will mean double assignment? > SSH#fixupDaughter should force re-assign missing daughter > - > > Key: HBASE-7083 > URL: https://issues.apache.org/jira/browse/HBASE-7083 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Minor > Fix For: 0.96.0 > > Attachments: trunk-7083.patch, trunk-7083.patch > > > In looking into flaky test > TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a > missing daughter is not assigned by SSH properly. It could be open on the > dead server. We need to force re-assign it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2
[ https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489245#comment-13489245 ] stack commented on HBASE-6410: -- The above failure looks like yours, Elliott. If you fix it and check that my comments on your last post to RB are just RB messing up your patch, I'd be +1 on committing and then addressing teething issues in new JIRAs. Anyone else want to have a look at this? I think it should go in sooner rather than later since it covers a lot of ground and so will rot quickly. > Move RegionServer Metrics to metrics2 > - > > Key: HBASE-6410 > URL: https://issues.apache.org/jira/browse/HBASE-6410 > Project: HBase > Issue Type: Sub-task > Components: metrics >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Blocker > Attachments: HBASE-6410-13.patch, HBASE-6410-15.patch, > HBASE-6410-1.patch, HBASE-6410-2.patch, HBASE-6410-3.patch, > HBASE-6410-4.patch, HBASE-6410-5.patch, HBASE-6410-6.patch, HBASE-6410.patch > > > Move RegionServer Metrics to metrics2
[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer
[ https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489244#comment-13489244 ] Hudson commented on HBASE-6925: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #247 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/247/]) HBASE-6925 Change socket write size from 8K to 64K for HBaseServer (Karthik) (Revision 1404777) Result = FAILURE larsh : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java > Change socket write size from 8K to 64K for HBaseServer > --- > > Key: HBASE-6925 > URL: https://issues.apache.org/jira/browse/HBASE-6925 > Project: HBase > Issue Type: Sub-task > Components: Performance >Reporter: Karthik Ranganathan >Assignee: Karthik Ranganathan >Priority: Critical > Fix For: 0.94.3, 0.96.0 > > Attachments: HBASE-6925.patch > > > Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT > from 8K to 64K in HBaseServer. This seems to increase scan throughput. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
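The effect of the buffer limit described in HBASE-6925 can be illustrated with a small sketch. This is not the HBaseServer source: the class and method names below are hypothetical, and the loop only assumes that a response buffer is handed to the channel in slices no larger than the limit, so a larger limit means fewer write calls for the same payload.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

/** Illustrative only; names are hypothetical and not taken from HBaseServer. */
class ChunkedWriteSketch {

    /**
     * Writes buf to the channel in slices of at most chunkLimit bytes
     * and returns the number of write calls issued.
     */
    static int writeInChunks(WritableByteChannel ch, ByteBuffer buf, int chunkLimit)
            throws IOException {
        int calls = 0;
        int originalLimit = buf.limit();
        while (buf.hasRemaining()) {
            int chunk = Math.min(buf.remaining(), chunkLimit);
            // Temporarily narrow the buffer so only one slice is visible to the channel.
            buf.limit(buf.position() + chunk);
            int written;
            try {
                written = ch.write(buf);
            } finally {
                buf.limit(originalLimit);
            }
            calls++;
            if (written < chunk) {
                break; // channel could not take the full slice; a caller would retry later
            }
        }
        return calls;
    }

    /** Convenience wrapper: counts write calls for a payload sent to an in-memory sink. */
    static int countWrites(int totalBytes, int chunkLimit) {
        try {
            WritableByteChannel sink = Channels.newChannel(new ByteArrayOutputStream());
            return writeInChunks(sink, ByteBuffer.allocate(totalBytes), chunkLimit);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int payload = 200 * 1024;
        System.out.println("8K limit:  " + countWrites(payload, 8 * 1024) + " writes");
        System.out.println("64K limit: " + countWrites(payload, 64 * 1024) + " writes");
    }
}
```

Fewer, larger writes per response is one plausible reason for the scan throughput gain the issue reports; the sketch does not measure throughput itself.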
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489243#comment-13489243 ] Hudson commented on HBASE-6060: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #247 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/247/]) HBASE-6060 Regions's in OPENING state from failed regionservers takes a long time to recover (Revision 1404759) Result = FAILURE jxiang : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionState.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java > Regions's in OPENING state from failed regionservers takes a long time to > recover > - > > Key: HBASE-6060 > URL: https://issues.apache.org/jira/browse/HBASE-6060 > Project: HBase > Issue Type: Bug > Components: master, regionserver >Reporter: Enis Soztutar >Assignee: Jimmy Xiang > Fix For: 0.96.0 > > Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, > 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, > 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, > 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, > 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, > HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, > HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, > trunk-6060.patch, trunk-6060_v2.patch, trunk-6060_v3.3.patch > > > we have seen a pattern in tests, that the regions are stuck in OPENING state > for a very long time when the region server who is opening the region fails. 
> My understanding of the process:
>
> - The master calls the rs to open the region. If the rs is offline, a new plan is generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in master memory; zk still shows OFFLINE). See HRegionServer.openRegion(), HMaster.assign().
> - The RegionServer starts opening a region and changes the state in the znode. But that znode is not ephemeral (see ZkAssign).
> - The rs transitions the zk node from OFFLINE to OPENING. See OpenRegionHandler.process().
> - The rs then opens the region and changes the znode from OPENING to OPENED.
> - When the rs is killed between the OPENING and OPENED states, zk shows the OPENING state, and the master just waits for the rs to change the region state, but since the rs is down, that won't happen.
> - There is an AssignmentManager.TimeoutMonitor, which guards against exactly this kind of condition. It periodically checks (every 10 sec by default) the regions in transition to see whether they have timed out (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 min, which explains what you and I are seeing.
> - ServerShutdownHandler in the Master does not reassign regions in OPENING state, although it handles other states.
>
> Lowering that threshold from the configuration is one option, but still I think we can do better.
> Will investigate more.
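The gap described in the HBASE-6060 issue text can be modeled in a few lines. The sketch below is a toy model under stated assumptions, not HBase code: `RegionTransitionSketch`, `timedOut`, and `openingOn` are hypothetical names, and ZooKeeper is replaced by an in-memory map. It contrasts waiting for a timeout with asking directly which regions a dead server was still opening.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of regions in transition; all names here are hypothetical. */
class RegionTransitionSketch {
    enum State { OFFLINE, PENDING_OPEN, OPENING, OPENED }

    private static final class Transition {
        final State state;
        final String server;
        final long stampMs;

        Transition(State state, String server, long stampMs) {
            this.state = state;
            this.server = server;
            this.stampMs = stampMs;
        }
    }

    private final Map<String, Transition> inTransition = new LinkedHashMap<>();

    /** Records a state change; a region that reaches OPENED leaves the map. */
    void update(String region, State state, String server, long nowMs) {
        if (state == State.OPENED) {
            inTransition.remove(region);
        } else {
            inTransition.put(region, new Transition(state, server, nowMs));
        }
    }

    /** Timeout-monitor view: regions whose last transition is older than timeoutMs. */
    List<String> timedOut(long nowMs, long timeoutMs) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Transition> e : inTransition.entrySet()) {
            if (nowMs - e.getValue().stampMs > timeoutMs) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Shutdown-handler view: regions the dead server was still OPENING. */
    List<String> openingOn(String deadServer) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Transition> e : inTransition.entrySet()) {
            Transition t = e.getValue();
            if (t.state == State.OPENING && t.server.equals(deadServer)) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

With a 30 minute timeout, `timedOut` stays empty for half an hour after the server dies, while `openingOn` can name the stuck region as soon as the failure is detected.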
[jira] [Commented] (HBASE-7077) Test for: CheckAndPut should properly read MVCC
[ https://issues.apache.org/jira/browse/HBASE-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489242#comment-13489242 ] Hudson commented on HBASE-7077: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #247 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/247/]) HBASE-7077 ADDENDUM, add TestCategory (Revision 1404640) Result = FAILURE larsh : Files : * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHBase7051.java > Test for: CheckAndPut should properly read MVCC > --- > > Key: HBASE-7077 > URL: https://issues.apache.org/jira/browse/HBASE-7077 > Project: HBase > Issue Type: Sub-task >Reporter: Gregory Chanan >Assignee: ramkrishna.s.vasudevan > Fix For: 0.94.3, 0.96.0 > > Attachments: HBASE-7071.patch, HBASE-7071_testcasewithassert.patch > > > checkAndPut should integrate with MVCC, similar to how HBASE-4583 fixed > appends and increments. > Also need a test, here's one we could use (originally proposed in HBASE-7051): > The current value of some cell is 10. > I issue two concurrent requests: > A) a check and put where check value = 10, put value = 11 > B) a put where put value = 50 > The only result at the end of these operations that seems reasonable to me is > the value of the cell being 50. If A occurred first (ACID wise), then our > values go 10->11->50. If B occurred first, then our values go 10->50 (and the > checkAndPut fails) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
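The scenario in the HBASE-7077 description can be reproduced with a stand-in for the cell. The sketch below assumes an `AtomicInteger` in place of an HBase cell and does not model MVCC or any HBase client API; the class name is hypothetical. It only shows the invariant the proposed test is after: when the check-and-put is atomic, either ordering of the two requests ends at 50.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for the test scenario; an AtomicInteger plays the role of the cell. */
class CheckAndPutSketch {

    /** Runs request A (check 10, put 11) and request B (put 50) concurrently. */
    static int runOnce() {
        AtomicInteger cell = new AtomicInteger(10);
        CountDownLatch start = new CountDownLatch(1);

        Thread a = new Thread(() -> {
            awaitQuietly(start);
            cell.compareAndSet(10, 11); // A: atomic check-and-put
        });
        Thread b = new Thread(() -> {
            awaitQuietly(start);
            cell.set(50);               // B: plain put
        });

        a.start();
        b.start();
        start.countDown();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // A first: 10 -> 11 -> 50. B first: 10 -> 50 and A's check fails.
        return cell.get();
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("final value: " + runOnce());
    }
}
```

A final value of 11 would mean the check read a stale value and the put landed after B, which is the kind of outcome the issue wants a test to rule out.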
[jira] [Commented] (HBASE-7082) TestHFileCleaner#testHFileCleaning fails due to cleaner is reset
[ https://issues.apache.org/jira/browse/HBASE-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489241#comment-13489241 ] Hudson commented on HBASE-7082: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #247 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/247/]) HBASE-7082 TestHFileCleaner#testHFileCleaning fails due to cleaner is reset (Revision 1404722) Result = FAILURE jxiang : Files : * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestHFileCleaner.java > TestHFileCleaner#testHFileCleaning fails due to cleaner is reset > > > Key: HBASE-7082 > URL: https://issues.apache.org/jira/browse/HBASE-7082 > Project: HBase > Issue Type: Test >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Trivial > Fix For: 0.96.0 > > Attachments: trunk-7082.patch > > > TestHFileCleaner#testHFileCleaning fails if it runs after > testRemovesEmptyDirectories which resets the cleaner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter
[ https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-7083: --- Resolution: Fixed Status: Resolved (was: Patch Available) Integrated into trunk. Thanks Stack and Ted for the review. The failed unit tests are flaky, seem to be fine locally. > SSH#fixupDaughter should force re-assign missing daughter > - > > Key: HBASE-7083 > URL: https://issues.apache.org/jira/browse/HBASE-7083 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Minor > Fix For: 0.96.0 > > Attachments: trunk-7083.patch, trunk-7083.patch > > > In looking into flaky test > TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a > missing daughter is not assigned by SSH properly. It could be open on the > dead server. We need to force re-assign it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk
[ https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489238#comment-13489238 ] stack commented on HBASE-7055: -- The last posted patch is what was put on RB Sergey? It does not include the rename of CompactionManager? You want to add that? I'd be +1 on committing after the rename. You want to attach the patch to a different issue since this issue is about forward porting the 89fb tier compaction? (Liyin and a few of the lads were by today and said they have a doc. that talks about how this new compaction works -- its benefits, etc. -- which they will post... IIRC, I think he said it is not yet in production). > port HBASE-6371 tier-based compaction from 0.89-fb to trunk > --- > > Key: HBASE-7055 > URL: https://issues.apache.org/jira/browse/HBASE-7055 > Project: HBase > Issue Type: Task > Components: Compaction >Affects Versions: 0.96.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 0.96.0 > > Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, > HBASE-6371-v3-refactor-only-squashed.patch, > HBASE-6371-v4-refactor-only-squashed.patch > > > There's divergence in the code :( > See HBASE-6371 for details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Comment Edited] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489221#comment-13489221 ] Ted Yu edited comment on HBASE-6796 at 11/2/12 3:37 AM: I tried to reproduce the test failure in TestHFileCleaner#testHFileCleaning but was not successful. I looked at the test output: https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.94/566/testReport/org.apache.hadoop.hbase.master.cleaner/TestHFileCleaner/testHFileCleaning/ {code} 2012-11-02 01:29:55,534 INFO [pool-1-thread-1] hbase.ResourceChecker(145): before master.cleaner.TestHFileCleaner#testTTLCleaner: 43 threads, 127 file descriptors 0 connections, 2012-11-02 01:29:55,898 DEBUG [pool-1-thread-1] cleaner.TimeToLiveHFileCleaner(68): Life:108, ttl:100, current:1351819795897, from: 1351819795789 2012-11-02 01:29:55,900 INFO [pool-1-thread-1] hbase.ResourceChecker(145): after master.cleaner.TestHFileCleaner#testTTLCleaner: 44 threads (was 43), 127 file descriptors (was 127). 0 connections, -thread leak?- {code} We can see that one thread didn't exit. Still, it is not very obvious which thread hung :-( was (Author: yuzhih...@gmail.com): I tried to reproduce the test failure in TestHFileCleaner#testHFileCleaning but was not successful. I looked at the test output: https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.94/566/testReport/org.apache.hadoop.hbase.master.cleaner/TestHFileCleaner/testHFileCleaning/ {code} 2012-11-02 01:29:57,917 INFO [pool-1-thread-1] hbase.ResourceChecker(145): before master.cleaner.TestHFileCleaner#testRemovesEmptyDirectories: 45 threads, 127 file descriptors 0 connections, ... 2012-11-02 01:29:58,730 INFO [pool-1-thread-1] hbase.ResourceChecker(145): after master.cleaner.TestHFileCleaner#testRemovesEmptyDirectories: 44 threads (was 45), 127 file descriptors (was 127). 0 connections, {code} We can see that one thread didn't exit. I checked test output on MacBook. 
The thread count in both occasions was the same. Still, it is not very obvious which thread hung :-( > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch, hbase-6796-v2.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
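The ResourceChecker lines above report only thread counts, which is why it is hard to tell which thread was left behind. One possible way to narrow it down, sketched here under assumptions, is to snapshot thread names before and after a test and diff the two sets. The class and method names are hypothetical and this is not part of ResourceChecker; `Thread.getAllStackTraces()` is the standard JDK call used to enumerate live threads.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper for naming leaked threads; not part of HBase. */
class ThreadLeakSketch {

    /** Names of all live threads. Threads that share a name collapse into one entry. */
    static Set<String> liveThreadNames() {
        Set<String> names = new TreeSet<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return names;
    }

    /** Names present in the second snapshot but not in the first. */
    static Set<String> newThreads(Set<String> before, Set<String> after) {
        Set<String> diff = new TreeSet<>(after);
        diff.removeAll(before);
        return diff;
    }
}
```

Calling `liveThreadNames()` in a test's setup and teardown and logging `newThreads(before, after)` would print the names of any extra threads instead of just "44 threads (was 43)".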
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489221#comment-13489221 ] Ted Yu commented on HBASE-6796: --- I tried to reproduce the test failure in TestHFileCleaner#testHFileCleaning but was not successful. I looked at the test output: https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.94/566/testReport/org.apache.hadoop.hbase.master.cleaner/TestHFileCleaner/testHFileCleaning/ {code} 2012-11-02 01:29:57,917 INFO [pool-1-thread-1] hbase.ResourceChecker(145): before master.cleaner.TestHFileCleaner#testRemovesEmptyDirectories: 45 threads, 127 file descriptors 0 connections, ... 2012-11-02 01:29:58,730 INFO [pool-1-thread-1] hbase.ResourceChecker(145): after master.cleaner.TestHFileCleaner#testRemovesEmptyDirectories: 44 threads (was 45), 127 file descriptors (was 127). 0 connections, {code} We can see that one thread didn't exit. I checked test output on MacBook. The thread count in both occasions was the same. Still, it is not very obvious which thread hung :-( > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch, hbase-6796-v2.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2
[ https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489208#comment-13489208 ] Hadoop QA commented on HBASE-6410: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12551809/HBASE-6410-15.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 164 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 87 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 7 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestMetricsRegionServer Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3212//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3212//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3212//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3212//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3212//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3212//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3212//console This message is automatically generated. > Move RegionServer Metrics to metrics2 > - > > Key: HBASE-6410 > URL: https://issues.apache.org/jira/browse/HBASE-6410 > Project: HBase > Issue Type: Sub-task > Components: metrics >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Blocker > Attachments: HBASE-6410-13.patch, HBASE-6410-15.patch, > HBASE-6410-1.patch, HBASE-6410-2.patch, HBASE-6410-3.patch, > HBASE-6410-4.patch, HBASE-6410-5.patch, HBASE-6410-6.patch, HBASE-6410.patch > > > Move RegionServer Metrics to metrics2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer
[ https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489200#comment-13489200 ] Hudson commented on HBASE-6925: --- Integrated in HBase-0.94 #566 (See [https://builds.apache.org/job/HBase-0.94/566/]) HBASE-6925 Change socket write size from 8K to 64K for HBaseServer (Karthik) (Revision 1404776) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java > Change socket write size from 8K to 64K for HBaseServer > --- > > Key: HBASE-6925 > URL: https://issues.apache.org/jira/browse/HBASE-6925 > Project: HBase > Issue Type: Sub-task > Components: Performance >Reporter: Karthik Ranganathan >Assignee: Karthik Ranganathan >Priority: Critical > Fix For: 0.94.3, 0.96.0 > > Attachments: HBASE-6925.patch > > > Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT > from 8K to 64K in HBaseServer. This seems to increase scan throughput. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6410) Move RegionServer Metrics to metrics2
[ https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-6410: - Attachment: HBASE-6410-15.patch Added the comments that Stack asked for. Added more tests that hit the region server. > Move RegionServer Metrics to metrics2 > - > > Key: HBASE-6410 > URL: https://issues.apache.org/jira/browse/HBASE-6410 > Project: HBase > Issue Type: Sub-task > Components: metrics >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Blocker > Attachments: HBASE-6410-13.patch, HBASE-6410-15.patch, > HBASE-6410-1.patch, HBASE-6410-2.patch, HBASE-6410-3.patch, > HBASE-6410-4.patch, HBASE-6410-5.patch, HBASE-6410-6.patch, HBASE-6410.patch > > > Move RegionServer Metrics to metrics2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489176#comment-13489176 ]
Cheng Hao commented on HBASE-6852:
--
Thanks Lars and Ted. I will try to reproduce the failure locally first, and then check whether there is a logic bug in the schema metrics flushing.
> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
> ---
>
> Key: HBASE-6852
> URL: https://issues.apache.org/jira/browse/HBASE-6852
> Project: HBase
> Issue Type: Improvement
> Components: metrics
> Affects Versions: 0.94.0
> Reporter: Cheng Hao
> Assignee: Cheng Hao
> Priority: Minor
> Labels: performance
> Fix For: 0.94.4
>
> Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, onhitcache-trunk.patch
>
>
> SchemaMetrics.updateOnCacheHit costs too much while I am doing a full table scan.
> Here are the top 5 hotspots within the regionserver while full scanning a table:
> (Sorry for the poor formatting)
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 500
> samples  %        image name  symbol name
> ---
> 98447    13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
>   98447  100.000  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
> ---
> 45814    6.2510   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
>   45814  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
> ---
> 43523    5.9384   14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>   43523  100.000  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
> ---
> 42548    5.8054   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
>   42548  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
> ---
> 40572    5.5358   14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
>   40572  100.000  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]
[jira] [Updated] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk
[ https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HBASE-7055: Attachment: HBASE-6371-v4-refactor-only-squashed.patch > port HBASE-6371 tier-based compaction from 0.89-fb to trunk > --- > > Key: HBASE-7055 > URL: https://issues.apache.org/jira/browse/HBASE-7055 > Project: HBase > Issue Type: Task > Components: Compaction >Affects Versions: 0.96.0 >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 0.96.0 > > Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, > HBASE-6371-v3-refactor-only-squashed.patch, > HBASE-6371-v4-refactor-only-squashed.patch > > > There's divergence in the code :( > See HBASE-6371 for details. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6305) TestLocalHBaseCluster hangs with hadoop 2.0/0.23 builds.
[ https://issues.apache.org/jira/browse/HBASE-6305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489152#comment-13489152 ] Jimmy Xiang commented on HBASE-6305: How about where this conf is used? > TestLocalHBaseCluster hangs with hadoop 2.0/0.23 builds. > > > Key: HBASE-6305 > URL: https://issues.apache.org/jira/browse/HBASE-6305 > Project: HBase > Issue Type: Sub-task > Components: test >Affects Versions: 0.92.2, 0.94.1 >Reporter: Jonathan Hsieh >Assignee: Jonathan Hsieh > Fix For: 0.94.3 > > Attachments: hbase-6305-94.patch, HBASE-6305-94-v2.patch, > HBASE-6305-94-v2.patch, HBASE-6305-v1.patch > > > trunk: mvn clean test -Dhadoop.profile=2.0 -Dtest=TestLocalHBaseCluster > 0.94: mvn clean test -Dhadoop.profile=23 -Dtest=TestLocalHBaseCluster > {code} > testLocalHBaseCluster(org.apache.hadoop.hbase.TestLocalHBaseCluster) Time > elapsed: 0.022 sec <<< ERROR! > java.lang.RuntimeException: Master not initialized after 200 seconds > at > org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:208) > at > org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:424) > at > org.apache.hadoop.hbase.TestLocalHBaseCluster.testLocalHBaseCluster(TestLocalHBaseCluster.java:66) > ... > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer
[ https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489148#comment-13489148 ] Hudson commented on HBASE-6925: --- Integrated in HBase-TRUNK #3507 (See [https://builds.apache.org/job/HBase-TRUNK/3507/]) HBASE-6925 Change socket write size from 8K to 64K for HBaseServer (Karthik) (Revision 1404777) Result = SUCCESS larsh : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java > Change socket write size from 8K to 64K for HBaseServer > --- > > Key: HBASE-6925 > URL: https://issues.apache.org/jira/browse/HBASE-6925 > Project: HBase > Issue Type: Sub-task > Components: Performance >Reporter: Karthik Ranganathan >Assignee: Karthik Ranganathan >Priority: Critical > Fix For: 0.94.3, 0.96.0 > > Attachments: HBASE-6925.patch > > > Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT > from 8K to 64K in HBaseServer. This seems to increase scan throughput. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4583) Integrate RWCC with Append and Increment operations
[ https://issues.apache.org/jira/browse/HBASE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489136#comment-13489136 ] Jean-Daniel Cryans commented on HBASE-4583: --- bq. You guys used this more extensively at SU than anybody else, I think. Would you stick to your statement that the upsert complexity and breaking of the general HBase VERSIONS contract is warranted for the performance gain? We're kind of dependent on the current performance now; unless we start double counting or missing increments, the status quo I'd say is preferred. It's still a bit unclear to me what the implications of not having this jira are, and also what strict and non-strict mean. Let me phrase it like this then: I think everyone can agree that we don't want data loss (e.g. no missed increments), so taking a performance hit is likely necessary. If it's only a case where a client reading a counter would miss the most updated values, then that's something at least I can live with. > Integrate RWCC with Append and Increment operations > --- > > Key: HBASE-4583 > URL: https://issues.apache.org/jira/browse/HBASE-4583 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl > Fix For: 0.96.0 > > Attachments: 4583-trunk-less-radical.txt, > 4583-trunk-less-radical-v2.txt, 4583-trunk-less-radical-v3.txt, > 4583-trunk-less-radical-v4.txt, 4583-trunk-less-radical-v5.txt, > 4583-trunk-less-radical-v6.txt, 4583-trunk-radical.txt, > 4583-trunk-radical_v2.txt, 4583-trunk-v3.txt, 4583.txt, 4583-v2.txt, > 4583-v3.txt, 4583-v4.txt > > > Currently Increment and Append operations do not work with RWCC and hence a > client could see the results of multiple such operation mixed in the same > Get/Scan. > The semantics might be a bit more interesting here as upsert adds and removes > to and from the memstore.
[jira] [Commented] (HBASE-4583) Integrate RWCC with Append and Increment operations
[ https://issues.apache.org/jira/browse/HBASE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489119#comment-13489119 ] Lars Hofhansl commented on HBASE-4583: -- You guys used this more extensively at SU than anybody else, I think. Would you stick to your statement that the upsert complexity and breaking of the general HBase VERSIONS contract is warranted for the performance gain? It would probably be an even bigger problem if there's other load (that fills the memstore) that now unnecessarily needs to be flushed because of many increments. > Integrate RWCC with Append and Increment operations > --- > > Key: HBASE-4583 > URL: https://issues.apache.org/jira/browse/HBASE-4583 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl > Fix For: 0.96.0 > > Attachments: 4583-trunk-less-radical.txt, > 4583-trunk-less-radical-v2.txt, 4583-trunk-less-radical-v3.txt, > 4583-trunk-less-radical-v4.txt, 4583-trunk-less-radical-v5.txt, > 4583-trunk-less-radical-v6.txt, 4583-trunk-radical.txt, > 4583-trunk-radical_v2.txt, 4583-trunk-v3.txt, 4583.txt, 4583-v2.txt, > 4583-v3.txt, 4583-v4.txt > > > Currently Increment and Append operations do not work with RWCC and hence a > client could see the results of multiple such operation mixed in the same > Get/Scan. > The semantics might be a bit more interesting here as upsert adds and removes > to and from the memstore.
[jira] [Commented] (HBASE-4583) Integrate RWCC with Append and Increment operations
[ https://issues.apache.org/jira/browse/HBASE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489112#comment-13489112 ] Jean-Daniel Cryans commented on HBASE-4583: --- bq. Jean-Daniel Cryans I was wondering whether you could do your same performance test with the "radical" version of this patch (4583-trunk-v3.txt)? What I saw is it's 10% slower, but also that it was flushing tiny HFiles. > Integrate RWCC with Append and Increment operations > --- > > Key: HBASE-4583 > URL: https://issues.apache.org/jira/browse/HBASE-4583 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Lars Hofhansl > Fix For: 0.96.0 > > Attachments: 4583-trunk-less-radical.txt, > 4583-trunk-less-radical-v2.txt, 4583-trunk-less-radical-v3.txt, > 4583-trunk-less-radical-v4.txt, 4583-trunk-less-radical-v5.txt, > 4583-trunk-less-radical-v6.txt, 4583-trunk-radical.txt, > 4583-trunk-radical_v2.txt, 4583-trunk-v3.txt, 4583.txt, 4583-v2.txt, > 4583-v3.txt, 4583-v4.txt > > > Currently Increment and Append operations do not work with RWCC and hence a > client could see the results of multiple such operation mixed in the same > Get/Scan. > The semantics might be a bit more interesting here as upsert adds and removes > to and from the memstore. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6305) TestLocalHBaseCluster hangs with hadoop 2.0/0.23 builds.
[ https://issues.apache.org/jira/browse/HBASE-6305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489109#comment-13489109 ] Himanshu Vashishtha commented on HBASE-6305: [~lhofhansl] The reason I didn't touch the HBaseTestingUtility class is that the method is used by a number of other methods (36 references to getTestDataDir()). I think it is beyond the scope of this jira. Thoughts? > TestLocalHBaseCluster hangs with hadoop 2.0/0.23 builds. > > > Key: HBASE-6305 > URL: https://issues.apache.org/jira/browse/HBASE-6305 > Project: HBase > Issue Type: Sub-task > Components: test >Affects Versions: 0.92.2, 0.94.1 >Reporter: Jonathan Hsieh >Assignee: Jonathan Hsieh > Fix For: 0.94.3 > > Attachments: hbase-6305-94.patch, HBASE-6305-94-v2.patch, > HBASE-6305-94-v2.patch, HBASE-6305-v1.patch > > > trunk: mvn clean test -Dhadoop.profile=2.0 -Dtest=TestLocalHBaseCluster > 0.94: mvn clean test -Dhadoop.profile=23 -Dtest=TestLocalHBaseCluster > {code} > testLocalHBaseCluster(org.apache.hadoop.hbase.TestLocalHBaseCluster) Time > elapsed: 0.022 sec <<< ERROR! > java.lang.RuntimeException: Master not initialized after 200 seconds > at > org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:208) > at > org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:424) > at > org.apache.hadoop.hbase.TestLocalHBaseCluster.testLocalHBaseCluster(TestLocalHBaseCluster.java:66) > ... > {code}
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489105#comment-13489105 ] Lars Hofhansl commented on HBASE-6796: -- That test run didn't look so good. [~jesse_yates] Mind having a look? > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch, hbase-6796-v2.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489094#comment-13489094 ] Hudson commented on HBASE-6796: --- Integrated in HBase-0.94 #565 (See [https://builds.apache.org/job/HBase-0.94/565/]) HBASE-6796 Backport HBASE-5547, Don't delete HFiles in backup mode. (Jesse Yates) (Revision 1404762) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/BaseConfigurable.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/Chore.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/HConstants.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/backup * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/backup/HFileArchiver.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/LogCleaner.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/LogCleanerDelegate.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/TimeToLiveLogCleaner.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/BaseHFileCleanerDelegate.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/BaseLogCleanerDelegate.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/CleanerChore.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/FileCleanerDelegate.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileCleaner.java * 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/LogCleaner.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/TimeToLiveHFileCleaner.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/TimeToLiveLogCleaner.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/HFileArchiveUtil.java * /hbase/branches/0.94/src/main/resources/hbase-default.xml * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/backup * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/backup/TestHFileArchiving.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestLogsCleaner.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestCleanerChore.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestHFileCleaner.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestLogsCleaner.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/CheckedArchivingHFileCleaner.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/HFileArchiveTestingUtil.java * 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestHFileArchiveUtil.java > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch, hbase-6796-v2.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5547) Don't delete HFiles when in "backup mode"
[ https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489093#comment-13489093 ] Hudson commented on HBASE-5547: --- Integrated in HBase-0.94 #565 (See [https://builds.apache.org/job/HBase-0.94/565/]) HBASE-6796 Backport HBASE-5547, Don't delete HFiles in backup mode. (Jesse Yates) (Revision 1404762) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/BaseConfigurable.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/Chore.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/HConstants.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/backup * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/backup/HFileArchiver.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/LogCleaner.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/LogCleanerDelegate.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/TimeToLiveLogCleaner.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/BaseHFileCleanerDelegate.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/BaseLogCleanerDelegate.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/CleanerChore.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/FileCleanerDelegate.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileCleaner.java * 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/LogCleaner.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/TimeToLiveHFileCleaner.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/TimeToLiveLogCleaner.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/HFileArchiveUtil.java * /hbase/branches/0.94/src/main/resources/hbase-default.xml * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/backup * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/backup/TestHFileArchiving.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestLogsCleaner.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestCleanerChore.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestHFileCleaner.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestLogsCleaner.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/CheckedArchivingHFileCleaner.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/HFileArchiveTestingUtil.java * 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestHFileArchiveUtil.java > Don't delete HFiles when in "backup mode" > - > > Key: HBASE-5547 > URL: https://issues.apache.org/jira/browse/HBASE-5547 > Project: HBase > Issue Type: New Feature >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.96.0 > > Attachments: 5547.addendum-v3, 5547-addendum-v4.txt, 5547-v12.txt, > 5547-v16.txt, hbase-5447-v8.patch, hbase-5447-v8.patch, > hbase-5547-0.94-backport-v0.patch, hbase-5547-v9.patch, > java_HBASE-5547.addendum, java_HBASE-5547.addendum-v1, > java_HBASE-5547.addendum-v2, java_HBASE-5547_v13.patch, > java_HBASE-5547_v14.patch, java_HBASE-5547_v15.patch, > java_HBASE-5547_v4.patch, java_HBASE-5547_v5.patch, java_HBASE-5547_v6.patch, > jav
[jira] [Commented] (HBASE-7017) Backport "[replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file" to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489092#comment-13489092 ] Hudson commented on HBASE-7017: --- Integrated in HBase-0.94 #565 (See [https://builds.apache.org/job/HBase-0.94/565/]) HBASE-7017 Backport '[replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file' to 0.94 (Devaraj Das) (Revision 1404764) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java > Backport "[replication] The replication-executor should make sure the file > that it is replicating is closed before declaring success on that file" to > 0.94 > -- > > Key: HBASE-7017 > URL: https://issues.apache.org/jira/browse/HBASE-7017 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: Devaraj Das >Assignee: Devaraj Das > Fix For: 0.94.3 > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489080#comment-13489080 ]
Hudson commented on HBASE-6060:
---
Integrated in HBase-TRUNK #3506 (See [https://builds.apache.org/job/HBase-TRUNK/3506/])
HBASE-6060 Regions's in OPENING state from failed regionservers takes a long time to recover (Revision 1404759)
Result = FAILURE
jxiang :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionState.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
> Regions's in OPENING state from failed regionservers takes a long time to recover
> -
>
> Key: HBASE-6060
> URL: https://issues.apache.org/jira/browse/HBASE-6060
> Project: HBase
> Issue Type: Bug
> Components: master, regionserver
> Reporter: Enis Soztutar
> Assignee: Jimmy Xiang
> Fix For: 0.96.0
>
> Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, trunk-6060.patch, trunk-6060_v2.patch, trunk-6060_v3.3.patch
>
>
> We have seen a pattern in tests where regions are stuck in OPENING state for a very long time when the region server that is opening the region fails.
> My understanding of the process:
>
> - The master calls the rs to open the region. If the rs is offline, a new plan is generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in master memory, zk still shows OFFLINE). See HRegionServer.openRegion(), HMaster.assign()
> - The RegionServer starts opening a region and changes the state in the znode. But that znode is not ephemeral. (see ZkAssign)
> - The rs transitions the zk node from OFFLINE to OPENING. See OpenRegionHandler.process()
> - The rs then opens the region, and changes the znode from OPENING to OPENED
> - When the rs is killed between the OPENING and OPENED states, zk shows OPENING state, and the master just waits for the rs to change the region state, but since the rs is down, that won't happen.
> - There is an AssignmentManager.TimeoutMonitor, which guards against exactly this kind of condition. It periodically checks (every 10 sec by default) the regions in transition to see whether they have timed out (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 min, which explains what you and I are seeing.
> - ServerShutdownHandler in the Master does not reassign regions in OPENING state, although it handles other states.
>
> Lowering that threshold in the configuration is one option, but I still think we can do better.
> Will investigate more.
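The two recovery paths described in the issue above can be contrasted with a small model. This is only a sketch under stated assumptions: `RegionTransitionSketch`, `RegionInTransition`, `timedOut` and `openingOnDeadServer` are hypothetical names invented for illustration, not HBase classes, and the code does not claim to reflect what the committed patch does. It uses the 30 minute default timeout mentioned in the description.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of regions in transition; these are not HBase's real classes. */
final class RegionTransitionSketch {
    enum State { OFFLINE, PENDING_OPEN, OPENING, OPENED }

    /** One region in transition, as the master might track it (illustrative only). */
    static final class RegionInTransition {
        final String region;
        final String server;
        final State state;
        final long stampMillis; // time of the last state change

        RegionInTransition(String region, String server, State state, long stampMillis) {
            this.region = region;
            this.server = server;
            this.state = state;
            this.stampMillis = stampMillis;
        }
    }

    /** Default timeout quoted in the issue description: 30 minutes. */
    static final long TIMEOUT_MILLIS = 30L * 60 * 1000;

    /** Timeout-monitor style check: regions whose last state change is older than the timeout. */
    static List<String> timedOut(List<RegionInTransition> rits, long nowMillis, long timeoutMillis) {
        List<String> result = new ArrayList<>();
        for (RegionInTransition rit : rits) {
            if (nowMillis - rit.stampMillis >= timeoutMillis) {
                result.add(rit.region);
            }
        }
        return result;
    }

    /** Shutdown-handler style check: regions still OPENING on a server known to be dead. */
    static List<String> openingOnDeadServer(List<RegionInTransition> rits, String deadServer) {
        List<String> result = new ArrayList<>();
        for (RegionInTransition rit : rits) {
            if (rit.state == State.OPENING && rit.server.equals(deadServer)) {
                result.add(rit.region);
            }
        }
        return result;
    }
}
```

In this model, a region stuck in OPENING on a dead server is not picked up by `timedOut` until the full timeout has elapsed, while `openingOnDeadServer` finds it as soon as the server's death is known. That is the gap the description points at.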
[jira] [Commented] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter
[ https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489070#comment-13489070 ] Hadoop QA commented on HBASE-7083: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12551760/trunk-7083.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 85 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit org.apache.hadoop.hbase.master.TestSplitLogManager Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3211//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3211//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3211//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3211//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3211//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3211//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3211//console This message is automatically generated. > SSH#fixupDaughter should force re-assign missing daughter > - > > Key: HBASE-7083 > URL: https://issues.apache.org/jira/browse/HBASE-7083 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Minor > Fix For: 0.96.0 > > Attachments: trunk-7083.patch, trunk-7083.patch > > > In looking into flaky test > TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a > missing daughter is not assigned by SSH properly. It could be open on the > dead server. We need to force re-assign it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs
[ https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-3996: - Fix Version/s: (was: 0.94.3) 0.94.4 Oh well... Probably not getting to it. > Support multiple tables and scanners as input to the mapper in map/reduce jobs > -- > > Key: HBASE-3996 > URL: https://issues.apache.org/jira/browse/HBASE-3996 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Reporter: Eran Kutner >Assignee: Lars Hofhansl > Fix For: 0.96.0, 0.94.4 > > Attachments: 3996-v2.txt, 3996-v3.txt, 3996-v4.txt, 3996-v5.txt, > 3996-v6.txt, 3996-v7.txt, HBase-3996.patch > > > It seems that in many cases feeding data from multiple tables or multiple > scanners on a single table can save a lot of time when running map/reduce > jobs. > I propose a new MultiTableInputFormat class that would allow doing this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer
[ https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-6925. -- Resolution: Fixed Hadoop Flags: Reviewed Committed to 0.94 and 0.96. Thanks Karthik. > Change socket write size from 8K to 64K for HBaseServer > --- > > Key: HBASE-6925 > URL: https://issues.apache.org/jira/browse/HBASE-6925 > Project: HBase > Issue Type: Sub-task > Components: Performance >Reporter: Karthik Ranganathan >Assignee: Karthik Ranganathan >Priority: Critical > Fix For: 0.94.3, 0.96.0 > > Attachments: HBASE-6925.patch > > > Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT > from 8K to 64K in HBaseServer. This seems to increase scan throughput. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
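A minimal sketch of why a larger per-write limit can matter, assuming the server pushes large responses to the socket in slices capped by a constant such as NIO_BUFFER_LIMIT. This is not the HBaseServer code; the class name `ChunkedWriteSketch` and the method `writeInChunks` are hypothetical, and the in-memory channel stands in for a real socket. It only counts how many write calls a 256K response needs at each limit.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

// Hypothetical illustration only; names are not taken from HBase.
final class ChunkedWriteSketch {

    /**
     * Writes buf to ch in slices of at most chunkLimit bytes and returns the
     * number of write calls issued. Assumes a blocking channel; a non-blocking
     * channel that accepts zero bytes would need selector handling instead.
     */
    static int writeInChunks(WritableByteChannel ch, ByteBuffer buf, int chunkLimit)
            throws IOException {
        final int originalLimit = buf.limit();
        int calls = 0;
        while (buf.hasRemaining()) {
            int chunk = Math.min(buf.remaining(), chunkLimit);
            // Temporarily shrink the limit so a single write sees at most one chunk.
            buf.limit(buf.position() + chunk);
            try {
                ch.write(buf);
            } finally {
                buf.limit(originalLimit);
            }
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) throws IOException {
        int responseSize = 256 * 1024;
        for (int limit : new int[] {8 * 1024, 64 * 1024}) {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            int calls = writeInChunks(
                Channels.newChannel(sink), ByteBuffer.allocate(responseSize), limit);
            System.out.println("limit=" + limit + " bytes, write calls=" + calls);
        }
    }
}
```

With these assumptions the 8K limit needs 32 write calls for the 256K buffer and the 64K limit needs 4. Whether that translates into higher scan throughput depends on the client and the network path, as the comments on this issue discuss.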
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489051#comment-13489051 ] Jesse Yates commented on HBASE-6796: Thanks Lars! > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch, hbase-6796-v2.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer
[ https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489042#comment-13489042 ] Karthik Ranganathan commented on HBASE-6925: Go for the commit Lars! > Change socket write size from 8K to 64K for HBaseServer > --- > > Key: HBASE-6925 > URL: https://issues.apache.org/jira/browse/HBASE-6925 > Project: HBase > Issue Type: Sub-task > Components: Performance >Reporter: Karthik Ranganathan >Assignee: Karthik Ranganathan >Priority: Critical > Fix For: 0.94.3, 0.96.0 > > Attachments: HBASE-6925.patch > > > Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT > from 8K to 64K in HBaseServer. This seems to increase scan throughput. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-2645) HLog writer can do 1-2 sync operations after lease has been recovered for split process.
[ https://issues.apache.org/jira/browse/HBASE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489041#comment-13489041 ] Ted Yu commented on HBASE-2645: --- Here is information about my environment: {code} java version "1.6.0_35" Java(TM) SE Runtime Environment (build 1.6.0_35-b10-428-11M3811) Java HotSpot(TM) 64-Bit Server VM (build 20.10-b01-428, mixed mode) Darwin T32 11.4.2 Darwin Kernel Version 11.4.2: Thu Aug 23 16:25:48 PDT 2012; root:xnu-1699.32.7~1/RELEASE_X86_64 x86_64 {code} Initially I wasn't sure whether the test failure only happened with JDK 1.6 Later I saw the test failure on Jenkins where JDK 1.7 was used. > HLog writer can do 1-2 sync operations after lease has been recovered for > split process. > > > Key: HBASE-2645 > URL: https://issues.apache.org/jira/browse/HBASE-2645 > Project: HBase > Issue Type: Bug > Components: Filters >Affects Versions: 0.90.4 >Reporter: Cosmin Lehene >Assignee: Todd Lipcon >Priority: Blocker > Fix For: 0.96.0 > > Attachments: 2645.txt, 2645v2.txt, 2645v3.txt, > org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit-output.txt > > > TestHLogSplit.testLogCannotBeWrittenOnceParsed is failing. > This test starts a thread that writes one edit to the log, syncs and counts. > During this, a HLog.splitLog operation is started. splitLog recovers the log > lease before reading the log, so that the original regionserver could not > wake up and write after the split process started. > The test compares the number of edits reported by the split process and by > the writer thread. Writer thread (called zombie in the test) should report <= > than the splitLog (sync() might raise after the last edit gets written and > the edit won't get counted by zombie thread). However it appears that the > zombie counts 1-2 more edits. So it looks like it can sync without a lease. > This might be a hdfs-0.20 related issue. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489033#comment-13489033 ] Hudson commented on HBASE-6852: --- Integrated in HBase-0.94 #564 (See [https://builds.apache.org/job/HBase-0.94/564/]) HBASE-6852, REVERT again, due to unexplained test failures that only occur on the jenkins machines (Revision 1404691) Result = FAILURE larsh : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java > SchemaMetrics.updateOnCacheHit costs too much while full scanning a table > with all of its fields > > > Key: HBASE-6852 > URL: https://issues.apache.org/jira/browse/HBASE-6852 > Project: HBase > Issue Type: Improvement > Components: metrics >Affects Versions: 0.94.0 >Reporter: Cheng Hao >Assignee: Cheng Hao >Priority: Minor > Labels: performance > Fix For: 0.94.4 > > Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, > onhitcache-trunk.patch > > > The SchemaMetrics.updateOnCacheHit costs too much while I am doing the full > table scanning. 
> Here are the top 5 hotspots within the regionserver while full scanning a table:
> (Sorry for the poor formatting)
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 500
> samples  %        image name  symbol name
> ---
> 98447    13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
> 98447    100.000  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
> ---
> 45814    6.2510   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
> 45814    100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
> ---
> 43523    5.9384   14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
> 43523    100.000  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
> ---
> 42548    5.8054   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
> 42548    100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
> ---
> 40572    5.5358   14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
> 40572    100.000  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]
-- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-2645) HLog writer can do 1-2 sync operations after lease has been recovered for split process.
[ https://issues.apache.org/jira/browse/HBASE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489030#comment-13489030 ] stack commented on HBASE-2645: -- [~ted_yu], your repeat of info in here, info that is public and available to all is of no help; its just spam in our mailboxes. Neither is your cut-off paste of a failure on your local machine w/ no context such as what your local machine is, if it fails always, etc. I committed the patch because it passed locally on repeated runs and passed hadoopqa. > HLog writer can do 1-2 sync operations after lease has been recovered for > split process. > > > Key: HBASE-2645 > URL: https://issues.apache.org/jira/browse/HBASE-2645 > Project: HBase > Issue Type: Bug > Components: Filters >Affects Versions: 0.90.4 >Reporter: Cosmin Lehene >Assignee: Todd Lipcon >Priority: Blocker > Fix For: 0.96.0 > > Attachments: 2645.txt, 2645v2.txt, 2645v3.txt, > org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit-output.txt > > > TestHLogSplit.testLogCannotBeWrittenOnceParsed is failing. > This test starts a thread that writes one edit to the log, syncs and counts. > During this, a HLog.splitLog operation is started. splitLog recovers the log > lease before reading the log, so that the original regionserver could not > wake up and write after the split process started. > The test compares the number of edits reported by the split process and by > the writer thread. Writer thread (called zombie in the test) should report <= > than the splitLog (sync() might raise after the last edit gets written and > the edit won't get counted by zombie thread). However it appears that the > zombie counts 1-2 more edits. So it looks like it can sync without a lease. > This might be a hdfs-0.20 related issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-2611) Handle RS that fails while processing the failure of another one
[ https://issues.apache.org/jira/browse/HBASE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-2611: - Fix Version/s: (was: 0.94.3) 0.94.4 Alas, looks like we won't get to this... again. > Handle RS that fails while processing the failure of another one > > > Key: HBASE-2611 > URL: https://issues.apache.org/jira/browse/HBASE-2611 > Project: HBase > Issue Type: Sub-task > Components: Replication >Reporter: Jean-Daniel Cryans >Assignee: Chris Trezzo > Fix For: 0.94.4 > > Attachments: HBase-2611-upstream-v1.patch > > > HBASE-2223 doesn't manage region servers that fail while doing the transfer > of HLogs queues from other region servers that failed. Devise a reliable way > to do it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-7017) Backport "[replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file" to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-7017. -- Resolution: Fixed Hadoop Flags: Reviewed Committed to 0.94. Thanks for patch (and backport) > Backport "[replication] The replication-executor should make sure the file > that it is replicating is closed before declaring success on that file" to > 0.94 > -- > > Key: HBASE-7017 > URL: https://issues.apache.org/jira/browse/HBASE-7017 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: Devaraj Das >Assignee: Devaraj Das > Fix For: 0.94.3 > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter
[ https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-7083: --- Status: Patch Available (was: Open) Try hadoop-qa again. > SSH#fixupDaughter should force re-assign missing daughter > - > > Key: HBASE-7083 > URL: https://issues.apache.org/jira/browse/HBASE-7083 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Minor > Fix For: 0.96.0 > > Attachments: trunk-7083.patch, trunk-7083.patch > > > In looking into flaky test > TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a > missing daughter is not assigned by SSH properly. It could be open on the > dead server. We need to force re-assign it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter
[ https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-7083: --- Attachment: trunk-7083.patch > SSH#fixupDaughter should force re-assign missing daughter > - > > Key: HBASE-7083 > URL: https://issues.apache.org/jira/browse/HBASE-7083 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Minor > Fix For: 0.96.0 > > Attachments: trunk-7083.patch, trunk-7083.patch > > > In looking into flaky test > TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a > missing daughter is not assigned by SSH properly. It could be open on the > dead server. We need to force re-assign it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-6796. -- Resolution: Fixed Hadoop Flags: Reviewed Committed to 0.94... Pfeeww. > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch, hbase-6796-v2.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter
[ https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-7083: --- Status: Open (was: Patch Available) > SSH#fixupDaughter should force re-assign missing daughter > - > > Key: HBASE-7083 > URL: https://issues.apache.org/jira/browse/HBASE-7083 > Project: HBase > Issue Type: Bug > Components: Region Assignment >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Minor > Fix For: 0.96.0 > > Attachments: trunk-7083.patch, trunk-7083.patch > > > In looking into flaky test > TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a > missing daughter is not assigned by SSH properly. It could be open on the > dead server. We need to force re-assign it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6060: --- Resolution: Fixed Status: Resolved (was: Patch Available) Integrated into trunk. Thanks all. > Regions's in OPENING state from failed regionservers takes a long time to > recover > - > > Key: HBASE-6060 > URL: https://issues.apache.org/jira/browse/HBASE-6060 > Project: HBase > Issue Type: Bug > Components: master, regionserver >Reporter: Enis Soztutar >Assignee: Jimmy Xiang > Fix For: 0.96.0 > > Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, > 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, > 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, > 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, > 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, > HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, > HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, > trunk-6060.patch, trunk-6060_v2.patch, trunk-6060_v3.3.patch > > > we have seen a pattern in tests, that the regions are stuck in OPENING state > for a very long time when the region server who is opening the region fails. > My understanding of the process: > > - master calls rs to open the region. If rs is offline, a new plan is > generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in > master memory, zk still shows OFFLINE). See HRegionServer.openRegion(), > HMaster.assign() > - RegionServer, starts opening a region, changes the state in znode. But > that znode is not ephemeral. (see ZkAssign) > - Rs transitions zk node from OFFLINE to OPENING. 
See > OpenRegionHandler.process() > - rs then opens the region, and changes znode from OPENING to OPENED > - when rs is killed between OPENING and OPENED states, then zk shows OPENING > state, and the master just waits for rs to change the region state, but since > rs is down, that won't happen. > - There is an AssignmentManager.TimeoutMonitor, which guards against exactly > these kinds of conditions. It periodically checks (every 10 sec by > default) the regions in transition to see whether they have timed out > (hbase.master.assignment.timeoutmonitor.timeout). Default timeout is 30 min, > which explains what you and I are seeing. > - ServerShutdownHandler in Master does not reassign regions in OPENING > state, although it handles other states. > Lowering that threshold from the configuration is one option, but still I > think we can do better. > Will investigate more. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
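The stuck-in-OPENING scenario described above can be sketched as follows. This is a simplified model, not AssignmentManager or ServerShutdownHandler code: the class `RegionTransitionTimeoutSketch`, its `Transition` record and the method `regionsToReassign` are hypothetical names. The timeout branch mirrors the periodic check the description attributes to the TimeoutMonitor; the dead-server branch is only one possible way to "do better" and is an assumption, not necessarily what the committed patch does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of regions in transition; not HBase code.
final class RegionTransitionTimeoutSketch {

    enum State { OFFLINE, PENDING_OPEN, OPENING, OPENED }

    static final class Transition {
        final State state;
        final String server;
        final long startedAtMillis;

        Transition(State state, String server, long startedAtMillis) {
            this.state = state;
            this.server = server;
            this.startedAtMillis = startedAtMillis;
        }
    }

    /**
     * Returns the regions that should be reassigned: those whose transition has
     * exceeded timeoutMillis, plus (assumed shortcut) those in OPENING on a
     * server already known to be dead.
     */
    static List<String> regionsToReassign(Map<String, Transition> inTransition,
                                          Set<String> deadServers,
                                          long nowMillis,
                                          long timeoutMillis) {
        List<String> result = new ArrayList<String>();
        for (Map.Entry<String, Transition> e : inTransition.entrySet()) {
            Transition t = e.getValue();
            boolean timedOut = nowMillis - t.startedAtMillis >= timeoutMillis;
            boolean openingOnDeadServer =
                t.state == State.OPENING && deadServers.contains(t.server);
            if (timedOut || openingOnDeadServer) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Transition> rit = new LinkedHashMap<String, Transition>();
        rit.put("region-a", new Transition(State.OPENING, "rs1", 0L));
        rit.put("region-b", new Transition(State.OPENING, "rs2", 0L));
        long oneMinute = 60000L;
        long thirtyMinutes = 30 * oneMinute;

        // Timeout only: one minute in, nothing is flagged; the master would wait.
        System.out.println(regionsToReassign(
            rit, Collections.<String>emptySet(), oneMinute, thirtyMinutes));
        // With the dead-server check, region-a is flagged right away.
        System.out.println(regionsToReassign(
            rit, Collections.singleton("rs1"), oneMinute, thirtyMinutes));
    }
}
```

In this model, relying on the timeout alone leaves a region on a dead server untouched until the full 30 minutes have passed, which matches the delay reported in the description.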
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489019#comment-13489019 ] Lars Hofhansl commented on HBASE-6796: -- According to HBASE-4737 Medium should be < 45s. So I'll make it a LargeTest and then commit... Finally. > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch, hbase-6796-v2.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489017#comment-13489017 ] Lars Hofhansl commented on HBASE-6796: -- One last question: Should TestHFileArchiving be a LargeTest? It runs for 122s on my machine. > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch, hbase-6796-v2.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489012#comment-13489012 ] Lars Hofhansl commented on HBASE-6796: -- Or better even, use a better testdir for that test. > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch, hbase-6796-v2.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489011#comment-13489011 ] Lars Hofhansl commented on HBASE-6796: -- Test works if the datadir is cleaned first (either in an @After method or inline in this one test.) I'll just paste that fix on this patch and commit. > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch, hbase-6796-v2.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
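A minimal sketch of the "clean the datadir first" idea, assuming the test leaves files behind in a local directory. It uses java.nio.file (Java 7+) rather than the Hadoop FileSystem API that the HBase tests themselves use, and the class `TestDirCleanupSketch` and method `deleteRecursively` are hypothetical names. In a JUnit test, a method annotated with @After could call the helper with the test's data directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; not part of the HBase test utilities.
final class TestDirCleanupSketch {

    /** Deletes root and everything below it; does nothing if root is absent. */
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate leftovers from an earlier test method.
        Path dataDir = Files.createTempDirectory("testdata");
        Files.createDirectories(dataDir.resolve("table/region"));
        Files.createFile(dataDir.resolve("table/region/leftover"));

        deleteRecursively(dataDir);
        System.out.println("data dir exists after cleanup: " + Files.exists(dataDir));
    }
}
```

Cleaning in a teardown method keeps every test in the class isolated from the others. Cleaning inline fixes only the one test that was observed to fail.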
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489010#comment-13489010 ] Jesse Yates commented on HBASE-6796: Hmm, I added "testReadingArchiveDirectoryFromFS()". And that method literally has nothing to do with the added code. Looks like a flapper already in the codebase. On a related note, it looks like Jenkins isn't running all the tests for 0.94. See https://builds.apache.org/job/HBase-0.94/563/testReport/org.apache.hadoop.hbase.util/ which should have run TestFSTableDescriptors...but shows neither hide nor hair of it (or many other expected tests). > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch, hbase-6796-v2.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7082) TestHFileCleaner#testHFileCleaning fails due to cleaner is reset
[ https://issues.apache.org/jira/browse/HBASE-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489005#comment-13489005 ] Hudson commented on HBASE-7082: --- Integrated in HBase-TRUNK #3505 (See [https://builds.apache.org/job/HBase-TRUNK/3505/]) HBASE-7082 TestHFileCleaner#testHFileCleaning fails due to cleaner is reset (Revision 1404722) Result = SUCCESS jxiang : Files : * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestHFileCleaner.java > TestHFileCleaner#testHFileCleaning fails due to cleaner is reset > > > Key: HBASE-7082 > URL: https://issues.apache.org/jira/browse/HBASE-7082 > Project: HBase > Issue Type: Test >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Trivial > Fix For: 0.96.0 > > Attachments: trunk-7082.patch > > > TestHFileCleaner#testHFileCleaning fails if it runs after > testRemovesEmptyDirectories which resets the cleaner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489003#comment-13489003 ] Lars Hofhansl commented on HBASE-6796: -- And of course it only happens sometimes and never in the debugger. Sigh. > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch, hbase-6796-v2.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7017) Backport "[replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file" to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489000#comment-13489000 ] Devaraj Das commented on HBASE-7017: [~lhofhansl] - Yes.. > Backport "[replication] The replication-executor should make sure the file > that it is replicating is closed before declaring success on that file" to > 0.94 > -- > > Key: HBASE-7017 > URL: https://issues.apache.org/jira/browse/HBASE-7017 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: Devaraj Das >Assignee: Devaraj Das > Fix For: 0.94.3 > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer
[ https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488997#comment-13488997 ] stack commented on HBASE-6925: -- Looks like HBaseClient in 89fb and in trunk or 0.94 are very different. That might explain it Lars (Looking at the 89fb client, I can see how brother Karthik's change could make such a big difference). > Change socket write size from 8K to 64K for HBaseServer > --- > > Key: HBASE-6925 > URL: https://issues.apache.org/jira/browse/HBASE-6925 > Project: HBase > Issue Type: Sub-task > Components: Performance >Reporter: Karthik Ranganathan >Assignee: Karthik Ranganathan >Priority: Critical > Fix For: 0.94.3, 0.96.0 > > Attachments: HBASE-6925.patch > > > Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT > from 8K to 64K in HBaseServer. This seems to increase scan throughput. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer
[ https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488995#comment-13488995 ] Lars Hofhansl commented on HBASE-6925: -- Thanks Karthik, I'm going to commit, unless you want to :) > Change socket write size from 8K to 64K for HBaseServer > --- > > Key: HBASE-6925 > URL: https://issues.apache.org/jira/browse/HBASE-6925 > Project: HBase > Issue Type: Sub-task > Components: Performance >Reporter: Karthik Ranganathan >Assignee: Karthik Ranganathan >Priority: Critical > Fix For: 0.94.3, 0.96.0 > > Attachments: HBASE-6925.patch > > > Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT > from 8K to 64K in HBaseServer. This seems to increase scan throughput. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488993#comment-13488993 ] Lars Hofhansl commented on HBASE-6796: -- I'm actually getting a failure with TestFSTableDescriptors :( {code} testCreateAndUpdate(org.apache.hadoop.hbase.util.TestFSTableDescriptors) Time elapsed: 0.025 sec <<< FAILURE! java.lang.AssertionError: statuses.length=5 at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.hbase.util.TestFSTableDescriptors.testCreateAndUpdate(TestFSTableDescriptors.java:63) {code} > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch, hbase-6796-v2.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer
[ https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488991#comment-13488991 ] Karthik Ranganathan commented on HBASE-6925: No, I don't think that would matter; this is more about the socket transfer size into an underlying buffer. > Change socket write size from 8K to 64K for HBaseServer > --- > > Key: HBASE-6925 > URL: https://issues.apache.org/jira/browse/HBASE-6925 > Project: HBase > Issue Type: Sub-task > Components: Performance >Reporter: Karthik Ranganathan >Assignee: Karthik Ranganathan >Priority: Critical > Fix For: 0.94.3, 0.96.0 > > Attachments: HBASE-6925.patch > > > Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT > from 8K to 64K in HBaseServer. This seems to increase scan throughput. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
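To make the "socket transfer size" point in the comment above concrete, here is a minimal Java sketch of a chunked channel write. It is not the HBaseServer code: the class ChunkedChannelWrite and its channelWrite helper are hypothetical, and only the name NIO_BUFFER_LIMIT and the 8K/64K values come from the issue text. The sketch assumes the limit caps how many bytes are offered to the channel in a single write call.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

public class ChunkedChannelWrite {
    // Name taken from the issue; the value is the proposed 64K (previously 8K).
    static final int NIO_BUFFER_LIMIT = 64 * 1024;

    /**
     * Hypothetical helper: writes the buffer to the channel, offering at most
     * `limit` bytes per write call. Returns the number of bytes written.
     */
    static int channelWrite(WritableByteChannel channel, ByteBuffer buffer, int limit)
            throws IOException {
        int originalLimit = buffer.limit();
        int written = 0;
        try {
            int remaining = originalLimit - buffer.position();
            while (remaining > 0) {
                int chunk = Math.min(remaining, limit);
                // Temporarily narrow the buffer so only one chunk is visible.
                buffer.limit(buffer.position() + chunk);
                int n = channel.write(buffer);
                written += n;
                if (n < chunk) {
                    break; // the channel accepted less than offered; stop for now
                }
                remaining = originalLimit - buffer.position();
            }
        } finally {
            // Always restore the caller's view of the buffer.
            buffer.limit(originalLimit);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        WritableByteChannel channel = Channels.newChannel(sink);
        ByteBuffer payload = ByteBuffer.allocate(200 * 1024);
        int written = channelWrite(channel, payload, NIO_BUFFER_LIMIT);
        System.out.println(written + " bytes written, " + sink.size() + " bytes received");
    }
}
```

With a larger limit the same payload is handed over in fewer write calls, which is one plausible reason a scan-heavy workload could see higher throughput. The in-memory sink used here says nothing about real socket behaviour.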
[jira] [Commented] (HBASE-7017) Backport "[replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file" to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488982#comment-13488982 ] Lars Hofhansl commented on HBASE-7017: -- I assume this was with 6758-0.94.txt from HBASE-6758? > Backport "[replication] The replication-executor should make sure the file > that it is replicating is closed before declaring success on that file" to > 0.94 > -- > > Key: HBASE-7017 > URL: https://issues.apache.org/jira/browse/HBASE-7017 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: Devaraj Das >Assignee: Devaraj Das > Fix For: 0.94.3 > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer
[ https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488973#comment-13488973 ] Lars Hofhansl commented on HBASE-6925: -- I tried a local test (distributed HBase, but with only a single RegionServer and DataNode, both on the same machine). I could not discern any change in performance (as might be expected from a local-only test). I also changed the buffer size in Result.java, but again saw no difference. > Change socket write size from 8K to 64K for HBaseServer > --- > > Key: HBASE-6925 > URL: https://issues.apache.org/jira/browse/HBASE-6925 > Project: HBase > Issue Type: Sub-task > Components: Performance >Reporter: Karthik Ranganathan >Assignee: Karthik Ranganathan >Priority: Critical > Fix For: 0.94.3, 0.96.0 > > Attachments: HBASE-6925.patch > > > Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT > from 8K to 64K in HBaseServer. This seems to increase scan throughput. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-6796: --- Attachment: hbase-6796-v2.patch Attaching patch on top of current 0.94 branch. Ran "mvn clean test -Dtest=TestLogsCleaner,TestHFileArchiving,TesstCatalogJanitor,TestCleanerChore,TestHFileCleaner,TestFSTableDescriptors,TestHFileArchiveUtil -P localTests" with no test failures. > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch, hbase-6796-v2.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488939#comment-13488939 ] Jesse Yates commented on HBASE-6796: hmmm, I'll try a rebase and see. > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488936#comment-13488936 ] Lars Hofhansl commented on HBASE-6796: -- Hmm... Patch does not apply cleanly to 0.94 > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488933#comment-13488933 ] Jesse Yates commented on HBASE-6796: [~lhofhansl] seemed like it would be less work for you this way. > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488930#comment-13488930 ] Hadoop QA commented on HBASE-6060: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12551735/trunk-6060_v3.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 85 warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/3210//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3210//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3210//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3210//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3210//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/3210//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/3210//console This message is automatically generated. 
> Regions's in OPENING state from failed regionservers takes a long time to > recover > - > > Key: HBASE-6060 > URL: https://issues.apache.org/jira/browse/HBASE-6060 > Project: HBase > Issue Type: Bug > Components: master, regionserver >Reporter: Enis Soztutar >Assignee: Jimmy Xiang > Fix For: 0.96.0 > > Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, > 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, > 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, > 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, > 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, > HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, > HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, > trunk-6060.patch, trunk-6060_v2.patch, trunk-6060_v3.3.patch > > > we have seen a pattern in tests, that the regions are stuck in OPENING state > for a very long time when the region server who is opening the region fails. > My understanding of the process: > > - master calls rs to open the region. If rs is offline, a new plan is > generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in > master memory, zk still shows OFFLINE). See HRegionServer.openRegion(), > HMaster.assign() > - RegionServer, starts opening a region, changes the state in znode. But > that znode is not ephemeral. (see ZkAssign) > - Rs transitions zk node from OFFLINE to OPENING. See > OpenRegionHandler.process() > - rs then opens the region, and changes znode from OPENING to OPENED > - when rs is killed between OPENING and OPENED states, then zk shows OPENING > state, and the master just waits for rs to change the region state, but since > rs is down, that wont happen. > - There is a AssignmentManager.TimeoutMonitor, which does exactly guard > against these kind of conditions. 
It periodically checks (every 10 sec by > default) the regions in transition to see whether they timedout > (hbase.master.assignment.timeoutmonitor.timeout). Default timeout is 30 min, > which explains what you and I are seeing. > - ServerShutdownHandler in Master does not reassign regions in OPENING > state, although it handles other states. > Lowering that threshold from the configuration is one option, but still I > think we can do better. > Will investigate more. -- This message is automatically generated by JIRA. If you think it was sent
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488931#comment-13488931 ] Lars Hofhansl commented on HBASE-6796: -- Oh I see, you did that with all the related changes. Fine then. > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer
[ https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488928#comment-13488928 ] Lars Hofhansl commented on HBASE-6925: -- I'd definitely be interested in the various changes that went into this. HBASE-4956 just fixes Result.readFields to read the result buffer in 8k chunks? Would that defeat the purpose of this change? I can run a quick test on a single host, but that won't be very representative. > Change socket write size from 8K to 64K for HBaseServer > --- > > Key: HBASE-6925 > URL: https://issues.apache.org/jira/browse/HBASE-6925 > Project: HBase > Issue Type: Sub-task > Components: Performance >Reporter: Karthik Ranganathan >Assignee: Karthik Ranganathan >Priority: Critical > Fix For: 0.94.3, 0.96.0 > > Attachments: HBASE-6925.patch > > > Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT > from 8K to 64K in HBaseServer. This seems to increase scan throughput. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488929#comment-13488929 ] Lars Hofhansl commented on HBASE-6796: -- Wanna do it this way, or should I rather commit HBASE-7082 to 0.94? I'm fine either way. > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-7082) TestHFileCleaner#testHFileCleaning fails due to cleaner is reset
[ https://issues.apache.org/jira/browse/HBASE-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-7082: --- Resolution: Fixed Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Integrated into trunk. Thanks Himanshu and Jesse for reviewing it. > TestHFileCleaner#testHFileCleaning fails due to cleaner is reset > > > Key: HBASE-7082 > URL: https://issues.apache.org/jira/browse/HBASE-7082 > Project: HBase > Issue Type: Test >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Trivial > Fix For: 0.96.0 > > Attachments: trunk-7082.patch > > > TestHFileCleaner#testHFileCleaning fails if it runs after > testRemovesEmptyDirectories which resets the cleaner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.
[ https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesse Yates updated HBASE-6796: --- Attachment: hbase-6796-v1.patch Updating patch to include fix from HBASE-7082. > Backport HBASE-5547, Don't delete HFiles in backup mode. > > > Key: HBASE-6796 > URL: https://issues.apache.org/jira/browse/HBASE-6796 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: Jesse Yates > Fix For: 0.94.3 > > Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, > hbase-6796-v1.patch > > > See HBASE-5547 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-7082) TestHFileCleaner#testHFileCleaning fails due to cleaner is reset
[ https://issues.apache.org/jira/browse/HBASE-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488912#comment-13488912 ] Jesse Yates commented on HBASE-7082: +1 yup, that looks good. > TestHFileCleaner#testHFileCleaning fails due to cleaner is reset > > > Key: HBASE-7082 > URL: https://issues.apache.org/jira/browse/HBASE-7082 > Project: HBase > Issue Type: Test >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Trivial > Attachments: trunk-7082.patch > > > TestHFileCleaner#testHFileCleaning fails if it runs after > testRemovesEmptyDirectories which resets the cleaner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6060: --- Status: Patch Available (was: Open) > Regions's in OPENING state from failed regionservers takes a long time to > recover > - > > Key: HBASE-6060 > URL: https://issues.apache.org/jira/browse/HBASE-6060 > Project: HBase > Issue Type: Bug > Components: master, regionserver >Reporter: Enis Soztutar >Assignee: Jimmy Xiang > Fix For: 0.96.0 > > Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, > 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, > 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, > 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, > 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, > HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, > HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, > trunk-6060.patch, trunk-6060_v2.patch, trunk-6060_v3.3.patch > > > we have seen a pattern in tests, that the regions are stuck in OPENING state > for a very long time when the region server who is opening the region fails. > My understanding of the process: > > - master calls rs to open the region. If rs is offline, a new plan is > generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in > master memory, zk still shows OFFLINE). See HRegionServer.openRegion(), > HMaster.assign() > - RegionServer, starts opening a region, changes the state in znode. But > that znode is not ephemeral. (see ZkAssign) > - Rs transitions zk node from OFFLINE to OPENING. See > OpenRegionHandler.process() > - rs then opens the region, and changes znode from OPENING to OPENED > - when rs is killed between OPENING and OPENED states, then zk shows OPENING > state, and the master just waits for rs to change the region state, but since > rs is down, that wont happen. 
> - There is a AssignmentManager.TimeoutMonitor, which does exactly guard > against these kind of conditions. It periodically checks (every 10 sec by > default) the regions in transition to see whether they timedout > (hbase.master.assignment.timeoutmonitor.timeout). Default timeout is 30 min, > which explains what you and I are seeing. > - ServerShutdownHandler in Master does not reassign regions in OPENING > state, although it handles other states. > Lowering that threshold from the configuration is one option, but still I > think we can do better. > Will investigate more. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
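The recovery delay described in the issue above can be modelled in a few lines of Java. This is an illustrative sketch only, not code from AssignmentManager: the class OpeningTimeoutSketch, the RegionInTransition holder and the findTimedOut helper are hypothetical names. The only values taken from the description are the 10-second check period and the 30-minute default timeout. It also assumes a region counts as timed out once the time since its last state update exceeds the timeout.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OpeningTimeoutSketch {
    // States named in the issue description.
    enum State { OFFLINE, PENDING_OPEN, OPENING, OPENED }

    // Defaults quoted in the description: check every 10 s, time out after 30 min.
    static final long CHECK_PERIOD_MS = 10_000L;
    static final long TIMEOUT_MS = 30L * 60L * 1000L;

    /** Hypothetical record of a region in transition and when its state last changed. */
    static final class RegionInTransition {
        final State state;
        final long lastUpdateMs;

        RegionInTransition(State state, long lastUpdateMs) {
            this.state = state;
            this.lastUpdateMs = lastUpdateMs;
        }
    }

    /** Returns the names of regions whose last state change is older than timeoutMs. */
    static List<String> findTimedOut(Map<String, RegionInTransition> regions,
                                     long nowMs, long timeoutMs) {
        List<String> timedOut = new ArrayList<>();
        for (Map.Entry<String, RegionInTransition> e : regions.entrySet()) {
            if (nowMs - e.getValue().lastUpdateMs > timeoutMs) {
                timedOut.add(e.getKey());
            }
        }
        return timedOut;
    }

    public static void main(String[] args) {
        // A region whose server died right after moving the znode to OPENING at t=0.
        Map<String, RegionInTransition> regions = new LinkedHashMap<>();
        regions.put("region-a", new RegionInTransition(State.OPENING, 0L));

        // Simulate the periodic check until the region is first reported.
        long firstReportedMs = -1L;
        for (long now = 0L; now <= 2 * TIMEOUT_MS; now += CHECK_PERIOD_MS) {
            if (!findTimedOut(regions, now, TIMEOUT_MS).isEmpty()) {
                firstReportedMs = now;
                break;
            }
        }
        System.out.println("first reported as timed out after "
                + (firstReportedMs / 1000) + " s");
    }
}
```

In this model the frequent 10-second check does not help: nothing else moves the region out of OPENING, so it is picked up only after the full timeout has elapsed. That matches the roughly 30-minute stall the reporter describes.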
[jira] [Commented] (HBASE-7082) TestHFileCleaner#testHFileCleaning fails due to cleaner is reset
[ https://issues.apache.org/jira/browse/HBASE-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1348#comment-1348 ] Himanshu Vashishtha commented on HBASE-7082: +1. lgtm. > TestHFileCleaner#testHFileCleaning fails due to cleaner is reset > > > Key: HBASE-7082 > URL: https://issues.apache.org/jira/browse/HBASE-7082 > Project: HBase > Issue Type: Test >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang >Priority: Trivial > Attachments: trunk-7082.patch > > > TestHFileCleaner#testHFileCleaning fails if it runs after > testRemovesEmptyDirectories which resets the cleaner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6060: --- Assignee: Jimmy Xiang (was: rajeshbabu) Status: Open (was: Patch Available) > Regions's in OPENING state from failed regionservers takes a long time to > recover > - > > Key: HBASE-6060 > URL: https://issues.apache.org/jira/browse/HBASE-6060 > Project: HBase > Issue Type: Bug > Components: master, regionserver >Reporter: Enis Soztutar >Assignee: Jimmy Xiang > Fix For: 0.96.0 > > Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, > 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, > 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, > 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, > 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, > HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, > HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, > trunk-6060.patch, trunk-6060_v2.patch, trunk-6060_v3.3.patch > > > we have seen a pattern in tests, that the regions are stuck in OPENING state > for a very long time when the region server who is opening the region fails. > My understanding of the process: > > - master calls rs to open the region. If rs is offline, a new plan is > generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in > master memory, zk still shows OFFLINE). See HRegionServer.openRegion(), > HMaster.assign() > - RegionServer, starts opening a region, changes the state in znode. But > that znode is not ephemeral. (see ZkAssign) > - Rs transitions zk node from OFFLINE to OPENING. 
See > OpenRegionHandler.process() > - rs then opens the region, and changes znode from OPENING to OPENED > - when rs is killed between OPENING and OPENED states, then zk shows OPENING > state, and the master just waits for rs to change the region state, but since > rs is down, that wont happen. > - There is a AssignmentManager.TimeoutMonitor, which does exactly guard > against these kind of conditions. It periodically checks (every 10 sec by > default) the regions in transition to see whether they timedout > (hbase.master.assignment.timeoutmonitor.timeout). Default timeout is 30 min, > which explains what you and I are seeing. > - ServerShutdownHandler in Master does not reassign regions in OPENING > state, although it handles other states. > Lowering that threshold from the configuration is one option, but still I > think we can do better. > Will investigate more. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HBASE-6060: --- Attachment: trunk-6060_v3.3.patch > Regions's in OPENING state from failed regionservers takes a long time to > recover > - > > Key: HBASE-6060 > URL: https://issues.apache.org/jira/browse/HBASE-6060 > Project: HBase > Issue Type: Bug > Components: master, regionserver >Reporter: Enis Soztutar >Assignee: rajeshbabu > Fix For: 0.96.0 > > Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, > 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, > 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, > 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, > 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, > HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, > HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, > trunk-6060.patch, trunk-6060_v2.patch, trunk-6060_v3.3.patch > > > we have seen a pattern in tests, that the regions are stuck in OPENING state > for a very long time when the region server who is opening the region fails. > My understanding of the process: > > - master calls rs to open the region. If rs is offline, a new plan is > generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in > master memory, zk still shows OFFLINE). See HRegionServer.openRegion(), > HMaster.assign() > - RegionServer, starts opening a region, changes the state in znode. But > that znode is not ephemeral. (see ZkAssign) > - Rs transitions zk node from OFFLINE to OPENING. See > OpenRegionHandler.process() > - rs then opens the region, and changes znode from OPENING to OPENED > - when rs is killed between OPENING and OPENED states, then zk shows OPENING > state, and the master just waits for rs to change the region state, but since > rs is down, that wont happen. 
> - There is a AssignmentManager.TimeoutMonitor, which does exactly guard > against these kind of conditions. It periodically checks (every 10 sec by > default) the regions in transition to see whether they timedout > (hbase.master.assignment.timeoutmonitor.timeout). Default timeout is 30 min, > which explains what you and I are seeing. > - ServerShutdownHandler in Master does not reassign regions in OPENING > state, although it handles other states. > Lowering that threshold from the configuration is one option, but still I > think we can do better. > Will investigate more. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6852: - Fix Version/s: (was: 0.94.3) 0.94.4 Moving to 0.94.4. Cheng, looking at the patch, do you think it's possible due to the way we flush the metrics for the all schema metrics? > SchemaMetrics.updateOnCacheHit costs too much while full scanning a table > with all of its fields > > > Key: HBASE-6852 > URL: https://issues.apache.org/jira/browse/HBASE-6852 > Project: HBase > Issue Type: Improvement > Components: metrics >Affects Versions: 0.94.0 >Reporter: Cheng Hao >Assignee: Cheng Hao >Priority: Minor > Labels: performance > Fix For: 0.94.4 > > Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, > onhitcache-trunk.patch > > > The SchemaMetrics.updateOnCacheHit costs too much while I am doing the full > table scanning. > Here is the top 5 hotspots within regionserver while full scanning a table: > (Sorry for the less-well-format) > CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated) > Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit > mask of 0x00 (No unit mask) count 500 > samples % image name symbol name > --- > 98447 13.4324 14033.jo void > org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, > boolean) > 98447 100.000 14033.jo void > org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, > boolean) [self] > --- > 45814 6.2510 14033.jo int > org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, > byte[], int, int) > 45814 100.000 14033.jo int > org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, > byte[], int, int) [self] > --- > 43523 5.9384 14033.jo boolean > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) 
> 43523 100.000 14033.jo boolean > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) > [self] > --- > 42548 5.8054 14033.jo int > org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, > byte[], int, int) > 42548 100.000 14033.jo int > org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, > byte[], int, int) [self] > --- > 40572 5.5358 14033.jo int > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], > int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 > 40572 100.000 14033.jo int > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], > int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
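As a generic illustration of how a per-call metric update can be taken off a hot path like the one in the profile above, the Java sketch below accumulates hits in a local counter and publishes them to a shared counter only in batches. This is an assumption-laden example, not the content of the attached patches: BatchedCounterSketch, LocalHitCounter and the threshold of 2000 are hypothetical, and a single java.util.concurrent.atomic.AtomicLong stands in for whatever SchemaMetrics actually updates.

```java
import java.util.concurrent.atomic.AtomicLong;

public class BatchedCounterSketch {
    /**
     * Hypothetical per-scanner (single-threaded) counter that defers updates
     * to a shared counter until a threshold is reached.
     */
    static final class LocalHitCounter {
        private final AtomicLong shared;
        private final int flushThreshold;
        private int pending;

        LocalHitCounter(AtomicLong shared, int flushThreshold) {
            this.shared = shared;
            this.flushThreshold = flushThreshold;
        }

        /** Cheap path: a plain int increment on almost every call. */
        void onCacheHit() {
            if (++pending >= flushThreshold) {
                flush();
            }
        }

        /** Publishes any pending hits; call this when the scan finishes. */
        void flush() {
            if (pending > 0) {
                shared.addAndGet(pending);
                pending = 0;
            }
        }
    }

    public static void main(String[] args) {
        AtomicLong sharedCacheHits = new AtomicLong();
        LocalHitCounter counter = new LocalHitCounter(sharedCacheHits, 2000);
        for (int i = 0; i < 4500; i++) {
            counter.onCacheHit();
        }
        System.out.println("visible before final flush: " + sharedCacheHits.get());
        counter.flush();
        System.out.println("visible after final flush: " + sharedCacheHits.get());
    }
}
```

The trade-off is that the shared value lags behind by up to one batch until the final flush, so any reader of the metric has to tolerate slightly stale numbers.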
[jira] [Reopened] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl reopened HBASE-6852:
----------------------------------

Reverted a 2nd time.

> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
> ------------------------------------------------------------------------------------------------
>
> Key: HBASE-6852
> URL: https://issues.apache.org/jira/browse/HBASE-6852
> Project: HBase
> Issue Type: Improvement
> Components: metrics
> Affects Versions: 0.94.0
> Reporter: Cheng Hao
> Assignee: Cheng Hao
> Priority: Minor
> Labels: performance
> Fix For: 0.94.4
> Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, onhitcache-trunk.patch
>
> SchemaMetrics.updateOnCacheHit costs too much while I am doing a full table scan.
> Here are the top 5 hotspots within the regionserver while full scanning a table:
> (Sorry for the poor formatting)
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 500
> samples  %        image name  symbol name
> ---
> 98447    13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
>   98447  100.000  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
> ---
> 45814    6.2510   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
>   45814  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
> ---
> 43523    5.9384   14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>   43523  100.000  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
> ---
> 42548    5.8054   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
>   42548  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
> ---
> 40572    5.5358   14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
>   40572  100.000  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]
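The numbers in the profile above can be cross-checked with a little arithmetic. The following is only a sketch for reading the table: `ProfileShare` and its methods are hypothetical helpers, not part of HBase or oprofile, and the sample counts and percentages are copied from the issue description.

```java
/** Hypothetical helper for reading the profile above; not part of HBase or oprofile. */
final class ProfileShare {
    /** Total number of samples implied by one row: samples / (percent / 100). */
    static double impliedTotal(long samples, double percent) {
        return samples / (percent / 100.0);
    }

    /** Sum of the per-symbol percentages. */
    static double sumPercent(double[] percents) {
        double sum = 0.0;
        for (double p : percents) {
            sum += p;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Values taken from the five rows of the profile in the issue description.
        long[] samples = {98447, 45814, 43523, 42548, 40572};
        double[] percents = {13.4324, 6.2510, 5.9384, 5.8054, 5.5358};
        for (int i = 0; i < samples.length; i++) {
            System.out.printf("row %d implies ~%.0f total samples%n",
                    i + 1, impliedTotal(samples[i], percents[i]));
        }
        System.out.printf("top five together: %.4f%% of samples%n", sumPercent(percents));
        System.out.printf("updateOnCacheHit vs next hotspot: %.2fx%n",
                (double) samples[0] / samples[1]);
    }
}
```

Each row implies roughly the same total (about 733 thousand samples), which suggests the split of the fused `samples` and `%` columns is read correctly. By the same figures, updateOnCacheHit alone takes more than twice the samples of the next hotspot, and the top five symbols together account for roughly 37% of all samples.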
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488858#comment-13488858 ]

rajeshbabu commented on HBASE-6060:
-----------------------------------

@Jimmy
It's good. +1

> Regions's in OPENING state from failed regionservers takes a long time to recover
> ---------------------------------------------------------------------------------
>
> Key: HBASE-6060
> URL: https://issues.apache.org/jira/browse/HBASE-6060
> Project: HBase
> Issue Type: Bug
> Components: master, regionserver
> Reporter: Enis Soztutar
> Assignee: rajeshbabu
> Fix For: 0.96.0
> Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, trunk-6060.patch, trunk-6060_v2.patch
>
> We have seen a pattern in tests where regions are stuck in OPENING state for a very long time when the region server that is opening the region fails. My understanding of the process:
>
> - Master calls rs to open the region. If rs is offline, a new plan is generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in master memory, zk still shows OFFLINE). See HRegionServer.openRegion(), HMaster.assign()
> - RegionServer starts opening a region and changes the state in the znode. But that znode is not ephemeral. (see ZkAssign)
> - Rs transitions the zk node from OFFLINE to OPENING. See OpenRegionHandler.process()
> - Rs then opens the region, and changes the znode from OPENING to OPENED
> - When rs is killed between the OPENING and OPENED states, zk shows OPENING state, and the master just waits for rs to change the region state, but since rs is down, that won't happen.
> - There is an AssignmentManager.TimeoutMonitor, which guards against exactly this kind of condition. It periodically checks (every 10 sec by default) the regions in transition to see whether they have timed out (hbase.master.assignment.timeoutmonitor.timeout). Default timeout is 30 min, which explains what you and I are seeing.
> - ServerShutdownHandler in Master does not reassign regions in OPENING state, although it handles other states.
> Lowering that threshold from the configuration is one option, but still I think we can do better.
> Will investigate more.
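The timing described in the issue can be sketched as a small model. This is illustrative only: `OpeningTimeoutSketch`, `ZkState`, and every method below are hypothetical names that do not mirror the real AssignmentManager API, and the server-aware rule is an assumption about what "doing better" could look like, not the committed fix. The 10 second period and 30 minute timeout are the defaults quoted in the description.

```java
/** Illustrative model only; names are hypothetical and do not mirror AssignmentManager. */
final class OpeningTimeoutSketch {
    enum ZkState { OFFLINE, OPENING, OPENED }

    static final long CHECK_PERIOD_MS = 10_000L;       // "every 10 sec by default"
    static final long TIMEOUT_MS = 30L * 60L * 1000L;  // "Default timeout is 30 min"

    /** Purely time-based check, as the TimeoutMonitor is described in the issue. */
    static boolean timedOut(long millisInTransition, long timeoutMs) {
        return millisInTransition >= timeoutMs;
    }

    /** Upper bound on detection: the timeout must expire, then the next periodic check must run. */
    static long worstCaseDetectionMs(long timeoutMs, long periodMs) {
        return timeoutMs + periodMs;
    }

    /**
     * A possible server-aware rule (an assumption, not the actual patch):
     * reassign immediately when the region is OPENING on a server known to be dead.
     */
    static boolean shouldReassign(ZkState state, boolean serverAlive,
                                  long millisInTransition, long timeoutMs) {
        if (state == ZkState.OPENING && !serverAlive) {
            return true;
        }
        return state != ZkState.OPENED && timedOut(millisInTransition, timeoutMs);
    }

    public static void main(String[] args) {
        System.out.println("time-based worst case (ms): "
                + worstCaseDetectionMs(TIMEOUT_MS, CHECK_PERIOD_MS));
        System.out.println("dead server, 1 min in OPENING, reassign now? "
                + shouldReassign(ZkState.OPENING, false, 60_000L, TIMEOUT_MS));
    }
}
```

With the quoted defaults, the purely time-based check can leave a region in OPENING for a little over 30 minutes, which matches the behaviour reported in the issue.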
[jira] [Commented] (HBASE-7042) Master Coprocessor Endpoint
[ https://issues.apache.org/jira/browse/HBASE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488849#comment-13488849 ]

Francis Liu commented on HBASE-7042:
------------------------------------

{quote}
It is possible to reuse Exec and ExecResult. Just ignore / don't set the row fields etc. Having separate classes for master exec only is not a clean API in my opinion.
{quote}
Apart from reuse being a bit clunky, I made them separate so that each class can evolve without master and region usages stepping on each other. If you think this is OK, I'm fine with reusing Exec and ExecResult.

> Master Coprocessor Endpoint
> ---------------------------
>
> Key: HBASE-7042
> URL: https://issues.apache.org/jira/browse/HBASE-7042
> Project: HBase
> Issue Type: Sub-task
> Reporter: Francis Liu
> Assignee: Francis Liu
> Fix For: 0.96.0
> Attachments: HBASE-7042_94.patch
>
> Having support for a master coprocessor endpoint would enable developers to easily extend HMaster functionality/features, as is the case for region server grouping.
[jira] [Commented] (HBASE-7017) Backport "[replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file" to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488846#comment-13488846 ]

Lars Hofhansl commented on HBASE-7017:
--------------------------------------

Thanks DD. I will commit this today, unless I hear any objections.

> Backport "[replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file" to 0.94
> ----------------------------------------------------------------------
>
> Key: HBASE-7017
> URL: https://issues.apache.org/jira/browse/HBASE-7017
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Reporter: Devaraj Das
> Assignee: Devaraj Das
> Fix For: 0.94.3
[jira] [Commented] (HBASE-7042) Master Coprocessor Endpoint
[ https://issues.apache.org/jira/browse/HBASE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488845#comment-13488845 ]

Francis Liu commented on HBASE-7042:
------------------------------------

Though I believe your concern can be addressed by making system coprocessors reloadable, which I think we should do for both master and region coprocessors anyway. Can we address this in a separate JIRA?

> Master Coprocessor Endpoint
> ---------------------------
>
> Key: HBASE-7042
> URL: https://issues.apache.org/jira/browse/HBASE-7042
> Project: HBase
> Issue Type: Sub-task
> Reporter: Francis Liu
> Assignee: Francis Liu
> Fix For: 0.96.0
> Attachments: HBASE-7042_94.patch
>
> Having support for a master coprocessor endpoint would enable developers to easily extend HMaster functionality/features, as is the case for region server grouping.
[jira] [Updated] (HBASE-7017) Backport "[replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file" to 0.94
[ https://issues.apache.org/jira/browse/HBASE-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HBASE-7017:
-------------------------------

Assignee: Devaraj Das

I was able to run a decently large test, and the patch showed no problems. In particular, I ran the LoadTestTool with 100 keys, and set hbase.regionserver.hlog.blocksize to a low number (1 MB). The test led to ~650 log rolls on one of the region servers I was tracking. I verified that the data was successfully replicated to the remote cluster.

> Backport "[replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file" to 0.94
> ----------------------------------------------------------------------
>
> Key: HBASE-7017
> URL: https://issues.apache.org/jira/browse/HBASE-7017
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Reporter: Devaraj Das
> Assignee: Devaraj Das
> Fix For: 0.94.3
[jira] [Commented] (HBASE-7042) Master Coprocessor Endpoint
[ https://issues.apache.org/jira/browse/HBASE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488838#comment-13488838 ]

Francis Liu commented on HBASE-7042:
------------------------------------

{quote}
The issue I see with supporting only system coprocessors is it requires deployment by way of XML config update as the sole option, which requires a process restart. Table CPs can be deployed to HDFS and then loaded dynamically with a schema update without needing to stop any process. Not supporting this latter option on the master seems a shortcut and limiting.
{quote}
Don't we already have a precedent for this with MasterObserver? From a use case perspective, it seems to me that master coprocessors are cluster owner decisions and are probably fewer and slower moving than region coprocessors, which are owned by table owners. In the former case, updating and restarting the master should be acceptable?

> Master Coprocessor Endpoint
> ---------------------------
>
> Key: HBASE-7042
> URL: https://issues.apache.org/jira/browse/HBASE-7042
> Project: HBase
> Issue Type: Sub-task
> Reporter: Francis Liu
> Assignee: Francis Liu
> Fix For: 0.96.0
> Attachments: HBASE-7042_94.patch
>
> Having support for a master coprocessor endpoint would enable developers to easily extend HMaster functionality/features, as is the case for region server grouping.
[jira] [Updated] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-6060:
---------------------------------

Fix Version/s: (was: 0.94.3)
               (was: 0.92.3)

Removed from 0.92 and 0.94.

> Regions's in OPENING state from failed regionservers takes a long time to recover
> ---------------------------------------------------------------------------------
>
> Key: HBASE-6060
> URL: https://issues.apache.org/jira/browse/HBASE-6060
> Project: HBase
> Issue Type: Bug
> Components: master, regionserver
> Reporter: Enis Soztutar
> Assignee: rajeshbabu
> Fix For: 0.96.0
> Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, trunk-6060.patch, trunk-6060_v2.patch
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488824#comment-13488824 ]

Lars Hofhansl commented on HBASE-6852:
--------------------------------------

Sorry, Cheng, I am probably going to have to roll this back again.

> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
> ------------------------------------------------------------------------------------------------
>
> Key: HBASE-6852
> URL: https://issues.apache.org/jira/browse/HBASE-6852
> Project: HBase
> Issue Type: Improvement
> Components: metrics
> Affects Versions: 0.94.0
> Reporter: Cheng Hao
> Assignee: Cheng Hao
> Priority: Minor
> Labels: performance
> Fix For: 0.94.3
> Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, onhitcache-trunk.patch
[jira] [Commented] (HBASE-7066) Some HMaster coprocessor exceptions are being swallowed in try catch blocks
[ https://issues.apache.org/jira/browse/HBASE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488822#comment-13488822 ]

Francis Liu commented on HBASE-7066:
------------------------------------

Lars, if you agree that this fix is the right way to go, can we get this into 0.94.3?

> Some HMaster coprocessor exceptions are being swallowed in try catch blocks
> ---------------------------------------------------------------------------
>
> Key: HBASE-7066
> URL: https://issues.apache.org/jira/browse/HBASE-7066
> Project: HBase
> Issue Type: Bug
> Components: Coprocessors, security
> Affects Versions: 0.94.2, 0.96.0
> Reporter: Francis Liu
> Assignee: Francis Liu
> Priority: Critical
> Attachments: HBASE-7066_94.patch, HBASE-7066_trunk.patch, HBASE-7066_trunk.patch
>
> This is causing HMaster.shutdown() and HMaster.stopMaster() to succeed even when an AccessDeniedException is thrown.
[jira] [Reopened] (HBASE-2645) HLog writer can do 1-2 sync operations after lease has been recovered for split process.
[ https://issues.apache.org/jira/browse/HBASE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu reopened HBASE-2645:
---------------------------

The test failed in HBase-TRUNK build #3504

> HLog writer can do 1-2 sync operations after lease has been recovered for split process.
> ----------------------------------------------------------------------------------------
>
> Key: HBASE-2645
> URL: https://issues.apache.org/jira/browse/HBASE-2645
> Project: HBase
> Issue Type: Bug
> Components: Filters
> Affects Versions: 0.90.4
> Reporter: Cosmin Lehene
> Assignee: Todd Lipcon
> Priority: Blocker
> Fix For: 0.96.0
> Attachments: 2645.txt, 2645v2.txt, 2645v3.txt, org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit-output.txt
>
> TestHLogSplit.testLogCannotBeWrittenOnceParsed is failing.
> This test starts a thread that writes one edit to the log, syncs and counts. During this, an HLog.splitLog operation is started. splitLog recovers the log lease before reading the log, so that the original regionserver cannot wake up and write after the split process has started.
> The test compares the number of edits reported by the split process and by the writer thread. The writer thread (called zombie in the test) should report no more edits than splitLog does (sync() might raise after the last edit gets written, and that edit won't get counted by the zombie thread). However, it appears that the zombie counts 1-2 more edits. So it looks like it can sync without a lease.
> This might be an hdfs-0.20 related issue.
[jira] [Commented] (HBASE-7066) Some HMaster coprocessor exceptions are being swallowed in try catch blocks
[ https://issues.apache.org/jira/browse/HBASE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488818#comment-13488818 ]

Francis Liu commented on HBASE-7066:
------------------------------------

More specifically, if AccessController.preShutdown() throws an IOException, shutdown should fail: we cannot truly determine whether the shutdown request is authorized, so we should err on the side of caution and fail the request. There may be instances where preShutdown() hook exceptions can be ignored, but that is not the case for AccessController, which comes packaged with HBase, so it makes sense to guarantee that it works as advertised. We can address the previous scenario by fixing the coprocessor framework.

> Some HMaster coprocessor exceptions are being swallowed in try catch blocks
> ---------------------------------------------------------------------------
>
> Key: HBASE-7066
> URL: https://issues.apache.org/jira/browse/HBASE-7066
> Project: HBase
> Issue Type: Bug
> Components: Coprocessors, security
> Affects Versions: 0.94.2, 0.96.0
> Reporter: Francis Liu
> Assignee: Francis Liu
> Priority: Critical
> Attachments: HBASE-7066_94.patch, HBASE-7066_trunk.patch, HBASE-7066_trunk.patch
>
> This is causing HMaster.shutdown() and HMaster.stopMaster() to succeed even when an AccessDeniedException is thrown.
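The difference between swallowing and propagating a hook exception can be sketched as follows. Everything here is a hypothetical stand-in rather than the real HMaster or coprocessor host code: `ShutdownHookSketch`, `PreShutdownHook`, and the local `AccessDeniedException` (modeled as an IOException, in line with the comment above) exist only for illustration.

```java
import java.io.IOException;

/** Hypothetical stand-ins; not the real HMaster or coprocessor host classes. */
final class ShutdownHookSketch {
    /** Local stand-in for an access-denied error, modeled here as an IOException. */
    static final class AccessDeniedException extends IOException {
        AccessDeniedException(String message) {
            super(message);
        }
    }

    /** Stand-in for a pre-shutdown coprocessor hook such as AccessController.preShutdown(). */
    interface PreShutdownHook {
        void preShutdown() throws IOException;
    }

    /** The pattern the issue complains about: the exception is caught and shutdown proceeds. */
    static boolean shutdownSwallowing(PreShutdownHook hook) {
        try {
            hook.preShutdown();
        } catch (IOException e) {
            System.err.println("Error calling preShutdown hook: " + e.getMessage());
        }
        return true; // shutdown is carried out regardless of the hook's verdict
    }

    /** Failing closed: the exception propagates, so the shutdown is never carried out. */
    static boolean shutdownPropagating(PreShutdownHook hook) throws IOException {
        hook.preShutdown();
        return true; // reached only when the hook allowed the request
    }

    public static void main(String[] args) {
        PreShutdownHook deny = () -> {
            throw new AccessDeniedException("user may not shut down the cluster");
        };
        System.out.println("swallowing variant shut down: " + shutdownSwallowing(deny));
        try {
            shutdownPropagating(deny);
        } catch (IOException e) {
            System.out.println("propagating variant refused: " + e.getMessage());
        }
    }
}
```

The propagating variant leaves the decision to the caller, which is the fail-closed behaviour the comment argues for when the hook is a security check.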
[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover
[ https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488817#comment-13488817 ]

Jimmy Xiang commented on HBASE-6060:
------------------------------------

@Rajesh, are you ok with the latest patch on RB for trunk?
@Lars, I am fine with letting it go with 0.92 and 0.94. My patch won't work well in 0.92/0.94 since it depends on other patches which are too big to backport.

> Regions's in OPENING state from failed regionservers takes a long time to recover
> ---------------------------------------------------------------------------------
>
> Key: HBASE-6060
> URL: https://issues.apache.org/jira/browse/HBASE-6060
> Project: HBase
> Issue Type: Bug
> Components: master, regionserver
> Reporter: Enis Soztutar
> Assignee: rajeshbabu
> Fix For: 0.92.3, 0.94.3, 0.96.0
> Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, trunk-6060.patch, trunk-6060_v2.patch
[jira] [Commented] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter
[ https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488816#comment-13488816 ]

Jimmy Xiang commented on HBASE-7083:
------------------------------------

Tried to submit to Hadoop QA again.
As to forcing reassign, it is needed since the region state in memory may show the missing daughter is open. Since it is not in meta, SSH will assign it only in fixupDaughters. Without forcing it, assign won't do anything since it is already open.

> SSH#fixupDaughter should force re-assign missing daughter
> ---------------------------------------------------------
>
> Key: HBASE-7083
> URL: https://issues.apache.org/jira/browse/HBASE-7083
> Project: HBase
> Issue Type: Bug
> Components: Region Assignment
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Priority: Minor
> Fix For: 0.96.0
> Attachments: trunk-7083.patch
>
> In looking into flaky test TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a missing daughter is not assigned by SSH properly. It could be open on the dead server. We need to force re-assign it.
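The reasoning behind forcing the reassignment can be sketched with a toy model. `ForceAssignSketch` and its methods are hypothetical and far simpler than the real assignment code; the only behaviour modeled is that a non-forced assign is skipped when the in-memory state already says the region is open.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model with hypothetical names; the real assignment code is far more involved. */
final class ForceAssignSketch {
    enum State { OFFLINE, OPEN }

    /** The master's in-memory view: region name to last known state. */
    private final Map<String, State> regionStates = new HashMap<>();

    /** Records that the master believes the region is open (possibly on a now-dead server). */
    void markOpen(String region) {
        regionStates.put(region, State.OPEN);
    }

    /** Returns true if an assignment was actually issued. */
    boolean assign(String region, boolean force) {
        if (!force && regionStates.get(region) == State.OPEN) {
            return false; // believed to be open already, so a plain assign does nothing
        }
        regionStates.put(region, State.OPEN); // pretend it was opened on a live server
        return true;
    }

    public static void main(String[] args) {
        ForceAssignSketch master = new ForceAssignSketch();
        master.markOpen("daughterA"); // stale state: the hosting server has died
        System.out.println("plain assign issued: " + master.assign("daughterA", false));
        System.out.println("forced assign issued: " + master.assign("daughterA", true));
    }
}
```

In this model the stale OPEN entry makes the plain assign a no-op, so only the forced call reassigns the missing daughter.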
[jira] [Commented] (HBASE-7077) Test for: CheckAndPut should properly read MVCC
[ https://issues.apache.org/jira/browse/HBASE-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488814#comment-13488814 ]

Hudson commented on HBASE-7077:
-------------------------------

Integrated in HBase-TRUNK #3504 (See [https://builds.apache.org/job/HBase-TRUNK/3504/])
HBASE-7077 ADDENDUM, add TestCategory (Revision 1404640)

Result = FAILURE
larsh :
Files :
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHBase7051.java

> Test for: CheckAndPut should properly read MVCC
> -----------------------------------------------
>
> Key: HBASE-7077
> URL: https://issues.apache.org/jira/browse/HBASE-7077
> Project: HBase
> Issue Type: Sub-task
> Reporter: Gregory Chanan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 0.94.3, 0.96.0
> Attachments: HBASE-7071.patch, HBASE-7071_testcasewithassert.patch
>
> checkAndPut should integrate with MVCC, similar to how HBASE-4583 fixed appends and increments.
> Also need a test, here's one we could use (originally proposed in HBASE-7051):
> The current value of some cell is 10.
> I issue two concurrent requests:
> A) a check and put where check value = 10, put value = 11
> B) a put where put value = 50
> The only result at the end of these operations that seems reasonable to me is the value of the cell being 50. If A occurred first (ACID wise), then our values go 10->11->50. If B occurred first, then our values go 10->50 (and the checkAndPut fails)
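The two serial orderings in the scenario above can be checked with a minimal sketch. It uses `java.util.concurrent.atomic.AtomicLong` as a stand-in for a single cell with atomic operations; it does not use HBase's client API or MVCC, and `CheckAndPutOrders` is a hypothetical name.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch of the two serial orders; AtomicLong stands in for one cell, not for HBase MVCC. */
final class CheckAndPutOrders {
    /**
     * Applies A (check 10, put 11) and B (put 50) to a cell that starts at 10.
     * Returns {final cell value, 1 if the check-and-put succeeded else 0}.
     */
    static long[] run(boolean aFirst) {
        AtomicLong cell = new AtomicLong(10);
        boolean checkAndPutOk;
        if (aFirst) {
            checkAndPutOk = cell.compareAndSet(10, 11); // A sees 10, writes 11
            cell.set(50);                               // B overwrites with 50
        } else {
            cell.set(50);                               // B writes 50 first
            checkAndPutOk = cell.compareAndSet(10, 11); // A sees 50, check fails
        }
        return new long[] {cell.get(), checkAndPutOk ? 1 : 0};
    }

    public static void main(String[] args) {
        long[] aThenB = run(true);
        long[] bThenA = run(false);
        System.out.println("A then B: value=" + aThenB[0] + " checkAndPutOk=" + aThenB[1]);
        System.out.println("B then A: value=" + bThenA[0] + " checkAndPutOk=" + bThenA[1]);
    }
}
```

Both orders leave the cell at 50, so any other final value (such as 11) would indicate that the check and the put were not applied atomically with respect to the concurrent put.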
[jira] [Updated] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter
[ https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HBASE-7083:
-------------------------------

Hadoop Flags: Reviewed
Status: Patch Available (was: Open)

> SSH#fixupDaughter should force re-assign missing daughter
> ---------------------------------------------------------
>
> Key: HBASE-7083
> URL: https://issues.apache.org/jira/browse/HBASE-7083
> Project: HBase
> Issue Type: Bug
> Components: Region Assignment
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Priority: Minor
> Fix For: 0.96.0
> Attachments: trunk-7083.patch
>
> In looking into flaky test TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a missing daughter is not assigned by SSH properly. It could be open on the dead server. We need to force re-assign it.
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488808#comment-13488808 ]

Lars Hofhansl commented on HBASE-6852:
--------------------------------------

Also tried on some other machines (JDK7 and JDK6); it passes all the time. This is extremely disconcerting.

> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
> ------------------------------------------------------------------------------------------------
>
> Key: HBASE-6852
> URL: https://issues.apache.org/jira/browse/HBASE-6852
> Project: HBase
> Issue Type: Improvement
> Components: metrics
> Affects Versions: 0.94.0
> Reporter: Cheng Hao
> Assignee: Cheng Hao
> Priority: Minor
> Labels: performance
> Fix For: 0.94.3
> Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, onhitcache-trunk.patch
[jira] [Updated] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter
[ https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HBASE-7083:
-------------------------------

Status: Open (was: Patch Available)

> SSH#fixupDaughter should force re-assign missing daughter
> ---------------------------------------------------------
>
> Key: HBASE-7083
> URL: https://issues.apache.org/jira/browse/HBASE-7083
> Project: HBase
> Issue Type: Bug
> Components: Region Assignment
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Priority: Minor
> Fix For: 0.96.0
> Attachments: trunk-7083.patch
>
> In looking into flaky test TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a missing daughter is not assigned by SSH properly. It could be open on the dead server. We need to force re-assign it.
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488795#comment-13488795 ]

Ted Yu commented on HBASE-6852:
-------------------------------

I wasn't able to reproduce the test failure on my MacBook.

> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table
> with all of its fields
> -------------------------------------------------------------------------
>
>                 Key: HBASE-6852
>                 URL: https://issues.apache.org/jira/browse/HBASE-6852
>             Project: HBase
>          Issue Type: Improvement
>          Components: metrics
>    Affects Versions: 0.94.0
>            Reporter: Cheng Hao
>            Assignee: Cheng Hao
>            Priority: Minor
>              Labels: performance
>             Fix For: 0.94.3
>
>         Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, onhitcache-trunk.patch
>
>
> SchemaMetrics.updateOnCacheHit costs too much while I am doing a full
> table scan.
> Here are the top 5 hotspots within the regionserver while full scanning a table:
> (Sorry for the poor formatting)
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 500
> samples  %        image name  symbol name
> -------------------------------------------------------------------------------
> 98447    13.4324  14033.jo  void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
>   98447  100.000  14033.jo  void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
> -------------------------------------------------------------------------------
> 45814    6.2510   14033.jo  int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
>   45814  100.000  14033.jo  int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 43523    5.9384   14033.jo  boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>   43523  100.000  14033.jo  boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
> -------------------------------------------------------------------------------
> 42548    5.8054   14033.jo  int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
>   42548  100.000  14033.jo  int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 40572    5.5358   14033.jo  int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
>   40572  100.000  14033.jo  int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488794#comment-13488794 ]

Lars Hofhansl commented on HBASE-6852:
--------------------------------------

Does it fail locally on your machine, Ted?
I'm going to run the test on a different machine so that I can debug. If that is not fruitful, I'll revert the change again... Sigh :(
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488791#comment-13488791 ]

Ted Yu commented on HBASE-6852:
-------------------------------

The new build failed again.
After reverting patch v2, TestScannerSelectionUsingTTL passed on the above-mentioned platform.
[jira] [Commented] (HBASE-7077) Test for: CheckAndPut should properly read MVCC
[ https://issues.apache.org/jira/browse/HBASE-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488784#comment-13488784 ]

Hudson commented on HBASE-7077:
-------------------------------

Integrated in HBase-0.94 #563 (See [https://builds.apache.org/job/HBase-0.94/563/])
    HBASE-7077 ADDENDUM, add TestCategory (Revision 1404641)

     Result = FAILURE
larsh :
Files :
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestHBase7051.java

> Test for: CheckAndPut should properly read MVCC
> -----------------------------------------------
>
>                 Key: HBASE-7077
>                 URL: https://issues.apache.org/jira/browse/HBASE-7077
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Gregory Chanan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.94.3, 0.96.0
>
>         Attachments: HBASE-7071.patch, HBASE-7071_testcasewithassert.patch
>
>
> checkAndPut should integrate with MVCC, similar to how HBASE-4583 fixed
> appends and increments.
> Also need a test, here's one we could use (originally proposed in HBASE-7051):
> The current value of some cell is 10.
> I issue two concurrent requests:
> A) a check and put where check value = 10, put value = 11
> B) a put where put value = 50
> The only result at the end of these operations that seems reasonable to me is
> the value of the cell being 50. If A occurred first (ACID wise), then our
> values go 10->11->50. If B occurred first, then our values go 10->50 (and the
> checkAndPut fails)
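The two-request scenario in the HBASE-7077 description above can be sketched as a toy model. This is only an illustration of the expected outcome, not the HBase API and not the test attached to the issue: the class CheckAndPutOrdering and its methods are hypothetical, and an AtomicInteger stands in for the cell. It runs the two serial orderings that an atomic checkAndPut should be equivalent to and shows that both end with the cell holding 50.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of a single cell. NOT the HBase API; every name here is hypothetical. */
public class CheckAndPutOrdering {

    /** Writes newValue only if the cell currently holds expected; returns true on success. */
    public static boolean checkAndPut(AtomicInteger cell, int expected, int newValue) {
        return cell.compareAndSet(expected, newValue);
    }

    /** Unconditional write. */
    public static void put(AtomicInteger cell, int newValue) {
        cell.set(newValue);
    }

    /** Applies A then B (aFirst == true) or B then A, and returns the final cell value. */
    public static int finalValue(boolean aFirst) {
        AtomicInteger cell = new AtomicInteger(10); // the current value of the cell is 10
        if (aFirst) {
            checkAndPut(cell, 10, 11); // A succeeds: 10 -> 11
            put(cell, 50);             // B: 11 -> 50
        } else {
            put(cell, 50);             // B: 10 -> 50
            checkAndPut(cell, 10, 11); // A fails: the cell holds 50, not 10
        }
        return cell.get();
    }

    public static void main(String[] args) {
        System.out.println("A first: " + finalValue(true));
        System.out.println("B first: " + finalValue(false));
    }
}
```

A final value of 11 would mean the check in A read a stale value while B's put was already applied, which is the case the description rules out. A real test would issue A and B from separate threads against a region rather than in a fixed order.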
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488763#comment-13488763 ]

Lars Hofhansl commented on HBASE-6852:
--------------------------------------

I triggered a new build. If that fails again, I am not sure what to do. I ran the test a lot of times locally and it always passes.
[jira] [Commented] (HBASE-7077) Test for: CheckAndPut should properly read MVCC
[ https://issues.apache.org/jira/browse/HBASE-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488757#comment-13488757 ]

Lars Hofhansl commented on HBASE-7077:
--------------------------------------

fixed