[jira] [Commented] (HBASE-5096) Replication does not handle deletes correctly.
[ https://issues.apache.org/jira/browse/HBASE-5096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176147#comment-13176147 ]

Hudson commented on HBASE-5096:
---
Integrated in HBase-0.92-security #49 (See [https://builds.apache.org/job/HBase-0.92-security/49/])
HBASE-5096 Replication does not handle deletes correctly. (Lars H)

larsh :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java

Replication does not handle deletes correctly.
--
Key: HBASE-5096
URL: https://issues.apache.org/jira/browse/HBASE-5096
Project: HBase
Issue Type: Sub-task
Components: replication
Affects Versions: 0.94.0, 0.92.1
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0, 0.92.1
Attachments: 5096.txt

Teruyoshi Zenmyo discovered this problem. The problem turns out to be this code in ReplicationSink.java:
{code}
if (kvs.get(0).isDelete()) {
  ...
    if (kv.isDeleteFamily()) {
      delete.deleteFamily(kv.getFamily());
    } else if (!kv.isEmptyColumn()) {
      delete.deleteColumn(kv.getFamily(), kv.getQualifier());
    }
  }
  ...
{code}
So the code deals with family delete markers and then assumes that anything that is not a family delete marker must have been a version delete marker (deleteColumn sets a version delete marker, deleteColumns sets a column delete marker). I.e. column delete markers are not replicated correctly.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
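The distinction described above can be sketched roughly as follows. This is only a self-contained illustration, not the committed patch: `MarkerType` and `clientCallFor` are hypothetical names, and the only facts taken from the issue are that deleteFamily covers a family marker, deleteColumn sets a version delete marker, and deleteColumns sets a column delete marker.

```java
class DeleteMarkerDispatchSketch {
    // Hypothetical model of the three delete marker kinds; not an HBase class.
    enum MarkerType { VERSION_DELETE, COLUMN_DELETE, FAMILY_DELETE }

    // Returns the name of the client-side Delete call that would reproduce
    // the given marker kind on the sink cluster.
    static String clientCallFor(MarkerType type) {
        switch (type) {
            case FAMILY_DELETE:
                return "deleteFamily";
            case COLUMN_DELETE:
                return "deleteColumns"; // column delete marker (the case that was missed)
            case VERSION_DELETE:
                return "deleteColumn";  // version delete marker
            default:
                throw new IllegalArgumentException("unknown marker: " + type);
        }
    }

    public static void main(String[] args) {
        for (MarkerType t : MarkerType.values()) {
            System.out.println(t + " -> " + clientCallFor(t));
        }
    }
}
```

The point of the sketch is that the sink has to branch on all three marker kinds; a two-way branch silently turns column delete markers into version delete markers.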
[jira] [Updated] (HBASE-3924) Improve Shell's CLI help
[ https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HBASE-3924:
---
Attachment: HBASE-3924.patch

Patch that addresses Lars' comments.
{code}
HBase Shell command-line options:
 script-file [script-options]  Script to run, along with its arguments.
 --format=OPTION               Formatter for outputting results. Valid options are: console, html. (Default: console)
 -d | --debug                  Set DEBUG log levels.
 -h | --help                   This help.
{code}

Improve Shell's CLI help
Key: HBASE-3924
URL: https://issues.apache.org/jira/browse/HBASE-3924
Project: HBase
Issue Type: Improvement
Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
Attachments: HBASE-3924.patch

In the hirb.rb source we have
{noformat}
# so they don't go through to irb. Output shell 'usage' if user types '--help'
cmdline_help = <<HERE # HERE document output as shell usage
HBase Shell command-line options:
 format        Formatter for outputting results: console | html. Default: console
 -d | --debug  Set DEBUG log levels.
HERE
found = []
format = 'console'
script2run = nil
log_level = org.apache.log4j.Level::ERROR
for arg in ARGV
  if arg =~ /^--format=(.+)/i
    format = $1
    if format =~ /^html$/i
      raise NoMethodError.new("Not yet implemented")
    elsif format =~ /^console$/i
      # This is default
    else
      raise ArgumentError.new("Unsupported format " + arg)
    end
    found.push(arg)
  elsif arg == '-h' || arg == '--help'
    puts cmdline_help
    exit
  elsif arg == '-d' || arg == '--debug'
    log_level = org.apache.log4j.Level::DEBUG
    $fullBackTrace = true
    puts "Setting DEBUG log level..."
  else
    # Presume it a script. Save it off for running later below
    # after we've set up some environment.
    script2run = arg
    found.push(arg)
    # Presume that any other args are meant for the script.
    break
  end
end
{noformat}
We should enhance the help printed when using -h/--help to look like this?
{noformat}
cmdline_help = <<HERE # HERE document output as shell usage
HBase Shell command-line options:
 --format={console|html}  Formatter for outputting results. Default: console
 -d | --debug             Set DEBUG log levels.
 -h | --help              This help.
 script-filename [script-options]
HERE
{noformat}
[jira] [Assigned] (HBASE-3274) Replace all config properties references in code with string constants
[ https://issues.apache.org/jira/browse/HBASE-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J reassigned HBASE-3274: -- Assignee: Harsh J Replace all config properties references in code with string constants -- Key: HBASE-3274 URL: https://issues.apache.org/jira/browse/HBASE-3274 Project: HBase Issue Type: Improvement Reporter: Lars George Assignee: Harsh J Priority: Trivial See HBASE-2721 for details. We have fixed the default values in HBASE-3272 but we should also follow Hadoop to remove all hardcoded strings that refer to configuration properties and move them to HConstants.
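As a rough illustration of what the issue asks for, the sketch below keeps a property name and its default in named constants instead of repeating the literal string at each call site. Everything in it is hypothetical: the key `example.handler.count`, the constant names, and the `getInt` helper stand in for whatever would actually live in HConstants and the configuration class.

```java
import java.util.HashMap;
import java.util.Map;

class ConfigKeysSketch {
    // Hypothetical property name and default, for illustration only.
    static final String EXAMPLE_HANDLER_COUNT_KEY = "example.handler.count";
    static final int DEFAULT_EXAMPLE_HANDLER_COUNT = 10;

    // Minimal stand-in for a configuration lookup with a default value.
    static int getInt(Map<String, String> conf, String key, int defaultValue) {
        String raw = conf.get(key);
        return raw == null ? defaultValue : Integer.parseInt(raw);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Call sites refer to the constants, never to the raw string.
        System.out.println(getInt(conf, EXAMPLE_HANDLER_COUNT_KEY, DEFAULT_EXAMPLE_HANDLER_COUNT));
        conf.put(EXAMPLE_HANDLER_COUNT_KEY, "25");
        System.out.println(getInt(conf, EXAMPLE_HANDLER_COUNT_KEY, DEFAULT_EXAMPLE_HANDLER_COUNT));
    }
}
```

With the name defined once, a typo in a property name becomes a compile error rather than a silently ignored setting.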
[jira] [Commented] (HBASE-5097) Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.
[ https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176117#comment-13176117 ]

Andrew Purtell commented on HBASE-5097:
---
Are you inheriting from {{BaseRegionObserver}}? I'd guess not?

Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.
---
Key: HBASE-5097
URL: https://issues.apache.org/jira/browse/HBASE-5097
Project: HBase
Issue Type: Bug
Components: coprocessors
Reporter: ramkrishna.s.vasudevan

In HRegionServer.java openScanner()
{code}
r.prepareScanner(scan);
RegionScanner s = null;
if (r.getCoprocessorHost() != null) {
  s = r.getCoprocessorHost().preScannerOpen(scan);
}
if (s == null) {
  s = r.getScanner(scan);
}
if (r.getCoprocessorHost() != null) {
  s = r.getCoprocessorHost().postScannerOpen(scan, s);
}
{code}
If we don't have an implementation for postScannerOpen, the RegionScanner is null and a NullPointerException is thrown:
{code}
java.lang.NullPointerException
  at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
  at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
{code}
Marking this defect as a blocker. Please feel free to change the priority if I am wrong. Also correct me if my way of trying out coprocessors without implementing postScannerOpen is wrong. I am just a learner.
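The failure mode in the openScanner() snippet can be reproduced with a small model. This is a sketch under assumptions, not HBase code: `Scanner`, `Observer`, `BaseObserver` and `NullReturningObserver` are hypothetical stand-ins, and the pass-through behaviour of the base class is assumed from the suggestion to inherit from BaseRegionObserver.

```java
class ScannerHookSketch {
    interface Scanner {}                      // stand-in for RegionScanner

    interface Observer {                      // stand-in for RegionObserver
        Scanner preScannerOpen();             // null means "do not override the scanner"
        Scanner postScannerOpen(Scanner s);   // whatever is returned replaces the scanner
    }

    // Assumed behaviour of a base class: harmless defaults for every hook.
    static class BaseObserver implements Observer {
        public Scanner preScannerOpen() { return null; }
        public Scanner postScannerOpen(Scanner s) { return s; } // pass through
    }

    // A hand-written observer whose stubbed post hook returns null.
    static class NullReturningObserver implements Observer {
        public Scanner preScannerOpen() { return null; }
        public Scanner postScannerOpen(Scanner s) { return null; }
    }

    // Mirrors the control flow quoted from openScanner().
    static Scanner openScanner(Observer observer, Scanner regionScanner) {
        Scanner s = observer.preScannerOpen();
        if (s == null) {
            s = regionScanner;
        }
        return observer.postScannerOpen(s);
    }

    public static void main(String[] args) {
        Scanner region = new Scanner() {};
        System.out.println(openScanner(new BaseObserver(), region) == region);      // scanner survives
        System.out.println(openScanner(new NullReturningObserver(), region) == null); // scanner lost
    }
}
```

In this model the null only appears when the post hook is stubbed to return null, which is why extending a base class with pass-through defaults avoids the later failure when the scanner is stored.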
[jira] [Updated] (HBASE-4009) Script to patch holes in .META. table
[ https://issues.apache.org/jira/browse/HBASE-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars George updated HBASE-4009: --- Attachment: add_empty_region.rb Adds an empty region at a given start and end key. Script to patch holes in .META. table - Key: HBASE-4009 URL: https://issues.apache.org/jira/browse/HBASE-4009 Project: HBase Issue Type: New Feature Components: shell Environment: CDH3U0 Reporter: Lars George Priority: Trivial Attachments: add_empty_region.rb, patch_meta.rb I need a script to patch holes in the .META. table, which was corrupted by an earlier issue on the cluster.
[jira] [Updated] (HBASE-3924) Improve Shell's CLI help
[ https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HBASE-3924: --- Fix Version/s: 0.94.0 0.92.0 Status: Patch Available (was: Open) Patch was against trunk, but should apply to both 0.92 and trunk. Would be good to have in a further 0.92 rev. Improve Shell's CLI help Key: HBASE-3924 URL: https://issues.apache.org/jira/browse/HBASE-3924 Project: HBase Issue Type: Improvement Components: shell Affects Versions: 0.90.3 Reporter: Lars George Assignee: Harsh J Priority: Trivial Fix For: 0.92.0, 0.94.0 Attachments: HBASE-3924.patch
[jira] [Commented] (HBASE-3924) Improve Shell's CLI help
[ https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176182#comment-13176182 ]

Hadoop QA commented on HBASE-3924:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12508673/HBASE-3924.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
-1 javadoc. The javadoc tool appears to have generated -151 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 77 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.regionserver.wal.TestLogRolling
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/604//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/604//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/604//console

This message is automatically generated.

Improve Shell's CLI help
Key: HBASE-3924
URL: https://issues.apache.org/jira/browse/HBASE-3924
Project: HBase
Issue Type: Improvement
Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
Fix For: 0.92.0, 0.94.0
Attachments: HBASE-3924.patch
[jira] [Commented] (HBASE-3274) Replace all config properties references in code with string constants
[ https://issues.apache.org/jira/browse/HBASE-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176204#comment-13176204 ] Harsh J commented on HBASE-3274: I'm recording all the strange param names I encounter as I do this at https://gist.github.com/76416a2211ece8edb95a Meanwhile, am hoping no more patches get in with config names as strings… :) Replace all config properties references in code with string constants -- Key: HBASE-3274 URL: https://issues.apache.org/jira/browse/HBASE-3274 Project: HBase Issue Type: Improvement Reporter: Lars George Assignee: Harsh J Priority: Trivial Original Estimate: 168h Remaining Estimate: 168h See HBASE-2721 for details. We have fixed the default values in HBASE-3272 but we should also follow Hadoop to remove all hardcoded strings that refer to configuration properties and move them to HConstants.
[jira] [Assigned] (HBASE-3924) Improve Shell's CLI help
[ https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J reassigned HBASE-3924: -- Assignee: Harsh J Improve Shell's CLI help Key: HBASE-3924 URL: https://issues.apache.org/jira/browse/HBASE-3924 Project: HBase Issue Type: Improvement Components: shell Affects Versions: 0.90.3 Reporter: Lars George Assignee: Harsh J Priority: Trivial
[jira] [Commented] (HBASE-5097) Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.
[ https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176140#comment-13176140 ] ramkrishna.s.vasudevan commented on HBASE-5097: --- Ya.. My bad..:(.. Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization. --- Key: HBASE-5097 URL: https://issues.apache.org/jira/browse/HBASE-5097 Project: HBase Issue Type: Bug Components: coprocessors Reporter: ramkrishna.s.vasudevan
[jira] [Commented] (HBASE-4565) Maven HBase build broken on cygwin with copynativelib.sh call.
[ https://issues.apache.org/jira/browse/HBASE-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176207#comment-13176207 ]

Akash Ashok commented on HBASE-4565:

Really helpful. Thanks a lot. Also, the patch seems to be failing on trunk. I am not sure if it's already out of sync with trunk.

$ patch -p0 < pom.patch
(Stripping trailing CRs from patch.)
patching file pom.xml
Hunk #1 FAILED at 696.
patch unexpectedly ends in middle of line
Hunk #2 succeeded at 1328 with fuzz 1 (offset 90 lines).
1 out of 2 hunks FAILED -- saving rejects to file pom.xml.rej

I manually patched it and it works beautifully.

Maven HBase build broken on cygwin with copynativelib.sh call.
--
Key: HBASE-4565
URL: https://issues.apache.org/jira/browse/HBASE-4565
Project: HBase
Issue Type: Bug
Components: build
Affects Versions: 0.92.0
Environment: cygwin (on xp and win7)
Reporter: Suraj Varma
Assignee: Suraj Varma
Labels: build, maven
Fix For: 0.94.0
Attachments: HBASE-4565-0.92.patch, HBASE-4565-v2.patch, HBASE-4565-v3-0.92.patch, HBASE-4565-v3.patch, HBASE-4565.patch

This is broken in both the 0.92 and the trunk pom.xml. Here's a sample maven log snippet from trunk (from Mayuresh on the user mailing list):

[INFO] [antrun:run {execution: package}]
[INFO] Executing tasks
main:
[mkdir] Created dir: D:\workspace\mkshirsa\hbase-trunk\target\hbase-0.93-SNAPSHOT\hbase-0.93-SNAPSHOT\lib\native\${build.platform}
[exec] ls: cannot access D:workspacemkshirsahbase-trunktarget/nativelib: No such file or directory
[exec] tar (child): Cannot connect to D: resolve failed
[INFO]
[ERROR] BUILD ERROR
[INFO]
[INFO] An Ant BuildException has occured: exec returned: 3328

There are two issues:

1) The ant run task below doesn't resolve the Windows file separator returned by project.build.directory. This causes the "resolve failed" error above.

<!-- Using Unix cp to preserve symlinks, using script to handle wildcards -->
<echo file="${project.build.directory}/copynativelibs.sh">
  if [ `ls ${project.build.directory}/nativelib | wc -l` -ne 0]; then

2) The tar argument value below has a similar issue in that the path arg doesn't resolve correctly.

<!-- Using Unix tar to preserve symlinks -->
<exec executable="tar" failonerror="yes" dir="${project.build.directory}/${project.artifactId}-${project.version}">
  <arg value="czf"/>
  <arg value="/cygdrive/c/workspaces/hbase-0.92-svn/target/${project.artifactId}-${project.version}.tar.gz"/>
  <arg value="."/>
</exec>

In both cases, the fix would probably be to use a cross-platform way to handle the directory locations.
[jira] [Work started] (HBASE-3274) Replace all config properties references in code with string constants
[ https://issues.apache.org/jira/browse/HBASE-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-3274 started by Harsh J. Replace all config properties references in code with string constants -- Key: HBASE-3274 URL: https://issues.apache.org/jira/browse/HBASE-3274 Project: HBase Issue Type: Improvement Reporter: Lars George Assignee: Harsh J Priority: Trivial Original Estimate: 168h Remaining Estimate: 168h See HBASE-2721 for details. We have fixed the default values in HBASE-3272 but we should also follow Hadoop to remove all hardcoded strings that refer to configuration properties and move them to HConstants.
[jira] [Updated] (HBASE-3274) Replace all config properties references in code with string constants
[ https://issues.apache.org/jira/browse/HBASE-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HBASE-3274: --- Remaining Estimate: 168h Original Estimate: 168h Replace all config properties references in code with string constants -- Key: HBASE-3274 URL: https://issues.apache.org/jira/browse/HBASE-3274 Project: HBase Issue Type: Improvement Reporter: Lars George Assignee: Harsh J Priority: Trivial Original Estimate: 168h Remaining Estimate: 168h See HBASE-2721 for details. We have fixed the default values in HBASE-3272 but we should also follow Hadoop to remove all hardcoded strings that refer to configuration properties and move them to HConstants.
[jira] [Updated] (HBASE-5073) Registered listeners not getting removed leading to memory leak in HBaseAdmin
[ https://issues.apache.org/jira/browse/HBASE-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-5073: -- Fix Version/s: (was: 0.90.5) 0.90.6 Hadoop Flags: Reviewed Registered listeners not getting removed leading to memory leak in HBaseAdmin - Key: HBASE-5073 URL: https://issues.apache.org/jira/browse/HBASE-5073 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.90.6 Attachments: HBASE-5073.patch HBaseAdmin APIs like tableExists(), flush, split and closeRegion use the catalog tracker. Every time, the root node tracker and the meta node tracker are started and a listener is registered. But after the operations are performed the listeners are not removed. Hence, if the admin APIs are used repeatedly, this may lead to a memory leak.
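The leak pattern described here can be shown with a small self-contained model. It is a sketch only: `Watcher`, `leakyOperation` and `cleanOperation` are hypothetical names and do not correspond to the classes touched by HBASE-5073.patch; the one idea illustrated is pairing every registration with a removal in a finally block.

```java
import java.util.ArrayList;
import java.util.List;

class ListenerCleanupSketch {
    // Hypothetical long-lived object that keeps a list of listeners.
    static class Watcher {
        private final List<Object> listeners = new ArrayList<>();
        void register(Object listener) { listeners.add(listener); }
        void unregister(Object listener) { listeners.remove(listener); }
        int listenerCount() { return listeners.size(); }
    }

    // Registers a tracker per call and never removes it: the list grows forever.
    static boolean leakyOperation(Watcher watcher) {
        watcher.register(new Object());
        return true;
    }

    // Same operation, but the tracker is always removed when the call ends.
    static boolean cleanOperation(Watcher watcher) {
        Object tracker = new Object();
        watcher.register(tracker);
        try {
            return true; // the real admin work would happen here
        } finally {
            watcher.unregister(tracker);
        }
    }

    public static void main(String[] args) {
        Watcher leaky = new Watcher();
        Watcher clean = new Watcher();
        for (int i = 0; i < 1000; i++) {
            leakyOperation(leaky);
            cleanOperation(clean);
        }
        System.out.println("leaky listeners: " + leaky.listenerCount());
        System.out.println("clean listeners: " + clean.listenerCount());
    }
}
```

The finally block matters because the removal has to happen even when the operation in between throws.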
[jira] [Commented] (HBASE-5073) Registered listeners not getting removed leading to memory leak in HBaseAdmin
[ https://issues.apache.org/jira/browse/HBASE-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176231#comment-13176231 ] Zhihong Yu commented on HBASE-5073: --- Integrated to 0.90 branch. Thanks for the patch, Ramkrishna. Thanks for the review, Stack and Lars. Registered listeners not getting removed leading to memory leak in HBaseAdmin - Key: HBASE-5073 URL: https://issues.apache.org/jira/browse/HBASE-5073 Project: HBase Issue Type: Bug Affects Versions: 0.90.4 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.90.6 Attachments: HBASE-5073.patch HBaseAdmin APIs like tableExists(), flush, split and closeRegion use the catalog tracker. Every time, the root node tracker and the meta node tracker are started and a listener is registered. But after the operations are performed the listeners are not removed. Hence, if the admin APIs are used repeatedly, this may lead to a memory leak.
[jira] [Updated] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server
[ https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu updated HBASE-4720: -- Attachment: (was: HBASE-4720.trunk.v2.patch) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server Key: HBASE-4720 URL: https://issues.apache.org/jira/browse/HBASE-4720 Project: HBase Issue Type: Improvement Reporter: Daniel Lord Assignee: Mubarak Seyed Fix For: 0.94.0 Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, HBASE-4720.v1.patch, HBASE-4720.v3.patch I have several large application/HBase clusters where an application node will occasionally need to talk to HBase from a different cluster. In order to help ensure some of my consistency guarantees I have a sentinel table that is updated atomically as users interact with the system. This works quite well for the regular hbase client but the REST client does not implement the checkAndPut and checkAndDelete operations. This exposes the application to some race conditions that have to be worked around. It would be ideal if the same checkAndPut/checkAndDelete operations could be supported by the REST client.
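For readers unfamiliar with the semantics being requested, the sketch below models check-and-put and check-and-delete on an in-memory map. It is not the HBase or REST API: `CheckAndPutSketch` and its method signatures are hypothetical, a single string stands in for a cell, and treating a null expected value as "cell must be absent" is an assumption of this sketch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class CheckAndPutSketch {
    // One string value per key stands in for a sentinel cell.
    private final ConcurrentMap<String, String> cells = new ConcurrentHashMap<>();

    // Writes newValue only if the current value equals expected.
    // A null expected value means "only if the cell does not exist yet".
    boolean checkAndPut(String key, String expected, String newValue) {
        if (expected == null) {
            return cells.putIfAbsent(key, newValue) == null;
        }
        return cells.replace(key, expected, newValue);
    }

    // Deletes the cell only if its current value equals expected.
    boolean checkAndDelete(String key, String expected) {
        return cells.remove(key, expected);
    }

    public static void main(String[] args) {
        CheckAndPutSketch table = new CheckAndPutSketch();
        System.out.println(table.checkAndPut("sentinel", null, "v1"));  // cell was absent
        System.out.println(table.checkAndPut("sentinel", "v0", "v2"));  // stale expectation
        System.out.println(table.checkAndPut("sentinel", "v1", "v2"));  // expectation matches
        System.out.println(table.checkAndDelete("sentinel", "v2"));     // expectation matches
    }
}
```

Because the comparison and the write happen as one atomic step, two clients racing on the same sentinel cannot both succeed, which is the guarantee a plain read followed by a put over REST cannot give.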
[jira] [Updated] (HBASE-5094) The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.
[ https://issues.apache.org/jira/browse/HBASE-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5094:
--
Priority: Critical (was: Major)

The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.
Key: HBASE-5094
URL: https://issues.apache.org/jira/browse/HBASE-5094
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: ramkrishna.s.vasudevan
Priority: Critical

{code}
RegionState rit = this.services.getAssignmentManager().isRegionInTransition(e.getKey());
ServerName addressFromAM = this.services.getAssignmentManager()
    .getRegionServerOfRegion(e.getKey());
if (rit != null && !rit.isClosing() && !rit.isPendingClose()) {
  // Skip regions that were in transition unless CLOSING or
  // PENDING_CLOSE
  LOG.info("Skip assigning region " + rit.toString());
} else if (addressFromAM != null
    && !addressFromAM.equals(this.serverName)) {
  LOG.debug("Skip assigning region "
      + e.getKey().getRegionNameAsString()
      + " because it has been opened in "
      + addressFromAM.getServerName());
}
{code}
In ServerShutdownHandler we try to get the address from the AM. This address is initially null because it is not yet updated after the region was opened, i.e. the callback after node deletion has not yet run on the master side. But removal from RIT has already completed on the master side, so this will trigger a new assignment. So there is a small window between the point where the online region is actually added to the online list and the point where ServerShutdownHandler checks the existing address in the AM.
[jira] [Commented] (HBASE-4009) Script to patch holes in .META. table
[ https://issues.apache.org/jira/browse/HBASE-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176243#comment-13176243 ] Todd Lipcon commented on HBASE-4009: I don't think we should be committing any more rb scripts like this to the source code - they just get people into trouble, fall out of date, etc. Instead this should be an hbck feature that automatically patches holes in a table if requested. Script to patch holes in .META. table - Key: HBASE-4009 URL: https://issues.apache.org/jira/browse/HBASE-4009 Project: HBase Issue Type: New Feature Components: shell Environment: CDH3U0 Reporter: Lars George Priority: Trivial Attachments: add_empty_region.rb, patch_meta.rb I need a script to patch holes in the .META. table, which was corrupted by an earlier issue on the cluster.
[jira] [Commented] (HBASE-3274) Replace all config properties references in code with string constants
[ https://issues.apache.org/jira/browse/HBASE-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176276#comment-13176276 ]

Lars Hofhansl commented on HBASE-3274:

Big +1. I have actually reviewed some recent patches that add config names as strings. Sorry I did not point that out as a defect.
Also, many people turn to hbase-default.xml, but many (new and old) configs are not in there.

Replace all config properties references in code with string constants

Key: HBASE-3274
URL: https://issues.apache.org/jira/browse/HBASE-3274
Project: HBase
Issue Type: Improvement
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
Original Estimate: 168h
Remaining Estimate: 168h

See HBASE-2721 for details. We have fixed the default values in HBASE-3272, but we should also follow Hadoop and remove all hardcoded strings that refer to configuration properties, moving them to HConstants.
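As an illustration of what the issue asks for, here is a minimal sketch of the pattern, under stated assumptions: `ConfigKeys`, the property name `example.client.pause`, and its default are made up for this example, and `java.util.Properties` stands in for the Hadoop configuration object so that the snippet is self-contained. It is not taken from HConstants.

```java
import java.util.Properties;

class ConfigConstantsDemo {
    /** Reads the setting through the shared constant instead of a string literal. */
    static long clientPause(Properties conf) {
        String raw = conf.getProperty(ConfigKeys.CLIENT_PAUSE);
        return raw == null ? ConfigKeys.DEFAULT_CLIENT_PAUSE : Long.parseLong(raw);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(clientPause(conf)); // falls back to the default
        conf.setProperty(ConfigKeys.CLIENT_PAUSE, "250");
        System.out.println(clientPause(conf)); // uses the configured value
    }
}

/** Hypothetical constants holder playing the role HConstants has in HBase. */
final class ConfigKeys {
    static final String CLIENT_PAUSE = "example.client.pause"; // made-up property name
    static final long DEFAULT_CLIENT_PAUSE = 1000L;            // made-up default

    private ConfigKeys() {
    }
}
```

With the name and default defined once, a typo in a property name becomes a compile error rather than a silently ignored setting, and the list of constants is a natural source for keeping hbase-default.xml complete.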
[jira] [Commented] (HBASE-3924) Improve Shell's CLI help
[ https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176281#comment-13176281 ]

Lars Hofhansl commented on HBASE-3924:

It's still a bit confusing, I think. How about something like this?
{code}
Usage: HBase shell [OPTIONS] [SCRIPT [ARGUMENTS]]

 --format=OPTION  Formatter for outputting results.
                  Valid options are: console, html.
                  (Default: console)
 -d | --debug     Set DEBUG log levels.
 -h | --help      This help.
{code}

Improve Shell's CLI help

Key: HBASE-3924
URL: https://issues.apache.org/jira/browse/HBASE-3924
Project: HBase
Issue Type: Improvement
Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
Fix For: 0.92.0, 0.94.0
Attachments: HBASE-3924.patch

In the hirb.rb source we have
{noformat}
# so they don't go through to irb. Output shell 'usage' if user types '--help'
cmdline_help = <<HERE # HERE document output as shell usage
HBase Shell command-line options:
 format        Formatter for outputting results: console | html. Default: console
 -d | --debug  Set DEBUG log levels.
HERE
found = []
format = 'console'
script2run = nil
log_level = org.apache.log4j.Level::ERROR
for arg in ARGV
  if arg =~ /^--format=(.+)/i
    format = $1
    if format =~ /^html$/i
      raise NoMethodError.new("Not yet implemented")
    elsif format =~ /^console$/i
      # This is default
    else
      raise ArgumentError.new("Unsupported format " + arg)
    end
    found.push(arg)
  elsif arg == '-h' || arg == '--help'
    puts cmdline_help
    exit
  elsif arg == '-d' || arg == '--debug'
    log_level = org.apache.log4j.Level::DEBUG
    $fullBackTrace = true
    puts "Setting DEBUG log level..."
  else
    # Presume it a script. Save it off for running later below
    # after we've set up some environment.
    script2run = arg
    found.push(arg)
    # Presume that any other args are meant for the script.
    break
  end
end
{noformat}
We should enhance the help printed when using -h/--help to look like this?
{noformat}
cmdline_help = <<HERE # HERE document output as shell usage
HBase Shell command-line options:
 --format={console|html}  Formatter for outputting results. Default: console
 -d | --debug             Set DEBUG log levels.
 -h | --help              This help.
 script-filename [script-options]
HERE
{noformat}
[jira] [Commented] (HBASE-5097) Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.
[ https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176287#comment-13176287 ]

Andrew Purtell commented on HBASE-5097:

Wouldn't hurt.

Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.

Key: HBASE-5097
URL: https://issues.apache.org/jira/browse/HBASE-5097
Project: HBase
Issue Type: Bug
Components: coprocessors
Reporter: ramkrishna.s.vasudevan

In HRegionServer.java openScanner()
{code}
r.prepareScanner(scan);
RegionScanner s = null;
if (r.getCoprocessorHost() != null) {
  s = r.getCoprocessorHost().preScannerOpen(scan);
}
if (s == null) {
  s = r.getScanner(scan);
}
if (r.getCoprocessorHost() != null) {
  s = r.getCoprocessorHost().postScannerOpen(scan, s);
}
{code}
If we don't have an implementation for postScannerOpen, the RegionScanner is null, which results in a NullPointerException:
{code}
java.lang.NullPointerException
	at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
	at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
{code}
Marking this defect as a blocker. Please feel free to change the priority if I am wrong. Also correct me if my way of trying out coprocessors without implementing postScannerOpen is wrong. I am just a learner.
[jira] [Commented] (HBASE-5097) Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.
[ https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176296#comment-13176296 ]

Lars Hofhansl commented on HBASE-5097:

Not a big fan of catching NPEs. We can add a null check somewhere in the code, but catching NPEs is bad design (IMHO).
@Ram: How did you actually do this? You need to implement RegionObserver, so you cannot compile unless you provide an implementation of postScannerOpen. Or did your class not even implement RegionObserver?

Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.

Key: HBASE-5097
URL: https://issues.apache.org/jira/browse/HBASE-5097
Project: HBase
Issue Type: Bug
Components: coprocessors
Reporter: ramkrishna.s.vasudevan
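The null check suggested in this comment could take roughly the following shape. This is a sketch under stated assumptions, not the HBase coprocessor host: `RowScanner`, `ScanObserver`, and `runPostHooks` are hypothetical names, and treating a null return as "no replacement" is one possible policy that the thread itself does not settle on.

```java
import java.util.ArrayList;
import java.util.List;

class NullSafeHookDemo {
    /** Hypothetical stand-in for RegionScanner. */
    interface RowScanner {
    }

    /** Hypothetical stand-in for the postScannerOpen part of RegionObserver. */
    interface ScanObserver {
        RowScanner postScannerOpen(RowScanner s);
    }

    /** Runs every hook, keeping the current scanner whenever a hook returns null. */
    static RowScanner runPostHooks(List<ScanObserver> observers, RowScanner opened) {
        RowScanner current = opened;
        for (ScanObserver o : observers) {
            RowScanner result = o.postScannerOpen(current);
            if (result != null) { // explicit null check instead of catching an NPE later
                current = result;
            }
        }
        return current;
    }

    public static void main(String[] args) {
        RowScanner real = new RowScanner() { };
        List<ScanObserver> observers = new ArrayList<>();
        observers.add(s -> null); // an observer with a stub implementation
        System.out.println(runPostHooks(observers, real) == real);
    }
}
```

The check sits at the one place where observer results are merged, so callers such as the scanner registration never see a null and no exception handling is needed.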
[jira] [Created] (HBASE-5098) Update hbase-default.xml to reflect the state of HConstants once its the center.
Update hbase-default.xml to reflect the state of HConstants once its the center.

Key: HBASE-5098
URL: https://issues.apache.org/jira/browse/HBASE-5098
Project: HBase
Issue Type: Sub-task
Reporter: Harsh J
Priority: Trivial

Once the parent task is done, we should be easily able to add to and update hbase-default.xml.
[jira] [Commented] (HBASE-3924) Improve Shell's CLI help
[ https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176299#comment-13176299 ]

Harsh J commented on HBASE-3924:

+1 on that. It does look much more readable.
Nit: I'd ditch the 'HBase' though, seems unnecessary. Or maybe use 'hbase' to avoid confusion about what it means there.

Improve Shell's CLI help

Key: HBASE-3924
URL: https://issues.apache.org/jira/browse/HBASE-3924
Project: HBase
Issue Type: Improvement
Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
Fix For: 0.92.0, 0.94.0
Attachments: HBASE-3924.patch
[jira] [Commented] (HBASE-5061) StoreFileLocalityChecker
[ https://issues.apache.org/jira/browse/HBASE-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176311#comment-13176311 ]

Lars Hofhansl commented on HBASE-5061:

From looking at the code, this would produce data locality with respect to a certain host (if I understand this correctly).
Wouldn't we want to report locality from the viewpoint of the respective regionserver(s)? I.e. regions for a table might be on many regionservers, but for each the data might be local to the regionserver.

StoreFileLocalityChecker

Key: HBASE-5061
URL: https://issues.apache.org/jira/browse/HBASE-5061
Project: HBase
Issue Type: New Feature
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Minor
Attachments: StoreFileLocalityChecker.java

org.apache.hadoop.hbase.HFileLocalityChecker [options]
A tool to report the number of local and nonlocal HFile blocks, and the ratio as a percentage.
Where options are:
|-f file|Analyze a store file|
|-r region|Analyze all store files for the region|
|-t table|Analyze all store files for regions of the table served by the local regionserver|
|-h host|Consider host local, defaults to the local host|
|-v|Verbose operation|
[jira] [Commented] (HBASE-5061) StoreFileLocalityChecker
[ https://issues.apache.org/jira/browse/HBASE-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176322#comment-13176322 ]

Andrew Purtell commented on HBASE-5061:

bq. From looking at the code, this would produce data locality with respect to certain host (if I understand this correctly).

Yes, this is the intent: to report locality to a given regionserver host, by default the local host as determined by an HBase-style reverse lookup. The assumption is that ops would run it either on an ops box (or the master) and supply the desired local host name via the '-h' option, or local to the RS. I was thinking one use case could be to iterate over each cluster node and trigger major compactions for region(s) depending on the output of this tool.
Or are you looking for a whole cluster report?

StoreFileLocalityChecker

Key: HBASE-5061
URL: https://issues.apache.org/jira/browse/HBASE-5061
Project: HBase
Issue Type: New Feature
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Minor
Attachments: StoreFileLocalityChecker.java
[jira] [Commented] (HBASE-5061) StoreFileLocalityChecker
[ https://issues.apache.org/jira/browse/HBASE-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176326#comment-13176326 ]

Andrew Purtell commented on HBASE-5061:

So, currently this tool is meant to check the locality of files or regions to a given single host. However, it could be modified so that if the '-h' option is supplied the report looks only at the given host; otherwise it will enumerate the live hosts in ClusterStatus and return results for all. While at it, a new option '-j' for reporting in JSON.
Is this more what you have in mind, Lars?

StoreFileLocalityChecker

Key: HBASE-5061
URL: https://issues.apache.org/jira/browse/HBASE-5061
Project: HBase
Issue Type: New Feature
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Minor
Attachments: StoreFileLocalityChecker.java
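The proposed behaviour (one host when '-h' is given, all live hosts otherwise) can be sketched as below. This is an illustration under assumptions, not the attached StoreFileLocalityChecker.java: `LocalityReport`, `localPercent`, and `report` are hypothetical names, block locations are modelled as plain sets of host names, and the list of live hosts is passed in rather than read from ClusterStatus.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class LocalityReport {
    /** Percentage of blocks that have a replica on the given host. */
    static double localPercent(List<Set<String>> blockHosts, String host) {
        if (blockHosts.isEmpty()) {
            return 0.0;
        }
        int local = 0;
        for (Set<String> hosts : blockHosts) {
            if (hosts.contains(host)) {
                local++;
            }
        }
        return 100.0 * local / blockHosts.size();
    }

    /** Reports on onlyHost when it is non-null, otherwise on every live host. */
    static Map<String, Double> report(List<Set<String>> blockHosts,
                                      List<String> liveHosts,
                                      String onlyHost) {
        List<String> targets =
            onlyHost != null ? Collections.singletonList(onlyHost) : liveHosts;
        Map<String, Double> out = new LinkedHashMap<>();
        for (String h : targets) {
            out.put(h, localPercent(blockHosts, h));
        }
        return out;
    }

    public static void main(String[] args) {
        // Four blocks, each listed with the hosts that hold a replica.
        List<Set<String>> blocks = Arrays.asList(
            new HashSet<>(Arrays.asList("rs1", "rs2")),
            new HashSet<>(Arrays.asList("rs1", "rs3")),
            new HashSet<>(Arrays.asList("rs2", "rs3")),
            new HashSet<>(Arrays.asList("rs1", "rs2")));
        List<String> live = Arrays.asList("rs1", "rs2", "rs3");
        System.out.println(report(blocks, live, "rs3")); // single-host report
        System.out.println(report(blocks, live, null));  // whole-cluster report
    }
}
```

Returning a map keyed by host keeps the single-host and whole-cluster cases on the same code path, and such a structure would also serialize naturally for the suggested JSON output.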
[jira] [Commented] (HBASE-5061) StoreFileLocalityChecker
[ https://issues.apache.org/jira/browse/HBASE-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176332#comment-13176332 ]

Lars Hofhansl commented on HBASE-5061:

That's what I had envisioned for the -t option at least. Would be nice to have a report on all regions of a table. But maybe that would not be useful as the number of region servers grows...?
+1 on JSON output.

StoreFileLocalityChecker

Key: HBASE-5061
URL: https://issues.apache.org/jira/browse/HBASE-5061
Project: HBase
Issue Type: New Feature
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Minor
Attachments: StoreFileLocalityChecker.java
[jira] [Commented] (HBASE-3924) Improve Shell's CLI help
[ https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176334#comment-13176334 ]

Lars Hofhansl commented on HBASE-3924:

+1 on ditching HBase and just saying shell.
And should it say SCRIPT or SCRIPTFILE?

Improve Shell's CLI help

Key: HBASE-3924
URL: https://issues.apache.org/jira/browse/HBASE-3924
Project: HBase
Issue Type: Improvement
Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
Fix For: 0.92.0, 0.94.0
Attachments: HBASE-3924.patch
[jira] [Commented] (HBASE-3924) Improve Shell's CLI help
[ https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176353#comment-13176353 ]

stack commented on HBASE-3924:

+1 on patch. SCRIPTFILE sounds better, yeah, and *HBase* shell is superfluous.
I don't think any formatter other than console works, but that's for another issue.

Improve Shell's CLI help

Key: HBASE-3924
URL: https://issues.apache.org/jira/browse/HBASE-3924
Project: HBase
Issue Type: Improvement
Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
Fix For: 0.92.0, 0.94.0
Attachments: HBASE-3924.patch
[jira] [Commented] (HBASE-5093) wiki update for HBase/Scala
[ https://issues.apache.org/jira/browse/HBASE-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176355#comment-13176355 ]

stack commented on HBASE-5093:

Do you have a 'wiki' name, Joe? I'll add you to the list of contribs. (Ours and the hadoop wikis have been spammed badly for years, and a recent change has it that folks need to ask for perms to edit.)
Thanks for doing this ska la doc.

wiki update for HBase/Scala

Key: HBASE-5093
URL: https://issues.apache.org/jira/browse/HBASE-5093
Project: HBase
Issue Type: Improvement
Reporter: Joe Stein

I tried to edit the wiki but it says "immutable page".
It would be helpful/nice for folks to know how to get sbt working with Scala.
The following is what I did to get it working. Not sure why I could not edit the wiki, so I figured I'd open a JIRA so someone with access could update this:
{code}
resolvers += "Apache HBase" at "https://repository.apache.org/content/repositories/releases"

resolvers += "Thrift" at "http://people.apache.org/~rawson/repo/"

libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-core" % "0.20.2",
  "org.apache.hbase" % "hbase" % "0.90.4"
)
{code}
Or let me access it and I can do it, np.
[jira] [Created] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root re
ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root region is on

Key: HBASE-5099
URL: https://issues.apache.org/jira/browse/HBASE-5099
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang

An RS died. The ServerShutdownHandler kicked in and started the log splitting. SplitLogManager installed the tasks asynchronously, then started to wait for them to complete. The task znodes were not actually created; the requests were just queued. At this time, the zookeeper connection expired. HMaster tried to recover the expired ZK session. During the recovery, a new zookeeper connection was created. However, this master became the new master again. It tried to assign root and meta. Because the dead RS held the old root region, the master needs to wait for the log splitting to complete. This waiting holds the zookeeper event thread. So the async create split task is never retried, since there is only one event thread, and it is waiting for the root region to be assigned.
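The hang described here is an instance of a general pattern: work that runs on a single event thread blocks on a result that can only be produced by that same thread. The sketch below reproduces the pattern with a plain `ExecutorService`; it is not ZooKeeper or HBase code, the class and method names are hypothetical, and a timeout is used so that the demonstration terminates instead of hanging.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class SingleThreadDeadlockDemo {
    /**
     * Runs an outer task on the given executor. The outer task submits an inner
     * task to the same executor and waits for it. Returns whether the inner task
     * finished within the timeout.
     */
    static boolean innerCompletes(ExecutorService eventThread, long timeoutMs)
            throws Exception {
        Future<Boolean> outer = eventThread.submit(() -> {
            // With one worker thread this is queued behind the running outer task.
            Future<String> inner = eventThread.submit(() -> "split task created");
            try {
                inner.get(timeoutMs, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                inner.cancel(false); // without the timeout this wait would never end
                return false;
            }
        });
        return outer.get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService single = Executors.newSingleThreadExecutor();
        System.out.println(innerCompletes(single, 200)); // inner task cannot run
        single.shutdown();

        ExecutorService two = Executors.newFixedThreadPool(2);
        System.out.println(innerCompletes(two, 200)); // second thread runs it
        two.shutdown();
    }
}
```

In the report's terms, the outer task corresponds to the event-thread callback waiting for the root region, and the inner task to the queued request to create the split task znodes. Not blocking on the event thread, or handing the wait to another thread, removes the cycle.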
[jira] [Updated] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root re
[ https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jimmy Xiang updated HBASE-5099:
    Attachment: ZK-event-thread-waiting-for-root.png
                distributed-log-splitting-hangs.png

ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root region is on

Key: HBASE-5099
URL: https://issues.apache.org/jira/browse/HBASE-5099
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Attachments: ZK-event-thread-waiting-for-root.png, distributed-log-splitting-hangs.png
[jira] [Updated] (HBASE-5094) The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.
[ https://issues.apache.org/jira/browse/HBASE-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-5094:
    Attachment: 5094.patch

Patch for trunk

The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.

Key: HBASE-5094
URL: https://issues.apache.org/jira/browse/HBASE-5094
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: ramkrishna.s.vasudevan
Priority: Critical
Attachments: 5094.patch
[jira] [Commented] (HBASE-5097) Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.
[ https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176402#comment-13176402 ]

ramkrishna.s.vasudevan commented on HBASE-5097:
---
@Lars
I implemented RegionObserver directly and not BaseRegionObserver. BaseRegionObserver returns a RegionScanner, so it does not cause any problem. But if we implement RegionObserver directly, the default return value will be null. That is the problem.

Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.
---
Key: HBASE-5097
URL: https://issues.apache.org/jira/browse/HBASE-5097
Project: HBase
Issue Type: Bug
Components: coprocessors
Reporter: ramkrishna.s.vasudevan

In HRegionServer.java openScanner():
{code}
r.prepareScanner(scan);
RegionScanner s = null;
if (r.getCoprocessorHost() != null) {
  s = r.getCoprocessorHost().preScannerOpen(scan);
}
if (s == null) {
  s = r.getScanner(scan);
}
if (r.getCoprocessorHost() != null) {
  s = r.getCoprocessorHost().postScannerOpen(scan, s);
}
{code}
If we don't have an implementation for postScannerOpen, the returned RegionScanner is null, and addScanner() then throws a NullPointerException:
{code}
java.lang.NullPointerException
  at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
  at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
  at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
{code}
Marking this defect as a blocker. Please feel free to change the priority if I am wrong. Also correct me if my way of trying out coprocessors without implementing postScannerOpen is wrong. I am just a learner.
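One defensive option on the calling side would be to keep the scanner that was already opened whenever the post-open hook hands back null. The sketch below only illustrates that idea. `RowScanner`, `ScannerHooks` and `openWithHooks` are hypothetical stand-ins, not HBase classes, and nothing in this thread says the project adopted such a guard.

```java
import java.util.Objects;

/** Hypothetical stand-in for a region scanner; not the HBase RegionScanner. */
interface RowScanner {
    String name();
}

/** Hypothetical stand-in for the coprocessor hook discussed above. */
interface ScannerHooks {
    RowScanner postScannerOpen(RowScanner opened);
}

final class ScannerOpenSketch {
    /** Runs the post-open hook, but falls back to the opened scanner if the hook returns null. */
    static RowScanner openWithHooks(RowScanner base, ScannerHooks hooks) {
        Objects.requireNonNull(base, "base scanner");
        if (hooks == null) {
            return base; // no coprocessor host configured
        }
        RowScanner wrapped = hooks.postScannerOpen(base);
        return wrapped != null ? wrapped : base;
    }

    public static void main(String[] args) {
        RowScanner base = () -> "region-scanner";
        ScannerHooks stubHook = opened -> null; // mimics an auto-generated method stub
        System.out.println(openWithHooks(base, stubHook).name());
    }
}
```

With a guard like this, a stub hook that returns null would leave the caller with the original scanner instead of a null reference.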
[jira] [Updated] (HBASE-5094) The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.
[ https://issues.apache.org/jira/browse/HBASE-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5094:
---
Status: Patch Available (was: Open)

The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.
---
Key: HBASE-5094
URL: https://issues.apache.org/jira/browse/HBASE-5094
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: ramkrishna.s.vasudevan
Priority: Critical
Attachments: 5094.patch

{code}
RegionState rit = this.services.getAssignmentManager().isRegionInTransition(e.getKey());
ServerName addressFromAM = this.services.getAssignmentManager()
    .getRegionServerOfRegion(e.getKey());
if (rit != null && !rit.isClosing() && !rit.isPendingClose()) {
  // Skip regions that were in transition unless CLOSING or
  // PENDING_CLOSE
  LOG.info("Skip assigning region " + rit.toString());
} else if (addressFromAM != null
    && !addressFromAM.equals(this.serverName)) {
  LOG.debug("Skip assigning region "
      + e.getKey().getRegionNameAsString()
      + " because it has been opened in "
      + addressFromAM.getServerName());
}
{code}
In ServerShutdownHandler we try to get the region's address from the AssignmentManager (AM). This address is initially null because it is not yet updated after the region was opened, i.e. the callback after node deletion has not yet run on the master side. But removal from RIT is already completed on the master side, so this will trigger a new assignment. So there is a small window between the region actually being added to the online list and the ServerShutdownHandler check of the existing address in the AM.
[jira] [Commented] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root
[ https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176406#comment-13176406 ]

Zhihong Yu commented on HBASE-5099:
---
bq. Another one is to abort the master
The above sounds better. How about introducing a timeout for:
{code}
this.assignmentManager.waitForAssignment(HRegionInfo.ROOT_REGIONINFO);
{code}
after which the master aborts?

ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root region is on
---
Key: HBASE-5099
URL: https://issues.apache.org/jira/browse/HBASE-5099
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Attachments: ZK-event-thread-waiting-for-root.png, distributed-log-splitting-hangs.png

A RS died. The ServerShutdownHandler kicked in and started the log splitting. SplitLogManager installed the tasks asynchronously, then started to wait for them to complete. The task znodes were not actually created; the requests were just queued. At this time, the ZooKeeper connection expired. HMaster tried to recover the expired ZK session. During the recovery, a new ZooKeeper connection was created. However, this master became the new master again. It tried to assign root and meta. Because the dead RS held the old root region, the master needs to wait for the log splitting to complete. This waiting holds the ZooKeeper event thread. So the async create of the split task is never retried, since there is only one event thread and it is waiting for the root region to be assigned.
[jira] [Resolved] (HBASE-5073) Registered listeners not getting removed leading to memory leak in HBaseAdmin
[ https://issues.apache.org/jira/browse/HBASE-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan resolved HBASE-5073.
---
Resolution: Fixed

Committed to branch, hence resolving.

Registered listeners not getting removed leading to memory leak in HBaseAdmin
---
Key: HBASE-5073
URL: https://issues.apache.org/jira/browse/HBASE-5073
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.4
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 0.90.6
Attachments: HBASE-5073.patch

HBaseAdmin APIs like tableExists(), flush, split and closeRegion use the catalog tracker. Every time, the root node tracker and the meta node tracker are started and a listener is registered. But after the operations are performed, the listeners are not removed. Hence, if the admin APIs are used repeatedly, this may lead to a memory leak.
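The leak pattern described above, and the usual remedy of unregistering in a finally block, can be sketched as follows. This is an assumption-laden illustration, not the committed patch: `ListenerRegistry` and `AdminCallSketch` are hypothetical names, and the "catalog lookup" is a placeholder.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical holder of registered listeners; not the real ZooKeeper watcher class. */
final class ListenerRegistry {
    private final List<Object> listeners = new CopyOnWriteArrayList<>();

    void register(Object listener) { listeners.add(listener); }

    void unregister(Object listener) { listeners.remove(listener); }

    int listenerCount() { return listeners.size(); }
}

final class AdminCallSketch {
    /** Registers a per-call tracker and always removes it again, even if the lookup throws. */
    static boolean tableExists(ListenerRegistry registry, String table) {
        Object tracker = new Object(); // stands in for a root/meta node tracker
        registry.register(tracker);
        try {
            return !table.isEmpty(); // placeholder for the real catalog lookup
        } finally {
            registry.unregister(tracker);
        }
    }

    public static void main(String[] args) {
        ListenerRegistry registry = new ListenerRegistry();
        for (int i = 0; i < 1000; i++) {
            tableExists(registry, "table" + i);
        }
        // Without the finally block, the registry would now hold 1000 listeners.
        System.out.println(registry.listenerCount());
    }
}
```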
[jira] [Updated] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further
[ https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-5009:
---
Status: Open (was: Patch Available)

Cancelling and resubmitting as the tests hung.

Failure of creating split dir if it already exists prevents splits from happening further
---
Key: HBASE-5009
URL: https://issues.apache.org/jira/browse/HBASE-5009
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.6
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Attachments: HBASE-5009.patch, HBASE-5009_Branch90.patch

The scenario is:
- The split of a region takes a long time.
- The deletion of the splitDir fails due to HDFS problems.
- Subsequent splits also fail after that.
{code}
private static void createSplitDir(final FileSystem fs, final Path splitdir)
    throws IOException {
  if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? " + splitdir);
  if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " + splitdir);
}
{code}
Correct me if I am wrong. If it is an issue, can we change the behaviour of throwing an exception? Please suggest.
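One way to avoid failing forever on a leftover directory would be to clear it before creating it again. The sketch below shows that behaviour using the local filesystem through java.nio.file rather than the Hadoop FileSystem API, so it runs on its own. It assumes a leftover split directory is stale and safe to delete, which may not hold in HBase, and it is not the attached patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class SplitDirSketch {
    /** Removes a leftover split directory, if any, then creates a fresh one. */
    static void createSplitDir(Path splitDir) throws IOException {
        if (Files.exists(splitDir)) {
            // Assumption: whatever is left over from an earlier attempt is stale.
            List<Path> entries;
            try (Stream<Path> walk = Files.walk(splitDir)) {
                // Reverse order lists children before their parent directory.
                entries = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            }
            for (Path entry : entries) {
                Files.delete(entry);
            }
        }
        Files.createDirectories(splitDir);
    }

    /** Simulates a failed cleanup followed by a second split attempt. */
    static boolean demo() {
        try {
            Path splitDir = Files.createTempDirectory("region").resolve("splits");
            createSplitDir(splitDir);                       // first split attempt
            Files.createFile(splitDir.resolve("leftover")); // cleanup "failed"
            createSplitDir(splitDir);                       // second attempt must not throw
            return Files.isDirectory(splitDir) && !Files.exists(splitDir.resolve("leftover"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```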
[jira] [Updated] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further
[ https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-5009:
---
Fix Version/s: 0.90.6
               0.92.1
Affects Version/s: (was: 0.90.6)
Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root
[ https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176430#comment-13176430 ]

Jimmy Xiang commented on HBASE-5099:
---
This is good. If we introduce a timeout, I prefer to do it for tryRecoveringExpiredZKSession(). The reason is that, besides waitForAssignment(), there are several other places that have waiting logic as well, such as bulkAssign(), waitForRoot(), this.activeMasterManager.blockUntilBecomingActiveMaster(startupStatus), etc.
[jira] [Commented] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further
[ https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176433#comment-13176433 ]

Zhihong Yu commented on HBASE-5009:
---
Pressing 'Submit' again wouldn't run the tests. This is because Hadoop QA remembers the attachment Id of the patch for which the test suite was executed. Please attach the same copy of the patch for TRUNK again.
[jira] [Updated] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further
[ https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5009:
---
Attachment: 5009.txt
[jira] [Updated] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further
[ https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5009:
---
Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further
[ https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5009:
---
Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-5097) Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.
[ https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176440#comment-13176440 ]

Lars Hofhansl commented on HBASE-5097:
---
Hmmm. There is no default implementation for an interface. Are you referring to the default implementation generated for you by Eclipse?
I just don't think this is a bug.
[jira] [Commented] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root
[ https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176447#comment-13176447 ]

Jimmy Xiang commented on HBASE-5099:
---
tryRecoveringExpiredZKSession() is only called by abortNow(), which is called by abort(), which is called by the eventThread. I was thinking of running this whole method in another thread with an executor service and timing it out after a certain period, for example 5 minutes, then failing the recovery and letting the master abort. This way, we don't have to add a timeout to all the methods. The regular master startup, which calls assignRootAndMeta() too, is not impacted.

However, if we know that most likely only waitForAssignment() takes a long time, we can add a timeout to this method only. But I am not so sure.
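The executor-service idea from the comment above could look roughly like this. It is a sketch under stated assumptions: `recoverWithTimeout` and the `Callable` passed to it are hypothetical placeholders for tryRecoveringExpiredZKSession(), and returning false stands in for "fail the recovery and let the master abort".

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class RecoveryTimeoutSketch {
    /**
     * Runs the recovery on a separate thread so the calling (event) thread is
     * blocked for at most the given timeout. Returns false if the recovery
     * fails, throws, or does not finish in time.
     */
    static boolean recoverWithTimeout(Callable<Boolean> recovery, long timeout, TimeUnit unit) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<Boolean> result = executor.submit(recovery);
        try {
            return Boolean.TRUE.equals(result.get(timeout, unit));
        } catch (TimeoutException | ExecutionException e) {
            return false; // caller would abort the master here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            result.cancel(true);    // interrupt a recovery that is still running
            executor.shutdownNow(); // do not leave the worker thread behind
        }
    }

    public static void main(String[] args) {
        // A recovery that finishes quickly.
        System.out.println(recoverWithTimeout(() -> true, 1, TimeUnit.SECONDS));
        // A recovery that hangs longer than the timeout.
        System.out.println(recoverWithTimeout(() -> {
            Thread.sleep(5_000);
            return true;
        }, 100, TimeUnit.MILLISECONDS));
    }
}
```

The trade-off is the one raised in the thread: a single timeout around the whole recovery covers every wait inside it, at the cost of not knowing which wait was the slow one.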
[jira] [Commented] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root
[ https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176454#comment-13176454 ]

Zhihong Yu commented on HBASE-5099:
---
Timing out tryRecoveringExpiredZKSession() is good.
We can introduce a separate thread to carry the current logic of tryRecoveringExpiredZKSession() and monitor this thread in eventThread.
[jira] [Commented] (HBASE-5097) Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.
[ https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176456#comment-13176456 ]

ramkrishna.s.vasudevan commented on HBASE-5097:
---
@Lars
OK Lars, I accept that too.
[jira] [Commented] (HBASE-5094) The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.
[ https://issues.apache.org/jira/browse/HBASE-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176458#comment-13176458 ]

Zhihong Yu commented on HBASE-5094:
---
+1 on patch.
[jira] [Commented] (HBASE-4529) Make WAL Pluggable
[ https://issues.apache.org/jira/browse/HBASE-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176459#comment-13176459 ]

dhruba borthakur commented on HBASE-4529:
---
This will be a great feature!
HDFS provides I/O fencing for the HBase log. When a regionserver dies, the master renames the log directory so that the old regionserver cannot continue to write any new transactions to the transaction log. (HDFS provides an API that fails a file-creation request if an intermediate path component is nonexistent.) It would be nice for the pluggable WAL API to support this type of operation.

Make WAL Pluggable
---
Key: HBASE-4529
URL: https://issues.apache.org/jira/browse/HBASE-4529
Project: HBase
Issue Type: Improvement
Components: regionserver, wal
Reporter: Akash Ashok
Assignee: Akash Ashok
Labels: regionserver, wal

Make WAL a pluggable, configurable component, thus making it easier to write to different filesystems (including multiple filesystems).

From Stack: A pluggable WAL component would need to check that the split can deal with multiple logs written concurrently by the one server (sort by sequence edit id after sorting on all the rest that makes up a WAL log key).

From Jesse Yates: It would be nice to be able to tie a pluggable WAL component into a service that logs directly to disk rather than going through HDFS, giving some potentially awesome speedup at the cost of having to write a logging service that handles replication, etc.

From Karthik Tunga: Along with the log replaying part, logic is also needed for log roll. This, I think, is easier compared to the merging of the logs. Any edits less than the last sequence number on the file system can be removed from all the WALs.
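Stack's point about replaying several logs from one server in sequence-id order can be pictured as a k-way merge. The sketch below is only an illustration of that ordering step: `Edit`, `Cursor` and `merge` are hypothetical names, each input log is assumed to be already ordered by sequence id, and the rest of the WAL key that Stack mentions is ignored.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

final class WalMergeSketch {
    /** Hypothetical WAL entry; only the sequence id matters for ordering here. */
    static final class Edit {
        final long sequenceId;
        final String payload;

        Edit(long sequenceId, String payload) {
            this.sequenceId = sequenceId;
            this.payload = payload;
        }
    }

    /** Position inside one log: the current head edit plus the remaining edits. */
    private static final class Cursor {
        Edit head;
        final Iterator<Edit> rest;

        Cursor(Iterator<Edit> rest) {
            this.rest = rest;
            this.head = rest.next();
        }
    }

    /** Merges logs that are each ordered by sequence id into one ordered list. */
    static List<Edit> merge(List<List<Edit>> logs) {
        PriorityQueue<Cursor> queue =
            new PriorityQueue<>(Comparator.comparingLong((Cursor c) -> c.head.sequenceId));
        for (List<Edit> log : logs) {
            if (!log.isEmpty()) {
                queue.add(new Cursor(log.iterator()));
            }
        }
        List<Edit> merged = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor cursor = queue.poll();   // log whose head has the smallest sequence id
            merged.add(cursor.head);
            if (cursor.rest.hasNext()) {
                cursor.head = cursor.rest.next();
                queue.add(cursor);          // re-insert with its new head
            }
        }
        return merged;
    }
}
```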
[jira] [Created] (HBASE-5100) Rollback of split would cause closed region to opened
Rollback of split would cause closed region to opened
--
Key: HBASE-5100
URL: https://issues.apache.org/jira/browse/HBASE-5100
Project: HBase
Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen

If the master sends a close-region request to the region server while that region's split transaction is running, the closed region may be reopened. See the code in SplitTransaction#createDaughters:

{code}
List<StoreFile> hstoreFilesToSplit = null;
try{
  hstoreFilesToSplit = this.parent.close(false);
  if (hstoreFilesToSplit == null) {
    // The region was closed by a concurrent thread. We can't continue
    // with the split, instead we must just abandon the split. If we
    // reopen or split this could cause problems because the region has
    // probably already been moved to a different server, or is in the
    // process of moving to a different server.
    throw new IOException("Failed to close region: already closed by " +
      "another thread");
  }
} finally {
  this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
}
{code}

When rolling back, the JournalEntry.CLOSED_PARENT_REGION entry causes this.parent.initialize() to be called. Although the region is not brought online in the region server, this can cause problems. For example, in our environment the closed parent region was rolled back successfully and then started compaction and split again.

The parent region is f892dd6107b6b4130199582abc78e9c1

master log
{code}
2011-12-26 00:24:42,693 INFO org.apache.hadoop.hbase.master.HMaster: balance hri=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1., src=dw87.kgb.sqa.cm4,60020,1324827866085, dest=dw80.kgb.sqa.cm4,60020,1324827865780
2011-12-26 00:24:42,693 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1. (offlining)
2011-12-26 00:24:42,694 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to serverName=dw87.kgb.sqa.cm4,60020,1324827866085, load=(requests=0, regions=0, usedHeap=0, maxHeap=0) for region writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
2011-12-26 00:24:42,699 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned node: /hbase-tbfs/unassigned/f892dd6107b6b4130199582abc78e9c1 (region=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1., server=dw87.kgb.sqa.cm4,60020,1324827866085, state=RS_ZK_REGION_CLOSING)
2011-12-26 00:24:42,699 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSING, server=dw87.kgb.sqa.cm4,60020,1324827866085, region=f892dd6107b6b4130199582abc78e9c1
2011-12-26 00:24:45,348 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSED, server=dw87.kgb.sqa.cm4,60020,1324827866085, region=f892dd6107b6b4130199582abc78e9c1
2011-12-26 00:24:45,349 DEBUG org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED event for f892dd6107b6b4130199582abc78e9c1
2011-12-26 00:24:45,349 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1. state=CLOSED, ts=1324830285347
2011-12-26 00:24:45,349 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:6-0x13447f283f40e73 Creating (or updating) unassigned node for f892dd6107b6b4130199582abc78e9c1 with OFFLINE state
2011-12-26 00:24:45,354 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=M_ZK_REGION_OFFLINE, server=dw75.kgb.sqa.cm4:6, region=f892dd6107b6b4130199582abc78e9c1
2011-12-26 00:24:45,354 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan for writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1. destination server is + serverName=dw80.kgb.sqa.cm4,60020,1324827865780, load=(requests=0, regions=1, usedHeap=0, maxHeap=0)
2011-12-26 00:24:45,354 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for region writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.;
[jira] [Commented] (HBASE-5100) Rollback of split would cause closed region to opened
[ https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176465#comment-13176465 ]

chunhui shen commented on HBASE-5100:
-
When this.parent.close(false) returns null (it means the region has already been closed), we needn't add the JournalEntry.CLOSED_PARENT_REGION
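The suggestion in this comment can be illustrated with a minimal, self-contained sketch. It is not the attached patch: the class `SplitJournalSketch`, the `Region` interface and the flag name `alreadyClosed` are hypothetical stand-ins for SplitTransaction, HRegion and whatever the real change uses. The idea is only that the CLOSED_PARENT_REGION journal entry is skipped when close() reports that another thread already closed the region, so a later rollback has no reason to reinitialize it.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for SplitTransaction; not the HBase class. */
class SplitJournalSketch {
  enum JournalEntry { CLOSED_PARENT_REGION }

  /** Hypothetical stand-in for the parent region. */
  interface Region {
    /** Returns the store files, or null if another thread already closed the region. */
    List<String> close() throws IOException;
  }

  final List<JournalEntry> journal = new ArrayList<>();

  List<String> closeParent(Region parent) throws IOException {
    boolean alreadyClosed = false; // assumed flag name
    try {
      List<String> files = parent.close();
      if (files == null) {
        // Closed by a concurrent thread: abandon the split.
        alreadyClosed = true;
        throw new IOException("Failed to close region: already closed by another thread");
      }
      return files;
    } finally {
      // Journal the close unless someone else closed the region first.
      // Any other failure is still journaled, as in the original finally block.
      if (!alreadyClosed) {
        journal.add(JournalEntry.CLOSED_PARENT_REGION);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    SplitJournalSketch normal = new SplitJournalSketch();
    normal.closeParent(() -> Arrays.asList("storefile-1"));
    System.out.println("journal after normal close: " + normal.journal);

    SplitJournalSketch raced = new SplitJournalSketch();
    try {
      raced.closeParent(() -> null);
    } catch (IOException expected) {
      System.out.println("journal after concurrent close: " + raced.journal);
    }
  }
}
```

Keeping the journal write in the finally block preserves the existing behaviour for a close() that fails part-way, which is the case the original code appears to guard against.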
[jira] [Commented] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further
[ https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176467#comment-13176467 ]

Zhihong Yu commented on HBASE-5009:
---
Patch for 0.90 passes tests:
{code}
Tests run: 702, Failures: 0, Errors: 0, Skipped: 9
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 1:18:08.131s
{code}
Integrated to 0.90 branch.

Failure of creating split dir if it already exists prevents splits from happening further
-
Key: HBASE-5009
URL: https://issues.apache.org/jira/browse/HBASE-5009
Project: HBase
Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 0.92.1, 0.90.6
Attachments: 5009.txt, HBASE-5009.patch, HBASE-5009_Branch90.patch

The scenario is:
- The split of a region takes a long time.
- The deletion of the splitDir fails due to HDFS problems.
- Subsequent splits also fail after that.

{code}
private static void createSplitDir(final FileSystem fs, final Path splitdir)
throws IOException {
  if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? " + splitdir);
  if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " + splitdir);
}
{code}

Correct me if I am wrong. If this is an issue, can we change the behaviour of throwing an exception? Please suggest.
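One way to avoid the repeated failure described above is to clean up a leftover split directory instead of throwing. The sketch below only illustrates that idea and is not the committed patch, whose contents are not shown in this thread. It uses java.nio.file in place of the Hadoop FileSystem API so that it runs standalone, and the class name `SplitDirSketch` is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch; the real code works against org.apache.hadoop.fs.FileSystem. */
class SplitDirSketch {

  /** Creates splitdir, first removing any leftover from an earlier failed split. */
  static void createSplitDir(Path splitdir) throws IOException {
    if (Files.exists(splitdir)) {
      List<Path> leftovers;
      try (Stream<Path> walk = Files.walk(splitdir)) {
        // Reverse order puts children before their parent directories.
        leftovers = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
      }
      for (Path p : leftovers) {
        Files.delete(p);
      }
    }
    Files.createDirectories(splitdir);
  }

  public static void main(String[] args) throws IOException {
    Path base = Files.createTempDirectory("region");
    Path splitdir = base.resolve("splits");

    // Simulate a split whose cleanup failed and left files behind.
    Files.createDirectories(splitdir);
    Files.createFile(splitdir.resolve("stale-daughter"));

    createSplitDir(splitdir);
    System.out.println("stale file still present: "
        + Files.exists(splitdir.resolve("stale-daughter")));
  }
}
```

Whether silently deleting the old directory is safe depends on nothing else using it at that moment, which is a question for the actual patch review rather than this sketch.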
[jira] [Issue Comment Edited] (HBASE-5100) Rollback of split would cause closed region to opened
[ https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176465#comment-13176465 ]

Zhihong Yu edited comment on HBASE-5100 at 12/28/11 6:01 AM:
-
When this.parent.close(false) returns null (it means the region has already been closed), we needn't add the JournalEntry.CLOSED_PARENT_REGION

was (Author: zjushch):
When this.parent.close(false) returns null (it means the region has already been closed), we needn't add the JournalEntry.CLOSED_PARENT_REGION
[jira] [Commented] (HBASE-5100) Rollback of split would cause closed region to opened
[ https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176473#comment-13176473 ]

Zhihong Yu commented on HBASE-5100:
---
@Chunhui: Do you have a patch?
[jira] [Updated] (HBASE-5100) Rollback of split would cause closed region to opened
[ https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5100:
Attachment: hbase-5100.patch
[jira] [Updated] (HBASE-5094) The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.
[ https://issues.apache.org/jira/browse/HBASE-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5094:
--
Comment: was deleted (was: +1 on patch.)

The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.

Key: HBASE-5094
URL: https://issues.apache.org/jira/browse/HBASE-5094
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: ramkrishna.s.vasudevan
Priority: Critical
Attachments: 5094.patch

{code}
RegionState rit = this.services.getAssignmentManager().isRegionInTransition(e.getKey());
ServerName addressFromAM = this.services.getAssignmentManager()
    .getRegionServerOfRegion(e.getKey());
if (rit != null && !rit.isClosing() && !rit.isPendingClose()) {
  // Skip regions that were in transition unless CLOSING or
  // PENDING_CLOSE
  LOG.info("Skip assigning region " + rit.toString());
} else if (addressFromAM != null
    && !addressFromAM.equals(this.serverName)) {
  LOG.debug("Skip assigning region "
      + e.getKey().getRegionNameAsString()
      + " because it has been opened in "
      + addressFromAM.getServerName());
}
{code}

In ServerShutdownHandler we try to get the region's address from the AssignmentManager (AM). This address is initially null because it has not yet been updated after the region was opened, i.e. the callback after node deletion has not yet run on the master side. But the removal from regions in transition (RIT) has already completed on the master side, so this triggers a new assignment. There is therefore a small window between the moment the region is actually added to the online list and the moment ServerShutdownHandler checks the existing address in the AM.
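The window described above can be made concrete with a small decision function. This is a simplified model, not the ServerShutdownHandler code: the class `ShutdownAssignSketch`, its boolean and string parameters, and the final ASSIGN branch are assumptions standing in for the RegionState and ServerName lookups and for the reassignment the description says is triggered.

```java
/** Hypothetical model of the skip-or-assign decision for one region. */
class ShutdownAssignSketch {
  enum Decision { SKIP_IN_TRANSITION, SKIP_OPENED_ELSEWHERE, ASSIGN }

  /**
   * @param inTransition          the AM still lists the region in RIT
   * @param closingOrPendingClose the RIT state is CLOSING or PENDING_CLOSE
   * @param addressFromAM         server the AM records for the region, or null
   * @param deadServer            the server being shut down
   */
  static Decision decide(boolean inTransition, boolean closingOrPendingClose,
                         String addressFromAM, String deadServer) {
    if (inTransition && !closingOrPendingClose) {
      return Decision.SKIP_IN_TRANSITION;
    }
    if (addressFromAM != null && !addressFromAM.equals(deadServer)) {
      return Decision.SKIP_OPENED_ELSEWHERE;
    }
    // Neither guard matched, so the region is treated as needing assignment.
    return Decision.ASSIGN;
  }

  public static void main(String[] args) {
    // Inside the window: already removed from RIT, address not yet recorded.
    System.out.println(decide(false, false, null, "rs-dead"));
    // After the callback has recorded the new server.
    System.out.println(decide(false, false, "rs-new", "rs-dead"));
  }
}
```

In this model the first call returns ASSIGN even though the region has in fact been opened on another server, which is the double assignment the issue reports.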
[jira] [Commented] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further
[ https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176478#comment-13176478 ]

Hadoop QA commented on HBASE-5009:
--
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12508723/5009.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

-1 javadoc. The javadoc tool appears to have generated -151 warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

-1 findbugs. The patch appears to introduce 76 new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.mapred.TestTableMapReduce
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/606//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/606//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/606//console

This message is automatically generated.
[jira] [Commented] (HBASE-5100) Rollback of split would cause closed region to opened
[ https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176480#comment-13176480 ]

chunhui shen commented on HBASE-5100:
-
If it returns non-null, exceptionEncountered is also false. What about naming the boolean alreadyClosed, or closedBefore, or something similar?
[jira] [Commented] (HBASE-3741) Make HRegionServer aware of the regions it's opening/closing
[ https://issues.apache.org/jira/browse/HBASE-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176498#comment-13176498 ]
johnyang commented on HBASE-3741:
Upgrading to 0.90.4 fixes this problem.
Make HRegionServer aware of the regions it's opening/closing
Key: HBASE-3741
URL: https://issues.apache.org/jira/browse/HBASE-3741
Project: HBase
Issue Type: Bug
Affects Versions: 0.90.1
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
Fix For: 0.90.3
Attachments: HBASE-3741-rsfix-v2.patch, HBASE-3741-rsfix-v3.patch, HBASE-3741-rsfix.patch, HBASE-3741-trunk.patch
This is a serious issue: a race between regions being opened and closed in region servers. We had a situation where the master tried to unassign a region for balancing, failed, force unassigned it, force assigned it somewhere else, failed to open it on another region server (it took too long), and then reassigned it back to the original region server. A few seconds later, the region server processed the first close request and the region was left unassigned.
This is from the master log:
{quote}
11-04-05 15:11:17,758 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to serverName=sv4borg42,60020,1300920459477, load=(requests=187, regions=574, usedHeap=3918, maxHeap=6973) for region stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
2011-04-05 15:12:10,021 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961 state=PENDING_CLOSE, ts=1302041477758
2011-04-05 15:12:10,021 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
...
2011-04-05 15:14:45,783 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961 state=CLOSED, ts=1302041685733
2011-04-05 15:14:45,783 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:6-0x42ec2cece810b68 Creating (or updating) unassigned node for 1470298961 with OFFLINE state
...
2011-04-05 15:14:45,885 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for region stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961; plan=hri=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961, src=sv4borg42,60020,1300920459477, dest=sv4borg40,60020,1302041218196
2011-04-05 15:14:45,885 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961 to sv4borg40,60020,1302041218196
2011-04-05 15:15:39,410 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961 state=PENDING_OPEN, ts=1302041700944
2011-04-05 15:15:39,410 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_OPEN for too long, reassigning region=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
2011-04-05 15:15:39,410 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961 state=PENDING_OPEN, ts=1302041700944
...
2011-04-05 15:15:39,410 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961 so generated a random one; hri=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961, src=, dest=sv4borg42,60020,1300920459477; 19 (online=19, exclude=null) available servers
2011-04-05 15:15:39,410 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961 to sv4borg42,60020,1300920459477
2011-04-05 15:15:40,951 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: master:6-0x42ec2cece810b68 Received ZooKeeper Event,