[jira] [Commented] (HBASE-5096) Replication does not handle deletes correctly.

2011-12-27 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176147#comment-13176147
 ] 

Hudson commented on HBASE-5096:
---

Integrated in HBase-0.92-security #49 (See 
[https://builds.apache.org/job/HBase-0.92-security/49/])
HBASE-5096  Replication does not handle deletes correctly. (Lars H)

larsh : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/replication/TestReplication.java


 Replication does not handle deletes correctly.
 --

 Key: HBASE-5096
 URL: https://issues.apache.org/jira/browse/HBASE-5096
 Project: HBase
  Issue Type: Sub-task
  Components: replication
Affects Versions: 0.94.0, 0.92.1
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.0, 0.92.1

 Attachments: 5096.txt


 Teruyoshi Zenmyo discovered this problem.
 The problem turns out to be this code in ReplicationSink.java:
 {code}
 if (kvs.get(0).isDelete()) {
   ...
   if (kv.isDeleteFamily()) {
     delete.deleteFamily(kv.getFamily());
   } else if (!kv.isEmptyColumn()) {
     delete.deleteColumn(kv.getFamily(), kv.getQualifier());
   }
 }
 ...
 {code}
 So the code deals with family delete markers and then assumes that anything 
 that is not a family delete marker must be a version delete marker.
 (deleteColumn sets a version delete marker, deleteColumns sets a column 
 delete marker.)
 I.e. column delete markers are not replicated correctly.
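
A minimal sketch of the distinction described above, assuming a hypothetical `MarkerType` enum in place of the real KeyValue type checks. It only maps each of the three marker kinds to the `Delete` method named in this report (`deleteFamily`, `deleteColumns`, `deleteColumn`). The class and method names are made up for illustration, and this is not the code from the attached patch.

```java
public class DeleteMarkerReplay {
    /** Hypothetical stand-in for the three kinds of delete marker. */
    public enum MarkerType { FAMILY, COLUMN, VERSION }

    /**
     * Returns the name of the Delete method that would preserve the
     * marker kind on the sink side, per the description in this issue.
     */
    public static String sinkCallFor(MarkerType type) {
        switch (type) {
            case FAMILY:
                return "deleteFamily";
            case COLUMN:
                // The case the reported code misses: it falls through
                // to deleteColumn and downgrades the marker.
                return "deleteColumns";
            case VERSION:
                return "deleteColumn";
            default:
                throw new IllegalArgumentException("unknown marker: " + type);
        }
    }

    public static void main(String[] args) {
        for (MarkerType t : MarkerType.values()) {
            System.out.println(t + " -> " + sinkCallFor(t));
        }
    }
}
```

The point of the three-way branch is that a two-way if/else cannot represent a column delete marker at all.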

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3924) Improve Shell's CLI help

2011-12-27 Thread Harsh J (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HBASE-3924:
---

Attachment: HBASE-3924.patch

Patch that addresses Lars' comments.

{code}
HBase Shell command-line options:
 script-file [script-options] Script to run, along with its arguments.

 --format=OPTION              Formatter for outputting results.
                              Valid options are: console, html.
                              (Default: console)

 -d | --debug                 Set DEBUG log levels.
 -h | --help                  This help.
{code}

 Improve Shell's CLI help
 

 Key: HBASE-3924
 URL: https://issues.apache.org/jira/browse/HBASE-3924
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
 Attachments: HBASE-3924.patch


 In the hirb.rb source we have
 {noformat}
 # so they don't go through to irb.  Output shell 'usage' if user types '--help'
 cmdline_help = <<HERE # HERE document output as shell usage
 HBase Shell command-line options:
  format        Formatter for outputting results: console | html.
                Default: console
  -d | --debug  Set DEBUG log levels.
 HERE
 found = []
 format = 'console'
 script2run = nil
 log_level = org.apache.log4j.Level::ERROR
 for arg in ARGV
   if arg =~ /^--format=(.+)/i
     format = $1
     if format =~ /^html$/i
       raise NoMethodError.new("Not yet implemented")
     elsif format =~ /^console$/i
       # This is default
     else
       raise ArgumentError.new("Unsupported format " + arg)
     end
     found.push(arg)
   elsif arg == '-h' || arg == '--help'
     puts cmdline_help
     exit
   elsif arg == '-d' || arg == '--debug'
     log_level = org.apache.log4j.Level::DEBUG
     $fullBackTrace = true
     puts "Setting DEBUG log level..."
   else
     # Presume it a script. Save it off for running later below
     # after we've set up some environment.
     script2run = arg
     found.push(arg)
     # Presume that any other args are meant for the script.
     break
   end
 end
 {noformat}
 We should enhance the help printed when using -h/--help to look like this?
 {noformat}
 cmdline_help = <<HERE # HERE document output as shell usage
 HBase Shell command-line options:
  --format={console|html}  Formatter for outputting results.
                           Default: console
  -d | --debug             Set DEBUG log levels.
  -h | --help              This help.
  script-filename [script-options]
 HERE
 {noformat}





[jira] [Assigned] (HBASE-3274) Replace all config properties references in code with string constants

2011-12-27 Thread Harsh J (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J reassigned HBASE-3274:
--

Assignee: Harsh J

 Replace all config properties references in code with string constants
 --

 Key: HBASE-3274
 URL: https://issues.apache.org/jira/browse/HBASE-3274
 Project: HBase
  Issue Type: Improvement
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial

 See HBASE-2721 for details. We have fixed the default values in HBASE-3272 
 but we should also follow Hadoop to remove all hardcoded strings that refer 
 to configuration properties and move them to HConstants. 
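
A minimal sketch of the pattern being asked for, assuming a hypothetical `ConfigKeys` class standing in for HConstants and `java.util.Properties` standing in for the HBase configuration object. The property name and default value are used only for illustration.

```java
import java.util.Properties;

public class ConfigKeys {
    /** Property name defined once, instead of repeated as a literal. */
    public static final String ZOOKEEPER_QUORUM = "hbase.zookeeper.quorum";

    /** Illustrative default kept next to the key it belongs to. */
    public static final String DEFAULT_ZOOKEEPER_QUORUM = "localhost";

    /** Callers reference the constants, never the raw string. */
    public static String quorum(Properties conf) {
        return conf.getProperty(ZOOKEEPER_QUORUM, DEFAULT_ZOOKEEPER_QUORUM);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(quorum(conf));
        conf.setProperty(ZOOKEEPER_QUORUM, "zk1,zk2,zk3");
        System.out.println(quorum(conf));
    }
}
```

With the key in one place, a typo in a property name becomes a compile error rather than a silently ignored setting.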





[jira] [Commented] (HBASE-5097) Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.

2011-12-27 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176117#comment-13176117
 ] 

Andrew Purtell commented on HBASE-5097:
---

Are you inheriting from {{BaseRegionObserver}}? I'd guess not?

 Coprocessor RegionObserver implementation without preScannerOpen and 
 postScannerOpen Impl is throwing NPE and so failing the system initialization.
 ---

 Key: HBASE-5097
 URL: https://issues.apache.org/jira/browse/HBASE-5097
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Reporter: ramkrishna.s.vasudevan

 In HRegionServer.java openScanner()
 {code}
   r.prepareScanner(scan);
   RegionScanner s = null;
   if (r.getCoprocessorHost() != null) {
     s = r.getCoprocessorHost().preScannerOpen(scan);
   }
   if (s == null) {
     s = r.getScanner(scan);
   }
   if (r.getCoprocessorHost() != null) {
     s = r.getCoprocessorHost().postScannerOpen(scan, s);
   }
 {code}
 If we don't have an implementation for postScannerOpen, the RegionScanner is 
 null, and so a NullPointerException is thrown:
 {code}
 java.lang.NullPointerException
   at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
 {code}
 Marking this defect as a blocker. Please feel free to change the priority if 
 I am wrong.  Also correct me if my way of trying out coprocessors without 
 implementing postScannerOpen is wrong.  I am just a learner.
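
A minimal sketch of why the hook's return value matters, assuming a hypothetical one-method `ScannerObserver` interface rather than the real RegionObserver. `BaseObserver` here plays the role suggested for `BaseRegionObserver` in the comment above: its hook hands back the scanner it received. The only real API used is `ConcurrentHashMap.put`, which rejects null values.

```java
import java.util.concurrent.ConcurrentHashMap;

public class PostHookPassThrough {
    /** Hypothetical, simplified observer hook; not the real HBase interface. */
    public interface ScannerObserver {
        Object postScannerOpen(Object scanner);
    }

    /** Base class whose hook returns the scanner it was given. */
    public static class BaseObserver implements ScannerObserver {
        @Override
        public Object postScannerOpen(Object scanner) {
            return scanner;
        }
    }

    /** An observer written from scratch whose hook returns null. */
    public static class NullReturningObserver implements ScannerObserver {
        @Override
        public Object postScannerOpen(Object scanner) {
            return null;
        }
    }

    /** Registers the hook's result; returns false if the map rejects it. */
    public static boolean register(ScannerObserver observer) {
        ConcurrentHashMap<String, Object> scanners = new ConcurrentHashMap<>();
        Object s = observer.postScannerOpen(new Object());
        try {
            scanners.put("scanner-1", s); // throws NullPointerException if s is null
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("base observer registered: " + register(new BaseObserver()));
        System.out.println("null observer registered: " + register(new NullReturningObserver()));
    }
}
```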





[jira] [Updated] (HBASE-4009) Script to patch holes in .META. table

2011-12-27 Thread Lars George (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars George updated HBASE-4009:
---

Attachment: add_empty_region.rb

Adds an empty region at a given start and end key. 

 Script to patch holes in .META. table
 -

 Key: HBASE-4009
 URL: https://issues.apache.org/jira/browse/HBASE-4009
 Project: HBase
  Issue Type: New Feature
  Components: shell
 Environment: CDH3U0
Reporter: Lars George
Priority: Trivial
 Attachments: add_empty_region.rb, patch_meta.rb


 I need a script to patch holes in the .META. table, which was corrupted by 
 an earlier issue on the cluster.





[jira] [Updated] (HBASE-3924) Improve Shell's CLI help

2011-12-27 Thread Harsh J (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HBASE-3924:
---

Fix Version/s: 0.94.0
   0.92.0
   Status: Patch Available  (was: Open)

Patch was against trunk, but should apply to both 0.92 and trunk. Would be good 
to have in a further 0.92 rev.

 Improve Shell's CLI help
 

 Key: HBASE-3924
 URL: https://issues.apache.org/jira/browse/HBASE-3924
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-3924.patch






[jira] [Commented] (HBASE-3924) Improve Shell's CLI help

2011-12-27 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176182#comment-13176182
 ] 

Hadoop QA commented on HBASE-3924:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508673/HBASE-3924.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified tests.
    Please justify why no new tests are needed for this patch.
    Also please list what manual steps were performed to verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -151 warning messages.

+1 javac.  The applied patch does not increase the total number of javac compiler warnings.

-1 findbugs.  The patch appears to introduce 77 new Findbugs (version 1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of release audit warnings.

-1 core tests.  The patch failed these unit tests:
    org.apache.hadoop.hbase.regionserver.wal.TestLogRolling
    org.apache.hadoop.hbase.mapred.TestTableMapReduce
    org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/604//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/604//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/604//console

This message is automatically generated.

 Improve Shell's CLI help
 

 Key: HBASE-3924
 URL: https://issues.apache.org/jira/browse/HBASE-3924
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-3924.patch






[jira] [Commented] (HBASE-3274) Replace all config properties references in code with string constants

2011-12-27 Thread Harsh J (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176204#comment-13176204
 ] 

Harsh J commented on HBASE-3274:


I'm recording all the strange param names I encounter as I do this at 
https://gist.github.com/76416a2211ece8edb95a

Meanwhile, am hoping no more patches get in with config names as strings… :)

 Replace all config properties references in code with string constants
 --

 Key: HBASE-3274
 URL: https://issues.apache.org/jira/browse/HBASE-3274
 Project: HBase
  Issue Type: Improvement
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
   Original Estimate: 168h
  Remaining Estimate: 168h





[jira] [Assigned] (HBASE-3924) Improve Shell's CLI help

2011-12-27 Thread Harsh J (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J reassigned HBASE-3924:
--

Assignee: Harsh J

 Improve Shell's CLI help
 

 Key: HBASE-3924
 URL: https://issues.apache.org/jira/browse/HBASE-3924
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial





[jira] [Commented] (HBASE-5097) Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.

2011-12-27 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176140#comment-13176140
 ] 

ramkrishna.s.vasudevan commented on HBASE-5097:
---

Ya.. My bad..:(..

 Coprocessor RegionObserver implementation without preScannerOpen and 
 postScannerOpen Impl is throwing NPE and so failing the system initialization.
 ---

 Key: HBASE-5097
 URL: https://issues.apache.org/jira/browse/HBASE-5097
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Reporter: ramkrishna.s.vasudevan





[jira] [Commented] (HBASE-4565) Maven HBase build broken on cygwin with copynativelib.sh call.

2011-12-27 Thread Akash Ashok (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176207#comment-13176207
 ] 

Akash Ashok commented on HBASE-4565:


Really helpful. Thanks a lot. 

Also, the patch seems to be failing on trunk. I am not sure if it is already out of 
sync with trunk.

$ patch -p0 < pom.patch
(Stripping trailing CRs from patch.)
patching file pom.xml
Hunk #1 FAILED at 696.
patch unexpectedly ends in middle of line
Hunk #2 succeeded at 1328 with fuzz 1 (offset 90 lines).
1 out of 2 hunks FAILED -- saving rejects to file pom.xml.rej

I manually patched it and it works beautifully. 

 Maven HBase build broken on cygwin with copynativelib.sh call.
 --

 Key: HBASE-4565
 URL: https://issues.apache.org/jira/browse/HBASE-4565
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.92.0
 Environment: cygwin (on xp and win7)
Reporter: Suraj Varma
Assignee: Suraj Varma
  Labels: build, maven
 Fix For: 0.94.0

 Attachments: HBASE-4565-0.92.patch, HBASE-4565-v2.patch, 
 HBASE-4565-v3-0.92.patch, HBASE-4565-v3.patch, HBASE-4565.patch


 This is broken in both 0.92 as well as trunk pom.xml
 Here's a sample maven log snippet from trunk (from Mayuresh on user mailing 
 list)
 [INFO] [antrun:run {execution: package}]
 [INFO] Executing tasks
 main:
[mkdir] Created dir: 
 D:\workspace\mkshirsa\hbase-trunk\target\hbase-0.93-SNAPSHOT\hbase-0.93-SNAPSHOT\lib\native\${build.platform}
 [exec] ls: cannot access D:workspacemkshirsahbase-trunktarget/nativelib: 
 No such file or directory
 [exec] tar (child): Cannot connect to D: resolve failed
 [INFO] 
 
 [ERROR] BUILD ERROR
 [INFO] 
 
 [INFO] An Ant BuildException has occured: exec returned: 3328
 There are two issues: 
 1) The antrun task below doesn't resolve the Windows file separator returned 
 by project.build.directory, which causes the "resolve failed" error above.
 <!-- Using Unix cp to preserve symlinks, using script to handle wildcards -->
 <echo file="${project.build.directory}/copynativelibs.sh">
 if [ `ls ${project.build.directory}/nativelib | wc -l` -ne 0]; then
 2) The tar argument value below has a similar issue in that the path arg 
 doesn't resolve correctly.
 <!-- Using Unix tar to preserve symlinks -->
 <exec executable="tar" failonerror="yes" 
 dir="${project.build.directory}/${project.artifactId}-${project.version}">
 <arg value="czf"/>
 <arg 
 value="/cygdrive/c/workspaces/hbase-0.92-svn/target/${project.artifactId}-${project.version}.tar.gz"/>
 <arg value="."/>
 </exec>
 In both cases, the fix would probably be to use a cross-platform way to 
 handle the directory locations. 
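
One way to picture a cross-platform path is a small normalization step, sketched below under stated assumptions. The helper `toCygwinPath` is hypothetical and is not what the attached patches do; it only shows how a Windows path such as the one in the log above could be turned into the `/cygdrive/...` form that Unix tools under cygwin accept.

```java
public class PathNormalizer {
    /**
     * Hypothetical helper: converts backslashes to forward slashes and
     * rewrites a leading drive letter ("D:") as "/cygdrive/d".
     */
    public static String toCygwinPath(String windowsPath) {
        String p = windowsPath.replace('\\', '/');
        if (p.length() >= 2 && Character.isLetter(p.charAt(0)) && p.charAt(1) == ':') {
            return "/cygdrive/" + Character.toLowerCase(p.charAt(0)) + p.substring(2);
        }
        return p; // already a Unix-style path
    }

    public static void main(String[] args) {
        System.out.println(toCygwinPath("D:\\workspace\\hbase-trunk\\target"));
        System.out.println(toCygwinPath("/usr/local/hbase/target"));
    }
}
```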





[jira] [Work started] (HBASE-3274) Replace all config properties references in code with string constants

2011-12-27 Thread Harsh J (Work started) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-3274 started by Harsh J.

 Replace all config properties references in code with string constants
 --

 Key: HBASE-3274
 URL: https://issues.apache.org/jira/browse/HBASE-3274
 Project: HBase
  Issue Type: Improvement
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
   Original Estimate: 168h
  Remaining Estimate: 168h





[jira] [Updated] (HBASE-3274) Replace all config properties references in code with string constants

2011-12-27 Thread Harsh J (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HBASE-3274:
---

Remaining Estimate: 168h
 Original Estimate: 168h

 Replace all config properties references in code with string constants
 --

 Key: HBASE-3274
 URL: https://issues.apache.org/jira/browse/HBASE-3274
 Project: HBase
  Issue Type: Improvement
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
   Original Estimate: 168h
  Remaining Estimate: 168h





[jira] [Updated] (HBASE-5073) Registered listeners not getting removed leading to memory leak in HBaseAdmin

2011-12-27 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5073:
--

Fix Version/s: (was: 0.90.5)
   0.90.6
 Hadoop Flags: Reviewed

 Registered listeners not getting removed leading to memory leak in HBaseAdmin
 -

 Key: HBASE-5073
 URL: https://issues.apache.org/jira/browse/HBASE-5073
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: HBASE-5073.patch


 HBaseAdmin APIs like tableExists(), flush, split and closeRegion use the 
 catalog tracker.  Every time, the root node tracker and meta node tracker are 
 started and a listener is registered.  But after the operations are performed 
 the listeners are not removed. Hence if the admin APIs are used repeatedly, 
 this may lead to a memory leak.  
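
A minimal sketch of the leak pattern described above and one common remedy, a try/finally that always stops the tracker. `Watcher`, `Tracker` and this `tableExists` are hypothetical stand-ins, not the HBase classes, and the attached patch may fix the problem differently.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ListenerCleanup {
    /** Hypothetical long-lived object that holds registered listeners. */
    public static class Watcher {
        private final List<Object> listeners = new CopyOnWriteArrayList<>();
        public void register(Object listener) { listeners.add(listener); }
        public void unregister(Object listener) { listeners.remove(listener); }
        public int listenerCount() { return listeners.size(); }
    }

    /** Hypothetical short-lived tracker that registers itself on start. */
    public static class Tracker {
        private final Watcher watcher;
        public Tracker(Watcher watcher) { this.watcher = watcher; }
        public void start() { watcher.register(this); }
        public void stop() { watcher.unregister(this); }
    }

    /**
     * Stand-in for an admin call. With cleanUp=false it mimics the
     * reported behavior: the tracker is started but never stopped.
     */
    public static boolean tableExists(Watcher watcher, boolean cleanUp) {
        Tracker tracker = new Tracker(watcher);
        tracker.start();
        try {
            return true; // placeholder for the real catalog lookup
        } finally {
            if (cleanUp) {
                tracker.stop();
            }
        }
    }

    public static void main(String[] args) {
        Watcher leaky = new Watcher();
        Watcher clean = new Watcher();
        for (int i = 0; i < 1000; i++) {
            tableExists(leaky, false);
            tableExists(clean, true);
        }
        System.out.println("without cleanup: " + leaky.listenerCount());
        System.out.println("with cleanup: " + clean.listenerCount());
    }
}
```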





[jira] [Commented] (HBASE-5073) Registered listeners not getting removed leading to memory leak in HBaseAdmin

2011-12-27 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176231#comment-13176231
 ] 

Zhihong Yu commented on HBASE-5073:
---

Integrated to 0.90 branch

Thanks for the patch, Ramkrishna.

Thanks for the review, Stack and Lars

 Registered listeners not getting removed leading to memory leak in HBaseAdmin
 -

 Key: HBASE-5073
 URL: https://issues.apache.org/jira/browse/HBASE-5073
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: HBASE-5073.patch






[jira] [Updated] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2011-12-27 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4720:
--

Attachment: (was: HBASE-4720.trunk.v2.patch)

 Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
 client/server 
 

 Key: HBASE-4720
 URL: https://issues.apache.org/jira/browse/HBASE-4720
 Project: HBase
  Issue Type: Improvement
Reporter: Daniel Lord
Assignee: Mubarak Seyed
 Fix For: 0.94.0

 Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
 HBASE-4720.v1.patch, HBASE-4720.v3.patch


 I have several large application/HBase clusters where an application node 
 will occasionally need to talk to HBase from a different cluster.  In order 
 to help ensure some of my consistency guarantees I have a sentinel table that 
 is updated atomically as users interact with the system.  This works quite 
 well for the regular hbase client but the REST client does not implement 
 the checkAndPut and checkAndDelete operations.  This exposes the application 
 to some race conditions that have to be worked around.  It would be ideal if 
 the same checkAndPut/checkAndDelete operations could be supported by the REST 
 client.
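
A minimal sketch of the check-and-put / check-and-delete semantics the request relies on, using an in-memory `ConcurrentHashMap` as a hypothetical stand-in for the sentinel table. It is not the HBase or REST client API; it only illustrates that the update is applied if and only if the current value matches the expected one, in a single atomic step.

```java
import java.util.concurrent.ConcurrentHashMap;

public class SentinelTable {
    private final ConcurrentHashMap<String, String> rows = new ConcurrentHashMap<>();

    /**
     * Writes update only if the row currently holds expected.
     * A null expected value means "the row must not exist yet".
     */
    public boolean checkAndPut(String row, String expected, String update) {
        if (expected == null) {
            return rows.putIfAbsent(row, update) == null;
        }
        return rows.replace(row, expected, update);
    }

    /** Deletes the row only if it currently holds expected. */
    public boolean checkAndDelete(String row, String expected) {
        return expected != null && rows.remove(row, expected);
    }

    public String get(String row) {
        return rows.get(row);
    }

    public static void main(String[] args) {
        SentinelTable table = new SentinelTable();
        System.out.println(table.checkAndPut("lock", null, "node-a"));     // row absent, succeeds
        System.out.println(table.checkAndPut("lock", null, "node-b"));     // row present, fails
        System.out.println(table.checkAndPut("lock", "node-a", "node-c")); // value matches, succeeds
        System.out.println(table.checkAndDelete("lock", "node-a"));        // stale value, fails
    }
}
```

Without an atomic check, a client has to read and then write in two steps, which is the race window the report mentions.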





[jira] [Updated] (HBASE-5094) The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.

2011-12-27 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5094:
--

Priority: Critical  (was: Major)

 The META can hold an entry for a region with a different server name from the 
 one actually in the AssignmentManager thus making the region inaccessible.
 

 Key: HBASE-5094
 URL: https://issues.apache.org/jira/browse/HBASE-5094
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: ramkrishna.s.vasudevan
Priority: Critical

 {code}
 RegionState rit =
     this.services.getAssignmentManager().isRegionInTransition(e.getKey());
 ServerName addressFromAM = this.services.getAssignmentManager()
     .getRegionServerOfRegion(e.getKey());
 if (rit != null && !rit.isClosing() && !rit.isPendingClose()) {
   // Skip regions that were in transition unless CLOSING or
   // PENDING_CLOSE
   LOG.info("Skip assigning region " + rit.toString());
 } else if (addressFromAM != null
     && !addressFromAM.equals(this.serverName)) {
   LOG.debug("Skip assigning region "
       + e.getKey().getRegionNameAsString()
       + " because it has been opened in "
       + addressFromAM.getServerName());
 }
 {code}
 In ServerShutdownHandler we try to get the address from the AM.  This address 
 is initially null because it has not yet been updated after the region was 
 opened, i.e. the callback after node deletion has not yet run on the master side.
 But removal from RIT has already completed on the master side, so this triggers 
 a new assignment.
 So there is a small window between the point where the online region is 
 actually added to the online list and the point where the ServerShutdownHandler 
 checks the existing address in the AM.
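
The window can be illustrated with a simplified model of the check. This sketch only mirrors the branch structure of the snippet quoted in the description; ReassignCheck, shouldReassign, and the plain string server names are hypothetical stand-ins, not the ServerShutdownHandler code.

```java
// Simplified, hypothetical model of the quoted check; not HBase code.
final class ReassignCheck {
    /**
     * @param inTransitionNotClosing true if the region is in transition and is
     *                               neither CLOSING nor PENDING_CLOSE
     * @param addressFromAM          server the AssignmentManager records for the
     *                               region, or null if nothing is recorded yet
     * @param deadServer             server currently being processed as dead
     * @return true if the handler would go on to assign the region
     */
    static boolean shouldReassign(boolean inTransitionNotClosing,
                                  String addressFromAM, String deadServer) {
        if (inTransitionNotClosing) {
            return false; // skipped: still in transition
        }
        if (addressFromAM != null && !addressFromAM.equals(deadServer)) {
            return false; // skipped: already opened elsewhere
        }
        return true;
    }

    public static void main(String[] args) {
        // Inside the window: removed from RIT, but the AM address is still null.
        System.out.println(shouldReassign(false, null, "rs1"));  // true
        // After the callback has recorded the new location.
        System.out.println(shouldReassign(false, "rs2", "rs1")); // false
    }
}
```

In the first call the region has already been opened on another server, yet neither skip branch fires, so the model reports that a second assignment would be triggered.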





[jira] [Commented] (HBASE-4009) Script to patch holes in .META. table

2011-12-27 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176243#comment-13176243
 ] 

Todd Lipcon commented on HBASE-4009:


I don't think we should be committing any more rb scripts like this to the 
source code - they just get people into trouble, fall out of date, etc. Instead 
this should be an hbck feature that automatically patches holes in a table if 
requested.

 Script to patch holes in .META. table
 -

 Key: HBASE-4009
 URL: https://issues.apache.org/jira/browse/HBASE-4009
 Project: HBase
  Issue Type: New Feature
  Components: shell
 Environment: CDH3U0
Reporter: Lars George
Priority: Trivial
 Attachments: add_empty_region.rb, patch_meta.rb


 I need a script to patch holes in the .META. table, which was corrupted by 
 an earlier issue on the cluster.





[jira] [Commented] (HBASE-3274) Replace all config properties references in code with string constants

2011-12-27 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176276#comment-13176276
 ] 

Lars Hofhansl commented on HBASE-3274:
--

Big +1.

I have actually reviewed some recent patches that add config names as strings. 
Sorry I did not point that out as a defect.
Also, many people turn to hbase-default.xml, but many (new and old) configs are 
not in there.

 Replace all config properties references in code with string constants
 --

 Key: HBASE-3274
 URL: https://issues.apache.org/jira/browse/HBASE-3274
 Project: HBase
  Issue Type: Improvement
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
   Original Estimate: 168h
  Remaining Estimate: 168h

 See HBASE-2721 for details. We have fixed the default values in HBASE-3272 
 but we should also follow Hadoop to remove all hardcoded strings that refer 
 to configuration properties and move them to HConstants. 
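
As a rough illustration of the pattern, the sketch below keeps a property name and its default in one constants holder instead of repeating the literal at each call site. The class ConfigKeys, the key "example.client.timeout", and the default value are hypothetical and are not taken from HConstants.

```java
import java.util.Properties;

// Hypothetical constants holder in the spirit of HConstants; the key and
// default below are illustrative only.
final class ConfigKeys {
    static final String EXAMPLE_TIMEOUT_KEY = "example.client.timeout";
    static final int DEFAULT_EXAMPLE_TIMEOUT = 60000;

    private ConfigKeys() {
    }

    // Call sites refer to the constant, so a typo in the property name
    // becomes a compile error instead of a silently ignored setting.
    static int timeout(Properties conf) {
        String raw = conf.getProperty(EXAMPLE_TIMEOUT_KEY);
        return raw == null ? DEFAULT_EXAMPLE_TIMEOUT : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(timeout(conf)); // 60000 (default)
        conf.setProperty(EXAMPLE_TIMEOUT_KEY, "5000");
        System.out.println(timeout(conf)); // 5000
    }
}
```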





[jira] [Commented] (HBASE-3924) Improve Shell's CLI help

2011-12-27 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176281#comment-13176281
 ] 

Lars Hofhansl commented on HBASE-3924:
--

It's still a bit confusing I think.

How about something like?

{code}
Usage: HBase shell [OPTIONS] [SCRIPT [ARGUMENTS]]

 --format=OPTION    Formatter for outputting results.
                    Valid options are: console, html.
                    (Default: console)

 -d | --debug       Set DEBUG log levels.
 -h | --help        This help.
{code}


 Improve Shell's CLI help
 

 Key: HBASE-3924
 URL: https://issues.apache.org/jira/browse/HBASE-3924
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-3924.patch


 In the hirb.rb source we have
 {noformat}
 # so they don't go through to irb.  Output shell 'usage' if user types '--help'
 cmdline_help = <<HERE # HERE document output as shell usage
 HBase Shell command-line options:
  format        Formatter for outputting results: console | html.
                Default: console
  -d | --debug  Set DEBUG log levels.
 HERE
 found = []
 format = 'console'
 script2run = nil
 log_level = org.apache.log4j.Level::ERROR
 for arg in ARGV
   if arg =~ /^--format=(.+)/i
     format = $1
     if format =~ /^html$/i
       raise NoMethodError.new("Not yet implemented")
     elsif format =~ /^console$/i
       # This is default
     else
       raise ArgumentError.new("Unsupported format " + arg)
     end
     found.push(arg)
   elsif arg == '-h' || arg == '--help'
     puts cmdline_help
     exit
   elsif arg == '-d' || arg == '--debug'
     log_level = org.apache.log4j.Level::DEBUG
     $fullBackTrace = true
     puts "Setting DEBUG log level..."
   else
     # Presume it a script. Save it off for running later below
     # after we've set up some environment.
     script2run = arg
     found.push(arg)
     # Presume that any other args are meant for the script.
     break
   end
 end
 {noformat}
 We should enhance the help printed when using -h/--help to look like this?
 {noformat}
 cmdline_help = <<HERE # HERE document output as shell usage
 HBase Shell command-line options:
  --format={console|html}  Formatter for outputting results.
                           Default: console
  -d | --debug             Set DEBUG log levels.
  -h | --help              This help.
  script-filename [script-options]
 HERE
 {noformat}





[jira] [Commented] (HBASE-5097) Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.

2011-12-27 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176287#comment-13176287
 ] 

Andrew Purtell commented on HBASE-5097:
---

Wouldn't hurt. 

 Coprocessor RegionObserver implementation without preScannerOpen and 
 postScannerOpen Impl is throwing NPE and so failing the system initialization.
 ---

 Key: HBASE-5097
 URL: https://issues.apache.org/jira/browse/HBASE-5097
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Reporter: ramkrishna.s.vasudevan

 In HRegionServer.java openScanner()
 {code}
 r.prepareScanner(scan);
 RegionScanner s = null;
 if (r.getCoprocessorHost() != null) {
   s = r.getCoprocessorHost().preScannerOpen(scan);
 }
 if (s == null) {
   s = r.getScanner(scan);
 }
 if (r.getCoprocessorHost() != null) {
   s = r.getCoprocessorHost().postScannerOpen(scan, s);
 }
 {code}
 If we don't have an implementation for postScannerOpen, the RegionScanner is 
 null and a NullPointerException is thrown:
 {code}
 java.lang.NullPointerException
   at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
 {code}
 Marking this defect as a blocker. Please feel free to change the priority if I 
 am wrong.  Also correct me if my way of trying out coprocessors without 
 implementing postScannerOpen is wrong.  I am just a learner.
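
One possible shape for a null guard around the post hook is sketched below. The Scanner and Hook interfaces are hypothetical stand-ins rather than the HBase coprocessor API, and falling back to the already opened scanner is only one of several reasonable policies (failing fast with a clear message is another).

```java
// Hypothetical stand-in types; not the HBase coprocessor API.
final class ScannerOpenSketch {
    interface Scanner {
    }

    interface Hook {
        Scanner postScannerOpen(Scanner opened);
    }

    // Returns the scanner to register: the hook's result, or the originally
    // opened scanner when there is no hook or the hook returned null.
    static Scanner applyPostHook(Hook hook, Scanner opened) {
        if (hook == null) {
            return opened;
        }
        Scanner wrapped = hook.postScannerOpen(opened);
        return wrapped != null ? wrapped : opened;
    }

    public static void main(String[] args) {
        Scanner opened = new Scanner() { };
        Hook returnsNull = s -> null; // models a hook that forgets to pass the scanner through
        System.out.println(applyPostHook(returnsNull, opened) == opened); // true
    }
}
```

With a guard like this, a null never reaches the map of open scanners, so there is no NullPointerException to catch in the first place.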





[jira] [Commented] (HBASE-5097) Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.

2011-12-27 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176296#comment-13176296
 ] 

Lars Hofhansl commented on HBASE-5097:
--

Not a big fan of catching NPEs. We can add a null check somewhere in the code, 
but catching NPEs is bad design (IMHO).

@Ram: How did you actually do this? You need to implement RegionObserver, so 
you cannot compile unless you provide an implementation of postScannerOpen. Or 
did your class not even implement RegionObserver?


 Coprocessor RegionObserver implementation without preScannerOpen and 
 postScannerOpen Impl is throwing NPE and so failing the system initialization.
 ---

 Key: HBASE-5097
 URL: https://issues.apache.org/jira/browse/HBASE-5097
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Reporter: ramkrishna.s.vasudevan





[jira] [Created] (HBASE-5098) Update hbase-default.xml to reflect the state of HConstants once its the center.

2011-12-27 Thread Harsh J (Created) (JIRA)
Update hbase-default.xml to reflect the state of HConstants once its the center.


 Key: HBASE-5098
 URL: https://issues.apache.org/jira/browse/HBASE-5098
 Project: HBase
  Issue Type: Sub-task
Reporter: Harsh J
Priority: Trivial


Once the parent task is done, we should easily be able to add to and update 
hbase-default.xml.





[jira] [Commented] (HBASE-3924) Improve Shell's CLI help

2011-12-27 Thread Harsh J (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176299#comment-13176299
 ] 

Harsh J commented on HBASE-3924:


+1 on that. It does look much more readable.

Nit: I'd ditch the 'HBase' though, seems unnecessary. Or maybe use 'hbase' to 
avoid confusion about what it means there.

 Improve Shell's CLI help
 

 Key: HBASE-3924
 URL: https://issues.apache.org/jira/browse/HBASE-3924
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-3924.patch






[jira] [Commented] (HBASE-5061) StoreFileLocalityChecker

2011-12-27 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176311#comment-13176311
 ] 

Lars Hofhansl commented on HBASE-5061:
--

From looking at the code, this would produce data locality with respect to 
a certain host (if I understand this correctly).
Wouldn't we want to report locality from the viewpoint of the respective 
regionserver(s)?

I.e. regions for a table might be on many regionservers, but for each the data 
might be local to the regionserver.


 StoreFileLocalityChecker
 

 Key: HBASE-5061
 URL: https://issues.apache.org/jira/browse/HBASE-5061
 Project: HBase
  Issue Type: New Feature
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Minor
 Attachments: StoreFileLocalityChecker.java


 org.apache.hadoop.hbase.HFileLocalityChecker [options]
 A tool to report the number of local and nonlocal HFile blocks, and the ratio 
 as a percentage.
 Where options are:
 |-f file|Analyze a store file|
 |-r region|Analyze all store files for the region|
 |-t table|Analyze all store files for regions of the table served by the local regionserver|
 |-h host|Consider host local, defaults to the local host|
 |-v|Verbose operation|
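
As a small worked example of the reported percentage, the sketch below assumes the ratio means local blocks over all blocks; the description does not spell this out. LocalityReport and localPercent are hypothetical names and are not taken from the attached StoreFileLocalityChecker.java.

```java
// Hypothetical helper; assumes "ratio" means local blocks / total blocks.
final class LocalityReport {
    // Percentage of blocks that are local, or 0.0 when there are no blocks at all.
    static double localPercent(long localBlocks, long nonLocalBlocks) {
        long total = localBlocks + nonLocalBlocks;
        return total == 0 ? 0.0 : 100.0 * localBlocks / total;
    }

    public static void main(String[] args) {
        System.out.println(localPercent(3, 1)); // 75.0
    }
}
```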





[jira] [Commented] (HBASE-5061) StoreFileLocalityChecker

2011-12-27 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176322#comment-13176322
 ] 

Andrew Purtell commented on HBASE-5061:
---

bq. From looking at the code, this would produce data locality with respect to 
certain host (if I understand this correctly).

Yes this is the intent, to report locality to a given regionserver host, by 
default the local host as determined by a HBase style reverse lookup, assuming 
ops would run it either on an ops box (or the master) and supply the desired 
local host name via the '-h' option, or local to the RS.

I was thinking one use case could be to iterate over each cluster node and 
trigger major compactions for region(s) depending on the output of this tool. 

Or are you looking for a whole cluster report?


 StoreFileLocalityChecker
 

 Key: HBASE-5061
 URL: https://issues.apache.org/jira/browse/HBASE-5061
 Project: HBase
  Issue Type: New Feature
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Minor
 Attachments: StoreFileLocalityChecker.java






[jira] [Commented] (HBASE-5061) StoreFileLocalityChecker

2011-12-27 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176326#comment-13176326
 ] 

Andrew Purtell commented on HBASE-5061:
---

So, currently this tool is meant to check the locality of files or regions to a 
given single host. However, it could be modified such that if the '-h' option is 
supplied then the report looks only at the given host, otherwise it will 
enumerate the live hosts in ClusterStatus and return results for all. While at 
it, I could add a new option '-j' for reporting in JSON.

Is this more what you have in mind Lars?

 StoreFileLocalityChecker
 

 Key: HBASE-5061
 URL: https://issues.apache.org/jira/browse/HBASE-5061
 Project: HBase
  Issue Type: New Feature
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Minor
 Attachments: StoreFileLocalityChecker.java






[jira] [Commented] (HBASE-5061) StoreFileLocalityChecker

2011-12-27 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176332#comment-13176332
 ] 

Lars Hofhansl commented on HBASE-5061:
--

That's what I had envisioned for the -t option at least.
Would be nice to have a report on all regions of a table.
But maybe that would not be useful as the number of region servers grows...?

+1 on JSON output.


 StoreFileLocalityChecker
 

 Key: HBASE-5061
 URL: https://issues.apache.org/jira/browse/HBASE-5061
 Project: HBase
  Issue Type: New Feature
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Minor
 Attachments: StoreFileLocalityChecker.java






[jira] [Commented] (HBASE-3924) Improve Shell's CLI help

2011-12-27 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176334#comment-13176334
 ] 

Lars Hofhansl commented on HBASE-3924:
--

+1 on ditching HBase and just saying shell.
And should it say SCRIPT or SCRIPTFILE?

 Improve Shell's CLI help
 

 Key: HBASE-3924
 URL: https://issues.apache.org/jira/browse/HBASE-3924
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-3924.patch






[jira] [Commented] (HBASE-3924) Improve Shell's CLI help

2011-12-27 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176353#comment-13176353
 ] 

stack commented on HBASE-3924:
--

+1 on patch.  SCRIPTFILE sounds better, yeah, and *HBase* shell is superfluous. 
 I don't think any formatter other than console works, but that's for another issue.

 Improve Shell's CLI help
 

 Key: HBASE-3924
 URL: https://issues.apache.org/jira/browse/HBASE-3924
 Project: HBase
  Issue Type: Improvement
  Components: shell
Affects Versions: 0.90.3
Reporter: Lars George
Assignee: Harsh J
Priority: Trivial
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-3924.patch






[jira] [Commented] (HBASE-5093) wiki update for HBase/Scala

2011-12-27 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176355#comment-13176355
 ] 

stack commented on HBASE-5093:
--

Do you have a 'wiki' name Joe?  I'll add you to the list of contribs (Ours and 
the hadoop wikis have been spammed bad for years and a recent change has it 
that folks need to ask for perms to edit).  Thanks for doing this ska la doc.

 wiki update for HBase/Scala
 ---

 Key: HBASE-5093
 URL: https://issues.apache.org/jira/browse/HBASE-5093
 Project: HBase
  Issue Type: Improvement
Reporter: Joe Stein

 I tried to edit the wiki, but it says "immutable page".
 It would be helpful/nice for folks to know how to get sbt working with Scala.
 The following is what I did to get it working. I am not sure why I could not 
 edit the wiki, so I figured I'd open a JIRA so someone with access could update this.
 {code}
 resolvers += "Apache HBase" at "https://repository.apache.org/content/repositories/releases"
 resolvers += "Thrift" at "http://people.apache.org/~rawson/repo/"
 libraryDependencies ++= Seq(
   "org.apache.hadoop" % "hadoop-core" % "0.20.2",
   "org.apache.hbase" % "hbase" % "0.90.4"
 )
 {code}
 Or give me access and I can do it, np.





[jira] [Created] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root re

2011-12-27 Thread Jimmy Xiang (Created) (JIRA)
ZK event thread waiting for root region while server shutdown handler waiting 
for event thread to finish distributed log splitting to recover the region 
sever the root region is on


 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang


A RS died.  The ServerShutdownHandler kicked in and started the log splitting.  
SplitLogManager installed the tasks asynchronously, then started to wait for 
them to complete.

The task znodes were not actually created; the requests were just queued.
At this time, the ZooKeeper connection expired.  HMaster tried to recover the 
expired ZK session.
During the recovery, a new ZooKeeper connection was created.  However, this 
master became the new master again.  It tried to assign root and meta.

Because the dead RS held the old root region, the master needs to wait for the 
log splitting to complete.
This waiting holds the ZooKeeper event thread.  So the async create of the 
split task is never retried, since there is only one event thread, and it is 
waiting for the root region to be assigned.
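
The general shape of this hang can be reproduced outside HBase. The sketch below is an analogy only: it uses a plain single-threaded ExecutorService as a stand-in for the single ZooKeeper event thread, and the class and method names are hypothetical. A task running on the only thread queues follow-up work on that same thread and then blocks waiting for it; a timeout is used so the demo terminates instead of hanging.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Analogy only: a single-threaded executor stands in for the one event thread.
final class SingleThreadStall {
    // Returns true if the outer task timed out waiting for work that was
    // queued behind it on its own thread.
    static boolean stalls(long waitMillis) {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> outer = eventThread.submit(() -> {
                // Queue follow-up work on the same single thread...
                Future<?> inner = eventThread.submit(() -> { });
                try {
                    // ...then block on it. It cannot start until this task returns.
                    inner.get(waitMillis, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException e) {
                    return true;
                }
            });
            return outer.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            eventThread.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(stalls(200)); // true: the inner task never got to run in time
    }
}
```

Without the timeout the outer task would wait forever, which corresponds to the event thread waiting on root assignment while the queued split-task creation can never be processed.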






[jira] [Updated] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root re

2011-12-27 Thread Jimmy Xiang (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-5099:
---

Attachment: ZK-event-thread-waiting-for-root.png
distributed-log-splitting-hangs.png

 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 sever the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png


 An RS died.  The ServerShutdownHandler kicked in and started log splitting.
 SplitLogManager installed the tasks asynchronously, then started to wait for
 them to complete.
 The task znodes had not actually been created; the requests were only queued.
 At this point the ZooKeeper connection expired, and HMaster tried to recover
 the expired ZK session.
 During the recovery a new ZooKeeper connection was created, and this master
 became the active master again.  It then tried to assign root and meta.
 Because the dead RS had been hosting the root region, the master needs to wait
 for the log splitting to complete.
 This wait holds the ZooKeeper event thread, so the asynchronous create of the
 split tasks is never retried: there is only one event thread, and it is
 waiting for the root region to be assigned.





[jira] [Updated] (HBASE-5094) The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.

2011-12-27 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5094:
--

Attachment: 5094.patch

Patch for trunk

 The META can hold an entry for a region with a different server name from the 
 one actually in the AssignmentManager thus making the region inaccessible.
 

 Key: HBASE-5094
 URL: https://issues.apache.org/jira/browse/HBASE-5094
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: ramkrishna.s.vasudevan
Priority: Critical
 Attachments: 5094.patch


 {code}
 RegionState rit =
     this.services.getAssignmentManager().isRegionInTransition(e.getKey());
 ServerName addressFromAM = this.services.getAssignmentManager()
     .getRegionServerOfRegion(e.getKey());
 if (rit != null && !rit.isClosing() && !rit.isPendingClose()) {
   // Skip regions that were in transition unless CLOSING or
   // PENDING_CLOSE
   LOG.info("Skip assigning region " + rit.toString());
 } else if (addressFromAM != null
     && !addressFromAM.equals(this.serverName)) {
   LOG.debug("Skip assigning region "
       + e.getKey().getRegionNameAsString()
       + " because it has been opened in "
       + addressFromAM.getServerName());
 }
 {code}
 In ServerShutdownHandler we try to get the region's address from the AM.  This
 address is initially null because it has not yet been updated after the region
 was opened, i.e. the callback after node deletion has not yet run on the
 master side.
 But removal from RIT has already completed on the master side, so this
 triggers a new assignment.
 So there is a small window between the removal from RIT and the point where
 the region is actually added to the online list; if the ServerShutdownHandler
 checks the existing address in the AM inside that window, it sees null.
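
 To make the window concrete, the branch logic quoted above can be reduced to a
 small pure function.  This is only an illustrative sketch: the class name, the
 decide() method, the Action enum and the use of plain strings for server names
 are assumptions made for the example and are not HBase code.

```java
class ShutdownAssignDecision {

  /** What the shutdown handler would do with one region of the dead server. */
  enum Action { SKIP_IN_TRANSITION, SKIP_OPENED_ELSEWHERE, REASSIGN }

  /**
   * Mirrors the two checks from the quoted snippet.
   * addressFromAM is null while the AM has not yet recorded the new location.
   */
  static Action decide(boolean inTransition, boolean closingOrPendingClose,
                       String addressFromAM, String deadServer) {
    if (inTransition && !closingOrPendingClose) {
      return Action.SKIP_IN_TRANSITION;
    }
    if (addressFromAM != null && !addressFromAM.equals(deadServer)) {
      return Action.SKIP_OPENED_ELSEWHERE;
    }
    // Falls through when the region has left RIT but the AM has no address yet.
    return Action.REASSIGN;
  }

  public static void main(String[] args) {
    // Inside the window: already out of RIT, address not yet recorded.
    System.out.println(decide(false, false, null, "rs1"));
    // After the window: the AM knows the region is open on another server.
    System.out.println(decide(false, false, "rs2", "rs1"));
  }
}
```

 The first call lands in the fall-through branch even though, in the scenario
 described, the region has already been opened elsewhere.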





[jira] [Commented] (HBASE-5097) Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.

2011-12-27 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176402#comment-13176402
 ] 

ramkrishna.s.vasudevan commented on HBASE-5097:
---

@Lars

I implemented RegionObserver directly and did not extend BaseRegionObserver.
BaseRegionObserver returns a RegionScanner, so it does not cause any problem.
But if we implement RegionObserver directly, the default return value is null.
That is the problem.
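
A reduced sketch of why a hook that returns null ends in the NPE reported for
this issue.  The Scanner and Observer interfaces and the register() method are
hypothetical stand-ins invented for this example, not the HBase types; the only
real API relied on is that ConcurrentHashMap.put() rejects null values.

```java
import java.util.concurrent.ConcurrentHashMap;

class ScannerHookSketch {

  /** Hypothetical stand-in for RegionScanner. */
  interface Scanner {}

  /** Hypothetical, much-reduced stand-in for the observer hook. */
  interface Observer {
    Scanner postScannerOpen(Scanner s);
  }

  static final ConcurrentHashMap<String, Scanner> SCANNERS = new ConcurrentHashMap<>();

  /** Opens a scanner, runs the hook, and tries to register the result. */
  static boolean register(String name, Observer observer) {
    Scanner s = new Scanner() {};
    s = observer.postScannerOpen(s);   // a stub hook may hand back null here
    try {
      SCANNERS.put(name, s);           // ConcurrentHashMap does not accept null values
      return true;
    } catch (NullPointerException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(register("stub", s -> null));      // hook drops the scanner
    System.out.println(register("passthrough", s -> s));  // hook passes it through
  }
}
```

A hook that simply returns the scanner it was given avoids the problem.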
 

 Coprocessor RegionObserver implementation without preScannerOpen and 
 postScannerOpen Impl is throwing NPE and so failing the system initialization.
 ---

 Key: HBASE-5097
 URL: https://issues.apache.org/jira/browse/HBASE-5097
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Reporter: ramkrishna.s.vasudevan

 In HRegionServer.java openScanner():
 {code}
 r.prepareScanner(scan);
 RegionScanner s = null;
 if (r.getCoprocessorHost() != null) {
   s = r.getCoprocessorHost().preScannerOpen(scan);
 }
 if (s == null) {
   s = r.getScanner(scan);
 }
 if (r.getCoprocessorHost() != null) {
   s = r.getCoprocessorHost().postScannerOpen(scan, s);
 }
 {code}
 If we do not have an implementation for postScannerOpen, the RegionScanner is
 null, and so a NullPointerException is thrown:
 {code}
 java.lang.NullPointerException
   at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
 {code}
 Marking this defect as a blocker.  Please feel free to change the priority if
 I am wrong.  Also correct me if my way of trying out coprocessors without
 implementing postScannerOpen is wrong.  I am just a learner.





[jira] [Updated] (HBASE-5094) The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.

2011-12-27 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5094:
--

Status: Patch Available  (was: Open)

 The META can hold an entry for a region with a different server name from the 
 one actually in the AssignmentManager thus making the region inaccessible.
 

 Key: HBASE-5094
 URL: https://issues.apache.org/jira/browse/HBASE-5094
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: ramkrishna.s.vasudevan
Priority: Critical
 Attachments: 5094.patch


 {code}
 RegionState rit =
     this.services.getAssignmentManager().isRegionInTransition(e.getKey());
 ServerName addressFromAM = this.services.getAssignmentManager()
     .getRegionServerOfRegion(e.getKey());
 if (rit != null && !rit.isClosing() && !rit.isPendingClose()) {
   // Skip regions that were in transition unless CLOSING or
   // PENDING_CLOSE
   LOG.info("Skip assigning region " + rit.toString());
 } else if (addressFromAM != null
     && !addressFromAM.equals(this.serverName)) {
   LOG.debug("Skip assigning region "
       + e.getKey().getRegionNameAsString()
       + " because it has been opened in "
       + addressFromAM.getServerName());
 }
 {code}
 In ServerShutdownHandler we try to get the region's address from the AM.  This
 address is initially null because it has not yet been updated after the region
 was opened, i.e. the callback after node deletion has not yet run on the
 master side.
 But removal from RIT has already completed on the master side, so this
 triggers a new assignment.
 So there is a small window between the removal from RIT and the point where
 the region is actually added to the online list; if the ServerShutdownHandler
 checks the existing address in the AM inside that window, it sees null.





[jira] [Commented] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root

2011-12-27 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176406#comment-13176406
 ] 

Zhihong Yu commented on HBASE-5099:
---

bq. Another one is to abort the master
The above sounds better. How about introducing a timeout for:
{code}
  this.assignmentManager.waitForAssignment(HRegionInfo.ROOT_REGIONINFO);
{code}
after which the master aborts?
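
A bounded wait of the kind suggested here could look roughly like the sketch
below.  It is an assumption-laden illustration, not the AssignmentManager
code: the class, its latch field and the boolean return convention (false
meaning "timed out, caller may abort") are all invented for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class BoundedAssignmentWait {

  // Released once the (hypothetical) region assignment has completed.
  private final CountDownLatch assigned = new CountDownLatch(1);

  /** Called by whoever finishes the assignment. */
  void markAssigned() {
    assigned.countDown();
  }

  /**
   * Waits for the assignment, but never longer than timeoutMs.
   * Returns false on timeout or interruption so the caller can abort.
   */
  boolean waitForAssignment(long timeoutMs) {
    try {
      return assigned.await(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();  // preserve the interrupt status
      return false;
    }
  }

  public static void main(String[] args) {
    BoundedAssignmentWait wait = new BoundedAssignmentWait();
    if (!wait.waitForAssignment(100)) {
      System.out.println("timed out waiting for assignment, would abort here");
    }
  }
}
```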

 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 sever the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png


 An RS died.  The ServerShutdownHandler kicked in and started log splitting.
 SplitLogManager installed the tasks asynchronously, then started to wait for
 them to complete.
 The task znodes had not actually been created; the requests were only queued.
 At this point the ZooKeeper connection expired, and HMaster tried to recover
 the expired ZK session.
 During the recovery a new ZooKeeper connection was created, and this master
 became the active master again.  It then tried to assign root and meta.
 Because the dead RS had been hosting the root region, the master needs to wait
 for the log splitting to complete.
 This wait holds the ZooKeeper event thread, so the asynchronous create of the
 split tasks is never retried: there is only one event thread, and it is
 waiting for the root region to be assigned.





[jira] [Resolved] (HBASE-5073) Registered listeners not getting removed leading to memory leak in HBaseAdmin

2011-12-27 Thread ramkrishna.s.vasudevan (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-5073.
---

Resolution: Fixed

Committed to branch hence resolving.

 Registered listeners not getting removed leading to memory leak in HBaseAdmin
 -

 Key: HBASE-5073
 URL: https://issues.apache.org/jira/browse/HBASE-5073
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.4
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6

 Attachments: HBASE-5073.patch


 HBaseAdmin APIs like tableExists(), flush, split and closeRegion use the
 catalog tracker.  On every call, the root node tracker and the meta node
 tracker are started and a listener is registered.  But after the operation
 is performed the listeners are not removed.  Hence, if the admin APIs are
 used repeatedly, this may lead to a memory leak.
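
 The general remedy for this kind of leak is to unregister in a finally block.
 The sketch below shows that pattern only; Watcher, Listener and withTracker()
 are hypothetical names made up for the illustration and do not correspond to
 the HBase classes or to the attached patch.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Supplier;

class ListenerCleanupSketch {

  /** Hypothetical listener type. */
  interface Listener {}

  /** Hypothetical long-lived object that holds registered listeners. */
  static class Watcher {
    private final List<Listener> listeners = new CopyOnWriteArrayList<>();

    void register(Listener l) { listeners.add(l); }

    void unregister(Listener l) { listeners.remove(l); }

    int listenerCount() { return listeners.size(); }
  }

  /**
   * Registers a per-call listener, runs the operation, and always removes
   * the listener again, even if the operation throws.
   */
  static <T> T withTracker(Watcher watcher, Supplier<T> operation) {
    Listener tracker = new Listener() {};
    watcher.register(tracker);
    try {
      return operation.get();
    } finally {
      watcher.unregister(tracker);
    }
  }

  public static void main(String[] args) {
    Watcher watcher = new Watcher();
    for (int i = 0; i < 1000; i++) {
      withTracker(watcher, () -> Boolean.TRUE);
    }
    System.out.println("listeners left: " + watcher.listenerCount());
  }
}
```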





[jira] [Updated] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-27 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5009:
--

Status: Open  (was: Patch Available)

Cancelling and resubmitting as the tests hung.

 Failure of creating split dir if it already exists prevents splits from 
 happening further
 -

 Key: HBASE-5009
 URL: https://issues.apache.org/jira/browse/HBASE-5009
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE-5009.patch, HBASE-5009_Branch90.patch


 The scenario is:
 - The split of a region takes a long time.
 - The deletion of the splitDir fails due to HDFS problems.
 - Subsequent splits also fail after that.
 {code}
 private static void createSplitDir(final FileSystem fs, final Path splitdir)
     throws IOException {
   if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? " + splitdir);
   if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " + splitdir);
 }
 {code}
 Correct me if I am wrong.  If it is an issue, can we change the behaviour of
 throwing an exception?
 Please suggest.
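
 One possible change in behaviour, sketched here purely as an assumption and
 not as what the attached patches do, is to clean up a stale directory instead
 of throwing.  To stay self-contained the sketch uses java.nio.file on the
 local file system rather than the Hadoop FileSystem API; SplitDirSketch,
 deleteRecursively() and runDemo() are names invented for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SplitDirSketch {

  /** Creates splitdir, first removing a leftover from an earlier failed split. */
  static void createSplitDir(Path splitdir) throws IOException {
    if (Files.exists(splitdir)) {
      deleteRecursively(splitdir);
    }
    Files.createDirectories(splitdir);
  }

  /** Deletes a directory tree, children before parents. */
  static void deleteRecursively(Path root) throws IOException {
    List<Path> ordered;
    try (Stream<Path> paths = Files.walk(root)) {
      ordered = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : ordered) {
      Files.delete(p);
    }
  }

  /** Simulates a stale split directory and checks that it is replaced by an empty one. */
  static boolean runDemo() {
    try {
      Path base = Files.createTempDirectory("splitdir-sketch");
      Path splitdir = base.resolve("splits");
      Files.createDirectories(splitdir);
      Files.createFile(splitdir.resolve("leftover"));
      createSplitDir(splitdir);
      boolean empty;
      try (Stream<Path> children = Files.list(splitdir)) {
        empty = children.count() == 0;
      }
      deleteRecursively(base);
      return empty;
    } catch (IOException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("stale dir replaced: " + runDemo());
  }
}
```

 Whether silently deleting the old directory is safe depends on whether
 anything else may still be using it, which this sketch does not address.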





[jira] [Updated] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-27 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5009:
--

Fix Version/s: 0.90.6
   0.92.1
Affects Version/s: (was: 0.90.6)
   Status: Patch Available  (was: Open)

 Failure of creating split dir if it already exists prevents splits from 
 happening further
 -

 Key: HBASE-5009
 URL: https://issues.apache.org/jira/browse/HBASE-5009
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.1, 0.90.6

 Attachments: HBASE-5009.patch, HBASE-5009_Branch90.patch


 The scenario is:
 - The split of a region takes a long time.
 - The deletion of the splitDir fails due to HDFS problems.
 - Subsequent splits also fail after that.
 {code}
 private static void createSplitDir(final FileSystem fs, final Path splitdir)
     throws IOException {
   if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? " + splitdir);
   if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " + splitdir);
 }
 {code}
 Correct me if I am wrong.  If it is an issue, can we change the behaviour of
 throwing an exception?
 Please suggest.





[jira] [Commented] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root

2011-12-27 Thread Jimmy Xiang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176430#comment-13176430
 ] 

Jimmy Xiang commented on HBASE-5099:


This is good.  If we introduce a timeout, I would prefer to do it for
tryRecoveringExpiredZKSession().
The reason is that, besides waitForAssignment(), several other places have
waiting logic as well, such as bulkAssign(), waitForRoot(),
this.activeMasterManager.blockUntilBecomingActiveMaster(startupStatus), etc.

 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 sever the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png


 An RS died.  The ServerShutdownHandler kicked in and started log splitting.
 SplitLogManager installed the tasks asynchronously, then started to wait for
 them to complete.
 The task znodes had not actually been created; the requests were only queued.
 At this point the ZooKeeper connection expired, and HMaster tried to recover
 the expired ZK session.
 During the recovery a new ZooKeeper connection was created, and this master
 became the active master again.  It then tried to assign root and meta.
 Because the dead RS had been hosting the root region, the master needs to wait
 for the log splitting to complete.
 This wait holds the ZooKeeper event thread, so the asynchronous create of the
 split tasks is never retried: there is only one event thread, and it is
 waiting for the root region to be assigned.





[jira] [Commented] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-27 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176433#comment-13176433
 ] 

Zhihong Yu commented on HBASE-5009:
---

Pressing 'Submit' again wouldn't run the tests.
This is because Hadoop QA remembers the attachment Id of the patch for which 
test suite was executed.

Please attach the same copy of patch for TRUNK again.

 Failure of creating split dir if it already exists prevents splits from 
 happening further
 -

 Key: HBASE-5009
 URL: https://issues.apache.org/jira/browse/HBASE-5009
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.1, 0.90.6

 Attachments: HBASE-5009.patch, HBASE-5009_Branch90.patch


 The scenario is:
 - The split of a region takes a long time.
 - The deletion of the splitDir fails due to HDFS problems.
 - Subsequent splits also fail after that.
 {code}
 private static void createSplitDir(final FileSystem fs, final Path splitdir)
     throws IOException {
   if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? " + splitdir);
   if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " + splitdir);
 }
 {code}
 Correct me if I am wrong.  If it is an issue, can we change the behaviour of
 throwing an exception?
 Please suggest.





[jira] [Updated] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-27 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5009:
--

Attachment: 5009.txt

 Failure of creating split dir if it already exists prevents splits from 
 happening further
 -

 Key: HBASE-5009
 URL: https://issues.apache.org/jira/browse/HBASE-5009
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.1, 0.90.6

 Attachments: 5009.txt, HBASE-5009.patch, HBASE-5009_Branch90.patch


 The scenario is:
 - The split of a region takes a long time.
 - The deletion of the splitDir fails due to HDFS problems.
 - Subsequent splits also fail after that.
 {code}
 private static void createSplitDir(final FileSystem fs, final Path splitdir)
     throws IOException {
   if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? " + splitdir);
   if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " + splitdir);
 }
 {code}
 Correct me if I am wrong.  If it is an issue, can we change the behaviour of
 throwing an exception?
 Please suggest.





[jira] [Updated] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-27 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5009:
--

Status: Open  (was: Patch Available)

 Failure of creating split dir if it already exists prevents splits from 
 happening further
 -

 Key: HBASE-5009
 URL: https://issues.apache.org/jira/browse/HBASE-5009
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.1, 0.90.6

 Attachments: 5009.txt, HBASE-5009.patch, HBASE-5009_Branch90.patch


 The scenario is:
 - The split of a region takes a long time.
 - The deletion of the splitDir fails due to HDFS problems.
 - Subsequent splits also fail after that.
 {code}
 private static void createSplitDir(final FileSystem fs, final Path splitdir)
     throws IOException {
   if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? " + splitdir);
   if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " + splitdir);
 }
 {code}
 Correct me if I am wrong.  If it is an issue, can we change the behaviour of
 throwing an exception?
 Please suggest.





[jira] [Updated] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-27 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5009:
--

Status: Patch Available  (was: Open)

 Failure of creating split dir if it already exists prevents splits from 
 happening further
 -

 Key: HBASE-5009
 URL: https://issues.apache.org/jira/browse/HBASE-5009
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.1, 0.90.6

 Attachments: 5009.txt, HBASE-5009.patch, HBASE-5009_Branch90.patch


 The scenario is:
 - The split of a region takes a long time.
 - The deletion of the splitDir fails due to HDFS problems.
 - Subsequent splits also fail after that.
 {code}
 private static void createSplitDir(final FileSystem fs, final Path splitdir)
     throws IOException {
   if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? " + splitdir);
   if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " + splitdir);
 }
 {code}
 Correct me if I am wrong.  If it is an issue, can we change the behaviour of
 throwing an exception?
 Please suggest.





[jira] [Commented] (HBASE-5097) Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.

2011-12-27 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176440#comment-13176440
 ] 

Lars Hofhansl commented on HBASE-5097:
--

Hmmm. There is no default implementation for an interface. Are you referring to 
the default implementation generated for you by eclipse?
I just don't think this is a bug.




 Coprocessor RegionObserver implementation without preScannerOpen and 
 postScannerOpen Impl is throwing NPE and so failing the system initialization.
 ---

 Key: HBASE-5097
 URL: https://issues.apache.org/jira/browse/HBASE-5097
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Reporter: ramkrishna.s.vasudevan

 In HRegionServer.java openScanner():
 {code}
 r.prepareScanner(scan);
 RegionScanner s = null;
 if (r.getCoprocessorHost() != null) {
   s = r.getCoprocessorHost().preScannerOpen(scan);
 }
 if (s == null) {
   s = r.getScanner(scan);
 }
 if (r.getCoprocessorHost() != null) {
   s = r.getCoprocessorHost().postScannerOpen(scan, s);
 }
 {code}
 If we do not have an implementation for postScannerOpen, the RegionScanner is
 null, and so a NullPointerException is thrown:
 {code}
 java.lang.NullPointerException
   at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
 {code}
 Marking this defect as a blocker.  Please feel free to change the priority if
 I am wrong.  Also correct me if my way of trying out coprocessors without
 implementing postScannerOpen is wrong.  I am just a learner.





[jira] [Commented] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root

2011-12-27 Thread Jimmy Xiang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176447#comment-13176447
 ] 

Jimmy Xiang commented on HBASE-5099:


tryRecoveringExpiredZKSession() is only called by abortNow(), which is called
by abort(), which is called by the eventThread.  I was thinking of running this
whole method in another thread with an executor service and timing it out
after a certain period, for example 5 minutes, then failing the recovery and
letting the master abort.

This way, we don't have to add a timeout to all the methods.  The regular
master startup, which also calls assignRootAndMeta(), is not impacted.

However, if we know that most likely only waitForAssignment() takes a long
time, we can add a timeout to this method only.  But I am not so sure.
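
A minimal sketch of the executor-with-timeout idea, under the assumption that
the recovery logic can be wrapped in a Callable.  RecoveryTimeoutSketch and
recoverWithTimeout() are hypothetical names for this illustration; nothing
here is taken from HMaster.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class RecoveryTimeoutSketch {

  /**
   * Runs the recovery on a separate thread so the calling thread (the event
   * thread in this discussion) is blocked for at most timeoutMs.
   * Returns false if the recovery fails, is interrupted, or times out.
   */
  static boolean recoverWithTimeout(Callable<Boolean> recovery, long timeoutMs) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<Boolean> result = executor.submit(recovery);
    try {
      return result.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      result.cancel(true);                 // interrupt the stuck recovery
      return false;
    } catch (ExecutionException e) {
      return false;                        // recovery itself threw
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();  // preserve the interrupt status
      result.cancel(true);
      return false;
    } finally {
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) {
    boolean quick = recoverWithTimeout(() -> true, 1000);
    boolean stuck = recoverWithTimeout(() -> { Thread.sleep(60_000); return true; }, 100);
    System.out.println("quick=" + quick + " stuck=" + stuck);
  }
}
```

When the method returns false the caller would proceed with the abort, as
described above.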

 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 sever the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png


 An RS died.  The ServerShutdownHandler kicked in and started log splitting.
 SplitLogManager installed the tasks asynchronously, then started to wait for
 them to complete.
 The task znodes had not actually been created; the requests were only queued.
 At this point the ZooKeeper connection expired, and HMaster tried to recover
 the expired ZK session.
 During the recovery a new ZooKeeper connection was created, and this master
 became the active master again.  It then tried to assign root and meta.
 Because the dead RS had been hosting the root region, the master needs to wait
 for the log splitting to complete.
 This wait holds the ZooKeeper event thread, so the asynchronous create of the
 split tasks is never retried: there is only one event thread, and it is
 waiting for the root region to be assigned.





[jira] [Commented] (HBASE-5099) ZK event thread waiting for root region while server shutdown handler waiting for event thread to finish distributed log splitting to recover the region sever the root

2011-12-27 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176454#comment-13176454
 ] 

Zhihong Yu commented on HBASE-5099:
---

Timing out tryRecoveringExpiredZKSession() is good.
We can introduce a separate thread to carry the current logic of 
tryRecoveringExpiredZKSession() and monitor this thread in eventThread.

 ZK event thread waiting for root region while server shutdown handler waiting 
 for event thread to finish distributed log splitting to recover the region 
 sever the root region is on
 

 Key: HBASE-5099
 URL: https://issues.apache.org/jira/browse/HBASE-5099
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
 Attachments: ZK-event-thread-waiting-for-root.png, 
 distributed-log-splitting-hangs.png


 An RS died.  The ServerShutdownHandler kicked in and started log splitting.
 SplitLogManager installed the tasks asynchronously, then started to wait for
 them to complete.
 The task znodes had not actually been created; the requests were only queued.
 At this point the ZooKeeper connection expired, and HMaster tried to recover
 the expired ZK session.
 During the recovery a new ZooKeeper connection was created, and this master
 became the active master again.  It then tried to assign root and meta.
 Because the dead RS had been hosting the root region, the master needs to wait
 for the log splitting to complete.
 This wait holds the ZooKeeper event thread, so the asynchronous create of the
 split tasks is never retried: there is only one event thread, and it is
 waiting for the root region to be assigned.





[jira] [Commented] (HBASE-5097) Coprocessor RegionObserver implementation without preScannerOpen and postScannerOpen Impl is throwing NPE and so failing the system initialization.

2011-12-27 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176456#comment-13176456
 ] 

ramkrishna.s.vasudevan commented on HBASE-5097:
---

@Lars..

Ok Lars..I too accept it..

 Coprocessor RegionObserver implementation without preScannerOpen and 
 postScannerOpen Impl is throwing NPE and so failing the system initialization.
 ---

 Key: HBASE-5097
 URL: https://issues.apache.org/jira/browse/HBASE-5097
 Project: HBase
  Issue Type: Bug
  Components: coprocessors
Reporter: ramkrishna.s.vasudevan

 In HRegionServer.java openScanner()
 {code}
   r.prepareScanner(scan);
   RegionScanner s = null;
   if (r.getCoprocessorHost() != null) {
     s = r.getCoprocessorHost().preScannerOpen(scan);
   }
   if (s == null) {
     s = r.getScanner(scan);
   }
   if (r.getCoprocessorHost() != null) {
     s = r.getCoprocessorHost().postScannerOpen(scan, s);
   }
 {code}
 If there is no implementation of postScannerOpen, the returned RegionScanner
 is null, which results in a NullPointerException:
 {code}
 java.lang.NullPointerException
   at java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.addScanner(HRegionServer.java:2282)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2272)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326)
 {code}
 Marking this defect as a blocker.  Please feel free to change the priority if
 I am wrong.  Also correct me if trying out coprocessors without implementing
 postScannerOpen is the wrong approach.  I am just a learner.
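
 One way to picture a guard against this NPE is a host that treats a null
 result from a post hook as "no change".  The sketch below uses made-up
 stand-in types (Scanner, Observer, Host); it is not the real HBase
 coprocessor API and not necessarily how the issue was resolved.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins for the HBase types; not the real coprocessor API. */
final class ScannerHookSketch {

  interface Scanner {}

  /** An observer whose hooks default to "no opinion". */
  interface Observer {
    default Scanner preScannerOpen() { return null; }        // null = do not bypass
    default Scanner postScannerOpen(Scanner s) { return s; } // pass through
  }

  static final class Host {
    private final List<Observer> observers = new ArrayList<>();

    void add(Observer o) { observers.add(o); }

    /** Returns a replacement scanner, or null if no observer supplied one. */
    Scanner preScannerOpen() {
      Scanner s = null;
      for (Observer o : observers) {
        Scanner r = o.preScannerOpen();
        if (r != null) s = r;
      }
      return s;
    }

    /** Keeps the current scanner whenever an observer returns null. */
    Scanner postScannerOpen(Scanner s) {
      for (Observer o : observers) {
        Scanner r = o.postScannerOpen(s);
        if (r != null) s = r;
      }
      return s;
    }
  }

  /** Mirrors the shape of openScanner() quoted above. */
  static Scanner openScanner(Host host) {
    Scanner s = host.preScannerOpen();
    if (s == null) {
      s = new Scanner() {}; // stands in for r.getScanner(scan)
    }
    return host.postScannerOpen(s);
  }
}
```

 With this shape, an observer that does not implement (or returns null from)
 postScannerOpen can no longer turn a valid scanner into null before it is
 registered.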





[jira] [Commented] (HBASE-5094) The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.

2011-12-27 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176458#comment-13176458
 ] 

Zhihong Yu commented on HBASE-5094:
---

+1 on patch.

 The META can hold an entry for a region with a different server name from the 
 one actually in the AssignmentManager thus making the region inaccessible.
 

 Key: HBASE-5094
 URL: https://issues.apache.org/jira/browse/HBASE-5094
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: ramkrishna.s.vasudevan
Priority: Critical
 Attachments: 5094.patch


 {code}
 RegionState rit =
     this.services.getAssignmentManager().isRegionInTransition(e.getKey());
 ServerName addressFromAM = this.services.getAssignmentManager()
     .getRegionServerOfRegion(e.getKey());
 if (rit != null && !rit.isClosing() && !rit.isPendingClose()) {
   // Skip regions that were in transition unless CLOSING or
   // PENDING_CLOSE
   LOG.info("Skip assigning region " + rit.toString());
 } else if (addressFromAM != null
     && !addressFromAM.equals(this.serverName)) {
   LOG.debug("Skip assigning region "
       + e.getKey().getRegionNameAsString()
       + " because it has been opened in "
       + addressFromAM.getServerName());
 }
 {code}
 In ServerShutdownHandler we try to get the region's address from the AM.  This
 address is initially null because it has not yet been updated after the region
 was opened, i.e. the callback after node deletion has not yet run on the
 master side.
 But removal from RIT has already completed on the master side.  So this
 triggers a new assignment.
 So there is a small window between the point where the region is actually
 added to the online list and the point where the ServerShutdownHandler checks
 the existing address in AM.
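
 To make the window concrete, the decision in the snippet above can be reduced
 to a small pure function.  This is only a sketch: the class, enum and
 parameter names are invented for illustration, and the final ASSIGN branch is
 inferred from the description ("this will trigger a new assignment") rather
 than copied from the source.

```java
/** Illustrative reduction of the ServerShutdownHandler check; names are hypothetical. */
final class ShutdownAssignDecision {

  enum Action { SKIP_IN_TRANSITION, SKIP_OPENED_ELSEWHERE, ASSIGN }

  /**
   * @param inTransition           region is in RIT
   * @param closingOrPendingClose  RIT state is CLOSING or PENDING_CLOSE
   * @param addressFromAM          server the AM believes hosts the region, or null
   * @param deadServer             the server being processed by the handler
   */
  static Action decide(boolean inTransition, boolean closingOrPendingClose,
                       String addressFromAM, String deadServer) {
    if (inTransition && !closingOrPendingClose) {
      return Action.SKIP_IN_TRANSITION;
    }
    if (addressFromAM != null && !addressFromAM.equals(deadServer)) {
      return Action.SKIP_OPENED_ELSEWHERE;
    }
    // Neither guard matched, so the region is (re)assigned.
    return Action.ASSIGN;
  }
}
```

 In the window described above, the region has already left RIT but the AM
 address is still null, so decide(false, false, null, deadServer) falls
 through to ASSIGN even though the region is open on another server.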





[jira] [Commented] (HBASE-4529) Make WAL Pluggable

2011-12-27 Thread dhruba borthakur (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176459#comment-13176459
 ] 

dhruba borthakur commented on HBASE-4529:
-

This will be a great feature!

HDFS provides I/O fencing for the HBase log. When a regionserver dies, the
master renames the log directory so that the old regionserver cannot continue
to write any new transactions to the transaction log. (HDFS provides an API
that fails a file-creation request if an intermediate path component is
nonexistent.) It would be nice for the pluggable-WAL API to support this type
of operation.

 Make WAL Pluggable 
 ---

 Key: HBASE-4529
 URL: https://issues.apache.org/jira/browse/HBASE-4529
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, wal
Reporter: Akash Ashok
Assignee: Akash Ashok
  Labels: regionserver, wal

 Make WAL a pluggable, configurable component, thus making it easier to write 
 to different filesystems (including multiple filesystems).
 From Stack:
 Pluggable WAL component would need to check that the split can deal w/
 multiple logs written by the one server concurrently (sort by sequence
 edit id after sorting on all the rest that makes up a wal log key).
 From Jesse Yates:
 It would be nice to be able to tie pluggable WAL component into a service 
 that logs directly to
 disk, rather than go through HDFS giving some potentially awesome speedup at
 the cost of having to write a logging service that handles replication, etc.
 From Karthik Tunga:
 Along with the log replaying part, logic is also needed for log roll.
 This, I think, is easier compared to the merging of the logs. Any edits less
 than the last sequence number on the file system can be removed from all
 the WALs.
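
 The two points above (ordering edits from several logs by sequence id, and
 dropping edits below the last persisted sequence number) can be sketched as
 follows.  The Edit type and method names are hypothetical and much simpler
 than a real WAL key; the "less than" cutoff follows the wording in this
 thread.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative merge of edits from several logs; not HBase's WAL classes. */
final class MultiWalReplaySketch {

  static final class Edit {
    final long seqId;
    final String payload;

    Edit(long seqId, String payload) {
      this.seqId = seqId;
      this.payload = payload;
    }
  }

  /**
   * Collects edits from all logs, drops those whose sequence id is less than
   * lastPersistedSeqId, and returns the rest ordered by sequence id.
   */
  static List<Edit> replayOrder(List<List<Edit>> logs, long lastPersistedSeqId) {
    List<Edit> out = new ArrayList<>();
    for (List<Edit> log : logs) {
      for (Edit e : log) {
        if (e.seqId >= lastPersistedSeqId) {
          out.add(e);
        }
      }
    }
    out.sort(Comparator.comparingLong((Edit e) -> e.seqId));
    return out;
  }
}
```

 A real implementation would sort on the rest of the log key first, as Stack
 notes, and would likely stream-merge already sorted logs instead of loading
 everything into memory.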





[jira] [Created] (HBASE-5100) Rollback of split would cause closed region to opened

2011-12-27 Thread chunhui shen (Created) (JIRA)
Rollback of split would cause closed region to opened 
--

 Key: HBASE-5100
 URL: https://issues.apache.org/jira/browse/HBASE-5100
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen


If the master sends a close-region request to the RS while the region's split
transaction is running concurrently, a closed region may be reopened.

See the detailed code in SplitTransaction#createDaughters
{code}
List<StoreFile> hstoreFilesToSplit = null;
try {
  hstoreFilesToSplit = this.parent.close(false);
  if (hstoreFilesToSplit == null) {
    // The region was closed by a concurrent thread.  We can't continue
    // with the split, instead we must just abandon the split.  If we
    // reopen or split this could cause problems because the region has
    // probably already been moved to a different server, or is in the
    // process of moving to a different server.
    throw new IOException("Failed to close region: already closed by " +
      "another thread");
  }
} finally {
  this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
}
{code}

When rolling back, the JournalEntry.CLOSED_PARENT_REGION entry causes
this.parent.initialize() to be called.

Although this region is not online in the regionserver, reinitializing it may
cause problems.

For example, in our environment, the closed parent region was rolled back
successfully, and then compaction and split started again.

The parent region is f892dd6107b6b4130199582abc78e9c1

master log
{code}
2011-12-26 00:24:42,693 INFO org.apache.hadoop.hbase.master.HMaster: balance 
hri=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
 src=dw87.kgb.sqa.cm4,60020,1324827866085, 
dest=dw80.kgb.sqa.cm4,60020,1324827865780
2011-12-26 00:24:42,693 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Starting unassignment of region 
writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
 (offlining)
2011-12-26 00:24:42,694 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Sent CLOSE to serverName=dw87.kgb.sqa.cm4,60020,1324827866085, 
load=(requests=0, regions=0, usedHeap=0, maxHeap=0) for region 
writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
2011-12-26 00:24:42,699 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handling new unassigned node: 
/hbase-tbfs/unassigned/f892dd6107b6b4130199582abc78e9c1 
(region=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
 server=dw87.kgb.sqa.cm4,60020,1324827866085, state=RS_ZK_REGION_CLOSING)
2011-12-26 00:24:42,699 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handling transition=RS_ZK_REGION_CLOSING, 
server=dw87.kgb.sqa.cm4,60020,1324827866085, 
region=f892dd6107b6b4130199582abc78e9c1
2011-12-26 00:24:45,348 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handling transition=RS_ZK_REGION_CLOSED, 
server=dw87.kgb.sqa.cm4,60020,1324827866085, 
region=f892dd6107b6b4130199582abc78e9c1
2011-12-26 00:24:45,349 DEBUG 
org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED 
event for f892dd6107b6b4130199582abc78e9c1
2011-12-26 00:24:45,349 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Forcing OFFLINE; 
was=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
 state=CLOSED, ts=1324830285347
2011-12-26 00:24:45,349 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:6-0x13447f283f40e73 Creating (or updating) unassigned node for 
f892dd6107b6b4130199582abc78e9c1 with OFFLINE state
2011-12-26 00:24:45,354 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Handling transition=M_ZK_REGION_OFFLINE, server=dw75.kgb.sqa.cm4:6, 
region=f892dd6107b6b4130199582abc78e9c1
2011-12-26 00:24:45,354 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Found an existing plan for 
writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
 destination server is + serverName=dw80.kgb.sqa.cm4,60020,1324827865780, 
load=(requests=0, regions=1, usedHeap=0, maxHeap=0)
2011-12-26 00:24:45,354 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Using pre-existing plan for region 
writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.;
 

[jira] [Commented] (HBASE-5100) Rollback of split would cause closed region to opened

2011-12-27 Thread chunhui shen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176465#comment-13176465
 ] 

chunhui shen commented on HBASE-5100:
-

When this.parent.close(false) returns null (meaning the region has already
been closed), we needn't add the JournalEntry.CLOSED_PARENT_REGION entry.
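
A minimal sketch of that idea, assuming simplified stand-in types (the Parent
interface and the String file list are invented here; this is not the attached
patch):

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; Parent and the String file list are hypothetical stand-ins. */
final class SplitJournalSketch {

  enum JournalEntry { CLOSED_PARENT_REGION }

  /** close() returns null when the region was already closed by another thread. */
  interface Parent {
    List<String> close() throws IOException;
  }

  final List<JournalEntry> journal = new ArrayList<>();

  List<String> closeParent(Parent parent) throws IOException {
    List<String> files = null;
    IOException failure = null;
    boolean closedByOther = false;
    try {
      files = parent.close();
      if (files == null) {
        closedByOther = true;
        failure = new IOException(
            "Failed to close region: already closed by another thread");
      }
    } catch (IOException e) {
      // close() itself failed; the region may be partly closed, so keep the
      // journal entry and let rollback reinitialize it.
      failure = e;
    }
    if (!closedByOther) {
      // Only journal the close when this thread did the closing.
      journal.add(JournalEntry.CLOSED_PARENT_REGION);
    }
    if (failure != null) {
      throw failure;
    }
    return files;
  }
}
```

With this shape, rollback finds no CLOSED_PARENT_REGION entry when another
thread closed the region, so it would not call initialize() on it.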

 Rollback of split would cause closed region to opened 
 --

 Key: HBASE-5100
 URL: https://issues.apache.org/jira/browse/HBASE-5100
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen


[jira] [Commented] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-27 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176467#comment-13176467
 ] 

Zhihong Yu commented on HBASE-5009:
---

Patch for 0.90 passes tests:
{code}
Tests run: 702, Failures: 0, Errors: 0, Skipped: 9

[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 1:18:08.131s
{code}
Integrated to 0.90 branch.

 Failure of creating split dir if it already exists prevents splits from 
 happening further
 -

 Key: HBASE-5009
 URL: https://issues.apache.org/jira/browse/HBASE-5009
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.1, 0.90.6

 Attachments: 5009.txt, HBASE-5009.patch, HBASE-5009_Branch90.patch


 The scenario is
 - The split of a region takes a long time
 - The deletion of the splitDir fails due to HDFS problems.
 - Subsequent splits also fail after that.
 {code}
 private static void createSplitDir(final FileSystem fs, final Path splitdir)
     throws IOException {
   if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? " + splitdir);
   if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of " + splitdir);
 }
 {code}
 Correct me if I am wrong.  If this is an issue, can we change the behaviour of
 throwing an exception here?
 Please suggest.
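
 One possible alternative behaviour is to clean up a leftover split directory
 and recreate it instead of throwing.  The sketch below uses java.nio.file on a
 local filesystem purely for illustration; it is not the Hadoop FileSystem API
 and not the committed patch, and SplitDirSketch is a made-up name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative only: local-filesystem analogue of createSplitDir. */
final class SplitDirSketch {

  /** Removes a stale split directory, if any, then creates a fresh one. */
  static void createSplitDir(Path splitdir) throws IOException {
    if (Files.exists(splitdir)) {
      List<Path> stale;
      try (Stream<Path> walk = Files.walk(splitdir)) {
        // Reverse order so children are deleted before their parents.
        stale = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
      }
      for (Path p : stale) {
        Files.delete(p);
      }
    }
    Files.createDirectories(splitdir);
  }
}
```

 Whether deleting the old directory is safe depends on being sure no other
 split of the same region is still in progress, which this sketch does not
 check.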





[jira] [Issue Comment Edited] (HBASE-5100) Rollback of split would cause closed region to opened

2011-12-27 Thread Zhihong Yu (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176465#comment-13176465
 ] 

Zhihong Yu edited comment on HBASE-5100 at 12/28/11 6:01 AM:
-

When this.parent.close(false) returns null (meaning the region has already
been closed), we needn't add the JournalEntry.CLOSED_PARENT_REGION entry.

  was (Author: zjushch):
When this.parent.close(false) returns null (meaning the region has already
been closed), we needn't add the JournalEntry.CLOSED_PARENT_REGION entry.
  
 Rollback of split would cause closed region to opened 
 --

 Key: HBASE-5100
 URL: https://issues.apache.org/jira/browse/HBASE-5100
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen


[jira] [Commented] (HBASE-5100) Rollback of split would cause closed region to opened

2011-12-27 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176473#comment-13176473
 ] 

Zhihong Yu commented on HBASE-5100:
---

@Chunhui:
Do you have a patch ?

 Rollback of split would cause closed region to opened 
 --

 Key: HBASE-5100
 URL: https://issues.apache.org/jira/browse/HBASE-5100
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen


[jira] [Updated] (HBASE-5100) Rollback of split would cause closed region to opened

2011-12-27 Thread chunhui shen (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-5100:


Attachment: hbase-5100.patch

 Rollback of split would cause closed region to opened 
 --

 Key: HBASE-5100
 URL: https://issues.apache.org/jira/browse/HBASE-5100
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-5100.patch



[jira] [Updated] (HBASE-5094) The META can hold an entry for a region with a different server name from the one actually in the AssignmentManager thus making the region inaccessible.

2011-12-27 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5094:
--

Comment: was deleted

(was: +1 on patch.)

 The META can hold an entry for a region with a different server name from the 
 one actually in the AssignmentManager thus making the region inaccessible.
 

 Key: HBASE-5094
 URL: https://issues.apache.org/jira/browse/HBASE-5094
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: ramkrishna.s.vasudevan
Priority: Critical
 Attachments: 5094.patch


 {code}
 RegionState rit =
     this.services.getAssignmentManager().isRegionInTransition(e.getKey());
 ServerName addressFromAM = this.services.getAssignmentManager()
     .getRegionServerOfRegion(e.getKey());
 if (rit != null && !rit.isClosing() && !rit.isPendingClose()) {
   // Skip regions that were in transition unless CLOSING or
   // PENDING_CLOSE
   LOG.info("Skip assigning region " + rit.toString());
 } else if (addressFromAM != null
     && !addressFromAM.equals(this.serverName)) {
   LOG.debug("Skip assigning region "
       + e.getKey().getRegionNameAsString()
       + " because it has been opened in "
       + addressFromAM.getServerName());
 }
 {code}
 In ServerShutdownHandler we try to get the address from the AM.  This address 
 is initially null because it has not yet been updated after the region was 
 opened, i.e. the callback after node deletion has not yet run on the master side.
 But removal from RIT has already completed on the master side, so this triggers 
 a new assignment.
 So there is a small window between the point where the online region is 
 actually added to the online list and the point where ServerShutdownHandler 
 checks the existing address in the AM.
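The window described above can be modelled in a few lines. This is a minimal sketch, not HBase code: the class, the enum and the boolean parameters are hypothetical stand-ins for the RIT and AM lookups in the quoted snippet, and the final "reassign" branch is assumed from the description rather than shown in it.

```java
/** Hypothetical model of the quoted check; not the HBase implementation. */
final class ShutdownReassignSketch {
    enum Decision { SKIP_IN_TRANSITION, SKIP_OPENED_ELSEWHERE, REASSIGN }

    /**
     * @param inTransition          the region still has a regions-in-transition entry
     * @param closingOrPendingClose that entry is CLOSING or PENDING_CLOSE
     * @param addressFromAM         server the AM has recorded for the region, or null
     * @param deadServer            server being processed by the shutdown handler
     */
    static Decision decide(boolean inTransition, boolean closingOrPendingClose,
                           String addressFromAM, String deadServer) {
        if (inTransition && !closingOrPendingClose) {
            return Decision.SKIP_IN_TRANSITION;
        }
        if (addressFromAM != null && !addressFromAM.equals(deadServer)) {
            return Decision.SKIP_OPENED_ELSEWHERE;
        }
        // Assumed fall-through: nothing says the region lives elsewhere, so reassign.
        return Decision.REASSIGN;
    }
}
```

Inside the window the region has already left RIT but the AM has not yet recorded its new server, so `decide(false, false, null, deadServer)` falls through to `REASSIGN` even though the region is open on another server.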





[jira] [Commented] (HBASE-5009) Failure of creating split dir if it already exists prevents splits from happening further

2011-12-27 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176478#comment-13176478
 ] 

Hadoop QA commented on HBASE-5009:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12508723/5009.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -151 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 76 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/606//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/606//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/606//console

This message is automatically generated.

 Failure of creating split dir if it already exists prevents splits from 
 happening further
 -

 Key: HBASE-5009
 URL: https://issues.apache.org/jira/browse/HBASE-5009
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.92.1, 0.90.6

 Attachments: 5009.txt, HBASE-5009.patch, HBASE-5009_Branch90.patch


 The scenario is
 - The split of a region takes a long time
 - The deletion of the splitDir fails due to HDFS problems.
 - Subsequent splits also fail after that.
 {code}
 private static void createSplitDir(final FileSystem fs, final Path splitdir)
     throws IOException {
   if (fs.exists(splitdir)) throw new IOException("Splitdir already exits? "
       + splitdir);
   if (!fs.mkdirs(splitdir)) throw new IOException("Failed create of "
       + splitdir);
 }
 {code}
 Correct me if I am wrong. If this is an issue, can we change the behaviour 
 so that it does not throw an exception when the directory already exists?
 Please suggest.
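One possible behaviour, sketched below, is to treat an existing split directory as a leftover from an earlier failed split and clear it before recreating it. This is only an illustration under stated assumptions: it uses java.nio.file as a stand-in for Hadoop's FileSystem so that it is self-contained, the class name is hypothetical, and it is not necessarily what the attached patches do.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch; java.nio.file stands in for Hadoop's FileSystem. */
final class SplitDirSketch {
    static void createSplitDir(Path splitdir) throws IOException {
        if (Files.exists(splitdir)) {
            // Assume a stale directory left by a failed split: remove it,
            // deleting children before their parents.
            List<Path> stale;
            try (Stream<Path> walk = Files.walk(splitdir)) {
                stale = walk.sorted(Comparator.reverseOrder())
                            .collect(Collectors.toList());
            }
            for (Path p : stale) {
                Files.delete(p);
            }
        }
        // Fails with an IOException if the directory cannot be created.
        Files.createDirectories(splitdir);
    }
}
```

Whether silently cleaning up is safe depends on nothing else still using the old directory, which this sketch does not check.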





[jira] [Commented] (HBASE-5100) Rollback of split would cause closed region to opened

2011-12-27 Thread chunhui shen (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176480#comment-13176480
 ] 

chunhui shen commented on HBASE-5100:
-

If it returns non-null, exceptionEncountered is also false.
What about naming the boolean alreadyClosed, closedBefore, or something 
similar?

 Rollback of split would cause closed region to opened 
 --

 Key: HBASE-5100
 URL: https://issues.apache.org/jira/browse/HBASE-5100
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: hbase-5100.patch


 If the master sends a close-region request to the RS while that region's 
 split transaction is running concurrently, 
 the closed region may be opened again. 
 See the relevant code in SplitTransaction#createDaughters:
 {code}
 List<StoreFile> hstoreFilesToSplit = null;
 try {
   hstoreFilesToSplit = this.parent.close(false);
   if (hstoreFilesToSplit == null) {
     // The region was closed by a concurrent thread.  We can't continue
     // with the split, instead we must just abandon the split.  If we
     // reopen or split this could cause problems because the region has
     // probably already been moved to a different server, or is in the
     // process of moving to a different server.
     throw new IOException("Failed to close region: already closed by " +
       "another thread");
   }
 } finally {
   this.journal.add(JournalEntry.CLOSED_PARENT_REGION);
 }
 {code}
 When rolling back, the JournalEntry.CLOSED_PARENT_REGION entry causes 
 this.parent.initialize() to be called.
 Although this region is not brought online in the regionserver, it may cause 
 problems.
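A minimal sketch of one way to avoid this: record CLOSED_PARENT_REGION only when this thread was the one that closed the region, so rollback does not reinitialize a region that another thread closed. The classes below are hypothetical stand-ins, not HBase code, and the flag name alreadyClosed is borrowed from the naming suggestion in the comment above rather than from the attached patch.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of the split journal; not the HBase implementation. */
final class SplitJournalSketch {
    enum JournalEntry { CLOSED_PARENT_REGION }

    /** Stand-in for the parent region. */
    static final class Region {
        boolean closedByOtherThread;
        int initializeCalls;

        /** Returns the store files, or null if another thread already closed the region. */
        List<String> close() {
            return closedByOtherThread ? null : Collections.singletonList("storefile-1");
        }

        void initialize() {
            initializeCalls++;
        }
    }

    final List<JournalEntry> journal = new ArrayList<>();

    void closeParent(Region parent) throws IOException {
        boolean alreadyClosed = false;
        try {
            if (parent.close() == null) {
                alreadyClosed = true;
                throw new IOException("Failed to close region: already closed by another thread");
            }
        } finally {
            // Journal the close unless someone else had already closed the region.
            if (!alreadyClosed) {
                journal.add(JournalEntry.CLOSED_PARENT_REGION);
            }
        }
    }

    void rollback(Region parent) {
        // Undo journal entries in reverse order.
        for (int i = journal.size() - 1; i >= 0; i--) {
            if (journal.get(i) == JournalEntry.CLOSED_PARENT_REGION) {
                parent.initialize();
            }
        }
    }
}
```

Keeping the journal write in the finally block still covers the case where close() throws part-way through, which the original code handles the same way.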
 For example, in our environment, the closed parent region was rolled back 
 successfully, and then compaction and split started again.
 The parent region is f892dd6107b6b4130199582abc78e9c1.
 Master log:
 {code}
 2011-12-26 00:24:42,693 INFO org.apache.hadoop.hbase.master.HMaster: balance 
 hri=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  src=dw87.kgb.sqa.cm4,60020,1324827866085, 
 dest=dw80.kgb.sqa.cm4,60020,1324827865780
 2011-12-26 00:24:42,693 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  (offlining)
 2011-12-26 00:24:42,694 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to 
 serverName=dw87.kgb.sqa.cm4,60020,1324827866085, load=(requests=0, regions=0, 
 usedHeap=0, maxHeap=0) for region 
 writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase-tbfs/unassigned/f892dd6107b6b4130199582abc78e9c1 
 (region=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.,
  server=dw87.kgb.sqa.cm4,60020,1324827866085, state=RS_ZK_REGION_CLOSING)
 2011-12-26 00:24:42,699 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSING, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,348 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=RS_ZK_REGION_CLOSED, server=dw87.kgb.sqa.cm4,60020,1324827866085, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED 
 event for f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,349 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=writetest,8ZW417DZP93OU6SZ0QQMKTALTDP4883KW5AXSAFMQ952Y6J6VPPXEXRRPCWBR2PK7DQV3RKK28222JMOJSW3JJ8AB05MIREM1CL6,1324829936318.f892dd6107b6b4130199582abc78e9c1.
  state=CLOSED, ts=1324830285347
 2011-12-26 00:24:45,349 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x13447f283f40e73 Creating (or updating) unassigned node for 
 f892dd6107b6b4130199582abc78e9c1 with OFFLINE state
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling 
 transition=M_ZK_REGION_OFFLINE, server=dw75.kgb.sqa.cm4:6, 
 region=f892dd6107b6b4130199582abc78e9c1
 2011-12-26 00:24:45,354 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Found an existing plan for 
 

[jira] [Commented] (HBASE-3741) Make HRegionServer aware of the regions it's opening/closing

2011-12-27 Thread johnyang (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13176498#comment-13176498
 ] 

johnyang commented on HBASE-3741:
-

Upgrading to 0.90.4 fixes this problem.


 Make HRegionServer aware of the regions it's opening/closing
 

 Key: HBASE-3741
 URL: https://issues.apache.org/jira/browse/HBASE-3741
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.1
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.90.3

 Attachments: HBASE-3741-rsfix-v2.patch, HBASE-3741-rsfix-v3.patch, 
 HBASE-3741-rsfix.patch, HBASE-3741-trunk.patch


 This is a serious issue about a race between regions being opened and closed 
 in region servers. We had this situation where the master tried to unassign a 
 region for balancing, failed, force unassigned it, force assigned it 
 somewhere else, failed to open it on another region server (took too long), 
 and then reassigned it back to the original region server. A few seconds 
 later, the region server processed the first close and the region was left 
 unassigned.
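As a rough illustration of what "aware of the regions it's opening/closing" could mean, the sketch below has a region server record each in-flight open or close and refuse a second request for the same region until the first finishes. The class and method names are hypothetical, and this is not taken from the attached patches.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sketch of per-region transition tracking on a region server. */
final class RegionsInTransitionSketch {
    // Value is true while the region is opening, false while it is closing.
    private final ConcurrentMap<String, Boolean> inTransition = new ConcurrentHashMap<>();

    /** Returns false if an open or close for this region is already in flight. */
    boolean tryBegin(String regionName, boolean opening) {
        return inTransition.putIfAbsent(regionName, opening) == null;
    }

    /** Marks the in-flight open or close as complete. */
    void finish(String regionName) {
        inTransition.remove(regionName);
    }
}
```

With such a guard, an open request that arrives while an earlier close is still pending would be rejected instead of racing with it, and the master could retry or pick another server.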
 This is from the master log:
 {quote}
 11-04-05 15:11:17,758 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
 Sent CLOSE to serverName=sv4borg42,60020,1300920459477, load=(requests=187, 
 regions=574, usedHeap=3918, maxHeap=6973) for region 
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
 2011-04-05 15:12:10,021 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  state=PENDING_CLOSE, ts=1302041477758
 2011-04-05 15:12:10,021 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
 PENDING_CLOSE for too long, running forced unassign again on 
 region=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
 ...
 2011-04-05 15:14:45,783 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  state=CLOSED, ts=1302041685733
 2011-04-05 15:14:45,783 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x42ec2cece810b68 Creating (or updating) unassigned node for 
 1470298961 with OFFLINE state
 ...
 2011-04-05 15:14:45,885 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
 region 
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961;
  
 plan=hri=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961,
  src=sv4borg42,60020,1300920459477, dest=sv4borg40,60020,1302041218196
 2011-04-05 15:14:45,885 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  to sv4borg40,60020,1302041218196
 2011-04-05 15:15:39,410 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
 out:  
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  state=PENDING_OPEN, ts=1302041700944
 2011-04-05 15:15:39,410 INFO 
 org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
 PENDING_OPEN for too long, reassigning 
 region=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
 2011-04-05 15:15:39,410 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  state=PENDING_OPEN, ts=1302041700944
 ...
 2011-04-05 15:15:39,410 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan 
 was found (or we are ignoring an existing plan) for 
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  so generated a random one; 
 hri=stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961,
  src=, dest=sv4borg42,60020,1300920459477; 19 (online=19, exclude=null) 
 available servers
 2011-04-05 15:15:39,410 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 stumbles_by_userid2,\x00'\x8E\xE8\x7F\xFF\xFE\xE7\xA9\x97\xFC\xDF\x01\x10\xCC6,1266566087256.1470298961
  to sv4borg42,60020,1300920459477
 2011-04-05 15:15:40,951 DEBUG 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher: 
 master:6-0x42ec2cece810b68 Received ZooKeeper Event,