[jira] [Commented] (HBASE-5843) Improve HBase MTTR - Mean Time To Recover

2013-01-23 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560472#comment-13560472
 ] 

nkeywal commented on HBASE-5843:


bq. What is the application bug(AB) mentioned in your design doc? Do you mean 
hbase bug? or hbase client application code bug? 
Mainly HBase, but it could also be a coprocessor issue. HBase can be 
configured to stop the regionserver if a coprocessor throws unexpected 
exceptions, but it's quite easy to write buggy code, such as a coprocessor that 
takes resources without freeing them. In that case you may need to stop the region 
server.


bq. If it is hbase client application code bug, does that need stop/start 
region server to fix the issue? 
For a pure client (i.e. a user of the hbase.client package), it would be an 
HBase bug imho: HBase/a regionserver should be resistant to any client behavior.
For a coprocessor, it's client code executed within the regionserver process. 
Thanks to Java, many coprocessor bugs will have a limited effect, but as said 
above there are some cases that cannot be handled simply.

bq. If it is an hbase code bug, do you mean an hbase bug that causes the region server 
to enter some bad state like deadlock, and so on? I think that could benefit from 
restarting the region server to fix the problem. 
Yes.

 Improve HBase MTTR - Mean Time To Recover
 -

 Key: HBASE-5843
 URL: https://issues.apache.org/jira/browse/HBASE-5843
 Project: HBase
  Issue Type: Umbrella
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal

 A part of the approach is described here: 
 https://docs.google.com/document/d/1z03xRoZrIJmg7jsWuyKYl6zNournF_7ZHzdi0qz_B4c/edit
 The ideal target is:
 - a failure impacts client applications only through an added delay to execute a 
 query, whatever the failure.
 - this delay is always less than 1 second.
 We're not going to achieve that immediately...
 Priority will be given to the most frequent issues.
 Short term:
 - software crash
 - standard administrative tasks such as stop/start of a cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6816) [WINDOWS] line endings on checkout for .sh files

2013-01-23 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560507#comment-13560507
 ] 

nkeywal commented on HBASE-6816:


Installed the patch on unix & windows, seems ok.

I was surprised because the example in the git documentation explicitly states 
binary for png & jpg. So does 
https://github.com/Countly/countly-sdk-android/blob/master/.gitattributes for 
example. So I changed architecture.gif on windows, committed, then read it from 
Linux. I found my changes. So I'm +1 :-).

 [WINDOWS] line endings on checkout for .sh files
 

 Key: HBASE-6816
 URL: https://issues.apache.org/jira/browse/HBASE-6816
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: hbase-16_v1.patch, hbase-6816_v1.patch


 On code checkout from svn or git, we need to ensure that the line endings for 
 .sh files are LF, so that they work with cygwin. This is important for 
 getting src/saveVersion.sh to work. 
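
A build or test could guard against CRLF sneaking into a script with a small check. The following is only a minimal sketch and not part of the patch; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: detects Windows (CRLF) line endings in file content. */
final class LineEndings {
  /** Returns true if the content contains at least one CR immediately followed by LF. */
  static boolean hasCrlf(byte[] content) {
    for (int i = 0; i + 1 < content.length; i++) {
      if (content[i] == '\r' && content[i + 1] == '\n') {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    byte[] unixScript = "#!/bin/sh\necho ok\n".getBytes(StandardCharsets.US_ASCII);
    byte[] windowsScript = "#!/bin/sh\r\necho ok\r\n".getBytes(StandardCharsets.US_ASCII);
    System.out.println("unix script has CRLF: " + hasCrlf(unixScript));
    System.out.println("windows script has CRLF: " + hasCrlf(windowsScript));
  }
}
```

In a real check the bytes would come from the checked-out .sh file rather than from a string literal.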



[jira] [Commented] (HBASE-6829) [WINDOWS] Tests should ensure that HLog is closed

2013-01-23 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560519#comment-13560519
 ] 

nkeywal commented on HBASE-6829:


fwiw, TestDefaultCompactSelection works for me on Windows both before and after the 
patch, while the patch fixes TestCacheOnWriteInSchema. Still, the change in 
TestDefaultCompactSelection makes sense.

There are some unused variables in TestDefaultCompactSelection, but they were 
there before the patch:
long tooBig = maxSize + 1;
Path oldLogDir = new Path(basedir, HConstants.HREGION_OLDLOGDIR_NAME);

So +1 from me as well.


 [WINDOWS] Tests should ensure that HLog is closed
 -

 Key: HBASE-6829
 URL: https://issues.apache.org/jira/browse/HBASE-6829
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
  Labels: windows
 Attachments: hbase-6829_v1-0.94.patch, hbase-6829_v1-trunk.patch, 
 hbase-6829_v2-0.94.patch, hbase-6829_v2-trunk.patch, 
 hbase-6829_v3-0.94.patch, hbase-6829_v3-trunk.patch, 
 hbase-6829_v4-trunk.patch, hbase-6829_v4-trunk.patch


 TestCacheOnWriteInSchema and TestCompactSelection fail with 
 {code}
 java.io.IOException: Target HLog directory already exists: 
 ./target/test-data/2d814e66-75d3-4c1b-92c7-a49d9972e8fd/TestCacheOnWriteInSchema/logs
   at org.apache.hadoop.hbase.regionserver.wal.HLog.init(HLog.java:385)
   at org.apache.hadoop.hbase.regionserver.wal.HLog.init(HLog.java:316)
   at 
 org.apache.hadoop.hbase.regionserver.TestCacheOnWriteInSchema.setUp(TestCacheOnWriteInSchema.java:162)
 {code}



[jira] [Commented] (HBASE-6832) [WINDOWS] Tests should use explicit timestamp for Puts, and not rely on implicit RS timing

2013-01-23 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560529#comment-13560529
 ] 

nkeywal commented on HBASE-6832:


For EnvironmentEdgeManager, should we not always initialize the 'time' at 
10 or something like this? I can imagine other pieces of code subtracting 
something from it. Initializing it to something high enough could save us some 
trouble later.

For the fix, apart from the non-critical comment above, I'm ok, but I wonder if 
the root issue (strange time counter on windows) won't show up in production. 
That's another subject, though.
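
To illustrate the idea of a test clock that starts high enough and hands out explicit timestamps, here is a minimal sketch. It is not the EnvironmentEdgeManager API; the class name, its methods and the start value of 1,000,000 are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical injectable test clock; not the HBase EnvironmentEdge API. */
final class ManualTestClock {
  private final AtomicLong now;

  /** Start well above zero so that code computing "now - delta" never goes negative. */
  ManualTestClock(long startMillis) {
    this.now = new AtomicLong(startMillis);
  }

  long currentTimeMillis() {
    return now.get();
  }

  /** Advances the clock and returns the new time, so each write gets a distinct timestamp. */
  long advance(long deltaMillis) {
    return now.addAndGet(deltaMillis);
  }

  public static void main(String[] args) {
    ManualTestClock clock = new ManualTestClock(1_000_000L);
    long putTimestamp = clock.advance(1);
    long deleteTimestamp = clock.advance(1);
    // The test controls ordering explicitly instead of relying on the OS timer resolution.
    System.out.println("put=" + putTimestamp + " delete=" + deleteTimestamp);
  }
}
```

A test would pass the returned values as explicit timestamps to its writes, so two operations in a row can never share a timestamp even if the platform clock has a coarse resolution.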

 [WINDOWS] Tests should use explicit timestamp for Puts, and not rely on 
 implicit RS timing  
 

 Key: HBASE-6832
 URL: https://issues.apache.org/jira/browse/HBASE-6832
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
  Labels: windows
 Attachments: hbase-6832_v1-0.94.patch, hbase-6832_v1-trunk.patch, 
 hbase-6832_v4-0.94.patch, hbase-6832_v4-trunk.patch, hbase-6832_v5-trunk.patch


 TestRegionObserverBypass.testMulti() fails with 
 {code}
 java.lang.AssertionError: expected:1 but was:0
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at 
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass.checkRowAndDelete(TestRegionObserverBypass.java:173)
   at 
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass.testMulti(TestRegionObserverBypass.java:166)
 {code}



[jira] [Commented] (HBASE-6825) [WINDOWS] Java NIO socket channels does not work with Windows ipv6

2013-01-23 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560536#comment-13560536
 ] 

nkeywal commented on HBASE-6825:


If I'm not mistaken, the test uses a fixed port, 8502. This should be changed; 
otherwise we can get random failures when running the test suites.
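
One common way to avoid a hard-coded port in a test is to let the OS choose one by binding to port 0. The sketch below uses only java.net.ServerSocket; the helper class itself is hypothetical and not taken from the patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;

/** Hypothetical test helper: ask the OS for a free port instead of hard-coding one. */
final class FreePort {
  /** Binds to port 0 on loopback, which lets the OS pick an unused port, then releases it. */
  static int pick() {
    try (ServerSocket socket = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
      return socket.getLocalPort();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("free port: " + pick());
  }
}
```

There is still a small window between releasing the port and reusing it in which another process could grab it, so where possible it is safer to keep the socket that was bound to port 0 and use it directly.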

For the issue itself, why should it not make it to the main code? I mean, we 
test that, on windows, a critical feature works. If not we will have issues. 
This should be in main, not in test, no?

I like the way the test is done, btw: we don't explicitly test the jdk1.7, so 
it means that if it's fixed in a later jdk 1.6 patch the code will still be 
right.

And actually, this seems to be fixed in 1.6 u34, according to 
http://www.oracle.com/technetwork/java/javase/documentation/overview-156328.html.

If that's the case, we could make it a requirement and we're done (that's 
acceptable for 0.96 imho. Maybe not for 0.94).

 [WINDOWS] Java NIO socket channels does not work with Windows ipv6
 --

 Key: HBASE-6825
 URL: https://issues.apache.org/jira/browse/HBASE-6825
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.94.3, 0.96.0
 Environment: JDK6 on windows for ipv6. 
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: hbase-6825_v3-0.94.patch, hbase-6825_v3-trunk.patch


 While running the test TestAdmin.testCheckHBaseAvailableClosesConnection(), I 
 noticed that it takes very long, since it sleeps for 2sec * 500, because of 
 zookeeper retries. 
 The root cause of the problem is that ZK uses Java NIO to create 
 ServerSockets from ServerSocketChannels. Under windows, ipv4 and ipv6 
 are implemented independently, and it seems that Java cannot reuse the same 
 socket channel for both ipv4 and ipv6 sockets. We are getting 
 java.net.SocketException: Address family not supported by protocol
 family exceptions. When the ZK client resolves localhost, it gets both the v4 
 127.0.0.1 and the v6 ::1 address, but the socket channel cannot bind to both v4 
 and v6. 
 The problem is reported as:
 http://bugs.sun.com/view_bug.do?bug_id=6230761
 http://stackoverflow.com/questions/1357091/binding-an-ipv6-server-socket-on-windows
 Although the JDK bug is reported as resolved, I have tested with jdk1.6.0_33 
 without any success, whereas JDK7 seems to have fixed this problem. In ZK, 
 we can replace the ClientCnxnSocket implementation from ClientCnxnSocketNIO 
 to a non-NIO one, but I am not sure that would be the way to go.
 Disabling ipv6 resolution of localhost is one other approach. I'll test it 
 to see whether it will be any good. 



[jira] [Commented] (HBASE-6821) [WINDOWS] .META. table name causes file system problems in windows

2013-01-23 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560545#comment-13560545
 ] 

nkeywal commented on HBASE-6821:


Hum, I can't test on Windows: I can't delete the bad directory from the initial 
test now. It works on linux.

It's nothing more than a hack, but it's simple and we're in the tests, so to me 
it's ok.

The root issue is much more of a problem :-(, and can't be fixed without 
discussions (and thinking :-)). In the meantime, we can commit this patch imho. 
+1 so.

 [WINDOWS] .META. table name causes file system problems in windows
 --

 Key: HBASE-6821
 URL: https://issues.apache.org/jira/browse/HBASE-6821
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
  Labels: windows
 Attachments: hbase-4388-root.dir.tgz, hbase-6821_v2_0.94.patch, 
 hbase-6821_v2-trunk.patch, TestMetaMigrationConvertToPB.tgz


 TestMetaMigrationRemovingHTD untars a cluster dir having a .META. 
 subdirectory. This causes mvn clean to fail.



[jira] [Commented] (HBASE-5664) CP hooks in Scan flow for fast forward when filter filters out a row

2013-01-23 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560560#comment-13560560
 ] 

Anoop Sam John commented on HBASE-5664:
---

[~apurtell] Comments from your side?

 CP hooks in Scan flow for fast forward when filter filters out a row
 

 Key: HBASE-5664
 URL: https://issues.apache.org/jira/browse/HBASE-5664
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors, Filters
Affects Versions: 0.92.1
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-5664_94.patch, HBASE-5664_94_V2.patch, 
 HBASE-5664_Trunk.patch, HBASE-5664_Trunk_V2.patch


 In HRegion.nextInternal(int limit, String metric)
   We have while(true) loop so as to fetch a next result which satisfies 
 filter condition. When Filter filters out the current fetched row we call 
 nextRow(byte [] currentRow) before going with the next row.
 {code}
 if (results.isEmpty() || filterRow()) {
 // this seems like a redundant step - we already consumed the row
 // there're no left overs.
 // the reasons for calling this method are:
 // 1. reset the filters.
 // 2. provide a hook to fast forward the row (used by subclasses)
 nextRow(currentRow);
 {code}
 // 2. provide a hook to fast forward the row (used by subclasses)
 We can provide the same fast-forward support for the CP as well.
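
As a rough illustration of what such a hook could look like, here is a toy scan loop over an in-memory list. It is a sketch under stated assumptions: the RowSkipHook interface, its method and the list-of-strings row model are all hypothetical and do not mirror the real HRegion or coprocessor classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Toy model of a scan loop; it does not use or mirror the real HRegion classes. */
final class FastForwardScan {
  /** Hypothetical coprocessor-style hook, invoked after the filter rejects a row. */
  interface RowSkipHook {
    /** Returns the index of the next row worth examining. */
    int nextIndexAfterReject(List<String> rows, int rejectedIndex);
  }

  static List<String> scan(List<String> rows, Predicate<String> filter, RowSkipHook hook) {
    List<String> results = new ArrayList<>();
    int i = 0;
    while (i < rows.size()) {
      if (filter.test(rows.get(i))) {
        results.add(rows.get(i));
        i++;
      } else {
        // Let the hook fast forward, but always advance by at least one row.
        i = Math.max(i + 1, hook.nextIndexAfterReject(rows, i));
      }
    }
    return results;
  }

  public static void main(String[] args) {
    List<String> rows = Arrays.asList("a1", "a2", "a3", "b1", "b2");
    // Assumption for the demo: once a row is rejected, rows sharing its first character can be skipped.
    RowSkipHook skipSamePrefix = (all, idx) -> {
      int next = idx + 1;
      while (next < all.size() && all.get(next).charAt(0) == all.get(idx).charAt(0)) {
        next++;
      }
      return next;
    };
    System.out.println(scan(rows, row -> row.startsWith("b"), skipSamePrefix));
  }
}
```

The point of the hook is that the filter is never evaluated on the rows that were skipped, which is the saving a fast forward is meant to provide.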



[jira] [Commented] (HBASE-7637) hbase-hadoop1-compat conflicts with -Dhadoop.profile=2.0

2013-01-23 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560565#comment-13560565
 ] 

nkeywal commented on HBASE-7637:


You're right, it's not much of an issue.

The patch works with 2.0 and 1.0. Don't forget the clean between the two as I 
initially did :-)

+1 so.

 hbase-hadoop1-compat conflicts with -Dhadoop.profile=2.0
 

 Key: HBASE-7637
 URL: https://issues.apache.org/jira/browse/HBASE-7637
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-7637-0.patch


 I'm unclear on the root cause / fix. Here is the scenario:
 {noformat}
 mvn clean package install -Dhadoop.profile=2.0 -DskipTests
 bin/start-hbase.sh
 {noformat}
 fails with
 {noformat}
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hadoop.metrics2.lib.MetricMutable
 at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 {noformat}
 doing 
 {noformat}
 rm -rf hbase-hadoop1-compat/target/
 {noformat}
 makes it work. 
 In the pom.xml, we never reference hadoop2-compat. But doing so does not 
 help: hadoop1-compat is compiled and takes precedence over hadoop2...



[jira] [Commented] (HBASE-7643) HFileArchiver.resolveAndArchive() race condition and snapshot data loss

2013-01-23 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560637#comment-13560637
 ] 

Jonathan Hsieh commented on HBASE-7643:
---

Looks good.  Just have a few comment suggestions.

Mention cleaner dir deletion race?

{code}
+  if (i > 0) {
+// Ensure that the archive directory exists
+// (we're in a retry loop, so don't worry too much about the exception)
+try {
{code}

Is there corresponding java doc that needs to be changed?

{code}
  public boolean moveAndClose(Path dest) throws IOException {
   this.close();
   Path p = this.getPath();
-  return !fs.rename(p, dest);
+  return fs.rename(p, dest);
 }
{code}

nit: comment about 1 ms sleep between cleaner runs..
{code}
+Stoppable stoppable = new StoppableImplementation();
+HFileCleaner cleaner = new HFileCleaner(1, stoppable, conf, fs, archiveDir);
+
{code}

I buy this but needed to think a bit to figure out why this is correct.  Add a 
comment? (The invariant is that the file is in one or the other place, and if it 
fails in one we check the other.)
{code}
+  try {
+HFileArchiver.archiveRegion(conf, fs, rootDir, sourceRegionDir.getParent(), sourceRegionDir);
+assertTrue(fs.exists(archiveFile));
+assertFalse(fs.exists(sourceFile));
+  } catch (IOException e) {
+assertFalse(fs.exists(archiveFile));
+assertTrue(fs.exists(sourceFile));
+  }
{code}

 HFileArchiver.resolveAndArchive() race condition and snapshot data loss
 ---

 Key: HBASE-7643
 URL: https://issues.apache.org/jira/browse/HBASE-7643
 Project: HBase
  Issue Type: Bug
Affects Versions: hbase-6055, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-7653-p4-v0.patch, HBASE-7653-p4-v1.patch, 
 HBASE-7653-p4-v2.patch, HBASE-7653-p4-v3.patch


  * The master has an hfile cleaner thread (that is responsible for cleaning 
 the /hbase/.archive dir)
  ** /hbase/.archive/table/region/family/hfile
  ** if the family/region/family directory is empty the cleaner removes it
  * The master can archive files (from another thread, e.g. DeleteTableHandler)
  * The region can archive files (from another server/process, e.g. compaction)
 The simplified file archiving code looks like this:
 {code}
 HFileArchiver.resolveAndArchive(...) {
   // ensure that the archive dir exists
   fs.mkdir(archiveDir);
   // move the file to the archiver
   success = fs.rename(originalPath/fileName, archiveDir/fileName)
   // if the rename failed, delete the file without archiving
   if (!success) fs.delete(originalPath/fileName);
 }
 {code}
 Since there's no synchronization between HFileArchiver.resolveAndArchive() 
 and the cleaner run (different process, thread, ...) you can end up in a 
 situation where you are moving something into a directory that doesn't exist.
 {code}
 fs.mkdir(archiveDir);
 // HFileCleaner chore starts at this point
 // and the archiveDirectory that we just ensured to be present gets removed.
 // The rename at this point will fail since the parent directory is missing.
 success = fs.rename(originalPath/fileName, archiveDir/fileName)
 {code}
 The bad thing about deleting the file without archiving is that if you have a 
 snapshot or a cloned table that relies on that file being present, you lose 
 data.
 Possible solutions
  * Create a ZooKeeper lock, to notify the master (Hey I'm archiving 
 something, wait a bit)
  * Add a RS -> Master call to let the master remove files and avoid this 
 kind of situation
  * Avoid removing empty directories from the archive if the table exists or 
 is not disabled
  * Add a try catch around the fs.rename
 The last one, the easiest one, looks like:
 {code}
 for (int i = 0; i < retries; ++i) {
   // ensure archive directory to be present
   fs.mkdir(archiveDir);
   //  possible race -
   // try to archive file
   success = fs.rename(originalPath/fileName, archiveDir/fileName);
   if (success) break;
 }
 {code}
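
The retry idea can be tried out locally with java.nio.file standing in for the Hadoop FileSystem. This is a sketch under that assumption, not the patch itself; the class and method names are hypothetical. Unlike the simplified original code, it never deletes the source when the move fails.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for the archiving step, using java.nio.file instead of the Hadoop FileSystem. */
final class ArchiveWithRetry {
  /**
   * Tries to move source into archiveDir, re-creating the directory before each attempt.
   * Returns true on success; on failure the source file is left in place, never deleted.
   */
  static boolean archive(Path source, Path archiveDir, int retries) {
    for (int i = 0; i < retries; ++i) {
      try {
        // The directory may be removed again by a concurrent cleaner right after this call.
        Files.createDirectories(archiveDir);
        Files.move(source, archiveDir.resolve(source.getFileName()));
        return true;
      } catch (IOException e) {
        // Lost the race (or hit another I/O error): loop and try again.
      }
    }
    return false;
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("archive-demo");
    Path file = Files.createFile(root.resolve("hfile"));
    Path archiveDir = root.resolve("archive").resolve("family");
    boolean archived = archive(file, archiveDir, 3);
    // Invariant: the file is in exactly one of the two places.
    System.out.println("archived=" + archived
        + " inArchive=" + Files.exists(archiveDir.resolve("hfile"))
        + " inSource=" + Files.exists(file));
  }
}
```

Because a failed move leaves the source untouched, a caller that gets false back can still find the file in its original location, which is what a snapshot or cloned table would need.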



[jira] [Updated] (HBASE-7495) parallel scanner seek in StoreScanner's constructor

2013-01-23 Thread liang xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liang xie updated HBASE-7495:
-

Attachment: HBASE-7495.txt

 parallel scanner seek in StoreScanner's constructor
 ---

 Key: HBASE-7495
 URL: https://issues.apache.org/jira/browse/HBASE-7495
 Project: HBase
  Issue Type: Bug
  Components: Scanners
Affects Versions: 0.94.3, 0.96.0
Reporter: liang xie
Assignee: liang xie
 Attachments: HBASE-7495.txt, HBASE-7495.txt


 There seems to be room for improvement before doing scanner.next:
 {code:title=StoreScanner.java|borderStyle=solid}
 if (explicitColumnQuery && lazySeekEnabledGlobally) {
   for (KeyValueScanner scanner : scanners) {
     scanner.requestSeek(matcher.getStartKey(), false, true);
   }
 } else {
   for (KeyValueScanner scanner : scanners) {
     scanner.seek(matcher.getStartKey());
   }
 }
 {code} 
 We can do scanner.requestSeek or scanner.seek in parallel, instead of the current 
 serial execution, to reduce latency for some special cases.
 Any ideas on it?  I'll give it a try if the comments/suggestions are positive :)
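
A parallel seek along these lines could be sketched with a plain ExecutorService. The SeekableScanner interface below is a hypothetical, much reduced stand-in for KeyValueScanner, and the pool size is arbitrary; this is an illustration of the idea, not the attached patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelSeek {
  /** Hypothetical minimal scanner; the real KeyValueScanner has a richer interface. */
  interface SeekableScanner {
    boolean seek(byte[] startKey) throws Exception;
  }

  /** Seeks all scanners concurrently and returns once every seek has finished. */
  static void seekAll(List<? extends SeekableScanner> scanners, final byte[] startKey,
      ExecutorService pool) {
    List<Callable<Boolean>> tasks = new ArrayList<>();
    for (final SeekableScanner scanner : scanners) {
      tasks.add(() -> scanner.seek(startKey));
    }
    try {
      for (Future<Boolean> result : pool.invokeAll(tasks)) {
        result.get(); // surfaces a failed seek as ExecutionException
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while seeking", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("seek failed", e.getCause());
    }
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<SeekableScanner> scanners = new ArrayList<>();
      for (int i = 0; i < 4; i++) {
        final int id = i;
        scanners.add(key -> {
          System.out.println("scanner " + id + " seeking on " + Thread.currentThread().getName());
          return true;
        });
      }
      seekAll(scanners, new byte[] {0}, pool);
    } finally {
      pool.shutdown();
    }
  }
}
```

Whether this helps in practice depends on how expensive each seek is compared with the cost of handing work to the pool, so it is most likely to pay off when several store files must be read from disk.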



[jira] [Updated] (HBASE-7495) parallel scanner seek in StoreScanner's constructor

2013-01-23 Thread liang xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liang xie updated HBASE-7495:
-

Status: Patch Available  (was: Open)

 parallel scanner seek in StoreScanner's constructor
 ---

 Key: HBASE-7495
 URL: https://issues.apache.org/jira/browse/HBASE-7495
 Project: HBase
  Issue Type: Bug
  Components: Scanners
Affects Versions: 0.94.3, 0.96.0
Reporter: liang xie
Assignee: liang xie
 Attachments: HBASE-7495.txt, HBASE-7495.txt


 There seems to be room for improvement before doing scanner.next:
 {code:title=StoreScanner.java|borderStyle=solid}
 if (explicitColumnQuery && lazySeekEnabledGlobally) {
   for (KeyValueScanner scanner : scanners) {
     scanner.requestSeek(matcher.getStartKey(), false, true);
   }
 } else {
   for (KeyValueScanner scanner : scanners) {
     scanner.seek(matcher.getStartKey());
   }
 }
 {code} 
 We can do scanner.requestSeek or scanner.seek in parallel, instead of the current 
 serial execution, to reduce latency for some special cases.
 Any ideas on it?  I'll give it a try if the comments/suggestions are positive :)



[jira] [Updated] (HBASE-7495) parallel seek in StoreScanner

2013-01-23 Thread liang xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liang xie updated HBASE-7495:
-

Summary: parallel seek in StoreScanner  (was: parallel scanner seek in 
StoreScanner's constructor)

 parallel seek in StoreScanner
 -

 Key: HBASE-7495
 URL: https://issues.apache.org/jira/browse/HBASE-7495
 Project: HBase
  Issue Type: Bug
  Components: Scanners
Affects Versions: 0.94.3, 0.96.0
Reporter: liang xie
Assignee: liang xie
 Attachments: HBASE-7495.txt, HBASE-7495.txt


 There seems to be room for improvement before doing scanner.next:
 {code:title=StoreScanner.java|borderStyle=solid}
 if (explicitColumnQuery && lazySeekEnabledGlobally) {
   for (KeyValueScanner scanner : scanners) {
     scanner.requestSeek(matcher.getStartKey(), false, true);
   }
 } else {
   for (KeyValueScanner scanner : scanners) {
     scanner.seek(matcher.getStartKey());
   }
 }
 {code} 
 We can do scanner.requestSeek or scanner.seek in parallel, instead of the current 
 serial execution, to reduce latency for some special cases.
 Any ideas on it?  I'll give it a try if the comments/suggestions are positive :)



[jira] [Commented] (HBASE-7293) [replication] Remove dead sinks from ReplicationSource.currentPeers and pick new ones

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560638#comment-13560638
 ] 

Hudson commented on HBASE-7293:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #368 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/368/])
HBASE-7293 [replication] Remove dead sinks from 
ReplicationSource.currentPeers and pick new ones (Revision 1437240)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java


 [replication] Remove dead sinks from ReplicationSource.currentPeers and pick 
 new ones
 -

 Key: HBASE-7293
 URL: https://issues.apache.org/jira/browse/HBASE-7293
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Jean-Daniel Cryans
Assignee: Lars Hofhansl
 Fix For: 0.96.0, 0.94.5

 Attachments: 7293-0.94.txt, 7293-0.94-v2.txt, 7293-0.96.txt


 I happened to look at a log today where I saw a lot of lines like this:
 {noformat}
 2012-12-06 23:29:08,318 INFO 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Slave 
 cluster looks down: This server is in the failed servers list: 
 sv4r20s49/10.4.20.49:10304
 2012-12-06 23:29:15,987 WARN 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Can't 
 replicate because of a local or network error: 
 java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
   at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:519)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:484)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:416)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:462)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1150)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1000)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
   at $Proxy14.replicateLogEntries(Unknown Source)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:627)
   at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:365)
 2012-12-06 23:29:15,988 INFO 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Slave 
 cluster looks down: Connection refused
 {noformat}
 What struck me as weird is that this had been going on for some days; I would 
 expect the RS to find new servers if it wasn't able to replicate. But the 
 reality is that only a few of the chosen sink RS were down, so eventually the 
 source hits one that's good and never gets to refresh its list of servers.
 We should remove the dead servers; the retries are spammy and probably add some slave 
 lag.
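
The bookkeeping this asks for can be sketched as follows. It is a hedged illustration only: SinkTracker, its methods and the failure threshold are hypothetical names chosen for the example, not fields or behavior of ReplicationSource, and the class is not thread-safe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical tracker; names and thresholds are illustrative, not ReplicationSource's fields. */
final class SinkTracker {
  private final List<String> currentPeers = new ArrayList<>();
  private final Map<String, Integer> failures = new HashMap<>();
  private final int maxFailures;

  SinkTracker(List<String> initialPeers, int maxFailures) {
    this.currentPeers.addAll(initialPeers);
    this.maxFailures = maxFailures;
  }

  /** Records a failed attempt; drops the sink once it has failed maxFailures times in a row. */
  void reportFailure(String sink) {
    int count = failures.merge(sink, 1, Integer::sum);
    if (count >= maxFailures) {
      currentPeers.remove(sink);
      failures.remove(sink);
    }
  }

  /** A success resets the consecutive-failure count for that sink. */
  void reportSuccess(String sink) {
    failures.remove(sink);
  }

  /** True when the list has shrunk enough that the caller should pick new sinks. */
  boolean needsRefresh(int minPeers) {
    return currentPeers.size() < minPeers;
  }

  List<String> peers() {
    return new ArrayList<>(currentPeers);
  }

  public static void main(String[] args) {
    List<String> initial = new ArrayList<>();
    initial.add("sink-a");
    initial.add("sink-b");
    SinkTracker tracker = new SinkTracker(initial, 2);
    tracker.reportFailure("sink-a");
    tracker.reportFailure("sink-a");
    System.out.println(tracker.peers() + " needsRefresh=" + tracker.needsRefresh(2));
  }
}
```

With something like this, a source that keeps hitting one good sink would still notice that its list has shrunk and could ask for a fresh set of servers.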



[jira] [Commented] (HBASE-6466) Enable multi-thread for memstore flush

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560639#comment-13560639
 ] 

Hudson commented on HBASE-6466:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #368 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/368/])
HBASE-6466 Revert, TestLogRolling failed twice on trunk build (Revision 
1437274)
HBASE-6466 Enable multi-thread for memstore flush (Chunhui) (Revision 1437252)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/Threads.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java

tedyu : 
Files : 
* 
/hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/Threads.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java


 Enable multi-thread for memstore flush
 --

 Key: HBASE-6466
 URL: https://issues.apache.org/jira/browse/HBASE-6466
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.96.0
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6466-v6.patch, HBASE-6466.patch, HBASE-6466v2.patch, 
 HBASE-6466v3.1.patch, HBASE-6466v3.patch, HBASE-6466-v4.patch, 
 HBASE-6466-v4.patch, HBASE-6466-v5.patch


 If the KV is large or the HLog is closed under high-pressure putting, we found 
 the memstore is often above the high water mark and blocks the putting.
 So should we enable multi-thread for Memstore Flush?
 Some performance test data for reference,
 1.test environment : 
 random writing; upper memstore limit 5.6GB; lower memstore limit 4.8GB; 400 
 regions per regionserver; row len=50 bytes, value len=1024 bytes; 5 
 regionservers, 300 ipc handlers per regionserver; 5 clients, 50 thread handlers 
 per client for writing
 2.test results:
 - one cacheFlush handler: tps 7.8k/s per regionserver, Flush 10.1MB/s per 
 regionserver, with many aboveGlobalMemstoreLimit blockings
 - two cacheFlush handlers: tps 10.7k/s per regionserver, Flush 12.46MB/s per 
 regionserver
 - 200 thread handlers per client & two cacheFlush handlers: tps 16.1k/s per 
 regionserver, Flush 18.6MB/s per regionserver
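
The general shape of several flush handlers draining one shared queue can be sketched like this. It is a simplified illustration with hypothetical names, not the MemStoreFlusher change itself, and it runs a fixed batch of tasks instead of a long-lived service.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: several handler threads draining one queue of flush requests. */
final class FlushHandlers {
  /** Runs every queued flush task using handlerCount threads; returns the number of tasks run. */
  static int flushAll(List<Runnable> flushTasks, int handlerCount) {
    final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>(flushTasks);
    final AtomicInteger done = new AtomicInteger();
    Thread[] handlers = new Thread[handlerCount];
    for (int i = 0; i < handlerCount; i++) {
      handlers[i] = new Thread(() -> {
        Runnable task;
        while ((task = queue.poll()) != null) { // each handler takes the next pending flush
          task.run();
          done.incrementAndGet();
        }
      }, "flush-handler-" + i);
      handlers[i].start();
    }
    for (Thread handler : handlers) {
      try {
        handler.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for flush handlers", e);
      }
    }
    return done.get();
  }

  public static void main(String[] args) {
    Runnable flush = () -> System.out.println("flushed by " + Thread.currentThread().getName());
    System.out.println(flushAll(Arrays.asList(flush, flush, flush, flush), 2) + " flushes done");
  }
}
```

With one handler a slow flush delays everything queued behind it; with two or more, other regions can be flushed in the meantime, which is consistent with the throughput gain reported in the test results.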



[jira] [Commented] (HBASE-7646) Make forkedProcessTimeoutInSeconds configurable

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560640#comment-13560640
 ] 

Hudson commented on HBASE-7646:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #368 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/368/])
HBASE-7646 Make forkedProcessTimeoutInSeconds configurable (Revision 
1437130)

 Result = FAILURE
jxiang : 
Files : 
* /hbase/trunk/pom.xml


 Make forkedProcessTimeoutInSeconds configurable
 ---

 Key: HBASE-7646
 URL: https://issues.apache.org/jira/browse/HBASE-7646
 Project: HBase
  Issue Type: Bug
  Components: build
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Trivial
 Fix For: 0.96.0, 0.94.5

 Attachments: 0.94-7646.patch, trunk-7646.patch


 Command line property surefire.timeout somehow doesn't work. It may be 
 because forkedProcessTimeoutInSeconds is hard-coded to 900.



[jira] [Commented] (HBASE-7588) Fix two findbugs warning in MemStoreFlusher

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560641#comment-13560641
 ] 

Hudson commented on HBASE-7588:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #368 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/368/])
HBASE-7588 Fix two findbugs warning in MemStoreFlusher; REAPPLIED (Revision 
1437154)
HBASE-7588 Fix two findbugs warning in MemStoreFlusher; REVERTED (Revision 
1437121)
HBASE-7588 Fix two findbugs warning in MemStoreFlusher (Revision 1437119)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java

stack : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java

stack : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java


 Fix two findbugs warning in MemStoreFlusher
 ---

 Key: HBASE-7588
 URL: https://issues.apache.org/jira/browse/HBASE-7588
 Project: HBase
  Issue Type: Bug
Reporter: Jean-Marc Spaggiari
Assignee: Jean-Marc Spaggiari
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-7588-v0-trunk.patch, HBASE-7588-v1-trunk.patch, 
 HBASE-7588-v2-trunk.patch






[jira] [Updated] (HBASE-7637) hbase-hadoop1-compat conflicts with -Dhadoop.profile=2.0

2013-01-23 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-7637:
---

Attachment: nomodules.patch

 hbase-hadoop1-compat conflicts with -Dhadoop.profile=2.0
 

 Key: HBASE-7637
 URL: https://issues.apache.org/jira/browse/HBASE-7637
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-7637-0.patch, nomodules.patch


 I'm unclear on the root cause / fix. Here is the scenario:
 {noformat}
 mvn clean package install -Dhadoop.profile=2.0 -DskipTests
 bin/start-hbase.sh
 {noformat}
 fails with
 {noformat}
 Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.metrics2.lib.MetricMutable
 at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 {noformat}
 doing 
 {noformat}
 rm -rf hbase-hadoop1-compat/target/
 {noformat}
 makes it work. 
 In the pom.xml, we never reference hadoop2-compat. But doing so does not 
 help: hadoop1-compat is compiled and takes precedence over hadoop2...



[jira] [Commented] (HBASE-7637) hbase-hadoop1-compat conflicts with -Dhadoop.profile=2.0

2013-01-23 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560669#comment-13560669
 ] 

nkeywal commented on HBASE-7637:


It's possible to do this (cf. nomodules.patch) in the main pom.xml (on top of 
what you did). It has one advantage: you don't download (or need) the hadoop 
version you don't use. Without it, even if you build hbase for hadoop 1, you 
pull in hadoop 2 as well.



[jira] [Commented] (HBASE-7495) parallel seek in StoreScanner

2013-01-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560676#comment-13560676
 ] 

Hadoop QA commented on HBASE-7495:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12566118/HBASE-7495.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.TestHRegionBusyWait
   org.apache.hadoop.hbase.regionserver.TestAtomicOperation
   org.apache.hadoop.hbase.regionserver.TestHRegion
   org.apache.hadoop.hbase.regionserver.wal.TestLogRolling
   org.apache.hadoop.hbase.regionserver.TestEndToEndSplitTransaction
   org.apache.hadoop.hbase.TestAcidGuarantees
   org.apache.hadoop.hbase.TestLocalHBaseCluster

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4145//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4145//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4145//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4145//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4145//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4145//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4145//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4145//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4145//console

This message is automatically generated.

 parallel seek in StoreScanner
 -

 Key: HBASE-7495
 URL: https://issues.apache.org/jira/browse/HBASE-7495
 Project: HBase
  Issue Type: Bug
  Components: Scanners
Affects Versions: 0.94.3, 0.96.0
Reporter: liang xie
Assignee: liang xie
 Attachments: HBASE-7495.txt, HBASE-7495.txt


 There seems to be room for improvement before calling scanner.next:
 {code:title=StoreScanner.java|borderStyle=solid}
 if (explicitColumnQuery && lazySeekEnabledGlobally) {
   for (KeyValueScanner scanner : scanners) {
     scanner.requestSeek(matcher.getStartKey(), false, true);
   }
 } else {
   for (KeyValueScanner scanner : scanners) {
     scanner.seek(matcher.getStartKey());
   }
 }
 {code} 
 We could call scanner.requestSeek or scanner.seek in parallel, instead of the 
 current serial loop, to reduce latency in some special cases.
 Any ideas on this?  I'll give it a try if the comments/suggestions are positive :)
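
 For illustration only, and not taken from the attached patch: a minimal sketch of what seeking in parallel could look like. The Seeker interface is a hypothetical stand-in for KeyValueScanner (only the seek call is modelled), and the thread pool is assumed to be supplied by the caller. The method returns only after every seek has finished, so whatever orders the KeyValues afterwards still sees fully positioned scanners.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSeekSketch {
  /** Hypothetical stand-in for KeyValueScanner; only seek is modelled. */
  interface Seeker {
    boolean seek(byte[] startKey) throws Exception;
  }

  /** Seeks all scanners to startKey concurrently; returns once every seek is done. */
  static void seekAll(List<? extends Seeker> scanners, byte[] startKey, ExecutorService pool)
      throws InterruptedException, ExecutionException {
    List<Callable<Boolean>> tasks = new ArrayList<>();
    for (Seeker scanner : scanners) {
      tasks.add(() -> scanner.seek(startKey));
    }
    // invokeAll blocks until every task has completed, normally or with an exception.
    for (Future<Boolean> result : pool.invokeAll(tasks)) {
      result.get(); // rethrows a failed seek as ExecutionException
    }
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<Seeker> scanners = new ArrayList<>();
      for (int i = 0; i < 3; i++) {
        final int id = i;
        scanners.add(key -> {
          System.out.println("scanner " + id + " seeking");
          return true;
        });
      }
      seekAll(scanners, new byte[] {0x01}, pool);
      System.out.println("all scanners positioned");
    } finally {
      pool.shutdown();
    }
  }
}
```

 Whether this pays off depends on the cost of a single seek versus the cost of handing work to the pool; that trade-off is not measured here.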



[jira] [Commented] (HBASE-6466) Enable multi-thread for memstore flush

2013-01-23 Thread chunhui shen (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560710#comment-13560710
 ] 

chunhui shen commented on HBASE-6466:
-

TestLogRolling#testLogRollOnDatanodeDeath() failed in trunk builds 3779 and 
3780 at
{code}assertTrue("LowReplication Roller should've been disabled", 
!log.isLowReplicationRollEnabled());
{code}
lowReplicationRollEnabled is only set to false in 
FSHLog#checkLowReplication().
FSHLog#checkLowReplication() is only called by FSHLog#syncer(), and the call 
is skipped while the log is rolling:
{code}
if (!this.logRollRunning) {
  checkLowReplication();
  ...
}
{code}

Therefore, I can think of only one reason for this test failure: the log was 
rolling when syncer() was called.

In the logs, I could find "HDFS pipeline error detected. Found 1 replicas but 
expecting no less than 2 replicas" (logged by FSHLog#checkLowReplication()) 
only 3 times, but it needs to be logged at least 4 times for the test to pass.
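
To make the reasoning concrete, here is a toy model of the guard quoted above. It is only a sketch: the class and its counter are hypothetical and are not FSHLog code. It shows that a sync arriving while logRollRunning is true does not run the check, so four syncs can yield only three checks.

```java
class LowReplicationCheckSketch {
  // True while a (simulated) log roll is in progress.
  private volatile boolean logRollRunning = false;
  // Counts how often the stand-in for checkLowReplication() actually ran.
  private int checksRun = 0;

  void setLogRollRunning(boolean running) {
    this.logRollRunning = running;
  }

  /** Mirrors the quoted guard: the check is skipped during a roll. */
  void sync() {
    if (!this.logRollRunning) {
      checksRun++; // stands in for checkLowReplication()
    }
  }

  int getChecksRun() {
    return checksRun;
  }

  public static void main(String[] args) {
    LowReplicationCheckSketch log = new LowReplicationCheckSketch();
    log.sync();
    log.setLogRollRunning(true);
    log.sync(); // arrives mid-roll, so no check happens
    log.setLogRollRunning(false);
    log.sync();
    log.sync();
    System.out.println(log.getChecksRun() + " checks for 4 syncs");
  }
}
```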

It's easy to reproduce the failed test with the following change in FSHLog:
{code}
--- hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java (revision 1437274)
+++ hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java (working copy)
@@ -501,6 +501,10 @@
   byte [][] regionsToFlush = null;
   try {
     this.logRollRunning = true;
+    try {
+      Thread.sleep(1500);
+    } catch (InterruptedException e) {
+    }
     boolean isClosed = closed;
     if (isClosed || !closeBarrier.beginOp()) {
       LOG.debug("HLog " + (isClosed ? "closed" : "closing") + ". Skipping rolling of writer");
{code}

In addition, with patch v6, TestLogRolling passed 50 times on my local PC.

Attaching patch v7, which changes TestLogRolling slightly.



 Enable multi-thread for memstore flush
 --

 Key: HBASE-6466
 URL: https://issues.apache.org/jira/browse/HBASE-6466
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.96.0
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6466-v6.patch, HBASE-6466.patch, HBASE-6466v2.patch, 
 HBASE-6466v3.1.patch, HBASE-6466v3.patch, HBASE-6466-v4.patch, 
 HBASE-6466-v4.patch, HBASE-6466-v5.patch


 If the KV is large or the HLog is closed under high put pressure, we found 
 that the memstore is often above the high water mark and blocks puts.
 So should we enable multi-threading for memstore flush?
 Some performance test data for reference:
 1. Test environment: 
 random writing; upper memstore limit 5.6GB; lower memstore limit 4.8GB; 400 
 regions per regionserver; row len=50 bytes, value len=1024 bytes; 5 
 regionservers, 300 ipc handlers per regionserver; 5 clients, 50 thread 
 handlers per client for writing
 2. Test results:
 one cacheFlush handler: tps 7.8k/s per regionserver, flush 10.1MB/s per 
 regionserver, with many aboveGlobalMemstoreLimit blocking events
 two cacheFlush handlers: tps 10.7k/s per regionserver, flush 12.46MB/s per 
 regionserver
 200 thread handlers per client and two cacheFlush handlers: tps 16.1k/s per 
 regionserver, flush 18.6MB/s per regionserver



[jira] [Commented] (HBASE-7221) RowKey utility class for rowkey construction

2013-01-23 Thread Doug Meil (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560714#comment-13560714
 ] 

Doug Meil commented on HBASE-7221:
--


This parsing approach...
{code}
 byte[] key = RowKey.format("%16x%4d%8d", hashVal, intVal, longVal);
{code}
... seems a lot less understandable to me than the proposal.  It also doesn't 
address reading components back, which is why the RowKey (aka 
FixedLengthKey/ComponentKey) needs to have state.  I don't think it's enough 
just to have a builder pattern; people need some way of reading and processing 
the key.  It's not just about the writes.
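
As a rough illustration of why state helps with reads as well as writes, here is a minimal sketch built on java.nio.ByteBuffer. The class FixedLengthKeySketch and all of its methods are hypothetical; this is not the API in the attached patches, only an assumed shape for a key that can be both written and read back.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class FixedLengthKeySketch {
  static final int SIZEOF_MD5_HASH = 16;
  static final int SIZEOF_LONG = 8;

  private final ByteBuffer buf;

  private FixedLengthKeySketch(ByteBuffer buf) {
    this.buf = buf;
  }

  /** Starts an empty key of the given total length (write side). */
  static FixedLengthKeySketch create(int length) {
    return new FixedLengthKeySketch(ByteBuffer.allocate(length));
  }

  /** Wraps an existing rowkey so its components can be read back (read side). */
  static FixedLengthKeySketch wrap(byte[] key) {
    return new FixedLengthKeySketch(ByteBuffer.wrap(key.clone()));
  }

  FixedLengthKeySketch addHash(String value) {
    buf.put(md5(value));
    return this;
  }

  FixedLengthKeySketch add(long value) {
    buf.putLong(value); // big-endian by default
    return this;
  }

  byte[] getBytes() {
    return buf.array().clone();
  }

  byte[] readHash(int offset) {
    return Arrays.copyOfRange(buf.array(), offset, offset + SIZEOF_MD5_HASH);
  }

  long readLong(int offset) {
    return buf.getLong(offset); // absolute read, does not move the position
  }

  static byte[] md5(String value) {
    try {
      return MessageDigest.getInstance("MD5").digest(value.getBytes(StandardCharsets.UTF_8));
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e); // MD5 is a required algorithm on the Java platform
    }
  }

  public static void main(String[] args) {
    byte[] bytes = create(SIZEOF_MD5_HASH + SIZEOF_LONG).addHash("user42").add(1234L).getBytes();
    FixedLengthKeySketch parsed = wrap(bytes);
    System.out.println("length=" + bytes.length + " long=" + parsed.readLong(SIZEOF_MD5_HASH));
  }
}
```

The offsets are passed explicitly here to keep the sketch short; a fuller design would presumably track component positions itself.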



 RowKey utility class for rowkey construction
 

 Key: HBASE-7221
 URL: https://issues.apache.org/jira/browse/HBASE-7221
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: HBASE_7221.patch, hbase-common_hbase_7221_2.patch, 
 hbase-common_hbase_7221_v3.patch


 A common question in the dist-lists is how to construct rowkeys, particularly 
 composite keys.  Put/Get/Scan specifies byte[] as the rowkey, but it's up to 
 you to sensibly populate that byte-array, and that's where things tend to go 
 off the rails.
 This RowKey utility class isn't meant to add functionality to 
 Put/Get/Scan, but rather to make it simpler for folks to construct said arrays.  
 Example:
 {code}
RowKey key = RowKey.create(RowKey.SIZEOF_MD5_HASH + RowKey.SIZEOF_LONG);
key.addHash(a);
key.add(b);
byte bytes[] = key.getBytes();
 {code} 



[jira] [Updated] (HBASE-6466) Enable multi-thread for memstore flush

2013-01-23 Thread chunhui shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunhui shen updated HBASE-6466:


Attachment: 6466-v7.patch

 Enable multi-thread for memstore flush
 --

 Key: HBASE-6466
 URL: https://issues.apache.org/jira/browse/HBASE-6466
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.96.0
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6466-v6.patch, 6466-v7.patch, HBASE-6466.patch, 
 HBASE-6466v2.patch, HBASE-6466v3.1.patch, HBASE-6466v3.patch, 
 HBASE-6466-v4.patch, HBASE-6466-v4.patch, HBASE-6466-v5.patch




[jira] [Commented] (HBASE-7637) hbase-hadoop1-compat conflicts with -Dhadoop.profile=2.0

2013-01-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560715#comment-13560715
 ] 

Hadoop QA commented on HBASE-7637:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12566119/nomodules.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestZKBasedOpenCloseRegion

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4146//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4146//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4146//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4146//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4146//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4146//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4146//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4146//console

This message is automatically generated.



[jira] [Created] (HBASE-7650) bin/hbase-config.sh is not executable

2013-01-23 Thread nkeywal (JIRA)
nkeywal created HBASE-7650:
--

 Summary: bin/hbase-config.sh is not executable
 Key: HBASE-7650
 URL: https://issues.apache.org/jira/browse/HBASE-7650
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.94.4, 0.96.0
Reporter: nkeywal


And it is strange that everything seems to work despite this...



[jira] [Resolved] (HBASE-7650) bin/hbase-config.sh is not executable

2013-01-23 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal resolved HBASE-7650.


Resolution: Invalid

It is only meant to be sourced, so it does not need to be executable.

 bin/hbase-config.sh is not executable
 -

 Key: HBASE-7650
 URL: https://issues.apache.org/jira/browse/HBASE-7650
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.96.0, 0.94.4
Reporter: nkeywal

 And it is strange that everything seems to work despite this...



[jira] [Commented] (HBASE-6466) Enable multi-thread for memstore flush

2013-01-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560745#comment-13560745
 ] 

Hadoop QA commented on HBASE-6466:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12566127/6466-v7.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 3 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestLocalHBaseCluster

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4147//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4147//console

This message is automatically generated.



[jira] [Commented] (HBASE-7495) parallel seek in StoreScanner

2013-01-23 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560791#comment-13560791
 ] 

ramkrishna.s.vasudevan commented on HBASE-7495:
---

[~xieliang007]
How will the ordering be maintained?  Do we need to ensure the ordering of the 
kvs?  Just asking.



[jira] [Commented] (HBASE-7403) Online Merge

2013-01-23 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560793#comment-13560793
 ] 

ramkrishna.s.vasudevan commented on HBASE-7403:
---

I don't have many comments here.  I clarified my doubts with Chunhui.
Overall the functionality seems fine and the scenarios have been taken care of.
So +1 from me.  Thanks Chunhui.

 Online Merge
 

 Key: HBASE-7403
 URL: https://issues.apache.org/jira/browse/HBASE-7403
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.94.3
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0, 0.94.5

 Attachments: 7403-trunkv5.patch, 7403-trunkv6.patch, 7403v5.diff, 
 7403-v5.txt, 7403v5.txt, hbase-7403-94v1.patch, hbase-7403-trunkv10.patch, 
 hbase-7403-trunkv11.patch, hbase-7403-trunkv1.patch, 
 hbase-7403-trunkv5.patch, hbase-7403-trunkv6.patch, hbase-7403-trunkv7.patch, 
 hbase-7403-trunkv8.patch, hbase-7403-trunkv9.patch, merge region.pdf


 The features of this online merge:
 1. Online: no need to disable the table
 2. Few changes to the current code; could be applied to trunk, 0.94, 0.92 or 0.90
 3. Easy to issue a merge request: no need to input a long region name, the 
 encoded name is enough
 4. No restrictions during the operation: you don't need to take care of events 
 like Server Dead, Balance, Split, Disabling/Enabling table, or of whether you 
 sent a wrong merge request; this is already handled for you
 5. Only a short offline time for the two merging regions
 Usage:
 1. Tool:  
 bin/hbase org.apache.hadoop.hbase.util.OnlineMerge [-force] [-async] [-show] 
 table-name region-encodedname-1 region-encodedname-2
 2. API: static void MergeManager#createMergeRequest
 We need merge in the following cases:
 1. Region holes or region overlaps that can’t be fixed by hbck
 2. Regions that become empty because of TTL and an unreasonable rowkey design
 3. Regions that are always empty or very small because of presplitting at 
 table creation
 4. Too many empty or small regions, which reduce system performance (e.g. 
 mslab)
 The current merge tools only work offline and cannot redo the merge if an 
 exception is thrown in the middle of it, which leaves dirty data.
 For an online system, we need an online merge.
 The logic of this patch for online merge is as follows.
 For example, to merge regionA and regionB into regionC:
 1. Offline the two regions A and B
 2. Merge the two regions in HDFS (create regionC’s directory, move 
 regionA’s and regionB’s files to regionC’s directory, delete regionA’s and 
 regionB’s directories)
 3. Add the merged regionC to .META.
 4. Assign the merged regionC
 By design, once the merge work in HDFS has been done, we can redo it until it 
 succeeds if it throws an exception, aborts, or the server restarts, but it 
 cannot be rolled back. 
 It depends on the following:
 Use zookeeper to record the transaction journal state, which makes redo easier
 Use zookeeper to send/receive merge requests
 The merge transaction is executed on the master
 Merge requests can be issued through the API or the shell tool
 For the merge process, please see the attachment and the patch
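
 The redo-until-success idea can be sketched as a small journaled state machine. This is only an assumed model, not the patch's code: the Step names paraphrase the four steps listed above, the journal is an in-memory field standing in for the state the description keeps in zookeeper, and every step is assumed to be safe to run again after a failure.

```java
import java.util.ArrayList;
import java.util.List;

class MergeJournalSketch {
  /** Hypothetical step names paraphrasing the four merge steps. */
  enum Step { OFFLINE_REGIONS, MERGE_IN_HDFS, ADD_TO_META, ASSIGN_MERGED, DONE }

  interface StepAction {
    void run(Step step) throws Exception;
  }

  // In-memory stand-in for the journal state recorded in zookeeper.
  private Step journal = Step.OFFLINE_REGIONS;
  final List<Step> executed = new ArrayList<>();

  /**
   * Runs the remaining steps starting at the journaled one. On failure the
   * journal still points at the failed step, so a later call redoes it.
   */
  boolean resume(StepAction action) {
    while (journal != Step.DONE) {
      try {
        action.run(journal);
      } catch (Exception e) {
        return false; // no rollback: the same step is retried on the next call
      }
      executed.add(journal);
      journal = Step.values()[journal.ordinal() + 1];
    }
    return true;
  }

  Step current() {
    return journal;
  }

  public static void main(String[] args) {
    MergeJournalSketch merge = new MergeJournalSketch();
    int[] failuresLeft = {1};
    StepAction flaky = step -> {
      if (step == Step.MERGE_IN_HDFS && failuresLeft[0]-- > 0) {
        throw new IllegalStateException("simulated crash during " + step);
      }
    };
    while (!merge.resume(flaky)) {
      System.out.println("failed at " + merge.current() + ", redoing");
    }
    System.out.println("executed: " + merge.executed);
  }
}
```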



[jira] [Updated] (HBASE-7648) TestAcidGuarantees.testMixedAtomicity hangs sometimes

2013-01-23 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7648:
---

   Resolution: Fixed
Fix Version/s: 0.94.5
   0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

 TestAcidGuarantees.testMixedAtomicity hangs sometimes
 -

 Key: HBASE-7648
 URL: https://issues.apache.org/jira/browse/HBASE-7648
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.96.0, 0.94.5

 Attachments: 0.94-7648.patch, trunk-7648.patch


 java.lang.RuntimeException: Deferred
 at org.apache.hadoop.hbase.MultithreadedTestUtil$TestContext.checkException(MultithreadedTestUtil.java:76)
 at org.apache.hadoop.hbase.MultithreadedTestUtil$TestContext.waitFor(MultithreadedTestUtil.java:69)
 at org.apache.hadoop.hbase.TestAcidGuarantees.runTestAtomicity(TestAcidGuarantees.java:301)
 at org.apache.hadoop.hbase.TestAcidGuarantees.runTestAtomicity(TestAcidGuarantees.java:244)
 at org.apache.hadoop.hbase.TestAcidGuarantees.testMixedAtomicity(TestAcidGuarantees.java:343)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47)
 at org.junit.rules.RunRules.evaluate(RunRules.java:18)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
 at org.junit.runners.Suite.runChild(Suite.java:128)
 at org.junit.runners.Suite.runChild(Suite.java:24)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.NotServingRegionException): org.apache.hadoop.hbase.NotServingRegionException: Region is not online: TestAcidGuarantees,,135776964.317288e8ca738963ca5e273fc56750fd.
 at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3211)
 at org.apache.hadoop.hbase.regionserver.HRegionServer.flushRegion(HRegionServer.java:2963)
 at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1400)
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1021)
 at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
 at $Proxy23.flushRegion(Unknown Source)
 at org.apache.hadoop.hbase.client.HBaseAdmin.flush(HBaseAdmin.java:1248)
 at org.apache.hadoop.hbase.client.HBaseAdmin.flush(HBaseAdmin.java:1230)
 at org.apache.hadoop.hbase.TestAcidGuarantees$1.doAnAction(TestAcidGuarantees.java:272)
 at org.apache.hadoop.hbase.MultithreadedTestUtil$RepeatingTestThread.doWork(MultithreadedTestUtil.java:145)
 at org.apache.hadoop.hbase.MultithreadedTestUtil$TestThread.run(MultithreadedTestUtil.java:121)


[jira] [Commented] (HBASE-6466) Enable multi-thread for memstore flush

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560848#comment-13560848
 ] 

Ted Yu commented on HBASE-6466:
---

I ran TestLogRolling using patch v7 locally and it passed:

Running org.apache.hadoop.hbase.regionserver.wal.TestLogRolling
2013-01-23 09:22:46.693 java[8875:1703] Unable to load realm info from 
SCDynamicStore
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 141.023 sec

Integrated to trunk again.

Let's see what Jenkins tells us.

 Enable multi-thread for memstore flush
 --

 Key: HBASE-6466
 URL: https://issues.apache.org/jira/browse/HBASE-6466
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.96.0
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6466-v6.patch, 6466-v7.patch, HBASE-6466.patch, 
 HBASE-6466v2.patch, HBASE-6466v3.1.patch, HBASE-6466v3.patch, 
 HBASE-6466-v4.patch, HBASE-6466-v4.patch, HBASE-6466-v5.patch


 If the KVs are large, or the HLog is closed while puts arrive under high pressure, we found 
 that the memstore is often above the high water mark and blocks the puts.
 So should we enable multiple threads for memstore flush?
 Some performance test data for reference:
 1. Test environment: 
 random writing; upper memstore limit 5.6GB; lower memstore limit 4.8GB; 400 
 regions per regionserver; row len=50 bytes, value len=1024 bytes; 5 
 regionservers, 300 ipc handlers per regionserver; 5 clients, 50 thread handlers 
 per client for writing
 2. Test results:
 one cacheFlush handler, tps: 7.8k/s per regionserver, Flush: 10.1MB/s per 
 regionserver, with many aboveGlobalMemstoreLimit blocking events
 two cacheFlush handlers, tps: 10.7k/s per regionserver, Flush: 12.46MB/s per 
 regionserver
 200 thread handlers per client and two cacheFlush handlers, tps: 16.1k/s per 
 regionserver, Flush: 18.6MB/s per regionserver
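
 As a rough illustration of what "more than one cacheFlush handler" means, here is a minimal sketch. It assumes a plain java.util.concurrent thread pool and models a flush as adding a memstore size to a counter; the class and method names are hypothetical and the real regionserver flush path is far more involved.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: several flush handler threads draining one queue of flush requests.
final class MultiFlushSketch {
    /**
     * "Flushes" every memstore in the list with handlerCount handler threads
     * and returns the total number of bytes flushed.
     */
    static long flushAll(List<Long> memstoreSizes, int handlerCount) {
        ExecutorService handlers = Executors.newFixedThreadPool(handlerCount);
        AtomicLong flushedBytes = new AtomicLong();
        for (long size : memstoreSizes) {
            // Each request is picked up by whichever handler thread is free.
            handlers.execute(() -> {
                flushedBytes.addAndGet(size);
            });
        }
        handlers.shutdown();
        try {
            if (!handlers.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("flush handlers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return flushedBytes.get();
    }
}
```

 With handlerCount set to 1 this degenerates to the single-handler case from the first test result; raising it lets several flushes proceed at once, which is the effect the numbers above are measuring.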



[jira] [Updated] (HBASE-7643) HFileArchiver.resolveAndArchive() race condition and snapshot data loss

2013-01-23 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-7643:
---

Attachment: HBASE-7653-p4-v4.patch

added more comments, as Jon suggested.

{quote}
Is there corresponding java doc that needs to be changed?
{quote}
No, the javadoc was wrong before

 HFileArchiver.resolveAndArchive() race condition and snapshot data loss
 ---

 Key: HBASE-7643
 URL: https://issues.apache.org/jira/browse/HBASE-7643
 Project: HBase
  Issue Type: Bug
Affects Versions: hbase-6055, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-7653-p4-v0.patch, HBASE-7653-p4-v1.patch, 
 HBASE-7653-p4-v2.patch, HBASE-7653-p4-v3.patch, HBASE-7653-p4-v4.patch


  * The master has an hfile cleaner thread (responsible for cleaning 
 the /hbase/.archive dir)
  ** /hbase/.archive/table/region/family/hfile
  ** if the table/region/family directory is empty, the cleaner removes it
  * The master can archive files (from another thread, e.g. DeleteTableHandler)
  * The region can archive files (from another server/process, e.g. compaction)
 The simplified file archiving code looks like this:
 {code}
 HFileArchiver.resolveAndArchive(...) {
   // ensure that the archive dir exists
   fs.mkdir(archiveDir);
   // move the file to the archive
   success = fs.rename(originalPath/fileName, archiveDir/fileName);
   // if the rename failed, delete the file without archiving
   if (!success) fs.delete(originalPath/fileName);
 }
 {code}
 Since there's no synchronization between HFileArchiver.resolveAndArchive() 
 and the cleaner run (different process, thread, ...), you can end up in a 
 situation where you are moving a file into a directory that no longer exists.
 {code}
 fs.mkdir(archiveDir);
 // The HFileCleaner chore starts at this point,
 // and the archive directory that we just ensured to be present gets removed.
 // The rename at this point will fail since the parent directory is missing.
 success = fs.rename(originalPath/fileName, archiveDir/fileName);
 {code}
 The problem with deleting the file without archiving it is that you lose data 
 if a snapshot or a cloned table relies on that file being 
 present.
 Possible solutions:
  * Create a ZooKeeper lock to notify the master ("Hey, I'm archiving 
 something, wait a bit")
  * Add a RS -> Master call to let the master remove files and avoid this 
 kind of situation
  * Avoid removing empty directories from the archive if the table exists or 
 is not disabled
  * Add a try/catch around the fs.rename
 The last one, the easiest one, looks like:
 {code}
 for (int i = 0; i < retries; ++i) {
   // ensure the archive directory is present
   fs.mkdir(archiveDir);
   // possible race here: the cleaner may remove archiveDir again
   // try to archive the file
   success = fs.rename(originalPath/fileName, archiveDir/fileName);
   if (success) break;
 }
 {code}
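
 For readers who want to try the retry idea outside HBase, here is a self-contained sketch. It is an assumption-laden stand-in: it uses java.nio.file instead of the Hadoop FileSystem calls shown in the pseudocode, and the class name, method names, and the temp-directory demo are all hypothetical. The point it illustrates is the one made above: re-create the archive directory before each attempt, and never delete the source file when the move fails.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of the "retry the rename" approach using java.nio.file.
final class ArchiveRetrySketch {

    /**
     * Tries to move file into archiveDir, re-creating the directory before
     * every attempt. Returns false if all attempts fail. The source file is
     * never deleted here, so a failed archive cannot lose data.
     */
    static boolean archiveWithRetries(Path file, Path archiveDir, int retries) {
        for (int i = 0; i < retries; ++i) {
            try {
                // Ensure the archive directory is present (a cleaner may have removed it).
                Files.createDirectories(archiveDir);
                // A concurrent cleaner could still delete archiveDir right here.
                Files.move(file, archiveDir.resolve(file.getFileName()));
                return true;
            } catch (IOException e) {
                // Most likely the parent directory vanished; loop and try again.
            }
        }
        return false;
    }

    /** Small self-check: archives a temp file and reports whether it landed in the archive. */
    static boolean demo() {
        try {
            Path root = Files.createTempDirectory("archive-sketch");
            Path file = Files.createFile(root.resolve("hfile-0001"));
            Path archiveDir = root.resolve("archive").resolve("table").resolve("region").resolve("family");
            boolean moved = archiveWithRetries(file, archiveDir, 3);
            return moved && Files.exists(archiveDir.resolve("hfile-0001")) && !Files.exists(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

 Retrying narrows the race window but does not close it; the other listed options (a lock, or routing deletes through the master) are the ones that would remove it entirely.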



[jira] [Commented] (HBASE-7643) HFileArchiver.resolveAndArchive() race condition and snapshot data loss

2013-01-23 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560861#comment-13560861
 ] 

Jonathan Hsieh commented on HBASE-7643:
---

v4 lgtm.  please fix line length complaints before commit.

 HFileArchiver.resolveAndArchive() race condition and snapshot data loss
 ---

 Key: HBASE-7643
 URL: https://issues.apache.org/jira/browse/HBASE-7643
 Project: HBase
  Issue Type: Bug
Affects Versions: hbase-6055, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-7653-p4-v0.patch, HBASE-7653-p4-v1.patch, 
 HBASE-7653-p4-v2.patch, HBASE-7653-p4-v3.patch, HBASE-7653-p4-v4.patch





[jira] [Updated] (HBASE-7643) HFileArchiver.resolveAndArchive() race condition and snapshot data loss

2013-01-23 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-7643:
---

Attachment: HBASE-7653-p4-v5.patch

 HFileArchiver.resolveAndArchive() race condition and snapshot data loss
 ---

 Key: HBASE-7643
 URL: https://issues.apache.org/jira/browse/HBASE-7643
 Project: HBase
  Issue Type: Bug
Affects Versions: hbase-6055, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-7653-p4-v0.patch, HBASE-7653-p4-v1.patch, 
 HBASE-7653-p4-v2.patch, HBASE-7653-p4-v3.patch, HBASE-7653-p4-v4.patch, 
 HBASE-7653-p4-v5.patch





[jira] [Commented] (HBASE-7622) Add table descriptor verification after snapshot restore

2013-01-23 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560867#comment-13560867
 ] 

Matteo Bertozzi commented on HBASE-7622:


I'm going to commit this to the snapshots branch if there are no objections.

 Add table descriptor verification after snapshot restore
 

 Key: HBASE-7622
 URL: https://issues.apache.org/jira/browse/HBASE-7622
 Project: HBase
  Issue Type: Sub-task
  Components: snapshots
Affects Versions: hbase-6055
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: hbase-6055

 Attachments: HBASE-7622-v0.patch, HBASE-7622-v1.patch, 
 HBASE-7622-v2.patch


 Add the schema verification not only based on disk data, but also on the 
 HTableDescriptor



[jira] [Commented] (HBASE-6832) [WINDOWS] Tests should use explicit timestamp for Puts, and not rely on implicit RS timing

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560868#comment-13560868
 ] 

Ted Yu commented on HBASE-6832:
---

Patch looks good to me.

 [WINDOWS] Tests should use explicit timestamp for Puts, and not rely on 
 implicit RS timing  
 

 Key: HBASE-6832
 URL: https://issues.apache.org/jira/browse/HBASE-6832
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
  Labels: windows
 Attachments: hbase-6832_v1-0.94.patch, hbase-6832_v1-trunk.patch, 
 hbase-6832_v4-0.94.patch, hbase-6832_v4-trunk.patch, hbase-6832_v5-trunk.patch


 TestRegionObserverBypass.testMulti() fails with 
 {code}
 java.lang.AssertionError: expected:<1> but was:<0>
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass.checkRowAndDelete(TestRegionObserverBypass.java:173)
   at org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass.testMulti(TestRegionObserverBypass.java:166)
 {code}



[jira] [Commented] (HBASE-6821) [WINDOWS] .META. table name causes file system problems in windows

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560872#comment-13560872
 ] 

Ted Yu commented on HBASE-6821:
---

For TestMetaMigrationConvertToPB.README, year in license header is not needed.

Other than the above, +1.

 [WINDOWS] .META. table name causes file system problems in windows
 --

 Key: HBASE-6821
 URL: https://issues.apache.org/jira/browse/HBASE-6821
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
  Labels: windows
 Attachments: hbase-4388-root.dir.tgz, hbase-6821_v2_0.94.patch, 
 hbase-6821_v2-trunk.patch, TestMetaMigrationConvertToPB.tgz


 TestMetaMigrationRemovingHTD untars a cluster dir having a .META. 
 subdirectory. This causes mvn clean to fail.



[jira] [Updated] (HBASE-6731) Port HBASE-6537 'Race between balancer and disable table can lead to inconsistent cluster' to 0.92

2013-01-23 Thread rajeshbabu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rajeshbabu updated HBASE-6731:
--

Attachment: HBASE-6731.patch

Same as HBASE-6537 patch.

 Port HBASE-6537 'Race between balancer and disable table can lead to 
 inconsistent cluster' to 0.92
 --

 Key: HBASE-6731
 URL: https://issues.apache.org/jira/browse/HBASE-6731
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
 Fix For: 0.92.3

 Attachments: HBASE-6731.patch






[jira] [Commented] (HBASE-7329) remove flush-related records from WAL and make locking more granular

2013-01-23 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560889#comment-13560889
 ] 

ramkrishna.s.vasudevan commented on HBASE-7329:
---

Had time to check this patch today.  Nice one, it covers all cases.  One question:
what was the motivation for introducing a barrier?  What was the major problem 
with the close operation prior to this patch?
Thanks Sergey.

 remove flush-related records from WAL and make locking more granular
 

 Key: HBASE-7329
 URL: https://issues.apache.org/jira/browse/HBASE-7329
 Project: HBase
  Issue Type: Improvement
  Components: wal
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.96.0

 Attachments: 7329-findbugs.diff, 7329-v7.txt, HBASE-7329-v0.patch, 
 HBASE-7329-v0.patch, HBASE-7329-v0-tmp.patch, HBASE-7329-v1.patch, 
 HBASE-7329-v1.patch, HBASE-7329-v2.patch, HBASE-7329-v3.patch, 
 HBASE-7329-v4.patch, HBASE-7329-v5.patch, HBASE-7329-v6.patch, 
 HBASE-7329-v6.patch


 Comments from many people in HBASE-6466 and HBASE-6980 indicate that flush 
 records in WAL are not useful. If so, they should be removed.



[jira] [Assigned] (HBASE-6731) Port HBASE-6537 'Race between balancer and disable table can lead to inconsistent cluster' to 0.92

2013-01-23 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan reassigned HBASE-6731:
-

Assignee: rajeshbabu

 Port HBASE-6537 'Race between balancer and disable table can lead to 
 inconsistent cluster' to 0.92
 --

 Key: HBASE-6731
 URL: https://issues.apache.org/jira/browse/HBASE-6731
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: rajeshbabu
 Fix For: 0.92.3

 Attachments: HBASE-6731.patch






[jira] [Commented] (HBASE-6731) Port HBASE-6537 'Race between balancer and disable table can lead to inconsistent cluster' to 0.92

2013-01-23 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560893#comment-13560893
 ] 

ramkrishna.s.vasudevan commented on HBASE-6731:
---

+1 on patch.

 Port HBASE-6537 'Race between balancer and disable table can lead to 
 inconsistent cluster' to 0.92
 --

 Key: HBASE-6731
 URL: https://issues.apache.org/jira/browse/HBASE-6731
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: rajeshbabu
 Fix For: 0.92.3

 Attachments: HBASE-6731.patch






[jira] [Commented] (HBASE-7622) Add table descriptor verification after snapshot restore

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560897#comment-13560897
 ] 

Ted Yu commented on HBASE-7622:
---

@Matteo:
I ran the tests again and they passed:

Running org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient
2013-01-23 10:06:13.436 java[9234:1203] Unable to load realm info from 
SCDynamicStore
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 229.729 sec
Running org.apache.hadoop.hbase.snapshot.TestRestoreFlushSnapshotFromClient
2013-01-23 10:10:03.758 java[9267:1203] Unable to load realm info from 
SCDynamicStore
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 93.49 sec

Go for it.

 Add table descriptor verification after snapshot restore
 

 Key: HBASE-7622
 URL: https://issues.apache.org/jira/browse/HBASE-7622
 Project: HBase
  Issue Type: Sub-task
  Components: snapshots
Affects Versions: hbase-6055
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: hbase-6055

 Attachments: HBASE-7622-v0.patch, HBASE-7622-v1.patch, 
 HBASE-7622-v2.patch


 Add the schema verification not only based on disk data, but also on the 
 HTableDescriptor



[jira] [Commented] (HBASE-7521) fix HBASE-6060 (regions stuck in opening state) in 0.94

2013-01-23 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560898#comment-13560898
 ] 

ramkrishna.s.vasudevan commented on HBASE-7521:
---

[~rajesh23]
Could you take a look at this patch against the code base that you have?  It will 
be easy to see if anything was missed, because Sergey has rebased the earlier 
patches that we had posted at that time.
I will also have a look at this tomorrow during the daytime.

 fix HBASE-6060 (regions stuck in opening state) in 0.94
 ---

 Key: HBASE-7521
 URL: https://issues.apache.org/jira/browse/HBASE-7521
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7521-original-patch-ported-v0.patch, 
 HBASE-7521-v0.patch, HBASE-7521-v1.patch


 Discussion in HBASE-6060 implies that the fix there does not work on 0.94. 
 Still, we may want to fix the issue in 0.94 (via some different fix) because 
 regions stuck in opening for ridiculous amounts of time are not a good 
 thing to have.



[jira] [Commented] (HBASE-7329) remove flush-related records from WAL and make locking more granular

2013-01-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560906#comment-13560906
 ] 

Sergey Shelukhin commented on HBASE-7329:
-

Hmm... as I mentioned above, I am actually not sure whether safety on close is 
necessary. This was just to preserve the logic of the existing behavior.
The cacheFlushLock taken on close interacted with things as follows: close would wait 
for flush; close would wait for log rolling; log rolling would wait for close 
and then exit because .closed is set; cache flush would wait for close and then 
proceed(?). Judging by the lack of bugs from the latter case, I am assuming it is 
ensured externally, deliberately or by coincidence, that it doesn't happen. 
With the barrier, close would still wait for both operations. Neither flush nor log 
roll will start if close has started.
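
The barrier semantics described here (close waits for running operations, and no flush or log roll starts once close has begun) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, it uses a plain monitor with wait/notifyAll, and it is not the barrier implementation from the patch.

```java
// Hypothetical sketch of a close barrier: operations register themselves,
// and close both blocks new operations and waits for running ones.
final class CloseBarrierSketch {
    private int inFlight = 0;        // operations (flush, log roll) currently running
    private boolean closing = false; // set once close has started

    /** Called before a flush or log roll; returns false if close has already started. */
    synchronized boolean beginOp() {
        if (closing) {
            return false;
        }
        inFlight++;
        return true;
    }

    /** Called when a flush or log roll finishes; wakes a waiting close when nothing is left. */
    synchronized void endOp() {
        inFlight--;
        if (inFlight == 0) {
            notifyAll();
        }
    }

    /**
     * Called by close: refuses new operations, then waits for running ones.
     * Returns false if the waiting thread was interrupted.
     */
    synchronized boolean stopAndDrain() {
        closing = true;
        while (inFlight > 0) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

A flush or log roll would wrap its work in beginOp()/endOp() and simply skip the work when beginOp() returns false, while close calls stopAndDrain() before setting its closed flag.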

 remove flush-related records from WAL and make locking more granular
 

 Key: HBASE-7329
 URL: https://issues.apache.org/jira/browse/HBASE-7329
 Project: HBase
  Issue Type: Improvement
  Components: wal
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.96.0

 Attachments: 7329-findbugs.diff, 7329-v7.txt, HBASE-7329-v0.patch, 
 HBASE-7329-v0.patch, HBASE-7329-v0-tmp.patch, HBASE-7329-v1.patch, 
 HBASE-7329-v1.patch, HBASE-7329-v2.patch, HBASE-7329-v3.patch, 
 HBASE-7329-v4.patch, HBASE-7329-v5.patch, HBASE-7329-v6.patch, 
 HBASE-7329-v6.patch


 Comments from many people in HBASE-6466 and HBASE-6980 indicate that flush 
 records in WAL are not useful. If so, they should be removed.



[jira] [Commented] (HBASE-7648) TestAcidGuarantees.testMixedAtomicity hangs sometimes

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560907#comment-13560907
 ] 

Hudson commented on HBASE-7648:
---

Integrated in HBase-TRUNK #3783 (See 
[https://builds.apache.org/job/HBase-TRUNK/3783/])
HBASE-7648 TestAcidGuarantees.testMixedAtomicity hangs sometimes (Revision 
1437538)

 Result = FAILURE
jxiang : 
Files : 
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java


 TestAcidGuarantees.testMixedAtomicity hangs sometimes
 -

 Key: HBASE-7648
 URL: https://issues.apache.org/jira/browse/HBASE-7648
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.96.0, 0.94.5

 Attachments: 0.94-7648.patch, trunk-7648.patch


 java.lang.RuntimeException: Deferred
 at org.apache.hadoop.hbase.MultithreadedTestUtil$TestContext.checkException(MultithreadedTestUtil.java:76)
 at org.apache.hadoop.hbase.MultithreadedTestUtil$TestContext.waitFor(MultithreadedTestUtil.java:69)
 at org.apache.hadoop.hbase.TestAcidGuarantees.runTestAtomicity(TestAcidGuarantees.java:301)
 at org.apache.hadoop.hbase.TestAcidGuarantees.runTestAtomicity(TestAcidGuarantees.java:244)
 at org.apache.hadoop.hbase.TestAcidGuarantees.testMixedAtomicity(TestAcidGuarantees.java:343)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47)
 at org.junit.rules.RunRules.evaluate(RunRules.java:18)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
 at org.junit.runners.Suite.runChild(Suite.java:128)
 at org.junit.runners.Suite.runChild(Suite.java:24)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.NotServingRegionException): org.apache.hadoop.hbase.NotServingRegionException: Region is not online: TestAcidGuarantees,,135776964.317288e8ca738963ca5e273fc56750fd.
 at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3211)
 at org.apache.hadoop.hbase.regionserver.HRegionServer.flushRegion(HRegionServer.java:2963)
 at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1400)
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1021)
 at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
 at $Proxy23.flushRegion(Unknown Source)
 at org.apache.hadoop.hbase.client.HBaseAdmin.flush(HBaseAdmin.java:1248)
 at org.apache.hadoop.hbase.client.HBaseAdmin.flush(HBaseAdmin.java:1230)
 at org.apache.hadoop.hbase.TestAcidGuarantees$1.doAnAction(TestAcidGuarantees.java:272)
 at 
 org.apache.hadoop.hbase.MultithreadedTestUtil$RepeatingTestThread.doWork(MultithreadedTestUtil.java:145)
 at 
 

[jira] [Reopened] (HBASE-7268) correct local region location cache information can be overwritten w/stale information from an old server

2013-01-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reopened HBASE-7268:
-


The fix doesn't cover all cases: I saw that it still removes the entry from the 
cache on an error from a different server... 
The original patch is valid; I'll make an addendum patch.

 correct local region location cache information can be overwritten w/stale 
 information from an old server
 -

 Key: HBASE-7268
 URL: https://issues.apache.org/jira/browse/HBASE-7268
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.96.0

 Attachments: 7268-v6.patch, 7268-v8.patch, HBASE-7268-v0.patch, 
 HBASE-7268-v0.patch, HBASE-7268-v1.patch, HBASE-7268-v2.patch, 
 HBASE-7268-v2-plus-masterTs.patch, HBASE-7268-v2-plus-masterTs.patch, 
 HBASE-7268-v3.patch, HBASE-7268-v4.patch, HBASE-7268-v5.patch, 
 HBASE-7268-v6.patch, HBASE-7268-v7.patch, HBASE-7268-v8.patch, 
 HBASE-7268-v9.patch


 Discovered via HBASE-7250; related to HBASE-5877.
 Test is writing from multiple threads.
 Server A has region R; client knows that.
 R gets moved from A to server B.
 B gets killed.
 R gets moved by master to server C.
 ~15 seconds later, client tries to write to it (on A?).
 Multiple client threads report from RegionMoved exception processing logic R 
 moved from C to B, even though such transition never happened (neither in 
 nor before the sequence described below). Not quite sure how the client 
 learned of the transition to C, I assume it's from meta from some other 
 thread...
 Then, put fails (it may fail due to accumulated errors that are not logged, 
 which I am investigating... but the bogus cache update is there 
 notwithstanding).
 I have a patch but am not sure if it works; the test still fails locally for a 
 yet unknown reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7268) correct local region location cache information can be overwritten (or deleted) w/stale information from an old server

2013-01-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7268:


Summary: correct local region location cache information can be overwritten 
(or deleted) w/stale information from an old server  (was: correct local region 
location cache information can be overwritten w/stale information from an old 
server)

 correct local region location cache information can be overwritten (or 
 deleted) w/stale information from an old server
 --

 Key: HBASE-7268
 URL: https://issues.apache.org/jira/browse/HBASE-7268
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.96.0

 Attachments: 7268-v6.patch, 7268-v8.patch, HBASE-7268-v0.patch, 
 HBASE-7268-v0.patch, HBASE-7268-v1.patch, HBASE-7268-v2.patch, 
 HBASE-7268-v2-plus-masterTs.patch, HBASE-7268-v2-plus-masterTs.patch, 
 HBASE-7268-v3.patch, HBASE-7268-v4.patch, HBASE-7268-v5.patch, 
 HBASE-7268-v6.patch, HBASE-7268-v7.patch, HBASE-7268-v8.patch, 
 HBASE-7268-v9.patch


 Discovered via HBASE-7250; related to HBASE-5877.
 Test is writing from multiple threads.
 Server A has region R; client knows that.
 R gets moved from A to server B.
 B gets killed.
 R gets moved by master to server C.
 ~15 seconds later, client tries to write to it (on A?).
 Multiple client threads report from RegionMoved exception processing logic R 
 moved from C to B, even though such transition never happened (neither in 
 nor before the sequence described below). Not quite sure how the client 
 learned of the transition to C, I assume it's from meta from some other 
 thread...
 Then, put fails (it may fail due to accumulated errors that are not logged, 
 which I am investigating... but the bogus cache update is there 
 notwithstanding).
 I have a patch but am not sure if it works; the test still fails locally for a 
 yet unknown reason.



[jira] [Commented] (HBASE-7114) Increment does not extend Mutation but probably should

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560914#comment-13560914
 ] 

Ted Yu commented on HBASE-7114:
---

Stable annotation is only in 0.96 code base.
I agree with Devaraj that we should tackle this in 0.96

 Increment does not extend Mutation but probably should
 --

 Key: HBASE-7114
 URL: https://issues.apache.org/jira/browse/HBASE-7114
 Project: HBase
  Issue Type: Bug
  Components: Client
Reporter: Andrew Purtell
Priority: Minor

 Increment is the only operation in the class of mutators that does not extend 
 Mutation. It mostly duplicates what Mutation provides, but not quite. The 
 signatures for setWriteToWAL and getFamilyMap are slightly different. This 
 can be inconvenient because it requires special case code and therefore could 
 be considered an API design nit. Unfortunately it is not a simple change: The 
 interface is marked stable and the internals of the family map are different 
 from other mutation types. The latter is why I suspect this was not addressed 
 when Mutation was introduced.



[jira] [Commented] (HBASE-7643) HFileArchiver.resolveAndArchive() race condition and snapshot data loss

2013-01-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560920#comment-13560920
 ] 

Hadoop QA commented on HBASE-7643:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12566150/HBASE-7653-p4-v4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 3 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces lines longer than 
100

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4148//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4148//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4148//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4148//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4148//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4148//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4148//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4148//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4148//console

This message is automatically generated.

 HFileArchiver.resolveAndArchive() race condition and snapshot data loss
 ---

 Key: HBASE-7643
 URL: https://issues.apache.org/jira/browse/HBASE-7643
 Project: HBase
  Issue Type: Bug
Affects Versions: hbase-6055, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-7653-p4-v0.patch, HBASE-7653-p4-v1.patch, 
 HBASE-7653-p4-v2.patch, HBASE-7653-p4-v3.patch, HBASE-7653-p4-v4.patch, 
 HBASE-7653-p4-v5.patch


  * The master has an hfile cleaner thread (that is responsible for cleaning 
 the /hbase/.archive dir)
  ** /hbase/.archive/table/region/family/hfile
  ** if the family/region/family directory is empty the cleaner removes it
  * The master can archive files (from another thread, e.g. DeleteTableHandler)
  * The region can archive files (from another server/process, e.g. compaction)
 The simplified file archiving code looks like this:
 {code}
 HFileArchiver.resolveAndArchive(...) {
   // ensure that the archive dir exists
   fs.mkdir(archiveDir);
   // move the file to the archiver
   success = fs.rename(originalPath/fileName, archiveDir/fileName)
   // if the rename failed, delete the file without archiving
   if (!success) fs.delete(originalPath/fileName);
 }
 {code}
 Since there's no synchronization between HFileArchiver.resolveAndArchive() 
 and the cleaner run (different process, thread, ...) you can end up in the 
 situation where you are moving something into a directory that doesn't exist.
 {code}
 fs.mkdir(archiveDir);
 // HFileCleaner chore starts at this point
 // and the archiveDirectory that we just ensured to be present gets removed.
 // The rename at this point will fail since the parent directory is missing.
 success = fs.rename(originalPath/fileName, archiveDir/fileName)
 {code}
 The problem with deleting the file without archiving it is that if a 
 snapshot or a cloned table relies on that file, you lose data.
 Possible solutions
  * Create a ZooKeeper lock, to notify the master (Hey I'm archiving 
 something, wait a bit)
  * Add an RS-to-Master call to let the master remove files and avoid this 
 kind of 

[jira] [Updated] (HBASE-7643) HFileArchiver.resolveAndArchive() race condition may lead to snapshot data loss

2013-01-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7643:
--

 Summary: HFileArchiver.resolveAndArchive() race condition may lead to 
snapshot data loss  (was: HFileArchiver.resolveAndArchive() race condition and 
snapshot data loss)
Hadoop Flags: Reviewed

 HFileArchiver.resolveAndArchive() race condition may lead to snapshot data 
 loss
 ---

 Key: HBASE-7643
 URL: https://issues.apache.org/jira/browse/HBASE-7643
 Project: HBase
  Issue Type: Bug
Affects Versions: hbase-6055, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-7653-p4-v0.patch, HBASE-7653-p4-v1.patch, 
 HBASE-7653-p4-v2.patch, HBASE-7653-p4-v3.patch, HBASE-7653-p4-v4.patch, 
 HBASE-7653-p4-v5.patch


  * The master has an hfile cleaner thread (that is responsible for cleaning 
 the /hbase/.archive dir)
  ** /hbase/.archive/table/region/family/hfile
  ** if the family/region/family directory is empty the cleaner removes it
  * The master can archive files (from another thread, e.g. DeleteTableHandler)
  * The region can archive files (from another server/process, e.g. compaction)
 The simplified file archiving code looks like this:
 {code}
 HFileArchiver.resolveAndArchive(...) {
   // ensure that the archive dir exists
   fs.mkdir(archiveDir);
   // move the file to the archiver
   success = fs.rename(originalPath/fileName, archiveDir/fileName)
   // if the rename failed, delete the file without archiving
   if (!success) fs.delete(originalPath/fileName);
 }
 {code}
 Since there's no synchronization between HFileArchiver.resolveAndArchive() 
 and the cleaner run (different process, thread, ...) you can end up in the 
 situation where you are moving something into a directory that doesn't exist.
 {code}
 fs.mkdir(archiveDir);
 // HFileCleaner chore starts at this point
 // and the archiveDirectory that we just ensured to be present gets removed.
 // The rename at this point will fail since the parent directory is missing.
 success = fs.rename(originalPath/fileName, archiveDir/fileName)
 {code}
 The problem with deleting the file without archiving it is that if a 
 snapshot or a cloned table relies on that file, you lose data.
 Possible solutions
  * Create a ZooKeeper lock, to notify the master (Hey I'm archiving 
 something, wait a bit)
  * Add an RS-to-Master call to let the master remove files and avoid this 
 kind of situation
  * Avoid removing empty directories from the archive if the table exists or 
 is not disabled
  * Add a try catch around the fs.rename
 The last one, the easiest one, looks like:
 {code}
 for (int i = 0; i < retries; ++i) {
   // ensure archive directory to be present
   fs.mkdir(archiveDir);
   //  possible race -
   // try to archive file
   success = fs.rename(originalPath/fileName, archiveDir/fileName);
   if (success) break;
 }
 {code}
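 The retry idea above can be sketched in runnable form. This is only an 
 illustration, not the actual HFileArchiver patch: it uses java.nio.file as a 
 stand-in for the Hadoop FileSystem API, and the names ArchiveRetrySketch, 
 archiveWithRetry and the retries parameter are hypothetical. Unlike the 
 simplified code earlier in this description, it never deletes the source 
 file when the move fails.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

final class ArchiveRetrySketch {
    /**
     * Tries to move {@code file} into {@code archiveDir}, re-creating the
     * directory before every attempt in case a cleaner removed it.
     * Returns true if the file was archived, false if all attempts failed.
     */
    static boolean archiveWithRetry(Path file, Path archiveDir, int retries)
            throws IOException {
        for (int i = 0; i < retries; ++i) {
            // ensure the archive directory is present
            Files.createDirectories(archiveDir);
            try {
                // a concurrent cleaner may delete archiveDir right here
                Files.move(file, archiveDir.resolve(file.getFileName()));
                return true;
            } catch (NoSuchFileException e) {
                // If the source itself is gone, retrying cannot help.
                if (!Files.exists(file)) {
                    throw e;
                }
                // Otherwise assume the archive dir vanished; loop and retry.
            }
        }
        // Assumption: leave the original file in place rather than delete it.
        return false;
    }
}
```

 Returning false instead of deleting leaves the decision to the caller, so a 
 file that a snapshot may still reference is not lost when every attempt fails.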



[jira] [Commented] (HBASE-7648) TestAcidGuarantees.testMixedAtomicity hangs sometimes

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560932#comment-13560932
 ] 

Hudson commented on HBASE-7648:
---

Integrated in HBase-0.94 #754 (See 
[https://builds.apache.org/job/HBase-0.94/754/])
HBASE-7648 TestAcidGuarantees.testMixedAtomicity hangs sometimes (Revision 
1437539)

 Result = FAILURE
jxiang : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java


 TestAcidGuarantees.testMixedAtomicity hangs sometimes
 -

 Key: HBASE-7648
 URL: https://issues.apache.org/jira/browse/HBASE-7648
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Fix For: 0.96.0, 0.94.5

 Attachments: 0.94-7648.patch, trunk-7648.patch


 java.lang.RuntimeException: Deferred
 at 
 org.apache.hadoop.hbase.MultithreadedTestUtil$TestContext.checkException(MultithreadedTestUtil.java:76)
 at 
 org.apache.hadoop.hbase.MultithreadedTestUtil$TestContext.waitFor(MultithreadedTestUtil.java:69)
 at 
 org.apache.hadoop.hbase.TestAcidGuarantees.runTestAtomicity(TestAcidGuarantees.java:301)
 at 
 org.apache.hadoop.hbase.TestAcidGuarantees.runTestAtomicity(TestAcidGuarantees.java:244)
 at 
 org.apache.hadoop.hbase.TestAcidGuarantees.testMixedAtomicity(TestAcidGuarantees.java:343)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
 at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
 at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47)
 at org.junit.rules.RunRules.evaluate(RunRules.java:18)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
 at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
 at 
 org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
 at org.junit.runners.Suite.runChild(Suite.java:128)
 at org.junit.runners.Suite.runChild(Suite.java:24)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: 
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hbase.NotServingRegionException):
  org.apache.hadoop.hbase.NotServingRegionException: Region is not online: 
 TestAcidGuarantees,,135776964.317288e8ca738963ca5e273fc56750fd.
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3211)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.flushRegion(HRegionServer.java:2963)
 at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1400)
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1021)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
 at $Proxy23.flushRegion(Unknown Source)
 at org.apache.hadoop.hbase.client.HBaseAdmin.flush(HBaseAdmin.java:1248)
 at org.apache.hadoop.hbase.client.HBaseAdmin.flush(HBaseAdmin.java:1230)
 at 
 org.apache.hadoop.hbase.TestAcidGuarantees$1.doAnAction(TestAcidGuarantees.java:272)
 at 
 org.apache.hadoop.hbase.MultithreadedTestUtil$RepeatingTestThread.doWork(MultithreadedTestUtil.java:145)
 at 
 

[jira] [Updated] (HBASE-7622) Add table descriptor verification after snapshot restore

2013-01-23 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-7622:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

committed to the snapshots branch

 Add table descriptor verification after snapshot restore
 

 Key: HBASE-7622
 URL: https://issues.apache.org/jira/browse/HBASE-7622
 Project: HBase
  Issue Type: Sub-task
  Components: snapshots
Affects Versions: hbase-6055
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: hbase-6055

 Attachments: HBASE-7622-v0.patch, HBASE-7622-v1.patch, 
 HBASE-7622-v2.patch


 Add the schema verification not only based on disk data, but also on the 
 HTableDescriptor



[jira] [Updated] (HBASE-7622) Add table descriptor verification after snapshot restore

2013-01-23 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-7622:
---

Status: Patch Available  (was: Open)

 Add table descriptor verification after snapshot restore
 

 Key: HBASE-7622
 URL: https://issues.apache.org/jira/browse/HBASE-7622
 Project: HBase
  Issue Type: Sub-task
  Components: snapshots
Affects Versions: hbase-6055
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: hbase-6055

 Attachments: HBASE-7622-v0.patch, HBASE-7622-v1.patch, 
 HBASE-7622-v2.patch


 Add the schema verification not only based on disk data, but also on the 
 HTableDescriptor



[jira] [Commented] (HBASE-7643) HFileArchiver.resolveAndArchive() race condition may lead to snapshot data loss

2013-01-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560939#comment-13560939
 ] 

Hadoop QA commented on HBASE-7643:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12566153/HBASE-7653-p4-v5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 3 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestLocalHBaseCluster

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4149//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4149//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4149//console

This message is automatically generated.

 HFileArchiver.resolveAndArchive() race condition may lead to snapshot data 
 loss
 ---

 Key: HBASE-7643
 URL: https://issues.apache.org/jira/browse/HBASE-7643
 Project: HBase
  Issue Type: Bug
Affects Versions: hbase-6055, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-7653-p4-v0.patch, HBASE-7653-p4-v1.patch, 
 HBASE-7653-p4-v2.patch, HBASE-7653-p4-v3.patch, HBASE-7653-p4-v4.patch, 
 HBASE-7653-p4-v5.patch


  * The master has an hfile cleaner thread (that is responsible for cleaning 
 the /hbase/.archive dir)
  ** /hbase/.archive/table/region/family/hfile
  ** if the family/region/family directory is empty the cleaner removes it
  * The master can archive files (from another thread, e.g. DeleteTableHandler)
  * The region can archive files (from another server/process, e.g. compaction)
 The simplified file archiving code looks like this:
 {code}
 HFileArchiver.resolveAndArchive(...) {
   // ensure that the archive dir exists
   fs.mkdir(archiveDir);
   // move the file to the archiver
   success = fs.rename(originalPath/fileName, archiveDir/fileName)
   // if the rename failed, delete the file without archiving
   if (!success) fs.delete(originalPath/fileName);
 }
 {code}
 Since there's no synchronization between HFileArchiver.resolveAndArchive() 
 and the cleaner run (different process, thread, ...) you can end up in the 
 situation where you are moving something into a directory that doesn't exist.
 {code}
 fs.mkdir(archiveDir);
 // HFileCleaner chore starts at this point
 // and the archiveDirectory that we just ensured to be present gets removed.
 // The rename at this point will fail since the parent directory is missing.
 success = fs.rename(originalPath/fileName, archiveDir/fileName)
 {code}
 The problem with deleting the file without archiving it is that if a 
 snapshot or a cloned table relies on that file, you lose data.
 Possible solutions
  * Create a ZooKeeper lock, to notify the master (Hey I'm archiving 
 something, 

[jira] [Commented] (HBASE-7268) correct local region location cache information can be overwritten (or deleted) w/stale information from an old server

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560950#comment-13560950
 ] 

Ted Yu commented on HBASE-7268:
---

From 
https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/366/testReport/junit/org.apache.hadoop.hbase.util/TestMiniClusterLoadSequential/loadTest_0_/,
 I found:
{code}
2013-01-22 03:16:55,763 ERROR [HBaseWriterThread_6] 
server.NIOServerCnxnFactory$1(44): Thread Thread[HBaseWriterThread_6,5,main] 
died
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.deleteCachedLocation(HConnectionManager.java:1783)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.updateCachedLocations(HConnectionManager.java:1825)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.access$1300(HConnectionManager.java:515)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.processBatchCallback(HConnectionManager.java:2035)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.access$900(HConnectionManager.java:1874)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1863)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1842)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:882)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:692)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:667)
at 
org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.insert(MultiThreadedWriter.java:175)
at 
org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.run(MultiThreadedWriter.java:145)
{code}
Looks like oldLocation was null in the following check:
{code}
isStaleDelete = (source != null) && !oldLocation.equals(source);
{code}
Can you include the fix in the addendum ?
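
A null-safe form of that check might look like the sketch below. It is an 
assumption about what the addendum could do, not the committed fix: plain 
strings stand in for the real location type, the class name 
StaleDeleteCheckSketch is hypothetical, and treating a missing cached 
location as "not a stale delete" is a guess at the intended behavior.

```java
final class StaleDeleteCheckSketch {
    /**
     * Returns true only when a location is cached and the error came from a
     * different server than the cached one.
     */
    static boolean isStaleDelete(String oldLocation, String source) {
        // Assumption: with nothing cached there is nothing to delete,
        // so the request cannot be a stale delete.
        if (oldLocation == null) {
            return false;
        }
        return source != null && !oldLocation.equals(source);
    }
}
```

With this guard, a writer thread that hits an empty cache entry would skip the 
comparison instead of dying with a NullPointerException.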

 correct local region location cache information can be overwritten (or 
 deleted) w/stale information from an old server
 --

 Key: HBASE-7268
 URL: https://issues.apache.org/jira/browse/HBASE-7268
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.96.0

 Attachments: 7268-v6.patch, 7268-v8.patch, HBASE-7268-v0.patch, 
 HBASE-7268-v0.patch, HBASE-7268-v1.patch, HBASE-7268-v2.patch, 
 HBASE-7268-v2-plus-masterTs.patch, HBASE-7268-v2-plus-masterTs.patch, 
 HBASE-7268-v3.patch, HBASE-7268-v4.patch, HBASE-7268-v5.patch, 
 HBASE-7268-v6.patch, HBASE-7268-v7.patch, HBASE-7268-v8.patch, 
 HBASE-7268-v9.patch


 Discovered via HBASE-7250; related to HBASE-5877.
 Test is writing from multiple threads.
 Server A has region R; client knows that.
 R gets moved from A to server B.
 B gets killed.
 R gets moved by master to server C.
 ~15 seconds later, client tries to write to it (on A?).
 Multiple client threads report from RegionMoved exception processing logic R 
 moved from C to B, even though such transition never happened (neither in 
 nor before the sequence described below). Not quite sure how the client 
 learned of the transition to C, I assume it's from meta from some other 
 thread...
 Then, put fails (it may fail due to accumulated errors that are not logged, 
 which I am investigating... but the bogus cache update is there 
 notwithstanding).
 I have a patch but am not sure if it works; the test still fails locally for a 
 yet unknown reason.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7382) Port ZK.multi support from HBASE-6775 to 0.96

2013-01-23 Thread Himanshu Vashishtha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Himanshu Vashishtha updated HBASE-7382:
---

Attachment: HBASE-7382-trunk.patch

Patch that forward-ports the multi functionality. It doesn't include the 
compatibility issues that were there in 6775.

 Port ZK.multi support from HBASE-6775 to 0.96
 -

 Key: HBASE-7382
 URL: https://issues.apache.org/jira/browse/HBASE-7382
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Reporter: Gregory Chanan
Assignee: Himanshu Vashishtha
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-7382-trunk.patch


 HBASE-6775 adds support for ZK.multi ZKUtil and uses it for the 0.92/0.94 
 compatibility fix implemented in HBASE-6710.
 ZK.multi support is most likely useful in 0.96, but since HBASE-6710 is not 
 relevant for 0.96, perhaps we should find another use case first before we 
 port.



[jira] [Commented] (HBASE-6832) [WINDOWS] Tests should use explicit timestamp for Puts, and not rely on implicit RS timing

2013-01-23 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560966#comment-13560966
 ] 

Enis Soztutar commented on HBASE-6832:
--

bq. for EnvironmentEdgeManager, should we not always initialize the 'time' at 
10 or something like this? I can imagine other pieces of code doing minus 
something. Initializing it to something high enough could save us from some 
burden later.
Makes sense. I changed it so that it starts with currentTimeMillis by default. 
bq. For the fix, except this non critical comment above, I'm ok, but I wonder 
if the root issue (strange time counter on windows) won't show up in 
production. That's another subject, though.
Opened HBASE-6833 for that, although the fix is not that clear at this point. 
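
To illustrate the idea of a test clock that starts at the current wall-clock time and hands out explicit, strictly increasing timestamps, here is a small sketch. The `Clock` interface and `IncrementingClock` class are hypothetical names made up for this example. They are not the EnvironmentEdgeManager API.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical clock abstraction for tests. */
interface Clock {
    long currentTimeMillis();
}

/** Test clock that advances by a fixed step on every read. */
final class IncrementingClock implements Clock {
    private final AtomicLong now;
    private final long stepMillis;

    IncrementingClock(long startMillis, long stepMillis) {
        this.now = new AtomicLong(startMillis);
        this.stepMillis = stepMillis;
    }

    /** Starts at the real current time, so "now minus something" stays positive. */
    static IncrementingClock startingNow() {
        return new IncrementingClock(System.currentTimeMillis(), 1L);
    }

    @Override
    public long currentTimeMillis() {
        // Return the current value, then advance: consecutive reads never collide.
        return now.getAndAdd(stepMillis);
    }
}

final class ExplicitTimestampDemo {
    public static void main(String[] args) {
        Clock clock = IncrementingClock.startingNow();
        long putTs = clock.currentTimeMillis();
        long deleteTs = clock.currentTimeMillis();
        // The delete is always strictly newer than the put, whatever the OS timer resolution.
        System.out.println(deleteTs > putTs);
    }
}
```

A test would pass such timestamps explicitly to its Puts and Deletes instead of relying on the server assigning them.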

 [WINDOWS] Tests should use explicit timestamp for Puts, and not rely on 
 implicit RS timing  
 

 Key: HBASE-6832
 URL: https://issues.apache.org/jira/browse/HBASE-6832
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
  Labels: windows
 Attachments: hbase-6832_v1-0.94.patch, hbase-6832_v1-trunk.patch, 
 hbase-6832_v4-0.94.patch, hbase-6832_v4-trunk.patch, hbase-6832_v5-trunk.patch


 TestRegionObserverBypass.testMulti() fails with 
 {code}
 java.lang.AssertionError: expected:1 but was:0
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at 
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass.checkRowAndDelete(TestRegionObserverBypass.java:173)
   at 
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass.testMulti(TestRegionObserverBypass.java:166)
 {code}



[jira] [Updated] (HBASE-6832) [WINDOWS] Tests should use explicit timestamp for Puts, and not rely on implicit RS timing

2013-01-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6832:
-

Attachment: hbase-6832_v6-trunk.patch

Updated patch with N's suggestions. 

 [WINDOWS] Tests should use explicit timestamp for Puts, and not rely on 
 implicit RS timing  
 

 Key: HBASE-6832
 URL: https://issues.apache.org/jira/browse/HBASE-6832
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
  Labels: windows
 Attachments: hbase-6832_v1-0.94.patch, hbase-6832_v1-trunk.patch, 
 hbase-6832_v4-0.94.patch, hbase-6832_v4-trunk.patch, 
 hbase-6832_v5-trunk.patch, hbase-6832_v6-trunk.patch


 TestRegionObserverBypass.testMulti() fails with 
 {code}
 java.lang.AssertionError: expected:1 but was:0
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at 
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass.checkRowAndDelete(TestRegionObserverBypass.java:173)
   at 
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass.testMulti(TestRegionObserverBypass.java:166)
 {code}



[jira] [Commented] (HBASE-7382) Port ZK.multi support from HBASE-6775 to 0.96

2013-01-23 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560971#comment-13560971
 ] 

Himanshu Vashishtha commented on HBASE-7382:


Ran Jenkins with this job; TestHbck failed, which looks unrelated. Ran it 
locally and it passed.

 Port ZK.multi support from HBASE-6775 to 0.96
 -

 Key: HBASE-7382
 URL: https://issues.apache.org/jira/browse/HBASE-7382
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Reporter: Gregory Chanan
Assignee: Himanshu Vashishtha
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-7382-trunk.patch


 HBASE-6775 adds support for ZK.multi ZKUtil and uses it for the 0.92/0.94 
 compatibility fix implemented in HBASE-6710.
 ZK.multi support is most likely useful in 0.96, but since HBASE-6710 is not 
 relevant for 0.96, perhaps we should find another use case first before we 
 port.



[jira] [Commented] (HBASE-7605) TestMiniClusterLoadSequential fails in trunk build on hadoop 2

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560974#comment-13560974
 ] 

Ted Yu commented on HBASE-7605:
---

{code}
2013-01-22 03:16:55,763 ERROR [HBaseWriterThread_6] 
server.NIOServerCnxnFactory$1(44): Thread Thread[HBaseWriterThread_6,5,main] 
died
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.deleteCachedLocation(HConnectionManager.java:1783)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.updateCachedLocations(HConnectionManager.java:1825)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.access$1300(HConnectionManager.java:515)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.processBatchCallback(HConnectionManager.java:2035)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$Process.access$900(HConnectionManager.java:1874)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1863)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1842)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:882)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:692)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:667)
at 
org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.insert(MultiThreadedWriter.java:175)
at 
org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.run(MultiThreadedWriter.java:145)
{code}
Fixing the NullPointer exception, I was able to see the test pass against 
hadoop 2.0:

Running org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential
2013-01-23 11:09:20.078 java[10447:1203] Unable to load realm info from 
SCDynamicStore
2013-01-23 11:09:20.155 java[10447:1203] Unable to load realm info from 
SCDynamicStore
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 57.738 sec

 TestMiniClusterLoadSequential fails in trunk build on hadoop 2
 --

 Key: HBASE-7605
 URL: https://issues.apache.org/jira/browse/HBASE-7605
 Project: HBase
  Issue Type: Sub-task
Reporter: Ted Yu
Priority: Critical
 Fix For: 0.96.0


 From HBase-TRUNK-on-Hadoop-2.0.0 #354:
   loadTest[0](org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential): 
 test timed out after 12 milliseconds
   loadTest[1](org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential): 
 test timed out after 12 milliseconds
   loadTest[2](org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential): 
 test timed out after 12 milliseconds
   loadTest[3](org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential): 
 test timed out after 12 milliseconds



[jira] [Updated] (HBASE-7382) Port ZK.multi support from HBASE-6775 to 0.96

2013-01-23 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7382:
--

Status: Patch Available  (was: Open)

 Port ZK.multi support from HBASE-6775 to 0.96
 -

 Key: HBASE-7382
 URL: https://issues.apache.org/jira/browse/HBASE-7382
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Reporter: Gregory Chanan
Assignee: Himanshu Vashishtha
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-7382-trunk.patch


 HBASE-6775 adds support for ZK.multi ZKUtil and uses it for the 0.92/0.94 
 compatibility fix implemented in HBASE-6710.
 ZK.multi support is most likely useful in 0.96, but since HBASE-6710 is not 
 relevant for 0.96, perhaps we should find another use case first before we 
 port.



[jira] [Commented] (HBASE-7268) correct local region location cache information can be overwritten (or deleted) w/stale information from an old server

2013-01-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560981#comment-13560981
 ] 

Sergey Shelukhin commented on HBASE-7268:
-

Sure. I actually misread the logs regarding what I thought was missing; it does 
remove based on an incorrect location, but it's a forced remove. I'll add a null 
check and make the logs/docs clearer.

 correct local region location cache information can be overwritten (or 
 deleted) w/stale information from an old server
 --

 Key: HBASE-7268
 URL: https://issues.apache.org/jira/browse/HBASE-7268
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.96.0

 Attachments: 7268-v6.patch, 7268-v8.patch, HBASE-7268-v0.patch, 
 HBASE-7268-v0.patch, HBASE-7268-v1.patch, HBASE-7268-v2.patch, 
 HBASE-7268-v2-plus-masterTs.patch, HBASE-7268-v2-plus-masterTs.patch, 
 HBASE-7268-v3.patch, HBASE-7268-v4.patch, HBASE-7268-v5.patch, 
 HBASE-7268-v6.patch, HBASE-7268-v7.patch, HBASE-7268-v8.patch, 
 HBASE-7268-v9.patch


 Discovered via HBASE-7250; related to HBASE-5877.
 Test is writing from multiple threads.
 Server A has region R; client knows that.
 R gets moved from A to server B.
 B gets killed.
 R gets moved by master to server C.
 ~15 seconds later, client tries to write to it (on A?).
 Multiple client threads report from RegionMoved exception processing logic R 
 moved from C to B, even though such transition never happened (neither in 
 nor before the sequence described below). Not quite sure how the client 
 learned of the transition to C, I assume it's from meta from some other 
 thread...
 Then, put fails (it may fail due to accumulated errors that are not logged, 
 which I am investigating... but the bogus cache update is there 
 notwithstanding).
 I have a patch but not sure if it works, test still fails locally for yet 
 unknown reason.



[jira] [Commented] (HBASE-7605) TestMiniClusterLoadSequential fails in trunk build on hadoop 2

2013-01-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560985#comment-13560985
 ] 

stack commented on HBASE-7605:
--

[~ted_yu] is that NPE because of another issue?  Should we close out this one 
then?

 TestMiniClusterLoadSequential fails in trunk build on hadoop 2
 --

 Key: HBASE-7605
 URL: https://issues.apache.org/jira/browse/HBASE-7605
 Project: HBase
  Issue Type: Sub-task
Reporter: Ted Yu
Priority: Critical
 Fix For: 0.96.0


 From HBase-TRUNK-on-Hadoop-2.0.0 #354:
   loadTest[0](org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential): 
 test timed out after 12 milliseconds
   loadTest[1](org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential): 
 test timed out after 12 milliseconds
   loadTest[2](org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential): 
 test timed out after 12 milliseconds
   loadTest[3](org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential): 
 test timed out after 12 milliseconds



[jira] [Resolved] (HBASE-6816) [WINDOWS] line endings on checkout for .sh files

2013-01-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar resolved HBASE-6816.
--

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed

Committed this. Thanks for the review Nicolas. 

 [WINDOWS] line endings on checkout for .sh files
 

 Key: HBASE-6816
 URL: https://issues.apache.org/jira/browse/HBASE-6816
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: hbase-16_v1.patch, hbase-6816_v1.patch


 On code checkout from svn or git, we need to ensure that the line endings for 
 .sh files are LF, so that they work with cygwin. This is important for 
 getting src/saveVersion.sh to work. 



[jira] [Updated] (HBASE-6832) [WINDOWS] Tests should use explicit timestamp for Puts, and not rely on implicit RS timing

2013-01-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6832:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed v6, which has trivial changes to v5. Thanks for the reviews. 

 [WINDOWS] Tests should use explicit timestamp for Puts, and not rely on 
 implicit RS timing  
 

 Key: HBASE-6832
 URL: https://issues.apache.org/jira/browse/HBASE-6832
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
  Labels: windows
 Fix For: 0.96.0

 Attachments: hbase-6832_v1-0.94.patch, hbase-6832_v1-trunk.patch, 
 hbase-6832_v4-0.94.patch, hbase-6832_v4-trunk.patch, 
 hbase-6832_v5-trunk.patch, hbase-6832_v6-trunk.patch


 TestRegionObserverBypass.testMulti() fails with 
 {code}
 java.lang.AssertionError: expected:1 but was:0
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at 
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass.checkRowAndDelete(TestRegionObserverBypass.java:173)
   at 
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass.testMulti(TestRegionObserverBypass.java:166)
 {code}



[jira] [Commented] (HBASE-7605) TestMiniClusterLoadSequential fails in trunk build on hadoop 2

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560995#comment-13560995
 ] 

Ted Yu commented on HBASE-7605:
---

Once addendum for HBASE-7268 goes in and this test passes on hadoop 2.0, I will 
resolve this issue.

 TestMiniClusterLoadSequential fails in trunk build on hadoop 2
 --

 Key: HBASE-7605
 URL: https://issues.apache.org/jira/browse/HBASE-7605
 Project: HBase
  Issue Type: Sub-task
Reporter: Ted Yu
Priority: Critical
 Fix For: 0.96.0


 From HBase-TRUNK-on-Hadoop-2.0.0 #354:
   loadTest[0](org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential): 
 test timed out after 12 milliseconds
   loadTest[1](org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential): 
 test timed out after 12 milliseconds
   loadTest[2](org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential): 
 test timed out after 12 milliseconds
   loadTest[3](org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential): 
 test timed out after 12 milliseconds



[jira] [Updated] (HBASE-6829) [WINDOWS] Tests should ensure that HLog is closed

2013-01-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6829:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed v4. Thanks for reviews guys. 

 [WINDOWS] Tests should ensure that HLog is closed
 -

 Key: HBASE-6829
 URL: https://issues.apache.org/jira/browse/HBASE-6829
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
  Labels: windows
 Fix For: 0.96.0

 Attachments: hbase-6829_v1-0.94.patch, hbase-6829_v1-trunk.patch, 
 hbase-6829_v2-0.94.patch, hbase-6829_v2-trunk.patch, 
 hbase-6829_v3-0.94.patch, hbase-6829_v3-trunk.patch, 
 hbase-6829_v4-trunk.patch, hbase-6829_v4-trunk.patch


 TestCacheOnWriteInSchema and TestCompactSelection fail with 
 {code}
 java.io.IOException: Target HLog directory already exists: 
 ./target/test-data/2d814e66-75d3-4c1b-92c7-a49d9972e8fd/TestCacheOnWriteInSchema/logs
   at org.apache.hadoop.hbase.regionserver.wal.HLog.init(HLog.java:385)
   at org.apache.hadoop.hbase.regionserver.wal.HLog.init(HLog.java:316)
   at 
 org.apache.hadoop.hbase.regionserver.TestCacheOnWriteInSchema.setUp(TestCacheOnWriteInSchema.java:162)
 {code}
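
 A rough sketch of the pattern the issue title asks for, assuming the failure comes from a log left open (and its directory left behind) by an earlier test. `DirBackedLog` is a hypothetical stand-in that refuses to start when its directory already exists. It is not the HLog class.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical log that fails on construction if its directory already exists. */
final class DirBackedLog implements Closeable {
    private final Path dir;

    DirBackedLog(Path dir) throws IOException {
        if (Files.exists(dir)) {
            throw new IOException("Target log directory already exists: " + dir);
        }
        this.dir = Files.createDirectories(dir);
    }

    @Override
    public void close() throws IOException {
        // Releasing the directory on close lets the next test reuse the same path.
        Files.deleteIfExists(dir);
    }
}

final class CloseLogDemo {
    /** Opens and closes the log twice on the same path; returns true if both rounds succeed. */
    static boolean openTwice(Path dir) throws IOException {
        for (int round = 0; round < 2; round++) {
            try (DirBackedLog log = new DirBackedLog(dir)) {
                // the test body would run here; close() is guaranteed afterwards
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("log-demo");
        System.out.println(openTwice(base.resolve("logs")));
        Files.deleteIfExists(base);
    }
}
```

 Without the close in the first round, the second constructor call would throw the "already exists" IOException.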



[jira] [Commented] (HBASE-6825) [WINDOWS] Java NIO socket channels does not work with Windows ipv6

2013-01-23 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561015#comment-13561015
 ] 

Enis Soztutar commented on HBASE-6825:
--

bq. If I'm not mistaken, the test uses a fixed port 8502, this should be 
changed, if not we can have random failures when running the test suites.
That part of the test mimics the test in the java bug report 
(http://bugs.sun.com/view_bug.do?bug_id=6230761). I think it checks for 
BindExceptions and just passes, assuming an optimistic test. I agree that I 
should change the fixed port. 
bq. For the issue itself, why should it not make it to the main code? I mean, 
we test that, on windows, a critical feature works. If not we will have issues. 
This should be in main, not in test, no?
Sorry, I failed to give enough context here. In the original patches, we did 
have a change for passing -Djava.net.preferIPv4Stack=true in the bin/hbase.cmd 
script. But I removed that one so that this patch would not depend on 
HBASE-6815. In the actual patch for HBASE-6815, we are passing preferipv4 to 
the hbase daemons through hbase.cmd script. In hbase-env.cmd:
{code}
+@rem Extra Java runtime options.
+@rem Below are what we set by default.  May only work with SUN JVM.
+@rem For more on why as well as other possible settings,
+@rem see http://wiki.apache.org/hadoop/PerformanceTuning
+@rem JDK6 on Windows has a known bug for IPv6, use preferIPv4Stack unless JDK7.
+@rem @rem See TestIPv6NIOServerSocketChannel.
+set HBASE_OPTS=-XX:+UseConcMarkSweepGC -Djava.net.preferIPv4Stack=true
{code}

bq. And actually, this seems to be fixed in 1.6 u34, according to 
http://www.oracle.com/technetwork/java/javase/documentation/overview-156328.html.
Let me test this one. After HBASE-7301, we are running the tests with ipv4 on 
linux anyway, so we might as well commit this one regardless. 
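
As a sketch of how the fixed port could be avoided, the test could bind to port 0 and let the OS pick a free one. The class and method names below are made up for the example. Only the java.net and java.nio calls are real.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;

final class EphemeralPortDemo {
    /** Binds a server channel to port 0 so the OS assigns a free port, and returns that port. */
    static int bindToFreePort() throws IOException {
        try (ServerSocketChannel channel = ServerSocketChannel.open()) {
            channel.socket().bind(new InetSocketAddress(0));
            return channel.socket().getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        // The printed port differs from run to run; no two concurrent tests share it.
        System.out.println("bound to port " + bindToFreePort());
    }
}
```

Running the JVM with -Djava.net.preferIPv4Stack=true, as in the hbase-env.cmd fragment above, would apply to this code as well.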


 [WINDOWS] Java NIO socket channels does not work with Windows ipv6
 --

 Key: HBASE-6825
 URL: https://issues.apache.org/jira/browse/HBASE-6825
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.94.3, 0.96.0
 Environment: JDK6 on windows for ipv6. 
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Attachments: hbase-6825_v3-0.94.patch, hbase-6825_v3-trunk.patch


 While running the test TestAdmin.testCheckHBaseAvailableClosesConnection(), I 
 noticed that it takes very long, since it sleeps for 2sec * 500, because of 
 zookeeper retries. 
 The root cause of the problem is that ZK uses Java NIO to create 
 ServerSockets from ServerSocketChannels. Under Windows, ipv4 and ipv6 
 are implemented independently, and Java seems unable to reuse the same 
 socket channel for both ipv4 and ipv6 sockets. We are getting 
 java.net.SocketException: Address family not supported by protocol
 family exceptions. When the ZK client resolves localhost, it gets both the v4 
 127.0.0.1 and the v6 ::1 address, but the socket channel cannot bind to both v4 
 and v6. 
 The problem is reported as:
 http://bugs.sun.com/view_bug.do?bug_id=6230761
 http://stackoverflow.com/questions/1357091/binding-an-ipv6-server-socket-on-windows
 Although the JDK bug is reported as resolved, I have tested with jdk1.6.0_33 
 without any success. JDK7, however, seems to have fixed this problem. In ZK, 
 we can replace the ClientCnxnSocket implementation from ClientCnxnSocketNIO 
 to a non-NIO one, but I am not sure that would be the way to go.
 Disabling ipv6 resolution of localhost is another approach. I'll test it 
 to see whether it will be any good. 
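
 To see which address families localhost resolves to on a given machine, a small check like the following can help. It is only a diagnostic sketch with a made-up class name, and the output depends on the host configuration.

```java
import java.net.Inet4Address;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

final class LocalhostFamilies {
    /** Names the address family of a resolved address. */
    static String family(InetAddress address) {
        if (address instanceof Inet4Address) {
            return "IPv4";
        }
        if (address instanceof Inet6Address) {
            return "IPv6";
        }
        return "unknown";
    }

    public static void main(String[] args) throws UnknownHostException {
        // On a dual-stack host this typically lists both 127.0.0.1 and ::1.
        for (InetAddress address : InetAddress.getAllByName("localhost")) {
            System.out.println(family(address) + " " + address.getHostAddress());
        }
    }
}
```

 With -Djava.net.preferIPv4Stack=true the IPv6 entry should no longer be listed.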



[jira] [Updated] (HBASE-7594) TestLocalHBaseCluster failing on ubuntu2

2013-01-23 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-7594:
--

   Resolution: Fixed
Fix Version/s: 0.96.0
   Status: Resolved  (was: Patch Available)

Committed v5 patch with a tiny change to update Javadoc in HBaseTestingUtility 
for the new method.

 TestLocalHBaseCluster failing on ubuntu2
 

 Key: HBASE-7594
 URL: https://issues.apache.org/jira/browse/HBASE-7594
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.96.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.96.0

 Attachments: 7594-1.patch, 7594-2.patch, 7594-3.patch, 7594-4.patch, 
 7594-5.patch


 {noformat}
 java.io.IOException: java.io.IOException: java.io.IOException: 
 java.lang.InstantiationException: org.apache.hadoop.io.RawComparator
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:612)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:533)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4092)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4042)
   at 
 org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:427)
   at 
 org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:130)
   at 
 org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:202)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.IOException: java.io.IOException: 
 java.lang.InstantiationException: org.apache.hadoop.io.RawComparator
   at 
 org.apache.hadoop.hbase.regionserver.HStore.loadStoreFiles(HStore.java:450)
   at org.apache.hadoop.hbase.regionserver.HStore.init(HStore.java:215)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:3060)
   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:585)
   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:583)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   ... 3 more
 Caused by: java.io.IOException: java.lang.InstantiationException: 
 org.apache.hadoop.io.RawComparator
   at 
 org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.createComparator(FixedFileTrailer.java:607)
   at 
 org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.createComparator(FixedFileTrailer.java:615)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.init(HFileReaderV2.java:115)
   at 
 org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:564)
   at 
 org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:599)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile$Reader.init(StoreFile.java:1294)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:525)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:628)
   at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:426)
   at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:422)
   ... 8 more
 Caused by: java.lang.InstantiationException: 
 org.apache.hadoop.io.RawComparator
   at java.lang.Class.newInstance0(Class.java:340)
   at java.lang.Class.newInstance(Class.java:308)
   at 
 org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.createComparator(FixedFileTrailer.java:605)
   ... 17 more
 {noformat}
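
 The innermost cause is what reflection reports when asked to instantiate a type that has no constructor, such as an interface (RawComparator is, as far as I know, an interface). A small sketch reproducing that behaviour with a JDK interface; the class name is made up for the example.

```java
import java.util.Comparator;

final class InterfaceInstantiationDemo {
    /** Tries to instantiate the type reflectively; returns "ok" or the exception's simple name. */
    @SuppressWarnings("deprecation") // Class.newInstance() is the call that appears in the trace
    static String tryInstantiate(Class<?> type) {
        try {
            type.newInstance();
            return "ok";
        } catch (InstantiationException | IllegalAccessException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryInstantiate(Comparator.class));    // an interface
        System.out.println(tryInstantiate(StringBuilder.class)); // a concrete class
    }
}
```

 So the comparator class name resolved at that point is the interface itself rather than a concrete implementation.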



[jira] [Commented] (HBASE-7522) Tests should not be writing under /tmp/

2013-01-23 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561018#comment-13561018
 ] 

Andrew Purtell commented on HBASE-7522:
---

TestLocalHBaseCluster fixed in HBASE-7594

 Tests should not be writing under /tmp/
 ---

 Key: HBASE-7522
 URL: https://issues.apache.org/jira/browse/HBASE-7522
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0, 0.94.5
Reporter: Enis Soztutar

 As per the discussion 
 http://mail-archives.apache.org/mod_mbox/hbase-dev/201301.mbox/%3CCA%2BRK%3D_BmV%3Dvwws4VeDJVPt6hY7NKCDEafex3XTNam630pQRBbA%40mail.gmail.com%3E,
  tests should not be writing under /tmp/ directory. 
 TestStoreFile is one of the offending ones. Some of them will be fixed at 
 HBASE-6824. 



[jira] [Commented] (HBASE-6466) Enable multi-thread for memstore flush

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561029#comment-13561029
 ] 

Hudson commented on HBASE-6466:
---

Integrated in HBase-TRUNK #3784 (See 
[https://builds.apache.org/job/HBase-TRUNK/3784/])
HBASE-6466 Enable multi-thread for memstore flush (Chunhui) (Revision 
1437591)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/Threads.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestLogRolling.java


 Enable multi-thread for memstore flush
 --

 Key: HBASE-6466
 URL: https://issues.apache.org/jira/browse/HBASE-6466
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.96.0
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.96.0

 Attachments: 6466-v6.patch, 6466-v7.patch, HBASE-6466.patch, 
 HBASE-6466v2.patch, HBASE-6466v3.1.patch, HBASE-6466v3.patch, 
 HBASE-6466-v4.patch, HBASE-6466-v4.patch, HBASE-6466-v5.patch


 If the KV is large or the HLog is closed under high-pressure puts, we found 
 that the memstore is often above the high water mark and blocks the puts.
 So should we enable multiple threads for memstore flush?
 Some performance test data for reference:
 1. Test environment: 
 random writing; upper memstore limit 5.6GB; lower memstore limit 4.8GB; 400 
 regions per regionserver; row len=50 bytes, value len=1024 bytes; 5 
 regionservers, 300 ipc handlers per regionserver; 5 clients, 50 thread handlers 
 per client for writing
 2. Test results:
 one cacheFlush handler: tps 7.8k/s per regionserver, flush 10.1MB/s per 
 regionserver, with many aboveGlobalMemstoreLimit blocks
 two cacheFlush handlers: tps 10.7k/s per regionserver, flush 12.46MB/s per 
 regionserver
 200 thread handlers per client, two cacheFlush handlers: tps 16.1k/s per 
 regionserver, flush 18.6MB/s per regionserver
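
As a rough illustration of the proposal above (not the code from the attached patches), the following Java sketch shows several handler threads draining one shared queue of flush requests, so that one slow flush does not hold up the others. The class name `FlushHandlerPoolSketch`, its methods, and the use of plain `Runnable` tasks as flush requests are all assumptions made for this example; the real MemStoreFlusher is considerably more involved.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: several handler threads drain one shared queue of flush requests. */
final class FlushHandlerPoolSketch {
  private final BlockingQueue<Runnable> flushQueue = new LinkedBlockingQueue<>();
  private final AtomicInteger completed = new AtomicInteger();
  private final Thread[] handlers;
  private volatile boolean running = true;

  FlushHandlerPoolSketch(int handlerCount) {
    handlers = new Thread[handlerCount];
    for (int i = 0; i < handlerCount; i++) {
      handlers[i] = new Thread(this::drainLoop, "flush-handler-" + i);
      handlers[i].setDaemon(true);
      handlers[i].start();
    }
  }

  /** Enqueues a flush request; any idle handler may pick it up. */
  void requestFlush(Runnable flush) {
    flushQueue.add(flush);
  }

  /** Number of flush requests that have finished running. */
  int completedFlushes() {
    return completed.get();
  }

  /** Signals the handlers to stop once the queue is empty, then waits for them. */
  void shutdownAndWait() {
    running = false;
    for (Thread handler : handlers) {
      try {
        handler.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
    }
  }

  /** Loop run by each handler: take the next request, run it, repeat. */
  private void drainLoop() {
    try {
      while (running || !flushQueue.isEmpty()) {
        Runnable flush = flushQueue.poll(50, TimeUnit.MILLISECONDS);
        if (flush != null) {
          flush.run();
          completed.incrementAndGet();
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

With `new FlushHandlerPoolSketch(2)`, two requests can be flushed concurrently, which corresponds to the "two cacheFlush handlers" configuration in the test results above.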

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6832) [WINDOWS] Tests should use explicit timestamp for Puts, and not rely on implicit RS timing

2013-01-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561051#comment-13561051
 ] 

Hadoop QA commented on HBASE-6832:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12566167/hbase-6832_v6-trunk.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 21 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 3 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4151//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4151//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4151//console

This message is automatically generated.

 [WINDOWS] Tests should use explicit timestamp for Puts, and not rely on 
 implicit RS timing  
 

 Key: HBASE-6832
 URL: https://issues.apache.org/jira/browse/HBASE-6832
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
  Labels: windows
 Fix For: 0.96.0

 Attachments: hbase-6832_v1-0.94.patch, hbase-6832_v1-trunk.patch, 
 hbase-6832_v4-0.94.patch, hbase-6832_v4-trunk.patch, 
 hbase-6832_v5-trunk.patch, hbase-6832_v6-trunk.patch


 TestRegionObserverBypass.testMulti() fails with 
 {code}
 java.lang.AssertionError: expected:1 but was:0
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at 
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass.checkRowAndDelete(TestRegionObserverBypass.java:173)
   at 
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass.testMulti(TestRegionObserverBypass.java:166)
 {code}
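
One way to make such tests independent of the wall clock is to inject a clock that advances on every read, so that two consecutive writes can never share a timestamp. The failure above is presumably caused by operations receiving the same implicit timestamp when the system clock is coarse; that is an assumption, as the report does not spell it out. A minimal sketch, using a hypothetical `IncrementingClockSketch` class (not the code from the patch):

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical test clock: every read returns a strictly larger value than the last. */
final class IncrementingClockSketch {
  private final AtomicLong now;

  IncrementingClockSketch(long startMillis) {
    this.now = new AtomicLong(startMillis);
  }

  /** Returns the current value and advances the clock by one millisecond. */
  long currentTimeMillis() {
    return now.getAndIncrement();
  }
}
```

A test would read a timestamp from this clock and pass it explicitly to each write, instead of letting the server assign one.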



[jira] [Commented] (HBASE-7382) Port ZK.multi support from HBASE-6775 to 0.96

2013-01-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561053#comment-13561053
 ] 

Hadoop QA commented on HBASE-7382:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12566165/HBASE-7382-trunk.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 2 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 6 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.TestLocalHBaseCluster

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4150//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4150//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4150//console

This message is automatically generated.

 Port ZK.multi support from HBASE-6775 to 0.96
 -

 Key: HBASE-7382
 URL: https://issues.apache.org/jira/browse/HBASE-7382
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Reporter: Gregory Chanan
Assignee: Himanshu Vashishtha
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-7382-trunk.patch


 HBASE-6775 adds support for ZK.multi ZKUtil and uses it for the 0.92/0.94 
 compatibility fix implemented in HBASE-6710.
 ZK.multi support is most likely useful in 0.96, but since HBASE-6710 is not 
 relevant for 0.96, perhaps we should find another use case first before we 
 port.



[jira] [Updated] (HBASE-6821) [WINDOWS] In TestMetaMigrationConvertingToPB .META. table name causes file system problems in windows

2013-01-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6821:
-

Summary: [WINDOWS] In TestMetaMigrationConvertingToPB .META. table name 
causes file system problems in windows  (was: [WINDOWS] .META. table name 
causes file system problems in windows)

 [WINDOWS] In TestMetaMigrationConvertingToPB .META. table name causes file 
 system problems in windows
 -

 Key: HBASE-6821
 URL: https://issues.apache.org/jira/browse/HBASE-6821
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
  Labels: windows
 Attachments: hbase-4388-root.dir.tgz, hbase-6821_v2_0.94.patch, 
 hbase-6821_v2-trunk.patch, TestMetaMigrationConvertToPB.tgz


 TestMetaMigrationRemovingHTD untars a cluster dir having a .META. 
 subdirectory. This causes mvn clean to fail.



[jira] [Updated] (HBASE-6821) [WINDOWS] In TestMetaMigrationConvertingToPB .META. table name causes file system problems on windows

2013-01-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6821:
-

Summary: [WINDOWS] In TestMetaMigrationConvertingToPB .META. table name 
causes file system problems on windows  (was: [WINDOWS] In 
TestMetaMigrationConvertingToPB .META. table name causes file system problems 
in windows)

 [WINDOWS] In TestMetaMigrationConvertingToPB .META. table name causes file 
 system problems on windows
 -

 Key: HBASE-6821
 URL: https://issues.apache.org/jira/browse/HBASE-6821
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
  Labels: windows
 Attachments: hbase-4388-root.dir.tgz, hbase-6821_v2_0.94.patch, 
 hbase-6821_v2-trunk.patch, TestMetaMigrationConvertToPB.tgz


 TestMetaMigrationRemovingHTD untars a cluster dir having a .META. 
 subdirectory. This causes mvn clean to fail.



[jira] [Resolved] (HBASE-6821) [WINDOWS] In TestMetaMigrationConvertingToPB .META. table name causes file system problems on windows

2013-01-23 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar resolved HBASE-6821.
--

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed

Committed this together with TestMetaMigrationConvertToPB.tgz. Removed the year 
in copyright notice per Ted's suggestion. 

Thanks for the reviews.

 [WINDOWS] In TestMetaMigrationConvertingToPB .META. table name causes file 
 system problems on windows
 -

 Key: HBASE-6821
 URL: https://issues.apache.org/jira/browse/HBASE-6821
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
  Labels: windows
 Fix For: 0.96.0

 Attachments: hbase-4388-root.dir.tgz, hbase-6821_v2_0.94.patch, 
 hbase-6821_v2-trunk.patch, TestMetaMigrationConvertToPB.tgz


 TestMetaMigrationRemovingHTD untars a cluster dir having a .META. 
 subdirectory. This causes mvn clean to fail.



[jira] [Commented] (HBASE-7382) Port ZK.multi support from HBASE-6775 to 0.96

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561065#comment-13561065
 ] 

Ted Yu commented on HBASE-7382:
---

@Himanshu:
Can you take a look at the javadoc and findbugs warnings?

Thanks

 Port ZK.multi support from HBASE-6775 to 0.96
 -

 Key: HBASE-7382
 URL: https://issues.apache.org/jira/browse/HBASE-7382
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Reporter: Gregory Chanan
Assignee: Himanshu Vashishtha
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-7382-trunk.patch


 HBASE-6775 adds support for ZK.multi ZKUtil and uses it for the 0.92/0.94 
 compatibility fix implemented in HBASE-6710.
 ZK.multi support is most likely useful in 0.96, but since HBASE-6710 is not 
 relevant for 0.96, perhaps we should find another use case first before we 
 port.



[jira] [Updated] (HBASE-7268) correct local region location cache information can be overwritten (or deleted) w/stale information from an old server

2013-01-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7268:


Attachment: HBASE-7268-addendum-v0.patch

 correct local region location cache information can be overwritten (or 
 deleted) w/stale information from an old server
 --

 Key: HBASE-7268
 URL: https://issues.apache.org/jira/browse/HBASE-7268
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.96.0

 Attachments: 7268-v6.patch, 7268-v8.patch, 
 HBASE-7268-addendum-v0.patch, HBASE-7268-v0.patch, HBASE-7268-v0.patch, 
 HBASE-7268-v1.patch, HBASE-7268-v2.patch, HBASE-7268-v2-plus-masterTs.patch, 
 HBASE-7268-v2-plus-masterTs.patch, HBASE-7268-v3.patch, HBASE-7268-v4.patch, 
 HBASE-7268-v5.patch, HBASE-7268-v6.patch, HBASE-7268-v7.patch, 
 HBASE-7268-v8.patch, HBASE-7268-v9.patch


 Discovered via HBASE-7250; related to HBASE-5877.
 Test is writing from multiple threads.
 Server A has region R; client knows that.
 R gets moved from A to server B.
 B gets killed.
 R gets moved by master to server C.
 ~15 seconds later, client tries to write to it (on A?).
 Multiple client threads report, from the RegionMoved exception processing 
 logic, that R moved from C to B, even though such a transition never happened 
 (neither in nor before the sequence described above). Not quite sure how the 
 client learned of the transition to C; I assume it's from meta via some other 
 thread...
 Then the put fails (it may fail due to accumulated errors that are not logged, 
 which I am investigating... but the bogus cache update is there 
 notwithstanding).
 I have a patch but am not sure if it works; the test still fails locally for 
 a yet unknown reason.
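
The kind of guard that would prevent the bogus update described above can be sketched as follows: a "region moved" hint is applied only when it comes from the server the cache currently points at, so a late hint from an old server cannot overwrite newer information. This is a hypothetical illustration, not the attached patch; `RegionLocationCacheSketch`, its methods, and the use of plain strings for regions and servers are assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical client-side cache mapping a region name to the server hosting it. */
final class RegionLocationCacheSketch {
  private final Map<String, String> regionToServer = new ConcurrentHashMap<>();

  void put(String region, String server) {
    regionToServer.put(region, server);
  }

  String get(String region) {
    return regionToServer.get(region);
  }

  /**
   * Applies a "region moved" hint sent by reportingServer. The cache is updated
   * only if it still points at reportingServer; otherwise the hint is stale and
   * is ignored. Returns true if the cache entry was changed.
   */
  boolean onRegionMoved(String region, String reportingServer, String newServer) {
    // Atomic compare-and-replace: succeeds only when the current value equals reportingServer.
    return regionToServer.replace(region, reportingServer, newServer);
  }
}
```

In the scenario above, once the cache holds R on C, a late hint from A claiming that R moved to B would be dropped because A is no longer the cached location.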



[jira] [Updated] (HBASE-7268) correct local region location cache information can be overwritten (or deleted) w/stale information from an old server

2013-01-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7268:


Status: Patch Available  (was: Reopened)

Added patch.

 correct local region location cache information can be overwritten (or 
 deleted) w/stale information from an old server
 --

 Key: HBASE-7268
 URL: https://issues.apache.org/jira/browse/HBASE-7268
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.96.0

 Attachments: 7268-v6.patch, 7268-v8.patch, 
 HBASE-7268-addendum-v0.patch, HBASE-7268-v0.patch, HBASE-7268-v0.patch, 
 HBASE-7268-v1.patch, HBASE-7268-v2.patch, HBASE-7268-v2-plus-masterTs.patch, 
 HBASE-7268-v2-plus-masterTs.patch, HBASE-7268-v3.patch, HBASE-7268-v4.patch, 
 HBASE-7268-v5.patch, HBASE-7268-v6.patch, HBASE-7268-v7.patch, 
 HBASE-7268-v8.patch, HBASE-7268-v9.patch


 Discovered via HBASE-7250; related to HBASE-5877.
 Test is writing from multiple threads.
 Server A has region R; client knows that.
 R gets moved from A to server B.
 B gets killed.
 R gets moved by master to server C.
 ~15 seconds later, client tries to write to it (on A?).
 Multiple client threads report, from the RegionMoved exception processing 
 logic, that R moved from C to B, even though such a transition never happened 
 (neither in nor before the sequence described above). Not quite sure how the 
 client learned of the transition to C; I assume it's from meta via some other 
 thread...
 Then the put fails (it may fail due to accumulated errors that are not logged, 
 which I am investigating... but the bogus cache update is there 
 notwithstanding).
 I have a patch but am not sure if it works; the test still fails locally for 
 a yet unknown reason.



[jira] [Commented] (HBASE-6821) [WINDOWS] In TestMetaMigrationConvertingToPB .META. table name causes file system problems on windows

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561085#comment-13561085
 ] 

Hudson commented on HBASE-6821:
---

Integrated in HBase-TRUNK #3785 (See 
[https://builds.apache.org/job/HBase-TRUNK/3785/])
HBASE-6821. [WINDOWS] In TestMetaMigrationConvertingToPB .META. table name 
causes file system problems on windows (Revision 1437718)

 Result = FAILURE
enis : 
Files : 
* /hbase/trunk/hbase-server/src/test/data/TestMetaMigrationConvertToPB.README
* /hbase/trunk/hbase-server/src/test/data/TestMetaMigrationConvertToPB.tgz
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/catalog/TestMetaMigrationConvertingToPB.java


 [WINDOWS] In TestMetaMigrationConvertingToPB .META. table name causes file 
 system problems on windows
 -

 Key: HBASE-6821
 URL: https://issues.apache.org/jira/browse/HBASE-6821
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
  Labels: windows
 Fix For: 0.96.0

 Attachments: hbase-4388-root.dir.tgz, hbase-6821_v2_0.94.patch, 
 hbase-6821_v2-trunk.patch, TestMetaMigrationConvertToPB.tgz


 TestMetaMigrationRemovingHTD untars a cluster dir having a .META. 
 subdirectory. This causes mvn clean to fail.



[jira] [Commented] (HBASE-7594) TestLocalHBaseCluster failing on ubuntu2

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561086#comment-13561086
 ] 

Hudson commented on HBASE-7594:
---

Integrated in HBase-TRUNK #3785 (See 
[https://builds.apache.org/job/HBase-TRUNK/3785/])
HBASE-7594. TestLocalHBaseCluster failing on ubuntu2 (Revision 1437658)

 Result = FAILURE
apurtell : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/TestLocalHBaseCluster.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFile.java


 TestLocalHBaseCluster failing on ubuntu2
 

 Key: HBASE-7594
 URL: https://issues.apache.org/jira/browse/HBASE-7594
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.96.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.96.0

 Attachments: 7594-1.patch, 7594-2.patch, 7594-3.patch, 7594-4.patch, 
 7594-5.patch


 {noformat}
 java.io.IOException: java.io.IOException: java.io.IOException: 
 java.lang.InstantiationException: org.apache.hadoop.io.RawComparator
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:612)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:533)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4092)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4042)
   at 
 org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:427)
   at 
 org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:130)
   at 
 org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:202)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.IOException: java.io.IOException: 
 java.lang.InstantiationException: org.apache.hadoop.io.RawComparator
   at 
 org.apache.hadoop.hbase.regionserver.HStore.loadStoreFiles(HStore.java:450)
   at org.apache.hadoop.hbase.regionserver.HStore.init(HStore.java:215)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:3060)
   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:585)
   at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:583)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   ... 3 more
 Caused by: java.io.IOException: java.lang.InstantiationException: 
 org.apache.hadoop.io.RawComparator
   at 
 org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.createComparator(FixedFileTrailer.java:607)
   at 
 org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.createComparator(FixedFileTrailer.java:615)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.init(HFileReaderV2.java:115)
   at 
 org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:564)
   at 
 org.apache.hadoop.hbase.io.hfile.HFile.createReaderWithEncoding(HFile.java:599)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile$Reader.init(StoreFile.java:1294)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:525)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:628)
   at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:426)
   at org.apache.hadoop.hbase.regionserver.HStore$1.call(HStore.java:422)
   ... 8 more
 Caused by: java.lang.InstantiationException: 
 org.apache.hadoop.io.RawComparator
   at java.lang.Class.newInstance0(Class.java:340)
   at java.lang.Class.newInstance(Class.java:308)
   at 
 org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.createComparator(FixedFileTrailer.java:605)
   ... 17 more
 {noformat}
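
The innermost cause in the trace above is `Class.newInstance` being called on `org.apache.hadoop.io.RawComparator`, which is an interface and therefore cannot be instantiated. The sketch below shows one way a factory could reject interfaces and abstract classes up front with a clearer error. It is only an illustration of the failure mode: `ComparatorFactorySketch` and `LengthComparator` are hypothetical names, `java.util.Comparator` stands in for the Hadoop interface, and this is not the change made in FixedFileTrailer.

```java
import java.lang.reflect.Modifier;
import java.util.Comparator;

/** Hypothetical factory that creates a comparator reflectively from its class. */
final class ComparatorFactorySketch {

  /** A concrete comparator with a public no-argument constructor. */
  public static final class LengthComparator implements Comparator<String> {
    public LengthComparator() {}

    @Override
    public int compare(String a, String b) {
      return Integer.compare(a.length(), b.length());
    }
  }

  /** Creates an instance of clazz, refusing types that can never be instantiated. */
  static Comparator<?> create(Class<?> clazz) {
    if (clazz.isInterface() || Modifier.isAbstract(clazz.getModifiers())) {
      throw new IllegalArgumentException(clazz.getName() + " is not a concrete class");
    }
    try {
      return (Comparator<?>) clazz.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate " + clazz.getName(), e);
    }
  }
}
```

Calling `create(Comparator.class)` fails immediately with a message naming the offending type, instead of surfacing as a nested `InstantiationException` several layers up.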



[jira] [Commented] (HBASE-6832) [WINDOWS] Tests should use explicit timestamp for Puts, and not rely on implicit RS timing

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561087#comment-13561087
 ] 

Hudson commented on HBASE-6832:
---

Integrated in HBase-TRUNK #3785 (See 
[https://builds.apache.org/job/HBase-TRUNK/3785/])
HBASE-6832. [WINDOWS] Tests should use explicit timestamp for Puts, and not 
rely on implicit RS timing (Revision 1437643)

 Result = FAILURE
enis : 
Files : 
* 
/hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/IncrementingEnvironmentEdge.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverBypass.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestScannerSelectionUsingTTL.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestLogsCleaner.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/thrift/TestThriftServerCmdLine.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestIncrementingEnvironmentEdge.java


 [WINDOWS] Tests should use explicit timestamp for Puts, and not rely on 
 implicit RS timing  
 

 Key: HBASE-6832
 URL: https://issues.apache.org/jira/browse/HBASE-6832
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Enis Soztutar
  Labels: windows
 Fix For: 0.96.0

 Attachments: hbase-6832_v1-0.94.patch, hbase-6832_v1-trunk.patch, 
 hbase-6832_v4-0.94.patch, hbase-6832_v4-trunk.patch, 
 hbase-6832_v5-trunk.patch, hbase-6832_v6-trunk.patch


 TestRegionObserverBypass.testMulti() fails with 
 {code}
 java.lang.AssertionError: expected:1 but was:0
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at 
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass.checkRowAndDelete(TestRegionObserverBypass.java:173)
   at 
 org.apache.hadoop.hbase.coprocessor.TestRegionObserverBypass.testMulti(TestRegionObserverBypass.java:166)
 {code}



[jira] [Commented] (HBASE-6816) [WINDOWS] line endings on checkout for .sh files

2013-01-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561089#comment-13561089
 ] 

Hudson commented on HBASE-6816:
---

Integrated in HBase-TRUNK #3785 (See 
[https://builds.apache.org/job/HBase-TRUNK/3785/])
HBASE-6816. [WINDOWS] line endings on checkout for .sh files (Revision 
1437642)

 Result = FAILURE
enis : 
Files : 
* /hbase/trunk/.gitattributes
* /hbase/trunk/src/site/resources/images/hbase_logo.svg


 [WINDOWS] line endings on checkout for .sh files
 

 Key: HBASE-6816
 URL: https://issues.apache.org/jira/browse/HBASE-6816
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 0.96.0

 Attachments: hbase-16_v1.patch, hbase-6816_v1.patch


 On code checkout from svn or git, we need to ensure that the line endings for 
 .sh files are LF, so that they work with cygwin. This is important for 
 getting src/saveVersion.sh to work. 
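
A common way to enforce this in git is a `.gitattributes` rule such as `*.sh text eol=lf`; whether the committed file contains exactly that rule is an assumption, since its contents are not shown here. To check a checkout, a small helper like the following could detect and normalize CRLF endings in a script. `LineEndingSketch` and its methods are hypothetical names used only for this sketch.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for checking and normalizing line endings of script files. */
final class LineEndingSketch {

  /** Returns true if the content contains at least one CR LF pair. */
  static boolean hasCrlf(byte[] content) {
    for (int i = 0; i + 1 < content.length; i++) {
      if (content[i] == '\r' && content[i + 1] == '\n') {
        return true;
      }
    }
    return false;
  }

  /** Returns a copy of the content with every CR LF pair replaced by a single LF. */
  static byte[] toLf(byte[] content) {
    // ISO-8859-1 maps each byte to one char and back, so other bytes are preserved.
    String text = new String(content, StandardCharsets.ISO_8859_1);
    return text.replace("\r\n", "\n").getBytes(StandardCharsets.ISO_8859_1);
  }
}
```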



[jira] [Commented] (HBASE-7382) Port ZK.multi support from HBASE-6775 to 0.96

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561099#comment-13561099
 ] 

Ted Yu commented on HBASE-7382:
---

I looked at the javadoc warnings and only saw warnings about Bytes.java

 Port ZK.multi support from HBASE-6775 to 0.96
 -

 Key: HBASE-7382
 URL: https://issues.apache.org/jira/browse/HBASE-7382
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Reporter: Gregory Chanan
Assignee: Himanshu Vashishtha
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-7382-trunk.patch


 HBASE-6775 adds support for ZK.multi ZKUtil and uses it for the 0.92/0.94 
 compatibility fix implemented in HBASE-6710.
 ZK.multi support is most likely useful in 0.96, but since HBASE-6710 is not 
 relevant for 0.96, perhaps we should find another use case first before we 
 port.



[jira] [Commented] (HBASE-7268) correct local region location cache information can be overwritten (or deleted) w/stale information from an old server

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1356#comment-1356
 ] 

Ted Yu commented on HBASE-7268:
---

Looks good to me.

I ran TestMiniClusterLoadSequential and TestMiniClusterLoadParallel against 
hadoop 2.0 - they passed.

 correct local region location cache information can be overwritten (or 
 deleted) w/stale information from an old server
 --

 Key: HBASE-7268
 URL: https://issues.apache.org/jira/browse/HBASE-7268
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.96.0

 Attachments: 7268-v6.patch, 7268-v8.patch, 
 HBASE-7268-addendum-v0.patch, HBASE-7268-v0.patch, HBASE-7268-v0.patch, 
 HBASE-7268-v1.patch, HBASE-7268-v2.patch, HBASE-7268-v2-plus-masterTs.patch, 
 HBASE-7268-v2-plus-masterTs.patch, HBASE-7268-v3.patch, HBASE-7268-v4.patch, 
 HBASE-7268-v5.patch, HBASE-7268-v6.patch, HBASE-7268-v7.patch, 
 HBASE-7268-v8.patch, HBASE-7268-v9.patch


 Discovered via HBASE-7250; related to HBASE-5877.
 Test is writing from multiple threads.
 Server A has region R; client knows that.
 R gets moved from A to server B.
 B gets killed.
 R gets moved by master to server C.
 ~15 seconds later, client tries to write to it (on A?).
 Multiple client threads report, from the RegionMoved exception processing 
 logic, that R moved from C to B, even though such a transition never happened 
 (neither in nor before the sequence described above). Not quite sure how the 
 client learned of the transition to C; I assume it's from meta via some other 
 thread...
 Then the put fails (it may fail due to accumulated errors that are not logged, 
 which I am investigating... but the bogus cache update is there 
 notwithstanding).
 I have a patch but am not sure if it works; the test still fails locally for 
 a yet unknown reason.



[jira] [Commented] (HBASE-7382) Port ZK.multi support from HBASE-6775 to 0.96

2013-01-23 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561131#comment-13561131
 ] 

Himanshu Vashishtha commented on HBASE-7382:


How do you dig into and fix findbugs warnings, Ted?

 Port ZK.multi support from HBASE-6775 to 0.96
 -

 Key: HBASE-7382
 URL: https://issues.apache.org/jira/browse/HBASE-7382
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Reporter: Gregory Chanan
Assignee: Himanshu Vashishtha
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-7382-trunk.patch


 HBASE-6775 adds support for ZK.multi ZKUtil and uses it for the 0.92/0.94 
 compatibility fix implemented in HBASE-6710.
 ZK.multi support is most likely useful in 0.96, but since HBASE-6710 is not 
 relevant for 0.96, perhaps we should find another use case first before we 
 port.



[jira] [Created] (HBASE-7651) RegionServerSnapshotManager does not accept subsquent snapshots if previous fails with NotServingRegionException.

2013-01-23 Thread Jonathan Hsieh (JIRA)
Jonathan Hsieh created HBASE-7651:
-

 Summary: RegionServerSnapshotManager does not accept subsquent 
snapshots if previous fails with NotServingRegionException.
 Key: HBASE-7651
 URL: https://issues.apache.org/jira/browse/HBASE-7651
 Project: HBase
  Issue Type: Sub-task
  Components: snapshots
Affects Versions: hbase-7290
Reporter: Jonathan Hsieh
Priority: Blocker


I've reproduced this problem consistently on a 20 node cluster.

The first run fails to take a snapshot on one node (jon-snapshots-2 in this 
case) due to a NotServingRegionException (this is acceptable):

{code}
2013-01-23 13:32:48,631 DEBUG 
org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher:  accepting 
received exception
org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable via 
jon-snapshots-2.ent.cloudera.com,22101,1358976524369:org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable:
 org.apache.hadoop.hbase.NotServingRegionException: 
TestTable,0002493652,1358976652443.b858147ad87a7812ac9a73dd8fef36ad. is closing
at 
org.apache.hadoop.hbase.errorhandling.ForeignException.deserialize(ForeignException.java:184)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureCoordinatorRpcs.abort(ZKProcedureCoordinatorRpcs.java:240)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureCoordinatorRpcs$1.nodeCreated(ZKProcedureCoordinatorRpcs.java:182)
at 
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:294)
at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
Caused by: 
org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: 
org.apache.hadoop.hbase.NotServingRegionException: 
TestTable,0002493652,1358976652443.b858147ad87a7812ac9a73dd8fef36ad. is closing
at 
org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool.waitForOutstandingTasks(RegionServerSnapshotManager.java:343)
at 
org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.flushSnapshot(FlushSnapshotSubprocedure.java:107)
at 
org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.insideBarrier(FlushSnapshotSubprocedure.java:123)
at 
org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:181)
at 
org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:52)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
2013-01-23 13:32:48,631 DEBUG 
org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher:  Recieved 
error, notifying listeners...
2013-01-23 13:32:48,730 ERROR org.apache.hadoop.hbase.procedure.Procedure: 
Procedure 'pe-6' execution failed!
org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable via 
jon-snapshots-2.ent.cloudera.com,22101,1358976524369:org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable:
 org.apache.hadoop.hbase.NotServingRegionException: 
TestTable,0002493652,1358976652443.b858147ad87a7812ac9a73dd8fef36ad. is closing
at 
org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:84)
at 
org.apache.hadoop.hbase.procedure.Procedure.waitForLatch(Procedure.java:357)
at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:203)
at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:68)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: 
org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: 
org.apache.hadoop.hbase.NotServingRegionException: 
TestTable,0002493652,1358976652443.b858147ad87a7812ac9a73dd8fef36ad. is closing
at 
org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool.waitForOutstandingTasks(RegionServerSnapshotManager.java:343)
at 
org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.flushSnapshot(FlushSnapshotSubprocedure.java:107)
at 

[jira] [Assigned] (HBASE-7651) RegionServerSnapshotManager does not accept subsquent snapshots if previous fails with NotServingRegionException.

2013-01-23 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh reassigned HBASE-7651:
-

Assignee: Jonathan Hsieh

 RegionServerSnapshotManager does not accept subsquent snapshots if previous 
 fails with NotServingRegionException.
 -

 Key: HBASE-7651
 URL: https://issues.apache.org/jira/browse/HBASE-7651
 Project: HBase
  Issue Type: Sub-task
  Components: snapshots
Affects Versions: hbase-7290
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker


[jira] [Commented] (HBASE-7382) Port ZK.multi support from HBASE-6775 to 0.96

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561137#comment-13561137
 ] 

Ted Yu commented on HBASE-7382:
---

One way is to go through hbase-server findbugs xml, looking for the files (and 
lines) touched by your patch.

The other way is to diff 
https://builds.apache.org/job/PreCommit-HBASE-Build/4149/artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.xml
 with 
https://builds.apache.org/job/PreCommit-HBASE-Build/4150/artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.xml
This should narrow your search.
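
As a rough illustration of the second approach (diffing the two reports), here is a sketch that assumes each findbugs warning has already been reduced to one string per warning, for example "class:line:bugType". That reduction step and the names below are assumptions for illustration; they are not part of findbugs or of the HBase build.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class WarningDiffSketch {
    /** Returns the warnings present in 'after' but not in 'before', in report order. */
    static Set<String> newWarnings(List<String> before, List<String> after) {
        Set<String> added = new LinkedHashSet<>(after);
        added.removeAll(before);
        return added;
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("A:1:NP", "B:2:DLS");
        List<String> after = Arrays.asList("A:1:NP", "C:9:UWF", "B:2:DLS");
        System.out.println(newWarnings(before, after));
    }
}
```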

 Port ZK.multi support from HBASE-6775 to 0.96
 -

 Key: HBASE-7382
 URL: https://issues.apache.org/jira/browse/HBASE-7382
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Reporter: Gregory Chanan
Assignee: Himanshu Vashishtha
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-7382-trunk.patch


 HBASE-6775 adds support for ZK.multi ZKUtil and uses it for the 0.92/0.94 
 compatibility fix implemented in HBASE-6710.
 ZK.multi support is most likely useful in 0.96, but since HBASE-6710 is not 
 relevant for 0.96, perhaps we should find another use case first before we 
 port.



[jira] [Commented] (HBASE-7643) HFileArchiver.resolveAndArchive() race condition may lead to snapshot data loss

2013-01-23 Thread Matteo Bertozzi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561143#comment-13561143
 ] 

Matteo Bertozzi commented on HBASE-7643:


Committed p4-v5 to the snapshot branch to get more coverage (jenkins, test 
rig, ...).
I'll commit it to trunk in a couple of days if everything is fine and there are 
no objections.

 HFileArchiver.resolveAndArchive() race condition may lead to snapshot data 
 loss
 ---

 Key: HBASE-7643
 URL: https://issues.apache.org/jira/browse/HBASE-7643
 Project: HBase
  Issue Type: Bug
Affects Versions: hbase-6055, 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-7653-p4-v0.patch, HBASE-7653-p4-v1.patch, 
 HBASE-7653-p4-v2.patch, HBASE-7653-p4-v3.patch, HBASE-7653-p4-v4.patch, 
 HBASE-7653-p4-v5.patch


  * The master has an hfile cleaner thread (responsible for cleaning 
 the /hbase/.archive dir)
  ** /hbase/.archive/table/region/family/hfile
  ** if the table/region/family directory is empty the cleaner removes it
  * The master can archive files (from another thread, e.g. DeleteTableHandler)
  * The region can archive files (from another server/process, e.g. compaction)
 The simplified file archiving code looks like this:
 {code}
 HFileArchiver.resolveAndArchive(...) {
   // ensure that the archive dir exists
   fs.mkdir(archiveDir);
   // move the file to the archive
   success = fs.rename(originalPath/fileName, archiveDir/fileName);
   // if the rename failed, delete the file without archiving
   if (!success) fs.delete(originalPath/fileName);
 }
 {code}
 Since there's no synchronization between HFileArchiver.resolveAndArchive() 
 and the cleaner run (different process, thread, ...), you can end up 
 moving a file into a directory that no longer exists.
 {code}
 fs.mkdir(archiveDir);
 // HFileCleaner chore starts at this point
 // and the archiveDirectory that we just ensured to be present gets removed.
 // The rename at this point will fail since the parent directory is missing.
 success = fs.rename(originalPath/fileName, archiveDir/fileName)
 {code}
 The problem with deleting the file without archiving it is that you lose 
 data if a snapshot or a cloned table relies on that file being present.
 Possible solutions:
  * Create a ZooKeeper lock to notify the master ("Hey, I'm archiving 
 something, wait a bit")
  * Add a RS -> Master call so that the master removes the files, avoiding 
 this kind of situation
  * Avoid removing empty directories from the archive if the table exists or 
 is not disabled
  * Add a try/catch around the fs.rename
 The last one, the easiest one, looks like:
 {code}
 for (int i = 0; i < retries; ++i) {
   // ensure the archive directory is present
   fs.mkdir(archiveDir);
   // possible race here
   // try to archive file
   success = fs.rename(originalPath/fileName, archiveDir/fileName);
   if (success) break;
 }
 {code}
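
The retry idea can be sketched as a runnable example. This is only an illustration under stated assumptions: it uses java.nio.file on a local filesystem as a stand-in for the Hadoop FileSystem calls in the pseudocode, and the names ArchiveRetrySketch and archiveWithRetry are hypothetical. Unlike the simplified archiving code earlier in the description, it never deletes the source when archiving fails.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

final class ArchiveRetrySketch {
    /**
     * Moves source into archiveDir, recreating archiveDir before each attempt.
     * Returns true once the move succeeds, false if every attempt failed.
     */
    static boolean archiveWithRetry(Path source, Path archiveDir, int retries) {
        for (int i = 0; i < retries; ++i) {
            try {
                // Ensure the archive directory is present (no-op if it exists).
                Files.createDirectories(archiveDir);
                // A concurrent cleaner could remove archiveDir right here.
                Files.move(source, archiveDir.resolve(source.getFileName()));
                return true;
            } catch (NoSuchFileException e) {
                // A path vanished between the two calls: recreate and retry.
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return false; // the source is left in place, not deleted
    }

    /** Runs the sketch in a temp directory; true if the file ended up archived. */
    static boolean demo() {
        try {
            Path root = Files.createTempDirectory("archive-sketch");
            Path file = Files.createFile(root.resolve("hfile-0001"));
            Path archive = root.resolve("archive").resolve("family");
            return archiveWithRetry(file, archive, 3)
                && Files.exists(archive.resolve("hfile-0001"))
                && Files.notExists(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("archived=" + demo());
    }
}
```

Returning false rather than deleting keeps the caller in control: a file that a snapshot may still reference stays where it was until it can be archived.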



[jira] [Commented] (HBASE-7268) correct local region location cache information can be overwritten (or deleted) w/stale information from an old server

2013-01-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561144#comment-13561144
 ] 

Hadoop QA commented on HBASE-7268:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12566190/HBASE-7268-addendum-v0.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 3 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestFromClientSide

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4152//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4152//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4152//console

This message is automatically generated.

 correct local region location cache information can be overwritten (or 
 deleted) w/stale information from an old server
 --

 Key: HBASE-7268
 URL: https://issues.apache.org/jira/browse/HBASE-7268
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.96.0

 Attachments: 7268-v6.patch, 7268-v8.patch, 
 HBASE-7268-addendum-v0.patch, HBASE-7268-v0.patch, HBASE-7268-v0.patch, 
 HBASE-7268-v1.patch, HBASE-7268-v2.patch, HBASE-7268-v2-plus-masterTs.patch, 
 HBASE-7268-v2-plus-masterTs.patch, HBASE-7268-v3.patch, HBASE-7268-v4.patch, 
 HBASE-7268-v5.patch, HBASE-7268-v6.patch, HBASE-7268-v7.patch, 
 HBASE-7268-v8.patch, HBASE-7268-v9.patch


 Discovered via HBASE-7250; related to HBASE-5877.
 Test is writing from multiple threads.
 Server A has region R; client knows that.
 R gets moved from A to server B.
 B gets killed.
 R gets moved by master to server C.
 ~15 seconds later, client tries to write to it (on A?).
 Multiple client threads report, from the RegionMoved exception processing 
 logic, that R moved from C to B, even though no such transition ever happened 
 (neither in nor before the sequence described above). I'm not quite sure how 
 the client learned of the transition to C; I assume it's from meta, via some 
 other thread...
 Then the put fails (it may fail due to accumulated errors that are not logged, 
 which I am investigating... but the bogus cache update is there 
 notwithstanding).
 I have a patch but am not sure whether it works; the test still fails locally 
 for an as yet unknown reason.



[jira] [Commented] (HBASE-7651) RegionServerSnapshotManager does not accept subsquent snapshots if previous fails with NotServingRegionException.

2013-01-23 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561156#comment-13561156
 ] 

Jonathan Hsieh commented on HBASE-7651:
---

NSREs are possible with this snapshotting implementation (the master gets a list 
of regions/regionservers to care about, regions move, and then the snapshotting 
request is sent to the regionservers).

Restarting the particular node (jon-snapshots-2 in the example) fixes the 
problem, but when the next NSRE pops up elsewhere we get stuck again.

 RegionServerSnapshotManager does not accept subsquent snapshots if previous 
 fails with NotServingRegionException.
 -

 Key: HBASE-7651
 URL: https://issues.apache.org/jira/browse/HBASE-7651
 Project: HBase
  Issue Type: Sub-task
  Components: snapshots
Affects Versions: hbase-7290
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker


[jira] [Updated] (HBASE-7114) Increment does not extend Mutation but probably should

2013-01-23 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-7114:
--

Affects Version/s: 0.96.0

 Increment does not extend Mutation but probably should
 --

 Key: HBASE-7114
 URL: https://issues.apache.org/jira/browse/HBASE-7114
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.96.0
Reporter: Andrew Purtell
Priority: Minor

 Increment is the only operation in the class of mutators that does not extend 
 Mutation. It mostly duplicates what Mutation provides, but not quite. The 
 signatures for setWriteToWAL and getFamilyMap are slightly different. This 
 can be inconvenient because it requires special case code and therefore could 
 be considered an API design nit. Unfortunately it is not a simple change: The 
 interface is marked stable and the internals of the family map are different 
 from other mutation types. The latter is why I suspect this was not addressed 
 when Mutation was introduced.



[jira] [Commented] (HBASE-5664) CP hooks in Scan flow for fast forward when filter filters out a row

2013-01-23 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561170#comment-13561170
 ] 

Andrew Purtell commented on HBASE-5664:
---

+1 if this is what you need, Anoop.

 CP hooks in Scan flow for fast forward when filter filters out a row
 

 Key: HBASE-5664
 URL: https://issues.apache.org/jira/browse/HBASE-5664
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors, Filters
Affects Versions: 0.92.1
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-5664_94.patch, HBASE-5664_94_V2.patch, 
 HBASE-5664_Trunk.patch, HBASE-5664_Trunk_V2.patch


 In HRegion.nextInternal(int limit, String metric) we have a while(true) loop 
 that fetches the next result satisfying the filter condition. When the Filter 
 filters out the currently fetched row, we call nextRow(byte [] currentRow) 
 before moving on to the next row.
 {code}
 if (results.isEmpty() || filterRow()) {
 // this seems like a redundant step - we already consumed the row
 // there're no left overs.
 // the reasons for calling this method are:
 // 1. reset the filters.
 // 2. provide a hook to fast forward the row (used by subclasses)
 nextRow(currentRow);
 {code}
 Note reason 2 in the comment: nextRow() is a hook to fast forward the row 
 (used by subclasses).
 We can provide the same fast-forward support for coprocessors (CP) as well.
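
A small sketch of what such a hook could look like in principle. All names here (ScanHookSketch, scan, afterFilteredRow) are hypothetical and do not reflect the actual coprocessor API or the attached patches. Rows are modelled as integer indexes: when the filter drops a row, the hook returns the next row to visit, which lets it skip ahead instead of advancing one row at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

final class ScanHookSketch {
    /**
     * Scans row indexes [0, rowCount). Rows rejected by 'keep' are filtered
     * out, and afterFilteredRow then chooses the next row index to visit.
     */
    static List<Integer> scan(int rowCount, IntPredicate keep,
                              IntUnaryOperator afterFilteredRow) {
        List<Integer> results = new ArrayList<>();
        int row = 0;
        while (row < rowCount) {
            if (keep.test(row)) {
                results.add(row);
                row++;
            } else {
                int next = afterFilteredRow.applyAsInt(row);
                row = Math.max(next, row + 1); // never stall or move backwards
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Keep row 0 and rows >= 7; the hook jumps straight to row 7.
        System.out.println(scan(10, r -> r == 0 || r >= 7, r -> 7));
    }
}
```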



[jira] [Commented] (HBASE-7403) Online Merge

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561190#comment-13561190
 ] 

Ted Yu commented on HBASE-7403:
---

@Chunhui:
Looks like you have 2 +1's already.

 Online Merge
 

 Key: HBASE-7403
 URL: https://issues.apache.org/jira/browse/HBASE-7403
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.94.3
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0, 0.94.5

 Attachments: 7403-trunkv5.patch, 7403-trunkv6.patch, 7403v5.diff, 
 7403-v5.txt, 7403v5.txt, hbase-7403-94v1.patch, hbase-7403-trunkv10.patch, 
 hbase-7403-trunkv11.patch, hbase-7403-trunkv1.patch, 
 hbase-7403-trunkv5.patch, hbase-7403-trunkv6.patch, hbase-7403-trunkv7.patch, 
 hbase-7403-trunkv8.patch, hbase-7403-trunkv9.patch, merge region.pdf


 Features of this online merge:
 1. Online: no need to disable the table
 2. Few changes to the current code; could be applied to trunk, 0.94, 0.92 or 0.90
 3. Easy to issue a merge request: no need to input a long region name, the 
 encoded name is enough
 4. No restrictions during the operation: you don't need to take care of events 
 like Server Dead, Balance, Split, Disabling/Enabling table, or of whether you 
 sent a wrong merge request; this is already handled for you
 5. Only a short offline time for the two merging regions
 Usage:
 1. Tool:
 bin/hbase org.apache.hadoop.hbase.util.OnlineMerge [-force] [-async] [-show] 
 table-name region-encodedname-1 region-encodedname-2
 2. API: static void MergeManager#createMergeRequest
 We need merge in the following cases:
 1. Region holes or region overlaps that can't be fixed by hbck
 2. Regions that became empty because of TTL and an unreasonable rowkey design
 3. Regions that are always empty or very small because of presplitting at 
 table creation
 4. Too many empty or small regions reduce system performance (e.g. 
 mslab)
 The current merge tools only work offline and cannot redo the merge if an 
 exception is thrown in the process, which leaves dirty data behind.
 For an online system, we need an online merge.
 The logic of this patch for Online Merge is as follows.
 For example, to merge regionA and regionB into regionC:
 1. Offline the two regions A and B
 2. Merge the two regions in HDFS (create regionC's directory, move 
 regionA's and regionB's files to regionC's directory, delete regionA's and 
 regionB's directories)
 3. Add the merged regionC to .META.
 4. Assign the merged regionC
 By design of this patch, once we do the merge work in HDFS, we can redo 
 it until it succeeds if it throws an exception, aborts, or the server 
 restarts, but it cannot be rolled back.
 It depends on:
 Using ZooKeeper to record the transaction journal state, which makes redo easier
 Using ZooKeeper to send/receive merge requests
 Executing the merge transaction on the master
 Supporting merge requests through the API or the shell tool
 For details of the merge process, please see the attachment and patch
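
The redo-only behaviour of the four steps can be sketched as a tiny journaled state machine. This is an illustration under assumptions, not the patch's implementation: the journal is a plain field standing in for the state the description keeps in ZooKeeper, the steps are no-op placeholders, and the names MergeJournalSketch, Step and run are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class MergeJournalSketch {
    enum Step { OFFLINE_REGIONS, MERGE_IN_HDFS, ADD_TO_META, ASSIGN_MERGED, DONE }

    // Stand-in for the persisted journal state: the next step to execute.
    private Step journal = Step.OFFLINE_REGIONS;

    // Steps actually performed, in order, across all runs.
    final List<Step> executed = new ArrayList<>();

    /**
     * Runs the remaining steps from the journaled position.
     * failAt (may be null) simulates a crash just before that step;
     * the journal then still points at the unfinished step.
     */
    boolean run(Step failAt) {
        while (journal != Step.DONE) {
            if (journal == failAt) {
                return false;
            }
            executed.add(journal);                          // perform the step
            journal = Step.values()[journal.ordinal() + 1]; // record progress
        }
        return true;
    }

    public static void main(String[] args) {
        MergeJournalSketch merge = new MergeJournalSketch();
        System.out.println("first run ok=" + merge.run(Step.ADD_TO_META));
        System.out.println("redo ok=" + merge.run(null));
        System.out.println(merge.executed);
    }
}
```

A second call to run resumes at the journaled step rather than starting over, which is the "redo until successful, no rollback" property the description asks for.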



[jira] [Commented] (HBASE-7651) RegionServerSnapshotManager does not accept subsquent snapshots if previous fails with NotServingRegionException.

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561198#comment-13561198
 ] 

Ted Yu commented on HBASE-7651:
---

Line 343 in RegionServerSnapshotManager#waitForOutstandingTasks():
{code}
LOG.warn("cancelling region task");
f.cancel(true);
{code}
Shall we pass false to cancel()?
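
For context on the question, a small standalone sketch of what the boolean argument to Future.cancel() changes for a task that is already running. It is not HBase code: CancelSketch and workerWasInterrupted are hypothetical names, the sleep stands in for a slow region task, and the lambda syntax assumes Java 8 or later.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class CancelSketch {
    /** Returns true if the running worker saw an interrupt after cancel(mayInterrupt). */
    static boolean workerWasInterrupted(boolean mayInterrupt) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        boolean[] interrupted = {false};
        try {
            Future<?> f = pool.submit(() -> {
                started.countDown();
                try {
                    Thread.sleep(300); // stands in for a slow region task
                } catch (InterruptedException e) {
                    interrupted[0] = true;
                } finally {
                    finished.countDown();
                }
            });
            started.await();                     // the task is running now
            f.cancel(mayInterrupt);
            finished.await(5, TimeUnit.SECONDS); // wait for the worker to end
            return interrupted[0];
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("cancel(true)  interrupted worker: " + workerWasInterrupted(true));
        System.out.println("cancel(false) interrupted worker: " + workerWasInterrupted(false));
    }
}
```

With cancel(true) the worker thread is interrupted mid-task; with cancel(false) a task that has already started runs to completion, although the Future still reports itself as cancelled.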

 RegionServerSnapshotManager does not accept subsquent snapshots if previous 
 fails with NotServingRegionException.
 -

 Key: HBASE-7651
 URL: https://issues.apache.org/jira/browse/HBASE-7651
 Project: HBase
  Issue Type: Sub-task
  Components: snapshots
Affects Versions: hbase-7290
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker

 I've reproduced this problem consistently on a 20 node cluster.
 The first run fails to take a snapshot on one node (jon-snapshots-2 in this 
 case) due to a NotServingRegionException (this is acceptable):
 {code}
 2013-01-23 13:32:48,631 DEBUG 
 org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher:  accepting 
 received exception
 org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable via 
 jon-snapshots-2.ent.cloudera.com,22101,1358976524369:org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable:
  org.apache.hadoop.hbase.NotServingRegionException: 
 TestTable,0002493652,1358976652443.b858147ad87a7812ac9a73dd8fef36ad. is 
 closing
 at 
 org.apache.hadoop.hbase.errorhandling.ForeignException.deserialize(ForeignException.java:184)
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureCoordinatorRpcs.abort(ZKProcedureCoordinatorRpcs.java:240)
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureCoordinatorRpcs$1.nodeCreated(ZKProcedureCoordinatorRpcs.java:182)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:294)
 at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
 at 
 org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 Caused by: 
 org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable: 
 org.apache.hadoop.hbase.NotServingRegionException: 
 TestTable,0002493652,1358976652443.b858147ad87a7812ac9a73dd8fef36ad. is 
 closing
 at 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool.waitForOutstandingTasks(RegionServerSnapshotManager.java:343)
 at 
 org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.flushSnapshot(FlushSnapshotSubprocedure.java:107)
 at 
 org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.insideBarrier(FlushSnapshotSubprocedure.java:123)
 at 
 org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:181)
 at 
 org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:52)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 2013-01-23 13:32:48,631 DEBUG 
 org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher:  Recieved 
 error, notifying listeners...
 2013-01-23 13:32:48,730 ERROR org.apache.hadoop.hbase.procedure.Procedure: 
 Procedure 'pe-6' execution failed!
 org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable via 
 jon-snapshots-2.ent.cloudera.com,22101,1358976524369:org.apache.hadoop.hbase.errorhandling.ForeignException$ProxyThrowable:
  org.apache.hadoop.hbase.NotServingRegionException: 
 TestTable,0002493652,1358976652443.b858147ad87a7812ac9a73dd8fef36ad. is 
 closing
 at 
 org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:84)
 at 
 org.apache.hadoop.hbase.procedure.Procedure.waitForLatch(Procedure.java:357)
 at 
 org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:203)
 at org.apache.hadoop.hbase.procedure.Procedure.call(Procedure.java:68)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: 
 

[jira] [Commented] (HBASE-5664) CP hooks in Scan flow for fast forward when filter filters out a row

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561203#comment-13561203
 ] 

Ted Yu commented on HBASE-5664:
---

Integrated to trunk.

Thanks for the patch, Anoop.

Thanks for the review, Andy.

 CP hooks in Scan flow for fast forward when filter filters out a row
 

 Key: HBASE-5664
 URL: https://issues.apache.org/jira/browse/HBASE-5664
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors, Filters
Affects Versions: 0.92.1
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.96.0, 0.94.5

 Attachments: HBASE-5664_94.patch, HBASE-5664_94_V2.patch, 
 HBASE-5664_Trunk.patch, HBASE-5664_Trunk_V2.patch


 In HRegion.nextInternal(int limit, String metric), a while(true) loop fetches 
 the next result that satisfies the filter condition. When the Filter filters 
 out the currently fetched row, we call nextRow(byte [] currentRow) before 
 moving on to the next row.
 {code}
 if (results.isEmpty() || filterRow()) {
 // this seems like a redundant step - we already consumed the row
 // there're no left overs.
 // the reasons for calling this method are:
 // 1. reset the filters.
 // 2. provide a hook to fast forward the row (used by subclasses)
 nextRow(currentRow);
 {code}
 As reason 2 notes, nextRow() doubles as a hook that lets subclasses fast 
 forward the row.
 We can provide the same fast-forward support for coprocessors (CP) as well.
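
The idea can be illustrated with a simplified, self-contained sketch. It does not use any HBase types: `FilteredRowHook`, `scan`, and `skipSamePrefix` are hypothetical names, rows are plain strings, and the hook signature is an assumption made for illustration rather than the coprocessor API added by the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class FastForwardSketch {
  /** Hypothetical hook, called after a row is filtered out; returns the index to resume from. */
  interface FilteredRowHook {
    int afterRowFiltered(List<String> rows, int filteredIndex);
  }

  /** Scan loop: when the filter rejects a row, the hook decides how far to skip ahead. */
  static List<String> scan(List<String> rows, Predicate<String> filterOut, FilteredRowHook hook) {
    List<String> results = new ArrayList<>();
    int i = 0;
    while (i < rows.size()) {
      String row = rows.get(i);
      if (filterOut.test(row)) {
        int next = hook.afterRowFiltered(rows, i);
        i = Math.max(next, i + 1); // always make progress, never move backwards
        continue;
      }
      results.add(row);
      i++;
    }
    return results;
  }

  /** Example hook: skip every following row that shares the filtered row's first character. */
  static int skipSamePrefix(List<String> rows, int filteredIndex) {
    char prefix = rows.get(filteredIndex).charAt(0);
    int j = filteredIndex + 1;
    while (j < rows.size() && rows.get(j).charAt(0) == prefix) {
      j++;
    }
    return j;
  }

  public static void main(String[] args) {
    List<String> rows = Arrays.asList("a1", "a2", "a3", "b1", "b2", "c1");
    // The filter rejects only "a1"; the hook then fast forwards past "a2" and "a3".
    System.out.println(scan(rows, r -> r.equals("a1"), FastForwardSketch::skipSamePrefix));
  }
}
```

Here the filter itself never sees "a2" or "a3"; the hook skips them once "a1" has been rejected, which is the kind of fast forward the description asks for.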



[jira] [Updated] (HBASE-7649) client retry timeout doesn't need to do x2 fallback when going to different server

2013-01-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7649:


Attachment: HBASE-7649-v0.patch

Attaching patch that tracks retries by server.

 client retry timeout doesn't need to do x2 fallback when going to different 
 server
 --

 Key: HBASE-7649
 URL: https://issues.apache.org/jira/browse/HBASE-7649
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7649-v0.patch


 See HBASE-7520. When we go to server A, get a bunch of failures, and then 
 finally learn that the region is on B, it doesn't make sense to wait for 30 
 seconds before going to B.
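
One way to picture tracking retries by server is the sketch below. It is an assumption-laden illustration, not the attached patch: `PerServerBackoff` and its methods are hypothetical names, and the plain doubling of the pause is a simplification of whatever backoff schedule the client actually uses.

```java
import java.util.HashMap;
import java.util.Map;

class PerServerBackoff {
  private final long basePauseMs;
  // Failure counts are kept per server rather than per operation.
  private final Map<String, Integer> failuresByServer = new HashMap<>();

  PerServerBackoff(long basePauseMs) {
    this.basePauseMs = basePauseMs;
  }

  /** Records a failure against one server and returns the pause before retrying that server. */
  long recordFailure(String server) {
    int failures = failuresByServer.merge(server, 1, Integer::sum);
    return pauseFor(failures);
  }

  /** Pause to apply before contacting a server; zero if it has not failed yet. */
  long pauseBeforeTrying(String server) {
    int failures = failuresByServer.getOrDefault(server, 0);
    return failures == 0 ? 0 : pauseFor(failures);
  }

  private long pauseFor(int failures) {
    int shift = Math.min(failures - 1, 10); // cap the growth (assumed limit)
    return basePauseMs << shift;            // doubles with each failure on the same server
  }

  public static void main(String[] args) {
    PerServerBackoff backoff = new PerServerBackoff(100);
    backoff.recordFailure("A");
    backoff.recordFailure("A");
    System.out.println("third failure on A -> pause " + backoff.recordFailure("A") + " ms");
    System.out.println("first attempt on B -> pause " + backoff.pauseBeforeTrying("B") + " ms");
  }
}
```

With this bookkeeping, the failures accumulated against A do not inflate the wait before the first attempt on B.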



[jira] [Updated] (HBASE-7649) client retry timeout doesn't need to do x2 fallback when going to different server

2013-01-23 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7649:


Status: Patch Available  (was: Open)

 client retry timeout doesn't need to do x2 fallback when going to different 
 server
 --

 Key: HBASE-7649
 URL: https://issues.apache.org/jira/browse/HBASE-7649
 Project: HBase
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7649-v0.patch


 See HBASE-7520. When we go to server A, get a bunch of failures, and then 
 finally learn that the region is on B, it doesn't make sense to wait for 30 
 seconds before going to B.



[jira] [Commented] (HBASE-7268) correct local region location cache information can be overwritten (or deleted) w/stale information from an old server

2013-01-23 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561224#comment-13561224
 ] 

Sergey Shelukhin commented on HBASE-7268:
-

Should this be OK to commit?

 correct local region location cache information can be overwritten (or 
 deleted) w/stale information from an old server
 --

 Key: HBASE-7268
 URL: https://issues.apache.org/jira/browse/HBASE-7268
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.96.0

 Attachments: 7268-v6.patch, 7268-v8.patch, 
 HBASE-7268-addendum-v0.patch, HBASE-7268-v0.patch, HBASE-7268-v0.patch, 
 HBASE-7268-v1.patch, HBASE-7268-v2.patch, HBASE-7268-v2-plus-masterTs.patch, 
 HBASE-7268-v2-plus-masterTs.patch, HBASE-7268-v3.patch, HBASE-7268-v4.patch, 
 HBASE-7268-v5.patch, HBASE-7268-v6.patch, HBASE-7268-v7.patch, 
 HBASE-7268-v8.patch, HBASE-7268-v9.patch


 Discovered via HBASE-7250; related to HBASE-5877.
 Test is writing from multiple threads.
 Server A has region R; client knows that.
 R gets moved from A to server B.
 B gets killed.
 R gets moved by master to server C.
 ~15 seconds later, client tries to write to it (on A?).
 Multiple client threads report, from the RegionMoved exception processing 
 logic, that R moved from C to B, even though such a transition never happened 
 (neither in nor before the sequence described above). I'm not quite sure how 
 the client learned of the transition to C; I assume it's from meta, via some 
 other thread...
 Then the put fails (it may fail due to accumulated errors that are not logged, 
 which I am investigating... but the bogus cache update is there 
 notwithstanding).
 I have a patch but am not sure whether it works; the test still fails locally 
 for an as-yet unknown reason.
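
To make the failure mode concrete, here is a minimal sketch of one possible guard: a "region moved" report is applied only if the cache still points at the server that sent it. This is an illustrative assumption, not necessarily what the attached patches do; `RegionLocationCache` and its methods are hypothetical names, and regions and servers are plain strings.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class RegionLocationCache {
  private final ConcurrentMap<String, String> serverByRegion = new ConcurrentHashMap<>();

  void put(String region, String server) {
    serverByRegion.put(region, server);
  }

  String get(String region) {
    return serverByRegion.get(region);
  }

  /**
   * Applies "region moved to newServer" as reported by source, but only if the
   * cache still says the region lives on source. A late report from an old
   * server therefore cannot overwrite a newer, correct entry.
   */
  boolean updateFromRedirect(String region, String source, String newServer) {
    return serverByRegion.replace(region, source, newServer);
  }

  /** Drops the entry only if it still points at the server that failed. */
  boolean removeIfStillOn(String region, String source) {
    return serverByRegion.remove(region, source);
  }

  public static void main(String[] args) {
    RegionLocationCache cache = new RegionLocationCache();
    cache.put("R", "C"); // the client already knows R is on C
    boolean applied = cache.updateFromRedirect("R", "A", "B"); // stale report from A
    System.out.println("stale update applied: " + applied + ", R is on " + cache.get("R"));
  }
}
```

In the scenario above, the late "moved to B" report from A is ignored because the cache no longer lists A as the location of R.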



[jira] [Commented] (HBASE-7382) Port ZK.multi support from HBASE-6775 to 0.96

2013-01-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561236#comment-13561236
 ] 

Ted Yu commented on HBASE-7382:
---

The 3 additional findbugs warnings are the following:
{code}
<BugInstance type="HE_EQUALS_USE_HASHCODE" priority="1" abbrev="HE" category="BAD_PRACTICE">
<Class classname="org.apache.hadoop.hbase.zookeeper.ZKUtil$ZKUtilOp$CreateAndFailSilent">
...
<BugInstance type="HE_EQUALS_USE_HASHCODE" priority="1" abbrev="HE" category="BAD_PRACTICE">
<Class classname="org.apache.hadoop.hbase.zookeeper.ZKUtil$ZKUtilOp$DeleteNodeFailSilent">
...
<BugInstance type="HE_EQUALS_USE_HASHCODE" priority="1" abbrev="HE" category="BAD_PRACTICE">
<Class classname="org.apache.hadoop.hbase.zookeeper.ZKUtil$ZKUtilOp$SetData">
...
{code}
Please refer to newPatchFindbugsWarningshbase-server.xml from PreCommit build 
4150 for details.
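
HE_EQUALS_USE_HASHCODE is reported when a class overrides equals() but inherits Object.hashCode(). A minimal sketch of the usual remedy follows; `SetDataOp` is a hypothetical stand-in, not the real ZKUtil.ZKUtilOp.SetData class, and its fields are assumed for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class SetDataOp {
  private final String path;
  private final byte[] data;

  SetDataOp(String path, byte[] data) {
    this.path = path;
    this.data = data.clone();
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof SetDataOp)) return false;
    SetDataOp other = (SetDataOp) o;
    return path.equals(other.path) && Arrays.equals(data, other.data);
  }

  // Overriding hashCode() alongside equals() keeps the two consistent:
  // objects that compare equal must produce the same hash code.
  @Override
  public int hashCode() {
    return 31 * path.hashCode() + Arrays.hashCode(data);
  }

  public static void main(String[] args) {
    Set<SetDataOp> ops = new HashSet<>();
    ops.add(new SetDataOp("/hbase/node", new byte[] {1, 2}));
    // Succeeds only because hashCode() agrees with equals().
    System.out.println(ops.contains(new SetDataOp("/hbase/node", new byte[] {1, 2})));
  }
}
```

Without the hashCode() override, equal instances could land in different hash buckets, so lookups in HashSet or HashMap would miss them.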

 Port ZK.multi support from HBASE-6775 to 0.96
 -

 Key: HBASE-7382
 URL: https://issues.apache.org/jira/browse/HBASE-7382
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Reporter: Gregory Chanan
Assignee: Himanshu Vashishtha
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-7382-trunk.patch


 HBASE-6775 adds support for ZK.multi to ZKUtil and uses it for the 0.92/0.94 
 compatibility fix implemented in HBASE-6710.
 ZK.multi support is most likely useful in 0.96, but since HBASE-6710 is not 
 relevant for 0.96, perhaps we should find another use case first before we 
 port.


