Hi St.Ack, many thanks for your detailed reply. This clears up a lot of things!
On Wed, Apr 13, 2011 at 01:48, Stack <[email protected]> wrote:

> So, HBase 0.90.2 and the tip of branch-0.20-append is recommended.
>
> To summarize the RPC version differences:
>
> - Hadoop 0.20.2 release uses version 41.
> - The Hadoop 0.20-append version shipped with HBase 0.90.{1,2} uses 42.
> - The trunk version of Hadoop 0.20-append uses 43.

So I understand that in order to actually install Hadoop 0.20-append for
use with HBase 0.90.2, we can simply take an existing 0.20.2 release
installation and replace its JAR files with the append JAR files built
from the tip of branch-0.20-append. My own answer to this question would
be "yes", but since it is a critical question I still want to ask it.

Regarding the various build test failures: I read through the links you
posted (e.g. your comments in HBASE-3285), and it seems that the failures
of org.apache.hadoop.hdfs.TestFileAppend4 do not currently tell us whether
the build itself is actually broken. In other words, some unit tests are
simply broken at the tip of branch-0.20-append, so we may (have to) ignore
this failure because it doesn't really tell us anything at the moment.
Right?

> > 2) When I run "ant test" for the latest version of the append branch, I
> > get the same error as before. However, I sometimes -- not always -- get
> > additional failures/errors for
> > * TEST-org.apache.hadoop.hdfs.server.namenode.TestEditLogRace.txt [11]
> > * TEST-org.apache.hadoop.hdfs.TestMultiThreadedSync.txt [12]
> > both of which look like "general" errors to me. Maybe a problem of
> > the machine I'm running the build and the tests on?
>
> This I have not noticed.

In my latest build tests, I have also seen errors reported by
org.apache.hadoop.hdfs.server.namenode.TestHeartbeatHandling.txt. If you
want, I can run some more build tests on branch-0.20-append and report
back, if that helps.

> > 2. Is there a way to test whether my custom build is "correct"?
> > In other words, how can I find out whether the append/syncing works
> > properly, so that it does not come to data loss in HBase at some point?
> > Unfortunately, I haven't found any instructions for intentionally
> > creating such a data-loss scenario to verify whether Hadoop/HBase
> > handles it properly. St.Ack, for instance, only talks about some
> > basic tests he did himself [13].
>
> Yes.
>
> There are hbase unit tests that will check for lost data. These
> passed before we cut the release.

Ok. Would I have to run these unit tests as part of an HBase build
process, or is there a way to run them separately?

My understanding is that I can use the HBase 0.90.2 release as is, as soon
as I have a Hadoop 0.20-append build ready. In other words, once I replace
the JAR files of the Hadoop 0.20.2 release with the JAR files built from
branch-0.20-append (on all machines in the cluster), I can use the tarball
of HBase 0.90.2 and do not need to build HBase myself. But I guess what
you imply is that I would have to re-run the HBase unit tests myself if I
want to test them against the "trunk" branch-0.20-append JAR files
(because even though your tests passed before the release, they were run
against the HEAD^1 version of branch-0.20-append).

> Its probably little consolation to you but we've been running 0.90.1,
> a 0.90.1 that had HBASE-3285 applied (and a CDH3b2 with 1554 et al.
> applied) with a good while in production here where I work on multiple
> clusters.

At least some consolation. :-) But: was that on HEAD or HEAD^1 of
branch-0.20-append for Hadoop?

FYI, I am currently preparing a step-by-step summary of how to build
Hadoop 0.20-append for HBase 0.90.2, based on the feedback in this thread.
I can post it back to the mailing list, and I'm also more than happy to
help extend the current HBase docs in one way or another, if you are
interested in that.

Many thanks again for your help, it's really appreciated!

Best,
Michael
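P.S. To make the jar-swap step concrete, here is a rough sketch of what I
have in mind for each node. The directory layout below is simulated with
throwaway paths, and the jar file names (including the revision suffix)
are illustrative, not taken from an actual build -- so treat this as the
shape of the procedure rather than exact commands:

```shell
# Simulated stand-ins for a Hadoop 0.20.2 install and an append-branch
# build output; real paths and jar names will differ.
set -e
work=$(mktemp -d)
install="$work/hadoop-0.20.2"
build="$work/branch-0.20-append/build"
mkdir -p "$install" "$build"
echo release-core > "$install/hadoop-0.20.2-core.jar"
echo append-core  > "$build/hadoop-core-0.20-append-r1057313.jar"

# The actual swap: remove the release core jar and drop in the one built
# from branch-0.20-append. This must be done identically on every node,
# or the RPC version mismatch described above will bite.
rm "$install"/hadoop-*-core.jar
cp "$build"/hadoop-core-*.jar "$install/"

ls "$install"
```

The point being: it is a plain file replacement, with no rebuild of HBase
itself required.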

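P.P.S. In case it helps anyone else following this thread: single tests
can be run without the full suite. Below is a sketch, assuming the stock
"test-core" ant target in a branch-0.20-append checkout and the Maven
layout of the HBase 0.90.2 source tarball; the HBase test name is one I
believe exercises log recovery, so please correct me if there is a better
choice. These commands need the respective source trees checked out, of
course:

```shell
# From the top of a branch-0.20-append checkout: run one HDFS test.
ant test-core -Dtestcase=TestFileAppend4

# From the top of an HBase 0.90.2 source tree (HBase builds with Maven):
mvn test -Dtest=TestFullLogReconstruction
```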