On Thu, Feb 9, 2012 at 12:54 PM, Konstantin Shvachko <shv.had...@gmail.com> wrote: > Ok I misunderstood the current direction of the merge. > > On the review request: > >> we don't deal with the case where OP_ADD >> contains blocks on a new file -- this is a case that doesn't happen on >> real clusters, but currently happens with synthetic logs generated >> from the CreateEditLogs tool. > > I intentionally did not remove handling of new files with blocks (non > empty). The reason is potential issues with backward compatibility. If > any HDFS version in the past produced such transactions the new > version must be able to read them. I thought it is easier to retain > the support for this type of transactions rather than checking all > past versions. > I would not recommend restricting OP_ADD in the way that it requires > new files to be empty.
Good point. I checked back in old versions and it appears that we had this behavior in 0.16 and below (removed in HADOOP-2345) in 0.17. I'll update the merge to continue to support this old behavior, and leave a note that it could be simplified by bumping our minimum upgrade version to 0.17 or 0.18 (which seems entirely reasonable given they're ~4 years old). Will report back when a new patch is up for review. -Todd > > Thanks, > --Konstantin > > On Thu, Feb 9, 2012 at 10:57 AM, Todd Lipcon <t...@cloudera.com> wrote: >> Hi Konstantin, >> >> To be clear, this review request is a merge from the trunk branch into >> the HA branch (NOT a merge INTO trunk) - we've been doing these pretty >> much daily since the project began, so that we track trunk closely. >> The idea is so that we don't have unexpected integration issues when >> we do the merge the _other_ way when it's ready. >> >> When we propose the merge *into* trunk we'll definitely address your >> questions below. We are indeed already running cluster tests at >> 100-node scale with failovers and both MR and HBase workloads, though >> have not done a formal performance comparison at this point. >> >> -Todd >> >> On Thu, Feb 9, 2012 at 10:54 AM, Konstantin Shvachko >> <shv.had...@gmail.com> wrote: >>> I was wondering >>> 1. What was the test plan that has been executed for testing this >>> implementation of HA? Besides unit tests. >>> 2. Have you done any benchmarks, comparing current cluster performance >>> against the branch. Would be good to have numbers for both cases with >>> HA off and HA on. >>> >>> I'll post these questions to the jira as well. >>> >>> Thanks, >>> --Konstantin >>> >>> On Wed, Feb 8, 2012 at 9:08 PM, Todd Lipcon <t...@cloudera.com> wrote: >>>> The branch developed some new conflicts due to recent changes in trunk >>>> affecting the RPC between the DN and the NN (the "StorageReport" >>>> stuff). I've done a new merge to address these conflicts here: >>>> >>>> https://github.com/toddlipcon/hadoop-common/tree/ha-merge-20120208 >>>> >>>> I've also addressed Aaron's comments in the thread above. >>>> I ran the unit tests on the branch and they passed. >>>> >>>> Thanks >>>> -Todd >>>> >>>> On Fri, Feb 3, 2012 at 4:44 PM, Aaron T. Myers <a...@cloudera.com> wrote: >>>>> Hey Todd, >>>>> >>>>> The merge largely looks good. I agree with the general approach you took. >>>>> A >>>>> few small comments: >>>>> >>>>> 1. There's a comment in the OP_ADD case blockabout handling OP_CLOSE. This >>>>> makes sense in 0.22/0.23/0.24, but in the HA branch the OP_ADD and >>>>> OP_CLOSE >>>>> cases are completely separate case blocks. I actually find this whole >>>>> comment a little confusing, since it numbers the cases we have to handle, >>>>> but those numbers aren't referenced anywhere else. >>>>> >>>>> 2. You mentioned in your message that you don't handle the (invalid) case >>>>> of OP_ADD on a new file containing updated blocks, but it looks like the >>>>> code actually does, though the code also mentions that we should add a >>>>> sanity check that this is actually can't occur. Seems like we should clean >>>>> up this inconsistency. I agree that adding asserting this case doesn't >>>>> occur is the right way to go. >>>>> >>>>> 3. If we go with my suggestion in (2), we can also move the call to >>>>> FSEditLogLoader#updateBlocks to only the case of OP_ADD for an existing >>>>> file, and then get rid of the "INodeFile newFile = oldFile" assignment, >>>>> which I found kind of confusing at first. (Though I do see why it's >>>>> correct >>>>> as-implemented.) If you don't go with my suggestion in (2), please add a >>>>> comment explaining the assignment. >>>>> >>>>> Otherwise looks good. Merge away. >>>>> >>>>> -- >>>>> Aaron T. Myers >>>>> Software Engineer, Cloudera >>>>> >>>>> >>>>> >>>>> On Fri, Feb 3, 2012 at 2:10 PM, Todd Lipcon <t...@cloudera.com> wrote: >>>>> >>>>>> I've got a merge pending of trunk into HDFS-1623 -- it was a bit >>>>>> complicated so wanted to ask for another set of eyes: >>>>>> https://github.com/toddlipcon/hadoop-common/tree/ha-merge-20120203 >>>>>> (using github since it's hard to review a merge patch via JIRA) >>>>>> >>>>>> The interesting bit of the merge was to deal with conflicts with >>>>>> HDFS-2718. To summarize the changes I had to make: >>>>>> - in the HDFS-1623 branch, we don't deal with the case where OP_ADD >>>>>> contains blocks on a new file -- this is a case that doesn't happen on >>>>>> real clusters, but currently happens with synthetic logs generated >>>>>> from the CreateEditLogs tool. I added a TODO to add a sanity check >>>>>> here and will address as a follow-up. Given the difference between >>>>>> trunk and branch, there were a couple of small changes that propagated >>>>>> into unprotectedAddFile >>>>>> - In the HDFS-1623 branch we had already implemented the >>>>>> "updateBlocks" call inside FSEditLogLoader. I used that existing >>>>>> implementation rather than adding the new one in FSDirectory, since >>>>>> this function had some other changes related to HA in the branch >>>>>> version. >>>>>> >>>>>> I'll wait for a +1 before committing. I ran all of the unit tests and >>>>>> they passed. >>>>>> >>>>>> -Todd >>>>>> -- >>>>>> Todd Lipcon >>>>>> Software Engineer, Cloudera >>>>>> >>>> >>>> >>>> >>>> -- >>>> Todd Lipcon >>>> Software Engineer, Cloudera >> >> >> >> -- >> Todd Lipcon >> Software Engineer, Cloudera -- Todd Lipcon Software Engineer, Cloudera