AFAICT, Owen was the one to -1 removal of HDFS Proxy. Owen, are you guys maintaining this?
Cheers,
Nige

On Apr 4, 2011, at 12:19 PM, Todd Lipcon wrote:

> Could those of you who -1ed the removal of HDFS Proxy please look into the
> test that has been failing our Hudson build for the last several months:
> https://issues.apache.org/jira/browse/HDFS-1666
>
> It is one thing to say that we "should" maintain a piece of code, but it's
> another to actually maintain it. In my mind, part of maintaining a project
> involves addressing consistent test failures as high-priority items.
>
> -Todd
>
> On Tue, Feb 22, 2011 at 9:27 PM, Nigel Daley <nda...@mac.com> wrote:
>
>> For closure, this vote fails due to a couple of binding -1 votes.
>>
>> Nige
>>
>> On Feb 18, 2011, at 4:46 AM, Eric Baldeschwieler wrote:
>>
>>> Hi Bernd,
>>>
>>> Apache Hadoop is about scale. Most clusters will always be small, but
>>> Hadoop is going mainstream precisely because it scales to huge data and
>>> cluster sizes.
>>>
>>> There are lots of systems that work well on 10-node clusters. People
>>> select Hadoop because they are confident that as their business / problem
>>> grows, Hadoop can grow with it.
>>>
>>> ---
>>> E14 - via iPhone
>>>
>>> On Feb 17, 2011, at 7:25 AM, "Bernd Fondermann" <bernd.fonderm...@googlemail.com> wrote:
>>>
>>>> On Thu, Feb 17, 2011 at 14:58, Ian Holsman <had...@holsman.net> wrote:
>>>>> Hi Bernd.
>>>>>
>>>>> On Feb 17, 2011, at 7:43 AM, Bernd Fondermann wrote:
>>>>>>
>>>>>> We have the very unfortunate situation here at Hadoop where Apache
>>>>>> Hadoop is not the primary and foremost place of Hadoop development.
>>>>>> Instead, code is developed internally at Yahoo and then contributed in
>>>>>> (smaller or larger) chunks to Hadoop.
>>>>>
>>>>> This has been the situation in the past,
>>>>> but as you can see in the last month, this has changed.
>>>>>
>>>>> Yahoo! has publicly committed to moving their development into the main
>>>>> code base, and you can see they have started doing this with the 20.100
>>>>> branch and their recent commits to trunk.
>>>>> Combine this with Nige taking on the 0.22 release branch (and
>>>>> shepherding it into a stable release), and I think we are addressing your
>>>>> concerns.
>>>>>
>>>>> They have also started bringing the discussions back onto the list; see
>>>>> the recent discussion about the next-generation JobTracker that Arun has
>>>>> re-started in MAPREDUCE-279.
>>>>>
>>>>> I'm not saying it's perfect, but I think the major players understand
>>>>> there is an issue, and they are *ALL* moving in the right direction.
>>>>
>>>> I would enthusiastically like to see your optimism verified.
>>>> Maybe I'm misreading the statements issued publicly, but I don't think
>>>> that this is fully understood. I agree, though, that it's a move in
>>>> the right direction.
>>>>
>>>>>> This is open source development upside down.
>>>>>> It is not ok for people to diff ASF svn against their internal code
>>>>>> and provide the diff as a patch without first reviewing IP for every
>>>>>> line of code changed.
>>>>>> For larger chunks I'd suggest even going via the Incubator IP
>>>>>> clearance process.
>>>>>> Only then will we force committers to primarily work here in the open
>>>>>> and return to what I'd consider a healthy project.
>>>>>>
>>>>>> To be honest: Hadoop is in the process of falling apart.
>>>>>> Contrib code gets moved out of Apache instead of being maintained here.
>>>>>> Discussions are seldom consensus-driven.
>>>>>> Release branches stagnate.
>>>>>
>>>>> True, releases do take a long time. This is mainly because it is
>>>>> extremely hard to test and verify that a release is stable.
>>>>> It's not enough to just run the thing on 4 machines; you need at least
>>>>> 50 to test some of the major problems. This requires some serious $ for
>>>>> someone to verify.
>>>>
>>>> It has been proposed on the list before, IIRC. Don't know how to get
>>>> there, but the project seriously needs access to a cluster of this
>>>> size.
>>>>
>>>>>> Downstream projects like HBase don't get proper support.
>>>>>> Production setups are made from 3rd-party distributions.
>>>>>> Development is not happening here, but elsewhere behind corporate doors.
>>>>>> Discussions about future developments are started on corporate blogs
>>>>>> (http://developer.yahoo.com/blogs/hadoop/posts/2011/02/mapreduce-nextgen/)
>>>>>> instead of on the proper mailing list.
>>>>>> Hurdles for committing are way too high.
>>>>>> On the bright side, new committers and PMC members are being added;
>>>>>> this is an improvement.
>>>>>>
>>>>>> I'd suggest moving away from relying on large code dumps from
>>>>>> corporations, and moving back to the ASF-proven "individual committer
>>>>>> commits on trunk" model where more committers can get involved.
>>>>>> If that means not supporting high-end cluster sizes for some months,
>>>>>> well, so be it.
>>>>>
>>>>>> Average committers cannot run - e.g. test - on high-end
>>>>>> cluster sizes. If that would mean they cannot participate, then
>>>>>> the open source project had better concentrate on small and
>>>>>> medium-sized clusters instead.
>>>>>
>>>>> Well, that's one approach, but there are several companies out there
>>>>> who rely on Apache's Hadoop to power their large clusters, so I'd hate
>>>>> to see Hadoop become something that only runs well on 10 nodes, as I
>>>>> don't think that would help anyone either.
>>>>
>>>> But only looking at high-end scale doesn't help either.
>>>>
>>>> Let's face the fact that Hadoop is now moving from the early-adopter
>>>> phase into a much broader market. I predict that small to medium-sized
>>>> clusters will be the majority of Hadoop deployments in a few months'
>>>> time. 4000, or even 500, machines is the high-end range. If the open
>>>> source project Hadoop cannot support those users adequately (without
>>>> becoming defunct), the committership might be better off focusing on
>>>> the low-end and medium-sized users.
>>>>
>>>> I'm not suggesting turning away from the handful (?) of high-end
>>>> users. They certainly have the most valuable input. But also, *they*
>>>> obviously have the resources in terms of larger clusters and
>>>> developers to deal with their specific setups. Obviously, they don't
>>>> need to rely on the open source project to make releases. In fact,
>>>> they *do* work on their own Hadoop derivatives.
>>>> All the other users, the hundreds of boring small-cluster users, don't
>>>> have that choice. They *depend* on the open source releases.
>>>>
>>>> Hadoop is an Apache project, to provide HDFS and MR free of charge to
>>>> the general public. Not only to me - nor only to one or two big
>>>> companies either.
>>>> Focus on all the users.
>>>>
>>>> Bernd
>
> --
> Todd Lipcon
> Software Engineer, Cloudera