My comment about the build being broken was my fault (as expected). Turns out that during a pull, git decided that a file removed upstream should be left as an untracked file in my working directory.
On Fri, Mar 2, 2012 at 10:42 PM, Arvind Prabhakar <[email protected]> wrote: > Hi Eric, > > > I just checked out Flume and the build is broken. > > Are you sure the build is broken? It seems to work fine for me at > revision 1296578 - which is the last revision based on my commit of > FLUME-978. See the build log below: > > http://pastebin.com/RNq4gFGX > > > ... The hudson job seems to > > be largely ignored. > > I disagree. It is on our list of things to do and there is a jira to > track it. See: > https://issues.apache.org/jira/browse/FLUME-974 I get your point, but I feel like filing a JIRA that just bakes in the backlog is the same as ignoring it. It went broken for almost a month straight. Anyone who commits code to the repo should be at least checking it every once in a while. > > ... I want to stress that code reviews aren't just for > > show; we should actually review the code. > > +1 on the idea that code reviews aren't just for show. -1 for the > implied accusation that we are not doing it. I don't think any one of > us in this community takes it lightly. As with anything, we can > overlook something at times - and that is why the reviews are public > so that others can jump in and help out. If you find something being > overlooked, please point it out. > Accuse is a strong word but I suppose that's what I was doing. In looking at some recent changes, I see style deviations, weird (inconsistent) defensive coding of internal objects, unnecessary synchronization[1]... Maybe I'm being overly cautious but I worry about getting back to a place where the code base is unnecessarily complex and loaded with features without paying attention to stabilization, performance, resource control, and correctness. I know features are exciting, but we're back on to worry about tail and a Hive sink and that worries me. I'd like to see us collectively prioritize a stable, fast, correct core first. That's my opinion. > > > Some classes were removed (I > > don't necessarily think it was a great idea, but that's another story) > and > > tests weren't updated and now they fail to compile. > > I am not sure why you are seeing this. Look at the build log above - I > did a fresh checkout and everything compiles and tests correctly. Are > you trying to compile a previous revision? If so - note that when I > was committing FLUME-865, I accidentally did not checkin the newly > added files (revision 1291612); realized the mistake in a few minutes > and added the missing files (revision 1291613). I guess the only point > where the build can break is if you checkout the earlier of these two > revisions. If there are more places at various commits, please let me > know. > No, it was my mistake. I think short term errors are fine; we'd be silly not to expect that to happen. To be clear, I don't think anyone is being malicious and I realize, in retrospect, it sounds like I meant that code reviews were for show. I really meant I don't want them to degrade to the point of being just for show. So called silly things like style, layering of APIs, and performance matter a lot more in infrastructure systems. [1] As a concrete example, take the changes to Context. It is now half thread safe because it might be used across threads when, in fact, it wasn't really designed to be used concurrently at all. In fact, I'm not sure its ever modified post-creation. Also, I realize that books say to make defensive copies of objects but what they don't say is why. When you're building something that is consumed by an untrusted party (a loose, very public API with low performance requirements, frequent extension by a wide, untrained audience) this makes a ton of sense. When you're working on internal objects that are only used by a small set of people in a system that requires a relatively high level of knowledge about its environment, maybe it doesn't make sense. Worse, when it's only done in a small space in the API, the expectation that nothing is safe is now gone and users get confused. I'd argue it's better to tell people to assume all classes that do not document themselves as such are not thread safe or protected with defensive copies. In the layer of the system that it's used, I don't think Context needs a synchronized map and, if it does, it's still not thread safe (see the race in the get methods that take a default parameter[2]). [2] This form of default parameter is evil and is the source of a great many bugs in Hadoop. Are we really, really, really sure we want to follow this pattern? > Thanks, > Arvind > > On Fri, Mar 2, 2012 at 7:42 PM, Eric Sammer <[email protected]> wrote: > > All: > > > > I just checked out Flume and the build is broken. The hudson job seems to > > be largely ignored. I want to stress that code reviews aren't just for > > show; we should actually review the code. Some classes were removed (I > > don't necessarily think it was a great idea, but that's another story) > and > > tests weren't updated and now they fail to compile. What bugs me more is > > that this happend about 5 commits ago which means people aren't running > the > > tests at all nor are they validating the patches they commit. > > > > If you make a change to the code base and you only run the tests you care > > about, you're defeating one of the main purposes of unit testing; to > ensure > > there are no unintended consequences of the changes. > > > > Please be vigilant. People trust their data to Flume and we shouldn't > take > > that lightly. It's critical that with all the new committers joining the > > project and the pace of development that we not go feature crazy and get > > right back to where the 0.9.x branch was. I will continue to be a pain > the > > ass about this. :) > > > > Thanks. > > -- > > Eric Sammer > > twitter: esammer > > data: www.cloudera.com > -- Eric Sammer twitter: esammer data: www.cloudera.com
