On Fri, Oct 28, 2011 at 2:32 PM, Matthew Brett <matthew.br...@gmail.com> wrote:
> Hi,
>
> On Fri, Oct 28, 2011 at 2:16 PM, Nathaniel Smith <n...@pobox.com> wrote:
>> On Tue, Oct 25, 2011 at 2:56 PM, Travis Oliphant <oliph...@enthought.com>
>> wrote:
>>> I think Nathaniel and Matthew provided very
>>> specific feedback that was helpful in understanding other perspectives of a
>>> difficult problem. In particular, I really wanted bit-patterns
>>> implemented. However, I also understand that Mark did quite a bit of work
>>> and altered his original designs quite a bit in response to community
>>> feedback. I wasn't a major part of the pull request discussion, nor did I
>>> merge the changes, but I support Charles if he reviewed the code and felt
>>> like it was the right thing to do. I likely would have done the same thing
>>> rather than let Mark Wiebe's work languish.
>>
>> My connectivity is spotty this week, so I'll stay out of the technical
>> discussion for now, but I want to share a story.
>>
>> Maybe a year ago now, Jonathan Taylor and I were debating what the
>> best API for describing statistical models would be -- whether we
>> wanted something like R's "formulas" (which I supported), or another
>> approach based on sympy (his idea). To summarize, I thought his API
>> was confusing, pointlessly complicated, and didn't actually solve the
>> problem; he thought R-style formulas were superficially simpler but
>> hopelessly confused and inconsistent underneath. Now, obviously, I was
>> right and he was wrong. Well, obvious to me, anyway... ;-) But it
>> wasn't like I could just wave a wand and make his arguments go away,
>> no matter how annoying and wrong-headed I thought they were... I could
>> write all the code I wanted but no-one would use it unless I could
>> convince them it's actually the right solution, so I had to engage
>> with him, and dig deep into his arguments.
>>
>> What I discovered was that (as I thought) R-style formulas *do* have a
>> solid theoretical basis -- but (as he thought) all the existing
>> implementations *are* broken and inconsistent! I'm still not sure I
>> can actually convince Jonathan to go my way, but, because of his
>> stubbornness, I had to invent a better way of handling these formulas,
>> and so my library[1] is actually the first implementation of these
>> things that has a rigorous theory behind it, and in the process it
>> avoids two fundamental, decades-old bugs in R. (And I'm not sure the R
>> folks can fix either of them at this point without breaking a ton of
>> code, since they both have API consequences.)
>>
>> --
>>
>> It's extremely common for healthy FOSS projects to insist on consensus
>> for almost all decisions, where consensus means something like "every
>> interested party has a veto"[2]. This seems counterintuitive, because
>> if everyone's vetoing all the time, how does anything get done? The
>> trick is that if anyone *can* veto, then vetoes turn out to actually
>> be very rare. Everyone knows that they can't just ignore alternative
>> points of view -- they have to engage with them if they want to get
>> anything done. So you get buy-in on features early, and no vetoes are
>> necessary. And by forcing people to engage with each other, like me
>> with Jonathan, you get better designs.
>>
>> But what about the cost of all that code that doesn't get merged, or
>> written, because everyone's spending all this time debating instead?
>> Better designs are nice and all, but how does that justify letting
>> working code languish?
>>
>> The greatest risk for a FOSS project is that people will ignore you.
>> Projects and features live and die by community buy-in. Consider the
>> "NA mask" feature right now. It works (at least the parts of it that
>> are implemented). It's in mainline. But IIRC, Pierre said last time
>> that he doesn't think the current design will help him improve or
>> replace numpy.ma. Up-thread, Wes McKinney is leaning towards ignoring
>> this feature in favor of his library pandas' current hacky NA support.
>> Members of the neuroimaging crowd are saying that the memory overhead
>> is too high and the benefits too marginal, so they'll stick with NaNs.
>> Together these folks are a huge proportion of this feature's target
>> audience. So what have we actually accomplished by merging this to
>> mainline? Are we going to be stuck supporting a feature that only a
>> fraction of the target audience actually uses? (Maybe they're being
>> dumb, but if people are ignoring your code for dumb reasons... they're
>> still ignoring your code.)
>>
>> The consensus rule forces everyone to do the hardest and riskiest part
>> -- building buy-in -- up front. Because you *have* to do it sooner or
>> later, and doing it sooner doesn't just generate better designs. It
>> drastically reduces the risk of ending up in a huge trainwreck.
>>
>> --
>>
>> In my story at the beginning, I wished I had a magic wand to skip this
>> annoying debate and political stuff. But giving it to me would have
>> been a bad idea. I think that's what went wrong with the NA discussion in
>> the first place. Mark's an excellent programmer, and he tried his best
>> to act in the good of everyone in the project -- but in the end, he
>> did have a wand like that. He didn't have that sense that he *had* to
>> get everyone on board (even the people who were saying dumb things),
>> or he'd just be wasting his time. He didn't ask Pierre if the NA
>> design would actually work for numpy.ma's purposes -- I did.
>>
>> You may have noticed that I do have some ideas about how NA
>> support should work. But my ideas aren't really the important thing.
>> The alter-NEP was my attempt to find common ground between the
>> different needs people were bringing up, so we could discuss whether
>> it would work for people or not. I'm not wedded to anything in it. But
>> this is a complicated issue with a lot of conflicting interests, and
>> we need to find something that actually does work for everyone (or as
>> large a subset as is practical).
>>
>> So here's what I think we should do:
>> 1) I will submit a pull request backing Mark's NA work out of
>> mainline, for now. (This is more or less done, I just need to get it
>> onto github, see above re: connectivity)
>> 2) I will also put together a new branch containing that work,
>> rebased against current mainline, so it doesn't get lost. (Ditto.)
>> 3) And we'll decide what to do with it *after* we hammer out a
>> design that the various NA-supporting groups all find convincing. Or
>> at least a design for some of the less controversial pieces (like the
>> 'where=' ufunc argument?), get those merged, and then iterate
>> incrementally.
>>
>> What do you all think?
>
> Nice post - thank you.
>
> I agree that we may have a problem with - process. I mean, maybe
> there is not much agreement on what the process for these kinds of
> discussions should be - and therefore - we can't point to some
> constitution or similar to say - hey - wait - we're not doing it
> right.
Your post reminded me of this:

http://en.wikipedia.org/wiki/Rough_consensus

It does depend on having something like a committee and a chairperson though.

See you,

Matthew