Hi Paul,

As you know, I try my hardest to post well-researched comments to this
mailing list, and this time I fell short of that. Please accept my
apologies.

Let me try and re-frame the problem, and respond to your criticisms.

My point is: we need more public design discussions and review, and we
need those discussions to have a logical conclusion. I think the RFC,
coupled with more traffic on dev@, is the answer to that.

That said, I counted the number of comments on those 4 PRs from the
general public:

* Clustered purge PR #1370 has 0 non-Cloudant comments on it.
* PSE PR #496 has one comment from me asking you to write documentation
  (that I don't think landed). That's the only non-Cloudant post.
* Replicator scheduler PR #470 has a number of community comments on it
  that resulted in a higher quality PR.

...and I'm not going to even attempt to recap the BigCouch mess, but a
lot of non-Cloudant people were involved.

So 50% of the PRs were developed in the open, but they might as well
have happened on an IBM private repo. That's unfortunate.

There are a number of possible, valid explanations for why these PRs
were so unengaging, in my view. It may be a natural reflection of the
fact that those are the only people who are paid to take an interest in
the code. Or it may be that the PRs themselves are not as discoverable
as posts to the mailing list. Perhaps it's because big PRs are
intimidating and difficult to interpret to those who don't live and
breathe the CouchDB code base daily.

I was wrong to say that there was just one reason why this is the case.
But I don't think I am wrong to point out that something smells wrong
when features land without community comment on either the design or
the code itself.

I do think it's fair to say that the mailing list discussions for these
features were minimal as compared to the discussions that happened in
the PRs, regardless of participant. (Your PSE dev@ post got no
responses, for instance. Maybe it's a bad example, being a somewhat
esoteric feature.)

Recent traffic on FDB and resharding proves to me that the ML is still
a valid venue to discuss proposals, and that these proposals are
getting better as a result of those things.
The RFC is intended to be a cap to those discussions, just a slightly
more ritualised way of voting on the discussion and writing up the
result.

As to the PR side of things, because PRs go to notifications@, they are
largely ignored by the dev@ community. Subscribing to all of the GitHub
emails from all of the CouchDB repos is overwhelming. Even if you were
to filter that only to new PRs and forward them to dev@ somehow, it's
still a lot of emails to wade through, so I'm not sure that's a
solution to the problem. PRs that reference an RFC, though, could be
the "happy medium" that we need, and again a simple bot could help
here.

As a PMC member, I feel it is my responsibility to try and steer more
of our community into these discussions, so that the best possible
solution can be reached. It's less about "Cloudant vs. non-Cloudant"
and more about serving the needs of our developer and user base. (In
fact, none of the feature proposals in this thread said anything to the
user@ mailing list - where we might have reached even more people who
could have informed the design phase of the work. Something to
consider.)

> Yes these were big PRs, and yes they took a long time to review. But
> there was plenty of time for anyone to do that review (and there were
> a number of non Cloudant people involved in these listed).

Being open for a long time, and helping people through reading the PR
are very different things. Again, not until recently did these PRs
start including top-level READMEs that helped people understand the
code involved. Nick's README on the replicator scheduler is a great
example of something very positive:

https://github.com/apache/couchdb/pull/470/files#diff-a3be920760d32aca56cc1d2b838d07ef

I feel the RFC could be the initial README.md, which would then be
supplemented by a short intro to how the code is written and actually
works. But one thing at a time ;)

> While I'm not sure about prototyping, I do think RFCs would help solve
> this problem. It definitely helps to know what the reason a PR even
> exists and maybe why various other approaches were discarded before
> starting to review it. I don't personally know of much prototyping
> related to these sorts of features. There's definitely evolution to
> them based on various restrictions and that is captured on our
> commits@ lists (obviously in a difficult to consume format post facto,
> but useful for anyone following along at least).

My comment on the prototyping was specifically with FDB in mind, where
I expect we will have multiple throwaway bits of code written to try
and determine how exactly we'll make it work. Those don't necessarily
need to be shared, but if they helped someone reach a decision, it
could be useful.

> Adding RFCs won't solve the issue that large features almost by
> definition have correspondingly large PRs that can be daunting to
> review. I do think having an RFC may make it easier, but I don't think
> its solving the problem as posed.

It's a two-pronged approach. The RFC is intended to solve the design
end of things, so that even if we don't have more community members
involved in the Pull Request review process, they can at least rest
assured that there was agreement on what *should* be implemented. Those
people *should* be able to ignore the PR and not be surprised by what
it contains when it lands, since we've got a nice summary that was
agreed to of what it actually will contain.

And, should they want to engage more fully with our development
process, they *should* be able to read documentation aimed at CouchDB
developers in the PR itself that explains how the feature was
implemented. The RFC can be a start on writing this README.md. (We do a
very poor job on this, and I will continue to harp about how hard it is
to onboard new CouchDB developers until it gets easier.)

Does this help?

-Joan