I'm not sure I want to wade all the way into this, but a few notes from
the climate science side of things :)

First, I agree with Chris's sentiment.  There should be as few barriers as
possible towards any development of the code.  Everyone working on this is
capable of writing software and if he or she isn't comfortable with
submitting their code then we have the community to help.

There is another issue at play here that I think is causing some of the
problems (long discussions on github, Kyo getting frustrated, etc.).  That
is that much of this project has always been based on very simple software
that is commonly used by the climate community to do routine climate
analysis.  Let's be honest with ourselves, we are not building something
that is a commercially competitive product, we are building a science tool
to help scientists do what they already do more easily and to allow access
to things scientists don't always have.  When there are long discussions on
really any of the actual nitty-gritty parts of the code functionality at
this point, something is wrong.  This many years into the project we
should not have issues dragging on for weeks, months, or longer over simple
analytical tools that climate scientists use routinely (cite many examples
in the past several weeks).

This does NOT mean there should be shame in asking for help, for admitting
you don't understand, etc.  But not doing that, and arguing over whether
something is correct when it's one of the simplest, most basic analysis tools,
wastes time and hampers any progress on this project.  This, I believe, is
the root of Kyo's expressed frustration (not to speak for him).  If anyone
on this project has a question, I'm sure someone else on this project has
the answer.  So ask the question, accept the help and answer, fix the
issue, and move on to the next thing.  Chances are very high that Kyo,
Jinwon, or myself are correct when it comes to the climate analysis.
Chances are lower that Jinwon or I would be able to fix issues ourselves
without guidance (Kyo's pretty good :) ), but that's where the team can
support each other.  When we don't do that, people get frustrated and
understandably so, or shy away from participating.  We are
interdisciplinary by nature; that should be a plus.

OK, I think that's it for my part.

Paul 


On 5/12/15 5:38 PM, "Mattmann, Chris A (3980)"
<chris.a.mattm...@jpl.nasa.gov> wrote:

>Mike,
>
>A few top-level thoughts without replying directly inline below. I
>get pretty tired of reading in-between threads on that, so thought
>I would summarize here:
>
>1. I've been on tons of projects that had commit, review later,
>commit. In fact, pretty much every project I'm on works that way,
>mostly. Nutch, we commit things directly sometimes - sometimes we
>review. Tika we commit things directly sometimes - sometimes things
>break. No one yells at me if I broke something. We just fix it, and
>make the tests pass again. I've worked on a ton of projects over
>the years at Apache. See the thing is too, I'm not proposing one
>model or the other. I think there are situations to support both
>on projects.
>
>Like I said, in my mind: RTC is great if it's controversial; or if
>you just desire a review. There is nothing stopping you from *always*
>RTC'ing what you do. That's your prerogative. In the hypothetical
>situation I committed anything to OCW at some point in the future
>(though having done a ton of work upstream of the project at JPL
>before coming to Apache) I just don't want that imposed on *me*.
>And moreover, I think some of this stuff in terms of conversations
>I'm seeing on the Githubs could be obviated by Kyo not having to
>post examples; or slides; or 10 comments before he can just push
>the dang thing, see if the test broke, if it did, fix it (or you
>fix it; or Kim fixes or someone fixes it out of the kindness of
>their hearts). If no one fixes it, and no one reverts it, we've got
>a dead community. We're using version control. Having things perfect
>before doing things is unnecessary and I honestly don't think people
>should have to make even 1 Github comment before making a JIRA
>issue, or just pushing code. We want minimum barriers for some;
>bigger barriers for others; in-betweens, we want the whole lot.  We
>have to be flexible.
>
>2. We have to make the ability to capture *everyone's* contributions,
>whatever pace they go at. Realize that the pace may change, depending
>on someone's job; life, funding, etc. Right now, my #1 goal considering
>the state of this project right now is to reduce barriers for getting
>the project going in terms of pushing things. That means when I see
>lots of conversation on Github, and talk about code being pushed;
>or things in presentations rather than code simply being pushed, I
>think there are more barriers than we need. It may be that you
>always want PRs and Kim and you love conversing on Github about the
>PRs, and then when you finally have things the way you agreed upon
>it you push the code. I am just saying be open to when that's not
>the case, and accept it - it will earn more people around this
>project in 10 years including yourself. Should someone ever come
>in and be funded full time to work on this project or 10 people
>come to be funded full time to work on this project, if we followed
>only a strictly RTC approach, we would end up with what Hadoop ended
>up with, and the potential for something like Spark to end up with.
>I'm glad you brought Spark up. I convinced them to come to Apache.
>I know plenty about their community that's great - but also some
>things that aren't. Dirty laundry that's emerged in public about
>barriers to contributions and higher bars for committers and PMC
>which are now split in that project.  It's certainly a model for
>doing great things, but realize too - there are 124+ contributors;
>large VC-backed companies whose only goal is to work on software,
>not necessarily do science with little funding and contributions,
>etc., so it's a largely different ballgame.  When you and I and
>Zimmy made the Bot for OCW Climate, it was with little pieces of
>our time - imagine if I had you spend your time writing up the
>ENTIRE plan (which I never actually saw a document describing mind
>you) for the OCW bot before I asked you to look at the Spark one
>and do it? Would you have had an easy time writing up the plan FIRST
>before doing the work?
>
>3. To the point about JIRA and Github notifications being as easy
>to spot as plain old email: that simply is not backed up by fact.
>In fact look at Spark - they had to create a completely separate
>mailing list to handle their conversations because honestly dev@s.a.o
>was impossible to be subscribed on due to all the automated nonsense.
>I am still unsubscribed to dev@s.a.o to this day b/c of that. In
>addition, look up emails over the 15+ year history of this foundation
>(you have access now as a member) - JIRA and SVN auto commits and
>Git auto notifications and conversations have proven NOT to be as
>easy to follow as something with a simple subject line, written by
>a human being and not a bot.
>
>4. RE: Git - committing locally is fine, but the ASF only cares
>about code on the ASF's hardware. Github (in the guide you referenced)
>is NOT the canonical source for Apache OCW. The canonical source
>is the writeable Git repos at the ASF. This was one of the major
>concerns about Git that the ASF had in initially implementing it.
>Storing stuff in branches and committing all you want creates
>nightmares when the PMC who is responsible for managing the code
>tries to collectively work on a codebase *together* and to have
>shared stewardship of the code base.  I don't see that right now.
>I see a very small number of people doing things every now and then,
>with most of the commentary on whether it's right or wrong coming
>from you. And people like Kyo not even knowing if they have write
>access to the repository at Apache or not. That's not an Apache
>project's model and it needs to be fixed here in OCW. There are no
>BDFLs in this project. We are all members of the PMC and have
>shared rights and stewardship to the code.
>
>
>Those are my thoughts on this.
>
>Cheers,
>Chris
>
>
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Chris Mattmann, Ph.D.
>Chief Architect
>Instrument Software and Science Data Systems Section (398)
>NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>Office: 168-519, Mailstop: 168-527
>Email: chris.a.mattm...@nasa.gov
>WWW:  http://sunset.usc.edu/~mattmann/
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Adjunct Associate Professor, Computer Science Department
>University of Southern California, Los Angeles, CA 90089 USA
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
>
>-----Original Message-----
>From: Michael Joyce <jo...@apache.org>
>Reply-To: "dev@climate.apache.org" <dev@climate.apache.org>
>Date: Tuesday, May 12, 2015 at 11:49 AM
>To: "dev@climate.apache.org" <dev@climate.apache.org>
>Subject: Re: Project Workflow
>
>>Pre-email note - 'you' is used here to collectively refer to a
>>nonexistent person, not a specific person in this chain of emails.
>>
>>---
>>
>>I would certainly agree that RTC COULD cause problems if it wasn't
>>applied fairly. But it's applied generally across all commits from all
>>contributors here. If no one gives any feedback on a patch after a while
>>(again, I usually stick with the fairly standard 72 hours idea), merge the
>>dang thing. That's always been my stance on it. I think applying this to
>>all contributions regardless of status is actually more fair than most
>>projects' approaches where contributors have to submit a patch and wait
>>for someone to hopefully come along and give a crap.
>>
>>Also, I think it's a huge misconception that CTR turns into anything other
>>than Commit then Commit some more. I have never seen someone do a code
>>review after pushing commits on any project I've worked on. People don't
>>want to do reviews, they're certainly not going to do it after the fact.
>>Do early reviews always fix the problem of bugs being introduced? Of
>>course not. Do they make it worse? Absolutely not. We've seen that early
>>reviews prevent broken tests and bugs from being committed to our code
>>base multiple times (and that has been on many people's contributions,
>>including my own many times). Even if this project decided to switch to a
>>CTR approach every one of my contributions would go through a PR. It keeps
>>you honest with regards to running the tests and it gives people
>>opportunities to help you make your code better. That's always a plus to
>>me. I wish more people would actually give feedback on my
>>contributions/PRs. That would make me really happy.
>>
>>Also, no committer/PMC member is excluded from merging pull requests.
>>Everyone who is a committer/PMC can merge pull requests. So if you want
>>to be responsible for validating that the pull request doesn't break
>>stuff and getting it merged, please do! The Jenkins build goes a long way
>>towards helping with that and does most of the heavy lifting anyway. If
>>anyone is worried that only Kim and I have been merging PRs then step up
>>and help get PRs merged.
>>
>>Regarding conversations being buried on Github: All conversations are
>>mapped back to the mailing list under the appropriate PR title and to the
>>appropriate JIRA ticket (assuming the workflow laid out on the wiki is
>>followed). I don't see the conversation being on a PR being much
>>different than the conversation being on JIRA when someone submits a
>>patch (a rather common workflow on other projects in my experience). I'm
>>open to hearing how that is problematic though. My thought, if people are
>>going to ignore github chatter, then they're probably going to ignore
>>JIRA chatter, and they're probably not going to notice emails either.
>>
>>Also, we're using git. If you want to scratch your own itch, make a
>>branch and do work. Commit locally all you want. You have a full version
>>of the repository. It's exactly the same thing on github/asf servers.
>>Keeping your branch up to date with master is trivially easy. When you
>>want to push that contribution out make a pull request. It's extremely
>>easy to branch a billion times and merge and commit locally and do your
>>own thing. I'm not terribly certain how having a pull request/code review
>>centric workflow hinders this??
>>
>>One more final note. Apache Spark, one of the most active ASF projects,
>>uses a more complicated Github based PR-centric workflow that uses RTC.
>>It certainly hasn't prevented them from getting hundreds of
>>committers/contributors.
>>
>>
>>-- Jimmy
>>
>>On Tue, May 12, 2015 at 10:02 AM, Ramirez, Paul M (398M) <
>>paul.m.rami...@jpl.nasa.gov> wrote:
>>
>>> I believe the current workflow has inhibited progress and caused
>>> tension. Although I've been more of an observer than contributor, it
>>> seems to me that when your community is small the emphasis should be on
>>> allowing, as Chris states, people to "scratch their own itch."
>>> Additionally, I've seen at times that reviews have focused on minor
>>> items and that discussions often took longer than either side reaching
>>> out to the other to help get the patch in and build community and
>>> camaraderie amongst all those that are already committers. This has gone
>>> on to an extent that appears detrimental to the project.
>>>
>>> I would actively support CTR at this point so that energy and progress
>>> can be infused. Sure this could cause some technical debt to build up
>>> but community over code would seemingly once again come to the forefront
>>> which appears to be lacking at the moment.
>>>
>>>
>>> --Paul
>>>
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> Paul Ramirez, M.S.
>>> Technical Group Supervisor
>>> Computer Science for Data Intensive Applications (398M)
>>> Instrument Software and Science Data Systems Section (398)
>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>> Office: 158-264, Mailstop: 158-242
>>> Email: paul.m.rami...@jpl.nasa.gov<mailto:paul.m.rami...@jpl.nasa.gov>
>>> Office: 818-354-1015
>>> Cell: 818-395-8194
>>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>
>>> On May 12, 2015, at 9:33 AM, "Mattmann, Chris A (3980)" <
>>> chris.a.mattm...@jpl.nasa.gov<mailto:chris.a.mattm...@jpl.nasa.gov>>
>>>  wrote:
>>>
>>> I don't think we should dictate everything be code reviewed. I've
>>> seen this directly lead to conversations that are relevant to
>>> development being buried in Github. Take for example your and
>>> Whitehall's conversation(s) with Kyo that I doubt anyone here has
>>> ever seen since they aren't even commenting on the Github. Yes,
>>> Github emails are sent to the dev list, my guess is that people
>>> ignore them.
>>>
>>> On the code review issue - Kyo (or others) shouldn't have to endlessly
>>> debate or discuss the advantages or disadvantages of this or that
>>> before simply pushing code, and pushing tests. My general rule of
>>> thumb is that there are CTR and RTC workflows and use cases for
>>> both. RTC works great when it's possibly controversial or when you
>>> really want someone's eyes on your code for review. However it's
>>> also overhead that I do not believe is needed if a developer wants
>>> to push forward and scratch his or her itch, in an area of the
>>> codebase that they are working on. The codebase is factored out
>>> reasonably well, so that people can work on things in parallel
>>> and independently. When in doubt, ask.
>>>
>>>
>>> I'm also pretty worried since anyone that looks at the Git and
>>> project history over the last year can easily see that Mike has
>>> pretty much been doing the bulk load of the pushing and code
>>> committing here. Kim's stepped up recently as has Kyo, which is
>>> 3 people, which is great, but I'm worried about a project with
>>> a small number of active developers (really 1 major) imposing
>>> RTC - I don't have time to look up citations but you are free
>>> to scope these out over the ASF archives. RTC on smaller projects
>>> just leads to barriers. We need to be flexible and make it inviting
>>> for at the very least, our own developers to contribute to the project,
>>> let alone attracting new people. Ross was elected in December 2014,
>>> which is great, but we need to do better.
>>>
>>> Cheers,
>>> Chris
>>>
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> Chris Mattmann, Ph.D.
>>> Chief Architect
>>> Instrument Software and Science Data Systems Section (398)
>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>> Office: 168-519, Mailstop: 168-527
>>> Email: chris.a.mattm...@nasa.gov<mailto:chris.a.mattm...@nasa.gov>
>>> WWW:  http://sunset.usc.edu/~mattmann/
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> Adjunct Associate Professor, Computer Science Department
>>> University of Southern California, Los Angeles, CA 90089 USA
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>
>>>
>>>
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Michael Joyce <jo...@apache.org<mailto:jo...@apache.org>>
>>> Reply-To: "dev@climate.apache.org<mailto:dev@climate.apache.org>" <
>>> dev@climate.apache.org<mailto:dev@climate.apache.org>>
>>> Date: Tuesday, May 12, 2015 at 8:55 AM
>>> To: "dev@climate.apache.org<mailto:dev@climate.apache.org>" <
>>> dev@climate.apache.org<mailto:dev@climate.apache.org>>
>>> Subject: Project Workflow
>>>
>>> Hi folks,
>>>
>>> Since this has been brought up a few times on various tickets I thought
>>> now would be a good time to go over our project workflow and make sure
>>> it's working for us.
>>>
>>> A general overview of the workflow that we use is available at [1]. A
>>> brief overview is that:
>>> - ALL changes are pushed up to Github for code review before being
>>>   merged.
>>> - If no issues are raised within a reasonable amount of time (usually
>>>   72 hours is what I stick with) those changes can be merged.
>>>
>>> In general, I've been quite happy with this workflow. We have a low
>>> enough throughput that this isn't overwhelming and I think it's great
>>> that we can get together and review each other's code. I know I
>>> appreciate the opportunity for people to find issues with my code before
>>> we merge it. I think it would be beneficial to flesh out the docs a bit
>>> more on the workflow (for instance, how to run tests should be included
>>> in there, how long to wait for a merge, etc.). So community, what do we
>>> think of our workflow? Do we like it so far? Is it working for us? Are
>>> there pain points? What don't we like? Etc.
>>>
>>> [1]
>>> https://cwiki.apache.org/confluence/display/CLIMATE/Developer+Getting+Started+Guide
>>>
>>>
>>> -- Jimmy
>>>
>>>
>>>
>
