To be clear, I have no objection to JIRA and I like the point that a little
friction on contribution may encourage more structured contribution and
planning.

As Neil mentioned, my original goal with this thread was to understand any
non-technical reason that prevented automating the creation of JIRA tickets
from github PRs before I went and coded such a thing (plug: proposed change
to merge_pr.py is here[1] if anyone wants to review it).
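
To make the idea concrete, here is a minimal sketch (not taken from the
linked PR) of how a PR's title and description might be mapped onto the
fields of a Jira "create issue" request. The helper names, the
"Improvement" default type, and the assumption that "[Tag]" prefixes in a
PR title map directly onto Jira component names are all hypothetical. The
payload shape follows, as I understand it, the body that Jira's REST API
expects for POST /rest/api/2/issue.

```python
import re

# Hypothetical helpers for illustration; not part of dev/merge_pr.py.
TAG_RE = re.compile(r"^\s*\[([^\]]+)\]\s*")


def jira_issue_payload(pr_title, pr_body, pr_url, project="ARROW"):
    """Build the JSON body for a Jira "create issue" request from a PR."""
    components = []
    rest = pr_title
    # Peel off leading "[Tag]" prefixes, e.g. "[Rust][DataFusion] Fix x".
    while True:
        match = TAG_RE.match(rest)
        if not match:
            break
        components.append(match.group(1))
        rest = rest[match.end():]

    fields = {
        "project": {"key": project},
        "summary": pr_title.strip(),
        # Link back to the PR so the issue carries its context.
        "description": f"{pr_body.strip()}\n\nPull request: {pr_url}",
        "issuetype": {"name": "Improvement"},  # assumed default type
    }
    if components:
        # Assumption: each tag is a valid component name in the project.
        fields["components"] = [{"name": name} for name in components]
    return {"fields": fields}


def retitled(pr_title, issue_key):
    """Return the PR title prefixed with the new issue key."""
    return f"{issue_key}: {pr_title.strip()}"
```

Sending the payload with credentials and renaming the PR through the
GitHub API are deliberately left out; once Jira returns an issue key,
retitled("[Rust] Fix x", key) gives the title the existing bot expects.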

I did not intend to open a philosophical conversation on changing
communication patterns and behavior in the Arrow community at large, though
reading the viewpoints on this thread has been most enlightening. Thank you
all for the time you have taken to share your views.

Andrew

[1] https://github.com/apache/arrow/pull/9598


On Tue, Mar 2, 2021 at 3:01 PM Wes McKinney <wesmck...@gmail.com> wrote:

> With regard to the actual work of merging patches: I have merged
> 3,234 patches in this project [1], so I think this qualifies me to
> have an opinion about this. I don't think that the merge tool is
> problematic -- I use a little bash helper function [2] which cuts down
> my work merging a patch down to just a few keystrokes to the point
> where I hardly think about it anymore. I store my Jira username and
> password in environment variables — they aren't Bitcoin private keys
> so I think if they get compromised it's not a big deal. I believe it's
> worth the effort for every committer to streamline their process so
> they aren't having to jump through a bunch of hoops every time they
> merge a patch.
>
> As far as creating Jira issues for PRs that don't have them:
>
> * It seems like it should be straightforward to create a GitHub
> actions bot to create a Jira issue and paste the link into the GitHub
> PR so you don't have to use the web interface at all.
> * The one-time cost of asking a contributor to create an account and
> assign themselves to the issue is still going to be present, but the
> cost of this seems low to me. The project has had around 600 unique
> contributors in 5 years. Some of the administrative labor of adding
> users to roles could in theory be automated, but since it takes 30-60
> seconds to do this, is it worth the automation effort to reduce this
> down to ~10 seconds or so?
> * We should really be encouraging contributors to be good citizens
> with respect to issue hygiene. If they care enough to write a PR but
> not enough to create an issue about it, is that good?
> * Spark uses the "[MINOR]" tag for PRs that don't merit an entry in
> the changelog; I'd be fine with adopting that formally and
> implementing it in the merge tool.
>
> I could just be numb to the drudgery at this point but looking back on
> 5 years of work and thousands of issues I don't feel it's negatively
> impacted me, but I'm probably not a typical case. (Note: I
> automatically filter all Jira mail that doesn't have "mentioned you"
> in it into a separate folder and find it helpful to be able to search
> through it in gmail)
>
> That said, I'm open to migrating away from Jira ONLY if everyone else
> is committed to holding contributors to some reasonable standards of
> professionalism when it comes to communication. The worry I have is
> that we end up with a long queue of PRs having little context and not
> much planning / organization. This project has been successful not
> haphazardly/anarchically but precisely because of deliberate planning
> and organization over a long period of time, so I think we should try
> to respect the process that has gotten us to where we are now. What
> may seem like bureaucracy serves a valuable function to keep people
> organized and informed.
>
> - Wes
>
> [1]: https://gist.github.com/wesm/7058fd833861afc0d3306cdabe5b0a90
> [2]: https://gist.github.com/wesm/17bc4cb8e7a6e5a715cb6de46d2e01e9
>
> On Tue, Mar 2, 2021 at 1:44 PM Neal Richardson
> <neal.p.richard...@gmail.com> wrote:
> >
> > A few thoughts:
> >
> > * Given the cost of switching issue trackers (even if we were to identify
> > one we thought was better), I think it's extremely unlikely that we would
> > abandon JIRA. So rather than dumping on JIRA (an easy target, of course),
> > we should focus on how we can make the workflows we have smoother.
> >
> > * Our workflow assumes that a JIRA issue exists before a pull request
> > does, so it's awkward when you do it the other way, but we can add
> > automation to make it better. Perhaps this is exactly what Andrew Lamb
> > is working on (the original impetus for this discussion thread), but
> > suppose for example we had a PR comment bot action that would create a
> > JIRA issue from a pull request and ideally then rename the PR to match.
> > Our current bot that checks that every PR has a JIRA issue would just
> > suggest that you make the magic comment on the PR to create an issue.
> >
> > * FWIW for merging PRs, I just keep a terminal window open with the
> > python virtualenv active, and merging a PR just means hitting the up
> > arrow and changing the PR number from the previous command. You can set
> > your credentials in env vars if you find it burdensome to retype them.
> > As someone who merges lots of patches, I find this easy. It's less easy
> > than clicking a merge button in the web browser, but the extra checks
> > and confirmations it does have prevented me from merging bad code
> > before, so I think the extra friction actually serves a purpose here.
> >
> > Neal
> >
> > On Tue, Mar 2, 2021 at 11:21 AM Weston Pace <weston.p...@gmail.com> wrote:
> >
> > > It also seems like we're describing two different issues.  The first,
> > > a barrier to entry for new development.  The second, overhead imposed
> > > on an active developer.  I'm personally not so worried about the
> > > overhead imposed, perhaps because I can't write code that fast
> > > anyways, so I'll stay out of that discussion.
> > >
> > > I think the barrier to entry is not so much "I don't know which issue
> > > tracker to use" or "I have to follow a bunch of steps" as it is "I'm
> > > pretty sure I can improve this but not 100% sure and I don't want to
> > > look like a fool and this is a huge code base and I'd need a lot of
> > > help getting started and I don't want to burden people."
> > >
> > > Also, I would challenge the fact that people born after the year 2000
> > > are cognitively identical :)  Someone born after 2000 treats email the
> > > same way people born after 1970 treat phone calls.  Most gen Z I work
> > > with see gmail as an authentication tool and not a communication tool.
> > >
> > > I think Fernando's point about informal discussion is a good one.  I
> > > don't think Github is the tool you'd want for this anyways.  We have
> > > Zulip but it is not advertised (e.g.
> > > https://arrow.apache.org/community/).  It's also heavily
> > > developer-centric and not user-centric at the moment.  If we want
> > > something like that I'd be willing to help with the management /
> > > answering questions as I'm able.
> > >
> > > On Tue, Mar 2, 2021 at 9:11 AM Jorge Cardoso Leitão
> > > <jorgecarlei...@gmail.com> wrote:
> > > >
> > > > Hi Antoine,
> > > >
> > > > > Can you expand a bit on this?  In particular, which aspects of using
> > > > > JIRA feel bureaucratic?  Is it the requirement to create a new issue
> > > > > for each PR?  Or is it other concerns (such as the UI for entering or
> > > > > searching issues)?
> > > > >
> > > >
> > > > First of all, thank you for taking my concerns and actively trying to
> > > > understand them.
> > > >
> > > > It is advantageous for everyone to have small, focused PRs, as they are
> > > > easy to review and can narrow the discussion to a single problem. This
> > > > also makes it easy for new contributors to start. For this to work, we
> > > > need a system in which the work needed to create and merge PRs is small
> > > > in absolute terms, as the "meat" of the PR may be small. *This* IMO is
> > > > not working. As a flavour, below is the usual process for a situation
> > > > in which, while working on an issue, I find a side issue and need to
> > > > PR it:
> > > >
> > > > 1. create the PR on github with the fix
> > > > 2. got an email from github that a bot commented that I must put the
> > > > JIRA issue. Got it, I forgot about that...
> > > > 3. go to JIRA, if not logged in, log in (3 clicks + some password
> > > > manager stuff, and be redirected to a random page)
> > > > 4. press "create issue"
> > > > 5. Fill content:
> > > >    * type
> > > >    * Summary (do not forget to add the component to the title)
> > > >    * component
> > > >    * assign myself
> > > >    * description(*)
> > > > 6. press create. A small popup on JIRA will show that it was created.
> > > > It is really difficult to copy-paste the issue number from the pop up:
> > > > 7. Press on it before it disappears so that I can easily copy-paste its
> > > > number. I need to be fast, though: if it disappears before I press on
> > > > it, I will need to find it, which is a story in itself.
> > > > 8. go back to github and modify the title
> > > >
> > > > (*) I already wrote a description on the PR using markdown, which is
> > > > when I was first thinking about the PR itself. JIRA does not support
> > > > markdown, so I can't copy-paste. I now need to fight with the "visual"
> > > > editor, or remember what the notation is for the text. I also need to
> > > > remember that {{{ }}}, not backticks, is for code. I will likely leave
> > > > that one empty.
> > > >
> > > > I am adding hiccups above because they do happen due to the mental
> > > > workload involved, even for someone quite proficient at this.
> > > >
> > > > Let's now assume that all is done and we can merge it. Let's assume for
> > > > simplicity that the PR was done by a contributor that is already a
> > > > member of JIRA and already has a contributor role on JIRA (the easy
> > > > case).
> > > >
> > > > 1. Open a terminal and navigate to a clone of the arrow project that
> > > > already has a Python venv on it
> > > > 2. run `source venv/bin/activate` (our script has some external
> > > > requirements, thus we need this)
> > > > 3. run `dev/merge_pr.py PR number`. What was the PR number again?
> > > > 4. Go to github and copy the PR number
> > > > 5. paste on the terminal and press enter
> > > > 6. Now type my username from JIRA
> > > > 7. Now my password. I store all my passwords on a password manager with
> > > > a browser extension and often work from different VMs via ssh. Thus, I
> > > > go to JIRA on the browser, click on the extension and copy password
> > > > 8. paste on my terminal and press enter
> > > > 9. Assuming that no conflicts arise, press enter/yes 2 or 3 times and
> > > > it is merged and pushed. Great!
> > > > 10. When updating the JIRA issue, I noticed on the terminal that the
> > > > component is missing. Damn...
> > > > 11. press on the JIRA link on the terminal (and a possible new login)
> > > > and add the necessary components
> > > > 12. press enter on the terminal. Now we are done.
> > > > 13. Go to github and thank the contributor for the great work.
> > > >
> > > > IMO, these flows are large. They represent about the same time I would
> > > > need to create the small fix, a test, and PR it, including the PR
> > > > description.
> > > >
> > > > Maybe some people have better flows than I do, but my understanding
> > > > from other committers is that these steps are more or less
> > > > representative of a good day (i.e. no merge conflicts, master passing,
> > > > etc).
> > > >
> > > > A corollary problem is that this is not something only on committers /
> > > > PMCs' plate. Let's now go through the other side: I am a brand new
> > > > contributor.
> > > >
> > > > 1. create the PR on github because I found something to fix and fixed it
> > > > 2. got an email from a bot that I must follow some convention.
> > > > 3. OK, let me try to fix the title. Oh, I need an issue in JIRA...
> > > > 4. go to JIRA and there is no "create issue" option...
> > > > 5. somehow I figured that I need to create an account.
> > > > 6. There is no option to create a new account...
> > > > 7. Try pressing log in and see if it sends me somewhere. Now there is a
> > > > small button "Sign up". Finally, let's go.
> > > > 8. create an account: no SSO: I need to write my name, create a new
> > > > password on my password manager and wait for an email.
> > > > 9. wait for verification email and validate
> > > > 10. press create issue
> > > >    * type
> > > >    * Summary (no idea I need to put the component on the title)
> > > >    * Priority (no idea)
> > > >    * Due date (no idea)
> > > >    * component (no idea, let me search some keywords and see if
> > > >      something matches: perfect, Rust)
> > > >    * affected version (put latest)
> > > >    * Add description. Can't use markdown, so some struggle here as
> > > >      copy-pasting code usually does not work. Need to learn how to use
> > > >      this.
> > > >    * 5 more fields or so, that I have no idea if it is expected of me
> > > >      to fill or not.
> > > > 11. press create
> > > > 12. Pop up shows up: great, now I have the issue... wait, I forgot, why
> > > > was I creating the issue again? Pop up disappears.
> > > > 13. Ah, right, the issue number. Damn, how do I find it now? The pop up
> > > > is gone.
> > > > 14. Spend an arbitrary amount of time finding the issue.... finally, I
> > > > found it.
> > > > 15. Copy it over to the github, edit the title on github.
> > > > 16. Oh, I just received two emails from JIRA: someone already commented
> > > > on it?!?!
> > > > 17. Open email, and after some digging about it I conclude that:
> > > >     * the first email is telling me that the title of the JIRA has been
> > > >       updated
> > > >     * the second email is telling me that there is now a PR associated
> > > >       with the issue. Well, that is a bummer...
> > > > 18. That bot comment has not disappeared from github. Am I done?
> > > > 19. (some time later): oh, another email from JIRA: after 1m reading
> > > > the diff: ah, someone added some [DataFusion] to the title...
> > > >
> > > > Again, I am adding some clues of what I perceive to be a state of mind
> > > > of the person going through this flow, just to point out that IMO we
> > > > are talking about no small barrier here. I am also assuming an infinite
> > > > willpower.
> > > >
> > > > I do not find JIRA appealing, but I do not find it bad either. I do
> > > > think that the setup we have puts too much load on everyone and in my
> > > > previous email I tried to express that, for what it is worth, that has
> > > > caused me to significantly reduce my contributions. I also tried
> > > > formulating what I see as the root cause for the current status quo
> > > > (availability of information from the project); I hope this one helps
> > > > to clarify what I meant with "too much".
> > > >
> > > > > In my experience, discussion on JIRA is about the issue itself (for
> > > > > example diagnosing a bug or discussing a feature), then discussion on
> > > > > the PR is about the implementation.  JIRA discussions are generally
> > > > > readable by users (and indeed, users often participate) while PR
> > > > > discussions are really for developers of the project.
> > > > >
> > > >
> > > > In my experience (from Rust alone), little discussion happens on JIRA.
> > > > It happens either on the PRs, google docs, or the mailing list. Only
> > > > one of them supports markdown, though, which I consider a basic
> > > > requirement for any product for developers.
> > > >
> > > > > FWIW, I've set up a mail filter that sends all "work logged" automated
> > > > > mail to the trashbin.  I agree it's unfortunate that developers have to
> > > > > do that.  I also have other qualms with the Apache JIRA configuration,
> > > > > such as the fact that "labels" (keywords) are shared between all
> > > > > projects, so there is essentially a million of them with no effort at
> > > > > taxonomy.
> > > > >
> > > >
> > > > I also struggle with both. I tried at some point adding "beginner
> > > > friendly" labels to issues that were so, but there are 5 variants of
> > > > that label; if I do not know which one to pick, how can I expect a
> > > > *beginner* to know which one to pick?
> > > >
> > > > Best,
> > > > Jorge
> > > >
> > > > On Tue, Mar 2, 2021 at 10:26 AM Antoine Pitrou <anto...@python.org> wrote:
> > > >
> > > > >
> > > > > Hi Jorge,
> > > > >
> > > > > On Tue, 2 Mar 2021 08:55:03 +0100
> > > > > Jorge Cardoso Leitão <jorgecarlei...@gmail.com> wrote:
> > > > > > Hi,
> > > > > >
> > > > > > FWIW, the amount of bureaucracy that goes into JIRA is a major
> > > > > > contributing factor for the reduction of my time commitment to this
> > > > > > project by 80%+.
> > > > >
> > > > > Can you expand a bit on this?  In particular, which aspects of using
> > > > > JIRA feel bureaucratic?  Is it the requirement to create a new issue
> > > > > for each PR?  Or is it other concerns (such as the UI for entering or
> > > > > searching issues)?
> > > > >
> > > > > I can't say I like JIRA myself, but at least it provides the
> > > > > classification and navigation features that I would expect from an
> > > > > issue tracker.  The Github issue tracker AFAIK is rudimentary and not
> > > > > really practical when a project has accumulated many issues (but they
> > > > > may have changed this recently).
> > > > >
> > > > > > The major challenge is that most discussions happen where PRs are
> > > > > > created and seen, which is on github, but JIRA and the mailing list
> > > > > > are used for other types of decisions. In this model, how do we
> > > > > > preserve curated information about the decision process while at
> > > > > > the same time leverage both JIRA and github's capabilities?
> > > > >
> > > > > In my experience, discussion on JIRA is about the issue itself (for
> > > > > example diagnosing a bug or discussing a feature), then discussion on
> > > > > the PR is about the implementation.  JIRA discussions are generally
> > > > > readable by users (and indeed, users often participate) while PR
> > > > > discussions are really for developers of the project.
> > > > >
> > > > > > OTOH, asking contributors to create a jira account
> > > > > > and committers to add the person as contributor, as well as the
> > > > > > email spam and the merge process is a large barrier.
> > > > >
> > > > > FWIW, I've set up a mail filter that sends all "work logged" automated
> > > > > mail to the trashbin.  I agree it's unfortunate that developers have to
> > > > > do that.  I also have other qualms with the Apache JIRA configuration,
> > > > > such as the fact that "labels" (keywords) are shared between all
> > > > > projects, so there is essentially a million of them with no effort at
> > > > > taxonomy.
> > > > >
> > > > > > IMO the foundation could be clearer wrt what it means by
> > > > > > information being preserved and available (e.g. on apache servers?)
> > > > > > and if yes, follow it through by hosting all their projects on their
> > > > > > own github / gitlab / whatever, where issues and PRs are on the same
> > > > > > platform, and offer SSO for contributors as a way to prove identity
> > > > > > across the system. But that is also a complex operation with a lot
> > > > > > of unknowns...
> > > > >
> > > > > From what I see of the ASF's velocity, I wouldn't expect such a large
> > > > > breakthrough in the short future.
> > > > >
> > > > > (this is not trying to badmouth the ASF, just a pragmatic evaluation)
> > > > >
> > > > > Regards
> > > > >
> > > > > Antoine.
> > > > >
> > > > >
> > > > >
