With regard to the actual work of merging patches: I have merged
3,234 patches in this project [1], so I think that qualifies me to
have an opinion about it. I don't think that the merge tool is
problematic. I use a little bash helper function [2] that cuts the
work of merging a patch down to just a few keystrokes, to the point
where I hardly think about it anymore. I store my Jira username and
password in environment variables; they aren't Bitcoin private keys,
so if they get compromised I don't think it's a big deal. I believe it's
worth the effort for every committer to streamline their process so
they don't have to jump through a bunch of hoops every time they
merge a patch.
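
To illustrate the environment-variable approach, here is a minimal Python sketch of how a merge script could pick up credentials and only prompt when they are missing. The variable names JIRA_USERNAME and JIRA_PASSWORD and the function name are hypothetical, chosen for this example; they are not taken from the actual merge tool or from the helper in [2].

```python
import getpass
import os


def get_jira_credentials(env=os.environ, prompt_user=input,
                         prompt_pass=getpass.getpass):
    """Return (username, password), preferring environment variables.

    JIRA_USERNAME and JIRA_PASSWORD are hypothetical names used only in
    this sketch. If a variable is unset or empty, fall back to prompting.
    """
    username = env.get("JIRA_USERNAME") or prompt_user("Jira username: ")
    password = env.get("JIRA_PASSWORD") or prompt_pass("Jira password: ")
    return username, password
```

With both variables exported in the shell profile, a call to get_jira_credentials() returns immediately and the merge needs no typing of credentials; without them it behaves like an interactive prompt.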

As for creating Jira issues for PRs that don't have them:

* It seems like it should be straightforward to write a GitHub
Actions bot that creates a Jira issue and pastes the link into the GitHub
PR, so you don't have to use the web interface at all.
* The one-time cost of asking a contributor to create an account and
assign themselves to the issue is still going to be present, but the
cost of this seems low to me. The project has had around 600 unique
contributors in 5 years. Some of the administrative labor of adding
users to roles could in theory be automated, but since it takes 30-60
seconds to do this, is it worth the automation effort to reduce this
down to ~10 seconds or so?
* We should really be encouraging contributors to be good citizens
with respect to issue hygiene. If they care enough to write a PR but
not enough to create an issue about it, is that good?
* Spark uses the "[MINOR]" tag for PRs that don't merit an entry in
the changelog. I'd be fine with adopting that formally and
implementing it in the merge tool.
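
As a rough sketch of the first decision such a bot or the merge tool would make, the Python function below classifies a PR title as already linked to an issue, tagged "[MINOR]", or still needing a Jira issue. The "ARROW-<number>" key format, the function name, and the return labels are assumptions made for this example, not existing tooling.

```python
import re

# Assumed conventions for this sketch:
#   "ARROW-1234: ..."  -> PR already references a Jira issue
#   "[MINOR] ..."      -> PR does not merit a changelog entry
ISSUE_RE = re.compile(r"^(ARROW-\d+)\b", re.IGNORECASE)
MINOR_RE = re.compile(r"^\[MINOR\]", re.IGNORECASE)


def classify_pr_title(title):
    """Return (kind, issue_key), where kind is one of
    "issue", "minor", or "needs_issue"."""
    title = title.strip()
    match = ISSUE_RE.match(title)
    if match:
        return ("issue", match.group(1).upper())
    if MINOR_RE.match(title):
        return ("minor", None)
    return ("needs_issue", None)
```

Only the "needs_issue" case would trigger creating a Jira issue and posting its link on the PR; the "minor" case could be merged without one.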

I could just be numb to the drudgery at this point, but looking back on
5 years of work and thousands of issues, I don't feel it has negatively
impacted me. I'm probably not a typical case, though. (Note: I
automatically filter all Jira mail that doesn't have "mentioned you"
in it into a separate folder, and I find it helpful to be able to search
through it in Gmail.)

That said, I'm open to migrating away from Jira ONLY if everyone else
is committed to holding contributors to some reasonable standards of
professionalism when it comes to communication. The worry I have is
that we end up with a long queue of PRs having little context and not
much planning / organization. This project has been successful not
haphazardly/anarchically but precisely because of deliberate planning
and organization over a long period of time, so I think we should try
to respect the process that has gotten us to where we are now. What
may seem like bureaucracy serves a valuable function to keep people
organized and informed.

- Wes

[1]: https://gist.github.com/wesm/7058fd833861afc0d3306cdabe5b0a90
[2]: https://gist.github.com/wesm/17bc4cb8e7a6e5a715cb6de46d2e01e9

On Tue, Mar 2, 2021 at 1:44 PM Neal Richardson
<neal.p.richard...@gmail.com> wrote:
>
> A few thoughts:
>
> * Given the cost of switching issue trackers (even if we were to identify
> one we thought was better), I think it's extremely unlikely that we would
> abandon JIRA. So rather than dumping on JIRA (an easy target, of course),
> we should focus on how we can make the workflows we have smoother.
>
> * Our workflow assumes that a JIRA issue exists before a pull request does,
> so it's awkward when you do it the other way, but we can add automation to
> make it better. Perhaps this is exactly what Andrew Lamb is working on (the
> original impetus for this discussion thread), but suppose for example we
> had a PR comment bot action that would create a JIRA issue from a pull
> request and ideally then rename the PR to match. Our current bot that
> checks that every PR has a JIRA issue would just suggest that you make the
> magic comment on the PR to create an issue.
>
> * FWIW for merging PRs, I just keep a terminal window open with the python
> virtualenv active, and merging a PR just means hitting the up arrow and
> changing the PR number from the previous command. You can set your
> credentials in env vars if you find it burdensome to retype them. As
> someone who merges lots of patches, I find this easy. It's less easy than
> clicking a merge button in the web browser, but the extra checks and
> confirmations it does have prevented me from merging bad code before, so I
> think the extra friction actually serves a purpose here.
>
> Neal
>
> On Tue, Mar 2, 2021 at 11:21 AM Weston Pace <weston.p...@gmail.com> wrote:
>
> > It also seems like we're describing two different issues.  The first,
> > a barrier to entry for new development.  The second, overhead imposed
> > on an active developer.  I'm personally not so worried about the
> > overhead imposed, perhaps because I can't write code that fast
> > anyways, so I'll stay out of that discussion.
> >
> > I think the barrier to entry is not so much "I don't know which issue
> > tracker to use" or "I have to follow a bunch of steps" as it is "I'm
> > pretty sure I can improve this but not 100% sure and I don't want to
> > look like a fool and this is a huge code base and I'd need a lot of
> > help getting started and I don't want to burden people."
> >
> > Also, I would challenge the fact that people born after the year 2000
> > are cognitively identical :)  Someone born after 2000 treats email the
> > same way people born after 1970 treat phone calls.  Most gen Z I work
> > with see gmail as an authentication tool and not a communication tool.
> >
> > I think Fernando's point about informal discussion is a good one.  I
> > don't think Github is the tool you'd want for this anyways.  We have
> > Zulip but it is not advertised (e.g.
> > https://arrow.apache.org/community/).  It's also heavily
> > developer-centric and not user-centric at the moment.  If we want
> > something like that I'd be willing to help with the management /
> > answering questions as I'm able.
> >
> > On Tue, Mar 2, 2021 at 9:11 AM Jorge Cardoso Leitão
> > <jorgecarlei...@gmail.com> wrote:
> > >
> > > Hi Antoine,
> > >
> > > > Can you expand a bit on this?  In particular, which aspects of using
> > > > JIRA feel bureaucratic?  Is it the requirement to create a new issue
> > > > for each PR?  Or is it other concerns (such as the UI for entering or
> > > > searching issues)?
> > > >
> > >
> > > First of all, thank you for taking my concerns and actively trying to
> > > understand them.
> > >
> > > It is advantageous for everyone to have small, focused PRs, as they are
> > > easy to review and can narrow the discussion to a single problem. This
> > > also makes it easy for new contributors to start. For this to work, we
> > > need a system in which the work needed to create and merge PRs is small
> > > in absolute terms, as the "meat" of the PR may be small. *This* IMO is
> > > not working. As a flavour, below is the usual process for a situation in
> > > which, while working on an issue, I find a side issue and need to PR it:
> > >
> > > 1. create the PR on github with the fix
> > > 2. got an email from github that a bot commented that I must put the JIRA
> > > issue. Got it, I forgot about that...
> > > 3. go to JIRA, if not logged in, log in (3 clicks + some password manager
> > > stuff, and be redirected to a random page)
> > > 4. press "create issue"
> > > 5. Fill content:
> > >    * type
> > >    * Summary (do not forget to add the component to the title)
> > >    * component
> > >    * assign myself
> > >    * description(*)
> > > 6. press create. A small popup on JIRA will show that it was created. It
> > > is really difficult to copy-paste the issue number from the pop up:
> > > 7. Press on it before it disappears so that I can easily copy-paste its
> > > number. I need to be fast, though: if it disappears before I press on
> > > it, I will need to find it, which is a story in itself.
> > > 8. go back to github and modify the title
> > >
> > > (*) I already wrote a description on the PR using markdown, which is
> > > when I was first thinking about the PR itself. JIRA does not support
> > > markdown, so I can't copy-paste. I now need to fight with the "visual"
> > > editor, or remember what the notation is for the text. I also need to
> > > remember that {{{ }}}, not backticks, is for code. I will likely leave
> > > that one empty.
> > >
> > > I am adding hiccups above because they do happen due to the mental
> > > workload involved, even for someone quite proficient at this.
> > >
> > > Let's now assume that all is done and we can merge it. Let's assume for
> > > simplicity that the PR was done by a contributor that is already a member
> > > of JIRA and already has a contributor role on JIRA (the easy case).
> > >
> > > 1. Open a terminal and navigate to a clone of the arrow project that
> > > already has a Python venv on it
> > > 2. run `source venv/bin/activate` (our script has some external
> > > requirements, thus we need this)
> > > 3. run `dev/merge_pr.py PR number`. What was the PR number again?
> > > 4. Go to github and copy the PR number
> > > 5. paste on the terminal and press enter
> > > 6. Now type my username from JIRA
> > > 7. Now my password. I store all my passwords on a password manager with a
> > > browser extension and often work from different VMs via ssh. Thus, I go
> > > to JIRA on the browser, click on the extension and copy password
> > > 8. paste on my terminal and press enter
> > > 9. Assuming that no conflicts arise, press enter/yes 2 or 3 times and it
> > > is merged and pushed. Great!
> > > 10. When updating the JIRA issue, I noticed on the terminal that the
> > > component is missing. Dam...
> > > 11. press on the JIRA link on the terminal (and a possible new login) and
> > > add the necessary components
> > > 12. press enter on the terminal. Now we are done.
> > > 13. Go to github and thank the contributor for the great work.
> > >
> > > IMO, these flows are large. They represent about the same time I would
> > > need to create the small fix, a test, and PR it, including the PR
> > > description.
> > >
> > > Maybe some people have better flows than I do, but my understanding from
> > > other committers is that these steps are more or less representative of a
> > > good day (i.e. no merge conflicts, master passing, etc).
> > >
> > > A corollary problem is that this is not something only on committers /
> > > PMCs' plate. Let's now go through the other side: I am a brand new
> > > contributor.
> > >
> > > 1. create the PR on github because I found something to fix and fixed it
> > > 2. got an email by a bot that I must follow some convention.
> > > 3. okk, let me try to fix the title. Oh, I need an issue in JIRA...
> > > 3. go to JIRA and there is no "create issue" option...
> > > 4. somehow I figured that I need to create an account.
> > > 5. There is no option to create a new account...
> > > 6. Try pressing log in and see if it sends me somewhere. Now there is a
> > > small button "Sign up". Finally, let's go.
> > > 6. create an account: no SSO: I need write my name, create a new password
> > > on my password manager and wait for an email.
> > > 6. wait for verification email and validate
> > > 7. press create issue
> > >    * type
> > >    * Summary (no idea I need to put the component on the title)
> > >    * Priority (no idea)
> > >    * Due date (no idea)
> > >    * component (no idea, let me search some keywords and see if something
> > > matches: perfect, Rust)
> > >    * affected version (put latest)
> > >    * Add description. Can't use markdown, so some struggle here as
> > > copy-pasting code usually does not work. Need to learn how to use this.
> > >    * More 5 fields or so, that I have no idea if it is expected of me to
> > > fill or not.
> > > 9. press create
> > > 10. Pop up shows up: great, now I have the issue... wait, I forgot, why I
> > > was creating the issue again? Pop up disappears.
> > > 11. Ah, right, the issue number. Dam, how do I find it now? The pop up is
> > > gone.
> > > 12. Spend an arbitrary amount of time finding the issue.... finally, I
> > > found it.
> > > 13. Copy it over to the github, edit the title on github.
> > > 14. Oh, I just received two emails from JIRA: someone already commented
> > > on it?!?!
> > > 15. Open email, and after some digging about it I conclude that:
> > >     * the first email is telling me that the title of the JIRA has been
> > > updated
> > >     * the second email is telling me that there is now a PR associated
> > > with the issue. Well, that is a bummer...
> > > 16. That bot comment has not disappeared from github. Am I done?
> > > 17. (some time later): oh, another email from JIRA: after 1m reading the
> > > diff: ah, someone added some [DataFusion] to the title...
> > >
> > > Again, I am adding some clues of what I perceive to be a state of mind of
> > > the person going through this flow, just to point out that IMO we are
> > > talking about no small barrier here. I am also assuming an infinite
> > > willpower.
> > >
> > > I do not find JIRA appealing, but I do not find it bad either. I do think
> > > that the setup we have puts too much load on everyone and in my previous
> > > email I tried to express that, for what it is worth, that has caused me to
> > > significantly reduce my contributions. I also tried formulating what I
> > > see as the root cause for the current status quo (availability of
> > > information from the project); I hope this one helps to clarify what I
> > > meant with "too much".
> > >
> > > > In my experience, discussion on JIRA is about the issue itself (for
> > > > example diagnosing a bug or discussing a feature), then discussion on
> > > > the PR is about the implementation.  JIRA discussions are generally
> > > > readable by users (and indeed, users often participate) while PR
> > > > discussions are really for developers of the project.
> > > >
> > >
> > > In my experience (from Rust alone), little discussion happens on JIRA.
> > > It happens on the PRs, Google Docs, or the mailing list instead. Only one
> > > of them supports markdown, though, which I consider a basic requirement
> > > for any product for developers.
> > >
> > > > FWIW, I've set up a mail filter that sends all "work logged" automated
> > > > mail to the trashbin.  I agree it's unfortunate that developers have to
> > > > do that.  I also have other qualms with the Apache JIRA configuration,
> > > > such as the fact that "labels" (keywords) are shared between all
> > > > projects, so there is essentially a million of them with no effort at
> > > > taxonomy.
> > > >
> > >
> > > I also struggle with both. I tried at some point adding "beginner
> > > friendly" labels to issues that were so, but there are 5 variants of
> > > that label; if I do not know which one to pick, how can I expect a
> > > *beginner* to know which one to pick?
> > >
> > > Best,
> > > Jorge
> > >
> > > On Tue, Mar 2, 2021 at 10:26 AM Antoine Pitrou <anto...@python.org>
> > > wrote:
> > >
> > > >
> > > > Hi Jorge,
> > > >
> > > > On Tue, 2 Mar 2021 08:55:03 +0100
> > > > Jorge Cardoso Leitão <jorgecarlei...@gmail.com> wrote:
> > > > > Hi,
> > > > >
> > > > > FWIW, the amount of bureaucracy that goes into JIRA is a major
> > > > > contributing factor for the reduction of my time commitment to this
> > > > > project by 80%+.
> > > >
> > > > Can you expand a bit on this?  In particular, which aspects of using
> > > > JIRA feel bureaucratic?  Is it the requirement to create a new issue
> > > > for each PR?  Or is it other concerns (such as the UI for entering or
> > > > searching issues)?
> > > >
> > > > I can't say I like JIRA myself, but at least it provides the
> > > > classification and navigation features that I would expect from an
> > > > issue tracker.  The Github issue tracker AFAIK is rudimentary and not
> > > > really practical when a project has accumulated many issues (but they
> > > > may have changed this recently).
> > > >
> > > > > The major challenge is that most discussions happen where PRs are
> > > > > created and seen, which is on github, but JIRA and mailing list is
> > > > > used for other types of decisions. In this model, how do we preserve
> > > > > curated information about the decision process while at the same time
> > > > > leverage both JIRA and github's capabilities?
> > > >
> > > > In my experience, discussion on JIRA is about the issue itself (for
> > > > example diagnosing a bug or discussing a feature), then discussion on
> > > > the PR is about the implementation.  JIRA discussions are generally
> > > > readable by users (and indeed, users often participate) while PR
> > > > discussions are really for developers of the project.
> > > >
> > > > > OTOH, asking contributors to create a jira account
> > > > > and committers to add the person as contributor, as well as the email
> > > > > spam and the merge process is a large barrier.
> > > >
> > > > FWIW, I've set up a mail filter that sends all "work logged" automated
> > > > mail to the trashbin.  I agree it's unfortunate that developers have to
> > > > do that.  I also have other qualms with the Apache JIRA configuration,
> > > > such as the fact that "labels" (keywords) are shared between all
> > > > projects, so there is essentially a million of them with no effort at
> > > > taxonomy.
> > > >
> > > > > IMO the foundation could be clearer wrt to what does it mean with
> > > > > information being preserved and available (e.g. on apache servers?)
> > > > > and if yes, follow it through by hosting all their projects on their
> > > > > own github / gitlab / whatever, where issues and PRs are on the same
> > > > > platform, and offer SSO for contributors as a way to prove identity
> > > > > across the system. But that is also a complex operation with a lot of
> > > > > unknowns...
> > > >
> > > > From what I see of the ASF's velocity, I wouldn't expect such a large
> > > > breakthrough in the short future.
> > > >
> > > > (this is not trying to badmouth the ASF, just a pragmatic evaluation)
> > > >
> > > > Regards
> > > >
> > > > Antoine.
> > > >
> > > >
> > > >
> >
