Hi Andrew,

Sorry for widening the scope of this topic.

As a reviewer of your PR
https://github.com/apache/arrow/pull/9598, I am not sure that
we can keep following "The Apache Way" with the PR's
approach. The workflow proposed by the PR creates the
associated JIRA issue only when the PR is merged. I am not
sure that this workflow lets us "make plans in the open" on
our single-point-of-truth platform (JIRA). My concern is
that it merely satisfies "we have an associated JIRA issue
for each non-trivial PR".

(I think that https://github.com/apache/arrow/pull/9576,
which is referred to from the PR, is a trivial PR. I don't
think that we need an associated JIRA issue for it.)
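
As a rough illustration of that distinction (not the logic
of the actual merge tool), a title check could separate PRs
that already name a JIRA issue, PRs tagged as trivial, and
PRs that have neither. This is only a sketch: the
"ARROW-<number>: " title prefix, the handling of a "MINOR"
tag, and the function name classify_pr_title are assumptions
made for illustration.

```python
import re

# Assumed title conventions (hypothetical, for illustration only):
#   "ARROW-1234: [Component] Summary" -> names an associated JIRA issue
#   "[MINOR] Summary" / "MINOR: ..."  -> trivial PR, no JIRA issue expected
JIRA_TITLE = re.compile(r"^(ARROW-\d+): ")
MINOR_TITLE = re.compile(r"^\[?MINOR\]?[: ]", re.IGNORECASE)


def classify_pr_title(title):
    """Return ("jira", issue_id), ("minor", None) or ("missing", None)."""
    match = JIRA_TITLE.match(title)
    if match:
        # The PR title already points at a JIRA issue created beforehand.
        return ("jira", match.group(1))
    if MINOR_TITLE.match(title):
        # Trivial change: no JIRA issue would be required under this sketch.
        return ("minor", None)
    # Non-trivial PR without an issue: this is the case under discussion.
    return ("missing", None)
```

For example, classify_pr_title("ARROW-1234: [Rust] Fix typo")
falls in the first category, while a title with neither
prefix is reported as "missing".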


I haven't read all of the discussion yet, but I hope to gain
some insight from it. Thanks, all.


Thanks,
--
kou

In <CAFhtnRw=thytfqekkxz_o95z8wvbyxec_ncwkgpyotc9znv...@mail.gmail.com>
  "Re: Requirements on JIRA usage in Apache Arrow" on Tue, 2 Mar 2021 16:57:37 -0500,
  Andrew Lamb <al...@influxdata.com> wrote:

> To be clear, I have no objection to JIRA and I like the point that a little
> friction on contribution may encourage more structured contribution and
> planning.
> 
> As Neal mentioned, my original goal with this thread was to understand any
> non-technical reason that prevented automating the creation of JIRA tickets
> from GitHub PRs before I went and coded such a thing (plug: the proposed change
> to merge_pr.py is here[1] if anyone wants to review it).
> 
> I did not intend to open a philosophical conversation on changing
> communication patterns and behavior in the Arrow community at large, though
> reading the viewpoints on this thread has been most enlightening. Thank you
> all for the time you have taken to share your views.
> 
> Andrew
> 
> [1] https://github.com/apache/arrow/pull/9598
> 
> 
> On Tue, Mar 2, 2021 at 3:01 PM Wes McKinney <wesmck...@gmail.com> wrote:
> 
>> With regard to the actual work of merging patches: I have merged
>> 3,234 patches in this project [1], so I think this qualifies me to
>> have an opinion about this. I don't think that the merge tool is
>> problematic -- I use a little bash helper function [2] which cuts
>> my work merging a patch down to just a few keystrokes, to the point
>> where I hardly think about it anymore. I store my Jira username and
>> password in environment variables -- they aren't Bitcoin private keys,
>> so I think if they get compromised it's not a big deal. I believe it's
>> worth the effort for every committer to streamline their process so
>> they aren't having to jump through a bunch of hoops every time they
>> merge a patch.
>>
>> As far as creating Jira issues for PRs that don't have them:
>>
>> * It seems like it should be straightforward to create a GitHub
>> actions bot to create a Jira issue and paste the link into the GitHub
>> PR so you don't have to use the web interface at all.
>> * The one-time cost of asking a contributor to create an account and
>> assign themselves to the issue is still going to be present, but the
>> cost of this seems low to me. The project has had around 600 unique
>> contributors in 5 years. Some of the administrative labor of adding
>> users to roles could in theory be automated, but since it takes 30-60
>> seconds to do this, is it worth the automation effort to reduce this
>> down to ~10 seconds or so?
>> * We should really be encouraging contributors to be good citizens
>> with respect to issue hygiene. If they care enough to write a PR but
>> not enough to create an issue about it, is that good?
>> * Spark uses the "[MINOR]" tag for PRs that don't merit an entry in
>> the changelog. I'd be fine with adopting that formally and
>> implementing it in the merge tool.
>>
>> I could just be numb to the drudgery at this point but looking back on
>> 5 years of work and thousands of issues I don't feel it's negatively
>> impacted me, but I'm probably not a typical case. (Note: I
>> automatically filter all Jira mail that doesn't have "mentioned you"
>> in it into a separate folder and find it helpful to be able to search
>> through it in gmail)
>>
>> That said, I'm open to migrating away from Jira ONLY if everyone else
>> is committed to holding contributors to some reasonable standards of
>> professionalism when it comes to communication. The worry I have is
>> that we end up with a long queue of PRs having little context and not
>> much planning / organization. This project has been successful not
>> haphazardly/anarchically but precisely because of deliberate planning
>> and organization over a long period of time, so I think we should try
>> to respect the process that has gotten us to where we are now. What
>> may seem like bureaucracy serves a valuable function to keep people
>> organized and informed.
>>
>> - Wes
>>
>> [1]: https://gist.github.com/wesm/7058fd833861afc0d3306cdabe5b0a90
>> [2]: https://gist.github.com/wesm/17bc4cb8e7a6e5a715cb6de46d2e01e9
>>
>> On Tue, Mar 2, 2021 at 1:44 PM Neal Richardson
>> <neal.p.richard...@gmail.com> wrote:
>> >
>> > A few thoughts:
>> >
>> > * Given the cost of switching issue trackers (even if we were to identify
>> > one we thought was better), I think it's extremely unlikely that we would
>> > abandon JIRA. So rather than dumping on JIRA (an easy target, of course),
>> > we should focus on how we can make the workflows we have smoother.
>> >
>> > * Our workflow assumes that a JIRA issue exists before a pull request does,
>> > so it's awkward when you do it the other way, but we can add automation to
>> > make it better. Perhaps this is exactly what Andrew Lamb is working on (the
>> > original impetus for this discussion thread), but suppose for example we
>> > had a PR comment bot action that would create a JIRA issue from a pull
>> > request and ideally then rename the PR to match. Our current bot that
>> > checks that every PR has a JIRA issue would just suggest that you make the
>> > magic comment on the PR to create an issue.
>> >
>> > * FWIW for merging PRs, I just keep a terminal window open with the python
>> > virtualenv active, and merging a PR just means hitting the up arrow and
>> > changing the PR number from the previous command. You can set your
>> > credentials in env vars if you find it burdensome to retype them. As
>> > someone who merges lots of patches, I find this easy. It's less easy than
>> > clicking a merge button in the web browser, but the extra checks and
>> > confirmations it does have prevented me from merging bad code before, so I
>> > think the extra friction actually serves a purpose here.
>> >
>> > Neal
>> >
>> > On Tue, Mar 2, 2021 at 11:21 AM Weston Pace <weston.p...@gmail.com> wrote:
>> >
>> > > It also seems like we're describing two different issues.  The first,
>> > > a barrier to entry for new development.  The second, overhead imposed
>> > > on an active developer.  I'm personally not so worried about the
>> > > overhead imposed, perhaps because I can't write code that fast
>> > > anyways, so I'll stay out of that discussion.
>> > >
>> > > I think the barrier to entry is not so much "I don't know which issue
>> > > tracker to use" or "I have to follow a bunch of steps" as it is "I'm
>> > > pretty sure I can improve this but not 100% sure and I don't want to
>> > > look like a fool and this is a huge code base and I'd need a lot of
>> > > help getting started and I don't want to burden people."
>> > >
>> > > Also, I would challenge the fact that people born after the year 2000
>> > > are cognitively identical :)  Someone born after 2000 treats email the
>> > > same way people born after 1970 treat phone calls.  Most gen Z I work
>> > > with see gmail as an authentication tool and not a communication tool.
>> > >
>> > > I think Fernando's point about informal discussion is a good one.  I
>> > > don't think Github is the tool you'd want for this anyways.  We have
>> > > Zulip but it is not advertised (e.g.
>> > > https://arrow.apache.org/community/).  It's also heavily
>> > > developer-centric and not user-centric at the moment.  If we want
>> > > something like that I'd be willing to help with the management /
>> > > answering questions as I'm able.
>> > >
>> > > On Tue, Mar 2, 2021 at 9:11 AM Jorge Cardoso Leitão
>> > > <jorgecarlei...@gmail.com> wrote:
>> > > >
>> > > > Hi Antoine,
>> > > >
>> > > > > Can you expand a bit on this?  In particular, which aspects of using
>> > > > > JIRA feel bureaucratic?  Is it the requirement to create a new issue
>> > > > > for each PR?  Or is it other concerns (such as the UI for entering or
>> > > > > searching issues)?
>> > > > >
>> > > >
>> > > > First of all, thank you for taking my concerns and actively trying to
>> > > > understand them.
>> > > >
>> > > > It is advantageous for everyone to have small, focused PRs, as they are
>> > > > easy to review and can narrow the discussion to a single problem. This also
>> > > > makes it easy for new contributors to start. For this to work, we need a
>> > > > system in which the work needed to create and merge PRs is small in
>> > > > absolute terms, as the "meat" of the PR may be small. *This* IMO is not
>> > > > working. To give a flavour, below is the usual process for a situation in
>> > > > which, while working on an issue, I find a side issue and need to open a PR
>> > > > for it:
>> > > >
>> > > > 1. create the PR on github with the fix
>> > > > 2. got an email from github that a bot commented that I must put the JIRA
>> > > > issue. Got it, I forgot about that...
>> > > > 3. go to JIRA, if not logged in, log in (3 clicks + some password manager
>> > > > stuff, and be redirected to a random page)
>> > > > 4. press "create issue"
>> > > > 5. Fill content:
>> > > >    * type
>> > > >    * Summary (do not forget to add the component to the title)
>> > > >    * component
>> > > >    * assign myself
>> > > >    * description(*)
>> > > > 6. press create. A small popup on JIRA will show that it was created. It is
>> > > > really difficult to copy-paste the issue number from the pop up:
>> > > > 7. Press on it before it disappears so that I can easily copy-paste its
>> > > > number. I need to be fast, though: if it disappears before I press on it, I
>> > > > will need to find it, which is a story in itself.
>> > > > 8. go back to github and modify the title
>> > > >
>> > > > (*) I already wrote a description on the PR using markdown, which is when I
>> > > > was first thinking about the PR itself. JIRA does not support markdown, so
>> > > > I can't copy-paste. I now need to fight with the "visual" editor, or
>> > > > remember what the notation is for the text. I also need to remember that
>> > > > {{{ }}}, not backticks, is for code. I will likely leave that one empty.
>> > > >
>> > > > I am adding hiccups above because they do happen due to the mental workload
>> > > > involved, even for someone quite proficient at this.
>> > > >
>> > > > Let's now assume that all is done and we can merge it. Let's assume for
>> > > > simplicity that the PR was done by a contributor that is already a member
>> > > > of JIRA and already has a contributor role on JIRA (the easy case).
>> > > >
>> > > > 1. Open a terminal and navigate to a clone of the arrow project that
>> > > > already has a Python venv on it
>> > > > 2. run `source venv/bin/activate` (our script has some external
>> > > > requirements, thus we need this)
>> > > > 3. run `dev/merge_pr.py PR number`. What was the PR number again?
>> > > > 4. Go to github and copy the PR number
>> > > > 5. paste on the terminal and press enter
>> > > > 6. Now type my username from JIRA
>> > > > 7. Now my password. I store all my passwords on a password manager with a
>> > > > browser extension and often work from different VMs via ssh. Thus, I go to
>> > > > JIRA on the browser, click on the extension and copy the password
>> > > > 8. paste on my terminal and press enter
>> > > > 9. Assuming that no conflicts arise, press enter/yes 2 or 3 times and it is
>> > > > merged and pushed. Great!
>> > > > 10. When updating the JIRA issue, I noticed on the terminal that the
>> > > > component is missing. Damn...
>> > > > 11. press on the JIRA link on the terminal (and a possible new login) and
>> > > > add the necessary components
>> > > > 12. press enter on the terminal. Now we are done.
>> > > > 13. Go to github and thank the contributor for the great work.
>> > > >
>> > > > IMO, these flows are large. They represent about the same time I would need
>> > > > to create the small fix, a test, and PR it, including the PR description.
>> > > >
>> > > > Maybe some people have better flows than I do, but my understanding from
>> > > > other committers is that these steps are more or less representative of a
>> > > > good day (i.e. no merge conflicts, master passing, etc).
>> > > >
>> > > > A corollary problem is that this is not something only on committers /
>> > > > PMCs' plate. Let's now go through the other side: I am a brand new
>> > > > contributor.
>> > > >
>> > > > 1. create the PR on github because I found something to fix and fixed it
>> > > > 2. got an email from a bot that I must follow some convention.
>> > > > 3. ok, let me try to fix the title. Oh, I need an issue in JIRA...
>> > > > 4. go to JIRA and there is no "create issue" option...
>> > > > 5. somehow I figured that I need to create an account.
>> > > > 6. There is no option to create a new account...
>> > > > 7. Try pressing log in and see if it sends me somewhere. Now there is a
>> > > > small button "Sign up". Finally, let's go.
>> > > > 8. create an account: no SSO: I need to write my name, create a new password
>> > > > on my password manager and wait for an email.
>> > > > 9. wait for verification email and validate
>> > > > 10. press create issue
>> > > >    * type
>> > > >    * Summary (no idea I need to put the component on the title)
>> > > >    * Priority (no idea)
>> > > >    * Due date (no idea)
>> > > >    * component (no idea, let me search some keywords and see if something
>> > > > matches: perfect, Rust)
>> > > >    * affected version (put latest)
>> > > >    * Add description. Can't use markdown, so some struggle here as
>> > > > copy-pasting code usually does not work. Need to learn how to use this.
>> > > >    * 5 or so more fields, that I have no idea if it is expected of me to
>> > > > fill or not.
>> > > > 11. press create
>> > > > 12. Pop up shows up: great, now I have the issue... wait, I forgot, why
>> > > > was I creating the issue again? Pop up disappears.
>> > > > 13. Ah, right, the issue number. Damn, how do I find it now? The pop up is
>> > > > gone.
>> > > > 14. Spend an arbitrary amount of time finding the issue.... finally, I
>> > > > found it.
>> > > > 15. Copy it over to the github, edit the title on github.
>> > > > 16. Oh, I just received two emails from JIRA: someone already commented on
>> > > > it?!?!
>> > > > 17. Open email, and after some digging about it I conclude that:
>> > > >     * the first email is telling me that the title of the JIRA has been
>> > > > updated
>> > > >     * the second email is telling me that there is now a PR associated with
>> > > > the issue. Well, that is a bummer...
>> > > > 18. That bot comment has not disappeared from github. Am I done?
>> > > > 19. (some time later): oh, another email from JIRA: after 1m reading the
>> > > > diff: ah, someone added some [DataFusion] to the title...
>> > > >
>> > > > Again, I am adding some clues of what I perceive to be the state of mind of
>> > > > the person going through this flow, just to point out that IMO we are
>> > > > talking about no small barrier here. I am also assuming infinite
>> > > > willpower.
>> > > >
>> > > > I do not find JIRA appealing, but I do not find it bad either. I do think
>> > > > that the setup we have puts too much load on everyone and in my previous
>> > > > email I tried to express that, for what it is worth, that has caused me to
>> > > > significantly reduce my contributions. I also tried formulating what I see
>> > > > as the root cause for the current status quo (availability of information
>> > > > from the project); I hope this one helps to clarify what I meant with "too
>> > > > much".
>> > > >
>> > > > > In my experience, discussion on JIRA is about the issue itself (for
>> > > > > example diagnosing a bug or discussing a feature), then discussion on
>> > > > > the PR is about the implementation.  JIRA discussions are generally
>> > > > > readable by users (and indeed, users often participate) while PR
>> > > > > discussions are really for developers of the project.
>> > > > >
>> > > >
>> > > > In my experience (from Rust alone), little discussion happens on JIRA.
>> > > > It happens on the PRs, Google Docs, or the mailing list instead. Only one of
>> > > > them supports markdown, though, which I consider a basic requirement for any
>> > > > product for developers.
>> > > >
>> > > > > FWIW, I've set up a mail filter that sends all "work logged" automated
>> > > > > mail to the trashbin.  I agree it's unfortunate that developers have to
>> > > > > do that.  I also have other qualms with the Apache JIRA configuration,
>> > > > > such as the fact that "labels" (keywords) are shared between all
>> > > > > projects, so there is essentially a million of them with no effort at
>> > > > > taxonomy.
>> > > > >
>> > > >
>> > > > I also struggle with both. I tried at some point adding "beginner friendly"
>> > > > labels to issues that were so, but there are 5 variants of that label; if I
>> > > > do not know which one to pick, how can I expect a *beginner* to know which
>> > > > one to pick?
>> > > >
>> > > > Best,
>> > > > Jorge
>> > > >
>> > > > On Tue, Mar 2, 2021 at 10:26 AM Antoine Pitrou <anto...@python.org> wrote:
>> > > >
>> > > > >
>> > > > > Hi Jorge,
>> > > > >
>> > > > > On Tue, 2 Mar 2021 08:55:03 +0100
>> > > > > Jorge Cardoso Leitão <jorgecarlei...@gmail.com> wrote:
>> > > > > > Hi,
>> > > > > >
>> > > > > > FWIW, the amount of bureaucracy that goes into JIRA is a major contributing
>> > > > > > factor for the reduction of my time commitment to this project by 80%+.
>> > > > >
>> > > > > Can you expand a bit on this?  In particular, which aspects of using
>> > > > > JIRA feel bureaucratic?  Is it the requirement to create a new issue
>> > > > > for each PR?  Or is it other concerns (such as the UI for entering or
>> > > > > searching issues)?
>> > > > >
>> > > > > I can't say I like JIRA myself, but at least it provides the
>> > > > > classification and navigation features that I would expect from an
>> > > > > issue tracker.  The Github issue tracker AFAIK is rudimentary and not
>> > > > > really practical when a project has accumulated many issues (but they
>> > > > > may have changed this recently).
>> > > > >
>> > > > > > The major challenge is that most discussions happen where PRs are created
>> > > > > > and seen, which is on github, but JIRA and mailing list is used for other
>> > > > > > types of decisions. In this model, how do we preserve curated information
>> > > > > > about the decision process while at the same time leverage both JIRA and
>> > > > > > github's capabilities?
>> > > > >
>> > > > > In my experience, discussion on JIRA is about the issue itself (for
>> > > > > example diagnosing a bug or discussing a feature), then discussion on
>> > > > > the PR is about the implementation.  JIRA discussions are generally
>> > > > > readable by users (and indeed, users often participate) while PR
>> > > > > discussions are really for developers of the project.
>> > > > >
>> > > > > > OTOH, asking contributors to create a jira account
>> > > > > > and committers to add the person as contributor, as well as the email
>> > > > > > spam and the merge process is a large barrier.
>> > > > >
>> > > > > FWIW, I've set up a mail filter that sends all "work logged" automated
>> > > > > mail to the trashbin.  I agree it's unfortunate that developers have to
>> > > > > do that.  I also have other qualms with the Apache JIRA configuration,
>> > > > > such as the fact that "labels" (keywords) are shared between all
>> > > > > projects, so there is essentially a million of them with no effort at
>> > > > > taxonomy.
>> > > > >
>> > > > > > IMO the foundation could be clearer wrt to what does it mean with
>> > > > > > information being preserved and available (e.g. on apache servers?) and if
>> > > > > > yes, follow it through by hosting all their projects on their own github /
>> > > > > > gitlab / whatever, where issues and PRs are on the same platform, and offer
>> > > > > > SSO for contributors as a way to prove identity across the system. But that
>> > > > > > is also a complex operation with a lot of unknowns...
>> > > > >
>> > > > > From what I see of the ASF's velocity, I wouldn't expect such a large
>> > > > > breakthrough in the short future.
>> > > > >
>> > > > > (this is not trying to badmouth the ASF, just a pragmatic evaluation)
>> > > > >
>> > > > > Regards
>> > > > >
>> > > > > Antoine.
>> > > > >
>> > > > >
>> > > > >
>> > >
>>
