Hi all,

Is there a consensus on migrating to GitHub Issues?

On Wed, Dec 15, 2021 at 9:17 AM Brian Hulette <bhule...@google.com> wrote:

>
>
> On Tue, Dec 14, 2021 at 1:14 PM Kenneth Knowles <k...@google.com> wrote:
>
>>
>>
>> On Thu, Dec 9, 2021 at 11:50 PM Jean-Baptiste Onofre <j...@nanthrax.net>
>> wrote:
>>
>>> Hi,
>>>
>>> No problem for me. The only thing I don’t like about GitHub issues is
>>> the fact that it’s not possible to “assign” several milestones to an issue.
>>> When we maintain several active branches/versions, it sucks (one issue ==
>>> one milestone), as we have to create several issues.
>>>
>>
>> This is a good point to consider. In Beam we often create multiple issues
>> anyhow when we intend to backport/cherrypick a fix: one issue for the
>> original fix and one for each targeted cherrypick. This way their resolution
>> status can be tracked separately. But it is nice for users to be able to go
>> back and edit the original bug report to say which versions are affected
>> and which are not.
>>
>
> I looked into this a little bit. It looks like milestones don't have to
> represent a release (e.g. they could represent some abstract goal), but
> they are often associated with releases. This seems like a reasonable field
> to map to "Fix Version/s" in jira, but jira does support specifying
> multiple releases. So one issue == one milestone would be a regression.
> As Kenn pointed out though we often create a separate jira to track
> backports anyway (even though we could just specify multiple fix versions),
> so I'm not sure this is a significant blocker.
>
> If we want to use milestones to track abstract goals, I think we'd be out
> of luck. We could just use labels, but the GitHub UI doesn't present a nice
> burndown chart for those. See
> https://github.com/pandas-dev/pandas/milestones vs.
> https://github.com/pandas-dev/pandas/labels. FWIW jira doesn't have great
> functionality here either.
>
>
>>
>> Kenn
>>
>>
>>>
>>> Regards
>>> JB
>>>
>>> > On Dec 10, 2021, at 01:28, Kyle Weaver <kcwea...@google.com> wrote:
>>> >
>>> > I’m in favor of switching to Github issues. I can’t think of a single
>>> thing jira does better.
>>> >
>>> > Thanks Jarek, this is a really great resource [1]. For another
>>> reference, the Calcite project is engaged in the same discussion right now
>>> [2]. I came up with many of the same points independently before I saw
>>> their thread.
>>> >
>>> > When evaluating feature parity, we should make a distinction between
>>> non-structured (text) and structured data. And we don’t need a strict
>>> mechanical mapping for everything unless we’re planning on automatically
>>> migrating all existing issues. I don’t see the point in automatic
>>> migration, though; as Jarek pointed out, we’d end up perpetuating a ton of
>>> obsolete issues.
>>> >
>>> >       • We use nested issues and issue relations in jira, but as far
>>> as I know robots don’t use them and we don’t query them much, so we’re not
>>> losing anything by moving from an API to plain English descriptions: “This
>>> issue is blocked by issue #n.” Mentions show up automatically on other
>>> issues.
>>> >       • For component, type, priority, etc., we can use Github labels.
>>> >       • Version(s) affected is used inconsistently, and as far as I
>>> know only by humans, so a simple English description is fine. We can follow
>>> the example of other projects and make the version affected a part of the
>>> issue template.
>>> >       • For fix version, which we use to track which issues we want to
>>> fix in upcoming releases, as well as automatically generate release notes:
>>> Github has “milestones,” which can be marked on PRs or issues, or both.
>>> >               • IMO the automatically generated JIRA release notes are
>>> not especially useful anyway. They are too detailed for a quick summary,
>>> and not precise enough to show everything. For a readable summary, we use
>>> CHANGES.md to highlight changes we especially want users to know about. For
>>> a complete list of changes, there’s the git commit log, which is the
>>> ultimate source of truth.
>>> >       • We’d only want to preserve reporter and assignee if we’re
>>> planning on migrating everything automatically, and even then I think it’d
>>> be fine to compile a map of active contributors and drop the rest.
>>> >
>>> > As for the advantages of switching (just the ones off the top of my
>>> head):
>>> >       • As others have mentioned, it’s less burden for new
>>> contributors to create new issues and comment on existing ones.
>>> >       • Effortless linking between issues and PRs.
>>> >               • Github -> jira links were working for a short while,
>>> but they seem to be broken at the moment.
>>> >               • Jira -> github links only show: “links to GitHub Pull
>>> Request #xxxxx”. They don’t say the status of the PR, so you have to follow
>>> the link to find out. Especially inconvenient when one jira maps to several
>>> PRs, and you have to open all the links to get a summary of what work was
>>> done.
>>> >               • When you mention a GH issue in a pull request, a link
>>> to the PR will automatically appear on the issue, including not just the ID
>>> but also the PR’s description and status (open/closed/draft/merged/etc.),
>>> and if you hover it will show a preview as well.
>>> >               • We frequently merge a PR and then forget to mark the
>>> jira as closed. Whereas if a PR is linked to a GH issue using the “closes”
>>> keyword, the GH issue will automatically be closed [3].
>>> >       • I don’t have to look up or guess whether a github account and
>>> jira account belong to the same person.
>>> >       • There’s a single unified search bar to find issues, PRs, and
>>> code.
>>> >       • Github enables markdown formatting everywhere, which is more
>>> or less the industry standard, whereas Jira has its own bespoke system [4].
>>> >       • In GH issues, links to Github code snippets will automatically
>>> display the code snippet inline.
>>> >       • GH labels are scoped to each project, whereas ASF Jira labels
>>> are an unmanageable, infinitely growing namespace (see “flake,” “flaky,”
>>> “flakey,” “Flaky,” “flaky-test”...).
>>> >
>>> > [1]
>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191332632
>>> > [2]
>>> https://mail-archives.apache.org/mod_mbox/calcite-dev/202112.mbox/%3CCAB%3DJe-EuaijDjwb6umU_N2TaqFZawE%2BUbgZAgZYvrgPFypfAYQ%40mail.gmail.com%3E
>>> > [3]
>>> https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword
>>> > [4]
>>> https://jira.atlassian.com/secure/WikiRendererHelpAction.jspa?section=all
>>> >
>>> >
>>> > On Wed, Dec 8, 2021 at 9:13 AM Alexey Romanenko <
>>> aromanenko....@gmail.com> wrote:
>>> > Many thanks for details, Jarek!
>>> >
>>> > Actually, your experience shows that a full data transfer is very
>>> expensive (if possible at all) and not necessary, especially given that
>>> the number of Beam Jira issues is a couple of orders of magnitude larger
>>> than Airflow's. So it is very likely that we will end up living with two
>>> issue trackers, at least for some time, to avoid duplicating issues and to
>>> keep access to the old ones. This can be very confusing.
>>> >
>>> > At the same time, apart from the “one tool for everything” argument,
>>> which is certainly strong, I don’t see any other advantages of GH
>>> issues over Jira issues. Also, it is more important not to lose what we
>>> have now, as Jan mentioned below.
>>> >
>>> > So, my vote for now is -0 since it has significant pros and cons and
>>> the final impact is not evident.
>>> >
>>> > —
>>> > Alexey
>>> >
>>> > > On 8 Dec 2021, at 01:38, Jarek Potiuk <ja...@potiuk.com> wrote:
>>> > >
>>> > >> Do I understand correctly that this transition (if it happens)
>>> includes transferring the entire Beam Jira archive to GitHub issues with
>>> proper statuses/comments/refs/etc.? If not, what are the options?
>>> > >
>>> > > Suggestion from the experience of Airflow again - you can look it up
>>> > > in our notes.
>>> > >
>>> > > We initially tried to copy the issues manually or in bulk, but
>>> > > eventually we decided to tap into the wisdom and cooperation of our
>>> > > community.
>>> > >
>>> > > We migrated only some (not many) important things and asked our users
>>> > > to move the issues they think are still relevant/important to them.
>>> > > We closed JIRA to new entries and left the
>>> > > issues in JIRA in a read-only state so that we could always refer to
>>> > > them if needed.
>>> > >
>>> > > So rather than proactively copying the issues, we asked the users to
>>> > > decide which issues are important to them and proactively move
>>> > > them, and we left the option of moving an issue reactively if someone
>>> > > came back to it later.
>>> > >
>>> > > That turned out to be a smart decision considering the effort it
>>> > > would require to smartly move the issues vs. the results achieved. And
>>> > > it helped us clean up some "stale/useless/not important" issues.
>>> > >
>>> > > We had 1719 open JIRA issues when we migrated. Over the course of
>>> > > ~1.5 years (since about April 2020) we've had ~140 issues that refer
>>> > > to any of the JIRA issues:
>>> > > https://github.com/apache/airflow/issues?q=is%3Aissue+is%3Aclosed+%22https%3A%2F%2Fissues.apache.org%2Fjira%22+
>>> > > Currently we have > 4500 GH issues (3700 closed, 800 open).
>>> > >
>>> > > This means that, roughly speaking, fewer than 10% of the original open
>>> > > JIRA issues were actually somewhat valuable, and they account for
>>> > > < 5% of today's numbers. Of course some of the new GH
>>> > > issues duplicated those JIRA ones. But not many I think, especially
>>> > > since those issues in JIRA referred mostly to older Airflow versions.
>>> > >
>>> > > One more comment on the migration: I STRONGLY recommend using
>>> > > well-designed templates for GH issues from day one, which significantly
>>> > > improves the quality of issues, and using Discussions as the place
>>> > > where you move unclear/non-reproducible issues (and, for example,
>>> > > guiding users to Discussions if they have no clearly reproducible
>>> > > case). This significantly reduces the "bad issue overload" (see also
>>> > > the more detailed comments in
>>> > >
>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191332632
>>> ).
>>> > >
>>> > > I personally think a well-designed issue entry process for new issues
>>> > > is more important than migrating old issues in bulk. Especially if you
>>> > > ask users to help, as they will have to make a structured entry
>>> > > (with potentially more detailed information/reproducibility), or they
>>> > > will decide themselves that opening a GitHub discussion is better than
>>> > > opening an issue if they do not have a reproducible case. Or they will
>>> > > give up if too much information is needed (but this means that their
>>> > > issue is essentially not that important IMHO).
>>> > >
>>> > > But this is just friendly advice from the experience of those who did
>>> > > it quite some time ago :)
>>> > >
>>> > > J.
>>> > >
>>> > > On Wed, Dec 8, 2021 at 1:08 AM Brian Hulette <bhule...@google.com>
>>> wrote:
>>> > >>
>>> > >> At this point I just wanted to see if the community is interested
>>> in such a change or if there are any hard blockers. If we do go down this
>>> path I think we should port jiras over to GH Issues. You're right this
>>> isn't trivial, there's no ready-made solution we can use, we'd need to
>>> decide on a mapping for everything and write a tool to do the migration. It
>>> sounds like there may be other work in this area we can build on (e.g.
>>> Airflow may have made a tool we can work from?).
>>> > >>
>>> > >> I honestly don't have much experience with GH Issues so I can't
>>> provide concrete examples of better usability (maybe Jarek can?). From my
>>> perspective:
>>> > >> - I hear a lot of grumbling about jira, and a lot of praise for
>>> GitHub Issues.
>>> > >> - Most new users/contributors already have a GitHub account, and
>>> very few already have an ASF account. It sounds silly, but I'm sure this is
>>> a barrier for engaging with the community. Filing an issue, or commenting
>>> on one to provide additional context, or asking a clarifying question about
>>> a starter task should be very quick and easy - I bet a lot of these
>>> interactions are blocked at the jira registration page.
>>> > >>
>>> > >> Brian
>>> > >>
>>> > >> On Tue, Dec 7, 2021 at 9:04 AM Alexey Romanenko <
>>> aromanenko....@gmail.com> wrote:
>>> > >>>
>>> > >>> Do I understand correctly that this transition (if it happens)
>>> includes transferring the entire Beam Jira archive to GitHub issues with
>>> proper statuses/comments/refs/etc.? If not, what are the options?
>>> > >>>
>>> > >>> Since this transfer looks quite complicated at first glance,
>>> what are the real key advantages of initiating this process (concrete
>>> examples would be much appreciated), and what are the show-stoppers for us
>>> with the current Jira workflow?
>>> > >>>
>>> > >>> —
>>> > >>> Alexey
>>> > >>>
>>> > >>> On 6 Dec 2021, at 19:48, Udi Meiri <eh...@google.com> wrote:
>>> > >>>
>>> > >>> +1 on migrating to GH issues.
>>> > >>> We will need to update our release process. Hopefully it'll make
>>> it simpler.
>>> > >>>
>>> > >>>
>>> > >>> On Sat, Dec 4, 2021 at 2:35 AM Jarek Potiuk <ja...@potiuk.com>
>>> wrote:
>>> > >>>>
>>> > >>>> Just to add a comment on those requirements Kenneth, looking into
>>> the
>>> > >>>> near future.
>>> > >>>>
>>> > >>>> Soon GitHub will make generally available a whole new way of
>>> > >>>> interacting with issues (without removing the current way), which
>>> > >>>> will greatly improve (I think) all aspects of what you mentioned.
>>> > >>>> The issues (and associated projects) will gain new capabilities:
>>> > >>>>
>>> > >>>> * structured metadata that you will be able to define (much better
>>> > >>>> than unstructured labels)
>>> > >>>> * table-like visualisations which will allow for fast, bulk,
>>> > >>>> keyboard-driven management
>>> > >>>> * better automation of workflows
>>> > >>>> * complete APIs to manage the issues (good for GitHub Actions
>>> > >>>> integration for example)
>>> > >>>>
>>> > >>>> Re: assigning by non-committers: this is one of the things that won't
>>> > >>>> work currently. Only committers can assign issues, and only to a user
>>> > >>>> who has commented on the issue. But it works nicely: when a user
>>> > >>>> comments "I want to work on that issue", a committer assigns the user.
>>> > >>>> And it could be easily automated as well.
>>> > >>>>
>>> > >>>> You can see what it is about here:
>>> https://github.com/features/issues
>>> > >>>>
>>> > >>>> They are currently in "Public Beta" and heading towards General
>>> > >>>> Availability, but they are not available to "open" projects yet.
>>> > >>>> However, I have a promise from the GitHub Product manager (my friend
>>> > >>>> heads the team implementing it) that ASF will be first on the list
>>> > >>>> when public projects are enabled, because it looks like it will make
>>> > >>>> our triaging and organisation much better.
>>> > >>>>
>>> > >>>> J.
>>> > >>>>
>>> > >>>> On Sat, Dec 4, 2021 at 1:46 AM Kenneth Knowles <k...@apache.org>
>>> wrote:
>>> > >>>>>
>>> > >>>>> This sounds really good to me. Much more familiar to newcomers.
>>> I think we end up doing a lot more ad hoc stuff with labels, yes? Probably
>>> worth having a specific plan. Things I care about:
>>> > >>>>>
>>> > >>>>> - priorities with documented meaning
>>> > >>>>> - targeting issues to future releases
>>> > >>>>> - basic visualizations (mainly total vs open issues over time)
>>> > >>>>> - tags / components
>>> > >>>>> - editing/assigning by non-committers
>>> > >>>>> - workflow supporting "needs triage" (default) -> open ->
>>> resolved
>>> > >>>>>
>>> > >>>>> I think a lot of the above is done via ad hoc labels but I'm not
>>> sure if there are other fancy ways to do it.
>>> > >>>>>
>>> > >>>>> Anyhow we should switch even if there is a feature gap for the
>>> sake of community.
>>> > >>>>>
>>> > >>>>> Kenn
>>> > >>>>>
>>> > >>>>> On Fri, Dec 3, 2021 at 3:06 PM David Huntsperger <
>>> dhuntsper...@google.com> wrote:
>>> > >>>>>>
>>> > >>>>>> Yes, please. I can help clean up the website issues as part of
>>> a migration.
>>> > >>>>>>
>>> > >>>>>> On Fri, Dec 3, 2021 at 1:46 PM Robert Burke <rob...@frantil.com>
>>> wrote:
>>> > >>>>>>>
>>> > >>>>>>> A similar thing happened when Go migrated to using GH issues for
>>> everything from Language Feature proposals to bugs. Much easier than the
>>> very gerrit driven process it was before, and User Discussions are far more
>>> discoverable by users: they usually already have a GH account, and don't
>>> need to create a new separate one.
>>> > >>>>>>>
>>> > >>>>>>> GitHub does seem to permit user-directed templates for issues,
>>> so we can simplify issue triage by users. E.g., for Go there are a number of
>>> requests one can make: https://github.com/golang/go/issues/new/choose
>>> > >>>>>>>
>>> > >>>>>>> On Fri, Dec 3, 2021, 12:17 PM Andy Ye <yea...@google.com>
>>> wrote:
>>> > >>>>>>>>
>>> > >>>>>>>> Chiming in from the perspective of a new Beam contributor. +1
>>> on Github issues. I feel like it would be easier to learn about and
>>> contribute to existing issues/bugs if they were tracked in the same place as
>>> the source code, rather than bouncing back and forth between
>>> two different sites.
>>> > >>>>>>>>
>>> > >>>>>>>> On Fri, Dec 3, 2021 at 1:18 PM Jarek Potiuk <ja...@potiuk.com>
>>> wrote:
>>> > >>>>>>>>>
>>> > >>>>>>>>> Comment from a friendly outsider.
>>> > >>>>>>>>>
>>> > >>>>>>>>> TL;DR: Yes. Do migrate. Highly recommended.
>>> > >>>>>>>>>
>>> > >>>>>>>>> There were already similar discussions happening recently
>>> (community
>>> > >>>>>>>>> and infra mailing lists) and as a result I captured Airflow's
>>> > >>>>>>>>> experiences and recommendations in the BUILD wiki. You might
>>> find some
>>> > >>>>>>>>> hints and suggestions to follow as well as our experiences
>>> at Airflow:
>>> > >>>>>>>>>
>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191332632
>>> > >>>>>>>>>
>>> > >>>>>>>>> J,
>>> > >>>>>>>>>
>>> > >>>>>>>>>
>>> > >>>>>>>>> On Fri, Dec 3, 2021 at 7:46 PM Brian Hulette <
>>> bhule...@google.com> wrote:
>>> > >>>>>>>>>>
>>> > >>>>>>>>>> Hi all,
>>> > >>>>>>>>>> I wanted to start a discussion to gauge interest in moving
>>> our issue tracking from the ASF Jira to GitHub Issues.
>>> > >>>>>>>>>>
>>> > >>>>>>>>>> Pros:
>>> > >>>>>>>>>> + GH Issues is more discoverable and approachable for new
>>> users and contributors.
>>> > >>>>>>>>>> + For contributors at Google: we have tooling to integrate
>>> GH Issues with internal issue tracking, which would help us be more
>>> accountable (Full disclosure: this is the reason I started thinking about
>>> this).
>>> > >>>>>>>>>>
>>> > >>>>>>>>>> Cons:
>>> > >>>>>>>>>> - GH Issues can't be linked to jiras for other ASF projects
>>> (I don't think we do this often in jira anyway).
>>> > >>>>>>>>>> - We would likely need to do a one-time migration of jiras
>>> to GH Issues, and update any processes or automation built on jira (e.g.
>>> release notes).
>>> > >>>>>>>>>> - Anything else?
>>> > >>>>>>>>>>
>>> > >>>>>>>>>> I've always thought that using ASF Jira was a hard
>>> requirement for Apache projects, but that is not the case. Other Apache
>>> projects are using GitHub Issues today, for example the Arrow DataFusion
>>> sub-project uses GitHub issues now [1,2] and Airflow migrated from jira [3]
>>> to GitHub issues [4].
>>> > >>>>>>>>>>
>>> > >>>>>>>>>> [1]
>>> https://lists.apache.org/thread/w3dr1vlt9115r3x9m7bprmo4zpnog483
>>> > >>>>>>>>>> [2] https://github.com/apache/arrow-datafusion/issues
>>> > >>>>>>>>>> [3] https://issues.apache.org/jira/projects/AIRFLOW/issues
>>> > >>>>>>>>>> [4] https://github.com/apache/airflow/issues
>>> > >>>
>>> > >>>
>>> >
>>>
>>>
