Hi all,

Is there a consensus to migrate to GitHub?

On Wed, Dec 15, 2021 at 9:17 AM Brian Hulette <bhule...@google.com> wrote:
>
> On Tue, Dec 14, 2021 at 1:14 PM Kenneth Knowles <k...@google.com> wrote:
>
>> On Thu, Dec 9, 2021 at 11:50 PM Jean-Baptiste Onofre <j...@nanthrax.net> wrote:
>>
>>> Hi,
>>>
>>> No problem for me. The only thing I don’t like with GitHub issues is the fact that it’s not possible to “assign” several milestones to an issue. When we maintain several active branches/versions, it sucks (one issue == one milestone), as we have to create several issues.
>>
>> This is a good point to consider. In Beam we often create multiple issues anyhow when we intend to backport/cherrypick a fix. One issue for the original fix and one for each targeted cherrypick. This way their resolution status can be tracked separately. But it is nice for users to be able to go back and edit the original bug report to say which versions are affected and which are not.
>
> I looked into this a little bit. It looks like milestones don't have to represent a release (e.g. they could represent some abstract goal), but they are often associated with releases. This seems like a reasonable field to map to "Fix Version/s" in jira, but jira does support specifying multiple releases. So one issue == one milestone would be a regression. As Kenn pointed out though we often create a separate jira to track backports anyway (even though we could just specify multiple fix versions), so I'm not sure this is a significant blocker.
>
> If we want to use milestones to track abstract goals, I think we'd be out of luck. We could just use labels, but the GitHub UI doesn't present a nice burndown chart for those. See https://github.com/pandas-dev/pandas/milestones vs. https://github.com/pandas-dev/pandas/labels. FWIW jira doesn't have great functionality here either.
>
>> Kenn
>>
>>> Regards
>>> JB
>>>
>>> > On Dec 10, 2021, at 01:28, Kyle Weaver <kcwea...@google.com> wrote:
>>> >
>>> > I’m in favor of switching to Github issues. I can’t think of a single thing jira does better.
>>> >
>>> > Thanks Jarek, this is a really great resource [1]. For another reference, the Calcite project is engaged in the same discussion right now [2]. I came up with many of the same points independently before I saw their thread.
>>> >
>>> > When evaluating feature parity, we should make a distinction between non-structured (text) and structured data. And we don’t need a strict mechanical mapping for everything unless we’re planning on automatically migrating all existing issues. I don’t see the point in automatic migration, though; as Jarek pointed out, we’d end up perpetuating a ton of obsolete issues.
>>> >
>>> > • We use nested issues and issue relations in jira, but as far as I know robots don’t use them and we don’t query them much, so we’re not losing anything by moving from an API to plain English descriptions: “This issue is blocked by issue #n.” Mentions show up automatically on other issues.
>>> > • For component, type, priority, etc., we can use Github labels.
>>> > • Version(s) affected is used inconsistently, and as far as I know only by humans, so a simple English description is fine. We can follow the example of other projects and make the version affected a part of the issue template.
>>> > • For fix version, which we use to track which issues we want to fix in upcoming releases, as well as automatically generate release notes: Github has “milestones,” which can be marked on PRs or issues, or both.
>>> >     • IMO the automatically generated JIRA release notes are not especially useful anyway. They are too detailed for a quick summary, and not precise enough to show everything. For a readable summary, we use CHANGES.md to highlight changes we especially want users to know about. For a complete list of changes, there’s the git commit log, which is the ultimate source of truth.
>>> > • We’d only want to preserve reporter and assignee if we’re planning on migrating everything automatically, and even then I think it’d be fine to compile a map of active contributors and drop the rest.
>>> >
>>> > As for the advantages of switching (just the ones off the top of my head):
>>> > • As others have mentioned, it’s less burden for new contributors to create new issues and comment on existing ones.
>>> > • Effortless linking between issues and PRs.
>>> >     • Github -> jira links were working for a short while, but they seem to be broken at the moment.
>>> >     • Jira -> github links only show: “links to GitHub Pull Request #xxxxx”. They don’t say the status of the PR, so you have to follow the link to find out. Especially inconvenient when one jira maps to several PRs, and you have to open all the links to get a summary of what work was done.
>>> >     • When you mention a GH issue in a pull request, a link to the PR will automatically appear on the issue, including not just the ID but also the PR’s description and status (open/closed/draft/merged/etc.), and if you hover it will show a preview as well.
>>> >     • We frequently merge a PR and then forget to mark the jira as closed. Whereas if a PR is linked to a GH issue using the “closes” keyword, the GH issue will automatically be closed [3].
>>> > • I don’t have to look up or guess whether a github account and jira account belong to the same person.
>>> > • There’s a single unified search bar to find issues, PRs, and code.
>>> > • Github enables markdown formatting everywhere, which is more or less the industry standard, whereas Jira has its own bespoke system [4].
>>> > • In GH issues, links to Github code snippets will automatically display the code snippet inline.
>>> > • GH labels are scoped to each project, whereas ASF Jira labels are an unmanageable, infinitely growing namespace (see “flake,” “flaky,” “flakey,” “Flaky,” “flaky-test”...).
>>> >
>>> > [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191332632
>>> > [2] https://mail-archives.apache.org/mod_mbox/calcite-dev/202112.mbox/%3CCAB%3DJe-EuaijDjwb6umU_N2TaqFZawE%2BUbgZAgZYvrgPFypfAYQ%40mail.gmail.com%3E
>>> > [3] https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword
>>> > [4] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191332632
>>> >     https://jira.atlassian.com/secure/WikiRendererHelpAction.jspa?section=all
>>> >
>>> >
>>> > On Wed, Dec 8, 2021 at 9:13 AM Alexey Romanenko <aromanenko....@gmail.com> wrote:
>>> > Many thanks for the details, Jarek!
>>> >
>>> > Actually, your experience proves that a full data transfer is very expensive (if even possible) and not necessary, especially given that the number of Beam Jira issues is a couple of orders of magnitude larger than Airflow's. So it is very likely that we will end up living with two issue trackers, at least for some time, to avoid duplicating issues and to keep access to the old ones. This can be very confusing.
>>> >
>>> > At the same time, apart from the “one tool for everything” argument, which is quite strong for sure, I don’t see any other advantages of GH issues over Jira issues. Also, it is more important not to lose what we have now, as Jan mentioned below.
>>> >
>>> > So, my vote for now is -0 since it has significant pros and cons and the final impact is not evident.
>>> >
>>> > --
>>> > Alexey
>>> >
>>> > > On 8 Dec 2021, at 01:38, Jarek Potiuk <ja...@potiuk.com> wrote:
>>> > >
>>> > >> Do I understand correctly that this transition (if it happens) includes the transfer of the entire Beam Jira archive to GitHub issues with proper statuses/comments/refs/etc.? If not, what are the options?
>>> > >
>>> > > Suggestion from the experience of Airflow again - you can look it up in our notes.
>>> > >
>>> > > We initially tried to copy the issues manually or in bulk, but eventually we decided to tap into the wisdom and cooperation of our community.
>>> > >
>>> > > We migrated some (not many) important things only and asked our users to move the important issues if they think they are still relevant/important to them. We closed the JIRA for entry and left the issues in JIRA in read-only state so that we could always refer to them if needed.
>>> > >
>>> > > So rather than proactively copy the issues, we asked the users to decide which issues are important to them and proactively move them, and we left an option of reactive moving if someone came back to an issue later.
>>> > >
>>> > > That turned out to be a smart decision considering the effort it would require to smartly move the issues vs. the results achieved. And it helped us to clean out some "stale/useless/not important" issues.
>>> > >
>>> > > We had 1719 open JIRA issues when we migrated. Over the course of ~1.5 years (since about April 2020) we've had ~140 issues that refer to any of the JIRA issues:
>>> > > https://github.com/apache/airflow/issues?q=is%3Aissue+is%3Aclosed+%22https%3A%2F%2Fissues.apache.org%2Fjira%22+
>>> > > Currently we have > 4500 GH issues (3700 closed, 800 open).
>>> > >
>>> > > This means that, roughly speaking, fewer than 10% of the original open JIRA issues were actually somewhat valuable, and they amount to fewer than 5% of today's numbers. Of course some of the new GH issues duplicated those JIRA ones. But not many I think, especially since those issues in JIRA referred mostly to older Airflow versions.
>>> > >
>>> > > One more comment on the migration: I STRONGLY recommend using well-designed templates for GH issues from day one, which significantly improves the quality of issues, and using Discussions as the place where you move unclear/non-reproducible issues (for example, guiding users to Discussions if they have no clearly reproducible case). This significantly reduces the "bad issue overload" (see also the more detailed comments in https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191332632 ).
>>> > >
>>> > > I personally think a well-designed issue entry process for new issues is more important than migrating old issues in bulk. Especially if you ask users to help, as they will have to make a structured entry (with potentially more detailed information/reproducibility), or they will decide themselves that opening a GitHub discussion is better than opening an issue if they do not have a reproducible case. Or they will give up if too much information is needed (but this means that their issue is essentially not that important IMHO).
>>> > >
>>> > > But this is just friendly advice from the experience of those who did it quite some time ago :)
>>> > >
>>> > > J.
>>> > >
>>> > > On Wed, Dec 8, 2021 at 1:08 AM Brian Hulette <bhule...@google.com> wrote:
>>> > >>
>>> > >> At this point I just wanted to see if the community is interested in such a change or if there are any hard blockers. If we do go down this path I think we should port jiras over to GH Issues. You're right that this isn't trivial: there's no ready-made solution we can use, so we'd need to decide on a mapping for everything and write a tool to do the migration. It sounds like there may be other work in this area we can build on (e.g. Airflow may have made a tool we can work from?).
>>> > >>
>>> > >> I honestly don't have much experience with GH Issues so I can't provide concrete examples of better usability (maybe Jarek can?). From my perspective:
>>> > >> - I hear a lot of grumbling about jira, and a lot of praise for GitHub Issues.
>>> > >> - Most new users/contributors already have a GitHub account, and very few already have an ASF account. It sounds silly, but I'm sure this is a barrier for engaging with the community. Filing an issue, or commenting on one to provide additional context, or asking a clarifying question about a starter task should be very quick and easy - I bet a lot of these interactions are blocked at the jira registration page.
>>> > >>
>>> > >> Brian
>>> > >>
>>> > >> On Tue, Dec 7, 2021 at 9:04 AM Alexey Romanenko <aromanenko....@gmail.com> wrote:
>>> > >>>
>>> > >>> Do I understand correctly that this transition (if it happens) includes the transfer of the entire Beam Jira archive to GitHub issues with proper statuses/comments/refs/etc.? If not, what are the options?
>>> > >>>
>>> > >>> Since this transfer looks quite complicated at first glance, what are the real key advantages (some concrete examples would be very much appreciated) of initiating this process, and what are the show-stoppers for us with the current Jira workflow?
>>> > >>>
>>> > >>> --
>>> > >>> Alexey
>>> > >>>
>>> > >>> On 6 Dec 2021, at 19:48, Udi Meiri <eh...@google.com> wrote:
>>> > >>>
>>> > >>> +1 on migrating to GH issues.
>>> > >>> We will need to update our release process. Hopefully it'll make it simpler.
>>> > >>>
>>> > >>>
>>> > >>> On Sat, Dec 4, 2021 at 2:35 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>> > >>>>
>>> > >>>> Just to add a comment on those requirements Kenneth, looking into the near future.
>>> > >>>>
>>> > >>>> Soon GitHub issues will open for GA a whole new way of interacting with the issues (without removing the current way) which will greatly improve (I think) all aspects of what you mentioned. The issues (and associated projects) will gain new capabilities:
>>> > >>>>
>>> > >>>> * structured metadata that you will be able to define (much better than unstructured labels)
>>> > >>>> * table-like visualisations which will allow for fast, bulk, keyboard-driven management
>>> > >>>> * better automation of workflows
>>> > >>>> * complete APIs to manage the issues (good for GitHub Actions integration for example)
>>> > >>>>
>>> > >>>> Re: assigning by non-committers, this is one of the things that won't work currently. Only committers can assign the issues, and only to a user who has commented on the issue. But it works nicely - when a user comments "I want to work on that issue", a committer assigns the user. And it could be easily automated as well.
>>> > >>>>
>>> > >>>> You can see what it is about here: https://github.com/features/issues
>>> > >>>>
>>> > >>>> They are currently at the "Public Beta" stage and heading towards General Availability, but it is not available to "open" projects yet. However I have a promise from the GitHub Product manager (my friend heads the team implementing it) that ASF will be the first on the list when public projects are enabled, because it looks like it will make our triaging and organisation much better.
>>> > >>>>
>>> > >>>> J.
>>> > >>>>
>>> > >>>> On Sat, Dec 4, 2021 at 1:46 AM Kenneth Knowles <k...@apache.org> wrote:
>>> > >>>>>
>>> > >>>>> This sounds really good to me. Much more familiar to newcomers. I think we end up doing a lot more ad hoc stuff with labels, yes? Probably worth having a specific plan. Things I care about:
>>> > >>>>>
>>> > >>>>> - priorities with documented meaning
>>> > >>>>> - targeting issues to future releases
>>> > >>>>> - basic visualizations (mainly total vs open issues over time)
>>> > >>>>> - tags / components
>>> > >>>>> - editing/assigning by non-committers
>>> > >>>>> - workflow supporting "needs triage" (default) -> open -> resolved
>>> > >>>>>
>>> > >>>>> I think a lot of the above is done via ad hoc labels but I'm not sure if there are other fancy ways to do it.
>>> > >>>>>
>>> > >>>>> Anyhow we should switch even if there is a feature gap for the sake of community.
>>> > >>>>>
>>> > >>>>> Kenn
>>> > >>>>>
>>> > >>>>> On Fri, Dec 3, 2021 at 3:06 PM David Huntsperger <dhuntsper...@google.com> wrote:
>>> > >>>>>>
>>> > >>>>>> Yes, please. I can help clean up the website issues as part of a migration.
>>> > >>>>>>
>>> > >>>>>> On Fri, Dec 3, 2021 at 1:46 PM Robert Burke <rob...@frantil.com> wrote:
>>> > >>>>>>>
>>> > >>>>>>> Similar thing happened for Go migrating to use GH issues for everything from Language Feature proposals to bugs. Much easier than the very gerrit driven process it was before, and User Discussions are far more discoverable by users: they usually already have a GH account, and don't need to create a new separate one.
>>> > >>>>>>>
>>> > >>>>>>> GitHub does seem to permit user-directed templates for issues so we can simplify issue triage by users. E.g. for Go there are a number of requests one can make: https://github.com/golang/go/issues/new/choose
>>> > >>>>>>>
>>> > >>>>>>> On Fri, Dec 3, 2021, 12:17 PM Andy Ye <yea...@google.com> wrote:
>>> > >>>>>>>>
>>> > >>>>>>>> Chiming in from the perspective of a new Beam contributor. +1 on Github issues. I feel like it would be easier to learn about and contribute to existing issues/bugs if they were tracked in the same place as the source code, rather than bouncing back and forth between the two different sites.
>>> > >>>>>>>>
>>> > >>>>>>>> On Fri, Dec 3, 2021 at 1:18 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>> > >>>>>>>>>
>>> > >>>>>>>>> Comment from a friendly outsider.
>>> > >>>>>>>>>
>>> > >>>>>>>>> TL;DR: Yes. Do migrate. Highly recommended.
>>> > >>>>>>>>>
>>> > >>>>>>>>> There were already similar discussions happening recently (community and infra mailing lists) and as a result I captured Airflow's experiences and recommendations in the BUILD wiki. You might find some hints and suggestions to follow as well as our experiences at Airflow:
>>> > >>>>>>>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191332632
>>> > >>>>>>>>>
>>> > >>>>>>>>> J.
>>> > >>>>>>>>>
>>> > >>>>>>>>>
>>> > >>>>>>>>> On Fri, Dec 3, 2021 at 7:46 PM Brian Hulette <bhule...@google.com> wrote:
>>> > >>>>>>>>>>
>>> > >>>>>>>>>> Hi all,
>>> > >>>>>>>>>> I wanted to start a discussion to gauge interest on moving our issue tracking from the ASF Jira to GitHub Issues.
>>> > >>>>>>>>>>
>>> > >>>>>>>>>> Pros:
>>> > >>>>>>>>>> + GH Issues is more discoverable and approachable for new users and contributors.
>>> > >>>>>>>>>> + For contributors at Google: we have tooling to integrate GH Issues with internal issue tracking, which would help us be more accountable (Full disclosure: this is the reason I started thinking about this).
>>> > >>>>>>>>>>
>>> > >>>>>>>>>> Cons:
>>> > >>>>>>>>>> - GH Issues can't be linked to jiras for other ASF projects (I don't think we do this often in jira anyway).
>>> > >>>>>>>>>> - We would likely need to do a one-time migration of jiras to GH Issues, and update any processes or automation built on jira (e.g. release notes).
>>> > >>>>>>>>>> - Anything else?
>>> > >>>>>>>>>>
>>> > >>>>>>>>>> I've always thought that using ASF Jira was a hard requirement for Apache projects, but that is not the case. Other Apache projects are using GitHub Issues today, for example the Arrow DataFusion sub-project uses GitHub issues now [1,2] and Airflow migrated from jira [3] to GitHub issues [4].
>>> > >>>>>>>>>>
>>> > >>>>>>>>>> [1] https://lists.apache.org/thread/w3dr1vlt9115r3x9m7bprmo4zpnog483
>>> > >>>>>>>>>> [2] https://github.com/apache/arrow-datafusion/issues
>>> > >>>>>>>>>> [3] https://issues.apache.org/jira/projects/AIRFLOW/issues
>>> > >>>>>>>>>> [4] https://github.com/apache/airflow/issues
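
For anyone who wants to sanity-check the Airflow figures Jarek quotes in the thread (1719 open JIRA issues at migration, ~140 GH issues referring back to JIRA, > 4500 GH issues today), here is a minimal sketch of the arithmetic. The variable and function names are mine and purely illustrative, and the inputs are the approximate values from his message, so treat the output as rough.

```python
# Approximate figures quoted in the thread for Airflow's migration.
open_jira_issues_at_migration = 1719
gh_issues_referencing_jira = 140   # "~140" in the thread
total_gh_issues_today = 4500       # "> 4500" in the thread, so a lower bound


def share(part: int, whole: int) -> float:
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


# Share of the old open JIRA issues that were referenced again on GitHub.
carried_over = share(gh_issues_referencing_jira, open_jira_issues_at_migration)

# Share of today's GH issues that refer back to a JIRA issue.
of_todays_total = share(gh_issues_referencing_jira, total_gh_issues_today)

print(f"referenced again after migration: {carried_over:.1f}%")
print(f"share of today's GH issues: {of_todays_total:.1f}%")
```

Both values come out under the "< 10%" and "< 5%" bounds stated in the thread.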