I looked at some related projects on GitHub:
https://github.com/Skraeda/jira-2-github

It does only the barebones basics, but it helps you think of the inputs: "username mapping", "release -> milestone mapping", etc. Of course, for a username mapping, maybe it's best to just handle the top 99% or so and let the long tail just come across as "full name".

I also found plenty of projects that convert "special jira language" to markdown, e.g. https://github.com/catcombo/jira2markdown

I'm not convinced the conversion would be degraded. With a little bit of thought put into it, I think it could actually be *better*. GitHub issues can do everything Jira can, just without the fussy UI. E.g. issues can have attachments (for all the patch files), and attachment names can have duplicates. Issues can link to other issues, commits, or PRs easily.
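To make the username mapping idea concrete, here is a minimal sketch, not a real migration script. The names `JIRA_TO_GITHUB` and `render_user` and the sample accounts are made up for illustration; the only point is the fallback to the full name for anyone outside the hand-curated mapping.

```python
# Hypothetical mapping from Jira usernames to GitHub handles. A real one
# would be collected by hand for the most active contributors only.
JIRA_TO_GITHUB = {
    "jira_alice": "alice-gh",
    "jira_bob": "bob-gh",
}


def render_user(jira_username, full_name):
    """Return an @-mention if the user is mapped, else the plain full name."""
    handle = JIRA_TO_GITHUB.get(jira_username)
    if handle is not None:
        return "@" + handle
    # Long tail: no mapping available, so the name comes across as text.
    return full_name
```

With this shape, a converter would call `render_user(...)` wherever a reporter, assignee, or commenter is written into the migrated issue body.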
It just depends on how much we want to invest into it. If we want to really go whole-hog, then when we do the initial JIRA->issue conversion, we should *save that mapping* as a .CSV file or similar. Later we could then use it to find/replace URLs in Changes.txt, source code, benchmark annotations, etc. Let's at least leave the possibility open to do that work as a followup.

I find the idea that we're stuck looking at JIRA forever ridiculous.

On Sat, Jun 18, 2022 at 3:19 AM Dawid Weiss <dawid.we...@gmail.com> wrote:
>
>
> I honestly don't know what can be done and what has to be sacrificed. I'm
> pretty sure it'll be more difficult than svn->git conversion because more
> factors are involved. One tough thing to somehow preserve may be user names
> (reporters, etc.). I'm not sure how other projects dealt with that.
>
> Perhaps a way to do it incrementally would be to create a json/xml
> (structured) dump of jira content and then write a converter into a similar
> json/xml dump for importing into github. I remember it took many iterations
> and trial and error for svn->git conversion to eventually reach the final
> shape and it was simpler and faster to do it locally.
>
> Dawid
>
> On Sat, Jun 18, 2022 at 8:59 AM Tomoko Uchida <tomoko.uchida.1...@gmail.com>
> wrote:
>>
>> I'll give it a try, though I'm really skeptical that it can be done
>> with a satisfactory level of quality (we want to "preserve" issue
>> history, not just to have shallow/degraded copies, right?), and the
>> migration will be significantly delayed to figure out the way to
>> properly move all issues to GitHub.
>> If there is another way to bypass this challenge - please let me know.
>>
>> Tomoko
>>
>> On Sat, Jun 18, 2022 at 15:44 Dawid Weiss <dawid.we...@gmail.com> wrote:
>>
>> >
>> >
>> > Hi Tomoko,
>> >
>> > I've added a few bullet points that script could/should handle under
>> > LUCENE-10557, hope you don't mind.
>> > If you place these script(s) in the open then perhaps indeed we could
>> > try to collaborate and see what can be done.
>> >
>> > Dawid
>> >
>> > On Sat, Jun 18, 2022 at 5:33 AM Tomoko Uchida
>> > <tomoko.uchida.1...@gmail.com> wrote:
>> >>
>> >> Replying to myself - Jira issues can be read via REST API without any
>> >> access token and we can iterate all issues by issue number.
>> >> curl -s https://issues.apache.org/jira/rest/api/latest/issue/LUCENE-10557
>> >>
>> >> Would you please hold the discussion for a while - to me, it's a waste
>> >> of our time without a working prototype. I will be back here with a
>> >> sandbox github repo where part of existing jira issues are migrated
>> >> (with the best effort).
>> >> In the process, we could simultaneously figure out the way to operate
>> >> GitHub metadata (milestones/labels).
>> >>
>> >> Tomoko
>> >>
>> >> On Sat, Jun 18, 2022 at 10:41 Tomoko Uchida <tomoko.uchida.1...@gmail.com> wrote:
>> >>
>> >> >
>> >> > Does anyone have information on API access keys to Jira (preferably,
>> >> > read-only and limited to Lucene project)?
>> >> > https://issues.apache.org/jira/browse/LUCENE-10622
>> >> >
>> >> > On Sat, Jun 18, 2022 at 10:11 Tomoko Uchida <tomoko.uchida.1...@gmail.com> wrote:
>> >> > >
>> >> > > I feel like we should delay the decision on the migration of existing
>> >> > > issues until we have a clear image of what can be done and what cannot
>> >> > > be done.
>> >> > >
>> >> > > I'll write some migration script that preserves the issue history as
>> >> > > far as possible - then come back here with some experience.
>> >> > > Let's make a decision based on concrete knowledge and information.
>> >> > >
>> >> > > Tomoko
>> >> > >
>> >> > > On Sat, Jun 18, 2022 at 9:26 Tomoko Uchida <tomoko.uchida.1...@gmail.com> wrote:
>> >> > > >
>> >> > > > I don't intend to neglect histories in Jira... it's an important,
>> >> > > > valuable asset for all of us and possible contributors in the
>> >> > > > future.
>> >> > > >
>> >> > > > It's important, *therefore*, I don't want to have the degraded
>> >> > > > copies of them on GitHub.
>> >> > > > We cannot preserve all of history - again, there should be tons of
>> >> > > > non-negligible information losses (timestamp, reporter, assignee,
>> >> > > > markdown, metadata that cannot be ported to GitHub) if we attempt to
>> >> > > > migrate the whole Jira history into Github. Rather than trying to
>> >> > > > have such incomplete copies, I would preserve Jira issues in the
>> >> > > > perfectly archived status, then simply refer to them.
>> >> > > >
>> >> > > > Tomoko
>> >> > > >
>> >> > > > On Sat, Jun 18, 2022 at 7:47 Gus Heck <gus.h...@gmail.com> wrote:
>> >> > > > >
>> >> > > > > I hope you count me as someone who sees history as important.
>> >> > > > > It's important in more ways than one however. You gave the
>> >> > > > > example of trying to understand something, and looking at the
>> >> > > > > issue history directly. I also give weight to the scenario where
>> >> > > > > someone has written a blog post about the topic and linked the
>> >> > > > > issue "For the latest see LUCENE-XXXX" for example... Or someone
>> >> > > > > planning upgrades has a spreadsheet of things to track down...
>> >> > > > > The existing links should point to a *complete* history of the
>> >> > > > > issue.
>> >> > > > >
>> >> > > > > I don't see the migration of everything to github as being as
>> >> > > > > critical as you do but I'm not at all against migrating things
>> >> > > > > that are closed if someone wants to do that work, and perhaps
>> >> > > > > even copying over existing open issues periodically as they
>> >> > > > > become closed (and accelerating the close rate by aggressive
>> >> > > > > closing of silent issues). No new issues in Jira sounds fine,
>> >> > > > > even better if enforced by Jira. Proceed from here in Github
>> >> > > > > since that's where the community wants to go.
>> >> > > > > Links to the migrated version automatically added to Jira and/or
>> >> > > > > backlinks to Jira would be just fine too since readers might
>> >> > > > > (hopefully needlessly) worry that something didn't get migrated,
>> >> > > > > we should make it easy to check.
>> >> > > > >
>> >> > > > > What I don't want is for someone to land on an issue via link or
>> >> > > > > via google search (or via search in jira because they are using
>> >> > > > > Jira already for some other apache project), read through it and
>> >> > > > > think A) it never got resolved when it did or B) miss the fact
>> >> > > > > that it got reopened and further changes were made and only have
>> >> > > > > half the story... or any other scenario where they are looking at
>> >> > > > > an incomplete record of the issue. (thus obfuscating/splitting
>> >> > > > > the very important rich history across systems).
>> >> > > > >
>> >> > > > > So that's why I feel issues should be completely tracked in the
>> >> > > > > system where they were created. Syncing old closed stuff into a
>> >> > > > > new system probably is fine so long as there are periodic sweeps
>> >> > > > > to pull in reopens or newly completed issues. We could even sync
>> >> > > > > open things so long as they are clearly marked in the title as
>> >> > > > > having their primary record in Jira and "last synced from JIRA on
>> >> > > > > YYYY-MM-DD" or something in a final comment each time new content
>> >> > > > > is brought over.
>> >> > > > >
>> >> > > > > For simplicity and workload however maybe just sync things when
>> >> > > > > they close. Depends on how much effort the person writing code
>> >> > > > > for syncing things wants to put into it I guess.
>> >> > > > >
>> >> > > > > Although I agree with Dawid on the "What if Elon buys it?" issue,
>> >> > > > > that ship has sailed, the community accepts that risk and we
>> >> > > > > probably should not rehash it.
>> >> > > > >
>> >> > > > > WRT Robert's comments on PRs being issues... this has already
>> >> > > > > worried me because I've already seen a lot of discussion on PRs
>> >> > > > > and I've worried that this stuff has the potential to get lost or
>> >> > > > > be hard to find. If there is one key positive of this move, it is
>> >> > > > > that they will become easier to find since the search in github
>> >> > > > > can find it. I would say that a PR is not a substitute for a well
>> >> > > > > described issue report but that's probably a separate discussion
>> >> > > > > (which I would hope mirrors the policy on small edits like typos
>> >> > > > > or adding comments/javadoc not needing an issue). I've also seen
>> >> > > > > folks who like to clean up and remove old branches and PRs,
>> >> > > > > which is problematic if that's where the important discussion is
>> >> > > > > (possibly a 3rd can of worms there).
>> >> > > > >
>> >> > > > > -Gus
>> >> > > > >
>> >> > > > > On Fri, Jun 17, 2022 at 4:34 PM Robert Muir <rcm...@gmail.com>
>> >> > > > > wrote:
>> >> > > > >>
>> >> > > > >> On Fri, Jun 17, 2022 at 3:27 PM Dawid Weiss
>> >> > > > >> <dawid.we...@gmail.com> wrote:
>> >> > > > >> >
>> >> > > > >> > I'd be more afraid of what happens to github issues in two
>> >> > > > >> > years (or longer). Will it look the same? Will it be
>> >> > > > >> > different? Will it be gone (and how do we get a backup of the
>> >> > > > >> > issue history then)? Contrary to the apache-hosted Jira, github
>> >> > > > >> > is very much an independent entity. If Elon Musk decides to
>> >> > > > >> > buy and close it tomorrow... then what? :)
>> >> > > > >> >
>> >> > > > >>
>> >> > > > >> We already have a ton of github "issues" (pull requests, since
>> >> > > > >> PRs are issues).
>> >> > > > >> If you want to "back them up", it's easy, you can paginate thru
>> >> > > > >> them 100 at a time, e.g.
>> >> > > > >> run this command, incrementing 'page' until it returns an
>> >> > > > >> empty list:
>> >> > > > >>
>> >> > > > >> curl -H "Accept: application/vnd.github.v3+json" \
>> >> > > > >>   "https://api.github.com/repos/apache/lucene/issues?per_page=100&page=1&direction=asc&state=all" \
>> >> > > > >>   > file1.json
>> >> > > > >>
>> >> > > > >> Yeah of course if you want to backup the comments and stuff,
>> >> > > > >> you'll need to do more.
>> >> > > > >> But it is already the case today, that a ton of this "history" is
>> >> > > > >> already in github issues, as PRs. Most recent JIRAs are just
>> >> > > > >> useless placeholders.
>> >> > > > >> Also the same risks apply to JIRA, except there they are not
>> >> > > > >> theoretical but real concerns, no? I thought Atlassian had
>> >> > > > >> deprecated "onsite" JIRA to try to sucker you into their
>> >> > > > >> "Atlassian Cloud":
>> >> > > > >> https://www.theregister.com/2020/10/19/atlassian_server_licenses/
>> >> > > > >>
>> >> > > > >> ---------------------------------------------------------------------
>> >> > > > >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> >> > > > >> For additional commands, e-mail: dev-h...@lucene.apache.org
>> >> > > > >>
>> >> > > > >
>> >> > > > >
>> >> > > > > --
>> >> > > > > http://www.needhamsoftware.com (work)
>> >> > > > > http://www.the111shift.com (play)