Yes, thank you! You say this is not difficult, but it looks like a big job to me! Here are a bunch of things I noticed that we would ideally address (from looking at one long and complex issue, LUCENE-9004). I wouldn't be so bold as to say these should block us from proceeding if they're not addressed; I just want to point out that there is potentially a lot to do:

* Will it be possible to preserve links from issues -> pull requests? That
  seems like one of the most important pieces of "metadata".
* As far as attached files go, I see you seem to have made an attempt? There
  is a link in https://issues.apache.org/jira/browse/LUCENE-9004 where you had
  posted a picture of a graph, for example; in
  https://github.com/mocobeta/sandbox-lucene-10557/issues/171 it is
  represented as a link. When you click on the link you get an error though. I
  wonder if it would be possible to link back to the images hosted in JIRA?
  (Ideally as an <IMG> tag; otherwise a link would be good).
* I agree with Ryan - I'd be willing to bulk-delete 1000 notifications if it
  means preserving hyperlinks to people
* Numbered list formatting became giant bold text (see the comment containing
  "Here's a strategy")
* Many comments were lost in the transfer. The last one in the copy is only
  about 1/4 of the way through this gigantic issue. This really is a blocker I
  think. I wonder what happened? Maybe some API calls failed and we need to
  retry???
* I wanted to check other fancy formatting (tables, block comments, code
  blocks, etc) but haven't looked at these yet...

On Wed, Jun 22, 2022 at 8:34 AM Ryan Ernst <r...@iernst.net> wrote:
>
> This is great work Tomoko! A couple minor thoughts:
>
> * I don’t think a flood of notifications from the import is a problem. It’s a
>   one time hassle, and having the actual user links is nice for GitHub’s cross
>   linking system.
>
> * Do you have an estimate for how many api calls are needed? How many total
>   issues+comments exist in jira? I assume the limits you dealt with were the
>   default 5k requests per hour. If that will take too long, we could consider
>   using a user from an enterprise account which has 3x the limit.
>
> On Tue, Jun 21, 2022 at 15:56 Tomoko Uchida <tomoko.uchida.1...@gmail.com>
> wrote:
>>
>> Hi all,
>> again - this is about GitHub migration.
>>
>> We have a large disagreement on whether we should migrate existing Jira
>> issues (including all closed issues) to GitHub or not.
>>
>> I drafted a tiny migration tool [1] to see how it looks if we move Jira
>> issues to GitHub, and tried to migrate a small portion of Jira
>> issues/comments to a test repo. You can see it here:
>> - Closed issues list
>>   https://github.com/mocobeta/sandbox-lucene-10557/issues?q=is%3Aissue+is%3Aclosed
>> - Unresolved issues list:
>>   https://github.com/mocobeta/sandbox-lucene-10557/issues
>>
>> I don't deserve to have a strong opinion on how we should treat 20+ years of
>> history so I would reserve my opinion - would the prototype be of some help
>> to have a conversation?
>> I have to leave for a while, I'd be glad if you have a talk on it while I'm
>> away and hopefully reach an agreement.
>>
>> Here's a summary of what can be done.
>>
>> You can:
>> * migrate all texts in issue descriptions and comments to GitHub;
>>   browsing/searching old issues should work fine.
>> * extract every issue metadata from Jira and port it to labels or issue
>>   descriptions (as plain text).
>> * map Jira cross-issue link "LUCENE-xxx" to GitHub issue mention "#yyy".
>>   * see this example:
>>     https://github.com/mocobeta/sandbox-lucene-10557/issues/218
>> * map Jira user ids to GitHub accounts if the mapping is given.
>> * convert Jira markups to Markdown with parser library.
>>   * not perfect - there can be many conversion errors
>>
>> And here are the limitations. (Please correct me if I'm missing something.)
>>
>> You cannot:
>> * simulate original authors and timestamps; they have to be preserved in
>>   free-text forms.
>> * migrate attached files (patches, images, etc.) to GitHub; these have to
>>   remain in Jira.
>>   * it's not allowed to programmatically upload files and attach them to
>>     issues.
>> * create hyperlinks from issues to GitHub accounts (reporters, comment
>>   authors, etc.)
>>   by mentions; otherwise everyone will receive a huge volume of
>>   notifications.
>>   * still accounts can be noted with a markup `@xxxx` (without mentioning)
>>     in their right place
>> * "bulk" import issues/comments. Each resource has to be posted one by one.
>>   Migration would take many hours (perhaps days?) due to the severe API call
>>   rate limit.
>>
>> It's not a particularly difficult task, however, there will be other
>> uncontrollable things I haven't noticed yet.
>>
>> [1] https://github.com/mocobeta/sandbox-lucene-10557/tree/main/migration
>>
>> Tomoko
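
P.S. On the lost comments and my guess that some API calls failed: one way an import could guard against that is to wrap each post in a retry with exponential backoff and keep a list of whatever still fails, so it can be re-run later. This is only a rough sketch and not what the migration tool in [1] does. `call_with_retry`, `post_all` and the `post_comment` callable are hypothetical names, and I'm assuming a failed API call shows up as an exception.

```python
import time
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def call_with_retry(
    post: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call post() until it succeeds or max_attempts is used up.

    The wait doubles after each failure: base_delay, 2 * base_delay, ...
    `sleep` is injectable so the backoff can be tested without waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return post()
        except Exception:
            if attempt == max_attempts:
                raise  # give up and let the caller record the failure
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


def post_all(
    comments: Sequence[str],
    post_comment: Callable[[str], object],
    **retry_opts,
) -> List[int]:
    """Post every comment in order; return the indexes that never succeeded.

    `post_comment` is a hypothetical stand-in for whatever function
    actually sends one comment to the target repository.
    """
    failed: List[int] = []
    for index, body in enumerate(comments):
        try:
            call_with_retry(lambda: post_comment(body), **retry_opts)
        except Exception:
            failed.append(index)
    return failed
```

If `post_all` returns a non-empty list, those comments could be retried in a second pass instead of being silently dropped, which would at least tell us where the copy of LUCENE-9004 stopped.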