Yes, thank you! You say this is not difficult, but it looks like a big
job to me! Here are a bunch of things I noticed that we would ideally
address (from looking at one long and complex issue, LUCENE-9004). I
wouldn't be so bold as to say these should block us from proceeding if
they're not addressed; I just want to point out that there is
potentially a lot to do:

Will it be possible to preserve links from issues -> pull requests?
That seems like one of the most important pieces of "metadata".

As far as attached files go, I see you seem to have made an attempt?
There is a link in https://issues.apache.org/jira/browse/LUCENE-9004
where you had posted a picture of a graph, for example; in
https://github.com/mocobeta/sandbox-lucene-10557/issues/171 it is
represented as a link. When you click on the link you get an error,
though. I wonder if it would be possible to link back to the images
hosted in JIRA? (Ideally as an <IMG> tag; otherwise a plain link would
be good.)
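
To make the suggestion concrete, here is a rough sketch of what I have
in mind. It is not taken from the migration tool; the function name is
made up, and the attachment URL layout is my assumption about how JIRA
serves files, so it should be checked against a real issue first.

```python
from html import escape
from urllib.parse import quote

# Assumed URL layout for attachments on the ASF JIRA; verify before relying on it.
JIRA_ATTACHMENT_BASE = "https://issues.apache.org/jira/secure/attachment"

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg")


def attachment_markup(attachment_id: str, filename: str) -> str:
    """Return an <img> tag for image files, or a Markdown link for anything else.

    Hypothetical helper: attachment_id and filename would come from the
    JIRA issue data that the migration tool already reads.
    """
    url = f"{JIRA_ATTACHMENT_BASE}/{attachment_id}/{quote(filename)}"
    if filename.lower().endswith(IMAGE_EXTENSIONS):
        # Images are shown inline but stay hosted in JIRA.
        return f'<img src="{url}" alt="{escape(filename, quote=True)}">'
    # Patches and other files become ordinary links back to JIRA.
    return f"[{filename}]({url})"
```

With something like this, a picture would render inline in the GitHub
issue while a patch would show up as a clickable link, and in both
cases the file itself would remain in JIRA.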

I agree with Ryan - I'd be willing to bulk-delete 1000 notifications
if it means preserving hyperlinks to people.

Numbered list formatting became giant bold text (see the comment
containing "Here's a strategy").

Many comments were lost in the transfer. The last one in the copy is
only about 1/4 of the way through this gigantic issue. I think this
really is a blocker. I wonder what happened? Maybe some API calls
failed and we need to retry?
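
If failed API calls do turn out to be the cause (I am only guessing),
retrying with a growing delay might be enough. Below is a minimal
sketch, not the tool's actual code: call_with_retry and its parameters
are names I made up, and "request" stands for whatever function posts
a single comment.

```python
import time


def call_with_retry(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request() until it succeeds or max_attempts is reached.

    request is any zero-argument callable that raises an exception on
    failure. The wait doubles after each failed attempt: 1s, 2s, 4s, ...
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except Exception:
            if attempt == max_attempts:
                # Give up loudly rather than silently dropping the comment.
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

The important part is the final re-raise: if a comment still cannot be
posted after all attempts, the run fails visibly instead of producing
a copy of the issue with comments missing.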

I wanted to check other fancy formatting (tables, block comments, code
blocks, etc.) but haven't looked at these yet...

On Wed, Jun 22, 2022 at 8:34 AM Ryan Ernst <r...@iernst.net> wrote:
>
> This is great work Tomoko! A couple minor thoughts:
>
> * I don’t think a flood of notifications from the import is a problem. It’s a 
> one time hassle, and having the actual user links is nice for GitHub’s cross 
> linking system.
>
> * Do you have an estimate for how many api calls are needed? How many total 
> issues+comments exist in jira? I assume the limits you dealt with were the 
> default 5k requests per hour. If that will take too long, we could consider 
> using a user from an enterprise account which has 3x the limit.
>
> On Tue, Jun 21, 2022 at 15:56 Tomoko Uchida <tomoko.uchida.1...@gmail.com> 
> wrote:
>>
>> Hi all,
>> again - this is about GitHub migration.
>>
>> We have a large disagreement on whether we should migrate existing Jira 
>> issues (including all closed issues) to GitHub or not.
>>
>> I drafted a tiny migration tool [1] to see how it looks if we move Jira 
>> issues to GitHub, and tried to migrate a small portion of Jira 
>> issues/comments to a test repo. You can see it here:
>> - Closed issues list 
>> https://github.com/mocobeta/sandbox-lucene-10557/issues?q=is%3Aissue+is%3Aclosed
>> - Unresolved issues list: 
>> https://github.com/mocobeta/sandbox-lucene-10557/issues
>>
>> I don't deserve to have a strong opinion on how we should treat 20+ years of 
>> history so I would reserve my opinion - would the prototype be of some help 
>> to have a conversation?
>> I have to leave for a while, I'd be glad if you have a talk on it while I'm 
>> away and hopefully reach an agreement.
>>
>> Here's a summary of what can be done.
>>
>> You can:
>> * migrate all texts in issue descriptions and comments to GitHub; 
>> browsing/searching old issues should work fine.
>> * extract every issue metadata from Jira and port it to labels or issue 
>> descriptions (as plain text).
>> * map Jira cross-issue link "LUCENE-xxx" to GitHub issue mention "#yyy".
>>    * see this example: 
>> https://github.com/mocobeta/sandbox-lucene-10557/issues/218
>> * map Jira user ids to GitHub accounts if the mapping is given.
>> * convert Jira markups to Markdown with parser library.
>>    * not perfect - there can be many conversion errors
>>
>> And here are the limitations. (Please correct me if I'm missing something.)
>>
>> You cannot:
>> * simulate original authors and timestamps; they have to be preserved in 
>> free-text forms.
>> * migrate attached files (patches, images, etc.) to GitHub; these have to 
>> remain in Jira.
>>    * it's not allowed to programmatically upload files and attach them to 
>> issues.
>> * create hyperlinks from issues to GitHub accounts (reporters, comment 
>> authors, etc.) by mentions; otherwise everyone will receive a huge volume of 
>> notifications.
>>    * still accounts can be noted with a markup `@xxxx` (without mentioning) 
>> in their right place
>> * "bulk" import issues/comments. Each resource has to be posted one by one. 
>> Migration would take many hours (perhaps days?) due to the severe API call 
>> rate limit.
>>
>> It's not a particularly difficult task, however, there will be other 
>> uncontrollable things I haven't noticed yet.
>>
>> [1] https://github.com/mocobeta/sandbox-lucene-10557/tree/main/migration
>>
>> Tomoko
