I haven't fully read all the comments, but I think this is the duty of the release manager? This is what I use to get the difference between jira and git:

git log --oneline 0167558eb31ff48308d592ef70b6d005ba6d21fb...3ee8b0c75b | grep -o -E "^[0-9a-z]*\s*HBASE-[0-9]*" | awk '{print $2}' | sort -u > in_git_2.0.0.txt

cat CHANGELOG.md | grep -o -E "\[HBASE-[0-9]*\]" | grep -o -E "HBASE-[0-9]*" | sort -u > CHANGELOG_jiras.txt

And then you could use comm to get the difference.

We all make mistakes, so this process cannot be fully automatic, trust me. Sometimes a commit is reverted and never applied again; sometimes we include the wrong jira id in the commit message. This is why I always use a jira issue to land the CHANGES and RELEASENOTES, tag it, and then use the release script to generate the RC.

For me, this costs a lot of time when creating a new minor release because there are a lot of commits, but for a patch release it is usually fine.

Maybe things will be easier if we move all the development process to github?

Thanks.

On Sat, Jan 16, 2021 at 8:05 AM, Sean Busbey <bus...@apache.org> wrote:

> What would the logistics look like for git as the source of truth?
>
> Every release candidate I've ever done I've had to do the jira and git
> reconciliation. There are always errors in both. Folks commit stuff with
> the wrong JIRA ID or wrong message.
>
> Would reconciliation for a RM then require revert and reapply of any such
> changes?
>
> On Fri, Jan 15, 2021, 17:54 Stack <st...@duboce.net> wrote:
>
> > On Fri, Jan 15, 2021 at 10:06 AM Andrew Purtell <apurt...@apache.org>
> > wrote:
> >
> > > I have now had to sink two 2.4.1 RCs because of errors in the change
> > > log.
> > >
> > > I made a pass over git history and ensured every commit was included.
> > > I had also made a pass over JIRA to move out any unresolved issues or
> > > complete the resolution of same. What I did not do is check that every
> > > resolved JIRA corresponded to an actual commit. This is not something
> > > RMs have had to do in the past and it asks a lot of them.
> >
> > Just to say that, making RCs, I've done the JIRA<->GIT reconciliation
> > described (ugly hack scripting, sorting, and comparing). Others have too,
> > I've noticed. It is awful, yes, especially with issues like those found
> > by Viraj in yours and Huaxiang's RCs this week. The refguide is vague on
> > what is involved; it needs more detail. The issue HBASE-22853 description
> > on the need for a tool to do the reconcile is a bit better.
> >
> > > I know NOW that as RM I cannot currently trust committers to get fix
> > > versions right or care about this.
> > >
> > > That's right... Committers cannot be trusted to correctly maintain
> > > issue metadata in JIRA.
> >
> > Yes (says a key offender).
> >
> > > That is not a good situation for the project to be in. Up until now it
> > > has not been the responsibility of the RM to check each and every JIRA
> > > status. It has been the collective responsibility of committers to care
> > > about the project's release tracking insofar as to correctly update fix
> > > versions in JIRA. For releases containing relatively few changes, like
> > > 2.4.1, with ~50 changes, I suppose it is possible for the RM to remove
> > > all 2.4.1 fix versions, walk the commit history, and set back fix
> > > versions on JIRA to actually correspond with what was truly committed.
> > > However, for minor releases, with hundreds of commits, this will not be
> > > possible.
> > >
> > > I think the root cause is GitHub and JIRA are two separate change
> > > tracking systems with only a minimal amount of integration. It requires
> > > manual effort. More and more, new committers are familiar with GitHub
> > > and PRs and are not familiar with JIRA and the Apache way of using JIRA
> > > to build change logs. We need to better educate new and existing
> > > committers on their responsibilities with regards to maintaining JIRA
> > > metadata correctly.
> >
> > Agree.
> >
> > Other issues are that the release process has been evolving. In the old
> > days, JIRA was the source of truth; changelog was a JIRA report. The
> > CHANGES.md/RELEASENOTES.md have come to the fore, perhaps making the
> > difference between JIRA and GIT more pronounced.
> >
> > I like the suggestion made by you above (and Nick in an internal
> > discussion) that git be the source of truth. That the CHANGES.md NOT be
> > checked-in is also a good idea so it can be regenerated w/o need of a
> > change to the RC if JIRA is changed (I can work on this part if wanted).
> >
> > S
> >
> > > --
> > > Best regards,
> > > Andrew
> > >
> > > Words like orphans lost among the crosstalk, meaning torn from truth's
> > > decrepit hands
> > >    - A23, Crosstalk
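
For the comm step mentioned at the top of this message, a minimal sketch might look like the following. Only in_git_2.0.0.txt and CHANGELOG_jiras.txt come from the commands above; the sample issue IDs, the temporary directory, and the output names only_in_git.txt and only_in_changelog.txt are made up for illustration. Setting LC_ALL=C is my own assumption, so that sort and comm agree on ordering.

```shell
# Assumption: use one collation for sort and comm so comm sees sorted input.
export LC_ALL=C

# Hypothetical sample data standing in for the two real files;
# the issue IDs below are invented for this sketch.
workdir=$(mktemp -d)
cd "$workdir"
printf 'HBASE-1\nHBASE-2\nHBASE-3\n' | sort -u > in_git_2.0.0.txt
printf 'HBASE-2\nHBASE-3\nHBASE-4\n' | sort -u > CHANGELOG_jiras.txt

# -23 hides columns 2 and 3, leaving ids that appear only in git.
comm -23 in_git_2.0.0.txt CHANGELOG_jiras.txt > only_in_git.txt

# -13 hides columns 1 and 3, leaving ids that appear only in CHANGELOG.md.
comm -13 in_git_2.0.0.txt CHANGELOG_jiras.txt > only_in_changelog.txt
```

Each id in either output file still needs a manual look, since a reverted commit or a wrong id in a commit message shows up here the same way as a genuinely missing entry.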