Was the duplication due to a script that generates the release notes, or is this something that happened manually? If the former, it would be best for that script itself to check for duplicates, I reckon. I’m not trying to derail the original discussion, just to understand how we can prevent mistakes without giving the RM more to do. :)
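To make the duplicate check concrete, here is a rough sketch of what a generator script or precommit job could run. It assumes each release note entry starts with a line of the form `* [HBASE-12345](...) | ...`; that layout, the regular expression, and the function name `find_duplicate_issues` are my assumptions for illustration, not something the existing tooling is known to provide.

```python
import re
from collections import Counter

# Assumption: every release note entry begins with a bullet such as
# "* [HBASE-12345](https://...) | *Major* | **Summary**".
# Adjust the pattern if the generated file is laid out differently.
ENTRY_RE = re.compile(r"^\*\s*\[([A-Z]+-\d+)\]", re.MULTILINE)


def find_duplicate_issues(text):
    """Return {issue_id: count} for every issue listed more than once."""
    counts = Counter(ENTRY_RE.findall(text))
    return {issue: n for issue, n in counts.items() if n > 1}
```

A caller would read RELEASENOTES.md, pass its contents to `find_duplicate_issues`, and fail the job if the returned dict is non-empty. This only catches repeated entries; it says nothing about whether the notes themselves are correct.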

On Tue, Jun 4, 2019 at 7:46 PM Guanghao Zhang <zghao...@gmail.com> wrote:
>
> > > So what was in the 85MB file? I’m confused about the nature of the mistake that caused it.
>
> There is a lot of duplicate content. Every issue's release note is repeated hundreds of times. So it may be helpful to check this in the precommit job: grep the issue numbers from the release notes and check whether there are duplicates.
>
> Sean Busbey <bus...@apache.org> wrote on Wed, Jun 5, 2019 at 4:15 AM:
>
> > Busbey-MBA:hbase busbey$ git show cb0e9bb95599 | grep "Release Notes" | sort -u
> > # HBASE 2.2.0 Release Notes
> > Busbey-MBA:hbase busbey$ git show 61e9de9b82a9 | grep "Release Notes" | sort -u
> > # HBASE 2.2.0 Release Notes
> > Busbey-MBA:hbase busbey$ git checkout branch-2.2
> > Switched to branch 'branch-2.2'
> > Your branch is up to date with 'origin/branch-2.2'.
> > Busbey-MBA:hbase busbey$ cat RELEASENOTES.md | grep "Release Notes" | sort -u
> > # HBASE 2.2.0 Release Notes
> > Busbey-MBA:hbase busbey$ cat RELEASENOTES.md | grep "Release Notes" | wc -l
> > 1
> > Busbey-MBA:hbase busbey$ git show cb0e9bb95599 | grep "Release Notes" | wc -l
> > 1296
> > Busbey-MBA:hbase busbey$ git show 61e9de9b82a9 | grep "Release Notes" | wc -l
> > 216
> >
> > Seems like a simple mistake. No big deal; maybe some instructions are unclear. I haven't tried to drive a 2.y.z release yet.
> >
> > On Tue, Jun 4, 2019 at 3:12 PM Misty Linville <mi...@apache.org> wrote:
> > >
> > > So what was in the 85MB file? I’m confused about the nature of the mistake that caused it.
> > >
> > > On Tue, Jun 4, 2019 at 1:10 PM Sean Busbey <bus...@apache.org> wrote:
> > > >
> > > > changelog is a list of all the fixes. release notes is supposed to be the things we as a community think downstream users need to pay attention to.
> > > >
> > > > The release notes in an archive should be related to the release in question. Historically that has meant "all the maintenance releases in this minor release".
> > > >
> > > > It's still not a problematic amount. We are prematurely optimizing IMHO.
> > > >
> > > > HBase 1.2 was around for 3+ years. Its change documentation predates our use of Yetus Release Doc Maker, so it's basically the report JIRA produces for each maintenance release. The final 1.2.12 release had a changelog of 154KiB.
> > > >
> > > > https://github.com/apache/hbase/blob/rel/1.2.12/CHANGES.txt
> > > >
> > > > HBase 2.0.0 had like 3 years of backlog fixes, and the 2.0.5 summary to date has 827KiB for CHANGES and 460KiB for RELEASENOTES.
> > > >
> > > > https://github.com/apache/hbase/blob/rel/2.0.5/CHANGES.md
> > > > https://github.com/apache/hbase/blob/rel/2.0.5/RELEASENOTES.md
> > > >
> > > > On Tue, Jun 4, 2019 at 2:55 PM Misty Linville <mi...@apache.org> wrote:
> > > > >
> > > > > I’m confused how an automatically generated file can vary so much in size. Can the release notes file with an archive just target the release in question and leave off the older stuff? What’s the difference in practice between the release notes and changelog?
> > > > >
> > > > > A pre-commit and possibly a presubmit would help.
> > > > >
> > > > > On Tue, Jun 4, 2019 at 10:55 AM Andrew Purtell <apurt...@apache.org> wrote:
> > > > > >
> > > > > > The various release notes and changes.txt come up frequently in a listing of large-ish objects committed in the repo, along with autogenerated protobuf and thrift files. It's fine if we tolerate them all in return for something. Not requiring local IDL compilers to build is a reasonable tradeoff for checking in what can be (and is) generated. I'm less convinced about release notes, given they can be made readily available online, and already come in an online form on JIRA with no work required from us beyond proper attention to fix versions, but I don't have a strong opinion about it.
> > > > > >
> > > > > > On Tue, Jun 4, 2019 at 10:35 AM Sean Busbey <bus...@apache.org> wrote:
> > > > > > >
> > > > > > > presuming you mean the two files from your original email:
> > > > > > >
> > > > > > > cb0e9bb95599 86MiB RELEASENOTES.md
> > > > > > > 61e9de9b82a9 14MiB RELEASENOTES.md
> > > > > > >
> > > > > > > these are both for HBase 2.2.0. Since branch-2.2 currently shows:
> > > > > > >
> > > > > > > Busbey-MBA:hbase busbey$ git checkout branch-2.2
> > > > > > > Already on 'branch-2.2'
> > > > > > > Your branch is up to date with 'origin/branch-2.2'.
> > > > > > > Busbey-MBA:hbase busbey$ ls -lah *.md
> > > > > > > -rw-r--r-- 1 busbey staff 100K Jun 4 12:24 CHANGES.md
> > > > > > > -rw-r--r-- 1 busbey staff 69K Jun 4 12:24 RELEASENOTES.md
> > > > > > >
> > > > > > > Yes, I'd wager it was also a mistake.
> > > > > > >
> > > > > > > If we don't want files over some threshold size, we should do what we normally do when we want to change committer behavior: offer people tools that help them do the right thing without thinking about it too much. In this case I would guess a client side git pre-commit hook that stopped things when files are too big, e.g. 10MiB. When a case comes up that there's something bigger we need to commit then folks can use their judgement and discuss it on dev@ if they think it's contentious.
> > > > > > >
> > > > > > > On Tue, Jun 4, 2019 at 12:18 PM Andrew Purtell <apurt...@apache.org> wrote:
> > > > > > > >
> > > > > > > > There's an 80 MB release notes file, possibly a mistake; the next largest object is an 11 MB release notes file. Also a mistake?
> > > > > > > >
> > > > > > > > On Tue, Jun 4, 2019 at 10:11 AM Sean Busbey <bus...@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > Currently putting them in the repo is how we get release notes into the source and binary artifacts we vote on. It's really convenient for making sure folks who download things have some version of the notes.
> > > > > > > > >
> > > > > > > > > We've been including the bare file in the dist area as well, so we'd face the same issue around distributing a large file (probably more likely to face it since it's compressed in the tarballs).
> > > > > > > > >
> > > > > > > > > I agree we should have an up to date rendered version on the website (it's been an outstanding doc jira). But the files usually aren't very large and having them shipped in the release is nice.
> > > > > > > > >
> > > > > > > > > On Tue, Jun 4, 2019, 12:04 Andrew Purtell <apurt...@apache.org> wrote:
> > > > > > > > > >
> > > > > > > > > > What do you think about linking to a remote site-based release notes file instead of checking them into the main repo?
> > > > > > > > > >
> > > > > > > > > > On Mon, Jun 3, 2019 at 10:12 PM Guanghao Zhang <zghao...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Sorry. This large RELEASENOTES.md may have been introduced by me. I committed a big RELEASENOTES.md to branch-2.2 yesterday. I fixed it today. And roll a new 2.2.0RC5. Thanks.
> > > > > > > > > > >
> > > > > > > > > > > Andrew Purtell <apurt...@apache.org> wrote on Tue, Jun 4, 2019 at 3:24 AM:
> > > > > > > > > > > >
> > > > > > > > > > > > remote: warning: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
> > > > > > > > > > > > remote: warning: See http://git.io/iEPt8g for more information.
> > > > > > > > > > > > remote: warning: File RELEASENOTES.md is 85.80 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB
> > > > > > > > > > > >
> > > > > > > > > > > > The object is cb0e9bb95599 86MiB RELEASENOTES.md
> > > > > > > > > > > >
> > > > > > > > > > > > Incidentally, the next largest file is also a markup release notes file.
> > > > > > > > > > > >
> > > > > > > > > > > > 61e9de9b82a9 14MiB RELEASENOTES.md
> > > > > > > > > > > >
> > > > > > > > > > > > As far as I can tell cb0e9bb95599 is not referenced from anywhere. So eventually garbage collection both on the github side and in local repositories will clear it out?
> > > > > > > > > > > >
> > > > > > > > > > > > This leads to the natural question of whether we should be checking in such really large autogenerated files. There are release policy implications. Perhaps we can check in very small release notes files containing only a URL to an online resource? Put the release notes objects in hbase-site ?
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Best regards,
> > > > > > > > > > > > Andrew
> > > > > > > > > > > >
> > > > > > > > > > > > Words like orphans lost among the crosstalk, meaning torn from truth's decrepit hands
> > > > > > > > > > > > - A23, Crosstalk
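
For reference, the client-side pre-commit size guard suggested in the quoted thread could look roughly like the sketch below. The 10 MiB limit is the example figure from the thread; the function names (`staged_paths`, `staged_size`, `oversized`, `main`) are made up for illustration, and this is not an existing HBase hook. It relies only on `git diff --cached --name-only` and `git cat-file -s`.

```python
import subprocess

# Example threshold from the thread (10 MiB); not an agreed project policy.
LIMIT_BYTES = 10 * 1024 * 1024


def staged_paths():
    """List files that are added or modified in the index."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=AM", "-z"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [p for p in out.split("\0") if p]


def staged_size(path):
    """Size in bytes of the staged blob for the given path."""
    out = subprocess.run(
        ["git", "cat-file", "-s", f":{path}"],
        check=True, capture_output=True, text=True,
    ).stdout
    return int(out.strip())


def oversized(paths, size_of, limit=LIMIT_BYTES):
    """Return (path, size) pairs for files whose size exceeds the limit."""
    sized = [(p, size_of(p)) for p in paths]
    return [(p, s) for p, s in sized if s > limit]


def main():
    """Print offending files; return 1 to block the commit, 0 to allow it."""
    offenders = oversized(staged_paths(), staged_size)
    for path, size in offenders:
        print(f"{path}: {size} bytes exceeds the {LIMIT_BYTES} byte limit")
    return 1 if offenders else 0
```

To try it, one would save this as an executable `.git/hooks/pre-commit` with a Python shebang and end the file with `import sys; sys.exit(main())`. A committer who really needs a bigger file can still bypass the hook with `git commit --no-verify` and raise it on dev@.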