Was the duplication due to a script that generates the release note or is
this something that happened manually? If the former, it would be best for
that script to itself check for duplicates, I reckon. I’m not trying to
derail the original discussion but to understand how we can prevent mistakes
without giving the RM more to do. :)
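
To make the idea concrete, here is a rough sketch of the kind of duplicate check I have in mind. It is only an illustration, not what the release tooling does today. It assumes each note starts with an entry line shaped like `* [HBASE-12345](...)`; that shape and the function name `find_duplicate_issues` are assumptions for this example.

```python
import re
from collections import Counter

# Assumed entry-header shape: "* [HBASE-12345](https://...) | ..."
# Mentions of an issue key inside a note body are deliberately ignored,
# so a note that merely refers to another issue is not flagged.
ENTRY_RE = re.compile(r"^\*\s+\[([A-Z][A-Z0-9]*-\d+)\]\(")


def find_duplicate_issues(text):
    """Return {issue_key: count} for keys that head more than one entry."""
    counts = Counter()
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return {key: n for key, n in counts.items() if n > 1}
```

The generating script or a precommit job could read RELEASENOTES.md, pass its contents to this function, and fail when the result is non-empty, so the RM would not need to check anything by hand.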

On Tue, Jun 4, 2019 at 7:46 PM Guanghao Zhang <zghao...@gmail.com> wrote:

> >
> > So what was in the 85MB file? I’m confused about the nature of the
> > mistake that caused it.
> >
> There was a lot of duplicate content. Every issue's release note was repeated
> hundreds of times. So it may be helpful to check this in a precommit job:
> grep the issue numbers from the release notes and check whether there are
> duplicates.
>
> On Wed, Jun 5, 2019 at 4:15 AM Sean Busbey <bus...@apache.org> wrote:
>
> > Busbey-MBA:hbase busbey$ git show cb0e9bb95599 | grep "Release Notes" | sort -u
> > # HBASE  2.2.0 Release Notes
> > Busbey-MBA:hbase busbey$ git show 61e9de9b82a9 | grep "Release Notes" | sort -u
> > # HBASE  2.2.0 Release Notes
> > Busbey-MBA:hbase busbey$ git checkout branch-2.2
> > Switched to branch 'branch-2.2'
> > Your branch is up to date with 'origin/branch-2.2'.
> > Busbey-MBA:hbase busbey$ cat RELEASENOTES.md | grep "Release Notes" | sort -u
> > # HBASE  2.2.0 Release Notes
> > Busbey-MBA:hbase busbey$ cat RELEASENOTES.md | grep "Release Notes" | wc -l
> >        1
> > Busbey-MBA:hbase busbey$ git show cb0e9bb95599 | grep "Release Notes" | wc -l
> >     1296
> > Busbey-MBA:hbase busbey$ git show 61e9de9b82a9 | grep "Release Notes" | wc -l
> >      216
> >
> > Seems like a simple mistake. No big deal; maybe some instructions are
> > unclear. I haven't tried to drive a 2.y.z release yet.
> >
> > On Tue, Jun 4, 2019 at 3:12 PM Misty Linville <mi...@apache.org> wrote:
> > >
> > > > So what was in the 85MB file? I’m confused about the nature of the
> > > > mistake that caused it.
> > >
> > > On Tue, Jun 4, 2019 at 1:10 PM Sean Busbey <bus...@apache.org> wrote:
> > >
> > > > changelog is a list of all the fixes. release notes is supposed to be
> > > > the things we as a community think downstream users need to pay
> > > > attention to.
> > > >
> > > > > The release notes in an archive should be related to the release in
> > > > > question. Historically that has meant "all the maintenance releases in
> > > > > this minor release".
> > > >
> > > > > It's still not a problematic amount. We are prematurely optimizing IMHO.
> > > >
> > > > HBase 1.2 was around for 3+ years. its change documentation predates
> > > > our use of Yetus Release Doc Maker so it's basically the report JIRA
> > > > produces for each maintenance release. the final 1.2.12 release had a
> > > > changelog of 154KiB.
> > > >
> > > > https://github.com/apache/hbase/blob/rel/1.2.12/CHANGES.txt
> > > >
> > > > > HBase 2.0.0 had like 3 years of backlog fixes, and the 2.0.5 summary
> > > > > to date has 827KiB for CHANGES and 460KiB for RELEASENOTES.
> > > >
> > > > https://github.com/apache/hbase/blob/rel/2.0.5/CHANGES.md
> > > > https://github.com/apache/hbase/blob/rel/2.0.5/RELEASENOTES.md
> > > >
> > > >
> > > > > On Tue, Jun 4, 2019 at 2:55 PM Misty Linville <mi...@apache.org> wrote:
> > > > >
> > > > > > I’m confused how an automatically generated file can vary so much in
> > > > > > size. Can the release notes file with an archive just target the
> > > > > > release in question and leave off the older stuff? What’s the
> > > > > > difference in practice between the release notes and changelog?
> > > > >
> > > > > A pre-commit and possibly a presubmit would help.
> > > > >
> > > > > > On Tue, Jun 4, 2019 at 10:55 AM Andrew Purtell <apurt...@apache.org> wrote:
> > > > >
> > > > > > > The various release notes and changes.txt come up frequently in a
> > > > > > > listing of large-ish objects committed in the repo, along with
> > > > > > > autogenerated protobuf and thrift files. It's fine if we tolerate them
> > > > > > > all in return for something. Not requiring local IDL compilers to build
> > > > > > > is a reasonable tradeoff for checking in what can be (and is) generated.
> > > > > > > I'm less convinced about release notes, given they can be made readily
> > > > > > > available online, and already come in an online form on JIRA with no
> > > > > > > work required from us beyond proper attention to fix versions, but I
> > > > > > > don't have a strong opinion about it.
> > > > > >
> > > > > > > On Tue, Jun 4, 2019 at 10:35 AM Sean Busbey <bus...@apache.org> wrote:
> > > > > >
> > > > > > > presuming you mean the two files from your original email:
> > > > > > >
> > > > > > > cb0e9bb95599 86MiB RELEASENOTES.md
> > > > > > > 61e9de9b82a9   14MiB RELEASENOTES.md
> > > > > > >
> > > > > > > > these are both for HBase 2.2.0. Since branch-2.2 currently shows:
> > > > > > >
> > > > > > >
> > > > > > > Busbey-MBA:hbase busbey$ git checkout branch-2.2
> > > > > > > Already on 'branch-2.2'
> > > > > > > Your branch is up to date with 'origin/branch-2.2'.
> > > > > > > Busbey-MBA:hbase busbey$ ls -lah *.md
> > > > > > > -rw-r--r--  1 busbey  staff   100K Jun  4 12:24 CHANGES.md
> > > > > > > -rw-r--r--  1 busbey  staff    69K Jun  4 12:24 RELEASENOTES.md
> > > > > > >
> > > > > > > Yes, I'd wager it was also a mistake.
> > > > > > >
> > > > > > > > If we don't want files over some threshold size, we should do what we
> > > > > > > > normally do when we want to change committer behavior: offer people
> > > > > > > > tools that help them do the right thing without thinking about it too
> > > > > > > > much. In this case I would guess a client side git pre-commit hook
> > > > > > > > that stopped things when files are too big, e.g. 10MiB. When a case
> > > > > > > > comes up that there's something bigger we need to commit then folks
> > > > > > > > can use their judgement and discuss it on dev@ if they think it's
> > > > > > > > contentious.
> > > > > > >
> > > > > > > > On Tue, Jun 4, 2019 at 12:18 PM Andrew Purtell <apurt...@apache.org> wrote:
> > > > > > > >
> > > > > > > > > There's an 80 MB release notes file, possibly a mistake; the next
> > > > > > > > > largest object is an 11 MB release notes file. Also a mistake?
> > > > > > > >
> > > > > > > > > On Tue, Jun 4, 2019 at 10:11 AM Sean Busbey <bus...@apache.org> wrote:
> > > > > > > >
> > > > > > > > > > Currently putting them in the repo is how we get release notes into
> > > > > > > > > > the source and binary artifacts we vote on. It's really convenient
> > > > > > > > > > for making sure folks who download things have some version of the
> > > > > > > > > > notes.
> > > > > > > > > >
> > > > > > > > > > We've been including the bare file in the dist area as well, so we'd
> > > > > > > > > > face the same issue around distributing a large file (probably more
> > > > > > > > > > likely to face it since it's compressed in the tarballs).
> > > > > > > > > >
> > > > > > > > > > I agree we should have an up to date rendered version on the website
> > > > > > > > > > (it's been an outstanding doc jira). But the files usually aren't
> > > > > > > > > > very large and having them shipped in the release is nice.
> > > > > > > > >
> > > > > > > > > > On Tue, Jun 4, 2019, 12:04 Andrew Purtell <apurt...@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > > > What do you think about linking to a remote site-based release
> > > > > > > > > > > notes file instead of checking them into the main repo?
> > > > > > > > > >
> > > > > > > > > > > On Mon, Jun 3, 2019 at 10:12 PM Guanghao Zhang <zghao...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > > Sorry. This large RELEASENOTES.md may have been introduced by me. I
> > > > > > > > > > > > committed a big RELEASENOTES.md to branch-2.2 yesterday. I fixed it
> > > > > > > > > > > > today and rolled a new 2.2.0RC5. Thanks.
> > > > > > > > > > >
> > > > > > > > > > > > On Tue, Jun 4, 2019 at 3:24 AM Andrew Purtell <apurt...@apache.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > > remote: warning: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
> > > > > > > > > > > > > remote: warning: See http://git.io/iEPt8g for more information.
> > > > > > > > > > > > > remote: warning: File RELEASENOTES.md is 85.80 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB
> > > > > > > > > > > > The object is cb0e9bb95599 86MiB RELEASENOTES.md
> > > > > > > > > > > >
> > > > > > > > > > > > > Incidentally, the next largest file is also a markup release notes
> > > > > > > > > > > > > file.
> > > > > > > > > > > >
> > > > > > > > > > > > 61e9de9b82a9   14MiB RELEASENOTES.md
> > > > > > > > > > > >
> > > > > > > > > > > > > As far as I can tell cb0e9bb95599 is not referenced from anywhere. So
> > > > > > > > > > > > > eventually garbage collection both on the github side and in local
> > > > > > > > > > > > > repositories will clear it out?
> > > > > > > > > > > >
> > > > > > > > > > > > > This leads to the natural question of whether we should be checking
> > > > > > > > > > > > > in such really large autogenerated files. There are release policy
> > > > > > > > > > > > > implications. Perhaps we can check in very small release notes files
> > > > > > > > > > > > > containing only a URL to an online resource? Put the release notes
> > > > > > > > > > > > > objects in hbase-site?
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Best regards,
> > > > > > > > > > > > Andrew
> > > > > > > > > > > >
> > > > > > > > > > > > > Words like orphans lost among the crosstalk, meaning torn from
> > > > > > > > > > > > > truth's decrepit hands
> > > > > > > > > > > >    - A23, Crosstalk
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Best regards,
> > > > > > > > > > Andrew
> > > > > > > > > >
> > > > > > > > > > Words like orphans lost among the crosstalk, meaning torn
> > from
> > > > > > > truth's
> > > > > > > > > > decrepit hands
> > > > > > > > > >    - A23, Crosstalk
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Best regards,
> > > > > > > > Andrew
> > > > > > > >
> > > > > > > > Words like orphans lost among the crosstalk, meaning torn
> from
> > > > truth's
> > > > > > > > decrepit hands
> > > > > > > >    - A23, Crosstalk
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Best regards,
> > > > > > Andrew
> > > > > >
> > > > > > Words like orphans lost among the crosstalk, meaning torn from
> > truth's
> > > > > > decrepit hands
> > > > > >    - A23, Crosstalk
> > > > > >
> > > >
> >
>
