Hi all,

re: *Changelog with GH PR/issues*:

Until 1.10.9 the script in dev folder
<https://github.com/apache/airflow/blob/d70303c8ba06ef0551e3d8408fd2c7588ab5b135/dev/airflow-jira#L273>
worked
fine for generating Changelog and like Ash mentioned they were categorized
based on the Jira issue (which had to be fixed manually most of the times
as they were most of the time defaulted to Bug).

While working on 1.10.10 I modified the script a bit to fetch the PRs
from Github
milestone <https://github.com/apache/airflow/milestone/2> and find the
issue and its labels (currently we had bug and feature). For others, I
manually categorized them. Before, when this was automated I would still
scroll through the entire list to correct if any record is not categorized
wrongly.

While doing this I tried to come up with some labels that we can add to the
PRs (in-line with what we had on JIRA and our Changelog sections):


   - type: bug-fix
   - type: new-feature
   - type: doc-change
   - type: improvement
   - type: internal   [For all CI & tests related changes]


After merging a PR if we (committers) can add one of this label and a
relevant *Milestone *(1.10.11
<https://github.com/apache/airflow/milestone/4> for now - Don't need to add
for 2.0 as it would be branched off the Master and generating Changelog for
2.0 is a separate issue but let's deal with it at a later time). We should
only add a PR to this Milestone if we want a PR/commit to be cherry-picked
in Airflow 1.10.11

In the coming days: I will update our current release script to support
this and generate a Changelog from it we all agree to the labels I
suggested above.


re: *Commit Messages (and PR descriptions)*

This is a tough one - I had changed the words when adding few commits to a
Changelog for 1.10.10 so that they are *more readable*.

This would be tough to enforce and I feel changing the name before merging
would be our best bet. Most of the time the PR authors don't know (or
wouldn't know) what is the best way to describe the change.
I have re-written PR title on couple of occasions and I know many of other
committers have the done the same which I think is better than enforcing a
Language Parsing Bot"

A related issue is currently a large number of PRs don't have a
description. We had automation to check and fail for it but we removed it
later as the error message showed by *Mergeable *(bot we use) was a bit
hidden and unclear.
Unless you check the code itself, it is hard to know "What is the PR
solving / Why is the PR needed". There is an exception to this rule when
the PR title itself can be sufficient like "Fixing a typo" etc.

re: *Conventional Commits*

I read https://www.conventionalcommits.org/en/v1.0.0/ some time back and
had liked the general approach but I think if we were to use that it has to
be after Airflow 2.0. Currently, we already have a fix of commit messages
containing JIRA issues and others without it. The semantic syntax would be
useful when all the commits use it and it is standardized, currently while
cherry-picking we have to backport some of the commits from last year too
as a recent fix/feature is dependent on old commit.

Since we have "extra" space due to the removal of JIRA issues in the commit
message I feel this can be used to add "feat:" / "bugfix:" prefixes.

And like it has been already pointed that this can help in the automatic
generation of Changelog (also removing the need of Github labels I talked
about at the top of this email).

*tl;dr*: I am in favor of Conventional commits & Semantic commit messages
once we branch of the Master (for Airflow 2.0).

If we don't want to wait until Airflow 2.0 to enforce Conventional Commits
/ Semantic Versioning - we can enforce it now but we will have to remove
them later in the Changelog so that it looks consistent.

Regards,
Kaxil





On Sun, Apr 26, 2020 at 6:49 PM Tomasz Urbaszek <[email protected]>
wrote:

> We probably can try to add proper labels using a semantic prefix. As per
> irrelevant (from changelog point of view) we can use ci or dev prefixes.
>
> T.
>
> On Sun, Apr 26, 2020 at 6:48 PM Jarek Potiuk <[email protected]>
> wrote:
>
> > Reno sounds like it's there to handle the corporate environment with
> > multiple teams working on different features. A bit too much overhead if
> > you ask me :). There is a reason why I am not working for a big
> corporation
> > :D
> >
> > I agree that Updating.md is fine as it is now. Also, I think often
> > Updating.md description is longer than one that will acceptably fit into
> > commit message (and it might not have the same limitations for line
> > length).
> >
> > I think we should not necessarily publish what has been generated
> > automatically without modifications. I thought that we should generate it
> > and review/manually update it afterwards. We even might want to remove
> some
> > of the stuff like dev/ci/chores etc. automatically when generating it. I
> > think that's what your current script did so far? I believe (remember
> some
> > discussions with you or Kaxil)  the cleanup/removal of CI/Chore stuff was
> > quite some manual work.
> >
> > If we go semantic and make sure we review also the type, we can
> distribute
> > the "review" part among the committers and do it at the time of review. I
> > believe it's much easier to verify type while you are reviewing the
> change
> > rather than when you prepare the release sometime few weeks or months
> later
> > (and you might have no idea what the change was about initially as it
> could
> > be approved by another committer).
> >
> > We could rely on labels of course, but those are a bit volatile and they
> > leave no trace in git, Labels are github stuff and you need to query for
> it
> > additionally with github API to get the label. Having a type of issue in
> > the commit message I think simplifies the initial generation of such
> > changelog - you only rely on the .git content.
> >
> > J.
> >
> >
> > On Sun, Apr 26, 2020 at 6:04 PM Ash Berlin-Taylor <[email protected]>
> wrote:
> >
> > > Random other thoughts about changelog:
> > >
> > > Sometimes we get the type wrong (say it's a new feature when it's a
> fix)
> > >
> > > Some commits just don't need to be included in the changelog as they
> > > aren't relevant to users of Airflow.
> > >
> > > - Either it's a bug fix to a new feature that hasn't yet been released
> > > - or its just internal and doesn't affect users when they install
> > Airflow.
> > >
> > > (To me the Changelog is for users seeing what has changed before
> > > upgrading. So things like CI improvements don't belong in there as it's
> > not
> > > for them.)
> > >
> > > Not sure my point here, just thoughts of what I consider when building
> a
> > > changelog for a release.
> > >
> > > -a
> > >
> > > On 26 April 2020 16:56:42 BST, Ash Berlin-Taylor <[email protected]>
> wrote:
> > > >Yes thank you, that's the one!
> > > >
> > > >On 26 April 2020 16:44:50 BST, "Kamil Breguła"
> > > ><[email protected]> wrote:
> > > >>Reno was developed by OpenStack:
> > > >>https://docs.openstack.org/reno/latest/
> > > >>
> > > >>On Sun, Apr 26, 2020 at 5:39 PM Ash Berlin-Taylor <[email protected]>
> > > >>wrote:
> > > >>>
> > > >>> Yeah, probably is overly complex for general changelog.
> > > >>>
> > > >>> The main thing updating.md needs is some concept of knowing which
> > > >>release it is in when the commit appears in multiple branches (but
> > > >with
> > > >>different commit IDs because of cherry picking.) Sure we could write
> > > >>the tooling for that, but someone already has.
> > > >>>
> > > >>> (I think it's for Openstack? I'll find the link on Monday)
> > > >>>
> > > >>> I'd say let's keep updating.md out of the discussion for now
> > > >>>
> > > >>> On 26 April 2020 16:15:18 BST, Jarek Potiuk
> > > >><[email protected]> wrote:
> > > >>> >>
> > > >>> >>
> > > >>> >> For changelog there is another option, which is a more
> > > >>formal/complex
> > > >>> >> system which works better with our backport/release branch
> > > >>process.
> > > >>> >(Can't
> > > >>> >> find the specific tool I'm thinking of from my phone. It
> involves
> > > >>> >each
> > > >>> >> change note in a separate file, and then a script to compile it.
> > > >>> >Roughly)
> > > >>> >> That may be better suited to UPDATING.md though
> > > >>> >>
> > > >>> >
> > > >>> >I think overhead for that would be very, very painful for regular
> > > >>> >changes.
> > > >>> >We already have the commit message - so why don't we simply use it
> > > >>> >better.
> > > >>> >
> > > >>> >BTW. The commitizen + semantic commits already have a way to
> handle
> > > >>> >BREAKING CHANGES- they are put in the footer with "breaking
> > > >changes"
> > > >>> >prefix
> > > >>> >and you can put the right description there (and it can be used to
> > > >>add
> > > >>> >them
> > > >>> >to changelog/updating.md updates). Here I am on the fence if we
> > > >>should
> > > >>> >use
> > > >>> >it or not - because `git annotate` on UPDATING.md has already all
> > > >>info
> > > >>> >that
> > > >>> >is needed, but maybe we can discuss the approach here as well.
> > > >>> >
> > > >>> >J.
> > > >>> >
> > > >>> >
> > > >>> >
> > > >>> >>
> > > >>> >> -a
> > > >>> >>
> > > >>> >> On 26 April 2020 14:38:17 BST, Tomasz Urbaszek
> > > >><[email protected]>
> > > >>> >> wrote:
> > > >>> >> >I agree with Ash that it's a committer's task to check the
> > > >commit
> > > >>> >name.
> > > >>> >> >But
> > > >>> >> >I personally was deceived by a discrepancy between PR name and
> > > >>> >commit
> > > >>> >> >title
> > > >>> >> >and merged the PR with "wrong" commit message.
> > > >>> >> >
> > > >>> >> >That's where I agree with Jarek: we should help ourselves. If
> > > >>"red
> > > >>> >> >check"
> > > >>> >> >will make people correct the commit / PR title then there will
> > > >be
> > > >>> >less
> > > >>> >> >work
> > > >>> >> >for us and fewer mistakes made by us. Also, semantic commits
> > > >have
> > > >>a
> > > >>> >> >nice
> > > >>> >> >side effect: they teach about commit messages. I think having a
> > > >>tool
> > > >>> >to
> > > >>> >> >check and teach is better having a "you should follow this"
> > > >link,
> > > >>> >which
> > > >>> >> >in
> > > >>> >> >most cases is ticked without clicking the link (btw. should we
> > > >>> >measure
> > > >>> >> >it?).
> > > >>> >> >
> > > >>> >> >One of my reason to suggest it was a conventional changelog
> that
> > > >>> >could
> > > >>> >> >be
> > > >>> >> >auto-generated. As Jarek mentioned currently it's mostly done
> by
> > > >>> >@Kaxil
> > > >>> >> >and
> > > >>> >> >it would be interesting to hear what he thinks about it.
> > > >>> >> >
> > > >>> >> >T.
> > > >>> >> >
> > > >>> >> >On Sun, Apr 26, 2020 at 2:44 PM Jarek Potiuk
> > > >>> ><[email protected]>
> > > >>> >> >wrote:
> > > >>> >> >
> > > >>> >> >> I think you pointed out the exact things I thought are
> > > >>important
> > > >>> >and
> > > >>> >> >could
> > > >>> >> >> be automated. I think those are the very things committing
> > > >>checks
> > > >>> >> >for.
> > > >>> >> >>
> > > >>> >> >> I think we could benefit from 1, 2, 3, 4, 6 (6. with
> > > >exceptions
> > > >>> >> >indeed but
> > > >>> >> >> a warning or a way to mark an exception would be nice -
> > > >>similarly
> > > >>> >as
> > > >>> >> >we do
> > > >>> >> >> with pylint).
> > > >>> >> >> I certainly did not want to improve automatically on the 5
> > > >>(yet)
> > > >>> >and
> > > >>> >> >7
> > > >>> >> >> (here it's much more of a convention we agree between the
> > > >>> >committers
> > > >>> >> >-
> > > >>> >> >> whether the body should be optional - I think it should and
> > > >>> >whether
> > > >>> >> >it
> > > >>> >> >> should be opt-out rather than opt-in - I thin it should be
> > > >>> >opt-out).
> > > >>> >> >>
> > > >>> >> >> There are quite a few commits currently with ... when you
> look
> > > >>at
> > > >>> >> >> the commit log in Github for one (because they do not obey
> the
> > > >>> >> >subject
> > > >>> >> >> length) - I picked the ones without JIRAs - still even
> without
> > > >>> >JIRAs
> > > >>> >> >> sometimes the subject is too long:
> > > >>> >> >>
> > > >>> >> >>    -
> > > >>> >> >>
> > > >>> >> >>
> > > >>> >> >
> > > >>> >>
> > > >>>
> > > >>>
> > >
> >
> https://github.com/apache/airflow/commit/d883ff49ca2841f91ab7e0ab98204d5ad271473b
> > > >>> >> >>    -
> > > >>> >> >>
> > > >>> >> >>
> > > >>> >> >
> > > >>> >>
> > > >>>
> > > >>>
> > >
> >
> https://github.com/apache/airflow/commit/bc230a9711fec2004e20f46aee22fb44c7461b6c
> > > >>> >> >>    -
> > > >>> >> >>
> > > >>> >> >>
> > > >>> >> >
> > > >>> >>
> > > >>>
> > > >>>
> > >
> >
> https://github.com/apache/airflow/commit/fa262c12f87102a7ae1abb11ea7f0d5e8be0de47
> > > >>> >> >>
> > > >>> >> >> However - this is secondary. It was merely a comment on the
> > > >>> >possible
> > > >>> >> >> completion of the "semantic convention" approach. This is the
> > > >>main
> > > >>> >> >subject.
> > > >>> >> >>
> > > >>> >> >> I think the main idea behind the semantic commit/PR is the
> > > >>prefix
> > > >>> >is
> > > >>> >> >that
> > > >>> >> >> it allows for much easier and consistent ChangeLog
> generation.
> > > >>For
> > > >>> >> >example
> > > >>> >> >> in Angular you have
> > > >>> >> >> https://github.com/angular/angular/blob/master/CHANGELOG.md
> > > >>which
> > > >>> >is
> > > >>> >> >> generated automatically including breaking changes etc.  I
> > > >>think
> > > >>> >it's
> > > >>> >> >> mainly Kaxil's work now to prepare the changelog and group
> the
> > > >>> >> >changes into
> > > >>> >> >> separate buckets, so Kaxil - maybe your opinion is important
> > > >>here.
> > > >>> >If
> > > >>> >> >there
> > > >>> >> >> is a way everyone as committers and contributors we can do to
> > > >>make
> > > >>> >> >release
> > > >>> >> >> manager's job easier - I think we should do it.
> > > >>> >> >>
> > > >>> >> >> BTW. The convention is easy to follow without any tools.
> > > >>However
> > > >>> >> >commitizen
> > > >>> >> >> has the nice feature of also guiding new users - it provides
> a
> > > >>> >nice
> > > >>> >> >> explanation of the types you have defined in the project and
> > > >>guide
> > > >>> >> >the new
> > > >>> >> >> users how to write a good commit. I think it might be really
> > > >>nice
> > > >>> >> >touch for
> > > >>> >> >> our "welcoming community" approach. See the 5 minutes video
> > > >>about
> > > >>> >it:
> > > >>> >> >>
> > > >>> >> >>
> > > >>> >> >
> > > >>> >>
> > > >>>
> > > >>>
> > >
> >
> https://egghead.io/lessons/javascript-writing-conventional-commits-with-commitizen
> > > >>> >> >>
> > > >>> >> >> J.
> > > >>> >> >>
> > > >>> >> >> J.
> > > >>> >> >>
> > > >>> >> >>
> > > >>> >> >>
> > > >>> >> >> On Sun, Apr 26, 2020 at 2:13 PM Ash Berlin-Taylor
> > > >><[email protected]>
> > > >>> >> >wrote:
> > > >>> >> >>
> > > >>> >> >> > My main objection is this is trying to apply a technical
> > > >>> >solution
> > > >>> >> >to a
> > > >>> >> >> > people+English problem. This feels like just one extra step
> > > >>to
> > > >>> >have
> > > >>> >> >> > commiters to do, when we as committers can very easily
> > > >>correct
> > > >>> >this
> > > >>> >> >in
> > > >>> >> >> > Github whilst reviewing/before merging.
> > > >>> >> >> >
> > > >>> >> >> > That said, can you point at any examples of recent commits
> > > >>that
> > > >>> >you
> > > >>> >> >> > think would have been clearer as a result of using?
> > > >>> >> >> >
> > > >>> >> >> > (Also a significant proportion of commits from form a git
> > > >gui
> > > >>or
> > > >>> >an
> > > >>> >> >ide,
> > > >>> >> >> > so cz-cli won't help those users.)
> > > >>> >> >> >
> > > >>> >> >> > The "good commit messages" we already link to
> > > >>> >> >> > https://chris.beams.io/posts/git-commit/ has these points
> > > >>> >> >> >
> > > >>> >> >> >     1. Separate subject from body with a blank line
> > > >>> >> >> >     2. Limit the subject line to 50 characters
> > > >>> >> >> >     3. Capitalize the subject line
> > > >>> >> >> >     4. Do not end the subject line with a period
> > > >>> >> >> >     5. Use the imperative mood in the subject line
> > > >>> >> >> >     6. Wrap the body at 72 characters
> > > >>> >> >> >     7. Use the body to explain what and why vs. how
> > > >>> >> >> >
> > > >>> >> >> > 2 we _could_ enforce, but it is not a hard-and-fast rule. 5
> > > >>and
> > > >>> >7
> > > >>> >> >is
> > > >>> >> >> > almost impossible for a computer to enforce. 6 always has
> > > >>> >> >exceptions.
> > > >>> >> >> > The most important ones is 7, and that is the hardest to
> > > >>> >> >programitcally
> > > >>> >> >> > enforce.
> > > >>> >> >> >
> > > >>> >> >> > -a
> > > >>> >> >> >
> > > >>> >> >> > On Apr 26 2020, at 11:30 am, Jarek Potiuk
> > > >>> >> ><[email protected]>
> > > >>> >> >> > wrote:
> > > >>> >> >> >
> > > >>> >> >> > > I think it's a very good idea to use it. We already
> > > >>discussed
> > > >>> >> >that we
> > > >>> >> >> > > should have some improvements in the way we write commits
> > > >-
> > > >>> >and
> > > >>> >> >why to
> > > >>> >> >> > > come up with our own conventions if we can adopt one that
> > > >>> >already
> > > >>> >> >> > > exists and has set of nice tools available.
> > > >>> >> >> > >
> > > >>> >> >> > > As usual, I think automation is a key - as it might make
> > > >>lives
> > > >>> >of
> > > >>> >> >> > > committers a bit easier. There are already a number of
> > > >>tools
> > > >>> >that
> > > >>> >> >we
> > > >>> >> >> > > could use together with such convention, as both
> > > >>pre-commits
> > > >>> >and
> > > >>> >> >a bot
> > > >>> >> >> > > in Github.
> > > >>> >> >> > > There are quite a few tools that embraced the concept of
> > > >>> >semantic
> > > >>> >> >> > > pr/semantic commits and I have heard good words about
> them
> > > >>> >from
> > > >>> >> >other
> > > >>> >> >> > > open-source projects. I've heard especially good words
> > > >>about
> > > >>> >> >> > > commitizen CLI, that could work hand-in-hand with
> semantic
> > > >>> >> >> > > commits/PRs:
> > > >>> >> >> > >
> > > >>> >> >> > > https://github.com/commitizen/cz-cli
> > > >>> >> >> > >
> > > >>> >> >> > > One of the things it has it also integrates with commit
> > > >>lint
> > > >>> >> >where we
> > > >>> >> >> > > could write our own rules and make them more meaningful
> > > >>> >> >> > > https://commitlint.js.org/#/
> > > >>> >> >> > >
> > > >>> >> >> > > Also, there are ready-to-use changelog generators that we
> > > >>can
> > > >>> >use
> > > >>> >> >(for
> > > >>> >> >> > > example
> > > >>> >https://github.com/commitizen/cz-conventional-changelog )
> > > >>> >> >> > >
> > > >>> >> >> > > Those are tools coming from the nodejs world, but I do
> not
> > > >>see
> > > >>> >a
> > > >>> >> >big
> > > >>> >> >> > > problem with using them (of course trying them out first)
> > > >-
> > > >>> >since
> > > >>> >> >we
> > > >>> >> >> > > can now connect it via pre-commit, it should be easy to
> > > >add
> > > >>> >all
> > > >>> >> >that
> > > >>> >> >> > > to our toolbox.
> > > >>> >> >> > >
> > > >>> >> >> > > J.
> > > >>> >> >> > >
> > > >>> >> >> > >
> > > >>> >> >> > > On Sun, Apr 26, 2020 at 10:44 AM Ash Berlin-Taylor
> > > >>> >> ><[email protected]>
> > > >>> >> >> > wrote:
> > > >>> >> >> > >>
> > > >>> >> >> > >> I agree that many commit messages are often lacking but
> > > >>I'm
> > > >>> >not
> > > >>> >> >a fan
> > > >>> >> >> > >> of that the prefix style that app requires, - plus I
> > > >think
> > > >>it
> > > >>> >> >would
> > > >>> >> >> > >> still be possible to have unhelpful PR titles just with
> > > >>> >'fix:'
> > > >>> >> >> prefixed.
> > > >>> >> >> > >>
> > > >>> >> >> > >> Is rather we as commiters updated the pr subjects when
> > > >>> >> >reviewing. The
> > > >>> >> >> > >> rule I try to follow is to (mentally) prefix the message
> > > >>with
> > > >>> >> >"When
> > > >>> >> >> > >> this commit is applied it will ..."
> > > >>> >> >> > >>
> > > >>> >> >> > >> -a
> > > >>> >> >> > >>
> > > >>> >> >> > >> On 26 April 2020 09:34:56 BST, Tomasz Urbaszek
> > > >>> >> ><[email protected]>
> > > >>> >> >> > wrote:
> > > >>> >> >> > >> >Hi all!
> > > >>> >> >> > >> >
> > > >>> >> >> > >> >Sometimes it happens that pull requests or commits have
> > > >>not
> > > >>> >so
> > > >>> >> >> > >> >meaningful messages and it's hard to say what's exactly
> > > >>> >going
> > > >>> >> >on.
> > > >>> >> >> > >> >So I am wondering if we would like to consider using
> > > >>> >semantic
> > > >>> >> >pull
> > > >>> >> >> > >> >request:
> https://github.com/zeke/semantic-pull-requests
> > > >>> >> >> > >> >
> > > >>> >> >> > >> >Since we are using Github it should be pretty easy to
> > > >>add:
> > > >>> >> >> > >> >https://github.com/apps/semantic-pull-requests
> > > >>> >> >> > >> >
> > > >>> >> >> > >> >Of course, it does not solve the problem of "pr
> message"
> > > >>but
> > > >>> >> >> > >> >definitely it raises attention about it. On the other
> > > >>hand,
> > > >>> >it
> > > >>> >> >should
> > > >>> >> >> > >> >also help with publishing changelogs. Personally I like
> > > >>this
> > > >>> >> >approach
> > > >>> >> >> > >> >and I used to use it before joining Airflow.
> > > >>> >> >> > >> >
> > > >>> >> >> > >> >Happy to see what you think about it. And sorry if it
> > > >was
> > > >>> >> >decided
> > > >>> >> >> some
> > > >>> >> >> > >> >ago that Airflow won't follow it.
> > > >>> >> >> > >> >
> > > >>> >> >> > >> >Cheers,
> > > >>> >> >> > >> >Tomek
> > > >>> >> >> > >
> > > >>> >> >> > >
> > > >>> >> >> > >
> > > >>> >> >> > > --
> > > >>> >> >> > >
> > > >>> >> >> > > Jarek Potiuk
> > > >>> >> >> > > Polidea | Principal Software Engineer
> > > >>> >> >> > >
> > > >>> >> >> > > M: +48 660 796 129
> > > >>> >> >> > >
> > > >>> >> >> >
> > > >>> >> >>
> > > >>> >> >>
> > > >>> >> >> --
> > > >>> >> >>
> > > >>> >> >> Jarek Potiuk
> > > >>> >> >> Polidea <https://www.polidea.com/> | Principal Software
> > > >>Engineer
> > > >>> >> >>
> > > >>> >> >> M: +48 660 796 129 <+48660796129>
> > > >>> >> >> [image: Polidea] <https://www.polidea.com/>
> > > >>> >> >>
> > > >>> >>
> > > >>> >
> > > >>> >
> > > >>> >--
> > > >>> >
> > > >>> >Jarek Potiuk
> > > >>> >Polidea <https://www.polidea.com/> | Principal Software Engineer
> > > >>> >
> > > >>> >M: +48 660 796 129 <+48660796129>
> > > >>> >[image: Polidea] <https://www.polidea.com/>
> > >
> >
> >
> > --
> >
> > Jarek Potiuk
> > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >
> > M: +48 660 796 129 <+48660796129>
> > [image: Polidea] <https://www.polidea.com/>
> >
>

Reply via email to