Re: A little report on informal commit tag usage

2019-07-17 Thread Alvaro Herrera
On 2019-Jul-16, Daniel Gustafsson wrote:

> The green gamification dot on people’s Github profiles might light up if the
> machine readable format with email address was used (and the user has that
> specific email connected to their Github account unless it’s a primary email).
> Looking at commit 1c9bb02d8ec1d5b1b319e4fed70439a403c245b1 I can see that for
> August 2018 Amit’s Github profile lists “Created 1 commit in 1 repository
> postgres/postgres 1 commit”, which is likely from this commit message being
> parsed in the mirror.

I specifically use "co-authored-by" (and scanning the grep results, I'm
the only person doing it) because github recognizes it in this way.
However I only feel entitled to use it when the patch has been developed
by me plus some other person(s), which has a bit of a contradictory result:
when I don't touch some submitted patch, I use "Author" since I (the
committer) am not a co-author.  That means github attributes such
patches solely to me :-(

I realize now, however, that in order for this to work I have to include
the email address, not just the name.  I failed to do that at least
once.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: A little report on informal commit tag usage

2019-07-16 Thread Michael Paquier
On Tue, Jul 16, 2019 at 04:33:07PM -0700, Andres Freund wrote:
> On 2019-07-16 19:26:59 -0400, Tom Lane wrote:
>> I've wondered for some time what you think the "-" means in this.
> 
> Up to master. Occasionally there's bugs that only need to be fixed in
> some back branches etc.

Is "-" most common to define a range of branches?  I would have
imagined that "~" makes more sense here.
--
Michael


signature.asc
Description: PGP signature


Re: A little report on informal commit tag usage

2019-07-16 Thread Andres Freund
Hi,

On 2019-07-16 19:26:59 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > They don't preclude each other though. E.g. it'd be sensible to have both
> 
> >> Per gripe from Ken Tanzer.  Back-patch to 9.6.  The issue exists
> >> further back, but before 9.6 the code looks very different and it
> >> doesn't actually know whether the "var" name matches anything,
> >> so I desisted from trying to fix it.
> 
> > and "Backpatch: 9.6-" or such.
> 
> I've wondered for some time what you think the "-" means in this.

Up to master. Occasionally there's bugs that only need to be fixed in
some back branches etc.

Greetings,

Andres Freund




Re: A little report on informal commit tag usage

2019-07-16 Thread Tom Lane
Andres Freund  writes:
> They don't preclude each other though. E.g. it'd be sensible to have both

>> Per gripe from Ken Tanzer.  Back-patch to 9.6.  The issue exists
>> further back, but before 9.6 the code looks very different and it
>> doesn't actually know whether the "var" name matches anything,
>> so I desisted from trying to fix it.

> and "Backpatch: 9.6-" or such.

I've wondered for some time what you think the "-" means in this.

regards, tom lane




Re: A little report on informal commit tag usage

2019-07-16 Thread Andres Freund
Hi,

On 2019-07-16 10:33:06 -0400, Tom Lane wrote:
> Michael Paquier  writes:
> > As mentioned on different threads, "Discussion" is the only one we had
> > a strong agreement with.  Could it be possible to consider things like
> > Author, Reported-by, Reviewed-by or Backpatch-through for example and
> > extend to that?  The first three ones are useful for parsing the
> > commit logs.  The fourth one is handy so as there is no need to look
> > at a full log tree with git log --graph or such, which is something I
> > do from time to time to guess down to where a fix has been applied (I
> > tend to avoid git_changelog).
> 
> FWIW, I'm one of the people who prefer prose for this.  The backpatching
> bit is a good example of why, because my log messages typically don't
> just say "backpatch to 9.6" but something about why that was the cutoff.

They don't preclude each other though. E.g. it'd be sensible to have both

> Per gripe from Ken Tanzer.  Back-patch to 9.6.  The issue exists
> further back, but before 9.6 the code looks very different and it
> doesn't actually know whether the "var" name matches anything,
> so I desisted from trying to fix it.

and "Backpatch: 9.6-" or such.


> I am in favor of trying to consistently mention that a patch is being
> back-patched, rather than expecting people to rely on git metadata
> to find that out.  But I don't see that a rigid "Backpatch" tag format
> makes anything easier there.  If you need to know that mechanically,
> git_changelog is way more reliable.

I find it useful to have a quick place to scan in a commit message. It's
a lot quicker to focus on the last few lines with tags, and see a
'Backpatch: 9.6-' than to parse a potentially long commit message. If
I'm then still interested in the commit, I'll then read the commit.

Greetings,

Andres Freund




Re: A little report on informal commit tag usage

2019-07-16 Thread Daniel Gustafsson
> On 16 Jul 2019, at 16:33, Tom Lane  wrote:
> 
> Michael Paquier  writes:
>> As mentioned on different threads, "Discussion" is the only one we had
>> a strong agreement with.  Could it be possible to consider things like
>> Author, Reported-by, Reviewed-by or Backpatch-through for example and
>> extend to that?  The first three ones are useful for parsing the
>> commit logs.  The fourth one is handy so as there is no need to look
>> at a full log tree with git log --graph or such, which is something I
>> do from time to time to guess down to where a fix has been applied (I
>> tend to avoid git_changelog).
> 
> FWIW, I'm one of the people who prefer prose for this.  The backpatching
> bit is a good example of why, because my log messages typically don't
> just say "backpatch to 9.6" but something about why that was the cutoff.

Wearing my $work-hat where I regularly perform interesting merges of postgres
releases as an upstream, these detailed commit messages are very valuable and
much appreciated.  The wealth of (human readable) information stored in the
commit logs makes tracking postgres as an upstream quite a lot easier.

> I'm also skeptical of the argument that machine-parseable Reported-by
> and so forth are useful to anybody.  Who'd use them, and for what?

The green gamification dot on people’s Github profiles might light up if the
machine readable format with email address was used (and the user has that
specific email connected to their Github account unless it’s a primary email).
Looking at commit 1c9bb02d8ec1d5b1b319e4fed70439a403c245b1 I can see that for
August 2018 Amit’s Github profile lists “Created 1 commit in 1 repository
postgres/postgres 1 commit”, which is likely from this commit message being
parsed in the mirror.

cheers ./daniel



Re: A little report on informal commit tag usage

2019-07-16 Thread Tom Lane
Michael Paquier  writes:
> As mentioned on different threads, "Discussion" is the only one we had
> a strong agreement with.  Could it be possible to consider things like
> Author, Reported-by, Reviewed-by or Backpatch-through for example and
> extend to that?  The first three ones are useful for parsing the
> commit logs.  The fourth one is handy so as there is no need to look
> at a full log tree with git log --graph or such, which is something I
> do from time to time to guess down to where a fix has been applied (I
> tend to avoid git_changelog).

FWIW, I'm one of the people who prefer prose for this.  The backpatching
bit is a good example of why, because my log messages typically don't
just say "backpatch to 9.6" but something about why that was the cutoff.
For instance in 0ec3e13c6,

Per gripe from Ken Tanzer.  Back-patch to 9.6.  The issue exists
further back, but before 9.6 the code looks very different and it
doesn't actually know whether the "var" name matches anything,
so I desisted from trying to fix it.

I am in favor of trying to consistently mention that a patch is being
back-patched, rather than expecting people to rely on git metadata
to find that out.  But I don't see that a rigid "Backpatch" tag format
makes anything easier there.  If you need to know that mechanically,
git_changelog is way more reliable.

I'm also skeptical of the argument that machine-parseable Reported-by
and so forth are useful to anybody.  Who'd use them, and for what?
Also, it's not always clear how to apply such a format to a real
situation --- eg, what do you do if the reporter is also the patch
author, or a co-author?  I'm not excited about redundantly entering
somebody's name several times.

regards, tom lane




Re: A little report on informal commit tag usage

2019-07-16 Thread Michael Paquier
On Mon, Jul 15, 2019 at 05:49:26PM +1200, Thomas Munro wrote:
> I would have tried to exclude the first line messages if I'd thought
> of that.  But anyway, the reason for the low Doc number is case
> sensitivity. I ran that on a Mac and its lame collation support failed
> me in the "sort" step (also -i didn't do what I wanted, but that
> wasn't the issue).  Trying again on FreeBSD box and explicitly setting
> LANG for the benefit of anyone else wanting to run this (see end), and
> then removing a few obvious false matches, I now get similar numbers
> in most fields but a higher "doc" number:
> 
>  767 Author
>9 Authors
>  144 Backpatch-through
>   55 Backpatch
>   14 Bug
>   14 Co-authored-by
>   27 Diagnosed-by
> 1599 Discussion
>  119 doc
>   36 docs
>  284 Reported-by
>5 Review
>8 Reviewed by
>  460 Reviewed-by
>7 Security
>9 Tested-by

Thanks for those numbers.  I am wondering if we could do a bit of
consolidation here and write a page about this stuff on the wiki.
Getting the "Discussion" field most of the time is really cool.

I think that we could get some improvements on the following things.
Here is a set of ideas:
- Avoid "Authors" and replace it with "Author" even if there are
multiple authors.
- Avoid having multiple entries for each one of them?  For example we
have a couple of commits listed listing one "Reviewed-by" field with
one single name.
- Most commit entries to not use the email address with the name of
the author, reviewer, tester or reporter.  Perhaps we should give up
on that?
- Keep "Backpatch-through", not "Backpatch".
- Keep "Reviewed-by", not "Reviewed by" nor "Review".

"Security" is a special case, we append it to all the CVE-related
commits.

That is mainly a matter of taste, but I tend to prefer the following
format, protecting usually the order:
- Diagnosed-by
- Author
- Reviewed-by
- Discussion
- Backpatch-through
- I tend to have only one "Reviewed-by" entry with a list of names,
same for "Author" and "Reported-by".
- Only names, no emails.

As mentioned on different threads, "Discussion" is the only one we had
a strong agreement with.  Could it be possible to consider things like
Author, Reported-by, Reviewed-by or Backpatch-through for example and
extend to that?  The first three ones are useful for parsing the
commit logs.  The fourth one is handy so as there is no need to look
at a full log tree with git log --graph or such, which is something I
do from time to time to guess down to where a fix has been applied (I
tend to avoid git_changelog).
--
Michael


signature.asc
Description: PGP signature


Re: A little report on informal commit tag usage

2019-07-14 Thread Thomas Munro
On Mon, Jul 15, 2019 at 5:12 PM Tom Lane  wrote:
> Thomas Munro  writes:
> >   42 Doc

> [...]  I see a lot more than 42 such commit messages in the past
> year, so not sure what you were counting?

I would have tried to exclude the first line messages if I'd thought
of that.  But anyway, the reason for the low Doc number is case
sensitivity. I ran that on a Mac and its lame collation support failed
me in the "sort" step (also -i didn't do what I wanted, but that
wasn't the issue).  Trying again on FreeBSD box and explicitly setting
LANG for the benefit of anyone else wanting to run this (see end), and
then removing a few obvious false matches, I now get similar numbers
in most fields but a higher "doc" number:

 767 Author
   9 Authors
 144 Backpatch-through
  55 Backpatch
  14 Bug
  14 Co-authored-by
  27 Diagnosed-by
1599 Discussion
 119 doc
  36 docs
 284 Reported-by
   5 Review
   8 Reviewed by
 460 Reviewed-by
   7 Security
   9 Tested-by

git log --since 2018-07-14 | \
  grep -E '^ +[a-zA-Z].*: ' | \
  LANG=en_US.UTF-8 sort | \
  sed 's/:.*//' | \
  LANG=en_US.UTF-8 uniq -ic | \
  grep -v -E '^ *[12] '

-- 
Thomas Munro
https://enterprisedb.com




Re: A little report on informal commit tag usage

2019-07-14 Thread Tom Lane
Thomas Munro  writes:
> Here are the tags that people have used in the past year, in commit messages:

>  763 Author
>9 Authors
>  144 Backpatch-through
>   55 Backpatch
>   14 Bug
>   14 Co-authored-by
>   27 Diagnosed-By
> 1593 Discussion
>   42 Doc
>  284 Reported-By
>5 Review
>8 Reviewed by
>  456 Reviewed-By
>7 Security
>9 Tested-By

One small comment on that --- I'm not sure what you meant to count
in respect to the "Doc" item, but I believe there's a fairly widespread
convention to write "doc:" or some variant in the initial summary line
of commits that touch only documentation.  The point here is to let
release-note writers quickly ignore such commits, since we never list
them as release note items.  Bruce and I, being the usual suspects for
release-note writing, are pretty religious about this but other people
do it too.  I see a lot more than 42 such commit messages in the past
year, so not sure what you were counting?

Anyway, that's not a "tag" in the sense I understand you to be using
(otherwise the entries would look something like "Doc: yes" and be at
the end, which is unhelpful for the purpose).  But it's a related sort
of commit-message convention.

regards, tom lane