Re: run pgindent on a regular basis / scripted manner

2020-08-12 Thread Noah Misch
On Wed, Aug 12, 2020 at 07:47:01PM -0400, Tom Lane wrote:
> Jesse Zhang  writes:
> > On Wed, Aug 12, 2020 at 3:34 PM Andres Freund wrote:
> >> Is there any reason we don't just automatically run pgindent regularly?
> >> Like once a week? And also update typedefs.list automatically, while
> >> we're at it?
> 
> > You know what's better than weekly? Every check-in.
> 
> I'm not in favor of unsupervised pgindent runs, really.  It can do a lot
> of damage to code that was written without thinking about it --- in
> particular, it'll make a hash of comment blocks that were manually
> formatted and not protected with dashes.
> 
> If the workflow is commit first and re-indent later, then we can always
> revert the pgindent commit and clean things up manually; but an auto
> re-indent during commit wouldn't provide that history.

There are competing implementations of assuring pgindent-cleanliness at every
check-in:

1. After each push, an automated followup commit appears, restoring
   pgindent-cleanliness.
2. "git push" results in a commit that melds pgindent changes into what the
   committer tried to push.
3. "git push" fails, for the master branch, if the pushed commit disrupts
   pgindent-cleanliness.

Were you thinking of (2)?  (1) doesn't have the lack-of-history problem, but
it does have the unexpected-damage problem, and it makes gitweb noisier.  (3)
has neither problem, and I'd prefer it over (1), (2), or $SUBJECT.

Regarding typedefs.list, I would use the in-tree one, like you discussed here:

On Wed, Aug 12, 2020 at 07:57:29PM -0400, Tom Lane wrote:
> Maybe the secret is to not allow automated adoption of new typedefs.list
> entries, but to require somebody to add entries to that file by hand,
> even if they're basing it on the buildfarm results.  (This would
> encourage the habit some people have adopted of updating typedefs.list
> along with commits that add typedefs.  I've never done that, but would
> be willing to change if there's good motivation to.)




Re: run pgindent on a regular basis / scripted manner

2020-08-12 Thread Michael Paquier
On Wed, Aug 12, 2020 at 06:53:25PM -0400, Alvaro Herrera wrote:
> On 2020-Aug-12, Andres Freund wrote:
>> Is there any reason we don't just automatically run pgindent regularly?
>> Like once a week? And also update typedefs.list automatically, while
>> we're at it?
> 
> Seconded.

Thirded.
--
Michael


signature.asc
Description: PGP signature


Re: run pgindent on a regular basis / scripted manner

2020-08-12 Thread Tom Lane
Andres Freund  writes:
> Unfortunately that is, with the current tooling, not entirely trivial to
> do so completely. The way we generate the list of known typedefs
> unfortunately depends on the platform a build is run on. Therefore the
> buildfarm collects a number of the generated list of typedefs from
> different platforms, and then we use that combined list to run pgindent.

Yeah, it's hard to see how to avoid that given that the set of typedefs
provided by system headers is not fixed.  (Windows vs not Windows is the
worst case of course, but Unixen aren't all alike either.)

Another gotcha that we have to keep our eyes on is that sometimes the
process finds spurious names that we don't want to treat as typedefs
because it results in messing up too much code.  There's a reject list
in pgindent that takes care of those, but it has to be maintained
manually.  So I'm not sure how that could work in conjunction with
unsupervised reindents --- by the time you notice a problem, the git
history will already be cluttered with bogus reindentations.

Maybe the secret is to not allow automated adoption of new typedefs.list
entries, but to require somebody to add entries to that file by hand,
even if they're basing it on the buildfarm results.  (This would
encourage the habit some people have adopted of updating typedefs.list
along with commits that add typedefs.  I've never done that, but would
be willing to change if there's good motivation to.)

regards, tom lane




Re: run pgindent on a regular basis / scripted manner

2020-08-12 Thread Tom Lane
Jesse Zhang  writes:
> On Wed, Aug 12, 2020 at 3:34 PM Andres Freund wrote:
>> Is there any reason we don't just automatically run pgindent regularly?
>> Like once a week? And also update typedefs.list automatically, while
>> we're at it?

> You know what's better than weekly? Every check-in.

I'm not in favor of unsupervised pgindent runs, really.  It can do a lot
of damage to code that was written without thinking about it --- in
particular, it'll make a hash of comment blocks that were manually
formatted and not protected with dashes.

If the workflow is commit first and re-indent later, then we can always
revert the pgindent commit and clean things up manually; but an auto
re-indent during commit wouldn't provide that history.

I do like the idea of more frequent, smaller pgindent runs instead of
doing just one giant run per year.  With the giant runs it's necessary
to invest a fair amount of time eyeballing all the changes; if we did it
every couple weeks then the pain would be a lot less.

Another idea would be to have a bot that runs pgindent *without*
committing the results, and emails the people who have made commits
into files that changed, saying "if you don't like these diffs then
you'd better clean up your commit before it happens for real".  With
some warning like that, it might be okay to do automatic reindenting
a little bit later.  Plus, nagging committers who habitually commit
improperly-indented code might persuade them to clean up their acts ;-)

regards, tom lane




Re: run pgindent on a regular basis / scripted manner

2020-08-12 Thread Andres Freund
Hi,

On 2020-08-12 16:08:50 -0700, Jesse Zhang wrote:
> On Wed, Aug 12, 2020 at 3:34 PM Andres Freund wrote:
> >
> > Hi,
> >
> > When developing patches I find it fairly painful that I cannot re-indent
> > patches with pgindent without also seeing a lot of indentation changes
> > in unmodified parts of files.  It is easy enough ([1]) to only re-indent
> > files that I have modified, but there's often a lot of independent
> > indentation changes in the files that I did modified.
> >
> > I e.g. just re-indented patch 0001 of my GetSnapshotData() series and
> > most of the hunks were entirely unrelated. Despite the development
> > window for 14 having only relatively recently opened. Based on my
> > experience it tends to get worse over time.
> 
> How bad was it right after branching 13? I wonder if we have any
> empirical measure of badness over time -- assuming there was a point in
> the recent past where everything was good, and the bad just crept in.

Well, just after branching it was perfect, because pgindent was
customarily is run just before branching. After that it incrementally
gets worse.


> > Is there any reason we don't just automatically run pgindent regularly?
> > Like once a week? And also update typedefs.list automatically, while
> > we're at it?
> 
> You know what's better than weekly? Every check-in. I for one would love
> it if we can just format the entire codebase, and ensure that new
> check-ins are also formatted. We _do_ need some form of continuous
> integration to catch us when we have fallen short (again, once HEAD
> reaches a "known good" state, it's conceivably cheap to keep it in the
> good state.

Unfortunately that is, with the current tooling, not entirely trivial to
do so completely. The way we generate the list of known typedefs
unfortunately depends on the platform a build is run on. Therefore the
buildfarm collects a number of the generated list of typedefs from
different platforms, and then we use that combined list to run pgindent.

We surely can improve further, but I think having any automation around
this already would be a huge step.

Greetings,

Andres Freund




Re: run pgindent on a regular basis / scripted manner

2020-08-12 Thread Jesse Zhang
Hi Andres,

On Wed, Aug 12, 2020 at 3:34 PM Andres Freund wrote:
>
> Hi,
>
> When developing patches I find it fairly painful that I cannot re-indent
> patches with pgindent without also seeing a lot of indentation changes
> in unmodified parts of files.  It is easy enough ([1]) to only re-indent
> files that I have modified, but there's often a lot of independent
> indentation changes in the files that I did modified.
>
> I e.g. just re-indented patch 0001 of my GetSnapshotData() series and
> most of the hunks were entirely unrelated. Despite the development
> window for 14 having only relatively recently opened. Based on my
> experience it tends to get worse over time.

How bad was it right after branching 13? I wonder if we have any
empirical measure of badness over time -- assuming there was a point in
the recent past where everything was good, and the bad just crept in.

>
>
> Is there any reason we don't just automatically run pgindent regularly?
> Like once a week? And also update typedefs.list automatically, while
> we're at it?

You know what's better than weekly? Every check-in. I for one would love
it if we can just format the entire codebase, and ensure that new
check-ins are also formatted. We _do_ need some form of continuous
integration to catch us when we have fallen short (again, once HEAD
reaches a "known good" state, it's conceivably cheap to keep it in the
good state.

Cheers,
Jesse




Re: run pgindent on a regular basis / scripted manner

2020-08-12 Thread Alvaro Herrera
On 2020-Aug-12, Andres Freund wrote:

> Is there any reason we don't just automatically run pgindent regularly?
> Like once a week? And also update typedefs.list automatically, while
> we're at it?

Seconded.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




run pgindent on a regular basis / scripted manner

2020-08-12 Thread Andres Freund
Hi,

When developing patches I find it fairly painful that I cannot re-indent
patches with pgindent without also seeing a lot of indentation changes
in unmodified parts of files.  It is easy enough ([1]) to only re-indent
files that I have modified, but there's often a lot of independent
indentation changes in the files that I did modified.

I e.g. just re-indented patch 0001 of my GetSnapshotData() series and
most of the hunks were entirely unrelated. Despite the development
window for 14 having only relatively recently opened. Based on my
experience it tends to get worse over time.


Is there any reason we don't just automatically run pgindent regularly?
Like once a week? And also update typedefs.list automatically, while
we're at it?

Currently the yearly pgindent runs are somewhat painful for patches that
didn't make it into the release, but if we were to reindent on a more
regular basis, that should be much less the case.  It'd also help newer
developers who we sometimes tell to use pgindent - which doesn't really
work.

Greetings,

Andres Freund

[1] ./src/tools/pgindent/pgindent $(git diff-tree --no-commit-id --name-only -r 
upstream/master..HEAD|grep -v src/test|grep -v READ ME|grep -v typedefs.list)




<    1   2   3   4