Re: buildfarm warnings

2022-02-18 Thread Robert Haas
On Thu, Feb 17, 2022 at 3:57 PM Robert Haas  wrote:
> True. That would be easy enough.

I played around with this a bit, and of course it is easy enough to
add --progress with or without --verbose to a few tests, but when I
reverted 62cb7427d1e491faf8612a82c2e3711a8cd65422 it didn't make any
tests fail. So then I tried sticking --progress --verbose into
@pg_basebackup_defs so all the tests would run with it, and that did
make some tests fail, but the same ones fail with and without the
patch. So it doesn't seem like we would have caught this particular
problem via this testing method no matter what we did in detail.

If we just want to splatter a few --progress switches around for the
heck of it, we could do something like the attached. But I don't know
if this is best: it seems highly arguable what to do in detail (and
also not worth arguing about, so if someone else feels motivated to do
something different than this, or the same as this, fine by me).

-- 
Robert Haas
EDB: http://www.enterprisedb.com


splatter-some-progress.patch
Description: Binary data


Re: buildfarm warnings

2022-02-17 Thread Robert Haas
On Thu, Feb 17, 2022 at 3:51 PM Andres Freund  wrote:
> On 2022-02-17 15:22:08 -0500, Robert Haas wrote:
> > OK, sounds good, thanks. I couldn't (and still can't) think of a good
> > way of testing the progress-reporting code either. I mean I guess if
> > you could convince pg_basebackup not to truncate the filenames, maybe
> > by convincing it that your terminal is as wide as your garage door,
> > then you could capture the output and do some tests against it. But I
> > feel like the test code would be two orders of magnitude larger than
> > the code it intends to exercise, and I'm not sure it would be entirely
> > robust, either.
>
> How about just running pg_basebackup with --progress in one or two of the
> tests? Of course that's not testing very much, but at least it verifies not
> crashing...

True. That would be easy enough.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: buildfarm warnings

2022-02-17 Thread Andres Freund
On 2022-02-17 15:22:08 -0500, Robert Haas wrote:
> OK, sounds good, thanks. I couldn't (and still can't) think of a good
> way of testing the progress-reporting code either. I mean I guess if
> you could convince pg_basebackup not to truncate the filenames, maybe
> by convincing it that your terminal is as wide as your garage door,
> then you could capture the output and do some tests against it. But I
> feel like the test code would be two orders of magnitude larger than
> the code it intends to exercise, and I'm not sure it would be entirely
> robust, either.

How about just running pg_basebackup with --progress in one or two of the
tests? Of course that's not testing very much, but at least it verifies not
crashing...




Re: buildfarm warnings

2022-02-17 Thread Robert Haas
On Thu, Feb 17, 2022 at 2:36 PM Tom Lane  wrote:
> Yeah, I came to the same conclusion while out doing some errands.
> There's no very convincing reason to believe that what's passed to
> progress_update_filename has got adequate lifespan either, or that
> that would remain true even if it's true today, or that testing
> would detect such a problem (even if we had any, ahem).  The file
> names I was seeing in testing just now tended to contain fragments
> of temporary directory names, which'd be mighty hard to tell from
> random garbage in any automated way:
>
> 3/22774 kB (0%), 0/2 tablespaces (...pc1/PG_15_202202141/5/16392_init)
> 3/22774 kB (0%), 1/2 tablespaces (...pc1/PG_15_202202141/5/16392_init)
> 22785/22785 kB (100%), 1/2 tablespaces (...t_T8u0/backup1/global/pg_control)
>
> So I think we should make progress_update_filename strdup each
> input while freeing the last value, and insist that every update
> go through that logic.  (This is kind of a lot of cycles to spend
> in support of an optional mode, but maybe we could do it only if
> showprogress && verbose.)  I'll go make it so.

OK, sounds good, thanks. I couldn't (and still can't) think of a good
way of testing the progress-reporting code either. I mean I guess if
you could convince pg_basebackup not to truncate the filenames, maybe
by convincing it that your terminal is as wide as your garage door,
then you could capture the output and do some tests against it. But I
feel like the test code would be two orders of magnitude larger than
the code it intends to exercise, and I'm not sure it would be entirely
robust, either.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: buildfarm warnings

2022-02-17 Thread Tom Lane
Robert Haas  writes:
> If not, I think that your quick-and-dirty fix is about right, except
> that we probably need to do it every place where we set
> progress_filename, and we should arrange to free the memory later.
> With -Ft, you won't have enough archives to matter, but with -Fp, you
> might have enough archive members to matter.

Yeah, I came to the same conclusion while out doing some errands.
There's no very convincing reason to believe that what's passed to
progress_update_filename has got adequate lifespan either, or that
that would remain true even if it's true today, or that testing
would detect such a problem (even if we had any, ahem).  The file
names I was seeing in testing just now tended to contain fragments
of temporary directory names, which'd be mighty hard to tell from
random garbage in any automated way:

3/22774 kB (0%), 0/2 tablespaces (...pc1/PG_15_202202141/5/16392_init)
3/22774 kB (0%), 1/2 tablespaces (...pc1/PG_15_202202141/5/16392_init)
22785/22785 kB (100%), 1/2 tablespaces (...t_T8u0/backup1/global/pg_control)

So I think we should make progress_update_filename strdup each
input while freeing the last value, and insist that every update
go through that logic.  (This is kind of a lot of cycles to spend
in support of an optional mode, but maybe we could do it only if
showprogress && verbose.)  I'll go make it so.

regards, tom lane




Re: buildfarm warnings

2022-02-17 Thread Robert Haas
On Thu, Feb 17, 2022 at 12:08 PM Tom Lane  wrote:
> I wrote:
> > Justin Pryzby  writes:
> >> pg_basebackup.c:1261:35: warning: storing the address of local variable 
> >> archive_filename in progress_filename [-Wdangling-pointer=]
> >> => new in 23a1c6578 - looks like a real error
>
> > I saw that one a few days ago but didn't get around to looking
> > more closely yet.  It does look fishy, but it might be okay
> > depending on when the global variable can be accessed.
>
> I got around to looking at this, and it absolutely is a bug.
> The test scripts don't reach it, because they don't exercise -P
> mode, let alone -P -v which is what you need to get progress_report()
> to try to print the filename.  However, if you modify the test
> to do so, then you find that the output differs depending on
> whether or not you add "progress_filename = NULL;" at the bottom
> of CreateBackupStreamer().  So the variable is continuing to be
> referenced, not only after it goes out of scope within that
> function, but even after the function returns entirely.
> (Interestingly, the output isn't obvious garbage, at least not
> on my machine; so somehow the array's part of the stack is not
> getting overwritten very soon.)
>
> A quick-and-dirty fix is
>
> -   progress_filename = archive_filename;
> +   progress_filename = pg_strdup(archive_filename);
>
> but perhaps this needs more thought.  How long is that filename
> actually reasonable to show in the progress reports?

Man, you couldn't have asked a question that made me any happier.
Preserving that behavior was one of the most annoying parts of this
whole refactoring. The server always sends a series of tar files, but
the pre-existing behavior is that progress reports show the archive
name with -Ft and the name of the archive member with -Fp. I think
that's sort of useful, but it's certainly annoying from an
implementation perspective because we only know the name of the
archive member if we're extracting the tarfile, possibly after
decompressing it, whereas the name of the archive itself is easily
known. This results in annoying gymnastics to try to update the global
variable that we use to store this information from very different
places depending on what the user requested, and evidently I messed
that up, and also, why in the world does this code think that "more
global variables" is the right answer to every problem?

I'm not totally convinced that it's OK to just hit this progress
reporting stuff with a hammer and remove some functionality in the
interest of simplifying the code. But I will shed no tears if that's
the direction other people would like to go.

If not, I think that your quick-and-dirty fix is about right, except
that we probably need to do it every place where we set
progress_filename, and we should arrange to free the memory later.
With -Ft, you won't have enough archives to matter, but with -Fp, you
might have enough archive members to matter.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: buildfarm warnings

2022-02-17 Thread Tom Lane
I wrote:
> Justin Pryzby  writes:
>> pg_basebackup.c:1261:35: warning: storing the address of local variable 
>> archive_filename in progress_filename [-Wdangling-pointer=]
>> => new in 23a1c6578 - looks like a real error

> I saw that one a few days ago but didn't get around to looking
> more closely yet.  It does look fishy, but it might be okay
> depending on when the global variable can be accessed.

I got around to looking at this, and it absolutely is a bug.
The test scripts don't reach it, because they don't exercise -P
mode, let alone -P -v which is what you need to get progress_report()
to try to print the filename.  However, if you modify the test
to do so, then you find that the output differs depending on
whether or not you add "progress_filename = NULL;" at the bottom
of CreateBackupStreamer().  So the variable is continuing to be
referenced, not only after it goes out of scope within that
function, but even after the function returns entirely.
(Interestingly, the output isn't obvious garbage, at least not
on my machine; so somehow the array's part of the stack is not
getting overwritten very soon.)

A quick-and-dirty fix is

-   progress_filename = archive_filename;
+   progress_filename = pg_strdup(archive_filename);

but perhaps this needs more thought.  How long is that filename
actually reasonable to show in the progress reports?

regards, tom lane




Re: buildfarm warnings

2022-02-14 Thread Andrew Dunstan


On 2/12/22 16:13, Justin Pryzby wrote:
> Is there any check for warnings from new code, other than those buildfarm
> members with -Werror ?


I had forgotten about this :-) but a few years ago I provided a
check_warnings setting (and a --check-warnings command line parameter).
It's processed if set in the main build steps ("make", "make contrib"
and "make_test_modules") as well as some modules that check external
software. But it hasn't been widely advertised so it's probably in
hardly any use, if at all.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: buildfarm warnings

2022-02-13 Thread Thomas Munro
On Mon, Feb 14, 2022 at 6:10 AM Tom Lane  wrote:
> Another thing I've been ignoring on the grounds that I-don't-feel-
> like-fixing-this is that the Solaris and OpenIndiana machines whine
> about every single reference from loadable modules to the core code,
> eg this from haddock while building contrib/fuzzystrmatch:
>
> Undefined   first referenced
>  symbol in file
> cstring_to_text dmetaphone.o
> varstr_levenshtein  fuzzystrmatch.o
> text_to_cstring dmetaphone.o
> errstart_cold   fuzzystrmatch.o
> errmsg  fuzzystrmatch.o
> palloc  dmetaphone.o
> ExceptionalConditionfuzzystrmatch.o
> errfinish   fuzzystrmatch.o
> varstr_levenshtein_less_equal   fuzzystrmatch.o
> repallocdmetaphone.o
> errcode fuzzystrmatch.o
> errmsg_internal fuzzystrmatch.o
> pg_detoast_datum_packed dmetaphone.o
> ld: warning: symbol referencing errors
>
> Presumably, this means that the Solaris linker would really like
> to see the executable the module will load against, analogously to
> the BE_DLLLIBS settings we use on some other platforms such as macOS.
> It evidently still works without that, but we might be losing
> performance from an extra indirection or the like.  And in any case,
> this amount of noise would be very distracting if you wanted to do
> actual development on that platform.  I'm not especially interested
> in tracking down the correct incantation, but I'd gladly push a patch
> if someone submits one.

I took a quick look at this on an OpenIndiana vagrant image, using the
GNU compiler and [Open]Solaris's linker.  It seems you can't use -l
for this (it only likes files called libX.Y) but the manual[1] says
that "shared objects that reference symbols from an application can
use the -z defs option, together with defining the symbols by using an
extern mapfile directive", so someone might need to figure out how to
build and feed in such a file...

[1] https://docs.oracle.com/cd/E26502_01/html/E26507/chapter2-90421.html




Re: xml_is_well_formed (was Re: buildfarm warnings)

2022-02-13 Thread Andrew Dunstan


On 2/13/22 13:59, Tom Lane wrote:
> So we're faced with a dilemma: we can't rename either instance without
> breaking something.  The only way to get rid of the warnings seems to be
> to decide that the copy in xml2 has outlived its usefulness and remove
> it.  I don't think that's a hard argument to make: that version has been
> obsolete since 9.1 (a0b7b717a), meaning that only pre-extensions versions
> of xml2 could need it.  In fact, we pulled the trigger on it once before,
> in 2016 (20540710e), and then changed our minds not because anyone
> lobbied to put it back but just because we gave up on the PGDLLEXPORT
> rearrangement that prompted the change then.
>
> I think that getting rid of these build warnings is sufficient reason
> to drop this ancient compatibility function, and now propose that
> we do that.
>
>   


+1


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: buildfarm warnings

2022-02-13 Thread Andrew Dunstan


On 2/12/22 16:42, Tom Lane wrote:
>> I'm of the impression that some people have sql access to BF logs.
> Yeah.  I'm not sure exactly what the access policy is for that machine;
> I know we'll give out logins to committers, but not whether there's
> any precedent for non-committers.


Yeah, I don't think it's wider than that. The sysadmins own these
instances, so the policy is really up to them.


cheers


andrew



--
Andrew Dunstan
EDB: https://www.enterprisedb.com





xml_is_well_formed (was Re: buildfarm warnings)

2022-02-13 Thread Tom Lane
I wrote:
> This conversation motivated me to take a fresh pass over said filtering
> script, and one thing I notice that it's been ignoring is these complaints
> from all the AIX animals:

> ld: 0711-224 WARNING: Duplicate symbol: .xml_is_well_formed
> ld: 0711-224 WARNING: Duplicate symbol: xml_is_well_formed

> while building contrib/xml2.  Evidently this is because xml2 defines
> a symbol also defined in the core.  We cannot rename xml2's function
> without a user-visible ABI break, but it would be pretty painless
> to rename the core function (at the C level, not SQL level).
> So I propose we do that.

I tried to do that, but it blew up in my face, because it turns out that
actually contrib/xml2's extension script has a by-C-name reference to
the *core* function:

-- deprecated old name for xml_is_well_formed
CREATE FUNCTION xml_valid(text) RETURNS bool
AS 'xml_is_well_formed'
LANGUAGE INTERNAL STRICT STABLE PARALLEL SAFE;

The situation with the instance in xml2 is explained by its comment:

 * Note: this has been superseded by a core function.  We still have to
 * have it in the contrib module so that existing SQL-level references
 * to the function won't fail; but in normal usage with up-to-date SQL
 * definitions for the contrib module, this won't be called.

So we're faced with a dilemma: we can't rename either instance without
breaking something.  The only way to get rid of the warnings seems to be
to decide that the copy in xml2 has outlived its usefulness and remove
it.  I don't think that's a hard argument to make: that version has been
obsolete since 9.1 (a0b7b717a), meaning that only pre-extensions versions
of xml2 could need it.  In fact, we pulled the trigger on it once before,
in 2016 (20540710e), and then changed our minds not because anyone
lobbied to put it back but just because we gave up on the PGDLLEXPORT
rearrangement that prompted the change then.

I think that getting rid of these build warnings is sufficient reason
to drop this ancient compatibility function, and now propose that
we do that.

regards, tom lane




Re: buildfarm warnings

2022-02-13 Thread Tom Lane
I wrote:
> Justin Pryzby  writes:
>> Is there any check for warnings from new code, other than those buildfarm
>> members with -Werror ?

> I periodically grep the buildfarm logs for interesting warnings.
> There are a lot of uninteresting ones :-(, so I'm not sure how
> automatable that'd be.  I do use a prefiltering script, which
> right now says there are
> 13917 total warnings from 41 buildfarm members
> of which it thinks all but 350 are of no interest.

This conversation motivated me to take a fresh pass over said filtering
script, and one thing I notice that it's been ignoring is these complaints
from all the AIX animals:

ld: 0711-224 WARNING: Duplicate symbol: .xml_is_well_formed
ld: 0711-224 WARNING: Duplicate symbol: xml_is_well_formed

while building contrib/xml2.  Evidently this is because xml2 defines
a symbol also defined in the core.  We cannot rename xml2's function
without a user-visible ABI break, but it would be pretty painless
to rename the core function (at the C level, not SQL level).
So I propose we do that.  I think that when I put in that pattern,
there were more problems, but this seems to be all that's left.

Another thing I've been ignoring on the grounds that I-don't-feel-
like-fixing-this is that the Solaris and OpenIndiana machines whine
about every single reference from loadable modules to the core code,
eg this from haddock while building contrib/fuzzystrmatch:

Undefined   first referenced
 symbol in file
cstring_to_text dmetaphone.o
varstr_levenshtein  fuzzystrmatch.o
text_to_cstring dmetaphone.o
errstart_cold   fuzzystrmatch.o
errmsg  fuzzystrmatch.o
palloc  dmetaphone.o
ExceptionalConditionfuzzystrmatch.o
errfinish   fuzzystrmatch.o
varstr_levenshtein_less_equal   fuzzystrmatch.o
repallocdmetaphone.o
errcode fuzzystrmatch.o
errmsg_internal fuzzystrmatch.o
pg_detoast_datum_packed dmetaphone.o
ld: warning: symbol referencing errors

Presumably, this means that the Solaris linker would really like
to see the executable the module will load against, analogously to
the BE_DLLLIBS settings we use on some other platforms such as macOS.
It evidently still works without that, but we might be losing
performance from an extra indirection or the like.  And in any case,
this amount of noise would be very distracting if you wanted to do
actual development on that platform.  I'm not especially interested
in tracking down the correct incantation, but I'd gladly push a patch
if someone submits one.

regards, tom lane




Re: buildfarm warnings

2022-02-12 Thread Andres Freund
Hi,

On 2022-02-12 16:42:03 -0500, Tom Lane wrote:
> Another new one that maybe should be silenced is
>
> /mnt/resource/bf/build/skink-master/HEAD/pgsql.build/../pgsql/src/backend/access/heap/vacuumlazy.c:1129:41:
>  warning: 'freespace' may be used uninitialized in this function 
> [-Wmaybe-uninitialized]
>
> Only skink and frogfish are showing that, though.

skink uses -O1 to reduce the overhead of valgrind a bit (iirc that shaved off
a couple of hours) without making backtraces completely unreadable. Which
explains why it shows different warnings than most other animals...

Greetings,

Andres Freund




Re: buildfarm warnings

2022-02-12 Thread Tom Lane
Justin Pryzby  writes:
> Is there any check for warnings from new code, other than those buildfarm
> members with -Werror ?

I periodically grep the buildfarm logs for interesting warnings.
There are a lot of uninteresting ones :-(, so I'm not sure how
automatable that'd be.  I do use a prefiltering script, which
right now says there are
13917 total warnings from 41 buildfarm members
of which it thinks all but 350 are of no interest.  Most of those
350 are of no interest either, but I didn't bother to make
filter rules for them yet ...

> It'd be better to avoid warnings, allowing members to use -Werror, rather than
> to allow/ignore warnings, which preclude that possibility.

The ones I classify as "uninteresting" are mostly from old/broken
compilers.  I don't think it'd be profitable to try to convince
prairiedog's 17-year-old compiler that we don't have uninitialized
variables, for instance.

> I'm of the impression that some people have sql access to BF logs.

Yeah.  I'm not sure exactly what the access policy is for that machine;
I know we'll give out logins to committers, but not whether there's
any precedent for non-committers.

> pg_basebackup.c:1261:35: warning: storing the address of local variable 
> archive_filename in progress_filename [-Wdangling-pointer=]
>   => new in 23a1c6578 - looks like a real error

I saw that one a few days ago but didn't get around to looking
more closely yet.  It does look fishy, but it might be okay
depending on when the global variable can be accessed.

Another new one that maybe should be silenced is

/mnt/resource/bf/build/skink-master/HEAD/pgsql.build/../pgsql/src/backend/access/heap/vacuumlazy.c:1129:41:
 warning: 'freespace' may be used uninitialized in this function 
[-Wmaybe-uninitialized]

Only skink and frogfish are showing that, though.

regards, tom lane