Re: buildfarm warnings
On Thu, Feb 17, 2022 at 3:57 PM Robert Haas wrote: > True. That would be easy enough. I played around with this a bit, and of course it is easy enough to add --progress with or without --verbose to a few tests, but when I reverted 62cb7427d1e491faf8612a82c2e3711a8cd65422 it didn't make any tests fail. So then I tried sticking --progress --verbose into @pg_basebackup_defs so all the tests would run with it, and that did make some tests fail, but the same ones fail with and without the patch. So it doesn't seem like we would have caught this particular problem via this testing method no matter what we did in detail. If we just want to splatter a few --progress switches around for the heck of it, we could do something like the attached. But I don't know if this is best: it seems highly arguable what to do in detail (and also not worth arguing about, so if someone else feels motivated to do something different than this, or the same as this, fine by me). -- Robert Haas EDB: http://www.enterprisedb.com splatter-some-progress.patch Description: Binary data
Re: buildfarm warnings
On Thu, Feb 17, 2022 at 3:51 PM Andres Freund wrote: > On 2022-02-17 15:22:08 -0500, Robert Haas wrote: > > OK, sounds good, thanks. I couldn't (and still can't) think of a good > > way of testing the progress-reporting code either. I mean I guess if > > you could convince pg_basebackup not to truncate the filenames, maybe > > by convincing it that your terminal is as wide as your garage door, > > then you could capture the output and do some tests against it. But I > > feel like the test code would be two orders of magnitude larger than > > the code it intends to exercise, and I'm not sure it would be entirely > > robust, either. > > How about just running pg_basebackup with --progress in one or two of the > tests? Of course that's not testing very much, but at least it verifies not > crashing... True. That would be easy enough. -- Robert Haas EDB: http://www.enterprisedb.com
Re: buildfarm warnings
On 2022-02-17 15:22:08 -0500, Robert Haas wrote: > OK, sounds good, thanks. I couldn't (and still can't) think of a good > way of testing the progress-reporting code either. I mean I guess if > you could convince pg_basebackup not to truncate the filenames, maybe > by convincing it that your terminal is as wide as your garage door, > then you could capture the output and do some tests against it. But I > feel like the test code would be two orders of magnitude larger than > the code it intends to exercise, and I'm not sure it would be entirely > robust, either. How about just running pg_basebackup with --progress in one or two of the tests? Of course that's not testing very much, but at least it verifies not crashing...
Re: buildfarm warnings
On Thu, Feb 17, 2022 at 2:36 PM Tom Lane wrote: > Yeah, I came to the same conclusion while out doing some errands. > There's no very convincing reason to believe that what's passed to > progress_update_filename has got adequate lifespan either, or that > that would remain true even if it's true today, or that testing > would detect such a problem (even if we had any, ahem). The file > names I was seeing in testing just now tended to contain fragments > of temporary directory names, which'd be mighty hard to tell from > random garbage in any automated way: > > 3/22774 kB (0%), 0/2 tablespaces (...pc1/PG_15_202202141/5/16392_init) > 3/22774 kB (0%), 1/2 tablespaces (...pc1/PG_15_202202141/5/16392_init) > 22785/22785 kB (100%), 1/2 tablespaces (...t_T8u0/backup1/global/pg_control) > > So I think we should make progress_update_filename strdup each > input while freeing the last value, and insist that every update > go through that logic. (This is kind of a lot of cycles to spend > in support of an optional mode, but maybe we could do it only if > showprogress && verbose.) I'll go make it so. OK, sounds good, thanks. I couldn't (and still can't) think of a good way of testing the progress-reporting code either. I mean I guess if you could convince pg_basebackup not to truncate the filenames, maybe by convincing it that your terminal is as wide as your garage door, then you could capture the output and do some tests against it. But I feel like the test code would be two orders of magnitude larger than the code it intends to exercise, and I'm not sure it would be entirely robust, either. -- Robert Haas EDB: http://www.enterprisedb.com
Re: buildfarm warnings
Robert Haas writes: > If not, I think that your quick-and-dirty fix is about right, except > that we probably need to do it every place where we set > progress_filename, and we should arrange to free the memory later. > With -Ft, you won't have enough archives to matter, but with -Fp, you > might have enough archive members to matter. Yeah, I came to the same conclusion while out doing some errands. There's no very convincing reason to believe that what's passed to progress_update_filename has got adequate lifespan either, or that that would remain true even if it's true today, or that testing would detect such a problem (even if we had any, ahem). The file names I was seeing in testing just now tended to contain fragments of temporary directory names, which'd be mighty hard to tell from random garbage in any automated way: 3/22774 kB (0%), 0/2 tablespaces (...pc1/PG_15_202202141/5/16392_init) 3/22774 kB (0%), 1/2 tablespaces (...pc1/PG_15_202202141/5/16392_init) 22785/22785 kB (100%), 1/2 tablespaces (...t_T8u0/backup1/global/pg_control) So I think we should make progress_update_filename strdup each input while freeing the last value, and insist that every update go through that logic. (This is kind of a lot of cycles to spend in support of an optional mode, but maybe we could do it only if showprogress && verbose.) I'll go make it so. regards, tom lane
Re: buildfarm warnings
On Thu, Feb 17, 2022 at 12:08 PM Tom Lane wrote: > I wrote: > > Justin Pryzby writes: > >> pg_basebackup.c:1261:35: warning: storing the address of local variable > >> archive_filename in progress_filename [-Wdangling-pointer=] > >> => new in 23a1c6578 - looks like a real error > > > I saw that one a few days ago but didn't get around to looking > > more closely yet. It does look fishy, but it might be okay > > depending on when the global variable can be accessed. > > I got around to looking at this, and it absolutely is a bug. > The test scripts don't reach it, because they don't exercise -P > mode, let alone -P -v which is what you need to get progress_report() > to try to print the filename. However, if you modify the test > to do so, then you find that the output differs depending on > whether or not you add "progress_filename = NULL;" at the bottom > of CreateBackupStreamer(). So the variable is continuing to be > referenced, not only after it goes out of scope within that > function, but even after the function returns entirely. > (Interestingly, the output isn't obvious garbage, at least not > on my machine; so somehow the array's part of the stack is not > getting overwritten very soon.) > > A quick-and-dirty fix is > > - progress_filename = archive_filename; > + progress_filename = pg_strdup(archive_filename); > > but perhaps this needs more thought. How long is that filename > actually reasonable to show in the progress reports? Man, you couldn't have asked a question that made me any happier. Preserving that behavior was one of the most annoying parts of this whole refactoring. The server always sends a series of tar files, but the pre-existing behavior is that progress reports show the archive name with -Ft and the name of the archive member with -Fp. I think that's sort of useful, but it's certainly annoying from an implementation perspective because we only know the name of the archive member if we're extracting the tarfile, possibly after decompressing it, whereas the name of the archive itself is easily known. This results in annoying gymnastics to try to update the global variable that we use to store this information from very different places depending on what the user requested, and evidently I messed that up, and also, why in the world does this code think that "more global variables" is the right answer to every problem? I'm not totally convinced that it's OK to just hit this progress reporting stuff with a hammer and remove some functionality in the interest of simplifying the code. But I will shed no tears if that's the direction other people would like to go. If not, I think that your quick-and-dirty fix is about right, except that we probably need to do it every place where we set progress_filename, and we should arrange to free the memory later. With -Ft, you won't have enough archives to matter, but with -Fp, you might have enough archive members to matter. -- Robert Haas EDB: http://www.enterprisedb.com
Re: buildfarm warnings
I wrote: > Justin Pryzby writes: >> pg_basebackup.c:1261:35: warning: storing the address of local variable >> archive_filename in progress_filename [-Wdangling-pointer=] >> => new in 23a1c6578 - looks like a real error > I saw that one a few days ago but didn't get around to looking > more closely yet. It does look fishy, but it might be okay > depending on when the global variable can be accessed. I got around to looking at this, and it absolutely is a bug. The test scripts don't reach it, because they don't exercise -P mode, let alone -P -v which is what you need to get progress_report() to try to print the filename. However, if you modify the test to do so, then you find that the output differs depending on whether or not you add "progress_filename = NULL;" at the bottom of CreateBackupStreamer(). So the variable is continuing to be referenced, not only after it goes out of scope within that function, but even after the function returns entirely. (Interestingly, the output isn't obvious garbage, at least not on my machine; so somehow the array's part of the stack is not getting overwritten very soon.) A quick-and-dirty fix is - progress_filename = archive_filename; + progress_filename = pg_strdup(archive_filename); but perhaps this needs more thought. How long is that filename actually reasonable to show in the progress reports? regards, tom lane
Re: buildfarm warnings
On 2/12/22 16:13, Justin Pryzby wrote: > Is there any check for warnings from new code, other than those buildfarm > members with -Werror ? I had forgotten about this :-) but a few years ago I provided a check_warnings setting (and a --check-warnings command line parameter). It's processed if set in the main build steps ("make", "make contrib" and "make_test_modules") as well as some modules that check external software. But it hasn't been widely advertised so it's probably in hardly any use, if at all. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: buildfarm warnings
On Mon, Feb 14, 2022 at 6:10 AM Tom Lane wrote: > Another thing I've been ignoring on the grounds that I-don't-feel- > like-fixing-this is that the Solaris and OpenIndiana machines whine > about every single reference from loadable modules to the core code, > eg this from haddock while building contrib/fuzzystrmatch: > > Undefined first referenced > symbol in file > cstring_to_text dmetaphone.o > varstr_levenshtein fuzzystrmatch.o > text_to_cstring dmetaphone.o > errstart_cold fuzzystrmatch.o > errmsg fuzzystrmatch.o > palloc dmetaphone.o > ExceptionalConditionfuzzystrmatch.o > errfinish fuzzystrmatch.o > varstr_levenshtein_less_equal fuzzystrmatch.o > repallocdmetaphone.o > errcode fuzzystrmatch.o > errmsg_internal fuzzystrmatch.o > pg_detoast_datum_packed dmetaphone.o > ld: warning: symbol referencing errors > > Presumably, this means that the Solaris linker would really like > to see the executable the module will load against, analogously to > the BE_DLLLIBS settings we use on some other platforms such as macOS. > It evidently still works without that, but we might be losing > performance from an extra indirection or the like. And in any case, > this amount of noise would be very distracting if you wanted to do > actual development on that platform. I'm not especially interested > in tracking down the correct incantation, but I'd gladly push a patch > if someone submits one. I took a quick look at this on an OpenIndiana vagrant image, using the GNU compiler and [Open]Solaris's linker. It seems you can't use -l for this (it only likes files called libX.Y) but the manual[1] says that "shared objects that reference symbols from an application can use the -z defs option, together with defining the symbols by using an extern mapfile directive", so someone might need to figure out how to build and feed in such a file... [1] https://docs.oracle.com/cd/E26502_01/html/E26507/chapter2-90421.html
Re: xml_is_well_formed (was Re: buildfarm warnings)
On 2/13/22 13:59, Tom Lane wrote: > So we're faced with a dilemma: we can't rename either instance without > breaking something. The only way to get rid of the warnings seems to be > to decide that the copy in xml2 has outlived its usefulness and remove > it. I don't think that's a hard argument to make: that version has been > obsolete since 9.1 (a0b7b717a), meaning that only pre-extensions versions > of xml2 could need it. In fact, we pulled the trigger on it once before, > in 2016 (20540710e), and then changed our minds not because anyone > lobbied to put it back but just because we gave up on the PGDLLEXPORT > rearrangement that prompted the change then. > > I think that getting rid of these build warnings is sufficient reason > to drop this ancient compatibility function, and now propose that > we do that. > > +1 cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: buildfarm warnings
On 2/12/22 16:42, Tom Lane wrote: >> I'm of the impression that some people have sql access to BF logs. > Yeah. I'm not sure exactly what the access policy is for that machine; > I know we'll give out logins to committers, but not whether there's > any precedent for non-committers. Yeah, I don't think it's wider than that. The sysadmins own these instances, so the policy is really up to them. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
xml_is_well_formed (was Re: buildfarm warnings)
I wrote: > This conversation motivated me to take a fresh pass over said filtering > script, and one thing I notice that it's been ignoring is these complaints > from all the AIX animals: > ld: 0711-224 WARNING: Duplicate symbol: .xml_is_well_formed > ld: 0711-224 WARNING: Duplicate symbol: xml_is_well_formed > while building contrib/xml2. Evidently this is because xml2 defines > a symbol also defined in the core. We cannot rename xml2's function > without a user-visible ABI break, but it would be pretty painless > to rename the core function (at the C level, not SQL level). > So I propose we do that. I tried to do that, but it blew up in my face, because it turns out that actually contrib/xml2's extension script has a by-C-name reference to the *core* function: -- deprecated old name for xml_is_well_formed CREATE FUNCTION xml_valid(text) RETURNS bool AS 'xml_is_well_formed' LANGUAGE INTERNAL STRICT STABLE PARALLEL SAFE; The situation with the instance in xml2 is explained by its comment: * Note: this has been superseded by a core function. We still have to * have it in the contrib module so that existing SQL-level references * to the function won't fail; but in normal usage with up-to-date SQL * definitions for the contrib module, this won't be called. So we're faced with a dilemma: we can't rename either instance without breaking something. The only way to get rid of the warnings seems to be to decide that the copy in xml2 has outlived its usefulness and remove it. I don't think that's a hard argument to make: that version has been obsolete since 9.1 (a0b7b717a), meaning that only pre-extensions versions of xml2 could need it. In fact, we pulled the trigger on it once before, in 2016 (20540710e), and then changed our minds not because anyone lobbied to put it back but just because we gave up on the PGDLLEXPORT rearrangement that prompted the change then. I think that getting rid of these build warnings is sufficient reason to drop this ancient compatibility function, and now propose that we do that. regards, tom lane
Re: buildfarm warnings
I wrote: > Justin Pryzby writes: >> Is there any check for warnings from new code, other than those buildfarm >> members with -Werror ? > I periodically grep the buildfarm logs for interesting warnings. > There are a lot of uninteresting ones :-(, so I'm not sure how > automatable that'd be. I do use a prefiltering script, which > right now says there are > 13917 total warnings from 41 buildfarm members > of which it thinks all but 350 are of no interest. This conversation motivated me to take a fresh pass over said filtering script, and one thing I notice that it's been ignoring is these complaints from all the AIX animals: ld: 0711-224 WARNING: Duplicate symbol: .xml_is_well_formed ld: 0711-224 WARNING: Duplicate symbol: xml_is_well_formed while building contrib/xml2. Evidently this is because xml2 defines a symbol also defined in the core. We cannot rename xml2's function without a user-visible ABI break, but it would be pretty painless to rename the core function (at the C level, not SQL level). So I propose we do that. I think that when I put in that pattern, there were more problems, but this seems to be all that's left. Another thing I've been ignoring on the grounds that I-don't-feel- like-fixing-this is that the Solaris and OpenIndiana machines whine about every single reference from loadable modules to the core code, eg this from haddock while building contrib/fuzzystrmatch: Undefined first referenced symbol in file cstring_to_text dmetaphone.o varstr_levenshtein fuzzystrmatch.o text_to_cstring dmetaphone.o errstart_cold fuzzystrmatch.o errmsg fuzzystrmatch.o palloc dmetaphone.o ExceptionalConditionfuzzystrmatch.o errfinish fuzzystrmatch.o varstr_levenshtein_less_equal fuzzystrmatch.o repallocdmetaphone.o errcode fuzzystrmatch.o errmsg_internal fuzzystrmatch.o pg_detoast_datum_packed dmetaphone.o ld: warning: symbol referencing errors Presumably, this means that the Solaris linker would really like to see the executable the module will load against, analogously to the BE_DLLLIBS settings we use on some other platforms such as macOS. It evidently still works without that, but we might be losing performance from an extra indirection or the like. And in any case, this amount of noise would be very distracting if you wanted to do actual development on that platform. I'm not especially interested in tracking down the correct incantation, but I'd gladly push a patch if someone submits one. regards, tom lane
Re: buildfarm warnings
Hi, On 2022-02-12 16:42:03 -0500, Tom Lane wrote: > Another new one that maybe should be silenced is > > /mnt/resource/bf/build/skink-master/HEAD/pgsql.build/../pgsql/src/backend/access/heap/vacuumlazy.c:1129:41: > warning: 'freespace' may be used uninitialized in this function > [-Wmaybe-uninitialized] > > Only skink and frogfish are showing that, though. skink uses -O1 to reduce the overhead of valgrind a bit (iirc that shaved off a couple of hours) without making backtraces completely unreadable. Which explains why it shows different warnings than most other animals... Greetings, Andres Freund
Re: buildfarm warnings
Justin Pryzby writes: > Is there any check for warnings from new code, other than those buildfarm > members with -Werror ? I periodically grep the buildfarm logs for interesting warnings. There are a lot of uninteresting ones :-(, so I'm not sure how automatable that'd be. I do use a prefiltering script, which right now says there are 13917 total warnings from 41 buildfarm members of which it thinks all but 350 are of no interest. Most of those 350 are of no interest either, but I didn't bother to make filter rules for them yet ... > It'd be better to avoid warnings, allowing members to use -Werror, rather than > to allow/ignore warnings, which preclude that possibility. The ones I classify as "uninteresting" are mostly from old/broken compilers. I don't think it'd be profitable to try to convince prairiedog's 17-year-old compiler that we don't have uninitialized variables, for instance. > I'm of the impression that some people have sql access to BF logs. Yeah. I'm not sure exactly what the access policy is for that machine; I know we'll give out logins to committers, but not whether there's any precedent for non-committers. > pg_basebackup.c:1261:35: warning: storing the address of local variable > archive_filename in progress_filename [-Wdangling-pointer=] > => new in 23a1c6578 - looks like a real error I saw that one a few days ago but didn't get around to looking more closely yet. It does look fishy, but it might be okay depending on when the global variable can be accessed. Another new one that maybe should be silenced is /mnt/resource/bf/build/skink-master/HEAD/pgsql.build/../pgsql/src/backend/access/heap/vacuumlazy.c:1129:41: warning: 'freespace' may be used uninitialized in this function [-Wmaybe-uninitialized] Only skink and frogfish are showing that, though. regards, tom lane