Strange coding in _mdfd_openseg()
Hello, I think the following conditional code is misleading, and I wonder if it would be better like so: --- a/src/backend/storage/smgr/md.c +++ b/src/backend/storage/smgr/md.c @@ -1787,8 +1787,13 @@ _mdfd_openseg(SMgrRelation reln, ForkNumber forknum, BlockNumber segno, if (fd < 0) return NULL; - if (segno <= reln->md_num_open_segs[forknum]) - _fdvec_resize(reln, forknum, segno + 1); + /* +* Segments are always opened in order from lowest to highest, so we must +* be adding a new one at the end. +*/ + Assert(segno == reln->md_num_open_segs[forknum]); + + _fdvec_resize(reln, forknum, segno + 1); /* fill the entry */ v = &reln->md_seg_fds[forknum][segno]; I think the condition is always true, and with == it would also always be true. If that weren't the case, the call to _fdvec_resize() code would effectively leak vfds. -- Thomas Munro https://enterprisedb.com
Strange coding in _mdfd_openseg()
Hello, I think the following conditional code is misleading, and I wonder if it would be better like so: --- a/src/backend/storage/smgr/md.c +++ b/src/backend/storage/smgr/md.c @@ -1787,8 +1787,13 @@ _mdfd_openseg(SMgrRelation reln, ForkNumber forknum, BlockNumber segno, if (fd < 0) return NULL; - if (segno <= reln->md_num_open_segs[forknum]) - _fdvec_resize(reln, forknum, segno + 1); + /* +* Segments are always opened in order from lowest to highest, so we must +* be adding a new one at the end. +*/ + Assert(segno == reln->md_num_open_segs[forknum]); + + _fdvec_resize(reln, forknum, segno + 1); /* fill the entry */ v = &reln->md_seg_fds[forknum][segno]; I think the condition is always true, and with == it would also always be true. If that weren't the case, the call to _fdvec_resize() code would effectively leak vfds. -- Thomas Munro https://enterprisedb.com
Re: Strange coding in _mdfd_openseg()
At Wed, 3 Apr 2019 17:14:36 +1300, Thomas Munro wrote in > Hello, > > I think the following conditional code is misleading, and I wonder if > it would be better like so: > > --- a/src/backend/storage/smgr/md.c > +++ b/src/backend/storage/smgr/md.c > @@ -1787,8 +1787,13 @@ _mdfd_openseg(SMgrRelation reln, ForkNumber > forknum, BlockNumber segno, > if (fd < 0) > return NULL; > > - if (segno <= reln->md_num_open_segs[forknum]) > - _fdvec_resize(reln, forknum, segno + 1); > + /* > +* Segments are always opened in order from lowest to highest, > so we must > +* be adding a new one at the end. > +*/ > + Assert(segno == reln->md_num_open_segs[forknum]); > + > + _fdvec_resize(reln, forknum, segno + 1); > > /* fill the entry */ > v = &reln->md_seg_fds[forknum][segno]; > > I think the condition is always true, and with == it would also always > be true. If that weren't the case, the call to _fdvec_resize() code > would effectively leak vfds. I may be missing something, but it seems possible that _mdfd_getseg calls it with segno > opensegs. | for (nextsegno = reln->md_num_open_segs[forknum]; | nextsegno <= targetseg; nextsegno++) | ... | v = _mdfd_openseg(reln, forknum, nextsegno, flags); regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Strange coding in _mdfd_openseg()
On Wed, Apr 3, 2019 at 5:34 PM Kyotaro HORIGUCHI wrote: > I may be missing something, but it seems possible that > _mdfd_getseg calls it with segno > opensegs. > > | for (nextsegno = reln->md_num_open_segs[forknum]; Here nextsegno starts out equal to opensegs. > | nextsegno <= targetseg; nextsegno++) Here we add one to nextsegno... > | ... > | v = _mdfd_openseg(reln, forknum, nextsegno, flags); ... after adding one to opensegs. So they're always equal. This fits a general programming pattern when appending to an array, the next element's index is the same as the number of elements. But I claim the coding is weird, because _mdfd_openseg's *looks* like it can handle opening segments in any order, except that the author accidentally wrote "<=" instead of ">=". In fact it can't open them in any order, because we don't support "holes" in the array. So I think it should really be "==", and it should be an assertion, not a condition. -- Thomas Munro https://enterprisedb.com
Re: Strange coding in _mdfd_openseg()
Hi, On 2019-04-04 09:24:49 +1300, Thomas Munro wrote: > On Wed, Apr 3, 2019 at 5:34 PM Kyotaro HORIGUCHI > wrote: > > I may be missing something, but it seems possible that > > _mdfd_getseg calls it with segno > opensegs. > > > > | for (nextsegno = reln->md_num_open_segs[forknum]; > > Here nextsegno starts out equal to opensegs. > > > | nextsegno <= targetseg; nextsegno++) > > Here we add one to nextsegno... > > > | ... > > | v = _mdfd_openseg(reln, forknum, nextsegno, flags); > > ... after adding one to opensegs. So they're always equal. This fits > a general programming pattern when appending to an array, the next > element's index is the same as the number of elements. But I claim > the coding is weird, because _mdfd_openseg's *looks* like it can > handle opening segments in any order, except that the author > accidentally wrote "<=" instead of ">=". In fact it can't open them > in any order, because we don't support "holes" in the array. So I > think it should really be "==", and it should be an assertion, not a > condition. Yea, I totally agree it's weird. I'm not sure if I'd go for an assertion of equality, or just invert the >= (which I agree I probably just screwed up and didn't notice when reviewing the patch because it looked close enough to correct and it didn't have a measurable effect). Greetings, Andres Freund
Re: Strange coding in _mdfd_openseg()
Hello. At Wed, 3 Apr 2019 13:47:46 -0700, Andres Freund wrote in <20190403204746.2yumq7c2mirmo...@alap3.anarazel.de> > Hi, > > On 2019-04-04 09:24:49 +1300, Thomas Munro wrote: > > On Wed, Apr 3, 2019 at 5:34 PM Kyotaro HORIGUCHI > > wrote: > > > I may be missing something, but it seems possible that > > > _mdfd_getseg calls it with segno > opensegs. > > > > > > | for (nextsegno = reln->md_num_open_segs[forknum]; > > > > Here nextsegno starts out equal to opensegs. > > > > > | nextsegno <= targetseg; nextsegno++) > > > > Here we add one to nextsegno... > > > > > | ... > > > | v = _mdfd_openseg(reln, forknum, nextsegno, flags); > > > > ... after adding one to opensegs. So they're always equal. This fits > > a general programming pattern when appending to an array, the next > > element's index is the same as the number of elements. But I claim > > the coding is weird, because _mdfd_openseg's *looks* like it can > > handle opening segments in any order, except that the author > > accidentally wrote "<=" instead of ">=". In fact it can't open them > > in any order, because we don't support "holes" in the array. So I > > think it should really be "==", and it should be an assertion, not a > > condition. > > Yea, I totally agree it's weird. I'm not sure if I'd go for an assertion > of equality, or just invert the >= (which I agree I probably just > screwed up and didn't notice when reviewing the patch because it looked > close enough to correct and it didn't have a measurable effect). I looked there and agreed. _mdfd_openseg is always called just to add one new segment after the last opened segment by the two callers. So I think == is better. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Strange coding in _mdfd_openseg()
On Thu, Apr 4, 2019 at 4:16 PM Kyotaro HORIGUCHI wrote: > At Wed, 3 Apr 2019 13:47:46 -0700, Andres Freund wrote > in <20190403204746.2yumq7c2mirmo...@alap3.anarazel.de> > > Yea, I totally agree it's weird. I'm not sure if I'd go for an assertion > > of equality, or just invert the >= (which I agree I probably just > > screwed up and didn't notice when reviewing the patch because it looked > > close enough to correct and it didn't have a measurable effect). > > I looked there and agreed. _mdfd_openseg is always called just to > add one new segment after the last opened segment by the two > callers. So I think == is better. Thanks. Some other little things I noticed while poking around in this area: Callers of _mdgetseg(EXTENSION_RETURN_NULL) expect errno to be set if it returns NULL, and it expects the same of mdopen(EXTERNSION_RETURN_NULL), and yet the latter does: fd = PathNameOpenFile(path, O_RDWR | PG_BINARY); if (fd < 0) { if ((behavior & EXTENSION_RETURN_NULL) && FILE_POSSIBLY_DELETED(errno)) { pfree(path); return NULL; } 1. I guess that needs save_errno treatment to protect it from being clobbered by pfree()? 2. It'd be nice if function documentation explicitly said which functions set errno on error (and perhaps which syscalls). 3. Why does some code use "file < 0" and other code "file <= 0" to detect errors from fd.c functions that return File? -- Thomas Munro https://enterprisedb.com
Re: Strange coding in _mdfd_openseg()
At Fri, 5 Apr 2019 18:44:15 +1300, Thomas Munro wrote in > On Thu, Apr 4, 2019 at 4:16 PM Kyotaro HORIGUCHI > wrote: > > At Wed, 3 Apr 2019 13:47:46 -0700, Andres Freund wrote > > in <20190403204746.2yumq7c2mirmo...@alap3.anarazel.de> > > > Yea, I totally agree it's weird. I'm not sure if I'd go for an assertion > > > of equality, or just invert the >= (which I agree I probably just > > > screwed up and didn't notice when reviewing the patch because it looked > > > close enough to correct and it didn't have a measurable effect). > > > > I looked there and agreed. _mdfd_openseg is always called just to > > add one new segment after the last opened segment by the two > > callers. So I think == is better. > > Thanks. Some other little things I noticed while poking around in this area: > > Callers of _mdgetseg(EXTENSION_RETURN_NULL) expect errno to be set if > it returns NULL, and it expects the same of Only mdsyncfiletag seems expecting that and it is documented. But _mdfd_getseg is not documented as the same. mdopen also is not. > mdopen(EXTERNSION_RETURN_NULL), and yet the latter does: > > fd = PathNameOpenFile(path, O_RDWR | PG_BINARY); > > if (fd < 0) > { > if ((behavior & EXTENSION_RETURN_NULL) && > FILE_POSSIBLY_DELETED(errno)) > { > pfree(path); > return NULL; > } > 1. I guess that needs save_errno treatment to protect it from being > clobbered by pfree()? If both elog() and free() don't change errno, we don't need to do that at least for AllocSetFree, and is seems to be the same for other allocators. I think it is better to guarantee (and document) that errno does not change by pfree(), rather than to protect in the caller side.a > 2. It'd be nice if function documentation explicitly said which > functions set errno on error (and perhaps which syscalls). I agree about errno. I'm not sure about syscall (names?). > 3. Why does some code use "file < 0" and other code "file <= 0" to > detect errors from fd.c functions that return File? That seems just a thinko, or difference of how to think about invalid (or impossible) values. Vfd == 0 is invalid and impossible, so file <=0 and < 0 are effectively the equivalents. I think we should treat 0 as error rather than sucess. I don't think it worth to do Assert(file != 0). regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Strange coding in _mdfd_openseg()
At Wed, 3 Apr 2019 17:14:36 +1300, Thomas Munro wrote in > Hello, > > I think the following conditional code is misleading, and I wonder if > it would be better like so: > > --- a/src/backend/storage/smgr/md.c > +++ b/src/backend/storage/smgr/md.c > @@ -1787,8 +1787,13 @@ _mdfd_openseg(SMgrRelation reln, ForkNumber > forknum, BlockNumber segno, > if (fd < 0) > return NULL; > > - if (segno <= reln->md_num_open_segs[forknum]) > - _fdvec_resize(reln, forknum, segno + 1); > + /* > +* Segments are always opened in order from lowest to highest, > so we must > +* be adding a new one at the end. > +*/ > + Assert(segno == reln->md_num_open_segs[forknum]); > + > + _fdvec_resize(reln, forknum, segno + 1); > > /* fill the entry */ > v = &reln->md_seg_fds[forknum][segno]; > > I think the condition is always true, and with == it would also always > be true. If that weren't the case, the call to _fdvec_resize() code > would effectively leak vfds. I may be missing something, but it seems possible that _mdfd_getseg calls it with segno > opensegs. | for (nextsegno = reln->md_num_open_segs[forknum]; | nextsegno <= targetseg; nextsegno++) | ... | v = _mdfd_openseg(reln, forknum, nextsegno, flags); regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Strange coding in _mdfd_openseg()
On Wed, Apr 3, 2019 at 5:34 PM Kyotaro HORIGUCHI wrote: > I may be missing something, but it seems possible that > _mdfd_getseg calls it with segno > opensegs. > > | for (nextsegno = reln->md_num_open_segs[forknum]; Here nextsegno starts out equal to opensegs. > | nextsegno <= targetseg; nextsegno++) Here we add one to nextsegno... > | ... > | v = _mdfd_openseg(reln, forknum, nextsegno, flags); ... after adding one to opensegs. So they're always equal. This fits a general programming pattern when appending to an array, the next element's index is the same as the number of elements. But I claim the coding is weird, because _mdfd_openseg's *looks* like it can handle opening segments in any order, except that the author accidentally wrote "<=" instead of ">=". In fact it can't open them in any order, because we don't support "holes" in the array. So I think it should really be "==", and it should be an assertion, not a condition. -- Thomas Munro https://enterprisedb.com
Re: Strange coding in _mdfd_openseg()
Hi, On 2019-04-04 09:24:49 +1300, Thomas Munro wrote: > On Wed, Apr 3, 2019 at 5:34 PM Kyotaro HORIGUCHI > wrote: > > I may be missing something, but it seems possible that > > _mdfd_getseg calls it with segno > opensegs. > > > > | for (nextsegno = reln->md_num_open_segs[forknum]; > > Here nextsegno starts out equal to opensegs. > > > | nextsegno <= targetseg; nextsegno++) > > Here we add one to nextsegno... > > > | ... > > | v = _mdfd_openseg(reln, forknum, nextsegno, flags); > > ... after adding one to opensegs. So they're always equal. This fits > a general programming pattern when appending to an array, the next > element's index is the same as the number of elements. But I claim > the coding is weird, because _mdfd_openseg's *looks* like it can > handle opening segments in any order, except that the author > accidentally wrote "<=" instead of ">=". In fact it can't open them > in any order, because we don't support "holes" in the array. So I > think it should really be "==", and it should be an assertion, not a > condition. Yea, I totally agree it's weird. I'm not sure if I'd go for an assertion of equality, or just invert the >= (which I agree I probably just screwed up and didn't notice when reviewing the patch because it looked close enough to correct and it didn't have a measurable effect). Greetings, Andres Freund
Re: Strange coding in _mdfd_openseg()
Hello. At Wed, 3 Apr 2019 13:47:46 -0700, Andres Freund wrote in <20190403204746.2yumq7c2mirmo...@alap3.anarazel.de> > Hi, > > On 2019-04-04 09:24:49 +1300, Thomas Munro wrote: > > On Wed, Apr 3, 2019 at 5:34 PM Kyotaro HORIGUCHI > > wrote: > > > I may be missing something, but it seems possible that > > > _mdfd_getseg calls it with segno > opensegs. > > > > > > | for (nextsegno = reln->md_num_open_segs[forknum]; > > > > Here nextsegno starts out equal to opensegs. > > > > > | nextsegno <= targetseg; nextsegno++) > > > > Here we add one to nextsegno... > > > > > | ... > > > | v = _mdfd_openseg(reln, forknum, nextsegno, flags); > > > > ... after adding one to opensegs. So they're always equal. This fits > > a general programming pattern when appending to an array, the next > > element's index is the same as the number of elements. But I claim > > the coding is weird, because _mdfd_openseg's *looks* like it can > > handle opening segments in any order, except that the author > > accidentally wrote "<=" instead of ">=". In fact it can't open them > > in any order, because we don't support "holes" in the array. So I > > think it should really be "==", and it should be an assertion, not a > > condition. > > Yea, I totally agree it's weird. I'm not sure if I'd go for an assertion > of equality, or just invert the >= (which I agree I probably just > screwed up and didn't notice when reviewing the patch because it looked > close enough to correct and it didn't have a measurable effect). I looked there and agreed. _mdfd_openseg is always called just to add one new segment after the last opened segment by the two callers. So I think == is better. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Strange coding in _mdfd_openseg()
On Thu, Apr 4, 2019 at 4:16 PM Kyotaro HORIGUCHI wrote: > At Wed, 3 Apr 2019 13:47:46 -0700, Andres Freund wrote > in <20190403204746.2yumq7c2mirmo...@alap3.anarazel.de> > > Yea, I totally agree it's weird. I'm not sure if I'd go for an assertion > > of equality, or just invert the >= (which I agree I probably just > > screwed up and didn't notice when reviewing the patch because it looked > > close enough to correct and it didn't have a measurable effect). > > I looked there and agreed. _mdfd_openseg is always called just to > add one new segment after the last opened segment by the two > callers. So I think == is better. Thanks. Some other little things I noticed while poking around in this area: Callers of _mdgetseg(EXTENSION_RETURN_NULL) expect errno to be set if it returns NULL, and it expects the same of mdopen(EXTERNSION_RETURN_NULL), and yet the latter does: fd = PathNameOpenFile(path, O_RDWR | PG_BINARY); if (fd < 0) { if ((behavior & EXTENSION_RETURN_NULL) && FILE_POSSIBLY_DELETED(errno)) { pfree(path); return NULL; } 1. I guess that needs save_errno treatment to protect it from being clobbered by pfree()? 2. It'd be nice if function documentation explicitly said which functions set errno on error (and perhaps which syscalls). 3. Why does some code use "file < 0" and other code "file <= 0" to detect errors from fd.c functions that return File? -- Thomas Munro https://enterprisedb.com
Re: Strange coding in _mdfd_openseg()
At Fri, 5 Apr 2019 18:44:15 +1300, Thomas Munro wrote in > On Thu, Apr 4, 2019 at 4:16 PM Kyotaro HORIGUCHI > wrote: > > At Wed, 3 Apr 2019 13:47:46 -0700, Andres Freund wrote > > in <20190403204746.2yumq7c2mirmo...@alap3.anarazel.de> > > > Yea, I totally agree it's weird. I'm not sure if I'd go for an assertion > > > of equality, or just invert the >= (which I agree I probably just > > > screwed up and didn't notice when reviewing the patch because it looked > > > close enough to correct and it didn't have a measurable effect). > > > > I looked there and agreed. _mdfd_openseg is always called just to > > add one new segment after the last opened segment by the two > > callers. So I think == is better. > > Thanks. Some other little things I noticed while poking around in this area: > > Callers of _mdgetseg(EXTENSION_RETURN_NULL) expect errno to be set if > it returns NULL, and it expects the same of Only mdsyncfiletag seems expecting that and it is documented. But _mdfd_getseg is not documented as the same. mdopen also is not. > mdopen(EXTERNSION_RETURN_NULL), and yet the latter does: > > fd = PathNameOpenFile(path, O_RDWR | PG_BINARY); > > if (fd < 0) > { > if ((behavior & EXTENSION_RETURN_NULL) && > FILE_POSSIBLY_DELETED(errno)) > { > pfree(path); > return NULL; > } > 1. I guess that needs save_errno treatment to protect it from being > clobbered by pfree()? If both elog() and free() don't change errno, we don't need to do that at least for AllocSetFree, and is seems to be the same for other allocators. I think it is better to guarantee (and document) that errno does not change by pfree(), rather than to protect in the caller side.a > 2. It'd be nice if function documentation explicitly said which > functions set errno on error (and perhaps which syscalls). I agree about errno. I'm not sure about syscall (names?). > 3. Why does some code use "file < 0" and other code "file <= 0" to > detect errors from fd.c functions that return File? That seems just a thinko, or difference of how to think about invalid (or impossible) values. Vfd == 0 is invalid and impossible, so file <=0 and < 0 are effectively the equivalents. I think we should treat 0 as error rather than sucess. I don't think it worth to do Assert(file != 0). regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Strange coding in _mdfd_openseg()
On Thu, Apr 04, 2019 at 12:15:52PM +0900, Kyotaro HORIGUCHI wrote: > At Wed, 3 Apr 2019 13:47:46 -0700, Andres Freund wrote > in <20190403204746.2yumq7c2mirmo...@alap3.anarazel.de> > > On 2019-04-04 09:24:49 +1300, Thomas Munro wrote: > > > On Wed, Apr 3, 2019 at 5:34 PM Kyotaro HORIGUCHI > > > wrote: > > > > I may be missing something, but it seems possible that > > > > _mdfd_getseg calls it with segno > opensegs. > > > > > > > > | for (nextsegno = reln->md_num_open_segs[forknum]; > > > > > > Here nextsegno starts out equal to opensegs. > > > > > > > | nextsegno <= targetseg; nextsegno++) > > > > > > Here we add one to nextsegno... > > > > > > > | ... > > > > | v = _mdfd_openseg(reln, forknum, nextsegno, flags); > > > > > > ... after adding one to opensegs. So they're always equal. This fits > > > a general programming pattern when appending to an array, the next > > > element's index is the same as the number of elements. But I claim > > > the coding is weird, because _mdfd_openseg's *looks* like it can > > > handle opening segments in any order, except that the author > > > accidentally wrote "<=" instead of ">=". In fact it can't open them > > > in any order, because we don't support "holes" in the array. So I > > > think it should really be "==", and it should be an assertion, not a > > > condition. > > > > Yea, I totally agree it's weird. I'm not sure if I'd go for an assertion > > of equality, or just invert the >= (which I agree I probably just > > screwed up and didn't notice when reviewing the patch because it looked > > close enough to correct and it didn't have a measurable effect). > > I looked there and agreed. _mdfd_openseg is always called just to > add one new segment after the last opened segment by the two > callers. So I think == is better. Agreed. The rest of md.c won't cope with a hole in this array, so allowing less-than-or-equal here is futile. The patch in the original post looks fine.
Re: Strange coding in _mdfd_openseg()
On Sun, Jan 26, 2020 at 8:23 AM Noah Misch wrote: > Agreed. The rest of md.c won't cope with a hole in this array, so allowing > less-than-or-equal here is futile. The patch in the original post looks fine. Thanks. Pushed.