On Tue, Aug 27, 2019 at 11:59 PM Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, Aug 16, 2019 at 6:23 AM Jeevan Chalke > <jeevan.cha...@enterprisedb.com> wrote: > > [ patches ] > > Reviewing 0002 and 0003: > > - Commit message for 0003 claims magic number and checksum are 0, but > that (fortunately) doesn't seem to be the case. > Oops, updated commit message. > > - looks_like_rel_name actually checks whether it looks like a > *non-temporary* relation name; suggest adjusting the function name. > > - The names do_full_backup and do_incremental_backup are quite > confusing because you're really talking about what to do with one > file. I suggest sendCompleteFile() and sendPartialFile(). > Changed function names. > > - Is there any good reason to have 'refptr' as a global variable, or > could we just pass the LSN around via function arguments? I know it's > just mimicking startptr, but storing startptr in a global variable > doesn't seem like a great idea either, so if it's not too annoying, > let's pass it down via function arguments instead. Also, refptr is a > crappy name (even worse than startptr); whether we end up with a > global variable or a bunch of local variables, let's make the name(s) > clear and unambiguous, like incremental_reference_lsn. Yeah, I know > that's long, but I still think it's better than being unclear. > Renamed variable. However, I have kept that as global only as it needs many functions to change their signature, like, sendFile(), sendDir(), sendTablspeace() etc. > - do_incremental_backup looks like it can never report an error from > fread(), which is bad. But I see that this is just copied from the > existing code which has the same problem, so I started a separate > thread about that. > > - I think that passing cnt and blkindex to verify_page_checksum() > doesn't look very good from an abstraction point of view. Granted, > the existing code isn't great either, but I think this makes the > problem worse. I suggest passing "int backup_distance" to this > function, computed as cnt - BLCKSZ * blkindex. Then, you can > fseek(-backup_distance), fread(BLCKSZ), and then fseek(backup_distance > - BLCKSZ). > Yep. Done these changes in the refactoring patch. > > - While I generally support the use of while and for loops rather than > goto for flow control, a while (1) loop that ends with a break is > functionally a goto anyway. I think there are several ways this could > be revised. The most obvious one is probably to use goto, but I vote > for inverting the sense of the test: if (PageIsNew(page) || > PageGetLSN(page) >= startptr) break; This approach also saves a level > of indentation for more than half of the function. > I have used this new inverted condition, but we still need a while(1) loop. > - I am not sure that it's a good idea for sendwholefile = true to > result in dumping the entire file onto the wire in a single CopyData > message. I don't know of a concrete problem in typical > configurations, but someone who increases RELSEG_SIZE might be able to > overflow CopyData's length word. At 2GB the length word would be > negative, which might break, and at 4GB it would wrap around, which > would certainly break. See CopyData in > https://www.postgresql.org/docs/12/protocol-message-formats.html To > avoid this issue, and maybe some others, I suggest defining a > reasonably large chunk size, say 1MB as a constant in this file > someplace, and sending the data as a series of chunks of that size. > OK. Done as per the suggestions. > > - I don't think that the way concurrent truncation is handled is > correct for partial files. Right now it just falls through to code > which appends blocks of zeroes in either the complete-file or > partial-file case. I think that logic should be moved into the > function that handles the complete-file case. In the partial-file > case, the blocks that we actually send need to match the list of block > numbers we promised to send. We can't just send the promised blocks > and then tack a bunch of zero-filled blocks onto the end that the file > header doesn't know about. > Well, in partial file case we won't end up inside that block. So we are never sending zeroes at the end in case of partial file. > - For reviewer convenience, please use the -v option to git > format-patch when posting and reposting a patch series. Using -v2, > -v3, etc. on successive versions really helps. > Sure. Thanks for letting me know about this option. > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > Thanks -- Jeevan Chalke Technical Architect, Product Development EnterpriseDB Corporation The Enterprise PostgreSQL Company