Thanks, Robert for the review. On Wed, Dec 11, 2019 at 1:10 AM Robert Haas <robertmh...@gmail.com> wrote:
> On Tue, Dec 10, 2019 at 6:40 AM Suraj Kharage > <suraj.khar...@enterprisedb.com> wrote: > > Please find attached patch for backup validator implementation (0004 > patch). This patch is based > > on Rushabh's latest patch for backup manifest. > > > > There are some functions required at client side as well, so I have > moved those functions > > and some data structure at common place so that they can be accessible > for both. (0003 patch). > > > > My colleague Rajkumar Raghuwanshi has prepared the WIP patch (0005) for > tap test cases which > > is also attached. As of now, test cases related to the tablespace and > tar backup format are missing, > > will continue work on same and submit the complete patch. > > > > With this mail, I have attached the complete patch stack for backup > manifest and backup > > validate implementation. > > > > Please let me know your thoughts on the same. > > Well, for the second time on this thread, please don't take a bunch of > somebody else's code and post it in a patch that doesn't attribute > that person as one of the authors. For the second time on this thread, > the person is me, but don't borrow *anyone's* code without proper > attribution. It's really important! > > On a related note, it's a very good idea to use git format-patch and > git rebase -i to maintain patch stacks like this. Rushabh seems to > have done that, but the files you're posting look like raw 'git diff' > output. Notice that this gives him a way to include authorship > information and a tentative commit message in each patch, but you > don't have any of that. > Sorry, I have corrected this in the attached v2 patch set. > Also on a related note, part of the process of adapting existing code > to a new purpose is adapting the comments. You haven't done that: > > + * Search a result-set hash table for a row matching a given filename. > ... > + * Insert a row into a result-set hash table, provided no such row is > already > ... > + * Most of the values > + * that we're hashing are short integers formatted as text, so there > + * shouldn't be much room for pathological input. > Corrected in v2 patch. > I think that what we should actually do here is try to use simplehash. > Right now, it won't work for frontend code, but I posted some patches > to try to address that issue: > > > https://www.postgresql.org/message-id/CA%2BTgmob8oyh02NrZW%3DxCScB%2B5GyJ-jVowE3%2BTWTUmPF%3DFsGWTA%40mail.gmail.com > > That would have a few advantages. One, we wouldn't need to know the > number of elements in advance, because simplehash can grow > dynamically. Two, we could use the iteration interfaces to walk the > hash table. Your solution to that is pgrhash_seq_search, but that's > actually not well-designed, because it's not a generic iterator > function but something that knows specifically about the 'touch' flag. > I incidentally suggest renaming 'touch' to 'matched;' 'touch' is not > bad, but I think 'matched' will be a little more recognizable. > Thanks for the suggestion. Will try to implement the same and update accordingly. I am assuming that I need to build the patch based on the changes that you proposed on the mentioned thread. > Please run pgindent. If needed, first add locally defined types to > typedefs.list, so that things indent properly. > > It's not a crazy idea to try to share some data structures and code > between the frontend and the backend here, but I think > src/common/backup.c and src/include/common/backup.h is a far too > generic name given what the code is actually doing. It's mostly about > checksums, not backup, and I think it should be named accordingly. I > suggest removing "manifestinfo" and renaming the rest to just talk > about checksums rather than manifests. That would make it logical to > reuse this for any other future code that needs a configurable > checksum type. Also, how about adding a function like: > > extern bool parse_checksum_algorithm(char *name, ChecksumAlgorithm *algo); > > ...which would return true and set *algo if name is recognized, and > return false otherwise. That code could be used on both the client and > server sides of this patch, and by any future patches that want to > return this scaffolding. > Corrected the filename and implemented the function as suggested. > The file header for backup.h has the wrong filename (string.h). The > header format looks somewhat atypical compared to what we normally do, > too. My bad, corrected the header format as well. > > It's arguable, but I tend to think that it would be better to > hex-encode the CRC rather than printing it as an integer. Maybe > hex_encode() is another thing that could be moved into the new > src/common file. We are already encoding the CRC checksum as well. Please let me know if I misunderstood anything. Moved hex_encode into src/common. > As I said before about Rushabh's patch set, it's very confusing that > we have so many patches here stacked up. Like, you have 0002 moving > stuff, and then 0003 moving it again. That's super-confusing. Please > try to structure the patch set so as to make it as easy to review as > possible. > Sorry for the confusion. I have squashed 0001 to 0003 patches in one patch. > Regarding the test case patch, error checks are important! Don't do > things like this: > > +open my $modify_file_sha256, '>>', > "$tempdir/backup_verify/postgresql.conf"; > +print $modify_file_sha256 "port = 5555\n"; > +close $modify_file_sha256; > > If the open fails, then it and the print and the close are going to > silently do nothing. That's bad. I don't know exactly what the > customary error-checking is for things like this in TAP tests, but I > hope it's not like this, because this has a pretty fair chance of > looking like it's testing something that it isn't. Let's figure out > what the best practice in this area is and adhere to it. > Rajkumar has fixed this, please find attached 0003 patch for same. Please find attached v2 set patches. TODO: will implement the simplehash as suggested. > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company > -- -- Thanks & Regards, Suraj kharage, EnterpriseDB Corporation, The Postgres Database Company.
v2-0001-Backup-manifest-with-file-names-sizes-timestamps-.patch
Description: Binary data
v2-0002-Implemenation-of-backup-validator.patch
Description: Binary data
v2-0003-Tap-test-case-patch-to-verify-the-backup-using-ve.patch
Description: Binary data