Hi, Alexander! Thank you for working on these patches. On Thu, 28 Mar 2024 at 02:14, Alexander Korotkov <aekorot...@gmail.com> wrote:
> Hi, Pavel! > > Thank you for your feedback. The revised patch set is attached. > > I found that vacuum.c has a lot of heap-specific code. Thus, it > should be OK for analyze.c to keep heap-specific code. Therefore, I > rolled back the movement of functions between files. That leads to a > smaller patch, easier to review. > I agree with you. And with the changes remaining in heapam_handler.c I suppose we can also remove the includes introduced: #include <math.h> #include "utils/sampling.h" #include "utils/spccache.h" On Wed, Mar 27, 2024 at 2:52 PM Pavel Borisov <pashkin.e...@gmail.com> > wrote: > >> The revised rest of the patchset is attached. > >> 0001 (was 0006) – I prefer the definition of AcquireSampleRowsFunc to > >> stay in vacuum.h. If we move it to sampling.h then we would have to > >> add there includes to define Relation, HeapTuple etc. I'd like to > >> avoid this kind of change. Also, I've deleted > >> table_beginscan_analyze(), because it's only called from > >> tableam-specific AcquireSampleRowsFunc. Also I put some comments to > >> heapam_scan_analyze_next_block() and heapam_scan_analyze_next_tuple() > >> given that there are now no relevant comments for them in tableam.h. > >> I've removed some redundancies from acquire_sample_rows(). And added > >> comments to AcquireSampleRowsFunc based on what we have in FDW docs > >> for this function. Did some small edits as well. As you suggested, > >> turned back declarations for acquire_sample_rows() and compare_rows(). > > > > > > In my comment in the thread I was not thinking about returning the old > name acquire_sample_rows(), it was only about the declarations and the > order of functions to be one code block. To me heapam_acquire_sample_rows() > looks better for a name of heap implementation of *AcquireSampleRowsFunc(). > I suggest returning the name heapam_acquire_sample_rows() from v4. Sorry > for the confusion in this. > > I found that the function name acquire_sample_rows is referenced in > quite several places in the source code. So, I would prefer to save > the old name to keep the changes minimal. > The full list shows only a couple of changes in analyze.c and several comments elsewhere. contrib/postgres_fdw/postgres_fdw.c: * of the relation. Same algorithm as in acquire_sample_rows in src/backend/access/heap/vacuumlazy.c: * match what analyze.c's acquire_sample_rows() does, otherwise VACUUM src/backend/access/heap/vacuumlazy.c: * The logic here is a bit simpler than acquire_sample_rows(), as src/backend/access/heap/vacuumlazy.c: * what acquire_sample_rows() does. src/backend/access/heap/vacuumlazy.c: * acquire_sample_rows() does, so be consistent. src/backend/access/heap/vacuumlazy.c: * acquire_sample_rows() will recognize the same LP_DEAD items as dead src/backend/commands/analyze.c:static int acquire_sample_rows(Relation onerel, int elevel, src/backend/commands/analyze.c: acquirefunc = acquire_sample_rows; src/backend/commands/analyze.c: * acquire_sample_rows -- acquire a random sample of rows from the table src/backend/commands/analyze.c:acquire_sample_rows(Relation onerel, int elevel, src/backend/commands/analyze.c: * This has the same API as acquire_sample_rows, except that rows are src/backend/commands/analyze.c: acquirefunc = acquire_sample_rows; My point for renaming is to make clear that it's a heap implementation of acquire_sample_rows which could be useful for possible reworking heap implementations of table am methods into a separate place later. But I'm also ok with the existing naming. > > The changed type of static function that always returned true for heap > looks good to me: > > static void heapam_scan_analyze_next_block > > > > The same is for removing the comparison of always true "block_accepted" > in (heapam_)acquire_sample_rows() > > Ok! > > > Removing table_beginscan_analyze and call scan_begin() is not in the > same style as other table_beginscan_* functions. Though this is not a > change in functionality, I'd leave this part as it was in v4. > > With the patch, this method doesn't have usages outside of table am. > I don't think we should keep a method, which doesn't have clear > external usage patterns. But I agree that starting a scan with > heap_beginscan() and ending with table_endscan() is not correct. Now > ending this scan with heap_endscan(). > Good! > > Also, a comment about it was introduced in v5: > > > > src/backend/access/heap/heapam_handler.c: * with > table_beginscan_analyze() > > Corrected. > For comments I'd propose: > > %s/In addition, store estimates/In addition, a function should store > estimates/g > > %s/zerp/zero/g > > Fixed. > > >> 0002 (was 0007) – I've turned the redundant "if", which you've pointed > >> out, into an assert. Also, added some comments, most notably comment > >> for TableAmRoutine.reloptions based on the indexam docs. > > > > %s/validate sthe/validates the/g > > Fixed. > > > This seems not needed, it's already inited to InvalidOid before. > > +else > > +accessMethod = default_table_access_method; > > + accessMethodId = InvalidOid; > > > > This code came from 374c7a22904. I don't insist on this simplification > in a patch 0002. > > This is minor redundancy. I prefer to keep it. This makes it obvious > that patch just moved this code. > I agree with the remaining. Regards, Pavel Borisov