gfphoenix78 commented on code in PR #1397:
URL: https://github.com/apache/cloudberry/pull/1397#discussion_r2471498480
##########
contrib/pax_storage/src/cpp/access/pax_access_handle.cc:
##########
@@ -411,6 +416,107 @@ void CCPaxAccessMethod::FinishBulkInsert(Relation
relation, int options) {
CBDB_END_TRY();
}
+void CCPaxAccessMethod::RelationVacuum(Relation rel, VacuumParams *params,
+ BufferAccessStrategy /*bstrategy*/) {
+ CBDB_TRY();
+ {
+ std::vector<int> minmax_cols = cbdb::GetMinMaxColumnIndexes(rel);
+ std::vector<int> bf_cols = cbdb::GetBloomFilterColumnIndexes(rel);
+
+ if (minmax_cols.empty() && bf_cols.empty()) {
+ return;
+ }
+
+ Oid aux_oid = cbdb::GetPaxAuxRelid(RelationGetRelid(rel));
+ Relation aux_rel = cbdb::TableOpen(aux_oid, RowExclusiveLock);
Review Comment:
Does it work with manifest meta? The aux heap table is not exist when meta
data file is used.
##########
contrib/pax_storage/src/cpp/access/pax_inserter.cc:
##########
@@ -45,6 +45,7 @@ CPaxInserter::CPaxInserter(Relation rel)
writer_->SetWriteSummaryCallback(&cbdb::InsertOrUpdateMicroPartitionEntry)
->SetFileSplitStrategy(std::make_unique<PaxDefaultSplitStrategy>())
+ ->SetEnableStats(pax::pax_enable_sync_collect_stats)
Review Comment:
Will this behavior be controlled in cloud? How to set this value in
functions used by cloud?
##########
contrib/pax_storage/src/cpp/storage/micro_partition_stats.cc:
##########
@@ -656,10 +648,17 @@ void MicroPartitionStats::AddRow(TupleTableSlot *slot) {
fmt("Current stats initialized [N=%lu], in tuple desc [natts=%d]
",
column_mem_stats_.size(), n));
for (auto i = 0; i < n; i++) {
- if (slot->tts_isnull[i])
- AddNullColumn(i);
- else
+ if (slot->tts_isnull[i]) {
+ if (update_null_stats) {
+ AddNullColumn(i);
+ }
+ } else {
+ if (update_null_stats) {
+ column_mem_stats_[i].null_count_stats_.all_null = false;
+ ++column_mem_stats_[i].null_count_stats_.not_null_count;
+ }
AddNonNullColumn(i, slot->tts_values[i]);
Review Comment:
How is the combination of update-stats and update-null valid?
##########
contrib/pax_storage/src/cpp/catalog/pax_aux_table.cc:
##########
@@ -278,6 +287,11 @@ void UpdateVisimap(Oid aux_relid, int block_id, const char
*visimap_filename) {
NameGetDatum(&pt_visimap_name);
}
+ // visimap only update in delete operation, and stats is always valid
+ repls[ANUM_PG_PAX_BLOCK_TABLES_PTISSTATSVALID - 1] = true;
+ nulls[ANUM_PG_PAX_BLOCK_TABLES_PTISSTATSVALID - 1] = false;
+ values[ANUM_PG_PAX_BLOCK_TABLES_PTISSTATSVALID - 1] = BoolGetDatum(true);
Review Comment:
Why is `isstatsvalid` changed if the visimap file is updated? If it's not
changed, repls[..] should be false and leave nulls[..] and values[..] unset.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]