robertmu opened a new issue, #1254:
URL: https://github.com/apache/cloudberry/issues/1254

   ### Issue Description
   
   There appears to be an inconsistency in how Cloudberry handles the default 
`checksum` storage option for append-optimized (AO) tables.
   
   The system configuration `gp_default_storage_options` correctly shows that 
`checksum=true` is part of the default settings. However, when an AO table is 
created without an explicit `checksum` clause, this default value is not 
persisted to the table's metadata in the `pg_class.reloptions` column.
   
   This behavior differs from Greenplum, which persists the 
`checksum=true` default to `pg_class.reloptions`. Because 
`gp_default_storage_options` is a GUC that can be changed at any time, the 
storage options in effect at creation time should be recorded explicitly in 
the table's metadata. Otherwise, once the GUC is changed, the catalog gives no 
indication of which `checksum` setting the table was created with.
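
To illustrate the concern, here is a minimal Python sketch. It does not model Cloudberry internals: `apparent_checksum` is a hypothetical reader (for example, a tool inspecting the catalog) that falls back to the current GUC value when `checksum` is absent from `reloptions`, and the changed GUC value is an assumed later change rather than something taken from the logs.

```python
def parse_options(text):
    """Parse a 'k=v,k=v' string, the format SHOW gp_default_storage_options prints."""
    return dict(item.split("=", 1) for item in text.split(",") if item)


def apparent_checksum(reloptions, current_guc):
    """Hypothetical reader: use the persisted value, else fall back to the current GUC."""
    if "checksum" in reloptions:
        return reloptions["checksum"]
    return parse_options(current_guc).get("checksum")


# reloptions as persisted by Cloudberry in this report (no checksum entry).
cloudberry_reloptions = {"compresstype": "zlib", "blocksize": "32768", "compresslevel": "1"}

guc_at_creation = "blocksize=32768,compresstype=none,checksum=true"
guc_changed_later = "blocksize=32768,compresstype=none,checksum=false"  # assumed later change

at_creation = apparent_checksum(cloudberry_reloptions, guc_at_creation)
after_change = apparent_checksum(cloudberry_reloptions, guc_changed_later)
print(at_creation, after_change)
```

With the option missing from `reloptions`, the answer this reader gives follows the GUC rather than the table, which is the ambiguity described above.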
   
   ### Reproduction and Evidence
   
   The following raw `psql` session logs demonstrate the issue. The session on 
Cloudberry shows that `checksum=true` is the configured default but is not 
persisted to `pg_class.reloptions`. The session on Greenplum shows the 
expected, consistent behavior.
   
   ```text
   cbdb@robertmu-VirtualBox:~/Projects/cloudberry$ psql
   psql (14.4, server 14.4)
   Type "help" for help.
   
   cbdb=# select version();
                                                                                                   version
   --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    PostgreSQL 14.4 (Apache Cloudberry 2.1.0-devel+dev.2019.g1cc76495e18 build dev) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0, 64-bit compiled on Jul 18 2025 11:29:40
   (1 row)
   
   cbdb=#
   cbdb=# create table tab_ao(a int, b int) with(appendonly=true, orientation=column, compresstype=zlib, blocksize=32768, compresslevel=1);
   NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'a' as the Apache Cloudberry data distribution key for this table.
   HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
   CREATE TABLE
   cbdb=#
   cbdb=# select oid, relname, reloptions from pg_class where relname = 'tab_ao';
     oid  | relname |                     reloptions
   -------+---------+-----------------------------------------------------
    20763 | tab_ao  | {compresstype=zlib,blocksize=32768,compresslevel=1}
   (1 row)
   
   cbdb=#
   cbdb=# show gp_default_storage_options;
              gp_default_storage_options
   -------------------------------------------------
    blocksize=32768,compresstype=none,checksum=true
   (1 row)
   
   cbdb=#
   cbdb=#

   gpdb7@robertmu-VirtualBox:~/Projects/gpdb-archive$ psql
   psql (12.12)
   Type "help" for help.
   
   gpdb7=# select version();
                                                                                                     version
   ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    PostgreSQL 12.12 (Greenplum Database 7.0.0-beta.0+482967c1b4 build dev) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0, 64-bit compiled on Nov  8 2024 23:43:47 Bhuvnesh C.
   (1 row)
   
   gpdb7=#
   gpdb7=# create table tab_ao(a int, b int) with(appendonly=true, orientation=column, compresstype=zlib, blocksize=32768, compresslevel=1);
   NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'a' as the Greenplum Database data distribution key for this table.
   HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
   CREATE TABLE
   gpdb7=#
   gpdb7=# select oid, relname, reloptions from pg_class where relname = 'tab_ao';
     oid  | relname |                            reloptions
   -------+---------+-------------------------------------------------------------------
    18293 | tab_ao  | {compresstype=zlib,blocksize=32768,compresslevel=1,checksum=true}
   (1 row)
   
   gpdb7=#
   gpdb7=# show gp_default_storage_options;
              gp_default_storage_options
   -------------------------------------------------
    blocksize=32768,compresstype=none,checksum=true
   (1 row)
   
   gpdb7=#
   gpdb7=#
   ```
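
The difference between the two sessions can also be checked mechanically. The following Python sketch parses the two `reloptions` values exactly as printed in the logs above and reports what Cloudberry omits. `parse_reloptions` is a hypothetical helper written for this illustration; it assumes simple `key=value` entries with no quoting or embedded commas.

```python
def parse_reloptions(text):
    """Parse the text form of pg_class.reloptions, e.g. '{a=1,b=2}', into a dict."""
    inner = text.strip().strip("{}")
    return dict(item.split("=", 1) for item in inner.split(",") if item)


# Values copied from the two psql sessions in this report.
cloudberry = parse_reloptions("{compresstype=zlib,blocksize=32768,compresslevel=1}")
greenplum = parse_reloptions("{compresstype=zlib,blocksize=32768,compresslevel=1,checksum=true}")

# Options that Greenplum persisted but Cloudberry did not.
missing = {key: value for key, value in greenplum.items() if key not in cloudberry}
print(missing)
```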
   
   ### Expected Behavior
   
   The `reloptions` column in `pg_class` for the `tab_ao` table should contain 
`checksum=true`, as this is the effective default set by 
`gp_default_storage_options` at the time of creation.
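
As a rough sketch of the expected outcome (not the actual server code path), the persisted options could be thought of as the explicit `WITH` options plus any defaults that were not overridden. `merge_storage_options` is a hypothetical name used only here, and only the options that appear in `reloptions` in the Greenplum session are modeled; `appendonly` and `orientation` are left out because they do not appear there.

```python
def merge_storage_options(explicit, defaults):
    """Keep explicit options as given, then append defaults that were not specified."""
    merged = dict(explicit)
    for key, value in defaults.items():
        merged.setdefault(key, value)
    return merged


# Explicit options from the CREATE TABLE statement in this report.
explicit = {"compresstype": "zlib", "blocksize": "32768", "compresslevel": "1"}
# Defaults as shown by `show gp_default_storage_options`.
defaults = {"blocksize": "32768", "compresstype": "none", "checksum": "true"}

merged = merge_storage_options(explicit, defaults)
reloptions_text = "{" + ",".join(f"{k}={v}" for k, v in merged.items()) + "}"
print(reloptions_text)
```

Under these assumptions the result matches the `reloptions` value that Greenplum stores for the same statement.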
   
   ### Actual Behavior
   
   The `reloptions` column in `pg_class` for the `tab_ao` table **does not** 
contain the `checksum=true` option, even though it is part of the system 
default.
   
   ### Environment
   
   - **Cloudberry Version:** `PostgreSQL 14.4 (Apache Cloudberry 2.1.0-devel+dev.2019.g1cc76495e18 build dev)`
   - **Greenplum Version:** `PostgreSQL 12.12 (Greenplum Database 7.0.0-beta.0+482967c1b4 build dev)`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

