leborchuk commented on code in PR #1192:
URL: https://github.com/apache/cloudberry/pull/1192#discussion_r2204698041


##########
src/backend/utils/misc/guc_gp.c:
##########
@@ -4488,6 +4489,17 @@ struct config_int ConfigureNamesInt_gp[] =
                NULL, NULL, NULL
        },
 
+       {
+               {"optimizer_agg_pds_strategy", PGC_USERSET, DEVELOPER_OPTIONS,
                       gettext_noop("Set the strategy of agg required distribution."),
+                       NULL,
+                       GUC_NOT_IN_SAMPLE
+               },
+               &optimizer_agg_pds_strategy,
               OPTIMIZER_AGG_PDS_ALL_KEY, OPTIMIZER_AGG_PDS_ALL_KEY, OPTIMIZER_AGG_PDS_MINIMAL_LEN_KEY,

Review Comment:
   Sorry, but if I understand correctly, the goal we are pursuing is to exclude 
long strings from hash calculations. For example, if we have 10 columns and the 
5th column is a long string, it would be better not to hash it, and to hash the 
remaining nine columns instead, so that data is still distributed uniformly 
across segments. 
   
   Following this logic, perhaps we could add a GUC that sets the maximum data 
type size allowed in the hash, so that all long strings are excluded. If every 
column is a long string, then let's select the one with the minimum size.
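
   To make the suggestion concrete, here is a minimal sketch of the selection 
rule, not code from this PR. The names `max_hash_key_width` and 
`choose_hash_columns` are hypothetical, the threshold value is arbitrary, and the 
column widths are assumed to be available as plain byte estimates:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the proposed GUC: the widest column, in bytes,
 * that may still take part in the hash. The value 64 is arbitrary. */
static int max_hash_key_width = 64;

/*
 * Decide which columns to hash.
 *
 * widths[i]   estimated width in bytes of column i (assumed input)
 * selected[i] set to true when column i should be hashed
 *
 * Returns the number of selected columns, which is always at least 1.
 */
static size_t
choose_hash_columns(const int *widths, size_t ncols, bool *selected)
{
    size_t nselected = 0;
    size_t narrowest = 0;

    assert(ncols > 0);

    for (size_t i = 0; i < ncols; i++)
    {
        /* Keep every column that fits under the threshold. */
        selected[i] = (widths[i] <= max_hash_key_width);
        if (selected[i])
            nselected++;

        /* Remember the narrowest column for the fallback below. */
        if (widths[i] < widths[narrowest])
            narrowest = i;
    }

    /* Every column is too wide: fall back to the single narrowest one. */
    if (nselected == 0)
    {
        selected[narrowest] = true;
        nselected = 1;
    }

    return nselected;
}
```

   With ten columns where only the 5th is wide, this keeps the other nine. When 
every column exceeds the threshold, it keeps only the narrowest, which matches 
the fallback described above.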



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
