davecromberge commented on code in PR #14373:
URL: https://github.com/apache/pinot/pull/14373#discussion_r1869402277
##########
pinot-plugins/pinot-minion-tasks/pinot-minion-builtin-tasks/src/main/java/org/apache/pinot/plugin/minion/tasks/mergerollup/MergeRollupTaskUtils.java:
##########
@@ -45,12 +47,25 @@ private MergeRollupTaskUtils() {
*/
public static Map<String, Map<String, String>>
getLevelToConfigMap(Map<String, String> taskConfig) {
Map<String, Map<String, String>> levelToConfigMap = new TreeMap<>();
+
+ // Regex to match aggregation function parameter keys
+ Pattern pattern =
Pattern.compile("(\\w+)\\.aggregationFunctionParameters\\.(\\w+)\\.(\\w+)");
+
for (Map.Entry<String, String> entry : taskConfig.entrySet()) {
String key = entry.getKey();
for (String configKey : VALID_CONFIG_KEYS) {
if (key.endsWith(configKey)) {
String level = key.substring(0, key.length() - configKey.length() -
1);
levelToConfigMap.computeIfAbsent(level, k -> new
TreeMap<>()).put(configKey, entry.getValue());
+ } else {
+ Matcher matcher = pattern.matcher(key);
Review Comment:
@swaminathanmanish this is unfortunately not possible. The "level" variable
is extracted by using the substring of the key from the beginning of the string
until the position of the config key itself.
Because the config key includes a metric name, the key parameter does not
solely consist of the term in the `VALID_CONFIG_KEYS` set.
For example, using the following task configuration:
```
"hourly.aggregationFunctionParameters.metricColumnA.nominalEntries": "16384"
"hourly.aggregationFunctionParameters.metricColumnB.nominalEntries": "8192"
"daily.aggregationFunctionParameters.metricColumnA.nominalEntries": "8192"
"daily.aggregationFunctionParameters.metricColumnB.nominalEntries": "4096"
```
Correctly produces the following `levelToConfigMap` of type `Map<String,
Map<String, String>>`:
```
hourly ->
aggregationFunctionParameters.metricColumnA.nominalEntries -> 16384
aggregationFunctionParameters.metricColumnB.nominalEntries -> 8192
daily ->
aggregationFunctionParameters.metricColumnA.nominalEntries -> 8192
aggregationFunctionParameters.metricColumnB.nominalEntries -> 4096
```
If `nominalEntries` was added to the `VALID_CONFIG_KEYS` parsing path, we
would get the following `levelToConfigMap`:
```
hourly.aggregationFunctionParameters.metricColumnA ->
nominalEntries -> 16384
hourly.aggregationFunctionParameters.metricColumnB ->
nominalEntries -> 8192
daily.aggregationFunctionParameters.metricColumnA ->
nominalEntries -> 8192
daily.aggregationFunctionParameters.metricColumnB ->
nominalEntries -> 4096
```
This is because the level would be extracted incorrectly to include the
prefix - which excludes the `VALID_CONFIG_KEY`. I can't think of an
alternative to the above, which preserves the existing structure. Perhaps this
is another "abuse" of using Maps but IMO it seems the most compatible with the
existing design.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]