siddharthteotia commented on code in PR #9454:
URL: https://github.com/apache/pinot/pull/9454#discussion_r983142148
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/loader/IndexLoadingConfig.java:
##########
@@ -215,6 +218,26 @@ private void extractFromTableConfig(TableConfig
tableConfig) {
}
}
+ /**
+ * Extracts compressionType for each column. Populates a map containing
column name as key and compression type as
+ * value. Note that only RAW forward index columns will be populated in this
map.
+ * @param tableConfig table config
+ */
+ private void extractCompressionConfigs(TableConfig tableConfig) {
Review Comment:
My concern was the following
- User set the `noDictionaryConfig` (which is the column to compression
codec map) in tableConfig in old style and asks for compression during initial
segment creation
- Now sometime later in order to change compression, adds the same thing to
`FieldConfig` but with different codec that they want to convert to but does
not change the codec in the 2nd place in `noDictionaryConfig`
- We honor the `FieldConfig` and do the conversion using this feature.
- User later backfills the segment (thus going through the generation path
which actually uses the old style `noDictionaryConfig`) and ends up getting the
segment in previous compression codec.
However, I checked that this will not happen. When `SegmentGeneratorConfig`
is created, we actually unify / merge the configs (for the reason that
migration hasn't been done) and so it will honor `FieldConfig`. Check the
`extractCompressionCodecConfigsFromTableConfig` method which is invoked after
the following code and thus given the preference
```
if (noDictionaryColumns != null) {
this.setRawIndexCreationColumns(noDictionaryColumns);
if (noDictionaryColumnMap != null) {
Map<String, ChunkCompressionType> serializedNoDictionaryColumnMap
= noDictionaryColumnMap.entrySet().stream()
.collect(Collectors.toMap(Map.Entry::getKey, e ->
ChunkCompressionType.valueOf(e.getValue())));
this.setRawIndexCompressionType(serializedNoDictionaryColumnMap);
}
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]