xiangfu0 commented on code in PR #18229:
URL: https://github.com/apache/pinot/pull/18229#discussion_r3110499478
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/loader/ForwardIndexHandler.java:
##########
@@ -374,28 +377,88 @@ private boolean shouldDisableDictionary(String column,
ColumnMetadata existingCo
return true;
}
- private boolean shouldChangeRawCompressionType(String column,
SegmentDirectory.Reader segmentReader)
+ private boolean shouldRewriteRawForwardIndex(String column,
SegmentDirectory.Reader segmentReader)
throws Exception {
// The compression type for an existing segment can only be determined by
reading the forward index header.
ColumnMetadata existingColMetadata =
_segmentDirectory.getSegmentMetadata().getColumnMetadataFor(column);
ChunkCompressionType existingCompressionType;
+ String existingCodecSpec;
// Get the forward index reader factory and create a reader
IndexReaderFactory<ForwardIndexReader> readerFactory =
StandardIndexes.forward().getReaderFactory();
try (ForwardIndexReader<?> fwdIndexReader =
readerFactory.createIndexReader(segmentReader,
_fieldIndexConfigs.get(column), existingColMetadata)) {
existingCompressionType = fwdIndexReader.getCompressionType();
- Preconditions.checkState(existingCompressionType != null,
- "Existing compressionType cannot be null for raw forward index
column=" + column);
+ existingCodecSpec = fwdIndexReader.getCodecSpec();
}
- // Get the new compression type.
- ChunkCompressionType newCompressionType =
-
_fieldIndexConfigs.get(column).getConfig(StandardIndexes.forward()).getChunkCompressionType();
+ ForwardIndexConfig fwdConfig =
_fieldIndexConfigs.get(column).getConfig(StandardIndexes.forward());
+
+ if (existingCompressionType == null) {
+ // V7 codec-pipeline only supports INT and LONG; guard against future
type expansion reaching this path.
+ DataType existingStoredType =
existingColMetadata.getDataType().getStoredType();
+ Preconditions.checkState(existingStoredType == DataType.INT ||
existingStoredType == DataType.LONG,
+ "V7 codec-pipeline segment for column=%s has unexpected stored type
%s; expected INT or LONG",
+ column, existingStoredType);
+ // Codec-pipeline segment (version 7): compare the stored canonical spec
against the configured codecSpec.
+ // If no codecSpec is configured, a legacy compressionCodec or default
will apply to new segments only;
+ // existing V7 segments are left as-is (no rewrite needed).
+ String newCodecSpec = fwdConfig.getCodecSpec();
+ if (newCodecSpec == null) {
+ // Config may have reverted from codecSpec back to a legacy
compressionCodec — schedule a
+ // rewrite back to the legacy format so the segment can be read by
older servers.
+ // PASS_THROUGH (internally aliased as NONE) is excluded:
PASS_THROUGH, CLP, CLPV2, and
+ // related codecs all resolve to ChunkCompressionType.PASS_THROUGH and
do not represent a
+ // genuine rollback to a real compression codec.
+ ChunkCompressionType newCompressionType =
fwdConfig.getChunkCompressionType();
+ if (newCompressionType != null && newCompressionType !=
ChunkCompressionType.PASS_THROUGH) {
Review Comment:
This `PASS_THROUGH` exclusion makes rollback incomplete. If an operator
reverts a V7 `codecSpec` column to explicit legacy
`compressionCodec=PASS_THROUGH`, segment reload will skip the rewrite and leave
the segment on the V7 on-disk format, so older servers still cannot load it.
For rollback safety, any explicit legacy `compressionCodec` should force
rewriting the segment back to the legacy format, including `PASS_THROUGH`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]