jberragan commented on code in PR #206:
URL:
https://github.com/apache/cassandra-analytics/pull/206#discussion_r3306567758
##########
cassandra-analytics-common/src/main/java/org/apache/cassandra/spark/data/FileType.java:
##########
@@ -83,4 +83,15 @@ public String getFileSuffix()
{
return fileSuffix;
}
+
+ /**
+ * Whether the on-disk size of this component can drift from the value recorded in a backup
+ * manifest. Cassandra rewrites Summary/Filter/Statistics in place during compaction, so a
+ * stale manifest size for these components can produce truncated ranged-GETs against the
+ * backing store. The data layer treats these components specially when issuing reads.
+ */
+ public boolean isMutableMetadata()
+ {
+ return this == SUMMARY || this == FILTER || this == STATISTICS;
Review Comment:
Is this code/comment fully accurate? As far as I know, Statistics.db can be
mutated in place to update the repair metadata (repairedAt, pendingRepair)
without recompacting, and to change the compaction level without a full rewrite.
Summary.db appears to mutate only when the index sample size changes
(redistributeSummaries).
For Filter.db, I can't find any place where it mutates. I think it is immutable
from the first flush.
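If that reading holds, a narrower check might look like the sketch below. It is illustrative only: `ComponentSketch` is a hypothetical stand-in for `FileType` that models just the three components discussed here, and dropping FILTER rests on the unverified assumption that Filter.db is never rewritten after the first flush.

```java
// Hypothetical stand-in for FileType; only the three components discussed are modeled.
enum ComponentSketch
{
    SUMMARY("Summary.db"),
    FILTER("Filter.db"),
    STATISTICS("Statistics.db");

    private final String fileSuffix;

    ComponentSketch(String fileSuffix)
    {
        this.fileSuffix = fileSuffix;
    }

    String getFileSuffix()
    {
        return fileSuffix;
    }

    /**
     * Whether this component may be rewritten in place after it is first written.
     * Assumption (from the review comment, not verified against Cassandra source):
     * Statistics.db mutates for repair metadata and compaction level changes,
     * Summary.db mutates when index summaries are redistributed, and
     * Filter.db does not mutate after the first flush.
     */
    boolean isMutableMetadata()
    {
        return this == SUMMARY || this == STATISTICS;
    }
}
```

Whether FILTER should stay in the set depends on confirming the Filter.db behaviour. Keeping it would be the conservative choice if there is any doubt.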