jberragan commented on code in PR #206:
URL: https://github.com/apache/cassandra-analytics/pull/206#discussion_r3306567758


##########
cassandra-analytics-common/src/main/java/org/apache/cassandra/spark/data/FileType.java:
##########
@@ -83,4 +83,15 @@ public String getFileSuffix()
     {
         return fileSuffix;
     }
+
+    /**
+     * Whether the on-disk size of this component can drift from the value recorded in a backup
+     * manifest. Cassandra rewrites Summary/Filter/Statistics in place during compaction, so a
+     * stale manifest size for these components can produce truncated ranged-GETs against the
+     * backing store. The data layer treats these components specially when issuing reads.
+     */
+    public boolean isMutableMetadata()
+    {
+        return this == SUMMARY || this == FILTER || this == STATISTICS;

Review Comment:
   Is this code/comment fully accurate? As far as I'm aware, Statistics.db can
   be mutated in place: the repair metadata (repairedAt, pendingRepair) is
   updated to avoid recompaction, and the compaction level is updated to avoid
   a full rewrite.
   
   Summary.db appears to mutate only if there is a change in the index sample
   size (redistributeSummaries).
   
   For Filter.db, I can't find any place where it mutates; I think it is
   immutable from the first flush.
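
To make the question concrete, here is a minimal sketch of how the check might read if the reading above holds. It is an illustration only: `SketchFileType` and `MutableMetadataSketch` are hypothetical stand-ins rather than the real `FileType`, and treating Filter.db as immutable is an assumption from this comment that still needs to be confirmed against Cassandra.

```java
// Hypothetical stand-in for FileType; only a few components are modelled here.
enum SketchFileType
{
    DATA, INDEX, SUMMARY, FILTER, STATISTICS;

    // Assumptions taken from this review comment, not verified against Cassandra:
    //  - Statistics.db can change in place (repairedAt, pendingRepair, compaction level)
    //  - Summary.db can change when the index sample size changes (redistributeSummaries)
    //  - Filter.db does not change after the first flush, so it is excluded
    boolean isMutableMetadata()
    {
        return this == SUMMARY || this == STATISTICS;
    }
}

final class MutableMetadataSketch
{
    public static void main(String[] args)
    {
        // Print the classification of each modelled component
        for (SketchFileType type : SketchFileType.values())
        {
            System.out.println(type + " mutable=" + type.isMutableMetadata());
        }
    }
}
```

If Filter.db does turn out to mutate somewhere, the only change needed in this sketch would be adding `FILTER` back to the condition; the Javadoc would still need to describe the actual mutation paths rather than compaction.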



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

