lukasz-antoniak commented on code in PR #211:
URL: https://github.com/apache/cassandra-analytics/pull/211#discussion_r3323484553


##########
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/SortedSSTableWriter.java:
##########
@@ -297,6 +311,25 @@ public synchronized void close(BulkWriterContext writerContext) throws IOExcepti
         validateSSTables(writerContext);
     }
 
+    private void rebuildFilterComponents(@NotNull BulkWriterContext writerContext, @NotNull Path outputDirectory,
+                                         @NotNull DirectoryStream.Filter<Path> filter) throws IOException
+    {
+        LocalDataLayer layer = buildLocalDataLayer(writerContext, outputDirectory, null);
+        for (Path dataFile : getDataFileStream(filter))
+        {
+            try
+            {
+                FileSystemSSTable ssTable = new FileSystemSSTable(dataFile, false, BufferingInputStreamStats::doNothingStats);
+                writerContext.bridge().rebuildBloomFilter(layer.partitioner(), layer.cqlTable(), ssTable, outputDirectory);
+                LOGGER.error("Rebuilt bloom filter for sstable {}", dataFile);
+            }
+            catch (Exception e)
+            {
+                LOGGER.warn("Failed to rebuild bloom filter for sstable {}", dataFile, e);

Review Comment:
   I initially thought to ignore failures, but producing an unparseable file 
(and uploading it) may indeed cause an issue. The original 16B file is 
overwritten, so in case of failure we would publish not the empty filter but a 
potentially broken one.
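
   One way to address the concern above, sketched under assumptions: write the 
rebuilt component to a temporary file in the same directory and move it over 
the original only after the write succeeds, so a failed rebuild leaves the 
original file untouched. The class `SafeComponentRewrite` and the 
`ComponentWriter` callback are hypothetical names used for illustration; they 
are not part of the PR or of the bridge API, and how `rebuildBloomFilter` 
actually writes its output is not shown in this hunk.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not taken from the PR: illustrates "write to temp, then replace".
final class SafeComponentRewrite
{
    /** Hypothetical callback that writes the regenerated component to the given path. */
    @FunctionalInterface
    interface ComponentWriter
    {
        void writeTo(Path target) throws IOException;
    }

    private SafeComponentRewrite()
    {
    }

    /**
     * Regenerates {@code target} without ever exposing a half-written file.
     * On failure the temporary file is deleted, the original is left as it was,
     * and the exception propagates so the caller can decide how to react.
     */
    static void rewrite(Path target, ComponentWriter writer) throws IOException
    {
        Path directory = target.toAbsolutePath().getParent();
        // Same directory as the target, so the final move stays on one file system.
        Path temp = Files.createTempFile(directory, target.getFileName().toString(), ".tmp");
        try
        {
            writer.writeTo(temp);
            // Replace the original only after the new content was fully written.
            Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
        }
        finally
        {
            // No-op after a successful move; removes the partial file after a failure.
            Files.deleteIfExists(temp);
        }
    }
}
```

   With this shape, the `catch` block in the hunk could still log and continue, 
because the file that gets uploaded would be either the original or a fully 
written replacement. Whether continuing is acceptable is a separate decision.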



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

