yifan-c commented on code in PR #38:
URL: 
https://github.com/apache/cassandra-analytics/pull/38#discussion_r1482390314


##########
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/BulkSparkConf.java:
##########
@@ -128,12 +128,13 @@ public class BulkSparkConf implements Serializable
     protected boolean useOpenSsl;
     protected int ringRetryCount;
     protected final Set<String> blockedInstances;
+    protected final DigestTypeOption digestTypeOption;
 
     public BulkSparkConf(SparkConf conf, Map<String, String> options)
     {
         this.conf = conf;
         Optional<Integer> sidecarPortFromOptions = MapUtils.getOptionalInt(options, WriterOptions.SIDECAR_PORT.name(), "sidecar port");
-        this.userProvidedSidecarPort = sidecarPortFromOptions.isPresent() ? sidecarPortFromOptions.get() : getOptionalInt(SIDECAR_PORT).orElse(-1);
+        this.userProvidedSidecarPort = sidecarPortFromOptions.orElseGet(() -> getOptionalInt(SIDECAR_PORT).orElse(-1));

Review Comment:
   nit: the ternary operator reads better than `orElseGet` here
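
   For comparison, a minimal sketch of the two forms side by side. The class and
   helper names are hypothetical stand-ins, not the real `BulkSparkConf` code;
   `portFromConf` only imitates `getOptionalInt(SIDECAR_PORT)`. Both forms return
   the same value, so the choice is about readability only.

```java
import java.util.Optional;

// Hypothetical stand-in for the sidecar-port lookup; not the real BulkSparkConf.
final class SidecarPortExample
{
    // Imitates getOptionalInt(SIDECAR_PORT): empty when nothing is configured.
    static Optional<Integer> portFromConf(Integer configured)
    {
        return Optional.ofNullable(configured);
    }

    // Form before the change: explicit ternary.
    static int withTernary(Optional<Integer> fromOptions, Integer configured)
    {
        return fromOptions.isPresent() ? fromOptions.get() : portFromConf(configured).orElse(-1);
    }

    // Form after the change: orElseGet with a nested lambda.
    static int withOrElseGet(Optional<Integer> fromOptions, Integer configured)
    {
        return fromOptions.orElseGet(() -> portFromConf(configured).orElse(-1));
    }

    public static void main(String[] args)
    {
        System.out.println(withTernary(Optional.of(9043), 8080));    // option wins
        System.out.println(withOrElseGet(Optional.empty(), 8080));   // falls back to conf
        System.out.println(withOrElseGet(Optional.empty(), null));   // falls back to -1
    }
}
```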



##########
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/common/Digest.java:
##########
@@ -19,25 +19,13 @@
 
 package org.apache.cassandra.spark.common;
 
-import java.security.MessageDigest;
-import java.util.Base64;
-
-public final class MD5Hash
+/**
+ * Interface that represents a checksum digest

Review Comment:
   nit: "checksum" and "digest" mean roughly the same thing, so "checksum digest"
   is redundant, like saying "I have two animal cats". "Checksum" can be removed.



##########
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/DigestTypeOption.java:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.cassandra.spark.bulkwriter;
+
+import org.apache.cassandra.spark.utils.DigestProvider;
+import org.apache.cassandra.spark.utils.MD5DigestProvider;
+import org.apache.cassandra.spark.utils.XXHash32DigestProvider;
+
+/**
+ * Represents the user-provided digest type configuration to be used to validate SSTable files during bulk writes
+ */
+public enum DigestTypeOption
+{
+    /**
+     * Represents an MD5 digest type option. This option is supported for legacy reasons, but its use
+     * is strongly discouraged.
+     */
+    MD5
+    {
+        @Override
+        DigestProvider provider()
+        {
+            return new MD5DigestProvider();
+        }
+    },
+
+    /**
+     * Represents an xxhash32 digest type option
+     */
+    XXHASH32
+    {
+        @Override
+        DigestProvider provider()
+        {
+            return new XXHash32DigestProvider();
+        }
+    };
+
+    /**
+     * @return the provider for the configured digest type
+     */
+    abstract DigestProvider provider();

Review Comment:
   How about extracting this into an interface and implementing the interface in
   the enum? That way, someone who cannot change the source could still supply a
   different implementation.
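
   A rough sketch of what that could look like. `DigestProviderSource` is a
   hypothetical name for the extracted interface, and `DigestProvider` here is a
   simplified placeholder for the real type in
   `org.apache.cassandra.spark.utils`; only the shape of the enum follows the
   diff above.

```java
// Simplified placeholder for the real DigestProvider; the single method is an assumption.
interface DigestProvider
{
    String name();
}

// Hypothetical name for the extracted interface.
interface DigestProviderSource
{
    DigestProvider provider();
}

// The built-in options keep working as enum constants.
enum DigestTypeOption implements DigestProviderSource
{
    MD5
    {
        @Override
        public DigestProvider provider()
        {
            return () -> "md5";
        }
    },

    XXHASH32
    {
        @Override
        public DigestProvider provider()
        {
            return () -> "xxhash32";
        }
    }
}

// A caller who cannot edit the enum can still plug in a different implementation.
final class CustomDigestOption implements DigestProviderSource
{
    @Override
    public DigestProvider provider()
    {
        return () -> "custom";
    }
}

final class DigestOptionExample
{
    // Code that depends only on the interface accepts either kind of option.
    static String describe(DigestProviderSource source)
    {
        return source.provider().name();
    }

    public static void main(String[] args)
    {
        System.out.println(describe(DigestTypeOption.XXHASH32));
        System.out.println(describe(new CustomDigestOption()));
    }
}
```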



##########
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/RecordWriter.java:
##########
@@ -400,4 +403,23 @@ private StreamSession createStreamSession(TaskContext taskContext)
         LOGGER.info("[{}] Creating stream session for range={}", taskContext.partitionId(), tokenRange);
         return new StreamSession(writerContext, getStreamId(taskContext), tokenRange, failureHandler);
     }
+
+    /**
+     * Functional interface that helps with supplying {@link SSTableWriter} instances.
+     */
+    public interface SSTableWriterSupplier

Review Comment:
   nit: by convention, a supplier takes no parameters, so this is really a
   factory. How about `SSTableWriterFactory`, wdyt?
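
   To illustrate the naming convention: `java.util.function.Supplier<T>` exposes
   a single no-argument `get()`, whereas an interface that builds an object from
   its arguments is usually called a factory. In the sketch below, `SSTableWriter`
   is a bare placeholder, and the `create` method name and its parameter list are
   assumptions, since the diff above does not show the real signature.

```java
import java.util.function.Supplier;

// Bare placeholder for the real SSTableWriter; the fields are assumptions.
final class SSTableWriter
{
    final String directory;
    final int partitionId;

    SSTableWriter(String directory, int partitionId)
    {
        this.directory = directory;
        this.partitionId = partitionId;
    }
}

// Takes parameters, so "factory" describes it better than "supplier".
@FunctionalInterface
interface SSTableWriterFactory
{
    SSTableWriter create(String directory, int partitionId);
}

final class FactoryNamingExample
{
    // A true supplier: everything it needs is captured up front, get() takes nothing.
    static SSTableWriter fromSupplier()
    {
        Supplier<SSTableWriter> supplier = () -> new SSTableWriter("/tmp/default", 0);
        return supplier.get();
    }

    // A factory: the caller passes the inputs at creation time.
    static SSTableWriter fromFactory(String directory, int partitionId)
    {
        SSTableWriterFactory factory = SSTableWriter::new;
        return factory.create(directory, partitionId);
    }

    public static void main(String[] args)
    {
        System.out.println(fromSupplier().directory);
        System.out.println(fromFactory("/tmp/task-3", 3).partitionId);
    }
}
```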



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

