Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-26 Thread via GitHub


frankgh commented on PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#issuecomment-1912886999

   Closed via 
https://github.com/apache/cassandra-analytics/commit/8c20b452dd0728a6fad6d276a7be9fa1b9274495


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-26 Thread via GitHub


frankgh closed pull request #31: CASSANDRA-19273 Allow setting TTL for 
snapshots created through reader
URL: https://github.com/apache/cassandra-analytics/pull/31


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-26 Thread via GitHub


yifan-c commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1468236613


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -109,6 +134,50 @@ private ClientConfig(Map options)
 this.quoteIdentifiers = MapUtils.getBoolean(options, 
QUOTE_IDENTIFIERS, false);
 }
 
+private ClearSnapshotStrategy parseClearSnapshotStrategy(boolean 
hasDeprecatedOption,
+ boolean 
clearSnapshot,
+ String 
clearSnapshotStrategyOption)
+{
+if (hasDeprecatedOption)
+{
+LOGGER.warn("The deprecated option 'clearSnapshot' is set. Please 
set 'clearSnapshotStrategy' instead.");
+if (clearSnapshotStrategyOption == null)
+{
+return clearSnapshot ? ClearSnapshotStrategy.defaultStrategy() 
: new ClearSnapshotStrategy.NoOp();
+}
+}
+if (clearSnapshotStrategyOption == null)
+{
+LOGGER.debug("No clearSnapshotStrategy is set. Using the default 
strategy");
+return ClearSnapshotStrategy.defaultStrategy();
+}
+String[] strategyParts = clearSnapshotStrategyOption.split(" ");
+String strategyName;
+String snapshotTTL = null;
+if (strategyParts.length == 1)
+{
+strategyName = strategyParts[0].trim();
+}
+else if (strategyParts.length == 2)
+{
+strategyName = strategyParts[0].trim();
+snapshotTTL = strategyParts[1].trim();
+if (!Pattern.matches(SNAPSHOT_TTL_PATTERN, snapshotTTL))
+{
+String msg = "Incorrect value set for clearSnapshotStrategy, 
expected format is " +
+ "{strategy [snapshotTTLvalue]}. TTL value 
specified must contain unit along. " +
+ "For e.g. 2d represents a TTL for 2 days";
+throw new IllegalArgumentException(msg);
+}
+}
+else
+{
+LOGGER.error("Invalid value for ClearSnapshotStrategy: '{}'", 
clearSnapshotStrategyOption);
+throw new IllegalArgumentException("Invalid value: " + 
clearSnapshotStrategyOption);

Review Comment:
   Fail fast sounds good. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-25 Thread via GitHub


sarankk commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1467053662


##
cassandra-bridge/src/main/java/org/apache/cassandra/spark/utils/MapUtils.java:
##
@@ -225,4 +225,15 @@ public static String getOrDefault(Map 
options, String key, Strin
 {
 return options.getOrDefault(lowerCaseKey(key), defaultValue);
 }
+
+/**
+ * Method to check if key is present in {@code options} map.
+ * @param options   the map
+ * @param key   the key to the map
+ * @return boolean
+ */
+public static boolean contains(Map options, String key)

Review Comment:
   makes sense, will make it consistent with `Map`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-25 Thread via GitHub


frankgh commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1467052902


##
cassandra-bridge/src/main/java/org/apache/cassandra/spark/utils/MapUtils.java:
##
@@ -225,4 +225,15 @@ public static String getOrDefault(Map 
options, String key, Strin
 {
 return options.getOrDefault(lowerCaseKey(key), defaultValue);
 }
+
+/**
+ * Method to check if key is present in {@code options} map.
+ * @param options   the map
+ * @param key   the key to the map
+ * @return boolean
+ */
+public static boolean contains(Map options, String key)

Review Comment:
   should this method be called `containsKey` instead?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-22 Thread via GitHub


sarankk commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462775859


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -109,6 +134,50 @@ private ClientConfig(Map options)
 this.quoteIdentifiers = MapUtils.getBoolean(options, 
QUOTE_IDENTIFIERS, false);
 }
 
+private ClearSnapshotStrategy parseClearSnapshotStrategy(boolean 
hasDeprecatedOption,
+ boolean 
clearSnapshot,
+ String 
clearSnapshotStrategyOption)
+{
+if (hasDeprecatedOption)
+{
+LOGGER.warn("The deprecated option 'clearSnapshot' is set. Please 
set 'clearSnapshotStrategy' instead.");
+if (clearSnapshotStrategyOption == null)
+{
+return clearSnapshot ? ClearSnapshotStrategy.defaultStrategy() 
: new ClearSnapshotStrategy.NoOp();
+}
+}
+if (clearSnapshotStrategyOption == null)
+{
+LOGGER.debug("No clearSnapshotStrategy is set. Using the default 
strategy");
+return ClearSnapshotStrategy.defaultStrategy();
+}
+String[] strategyParts = clearSnapshotStrategyOption.split(" ");
+String strategyName;
+String snapshotTTL = null;
+if (strategyParts.length == 1)
+{
+strategyName = strategyParts[0].trim();
+}
+else if (strategyParts.length == 2)
+{
+strategyName = strategyParts[0].trim();
+snapshotTTL = strategyParts[1].trim();
+if (!Pattern.matches(SNAPSHOT_TTL_PATTERN, snapshotTTL))
+{
+String msg = "Incorrect value set for clearSnapshotStrategy, 
expected format is " +
+ "{strategy [snapshotTTLvalue]}. TTL value 
specified must contain unit along. " +
+ "For e.g. 2d represents a TTL for 2 days";
+throw new IllegalArgumentException(msg);
+}
+}
+else
+{
+LOGGER.error("Invalid value for ClearSnapshotStrategy: '{}'", 
clearSnapshotStrategyOption);
+throw new IllegalArgumentException("Invalid value: " + 
clearSnapshotStrategyOption);

Review Comment:
   Should we fail fast there too?. If users are setting `clearSnapshotStrategy` 
then they might want a particular behaviour, if that behaviour can't be met, 
better to alert earlier. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-22 Thread via GitHub


sarankk commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462774050


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -24,25 +24,45 @@
 import java.util.Map;
 import java.util.Optional;
 import java.util.UUID;
+import java.util.regex.Pattern;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 import org.apache.cassandra.bridge.BigNumberConfigImpl;
 import org.apache.cassandra.spark.config.SchemaFeature;
 import org.apache.cassandra.spark.config.SchemaFeatureSet;
 import org.apache.cassandra.spark.data.partitioner.ConsistencyLevel;
 import org.apache.cassandra.spark.utils.MapUtils;
+import org.jetbrains.annotations.NotNull;
 import org.jetbrains.annotations.Nullable;
 
 import static 
org.apache.cassandra.spark.data.CassandraDataLayer.aliasLastModifiedTimestamp;
 
 public final class ClientConfig
 {
+private static final Logger LOGGER = 
LoggerFactory.getLogger(ClientConfig.class);
+
 public static final String SIDECAR_INSTANCES = "sidecar_instances";
 public static final String KEYSPACE_KEY = "keyspace";
 public static final String TABLE_KEY = "table";
 public static final String SNAPSHOT_NAME_KEY = "snapshotName";
 public static final String DC_KEY = "dc";
 public static final String CREATE_SNAPSHOT_KEY = "createSnapshot";
 public static final String CLEAR_SNAPSHOT_KEY = "clearSnapshot";
+/**
+ * Format of clearSnapshotStrategy is {strategy [snapshotTTLvalue]}, 
clearSnapshotStrategy holds both the strategy
+ * and in case of TTL based strategy, TTL value. For e.g. 
onCompletionOrTTL 2d, TTL 2d, noOp, onCompletion. For
+ * clear snapshot strategies allowed check {@link ClearSnapshotStrategy}
+ */
+public static final String CLEAR_SNAPSHOT_STRATEGY_KEY = 
"clearSnapshotStrategy";
+/**
+ * TTL value is time to live option for the snapshot (available since 
Cassandra 4.1+). TTL value specified must
+ * contain unit along. For e.g. 2d represents a TTL for 2 days; 1h 
represents a TTL of 1 hour, etc.
+ * Valid units are {@code d}, {@code h}, {@code s}, {@code ms}, {@code 
us}, {@code µs}, {@code ns}, and {@code m}.
+ */
+public static final String DEFAULT_SNAPSHOT_TTL_VALUE = "2d";
+public static final String SNAPSHOT_TTL_PATTERN = 
"\\d+(d|h|m|s|ms)|(\\d+|\\d+\\.\\d+|\\.\\d+)[eE][+-](\\d+|\\d+\\.\\d+|\\.\\d+)(us|µs|ns)";

Review Comment:
   Thanks updated pattern accordingly 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-22 Thread via GitHub


frankgh commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462559574


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -109,6 +134,50 @@ private ClientConfig(Map options)
 this.quoteIdentifiers = MapUtils.getBoolean(options, 
QUOTE_IDENTIFIERS, false);
 }
 
+private ClearSnapshotStrategy parseClearSnapshotStrategy(boolean 
hasDeprecatedOption,

Review Comment:
   maybe make it protected for extensibility?



##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -109,6 +134,50 @@ private ClientConfig(Map options)
 this.quoteIdentifiers = MapUtils.getBoolean(options, 
QUOTE_IDENTIFIERS, false);
 }
 
+private ClearSnapshotStrategy parseClearSnapshotStrategy(boolean 
hasDeprecatedOption,
+ boolean 
clearSnapshot,
+ String 
clearSnapshotStrategyOption)
+{
+if (hasDeprecatedOption)
+{
+LOGGER.warn("The deprecated option 'clearSnapshot' is set. Please 
set 'clearSnapshotStrategy' instead.");
+if (clearSnapshotStrategyOption == null)
+{
+return clearSnapshot ? ClearSnapshotStrategy.defaultStrategy() 
: new ClearSnapshotStrategy.NoOp();
+}
+}
+if (clearSnapshotStrategyOption == null)
+{
+LOGGER.debug("No clearSnapshotStrategy is set. Using the default 
strategy");
+return ClearSnapshotStrategy.defaultStrategy();
+}
+String[] strategyParts = clearSnapshotStrategyOption.split(" ");
+String strategyName;
+String snapshotTTL = null;
+if (strategyParts.length == 1)
+{
+strategyName = strategyParts[0].trim();
+}
+else if (strategyParts.length == 2)
+{
+strategyName = strategyParts[0].trim();
+snapshotTTL = strategyParts[1].trim();
+if (!Pattern.matches(SNAPSHOT_TTL_PATTERN, snapshotTTL))
+{
+String msg = "Incorrect value set for clearSnapshotStrategy, 
expected format is " +
+ "{strategy [snapshotTTLvalue]}. TTL value 
specified must contain unit along. " +
+ "For e.g. 2d represents a TTL for 2 days";
+throw new IllegalArgumentException(msg);
+}
+}
+else
+{
+LOGGER.error("Invalid value for ClearSnapshotStrategy: '{}'", 
clearSnapshotStrategyOption);
+throw new IllegalArgumentException("Invalid value: " + 
clearSnapshotStrategyOption);

Review Comment:
   In ClearSnapshotStrategy.create the fallback behavior is to return the 
default strategy. Should we consider that fallback strategy here as well? 



##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -24,25 +24,45 @@
 import java.util.Map;
 import java.util.Optional;
 import java.util.UUID;
+import java.util.regex.Pattern;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 import org.apache.cassandra.bridge.BigNumberConfigImpl;
 import org.apache.cassandra.spark.config.SchemaFeature;
 import org.apache.cassandra.spark.config.SchemaFeatureSet;
 import org.apache.cassandra.spark.data.partitioner.ConsistencyLevel;
 import org.apache.cassandra.spark.utils.MapUtils;
+import org.jetbrains.annotations.NotNull;
 import org.jetbrains.annotations.Nullable;
 
 import static 
org.apache.cassandra.spark.data.CassandraDataLayer.aliasLastModifiedTimestamp;
 
 public final class ClientConfig
 {
+private static final Logger LOGGER = 
LoggerFactory.getLogger(ClientConfig.class);
+
 public static final String SIDECAR_INSTANCES = "sidecar_instances";
 public static final String KEYSPACE_KEY = "keyspace";
 public static final String TABLE_KEY = "table";
 public static final String SNAPSHOT_NAME_KEY = "snapshotName";
 public static final String DC_KEY = "dc";
 public static final String CREATE_SNAPSHOT_KEY = "createSnapshot";
 public static final String CLEAR_SNAPSHOT_KEY = "clearSnapshot";
+/**
+ * Format of clearSnapshotStrategy is {strategy [snapshotTTLvalue]}, 
clearSnapshotStrategy holds both the strategy
+ * and in case of TTL based strategy, TTL value. For e.g. 
onCompletionOrTTL 2d, TTL 2d, noOp, onCompletion. For
+ * clear snapshot strategies allowed check {@link ClearSnapshotStrategy}
+ */
+public static final String CLEAR_SNAPSHOT_STRATEGY_KEY = 
"clearSnapshotStrategy";
+/**
+ * TTL value is time to live option for the snapshot (available since 
Cassandra 4.1+). TTL value specified must
+ * contain unit along. For e.g. 2d represents a TTL for 2 days; 1h 
represents a TTL of 1 hour, etc.
+ * Valid units 

Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-22 Thread via GitHub


sarankk commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462469394


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -138,7 +138,7 @@ private ClearSnapshotStrategy 
parseClearSnapshotStrategy(boolean hasDeprecatedOp
  boolean 
clearSnapshot,
  String 
clearSnapshotStrategyOption)
 {
-if (hasDeprecatedOption)
+if (hasDeprecatedOption && clearSnapshotStrategyOption == null)

Review Comment:
   Makes sense, will update it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-22 Thread via GitHub


sarankk commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462466619


##
cassandra-analytics-integration-tests/src/test/java/org/apache/cassandra/analytics/SharedClusterSparkIntegrationTestBase.java:
##
@@ -84,7 +84,7 @@ protected DataFrameReader bulkReaderDataFrame(QualifiedName 
tableName)
   // Shutdown hooks are called after the job ends, and in the 
case of integration tests
   // the sidecar is already shut down before this. Since the 
cluster will be torn
   // down anyway, the integration job skips clearing snapshots.
-  .option("clearSnapshot", "false")
+  .option("clearSnapshotStrategy", "OnCompletion")

Review Comment:
   since `onCompletion` behaviour was similar to test setup (clear after test 
method completes), kept that. But for consistency, will change it to `noop`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-22 Thread via GitHub


yifan-c commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462464312


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -138,7 +138,7 @@ private ClearSnapshotStrategy 
parseClearSnapshotStrategy(boolean hasDeprecatedOp
  boolean 
clearSnapshot,
  String 
clearSnapshotStrategyOption)
 {
-if (hasDeprecatedOption)
+if (hasDeprecatedOption && clearSnapshotStrategyOption == null)

Review Comment:
   If the deprecated option is set, the warning should always be logged, 
regardless whether the new option is present or not. 
   
   ```java
   if (hasDeprecatedOption)
   {
 log warning
 if (clearSnapshotStrategyOption == null)
 {
   return the default based on the value
 }
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-22 Thread via GitHub


yifan-c commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462458671


##
cassandra-analytics-integration-tests/src/test/java/org/apache/cassandra/analytics/SharedClusterSparkIntegrationTestBase.java:
##
@@ -84,7 +84,7 @@ protected DataFrameReader bulkReaderDataFrame(QualifiedName 
tableName)
   // Shutdown hooks are called after the job ends, and in the 
case of integration tests
   // the sidecar is already shut down before this. Since the 
cluster will be torn
   // down anyway, the integration job skips clearing snapshots.
-  .option("clearSnapshot", "false")
+  .option("clearSnapshotStrategy", "OnCompletion")

Review Comment:
   before the patch, the test does not clear snapshot. Can you comment why it 
is changed?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-18 Thread via GitHub


sarankk commented on PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#issuecomment-1899368851

   As per @yifan-c suggestions, to avoid confusion caused by allowing both 
`clearSnapshot` option and `snapshotTTL` option, we are going ahead with 
following setting to determine whether snapshots should be cleared. Introducing 
a new option `clearSnapshotStrategy`. Possible values it can take are,
   
   `clearSnapshotStrategy: onCompletion // Only clear snapshot on the 
completion of the job. The onCompletion hook is evaluated at the best effort. 
It is possible the clear snapshot operation is not issued in some 
circumstances. 
   clearSnapshotStrategy: noOp // Do not clear snapshot at all.
   clearSnapshotStrategy: onCompletionOrTTL [2d] // Clear the snapshot on 
completion or after the specified duration (2 days as the default), whichever 
comes first. Note that there is a risk that the snapshot is deleted before job 
completion if it takes more than the specified duration. In most cases, the job 
should complete before the default 2 days
   clearSnapshotStrategy: TTL [2d] // Only clear snapshot after 2 days
   clearSnapshot: true === clearSnapshotStrategy: onCompletionOrTTL
   clearSnapshot: false === clearSnapshotStrategy: noOp`
   
   clearSnapshot option will be deprecated. For backward compatibility, we will 
still understand the option and resolve appropriate `clearSnapshotStrategy` 
option. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-17 Thread via GitHub


yifan-c commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1456718177


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java:
##
@@ -341,24 +341,31 @@ private CompletionStage> 
createSnapshot(RingRespon
 LOGGER.info("Creating snapshot on instance snapshotName={} 
keyspace={} table={} datacenter={} fqdn={}",
 snapshotName, maybeQuotedKeyspace, 
maybeQuotedTable, datacenter, ringEntry.fqdn());
 SidecarInstance sidecarInstance = new 
SidecarInstanceImpl(ringEntry.fqdn(), sidecarClientConfig.effectivePort());
-createSnapshotFuture = sidecar
-   .createSnapshot(sidecarInstance, 
maybeQuotedKeyspace, maybeQuotedTable, snapshotName)
-   .handle((resp, throwable) -> {
-   if (throwable == null)
-   {
-   // Create snapshot succeeded
-   return hint;
-   }
-
-   if (isExhausted(throwable))
-   {
-   LOGGER.warn("Failed to 
create snapshot on instance", throwable);
-   return 
PartitionedDataLayer.AvailabilityHint.DOWN;
-   }
-
-   LOGGER.error("Unexpected error 
creating snapshot on instance", throwable);
-   return 
PartitionedDataLayer.AvailabilityHint.UNKNOWN;
-   });
+String resolvedSnapshotTtl = options.clearSnapshot() ? 
options.effectiveSnapshotTtl() : null;

Review Comment:
   my understanding is when clearSnapshot is true, the snapshot is cleared on 
job completion; otherwise it sets a TTL and let Cassandra server to delete. 
   I'd wait for Saranya to clarify. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-17 Thread via GitHub


frankgh commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1456716322


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -146,6 +158,16 @@ public boolean clearSnapshot()
 return clearSnapshot;
 }
 
+public String userProvidedSnapshotTtl()
+{
+return userProvidedSnapshotTtl;
+}
+
+public String effectiveSnapshotTtl()
+{
+return effectiveSnapshotTtl;
+}

Review Comment:
   The user provided option is used for logging:
   ```
if (!options.clearSnapshot() && options.userProvidedSnapshotTtl() != null)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-17 Thread via GitHub


frankgh commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1456715644


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java:
##
@@ -341,24 +341,31 @@ private CompletionStage> 
createSnapshot(RingRespon
 LOGGER.info("Creating snapshot on instance snapshotName={} 
keyspace={} table={} datacenter={} fqdn={}",
 snapshotName, maybeQuotedKeyspace, 
maybeQuotedTable, datacenter, ringEntry.fqdn());
 SidecarInstance sidecarInstance = new 
SidecarInstanceImpl(ringEntry.fqdn(), sidecarClientConfig.effectivePort());
-createSnapshotFuture = sidecar
-   .createSnapshot(sidecarInstance, 
maybeQuotedKeyspace, maybeQuotedTable, snapshotName)
-   .handle((resp, throwable) -> {
-   if (throwable == null)
-   {
-   // Create snapshot succeeded
-   return hint;
-   }
-
-   if (isExhausted(throwable))
-   {
-   LOGGER.warn("Failed to 
create snapshot on instance", throwable);
-   return 
PartitionedDataLayer.AvailabilityHint.DOWN;
-   }
-
-   LOGGER.error("Unexpected error 
creating snapshot on instance", throwable);
-   return 
PartitionedDataLayer.AvailabilityHint.UNKNOWN;
-   });
+String resolvedSnapshotTtl = options.clearSnapshot() ? 
options.effectiveSnapshotTtl() : null;

Review Comment:
   I think the condition here is correct. When clear snapshot is set, it allows 
the snapshot TTL to be set. When clear snapshot is set to false, we don't want 
the TTL to clear the snapshot at a later point. I think that's how I understand 
it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-17 Thread via GitHub


yifan-c commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1456714473


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java:
##
@@ -341,24 +341,31 @@ private CompletionStage> 
createSnapshot(RingRespon
 LOGGER.info("Creating snapshot on instance snapshotName={} 
keyspace={} table={} datacenter={} fqdn={}",
 snapshotName, maybeQuotedKeyspace, 
maybeQuotedTable, datacenter, ringEntry.fqdn());
 SidecarInstance sidecarInstance = new 
SidecarInstanceImpl(ringEntry.fqdn(), sidecarClientConfig.effectivePort());
-createSnapshotFuture = sidecar
-   .createSnapshot(sidecarInstance, 
maybeQuotedKeyspace, maybeQuotedTable, snapshotName)
-   .handle((resp, throwable) -> {
-   if (throwable == null)
-   {
-   // Create snapshot succeeded
-   return hint;
-   }
-
-   if (isExhausted(throwable))
-   {
-   LOGGER.warn("Failed to 
create snapshot on instance", throwable);
-   return 
PartitionedDataLayer.AvailabilityHint.DOWN;
-   }
-
-   LOGGER.error("Unexpected error 
creating snapshot on instance", throwable);
-   return 
PartitionedDataLayer.AvailabilityHint.UNKNOWN;
-   });
+String resolvedSnapshotTtl = options.clearSnapshot() ? 
options.effectiveSnapshotTtl() : null;
+if (!options.clearSnapshot() && 
options.userProvidedSnapshotTtl() != null)
+{
+LOGGER.warn("Snapshot TTL option was provided along 
with Clear Snapshot option, bulk reader" +
+"can honor only one of the 2. Clear 
Snapshot takes precedence over Snapshot TTL, " +
+"hence snapshot {} will not be removed 
after the specified snapshot TTL option", snapshotName);
+}

Review Comment:
   The condition is reversed too. Beside that, I think the warning is best 
emitted from `ClientConfig`, and only use the resolved value at the call-sites. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]

2024-01-17 Thread via GitHub


yifan-c commented on code in PR #31:
URL: 
https://github.com/apache/cassandra-analytics/pull/31#discussion_r1456713939


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java:
##
@@ -341,24 +341,31 @@ private CompletionStage> 
createSnapshot(RingRespon
 LOGGER.info("Creating snapshot on instance snapshotName={} 
keyspace={} table={} datacenter={} fqdn={}",
 snapshotName, maybeQuotedKeyspace, 
maybeQuotedTable, datacenter, ringEntry.fqdn());
 SidecarInstance sidecarInstance = new 
SidecarInstanceImpl(ringEntry.fqdn(), sidecarClientConfig.effectivePort());
-createSnapshotFuture = sidecar
-   .createSnapshot(sidecarInstance, 
maybeQuotedKeyspace, maybeQuotedTable, snapshotName)
-   .handle((resp, throwable) -> {
-   if (throwable == null)
-   {
-   // Create snapshot succeeded
-   return hint;
-   }
-
-   if (isExhausted(throwable))
-   {
-   LOGGER.warn("Failed to 
create snapshot on instance", throwable);
-   return 
PartitionedDataLayer.AvailabilityHint.DOWN;
-   }
-
-   LOGGER.error("Unexpected error 
creating snapshot on instance", throwable);
-   return 
PartitionedDataLayer.AvailabilityHint.UNKNOWN;
-   });
+String resolvedSnapshotTtl = options.clearSnapshot() ? 
options.effectiveSnapshotTtl() : null;

Review Comment:
   The condition is reversed. The resolved snapshot ttl is set, when 
`!options.clearSnapshot()`.



##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java:
##
@@ -146,6 +158,16 @@ public boolean clearSnapshot()
 return clearSnapshot;
 }
 
+public String userProvidedSnapshotTtl()
+{
+return userProvidedSnapshotTtl;
+}
+
+public String effectiveSnapshotTtl()
+{
+return effectiveSnapshotTtl;
+}

Review Comment:
   It looks like the `effectiveSnapshotTtl` is the resolved value on reading 
from the map. It is confusing to expose both the raw and the resolved values. 
   Having `effectiveSnapshotTtl()` is suffice. 



##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java:
##
@@ -341,24 +341,31 @@ private CompletionStage> 
createSnapshot(RingRespon
 LOGGER.info("Creating snapshot on instance snapshotName={} 
keyspace={} table={} datacenter={} fqdn={}",
 snapshotName, maybeQuotedKeyspace, 
maybeQuotedTable, datacenter, ringEntry.fqdn());
 SidecarInstance sidecarInstance = new 
SidecarInstanceImpl(ringEntry.fqdn(), sidecarClientConfig.effectivePort());
-createSnapshotFuture = sidecar
-   .createSnapshot(sidecarInstance, 
maybeQuotedKeyspace, maybeQuotedTable, snapshotName)
-   .handle((resp, throwable) -> {
-   if (throwable == null)
-   {
-   // Create snapshot succeeded
-   return hint;
-   }
-
-   if (isExhausted(throwable))
-   {
-   LOGGER.warn("Failed to 
create snapshot on instance", throwable);
-   return 
PartitionedDataLayer.AvailabilityHint.DOWN;
-   }
-
-   LOGGER.error("Unexpected error 
creating snapshot on instance", throwable);
-   return 
PartitionedDataLayer.AvailabilityHint.UNKNOWN;
-   });
+String resolvedSnapshotTtl = options.clearSnapshot() ? 
options.effectiveSnapshotTtl() : null;
+if (!options.clearSnapshot() && 
options.userProvidedSnapshotTtl() != null)
+{
+LOGGER.warn("Snapshot TTL option was provided along 
with Clear Snapshot option, bulk reader" +
+