Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
frankgh commented on PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#issuecomment-1912886999 Closed via https://github.com/apache/cassandra-analytics/commit/8c20b452dd0728a6fad6d276a7be9fa1b9274495 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
frankgh closed pull request #31: CASSANDRA-19273 Allow setting TTL for snapshots created through reader URL: https://github.com/apache/cassandra-analytics/pull/31 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
yifan-c commented on code in PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#discussion_r1468236613 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java: ## @@ -109,6 +134,50 @@ private ClientConfig(Map options) this.quoteIdentifiers = MapUtils.getBoolean(options, QUOTE_IDENTIFIERS, false); } +private ClearSnapshotStrategy parseClearSnapshotStrategy(boolean hasDeprecatedOption, + boolean clearSnapshot, + String clearSnapshotStrategyOption) +{ +if (hasDeprecatedOption) +{ +LOGGER.warn("The deprecated option 'clearSnapshot' is set. Please set 'clearSnapshotStrategy' instead."); +if (clearSnapshotStrategyOption == null) +{ +return clearSnapshot ? ClearSnapshotStrategy.defaultStrategy() : new ClearSnapshotStrategy.NoOp(); +} +} +if (clearSnapshotStrategyOption == null) +{ +LOGGER.debug("No clearSnapshotStrategy is set. Using the default strategy"); +return ClearSnapshotStrategy.defaultStrategy(); +} +String[] strategyParts = clearSnapshotStrategyOption.split(" "); +String strategyName; +String snapshotTTL = null; +if (strategyParts.length == 1) +{ +strategyName = strategyParts[0].trim(); +} +else if (strategyParts.length == 2) +{ +strategyName = strategyParts[0].trim(); +snapshotTTL = strategyParts[1].trim(); +if (!Pattern.matches(SNAPSHOT_TTL_PATTERN, snapshotTTL)) +{ +String msg = "Incorrect value set for clearSnapshotStrategy, expected format is " + + "{strategy [snapshotTTLvalue]}. TTL value specified must contain unit along. " + + "For e.g. 2d represents a TTL for 2 days"; +throw new IllegalArgumentException(msg); +} +} +else +{ +LOGGER.error("Invalid value for ClearSnapshotStrategy: '{}'", clearSnapshotStrategyOption); +throw new IllegalArgumentException("Invalid value: " + clearSnapshotStrategyOption); Review Comment: Fail fast sounds good. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
sarankk commented on code in PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#discussion_r1467053662 ## cassandra-bridge/src/main/java/org/apache/cassandra/spark/utils/MapUtils.java: ## @@ -225,4 +225,15 @@ public static String getOrDefault(Map options, String key, Strin { return options.getOrDefault(lowerCaseKey(key), defaultValue); } + +/** + * Method to check if key is present in {@code options} map. + * @param options the map + * @param key the key to the map + * @return boolean + */ +public static boolean contains(Map options, String key) Review Comment: makes sense, will make it consistent with `Map` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
frankgh commented on code in PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#discussion_r1467052902 ## cassandra-bridge/src/main/java/org/apache/cassandra/spark/utils/MapUtils.java: ## @@ -225,4 +225,15 @@ public static String getOrDefault(Map options, String key, Strin { return options.getOrDefault(lowerCaseKey(key), defaultValue); } + +/** + * Method to check if key is present in {@code options} map. + * @param options the map + * @param key the key to the map + * @return boolean + */ +public static boolean contains(Map options, String key) Review Comment: should this method be called `containsKey` instead? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
sarankk commented on code in PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462775859 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java: ## @@ -109,6 +134,50 @@ private ClientConfig(Map options) this.quoteIdentifiers = MapUtils.getBoolean(options, QUOTE_IDENTIFIERS, false); } +private ClearSnapshotStrategy parseClearSnapshotStrategy(boolean hasDeprecatedOption, + boolean clearSnapshot, + String clearSnapshotStrategyOption) +{ +if (hasDeprecatedOption) +{ +LOGGER.warn("The deprecated option 'clearSnapshot' is set. Please set 'clearSnapshotStrategy' instead."); +if (clearSnapshotStrategyOption == null) +{ +return clearSnapshot ? ClearSnapshotStrategy.defaultStrategy() : new ClearSnapshotStrategy.NoOp(); +} +} +if (clearSnapshotStrategyOption == null) +{ +LOGGER.debug("No clearSnapshotStrategy is set. Using the default strategy"); +return ClearSnapshotStrategy.defaultStrategy(); +} +String[] strategyParts = clearSnapshotStrategyOption.split(" "); +String strategyName; +String snapshotTTL = null; +if (strategyParts.length == 1) +{ +strategyName = strategyParts[0].trim(); +} +else if (strategyParts.length == 2) +{ +strategyName = strategyParts[0].trim(); +snapshotTTL = strategyParts[1].trim(); +if (!Pattern.matches(SNAPSHOT_TTL_PATTERN, snapshotTTL)) +{ +String msg = "Incorrect value set for clearSnapshotStrategy, expected format is " + + "{strategy [snapshotTTLvalue]}. TTL value specified must contain unit along. " + + "For e.g. 2d represents a TTL for 2 days"; +throw new IllegalArgumentException(msg); +} +} +else +{ +LOGGER.error("Invalid value for ClearSnapshotStrategy: '{}'", clearSnapshotStrategyOption); +throw new IllegalArgumentException("Invalid value: " + clearSnapshotStrategyOption); Review Comment: Should we fail fast there too?. If users are setting `clearSnapshotStrategy` then they might want a particular behaviour, if that behaviour can't be met, better to alert earlier. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
sarankk commented on code in PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462774050 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java: ## @@ -24,25 +24,45 @@ import java.util.Map; import java.util.Optional; import java.util.UUID; +import java.util.regex.Pattern; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; import org.apache.cassandra.bridge.BigNumberConfigImpl; import org.apache.cassandra.spark.config.SchemaFeature; import org.apache.cassandra.spark.config.SchemaFeatureSet; import org.apache.cassandra.spark.data.partitioner.ConsistencyLevel; import org.apache.cassandra.spark.utils.MapUtils; +import org.jetbrains.annotations.NotNull; import org.jetbrains.annotations.Nullable; import static org.apache.cassandra.spark.data.CassandraDataLayer.aliasLastModifiedTimestamp; public final class ClientConfig { +private static final Logger LOGGER = LoggerFactory.getLogger(ClientConfig.class); + public static final String SIDECAR_INSTANCES = "sidecar_instances"; public static final String KEYSPACE_KEY = "keyspace"; public static final String TABLE_KEY = "table"; public static final String SNAPSHOT_NAME_KEY = "snapshotName"; public static final String DC_KEY = "dc"; public static final String CREATE_SNAPSHOT_KEY = "createSnapshot"; public static final String CLEAR_SNAPSHOT_KEY = "clearSnapshot"; +/** + * Format of clearSnapshotStrategy is {strategy [snapshotTTLvalue]}, clearSnapshotStrategy holds both the strategy + * and in case of TTL based strategy, TTL value. For e.g. onCompletionOrTTL 2d, TTL 2d, noOp, onCompletion. For + * clear snapshot strategies allowed check {@link ClearSnapshotStrategy} + */ +public static final String CLEAR_SNAPSHOT_STRATEGY_KEY = "clearSnapshotStrategy"; +/** + * TTL value is time to live option for the snapshot (available since Cassandra 4.1+). TTL value specified must + * contain unit along. For e.g. 2d represents a TTL for 2 days; 1h represents a TTL of 1 hour, etc. + * Valid units are {@code d}, {@code h}, {@code s}, {@code ms}, {@code us}, {@code µs}, {@code ns}, and {@code m}. + */ +public static final String DEFAULT_SNAPSHOT_TTL_VALUE = "2d"; +public static final String SNAPSHOT_TTL_PATTERN = "\\d+(d|h|m|s|ms)|(\\d+|\\d+\\.\\d+|\\.\\d+)[eE][+-](\\d+|\\d+\\.\\d+|\\.\\d+)(us|µs|ns)"; Review Comment: Thanks updated pattern accordingly -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
frankgh commented on code in PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462559574 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java: ## @@ -109,6 +134,50 @@ private ClientConfig(Map options) this.quoteIdentifiers = MapUtils.getBoolean(options, QUOTE_IDENTIFIERS, false); } +private ClearSnapshotStrategy parseClearSnapshotStrategy(boolean hasDeprecatedOption, Review Comment: maybe make it protected for extensibility? ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java: ## @@ -109,6 +134,50 @@ private ClientConfig(Map options) this.quoteIdentifiers = MapUtils.getBoolean(options, QUOTE_IDENTIFIERS, false); } +private ClearSnapshotStrategy parseClearSnapshotStrategy(boolean hasDeprecatedOption, + boolean clearSnapshot, + String clearSnapshotStrategyOption) +{ +if (hasDeprecatedOption) +{ +LOGGER.warn("The deprecated option 'clearSnapshot' is set. Please set 'clearSnapshotStrategy' instead."); +if (clearSnapshotStrategyOption == null) +{ +return clearSnapshot ? ClearSnapshotStrategy.defaultStrategy() : new ClearSnapshotStrategy.NoOp(); +} +} +if (clearSnapshotStrategyOption == null) +{ +LOGGER.debug("No clearSnapshotStrategy is set. Using the default strategy"); +return ClearSnapshotStrategy.defaultStrategy(); +} +String[] strategyParts = clearSnapshotStrategyOption.split(" "); +String strategyName; +String snapshotTTL = null; +if (strategyParts.length == 1) +{ +strategyName = strategyParts[0].trim(); +} +else if (strategyParts.length == 2) +{ +strategyName = strategyParts[0].trim(); +snapshotTTL = strategyParts[1].trim(); +if (!Pattern.matches(SNAPSHOT_TTL_PATTERN, snapshotTTL)) +{ +String msg = "Incorrect value set for clearSnapshotStrategy, expected format is " + + "{strategy [snapshotTTLvalue]}. TTL value specified must contain unit along. " + + "For e.g. 2d represents a TTL for 2 days"; +throw new IllegalArgumentException(msg); +} +} +else +{ +LOGGER.error("Invalid value for ClearSnapshotStrategy: '{}'", clearSnapshotStrategyOption); +throw new IllegalArgumentException("Invalid value: " + clearSnapshotStrategyOption); Review Comment: In ClearSnapshotStrategy.create the fallback behavior is to return the default strategy. Should we consider that fallback strategy here as well? ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java: ## @@ -24,25 +24,45 @@ import java.util.Map; import java.util.Optional; import java.util.UUID; +import java.util.regex.Pattern; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; import org.apache.cassandra.bridge.BigNumberConfigImpl; import org.apache.cassandra.spark.config.SchemaFeature; import org.apache.cassandra.spark.config.SchemaFeatureSet; import org.apache.cassandra.spark.data.partitioner.ConsistencyLevel; import org.apache.cassandra.spark.utils.MapUtils; +import org.jetbrains.annotations.NotNull; import org.jetbrains.annotations.Nullable; import static org.apache.cassandra.spark.data.CassandraDataLayer.aliasLastModifiedTimestamp; public final class ClientConfig { +private static final Logger LOGGER = LoggerFactory.getLogger(ClientConfig.class); + public static final String SIDECAR_INSTANCES = "sidecar_instances"; public static final String KEYSPACE_KEY = "keyspace"; public static final String TABLE_KEY = "table"; public static final String SNAPSHOT_NAME_KEY = "snapshotName"; public static final String DC_KEY = "dc"; public static final String CREATE_SNAPSHOT_KEY = "createSnapshot"; public static final String CLEAR_SNAPSHOT_KEY = "clearSnapshot"; +/** + * Format of clearSnapshotStrategy is {strategy [snapshotTTLvalue]}, clearSnapshotStrategy holds both the strategy + * and in case of TTL based strategy, TTL value. For e.g. onCompletionOrTTL 2d, TTL 2d, noOp, onCompletion. For + * clear snapshot strategies allowed check {@link ClearSnapshotStrategy} + */ +public static final String CLEAR_SNAPSHOT_STRATEGY_KEY = "clearSnapshotStrategy"; +/** + * TTL value is time to live option for the snapshot (available since Cassandra 4.1+). TTL value specified must + * contain unit along. For e.g. 2d represents a TTL for 2 days; 1h represents a TTL of 1 hour, etc. + * Valid units
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
sarankk commented on code in PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462469394 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java: ## @@ -138,7 +138,7 @@ private ClearSnapshotStrategy parseClearSnapshotStrategy(boolean hasDeprecatedOp boolean clearSnapshot, String clearSnapshotStrategyOption) { -if (hasDeprecatedOption) +if (hasDeprecatedOption && clearSnapshotStrategyOption == null) Review Comment: Makes sense, will update it -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
sarankk commented on code in PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462466619 ## cassandra-analytics-integration-tests/src/test/java/org/apache/cassandra/analytics/SharedClusterSparkIntegrationTestBase.java: ## @@ -84,7 +84,7 @@ protected DataFrameReader bulkReaderDataFrame(QualifiedName tableName) // Shutdown hooks are called after the job ends, and in the case of integration tests // the sidecar is already shut down before this. Since the cluster will be torn // down anyway, the integration job skips clearing snapshots. - .option("clearSnapshot", "false") + .option("clearSnapshotStrategy", "OnCompletion") Review Comment: since `onCompletion` behaviour was similar to test setup (clear after test method completes), kept that. But for consistency, will change it to `noop` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
yifan-c commented on code in PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462464312 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java: ## @@ -138,7 +138,7 @@ private ClearSnapshotStrategy parseClearSnapshotStrategy(boolean hasDeprecatedOp boolean clearSnapshot, String clearSnapshotStrategyOption) { -if (hasDeprecatedOption) +if (hasDeprecatedOption && clearSnapshotStrategyOption == null) Review Comment: If the deprecated option is set, the warning should always be logged, regardless whether the new option is present or not. ```java if (hasDeprecatedOption) { log warning if (clearSnapshotStrategyOption == null) { return the default based on the value } } ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
yifan-c commented on code in PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#discussion_r1462458671 ## cassandra-analytics-integration-tests/src/test/java/org/apache/cassandra/analytics/SharedClusterSparkIntegrationTestBase.java: ## @@ -84,7 +84,7 @@ protected DataFrameReader bulkReaderDataFrame(QualifiedName tableName) // Shutdown hooks are called after the job ends, and in the case of integration tests // the sidecar is already shut down before this. Since the cluster will be torn // down anyway, the integration job skips clearing snapshots. - .option("clearSnapshot", "false") + .option("clearSnapshotStrategy", "OnCompletion") Review Comment: before the patch, the test does not clear snapshot. Can you comment why it is changed? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
sarankk commented on PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#issuecomment-1899368851 As per @yifan-c suggestions, to avoid confusion caused by allowing both `clearSnapshot` option and `snapshotTTL` option, we are going ahead with following setting to determine whether snapshots should be cleared. Introducing a new option `clearSnapshotStrategy`. Possible values it can take are, `clearSnapshotStrategy: onCompletion // Only clear snapshot on the completion of the job. The onCompletion hook is evaluated at the best effort. It is possible the clear snapshot operation is not issued in some circumstances. clearSnapshotStrategy: noOp // Do not clear snapshot at all. clearSnapshotStrategy: onCompletionOrTTL [2d] // Clear the snapshot on completion or after the specified duration (2 days as the default), whichever comes first. Note that there is a risk that the snapshot is deleted before job completion if it takes more than the specified duration. In most cases, the job should complete before the default 2 days clearSnapshotStrategy: TTL [2d] // Only clear snapshot after 2 days clearSnapshot: true === clearSnapshotStrategy: onCompletionOrTTL clearSnapshot: false === clearSnapshotStrategy: noOp` clearSnapshot option will be deprecated. For backward compatibility, we will still understand the option and resolve appropriate `clearSnapshotStrategy` option. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
yifan-c commented on code in PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#discussion_r1456718177 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java: ## @@ -341,24 +341,31 @@ private CompletionStage> createSnapshot(RingRespon LOGGER.info("Creating snapshot on instance snapshotName={} keyspace={} table={} datacenter={} fqdn={}", snapshotName, maybeQuotedKeyspace, maybeQuotedTable, datacenter, ringEntry.fqdn()); SidecarInstance sidecarInstance = new SidecarInstanceImpl(ringEntry.fqdn(), sidecarClientConfig.effectivePort()); -createSnapshotFuture = sidecar - .createSnapshot(sidecarInstance, maybeQuotedKeyspace, maybeQuotedTable, snapshotName) - .handle((resp, throwable) -> { - if (throwable == null) - { - // Create snapshot succeeded - return hint; - } - - if (isExhausted(throwable)) - { - LOGGER.warn("Failed to create snapshot on instance", throwable); - return PartitionedDataLayer.AvailabilityHint.DOWN; - } - - LOGGER.error("Unexpected error creating snapshot on instance", throwable); - return PartitionedDataLayer.AvailabilityHint.UNKNOWN; - }); +String resolvedSnapshotTtl = options.clearSnapshot() ? options.effectiveSnapshotTtl() : null; Review Comment: my understanding is when clearSnapshot is true, the snapshot is cleared on job completion; otherwise it sets a TTL and let Cassandra server to delete. I'd wait for Saranya to clarify. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
frankgh commented on code in PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#discussion_r1456716322 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java: ## @@ -146,6 +158,16 @@ public boolean clearSnapshot() return clearSnapshot; } +public String userProvidedSnapshotTtl() +{ +return userProvidedSnapshotTtl; +} + +public String effectiveSnapshotTtl() +{ +return effectiveSnapshotTtl; +} Review Comment: The user provided option is used for logging: ``` if (!options.clearSnapshot() && options.userProvidedSnapshotTtl() != null) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
frankgh commented on code in PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#discussion_r1456715644 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java: ## @@ -341,24 +341,31 @@ private CompletionStage> createSnapshot(RingRespon LOGGER.info("Creating snapshot on instance snapshotName={} keyspace={} table={} datacenter={} fqdn={}", snapshotName, maybeQuotedKeyspace, maybeQuotedTable, datacenter, ringEntry.fqdn()); SidecarInstance sidecarInstance = new SidecarInstanceImpl(ringEntry.fqdn(), sidecarClientConfig.effectivePort()); -createSnapshotFuture = sidecar - .createSnapshot(sidecarInstance, maybeQuotedKeyspace, maybeQuotedTable, snapshotName) - .handle((resp, throwable) -> { - if (throwable == null) - { - // Create snapshot succeeded - return hint; - } - - if (isExhausted(throwable)) - { - LOGGER.warn("Failed to create snapshot on instance", throwable); - return PartitionedDataLayer.AvailabilityHint.DOWN; - } - - LOGGER.error("Unexpected error creating snapshot on instance", throwable); - return PartitionedDataLayer.AvailabilityHint.UNKNOWN; - }); +String resolvedSnapshotTtl = options.clearSnapshot() ? options.effectiveSnapshotTtl() : null; Review Comment: I think the condition here is correct. When clear snapshot is set, it allows the snapshot TTL to be set. When clear snapshot is set to false, we don't want the TTL to clear the snapshot at a later point. I think that's how I understand it -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
yifan-c commented on code in PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#discussion_r1456714473 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java: ## @@ -341,24 +341,31 @@ private CompletionStage> createSnapshot(RingRespon LOGGER.info("Creating snapshot on instance snapshotName={} keyspace={} table={} datacenter={} fqdn={}", snapshotName, maybeQuotedKeyspace, maybeQuotedTable, datacenter, ringEntry.fqdn()); SidecarInstance sidecarInstance = new SidecarInstanceImpl(ringEntry.fqdn(), sidecarClientConfig.effectivePort()); -createSnapshotFuture = sidecar - .createSnapshot(sidecarInstance, maybeQuotedKeyspace, maybeQuotedTable, snapshotName) - .handle((resp, throwable) -> { - if (throwable == null) - { - // Create snapshot succeeded - return hint; - } - - if (isExhausted(throwable)) - { - LOGGER.warn("Failed to create snapshot on instance", throwable); - return PartitionedDataLayer.AvailabilityHint.DOWN; - } - - LOGGER.error("Unexpected error creating snapshot on instance", throwable); - return PartitionedDataLayer.AvailabilityHint.UNKNOWN; - }); +String resolvedSnapshotTtl = options.clearSnapshot() ? options.effectiveSnapshotTtl() : null; +if (!options.clearSnapshot() && options.userProvidedSnapshotTtl() != null) +{ +LOGGER.warn("Snapshot TTL option was provided along with Clear Snapshot option, bulk reader" + +"can honor only one of the 2. Clear Snapshot takes precedence over Snapshot TTL, " + +"hence snapshot {} will not be removed after the specified snapshot TTL option", snapshotName); +} Review Comment: The condition is reversed too. Beside that, I think the warning is best emitted from `ClientConfig`, and only use the resolved value at the call-sites. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
Re: [PR] CASSANDRA-19273 Allow setting TTL for snapshots created through reader [cassandra-analytics]
yifan-c commented on code in PR #31: URL: https://github.com/apache/cassandra-analytics/pull/31#discussion_r1456713939 ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java: ## @@ -341,24 +341,31 @@ private CompletionStage> createSnapshot(RingRespon LOGGER.info("Creating snapshot on instance snapshotName={} keyspace={} table={} datacenter={} fqdn={}", snapshotName, maybeQuotedKeyspace, maybeQuotedTable, datacenter, ringEntry.fqdn()); SidecarInstance sidecarInstance = new SidecarInstanceImpl(ringEntry.fqdn(), sidecarClientConfig.effectivePort()); -createSnapshotFuture = sidecar - .createSnapshot(sidecarInstance, maybeQuotedKeyspace, maybeQuotedTable, snapshotName) - .handle((resp, throwable) -> { - if (throwable == null) - { - // Create snapshot succeeded - return hint; - } - - if (isExhausted(throwable)) - { - LOGGER.warn("Failed to create snapshot on instance", throwable); - return PartitionedDataLayer.AvailabilityHint.DOWN; - } - - LOGGER.error("Unexpected error creating snapshot on instance", throwable); - return PartitionedDataLayer.AvailabilityHint.UNKNOWN; - }); +String resolvedSnapshotTtl = options.clearSnapshot() ? options.effectiveSnapshotTtl() : null; Review Comment: The condition is reversed. The resolved snapshot ttl is set, when `!options.clearSnapshot()`. ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/ClientConfig.java: ## @@ -146,6 +158,16 @@ public boolean clearSnapshot() return clearSnapshot; } +public String userProvidedSnapshotTtl() +{ +return userProvidedSnapshotTtl; +} + +public String effectiveSnapshotTtl() +{ +return effectiveSnapshotTtl; +} Review Comment: It looks like the `effectiveSnapshotTtl` is the resolved value on reading from the map. It is confusing to expose both the raw and the resolved values. Having `effectiveSnapshotTtl()` is suffice. ## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java: ## @@ -341,24 +341,31 @@ private CompletionStage> createSnapshot(RingRespon LOGGER.info("Creating snapshot on instance snapshotName={} keyspace={} table={} datacenter={} fqdn={}", snapshotName, maybeQuotedKeyspace, maybeQuotedTable, datacenter, ringEntry.fqdn()); SidecarInstance sidecarInstance = new SidecarInstanceImpl(ringEntry.fqdn(), sidecarClientConfig.effectivePort()); -createSnapshotFuture = sidecar - .createSnapshot(sidecarInstance, maybeQuotedKeyspace, maybeQuotedTable, snapshotName) - .handle((resp, throwable) -> { - if (throwable == null) - { - // Create snapshot succeeded - return hint; - } - - if (isExhausted(throwable)) - { - LOGGER.warn("Failed to create snapshot on instance", throwable); - return PartitionedDataLayer.AvailabilityHint.DOWN; - } - - LOGGER.error("Unexpected error creating snapshot on instance", throwable); - return PartitionedDataLayer.AvailabilityHint.UNKNOWN; - }); +String resolvedSnapshotTtl = options.clearSnapshot() ? options.effectiveSnapshotTtl() : null; +if (!options.clearSnapshot() && options.userProvidedSnapshotTtl() != null) +{ +LOGGER.warn("Snapshot TTL option was provided along with Clear Snapshot option, bulk reader" + +