[
https://issues.apache.org/jira/browse/CASSANALYTICS-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Josh McKenzie updated CASSANALYTICS-32:
---------------------------------------
Fix Version/s: 1.0
> Support writing to tables with constraints
> ------------------------------------------
>
> Key: CASSANALYTICS-32
> URL: https://issues.apache.org/jira/browse/CASSANALYTICS-32
> Project: Apache Cassandra Analytics
> Issue Type: New Feature
> Components: Writer
> Reporter: Doug Rohrer
> Priority: Normal
> Fix For: 1.0
>
>
> Today, the underlying CQLSSTableWriter will throw an exception if we write
> data that violates a constraint. Left unhandled, this would fail the Spark
> job, because the task holding the invalid data would never be able to
> complete.
> Decide how we're going to handle these cases and implement the appropriate
> logic in the bulk writer. This could be a writer option that lets the
> end-user choose among the following:
> # Fail the job (no code changes in the Bulk Writer)
> # Skip rows that violate constraints and log them - from experience, most
> folks don't check logs for successful jobs, so this may eventually lead to
> users thinking they "lost" data when it was really just not writable.
> Implementation would mostly be a try/catch around the call to addRow plus a
> log line recording the invalid data. We should absolutely not do this "by
> default" though - it should be an opt-in feature that defaults to failing
> the job.
> # Add a feature to CQLSSTableWriter to disable constraints, allowing data
> that would otherwise not meet the table's constraints to be written. This
> would let otherwise-invalid data through without failing the Spark job, and
> such writes should probably be logged as well.
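The skip-and-log option (option 2 above) could be sketched roughly as follows. This is a minimal, self-contained illustration, not the actual bulk writer code: RowWriter and ConstraintViolationException are hypothetical stand-ins for CQLSSTableWriter and the exception it throws on a constraint violation, since the real classes require a full Cassandra dependency to run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Stand-in for the exception the underlying writer throws on a
// constraint violation (hypothetical; the real type comes from Cassandra).
class ConstraintViolationException extends RuntimeException {
    ConstraintViolationException(String msg) { super(msg); }
}

// Hypothetical minimal interface standing in for CQLSSTableWriter.
interface RowWriter {
    void addRow(Object... values) throws ConstraintViolationException;
}

public class SkipInvalidRows {
    private static final Logger LOG = Logger.getLogger(SkipInvalidRows.class.getName());

    // Option 2 from the ticket: wrap addRow in try/catch, log and skip rows
    // that violate a constraint instead of failing the task. Returns the
    // number of skipped rows so callers can surface it (e.g. in job metrics).
    static int writeSkippingInvalid(RowWriter writer, List<Object[]> rows) {
        int skipped = 0;
        for (Object[] row : rows) {
            try {
                writer.addRow(row);
            } catch (ConstraintViolationException e) {
                skipped++;
                LOG.warning("Skipping row that violates a constraint: " + e.getMessage());
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        // Toy writer that "enforces" a constraint: first column must be >= 0.
        RowWriter writer = values -> {
            if (((Integer) values[0]) < 0) {
                throw new ConstraintViolationException("value must be >= 0: " + values[0]);
            }
        };
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[]{1, "ok"});
        rows.add(new Object[]{-5, "bad"});
        rows.add(new Object[]{2, "ok"});
        System.out.println("skipped=" + writeSkippingInvalid(writer, rows)); // prints skipped=1
    }
}
```

Returning a skip count (or incrementing a Spark accumulator) rather than logging alone would address the "users don't read logs" concern, since the count could be surfaced in the job's output.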
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]