echauchot opened a new pull request, #19680:
URL: https://github.com/apache/flink/pull/19680

   
   
   ## What is the purpose of the change
   
   When migrating to Cassandra 4.x in [this 
PR](https://github.com/apache/flink/pull/19586) a race condition in the tests 
between the asynchronous writes and the junit assertions was uncovered. So it 
was decided to introduce the flush mechanism to asynchronous writes in the 
Cassandra output formats similarly to what was done in Cassandra sinks.
   
   
   ## Brief change log
   
   The existing class `CassandraOutputFormatBase` that was previously used as a 
base class only for Tuple and Row outputFormats is now used as a base class for 
the 3 output formats including Pojo. the base class for column based output 
formats (tuple and row) is now a new class called 
CassandraColumnarOutputFormatBase.
   Regarding configuration of the flush I preferred using simple setters to a 
configuration object as there was no builders for the output formats.
   Regarding other modules: I extracted a utility method for semaphore 
management (SinkUtils) because it is used by both sinks and output formats now. 
And I also had to change the exceptions thrown in OutputFormat as some methods 
can now throw TimeoutException and InterruptedException because of the flush 
mechanism. I think it is ok as this interface is not user facing.
   
   
   ## Verifying this change
   
   
   
   This change is already covered by existing ITCAse tests 
   This change added UTests for the flush mechanism and can be verified as 
follows: CassandraOutputFormatBaseTest
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): adding 
semaphore permit management + flush on close
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
     - The S3 file system connector: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? yes 
     - If yes, how is the feature documented? javadocs
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to