skywalker0618 commented on PR #18701:
URL: https://github.com/apache/hudi/pull/18701#issuecomment-4408941622

   > some flink tests fail during reading parquet file, plz check. @skywalker0618
   
   Thanks, the test failure was due to a race condition.
   
   Sequence:
   
   1. Sink reaches the expected row count and throws SuccessException, which 
begins to propagate as the job's failure cause.
   2. The cascading shutdown closes the chained source's Hadoop 
FSDataInputStream.
   3. The source's SplitFetcher thread (or the mailbox-side reader, depending 
on timing) does one more row-group read on the now-closed stream and surfaces 
IOException("Stream is closed!").
   4. With restart-strategy.fixed-delay.attempts=0 (set in beforeEach to keep 
IT cases deterministic), whichever exception the JobMaster latches first 
becomes the reported failure cause. When the source's IOException wins the 
race, assertThrowable(..., SuccessException.class) doesn't find 
SuccessException in the cause chain and the test fails — even though the sink 
had already collected the expected rows by that point.
   
   I've put a fix in the failing test. It now tolerates two terminal causes, walking the cause chain for either the existing SuccessException (happy path) or IOException("Stream is closed!") from the shutdown race.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
