stream2000 commented on PR #9887:
URL: https://github.com/apache/hudi/pull/9887#issuecomment-1777029605

   > so in such case, files are always created already?
   
   I added some logs to track one of the write tasks (task 62.0) and got the following result:
   
   ```txt
   31398 [dispatcher-event-loop-0] INFO  org.apache.spark.executor.Executor [] - Executor is trying to kill task 62.0 in stage 38.0 (TID 845), reason: Stage cancelled
   33673 [ScalaTest-run-running-TestInsertTable] ERROR org.apache.hudi.internal.DataSourceInternalWriterHelper [] - Commit 20231024112325400 aborted
   33720 [Executor task launch worker for task 62.0 in stage 38.0 (TID 845)] INFO  org.apache.hudi.table.marker.TimelineServerBasedWriteMarkers [] - [timeline-server-based] Created marker file dt=2021-01-05/bb4c8ba7-475a-3e8f-b2ad-5ed55bee188f-0_62-845-0_20231024112325400.parquet.marker.CREATE in 2390 ms
   33725 [Executor task launch worker for task 62.0 in stage 38.0 (TID 845)] INFO  org.apache.hudi.io.storage.row.HoodieRowParquetWriteSupport [] - Initialized Parquet WriteSupport with Catalyst schema:
   33749 [Executor task launch worker for task 62.0 in stage 38.0 (TID 845)] ERROR org.apache.spark.util.Utils [] - Aborting task
   33753 [Executor task launch worker for task 62.0 in stage 38.0 (TID 845)] ERROR
   33753 [Executor task launch worker for task 62.0 in stage 38.0 (TID 845)] ERROR org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask [] - Aborting commit for partition 62 (task 845, attempt 0, stage 38.0)
   33753 [Executor task launch worker for task 62.0 in stage 38.0 (TID 845)] WARN  org.apache.hudi.spark3.internal.HoodieBulkInsertDataInternalWriter [] - Task Aborted
   org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask [] - Aborted commit for partition 62 (task 845, attempt 0, stage 38.0)
   33753 [Executor task launch worker for task 62.0 in stage 38.0 (TID 845)] WARN  org.apache.hudi.spark3.internal.HoodieBulkInsertDataInternalWriter [] - Task Closed
   ```
   
   The event sequence is:
   
   driver tries to kill task 62 -> driver aborts the commit -> task 62 creates the marker -> task 62 starts writing Parquet -> task 62 is aborted -> task 62 is closed.
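   
   To make the ordering easier to check, here is a rough sketch that lists what the log shows happening after the driver-side abort. The numbers are the leading values copied from the log lines above (assumed here to be millisecond offsets); the event labels and the `events_after` helper are my own paraphrase for illustration, not anything from Hudi or Spark.
   
   ```python
   # Leading numbers copied from the log above (assumed to be ms offsets).
   # The labels are paraphrased; they are not the literal log messages.
   events = [
       (31398, "driver", "executor asked to kill task 62.0"),
       (33673, "driver", "commit 20231024112325400 aborted"),
       (33720, "task 62.0", "marker file created"),
       (33725, "task 62.0", "Parquet WriteSupport initialized"),
       (33749, "task 62.0", "task aborting"),
       (33753, "task 62.0", "task closed"),
   ]


   def events_after(events, label):
       """Return the events logged strictly after the event with this label."""
       cutoff = next(ts for ts, _, what in events if what == label)
       return [(ts, who, what) for ts, who, what in events if ts > cutoff]


   # Everything task 62.0 still did after the driver had aborted the commit.
   late = events_after(events, "commit 20231024112325400 aborted")
   for ts, who, what in late:
       print(f"{ts} {who}: {what}")
   ```
   
   The first entry in `late` is the marker creation, i.e. the marker was only reported as created after the commit had already been aborted on the driver.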
   
    
   

