WarsenLiu opened a new issue, #7106:
URL: https://github.com/apache/seatunnel/issues/7106

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22)
 and found no similar issues.
   
   
   ### What happened
   
   I built a SeaTunnel Engine cluster on three servers (8 CPU cores and 32 GB of 
memory each). Checkpoints sometimes time out, and checkpoint files are always 
written to the first node, which leaves that node with too many inodes in use.
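
To quantify the inode growth, a small script along the following lines could be run on the first node. This is only a sketch: the function name `count_inodes_by_subdir` is hypothetical, the path is taken from the `namespace` setting in this report, and the count of files plus directories only approximates inode usage (hard links, for example, are not accounted for).

```python
import os
from collections import Counter


def count_inodes_by_subdir(root):
    """Count filesystem entries (files + directories) under each top-level child of root."""
    counts = Counter()
    for child in os.scandir(root):
        if child.is_dir(follow_symlinks=False):
            total = 1  # the top-level directory itself
            for _dirpath, dirnames, filenames in os.walk(child.path):
                total += len(dirnames) + len(filenames)
            counts[child.name] = total
        else:
            counts[child.name] = 1
    return counts


if __name__ == "__main__":
    # Assumed location: the checkpoint `namespace` from the config in this report.
    root = "/data/apache-seatunnel-2.3.5/checkpoint"
    if os.path.isdir(root):
        # Show the ten largest subdirectories by entry count.
        for name, n in count_inodes_by_subdir(root).most_common(10):
            print(f"{n:>10}  {name}")
```

Running it on each of the three nodes would show whether only the first node accumulates checkpoint files.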
   
   ### SeaTunnel Version
   
   2.3.5
   
   ### SeaTunnel Config
   
   ```conf
   seatunnel:
     engine:
       classloader-cache-mode: true
       history-job-expire-minutes: 180
       backup-count: 1
       queue-type: blockingqueue
       print-execution-info-interval: 60
       print-job-metrics-info-interval: 60
       slot-service:
         dynamic-slot: true
       checkpoint:
         interval: 300000
         timeout: 600000
         storage:
           type: hdfs
           max-retained: 3
           plugin-config:
             namespace: /data/apache-seatunnel-2.3.5/checkpoint
             # namespace: /tmp/seatunnel/checkpoint_snapshot
             storage.type: hdfs
             fs.defaultFS: file:///data/apache-seatunnel-2.3.5/
   ```
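
One detail in the configuration above may be worth checking: `fs.defaultFS` uses the `file://` scheme, which Hadoop treats as the local filesystem, so checkpoint data would be written to the local disk of whichever node performs the write rather than to shared storage. Whether this explains the single-node writes reported here is an assumption, not something this report confirms. A minimal sketch of such a check follows; the helper name `is_local_checkpoint_storage` and the `hdfs://namenode:8020` address are hypothetical.

```python
from urllib.parse import urlparse


def is_local_checkpoint_storage(plugin_config):
    """Return True if fs.defaultFS points at the local filesystem.

    An empty value is treated as local because Hadoop's default is file:///.
    """
    default_fs = plugin_config.get("fs.defaultFS", "")
    scheme = urlparse(default_fs).scheme
    return scheme in ("", "file")


# Values copied from the plugin-config section of this report.
reported = {
    "namespace": "/data/apache-seatunnel-2.3.5/checkpoint",
    "storage.type": "hdfs",
    "fs.defaultFS": "file:///data/apache-seatunnel-2.3.5/",
}

# Hypothetical shared-storage variant, for comparison only.
shared = dict(reported, **{"fs.defaultFS": "hdfs://namenode:8020"})

print(is_local_checkpoint_storage(reported))
print(is_local_checkpoint_storage(shared))
```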
   
   
   ### Running Command
   
   ```shell
   # Submitted through DolphinScheduler (ds) with the following job config:
   env {
     parallelism = 1
     job.mode = "STREAMING"
     checkpoint.interval = 60000
     job.name = "z003"
   }
   
   source {
     MySQL-CDC {
       base-url = "jdbc:mysql://xxx:3306/xxx?autoReconnect=true"
       username = "root"
       password = "xxx"
       table-names = ["xxx.xxx"]
       startup.mode = "initial"
       result_table_name = "source_table_2"
       query = "select xxx from xxx"
     }
   }
   
   transform {
     Sql {
       source_table_name = "source_table_2"
       result_table_name = "target_table_2"
       query = "select xxx from source_table_2"
     }
     Sql {
       source_table_name = "target_table_2"
       result_table_name = "target_table_log_2"
       query = "select xxx from target_table_2"
     }
   }
   
   sink {
     Jdbc {
       url = "jdbc:mysql://xxx:3306/xxx?autoReconnect=true"
       driver= "com.mysql.cj.jdbc.Driver"
       user = "root"
       password = "xxx"
       database = "xxx"
       source_table_name = "target_table_2"
       generate_sink_sql = true
       table = "xxx"
       batch_size = 10
       primary_keys = ["xxx"]
     }
     Jdbc {
       url = "jdbc:mysql://xxx:3306/xxx?autoReconnect=true"
       driver= "com.mysql.cj.jdbc.Driver"
       user = "root"
       password = "xxx"
       database = "xxx"
       source_table_name = "target_table_log_2"
       batch_size = 10
       query = "insert into xxx(xxx) values(?) ON DUPLICATE KEY UPDATE field = VALUES(field);"
     }
   }
   ```
   
   
   ### Error Exception
   
   ```log
   [INFO] 2024-07-04 13:37:16.704 +0800 -  -> 
        2024-07-04 13:37:15,926 INFO  
org.apache.seatunnel.engine.client.job.ClientJobProxy - Job 
(861093691492663301) end with state FAILED
        2024-07-04 13:37:15,927 INFO  com.hazelcast.core.LifecycleService - 
hz.client_1 [seatunnel] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is 
SHUTTING_DOWN
        2024-07-04 13:37:15,936 INFO  
com.hazelcast.client.impl.connection.ClientConnectionManager - hz.client_1 
[seatunnel] [5.1] Removed connection to endpoint: 
[10.60.162.14]:5801:aed286d6-1625-4ff4-91b6-34028db07da3, connection: 
ClientConnection{alive=false, connectionId=2, 
channel=NioChannel{/10.60.162.35:52531->/10.60.162.14:5801}, 
remoteAddress=[10.60.162.14]:5801, lastReadTime=2024-07-04 13:37:08.482, 
lastWriteTime=2024-07-04 13:37:08.481, closedTime=2024-07-04 13:37:15.932, 
connected server version=5.1}
        2024-07-04 13:37:15,940 INFO  
com.hazelcast.client.impl.connection.ClientConnectionManager - hz.client_1 
[seatunnel] [5.1] Removed connection to endpoint: 
[10.60.162.16]:5801:a3bbbfda-b0bc-4738-a148-a17337bdb588, connection: 
ClientConnection{alive=false, connectionId=3, 
channel=NioChannel{/10.60.162.35:47319->/10.60.162.16:5801}, 
remoteAddress=[10.60.162.16]:5801, lastReadTime=2024-07-04 13:37:13.484, 
lastWriteTime=2024-07-04 13:37:13.482, closedTime=2024-07-04 13:37:15.937, 
connected server version=5.1}
        2024-07-04 13:37:15,942 INFO  
com.hazelcast.client.impl.connection.ClientConnectionManager - hz.client_1 
[seatunnel] [5.1] Removed connection to endpoint: 
[10.60.162.31]:5801:b9abbcd8-ac93-41d8-9d85-25664ce23716, connection: 
ClientConnection{alive=false, connectionId=1, 
channel=NioChannel{/10.60.162.35:49005->/10.60.162.31:5801}, 
remoteAddress=[10.60.162.31]:5801, lastReadTime=2024-07-04 13:37:15.906, 
lastWriteTime=2024-07-04 13:37:13.328, closedTime=2024-07-04 13:37:15.940, 
connected server version=5.1}
        2024-07-04 13:37:15,942 INFO  com.hazelcast.core.LifecycleService - 
hz.client_1 [seatunnel] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is 
CLIENT_DISCONNECTED
        2024-07-04 13:37:15,946 INFO  com.hazelcast.core.LifecycleService - 
hz.client_1 [seatunnel] [5.1] HazelcastClient 5.1 (20220228 - 21f20e7) is 
SHUTDOWN
        2024-07-04 13:37:15,946 INFO  
org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand - 
Closed SeaTunnel client......
        2024-07-04 13:37:15,946 INFO  
org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand - 
Closed metrics executor service ......
        2024-07-04 13:37:15,946 ERROR 
org.apache.seatunnel.core.starter.SeaTunnel - 
        
        
===============================================================================
        
        
        2024-07-04 13:37:15,947 ERROR 
org.apache.seatunnel.core.starter.SeaTunnel - Fatal Error, 
        
        2024-07-04 13:37:15,947 ERROR 
org.apache.seatunnel.core.starter.SeaTunnel - Please submit bug report in 
https://github.com/apache/seatunnel/issues
        
        2024-07-04 13:37:15,947 ERROR 
org.apache.seatunnel.core.starter.SeaTunnel - Reason:SeaTunnel job executed 
failed 
        
        2024-07-04 13:37:15,949 ERROR 
org.apache.seatunnel.core.starter.SeaTunnel - Exception 
StackTrace:org.apache.seatunnel.core.starter.exception.CommandExecuteException: 
SeaTunnel job executed failed
                at 
org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:202)
                at 
org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
                at 
org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
        Caused by: 
org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: 
org.apache.seatunnel.engine.server.checkpoint.CheckpointException: Checkpoint 
expired before completing. Please increase checkpoint timeout in the 
seatunnel.yaml or jobConfig env.
                at 
org.apache.seatunnel.engine.server.checkpoint.CheckpointCoordinator.handleCoordinatorError(CheckpointCoordinator.java:274)
                at 
org.apache.seatunnel.engine.server.checkpoint.CheckpointCoordinator.lambda$null$9(CheckpointCoordinator.java:590)
                at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
                at java.util.concurrent.FutureTask.run(FutureTask.java:266)
                at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
                at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
                at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
                at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
                at java.lang.Thread.run(Thread.java:750)
        
                at 
org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:194)
                ... 2 more
         
        2024-07-04 13:37:15,949 ERROR 
org.apache.seatunnel.core.starter.SeaTunnel - 
        
===============================================================================
        
        
        
        Exception in thread "main" 
org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel 
job executed failed
                at 
org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:202)
                at 
org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
                at 
org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
        Caused by: 
org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: 
org.apache.seatunnel.engine.server.checkpoint.CheckpointException: Checkpoint 
expired before completing. Please increase checkpoint timeout in the 
seatunnel.yaml or jobConfig env.
                at 
org.apache.seatunnel.engine.server.checkpoint.CheckpointCoordinator.handleCoordinatorError(CheckpointCoordinator.java:274)
                at 
org.apache.seatunnel.engine.server.checkpoint.CheckpointCoordinator.lambda$null$9(CheckpointCoordinator.java:590)
                at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
                at java.util.concurrent.FutureTask.run(FutureTask.java:266)
                at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
                at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
                at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
                at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
                at java.lang.Thread.run(Thread.java:750)
        
                at 
org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:194)
                ... 2 more
        2024-07-04 13:37:15,951 INFO  
org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand - run 
shutdown hook because get close signal
   [INFO] 2024-07-04 13:37:16.706 +0800 - process has exited. execute 
path:/data/hubs/dolphinscheduler/tmp/exec/process/default/13823066283424/13831054200608_19/164485/311103,
 processId:560021 ,exitStatusCode:1 ,processWaitForStatus:true 
,processExitValue:1
   [INFO] 2024-07-04 13:37:16.707 +0800 - 
***********************************************************************************************
   [INFO] 2024-07-04 13:37:16.707 +0800 - *********************************  
Finalize task instance  ************************************
   [INFO] 2024-07-04 13:37:16.707 +0800 - 
***********************************************************************************************
   [INFO] 2024-07-04 13:37:16.707 +0800 - Upload output files: [] successfully
   [INFO] 2024-07-04 13:37:16.707 +0800 - Send task execute status: FAILURE to 
master : 10.60.162.35:1234
   [INFO] 2024-07-04 13:37:16.708 +0800 - Remove the current task execute 
context from worker cache
   [INFO] 2024-07-04 13:37:16.708 +0800 - The current execute mode isn't 
develop mode, will clear the task execute file: 
/data/hubs/dolphinscheduler/tmp/exec/process/default/13823066283424/13831054200608_19/164485/311103
   [INFO] 2024-07-04 13:37:16.708 +0800 - Success clear the task execute file: 
/data/hubs/dolphinscheduler/tmp/exec/process/default/13823066283424/13831054200608_19/164485/311103
   [INFO] 2024-07-04 13:37:16.708 +0800 - FINALIZE_SESSION
   ```
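
Because the log above is hard-wrapped, the root cause is spread over several lines. The sketch below shows one way to pull the `Caused by:` message out of such a log. The function name `extract_root_causes` is hypothetical, and the pattern assumes Java-style stack frames (` at package.Class.method(`) directly after the message, so it may need adjusting for other log layouts.

```python
import re


def extract_root_causes(log_text):
    """Return the distinct 'Caused by:' messages found in a hard-wrapped Java log."""
    # Collapse line wraps and indentation into single spaces.
    flat = re.sub(r"\s+", " ", log_text)
    # Capture everything after "Caused by:" up to the first stack frame.
    causes = re.findall(r"Caused by: (.+?)(?= at [\w.$<>]+\()", flat)
    # De-duplicate while preserving the order of first appearance.
    return list(dict.fromkeys(causes))


if __name__ == "__main__":
    # Excerpt copied from the log in this report, including its line wraps.
    sample = (
        "Caused by: \n"
        "org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: \n"
        "org.apache.seatunnel.engine.server.checkpoint.CheckpointException: Checkpoint \n"
        "expired before completing. Please increase checkpoint timeout in the \n"
        "seatunnel.yaml or jobConfig env.\n"
        "                at \n"
        "org.apache.seatunnel.engine.server.checkpoint.CheckpointCoordinator"
        ".handleCoordinatorError(CheckpointCoordinator.java:274)\n"
    )
    for cause in extract_root_causes(sample):
        print(cause)
```

For this log, the extracted message is the `CheckpointException` stating that the checkpoint expired before completing.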
   
   
   ### Zeta or Flink or Spark Version
   
   Zeta
   
   ### Java or Scala Version
   
   java version "1.8.0_401"
   Java(TM) SE Runtime Environment (build 1.8.0_401-b10)
   Java HotSpot(TM) 64-Bit Server VM (build 25.401-b10, mixed mode)
   
   ### Screenshots
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   

