[I] [Bug] [Sink-JDBC] Out of memory issue. [seatunnel]

via GitHub Wed, 08 Jan 2025 11:18:15 -0800


adamfarmer0 opened a new issue, #8484:
URL: https://github.com/apache/seatunnel/issues/8484


   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.
   
   
   ### What happened
   
   I am running a SeaTunnel cluster in a simple Docker container.
   Here is my Dockerfile for context:
   
   ```
   FROM adoptopenjdk:8-jdk-hotspot
   ENV SEATUNNEL_HOME=/app/apache-seatunnel-2.3.8
   ENV PATH=$PATH:$SEATUNNEL_HOME/bin
   
   RUN mkdir -p /tmp/seatunnel/checkpoint_snapshot
   RUN chmod -R 777 /tmp/seatunnel/checkpoint_snapshot
   
   WORKDIR /app
   
   ADD . /app
   
   ENV version="2.3.8"
   
   RUN echo 'export SEATUNNEL_HOME=/app/apache-seatunnel-2.3.8' >> /etc/profile.d/seatunnel.sh && \
       echo 'export PATH=$PATH:$SEATUNNEL_HOME/bin' >> /etc/profile.d/seatunnel.sh
   
   RUN chmod +x /etc/profile.d/seatunnel.sh
   
   EXPOSE 5801
   
   CMD seatunnel-cluster.sh
   ```
   
   The SeaTunnel configuration is unchanged. The following connectors are active, but I only run jobs with a JDBC source (MSSQL) and a ClickHouse sink:
   - `connector-cdc-mysql`
   - `connector-cdc-mongodb`
   - `connector-cdc-sqlserver`
   - `connector-cdc-postgres`
   - `connector-clickhouse`
   - `connector-elasticsearch`
   - `connector-email`
   - `connector-file-ftp`
   - `connector-file-local`
   - `connector-file-sftp`
   - `connector-http-base`
   - `connector-jdbc`
   - `connector-kafka`
   - `connector-mongodb`
   - `connector-rabbitmq`
   - `connector-redis`
   - `connector-activemq`
   
   I submit jobs to SeaTunnel through the REST API on port 5801. I do not use checkpoints, and I always run jobs in batch mode.
   
   Memory usage graph:
   
![image](https://github.com/user-attachments/assets/4d927b7f-6644-455c-a56a-664137fb00d7)
   
   After a day, the container uses more than 4 GB of RAM. After every job, memory increases and then drops a bit, but it never returns to its starting level. The increase per job depends on the job's size, and after about a day the server starts throwing out-of-memory exceptions.
   
   Before updating I was on version 2.3.3. In that version the container exited after the out-of-memory exception, Docker restarted it automatically (restart policy `always`), and only one job was lost. Since updating to 2.3.8, nothing happens when the out-of-memory exception occurs: the container does not exit and all my jobs fail.
   
   Note that I run about 5000 jobs each day. All of them are simple, but they run every 15 minutes, and all are submitted through the submit-job endpoint.
   
   Some error logs are attached:
   [errors.log](https://github.com/user-attachments/files/18351905/errors.log)
   
   ### SeaTunnel Version
   
   2.3.8
   
   ### SeaTunnel Config
   
   ```conf
   url: /hazelcast/rest/maps/submit-job?jobName=testjob
   request body:
   
   {
     "env": {
       "execution.parallelism": 4,
       "job.mode": "BATCH",
       "checkpoint.interval": 10000
     },
     "source": [
       {
         "plugin_name": "Jdbc",
         "url": "jdbc:sqlserver://*****:1433;database=***;trustServerCertificate=true",
         "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
         "connection_check_timeout_sec": 100,
         "user": "******",
         "password": "******",
         "query": "select * from SOMETABLENAME"
       }
     ],
     "sink": [
       {
         "plugin_name": "Clickhouse",
         "host": "192.168.1.1:8180",
         "database": "my_database",
         "table": "SOMETABLENAME",
         "username": "******",
         "password": "*******",
         "clickhouse.confg": {
           "max_rows_to_read": "100",
           "read_overflow_mode": "throw"
         }
       }
     ]
   }
   ```
   
   
   ### Running Command
   
   ```shell
   /hazelcast/rest/maps/submit-job?jobName=testjob
   ```
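
For reference, the submission described above can be reproduced with a short script. This is a minimal sketch, not the exact client used in this report: the host `localhost` and the helper name `build_submit_request` are assumptions (only port 5801 and the endpoint path come from the report), and the job config is a trimmed copy of the one shown earlier.

```python
import json
from urllib import parse, request

# Hypothetical host; only port 5801 and the endpoint path come from the report.
BASE_URL = "http://localhost:5801"


def build_submit_request(job_name, job_config, base_url=BASE_URL):
    """Build (but do not send) the POST request for the submit-job endpoint."""
    url = f"{base_url}/hazelcast/rest/maps/submit-job?jobName={parse.quote(job_name)}"
    body = json.dumps(job_config).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Trimmed copy of the reported config; connection details are omitted.
job_config = {
    "env": {
        "execution.parallelism": 4,
        "job.mode": "BATCH",
        "checkpoint.interval": 10000,
    },
    "source": [{"plugin_name": "Jdbc", "query": "select * from SOMETABLENAME"}],
    "sink": [{"plugin_name": "Clickhouse", "table": "SOMETABLENAME"}],
}

req = build_submit_request("testjob", job_config)
# To actually submit the job: request.urlopen(req, timeout=30)
```

Building the request separately from sending it makes it easy to submit the same job in a loop while watching the container's memory.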
   
   
   ### Error Exception
   
   ```log
   [929069645775699969] 2025-01-07 18:00:12,618 INFO  [o.a.s.c.s.j.s.ChunkSplitter   ] [BlockingWorker-TaskGroupLocation{jobId=929069645775699969, pipelineId=1, taskGroupId=30000}] - Switch to dynamic chunk splitter
   java.lang.OutOfMemoryError: Metaspace
   Dumping heap to /tmp/seatunnel/dump/zeta-server ...
   Unable to create /tmp/seatunnel/dump/zeta-server: No such file or directory
   [929069645775831041] 2025-01-07 18:00:21,050 ERROR [.s.e.s.c.CheckpointCoordinator] [hz.main.generic-operation.thread-36] - report error from task
   org.apache.seatunnel.common.utils.SeaTunnelException: java.lang.OutOfMemoryError: Metaspace
        at org.apache.seatunnel.engine.server.checkpoint.CheckpointCoordinator.reportCheckpointErrorFromTask(CheckpointCoordinator.java:390) ~[seatunnel-starter.jar:2.3.8]
        at org.apache.seatunnel.engine.server.checkpoint.CheckpointManager.reportCheckpointErrorFromTask(CheckpointManager.java:184) ~[seatunnel-starter.jar:2.3.8]
        at org.apache.seatunnel.engine.server.checkpoint.operation.CheckpointErrorReportOperation.runInternal(CheckpointErrorReportOperation.java:48) ~[seatunnel-starter.jar:2.3.8]
        at org.apache.seatunnel.engine.server.task.operation.TracingOperation.run(TracingOperation.java:44) ~[seatunnel-starter.jar:2.3.8]
        at com.hazelcast.spi.impl.operationservice.Operation.call(Operation.java:189) ~[seatunnel-starter.jar:2.3.8]
        at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.call(OperationRunnerImpl.java:273) ~[seatunnel-starter.jar:2.3.8]
        at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:248) ~[seatunnel-starter.jar:2.3.8]
        at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:213) ~[seatunnel-starter.jar:2.3.8]
        at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.process(OperationThread.java:175) ~[seatunnel-starter.jar:2.3.8]
        at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.process(OperationThread.java:139) ~[seatunnel-starter.jar:2.3.8]
        at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.executeRun(OperationThread.java:123) ~[seatunnel-starter.jar:2.3.8]
        at com.hazelcast.internal.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:102) ~[seatunnel-starter.jar:2.3.8]
   ```
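
The excerpt above shows `OutOfMemoryError: Metaspace`, which concerns class metadata rather than the Java heap. To check whether every out-of-memory entry in a log is of that kind, a small script along these lines could tally them. This is only a sketch: the function name `count_oom_kinds` is illustrative, and the sample lines are copied from the excerpt above, not from the full attached log.

```python
import re
from collections import Counter

# Captures the text after "java.lang.OutOfMemoryError: " up to the end of the line.
OOM_PATTERN = re.compile(r"java\.lang\.OutOfMemoryError: (.+?)\s*$")


def count_oom_kinds(lines):
    """Count OutOfMemoryError occurrences per memory area (e.g. Metaspace)."""
    counts = Counter()
    for line in lines:
        match = OOM_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines taken from the log excerpt in this report.
sample = [
    "java.lang.OutOfMemoryError: Metaspace",
    "Dumping heap to /tmp/seatunnel/dump/zeta-server ...",
    "org.apache.seatunnel.common.utils.SeaTunnelException: "
    "java.lang.OutOfMemoryError: Metaspace",
]
counts = count_oom_kinds(sample)
print(dict(counts))
```

For a real log file, the same function can be called with an open file handle, for example `count_oom_kinds(open("errors.log"))`.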
   
   
   ### Zeta or Flink or Spark Version
   
   _No response_
   
   ### Java or Scala Version
   
   java 8
   
   ### Screenshots
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
