[ 
https://issues.apache.org/jira/browse/FLINK-30960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bo Shen updated FLINK-30960:
----------------------------
     Attachment: image-2023-02-08-16-58-27-765.png
    Description: 
In production I have an environment consisting of a Kafka source, to which 
third-party providers push a few thousand messages every hour, and I use the 
Flink JDBC sink to store those messages in a MySQL database.

Recently there have been a few TaskManager process crashes. My investigation 
suggests that the cause boils down to the way exceptions are handled in JDBC 
batched mode.

 

When a JDBC write fails in batched mode due to an error such as 
DataTruncation, the exception is stored in the field "flushException", waiting 
to be processed by the task main thread.

This exception is only checked on the next call to "writeRecords". In my case, 
if the exception happens to occur while processing the last batch of records, 
and no further records come from the source for the next hour, the flushing 
thread simply keeps throwing a new RuntimeException wrapping the last 
"flushException" as its cause. The new RuntimeException is then stored as the 
new "flushException".

Hence, every time the RuntimeException is thrown, the chain of nested 
exceptions grows, and eventually, before the exception is ever processed on 
the main thread, the JVM runs out of memory, which causes an unexpected 
process crash.

!image-2023-02-08-16-58-27-765.png!
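
For illustration, here is a minimal Java sketch of the pattern described above. The class and member names are assumptions made for this sketch, not the actual Flink connector code: a scheduled flusher records its failure in "flushException", and each subsequent run wraps the stored exception in a new RuntimeException and stores the wrapper back, so the cause chain grows by one level per flush interval.

{code:java}
// A minimal sketch of the failure mode described above; class and field names
// are assumptions for illustration, not the actual Flink connector code.
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class FlushExceptionGrowthSketch {

    // Failure recorded by the flusher thread, to be rethrown later on the task main thread.
    private volatile Exception flushException;

    // Stand-in for the JDBC batch execution that fails, e.g. with DataTruncation.
    private void attemptFlush() throws Exception {
        throw new Exception("Writing records to JDBC failed (simulated DataTruncation).");
    }

    public void start() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            if (flushException != null) {
                // Problematic pattern: wrap the previously stored exception as the cause
                // of a new RuntimeException and store the wrapper back. With no new records
                // arriving, nothing clears the field, so the cause chain grows every interval.
                flushException = new RuntimeException("Writing records to JDBC failed.", flushException);
                return;
            }
            try {
                attemptFlush();
            } catch (Exception e) {
                flushException = e;
            }
        }, 0, 1, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        // Left running long enough, the nested RuntimeExceptions accumulate
        // until the JVM exhausts its heap.
        new FlushExceptionGrowthSketch().start();
    }
}
{code}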

 

 


> OutOfMemory error using jdbc sink
> ---------------------------------
>
>                 Key: FLINK-30960
>                 URL: https://issues.apache.org/jira/browse/FLINK-30960
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / JDBC
>    Affects Versions: 1.13.6
>         Environment: Using SQL-Client + TableAPI
> Processing Kafka messages to MySQL databases.
>            Reporter: Bo Shen
>            Priority: Major
>         Attachments: image-2023-02-08-16-57-07-850.png, 
> image-2023-02-08-16-58-27-765.png
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
