[ https://issues.apache.org/jira/browse/FLINK-20972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
huajiewang updated FLINK-20972:
-------------------------------
    Description:
When TwoPhaseCommitSinkFunction triggers notifyCheckpointComplete, a large amount of event data may be written to the log at INFO level (LOG.info), which can cause an I/O bottleneck and waste disk space.

My code is in the attachment. The log written by Flink contains a large amount of event data, e.g.:
{code:java}
Jdbc2PCSinkFunction 1/1 - checkpoint 4 complete, committing transaction TransactionHolder{handle=Transaction(b420c880a951403984f231dd7e33597b, ListBuffer(insert into table(field1,field2) value ('11','22') ... ... ), transactionStartTime=1610426158532} from checkpoint 4
{code}
The LOG.info call in TwoPhaseCommitSinkFunction is as follows:
{code:java}
LOG.info(
        "{} - checkpoint {} complete, committing transaction {} from checkpoint {}",
        name(),
        checkpointId,
        pendingTransaction,
        pendingTransactionCheckpointId);
{code}
This call invokes the toString method of pendingTransaction, which is a TransactionHolder instance. The toString method of TransactionHolder is:
{code:java}
@Override
public String toString() {
    return "TransactionHolder{"
            + "handle="
            + handle
            + ", transactionStartTime="
            + transactionStartTime
            + '}';
}
{code}
Here handle is the concrete implementation of my transaction. My Transaction class has a field of List type that buffers the incoming data, so all of that data ends up printed by LOG.info.

> TwoPhaseCommitSinkFunction Output a large amount of EventData
> -------------------------------------------------------------
>
>                 Key: FLINK-20972
>                 URL: https://issues.apache.org/jira/browse/FLINK-20972
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / DataStream
>   Affects Versions: 1.12.0
>         Environment: flink 1.4.0 +
>            Reporter: huajiewang
>            Priority: Minor
>              Labels: easyfix, pull-request-available
>         Attachments: Jdbc2PCSinkFunction.scala
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h

--
This message was sent by Atlassian Jira (v8.3.4#803005)
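
Because the log line is built from the toString of the transaction handle, one possible user-side mitigation is to give the handle a toString that summarizes the buffered data instead of dumping it. The sketch below is only an illustration under that assumption: BufferedTransaction and its fields are hypothetical names, not part of Flink and not the code from the attachment, and it is not the change proposed for TwoPhaseCommitSinkFunction itself.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical transaction handle (not a Flink class). It buffers SQL
 * statements, but its toString reports only the id and the statement count.
 */
final class BufferedTransaction {
    private final String id;
    private final List<String> statements = new ArrayList<>();

    BufferedTransaction(String id) {
        this.id = id;
    }

    /** Buffers one statement to be executed when the transaction commits. */
    void add(String sql) {
        statements.add(sql);
    }

    /** Returns a short summary so that logging the handle stays small. */
    @Override
    public String toString() {
        return "BufferedTransaction{id=" + id
                + ", bufferedStatements=" + statements.size() + '}';
    }
}
```

With a handle like this, a log message that embeds the handle would grow with the length of the id rather than with the number of buffered records.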