[ 
https://issues.apache.org/jira/browse/FLINK-20972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huajiewang updated FLINK-20972:
-------------------------------
    Description: 
When TwoPhaseCommitSinkFunction triggers notifyCheckpointComplete, a large 
amount of event data may be written to the log (LOG.info), which can cause an 
I/O bottleneck and waste disk space.

 
 My code is in the attachment. Flink writes a large amount of event data to the 
log, e.g.: 
{code:java}
Jdbc2PCSinkFunction 1/1 - checkpoint 4 complete, committing transaction 
TransactionHolde {handle=Transaction(b420c880a951403984f231dd7e33597b, 
ListBuffer(insert into table(field1,field2) value ('11','22') ... ... ), 
transactionStartTime=1610426158532} from checkpoint 4{code}
 The LOG.info call in TwoPhaseCommitSinkFunction is as follows:
{code:java}
LOG.info(
        "{} - checkpoint {} complete, committing transaction {} from checkpoint {}",
        name(),
        checkpointId,
        pendingTransaction,
        pendingTransactionCheckpointId); {code}
This invokes the toString method of pendingTransaction (an instance of 
TransactionHolder). The toString method of TransactionHolder is:

 
{code:java}
@Override
public String toString() {
    return "TransactionHolder{"
            + "handle="
            +  handle
            + ", transactionStartTime="
            + transactionStartTime
            + '}';
}{code}
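
The effect can be reproduced outside Flink with a minimal sketch. Everything here is an assumption made for illustration: `Holder` is a simplified stand-in that copies only the toString logic quoted above (it is not Flink's TransactionHolder), and `BufferingTransaction` is a hypothetical user transaction that buffers statements in a list.

```java
import java.util.ArrayList;
import java.util.List;

public class HolderToStringDemo {

    // Hypothetical user transaction: buffers statements until commit.
    static final class BufferingTransaction {
        final String id;
        final List<String> statements = new ArrayList<>();

        BufferingTransaction(String id) {
            this.id = id;
        }

        @Override
        public String toString() {
            // Prints every buffered statement, which is what makes the log line large.
            return "Transaction(" + id + ", " + statements + ")";
        }
    }

    // Simplified stand-in; reproduces only the toString logic quoted above.
    static final class Holder<T> {
        final T handle;
        final long transactionStartTime;

        Holder(T handle, long transactionStartTime) {
            this.handle = handle;
            this.transactionStartTime = transactionStartTime;
        }

        @Override
        public String toString() {
            return "TransactionHolder{"
                    + "handle="
                    + handle
                    + ", transactionStartTime="
                    + transactionStartTime
                    + '}';
        }
    }

    // Builds a holder around a transaction with the given number of buffered statements.
    static String render(int bufferedStatements) {
        BufferingTransaction txn = new BufferingTransaction("txn-1");
        for (int i = 0; i < bufferedStatements; i++) {
            txn.statements.add("insert into t(f1,f2) values ('" + i + "','" + i + "')");
        }
        return new Holder<>(txn, 0L).toString();
    }

    public static void main(String[] args) {
        // The rendered length grows with the number of buffered statements.
        System.out.println(render(1).length());
        System.out.println(render(10_000).length());
    }
}
```

The size of the string passed to the logger is proportional to the amount of data buffered in the transaction, so one log line per checkpoint can carry the whole batch.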
 
 handle is the concrete implementation of my transaction. My transaction has a 
field of List type that buffers incoming data; as a result, the buffered data 
is printed to the log (LOG.info).
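
One possible user-side workaround, sketched here as an assumption rather than as the fix proposed in this issue, is to give the transaction class a compact toString that reports only its id and the number of buffered statements. The class name `CompactTransaction` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class CompactTransaction {

    private final String id;
    private final List<String> statements = new ArrayList<>();

    public CompactTransaction(String id) {
        this.id = id;
    }

    // Buffers one statement to be executed at commit time.
    public void add(String statement) {
        statements.add(statement);
    }

    @Override
    public String toString() {
        // Report only the id and the statement count, never the buffered contents.
        return "Transaction(" + id + ", statements=" + statements.size() + ")";
    }

    public static void main(String[] args) {
        CompactTransaction txn = new CompactTransaction("txn-1");
        txn.add("insert into t(f1) values ('a')");
        txn.add("insert into t(f1) values ('b')");
        System.out.println(txn);
    }
}
```

With such an override, any log statement that formats the holder prints a short, fixed-size summary regardless of how much data the transaction holds. It does not change the logging behaviour of TwoPhaseCommitSinkFunction itself.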
  
  

 


> TwoPhaseCommitSinkFunction Output a large amount of EventData
> -------------------------------------------------------------
>
>                 Key: FLINK-20972
>                 URL: https://issues.apache.org/jira/browse/FLINK-20972
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / DataStream
>    Affects Versions: 1.12.0
>         Environment: flink 1.4.0 +
>            Reporter: huajiewang
>            Priority: Minor
>              Labels: easyfix, pull-request-available
>         Attachments: Jdbc2PCSinkFunction.scala
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
