[ 
https://issues.apache.org/jira/browse/FLINK-17691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17235662#comment-17235662
 ] 

John Mathews commented on FLINK-17691:
--------------------------------------

hey [~aljoscha] I am happy to submit a PR to fix this bug, can I simply submit 
one that truncates the taskName? I think we can truncate either in the 
TransactionIdGenerator or in the TaskInfo constructor itself, depending on if 
we want the limit to apply everywhere or not.

> FlinkKafkaProducer transactional.id too long when using Semantic.EXACTLY_ONCE
> -----------------------------------------------------------------------------
>
>                 Key: FLINK-17691
>                 URL: https://issues.apache.org/jira/browse/FLINK-17691
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Kafka
>    Affects Versions: 1.10.0, 1.11.0
>            Reporter: freezhan
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2020-05-14-20-43-57-414.png, 
> image-2020-05-14-20-45-24-030.png, image-2020-05-14-20-45-59-878.png, 
> image-2020-05-14-21-09-01-906.png, image-2020-05-14-21-16-43-810.png, 
> image-2020-05-14-21-17-09-784.png
>
>
> When sink to Kafka using the {color:#FF0000}Semantic.EXACTLY_ONCE {color}mode.
> The flink Kafka Connector Producer will auto set the 
> {color:#FF0000}transactional.id{color}, and the user - defined value are 
> ignored.
>  
> When the job operator name too long, will send failed
> transactional.id is exceeds the kafka  {color:#FF0000}coordinator_key{color} 
> limit
> !image-2020-05-14-21-09-01-906.png!
>  
> *The flink Kafka Connector policy for automatic generation of transaction.id 
> is as follows*
>  
> 1. use the {color:#FF0000}taskName + "-" + operatorUniqueID{color} as 
> transactional.id prefix (may be too long)
>   getRuntimeContext().getTaskName() + "-" + ((StreamingRuntimeContext)    
> getRuntimeContext()).getOperatorUniqueID()
> 2. Range of available transactional ids 
> [nextFreeTransactionalId, nextFreeTransactionalId + parallelism * 
> kafkaProducersPoolSize)
> !image-2020-05-14-20-43-57-414.png!
>   !image-2020-05-14-20-45-24-030.png!
> !image-2020-05-14-20-45-59-878.png!
>  
> *The Kafka transaction.id check policy as follows:*
>  
> {color:#FF0000}string bytes.length can't larger than Short.MAX_VALUE 
> (32767){color}
> !image-2020-05-14-21-16-43-810.png!
> !image-2020-05-14-21-17-09-784.png!
>  
> *To reproduce this bug, the following conditions must be met:*
>  
>  # send msg to kafka with exactly once mode
>  # the task TaskName' length + TaskName's length is lager than the 32767 (A 
> very long line of SQL or window statements can appear)
> *I suggest a solution:*
>  
>      1.  Allows users to customize transactional.id 's prefix
> or
>      2. Do md5 on the prefix before returning the real transactional.id
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to