[ 
https://issues.apache.org/jira/browse/FLINK-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15769536#comment-15769536
 ] 

ASF GitHub Bot commented on FLINK-4616:
---------------------------------------

Github user MayerRoman commented on the issue:

    https://github.com/apache/flink/pull/3031
  
    I think the changes I propose make it impossible to restore from 
checkpoints created before my code changes.
    
    The consumer now saves
    ListState<Tuple2<KafkaTopicPartition, Tuple2<Long, Long>>> (partition + offset + watermark),
    whereas before it saved
    ListState<Tuple2<KafkaTopicPartition, Long>> (partition + offset).
    
    (I mean checkpoints from versions later than 1.1.
    The recently added backward compatibility with 1.1 snapshots is taken into 
account in my commit, and I think everything is fine there.)
    
    
    Please advise me on how to restore backward compatibility.
    
    I have some ideas about how to implement it:
    
    1)  Somehow check, in the initializeState method, whether the object 
returned from stateStore.getSerializableListState(..)
    
https://github.com/apache/flink/pull/3031/files?diff=unified#diff-06bf4a7f73d98ef91309154654563475R321
    
    is a
    ListState<Tuple2<KafkaTopicPartition, Long>>
    or a
    ListState<Tuple2<KafkaTopicPartition, Tuple2<Long, Long>>>
    
    2)  Use a separate state object for saving the watermark.
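    
    To make the two ideas concrete, here is a rough plain-Java sketch. It 
does not use the Flink API: RestoreSketch, Restored, NO_WATERMARK, 
fromElement and fromSeparateStates are all hypothetical names, a long[] 
stands in for Tuple2<Long, Long>, and a String stands in for 
KafkaTopicPartition. Defaulting a missing watermark to Long.MIN_VALUE is 
also only an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java sketch; none of these names come from Flink or from the pull request. */
class RestoreSketch {

    /** Offset and watermark restored for one partition. */
    static final class Restored {
        final long offset;
        final long watermark;

        Restored(long offset, long watermark) {
            this.offset = offset;
            this.watermark = watermark;
        }
    }

    /** Assumed default when a checkpoint carries no watermark. */
    static final long NO_WATERMARK = Long.MIN_VALUE;

    /**
     * Idea 1: generic parameters are erased at runtime, so the two ListState
     * types cannot be told apart directly, but a restored element value can.
     * Old layout: the value is a Long (offset only).
     * New layout: the value is a long[]{offset, watermark}.
     */
    static Restored fromElement(Object value) {
        if (value instanceof Long) {
            return new Restored((Long) value, NO_WATERMARK);
        }
        if (value instanceof long[] && ((long[]) value).length == 2) {
            long[] pair = (long[]) value;
            return new Restored(pair[0], pair[1]);
        }
        throw new IllegalStateException("Unrecognized state element: " + value);
    }

    /**
     * Idea 2: offsets stay in their original state and watermarks go into a
     * second, separate state. An old checkpoint simply has no watermark entries.
     */
    static Map<String, Restored> fromSeparateStates(
            Map<String, Long> offsets, Map<String, Long> watermarks) {
        Map<String, Restored> result = new HashMap<>();
        for (Map.Entry<String, Long> e : offsets.entrySet()) {
            long wm = watermarks.getOrDefault(e.getKey(), NO_WATERMARK);
            result.put(e.getKey(), new Restored(e.getValue(), wm));
        }
        return result;
    }
}
```

    With idea 2 the layout of the offset state does not change, so an old 
checkpoint would still match it; with idea 1 every restored element has to 
be inspected.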
    
    Or is it necessary to implement this in a different way?
    
    I would be grateful for help.



> Kafka consumer doesn't store last emmited watermarks per partition in state
> ---------------------------------------------------------------------------
>
>                 Key: FLINK-4616
>                 URL: https://issues.apache.org/jira/browse/FLINK-4616
>             Project: Flink
>          Issue Type: Bug
>          Components: Kafka Connector
>    Affects Versions: 1.1.1
>            Reporter: Yuri Makhno
>            Assignee: Roman Maier
>             Fix For: 1.2.0
>
>
> The Kafka consumer stores only Kafka offsets in state and doesn't store the 
> last emitted watermarks; this may lead to a wrong state when a checkpoint is 
> restored.
> Let's say our watermark is (timestamp - 10). With the following message 
> queue, results after a checkpoint restore will differ from those of normal 
> processing:
> A(ts = 30)
> B(ts = 35)
> ------ checkpoint goes here
> C(ts=15) -- this one should be filtered by next time window
> D(ts=60)
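
The scenario above can be replayed in a few lines of plain Java. This is only a sketch, not Flink code: WatermarkScenario and its methods are hypothetical names, and it assumes that the watermark never moves backwards, that an element counts as late when its timestamp is below the current watermark, and that a consumer restored without a stored watermark starts again from Long.MIN_VALUE.

```java
/** Plain-Java replay of the scenario; not Flink code. */
class WatermarkScenario {

    /** Watermark rule from the issue (timestamp - 10), assumed never to move backwards. */
    static long advance(long currentWatermark, long timestamp) {
        return Math.max(currentWatermark, timestamp - 10);
    }

    /** Assumption: an element is late when its timestamp is below the watermark. */
    static boolean isLate(long currentWatermark, long timestamp) {
        return timestamp < currentWatermark;
    }

    /** Watermark reached at the checkpoint, after A(ts = 30) and B(ts = 35). */
    static long watermarkAtCheckpoint() {
        long wm = Long.MIN_VALUE;
        wm = advance(wm, 30); // after A
        wm = advance(wm, 35); // after B
        return wm;
    }
}
```

Under these assumptions the watermark at the checkpoint is 25, so during normal processing isLate(25, 15) is true for C. If the watermark is not restored and restarts at Long.MIN_VALUE, isLate(Long.MIN_VALUE, 15) is false, and C is no longer treated as late.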



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
