psolomin commented on issue #25975:
URL: https://github.com/apache/beam/issues/25975#issuecomment-1764531872

   @je-ik I tried that flag with `KinesisIO` too, and I still observe the same pattern:
   
   - start from a savepoint with reduced parallelism -> data loss
   - start again from that savepoint with previous parallelism -> no data loss
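
   For reference, the restart pattern above can be reproduced with the Flink CLI roughly like this (a sketch: the job id, savepoint directory, jar name, and the parallelism values 2 and 3 are placeholders for my setup, not from the issue itself):

   ```shell
   # Take a savepoint of the running job (job id and target directory are placeholders)
   flink savepoint <job-id> s3://my-bucket/savepoints/

   # Restart from that savepoint with reduced parallelism (-p 2) -> data loss observed
   flink run -s s3://my-bucket/savepoints/savepoint-xxxx -p 2 my-beam-pipeline.jar

   # Restart again from the same savepoint with the original parallelism (-p 3) -> no data loss
   flink run -s s3://my-bucket/savepoints/savepoint-xxxx -p 3 my-beam-pipeline.jar
   ```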
   
   Interestingly, the Flink UI reports the same `Records sent` count for this stage with both parallelism = 2 and parallelism = 3:
   
   ```
    Source: Source/Read(KinesisSource)
   +- Flat Map
      +- Fixed windows/Window.Assign.out
         +- Process payloads/ParMultiDo(SlowProcessor)
            +- Maybe fail/ParMultiDo(FailingProcessor)
               +- Parse payloads/ParMultiDo(ConsumedEventDeserializer)
                  +- Sink to S3/WriteFiles/WriteShardedBundlesToTempFiles/ApplyShardingKey/ParMultiDo(ApplyShardingFunction)
                     +- ToBinaryKeyedWorkItem
   ```
   
   My Kinesis and Kafka pipelines actually do slightly different things, so I will try to converge them as much as possible before concluding anything further.

