bharath kumar avusherla created SPARK-25052:
-----------------------------------------------

             Summary: Is there any possibility that Spark Structured Streaming 
generates duplicates in the output?
                 Key: SPARK-25052
                 URL: https://issues.apache.org/jira/browse/SPARK-25052
             Project: Spark
          Issue Type: Question
          Components: Spark Core
    Affects Versions: 2.3.0
            Reporter: bharath kumar avusherla


We recently observed that Spark Structured Streaming generated duplicates in 
the output when reading from a Kafka topic and writing the output to S3 (with 
checkpointing also in S3). We ran into this issue twice, but it is not 
reproducible. Has anyone faced this kind of issue before? Could it be caused 
by S3 eventual consistency?



