[ https://issues.apache.org/jira/browse/SPARK-24036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16454788#comment-16454788 ]
Jose Torres commented on SPARK-24036: ------------------------------------- https://docs.google.com/document/d/1IL4kJoKrZWeyIhklKUJqsW-yEN7V7aL05MmM65AYOfE I wrote a quick doc summarizing my thoughts. TLDR is: * I think it's better to not reuse the existing shuffle infrastructure - we'll have to do more work to get good performance later, but current shuffle has very bad characteristics for what continuous processing is trying to do. In particular I doubt we'd be able to maintain millisecond-scale latency with anything like UnsafeShuffleWriter. * It's a small diff on top of a working shuffle to support exactly-once state management. I don't think the coordinator needs to worry about stateful operators; a writer will never commit if a stateful operator below it fails to checkpoint, and the stateful operator itself can rewind if it commits an epoch that ends up failing. Let me know what you two think. I'll send this out to the dev list if it looks reasonable, and then we can start thinking about how this breaks down into individual tasks. > Stateful operators in continuous processing > ------------------------------------------- > > Key: SPARK-24036 > URL: https://issues.apache.org/jira/browse/SPARK-24036 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming > Affects Versions: 2.4.0 > Reporter: Jose Torres > Priority: Major > > The first iteration of continuous processing in Spark 2.3 does not work with > stateful operators. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org