Boyang Jerry Peng created SPARK-52330:
-----------------------------------------

Summary: SPIP: Real-Time Mode in Apache Spark Structured Streaming
Key: SPARK-52330
URL: https://issues.apache.org/jira/browse/SPARK-52330
Project: Spark
Issue Type: Umbrella
Components: Structured Streaming
Affects Versions: 4.1.0
Reporter: Boyang Jerry Peng

We propose to add a *real-time mode* to Spark Structured Streaming that significantly lowers end-to-end latency for stream processing. Our goal is to make Spark capable of handling streaming jobs that need results *almost immediately (within O(100) milliseconds)*.

We want to achieve this *without changing the high-level DataFrame/Dataset API* that users already use, so existing streaming queries can run in this new ultra-low-latency mode by simply turning it on, without rewriting their logic.

In short, we’re trying to enable Spark to power *real-time applications* (like instant anomaly alerts or live personalization) that today cannot meet their latency requirements with Spark’s current streaming engine.

SPIP doc: https://docs.google.com/document/d/1CvJvtlTGP6TwQIT4kW6GFT1JbdziAYOBvt60ybb7Dw8/edit?usp=sharing