Spark streaming receivers

2020-08-08 Thread Dark Crusader
Hi, I'm having some trouble figuring out how receivers tie into Spark's driver-executor structure. Do all executors have a receiver that is blocked as soon as it receives some stream data? Or can multiple streams of data be taken as input into a single executor? I have stream data coming in at ever

Re: Spark streaming receivers

2020-08-08 Thread Russell Spitzer
Note that none of this applies to direct streaming approaches, only to receiver-based DStreams. You can think of a receiver as a long-running task that never finishes. Each receiver is submitted to an executor slot somewhere; it then runs indefinitely and internally has a method which passes records over
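
To make the "long-running task in an executor slot" picture concrete, here is a small toy model in plain Python. It is not Spark code and uses no Spark API: `ToyReceiver`, `slots_left_for_processing`, and the in-memory `buffer` are hypothetical names invented for this sketch. It only assumes what the message above says, namely that each receiver pins one slot for as long as it runs and pushes records onward.

```python
import queue
import threading


class ToyReceiver(threading.Thread):
    """Toy stand-in for a receiver: holds one slot and pushes records into a buffer."""

    def __init__(self, source, buffer):
        super().__init__(daemon=True)
        self.source = source
        self.buffer = buffer

    def run(self):
        # A real receiver would loop indefinitely; the source here is finite
        # only so that this demo terminates.
        for record in self.source:
            self.buffer.put(record)  # hand the record over for later batch processing


def slots_left_for_processing(total_slots, num_receivers):
    """Each running receiver occupies one slot, so only the remainder can run batch tasks."""
    return total_slots - num_receivers


buffer = queue.Queue()
receivers = [ToyReceiver(range(0, 3), buffer), ToyReceiver(range(100, 103), buffer)]
for r in receivers:
    r.start()
for r in receivers:
    r.join()

# Drain the buffer; sort because the two receivers interleave nondeterministically.
received = sorted(buffer.get() for _ in range(buffer.qsize()))
free_slots = slots_left_for_processing(total_slots=4, num_receivers=len(receivers))
print(received, free_slots)
```

In this model, two receivers on four slots leave two slots for processing the received data, which is why a receiver-based job needs more slots than it has receivers.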

Re: Spark streaming receivers

2020-08-09 Thread Dark Crusader
Hi Russell, This is super helpful. Thank you so much. Can you elaborate on the differences between Structured Streaming and DStreams? How would the number of receivers required, etc., change?

Re: Spark streaming receivers

2020-08-10 Thread Russell Spitzer
The direct approach, which is also available through DStreams, and Structured Streaming use a different model. Rather than being a push-based streaming solution, they are pull-based. In general, on every batch the driver uses the configuration to create a number of partitions, each is respo
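
The pull-based model can be sketched in the same toy style. The message above is cut off before it says what each partition does, so this sketch makes an explicit assumption: each partition is handed a range of offsets to read, as in the Kafka direct integration. `OffsetRange`, `plan_batch`, `read_range`, and the in-memory `log` are hypothetical names for illustration only, not Spark API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OffsetRange:
    partition: int
    start: int  # inclusive
    end: int    # exclusive


def plan_batch(consumed, latest):
    """Driver side: plan one range per source partition that has new data."""
    return [
        OffsetRange(p, consumed.get(p, 0), latest[p])
        for p in sorted(latest)
        if latest[p] > consumed.get(p, 0)
    ]


def read_range(log, r):
    """Executor side: a short-lived task pulls exactly the records in its range."""
    return log[r.partition][r.start:r.end]


# Assumed in-memory stand-in for a partitioned source.
log = {0: ["a", "b", "c"], 1: ["x", "y"]}
consumed = {0: 1, 1: 0}  # offsets already processed by earlier batches
latest = {p: len(records) for p, records in log.items()}

ranges = plan_batch(consumed, latest)
batch = [read_range(log, r) for r in ranges]
print(ranges)
print(batch)
```

Under this assumption no slot is pinned between batches: each task runs only long enough to pull its range, so there are no receivers to count.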