Re: Spark Structured Streaming

2021-05-31 Thread S
Hi Mich, I agree with you; Spark Streaming will become defunct in favor of Structured Streaming. And I have gone over the document in detail. I am aware of the unbounded datasets, running aggregates, etc. Nevertheless, I wouldn't say it's a moot point as it provides a good intuition of the

Re: Spark Structured Streaming

2021-05-31 Thread Mich Talebzadeh
Hi, I guess whether Structured Streaming (SS) inherited anything from Spark Streaming is a moot point now, although it is a concept built on Spark Streaming, which will be defunct soon. Going forward, it all depends on what problem you are trying to address. These are explained in the following

Spark Structured Streaming

2021-05-31 Thread S
Hi, I am using Structured Streaming on Azure HDInsight. The version is 2.4.6. I am trying to understand the micro-batch mode, with both default and fixed-interval triggers. Does the fixed-interval micro-batch follow something similar to the receiver-based model, where records keep getting pulled and stored into blocks
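
For readers comparing the two trigger modes mentioned in the question above, here is a minimal sketch of how micro-batch start times differ. It is a plain-Python simulation, not Spark code, and it assumes the behavior described in the Structured Streaming programming guide: with the default trigger the next micro-batch starts as soon as the previous one finishes, while with a fixed interval (set in PySpark via `trigger(processingTime=...)`) the engine waits out the rest of the interval if the batch finished early and starts immediately if the batch overran. The function name `batch_start_times` and the sample durations are hypothetical, and the model ignores details such as data availability and interval-boundary alignment.

```python
def batch_start_times(durations, interval=0.0):
    """Return the start time (seconds since query start) of each micro-batch.

    durations: processing time of each micro-batch, in seconds (made-up values).
    interval:  0.0 models the default trigger; a positive value models a
               fixed-interval trigger. This is a simplified model, an assumption
               for illustration, not Spark's actual scheduler.
    """
    starts = []
    now = 0.0
    for duration in durations:
        starts.append(now)
        finished = now + duration
        # Finished early: wait until the interval has elapsed.
        # Overran the interval: start the next batch right away.
        now = max(finished, now + interval)
    return starts


# Hypothetical batch processing times: 2 s, 12 s, 3 s.
durations = [2.0, 12.0, 3.0]

default_starts = batch_start_times(durations)               # default trigger
fixed_starts = batch_start_times(durations, interval=10.0)  # 10-second interval

print("default trigger:", default_starts)
print("fixed interval: ", fixed_starts)
```

In this model the second batch overruns the 10-second interval, so the third batch starts as soon as it completes rather than waiting for another interval to pass.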

Re: Missing module spark-hadoop-cloud in Maven central

2021-05-31 Thread Sean Owen
I know it's not enabled by default when the binary artifacts are built, but I'm not exactly sure why it's not built separately at all. It's almost a dependencies-only pom artifact, but there are two source files. Steve, do you have an angle on that? On Mon, May 31, 2021 at 5:37 AM Erik Torres wrote:

Missing module spark-hadoop-cloud in Maven central

2021-05-31 Thread Erik Torres
Hi, I'm following this documentation to configure my Spark-based application to interact with Amazon S3. However, I cannot find the spark-hadoop-cloud module in Maven central for the non-commercial distribution of Apache Spark. From the documentation I would expect that I can get this module