Re: Processing Multiple Streams in a Single Job

2021-08-27 Thread Sean Owen
That is something else. Yes, you can create a single, complex streaming job that joins different data sources, etc. That is no different from any other Spark usage. What are you looking for w.r.t. docs? We are also saying you can simply run N unrelated streaming jobs in parallel on the driver.

Re: Processing Multiple Streams in a Single Job

2021-08-27 Thread Artemis User
Thanks Mich.  I understand now how to deal with multiple streams in a single job, but the responses I got before were very abstract and confusing.  So I had to go back to the Spark doc and figure out the details.  This is what I found out: 1. The standard and recommended way to do multi-stream

Re: Processing Multiple Streams in a Single Job

2021-08-26 Thread Mich Talebzadeh
Hi ND, Within the same Spark job you can handle two topics simultaneously with SSS (Spark Structured Streaming). Is that what you are implying? HTH

Re: Processing Multiple Streams in a Single Job

2021-08-25 Thread Sean Owen
This part isn't Spark specific, just a matter of running code in parallel on the driver (code that happens to start streaming jobs). In Scala it's things like .par collections; in Python it's something like multiprocessing.
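Sean's point is not Spark-specific, so it can be sketched with only the standard library. The `start_stream` function below is a hypothetical stand-in for code that would launch and block on a streaming job; this pattern matters mainly when the launch calls themselves block:

```python
# Sketch: launch several long-running jobs from one driver process using
# a thread pool. start_stream is an illustrative placeholder, not a real
# Spark API.
from concurrent.futures import ThreadPoolExecutor

def start_stream(name):
    # In a real driver this would build and start a streaming query,
    # then block on its termination.
    return f"{name} started"

streams = ["orders", "clicks", "logs"]
with ThreadPoolExecutor(max_workers=len(streams)) as pool:
    results = list(pool.map(start_stream, streams))

print(results)  # each stream launched from its own thread
```

With Structured Streaming specifically, `writeStream.start()` already returns immediately, so this extra parallelism is only needed for blocking launch code (e.g. DStream-style jobs).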

Re: Processing Multiple Streams in a Single Job

2021-08-25 Thread Artemis User
Thanks Sean.  Excuse my ignorance, but I just can't figure out how to create a collection across multiple streams using multiple stream readers.  Could you provide some examples or additional references? Thanks!

Re: Processing Multiple Streams in a Single Job

2021-08-24 Thread Gourav Sengupta
Hi, can you please give more details around this? What is the requirement? What is the Spark version you are using? What do you mean by multiple sources? What are these sources? Regards, Gourav Sengupta

Re: Processing Multiple Streams in a Single Job

2021-08-24 Thread Sean Owen
No, that applies to the streaming DataFrame API too. No, jobs can't communicate with each other.

Re: Processing Multiple Streams in a Single Job

2021-08-24 Thread Artemis User
Thanks Daniel.  I guess you were suggesting using DStream/RDD. Would it be possible to use structured streaming/DataFrames for multi-source streaming?  In addition, we really need each stream's data ingestion to be asynchronous or non-blocking...  thanks!

Processing Multiple Streams in a Single Job

2021-08-24 Thread Artemis User
Is there a way to run multiple streams in a single Spark job using Structured Streaming?  If not, is there an easy way to perform inter-job communication (e.g. referencing a DataFrame among concurrent jobs) in Spark?  Thanks a lot in advance! -- ND