Thanks TD.

On Tue, Mar 14, 2017 at 4:37 PM, Tathagata Das <t...@databricks.com> wrote:

> This setting allows the multiple Spark jobs generated through multiple
> foreachRDD output operations to run concurrently, even across batches. So
> output op 2 of batch X can run concurrently with output op 1 of batch X+1.
> This is not safe because it breaks the checkpointing logic in subtle ways.
> Note that this setting was never documented in the official Spark docs.
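>
> As a rough illustration (a sketch made up for this thread, not code from
> any real app; the source and sinks are placeholders, and the app being
> discussed reads from Kafka instead):
>
>     // Illustrative sketch only: two output ops plus the setting discussed.
>     import org.apache.spark.SparkConf
>     import org.apache.spark.streaming.{Seconds, StreamingContext}
>
>     object ConcurrentJobsSketch {
>       def main(args: Array[String]): Unit = {
>         val conf = new SparkConf()
>           .setMaster("local[4]")
>           .setAppName("concurrent-jobs-sketch")
>           // The undocumented setting in question; the default is 1.
>           .set("spark.streaming.concurrentJobs", "2")
>         val ssc = new StreamingContext(conf, Seconds(10))
>
>         // Stand-in source for the sketch.
>         val lines = ssc.socketTextStream("localhost", 9999)
>
>         // Each foreachRDD is an output operation, i.e. one Spark job per
>         // batch. With concurrentJobs = 2, output op 2 of batch X can still
>         // be running when output op 1 of batch X+1 starts.
>         lines.foreachRDD { rdd => rdd.foreach(_ => ()) } // op 1: sink A
>         lines.foreachRDD { rdd => rdd.foreach(_ => ()) } // op 2: sink B
>
>         ssc.start()
>         ssc.awaitTermination()
>       }
>     }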
>
> On Tue, Mar 14, 2017 at 2:29 PM, shyla deshpande
> <deshpandesh...@gmail.com> wrote:
>
>> Thanks TD for the response. Can you please provide more explanation? I
>> have multiple streams in my Spark Streaming application (Spark 2.0.2,
>> using DStreams). I know many people who use this setting, so your
>> explanation will help a lot of people.
>>
>> Thanks
>>
>> On Fri, Mar 10, 2017 at 6:24 PM, Tathagata Das <t...@databricks.com>
>> wrote:
>>
>>> That config is not safe. Please do not use it.
>>>
>>> On Mar 10, 2017 10:03 AM, "shyla deshpande" <deshpandesh...@gmail.com>
>>> wrote:
>>>
>>>> I have a Spark Streaming application that processes 3 Kafka streams
>>>> and has 5 output operations.
>>>>
>>>> I am not sure what the setting for spark.streaming.concurrentJobs
>>>> should be.
>>>>
>>>> 1. If the concurrentJobs setting is 4, does that mean 2 output
>>>> operations will run sequentially?
>>>>
>>>> 2. If I had 6 cores, what would be an ideal setting for concurrentJobs
>>>> in this situation?
>>>>
>>>> I appreciate your input. Thanks
>>>>
>>>
>>
>
