Hi Rishitesh,

We are not using any RDDs to parallelize the processing; the entire algorithm runs on a single core (and in a single thread). The parallelism is done at the user level.
The disk I/O could be started in a separate thread, but then the executor would not be able to take up more jobs, since that is how I believe Spark is designed by default.

On Sat, Aug 22, 2015 at 12:51 AM, Rishitesh Mishra <rishi80.mis...@gmail.com> wrote:

> Hi Sateesh,
> It is interesting to know how you determined that the DStream runs on
> a single core. Did you mean receivers?
>
> Coming back to your question, could you not start the disk I/O in a separate
> thread, so that the scheduler can go ahead and assign other tasks?
> On 21 Aug 2015 16:06, "Sateesh Kavuri" <sateesh.kav...@gmail.com> wrote:
>
>> Hi,
>>
>> My scenario goes like this:
>> I have an algorithm running in Spark Streaming mode on a 4-core virtual
>> machine. The majority of the time, the algorithm does disk I/O and database
>> I/O. The question is: during the I/O, when the CPU is not heavily loaded,
>> is it possible to run any other task/thread so as to efficiently utilize
>> the CPU?
>>
>> Note that one DStream of the algorithm runs completely on a single CPU.
>>
>> Thank you,
>> Sateesh
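[Editor's note: Rishitesh's suggestion of starting the blocking I/O in its own thread, so the main thread can keep doing useful work, can be sketched independently of Spark. Below is a minimal Python sketch using a thread pool; the names `load_record`, `process`, and `run_partition` are hypothetical stand-ins for the poster's disk/DB reads and CPU-bound algorithm, not anything from this thread.]

```python
# Sketch: overlap blocking disk/database I/O with CPU work inside a
# single task by submitting the I/O to a thread pool. While one read
# is waiting on the disk, the pool runs other reads concurrently and
# the main thread consumes results as they complete.
from concurrent.futures import ThreadPoolExecutor

def load_record(record_id):
    # Hypothetical stand-in for a blocking disk or database read.
    return record_id * 2

def process(value):
    # Hypothetical stand-in for the CPU-bound part of the algorithm.
    return value + 1

def run_partition(record_ids):
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Kick off all the blocking reads up front...
        futures = [pool.submit(load_record, i) for i in record_ids]
        # ...then do the CPU work on the main thread as results arrive.
        return [process(f.result()) for f in futures]

print(run_partition([1, 2, 3]))  # → [3, 5, 7]
```

Note the trade-off raised above still applies: as long as the Spark task itself does not return, the executor slot stays occupied regardless of how the I/O is threaded internally; the pool only keeps the core busy within that one task.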