Can anyone else answer the questions below on performance tuning for
Structured Streaming?
@Jacek?

On Sun, May 3, 2020 at 12:07 AM Srinivas V <srini....@gmail.com> wrote:

> Hi Alex, I read the book. It is a good one, but I don’t see the things I
> most want to understand.
> You are right on the partition and tasks.
> 1. How do I use coalesce with Spark Structured Streaming?
>
> I also want to ask a few more questions:
> 2. How do I restrict the number of executors in Structured Streaming?
> Is --num-executors the minimum?
> To cap the maximum, can I use spark.dynamicAllocation.maxExecutors?
>
> 3. Do other streaming properties also apply to Structured Streaming,
> such as spark.streaming.dynamicAllocation.enabled?
> If not, which ones does it take into consideration?
>
> 4. Does Structured Streaming 2.4.5 allow dynamic allocation of executors/
> cores? In the case of a Kafka consumer, when the cluster has to scale down,
> does it reconfigure the mapping of executor cores to Kafka partitions?
>
> 5. Why is the Spark Structured Streaming web UI (SQL tab) not as informative
> as the Streaming tab of Spark Streaming?
>
> It would be great if these questions were answered; otherwise the only
> option left would be to go through the Spark code and figure it out.
>
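
On questions 2 and 3, a minimal sketch of the settings involved, with the caveat that this reflects my reading of the Spark configuration docs rather than anything confirmed in this thread. As I understand it, `--num-executors` gives a fixed executor count when dynamic allocation is off, while `spark.dynamicAllocation.minExecutors` and `spark.dynamicAllocation.maxExecutors` bound the count when it is on. `spark.streaming.dynamicAllocation.enabled` belongs, as far as I know, to the older DStream API. The helper `to_submit_args` and the numeric values are hypothetical, purely for illustration.

```python
# Hypothetical helper: renders Spark properties as spark-submit --conf flags.
def to_submit_args(conf):
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args

# Assumed values for illustration only; tune them for your own cluster.
dynamic_allocation_conf = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "2",   # lower bound
    "spark.dynamicAllocation.maxExecutors": "10",  # upper bound (the cap)
    # Spark 2.4 generally needs the external shuffle service for this.
    "spark.shuffle.service.enabled": "true",
}

submit_args = to_submit_args(dynamic_allocation_conf)
print("spark-submit " + " ".join(submit_args) + " ...")
```

Whether the core dynamic allocation heuristics suit a long-running streaming query is a separate matter that this sketch does not settle.
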
> On Sat, Apr 18, 2020 at 1:09 PM Alex Ott <alex...@gmail.com> wrote:
>
>> Just to clarify - I didn't write this explicitly in my answer. When you're
>> working with Kafka, every partition in Kafka is mapped to a Spark
>> partition. And in Spark, every partition is mapped to a task. But you can
>> use `coalesce` to decrease the number of Spark partitions, so you'll have
>> fewer tasks...
>>
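
To make the partition-to-task mapping above concrete, here is a rough sizing sketch for the 5 topics with 100 partitions each from the original question. It assumes one Spark task per Kafka partition per micro-batch and one core per task; `plan_tasks`, the 50-core cluster, and the `coalesce_to` value are hypothetical names and numbers, not something from the thread.

```python
import math

def plan_tasks(topics, partitions_per_topic, total_cores, coalesce_to=None):
    """Estimate tasks per micro-batch and the scheduling waves they need."""
    kafka_partitions = topics * partitions_per_topic
    # Assumption: one Spark partition (and so one task) per Kafka partition.
    tasks = kafka_partitions
    if coalesce_to is not None:
        # coalesce can only reduce the number of partitions, never raise it.
        tasks = min(tasks, coalesce_to)
    # Assumption: each task occupies one core, so tasks run in waves.
    waves = math.ceil(tasks / total_cores)
    return tasks, waves

print(plan_tasks(5, 100, total_cores=50))                   # (500, 10)
print(plan_tasks(5, 100, total_cores=50, coalesce_to=100))  # (100, 2)
```

Coalescing does not reduce the amount of data read. Each remaining task simply covers several Kafka partitions, so there are fewer but larger tasks.
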
>> Srinivas V  at "Sat, 18 Apr 2020 10:32:33 +0530" wrote:
>>  SV> Thank you Alex. I will check it out and let you know if I have any
>>  SV> questions
>>
>>  SV> On Fri, Apr 17, 2020 at 11:36 PM Alex Ott <alex...@gmail.com> wrote:
>>
>>  SV>     http://shop.oreilly.com/product/0636920047568.do has quite good
>>  SV>     information on it.  For Kafka, you need to start with the
>>  SV>     approximation that processing each partition is a separate task
>>  SV>     that needs to be executed, so you need to plan the number of cores
>>  SV>     accordingly.
>>  SV>
>>  SV>     Srinivas V  at "Thu, 16 Apr 2020 22:49:15 +0530" wrote:
>>  SV>      SV> Hello,
>>  SV>      SV> Can someone point me to a good video or document that talks
>>  SV>      SV> about performance tuning for a Structured Streaming app?
>>  SV>      SV> I am looking especially at listening to Kafka topics, say 5
>>  SV>      SV> topics, each with 100 partitions.
>>  SV>      SV> I am trying to figure out the best cluster size and the number
>>  SV>      SV> of executors and cores required.
>>
>>
>> --
>> With best wishes,                    Alex Ott
>> http://alexott.net/
>> Twitter: alexott_en (English), alexott (Russian)
>>
>
