Hi,

Thanks for your response.


I understand that there is no explicit way to configure dynamic scaling for
Spark Structured Streaming, since the ticket for it is still open. But is
there a way to manage scaling with the existing batch dynamic allocation
algorithm, given that it kicks in when dynamic allocation is enabled with
Structured Streaming? The issue I'm facing with batch dynamic allocation is
that it requests executors based on the number of pending and running tasks.
To get parallelism we have set spark.sql.shuffle.partitions: "100", so each
shuffle stage creates 100 partitions and therefore 100 tasks, which causes
more executors to be requested regardless of the actual data load. Is there
a mechanism to control this autoscaling behaviour so that executors scale
with the data load?
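
To illustrate the behaviour, here is a rough Python sketch of how I understand
the batch algorithm to size its executor target. It is a simplification, not
the actual Spark code. The function name is hypothetical, and the values of 1
core per executor and 1 CPU per task are assumptions, since they are not part
of the configuration I posted.

```python
import math

def batch_target_executors(pending_tasks, running_tasks, allocation_ratio,
                           executor_cores, task_cpus,
                           min_executors, max_executors):
    """Approximate executor target of batch dynamic allocation (simplified)."""
    # How many tasks a single executor can run at the same time.
    tasks_per_executor = max(executor_cores // task_cpus, 1)
    # Executors needed to run the backlog, scaled by executorAllocationRatio.
    needed = math.ceil(
        (pending_tasks + running_tasks) * allocation_ratio / tasks_per_executor
    )
    # Clamp to the configured minExecutors / maxExecutors bounds.
    return max(min_executors, min(needed, max_executors))

# 100 shuffle partitions -> 100 tasks per shuffle stage, whatever the data volume.
# executor_cores=1 and task_cpus=1 are assumed values for illustration.
target = batch_target_executors(
    pending_tasks=100, running_tasks=0, allocation_ratio=0.5,
    executor_cores=1, task_cpus=1, min_executors=1, max_executors=5,
)
print(target)
```

Under these assumptions the uncapped demand is 50 executors, so the target
sits at maxExecutors (5) for every micro-batch with a shuffle, even when the
batch contains very little data.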


Additionally, the Spark Streaming dynamic allocation algorithm autoscales
executors based on the ratio of processing time to batch interval, which
would be the preferred method for a streaming use case. Is there a provision
to use the streaming configurations instead of the batch mode configurations
with Structured Streaming?
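
For reference, this is roughly the decision I would like to get, sketched in
Python from my understanding of the Spark Streaming algorithm. The function
name and the return values are hypothetical, and the exact number of executors
requested on scale-up is an assumption on my part rather than confirmed
behaviour.

```python
import math

def streaming_scaling_decision(avg_processing_ms, batch_interval_ms,
                               scale_up_ratio, scale_down_ratio):
    """Return (action, executor_count) from the processing/interval ratio."""
    ratio = avg_processing_ms / batch_interval_ms
    if ratio >= scale_up_ratio:
        # Batches take too long relative to the interval: ask for more executors.
        # Assumed sizing: the ratio rounded half up, with a minimum of one.
        return ("scale_up", max(math.floor(ratio + 0.5), 1))
    if ratio <= scale_down_ratio:
        # Batches finish quickly: release one executor.
        return ("scale_down", 1)
    # Ratio is between the two thresholds: leave the executor count unchanged.
    return ("hold", 0)

# Thresholds taken from the streaming configuration I tried (0.7 and 0.2);
# the timings are made-up example values.
print(streaming_scaling_decision(9000, 10000, 0.7, 0.2))
print(streaming_scaling_decision(1000, 10000, 0.7, 0.2))
```

The point is that the decision depends only on how long batches take relative
to the interval, not on the number of tasks, so a fixed partition count would
not inflate the executor count by itself.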


Any suggestions on the above would be helpful.


Thanks and Regards,

Aishwarya


On Thu, 25 May, 2023, 11:46 PM Mich Talebzadeh, <mich.talebza...@gmail.com>
wrote:

> Hi,
> Autoscaling is not compatible with Spark Structured Streaming
> <https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html>
> since Spark Structured Streaming currently does not support dynamic allocation
> (see SPARK-24815: Structured Streaming should support dynamic allocation
> <https://issues.apache.org/jira/browse/SPARK-24815>).
>
> That ticket is still open.
>
> HTH
>
> Mich Talebzadeh,
> Lead Solutions Architect/Engineering Lead
> Palantir Technologies Limited
> London
> United Kingdom
>
>
>    view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Thu, 25 May 2023 at 18:44, Aishwarya Panicker <
> aishwaryapanicke...@gmail.com> wrote:
>
>> Hi Team,
>>
>> I have been working on Spark Structured Streaming and trying to autoscale
>> our application through dynamic allocation. But I couldn't find any
>> documentation or configurations that support dynamic scaling in Spark
>> Structured Streaming, so I have been using Spark batch mode dynamic
>> scaling, which is not very efficient for a streaming use case.
>>
>> I also tried the Spark Streaming dynamic allocation configurations, which
>> didn't work with Structured Streaming.
>>
>> Below are the configurations I tried for dynamic scaling of my Spark
>> Structured Streaming Application:
>>
>> With Batch Spark configurations:
>>
>> spark.dynamicAllocation.enabled: true
>> spark.dynamicAllocation.executorAllocationRatio: 0.5
>> spark.dynamicAllocation.minExecutors: 1
>> spark.dynamicAllocation.maxExecutors: 5
>>
>>
>> With Streaming Spark configurations:
>>
>> spark.dynamicAllocation.enabled: false
>> spark.streaming.dynamicAllocation.enabled: true
>> spark.streaming.dynamicAllocation.scaleUpRatio: 0.7
>> spark.streaming.dynamicAllocation.scaleDownRatio: 0.2
>> spark.streaming.dynamicAllocation.minExecutors: 1
>> spark.streaming.dynamicAllocation.maxExecutors: 5
>>
>> Kindly let me know if there is any configuration for dynamic allocation
>> in Spark Structured Streaming that I'm missing and that would explain why
>> autoscaling of my application is not working properly.
>>
>> Awaiting your response.
>>
>> Thanks and Regards,
>> Aishwarya
>>
>>
>>
>>
>>
