I have not seen stack traces under autoscaling, so I'm not even sure what the
error in question is.
There is always a delay in acquiring a whole new executor in the cloud, as it
usually means a new VM is provisioned.
Once it registers, Spark treats the new executor like any other, available
for executing tasks.
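
For what it's worth, the only part of that delay Spark controls is when it
asks for more executors; the VM provisioning itself is outside Spark. A
minimal PySpark sketch of the relevant knobs (the values shown are the
defaults, purely for illustration):

    from pyspark.sql import SparkSession

    # Spark requests new executors once tasks have been pending for
    # schedulerBacklogTimeout, then ramps up the request every
    # sustainedSchedulerBacklogTimeout while the backlog persists.
    spark = (
        SparkSession.builder
        .config("spark.dynamicAllocation.enabled", "true")
        .config("spark.shuffle.service.enabled", "true")
        .config("spark.dynamicAllocation.schedulerBacklogTimeout", "1s")
        .config("spark.dynamicAllocation.sustainedSchedulerBacklogTimeout", "1s")
        # Idle executors are released after this long with no tasks.
        .config("spark.dynamicAllocation.executorIdleTimeout", "60s")
        .getOrCreate()
    )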

On Fri, Feb 4, 2022 at 4:28 AM Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Thanks for the info.
>
> My concern has always been with how Spark handles autoscaling (adding new
> executors) when the load pattern changes. I have tried to test this by
> setting the following parameters (Spark 3.1.2 on GCP):
>
>         spark-submit --verbose \
>         .......
>           --conf spark.dynamicAllocation.enabled="true" \
>           --conf spark.shuffle.service.enabled="true" \
>           --conf spark.dynamicAllocation.minExecutors=2 \
>           --conf spark.dynamicAllocation.maxExecutors=10 \
>           --conf spark.dynamicAllocation.initialExecutors=4 \
>
> It is not very clear to me how Spark distributes tasks to the added
> executors, or what the source of the delay is. As you have observed, there
> is a delay in adding new resources and allocating tasks. Is that process
> efficient?
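>
> One rough way to watch this (the driver host/port and the app id below are
> placeholders) is to poll the monitoring REST API's executors endpoint and
> see when new executors register and whether they pick up tasks:
>
>         import time, requests
>
>         # executors endpoint of the driver UI; substitute the real app id
>         url = "http://localhost:4040/api/v1/applications/<app-id>/executors"
>         for _ in range(30):
>             for e in requests.get(url).json():
>                 print(e["id"], e["addTime"], e["activeTasks"], e["totalTasks"])
>             time.sleep(10)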
>
> Thanks
>
>    view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Fri, 4 Feb 2022 at 03:04, Maksim Grinman <m...@resolute.ai> wrote:
>
>> It's actually on AWS EMR. The job bootstraps and runs fine -- the
>> autoscaling group is there to bring up a service that Spark will be
>> calling. Some code waits for the autoscaling group to come up before
>> continuing processing in Spark, since the Spark cluster will need to make
>> requests to the service in the autoscaling group. It takes several minutes
>> for the service to come up, and during the wait Spark starts to show these
>> thread dumps, presumably because it thinks something is wrong since the
>> executor is busy waiting and not doing anything. The previous version of
>> Spark (2.4.4) did not do this.
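>>
>> Roughly, the wait is a polling loop like the one below (the health-check
>> endpoint, names, and timings are made up for illustration):
>>
>>         import time, requests
>>
>>         def wait_for_service(url, timeout_s=600, poll_s=15):
>>             """Block until the service behind the autoscaling group answers."""
>>             deadline = time.time() + timeout_s
>>             while time.time() < deadline:
>>                 try:
>>                     if requests.get(url, timeout=5).ok:
>>                         return
>>                 except requests.ConnectionError:
>>                     pass
>>                 time.sleep(poll_s)
>>             raise TimeoutError(f"service at {url} never came up")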
>>
>> On Thu, Feb 3, 2022 at 6:59 PM Mich Talebzadeh <mich.talebza...@gmail.com>
>> wrote:
>>
>>> Sounds like you are running this on a Google Dataproc cluster (Spark
>>> 3.1.2) with an autoscaling policy?
>>>
>>> Can you describe whether this happens before Spark starts a new job on
>>> the cluster, or somewhere halfway through processing an existing job?
>>>
>>> Also, does the job involve Spark Structured Streaming?
>>>
>>> HTH
>>>
>>>
>>>
>>>    view my Linkedin profile
>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>
>>>
>>>
>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>>> any loss, damage or destruction of data or any other property which may
>>> arise from relying on this email's technical content is explicitly
>>> disclaimed. The author will in no case be liable for any monetary damages
>>> arising from such loss, damage or destruction.
>>>
>>>
>>>
>>>
>>> On Thu, 3 Feb 2022 at 21:29, Maksim Grinman <m...@resolute.ai> wrote:
>>>
>>>> We've got a Spark task that, after some processing, starts an
>>>> autoscaling group and waits for it to be up before continuing processing.
>>>> While waiting for the autoscaling group, Spark starts throwing full thread
>>>> dumps, presumably at the spark.executor.heartbeatInterval. Is there a way
>>>> to prevent the thread dumps?
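>>>>
>>>> As a rough check of that presumption, stretching the heartbeat interval
>>>> should stretch the dump cadence with it (it would not suppress the
>>>> dumps):
>>>>
>>>>         from pyspark.sql import SparkSession
>>>>
>>>>         # default is 10s; keep it well below spark.network.timeout (120s)
>>>>         spark = (SparkSession.builder
>>>>                  .config("spark.executor.heartbeatInterval", "60s")
>>>>                  .getOrCreate())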
>>>>
>>>> --
>>>> Maksim Grinman
>>>> VP Engineering
>>>> Resolute AI
>>>>
>>>
>>
>> --
>> Maksim Grinman
>> VP Engineering
>> Resolute AI
>>
>
