OK, your findings do not imply that those settings are incorrect. They
will work if you set up your k8s cluster in peer-to-peer mode with an equal
amount of RAM on each node, which is common practice.
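
For reference, a minimal sketch of switching between the two toPandas() conversion paths discussed in this thread. The helper names (set_arrow_conversion, to_pandas) are hypothetical and only for illustration; the sketch assumes you already have a SparkSession `spark` and a DataFrame `df`, and it uses nothing beyond spark.conf.set() and DataFrame.toPandas().

```python
# Hypothetical helpers, not part of PySpark. They only wrap
# spark.conf.set() and DataFrame.toPandas().

ARROW_ENABLED = "spark.sql.execution.arrow.pyspark.enabled"
ARROW_FALLBACK = "spark.sql.execution.arrow.pyspark.fallback.enabled"


def set_arrow_conversion(spark, enabled, fallback=True):
    """Turn Arrow-based conversion on or off for an existing session."""
    # Config values are passed as the strings "true" / "false".
    spark.conf.set(ARROW_ENABLED, str(enabled).lower())
    spark.conf.set(ARROW_FALLBACK, str(fallback).lower())


def to_pandas(spark, df, use_arrow):
    """Convert df to pandas with Arrow explicitly enabled or disabled."""
    set_arrow_conversion(spark, use_arrow)
    return df.toPandas()
```

With a real session, to_pandas(spark, df, use_arrow=False) should take the collect()-based path quoted below, while use_arrow=True takes the Arrow path, so the two can be compared on the same DataFrame.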

HTH



   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Thu, 11 Nov 2021 at 21:39, Sergey Ivanychev <sergeyivanyc...@gmail.com>
wrote:

> Yes, in fact those are the settings that cause this behaviour. If they are
> set to false, everything works fine, since the implementation in the Spark
> sources in that case is
>
> pdf = pd.DataFrame.from_records(self.collect(), columns=self.columns)
>
> Best regards,
>
>
> Sergey Ivanychev
>
> On 11 Nov 2021, at 13:58, Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
> 
> Have you tried the following settings:
>
> spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")
> spark.conf.set("spark.sql.execution.arrow.pyspark.fallback.enabled", "true")
>
> HTH
>
>
> On Thu, 4 Nov 2021 at 18:06, Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> OK, so it boils down to how Spark creates the toPandas() DataFrame under
>> the bonnet. How many executors are involved in the k8s cluster? In this
>> model Spark will create executors = number of nodes - 1.
>>
>> On Thu, 4 Nov 2021 at 17:42, Sergey Ivanychev <sergeyivanyc...@gmail.com>
>> wrote:
>>
>>> > Just to confirm: with collect() alone, is this all on the driver?
>>>
>>> I shared the screenshot with the plan in the first email. In the
>>> collect() case the data gets fetched to the driver without problems.
>>>
>>> Best regards,
>>>
>>>
>>> Sergey Ivanychev
>>>
>>> On 4 Nov 2021, at 20:37, Mich Talebzadeh <mich.talebza...@gmail.com>
>>> wrote:
>>>
>>> Just to confirm: with collect() alone, is this all on the driver?
>>>
>
