Can you paste the code? How much memory does your system have, and how big
is your dataset? Did you try df.persist(StorageLevel.MEMORY_AND_DISK)?
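
If memory is the bottleneck, a disk-backed storage level usually helps.
A minimal sketch (assuming an existing SQLContext called sqlContext and
an RDD called rdd; both are placeholders for whatever you have):

    from pyspark import StorageLevel

    df = sqlContext.createDataFrame(rdd)
    # MEMORY_AND_DISK keeps partitions in memory when they fit and
    # spills the rest to disk instead of failing with an OOM
    df.persist(StorageLevel.MEMORY_AND_DISK)
    df.count()  # run an action so the persist actually materializes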

Thanks
Best Regards

On Fri, Jul 17, 2015 at 5:14 PM, Harit Vishwakarma <
harit.vishwaka...@gmail.com> wrote:

> Thanks,
> The code is running on a single machine.
> And that still doesn't answer my question.
>
> On Fri, Jul 17, 2015 at 4:52 PM, ayan guha <guha.a...@gmail.com> wrote:
>
>> You can bump up the number of partitions while creating the RDD you are
>> using for the df, e.g.:
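>>
>> Something like this (just a sketch; assuming your SparkContext is
>> called sc and your data is in a Python list called data):
>>
>>     # 100 partitions instead of the default; tune to your data size
>>     rdd = sc.parallelize(data, 100)
>>     df = sqlContext.createDataFrame(rdd)
>>
>> More partitions mean smaller chunks per task, so less memory is needed
>> at any one time.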
>> On 17 Jul 2015 21:03, "Harit Vishwakarma" <harit.vishwaka...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I used the createDataFrame API of SQLContext in Python and am getting
>>> an OutOfMemoryException. I am wondering if it is creating the whole
>>> DataFrame in memory?
>>> I did not find any documentation describing the memory usage of Spark
>>> APIs. The documentation given is nice, but a little more detail
>>> (especially on memory usage, data distribution, etc.) would really help.
>>>
>>> --
>>> Regards
>>> Harit Vishwakarma
>>>
>>>
>
>
> --
> Regards
> Harit Vishwakarma
>
>
