Re: Spark APIs memory usage?

2015-07-18 Thread Harit Vishwakarma
Best Regards

On Fri, Jul 17, 2015 at 5:46 PM, Harit Vishwakarma harit.vishwaka...@gmail.com wrote:

1. load 3 matrices of size ~ 1 X 1 using numpy.
2. rdd2 = rdd1.values().flatMap( fun ) # rdd1 has roughly 10^7 tuples
3. df = sqlCtx.createDataFrame(rdd2)
4. df.save() # in parquet
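The four steps quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's actual code: the body of `fun` never appears in the thread, so the version here is a hypothetical stand-in; `dense_matrix_bytes`, `run_pipeline` and `out_path` are names invented for this sketch; and `df.write.parquet(...)` is used on the assumption of Spark 1.4 or later, where it replaces the older `df.save()`. The matrix dimensions are not legible in the archived snippet, so none are assumed.

```python
def dense_matrix_bytes(rows, cols, itemsize=8):
    """Bytes held by one dense rows x cols array (8-byte float64 by default).

    Hypothetical helper: a quick way to size the numpy matrices from step 1
    before they are loaded next to the Spark job.
    """
    return rows * cols * itemsize


def fun(value):
    """Hypothetical stand-in for the thread's `fun`.

    Written as a generator so flatMap can consume the output rows one at a
    time instead of receiving one large list per input record.
    """
    for i, x in enumerate(value):
        yield (i, float(x))


def run_pipeline(rdd1, sql_ctx, out_path):
    """Steps 2-4 from the message; `rdd1` is a pair RDD, `sql_ctx` a SQLContext."""
    rdd2 = rdd1.values().flatMap(fun)    # step 2: expand each value into rows
    df = sql_ctx.createDataFrame(rdd2)   # step 3: schema inferred from the rows
    df.write.parquet(out_path)           # step 4: assumes Spark >= 1.4
    return df
```

For example, `dense_matrix_bytes(1000, 1000)` gives 8,000,000 bytes for a single 1000 x 1000 float64 matrix, which shows how quickly a few large dense matrices can account for a sizeable share of the memory on a single machine.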

Re: Spark APIs memory usage?

2015-07-17 Thread Harit Vishwakarma
(StorageLevel.MEMORY_AND_DISK)? Thanks

Best Regards

On Fri, Jul 17, 2015 at 5:14 PM, Harit Vishwakarma harit.vishwaka...@gmail.com wrote:

Thanks. The code is running on a single machine, and that still doesn't answer my question.

On Fri, Jul 17, 2015 at 4:52 PM, ayan guha guha.a...@gmail.com wrote:

You can

Re: Spark APIs memory usage?

2015-07-17 Thread Harit Vishwakarma
Thanks. The code is running on a single machine, and that still doesn't answer my question.

On Fri, Jul 17, 2015 at 4:52 PM, ayan guha guha.a...@gmail.com wrote:

You can bump up the number of partitions while creating the RDD you are using for the DataFrame.

On 17 Jul 2015 21:03, Harit Vishwakarma harit.vishwaka
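The suggestion in this message to raise the partition count could look something like the sketch below. `with_more_partitions` and its `factor` argument are hypothetical names introduced here; `getNumPartitions()` and `repartition()` are existing PySpark RDD methods. Whether this helps with the memory problem in the thread is not established by the messages shown.

```python
def with_more_partitions(rdd, factor=4):
    """Return `rdd` spread over `factor` times as many partitions.

    Smaller partitions mean each task holds fewer records at once;
    note that repartition() triggers a full shuffle.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    current = rdd.getNumPartitions()
    return rdd.repartition(current * factor)
```

When the RDD is built from a local collection, the partition count can also be set up front, for example `sc.parallelize(data, numSlices=64)`, which avoids the extra shuffle.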

Spark APIs memory usage?

2015-07-17 Thread Harit Vishwakarma
usage/data distribution etc.) will really help.

--
Regards
Harit Vishwakarma