Maximum memory limits

2014-03-16 Thread Debasish Das
Hi, I gave my Spark job 16 GB of memory and it is running on 8 executors. The job needs more memory due to ALS requirements (20M x 1M matrix). Each node has 96 GB of memory, of which I am using 16 GB. I want to increase the memory, but I am not sure what the right way to do that is...
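
For a rough sense of scale, the dense factor matrices that ALS produces for a 20M x 1M input can be sized with simple arithmetic. The sketch below is an illustration, not a description of how MLlib allocates memory: the function name `als_factor_gb`, the ranks tried, and the 8-byte double-precision storage are all assumptions, since the thread states none of them.

```python
def als_factor_gb(num_rows, num_cols, rank, bytes_per_value=8):
    """Size in GB (1e9 bytes) of two dense factor matrices stored as doubles."""
    # One rank-length vector per row entity and one per column entity.
    total_values = (num_rows + num_cols) * rank
    return total_values * bytes_per_value / 1e9


NUM_ROWS = 20_000_000  # from the 20M x 1M matrix in the post
NUM_COLS = 1_000_000

for rank in (10, 50, 100):  # hypothetical ranks, not given in the thread
    print(f"rank={rank}: ~{als_factor_gb(NUM_ROWS, NUM_COLS, rank):.1f} GB")
```

This counts only the raw factor values. The ratings data, intermediate blocks, shuffle buffers, and JVM object overhead all come on top of it, so actual usage will be higher.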

Re: Maximum memory limits

2014-03-16 Thread Sean Owen
Are you using HEAD or 0.9.0? A memory issue that made ALS need far more memory than necessary was fixed a few weeks ago: https://github.com/apache/incubator-spark/pull/629 Try the latest code. -- Sean Owen | Director, Data Science | London

Re: Maximum memory limits

2014-03-16 Thread Debasish Das
Thanks, Sean. Let me get the latest code. Do you know which PR it was? But will the executors run fine with, say, 32 GB or 64 GB of memory? Doesn't the JVM run into issues when the max memory goes beyond a certain limit? Also, the failure is due to GC limits from jblas... and I was thinking that jblas

Re: Maximum memory limits

2014-03-16 Thread Sean Owen
You should simply use a snapshot built from HEAD of github.com/apache/spark if you can. The key change is in MLlib, and with any luck you can just replace that bit. See the PR I referenced. Sure, with enough memory you can get it to run even with the memory issue, but it could be hundreds of GB at yo

Re: Maximum memory limits

2014-03-16 Thread Patrick Wendell
Sean - was this merged into the 0.9 branch as well? (It seems so, based on the message from rxin.) If so, it might make sense to try out the head of branch-0.9 as well, unless there are *also* other changes relevant to this in master. - Patrick

Re: Maximum memory limits

2014-03-16 Thread Sean Owen
Good point -- there's been another optimization for ALS in HEAD (https://github.com/apache/spark/pull/131), but yes, the better place to pick up just the essential changes since 0.9.0, including the previous one, is the 0.9 branch. -- Sean Owen | Director, Data Science | London