Re: ALS.trainImplicit running out of mem when using higher rank

Antony Mayi Sun, 11 Jan 2015 01:49:49 -0800

the question really is whether this is expected that the memory requirements 
grow rapidly with the rank... as I would expect memory is rather O(1) problem 
with dependency only on the size of input data.
if this is expected is there any rough formula to determine the required memory 
based on ALS input and parameters?
thanks,Antony.


     On Saturday, 10 January 2015, 10:47, Antony Mayi <[email protected]> 
wrote:
   
 

 the actual case looks like this:* spark 1.1.0 on yarn (cdh 5.2.1)* ~8-10 
executors, 36GB phys RAM per host* input RDD is roughly 3GB containing 
~150-200M items (and this RDD is made persistent using .cache())* using pyspark
yarn is configured with the limit yarn.nodemanager.resource.memory-mb of 33792 
(33GB), spark is set to 
be:SPARK_EXECUTOR_CORES=6SPARK_EXECUTOR_INSTANCES=9SPARK_EXECUTOR_MEMORY=30G
when using higher rank (above 20) for ALS.trainImplicit the executor runs after 
some time (~hour) of execution out of the yarn limit and gets killed:
2015-01-09 17:51:27,130 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
 Container [pid=27125,containerID=container_1420871936411_0002_01_000023] is 
running beyond physical memory limits. Current usage: 31.2 GB of 31 GB physical 
memory used; 34.7 GB of 65.1 GB virtual memory used. Killing container.
thanks for any ideas,Antony.
 

     On Saturday, 10 January 2015, 10:11, Antony Mayi <[email protected]> 
wrote:
   
 

 the memory requirements seem to be rapidly growing hen using higher rank... I 
am unable to get over 20 without running out of memory. is this 
expected?thanks, Antony.

Re: ALS.trainImplicit running out of mem when using higher rank

Reply via email to