This JIRA issue probably addresses your problem:
https://spark-project.atlassian.net/browse/SPARK-1006

When running with a large number of iterations, the lineage DAG of ALS becomes
very deep, and both the DAGScheduler and the Java serializer may overflow the
stack because they are implemented recursively. You can resort to
checkpointing as a workaround: it truncates the lineage so neither component
has to walk the full chain.
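A minimal sketch (plain Python, not Spark) of why a recursively implemented serializer overflows on a deep dependency chain, and why truncating the chain, which is what checkpointing does to an RDD's lineage, avoids it. The chain depth and dict structure here are illustrative, not anything from Spark's internals:

```python
import pickle

# Each iteration adds one link to the chain, mimicking an RDD whose
# lineage grows by one stage per iteration of the algorithm.
def build_chain(depth):
    node = None
    for _ in range(depth):
        node = {"parent": node}
    return node

deep = build_chain(50000)

# A recursive serializer descends one stack frame per link, so a long
# enough chain exceeds the recursion limit.
try:
    pickle.dumps(deep)
    overflowed = False
except RecursionError:
    overflowed = True

# "Checkpointing" materializes the current state and drops the history,
# so the structure the serializer must walk stays shallow.
checkpointed = {"parent": None, "state": "materialized"}
data = pickle.dumps(checkpointed)  # serializes without trouble
```

In Spark itself the equivalent is to call `sc.setCheckpointDir(...)` once and then `rdd.checkpoint()` every few iterations of the loop, so the lineage is cut back to the checkpointed data instead of growing without bound.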


On Wed, Apr 16, 2014 at 5:29 AM, Xiaoli Li <lixiaolima...@gmail.com> wrote:

> Hi,
>
> I am testing ALS using 7 nodes. Each node has 4 cores and 8G memory. The ALS
> program cannot run even with a very small training set (about 91
> lines) due to a StackOverflowError when I set the number of iterations to
> 100. I think the problem may be caused by the updateFeatures method, which
> updates the products RDD iteratively by joining the previous products RDD.
>
>
> I am writing a program which has an update process similar to ALS's. This
> problem also appeared when I iterated too many times (more than 80).
>
> The iterative part of my code is as follows:
>
> solution = outlinks.join(solution).map {
>      .......
> }
>
>
> Has anyone had similar problem?  Thanks.
>
>
> Xiaoli
>
