All of these warnings come from the ALS iterations, from both flatMap and
aggregate. For instance, this is the origin of the stage where flatMap
triggers the warning (with Spark 1.3.0; it also shows up in Spark 1.3.1):

org.apache.spark.rdd.RDD.flatMap(RDD.scala:296)
org.apache.spark.ml.recommendation.ALS$.org$apache$spark$ml$recommendation$ALS$$computeFactors(ALS.scala:1065)
org.apache.spark.ml.recommendation.ALS$$anonfun$train$3.apply(ALS.scala:530)
org.apache.spark.ml.recommendation.ALS$$anonfun$train$3.apply(ALS.scala:527)
scala.collection.immutable.Range.foreach(Range.scala:141)
org.apache.spark.ml.recommendation.ALS$.train(ALS.scala:527)
org.apache.spark.mllib.recommendation.ALS.run(ALS.scala:203)

And from the aggregate:

org.apache.spark.rdd.RDD.aggregate(RDD.scala:968)
org.apache.spark.ml.recommendation.ALS$.computeYtY(ALS.scala:1112)
org.apache.spark.ml.recommendation.ALS$.org$apache$spark$ml$recommendation$ALS$$computeFactors(ALS.scala:1064)
org.apache.spark.ml.recommendation.ALS$$anonfun$train$3.apply(ALS.scala:538)
org.apache.spark.ml.recommendation.ALS$$anonfun$train$3.apply(ALS.scala:527)
scala.collection.immutable.Range.foreach(Range.scala:141)
org.apache.spark.ml.recommendation.ALS$.train(ALS.scala:527)
org.apache.spark.mllib.recommendation.ALS.run(ALS.scala:203)
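
In case it helps to track how the reported size changes from stage to stage, here is a minimal Python sketch (not part of Spark) that pulls the stage id and task size out of driver log lines. It assumes the log lines have the same format as the TaskSetManager warning quoted further down in this thread; the function name `task_sizes` and the sample log are only illustrative, and all sample lines except the Stage 246 one are made up.

```python
import re

# Pattern for the TaskSetManager warning quoted in this thread.
# Assumption: the stage id and the size in KB always appear in this form.
WARNING_RE = re.compile(
    r"WARN TaskSetManager: Stage (\d+) contains a task of very large size "
    r"\((\d+) KB\)"
)


def task_sizes(log_lines):
    """Return (stage_id, size_kb) pairs for every large-task warning found."""
    sizes = []
    for line in log_lines:
        match = WARNING_RE.search(line)
        if match:
            sizes.append((int(match.group(1)), int(match.group(2))))
    return sizes


# The first line is the warning from this thread; the other two lines are
# made-up values that only illustrate the parsing.
sample_log = [
    "WARN TaskSetManager: Stage 246 contains a task of very large size "
    "(208 KB). The maximum recommended task size is 100 KB.",
    "INFO DAGScheduler: some unrelated line",
    "WARN TaskSetManager: Stage 250 contains a task of very large size "
    "(215 KB). The maximum recommended task size is 100 KB.",
]

for stage, size_kb in task_sizes(sample_log):
    print(stage, size_kb)
```

Running the same function over a real driver log would list the affected stages in order, which makes it easier to see whether the size keeps growing across ALS iterations.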



On Thu, Apr 23, 2015 at 2:49 AM, Xiangrui Meng <men...@gmail.com> wrote:

> This is the size of the serialized task closure. Is stage 246 part of
> ALS iterations, or something before or after it? -Xiangrui
>
> On Tue, Apr 21, 2015 at 10:36 AM, Christian S. Perone
> <christian.per...@gmail.com> wrote:
> > Hi Sean, thanks for the answer. I tried to call repartition() on the
> > input with many different sizes and it still continues to show that
> > warning message.
> >
> > On Tue, Apr 21, 2015 at 7:05 AM, Sean Owen <so...@cloudera.com> wrote:
> >>
> >> I think maybe you need more partitions in your input, which might make
> >> for smaller tasks?
> >>
> >> On Tue, Apr 21, 2015 at 2:56 AM, Christian S. Perone
> >> <christian.per...@gmail.com> wrote:
> >> > I keep seeing these warnings when using trainImplicit:
> >> >
> >> > WARN TaskSetManager: Stage 246 contains a task of very large size
> >> > (208 KB). The maximum recommended task size is 100 KB.
> >> >
> >> > And then the task size starts to increase. Is this a known issue?
> >> >
> >> > Thanks!
> >> >
> >> > --
> >> > Blog | Github | Twitter
> >> > "Forgive, O Lord, my little jokes on Thee, and I'll forgive Thy great
> >> > big joke on me."
>



-- 
Blog <http://blog.christianperone.com> | Github <https://github.com/perone>
| Twitter <https://twitter.com/tarantulae>
"Forgive, O Lord, my little jokes on Thee, and I'll forgive Thy great big
joke on me."
