Could you try different ranks and see whether the task size changes? We do
use YtY in the closure, which should behave the same as a broadcast variable.
If the task size grows with the rank, the warning is coming from YtY and
should be safe to ignore.
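
For example, a quick way to compare (a sketch; "ratings" stands in for your
input RDD[Rating], and the ranks are arbitrary):

  import org.apache.spark.mllib.recommendation.ALS

  // Train at a few ranks and watch the logged task size. YtY is a
  // rank x rank Gramian captured in the task closure, so the serialized
  // tasks should grow roughly quadratically with the rank if YtY is
  // what the warning is about.
  for (rank <- Seq(10, 50, 100)) {
    val model = ALS.trainImplicit(ratings, rank, 10) // 10 iterations
  }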
-Xiangrui

On Thu, Apr 23, 2015 at 4:52 AM, Christian S. Perone
<christian.per...@gmail.com> wrote:
> All these warnings come from the ALS iterations, from flatMap and also from
> aggregate. For instance, here is the origin of the stage where the flatMap
> shows these warnings (with Spark 1.3.0; they are also shown in Spark 1.3.1):
>
> org.apache.spark.rdd.RDD.flatMap(RDD.scala:296)
> org.apache.spark.ml.recommendation.ALS$.org$apache$spark$ml$recommendation$ALS$$computeFactors(ALS.scala:1065)
> org.apache.spark.ml.recommendation.ALS$$anonfun$train$3.apply(ALS.scala:530)
> org.apache.spark.ml.recommendation.ALS$$anonfun$train$3.apply(ALS.scala:527)
> scala.collection.immutable.Range.foreach(Range.scala:141)
> org.apache.spark.ml.recommendation.ALS$.train(ALS.scala:527)
> org.apache.spark.mllib.recommendation.ALS.run(ALS.scala:203)
>
> And here is the trace from the aggregate:
>
> org.apache.spark.rdd.RDD.aggregate(RDD.scala:968)
> org.apache.spark.ml.recommendation.ALS$.computeYtY(ALS.scala:1112)
> org.apache.spark.ml.recommendation.ALS$.org$apache$spark$ml$recommendation$ALS$$computeFactors(ALS.scala:1064)
> org.apache.spark.ml.recommendation.ALS$$anonfun$train$3.apply(ALS.scala:538)
> org.apache.spark.ml.recommendation.ALS$$anonfun$train$3.apply(ALS.scala:527)
> scala.collection.immutable.Range.foreach(Range.scala:141)
> org.apache.spark.ml.recommendation.ALS$.train(ALS.scala:527)
> org.apache.spark.mllib.recommendation.ALS.run(ALS.scala:203)
>
>
>
> On Thu, Apr 23, 2015 at 2:49 AM, Xiangrui Meng <men...@gmail.com> wrote:
>>
>> That number is the size of the serialized task closure. Is stage 246 part
>> of the ALS iterations, or something before or after them?
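>>
>> As a rough back-of-the-envelope check (a sketch, not the exact layout the
>> ALS code uses): a rank x rank Array[Double] costs about rank * rank * 8
>> bytes, so rank = 100 is already roughly 80 KB before serialization
>> overhead. You can measure such an object directly:
>>
>>   import org.apache.spark.util.SizeEstimator
>>
>>   val rank = 100
>>   // Approximate in-memory size of a rank x rank Gramian stored densely.
>>   val ytyLike = Array.ofDim[Double](rank * rank)
>>   println(SizeEstimator.estimate(ytyLike)) // ~80 KB for rank = 100
>>
>> -Xiangrui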
>>
>> On Tue, Apr 21, 2015 at 10:36 AM, Christian S. Perone
>> <christian.per...@gmail.com> wrote:
>> > Hi Sean, thanks for the answer. I tried calling repartition() on the
>> > input with many different partition counts, and it still shows that
>> > warning message.
>> >
>> > On Tue, Apr 21, 2015 at 7:05 AM, Sean Owen <so...@cloudera.com> wrote:
>> >>
>> >> I think maybe you need more partitions in your input, which might make
>> >> for smaller tasks?
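>> >>
>> >> In sketch form (the partition count is arbitrary, and "ratings" stands
>> >> in for your input RDD):
>> >>
>> >>   // More, smaller partitions so each task carries less data.
>> >>   val repartitioned = ratings.repartition(500)
>> >>   val model = ALS.trainImplicit(repartitioned, rank, 10) // rank as before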
>> >>
>> >> On Tue, Apr 21, 2015 at 2:56 AM, Christian S. Perone
>> >> <christian.per...@gmail.com> wrote:
>> >> > I keep seeing these warnings when using trainImplicit:
>> >> >
>> >> > WARN TaskSetManager: Stage 246 contains a task of very large size (208 KB).
>> >> > The maximum recommended task size is 100 KB.
>> >> >
>> >> > And then the task size starts to increase. Is this a known issue?
>> >> >
>> >> > Thanks!
>> >> >
