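Serializable like a Java object? No. The model is backed by RDDs (the user and product features you noticed), and an RDD is a reference to distributed data rather than a local object, so Java serialization does not capture its contents. A factored matrix model is also huge, unlike most models. You can of course persist the two feature RDDs to storage manually and read them back in the other script.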
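A minimal sketch of that manual approach, in case it helps. It assumes the model exposes its factors as userFeatures and productFeatures of type RDD[(Int, Array[Double])]; the FactorStore object, its method names, and any paths you pass in are made up for illustration and are not part of MLlib. The predict helper is just a dot product over the reloaded factors, which is my reading of what the model computes for a single (user, product) pair.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._ // pair-RDD implicits on older Spark 1.x releases
import org.apache.spark.rdd.RDD

// Hypothetical helper, not an MLlib API.
object FactorStore {
  type Factors = RDD[(Int, Array[Double])]

  // Write one factor RDD to storage (HDFS, local path, ...).
  def save(features: Factors, path: String): Unit =
    features.saveAsObjectFile(path)

  // Read a factor RDD back in a later job with its own SparkContext.
  def load(sc: SparkContext, path: String): Factors =
    sc.objectFile[(Int, Array[Double])](path)

  // Score one pair as the dot product of the two factor vectors.
  // Throws if either id is missing from the reloaded factors.
  def predict(users: Factors, products: Factors, user: Int, product: Int): Double = {
    val u = users.lookup(user).head
    val p = products.lookup(product).head
    u.zip(p).map { case (a, b) => a * b }.sum
  }
}
```

In the training script you would call FactorStore.save once for model.userFeatures and once for model.productFeatures, then call FactorStore.load for both paths in the second script. Object files use Java serialization per record, so a text or columnar format may be a better fit if the factors need to be read outside Spark.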
On Fri, Nov 7, 2014 at 11:33 PM, Dariusz Kobylarz <darek.kobyl...@gmail.com> wrote:
> I am trying to persist MatrixFactorizationModel (Collaborative Filtering
> example) and use it in another script to evaluate/apply it.
> This is the exception I get when I try to use a deserialized model instance:
>
> Exception in thread "main" java.lang.NullPointerException
>     at org.apache.spark.rdd.CoGroupedRDD$$anonfun$getPartitions$1.apply$mcVI$sp(CoGroupedRDD.scala:103)
>     at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
>     at org.apache.spark.rdd.CoGroupedRDD.getPartitions(CoGroupedRDD.scala:101)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
>     at org.apache.spark.rdd.MappedValuesRDD.getPartitions(MappedValuesRDD.scala:26)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
>     at org.apache.spark.rdd.FlatMappedValuesRDD.getPartitions(FlatMappedValuesRDD.scala:26)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
>     at org.apache.spark.rdd.FlatMappedRDD.getPartitions(FlatMappedRDD.scala:30)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
>     at org.apache.spark.Partitioner$$anonfun$2.apply(Partitioner.scala:58)
>     at org.apache.spark.Partitioner$$anonfun$2.apply(Partitioner.scala:58)
>     at scala.math.Ordering$$anon$5.compare(Ordering.scala:122)
>     at java.util.TimSort.countRunAndMakeAscending(TimSort.java:324)
>     at java.util.TimSort.sort(TimSort.java:189)
>     at java.util.TimSort.sort(TimSort.java:173)
>     at java.util.Arrays.sort(Arrays.java:659)
>     at scala.collection.SeqLike$class.sorted(SeqLike.scala:615)
>     at scala.collection.AbstractSeq.sorted(Seq.scala:40)
>     at scala.collection.SeqLike$class.sortBy(SeqLike.scala:594)
>     at scala.collection.AbstractSeq.sortBy(Seq.scala:40)
>     at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:58)
>     at org.apache.spark.rdd.PairRDDFunctions.join(PairRDDFunctions.scala:536)
>     at org.apache.spark.mllib.recommendation.MatrixFactorizationModel.predict(MatrixFactorizationModel.scala:57)
>     ...
>
> Is this model serializable at all, I noticed it has two RDDs inside (user &
> product features)?
>
> Thanks,
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org