How big is your dataset, and what is the vocabulary size?

-Xiangrui

On Sun, Jan 4, 2015 at 11:18 PM, Eric Zhen <zhpeng...@gmail.com> wrote:
> Hi,
>
> When we run mllib word2vec (spark-1.1.0), the driver gets stuck with
> 100% CPU usage. Here is the jstack output:
>
> "main" prio=10 tid=0x0000000040112800 nid=0x46f2 runnable [0x000000004162e000]
>    java.lang.Thread.State: RUNNABLE
>         at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1847)
>         at java.io.ObjectOutputStream$BlockDataOutputStream.write(ObjectOutputStream.java:1778)
>         at java.io.DataOutputStream.writeInt(DataOutputStream.java:182)
>         at java.io.DataOutputStream.writeFloat(DataOutputStream.java:225)
>         at java.io.ObjectOutputStream$BlockDataOutputStream.writeFloats(ObjectOutputStream.java:2064)
>         at java.io.ObjectOutputStream.writeArray(ObjectOutputStream.java:1310)
>         at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1154)
>         at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1518)
>         at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1483)
>         at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1400)
>         at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1158)
>         at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1518)
>         at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1483)
>         at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1400)
>         at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1158)
>         at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1518)
>         at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1483)
>         at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1400)
>         at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1158)
>         at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:330)
>         at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:42)
>         at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73)
>         at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:164)
>         at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158)
>         at org.apache.spark.SparkContext.clean(SparkContext.scala:1242)
>         at org.apache.spark.rdd.RDD.mapPartitionsWithIndex(RDD.scala:610)
>         at org.apache.spark.mllib.feature.Word2Vec$$anonfun$fit$1.apply$mcVI$sp(Word2Vec.scala:291)
>         at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
>         at org.apache.spark.mllib.feature.Word2Vec.fit(Word2Vec.scala:290)
>         at com.baidu.inf.WordCount$.main(WordCount.scala:31)
>         at com.baidu.inf.WordCount.main(WordCount.scala)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> --
> Best Regards
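For context on why vocabulary size matters here: the trace shows the driver inside `ClosureCleaner.ensureSerializable`, Java-serializing a float array that was captured by the closure passed to `mapPartitionsWithIndex` in `Word2Vec.fit`. Word2Vec's model state grows with vocabSize × vectorSize, so a large vocabulary means the driver spends a long time serializing that array on every iteration. The sketch below is not the actual Word2Vec internals — the object name, array size, and job are hypothetical — it only illustrates the general Spark pattern: a large array referenced directly in a closure is serialized with the closure, while a broadcast variable is shipped once and the closure stays small.

```scala
// Hypothetical sketch: large-array-in-closure vs. broadcast.
import org.apache.spark.{SparkConf, SparkContext}

object ClosureSizeSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("closure-size").setMaster("local[2]"))

    // Stand-in for model state; real Word2Vec arrays scale with
    // vocabSize * vectorSize and can be orders of magnitude larger.
    val bigModel: Array[Float] = Array.fill(100000)(0.1f)

    // Captured directly: the whole array is Java-serialized into the task
    // closure -- this serialization is what the jstack output shows.
    val sum1 = sc.parallelize(1 to 100)
      .map(i => bigModel(i % bigModel.length).toDouble)
      .sum()

    // Broadcast once instead: tasks hold only a small handle, and the
    // array is shipped to executors out-of-band.
    val modelBc = sc.broadcast(bigModel)
    val sum2 = sc.parallelize(1 to 100)
      .map(i => modelBc.value(i % modelBc.value.length).toDouble)
      .sum()

    println(s"results match: ${sum1 == sum2}")
    sc.stop()
  }
}
```

Broadcasting does not remove the one-time cost of serializing the array, but it avoids paying it once per RDD operation, which is the repeated cost the loop in `fit` incurs.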
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org