Thanks, Burak.

Yes, the tabs were the issue. I got it working after replacing them with single spaces.
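
For anyone who hits the same error, here is a minimal sketch of the fix in plain Scala. The names `LibSvmTabs`, `normalizeLine`, and `parseLine` are hypothetical helpers made up for illustration. `parseLine` is a simplified stand-in that assumes each line is split on a single space character; it is not MLlib's actual parser.

```scala
// Hypothetical helpers for illustration only; not part of Spark or MLlib.
object LibSvmTabs {

  // Collapse every run of whitespace (tabs included) into one space,
  // so "label<TAB>index:value<TAB>index:value" becomes space-separated.
  def normalizeLine(line: String): String =
    line.trim.split("\\s+").mkString(" ")

  // Simplified parser. Assumption: fields are separated by exactly one
  // space, so a tab-separated line stays in one piece and the label
  // conversion throws NumberFormatException.
  def parseLine(line: String): (Double, Seq[(Int, Double)]) = {
    val items = line.split(' ')
    val label = items.head.toDouble
    val features = items.tail.toSeq.map { item =>
      val Array(index, value) = item.split(':')
      (index.toInt, value.toDouble)
    }
    (label, features)
  }
}
```

In the spark-shell, one way to apply this would be to map `normalizeLine` over `sc.textFile(...)`, write the result back with `saveAsTextFile`, and then point `MLUtils.loadLibSVMFile` at the cleaned output.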

> Date: Wed, 17 Sep 2014 21:11:00 -0700
> From: bya...@stanford.edu
> To: ssti...@live.com
> CC: user@spark.apache.org
> Subject: Re: MLLib: LIBSVM issue
>
> Hi,
>
> The spacing between the inputs should be a single space, not a tab. I feel
> like your inputs have tabs between them instead of a single space. Therefore
> the parser cannot parse the input.
>
> Best,
> Burak
>
> ----- Original Message -----
> From: "Sameer Tilak" <ssti...@live.com>
> To: user@spark.apache.org
> Sent: Wednesday, September 17, 2014 7:25:10 PM
> Subject: MLLib: LIBSVM issue
>
> Hi All,
>
> We have a fairly large amount of sparse data. I was following these
> instructions in the manual:
>
> Sparse data
> It is very common in practice to have sparse training data. MLlib
> supports reading training examples stored in LIBSVM format, which is the
> default format used by LIBSVM and LIBLINEAR. It is a text format in which
> each line represents a labeled sparse feature vector using the following
> format:
>
> label index1:value1 index2:value2 ...
>
> import org.apache.spark.mllib.regression.LabeledPoint
> import org.apache.spark.mllib.util.MLUtils
> import org.apache.spark.rdd.RDD
>
> val examples: RDD[LabeledPoint] = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_libsvm_data.txt")
>
> I believe that I have formatted my data as per the required LIBSVM format.
> Here is a snippet of it:
>
> 1 122:1 1693:1 1771:1 1974:1 2334:1 2378:1 2562:1
> 1 118:1 1389:1 1413:1 1454:1 1780:1 2562:1 5051:1 5417:1 5548:1 5798:1 5862:1
> 0 150:1 214:1 468:1 1013:1 1078:1 1092:1 1117:1 1489:1 1546:1 1630:1 1635:1 1827:1 2024:1 2215:1 2478:1 2761:1 5985:1 6115:1 6218:1
> 0 251:1 5578:1
>
> However, when I use MLUtils.loadLibSVMFile(sc, "path-to-data-file"), I get the
> following error messages in my spark-shell. Can someone please point me in
> the right direction?
>
> java.lang.NumberFormatException: For input string: "150:1 214:1 468:1 1013:1 1078:1 1092:1 1117:1 1489:1 1546:1 1630:1 1635:1 1827:1 2024:1 2215:1 2478:1 2761:1 5985:1 6115:1 6218:1"
>     at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241)
>     at java.lang.Double.parseDouble(Double.java:540)
>     at scala.collection.immutable.StringLike$class.toDouble(StringLike.scala:232)