MLLib: LIBSVM issue

Sameer Tilak Wed, 17 Sep 2014 19:26:11 -0700
Hi All,
We have a fairly large amount of sparse data. I was following these
instructions in the manual:
Sparse data
It is very common in practice to have sparse training data. MLlib
supports reading training examples stored in LIBSVM format, which is the
default format used by LIBSVM and LIBLINEAR. It is a text format in which each
line represents a labeled sparse feature vector using the following
format:
label index1:value1 index2:value2 ...
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.rdd.RDD

val examples: RDD[LabeledPoint] = MLUtils.loadLibSVMFile(sc, "data/mllib/sample_libsvm_data.txt")
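To make the line format described above concrete, here is a minimal sketch in plain Scala (no Spark needed) that parses one such line by hand. The helper name parseLibSVMLine and the object LibSVMLineSketch are hypothetical and are not part of MLlib. The sketch splits on any run of whitespace, which is a choice made for this illustration and may differ from what MLUtils.loadLibSVMFile does internally.

```scala
// Minimal sketch of the "label index1:value1 index2:value2 ..." format.
// parseLibSVMLine is a hypothetical helper, not an MLlib API.
object LibSVMLineSketch {
  def parseLibSVMLine(line: String): (Double, Array[Int], Array[Double]) = {
    // Assumption for this sketch: tokens are separated by any run of whitespace.
    val tokens = line.trim.split("\\s+")
    // The first token is the label.
    val label = tokens.head.toDouble
    // Each remaining token is an index:value pair.
    val pairs = tokens.tail.map { tok =>
      val Array(idx, value) = tok.split(":")
      // Indices in the file are one-based; convert to zero-based here.
      (idx.toInt - 1, value.toDouble)
    }
    (label, pairs.map(_._1), pairs.map(_._2))
  }
}
```

For example, LibSVMLineSketch.parseLibSVMLine("1 122:1 1693:1") yields the label 1.0, the zero-based indices 121 and 1692, and the values 1.0 and 1.0.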
I believe that I have formatted my data as per the required LIBSVM format. Here
is a snippet of it:
1        122:1        1693:1        1771:1        1974:1        2334:1        2378:1        2562:1
1        118:1        1389:1        1413:1        1454:1        1780:1        2562:1        5051:1        5417:1        5548:1        5798:1        5862:1
0        150:1        214:1        468:1        1013:1        1078:1        1092:1        1117:1        1489:1        1546:1        1630:1        1635:1        1827:1        2024:1        2215:1        2478:1        2761:1        5985:1        6115:1        6218:1
0        251:1        5578:1
However, when I use MLUtils.loadLibSVMFile(sc, "path-to-data-file"), I get the
following error message in my spark-shell. Can someone please point me in the
right direction?
java.lang.NumberFormatException: For input string: "150:1        214:1        468:1        1013:1        1078:1        1092:1        1117:1        1489:1        1546:1        1630:1        1635:1        1827:1        2024:1        2215:1        2478:1        2761:1        5985:1        6115:1        6218:1"
        at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241)
        at java.lang.Double.parseDouble(Double.java:540)
        at scala.collection.immutable.StringLike$class.toDouble(StringLike.scala:232)
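
One possible reading of the exception above, offered as an unverified assumption rather than a confirmed diagnosis: the string handed to toDouble still contains many index:value pairs, which suggests the line was not split at the whitespace between them. If the loader expects a single space between tokens and the file uses tabs or repeated spaces, normalizing the delimiters before loading might help. A minimal sketch follows; normalizeLine and NormalizeSketch are hypothetical names, and "path-to-normalized-file" is a placeholder.

```scala
// Hypothetical pre-processing step, not part of MLlib.
object NormalizeSketch {
  // Collapse every run of whitespace (tabs or repeated spaces) into one space
  // and drop leading and trailing whitespace.
  def normalizeLine(line: String): String =
    line.trim.split("\\s+").mkString(" ")
}

// Possible use in spark-shell (both paths are placeholders):
// sc.textFile("path-to-data-file")
//   .map(NormalizeSketch.normalizeLine)
//   .saveAsTextFile("path-to-normalized-file")
```

The normalized output could then be passed to MLUtils.loadLibSVMFile in place of the original file. Whether this removes the error depends on the delimiter assumption above.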