Hi,

The separator between entries must be a single space, not a tab. It looks like
your entries are separated by tabs rather than single spaces, so the parser
cannot parse the input.
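
A minimal sketch of how to check for this and work around it, assuming the
loader splits each line on a single space character as described above. The
object name LibSvmWhitespace and the sample line are made up for illustration;
they are not part of MLlib.

```scala
object LibSvmWhitespace {
  // Collapse every run of whitespace (tabs or repeated spaces) into one space.
  def normalize(line: String): String =
    line.trim.split("\\s+").mkString(" ")

  def main(args: Array[String]): Unit = {
    // Hypothetical tab-separated line in the style of the data in question.
    val tabbed = "0\t150:1\t214:1\t468:1"

    // Splitting on a single space finds no separator, so the whole line
    // comes back as one token that cannot be parsed as a number.
    println(tabbed.split(' ').length)

    // After normalization, the label and each index:value pair are separate tokens.
    val cleaned = normalize(tabbed)
    println(cleaned)
    println(cleaned.split(' ').length)
  }
}
```

In spark-shell, the same function could be mapped over sc.textFile(...) and the
result written out with saveAsTextFile, so that MLUtils.loadLibSVMFile reads
the cleaned copy instead of the original file.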

Best,
Burak

----- Original Message -----
From: "Sameer Tilak" <ssti...@live.com>
To: user@spark.apache.org
Sent: Wednesday, September 17, 2014 7:25:10 PM
Subject: MLLib: LIBSVM issue

Hi All,

We have a fairly large amount of sparse data. I was following the
instructions in the manual:

Sparse data

It is very common in practice to have sparse training data. MLlib supports
reading training examples stored in LIBSVM format, which is the default format
used by LIBSVM and LIBLINEAR. It is a text format in which each line
represents a labeled sparse feature vector using the following format:

    label index1:value1 index2:value2 ...

    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.mllib.util.MLUtils
    import org.apache.spark.rdd.RDD

    val examples: RDD[LabeledPoint] =
      MLUtils.loadLibSVMFile(sc, "data/mllib/sample_libsvm_data.txt")

I believe that I have formatted my data as per the required LIBSVM format. Here
is a snippet of it:

1        122:1        1693:1        1771:1        1974:1        2334:1        2378:1        2562:1
1        118:1        1389:1        1413:1        1454:1        1780:1        2562:1        5051:1        5417:1        5548:1        5798:1        5862:1
0        150:1        214:1        468:1        1013:1        1078:1        1092:1        1117:1        1489:1        1546:1        1630:1        1635:1        1827:1        2024:1        2215:1        2478:1        2761:1        5985:1        6115:1        6218:1
0        251:1        5578:1

However, when I use MLUtils.loadLibSVMFile(sc, "path-to-data-file"), I get the
following error messages in my spark-shell. Can someone please point me in the
right direction?

java.lang.NumberFormatException: For input string: "150:1        214:1        468:1        1013:1        1078:1        1092:1        1117:1        1489:1        1546:1        1630:1        1635:1        1827:1        2024:1        2215:1        2478:1        2761:1        5985:1        6115:1        6218:1"
        at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241)
        at java.lang.Double.parseDouble(Double.java:540)
        at scala.collection.immutable.StringLike$class.toDouble(StringLike.scala:232)

