RE: MLLib: LIBSVM issue

Sameer Tilak Thu, 18 Sep 2014 10:25:13 -0700

Thanks, Burak. Yes, the tabs were the issue; I was able to get it working after 
replacing them with spaces.
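
For anyone hitting the same error, the tab-to-space fix can be sketched as below. This is a minimal sketch, not MLlib code: the object name LibSvmWhitespace and the function normalizeLine are hypothetical names introduced here for illustration, and the sample record is the last one from the data snippet quoted below, assuming its fields were tab-separated.

```scala
// Hypothetical helper (not part of MLlib): rewrite one LIBSVM record so
// that its fields are separated by exactly one space.
object LibSvmWhitespace {
  // Trim the line, split on any run of whitespace (tabs or spaces),
  // and join the fields back together with single spaces.
  def normalizeLine(line: String): String =
    line.trim.split("\\s+").mkString(" ")

  def main(args: Array[String]): Unit = {
    // One record from the thread, with tabs between the fields.
    val tabbed = "0\t251:1\t5578:1"
    println(normalizeLine(tabbed)) // prints: 0 251:1 5578:1
  }
}
```

In spark-shell the same function could be applied to every line before loading, for example by reading the file with sc.textFile, mapping normalizeLine over it, writing the result with saveAsTextFile to a new location (any output path is your own choice), and then pointing MLUtils.loadLibSVMFile at that cleaned output.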


> Date: Wed, 17 Sep 2014 21:11:00 -0700
> From: bya...@stanford.edu
> To: ssti...@live.com
> CC: user@spark.apache.org
> Subject: Re: MLLib: LIBSVM issue
> 
> Hi,
> 
> The fields on each line should be separated by a single space, not a tab. It 
> looks like your inputs have tabs between them instead of single spaces, so 
> the parser cannot parse the input.
> 
> Best,
> Burak
> 
> ----- Original Message -----
> From: "Sameer Tilak" <ssti...@live.com>
> To: user@spark.apache.org
> Sent: Wednesday, September 17, 2014 7:25:10 PM
> Subject: MLLib: LIBSVM issue
> 
> Hi All,
> 
> We have a fairly large amount of sparse data. I was following these 
> instructions in the manual:
> 
> Sparse data
> It is very common in practice to have sparse training data. MLlib 
> supports reading training examples stored in LIBSVM format, which is the 
> default format used by LIBSVM and LIBLINEAR. It is a text format in which 
> each line represents a labeled sparse feature vector using the following 
> format:
> 
> label index1:value1 index2:value2 ...
> 
> import org.apache.spark.mllib.regression.LabeledPoint
> import org.apache.spark.mllib.util.MLUtils
> import org.apache.spark.rdd.RDD
> 
> val examples: RDD[LabeledPoint] = MLUtils.loadLibSVMFile(sc, 
>   "data/mllib/sample_libsvm_data.txt")
> 
> I believe that I have formatted my data as per the required Libsvm format. 
> Here is a snippet of that:
> 
> 1        122:1        1693:1        1771:1        1974:1        2334:1        2378:1        2562:1
> 1        118:1        1389:1        1413:1        1454:1        1780:1        2562:1        5051:1        5417:1        5548:1        5798:1        5862:1
> 0        150:1        214:1        468:1        1013:1        1078:1        1092:1        1117:1        1489:1        1546:1        1630:1        1635:1        1827:1        2024:1        2215:1        2478:1        2761:1        5985:1        6115:1        6218:1
> 0        251:1        5578:1
> 
> However, when I use MLUtils.loadLibSVMFile(sc, "path-to-data-file"), I get the 
> following error message in my spark-shell. Can someone please point me in 
> the right direction?
> 
> java.lang.NumberFormatException: For input string: "150:1        214:1        468:1        1013:1        1078:1        1092:1        1117:1        1489:1        1546:1        1630:1        1635:1        1827:1        2024:1        2215:1        2478:1        2761:1        5985:1        6115:1        6218:1"
>         at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241)
>         at java.lang.Double.parseDouble(Double.java:540)
>         at scala.collection.immutable.StringLike$class.toDouble(StringLike.scala:232)
> 
> 
