Hi Yu,

Try this:

   // split each line on "," and trim every field
   val data = csv.map(line => line.split(",").map(elem => elem.trim))

   // convert the two fields of each record to integers
   data.map(rec => (rec(0).toInt, rec(1).toInt))
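
If the exception still shows an input that looks like a plain "8", one possible cause (an assumption, not confirmed anywhere in this thread) is an invisible character such as a UTF-8 byte order mark at the start of the file, which trim does not remove. The sketch below uses no Spark; the names ParsePairs, parsePair and codePoints are made up for illustration. It parses defensively and prints each character's code point so that hidden characters become visible:

```scala
import scala.util.Try

object ParsePairs {
  // Parse one "int,int" line; returns None instead of throwing on bad input.
  def parsePair(line: String): Option[(Int, Int)] = {
    val fields = line.split(",").map(_.trim)
    if (fields.length != 2) None
    else Try((fields(0).toInt, fields(1).toInt)).toOption
  }

  // Render every character as U+XXXX to reveal invisible characters.
  def codePoints(s: String): String =
    s.map(c => f"U+${c.toInt}%04X").mkString(" ")

  def main(args: Array[String]): Unit = {
    // Sample inputs (hypothetical): clean, padded, BOM-prefixed, wrong separator.
    val lines = Seq("8,22", " 3 , 11 ", "\uFEFF8,22", "40;10")
    lines.foreach { l =>
      println(s"${codePoints(l)} -> ${parsePair(l)}")
    }
  }
}
```

The same parsePair function could be passed to map or flatMap on the stream in place of myFunc, so that bad lines are skipped or logged rather than failing the task.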

On 16 December 2014 at 10:49, yu [via Apache Spark User List] <
ml-node+s1001560n20694...@n3.nabble.com> wrote:
> Hello, everyone
> I know a 'NumberFormatException' means that a String could not be parsed
> properly, but I really cannot find any mistake in my code. I hope
> someone can kindly help me.
> My HDFS file is as follows:
> 8,22
> 3,11
> 40,10
> 49,47
> 48,29
> 24,28
> 50,30
> 33,56
> 4,20
> 30,38
> ...
> So each line contains an integer + "," + an integer + "\n"
> My code is as follows:
> object StreamMonitor {
>   def main(args: Array[String]): Unit = {
>     val myFunc = (str: String) => {
>       val strArray = str.trim().split(",")
>       (strArray(0).toInt, strArray(1).toInt)
>     }
>     val conf = new SparkConf().setAppName("StreamMonitor");
>     val ssc = new StreamingContext(conf, Seconds(30));
>     val datastream = ssc.textFileStream("/user/yu/streaminput");
>     val newstream = datastream.map(myFunc)
>     newstream.saveAsTextFiles("output/", "");
>     ssc.start()
>     ssc.awaitTermination()
>   }
> }
> The exception info is:
> 14/12/15 15:35:03 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, h3): java.lang.NumberFormatException: For input string: "8"
>         java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>         java.lang.Integer.parseInt(Integer.java:492)
>         java.lang.Integer.parseInt(Integer.java:527)
>         scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229)
>         scala.collection.immutable.StringOps.toInt(StringOps.scala:31)
>         StreamMonitor$$anonfun$1.apply(StreamMonitor.scala:9)
>         StreamMonitor$$anonfun$1.apply(StreamMonitor.scala:7)
>         scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>         scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>         org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:984)
>         org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:974)
>         org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
>         org.apache.spark.scheduler.Task.run(Task.scala:54)
>         org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         java.lang.Thread.run(Thread.java:745)
> So based on the above info, "8" is the first number in the file, and I
> think it should be parsed to an integer without any problem.
> I know this may be a very stupid question with a very easy answer,
> but I really cannot find the reason. I am thankful to anyone who helps!

Harihar Nahak
BigData Developer
Email:hna...@wynyardgroup.com | Extn: 8019
