Re: spark 2.4.3 build fails using java 8 and scala 2.11 with NumberFormatException: Not a version: 9

2019-05-19 Thread Bulldog20630405
After blowing away my m2 repo cache, I was able to build just fine... I
don't know why, but now it works :-)

On Sun, May 19, 2019 at 10:22 PM Bulldog20630405 
wrote:

> I am trying to build Spark 2.4.3 with the following env:
>
>- fedora 29
>- java 1.8.0_202
>- spark 2.4.3
>- scala 2.11.12
>- maven 3.5.4
>- hadoop 2.6.5
>
> according to the documentation this can be done with the following
> commands:
> *export TERM=xterm-color*
> *./build/mvn -Pyarn -DskipTests clean package*
>
> however I get the following error (it seems to me that somehow it thinks I
> am using Java 9):
> (note: my real goal is to build Spark for Hadoop 3; however, I need to
> understand why the default build is failing first)
>
> *[ERROR] Failed to execute goal
> net.alchim31.maven:scala-maven-plugin:3.2.2:compile *(scala-compile-first)*
> on project spark-tags_2.11*: Execution scala-compile-first of goal
> net.alchim31.maven:scala-maven-plugin:3.2.2:compile failed.: CompileFailed
> -> [Help 1]
>
> [INFO] --- scala-maven-plugin:3.2.2:compile (scala-compile-first) @
> spark-tags_2.11 ---
> [INFO] Using zinc server for incremental compilation
> [info] 'compiler-interface' not yet compiled for Scala 2.11.12.
> Compiling...
> *error: java.lang.NumberFormatException: Not a version: 9*
> at scala.util.PropertiesTrait$class.parts$1(Properties.scala:184)
> at scala.util.PropertiesTrait$class.isJavaAtLeast(Properties.scala:187)
> at scala.util.Properties$.isJavaAtLeast(Properties.scala:17)
> at
> scala.tools.util.PathResolverBase$Calculated$.javaBootClasspath(PathResolver.scala:276)
> at
> scala.tools.util.PathResolverBase$Calculated$.basis(PathResolver.scala:283)
> at
> scala.tools.util.PathResolverBase$Calculated$.containers$lzycompute(PathResolver.scala:293)
> at
> scala.tools.util.PathResolverBase$Calculated$.containers(PathResolver.scala:293)
> at scala.tools.util.PathResolverBase.containers(PathResolver.scala:309)
> at scala.tools.util.PathResolver.computeResult(PathResolver.scala:341)
> at scala.tools.util.PathResolver.computeResult(PathResolver.scala:332)
> at scala.tools.util.PathResolverBase.result(PathResolver.scala:314)
> at
> scala.tools.nsc.backend.JavaPlatform$class.classPath(JavaPlatform.scala:28)
> at scala.tools.nsc.Global$GlobalPlatform.classPath(Global.scala:115)
> at
> scala.tools.nsc.Global.scala$tools$nsc$Global$$recursiveClassPath(Global.scala:131)
> at scala.tools.nsc.Global$GlobalMirror.rootLoader(Global.scala:64)
> at scala.reflect.internal.Mirrors$Roots$RootClass.<init>(Mirrors.scala:307)
> at
> scala.reflect.internal.Mirrors$Roots.RootClass$lzycompute(Mirrors.scala:321)
> at scala.reflect.internal.Mirrors$Roots.RootClass(Mirrors.scala:321)
> at
> scala.reflect.internal.Mirrors$Roots$EmptyPackageClass.<init>(Mirrors.scala:330)
> at
> scala.reflect.internal.Mirrors$Roots.EmptyPackageClass$lzycompute(Mirrors.scala:336)
> at
> scala.reflect.internal.Mirrors$Roots.EmptyPackageClass(Mirrors.scala:336)
> at
> scala.reflect.internal.Mirrors$Roots.EmptyPackageClass(Mirrors.scala:276)
> at scala.reflect.internal.Mirrors$RootsBase.init(Mirrors.scala:250)
> at scala.tools.nsc.Global.rootMirror$lzycompute(Global.scala:73)
> at scala.tools.nsc.Global.rootMirror(Global.scala:71)
> at scala.tools.nsc.Global.rootMirror(Global.scala:39)
> at
> scala.reflect.internal.Definitions$DefinitionsClass.ObjectClass$lzycompute(Definitions.scala:257)
> at
> scala.reflect.internal.Definitions$DefinitionsClass.ObjectClass(Definitions.scala:257)
> at
> scala.reflect.internal.Definitions$DefinitionsClass.init(Definitions.scala:1390)
> at scala.tools.nsc.Global$Run.<init>(Global.scala:1242)
> at scala.tools.nsc.Driver.doCompile(Driver.scala:31)
> at scala.tools.nsc.MainClass.doCompile(Main.scala:23)
> at scala.tools.nsc.Driver.process(Driver.scala:51)
> at scala.tools.nsc.Main.process(Main.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sbt.compiler.RawCompiler.apply(RawCompiler.scala:33)
> at
> sbt.compiler.AnalyzingCompiler$$anonfun$compileSources$1$$anonfun$apply$2.apply(AnalyzingCompiler.scala:159)
> at
> sbt.compiler.AnalyzingCompiler$$anonfun$compileSources$1$$anonfun$apply$2.apply(AnalyzingCompiler.scala:155)
> at sbt.IO$.withTemporaryDirectory(IO.scala:358)
> at
> sbt.compiler.AnalyzingCompiler$$anonfun$compileSources$1.apply(AnalyzingCompiler.scala:155)
> at
> sbt.compiler.AnalyzingCompiler$$anonfun$compileSources$1.apply(AnalyzingCompiler.scala:152)
> at sbt.IO$.withTemporaryDirectory(IO.scala:358)
> at
> sbt.compiler.AnalyzingCompiler$.compileSources(AnalyzingCompiler.scala:152)
> at sbt.compiler.IC$.compileInterfaceJar(IncrementalCompiler.scala:58)
> at com.typesafe.zinc.Compiler$.compilerInterface(Compiler.scala:154)
> at com.typesafe.zinc.Compiler$.create(Compiler.scala:55)
> 

spark 2.4.3 build fails using java 8 and scala 2.11 with NumberFormatException: Not a version: 9

2019-05-19 Thread Bulldog20630405
I am trying to build Spark 2.4.3 with the following env:

   - fedora 29
   - java 1.8.0_202
   - spark 2.4.3
   - scala 2.11.12
   - maven 3.5.4
   - hadoop 2.6.5

according to the documentation this can be done with the following commands:
*export TERM=xterm-color*
*./build/mvn -Pyarn -DskipTests clean package*

however I get the following error (it seems to me that somehow it thinks I
am using Java 9):
(note: my real goal is to build Spark for Hadoop 3; however, I need to
understand why the default build is failing first)
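
For context, here is a rough sketch of why the message looks like it does; this
mirrors the version check in older scala.util.Properties as I understand it, not
the actual source. It expects a dotted version string like "1.8" and rejects a
bare "9", which is what JDK 9+ reports in java.specification.version, so some
step of the build is evidently seeing a newer JVM than the one I intend to use:

// sketch only; assumes the old check simply refuses version strings without a dot
def isJavaAtLeast(spec: String, actual: String): Boolean = {
  def parts(s: String): (Int, Int) = {
    val i = s.indexOf('.')
    if (i < 0) throw new NumberFormatException("Not a version: " + s)
    (s.substring(0, i).toInt, s.substring(i + 1).takeWhile(_.isDigit).toInt)
  }
  val (ma, mi) = parts(actual)
  val (sa, si) = parts(spec)
  ma > sa || (ma == sa && mi >= si)
}

isJavaAtLeast("1.8", "1.8")  // true
isJavaAtLeast("1.8", "9")    // NumberFormatException: Not a version: 9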

*[ERROR] Failed to execute goal
net.alchim31.maven:scala-maven-plugin:3.2.2:compile *(scala-compile-first)*
on project spark-tags_2.11*: Execution scala-compile-first of goal
net.alchim31.maven:scala-maven-plugin:3.2.2:compile failed.: CompileFailed
-> [Help 1]

[INFO] --- scala-maven-plugin:3.2.2:compile (scala-compile-first) @
spark-tags_2.11 ---
[INFO] Using zinc server for incremental compilation
[info] 'compiler-interface' not yet compiled for Scala 2.11.12. Compiling...
*error: java.lang.NumberFormatException: Not a version: 9*
at scala.util.PropertiesTrait$class.parts$1(Properties.scala:184)
at scala.util.PropertiesTrait$class.isJavaAtLeast(Properties.scala:187)
at scala.util.Properties$.isJavaAtLeast(Properties.scala:17)
at
scala.tools.util.PathResolverBase$Calculated$.javaBootClasspath(PathResolver.scala:276)
at
scala.tools.util.PathResolverBase$Calculated$.basis(PathResolver.scala:283)
at
scala.tools.util.PathResolverBase$Calculated$.containers$lzycompute(PathResolver.scala:293)
at
scala.tools.util.PathResolverBase$Calculated$.containers(PathResolver.scala:293)
at scala.tools.util.PathResolverBase.containers(PathResolver.scala:309)
at scala.tools.util.PathResolver.computeResult(PathResolver.scala:341)
at scala.tools.util.PathResolver.computeResult(PathResolver.scala:332)
at scala.tools.util.PathResolverBase.result(PathResolver.scala:314)
at
scala.tools.nsc.backend.JavaPlatform$class.classPath(JavaPlatform.scala:28)
at scala.tools.nsc.Global$GlobalPlatform.classPath(Global.scala:115)
at
scala.tools.nsc.Global.scala$tools$nsc$Global$$recursiveClassPath(Global.scala:131)
at scala.tools.nsc.Global$GlobalMirror.rootLoader(Global.scala:64)
at scala.reflect.internal.Mirrors$Roots$RootClass.<init>(Mirrors.scala:307)
at
scala.reflect.internal.Mirrors$Roots.RootClass$lzycompute(Mirrors.scala:321)
at scala.reflect.internal.Mirrors$Roots.RootClass(Mirrors.scala:321)
at
scala.reflect.internal.Mirrors$Roots$EmptyPackageClass.<init>(Mirrors.scala:330)
at
scala.reflect.internal.Mirrors$Roots.EmptyPackageClass$lzycompute(Mirrors.scala:336)
at scala.reflect.internal.Mirrors$Roots.EmptyPackageClass(Mirrors.scala:336)
at scala.reflect.internal.Mirrors$Roots.EmptyPackageClass(Mirrors.scala:276)
at scala.reflect.internal.Mirrors$RootsBase.init(Mirrors.scala:250)
at scala.tools.nsc.Global.rootMirror$lzycompute(Global.scala:73)
at scala.tools.nsc.Global.rootMirror(Global.scala:71)
at scala.tools.nsc.Global.rootMirror(Global.scala:39)
at
scala.reflect.internal.Definitions$DefinitionsClass.ObjectClass$lzycompute(Definitions.scala:257)
at
scala.reflect.internal.Definitions$DefinitionsClass.ObjectClass(Definitions.scala:257)
at
scala.reflect.internal.Definitions$DefinitionsClass.init(Definitions.scala:1390)
at scala.tools.nsc.Global$Run.<init>(Global.scala:1242)
at scala.tools.nsc.Driver.doCompile(Driver.scala:31)
at scala.tools.nsc.MainClass.doCompile(Main.scala:23)
at scala.tools.nsc.Driver.process(Driver.scala:51)
at scala.tools.nsc.Main.process(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sbt.compiler.RawCompiler.apply(RawCompiler.scala:33)
at
sbt.compiler.AnalyzingCompiler$$anonfun$compileSources$1$$anonfun$apply$2.apply(AnalyzingCompiler.scala:159)
at
sbt.compiler.AnalyzingCompiler$$anonfun$compileSources$1$$anonfun$apply$2.apply(AnalyzingCompiler.scala:155)
at sbt.IO$.withTemporaryDirectory(IO.scala:358)
at
sbt.compiler.AnalyzingCompiler$$anonfun$compileSources$1.apply(AnalyzingCompiler.scala:155)
at
sbt.compiler.AnalyzingCompiler$$anonfun$compileSources$1.apply(AnalyzingCompiler.scala:152)
at sbt.IO$.withTemporaryDirectory(IO.scala:358)
at
sbt.compiler.AnalyzingCompiler$.compileSources(AnalyzingCompiler.scala:152)
at sbt.compiler.IC$.compileInterfaceJar(IncrementalCompiler.scala:58)
at com.typesafe.zinc.Compiler$.compilerInterface(Compiler.scala:154)
at com.typesafe.zinc.Compiler$.create(Compiler.scala:55)
at com.typesafe.zinc.Compiler$$anonfun$apply$1.apply(Compiler.scala:42)
at com.typesafe.zinc.Compiler$$anonfun$apply$1.apply(Compiler.scala:42)
at com.typesafe.zinc.Cache.get(Cache.scala:41)
at com.typesafe.zinc.Compiler$.apply(Compiler.scala:42)
at com.typesafe.zinc.Main$.run(Main.scala:96)
at com.typesafe.zinc.Nailgun$.zinc(Nailgun.scala:95)
at 

Re: NumberFormatException while reading and split the file

2018-04-04 Thread utkarsh_deep
Response to the 1st approach:

When you do spark.read.text("/xyz/a/b/filename") it returns a DataFrame, and
applying the .rdd method gives you an RDD[Row]; so when you use map, your
function gets a Row as its parameter, i.e. ip in your code. Therefore you must
use the Row methods to access its members.
The error message says it clearly: "error: value split is not a member of
org.apache.spark.sql.Row". There is no split method on Row, so it throws that
error.
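
For example, a minimal sketch of how the first approach could get at the line
text through the Row (my sketch, not the original code; spark.read.text puts the
whole line into a single string column, and that String does have split):

val newRdd = spark.read.text("/xyz/a/b/filename").rdd
// Row.getString(0) returns the line as a plain String; -1 keeps trailing empty fields
val splitRdd = newRdd.map(row => row.getString(0).split("\\|", -1))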



Response to the 2nd approach:

There is something fishy there. The if condition ip(0).isEmpty() should catch
the case where the field is an empty string, so when it is not actually empty,
ip(0).toInt shouldn't fail. But you also need to make sure ip(0) is not just
some other string that can't be converted to an Int.
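
For instance, a defensive parse (my suggestion, not the original code) avoids the
exception entirely. Note that null.asInstanceOf[Int] silently evaluates to 0, so
an Option is usually a clearer way to represent a missing value:

import scala.util.Try

// Returns None when the field is empty or not a valid number, instead of throwing
def parseIntField(s: String): Option[Int] = Try(s.trim.toInt).toOption

parseIntField("42")   // Some(42)
parseIntField("")     // None
parseIntField("abc")  // None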



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



NumberFormatException while reading and split the file

2018-04-04 Thread anbu
1st Approach:

error :  value split is not a member of org.apache.spark.sql.Row?

val newRdd = spark.read.text("/xyz/a/b/filename").rdd

anotherRDD = newRdd.
map(ip =>ip.split("\\|")).map(ip => Row(if (ip(0).isEmpty()) {
null.asInstanceOf[Int] }
else ip(0).toInt, ip(1),
ip(2), ip(3), ip(4), ip(5))


I'm getting the error on the line 'ip.split("\\|")': value split is not a
member of org.apache.spark.sql.Row.
 
 
Another approach:
 
 error:"java.lang.NumberFormatException: For input string:
 
 
 val newRdd = spark.read.text("/xyz/a/b/filename").rdd

anotherRDD = newRdd.
map(ip =>ip.toString().split("\\|")).map(ip => Row(if (ip(0).isEmpty())
{ null.asInstanceOf[Int] }
else ip(0).toInt, ip(1),
ip(2), ip(3), ip(4), ip(5))

anotherRDD.collect().foreach(println)   
In this case I'm getting the error "java.lang.NumberFormatException: For
input string: ""




--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: NumberFormatException: For input string: "0.00000"

2016-09-19 Thread Hyukjin Kwon
This does not seem to be an issue in Spark. Does "CSVParser" work fine without
Spark on the same data?
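
For example, this is easy to check outside Spark. If parseLineToTuple ends up
calling toInt on that field (a guess on my part, since that code was not posted),
the failure reproduces immediately:

val s = "0.00000"
s.toDouble   // fine: 0.0
s.toInt      // java.lang.NumberFormatException: For input string: "0.00000"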

BTW, it seems there is something wrong with your email address. I am
sending this again.

On 20 Sep 2016 8:32 a.m., "Hyukjin Kwon"  wrote:

> This does not seem to be an issue in Spark. Does "CSVParser" work fine without
> Spark on the same data?
>
> On 20 Sep 2016 2:15 a.m., "Mohamed ismail" 
> wrote:
>
>> Hi all
>>
>> I am trying to read:
>>
>> sc.textFile(DataFile).mapPartitions(lines => {
>> val parser = new CSVParser(",")
>> lines.map(line=>parseLineToTuple(line, parser))
>> })
>> Data looks like:
>> android phone,0,0,0,,0,0,0,0,0,0,0,5,0,0,0,5,0,0.0,0.0,0.000
>> 00,0.0,0.0,0,0,0,0,0,0,0,0.0,0,0,0
>> ios phone,0,-1,0,,0,0,0,0,0,0,1,0,0,0,0,1,0,0.0,0.0,0.00
>> 000,0.0,0.0,0,0,0,0,0,0,0,0.0,0,0,0
>>
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task
>> 1 in stage 23055.0 failed 4 times, most recent failure: Lost task 1.3 in
>> stage 23055.0 (TID 191607, ):
>> java.lang.NumberFormatException: For input string: "0.0"
>>
>> Has anyone faced such issues? Is there a solution?
>>
>> Thanks,
>> Mohamed
>>
>>


Re: NumberFormatException: For input string: "0.00000"

2016-09-19 Thread Hyukjin Kwon
This does not seem to be an issue in Spark. Does "CSVParser" work fine without
Spark on the same data?

On 20 Sep 2016 2:15 a.m., "Mohamed ismail" 
wrote:

> Hi all
>
> I am trying to read:
>
> sc.textFile(DataFile).mapPartitions(lines => {
> val parser = new CSVParser(",")
> lines.map(line=>parseLineToTuple(line, parser))
> })
> Data looks like:
> android phone,0,0,0,,0,0,0,0,0,0,0,5,0,0,0,5,0,0.0,0.0,0.
> 0,0.0,0.0,0,0,0,0,0,0,0,0.0,0,0,0
> ios phone,0,-1,0,,0,0,0,0,0,0,1,0,0,0,0,1,0,0.0,0.0,0.
> 0,0.0,0.0,0,0,0,0,0,0,0,0.0,0,0,0
>
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 1
> in stage 23055.0 failed 4 times, most recent failure: Lost task 1.3 in
> stage 23055.0 (TID 191607, ):
> java.lang.NumberFormatException: For input string: "0.0"
>
> Has anyone faced such issues? Is there a solution?
>
> Thanks,
> Mohamed
>
>


NumberFormatException: For input string: "0.00000"

2016-09-19 Thread Mohamed ismail
Hi all

I am trying to read: 

sc.textFile(DataFile).mapPartitions(lines => {
val parser = new CSVParser(",")
lines.map(line=>parseLineToTuple(line, parser))
})
Data looks like:
android 
phone,0,0,0,,0,0,0,0,0,0,0,5,0,0,0,5,0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,0,0,0,0.0,0,0,0
ios 
phone,0,-1,0,,0,0,0,0,0,0,1,0,0,0,0,1,0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,0,0,0,0.0,0,0,0

org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in 
stage 23055.0 failed 4 times, most recent failure: Lost task 1.3 in stage 
23055.0 (TID 191607, ): 
java.lang.NumberFormatException: For input string: "0.0"

Has anyone faced such issues? Is there a solution?

Thanks,
Mohamed



Re: SPARKonYARN failing on CDH 5.3.0 : container cannot be fetched because of NumberFormatException

2015-01-09 Thread Mukesh Jha
I am using the pre-built *spark-1.2.0-bin-hadoop2.4* from *[1]* to submit Spark
applications to YARN; I cannot find a pre-built Spark for *CDH-5.x* versions.
So, in my case the org.apache.hadoop.yarn.util.ConverterUtils class is coming
from the spark-assembly-1.1.0-hadoop2.4.0.jar which is part of the pre-built
Spark, and hence causing this issue.

How / where can I get Spark 1.2.0 built for CDH-5.3.0? I checked in the Maven
repo etc. with no luck.

*[1]* https://spark.apache.org/downloads.html

On Fri, Jan 9, 2015 at 1:12 AM, Marcelo Vanzin van...@cloudera.com wrote:

 Just to add to Sandy's comment, check your client configuration
 (generally in /etc/spark/conf). If you're using CM, you may need to
 run the Deploy Client Configuration command on the cluster to update
 the configs to match the new version of CDH.

 On Thu, Jan 8, 2015 at 11:38 AM, Sandy Ryza sandy.r...@cloudera.com
 wrote:
  Hi Mukesh,
 
  Those line numbers in ConverterUtils in the stack trace don't appear to
 line
  up with CDH 5.3:
 
 https://github.com/cloudera/hadoop-common/blob/cdh5-2.5.0_5.3.0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ConverterUtils.java
 
  Is it possible you're still including the old jars on the classpath in
 some
  way?
 
  -Sandy
 
  On Thu, Jan 8, 2015 at 3:38 AM, Mukesh Jha me.mukesh@gmail.com
 wrote:
 
  Hi Experts,
 
  I am running spark inside YARN job.
 
  The spark-streaming job is running fine in CDH-5.0.0 but after the
 upgrade
  to 5.3.0 it cannot fetch containers with the below errors. Looks like
 the
  container id is incorrect and a string is present in a place where it's
  expecting a number.
 
 
 
  java.lang.IllegalArgumentException: Invalid ContainerId:
  container_e01_1420481081140_0006_01_01
 
  Caused by: java.lang.NumberFormatException: For input string: e01
 
 
 
  Is this a bug?? Did you face something similar and any ideas how to fix
  this?
 
 
 
  15/01/08 09:50:28 INFO yarn.ApplicationMaster: Registered signal
 handlers
  for [TERM, HUP, INT]
 
  15/01/08 09:50:29 ERROR yarn.ApplicationMaster: Uncaught exception:
 
  java.lang.IllegalArgumentException: Invalid ContainerId:
  container_e01_1420481081140_0006_01_01
 
  at
 
 org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:182)
 
  at
 
 org.apache.spark.deploy.yarn.YarnRMClientImpl.getAttemptId(YarnRMClientImpl.scala:79)
 
  at
 
 org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:79)
 
  at
 
 org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:515)
 
  at
 
 org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:60)
 
  at
 
 org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:59)
 
  at java.security.AccessController.doPrivileged(Native Method)
 
  at javax.security.auth.Subject.doAs(Subject.java:415)
 
  at
 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
 
  at
 
 org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:59)
 
  at
 
 org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:513)
 
  at
 
 org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
 
  Caused by: java.lang.NumberFormatException: For input string: e01
 
  at
 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
 
  at java.lang.Long.parseLong(Long.java:441)
 
  at java.lang.Long.parseLong(Long.java:483)
 
  at
 
 org.apache.hadoop.yarn.util.ConverterUtils.toApplicationAttemptId(ConverterUtils.java:137)
 
  at
 
 org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:177)
 
  ... 11 more
 
  15/01/08 09:50:29 INFO yarn.ApplicationMaster: Final app status: FAILED,
  exitCode: 10, (reason: Uncaught exception: Invalid ContainerId:
  container_e01_1420481081140_0006_01_01)
 
 
  --
  Thanks & Regards,
 
  Mukesh Jha
 
 



 --
 Marcelo




-- 


Thanks & Regards,

*Mukesh Jha me.mukesh@gmail.com*


Re: SPARKonYARN failing on CDH 5.3.0 : container cannot be fetched because of NumberFormatException

2015-01-09 Thread Sean Owen
Again this is probably not the place for CDH-specific questions, and
this one is already answered at
http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/CDH-5-3-0-container-cannot-be-fetched-because-of/m-p/23497#M478

On Fri, Jan 9, 2015 at 9:23 AM, Mukesh Jha me.mukesh@gmail.com wrote:
 I am using pre built spark-1.2.0-bin-hadoop2.4 from [1] to submit spark
 applications to yarn, I cannot find the pre built spark for CDH-5.x
 versions. So, In my case the org.apache.hadoop.yarn.util.ConverterUtils
 class is coming from the spark-assembly-1.1.0-hadoop2.4.0.jar which is part
 of the pre built spark and hence causing this issue.

 How / where can I get Spark 1.2.0 built for CDH-5.3.0? I checked in the Maven repo
 etc with no luck.

 [1] https://spark.apache.org/downloads.html

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



SPARKonYARN failing on CDH 5.3.0 : container cannot be fetched because of NumberFormatException

2015-01-08 Thread Mukesh Jha
Hi Experts,

I am running spark inside YARN job.

The spark-streaming job is running fine in CDH-5.0.0 but after the upgrade
to 5.3.0 it cannot fetch containers with the below errors. Looks like the
container id is incorrect and a string is present in a place where it's
expecting a number.



java.lang.IllegalArgumentException: Invalid ContainerId:
container_e01_1420481081140_0006_01_01

Caused by: java.lang.NumberFormatException: For input string: e01



Is this a bug?? Did you face something similar and any ideas how to fix
this?



15/01/08 09:50:28 INFO yarn.ApplicationMaster: Registered signal handlers
for [TERM, HUP, INT]

15/01/08 09:50:29 ERROR yarn.ApplicationMaster: Uncaught exception:

java.lang.IllegalArgumentException: Invalid ContainerId:
container_e01_1420481081140_0006_01_01

at
org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:182)

at
org.apache.spark.deploy.yarn.YarnRMClientImpl.getAttemptId(YarnRMClientImpl.scala:79)

at
org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:79)

at
org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:515)

at
org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:60)

at
org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:59)

at java.security.AccessController.doPrivileged(Native Method)

at javax.security.auth.Subject.doAs(Subject.java:415)

at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)

at
org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:59)

at
org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:513)

at
org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)

Caused by: java.lang.NumberFormatException: For input string: e01

at
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)

at java.lang.Long.parseLong(Long.java:441)

at java.lang.Long.parseLong(Long.java:483)

at
org.apache.hadoop.yarn.util.ConverterUtils.toApplicationAttemptId(ConverterUtils.java:137)

at
org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:177)

... 11 more

15/01/08 09:50:29 INFO yarn.ApplicationMaster: Final app status: FAILED,
exitCode: 10, (reason: Uncaught exception: Invalid ContainerId:
container_e01_1420481081140_0006_01_01)
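
For reference, here is a rough sketch of what I believe the old (Hadoop 2.4-era)
parser is doing; this is my guess at the behaviour, not the actual Hadoop code.
The new container id format inserts an epoch field ("e01"), so the old logic hits
a non-numeric token exactly where it expects a number:

val containerId = "container_e01_1420481081140_0006_01_01"
// old format: container_<clusterTimestamp>_<appId>_<attemptId>_<containerId>
val parts = containerId.split("_").drop(1)   // drop the "container" prefix
java.lang.Long.parseLong(parts(0))           // NumberFormatException: For input string: "e01"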

-- 
Thanks & Regards,

*Mukesh Jha me.mukesh@gmail.com*


Re: SPARKonYARN failing on CDH 5.3.0 : container cannot be fetched because of NumberFormatException

2015-01-08 Thread Sandy Ryza
Hi Mukesh,

Those line numbers in ConverterUtils in the stack trace don't appear to
line up with CDH 5.3:
https://github.com/cloudera/hadoop-common/blob/cdh5-2.5.0_5.3.0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ConverterUtils.java

Is it possible you're still including the old jars on the classpath in some
way?
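
One quick way to check is to ask the JVM where that class was actually loaded
from; a sketch like this (run e.g. in spark-shell on the cluster, or logged from
the driver) prints the jar that ConverterUtils came from:

val src = classOf[org.apache.hadoop.yarn.util.ConverterUtils]
  .getProtectionDomain.getCodeSource
println(if (src == null) "bootstrap classpath" else src.getLocation)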

-Sandy

On Thu, Jan 8, 2015 at 3:38 AM, Mukesh Jha me.mukesh@gmail.com wrote:

 Hi Experts,

 I am running spark inside YARN job.

 The spark-streaming job is running fine in CDH-5.0.0 but after the upgrade
 to 5.3.0 it cannot fetch containers with the below errors. Looks like the
 container id is incorrect and a string is present in a place where it's
 expecting a number.



 java.lang.IllegalArgumentException: Invalid ContainerId:
 container_e01_1420481081140_0006_01_01

 Caused by: java.lang.NumberFormatException: For input string: e01



 Is this a bug?? Did you face something similar and any ideas how to fix
 this?



 15/01/08 09:50:28 INFO yarn.ApplicationMaster: Registered signal handlers
 for [TERM, HUP, INT]

 15/01/08 09:50:29 ERROR yarn.ApplicationMaster: Uncaught exception:

 java.lang.IllegalArgumentException: Invalid ContainerId:
 container_e01_1420481081140_0006_01_01

 at
 org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:182)

 at
 org.apache.spark.deploy.yarn.YarnRMClientImpl.getAttemptId(YarnRMClientImpl.scala:79)

 at
 org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:79)

 at
 org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:515)

 at
 org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:60)

 at
 org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:59)

 at java.security.AccessController.doPrivileged(Native Method)

 at javax.security.auth.Subject.doAs(Subject.java:415)

 at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)

 at
 org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:59)

 at
 org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:513)

 at
 org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)

 Caused by: java.lang.NumberFormatException: For input string: e01

 at
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)

 at java.lang.Long.parseLong(Long.java:441)

 at java.lang.Long.parseLong(Long.java:483)

 at
 org.apache.hadoop.yarn.util.ConverterUtils.toApplicationAttemptId(ConverterUtils.java:137)

 at
 org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:177)

 ... 11 more

 15/01/08 09:50:29 INFO yarn.ApplicationMaster: Final app status: FAILED,
 exitCode: 10, (reason: Uncaught exception: Invalid ContainerId:
 container_e01_1420481081140_0006_01_01)

 --
 Thanks & Regards,

 *Mukesh Jha me.mukesh@gmail.com*



Re: SPARKonYARN failing on CDH 5.3.0 : container cannot be fetched because of NumberFormatException

2015-01-08 Thread Marcelo Vanzin
Just to add to Sandy's comment, check your client configuration
(generally in /etc/spark/conf). If you're using CM, you may need to
run the Deploy Client Configuration command on the cluster to update
the configs to match the new version of CDH.

On Thu, Jan 8, 2015 at 11:38 AM, Sandy Ryza sandy.r...@cloudera.com wrote:
 Hi Mukesh,

 Those line numbers in ConverterUtils in the stack trace don't appear to line
 up with CDH 5.3:
 https://github.com/cloudera/hadoop-common/blob/cdh5-2.5.0_5.3.0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ConverterUtils.java

 Is it possible you're still including the old jars on the classpath in some
 way?

 -Sandy

 On Thu, Jan 8, 2015 at 3:38 AM, Mukesh Jha me.mukesh@gmail.com wrote:

 Hi Experts,

 I am running spark inside YARN job.

 The spark-streaming job is running fine in CDH-5.0.0 but after the upgrade
 to 5.3.0 it cannot fetch containers with the below errors. Looks like the
 container id is incorrect and a string is present in a place where it's
 expecting a number.



 java.lang.IllegalArgumentException: Invalid ContainerId:
 container_e01_1420481081140_0006_01_01

 Caused by: java.lang.NumberFormatException: For input string: e01



 Is this a bug?? Did you face something similar and any ideas how to fix
 this?



 15/01/08 09:50:28 INFO yarn.ApplicationMaster: Registered signal handlers
 for [TERM, HUP, INT]

 15/01/08 09:50:29 ERROR yarn.ApplicationMaster: Uncaught exception:

 java.lang.IllegalArgumentException: Invalid ContainerId:
 container_e01_1420481081140_0006_01_01

 at
 org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:182)

 at
 org.apache.spark.deploy.yarn.YarnRMClientImpl.getAttemptId(YarnRMClientImpl.scala:79)

 at
 org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:79)

 at
 org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:515)

 at
 org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:60)

 at
 org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:59)

 at java.security.AccessController.doPrivileged(Native Method)

 at javax.security.auth.Subject.doAs(Subject.java:415)

 at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)

 at
 org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:59)

 at
 org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:513)

 at
 org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)

 Caused by: java.lang.NumberFormatException: For input string: e01

 at
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)

 at java.lang.Long.parseLong(Long.java:441)

 at java.lang.Long.parseLong(Long.java:483)

 at
 org.apache.hadoop.yarn.util.ConverterUtils.toApplicationAttemptId(ConverterUtils.java:137)

 at
 org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:177)

 ... 11 more

 15/01/08 09:50:29 INFO yarn.ApplicationMaster: Final app status: FAILED,
 exitCode: 10, (reason: Uncaught exception: Invalid ContainerId:
 container_e01_1420481081140_0006_01_01)


 --
 Thanks & Regards,

 Mukesh Jha





-- 
Marcelo

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: NumberFormatException

2014-12-16 Thread Imran Rashid
wow, really weird.  My intuition is the same as everyone else's, some
unprintable character.  Here's a couple more debugging tricks I've used in
the past:

// set up accumulators to catch the bad rows as a side-effect
val nBadRows = sc.accumulator(0)
val nGoodRows = sc.accumulator(0)
val badRows =
  sc.accumulableCollection(scala.collection.mutable.Set[String]())

// flatMap so that you can skip the bad rows

datastream.flatMap { str =>
  try {
    val strArray = str.trim().split(",")
    val result = (strArray(0).toInt, strArray(1).toInt)
    nGoodRows += 1
    Some(result)
  } catch {
    case _: NumberFormatException =>
      nBadRows += 1
      badRows += str
      None
  }
}.saveAsTextFile(...)


if (badRows.value.nonEmpty) {
  println("***** BAD ROWS *****")
  badRows.value.foreach { str =>
    // look at a bit more info from each string ... print out the length & each
    // character one by one
    println(str)
    println(str.length)
    str.foreach(println)
    println()
  }
}

// if it is some data corruption that you just have to live with, you might
// leave the flatMap / try in even when you're running it for real.  But then
// you might want to add a little check that there aren't too many bad rows.
// Note that the accumulator[Set] will run out of memory if there are really
// a ton of bad rows, in which case you might switch to a reservoir sample.

val badFrac = nBadRows.value / (nGoodRows.value + nBadRows.value.toDouble)
println(s"${nBadRows.value} bad rows; ${nGoodRows.value} good rows; " +
  s"($badFrac) bad fraction")
if (badFrac > maxAllowedBadRows) {
  throw new RuntimeException("too many bad rows! " + badFrac)
}




On Mon, Dec 15, 2014 at 3:49 PM, yu yuz1...@iastate.edu wrote:

 Hello, everyone

 I know 'NumberFormatException' is due to the reason that String can not be
 parsed properly, but I really can not find any mistakes for my code. I hope
 someone may kindly help me.
 My hdfs file is as follows:
 8,22
 3,11
 40,10
 49,47
 48,29
 24,28
 50,30
 33,56
 4,20
 30,38
 ...

 So each line contains an integer + "," + an integer + "\n"
 My code is as follows:
 object StreamMonitor {
   def main(args: Array[String]): Unit = {
     val myFunc = (str: String) => {
       val strArray = str.trim().split(",")
       (strArray(0).toInt, strArray(1).toInt)
     }
     val conf = new SparkConf().setAppName("StreamMonitor");
     val ssc = new StreamingContext(conf, Seconds(30));
     val datastream = ssc.textFileStream("/user/yu/streaminput");
     val newstream = datastream.map(myFunc)
     newstream.saveAsTextFiles("output/", "");
     ssc.start()
     ssc.awaitTermination()
   }

 }

 The exception info is:
 14/12/15 15:35:03 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0
 (TID 0, h3): java.lang.NumberFormatException: For input string: 8


 java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
 java.lang.Integer.parseInt(Integer.java:492)
 java.lang.Integer.parseInt(Integer.java:527)

 scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229)
 scala.collection.immutable.StringOps.toInt(StringOps.scala:31)
 StreamMonitor$$anonfun$1.apply(StreamMonitor.scala:9)
 StreamMonitor$$anonfun$1.apply(StreamMonitor.scala:7)
 scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
 scala.collection.Iterator$$anon$11.next(Iterator.scala:328)


 org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:984)


 org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:974)
 org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
 org.apache.spark.scheduler.Task.run(Task.scala:54)

 org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)


 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)


 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 java.lang.Thread.run(Thread.java:745)

 So based on the above info, 8 is the first number in the file and I think
 it should be parsed to integer without any problems.
 I know it may be a very stupid question and the answer may be very easy.
 But
 I really can not find the reason. I am thankful to anyone who helps!



 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/NumberFormatException-tp20694.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org




NumberFormatException

2014-12-15 Thread yu
Hello, everyone

I know 'NumberFormatException' is due to the reason that String can not be
parsed properly, but I really can not find any mistakes for my code. I hope
someone may kindly help me.
My hdfs file is as follows:
8,22
3,11
40,10
49,47
48,29
24,28
50,30
33,56
4,20
30,38
...

So each line contains an integer + "," + an integer + "\n"
My code is as follows:
object StreamMonitor {
  def main(args: Array[String]): Unit = {
    val myFunc = (str: String) => {
      val strArray = str.trim().split(",")
      (strArray(0).toInt, strArray(1).toInt)
    }
    val conf = new SparkConf().setAppName("StreamMonitor");
    val ssc = new StreamingContext(conf, Seconds(30));
    val datastream = ssc.textFileStream("/user/yu/streaminput");
    val newstream = datastream.map(myFunc)
    newstream.saveAsTextFiles("output/", "");
    ssc.start()
    ssc.awaitTermination()
  }

}

The exception info is:
14/12/15 15:35:03 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0
(TID 0, h3): java.lang.NumberFormatException: For input string: 8
   
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
java.lang.Integer.parseInt(Integer.java:492)
java.lang.Integer.parseInt(Integer.java:527)
   
scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229)
scala.collection.immutable.StringOps.toInt(StringOps.scala:31)
StreamMonitor$$anonfun$1.apply(StreamMonitor.scala:9)
StreamMonitor$$anonfun$1.apply(StreamMonitor.scala:7)
scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
   
org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:984)
   
org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:974)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
org.apache.spark.scheduler.Task.run(Task.scala:54)
   
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
   
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)

So based on the above info, 8 is the first number in the file and I think
it should be parsed to integer without any problems.
I know it may be a very stupid question and the answer may be very easy. But
I really can not find the reason. I am thankful to anyone who helps!



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/NumberFormatException-tp20694.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: NumberFormatException

2014-12-15 Thread Sean Owen
That certainly looks surprising. Are you sure there are no unprintable
characters in the file?
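
A quick way to check is to dump the code point of every character in the token
that failed; a sketch like this would expose a BOM or other invisible character
(a UTF-8 BOM shows up as U+FEFF, a stray carriage return as U+000D):

val s = "8"   // replace with the value actually read from the file
s.foreach(c => println(f"'$c' -> U+${c.toInt}%04X"))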

On Mon, Dec 15, 2014 at 9:49 PM, yu yuz1...@iastate.edu wrote:
 The exception info is:
 14/12/15 15:35:03 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0
 (TID 0, h3): java.lang.NumberFormatException: For input string: 8


-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: NumberFormatException

2014-12-15 Thread Harihar Nahak
Hi Yu,

Try this :
val data = csv.map(line => line.split(",").map(elem => elem.trim)) // lines in rows

data.map(rec => (rec(0).toInt, rec(1).toInt))

to convert them into integers.

On 16 December 2014 at 10:49, yu [via Apache Spark User List] 
ml-node+s1001560n20694...@n3.nabble.com wrote:

 Hello, everyone

 I know 'NumberFormatException' is due to the reason that String can not be
 parsed properly, but I really can not find any mistakes for my code. I hope
 someone may kindly help me.
 My hdfs file is as follows:
 8,22
 3,11
 40,10
 49,47
 48,29
 24,28
 50,30
 33,56
 4,20
 30,38
 ...

 So each line contains an integer + "," + an integer + "\n"
 My code is as follows:
 object StreamMonitor {
   def main(args: Array[String]): Unit = {
     val myFunc = (str: String) => {
       val strArray = str.trim().split(",")
       (strArray(0).toInt, strArray(1).toInt)
     }
     val conf = new SparkConf().setAppName("StreamMonitor");
     val ssc = new StreamingContext(conf, Seconds(30));
     val datastream = ssc.textFileStream("/user/yu/streaminput");
     val newstream = datastream.map(myFunc)
     newstream.saveAsTextFiles("output/", "");
     ssc.start()
     ssc.awaitTermination()
   }

 }

 The exception info is:
 14/12/15 15:35:03 WARN scheduler.TaskSetManager: Lost task 0.0 in stage
 0.0 (TID 0, h3): java.lang.NumberFormatException: For input string: 8

 java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)

 java.lang.Integer.parseInt(Integer.java:492)
 java.lang.Integer.parseInt(Integer.java:527)

 scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229)
 scala.collection.immutable.StringOps.toInt(StringOps.scala:31)
 StreamMonitor$$anonfun$1.apply(StreamMonitor.scala:9)
 StreamMonitor$$anonfun$1.apply(StreamMonitor.scala:7)
 scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
 scala.collection.Iterator$$anon$11.next(Iterator.scala:328)

 org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:984)


 org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:974)

 org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
 org.apache.spark.scheduler.Task.run(Task.scala:54)

 org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)

 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)


 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

 java.lang.Thread.run(Thread.java:745)

 So based on the above info, 8 is the first number in the file and I
 think it should be parsed to integer without any problems.
 I know it may be a very stupid question and the answer may be very easy.
 But I really can not find the reason. I am thankful to anyone who helps!




-- 
Regards,
Harihar Nahak
BigData Developer
Wynyard
Email:hna...@wynyardgroup.com | Extn: 8019




-
--Harihar
--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/NumberFormatException-tp20694p20696.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: NumberFormatException

2014-12-15 Thread Akhil Das
There could be some other character like a space or ^M etc. You could try
the following and see the actual row.

val newstream = datastream.map(row => {
  try {
    val strArray = row.trim().split(",")
    (strArray(0).toInt, strArray(1).toInt)
    // Instead try this:
    // (strArray(0).trim().toInt, strArray(1).trim().toInt)
  } catch {
    case e: Exception =>
      println("W000t!! Exception!! = " + e + "\nThe line was: " + row)
      (0, 0)
  }
})


Thanks
Best Regards

On Tue, Dec 16, 2014 at 3:19 AM, yu yuz1...@iastate.edu wrote:

 Hello, everyone

 I know 'NumberFormatException' is due to the reason that String can not be
 parsed properly, but I really can not find any mistakes for my code. I hope
 someone may kindly help me.
 My hdfs file is as follows:
 8,22
 3,11
 40,10
 49,47
 48,29
 24,28
 50,30
 33,56
 4,20
 30,38
 ...

 So each line contains an integer + "," + an integer + "\n"
 My code is as follows:
 object StreamMonitor {
   def main(args: Array[String]): Unit = {
     val myFunc = (str: String) => {
       val strArray = str.trim().split(",")
       (strArray(0).toInt, strArray(1).toInt)
     }
     val conf = new SparkConf().setAppName("StreamMonitor");
     val ssc = new StreamingContext(conf, Seconds(30));
     val datastream = ssc.textFileStream("/user/yu/streaminput");
     val newstream = datastream.map(myFunc)
     newstream.saveAsTextFiles("output/", "");
     ssc.start()
     ssc.awaitTermination()
   }

 }

 The exception info is:
 14/12/15 15:35:03 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0
 (TID 0, h3): java.lang.NumberFormatException: For input string: 8


 java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
 java.lang.Integer.parseInt(Integer.java:492)
 java.lang.Integer.parseInt(Integer.java:527)

 scala.collection.immutable.StringLike$class.toInt(StringLike.scala:229)
 scala.collection.immutable.StringOps.toInt(StringOps.scala:31)
 StreamMonitor$$anonfun$1.apply(StreamMonitor.scala:9)
 StreamMonitor$$anonfun$1.apply(StreamMonitor.scala:7)
 scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
 scala.collection.Iterator$$anon$11.next(Iterator.scala:328)


 org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:984)


 org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:974)
 org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
 org.apache.spark.scheduler.Task.run(Task.scala:54)

 org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)


 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)


 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 java.lang.Thread.run(Thread.java:745)

 So based on the above info, 8 is the first number in the file and I think
 it should be parsed to integer without any problems.
 I know it may be a very stupid question and the answer may be very easy.
 But
 I really can not find the reason. I am thankful to anyone who helps!



 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/NumberFormatException-tp20694.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org