RE: spark 1.1.0 save data to hdfs failed

2015-01-24 Thread ey-chih chow
I modified my pom.xml according to the Spark pom.xml.  It is working right now. 
 Hadoop2 classes are no longer packaged into my jar.  Thanks.

From: eyc...@hotmail.com
To: so...@cloudera.com
CC: user@spark.apache.org
Subject: RE: spark 1.1.0 save data to hdfs failed
Date: Sat, 24 Jan 2015 07:30:45 -0800




Thanks for the information.  I changed the dependencies for Spark jars as 
follows:
 
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.1.0</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.10</artifactId>
  <version>1.1.0</version>
  <scope>provided</scope>
</dependency>
I don't know how these libraries are built, but I saw that Spark has Maven pom 
files, so I assume these jars are built from the corresponding poms.  Those poms 
depend on hadoop version 1.0.4, so I don't know where the hadoop2 jar comes from.  
What follows is a major fragment of my current dependency tree.  I don't see where 
the hadoop2 classes get into my built jar.
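
As a quick cross-check against the tree below, a small probe like the following (a 
hypothetical helper, not part of the original thread; it assumes the path of the 
assembled application jar is passed as the first argument) lists any entries from 
Hadoop 2's org.apache.hadoop.mapreduce.task package that ended up inside the jar:

import java.util.jar.JarFile
import scala.collection.JavaConverters._

object JarInspect {
  def main(args: Array[String]): Unit = {
    // args(0) is assumed to be the path to the assembled application jar
    val jar = new JarFile(args(0))
    try {
      jar.entries().asScala
        .map(_.getName)
        .filter(_.startsWith("org/apache/hadoop/mapreduce/task/"))  // hadoop2-only package
        .foreach(println)
    } finally {
      jar.close()
    }
  }
}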


==




[INFO] |  \- org.apache.hadoop:hadoop-core:jar:1.2.1:provided
[INFO] | +- xmlenc:xmlenc:jar:0.52:provided
[INFO] | +- (com.sun.jersey:jersey-core:jar:1.8:provided - omitted for 
duplicate)
[INFO] | +- (com.sun.jersey:jersey-json:jar:1.8:provided - omitted for 
duplicate)
[INFO] | +- (com.sun.jersey:jersey-server:jar:1.8:provided - omitted for 
duplicate)
[INFO] | +- (commons-io:commons-io:jar:2.1:provided - omitted for conflict 
with 2.4)
[INFO] | +- (commons-codec:commons-codec:jar:1.4:compile - scope updated 
from provided; omitted for duplicate)
[INFO] | +- (org.apache.commons:commons-math:jar:2.1:provided - omitted for 
duplicate)
[INFO] | +- commons-configuration:commons-configuration:jar:1.6:provided
[INFO] | |  +- (commons-collections:commons-collections:jar:3.2.1:provided 
- omitted for duplicate)
[INFO] | |  +- (commons-lang:commons-lang:jar:2.4:provided - omitted for 
conflict with 2.6)
[INFO] | |  +- (commons-logging:commons-logging:jar:1.1.1:provided - 
omitted for duplicate)
[INFO] | |  +- commons-digester:commons-digester:jar:1.8:provided
[INFO] | |  |  +- commons-beanutils:commons-beanutils:jar:1.7.0:provided
[INFO] | |  |  |  \- (commons-logging:commons-logging:jar:1.0.3:provided - 
omitted for conflict with 1.1.1)
[INFO] | |  |  \- (commons-logging:commons-logging:jar:1.1:provided - 
omitted for conflict with 1.1.1)
[INFO] | |  \- commons-beanutils:commons-beanutils-core:jar:1.8.0:provided
[INFO] | | \- (commons-logging:commons-logging:jar:1.1.1:provided - 
omitted for duplicate)
[INFO] | +- (commons-net:commons-net:jar:1.4.1:provided - omitted for 
conflict with 2.2)
[INFO] | +- commons-el:commons-el:jar:1.0:provided
[INFO] | |  \- (commons-logging:commons-logging:jar:1.0.3:provided - 
omitted for conflict with 1.1.1)
[INFO] | +- hsqldb:hsqldb:jar:1.8.0.10:provided
[INFO] | +- oro:oro:jar:2.0.8:provided
[INFO] | \- (org.codehaus.jackson:jackson-mapper-asl:jar:1.8.8:provided - 
omitted for conflict with 1.9.13)
[INFO] +- org.apache.spark:spark-core_2.10:jar:1.1.0:provided
[INFO] |  +- (org.apache.hadoop:hadoop-client:jar:1.0.4:provided - omitted for 
conflict with 1.2.1)
[INFO] |  +- net.java.dev.jets3t:jets3t:jar:0.7.1:provided
[INFO] |  |  +- (commons-codec:commons-codec:jar:1.3:provided - omitted for 
conflict with 1.4)
[INFO] |  |  \- (commons-httpclient:commons-httpclient:jar:3.1:provided - 
omitted for duplicate)
[INFO] |  +- org.apache.curator:curator-recipes:jar:2.4.0:provided
[INFO] |  |  +- org.apache.curator:curator-framework:jar:2.4.0:provided
[INFO] |  |  |  +- org.apache.curator:curator-client:jar:2.4.0:provided
[INFO] |  |  |  |  +- (org.slf4j:slf4j-api:jar:1.6.4:provided - omitted for 
conflict with 1.6.1)
[INFO] |  |  |  |  +- (org.apache.zookeeper:zookeeper:jar:3.4.5:provided - 
omitted for duplicate)
[INFO] |  |  |  |  \- (com.google.guava:guava:jar:14.0.1:provided - omitted for 
duplicate)
[INFO] |  |  |  +- (org.apache.zookeeper:zookeeper:jar:3.4.5:provided - omitted 
for duplicate)
[INFO] |  |  |  \- (com.google.guava:guava:jar:14.0.1:provided - omitted for 
duplicate)
[INFO] |  |  +- (org.apache.zookeeper:zookeeper:jar:3.4.5:provided - omitted 
for conflict with 3.4.6)
[INFO] |  |  \- (com.google.guava:guava:jar:14.0.1:provided - omitted for 
duplicate)
[INFO] |  +- org.eclipse.jetty:jetty-plus:jar:8.1.14.v20131031:provided
[INFO] |  |  +- 
org.eclipse.jetty.orbit:javax.transaction:jar:1.1.1.v201105210645:provided
[INFO] |  |  +- org.eclipse.jetty:jetty-webapp:jar:8.1.14.v20131031:provided
[INFO] |  |  |  +- org.eclipse.jetty:jetty-xml:jar:8.1.14.v20131031:provided
[INFO] |  |  |  |  \- 
(org.eclipse.jetty:jetty-util:jar:8.1.14.v20131031:provided - omitted for 
d

RE: spark 1.1.0 save data to hdfs failed

2015-01-24 Thread ey-chih chow
h 1.9.13)
[INFO] |  +- org.codehaus.jackson:jackson-jaxrs:jar:1.8.8:provided
[INFO] |  |  +- (org.codehaus.jackson:jackson-core-asl:jar:1.8.8:provided - 
omitted for conflict with 1.9.13)
[INFO] |  |  \- (org.codehaus.jackson:jackson-mapper-asl:jar:1.8.8:provided - 
omitted for conflict with 1.9.13)
[INFO] |  +- tomcat:jasper-compiler:jar:5.5.23:provided
[INFO] |  +- tomcat:jasper-runtime:jar:5.5.23:provided
[INFO] |  |  \- (commons-el:commons-el:jar:1.0:provided - omitted for duplicate)
[INFO] |  +- org.jamon:jamon-runtime:jar:2.3.1:provided
[INFO] |  +- (com.google.protobuf:protobuf-java:jar:2.5.0:provided - omitted 
for conflict with 2.4.1)
[INFO] |  +- com.sun.jersey:jersey-core:jar:1.8:provided
[INFO] |  +- com.sun.jersey:jersey-json:jar:1.8:provided
[INFO] |  |  +- org.codehaus.jettison:jettison:jar:1.1:provided
[INFO] |  |  +- com.sun.xml.bind:jaxb-impl:jar:2.2.3-1:provided
[INFO] |  |  |  \- (javax.xml.bind:jaxb-api:jar:2.2.2:provided - omitted for 
duplicate)
[INFO] |  |  +- (org.codehaus.jackson:jackson-core-asl:jar:1.7.1:provided - 
omitted for conflict with 1.9.13)
[INFO] |  |  +- (org.codehaus.jackson:jackson-mapper-asl:jar:1.7.1:provided - 
omitted for conflict with 1.9.13)
[INFO] |  |  +- (org.codehaus.jackson:jackson-jaxrs:jar:1.7.1:provided - 
omitted for conflict with 1.8.8)
[INFO] |  |  +- org.codehaus.jackson:jackson-xc:jar:1.7.1:provided
[INFO] |  |  |  +- (org.codehaus.jackson:jackson-core-asl:jar:1.7.1:provided - 
omitted for conflict with 1.9.13)
[INFO] |  |  |  \- (org.codehaus.jackson:jackson-mapper-asl:jar:1.7.1:provided 
- omitted for conflict with 1.9.13)
[INFO] |  |  \- (com.sun.jersey:jersey-core:jar:1.8:provided - omitted for 
duplicate)
[INFO] |  +- com.sun.jersey:jersey-server:jar:1.8:provided
[INFO] |  |  +- asm:asm:jar:3.1:provided
[INFO] |  |  \- (com.sun.jersey:jersey-core:jar:1.8:provided - omitted for 
duplicate)
[INFO] |  +- javax.xml.bind:jaxb-api:jar:2.2.2:provided
[INFO] |  |  \- javax.activation:activation:jar:1.1:provided
[INFO] |  +- org.cloudera.htrace:htrace-core:jar:2.04:provided
[INFO] |  |  +- (com.google.guava:guava:jar:12.0.1:provided - omitted for 
conflict with 14.0)
[INFO] |  |  +- (commons-logging:commons-logging:jar:1.1.1:provided - omitted 
for duplicate)
[INFO] |  |  \- (org.mortbay.jetty:jetty-util:jar:6.1.26:provided - omitted for 
duplicate)
[INFO] |  +- (org.apache.hadoop:hadoop-core:jar:1.2.1:provided - omitted for 
duplicate)
[INFO] |  +- 
com.github.stephenc.findbugs:findbugs-annotations:jar:1.3.9-1:provided
[INFO] |  \- junit:junit:jar:4.11:provided
[INFO] | \- org.hamcrest:hamcrest-core:jar:1.3:provided
[INFO] \- org.apache.hbase:hbase-client:jar:0.98.1-hadoop1:provided
[INFO]+- (org.apache.hbase:hbase-common:jar:0.98.1-hadoop1:provided - 
omitted for duplicate)
[INFO]+- (org.apache.hbase:hbase-protocol:jar:0.98.1-hadoop1:provided - 
omitted for duplicate)
[INFO]+- (commons-codec:commons-codec:jar:1.7:compile - scope updated from 
provided; omitted for duplicate)
[INFO]+- (commons-io:commons-io:jar:2.4:provided - omitted for duplicate)
[INFO]+- (commons-lang:commons-lang:jar:2.6:provided - omitted for 
duplicate)
[INFO]+- (commons-logging:commons-logging:jar:1.1.1:provided - omitted for 
duplicate)
[INFO]+- (com.google.guava:guava:jar:12.0.1:provided - omitted for conflict 
with 14.0)
[INFO]+- (com.google.protobuf:protobuf-java:jar:2.5.0:provided - omitted 
for conflict with 2.4.1)
[INFO]+- (org.apache.zookeeper:zookeeper:jar:3.4.6:provided - omitted for 
duplicate)
[INFO]+- (org.cloudera.htrace:htrace-core:jar:2.04:provided - omitted for 
duplicate)
[INFO]+- (org.codehaus.jackson:jackson-mapper-asl:jar:1.8.8:provided - 
omitted for conflict with 1.9.13)
[INFO]+- (org.apache.hadoop:hadoop-core:jar:1.2.1:provided - omitted for 
duplicate)
[INFO]+- 
(com.github.stephenc.findbugs:findbugs-annotations:jar:1.3.9-1:provided - 
omitted for duplicate)
[INFO]\- (junit:junit:jar:4.11:provided - omitted for duplicate)




> From: so...@cloudera.com
> Date: Sat, 24 Jan 2015 09:46:02 +
> Subject: Re: spark 1.1.0 save data to hdfs failed
> To: eyc...@hotmail.com
> CC: user@spark.apache.org
> 
> Hadoop 2's artifact is hadoop-common rather than hadoop-core but I
> assume you looked for that too. To answer your earlier question, no,
> Spark works with both Hadoop 1 and Hadoop 2 and is source-compatible
> with both. It can't be binary-compatible with both at once though. The
> code you cite is correct; there is no bug there.
> 
> Your first error definitely indicates you have the wrong version of
> Hadoop on the client side. It's not matching your HDFS version. And
> the second suggests you are mixing code compiled for different
> versions of Hadoop. I think you need to check what version of Hadoop
> your Spark is compiled for. For example I saw a reference to CDH 5.2
> which is Hadoop 2.5, but then you're sho

Re: spark 1.1.0 save data to hdfs failed

2015-01-24 Thread Sean Owen
Hadoop 2's artifact is hadoop-common rather than hadoop-core but I
assume you looked for that too. To answer your earlier question, no,
Spark works with both Hadoop 1 and Hadoop 2 and is source-compatible
with both. It can't be binary-compatible with both at once though. The
code you cite is correct; there is no bug there.

Your first error definitely indicates you have the wrong version of
Hadoop on the client side. It's not matching your HDFS version. And
the second suggests you are mixing code compiled for different
versions of Hadoop. I think you need to check what version of Hadoop
your Spark is compiled for. For example I saw a reference to CDH 5.2
which is Hadoop 2.5, but then you're showing that you are running an
old Hadoop 1.x HDFS? There seem to be a number of possible
incompatibilities here.
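
One way to confirm which Hadoop the client side is actually carrying (a minimal 
sketch, not from the thread; VersionInfo is a standard Hadoop utility class present 
in both Hadoop 1 and 2) is to print it from the same classpath the driver uses:

import org.apache.hadoop.util.VersionInfo

object HadoopVersionCheck {
  def main(args: Array[String]): Unit = {
    // Version of the Hadoop classes actually visible on this classpath,
    // to compare against the version the HDFS cluster is running.
    println("Hadoop version on classpath: " + VersionInfo.getVersion)
    println("Built from branch " + VersionInfo.getBranch + " by " + VersionInfo.getUser)
  }
}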

On Fri, Jan 23, 2015 at 11:38 PM, ey-chih chow  wrote:
> Sorry I still did not quiet get your resolution.  In my jar, there are
> following three related classes:
>
> org/apache/hadoop/mapreduce/task/TaskAttemptContextImpl.class
> org/apache/hadoop/mapreduce/task/TaskAttemptContextImpl$DummyReporter.class
> org/apache/hadoop/mapreduce/TaskAttemptContext.class
>
> I think the first two come from hadoop2 and the third from hadoop1.  I would
> like to get rid of the first two.  I checked my source code.  It does have a
> place using the class (or interface in hadoop2) TaskAttemptContext.
> Do you mean I make a separate jar for this portion of code and built with
> hadoop1 to get rid of dependency?  An alternative way is to  modify the code
> in SparkHadoopMapReduceUtil.scala and put it into my own source code to
> bypass the problem.  Any comment on this?  Thanks.
>
> 
> From: eyc...@hotmail.com
> To: so...@cloudera.com
> CC: user@spark.apache.org
> Subject: RE: spark 1.1.0 save data to hdfs failed
> Date: Fri, 23 Jan 2015 11:17:36 -0800
>
>
> Thanks.  I looked at the dependency tree.  I did not see any dependent jar
> of hadoop-core from hadoop2.  However the jar built from maven has the
> class:
>
>  org/apache/hadoop/mapreduce/task/TaskAttemptContextImpl.class
>
> Do you know why?
>
>
>
>
> 
> Date: Fri, 23 Jan 2015 17:01:48 +
> Subject: RE: spark 1.1.0 save data to hdfs failed
> From: so...@cloudera.com
> To: eyc...@hotmail.com
>
> Are you receiving my replies? I have suggested a resolution. Look at the
> dependency tree next.
>
> On Jan 23, 2015 2:43 PM, "ey-chih chow"  wrote:
>
> I looked into the source code of SparkHadoopMapReduceUtil.scala. I think it
> is broken in the following code:
>
>   def newTaskAttemptContext(conf: Configuration, attemptId: TaskAttemptID):
> TaskAttemptContext = {
> val klass = firstAvailableClass(
> "org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl",  //
> hadoop2, hadoop2-yarn
> "org.apache.hadoop.mapreduce.TaskAttemptContext")   //
> hadoop1
> val ctor = klass.getDeclaredConstructor(classOf[Configuration],
> classOf[TaskAttemptID])
> ctor.newInstance(conf, attemptId).asInstanceOf[TaskAttemptContext]
>   }
>
> In other words, it is related to hadoop2, hadoop2-yarn, and hadoop1.  Any
> suggestion how to resolve it?
>
> Thanks.
>
>
>
>> From: so...@cloudera.com
>> Date: Fri, 23 Jan 2015 14:01:45 +
>> Subject: Re: spark 1.1.0 save data to hdfs failed
>> To: eyc...@hotmail.com
>> CC: user@spark.apache.org
>>
>> These are all definitely symptoms of mixing incompatible versions of
>> libraries.
>>
>> I'm not suggesting you haven't excluded Spark / Hadoop, but, this is
>> not the only way Hadoop deps get into your app. See my suggestion
>> about investigating the dependency tree.
>>
>> On Fri, Jan 23, 2015 at 1:53 PM, ey-chih chow  wrote:
>> > Thanks. But I think I already mark all the Spark and Hadoop reps as
>> > provided. Why the cluster's version is not used?
>> >
>> > Any way, as I mentioned in the previous message, after changing the
>> > hadoop-client to version 1.2.1 in my maven deps, I already pass the
>> > exception and go to another one as indicated below. Any suggestion on
>> > this?
>> >
>> > =
>> >
>> > Exception in thread "main" java.lang.reflect.InvocationTargetException
>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> > at
>> >
>> > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> > at
>> >
>> > sun.reflect.DelegatingMethodAccessorImpl.invoke(Del

RE: spark 1.1.0 save data to hdfs failed

2015-01-23 Thread ey-chih chow
Sorry, I still did not quite get your resolution.  In my jar, there are the 
following three related classes:

org/apache/hadoop/mapreduce/task/TaskAttemptContextImpl.class
org/apache/hadoop/mapreduce/task/TaskAttemptContextImpl$DummyReporter.class
org/apache/hadoop/mapreduce/TaskAttemptContext.class

I think the first two come from hadoop2 and the third from hadoop1.  I would 
like to get rid of the first two.  I checked my source code; it does have a 
place that uses the class (or, in hadoop2, the interface) TaskAttemptContext.  Do you mean 
I should make a separate jar for this portion of the code and build it with hadoop1 to get 
rid of the dependency?  An alternative would be to modify the code in 
SparkHadoopMapReduceUtil.scala and put it into my own source code to bypass the 
problem.  Any comment on this?  Thanks.
From: eyc...@hotmail.com
To: so...@cloudera.com
CC: user@spark.apache.org
Subject: RE: spark 1.1.0 save data to hdfs failed
Date: Fri, 23 Jan 2015 11:17:36 -0800




Thanks.  I looked at the dependency tree.  I did not see any dependent hadoop-core 
jar from hadoop2.  However, the jar built by Maven contains the class:

 org/apache/hadoop/mapreduce/task/TaskAttemptContextImpl.class

Do you know why?



Date: Fri, 23 Jan 2015 17:01:48 +
Subject: RE: spark 1.1.0 save data to hdfs failed
From: so...@cloudera.com
To: eyc...@hotmail.com

Are you receiving my replies? I have suggested a resolution. Look at the 
dependency tree next. 
On Jan 23, 2015 2:43 PM, "ey-chih chow"  wrote:



I looked into the source code of SparkHadoopMapReduceUtil.scala. I think it is 
broken in the following code:
  def newTaskAttemptContext(conf: Configuration, attemptId: TaskAttemptID): TaskAttemptContext = {
    val klass = firstAvailableClass(
        "org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl",  // hadoop2, hadoop2-yarn
        "org.apache.hadoop.mapreduce.TaskAttemptContext")           // hadoop1
    val ctor = klass.getDeclaredConstructor(classOf[Configuration], classOf[TaskAttemptID])
    ctor.newInstance(conf, attemptId).asInstanceOf[TaskAttemptContext]
  }
In other words, it is related to hadoop2, hadoop2-yarn, and hadoop1.  Any 
suggestion how to resolve it?
Thanks.


> From: so...@cloudera.com
> Date: Fri, 23 Jan 2015 14:01:45 +0000
> Subject: Re: spark 1.1.0 save data to hdfs failed
> To: eyc...@hotmail.com
> CC: user@spark.apache.org
> 
> These are all definitely symptoms of mixing incompatible versions of 
> libraries.
> 
> I'm not suggesting you haven't excluded Spark / Hadoop, but, this is
> not the only way Hadoop deps get into your app. See my suggestion
> about investigating the dependency tree.
> 
> On Fri, Jan 23, 2015 at 1:53 PM, ey-chih chow  wrote:
> > Thanks.  But I think I already mark all the Spark and Hadoop reps as
> > provided.  Why the cluster's version is not used?
> >
> > Any way, as I mentioned in the previous message, after changing the
> > hadoop-client to version 1.2.1 in my maven deps, I already pass the
> > exception and go to another one as indicated below.  Any suggestion on this?
> >
> > =
> >
> > Exception in thread "main" java.lang.reflect.InvocationTargetException
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at
> > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > at
> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:606)
> > at
> > org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)
> > at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
> > Caused by: java.lang.IncompatibleClassChangeError: Implementing class
> > at java.lang.ClassLoader.defineClass1(Native Method)
> > at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
> > at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> > at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
> > at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> > at java.lang.Class.forName0(Native Method)
> > at java.lang.Class.forName(Class.java:191)
> > at
> > org.apache.hadoop.mapreduce.Spar

RE: spark 1.1.0 save data to hdfs failed

2015-01-23 Thread ey-chih chow
Thanks.  I looked at the dependency tree.  I did not see any dependent hadoop-core 
jar from hadoop2.  However, the jar built by Maven contains the class:

 org/apache/hadoop/mapreduce/task/TaskAttemptContextImpl.class

Do you know why?



Date: Fri, 23 Jan 2015 17:01:48 +
Subject: RE: spark 1.1.0 save data to hdfs failed
From: so...@cloudera.com
To: eyc...@hotmail.com

Are you receiving my replies? I have suggested a resolution. Look at the 
dependency tree next. 
On Jan 23, 2015 2:43 PM, "ey-chih chow"  wrote:



I looked into the source code of SparkHadoopMapReduceUtil.scala. I think it is 
broken in the following code:
  def newTaskAttemptContext(conf: Configuration, attemptId: TaskAttemptID): TaskAttemptContext = {
    val klass = firstAvailableClass(
        "org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl",  // hadoop2, hadoop2-yarn
        "org.apache.hadoop.mapreduce.TaskAttemptContext")           // hadoop1
    val ctor = klass.getDeclaredConstructor(classOf[Configuration], classOf[TaskAttemptID])
    ctor.newInstance(conf, attemptId).asInstanceOf[TaskAttemptContext]
  }
In other words, it is related to hadoop2, hadoop2-yarn, and hadoop1.  Any 
suggestion how to resolve it?
Thanks.


> From: so...@cloudera.com
> Date: Fri, 23 Jan 2015 14:01:45 +0000
> Subject: Re: spark 1.1.0 save data to hdfs failed
> To: eyc...@hotmail.com
> CC: user@spark.apache.org
> 
> These are all definitely symptoms of mixing incompatible versions of 
> libraries.
> 
> I'm not suggesting you haven't excluded Spark / Hadoop, but, this is
> not the only way Hadoop deps get into your app. See my suggestion
> about investigating the dependency tree.
> 
> On Fri, Jan 23, 2015 at 1:53 PM, ey-chih chow  wrote:
> > Thanks.  But I think I already mark all the Spark and Hadoop reps as
> > provided.  Why the cluster's version is not used?
> >
> > Any way, as I mentioned in the previous message, after changing the
> > hadoop-client to version 1.2.1 in my maven deps, I already pass the
> > exception and go to another one as indicated below.  Any suggestion on this?
> >
> > =
> >
> > Exception in thread "main" java.lang.reflect.InvocationTargetException
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at
> > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > at
> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:606)
> > at
> > org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)
> > at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
> > Caused by: java.lang.IncompatibleClassChangeError: Implementing class
> > at java.lang.ClassLoader.defineClass1(Native Method)
> > at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
> > at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> > at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
> > at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> > at java.lang.Class.forName0(Native Method)
> > at java.lang.Class.forName(Class.java:191)
> > at
> > org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.firstAvailableClass(SparkHadoopMapReduceUtil.scala:73)
> > at
> > org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.newTaskAttemptContext(SparkHadoopMapReduceUtil.scala:35)
> > at
> > org.apache.spark.rdd.PairRDDFunctions.newTaskAttemptContext(PairRDDFunctions.scala:53)
> > at
> > org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:932)
> > at
> > org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:832)
> > at com.crowdstar.etl.ParseAndClean$.main(ParseAndClean.scala:103)
> > at com.crowdstar.etl.ParseAndClean.main(ParseAndClean.scala)
> >
> > ... 6 more
> >
  
  

RE: spark 1.1.0 save data to hdfs failed

2015-01-23 Thread ey-chih chow
I also think the code is not robust enough.  First, since Spark works with hadoop1, 
why does the code try hadoop2 first?  Also, the following code only handles 
ClassNotFoundException; it should handle all the exceptions.

  private def firstAvailableClass(first: String, second: String): Class[_] = {
    try {
      Class.forName(first)
    } catch {
      case e: ClassNotFoundException =>
        Class.forName(second)
    }
  }
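
A sketch of the broader fallback being suggested here (illustrative only, not 
Spark's actual code; LinkageError covers both NoClassDefFoundError and the 
IncompatibleClassChangeError seen in the stack traces below):

object FallbackClassLookup {
  // Fall back to the second (hadoop1) class name on any linkage problem,
  // not only when the first (hadoop2) class is missing entirely.
  def firstAvailableClass(first: String, second: String): Class[_] =
    try {
      Class.forName(first)
    } catch {
      case _: ClassNotFoundException | _: LinkageError =>
        Class.forName(second)
    }
}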
From: eyc...@hotmail.com
To: so...@cloudera.com
CC: user@spark.apache.org
Subject: RE: spark 1.1.0 save data to hdfs failed
Date: Fri, 23 Jan 2015 06:43:00 -0800




I looked into the source code of SparkHadoopMapReduceUtil.scala. I think it is 
broken in the following code:
  def newTaskAttemptContext(conf: Configuration, attemptId: TaskAttemptID): TaskAttemptContext = {
    val klass = firstAvailableClass(
        "org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl",  // hadoop2, hadoop2-yarn
        "org.apache.hadoop.mapreduce.TaskAttemptContext")           // hadoop1
    val ctor = klass.getDeclaredConstructor(classOf[Configuration], classOf[TaskAttemptID])
    ctor.newInstance(conf, attemptId).asInstanceOf[TaskAttemptContext]
  }
In other words, it is related to hadoop2, hadoop2-yarn, and hadoop1.  Any 
suggestion how to resolve it?
Thanks.


> From: so...@cloudera.com
> Date: Fri, 23 Jan 2015 14:01:45 +0000
> Subject: Re: spark 1.1.0 save data to hdfs failed
> To: eyc...@hotmail.com
> CC: user@spark.apache.org
> 
> These are all definitely symptoms of mixing incompatible versions of 
> libraries.
> 
> I'm not suggesting you haven't excluded Spark / Hadoop, but, this is
> not the only way Hadoop deps get into your app. See my suggestion
> about investigating the dependency tree.
> 
> On Fri, Jan 23, 2015 at 1:53 PM, ey-chih chow  wrote:
> > Thanks.  But I think I already mark all the Spark and Hadoop reps as
> > provided.  Why the cluster's version is not used?
> >
> > Any way, as I mentioned in the previous message, after changing the
> > hadoop-client to version 1.2.1 in my maven deps, I already pass the
> > exception and go to another one as indicated below.  Any suggestion on this?
> >
> > =
> >
> > Exception in thread "main" java.lang.reflect.InvocationTargetException
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at
> > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > at
> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:606)
> > at
> > org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)
> > at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
> > Caused by: java.lang.IncompatibleClassChangeError: Implementing class
> > at java.lang.ClassLoader.defineClass1(Native Method)
> > at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
> > at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> > at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
> > at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> > at java.lang.Class.forName0(Native Method)
> > at java.lang.Class.forName(Class.java:191)
> > at
> > org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.firstAvailableClass(SparkHadoopMapReduceUtil.scala:73)
> > at
> > org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.newTaskAttemptContext(SparkHadoopMapReduceUtil.scala:35)
> > at
> > org.apache.spark.rdd.PairRDDFunctions.newTaskAttemptContext(PairRDDFunctions.scala:53)
> > at
> > org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:932)
> > at
> > org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:832)
> > at com.crowdstar.etl.ParseAndClean$.main(ParseAndClean.scala:103)
> > at com.crowdstar.etl.ParseAndClean.main(ParseAndClean.scala)
> >
> > ... 6 more
> >

  

RE: spark 1.1.0 save data to hdfs failed

2015-01-23 Thread ey-chih chow
I looked into the source code of SparkHadoopMapReduceUtil.scala.  I think it is 
broken in the following code:

  def newTaskAttemptContext(conf: Configuration, attemptId: TaskAttemptID): TaskAttemptContext = {
    val klass = firstAvailableClass(
        "org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl",  // hadoop2, hadoop2-yarn
        "org.apache.hadoop.mapreduce.TaskAttemptContext")           // hadoop1
    val ctor = klass.getDeclaredConstructor(classOf[Configuration], classOf[TaskAttemptID])
    ctor.newInstance(conf, attemptId).asInstanceOf[TaskAttemptContext]
  }

In other words, it is tied to hadoop2, hadoop2-yarn, and hadoop1.  Any 
suggestion on how to resolve this?
Thanks.
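
For reference, a small diagnostic along the following lines (hypothetical, not from 
the thread) can be run with the same classpath as the driver to see which of the two 
candidate classes resolves, whether TaskAttemptContext is a class or an interface, 
and which jar each one is loaded from.  In Hadoop 1 TaskAttemptContext is a class; in 
Hadoop 2 it is an interface, so a jar compiled against one hierarchy and run against 
the other fails with IncompatibleClassChangeError, as in the stack traces below.

object TaskAttemptContextProbe {
  def main(args: Array[String]): Unit = {
    val candidates = Seq(
      "org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl", // hadoop2, hadoop2-yarn
      "org.apache.hadoop.mapreduce.TaskAttemptContext")          // class in hadoop1, interface in hadoop2
    for (name <- candidates) {
      try {
        val k = Class.forName(name)
        // The code source shows which jar put this class on the classpath.
        val src = Option(k.getProtectionDomain.getCodeSource).map(_.getLocation).orNull
        println(name + " -> isInterface=" + k.isInterface + ", loaded from " + src)
      } catch {
        case _: ClassNotFoundException => println(name + " -> not on classpath")
      }
    }
  }
}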


> From: so...@cloudera.com
> Date: Fri, 23 Jan 2015 14:01:45 +
> Subject: Re: spark 1.1.0 save data to hdfs failed
> To: eyc...@hotmail.com
> CC: user@spark.apache.org
> 
> These are all definitely symptoms of mixing incompatible versions of 
> libraries.
> 
> I'm not suggesting you haven't excluded Spark / Hadoop, but, this is
> not the only way Hadoop deps get into your app. See my suggestion
> about investigating the dependency tree.
> 
> On Fri, Jan 23, 2015 at 1:53 PM, ey-chih chow  wrote:
> > Thanks.  But I think I already mark all the Spark and Hadoop reps as
> > provided.  Why the cluster's version is not used?
> >
> > Any way, as I mentioned in the previous message, after changing the
> > hadoop-client to version 1.2.1 in my maven deps, I already pass the
> > exception and go to another one as indicated below.  Any suggestion on this?
> >
> > =
> >
> > Exception in thread "main" java.lang.reflect.InvocationTargetException
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at
> > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > at
> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:606)
> > at
> > org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)
> > at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
> > Caused by: java.lang.IncompatibleClassChangeError: Implementing class
> > at java.lang.ClassLoader.defineClass1(Native Method)
> > at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
> > at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> > at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
> > at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> > at java.lang.Class.forName0(Native Method)
> > at java.lang.Class.forName(Class.java:191)
> > at
> > org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.firstAvailableClass(SparkHadoopMapReduceUtil.scala:73)
> > at
> > org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.newTaskAttemptContext(SparkHadoopMapReduceUtil.scala:35)
> > at
> > org.apache.spark.rdd.PairRDDFunctions.newTaskAttemptContext(PairRDDFunctions.scala:53)
> > at
> > org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:932)
> > at
> > org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:832)
> > at com.crowdstar.etl.ParseAndClean$.main(ParseAndClean.scala:103)
> > at com.crowdstar.etl.ParseAndClean.main(ParseAndClean.scala)
> >
> > ... 6 more
> >
  

Re: spark 1.1.0 save data to hdfs failed

2015-01-23 Thread Sean Owen
These are all definitely symptoms of mixing incompatible versions of libraries.

I'm not suggesting you haven't excluded Spark / Hadoop, but, this is
not the only way Hadoop deps get into your app. See my suggestion
about investigating the dependency tree.

On Fri, Jan 23, 2015 at 1:53 PM, ey-chih chow  wrote:
> Thanks.  But I think I already mark all the Spark and Hadoop reps as
> provided.  Why the cluster's version is not used?
>
> Any way, as I mentioned in the previous message, after changing the
> hadoop-client to version 1.2.1 in my maven deps, I already pass the
> exception and go to another one as indicated below.  Any suggestion on this?
>
> =
>
> Exception in thread "main" java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)
> at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
> Caused by: java.lang.IncompatibleClassChangeError: Implementing class
> at java.lang.ClassLoader.defineClass1(Native Method)
> at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
> at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
> at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:191)
> at
> org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.firstAvailableClass(SparkHadoopMapReduceUtil.scala:73)
> at
> org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.newTaskAttemptContext(SparkHadoopMapReduceUtil.scala:35)
> at
> org.apache.spark.rdd.PairRDDFunctions.newTaskAttemptContext(PairRDDFunctions.scala:53)
> at
> org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:932)
> at
> org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:832)
> at com.crowdstar.etl.ParseAndClean$.main(ParseAndClean.scala:103)
> at com.crowdstar.etl.ParseAndClean.main(ParseAndClean.scala)
>
> ... 6 more
>

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



RE: spark 1.1.0 save data to hdfs failed

2015-01-23 Thread ey-chih chow
Thanks.  But I think I already marked all the Spark and Hadoop deps as provided.  
Why isn't the cluster's version used?
Anyway, as I mentioned in the previous message, after changing hadoop-client to 
version 1.2.1 in my maven deps, I get past that exception and hit another one, as 
indicated below.  Any suggestion on this?
=
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)
at 
org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: java.lang.IncompatibleClassChangeError: Implementing class
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:191)
at 
org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.firstAvailableClass(SparkHadoopMapReduceUtil.scala:73)
at 
org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.newTaskAttemptContext(SparkHadoopMapReduceUtil.scala:35)
at 
org.apache.spark.rdd.PairRDDFunctions.newTaskAttemptContext(PairRDDFunctions.scala:53)
at 
org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:932)
at 
org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:832)
at com.crowdstar.etl.ParseAndClean$.main(ParseAndClean.scala:103)
at com.crowdstar.etl.ParseAndClean.main(ParseAndClean.scala)
... 6 more
> From: so...@cloudera.com
> Date: Fri, 23 Jan 2015 10:41:12 +
> Subject: Re: spark 1.1.0 save data to hdfs failed
> To: eyc...@hotmail.com
> CC: user@spark.apache.org
> 
> So, you should not depend on Hadoop artifacts unless you use them
> directly. You should mark Hadoop and Spark deps as provided. Then the
> cluster's version is used at runtime with spark-submit. That's the
> usual way to do it, which works.
> 
> If you need to embed Spark in your app and are running it outside the
> cluster for some reason, and you have to embed Hadoop and Spark code
> in your app, the version has to match. You should also use mvn
> dependency:tree to see all the dependencies coming in. There may be
> many sources of a Hadoop dep.
> 
> On Fri, Jan 23, 2015 at 1:05 AM, ey-chih chow  wrote:
> > Thanks.  But after I replace the maven dependence from
> >
> > 
> >  org.apache.hadoop
> >  hadoop-client
> >  2.5.0-cdh5.2.0
> >  provided
> >  
> >
> >  org.mortbay.jetty
> >  servlet-api
> >
> >
> >  javax.servlet
> >  servlet-api
> >
> >
> >  io.netty
> >  netty
> >
> >  
> > 
> >
> > to
> >
> > 
> >
> >  org.apache.hadoop
> >
> >  hadoop-client
> >
> >  1.0.4
> >
> >  provided
> >
> >  
> >
> >
> >
> >  org.mortbay.jetty
> >
> >  servlet-api
&

Re: spark 1.1.0 save data to hdfs failed

2015-01-23 Thread Sean Owen
So, you should not depend on Hadoop artifacts unless you use them
directly. You should mark Hadoop and Spark deps as provided. Then the
cluster's version is used at runtime with spark-submit. That's the
usual way to do it, which works.

If you need to embed Spark in your app and are running it outside the
cluster for some reason, and you have to embed Hadoop and Spark code
in your app, the version has to match. You should also use mvn
dependency:tree to see all the dependencies coming in. There may be
many sources of a Hadoop dep.

On Fri, Jan 23, 2015 at 1:05 AM, ey-chih chow  wrote:
> Thanks.  But after I replace the maven dependence from
>
> 
>  org.apache.hadoop
>  hadoop-client
>  2.5.0-cdh5.2.0
>  provided
>  
>
>  org.mortbay.jetty
>  servlet-api
>
>
>  javax.servlet
>  servlet-api
>
>
>  io.netty
>  netty
>
>  
> 
>
> to
>
> 
>
>  org.apache.hadoop
>
>  hadoop-client
>
>  1.0.4
>
>  provided
>
>  
>
>
>
>  org.mortbay.jetty
>
>  servlet-api
>
>
>
>
>
>  javax.servlet
>
>  servlet-api
>
>
>
>
>
>  io.netty
>
>  netty
>
>
>
>  
>
> 
>
>
> the warning message is still shown up in the namenode log.  Is there any
> other thing I need to do?
>
>
> Thanks.
>
>
> Ey-Chih Chow
>
>

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



RE: spark 1.1.0 save data to hdfs failed

2015-01-23 Thread ey-chih chow
After I changed the dependency to the following:

<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>1.2.1</version>
  <exclusions>
    <exclusion>
      <groupId>org.mortbay.jetty</groupId>
      <artifactId>servlet-api</artifactId>
    </exclusion>
    <exclusion>
      <groupId>javax.servlet</groupId>
      <artifactId>servlet-api</artifactId>
    </exclusion>
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>netty</artifactId>
    </exclusion>
  </exclusions>
</dependency>
I got the following error.  Any idea on this?  Thanks.
===
Caused by: java.lang.IncompatibleClassChangeError: Implementing class
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:191)
at 
org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.firstAvailableClass(SparkHadoopMapReduceUtil.scala:73)
at 
org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.newTaskAttemptContext(SparkHadoopMapReduceUtil.scala:35)
at 
org.apache.spark.rdd.PairRDDFunctions.newTaskAttemptContext(PairRDDFunctions.scala:53)
at 
org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:932)
at 
org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:832)
at com.crowdstar.etl.ParseAndClean$.main(ParseAndClean.scala:103)
at com.crowdstar.etl.ParseAndClean.main(ParseAndClean.scala)
... 6 more

From: eyc...@hotmail.com
To: so...@cloudera.com
CC: yuzhih...@gmail.com; user@spark.apache.org
Subject: RE: spark 1.1.0 save data to hdfs failed
Date: Thu, 22 Jan 2015 17:05:26 -0800




Thanks.  But after I replaced the maven dependency from

<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>2.5.0-cdh5.2.0</version>
  <scope>provided</scope>
  <exclusions>
    <exclusion>
      <groupId>org.mortbay.jetty</groupId>
      <artifactId>servlet-api</artifactId>
    </exclusion>
    <exclusion>
      <groupId>javax.servlet</groupId>
      <artifactId>servlet-api</artifactId>
    </exclusion>
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>netty</artifactId>
    </exclusion>
  </exclusions>
</dependency>

to

<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>1.0.4</version>
  <scope>provided</scope>
  <exclusions>
    <exclusion>
      <groupId>org.mortbay.jetty</groupId>
      <artifactId>servlet-api</artifactId>
    </exclusion>
    <exclusion>
      <groupId>javax.servlet</groupId>
      <artifactId>servlet-api</artifactId>
    </exclusion>
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>netty</artifactId>
    </exclusion>
  </exclusions>
</dependency>

the warning message still shows up in the namenode log.  Is there anything else I 
need to do?


Thanks.


Ey-Chih Chow


> From: so...@cloudera.com
> Date: Thu, 22 Jan 2015 22:34:22 +
> Subject: Re: spark 1.1.0 save data to hdfs failed
> To: eyc...@hotmail.com
> CC: yuzhih...@gmail.com; user@spark.apache.org
> 
> It means your client app is using Hadoop 2.x and your HDFS is Hadoop 1.x.
> 
> On Thu, Jan 22, 2015 at 10:32 PM, ey-chih chow  wrote:
> > I looked into the namenode log and found this message:
> >
> > 2015-01-22 22:18:39,441 WARN org.apache.hadoop.ipc.Server: Incorrect header
> > or version mismatch from 10.33.140.233:53776 got version 9 expected version
> > 4
> >
> > What should I do to fix this?
> >
> > Thanks.
> >
> > Ey-Chih
> >
> > ____
> > From: eyc...@hotmail.com
> > To: yuzhih...@gmail.com
> > CC: user@spark.apache.o

RE: spark 1.1.0 save data to hdfs failed

2015-01-22 Thread ey-chih chow
Thanks.  But after I replaced the maven dependency from

<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>2.5.0-cdh5.2.0</version>
  <scope>provided</scope>
  <exclusions>
    <exclusion>
      <groupId>org.mortbay.jetty</groupId>
      <artifactId>servlet-api</artifactId>
    </exclusion>
    <exclusion>
      <groupId>javax.servlet</groupId>
      <artifactId>servlet-api</artifactId>
    </exclusion>
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>netty</artifactId>
    </exclusion>
  </exclusions>
</dependency>

to

<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>1.0.4</version>
  <scope>provided</scope>
  <exclusions>
    <exclusion>
      <groupId>org.mortbay.jetty</groupId>
      <artifactId>servlet-api</artifactId>
    </exclusion>
    <exclusion>
      <groupId>javax.servlet</groupId>
      <artifactId>servlet-api</artifactId>
    </exclusion>
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>netty</artifactId>
    </exclusion>
  </exclusions>
</dependency>

the warning message still shows up in the namenode log.  Is there anything else I 
need to do?


Thanks.


Ey-Chih Chow


> From: so...@cloudera.com
> Date: Thu, 22 Jan 2015 22:34:22 +
> Subject: Re: spark 1.1.0 save data to hdfs failed
> To: eyc...@hotmail.com
> CC: yuzhih...@gmail.com; user@spark.apache.org
> 
> It means your client app is using Hadoop 2.x and your HDFS is Hadoop 1.x.
> 
> On Thu, Jan 22, 2015 at 10:32 PM, ey-chih chow  wrote:
> > I looked into the namenode log and found this message:
> >
> > 2015-01-22 22:18:39,441 WARN org.apache.hadoop.ipc.Server: Incorrect header
> > or version mismatch from 10.33.140.233:53776 got version 9 expected version
> > 4
> >
> > What should I do to fix this?
> >
> > Thanks.
> >
> > Ey-Chih
> >
> > ____
> > From: eyc...@hotmail.com
> > To: yuzhih...@gmail.com
> > CC: user@spark.apache.org
> > Subject: RE: spark 1.1.0 save data to hdfs failed
> > Date: Wed, 21 Jan 2015 23:12:56 -0800
> >
> > The hdfs release should be hadoop 1.0.4.
> >
> > Ey-Chih Chow
> >
> > 
> > Date: Wed, 21 Jan 2015 16:56:25 -0800
> > Subject: Re: spark 1.1.0 save data to hdfs failed
> > From: yuzhih...@gmail.com
> > To: eyc...@hotmail.com
> > CC: user@spark.apache.org
> >
> > What hdfs release are you using ?
> >
> > Can you check namenode log around time of error below to see if there is
> > some clue ?
> >
> > Cheers
> >
> > On Wed, Jan 21, 2015 at 4:51 PM, ey-chih chow  wrote:
> >
> > Hi,
> >
> > I used the following fragment of a scala program to save data to hdfs:
> >
> > contextAwareEvents
> > .map(e => (new AvroKey(e), null))
> > .saveAsNewAPIHadoopFile("hdfs://" + masterHostname + ":9000/ETL/output/"
> > + dateDir,
> > classOf[AvroKey[GenericRecord]],
> > classOf[NullWritable],
> > classOf[AvroKeyOutputFormat[GenericRecord]],
> > job.getConfiguration)
> >
> > But it failed with the following error messages.  Is there any people who
> > can help?  Thanks.
> >
> > Ey-Chih Chow
> >
> > =
> >
> > Exception in thread "main" java.lang.reflect.InvocationTargetException
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at
> > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > at
> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:606)
> > at
> > org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)
> > at
> > org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
> > Caused by: java.io.IOException: Failed on local exception:
> > java.io.EOFException; Host Details : local host is:
> > "ip-10-33-140-157/10.33.140.157"; destination host is:
> > "ec2-54-203-58-2.us-west-2.compute.amazonaws.com":9000;
> > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> &

Re: spark 1.1.0 save data to hdfs failed

2015-01-22 Thread Sean Owen
It means your client app is using Hadoop 2.x and your HDFS is Hadoop 1.x.

On Thu, Jan 22, 2015 at 10:32 PM, ey-chih chow  wrote:
> I looked into the namenode log and found this message:
>
> 2015-01-22 22:18:39,441 WARN org.apache.hadoop.ipc.Server: Incorrect header
> or version mismatch from 10.33.140.233:53776 got version 9 expected version
> 4
>
> What should I do to fix this?
>
> Thanks.
>
> Ey-Chih
>
> 
> From: eyc...@hotmail.com
> To: yuzhih...@gmail.com
> CC: user@spark.apache.org
> Subject: RE: spark 1.1.0 save data to hdfs failed
> Date: Wed, 21 Jan 2015 23:12:56 -0800
>
> The hdfs release should be hadoop 1.0.4.
>
> Ey-Chih Chow
>
> ________
> Date: Wed, 21 Jan 2015 16:56:25 -0800
> Subject: Re: spark 1.1.0 save data to hdfs failed
> From: yuzhih...@gmail.com
> To: eyc...@hotmail.com
> CC: user@spark.apache.org
>
> What hdfs release are you using ?
>
> Can you check namenode log around time of error below to see if there is
> some clue ?
>
> Cheers
>
> On Wed, Jan 21, 2015 at 4:51 PM, ey-chih chow  wrote:
>
> Hi,
>
> I used the following fragment of a scala program to save data to hdfs:
>
> contextAwareEvents
> .map(e => (new AvroKey(e), null))
> .saveAsNewAPIHadoopFile("hdfs://" + masterHostname + ":9000/ETL/output/"
> + dateDir,
> classOf[AvroKey[GenericRecord]],
> classOf[NullWritable],
> classOf[AvroKeyOutputFormat[GenericRecord]],
> job.getConfiguration)
>
> But it failed with the following error messages.  Is there any people who
> can help?  Thanks.
>
> Ey-Chih Chow
>
> =
>
> Exception in thread "main" java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)
> at
> org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
> Caused by: java.io.IOException: Failed on local exception:
> java.io.EOFException; Host Details : local host is:
> "ip-10-33-140-157/10.33.140.157"; destination host is:
> "ec2-54-203-58-2.us-west-2.compute.amazonaws.com":9000;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> at org.apache.hadoop.ipc.Client.call(Client.java:1415)
> at org.apache.hadoop.ipc.Client.call(Client.java:1364)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
> at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:744)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1925)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1079)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1075)
> at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1075)
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)
> at
> org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:145)
> at
> org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:900)
> at
> org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:832)
> at com.crowdstar.etl.ParseAndClean$.main(Par

RE: spark 1.1.0 save data to hdfs failed

2015-01-22 Thread ey-chih chow
I looked into the namenode log and found this message:
2015-01-22 22:18:39,441 WARN org.apache.hadoop.ipc.Server: Incorrect header or 
version mismatch from 10.33.140.233:53776 got version 9 expected version 4
What should I do to fix this?
Thanks.
Ey-Chih
From: eyc...@hotmail.com
To: yuzhih...@gmail.com
CC: user@spark.apache.org
Subject: RE: spark 1.1.0 save data to hdfs failed
Date: Wed, 21 Jan 2015 23:12:56 -0800




The hdfs release should be hadoop 1.0.4.
Ey-Chih Chow 

Date: Wed, 21 Jan 2015 16:56:25 -0800
Subject: Re: spark 1.1.0 save data to hdfs failed
From: yuzhih...@gmail.com
To: eyc...@hotmail.com
CC: user@spark.apache.org

What hdfs release are you using ?
Can you check namenode log around time of error below to see if there is some 
clue ?
Cheers
On Wed, Jan 21, 2015 at 4:51 PM, ey-chih chow  wrote:
Hi,



I used the following fragment of a scala program to save data to hdfs:



contextAwareEvents
  .map(e => (new AvroKey(e), null))
  .saveAsNewAPIHadoopFile("hdfs://" + masterHostname + ":9000/ETL/output/" + dateDir,
    classOf[AvroKey[GenericRecord]],
    classOf[NullWritable],
    classOf[AvroKeyOutputFormat[GenericRecord]],
    job.getConfiguration)



But it failed with the following error messages.  Is there anyone who can help?  
Thanks.



Ey-Chih Chow



=



Exception in thread "main" java.lang.reflect.InvocationTargetException

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at

sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at

sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at

org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)

at 
org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)

Caused by: java.io.IOException: Failed on local exception:

java.io.EOFException; Host Details : local host is:

"ip-10-33-140-157/10.33.140.157"; destination host is:

"ec2-54-203-58-2.us-west-2.compute.amazonaws.com":9000;

at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)

at org.apache.hadoop.ipc.Client.call(Client.java:1415)

at org.apache.hadoop.ipc.Client.call(Client.java:1364)

at

org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)

at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)

at

org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:744)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at

sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at

sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at

org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)

at

org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)

at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)

at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1925)

at

org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1079)

at

org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1075)

at

org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)

at

org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1075)

at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)

at

org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:145)

at

org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:900)

at

org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:832)

at com.crowdstar.etl.ParseAndClean$.main(ParseAndClean.scala:101)

at com.crowdstar.etl.ParseAndClean.main(ParseAndClean.scala)

... 6 more

Caused by: java.io.EOFException

at java.io.DataInputStream.readInt(DataInputStream.java:392)

at

org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1055)

at org.apache.hadoop.ipc.Client$Connection.run(Client.java:950)



===











--

View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-1-0-save-data-to-hdfs-failed-tp21305.html

Sent from the Apac

RE: spark 1.1.0 save data to hdfs failed

2015-01-21 Thread ey-chih chow
The hdfs release should be hadoop 1.0.4.
Ey-Chih Chow 

Date: Wed, 21 Jan 2015 16:56:25 -0800
Subject: Re: spark 1.1.0 save data to hdfs failed
From: yuzhih...@gmail.com
To: eyc...@hotmail.com
CC: user@spark.apache.org

What hdfs release are you using ?
Can you check namenode log around time of error below to see if there is some 
clue ?
Cheers
On Wed, Jan 21, 2015 at 4:51 PM, ey-chih chow  wrote:
Hi,



I used the following fragment of a scala program to save data to hdfs:



contextAwareEvents
  .map(e => (new AvroKey(e), null))
  .saveAsNewAPIHadoopFile("hdfs://" + masterHostname + ":9000/ETL/output/" + dateDir,
    classOf[AvroKey[GenericRecord]],
    classOf[NullWritable],
    classOf[AvroKeyOutputFormat[GenericRecord]],
    job.getConfiguration)



But it failed with the following error messages.  Is there anyone who can help?  
Thanks.



Ey-Chih Chow



=



Exception in thread "main" java.lang.reflect.InvocationTargetException

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at

sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at

sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at

org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)

at 
org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)

Caused by: java.io.IOException: Failed on local exception:

java.io.EOFException; Host Details : local host is:

"ip-10-33-140-157/10.33.140.157"; destination host is:

"ec2-54-203-58-2.us-west-2.compute.amazonaws.com":9000;

at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)

at org.apache.hadoop.ipc.Client.call(Client.java:1415)

at org.apache.hadoop.ipc.Client.call(Client.java:1364)

at

org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)

at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)

at

org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:744)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at

sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)

at

sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:606)

at

org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)

at

org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)

at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)

at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1925)

at

org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1079)

at

org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1075)

at

org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)

at

org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1075)

at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)

at

org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:145)

at

org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:900)

at

org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:832)

at com.crowdstar.etl.ParseAndClean$.main(ParseAndClean.scala:101)

at com.crowdstar.etl.ParseAndClean.main(ParseAndClean.scala)

... 6 more

Caused by: java.io.EOFException

at java.io.DataInputStream.readInt(DataInputStream.java:392)

at

org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1055)

at org.apache.hadoop.ipc.Client$Connection.run(Client.java:950)



===











--

View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-1-0-save-data-to-hdfs-failed-tp21305.html

Sent from the Apache Spark User List mailing list archive at Nabble.com.



-

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org

For additional commands, e-mail: user-h...@spark.apache.org




  

Re: spark 1.1.0 save data to hdfs failed

2015-01-21 Thread Ted Yu
What hdfs release are you using ?

Can you check namenode log around time of error below to see if there is
some clue ?

Cheers

On Wed, Jan 21, 2015 at 4:51 PM, ey-chih chow  wrote:

> Hi,
>
> I used the following fragment of a scala program to save data to hdfs:
>
> contextAwareEvents
> .map(e => (new AvroKey(e), null))
> .saveAsNewAPIHadoopFile("hdfs://" + masterHostname +
> ":9000/ETL/output/"
> + dateDir,
> classOf[AvroKey[GenericRecord]],
> classOf[NullWritable],
> classOf[AvroKeyOutputFormat[GenericRecord]],
> job.getConfiguration)
>
> But it failed with the following error messages.  Is there any people who
> can help?  Thanks.
>
> Ey-Chih Chow
>
> =
>
> Exception in thread "main" java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)
> at
> org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
> Caused by: java.io.IOException: Failed on local exception:
> java.io.EOFException; Host Details : local host is:
> "ip-10-33-140-157/10.33.140.157"; destination host is:
> "ec2-54-203-58-2.us-west-2.compute.amazonaws.com":9000;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> at org.apache.hadoop.ipc.Client.call(Client.java:1415)
> at org.apache.hadoop.ipc.Client.call(Client.java:1364)
> at
>
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
> at
>
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:744)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
> at
>
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
> at
> org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1925)
> at
>
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1079)
> at
>
> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1075)
> at
>
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at
>
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1075)
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)
> at
>
> org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:145)
> at
>
> org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:900)
> at
>
> org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:832)
> at com.crowdstar.etl.ParseAndClean$.main(ParseAndClean.scala:101)
> at com.crowdstar.etl.ParseAndClean.main(ParseAndClean.scala)
> ... 6 more
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:392)
> at
>
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1055)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:950)
>
> ===
>
>
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-1-0-save-data-to-hdfs-failed-tp21305.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>