I looked into the source code of SparkHadoopMapReduceUtil.scala. I think the
failure comes from the following code:
  def newTaskAttemptContext(conf: Configuration, attemptId: TaskAttemptID): TaskAttemptContext = {
    val klass = firstAvailableClass(
        "org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl",  // hadoop2, hadoop2-yarn
        "org.apache.hadoop.mapreduce.TaskAttemptContext")           // hadoop1
    val ctor = klass.getDeclaredConstructor(classOf[Configuration], classOf[TaskAttemptID])
    ctor.newInstance(conf, attemptId).asInstanceOf[TaskAttemptContext]
  }
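For reference, the firstAvailableClass helper it calls (line 73 of the same
file, per the stack trace below) just tries Class.forName on each name in
order; my reading of it is roughly:

  // Sketch of firstAvailableClass as I read it in SparkHadoopMapReduceUtil.scala:
  // try the hadoop2 class name first, and fall back to the hadoop1 name.
  private def firstAvailableClass(first: String, second: String): Class[_] = {
    try {
      Class.forName(first)
    } catch {
      case e: ClassNotFoundException =>
        Class.forName(second)
    }
  }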
In other words, the failure depends on which of the hadoop2, hadoop2-yarn,
and hadoop1 classes gets picked up from the classpath.  Any suggestion on how
to resolve it?
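My understanding is that an IncompatibleClassChangeError with the message
"Implementing class" usually means TaskAttemptContextImpl (which implements
the hadoop2 TaskAttemptContext interface) is being linked against a hadoop1
jar, where TaskAttemptContext is a concrete class, i.e. mixed Hadoop versions
on the classpath.  One quick way to see which flavor wins at runtime is a
probe along these lines (illustrative code, not from the Spark source):

// Hypothetical standalone probe, not part of Spark: checks which of the two
// TaskAttemptContext variants is actually loadable on this classpath.
object ClasspathProbe {
  def main(args: Array[String]): Unit = {
    val names = Seq(
      "org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl", // hadoop2 only
      "org.apache.hadoop.mapreduce.TaskAttemptContext")          // class in hadoop1, interface in hadoop2
    for (name <- names) {
      try {
        val cls = Class.forName(name)
        // getCodeSource may be null for bootstrap classes; fine for a printout.
        val src = cls.getProtectionDomain.getCodeSource
        println(s"$name -> loaded, isInterface=${cls.isInterface}, from $src")
      } catch {
        // Catch Throwable: IncompatibleClassChangeError is an Error, not an Exception.
        case t: Throwable => println(s"$name -> ${t.getClass.getName}: ${t.getMessage}")
      }
    }
  }
}

If the second name loads as a plain class (isInterface=false), the hadoop1
jar is shadowing the hadoop2 one.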
Thanks.
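
PS: For the dependency-tree investigation suggested below, I assume the Maven
side would be something like
mvn dependency:tree -Dincludes=org.apache.hadoop
to show where any Hadoop artifacts are coming in transitively.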


> From: so...@cloudera.com
> Date: Fri, 23 Jan 2015 14:01:45 +0000
> Subject: Re: spark 1.1.0 save data to hdfs failed
> To: eyc...@hotmail.com
> CC: user@spark.apache.org
> 
> These are all definitely symptoms of mixing incompatible versions of 
> libraries.
> 
> I'm not suggesting you haven't excluded Spark / Hadoop, but, this is
> not the only way Hadoop deps get into your app. See my suggestion
> about investigating the dependency tree.
> 
> On Fri, Jan 23, 2015 at 1:53 PM, ey-chih chow <eyc...@hotmail.com> wrote:
> > Thanks.  But I think I already marked all the Spark and Hadoop deps as
> > provided.  Why is the cluster's version not used?
> >
> > Anyway, as I mentioned in the previous message, after changing
> > hadoop-client to version 1.2.1 in my Maven deps, I got past that
> > exception and hit another one, as indicated below.  Any suggestion on this?
> >
> > =================================
> >
> > Exception in thread "main" java.lang.reflect.InvocationTargetException
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:606)
> > at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)
> > at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
> > Caused by: java.lang.IncompatibleClassChangeError: Implementing class
> > at java.lang.ClassLoader.defineClass1(Native Method)
> > at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
> > at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> > at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
> > at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> > at java.lang.Class.forName0(Native Method)
> > at java.lang.Class.forName(Class.java:191)
> > at org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.firstAvailableClass(SparkHadoopMapReduceUtil.scala:73)
> > at org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.newTaskAttemptContext(SparkHadoopMapReduceUtil.scala:35)
> > at org.apache.spark.rdd.PairRDDFunctions.newTaskAttemptContext(PairRDDFunctions.scala:53)
> > at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:932)
> > at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:832)
> > at com.crowdstar.etl.ParseAndClean$.main(ParseAndClean.scala:103)
> > at com.crowdstar.etl.ParseAndClean.main(ParseAndClean.scala)
> >
> > ... 6 more
> >