[ https://issues.apache.org/jira/browse/SPARK-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156578#comment-14156578 ]
Takuya Ueshin commented on SPARK-3764:
--------------------------------------

Now I found the instructions [here|http://spark.apache.org/docs/latest/programming-guide.html#linking-with-spark], but they would not work for hadoop-1. I think we need a notice pointing hadoop-1 users to [Building Spark with Maven|http://spark.apache.org/docs/latest/building-with-maven.html#specifying-the-hadoop-version], and the same notice at [Download Spark|http://spark.apache.org/downloads.html]. (A sketch of the dependency override this implies for hadoop-2 users, and why it cannot help hadoop-1 users, follows the quoted issue below.)

> Invalid dependencies of artifacts in Maven Central Repository.
> --------------------------------------------------------------
>
>                 Key: SPARK-3764
>                 URL: https://issues.apache.org/jira/browse/SPARK-3764
>             Project: Spark
>          Issue Type: Bug
>          Components: Build
>    Affects Versions: 1.1.0
>            Reporter: Takuya Ueshin
>
> While testing my Spark applications locally using Spark artifacts downloaded from Maven Central, the following exception was thrown:
> {quote}
> ERROR executor.ExecutorUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-2,5,main]
> java.lang.IncompatibleClassChangeError: Found class org.apache.hadoop.mapreduce.TaskAttemptContext, but interface was expected
>         at org.apache.spark.sql.parquet.AppendingParquetOutputFormat.getDefaultWorkFile(ParquetTableOperations.scala:334)
>         at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:251)
>         at org.apache.spark.sql.parquet.InsertIntoParquetTable.org$apache$spark$sql$parquet$InsertIntoParquetTable$$writeShard$1(ParquetTableOperations.scala:300)
>         at org.apache.spark.sql.parquet.InsertIntoParquetTable$$anonfun$saveAsHadoopFile$1.apply(ParquetTableOperations.scala:318)
>         at org.apache.spark.sql.parquet.InsertIntoParquetTable$$anonfun$saveAsHadoopFile$1.apply(ParquetTableOperations.scala:318)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
>         at org.apache.spark.scheduler.Task.run(Task.scala:54)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {quote}
> This is because the Hadoop class {{TaskAttemptContext}} is incompatible between hadoop-1 and hadoop-2: it is a concrete class in hadoop-1 but an interface in hadoop-2, which is exactly what the {{IncompatibleClassChangeError}} above reports.
> I guess the Spark artifacts in Maven Central were built against hadoop-2 with Maven, but the hadoop version declared in {{pom.xml}} remains 1.0.4, so a Hadoop version mismatch happens at runtime.
> FYI: sbt seems to publish an 'effective pom'-like pom file, so for artifacts published with sbt the dependencies are resolved correctly.
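
For illustration, here is a minimal sbt sketch of the override described above; it is not from the original report. Assuming the published spark-core 1.1.0 jar really was compiled against hadoop-2 as the report suggests, a hadoop-2 user can exclude the transitive hadoop-client 1.0.4 declared in the published pom and pin a hadoop-2 release instead. The hadoop-client version 2.2.0 below is an illustrative assumption; it should match the cluster's actual version.

{code}
// build.sbt -- hypothetical workaround sketch, assuming Spark 1.1.0 on a hadoop-2 cluster.
// The published spark-core pom declares hadoop-client 1.0.4, but the jar (per this
// report) was compiled against hadoop-2, so we exclude the transitive hadoop-client
// and declare a hadoop-2 release explicitly.
libraryDependencies ++= Seq(
  ("org.apache.spark" %% "spark-core" % "1.1.0")
    .exclude("org.apache.hadoop", "hadoop-client"),
  "org.apache.hadoop" % "hadoop-client" % "2.2.0"  // assumption: match your cluster's version
)
{code}

Note that no such override can help hadoop-1 users: if the published jar was compiled against the hadoop-2 {{TaskAttemptContext}} interface, pinning a hadoop-1 client would reproduce the same {{IncompatibleClassChangeError}}. That is why hadoop-1 users need to build Spark themselves per [Building Spark with Maven|http://spark.apache.org/docs/latest/building-with-maven.html#specifying-the-hadoop-version], passing their Hadoop version via {{-Dhadoop.version}}.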