[ https://issues.apache.org/jira/browse/SPARK-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156578#comment-14156578 ]

Takuya Ueshin commented on SPARK-3764:
--------------------------------------

I found the linking instructions 
[here|http://spark.apache.org/docs/latest/programming-guide.html#linking-with-spark], 
but they do not work for hadoop-1.
I think we need a notice pointing hadoop-1 users to [Building Spark with 
Maven|http://spark.apache.org/docs/latest/building-with-maven.html#specifying-the-hadoop-version], 
and the same notice on the [Download Spark|http://spark.apache.org/downloads.html] 
page.
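For hadoop-1 users, the build command from that page looks like the following 
(a sketch based on the "Specifying the Hadoop Version" section; substitute the 
hadoop-1 release actually deployed on your cluster):

{code}
# Build Spark from source against a hadoop-1 release
# (the version number below is illustrative).
mvn -Dhadoop.version=1.2.1 -DskipTests clean package
{code}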

> Invalid dependencies of artifacts in Maven Central Repository.
> --------------------------------------------------------------
>
>                 Key: SPARK-3764
>                 URL: https://issues.apache.org/jira/browse/SPARK-3764
>             Project: Spark
>          Issue Type: Bug
>          Components: Build
>    Affects Versions: 1.1.0
>            Reporter: Takuya Ueshin
>
> While testing my Spark applications locally using the Spark artifacts downloaded 
> from Maven Central, the following exception was thrown:
> {quote}
> ERROR executor.ExecutorUncaughtExceptionHandler: Uncaught exception in thread 
> Thread[Executor task launch worker-2,5,main]
> java.lang.IncompatibleClassChangeError: Found class 
> org.apache.hadoop.mapreduce.TaskAttemptContext, but interface was expected
>       at 
> org.apache.spark.sql.parquet.AppendingParquetOutputFormat.getDefaultWorkFile(ParquetTableOperations.scala:334)
>       at 
> parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:251)
>       at 
> org.apache.spark.sql.parquet.InsertIntoParquetTable.org$apache$spark$sql$parquet$InsertIntoParquetTable$$writeShard$1(ParquetTableOperations.scala:300)
>       at 
> org.apache.spark.sql.parquet.InsertIntoParquetTable$$anonfun$saveAsHadoopFile$1.apply(ParquetTableOperations.scala:318)
>       at 
> org.apache.spark.sql.parquet.InsertIntoParquetTable$$anonfun$saveAsHadoopFile$1.apply(ParquetTableOperations.scala:318)
>       at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
>       at org.apache.spark.scheduler.Task.run(Task.scala:54)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> {quote}
> This is because the hadoop class {{TaskAttemptContext}} is incompatible 
> between hadoop-1 and hadoop-2: it is a concrete class in hadoop-1 but an 
> interface in hadoop-2, which is exactly what the 
> {{IncompatibleClassChangeError}} above reports.
> I guess the Spark artifacts in Maven Central were built against hadoop-2 with 
> Maven, but the hadoop version declared as a dependency in {{pom.xml}} remains 
> 1.0.4, so a hadoop version mismatch happens at runtime.
> FYI:
> sbt seems to publish an 'effective pom'-like POM file, so the dependencies are 
> resolved correctly.
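
To check which hadoop version the published POM actually pulls into an 
application build, the Maven dependency tree can be inspected (a sketch; run 
from the application project that depends on the Spark artifacts):

{code}
# Print the resolved hadoop-client artifact; with the published POMs this
# should show 1.0.4 even though the binaries were built against hadoop-2.
mvn dependency:tree -Dincludes=org.apache.hadoop:hadoop-client
{code}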


