[ https://issues.apache.org/jira/browse/SPARK-26146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16782519#comment-16782519 ]

Michael Heuer commented on SPARK-26146:
---------------------------------------

[~srowen] Hey, wait a minute, the full example from the original submitter 
appears to be on GitHub here:

[https://github.com/jgperrin/net.jgp.books.spark.ch01]

This example has no explicit dependency on paranamer, and neither does Disq, 
for that matter:

[https://github.com/disq-bio/disq/blob/5760a80b3322c537226a876e0df4f7710188f7b2/pom.xml]

This is not the first instance I've seen (and reported) where Spark itself has 
dependencies that conflict with, or are otherwise incompatible with, each 
other. Those conflicts do not manifest at test scope in Spark CI; they only 
show up when an external project has a compile-scope dependency on the Spark 
jars.
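
For downstream projects hitting this, one possible workaround (a sketch only, 
assuming the root cause is an older paranamer 2.7 being resolved transitively 
while jackson-module-scala for Scala 2.12 needs paranamer 2.8 to read the 
newer bytecode) is to pin paranamer in the consuming project's pom:

{code:xml}
<!-- Hedged workaround sketch, not a fix shipped by Spark itself:
     force paranamer 2.8 so that jackson-module-scala's BeanIntrospector
     can read Scala 2.12 / Java 8 bytecode. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.thoughtworks.paranamer</groupId>
      <artifactId>paranamer</artifactId>
      <version>2.8</version>
    </dependency>
  </dependencies>
</dependencyManagement>
{code}

With that in place, {{mvn dependency:tree -Dincludes=com.thoughtworks.paranamer}} 
should show a single 2.8 entry instead of the transitive 2.7.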

> CSV wouldn't be ingested in Spark 2.4.0 with Scala 2.12
> ------------------------------------------------------
>
>                 Key: SPARK-26146
>                 URL: https://issues.apache.org/jira/browse/SPARK-26146
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 2.4.0
>            Reporter: Jean Georges Perrin
>            Priority: Major
>
> Ingestion of a CSV file seems to fail with Spark v2.4.0 and Scala v2.12, 
> whereas it works fine with Scala v2.11.
> When running a simple CSV ingestion like:
> {code:java}
>     // Creates a session on a local master
>     SparkSession spark = SparkSession.builder()
>         .appName("CSV to Dataset")
>         .master("local")
>         .getOrCreate();
>     // Reads a CSV file with header, called books.csv, and stores it in a dataframe
>     Dataset<Row> df = spark.read().format("csv")
>         .option("header", "true")
>         .load("data/books.csv");
> {code}
>   With Scala 2.12, I get: 
> {code:java}
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 10582
> at com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.accept(BytecodeReadingParanamer.java:563)
> at com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.access$200(BytecodeReadingParanamer.java:338)
> at com.thoughtworks.paranamer.BytecodeReadingParanamer.lookupParameterNames(BytecodeReadingParanamer.java:103)
> at com.thoughtworks.paranamer.CachingParanamer.lookupParameterNames(CachingParanamer.java:90)
> at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.getCtorParams(BeanIntrospector.scala:44)
> at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$1(BeanIntrospector.scala:58)
> at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$1$adapted(BeanIntrospector.scala:58)
> at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:240)
> ...
> at net.jgp.books.sparkWithJava.ch01.CsvToDataframeApp.start(CsvToDataframeApp.java:37)
> at net.jgp.books.sparkWithJava.ch01.CsvToDataframeApp.main(CsvToDataframeApp.java:21)
> {code}
> It works pretty smoothly if I switch back to 2.11.
> Full example available at 
> [https://github.com/jgperrin/net.jgp.books.sparkWithJava.ch01]. You can 
> modify pom.xml to easily change the Scala version in the properties section:
> {code:java}
> <properties>
>  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
>  <java.version>1.8</java.version>
>  <scala.version>2.11</scala.version>
>  <spark.version>2.4.0</spark.version>
> </properties>{code}
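>
> A rough sketch of how the Spark dependencies presumably consume that 
> property (the artifactId suffixes below are an assumption about this pom, 
> not copied from it):
> {code:xml}
> <!-- Assumed dependency section: flipping <scala.version> between 2.11 and
>      2.12 switches which Scala-suffixed Spark artifacts get resolved. -->
> <dependencies>
>   <dependency>
>     <groupId>org.apache.spark</groupId>
>     <artifactId>spark-core_${scala.version}</artifactId>
>     <version>${spark.version}</version>
>   </dependency>
>   <dependency>
>     <groupId>org.apache.spark</groupId>
>     <artifactId>spark-sql_${scala.version}</artifactId>
>     <version>${spark.version}</version>
>   </dependency>
> </dependencies>
> {code}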
>  
> (PS: It's my first bug submission, so I hope I didn't mess up too much; 
> please be tolerant if I did.)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
