[ https://issues.apache.org/jira/browse/SPARK-26146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16697533#comment-16697533 ]
Anders Eriksson commented on SPARK-26146:
-----------------------------------------

I also ran into this bug. I too could avoid it by reverting from Scala version 2.12 to 2.11.

> CSV wouldn't be ingested in Spark 2.4.0 with Scala 2.12
> -------------------------------------------------------
>
>                 Key: SPARK-26146
>                 URL: https://issues.apache.org/jira/browse/SPARK-26146
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 2.4.0
>            Reporter: Jean Georges Perrin
>            Priority: Major
>
> Ingestion of a CSV file seems to fail with Spark v2.4.0 and Scala v2.12, whereas it works with Scala v2.11.
> When running a simple CSV ingestion like:
> {code:java}
> // Creates a session on a local master
> SparkSession spark = SparkSession.builder()
>     .appName("CSV to Dataset")
>     .master("local")
>     .getOrCreate();
>
> // Reads a CSV file with a header, called books.csv, and stores it in a dataframe
> Dataset<Row> df = spark.read().format("csv")
>     .option("header", "true")
>     .load("data/books.csv");
> {code}
> With Scala 2.12, I get:
> {code:java}
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 10582
>     at com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.accept(BytecodeReadingParanamer.java:563)
>     at com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.access$200(BytecodeReadingParanamer.java:338)
>     at com.thoughtworks.paranamer.BytecodeReadingParanamer.lookupParameterNames(BytecodeReadingParanamer.java:103)
>     at com.thoughtworks.paranamer.CachingParanamer.lookupParameterNames(CachingParanamer.java:90)
>     at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.getCtorParams(BeanIntrospector.scala:44)
>     at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$1(BeanIntrospector.scala:58)
>     at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$1$adapted(BeanIntrospector.scala:58)
>     at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:240)
>     ...
>     at net.jgp.books.sparkWithJava.ch01.CsvToDataframeApp.start(CsvToDataframeApp.java:37)
>     at net.jgp.books.sparkWithJava.ch01.CsvToDataframeApp.main(CsvToDataframeApp.java:21)
> {code}
> It works smoothly if I switch back to 2.11.
> A full example is available at [https://github.com/jgperrin/net.jgp.books.sparkWithJava.ch01]. You can modify pom.xml to easily change the Scala version in the properties section:
> {code:java}
> <properties>
>   <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
>   <java.version>1.8</java.version>
>   <scala.version>2.11</scala.version>
>   <spark.version>2.4.0</spark.version>
> </properties>
> {code}
>
> (p.s. It's my first bug submission, so I hope I did not mess up too much; be tolerant if I did.)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
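Editor's note: the stack trace above originates in paranamer (pulled in via jackson-module-scala), which in this Spark version cannot read bytecode emitted by the Scala 2.12 compiler. Besides reverting to Scala 2.11 as described above, a commonly reported workaround is to force a newer paranamer in the application's pom.xml. This is a hedged sketch only; the 2.8 version number is an assumption and should be verified against your own build:

{code:xml}
<!-- Hypothetical pom.xml fragment: pins paranamer ahead of Spark's
     transitive 2.7, which trips over Scala 2.12 bytecode.
     Version 2.8 is an assumption, not taken from this report. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.thoughtworks.paranamer</groupId>
      <artifactId>paranamer</artifactId>
      <version>2.8</version>
    </dependency>
  </dependencies>
</dependencyManagement>
{code}

Because `dependencyManagement` only pins versions, this leaves the rest of the dependency tree untouched; the reverted-Scala-version approach in the properties block above remains the workaround actually confirmed in this issue.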