[ https://issues.apache.org/jira/browse/SPARK-25749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653209#comment-16653209 ]
Hyukjin Kwon commented on SPARK-25749:
--------------------------------------

Please avoid setting the priority to Critical+; those levels are usually reserved for committers. As for the issue itself: as of Spark 2.3.x, Avro is an external datasource, so the question is better directed to databricks/spark-avro.

> Exception thrown while reading avro file with large schema
> ----------------------------------------------------------
>
>                 Key: SPARK-25749
>                 URL: https://issues.apache.org/jira/browse/SPARK-25749
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0, 2.3.1, 2.3.2
>            Reporter: Raj
>            Priority: Blocker
>         Attachments: EncoderExample.scala, MainCC.scala, build.sbt, exception
>
> Hi, we are migrating our jobs from Spark 2.2.0 to Spark 2.3.1. One of the jobs reads an Avro source with a large nested schema. The job fails on Spark 2.3.1 (I have also tested Spark 2.3.0 and Spark 2.3.2; it fails on those as well). I am able to replicate this with some sample data and a dummy case class.
> Please find attached:
> *Code*: EncoderExample.scala, MainCC.scala & build.sbt
> *Exception log*: exception
> PS:
> The exception is {{java.lang.OutOfMemoryError: Java heap space}}. I have tried increasing the JVM heap size in Eclipse, but that does not help either.
> I have also tested the code on Spark 2.2.2, where it works fine. It seems this bug was introduced in Spark 2.3.0.
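The attachments are not reproduced in this message, so the following is only a hypothetical sketch of what a minimal reproduction might look like on Spark 2.3.x, based on the description above. The case classes ({{Leaf}}, {{Branch}}, {{MainCC}}) and the input path are assumptions standing in for the reporter's actual large nested schema, and the read goes through the external databricks/spark-avro datasource, as noted in the comment.

{code:scala}
import org.apache.spark.sql.SparkSession

// Hypothetical stand-ins for the reporter's large nested schema.
// The real MainCC.scala attachment presumably nests far more deeply/widely.
case class Leaf(f1: String, f2: Long, f3: Double, f4: Option[String])
case class Branch(a: Leaf, b: Leaf, c: Leaf, d: Leaf)
case class MainCC(x: Branch, y: Branch, z: Branch)

object EncoderExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("avro-large-schema-repro")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // On Spark 2.3.x, Avro is read via the external spark-avro package;
    // deriving the Dataset encoder for a wide nested case class is where the
    // reported java.lang.OutOfMemoryError would surface.
    val ds = spark.read
      .format("com.databricks.spark.avro")
      .load("/path/to/large-schema.avro") // assumed path
      .as[MainCC]

    ds.show(1)
    spark.stop()
  }
}
{code}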