[ https://issues.apache.org/jira/browse/SPARK-40282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17598132#comment-17598132 ]
Hyukjin Kwon commented on SPARK-40282:
--------------------------------------

We don't have this problem in the languages supported by the official Apache Spark. Is it a problem on the Kotlin side?

> DataType argument in StructType.add is incorrectly throwing scala.MatchError
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-40282
>                 URL: https://issues.apache.org/jira/browse/SPARK-40282
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: M. Manna
>            Priority: Major
>         Attachments: SparkApplication.kt, retailstore.csv
>
>
> *Problem Description*
> As part of the contract mentioned here, Spark should be able to support
> {{IntegerType}} as an argument to the {{StructType.add}} method. However, it
> currently fails with a {{scala.MatchError}}.
>
> If we call the overloaded version that accepts the type as a String value,
> e.g. "Integer", it works.
> *How to Reproduce*
> # Create a Kotlin project - I have used Kotlin, but Java will also work
> (with minor adjustments).
> # Place the attached CSV file in {{src/main/resources}}.
> # Compile the project with Java 11.
> # Run it - it fails with the error below.
> {code:java}
> Exception in thread "main" scala.MatchError: org.apache.spark.sql.types.IntegerType@363fe35a (of class org.apache.spark.sql.types.IntegerType)
> 	at org.apache.spark.sql.catalyst.encoders.RowEncoder$.externalDataTypeFor(RowEncoder.scala:240)
> 	at org.apache.spark.sql.catalyst.encoders.RowEncoder$.externalDataTypeForInput(RowEncoder.scala:236)
> 	at org.apache.spark.sql.catalyst.expressions.objects.ValidateExternalType.<init>(objects.scala:1890)
> 	at org.apache.spark.sql.catalyst.encoders.RowEncoder$.$anonfun$serializerFor$3(RowEncoder.scala:197)
> 	at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
> 	at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
> 	at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
> 	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
> 	at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
> 	at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
> 	at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:198)
> 	at org.apache.spark.sql.catalyst.encoders.RowEncoder$.serializerFor(RowEncoder.scala:192)
> 	at org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:73)
> 	at org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:81)
> 	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$1(Dataset.scala:92)
> 	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
> 	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:89)
> 	at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:444)
> 	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
> 	at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
> 	at scala.Option.getOrElse(Option.scala:189)
> 	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
> 	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:185)
> {code}
> # Now change the line (commented as HERE) to use a String value, i.e. "Integer".
> # It works.
>
> *Ask*
> # Why does the {{add}} function in {{StructType}} not accept {{IntegerType}}, {{StringType}}, etc. as {{DataType}} parameters?
> # If this is a bug, do we know when the fix can come?

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
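[Editorial sketch] The `IntegerType@363fe35a` in the trace (an identity hash on what should be a singleton) suggests the Kotlin code passed a freshly constructed `IntegerType` instance rather than Spark's `case object` singleton, which Scala's `case IntegerType =>` pattern in `RowEncoder` matches by identity. The class names below are invented stand-ins, not Spark's actual classes; this is a minimal plain-Java illustration of that suspected singleton-vs-instance pitfall:

```java
// Stand-in for Spark's IntegerType class (hypothetical, not the real class).
class FakeIntegerType { }

final class FakeTypes {
    // Stand-in for Scala's `case object IntegerType` singleton, which Spark
    // exposes to Java/Kotlin as DataTypes.IntegerType.
    static final FakeIntegerType INTEGER = new FakeIntegerType();

    // Mimics RowEncoder.externalDataTypeFor: a Scala case-object pattern
    // compares against the singleton, so a fresh instance falls through
    // and Scala raises scala.MatchError (modeled here as an exception).
    static String externalDataTypeFor(FakeIntegerType t) {
        if (t == INTEGER) {
            return "java.lang.Integer";
        }
        throw new IllegalStateException("MatchError: " + t);
    }
}

public class MatchErrorSketch {
    public static void main(String[] args) {
        // The shared singleton matches.
        System.out.println(FakeTypes.externalDataTypeFor(FakeTypes.INTEGER));
        // A freshly constructed instance does not, mirroring the reported error.
        try {
            FakeTypes.externalDataTypeFor(new FakeIntegerType());
        } catch (IllegalStateException e) {
            System.out.println("fresh instance failed: " + e.getMessage());
        }
    }
}
```

If this is indeed the cause, obtaining the type through `org.apache.spark.sql.types.DataTypes.IntegerType` (Spark's Java-facing singleton accessor) from Kotlin, instead of constructing an instance, should avoid the `scala.MatchError`.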