Ruifeng Zheng created SPARK-47749:
-------------------------------------

             Summary: Dataframe.collect should accept duplicated column names
                 Key: SPARK-47749
                 URL: https://issues.apache.org/jira/browse/SPARK-47749
             Project: Spark
          Issue Type: Improvement
          Components: Connect
    Affects Versions: 4.0.0
            Reporter: Ruifeng Zheng

{code:java}
+---+---+---+---+
|  i|  j|  i|  j|
+---+---+---+---+
|  1|  a|  1|  a|
+---+---+---+---+
{code}

collect fails with

{code:java}
[info]   org.apache.spark.sql.AnalysisException: [AMBIGUOUS_COLUMN_OR_FIELD] Column or field `i` is ambiguous and has 2 matches. SQLSTATE: 42702
[info]   at org.apache.spark.sql.errors.CompilationErrors.ambiguousColumnOrFieldError(CompilationErrors.scala:28)
[info]   at org.apache.spark.sql.errors.CompilationErrors.ambiguousColumnOrFieldError$(CompilationErrors.scala:23)
[info]   at org.apache.spark.sql.errors.CompilationErrors$.ambiguousColumnOrFieldError(CompilationErrors.scala:54)
[info]   at org.apache.spark.sql.connect.client.arrow.ArrowDeserializers$.$anonfun$createFieldLookup$1(ArrowDeserializer.scala:460)
[info]   at org.apache.spark.sql.connect.client.arrow.ArrowDeserializers$.$anonfun$createFieldLookup$1$adapted(ArrowDeserializer.scala:454)
[info]   at scala.collection.immutable.List.foreach(List.scala:334)
[info]   at org.apache.spark.sql.connect.client.arrow.ArrowDeserializers$.createFieldLookup(ArrowDeserializer.scala:454)
[info]   at org.apache.spark.sql.connect.client.arrow.ArrowDeserializers$.deserializerFor(ArrowDeserializer.scala:328)
[info]   at org.apache.spark.sql.connect.client.arrow.ArrowDeserializers$.deserializerFor(ArrowDeserializer.scala:86)
[info]   at org.apache.spark.sql.connect.client.arrow.ArrowDeserializingIterator.<init>(ArrowDeserializer.scala:542)
{code}
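
The stack trace points at a name-based field lookup (createFieldLookup) as the place where the duplicate `i` is rejected. The following is a minimal, Spark-free Python sketch of that failure mode, not the actual ArrowDeserializer code: `AmbiguousColumnError`, `lookup_by_name`, and `collect_by_position` are hypothetical names made up for illustration, and binding values by position is only one possible way to tolerate duplicated names, assumed here rather than taken from the issue.

```python
from collections import Counter


class AmbiguousColumnError(Exception):
    """Hypothetical stand-in for the AMBIGUOUS_COLUMN_OR_FIELD error."""


def lookup_by_name(columns):
    """Map each column name to its index; fail if a name occurs more than once."""
    for name, matches in Counter(columns).items():
        if matches > 1:
            raise AmbiguousColumnError(
                f"Column or field `{name}` is ambiguous and has {matches} matches."
            )
    return {name: index for index, name in enumerate(columns)}


def collect_by_position(columns, rows):
    """Pair each value with the column at the same ordinal, so duplicates are harmless."""
    return [tuple(zip(columns, row)) for row in rows]


columns = ["i", "j", "i", "j"]
rows = [(1, "a", 1, "a")]

try:
    lookup_by_name(columns)
except AmbiguousColumnError as error:
    print(error)

print(collect_by_position(columns, rows))
```

The name-based lookup cannot represent two columns called `i`, while the positional pairing never needs to resolve a name and so returns every value in the row.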