[jira] [Commented] (SPARK-20193) Selecting empty struct causes ExpressionEncoder error.
[ https://issues.apache.org/jira/browse/SPARK-20193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16832314#comment-16832314 ] Terry Moschou commented on SPARK-20193: --- I am in agreement with [~a.ionescu], and have a need for empty structs, built dynamically. It is quite confusing when spark can (or seems to) support empty structs constructed in other (non dynamic) ways. {code:java} case class Empty() val emptyStruct = udf(() => Empty()) val nullEmptyStruct = udf(() => null.asInstanceOf[Empty]) val df1 = spark.range(10).select(emptyStruct() as "a", nullEmptyStruct() as "b") case class X(a: Empty, b: Empty) val df2 = Seq(X(Empty(), null)).toDF val df3 = spark.createDataFrame( spark.sparkContext.parallelize(Seq( Row(Row(), null) )), StructType(StructField("a", StructType(Nil)) :: StructField("b", StructType(Nil)) :: Nil) ){code} > Selecting empty struct causes ExpressionEncoder error. > -- > > Key: SPARK-20193 > URL: https://issues.apache.org/jira/browse/SPARK-20193 > Project: Spark > Issue Type: Improvement > Components: Documentation, SQL >Affects Versions: 2.1.0 >Reporter: Adrian Ionescu >Priority: Minor > > {{def struct(cols: Column*): Column}} > Given the above signature and the lack of any note in the docs saying that a > struct with no columns is not supported, I would expect the following to work: > {{spark.range(3).select(col("id"), struct().as("empty_struct")).collect}} > However, this results in: > {quote} > java.lang.AssertionError: assertion failed: each serializer expression should > contains at least one `BoundReference` > at scala.Predef$.assert(Predef.scala:170) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:240) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:238) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at scala.collection.immutable.List.foreach(List.scala:381) > at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) > at scala.collection.immutable.List.flatMap(List.scala:344) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.(ExpressionEncoder.scala:238) > at > org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:63) > at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64) > at > org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$withPlan(Dataset.scala:2837) > at org.apache.spark.sql.Dataset.select(Dataset.scala:1131) > ... 39 elided > {quote} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-20193) Selecting empty struct causes ExpressionEncoder error.
[ https://issues.apache.org/jira/browse/SPARK-20193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037001#comment-16037001 ] Michael Smith commented on SPARK-20193: --- Would be nice for the error message to describe the error (e.g. "Cannot encode empty struct (column 'xxx')"), as the current message is a bit inscrutable. > Selecting empty struct causes ExpressionEncoder error. > -- > > Key: SPARK-20193 > URL: https://issues.apache.org/jira/browse/SPARK-20193 > Project: Spark > Issue Type: Improvement > Components: Documentation, SQL >Affects Versions: 2.1.0 >Reporter: Adrian Ionescu >Priority: Minor > > {{def struct(cols: Column*): Column}} > Given the above signature and the lack of any note in the docs saying that a > struct with no columns is not supported, I would expect the following to work: > {{spark.range(3).select(col("id"), struct().as("empty_struct")).collect}} > However, this results in: > {quote} > java.lang.AssertionError: assertion failed: each serializer expression should > contains at least one `BoundReference` > at scala.Predef$.assert(Predef.scala:170) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:240) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:238) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at scala.collection.immutable.List.foreach(List.scala:381) > at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) > at scala.collection.immutable.List.flatMap(List.scala:344) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.(ExpressionEncoder.scala:238) > at > org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:63) > at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64) > at > org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$withPlan(Dataset.scala:2837) > at org.apache.spark.sql.Dataset.select(Dataset.scala:1131) > ... 39 elided > {quote} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-20193) Selecting empty struct causes ExpressionEncoder error.
[ https://issues.apache.org/jira/browse/SPARK-20193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962422#comment-15962422 ] Hyukjin Kwon commented on SPARK-20193: -- Maybe, yea. but I guess we can't change the method signature as it breaks binary compatibility. > Selecting empty struct causes ExpressionEncoder error. > -- > > Key: SPARK-20193 > URL: https://issues.apache.org/jira/browse/SPARK-20193 > Project: Spark > Issue Type: Improvement > Components: Documentation, SQL >Affects Versions: 2.1.0 >Reporter: Adrian Ionescu >Priority: Minor > > {{def struct(cols: Column*): Column}} > Given the above signature and the lack of any note in the docs saying that a > struct with no columns is not supported, I would expect the following to work: > {{spark.range(3).select(col("id"), struct().as("empty_struct")).collect}} > However, this results in: > {quote} > java.lang.AssertionError: assertion failed: each serializer expression should > contains at least one `BoundReference` > at scala.Predef$.assert(Predef.scala:170) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:240) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:238) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at scala.collection.immutable.List.foreach(List.scala:381) > at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) > at scala.collection.immutable.List.flatMap(List.scala:344) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.(ExpressionEncoder.scala:238) > at > org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:63) > at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64) > at > org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$withPlan(Dataset.scala:2837) > at org.apache.spark.sql.Dataset.select(Dataset.scala:1131) > ... 39 elided > {quote} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-20193) Selecting empty struct causes ExpressionEncoder error.
[ https://issues.apache.org/jira/browse/SPARK-20193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954911#comment-15954911 ] Adrian Ionescu commented on SPARK-20193: In that case better change the signature of the function: {{def struct(col: Column, cols: Column*): Column}} > Selecting empty struct causes ExpressionEncoder error. > -- > > Key: SPARK-20193 > URL: https://issues.apache.org/jira/browse/SPARK-20193 > Project: Spark > Issue Type: Improvement > Components: Documentation, SQL >Affects Versions: 2.1.0 >Reporter: Adrian Ionescu >Priority: Minor > > {{def struct(cols: Column*): Column}} > Given the above signature and the lack of any note in the docs saying that a > struct with no columns is not supported, I would expect the following to work: > {{spark.range(3).select(col("id"), struct().as("empty_struct")).collect}} > However, this results in: > {quote} > java.lang.AssertionError: assertion failed: each serializer expression should > contains at least one `BoundReference` > at scala.Predef$.assert(Predef.scala:170) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:240) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:238) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at scala.collection.immutable.List.foreach(List.scala:381) > at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) > at scala.collection.immutable.List.flatMap(List.scala:344) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.(ExpressionEncoder.scala:238) > at > org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:63) > at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64) > at > org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$withPlan(Dataset.scala:2837) > at org.apache.spark.sql.Dataset.select(Dataset.scala:1131) > ... 39 elided > {quote} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-20193) Selecting empty struct causes ExpressionEncoder error.
[ https://issues.apache.org/jira/browse/SPARK-20193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954878#comment-15954878 ] Adrian Ionescu commented on SPARK-20193: Thanks for the workaround, but, sorry, this is not good enough. I agree that an empty struct is not very useful, but if it's not supported then the docs should say so and the error message should be clear. In my case, I'm building this struct dynamically, based on user input, so it may or may not be empty. Right now I have to special case it, but that introduces unnecessary complexity and makes the code less readable. > Selecting empty struct causes ExpressionEncoder error. > -- > > Key: SPARK-20193 > URL: https://issues.apache.org/jira/browse/SPARK-20193 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.1.0 >Reporter: Adrian Ionescu > Labels: struct > > {{def struct(cols: Column*): Column}} > Given the above signature and the lack of any note in the docs saying that a > struct with no columns is not supported, I would expect the following to work: > {{spark.range(3).select(col("id"), struct().as("empty_struct")).collect}} > However, this results in: > {quote} > java.lang.AssertionError: assertion failed: each serializer expression should > contains at least one `BoundReference` > at scala.Predef$.assert(Predef.scala:170) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:240) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:238) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at scala.collection.immutable.List.foreach(List.scala:381) > at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) > at scala.collection.immutable.List.flatMap(List.scala:344) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.(ExpressionEncoder.scala:238) > at > org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:63) > at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64) > at > org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$withPlan(Dataset.scala:2837) > at org.apache.spark.sql.Dataset.select(Dataset.scala:1131) > ... 39 elided > {quote} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-20193) Selecting empty struct causes ExpressionEncoder error.
[ https://issues.apache.org/jira/browse/SPARK-20193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954584#comment-15954584 ] Liang-Chi Hsieh commented on SPARK-20193: - Actually I am not sure what {{struct()}} represents. If you want a null for this struct, you can write: {code} spark.range(3).select(col("id"), lit(null).cast(new StructType())) {code} > Selecting empty struct causes ExpressionEncoder error. > -- > > Key: SPARK-20193 > URL: https://issues.apache.org/jira/browse/SPARK-20193 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.1.0 >Reporter: Adrian Ionescu > Labels: struct > > {{def struct(cols: Column*): Column}} > Given the above signature and the lack of any note in the docs saying that a > struct with no columns is not supported, I would expect the following to work: > {{spark.range(3).select(col("id"), struct().as("empty_struct")).collect}} > However, this results in: > {quote} > java.lang.AssertionError: assertion failed: each serializer expression should > contains at least one `BoundReference` > at scala.Predef$.assert(Predef.scala:170) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:240) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:238) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at scala.collection.immutable.List.foreach(List.scala:381) > at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) > at scala.collection.immutable.List.flatMap(List.scala:344) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.(ExpressionEncoder.scala:238) > at > org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:63) > at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64) > at > org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$withPlan(Dataset.scala:2837) > at org.apache.spark.sql.Dataset.select(Dataset.scala:1131) > ... 39 elided > {quote} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-20193) Selecting empty struct causes ExpressionEncoder error.
[ https://issues.apache.org/jira/browse/SPARK-20193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15953704#comment-15953704 ] Adrian Ionescu commented on SPARK-20193: cc [~hvanhovell] > Selecting empty struct causes ExpressionEncoder error. > -- > > Key: SPARK-20193 > URL: https://issues.apache.org/jira/browse/SPARK-20193 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.1.0 >Reporter: Adrian Ionescu > Labels: struct > > {{def struct(cols: Column*): Column}} > Given the above signature and the lack of any note in the docs saying that a > struct with no columns is not supported, I would expect the following to work: > {{spark.range(3).select(col("id"), struct().as("empty_struct")).collect}} > However, this results in: > {quote} > java.lang.AssertionError: assertion failed: each serializer expression should > contains at least one `BoundReference` > at scala.Predef$.assert(Predef.scala:170) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:240) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$$anonfun$11.apply(ExpressionEncoder.scala:238) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at > scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241) > at scala.collection.immutable.List.foreach(List.scala:381) > at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241) > at scala.collection.immutable.List.flatMap(List.scala:344) > at > org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.(ExpressionEncoder.scala:238) > at > org.apache.spark.sql.catalyst.encoders.RowEncoder$.apply(RowEncoder.scala:63) > at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64) > at > org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$withPlan(Dataset.scala:2837) > at org.apache.spark.sql.Dataset.select(Dataset.scala:1131) > ... 39 elided > {quote} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org