Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21403#discussion_r191383447

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala ---
    @@ -45,6 +46,10 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper {
       private def getValueExpression(e: Expression): Seq[Expression] = {
         e match {
           case cns : CreateNamedStruct => cns.valExprs
    +      case Literal(struct: InternalRow, dt: StructType) if dt.isInstanceOf[StructType] =>
    +        dt.zipWithIndex.map { case (field, idx) => Literal(struct.get(idx, field.dataType)) }
    +      case a @ AttributeReference(_, dt: StructType, _, _) =>
    --- End diff --

    @hvanhovell I think SPARK-24395 also relates somewhat to this. If we consider `(a, b) in (select (null, null))` as a comparison between structs, as you mentioned, we have to return the row when `a` and `b` are `null`. So is the right approach to keep structs as they are and not unpack them? Honestly, the more I think about it, the more I think unpacking is the right option.
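    Just to make the trade-off concrete, here is a minimal sketch of the equality semantics behind the two options. It is illustrative only (not code from this PR), meant to be pasted into `spark-shell` where `spark` is predefined; the results noted in the comments are what I would expect and are worth verifying:

    ```scala
    // Unpacked rewrite: (a, b) IN (SELECT x, y ...) effectively becomes a = x AND b = y,
    // so a NULL field yields NULL under three-valued logic and the row is not returned.
    spark.sql("SELECT CAST(NULL AS INT) = CAST(NULL AS INT)").show()
    // expected: null

    // Struct comparison: the whole tuple is compared as a single struct, and the struct
    // comparator treats two NULL fields as equal, so the same row would be returned.
    spark.sql(
      "SELECT named_struct('a', CAST(NULL AS INT)) = named_struct('a', CAST(NULL AS INT))"
    ).show()
    // expected: true
    ```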