Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21403#discussion_r191383447

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala ---
    @@ -45,6 +46,10 @@ object RewritePredicateSubquery extends Rule[LogicalPlan] with PredicateHelper {
       private def getValueExpression(e: Expression): Seq[Expression] = {
         e match {
           case cns : CreateNamedStruct => cns.valExprs
    +      case Literal(struct: InternalRow, dt: StructType) =>
    +        dt.zipWithIndex.map { case (field, idx) => Literal(struct.get(idx, field.dataType)) }
    +      case a @ AttributeReference(_, dt: StructType, _, _) =>
    --- End diff ---
    
    @hvanhovell I think SPARK-24395 also somewhat relates to this. If we consider `(a, b) in (select (null, null))` as a comparison between structs, as you mentioned, we have to return the row when `a` and `b` are `null`. So, is the right approach to keep the structs as they are and not unpack them? Honestly, the more I think about it, the more I think unpacking is the right option.
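
    To make the difference concrete, here is a minimal spark-shell sketch (not part of this PR; the one-row view `l` is made up for illustration), contrasting the two readings:

    ```scala
    // One row with a = NULL and b = NULL.
    import spark.implicits._
    Seq((Option.empty[Int], Option.empty[Int])).toDF("a", "b").createOrReplaceTempView("l")

    // Unpacked reading: per-column equalities. NULL = NULL evaluates to NULL,
    // so the conjunction is never true and the row is filtered out.
    spark.sql(
      """SELECT * FROM l
        |WHERE a = CAST(NULL AS INT) AND b = CAST(NULL AS INT)""".stripMargin).show()
    // expected: empty result

    // Struct reading: compare the tuples as a whole. If struct equality treats
    // two NULL fields as equal, as assumed above, the row is returned.
    spark.sql(
      """SELECT * FROM l
        |WHERE named_struct('a', a, 'b', b) =
        |      named_struct('a', CAST(NULL AS INT), 'b', CAST(NULL AS INT))""".stripMargin).show()
    // expected (under struct-wise comparison): the (null, null) row
    ```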

