mcdull_zhang created SPARK-38542:
------------------------------------

             Summary: UnsafeHashedRelation should serialize numKeys out
                 Key: SPARK-38542
                 URL: https://issues.apache.org/jira/browse/SPARK-38542
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.2.0
            Reporter: mcdull_zhang


At present, UnsafeHashedRelation does not write out numKeys during
serialization, so a deserialized UnsafeHashedRelation has numKeys equal to 0.
As a result, every UnsafeRow returned by UnsafeHashedRelation.keys() has
numFields equal to 0, which can lead to missing or incorrect data.
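
The failure mode can be reproduced outside Spark with a small sketch. ToyRelation, FixedToyRelation and ToyRelationDemo below are hypothetical names used only for illustration; they are not Spark classes and this is not the actual UnsafeHashedRelation code. The sketch assumes plain Java serialization through java.io.Externalizable and shows that a field omitted from writeExternal comes back with its default value (0) after a round trip.

```scala
import java.io._

// Hypothetical stand-in for a relation with hand-written serialization.
// This is NOT Spark's UnsafeHashedRelation, only an illustration.
class ToyRelation(var numKeys: Int, var numFields: Int) extends Externalizable {
  // Externalizable requires a public no-arg constructor.
  def this() = this(0, 0)

  override def writeExternal(out: ObjectOutput): Unit = {
    // The bug being illustrated: numKeys is never written out.
    out.writeInt(numFields)
  }

  override def readExternal(in: ObjectInput): Unit = {
    numFields = in.readInt()
    // numKeys keeps the default from the no-arg constructor, i.e. 0.
  }
}

// Same toy class, but numKeys is written and read back symmetrically.
class FixedToyRelation(k: Int, f: Int) extends ToyRelation(k, f) {
  def this() = this(0, 0)

  override def writeExternal(out: ObjectOutput): Unit = {
    out.writeInt(numKeys)
    out.writeInt(numFields)
  }

  override def readExternal(in: ObjectInput): Unit = {
    numKeys = in.readInt()
    numFields = in.readInt()
  }
}

object ToyRelationDemo {
  // Serialize a value to bytes and deserialize it again.
  def roundTrip[T <: AnyRef](value: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val broken = roundTrip(new ToyRelation(2, 3))
    println(s"broken: numKeys=${broken.numKeys}, numFields=${broken.numFields}")
    val fixed = roundTrip(new FixedToyRelation(2, 3))
    println(s"fixed:  numKeys=${fixed.numKeys}, numFields=${fixed.numFields}")
  }
}
```

In this sketch the broken variant loses numKeys while numFields survives, and the fixed variant preserves both. Whether the real fix takes exactly this shape is an assumption; the point is only that the write and read sides must cover the same fields in the same order.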

 

For example, in SubqueryBroadcastExec, the HashedRelation.keys() function is 
called.
{code:java}
val broadcastRelation = child.executeBroadcast[HashedRelation]().value
val (iter, expr) = if (broadcastRelation.isInstanceOf[LongHashedRelation]) {
  (broadcastRelation.keys(), HashJoin.extractKeyExprAt(buildKeys, index))
} else {
  (broadcastRelation.keys(),
    BoundReference(index, buildKeys(index).dataType, buildKeys(index).nullable))
}
{code}

--
This message was sent by Atlassian Jira
(v8.20.1#820001)