[jira] [Updated] (SPARK-46251) Spark 3.3.3 tuple encoders built using `Encoders.tuple` do not correctly cast null into None for Option values
[ https://issues.apache.org/jira/browse/SPARK-46251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Will Boulter updated SPARK-46251:

Description:
In Spark {{3.3.2}} encoders created using {{Encoders.tuple(encoder1, encoder2, ..)}} correctly handle casting {{null}} into \{{None}} when the target type is \{{{}an Option.
In Spark {{{}3.3.3{}}}, this behaviour has changed and the Option value comes through as {{null}} which is likely to cause a {{NullPointerException}} for most Scala code that operates on the Option. The change seems to be related to the following commit:
[https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
I have made a reproduction with a couple of examples in a public Github repo here:
[https://github.com/q-willboulter/spark-tuple-encoders-bug]
The common use case where this is likely to be encountered is while doing any joins that can return null, e.g. left or outer joins. When casting the result of a left join it is sensible to wrap the right-hand side in an Option to handle the case where there is no match. Since 3.3.3 this would fail if the encoder is derived manually using {{Encoders.tuple(leftEncoder, rightEncoder).}}
If the entire tuple encoder {{Encoder[(Left, Option[Right])]}} is derived at once using reflection, the encoder works as expected. The bug appears to be in the following function inside {{ExpressionEncoder.scala}}
{code:java}
def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] = ...{code}

was:
In Spark {{3.3.2}} encoders created using {{Encoders.tuple(encoder1, encoder2, ..)}} correctly handle casting \{{null}} into \{{None }} when the target type is \{{{}an Option.
In Spark {{{}3.3.3{}}}, this behaviour has changed and the Option value comes through as {{null}} which is likely to cause a {{NullPointerException}} for most Scala code that operates on the Option. The change seems to be related to the following commit:
[https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
I have made a reproduction with a couple of examples in a public Github repo here:
[https://github.com/q-willboulter/spark-tuple-encoders-bug]
The common use case where this is likely to be encountered is while doing any joins that can return null, e.g. left or outer joins. When casting the result of a left join it is sensible to wrap the right-hand side in an Option to handle the case where there is no match. Since 3.3.3 this would fail if the encoder is derived manually using {{Encoders.tuple(leftEncoder, rightEncoder).}}
If the entire tuple encoder {{Encoder[(Left, Option[Right])]}} is derived at once using reflection, the encoder works as expected. The bug appears to be in the following function inside {{ExpressionEncoder.scala}}
{code:java}
def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] = ...{code}

> Spark 3.3.3 tuple encoders built using `Encoders.tuple` do not correctly cast
> null into None for Option values
> --
>
> Key: SPARK-46251
> URL: https://issues.apache.org/jira/browse/SPARK-46251
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.3, 3.4.2, 3.4.0, 3.4.1, 3.5.0
> Reporter: Will Boulter
> Priority: Major
>
> In Spark {{3.3.2}} encoders created using {{Encoders.tuple(encoder1,
> encoder2, ..)}} correctly handle casting {{null}} into \{{None}} when the
> target type is \{{{}an Option.
> In Spark {{{}3.3.3{}}}, this behaviour has changed and the Option value comes
> through as {{null}} which is likely to cause a {{NullPointerException}} for
> most Scala code that operates on the Option. The change seems to be related
> to the following commit:
> [https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
> I have made a reproduction with a couple of examples in a public Github repo
> here:
> [https://github.com/q-willboulter/spark-tuple-encoders-bug]
> The common use case where this is likely to be encountered is while doing any
> joins that can return null, e.g. left or outer joins. When casting the result
> of a left join it is sensible to wrap the right-hand side in an Option to
> handle the case where there is no match. Since 3.3.3 this would fail if the
> encoder is derived manually using {{Encoders.tuple(leftEncoder,
> rightEncoder).}}
> If the entire tuple encoder {{Encoder[(Left, Option[Right])]}} is derived at
> once using reflection, the encoder works as expected. The bug appears to be
> in the following function inside {{ExpressionEncoder.scala}}
> {code:java}
> def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] =
> ...{code}

--
This message was sent by Atlassian Jira (v8.20.10#820010)
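The left-join scenario in the description can be sketched roughly as follows. This is only one plausible shape of the case the report describes, not the code in the linked reproduction repository, and it has not been run against any Spark version here. `LeftRow`, `RightRow` and `TupleEncoderSketch` are hypothetical names standing in for the report's `Left` and `Right`; using `ExpressionEncoder[Option[RightRow]]()` for the right-hand encoder is likewise an assumption about how the component encoder might be obtained.

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder

// Hypothetical row types standing in for the report's Left and Right.
final case class LeftRow(id: Int, name: String)
final case class RightRow(id: Int, value: String)

object TupleEncoderSketch {
  // Assembled piecewise with Encoders.tuple: the path the report says is affected.
  val manual: Encoder[(LeftRow, Option[RightRow])] =
    Encoders.tuple(Encoders.product[LeftRow], ExpressionEncoder[Option[RightRow]]())

  // Derived in one step via reflection: the path the report says still works.
  val derived: Encoder[(LeftRow, Option[RightRow])] =
    ExpressionEncoder[(LeftRow, Option[RightRow])]()

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-46251-sketch")
      .getOrCreate()
    import spark.implicits._

    val left  = Seq(LeftRow(1, "a"), LeftRow(2, "b")).toDS()
    val right = Seq(RightRow(1, "x")).toDS()

    // LeftRow(2, "b") has no match, so the join leaves its right-hand side null.
    val joined = left.joinWith(right, left("id") === right("id"), "left_outer")

    // Compare how each encoder decodes the unmatched right-hand side.
    joined.as[(LeftRow, Option[RightRow])](derived).collect().foreach(println)
    joined.as[(LeftRow, Option[RightRow])](manual).collect().foreach(println)

    spark.stop()
  }
}
```

Going by the report, the two printed results should agree on 3.3.2, while on the affected versions the `manual` encoder is the one expected to surface `null` in place of `None`.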
[jira] [Updated] (SPARK-46251) Spark 3.3.3 tuple encoders built using `Encoders.tuple` do not correctly cast null into None for Option values
[ https://issues.apache.org/jira/browse/SPARK-46251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Will Boulter updated SPARK-46251:

Description:
In Spark {{3.3.2}} encoders created using {{Encoders.tuple(encoder1, encoder2, ..)}} correctly handle casting \{{null}} into \{{None }} when the target type is \{{{}an Option.
In Spark {{{}3.3.3{}}}, this behaviour has changed and the Option value comes through as {{null}} which is likely to cause a {{NullPointerException}} for most Scala code that operates on the Option. The change seems to be related to the following commit:
[https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
I have made a reproduction with a couple of examples in a public Github repo here:
[https://github.com/q-willboulter/spark-tuple-encoders-bug]
The common use case where this is likely to be encountered is while doing any joins that can return null, e.g. left or outer joins. When casting the result of a left join it is sensible to wrap the right-hand side in an Option to handle the case where there is no match. Since 3.3.3 this would fail if the encoder is derived manually using {{Encoders.tuple(leftEncoder, rightEncoder).}}
If the entire tuple encoder {{Encoder[(Left, Option[Right])]}} is derived at once using reflection, the encoder works as expected. The bug appears to be in the following function inside {{ExpressionEncoder.scala}}
{code:java}
def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] = ...{code}

was:
In Spark {{3.3.2}} encoders created using {{Encoders.tuple(encoder1, encoder2, ..)}} correctly handle casting null into None when the target type is {{{}an Option{}}}.
In Spark {{{}3.3.3{}}}, this behaviour has changed and the Option value comes through as {{null}} which is likely to cause a {{NullPointerException}} for most Scala code that operates on the Option. The change seems to be related to the following commit:
[https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
I have made a reproduction with a couple of examples in a public Github repo here:
[https://github.com/q-willboulter/spark-tuple-encoders-bug]
The common use case where this is likely to be encountered is while doing any joins that can return null, e.g. left or outer joins. When casting the result of a left join it is sensible to wrap the right-hand side in an Option to handle the case where there is no match. Since 3.3.3 this would fail if the encoder is derived manually using {{Encoders.tuple(leftEncoder, rightEncoder).}}
If the entire tuple encoder {{Encoder[(Left, Option[Right])]}} is derived at once using reflection, the encoder works as expected. The bug appears to be in the following function inside {{ExpressionEncoder.scala}}
{code:java}
def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] = ...{code}

> Spark 3.3.3 tuple encoders built using `Encoders.tuple` do not correctly cast
> null into None for Option values
> --
>
> Key: SPARK-46251
> URL: https://issues.apache.org/jira/browse/SPARK-46251
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.3, 3.4.2, 3.4.0, 3.4.1, 3.5.0
> Reporter: Will Boulter
> Priority: Major
>
> In Spark {{3.3.2}} encoders created using {{Encoders.tuple(encoder1,
> encoder2, ..)}} correctly handle casting \{{null}} into \{{None }} when the
> target type is \{{{}an Option.
> In Spark {{{}3.3.3{}}}, this behaviour has changed and the Option value comes
> through as {{null}} which is likely to cause a {{NullPointerException}} for
> most Scala code that operates on the Option. The change seems to be related
> to the following commit:
> [https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
> I have made a reproduction with a couple of examples in a public Github repo
> here:
> [https://github.com/q-willboulter/spark-tuple-encoders-bug]
> The common use case where this is likely to be encountered is while doing any
> joins that can return null, e.g. left or outer joins. When casting the result
> of a left join it is sensible to wrap the right-hand side in an Option to
> handle the case where there is no match. Since 3.3.3 this would fail if the
> encoder is derived manually using {{Encoders.tuple(leftEncoder,
> rightEncoder).}}
> If the entire tuple encoder {{Encoder[(Left, Option[Right])]}} is derived at
> once using reflection, the encoder works as expected. The bug appears to be
> in the following function inside {{ExpressionEncoder.scala}}
> {code:java}
> def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] =
> ...{code}
[jira] [Updated] (SPARK-46251) Spark 3.3.3 tuple encoders built using Encoders.tuple do not correctly cast null into None for Option values
[ https://issues.apache.org/jira/browse/SPARK-46251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Will Boulter updated SPARK-46251:

Summary: Spark 3.3.3 tuple encoders built using Encoders.tuple do not correctly cast null into None for Option values (was: Spark 3.3.3 tuple encoders built using `Encoders.tuple` do not correctly cast null into None for Option values)

> Spark 3.3.3 tuple encoders built using Encoders.tuple do not correctly cast
> null into None for Option values
>
>
> Key: SPARK-46251
> URL: https://issues.apache.org/jira/browse/SPARK-46251
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.3, 3.4.2, 3.4.0, 3.4.1, 3.5.0
> Reporter: Will Boulter
> Priority: Major
>
> In Spark {{3.3.2}} encoders created using {{Encoders.tuple(encoder1,
> encoder2, ..)}} correctly handle casting {{null}} into {{None}} when the
> target type is an Option.
> In Spark {{{}3.3.3{}}}, this behaviour has changed and the Option value comes
> through as {{null}} which is likely to cause a {{NullPointerException}} for
> most Scala code that operates on the Option. The change seems to be related
> to the following commit:
> [https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
> I have made a reproduction with a couple of examples in a public Github repo
> here:
> [https://github.com/q-willboulter/spark-tuple-encoders-bug]
> The common use case where this is likely to be encountered is while doing any
> joins that can return null, e.g. left or outer joins. When casting the result
> of a left join it is sensible to wrap the right-hand side in an Option to
> handle the case where there is no match. Since 3.3.3 this would fail if the
> encoder is derived manually using {{Encoders.tuple(leftEncoder,
> rightEncoder).}}
> If the entire tuple encoder {{Encoder[(Left, Option[Right])]}} is derived at
> once using reflection, the encoder works as expected. The bug appears to be
> in the following function inside {{ExpressionEncoder.scala}}
> {code:java}
> def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] =
> ...{code}

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
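To illustrate why an `Option` that arrives as `null` is dangerous for ordinary Scala code, here is a small plain-Scala sketch with no Spark dependency. `NullOptionSketch`, `RightRow` and `normalise` are hypothetical names introduced only for this illustration. The guard shown is a consumer-side precaution under the assumption that the values have already been collected; it is not a fix for the encoder itself.

```scala
object NullOptionSketch {
  // Hypothetical payload type standing in for the report's Right.
  final case class RightRow(id: Int, value: String)

  // Treat a null Option reference the same as None; leave real values untouched.
  // Option(null) is None, so flattening Option(maybe) covers both cases.
  def normalise[A](maybe: Option[A]): Option[A] = Option(maybe).flatten

  def main(args: Array[String]): Unit = {
    // What the affected encoder is reported to hand back for an unmatched row.
    val decoded: Option[RightRow] = null

    // Calling any method on the null reference throws a NullPointerException.
    val attempt = scala.util.Try(decoded.map(_.value))
    println(attempt.isFailure) // prints "true"

    // After normalising, the usual Option operations are safe again.
    println(normalise(decoded).map(_.value).getOrElse("<no match>")) // prints "<no match>"
    println(normalise(Some(RightRow(1, "x"))).map(_.value))          // prints "Some(x)"
  }
}
```

Wrapping with `Option(...)` and flattening is chosen here because it needs no explicit null check and leaves well-formed `Some` and `None` values unchanged.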
[jira] [Updated] (SPARK-46251) Spark 3.3.3 tuple encoders built using `Encoders.tuple` do not correctly cast null into None for Option values
[ https://issues.apache.org/jira/browse/SPARK-46251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Will Boulter updated SPARK-46251:

Description:
In Spark {{3.3.2}} encoders created using {{Encoders.tuple(encoder1, encoder2, ..)}} correctly handle casting {{null}} into {{None}} when the target type is an Option.
In Spark {{{}3.3.3{}}}, this behaviour has changed and the Option value comes through as {{null}} which is likely to cause a {{NullPointerException}} for most Scala code that operates on the Option. The change seems to be related to the following commit:
[https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
I have made a reproduction with a couple of examples in a public Github repo here:
[https://github.com/q-willboulter/spark-tuple-encoders-bug]
The common use case where this is likely to be encountered is while doing any joins that can return null, e.g. left or outer joins. When casting the result of a left join it is sensible to wrap the right-hand side in an Option to handle the case where there is no match. Since 3.3.3 this would fail if the encoder is derived manually using {{Encoders.tuple(leftEncoder, rightEncoder).}}
If the entire tuple encoder {{Encoder[(Left, Option[Right])]}} is derived at once using reflection, the encoder works as expected. The bug appears to be in the following function inside {{ExpressionEncoder.scala}}
{code:java}
def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] = ...{code}

was:
In Spark {{3.3.2}} encoders created using {{Encoders.tuple(encoder1, encoder2, ..)}} correctly handle casting {{null}} into \{{None}} when the target type is \{{{}an Option.
In Spark {{{}3.3.3{}}}, this behaviour has changed and the Option value comes through as {{null}} which is likely to cause a {{NullPointerException}} for most Scala code that operates on the Option. The change seems to be related to the following commit:
[https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
I have made a reproduction with a couple of examples in a public Github repo here:
[https://github.com/q-willboulter/spark-tuple-encoders-bug]
The common use case where this is likely to be encountered is while doing any joins that can return null, e.g. left or outer joins. When casting the result of a left join it is sensible to wrap the right-hand side in an Option to handle the case where there is no match. Since 3.3.3 this would fail if the encoder is derived manually using {{Encoders.tuple(leftEncoder, rightEncoder).}}
If the entire tuple encoder {{Encoder[(Left, Option[Right])]}} is derived at once using reflection, the encoder works as expected. The bug appears to be in the following function inside {{ExpressionEncoder.scala}}
{code:java}
def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] = ...{code}

> Spark 3.3.3 tuple encoders built using `Encoders.tuple` do not correctly cast
> null into None for Option values
> --
>
> Key: SPARK-46251
> URL: https://issues.apache.org/jira/browse/SPARK-46251
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.3, 3.4.2, 3.4.0, 3.4.1, 3.5.0
> Reporter: Will Boulter
> Priority: Major
>
> In Spark {{3.3.2}} encoders created using {{Encoders.tuple(encoder1,
> encoder2, ..)}} correctly handle casting {{null}} into {{None}} when the
> target type is an Option.
> In Spark {{{}3.3.3{}}}, this behaviour has changed and the Option value comes
> through as {{null}} which is likely to cause a {{NullPointerException}} for
> most Scala code that operates on the Option. The change seems to be related
> to the following commit:
> [https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
> I have made a reproduction with a couple of examples in a public Github repo
> here:
> [https://github.com/q-willboulter/spark-tuple-encoders-bug]
> The common use case where this is likely to be encountered is while doing any
> joins that can return null, e.g. left or outer joins. When casting the result
> of a left join it is sensible to wrap the right-hand side in an Option to
> handle the case where there is no match. Since 3.3.3 this would fail if the
> encoder is derived manually using {{Encoders.tuple(leftEncoder,
> rightEncoder).}}
> If the entire tuple encoder {{Encoder[(Left, Option[Right])]}} is derived at
> once using reflection, the encoder works as expected. The bug appears to be
> in the following function inside {{ExpressionEncoder.scala}}
> {code:java}
> def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] =
> ...{code}
[jira] [Updated] (SPARK-46251) Spark 3.3.3 tuple encoders built using `Encoders.tuple` do not correctly cast null into None for Option values
[ https://issues.apache.org/jira/browse/SPARK-46251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Will Boulter updated SPARK-46251:

Description:
In Spark {{3.3.2}} encoders created using {{Encoders.tuple(encoder1, encoder2, ..)}} correctly handle casting null into None when the target type is {{{}an Option{}}}.
In Spark {{{}3.3.3{}}}, this behaviour has changed and the Option value comes through as {{null}} which is likely to cause a {{NullPointerException}} for most Scala code that operates on the Option. The change seems to be related to the following commit:
[https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
I have made a reproduction with a couple of examples in a public Github repo here:
[https://github.com/q-willboulter/spark-tuple-encoders-bug]
The common use case where this is likely to be encountered is while doing any joins that can return null, e.g. left or outer joins. When casting the result of a left join it is sensible to wrap the right-hand side in an Option to handle the case where there is no match. Since 3.3.3 this would fail if the encoder is derived manually using {{Encoders.tuple(leftEncoder, rightEncoder).}}
If the entire tuple encoder {{Encoder[(Left, Option[Right])]}} is derived at once using reflection, the encoder works as expected. The bug appears to be in the following function inside {{ExpressionEncoder.scala}}
{code:java}
def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] = ...{code}

was:
In Spark {{3.3.2}} encoders created using {{Encoders.tuple(encoder1, encoder2, ..)}} correctly handle casting {{null}} into {{None }}when the target type is an {{{}Option{}}}.
In Spark {{{}3.3.3{}}}, this behaviour has changed and the Option value comes through as {{null}} which is likely to cause a {{NullPointerException}} for most Scala code that operates on the Option. The change seems to be related to the following commit:
[https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
I have made a reproduction with a couple of examples in a public Github repo here:
[https://github.com/q-willboulter/spark-tuple-encoders-bug]
The common use case where this is likely to be encountered is while doing any joins that can return null, e.g. left or outer joins. When casting the result of a left join it is sensible to wrap the right-hand side in an Option to handle the case where there is no match. Since 3.3.3 this would fail if the encoder is derived manually using {{Encoders.tuple(leftEncoder, rightEncoder).}}
If the entire tuple encoder {{Encoder[(Left, Option[Right])]}} is derived at once using reflection, the encoder works as expected. The bug appears to be in the following function inside {{ExpressionEncoder.scala}}
{code:java}
def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] = ...{code}

> Spark 3.3.3 tuple encoders built using `Encoders.tuple` do not correctly cast
> null into None for Option values
> --
>
> Key: SPARK-46251
> URL: https://issues.apache.org/jira/browse/SPARK-46251
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.3, 3.4.2, 3.4.0, 3.4.1, 3.5.0
> Reporter: Will Boulter
> Priority: Major
>
> In Spark {{3.3.2}} encoders created using {{Encoders.tuple(encoder1,
> encoder2, ..)}} correctly handle casting null into None when
> the target type is {{{}an Option{}}}.
> In Spark {{{}3.3.3{}}}, this behaviour has changed and the Option value comes
> through as {{null}} which is likely to cause a {{NullPointerException}} for
> most Scala code that operates on the Option. The change seems to be related
> to the following commit:
> [https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
> I have made a reproduction with a couple of examples in a public Github repo
> here:
> [https://github.com/q-willboulter/spark-tuple-encoders-bug]
> The common use case where this is likely to be encountered is while doing any
> joins that can return null, e.g. left or outer joins. When casting the result
> of a left join it is sensible to wrap the right-hand side in an Option to
> handle the case where there is no match. Since 3.3.3 this would fail if the
> encoder is derived manually using {{Encoders.tuple(leftEncoder,
> rightEncoder).}}
> If the entire tuple encoder {{Encoder[(Left, Option[Right])]}} is derived at
> once using reflection, the encoder works as expected. The bug appears to be
> in the following function inside {{ExpressionEncoder.scala}}
> {code:java}
> def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] =
> ...{code}
[jira] [Updated] (SPARK-46251) Spark 3.3.3 tuple encoders built using `Encoders.tuple` do not correctly cast null into None for Option values
[ https://issues.apache.org/jira/browse/SPARK-46251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Will Boulter updated SPARK-46251:

Description:
In Spark {{3.3.2}} encoders created using {{Encoders.tuple(encoder1, encoder2, ..)}} correctly handle casting {{null}} into {{None }}when the target type is an {{{}Option{}}}.
In Spark {{{}3.3.3{}}}, this behaviour has changed and the Option value comes through as {{null}} which is likely to cause a {{NullPointerException}} for most Scala code that operates on the Option. The change seems to be related to the following commit:
[https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
I have made a reproduction with a couple of examples in a public Github repo here:
[https://github.com/q-willboulter/spark-tuple-encoders-bug]
The common use case where this is likely to be encountered is while doing any joins that can return null, e.g. left or outer joins. When casting the result of a left join it is sensible to wrap the right-hand side in an Option to handle the case where there is no match. Since 3.3.3 this would fail if the encoder is derived manually using {{Encoders.tuple(leftEncoder, rightEncoder).}}
If the entire tuple encoder {{Encoder[(Left, Option[Right])]}} is derived at once using reflection, the encoder works as expected. The bug appears to be in the following function inside {{ExpressionEncoder.scala}}
{code:java}
def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] = ...{code}

was:
In Spark `3.3.2`, encoders created using `Encoders.tuple(encoder1, encoder2, ..)` correctly handle casting `null` into `None` when the target type is an `Option`.
In Spark `3.3.3`, this behaviour has changed and the Option value comes through as `null` which is likely to cause a `NullPointerException` for most Scala code that operates on the Option. The change seems to be related to the following commit:
[https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
I have made a reproduction with a couple of examples in a public Github repo here:
[https://github.com/q-willboulter/spark-tuple-encoders-bug]
The common use case where this is likely to be encountered is while doing any joins that can return null, e.g. left or outer joins. When casting the result of a left join it is sensible to wrap the right-hand side in an Option to handle the case where there is no match - since 3.3.3 this could fail if the encoder is derived manually using `Encoders.tuple(leftEncoder, rightEncoder)`. If the entire tuple encoder `Encoder[(Left, Option[Right])]` is derived at once, the encoder works as expected - the bug appears to be in the following function inside `ExpressionEncoder.scala`
```
def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] = ...
```

> Spark 3.3.3 tuple encoders built using `Encoders.tuple` do not correctly cast
> null into None for Option values
> --
>
> Key: SPARK-46251
> URL: https://issues.apache.org/jira/browse/SPARK-46251
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.3, 3.4.2, 3.4.0, 3.4.1, 3.5.0
> Reporter: Will Boulter
> Priority: Major
>
> In Spark {{3.3.2}} encoders created using {{Encoders.tuple(encoder1,
> encoder2, ..)}} correctly handle casting {{null}} into {{None }}when the
> target type is an {{{}Option{}}}.
>
> In Spark {{{}3.3.3{}}}, this behaviour has changed and the Option value comes
> through as {{null}} which is likely to cause a {{NullPointerException}} for
> most Scala code that operates on the Option. The change seems to be related
> to the following commit:
> [https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
>
> I have made a reproduction with a couple of examples in a public Github repo
> here:
> [https://github.com/q-willboulter/spark-tuple-encoders-bug]
>
> The common use case where this is likely to be encountered is while doing any
> joins that can return null, e.g. left or outer joins. When casting the result
> of a left join it is sensible to wrap the right-hand side in an Option to
> handle the case where there is no match. Since 3.3.3 this would fail if the
> encoder is derived manually using {{Encoders.tuple(leftEncoder,
> rightEncoder).}}
> If the entire tuple encoder {{Encoder[(Left, Option[Right])]}} is derived at
> once using reflection, the encoder works as expected. The bug appears to be
> in the following function inside {{ExpressionEncoder.scala}}
>
> {code:java}
> def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] =
> ...{code}
[jira] [Updated] (SPARK-46251) Spark 3.3.3 tuple encoders built using `Encoders.tuple` do not correctly cast null into None for Option values
[ https://issues.apache.org/jira/browse/SPARK-46251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Will Boulter updated SPARK-46251:

Summary: Spark 3.3.3 tuple encoders built using `Encoders.tuple` do not correctly cast null into None for Option values (was: Spark 3.3.3 tuple encoders do not correctly cast null into None for Option values)

> Spark 3.3.3 tuple encoders built using `Encoders.tuple` do not correctly cast
> null into None for Option values
> --
>
> Key: SPARK-46251
> URL: https://issues.apache.org/jira/browse/SPARK-46251
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.3.3, 3.4.2, 3.4.0, 3.4.1, 3.5.0
> Reporter: Will Boulter
> Priority: Major
>
> In Spark `3.3.2`, encoders created using `Encoders.tuple(encoder1, encoder2,
> ..)` correctly handle casting `null` into `None` when the target type is an
> `Option`.
>
> In Spark `3.3.3`, this behaviour has changed and the Option value comes
> through as `null` which is likely to cause a `NullPointerException` for most
> Scala code that operates on the Option. The change seems to be related to the
> following commit:
> [https://github.com/apache/spark/commit/9110c05d54c392e55693eba4509be37c571d610a]
>
> I have made a reproduction with a couple of examples in a public Github repo
> here:
> [https://github.com/q-willboulter/spark-tuple-encoders-bug]
>
> The common use case where this is likely to be encountered is while doing any
> joins that can return null, e.g. left or outer joins. When casting the result
> of a left join it is sensible to wrap the right-hand side in an Option to
> handle the case where there is no match - since 3.3.3 this could fail if the
> encoder is derived manually using `Encoders.tuple(leftEncoder,
> rightEncoder)`. If the entire tuple encoder `Encoder[(Left, Option[Right])]`
> is derived at once, the encoder works as expected - the bug appears to be in
> the following function inside `ExpressionEncoder.scala`
> ```
> def tuple(encoders: Seq[ExpressionEncoder[_]]): ExpressionEncoder[_] = ...
> ```