[ 
https://issues.apache.org/jira/browse/BEAM-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robin Trietsch updated BEAM-4114:
---------------------------------
    Description: 
When using the 
[Join.fullOuterJoin()|https://beam.apache.org/documentation/sdks/javadoc/2.4.0/org/apache/beam/sdk/extensions/joinlibrary/Join.html#fullOuterJoin-org.apache.beam.sdk.values.PCollection-org.apache.beam.sdk.values.PCollection-V1-V2-],
 a checkNotNull() is done for the 
[leftNullValue|https://github.com/apache/beam/blob/master/sdks/java/extensions/join-library/src/main/java/org/apache/beam/sdk/extensions/joinlibrary/Join.java#L207]
 and 
[rightNullValue|https://github.com/apache/beam/blob/master/sdks/java/extensions/join-library/src/main/java/org/apache/beam/sdk/extensions/joinlibrary/Join.java#L208].

However, it makes more sense to allow null values: when a key is present on 
only one side of the join, you may want the missing side's value to be null. 
This should be decided by the developer, not by the join library.

Looking at the source code, this is also supported by 
[KV.of()|https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/KV.java#L42]
 (it allows null values), which is used in Join.fullOuterJoin().
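
To illustrate the requested behaviour, here is a minimal sketch in plain Java. 
It does not use Beam and is not the join library's code: the class and method 
names are hypothetical, plain Maps stand in for the keyed PCollections, 
Map.Entry stands in for KV, and Objects.requireNonNull() stands in for 
checkNotNull(). The strict variant mimics the current check, the lenient 
variant shows what dropping it would allow.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.TreeSet;

// Hypothetical illustration only; not part of Apache Beam.
public class NullableOuterJoinSketch {

    /** Mimics the current behaviour: null placeholders are rejected up front. */
    public static <K extends Comparable<K>, V1, V2> Map<K, Map.Entry<V1, V2>> strictFullOuterJoin(
            Map<K, V1> left, Map<K, V2> right, V1 leftNullValue, V2 rightNullValue) {
        Objects.requireNonNull(leftNullValue, "leftNullValue");
        Objects.requireNonNull(rightNullValue, "rightNullValue");
        return lenientFullOuterJoin(left, right, leftNullValue, rightNullValue);
    }

    /** Requested behaviour: the caller decides whether the placeholders may be null. */
    public static <K extends Comparable<K>, V1, V2> Map<K, Map.Entry<V1, V2>> lenientFullOuterJoin(
            Map<K, V1> left, Map<K, V2> right, V1 leftNullValue, V2 rightNullValue) {
        // Union of keys from both sides, sorted so the output order is stable.
        TreeSet<K> keys = new TreeSet<>(left.keySet());
        keys.addAll(right.keySet());

        Map<K, Map.Entry<V1, V2>> joined = new LinkedHashMap<>();
        for (K key : keys) {
            // A side that lacks the key gets its placeholder, which may be null.
            V1 l = left.containsKey(key) ? left.get(key) : leftNullValue;
            V2 r = right.containsKey(key) ? right.get(key) : rightNullValue;
            // SimpleImmutableEntry accepts null values, as KV.of() does.
            joined.put(key, new SimpleImmutableEntry<>(l, r));
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<String, Integer> left = new HashMap<>();
        left.put("a", 1);
        left.put("b", 2);
        Map<String, String> right = new HashMap<>();
        right.put("b", "x");
        right.put("c", "y");

        // Keys "a" and "c" each exist on one side only, so one value is null.
        System.out.println(lenientFullOuterJoin(left, right, null, null));

        try {
            strictFullOuterJoin(left, right, null, null);
        } catch (NullPointerException e) {
            System.out.println("strict variant rejected null: " + e.getMessage());
        }
    }
}
```

With the strict variant, the only option is to invent a sentinel value (for 
example -1 or an empty string) and filter it out again downstream.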

If required, I can create a pull request on GitHub.

  was:
When using the 
[Join.fullOuterJoin()|https://beam.apache.org/documentation/sdks/javadoc/2.4.0/org/apache/beam/sdk/extensions/joinlibrary/Join.html#fullOuterJoin-org.apache.beam.sdk.values.PCollection-org.apache.beam.sdk.values.PCollection-V1-V2-],
 a checkNotNull() is done for the 
[leftNullValue|https://github.com/apache/beam/blob/master/sdks/java/extensions/join-library/src/main/java/org/apache/beam/sdk/extensions/joinlibrary/Join.java#L207]
 and 
[rightNullValue|https://github.com/apache/beam/blob/master/sdks/java/extensions/join-library/src/main/java/org/apache/beam/sdk/extensions/joinlibrary/Join.java#L208].

However, it makes more sense to allow null values, since sometimes, if the key 
used for the join is not the same, you'd like to see that the value will become 
null. This should be decided by the developer, and not by the join library.

Looking at the source code, this is also supported by 
[KV.of()|https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/KV.java#L42]
 (it allows null values), which is used in Join.fullOuterJoin().


> Allow null as leftNullValue/rightNullValue in Join.fullOuterJoin()
> ------------------------------------------------------------------
>
>                 Key: BEAM-4114
>                 URL: https://issues.apache.org/jira/browse/BEAM-4114
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>    Affects Versions: 2.4.0
>            Reporter: Robin Trietsch
>            Assignee: Kenneth Knowles
>            Priority: Major
>
> When using the 
> [Join.fullOuterJoin()|https://beam.apache.org/documentation/sdks/javadoc/2.4.0/org/apache/beam/sdk/extensions/joinlibrary/Join.html#fullOuterJoin-org.apache.beam.sdk.values.PCollection-org.apache.beam.sdk.values.PCollection-V1-V2-],
>  a checkNotNull() is done for the 
> [leftNullValue|https://github.com/apache/beam/blob/master/sdks/java/extensions/join-library/src/main/java/org/apache/beam/sdk/extensions/joinlibrary/Join.java#L207]
>  and 
> [rightNullValue|https://github.com/apache/beam/blob/master/sdks/java/extensions/join-library/src/main/java/org/apache/beam/sdk/extensions/joinlibrary/Join.java#L208].
> However, it makes more sense to allow null values: when a key is present on 
> only one side of the join, you may want the missing side's value to be null. 
> This should be decided by the developer, not by the join library.
> Looking at the source code, this is also supported by 
> [KV.of()|https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/values/KV.java#L42]
>  (it allows null values), which is used in Join.fullOuterJoin().
> If required, I can create a pull request on GitHub.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
