[jira] [Updated] (SPARK-14757) Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left table is joined to "null" on the right table

2016-04-29 Thread Davies Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-14757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davies Liu updated SPARK-14757:
---
Fix Version/s: 1.6.2

> Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left 
> table is joined to "null" on the right table
> -
>
> Key: SPARK-14757
> URL: https://issues.apache.org/jira/browse/SPARK-14757
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: Hong Huang
> Fix For: 1.6.2, 2.0.0
>
>
> Content of table a:
> |outgoing_0|
> | false |
> |  true |
> |  null  |
> a has only one field: outgoing_0 
> Content of table b:
> |outgoing_1|
> | false  |
> |  true  |
> |  null   |
> b has only one filed: outgoing_1
> After running this query:
> {code}
> select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)
> {code}
> I got the following result:
> |outgoing_0|outgoing_1|
> |  true  |  true  |
> | false  | false |
> | false  |  null  |
> |  null   |  null  |
> The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. 
> The operator <=> should match null with null. 
> While left "false" is matched with right "null", it is also strange to find 
> that the "false" on the right table does not match with "null" on the left 
> table (no row with "null" as outgoing_0 and "false" as outgoing_1)
> You can easily reproduce this bug by pasting the following code fragment:
> {code}
> case class A( outgoing_0: Option[Boolean] )
> case class B( outgoing_1: Option[Boolean] )
> {code}
> {code}
> val a = sc.parallelize( Seq(
>   A( Some( false ) ),
>   A( Some( true ) ),
>   A( None )
> ) ).toDF()
> a.show
> val b = sc.parallelize( Seq(
>   B( Some( false ) ),
>   B( Some( true ) ),
>   B( None )
> ) ).toDF()
> b.show
> a.registerTempTable( "a" )
> b.registerTempTable( "b" )
> sqlContext.sql( "select * from a FULL JOIN b ON ( 
> outgoing_0<=>outgoing_1)" ).show()
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-14757) Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left table is joined to "null" on the right table

2016-04-29 Thread Davies Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-14757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davies Liu updated SPARK-14757:
---
Assignee: Reynold Xin

> Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left 
> table is joined to "null" on the right table
> -
>
> Key: SPARK-14757
> URL: https://issues.apache.org/jira/browse/SPARK-14757
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: Hong Huang
>Assignee: Reynold Xin
> Fix For: 1.6.2, 2.0.0
>
>
> Content of table a:
> |outgoing_0|
> | false |
> |  true |
> |  null  |
> a has only one field: outgoing_0 
> Content of table b:
> |outgoing_1|
> | false  |
> |  true  |
> |  null   |
> b has only one filed: outgoing_1
> After running this query:
> {code}
> select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)
> {code}
> I got the following result:
> |outgoing_0|outgoing_1|
> |  true  |  true  |
> | false  | false |
> | false  |  null  |
> |  null   |  null  |
> The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. 
> The operator <=> should match null with null. 
> While left "false" is matched with right "null", it is also strange to find 
> that the "false" on the right table does not match with "null" on the left 
> table (no row with "null" as outgoing_0 and "false" as outgoing_1)
> You can easily reproduce this bug by pasting the following code fragment:
> {code}
> case class A( outgoing_0: Option[Boolean] )
> case class B( outgoing_1: Option[Boolean] )
> {code}
> {code}
> val a = sc.parallelize( Seq(
>   A( Some( false ) ),
>   A( Some( true ) ),
>   A( None )
> ) ).toDF()
> a.show
> val b = sc.parallelize( Seq(
>   B( Some( false ) ),
>   B( Some( true ) ),
>   B( None )
> ) ).toDF()
> b.show
> a.registerTempTable( "a" )
> b.registerTempTable( "b" )
> sqlContext.sql( "select * from a FULL JOIN b ON ( 
> outgoing_0<=>outgoing_1)" ).show()
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-14757) Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left table is joined to "null" on the right table

2016-04-29 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-14757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-14757:

Description: 
Content of table a:

|outgoing_0|
| false |
|  true |
|  null  |

a has only one field: outgoing_0 

Content of table b:

|outgoing_1|
| false  |
|  true  |
|  null   |

b has only one filed: outgoing_1

After running this query:

select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)

I got the following result:

|outgoing_0|outgoing_1|
|  true  |  true  |
| false  | false |
| false  |  null  |
|  null   |  null  |

The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. The 
operator <=> should match null with null. 

While left "false" is matched with right "null", it is also strange to find 
that the "false" on the right table does not match with "null" on the left 
table (no row with "null" as outgoing_0 and "false" as outgoing_1)


You can easily reproduce this bug by pasting the following code fragment:

{code}
case class A( outgoing_0: Option[Boolean] )
case class B( outgoing_1: Option[Boolean] )
{code}

{code}
val a = sc.parallelize( Seq(
  A( Some( false ) ),
  A( Some( true ) ),
  A( None )
) ).toDF()
a.show
val b = sc.parallelize( Seq(
  B( Some( false ) ),
  B( Some( true ) ),
  B( None )
) ).toDF()
b.show
a.registerTempTable( "a" )
b.registerTempTable( "b" )
sqlContext.sql( "select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)" 
).show()
{code}

  was:
Content of table a:

|outgoing_0|
| false |
|  true |
|  null  |

a has only one field: outgoing_0 

Content of table b:

|outgoing_1|
| false  |
|  true  |
|  null   |

b has only one filed: outgoing_1

After running this query:

select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)

I got the following result:

|outgoing_0|outgoing_1|
|  true  |  true  |
| false  | false |
| false  |  null  |
|  null   |  null  |

The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. The 
operator <=> should match null with null. 

While left "false" is matched with right "null", it is also strange to find 
that the "false" on the right table does not match with "null" on the left 
table (no row with "null" as outgoing_0 and "false" as outgoing_1)


You can easily reproduce this bug by pasting the following code fragment:

case class A( outgoing_0: Option[Boolean] )
case class B( outgoing_1: Option[Boolean] )

val a = sc.parallelize( Seq(
  A( Some( false ) ),
  A( Some( true ) ),
  A( None )
) ).toDF()
a.show
val b = sc.parallelize( Seq(
  B( Some( false ) ),
  B( Some( true ) ),
  B( None )
) ).toDF()
b.show
a.registerTempTable( "a" )
b.registerTempTable( "b" )
sqlContext.sql( "select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)" 
).show()



> Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left 
> table is joined to "null" on the right table
> -
>
> Key: SPARK-14757
> URL: https://issues.apache.org/jira/browse/SPARK-14757
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: Hong Huang
>
> Content of table a:
> |outgoing_0|
> | false |
> |  true |
> |  null  |
> a has only one field: outgoing_0 
> Content of table b:
> |outgoing_1|
> | false  |
> |  true  |
> |  null   |
> b has only one filed: outgoing_1
> After running this query:
> select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)
> I got the following result:
> |outgoing_0|outgoing_1|
> |  true  |  true  |
> | false  | false |
> | false  |  null  |
> |  null   |  null  |
> The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. 
> The operator <=> should match null with null. 
> While left "false" is matched with right "null", it is also strange to find 
> that the "false" on the right table does not match with "null" on the left 
> table (no row with "null" as outgoing_0 and "false" as outgoing_1)
> You can easily reproduce this bug by pasting the following code fragment:
> {code}
> case class A( outgoing_0: Option[Boolean] )
> case class B( outgoing_1: Option[Boolean] )
> {code}
> {code}
> val a = sc.parallelize( Seq(
>   A( Some( false ) ),
>   A( Some( true ) ),
>   A( None )
> ) ).toDF()
> a.show
> val b = sc.parallelize( Seq(
>   B( Some( false ) ),
>   B( Some( true ) ),
>   

[jira] [Updated] (SPARK-14757) Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left table is joined to "null" on the right table

2016-04-29 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-14757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-14757:

Description: 
Content of table a:

|outgoing_0|
| false |
|  true |
|  null  |

a has only one field: outgoing_0 

Content of table b:

|outgoing_1|
| false  |
|  true  |
|  null   |

b has only one filed: outgoing_1

After running this query:

{code}
select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)
{code}

I got the following result:

|outgoing_0|outgoing_1|
|  true  |  true  |
| false  | false |
| false  |  null  |
|  null   |  null  |

The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. The 
operator <=> should match null with null. 

While left "false" is matched with right "null", it is also strange to find 
that the "false" on the right table does not match with "null" on the left 
table (no row with "null" as outgoing_0 and "false" as outgoing_1)


You can easily reproduce this bug by pasting the following code fragment:

{code}
case class A( outgoing_0: Option[Boolean] )
case class B( outgoing_1: Option[Boolean] )
{code}

{code}
val a = sc.parallelize( Seq(
  A( Some( false ) ),
  A( Some( true ) ),
  A( None )
) ).toDF()
a.show
val b = sc.parallelize( Seq(
  B( Some( false ) ),
  B( Some( true ) ),
  B( None )
) ).toDF()
b.show
a.registerTempTable( "a" )
b.registerTempTable( "b" )
sqlContext.sql( "select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)" 
).show()
{code}

  was:
Content of table a:

|outgoing_0|
| false |
|  true |
|  null  |

a has only one field: outgoing_0 

Content of table b:

|outgoing_1|
| false  |
|  true  |
|  null   |

b has only one filed: outgoing_1

After running this query:

select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)

I got the following result:

|outgoing_0|outgoing_1|
|  true  |  true  |
| false  | false |
| false  |  null  |
|  null   |  null  |

The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. The 
operator <=> should match null with null. 

While left "false" is matched with right "null", it is also strange to find 
that the "false" on the right table does not match with "null" on the left 
table (no row with "null" as outgoing_0 and "false" as outgoing_1)


You can easily reproduce this bug by pasting the following code fragment:

{code}
case class A( outgoing_0: Option[Boolean] )
case class B( outgoing_1: Option[Boolean] )
{code}

{code}
val a = sc.parallelize( Seq(
  A( Some( false ) ),
  A( Some( true ) ),
  A( None )
) ).toDF()
a.show
val b = sc.parallelize( Seq(
  B( Some( false ) ),
  B( Some( true ) ),
  B( None )
) ).toDF()
b.show
a.registerTempTable( "a" )
b.registerTempTable( "b" )
sqlContext.sql( "select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)" 
).show()
{code}


> Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left 
> table is joined to "null" on the right table
> -
>
> Key: SPARK-14757
> URL: https://issues.apache.org/jira/browse/SPARK-14757
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: Hong Huang
>
> Content of table a:
> |outgoing_0|
> | false |
> |  true |
> |  null  |
> a has only one field: outgoing_0 
> Content of table b:
> |outgoing_1|
> | false  |
> |  true  |
> |  null   |
> b has only one filed: outgoing_1
> After running this query:
> {code}
> select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)
> {code}
> I got the following result:
> |outgoing_0|outgoing_1|
> |  true  |  true  |
> | false  | false |
> | false  |  null  |
> |  null   |  null  |
> The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. 
> The operator <=> should match null with null. 
> While left "false" is matched with right "null", it is also strange to find 
> that the "false" on the right table does not match with "null" on the left 
> table (no row with "null" as outgoing_0 and "false" as outgoing_1)
> You can easily reproduce this bug by pasting the following code fragment:
> {code}
> case class A( outgoing_0: Option[Boolean] )
> case class B( outgoing_1: Option[Boolean] )
> {code}
> {code}
> val a = sc.parallelize( Seq(
>   A( Some( false ) ),
>   A( Some( true ) ),
>   A( None )
> ) ).toDF()
> a.show
> val b = sc.parallelize( Seq(
>   

[jira] [Updated] (SPARK-14757) Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left table is joined to "null" on the right table

2016-04-22 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-14757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-14757:

Target Version/s: 1.6.1, 2.0.0

> Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left 
> table is joined to "null" on the right table
> -
>
> Key: SPARK-14757
> URL: https://issues.apache.org/jira/browse/SPARK-14757
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: Hong Huang
>
> Content of table a:
> |outgoing_0|
> | false |
> |  true |
> |  null  |
> a has only one field: outgoing_0 
> Content of table b:
> |outgoing_1|
> | false  |
> |  true  |
> |  null   |
> b has only one filed: outgoing_1
> After running this query:
> select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)
> I got the following result:
> |outgoing_0|outgoing_1|
> |  true  |  true  |
> | false  | false |
> | false  |  null  |
> |  null   |  null  |
> The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. 
> The operator <=> should match null with null. 
> While left "false" is matched with right "null", it is also strange to find 
> that the "false" on the right table does not match with "null" on the left 
> table (no row with "null" as outgoing_0 and "false" as outgoing_1)
> You can easily reproduce this bug by pasting the following code fragment:
> case class A( outgoing_0: Option[Boolean] )
> case class B( outgoing_1: Option[Boolean] )
> val a = sc.parallelize( Seq(
>   A( Some( false ) ),
>   A( Some( true ) ),
>   A( None )
> ) ).toDF()
> a.show
> val b = sc.parallelize( Seq(
>   B( Some( false ) ),
>   B( Some( true ) ),
>   B( None )
> ) ).toDF()
> b.show
> a.registerTempTable( "a" )
> b.registerTempTable( "b" )
> sqlContext.sql( "select * from a FULL JOIN b ON ( 
> outgoing_0<=>outgoing_1)" ).show()
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-14757) Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left table is joined to "null" on the right table

2016-04-22 Thread Reynold Xin (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-14757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reynold Xin updated SPARK-14757:

Target Version/s: 1.6.2, 2.0.0  (was: 1.6.1, 2.0.0)

> Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left 
> table is joined to "null" on the right table
> -
>
> Key: SPARK-14757
> URL: https://issues.apache.org/jira/browse/SPARK-14757
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: Hong Huang
>
> Content of table a:
> |outgoing_0|
> | false |
> |  true |
> |  null  |
> a has only one field: outgoing_0 
> Content of table b:
> |outgoing_1|
> | false  |
> |  true  |
> |  null   |
> b has only one filed: outgoing_1
> After running this query:
> select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)
> I got the following result:
> |outgoing_0|outgoing_1|
> |  true  |  true  |
> | false  | false |
> | false  |  null  |
> |  null   |  null  |
> The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. 
> The operator <=> should match null with null. 
> While left "false" is matched with right "null", it is also strange to find 
> that the "false" on the right table does not match with "null" on the left 
> table (no row with "null" as outgoing_0 and "false" as outgoing_1)
> You can easily reproduce this bug by pasting the following code fragment:
> case class A( outgoing_0: Option[Boolean] )
> case class B( outgoing_1: Option[Boolean] )
> val a = sc.parallelize( Seq(
>   A( Some( false ) ),
>   A( Some( true ) ),
>   A( None )
> ) ).toDF()
> a.show
> val b = sc.parallelize( Seq(
>   B( Some( false ) ),
>   B( Some( true ) ),
>   B( None )
> ) ).toDF()
> b.show
> a.registerTempTable( "a" )
> b.registerTempTable( "b" )
> sqlContext.sql( "select * from a FULL JOIN b ON ( 
> outgoing_0<=>outgoing_1)" ).show()
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-14757) Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left table is joined to "null" on the right table

2016-04-20 Thread Hong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-14757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Huang updated SPARK-14757:
---
Description: 
Content of table a:

|outgoing_0|
| false |
|  true |
|  null  |

a has only one field: outgoing_0 

Content of table b:

|outgoing_1|
| false  |
|  true  |
|  null   |

b has only one filed: outgoing_1

After running this query:

select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)

I got the following result:

|outgoing_0|outgoing_1|
|  true  |  true  |
| false  | false |
| false  |  null  |
|  null   |  null  |

The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. The 
operator <=> should match null with null. 

While left "false" is matched with right "null", it is also strange to find 
that the "false" on the right table does not match with "null" on the left 
table (no row with "null" as outgoing_0 and "false" as outgoing_1)


You can easily reproduce this bug by pasting the following code fragment:

case class A( outgoing_0: Option[Boolean] )
case class B( outgoing_1: Option[Boolean] )

val a = sc.parallelize( Seq(
  A( Some( false ) ),
  A( Some( true ) ),
  A( None )
) ).toDF()
a.show
val b = sc.parallelize( Seq(
  B( Some( false ) ),
  B( Some( true ) ),
  B( None )
) ).toDF()
b.show
a.registerTempTable( "a" )
b.registerTempTable( "b" )
sqlContext.sql( "select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)" 
).show()


  was:
Content of table a:

|outgoing_0|
| false |
|  true |
|  null  |

a has only one field: outgoing_0 

Content of table b:

|outgoing_1|
| false  |
|  true  |
|  null   |

b has only one filed: outgoing_1

After running this query:

select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)

I got the following result:

|outgoing_0|outgoing_1|
|  true  |  true  |
| false  | false |
| false  |  null  |
|  null   |  null  |

The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. The 
operator <=> should match null with null. 

While left "false" is matched with right "null", it is also strange to find 
that the "false" on the right table does not match with "null" on the left 
table (no row with "null" as outgoing_0 and "false" as outgoing_1)


You can easily reproduce this by pasting the following code fragment:

case class A( outgoing_0: Option[Boolean] )
case class B( outgoing_1: Option[Boolean] )

val a = sc.parallelize( Seq(
  A( Some( false ) ),
  A( Some( true ) ),
  A( None )
) ).toDF()
a.show
val b = sc.parallelize( Seq(
  B( Some( false ) ),
  B( Some( true ) ),
  B( None )
) ).toDF()
b.show
a.registerTempTable( "a" )
b.registerTempTable( "b" )
sqlContext.sql( "select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)" 
).show()



> Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left 
> table is joined to "null" on the right table
> -
>
> Key: SPARK-14757
> URL: https://issues.apache.org/jira/browse/SPARK-14757
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: Hong Huang
>
> Content of table a:
> |outgoing_0|
> | false |
> |  true |
> |  null  |
> a has only one field: outgoing_0 
> Content of table b:
> |outgoing_1|
> | false  |
> |  true  |
> |  null   |
> b has only one filed: outgoing_1
> After running this query:
> select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)
> I got the following result:
> |outgoing_0|outgoing_1|
> |  true  |  true  |
> | false  | false |
> | false  |  null  |
> |  null   |  null  |
> The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. 
> The operator <=> should match null with null. 
> While left "false" is matched with right "null", it is also strange to find 
> that the "false" on the right table does not match with "null" on the left 
> table (no row with "null" as outgoing_0 and "false" as outgoing_1)
> You can easily reproduce this bug by pasting the following code fragment:
> case class A( outgoing_0: Option[Boolean] )
> case class B( outgoing_1: Option[Boolean] )
> val a = sc.parallelize( Seq(
>   A( Some( false ) ),
>   A( Some( true ) ),
>   A( None )
> ) ).toDF()
> a.show
> val b = sc.parallelize( Seq(
>   B( Some( false ) ),
>   B( Some( true ) ),
>   B( None )
> ) ).toDF()
> b.show
> 

[jira] [Updated] (SPARK-14757) Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left table is joined to "null" on the right table

2016-04-20 Thread Hong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-14757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Huang updated SPARK-14757:
---
Description: 
Content of table a:

|outgoing_0|
| false |
|  true |
|  null  |

a has only one field: outgoing_0 

Content of table b:

|outgoing_1|
| false  |
|  true  |
|  null   |

b has only one filed: outgoing_1

After running this query:

select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)

I got the following result:

|outgoing_0|outgoing_1|
|  true  |  true  |
| false  | false |
| false  |  null  |
|  null   |  null  |

The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. The 
operator <=> should match null with null. 

While left "false" is matched with right "null", it is also strange to find 
that the "false" on the right table does not match with "null" on the left 
table (no row with "null" as outgoing_0 and "false" as outgoing_1)


You can easily reproduce this by pasting the following code fragment:

case class A( outgoing_0: Option[Boolean] )
case class B( outgoing_1: Option[Boolean] )

val a = sc.parallelize( Seq(
  A( Some( false ) ),
  A( Some( true ) ),
  A( None )
) ).toDF()
a.show
val b = sc.parallelize( Seq(
  B( Some( false ) ),
  B( Some( true ) ),
  B( None )
) ).toDF()
b.show
a.registerTempTable( "a" )
b.registerTempTable( "b" )
sqlContext.sql( "select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)" 
).show()


  was:
Content of table a:

|outgoing_0|
| false |
|  true |
|  null  |

a has only one field: outgoing_0 

Content of table b:

|outgoing_1|
| false  |
|  true  |
|  null   |

b has only one filed: outgoing_1

After running this query:

select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)

I got the following result:

|outgoing_0|outgoing_1|
|  true  |  true  |
| false  | false |
| false  |  null  |
|  null   |  null  |

The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. The 
operator <=> should match null with null. 

While left "false" is matched with right "null", it is also strange to find 
that the "false" on the right table does not match with "null" on the left 
table (no row with "null" as outgoing_0 and "false" as outgoing_1)


> Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left 
> table is joined to "null" on the right table
> -
>
> Key: SPARK-14757
> URL: https://issues.apache.org/jira/browse/SPARK-14757
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: Hong Huang
>
> Content of table a:
> |outgoing_0|
> | false |
> |  true |
> |  null  |
> a has only one field: outgoing_0 
> Content of table b:
> |outgoing_1|
> | false  |
> |  true  |
> |  null   |
> b has only one filed: outgoing_1
> After running this query:
> select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)
> I got the following result:
> |outgoing_0|outgoing_1|
> |  true  |  true  |
> | false  | false |
> | false  |  null  |
> |  null   |  null  |
> The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. 
> The operator <=> should match null with null. 
> While left "false" is matched with right "null", it is also strange to find 
> that the "false" on the right table does not match with "null" on the left 
> table (no row with "null" as outgoing_0 and "false" as outgoing_1)
> You can easily reproduce this by pasting the following code fragment:
> case class A( outgoing_0: Option[Boolean] )
> case class B( outgoing_1: Option[Boolean] )
> val a = sc.parallelize( Seq(
>   A( Some( false ) ),
>   A( Some( true ) ),
>   A( None )
> ) ).toDF()
> a.show
> val b = sc.parallelize( Seq(
>   B( Some( false ) ),
>   B( Some( true ) ),
>   B( None )
> ) ).toDF()
> b.show
> a.registerTempTable( "a" )
> b.registerTempTable( "b" )
> sqlContext.sql( "select * from a FULL JOIN b ON ( 
> outgoing_0<=>outgoing_1)" ).show()
> 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-14757) Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left table is joined to "null" on the right table

2016-04-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SPARK-14757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

黄泓 updated SPARK-14757:
---
Description: 
Content of table a:

|outgoing_0|
| false |
|  true |
|  null  |

Content of table b:

|outgoing_1|
| false  |
|  true  |
|  null   |

After running this query:

select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)

I got the following result:

|outgoing_0|outgoing_1|
|  true  |  true  |
| false  | false |
| false  |  null  |
|  null   |  null  |

The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. The 
operator <=> should match null with null. 

While left "false" is matched with right "null", it is also strange to find 
that the "false" on the right table does not match with "null" on the left 
table (no row with "null" as outgoing_0 and "false" as outgoing_1)

  was:
Content of table a:
+-+
|outgoing_0|
+-+
| false |
|  true |
|  null  |
+--+

Content of table b:

+--+
|outgoing_1|
+--+
| false  |
|  true  |
|  null   |
+--+

After running this query:

select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)

I got the following result:

+-+--+
|outgoing_0|outgoing_1|
+--+--+
|  true  |  true  |
| false  | false |
| false  |  null  |
|  null   |  null  |
+--+--+

The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. The 
operator <=> should match null with null. 

While left "false" is matched with right "null", it is also strange to find 
that the "false" on the right table does not match with "null" on the left 
table (no row with "null" as outgoing_0 and "false" as outgoing_1)


> Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left 
> table is joined to "null" on the right table
> -
>
> Key: SPARK-14757
> URL: https://issues.apache.org/jira/browse/SPARK-14757
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: 黄泓
>
> Content of table a:
> |outgoing_0|
> | false |
> |  true |
> |  null  |
> Content of table b:
> |outgoing_1|
> | false  |
> |  true  |
> |  null   |
> After running this query:
> select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)
> I got the following result:
> |outgoing_0|outgoing_1|
> |  true  |  true  |
> | false  | false |
> | false  |  null  |
> |  null   |  null  |
> The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. 
> The operator <=> should match null with null. 
> While left "false" is matched with right "null", it is also strange to find 
> that the "false" on the right table does not match with "null" on the left 
> table (no row with "null" as outgoing_0 and "false" as outgoing_1)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-14757) Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left table is joined to "null" on the right table

2016-04-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SPARK-14757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

黄泓 updated SPARK-14757:
---
Description: 
Content of table a:

|outgoing_0|
| false |
|  true |
|  null  |

a has only one field: outgoing_0 

Content of table b:

|outgoing_1|
| false  |
|  true  |
|  null   |

b has only one filed: outgoing_1

After running this query:

select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)

I got the following result:

|outgoing_0|outgoing_1|
|  true  |  true  |
| false  | false |
| false  |  null  |
|  null   |  null  |

The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. The 
operator <=> should match null with null. 

While left "false" is matched with right "null", it is also strange to find 
that the "false" on the right table does not match with "null" on the left 
table (no row with "null" as outgoing_0 and "false" as outgoing_1)

  was:
Content of table a:

|outgoing_0|
| false |
|  true |
|  null  |

Content of table b:

|outgoing_1|
| false  |
|  true  |
|  null   |

After running this query:

select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)

I got the following result:

|outgoing_0|outgoing_1|
|  true  |  true  |
| false  | false |
| false  |  null  |
|  null   |  null  |

The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. The 
operator <=> should match null with null. 

While left "false" is matched with right "null", it is also strange to find 
that the "false" on the right table does not match with "null" on the left 
table (no row with "null" as outgoing_0 and "false" as outgoing_1)


> Incorrect behavior of Join operation in Spqrk SQL JOIN : "false" in the left 
> table is joined to "null" on the right table
> -
>
> Key: SPARK-14757
> URL: https://issues.apache.org/jira/browse/SPARK-14757
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.6.0
>Reporter: 黄泓
>
> Content of table a:
> |outgoing_0|
> | false |
> |  true |
> |  null  |
> a has only one field: outgoing_0 
> Content of table b:
> |outgoing_1|
> | false  |
> |  true  |
> |  null   |
> b has only one filed: outgoing_1
> After running this query:
> select * from a FULL JOIN b ON ( outgoing_0<=>outgoing_1)
> I got the following result:
> |outgoing_0|outgoing_1|
> |  true  |  true  |
> | false  | false |
> | false  |  null  |
> |  null   |  null  |
> The row with "false" as outgoing_0 and "null" as outgoing_1 is unexpected. 
> The operator <=> should match null with null. 
> While left "false" is matched with right "null", it is also strange to find 
> that the "false" on the right table does not match with "null" on the left 
> table (no row with "null" as outgoing_0 and "false" as outgoing_1)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org