[ 
https://issues.apache.org/jira/browse/FLINK-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln.lee updated FLINK-5498:
-------------------------------
    Description: 
I found the expected result of a unit test case incorrect compare to that in a 
RDMBS, 
see 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/batch/table/JoinITCase.scala
{code:title=JoinITCase.scala}
def testRightJoinWithNotOnlyEquiJoin(): Unit = {
     ...
     val ds1 = CollectionDataSets.get3TupleDataSet(env).toTable(tEnv, 'a, 'b, 
'c)
     val ds2 = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'd, 'e, 
'f, 'g, 'h)

     val joinT = ds1.rightOuterJoin(ds2, 'a === 'd && 'b < 'h).select('c, 'g)
 
     val expected = "Hello world,BCD\n"
     val results = joinT.toDataSet[Row].collect()
     TestBaseUtils.compareResultAsText(results.asJava, expected)
}
{code}

Then I took some time to learn about the ‘outer join’ in relational databases, 
the right result of above case should be(tested in SQL Server and MySQL, the 
results are same):
{code}
> select c, g from tuple3 right outer join tuple5 on a=f and b<h;
c                                g                               
-------------------------------- --------------------------------
NULL                             Hallo                           
NULL                             Hallo Welt                      
NULL                             Hallo Welt wie                  
NULL                             Hallo Welt wie gehts?           
NULL                             ABC                             
Hello world                      BCD                             
NULL                             CDE                             
NULL                             DEF                             
NULL                             EFG                             
NULL                             FGH                             
NULL                             GHI                             
NULL                             HIJ                             
NULL                             IJK                             
NULL                             JKL                             
NULL                             KLM   
{code}
the join condition “rightOuterJoin('a === 'd && 'b < 'h)” is not equivalent to 
“rightOuterJoin('a === 'd).where('b < 'h)”.  

But another test case in 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/batch/table/JoinITCase.scala
will throw a ValidationException indicating: “Invalid non-join predicate 'b < 
3. For non-join predicates use Table#where.”
{code:title=JoinITCase.scala}
@Test(expected = classOf[ValidationException])
def testNoJoinCondition(): Unit = {
     …
     val ds1 = CollectionDataSets.get3TupleDataSet(env).toTable(tEnv, 'a, 'b, 
'c)
     val ds2 = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'd, 'e, 
'f, 'g, 'h)

     val joinT = ds2.leftOuterJoin(ds1, 'b === 'd && 'b < 3).select('c, 'g)
}
{code}
This jira aims to make clear what kind of expression is supported on the join 
predicate.

More detailed description: http://goo.gl/gK6vP3



  was:
I found the expected result of a unit test case incorrect compare to that in a 
RDMBS, 
see 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/batch/table/JoinITCase.scala
{code:title=JoinITCase.scala}
def testRightJoinWithNotOnlyEquiJoin(): Unit = {
     ...
     val ds1 = CollectionDataSets.get3TupleDataSet(env).toTable(tEnv, 'a, 'b, 
'c)
     val ds2 = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'd, 'e, 
'f, 'g, 'h)

     val joinT = ds1.rightOuterJoin(ds2, 'a === 'd && 'b < 'h).select('c, 'g)
 
     val expected = "Hello world,BCD\n"
     val results = joinT.toDataSet[Row].collect()
     TestBaseUtils.compareResultAsText(results.asJava, expected)
}
{code}

Then I took some time to learn about the ‘outer join’ in relational databases, 
the right result of above case should be(tested in SQL Server and MySQL, the 
results are same):
{code}
> select c, g from tuple3 right outer join tuple5 on a=f and b<h;
c                                g                               
-------------------------------- --------------------------------
NULL                             Hallo                           
NULL                             Hallo Welt                      
NULL                             Hallo Welt wie                  
NULL                             Hallo Welt wie gehts?           
NULL                             ABC                             
Hello world                      BCD                             
NULL                             CDE                             
NULL                             DEF                             
NULL                             EFG                             
NULL                             FGH                             
NULL                             GHI                             
NULL                             HIJ                             
NULL                             IJK                             
NULL                             JKL                             
NULL                             KLM   
{code}
the join condition “rightOuterJoin('a === 'd && 'b < 'h)” is not equivalent to 
“rightOuterJoin('a === 'd).where('b < 'h)”.  

But another test case test case in 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/batch/table/JoinITCase.scala
will throw a ValidationException indicating: “Invalid non-join predicate 'b < 
3. For non-join predicates use Table#where.”
{code:title=JoinITCase.scala}
@Test(expected = classOf[ValidationException])
def testNoJoinCondition(): Unit = {
     …
     val ds1 = CollectionDataSets.get3TupleDataSet(env).toTable(tEnv, 'a, 'b, 
'c)
     val ds2 = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'd, 'e, 
'f, 'g, 'h)

     val joinT = ds2.leftOuterJoin(ds1, 'b === 'd && 'b < 3).select('c, 'g)
}
{code}
This jira aims to make clear what kind of expression is supported on the join 
predicate.

More detailed description: http://goo.gl/gK6vP3




> Fix JoinITCase and add support for filter expressions on the On clause in 
> left/right outer joins
> ------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-5498
>                 URL: https://issues.apache.org/jira/browse/FLINK-5498
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API & SQL
>    Affects Versions: 1.1.4
>            Reporter: lincoln.lee
>            Assignee: lincoln.lee
>
> I found the expected result of a unit test case incorrect compare to that in 
> a RDMBS, 
> see 
> flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/batch/table/JoinITCase.scala
> {code:title=JoinITCase.scala}
> def testRightJoinWithNotOnlyEquiJoin(): Unit = {
>      ...
>      val ds1 = CollectionDataSets.get3TupleDataSet(env).toTable(tEnv, 'a, 'b, 
> 'c)
>      val ds2 = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'd, 'e, 
> 'f, 'g, 'h)
>      val joinT = ds1.rightOuterJoin(ds2, 'a === 'd && 'b < 'h).select('c, 'g)
>  
>      val expected = "Hello world,BCD\n"
>      val results = joinT.toDataSet[Row].collect()
>      TestBaseUtils.compareResultAsText(results.asJava, expected)
> }
> {code}
> Then I took some time to learn about the ‘outer join’ in relational 
> databases, the right result of above case should be(tested in SQL Server and 
> MySQL, the results are same):
> {code}
> > select c, g from tuple3 right outer join tuple5 on a=f and b<h;
> c                                g                               
> -------------------------------- --------------------------------
> NULL                             Hallo                           
> NULL                             Hallo Welt                      
> NULL                             Hallo Welt wie                  
> NULL                             Hallo Welt wie gehts?           
> NULL                             ABC                             
> Hello world                      BCD                             
> NULL                             CDE                             
> NULL                             DEF                             
> NULL                             EFG                             
> NULL                             FGH                             
> NULL                             GHI                             
> NULL                             HIJ                             
> NULL                             IJK                             
> NULL                             JKL                             
> NULL                             KLM   
> {code}
> the join condition “rightOuterJoin('a === 'd && 'b < 'h)” is not equivalent 
> to “rightOuterJoin('a === 'd).where('b < 'h)”.  
> But another test case in 
> flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/batch/table/JoinITCase.scala
> will throw a ValidationException indicating: “Invalid non-join predicate 'b < 
> 3. For non-join predicates use Table#where.”
> {code:title=JoinITCase.scala}
> @Test(expected = classOf[ValidationException])
> def testNoJoinCondition(): Unit = {
>      …
>      val ds1 = CollectionDataSets.get3TupleDataSet(env).toTable(tEnv, 'a, 'b, 
> 'c)
>      val ds2 = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'd, 'e, 
> 'f, 'g, 'h)
>      val joinT = ds2.leftOuterJoin(ds1, 'b === 'd && 'b < 3).select('c, 'g)
> }
> {code}
> This jira aims to make clear what kind of expression is supported on the join 
> predicate.
> More detailed description: http://goo.gl/gK6vP3



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to