[jira] [Commented] (FLINK-6426) Update the document of group-window table API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057617#comment-16057617
 ] 

ASF GitHub Bot commented on FLINK-6426:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3806
  
Did this PR become obsolete with the recent restructuring of the Table API 
docs?
Can you check that and close this PR if that's the case @sunjincheng121?

Thank you, Fabian


> Update the document of group-window table API
> -
>
> Key: FLINK-6426
> URL: https://issues.apache.org/jira/browse/FLINK-6426
> Project: Flink
>  Issue Type: Sub-task
>  Components: Documentation, Table API & SQL
>Affects Versions: 1.3.0
>Reporter: sunjincheng
>Assignee: sunjincheng
>
> 1. Correct the method parameter type error in the group-window Table API 
> document, i.e., change the documented signature from `.window([w: Window] as 'w)` 
> to `.window([w: WindowWithoutAlias] as 'w)`.
> 2. For consistency between the Table API and SQL, change the heading of the SQL 
> document from "Group Windows" to "Windows".
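
For context, a minimal sketch of a group-window query in the Scala Table API 
using the aliased-window form that the fix documents; `table` and the field 
names ('rowtime, 'user, 'amount) are assumed:

{code}
import org.apache.flink.table.api.Tumble
import org.apache.flink.table.api.scala._

// The window is passed to window() and bound to the alias 'w via `as`;
// the alias is then referenced in groupBy() and select().
val result = table
  .window(Tumble over 10.minutes on 'rowtime as 'w)
  .groupBy('w, 'user)
  .select('user, 'w.start, 'amount.sum)
{code}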



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6841) using TableSourceTable for both Stream and Batch OR remove useless import

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057612#comment-16057612
 ] 

ASF GitHub Bot commented on FLINK-6841:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/4061
  
Hi @sunjincheng121! I left a comment on the JIRA issue. Short summary: I 
don't think this PR is a significant improvement. I'd rather keep it as it is 
unless you have a good argument to convince me.

Thank you, Fabian


> using TableSourceTable for both Stream and Batch OR remove useless import
> -
>
> Key: FLINK-6841
> URL: https://issues.apache.org/jira/browse/FLINK-6841
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: sunjincheng
>Assignee: sunjincheng
>
> 1. {{StreamTableSourceTable}} has an unused import of {{TableException}}.
> 2. {{StreamTableSourceTable}} only overrides {{getRowType}} of 
> {{FlinkTable}}. I think we can override the method in {{TableSourceTable}} 
> instead; if so, we can use {{TableSourceTable}} for both {{Stream}} and {{Batch}}.
> What do you think? [~fhueske] [~twalthr]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6841) using TableSourceTable for both Stream and Batch OR remove useless import

2017-06-21 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057610#comment-16057610
 ] 

Fabian Hueske commented on FLINK-6841:
--

To be honest, I don't see a significant improvement in removing 
{{StreamTableSourceTable}}.
We would have to move the code of {{getRowType()}} out of 
{{StreamTableSourceTable}} and additionally add a condition so that it is only 
applied if the table source is a {{StreamTableSource}}. 
Maybe we would even have to add {{StreamTableSourceTable}} back if the two table 
sources diverge further at some point.

Therefore, I would rather keep the current design as it is for now.
In general, we should rethink the design of table sources and improve it so that 
it supports watermarks, time attributes, nested data, streaming & batch, and 
projection & filter push-down. Once we have that, we might also need to refactor 
the internals a bit.

> using TableSourceTable for both Stream and Batch OR remove useless import
> -
>
> Key: FLINK-6841
> URL: https://issues.apache.org/jira/browse/FLINK-6841
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: sunjincheng
>Assignee: sunjincheng
>
> 1. {{StreamTableSourceTable}} has an unused import of {{TableException}}.
> 2. {{StreamTableSourceTable}} only overrides {{getRowType}} of 
> {{FlinkTable}}. I think we can override the method in {{TableSourceTable}} 
> instead; if so, we can use {{TableSourceTable}} for both {{Stream}} and {{Batch}}.
> What do you think? [~fhueske] [~twalthr]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6649) Improve Non-window group aggregate with configurable `earlyFire`.

2017-06-21 Thread sunjincheng (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057595#comment-16057595
 ] 

sunjincheng commented on FLINK-6649:


Hi [~fhueske] I have opened the PR 
(https://github.com/apache/flink/pull/4157). It adds support for updating the 
calculated data according to a specified time interval on the non-window group 
AGG. If we configure the interval to be N seconds, then the next update time, 
relative to the latest update time T, is T+N seconds. For example, if the time 
interval is 2 seconds and the previous update time is T seconds, then the next 
update time T1 >= T + 2 seconds. If no data arrives during T to T + 2, no 
updates are made. 

I'd appreciate it if you have time to look at the PR, and feel free to leave 
your comments. :)
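
As an aside for readers, a minimal sketch of the interval-based update 
semantics described above, written against the DataStream API with a 
processing-time timer. This is an illustration, not the PR's actual 
implementation; the class name `IntervalFiringCount` and the (key, value) 
tuple schema are assumptions.

{code}
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.ProcessFunction
import org.apache.flink.util.Collector

// A keyed running sum that emits at most once per `intervalMs`. A timer is
// only registered when an element arrives, so if no data arrives during an
// interval, no update is emitted.
class IntervalFiringCount(intervalMs: Long)
    extends ProcessFunction[(String, Long), (String, Long)] {

  private var key: ValueState[String] = _
  private var sum: ValueState[Long] = _
  private var timerSet: ValueState[Boolean] = _

  override def open(parameters: Configuration): Unit = {
    key = getRuntimeContext.getState(
      new ValueStateDescriptor[String]("key", classOf[String]))
    sum = getRuntimeContext.getState(
      new ValueStateDescriptor[Long]("sum", classOf[Long]))
    timerSet = getRuntimeContext.getState(
      new ValueStateDescriptor[Boolean]("timerSet", classOf[Boolean]))
  }

  override def processElement(
      value: (String, Long),
      ctx: ProcessFunction[(String, Long), (String, Long)]#Context,
      out: Collector[(String, Long)]): Unit = {
    key.update(value._1)
    sum.update(sum.value() + value._2) // empty state unboxes to 0L in Scala
    if (!timerSet.value()) {
      // schedule the next update at T + N, where T is "now"
      ctx.timerService().registerProcessingTimeTimer(
        ctx.timerService().currentProcessingTime() + intervalMs)
      timerSet.update(true)
    }
  }

  override def onTimer(
      timestamp: Long,
      ctx: ProcessFunction[(String, Long), (String, Long)]#OnTimerContext,
      out: Collector[(String, Long)]): Unit = {
    out.collect((key.value(), sum.value()))
    timerSet.update(false) // the next element schedules a new timer
  }
}
{code}

Applied as, e.g., `stream.keyBy(_._1).process(new IntervalFiringCount(2000L))`, 
this emits at most one update per key every 2 seconds.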

> Improve Non-window group aggregate with configurable `earlyFire`.
> -
>
> Key: FLINK-6649
> URL: https://issues.apache.org/jira/browse/FLINK-6649
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Affects Versions: 1.4.0
>Reporter: sunjincheng
>Assignee: sunjincheng
>
> Currently, the non-windowed group aggregate is early-firing at count(1), that 
> is, every row will emit an aggregate result. But sometimes users want to 
> configure a count number (`early firing with count[N]`) to reduce the 
> downstream pressure. This JIRA will enable the configuration of `earlyFiring` 
> for the non-windowed group aggregate.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-6965) Avro is missing snappy dependency

2017-06-21 Thread Chesnay Schepler (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler updated FLINK-6965:

Description: 
The shading rework made before 1.3 removed a snappy dependency that was 
accidentally pulled in through hadoop. This is technically alright, until 
class-loaders rear their ugly heads.

Our kafka connector can read avro records, which may or may not require snappy. 
Usually this _should_ be solvable by including the snappy dependency in the 
user-jar if necessary, however since the kafka connector loads classes that it 
requires using the system class loader this doesn't work.

As such we have to add a separate snappy dependency to flink-core.

  was:
The shading rework made before 1.3 removed a snappy dependency that was 
accidentally pulled in through hadoop. This is technically alright, until 
class-loaders rear their ugly heads.

Our kafka connector can read avro records, which may or may not require snappy. 
Usually this _should _be solvable by including the snappy dependency in the 
user-jar if necessary, however since the kafka connector loads classes that it 
requires using the system class loader this doesn't work.

As such we have to add a separate snappy dependency to flink-core.


> Avro is missing snappy dependency
> -
>
> Key: FLINK-6965
> URL: https://issues.apache.org/jira/browse/FLINK-6965
> Project: Flink
>  Issue Type: Bug
>  Components: Type Serialization System
>Affects Versions: 1.3.0
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
> Fix For: 1.3.2
>
>
> The shading rework made before 1.3 removed a snappy dependency that was 
> accidentally pulled in through hadoop. This is technically alright, until 
> class-loaders rear their ugly heads.
> Our kafka connector can read avro records, which may or may not require 
> snappy. Usually this _should_ be solvable by including the snappy dependency 
> in the user-jar if necessary, however since the kafka connector loads classes 
> that it requires using the system class loader this doesn't work.
> As such we have to add a separate snappy dependency to flink-core.
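
To illustrate the class-loading point, a hypothetical sketch (loader wiring 
simplified; `org.xerial.snappy.Snappy` is the snappy implementation Avro uses):

{code}
// Succeeds if snappy-java is bundled in the user jar, because the user-code
// class loader can see the user jar:
val userCodeLoader = Thread.currentThread().getContextClassLoader
Class.forName("org.xerial.snappy.Snappy", true, userCodeLoader)

// Throws ClassNotFoundException in the same setup, because the system class
// loader only sees flink-dist (and flink-core), not the user jar:
Class.forName("org.xerial.snappy.Snappy", true, ClassLoader.getSystemClassLoader)
{code}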



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6960) Add E(2.7182818284590452354),PI(3.14159265358979323846) supported in SQL

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057589#comment-16057589
 ] 

ASF GitHub Bot commented on FLINK-6960:
---

Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/4152#discussion_r123261877
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/functions/scalarSqlFunctions/MathSqlFunctions.scala
 ---
@@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.functions.scalarSqlFunctions
+
+import org.apache.calcite.sql.{SqlFunction, SqlFunctionCategory, SqlKind}
+import org.apache.calcite.sql.`type`._
+
+class MathSqlFunctions {
+
+}
+
+object MathSqlFunctions {
+  val E = new SqlFunction(
--- End diff --

I noticed this class. I have not used it because:
1. The constructor of `SqlBaseContextVariable` is protected.
2. I think calling the scalar function with "()" makes sense. 

About the folder: from my point of view, built-in scalar functions in 
Flink need two parts:
1. `SqlFunctions`, which define the interface of the scalar functions for 
`FunctionGenerator` and `FunctionCatalog`.
2. The runtime implementation, which holds the real logic of the scalar 
functions and is used in `BuiltInMethods`.

So, I want to create two packages: 
`org.apache.flink.table.functions.scalarSqlFunctions` and 
`org.apache.flink.table.runtime.scalarfunctions` (this PR does not need the 
runtime package). 
Please let me know what you think.
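
For illustration, a plausible completion of the truncated definition above, 
assuming Calcite's standard six-argument `SqlFunction` constructor and its 
`ReturnTypes`/`OperandTypes` helpers (the actual PR code may differ):

{code}
import org.apache.calcite.sql.{SqlFunction, SqlFunctionCategory, SqlKind}
import org.apache.calcite.sql.`type`.{OperandTypes, ReturnTypes}

object MathSqlFunctions {
  // E() takes no operands (NILADIC) and always returns DOUBLE, so it is
  // called with parentheses, as argued above.
  val E = new SqlFunction(
    "E",
    SqlKind.OTHER_FUNCTION,
    ReturnTypes.DOUBLE,
    null, // no operand type inference needed for a niladic function
    OperandTypes.NILADIC,
    SqlFunctionCategory.NUMERIC)
}
{code}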


> Add E(2.7182818284590452354),PI(3.14159265358979323846) supported in SQL
> 
>
> Key: FLINK-6960
> URL: https://issues.apache.org/jira/browse/FLINK-6960
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: sunjincheng
>Assignee: sunjincheng
>  Labels: starter
>
> E=Math.E 
> PI=Math.PI



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6457) Clean up ScalarFunction and TableFunction interface

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057575#comment-16057575
 ] 

ASF GitHub Bot commented on FLINK-6457:
---

Github user fhueske commented on the issue:

https://github.com/apache/flink/pull/3880
  
Hi @Xpray, I left a comment on the related JIRA issue. In short, I do not see 
why we should remove methods with a constant signature and call them via 
reflection. This makes the code more prone to failures, it is less comfortable 
for users that need to override the methods, and it does not change anything for 
users that do not need these methods.

It would be great if you could reply on the JIRA.

Thank you, Fabian


> Clean up ScalarFunction and TableFunction interface
> ---
>
> Key: FLINK-6457
> URL: https://issues.apache.org/jira/browse/FLINK-6457
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Ruidong Li
>Assignee: Ruidong Li
>
> Motivation:
> Some methods in ScalarFunction and TableFunction are unnecessary, e.g., 
> toString() and getResultType in ScalarFunction.
> This issue intends to clean up the interfaces.
> Goal:
> Only methods related to `Collector` will remain in the TableFunction interface, 
> and the ScalarFunction interface shall have no methods. Users can choose whether 
> to implement the `getResultType` method, which will be called by reflection, 
> and the Flink documentation will have instructions for users.
> Future:
> There should be some annotations for users to mark methods, like `@Eval` 
> for the eval method; this will be done in the next issue.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6960) Add E(2.7182818284590452354),PI(3.14159265358979323846) supported in SQL

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057574#comment-16057574
 ] 

ASF GitHub Bot commented on FLINK-6960:
---

Github user sunjincheng121 commented on a diff in the pull request:

https://github.com/apache/flink/pull/4152#discussion_r123258425
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/expressions/ScalarFunctionsTest.scala
 ---
@@ -1116,6 +1116,13 @@ class ScalarFunctionsTest extends ExpressionTestBase 
{
   math.Pi.toString)
   }
 
+  @Test
+  def testE(): Unit = {
+testSqlApi(
+  "E()",
--- End diff --

Yes, it works well.


> Add E(2.7182818284590452354),PI(3.14159265358979323846) supported in SQL
> 
>
> Key: FLINK-6960
> URL: https://issues.apache.org/jira/browse/FLINK-6960
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: sunjincheng
>Assignee: sunjincheng
>  Labels: starter
>
> E=Math.E 
> PI=Math.PI



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6457) Clean up ScalarFunction and TableFunction interface

2017-06-21 Thread Fabian Hueske (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057572#comment-16057572
 ] 

Fabian Hueske commented on FLINK-6457:
--

I don't understand which problem this issue aims to resolve. All removed 
methods are implemented methods of abstract classes that users do not need to 
override. Removing these methods and calling them via reflection has a couple of 
issues:

- Users need to get the signature exactly right. When overriding existing 
methods, users have IDE and compiler support.
- Calling methods via reflection is always a bit brittle in my opinion.

UDF methods whose parameter and return types must be flexible cannot be declared 
as abstract methods, so calling them via reflection is justified in those cases. 
However, methods with a fixed signature can be called directly, IMO.

Can you explain why these changes are necessary?

Thank you, Fabian.
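
For illustration, a minimal sketch (hypothetical, not Flink's actual code) of 
the reflective lookup under discussion; the method is resolved by name at 
runtime, so a typo or a signature change surfaces only when the job runs, not 
at compile time:

{code}
import java.lang.reflect.Method

import org.apache.flink.api.common.typeinfo.TypeInformation

object ReflectiveLookup {
  def resultTypeOf(udf: AnyRef): Option[TypeInformation[_]] =
    try {
      val m: Method = udf.getClass.getMethod("getResultType") // exact name required
      Option(m.invoke(udf).asInstanceOf[TypeInformation[_]])
    } catch {
      case _: NoSuchMethodException => None // silently absent if misspelled
    }
}
{code}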

> Clean up ScalarFunction and TableFunction interface
> ---
>
> Key: FLINK-6457
> URL: https://issues.apache.org/jira/browse/FLINK-6457
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Ruidong Li
>Assignee: Ruidong Li
>
> Motivation:
> Some methods in ScalarFunction and TableFunction are unnecessary, e.g., 
> toString() and getResultType in ScalarFunction.
> This issue intends to clean up the interfaces.
> Goal:
> Only methods related to `Collector` will remain in the TableFunction interface, 
> and the ScalarFunction interface shall have no methods. Users can choose whether 
> to implement the `getResultType` method, which will be called by reflection, 
> and the Flink documentation will have instructions for users.
> Future:
> There should be some annotations for users to mark methods, like `@Eval` 
> for the eval method; this will be done in the next issue.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (FLINK-6965) Avro is missing snappy dependency

2017-06-21 Thread Chesnay Schepler (JIRA)
Chesnay Schepler created FLINK-6965:
---

 Summary: Avro is missing snappy dependency
 Key: FLINK-6965
 URL: https://issues.apache.org/jira/browse/FLINK-6965
 Project: Flink
  Issue Type: Bug
  Components: Type Serialization System
Affects Versions: 1.3.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.3.2


The shading rework made before 1.3 removed a snappy dependency that was 
accidentally pulled in through hadoop. This is technically alright, until 
class-loaders rear their ugly heads.

Our kafka connector can read avro records, which may or may not require snappy. 
Usually this _should _be solvable by including the snappy dependency in the 
user-jar if necessary, however since the kafka connector loads classes that it 
requires using the system class loader this doesn't work.

As such we have to add a separate snappy dependency to flink-core.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-6964) Fix recovery for incremental checkpoints in StandaloneCompletedCheckpointStore

2017-06-21 Thread Stefan Richter (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Richter updated FLINK-6964:
--
Description: {{StandaloneCompletedCheckpointStore}} does not register 
shared states on resume. However, for externalized checkpoints, it registers 
the checkpoint from which it resumed. This checkpoint gets added to the 
completed checkpoint store as part of resume.  (was: 
{{StandaloneCompletedCheckpointStore}} does not register shared states in the 
{{recover}} method. However, for externalized checkpoints, it register the 
checkpoint from which it resumed. This checkpoint gets added to the completed 
checkpoint store as part of resume.)

> Fix recovery for incremental checkpoints in StandaloneCompletedCheckpointStore
> --
>
> Key: FLINK-6964
> URL: https://issues.apache.org/jira/browse/FLINK-6964
> Project: Flink
>  Issue Type: Bug
>  Components: State Backends, Checkpointing
>Reporter: Stefan Richter
>Assignee: Stefan Richter
>
> {{StandaloneCompletedCheckpointStore}} does not register shared states on 
> resume. However, for externalized checkpoints, it registers the checkpoint 
> from which it resumed. This checkpoint gets added to the completed checkpoint 
> store as part of resume.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (FLINK-6949) Add ability to ship custom resource files to YARN cluster

2017-06-21 Thread Mikhail Pryakhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Pryakhin updated FLINK-6949:

Description: 
*The problem:*
When deploying a Flink job on YARN, it is not possible to specify custom 
resource files to be shipped to the YARN cluster.
 
*The use case description:*
When running a Flink job in multiple environments, it becomes necessary to pass 
environment-related configuration files to the job's runtime. This can be 
accomplished by packaging the configuration files within the job's jar, but with 
tens of different environments one can easily end up packaging as many jars as 
there are environments. It would be great to have the ability to separate 
configuration files from the job artifacts. 
 
*The possible solution:*
Add a --yarnship-files option to the Flink CLI to specify files that should be 
shipped to the YARN cluster.


  was:
*The problem:*
When deploying a flink job on YARN it is not possible to specify custom 
resource files to be shipped to YARN cluster.
 
*The use case description:*
When running a flink job on multiple environments it becomes necessary to pass 
environment-related configuration files to the job's runtime. It can be 
accomplished by packaging configuration files within the job's jar. But having 
tens of different environments one can easily end up packaging as many jar as 
there are environments. It would be great to have an ability to separate 
configuration files from the job artifacts. 
 
*The possible solution:*
add the --yarnship-files option to flink cli to specify files that should be 
shipped to the YARN cluster.



> Add ability to ship custom resource files to YARN cluster
> -
>
> Key: FLINK-6949
> URL: https://issues.apache.org/jira/browse/FLINK-6949
> Project: Flink
>  Issue Type: Improvement
>  Components: Client, YARN
>Affects Versions: 1.3.0
>Reporter: Mikhail Pryakhin
>Priority: Critical
>
> *The problem:*
> When deploying a Flink job on YARN, it is not possible to specify custom 
> resource files to be shipped to the YARN cluster.
>  
> *The use case description:*
> When running a Flink job in multiple environments, it becomes necessary to 
> pass environment-related configuration files to the job's runtime. This can be 
> accomplished by packaging the configuration files within the job's jar, but 
> with tens of different environments one can easily end up packaging as many 
> jars as there are environments. It would be great to have the ability to 
> separate configuration files from the job artifacts. 
>  
> *The possible solution:*
> Add a --yarnship-files option to the Flink CLI to specify files that should be 
> shipped to the YARN cluster.
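
A hypothetical invocation of the proposed option (the flag does not exist yet; 
paths are illustrative):

{code}
./bin/flink run -m yarn-cluster \
  --yarnship-files /etc/env/prod/app.properties,/etc/env/prod/log4j.xml \
  ./target/my-job.jar
{code}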



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6916) FLIP-19: Improved BLOB storage architecture

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057536#comment-16057536
 ] 

ASF GitHub Bot commented on FLINK-6916:
---

GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/4158

[FLINK-6916][blob] remove (unused) NAME_ADDRESSABLE mode

This is based upon #4146 and in addition removes the currently unused 
`NAME_ADDRESSABLE` BLOB addressing mode. There are currently no plans to use it 
and previous attempts to do so resulted in several bugs showing up, especially 
regarding the cleanup of these files (see #3742, #3512).
FLIP-19 will actually also use the `CONTENT_ADDRESSABLE` mode to achieve 
the same and thus `NAME_ADDRESSABLE` can finally be removed.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink flink-6916-remove-nameaddressable

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4158.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4158


commit ce719ee39fbbca7b7828c17d9792fc87d37450c7
Author: Nico Kruber 
Date:   2017-01-06T17:42:58Z

[FLINK-6008][docs] update some config options to the new, non-deprecated 
ones

commit 9efa8808e46adc1253f52a6a8cec6d3b4d29fee3
Author: Nico Kruber 
Date:   2016-12-20T15:49:57Z

[FLINK-6008][docs] minor improvements in the BlobService docs

commit ca3d533b0affa645ec93d40de378dadc829bbfe5
Author: Nico Kruber 
Date:   2016-12-20T17:27:13Z

[FLINK-6008] refactor BlobCache#getURL() for cleaner code

commit 0eededeb36dd833835753def7f4bb27c9d5fb67e
Author: Nico Kruber 
Date:   2017-03-09T17:14:02Z

[FLINK-6008] use Preconditions.checkArgument in BlobClient

commit 6249041a9db2b39ddf54e79a1aed5e7706e739c7
Author: Nico Kruber 
Date:   2016-12-21T15:23:29Z

[FLINK-6008] do not fail the BlobServer if delete fails

also extend the delete tests and remove one code duplication

commit e681239a538547f752d65358db1ebd2ba312b33c
Author: Nico Kruber 
Date:   2017-03-17T15:21:40Z

[FLINK-6008] fix concurrent job directory creation

also add according unit tests

commit 20beae2dbc91859e2ec724b35b20536dcd11fe90
Author: Nico Kruber 
Date:   2017-04-18T14:37:37Z

[FLINK-6008] some comments about BlobLibraryCacheManager cleanup

commit 8a33517fe6eb2fa932ab17cb0d82a3fa8d7b8d0b
Author: Nico Kruber 
Date:   2017-04-19T13:39:03Z

[hotfix] minor typos

commit 23889866ac21494fc4af90905ab1518cbe897118
Author: Nico Kruber 
Date:   2017-04-19T14:10:16Z

[FLINK-6008] further cleanup tests for BlobLibraryCacheManager

commit 01b1a245528c264a6061ed3a48b24c5a207369f6
Author: Nico Kruber 
Date:   2017-06-14T16:01:47Z

[FLINK-6008] do not guard a delete() call with a check for existence

commit cb249759b79d88eda37a8bb149040be3052059ac
Author: Nico Kruber 
Date:   2017-06-16T08:51:04Z

[FLINK-6916][blob] remove (unused) NAME_ADDRESSABLE mode




> FLIP-19: Improved BLOB storage architecture
> ---
>
> Key: FLINK-6916
> URL: https://issues.apache.org/jira/browse/FLINK-6916
> Project: Flink
>  Issue Type: Improvement
>  Components: Network
>Affects Versions: 1.4.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>
> The current architecture around the BLOB server and cache components seems 
> rather patched up and has some issues regarding concurrency ([FLINK-6380]), 
> cleanup, API inconsistencies / currently unused API ([FLINK-6329], 
> [FLINK-6008]). These make future integration with FLIP-6 or extensions like 
> offloading oversized RPC messages ([FLINK-6046]) difficult. We therefore 
> propose an improvement on the current architecture as described below which 
> tackles these issues, provides some cleanup, and enables further BLOB server 
> use cases.
> Please refer to 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-19%3A+Improved+BLOB+storage+architecture
>  for a full overview on the proposed changes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #4157: [Flink 6649][table]Improve Non-window group aggreg...

2017-06-21 Thread sunjincheng121
GitHub user sunjincheng121 opened a pull request:

https://github.com/apache/flink/pull/4157

[Flink 6649][table]Improve Non-window group aggregate with update int…

In this PR I have added support for updating the calculated data according to 
the specified time interval on the non-window group AGG. If we configure the 
time interval to be N seconds, then the next update time, relative to the latest 
update time T, is T+N seconds. For example, if the time interval is 2 seconds 
and the previous update time is T seconds, then the next update time 
T1 >= T + 2 seconds. If no data arrives during T to T + 2, no updates are made.

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [x] General
  - The pull request references the related JIRA issue ("[Flink 
6649][table]Improve Non-window group aggregate with update interval.")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [x] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sunjincheng121/flink FLINK-6649

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/4157.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4157


commit 68ecd416bbdf70ed01d195c805c92d03f5ae6004
Author: sunjincheng121 
Date:   2017-06-14T01:36:48Z

[Flink 6649][table]Improve Non-window group aggregate with update interval.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123016738
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class to help analyze and build join code.
+  */
+object JoinUtil {
+
+  /**
+* check if the join is a stream-to-stream join
+*
+* @param  condition   other condition, including the time-condition
+* @param  inputType   left and right connected stream type
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time-condition to get the time boundary for each stream, get 
the time type,
+* and return the remaining condition.
+*
+* @param  condition   other condition, including the time-condition
+* @param  leftLogicalFieldCnt left stream logical field num
+* @param  leftPhysicalFieldCnt left stream physical field num
+* @param  inputType   left and right connected stream type
+* @param  rexBuilder   util to build a RexNode
+* @param  config  table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+  )
+
+if (timeTerms.size != 2) {
+  throw new TableException("There must be exactly 2 time 
conditions.")
+}
+
+// extract time offset from the time indicator condition
+val 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057460#comment-16057460
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123031759
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class to help analyze and build join code.
+  */
+object JoinUtil {
+
+  /**
+* check if the join is a stream-to-stream join
+*
+* @param  condition   other condition, including the time-condition
+* @param  inputType   left and right connected stream type
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time-condition to get the time boundary for each stream, get 
the time type,
+* and return the remaining condition.
+*
+* @param  condition   other condition, including the time-condition
+* @param  leftLogicalFieldCnt left stream logical field num
+* @param  leftPhysicalFieldCnt left stream physical field num
+* @param  inputType   left and right connected stream type
+* @param  rexBuilder   util to build a RexNode
+* @param  config  table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+   

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123009399
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class to help analyze and build join code.
+  */
+object JoinUtil {
+
+  /**
+* check if the join is a stream-to-stream join
+*
+* @param  condition   other condition, including the time-condition
+* @param  inputType   left and right connected stream type
+*/
+  private[flink] def isStreamStreamJoin(
--- End diff --

The check is only approximate, i.e., the stream join operator might not be 
able to execute the query even if this check passes.

For example, it only checks that there is at least one time indicator in the 
condition. However, we would need to check that there are exactly two 
conjunctive terms that have time indicator attributes on both sides and define 
bounds on both sides, i.e., basically the complete analysis that we later do in 
the join. I think we can do this analysis already in the rule and pass the 
result of the analysis to the join.
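
For illustration, a hypothetical query whose condition contains exactly two 
such conjunctive bound terms (schema as in the JIRA example):

{code}
SELECT o.orderId, o.proctime
FROM Orders AS o JOIN Shipments AS s
  ON o.orderId = s.orderId
  AND o.proctime >= s.proctime - INTERVAL '1' HOUR  -- lower bound
  AND o.proctime <= s.proctime + INTERVAL '1' HOUR  -- upper bound
{code}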


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123228354
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, 
TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction to support joining two streams; currently only 
supports inner-join
+  *
+  * @param leftStreamWindowSize  the left stream window size
+  * @param rightStreamWindowSize the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName  the function name of the other non-equi condition
+  * @param genJoinFuncCode  the function code of the other non-equi condition
+  *
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]]{
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new 
ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], 
rowListTypeInfo1)
+row1MapState = getRuntimeContext.getMapState(mapStateDescriptor1)
+
+val rowListTypeInfo2: TypeInformation[JList[Row]] = new 
ListTypeInfo[Row](element2Type)
+val mapStateDescriptor2: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057466#comment-16057466
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123209008
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, 
TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction to support joining two streams; currently only 
supports inner-join
+  *
+  * @param leftStreamWindowSize  the left stream window size
+  * @param rightStreamWindowSize the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName  the function name of the other non-equi condition
+  * @param genJoinFuncCode  the function code of the other non-equi condition
+  *
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]]{
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
--- End diff --

add a new line before the method


> Support proctime inner equi-join between two streams in the SQL API
> ---
>
> Key: FLINK-6232
> URL: https://issues.apache.org/jira/browse/FLINK-6232
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: hongyuhong
>Assignee: hongyuhong
>
> The goal of this issue is to add support for inner equi-join on proc time 
> streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT o.proctime, o.productId, o.orderId, s.proctime AS shipTime 
> FROM Orders AS o 
> JOIN Shipments AS s 
> ON o.orderId = s.orderId 
> AND o.proctime BETWEEN s.proctime AND s.proctime + INTERVAL '1' HOUR;
> {code}
> The 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123031759
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class to help analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Checks whether the join case is a stream-to-stream join.
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the combined row type of the left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyzes the time condition to extract the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition            the join condition, including the time condition
+* @param  leftLogicalFieldCnt  the number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt the number of physical fields of the left stream
+* @param  inputType            the combined row type of the left and right streams
+* @param  rexBuilder           utility to build RexNodes
+* @param  config               the table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+  )
+
+if (timeTerms.size != 2) {
+  throw new TableException("There must be exactly two time conditions.")
+}
+
+// extract the time offset from the time indicator condition
+val 
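The CNF step above can be exercised in isolation. A hedged sketch using Calcite's RexUtil directly (the type-factory setup is illustrative, not the PR's code); it rewrites (a AND b) OR c into (a OR c) AND (b OR c), so that splitJoinCondition can treat the predicate as a flat list of conjuncts:

{code}
import org.apache.calcite.jdbc.JavaTypeFactoryImpl
import org.apache.calcite.rex.{RexBuilder, RexUtil}
import org.apache.calcite.sql.fun.SqlStdOperatorTable
import org.apache.calcite.sql.type.SqlTypeName

val typeFactory = new JavaTypeFactoryImpl()
val rexBuilder = new RexBuilder(typeFactory)
val boolType = typeFactory.createSqlType(SqlTypeName.BOOLEAN)

// three boolean input refs standing in for predicates
val a = rexBuilder.makeInputRef(boolType, 0)
val b = rexBuilder.makeInputRef(boolType, 1)
val c = rexBuilder.makeInputRef(boolType, 2)

// (a AND b) OR c
val cond = rexBuilder.makeCall(SqlStdOperatorTable.OR,
  rexBuilder.makeCall(SqlStdOperatorTable.AND, a, b), c)

// => AND(OR($0, $2), OR($1, $2))
println(RexUtil.toCnf(rexBuilder, cond))
{code}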

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057458#comment-16057458
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123016738
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class to help analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Checks whether the join case is a stream-to-stream join.
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the combined row type of the left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyzes the time condition to extract the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition            the join condition, including the time condition
+* @param  leftLogicalFieldCnt  the number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt the number of physical fields of the left stream
+* @param  inputType            the combined row type of the left and right streams
+* @param  rexBuilder           utility to build RexNodes
+* @param  config               the table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+   
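The isExistTumble helper quoted above is a generic recursive existence check over an expression tree; a self-contained sketch of the idiom on a toy tree (the types are illustrative, not Calcite's):

{code}
sealed trait Expr
case class Call(op: String, operands: List[Expr]) extends Expr
case object Leaf extends Expr

def containsOp(expr: Expr, op: String): Boolean = expr match {
  case Call(`op`, _)     => true                                 // operator found
  case Call(_, operands) => operands.exists(containsOp(_, op))   // recurse into children
  case Leaf              => false
}

// containsOp(Call("AND", List(Call("TUMBLE", Nil), Leaf)), "TUMBLE") == true
{code}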

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057469#comment-16057469
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123232793
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/stream/sql/JoinTest.scala
 ---
@@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.api.scala.stream.sql
+
+import org.apache.calcite.rel.logical.LogicalJoin
+import org.apache.flink.api.scala._
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.runtime.join.JoinUtil
+import org.apache.flink.table.utils.TableTestUtil._
+import org.apache.flink.table.utils.{StreamTableTestUtil, TableTestBase}
+import org.junit.Assert._
+import org.junit.Test
+
+class JoinTest extends TableTestBase {
+  private val streamUtil: StreamTableTestUtil = streamTestUtil()
+  streamUtil.addTable[(Int, String, Long)]("MyTable", 'a, 'b, 'c.rowtime, 
'proctime.proctime)
+  streamUtil.addTable[(Int, String, Long)]("MyTable2", 'a, 'b, 'c.rowtime, 
'proctime.proctime)
+
+  @Test
+  def testProcessingTimeInnerJoin() = {
+
+val sqlQuery = "SELECT t1.a, t2.b " +
+  "FROM MyTable as t1 join MyTable2 as t2 on t1.a = t2.a and " +
+  "t1.proctime between t2.proctime - interval '1' hour and t2.proctime 
+ interval '1' hour"
+val expected =
+  unaryNode(
+"DataStreamCalc",
+binaryNode(
+  "DataStreamRowStreamJoin",
+  unaryNode(
+"DataStreamCalc",
+streamTableNode(0),
+term("select", "a", "proctime")
+  ),
+  unaryNode(
+"DataStreamCalc",
+streamTableNode(1),
+term("select", "a", "b", "proctime")
+  ),
+  term("condition",
+"AND(=(a, a0), >=(TIME_MATERIALIZATION(proctime), " +
+  "-(TIME_MATERIALIZATION(proctime0), 360)), " +
+  "<=(TIME_MATERIALIZATION(proctime), " +
+  "DATETIME_PLUS(TIME_MATERIALIZATION(proctime0), 360)))"),
+  term("select", "a, proctime, a0, b, proctime0"),
+  term("joinType", "InnerJoin")
+),
+term("select", "a", "b")
+  )
+
+streamUtil.verifySql(sqlQuery, expected)
+  }
+
+
+  @Test
+  def testJoinTimeBoundary(): Unit = {
+verifyTimeBoundary(
+  "t1.proctime between t2.proctime - interval '1' hour " +
+"and t2.proctime + interval '1' hour",
+  360,
+  360,
+  "proctime")
+
+verifyTimeBoundary(
+  "t1.proctime > t2.proctime - interval '1' second and " +
+"t1.proctime < t2.proctime + interval '1' second",
+  999,
+  999,
+  "proctime")
+
+verifyTimeBoundary(
+  "t1.c >= t2.c - interval '1' second and " +
+"t1.c <= t2.c + interval '1' second",
+  1000,
+  1000,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t1.c >= t2.c and " +
+"t1.c <= t2.c + interval '1' second",
+  0,
+  1000,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t1.c >= t2.c + interval '1' second and " +
+"t1.c <= t2.c + interval '10' second",
+  0,
+  1,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t2.c - interval '1' second <= t1.c and " +
+"t2.c + interval '10' second >= t1.c",
+  1000,
+  1,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t1.c - interval '2' second >= t2.c + interval '1' second -" +
+"interval '10' second and " +
+
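For readers checking the expected bounds: they are the interval literals converted to milliseconds, with strict comparisons excluding the boundary. A worked sketch of the arithmetic, using only values asserted above:

{code}
// t1.time BETWEEN t2.time - X AND t2.time + Y      => bounds (X ms, Y ms)
//   interval '1' hour = 3600000 ms                 => (3600000, 3600000)
//   interval '1' second = 1000 ms                  => (1000, 1000)
// strict '>' / '<' exclude the boundary, shrinking each bound by 1 ms:
//   t1 > t2 - 1s and t1 < t2 + 1s                  => (999, 999)
// a bound of zero comes from an offset-free comparison:
//   t1 >= t2 and t1 <= t2 + 1s                     => (0, 1000)
{code}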

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123035300
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class to help analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Checks whether the join case is a stream-to-stream join.
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the combined row type of the left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyzes the time condition to extract the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition            the join condition, including the time condition
+* @param  leftLogicalFieldCnt  the number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt the number of physical fields of the left stream
+* @param  inputType            the combined row type of the left and right streams
+* @param  rexBuilder           utility to build RexNodes
+* @param  config               the table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+  )
+
+if (timeTerms.size != 2) {
+  throw new TableException("There must be exactly two time conditions.")
+}
+
+// extract the time offset from the time indicator condition
+val 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123228716
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, 
TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction to support stream-to-stream joins; currently only
+  * inner joins are supported.
+  *
+  * @param leftStreamWindowSize   the left stream window size
+  * @param rightStreamWindowSize  the right stream window size
+  * @param element1Type           the input type of the left stream
+  * @param element2Type           the input type of the right stream
+  * @param genJoinFuncName        the function name of the non-equi condition
+  * @param genJoinFuncCode        the function code of the non-equi condition
+  *
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]]{
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new 
ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], 
rowListTypeInfo1)
+row1MapState = getRuntimeContext.getMapState(mapStateDescriptor1)
+
+val rowListTypeInfo2: TypeInformation[JList[Row]] = new 
ListTypeInfo[Row](element2Type)
+val mapStateDescriptor2: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, 
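The compile call in the quoted open() follows the usual generate-compile-instantiate pattern; a hedged sketch of how such a step can be done with Janino (the method and parameter names are illustrative, not necessarily the PR's Compiler trait):

{code}
import org.codehaus.janino.SimpleCompiler

// Compile generated source against the user-code classloader and instantiate it.
def compileAndInstantiate[T](cl: ClassLoader, name: String, code: String): T = {
  val compiler = new SimpleCompiler()
  compiler.setParentClassLoader(cl)   // resolve user classes referenced by the code
  compiler.cook(code)                 // compile the generated source string
  compiler.getClassLoader.loadClass(name).newInstance().asInstanceOf[T]
}
{code}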

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123030850
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class to help analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Checks whether the join case is a stream-to-stream join.
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the combined row type of the left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyzes the time condition to extract the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition            the join condition, including the time condition
+* @param  leftLogicalFieldCnt  the number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt the number of physical fields of the left stream
+* @param  inputType            the combined row type of the left and right streams
+* @param  rexBuilder           utility to build RexNodes
+* @param  config               the table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+  )
+
+if (timeTerms.size != 2) {
+  throw new TableException("There must be exactly two time conditions.")
+}
+
+// extract the time offset from the time indicator condition
+val 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123226695
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, 
TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction to support stream-to-stream joins; currently only
+  * inner joins are supported.
+  *
+  * @param leftStreamWindowSize   the left stream window size
+  * @param rightStreamWindowSize  the right stream window size
+  * @param element1Type           the input type of the left stream
+  * @param element2Type           the input type of the right stream
+  * @param genJoinFuncName        the function name of the non-equi condition
+  * @param genJoinFuncCode        the function code of the non-equi condition
+  *
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]]{
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new 
ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], 
rowListTypeInfo1)
+row1MapState = getRuntimeContext.getMapState(mapStateDescriptor1)
+
+val rowListTypeInfo2: TypeInformation[JList[Row]] = new 
ListTypeInfo[Row](element2Type)
+val mapStateDescriptor2: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123236585
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/harness/JoinHarnessTest.scala
 ---
@@ -0,0 +1,200 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.harness
+
+import java.util.concurrent.ConcurrentLinkedQueue
+import java.lang.{Integer => JInt}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, 
TypeInformation}
+import org.apache.flink.api.java.typeutils.RowTypeInfo
+import org.apache.flink.streaming.api.operators.co.KeyedCoProcessOperator
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord
+import 
org.apache.flink.streaming.util.{KeyedTwoInputStreamOperatorTestHarness, 
TwoInputStreamOperatorTestHarness}
+import org.apache.flink.table.codegen.GeneratedFunction
+import 
org.apache.flink.table.runtime.harness.HarnessTestBase.{RowResultSortComparator,
 TupleRowKeySelector}
+import org.apache.flink.table.runtime.join.ProcTimeInnerJoin
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.junit.Test
+
+
+class JoinHarnessTest extends HarnessTestBase{
+
+  private val rT = new RowTypeInfo(Array[TypeInformation[_]](
+INT_TYPE_INFO,
+STRING_TYPE_INFO),
+Array("a", "b"))
+
+
+  val funcCode: String =
+"""
+  |public class TestJoinFunction
+  |  extends 
org.apache.flink.api.common.functions.RichFlatJoinFunction {
+  |  transient org.apache.flink.types.Row out =
+  |new org.apache.flink.types.Row(4);
+  |  public TestJoinFunction() throws Exception {}
+  |
+  |  @Override
+  |  public void open(org.apache.flink.configuration.Configuration 
parameters)
+  |  throws Exception {}
+  |
+  |  @Override
+  |  public void join(Object _in1, Object _in2, 
org.apache.flink.util.Collector c)
+  |   throws Exception {
+  |   org.apache.flink.types.Row in1 = (org.apache.flink.types.Row) 
_in1;
+  |   org.apache.flink.types.Row in2 = (org.apache.flink.types.Row) 
_in2;
+  |
+  |   out.setField(0, in1.getField(0));
+  |   out.setField(1, in1.getField(1));
+  |   out.setField(2, in2.getField(0));
+  |   out.setField(3, in2.getField(1));
+  |
+  |   c.collect(out);
+  |
+  |  }
+  |
+  |  @Override
+  |  public void close() throws Exception {}
+  |}
+""".stripMargin
+
+  @Test
+  def testProcTimeJoin() {
+
+val joinProcessFunc = new ProcTimeInnerJoin(10, 20, rT, rT, 
"TestJoinFunction", funcCode)
+
+val operator: KeyedCoProcessOperator[Integer, CRow, CRow, CRow] =
+  new KeyedCoProcessOperator[Integer, CRow, CRow, 
CRow](joinProcessFunc)
+val testHarness: TwoInputStreamOperatorTestHarness[CRow, CRow, CRow] =
+  new KeyedTwoInputStreamOperatorTestHarness[Integer, CRow, CRow, 
CRow](
+   operator,
+   new TupleRowKeySelector[Integer](0),
+   new TupleRowKeySelector[Integer](0),
+   BasicTypeInfo.INT_TYPE_INFO,
+   1,1,0)
+
+testHarness.open()
+
+testHarness.setProcessingTime(1)
+testHarness.processElement1(new StreamRecord(
+  CRow(Row.of(1: JInt, "aaa"), true), 1))
+testHarness.setProcessingTime(2)
+testHarness.processElement1(new StreamRecord(
+  CRow(Row.of(2: JInt, "bbb"), true), 2))
+testHarness.setProcessingTime(3)
+testHarness.processElement1(new StreamRecord(
+  CRow(Row.of(1: JInt, "aaa2"), true), 3))
+
+testHarness.processElement2(new StreamRecord(
+  CRow(Row.of(1: JInt, "Hi1"), true), 3))
+
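A harness test like this typically concludes by draining the emitted records and closing the harness; a hedged sketch of that tail (the expected rows are illustrative, not asserted by the quoted diff):

{code}
// With window sizes of 10/20 ms, the left rows at times 1 and 3 are still live
// when "Hi1" arrives at time 3, so each should join with the right row.
val output = testHarness.getOutput   // queue of emitted StreamRecords
// e.g. CRow(Row.of(1, "aaa", 1, "Hi1"), true), CRow(Row.of(1, "aaa2", 1, "Hi1"), true)
testHarness.close()
{code}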

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123017357
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class to help analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Checks whether the join case is a stream-to-stream join.
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the combined row type of the left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
--- End diff --

replace `_.size > 0` with `_.nonEmpty`
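Spelled out on one of the quoted lines (illustrative):

{code}
// before
c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.size > 0)
// after the suggested change
c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.nonEmpty)
{code}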




[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123020714
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class to help analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Checks whether the join case is a stream-to-stream join.
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the combined row type of the left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyzes the time condition to extract the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition            the join condition, including the time condition
+* @param  leftLogicalFieldCnt  the number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt the number of physical fields of the left stream
+* @param  inputType            the combined row type of the left and right streams
+* @param  rexBuilder           utility to build RexNodes
+* @param  config               the table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+  )
+
+if (timeTerms.size != 2) {
+  throw new TableException("There must be exactly two time conditions.")
+}
+
+// extract the time offset from the time indicator condition
+val 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057463#comment-16057463
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123226130
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, 
TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction to support stream-to-stream joins; currently only
+  * inner joins are supported.
+  *
+  * @param leftStreamWindowSize   the left stream window size
+  * @param rightStreamWindowSize  the right stream window size
+  * @param element1Type           the input type of the left stream
+  * @param element2Type           the input type of the right stream
+  * @param genJoinFuncName        the function name of the non-equi condition
+  * @param genJoinFuncCode        the function code of the non-equi condition
+  *
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]]{
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new 
ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], 
rowListTypeInfo1)
+row1MapState = 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123021563
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class to help analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Checks whether the join case is a stream-to-stream join.
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the combined row type of the left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyzes the time condition to extract the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition            the join condition, including the time condition
+* @param  leftLogicalFieldCnt  the number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt the number of physical fields of the left stream
+* @param  inputType            the combined row type of the left and right streams
+* @param  rexBuilder           utility to build RexNodes
+* @param  config               the table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+  )
+
+if (timeTerms.size != 2) {
+  throw new TableException("There must be exactly two time conditions.")
+}
+
+// extract the time offset from the time indicator condition
+val 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057461#comment-16057461
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123228716
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, 
TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction to support stream-to-stream joins; currently only
+  * inner joins are supported.
+  *
+  * @param leftStreamWindowSize   the left stream window size
+  * @param rightStreamWindowSize  the right stream window size
+  * @param element1Type           the input type of the left stream
+  * @param element2Type           the input type of the right stream
+  * @param genJoinFuncName        the function name of the non-equi condition
+  * @param genJoinFuncCode        the function code of the non-equi condition
+  *
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]]{
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new 
ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], 
rowListTypeInfo1)
+row1MapState = 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057479#comment-16057479
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123227488
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, 
TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction that joins two streams; currently only inner joins are supported.
+  *
+  * @param leftStreamWindowSize   the left stream window size
+  * @param rightStreamWindowSize  the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName  the name of the generated function for the non-equi condition
+  * @param genJoinFuncCode  the code of the generated function for the non-equi condition
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]]{
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new 
ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], 
rowListTypeInfo1)
+row1MapState = 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123233427
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/stream/sql/JoinTest.scala
 ---
@@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.api.scala.stream.sql
+
+import org.apache.calcite.rel.logical.LogicalJoin
+import org.apache.flink.api.scala._
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.runtime.join.JoinUtil
+import org.apache.flink.table.utils.TableTestUtil._
+import org.apache.flink.table.utils.{StreamTableTestUtil, TableTestBase}
+import org.junit.Assert._
+import org.junit.Test
+
+class JoinTest extends TableTestBase {
+  private val streamUtil: StreamTableTestUtil = streamTestUtil()
+  streamUtil.addTable[(Int, String, Long)]("MyTable", 'a, 'b, 'c.rowtime, 
'proctime.proctime)
+  streamUtil.addTable[(Int, String, Long)]("MyTable2", 'a, 'b, 'c.rowtime, 
'proctime.proctime)
+
+  @Test
+  def testProcessingTimeInnerJoin() = {
+
+val sqlQuery = "SELECT t1.a, t2.b " +
+  "FROM MyTable as t1 join MyTable2 as t2 on t1.a = t2.a and " +
+  "t1.proctime between t2.proctime - interval '1' hour and t2.proctime 
+ interval '1' hour"
+val expected =
+  unaryNode(
+"DataStreamCalc",
+binaryNode(
+  "DataStreamRowStreamJoin",
+  unaryNode(
+"DataStreamCalc",
+streamTableNode(0),
+term("select", "a", "proctime")
+  ),
+  unaryNode(
+"DataStreamCalc",
+streamTableNode(1),
+term("select", "a", "b", "proctime")
+  ),
+  term("condition",
+"AND(=(a, a0), >=(TIME_MATERIALIZATION(proctime), " +
+  "-(TIME_MATERIALIZATION(proctime0), 360)), " +
+  "<=(TIME_MATERIALIZATION(proctime), " +
+  "DATETIME_PLUS(TIME_MATERIALIZATION(proctime0), 360)))"),
+  term("select", "a, proctime, a0, b, proctime0"),
+  term("joinType", "InnerJoin")
+),
+term("select", "a", "b")
+  )
+
+streamUtil.verifySql(sqlQuery, expected)
+  }
+
+
+  @Test
+  def testJoinTimeBoundary(): Unit = {
+verifyTimeBoundary(
+  "t1.proctime between t2.proctime - interval '1' hour " +
+"and t2.proctime + interval '1' hour",
+  360,
+  360,
+  "proctime")
+
+verifyTimeBoundary(
+  "t1.proctime > t2.proctime - interval '1' second and " +
+"t1.proctime < t2.proctime + interval '1' second",
+  999,
+  999,
+  "proctime")
+
+verifyTimeBoundary(
+  "t1.c >= t2.c - interval '1' second and " +
+"t1.c <= t2.c + interval '1' second",
+  1000,
+  1000,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t1.c >= t2.c and " +
+"t1.c <= t2.c + interval '1' second",
+  0,
+  1000,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t1.c >= t2.c + interval '1' second and " +
+"t1.c <= t2.c + interval '10' second",
+  0,
+  1,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t2.c - interval '1' second <= t1.c and " +
+"t2.c + interval '10' second >= t1.c",
+  1000,
+  1,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t1.c - interval '2' second >= t2.c + interval '1' second -" +
+"interval '10' second and " +
+"t1.c <= t2.c + interval '10' second",
+  7000,
+  1,
+  "rowtime")
+  }
+
+  def verifyTimeBoundary(
--- End diff --

I would move this also into a dedicated test class. I don't 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057476#comment-16057476
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123037106
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class that helps analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* check if the join case is a stream-to-stream join
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the combined row type of the left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition   the join condition, including the time condition
+* @param  leftLogicalFieldCnt the number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt the number of physical fields of the left stream
+* @param  inputType   the combined row type of the left and right streams
+* @param  rexBuilder   the builder used to construct RexNodes
+* @param  config  the table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+   

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057477#comment-16057477
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123030850
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class that helps analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* check if the join case is a stream-to-stream join
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the combined row type of the left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition   the join condition, including the time condition
+* @param  leftLogicalFieldCnt the number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt the number of physical fields of the left stream
+* @param  inputType   the combined row type of the left and right streams
+* @param  rexBuilder   the builder used to construct RexNodes
+* @param  config  the table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+   

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123226087
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, 
TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction that joins two streams; currently only inner joins are supported.
+  *
+  * @param leftStreamWindowSize   the left stream window size
+  * @param rightStreamWindowSize  the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName  the name of the generated function for the non-equi condition
+  * @param genJoinFuncCode  the code of the generated function for the non-equi condition
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]]{
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new 
ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], 
rowListTypeInfo1)
+row1MapState = getRuntimeContext.getMapState(mapStateDescriptor1)
+
+val rowListTypeInfo2: TypeInformation[JList[Row]] = new 
ListTypeInfo[Row](element2Type)
+val mapStateDescriptor2: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123020909
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class that helps analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* check if the join case is a stream-to-stream join
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the combined row type of the left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition   the join condition, including the time condition
+* @param  leftLogicalFieldCnt the number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt the number of physical fields of the left stream
+* @param  inputType   the combined row type of the left and right streams
+* @param  rexBuilder   the builder used to construct RexNodes
+* @param  config  the table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+  )
+
+if (timeTerms.size != 2) {
+  throw new TableException("There must be exactly two time conditions.")
+}
+
+// extract time offset from the time indicator condition
+val 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123226130
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, 
TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction that joins two streams; currently only inner joins are supported.
+  *
+  * @param leftStreamWindowSize   the left stream window size
+  * @param rightStreamWindowSize  the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName  the name of the generated function for the non-equi condition
+  * @param genJoinFuncCode  the code of the generated function for the non-equi condition
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]]{
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new 
ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], 
rowListTypeInfo1)
+row1MapState = getRuntimeContext.getMapState(mapStateDescriptor1)
+
+val rowListTypeInfo2: TypeInformation[JList[Row]] = new 
ListTypeInfo[Row](element2Type)
+val mapStateDescriptor2: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057456#comment-16057456
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123020909
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class that helps analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* check if the join case is a stream-to-stream join
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the combined row type of the left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition   the join condition, including the time condition
+* @param  leftLogicalFieldCnt the number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt the number of physical fields of the left stream
+* @param  inputType   the combined row type of the left and right streams
+* @param  rexBuilder   the builder used to construct RexNodes
+* @param  config  the table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+   

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123225232
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, 
TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction that joins two streams; currently only inner joins are supported.
+  *
+  * @param leftStreamWindowSize   the left stream window size
+  * @param rightStreamWindowSize  the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName  the name of the generated function for the non-equi condition
+  * @param genJoinFuncCode  the code of the generated function for the non-equi condition
+  */
+class ProcTimeInnerJoin(
--- End diff --

Does it make sense to split the implementation into two operators:
1. both streams need to be buffered (`l.ptime > r.ptime - 10.secs AND 
l.ptime < r.ptime + 5.secs`)
2. only one stream needs to be buffered (`l.ptime > r.ptime - 10.secs AND 
l.ptime < r.ptime - 5.secs`)
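
For illustration, here is a minimal, self-contained sketch (hypothetical names,
not code from this PR) of how the analyzed bounds could pick the variant, with
the bounds normalized as millisecond offsets of `l.ptime` relative to `r.ptime`:

{code}
// Minimal sketch, assuming the window is
//   r.ptime + lowerBound <= l.ptime <= r.ptime + upperBound
// with bounds in milliseconds. All names are illustrative only.
object JoinVariantChooser {

  // Both sides must be buffered only if rows of either stream can arrive before
  // their matching partners, i.e. the window covers offsets on both sides of zero.
  def needsBothBuffers(lowerBound: Long, upperBound: Long): Boolean =
    lowerBound < 0 && upperBound > 0

  def main(args: Array[String]): Unit = {
    // l.ptime > r.ptime - 10.secs AND l.ptime < r.ptime + 5.secs -> (-10000, 5000)
    println(needsBothBuffers(-10000L, 5000L))   // true: variant 1, buffer both streams
    // l.ptime > r.ptime - 10.secs AND l.ptime < r.ptime - 5.secs -> (-10000, -5000)
    println(needsBothBuffers(-10000L, -5000L))  // false: variant 2, buffer one stream
  }
}
{code}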


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057454#comment-16057454
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123026855
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class that helps analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* check if the join case is a stream-to-stream join
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the combined row type of the left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition   the join condition, including the time condition
+* @param  leftLogicalFieldCnt the number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt the number of physical fields of the left stream
+* @param  inputType   the combined row type of the left and right streams
+* @param  rexBuilder   the builder used to construct RexNodes
+* @param  config  the table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+   

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123232793
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/stream/sql/JoinTest.scala
 ---
@@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.api.scala.stream.sql
+
+import org.apache.calcite.rel.logical.LogicalJoin
+import org.apache.flink.api.scala._
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.runtime.join.JoinUtil
+import org.apache.flink.table.utils.TableTestUtil._
+import org.apache.flink.table.utils.{StreamTableTestUtil, TableTestBase}
+import org.junit.Assert._
+import org.junit.Test
+
+class JoinTest extends TableTestBase {
+  private val streamUtil: StreamTableTestUtil = streamTestUtil()
+  streamUtil.addTable[(Int, String, Long)]("MyTable", 'a, 'b, 'c.rowtime, 
'proctime.proctime)
+  streamUtil.addTable[(Int, String, Long)]("MyTable2", 'a, 'b, 'c.rowtime, 
'proctime.proctime)
+
+  @Test
+  def testProcessingTimeInnerJoin() = {
+
+val sqlQuery = "SELECT t1.a, t2.b " +
+  "FROM MyTable as t1 join MyTable2 as t2 on t1.a = t2.a and " +
+  "t1.proctime between t2.proctime - interval '1' hour and t2.proctime 
+ interval '1' hour"
+val expected =
+  unaryNode(
+"DataStreamCalc",
+binaryNode(
+  "DataStreamRowStreamJoin",
+  unaryNode(
+"DataStreamCalc",
+streamTableNode(0),
+term("select", "a", "proctime")
+  ),
+  unaryNode(
+"DataStreamCalc",
+streamTableNode(1),
+term("select", "a", "b", "proctime")
+  ),
+  term("condition",
+"AND(=(a, a0), >=(TIME_MATERIALIZATION(proctime), " +
+  "-(TIME_MATERIALIZATION(proctime0), 360)), " +
+  "<=(TIME_MATERIALIZATION(proctime), " +
+  "DATETIME_PLUS(TIME_MATERIALIZATION(proctime0), 360)))"),
+  term("select", "a, proctime, a0, b, proctime0"),
+  term("joinType", "InnerJoin")
+),
+term("select", "a", "b")
+  )
+
+streamUtil.verifySql(sqlQuery, expected)
+  }
+
+
+  @Test
+  def testJoinTimeBoundary(): Unit = {
+verifyTimeBoundary(
+  "t1.proctime between t2.proctime - interval '1' hour " +
+"and t2.proctime + interval '1' hour",
+  360,
+  360,
+  "proctime")
+
+verifyTimeBoundary(
+  "t1.proctime > t2.proctime - interval '1' second and " +
+"t1.proctime < t2.proctime + interval '1' second",
+  999,
+  999,
+  "proctime")
+
+verifyTimeBoundary(
+  "t1.c >= t2.c - interval '1' second and " +
+"t1.c <= t2.c + interval '1' second",
+  1000,
+  1000,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t1.c >= t2.c and " +
+"t1.c <= t2.c + interval '1' second",
+  0,
+  1000,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t1.c >= t2.c + interval '1' second and " +
+"t1.c <= t2.c + interval '10' second",
+  0,
+  1,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t2.c - interval '1' second <= t1.c and " +
+"t2.c + interval '10' second >= t1.c",
+  1000,
+  1,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t1.c - interval '2' second >= t2.c + interval '1' second -" +
+"interval '10' second and " +
+"t1.c <= t2.c + interval '10' second",
+  7000,
+  1,
+  "rowtime")
+  }
+
+  def verifyTimeBoundary(
--- End diff --

I like this test, but it would also be good to check that all 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057478#comment-16057478
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123009399
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class that helps analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* check if the join case is a stream-to-stream join
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the combined row type of the left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
--- End diff --

The check is only approximate, i.e., the stream join operator might not be 
able to execute the query even if this check is passed.

For example, it only checks whether there is at least one time indicator in the
condition. However, we would need to check that there are exactly two
conjunctive terms that have time indicator attributes on both sides and define 
bounds to both sides. Basically the complete analysis that we later do in the 
join. I think we can do this analysis already in the rule and pass the result 
of the analysis to the join.
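
As a toy illustration of that stricter analysis (a self-contained sketch;
`Conjunct` is an invented stand-in for an analyzed CNF term, not the PR's
Calcite-based types):

{code}
// Toy model of one analyzed conjunct of the CNF join condition.
case class Conjunct(
    refsLeftTime: Boolean,   // references a time indicator of the left input
    refsRightTime: Boolean,  // references a time indicator of the right input
    isLowerBound: Boolean)   // bounds the join window from below (vs. from above)

object TimeConditionCheck {
  // Accept the join only if exactly two conjuncts relate the time indicators
  // of both inputs and together bound the window from both sides.
  def isExecutable(cnf: Seq[Conjunct]): Boolean = {
    val timeTerms = cnf.filter(c => c.refsLeftTime && c.refsRightTime)
    timeTerms.size == 2 &&
      timeTerms.exists(_.isLowerBound) &&
      timeTerms.exists(!_.isLowerBound)
  }
}
{code}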


> Support proctime inner equi-join between two streams in the SQL API
> ---
>
> Key: FLINK-6232
> URL: https://issues.apache.org/jira/browse/FLINK-6232
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: hongyuhong
>Assignee: hongyuhong
>
> The goal of this issue is to add support for inner equi-join on proc time 
> streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT o.proctime, o.productId, o.orderId, s.proctime AS shipTime 
> FROM Orders AS o 
> JOIN Shipments AS s 
> ON o.orderId = s.orderId 
> AND o.proctime BETWEEN s.proctime AND s.proctime + INTERVAL '1' HOUR;
> {code}
> The following restrictions should initially apply:
> * The join hint only supports inner joins
> * The ON clause must include an equi-join condition
> * The time condition {{o.proctime BETWEEN s.proctime AND s.proctime + 
> INTERVAL '1' HOUR}} may only use proctime, which is a system attribute; the 
> time condition only supports a bounded time range like {{o.proctime BETWEEN 
> s.proctime - INTERVAL '1' HOUR AND s.proctime + INTERVAL '1' HOUR}}, not 
> an unbounded one like {{o.proctime > s.proctime}}, and must include the 
> proctime attributes of both streams; {{o.proctime between proctime() and 
> proctime() + 1}} should also not be supported.
> This issue includes:
> * Design of the DataStream operator to deal with stream join
> * Translation from Calcite's RelNode 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057450#comment-16057450
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123004606
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class that helps analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* check if the join case is a stream-to-stream join
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the combined row type of the left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
--- End diff --

can be simplified to `c.getOperands.exists(isExistTumble(_))`
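
The two forms are equivalent for any collection and predicate, since
`exists(p)` already returns whether some element satisfies `p`; a quick check
with toy values (not the PR's types):

{code}
val operands = Seq(1, 2, 3)
def isExistTumble(n: Int): Boolean = n > 2
// mapping to Booleans first and testing `_ == true` is redundant
assert(operands.map(isExistTumble(_)).exists(_ == true) == operands.exists(isExistTumble(_)))
{code}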


> Support proctime inner equi-join between two streams in the SQL API
> ---
>
> Key: FLINK-6232
> URL: https://issues.apache.org/jira/browse/FLINK-6232
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: hongyuhong
>Assignee: hongyuhong
>
> The goal of this issue is to add support for inner equi-join on proc time 
> streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT o.proctime, o.productId, o.orderId, s.proctime AS shipTime 
> FROM Orders AS o 
> JOIN Shipments AS s 
> ON o.orderId = s.orderId 
> AND o.proctime BETWEEN s.proctime AND s.proctime + INTERVAL '1' HOUR;
> {code}
> The following restrictions should initially apply:
> * The join hint only supports inner joins
> * The ON clause must include an equi-join condition
> * The time condition {{o.proctime BETWEEN s.proctime AND s.proctime + 
> INTERVAL '1' HOUR}} may only use proctime, which is a system attribute; the 
> time condition only supports a bounded time range like {{o.proctime BETWEEN 
> s.proctime - INTERVAL '1' HOUR AND s.proctime + INTERVAL '1' HOUR}}, not 
> an unbounded one like {{o.proctime > s.proctime}}, and must include the 
> proctime attributes of both streams; {{o.proctime between proctime() and 
> proctime() + 1}} should also not be 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057475#comment-16057475
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123236585
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/harness/JoinHarnessTest.scala
 ---
@@ -0,0 +1,200 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.harness
+
+import java.util.concurrent.ConcurrentLinkedQueue
+import java.lang.{Integer => JInt}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.java.typeutils.RowTypeInfo
+import org.apache.flink.streaming.api.operators.co.KeyedCoProcessOperator
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord
+import org.apache.flink.streaming.util.{KeyedTwoInputStreamOperatorTestHarness, TwoInputStreamOperatorTestHarness}
+import org.apache.flink.table.codegen.GeneratedFunction
+import org.apache.flink.table.runtime.harness.HarnessTestBase.{RowResultSortComparator, TupleRowKeySelector}
+import org.apache.flink.table.runtime.join.ProcTimeInnerJoin
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.junit.Test
+
+
+class JoinHarnessTest extends HarnessTestBase {
+
+  private val rT = new RowTypeInfo(Array[TypeInformation[_]](
+INT_TYPE_INFO,
+STRING_TYPE_INFO),
+Array("a", "b"))
+
+
+  val funcCode: String =
+"""
+  |public class TestJoinFunction
+  |  extends org.apache.flink.api.common.functions.RichFlatJoinFunction {
+  |  transient org.apache.flink.types.Row out =
+  |new org.apache.flink.types.Row(4);
+  |  public TestJoinFunction() throws Exception {}
+  |
+  |  @Override
+  |  public void open(org.apache.flink.configuration.Configuration parameters)
+  |  throws Exception {}
+  |
+  |  @Override
+  |  public void join(Object _in1, Object _in2, org.apache.flink.util.Collector c)
+  |   throws Exception {
+  |   org.apache.flink.types.Row in1 = (org.apache.flink.types.Row) _in1;
+  |   org.apache.flink.types.Row in2 = (org.apache.flink.types.Row) _in2;
+  |
+  |   out.setField(0, in1.getField(0));
+  |   out.setField(1, in1.getField(1));
+  |   out.setField(2, in2.getField(0));
+  |   out.setField(3, in2.getField(1));
+  |
+  |   c.collect(out);
+  |
+  |  }
+  |
+  |  @Override
+  |  public void close() throws Exception {}
+  |}
+""".stripMargin
+
+  @Test
+  def testProcTimeJoin() {
+
+val joinProcessFunc = new ProcTimeInnerJoin(10, 20, rT, rT, "TestJoinFunction", funcCode)
+
+val operator: KeyedCoProcessOperator[Integer, CRow, CRow, CRow] =
+  new KeyedCoProcessOperator[Integer, CRow, CRow, CRow](joinProcessFunc)
+val testHarness: TwoInputStreamOperatorTestHarness[CRow, CRow, CRow] =
+  new KeyedTwoInputStreamOperatorTestHarness[Integer, CRow, CRow, CRow](
+   operator,
+   new TupleRowKeySelector[Integer](0),
+   new TupleRowKeySelector[Integer](0),
+   BasicTypeInfo.INT_TYPE_INFO,
+   1, 1, 0)
+
+testHarness.open()
+
+testHarness.setProcessingTime(1)
+testHarness.processElement1(new StreamRecord(
+  CRow(Row.of(1: JInt, "aaa"), true), 1))
+testHarness.setProcessingTime(2)
+testHarness.processElement1(new StreamRecord(
+  CRow(Row.of(2: JInt, "bbb"), true), 2))
+
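
The quoted test is truncated here. For orientation, a minimal sketch of how such a harness test typically concludes; `getOutput` is the standard accessor of Flink's operator test harness, while the exact `verify`/`RowResultSortComparator` signatures are assumptions based on the imports above:

{code}
// Sketch only: collect the operator's emitted records and compare them
// against the expected queue (helper names follow the quoted imports).
val expectedOutput = new ConcurrentLinkedQueue[Object]()
expectedOutput.add(new StreamRecord(
  CRow(Row.of(1: JInt, "aaa", 1: JInt, "Hi1"), true), 3))

val result = testHarness.getOutput
verify(expectedOutput, result, new RowResultSortComparator(6))
testHarness.close()
{code}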

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057483#comment-16057483
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123035300
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class to help analyze and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Check whether the join is a stream-stream join.
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the row type of the connected left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to extract the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition   the join condition, including the time condition
+* @param  leftLogicalFieldCnt number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt number of physical fields of the left stream
+* @param  inputType   the row type of the connected left and right streams
+* @param  rexBuilder   utility to build RexNodes
+* @param  config  the table environment configuration
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+   
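
The diff is truncated here. As background to the `RexUtil.toCnf` step above, a tiny self-contained illustration of why CNF makes the split easy (a generic Boolean-algebra fact, not code from the PR):

{code}
object CnfExample extends App {
  // (a && b) || c  ==CNF==>  (a || c) && (b || c)
  // After CNF the top level is a conjunction, so each conjunct can be
  // classified independently as a time bound or a remaining predicate.
  def original(a: Boolean, b: Boolean, c: Boolean) = (a && b) || c
  def cnf(a: Boolean, b: Boolean, c: Boolean) = (a || c) && (b || c)

  val bools = Seq(true, false)
  val allCases = for (a <- bools; b <- bools; c <- bools) yield (a, b, c)
  assert(allCases.forall { case (a, b, c) => original(a, b, c) == cnf(a, b, c) })
}
{code}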

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057481#comment-16057481
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123233427
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/stream/sql/JoinTest.scala
 ---
@@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.api.scala.stream.sql
+
+import org.apache.calcite.rel.logical.LogicalJoin
+import org.apache.flink.api.scala._
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.runtime.join.JoinUtil
+import org.apache.flink.table.utils.TableTestUtil._
+import org.apache.flink.table.utils.{StreamTableTestUtil, TableTestBase}
+import org.junit.Assert._
+import org.junit.Test
+
+class JoinTest extends TableTestBase {
+  private val streamUtil: StreamTableTestUtil = streamTestUtil()
+  streamUtil.addTable[(Int, String, Long)]("MyTable", 'a, 'b, 'c.rowtime, 
'proctime.proctime)
+  streamUtil.addTable[(Int, String, Long)]("MyTable2", 'a, 'b, 'c.rowtime, 
'proctime.proctime)
+
+  @Test
+  def testProcessingTimeInnerJoin() = {
+
+val sqlQuery = "SELECT t1.a, t2.b " +
+  "FROM MyTable as t1 join MyTable2 as t2 on t1.a = t2.a and " +
+  "t1.proctime between t2.proctime - interval '1' hour and t2.proctime 
+ interval '1' hour"
+val expected =
+  unaryNode(
+"DataStreamCalc",
+binaryNode(
+  "DataStreamRowStreamJoin",
+  unaryNode(
+"DataStreamCalc",
+streamTableNode(0),
+term("select", "a", "proctime")
+  ),
+  unaryNode(
+"DataStreamCalc",
+streamTableNode(1),
+term("select", "a", "b", "proctime")
+  ),
+  term("condition",
+"AND(=(a, a0), >=(TIME_MATERIALIZATION(proctime), " +
+  "-(TIME_MATERIALIZATION(proctime0), 360)), " +
+  "<=(TIME_MATERIALIZATION(proctime), " +
+  "DATETIME_PLUS(TIME_MATERIALIZATION(proctime0), 360)))"),
+  term("select", "a, proctime, a0, b, proctime0"),
+  term("joinType", "InnerJoin")
+),
+term("select", "a", "b")
+  )
+
+streamUtil.verifySql(sqlQuery, expected)
+  }
+
+
+  @Test
+  def testJoinTimeBoundary(): Unit = {
+verifyTimeBoundary(
+  "t1.proctime between t2.proctime - interval '1' hour " +
+"and t2.proctime + interval '1' hour",
+  3600000,
+  3600000,
+  "proctime")
+
+verifyTimeBoundary(
+  "t1.proctime > t2.proctime - interval '1' second and " +
+"t1.proctime < t2.proctime + interval '1' second",
+  999,
+  999,
+  "proctime")
+
+verifyTimeBoundary(
+  "t1.c >= t2.c - interval '1' second and " +
+"t1.c <= t2.c + interval '1' second",
+  1000,
+  1000,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t1.c >= t2.c and " +
+"t1.c <= t2.c + interval '1' second",
+  0,
+  1000,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t1.c >= t2.c + interval '1' second and " +
+"t1.c <= t2.c + interval '10' second",
+  -1000,
+  10000,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t2.c - interval '1' second <= t1.c and " +
+"t2.c + interval '10' second >= t1.c",
+  1000,
+  10000,
+  "rowtime")
+
+verifyTimeBoundary(
+  "t1.c - interval '2' second >= t2.c + interval '1' second -" +
+"interval '10' second and " +
+
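
The diff is truncated here. The boundary values asserted above are plain interval-to-millisecond arithmetic; a small sketch of the conversions (the 999 values come from strict bounds, which tighten the boundary by one millisecond):

{code}
object TimeBoundaryArithmetic extends App {
  // INTERVAL '1' HOUR and INTERVAL '1' SECOND in milliseconds:
  val hourMs = 60L * 60L * 1000L   // 3600000, as asserted for the hour-based bounds
  val secondMs = 1000L

  // A strict bound like t1.proctime < t2.proctime + INTERVAL '1' SECOND
  // excludes the endpoint, i.e. it is equivalent to <= ... + 999 ms:
  val strictSecond = secondMs - 1  // 999, as asserted for the strict-bound cases

  println(s"hour: $hourMs ms, strict second: $strictSecond ms")
}
{code}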

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123229245
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction to support stream-stream join; currently only inner join is supported.
+  *
+  * @param leftStreamWindowSize  the left stream window size
+  * @param rightStreamWindowSize the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName  the function name of the other (non-equi) join condition
+  * @param genJoinFuncCode  the function code of the other (non-equi) join condition
+  *
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]] {
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], rowListTypeInfo1)
+row1MapState = getRuntimeContext.getMapState(mapStateDescriptor1)
+
+val rowListTypeInfo2: TypeInformation[JList[Row]] = new ListTypeInfo[Row](element2Type)
+val mapStateDescriptor2: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, 
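
The diff is truncated in the middle of the state setup. For orientation, a minimal sketch of the expiration bookkeeping that the fields above suggest (a MapState keyed by arrival time plus a temporary list of expired keys); the traversal below is an illustrative assumption, not necessarily the PR's exact code:

{code}
import java.util.{List => JList}
import org.apache.flink.api.common.state.MapState
import org.apache.flink.types.Row

// Remove all buffered rows whose arrival time is older than
// curTime - windowSize; `rowMapState` and `listToRemove` mirror the
// fields of the quoted ProcTimeInnerJoin.
def expireOldRows(
    rowMapState: MapState[Long, JList[Row]],
    listToRemove: JList[Long],
    curTime: Long,
    windowSize: Long): Unit = {
  val expiredTime = curTime - windowSize
  listToRemove.clear()
  val keys = rowMapState.keys().iterator()
  while (keys.hasNext) {
    val t = keys.next()
    if (t <= expiredTime) {
      listToRemove.add(t)
    }
  }
  var i = 0
  while (i < listToRemove.size()) {
    rowMapState.remove(listToRemove.get(i))
    i += 1
  }
}
{code}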

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123227488
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction to support stream-stream join; currently only inner join is supported.
+  *
+  * @param leftStreamWindowSize  the left stream window size
+  * @param rightStreamWindowSize the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName  the function name of the other (non-equi) join condition
+  * @param genJoinFuncCode  the function code of the other (non-equi) join condition
+  *
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]] {
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], rowListTypeInfo1)
+row1MapState = getRuntimeContext.getMapState(mapStateDescriptor1)
+
+val rowListTypeInfo2: TypeInformation[JList[Row]] = new ListTypeInfo[Row](element2Type)
+val mapStateDescriptor2: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123238837
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/harness/JoinHarnessTest.scala
 ---
@@ -0,0 +1,200 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.harness
+
+import java.util.concurrent.ConcurrentLinkedQueue
+import java.lang.{Integer => JInt}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.java.typeutils.RowTypeInfo
+import org.apache.flink.streaming.api.operators.co.KeyedCoProcessOperator
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord
+import org.apache.flink.streaming.util.{KeyedTwoInputStreamOperatorTestHarness, TwoInputStreamOperatorTestHarness}
+import org.apache.flink.table.codegen.GeneratedFunction
+import org.apache.flink.table.runtime.harness.HarnessTestBase.{RowResultSortComparator, TupleRowKeySelector}
+import org.apache.flink.table.runtime.join.ProcTimeInnerJoin
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.junit.Test
+
+
+class JoinHarnessTest extends HarnessTestBase {
+
+  private val rT = new RowTypeInfo(Array[TypeInformation[_]](
+INT_TYPE_INFO,
+STRING_TYPE_INFO),
+Array("a", "b"))
+
+
+  val funcCode: String =
+"""
+  |public class TestJoinFunction
+  |  extends org.apache.flink.api.common.functions.RichFlatJoinFunction {
+  |  transient org.apache.flink.types.Row out =
+  |new org.apache.flink.types.Row(4);
+  |  public TestJoinFunction() throws Exception {}
+  |
+  |  @Override
+  |  public void open(org.apache.flink.configuration.Configuration parameters)
+  |  throws Exception {}
+  |
+  |  @Override
+  |  public void join(Object _in1, Object _in2, org.apache.flink.util.Collector c)
+  |   throws Exception {
+  |   org.apache.flink.types.Row in1 = (org.apache.flink.types.Row) _in1;
+  |   org.apache.flink.types.Row in2 = (org.apache.flink.types.Row) _in2;
+  |
+  |   out.setField(0, in1.getField(0));
+  |   out.setField(1, in1.getField(1));
+  |   out.setField(2, in2.getField(0));
+  |   out.setField(3, in2.getField(1));
+  |
+  |   c.collect(out);
+  |
+  |  }
+  |
+  |  @Override
+  |  public void close() throws Exception {}
+  |}
+""".stripMargin
+
+  @Test
+  def testProcTimeJoin() {
+
+val joinProcessFunc = new ProcTimeInnerJoin(10, 20, rT, rT, "TestJoinFunction", funcCode)
+
+val operator: KeyedCoProcessOperator[Integer, CRow, CRow, CRow] =
+  new KeyedCoProcessOperator[Integer, CRow, CRow, CRow](joinProcessFunc)
+val testHarness: TwoInputStreamOperatorTestHarness[CRow, CRow, CRow] =
+  new KeyedTwoInputStreamOperatorTestHarness[Integer, CRow, CRow, CRow](
+   operator,
+   new TupleRowKeySelector[Integer](0),
+   new TupleRowKeySelector[Integer](0),
+   BasicTypeInfo.INT_TYPE_INFO,
+   1, 1, 0)
+
+testHarness.open()
+
+testHarness.setProcessingTime(1)
+testHarness.processElement1(new StreamRecord(
+  CRow(Row.of(1: JInt, "aaa"), true), 1))
+testHarness.setProcessingTime(2)
+testHarness.processElement1(new StreamRecord(
+  CRow(Row.of(2: JInt, "bbb"), true), 2))
+testHarness.setProcessingTime(3)
+testHarness.processElement1(new StreamRecord(
+  CRow(Row.of(1: JInt, "aaa2"), true), 3))
+
+testHarness.processElement2(new StreamRecord(
+  CRow(Row.of(1: JInt, "Hi1"), true), 3))
+

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057480#comment-16057480
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123231931
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/stream/sql/JoinITCase.scala
 ---
@@ -0,0 +1,204 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.api.scala.stream.sql
+
+import org.apache.flink.api.scala._
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.api.scala.stream.utils.{StreamITCase, 
StreamingWithStateTestBase}
+import org.apache.flink.table.api.{TableEnvironment, TableException}
+import org.apache.flink.types.Row
+import org.junit.Assert._
+import org.junit._
+
+import scala.collection.mutable
+
+class JoinITCase extends StreamingWithStateTestBase {
+
+  val data = List(
+(1L, 1, "Hello"),
+(2L, 2, "Hello"),
+(3L, 3, "Hello"),
+(4L, 4, "Hello"),
+(5L, 5, "Hello"),
+(6L, 6, "Hello"),
+(7L, 7, "Hello World"),
+(8L, 8, "Hello World"),
+(20L, 20, "Hello World"))
+
+  /**
+* Both streams must have a time boundary.
+*/
+  @Test(expected = classOf[TableException])
+  def testJoinException0(): Unit = {
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.setStateBackend(getStateBackend)
+val tEnv = TableEnvironment.getTableEnvironment(env)
+StreamITCase.testResults = mutable.MutableList()
+
+val t1 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
+val t2 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
+
+tEnv.registerTable("T1", t1)
+tEnv.registerTable("T2", t2)
+
+val sqlQuery = "SELECT t2.a, t2.c, t1.c from T1 as t1 join T2 as t2 on 
t1.a = t2.a"
+
+val result = tEnv.sql(sqlQuery).toDataStream[Row]
+result.addSink(new StreamITCase.StringSink)
+env.execute()
+  }
+
+  /**
+* Both streams must have a time boundary.
+*/
+  @Test(expected = classOf[TableException])
+  def testJoinException1(): Unit = {
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.setStateBackend(getStateBackend)
+val tEnv = TableEnvironment.getTableEnvironment(env)
+StreamITCase.testResults = mutable.MutableList()
+
+val t1 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
+val t2 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
+
+tEnv.registerTable("T1", t1)
+tEnv.registerTable("T2", t2)
+
+val sqlQuery = "SELECT t2.a, t2.c, t1.c from T1 as t1 join T2 as t2 on 
t1.a = t2.a " +
+  "and t1.proctime > t2.proctime - interval '5' second"
+
+val result = tEnv.sql(sqlQuery).toDataStream[Row]
+result.addSink(new StreamITCase.StringSink)
+env.execute()
+  }
+
+  /**
+* Both streams must use the same time indicator.
+*/
+  @Test(expected = classOf[TableException])
+  def testJoinException2(): Unit = {
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.setStateBackend(getStateBackend)
+val tEnv = TableEnvironment.getTableEnvironment(env)
+StreamITCase.testResults = mutable.MutableList()
+
+val t1 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
+val t2 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'rowtime.rowtime)
+
+tEnv.registerTable("T1", t1)
+tEnv.registerTable("T2", t2)
+
+val sqlQuery = "SELECT t2.a, t2.c, t1.c from T1 as t1 join T2 as t2 on 
t1.a = t2.a " +
   
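
The negative tests above are truncated here. For contrast, a sketch of a query shape that satisfies the stated restrictions, i.e. bounded on both sides and using the same time indicator (table and field names follow the quoted tests):

{code}
// A bounded proctime join that the translator should accept:
val sqlQuery =
  "SELECT t2.a, t2.c, t1.c FROM T1 AS t1 JOIN T2 AS t2 ON t1.a = t2.a " +
    "AND t1.proctime BETWEEN t2.proctime - INTERVAL '5' SECOND " +
    "AND t2.proctime + INTERVAL '5' SECOND"
{code}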

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057474#comment-16057474
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123229245
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction to support stream-stream join; currently only inner join is supported.
+  *
+  * @param leftStreamWindowSize  the left stream window size
+  * @param rightStreamWindowSize the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName  the function name of the other (non-equi) join condition
+  * @param genJoinFuncCode  the function code of the other (non-equi) join condition
+  *
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]] {
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], rowListTypeInfo1)
+row1MapState = 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123205673
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class to help analyze and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Check whether the join is a stream-stream join.
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the row type of the connected left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to extract the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition   the join condition, including the time condition
+* @param  leftLogicalFieldCnt number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt number of physical fields of the left stream
+* @param  inputType   the row type of the connected left and right streams
+* @param  rexBuilder   utility to build RexNodes
+* @param  config  the table environment configuration
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+  )
+
+if (timeTerms.size != 2) {
+  throw new TableException("There only can and must have 2 time 
conditions.")
+}
+
+// extract time offset from the time indicator condition
+val 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123238701
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/harness/JoinHarnessTest.scala
 ---
@@ -0,0 +1,200 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.harness
+
+import java.util.concurrent.ConcurrentLinkedQueue
+import java.lang.{Integer => JInt}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.java.typeutils.RowTypeInfo
+import org.apache.flink.streaming.api.operators.co.KeyedCoProcessOperator
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord
+import org.apache.flink.streaming.util.{KeyedTwoInputStreamOperatorTestHarness, TwoInputStreamOperatorTestHarness}
+import org.apache.flink.table.codegen.GeneratedFunction
+import org.apache.flink.table.runtime.harness.HarnessTestBase.{RowResultSortComparator, TupleRowKeySelector}
+import org.apache.flink.table.runtime.join.ProcTimeInnerJoin
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.junit.Test
+
+
+class JoinHarnessTest extends HarnessTestBase {
+
+  private val rT = new RowTypeInfo(Array[TypeInformation[_]](
+INT_TYPE_INFO,
+STRING_TYPE_INFO),
+Array("a", "b"))
+
+
+  val funcCode: String =
+"""
+  |public class TestJoinFunction
+  |  extends org.apache.flink.api.common.functions.RichFlatJoinFunction {
+  |  transient org.apache.flink.types.Row out =
+  |new org.apache.flink.types.Row(4);
+  |  public TestJoinFunction() throws Exception {}
+  |
+  |  @Override
+  |  public void open(org.apache.flink.configuration.Configuration parameters)
+  |  throws Exception {}
+  |
+  |  @Override
+  |  public void join(Object _in1, Object _in2, org.apache.flink.util.Collector c)
+  |   throws Exception {
+  |   org.apache.flink.types.Row in1 = (org.apache.flink.types.Row) _in1;
+  |   org.apache.flink.types.Row in2 = (org.apache.flink.types.Row) _in2;
+  |
+  |   out.setField(0, in1.getField(0));
+  |   out.setField(1, in1.getField(1));
+  |   out.setField(2, in2.getField(0));
+  |   out.setField(3, in2.getField(1));
+  |
+  |   c.collect(out);
+  |
+  |  }
+  |
+  |  @Override
+  |  public void close() throws Exception {}
+  |}
+""".stripMargin
+
+  @Test
+  def testProcTimeJoin() {
--- End diff --

Please add comments for the scenarios that this test covers.
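
For example, the comments could spell out the timeline the test exercises; a sketch based on the calls quoted elsewhere in this thread (the window sizes 10/20 come from the `ProcTimeInnerJoin(10, 20, ...)` constructor call, and the expected join output is an assumption):

{code}
// Scenario covered by testProcTimeJoin:
//   t=1: left (1, "aaa")  arrives -> buffered in left state (10 ms window)
//   t=2: left (2, "bbb")  arrives
//   t=3: left (1, "aaa2") arrives
//   t=3: right(1, "Hi1")  arrives -> expected to join the key-1 left rows
//        that are still inside the 10 ms left-side window
{code}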


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123229427
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction to support stream-stream join; currently only inner join is supported.
+  *
+  * @param leftStreamWindowSize  the left stream window size
+  * @param rightStreamWindowSize the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName  the function name of the other (non-equi) join condition
+  * @param genJoinFuncCode  the function code of the other (non-equi) join condition
+  *
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]] {
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], rowListTypeInfo1)
+row1MapState = getRuntimeContext.getMapState(mapStateDescriptor1)
+
+val rowListTypeInfo2: TypeInformation[JList[Row]] = new ListTypeInfo[Row](element2Type)
+val mapStateDescriptor2: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123222750
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamRowStreamJoin.scala
 ---
@@ -0,0 +1,186 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.plan.nodes.datastream
+
+import org.apache.calcite.plan._
+import org.apache.calcite.rel.core.{JoinInfo, JoinRelType}
+import org.apache.calcite.rel.{BiRel, RelNode, RelWriter}
+import org.apache.calcite.rex.RexNode
+import org.apache.flink.api.java.functions.NullByteKeySelector
+import org.apache.flink.streaming.api.datastream.DataStream
+import org.apache.flink.table.api.{StreamQueryConfig, 
StreamTableEnvironment, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.plan.nodes.CommonJoin
+import org.apache.flink.table.plan.schema.RowSchema
+import org.apache.flink.table.runtime.join.{JoinUtil, ProcTimeInnerJoin}
+import org.apache.flink.table.runtime.types.{CRow, CRowTypeInfo}
+
+/**
+  * Flink RelNode that corresponds to a join operator and its related operations.
+  */
+class DataStreamRowStreamJoin(
--- End diff --

Rename to `DataStreamWindowJoin`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057472#comment-16057472
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123238837
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/harness/JoinHarnessTest.scala
 ---
@@ -0,0 +1,200 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.harness
+
+import java.util.concurrent.ConcurrentLinkedQueue
+import java.lang.{Integer => JInt}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.java.typeutils.RowTypeInfo
+import org.apache.flink.streaming.api.operators.co.KeyedCoProcessOperator
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord
+import org.apache.flink.streaming.util.{KeyedTwoInputStreamOperatorTestHarness, TwoInputStreamOperatorTestHarness}
+import org.apache.flink.table.codegen.GeneratedFunction
+import org.apache.flink.table.runtime.harness.HarnessTestBase.{RowResultSortComparator, TupleRowKeySelector}
+import org.apache.flink.table.runtime.join.ProcTimeInnerJoin
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.junit.Test
+
+
+class JoinHarnessTest extends HarnessTestBase {
+
+  private val rT = new RowTypeInfo(Array[TypeInformation[_]](
+INT_TYPE_INFO,
+STRING_TYPE_INFO),
+Array("a", "b"))
+
+
+  val funcCode: String =
+"""
+  |public class TestJoinFunction
+  |  extends org.apache.flink.api.common.functions.RichFlatJoinFunction {
+  |  transient org.apache.flink.types.Row out =
+  |new org.apache.flink.types.Row(4);
+  |  public TestJoinFunction() throws Exception {}
+  |
+  |  @Override
+  |  public void open(org.apache.flink.configuration.Configuration parameters)
+  |  throws Exception {}
+  |
+  |  @Override
+  |  public void join(Object _in1, Object _in2, org.apache.flink.util.Collector c)
+  |   throws Exception {
+  |   org.apache.flink.types.Row in1 = (org.apache.flink.types.Row) _in1;
+  |   org.apache.flink.types.Row in2 = (org.apache.flink.types.Row) _in2;
+  |
+  |   out.setField(0, in1.getField(0));
+  |   out.setField(1, in1.getField(1));
+  |   out.setField(2, in2.getField(0));
+  |   out.setField(3, in2.getField(1));
+  |
+  |   c.collect(out);
+  |
+  |  }
+  |
+  |  @Override
+  |  public void close() throws Exception {}
+  |}
+""".stripMargin
+
+  @Test
+  def testProcTimeJoin() {
+
+val joinProcessFunc = new ProcTimeInnerJoin(10, 20, rT, rT, "TestJoinFunction", funcCode)
+
+val operator: KeyedCoProcessOperator[Integer, CRow, CRow, CRow] =
+  new KeyedCoProcessOperator[Integer, CRow, CRow, CRow](joinProcessFunc)
+val testHarness: TwoInputStreamOperatorTestHarness[CRow, CRow, CRow] =
+  new KeyedTwoInputStreamOperatorTestHarness[Integer, CRow, CRow, CRow](
+   operator,
+   new TupleRowKeySelector[Integer](0),
+   new TupleRowKeySelector[Integer](0),
+   BasicTypeInfo.INT_TYPE_INFO,
+   1, 1, 0)
+
+testHarness.open()
+
+testHarness.setProcessingTime(1)
+testHarness.processElement1(new StreamRecord(
+  CRow(Row.of(1: JInt, "aaa"), true), 1))
+testHarness.setProcessingTime(2)
+testHarness.processElement1(new StreamRecord(
+  CRow(Row.of(2: JInt, "bbb"), true), 2))
+

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057482#comment-16057482
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123226695
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction that joins two streams; currently only inner joins are supported.
+  *
+  * @param leftStreamWindowSize  the left stream window size
+  * @param rightStreamWindowSize  the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName  the name of the generated function for the non-equi join condition
+  * @param genJoinFuncCode  the code of the generated function for the non-equi join condition
+  *
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]] {
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], rowListTypeInfo1)
+row1MapState = 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057470#comment-16057470
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123229427
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction that joins two streams; currently only inner joins are supported.
+  *
+  * @param leftStreamWindowSize  the left stream window size
+  * @param rightStreamWindowSize  the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName  the name of the generated function for the non-equi join condition
+  * @param genJoinFuncCode  the code of the generated function for the non-equi join condition
+  *
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]] {
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], rowListTypeInfo1)
+row1MapState = 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123037106
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class that helps analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Checks whether the join is a stream-to-stream join.
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the row type of the connected left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyzes the time condition to get the time boundary for each stream,
+* determines the time type, and returns the remaining condition.
+*
+* @param  condition   the join condition, including the time condition
+* @param  leftLogicalFieldCnt  the number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt  the number of physical fields of the left stream
+* @param  inputType   the row type of the connected left and right streams
+* @param  rexBuilder  the builder used to create RexNodes
+* @param  config  the table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+  )
+
+if (timeTerms.size != 2) {
+  throw new TableException("There only can and must have 2 time 
conditions.")
+}
+
+// extract time offset from the time indicator condition
+val 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057465#comment-16057465
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123231612
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/stream/sql/JoinITCase.scala
 ---
@@ -0,0 +1,204 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.api.scala.stream.sql
+
+import org.apache.flink.api.scala._
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.api.scala.stream.utils.{StreamITCase, StreamingWithStateTestBase}
+import org.apache.flink.table.api.{TableEnvironment, TableException}
+import org.apache.flink.types.Row
+import org.junit.Assert._
+import org.junit._
+
+import scala.collection.mutable
+
+class JoinITCase extends StreamingWithStateTestBase {
+
+  val data = List(
+(1L, 1, "Hello"),
+(2L, 2, "Hello"),
+(3L, 3, "Hello"),
+(4L, 4, "Hello"),
+(5L, 5, "Hello"),
+(6L, 6, "Hello"),
+(7L, 7, "Hello World"),
+(8L, 8, "Hello World"),
+(20L, 20, "Hello World"))
+
+  /**
+* Both streams must have time boundaries.
+*/
+  @Test(expected = classOf[TableException])
+  def testJoinException0(): Unit = {
--- End diff --

We should not need an ITCase for these checks. The problem is that the
validation is done during translation. If we moved the correctness checks
into the optimizer, we would not need to translate the program.
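
For illustration, a minimal sketch of a translation-only check (hedged: this
assumes the TableException is indeed thrown while toDataStream translates the
plan, as noted above, so no sink and no env.execute() are needed; the test
name is made up):

@Test(expected = classOf[TableException])
def testMissingWindowBounds(): Unit = {
  val env = StreamExecutionEnvironment.getExecutionEnvironment
  val tEnv = TableEnvironment.getTableEnvironment(env)

  val t1 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
  val t2 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
  tEnv.registerTable("T1", t1)
  tEnv.registerTable("T2", t2)

  // No time bounds in the join condition: translation should already fail here.
  tEnv.sql("SELECT t1.c FROM T1 AS t1 JOIN T2 AS t2 ON t1.a = t2.a").toDataStream[Row]
}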


> Support proctime inner equi-join between two streams in the SQL API
> ---
>
> Key: FLINK-6232
> URL: https://issues.apache.org/jira/browse/FLINK-6232
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: hongyuhong
>Assignee: hongyuhong
>
> The goal of this issue is to add support for inner equi-join on proc time 
> streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT o.proctime, o.productId, o.orderId, s.proctime AS shipTime 
> FROM Orders AS o 
> JOIN Shipments AS s 
> ON o.orderId = s.orderId 
> AND o.proctime BETWEEN s.proctime AND s.proctime + INTERVAL '1' HOUR;
> {code}
> The following restrictions should initially apply:
> * The join currently only supports inner joins
> * The ON clause should include an equi-join condition
> * The time condition {{o.proctime BETWEEN s.proctime AND s.proctime + 
> INTERVAL '1' HOUR}} can only use proctime, which is a system attribute; the 
> time condition only supports a bounded time range like {{o.proctime BETWEEN 
> s.proctime - INTERVAL '1' HOUR AND s.proctime + INTERVAL '1' HOUR}}, not an 
> unbounded one like {{o.proctime > s.proctime}}, and it should include both 
> streams' proctime attributes; {{o.proctime between proctime() and 
> proctime() + 1}} should also not be supported.
> This issue includes:
> * Design of the DataStream operator to deal with stream join
> * Translation from Calcite's RelNode representation (LogicalJoin). 
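
To make the bounded-range restriction concrete, a hedged pair of examples
(reusing the Orders/Shipments schema from the description above; the variable
names are illustrative only):

// Accepted: bounded range over both streams' proctime attributes.
val bounded =
  "SELECT o.orderId FROM Orders AS o JOIN Shipments AS s " +
  "ON o.orderId = s.orderId " +
  "AND o.proctime BETWEEN s.proctime - INTERVAL '1' HOUR AND s.proctime + INTERVAL '1' HOUR"

// Rejected: unbounded comparison, no finite window can be derived from it.
val unbounded =
  "SELECT o.orderId FROM Orders AS o JOIN Shipments AS s " +
  "ON o.orderId = s.orderId AND o.proctime > s.proctime"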



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123025909
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class that helps analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Checks whether the join is a stream-to-stream join.
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the row type of the connected left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyzes the time condition to get the time boundary for each stream,
+* determines the time type, and returns the remaining condition.
+*
+* @param  condition   the join condition, including the time condition
+* @param  leftLogicalFieldCnt  the number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt  the number of physical fields of the left stream
+* @param  inputType   the row type of the connected left and right streams
+* @param  rexBuilder  the builder used to create RexNodes
+* @param  config  the table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+  )
+
+if (timeTerms.size != 2) {
+  throw new TableException("There only can and must have 2 time 
conditions.")
+}
+
+// extract time offset from the time indicator condition
+val 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057467#comment-16057467
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123020714
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class that helps analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Checks whether the join is a stream-to-stream join.
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the row type of the connected left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyzes the time condition to get the time boundary for each stream,
+* determines the time type, and returns the remaining condition.
+*
+* @param  condition   the join condition, including the time condition
+* @param  leftLogicalFieldCnt  the number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt  the number of physical fields of the left stream
+* @param  inputType   the row type of the connected left and right streams
+* @param  rexBuilder  the builder used to create RexNodes
+* @param  config  the table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+   

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123030370
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class that helps analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Checks whether the join is a stream-to-stream join.
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the row type of the connected left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyzes the time condition to get the time boundary for each stream,
+* determines the time type, and returns the remaining condition.
+*
+* @param  condition   the join condition, including the time condition
+* @param  leftLogicalFieldCnt  the number of logical fields of the left stream
+* @param  leftPhysicalFieldCnt  the number of physical fields of the left stream
+* @param  inputType   the row type of the connected left and right streams
+* @param  rexBuilder  the builder used to create RexNodes
+* @param  config  the table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+  )
+
+if (timeTerms.size != 2) {
+  throw new TableException("There only can and must have 2 time 
conditions.")
+}
+
+// extract time offset from the time indicator condition
+val 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123209008
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction that joins two streams; currently only inner joins are supported.
+  *
+  * @param leftStreamWindowSize  the left stream window size
+  * @param rightStreamWindowSize  the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName  the name of the generated function for the non-equi join condition
+  * @param genJoinFuncCode  the code of the generated function for the non-equi join condition
+  *
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]] {
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
--- End diff --

add a new line before the method


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123231612
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/stream/sql/JoinITCase.scala
 ---
@@ -0,0 +1,204 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.api.scala.stream.sql
+
+import org.apache.flink.api.scala._
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.api.scala.stream.utils.{StreamITCase, StreamingWithStateTestBase}
+import org.apache.flink.table.api.{TableEnvironment, TableException}
+import org.apache.flink.types.Row
+import org.junit.Assert._
+import org.junit._
+
+import scala.collection.mutable
+
+class JoinITCase extends StreamingWithStateTestBase {
+
+  val data = List(
+(1L, 1, "Hello"),
+(2L, 2, "Hello"),
+(3L, 3, "Hello"),
+(4L, 4, "Hello"),
+(5L, 5, "Hello"),
+(6L, 6, "Hello"),
+(7L, 7, "Hello World"),
+(8L, 8, "Hello World"),
+(20L, 20, "Hello World"))
+
+  /**
+* Both streams must have time boundaries.
+*/
+  @Test(expected = classOf[TableException])
+  def testJoinException0(): Unit = {
--- End diff --

We should not need an ITCase for these checks. The problem is that the
validation is done during translation. If we moved the correctness checks
into the optimizer, we would not need to translate the program.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123231931
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/stream/sql/JoinITCase.scala
 ---
@@ -0,0 +1,204 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.api.scala.stream.sql
+
+import org.apache.flink.api.scala._
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.api.scala.stream.utils.{StreamITCase, StreamingWithStateTestBase}
+import org.apache.flink.table.api.{TableEnvironment, TableException}
+import org.apache.flink.types.Row
+import org.junit.Assert._
+import org.junit._
+
+import scala.collection.mutable
+
+class JoinITCase extends StreamingWithStateTestBase {
+
+  val data = List(
+(1L, 1, "Hello"),
+(2L, 2, "Hello"),
+(3L, 3, "Hello"),
+(4L, 4, "Hello"),
+(5L, 5, "Hello"),
+(6L, 6, "Hello"),
+(7L, 7, "Hello World"),
+(8L, 8, "Hello World"),
+(20L, 20, "Hello World"))
+
+  /**
+* Both streams must have time boundaries.
+*/
+  @Test(expected = classOf[TableException])
+  def testJoinException0(): Unit = {
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.setStateBackend(getStateBackend)
+val tEnv = TableEnvironment.getTableEnvironment(env)
+StreamITCase.testResults = mutable.MutableList()
+
+val t1 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
+val t2 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
+
+tEnv.registerTable("T1", t1)
+tEnv.registerTable("T2", t2)
+
+val sqlQuery = "SELECT t2.a, t2.c, t1.c from T1 as t1 join T2 as t2 on t1.a = t2.a"
+
+val result = tEnv.sql(sqlQuery).toDataStream[Row]
+result.addSink(new StreamITCase.StringSink)
+env.execute()
+  }
+
+  /**
+* Both streams must have time boundaries.
+*/
+  @Test(expected = classOf[TableException])
+  def testJoinException1(): Unit = {
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.setStateBackend(getStateBackend)
+val tEnv = TableEnvironment.getTableEnvironment(env)
+StreamITCase.testResults = mutable.MutableList()
+
+val t1 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
+val t2 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
+
+tEnv.registerTable("T1", t1)
+tEnv.registerTable("T2", t2)
+
+val sqlQuery = "SELECT t2.a, t2.c, t1.c from T1 as t1 join T2 as t2 on t1.a = t2.a " +
+  "and t1.proctime > t2.proctime - interval '5' second"
+
+val result = tEnv.sql(sqlQuery).toDataStream[Row]
+result.addSink(new StreamITCase.StringSink)
+env.execute()
+  }
+
+  /**
+* Both streams must use the same time indicator.
+*/
+  @Test(expected = classOf[TableException])
+  def testJoinException2(): Unit = {
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.setStateBackend(getStateBackend)
+val tEnv = TableEnvironment.getTableEnvironment(env)
+StreamITCase.testResults = mutable.MutableList()
+
+val t1 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
+val t2 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'rowtime.rowtime)
+
+tEnv.registerTable("T1", t1)
+tEnv.registerTable("T2", t2)
+
+val sqlQuery = "SELECT t2.a, t2.c, t1.c from T1 as t1 join T2 as t2 on t1.a = t2.a " +
+  "and t1.proctime > t2.rowtime - interval '5' second "
+
+val result = tEnv.sql(sqlQuery).toDataStream[Row]
+result.addSink(new StreamITCase.StringSink)
+env.execute()
+  }
+
+
+  /** 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123189670
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/stream/sql/JoinITCase.scala
 ---
@@ -0,0 +1,204 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.api.scala.stream.sql
+
+import org.apache.flink.api.scala._
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.api.scala.stream.utils.{StreamITCase, StreamingWithStateTestBase}
+import org.apache.flink.table.api.{TableEnvironment, TableException}
+import org.apache.flink.types.Row
+import org.junit.Assert._
+import org.junit._
+
+import scala.collection.mutable
+
+class JoinITCase extends StreamingWithStateTestBase {
+
+  val data = List(
+(1L, 1, "Hello"),
+(2L, 2, "Hello"),
+(3L, 3, "Hello"),
+(4L, 4, "Hello"),
+(5L, 5, "Hello"),
+(6L, 6, "Hello"),
+(7L, 7, "Hello World"),
+(8L, 8, "Hello World"),
+(20L, 20, "Hello World"))
+
+  /**
+* Both streams must have time boundaries.
+*/
+  @Test(expected = classOf[TableException])
+  def testJoinException0(): Unit = {
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.setStateBackend(getStateBackend)
+val tEnv = TableEnvironment.getTableEnvironment(env)
+StreamITCase.testResults = mutable.MutableList()
+
+val t1 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
+val t2 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
+
+tEnv.registerTable("T1", t1)
+tEnv.registerTable("T2", t2)
+
+val sqlQuery = "SELECT t2.a, t2.c, t1.c from T1 as t1 join T2 as t2 on t1.a = t2.a"
+
+val result = tEnv.sql(sqlQuery).toDataStream[Row]
+result.addSink(new StreamITCase.StringSink)
+env.execute()
+  }
+
+  /**
+* Both streams must have time boundaries.
+*/
+  @Test(expected = classOf[TableException])
+  def testJoinException1(): Unit = {
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.setStateBackend(getStateBackend)
+val tEnv = TableEnvironment.getTableEnvironment(env)
+StreamITCase.testResults = mutable.MutableList()
+
+val t1 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
+val t2 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
+
+tEnv.registerTable("T1", t1)
+tEnv.registerTable("T2", t2)
+
+val sqlQuery = "SELECT t2.a, t2.c, t1.c from T1 as t1 join T2 as t2 on t1.a = t2.a " +
+  "and t1.proctime > t2.proctime - interval '5' second"
+
+val result = tEnv.sql(sqlQuery).toDataStream[Row]
+result.addSink(new StreamITCase.StringSink)
+env.execute()
+  }
+
+  /**
+* Both streams must use the same time indicator.
+*/
+  @Test(expected = classOf[TableException])
+  def testJoinException2(): Unit = {
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.setStateBackend(getStateBackend)
+val tEnv = TableEnvironment.getTableEnvironment(env)
+StreamITCase.testResults = mutable.MutableList()
+
+val t1 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'proctime.proctime)
+val t2 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 'rowtime.rowtime)
+
+tEnv.registerTable("T1", t1)
+tEnv.registerTable("T2", t2)
+
+val sqlQuery = "SELECT t2.a, t2.c, t1.c from T1 as t1 join T2 as t2 on t1.a = t2.a " +
+  "and t1.proctime > t2.rowtime - interval '5' second "
+
+val result = tEnv.sql(sqlQuery).toDataStream[Row]
+result.addSink(new StreamITCase.StringSink)
+env.execute()
+  }
+
+
+  /** 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057473#comment-16057473
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123225232
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction that joins two streams; currently only inner joins are supported.
+  *
+  * @param leftStreamWindowSize  the left stream window size
+  * @param rightStreamWindowSize  the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName  the name of the generated function for the non-equi join condition
+  * @param genJoinFuncCode  the code of the generated function for the non-equi join condition
+  *
+  */
+class ProcTimeInnerJoin(
--- End diff --

Does it make sense to split the implementation into two operators:
1. both streams need to be buffered (`l.ptime > r.ptime - 10.secs AND 
l.ptime < r.ptime + 5.secs`)
2. only one stream needs to be buffered (`l.ptime > r.ptime - 10.secs AND 
l.ptime < r.ptime - 5.secs`)
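
A hedged sketch of how such a split could be selected (the names are
illustrative and not part of this PR; the condition is assumed normalized to
l.proctime BETWEEN r.proctime + lower AND r.proctime + upper):

object JoinVariantSelector {

  sealed trait JoinVariant
  case object BufferBothSides extends JoinVariant  // window spans past and future of r
  case object BufferLeftOnly extends JoinVariant   // left rows always precede their matches
  case object BufferRightOnly extends JoinVariant  // right rows always precede their matches

  def selectVariant(lower: Long, upper: Long): JoinVariant =
    if (lower > 0) {
      // l.proctime > r.proctime: an arriving left row only probes older right
      // rows, so only the right stream needs to be buffered in state.
      BufferRightOnly
    } else if (upper < 0) {
      // l.proctime < r.proctime: symmetric case, only the left stream needs state.
      BufferLeftOnly
    } else {
      // The window covers both directions, so both streams need to be buffered.
      BufferBothSides
    }
}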


> Support proctime inner equi-join between two streams in the SQL API
> ---
>
> Key: FLINK-6232
> URL: https://issues.apache.org/jira/browse/FLINK-6232
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: hongyuhong
>Assignee: hongyuhong
>
> The goal of this issue is to add support for inner equi-join on proc time 
> streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT o.proctime, o.productId, o.orderId, s.proctime AS shipTime 
> FROM Orders AS o 
> JOIN Shipments AS s 
> ON o.orderId = s.orderId 
> AND o.proctime BETWEEN s.proctime AND s.proctime + INTERVAL '1' HOUR;
> {code}
> The following restrictions should initially apply:
> * The join currently only supports inner joins
> * The ON clause should include an equi-join condition
> * The time condition {{o.proctime BETWEEN s.proctime AND s.proctime + 
> INTERVAL '1' HOUR}} can only use proctime, which is a system attribute; the 
> time condition only supports a bounded time range like {{o.proctime BETWEEN 
> s.proctime - INTERVAL '1' HOUR AND s.proctime + INTERVAL '1' HOUR}}, not an 
> unbounded one like {{o.proctime > s.proctime}}, and it should include both 
> streams' proctime attributes; {{o.proctime between proctime() and 
> proctime() + 1}} should also not be supported.
> This issue includes:
> * Design of the DataStream operator to deal with stream join
> * Translation from Calcite's RelNode representation (LogicalJoin). 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057462#comment-16057462
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123226087
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction that joins two streams; currently only inner joins are supported.
+  *
+  * @param leftStreamWindowSize  the left stream window size
+  * @param rightStreamWindowSize  the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName  the name of the generated function for the non-equi join condition
+  * @param genJoinFuncCode  the code of the generated function for the non-equi join condition
+  *
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]] {
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], rowListTypeInfo1)
+row1MapState = 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057452#comment-16057452
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123021563
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class that helps analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Checks whether the join is a stream-to-stream join.
+*
+* @param  condition   the join condition, including the time condition
+* @param  inputType   the row type of the connected left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream and
+* the time type, and return the remaining condition.
+*
+* @param  condition   other condition, including the time-condition
+* @param  leftLogicalFieldCnt left stream logical field num
+* @param  leftPhysicalFieldCnt left stream physical field num
+* @param  inputType   left and right connect stream type
+* @param  rexBuilder   util to build rexNode
+* @param  config  table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+   

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057468#comment-16057468
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123205673
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class to help analyze and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Check whether the join is a stream-to-stream join.
+*
+* @param  condition   other condition, including the time-condition
+* @param  inputType   left and right connect stream type
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream and
+* the time type, and return the remaining condition.
+*
+* @param  condition   other condition, including the time-condition
+* @param  leftLogicalFieldCnt left stream logical field num
+* @param  leftPhysicalFieldCnt left stream physical field num
+* @param  inputType   left and right connect stream type
+* @param  rexBuilder   util to build rexNode
+* @param  config  table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+   

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057464#comment-16057464
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123189670
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/scala/stream/sql/JoinITCase.scala
 ---
@@ -0,0 +1,204 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.api.scala.stream.sql
+
+import org.apache.flink.api.scala._
+import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.api.scala.stream.utils.{StreamITCase, 
StreamingWithStateTestBase}
+import org.apache.flink.table.api.{TableEnvironment, TableException}
+import org.apache.flink.types.Row
+import org.junit.Assert._
+import org.junit._
+
+import scala.collection.mutable
+
+class JoinITCase extends StreamingWithStateTestBase {
+
+  val data = List(
+(1L, 1, "Hello"),
+(2L, 2, "Hello"),
+(3L, 3, "Hello"),
+(4L, 4, "Hello"),
+(5L, 5, "Hello"),
+(6L, 6, "Hello"),
+(7L, 7, "Hello World"),
+(8L, 8, "Hello World"),
+(20L, 20, "Hello World"))
+
+  /**
+* both streams should have boundaries
+*/
+  @Test(expected = classOf[TableException])
+  def testJoinException0(): Unit = {
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.setStateBackend(getStateBackend)
+val tEnv = TableEnvironment.getTableEnvironment(env)
+StreamITCase.testResults = mutable.MutableList()
+
+val t1 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 
'proctime.proctime)
+val t2 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 
'proctime.proctime)
+
+tEnv.registerTable("T1", t1)
+tEnv.registerTable("T2", t2)
+
+val sqlQuery = "SELECT t2.a, t2.c, t1.c from T1 as t1 join T2 as t2 on 
t1.a = t2.a"
+
+val result = tEnv.sql(sqlQuery).toDataStream[Row]
+result.addSink(new StreamITCase.StringSink)
+env.execute()
+  }
+
+  /**
+* both streams should have boundaries
+*/
+  @Test(expected = classOf[TableException])
+  def testJoinException1(): Unit = {
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.setStateBackend(getStateBackend)
+val tEnv = TableEnvironment.getTableEnvironment(env)
+StreamITCase.testResults = mutable.MutableList()
+
+val t1 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 
'proctime.proctime)
+val t2 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 
'proctime.proctime)
+
+tEnv.registerTable("T1", t1)
+tEnv.registerTable("T2", t2)
+
+val sqlQuery = "SELECT t2.a, t2.c, t1.c from T1 as t1 join T2 as t2 on 
t1.a = t2.a " +
+  "and t1.proctime > t2.proctime - interval '5' second"
+
+val result = tEnv.sql(sqlQuery).toDataStream[Row]
+result.addSink(new StreamITCase.StringSink)
+env.execute()
+  }
+
+  /**
+* both streams should use the same time indicator
+*/
+  @Test(expected = classOf[TableException])
+  def testJoinException2(): Unit = {
+val env = StreamExecutionEnvironment.getExecutionEnvironment
+env.setStateBackend(getStateBackend)
+val tEnv = TableEnvironment.getTableEnvironment(env)
+StreamITCase.testResults = mutable.MutableList()
+
+val t1 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 
'proctime.proctime)
+val t2 = env.fromCollection(data).toTable(tEnv, 'a, 'b, 'c, 
'rowtime.rowtime)
+
+tEnv.registerTable("T1", t1)
+tEnv.registerTable("T2", t2)
+
+val sqlQuery = "SELECT t2.a, t2.c, t1.c from T1 as t1 join T2 as t2 on 
t1.a = t2.a " +
   

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057471#comment-16057471
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123238701
  
--- Diff: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/harness/JoinHarnessTest.scala
 ---
@@ -0,0 +1,200 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.harness
+
+import java.util.concurrent.ConcurrentLinkedQueue
+import java.lang.{Integer => JInt}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, 
TypeInformation}
+import org.apache.flink.api.java.typeutils.RowTypeInfo
+import org.apache.flink.streaming.api.operators.co.KeyedCoProcessOperator
+import org.apache.flink.streaming.runtime.streamrecord.StreamRecord
+import 
org.apache.flink.streaming.util.{KeyedTwoInputStreamOperatorTestHarness, 
TwoInputStreamOperatorTestHarness}
+import org.apache.flink.table.codegen.GeneratedFunction
+import 
org.apache.flink.table.runtime.harness.HarnessTestBase.{RowResultSortComparator,
 TupleRowKeySelector}
+import org.apache.flink.table.runtime.join.ProcTimeInnerJoin
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.junit.Test
+
+
+class JoinHarnessTest extends HarnessTestBase {
+
+  private val rT = new RowTypeInfo(Array[TypeInformation[_]](
+INT_TYPE_INFO,
+STRING_TYPE_INFO),
+Array("a", "b"))
+
+
+  val funcCode: String =
+"""
+  |public class TestJoinFunction
+  |  extends 
org.apache.flink.api.common.functions.RichFlatJoinFunction {
+  |  transient org.apache.flink.types.Row out =
+  |new org.apache.flink.types.Row(4);
+  |  public TestJoinFunction() throws Exception {}
+  |
+  |  @Override
+  |  public void open(org.apache.flink.configuration.Configuration 
parameters)
+  |  throws Exception {}
+  |
+  |  @Override
+  |  public void join(Object _in1, Object _in2, 
org.apache.flink.util.Collector c)
+  |   throws Exception {
+  |   org.apache.flink.types.Row in1 = (org.apache.flink.types.Row) 
_in1;
+  |   org.apache.flink.types.Row in2 = (org.apache.flink.types.Row) 
_in2;
+  |
+  |   out.setField(0, in1.getField(0));
+  |   out.setField(1, in1.getField(1));
+  |   out.setField(2, in2.getField(0));
+  |   out.setField(3, in2.getField(1));
+  |
+  |   c.collect(out);
+  |
+  |  }
+  |
+  |  @Override
+  |  public void close() throws Exception {}
+  |}
+""".stripMargin
+
+  @Test
+  def testProcTimeJoin() {
--- End diff --

Please add comments for the scenarios that this test covers.
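For illustration, a sketch of the kind of scenario comments this asks for (the scenario list below is an assumption about what a proctime join harness test would cover, not taken from the PR):

{code}
import org.junit.Test

class JoinHarnessCommentSketch {

  /**
    * Covers:
    *  1. rows arriving on either input within the window bounds join with
    *     the rows buffered in the other input's MapState,
    *  2. rows outside the time bounds produce no join result,
    *  3. the registered processing-time timer cleans up expired rows.
    */
  @Test
  def testProcTimeJoin(): Unit = {
    // harness setup and assertions would go here, as in the PR
  }
}
{code}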


> Support proctime inner equi-join between two streams in the SQL API
> ---
>
> Key: FLINK-6232
> URL: https://issues.apache.org/jira/browse/FLINK-6232
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: hongyuhong
>Assignee: hongyuhong
>
> The goal of this issue is to add support for inner equi-join on proc time 
> streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT o.proctime, o.productId, o.orderId, s.proctime AS shipTime 
> FROM Orders AS o 
> JOIN Shipments AS s 
> ON o.orderId = s.orderId 
> AND o.proctime BETWEEN s.proctime AND s.proctime + INTERVAL '1' HOUR;
> {code}
> The following restrictions should initially apply:
> * The join hint only 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057459#comment-16057459
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123222750
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamRowStreamJoin.scala
 ---
@@ -0,0 +1,186 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.plan.nodes.datastream
+
+import org.apache.calcite.plan._
+import org.apache.calcite.rel.core.{JoinInfo, JoinRelType}
+import org.apache.calcite.rel.{BiRel, RelNode, RelWriter}
+import org.apache.calcite.rex.RexNode
+import org.apache.flink.api.java.functions.NullByteKeySelector
+import org.apache.flink.streaming.api.datastream.DataStream
+import org.apache.flink.table.api.{StreamQueryConfig, 
StreamTableEnvironment, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.plan.nodes.CommonJoin
+import org.apache.flink.table.plan.schema.RowSchema
+import org.apache.flink.table.runtime.join.{JoinUtil, ProcTimeInnerJoin}
+import org.apache.flink.table.runtime.types.{CRow, CRowTypeInfo}
+
+/**
+  * Flink RelNode that represents a join of two data streams and its related operations.
+  */
+class DataStreamRowStreamJoin(
--- End diff --

Rename to `DataStreamWindowJoin`?


> Support proctime inner equi-join between two streams in the SQL API
> ---
>
> Key: FLINK-6232
> URL: https://issues.apache.org/jira/browse/FLINK-6232
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: hongyuhong
>Assignee: hongyuhong
>
> The goal of this issue is to add support for inner equi-join on proc time 
> streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT o.proctime, o.productId, o.orderId, s.proctime AS shipTime 
> FROM Orders AS o 
> JOIN Shipments AS s 
> ON o.orderId = s.orderId 
> AND o.proctime BETWEEN s.proctime AND s.proctime + INTERVAL '1' HOUR;
> {code}
> The following restrictions should initially apply:
> * The join hint only supports inner joins
> * The ON clause should include an equi-join condition
> * The time-condition {{o.proctime BETWEEN s.proctime AND s.proctime + 
> INTERVAL '1' HOUR}} may only use proctime, which is a system attribute; the 
> time condition only supports a bounded time range like {{o.proctime BETWEEN 
> s.proctime - INTERVAL '1' HOUR AND s.proctime + INTERVAL '1' HOUR}}, not 
> an unbounded one like {{o.proctime > s.proctime}}, and should include both 
> streams' proctime attributes; {{o.proctime between proctime() and 
> proctime() + 1}} should also not be supported.
> This issue includes:
> * Design of the DataStream operator to deal with stream join
> * Translation from Calcite's RelNode representation (LogicalJoin). 
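For reference, a minimal end-to-end sketch (modeled on the PR's JoinITCase; table names and data are illustrative) of a query that satisfies these restrictions:

{code}
import org.apache.flink.api.scala._
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import org.apache.flink.table.api.TableEnvironment
import org.apache.flink.table.api.scala._
import org.apache.flink.types.Row

object BoundedProcTimeJoinSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val tEnv = TableEnvironment.getTableEnvironment(env)

    // two streams, each with a proctime attribute
    val orders = env.fromElements((1L, "beer"), (2L, "diaper"))
      .toTable(tEnv, 'orderId, 'productId, 'proctime.proctime)
    val shipments = env.fromElements((1L, "ship-1"), (2L, "ship-2"))
      .toTable(tEnv, 'orderId, 'shipId, 'proctime.proctime)

    tEnv.registerTable("Orders", orders)
    tEnv.registerTable("Shipments", shipments)

    // equi-join on orderId plus a bounded proctime range over both streams
    val sqlQuery =
      "SELECT o.orderId, s.shipId FROM Orders AS o JOIN Shipments AS s " +
        "ON o.orderId = s.orderId AND o.proctime BETWEEN " +
        "s.proctime - INTERVAL '1' HOUR AND s.proctime + INTERVAL '1' HOUR"

    tEnv.sql(sqlQuery).toDataStream[Row].print()
    env.execute()
  }
}
{code}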



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057453#comment-16057453
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123030370
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class to help analyze and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Check whether the join is a stream-to-stream join.
+*
+* @param  condition   other condition, including the time-condition
+* @param  inputType   left and right connect stream type
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream and
+* the time type, and return the remaining condition.
+*
+* @param  condition   other condition, including the time-condition
+* @param  leftLogicalFieldCnt left stream logical field num
+* @param  leftPhysicalFieldCnt left stream physical field num
+* @param  inputType   left and right connect stream type
+* @param  rexBuilder   util to build rexNode
+* @param  config  table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+   

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057441#comment-16057441
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r122838733
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/calcite/RelTimeIndicatorConverter.scala
 ---
@@ -162,8 +162,25 @@ class RelTimeIndicatorConverter(rexBuilder: 
RexBuilder) extends RelShuttle {
 LogicalProject.create(input, projects, fieldNames)
   }
 
-  override def visit(join: LogicalJoin): RelNode =
-throw new TableException("Logical join in a stream environment is not 
supported yet.")
+  override def visit(join: LogicalJoin): RelNode = {
+val left = join.getLeft.accept(this)
+val right = join.getRight.accept(this)
+
+// check if input field contains time indicator type
+// materialize field if no time indicator is present anymore
+// if input field is already materialized, change to timestamp type
+val inputFields = left.getRowType.getFieldList.map(_.getType) ++
+  right.getRowType.getFieldList.map(_.getType)
+val materializer = new RexTimeIndicatorMaterializer(
+  rexBuilder,
+  inputFields)
+
+val condition = join.getCondition.accept(materializer)
--- End diff --

I think we do not need to materialize time indicators for join predicates. 
If the time indicators are used in valid time-based join predicates we do not 
code-gen the predicate, and if the time-based join predicate is not valid, 
the query will fail anyway.


> Support proctime inner equi-join between two streams in the SQL API
> ---
>
> Key: FLINK-6232
> URL: https://issues.apache.org/jira/browse/FLINK-6232
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: hongyuhong
>Assignee: hongyuhong
>
> The goal of this issue is to add support for inner equi-join on proc time 
> streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT o.proctime, o.productId, o.orderId, s.proctime AS shipTime 
> FROM Orders AS o 
> JOIN Shipments AS s 
> ON o.orderId = s.orderId 
> AND o.proctime BETWEEN s.proctime AND s.proctime + INTERVAL '1' HOUR;
> {code}
> The following restrictions should initially apply:
> * The join hint only supports inner joins
> * The ON clause should include an equi-join condition
> * The time-condition {{o.proctime BETWEEN s.proctime AND s.proctime + 
> INTERVAL '1' HOUR}} may only use proctime, which is a system attribute; the 
> time condition only supports a bounded time range like {{o.proctime BETWEEN 
> s.proctime - INTERVAL '1' HOUR AND s.proctime + INTERVAL '1' HOUR}}, not 
> an unbounded one like {{o.proctime > s.proctime}}, and should include both 
> streams' proctime attributes; {{o.proctime between proctime() and 
> proctime() + 1}} should also not be supported.
> This issue includes:
> * Design of the DataStream operator to deal with stream join
> * Translation from Calcite's RelNode representation (LogicalJoin). 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057443#comment-16057443
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123003230
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class to help analyze and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Check whether the join is a stream-to-stream join.
+*
+* @param  condition   other condition, including the time-condition
--- End diff --

remove "other"?


> Support proctime inner equi-join between two streams in the SQL API
> ---
>
> Key: FLINK-6232
> URL: https://issues.apache.org/jira/browse/FLINK-6232
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: hongyuhong
>Assignee: hongyuhong
>
> The goal of this issue is to add support for inner equi-join on proc time 
> streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT o.proctime, o.productId, o.orderId, s.proctime AS shipTime 
> FROM Orders AS o 
> JOIN Shipments AS s 
> ON o.orderId = s.orderId 
> AND o.proctime BETWEEN s.proctime AND s.proctime + INTERVAL '1' HOUR;
> {code}
> The following restrictions should initially apply:
> * The join hint only supports inner joins
> * The ON clause should include an equi-join condition
> * The time-condition {{o.proctime BETWEEN s.proctime AND s.proctime + 
> INTERVAL '1' HOUR}} may only use proctime, which is a system attribute; the 
> time condition only supports a bounded time range like {{o.proctime BETWEEN 
> s.proctime - INTERVAL '1' HOUR AND s.proctime + INTERVAL '1' HOUR}}, not 
> an unbounded one like {{o.proctime > s.proctime}}, and should include both 
> streams' proctime attributes; {{o.proctime between proctime() and 
> proctime() + 1}} should also not be supported.
> This issue includes:
> * Design of the DataStream operator to deal with stream join
> * Translation from Calcite's RelNode representation (LogicalJoin). 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057457#comment-16057457
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123025909
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class to help analyze and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Check whether the join is a stream-to-stream join.
+*
+* @param  condition   other condition, including the time-condition
+* @param  inputType   left and right connect stream type
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream and
+* the time type, and return the remaining condition.
+*
+* @param  condition   other condition, including the time-condition
+* @param  leftLogicalFieldCnt left stream logical field num
+* @param  leftPhysicalFieldCnt left stream physical field num
+* @param  inputType   left and right connect stream type
+* @param  rexBuilder   util to build rexNode
+* @param  config  table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+   

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r122843078
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamRowStreamJoin.scala
 ---
@@ -0,0 +1,186 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.plan.nodes.datastream
+
+import org.apache.calcite.plan._
+import org.apache.calcite.rel.core.{JoinInfo, JoinRelType}
+import org.apache.calcite.rel.{BiRel, RelNode, RelWriter}
+import org.apache.calcite.rex.RexNode
+import org.apache.flink.api.java.functions.NullByteKeySelector
+import org.apache.flink.streaming.api.datastream.DataStream
+import org.apache.flink.table.api.{StreamQueryConfig, 
StreamTableEnvironment, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.plan.nodes.CommonJoin
+import org.apache.flink.table.plan.schema.RowSchema
+import org.apache.flink.table.runtime.join.{JoinUtil, ProcTimeInnerJoin}
+import org.apache.flink.table.runtime.types.{CRow, CRowTypeInfo}
+
+/**
+  * Flink RelNode that represents a join of two data streams and its related operations.
+  */
+class DataStreamRowStreamJoin(
+cluster: RelOptCluster,
+traitSet: RelTraitSet,
+leftNode: RelNode,
+rightNode: RelNode,
+joinCondition: RexNode,
+joinType: JoinRelType,
+leftSchema: RowSchema,
+rightSchema: RowSchema,
+schema: RowSchema,
+ruleDescription: String)
+  extends BiRel(cluster, traitSet, leftNode, rightNode)
+  with CommonJoin
+  with DataStreamRel {
+
+  override def deriveRowType() = schema.logicalType
+
+  override def copy(traitSet: RelTraitSet, inputs: 
java.util.List[RelNode]): RelNode = {
+new DataStreamRowStreamJoin(
+  cluster,
+  traitSet,
+  inputs.get(0),
+  inputs.get(1),
+  joinCondition,
+  joinType,
+  leftSchema,
+  rightSchema,
+  schema,
+  ruleDescription)
+  }
+
+  override def toString: String = {
+
+s"${joinTypeToString(joinType)}" +
+  s"(condition: (${joinConditionToString(schema.logicalType,
+joinCondition, getExpressionString)}), " +
+  s"select: (${joinSelectionToString(schema.logicalType)}))"
+  }
+
+  override def explainTerms(pw: RelWriter): RelWriter = {
+super.explainTerms(pw)
+  .item("condition", joinConditionToString(schema.logicalType,
+joinCondition, getExpressionString))
+  .item("select", joinSelectionToString(schema.logicalType))
+  .item("joinType", joinTypeToString(joinType))
+  }
+
+  override def translateToPlan(
+  tableEnv: StreamTableEnvironment,
+  queryConfig: StreamQueryConfig): DataStream[CRow] = {
+
+val config = tableEnv.getConfig
+
+// get the equality keys and other condition
+val joinInfo = JoinInfo.of(leftNode, rightNode, joinCondition)
+val leftKeys = joinInfo.leftKeys.toIntArray
+val rightKeys = joinInfo.rightKeys.toIntArray
+val otherCondition = joinInfo.getRemaining(cluster.getRexBuilder)
+
+// analyze time boundary and time predicate type(proctime/rowtime)
+val (timeType, leftStreamWindowSize, rightStreamWindowSize, 
remainCondition) =
+  JoinUtil.analyzeTimeBoundary(
--- End diff --

I think we should move the analysis to the rule. Otherwise, we might end up 
with a plan that cannot be translated. It is the rule's responsibility to 
ensure that the translated plan can be executed.

The rule can then pass the analyzed time predicate parameters (time type, 
bounds) to the `DataStreamRowStreamJoin`.
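As a rough sketch of that suggestion (the skeleton below and its names are assumptions for illustration, not the PR's code):

{code}
import scala.util.Try

import org.apache.calcite.plan.RelOptRuleCall
import org.apache.calcite.rel.logical.LogicalJoin

// Sketch: run the time-boundary analysis while matching, so a plan is only
// converted if its time predicates are translatable; the extracted parameters
// (time type, left/right bounds) are then passed on to the physical node
// instead of being re-analyzed in translateToPlan.
abstract class WindowJoinRuleSketch {

  // stands in for JoinUtil.analyzeTimeBoundary(...) in this sketch; returns
  // (isRowTime, leftBound, rightBound)
  def analyze(join: LogicalJoin): (Boolean, Long, Long)

  def matchesTimeBounds(call: RelOptRuleCall): Boolean = {
    val join: LogicalJoin = call.rel(0)
    // match only if the analysis succeeds, i.e. does not throw
    Try(analyze(join)).isSuccess
  }
}
{code}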



[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057455#comment-16057455
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123017357
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class to help analyze and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Check whether the join is a stream-to-stream join.
+*
+* @param  condition   other condition, including the time-condition
+* @param  inputType   left and right connect stream type
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
--- End diff --

replace `_.size > 0` by `_.nonEmpty`
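A standalone illustration of the suggested refactor (the data is made up):

{code}
// terms mimics the result of mapping analyzeSingleConditionTerm over operands
val terms: Seq[Seq[Int]] = Seq(Seq(1), Seq())

val before = terms.exists(_.size > 0)  // works, but not idiomatic
val after  = terms.exists(_.nonEmpty)  // preferred Scala style

assert(before == after)
{code}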


> Support proctime inner equi-join between two streams in the SQL API
> ---
>
> Key: FLINK-6232
> URL: https://issues.apache.org/jira/browse/FLINK-6232
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: hongyuhong
>Assignee: hongyuhong
>
> The goal of this issue is to add support for inner equi-join on proc time 
> streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT o.proctime, o.productId, o.orderId, s.proctime AS shipTime 
> FROM Orders AS o 
> JOIN Shipments AS s 
> ON o.orderId = s.orderId 
> AND o.proctime BETWEEN s.proctime AND s.proctime + INTERVAL '1' HOUR;
> {code}
> The following restrictions should initially apply:
> * The join hint only supports inner joins
> * The ON clause should include an equi-join condition
> * The time-condition {{o.proctime BETWEEN s.proctime AND s.proctime + 
> INTERVAL '1' HOUR}} may only use proctime, which is a system attribute; the 
> time condition only supports a bounded time range like {{o.proctime BETWEEN 
> s.proctime - INTERVAL '1' HOUR AND s.proctime + INTERVAL '1' HOUR}}, not 
> an unbounded one like {{o.proctime > s.proctime}}, and should include both 
> streams' proctime attributes; {{o.proctime between proctime() and 
> proctime() + 1}} should also not be supported.
> This issue includes:
> * Design of the DataStream operator to deal with stream join
> * Translation from Calcite's RelNode representation (LogicalJoin). 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r122838733
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/calcite/RelTimeIndicatorConverter.scala
 ---
@@ -162,8 +162,25 @@ class RelTimeIndicatorConverter(rexBuilder: 
RexBuilder) extends RelShuttle {
 LogicalProject.create(input, projects, fieldNames)
   }
 
-  override def visit(join: LogicalJoin): RelNode =
-throw new TableException("Logical join in a stream environment is not 
supported yet.")
+  override def visit(join: LogicalJoin): RelNode = {
+val left = join.getLeft.accept(this)
+val right = join.getRight.accept(this)
+
+// check if input field contains time indicator type
+// materialize field if no time indicator is present anymore
+// if input field is already materialized, change to timestamp type
+val inputFields = left.getRowType.getFieldList.map(_.getType) ++
+  right.getRowType.getFieldList.map(_.getType)
+val materializer = new RexTimeIndicatorMaterializer(
+  rexBuilder,
+  inputFields)
+
+val condition = join.getCondition.accept(materializer)
--- End diff --

I think we do not need to materialize time indicators for join predicates. 
If the time indicators are used in valid time-based join predicates we do not 
code-gen the predicate, and if the time-based join predicate is not valid, 
the query will fail anyway.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057448#comment-16057448
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123228354
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/ProcTimeInnerJoin.scala
 ---
@@ -0,0 +1,316 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.runtime.join
+
+import java.util
+import java.util.{List => JList}
+
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.state._
+import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, 
TypeInformation}
+import org.apache.flink.api.java.typeutils.ListTypeInfo
+import org.apache.flink.configuration.Configuration
+import org.apache.flink.streaming.api.functions.co.CoProcessFunction
+import org.apache.flink.table.codegen.Compiler
+import org.apache.flink.table.runtime.CRowWrappingCollector
+import org.apache.flink.table.runtime.types.CRow
+import org.apache.flink.types.Row
+import org.apache.flink.util.Collector
+import org.slf4j.LoggerFactory
+
+/**
+  * A CoProcessFunction to support stream-to-stream joins; currently only the
+  * inner join is supported.
+  *
+  * @param leftStreamWindowSize  the left stream window size
+  * @param rightStreamWindowSize the right stream window size
+  * @param element1Type  the input type of the left stream
+  * @param element2Type  the input type of the right stream
+  * @param genJoinFuncName the function name of the non-equi condition
+  * @param genJoinFuncCode the function code of the non-equi condition
+  *
+  */
+class ProcTimeInnerJoin(
+private val leftStreamWindowSize: Long,
+private val rightStreamWindowSize: Long,
+private val element1Type: TypeInformation[Row],
+private val element2Type: TypeInformation[Row],
+private val genJoinFuncName: String,
+private val genJoinFuncCode: String)
+  extends CoProcessFunction[CRow, CRow, CRow]
+with Compiler[FlatJoinFunction[Row, Row, Row]]{
+
+  private var cRowWrapper: CRowWrappingCollector = _
+
+  /** other condition function **/
+  private var joinFunction: FlatJoinFunction[Row, Row, Row] = _
+
+  /** tmp list to store expired records **/
+  private var listToRemove: JList[Long] = _
+
+  /** state to hold left stream element **/
+  private var row1MapState: MapState[Long, JList[Row]] = _
+  /** state to hold right stream element **/
+  private var row2MapState: MapState[Long, JList[Row]] = _
+
+  /** state to record last timer of left stream, 0 means no timer **/
+  private var timerState1: ValueState[Long] = _
+  /** state to record last timer of right stream, 0 means no timer **/
+  private var timerState2: ValueState[Long] = _
+
+  val LOG = LoggerFactory.getLogger(this.getClass)
+
+  override def open(config: Configuration) {
+LOG.debug(s"Compiling JoinFunction: $genJoinFuncName \n\n " +
+  s"Code:\n$genJoinFuncCode")
+val clazz = compile(
+  getRuntimeContext.getUserCodeClassLoader,
+  genJoinFuncName,
+  genJoinFuncCode)
+LOG.debug("Instantiating JoinFunction.")
+joinFunction = clazz.newInstance()
+
+listToRemove = new util.ArrayList[Long]()
+cRowWrapper = new CRowWrappingCollector()
+
+// initialize row state
+val rowListTypeInfo1: TypeInformation[JList[Row]] = new 
ListTypeInfo[Row](element1Type)
+val mapStateDescriptor1: MapStateDescriptor[Long, JList[Row]] =
+  new MapStateDescriptor[Long, JList[Row]]("row1mapstate",
+BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]], 
rowListTypeInfo1)
+row1MapState = 
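For context, a minimal self-contained sketch (class and state names here are illustrative, not the PR's) of how such keyed MapState is registered in a CoProcessFunction's open():

{code}
import java.util.{List => JList}

import org.apache.flink.api.common.state.{MapState, MapStateDescriptor}
import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, TypeInformation}
import org.apache.flink.api.java.typeutils.ListTypeInfo
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.co.CoProcessFunction
import org.apache.flink.table.runtime.types.CRow
import org.apache.flink.types.Row
import org.apache.flink.util.Collector

class StateSetupSketch(rowType: TypeInformation[Row])
  extends CoProcessFunction[CRow, CRow, CRow] {

  // one list of rows per (processing-time) timestamp key
  private var rowMapState: MapState[Long, JList[Row]] = _

  override def open(config: Configuration): Unit = {
    val listTypeInfo: TypeInformation[JList[Row]] = new ListTypeInfo[Row](rowType)
    val descriptor = new MapStateDescriptor[Long, JList[Row]](
      "rowmapstate",
      BasicTypeInfo.LONG_TYPE_INFO.asInstanceOf[TypeInformation[Long]],
      listTypeInfo)
    // keyed state: scoped to the current join key at access time
    rowMapState = getRuntimeContext.getMapState(descriptor)
  }

  override def processElement1(
      in: CRow,
      ctx: CoProcessFunction[CRow, CRow, CRow]#Context,
      out: Collector[CRow]): Unit = {}

  override def processElement2(
      in: CRow,
      ctx: CoProcessFunction[CRow, CRow, CRow]#Context,
      out: Collector[CRow]): Unit = {}
}
{code}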

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123020420
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class to help analyze and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Check whether the join is a stream-to-stream join.
+*
+* @param  condition   other condition, including the time-condition
+* @param  inputType   left and right connect stream type
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream and
+* the time type, and return the remaining condition.
+*
+* @param  condition   other condition, including the time-condition
+* @param  leftLogicalFieldCnt left stream logical field num
+* @param  leftPhysicalFieldCnt left stream physical field num
+* @param  inputType   left and right connect stream type
+* @param  rexBuilder   util to build rexNode
+* @param  config  table environment config
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+  )
+
+if (timeTerms.size != 2) {
+  throw new TableException("There must be exactly two time conditions.")
+}
+
+// extract time offset from the time indicator condition
+val 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123029455
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A util class to help analyze and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Check whether the join is a stream-to-stream join.
+*
+* @param  condition   other condition, including the time-condition
+* @param  inputType   left and right connect stream type
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
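+// it is a stream-to-stream join if the condition references a time
+// indicator attribute and contains no tumbling window expression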
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition   join condition including the time condition
+* @param  leftLogicalFieldCnt number of logical fields in the left stream
+* @param  leftPhysicalFieldCnt number of physical fields in the left stream
+* @param  inputType   row type of the combined left and right streams
+* @param  rexBuilder   utility to build RexNodes
+* @param  config  table environment config
+* @return (time type, left stream window size, right stream window size,
+* remaining condition)
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+  )
+
+if (timeTerms.size != 2) {
+  throw new TableException("There must be exactly two time conditions.")
+}
+
+// extract the time offset from the time indicator condition
+val 

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123026855
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class to help analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Check if the join is a join of two streams.
+*
+* @param  condition   join condition including the time condition
+* @param  inputType   row type of the combined left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition   join condition including the time condition
+* @param  leftLogicalFieldCnt number of logical fields in the left stream
+* @param  leftPhysicalFieldCnt number of physical fields in the left stream
+* @param  inputType   row type of the combined left and right streams
+* @param  rexBuilder   utility to build RexNodes
+* @param  config  table environment config
+* @return (time type, left stream window size, right stream window size,
+* remaining condition)
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+  )
+
+if (timeTerms.size != 2) {
+  throw new TableException("There must be exactly two time conditions.")
+}
+
+// extract the time offset from the time indicator condition
+val 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057446#comment-16057446
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r122842288
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/CommonJoin.scala
 ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.plan.nodes
+
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex.RexNode
+
+import scala.collection.JavaConverters._
+
+trait CommonJoin {
+
+  private[flink] def joinSelectionToString(inputType: RelDataType): String 
= {
+inputType.getFieldNames.asScala.toList.mkString(", ")
+  }
+
+  private[flink] def joinConditionToString(
+inputType: RelDataType,
+joinCondition: RexNode,
+expression: (RexNode, List[String], Option[List[RexNode]]) => String): 
String = {
+
+val inFields = inputType.getFieldNames.asScala.toList
+expression(joinCondition, inFields, None)
+  }
+
+  private[flink] def joinTypeToString(joinType: JoinRelType) = {
+joinType match {
+  case JoinRelType.INNER => "InnerJoin"
+  case JoinRelType.LEFT => "LeftOuterJoin"
+  case JoinRelType.RIGHT => "RightOuterJoin"
+  case JoinRelType.FULL => "FullOuterJoin"
+}
+  }
+
--- End diff --

add `explainTerms` and `toString`
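
For illustration only, a rough sketch of what the shared methods could look
like (shapes assumed, mirroring the string helpers above; an
{{org.apache.calcite.rel.RelWriter}} import would be needed):

{code}
// sketch: shared toString/explainTerms helpers in CommonJoin
private[flink] def joinToString(
    inputType: RelDataType,
    joinCondition: RexNode,
    joinType: JoinRelType,
    expression: (RexNode, List[String], Option[List[RexNode]]) => String): String = {

  // reuse the existing helpers so every join node prints the same way
  s"${joinTypeToString(joinType)}" +
    s"(condition: (${joinConditionToString(inputType, joinCondition, expression)}), " +
    s"select: (${joinSelectionToString(inputType)}))"
}

private[flink] def joinExplainTerms(
    pw: RelWriter,
    inputType: RelDataType,
    joinCondition: RexNode,
    joinType: JoinRelType,
    expression: (RexNode, List[String], Option[List[RexNode]]) => String): RelWriter = {

  // same items that the per-node explainTerms implementations emit
  pw.item("condition", joinConditionToString(inputType, joinCondition, expression))
    .item("select", joinSelectionToString(inputType))
    .item("joinType", joinTypeToString(joinType))
}
{code}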


> Support proctime inner equi-join between two streams in the SQL API
> ---
>
> Key: FLINK-6232
> URL: https://issues.apache.org/jira/browse/FLINK-6232
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: hongyuhong
>Assignee: hongyuhong
>
> The goal of this issue is to add support for inner equi-join on proc time 
> streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT o.proctime, o.productId, o.orderId, s.proctime AS shipTime 
> FROM Orders AS o 
> JOIN Shipments AS s 
> ON o.orderId = s.orderId 
> AND o.proctime BETWEEN s.proctime AND s.proctime + INTERVAL '1' HOUR;
> {code}
> The following restrictions should initially apply:
> * The join hint only supports inner joins
> * The ON clause should include an equi-join condition
> * The time condition {{o.proctime BETWEEN s.proctime AND s.proctime + 
> INTERVAL '1' HOUR}} may only use {{proctime}}, which is a system attribute. 
> Only bounded time ranges like {{o.proctime BETWEEN s.proctime - INTERVAL 
> '1' HOUR AND s.proctime + INTERVAL '1' HOUR}} are supported; unbounded 
> conditions like {{o.proctime > s.proctime}} are not. The condition must 
> reference the proctime attributes of both streams, so {{o.proctime BETWEEN 
> proctime() AND proctime() + 1}} is also not supported.
> This issue includes:
> * Design of the DataStream operator to deal with stream join
> * Translation from Calcite's RelNode representation (LogicalJoin). 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057449#comment-16057449
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r122841979
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/CommonJoin.scala
 ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.plan.nodes
+
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex.RexNode
+
+import scala.collection.JavaConverters._
+
+trait CommonJoin {
--- End diff --

The `DataSetJoin` should also extend from this class. 
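
For illustration, a rough sketch of the reuse (constructor details elided;
this is not the actual {{DataSetJoin}} code):

{code}
// sketch: mix the shared trait into the batch join node as well
class DataSetJoin(
    cluster: RelOptCluster,
    traitSet: RelTraitSet,
    leftNode: RelNode,
    rightNode: RelNode
    /* remaining constructor args elided */)
  extends BiRel(cluster, traitSet, leftNode, rightNode)
  with CommonJoin
  with DataSetRel {

  // toString and explainTerms can then delegate to joinTypeToString,
  // joinConditionToString, and joinSelectionToString from CommonJoin
}
{code}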


> Support proctime inner equi-join between two streams in the SQL API
> ---
>
> Key: FLINK-6232
> URL: https://issues.apache.org/jira/browse/FLINK-6232
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: hongyuhong
>Assignee: hongyuhong
>
> The goal of this issue is to add support for inner equi-join on proc time 
> streams to the SQL interface.
> Queries similar to the following should be supported:
> {code}
> SELECT o.proctime, o.productId, o.orderId, s.proctime AS shipTime 
> FROM Orders AS o 
> JOIN Shipments AS s 
> ON o.orderId = s.orderId 
> AND o.proctime BETWEEN s.proctime AND s.proctime + INTERVAL '1' HOUR;
> {code}
> The following restrictions should initially apply:
> * The join hint only supports inner joins
> * The ON clause should include an equi-join condition
> * The time condition {{o.proctime BETWEEN s.proctime AND s.proctime + 
> INTERVAL '1' HOUR}} may only use {{proctime}}, which is a system attribute. 
> Only bounded time ranges like {{o.proctime BETWEEN s.proctime - INTERVAL 
> '1' HOUR AND s.proctime + INTERVAL '1' HOUR}} are supported; unbounded 
> conditions like {{o.proctime > s.proctime}} are not. The condition must 
> reference the proctime attributes of both streams, so {{o.proctime BETWEEN 
> proctime() AND proctime() + 1}} is also not supported.
> This issue includes:
> * Design of the DataStream operator to deal with stream join
> * Translation from Calcite's RelNode representation (LogicalJoin). 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057451#comment-16057451
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r122843078
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamRowStreamJoin.scala
 ---
@@ -0,0 +1,186 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.plan.nodes.datastream
+
+import org.apache.calcite.plan._
+import org.apache.calcite.rel.core.{JoinInfo, JoinRelType}
+import org.apache.calcite.rel.{BiRel, RelNode, RelWriter}
+import org.apache.calcite.rex.RexNode
+import org.apache.flink.api.java.functions.NullByteKeySelector
+import org.apache.flink.streaming.api.datastream.DataStream
+import org.apache.flink.table.api.{StreamQueryConfig, 
StreamTableEnvironment, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.plan.nodes.CommonJoin
+import org.apache.flink.table.plan.schema.RowSchema
+import org.apache.flink.table.runtime.join.{JoinUtil, ProcTimeInnerJoin}
+import org.apache.flink.table.runtime.types.{CRow, CRowTypeInfo}
+
+/**
+  * Flink RelNode representing a join of two data streams and its related operations.
+  */
+class DataStreamRowStreamJoin(
+cluster: RelOptCluster,
+traitSet: RelTraitSet,
+leftNode: RelNode,
+rightNode: RelNode,
+joinCondition: RexNode,
+joinType: JoinRelType,
+leftSchema: RowSchema,
+rightSchema: RowSchema,
+schema: RowSchema,
+ruleDescription: String)
+  extends BiRel(cluster, traitSet, leftNode, rightNode)
+  with CommonJoin
+  with DataStreamRel {
+
+  override def deriveRowType() = schema.logicalType
+
+  override def copy(traitSet: RelTraitSet, inputs: 
java.util.List[RelNode]): RelNode = {
+new DataStreamRowStreamJoin(
+  cluster,
+  traitSet,
+  inputs.get(0),
+  inputs.get(1),
+  joinCondition,
+  joinType,
+  leftSchema,
+  rightSchema,
+  schema,
+  ruleDescription)
+  }
+
+  override def toString: String = {
+
+s"${joinTypeToString(joinType)}" +
+  s"(condition: (${joinConditionToString(schema.logicalType,
+joinCondition, getExpressionString)}), " +
+  s"select: (${joinSelectionToString(schema.logicalType)}))"
+  }
+
+  override def explainTerms(pw: RelWriter): RelWriter = {
+super.explainTerms(pw)
+  .item("condition", joinConditionToString(schema.logicalType,
+joinCondition, getExpressionString))
+  .item("select", joinSelectionToString(schema.logicalType))
+  .item("joinType", joinTypeToString(joinType))
+  }
+
+  override def translateToPlan(
+  tableEnv: StreamTableEnvironment,
+  queryConfig: StreamQueryConfig): DataStream[CRow] = {
+
+val config = tableEnv.getConfig
+
+// get the equality keys and other condition
+val joinInfo = JoinInfo.of(leftNode, rightNode, joinCondition)
+val leftKeys = joinInfo.leftKeys.toIntArray
+val rightKeys = joinInfo.rightKeys.toIntArray
+val otherCondition = joinInfo.getRemaining(cluster.getRexBuilder)
+
+// analyze time boundary and time predicate type (proctime/rowtime)
+val (timeType, leftStreamWindowSize, rightStreamWindowSize, 
remainCondition) =
+  JoinUtil.analyzeTimeBoundary(
--- End diff --

I think we should move the analysis to the rule. Otherwise, we might end up 
with a plan that cannot be translated. It is the rule's responsibility to 
ensure that the translated plan can be executed.
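
For illustration, a rough sketch of that approach (the rule name, traits,
and config handling here are assumptions, not the PR's code):

{code}
// sketch: validate the time predicates in the rule's matches(), so that a
// join that cannot be translated never becomes a DataStream node
class DataStreamJoinRule
  extends ConverterRule(
    classOf[LogicalJoin],
    Convention.NONE,
    DataStreamConvention.INSTANCE,
    "DataStreamJoinRule") {

  override def matches(call: RelOptRuleCall): Boolean = {
    val join: LogicalJoin = call.rel(0)
    try {
      // if the analysis fails, the rule simply does not match
      JoinUtil.analyzeTimeBoundary(
        join.getCondition,
        join.getLeft.getRowType.getFieldCount,
        join.getLeft.getRowType.getFieldCount,
        join.getRowType,
        join.getCluster.getRexBuilder,
        new TableConfig)  // placeholder; a real rule would reuse the env's config
      true
    } catch {
      case _: TableException => false
    }
  }

  override def convert(rel: RelNode): RelNode = {
    // create the DataStream join node and pass the analyzed time type
    // and bounds into its constructor (elided)
    ???
  }
}
{code}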

The rule can then pass the analyzed time predicate parameters (time type, 
bounds) to the 

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057445#comment-16057445
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123028155
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class to help analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Check if the join is a join of two streams.
+*
+* @param  condition   join condition including the time condition
+* @param  inputType   row type of the combined left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition   join condition including the time condition
+* @param  leftLogicalFieldCnt number of logical fields in the left stream
+* @param  leftPhysicalFieldCnt number of physical fields in the left stream
+* @param  inputType   row type of the combined left and right streams
+* @param  rexBuilder   utility to build RexNodes
+* @param  config  table environment config
+* @return (time type, left stream window size, right stream window size,
+* remaining condition)
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+   

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057442#comment-16057442
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123020420
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class to help analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Check if the join is a join of two streams.
+*
+* @param  condition   join condition including the time condition
+* @param  inputType   row type of the combined left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition   join condition including the time condition
+* @param  leftLogicalFieldCnt number of logical fields in the left stream
+* @param  leftPhysicalFieldCnt number of physical fields in the left stream
+* @param  inputType   row type of the combined left and right streams
+* @param  rexBuilder   utility to build RexNodes
+* @param  config  table environment config
+* @return (time type, left stream window size, right stream window size,
+* remaining condition)
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+   

[jira] [Commented] (FLINK-6232) Support proctime inner equi-join between two streams in the SQL API

2017-06-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057447#comment-16057447
 ] 

ASF GitHub Bot commented on FLINK-6232:
---

Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123029455
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class to help analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Check if the join is a join of two streams.
+*
+* @param  condition   join condition including the time condition
+* @param  inputType   row type of the combined left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition   join condition including the time condition
+* @param  leftLogicalFieldCnt number of logical fields in the left stream
+* @param  leftPhysicalFieldCnt number of physical fields in the left stream
+* @param  inputType   row type of the combined left and right streams
+* @param  rexBuilder   utility to build RexNodes
+* @param  config  table environment config
+* @return (time type, left stream window size, right stream window size,
+* remaining condition)
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+   

[GitHub] flink pull request #3715: [FLINK-6232][Table]Support proctime inner equi...

2017-06-21 Thread fhueske
Github user fhueske commented on a diff in the pull request:

https://github.com/apache/flink/pull/3715#discussion_r123023871
  
--- Diff: 
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/JoinUtil.scala
 ---
@@ -0,0 +1,385 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.flink.table.runtime.join
+
+import java.math.{BigDecimal => JBigDecimal}
+import java.util
+
+import org.apache.calcite.plan.RelOptUtil
+import org.apache.calcite.rel.`type`.RelDataType
+import org.apache.calcite.rel.core.JoinRelType
+import org.apache.calcite.rex._
+import org.apache.calcite.sql.SqlKind
+import org.apache.calcite.sql.fun.{SqlFloorFunction, SqlStdOperatorTable}
+import org.apache.flink.api.common.functions.FlatJoinFunction
+import org.apache.flink.api.common.typeinfo.TypeInformation
+import org.apache.flink.table.api.{TableConfig, TableException}
+import org.apache.flink.table.calcite.FlinkTypeFactory
+import org.apache.flink.table.codegen.{CodeGenerator, ExpressionReducer}
+import org.apache.flink.table.functions.TimeMaterializationSqlFunction
+import org.apache.flink.table.plan.schema.{RowSchema, 
TimeIndicatorRelDataType}
+import org.apache.flink.types.Row
+
+import scala.collection.JavaConversions._
+
+/**
+  * A utility class to help analyze join conditions and build join code.
+  */
+object JoinUtil {
+
+  /**
+* Check if the join is a join of two streams.
+*
+* @param  condition   join condition including the time condition
+* @param  inputType   row type of the combined left and right streams
+*/
+  private[flink] def isStreamStreamJoin(
+  condition: RexNode,
+  inputType: RelDataType) = {
+
+def isExistTumble(expr: RexNode): Boolean = {
+  expr match {
+case c: RexCall =>
+  c.getOperator match {
+case _: SqlFloorFunction =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case SqlStdOperatorTable.TUMBLE =>
+  c.getOperands.map(analyzeSingleConditionTerm(_, 0, 
inputType)).exists(_.size > 0)
+case _ =>
+  c.getOperands.map(isExistTumble(_)).exists(_ == true)
+  }
+case _ => false
+  }
+}
+
+val isExistTimeIndicator = analyzeSingleConditionTerm(condition, 0, 
inputType).size > 0
+val isExistTumbleExpr = isExistTumble(condition)
+
+!isExistTumbleExpr && isExistTimeIndicator
+  }
+
+  /**
+* Analyze the time condition to get the time boundary for each stream,
+* determine the time type, and return the remaining condition.
+*
+* @param  condition   join condition including the time condition
+* @param  leftLogicalFieldCnt number of logical fields in the left stream
+* @param  leftPhysicalFieldCnt number of physical fields in the left stream
+* @param  inputType   row type of the combined left and right streams
+* @param  rexBuilder   utility to build RexNodes
+* @param  config  table environment config
+* @return (time type, left stream window size, right stream window size,
+* remaining condition)
+*/
+  private[flink] def analyzeTimeBoundary(
+  condition: RexNode,
+  leftLogicalFieldCnt: Int,
+  leftPhysicalFieldCnt: Int,
+  inputType: RelDataType,
+  rexBuilder: RexBuilder,
+  config: TableConfig): (RelDataType, Long, Long, Option[RexNode]) = {
+
+// Converts the condition to conjunctive normal form (CNF)
+val cnfCondition = RexUtil.toCnf(rexBuilder, condition)
+
+// split the condition into time indicator condition and other 
condition
+val (timeTerms, remainTerms) =
+  splitJoinCondition(
+cnfCondition,
+leftLogicalFieldCnt,
+inputType
+  )
+
+if (timeTerms.size != 2) {
+  throw new TableException("There must be exactly two time conditions.")
--- End diff --

"A time-based stream join requires exactly two join predicates that bound 
