[GitHub] flink issue #2294: [FLINK-4265] [dataset api] Add a NoOpOperator
Github user fhueske commented on the issue: https://github.com/apache/flink/pull/2294 Sorry for the late reply @greghogan. +1 from me as well. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2294: [FLINK-4265] [dataset api] Add a NoOpOperator
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/2294 @fhueske what is your analysis?
[GitHub] flink issue #2294: [FLINK-4265] [dataset api] Add a NoOpOperator
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2294 This is okay to merge from my side. Definitely seems easier than the `Delegate` trick.
[GitHub] flink issue #2294: [FLINK-4265] [dataset api] Add a NoOpOperator
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/2294 The alternatives are much more hacky.
[GitHub] flink issue #2294: [FLINK-4265] [dataset api] Add a NoOpOperator
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/2294

This would replace `Delegate` and is much simpler. I placed it in the Java API package but it would not be visible to users through the `DataSet` API.

The ultimate goal is to implicitly reuse operations performing the same computation on the same input `DataSet`. The optimizer cannot do this without understanding the UDF configuration and logic. Two instances of a UDF may be

* incompatible, in which case a new result must be computed
* the same, in which case the old result can be reused
* different but compatible, in which case the UDF can merge the configurations and the new, shared result replaces the old result

Replacing the old result requires a wrapper. Using `ProxyFactory` in `Delegate` has limitations as documented in FLINK-4257. With a `NoOpOperator` we can perform the same function by appending and then ignoring a dummy POJO operator.
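The three cases above, and the reason replacing a result needs a wrapper, can be sketched in plain Java. This is only an illustration under stated assumptions and is not the code in the PR: `Config`, `Result`, `Wrapper`, `Decision`, and `submit` are hypothetical names, no Flink classes are used, and the compatibility rule (same algorithm, parallelism either equal or unset on one side) is invented for the example.

```java
public class ReuseSketch {

    /** Outcome of comparing a newly requested UDF configuration with a cached one. */
    enum Decision { RECOMPUTE, REUSE, MERGE }

    /** Hypothetical UDF configuration: an algorithm name and a parallelism hint (0 = unset). */
    static final class Config {
        final String algorithm;
        final int parallelism;

        Config(String algorithm, int parallelism) {
            this.algorithm = algorithm;
            this.parallelism = parallelism;
        }

        /** Assumed rule: same algorithm is required; an unset hint is compatible with any hint. */
        Decision compare(Config other) {
            if (!algorithm.equals(other.algorithm)) {
                return Decision.RECOMPUTE;   // incompatible
            }
            if (parallelism == other.parallelism) {
                return Decision.REUSE;       // the same
            }
            if (parallelism == 0 || other.parallelism == 0) {
                return Decision.MERGE;       // different but compatible
            }
            return Decision.RECOMPUTE;       // conflicting explicit hints
        }

        /** Merge two compatible configurations by keeping the explicit hint. */
        Config merge(Config other) {
            return new Config(algorithm, Math.max(parallelism, other.parallelism));
        }
    }

    /** Stand-in for a computed result (a DataSet in the discussion above). */
    static final class Result {
        final Config config;

        Result(Config config) {
            this.config = config;
        }
    }

    /**
     * Stand-in for the wrapper role. Consumers hold the wrapper, so the wrapped
     * result can be replaced later without touching the consumers.
     */
    static final class Wrapper {
        private Result input;

        Wrapper(Result input) {
            this.input = input;
        }

        Result getInput() {
            return input;
        }

        void setInput(Result input) {
            this.input = input;
        }
    }

    /** Returns the wrapper the caller should consume for the requested configuration. */
    static Wrapper submit(Wrapper cached, Config requested) {
        switch (cached.getInput().config.compare(requested)) {
            case REUSE:
                return cached;
            case MERGE:
                // Swap in the shared result; earlier consumers of `cached` see it too.
                cached.setInput(new Result(cached.getInput().config.merge(requested)));
                return cached;
            default:
                return new Wrapper(new Result(requested));
        }
    }

    public static void main(String[] args) {
        Wrapper first = new Wrapper(new Result(new Config("pagerank", 0)));
        System.out.println(submit(first, new Config("pagerank", 0)) == first);  // reused
        System.out.println(submit(first, new Config("pagerank", 4)) == first);  // merged in place
        System.out.println(first.getInput().config.parallelism);               // hint after merge
        System.out.println(submit(first, new Config("triangles", 4)) == first); // recomputed
    }
}
```

In this sketch the merge case is the only one that mutates the wrapper, which is the point of the indirection: code that already holds `first` picks up the shared result without being rebuilt.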
[GitHub] flink issue #2294: [FLINK-4265] [dataset api] Add a NoOpOperator
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2294 @greghogan I am not sure I completely understand this. If I see it correctly, the no-op operator is in the API, but gets removed during the translation? It looks like a bit of a hacky approach to me. Can you try and explain how exactly it helps?
[GitHub] flink issue #2294: [FLINK-4265] [dataset api] Add a NoOpOperator
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/2294 @StephanEwen thoughts on how to proceed? Not looking to rush this. With this simple PR I can remove the use of javassist's `ProxyFactory` from Gelly. Or we can defer consideration of this PR until 1.2.0 and I can apply the temporary 90% fix to Gelly.
[GitHub] flink issue #2294: [FLINK-4265] [dataset api] Add a NoOpOperator
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/2294 @fhueske @ggevay what are your thoughts on adding this as an `@Internal` operator?