[jira] [Commented] (FLINK-10275) StreamTask support object reuse

2018-09-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613553#comment-16613553
 ] 

ASF GitHub Bot commented on FLINK-10275:


TisonKun commented on issue #6643: [FLINK-10275] StreamTask support object reuse
URL: https://github.com/apache/flink/pull/6643#issuecomment-421020528
 
 
   Although I don't think keeping a stale pull request open does any harm, this 
pull request would need to be updated as nearly a complete rework, so I am closing it.
   Thanks for the reminder, @kl0u!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> StreamTask support object reuse
> -------------------------------
>
>                 Key: FLINK-10275
>                 URL: https://issues.apache.org/jira/browse/FLINK-10275
>             Project: Flink
>          Issue Type: Improvement
>          Components: Streaming
>    Affects Versions: 1.7.0
>            Reporter: 陈梓立
>            Assignee: 陈梓立
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.7.0
>
>
> StreamTask should support efficient object reuse. The purpose behind this is to 
> reduce pressure on the garbage collector.
> All objects are reused, without backup copies. Operators and UDFs must be 
> careful not to keep any objects as state and not to modify them.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-10275) StreamTask support object reuse

2018-09-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613554#comment-16613554
 ] 

ASF GitHub Bot commented on FLINK-10275:


TisonKun closed pull request #6643: [FLINK-10275] StreamTask support object 
reuse
URL: https://github.com/apache/flink/pull/6643
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/docs/dev/execution_configuration.md b/docs/dev/execution_configuration.md
index f0103b0f39f..3991ab59ac5 100644
--- a/docs/dev/execution_configuration.md
+++ b/docs/dev/execution_configuration.md
@@ -59,7 +59,7 @@ With the closure cleaner disabled, it might happen that an anonymous user functi
 
 - `enableForceAvro()` / **`disableForceAvro()`**. Avro is not forced by default. Forces the Flink AvroTypeInformation to use the Avro serializer instead of Kryo for serializing Avro POJOs.
 
-- `enableObjectReuse()` / **`disableObjectReuse()`** By default, objects are not reused in Flink. Enabling the object reuse mode will instruct the runtime to reuse user objects for better performance. Keep in mind that this can lead to bugs when the user-code function of an operation is not aware of this behavior.
+- `enableObjectReuse()` / **`disableObjectReuse()`** By default, objects are not reused in Flink. Enabling the object reuse mode will instruct the runtime to reuse user objects for better performance. Keep in mind that this can lead to bugs when the user-defined function of an operation is not aware of this behavior.
 
 - **`enableSysoutLogging()`** / `disableSysoutLogging()` JobManager status updates are printed to `System.out` by default. This setting allows to disable this behavior.
 
diff --git a/flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java b/flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java
index 59fa803791a..d36fd296562 100644
--- a/flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java
+++ b/flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java
@@ -601,8 +601,8 @@ public boolean isForceAvroEnabled() {
 
     /**
      * Enables reusing objects that Flink internally uses for deserialization and passing
-     * data to user-code functions. Keep in mind that this can lead to bugs when the
-     * user-code function of an operation is not aware of this behaviour.
+     * data to user-defined functions. Keep in mind that this can lead to bugs when the
+     * user-defined function of an operation is not aware of this behaviour.
      */
     public ExecutionConfig enableObjectReuse() {
         objectReuse = true;
@@ -611,7 +611,7 @@ public ExecutionConfig enableObjectReuse() {
 
     /**
      * Disables reusing objects that Flink internally uses for deserialization and passing
-     * data to user-code functions. @see #enableObjectReuse()
+     * data to user-defined functions. @see #enableObjectReuse()
      */
     public ExecutionConfig disableObjectReuse() {
         objectReuse = false;
diff --git a/flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/batch/table/TableSinkITCase.scala b/flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/batch/table/TableSinkITCase.scala
index d8ba29ae478..f4185f4c129 100644
--- a/flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/batch/table/TableSinkITCase.scala
+++ b/flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/batch/table/TableSinkITCase.scala
@@ -51,7 +51,7 @@ class TableSinkITCase(
 val input = CollectionDataSets.get3TupleDataSet(env)
   .map(x => x).setParallelism(4) // increase DOP to 4
 
-val results = input.toTable(tEnv, 'a, 'b, 'c)
+input.toTable(tEnv, 'a, 'b, 'c)
   .where('a < 5 || 'a > 17)
   .select('c, 'b)
   .writeToSink(new CsvTableSink(path, fieldDelim = "|"))
diff --git a/flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/stream/table/TableSinkITCase.scala b/flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/stream/table/TableSinkITCase.scala
index 70e59f3d24d..95cb1df1b04 100644
--- a/flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/stream/table/TableSinkITCase.scala
+++ b/flink-libraries/flink-table/src/test/scala/org/apache/flink/table/runtime/stream/table/TableSinkITCase.scala
@@ -21,10 +21,12 @@ package org.apache.flink.table.runtime.stream.table
 import java.io.File
 import java.lang.{Boolean => JBool}
 
+import org.apache.flink.api.common.ExecutionConfig
 import org.apache.flink.api.common.functions.MapFunction
 im

[jira] [Commented] (FLINK-10275) StreamTask support object reuse

2018-09-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613544#comment-16613544
 ] 

ASF GitHub Bot commented on FLINK-10275:


kl0u commented on issue #6643: [FLINK-10275] StreamTask support object reuse
URL: https://github.com/apache/flink/pull/6643#issuecomment-421019224
 
 
   Hi @TisonKun,
   
   In the above discussion, there is consensus that until the related FLIP is 
agreed upon, there is not going to be any activity on this PR. Could you close 
this PR so that we keep a clean backlog of PRs?




[jira] [Commented] (FLINK-10275) StreamTask support object reuse

2018-09-13 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613150#comment-16613150
 ] 

ASF GitHub Bot commented on FLINK-10275:


TisonKun edited a comment on issue #6643: [FLINK-10275] StreamTask support 
object reuse
URL: https://github.com/apache/flink/pull/6643#issuecomment-420644895
 
 
   @StephanEwen you're right. After raising the exception mentioned above, I 
talked with our batch team, who pointed me to FLIP-21. So this thread 
is stalled until the corresponding patch from the batch side is available. It is 
still under discussion and experimentation.




[jira] [Commented] (FLINK-10275) StreamTask support object reuse

2018-09-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612681#comment-16612681
 ] 

ASF GitHub Bot commented on FLINK-10275:


StephanEwen commented on issue #6643: [FLINK-10275] StreamTask support object 
reuse
URL: https://github.com/apache/flink/pull/6643#issuecomment-420782868
 
 
   Yes, let's resolve FLIP-21. I will try to revive it in the next few days...




[jira] [Commented] (FLINK-10275) StreamTask support object reuse

2018-09-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612121#comment-16612121
 ] 

ASF GitHub Bot commented on FLINK-10275:


TisonKun commented on issue #6643: [FLINK-10275] StreamTask support object reuse
URL: https://github.com/apache/flink/pull/6643#issuecomment-420644895
 
 
   @StephanEwen you're right. After raising the exception mentioned above, I 
talked with our batch team, who pointed me to FLIP-21. So this thread 
is stalled until the corresponding patch from the batch side is available. It is 
still under discussion and experimentation.




[jira] [Commented] (FLINK-10275) StreamTask support object reuse

2018-09-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16611931#comment-16611931
 ] 

ASF GitHub Bot commented on FLINK-10275:


StephanEwen commented on issue #6643: [FLINK-10275] StreamTask support object 
reuse
URL: https://github.com/apache/flink/pull/6643#issuecomment-420611044
 
 
   I think we need to find consensus on FLIP-21 before going for this. 
Currently, object reuse in streaming means "no copy on chaining", which 
is very different from the semantics in the batch API.
   Simply reusing objects now will break many setups.
   
   Before we can merge this, we need to extend the semantics to differentiate 
between actual object reuse and avoiding copies on handover.
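
To make "actual object reuse" concrete, here is a minimal simulation in plain Java. It uses no Flink classes; `Event`, `runWithReuse`, and `runWithCopies` are hypothetical names chosen only for illustration, and the loop stands in for a runtime that writes every incoming record into one mutable instance. Under that assumption, an operator that buffers references ends up holding the same object several times.

```java
import java.util.ArrayList;
import java.util.List;

public class ObjectReuseSketch {

    /** Hypothetical mutable record type; stands in for a user POJO. */
    static final class Event {
        int value;
        Event(int value) { this.value = value; }
    }

    /** Simulates actual object reuse: one instance is overwritten for every record. */
    public static List<Integer> runWithReuse(int[] input) {
        List<Event> buffered = new ArrayList<>();
        Event reused = new Event(0);
        for (int v : input) {
            reused.value = v;      // "deserialize" into the same instance
            buffered.add(reused);  // the operator keeps a reference (the hazard)
        }
        return values(buffered);
    }

    /** Simulates no reuse: a fresh instance is created for every record. */
    public static List<Integer> runWithCopies(int[] input) {
        List<Event> buffered = new ArrayList<>();
        for (int v : input) {
            buffered.add(new Event(v));
        }
        return values(buffered);
    }

    private static List<Integer> values(List<Event> events) {
        List<Integer> out = new ArrayList<>();
        for (Event e : events) {
            out.add(e.value);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] input = {1, 2, 3};
        System.out.println(runWithReuse(input));   // [3, 3, 3]
        System.out.println(runWithCopies(input));  // [1, 2, 3]
    }
}
```

With reuse, every buffered reference points at the single instance, so only the last value survives. This is why a function that keeps records as state needs to copy them first when such a mode is on.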




[jira] [Commented] (FLINK-10275) StreamTask support object reuse

2018-09-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610679#comment-16610679
 ] 

ASF GitHub Bot commented on FLINK-10275:


kl0u commented on issue #6643: [FLINK-10275] StreamTask support object reuse
URL: https://github.com/apache/flink/pull/6643#issuecomment-420289620
 
 
   Sounds good.




[jira] [Commented] (FLINK-10275) StreamTask support object reuse

2018-09-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16610674#comment-16610674
 ] 

ASF GitHub Bot commented on FLINK-10275:


TisonKun commented on issue #6643: [FLINK-10275] StreamTask support object reuse
URL: https://github.com/apache/flink/pull/6643#issuecomment-420289282
 
 
   @kl0u Thanks for the comments! I will clean up this PR and highlight the main 
changes later. I have been working on other threads recently and will ping here once done.




[jira] [Commented] (FLINK-10275) StreamTask support object reuse

2018-09-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16602588#comment-16602588
 ] 

ASF GitHub Bot commented on FLINK-10275:


TisonKun commented on issue #6643: [FLINK-10275] StreamTask support object reuse
URL: https://github.com/apache/flink/pull/6643#issuecomment-418236948
 
 
   This pull request would affect the reuse of objects coming from the network. According to 
[FLIP-21](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=71012982),
 this is a different aspect of reuse from the one controlled by the current 
`ExecutionConfig#isObjectReuseEnabled` configuration.
   
   Travis fails on `TableSinkITCase#testAppendSinkOnAppendTableForInnerJoin`, 
which is caused by mutating inputValue. We need an extra config option to switch 
object reuse on the network side. Still digging...
   
   Suggestions are welcome :-)
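
As an illustration of how mutating an input value can break a consumer, here is a minimal sketch in plain Java. It is not the code of the failing test and uses no Flink classes; `Row`, `copy()`, and `secondConsumerSees` are hypothetical names. One record is handed to two consumers, and the first one modifies its input in place.

```java
public class HandoverSketch {

    /** Hypothetical mutable record; stands in for the input value discussed above. */
    static final class Row {
        String field;
        Row(String field) { this.field = field; }
        Row copy() { return new Row(field); }
    }

    /**
     * Hands one emitted record to two consumers. The first consumer mutates
     * its input; the method returns what the second consumer observes.
     */
    public static String secondConsumerSees(boolean copyOnHandover) {
        Row emitted = new Row("original");
        Row forFirst = copyOnHandover ? emitted.copy() : emitted;
        Row forSecond = copyOnHandover ? emitted.copy() : emitted;
        forFirst.field = "mutated";  // first consumer modifies its input in place
        return forSecond.field;
    }

    public static void main(String[] args) {
        System.out.println(secondConsumerSees(false)); // mutated
        System.out.println(secondConsumerSees(true));  // original
    }
}
```

Without a copy on handover, both consumers share one instance, so the mutation leaks across. A separate switch for this behavior would let users opt in only when their functions treat inputs as read-only.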




[jira] [Commented] (FLINK-10275) StreamTask support object reuse

2018-08-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16599068#comment-16599068
 ] 

ASF GitHub Bot commented on FLINK-10275:


TisonKun opened a new pull request #6643: [FLINK-10275] StreamTask support 
object reuse
URL: https://github.com/apache/flink/pull/6643
 
 
   ## What is the purpose of the change
   
   StreamTask should support efficient object reuse. The purpose behind this is to 
reduce pressure on the garbage collector.
   
   All objects are reused, without backup copies. Operators and UDFs must 
be careful not to keep any objects as state and not to modify them.
   
   ## Brief change log
   
   - With `ExecutionConfig#isObjectReuseEnabled` on, reuse the `StreamRecord` 
associated with the `StreamTask`.
   - Also clean up code encountered along the way. 
   
   
   ## Verifying this change
   
   Add cases to the unit tests `OneInputStreamTaskTest.java` and 
`TwoInputStreamTaskTest.java`.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
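
The garbage-collection argument in the description can be sketched with a small allocation counter in plain Java. This is only an illustration of the idea, not the change made in this pull request: `Wrapper`, `replace`, and `wrappersAllocated` are hypothetical names, and the counter exists purely to compare the two strategies.

```java
public class RecordWrapperSketch {

    /** Counts wrapper allocations so the two strategies can be compared. */
    static int allocations = 0;

    /** Hypothetical record wrapper whose payload can be swapped in place. */
    static final class Wrapper<T> {
        T value;
        Wrapper(T value) {
            this.value = value;
            allocations++;
        }
        Wrapper<T> replace(T newValue) {
            this.value = newValue;
            return this;
        }
    }

    /** Processes n elements and returns how many wrappers were allocated. */
    public static int wrappersAllocated(int n, boolean reuse) {
        allocations = 0;
        Wrapper<Integer> shared = null;
        long sum = 0;
        for (int i = 0; i < n; i++) {
            Wrapper<Integer> w;
            if (reuse) {
                // Allocate once, then overwrite the payload for each element.
                shared = (shared == null) ? new Wrapper<>(i) : shared.replace(i);
                w = shared;
            } else {
                w = new Wrapper<>(i);  // one short-lived wrapper per element
            }
            sum += w.value;  // stand-in for the operator's per-record work
        }
        return allocations;
    }

    public static void main(String[] args) {
        System.out.println(wrappersAllocated(1000, false)); // 1000
        System.out.println(wrappersAllocated(1000, true));  // 1
    }
}
```

Reusing the wrapper removes one allocation per record, which is where the reduced garbage-collector pressure would come from. The cost is that no code downstream may hold on to the wrapper after the call returns.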
   

