[jira] [Commented] (FLINK-32444) Enable object reuse for Flink SQL jobs by default

2023-11-30 Thread Stefan Richter (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17791536#comment-17791536
 ] 

Stefan Richter commented on FLINK-32444:


[~pnowojski] if there really is an issue with the heap backend, then we also 
need to be careful about what type of caching we can build for RocksDB in the 
future.

> Enable object reuse for Flink SQL jobs by default
> -
>
> Key: FLINK-32444
> URL: https://issues.apache.org/jira/browse/FLINK-32444
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Reporter: Jark Wu
>Priority: Major
> Fix For: 1.19.0
>
>
> Currently, object reuse is not enabled by default for Flink Streaming Jobs, 
> but is enabled by default for Flink Batch jobs. That is not consistent for 
> stream-batch unification. Besides, SQL operators are safe to enable object 
> reuse and this is a great performance improvement for SQL jobs. 
> We should also be careful with the Table-DataStream conversion case 
> (StreamTableEnvironment) which is not safe to enable object reuse by default. 
> Maybe we can just enable it for SQL Client/Gateway and TableEnvironment. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32444) Enable object reuse for Flink SQL jobs by default

2023-11-23 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789031#comment-17789031
 ] 

Piotr Nowojski commented on FLINK-32444:


Instead of checking for the configured state backend, I would add a getter 
to the StateBackend interface, like:
{code:java}
boolean StateBackend#storesObjectReferences(); // false for RocksDB, true for HashMap
{code}
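
A minimal sketch of how such a getter could drive the default, assuming a simplified stand-in interface. `SketchStateBackend`, the two implementing classes, and `ObjectReuseDefault.enableByDefault` are hypothetical names for illustration only; the method `storesObjectReferences()` is the proposal from this comment, not an existing Flink API.

```java
// Hypothetical stand-in for Flink's StateBackend; only the proposed getter is modeled.
interface SketchStateBackend {
    /** True if state values are kept as live object references (no serialization on write). */
    boolean storesObjectReferences();
}

// Models a heap-based backend that keeps references to the objects it is given.
final class SketchHashMapBackend implements SketchStateBackend {
    @Override
    public boolean storesObjectReferences() {
        return true;
    }
}

// Models a backend that serializes values on write, so it never holds caller-owned objects.
final class SketchRocksDbBackend implements SketchStateBackend {
    @Override
    public boolean storesObjectReferences() {
        return false;
    }
}

final class ObjectReuseDefault {
    private ObjectReuseDefault() {}

    /** Assumed policy: enable object reuse by default only if the backend does not hold references. */
    static boolean enableByDefault(SketchStateBackend backend) {
        return !backend.storesObjectReferences();
    }

    public static void main(String[] args) {
        System.out.println("HashMap backend: " + enableByDefault(new SketchHashMapBackend()));
        System.out.println("RocksDB backend: " + enableByDefault(new SketchRocksDbBackend()));
    }
}
```

Asking the backend for a capability, rather than comparing against a concrete backend class, would also cover custom or future backends without changing the caller.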



[jira] [Commented] (FLINK-32444) Enable object reuse for Flink SQL jobs by default

2023-11-08 Thread Timo Walther (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783921#comment-17783921
 ] 

Timo Walther commented on FLINK-32444:
--

If heap still causes issues, the easiest solution could be to check for the 
configured state backend?



[jira] [Commented] (FLINK-32444) Enable object reuse for Flink SQL jobs by default

2023-11-08 Thread Timo Walther (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783919#comment-17783919
 ] 

Timo Walther commented on FLINK-32444:
--

I was about to simply open a PR for this change, but then I found this comment 
here:
https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/test/scala/org/apache/flink/table/planner/runtime/utils/StreamingWithStateTestBase.scala#L52
{code}
  enableObjectReuse = state match {
    case HEAP_BACKEND => false // TODO heap statebackend not support obj reuse now.
    case ROCKSDB_BACKEND => true
  }
{code}

This also matches my memory of why we didn't enable it by default. Does 
anyone know whether something has changed in the meantime?
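
To illustrate the general hazard behind that TODO, here is a minimal sketch in plain Java. It assumes, without confirmation from the thread, that the problem is a state store keeping a reference to a record instance that an upstream operator later overwrites. `MutableRow` and `HeapStateAliasingDemo` are hypothetical names and do not model Flink's actual heap state backend.

```java
import java.util.HashMap;
import java.util.Map;

/** Mutable record that an upstream operator overwrites in place when object reuse is on. */
final class MutableRow {
    String key;
    long value;

    MutableRow copy() {
        MutableRow r = new MutableRow();
        r.key = key;
        r.value = value;
        return r;
    }
}

final class HeapStateAliasingDemo {
    private HeapStateAliasingDemo() {}

    /** Sends two logical records through one reused instance; returns what is stored for key "a". */
    static long storedValueForA(boolean copyOnWrite) {
        Map<String, MutableRow> heapState = new HashMap<>();
        MutableRow reused = new MutableRow();

        // First record: key "a", value 1.
        reused.key = "a";
        reused.value = 1L;
        heapState.put(reused.key, copyOnWrite ? reused.copy() : reused);

        // Second record overwrites the same instance in place.
        reused.key = "b";
        reused.value = 2L;
        heapState.put(reused.key, copyOnWrite ? reused.copy() : reused);

        return heapState.get("a").value;
    }

    public static void main(String[] args) {
        System.out.println("reference stored: " + storedValueForA(false));
        System.out.println("copy stored:      " + storedValueForA(true));
    }
}
```

When the reference is stored, the entry for key "a" silently picks up the second record's value; a store that serializes or copies on write does not have this problem.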



[jira] [Commented] (FLINK-32444) Enable object reuse for Flink SQL jobs by default

2023-11-03 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782658#comment-17782658
 ] 

Piotr Nowojski commented on FLINK-32444:


{quote}
Does it give us a performance benefit? 
{quote}
Yes. On one job that I've looked into recently, a subtask that reads from 
Kafka, filters/projects records and does local windowed aggregation with 
object reuse disabled spends between 25% and 50% of its time inside 
{{CopyingChainingOutput}}.

If there are no correctness issues with built-in operators/functions in Flink 
SQL, I would also give a big +1 for enabling reuse by default.
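
A rough sketch of where that cost comes from, assuming a simplified model in which each hand-over between chained operators either deep-copies the record or passes the same instance along. `ChainedOutputSketch` and its methods are hypothetical; this is not the code of Flink's `CopyingChainingOutput`.

```java
import java.util.function.Consumer;

final class ChainedOutputSketch {
    private ChainedOutputSketch() {}

    /** Number of deep copies made during the most recent run. */
    static int copies = 0;

    static int[] deepCopy(int[] record) {
        copies++;
        return record.clone();
    }

    /** Copying hand-over: the next operator always receives its own copy. */
    static Consumer<int[]> copying(Consumer<int[]> downstream) {
        return record -> downstream.accept(deepCopy(record));
    }

    /** Reusing hand-over: the same instance is passed to the next operator. */
    static Consumer<int[]> reusing(Consumer<int[]> downstream) {
        return downstream;
    }

    /** Pushes `records` records through `handOvers` chained hand-overs and returns the copy count. */
    static int copiesFor(boolean objectReuse, int records, int handOvers) {
        copies = 0;
        Consumer<int[]> chain = record -> { /* sink: discard the record */ };
        for (int i = 0; i < handOvers; i++) {
            chain = objectReuse ? reusing(chain) : copying(chain);
        }
        int[] record = new int[8];
        for (int i = 0; i < records; i++) {
            record[0] = i; // the source overwrites one instance for every element
            chain.accept(record);
        }
        return copies;
    }

    public static void main(String[] args) {
        System.out.println("copies without reuse: " + copiesFor(false, 1000, 2));
        System.out.println("copies with reuse:    " + copiesFor(true, 1000, 2));
    }
}
```

In this model the copy count grows with records times hand-overs when reuse is off and stays at zero when it is on, which is consistent with a cheap filter/project chain spending a large share of its time on copying.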



[jira] [Commented] (FLINK-32444) Enable object reuse for Flink SQL jobs by default

2023-11-03 Thread Timo Walther (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782654#comment-17782654
 ] 

Timo Walther commented on FLINK-32444:
--

[~jark] is there a reason why you haven't implemented this yet? Are there 
known issues? I guess this would be very low-hanging fruit for performance if 
it causes no issues.



[jira] [Commented] (FLINK-32444) Enable object reuse for Flink SQL jobs by default

2023-06-27 Thread Benchao Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17737739#comment-17737739
 ] 

Benchao Li commented on FLINK-32444:


Big +1 on this. We've enabled it for all production jobs and got a very good 
performance improvement.

> Enable object reuse for Flink SQL jobs by default
> -
>
> Key: FLINK-32444
> URL: https://issues.apache.org/jira/browse/FLINK-32444
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Reporter: Jark Wu
>Priority: Major
> Fix For: 1.18.0
>
>
> Currently, object reuse is not enabled by default for Flink Streaming Jobs, 
> but is enabled by default for Flink Batch jobs. That is not consistent for 
> stream-batch unification. Besides, SQL operators are safe to enable object 
> reuse and this is a great performance improvement for SQL jobs. 
> We should also be careful with the Table-DataStream conversion case 
> (StreamTableEnvironment) which is not safe to enable object reuse by default. 
> Maybe we can just enable it for SQL Client/Gateway and TableEnvironment. 





[jira] [Commented] (FLINK-32444) Enable object reuse for Flink SQL jobs by default

2023-06-27 Thread lincoln lee (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17737699#comment-17737699
 ] 

lincoln lee commented on FLINK-32444:
-

[~jark] Cool! This would be beneficial for SQL users.



[jira] [Commented] (FLINK-32444) Enable object reuse for Flink SQL jobs by default

2023-06-26 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17737470#comment-17737470
 ] 

Jark Wu commented on FLINK-32444:
-

cc [~lincoln.86xy], [~lsy], [~twalthr] what do you think?
