[jira] [Commented] (FLINK-7251) Merge the flink-java8 project into flink-core

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548808#comment-16548808
 ] 

ASF GitHub Bot commented on FLINK-7251:
---

Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/6120#discussion_r203605524
  
--- Diff: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/KeyedStream.java ---
@@ -545,15 +543,15 @@ public IntervalJoined(

    TypeInformation<OUT> resultType = TypeExtractor.getBinaryOperatorReturnType(
        cleanedUdf,
-       ProcessJoinFunction.class,    // ProcessJoinFunction
-       0,                            // 0    1    2
+       ProcessJoinFunction.class,
+       0,
        1,
        2,
-       TypeExtractor.NO_INDEX,       // output arg indices
-       left.getType(),               // input 1 type information
-       right.getType(),              // input 2 type information
-       INTERVAL_JOIN_FUNC_NAME,
-       false
+       TypeExtractor.NO_INDEX,
+       left.getType(),
+       right.getType(),
+       Utils.getCallLocationName(),
+       true
--- End diff --

Just saw that this should be false. Will correct that.


> Merge the flink-java8 project into flink-core
> -
>
> Key: FLINK-7251
> URL: https://issues.apache.org/jira/browse/FLINK-7251
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Stephan Ewen
>Assignee: Timo Walther
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #6120: [FLINK-7251] [types] Remove the flink-java8 module...

2018-07-18 Thread twalthr
Github user twalthr commented on a diff in the pull request:

https://github.com/apache/flink/pull/6120#discussion_r203605524
  
--- Diff: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/KeyedStream.java ---
@@ -545,15 +543,15 @@ public IntervalJoined(

    TypeInformation<OUT> resultType = TypeExtractor.getBinaryOperatorReturnType(
        cleanedUdf,
-       ProcessJoinFunction.class,    // ProcessJoinFunction
-       0,                            // 0    1    2
+       ProcessJoinFunction.class,
+       0,
        1,
        2,
-       TypeExtractor.NO_INDEX,       // output arg indices
-       left.getType(),               // input 1 type information
-       right.getType(),              // input 2 type information
-       INTERVAL_JOIN_FUNC_NAME,
-       false
+       TypeExtractor.NO_INDEX,
+       left.getType(),
+       right.getType(),
+       Utils.getCallLocationName(),
+       true
--- End diff --

Just saw that this should be false. Will correct that.


---


[jira] [Commented] (FLINK-9875) Add concurrent creation of execution job vertex

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548797#comment-16548797
 ] 

ASF GitHub Bot commented on FLINK-9875:
---

Github user tison1 commented on the issue:

https://github.com/apache/flink/pull/6353
  
Fixed the unstable test case. The problem is the code below, which may assign 
`constraints` to a long array and then to a short one, causing an out-of-index 
exception. To solve it, we can initialize `constraints` in the constructor.


https://github.com/apache/flink/blob/056486a1b81e9648a6d3dc795e7e2c6976f8388c/flink-runtime/src/main/java/org/apache/flink/runtime/jobmanager/scheduler/CoLocationGroup.java#L87-L91
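
For illustration, a simplified sketch of the race (hypothetical code modelled with a List, not the actual CoLocationGroup implementation): two callers both observe the uninitialized field, one publishes a long list, the other then overwrites it with a short one, and a later index-based read fails.

{code:java}
import java.util.ArrayList;
import java.util.List;

class LazyConstraints {
    private List<Object> constraints; // initialized lazily, the root of the race

    Object getConstraint(int index) {
        ensureConstraints(index + 1);
        // If another caller re-published a shorter list in between,
        // this throws IndexOutOfBoundsException.
        return constraints.get(index);
    }

    private void ensureConstraints(int num) {
        if (constraints == null) { // two threads can both pass this check
            List<Object> fresh = new ArrayList<>();
            for (int i = 0; i < num; i++) {
                fresh.add(new Object());
            }
            constraints = fresh; // the later (possibly shorter) publication wins
        } else {
            while (constraints.size() < num) {
                constraints.add(new Object());
            }
        }
    }
}

// The fix suggested above: create the list once in the constructor so the
// field is never re-assigned after construction.
class EagerConstraints {
    private final List<Object> constraints = new ArrayList<>();

    Object getConstraint(int index) {
        while (constraints.size() <= index) {
            constraints.add(new Object());
        }
        return constraints.get(index);
    }
}
{code}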


> Add concurrent creation of execution job vertex
> ---
>
> Key: FLINK-9875
> URL: https://issues.apache.org/jira/browse/FLINK-9875
> Project: Flink
>  Issue Type: Improvement
>  Components: Local Runtime
>Affects Versions: 1.5.1
>Reporter: 陈梓立
>Assignee: 陈梓立
>Priority: Major
>  Labels: pull-request-available
>
> In some cases, such as input-format vertices, the creation of an execution job
> vertex is time-consuming. This PR adds concurrent creation of execution job 
> vertices to accelerate it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6353: [FLINK-9875] Add concurrent creation of execution job ver...

2018-07-18 Thread tison1
Github user tison1 commented on the issue:

https://github.com/apache/flink/pull/6353
  
Fixed the unstable test case. The problem is the code below, which may assign 
`constraints` to a long array and then to a short one, causing an out-of-index 
exception. To solve it, we can initialize `constraints` in the constructor.


https://github.com/apache/flink/blob/056486a1b81e9648a6d3dc795e7e2c6976f8388c/flink-runtime/src/main/java/org/apache/flink/runtime/jobmanager/scheduler/CoLocationGroup.java#L87-L91


---


[jira] [Commented] (FLINK-4534) Lack of synchronization in BucketingSink#restoreState()

2018-07-18 Thread zhangminglei (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-4534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548759#comment-16548759
 ] 

zhangminglei commented on FLINK-4534:
-

Hi, [~gjy] Can we use a lightweight synchronization mechanism to solve this? 
For example, use {{volatile}} to avoid this issue.
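
For illustration, a minimal sketch of the alternative the description implies: guard the iteration with the same monitor the other methods use. Note that a {{volatile}} field alone would only publish the reference, not protect the iteration itself. This is a sketch only; {{state}}, {{BucketState}}, and the handler are assumed from the issue description, not copied from the actual BucketingSink code.

{code:java}
private <T> void restoreBucketStates(State<T> state) {
    synchronized (state.bucketStates) {
        for (BucketState<T> bucketState : state.bucketStates.values()) {
            handleRestoredBucketState(bucketState); // hypothetical handler
        }
    }
}
{code}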

> Lack of synchronization in BucketingSink#restoreState()
> ---
>
> Key: FLINK-4534
> URL: https://issues.apache.org/jira/browse/FLINK-4534
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Reporter: Ted Yu
>Assignee: zhangminglei
>Priority: Major
>
> Iteration over state.bucketStates is protected by synchronization in other 
> methods, except for the following in restoreState():
> {code}
> for (BucketState<T> bucketState : state.bucketStates.values()) {
> {code}
> and the following in close():
> {code}
> for (Map.Entry<String, BucketState<T>> entry : state.bucketStates.entrySet()) {
>   closeCurrentPartFile(entry.getValue());
> {code}
> w.r.t. bucketState.pendingFilesPerCheckpoint, there is a similar issue 
> starting at line 752:
> {code}
>   Set<Long> pastCheckpointIds = bucketState.pendingFilesPerCheckpoint.keySet();
>   LOG.debug("Moving pending files to final location on restore.");
>   for (Long pastCheckpointId : pastCheckpointIds) {
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9675) Avoid FileInputStream/FileOutputStream

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548748#comment-16548748
 ] 

ASF GitHub Bot commented on FLINK-9675:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/6335
  
Hi, @hequn8128 I will create another JIRA for the ```new InputStreamReader``` 
refactor. What do you think?


> Avoid FileInputStream/FileOutputStream
> --
>
> Key: FLINK-9675
> URL: https://issues.apache.org/jira/browse/FLINK-9675
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: zhangminglei
>Priority: Minor
>  Labels: filesystem, pull-request-available
>
> They rely on finalizers (before Java 11), which create unnecessary GC load.
> The alternatives, Files.newInputStream, are as easy to use and don't have 
> this issue.
> And here is a benchmark 
> https://arnaudroger.github.io/blog/2017/03/20/faster-reader-inpustream-in-java.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6335: [FLINK-9675] [fs] Avoid FileInputStream/FileOutputStream

2018-07-18 Thread zhangminglei
Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/6335
  
Hi, @hequn8128 I will create another JIRA for the ```new InputStreamReader``` 
refactor. What do you think?


---


[jira] [Commented] (FLINK-9236) Use Apache Parent POM 19

2018-07-18 Thread Stephen Jason (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548735#comment-16548735
 ] 

Stephen Jason commented on FLINK-9236:
--

Yes, but I can't find [~jiayichao] in the name list. [~fhueske], could you 
please add him to the contributor list?

> Use Apache Parent POM 19
> 
>
> Key: FLINK-9236
> URL: https://issues.apache.org/jira/browse/FLINK-9236
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Ted Yu
>Assignee: Stephen Jason
>Priority: Major
>
> Flink is still using Apache Parent POM 18. Apache Parent POM 19 is out.
> This will also fix Javadoc generation with JDK 10+



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (FLINK-9236) Use Apache Parent POM 19

2018-07-18 Thread Stephen Jason (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Jason reassigned FLINK-9236:


Assignee: Stephen Jason

> Use Apache Parent POM 19
> 
>
> Key: FLINK-9236
> URL: https://issues.apache.org/jira/browse/FLINK-9236
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Ted Yu
>Assignee: Stephen Jason
>Priority: Major
>
> Flink is still using Apache Parent POM 18. Apache Parent POM 19 is out.
> This will also fix Javadoc generation with JDK 10+



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #6367: [FLINK-9850] Add a string to the print method to i...

2018-07-18 Thread hequn8128
Github user hequn8128 commented on a diff in the pull request:

https://github.com/apache/flink/pull/6367#discussion_r203588835
  
--- Diff: flink-streaming-scala/src/main/scala/org/apache/flink/streaming/api/scala/DataStream.scala ---
@@ -959,6 +959,29 @@ class DataStream[T](stream: JavaStream[T]) {
   @PublicEvolving
   def printToErr() = stream.printToErr()
 
+  /**
+* Writes a DataStream to the standard output stream (stdout). For each
+* element of the DataStream the result of .toString is
--- End diff --

.toString => [[AnyRef.toString()]]


---


[jira] [Assigned] (FLINK-9236) Use Apache Parent POM 19

2018-07-18 Thread Stephen Jason (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Jason reassigned FLINK-9236:


Assignee: (was: Stephen Jason)

> Use Apache Parent POM 19
> 
>
> Key: FLINK-9236
> URL: https://issues.apache.org/jira/browse/FLINK-9236
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Ted Yu
>Priority: Major
>
> Flink is still using Apache Parent POM 18. Apache Parent POM 19 is out.
> This will also fix Javadoc generation with JDK 10+



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9850) Add a string to the print method to identify output for DataStream

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548718#comment-16548718
 ] 

ASF GitHub Bot commented on FLINK-9850:
---

Github user hequn8128 commented on a diff in the pull request:

https://github.com/apache/flink/pull/6367#discussion_r203588835
  
--- Diff: flink-streaming-scala/src/main/scala/org/apache/flink/streaming/api/scala/DataStream.scala ---
@@ -959,6 +959,29 @@ class DataStream[T](stream: JavaStream[T]) {
   @PublicEvolving
   def printToErr() = stream.printToErr()
 
+  /**
+* Writes a DataStream to the standard output stream (stdout). For each
+* element of the DataStream the result of .toString is
--- End diff --

.toString => [[AnyRef.toString()]]


> Add a string to the print method to identify output for DataStream
> --
>
> Key: FLINK-9850
> URL: https://issues.apache.org/jira/browse/FLINK-9850
> Project: Flink
>  Issue Type: New Feature
>  Components: DataStream API
>Reporter: Hequn Cheng
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
>
> The output of the print method of {{DataSet}} allows the user to supply a 
> String to identify the output (see 
> [FLINK-1486|https://issues.apache.org/jira/browse/FLINK-1486]). But 
> {{DataStream}} doesn't support this yet. It is valuable to add this feature for 
> {{DataStream}}.
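
For illustration, a sketch of the requested usage (hypothetical call; the exact signature is defined by the pull request, with {{DataSet#print(String)}} as the existing precedent):

{code:java}
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<String> words = env.fromElements("hello", "flink");

words.print();         // existing behavior: plain stdout sink
words.print("words");  // requested (sketch): prefix each element with the
                       // identifier, mirroring DataSet#print(String)
{code}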



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9236) Use Apache Parent POM 19

2018-07-18 Thread jiayichao (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548715#comment-16548715
 ] 

jiayichao commented on FLINK-9236:
--

Hi [~Stephen Jason], could you assign this issue to me?

> Use Apache Parent POM 19
> 
>
> Key: FLINK-9236
> URL: https://issues.apache.org/jira/browse/FLINK-9236
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Ted Yu
>Assignee: Stephen Jason
>Priority: Major
>
> Flink is still using Apache Parent POM 18. Apache Parent POM 19 is out.
> This will also fix Javadoc generation with JDK 10+



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9675) Avoid FileInputStream/FileOutputStream

2018-07-18 Thread zhangminglei (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangminglei updated FLINK-9675:

Description: 
They rely on finalizers (before Java 11), which create unnecessary GC load.

The alternatives, Files.newInputStream, are as easy to use and don't have this 
issue.

And here is a benchmark 
https://arnaudroger.github.io/blog/2017/03/20/faster-reader-inpustream-in-java.html

  was:
They rely on finalizers (before Java 11), which create unnecessary GC load.

The alternatives, Files.newInputStream, are as easy to use and don't have this 
issue.


> Avoid FileInputStream/FileOutputStream
> --
>
> Key: FLINK-9675
> URL: https://issues.apache.org/jira/browse/FLINK-9675
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: zhangminglei
>Priority: Minor
>  Labels: filesystem, pull-request-available
>
> They rely on finalizers (before Java 11), which create unnecessary GC load.
> The alternatives, Files.newInputStream, are as easy to use and don't have 
> this issue.
> And here is a benchmark 
> https://arnaudroger.github.io/blog/2017/03/20/faster-reader-inpustream-in-java.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9675) Avoid FileInputStream/FileOutputStream

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548677#comment-16548677
 ] 

ASF GitHub Bot commented on FLINK-9675:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/6335
  
Thank you very much, @hequn8128! I will take a look.


> Avoid FileInputStream/FileOutputStream
> --
>
> Key: FLINK-9675
> URL: https://issues.apache.org/jira/browse/FLINK-9675
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: zhangminglei
>Priority: Minor
>  Labels: filesystem, pull-request-available
>
> They rely on finalizers (before Java 11), which create unnecessary GC load.
> The alternatives, Files.newInputStream, are as easy to use and don't have 
> this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6335: [FLINK-9675] [fs] Avoid FileInputStream/FileOutputStream

2018-07-18 Thread zhangminglei
Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/6335
  
Thank you very much, @hequn8128! I will take a look.


---


[jira] [Commented] (FLINK-9735) Potential resource leak in RocksDBStateBackend#getDbOptions

2018-07-18 Thread vinoyang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548670#comment-16548670
 ] 

vinoyang commented on FLINK-9735:
-

Will process this issue soon.

> Potential resource leak in RocksDBStateBackend#getDbOptions
> ---
>
> Key: FLINK-9735
> URL: https://issues.apache.org/jira/browse/FLINK-9735
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: vinoyang
>Priority: Minor
>
> Here is related code:
> {code}
> if (optionsFactory != null) {
>   opt = optionsFactory.createDBOptions(opt);
> }
> {code}
> opt, a DBOptions instance, should be closed before being rewritten.
> getColumnOptions has a similar issue.
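
For illustration, a minimal sketch of the short-term fix, assuming the RocksDB {{DBOptions}} API; {{optionsFactory}} is taken from the snippet above and {{buildDefaultDbOptions()}} is a hypothetical helper, not the actual getDbOptions() code.

{code:java}
import org.rocksdb.DBOptions;

private DBOptions getDbOptionsWithoutLeak() {
    DBOptions opt = buildDefaultDbOptions(); // hypothetical helper for the base options
    if (optionsFactory != null) {
        DBOptions created = optionsFactory.createDBOptions(opt);
        if (created != opt) {
            opt.close(); // release the native handle instead of leaking it
        }
        opt = created;
    }
    return opt;
}
{code}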



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9849) Upgrade hbase version to 2.0.1 for hbase connector

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548654#comment-16548654
 ] 

ASF GitHub Bot commented on FLINK-9849:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/6365
  
Thanks @yanghua, you are right! I just want to wait for the Travis build to 
finish, then give the old and new versions' dependency trees.


> Upgrade hbase version to 2.0.1 for hbase connector
> --
>
> Key: FLINK-9849
> URL: https://issues.apache.org/jira/browse/FLINK-9849
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: zhangminglei
>Priority: Major
>  Labels: pull-request-available
>
> Currently hbase 1.4.3 is used for hbase connector.
> We should upgrade to 2.0.1 which was recently released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6365: [FLINK-9849] [hbase] Hbase upgrade to 2.0.1

2018-07-18 Thread zhangminglei
Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/6365
  
Thanks @yanghua, you are right! I just want to wait for the Travis build to 
finish, then give the old and new versions' dependency trees.


---


[jira] [Commented] (FLINK-9061) add entropy to s3 path for better scalability

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548629#comment-16548629
 ] 

ASF GitHub Bot commented on FLINK-9061:
---

Github user mdaxini commented on the issue:

https://github.com/apache/flink/pull/6302
  
@indrc 

In addition to removing the additional dependency, I think there should be 
a test validating the randomness of the chosen algorithm and making sure 
there are no conflicts. For a Flink job with a large number of Task Managers and 
state in terabytes, resulting in several files, there could be a possibility of 
conflicts with the current random string generation method. 

fyi @StefanRRichter @StephanEwen 


> add entropy to s3 path for better scalability
> -
>
> Key: FLINK-9061
> URL: https://issues.apache.org/jira/browse/FLINK-9061
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystem, State Backends, Checkpointing
>Affects Versions: 1.5.0, 1.4.2
>Reporter: Jamie Grier
>Assignee: Indrajit Roychoudhury
>Priority: Critical
>  Labels: pull-request-available
>
> I think we need to modify the way we write checkpoints to S3 for high-scale 
> jobs (those with many total tasks).  The issue is that we are writing all the 
> checkpoint data under a common key prefix.  This is the worst case scenario 
> for S3 performance since the key is used as a partition key.
>  
> In the worst case checkpoints fail with a 500 status code coming back from S3 
> and an internal error type of TooBusyException.
>  
> One possible solution would be to add a hook in the Flink filesystem code 
> that allows me to "rewrite" paths.  For example say I have the checkpoint 
> directory set to:
>  
> s3://bucket/flink/checkpoints
>  
> I would hook that and rewrite that path to:
>  
> s3://bucket/[HASH]/flink/checkpoints, where HASH is the hash of the original 
> path
>  
> This would distribute the checkpoint write load around the S3 cluster evenly.
>  
> For reference: 
> https://aws.amazon.com/premiumsupport/knowledge-center/s3-bucket-performance-improve/
>  
> Any other people hit this issue?  Any other ideas for solutions?  This is a 
> pretty serious problem for people trying to checkpoint to S3.
>  
> -Jamie
>  
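
For illustration, a hypothetical sketch of the rewrite described above (not the pull request's implementation): derive a short, stable hash from the original path and splice it in after the bucket, so writes spread across S3 partitions.

{code:java}
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

static String addEntropy(String path) throws NoSuchAlgorithmException {
    // expects a path like "s3://bucket/flink/checkpoints"
    int schemeEnd = path.indexOf("://") + 3;
    int bucketEnd = path.indexOf('/', schemeEnd);
    byte[] digest = MessageDigest.getInstance("SHA-256")
        .digest(path.getBytes(StandardCharsets.UTF_8));
    String hash = String.format("%02x%02x", digest[0], digest[1]); // 4 hex chars
    // "s3://bucket/flink/checkpoints" -> "s3://bucket/ab12/flink/checkpoints"
    return path.substring(0, bucketEnd) + "/" + hash + path.substring(bucketEnd);
}
{code}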



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6302: [FLINK-9061][checkpointing] add entropy to s3 path for be...

2018-07-18 Thread mdaxini
Github user mdaxini commented on the issue:

https://github.com/apache/flink/pull/6302
  
@indrc 

In addition to removing the additional dependency, I think there should be 
a test validating the randomness of the chosen algorithm and making sure 
there are no conflicts. For a Flink job with a large number of Task Managers and 
state in terabytes, resulting in several files, there could be a possibility of 
conflicts with the current random string generation method. 

fyi @StefanRRichter @StephanEwen 


---


[jira] [Updated] (FLINK-9236) Use Apache Parent POM 19

2018-07-18 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated FLINK-9236:
--
Description: 
Flink is still using Apache Parent POM 18. Apache Parent POM 19 is out.


This will also fix Javadoc generation with JDK 10+

  was:
Flink is still using Apache Parent POM 18. Apache Parent POM 19 is out.

This will also fix Javadoc generation with JDK 10+


> Use Apache Parent POM 19
> 
>
> Key: FLINK-9236
> URL: https://issues.apache.org/jira/browse/FLINK-9236
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Reporter: Ted Yu
>Assignee: Stephen Jason
>Priority: Major
>
> Flink is still using Apache Parent POM 18. Apache Parent POM 19 is out.
> This will also fix Javadoc generation with JDK 10+



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (FLINK-9735) Potential resource leak in RocksDBStateBackend#getDbOptions

2018-07-18 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540754#comment-16540754
 ] 

Ted Yu edited comment on FLINK-9735 at 7/19/18 12:37 AM:
-

Short term, we should fix the leaked DBOptions instance by releasing it.


was (Author: yuzhih...@gmail.com):
Short term, we should fix the leaked DBOptions instance.

> Potential resource leak in RocksDBStateBackend#getDbOptions
> ---
>
> Key: FLINK-9735
> URL: https://issues.apache.org/jira/browse/FLINK-9735
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: vinoyang
>Priority: Minor
>
> Here is related code:
> {code}
> if (optionsFactory != null) {
>   opt = optionsFactory.createDBOptions(opt);
> }
> {code}
> opt, a DBOptions instance, should be closed before being rewritten.
> getColumnOptions has a similar issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6259: [FLINK-9679] Implement AvroSerializationSchema

2018-07-18 Thread medcv
Github user medcv commented on the issue:

https://github.com/apache/flink/pull/6259
  
@dawidwys 
In the last commit, I extended `SchemaCoder` to have `getSchemaId`, as you 
suggested.


---


[jira] [Commented] (FLINK-9679) Implement AvroSerializationSchema

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548595#comment-16548595
 ] 

ASF GitHub Bot commented on FLINK-9679:
---

Github user medcv commented on the issue:

https://github.com/apache/flink/pull/6259
  
@dawidwys 
In the last commit, I extended `SchemaCoder` to have `getSchemaId`, as you 
suggested.


> Implement AvroSerializationSchema
> -
>
> Key: FLINK-9679
> URL: https://issues.apache.org/jira/browse/FLINK-9679
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.6.0
>Reporter: Yazdan Shirvany
>Assignee: Yazdan Shirvany
>Priority: Major
>  Labels: pull-request-available
>
> Implement AvroSerializationSchema using Confluent Schema Registry



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6364: [hotfix] typo for SqlExecutionException msg

2018-07-18 Thread tison1
Github user tison1 commented on the issue:

https://github.com/apache/flink/pull/6364
  
Thanks! LGTM

cc @zentol 


---


[GitHub] flink issue #6360: [FLINK-9884] [runtime] fix slot request may not be remove...

2018-07-18 Thread tison1
Github user tison1 commented on the issue:

https://github.com/apache/flink/pull/6360
  
@shuai-xu 
It makes sense.
The message that the TM has successfully allocated the slot might be lost in transport.
When the slot manager receives a slot status report saying a slot carries an 
allocation id unrelated to this offer, the slot has been allocated to another 
slot request.
It looks like this PR prevents the runtime from a potential resource leak, 
doesn't it?


---


[jira] [Commented] (FLINK-9884) Slot request may not be removed when it has already been assigned in slot manager

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548559#comment-16548559
 ] 

ASF GitHub Bot commented on FLINK-9884:
---

Github user tison1 commented on the issue:

https://github.com/apache/flink/pull/6360
  
@shuai-xu 
It makes sense.
The message that the TM has successfully allocated the slot might be lost in transport.
When the slot manager receives a slot status report saying a slot carries an 
allocation id unrelated to this offer, the slot has been allocated to another 
slot request.
It looks like this PR prevents the runtime from a potential resource leak, 
doesn't it?


> Slot request may not be removed when it has already been assigned in slot 
> manager
> ---
>
> Key: FLINK-9884
> URL: https://issues.apache.org/jira/browse/FLINK-9884
> Project: Flink
>  Issue Type: Bug
>  Components: Cluster Management
>Reporter: shuai.xu
>Assignee: shuai.xu
>Priority: Major
>  Labels: pull-request-available
>
> When a task executor reports slotA with allocationId1, it may happen that the slot 
> manager has recorded slotA as assigned to allocationId2 while the slot request with 
> allocationId1 is unassigned. The slot manager will then update itself with 
> slotA assigned to allocationId1, but it does not clear the slot request with 
> allocationId1.
> For example:
>  # There is one free slot in the slot manager.
>  # Two slot requests arrive, with allocationId1 and allocationId2.
>  # The slot is assigned to allocationId1, but the requestSlot call times out.
>  # The SlotManager assigns the slot to allocationId2 and inserts a slot request 
> with allocationId1.
>  # The second requestSlot call to the task executor returns SlotOccupiedException.
>  # The SlotManager updates the slot to allocationId1, but the slot request is left behind.
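
For illustration, a sketch of the missing bookkeeping step (hypothetical structures, not the actual SlotManager code): when a slot report re-assigns a slot to an allocation id, the pending request carrying that same id should be removed as well.

{code:java}
import java.util.HashMap;
import java.util.Map;

class SlotBookkeepingSketch {
    private final Map<String, String> allocationIdBySlotId = new HashMap<>();
    private final Map<String, String> pendingRequestByAllocationId = new HashMap<>();

    void onSlotReport(String slotId, String reportedAllocationId) {
        // record that the slot is now owned by the reported allocation
        allocationIdBySlotId.put(slotId, reportedAllocationId);
        // the step this issue says is missing: clear the stale pending request
        pendingRequestByAllocationId.remove(reportedAllocationId);
    }
}
{code}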



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9679) Implement AvroSerializationSchema

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548479#comment-16548479
 ] 

ASF GitHub Bot commented on FLINK-9679:
---

Github user medcv commented on the issue:

https://github.com/apache/flink/pull/6259
  
@dawidwys For the second issue, I am looking at other schema registries and 
trying to extend `SchemaCoder`.


> Implement AvroSerializationSchema
> -
>
> Key: FLINK-9679
> URL: https://issues.apache.org/jira/browse/FLINK-9679
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.6.0
>Reporter: Yazdan Shirvany
>Assignee: Yazdan Shirvany
>Priority: Major
>  Labels: pull-request-available
>
> Implement AvroSerializationSchema using Confluent Schema Registry



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6259: [FLINK-9679] Implement AvroSerializationSchema

2018-07-18 Thread medcv
Github user medcv commented on the issue:

https://github.com/apache/flink/pull/6259
  
@dawidwys For the second issue, I am looking at other schema registries and 
trying to extend `SchemaCoder`.


---


[jira] [Commented] (FLINK-9679) Implement AvroSerializationSchema

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548425#comment-16548425
 ] 

ASF GitHub Bot commented on FLINK-9679:
---

Github user medcv commented on the issue:

https://github.com/apache/flink/pull/6259
  
@dawidwys  Thanks!

As far as I dug into the Confluent code, their API needs a `subject` to retrieve 
the schema id and version, and it should be provided by the consumer. 


https://github.com/confluentinc/schema-registry/blob/master/client/src/main/java/io/confluent/kafka/schemaregistry/client/SchemaRegistryClient.java#L30

The purpose of the new commit is to address your first comment by removing the 
`topic` name from the serialization constructor and replacing it with `subject`. 
This way the serializer doesn't need to know about the `topic` name.

If you still see issues with this approach, I would appreciate it if you could 
help me find a better solution.


 



> Implement AvroSerializationSchema
> -
>
> Key: FLINK-9679
> URL: https://issues.apache.org/jira/browse/FLINK-9679
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.6.0
>Reporter: Yazdan Shirvany
>Assignee: Yazdan Shirvany
>Priority: Major
>  Labels: pull-request-available
>
> Implement AvroSerializationSchema using Confluent Schema Registry



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6259: [FLINK-9679] Implement AvroSerializationSchema

2018-07-18 Thread medcv
Github user medcv commented on the issue:

https://github.com/apache/flink/pull/6259
  
@dawidwys  Thanks!

As far as I dug into the Confluent code, their API needs a `subject` to retrieve 
the schema id and version, and it should be provided by the consumer. 


https://github.com/confluentinc/schema-registry/blob/master/client/src/main/java/io/confluent/kafka/schemaregistry/client/SchemaRegistryClient.java#L30

The purpose of the new commit is to address your first comment by removing the 
`topic` name from the serialization constructor and replacing it with `subject`. 
This way the serializer doesn't need to know about the `topic` name.

If you still see issues with this approach, I would appreciate it if you could 
help me find a better solution.


 



---


[jira] [Assigned] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-07-18 Thread Shuyi Chen (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuyi Chen reassigned FLINK-9891:
-

Assignee: Shuyi Chen

> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Blocker
>
> We are using neither session mode nor detached mode. The command to run a Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 
> 2018-07-18 12:47:09,915 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
> YarnSessionClusterEntrypoint (Version: 1.5.1, Rev:3488f8b, Date:10.07.2018 @ 
> 11:51:27 GMT)
> ...
> {code}
> Issues:
>  # Flink job is running as a Flink session
>  # Ctrl+C or 'stop' doesn't stop the job or the YARN cluster
>  # Cancelling the job via the Job Manager web UI doesn't stop the Flink cluster. To kill the 
> cluster we need to run: yarn application -kill 
> We also tried to run a Flink job with 'mode: legacy' and we have the same 
> issues:
>  # Add property 'mode: legacy' to ./conf/flink-conf.yaml
>  # Execute the following command:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because 

[jira] [Commented] (FLINK-9679) Implement AvroSerializationSchema

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548207#comment-16548207
 ] 

ASF GitHub Bot commented on FLINK-9679:
---

Github user dawidwys commented on the issue:

https://github.com/apache/flink/pull/6259
  
@medcv the new commit does not address any of my previous comments or I 
don't understand something.


> Implement AvroSerializationSchema
> -
>
> Key: FLINK-9679
> URL: https://issues.apache.org/jira/browse/FLINK-9679
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.6.0
>Reporter: Yazdan Shirvany
>Assignee: Yazdan Shirvany
>Priority: Major
>  Labels: pull-request-available
>
> Implement AvroSerializationSchema using Confluent Schema Registry



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6259: [FLINK-9679] Implement AvroSerializationSchema

2018-07-18 Thread dawidwys
Github user dawidwys commented on the issue:

https://github.com/apache/flink/pull/6259
  
@medcv the new commit does not address any of my previous comments or I 
don't understand something.


---


[jira] [Commented] (FLINK-9679) Implement AvroSerializationSchema

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548150#comment-16548150
 ] 

ASF GitHub Bot commented on FLINK-9679:
---

Github user medcv commented on the issue:

https://github.com/apache/flink/pull/6259
  
@dawidwys I updated the PR, please review.

The usage would be like this:
` ConfluentRegistryAvroSerializationSchema.forSpecific(User.class, subject, 
 schemaRegistryUrl)`
as Confluent needs a "subject" to fetch the schema info. Now 
`ConfluentRegistryAvroSerializationSchema` uses the "subject" directly instead of 
`topic + "-value"`.


> Implement AvroSerializationSchema
> -
>
> Key: FLINK-9679
> URL: https://issues.apache.org/jira/browse/FLINK-9679
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.6.0
>Reporter: Yazdan Shirvany
>Assignee: Yazdan Shirvany
>Priority: Major
>  Labels: pull-request-available
>
> Implement AvroSerializationSchema using Confluent Schema Registry



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6259: [FLINK-9679] Implement AvroSerializationSchema

2018-07-18 Thread medcv
Github user medcv commented on the issue:

https://github.com/apache/flink/pull/6259
  
@dawidwys I updated the PR, please review.

The usage would be like this:
` ConfluentRegistryAvroSerializationSchema.forSpecific(User.class, subject, 
 schemaRegistryUrl)`
as Confluent needs a "subject" to fetch the schema info. Now 
`ConfluentRegistryAvroSerializationSchema` uses the "subject" directly instead of 
`topic + "-value"`.


---


[jira] [Created] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-07-18 Thread Sergey Krasovskiy (JIRA)
Sergey Krasovskiy created FLINK-9891:


 Summary: Flink cluster is not shutdown in YARN mode when Flink 
client is stopped
 Key: FLINK-9891
 URL: https://issues.apache.org/jira/browse/FLINK-9891
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.5.1, 1.5.0
Reporter: Sergey Krasovskiy


We are using neither session mode nor detached mode. The command to run a Flink 
job on YARN is:
{code:java}
/bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 2048 
-j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
{code}
Flink CLI logs:
{code:java}
Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2018-07-18 12:47:03,747 INFO 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No 
path for the flink jar passed. Using the location of class 
org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No 
path for the flink jar passed. Using the location of class 
org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2018-07-18 12:47:04,248 WARN 
org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
YARN Client needs one of these to be set to properly load the Hadoop 
configuration for accessing YARN.
2018-07-18 12:47:04,409 INFO 
org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
numberTaskManagers=1, slotsPerTaskManager=1}
2018-07-18 12:47:04,783 WARN 
org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
local reads feature cannot be used because libhadoop cannot be loaded.
2018-07-18 12:47:04,788 WARN 
org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
directory 
('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
 contains both LOG4J and Logback configuration files. Please delete or rename 
one of them.
2018-07-18 12:47:07,846 INFO 
org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
master application_1531474158783_10814
2018-07-18 12:47:08,073 INFO 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
application_1531474158783_10814
2018-07-18 12:47:08,074 INFO 
org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
to be allocated
2018-07-18 12:47:08,076 INFO 
org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
current state ACCEPTED
2018-07-18 12:47:12,864 INFO 
org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has been 
deployed successfully.
{code}
Job Manager logs:
{code:java}
2018-07-18 12:47:09,913 INFO 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 

2018-07-18 12:47:09,915 INFO 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
YarnSessionClusterEntrypoint (Version: 1.5.1, Rev:3488f8b, Date:10.07.2018 @ 
11:51:27 GMT)
...
{code}
Issues:
 # Flink job is running as a Flink session
 # Ctrl+C or 'stop' doesn't stop the job or the YARN cluster
 # Cancelling the job via the Job Manager web UI doesn't stop the Flink cluster. To kill the 
cluster we need to run: yarn application -kill 

We also tried to run a Flink job with 'mode: legacy' and we have the same 
issues:
 # Add property 'mode: legacy' to ./conf/flink-conf.yaml
 # Execute the following command:

{code:java}
/bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 2048 
-j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
{code}
Flink CLI logs:
{code:java}
Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See 

[jira] [Updated] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-07-18 Thread Sergey Krasovskiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Krasovskiy updated FLINK-9891:
-
Description: 
We are using neither session mode nor detached mode. The command to run a Flink 
job on YARN is:
{code:java}
/bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 2048 
-j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
{code}
Flink CLI logs:
{code:java}
Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2018-07-18 12:47:03,747 INFO 
org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No 
path for the flink jar passed. Using the location of class 
org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No 
path for the flink jar passed. Using the location of class 
org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2018-07-18 12:47:04,248 WARN 
org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
YARN Client needs one of these to be set to properly load the Hadoop 
configuration for accessing YARN.
2018-07-18 12:47:04,409 INFO 
org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
numberTaskManagers=1, slotsPerTaskManager=1}
2018-07-18 12:47:04,783 WARN 
org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
local reads feature cannot be used because libhadoop cannot be loaded.
2018-07-18 12:47:04,788 WARN 
org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
directory 
('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
 contains both LOG4J and Logback configuration files. Please delete or rename 
one of them.
2018-07-18 12:47:07,846 INFO 
org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
master application_1531474158783_10814
2018-07-18 12:47:08,073 INFO 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
application_1531474158783_10814
2018-07-18 12:47:08,074 INFO 
org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
to be allocated
2018-07-18 12:47:08,076 INFO 
org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
current state ACCEPTED
2018-07-18 12:47:12,864 INFO 
org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has been 
deployed successfully.
{code}
Job Manager logs:
{code:java}
2018-07-18 12:47:09,913 INFO 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 

2018-07-18 12:47:09,915 INFO 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
YarnSessionClusterEntrypoint (Version: 1.5.1, Rev:3488f8b, Date:10.07.2018 @ 
11:51:27 GMT)
...
{code}
Issues:
 # Flink job is running as a Flink session
 # Ctrl+C or 'stop' doesn't stop the job or the YARN cluster
 # Cancelling the job via the Job Manager web UI doesn't stop the Flink cluster. To kill the 
cluster we need to run: yarn application -kill 

We also tried to run a Flink job with 'mode: legacy' and we have the same 
issues:
 # Add property 'mode: legacy' to ./conf/flink-conf.yaml
 # Execute the following command:

{code:java}
/bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 2048 
-j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
{code}
Flink CLI logs:
{code:java}
Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2018-07-18 16:07:13,820 INFO 

[jira] [Commented] (FLINK-9849) Upgrade hbase version to 2.0.1 for hbase connector

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548029#comment-16548029
 ] 

ASF GitHub Bot commented on FLINK-9849:
---

Github user yanghua commented on a diff in the pull request:

https://github.com/apache/flink/pull/6365#discussion_r203439890
  
--- Diff: flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseWriteStreamExample.java ---
@@ -87,22 +90,22 @@ public void configure(Configuration parameters) {

        @Override
        public void open(int taskNumber, int numTasks) throws IOException {
-           table = new HTable(conf, "flinkExample");
+           Connection connection = ConnectionFactory.createConnection(conf);
--- End diff --

I think we should add a try/catch block to protect against a connection leak.


> Upgrade hbase version to 2.0.1 for hbase connector
> --
>
> Key: FLINK-9849
> URL: https://issues.apache.org/jira/browse/FLINK-9849
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: zhangminglei
>Priority: Major
>  Labels: pull-request-available
>
> Currently hbase 1.4.3 is used for hbase connector.
> We should upgrade to 2.0.1 which was recently released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9849) Upgrade hbase version to 2.0.1 for hbase connector

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548031#comment-16548031
 ] 

ASF GitHub Bot commented on FLINK-9849:
---

Github user yanghua commented on a diff in the pull request:

https://github.com/apache/flink/pull/6365#discussion_r203439859
  
--- Diff: flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseWriteStreamExample.java ---
@@ -87,22 +90,22 @@ public void configure(Configuration parameters) {

        @Override
        public void open(int taskNumber, int numTasks) throws IOException {
-           table = new HTable(conf, "flinkExample");
+           Connection connection = ConnectionFactory.createConnection(conf);
--- End diff --

I think we should add a try/catch block to protect against a connection leak.


> Upgrade hbase version to 2.0.1 for hbase connector
> --
>
> Key: FLINK-9849
> URL: https://issues.apache.org/jira/browse/FLINK-9849
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: zhangminglei
>Priority: Major
>  Labels: pull-request-available
>
> Currently hbase 1.4.3 is used for hbase connector.
> We should upgrade to 2.0.1 which was recently released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9849) Upgrade hbase version to 2.0.1 for hbase connector

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548030#comment-16548030
 ] 

ASF GitHub Bot commented on FLINK-9849:
---

Github user yanghua commented on a diff in the pull request:

https://github.com/apache/flink/pull/6365#discussion_r203440989
  
--- Diff: flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseWriteStreamExample.java ---
@@ -87,22 +90,22 @@ public void configure(Configuration parameters) {

        @Override
        public void open(int taskNumber, int numTasks) throws IOException {
-           table = new HTable(conf, "flinkExample");
+           Connection connection = ConnectionFactory.createConnection(conf);
--- End diff --

Based on the [HBase Connection 
JavaDoc](https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Connection.html#close--), 
it seems the caller should invoke the `close` method to release the resource, so I 
suggest we close the connection in the UDF's `close` method.
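
For illustration, a sketch of the suggested lifecycle, assuming the example's open()/close() hooks (class and field names are hypothetical): keep the {{Connection}} as a field and release both it and the {{Table}} in {{close()}}.

{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Table;

class HBaseSinkSketch {
    private final Configuration conf;
    private Connection connection;
    private Table table;

    HBaseSinkSketch(Configuration conf) {
        this.conf = conf;
    }

    public void open() throws IOException {
        connection = ConnectionFactory.createConnection(conf);
        table = connection.getTable(TableName.valueOf("flinkExample"));
    }

    public void close() throws IOException {
        // per the Connection JavaDoc, the caller must close the connection
        if (table != null) {
            table.close();
        }
        if (connection != null) {
            connection.close();
        }
    }
}
{code}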


> Upgrade hbase version to 2.0.1 for hbase connector
> --
>
> Key: FLINK-9849
> URL: https://issues.apache.org/jira/browse/FLINK-9849
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: zhangminglei
>Priority: Major
>  Labels: pull-request-available
>
> Currently hbase 1.4.3 is used for hbase connector.
> We should upgrade to 2.0.1 which was recently released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9849) Upgrade hbase version to 2.0.1 for hbase connector

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548027#comment-16548027
 ] 

ASF GitHub Bot commented on FLINK-9849:
---

Github user yanghua commented on a diff in the pull request:

https://github.com/apache/flink/pull/6365#discussion_r203438579
  
--- Diff: flink-connectors/flink-hbase/src/main/java/org/apache/flink/addons/hbase/TableInputFormat.java ---
@@ -81,7 +85,9 @@ private HTable createTable() {
        org.apache.hadoop.conf.Configuration hConf = HBaseConfiguration.create();

        try {
-           return new HTable(hConf, getTableName());
+           Connection connection = ConnectionFactory.createConnection(hConf);
+           Table table = connection.getTable(TableName.valueOf(getTableName()));
+           return (HTable) table;
--- End diff --

I think we should release the connection when an exception happens.


> Upgrade hbase version to 2.0.1 for hbase connector
> --
>
> Key: FLINK-9849
> URL: https://issues.apache.org/jira/browse/FLINK-9849
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: zhangminglei
>Priority: Major
>  Labels: pull-request-available
>
> Currently hbase 1.4.3 is used for hbase connector.
> We should upgrade to 2.0.1 which was recently released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9849) Upgrade hbase version to 2.0.1 for hbase connector

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548028#comment-16548028
 ] 

ASF GitHub Bot commented on FLINK-9849:
---

Github user yanghua commented on a diff in the pull request:

https://github.com/apache/flink/pull/6365#discussion_r203439523
  
--- Diff: flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseWriteStreamExample.java ---
@@ -87,22 +90,22 @@ public void configure(Configuration parameters) {

        @Override
        public void open(int taskNumber, int numTasks) throws IOException {
-           table = new HTable(conf, "flinkExample");
+           Connection connection = ConnectionFactory.createConnection(conf);
--- End diff --

I think we should add a try/catch block to protect against a connection leak.


> Upgrade hbase version to 2.0.1 for hbase connector
> --
>
> Key: FLINK-9849
> URL: https://issues.apache.org/jira/browse/FLINK-9849
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: zhangminglei
>Priority: Major
>  Labels: pull-request-available
>
> Currently hbase 1.4.3 is used for hbase connector.
> We should upgrade to 2.0.1 which was recently released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #6365: [FLINK-9849] [hbase] Hbase upgrade to 2.0.1

2018-07-18 Thread yanghua
Github user yanghua commented on a diff in the pull request:

https://github.com/apache/flink/pull/6365#discussion_r203439890
  
--- Diff: flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseWriteStreamExample.java ---
@@ -87,22 +90,22 @@ public void configure(Configuration parameters) {

        @Override
        public void open(int taskNumber, int numTasks) throws IOException {
-           table = new HTable(conf, "flinkExample");
+           Connection connection = ConnectionFactory.createConnection(conf);
--- End diff --

I think we should add a try/catch block to protect against a connection leak.


---


[GitHub] flink pull request #6365: [FLINK-9849] [hbase] Hbase upgrade to 2.0.1

2018-07-18 Thread yanghua
Github user yanghua commented on a diff in the pull request:

https://github.com/apache/flink/pull/6365#discussion_r203440989
  
--- Diff: flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseWriteStreamExample.java ---
@@ -87,22 +90,22 @@ public void configure(Configuration parameters) {

        @Override
        public void open(int taskNumber, int numTasks) throws IOException {
-           table = new HTable(conf, "flinkExample");
+           Connection connection = ConnectionFactory.createConnection(conf);
--- End diff --

Based on the [HBase Connection 
JavaDoc](https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Connection.html#close--), 
it seems the caller should invoke the `close` method to release the resource, so I 
suggest we close the connection in the UDF's `close` method.


---


[GitHub] flink pull request #6365: [FLINK-9849] [hbase] Hbase upgrade to 2.0.1

2018-07-18 Thread yanghua
Github user yanghua commented on a diff in the pull request:

https://github.com/apache/flink/pull/6365#discussion_r203438579
  
--- Diff: flink-connectors/flink-hbase/src/main/java/org/apache/flink/addons/hbase/TableInputFormat.java ---
@@ -81,7 +85,9 @@ private HTable createTable() {
        org.apache.hadoop.conf.Configuration hConf = HBaseConfiguration.create();

        try {
-           return new HTable(hConf, getTableName());
+           Connection connection = ConnectionFactory.createConnection(hConf);
+           Table table = connection.getTable(TableName.valueOf(getTableName()));
+           return (HTable) table;
--- End diff --

I think we should release the connection when an exception occurs.
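
For illustration, one way to guard `createTable()` (the `LOG` field and the 
null return on failure are assumptions based on the diff context, not the 
actual patch):

{code}
private HTable createTable() {
	org.apache.hadoop.conf.Configuration hConf = HBaseConfiguration.create();
	Connection connection = null;
	try {
		connection = ConnectionFactory.createConnection(hConf);
		return (HTable) connection.getTable(TableName.valueOf(getTableName()));
	} catch (IOException e) {
		// release the connection if acquiring the table fails
		if (connection != null) {
			try {
				connection.close();
			} catch (IOException ignored) {
				// already failing; suppress the secondary error
			}
		}
		LOG.error("Error creating the HTable instance", e);
	}
	return null;
}
{code}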


---


[GitHub] flink pull request #6365: [FLINK-9849] [hbase] Hbase upgrade to 2.0.1

2018-07-18 Thread yanghua
Github user yanghua commented on a diff in the pull request:

https://github.com/apache/flink/pull/6365#discussion_r203439523
  
--- Diff: 
flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseWriteStreamExample.java
 ---
@@ -87,22 +90,22 @@ public void configure(Configuration parameters) {
 
@Override
public void open(int taskNumber, int numTasks) throws 
IOException {
-   table = new HTable(conf, "flinkExample");
+   Connection connection = 
ConnectionFactory.createConnection(conf);
--- End diff --

I think we should add a try/catch block to prevent a connection leak.


---


[GitHub] flink pull request #6365: [FLINK-9849] [hbase] Hbase upgrade to 2.0.1

2018-07-18 Thread yanghua
Github user yanghua commented on a diff in the pull request:

https://github.com/apache/flink/pull/6365#discussion_r203439859
  
--- Diff: 
flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseWriteStreamExample.java
 ---
@@ -87,22 +90,22 @@ public void configure(Configuration parameters) {
 
@Override
public void open(int taskNumber, int numTasks) throws 
IOException {
-   table = new HTable(conf, "flinkExample");
+   Connection connection = 
ConnectionFactory.createConnection(conf);
--- End diff --

I think we should add a try/catch block to prevent a connection leak.


---


[jira] [Commented] (FLINK-9675) Avoid FileInputStream/FileOutputStream

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548019#comment-16548019
 ] 

ASF GitHub Bot commented on FLINK-9675:
---

Github user hequn8128 commented on the issue:

https://github.com/apache/flink/pull/6335
  
Hi @zhangminglei ,
Good catch! Maybe the Reader also needs to be adapted, changing `new 
InputStreamReader` to `Channels.newReader`. I found a benchmark comparing 
FileInputStream and Reader performance 
[here](https://arnaudroger.github.io/blog/2017/03/20/faster-reader-inpustream-in-java.html).
 Hope it helps.
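
For context, a self-contained sketch of the two replacements (file paths are 
placeholders):

{code}
import java.io.InputStream;
import java.io.Reader;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class NoFinalizerStreams {
	public static void main(String[] args) throws Exception {
		// finalizer-free replacement for new FileInputStream(file)
		try (InputStream in = Files.newInputStream(Paths.get("/tmp/data.bin"))) {
			in.read();
		}
		// channel-based replacement for new InputStreamReader(new FileInputStream(file))
		try (Reader reader = Channels.newReader(
				FileChannel.open(Paths.get("/tmp/data.txt"), StandardOpenOption.READ),
				"UTF-8")) {
			reader.read();
		}
	}
}
{code}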


> Avoid FileInputStream/FileOutputStream
> --
>
> Key: FLINK-9675
> URL: https://issues.apache.org/jira/browse/FLINK-9675
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: zhangminglei
>Priority: Minor
>  Labels: filesystem, pull-request-available
>
> They rely on finalizers (before Java 11), which create unnecessary GC load.
> The alternatives, Files.newInputStream, are as easy to use and don't have 
> this issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6335: [FLINK-9675] [fs] Avoid FileInputStream/FileOutputStream

2018-07-18 Thread hequn8128
Github user hequn8128 commented on the issue:

https://github.com/apache/flink/pull/6335
  
Hi @zhangminglei ,
Good catch! Maybe the Reader also needs to be adapted, changing `new 
InputStreamReader` to `Channels.newReader`. I found a benchmark comparing 
FileInputStream and Reader performance 
[here](https://arnaudroger.github.io/blog/2017/03/20/faster-reader-inpustream-in-java.html).
 Hope it helps.


---


[jira] [Commented] (FLINK-9878) IO worker threads BLOCKED on SSL Session Cache while CMS full gc

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548014#comment-16548014
 ] 

ASF GitHub Bot commented on FLINK-9878:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/6355#discussion_r203437995
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java ---
@@ -160,4 +160,41 @@
key("security.ssl.verify-hostname")
.defaultValue(true)
.withDescription("Flag to enable peer’s hostname 
verification during ssl handshake.");
+
+   /**
+* SSL session cache size.
+*/
+   public static final ConfigOption<Integer> SSL_SESSION_CACHE_SIZE =
+   key("security.ssl.session-cache-size")
+   .defaultValue(-1)
+   .withDescription("The size of the cache used for 
storing SSL session objects. "
+   + "According to 
https://github.com/netty/netty/issues/832, you should always set "
+   + "this to an appropriate number to not run 
into a bug with stalling IO threads "
+   + "during garbage collection. (-1 = use system 
default).");
+
+   /**
+* SSL session timeout.
+*/
+   public static final ConfigOption<Integer> SSL_SESSION_TIMEOUT =
+   key("security.ssl.session-timeout")
+   .defaultValue(-1)
+   .withDescription("The timeout (in ms) for the cached 
SSL session objects. (-1 = use system default)");
+
+   /**
+* SSL session timeout during handshakes.
+*/
+   public static final ConfigOption<Integer> SSL_HANDSHAKE_TIMEOUT =
+   key("security.ssl.handshake-timeout")
+   .defaultValue(-1)
+   .withDescription("The timeout (in ms) during SSL 
handshake. (-1 = use system default)");
+
+   /**
+* SSL session timeout after flushing the `close_notify` message.
+*/
+   public static final ConfigOption<Integer> SSL_CLOSE_NOTIFY_FLUSH_TIMEOUT =
+   key("security.ssl.close-notify-flush-timeout")
+   .defaultValue(-1)
+   .withDescription("The timeout (in ms) for flushing the 
`close_notify` that was triggered by closing a " +
--- End diff --

It's not showing up as a code block since that only works for Markdown; the 
descriptions so far have been plain text.
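
For reference, once merged these options go into flink-conf.yaml like any 
other option; the keys come from the diff above and the values below are only 
illustrative (-1 keeps the respective system default):

{code}
security.ssl.session-cache-size: 10000
security.ssl.session-timeout: 300000
security.ssl.handshake-timeout: 10000
security.ssl.close-notify-flush-timeout: -1
{code}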


> IO worker threads BLOCKED on SSL Session Cache while CMS full gc
> 
>
> Key: FLINK-9878
> URL: https://issues.apache.org/jira/browse/FLINK-9878
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>Affects Versions: 1.5.0, 1.5.1, 1.6.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.5.2, 1.6.0
>
>
> According to https://github.com/netty/netty/issues/832, there is a JDK issue 
> during garbage collection when the SSL session cache is not limited. We 
> should allow the user to configure this and further (advanced) SSL parameters 
> for fine-tuning to fix this and similar issues. In particular, the following 
> parameters should be configurable:
> - SSL session cache size
> - SSL session timeout
> - SSL handshake timeout
> - SSL close notify flush timeout



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #6355: [FLINK-9878][network][ssl] add more low-level ssl ...

2018-07-18 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/6355#discussion_r203437995
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java ---
@@ -160,4 +160,41 @@
key("security.ssl.verify-hostname")
.defaultValue(true)
.withDescription("Flag to enable peer’s hostname 
verification during ssl handshake.");
+
+   /**
+* SSL session cache size.
+*/
+   public static final ConfigOption<Integer> SSL_SESSION_CACHE_SIZE =
+   key("security.ssl.session-cache-size")
+   .defaultValue(-1)
+   .withDescription("The size of the cache used for 
storing SSL session objects. "
+   + "According to 
https://github.com/netty/netty/issues/832, you should always set "
+   + "this to an appropriate number to not run 
into a bug with stalling IO threads "
+   + "during garbage collection. (-1 = use system 
default).");
+
+   /**
+* SSL session timeout.
+*/
+   public static final ConfigOption<Integer> SSL_SESSION_TIMEOUT =
+   key("security.ssl.session-timeout")
+   .defaultValue(-1)
+   .withDescription("The timeout (in ms) for the cached 
SSL session objects. (-1 = use system default)");
+
+   /**
+* SSL session timeout during handshakes.
+*/
+   public static final ConfigOption<Integer> SSL_HANDSHAKE_TIMEOUT =
+   key("security.ssl.handshake-timeout")
+   .defaultValue(-1)
+   .withDescription("The timeout (in ms) during SSL 
handshake. (-1 = use system default)");
+
+   /**
+* SSL session timeout after flushing the `close_notify` message.
+*/
+   public static final ConfigOption<Integer> SSL_CLOSE_NOTIFY_FLUSH_TIMEOUT =
+   key("security.ssl.close-notify-flush-timeout")
+   .defaultValue(-1)
+   .withDescription("The timeout (in ms) for flushing the 
`close_notify` that was triggered by closing a " +
--- End diff --

It's not showing up as a code block since that only works for Markdown; the 
descriptions so far have been plain text.


---


[jira] [Commented] (FLINK-9860) Netty resource leak on receiver side

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548008#comment-16548008
 ] 

ASF GitHub Bot commented on FLINK-9860:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/6363
  
+1


> Netty resource leak on receiver side
> 
>
> Key: FLINK-9860
> URL: https://issues.apache.org/jira/browse/FLINK-9860
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>Affects Versions: 1.5.1, 1.6.0
>Reporter: Till Rohrmann
>Assignee: Nico Kruber
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Fix For: 1.5.2, 1.6.0
>
>
> The Hadoop-free Wordcount end-to-end test fails with the following exception:
> {code}
> ERROR org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector  - 
> LEAK: ByteBuf.release() was not called before it's garbage-collected. See 
> http://netty.io/wiki/reference-counted-objects.html for more information.
> Recent access records: 
> Created at:
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:331)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:185)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:176)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:137)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:147)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
>   
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
>   
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> {code}
> We might have a resource leak on the receiving side of our network stack.
> https://api.travis-ci.org/v3/job/404225956/log.txt



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6363: [FLINK-9860][REST] fix buffer leak in FileUploadHandler

2018-07-18 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/6363
  
+1


---


[jira] [Commented] (FLINK-9850) Add a string to the print method to identify output for DataStream

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548004#comment-16548004
 ] 

ASF GitHub Bot commented on FLINK-9850:
---

Github user yanghua commented on the issue:

https://github.com/apache/flink/pull/6367
  
@tillrohrmann and @zentol I see the Python DataStream API does not 
match the DataStream Java API (some API methods are missing). Shall we add those 
missing methods to `PythonDataStream`? If so, I'd like to do this.


> Add a string to the print method to identify output for DataStream
> --
>
> Key: FLINK-9850
> URL: https://issues.apache.org/jira/browse/FLINK-9850
> Project: Flink
>  Issue Type: New Feature
>  Components: DataStream API
>Reporter: Hequn Cheng
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
>
> The output of the print method of {{DataSet}} allows the user to supply a 
> String to identify the output (see 
> [FLINK-1486|https://issues.apache.org/jira/browse/FLINK-1486]). But 
> {{DataStream}} doesn't support this yet. It is valuable to add this feature for 
> {{DataStream}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6367: [FLINK-9850] Add a string to the print method to identify...

2018-07-18 Thread yanghua
Github user yanghua commented on the issue:

https://github.com/apache/flink/pull/6367
  
@tillrohrmann and @zentol I see the Python DataStream API does not 
match the DataStream Java API (some API methods are missing). Shall we add those 
missing methods to `PythonDataStream`? If so, I'd like to do this.


---


[jira] [Commented] (FLINK-9860) Netty resource leak on receiver side

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547991#comment-16547991
 ] 

ASF GitHub Bot commented on FLINK-9860:
---

Github user NicoK commented on a diff in the pull request:

https://github.com/apache/flink/pull/6363#discussion_r203432013
  
--- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/io/network/netty/NettyLeakDetectionResource.java
 ---
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.netty;
+
+import org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector;
+import 
org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetectorFactory;
+
+import org.junit.rules.ExternalResource;
+
+import static org.junit.Assert.fail;
+
+/**
+ * JUnit resource to fail with an assertion when Netty detects a resource 
leak (only with
+ * ERROR logging enabled for
+ * 
org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector).
+ *
+ * This should be used in a class rule:
+ * {@code
+ * @literal @ClassRule
+ *  public static final NettyLeakDetectionResource LEAK_DETECTION = new 
NettyLeakDetectionResource();
+ * }
+ */
+public class NettyLeakDetectionResource extends ExternalResource {
+   private static ResourceLeakDetectorFactory previousLeakDetector;
+   private static ResourceLeakDetector.Level previousLeakDetectorLevel;
+
+   @Override
+   protected void before() {
+   previousLeakDetector = ResourceLeakDetectorFactory.instance();
+   previousLeakDetectorLevel = ResourceLeakDetector.getLevel();
+
+   
ResourceLeakDetector.setLevel(ResourceLeakDetector.Level.PARANOID);
+   ResourceLeakDetectorFactory.setResourceLeakDetectorFactory(new 
FailingResourceLeakDetectorFactory());
--- End diff --

true - done


> Netty resource leak on receiver side
> 
>
> Key: FLINK-9860
> URL: https://issues.apache.org/jira/browse/FLINK-9860
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>Affects Versions: 1.5.1, 1.6.0
>Reporter: Till Rohrmann
>Assignee: Nico Kruber
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Fix For: 1.5.2, 1.6.0
>
>
> The Hadoop-free Wordcount end-to-end test fails with the following exception:
> {code}
> ERROR org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector  - 
> LEAK: ByteBuf.release() was not called before it's garbage-collected. See 
> http://netty.io/wiki/reference-counted-objects.html for more information.
> Recent access records: 
> Created at:
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:331)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:185)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:176)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:137)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:147)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
>   
> 

[GitHub] flink pull request #6363: [FLINK-9860][REST] fix buffer leak in FileUploadHa...

2018-07-18 Thread NicoK
Github user NicoK commented on a diff in the pull request:

https://github.com/apache/flink/pull/6363#discussion_r203432013
  
--- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/io/network/netty/NettyLeakDetectionResource.java
 ---
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.netty;
+
+import org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector;
+import 
org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetectorFactory;
+
+import org.junit.rules.ExternalResource;
+
+import static org.junit.Assert.fail;
+
+/**
+ * JUnit resource to fail with an assertion when Netty detects a resource 
leak (only with
+ * ERROR logging enabled for
+ * 
org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector).
+ *
+ * This should be used in a class rule:
+ * {@code
+ * @literal @ClassRule
+ *  public static final NettyLeakDetectionResource LEAK_DETECTION = new 
NettyLeakDetectionResource();
+ * }
+ */
+public class NettyLeakDetectionResource extends ExternalResource {
+   private static ResourceLeakDetectorFactory previousLeakDetector;
+   private static ResourceLeakDetector.Level previousLeakDetectorLevel;
+
+   @Override
+   protected void before() {
+   previousLeakDetector = ResourceLeakDetectorFactory.instance();
+   previousLeakDetectorLevel = ResourceLeakDetector.getLevel();
+
+   
ResourceLeakDetector.setLevel(ResourceLeakDetector.Level.PARANOID);
+   ResourceLeakDetectorFactory.setResourceLeakDetectorFactory(new 
FailingResourceLeakDetectorFactory());
--- End diff --

true - done


---


[jira] [Updated] (FLINK-9889) create .bat script to start Flink task manager

2018-07-18 Thread vinoyang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

vinoyang updated FLINK-9889:

Summary: create .bat script to start Flink task manager  (was: .bat script 
to start Flink task manager)

> create .bat script to start Flink task manager
> --
>
> Key: FLINK-9889
> URL: https://issues.apache.org/jira/browse/FLINK-9889
> Project: Flink
>  Issue Type: Sub-task
>  Components: Startup Shell Scripts
>Reporter: Pavel Shvetsov
>Assignee: Pavel Shvetsov
>Priority: Minor
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Create .bat script to start additional task managers



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9860) Netty resource leak on receiver side

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547969#comment-16547969
 ] 

ASF GitHub Bot commented on FLINK-9860:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/6363#discussion_r203414700
  
--- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/io/network/netty/NettyLeakDetectionResource.java
 ---
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.netty;
+
+import org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector;
+import 
org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetectorFactory;
+
+import org.junit.rules.ExternalResource;
+
+import static org.junit.Assert.fail;
+
+/**
+ * JUnit resource to fail with an assertion when Netty detects a resource 
leak (only with
+ * ERROR logging enabled for
+ * 
org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector).
+ *
+ * This should be used in a class rule:
+ * {@code
+ * @literal @ClassRule
+ *  public static final NettyLeakDetectionResource LEAK_DETECTION = new 
NettyLeakDetectionResource();
+ * }
+ */
+public class NettyLeakDetectionResource extends ExternalResource {
+   private static ResourceLeakDetectorFactory previousLeakDetector;
+   private static ResourceLeakDetector.Level previousLeakDetectorLevel;
+
+   @Override
+   protected void before() {
+   previousLeakDetector = ResourceLeakDetectorFactory.instance();
+   previousLeakDetectorLevel = ResourceLeakDetector.getLevel();
+
+   
ResourceLeakDetector.setLevel(ResourceLeakDetector.Level.PARANOID);
+   ResourceLeakDetectorFactory.setResourceLeakDetectorFactory(new 
FailingResourceLeakDetectorFactory());
+   }
+
+   @Override
+   protected void after() {
+   
ResourceLeakDetectorFactory.setResourceLeakDetectorFactory(previousLeakDetector);
+   ResourceLeakDetector.setLevel(previousLeakDetectorLevel);
+   }
+
+   static class FailingResourceLeakDetectorFactory extends 
ResourceLeakDetectorFactory {
--- End diff --

these could be private?


> Netty resource leak on receiver side
> 
>
> Key: FLINK-9860
> URL: https://issues.apache.org/jira/browse/FLINK-9860
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>Affects Versions: 1.5.1, 1.6.0
>Reporter: Till Rohrmann
>Assignee: Nico Kruber
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Fix For: 1.5.2, 1.6.0
>
>
> The Hadoop-free Wordcount end-to-end test fails with the following exception:
> {code}
> ERROR org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector  - 
> LEAK: ByteBuf.release() was not called before it's garbage-collected. See 
> http://netty.io/wiki/reference-counted-objects.html for more information.
> Recent access records: 
> Created at:
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:331)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:185)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:176)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:137)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:147)
>   
> 

[jira] [Commented] (FLINK-9860) Netty resource leak on receiver side

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547968#comment-16547968
 ] 

ASF GitHub Bot commented on FLINK-9860:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/6363#discussion_r203422176
  
--- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/io/network/netty/NettyLeakDetectionResource.java
 ---
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.netty;
+
+import org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector;
+import 
org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetectorFactory;
+
+import org.junit.rules.ExternalResource;
+
+import static org.junit.Assert.fail;
+
+/**
+ * JUnit resource to fail with an assertion when Netty detects a resource 
leak (only with
+ * ERROR logging enabled for
+ * 
org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector).
+ *
+ * This should be used in a class rule:
+ * {@code
+ * @literal @ClassRule
+ *  public static final NettyLeakDetectionResource LEAK_DETECTION = new 
NettyLeakDetectionResource();
+ * }
+ */
+public class NettyLeakDetectionResource extends ExternalResource {
+   private static ResourceLeakDetectorFactory previousLeakDetector;
+   private static ResourceLeakDetector.Level previousLeakDetectorLevel;
+
+   @Override
+   protected void before() {
+   previousLeakDetector = ResourceLeakDetectorFactory.instance();
+   previousLeakDetectorLevel = ResourceLeakDetector.getLevel();
+
+   
ResourceLeakDetector.setLevel(ResourceLeakDetector.Level.PARANOID);
+   ResourceLeakDetectorFactory.setResourceLeakDetectorFactory(new 
FailingResourceLeakDetectorFactory());
--- End diff --

so this isn't something we necessarily have to deal with now, but if 
multiple tests that use this resource run in parallel we might not reset to the 
correct factory/level.

Effectively what we need is some kind of ref-counting so that only the first 
resource modifies the level and factory, and only the last resource resets them. 
:/
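
A rough sketch of that ref-counting idea (not the actual fix; the lock and 
counter are made up, the rest follows the class above):

{code}
private static final Object LOCK = new Object();
private static int refCount = 0;

@Override
protected void before() {
	synchronized (LOCK) {
		// only the first active resource saves and overrides the global settings
		if (refCount == 0) {
			previousLeakDetector = ResourceLeakDetectorFactory.instance();
			previousLeakDetectorLevel = ResourceLeakDetector.getLevel();
			ResourceLeakDetector.setLevel(ResourceLeakDetector.Level.PARANOID);
			ResourceLeakDetectorFactory.setResourceLeakDetectorFactory(
				new FailingResourceLeakDetectorFactory());
		}
		refCount++;
	}
}

@Override
protected void after() {
	synchronized (LOCK) {
		refCount--;
		// only the last active resource restores the previous settings
		if (refCount == 0) {
			ResourceLeakDetectorFactory.setResourceLeakDetectorFactory(previousLeakDetector);
			ResourceLeakDetector.setLevel(previousLeakDetectorLevel);
		}
	}
}
{code}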




> Netty resource leak on receiver side
> 
>
> Key: FLINK-9860
> URL: https://issues.apache.org/jira/browse/FLINK-9860
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>Affects Versions: 1.5.1, 1.6.0
>Reporter: Till Rohrmann
>Assignee: Nico Kruber
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Fix For: 1.5.2, 1.6.0
>
>
> The Hadoop-free Wordcount end-to-end test fails with the following exception:
> {code}
> ERROR org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector  - 
> LEAK: ByteBuf.release() was not called before it's garbage-collected. See 
> http://netty.io/wiki/reference-counted-objects.html for more information.
> Recent access records: 
> Created at:
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:331)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:185)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:176)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:137)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:147)
>   
> 

[jira] [Commented] (FLINK-9860) Netty resource leak on receiver side

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547967#comment-16547967
 ] 

ASF GitHub Bot commented on FLINK-9860:
---

Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/6363#discussion_r203414529
  
--- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/rest/FileUploadHandlerTest.java
 ---
@@ -220,4 +224,5 @@ public void testUploadCleanupOnFailure() throws 
IOException {
}
MULTIPART_UPLOAD_RESOURCE.assertUploadDirectoryIsEmpty();
}
+
--- End diff --

revert


> Netty resource leak on receiver side
> 
>
> Key: FLINK-9860
> URL: https://issues.apache.org/jira/browse/FLINK-9860
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>Affects Versions: 1.5.1, 1.6.0
>Reporter: Till Rohrmann
>Assignee: Nico Kruber
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Fix For: 1.5.2, 1.6.0
>
>
> The Hadoop-free Wordcount end-to-end test fails with the following exception:
> {code}
> ERROR org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector  - 
> LEAK: ByteBuf.release() was not called before it's garbage-collected. See 
> http://netty.io/wiki/reference-counted-objects.html for more information.
> Recent access records: 
> Created at:
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:331)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:185)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:176)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:137)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:147)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
>   
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
>   
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> {code}
> We might have a resource leak on the receiving side of our network stack.
> https://api.travis-ci.org/v3/job/404225956/log.txt



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #6363: [FLINK-9860][REST] fix buffer leak in FileUploadHa...

2018-07-18 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/6363#discussion_r203422176
  
--- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/io/network/netty/NettyLeakDetectionResource.java
 ---
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.netty;
+
+import org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector;
+import 
org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetectorFactory;
+
+import org.junit.rules.ExternalResource;
+
+import static org.junit.Assert.fail;
+
+/**
+ * JUnit resource to fail with an assertion when Netty detects a resource 
leak (only with
+ * ERROR logging enabled for
+ * 
org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector).
+ *
+ * This should be used in a class rule:
+ * {@code
+ * @literal @ClassRule
+ *  public static final NettyLeakDetectionResource LEAK_DETECTION = new 
NettyLeakDetectionResource();
+ * }
+ */
+public class NettyLeakDetectionResource extends ExternalResource {
+   private static ResourceLeakDetectorFactory previousLeakDetector;
+   private static ResourceLeakDetector.Level previousLeakDetectorLevel;
+
+   @Override
+   protected void before() {
+   previousLeakDetector = ResourceLeakDetectorFactory.instance();
+   previousLeakDetectorLevel = ResourceLeakDetector.getLevel();
+
+   
ResourceLeakDetector.setLevel(ResourceLeakDetector.Level.PARANOID);
+   ResourceLeakDetectorFactory.setResourceLeakDetectorFactory(new 
FailingResourceLeakDetectorFactory());
--- End diff --

so this isn't something we necessarily have to deal with now, but if 
multiple tests that use this resource run in parallel we might not reset to the 
correct factory/level.

Effectively what we need is some kind of ref-counting so that only the first 
resource modifies the level and factory, and only the last resource resets them. 
:/




---


[GitHub] flink pull request #6363: [FLINK-9860][REST] fix buffer leak in FileUploadHa...

2018-07-18 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/6363#discussion_r203414529
  
--- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/rest/FileUploadHandlerTest.java
 ---
@@ -220,4 +224,5 @@ public void testUploadCleanupOnFailure() throws 
IOException {
}
MULTIPART_UPLOAD_RESOURCE.assertUploadDirectoryIsEmpty();
}
+
--- End diff --

revert


---


[GitHub] flink pull request #6363: [FLINK-9860][REST] fix buffer leak in FileUploadHa...

2018-07-18 Thread zentol
Github user zentol commented on a diff in the pull request:

https://github.com/apache/flink/pull/6363#discussion_r203414700
  
--- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/io/network/netty/NettyLeakDetectionResource.java
 ---
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.netty;
+
+import org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector;
+import 
org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetectorFactory;
+
+import org.junit.rules.ExternalResource;
+
+import static org.junit.Assert.fail;
+
+/**
+ * JUnit resource to fail with an assertion when Netty detects a resource 
leak (only with
+ * ERROR logging enabled for
+ * 
org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector).
+ *
+ * This should be used in a class rule:
+ * {@code
+ * @literal @ClassRule
+ *  public static final NettyLeakDetectionResource LEAK_DETECTION = new 
NettyLeakDetectionResource();
+ * }
+ */
+public class NettyLeakDetectionResource extends ExternalResource {
+   private static ResourceLeakDetectorFactory previousLeakDetector;
+   private static ResourceLeakDetector.Level previousLeakDetectorLevel;
+
+   @Override
+   protected void before() {
+   previousLeakDetector = ResourceLeakDetectorFactory.instance();
+   previousLeakDetectorLevel = ResourceLeakDetector.getLevel();
+
+   
ResourceLeakDetector.setLevel(ResourceLeakDetector.Level.PARANOID);
+   ResourceLeakDetectorFactory.setResourceLeakDetectorFactory(new 
FailingResourceLeakDetectorFactory());
+   }
+
+   @Override
+   protected void after() {
+   
ResourceLeakDetectorFactory.setResourceLeakDetectorFactory(previousLeakDetector);
+   ResourceLeakDetector.setLevel(previousLeakDetectorLevel);
+   }
+
+   static class FailingResourceLeakDetectorFactory extends 
ResourceLeakDetectorFactory {
--- End diff --

these could be private?


---


[jira] [Commented] (FLINK-9850) Add a string to the print method to identify output for DataStream

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547965#comment-16547965
 ] 

ASF GitHub Bot commented on FLINK-9850:
---

GitHub user yanghua opened a pull request:

https://github.com/apache/flink/pull/6367

[FLINK-9850] Add a string to the print method to identify output for 
DataStream

## What is the purpose of the change

*This pull request adds a string to the print method to identify output for 
DataStream*


## Brief change log

  - *add print(string) / printToErr(string) to DataStream Java API*
  - *add print(string) / printToErr(string) to DataStream Scala API*
  - *add print(string) to DataStream Python API*
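
A quick usage sketch (assuming the identifier semantics mirror 
`DataSet#print(String)` from FLINK-1486, i.e. each printed record is prefixed 
with the supplied string):

{code}
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<String> words = env.fromElements("hello", "world");

words.print("words");        // stdout lines carry the "words" identifier
words.printToErr("errors");  // same for stderr

env.execute("print-identifier-example");
{code}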

## Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes / **no**)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (**yes** / no)
  - The serializers: (yes / **no** / don't know)
  - The runtime per-record code paths (performance sensitive): (yes / 
**no** / don't know)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
  - The S3 file system connector: (yes / **no** / don't know)

## Documentation

  - Does this pull request introduce a new feature? (yes / **no**)
  - If yes, how is the feature documented? (not applicable / **docs** / 
**JavaDocs** / not documented)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yanghua/flink FLINK-9850

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/6367.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6367


commit 80215cd12618392ab0909a431863939d3353ca16
Author: yanghua 
Date:   2018-07-18T15:20:11Z

[FLINK-9850] Add a string to the print method to identify output for 
DataStream




> Add a string to the print method to identify output for DataStream
> --
>
> Key: FLINK-9850
> URL: https://issues.apache.org/jira/browse/FLINK-9850
> Project: Flink
>  Issue Type: New Feature
>  Components: DataStream API
>Reporter: Hequn Cheng
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
>
> The output of the print method of {{DataSet}} allows the user to supply a 
> String to identify the output (see 
> [FLINK-1486|https://issues.apache.org/jira/browse/FLINK-1486]). But 
> {{DataStream}} doesn't support this yet. It is valuable to add this feature for 
> {{DataStream}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9850) Add a string to the print method to identify output for DataStream

2018-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-9850:
--
Labels: pull-request-available  (was: )

> Add a string to the print method to identify output for DataStream
> --
>
> Key: FLINK-9850
> URL: https://issues.apache.org/jira/browse/FLINK-9850
> Project: Flink
>  Issue Type: New Feature
>  Components: DataStream API
>Reporter: Hequn Cheng
>Assignee: vinoyang
>Priority: Major
>  Labels: pull-request-available
>
> The output of the print method of {{DataSet}} allows the user to supply a 
> String to identify the output (see 
> [FLINK-1486|https://issues.apache.org/jira/browse/FLINK-1486]). But 
> {{DataStream}} doesn't support this yet. It is valuable to add this feature for 
> {{DataStream}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #6367: [FLINK-9850] Add a string to the print method to i...

2018-07-18 Thread yanghua
GitHub user yanghua opened a pull request:

https://github.com/apache/flink/pull/6367

[FLINK-9850] Add a string to the print method to identify output for 
DataStream

## What is the purpose of the change

*This pull request adds a string to the print method to identify output for 
DataStream*


## Brief change log

  - *add print(string) / printToErr(string) to DataStream Java API*
  - *add print(string) / printToErr(string) to DataStream Scala API*
  - *add print(string) to DataStream Python API*

## Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes / **no**)
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (**yes** / no)
  - The serializers: (yes / **no** / don't know)
  - The runtime per-record code paths (performance sensitive): (yes / 
**no** / don't know)
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
  - The S3 file system connector: (yes / **no** / don't know)

## Documentation

  - Does this pull request introduce a new feature? (yes / **no**)
  - If yes, how is the feature documented? (not applicable / **docs** / 
**JavaDocs** / not documented)


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yanghua/flink FLINK-9850

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/6367.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6367


commit 80215cd12618392ab0909a431863939d3353ca16
Author: yanghua 
Date:   2018-07-18T15:20:11Z

[FLINK-9850] Add a string to the print method to identify output for 
DataStream




---


[jira] [Commented] (FLINK-4534) Lack of synchronization in BucketingSink#restoreState()

2018-07-18 Thread Gary Yao (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-4534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547934#comment-16547934
 ] 

Gary Yao commented on FLINK-4534:
-

[~yuzhih...@gmail.com] Synchronization bears a performance penalty, and it 
makes the code harder to read. We should avoid unnecessary 
synchronization by defining clear contracts and threading models. Ideally, 
every line of code in the Flink repository should be idiomatic because it 
may serve as a role model for future contributions.


> Lack of synchronization in BucketingSink#restoreState()
> ---
>
> Key: FLINK-4534
> URL: https://issues.apache.org/jira/browse/FLINK-4534
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming Connectors
>Reporter: Ted Yu
>Assignee: zhangminglei
>Priority: Major
>
> Iteration over state.bucketStates is protected by synchronization in other 
> methods, except for the following in restoreState():
> {code}
> for (BucketState<T> bucketState : state.bucketStates.values()) {
> {code}
> and following in close():
> {code}
> for (Map.Entry<String, BucketState<T>> entry : 
> state.bucketStates.entrySet()) {
>   closeCurrentPartFile(entry.getValue());
> {code}
> w.r.t. bucketState.pendingFilesPerCheckpoint, there is a similar issue 
> starting at line 752:
> {code}
>   Set<Long> pastCheckpointIds = 
> bucketState.pendingFilesPerCheckpoint.keySet();
>   LOG.debug("Moving pending files to final location on restore.");
>   for (Long pastCheckpointId : pastCheckpointIds) {
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9815) YARNSessionCapacitySchedulerITCase flaky

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547936#comment-16547936
 ] 

ASF GitHub Bot commented on FLINK-9815:
---

Github user zentol commented on the issue:

https://github.com/apache/flink/pull/6352
  
yay travis is green.


> YARNSessionCapacitySchedulerITCase flaky
> 
>
> Key: FLINK-9815
> URL: https://issues.apache.org/jira/browse/FLINK-9815
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.6.0
>Reporter: Dawid Wysakowicz
>Assignee: Chesnay Schepler
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>
> The test fails because of dangling yarn applications.
> Logs: https://api.travis-ci.org/v3/job/402657694/log.txt
> It was also reported previously in [FLINK-8161] : 
> https://issues.apache.org/jira/browse/FLINK-8161?focusedCommentId=16480216=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16480216



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6352: [FLINK-9815][yarn][tests] Harden tests against slow job s...

2018-07-18 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/6352
  
yay travis is green.


---


[jira] [Commented] (FLINK-9869) Send PartitionInfo in batch to improve performance

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547929#comment-16547929
 ] 

ASF GitHub Bot commented on FLINK-9869:
---

Github user tison1 commented on the issue:

https://github.com/apache/flink/pull/6345
  
OK. This PR is about a performance improvement. I will try to provide a 
benchmark, but since it is inspired by our own batch table tasks, it might take 
some time. Still, since this PR sends the partition info concurrently and 
deploys the task in another thread, it should theoretically help.

Keep going on Flink 1.6! I will nudge you to review this one, 
though (laughs)
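
To make the idea concrete, here is an illustrative, self-contained sketch of 
the cache-then-flush pattern (class and method names are my own, not the 
actual Execution code):

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

class PartitionInfoBatcher<T> {
	private final List<T> cache = new ArrayList<>();

	// called whenever a single partition info arrives
	synchronized void cachePartitionInfo(T info) {
		cache.add(info);
	}

	// drains the cache and ships the whole batch off the main thread
	void sendPartitionInfoAsync(Executor executor, Consumer<List<T>> sendBatch) {
		executor.execute(() -> {
			final List<T> batch;
			synchronized (this) {
				batch = new ArrayList<>(cache);
				cache.clear();
			}
			if (!batch.isEmpty()) {
				sendBatch.accept(batch); // one call for the whole batch
			}
		});
	}
}
{code}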


> Send PartitionInfo in batch to improve performance
> --
>
> Key: FLINK-9869
> URL: https://issues.apache.org/jira/browse/FLINK-9869
> Project: Flink
>  Issue Type: Improvement
>  Components: Local Runtime
>Affects Versions: 1.5.1
>Reporter: 陈梓立
>Assignee: 陈梓立
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.5.2
>
>
> ... currently we send partition info as soon as it arrives. We could 
> `cachePartitionInfo` and then `sendPartitionInfoAsync`, which will improve 
> performance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6345: [FLINK-9869] Send PartitionInfo in batch to Improve perfo...

2018-07-18 Thread tison1
Github user tison1 commented on the issue:

https://github.com/apache/flink/pull/6345
  
OK. This PR is about a performance improvement. I will try to provide a 
benchmark, but since it is inspired by our own batch table tasks, it might take 
some time. Still, since this PR sends the partition info concurrently and 
deploys the task in another thread, it should theoretically help.

Keep going on Flink 1.6! I will nudge you to review this one, 
though (laughs)


---


[jira] [Commented] (FLINK-9886) Build SQL jars with every build

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547917#comment-16547917
 ] 

ASF GitHub Bot commented on FLINK-9886:
---

GitHub user twalthr opened a pull request:

https://github.com/apache/flink/pull/6366

[FLINK-9886] [sql-client] Build SQL jars with every build

## What is the purpose of the change

This enables the building of the SQL jars by default. This solves a couple 
of issues:

- Reduces user confusion for finding SQL jars in SNAPSHOT releases
- Enables end-to-end testing

## Brief change log

- Building is enabled by default but can be skipped with `-DskipSqlJars` (see the example below)
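
For example (assuming `skipSqlJars` is a plain Maven property):

{code}
$ mvn clean install -DskipTests                # SQL jars are now built by default
$ mvn clean install -DskipTests -DskipSqlJars  # skip them for a faster build
{code}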


## Verifying this change

This change is a trivial rework / code cleanup without any test coverage

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): yes
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
  - The serializers: no
  - The runtime per-record code paths (performance sensitive): no
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
  - The S3 file system connector: no

## Documentation

  - Does this pull request introduce a new feature? no
  - If yes, how is the feature documented? not documented


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/twalthr/flink FLINK-9886

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/6366.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6366


commit 1fcf364c1b28b29fd9a346ec981b9945507f7f60
Author: Timo Walther 
Date:   2018-07-18T11:30:28Z

[FLINK-9886] [sql-client] Build SQL jars with every build




> Build SQL jars with every build
> ---
>
> Key: FLINK-9886
> URL: https://issues.apache.org/jira/browse/FLINK-9886
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API  SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>  Labels: pull-request-available
>
> Currently, the shaded fat jars for SQL are only built in the {{-Prelease}} 
> profile. However, end-to-end tests require those jars and should also be able 
> to test them, e.g. the existing {{META-INF}} entry and proper shading. We should 
> build them with every build. If a build needs to be quicker, one can use 
> the {{-Pfast}} profile.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9886) Build SQL jars with every build

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547920#comment-16547920
 ] 

ASF GitHub Bot commented on FLINK-9886:
---

Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/6366
  
CC @zentol 


> Build SQL jars with every build
> ---
>
> Key: FLINK-9886
> URL: https://issues.apache.org/jira/browse/FLINK-9886
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API  SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>  Labels: pull-request-available
>
> Currently, the shaded fat jars for SQL are only built in the {{-Prelease}} 
> profile. However, end-to-end tests require those jars and should also be able 
> to test them, e.g. the existing {{META-INF}} entry and proper shading. We should 
> build them with every build. If a build needs to be quicker, one can use 
> the {{-Pfast}} profile.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-9886) Build SQL jars with every build

2018-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-9886:
--
Labels: pull-request-available  (was: )

> Build SQL jars with every build
> ---
>
> Key: FLINK-9886
> URL: https://issues.apache.org/jira/browse/FLINK-9886
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API  SQL
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Major
>  Labels: pull-request-available
>
> Currently, the shaded fat jars for SQL are only built in the {{-Prelease}} 
> profile. However, end-to-end tests require those jars and should also be able 
> to test them, e.g. the existing {{META-INF}} entry and proper shading. We should 
> build them with every build. If a build needs to be quicker, one can use 
> the {{-Pfast}} profile.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink issue #6366: [FLINK-9886] [sql-client] Build SQL jars with every build

2018-07-18 Thread twalthr
Github user twalthr commented on the issue:

https://github.com/apache/flink/pull/6366
  
CC @zentol 


---


[GitHub] flink pull request #6366: [FLINK-9886] [sql-client] Build SQL jars with ever...

2018-07-18 Thread twalthr
GitHub user twalthr opened a pull request:

https://github.com/apache/flink/pull/6366

[FLINK-9886] [sql-client] Build SQL jars with every build

## What is the purpose of the change

This enables the building of the SQL jars by default. This solves a couple 
of issues:

- Reduces user confusion for finding SQL jars in SNAPSHOT releases
- Enables end-to-end testing

## Brief change log

- Building is enabled by default but can be skipped with `-DskipSqlJars`


## Verifying this change

This change is a trivial rework / code cleanup without any test coverage

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): yes
  - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
  - The serializers: no
  - The runtime per-record code paths (performance sensitive): no
  - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
  - The S3 file system connector: no

## Documentation

  - Does this pull request introduce a new feature? no
  - If yes, how is the feature documented? not documented


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/twalthr/flink FLINK-9886

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/6366.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6366


commit 1fcf364c1b28b29fd9a346ec981b9945507f7f60
Author: Timo Walther 
Date:   2018-07-18T11:30:28Z

[FLINK-9886] [sql-client] Build SQL jars with every build




---


[jira] [Commented] (FLINK-9878) IO worker threads BLOCKED on SSL Session Cache while CMS full gc

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547909#comment-16547909
 ] 

ASF GitHub Bot commented on FLINK-9878:
---

Github user NicoK commented on a diff in the pull request:

https://github.com/apache/flink/pull/6355#discussion_r203405530
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java ---
@@ -160,4 +160,41 @@
key("security.ssl.verify-hostname")
.defaultValue(true)
.withDescription("Flag to enable peer’s hostname 
verification during ssl handshake.");
+
+   /**
+* SSL session cache size.
+*/
+   public static final ConfigOption<Integer> SSL_SESSION_CACHE_SIZE =
+   key("security.ssl.session-cache-size")
+   .defaultValue(-1)
+   .withDescription("The size of the cache used for 
storing SSL session objects. "
+   + "According to 
https://github.com/netty/netty/issues/832, you should always set "
+   + "this to an appropriate number to not run 
into a bug with stalling IO threads "
+   + "during garbage collection. (-1 = use system 
default).");
+
+   /**
+* SSL session timeout.
+*/
+   public static final ConfigOption<Integer> SSL_SESSION_TIMEOUT =
+   key("security.ssl.session-timeout")
+   .defaultValue(-1)
+   .withDescription("The timeout (in ms) for the cached 
SSL session objects. (-1 = use system default)");
+
+   /**
+* SSL session timeout during handshakes.
+*/
+   public static final ConfigOption<Integer> SSL_HANDSHAKE_TIMEOUT =
+   key("security.ssl.handshake-timeout")
+   .defaultValue(-1)
+   .withDescription("The timeout (in ms) during SSL 
handshake. (-1 = use system default)");
+
+   /**
+* SSL session timeout after flushing the `close_notify` message.
+*/
+   public static final ConfigOption<Integer> SSL_CLOSE_NOTIFY_FLUSH_TIMEOUT =
+   key("security.ssl.close-notify-flush-timeout")
+   .defaultValue(-1)
+   .withDescription("The timeout (in ms) for flushing the 
`close_notify` that was triggered by closing a " +
--- End diff --

Could try - strangely though, this is working for e.g. 
`security.kerberos.login.contexts`, although the desired effect (marking it as 
code) is not there... but that's a different problem.


> IO worker threads BLOCKED on SSL Session Cache while CMS full gc
> 
>
> Key: FLINK-9878
> URL: https://issues.apache.org/jira/browse/FLINK-9878
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>Affects Versions: 1.5.0, 1.5.1, 1.6.0
>Reporter: Nico Kruber
>Assignee: Nico Kruber
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.5.2, 1.6.0
>
>
> According to https://github.com/netty/netty/issues/832, there is a JDK issue 
> during garbage collection when the SSL session cache is not limited. We 
> should allow the user to configure this and further (advanced) SSL parameters 
> for fine-tuning to fix this and similar issues. In particular, the following 
> parameters should be configurable:
> - SSL session cache size
> - SSL session timeout
> - SSL handshake timeout
> - SSL close notify flush timeout



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #6355: [FLINK-9878][network][ssl] add more low-level ssl ...

2018-07-18 Thread NicoK
Github user NicoK commented on a diff in the pull request:

https://github.com/apache/flink/pull/6355#discussion_r203405530
  
--- Diff: 
flink-core/src/main/java/org/apache/flink/configuration/SecurityOptions.java ---
@@ -160,4 +160,41 @@
key("security.ssl.verify-hostname")
.defaultValue(true)
.withDescription("Flag to enable peer’s hostname 
verification during ssl handshake.");
+
+   /**
+* SSL session cache size.
+*/
+   public static final ConfigOption<Integer> SSL_SESSION_CACHE_SIZE =
+   key("security.ssl.session-cache-size")
+   .defaultValue(-1)
+   .withDescription("The size of the cache used for 
storing SSL session objects. "
+   + "According to 
https://github.com/netty/netty/issues/832, you should always set "
+   + "this to an appropriate number to not run 
into a bug with stalling IO threads "
+   + "during garbage collection. (-1 = use system 
default).");
+
+   /**
+* SSL session timeout.
+*/
+   public static final ConfigOption<Integer> SSL_SESSION_TIMEOUT =
+   key("security.ssl.session-timeout")
+   .defaultValue(-1)
+   .withDescription("The timeout (in ms) for the cached 
SSL session objects. (-1 = use system default)");
+
+   /**
+* SSL session timeout during handshakes.
+*/
+   public static final ConfigOption<Integer> SSL_HANDSHAKE_TIMEOUT =
+   key("security.ssl.handshake-timeout")
+   .defaultValue(-1)
+   .withDescription("The timeout (in ms) during SSL 
handshake. (-1 = use system default)");
+
+   /**
+* SSL session timeout after flushing the `close_notify` message.
+*/
+   public static final ConfigOption<Integer> SSL_CLOSE_NOTIFY_FLUSH_TIMEOUT =
+   key("security.ssl.close-notify-flush-timeout")
+   .defaultValue(-1)
+   .withDescription("The timeout (in ms) for flushing the 
`close_notify` that was triggered by closing a " +
--- End diff --

Could try - strangely though, this is working for e.g. 
`security.kerberos.login.contexts`, although the desired effect (marking it as 
code) is not there... but that's a different problem.


---
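
A minimal sketch of how such options are typically applied, assuming the JDK's `SSLSessionContext` plus Netty 4.1's `SslHandler` API (Flink uses a shaded Netty, and the actual wiring in the PR may differ):

```
import java.util.concurrent.TimeUnit;

import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLSessionContext;

import io.netty.handler.ssl.SslHandler;

final class SslTuningSketch {

    /** Applies the four new options to a server-side engine; -1 keeps the system default. */
    static SslHandler createTunedSslHandler(
            SSLContext sslContext,
            int sessionCacheSize,
            int sessionTimeoutMs,
            int handshakeTimeoutMs,
            int closeNotifyFlushTimeoutMs) {

        SSLSessionContext sessionCtx = sslContext.getServerSessionContext();
        if (sessionCacheSize >= 0) {
            // Bounding the cache avoids the stalled IO threads from netty#832.
            sessionCtx.setSessionCacheSize(sessionCacheSize);
        }
        if (sessionTimeoutMs >= 0) {
            // The JDK API takes seconds while the config option is in milliseconds.
            sessionCtx.setSessionTimeout((int) TimeUnit.MILLISECONDS.toSeconds(sessionTimeoutMs));
        }

        SSLEngine engine = sslContext.createSSLEngine();
        engine.setUseClientMode(false);

        SslHandler handler = new SslHandler(engine);
        if (handshakeTimeoutMs >= 0) {
            handler.setHandshakeTimeoutMillis(handshakeTimeoutMs);
        }
        if (closeNotifyFlushTimeoutMs >= 0) {
            handler.setCloseNotifyFlushTimeoutMillis(closeNotifyFlushTimeoutMs);
        }
        return handler;
    }
}
```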


[jira] [Resolved] (FLINK-9575) Potential race condition when removing JobGraph in HA

2018-07-18 Thread Till Rohrmann (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Rohrmann resolved FLINK-9575.
--
Resolution: Fixed

Fixed via
master:
e984168e2eca59c08da90bd5feeac458eaa91bed
f6b2e8c5ff0304e4835d2dc8c792a0d055679603

1.6.0:
e2b4ffc016da822dda544b31fb3caf679f80a9d9
b9fe077d221bdb013ed57f2555405c9fe4a96aa1

1.5.2:
1bf77cfe17bc046772d02b22d6347388de359ff6
9c4b40dd0bbb22f8f312b0fc42f54a1a4619bf53

> Potential race condition when removing JobGraph in HA
> -
>
> Key: FLINK-9575
> URL: https://issues.apache.org/jira/browse/FLINK-9575
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0
>Reporter: Dominik Wosiński
>Assignee: Dominik Wosiński
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.5.2, 1.6.0
>
>
> When we are removing the _JobGraph_ from _JobManager_ for example after 
> invoking _cancel()_, the following code is executed : 
> {noformat}
>  
> val futureOption = currentJobs.get(jobID) match {
> case Some((eg, _)) =>
> val result = if (removeJobFromStateBackend) {
> val futureOption = Some(future {
> try {
> // ...otherwise, we can have lingering resources when there is a concurrent 
> shutdown
> // and the ZooKeeper client is closed. Not removing the job immediately allow 
> the
> // shutdown to release all resources.
> submittedJobGraphs.removeJobGraph(jobID)
> } catch {
> case t: Throwable => log.warn(s"Could not remove submitted job graph 
> $jobID.", t)
> }
> }(context.dispatcher))
> try {
> archive ! decorateMessage(
> ArchiveExecutionGraph(
> jobID,
> ArchivedExecutionGraph.createFrom(eg)))
> } catch {
> case t: Throwable => log.warn(s"Could not archive the execution graph $eg.", 
> t)
> }
> futureOption
> } else {
> None
> }
> currentJobs.remove(jobID)
> result
> case None => None
> }
> // remove all job-related BLOBs from local and HA store
> libraryCacheManager.unregisterJob(jobID)
> blobServer.cleanupJob(jobID, removeJobFromStateBackend)
> jobManagerMetricGroup.removeJob(jobID)
> futureOption
> }{noformat}
> This causes asynchronous removal of the job graph but synchronous removal of 
> the BLOB files connected with this job. As far as I understand, this means 
> there is a potential problem: we can fail to remove the job graph from 
> _submittedJobGraphs_ while its BLOBs are already gone. If the JobManager 
> fails and a new leader is elected, it can try to recover such a job, but 
> recovery will fail with an exception since the assigned BLOBs were already 
> removed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
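
As a hedged sketch of the fix direction (the names below are illustrative placeholders, not the actual patch), the race disappears if the BLOB cleanup is chained strictly after the asynchronous job graph removal, so a newly elected leader can never find a job graph whose BLOBs are already gone:

```
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

final class JobRemovalOrderingSketch {

    // 'removeJobGraph' stands in for submittedJobGraphs.removeJobGraph(jobID) and
    // 'cleanupJobBlobs' for the BLOB store cleanup; both are assumed placeholders
    // that only demonstrate the ordering.
    static CompletableFuture<Void> removeJob(
            Runnable removeJobGraph, Runnable cleanupJobBlobs, Executor executor) {
        return CompletableFuture
                .runAsync(removeJobGraph, executor) // asynchronous removal, as before
                .thenRun(cleanupJobBlobs);          // BLOBs are removed only afterwards
    }
}
```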


[jira] [Commented] (FLINK-9860) Netty resource leak on receiver side

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547890#comment-16547890
 ] 

ASF GitHub Bot commented on FLINK-9860:
---

Github user NicoK commented on a diff in the pull request:

https://github.com/apache/flink/pull/6363#discussion_r203398509
  
--- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/rest/FileUploadHandlerTest.java
 ---
@@ -50,6 +55,24 @@
 
private static final ObjectMapper OBJECT_MAPPER = 
RestMapperUtils.getStrictObjectMapper();
 
+   private static ResourceLeakDetectorFactory previousLeakDetector;
+   private static ResourceLeakDetector.Level previousLeakDetectorLevel;
+
+   @BeforeClass
+   public static void setLeakDetector() {
--- End diff --

great idea - done


> Netty resource leak on receiver side
> 
>
> Key: FLINK-9860
> URL: https://issues.apache.org/jira/browse/FLINK-9860
> Project: Flink
>  Issue Type: Bug
>  Components: Network
>Affects Versions: 1.5.1, 1.6.0
>Reporter: Till Rohrmann
>Assignee: Nico Kruber
>Priority: Blocker
>  Labels: pull-request-available, test-stability
> Fix For: 1.5.2, 1.6.0
>
>
> The Hadoop-free Wordcount end-to-end test fails with the following exception:
> {code}
> ERROR org.apache.flink.shaded.netty4.io.netty.util.ResourceLeakDetector  - 
> LEAK: ByteBuf.release() was not called before it's garbage-collected. See 
> http://netty.io/wiki/reference-counted-objects.html for more information.
> Recent access records: 
> Created at:
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:331)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:185)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:176)
>   
> org.apache.flink.shaded.netty4.io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:137)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:147)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
>   
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
>   
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
>   
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> {code}
> We might have a resource leak on the receiving side of our network stack.
> https://api.travis-ci.org/v3/job/404225956/log.txt



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #6363: [FLINK-9860][REST] fix buffer leak in FileUploadHa...

2018-07-18 Thread NicoK
Github user NicoK commented on a diff in the pull request:

https://github.com/apache/flink/pull/6363#discussion_r203398509
  
--- Diff: 
flink-runtime/src/test/java/org/apache/flink/runtime/rest/FileUploadHandlerTest.java
 ---
@@ -50,6 +55,24 @@
 
private static final ObjectMapper OBJECT_MAPPER = 
RestMapperUtils.getStrictObjectMapper();
 
+   private static ResourceLeakDetectorFactory previousLeakDetector;
+   private static ResourceLeakDetector.Level previousLeakDetectorLevel;
+
+   @BeforeClass
+   public static void setLeakDetector() {
--- End diff --

great idea - done


---
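
For reference, a minimal sketch of the pattern this diff introduces, assuming plain Netty 4.1 and JUnit 4 (the actual test also swaps the `ResourceLeakDetectorFactory`, which is omitted here):

```
import io.netty.util.ResourceLeakDetector;

import org.junit.AfterClass;
import org.junit.BeforeClass;

public class LeakDetectingTestBase {

    private static ResourceLeakDetector.Level previousLeakDetectorLevel;

    @BeforeClass
    public static void setLeakDetector() {
        previousLeakDetectorLevel = ResourceLeakDetector.getLevel();
        // PARANOID tracks every allocated buffer, so a missing release()
        // reliably surfaces as a LEAK error during the test run.
        ResourceLeakDetector.setLevel(ResourceLeakDetector.Level.PARANOID);
    }

    @AfterClass
    public static void restoreLeakDetector() {
        // Restore the previous level so other tests keep the default sampling.
        ResourceLeakDetector.setLevel(previousLeakDetectorLevel);
    }
}
```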


[jira] [Created] (FLINK-9890) Remove obsolete Class ResourceManagerConfiguration

2018-07-18 Thread Gary Yao (JIRA)
Gary Yao created FLINK-9890:
---

 Summary: Remove obsolete Class ResourceManagerConfiguration 
 Key: FLINK-9890
 URL: https://issues.apache.org/jira/browse/FLINK-9890
 Project: Flink
  Issue Type: Improvement
  Components: Distributed Coordination
Affects Versions: 1.6.0
 Environment: Rev: 690bc370e19d8003add4e41c05acfb4dccc662b4
Reporter: Gary Yao
Assignee: Gary Yao


The class {{ResourceManagerConfiguration}} is effectively unused and should 
therefore be removed to avoid confusion.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9575) Potential race condition when removing JobGraph in HA

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547868#comment-16547868
 ] 

ASF GitHub Bot commented on FLINK-9575:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/6322


> Potential race condition when removing JobGraph in HA
> -
>
> Key: FLINK-9575
> URL: https://issues.apache.org/jira/browse/FLINK-9575
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0
>Reporter: Dominik Wosiński
>Assignee: Dominik Wosiński
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.5.2, 1.6.0
>
>
> When we are removing the _JobGraph_ from _JobManager_ for example after 
> invoking _cancel()_, the following code is executed : 
> {noformat}
>  
> val futureOption = currentJobs.get(jobID) match {
> case Some((eg, _)) =>
> val result = if (removeJobFromStateBackend) {
> val futureOption = Some(future {
> try {
> // ...otherwise, we can have lingering resources when there is a concurrent 
> shutdown
> // and the ZooKeeper client is closed. Not removing the job immediately allow 
> the
> // shutdown to release all resources.
> submittedJobGraphs.removeJobGraph(jobID)
> } catch {
> case t: Throwable => log.warn(s"Could not remove submitted job graph 
> $jobID.", t)
> }
> }(context.dispatcher))
> try {
> archive ! decorateMessage(
> ArchiveExecutionGraph(
> jobID,
> ArchivedExecutionGraph.createFrom(eg)))
> } catch {
> case t: Throwable => log.warn(s"Could not archive the execution graph $eg.", 
> t)
> }
> futureOption
> } else {
> None
> }
> currentJobs.remove(jobID)
> result
> case None => None
> }
> // remove all job-related BLOBs from local and HA store
> libraryCacheManager.unregisterJob(jobID)
> blobServer.cleanupJob(jobID, removeJobFromStateBackend)
> jobManagerMetricGroup.removeJob(jobID)
> futureOption
> }{noformat}
> This causes asynchronous removal of the job graph but synchronous removal of 
> the BLOB files connected with this job. As far as I understand, this means 
> there is a potential problem: we can fail to remove the job graph from 
> _submittedJobGraphs_ while its BLOBs are already gone. If the JobManager 
> fails and a new leader is elected, it can try to recover such a job, but 
> recovery will fail with an exception since the assigned BLOBs were already 
> removed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #6322: [FLINK-9575]: Remove job-related BLOBS only if the...

2018-07-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/6322


---


[jira] [Updated] (FLINK-9849) Upgrade hbase version to 2.0.1 for hbase connector

2018-07-18 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-9849:
--
Labels: pull-request-available  (was: )

> Upgrade hbase version to 2.0.1 for hbase connector
> --
>
> Key: FLINK-9849
> URL: https://issues.apache.org/jira/browse/FLINK-9849
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: zhangminglei
>Priority: Major
>  Labels: pull-request-available
>
> Currently, HBase 1.4.3 is used for the HBase connector.
> We should upgrade to 2.0.1, which was recently released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9849) Upgrade hbase version to 2.0.1 for hbase connector

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547862#comment-16547862
 ] 

ASF GitHub Bot commented on FLINK-9849:
---

GitHub user zhangminglei opened a pull request:

https://github.com/apache/flink/pull/6365

[FLINK-9849] [hbase] Hbase upgrade

## What is the purpose of the change

Upgrade hbase version to 2.0.1 for hbase connector



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhangminglei/flink flink-9849-hbase-upgrade

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/6365.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6365


commit cb4bc4b641565e6caf823e85d541df3022a59237
Author: zhangminglei 
Date:   2018-07-18T13:44:37Z

[FLINK-9849] [hbase] Hbase upgrade




> Upgrade hbase version to 2.0.1 for hbase connector
> --
>
> Key: FLINK-9849
> URL: https://issues.apache.org/jira/browse/FLINK-9849
> Project: Flink
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: zhangminglei
>Priority: Major
>  Labels: pull-request-available
>
> Currently, HBase 1.4.3 is used for the HBase connector.
> We should upgrade to 2.0.1, which was recently released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #6365: [FLINK-9849] [hbase] Hbase upgrade

2018-07-18 Thread zhangminglei
GitHub user zhangminglei opened a pull request:

https://github.com/apache/flink/pull/6365

[FLINK-9849] [hbase] Hbase upgrade

## What is the purpose of the change

Upgrade hbase version to 2.0.1 for hbase connector



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhangminglei/flink flink-9849-hbase-upgrade

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/6365.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6365


commit cb4bc4b641565e6caf823e85d541df3022a59237
Author: zhangminglei 
Date:   2018-07-18T13:44:37Z

[FLINK-9849] [hbase] Hbase upgrade




---


[GitHub] flink pull request #6364: [hotfix] typo for SqlExecutionException msg

2018-07-18 Thread xueyumusic
GitHub user xueyumusic opened a pull request:

https://github.com/apache/flink/pull/6364

[hotfix] typo for SqlExecutionException msg

Fix a typo in the SqlExecutionException message in ExecutionContext.java.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xueyumusic/flink hotfix1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/6364.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6364


commit a8a4e497acf1e62c1c632724f7bf6604032302fa
Author: xueyu <278006819@...>
Date:   2018-07-18T13:39:57Z

hotfix for SqlExecutionException msg




---


[jira] [Commented] (FLINK-9862) Update end-to-end test to use RocksDB backed timers

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547853#comment-16547853
 ] 

ASF GitHub Bot commented on FLINK-9862:
---

Github user StefanRRichter commented on a diff in the pull request:

https://github.com/apache/flink/pull/6351#discussion_r203382388
  
--- Diff: 
flink-end-to-end-tests/flink-datastream-allround-test/src/main/java/org/apache/flink/streaming/tests/SemanticsCheckMapper.java
 ---
@@ -56,6 +56,7 @@ public void flatMap(Event event, Collector<String> out) 
throws Exception {
if (validator.check(currentValue, nextValue)) {
sequenceValue.update(nextValue);
} else {
+   sequenceValue.update(nextValue);
--- End diff --

```
sequenceValue.update(nextValue);
if (!validator.check(currentValue, nextValue)) {
out.collect("Alert: " + currentValue + " -> " + 
nextValue + " (" + event.getKey() + ")");
}
```


> Update end-to-end test to use RocksDB backed timers
> ---
>
> Key: FLINK-9862
> URL: https://issues.apache.org/jira/browse/FLINK-9862
> Project: Flink
>  Issue Type: Sub-task
>  Components: State Backends, Checkpointing, Streaming
>Affects Versions: 1.6.0
>Reporter: Till Rohrmann
>Assignee: Tzu-Li (Gordon) Tai
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>
> We should add or modify an end-to-end test to use RocksDB backed timers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #6351: [FLINK-9862] [test] Extend general purpose DataSt...

2018-07-18 Thread StefanRRichter
Github user StefanRRichter commented on a diff in the pull request:

https://github.com/apache/flink/pull/6351#discussion_r203382388
  
--- Diff: 
flink-end-to-end-tests/flink-datastream-allround-test/src/main/java/org/apache/flink/streaming/tests/SemanticsCheckMapper.java
 ---
@@ -56,6 +56,7 @@ public void flatMap(Event event, Collector<String> out) 
throws Exception {
if (validator.check(currentValue, nextValue)) {
sequenceValue.update(nextValue);
} else {
+   sequenceValue.update(nextValue);
--- End diff --

```
sequenceValue.update(nextValue);
if (!validator.check(currentValue, nextValue)) {
out.collect("Alert: " + currentValue + " -> " + 
nextValue + " (" + event.getKey() + ")");
}
```


---
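
Rendered in context, the suggestion amounts to the following shape (a sketch only: `Event`, `validator`, and `sequenceValue` come from the surrounding `SemanticsCheckMapper`, and `event.getSequenceValue()` is an assumed accessor):

```
public void flatMap(Event event, Collector<String> out) throws Exception {
    Long currentValue = sequenceValue.value();
    long nextValue = event.getSequenceValue(); // assumed accessor

    // Update the state exactly once instead of duplicating the call per branch.
    sequenceValue.update(nextValue);

    if (!validator.check(currentValue, nextValue)) {
        out.collect("Alert: " + currentValue + " -> " + nextValue
                + " (" + event.getKey() + ")");
    }
}
```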


[GitHub] flink pull request #6351: [FLINK-9862] [test] Extend general purpose DataSt...

2018-07-18 Thread StefanRRichter
Github user StefanRRichter commented on a diff in the pull request:

https://github.com/apache/flink/pull/6351#discussion_r203380952
  
--- Diff: 
flink-end-to-end-tests/flink-datastream-allround-test/src/main/java/org/apache/flink/streaming/tests/DataStreamAllroundTestJobFactory.java
 ---
@@ -184,12 +189,16 @@
 
private static final ConfigOption<Long> SEQUENCE_GENERATOR_SRC_EVENT_TIME_MAX_OUT_OF_ORDERNESS = ConfigOptions
.key("sequence_generator_source.event_time.max_out_of_order")
-   .defaultValue(500L);
+   .defaultValue(0L);
 
private static final ConfigOption<Long> SEQUENCE_GENERATOR_SRC_EVENT_TIME_CLOCK_PROGRESS = ConfigOptions
.key("sequence_generator_source.event_time.clock_progress")
.defaultValue(100L);
 
+   private static final ConfigOption<Long> TUMBLING_WINDOW_OPERATOR_NUM_EVENTS = ConfigOptions
+   .key("sliding_window_operator.num_events")
--- End diff --

`tumbling` instead of `sliding`?


---


[jira] [Commented] (FLINK-9862) Update end-to-end test to use RocksDB backed timers

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547848#comment-16547848
 ] 

ASF GitHub Bot commented on FLINK-9862:
---

Github user StefanRRichter commented on a diff in the pull request:

https://github.com/apache/flink/pull/6351#discussion_r203380952
  
--- Diff: 
flink-end-to-end-tests/flink-datastream-allround-test/src/main/java/org/apache/flink/streaming/tests/DataStreamAllroundTestJobFactory.java
 ---
@@ -184,12 +189,16 @@
 
private static final ConfigOption<Long> SEQUENCE_GENERATOR_SRC_EVENT_TIME_MAX_OUT_OF_ORDERNESS = ConfigOptions
.key("sequence_generator_source.event_time.max_out_of_order")
-   .defaultValue(500L);
+   .defaultValue(0L);
 
private static final ConfigOption<Long> SEQUENCE_GENERATOR_SRC_EVENT_TIME_CLOCK_PROGRESS = ConfigOptions
.key("sequence_generator_source.event_time.clock_progress")
.defaultValue(100L);
 
+   private static final ConfigOption<Long> TUMBLING_WINDOW_OPERATOR_NUM_EVENTS = ConfigOptions
+   .key("sliding_window_operator.num_events")
--- End diff --

`tumbling` instead of `sliding`?


> Update end-to-end test to use RocksDB backed timers
> ---
>
> Key: FLINK-9862
> URL: https://issues.apache.org/jira/browse/FLINK-9862
> Project: Flink
>  Issue Type: Sub-task
>  Components: State Backends, Checkpointing, Streaming
>Affects Versions: 1.6.0
>Reporter: Till Rohrmann
>Assignee: Tzu-Li (Gordon) Tai
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>
> We should add or modify an end-to-end test to use RocksDB backed timers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-9185) Potential null dereference in PrioritizedOperatorSubtaskState#resolvePrioritizedAlternatives

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547846#comment-16547846
 ] 

ASF GitHub Bot commented on FLINK-9185:
---

Github user StephenJeson commented on a diff in the pull request:

https://github.com/apache/flink/pull/5894#discussion_r203378828
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/PrioritizedOperatorSubtaskState.java
 ---
@@ -281,10 +281,15 @@ public PrioritizedOperatorSubtaskState build() {
// approve-function signaled true.
if (alternative != null
&& alternative.hasState()
-   && alternative.size() == 1
-   && approveFun.apply(reference, 
alternative.iterator().next())) {
--- End diff --

Many thanks for your suggestion, I'll try to refine it.


> Potential null dereference in 
> PrioritizedOperatorSubtaskState#resolvePrioritizedAlternatives
> 
>
> Key: FLINK-9185
> URL: https://issues.apache.org/jira/browse/FLINK-9185
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Stephen Jason
>Priority: Minor
>  Labels: pull-request-available
>
> {code}
> if (alternative != null
>   && alternative.hasState()
>   && alternative.size() == 1
>   && approveFun.apply(reference, alternative.iterator().next())) {
> {code}
> The return value from approveFun.apply would be unboxed.
> We should check that the return value is not null.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
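
A minimal sketch of the null-safe variant, assuming `approveFun` is a `BiFunction` that may return a null `Boolean` (which is what makes the implicit unboxing dangerous):

```
import java.util.function.BiFunction;

final class NullSafeApprovalSketch {

    // Comparing against Boolean.TRUE avoids the implicit unboxing that would
    // throw a NullPointerException when the function returns null.
    static <T> boolean approves(BiFunction<T, T, Boolean> approveFun, T reference, T candidate) {
        return Boolean.TRUE.equals(approveFun.apply(reference, candidate));
    }
}
```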


[GitHub] flink pull request #5894: [FLINK-9185] [runtime] Fix potential null derefere...

2018-07-18 Thread StephenJeson
Github user StephenJeson commented on a diff in the pull request:

https://github.com/apache/flink/pull/5894#discussion_r203378828
  
--- Diff: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/PrioritizedOperatorSubtaskState.java
 ---
@@ -281,10 +281,15 @@ public PrioritizedOperatorSubtaskState build() {
// approve-function signaled true.
if (alternative != null
&& alternative.hasState()
-   && alternative.size() == 1
-   && approveFun.apply(reference, 
alternative.iterator().next())) {
--- End diff --

Many thanks for your suggestion, I'll try to refine it.


---


[jira] [Commented] (FLINK-9858) State TTL End-to-End Test

2018-07-18 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547837#comment-16547837
 ] 

ASF GitHub Bot commented on FLINK-9858:
---

Github user StefanRRichter commented on a diff in the pull request:

https://github.com/apache/flink/pull/6361#discussion_r203377098
  
--- Diff: flink-end-to-end-tests/test-scripts/common.sh ---
@@ -240,6 +240,15 @@ function start_cluster {
   done
 }
 
+function start_taskmanagers {
--- End diff --

I think you could just reuse function `tm_watchdog` for this purpose.


> State TTL End-to-End Test
> -
>
> Key: FLINK-9858
> URL: https://issues.apache.org/jira/browse/FLINK-9858
> Project: Flink
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Andrey Zagrebin
>Assignee: Andrey Zagrebin
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.6.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] flink pull request #6361: [FLINK-9858] [tests] State TTL End-to-End Test

2018-07-18 Thread StefanRRichter
Github user StefanRRichter commented on a diff in the pull request:

https://github.com/apache/flink/pull/6361#discussion_r203377098
  
--- Diff: flink-end-to-end-tests/test-scripts/common.sh ---
@@ -240,6 +240,15 @@ function start_cluster {
   done
 }
 
+function start_taskmanagers {
--- End diff --

I think you could just reuse function `tm_watchdog` for this purpose.


---


[jira] [Closed] (FLINK-9832) Allow commas in job submission query params

2018-07-18 Thread Chesnay Schepler (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-9832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-9832.
---
   Resolution: Won't Fix
Fix Version/s: (was: 1.5.2)
   (was: 1.6.0)

> Allow commas in job submission query params
> ---
>
> Key: FLINK-9832
> URL: https://issues.apache.org/jira/browse/FLINK-9832
> Project: Flink
>  Issue Type: Bug
>  Components: REST
>Affects Versions: 1.5.1
>Reporter: Ufuk Celebi
>Assignee: Chesnay Schepler
>Priority: Blocker
>
> As reported on the user mailing list in the thread "Run programs w/ params 
> including comma via REST api" [1], submitting a job with mainArgs that 
> include a comma results in an exception.
> To reproduce submit a job with the following mainArgs:
> {code}
> --servers 10.100.98.9:9092,10.100.98.237:9092
> {code}
> The request fails with
> {code}
> Expected only one value [--servers 10.100.98.9:9092, 10.100.98.237:9092].
> {code}
> As a workaround, users have to use a different delimiter such as {{;}}.
> The proper fix for this API would make these params part of the {{POST}} 
> request instead of relying on query params (as noted in FLINK-9499). I think 
> it's still valuable to fix this as part of a bugfix release for 1.5.
> [1] 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Run-programs-w-params-including-comma-via-REST-api-td19662.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
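
To illustrate the workaround (the parameter name below is an assumption; the report does not show which REST query parameter carries the program arguments):

rejected: programArgs=--servers 10.100.98.9:9092,10.100.98.237:9092   (split into two values)
accepted: programArgs=--servers 10.100.98.9:9092;10.100.98.237:9092   (parsed as one value)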

