[jira] [Created] (FLINK-11882) Introduce BytesHashMap to batch hash agg

2019-03-12 Thread Jingsong Lee (JIRA)
Jingsong Lee created FLINK-11882:


 Summary: Introduce BytesHashMap to batch hash agg
 Key: FLINK-11882
 URL: https://issues.apache.org/jira/browse/FLINK-11882
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Operators
Reporter: Jingsong Lee
Assignee: Jingsong Lee


Introduce a bytes-based hash table.
It can be used for performing aggregations where the aggregated values are 
fixed-width.
Because the data is stored in contiguous memory, variable-length AggBuffers 
cannot be used with this HashMap. The key-value layout of the hash map is 
designed to reduce the cost of key fetching during lookups.

Add a test that performs a complete hash aggregation: when the HashMap has 
enough memory, a pure hash aggregation is performed; when memory is 
insufficient, it falls back to sort-based aggregation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] JingsongLi opened a new pull request #7961: [FLINK-11882][table-runtime-blink] Introduce BytesHashMap to batch hash agg

2019-03-12 Thread GitBox
JingsongLi opened a new pull request #7961: [FLINK-11882][table-runtime-blink] 
Introduce BytesHashMap to batch hash agg
URL: https://github.com/apache/flink/pull/7961
 
 
   ## What is the purpose of the change
   
   Introduce a bytes-based hash table.
   It can be used for performing aggregations where the aggregated values are 
fixed-width.
   Because the data is stored in contiguous memory, variable-length AggBuffers 
cannot be used with this HashMap. The key-value layout of the hash map is 
designed to reduce the cost of key fetching during lookups.
   
   Add a test that performs a complete hash aggregation: when the HashMap has 
enough memory, a pure hash aggregation is performed; when memory is 
insufficient, it falls back to sort-based aggregation.
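   
   To make the layout concrete, here is a minimal, self-contained sketch of 
the idea (illustrative only; the names below are not the `BytesHashMap` API 
from this PR): a map that keeps fixed-width aggregate buffers in one 
contiguous array and tells the caller when it is out of space, at which point 
the operator would fall back to sort-based aggregation.
   
```java
import java.util.Arrays;

/**
 * Illustrative sketch only (not the BytesHashMap API): an open-addressing
 * map from int keys to fixed-width aggregate buffers, all stored in one
 * contiguous byte array. Keys equal to Integer.MIN_VALUE are unsupported.
 */
final class FixedWidthAggMap {
	private static final int EMPTY = Integer.MIN_VALUE;

	private final int valueWidth; // width of one aggregate buffer, in bytes
	private final int[] keys;
	private final byte[] buffers; // contiguous memory for all agg buffers

	FixedWidthAggMap(int capacity, int valueWidth) {
		this.valueWidth = valueWidth;
		this.keys = new int[capacity];
		Arrays.fill(this.keys, EMPTY);
		this.buffers = new byte[capacity * valueWidth];
	}

	/**
	 * Returns the offset of the aggregate buffer for the key, inserting a
	 * zeroed buffer on first access, or -1 when no slot is free, which is
	 * where the operator would spill and degrade to sort-based aggregation.
	 */
	int lookupOrInsert(int key) {
		int slot = Math.floorMod(Integer.hashCode(key), keys.length);
		for (int probes = 0; probes < keys.length; probes++) {
			if (keys[slot] == key) {
				return slot * valueWidth; // existing group: update in place
			}
			if (keys[slot] == EMPTY) {
				keys[slot] = key; // new group gets a zeroed buffer
				return slot * valueWidth;
			}
			slot = (slot + 1) % keys.length; // linear probing
		}
		return -1; // map is full: trigger fallback to sort-based aggregation
	}

	byte[] buffers() {
		return buffers;
	}
}
```
   
   Because every aggregate buffer has the same width, updates happen in place 
at a computed offset, which is exactly why variable-length AggBuffers do not 
fit this layout.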
   
   ## Verifying this change
   
   Unit tests & test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes, it only adds a 
test dependency)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes)
 - If yes, how is the feature documented? (JavaDocs)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot commented on issue #7961: [FLINK-11882][table-runtime-blink] Introduce BytesHashMap to batch hash agg

2019-03-12 Thread GitBox
flinkbot commented on issue #7961: [FLINK-11882][table-runtime-blink] Introduce 
BytesHashMap to batch hash agg
URL: https://github.com/apache/flink/pull/7961#issuecomment-471895102
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process.
   The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   
   ## Bot commands
   The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] JingsongLi commented on a change in pull request #7958: [FLINK-11881][table-planner-blink] Introduce code generated typed sort to blink table

2019-03-12 Thread GitBox
JingsongLi commented on a change in pull request #7958: 
[FLINK-11881][table-planner-blink] Introduce code generated typed sort to blink 
table
URL: https://github.com/apache/flink/pull/7958#discussion_r264556661
 
 

 ##
 File path: 
flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/dataformat/BinaryGeneric.java
 ##
 @@ -79,4 +79,16 @@ public void materialize() {
		int offset = (int) (offsetAndSize >> 32);
		return new BinaryGeneric<>(segments, offset + baseOffset, size, null);
	}
+
+	public static <T> T getJavaObjectFromBinaryGeneric(BinaryGeneric<T> value, TypeSerializer<T> ser) {
+		if (value.getJavaObject() == null) {
+			try {
+				value.setJavaObject(InstantiationUtil.deserializeFromByteArray(ser,
+					SegmentsUtil.copyToBytes(value.getSegments(), value.getOffset(), value.getSizeInBytes())));
+			} catch (IOException e) {
+				throw new RuntimeException(e);
 
 Review comment:
   Good point, Thanks~


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] dawidwys commented on issue #7960: [FLINK-9854][table] Allow passing multi-line input to SQL Client CLI

2019-03-12 Thread GitBox
dawidwys commented on issue #7960: [FLINK-9854][table] Allow passing multi-line 
input to SQL Client CLI
URL: https://github.com/apache/flink/pull/7960#issuecomment-471896256
 
 
   Hi @yanghua, could you extend the description with an explanation of how 
the piping multi-line support actually works? Why does only the last command 
in a pipe have to end with `;`? That looks a bit weird to me.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] dawidwys commented on issue #7956: fix Environment.java catalogs lose

2019-03-12 Thread GitBox
dawidwys commented on issue #7956: fix  Environment.java catalogs lose
URL: https://github.com/apache/flink/pull/7956#issuecomment-471896714
 
 
   Hi @ssyue, could you please improve the description of this PR? Could you 
also edit the title of this PR to match our guidelines (included in the PR 
template)?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11882) Introduce BytesHashMap to batch hash agg

2019-03-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11882:
---
Labels: pull-request-available  (was: )

> Introduce BytesHashMap to batch hash agg
> 
>
> Key: FLINK-11882
> URL: https://issues.apache.org/jira/browse/FLINK-11882
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Operators
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
>
> Introduce a bytes-based hash table.
> It can be used for performing aggregations where the aggregated values are 
> fixed-width.
> Because the data is stored in contiguous memory, variable-length AggBuffers 
> cannot be used with this HashMap. The key-value layout of the hash map is 
> designed to reduce the cost of key fetching during lookups.
> Add a test that performs a complete hash aggregation: when the HashMap has 
> enough memory, a pure hash aggregation is performed; when memory is 
> insufficient, it falls back to sort-based aggregation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-11420) Serialization of case classes containing a Map[String, Any] sometimes throws ArrayIndexOutOfBoundsException

2019-03-12 Thread Dawid Wysakowicz (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dawid Wysakowicz closed FLINK-11420.

Resolution: Fixed

Additional issue with {{CoGroupedStreams.UnionSerializer}} fixed in:
master: 3133a6dc684c485e3942d9536a26fe5d6b31f17d
1.8.0: db0b874c41c3ae5f61942c5ca0e6950a2694f4ac
1.7.3: f04d2cb87775b42aa54161ffb3bdaeb1f9d4af3c
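
For context, a common cause of this class of failure is sharing a stateful 
(e.g. Kryo-backed) serializer between the task's processing thread and the 
asynchronous snapshot thread; the fix family usually boils down to duplicating 
the serializer per thread. A minimal sketch of the pattern (illustrative only, 
not the actual commits above):

{code:java}
import org.apache.flink.api.common.typeutils.TypeSerializer;

/** Sketch: confine a duplicated serializer to the async snapshot thread. */
final class SnapshotWorker<T> implements Runnable {
	private final TypeSerializer<T> duplicated;
	private final Iterable<T> state;

	SnapshotWorker(TypeSerializer<T> original, Iterable<T> state) {
		// duplicate() returns a deep copy for stateful serializers, so the
		// snapshot thread never shares Kryo internals with the task thread.
		this.duplicated = original.duplicate();
		this.state = state;
	}

	@Override
	public void run() {
		for (T element : state) {
			duplicated.copy(element); // safe: serializer is thread-confined
		}
	}
}
{code}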

> Serialization of case classes containing a Map[String, Any] sometimes throws 
> ArrayIndexOutOfBoundsException
> ---
>
> Key: FLINK-11420
> URL: https://issues.apache.org/jira/browse/FLINK-11420
> Project: Flink
>  Issue Type: Bug
>  Components: API / Type Serialization System
>Affects Versions: 1.7.1
>Reporter: Jürgen Kreileder
>Assignee: Dawid Wysakowicz
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.7.3, 1.8.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> We frequently run into random ArrayIndexOutOfBoundsExceptions when Flink 
> tries to serialize Scala case classes containing a Map[String, Any] (Any 
> being String, Long, Int, or Boolean) with the FsStateBackend. (This probably 
> happens with any case class containing a type that requires Kryo; see this 
> thread for instance: 
> [http://mail-archives.apache.org/mod_mbox/flink-user/201710.mbox/%3cCANNGFpjX4gjV=df6tlfeojsb_rhwxs_ruoylqcqv2gvwqtt...@mail.gmail.com%3e])
> Disabling asynchronous snapshots seems to work around the problem, so maybe 
> something is not thread-safe in CaseClassSerializer.
> Our objects look like this:
> {code}
> case class Event(timestamp: Long, [...], content: Map[String, Any])
> case class EnrichedEvent(event: Event, additionalInfo: Map[String, Any])
> {code}
> I've looked at a few of the exceptions in a debugger. It always happens when 
> serializing the right-hand side of a tuple from EnrichedEvent -> Event -> 
> content, e.g. 13 from ("foo", 13) or false from ("bar", false).
> Stacktrace:
> {code:java}
> java.lang.ArrayIndexOutOfBoundsException: Index -1 out of bounds for length 0
>  at com.esotericsoftware.kryo.util.IntArray.pop(IntArray.java:157)
>  at com.esotericsoftware.kryo.Kryo.reference(Kryo.java:822)
>  at com.esotericsoftware.kryo.Kryo.copy(Kryo.java:863)
>  at 
> org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.copy(KryoSerializer.java:217)
>  at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.copy(CaseClassSerializer.scala:101)
>  at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.copy(CaseClassSerializer.scala:32)
>  at 
> org.apache.flink.api.scala.typeutils.TraversableSerializer.$anonfun$copy$1(TraversableSerializer.scala:69)
>  at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:234)
>  at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:465)
>  at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:465)
>  at 
> org.apache.flink.api.scala.typeutils.TraversableSerializer.copy(TraversableSerializer.scala:69)
>  at 
> org.apache.flink.api.scala.typeutils.TraversableSerializer.copy(TraversableSerializer.scala:33)
>  at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.copy(CaseClassSerializer.scala:101)
>  at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.copy(CaseClassSerializer.scala:32)
>  at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.copy(CaseClassSerializer.scala:101)
>  at 
> org.apache.flink.api.scala.typeutils.CaseClassSerializer.copy(CaseClassSerializer.scala:32)
>  at 
> org.apache.flink.api.common.typeutils.base.ListSerializer.copy(ListSerializer.java:99)
>  at 
> org.apache.flink.api.common.typeutils.base.ListSerializer.copy(ListSerializer.java:42)
>  at 
> org.apache.flink.runtime.state.heap.CopyOnWriteStateTable.get(CopyOnWriteStateTable.java:287)
>  at 
> org.apache.flink.runtime.state.heap.CopyOnWriteStateTable.get(CopyOnWriteStateTable.java:311)
>  at 
> org.apache.flink.runtime.state.heap.HeapListState.add(HeapListState.java:95)
>  at 
> org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.processElement(WindowOperator.java:391)
>  at 
> org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:202)
>  at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105)
>  at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
>  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:704)
>  at java.base/java.lang.Thread.run(Thread.java:834){code}
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] yanghua commented on issue #7960: [FLINK-9854][table] Allow passing multi-line input to SQL Client CLI

2019-03-12 Thread GitBox
yanghua commented on issue #7960: [FLINK-9854][table] Allow passing multi-line 
input to SQL Client CLI
URL: https://github.com/apache/flink/pull/7960#issuecomment-471899887
 
 
   > Hi @yanghua, could you extend the description with an explanation of how 
the piping multi-line support actually works? Why does only the last command 
in a pipe have to end with `;`? That looks a bit weird to me.
   
   @dawidwys Did you infer this from my test case? Sorry, it may have misled 
you. Actually, there is no such requirement. I just happened to use a SQL 
statement that ends with ';'.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11883) Harmonize the version of maven-shade-plugin

2019-03-12 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong updated FLINK-11883:
-
Summary: Harmonize the version of maven-shade-plugin  (was: Harmonize the 
versions of the )

> Harmonize the version of maven-shade-plugin
> ---
>
> Key: FLINK-11883
> URL: https://issues.apache.org/jira/browse/FLINK-11883
> Project: Flink
>  Issue Type: Bug
>Reporter: Fokko Driesprong
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-11883) Harmonize the version of maven-shade-plugin

2019-03-12 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong updated FLINK-11883:
-
Affects Version/s: 1.7.2

> Harmonize the version of maven-shade-plugin
> ---
>
> Key: FLINK-11883
> URL: https://issues.apache.org/jira/browse/FLINK-11883
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.7.2
>Reporter: Fokko Driesprong
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-11883) Harmonize the version of maven-shade-plugin

2019-03-12 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong updated FLINK-11883:
-
Component/s: Build System

> Harmonize the version of maven-shade-plugin
> ---
>
> Key: FLINK-11883
> URL: https://issues.apache.org/jira/browse/FLINK-11883
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.7.2
>Reporter: Fokko Driesprong
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-11068) Convert the API classes *Table, *Window to interfaces

2019-03-12 Thread Dawid Wysakowicz (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790352#comment-16790352
 ] 

Dawid Wysakowicz commented on FLINK-11068:
--

Hi [~hequn8128], thanks for working on this issue.
Ad. 1 Yes.
Ad. 2 I don't think this is feasible right now. This would be a breaking API 
change; we would need to deprecate the Window first. I don't see enough 
benefit to make it worth doing now.
Ad. 3 Yes, we should not touch those for now. They might end up in 
{{flink-table-api-scala-bridge}} at some point.

By the way, do you know when you can open the PR for the interface extraction? 
I have already started migrating Table to Java, so I think it would make sense 
to base it on top of your changes.

> Convert the API classes *Table, *Window to interfaces
> -
>
> Key: FLINK-11068
> URL: https://issues.apache.org/jira/browse/FLINK-11068
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Table SQL
>Reporter: Timo Walther
>Assignee: Hequn Cheng
>Priority: Major
>
> A more detailed description can be found in 
> [FLIP-32|https://cwiki.apache.org/confluence/display/FLINK/FLIP-32%3A+Restructure+flink-table+for+future+contributions].
> This includes: Table, GroupedTable, WindowedTable, WindowGroupedTable, 
> OverWindowedTable, Window, OverWindow
> We can keep the "Table" Scala implementation in a planner module until it has 
> been converted to Java.
> We can add a method to the planner later to give us a concrete instance. This 
> is one possibility to have a smooth transition period instead of changing all 
> classes at once.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] Fokko opened a new pull request #7962: [FLINK-11883] Harmonize the version of maven-shade-plugin

2019-03-12 Thread GitBox
Fokko opened a new pull request #7962: [FLINK-11883] Harmonize the version of 
maven-shade-plugin
URL: https://github.com/apache/flink/pull/7962
 
 
   ## What is the purpose of the change
   
   Currently we're using two versions of the maven-shade-plugin, which might 
lead to confusion/collisions. Therefore I would like to harmonize the two 
versions.
   
   Ticket: https://jira.apache.org/jira/browse/FLINK-11883
   
   ## Brief change log
   
   Set all the versions of the Maven shade plugin to 3.1.1
   
   ## Verifying this change
   
   
   This change is a trivial setting of versions.
   
   This change is already covered by existing tests, such as the full test 
suite.
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11883) Harmonize the version of maven-shade-plugin

2019-03-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11883:
---
Labels: pull-request-available  (was: )

> Harmonize the version of maven-shade-plugin
> ---
>
> Key: FLINK-11883
> URL: https://issues.apache.org/jira/browse/FLINK-11883
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.7.2
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] flinkbot commented on issue #7962: [FLINK-11883] Harmonize the version of maven-shade-plugin

2019-03-12 Thread GitBox
flinkbot commented on issue #7962: [FLINK-11883] Harmonize the version of 
maven-shade-plugin
URL: https://github.com/apache/flink/pull/7962#issuecomment-471904554
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process.
   The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   
   ## Bot commands
   The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-11884) Port Table to flink-api-java

2019-03-12 Thread Dawid Wysakowicz (JIRA)
Dawid Wysakowicz created FLINK-11884:


 Summary: Port Table to flink-api-java
 Key: FLINK-11884
 URL: https://issues.apache.org/jira/browse/FLINK-11884
 Project: Flink
  Issue Type: New Feature
  Components: API / Table SQL
Reporter: Dawid Wysakowicz
Assignee: Dawid Wysakowicz


This includes:
* uncoupling {{LogicalNode}} from {{RelBuilder}}
* uncoupling {{CatalogNode}} from {{RelDataType}}
* migrating {{Table}} to java



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] klion26 commented on a change in pull request #7958: [FLINK-11881][table-planner-blink] Introduce code generated typed sort to blink table

2019-03-12 Thread GitBox
klion26 commented on a change in pull request #7958: 
[FLINK-11881][table-planner-blink] Introduce code generated typed sort to blink 
table
URL: https://github.com/apache/flink/pull/7958#discussion_r264567432
 
 

 ##
 File path: 
flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/generated/CompileUtils.java
 ##
 @@ -77,6 +77,7 @@
try {
compiler.cook(code);
} catch (Throwable t) {
+   System.out.println(addLineNumber(code));
 
 Review comment:
   Where will the error message appear? If it goes to the stdout file rather 
than the log file, users may not be able to find it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-11883) Harmonize the versions of the

2019-03-12 Thread Fokko Driesprong (JIRA)
Fokko Driesprong created FLINK-11883:


 Summary: Harmonize the versions of the 
 Key: FLINK-11883
 URL: https://issues.apache.org/jira/browse/FLINK-11883
 Project: Flink
  Issue Type: Bug
Reporter: Fokko Driesprong






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (FLINK-11883) Harmonize the version of maven-shade-plugin

2019-03-12 Thread Fokko Driesprong (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fokko Driesprong reassigned FLINK-11883:


Assignee: Fokko Driesprong

> Harmonize the version of maven-shade-plugin
> ---
>
> Key: FLINK-11883
> URL: https://issues.apache.org/jira/browse/FLINK-11883
> Project: Flink
>  Issue Type: Bug
>  Components: Build System
>Affects Versions: 1.7.2
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] klion26 commented on a change in pull request #7958: [FLINK-11881][table-planner-blink] Introduce code generated typed sort to blink table

2019-03-12 Thread GitBox
klion26 commented on a change in pull request #7958: 
[FLINK-11881][table-planner-blink] Introduce code generated typed sort to blink 
table
URL: https://github.com/apache/flink/pull/7958#discussion_r264568021
 
 

 ##
 File path: 
flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/generated/CompileUtils.java
 ##
 @@ -87,4 +88,13 @@
			throw new RuntimeException("Can not load class " + name, e);
		}
	}
+
+	private static String addLineNumber(String code) {
 
 Review comment:
   So the index of the following for-loop should begin at 1?
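   
   For reference, such a helper might look roughly like this (a sketch 
assuming 1-based numbering, not the code from this PR):
   
```java
// Sketch: prefix each line of generated code with a 1-based line number so
// compiler errors can be matched against the dumped source.
private static String addLineNumber(String code) {
	String[] lines = code.split("\n");
	StringBuilder builder = new StringBuilder();
	for (int i = 0; i < lines.length; i++) {
		builder.append("/* ").append(i + 1).append(" */ ")
				.append(lines[i])
				.append('\n');
	}
	return builder.toString();
}
```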


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] dawidwys commented on issue #7960: [FLINK-9854][table] Allow passing multi-line input to SQL Client CLI

2019-03-12 Thread GitBox
dawidwys commented on issue #7960: [FLINK-9854][table] Allow passing multi-line 
input to SQL Client CLI
URL: https://github.com/apache/flink/pull/7960#issuecomment-471906653
 
 
   Yes, I inferred that from the tests. Then I think that's still a problem, 
because AFAIK we require a SQL statement to end with `;`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] zhijiangW commented on issue #7730: [FLINK-11646][test] Remove legacy MockRecordWriter

2019-03-12 Thread GitBox
zhijiangW commented on issue #7730: [FLINK-11646][test] Remove legacy 
MockRecordWriter
URL: https://github.com/apache/flink/pull/7730#issuecomment-471907125
 
 
   @pnowojski, could you help review this? I am going to introduce the 
`RecordWriterBuilder` to avoid creating `RecordWriter` directly, so this 
unused class does not need to be changed along with it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] yanghua commented on issue #7960: [FLINK-9854][table] Allow passing multi-line input to SQL Client CLI

2019-03-12 Thread GitBox
yanghua commented on issue #7960: [FLINK-9854][table] Allow passing multi-line 
input to SQL Client CLI
URL: https://github.com/apache/flink/pull/7960#issuecomment-471910241
 
 
   @dawidwys To align on the information: this issue currently only supports 
splitting a single SQL statement across multiple lines. The SQL statement 
accepted by the CLI may end with ';' or not; the ';' is removed during parsing.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] JingsongLi commented on a change in pull request #7958: [FLINK-11881][table-planner-blink] Introduce code generated typed sort to blink table

2019-03-12 Thread GitBox
JingsongLi commented on a change in pull request #7958: 
[FLINK-11881][table-planner-blink] Introduce code generated typed sort to blink 
table
URL: https://github.com/apache/flink/pull/7958#discussion_r264572260
 
 

 ##
 File path: 
flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/generated/CompileUtils.java
 ##
 @@ -87,4 +88,13 @@
			throw new RuntimeException("Can not load class " + name, e);
		}
	}
+
+	private static String addLineNumber(String code) {
 
 Review comment:
   Yes


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-11068) Convert the API classes *Table, *Window to interfaces

2019-03-12 Thread Hequn Cheng (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790378#comment-16790378
 ] 

Hequn Cheng commented on FLINK-11068:
-

[~dawidwys] Thanks for your valuable suggestions. I think you make a good 
point here: it's better to ensure compatibility for our APIs. We can discuss 
it later if we feel the need to.

As for the timing of the PR, I planned to open it right after the Expression 
PR (FLINK-11449) is merged, because the Table interface can only be put into 
api-java once Expression has been moved into flink-common. I think we have two 
options now:
 * Wait until FLINK-11449 is merged. Maybe very soon, or a few days?
 * Put the Table interface in the planner module temporarily and move it into 
api-java later. This avoids a chain of blocking work and speeds us up if 
FLINK-11449 still needs some time.

What do you think? [~twalthr] [~dawidwys]

> Convert the API classes *Table, *Window to interfaces
> -
>
> Key: FLINK-11068
> URL: https://issues.apache.org/jira/browse/FLINK-11068
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Table SQL
>Reporter: Timo Walther
>Assignee: Hequn Cheng
>Priority: Major
>
> A more detailed description can be found in 
> [FLIP-32|https://cwiki.apache.org/confluence/display/FLINK/FLIP-32%3A+Restructure+flink-table+for+future+contributions].
> This includes: Table, GroupedTable, WindowedTable, WindowGroupedTable, 
> OverWindowedTable, Window, OverWindow
> We can keep the "Table" Scala implementation in a planner module until it has 
> been converted to Java.
> We can add a method to the planner later to give us a concrete instance. This 
> is one possibility to have a smooth transition period instead of changing all 
> classes at once.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] KurtYoung commented on issue #3511: [Flink-5734] code generation for normalizedkey sorter

2019-03-12 Thread GitBox
KurtYoung commented on issue #3511: [Flink-5734] code generation for 
normalizedkey sorter
URL: https://github.com/apache/flink/pull/3511#issuecomment-471922611
 
 
   As you may want to know, we have opened a PR to introduce a code-generated 
sorter; check out #7958 if you are interested. @ggevay @heytitle 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] JingsongLi commented on a change in pull request #7958: [FLINK-11881][table-planner-blink] Introduce code generated typed sort to blink table

2019-03-12 Thread GitBox
JingsongLi commented on a change in pull request #7958: 
[FLINK-11881][table-planner-blink] Introduce code generated typed sort to blink 
table
URL: https://github.com/apache/flink/pull/7958#discussion_r264587151
 
 

 ##
 File path: 
flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/generated/CompileUtils.java
 ##
 @@ -77,6 +77,7 @@
try {
compiler.cook(code);
} catch (Throwable t) {
+   System.out.println(addLineNumber(code));
 
 Review comment:
   It's for developers, for when there is a bug in the code-generated code.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11881) Introduce code generated typed sort to blink table

2019-03-12 Thread Jingsong Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-11881:
-
Description: 
Introduce SortCodeGenerator (CodeGen efficient computation and comparison of 
NormalizedKey; idea based on FLINK-5734):

support sort by primitive type, string, decimal...

support sort by ArrayType

support sort by RowType(Nested Struct)

 

 

  was:
Introduce SortCodeGenerator (CodeGen efficient computation and comparison of  
NormalizedKey):

support sort by primitive type, string, decimal...

support sort by ArrayType

support sort by RowType(Nested Struct)

 

 


> Introduce code generated typed sort to blink table
> --
>
> Key: FLINK-11881
> URL: https://issues.apache.org/jira/browse/FLINK-11881
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Operators, SQL / Planner
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Introduce SortCodeGenerator (CodeGen efficient computation and comparison of 
> NormalizedKey; idea based on FLINK-5734):
> support sort by primitive type, string, decimal...
> support sort by ArrayType
> support sort by RowType(Nested Struct)
>  
>  
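
The core normalized-key technique referenced above (from FLINK-5734) can be 
sketched as follows (illustrative, not the generated code): each key is 
encoded into a fixed-width byte prefix such that unsigned byte-wise comparison 
agrees with the typed comparison, so most comparisons never deserialize the 
record.

{code:java}
// Sketch: an int written as a normalized key. Flipping the sign bit makes
// unsigned byte-order comparison agree with signed int comparison.
static void putNormalizedIntKey(int value, byte[] target, int offset) {
	int flipped = value ^ Integer.MIN_VALUE;
	target[offset] = (byte) (flipped >>> 24);
	target[offset + 1] = (byte) (flipped >>> 16);
	target[offset + 2] = (byte) (flipped >>> 8);
	target[offset + 3] = (byte) flipped;
}

// Comparing two normalized keys is then a plain unsigned byte comparison.
static int compareNormalizedKeys(byte[] a, int offA, byte[] b, int offB, int len) {
	for (int i = 0; i < len; i++) {
		int cmp = (a[offA + i] & 0xff) - (b[offB + i] & 0xff);
		if (cmp != 0) {
			return cmp;
		}
	}
	return 0;
}
{code}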



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11885) Introduce RecordWriterBuilder for creating RecordWriter instance

2019-03-12 Thread zhijiang (JIRA)
zhijiang created FLINK-11885:


 Summary: Introduce RecordWriterBuilder for creating RecordWriter 
instance
 Key: FLINK-11885
 URL: https://issues.apache.org/jira/browse/FLINK-11885
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Network
Reporter: zhijiang
Assignee: zhijiang


The current {{RecordWriter}} will be refactored into an abstract class to 
improve the broadcast mode mentioned in FLINK-10995, so it is better to 
migrate the logic of building {{RecordWriter}} instances into a separate 
{{RecordWriterBuilder}} utility class, which should be the only entry point 
for other usages.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11886) update the output of cluster management script in jobmanager_high_availability doc

2019-03-12 Thread chunpinghe (JIRA)
chunpinghe created FLINK-11886:
--

 Summary: update the output of cluster management script in 
jobmanager_high_availability doc
 Key: FLINK-11886
 URL: https://issues.apache.org/jira/browse/FLINK-11886
 Project: Flink
  Issue Type: Improvement
  Components: Documentation
Reporter: chunpinghe
Assignee: chunpinghe
 Fix For: 1.9.0
 Attachments: jira_cluster.sh.png

After FLIP-6 was released, the start and stop cluster scripts report the 
"jobmanager" as "standalonesession" and the "taskmanager" as "taskexecutor".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] zhijiangW opened a new pull request #7963: [FLINK-11885][network] Introduce RecordWriterBuilder for creating RecordWriter instance

2019-03-12 Thread GitBox
zhijiangW opened a new pull request #7963: [FLINK-11885][network] Introduce 
RecordWriterBuilder for creating RecordWriter instance
URL: https://github.com/apache/flink/pull/7963
 
 
   ## What is the purpose of the change
   
   The current `RecordWriter` will be refactored into an abstract class to 
improve the broadcast mode mentioned in FLINK-10995, so it is better to 
migrate the logic of building `RecordWriter` instances into a separate 
`RecordWriterBuilder` utility class, which should be the only entry point for 
other usages.
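   
   As a rough illustration of the intended shape (names, fields, and 
signatures below are assumptions for the sketch, not necessarily the classes 
in this PR):
   
```java
/** Illustrative sketch of a fluent builder as the single entry point. */
final class RecordWriterBuilder<T> {

	// Minimal stand-ins so the sketch is self-contained.
	interface ResultPartitionWriter {}
	interface ChannelSelector<T> {}

	static class RecordWriter<T> {
		RecordWriter(ResultPartitionWriter writer, ChannelSelector<T> selector,
				long flushTimeout, String taskName) {}
	}

	static class BroadcastRecordWriter<T> extends RecordWriter<T> {
		BroadcastRecordWriter(ResultPartitionWriter writer, long flushTimeout,
				String taskName) {
			super(writer, null, flushTimeout, taskName);
		}
	}

	private ChannelSelector<T> selector; // null selects broadcast mode
	private long flushTimeout = -1;
	private String taskName = "unnamed";

	RecordWriterBuilder<T> withChannelSelector(ChannelSelector<T> selector) {
		this.selector = selector;
		return this;
	}

	RecordWriterBuilder<T> withFlushTimeout(long flushTimeout) {
		this.flushTimeout = flushTimeout;
		return this;
	}

	RecordWriterBuilder<T> withTaskName(String taskName) {
		this.taskName = taskName;
		return this;
	}

	/** Callers never pick a concrete subclass; the builder decides. */
	RecordWriter<T> build(ResultPartitionWriter writer) {
		return selector == null
				? new BroadcastRecordWriter<>(writer, flushTimeout, taskName)
				: new RecordWriter<>(writer, selector, flushTimeout, taskName);
	}
}
```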
   
   ## Brief change log
   
 - *Introduce new `RecordWriterBuilder` class for providing static creation 
methods*
 - *Remove useless constructor in `RecordWriter`*
 - *Make constructor package private for `BroadcastRecordWriter`*
 - *Change other usages of creating `RecordWriter` via new builder class*
   
   ## Verifying this change
   
   This change is already covered by existing tests, such as *RecordWriterTest*.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot commented on issue #7963: [FLINK-11885][network] Introduce RecordWriterBuilder for creating RecordWriter instance

2019-03-12 Thread GitBox
flinkbot commented on issue #7963: [FLINK-11885][network] Introduce 
RecordWriterBuilder for creating RecordWriter instance
URL: https://github.com/apache/flink/pull/7963#issuecomment-471935384
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process.
   The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   
   ## Bot commands
   The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] zhijiangW commented on issue #7963: [FLINK-11885][network] Introduce RecordWriterBuilder for creating RecordWriter instance

2019-03-12 Thread GitBox
zhijiangW commented on issue #7963: [FLINK-11885][network] Introduce 
RecordWriterBuilder for creating RecordWriter instance
URL: https://github.com/apache/flink/pull/7963#issuecomment-471935488
 
 
   @pnowojski , this is the preparation work for #7713 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11886) update the output of cluster management script in jobmanager_high_availability doc

2019-03-12 Thread chunpinghe (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunpinghe updated FLINK-11886:
---
Attachment: (was: jira_cluster.sh.png)

> update the output of cluster management script in 
> jobmanager_high_availability doc
> --
>
> Key: FLINK-11886
> URL: https://issues.apache.org/jira/browse/FLINK-11886
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: chunpinghe
>Assignee: chunpinghe
>Priority: Major
> Fix For: 1.9.0
>
>
> After FLIP-6 was released, the start and stop cluster scripts report the 
> "jobmanager" as "standalonesession" and the "taskmanager" as "taskexecutor".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-11886) update the output of cluster management script in jobmanager_high_availability doc

2019-03-12 Thread chunpinghe (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunpinghe updated FLINK-11886:
---
Attachment: ha_doc_updated.png

> update the output of cluster management script in 
> jobmanager_high_availability doc
> --
>
> Key: FLINK-11886
> URL: https://issues.apache.org/jira/browse/FLINK-11886
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: chunpinghe
>Assignee: chunpinghe
>Priority: Major
> Fix For: 1.9.0
>
> Attachments: ha_doc.png, ha_doc_updated.png
>
>
> After FLIP-6 was released, the start and stop cluster scripts report the 
> "jobmanager" as "standalonesession" and the "taskmanager" as "taskexecutor".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-11886) update the output of cluster management script in jobmanager_high_availability doc

2019-03-12 Thread chunpinghe (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

chunpinghe updated FLINK-11886:
---
Attachment: ha_doc.png

> update the output of cluster management script in 
> jobmanager_high_availability doc
> --
>
> Key: FLINK-11886
> URL: https://issues.apache.org/jira/browse/FLINK-11886
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: chunpinghe
>Assignee: chunpinghe
>Priority: Major
> Fix For: 1.9.0
>
> Attachments: ha_doc.png, ha_doc_updated.png
>
>
> After FLIP-6 was released, the start and stop cluster scripts report the 
> "jobmanager" as "standalonesession" and the "taskmanager" as "taskexecutor".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] KurtYoung merged pull request #7949: [FLINK-11856][table-runtime-blink] Introduce BinaryHashTable to batch table runtime

2019-03-12 Thread GitBox
KurtYoung merged pull request #7949: [FLINK-11856][table-runtime-blink] 
Introduce BinaryHashTable to batch table runtime
URL: https://github.com/apache/flink/pull/7949
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11885) Introduce RecordWriterBuilder for creating RecordWriter instance

2019-03-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11885:
---
Labels: pull-request-available  (was: )

> Introduce RecordWriterBuilder for creating RecordWriter instance
> 
>
> Key: FLINK-11885
> URL: https://issues.apache.org/jira/browse/FLINK-11885
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: zhijiang
>Assignee: zhijiang
>Priority: Minor
>  Labels: pull-request-available
>
> The current {{RecordWriter}} will be refactored into an abstract class to 
> improve the broadcast mode mentioned in FLINK-10995, so it is better to 
> migrate the logic of building {{RecordWriter}} instances into a separate 
> {{RecordWriterBuilder}} utility class, which should be the only entry point 
> for other usages.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-11856) Introduce BinaryHashTable to batch table runtime

2019-03-12 Thread Kurt Young (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kurt Young closed FLINK-11856.
--
   Resolution: Implemented
Fix Version/s: 1.9.0

fixed in 5968206f843a58e10e3db58cde3407bb3608d085

> Introduce BinaryHashTable to batch table runtime
> 
>
> Key: FLINK-11856
> URL: https://issues.apache.org/jira/browse/FLINK-11856
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Operators
>Reporter: Kurt Young
>Assignee: Kurt Young
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Once we have BinaryRow, we should also introduce a hash table that stores 
> all rows in binary format. Knowing the binary format of the build-side and 
> probe-side rows has multiple benefits. For example, we can compare byte 
> arrays directly when checking whether a build key and a probe key are equal. 
> We can also eliminate a lot of deserialization cost when we want to get a 
> row instance from binary data.
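
The equality check described above can be sketched like this (illustrative; 
the actual BinaryHashTable works on memory segments rather than plain arrays):

{code:java}
// Sketch: build key vs. probe key compared directly on their binary
// representations, with no deserialization at all.
static boolean keysEqual(byte[] build, int buildOff,
		byte[] probe, int probeOff, int keyLen) {
	for (int i = 0; i < keyLen; i++) {
		if (build[buildOff + i] != probe[probeOff + i]) {
			return false;
		}
	}
	return true;
}
{code}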



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] chummyhe89 opened a new pull request #7964: [FLINK-11886] [doc] update the output of start/stop_cluster.sh in job…

2019-03-12 Thread GitBox
chummyhe89 opened a new pull request #7964: [FLINK-11886] [doc] update the 
output of start/stop_cluster.sh in job…
URL: https://github.com/apache/flink/pull/7964
 
 
   
   
   
   
   ## What is the purpose of the change
   
   Update the output of the cluster management script in 
jobmanager_high_availability document.
   
   
   ## Brief change log
   
 - Change the `jobmanager` to `standalonesession` in 
jobmanager_high_availability.md
 - Change the `taskmanager` to `taskexecutor` in 
jobmanager_high_availability.md
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11886) update the output of cluster management script in jobmanager_high_availability doc

2019-03-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11886:
---
Labels: pull-request-available  (was: )

> update the output of cluster management script in 
> jobmanager_high_availability doc
> --
>
> Key: FLINK-11886
> URL: https://issues.apache.org/jira/browse/FLINK-11886
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: chunpinghe
>Assignee: chunpinghe
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0
>
> Attachments: ha_doc.png, ha_doc_updated.png
>
>
> After FLIP-6 was released, the start and stop cluster scripts report the 
> "jobmanager" as "standalonesession" and the "taskmanager" as "taskexecutor".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] flinkbot commented on issue #7964: [FLINK-11886] [doc] update the output of start/stop_cluster.sh in job…

2019-03-12 Thread GitBox
flinkbot commented on issue #7964: [FLINK-11886] [doc] update the output of 
start/stop_cluster.sh in job…
URL: https://github.com/apache/flink/pull/7964#issuecomment-471946845
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process.
   The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.
   
   ## Bot commands
   The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] becketqin commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)

2019-03-12 Thread GitBox
becketqin commented on a change in pull request #6594: [FLINK-9311] [pubsub] 
Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)
URL: https://github.com/apache/flink/pull/6594#discussion_r264614405
 
 

 ##
 File path: 
flink-connectors/flink-connector-gcp-pubsub/src/main/java/org/apache/flink/streaming/connectors/gcp/pubsub/PubSubSubscriberFactoryForEmulator.java
 ##
 @@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.gcp.pubsub;
+
+import 
org.apache.flink.streaming.connectors.gcp.pubsub.common.PubSubSubscriberFactory;
+
+import com.google.api.gax.core.NoCredentialsProvider;
+import com.google.api.gax.grpc.GrpcTransportChannel;
+import com.google.api.gax.rpc.FixedTransportChannelProvider;
+import com.google.auth.Credentials;
+import com.google.cloud.pubsub.v1.stub.GrpcSubscriberStub;
+import com.google.cloud.pubsub.v1.stub.SubscriberStub;
+import com.google.cloud.pubsub.v1.stub.SubscriberStubSettings;
+import io.grpc.ManagedChannel;
+import io.grpc.netty.shaded.io.grpc.netty.NettyChannelBuilder;
+import io.grpc.netty.shaded.io.netty.channel.EventLoopGroup;
+
+import java.io.IOException;
+
+/**
+ * A convenience PubSubSubscriberFactory that can be used to connect to a 
PubSub emulator.
+ * The PubSub emulators do not support SSL or Credentials and as such this 
SubscriberStub does not require or provide this.
+ */
+public class PubSubSubscriberFactoryForEmulator implements 
PubSubSubscriberFactory {
 
 Review comment:
   I was actually thinking of just the test-jar of the PubSub connector 
module. Do you expect the class to be used by anything besides the PubSub 
tests in the Flink repo?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] becketqin commented on issue #6594: [FLINK-9311] [pubsub] Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)

2019-03-12 Thread GitBox
becketqin commented on issue #6594: [FLINK-9311] [pubsub] Added PubSub source 
connector with support for checkpointing (ATLEAST_ONCE)
URL: https://github.com/apache/flink/pull/6594#issuecomment-471952495
 
 
   > Hello @rmetzger
   > 
   > Let me address them point by point:
   > 
   > 
   > I really like the idea of providing an out of the box exactly-once 
implementation.
   > 
   > I am worried about the state size on high volume PubSub subscriptions. As 
in my experience, PubSub _will_ be sending messages multiple times and this 
might happen up to 7 days after the initial message. The 7 days being the upper 
bound as this is the maximum PubSub retention time. So this would mean we would 
be storing message ids for 7 days.
   > 
   > Implementation-wise this should still be very doable, though. We could 
change the `DeserializationSchema` from `Object deserialize(byte[])` to a 
class with something like `Object deserialize(PubSubMessage)`, so we can 
decide if and how we want to pass the messageId on to the next operator. I 
think if we can add this behavior in this PR, we can always provide an 
exactly-once implementation later on without becoming backward-incompatible. 
What do you think?
   > 
   > I think either way it still makes sense to use the 
`MultipleIdsMessageAcknowledgingSourceBase` class for the 
Acknowledge-on-checkpoint behavior, just so we don't duplicate this behavior in 
code. The most elegant solution might be to split the 
`acknowledge-on-checkpoint` and `exactly-once on parallelism 1` code. So 
RabbitMQ connector can use both parts while PubSub only uses the 
`acknowledge-on-checkpoint` part. But this would require some large changes for 
the RabbitMQ connector which I cannot test myself. I'm a bit hesitant to go 
this direction. What do you think?
   > 
   > 
   > I'm going to test this myself this weekend. In the previous (async) 
implementation I would for sure expect this behavior but the new implementation 
should only be bounded by memory (the amount of acknowledge Ids it has to store 
between checkpoints)
   > I fully agree we should not force users to lower their checkpoint 
frequency if they want a higher throughput. I think this is fixable, I'll come 
back to this!
   > 
   > When I first created the PR I had added `modes` to the PubSubSource: a NONE / ATLEAST_ONCE / EXACTLY_ONCE enum, where NONE would be the behavior you describe: acknowledge immediately, with the risk of losing messages. If we can't fix the performance for jobs with large checkpoint intervals, we might consider adding these modes back again.
   > 
   > 
   > Could you play with the `withMaxMessagesPerPull` setting? I can imagine each pull has some overhead, and reducing the number of pulls needed might be nicer on the CPU cycles.
   > 
   > Do you see this high CPU consumption on empty subscriptions as well?
   > 
   > We could always give the JSON API a try, but I would be surprised if it performs better.
   
   @Xeli I tend to agree with Stephan on getting rid of the 
`MultipleIdsMessageAcknowledgingSourceBase`. The fact that its memory 
consumption is unbounded worries me.
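
Regarding the `Object deserialize(PubSubMessage)` idea quoted above, a hedged sketch of what such a schema could look like (the interface name and signature are assumptions based on the thread, not code from the PR):

```java
import com.google.pubsub.v1.PubsubMessage;

import java.io.Serializable;

// Hypothetical shape only, for illustration of the discussion above.
public interface PubSubDeserializationSchema<T> extends Serializable {

	// Exposing the full PubsubMessage (rather than just byte[]) lets an
	// implementation decide whether and how to forward the messageId to the
	// next operator for downstream deduplication.
	T deserialize(PubsubMessage message) throws Exception;
}
```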




[GitHub] [flink] becketqin commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)

2019-03-12 Thread GitBox
becketqin commented on a change in pull request #6594: [FLINK-9311] [pubsub] 
Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)
URL: https://github.com/apache/flink/pull/6594#discussion_r264622246
 
 

 ##
 File path: 
flink-connectors/flink-connector-gcp-pubsub/src/main/java/org/apache/flink/streaming/connectors/gcp/pubsub/PubSubSource.java
 ##
 @@ -0,0 +1,296 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.gcp.pubsub;
+
+import org.apache.flink.api.common.functions.RuntimeContext;
+import org.apache.flink.api.common.functions.StoppableFunction;
+import org.apache.flink.api.common.serialization.DeserializationSchema;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.typeutils.ResultTypeQueryable;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.streaming.api.functions.source.MultipleIdsMessageAcknowledgingSourceBase;
+import org.apache.flink.streaming.api.functions.source.ParallelSourceFunction;
+import org.apache.flink.streaming.api.operators.StreamingRuntimeContext;
+import org.apache.flink.streaming.connectors.gcp.pubsub.common.PubSubSubscriberFactory;
+import org.apache.flink.util.Preconditions;
+
+import com.google.api.core.ApiFuture;
+import com.google.auth.Credentials;
+import com.google.cloud.pubsub.v1.Subscriber;
+import com.google.cloud.pubsub.v1.stub.SubscriberStub;
+import com.google.pubsub.v1.AcknowledgeRequest;
+import com.google.pubsub.v1.ProjectSubscriptionName;
+import com.google.pubsub.v1.PubsubMessage;
+import com.google.pubsub.v1.PullRequest;
+import com.google.pubsub.v1.PullResponse;
+import com.google.pubsub.v1.ReceivedMessage;
+import io.grpc.netty.shaded.io.netty.channel.EventLoopGroup;
+import io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoopGroup;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.concurrent.CancellationException;
+import java.util.concurrent.TimeUnit;
+
+import static com.google.cloud.pubsub.v1.SubscriptionAdminSettings.defaultCredentialsProviderBuilder;
+
+/**
+ * PubSub Source, this Source will consume PubSub messages from a subscription and Acknowledge them on the next checkpoint.
+ * This ensures every message will get acknowledged at least once.
+ */
+public class PubSubSource<OUT> extends MultipleIdsMessageAcknowledgingSourceBase<OUT, String, String>
+	implements ResultTypeQueryable<OUT>, ParallelSourceFunction<OUT>, StoppableFunction {
+	private static final Logger LOG = LoggerFactory.getLogger(PubSubSource.class);
+	protected final DeserializationSchema<OUT> deserializationSchema;
+	protected final PubSubSubscriberFactory pubSubSubscriberFactory;
+	protected final Credentials credentials;
+	protected final String projectSubscriptionName;
+	protected final int maxMessagesPerPull;
+
+	protected transient boolean deduplicateMessages;
+	protected transient SubscriberStub subscriber;
+	protected transient PullRequest pullRequest;
+	protected transient EventLoopGroup eventLoopGroup;
+
+	protected transient volatile boolean isRunning;
+	protected transient volatile ApiFuture<PullResponse> messagesFuture;
+
+	PubSubSource(DeserializationSchema<OUT> deserializationSchema, PubSubSubscriberFactory pubSubSubscriberFactory, Credentials credentials, String projectSubscriptionName, int maxMessagesPerPull) {
+		super(String.class);
+		this.deserializationSchema = deserializationSchema;
+		this.pubSubSubscriberFactory = pubSubSubscriberFactory;
+		this.credentials = credentials;
+		this.projectSubscriptionName = projectSubscriptionName;
+		this.maxMessagesPerPull = maxMessagesPerPull;
+	}
+
+	@Override
+	public void open(Configuration configuration) throws Exception {
+		super.open(configuration);
+		if (hasNoCheckpointingEnabled(getRuntimeContext())) {
+			throw new IllegalArgumentException("The PubSubSource REQUIRES Checkpointing to be enabled and " +

[GitHub] [flink] becketqin commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)

2019-03-12 Thread GitBox
becketqin commented on a change in pull request #6594: [FLINK-9311] [pubsub] 
Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)
URL: https://github.com/apache/flink/pull/6594#discussion_r264624917
 
 

 ##
 File path: 
flink-connectors/flink-connector-gcp-pubsub/src/main/java/org/apache/flink/streaming/connectors/gcp/pubsub/PubSubSource.java
 ##

[GitHub] [flink] pnowojski commented on a change in pull request #7963: [FLINK-11885][network] Introduce RecordWriterBuilder for creating RecordWriter instance

2019-03-12 Thread GitBox
pnowojski commented on a change in pull request #7963: [FLINK-11885][network] 
Introduce RecordWriterBuilder for creating RecordWriter instance
URL: https://github.com/apache/flink/pull/7963#discussion_r264631813
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/api/writer/RecordWriterBuilder.java
 ##
 @@ -0,0 +1,48 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.io.network.api.writer;
+
+/**
+ * Utility class to encapsulate the logic of building a {@link RecordWriter} 
instance.
+ */
+public class RecordWriterBuilder {
 
 Review comment:
   With "builder" I meant builder pattern. There is a little value of 
extracting static methods to separate class on it's own. The point is to avoid 
having multiple overloaded constructors/methods that allow more flexibility 
when creating class instances with default parameter values.
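
As a minimal, self-contained illustration of the pattern being suggested (names, defaults, and the build step are hypothetical, not Flink's actual API):

```java
// Fluent setters with defaults replace a family of overloaded constructors.
public class RecordWriterBuilder {
	private long timeout = -1;        // assumed default: no flush timeout
	private String taskName = "test"; // assumed default task name

	public RecordWriterBuilder setTimeout(long timeout) {
		this.timeout = timeout;
		return this;
	}

	public RecordWriterBuilder setTaskName(String taskName) {
		this.taskName = taskName;
		return this;
	}

	public RecordWriter build(ResultPartitionWriter writer) {
		// All defaults are applied in one place; call sites only override what
		// they need, e.g. new RecordWriterBuilder().setTimeout(100).build(partition).
		return new RecordWriter(writer, timeout, taskName);
	}
}
```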




[GitHub] [flink] pnowojski merged pull request #7730: [FLINK-11646][test] Remove legacy MockRecordWriter

2019-03-12 Thread GitBox
pnowojski merged pull request #7730: [FLINK-11646][test] Remove legacy 
MockRecordWriter
URL: https://github.com/apache/flink/pull/7730
 
 
   




[jira] [Closed] (FLINK-11646) Remove legacy MockRecordWriter

2019-03-12 Thread Piotr Nowojski (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski closed FLINK-11646.
--
   Resolution: Fixed
Fix Version/s: 1.9.0

merged commit 6f66e23 into apache:master

> Remove legacy MockRecordWriter
> --
>
> Key: FLINK-11646
> URL: https://issues.apache.org/jira/browse/FLINK-11646
> Project: Flink
>  Issue Type: Test
>  Components: Tests
>Reporter: zhijiang
>Assignee: zhijiang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.9.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{MockRecordWriter}} is not used any more, so remove it to keep the code clean.





[GitHub] [flink] pnowojski commented on a change in pull request #7713: [FLINK-10995][network] Copy intermediate serialization results only once for broadcast mode

2019-03-12 Thread GitBox
pnowojski commented on a change in pull request #7713: [FLINK-10995][network] 
Copy intermediate serialization results only once for broadcast mode
URL: https://github.com/apache/flink/pull/7713#discussion_r264634735
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/api/writer/BroadcastRecordWriter.java
 ##
 @@ -44,4 +52,67 @@ public BroadcastRecordWriter(
public void emit(T record) throws IOException, InterruptedException {
broadcastEmit(record);
}
+
+   @Override
+   public void broadcastEmit(T record) throws IOException, 
InterruptedException {
 
 Review comment:
   Sounds good :)




[GitHub] [flink] pnowojski commented on a change in pull request #7911: [FLINK-11082][network] Fix the logic of getting backlog in sub partition

2019-03-12 Thread GitBox
pnowojski commented on a change in pull request #7911: [FLINK-11082][network] 
Fix the logic of getting backlog in sub partition
URL: https://github.com/apache/flink/pull/7911#discussion_r264637060
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/ResultSubpartition.java
 ##
 @@ -116,52 +115,58 @@ protected Throwable getFailureCause() {
 
public abstract boolean isReleased();
 
-   /**
-* Gets the number of non-event buffers in this subpartition.
-*
-* Beware: This method should only be used in tests 
in non-concurrent access
-* scenarios since it does not make any concurrency guarantees.
-*/
-   @VisibleForTesting
-   public int getBuffersInBacklog() {
-   return buffersInBacklog;
-   }
-
/**
 * Makes a best effort to get the current size of the queue.
 * This method must not acquire locks or interfere with the task and 
network threads in
 * any way.
 */
public abstract int unsynchronizedGetNumberOfQueuedBuffers();
 
+   /**
+* Gets the number of non-event buffers in this subpartition.
+*/
+   public abstract int getBuffersInBacklog();
+
+   /**
+* @param lastBufferAvailable whether the last buffer in this 
subpartition is available for consumption
+* @return the number of non-event buffers in this subpartition
+*/
+   protected int getBuffersInBacklog(boolean lastBufferAvailable) {
 
 Review comment:
   OK, this gets a little bit too complicated for my taste as well, but if you don't have an idea of how to fix it, let's keep it as it is :(




[GitHub] [flink] pnowojski commented on a change in pull request #7911: [FLINK-11082][network] Fix the logic of getting backlog in sub partition

2019-03-12 Thread GitBox
pnowojski commented on a change in pull request #7911: [FLINK-11082][network] 
Fix the logic of getting backlog in sub partition
URL: https://github.com/apache/flink/pull/7911#discussion_r264636431
 
 

 ##
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/SpillableSubpartitionView.java
 ##
 @@ -156,7 +156,8 @@ public BufferAndBacklog getNextBuffer() throws 
IOException, InterruptedException
checkState(nextBuffer.isFinished(),
"We can only read from 
SpillableSubpartition after it was finished");
 
-   newBacklog = 
parent.decreaseBuffersInBacklogUnsafe(nextBuffer.isBuffer());
+   
parent.decreaseBuffersInBacklog(nextBuffer.isBuffer());
+   newBacklog = parent.getBuffersInBacklog();
 
 Review comment:
   1. OK, I get it now :) Thanks for pointing this out.
   
   I think that adding this extra parameter to `decreaseBuffersInBacklog` would indeed be a worse solution, and it's better to just use `getBuffersInBacklogUnsafe` here.




[GitHub] [flink] FaxianZhao opened a new pull request #7965: make JDBCOutputFormat ClassCastException more helpful

2019-03-12 Thread GitBox
FaxianZhao opened a new pull request #7965: make JDBCOutputFormat 
ClassCastException more helpful
URL: https://github.com/apache/flink/pull/7965
 
 
   ## What is the purpose of the change
   This PR makes JDBCOutputFormat's ClassCastException more readable. The current exception does not indicate which column is wrong; the new error message helps people locate the failing column.
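
A hedged illustration of the kind of improvement described (the statement field, row access, and message format are assumptions, not the PR's exact code):

```java
import org.apache.flink.types.Row;

import java.sql.PreparedStatement;
import java.sql.SQLException;

class CastErrorSketch {
	static void setIntField(PreparedStatement upload, Row row, int index) throws SQLException {
		try {
			// JDBC parameter indices are 1-based, Row fields are 0-based.
			upload.setInt(index + 1, (int) row.getField(index));
		} catch (ClassCastException e) {
			// Surface the column index and expected type so the failing column
			// can be located immediately instead of guessing from a bare cast error.
			throw new RuntimeException(String.format(
					"Field %d could not be set as INTEGER: %s", index, e.getMessage()), e);
		}
	}
}
```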
   
   ## Brief change log
   
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
- Added class cast exception tests in 
org.apache.flink.api.java.io.jdbc.JDBCAppendTableSinkTest.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not documented




[GitHub] [flink] becketqin commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)

2019-03-12 Thread GitBox
becketqin commented on a change in pull request #6594: [FLINK-9311] [pubsub] 
Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)
URL: https://github.com/apache/flink/pull/6594#discussion_r264638629
 
 

 ##
 File path: 
flink-connectors/flink-connector-gcp-pubsub/src/main/java/org/apache/flink/streaming/connectors/gcp/pubsub/PubSubSource.java
 ##

[GitHub] [flink] flinkbot commented on issue #7965: make JDBCOutputFormat ClassCastException more helpful

2019-03-12 Thread GitBox
flinkbot commented on issue #7965: make JDBCOutputFormat ClassCastException 
more helpful
URL: https://github.com/apache/flink/pull/7965#issuecomment-471968284
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of 
the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   




[GitHub] [flink] becketqin commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)

2019-03-12 Thread GitBox
becketqin commented on a change in pull request #6594: [FLINK-9311] [pubsub] 
Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)
URL: https://github.com/apache/flink/pull/6594#discussion_r264635101
 
 

 ##
 File path: 
flink-connectors/flink-connector-gcp-pubsub/src/main/java/org/apache/flink/streaming/connectors/gcp/pubsub/PubSubSource.java
 ##

[GitHub] [flink] GJL commented on issue #7922: [FLINK-10756][runtime][tests] Wait for TM processes to shutdown

2019-03-12 Thread GitBox
GJL commented on issue #7922: [FLINK-10756][runtime][tests] Wait for TM 
processes to shutdown
URL: https://github.com/apache/flink/pull/7922#issuecomment-471968212
 
 
   @flinkbot approve description




[GitHub] [flink] GJL commented on issue #7922: [FLINK-10756][runtime][tests] Wait for TM processes to shutdown

2019-03-12 Thread GitBox
GJL commented on issue #7922: [FLINK-10756][runtime][tests] Wait for TM 
processes to shutdown
URL: https://github.com/apache/flink/pull/7922#issuecomment-471974410
 
 
   @flinkbot approve all




[GitHub] [flink] leesf commented on issue #7940: [hotfix][docs] fix error in functions example

2019-03-12 Thread GitBox
leesf commented on issue #7940: [hotfix][docs] fix error in functions example 
URL: https://github.com/apache/flink/pull/7940#issuecomment-471980702
 
 
   cc @twalthr @zentol 




[GitHub] [flink] tweise commented on issue #7896: [FLINK-9007] [kinesis] [e2e] Add Kinesis end-to-end test

2019-03-12 Thread GitBox
tweise commented on issue #7896: [FLINK-9007] [kinesis] [e2e] Add Kinesis 
end-to-end test
URL: https://github.com/apache/flink/pull/7896#issuecomment-471992102
 
 
   @StefanRRichter could you approve the PR or let me know if anything else 
should be done here?




[jira] [Created] (FLINK-11887) Operator's Latency Metrics continues to increase

2019-03-12 Thread Suxing Lee (JIRA)
Suxing Lee created FLINK-11887:
--

 Summary: Operator's Latency Metrics continues to increase
 Key: FLINK-11887
 URL: https://issues.apache.org/jira/browse/FLINK-11887
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Metrics
Affects Versions: 1.6.3
Reporter: Suxing Lee
 Attachments: 
flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_1_latency.png

The operator's latency increases by approximately 2.7 minutes per day (see the 
attachment).
We compute the latency as System.currentTimeMillis() - marker.getMarkedTime().
There is no guarantee that System.currentTimeMillis() and System.nanoTime() do 
not drift apart.
If a GC pause or Linux preemptive scheduling happens, this also affects the 
latency metrics.
Latency metrics drift away from their initial values over time (verified via a 
JVM heap dump).
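
An illustrative sketch of the mismatch described above (the 200 ms interval and the helpers isRunning()/reportLatency() are assumptions, not Flink's actual code):

{code}
// Each iteration is assumed to be triggered by a fixed-rate scheduler that is
// driven by the monotonic System.nanoTime() clock. The emission timestamp is
// extrapolated from that schedule, while latency is measured against the wall clock.
long scheduleStart = System.currentTimeMillis();
long period = 200L; // assumed marker interval in ms
for (long n = 0; isRunning(); n++) {
	long markedTime = scheduleStart + n * period;           // advances with the scheduler's clock
	long latency = System.currentTimeMillis() - markedTime; // compared against the wall clock
	reportLatency(latency); // drifts slowly because the two clocks diverge
}
{code}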





[jira] [Created] (FLINK-11888) Refine document of how to run end-to-end tests

2019-03-12 Thread Yu Li (JIRA)
Yu Li created FLINK-11888:
-

 Summary: Refine document of how to run end-to-end tests
 Key: FLINK-11888
 URL: https://issues.apache.org/jira/browse/FLINK-11888
 Project: Flink
  Issue Type: Improvement
  Components: Documentation, Tests
Reporter: Yu Li
Assignee: Yu Li


End-to-end tests are included in the scope of verifying a Flink release per the 
[document|https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Release],
 and while following the [read 
me|https://github.com/apache/flink/tree/master/flink-end-to-end-tests] on 
GitHub to verify the nightly tests, I found several things that need to be 
clarified, especially on a clean environment (one that has never run the 
nightly tests before), including:
* We should clarify where to find the "" to run the tests against
* We should clarify what preparation is required before running the "Kerberized 
YARN on Docker test", if the user chooses to.

Besides, I've observed some test failures caused by port occupation like the 
one below (probably due to resource release failures of completed tests), but 
it seems to be environment-related (checked on 4 MacBooks, 3 with stable 
failures and 1 success; checked on 2 Linux VMs, 1 with stable failures and 1 
success). I will dig more and add to the document if any environment 
cleanup/setup is required, or open another JIRA if any improvement is needed 
for the scripts.
{noformat}
java.net.BindException: Could not start actor system on any port in port range 
6123
{noformat}





[jira] [Updated] (FLINK-11889) Remove "stop" signal along with Stoppable interfaces

2019-03-12 Thread Aljoscha Krettek (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek updated FLINK-11889:
-
Description: 
During the [ML 
discussion|https://lists.apache.org/thread.html/b8d2f3209e7ca7467af6037383ade6c14c35276f7acb2bbbc9a50c0f@%3Cdev.flink.apache.org%3E]
 of 
[FLIP-34|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103090212]
 we realised that it would be beneficial for this new feature to replace the 
existing "stop" functionality. The current "stop" functionality cannot be used 
because no real-world sources support the functionality. Therefore, I think it 
is safe to remove because it should not break existing workflows.

The issue proposes completely removing the old stop feature, introduced via 
FLINK-2111, as preparation for 
[FLIP-34|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103090212].

We have to be careful when doing this because it touches quite a few things. 
Basically, we have to do a manual revert of this commit: 
https://github.com/apache/flink/commit/bdd4024e20fdfb0accb6121a68780ce3a0c218c0

  was:
During the discussion of 
[FLIP-34|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103090212]
 we realised that it would be beneficial for this new feature to replace the 
existing "stop" functionality. The current "stop" functionality cannot be used 
because no real-world sources support the functionality. Therefore, I think it 
is safe to remove because it should not break existing workflows.

The issue proposes completely removing the old stop feature, introduced via 
FLINK-2111, as preparation for 
[FLIP-34|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103090212].

We have to be careful when doing this because it touches quite a few things. 
Basically, we have to do a manual revert of this commit: 
https://github.com/apache/flink/commit/bdd4024e20fdfb0accb6121a68780ce3a0c218c0


> Remove "stop" signal along with Stoppable interfaces
> 
>
> Key: FLINK-11889
> URL: https://issues.apache.org/jira/browse/FLINK-11889
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream
>Reporter: Aljoscha Krettek
>Priority: Major
> Fix For: 1.9.0
>
>
> During the [ML 
> discussion|https://lists.apache.org/thread.html/b8d2f3209e7ca7467af6037383ade6c14c35276f7acb2bbbc9a50c0f@%3Cdev.flink.apache.org%3E]
>  of 
> [FLIP-34|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103090212]
>  we realised that it would be beneficial for this new feature to replace the 
> existing "stop" functionality. The current "stop" functionality cannot be 
> used because no real-world sources support the functionality. Therefore, I 
> think it is safe to remove because it should not break existing workflows.
> The issue proposes completely removing the old stop feature, introduced via 
> FLINK-2111, as preparation for 
> [FLIP-34|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103090212].
> We have to be careful when doing this because it touches quite a few things. 
> Basically, we have to do a manual revert of this commit: 
> https://github.com/apache/flink/commit/bdd4024e20fdfb0accb6121a68780ce3a0c218c0





[jira] [Created] (FLINK-11889) Remove "stop" signal along with Stoppable interfaces

2019-03-12 Thread Aljoscha Krettek (JIRA)
Aljoscha Krettek created FLINK-11889:


 Summary: Remove "stop" signal along with Stoppable interfaces
 Key: FLINK-11889
 URL: https://issues.apache.org/jira/browse/FLINK-11889
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Reporter: Aljoscha Krettek
 Fix For: 1.9.0


During the discussion of 
[FLIP-34|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103090212]
 we realised that it would be beneficial for this new feature to replace the 
existing "stop" functionality. The current "stop" functionality cannot be used 
because no real-world sources support the functionality. Therefore, I think it 
is safe to remove because it should not break existing workflows.

The issue proposes completely removing the old stop feature, introduced via 
FLINK-2111, as preparation for 
[FLIP-34|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103090212].

We have to be careful when doing this because it touches quite a few things. 
Basically, we have to do a manual revert of this commit: 
https://github.com/apache/flink/commit/bdd4024e20fdfb0accb6121a68780ce3a0c218c0





[GitHub] [flink] SuXingLee opened a new pull request #7966: [FLINK-11887][metrics] Fixed latency metrics drift apart

2019-03-12 Thread GitBox
SuXingLee opened a new pull request #7966: [FLINK-11887][metrics] Fixed latency 
metrics drift apart
URL: https://github.com/apache/flink/pull/7966
 
 
   ## What is the purpose of the change
   
   Use ```System.currentTimeMillis``` to replace ```System.nanoTime``` in ```LatencyMarker```, in order to stop the latency metrics from drifting.
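
A hedged sketch of the change described (the `LatencyMarker` constructor shape is assumed): the marked time is taken directly from the same wall clock that is later used to compute the latency, instead of being derived from a nanoTime-based schedule.

```java
// Hypothetical emission-side code; operatorId and subtaskIndex come from the
// surrounding operator context.
LatencyMarker marker = new LatencyMarker(System.currentTimeMillis(), operatorId, subtaskIndex);
output.emitLatencyMarker(marker); // downstream computes currentTimeMillis() - getMarkedTime()
```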
   
   
   ## Verifying this change
   
   LatencyMarker only affects the latency metric.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ **not documented**)
   




[GitHub] [flink] flinkbot commented on issue #7966: [FLINK-11887][metrics] Fixed latency metrics drift apart

2019-03-12 Thread GitBox
flinkbot commented on issue #7966: [FLINK-11887][metrics] Fixed latency metrics 
drift apart
URL: https://github.com/apache/flink/pull/7966#issuecomment-472009888
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of 
the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   




[jira] [Commented] (FLINK-11888) Refine document of how to run end-to-end tests

2019-03-12 Thread Yu Li (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790567#comment-16790567
 ] 

Yu Li commented on FLINK-11888:
---

[~Zentol] Please let me know if you have any comments here, thanks in advance.

> Refine document of how to run end-to-end tests
> --
>
> Key: FLINK-11888
> URL: https://issues.apache.org/jira/browse/FLINK-11888
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation, Tests
>Reporter: Yu Li
>Assignee: Yu Li
>Priority: Major
>
> End-to-end tests are included in the scope of verifying a Flink release per the 
> [document|https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Release],
>  and while following the [read 
> me|https://github.com/apache/flink/tree/master/flink-end-to-end-tests] on 
> GitHub to verify the nightly tests, I found several things that need to be 
> clarified, especially on a clean environment (one that has never run the 
> nightly tests before), including:
> * We should clarify where to find the "" to run the tests against
> * We should clarify what preparation is required before running the "Kerberized 
> YARN on Docker test", if the user chooses to.
> Besides, I've observed some test failures caused by port occupation like the 
> one below (probably due to resource release failures of completed tests), but 
> it seems to be environment-related (checked on 4 MacBooks, 3 with stable 
> failures and 1 success; checked on 2 Linux VMs, 1 with stable failures and 1 
> success). I will dig more and add to the document if any environment 
> cleanup/setup is required, or open another JIRA if any improvement is needed 
> for the scripts.
> {noformat}
> java.net.BindException: Could not start actor system on any port in port 
> range 6123
> {noformat}





[GitHub] [flink] zhijiangW commented on issue #7730: [FLINK-11646][test] Remove legacy MockRecordWriter

2019-03-12 Thread GitBox
zhijiangW commented on issue #7730: [FLINK-11646][test] Remove legacy 
MockRecordWriter
URL: https://github.com/apache/flink/pull/7730#issuecomment-472012093
 
 
   Thanks for the review and for merging, @pnowojski.
   I will add that to the commit next time. :)




[jira] [Updated] (FLINK-11887) Operator's Latency Metrics continues to increase

2019-03-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11887:
---
Labels: pull-request-available  (was: )

> Operator's Latency Metrics continues to increase
> 
>
> Key: FLINK-11887
> URL: https://issues.apache.org/jira/browse/FLINK-11887
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics
>Affects Versions: 1.6.3
>Reporter: Suxing Lee
>Priority: Major
>  Labels: pull-request-available
> Attachments: 
> flink_taskmanager_job_latency_source_id_operator_id_operator_subtask_index_1_latency.png
>
>
> The operator's latency increases by approximately 2.7 minutes per day 
> (see the attachment).
> We compute the latency as System.currentTimeMillis() - marker.getMarkedTime().
> There is no guarantee that System.currentTimeMillis() and System.nanoTime() do 
> not drift apart.
> If a GC pause or Linux preemptive scheduling happens, this also affects the 
> latency metrics.
> Latency metrics drift away from their initial values over time (verified via a 
> JVM heap dump).





[GitHub] [flink] twalthr opened a new pull request #7967: [FLINK-11449][table] Uncouple the Expression class from RexNodes

2019-03-12 Thread GitBox
twalthr opened a new pull request #7967: [FLINK-11449][table] Uncouple the 
Expression class from RexNodes
URL: https://github.com/apache/flink/pull/7967
 
 
   ## What is the purpose of the change
   
   This PR aims to improve the contribution of #7664. It further minimizes the number of changes and postpones changes in other modules. Only a small number of tests needed to be adapted. Backwards compatibility is ensured.
   
   
   ## Brief change log
   
   - Rename CommonExpression to Expression
   - Rework the function catalog to work with the newly introduced function 
definitions
   - Rework the Scala implicits and Java expression parser to return the new 
expression stack
   - Introduce an expression bridge on the edges of the API to convert to the old expression stack, as a temporary solution until the old expression stack becomes obsolete (a hedged sketch follows this list)
   - Cleanup of main API interfaces (i.e. Table.scala and expressionDsl.scala)
   - Ensure that SPI classes such as FilterableTableSource and 
TimestampExtractor still work
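
Regarding the expression bridge above, a hedged sketch of the shape such a bridge could take (class and method names are assumptions, not necessarily the PR's actual code):

```java
// Converts the new planner-independent Expression into the old expression
// stack at the API boundary, by delegating to a visitor over the new tree.
public class ExpressionBridge<E> {

	private final ExpressionVisitor<E> converter;

	public ExpressionBridge(ExpressionVisitor<E> converter) {
		this.converter = converter;
	}

	public E bridge(Expression expression) {
		// The visitor encapsulates all knowledge about the old stack, so the
		// new Expression type stays free of RexNode/planner dependencies.
		return expression.accept(converter);
	}
}
```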
   
   ## Verifying this change
   
   This change is already covered by existing tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: yes
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? JavaDocs
   




[GitHub] [flink] flinkbot commented on issue #7967: [FLINK-11449][table] Uncouple the Expression class from RexNodes

2019-03-12 Thread GitBox
flinkbot commented on issue #7967: [FLINK-11449][table] Uncouple the Expression 
class from RexNodes
URL: https://github.com/apache/flink/pull/7967#issuecomment-472017793
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of 
the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   




[GitHub] [flink] zhijiangW commented on issue #7963: [FLINK-11885][network] Introduce RecordWriterBuilder for creating RecordWriter instance

2019-03-12 Thread GitBox
zhijiangW commented on issue #7963: [FLINK-11885][network] Introduce 
RecordWriterBuilder for creating RecordWriter instance
URL: https://github.com/apache/flink/pull/7963#issuecomment-472020549
 
 
   Thanks for the review @pnowojski .
   
   I forgot to fix one point, resulting in the Travis failure. I will rebase onto the latest master and update the code later today. I will notify you once it is ready. :)




[jira] [Commented] (FLINK-11865) Code generation in TraversableSerializer is prohibitively slow

2019-03-12 Thread Aljoscha Krettek (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790604#comment-16790604
 ] 

Aljoscha Krettek commented on FLINK-11865:
--

Fixed on master in
f0cdd9d2ae09061ca3866b81457ba42482704fb7

> Code generation in TraversableSerializer is prohibitively slow
> --
>
> Key: FLINK-11865
> URL: https://issues.apache.org/jira/browse/FLINK-11865
> Project: Flink
>  Issue Type: Bug
>  Components: API / Type Serialization System
>Reporter: Aljoscha Krettek
>Assignee: Igal Shilman
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As discussed in FLINK-11539, the new code generation makes job 
> submissions/translation prohibitively slow.
> The solution should be to introduce a Cache for the generated code.
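
A hedged sketch of the proposed cache (names are illustrative, not the actual fix): keying on the generated source text means identical code is compiled once per JVM rather than once per serializer instantiation.

{code}
import java.util.concurrent.ConcurrentHashMap;

final class GeneratedCodeCache {

	private static final ConcurrentHashMap<String, Class<?>> CACHE = new ConcurrentHashMap<>();

	// compile(...) is a stand-in for the actual runtime-compilation call.
	static Class<?> compileOrGetCached(String code, ClassLoader classLoader) {
		return CACHE.computeIfAbsent(code, c -> compile(c, classLoader));
	}

	private static Class<?> compile(String code, ClassLoader classLoader) {
		throw new UnsupportedOperationException("stand-in for the real compiler invocation");
	}
}
{code}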





[jira] [Commented] (FLINK-11068) Convert the API classes *Table, *Window to interfaces

2019-03-12 Thread Timo Walther (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790614#comment-16790614
 ] 

Timo Walther commented on FLINK-11068:
--

Hi [~hequn8128], sorry for not answering earlier. FLINK-11449 is close to 
completion. If there are no bigger objections, it will be merged by the end of 
the day.

I reordered some methods and fixed some JavaDocs in Table; sorry in advance for 
the merge conflicts!

I totally agree that {{OverWindow}} and {{Window}} are confusing in the API. I 
also noticed that while working on FLINK-11449. {{Window}} was called 
{{GroupWindow}} in the early days and was renamed when over windows were 
introduced; the over windows were then renamed again later.

I would be in favor of fixing this issue now, but in a backwards-compatible way. 
Feel free to open a separate PR for just the ported windows before you open a 
PR for the {{Table}} classes.

I would propose the following simplified class structure:
{code}
org.apache.flink.table.api.Tumble
org.apache.flink.table.api.Slide
org.apache.flink.table.api.Session

org.apache.flink.table.api.GroupWindow

@Deprecated
org.apache.flink.table.api.java.Tumble extends api.Tumble
@Deprecated
org.apache.flink.table.api.scala.Tumble extends api.Tumble
...

@Deprecated
org.apache.flink.table.api.Window extends GroupWindow
{code}

What do you think?

Adding the table interface to the planner module is totally fine for now. We 
will move the class cluster to {{api-java}} later.
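
For illustration, a minimal sketch of the two central types (each type would 
live in its own file; names as proposed above):

{code}
package org.apache.flink.table.api;

/** New common supertype for all group windows. */
public interface GroupWindow {
}

/** Kept solely as a deprecated alias for backwards compatibility. */
@Deprecated
public interface Window extends GroupWindow {
}
{code}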

> Convert the API classes *Table, *Window to interfaces
> -
>
> Key: FLINK-11068
> URL: https://issues.apache.org/jira/browse/FLINK-11068
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Table SQL
>Reporter: Timo Walther
>Assignee: Hequn Cheng
>Priority: Major
>
> A more detailed description can be found in 
> [FLIP-32|https://cwiki.apache.org/confluence/display/FLINK/FLIP-32%3A+Restructure+flink-table+for+future+contributions].
> This includes: Table, GroupedTable, WindowedTable, WindowGroupedTable, 
> OverWindowedTable, Window, OverWindow
> We can keep the "Table" Scala implementation in a planner module until it has 
> been converted to Java.
> We can add a method to the planner later to give us a concrete instance. This 
> is one possibility to have a smooth transition period instead of changing all 
> classes at once.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-2316) Optimize the SCA for bytecode generated by Scala

2019-03-12 Thread Timo Walther (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timo Walther closed FLINK-2316.
---
Resolution: Won't Fix

The static code analysis was never really used. This issue is not relevant 
anymore.

> Optimize the SCA for bytecode generated by Scala
> 
>
> Key: FLINK-2316
> URL: https://issues.apache.org/jira/browse/FLINK-2316
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Scala
>Reporter: Timo Walther
>Assignee: Timo Walther
>Priority: Minor
>
> The static code analysis is not fully optimized for bytecode generated by 
> Scala. Some Scala classes (esp. Scala tuples and case classes) need a special 
> treatment to get the best analysis results. Test cases for the Scala API are 
> also missing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] zhijiangW commented on issue #7963: [FLINK-11885][network] Introduce RecordWriterBuilder for creating RecordWriter instance

2019-03-12 Thread GitBox
zhijiangW commented on issue #7963: [FLINK-11885][network] Introduce 
RecordWriterBuilder for creating RecordWriter instance
URL: https://github.com/apache/flink/pull/7963#issuecomment-472059561
 
 
   @pnowojski , I have updated the code; compared with the first version, only 
the `StreamTask` class was changed.
   Travis is now passing on my side.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-11890) Replace Table API string-based expressions by a Java DSL

2019-03-12 Thread Timo Walther (JIRA)
Timo Walther created FLINK-11890:


 Summary: Replace Table API string-based expressions by a Java DSL
 Key: FLINK-11890
 URL: https://issues.apache.org/jira/browse/FLINK-11890
 Project: Flink
  Issue Type: New Feature
  Components: API / Table SQL
Reporter: Timo Walther
Assignee: Timo Walther


Currently, expressions in the Table API can be defined in two ways: either via 
the implicit Scala DSL or via custom strings:

{code}
Table revenue = orders
  .filter("cCountry === 'FRANCE'")
  .groupBy("cID, cName")
  .select("cID, cName, revenue.sum AS revSum");

// vs.

val revenue = orders
  .filter('cCountry === "FRANCE")
  .groupBy('cID, 'cName)
  .select('cID, 'cName, 'revenue.sum AS 'revSum)
{code}

The Java version of the Table API has always been treated as a third-class citizen.

In the past, the string-based expression parser was (and still is):
- out of sync with the Scala DSL multiple times
- buggy, because it was not properly tested
- bloating the API, as every method must accept both representations, e.g. 
{{select(String)}} and {{select(Expression)}}
- confusing for users, because it looks like SQL but is actually only SQL-like
- unable to provide JavaDocs or further help in case of parse errors within the 
IDE; a user can only consult the online documentation

We should discuss alternatives for the string-based expressions and might come 
up with a DSL-like approach to make the Table API ready for the future.

This issue would be a big design shift for Table API users in Java. We need to 
discuss if the benefits are worth the change. A first proof-of-concept/design 
proposal will follow shortly.
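
For instance, a DSL-based Java variant of the query above could look roughly 
like this (purely an illustrative sketch; none of these names exist yet):

{code}
// Hypothetical Java DSL; $() and the expression methods are placeholders.
Table revenue = orders
  .filter($("cCountry").isEqual("FRANCE"))
  .groupBy($("cID"), $("cName"))
  .select($("cID"), $("cName"), $("revenue").sum().as("revSum"));
{code}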



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] dawidwys commented on a change in pull request #7967: [FLINK-11449][table] Uncouple the Expression class from RexNodes

2019-03-12 Thread GitBox
dawidwys commented on a change in pull request #7967: [FLINK-11449][table] 
Uncouple the Expression class from RexNodes
URL: https://github.com/apache/flink/pull/7967#discussion_r264751565
 
 

 ##
 File path: 
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/expressions/call.scala
 ##
 @@ -40,9 +40,10 @@ import _root_.scala.collection.JavaConverters._
   * General expression for unresolved function calls. The function can be a 
built-in
   * scalar function or a user-defined scalar function.
   */
-case class Call(functionName: String, args: Seq[Expression]) extends 
PlannerExpression {
+case class UnresolvedCall(functionName: String, args: Seq[PlannerExpression])
 
 Review comment:
   Do we need the `UnresolvedCall`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] dawidwys commented on a change in pull request #7967: [FLINK-11449][table] Uncouple the Expression class from RexNodes

2019-03-12 Thread GitBox
dawidwys commented on a change in pull request #7967: [FLINK-11449][table] 
Uncouple the Expression class from RexNodes
URL: https://github.com/apache/flink/pull/7967#discussion_r264703739
 
 

 ##
 File path: 
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/api/scala/expressionDsl.scala
 ##
 @@ -259,178 +302,177 @@ trait ImplicitExpressionOperations {
 *
 * e.g. "42".in(1, 2, 3) leads to false.
 */
-  def in(elements: Expression*) = In(expr, elements)
+  def in(elements: Expression*): Expression = call(IN, Seq(expr) ++ elements: 
_*)
 
 Review comment:
   expr +: elements


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] dawidwys commented on a change in pull request #7967: [FLINK-11449][table] Uncouple the Expression class from RexNodes

2019-03-12 Thread GitBox
dawidwys commented on a change in pull request #7967: [FLINK-11449][table] 
Uncouple the Expression class from RexNodes
URL: https://github.com/apache/flink/pull/7967#discussion_r264710350
 
 

 ##
 File path: 
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/api/scala/expressionDsl.scala
 ##
 @@ -1289,7 +1399,7 @@ object map {
 * Creates a map of expressions. The map will be a map between two objects 
(not primitives).
 */
   def apply(key: Expression, value: Expression, tail: Expression*): Expression 
= {
-MapConstructor(Seq(key, value) ++ tail.toSeq)
+call(MAP, Seq(key, value) ++ tail.toSeq: _*)
 
 Review comment:
   key +: value +: tail.toSeq: _*


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] dawidwys commented on a change in pull request #7967: [FLINK-11449][table] Uncouple the Expression class from RexNodes

2019-03-12 Thread GitBox
dawidwys commented on a change in pull request #7967: [FLINK-11449][table] 
Uncouple the Expression class from RexNodes
URL: https://github.com/apache/flink/pull/7967#discussion_r264742946
 
 

 ##
 File path: 
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/expressions/PlannerExpressionConverter.scala
 ##
 @@ -723,6 +721,16 @@ class PlannerExpressionConverter private extends 
ExpressionVisitor[PlannerExpres
 throw new TableException("Unrecognized expression: " + other)
 }
   }
+
+  private def getValue[T](literal: PlannerExpression): T = {
+literal.asInstanceOf[Literal].value.asInstanceOf[T]
+  }
+
+  private def assert(condition: Boolean): Unit = {
+if (!condition) {
+  throw new TableException("Assertion is violated.")
 
 Review comment:
   ```suggestion
 throw new ValidationException("Assertion is violated.")
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] dawidwys commented on a change in pull request #7967: [FLINK-11449][table] Uncouple the Expression class from RexNodes

2019-03-12 Thread GitBox
dawidwys commented on a change in pull request #7967: [FLINK-11449][table] 
Uncouple the Expression class from RexNodes
URL: https://github.com/apache/flink/pull/7967#discussion_r264703419
 
 

 ##
 File path: 
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/api/scala/expressionDsl.scala
 ##
 @@ -234,21 +272,26 @@ trait ImplicitExpressionOperations {
 * @param extraNames additional names if the expression expands to multiple 
fields
 * @return field with an alias
 */
-  def as(name: Symbol, extraNames: Symbol*) = Alias(expr, name.name, 
extraNames.map(_.name))
+  def as(name: Symbol, extraNames: Symbol*): Expression =
+call(
+  AS,
+  Seq(expr) ++
 
 Review comment:
   We can prepend the `Seq`:
   ``` 
   expr +: valueLiteral(name.name) +: extraNames.map(name => 
valueLiteral(name.name)): _*)
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] dawidwys commented on a change in pull request #7967: [FLINK-11449][table] Uncouple the Expression class from RexNodes

2019-03-12 Thread GitBox
dawidwys commented on a change in pull request #7967: [FLINK-11449][table] 
Uncouple the Expression class from RexNodes
URL: https://github.com/apache/flink/pull/7967#discussion_r264723067
 
 

 ##
 File path: 
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/api/scala/expressionDsl.scala
 ##
 @@ -1249,8 +1359,8 @@ object timestampDiff {
   timePointUnit: TimePointUnit,
   timePoint1: Expression,
   timePoint2: Expression)
-: Expression = {
-TimestampDiff(timePointUnit, timePoint1, timePoint2)
+  : Expression = {
 
 Review comment:
   ```suggestion
   : Expression = {
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] dawidwys commented on a change in pull request #7967: [FLINK-11449][table] Uncouple the Expression class from RexNodes

2019-03-12 Thread GitBox
dawidwys commented on a change in pull request #7967: [FLINK-11449][table] 
Uncouple the Expression class from RexNodes
URL: https://github.com/apache/flink/pull/7967#discussion_r264711540
 
 

 ##
 File path: 
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/api/scala/expressionDsl.scala
 ##
 @@ -1402,7 +1512,7 @@ object atan2 {
   **/
 object concat_ws {
   def apply(separator: Expression, string: Expression, strings: Expression*): 
Expression = {
-ConcatWs(separator, Seq(string) ++ strings)
+call(CONCAT_WS, Seq(separator) ++ Seq(string) ++ strings: _*)
 
 Review comment:
   
   ```suggestion
   call(CONCAT_WS, separator +: string +: strings: _*)
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] dawidwys commented on a change in pull request #7967: [FLINK-11449][table] Uncouple the Expression class from RexNodes

2019-03-12 Thread GitBox
dawidwys commented on a change in pull request #7967: [FLINK-11449][table] 
Uncouple the Expression class from RexNodes
URL: https://github.com/apache/flink/pull/7967#discussion_r264742116
 
 

 ##
 File path: 
flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/validate/FunctionCatalog.scala
 ##
 @@ -63,252 +112,16 @@ class FunctionCatalog {
 )
 
   /**
-* Lookup and create an expression if we find a match.
+* Lookup a function by name and operands and return the 
[[FunctionDefinition]].
 */
-  def lookupFunction(name: String, children: Seq[Expression]): Expression = {
-val funcClass = functionBuilders
-  .getOrElse(name.toLowerCase, throw new ValidationException(s"Undefined 
function: $name"))
-
-// Instantiate a function using the provided `children`
-funcClass match {
-
-  // user-defined scalar function call
-  case sf if classOf[ScalarFunction].isAssignableFrom(sf) =>
-val scalarSqlFunction = sqlFunctions
-  .find(f => f.getName.equalsIgnoreCase(name) && 
f.isInstanceOf[ScalarSqlFunction])
-  .getOrElse(throw new ValidationException(s"Undefined scalar 
function: $name"))
-  .asInstanceOf[ScalarSqlFunction]
-ScalarFunctionCall(scalarSqlFunction.getScalarFunction, children)
-
-  // user-defined table function call
-  case tf if classOf[TableFunction[_]].isAssignableFrom(tf) =>
-val tableSqlFunction = sqlFunctions
-  .find(f => f.getName.equalsIgnoreCase(name) && 
f.isInstanceOf[TableSqlFunction])
-  .getOrElse(throw new ValidationException(s"Undefined table function: 
$name"))
-  .asInstanceOf[TableSqlFunction]
-val typeInfo = tableSqlFunction.getRowTypeInfo
-val function = tableSqlFunction.getTableFunction
-TableFunctionCall(name, function, children, typeInfo)
-
-  // user-defined aggregate function call
-  case af if classOf[AggregateFunction[_, _]].isAssignableFrom(af) =>
-val aggregateFunction = sqlFunctions
-  .find(f => f.getName.equalsIgnoreCase(name) && 
f.isInstanceOf[AggSqlFunction])
-  .getOrElse(throw new ValidationException(s"Undefined table function: 
$name"))
-  .asInstanceOf[AggSqlFunction]
-val function = aggregateFunction.getFunction
-val returnType = aggregateFunction.returnType
-val accType = aggregateFunction.accType
-AggFunctionCall(function, returnType, accType, children)
-
-  // general expression call
-  case expression if classOf[Expression].isAssignableFrom(expression) =>
-// try to find a constructor accepts `Seq[Expression]`
-Try(funcClass.getDeclaredConstructor(classOf[Seq[_]])) match {
-  case Success(seqCtor) =>
-Try(seqCtor.newInstance(children).asInstanceOf[Expression]) match {
-  case Success(expr) => expr
-  case Failure(e) => throw new ValidationException(e.getMessage)
-}
-  case Failure(_) =>
-Try(funcClass.getDeclaredConstructor(classOf[Expression], 
classOf[Seq[_]])) match {
-  case Success(ctor) =>
-Try(ctor.newInstance(children.head, 
children.tail).asInstanceOf[Expression]) match {
-  case Success(expr) => expr
-  case Failure(e) => throw new 
ValidationException(e.getMessage)
-}
-  case Failure(_) =>
-val childrenClass = 
Seq.fill(children.length)(classOf[Expression])
-// try to find a constructor matching the exact number of 
children
-Try(funcClass.getDeclaredConstructor(childrenClass: _*)) match 
{
-  case Success(ctor) =>
-Try(ctor.newInstance(children: 
_*).asInstanceOf[Expression]) match {
-  case Success(expr) => expr
-  case Failure(exception) => throw new 
ValidationException(exception.getMessage)
-}
-  case Failure(_) =>
-throw new ValidationException(
-  s"Invalid number of arguments for function $funcClass")
-}
-}
-}
-  case _ =>
-throw new ValidationException("Unsupported function.")
-}
+  def lookupFunction(name: String, children: Seq[Expression]): 
FunctionDefinition = {
 
 Review comment:
   I would remove the `children`. It is not used for the lookup and complicates 
other places.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-11539) Add TypeSerializerSnapshot for TraversableSerializer

2019-03-12 Thread Aljoscha Krettek (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790701#comment-16790701
 ] 

Aljoscha Krettek commented on FLINK-11539:
--

[~jkreileder] The fix for this (FLINK-11865) is now on {{release-1.8}}.

> Add TypeSerializerSnapshot for TraversableSerializer
> 
>
> Key: FLINK-11539
> URL: https://issues.apache.org/jira/browse/FLINK-11539
> Project: Flink
>  Issue Type: Improvement
>  Components: API / Type Serialization System
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This will replace the deprecated {{TypeSerializerConfigSnapshot}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-11865) Code generation in TraversableSerializer is prohibitively slow

2019-03-12 Thread Aljoscha Krettek (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aljoscha Krettek closed FLINK-11865.

Resolution: Fixed

> Code generation in TraversableSerializer is prohibitively slow
> --
>
> Key: FLINK-11865
> URL: https://issues.apache.org/jira/browse/FLINK-11865
> Project: Flink
>  Issue Type: Bug
>  Components: API / Type Serialization System
>Reporter: Aljoscha Krettek
>Assignee: Igal Shilman
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As discussed in FLINK-11539, the new code generation makes job 
> submissions/translation prohibitively slow.
> The solution should be to introduce a Cache for the generated code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] aljoscha closed pull request #7957: [FLINK-11865] Improve code generation speed in TraversableSerializer

2019-03-12 Thread GitBox
aljoscha closed pull request #7957: [FLINK-11865] Improve code generation speed 
in TraversableSerializer
URL: https://github.com/apache/flink/pull/7957
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] aljoscha commented on issue #7957: [FLINK-11865] Improve code generation speed in TraversableSerializer

2019-03-12 Thread GitBox
aljoscha commented on issue #7957: [FLINK-11865] Improve code generation speed 
in TraversableSerializer
URL: https://github.com/apache/flink/pull/7957#issuecomment-472069699
 
 
   Merged


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-11865) Code generation in TraversableSerializer is prohibitively slow

2019-03-12 Thread Aljoscha Krettek (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16790700#comment-16790700
 ] 

Aljoscha Krettek commented on FLINK-11865:
--

Fixed on release-1.8 in
ce7aea06785d8e06fbacffe9c6100fe9fbb02f2a

> Code generation in TraversableSerializer is prohibitively slow
> --
>
> Key: FLINK-11865
> URL: https://issues.apache.org/jira/browse/FLINK-11865
> Project: Flink
>  Issue Type: Bug
>  Components: API / Type Serialization System
>Reporter: Aljoscha Krettek
>Assignee: Igal Shilman
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As discussed in FLINK-11539, the new code generation makes job 
> submissions/translation prohibitively slow.
> The solution should be to introduce a Cache for the generated code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] scott-mitchell opened a new pull request #7968: [FLINK-11692] [Datadog Reporter] added proxy to datadog reporter

2019-03-12 Thread GitBox
scott-mitchell opened a new pull request #7968: [FLINK-11692] [Datadog 
Reporter] added proxy to datadog reporter
URL: https://github.com/apache/flink/pull/7968
 
 
   
   
   ## What is the purpose of the change
   
   This pull request adds a configurable proxy to the Datadog metric reporter.
   
   ## Brief change log
   
 - Proxy host and port configurable in flink config for Datadog reporter
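   
   For illustration, the proxy can be wired into an OkHttp-based client roughly 
like this (a sketch under the assumption that the reporter uses OkHttp; the 
helper name is made up):
   
   ```java
   import java.net.InetSocketAddress;
   import java.net.Proxy;
   
   import okhttp3.OkHttpClient;
   
   /** Illustrative helper: route all reporter traffic through an HTTP proxy. */
   final class ProxyClientFactory {
       static OkHttpClient createClient(String proxyHost, int proxyPort) {
           return new OkHttpClient.Builder()
               // the client applies this java.net.Proxy to every request
               .proxy(new Proxy(Proxy.Type.HTTP, new InetSocketAddress(proxyHost, proxyPort)))
               .build();
       }
   }
   ```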
   
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
 - Added a unit test ensuring the correct proxy is used
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): No
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: No
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? yes
 - If yes, how is the feature documented? docs
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] flinkbot commented on issue #7968: [FLINK-11692] [Datadog Reporter] added proxy to datadog reporter

2019-03-12 Thread GitBox
flinkbot commented on issue #7968: [FLINK-11692] [Datadog Reporter] added proxy 
to datadog reporter
URL: https://github.com/apache/flink/pull/7968#issuecomment-472096718
 
 
   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of 
the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required.

Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11692) Add Proxy for DataDog Metric Reporter

2019-03-12 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11692:
---
Labels: pull-request-available  (was: )

> Add Proxy for DataDog Metric Reporter
> -
>
> Key: FLINK-11692
> URL: https://issues.apache.org/jira/browse/FLINK-11692
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Metrics
>Reporter: Scott Mitchell
>Priority: Minor
>  Labels: pull-request-available
>
> There's no clear way currently to route just metric traffic through a proxy. 
> Although one can use java http proxy environment variables and an extensive 
> blacklist, the solution is far from elegant or robust. It would be great to 
> have a proxy option for the DataDog metric reporter, and other http metric 
> reporters if the solution is easily extendable.
>  
> Additional sample config lines:
> {code:java}
> metrics.reporter.dghttp.proxyHost: my.proxy.com
> metrics.reporter.dghttp.proxyPort: 8080{code}
>  
> I can take this ticket myself, unless there is already another robust 
> solution to route just metric data through a proxy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] [flink] Xeli commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)

2019-03-12 Thread GitBox
Xeli commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added 
PubSub source connector with support for checkpointing (ATLEAST_ONCE)
URL: https://github.com/apache/flink/pull/6594#discussion_r264810803
 
 

 ##
 File path: 
flink-connectors/flink-connector-gcp-pubsub/src/main/java/org/apache/flink/streaming/connectors/gcp/pubsub/PubSubSubscriberFactoryForEmulator.java
 ##
 @@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.gcp.pubsub;
+
+import 
org.apache.flink.streaming.connectors.gcp.pubsub.common.PubSubSubscriberFactory;
+
+import com.google.api.gax.core.NoCredentialsProvider;
+import com.google.api.gax.grpc.GrpcTransportChannel;
+import com.google.api.gax.rpc.FixedTransportChannelProvider;
+import com.google.auth.Credentials;
+import com.google.cloud.pubsub.v1.stub.GrpcSubscriberStub;
+import com.google.cloud.pubsub.v1.stub.SubscriberStub;
+import com.google.cloud.pubsub.v1.stub.SubscriberStubSettings;
+import io.grpc.ManagedChannel;
+import io.grpc.netty.shaded.io.grpc.netty.NettyChannelBuilder;
+import io.grpc.netty.shaded.io.netty.channel.EventLoopGroup;
+
+import java.io.IOException;
+
+/**
+ * A convenience PubSubSubscriberFactory that can be used to connect to a 
PubSub emulator.
+ * The PubSub emulators do not support SSL or Credentials and as such this 
SubscriberStub does not require or provide this.
+ */
+public class PubSubSubscriberFactoryForEmulator implements 
PubSubSubscriberFactory {
 
 Review comment:
   Yeah, I believe it will be useful for others if they want to run integration 
tests of their Flink jobs. See the docs for a small piece of documentation on 
how I think this class can help.
   
   I looked into adding it to `flink-test-utils`, but because connectors are not 
part of Flink core, the test utils would have to depend on this connector. I 
think it's best to leave it as is.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] Xeli commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)

2019-03-12 Thread GitBox
Xeli commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added 
PubSub source connector with support for checkpointing (ATLEAST_ONCE)
URL: https://github.com/apache/flink/pull/6594#discussion_r264827593
 
 

 ##
 File path: 
flink-connectors/flink-connector-gcp-pubsub/src/main/java/org/apache/flink/streaming/connectors/gcp/pubsub/PubSubSource.java
 ##
 @@ -0,0 +1,296 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.gcp.pubsub;
+
+import org.apache.flink.api.common.functions.RuntimeContext;
+import org.apache.flink.api.common.functions.StoppableFunction;
+import org.apache.flink.api.common.serialization.DeserializationSchema;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.typeutils.ResultTypeQueryable;
+import org.apache.flink.configuration.Configuration;
+import 
org.apache.flink.streaming.api.functions.source.MultipleIdsMessageAcknowledgingSourceBase;
+import org.apache.flink.streaming.api.functions.source.ParallelSourceFunction;
+import org.apache.flink.streaming.api.operators.StreamingRuntimeContext;
+import 
org.apache.flink.streaming.connectors.gcp.pubsub.common.PubSubSubscriberFactory;
+import org.apache.flink.util.Preconditions;
+
+import com.google.api.core.ApiFuture;
+import com.google.auth.Credentials;
+import com.google.cloud.pubsub.v1.Subscriber;
+import com.google.cloud.pubsub.v1.stub.SubscriberStub;
+import com.google.pubsub.v1.AcknowledgeRequest;
+import com.google.pubsub.v1.ProjectSubscriptionName;
+import com.google.pubsub.v1.PubsubMessage;
+import com.google.pubsub.v1.PullRequest;
+import com.google.pubsub.v1.PullResponse;
+import com.google.pubsub.v1.ReceivedMessage;
+import io.grpc.netty.shaded.io.netty.channel.EventLoopGroup;
+import io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoopGroup;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.concurrent.CancellationException;
+import java.util.concurrent.TimeUnit;
+
+import static 
com.google.cloud.pubsub.v1.SubscriptionAdminSettings.defaultCredentialsProviderBuilder;
+
+/**
+ * PubSub Source, this Source will consume PubSub messages from a subscription 
and Acknowledge them on the next checkpoint.
+ * This ensures every message will get acknowledged at least once.
+ */
+public class PubSubSource<OUT> extends 
MultipleIdsMessageAcknowledgingSourceBase<OUT, String, String>
+   implements ResultTypeQueryable<OUT>, 
ParallelSourceFunction<OUT>, StoppableFunction {
+   private static final Logger LOG = 
LoggerFactory.getLogger(PubSubSource.class);
+   protected final DeserializationSchema<OUT> deserializationSchema;
+   protected final PubSubSubscriberFactory pubSubSubscriberFactory;
+   protected final Credentials credentials;
+   protected final String projectSubscriptionName;
+   protected final int maxMessagesPerPull;
+
+   protected transient boolean deduplicateMessages;
+   protected transient SubscriberStub subscriber;
+   protected transient PullRequest pullRequest;
+   protected transient EventLoopGroup eventLoopGroup;
+
+   protected transient volatile boolean isRunning;
+   protected transient volatile ApiFuture<PullResponse> messagesFuture;
+
+   PubSubSource(DeserializationSchema<OUT> deserializationSchema, 
PubSubSubscriberFactory pubSubSubscriberFactory, Credentials credentials, 
String projectSubscriptionName, int maxMessagesPerPull) {
+   super(String.class);
+   this.deserializationSchema = deserializationSchema;
+   this.pubSubSubscriberFactory = pubSubSubscriberFactory;
+   this.credentials = credentials;
+   this.projectSubscriptionName = projectSubscriptionName;
+   this.maxMessagesPerPull = maxMessagesPerPull;
+   }
+
+   @Override
+   public void open(Configuration configuration) throws Exception {
+   super.open(configuration);
+   if (hasNoCheckpointingEnabled(getRuntimeContext())) {
+   throw new IllegalArgumentException("The PubSubSource 
REQUIRES Checkpointing to be enabled and " +
+   "the 

[GitHub] [flink] Xeli commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)

2019-03-12 Thread GitBox
Xeli commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added 
PubSub source connector with support for checkpointing (ATLEAST_ONCE)
URL: https://github.com/apache/flink/pull/6594#discussion_r264828377
 
 

 ##
 File path: 
flink-connectors/flink-connector-gcp-pubsub/src/main/java/org/apache/flink/streaming/connectors/gcp/pubsub/PubSubSource.java
 ##
 @@ -0,0 +1,296 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.gcp.pubsub;
+
+import org.apache.flink.api.common.functions.RuntimeContext;
+import org.apache.flink.api.common.functions.StoppableFunction;
+import org.apache.flink.api.common.serialization.DeserializationSchema;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.typeutils.ResultTypeQueryable;
+import org.apache.flink.configuration.Configuration;
+import 
org.apache.flink.streaming.api.functions.source.MultipleIdsMessageAcknowledgingSourceBase;
+import org.apache.flink.streaming.api.functions.source.ParallelSourceFunction;
+import org.apache.flink.streaming.api.operators.StreamingRuntimeContext;
+import 
org.apache.flink.streaming.connectors.gcp.pubsub.common.PubSubSubscriberFactory;
+import org.apache.flink.util.Preconditions;
+
+import com.google.api.core.ApiFuture;
+import com.google.auth.Credentials;
+import com.google.cloud.pubsub.v1.Subscriber;
+import com.google.cloud.pubsub.v1.stub.SubscriberStub;
+import com.google.pubsub.v1.AcknowledgeRequest;
+import com.google.pubsub.v1.ProjectSubscriptionName;
+import com.google.pubsub.v1.PubsubMessage;
+import com.google.pubsub.v1.PullRequest;
+import com.google.pubsub.v1.PullResponse;
+import com.google.pubsub.v1.ReceivedMessage;
+import io.grpc.netty.shaded.io.netty.channel.EventLoopGroup;
+import io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoopGroup;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.concurrent.CancellationException;
+import java.util.concurrent.TimeUnit;
+
+import static 
com.google.cloud.pubsub.v1.SubscriptionAdminSettings.defaultCredentialsProviderBuilder;
+
+/**
+ * PubSub Source, this Source will consume PubSub messages from a subscription 
and Acknowledge them on the next checkpoint.
+ * This ensures every message will get acknowledged at least once.
+ */
+public class PubSubSource<OUT> extends 
MultipleIdsMessageAcknowledgingSourceBase<OUT, String, String>
+   implements ResultTypeQueryable<OUT>, 
ParallelSourceFunction<OUT>, StoppableFunction {
+   private static final Logger LOG = 
LoggerFactory.getLogger(PubSubSource.class);
+   protected final DeserializationSchema<OUT> deserializationSchema;
+   protected final PubSubSubscriberFactory pubSubSubscriberFactory;
+   protected final Credentials credentials;
+   protected final String projectSubscriptionName;
+   protected final int maxMessagesPerPull;
+
+   protected transient boolean deduplicateMessages;
+   protected transient SubscriberStub subscriber;
+   protected transient PullRequest pullRequest;
+   protected transient EventLoopGroup eventLoopGroup;
+
+   protected transient volatile boolean isRunning;
+   protected transient volatile ApiFuture<PullResponse> messagesFuture;
+
+   PubSubSource(DeserializationSchema<OUT> deserializationSchema, 
PubSubSubscriberFactory pubSubSubscriberFactory, Credentials credentials, 
String projectSubscriptionName, int maxMessagesPerPull) {
+   super(String.class);
+   this.deserializationSchema = deserializationSchema;
+   this.pubSubSubscriberFactory = pubSubSubscriberFactory;
+   this.credentials = credentials;
+   this.projectSubscriptionName = projectSubscriptionName;
+   this.maxMessagesPerPull = maxMessagesPerPull;
+   }
+
+   @Override
+   public void open(Configuration configuration) throws Exception {
+   super.open(configuration);
+   if (hasNoCheckpointingEnabled(getRuntimeContext())) {
+   throw new IllegalArgumentException("The PubSubSource 
REQUIRES Checkpointing to be enabled and " +
+   "the 

[GitHub] [flink] twalthr commented on issue #7967: [FLINK-11449][table] Uncouple the Expression class from RexNodes

2019-03-12 Thread GitBox
twalthr commented on issue #7967: [FLINK-11449][table] Uncouple the Expression 
class from RexNodes
URL: https://github.com/apache/flink/pull/7967#issuecomment-472180884
 
 
   Thank you @dawidwys for the review and @sunjincheng121 for the preparatory 
work. Merging this now.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] asfgit merged pull request #7967: [FLINK-11449][table] Uncouple the Expression class from RexNodes

2019-03-12 Thread GitBox
asfgit merged pull request #7967: [FLINK-11449][table] Uncouple the Expression 
class from RexNodes
URL: https://github.com/apache/flink/pull/7967
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [flink] asfgit closed pull request #7664: [FLINK-11449][table] Uncouple the Expression class from RexNodes.

2019-03-12 Thread GitBox
asfgit closed pull request #7664: [FLINK-11449][table] Uncouple the Expression 
class from RexNodes.
URL: https://github.com/apache/flink/pull/7664
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

