[GitHub] lamber-ken opened a new pull request #7551: [hotfix][runtime] add default statement in switch block when judge taskAcknowledgeResult

2019-01-21 Thread GitBox
lamber-ken opened a new pull request #7551: [hotfix][runtime] add default 
statement in switch block when judge taskAcknowledgeResult
URL: https://github.com/apache/flink/pull/7551
 
 
   ## What is the purpose of the change
   
   Add a `default` statement to the `switch` block that judges the `taskAcknowledgeResult`.
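
   A minimal sketch of the kind of change, assuming an enum such as `PendingCheckpoint.TaskAcknowledgeResult` with values SUCCESS, DUPLICATE, UNKNOWN, DISCARDED; the handler bodies are illustrative, not the PR's actual code:

   ```java
   enum TaskAcknowledgeResult { SUCCESS, DUPLICATE, UNKNOWN, DISCARDED }

   void handleAcknowledge(TaskAcknowledgeResult result) {
       switch (result) {
           case SUCCESS:
               // acknowledgment accepted
               break;
           case DUPLICATE:
               // the task acknowledged twice; safe to ignore
               break;
           case UNKNOWN:
           case DISCARDED:
               // checkpoint is no longer pending; discard any reported state
               break;
           default:
               // the added default branch: future enum values can no longer
               // fall through silently
               throw new IllegalStateException("Unexpected acknowledge result: " + result);
       }
   }
   ```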
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not documented)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] eaglewatcherwb commented on a change in pull request #7436: [FLINK-11071][core] add support for dynamic proxy classes resolution …

2019-01-21 Thread GitBox
eaglewatcherwb commented on a change in pull request #7436: [FLINK-11071][core] 
add support for dynamic proxy classes resolution …
URL: https://github.com/apache/flink/pull/7436#discussion_r249663932
 
 

 ##
 File path: 
flink-core/src/test/java/org/apache/flink/util/InstantiationUtilTest.java
 ##
 @@ -46,6 +50,17 @@
  */
 public class InstantiationUtilTest extends TestLogger {
 
+   @Test
+   public void testResolveProxyClass() throws Exception {
 
 Review comment:
   @dawidwys Thanks for the comments.
   The unit test is updated, and it fails when 
`ClassLoaderObjectInputStream#resolveProxyClass` only calls 
`super.resolveProxyClass`. I cannot remove all the changes to verify the case, 
since `ObjectInputStream#resolveProxyClass` is `protected`.
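
   For context, a self-contained sketch of resolving proxy classes against a custom class loader, which is what a `ClassLoaderObjectInputStream`-style implementation typically does (illustrative, not Flink's exact code):

   ```java
   import java.io.IOException;
   import java.io.InputStream;
   import java.io.ObjectInputStream;
   import java.lang.reflect.Proxy;

   class ProxyResolvingObjectInputStream extends ObjectInputStream {
       private final ClassLoader classLoader;

       ProxyResolvingObjectInputStream(ClassLoader classLoader, InputStream in) throws IOException {
           super(in);
           this.classLoader = classLoader;
       }

       @Override
       protected Class<?> resolveProxyClass(String[] interfaces)
               throws IOException, ClassNotFoundException {
           // Resolve every proxy interface against the custom class loader rather
           // than the caller's class loader, which super.resolveProxyClass uses.
           Class<?>[] resolved = new Class<?>[interfaces.length];
           for (int i = 0; i < interfaces.length; i++) {
               resolved[i] = Class.forName(interfaces[i], false, classLoader);
           }
           return Proxy.getProxyClass(classLoader, resolved);
       }
   }
   ```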


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Clarkkkkk opened a new pull request #7550: [FLINK-10245] [Streaming Connector] Add Pojo, Tuple, Row and Scala Product DataStream Sink and Upsert Table Sink for HBase

2019-01-21 Thread GitBox
Clarkkkkk opened a new pull request #7550: [FLINK-10245] [Streaming Connector] 
Add Pojo, Tuple, Row and Scala Product DataStream Sink and Upsert Table Sink 
for HBase
URL: https://github.com/apache/flink/pull/7550
 
 
   ## What is the purpose of the change
   - Add datastream sink and upsert table sink for HBase
   
   ## Brief change log
   - Add HBase connector, details are in the doc.
   
   ## Verifying this change
   This change added tests and can be verified as follows:
   - org.apache.flink.streaming.connectors.hbase.HBaseSinkITCase: starts a mini 
HBase cluster to verify all sink functions.
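
   For orientation, a hypothetical usage sketch; `HBaseSink` and its constructor arguments are illustrative names, not the PR's actual API (see the linked design doc for that):

   ```java
   import org.apache.flink.api.java.tuple.Tuple2;
   import org.apache.flink.streaming.api.datastream.DataStream;
   import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

   public class HBaseSinkSketch {
       public static void main(String[] args) throws Exception {
           StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
           DataStream<Tuple2<String, Long>> counts =
                   env.fromElements(Tuple2.of("rowKey1", 1L), Tuple2.of("rowKey2", 2L));
           // Illustrative only -- the real sink class and constructor come from the PR:
           // counts.addSink(new HBaseSink<>("myTable", tupleToMutationMapper));
           env.execute("hbase-sink-sketch");
       }
   }
   ```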
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency):  no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? yes
 - If yes, how is the feature documented? JavaDocs and 
https://docs.google.com/document/d/1of0cYd73CtKGPt-UL3WVFTTBsVEre-TNRzoAt5u2PdQ/edit#
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-5703) ExecutionGraph recovery based on reconciliation with TaskManager reports

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-5703:
--
Labels: pull-request-available  (was: )

> ExecutionGraph recovery based on reconciliation with TaskManager reports
> 
>
> Key: FLINK-5703
> URL: https://issues.apache.org/jira/browse/FLINK-5703
> Project: Flink
>  Issue Type: Sub-task
>  Components: Distributed Coordination, JobManager
>Reporter: zhijiang
>Assignee: zhijiang
>Priority: Major
>  Labels: pull-request-available
>
> The ExecutionGraph structure would be recovered from TaskManager reports 
> during the reconciling period. The necessary information includes:
> - Execution: ExecutionAttemptID, AttemptNumber, StartTimestamp, 
> ExecutionState, SimpleSlot, PartialInputChannelDeploymentDescriptor (consumer 
> Execution)
> - ExecutionVertex: Map<IntermediateResultPartitionID, 
> IntermediateResultPartition>
> - ExecutionGraph: ConcurrentHashMap<ExecutionAttemptID, Execution>
> The {{RECONCILING}} ExecutionState should transition into any existing 
> task state ({{RUNNING}}, {{CANCELED}}, {{FAILED}}, {{FINISHED}}). To do so, the 
> TaskManager should maintain the terminal task states 
> ({{CANCELED}}, {{FAILED}}, {{FINISHED}}) for a while; we will try to realize that 
> mechanism in another JIRA. In addition, the state transition would trigger 
> different actions, and some actions rely on the necessary information above. 
> Considering this limit, the recovery process is divided into two steps:
> - First, recover all other necessary information except the ExecutionState.
> - Second, transition the ExecutionState into the real task state and trigger 
> actions. The behavior is the same as the current {{UpdateTaskExecutionState}}.
> To keep the logic simple and consistent, during the recovery period all other 
> RPC messages ({{UpdateTaskExecutionState}}, {{ScheduleOrUpdateConsumers}}, etc.) 
> from the TaskManager should be refused temporarily and responded to with a 
> special message by the JobMaster. The TaskManager should then retry sending 
> these messages until the JobManager finishes recovery and acknowledges them.
> The {{RECONCILING}} JobStatus would transition into one of the states 
> ({{RUNNING}}, {{FAILING}}, {{FINISHED}}) after recovery:
> - {{RECONCILING}} to {{RUNNING}}: all TaskManagers report within the 
> duration and all tasks are in the {{RUNNING}} state.
> - {{RECONCILING}} to {{FAILING}}: one of the TaskManagers does not report 
> in time, or one of the tasks is in the {{FAILED}} or {{CANCELED}} state.
> - {{RECONCILING}} to {{FINISHED}}: all TaskManagers report within the 
> duration and all tasks are in the {{FINISHED}} state.
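
A compact sketch of the job-status decision described above (names and types simplified; illustrative only, not the proposed implementation):

{code:java}
import java.util.Collection;

class ReconcileSketch {
    enum Outcome { RUNNING, FAILING, FINISHED }

    static Outcome reconcile(boolean allReportedInTime, Collection<String> taskStates) {
        if (!allReportedInTime
                || taskStates.contains("FAILED")
                || taskStates.contains("CANCELED")) {
            return Outcome.FAILING;
        }
        // A mix of RUNNING and FINISHED tasks is treated as RUNNING here for brevity.
        return taskStates.stream().allMatch("FINISHED"::equals)
                ? Outcome.FINISHED
                : Outcome.RUNNING;
    }
}
{code}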



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zhijiangW commented on issue #3340: [FLINK-5703] [Job Manager] ExecutionGraph recovery via reconciliation with TaskManager reports

2019-01-21 Thread GitBox
zhijiangW commented on issue #3340: [FLINK-5703] [Job Manager] ExecutionGraph 
recovery via reconciliation with TaskManager reports
URL: https://github.com/apache/flink/pull/3340#issuecomment-456296041
 
 
   Closing this PR temporarily; the feature will be re-submitted if needed 
in the future.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] zhijiangW closed pull request #3340: [FLINK-5703] [Job Manager] ExecutionGraph recovery via reconciliation with TaskManager reports

2019-01-21 Thread GitBox
zhijiangW closed pull request #3340: [FLINK-5703] [Job Manager] ExecutionGraph 
recovery via reconciliation with TaskManager reports
URL: https://github.com/apache/flink/pull/3340
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11403) Remove ResultPartitionConsumableNotifier from ResultPartition

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11403:
---
Labels: pull-request-available  (was: )

> Remove ResultPartitionConsumableNotifier from ResultPartition
> -
>
> Key: FLINK-11403
> URL: https://issues.apache.org/jira/browse/FLINK-11403
> Project: Flink
>  Issue Type: Sub-task
>  Components: Network
>Reporter: zhijiang
>Assignee: zhijiang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>
> This is the precondition for introducing a pluggable {{ShuffleService}} on the 
> TM side.
> In the current process of creating a {{ResultPartition}}, the 
> {{ResultPartitionConsumableNotifier}}, regarded as a TM-level component, has to 
> be passed into the constructor. In order to create a {{ResultPartition}} easily 
> from the {{ShuffleService}}, the required information should be covered by the 
> {{ResultPartitionDeploymentDescriptor}} as much as possible, so we can remove 
> this notifier from the constructor. It is also reasonable to notify about a 
> consumable partition via {{TaskActions}}, which is already available in 
> {{ResultPartition}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zhijiangW opened a new pull request #7549: [FLINK-11403][network] Remove ResultPartitionConsumableNotifier from ResultPartition

2019-01-21 Thread GitBox
zhijiangW opened a new pull request #7549: [FLINK-11403][network] Remove 
ResultPartitionConsumableNotifier from ResultPartition
URL: https://github.com/apache/flink/pull/7549
 
 
   ## What is the purpose of the change
   
   *This is the precondition for introducing a pluggable `ShuffleService` on the 
TM side.*
   
   *In the current process of creating a `ResultPartition`, the 
`ResultPartitionConsumableNotifier`, regarded as a TM-level component, has to be 
passed into the constructor. In order to create a `ResultPartition` easily via 
the `ShuffleService`, the required information should be covered by the 
`ResultPartitionDeploymentDescriptor` as much as possible, so we can remove this 
notifier from the constructor. It is also reasonable to notify about a 
consumable partition via `TaskActions`, which is already available in 
`ResultPartition`.*
   
   ## Brief change log
   
 - *Remove `ResultPartitionConsumableNotifier` from `ResultPartition` 
constructor*
 - *Introduce `notifyPartitionConsumable` in the `TaskActions` interface (see 
the sketch below)*
 - *Make the `NoOpTaskActions` implementation a public class for test usage*
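
   In code terms, a sketch of what these changes amount to (signatures assumed for illustration, not copied from the diff):

   ```java
   import org.apache.flink.api.common.JobID;
   import org.apache.flink.runtime.io.network.partition.ResultPartitionID;

   // Before: a TM-level notifier had to be injected into every ResultPartition:
   //   public ResultPartition(..., ResultPartitionConsumableNotifier notifier, ...)
   //
   // After: ResultPartition notifies through the TaskActions it already holds.
   interface TaskActions {
       /** Pre-existing action, shown for context. */
       void failExternally(Throwable cause);

       /** Newly introduced: signal that a produced partition has consumable data. */
       void notifyPartitionConsumable(JobID jobId, ResultPartitionID partitionId);
   }
   ```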
   
   ## Verifying this change
   
   This change is already covered by existing tests, such as 
*ResultPartitionTest*.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-10245) Add HBase Streaming Sink

2019-01-21 Thread Shimin Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-10245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shimin Yang updated FLINK-10245:

Summary: Add HBase Streaming Sink  (was: Add DataStream HBase Sink)

> Add HBase Streaming Sink
> 
>
> Key: FLINK-10245
> URL: https://issues.apache.org/jira/browse/FLINK-10245
> Project: Flink
>  Issue Type: Sub-task
>  Components: Streaming Connectors
>Reporter: Shimin Yang
>Assignee: Shimin Yang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Design documentation: 
> [https://docs.google.com/document/d/1of0cYd73CtKGPt-UL3WVFTTBsVEre-TNRzoAt5u2PdQ/edit?usp=sharing]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11405) rest api can see exception by start end time filter

2019-01-21 Thread lining (JIRA)
lining created FLINK-11405:
--

 Summary: rest api can see exception by start end time filter
 Key: FLINK-11405
 URL: https://issues.apache.org/jira/browse/FLINK-11405
 Project: Flink
  Issue Type: Sub-task
Reporter: lining
Assignee: lining






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11404) web ui add see page and can filter by time

2019-01-21 Thread lining (JIRA)
lining created FLINK-11404:
--

 Summary: web ui add see page and can filter by time
 Key: FLINK-11404
 URL: https://issues.apache.org/jira/browse/FLINK-11404
 Project: Flink
  Issue Type: Sub-task
Reporter: lining
Assignee: lining






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (FLINK-11404) web ui add see page and can filter by time

2019-01-21 Thread lining (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lining reassigned FLINK-11404:
--

Assignee: Yadong Xie  (was: lining)

> web ui add see page and can filter by time
> --
>
> Key: FLINK-11404
> URL: https://issues.apache.org/jira/browse/FLINK-11404
> Project: Flink
>  Issue Type: Sub-task
>Reporter: lining
>Assignee: Yadong Xie
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (FLINK-11374) See more failover and can filter by time range

2019-01-21 Thread lining (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lining updated FLINK-11374:
---
Comment: was deleted

(was: agg by vertex

!image-2019-01-22-11-42-33-808.png!)

> See more failover and can filter by time range
> --
>
> Key: FLINK-11374
> URL: https://issues.apache.org/jira/browse/FLINK-11374
> Project: Flink
>  Issue Type: Improvement
>  Components: REST, Webfrontend
>Reporter: lining
>Assignee: lining
>Priority: Major
> Attachments: image-2019-01-22-11-40-53-135.png, 
> image-2019-01-22-11-42-33-808.png
>
>
> Currently, the failover view only shows a limited number of the most recent 
> task failures. If a task has failed many times, we cannot see the earlier 
> failovers. Can we add a time filter to see failovers, including each task 
> attempt's failure message?
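
Illustrative only (the existing REST endpoint takes no time parameters, and the query parameter names below are hypothetical): the proposal amounts to something like

{code}
GET /jobs/<jobid>/exceptions?start=<from-timestamp>&end=<to-timestamp>
{code}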



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Myasuka commented on issue #7544: [FLINK-11366][tests] Port TaskManagerMetricsTest to new code base

2019-01-21 Thread GitBox
Myasuka commented on issue #7544: [FLINK-11366][tests] Port 
TaskManagerMetricsTest to new code base
URL: https://github.com/apache/flink/pull/7544#issuecomment-456287424
 
 
   I closed and reopened this PR to re-run the checks after an unexpected ITCase 
failure (see [FLINK-9920](https://issues.apache.org/jira/browse/FLINK-9920)).
   
   CC @tillrohrmann, would you please take a look at this PR?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Myasuka closed pull request #7544: [FLINK-11366][tests] Port TaskManagerMetricsTest to new code base

2019-01-21 Thread GitBox
Myasuka closed pull request #7544: [FLINK-11366][tests] Port 
TaskManagerMetricsTest to new code base
URL: https://github.com/apache/flink/pull/7544
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Myasuka opened a new pull request #7544: [FLINK-11366][tests] Port TaskManagerMetricsTest to new code base

2019-01-21 Thread GitBox
Myasuka opened a new pull request #7544: [FLINK-11366][tests] Port 
TaskManagerMetricsTest to new code base
URL: https://github.com/apache/flink/pull/7544
 
 
   
   ## What is the purpose of the change
   Port `TaskManagerMetricsTest` to new code base.
   
   ## Brief change log
   
 - Move `TaskManagerMetricsTest`'s test to 
`TaskExecutorTest#testHeartbeatTimeoutWithResourceManager`
   
   
   ## Verifying this change
   This change added tests and can be verified as follows:
   
 - Verify that the TaskManager's metric registry is not shut down due to the 
disconnect from the RM
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Myasuka commented on issue #7525: [FLINK-11363][test] Check and remove TaskManagerConfigurationTest

2019-01-21 Thread GitBox
Myasuka commented on issue #7525: [FLINK-11363][test] Check and remove 
TaskManagerConfigurationTest
URL: https://github.com/apache/flink/pull/7525#issuecomment-456287098
 
 
   I closed and reopened this PR to re-run the checks after an unexpected ITCase 
crash (see [FLINK-11380](https://issues.apache.org/jira/browse/FLINK-11380)).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-9920) BucketingSinkFaultToleranceITCase fails on travis

2019-01-21 Thread Yun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748419#comment-16748419
 ] 

Yun Tang commented on FLINK-9920:
-

Another instance:  [https://api.travis-ci.org/v3/job/481937341/log.txt]

[~aljoscha] When you tried to verify this case locally, did you use hadoop-2.8.3 
just like Travis does, and was the local OS Linux rather than macOS?

> BucketingSinkFaultToleranceITCase fails on travis
> -
>
> Key: FLINK-9920
> URL: https://issues.apache.org/jira/browse/FLINK-9920
> Project: Flink
>  Issue Type: Bug
>  Components: filesystem-connector
>Affects Versions: 1.6.0, 1.7.0
>Reporter: Chesnay Schepler
>Assignee: Aljoscha Krettek
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.8.0
>
>
> https://travis-ci.org/zentol/flink/jobs/407021898
> {code}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.082 sec 
> <<< FAILURE! - in 
> org.apache.flink.streaming.connectors.fs.bucketing.BucketingSinkFaultToleranceITCase
> runCheckpointedProgram(org.apache.flink.streaming.connectors.fs.bucketing.BucketingSinkFaultToleranceITCase)
>   Time elapsed: 5.696 sec  <<< FAILURE!
> java.lang.AssertionError: Read line does not match expected pattern.
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.flink.streaming.connectors.fs.bucketing.BucketingSinkFaultToleranceITCase.postSubmit(BucketingSinkFaultToleranceITCase.java:182)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
alexeyt820 commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249653432
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/internal/Kafka011ConsumerRecord.java
 ##
 @@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.kafka.internal;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Function;
+import org.apache.flink.shaded.guava18.com.google.common.collect.Iterables;
+
+import org.apache.kafka.clients.consumer.ConsumerRecord;
+import org.apache.kafka.common.header.Header;
+
+import javax.annotation.Nonnull;
+import javax.annotation.Nullable;
+
+import java.util.AbstractMap;
+import java.util.Map;
+
+/**
+ * Extends base Kafka09ConsumerRecord to provide access to Kafka headers.
 
 Review comment:
   Done - a6065aa
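
   Based on the imports quoted above (Guava `Iterables`, Kafka `Header`, `AbstractMap`), the header accessor in such a record class plausibly looks like the sketch below; this is illustrative, not the PR's exact code (Flink would use the shaded `flink.shaded.guava18` Guava packages):

   ```java
   import com.google.common.collect.Iterables;
   import org.apache.kafka.clients.consumer.ConsumerRecord;
   import org.apache.kafka.common.header.Header;

   import java.util.AbstractMap;
   import java.util.Map;

   class Kafka011HeadersSketch {
       private final ConsumerRecord<byte[], byte[]> record;

       Kafka011HeadersSketch(ConsumerRecord<byte[], byte[]> record) {
           this.record = record;
       }

       /** Expose Kafka 0.11 headers as simple (key, value) entries. */
       public Iterable<Map.Entry<String, byte[]>> headers() {
           return Iterables.transform(
               record.headers(),
               header -> new AbstractMap.SimpleImmutableEntry<>(header.key(), header.value()));
       }
   }
   ```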


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Aitozi commented on issue #4514: [FLINK-7384][cep] Unify event and processing time handling in the Abs…

2019-01-21 Thread GitBox
Aitozi commented on issue #4514: [FLINK-7384][cep] Unify event and processing 
time handling in the Abs…
URL: https://github.com/apache/flink/pull/4514#issuecomment-456284959
 
 
   Hi @dawidwys,
   I have gone through this PR, and I think the unified solution is cleaner. It 
might be better to check the timeCharacteristic in `onTimer`, because I think we 
only need to `advanceTime` under processing time and don't have to fetch from 
the `elementqueue`. Also, why wasn't this PR merged? Is something blocking it?
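
   A sketch of the check being suggested (operator internals and method names are assumed for illustration; this is not the PR's code):

   ```java
   import org.apache.flink.streaming.api.TimeCharacteristic;

   class CepTimerSketch {
       private final TimeCharacteristic timeCharacteristic;

       CepTimerSketch(TimeCharacteristic timeCharacteristic) {
           this.timeCharacteristic = timeCharacteristic;
       }

       void onTimer(long timestamp) {
           if (timeCharacteristic == TimeCharacteristic.ProcessingTime) {
               // processing time: elements were processed as they arrived, so
               // just advance the NFA clock to prune expired state
               advanceTime(timestamp);
           } else {
               // event time: drain elements buffered in the queue up to the watermark
               drainElementQueueUpTo(timestamp);
           }
       }

       private void advanceTime(long timestamp) { /* illustrative stub */ }
       private void drainElementQueueUpTo(long timestamp) { /* illustrative stub */ }
   }
   ```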


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Aitozi removed a comment on issue #7543: [FLINK-10996]Fix the possible state leak with CEP processing time model

2019-01-21 Thread GitBox
Aitozi removed a comment on issue #7543: [FLINK-10996]Fix the possible state 
leak with CEP processing time model
URL: https://github.com/apache/flink/pull/7543#issuecomment-456284633
 
 
   Hi @dawidwys,
   I have gone through FLINK-7384, and I think the unified solution is cleaner. 
It might be better to check the timeCharacteristic in `onTimer`, because I think 
we only need to `advanceTime` under processing time and don't have to fetch from 
the `elementqueue`. Also, why wasn't that PR merged? Is something blocking it? 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Aitozi commented on issue #7543: [FLINK-10996]Fix the possible state leak with CEP processing time model

2019-01-21 Thread GitBox
Aitozi commented on issue #7543: [FLINK-10996]Fix the possible state leak with 
CEP processing time model
URL: https://github.com/apache/flink/pull/7543#issuecomment-456284633
 
 
   Hi @dawidwys,
   I have gone through FLINK-7384, and I think the unified solution is cleaner. 
It might be better to check the timeCharacteristic in `onTimer`, because I think 
we only need to `advanceTime` under processing time and don't have to fetch from 
the `elementqueue`. Also, why wasn't that PR merged? Is something blocking it? 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
alexeyt820 commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249637189
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java
 ##
 @@ -82,93 +84,110 @@
  */
 @Internal
 public abstract class FlinkKafkaConsumerBase<T> extends 
RichParallelSourceFunction<T> implements
-   CheckpointListener,
-   ResultTypeQueryable<T>,
-   CheckpointedFunction {
+   CheckpointListener,
 
 Review comment:
   Yes, sorry. It looks like I unintentionally reformatted it. Will revert.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-11403) Remove ResultPartitionConsumableNotifier from ResultPartition

2019-01-21 Thread zhijiang (JIRA)
zhijiang created FLINK-11403:


 Summary: Remove ResultPartitionConsumableNotifier from 
ResultPartition
 Key: FLINK-11403
 URL: https://issues.apache.org/jira/browse/FLINK-11403
 Project: Flink
  Issue Type: Sub-task
  Components: Network
Reporter: zhijiang
Assignee: zhijiang
 Fix For: 1.8.0


This is the precondition for introducing a pluggable {{ShuffleService}} on the TM 
side.

In the current process of creating a {{ResultPartition}}, the 
{{ResultPartitionConsumableNotifier}}, regarded as a TM-level component, has to be 
passed into the constructor. In order to create a {{ResultPartition}} easily from 
the {{ShuffleService}}, the required information should be covered by the 
{{ResultPartitionDeploymentDescriptor}} as much as possible, so we can remove 
this notifier from the constructor. It is also reasonable to notify about a 
consumable partition via {{TaskActions}}, which is already available in 
{{ResultPartition}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Myasuka opened a new pull request #7525: [FLINK-11363][test] Check and remove TaskManagerConfigurationTest

2019-01-21 Thread GitBox
Myasuka opened a new pull request #7525: [FLINK-11363][test] Check and remove 
TaskManagerConfigurationTest
URL: https://github.com/apache/flink/pull/7525
 
 
   
   ## What is the purpose of the change
   Check whether `TaskManagerConfigurationTest` contains any relevant tests for 
the new code base and then remove this test.
   
   
   ## Brief change log
   
   `TaskManagerConfigurationTest` contains tests for the `TaskManager` with 
configuration of host name, port and filesystem. These tests (host name and 
port) could be migrated to test `TaskManagerRunner`, for which I created a new 
test named `TaskManagerRunnerConfigurationTest`.
   
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
   `TaskManagerRunnerConfigurationTest` tests `TaskManagerRunner` with the 
configuration of host name, port and filesystem.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] Myasuka closed pull request #7525: [FLINK-11363][test] Check and remove TaskManagerConfigurationTest

2019-01-21 Thread GitBox
Myasuka closed pull request #7525: [FLINK-11363][test] Check and remove 
TaskManagerConfigurationTest
URL: https://github.com/apache/flink/pull/7525
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
alexeyt820 commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249637422
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-0.8/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/SimpleConsumerThread.java
 ##
 @@ -370,8 +370,8 @@ else if (partitionsRemoved) {

keyPayload.get(keyBytes);
}
 
-   final T value = 
deserializer.deserialize(keyBytes, valueBytes,
-   
currentPartition.getTopic(), currentPartition.getPartition(), offset);
+   final T value = 
deserializer.deserialize(
 
 Review comment:
   Thank you for the suggestion, I will try it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-11374) See more failover and can filter by time range

2019-01-21 Thread lining (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748360#comment-16748360
 ] 

lining commented on FLINK-11374:


List up to 100 detailed exceptions, viewable page by page.

!image-2019-01-22-11-40-53-135.png!

> See more failover and can filter by time range
> --
>
> Key: FLINK-11374
> URL: https://issues.apache.org/jira/browse/FLINK-11374
> Project: Flink
>  Issue Type: Improvement
>  Components: REST, Webfrontend
>Reporter: lining
>Assignee: lining
>Priority: Major
> Attachments: image-2019-01-22-11-40-53-135.png
>
>
> Currently, the failover view only shows a limited number of the most recent 
> task failures. If a task has failed many times, we cannot see the earlier 
> failovers. Can we add a time filter to see failovers, including each task 
> attempt's failure message?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-11374) See more failover and can filter by time range

2019-01-21 Thread lining (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748362#comment-16748362
 ] 

lining commented on FLINK-11374:


Hi, [~till.rohrmann]. I have added some details about this. What do you think 
about it?

> See more failover and can filter by time range
> --
>
> Key: FLINK-11374
> URL: https://issues.apache.org/jira/browse/FLINK-11374
> Project: Flink
>  Issue Type: Improvement
>  Components: REST, Webfrontend
>Reporter: lining
>Assignee: lining
>Priority: Major
> Attachments: image-2019-01-22-11-40-53-135.png, 
> image-2019-01-22-11-42-33-808.png
>
>
> Currently, the failover view only shows a limited number of the most recent 
> task failures. If a task has failed many times, we cannot see the earlier 
> failovers. Can we add a time filter to see failovers, including each task 
> attempt's failure message?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] leesf commented on issue #7536: [hotfix][docs] Remove redundant symbols

2019-01-21 Thread GitBox
leesf commented on issue #7536: [hotfix][docs] Remove redundant symbols
URL: https://github.com/apache/flink/pull/7536#issuecomment-456261374
 
 
   cc @GJL @zentol 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11374) See more failover and can filter by time range

2019-01-21 Thread lining (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lining updated FLINK-11374:
---
Attachment: image-2019-01-22-11-42-33-808.png

> See more failover and can filter by time range
> --
>
> Key: FLINK-11374
> URL: https://issues.apache.org/jira/browse/FLINK-11374
> Project: Flink
>  Issue Type: Improvement
>  Components: REST, Webfrontend
>Reporter: lining
>Assignee: lining
>Priority: Major
> Attachments: image-2019-01-22-11-40-53-135.png, 
> image-2019-01-22-11-42-33-808.png
>
>
> Currently, the failover view only shows a limited number of the most recent 
> task failures. If a task has failed many times, we cannot see the earlier 
> failovers. Can we add a time filter to see failovers, including each task 
> attempt's failure message?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (FLINK-11374) See more failover and can filter by time range

2019-01-21 Thread lining (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748361#comment-16748361
 ] 

lining commented on FLINK-11374:


Aggregated by vertex:

!image-2019-01-22-11-42-33-808.png!

> See more failover and can filter by time range
> --
>
> Key: FLINK-11374
> URL: https://issues.apache.org/jira/browse/FLINK-11374
> Project: Flink
>  Issue Type: Improvement
>  Components: REST, Webfrontend
>Reporter: lining
>Assignee: lining
>Priority: Major
> Attachments: image-2019-01-22-11-40-53-135.png
>
>
> Currently, the failover view only shows a limited number of the most recent 
> task failures. If a task has failed many times, we cannot see the earlier 
> failovers. Can we add a time filter to see failovers, including each task 
> attempt's failure message?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-11374) See more failover and can filter by time range

2019-01-21 Thread lining (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lining updated FLINK-11374:
---
Attachment: image-2019-01-22-11-40-53-135.png

> See more failover and can filter by time range
> --
>
> Key: FLINK-11374
> URL: https://issues.apache.org/jira/browse/FLINK-11374
> Project: Flink
>  Issue Type: Improvement
>  Components: REST, Webfrontend
>Reporter: lining
>Assignee: lining
>Priority: Major
> Attachments: image-2019-01-22-11-40-53-135.png
>
>
> Currently, the failover view only shows a limited number of the most recent 
> task failures. If a task has failed many times, we cannot see the earlier 
> failovers. Can we add a time filter to see failovers, including each task 
> attempt's failure message?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Clarkkkkk closed pull request #6628: [FLINK-10245] [Streaming Connector] Add Pojo, Tuple, Row and Scala Product DataStream Sink for HBase

2019-01-21 Thread GitBox
Clarkkkkk closed pull request #6628: [FLINK-10245] [Streaming Connector] Add 
Pojo, Tuple, Row and Scala Product DataStream Sink for HBase
URL: https://github.com/apache/flink/pull/6628
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-11380) YarnFlinkResourceManagerTest test case crashed

2019-01-21 Thread Yun Tang (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748350#comment-16748350
 ] 

Yun Tang commented on FLINK-11380:
--

Another instance https://api.travis-ci.org/v3/job/481937341/log.txt

> YarnFlinkResourceManagerTest test case crashed 
> ---
>
> Key: FLINK-11380
> URL: https://issues.apache.org/jira/browse/FLINK-11380
> Project: Flink
>  Issue Type: Test
>  Components: Tests
>Reporter: vinoyang
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.8.0
>
>
> context:
> {code:java}
> 17:18:44.415 [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.22.1:test (default-test) on 
> project flink-yarn_2.11: There are test failures.
> 17:18:44.415 [ERROR] 
> 17:18:44.415 [ERROR] Please refer to 
> /home/travis/build/apache/flink/flink-yarn/target/surefire-reports for the 
> individual test results.
> 17:18:44.415 [ERROR] Please refer to dump files (if any exist) [date].dump, 
> [date]-jvmRun[N].dump and [date].dumpstream.
> 17:18:44.415 [ERROR] ExecutionException The forked VM terminated without 
> properly saying goodbye. VM crash or System.exit called?
> 17:18:44.415 [ERROR] Command was /bin/sh -c cd 
> /home/travis/build/apache/flink/flink-yarn && 
> /usr/lib/jvm/java-8-oracle/jre/bin/java -Xms256m -Xmx2048m -Dmvn.forkNumber=2 
> -XX:+UseG1GC -jar 
> /home/travis/build/apache/flink/flink-yarn/target/surefire/surefirebooter3487840902331471745.jar
>  /home/travis/build/apache/flink/flink-yarn/target/surefire 
> 2019-01-16T17-02-23_939-jvmRun2 surefire3706271590182708448tmp 
> surefire_332496616764820906947tmp
> 17:18:44.416 [ERROR] Error occurred in starting fork, check output in log
> 17:18:44.416 [ERROR] Process Exit Code: 243
> 17:18:44.416 [ERROR] Crashed tests:
> 17:18:44.416 [ERROR] org.apache.flink.yarn.YarnFlinkResourceManagerTest
> 17:18:44.416 [ERROR] 
> org.apache.maven.surefire.booter.SurefireBooterForkException: 
> ExecutionException The forked VM terminated without properly saying goodbye. 
> VM crash or System.exit called?
> 17:18:44.416 [ERROR] Command was /bin/sh -c cd 
> /home/travis/build/apache/flink/flink-yarn && 
> /usr/lib/jvm/java-8-oracle/jre/bin/java -Xms256m -Xmx2048m -Dmvn.forkNumber=2 
> -XX:+UseG1GC -jar 
> /home/travis/build/apache/flink/flink-yarn/target/surefire/surefirebooter3487840902331471745.jar
>  /home/travis/build/apache/flink/flink-yarn/target/surefire 
> 2019-01-16T17-02-23_939-jvmRun2 surefire3706271590182708448tmp 
> surefire_332496616764820906947tmp
> 17:18:44.416 [ERROR] Error occurred in starting fork, check output in log
> 17:18:44.416 [ERROR] Process Exit Code: 243
> 17:18:44.416 [ERROR] Crashed tests:
> 17:18:44.416 [ERROR] org.apache.flink.yarn.YarnFlinkResourceManagerTest
> 17:18:44.416 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:510)
> 17:18:44.416 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkOnceMultiple(ForkStarter.java:382)
> 17:18:44.416 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:297)
> 17:18:44.416 [ERROR] at 
> org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:246)
> 17:18:44.416 [ERROR] at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1183)
> 17:18:44.416 [ERROR] at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1011)
> 17:18:44.416 [ERROR] at 
> org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:857)
> 17:18:44.416 [ERROR] at 
> org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132)
> 17:18:44.416 [ERROR] at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
> 17:18:44.416 [ERROR] at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
> 17:18:44.416 [ERROR] at 
> org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
> 17:18:44.416 [ERROR] at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
> 17:18:44.416 [ERROR] at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
> 17:18:44.416 [ERROR] at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
> 17:18:44.416 [ERROR] at 
> org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:120)
> 17:18:44.416 [ERROR] at 
> org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:355)
> 17:18:44.416 [ERROR] at 
> 

[jira] [Closed] (FLINK-6101) GroupBy fields with arithmetic expression (include UDF) can not be selected

2019-01-21 Thread lincoln.lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln.lee closed FLINK-6101.
--
Resolution: Later

> GroupBy fields with arithmetic expression (include UDF) can not be selected
> ---
>
> Key: FLINK-6101
> URL: https://issues.apache.org/jira/browse/FLINK-6101
> Project: Flink
>  Issue Type: Bug
>  Components: Table API  SQL
>Reporter: lincoln.lee
>Assignee: lincoln.lee
>Priority: Minor
>
> Currently the Table API does not support selecting groupBy fields that are 
> expressions, either by the original field name or by the expression itself:
> {code}
>  val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, 
> 'd, 'e)
> .groupBy('e, 'b % 3)
> .select('b, 'c.min, 'e, 'a.avg, 'd.count)
> {code}
> caused
> {code}
> org.apache.flink.table.api.ValidationException: Cannot resolve [b] given 
> input [e, ('b % 3), TMP_0, TMP_1, TMP_2].
> {code}
> (BTW, this syntax is invalid in an RDBMS; SQL Server, for instance, reports 
> that the selected column is invalid in the select list because it is contained 
> in neither an aggregate function nor the GROUP BY clause.)
> and 
> {code}
>  val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, 
> 'd, 'e)
> .groupBy('e, 'b % 3)
> .select('b%3, 'c.min, 'e, 'a.avg, 'd.count)
> {code}
> will also cause
> {code}
> org.apache.flink.table.api.ValidationException: Cannot resolve [b] given 
> input [e, ('b % 3), TMP_0, TMP_1, TMP_2].
> {code}
> Adding an alias in the groupBy clause ("group(e, 'b%3 as 'b)") is to no avail, 
> and applying a UDF doesn't work either:
> {code}
>table.groupBy('a, Mod('b, 3)).select('a, Mod('b, 3), 'c.count, 'c.count, 
> 'd.count, 'e.avg)
> org.apache.flink.table.api.ValidationException: Cannot resolve [b] given 
> input [a, org.apache.flink.table.api.scala.batch.table.Mod$('b, 3), TMP_0, 
> TMP_1, TMP_2].
> {code}
> the only way to make this work is 
> {code}
>  val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, 
> 'd, 'e)
> .select('a, 'b%3 as 'b, 'c, 'd, 'e)
> .groupBy('e, 'b)
> .select('b, 'c.min, 'e, 'a.avg, 'd.count)
> {code}
> One way to solve this is to support aliases in the groupBy clause (it seems a 
> bit odd compared with SQL, though the Table API has a different groupBy 
> grammar). I prefer to support selecting the original expressions and UDFs of 
> the groupBy clause (consistent with SQL), as follows:
> {code}
> // use expression
> val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, 
> 'd, 'e)
> .groupBy('e, 'b % 3)
> .select('b % 3, 'c.min, 'e, 'a.avg, 'd.count)
> // use UDF
> val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, 
> 'd, 'e)
> .groupBy('e, Mod('b,3))
> .select(Mod('b,3), 'c.min, 'e, 'a.avg, 'd.count)
> {code}

> After taking a look at the code, I found a problem in the groupBy 
> implementation: validation didn't consider the expressions in the groupBy 
> clause. It should be noted that a table is actually changed after a groupBy 
> operation (it is a new Table), and the groupBy keys replace the original field 
> references in essence.
>  
> What do you think?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (FLINK-11214) Support non mergable aggregates for group windows on batch table

2019-01-21 Thread Dian Fu (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dian Fu closed FLINK-11214.
---
Resolution: Won't Do

> Support non mergable aggregates for group windows on batch table
> 
>
> Key: FLINK-11214
> URL: https://issues.apache.org/jira/browse/FLINK-11214
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API  SQL
>Reporter: Dian Fu
>Assignee: Dian Fu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, non-mergeable aggregates are not supported for group windows on a 
> batch table. It would be nice to support them, as many code paths (but not 
> all) have already taken them into account.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] dianfu closed pull request #7354: [FLINK-11214] [table] Support non mergable aggregates for group windows on batch table

2019-01-21 Thread GitBox
dianfu closed pull request #7354: [FLINK-11214] [table] Support non mergable 
aggregates for group windows on batch table
URL: https://github.com/apache/flink/pull/7354
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] KarmaGYZ commented on a change in pull request #7260: [hotfix][docs] Fix typo and make improvement in Kafka Connectors doc

2019-01-21 Thread GitBox
KarmaGYZ commented on a change in pull request #7260: [hotfix][docs] Fix typo 
and make improvement in Kafka Connectors doc
URL: https://github.com/apache/flink/pull/7260#discussion_r249611548
 
 

 ##
 File path: docs/dev/connectors/kafka.md
 ##
 @@ -681,13 +681,13 @@ chosen by passing appropriate `semantic` parameter to 
the `FlinkKafkaProducer011
 
 `Semantic.EXACTLY_ONCE` mode relies on the ability to commit transactions
 that were started before taking a checkpoint, after recovering from the said 
checkpoint. If the time
-between Flink application crash and completed restart is larger then Kafka's 
transaction timeout
+between Flink application crash and completed restart is larger than Kafka's 
transaction timeout
 there will be data loss (Kafka will automatically abort transactions that 
exceeded timeout time).
 Having this in mind, please configure your transaction timeout appropriately 
to your expected down
 times.
 
 Kafka brokers by default have `transaction.max.timeout.ms` set to 15 minutes. 
This property will
-not allow to set transaction timeouts for the producers larger then it's value.
+not allow to set transaction timeouts for the producers larger than it's value.
 `FlinkKafkaProducer011` by default sets the `transaction.timeout.ms` property 
in producer config to
 
 Review comment:
   It sounds clearer and more fluent to me. Thanks!
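
   For readers of the quoted doc passage, a configuration example consistent with the Flink 1.7 Kafka docs (broker defaults as described above; the bootstrap address and topic are illustrative):

   ```java
   import java.util.Properties;

   import org.apache.flink.api.common.serialization.SimpleStringSchema;
   import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011;
   import org.apache.flink.streaming.util.serialization.KeyedSerializationSchemaWrapper;

   class ExactlyOnceProducerExample {
       static FlinkKafkaProducer011<String> createProducer() {
           Properties props = new Properties();
           props.setProperty("bootstrap.servers", "localhost:9092");
           // Must stay within the broker-side transaction.max.timeout.ms
           // (15 minutes by default) and should cover your expected downtime.
           props.setProperty("transaction.timeout.ms", String.valueOf(15 * 60 * 1000));

           return new FlinkKafkaProducer011<>(
               "my-topic",
               new KeyedSerializationSchemaWrapper<>(new SimpleStringSchema()),
               props,
               FlinkKafkaProducer011.Semantic.EXACTLY_ONCE);
       }
   }
   ```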


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11326) Using offsets to adjust windows to timezones UTC-8 throws IllegalArgumentException

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11326:
---
Labels: pull-request-available  (was: )

> Using offsets to adjust windows to timezones UTC-8 throws 
> IllegalArgumentException
> --
>
> Key: FLINK-11326
> URL: https://issues.apache.org/jira/browse/FLINK-11326
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 1.6.3, 1.7.1
>Reporter: TANG Wen-hui
>Assignee: Kezhu Wang
>Priority: Major
>  Labels: pull-request-available
>
> According to comments, we can use offset to adjust windows to timezones other 
> than UTC-0. For example, in China you would have to specify an offset of 
> {{Time.hours(-8)}}. 
>  
> {code:java}
> /**
>  * Creates a new {@code TumblingEventTimeWindows} {@link WindowAssigner} that 
> assigns
>  * elements to time windows based on the element timestamp and offset.
>  *
>  * For example, if you want window a stream by hour,but window begins at 
> the 15th minutes
>  * of each hour, you can use {@code of(Time.hours(1),Time.minutes(15))},then 
> you will get
>  * time windows start at 0:15:00,1:15:00,2:15:00,etc.
>  *
>  * Rather than that,if you are living in somewhere which is not using 
> UTC±00:00 time,
>  * such as China which is using UTC+08:00,and you want a time window with 
> size of one day,
>  * and window begins at every 00:00:00 of local time,you may use {@code 
> of(Time.days(1),Time.hours(-8))}.
>  * The parameter of offset is {@code Time.hours(-8))} since UTC+08:00 is 8 
> hours earlier than UTC time.
>  *
>  * @param size The size of the generated windows.
>  * @param offset The offset which window start would be shifted by.
>  * @return The time policy.
>  */
> public static TumblingEventTimeWindows of(Time size, Time offset) {
>  return new TumblingEventTimeWindows(size.toMilliseconds(), 
> offset.toMilliseconds());
> }{code}
>  
> but when the offset is smaller than zero, an IllegalArgumentException will be 
> thrown.
>  
> {code:java}
> protected TumblingEventTimeWindows(long size, long offset) {
>  if (offset < 0 || offset >= size) {
>  throw new IllegalArgumentException("TumblingEventTimeWindows parameters must 
> satisfy 0 <= offset < size");
>  }
>  this.size = size;
>  this.offset = offset;
> }{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] kezhuw opened a new pull request #7548: [FLINK-11326] Fix forbidden negative offset in window assigners

2019-01-21 Thread GitBox
kezhuw opened a new pull request #7548: [FLINK-11326] Fix forbidden negative 
offset in window assigners
URL: https://github.com/apache/flink/pull/7548
 
 
   ## What is the purpose of the change
   Allow negative window offset in window assignment as the javadoc says.
   
   ## Brief change log
   - Allow negative window offset in window assignment.
   - Throw `IllegalArgumentException` if offset is out of range for 
`SlidingEventTimeWindows.of` and `SlidingProcessingTimeWindows.of`. 
   
   ## Verifying this change
   
   This change is already covered by existing tests, and new test cases have been 
added to allow negative window offsets.
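
   A sketch of the resulting validation plus the window-start arithmetic that makes negative offsets work (the modulo formula matches `TimeWindow.getWindowStartWithOffset`; the constructor check is paraphrased, not the exact diff):

   ```java
   class WindowOffsetSketch {
       // Validation: negative offsets are allowed as long as |offset| < size.
       static void checkOffset(long size, long offset) {
           if (Math.abs(offset) >= size) {
               throw new IllegalArgumentException(
                   "Window parameters must satisfy abs(offset) < size");
           }
       }

       // Start of the window containing `timestamp`, shifted by `offset`;
       // e.g. size = 5, offset = -2, timestamp = 10 gives a window start of 8.
       static long getWindowStartWithOffset(long timestamp, long offset, long windowSize) {
           return timestamp - (timestamp - offset + windowSize) % windowSize;
       }
   }
   ```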
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   
   ## Breaking changes
 * The `@PublicEvolving` APIs `SlidingEventTimeWindows.of` and 
`SlidingProcessingTimeWindows.of` previously allowed offsets outside the window 
size; this merge request forbids that behavior. This way they behave the same as 
`TumblingEventTimeWindows.of` and `TumblingProcessingTimeWindows.of`, which I 
think makes [sliding 
windows](https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/operators/windows.html#sliding-windows) easier for callers to understand.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] KarmaGYZ commented on issue #7423: [hotfix][docs] Remove the legacy hint in production-ready doc

2019-01-21 Thread GitBox
KarmaGYZ commented on issue #7423: [hotfix][docs] Remove the legacy hint in 
production-ready doc
URL: https://github.com/apache/flink/pull/7423#issuecomment-456233322
 
 
   @alpinegizmo Thanks for the review.
   In the stale version, this sentence pointed to asynchronous snapshotting. 
However, I kept the sentence because there is an 
[explanation](https://ci.apache.org/projects/flink/flink-docs-master/ops/state/state_backends.html#the-rocksdbstatebackend)
 of the throughput limitations of the RocksDBStateBackend.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] stevenzwu commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
stevenzwu commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249607011
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/KeyedDeserializationSchema.java
 ##
 @@ -32,6 +34,49 @@
  */
 @PublicEvolving
 public interface KeyedDeserializationSchema<T> extends Serializable, ResultTypeQueryable<T> {
+   /**
+* Kafka record to be deserialized.
+* Record consists of key,value pair, topic name, partition offset, 
headers and a timestamp (if available)
+*/
+   interface Record {
 
 Review comment:
   We can add a new deserialize method to `KeyedDeserializationSchema` 
interface with a default implementation that just forwards to the other 
deserialize method (mark as deprecated).
   
   ```java
   default T deserialize(ConsumerRecord<byte[], byte[]> consumerRecord) throws IOException {
   	return deserialize(consumerRecord.key(), consumerRecord.value(),
   		consumerRecord.topic(), consumerRecord.partition(), consumerRecord.offset());
   }
   ```
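
   One note on the design: because `KeyedDeserializationSchema` is 
`@PublicEvolving`, a `default` method like this keeps existing implementations 
source- and binary-compatible, and the old five-argument `deserialize` can then 
be annotated `@Deprecated`.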


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
alexeyt820 commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249606898
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer011.java
 ##
 @@ -641,6 +641,9 @@ public void invoke(KafkaTransactionState transaction, IN 
next, Context context)
} else {
record = new ProducerRecord<>(targetTopic, null, 
timestamp, serializedKey, serializedValue);
}
+   for (Map.Entry<String, byte[]> header: schema.headers(next)) {
 
 Review comment:
   Pushed e0ca3e0


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
alexeyt820 commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249606803
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java
 ##
 @@ -251,6 +269,18 @@ public FlinkKafkaConsumerBase(
//  Configuration
// 

 
+   /**
+* Make sure that auto commit is disabled when our offset commit mode 
is ON_CHECKPOINTS.
+* This overwrites whatever setting the user configured in the 
properties.
+* @param properties - Kafka configuration properties to be adjusted
+* @param offsetCommitMode offset commit mode
+*/
+   protected static void adjustAutoCommitConfig(Properties properties, 
OffsetCommitMode offsetCommitMode) {
 
 Review comment:
   Pushed 3cfda16
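
   For readers following along, a hedged sketch of what such an adjustment can 
look like (assumed implementation; only the Javadoc appears in the hunk above):

   ```java
   import java.util.Properties;

   import org.apache.kafka.clients.consumer.ConsumerConfig;

   // Sketch: force-disable Kafka's auto commit whenever Flink manages offsets
   // itself, overriding whatever the user configured in the properties.
   protected static void adjustAutoCommitConfig(Properties properties, OffsetCommitMode offsetCommitMode) {
   	if (offsetCommitMode == OffsetCommitMode.ON_CHECKPOINTS || offsetCommitMode == OffsetCommitMode.DISABLED) {
   		properties.setProperty(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
   	}
   }
   ```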


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11326) Using offsets to adjust windows to timezones UTC-8 throws IllegalArgumentException

2019-01-21 Thread Kezhu Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kezhu Wang updated FLINK-11326:
---
Affects Version/s: 1.7.1

> Using offsets to adjust windows to timezones UTC-8 throws 
> IllegalArgumentException
> --
>
> Key: FLINK-11326
> URL: https://issues.apache.org/jira/browse/FLINK-11326
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 1.6.3, 1.7.1
>Reporter: TANG Wen-hui
>Assignee: Kezhu Wang
>Priority: Major
>
> According to the comments, we can use an offset to adjust windows to timezones 
> other than UTC±00:00. For example, in China you would have to specify an offset 
> of {{Time.hours(-8)}}. 
>  
> {code:java}
> /**
>  * Creates a new {@code TumblingEventTimeWindows} {@link WindowAssigner} that 
> assigns
>  * elements to time windows based on the element timestamp and offset.
>  *
>  * For example, if you want to window a stream by hour, but the window begins
>  * at the 15th minute of each hour, you can use {@code of(Time.hours(1), Time.minutes(15))};
>  * then you will get time windows starting at 0:15:00, 1:15:00, 2:15:00, etc.
>  *
>  * Rather than that, if you live somewhere that does not use UTC±00:00 time,
>  * such as China, which uses UTC+08:00, and you want a time window with a size
>  * of one day that begins at every 00:00:00 of local time, you may use
>  * {@code of(Time.days(1), Time.hours(-8))}.
>  * The parameter of offset is {@code Time.hours(-8)} since UTC+08:00 is 8
>  * hours ahead of UTC time.
>  *
>  * @param size The size of the generated windows.
>  * @param offset The offset which window start would be shifted by.
>  * @return The time policy.
>  */
> public static TumblingEventTimeWindows of(Time size, Time offset) {
>  return new TumblingEventTimeWindows(size.toMilliseconds(), 
> offset.toMilliseconds());
> }{code}
>  
> but when the offset is smaller than zero, an IllegalArgumentException is
> thrown.
>  
> {code:java}
> protected TumblingEventTimeWindows(long size, long offset) {
>  if (offset < 0 || offset >= size) {
>  throw new IllegalArgumentException("TumblingEventTimeWindows parameters must 
> satisfy 0 <= offset < size");
>  }
>  this.size = size;
>  this.offset = offset;
> }{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] stevenzwu commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
stevenzwu commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249606472
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/KeyedDeserializationSchema.java
 ##
 @@ -32,6 +34,49 @@
  */
 @PublicEvolving
 public interface KeyedDeserializationSchema<T> extends Serializable, ResultTypeQueryable<T> {
+   /**
+* Kafka record to be deserialized.
+* Record consists of key,value pair, topic name, partition offset, 
headers and a timestamp (if available)
+*/
+   interface Record {
 
 Review comment:
   @alexeyt820 as pointed out by @azagrebin, Kafka 0.8 seems to have a 
`ConsumerRecord` class:
   
https://archive.apache.org/dist/kafka/0.8.2-beta/java-doc/org/apache/kafka/clients/consumer/ConsumerRecord.html
   
   As Flink is moving in the direction of just one modern Kafka connector, I am 
also wondering whether it is OK to drop the Kafka 0.8 connector?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders

2019-01-21 Thread Ufuk Celebi (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi updated FLINK-11402:

Component/s: Local Runtime

> User code can fail with an UnsatisfiedLinkError in the presence of multiple 
> classloaders
> 
>
> Key: FLINK-11402
> URL: https://issues.apache.org/jira/browse/FLINK-11402
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination, Local Runtime
>Affects Versions: 1.7.0
>Reporter: Ufuk Celebi
>Priority: Major
> Attachments: hello-snappy-1.0-SNAPSHOT.jar, hello-snappy.tgz
>
>
> As reported on the user mailing list thread "[`env.java.opts` not persisting 
> after job canceled or failed and then 
> restarted|https://lists.apache.org/thread.html/37cc1b628e16ca6c0bacced5e825de8057f88a8d601b90a355b6a291@%3Cuser.flink.apache.org%3E];,
>  there can be issues with using native libraries and user code class loading.
> h2. Steps to reproduce
> I was able to reproduce the issue reported on the mailing list using 
> [snappy-java|https://github.com/xerial/snappy-java] in a user program. 
> Running the attached user program works fine on initial submission, but 
> results in a failure when re-executed.
> I'm using Flink 1.7.0 using a standalone cluster started via 
> {{bin/start-cluster.sh}}.
> 0. Unpack attached Maven project and build using {{mvn clean package}} *or* 
> directly use attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> 1. Download 
> [snappy-java-1.1.7.2.jar|http://central.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.7.2/snappy-java-1.1.7.2.jar]
>  and unpack libsnappyjava for your system:
> {code}
> jar tf snappy-java-1.1.7.2.jar | grep libsnappy
> ...
> org/xerial/snappy/native/Linux/x86_64/libsnappyjava.so
> ...
> org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib
> ...
> {code}
> 2. Configure system library path to {{libsnappyjava}} in {{flink-conf.yaml}} 
> (path needs to be adjusted for your system):
> {code}
> env.java.opts: -Djava.library.path=/.../org/xerial/snappy/native/Mac/x86_64
> {code}
> 3. Run attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
> Program execution finished
> Job with JobID ae815b918dd7bc64ac8959e4e224f2b4 has finished.
> Job Runtime: 359 ms
> {code}
> 4. Rerun attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
> 
>  The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: Job failed. 
> (JobID: 7d69baca58f33180cb9251449ddcd396)
>   at 
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
>   at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:487)
>   at 
> org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
>   at com.github.uce.HelloSnappy.main(HelloSnappy.java:18)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
>   at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
>   at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
>   at 
> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
>   at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
>   at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>   at 
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
>   at 
> org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
>   at 
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>   at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed.
>   at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
>   at 
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265)
>   ... 17 more
> Caused by: java.lang.UnsatisfiedLinkError: Native Library 
> /.../org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib already loaded 
> in another classloader
>   at 

[jira] [Commented] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders

2019-01-21 Thread Ufuk Celebi (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748241#comment-16748241
 ] 

Ufuk Celebi commented on FLINK-11402:
-

Similar issue for RocksDB (FLINK-5408)

> User code can fail with an UnsatisfiedLinkError in the presence of multiple 
> classloaders
> 
>
> Key: FLINK-11402
> URL: https://issues.apache.org/jira/browse/FLINK-11402
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.7.0
>Reporter: Ufuk Celebi
>Priority: Major
> Attachments: hello-snappy-1.0-SNAPSHOT.jar, hello-snappy.tgz
>
>
> As reported on the user mailing list thread "[`env.java.opts` not persisting 
> after job canceled or failed and then 
> restarted|https://lists.apache.org/thread.html/37cc1b628e16ca6c0bacced5e825de8057f88a8d601b90a355b6a291@%3Cuser.flink.apache.org%3E];,
>  there can be issues with using native libraries and user code class loading.
> h2. Steps to reproduce
> I was able to reproduce the issue reported on the mailing list using 
> [snappy-java|https://github.com/xerial/snappy-java] in a user program. 
> Running the attached user program works fine on initial submission, but 
> results in a failure when re-executed.
> I'm using Flink 1.7.0 using a standalone cluster started via 
> {{bin/start-cluster.sh}}.
> 0. Unpack attached Maven project and build using {{mvn clean package}} *or* 
> directly use attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> 1. Download 
> [snappy-java-1.1.7.2.jar|http://central.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.7.2/snappy-java-1.1.7.2.jar]
>  and unpack libsnappyjava for your system:
> {code}
> jar tf snappy-java-1.1.7.2.jar | grep libsnappy
> ...
> org/xerial/snappy/native/Linux/x86_64/libsnappyjava.so
> ...
> org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib
> ...
> {code}
> 2. Configure system library path to {{libsnappyjava}} in {{flink-conf.yaml}} 
> (path needs to be adjusted for your system):
> {code}
> env.java.opts: -Djava.library.path=/.../org/xerial/snappy/native/Mac/x86_64
> {code}
> 3. Run attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
> Program execution finished
> Job with JobID ae815b918dd7bc64ac8959e4e224f2b4 has finished.
> Job Runtime: 359 ms
> {code}
> 4. Rerun attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
> 
>  The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: Job failed. 
> (JobID: 7d69baca58f33180cb9251449ddcd396)
>   at 
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
>   at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:487)
>   at 
> org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
>   at com.github.uce.HelloSnappy.main(HelloSnappy.java:18)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
>   at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
>   at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
>   at 
> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
>   at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
>   at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>   at 
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
>   at 
> org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
>   at 
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>   at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed.
>   at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
>   at 
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265)
>   ... 17 more
> Caused by: java.lang.UnsatisfiedLinkError: Native Library 
> /.../org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib already loaded 
> in another classloader
>   at 

[jira] [Updated] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders

2019-01-21 Thread Ufuk Celebi (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi updated FLINK-11402:

Attachment: hello-snappy.tgz

> User code can fail with an UnsatisfiedLinkError in the presence of multiple 
> classloaders
> 
>
> Key: FLINK-11402
> URL: https://issues.apache.org/jira/browse/FLINK-11402
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.7.0
>Reporter: Ufuk Celebi
>Priority: Major
> Attachments: hello-snappy-1.0-SNAPSHOT.jar, hello-snappy.tgz
>
>
> As reported on the user mailing list thread "[`env.java.opts` not persisting 
> after job canceled or failed and then 
> restarted|https://lists.apache.org/thread.html/37cc1b628e16ca6c0bacced5e825de8057f88a8d601b90a355b6a291@%3Cuser.flink.apache.org%3E];,
>  there can be issues with using native libraries and user code class loading.
> h2. Steps to reproduce
> I was able to reproduce the issue reported on the mailing list using 
> [snappy-java|https://github.com/xerial/snappy-java] in a user program. 
> Running the attached user program works fine on initial submission, but 
> results in a failure when re-executed.
> I'm using Flink 1.7.0 using a standalone cluster started via 
> {{bin/start-cluster.sh}}.
> 0. Unpack attached Maven project and build using {{mvn clean package}} *or* 
> directly use attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> 1. Download 
> [snappy-java-1.1.7.2.jar|http://central.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.7.2/snappy-java-1.1.7.2.jar]
>  and unpack libsnappyjava for your system:
> {code}
> jar tf snappy-java-1.1.7.2.jar | grep libsnappy
> ...
> org/xerial/snappy/native/Linux/x86_64/libsnappyjava.so
> ...
> org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib
> ...
> {code}
> 2. Configure system library path to {{libsnappyjava}} in {{flink-conf.yaml}} 
> (path needs to be adjusted for your system):
> {code}
> env.java.opts: -Djava.library.path=/.../org/xerial/snappy/native/Mac/x86_64
> {code}
> 3. Run attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
> Program execution finished
> Job with JobID ae815b918dd7bc64ac8959e4e224f2b4 has finished.
> Job Runtime: 359 ms
> {code}
> 4. Rerun attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
> 
>  The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: Job failed. 
> (JobID: 7d69baca58f33180cb9251449ddcd396)
>   at 
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
>   at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:487)
>   at 
> org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
>   at com.github.uce.HelloSnappy.main(HelloSnappy.java:18)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
>   at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
>   at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
>   at 
> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
>   at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
>   at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>   at 
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
>   at 
> org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
>   at 
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>   at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed.
>   at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
>   at 
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265)
>   ... 17 more
> Caused by: java.lang.UnsatisfiedLinkError: Native Library 
> /.../org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib already loaded 
> in another classloader
>   at 

[jira] [Updated] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders

2019-01-21 Thread Ufuk Celebi (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ufuk Celebi updated FLINK-11402:

Attachment: hello-snappy-1.0-SNAPSHOT.jar

> User code can fail with an UnsatisfiedLinkError in the presence of multiple 
> classloaders
> 
>
> Key: FLINK-11402
> URL: https://issues.apache.org/jira/browse/FLINK-11402
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.7.0
>Reporter: Ufuk Celebi
>Priority: Major
> Attachments: hello-snappy-1.0-SNAPSHOT.jar, hello-snappy.tgz
>
>
> As reported on the user mailing list thread "[`env.java.opts` not persisting 
> after job canceled or failed and then 
> restarted|https://lists.apache.org/thread.html/37cc1b628e16ca6c0bacced5e825de8057f88a8d601b90a355b6a291@%3Cuser.flink.apache.org%3E];,
>  there can be issues with using native libraries and user code class loading.
> h2. Steps to reproduce
> I was able to reproduce the issue reported on the mailing list using 
> [snappy-java|https://github.com/xerial/snappy-java] in a user program. 
> Running the attached user program works fine on initial submission, but 
> results in a failure when re-executed.
> I'm using Flink 1.7.0 using a standalone cluster started via 
> {{bin/start-cluster.sh}}.
> 0. Unpack attached Maven project and build using {{mvn clean package}} *or* 
> directly use attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> 1. Download 
> [snappy-java-1.1.7.2.jar|http://central.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.7.2/snappy-java-1.1.7.2.jar]
>  and unpack libsnappyjava for your system:
> {code}
> jar tf snappy-java-1.1.7.2.jar | grep libsnappy
> ...
> org/xerial/snappy/native/Linux/x86_64/libsnappyjava.so
> ...
> org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib
> ...
> {code}
> 2. Configure system library path to {{libsnappyjava}} in {{flink-conf.yaml}} 
> (path needs to be adjusted for your system):
> {code}
> env.java.opts: -Djava.library.path=/.../org/xerial/snappy/native/Mac/x86_64
> {code}
> 3. Run attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
> Program execution finished
> Job with JobID ae815b918dd7bc64ac8959e4e224f2b4 has finished.
> Job Runtime: 359 ms
> {code}
> 4. Rerun attached {{hello-snappy-1.0-SNAPSHOT.jar}}
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
> 
>  The program finished with the following exception:
> org.apache.flink.client.program.ProgramInvocationException: Job failed. 
> (JobID: 7d69baca58f33180cb9251449ddcd396)
>   at 
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
>   at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:487)
>   at 
> org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
>   at com.github.uce.HelloSnappy.main(HelloSnappy.java:18)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
>   at 
> org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
>   at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
>   at 
> org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
>   at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
>   at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
>   at 
> org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
>   at 
> org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
>   at 
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>   at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed.
>   at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
>   at 
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265)
>   ... 17 more
> Caused by: java.lang.UnsatisfiedLinkError: Native Library 
> /.../org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib already loaded 
> in another classloader
>   at 

[jira] [Created] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders

2019-01-21 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-11402:
---

 Summary: User code can fail with an UnsatisfiedLinkError in the 
presence of multiple classloaders
 Key: FLINK-11402
 URL: https://issues.apache.org/jira/browse/FLINK-11402
 Project: Flink
  Issue Type: Bug
  Components: Distributed Coordination
Affects Versions: 1.7.0
Reporter: Ufuk Celebi
 Attachments: hello-snappy-1.0-SNAPSHOT.jar, hello-snappy.tgz

As reported on the user mailing list thread "[`env.java.opts` not persisting 
after job canceled or failed and then 
restarted|https://lists.apache.org/thread.html/37cc1b628e16ca6c0bacced5e825de8057f88a8d601b90a355b6a291@%3Cuser.flink.apache.org%3E];,
 there can be issues with using native libraries and user code class loading.

h2. Steps to reproduce

I was able to reproduce the issue reported on the mailing list using 
[snappy-java|https://github.com/xerial/snappy-java] in a user program. Running 
the attached user program works fine on initial submission, but results in a 
failure when re-executed.

I'm using Flink 1.7.0 using a standalone cluster started via 
{{bin/start-cluster.sh}}.

0. Unpack attached Maven project and build using {{mvn clean package}} *or* 
directly use attached {{hello-snappy-1.0-SNAPSHOT.jar}}
1. Download 
[snappy-java-1.1.7.2.jar|http://central.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.7.2/snappy-java-1.1.7.2.jar]
 and unpack libsnappyjava for your system:
{code}
jar tf snappy-java-1.1.7.2.jar | grep libsnappy
...
org/xerial/snappy/native/Linux/x86_64/libsnappyjava.so
...
org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib
...
{code}
2. Configure system library path to {{libsnappyjava}} in {{flink-conf.yaml}} 
(path needs to be adjusted for your system):
{code}
env.java.opts: -Djava.library.path=/.../org/xerial/snappy/native/Mac/x86_64
{code}
3. Run attached {{hello-snappy-1.0-SNAPSHOT.jar}}
{code}
bin/flink run hello-snappy-1.0-SNAPSHOT.jar
Starting execution of program
Program execution finished
Job with JobID ae815b918dd7bc64ac8959e4e224f2b4 has finished.
Job Runtime: 359 ms
{code}
4. Rerun attached {{hello-snappy-1.0-SNAPSHOT.jar}}
{code}
bin/flink run hello-snappy-1.0-SNAPSHOT.jar
Starting execution of program


 The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: Job failed. (JobID: 
7d69baca58f33180cb9251449ddcd396)
  at 
org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
  at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:487)
  at 
org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
  at com.github.uce.HelloSnappy.main(HelloSnappy.java:18)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
  at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
  at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
  at 
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
  at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
  at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
  at 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
  at 
org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
  at 
org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
  at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution 
failed.
  at 
org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
  at 
org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265)
  ... 17 more
Caused by: java.lang.UnsatisfiedLinkError: Native Library 
/.../org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib already loaded in 
another classloader
  at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1907)
  at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1861)
  at java.lang.Runtime.loadLibrary0(Runtime.java:870)
  at java.lang.System.loadLibrary(System.java:1122)
  at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:182)
  at org.xerial.snappy.SnappyLoader.loadSnappyApi(SnappyLoader.java:154)
  at org.xerial.snappy.Snappy.<clinit>(Snappy.java:47)
  at 
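
A minimal standalone sketch of the underlying JVM rule (illustrative demo, not
part of the report): a native library can be bound to at most one classloader
per JVM. Run it with the same {{-Djava.library.path}} setting as in step 2, so
that snappy-java calls {{System.loadLibrary}} instead of extracting a fresh
temporary copy per classloader:

{code:java}
import java.net.URL;
import java.net.URLClassLoader;

public class NativeLoadDemo {
	public static void main(String[] args) throws Exception {
		URL[] jar = {new URL("file:snappy-java-1.1.7.2.jar")};
		// First "job" classloader: Snappy's static initializer calls
		// System.loadLibrary and succeeds.
		Class.forName("org.xerial.snappy.Snappy", true, new URLClassLoader(jar, null));
		// Second "job" classloader (think: job restart): the library is still
		// bound to the first classloader, so this throws UnsatisfiedLinkError.
		Class.forName("org.xerial.snappy.Snappy", true, new URLClassLoader(jar, null));
	}
}
{code}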

[jira] [Updated] (FLINK-11401) Allow compression on ParquetBulkWriter

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-11401:
---
Labels: pull-request-available  (was: )

> Allow compression on ParquetBulkWriter
> --
>
> Key: FLINK-11401
> URL: https://issues.apache.org/jira/browse/FLINK-11401
> Project: Flink
>  Issue Type: Improvement
>Affects Versions: 1.7.1
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.8.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] Fokko opened a new pull request #7547: [FLINK-11401] Allow setting of compression on ParquetWriter

2019-01-21 Thread GitBox
Fokko opened a new pull request #7547: [FLINK-11401] Allow setting of 
compression on ParquetWriter
URL: https://github.com/apache/flink/pull/7547
 
 
   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluser with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-11401) Allow compression on ParquetBulkWriter

2019-01-21 Thread Fokko Driesprong (JIRA)
Fokko Driesprong created FLINK-11401:


 Summary: Allow compression on ParquetBulkWriter
 Key: FLINK-11401
 URL: https://issues.apache.org/jira/browse/FLINK-11401
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.7.1
Reporter: Fokko Driesprong
Assignee: Fokko Driesprong
 Fix For: 1.8.0
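
For orientation, a hedged sketch of what enabling compression typically looks
like with the parquet-avro builder API (illustrative; not necessarily the API
this change introduces; {{path}} and {{schema}} are assumed to come from the
caller):

{code:java}
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;

// Sketch: a bulk writer that allows compression would thread a
// CompressionCodecName through to a builder like this one.
ParquetWriter<GenericRecord> writer = AvroParquetWriter
	.<GenericRecord>builder(path)
	.withSchema(schema)
	.withCompressionCodec(CompressionCodecName.SNAPPY)
	.build();
{code}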






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] bowenli86 commented on issue #7011: [FLINK-10768][Table & SQL] Move external catalog related code from TableEnvironment to CatalogManager

2019-01-21 Thread GitBox
bowenli86 commented on issue #7011: [FLINK-10768][Table & SQL] Move external 
catalog related code from TableEnvironment to CatalogManager
URL: https://github.com/apache/flink/pull/7011#issuecomment-456154882
 
 
   closing this PR in favor of the new plan


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bowenli86 commented on issue #6997: [FLINK-10697][Table & SQL] Create an in-memory catalog that stores Flink's meta objects

2019-01-21 Thread GitBox
bowenli86 commented on issue #6997: [FLINK-10697][Table & SQL] Create an 
in-memory catalog that stores Flink's meta objects
URL: https://github.com/apache/flink/pull/6997#issuecomment-456154907
 
 
   closing this PR in favor of the new plan


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bowenli86 commented on issue #7012: [FLINK-10769][Table & SQL] port InMemoryExternalCatalog to java

2019-01-21 Thread GitBox
bowenli86 commented on issue #7012: [FLINK-10769][Table & SQL] port 
InMemoryExternalCatalog to java
URL: https://github.com/apache/flink/pull/7012#issuecomment-456154851
 
 
   closing this PR in favor of the new plan


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bowenli86 closed pull request #7011: [FLINK-10768][Table & SQL] Move external catalog related code from TableEnvironment to CatalogManager

2019-01-21 Thread GitBox
bowenli86 closed pull request #7011: [FLINK-10768][Table & SQL] Move external 
catalog related code from TableEnvironment to CatalogManager
URL: https://github.com/apache/flink/pull/7011
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bowenli86 closed pull request #6970: [FLINK-10696][Table API & SQL]Add APIs to ExternalCatalog, CrudExternalCatalog and InMemoryCrudExternalCatalog for views and UDFs

2019-01-21 Thread GitBox
bowenli86 closed pull request #6970: [FLINK-10696][Table API & SQL]Add APIs to 
ExternalCatalog, CrudExternalCatalog and InMemoryCrudExternalCatalog for views 
and UDFs
URL: https://github.com/apache/flink/pull/6970
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] bowenli86 closed pull request #7012: [FLINK-10769][Table & SQL] port InMemoryExternalCatalog to java

2019-01-21 Thread GitBox
bowenli86 closed pull request #7012: [FLINK-10769][Table & SQL] port 
InMemoryExternalCatalog to java
URL: https://github.com/apache/flink/pull/7012
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] GJL opened a new pull request #7546: [FLINK-11390][tests] Port YARNSessionCapacitySchedulerITCase#testTaskManagerFailure to new code base

2019-01-21 Thread GitBox
GJL opened a new pull request #7546: [FLINK-11390][tests] Port 
YARNSessionCapacitySchedulerITCase#testTaskManagerFailure to new code base
URL: https://github.com/apache/flink/pull/7546
 
 
   ## What is the purpose of the change
   
   *Port `YARNSessionCapacitySchedulerITCase#testTaskManagerFailure` to the new 
code base.*
   
   cc: @tillrohrmann 
   
   ## Brief change log
 - *Extract HA test out of `testTaskManagerFailure`*
 - *Rename test `testTaskManagerFailure` so that the name reflects what is 
actually asserted*
   
   ## Verifying this change
   
   This change is already covered by existing tests, such as *itself*.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-11400) JobManagerRunner does not wait for suspension of JobMaster

2019-01-21 Thread TisonKun (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748078#comment-16748078
 ] 

TisonKun commented on FLINK-11400:
--

Do you mean introducing a {{recoveryOperation}} in {{JobManagerRunner}}?

> JobManagerRunner does not wait for suspension of JobMaster
> --
>
> Key: FLINK-11400
> URL: https://issues.apache.org/jira/browse/FLINK-11400
> Project: Flink
>  Issue Type: Bug
>  Components: Distributed Coordination
>Affects Versions: 1.6.3, 1.7.1, 1.8.0
>Reporter: Till Rohrmann
>Assignee: Till Rohrmann
>Priority: Major
> Fix For: 1.8.0
>
>
> The {{JobManagerRunner}} does not wait for the suspension of the 
> {{JobMaster}} to finish before granting leadership again. This can lead to a 
> state where the {{JobMaster}} tries to start the {{ExecutionGraph}} but the 
> {{SlotPool}} is still stopped.
> I suggest linearizing the leadership operations (granting and revoking 
> leadership) similarly to the {{Dispatcher}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
azagrebin commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249512219
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-0.8/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/SimpleConsumerThread.java
 ##
 @@ -370,8 +370,8 @@ else if (partitionsRemoved) {

keyPayload.get(keyBytes);
}
 
-			final T value = deserializer.deserialize(keyBytes, valueBytes,
-				currentPartition.getTopic(), currentPartition.getPartition(), offset);
+			final T value = deserializer.deserialize(
 
 Review comment:
   Here we could try:
   ```
   ConsumerRecord<byte[], byte[]> consumerRecord = new ConsumerRecord<>(
  	currentPartition.getTopic(), currentPartition.getPartition(),
  	offset, keyBytes, valueBytes);
   final T value = deserializer.deserialize(consumerRecord);
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
azagrebin commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249368384
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/internal/Kafka011ConsumerRecord.java
 ##
 @@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.kafka.internal;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Function;
+import org.apache.flink.shaded.guava18.com.google.common.collect.Iterables;
+
+import org.apache.kafka.clients.consumer.ConsumerRecord;
+import org.apache.kafka.common.header.Header;
+
+import javax.annotation.Nonnull;
+import javax.annotation.Nullable;
+
+import java.util.AbstractMap;
+import java.util.Map;
+
+/**
+ * Extends base Kafka09ConsumerRecord to provide access to Kafka headers.
 
 Review comment:
   Could you put this into `{@link Kafka09ConsumerRecord}`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
azagrebin commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249478973
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/KeyedDeserializationSchema.java
 ##
 @@ -32,6 +34,49 @@
  */
 @PublicEvolving
 public interface KeyedDeserializationSchema<T> extends Serializable, ResultTypeQueryable<T> {
+   /**
+* Kafka record to be deserialized.
+* Record consists of key,value pair, topic name, partition offset, 
headers and a timestamp (if available)
+*/
+   interface Record {
 
 Review comment:
   kafka-clients 0.8 actually has `ConsumerRecord`; `SimpleConsumerThread` just 
does not use it, but it seems possible to manually wrap `MessageAndOffset` into 
that `ConsumerRecord`. I would give it a try. It seems to be the simpler option 
at the moment and would eliminate the inheritance currently introduced for the 
sake of wrapping.
   
   Not sure, though, how big the risk is that the Kafka API changes again. The 
Flink wrapping, as it is now, seems to be the safer option, but it also adds 
some performance overhead.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
azagrebin commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249510977
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java
 ##
 @@ -82,93 +84,110 @@
  */
 @Internal
 public abstract class FlinkKafkaConsumerBase<T> extends RichParallelSourceFunction<T> implements
-   CheckpointListener,
-   ResultTypeQueryable<T>,
-   CheckpointedFunction {
+   CheckpointListener,
 
 Review comment:
   This class contains a lot of unrelated changes, which makes it more 
difficult to review. I would suggest either a follow-up PR for them or at 
least putting them into a separate commit.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
azagrebin commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249482757
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/internal/Kafka011ConsumerRecord.java
 ##
 @@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.kafka.internal;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Function;
+import org.apache.flink.shaded.guava18.com.google.common.collect.Iterables;
+
+import org.apache.kafka.clients.consumer.ConsumerRecord;
+import org.apache.kafka.common.header.Header;
+
+import javax.annotation.Nonnull;
+import javax.annotation.Nullable;
+
+import java.util.AbstractMap;
+import java.util.Map;
+
+/**
+ * Extends base Kafka09ConsumerRecord to provide access to Kafka headers.
+ */
+class Kafka011ConsumerRecord extends Kafka09ConsumerRecord {
+   /**
+* Wraps {@link Header} as Map.Entry.
+*/
+   private static final Function<Header, Map.Entry<String, byte[]>> HEADER_TO_MAP_ENTRY_FUNCTION =
+   new Function<Header, Map.Entry<String, byte[]>>() {
+   @Nonnull
+   @Override
+   public Map.Entry<String, byte[]> apply(@Nullable Header header) {
+   return new AbstractMap.SimpleImmutableEntry<>(header.key(), header.value());
+   }
+   };
+
+   Kafka011ConsumerRecord(ConsumerRecord<byte[], byte[]> consumerRecord) {
+   super(consumerRecord);
+   }
+
+   @Override
+   public Iterable<Map.Entry<String, byte[]>> headers() {
+   return Iterables.transform(consumerRecord.headers(), HEADER_TO_MAP_ENTRY_FUNCTION);
 
 Review comment:
   Could we avoid relying on non-standard Java libraries like Guava? The 
record/header processing is also on the time-critical path for per-record 
latency. I would suggest implementing our own Iterable wrapper that does only 
this header wrapping (a sketch follows below).
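
   A minimal Guava-free sketch of such a wrapper (the name `wrapHeaders` is 
illustrative):

   ```java
   import java.util.AbstractMap;
   import java.util.Iterator;
   import java.util.Map;

   import org.apache.kafka.common.header.Header;
   import org.apache.kafka.common.header.Headers;

   // Lazily wraps each Kafka Header as a Map.Entry while iterating, with no
   // Guava dependency and no upfront copying of the headers.
   static Iterable<Map.Entry<String, byte[]>> wrapHeaders(final Headers headers) {
   	return () -> {
   		final Iterator<Header> it = headers.iterator();
   		return new Iterator<Map.Entry<String, byte[]>>() {
   			@Override
   			public boolean hasNext() {
   				return it.hasNext();
   			}

   			@Override
   			public Map.Entry<String, byte[]> next() {
   				Header h = it.next();
   				return new AbstractMap.SimpleImmutableEntry<>(h.key(), h.value());
   			}
   		};
   	};
   }
   ```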


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-11400) JobManagerRunner does not wait for suspension of JobMaster

2019-01-21 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-11400:
-

 Summary: JobManagerRunner does not wait for suspension of JobMaster
 Key: FLINK-11400
 URL: https://issues.apache.org/jira/browse/FLINK-11400
 Project: Flink
  Issue Type: Bug
  Components: Distributed Coordination
Affects Versions: 1.7.1, 1.6.3, 1.8.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.8.0


The {{JobManagerRunner}} does not wait for the suspension of the {{JobMaster}} 
to finish before granting leadership again. This can lead to a state where the 
{{JobMaster}} tries to start the {{ExecutionGraph}} but the {{SlotPool}} is 
still stopped.

I suggest linearizing the leadership operations (granting and revoking 
leadership) similarly to the {{Dispatcher}}; a sketch follows below.
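
A hedged sketch of what such linearization can look like ({{lock}},
{{startJobMaster}}, and {{suspendJobMaster}} are illustrative names, not the
actual fix):

{code:java}
import java.util.UUID;
import java.util.concurrent.CompletableFuture;

// All leadership operations are chained onto a single future, so granting
// leadership cannot start before the previous suspension has completed.
private CompletableFuture<Void> leadershipOperation = CompletableFuture.completedFuture(null);

public void grantLeadership(UUID leaderSessionId) {
	synchronized (lock) {
		leadershipOperation = leadershipOperation.thenCompose(
			ignored -> startJobMaster(leaderSessionId));
	}
}

public void revokeLeadership() {
	synchronized (lock) {
		leadershipOperation = leadershipOperation.thenCompose(
			ignored -> suspendJobMaster());
	}
}
{code}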



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] StefanRRichter commented on issue #7009: [FLINK-10712] Support to restore state when using RestartPipelinedRegionStrategy

2019-01-21 Thread GitBox
StefanRRichter commented on issue #7009: [FLINK-10712] Support to restore state 
when using RestartPipelinedRegionStrategy
URL: https://github.com/apache/flink/pull/7009#issuecomment-456105318
 
 
   I think I agree with the assessment of the existing operators.
   
   About adding a `RecoveryMode` to consider: would that mean we would prevent 
all jobs that use union state from working with partial recovery? I think if 
we just consider a few popular operators like `KafkaConsumer`, that would 
already prevent a lot of jobs from using different recovery modes.
   
   I can see that this comes from the concern about existing code that uses 
union state. However, strictly speaking it should not be a regression, because 
those recovery modes previously did not support state recovery at all. We also 
cannot prevent users from making wrong implementations, so I feel a good thing 
to do is to document what to take care of for union state when using such 
recovery modes (see the short illustration below).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers

2019-01-21 Thread GitBox
azagrebin commented on a change in pull request #6615: [FLINK-8354] 
[flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249367559
 
 

 ##
 File path: 
flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer011.java
 ##
 @@ -641,6 +641,9 @@ public void invoke(KafkaTransactionState transaction, IN 
next, Context context)
} else {
record = new ProducerRecord<>(targetTopic, null, 
timestamp, serializedKey, serializedValue);
}
+   for (Map.Entry<String, byte[]> header: schema.headers(next)) {
 
 Review comment:
   space before `:`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Assigned] (FLINK-11326) Using offsets to adjust windows to timezones UTC-8 throws IllegalArgumentException

2019-01-21 Thread Kezhu Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kezhu Wang reassigned FLINK-11326:
--

Assignee: Kezhu Wang

> Using offsets to adjust windows to timezones UTC-8 throws 
> IllegalArgumentException
> --
>
> Key: FLINK-11326
> URL: https://issues.apache.org/jira/browse/FLINK-11326
> Project: Flink
>  Issue Type: Bug
>  Components: Streaming
>Affects Versions: 1.6.3
>Reporter: TANG Wen-hui
>Assignee: Kezhu Wang
>Priority: Major
>
> According to comments, we can use offset to adjust windows to timezones other 
> than UTC-0. For example, in China you would have to specify an offset of 
> {{Time.hours(-8)}}. 
>  
> {code:java}
> /**
>  * Creates a new {@code TumblingEventTimeWindows} {@link WindowAssigner} that 
> assigns
>  * elements to time windows based on the element timestamp and offset.
>  *
>  * For example, if you want window a stream by hour,but window begins at 
> the 15th minutes
>  * of each hour, you can use {@code of(Time.hours(1),Time.minutes(15))},then 
> you will get
>  * time windows start at 0:15:00,1:15:00,2:15:00,etc.
>  *
>  * Rather than that,if you are living in somewhere which is not using 
> UTC±00:00 time,
>  * such as China which is using UTC+08:00,and you want a time window with 
> size of one day,
>  * and window begins at every 00:00:00 of local time,you may use {@code 
> of(Time.days(1),Time.hours(-8))}.
>  * The parameter of offset is {@code Time.hours(-8))} since UTC+08:00 is 8 
> hours earlier than UTC time.
>  *
>  * @param size The size of the generated windows.
>  * @param offset The offset which window start would be shifted by.
>  * @return The time policy.
>  */
> public static TumblingEventTimeWindows of(Time size, Time offset) {
>  return new TumblingEventTimeWindows(size.toMilliseconds(), 
> offset.toMilliseconds());
> }{code}
>  
> but when the offset is smaller than zero, an IllegalArgumentException will be 
> thrown.
>  
> {code:java}
> protected TumblingEventTimeWindows(long size, long offset) {
>  if (offset < 0 || offset >= size) {
>  throw new IllegalArgumentException("TumblingEventTimeWindows parameters must 
> satisfy 0 <= offset < size");
>  }
>  this.size = size;
>  this.offset = offset;
> }{code}
>  
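
For illustration, one way to support negative offsets would be to normalize 
the offset into the range [0, size); this is a hedged sketch, not necessarily 
the fix that will land in Flink:

{code:java}
public class TumblingEventTimeWindows {

    private final long size;
    private final long offset;

    protected TumblingEventTimeWindows(long size, long offset) {
        if (Math.abs(offset) >= size) {
            throw new IllegalArgumentException(
                "TumblingEventTimeWindows parameters must satisfy abs(offset) < size");
        }
        this.size = size;
        // Normalize into [0, size): e.g. offset = -8h with size = 24h becomes 16h,
        // which shifts a daily window to local midnight for UTC+08:00.
        this.offset = (offset % size + size) % size;
    }
}
{code}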



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] hequn8128 commented on a change in pull request #6787: [FLINK-8577][table] Implement proctime DataStream to Table upsert conversion

2019-01-21 Thread GitBox
hequn8128 commented on a change in pull request #6787: [FLINK-8577][table] 
Implement proctime DataStream to Table upsert conversion
URL: https://github.com/apache/flink/pull/6787#discussion_r249476414
 
 

 ##
 File path: 
flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/stream/sql/FromUpsertStreamTest.scala
 ##
 @@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.api.stream.sql
+
+import org.apache.flink.api.common.typeinfo.BasicTypeInfo
+import org.apache.flink.api.java.typeutils.{RowTypeInfo, TupleTypeInfo}
+import org.apache.flink.api.scala._
+import org.apache.flink.table.api.Types
+import org.apache.flink.table.api.scala._
+import org.apache.flink.table.utils.TableTestUtil.{UpsertTableNode, term, 
unaryNode}
+import org.apache.flink.table.utils.{StreamTableTestUtil, TableTestBase}
+import org.apache.flink.types.Row
+import org.junit.Test
+import org.apache.flink.api.java.tuple.{Tuple2 => JTuple2}
+import java.lang.{Boolean => JBool}
+
+class FromUpsertStreamTest extends TableTestBase {
+
+  private val streamUtil: StreamTableTestUtil = streamTestUtil()
+
+  @Test
+  def testRemoveUpsertToRetraction() = {
+streamUtil.addTableFromUpsert[(Boolean, (Int, String, Long))](
+  "MyTable", 'a, 'b.key, 'c, 'proctime.proctime, 'rowtime.rowtime)
+
+val sql = "SELECT a, b, c, proctime, rowtime FROM MyTable"
+
+val expected =
+  unaryNode(
+"DataStreamCalc",
+UpsertTableNode(0),
+term("select", "a", "b", "c", "PROCTIME(proctime) AS proctime",
+  "CAST(rowtime) AS rowtime")
+  )
+streamUtil.verifySql(sql, expected)
+  }
+
+  @Test
+  def testMaterializeTimeIndicatorAndCalcUpsertToRetractionTranspose() = {
 
 Review comment:
   `Calc` can't always be pushed down(More details in 
`CalcUpsertToRetractionTransposeRule`), so there are two tests for each of the 
case:
   - `testCalcTransposeUpsertToRetraction()`
   - `testCalcCannotTransposeUpsertToRetraction()`
   
   As for `testMaterializeTimeIndicatorAndCalcUpsertToRetractionTranspose()`, 
maybe it's better to rename to `testMaterializeTimeIndicator()`. It is 
dedicated to test the materialization logic.
   
   What do you think?
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (FLINK-11399) Parsing nested ROW()s in SQL

2019-01-21 Thread JIRA
Benoît Paris created FLINK-11399:


 Summary: Parsing nested ROW()s in SQL
 Key: FLINK-11399
 URL: https://issues.apache.org/jira/browse/FLINK-11399
 Project: Flink
  Issue Type: Bug
  Components: Table API & SQL
Affects Versions: 1.7.1
Reporter: Benoît Paris


Hi!

I'm trying to build a nested structure in SQL (mapping to json with 
flink-json). This works fine: 
{code:java}
INSERT INTO outputTable
SELECT ROW(col1, col2) 
FROM (
  SELECT 
col1, 
ROW(col1, col1) as col2 
  FROM inputTable
) tbl2
{code}
(and I use it as a workaround), but it fails in the simpler version: 
{code:java}
INSERT INTO outputTable
SELECT ROW(col1, ROW(col1, col1)) 
FROM inputTable
{code}
, yielding the following stacktrace: 
{noformat}
Exception in thread "main" org.apache.flink.table.api.SqlParserException: SQL 
parse failed. Encountered ", ROW" at line 1, column 40.
Was expecting one of:
")" ...
","  ...
","  ...
","  ...
","  ...
","  ...

at 
org.apache.flink.table.calcite.FlinkPlannerImpl.parse(FlinkPlannerImpl.scala:94)
at 
org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:803)
at 
org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:777)
at TestBug.main(TestBug.java:32)
Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered ", ROW" 
at line 1, column 40.
Was expecting one of:
")" ...
","  ...
","  ...
","  ...
","  ...
","  ...

at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.convertException(SqlParserImpl.java:347)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.normalizeException(SqlParserImpl.java:128)
at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:137)
at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:162)
at 
org.apache.flink.table.calcite.FlinkPlannerImpl.parse(FlinkPlannerImpl.scala:90)
... 3 more
Caused by: org.apache.calcite.sql.parser.impl.ParseException: Encountered ", 
ROW" at line 1, column 40.
Was expecting one of:
")" ...
","  ...
","  ...
","  ...
","  ...
","  ...

at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.generateParseException(SqlParserImpl.java:23019)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.jj_consume_token(SqlParserImpl.java:22836)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.ParenthesizedSimpleIdentifierList(SqlParserImpl.java:4466)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression3(SqlParserImpl.java:3328)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2b(SqlParserImpl.java:3066)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2(SqlParserImpl.java:3092)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression(SqlParserImpl.java:3045)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SelectExpression(SqlParserImpl.java:1525)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SelectItem(SqlParserImpl.java:1500)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SelectList(SqlParserImpl.java:1477)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlSelect(SqlParserImpl.java:912)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.LeafQuery(SqlParserImpl.java:552)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.LeafQueryOrExpr(SqlParserImpl.java:3030)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.QueryOrExpr(SqlParserImpl.java:2949)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.OrderedQueryOrExpr(SqlParserImpl.java:463)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlInsert(SqlParserImpl.java:1212)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlStmt(SqlParserImpl.java:847)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlStmtEof(SqlParserImpl.java:869)
at 
org.apache.calcite.sql.parser.impl.SqlParserImpl.parseSqlStmtEof(SqlParserImpl.java:184)
at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:130)
... 5 more{noformat}
 

I was thinking it could be a naming/referencing issue, or that I was not using 
ROW() properly for the json-idiomatic structure I want to build.

Anyway this is very minor, thanks for all the good work on Flink!

Cheers,

Ben 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-11398) Add a dedicated phase to materialize time indicators for nodes produce updates

2019-01-21 Thread Hequn Cheng (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hequn Cheng updated FLINK-11398:

Description: 
As discussed 
[here|https://github.com/apache/flink/pull/6787#discussion_r247926320], we need 
a dedicated phase to materialize time indicators for nodes that produce updates.

Details:
Currently, we materialize time indicators in `RelTimeIndicatorConverter`. We 
need to introduce another materialization phase that materializes all time 
attributes on nodes that produce updates. We cannot do it inside 
`RelTimeIndicatorConverter`, because only later, after the physical 
optimization phase, do we know whether a node is a non-window outer join that 
will produce updates.

There are a few other things we need to consider.
- Whether we can unify the two converter phases.
- Take windows with early fire into consideration (not implemented yet). In 
this case, we don't need to materialize time indicators even if the node 
produces updates.









  was:
As discussed 
[here|https://github.com/apache/flink/pull/6787#discussion_r249056249], we need 
a dedicated phase to materialize time indicators for nodes that produce updates.

Details:
Currently, we materialize time indicators in `RelTimeIndicatorConverter`. We 
need to introduce another materialization phase that materializes all time 
attributes on nodes that produce updates. We cannot do it inside 
`RelTimeIndicatorConverter`, because only later, after the physical 
optimization phase, do we know whether a node is a non-window outer join that 
will produce updates.

There are a few other things we need to consider.
- Whether we can unify the two converter phases.
- Take windows with early fire into consideration (not implemented yet). In 
this case, we don't need to materialize time indicators even if the node 
produces updates.










> Add a dedicated phase to materialize time indicators for nodes produce updates
> --
>
> Key: FLINK-11398
> URL: https://issues.apache.org/jira/browse/FLINK-11398
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Hequn Cheng
>Assignee: Hequn Cheng
>Priority: Major
>
> As discussed 
> [here|https://github.com/apache/flink/pull/6787#discussion_r247926320], we 
> need a dedicated phase to materialize time indicators for nodes that produce 
> updates.
> Details:
> Currently, we materialize time indicators in `RelTimeIndicatorConverter`. We 
> need to introduce another materialization phase that materializes all time 
> attributes on nodes that produce updates. We cannot do it inside 
> `RelTimeIndicatorConverter`, because only later, after the physical 
> optimization phase, do we know whether a node is a non-window outer join that 
> will produce updates.
> There are a few other things we need to consider.
> - Whether we can unify the two converter phases.
> - Take windows with early fire into consideration (not implemented yet). 
> In this case, we don't need to materialize time indicators even if the node 
> produces updates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11398) Add a dedicated phase to materialize time indicators for nodes produce updates

2019-01-21 Thread Hequn Cheng (JIRA)
Hequn Cheng created FLINK-11398:
---

 Summary: Add a dedicated phase to materialize time indicators for 
nodes produce updates
 Key: FLINK-11398
 URL: https://issues.apache.org/jira/browse/FLINK-11398
 Project: Flink
  Issue Type: Improvement
  Components: Table API & SQL
Reporter: Hequn Cheng
Assignee: Hequn Cheng


As discussed 
[here|https://github.com/apache/flink/pull/6787#discussion_r249056249], we need 
a dedicated phase to materialize time indicators for nodes that produce updates.

Details:
Currently, we materialize time indicators in `RelTimeIndicatorConverter`. We 
need to introduce another materialization phase that materializes all time 
attributes on nodes that produce updates. We cannot do it inside 
`RelTimeIndicatorConverter`, because only later, after the physical 
optimization phase, do we know whether a node is a non-window outer join that 
will produce updates.

There are a few other things we need to consider.
- Whether we can unify the two converter phases.
- Take windows with early fire into consideration (not implemented yet). In 
this case, we don't need to materialize time indicators even if the node 
produces updates.











--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] dawidwys commented on a change in pull request #7436: [FLINK-11071][core] add support for dynamic proxy classes resolution …

2019-01-21 Thread GitBox
dawidwys commented on a change in pull request #7436: [FLINK-11071][core] add 
support for dynamic proxy classes resolution …
URL: https://github.com/apache/flink/pull/7436#discussion_r249462832
 
 

 ##
 File path: 
flink-core/src/test/java/org/apache/flink/util/InstantiationUtilTest.java
 ##
 @@ -46,6 +50,17 @@
  */
 public class InstantiationUtilTest extends TestLogger {
 
+   @Test
+   public void testResolveProxyClass() throws Exception {
 
 Review comment:
   This test does not check the new behavior. The proxy class is available in 
the same classloader as `InstantiationUtil`, which corresponds to a situation 
where the proxy class is available on the parent classpath in a Flink cluster. 
What you should actually test is the case where the proxy class is only 
available from the user classloader. 
   
   In 
`org.apache.flink.runtime.classloading.ClassLoaderTest#testMessageDecodingWithUnavailableClass`
 you can check how to create a new classloader with a class. Also, please 
always make sure that your test fails without the changes that are supposed to 
fix the tested problem (that is not the case here; it passes without your 
changes). 
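   
   For context, a hedged sketch of the kind of proxy resolution under test 
(hypothetical class name; not the exact code of this PR):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.lang.reflect.Proxy;

/** Resolves dynamic proxy classes against a user classloader. */
class UserCodeObjectInputStream extends ObjectInputStream {

    private final ClassLoader userClassLoader;

    UserCodeObjectInputStream(InputStream in, ClassLoader userClassLoader) throws IOException {
        super(in);
        this.userClassLoader = userClassLoader;
    }

    @Override
    protected Class<?> resolveProxyClass(String[] interfaces)
            throws IOException, ClassNotFoundException {
        // Load each interface through the user classloader, then build the proxy there.
        Class<?>[] resolved = new Class<?>[interfaces.length];
        for (int i = 0; i < interfaces.length; i++) {
            resolved[i] = Class.forName(interfaces[i], false, userClassLoader);
        }
        try {
            return Proxy.getProxyClass(userClassLoader, resolved);
        } catch (IllegalArgumentException e) {
            throw new ClassNotFoundException(null, e);
        }
    }
}
```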


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] fhueske commented on issue #6873: [hotfix] Add Review Progress section to PR description template.

2019-01-21 Thread GitBox
fhueske commented on issue #6873: [hotfix] Add Review Progress section to PR 
description template.
URL: https://github.com/apache/flink/pull/6873#issuecomment-456085278
 
 
   I think if we modify the review checklist in place (i.e., update it in the 
PR description), this is almost automatically given. 
   
   AFAIK, only the PR author and members of the ASF GitHub organization (or 
maybe even registered Flink committers?) can update the PR description. If we 
ask committers to sign off changes with their name, we should be good.
   
   A review progress bot would be great to have, IMO.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] rmetzger commented on issue #6873: [hotfix] Add Review Progress section to PR description template.

2019-01-21 Thread GitBox
rmetzger commented on issue #6873: [hotfix] Add Review Progress section to PR 
description template.
URL: https://github.com/apache/flink/pull/6873#issuecomment-456081384
 
 
   @zentol: I'm a bit against this `**NOTE: THE REVIEW PROGRESS MUST ONLY BE 
UPDATED BY AN APACHE FLINK COMMITTER!**` line, as it does not seem very 
inviting for new community members.
   Maybe, as a convention, reviewers could note in the comments what they have 
reviewed; then, if a committer who's merging a PR has doubts about the 
checklist, they can check the comments.
   
   I'm also thinking* about implementing a bot that takes care of tracking the 
checklist.
   
   *strongly enough to have created a project locally already :) 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (FLINK-11249) FlinkKafkaProducer011 can not be migrated to FlinkKafkaProducer

2019-01-21 Thread Piotr Nowojski (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-11249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16747937#comment-16747937
 ] 

Piotr Nowojski commented on FLINK-11249:


I have realised one more issue. This fix alone might not fully solve the 
problem. With this fix in place, the user will be able to upgrade their job 
from the {{0.11}} connector to the universal one, but what happens if they 
upgrade the Kafka Brokers at any point in time? If the user stops the Kafka 
Brokers, upgrades and then restarts them, does this process preserve the 
pending transactions that Flink has already "pre-committed"? Or are they 
automatically aborted? If they are automatically aborted, we might have data 
loss from our perspective.

If "pre-committed" transactions are aborted during Kafka broker upgrades, we 
would need a "clean stop with savepoint" feature to handle this user story. I 
guess this needs more experiments and more testing.

CC [~tzulitai] [~aljoscha]

> FlinkKafkaProducer011 can not be migrated to FlinkKafkaProducer
> ---
>
> Key: FLINK-11249
> URL: https://issues.apache.org/jira/browse/FLINK-11249
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kafka Connector
>Affects Versions: 1.7.0, 1.7.1
>Reporter: Piotr Nowojski
>Assignee: vinoyang
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.7.2, 1.8.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As reported by a user on the mailing list ("How to migrate Kafka Producer?", 
> on 18th December 2018), {{FlinkKafkaProducer011}} cannot be migrated to 
> {{FlinkKafkaProducer}}, and the same problem can occur in future Kafka 
> producer versions/refactorings.
> The issue is that the {{ListState}} field 
> {{FlinkKafkaProducer#nextTransactionalIdHintState}} is serialized using Java 
> serialization, and this causes problems/collisions between 
> {{FlinkKafkaProducer011.NextTransactionalIdHint}} and
> {{FlinkKafkaProducer.NextTransactionalIdHint}}.
> To fix that we probably need to release new versions of those classes that 
> rewrite/upgrade this state field to a new one that doesn't rely on Java 
> serialization. After this, we could drop the support for the old field, 
> which in turn will allow users to upgrade from the 0.11 connector to the 
> universal one.
> One bright side is that technically speaking our {{FlinkKafkaProducer011}} 
> has the same compatibility matrix as the universal one (it's also forward & 
> backward compatible with the same Kafka versions), so for the time being 
> users can stick to {{FlinkKafkaProducer011}}.
> FYI [~tzulitai] [~yanghua]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-7834) cost model (RelOptCost) extends network cost and memory cost

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-7834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7834:
--
Labels: pull-request-available  (was: )

> cost model (RelOptCost) extends network cost and memory cost
> 
>
> Key: FLINK-7834
> URL: https://issues.apache.org/jira/browse/FLINK-7834
> Project: Flink
>  Issue Type: New Feature
>  Components: Table API & SQL
>Reporter: godfrey he
>Priority: Major
>  Labels: pull-request-available
>
> {{RelOptCost}} defines an interface for optimizer cost in terms of the 
> number of rows processed, CPU cost, and I/O cost. Flink is a distributed 
> framework, so network and memory are two more important cost metrics that 
> the optimizer should consider. This feature extends {{RelOptCost}} 
> accordingly.
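
For illustration, a hedged sketch of an extended cost vector (a hypothetical 
class, not Flink's or Calcite's actual API):

{code:java}
// Extends the classic rows/CPU/IO cost triple with network and memory.
final class DistributedCost {

    final double rowCount;
    final double cpu;
    final double io;
    final double network; // bytes shipped across the wire
    final double memory;  // peak managed memory per operator

    DistributedCost(double rowCount, double cpu, double io, double network, double memory) {
        this.rowCount = rowCount;
        this.cpu = cpu;
        this.io = io;
        this.network = network;
        this.memory = memory;
    }

    DistributedCost plus(DistributedCost other) {
        return new DistributedCost(
            rowCount + other.rowCount,
            cpu + other.cpu,
            io + other.io,
            network + other.network,
            memory + other.memory);
    }

    boolean isLessThan(DistributedCost other) {
        // A simple weighted comparison; real planners use more nuanced orderings.
        double a = cpu + io + 10 * network + memory;
        double b = other.cpu + other.io + 10 * other.network + other.memory;
        return a < b;
    }
}
{code}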



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] clarkyzl closed pull request #3623: [FLINK-6196] [table] Support dynamic schema in Table Function

2019-01-21 Thread GitBox
clarkyzl closed pull request #3623: [FLINK-6196] [table] Support dynamic schema 
in Table Function
URL: https://github.com/apache/flink/pull/3623
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-6196) Support dynamic schema in Table Function

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-6196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-6196:
--
Labels: pull-request-available  (was: )

> Support dynamic schema in Table Function
> 
>
> Key: FLINK-6196
> URL: https://issues.apache.org/jira/browse/FLINK-6196
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Zhuoluo Yang
>Assignee: Zhuoluo Yang
>Priority: Major
>  Labels: pull-request-available
>
> In many of our use cases, we have to decide the schema of a UDTF at run 
> time. For example, udtf('c1, c2, c3') will generate three columns for a 
> lateral view. 
> Most systems, such as Calcite and Hive, support this feature. However, the 
> current implementation in Flink does not handle the feature correctly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-7065) separate the flink-streaming-java module from flink-clients

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7065:
--
Labels: pull-request-available  (was: )

> separate the flink-streaming-java module from flink-clients 
> 
>
> Key: FLINK-7065
> URL: https://issues.apache.org/jira/browse/FLINK-7065
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Reporter: Xu Pingyong
>Assignee: Xu Pingyong
>Priority: Major
>  Labels: pull-request-available
>
> Motivation:
>  It is not good that the "flink-streaming-java" module depends on 
> "flink-clients"; rather, "flink-clients" should be the one that depends on 
> "flink-streaming-java".
> Related Change:
>   1. LocalStreamEnvironment and RemoteStreamEnvironment can also execute 
> a job via the executors (LocalExecutor and RemoteExecutor). Introduce a 
> StreamGraphExecutor, which executes a streamGraph just as PlanExecutor 
> executes the plan. StreamGraphExecutor and PlanExecutor both extend Executor.
>   2. Introduce a StreamExecutionEnvironmentFactory, which works similarly 
> to ContextEnvironmentFactory in flink-clients.
>   When an object of ContextEnvironmentFactory, 
> OptimizerPlanEnvironmentFactory or PreviewPlanEnvironmentFactory is set into 
> ExecutionEnvironment (by calling initializeContextEnvironment), the relevant 
> StreamEnvFactory is also set into StreamExecutionEnvironment. It is similar 
> when calling unsetContext.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] XuPingyong commented on issue #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear

2019-01-21 Thread GitBox
XuPingyong commented on issue #4267: [FLINK-7018][streaming] Refactor 
streamGraph to make interfaces clear
URL: https://github.com/apache/flink/pull/4267#issuecomment-456053593
 
 
   no need


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] XuPingyong closed pull request #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear

2019-01-21 Thread GitBox
XuPingyong closed pull request #4267: [FLINK-7018][streaming] Refactor 
streamGraph to make interfaces clear
URL: https://github.com/apache/flink/pull/4267
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-5833) Support for Hive GenericUDF

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-5833:
--
Labels: pull-request-available  (was: )

> Support for Hive GenericUDF
> ---
>
> Key: FLINK-5833
> URL: https://issues.apache.org/jira/browse/FLINK-5833
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table API & SQL
>Reporter: Zhuoluo Yang
>Assignee: Zhuoluo Yang
>Priority: Major
>  Labels: pull-request-available
>
> The second step of FLINK-5802 is to support Hive's GenericUDF.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] XuPingyong closed pull request #4273: [FLINK-7065] Separate the flink-streaming-java module from flink-clients

2019-01-21 Thread GitBox
XuPingyong closed pull request #4273: [FLINK-7065] Separate the 
flink-streaming-java module from flink-clients
URL: https://github.com/apache/flink/pull/4273
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] XuPingyong closed pull request #4241: [FLINK-7015] [streaming] separate OperatorConfig from StreamConfig

2019-01-21 Thread GitBox
XuPingyong closed pull request #4241: [FLINK-7015] [streaming] separate 
OperatorConfig from StreamConfig
URL: https://github.com/apache/flink/pull/4241
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] clarkyzl closed pull request #3473: [FLINK-5833] [table] Support for Hive GenericUDF

2019-01-21 Thread GitBox
clarkyzl closed pull request #3473: [FLINK-5833] [table] Support for Hive 
GenericUDF
URL: https://github.com/apache/flink/pull/3473
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-7015) Separate OperatorConfig from StreamConfig

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7015:
--
Labels: pull-request-available  (was: )

> Separate OperatorConfig from StreamConfig
> -
>
> Key: FLINK-7015
> URL: https://issues.apache.org/jira/browse/FLINK-7015
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Reporter: Xu Pingyong
>Assignee: Xu Pingyong
>Priority: Major
>  Labels: pull-request-available
>
>  Motivation:
> A Task contains one or more operators via chaining; however, the 
> configs of operator and task are all put in StreamConfig. For example, when 
> an operator sets up with the StreamConfig, it can see interfaces about 
> physicalEdges or chained.task.configs, which is confusing. Similarly, a 
> streamTask should not see the interface about chain.index.
>  So we need to separate OperatorConfig from StreamConfig. A 
> streamTask builds its execution environment with the streamConfig, extracts 
> operatorConfigs from it, and then builds streamOperators with every 
> operatorConfig. 
> 
>OperatorConfig: what the streamOperator sets up with; it only contains 
> information that belongs to the streamOperator:
>1)  operator information: name, id
>2)  Serialized StreamOperator
>3)  input serializer.
>4)  output edges and serializers.
>5)  chain.index
>6) state.key.serializer
>  StreamConfig: for the streamTask to use:
>1) in.physical.edges
>   2) out.physical.edges
>3) chained OperatorConfigs
>4) execution environment: checkpoint, state.backend and so on... 
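
For illustration, a hedged sketch of the proposed separation (hypothetical 
field layout, not the actual patch):

{code:java}
import java.util.List;

// Per-operator settings: only what a single StreamOperator needs to set up.
class OperatorConfig {
    String operatorName;
    Integer operatorId;
    byte[] serializedOperator;  // the serialized StreamOperator
    byte[] inputSerializer;     // the serialized input TypeSerializer
    int chainIndex;
    byte[] stateKeySerializer;  // the serialized state key serializer
}

// Per-task settings: the StreamTask view, holding the chained operator configs.
class TaskStreamConfig {
    List<Integer> inPhysicalEdges;
    List<Integer> outPhysicalEdges;
    List<OperatorConfig> chainedOperatorConfigs;
    // plus the execution environment: checkpoint config, state backend, ...
}
{code}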



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] XuPingyong opened a new pull request #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear

2019-01-21 Thread GitBox
XuPingyong opened a new pull request #4267: [FLINK-7018][streaming] Refactor 
streamGraph to make interfaces clear
URL: https://github.com/apache/flink/pull/4267
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] XuPingyong removed a comment on issue #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear

2019-01-21 Thread GitBox
XuPingyong removed a comment on issue #4267: [FLINK-7018][streaming] Refactor 
streamGraph to make interfaces clear
URL: https://github.com/apache/flink/pull/4267#issuecomment-456053593
 
 
   no need


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-5514) Implement an efficient physical execution for CUBE/ROLLUP/GROUPING SETS

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-5514:
--
Labels: pull-request-available  (was: )

> Implement an efficient physical execution for CUBE/ROLLUP/GROUPING SETS
> ---
>
> Key: FLINK-5514
> URL: https://issues.apache.org/jira/browse/FLINK-5514
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: Timo Walther
>Assignee: godfrey he
>Priority: Major
>  Labels: pull-request-available
>
> Initial support for GROUPING SETS was added in FLINK-5303. However, the 
> current runtime implementation is not very efficient, as it basically only 
> translates logical operators to physical operators, i.e., grouping sets are 
> currently only translated into multiple groupings that are unioned together. 
> A rough design document for this has been created in FLINK-2980.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] XuPingyong closed pull request #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear

2019-01-21 Thread GitBox
XuPingyong closed pull request #4267: [FLINK-7018][streaming] Refactor 
streamGraph to make interfaces clear
URL: https://github.com/apache/flink/pull/4267
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (FLINK-6516) using real row count instead of dummy row count when optimizing plan

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-6516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-6516:
--
Labels: pull-request-available  (was: )

> using real row count instead of dummy row count when optimizing plan
> 
>
> Key: FLINK-6516
> URL: https://issues.apache.org/jira/browse/FLINK-6516
> Project: Flink
>  Issue Type: Improvement
>  Components: Table API & SQL
>Reporter: godfrey he
>Assignee: godfrey he
>Priority: Major
>  Labels: pull-request-available
>
> Currently, the statistics of {{TableSourceTable}} are mostly {{UNKNOWN}}, 
> and the statistics from {{ExternalCatalog}} may also be null. Actually, only 
> each {{TableSource}} knows its statistics exactly, especially a 
> {{FilterableTableSource}} or {{PartitionableTableSource}}. So we can add a 
> {{getTableStats}} method to {{TableSource}} and use it in TableSourceScan's 
> estimateRowCount method to get the real row count.
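
For illustration, a hedged sketch of the idea (hypothetical types, not the 
actual Flink API):

{code:java}
// A minimal stand-in for table statistics.
final class TableStats {
    final long rowCount;

    TableStats(long rowCount) {
        this.rowCount = rowCount;
    }
}

// A source that can report its own statistics to the optimizer.
interface StatsReportingTableSource {
    /** Returns the statistics of the table, or null if unknown. */
    TableStats getTableStats();
}

final class CsvSourceWithStats implements StatsReportingTableSource {

    private final long knownRowCount;

    CsvSourceWithStats(long knownRowCount) {
        this.knownRowCount = knownRowCount;
    }

    @Override
    public TableStats getTableStats() {
        // A FilterableTableSource could return a reduced count after push-down.
        return new TableStats(knownRowCount);
    }
}
{code}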



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (FLINK-7018) Refactor streamGraph to make interfaces clear

2019-01-21 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/FLINK-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-7018:
--
Labels: pull-request-available  (was: )

> Refactor streamGraph to make interfaces clear
> -
>
> Key: FLINK-7018
> URL: https://issues.apache.org/jira/browse/FLINK-7018
> Project: Flink
>  Issue Type: Improvement
>  Components: DataStream API
>Reporter: Xu Pingyong
>Assignee: Xu Pingyong
>Priority: Major
>  Labels: pull-request-available
>
> Motivation:
>1. StreamGraph is a graph consisting of streamNodes, so virtual 
> nodes (such as select, sideOutput, partition) should be moved away from it. 
> The main interfaces of StreamGraph should be as follows:
> addSource(StreamNode sourceNode)
> addSink(StreamNode sinkNode)
>addOperator(StreamNode streamNode)
>addEdge(Integer upStreamVertexID, Integer downStreamVertexID, 
> StreamEdge.InputOrder inputOrder, StreamPartitioner partitioner, 
> List<String> outputNames, OutputTag outputTag)
> getJobGraph()
>  2. StreamExecutionEnvironment should not be in StreamGraph; I create 
> StreamGraphProperties, which extracts all the environment information the 
> streamGraph needs from StreamExecutionEnvironment. It contains:
> 1) executionConfig
> 2) checkpointConfig
> 3) timeCharacteristic
> 4) stateBackend
> 5) chainingEnabled
> 6) cachedFiles
> 7) jobName
> 
> Related Changes:
>I moved the part dealing with virtual nodes to 
> StreamGraphGenerator, and the properties of StreamGraph are now read from 
> StreamGraphProperties instead of StreamExecutionEnvironment.
>  
>   It is only an internal code abstraction.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

