[GitHub] lamber-ken opened a new pull request #7551: [hotfix][runtime] add default statement in switch block when judge taskAcknowledgeResult
lamber-ken opened a new pull request #7551: [hotfix][runtime] add default statement in switch block when judge taskAcknowledgeResult
URL: https://github.com/apache/flink/pull/7551

## What is the purpose of the change

Add a default statement to the switch block that judges taskAcknowledgeResult.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
- The S3 file system connector: (no)

## Documentation

- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not documented)

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
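The pattern the hotfix applies can be sketched in isolation. This is a hypothetical stand-in, not the actual Flink checkpoint-coordinator code; the enum constants mirror Flink's `TaskAcknowledgeResult` but the handler logic is invented for illustration:

```java
// Hypothetical sketch of "add a default branch when switching over an
// acknowledgement result". The enum and handler are illustrative only.
public class AckSwitchExample {
    enum TaskAcknowledgeResult { SUCCESS, DUPLICATE, UNKNOWN, DISCARDED }

    static String handle(TaskAcknowledgeResult result) {
        switch (result) {
            case SUCCESS:
                return "acknowledged";
            case DUPLICATE:
                return "ignored duplicate";
            case UNKNOWN:
                return "unknown checkpoint";
            default:
                // Without this branch, a newly added enum constant would be
                // silently ignored; failing fast surfaces the gap instead.
                throw new IllegalArgumentException("Unexpected result: " + result);
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(TaskAcknowledgeResult.SUCCESS));
    }
}
```

The point of the `default` branch is defensive completeness: the switch stays correct even if the enum later grows.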
[GitHub] eaglewatcherwb commented on a change in pull request #7436: [FLINK-11071][core] add support for dynamic proxy classes resolution …
eaglewatcherwb commented on a change in pull request #7436: [FLINK-11071][core] add support for dynamic proxy classes resolution …
URL: https://github.com/apache/flink/pull/7436#discussion_r249663932

## File path: flink-core/src/test/java/org/apache/flink/util/InstantiationUtilTest.java

@@ -46,6 +50,17 @@ */ public class InstantiationUtilTest extends TestLogger { + @Test + public void testResolveProxyClass() throws Exception {

Review comment: @dawidwys Thanks for the comments. The unit test is updated, and the test fails when `ClassLoaderObjectInputStream#resolveProxyClass` only calls `super.resolveProxyClass`. I cannot remove all the changes to verify the case, since `ObjectInputStream#resolveProxyClass` is `protected`.
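The mechanism under discussion, overriding `ObjectInputStream#resolveProxyClass` so that a dynamic proxy's interfaces are resolved against a chosen `ClassLoader`, can be shown with plain JDK classes. The class name and demo here are hypothetical; Flink's `ClassLoaderObjectInputStream` differs in detail:

```java
import java.io.*;
import java.lang.reflect.*;
import java.util.concurrent.Callable;

// Hypothetical sketch: an ObjectInputStream that resolves proxy interfaces
// against a specific ClassLoader instead of the superclass's default lookup.
public class ProxyResolvingStream extends ObjectInputStream {
    private final ClassLoader classLoader;

    public ProxyResolvingStream(InputStream in, ClassLoader classLoader) throws IOException {
        super(in);
        this.classLoader = classLoader;
    }

    @Override
    protected Class<?> resolveProxyClass(String[] interfaces)
            throws IOException, ClassNotFoundException {
        Class<?>[] resolved = new Class<?>[interfaces.length];
        for (int i = 0; i < interfaces.length; i++) {
            resolved[i] = Class.forName(interfaces[i], false, classLoader);
        }
        // Re-create the dynamic proxy class against the target loader.
        return Proxy.getProxyClass(classLoader, resolved);
    }

    // Serializable handler so the whole proxy survives a round trip.
    static class ConstantHandler implements InvocationHandler, Serializable {
        @Override
        public Object invoke(Object proxy, Method method, Object[] args) {
            return "hello";
        }
    }

    // Serialize a dynamic proxy, then read it back through the custom stream.
    static Object roundTrip() {
        try {
            Callable<?> proxy = (Callable<?>) Proxy.newProxyInstance(
                    ProxyResolvingStream.class.getClassLoader(),
                    new Class<?>[] {Callable.class},
                    new ConstantHandler());
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
                out.writeObject(proxy);
            }
            try (ProxyResolvingStream in = new ProxyResolvingStream(
                    new ByteArrayInputStream(bos.toByteArray()),
                    ProxyResolvingStream.class.getClassLoader())) {
                Callable<?> restored = (Callable<?>) in.readObject();
                return restored.call();
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Because `resolveProxyClass` is `protected`, exactly as the comment notes, this behavior can only be exercised through a subclass like the one above.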
[GitHub] Clarkkkkk opened a new pull request #7550: [FLINK-10245] [Streaming Connector] Add Pojo, Tuple, Row and Scala Product DataStream Sink and Upsert Table Sink for HBase
Clarkkkkk opened a new pull request #7550: [FLINK-10245] [Streaming Connector] Add Pojo, Tuple, Row and Scala Product DataStream Sink and Upsert Table Sink for HBase
URL: https://github.com/apache/flink/pull/7550

## What is the purpose of the change

- Add a DataStream sink and an upsert table sink for HBase

## Brief change log

- Add the HBase connector; details are in the doc.

## Verifying this change

This change added tests and can be verified as follows:

- org.apache.flink.streaming.connectors.hbase.HBaseSinkITCase: starts a mini HBase cluster to verify all sink functions.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: no
- The S3 file system connector: no

## Documentation

- Does this pull request introduce a new feature? yes
- If yes, how is the feature documented? JavaDocs and https://docs.google.com/document/d/1of0cYd73CtKGPt-UL3WVFTTBsVEre-TNRzoAt5u2PdQ/edit#
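The upsert semantics such a table sink implements can be illustrated independently of HBase. Below, a toy keyed store stands in for an HBase table; none of this is the PR's actual code, and the mapping to HBase `Put`/`Delete` operations is an assumption for illustration:

```java
import java.util.*;

// Illustrative sketch of upsert-sink semantics: a change stream of
// (key, value-or-null) pairs applied to a keyed store, where null means
// delete. In an HBase sink, put/remove would become Put/Delete mutations.
public class UpsertSketch {
    private final Map<String, String> store = new HashMap<>();

    public void apply(String rowKey, String value) {
        if (value == null) {
            store.remove(rowKey);     // retraction: delete the row
        } else {
            store.put(rowKey, value); // upsert: overwrite any prior value
        }
    }

    public Map<String, String> snapshot() {
        return Collections.unmodifiableMap(store);
    }
}
```

The key design point is idempotence per key: replaying the same change stream yields the same final table, which is what makes upsert sinks tolerant of retries.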
[jira] [Updated] (FLINK-5703) ExecutionGraph recovery based on reconciliation with TaskManager reports
[ https://issues.apache.org/jira/browse/FLINK-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-5703:
Labels: pull-request-available (was: )

> ExecutionGraph recovery based on reconciliation with TaskManager reports
>
> Key: FLINK-5703
> URL: https://issues.apache.org/jira/browse/FLINK-5703
> Project: Flink
> Issue Type: Sub-task
> Components: Distributed Coordination, JobManager
> Reporter: zhijiang
> Assignee: zhijiang
> Priority: Major
> Labels: pull-request-available
>
> The ExecutionGraph structure would be recovered from TaskManager reports during the reconciling period. The necessary information includes:
> - Execution: ExecutionAttemptID, AttemptNumber, StartTimestamp, ExecutionState, SimpleSlot, PartialInputChannelDeploymentDescriptor (consumer Execution)
> - ExecutionVertex: Map IntermediateResultPartition>
> - ExecutionGraph: ConcurrentHashMap
>
> For the {{RECONCILING}} ExecutionState, it should transition into any existing task state ({{RUNNING}}, {{CANCELED}}, {{FAILED}}, {{FINISHED}}). To do so, the TaskManager should maintain the terminal task states ({{CANCELED}}, {{FAILED}}, {{FINISHED}}) for a while; we will try to realize this mechanism in another JIRA. In addition, the state transition would trigger different actions, and some actions rely on the necessary information above. Considering this limit, the recovery process is divided into two steps:
> - First, recover all other necessary information except ExecutionState.
> - Second, transition ExecutionState into the real task state and trigger actions. The behavior is the same as the current {{UpdateTaskExecutorState}}.
>
> To keep the logic simple and consistent, during the recovery period all other RPC messages ({{UpdateTaskExecutionState}}, {{ScheduleOrUpdateConsumers}}, etc.) from the TaskManager should be refused temporarily and responded to with a special message by the JobMaster. The TaskManager should then retry sending these messages until the JobManager ends recovery and acknowledges.
>
> For the {{RECONCILING}} JobStatus, it would transition into one of the states ({{RUNNING}}, {{FAILING}}, {{FINISHED}}) after recovery:
> - {{RECONCILING}} to {{RUNNING}}: all TaskManagers report within the duration and all tasks are in the {{RUNNING}} state.
> - {{RECONCILING}} to {{FAILING}}: one of the TaskManagers does not report in time, or one of the task states is {{FAILED}} or {{CANCELED}}.
> - {{RECONCILING}} to {{FINISHED}}: all TaskManagers report within the duration and all tasks are in the {{FINISHED}} state.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
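The JobStatus resolution rules above can be sketched as a small decision function. This is illustrative only: the names are stand-ins, and the handling of a mixed RUNNING/FINISHED task set is an assumption, since the issue specifies only the three pure outcomes:

```java
import java.util.*;

// Sketch of the RECONCILING -> {RUNNING, FAILING, FINISHED} resolution rules
// described in FLINK-5703. Not Flink code; names are illustrative.
public class ReconcileOutcome {
    enum TaskState { RUNNING, CANCELED, FAILED, FINISHED }
    enum JobStatus { RUNNING, FAILING, FINISHED }

    // allReported: every TaskManager reported within the duration time.
    static JobStatus resolve(boolean allReported, Collection<TaskState> tasks) {
        if (!allReported
                || tasks.contains(TaskState.FAILED)
                || tasks.contains(TaskState.CANCELED)) {
            return JobStatus.FAILING;
        }
        boolean allFinished = tasks.stream().allMatch(s -> s == TaskState.FINISHED);
        // Assumption: any not-yet-finished (but healthy) task set resolves
        // to RUNNING; the issue only spells out the all-RUNNING case.
        return allFinished ? JobStatus.FINISHED : JobStatus.RUNNING;
    }
}
```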
[GitHub] zhijiangW commented on issue #3340: [FLINK-5703] [Job Manager] ExecutionGraph recovery via reconciliation with TaskManager reports
zhijiangW commented on issue #3340: [FLINK-5703] [Job Manager] ExecutionGraph recovery via reconciliation with TaskManager reports
URL: https://github.com/apache/flink/pull/3340#issuecomment-456296041

Closing this PR temporarily; this feature would be re-submitted if needed in the future.
[GitHub] zhijiangW closed pull request #3340: [FLINK-5703] [Job Manager] ExecutionGraph recovery via reconciliation with TaskManager reports
zhijiangW closed pull request #3340: [FLINK-5703] [Job Manager] ExecutionGraph recovery via reconciliation with TaskManager reports
URL: https://github.com/apache/flink/pull/3340
[jira] [Updated] (FLINK-11403) Remove ResultPartitionConsumableNotifier from ResultPartition
[ https://issues.apache.org/jira/browse/FLINK-11403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-11403:
Labels: pull-request-available (was: )

> Remove ResultPartitionConsumableNotifier from ResultPartition
>
> Key: FLINK-11403
> URL: https://issues.apache.org/jira/browse/FLINK-11403
> Project: Flink
> Issue Type: Sub-task
> Components: Network
> Reporter: zhijiang
> Assignee: zhijiang
> Priority: Minor
> Labels: pull-request-available
> Fix For: 1.8.0
>
> This is a precondition for introducing a pluggable {{ShuffleService}} on the TM side.
>
> In the current process of creating a {{ResultPartition}}, the {{ResultPartitionConsumableNotifier}}, regarded as a TM-level component, has to be passed into the constructor. In order to create a {{ResultPartition}} easily from the {{ShuffleService}}, the required information should be covered by the {{ResultPartitionDeploymentDescriptor}} as much as possible, so we can remove this notifier from the constructor. It is also reasonable to notify of consumable partitions via {{TaskActions}}, which is already covered in {{ResultPartition}}.
[GitHub] zhijiangW opened a new pull request #7549: [FLINK-11403][network] Remove ResultPartitionConsumableNotifier from ResultPartition
zhijiangW opened a new pull request #7549: [FLINK-11403][network] Remove ResultPartitionConsumableNotifier from ResultPartition
URL: https://github.com/apache/flink/pull/7549

## What is the purpose of the change

*This is a precondition for introducing a pluggable `ShuffleService` on the TM side.*

*In the current process of creating a `ResultPartition`, the `ResultPartitionConsumableNotifier`, regarded as a TM-level component, has to be passed into the constructor. In order to create a `ResultPartition` easily via the `ShuffleService`, the required information should be covered by the `ResultPartitionDeploymentDescriptor` as much as possible, so we can remove this notifier from the constructor. It is also reasonable to notify of consumable partitions via `TaskActions`, which is already covered in `ResultPartition`.*

## Brief change log

- *Remove `ResultPartitionConsumableNotifier` from the `ResultPartition` constructor*
- *Introduce `notifyPartitionConsumable` in the `TaskActions` interface*
- *Make the `NoOpTaskActions` implementation a public class for test usage*

## Verifying this change

This change is already covered by existing tests, such as *ResultPartitionTest*.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): (**no**)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (**no**)
- The serializers: (**no**)
- The runtime per-record code paths (performance sensitive): (**no**)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (**no**)
- The S3 file system connector: (**no**)

## Documentation

- Does this pull request introduce a new feature? (**no**)
- If yes, how is the feature documented? (**not applicable**)
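The refactoring idea, dropping the TM-level notifier from the constructor and routing the notification through `TaskActions`, which the partition already holds, can be sketched with simplified stand-in types. All names below are illustrative, not the actual Flink classes:

```java
// Before: a separate ResultPartitionConsumableNotifier was injected into the
// constructor. After: the partition notifies through TaskActions instead.
// Simplified stand-ins for the Flink types.
public class NotifierRefactor {
    interface TaskActions {
        void notifyPartitionConsumable(String partitionId);
    }

    static class ResultPartition {
        private final String id;
        private final TaskActions taskActions; // already held before the refactor

        ResultPartition(String id, TaskActions taskActions) {
            this.id = id;
            this.taskActions = taskActions;
        }

        void onFirstRecord() {
            // Previously: consumableNotifier.notifyPartitionConsumable(...).
            // Now routed through the interface the partition already has.
            taskActions.notifyPartitionConsumable(id);
        }
    }

    public static void main(String[] args) {
        ResultPartition p = new ResultPartition(
                "p-1", id -> System.out.println("consumable: " + id));
        p.onFirstRecord();
    }
}
```

The design benefit is the one the PR describes: with one less constructor dependency, a pluggable `ShuffleService` can construct partitions from the deployment descriptor alone.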
[jira] [Updated] (FLINK-10245) Add HBase Streaming Sink
[ https://issues.apache.org/jira/browse/FLINK-10245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shimin Yang updated FLINK-10245:
Summary: Add HBase Streaming Sink (was: Add DataStream HBase Sink)

> Add HBase Streaming Sink
>
> Key: FLINK-10245
> URL: https://issues.apache.org/jira/browse/FLINK-10245
> Project: Flink
> Issue Type: Sub-task
> Components: Streaming Connectors
> Reporter: Shimin Yang
> Assignee: Shimin Yang
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Design documentation: https://docs.google.com/document/d/1of0cYd73CtKGPt-UL3WVFTTBsVEre-TNRzoAt5u2PdQ/edit?usp=sharing
[jira] [Created] (FLINK-11405) rest api can see exception by start end time filter
lining created FLINK-11405:
Summary: rest api can see exception by start end time filter
Key: FLINK-11405
URL: https://issues.apache.org/jira/browse/FLINK-11405
Project: Flink
Issue Type: Sub-task
Reporter: lining
Assignee: lining
[jira] [Created] (FLINK-11404) web ui add see page and can filter by time
lining created FLINK-11404:
Summary: web ui add see page and can filter by time
Key: FLINK-11404
URL: https://issues.apache.org/jira/browse/FLINK-11404
Project: Flink
Issue Type: Sub-task
Reporter: lining
Assignee: lining
[jira] [Assigned] (FLINK-11404) web ui add see page and can filter by time
[ https://issues.apache.org/jira/browse/FLINK-11404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lining reassigned FLINK-11404:
Assignee: Yadong Xie (was: lining)

> web ui add see page and can filter by time
>
> Key: FLINK-11404
> URL: https://issues.apache.org/jira/browse/FLINK-11404
> Project: Flink
> Issue Type: Sub-task
> Reporter: lining
> Assignee: Yadong Xie
> Priority: Major
[jira] [Issue Comment Deleted] (FLINK-11374) See more failover and can filter by time range
[ https://issues.apache.org/jira/browse/FLINK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lining updated FLINK-11374:
Comment: was deleted (was: agg by vertex !image-2019-01-22-11-42-33-808.png!)

> See more failover and can filter by time range
>
> Key: FLINK-11374
> URL: https://issues.apache.org/jira/browse/FLINK-11374
> Project: Flink
> Issue Type: Improvement
> Components: REST, Webfrontend
> Reporter: lining
> Assignee: lining
> Priority: Major
> Attachments: image-2019-01-22-11-40-53-135.png, image-2019-01-22-11-42-33-808.png
>
> Currently the failover view only shows a limited number of the latest task failovers. If a task has failed many times, we cannot see the earlier failovers. Can we add a filter by time to see failovers containing task attempt failure messages?
[GitHub] Myasuka commented on issue #7544: [FLINK-11366][tests] Port TaskManagerMetricsTest to new code base
Myasuka commented on issue #7544: [FLINK-11366][tests] Port TaskManagerMetricsTest to new code base
URL: https://github.com/apache/flink/pull/7544#issuecomment-456287424

I close-reopened this PR to rebuild the checks after an unexpected ITCase breakage (refer to [FLINK-9920](https://issues.apache.org/jira/browse/FLINK-9920)). CC @tillrohrmann, would you please take a look at this PR?
[GitHub] Myasuka closed pull request #7544: [FLINK-11366][tests] Port TaskManagerMetricsTest to new code base
Myasuka closed pull request #7544: [FLINK-11366][tests] Port TaskManagerMetricsTest to new code base
URL: https://github.com/apache/flink/pull/7544
[GitHub] Myasuka opened a new pull request #7544: [FLINK-11366][tests] Port TaskManagerMetricsTest to new code base
Myasuka opened a new pull request #7544: [FLINK-11366][tests] Port TaskManagerMetricsTest to new code base
URL: https://github.com/apache/flink/pull/7544

## What is the purpose of the change

Port `TaskManagerMetricsTest` to the new code base.

## Brief change log

- Move `TaskManagerMetricsTest`'s test to `TaskExecutorTest#testHeartbeatTimeoutWithResourceManager`

## Verifying this change

This change added tests and can be verified as follows:

- Verify that the TaskManager's metrics registry is not shut down due to a disconnect from the RM

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
- The S3 file system connector: no

## Documentation

- Does this pull request introduce a new feature? no
[GitHub] Myasuka commented on issue #7525: [FLINK-11363][test] Check and remove TaskManagerConfigurationTest
Myasuka commented on issue #7525: [FLINK-11363][test] Check and remove TaskManagerConfigurationTest
URL: https://github.com/apache/flink/pull/7525#issuecomment-456287098

I close-reopened this PR to rebuild the checks after an unexpected ITCase crash (refer to [FLINK-11380](https://issues.apache.org/jira/browse/FLINK-11380)).
[jira] [Commented] (FLINK-9920) BucketingSinkFaultToleranceITCase fails on travis
[ https://issues.apache.org/jira/browse/FLINK-9920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748419#comment-16748419 ]

Yun Tang commented on FLINK-9920:
Another instance: https://api.travis-ci.org/v3/job/481937341/log.txt
[~aljoscha] when you verified this case locally, did you use hadoop-2.8.3 as Travis does, and is the local OS Linux rather than macOS?

> BucketingSinkFaultToleranceITCase fails on travis
>
> Key: FLINK-9920
> URL: https://issues.apache.org/jira/browse/FLINK-9920
> Project: Flink
> Issue Type: Bug
> Components: filesystem-connector
> Affects Versions: 1.6.0, 1.7.0
> Reporter: Chesnay Schepler
> Assignee: Aljoscha Krettek
> Priority: Critical
> Labels: test-stability
> Fix For: 1.8.0
>
> https://travis-ci.org/zentol/flink/jobs/407021898
> {code}
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 13.082 sec <<< FAILURE! - in org.apache.flink.streaming.connectors.fs.bucketing.BucketingSinkFaultToleranceITCase
> runCheckpointedProgram(org.apache.flink.streaming.connectors.fs.bucketing.BucketingSinkFaultToleranceITCase) Time elapsed: 5.696 sec <<< FAILURE!
> java.lang.AssertionError: Read line does not match expected pattern.
> at org.junit.Assert.fail(Assert.java:88)
> at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSinkFaultToleranceITCase.postSubmit(BucketingSinkFaultToleranceITCase.java:182)
> {code}
[GitHub] alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249653432

## File path: flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/internal/Kafka011ConsumerRecord.java

@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.streaming.connectors.kafka.internal;
+
+import org.apache.flink.shaded.guava18.com.google.common.base.Function;
+import org.apache.flink.shaded.guava18.com.google.common.collect.Iterables;
+
+import org.apache.kafka.clients.consumer.ConsumerRecord;
+import org.apache.kafka.common.header.Header;
+
+import javax.annotation.Nonnull;
+import javax.annotation.Nullable;
+
+import java.util.AbstractMap;
+import java.util.Map;
+
+/**
+ * Extends base Kafka09ConsumerRecord to provide access to Kafka headers.

Review comment: Done - a6065aa
[GitHub] Aitozi commented on issue #4514: [FLINK-7384][cep] Unify event and processing time handling in the Abs…
Aitozi commented on issue #4514: [FLINK-7384][cep] Unify event and processing time handling in the Abs…
URL: https://github.com/apache/flink/pull/4514#issuecomment-456284959

Hi @dawidwys, I have gone through this PR, and I think the unified solution is clearer. It might be better to check the timeCharacteristic in `onTimer`, because I think we just need to `advanceTime` under processing time and do not have to try to fetch from the `elementqueue`. Also, why was this PR not checked in? Is something blocking it?
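The guard the comment suggests, branching on the time characteristic inside the timer callback so that processing time skips the element-queue lookup, can be sketched abstractly. All names here are hypothetical stand-ins, not Flink's CEP operator code:

```java
// Illustrative sketch of "check the timeCharacteristic in onTimer":
// under processing time there are no buffered elements to drain, so the
// operator can advance time directly. Names are hypothetical.
public class TimerGuardSketch {
    enum TimeCharacteristic { PROCESSING_TIME, EVENT_TIME }

    static String onTimer(TimeCharacteristic tc, long timestamp) {
        if (tc == TimeCharacteristic.PROCESSING_TIME) {
            // Processing time: just advance the NFA's clock.
            return "advanceTime(" + timestamp + ")";
        }
        // Event time: drain the element queue up to the watermark first.
        return "processQueuedElementsUpTo(" + timestamp + ")";
    }
}
```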
[GitHub] Aitozi removed a comment on issue #7543: [FLINK-10996]Fix the possible state leak with CEP processing time model
Aitozi removed a comment on issue #7543: [FLINK-10996]Fix the possible state leak with CEP processing time model
URL: https://github.com/apache/flink/pull/7543#issuecomment-456284633

Hi @dawidwys, I have gone through `FLINK-7384`, and I think the unified solution is clearer. It might be better to check the timeCharacteristic in `onTimer`, because I think we just need to `advanceTime` under processing time and do not have to try to fetch from the `elementqueue`. Also, why was that PR not checked in? Is something blocking it?
[GitHub] Aitozi commented on issue #7543: [FLINK-10996]Fix the possible state leak with CEP processing time model
Aitozi commented on issue #7543: [FLINK-10996]Fix the possible state leak with CEP processing time model
URL: https://github.com/apache/flink/pull/7543#issuecomment-456284633

Hi @dawidwys, I have gone through `FLINK-7384`, and I think the unified solution is clearer. It might be better to check the timeCharacteristic in `onTimer`, because I think we just need to `advanceTime` under processing time and do not have to try to fetch from the `elementqueue`. Also, why was that PR not checked in? Is something blocking it?
[GitHub] alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249637189

## File path: flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java

@@ -82,93 +84,110 @@ */ @Internal public abstract class FlinkKafkaConsumerBase extends RichParallelSourceFunction implements - CheckpointListener, - ResultTypeQueryable, - CheckpointedFunction { + CheckpointListener,

Review comment: Yes, sorry. It looks like I unintentionally re-formatted it. Will revert.
[jira] [Created] (FLINK-11403) Remove ResultPartitionConsumableNotifier from ResultPartition
zhijiang created FLINK-11403:
Summary: Remove ResultPartitionConsumableNotifier from ResultPartition
Key: FLINK-11403
URL: https://issues.apache.org/jira/browse/FLINK-11403
Project: Flink
Issue Type: Sub-task
Components: Network
Reporter: zhijiang
Assignee: zhijiang
Fix For: 1.8.0

This is a precondition for introducing a pluggable {{ShuffleService}} on the TM side.

In the current process of creating a {{ResultPartition}}, the {{ResultPartitionConsumableNotifier}}, regarded as a TM-level component, has to be passed into the constructor. In order to create a {{ResultPartition}} easily from the {{ShuffleService}}, the required information should be covered by the {{ResultPartitionDeploymentDescriptor}} as much as possible, so we can remove this notifier from the constructor. It is also reasonable to notify of consumable partitions via {{TaskActions}}, which is already covered in {{ResultPartition}}.
[GitHub] Myasuka opened a new pull request #7525: [FLINK-11363][test] Check and remove TaskManagerConfigurationTest
Myasuka opened a new pull request #7525: [FLINK-11363][test] Check and remove TaskManagerConfigurationTest
URL: https://github.com/apache/flink/pull/7525

## What is the purpose of the change

Check whether `TaskManagerConfigurationTest` contains any tests relevant to the new code base and then remove the test.

## Brief change log

`TaskManagerConfigurationTest` contains tests for `TaskManager` configuration of host name, port, and filesystem. The host-name and port tests could be merged into tests for `TaskManagerRunner`, for which I created a new test named `TaskManagerRunnerConfigurationTest`.

## Verifying this change

This change added tests and can be verified as follows:

- `TaskManagerRunnerConfigurationTest` tests `TaskManagerRunner` with configuration of host name, port, and filesystem.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
- The S3 file system connector: no

## Documentation

- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable
[GitHub] Myasuka closed pull request #7525: [FLINK-11363][test] Check and remove TaskManagerConfigurationTest
Myasuka closed pull request #7525: [FLINK-11363][test] Check and remove TaskManagerConfigurationTest
URL: https://github.com/apache/flink/pull/7525
[GitHub] alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
URL: https://github.com/apache/flink/pull/6615#discussion_r249637422

## File path: flink-connectors/flink-connector-kafka-0.8/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/SimpleConsumerThread.java

@@ -370,8 +370,8 @@ else if (partitionsRemoved) { keyPayload.get(keyBytes); } - final T value = deserializer.deserialize(keyBytes, valueBytes, - currentPartition.getTopic(), currentPartition.getPartition(), offset); + final T value = deserializer.deserialize(

Review comment: Thank you for the suggestion, I will try it.
[jira] [Commented] (FLINK-11374) See more failover and can filter by time range
[ https://issues.apache.org/jira/browse/FLINK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748360#comment-16748360 ] lining commented on FLINK-11374: List detailed exceptions (up to 100) and support viewing them by page: !image-2019-01-22-11-40-53-135.png! > See more failover and can filter by time range > -- > > Key: FLINK-11374 > URL: https://issues.apache.org/jira/browse/FLINK-11374 > Project: Flink > Issue Type: Improvement > Components: REST, Webfrontend >Reporter: lining >Assignee: lining >Priority: Major > Attachments: image-2019-01-22-11-40-53-135.png > > > Now failover just shows a limited number of the latest task failovers. If a task has > failed many times, we can not see the earlier failovers. Can we add a filter > by time to see failovers which contain the task attempt fail msg. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11374) See more failover and can filter by time range
[ https://issues.apache.org/jira/browse/FLINK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748362#comment-16748362 ] lining commented on FLINK-11374: Hi, [~till.rohrmann]. I have added some details about this. What do you think about it? > See more failover and can filter by time range > -- > > Key: FLINK-11374 > URL: https://issues.apache.org/jira/browse/FLINK-11374 > Project: Flink > Issue Type: Improvement > Components: REST, Webfrontend >Reporter: lining >Assignee: lining >Priority: Major > Attachments: image-2019-01-22-11-40-53-135.png, > image-2019-01-22-11-42-33-808.png > > > Now failover just shows a limited number of the latest task failovers. If a task has > failed many times, we can not see the earlier failovers. Can we add a filter > by time to see failovers which contain the task attempt fail msg.
[GitHub] leesf commented on issue #7536: [hotfix][docs] Remove redundant symbols
leesf commented on issue #7536: [hotfix][docs] Remove redundant symbols URL: https://github.com/apache/flink/pull/7536#issuecomment-456261374 cc @GJL @zentol
[jira] [Updated] (FLINK-11374) See more failover and can filter by time range
[ https://issues.apache.org/jira/browse/FLINK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lining updated FLINK-11374: --- Attachment: image-2019-01-22-11-42-33-808.png > See more failover and can filter by time range > -- > > Key: FLINK-11374 > URL: https://issues.apache.org/jira/browse/FLINK-11374 > Project: Flink > Issue Type: Improvement > Components: REST, Webfrontend >Reporter: lining >Assignee: lining >Priority: Major > Attachments: image-2019-01-22-11-40-53-135.png, > image-2019-01-22-11-42-33-808.png > > > Now failover just shows a limited number of the latest task failovers. If a task has > failed many times, we can not see the earlier failovers. Can we add a filter > by time to see failovers which contain the task attempt fail msg.
[jira] [Commented] (FLINK-11374) See more failover and can filter by time range
[ https://issues.apache.org/jira/browse/FLINK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748361#comment-16748361 ] lining commented on FLINK-11374: Aggregate by vertex: !image-2019-01-22-11-42-33-808.png! > See more failover and can filter by time range > -- > > Key: FLINK-11374 > URL: https://issues.apache.org/jira/browse/FLINK-11374 > Project: Flink > Issue Type: Improvement > Components: REST, Webfrontend >Reporter: lining >Assignee: lining >Priority: Major > Attachments: image-2019-01-22-11-40-53-135.png > > > Now failover just shows a limited number of the latest task failovers. If a task has > failed many times, we can not see the earlier failovers. Can we add a filter > by time to see failovers which contain the task attempt fail msg.
[jira] [Updated] (FLINK-11374) See more failover and can filter by time range
[ https://issues.apache.org/jira/browse/FLINK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lining updated FLINK-11374: --- Attachment: image-2019-01-22-11-40-53-135.png > See more failover and can filter by time range > -- > > Key: FLINK-11374 > URL: https://issues.apache.org/jira/browse/FLINK-11374 > Project: Flink > Issue Type: Improvement > Components: REST, Webfrontend >Reporter: lining >Assignee: lining >Priority: Major > Attachments: image-2019-01-22-11-40-53-135.png > > > Now failover just shows a limited number of the latest task failovers. If a task has > failed many times, we can not see the earlier failovers. Can we add a filter > by time to see failovers which contain the task attempt fail msg.
[GitHub] Clarkkkkk closed pull request #6628: [FLINK-10245] [Streaming Connector] Add Pojo, Tuple, Row and Scala Product DataStream Sink for HBase
Clarkkkkk closed pull request #6628: [FLINK-10245] [Streaming Connector] Add Pojo, Tuple, Row and Scala Product DataStream Sink for HBase URL: https://github.com/apache/flink/pull/6628
[jira] [Commented] (FLINK-11380) YarnFlinkResourceManagerTest test case crashed
[ https://issues.apache.org/jira/browse/FLINK-11380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748350#comment-16748350 ] Yun Tang commented on FLINK-11380: -- Another instance https://api.travis-ci.org/v3/job/481937341/log.txt > YarnFlinkResourceManagerTest test case crashed > --- > > Key: FLINK-11380 > URL: https://issues.apache.org/jira/browse/FLINK-11380 > Project: Flink > Issue Type: Test > Components: Tests >Reporter: vinoyang >Priority: Critical > Labels: test-stability > Fix For: 1.8.0 > > > context: > {code:java} > 17:18:44.415 [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.22.1:test (default-test) on > project flink-yarn_2.11: There are test failures. > 17:18:44.415 [ERROR] > 17:18:44.415 [ERROR] Please refer to > /home/travis/build/apache/flink/flink-yarn/target/surefire-reports for the > individual test results. > 17:18:44.415 [ERROR] Please refer to dump files (if any exist) [date].dump, > [date]-jvmRun[N].dump and [date].dumpstream. > 17:18:44.415 [ERROR] ExecutionException The forked VM terminated without > properly saying goodbye. VM crash or System.exit called? 
> 17:18:44.415 [ERROR] Command was /bin/sh -c cd > /home/travis/build/apache/flink/flink-yarn && > /usr/lib/jvm/java-8-oracle/jre/bin/java -Xms256m -Xmx2048m -Dmvn.forkNumber=2 > -XX:+UseG1GC -jar > /home/travis/build/apache/flink/flink-yarn/target/surefire/surefirebooter3487840902331471745.jar > /home/travis/build/apache/flink/flink-yarn/target/surefire > 2019-01-16T17-02-23_939-jvmRun2 surefire3706271590182708448tmp > surefire_332496616764820906947tmp > 17:18:44.416 [ERROR] Error occurred in starting fork, check output in log > 17:18:44.416 [ERROR] Process Exit Code: 243 > 17:18:44.416 [ERROR] Crashed tests: > 17:18:44.416 [ERROR] org.apache.flink.yarn.YarnFlinkResourceManagerTest > 17:18:44.416 [ERROR] > org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? > 17:18:44.416 [ERROR] Command was /bin/sh -c cd > /home/travis/build/apache/flink/flink-yarn && > /usr/lib/jvm/java-8-oracle/jre/bin/java -Xms256m -Xmx2048m -Dmvn.forkNumber=2 > -XX:+UseG1GC -jar > /home/travis/build/apache/flink/flink-yarn/target/surefire/surefirebooter3487840902331471745.jar > /home/travis/build/apache/flink/flink-yarn/target/surefire > 2019-01-16T17-02-23_939-jvmRun2 surefire3706271590182708448tmp > surefire_332496616764820906947tmp > 17:18:44.416 [ERROR] Error occurred in starting fork, check output in log > 17:18:44.416 [ERROR] Process Exit Code: 243 > 17:18:44.416 [ERROR] Crashed tests: > 17:18:44.416 [ERROR] org.apache.flink.yarn.YarnFlinkResourceManagerTest > 17:18:44.416 [ERROR] at > org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:510) > 17:18:44.416 [ERROR] at > org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkOnceMultiple(ForkStarter.java:382) > 17:18:44.416 [ERROR] at > org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:297) > 17:18:44.416 [ERROR] at > 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:246) > 17:18:44.416 [ERROR] at > org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1183) > 17:18:44.416 [ERROR] at > org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1011) > 17:18:44.416 [ERROR] at > org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:857) > 17:18:44.416 [ERROR] at > org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:132) > 17:18:44.416 [ERROR] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208) > 17:18:44.416 [ERROR] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153) > 17:18:44.416 [ERROR] at > org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145) > 17:18:44.416 [ERROR] at > org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116) > 17:18:44.416 [ERROR] at > org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80) > 17:18:44.416 [ERROR] at > org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51) > 17:18:44.416 [ERROR] at > org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:120) > 17:18:44.416 [ERROR] at > org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:355) > 17:18:44.416 [ERROR] at >
[jira] [Closed] (FLINK-6101) GroupBy fields with arithmetic expression (include UDF) can not be selected
[ https://issues.apache.org/jira/browse/FLINK-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lincoln.lee closed FLINK-6101. -- Resolution: Later > GroupBy fields with arithmetic expression (include UDF) can not be selected > --- > > Key: FLINK-6101 > URL: https://issues.apache.org/jira/browse/FLINK-6101 > Project: Flink > Issue Type: Bug > Components: Table API SQL >Reporter: lincoln.lee >Assignee: lincoln.lee >Priority: Minor > > currently the TableAPI do not support selecting GroupBy fields with > expression either using original field name or the expression > {code} > val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, > 'd, 'e) > .groupBy('e, 'b % 3) > .select('b, 'c.min, 'e, 'a.avg, 'd.count) > {code} > caused > {code} > org.apache.flink.table.api.ValidationException: Cannot resolve [b] given > input [e, ('b % 3), TMP_0, TMP_1, TMP_2]. > {code} > (BTW, this syntax is invalid in RDBMS which will indicate the selected column > is invalid in the select list because it is not contained in either an > aggregate function or the GROUP BY clause in SQL Server.) > and > {code} > val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, > 'd, 'e) > .groupBy('e, 'b % 3) > .select('b%3, 'c.min, 'e, 'a.avg, 'd.count) > {code} > will also cause > {code} > org.apache.flink.table.api.ValidationException: Cannot resolve [b] given > input [e, ('b % 3), TMP_0, TMP_1, TMP_2]. > {code} > and add an alias in groupBy clause "group(e, 'b%3 as 'b)" work without avail. > and apply an UDF doesn’t work either > {code} >table.groupBy('a, Mod('b, 3)).select('a, Mod('b, 3), 'c.count, 'c.count, > 'd.count, 'e.avg) > org.apache.flink.table.api.ValidationException: Cannot resolve [b] given > input [a, org.apache.flink.table.api.scala.batch.table.Mod$('b, 3), TMP_0, > TMP_1, TMP_2]. 
> {code} > the only way to get this work can be > {code} > val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, > 'd, 'e) > .select('a, 'b%3 as 'b, 'c, 'd, 'e) > .groupBy('e, 'b) > .select('b, 'c.min, 'e, 'a.avg, 'd.count) > {code} > One way to solve this is to add support for aliases in the groupBy clause (it seems a > bit odd against SQL, though the Table API has a different groupBy grammar), > but I prefer to support selecting the original expressions and UDFs in the groupBy > clause (consistent with SQL), > as thus: > {code} > // use expression > val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, > 'd, 'e) > .groupBy('e, 'b % 3) > .select('b % 3, 'c.min, 'e, 'a.avg, 'd.count) > // use UDF > val t = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c, > 'd, 'e) > .groupBy('e, Mod('b,3)) > .select(Mod('b,3), 'c.min, 'e, 'a.avg, 'd.count) > {code} > After having a look into the code, I found there was a problem in the groupBy > implementation: validation hadn't considered the expressions in the groupBy > clause. It should be noted that a table has actually been changed after a > groupBy operation (it is a new Table) and the groupBy keys replace the original > field references in essence. > > What do you think?
[jira] [Closed] (FLINK-11214) Support non mergable aggregates for group windows on batch table
[ https://issues.apache.org/jira/browse/FLINK-11214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu closed FLINK-11214. --- Resolution: Won't Do > Support non mergable aggregates for group windows on batch table > > > Key: FLINK-11214 > URL: https://issues.apache.org/jira/browse/FLINK-11214 > Project: Flink > Issue Type: New Feature > Components: Table API SQL >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > Currently, it does not support non-mergable aggregates for group window on > batch table. It would be nice to support it as many code paths(but not all) > have already considered it.
[GitHub] dianfu closed pull request #7354: [FLINK-11214] [table] Support non mergable aggregates for group windows on batch table
dianfu closed pull request #7354: [FLINK-11214] [table] Support non mergable aggregates for group windows on batch table URL: https://github.com/apache/flink/pull/7354
[GitHub] KarmaGYZ commented on a change in pull request #7260: [hotfix][docs] Fix typo and make improvement in Kafka Connectors doc
KarmaGYZ commented on a change in pull request #7260: [hotfix][docs] Fix typo and make improvement in Kafka Connectors doc URL: https://github.com/apache/flink/pull/7260#discussion_r249611548 ## File path: docs/dev/connectors/kafka.md ## @@ -681,13 +681,13 @@ chosen by passing appropriate `semantic` parameter to the `FlinkKafkaProducer011 `Semantic.EXACTLY_ONCE` mode relies on the ability to commit transactions that were started before taking a checkpoint, after recovering from the said checkpoint. If the time -between Flink application crash and completed restart is larger then Kafka's transaction timeout +between Flink application crash and completed restart is larger than Kafka's transaction timeout there will be data loss (Kafka will automatically abort transactions that exceeded timeout time). Having this in mind, please configure your transaction timeout appropriately to your expected down times. Kafka brokers by default have `transaction.max.timeout.ms` set to 15 minutes. This property will -not allow to set transaction timeouts for the producers larger then it's value. +not allow to set transaction timeouts for the producers larger than it's value. `FlinkKafkaProducer011` by default sets the `transaction.timeout.ms` property in producer config to Review comment: It sounds clearer and more fluent to me. Thanks!
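The timeout interplay described in the quoted doc can be made concrete with a small sketch. The broker address and the one-hour value below are illustrative assumptions, not values from the thread; for the client-side timeout to be accepted, the broker-side `transaction.max.timeout.ms` (15 minutes by default) must be raised to at least the same value.

```java
import java.util.Properties;

public class KafkaTxTimeoutConfig {

    // One hour, expressed in milliseconds (illustrative value).
    static final long ONE_HOUR_MS = 60 * 60 * 1000L;

    // Builds illustrative producer properties that raise the client-side
    // transaction timeout above the expected worst-case restart time.
    public static Properties producerProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        props.setProperty("transaction.timeout.ms", String.valueOf(ONE_HOUR_MS));
        return props;
    }

    public static void main(String[] args) {
        System.out.println(producerProps().getProperty("transaction.timeout.ms"));
    }
}
```

Such a `Properties` object would then be handed to the producer constructor (for `FlinkKafkaProducer011`, as its producer config argument).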
[jira] [Updated] (FLINK-11326) Using offsets to adjust windows to timezones UTC-8 throws IllegalArgumentException
[ https://issues.apache.org/jira/browse/FLINK-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-11326: --- Labels: pull-request-available (was: ) > Using offsets to adjust windows to timezones UTC-8 throws > IllegalArgumentException > -- > > Key: FLINK-11326 > URL: https://issues.apache.org/jira/browse/FLINK-11326 > Project: Flink > Issue Type: Bug > Components: Streaming >Affects Versions: 1.6.3, 1.7.1 >Reporter: TANG Wen-hui >Assignee: Kezhu Wang >Priority: Major > Labels: pull-request-available > > According to comments, we can use offset to adjust windows to timezones other > than UTC-0. For example, in China you would have to specify an offset of > {{Time.hours(-8)}}. > > {code:java} > /** > * Creates a new {@code TumblingEventTimeWindows} {@link WindowAssigner} that > assigns > * elements to time windows based on the element timestamp and offset. > * > * For example, if you want window a stream by hour,but window begins at > the 15th minutes > * of each hour, you can use {@code of(Time.hours(1),Time.minutes(15))},then > you will get > * time windows start at 0:15:00,1:15:00,2:15:00,etc. > * > * Rather than that,if you are living in somewhere which is not using > UTC±00:00 time, > * such as China which is using UTC+08:00,and you want a time window with > size of one day, > * and window begins at every 00:00:00 of local time,you may use {@code > of(Time.days(1),Time.hours(-8))}. > * The parameter of offset is {@code Time.hours(-8))} since UTC+08:00 is 8 > hours earlier than UTC time. > * > * @param size The size of the generated windows. > * @param offset The offset which window start would be shifted by. > * @return The time policy. > */ > public static TumblingEventTimeWindows of(Time size, Time offset) { > return new TumblingEventTimeWindows(size.toMilliseconds(), > offset.toMilliseconds()); > }{code} > > but when offset is smaller than zero, an IllegalArgumentException will be > thrown.
> > {code:java} > protected TumblingEventTimeWindows(long size, long offset) { > if (offset < 0 || offset >= size) { > throw new IllegalArgumentException("TumblingEventTimeWindows parameters must > satisfy 0 <= offset < size"); > } > this.size = size; > this.offset = offset; > }{code}
[GitHub] kezhuw opened a new pull request #7548: [FLINK-11326] Fix forbidden negative offset in window assigners
kezhuw opened a new pull request #7548: [FLINK-11326] Fix forbidden negative offset in window assigners URL: https://github.com/apache/flink/pull/7548 ## What is the purpose of the change Allow negative window offset in window assignment, as the javadoc says. ## Brief change log - Allow negative window offset in window assignment. - Throw `IllegalArgumentException` if offset is out of range for `SlidingEventTimeWindows.of` and `SlidingProcessingTimeWindows.of`. ## Verifying this change This change is already covered by existing tests, and new test cases have been added to cover negative window offsets. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) ## Breaking changes * The `@PublicEvolving` APIs `SlidingEventTimeWindows.of` and `SlidingProcessingTimeWindows.of` previously allowed out-of-range window offsets; this merge request forbids that behavior. This way they behave the same as `TumblingEventTimeWindows.of` and `TumblingProcessingTimeWindows.of`. I think it is easier for callers to understand [sliding windows](https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/operators/windows.html#sliding-windows).
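For intuition about why a negative offset is legitimate, the window-start arithmetic can be sketched standalone. This mirrors the modular formula Flink uses to assign timestamps to offset windows (a self-contained sketch, not a verbatim copy of the `TimeWindow` source):

```java
public class WindowStartMath {

    // Start of the window containing `timestamp`, for tumbling windows of
    // size `windowSize` shifted by `offset`. Works for negative offsets too,
    // as long as |offset| < windowSize.
    public static long windowStart(long timestamp, long offset, long windowSize) {
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }

    public static void main(String[] args) {
        long day = 24 * 60 * 60 * 1000L;
        long minus8h = -8 * 60 * 60 * 1000L;
        long ts = 10 * 60 * 60 * 1000L; // an event at 10:00 UTC on day 0
        // With Time.days(1) windows and a Time.hours(-8) offset, daily window
        // boundaries fall on 16:00 UTC, i.e. midnight in UTC+08:00.
        System.out.println(windowStart(ts, minus8h, day));
    }
}
```

For the example event at 10:00 UTC, the computed start is 16:00 UTC of the previous day, which is exactly 00:00 local time in UTC+08:00, so rejecting negative offsets in the constructor contradicts the documented behavior.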
[GitHub] KarmaGYZ commented on issue #7423: [hotfix][docs] Remove the legacy hint in production-ready doc
KarmaGYZ commented on issue #7423: [hotfix][docs] Remove the legacy hint in production-ready doc URL: https://github.com/apache/flink/pull/7423#issuecomment-456233322 @alpinegizmo Thanks for the review. In the stale version, this sentence pointed to asynchronous snapshotting. However, I kept the sentence because there is an [explanation](https://ci.apache.org/projects/flink/flink-docs-master/ops/state/state_backends.html#the-rocksdbstatebackend) of the throughput limitation of the RocksDB state backend.
[GitHub] stevenzwu commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
stevenzwu commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers URL: https://github.com/apache/flink/pull/6615#discussion_r249607011 ## File path: flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/KeyedDeserializationSchema.java ## @@ -32,6 +34,49 @@ */ @PublicEvolving public interface KeyedDeserializationSchema extends Serializable, ResultTypeQueryable { + /** +* Kafka record to be deserialized. +* Record consists of key,value pair, topic name, partition offset, headers and a timestamp (if available) +*/ + interface Record { Review comment: We can add a new deserialize method to the `KeyedDeserializationSchema` interface with a default implementation that just forwards to the other deserialize method (marked as deprecated), e.g. ```java default T deserialize(ConsumerRecord<byte[], byte[]> consumerRecord) throws IOException { return deserialize(consumerRecord.key(), consumerRecord.value(), consumerRecord.topic(), consumerRecord.partition(), consumerRecord.offset()); } ```
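The compatibility mechanics of this suggestion can be sketched in self-contained form. `SimpleRecord` below is a hypothetical stand-in for Kafka's `ConsumerRecord`, used only so the sketch compiles without a kafka-clients dependency:

```java
import java.io.IOException;

public class ForwardingDefaultDemo {

    // Hypothetical stand-in for org.apache.kafka.clients.consumer.ConsumerRecord.
    static final class SimpleRecord {
        final byte[] key;
        final byte[] value;
        final String topic;
        final int partition;
        final long offset;

        SimpleRecord(byte[] key, byte[] value, String topic, int partition, long offset) {
            this.key = key;
            this.value = value;
            this.topic = topic;
            this.partition = partition;
            this.offset = offset;
        }
    }

    interface KeyedDeserializationSchema<T> {
        // Old method; existing user implementations provide only this one.
        @Deprecated
        T deserialize(byte[] messageKey, byte[] message, String topic, int partition, long offset)
                throws IOException;

        // New method: the default implementation forwards to the old one, so
        // existing implementations keep compiling and behaving as before.
        default T deserialize(SimpleRecord record) throws IOException {
            return deserialize(record.key, record.value, record.topic, record.partition, record.offset);
        }
    }

    // Shows that calling the new record-based method reaches an
    // implementation that only provides the old method.
    public static String demo() {
        KeyedDeserializationSchema<String> schema =
                (key, value, topic, partition, offset) -> new String(value) + "@" + topic + "/" + partition;
        try {
            return schema.deserialize(new SimpleRecord(null, "payload".getBytes(), "t", 3, 42L));
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "payload@t/3"
    }
}
```

The design point is that the new method carries the full record (including headers in the real API) while old implementations remain source- and binary-compatible.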
[GitHub] alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers URL: https://github.com/apache/flink/pull/6615#discussion_r249606898 ## File path: flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer011.java ## @@ -641,6 +641,9 @@ public void invoke(KafkaTransactionState transaction, IN next, Context context) } else { record = new ProducerRecord<>(targetTopic, null, timestamp, serializedKey, serializedValue); } + for (Map.Entry header: schema.headers(next)) { Review comment: Pushed e0ca3e0
[GitHub] alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
alexeyt820 commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers URL: https://github.com/apache/flink/pull/6615#discussion_r249606803 ## File path: flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java ## @@ -251,6 +269,18 @@ public FlinkKafkaConsumerBase( // Configuration // + /** +* Make sure that auto commit is disabled when our offset commit mode is ON_CHECKPOINTS. +* This overwrites whatever setting the user configured in the properties. +* @param properties - Kafka configuration properties to be adjusted +* @param offsetCommitMode offset commit mode +*/ + protected static void adjustAutoCommitConfig(Properties properties, OffsetCommitMode offsetCommitMode) { Review comment: Pushed 3cfda16
[jira] [Updated] (FLINK-11326) Using offsets to adjust windows to timezones UTC-8 throws IllegalArgumentException
[ https://issues.apache.org/jira/browse/FLINK-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kezhu Wang updated FLINK-11326: --- Affects Version/s: 1.7.1 > Using offsets to adjust windows to timezones UTC-8 throws > IllegalArgumentException > -- > > Key: FLINK-11326 > URL: https://issues.apache.org/jira/browse/FLINK-11326 > Project: Flink > Issue Type: Bug > Components: Streaming >Affects Versions: 1.6.3, 1.7.1 >Reporter: TANG Wen-hui >Assignee: Kezhu Wang >Priority: Major > > According to comments, we can use offset to adjust windows to timezones other > than UTC-0. For example, in China you would have to specify an offset of > {{Time.hours(-8)}}. > > {code:java} > /** > * Creates a new {@code TumblingEventTimeWindows} {@link WindowAssigner} that > assigns > * elements to time windows based on the element timestamp and offset. > * > * For example, if you want window a stream by hour,but window begins at > the 15th minutes > * of each hour, you can use {@code of(Time.hours(1),Time.minutes(15))},then > you will get > * time windows start at 0:15:00,1:15:00,2:15:00,etc. > * > * Rather than that,if you are living in somewhere which is not using > UTC±00:00 time, > * such as China which is using UTC+08:00,and you want a time window with > size of one day, > * and window begins at every 00:00:00 of local time,you may use {@code > of(Time.days(1),Time.hours(-8))}. > * The parameter of offset is {@code Time.hours(-8))} since UTC+08:00 is 8 > hours earlier than UTC time. > * > * @param size The size of the generated windows. > * @param offset The offset which window start would be shifted by. > * @return The time policy. > */ > public static TumblingEventTimeWindows of(Time size, Time offset) { > return new TumblingEventTimeWindows(size.toMilliseconds(), > offset.toMilliseconds()); > }{code} > > but when offset is smaller than zero, an IllegalArgumentException will be > thrown.
> > {code:java} > protected TumblingEventTimeWindows(long size, long offset) { > if (offset < 0 || offset >= size) { > throw new IllegalArgumentException("TumblingEventTimeWindows parameters must > satisfy 0 <= offset < size"); > } > this.size = size; > this.offset = offset; > }{code}
[GitHub] stevenzwu commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
stevenzwu commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers URL: https://github.com/apache/flink/pull/6615#discussion_r249606472 ## File path: flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/KeyedDeserializationSchema.java ## @@ -32,6 +34,49 @@ */ @PublicEvolving public interface KeyedDeserializationSchema extends Serializable, ResultTypeQueryable { + /** +* Kafka record to be deserialized. +* Record consists of key,value pair, topic name, partition offset, headers and a timestamp (if available) +*/ + interface Record { Review comment: @alexeyt820 as pointed out by @azagrebin, Kafka 0.8 seems to have a `ConsumerRecord` interface: https://archive.apache.org/dist/kafka/0.8.2-beta/java-doc/org/apache/kafka/clients/consumer/ConsumerRecord.html As Flink is moving in the direction of just one modern Kafka connector, I am also wondering if it is OK to drop the Kafka 0.8 connector?
[jira] [Updated] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders
[ https://issues.apache.org/jira/browse/FLINK-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ufuk Celebi updated FLINK-11402:
--------------------------------
    Component/s: Local Runtime

> User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders
> ----------------------------------------------------------------------------------------
>
>                 Key: FLINK-11402
>                 URL: https://issues.apache.org/jira/browse/FLINK-11402
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination, Local Runtime
>    Affects Versions: 1.7.0
>            Reporter: Ufuk Celebi
>            Priority: Major
>        Attachments: hello-snappy-1.0-SNAPSHOT.jar, hello-snappy.tgz
>
> As reported on the user mailing list thread "[`env.java.opts` not persisting after job canceled or failed and then restarted|https://lists.apache.org/thread.html/37cc1b628e16ca6c0bacced5e825de8057f88a8d601b90a355b6a291@%3Cuser.flink.apache.org%3E]", there can be issues with using native libraries and user code class loading.
>
> h2. Steps to reproduce
>
> I was able to reproduce the issue reported on the mailing list using [snappy-java|https://github.com/xerial/snappy-java] in a user program. Running the attached user program works fine on initial submission, but results in a failure when re-executed.
>
> I'm using Flink 1.7.0 with a standalone cluster started via {{bin/start-cluster.sh}}.
>
> 0. Unpack the attached Maven project and build it using {{mvn clean package}} *or* directly use the attached {{hello-snappy-1.0-SNAPSHOT.jar}}.
>
> 1. Download [snappy-java-1.1.7.2.jar|http://central.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.7.2/snappy-java-1.1.7.2.jar] and unpack libsnappyjava for your system:
> {code}
> jar tf snappy-java-1.1.7.2.jar | grep libsnappy
> ...
> org/xerial/snappy/native/Linux/x86_64/libsnappyjava.so
> ...
> org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib
> ...
> {code}
>
> 2. Configure the system library path to {{libsnappyjava}} in {{flink-conf.yaml}} (the path needs to be adjusted for your system):
> {code}
> env.java.opts: -Djava.library.path=/.../org/xerial/snappy/native/Mac/x86_64
> {code}
>
> 3. Run the attached {{hello-snappy-1.0-SNAPSHOT.jar}}:
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
> Program execution finished
> Job with JobID ae815b918dd7bc64ac8959e4e224f2b4 has finished.
> Job Runtime: 359 ms
> {code}
>
> 4. Rerun the attached {{hello-snappy-1.0-SNAPSHOT.jar}}:
> {code}
> bin/flink run hello-snappy-1.0-SNAPSHOT.jar
> Starting execution of program
>
> The program finished with the following exception:
>
> org.apache.flink.client.program.ProgramInvocationException: Job failed. (JobID: 7d69baca58f33180cb9251449ddcd396)
> 	at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:487)
> 	at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
> 	at com.github.uce.HelloSnappy.main(HelloSnappy.java:18)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
> 	at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
> 	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
> 	at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
> 	at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
> 	at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
> 	at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
> 	at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
> 	at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
> 	at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
> Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
> 	at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
> 	at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265)
> 	... 17 more
> Caused by: java.lang.UnsatisfiedLinkError: Native Library /.../org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib already loaded in another classloader
> {code}
[jira] [Commented] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders
[ https://issues.apache.org/jira/browse/FLINK-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748241#comment-16748241 ]

Ufuk Celebi commented on FLINK-11402:
-------------------------------------

Similar issue for RocksDB (FLINK-5408)
[jira] [Updated] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders
[ https://issues.apache.org/jira/browse/FLINK-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ufuk Celebi updated FLINK-11402:
--------------------------------
    Attachment: hello-snappy.tgz
[jira] [Updated] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders
[ https://issues.apache.org/jira/browse/FLINK-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ufuk Celebi updated FLINK-11402:
--------------------------------
    Attachment: hello-snappy-1.0-SNAPSHOT.jar
[jira] [Created] (FLINK-11402) User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders
Ufuk Celebi created FLINK-11402:
-------------------------------

             Summary: User code can fail with an UnsatisfiedLinkError in the presence of multiple classloaders
                 Key: FLINK-11402
                 URL: https://issues.apache.org/jira/browse/FLINK-11402
             Project: Flink
          Issue Type: Bug
          Components: Distributed Coordination
    Affects Versions: 1.7.0
            Reporter: Ufuk Celebi
         Attachments: hello-snappy-1.0-SNAPSHOT.jar, hello-snappy.tgz

The description and reproduction steps are identical to those quoted in the FLINK-11402 updates above. On re-execution the job fails with:

{code}
Caused by: java.lang.UnsatisfiedLinkError: Native Library /.../org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib already loaded in another classloader
	at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1907)
	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1861)
	at java.lang.Runtime.loadLibrary0(Runtime.java:870)
	at java.lang.System.loadLibrary(System.java:1122)
	at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:182)
	at org.xerial.snappy.SnappyLoader.loadSnappyApi(SnappyLoader.java:154)
	at org.xerial.snappy.Snappy.<clinit>(Snappy.java:47)
{code}
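The UnsatisfiedLinkError above follows from a JVM rule: a JNI library can be bound to at most one classloader at a time, while each job (re)submission runs user code in a fresh classloader. The self-contained sketch below is illustrative only (not Flink code, and it uses class identity rather than an actual native library): the same class bytes defined by two loaders yield two distinct `Class` objects, so on resubmission the `System.loadLibrary` call comes from a "different" class whose loader does not own the already-bound library.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassLoaderIsolationDemo {

    /** Stand-in for a class whose static initializer would call System.loadLibrary. */
    public static class Probe { }

    /** A loader that does not delegate application classes to a parent. */
    static final class IsolatingLoader extends ClassLoader {
        IsolatingLoader() { super(null); } // bootstrap-only parent
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Read the compiled bytes of Probe from the classpath. */
    static byte[] probeBytes() throws IOException {
        try (InputStream in = ClassLoaderIsolationDemo.class
                 .getResourceAsStream("ClassLoaderIsolationDemo$Probe.class");
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) > 0; ) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = probeBytes();
        // Same bytes, same name, two loaders -> two distinct Class objects.
        Class<?> a = new IsolatingLoader().define("ClassLoaderIsolationDemo$Probe", bytes);
        Class<?> b = new IsolatingLoader().define("ClassLoaderIsolationDemo$Probe", bytes);
        System.out.println("sameClass=" + (a == b));
        System.out.println("sameName=" + a.getName().equals(b.getName()));
    }
}
```

This is also why the common workaround is to put the jar that loads the native library on the cluster's system classpath (e.g. Flink's `lib/` directory), so a single parent classloader binds the library exactly once.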
[jira] [Updated] (FLINK-11401) Allow compression on ParquetBulkWriter
[ https://issues.apache.org/jira/browse/FLINK-11401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-11401:
-----------------------------------
    Labels: pull-request-available  (was: )

> Allow compression on ParquetBulkWriter
> --------------------------------------
>
>                 Key: FLINK-11401
>                 URL: https://issues.apache.org/jira/browse/FLINK-11401
>             Project: Flink
>          Issue Type: Improvement
>    Affects Versions: 1.7.1
>            Reporter: Fokko Driesprong
>            Assignee: Fokko Driesprong
>            Priority: Major
>             Labels: pull-request-available
>             Fix For: 1.8.0
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[GitHub] Fokko opened a new pull request #7547: [FLINK-11401] Allow setting of compression on ParquetWriter
Fokko opened a new pull request #7547: [FLINK-11401] Allow setting of compression on ParquetWriter
URL: https://github.com/apache/flink/pull/7547

## What is the purpose of the change

*(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)*

## Brief change log

*(for example:)*
  - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact*
  - *Deployments RPC transmits only the blob storage reference*
  - *TaskManagers retrieve the TaskInfo from the blob cache*

## Verifying this change

*(Please pick either of the following options)*

This change is a trivial rework / code cleanup without any test coverage.

*(or)*

This change is already covered by existing tests, such as *(please describe tests)*.

*(or)*

This change added tests and can be verified as follows:

*(example:)*
  - *Added integration tests for end-to-end deployment with large payloads (100MB)*
  - *Extended integration test for recovery after master (JobManager) failure*
  - *Added test that validates that TaskInfo is transferred only once across recoveries*
  - *Manually verified the change by running a 4 node cluster with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.*

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes / no)
  - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no)
  - The serializers: (yes / no / don't know)
  - The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  - The S3 file system connector: (yes / no / don't know)

## Documentation

  - Does this pull request introduce a new feature? (yes / no)
  - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
[jira] [Created] (FLINK-11401) Allow compression on ParquetBulkWriter
Fokko Driesprong created FLINK-11401:
------------------------------------

             Summary: Allow compression on ParquetBulkWriter
                 Key: FLINK-11401
                 URL: https://issues.apache.org/jira/browse/FLINK-11401
             Project: Flink
          Issue Type: Improvement
    Affects Versions: 1.7.1
            Reporter: Fokko Driesprong
            Assignee: Fokko Driesprong
             Fix For: 1.8.0
[GitHub] bowenli86 commented on issue #7011: [FLINK-10768][Table & SQL] Move external catalog related code from TableEnvironment to CatalogManager
bowenli86 commented on issue #7011: [FLINK-10768][Table & SQL] Move external catalog related code from TableEnvironment to CatalogManager
URL: https://github.com/apache/flink/pull/7011#issuecomment-456154882

closing this PR in favor of new plan
[GitHub] bowenli86 commented on issue #6997: [FLINK-10697][Table & SQL] Create an in-memory catalog that stores Flink's meta objects
bowenli86 commented on issue #6997: [FLINK-10697][Table & SQL] Create an in-memory catalog that stores Flink's meta objects
URL: https://github.com/apache/flink/pull/6997#issuecomment-456154907

closing this PR in favor of new plan
[GitHub] bowenli86 commented on issue #7012: [FLINK-10769][Table & SQL] port InMemoryExternalCatalog to java
bowenli86 commented on issue #7012: [FLINK-10769][Table & SQL] port InMemoryExternalCatalog to java
URL: https://github.com/apache/flink/pull/7012#issuecomment-456154851

closing this PR in favor of new plan
[GitHub] bowenli86 closed pull request #7011: [FLINK-10768][Table & SQL] Move external catalog related code from TableEnvironment to CatalogManager
bowenli86 closed pull request #7011: [FLINK-10768][Table & SQL] Move external catalog related code from TableEnvironment to CatalogManager
URL: https://github.com/apache/flink/pull/7011
[GitHub] bowenli86 closed pull request #6970: [FLINK-10696][Table API & SQL]Add APIs to ExternalCatalog, CrudExternalCatalog and InMemoryCrudExternalCatalog for views and UDFs
bowenli86 closed pull request #6970: [FLINK-10696][Table API & SQL] Add APIs to ExternalCatalog, CrudExternalCatalog and InMemoryCrudExternalCatalog for views and UDFs
URL: https://github.com/apache/flink/pull/6970
[GitHub] bowenli86 closed pull request #7012: [FLINK-10769][Table & SQL] port InMemoryExternalCatalog to java
bowenli86 closed pull request #7012: [FLINK-10769][Table & SQL] port InMemoryExternalCatalog to java
URL: https://github.com/apache/flink/pull/7012
[GitHub] GJL opened a new pull request #7546: [FLINK-11390][tests] Port YARNSessionCapacitySchedulerITCase#testTaskManagerFailure to new code base
GJL opened a new pull request #7546: [FLINK-11390][tests] Port YARNSessionCapacitySchedulerITCase#testTaskManagerFailure to new code base
URL: https://github.com/apache/flink/pull/7546

## What is the purpose of the change

*Port `YARNSessionCapacitySchedulerITCase#testTaskManagerFailure` to new code base.*

cc: @tillrohrmann

## Brief change log

  - *Extract HA test out of `testTaskManagerFailure`*
  - *Rename test `testTaskManagerFailure` so that the name reflects what is actually asserted*

## Verifying this change

This change is already covered by existing tests, such as *itself*.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (yes / **no**)
  - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**)
  - The serializers: (yes / **no** / don't know)
  - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know)
  - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
  - The S3 file system connector: (yes / **no** / don't know)

## Documentation

  - Does this pull request introduce a new feature? (yes / **no**)
  - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)
[jira] [Commented] (FLINK-11400) JobManagerRunner does not wait for suspension of JobMaster
[ https://issues.apache.org/jira/browse/FLINK-11400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16748078#comment-16748078 ]

TisonKun commented on FLINK-11400:
----------------------------------

Do you mean introduce a {{recoveryOperation}} in {{JobManagerRunner}}?

> JobManagerRunner does not wait for suspension of JobMaster
> ----------------------------------------------------------
>
>                 Key: FLINK-11400
>                 URL: https://issues.apache.org/jira/browse/FLINK-11400
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>    Affects Versions: 1.6.3, 1.7.1, 1.8.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Major
>             Fix For: 1.8.0
>
> The {{JobManagerRunner}} does not wait for the suspension of the {{JobMaster}} to finish before granting leadership again. This can lead to a state where the {{JobMaster}} tries to start the {{ExecutionGraph}} but the {{SlotPool}} is still stopped.
> I suggest linearizing the leadership operations (granting and revoking leadership) similarly to the {{Dispatcher}}.
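The linearization proposed in FLINK-11400 can be sketched with `CompletableFuture` chaining: every leadership operation is appended to a single sequential future, so granting leadership cannot start while a revocation (and the `JobMaster` suspension it triggers) is still in flight. The class and method names below are illustrative assumptions, not the actual Flink implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class LeadershipSerializationDemo {

    // Every leadership operation is chained onto this future, so a new
    // grant can only start once the previous revocation has finished.
    private CompletableFuture<Void> sequentialOperation =
        CompletableFuture.completedFuture(null);

    final List<String> events = new ArrayList<>();

    /** Revocation: starts suspension, then waits for it to complete. */
    void revokeLeadership(CompletableFuture<Void> jobMasterSuspension) {
        sequentialOperation = sequentialOperation
            .thenRun(() -> events.add("suspend-started"))
            .thenCompose(ignored -> jobMasterSuspension)
            .thenRun(() -> events.add("suspend-finished"));
    }

    /** Granting: queued behind whatever operation is still in flight. */
    void grantLeadership() {
        sequentialOperation = sequentialOperation.thenRun(() -> events.add("granted"));
    }

    public static void main(String[] args) {
        LeadershipSerializationDemo demo = new LeadershipSerializationDemo();
        CompletableFuture<Void> suspension = new CompletableFuture<>();

        demo.revokeLeadership(suspension);
        demo.grantLeadership(); // does NOT run yet: suspension is still pending

        System.out.println(demo.events);

        suspension.complete(null); // suspension done -> queued grant now runs
        System.out.println(demo.events);
    }
}
```

Without the `thenCompose` on the suspension future, "granted" would run immediately after revocation is *initiated*, which is exactly the race described in the issue (leadership granted while the `SlotPool` is still stopped).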
[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers URL: https://github.com/apache/flink/pull/6615#discussion_r249512219 ## File path: flink-connectors/flink-connector-kafka-0.8/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/SimpleConsumerThread.java ## @@ -370,8 +370,8 @@ else if (partitionsRemoved) { keyPayload.get(keyBytes); } - final T value = deserializer.deserialize(keyBytes, valueBytes, - currentPartition.getTopic(), currentPartition.getPartition(), offset); + final T value = deserializer.deserialize( Review comment: Here we could try: ``` ConsumerRecord<byte[], byte[]> consumerRecord = new ConsumerRecord<>(currentPartition.getTopic(), currentPartition.getPartition(), keyBytes, valueBytes, offset); final T value = deserializer.deserialize(consumerRecord); ``` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers URL: https://github.com/apache/flink/pull/6615#discussion_r249368384 ## File path: flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/internal/Kafka011ConsumerRecord.java ## @@ -0,0 +1,57 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.streaming.connectors.kafka.internal; + +import org.apache.flink.shaded.guava18.com.google.common.base.Function; +import org.apache.flink.shaded.guava18.com.google.common.collect.Iterables; + +import org.apache.kafka.clients.consumer.ConsumerRecord; +import org.apache.kafka.common.header.Header; + +import javax.annotation.Nonnull; +import javax.annotation.Nullable; + +import java.util.AbstractMap; +import java.util.Map; + +/** + * Extends base Kafka09ConsumerRecord to provide access to Kafka headers. Review comment: could you put into `{@link Kafka09ConsumerRecord}` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers URL: https://github.com/apache/flink/pull/6615#discussion_r249478973 ## File path: flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/util/serialization/KeyedDeserializationSchema.java ## @@ -32,6 +34,49 @@ */ @PublicEvolving public interface KeyedDeserializationSchema<T> extends Serializable, ResultTypeQueryable<T> { + /** +* Kafka record to be deserialized. +* Record consists of key,value pair, topic name, partition offset, headers and a timestamp (if available) +*/ + interface Record { Review comment: kafka-clients 0.8 actually has `ConsumerRecord`, just `SimpleConsumerThread` does not use it, but it seems to be possible to manually wrap `MessageAndOffset` into that `ConsumerRecord`. I would give it a try. It seems to be the simpler option at the moment and would eliminate the currently introduced inheritance for the sake of wrapping. Not sure, though, how big the risk is that the Kafka API changes again. The Flink wrapping, as it is now, seems to be a safer option but it also adds some performance overhead. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers URL: https://github.com/apache/flink/pull/6615#discussion_r249510977 ## File path: flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java ## @@ -82,93 +84,110 @@ */ @Internal public abstract class FlinkKafkaConsumerBase<T> extends RichParallelSourceFunction<T> implements - CheckpointListener, - ResultTypeQueryable<T>, - CheckpointedFunction { + CheckpointListener, Review comment: This class contains a lot of unrelated changes, which makes it more difficult to review. I would suggest either a follow-up PR for them or at least putting them into a separate commit. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers URL: https://github.com/apache/flink/pull/6615#discussion_r249482757 ## File path: flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/internal/Kafka011ConsumerRecord.java ## @@ -0,0 +1,57 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.streaming.connectors.kafka.internal; + +import org.apache.flink.shaded.guava18.com.google.common.base.Function; +import org.apache.flink.shaded.guava18.com.google.common.collect.Iterables; + +import org.apache.kafka.clients.consumer.ConsumerRecord; +import org.apache.kafka.common.header.Header; + +import javax.annotation.Nonnull; +import javax.annotation.Nullable; + +import java.util.AbstractMap; +import java.util.Map; + +/** + * Extends base Kafka09ConsumerRecord to provide access to Kafka headers. + */ +class Kafka011ConsumerRecord extends Kafka09ConsumerRecord { + /** +* Wraps {@link Header} as Map.Entry. 
+*/ + private static final Function<Header, Map.Entry<String, byte[]>> HEADER_TO_MAP_ENTRY_FUNCTION = + new Function<Header, Map.Entry<String, byte[]>>() { + @Nonnull + @Override + public Map.Entry<String, byte[]> apply(@Nullable Header header) { + return new AbstractMap.SimpleImmutableEntry<>(header.key(), header.value()); + } + }; + + Kafka011ConsumerRecord(ConsumerRecord<byte[], byte[]> consumerRecord) { + super(consumerRecord); + } + + @Override + public Iterable<Map.Entry<String, byte[]>> headers() { + return Iterables.transform(consumerRecord.headers(), HEADER_TO_MAP_ENTRY_FUNCTION); Review comment: Could we avoid relying on non-standard java libraries like guava? The record/headers processing is also on the time critical path of per record latency. I would suggest to implement our own Iterable wrapper which does only this header wrapping. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
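The Guava-free wrapper azagrebin suggests could look like the sketch below. The minimal `Header` interface here is a hypothetical stand-in for `org.apache.kafka.common.header.Header` so the snippet is self-contained; the class name `HeaderEntryIterable` is made up for illustration and is not Flink code.

```java
import java.util.AbstractMap;
import java.util.Iterator;
import java.util.Map;

// Minimal stand-in for org.apache.kafka.common.header.Header (assumption for this sketch).
interface Header {
    String key();
    byte[] value();
}

// Guava-free lazy view: wraps each Header as a Map.Entry<String, byte[]> during
// iteration, avoiding an extra library on the per-record hot path.
final class HeaderEntryIterable implements Iterable<Map.Entry<String, byte[]>> {
    private final Iterable<Header> headers;

    HeaderEntryIterable(Iterable<Header> headers) {
        this.headers = headers;
    }

    @Override
    public Iterator<Map.Entry<String, byte[]>> iterator() {
        final Iterator<Header> delegate = headers.iterator();
        return new Iterator<Map.Entry<String, byte[]>>() {
            @Override
            public boolean hasNext() {
                return delegate.hasNext();
            }

            @Override
            public Map.Entry<String, byte[]> next() {
                Header header = delegate.next();
                return new AbstractMap.SimpleImmutableEntry<>(header.key(), header.value());
            }
        };
    }
}
```

With something like this, `headers()` could return `new HeaderEntryIterable(consumerRecord.headers())` and the shaded-Guava imports would no longer be needed.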
[jira] [Created] (FLINK-11400) JobManagerRunner does not wait for suspension of JobMaster
Till Rohrmann created FLINK-11400: - Summary: JobManagerRunner does not wait for suspension of JobMaster Key: FLINK-11400 URL: https://issues.apache.org/jira/browse/FLINK-11400 Project: Flink Issue Type: Bug Components: Distributed Coordination Affects Versions: 1.7.1, 1.6.3, 1.8.0 Reporter: Till Rohrmann Assignee: Till Rohrmann Fix For: 1.8.0 The {{JobManagerRunner}} does not wait for the suspension of the {{JobMaster}} to finish before granting leadership again. This can lead to a state where the {{JobMaster}} tries to start the {{ExecutionGraph}} but the {{SlotPool}} is still stopped. I suggest to linearize the leadership operations (granting and revoking leadership) similarly to the {{Dispatcher}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
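The linearization the issue proposes (chaining leadership operations the way the {{Dispatcher}} serializes its recovery operations) can be sketched roughly as follows. The class and method names are made up for illustration and this is not Flink's actual implementation; it only shows the chaining idea, and note that an exceptionally completed operation would break the chain unless handled.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Sketch: serialize grant/revoke-leadership actions by composing each one onto
// the completion of the previous, so a new leadership grant cannot start while
// the previous JobMaster suspension is still in flight.
class SequentialLeadershipOperations {
    private CompletableFuture<Void> chain = CompletableFuture.completedFuture(null);

    /** Runs the given async operation only after all previously submitted ones finished. */
    synchronized CompletableFuture<Void> submit(Supplier<CompletableFuture<Void>> operation) {
        chain = chain.thenCompose(ignored -> operation.get());
        return chain;
    }
}
```

A grant submitted while a revoke is still running would then simply queue behind it instead of racing against the stopped {{SlotPool}}.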
[GitHub] StefanRRichter commented on issue #7009: [FLINK-10712] Support to restore state when using RestartPipelinedRegionStrategy
StefanRRichter commented on issue #7009: [FLINK-10712] Support to restore state when using RestartPipelinedRegionStrategy URL: https://github.com/apache/flink/pull/7009#issuecomment-456105318 I think I agree with the assessment of the existing operators. About adding a `RecoveryMode` to consider, would that mean that we would prevent all jobs that use union state from working with partial recovery? I think if we just consider a few popular operators like `KafkaConsumer`, that would already prevent a lot of jobs from using different recovery modes. I can see that this comes from the concern about existing code that uses union state. However, strictly speaking it should not be a regression because those recovery modes previously did not support state recovery at all. We also cannot prevent users from making wrong implementations, so I feel like a good thing to do is document what to take care of with union state when using such recovery modes. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers
azagrebin commented on a change in pull request #6615: [FLINK-8354] [flink-connectors] Add ability to access and provider Kafka headers URL: https://github.com/apache/flink/pull/6615#discussion_r249367559 ## File path: flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer011.java ## @@ -641,6 +641,9 @@ public void invoke(KafkaTransactionState transaction, IN next, Context context) } else { record = new ProducerRecord<>(targetTopic, null, timestamp, serializedKey, serializedValue); } + for (Map.Entry<String, byte[]> header: schema.headers(next)) { Review comment: space before `:` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Assigned] (FLINK-11326) Using offsets to adjust windows to timezones UTC-8 throws IllegalArgumentException
[ https://issues.apache.org/jira/browse/FLINK-11326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kezhu Wang reassigned FLINK-11326: -- Assignee: Kezhu Wang > Using offsets to adjust windows to timezones UTC-8 throws > IllegalArgumentException > -- > > Key: FLINK-11326 > URL: https://issues.apache.org/jira/browse/FLINK-11326 > Project: Flink > Issue Type: Bug > Components: Streaming >Affects Versions: 1.6.3 >Reporter: TANG Wen-hui >Assignee: Kezhu Wang >Priority: Major > > According to comments, we can use offset to adjust windows to timezones other > than UTC-0. For example, in China you would have to specify an offset of > {{Time.hours(-8)}}. > > {code:java} > /** > * Creates a new {@code TumblingEventTimeWindows} {@link WindowAssigner} that > assigns > * elements to time windows based on the element timestamp and offset. > * > * For example, if you want window a stream by hour,but window begins at > the 15th minutes > * of each hour, you can use {@code of(Time.hours(1),Time.minutes(15))},then > you will get > * time windows start at 0:15:00,1:15:00,2:15:00,etc. > * > * Rather than that,if you are living in somewhere which is not using > UTC±00:00 time, > * such as China which is using UTC+08:00,and you want a time window with > size of one day, > * and window begins at every 00:00:00 of local time,you may use {@code > of(Time.days(1),Time.hours(-8))}. > * The parameter of offset is {@code Time.hours(-8))} since UTC+08:00 is 8 > hours earlier than UTC time. > * > * @param size The size of the generated windows. > * @param offset The offset which window start would be shifted by. > * @return The time policy. > */ > public static TumblingEventTimeWindows of(Time size, Time offset) { > return new TumblingEventTimeWindows(size.toMilliseconds(), > offset.toMilliseconds()); > }{code} > > but when offset is smaller than zero, a IllegalArgumentException will be > throwed. 
> > {code:java} > protected TumblingEventTimeWindows(long size, long offset) { > if (offset < 0 || offset >= size) { > throw new IllegalArgumentException("TumblingEventTimeWindows parameters must > satisfy 0 <= offset < size"); > } > this.size = size; > this.offset = offset; > }{code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
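One conceivable relaxation of the check quoted above, assuming an offset only matters modulo the window size, would be to normalize negative offsets instead of rejecting them. This is only a sketch of a possible fix, not necessarily the one the Flink project chose; the helper class name is made up for illustration.

```java
// Hypothetical helper: map any offset into [0, size), so that a one-day window
// with offset Time.hours(-8) behaves the same as one with offset Time.hours(16).
final class WindowOffsets {
    static long normalizeOffset(long offset, long size) {
        long remainder = offset % size; // in Java the sign of % follows the dividend
        return remainder < 0 ? remainder + size : remainder;
    }
}
```

With such a normalization, the constructor could accept `Time.hours(-8)` for a day-sized window and internally store the equivalent 16-hour offset.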
[GitHub] hequn8128 commented on a change in pull request #6787: [FLINK-8577][table] Implement proctime DataStream to Table upsert conversion
hequn8128 commented on a change in pull request #6787: [FLINK-8577][table] Implement proctime DataStream to Table upsert conversion URL: https://github.com/apache/flink/pull/6787#discussion_r249476414 ## File path: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/api/stream/sql/FromUpsertStreamTest.scala ## @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.table.api.stream.sql + +import org.apache.flink.api.common.typeinfo.BasicTypeInfo +import org.apache.flink.api.java.typeutils.{RowTypeInfo, TupleTypeInfo} +import org.apache.flink.api.scala._ +import org.apache.flink.table.api.Types +import org.apache.flink.table.api.scala._ +import org.apache.flink.table.utils.TableTestUtil.{UpsertTableNode, term, unaryNode} +import org.apache.flink.table.utils.{StreamTableTestUtil, TableTestBase} +import org.apache.flink.types.Row +import org.junit.Test +import org.apache.flink.api.java.tuple.{Tuple2 => JTuple2} +import java.lang.{Boolean => JBool} + +class FromUpsertStreamTest extends TableTestBase { + + private val streamUtil: StreamTableTestUtil = streamTestUtil() + + @Test + def testRemoveUpsertToRetraction() = { +streamUtil.addTableFromUpsert[(Boolean, (Int, String, Long))]( + "MyTable", 'a, 'b.key, 'c, 'proctime.proctime, 'rowtime.rowtime) + +val sql = "SELECT a, b, c, proctime, rowtime FROM MyTable" + +val expected = + unaryNode( +"DataStreamCalc", +UpsertTableNode(0), +term("select", "a", "b", "c", "PROCTIME(proctime) AS proctime", + "CAST(rowtime) AS rowtime") + ) +streamUtil.verifySql(sql, expected) + } + + @Test + def testMaterializeTimeIndicatorAndCalcUpsertToRetractionTranspose() = { Review comment: `Calc` can't always be pushed down(More details in `CalcUpsertToRetractionTransposeRule`), so there are two tests for each of the case: - `testCalcTransposeUpsertToRetraction()` - `testCalcCannotTransposeUpsertToRetraction()` As for `testMaterializeTimeIndicatorAndCalcUpsertToRetractionTranspose()`, maybe it's better to rename to `testMaterializeTimeIndicator()`. It is dedicated to test the materialization logic. What do you think? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (FLINK-11399) Parsing nested ROW()s in SQL
Benoît Paris created FLINK-11399: Summary: Parsing nested ROW()s in SQL Key: FLINK-11399 URL: https://issues.apache.org/jira/browse/FLINK-11399 Project: Flink Issue Type: Bug Components: Table API SQL Affects Versions: 1.7.1 Reporter: Benoît Paris Hi! I'm trying to build a nested structure in SQL (mapping to json with flink-json). This works fine: {code:java} INSERT INTO outputTable SELECT ROW(col1, col2) FROM ( SELECT col1, ROW(col1, col1) as col2 FROM inputTable ) tbl2 {code} (and I use it as a workaround), but it fails in the simpler version: {code:java} INSERT INTO outputTable SELECT ROW(col1, ROW(col1, col1)) FROM inputTable {code} , yielding the following stacktrace: {noformat} Exception in thread "main" org.apache.flink.table.api.SqlParserException: SQL parse failed. Encountered ", ROW" at line 1, column 40. Was expecting one of: ")" ... "," ... "," ... "," ... "," ... "," ... at org.apache.flink.table.calcite.FlinkPlannerImpl.parse(FlinkPlannerImpl.scala:94) at org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:803) at org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:777) at TestBug.main(TestBug.java:32) Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered ", ROW" at line 1, column 40. Was expecting one of: ")" ... "," ... "," ... "," ... "," ... "," ... at org.apache.calcite.sql.parser.impl.SqlParserImpl.convertException(SqlParserImpl.java:347) at org.apache.calcite.sql.parser.impl.SqlParserImpl.normalizeException(SqlParserImpl.java:128) at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:137) at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:162) at org.apache.flink.table.calcite.FlinkPlannerImpl.parse(FlinkPlannerImpl.scala:90) ... 3 more Caused by: org.apache.calcite.sql.parser.impl.ParseException: Encountered ", ROW" at line 1, column 40. Was expecting one of: ")" ... "," ... "," ... "," ... "," ... "," ... 
at org.apache.calcite.sql.parser.impl.SqlParserImpl.generateParseException(SqlParserImpl.java:23019) at org.apache.calcite.sql.parser.impl.SqlParserImpl.jj_consume_token(SqlParserImpl.java:22836) at org.apache.calcite.sql.parser.impl.SqlParserImpl.ParenthesizedSimpleIdentifierList(SqlParserImpl.java:4466) at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression3(SqlParserImpl.java:3328) at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2b(SqlParserImpl.java:3066) at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression2(SqlParserImpl.java:3092) at org.apache.calcite.sql.parser.impl.SqlParserImpl.Expression(SqlParserImpl.java:3045) at org.apache.calcite.sql.parser.impl.SqlParserImpl.SelectExpression(SqlParserImpl.java:1525) at org.apache.calcite.sql.parser.impl.SqlParserImpl.SelectItem(SqlParserImpl.java:1500) at org.apache.calcite.sql.parser.impl.SqlParserImpl.SelectList(SqlParserImpl.java:1477) at org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlSelect(SqlParserImpl.java:912) at org.apache.calcite.sql.parser.impl.SqlParserImpl.LeafQuery(SqlParserImpl.java:552) at org.apache.calcite.sql.parser.impl.SqlParserImpl.LeafQueryOrExpr(SqlParserImpl.java:3030) at org.apache.calcite.sql.parser.impl.SqlParserImpl.QueryOrExpr(SqlParserImpl.java:2949) at org.apache.calcite.sql.parser.impl.SqlParserImpl.OrderedQueryOrExpr(SqlParserImpl.java:463) at org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlInsert(SqlParserImpl.java:1212) at org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlStmt(SqlParserImpl.java:847) at org.apache.calcite.sql.parser.impl.SqlParserImpl.SqlStmtEof(SqlParserImpl.java:869) at org.apache.calcite.sql.parser.impl.SqlParserImpl.parseSqlStmtEof(SqlParserImpl.java:184) at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:130) ... 5 more{noformat} I was thinking it could be a naming/referencing issue; or I was not using ROW() properly, in the json-idiomatic way I want to push on it. 
Anyway this is very minor, thanks for all the good work on Flink! Cheers, Ben -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11398) Add a dedicated phase to materialize time indicators for nodes produce updates
[ https://issues.apache.org/jira/browse/FLINK-11398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hequn Cheng updated FLINK-11398: Description: As discussed [here|https://github.com/apache/flink/pull/6787#discussion_r247926320], we need a dedicated phase to materialize time indicators for nodes produce updates. Details: Currently, we materialize time indicators in `RelTimeInidicatorConverter`. We need to introduce another materialize phase that materializes all time attributes on nodes that produce updates. We can not do it inside `RelTimeInidicatorConverter`, because only later, after physical optimization phase, we know whether it is a non-window outer join which will produce updates There are a few other things we need to consider. - Whether we can unify the two converter phase. - Take window with early fire into consideration(not been implemented yet). In this case, we don't need to materialize time indicators even it produces updates. was: As discussed [here|https://github.com/apache/flink/pull/6787#discussion_r249056249], we need a dedicated phase to materialize time indicators for nodes produce updates. Details: Currently, we materialize time indicators in `RelTimeInidicatorConverter`. We need to introduce another materialize phase that materializes all time attributes on nodes that produce updates. We can not do it inside `RelTimeInidicatorConverter`, because only later, after physical optimization phase, we know whether it is a non-window outer join which will produce updates There are a few other things we need to consider. - Whether we can unify the two converter phase. - Take window with early fire into consideration(not been implemented yet). In this case, we don't need to materialize time indicators even it produces updates. 
> Add a dedicated phase to materialize time indicators for nodes produce updates > -- > > Key: FLINK-11398 > URL: https://issues.apache.org/jira/browse/FLINK-11398 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Hequn Cheng >Assignee: Hequn Cheng >Priority: Major > > As discussed > [here|https://github.com/apache/flink/pull/6787#discussion_r247926320], we > need a dedicated phase to materialize time indicators for nodes produce > updates. > Details: > Currently, we materialize time indicators in `RelTimeInidicatorConverter`. We > need to introduce another materialize phase that materializes all time > attributes on nodes that produce updates. We can not do it inside > `RelTimeInidicatorConverter`, because only later, after physical optimization > phase, we know whether it is a non-window outer join which will produce > updates > There are a few other things we need to consider. > - Whether we can unify the two converter phase. > - Take window with early fire into consideration(not been implemented yet). > In this case, we don't need to materialize time indicators even it produces > updates. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-11398) Add a dedicated phase to materialize time indicators for nodes produce updates
Hequn Cheng created FLINK-11398: --- Summary: Add a dedicated phase to materialize time indicators for nodes produce updates Key: FLINK-11398 URL: https://issues.apache.org/jira/browse/FLINK-11398 Project: Flink Issue Type: Improvement Components: Table API SQL Reporter: Hequn Cheng Assignee: Hequn Cheng As discussed [here|https://github.com/apache/flink/pull/6787#discussion_r249056249], we need a dedicated phase to materialize time indicators for nodes that produce updates. Details: Currently, we materialize time indicators in `RelTimeIndicatorConverter`. We need to introduce another materialize phase that materializes all time attributes on nodes that produce updates. We can not do it inside `RelTimeIndicatorConverter`, because only later, after the physical optimization phase, do we know whether it is a non-window outer join which will produce updates. There are a few other things we need to consider. - Whether we can unify the two converter phases. - Take window with early fire into consideration (not implemented yet). In this case, we don't need to materialize time indicators even if it produces updates. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] dawidwys commented on a change in pull request #7436: [FLINK-11071][core] add support for dynamic proxy classes resolution …
dawidwys commented on a change in pull request #7436: [FLINK-11071][core] add support for dynamic proxy classes resolution … URL: https://github.com/apache/flink/pull/7436#discussion_r249462832 ## File path: flink-core/src/test/java/org/apache/flink/util/InstantiationUtilTest.java ## @@ -46,6 +50,17 @@ */ public class InstantiationUtilTest extends TestLogger { + @Test + public void testResolveProxyClass() throws Exception { Review comment: This test does not check the new behavior. The proxy class is available in the same classloader as `InstantiationUtil`, which corresponds to a situation that the proxy class is available on the parent classpath in flink cluster. What you should actually test is when the proxy class is only available from user classloader. In `org.apache.flink.runtime.classloading.ClassLoaderTest#testMessageDecodingWithUnavailableClass` you can check how to create a new classloader with a class. Also always please make sure that your test fails without the changes that are supposed to fix the tested problem (It is not the case here, it passes without your changes). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
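For context, the fix under review boils down to resolving a dynamic proxy's interfaces against the user classloader before constructing the proxy class. Below is a minimal sketch of that idea; the class names are hypothetical and this is not the exact Flink implementation of `ClassLoaderObjectInputStream#resolveProxyClass`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.Serializable;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Deserialization stream that resolves dynamic proxy classes against a supplied
// classloader (e.g. the user-code classloader) rather than only the default one.
class ClassLoaderAwareObjectInputStream extends ObjectInputStream {
    private final ClassLoader classLoader;

    ClassLoaderAwareObjectInputStream(InputStream in, ClassLoader classLoader) throws IOException {
        super(in);
        this.classLoader = classLoader;
    }

    @Override
    protected Class<?> resolveProxyClass(String[] interfaceNames) throws IOException, ClassNotFoundException {
        Class<?>[] interfaces = new Class<?>[interfaceNames.length];
        for (int i = 0; i < interfaceNames.length; i++) {
            // This lookup is what fails when the interface lives only in the user
            // classloader and plain ObjectInputStream#resolveProxyClass is used.
            interfaces[i] = Class.forName(interfaceNames[i], false, classLoader);
        }
        return Proxy.getProxyClass(classLoader, interfaces);
    }
}

// Serializable no-op handler, only here so the sketch can round-trip a proxy instance.
class NoOpHandler implements InvocationHandler, Serializable {
    @Override
    public Object invoke(Object proxy, Method method, Object[] args) {
        return null;
    }
}
```

As dawidwys notes, a meaningful test needs the proxied interface to be visible only from a child classloader (as in `ClassLoaderTest#testMessageDecodingWithUnavailableClass`); a round trip with an interface on the system classpath passes even without the override.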
[GitHub] fhueske commented on issue #6873: [hotfix] Add Review Progress section to PR description template.
fhueske commented on issue #6873: [hotfix] Add Review Progress section to PR description template. URL: https://github.com/apache/flink/pull/6873#issuecomment-456085278 I think if we modify the review checklist in place (i.e., update it in the PR description) this is almost automatically given. AFAIK, only the PR author and members of the ASF Github organization (or maybe even registered Flink committers?) can update the PR description. If we ask committers to sign-off changes with their name, we should be good. A review progress bot would be great to have, IMO. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] rmetzger commented on issue #6873: [hotfix] Add Review Progress section to PR description template.
rmetzger commented on issue #6873: [hotfix] Add Review Progress section to PR description template. URL: https://github.com/apache/flink/pull/6873#issuecomment-456081384 @zentol: I'm a bit against this `**NOTE: THE REVIEW PROGRESS MUST ONLY BE UPDATED BY AN APACHE FLINK COMMITTER!**` line, as it seems not very inviting for new community members. Maybe as a convention, reviewers put into the comments what they have reviewed, so if a committer who's merging a PR has doubts about the checklist, they can check the comments? I'm also thinking* about implementing a bot that takes care of tracking the checklist. *strongly enough to have created a project locally already :) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-11249) FlinkKafkaProducer011 can not be migrated to FlinkKafkaProducer
[ https://issues.apache.org/jira/browse/FLINK-11249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16747937#comment-16747937 ] Piotr Nowojski commented on FLINK-11249: I have realised one more issue. This fix alone might not fully solve the problem. With this fix in place, a user will be able to update their job from the {{0.11}} connector to the universal one, but what happens if they upgrade the Kafka Brokers at any point of time? If a user stops the Kafka Brokers, upgrades and then restarts them, does this process preserve the pending transactions that Flink has already "pre committed"? Or are they automatically aborted? If they are automatically aborted we might have a data loss from our perspective. If "pre committed" transactions are aborted during Kafka broker upgrades, we would need the "clean stop with savepoint" feature to handle this user story. I guess this needs more experiments and more testing. CC [~tzulitai] [~aljoscha] > FlinkKafkaProducer011 can not be migrated to FlinkKafkaProducer > --- > > Key: FLINK-11249 > URL: https://issues.apache.org/jira/browse/FLINK-11249 > Project: Flink > Issue Type: Sub-task > Components: Kafka Connector >Affects Versions: 1.7.0, 1.7.1 >Reporter: Piotr Nowojski >Assignee: vinoyang >Priority: Blocker > Labels: pull-request-available > Fix For: 1.7.2, 1.8.0 > > Time Spent: 10m > Remaining Estimate: 0h > > As reported by a user on the mailing list "How to migrate Kafka Producer ?" > (on 18th December 2018), {{FlinkKafkaProducer011}} can not be migrated to > {{FlinkKafkaProducer}} and the same problem can occur in the future Kafka > producer versions/refactorings. > The issue is that {{ListState > FlinkKafkaProducer#nextTransactionalIdHintState}} field is serialized using > java serializers and this is causing problems/collisions on > {{FlinkKafkaProducer011.NextTransactionalIdHint}} vs > {{FlinkKafkaProducer.NextTransactionalIdHint}}.
> To fix that we probably need to release new versions of those classes, that > will rewrite/upgrade this state field to a new one, that doesn't relay on > java serialization. After this, we could drop the support for the old field > and that in turn will allow users to upgrade from 0.11 connector to the > universal one. > One bright side is that technically speaking our {{FlinkKafkaProducer011}} > has the same compatibility matrix as the universal one (it's also forward & > backward compatible with the same Kafka versions), so for the time being > users can stick to {{FlinkKafkaProducer011}}. > FYI [~tzulitai] [~yanghua] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
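The class-name collision the issue describes comes from Java serialization itself: the writer's fully-qualified class name is baked into the serialized bytes, so state written as one class cannot be restored as a differently-named class even if the fields match. A minimal, self-contained sketch (hypothetical class name, not the actual Flink connector code) illustrates this:

```java
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustration only: Java serialization embeds the fully-qualified class
// name in the byte stream. The class below is a hypothetical stand-in for
// a connector's NextTransactionalIdHint-style state class.
public class SerializationCoupling {
    static class NextTransactionalIdHintV011 implements Serializable {
        private static final long serialVersionUID = 1L;
        int lastParallelism = 4;
        long nextFreeTransactionalId = 42L;
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new NextTransactionalIdHintV011());
        }
        // Decode byte-per-char and show the writer's class name is in the
        // stream; restoring under a renamed class would therefore fail.
        String payload = bytes.toString("ISO-8859-1");
        System.out.println(payload.contains("NextTransactionalIdHintV011"));
    }
}
```

This is why moving the state to a format that does not rely on Java serialization (as the issue proposes) decouples the stored bytes from the producing class's name.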
[jira] [Updated] (FLINK-7834) cost model (RelOptCost) extends network cost and memory cost
[ https://issues.apache.org/jira/browse/FLINK-7834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-7834: Labels: pull-request-available (was: )

> cost model (RelOptCost) extends network cost and memory cost
>
> Key: FLINK-7834
> URL: https://issues.apache.org/jira/browse/FLINK-7834
> Project: Flink
> Issue Type: New Feature
> Components: Table API SQL
> Reporter: godfrey he
> Priority: Major
> Labels: pull-request-available
>
> {{RelOptCost}} defines an interface for optimizer cost in terms of the number of rows processed, CPU cost, and I/O cost. Flink is a distributed framework, so network and memory are two further important cost metrics that should be considered by the optimizer. This feature extends RelOptCost accordingly.
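The extension the issue proposes amounts to widening the cost vector from (rows, CPU, I/O) to also carry network and memory dimensions. A hedged sketch of that idea (illustrative class, not the actual Calcite RelOptCost API or the eventual Flink implementation):

```java
// Sketch of a cost vector extended with network and memory dimensions,
// as proposed for a distributed optimizer. All names are hypothetical.
public class FlinkCostSketch {
    final double rowCount, cpu, io, network, memory;

    FlinkCostSketch(double rowCount, double cpu, double io,
                    double network, double memory) {
        this.rowCount = rowCount; this.cpu = cpu; this.io = io;
        this.network = network; this.memory = memory;
    }

    // Cost addition: sum every dimension, as plan costs compose.
    FlinkCostSketch plus(FlinkCostSketch other) {
        return new FlinkCostSketch(rowCount + other.rowCount, cpu + other.cpu,
                io + other.io, network + other.network, memory + other.memory);
    }

    // "Less than or equal": never worse in any dimension.
    boolean isLe(FlinkCostSketch o) {
        return rowCount <= o.rowCount && cpu <= o.cpu && io <= o.io
                && network <= o.network && memory <= o.memory;
    }

    public static void main(String[] args) {
        FlinkCostSketch scan = new FlinkCostSketch(1000, 1000, 1000, 0, 0);
        FlinkCostSketch shuffle = new FlinkCostSketch(1000, 1000, 0, 1000, 500);
        System.out.println(scan.plus(shuffle).network);
    }
}
```

With network as a first-class dimension, the optimizer can prefer a plan that avoids a shuffle even when its CPU/I/O costs are comparable.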
[GitHub] clarkyzl closed pull request #3623: [FLINK-6196] [table] Support dynamic schema in Table Function
clarkyzl closed pull request #3623: [FLINK-6196] [table] Support dynamic schema in Table Function URL: https://github.com/apache/flink/pull/3623 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-6196) Support dynamic schema in Table Function
[ https://issues.apache.org/jira/browse/FLINK-6196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-6196: Labels: pull-request-available (was: )

> Support dynamic schema in Table Function
>
> Key: FLINK-6196
> URL: https://issues.apache.org/jira/browse/FLINK-6196
> Project: Flink
> Issue Type: Improvement
> Components: Table API SQL
> Reporter: Zhuoluo Yang
> Assignee: Zhuoluo Yang
> Priority: Major
> Labels: pull-request-available
>
> In many of our use cases we have to decide the schema of a UDTF at run time. For example, udtf('c1, c2, c3') will generate three columns for a lateral view.
> Most systems, such as Calcite and Hive, support this feature. However, the current implementation in Flink does not implement the feature correctly.
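The idea behind udtf('c1, c2, c3') is that the output schema is derived from an argument at planning/run time rather than being fixed in the function's signature. A minimal sketch of that column-resolution step (illustrative names, not the Flink Table API):

```java
import java.util.Arrays;
import java.util.List;

// Sketch: derive a UDTF's output columns from a runtime string argument,
// as in udtf('c1, c2, c3'). Method and class names are hypothetical.
public class DynamicSchemaSketch {
    static List<String> resolveColumns(String spec) {
        // Split the comma-separated spec into trimmed column names.
        return Arrays.asList(spec.split("\\s*,\\s*"));
    }

    public static void main(String[] args) {
        System.out.println(resolveColumns("c1, c2, c3"));
    }
}
```

A planner supporting this feature would call such a resolution hook while validating the lateral view, instead of reading a statically declared row type.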
[jira] [Updated] (FLINK-7065) separate the flink-streaming-java module from flink-clients
[ https://issues.apache.org/jira/browse/FLINK-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-7065: Labels: pull-request-available (was: )

> separate the flink-streaming-java module from flink-clients
>
> Key: FLINK-7065
> URL: https://issues.apache.org/jira/browse/FLINK-7065
> Project: Flink
> Issue Type: Improvement
> Components: DataStream API
> Reporter: Xu Pingyong
> Assignee: Xu Pingyong
> Priority: Major
> Labels: pull-request-available
>
> Motivation:
> It is not good that the "flink-streaming-java" module depends on "flink-clients"; it should be "flink-clients" that sees into "flink-streaming-java".
> Related Changes:
> 1. LocalStreamEnvironment and RemoteStreamEnvironment can also execute a job via the executors (LocalExecutor and RemoteExecutor). Introduce StreamGraphExecutor, which executes a streamGraph the way PlanExecutor executes a plan. StreamGraphExecutor and PlanExecutor both extend Executor.
> 2. Introduce StreamExecutionEnvironmentFactory, which works similarly to ContextEnvironmentFactory in flink-clients. When an object of ContextEnvironmentFactory, OptimizerPlanEnvironmentFactory or PreviewPlanEnvironmentFactory is set into ExecutionEnvironment (by calling initializeContextEnvironment), the relevant StreamEnvFactory is also set into StreamExecutionEnvironment. It is similar when calling unsetContext.
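The first change above hinges on a shared Executor supertype, so both batch plans and stream graphs can be submitted through one abstraction without the streaming module depending on the clients module. A rough sketch of that hierarchy (all types are illustrative stand-ins, not the real Flink classes):

```java
// Sketch of a common Executor supertype shared by a batch-plan executor
// and a stream-graph executor, as the issue describes. Hypothetical names.
public class ExecutorHierarchy {
    interface Executor {
        String executeJob(String jobName); // stand-in for job submission
    }

    static class PlanExecutorSketch implements Executor {
        public String executeJob(String jobName) { return "batch:" + jobName; }
    }

    static class StreamGraphExecutorSketch implements Executor {
        public String executeJob(String jobName) { return "stream:" + jobName; }
    }

    public static void main(String[] args) {
        // Callers program against Executor; the module providing the
        // concrete executor can live behind the abstraction.
        Executor e = new StreamGraphExecutorSketch();
        System.out.println(e.executeJob("wordcount"));
    }
}
```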
[GitHub] XuPingyong commented on issue #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear
XuPingyong commented on issue #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear URL: https://github.com/apache/flink/pull/4267#issuecomment-456053593 no need
[GitHub] XuPingyong closed pull request #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear
XuPingyong closed pull request #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear URL: https://github.com/apache/flink/pull/4267
[jira] [Updated] (FLINK-5833) Support for Hive GenericUDF
[ https://issues.apache.org/jira/browse/FLINK-5833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-5833: Labels: pull-request-available (was: )

> Support for Hive GenericUDF
>
> Key: FLINK-5833
> URL: https://issues.apache.org/jira/browse/FLINK-5833
> Project: Flink
> Issue Type: Sub-task
> Components: Table API SQL
> Reporter: Zhuoluo Yang
> Assignee: Zhuoluo Yang
> Priority: Major
> Labels: pull-request-available
>
> The second step of FLINK-5802 is to support Hive's GenericUDF.
[GitHub] XuPingyong closed pull request #4273: [FLINK-7065] Separate the flink-streaming-java module from flink-clients
XuPingyong closed pull request #4273: [FLINK-7065] Separate the flink-streaming-java module from flink-clients URL: https://github.com/apache/flink/pull/4273
[GitHub] XuPingyong closed pull request #4241: [FLINK-7015] [streaming] separate OperatorConfig from StreamConfig
XuPingyong closed pull request #4241: [FLINK-7015] [streaming] separate OperatorConfig from StreamConfig URL: https://github.com/apache/flink/pull/4241
[GitHub] clarkyzl closed pull request #3473: [FLINK-5833] [table] Support for Hive GenericUDF
clarkyzl closed pull request #3473: [FLINK-5833] [table] Support for Hive GenericUDF URL: https://github.com/apache/flink/pull/3473
[jira] [Updated] (FLINK-7015) Separate OperatorConfig from StreamConfig
[ https://issues.apache.org/jira/browse/FLINK-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-7015: Labels: pull-request-available (was: )

> Separate OperatorConfig from StreamConfig
>
> Key: FLINK-7015
> URL: https://issues.apache.org/jira/browse/FLINK-7015
> Project: Flink
> Issue Type: Improvement
> Components: DataStream API
> Reporter: Xu Pingyong
> Assignee: Xu Pingyong
> Priority: Major
> Labels: pull-request-available
>
> Motivation:
> A Task contains one or more operators chained together, yet the configs of both operator and task are all put in StreamConfig. For example, when an operator sets up with the StreamConfig, it can see interfaces about physicalEdges or chained.task.configs, which is confusing. Similarly, a streamTask should not see the interface about chain.index.
> So we need to separate OperatorConfig from StreamConfig. A streamTask builds its execution environment with the streamConfig, extracts operatorConfigs from it, and then builds streamOperators with each operatorConfig.
> OperatorConfig: for the streamOperator to set up with; it contains information that belongs only to the streamOperator:
> 1) operator information: name, id
> 2) serialized StreamOperator
> 3) input serializer
> 4) output edges and serializers
> 5) chain.index
> 6) state.key.serializer
> StreamConfig: for the streamTask to use:
> 1) in.physical.edges
> 2) out.physical.edges
> 3) chained OperatorConfigs
> 4) execution environment: checkpoint, state.backend and so on
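The split described above can be sketched in a few lines: the task-level config owns the chain, and each operator is handed only its own slice. A hedged illustration (field names illustrative, not the real Flink classes):

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the proposed split: per-operator settings in OperatorConfig,
// chain/environment-level settings in StreamConfig. Hypothetical fields.
public class ConfigSplitSketch {
    static class OperatorConfig {
        final String name;
        final int chainIndex; // chain.index belongs to the operator's slice
        OperatorConfig(String name, int chainIndex) {
            this.name = name; this.chainIndex = chainIndex;
        }
    }

    static class StreamConfig {
        final List<OperatorConfig> chainedOperators; // chain-level view
        StreamConfig(List<OperatorConfig> ops) { this.chainedOperators = ops; }
        OperatorConfig operatorAt(int i) { return chainedOperators.get(i); }
    }

    public static void main(String[] args) {
        StreamConfig task = new StreamConfig(Arrays.asList(
                new OperatorConfig("map", 0), new OperatorConfig("filter", 1)));
        // An operator sees only its own config, not the whole chain.
        System.out.println(task.operatorAt(1).name);
    }
}
```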
[GitHub] XuPingyong opened a new pull request #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear
XuPingyong opened a new pull request #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear URL: https://github.com/apache/flink/pull/4267
[GitHub] XuPingyong removed a comment on issue #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear
XuPingyong removed a comment on issue #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear URL: https://github.com/apache/flink/pull/4267#issuecomment-456053593 no need
[jira] [Updated] (FLINK-5514) Implement an efficient physical execution for CUBE/ROLLUP/GROUPING SETS
[ https://issues.apache.org/jira/browse/FLINK-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-5514: Labels: pull-request-available (was: )

> Implement an efficient physical execution for CUBE/ROLLUP/GROUPING SETS
>
> Key: FLINK-5514
> URL: https://issues.apache.org/jira/browse/FLINK-5514
> Project: Flink
> Issue Type: Improvement
> Components: Table API SQL
> Reporter: Timo Walther
> Assignee: godfrey he
> Priority: Major
> Labels: pull-request-available
>
> A first support for GROUPING SETS has been added in FLINK-5303. However, the current runtime implementation is not very efficient as it basically only translates logical operators to physical operators, i.e. grouping sets are currently only translated into multiple groupings that are unioned together. A rough design document for this has been created in FLINK-2980.
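The inefficiency the issue points at is that the naive translation evaluates each grouping set as its own aggregation over the full input, then unions the results, so the input is scanned once per grouping set. A self-contained illustration of that naive expansion for GROUPING SETS ((a), ()):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration of the naive GROUPING SETS translation: one full pass over
// the input per grouping set, results unioned. Data is made up.
public class GroupingSetsNaive {
    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[]{"x", "1"}, new String[]{"x", "2"},
                new String[]{"y", "3"});

        // Grouping set (a): first pass, sum per key.
        Map<String, Integer> byA = new TreeMap<>();
        for (String[] r : rows) byA.merge(r[0], Integer.parseInt(r[1]), Integer::sum);

        // Grouping set (): a second full pass over the same input.
        int total = 0;
        for (String[] r : rows) total += Integer.parseInt(r[1]);

        System.out.println(byA + " / " + total);
    }
}
```

An efficient physical operator would instead compute all grouping sets in a single pass, e.g. by expanding or sharing partial aggregates, which is what the issue asks for.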
[GitHub] XuPingyong closed pull request #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear
XuPingyong closed pull request #4267: [FLINK-7018][streaming] Refactor streamGraph to make interfaces clear URL: https://github.com/apache/flink/pull/4267
[jira] [Updated] (FLINK-6516) using real row count instead of dummy row count when optimizing plan
[ https://issues.apache.org/jira/browse/FLINK-6516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-6516: Labels: pull-request-available (was: )

> using real row count instead of dummy row count when optimizing plan
>
> Key: FLINK-6516
> URL: https://issues.apache.org/jira/browse/FLINK-6516
> Project: Flink
> Issue Type: Improvement
> Components: Table API SQL
> Reporter: godfrey he
> Assignee: godfrey he
> Priority: Major
> Labels: pull-request-available
>
> Currently, the statistic of {{TableSourceTable}} is mostly {{UNKNOWN}}, and the statistic from {{ExternalCatalog}} may also be null. Actually, only each {{TableSource}} knows its statistics exactly, especially {{FilterableTableSource}} and {{PartitionableTableSource}}. So we can add a {{getTableStats}} method to {{TableSource}} and use it in TableSourceScan's estimateRowCount method to get the real row count.
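The proposal boils down to letting the source report its own statistics and falling back to a dummy row count only when none are available. A hedged sketch of that lookup (illustrative types, not the real Flink Table API interfaces):

```java
import java.util.Optional;

// Sketch of a getTableStats hook on a table source, with a dummy-count
// fallback in the planner's row-count estimate. Hypothetical types.
public class TableStatsSketch {
    static class TableStats {
        final long rowCount;
        TableStats(long rowCount) { this.rowCount = rowCount; }
    }

    interface TableSource {
        Optional<TableStats> getTableStats();
    }

    // Planner-side estimate: real stats if the source knows them,
    // otherwise the dummy default it uses today.
    static long estimateRowCount(TableSource source, long dummyDefault) {
        return source.getTableStats().map(s -> s.rowCount).orElse(dummyDefault);
    }

    public static void main(String[] args) {
        TableSource known = () -> Optional.of(new TableStats(12345L));
        TableSource unknown = Optional::empty;
        System.out.println(estimateRowCount(known, 1000) + " "
                + estimateRowCount(unknown, 1000));
    }
}
```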
[jira] [Updated] (FLINK-7018) Refactor streamGraph to make interfaces clear
[ https://issues.apache.org/jira/browse/FLINK-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-7018: Labels: pull-request-available (was: )

> Refactor streamGraph to make interfaces clear
>
> Key: FLINK-7018
> URL: https://issues.apache.org/jira/browse/FLINK-7018
> Project: Flink
> Issue Type: Improvement
> Components: DataStream API
> Reporter: Xu Pingyong
> Assignee: Xu Pingyong
> Priority: Major
> Labels: pull-request-available
>
> Motivation:
> 1. StreamGraph is a graph consisting of streamNodes, so virtual nodes (such as select, sideOutput, partition) should be moved out of it. The main interfaces of StreamGraph should be as follows:
> addSource(StreamNode sourceNode)
> addSink(StreamNode sinkNode)
> addOperator(StreamNode streamNode)
> addEdge(Integer upStreamVertexID, Integer downStreamVertexID, StreamEdge.InputOrder inputOrder, StreamPartitioner partitioner, List outputNames, OutputTag outputTag)
> getJobGraph()
> 2. StreamExecutionEnvironment should not be in StreamGraph. I create StreamGraphProperties, which extracts all the env information the streamGraph needs from StreamExecutionEnvironment. It contains:
> 1) executionConfig
> 2) checkpointConfig
> 3) timeCharacteristic
> 4) stateBackend
> 5) chainingEnabled
> 6) cachedFiles
> 7) jobName
>
> Related Changes:
> I moved the part dealing with virtual nodes to StreamGraphGenerator, and get properties of the StreamGraph from StreamGraphProperties instead of StreamExecutionEnvironment.
>
> It is only an internal code abstraction.
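The narrowed surface described above keeps only real nodes and edges in the graph, with virtual nodes resolved by the generator before they reach it. A minimal sketch of such a graph (illustrative types, not the real Flink classes):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of a StreamGraph holding only real operator nodes and edges,
// per the issue's proposed interface. All names are hypothetical.
public class StreamGraphSketch {
    final Map<Integer, String> nodes = new HashMap<>();
    final List<int[]> edges = new ArrayList<>();

    void addOperator(int id, String name) { nodes.put(id, name); }

    void addEdge(int upstream, int downstream) {
        edges.add(new int[]{upstream, downstream});
    }

    public static void main(String[] args) {
        StreamGraphSketch g = new StreamGraphSketch();
        // Virtual nodes (select/sideOutput/partition) would be resolved by
        // the generator; only the resulting real nodes/edges are added here.
        g.addOperator(1, "source");
        g.addOperator(2, "map");
        g.addEdge(1, 2);
        System.out.println(g.nodes.size() + " nodes, " + g.edges.size() + " edge");
    }
}
```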