[jira] [Closed] (FLINK-5679) Refactor *CheckpointedITCase tests to speed up
[ https://issues.apache.org/jira/browse/FLINK-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Efimov closed FLINK-5679. Resolution: Incomplete > Refactor *CheckpointedITCase tests to speed up > --- > > Key: FLINK-5679 > URL: https://issues.apache.org/jira/browse/FLINK-5679 > Project: Flink > Issue Type: Test > Components: Runtime / Checkpointing, Tests >Reporter: Andrew Efimov >Priority: Major > Labels: test-framework > > Tests refactoring to speed up: > {noformat} > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 40.193 sec - > in org.apache.flink.test.checkpointing.StreamCheckpointingITCasee > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 119.063 sec - > in org.apache.flink.test.checkpointing.UdfStreamOperatorCheckpointingITCase > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 47.525 sec - > in org.apache.flink.test.checkpointing.PartitionedStateCheckpointingITCase > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 40.355 sec - > in org.apache.flink.test.checkpointing.EventTimeAllWindowCheckpointingITCase > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 51.615 sec - > in org.apache.flink.test.checkpointing.StateCheckpointedITCase > {noformat} > Tests could be adjusted in a similar way to save some time (some may actually > even be redundant by now) > https://github.com/StephanEwen/incubator-flink/commit/0dd7ae693f30585283d334a1d65b3d8222b7ca5c -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-5679) Refactor *CheckpointedITCase tests to speed up
[ https://issues.apache.org/jira/browse/FLINK-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Efimov reassigned FLINK-5679: Assignee: (was: Andrew Efimov) > Refactor *CheckpointedITCase tests to speed up > --- > > Key: FLINK-5679 > URL: https://issues.apache.org/jira/browse/FLINK-5679 > Project: Flink > Issue Type: Test > Components: Runtime / Checkpointing, Tests >Reporter: Andrew Efimov >Priority: Major > Labels: test-framework > > Tests refactoring to speed up: > {noformat} > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 40.193 sec - > in org.apache.flink.test.checkpointing.StreamCheckpointingITCasee > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 119.063 sec - > in org.apache.flink.test.checkpointing.UdfStreamOperatorCheckpointingITCase > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 47.525 sec - > in org.apache.flink.test.checkpointing.PartitionedStateCheckpointingITCase > Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 40.355 sec - > in org.apache.flink.test.checkpointing.EventTimeAllWindowCheckpointingITCase > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 51.615 sec - > in org.apache.flink.test.checkpointing.StateCheckpointedITCase > {noformat} > Tests could be adjusted in a similar way to save some time (some may actually > even be redundant by now) > https://github.com/StephanEwen/incubator-flink/commit/0dd7ae693f30585283d334a1d65b3d8222b7ca5c -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] flinkbot commented on issue #8032: [FLINK-11994] [table-planner-blink] Introduce TableImpl and remove Table in flink-table-planner-blink
flinkbot commented on issue #8032: [FLINK-11994] [table-planner-blink] Introduce TableImpl and remove Table in flink-table-planner-blink URL: https://github.com/apache/flink/pull/8032#issuecomment-475488892 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-11994) Introduce TableImpl and remove Table in flink-table-planner-blink
[ https://issues.apache.org/jira/browse/FLINK-11994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-11994: --- Labels: pull-request-available (was: ) > Introduce TableImpl and remove Table in flink-table-planner-blink > - > > Key: FLINK-11994 > URL: https://issues.apache.org/jira/browse/FLINK-11994 > Project: Flink > Issue Type: Task >Reporter: Jark Wu >Assignee: Jark Wu >Priority: Major > Labels: pull-request-available > > After FLINK-11068 is merged, the {{Table}} interfaced is added into > {{flink-table-api-java}}. The classpath is conflicted with {{Table}} in > {{flink-table-planner-blink}} which result in IDE errors and some tests fail > (only in my local, looks good in mvn verify). > This issue make {{Table}} in {{flink-table-planner-blink}} to extends > {{Table}} in {{flink-table-api-java}} and rename to {{TableImpl}}. We still > left the methods implementation to be empty until the {{LogicalNode}} is > refactored. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] wuchong opened a new pull request #8032: [FLINK-11994] [table-planner-blink] Introduce TableImpl and remove Table in flink-table-planner-blink
wuchong opened a new pull request #8032: [FLINK-11994] [table-planner-blink] Introduce TableImpl and remove Table in flink-table-planner-blink URL: https://github.com/apache/flink/pull/8032 ## What is the purpose of the change After [FLINK-11068](https://issues.apache.org/jira/browse/FLINK-11068) is merged, the `Table` interfaced is added into `flink-table-api-java`. The classpath is conflicted with `Table` in `flink-table-planner-blink` which result in IDE errors and some tests fail (only in my local, looks good in `mvn verify`). ## Brief change log This issue makes `Table` in `flink-table-planner-blink` to extend `Table` in `flink-table-api-java` and rename to `TableImpl`. We still left the methods implementation to be empty until the `LogicalNode` is refactored. ## Verifying this change This change is a trivial rework without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-11994) Introduce TableImpl and remove Table in flink-table-planner-blink
[ https://issues.apache.org/jira/browse/FLINK-11994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-11994: Issue Type: Task (was: New Feature) > Introduce TableImpl and remove Table in flink-table-planner-blink > - > > Key: FLINK-11994 > URL: https://issues.apache.org/jira/browse/FLINK-11994 > Project: Flink > Issue Type: Task >Reporter: Jark Wu >Assignee: Jark Wu >Priority: Major > > After FLINK-11068 is merged, the {{Table}} interfaced is added into > {{flink-table-api-java}}. The classpath is conflicted with {{Table}} in > {{flink-table-planner-blink}} which result in IDE errors and some tests fail > (only in my local, looks good in mvn verify). > This issue make {{Table}} in {{flink-table-planner-blink}} to extends > {{Table}} in {{flink-table-api-java}} and rename to {{TableImpl}}. We still > left the methods implementation to be empty until the {{LogicalNode}} is > refactored. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-11994) Introduce TableImpl and remove Table in flink-table-planner-blink
Jark Wu created FLINK-11994: --- Summary: Introduce TableImpl and remove Table in flink-table-planner-blink Key: FLINK-11994 URL: https://issues.apache.org/jira/browse/FLINK-11994 Project: Flink Issue Type: New Feature Reporter: Jark Wu Assignee: Jark Wu After FLINK-11068 is merged, the {{Table}} interfaced is added into {{flink-table-api-java}}. The classpath is conflicted with {{Table}} in {{flink-table-planner-blink}} which result in IDE errors and some tests fail (only in my local, looks good in mvn verify). This issue make {{Table}} in {{flink-table-planner-blink}} to extends {{Table}} in {{flink-table-api-java}} and rename to {{TableImpl}}. We still left the methods implementation to be empty until the {{LogicalNode}} is refactored. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-11993) Introduce partitionable filesystem sink
Jing Zhang created FLINK-11993: -- Summary: Introduce partitionable filesystem sink Key: FLINK-11993 URL: https://issues.apache.org/jira/browse/FLINK-11993 Project: Flink Issue Type: Task Components: API / Table SQL Reporter: Jing Zhang Introduce partitionable filesystem sink, 1. Add partition trait for filesystem connector 2. All the filesystem formats can be declared as partitioned through new DDL grammar. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] wuchong commented on issue #8015: [FLINK-11974][streaming] Introduce StreamOperatorSubstitutor to help table perform the whole Operator CodeGen
wuchong commented on issue #8015: [FLINK-11974][streaming] Introduce StreamOperatorSubstitutor to help table perform the whole Operator CodeGen URL: https://github.com/apache/flink/pull/8015#issuecomment-475480775 LGTM. The implementation touches little to the runtime, I think it's safe to get it in. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] Clarkkkkk commented on issue #7780: [FLINK-11593][tests] Check & port TaskManagerTest to new code base
Clark commented on issue #7780: [FLINK-11593][tests] Check & port TaskManagerTest to new code base URL: https://github.com/apache/flink/pull/7780#issuecomment-475476357 @zentol You are right, although the code get reused, but the meaning of the test is not explicit. I'm gonna move the submission logic out of runTestAfterTaskSubmission and rename it to prepareTaskManager. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-11982) BatchTableSourceFactory support Json Format File
[ https://issues.apache.org/jira/browse/FLINK-11982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798651#comment-16798651 ] frank wang commented on FLINK-11982: in the META-INF/services/org.apache.flink.table.factories.TableFactory,just add this org.apache.flink.formats.json.JsonRowFormatFactory > BatchTableSourceFactory support Json Format File > > > Key: FLINK-11982 > URL: https://issues.apache.org/jira/browse/FLINK-11982 > Project: Flink > Issue Type: Bug > Components: API / Table SQL >Affects Versions: 1.6.4, 1.7.2 >Reporter: pingle wang >Assignee: frank wang >Priority: Major > > java code : > {code:java} > val connector = FileSystem().path("data/in/test.json") > val desc = tEnv.connect(connector) > .withFormat( > new Json() > .schema( > Types.ROW( > Array[String]("id", "name", "age"), > Array[TypeInformation[_]](Types.STRING, Types.STRING, Types.INT)) > ) > .failOnMissingField(true) > ).registerTableSource("persion") > val sql = "select * from person" > val result = tEnv.sqlQuery(sql) > {code} > Exception info : > {code:java} > Exception in thread "main" > org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a > suitable table factory for > 'org.apache.flink.table.factories.BatchTableSourceFactory' in > the classpath. > Reason: No context matches. > The following properties are requested: > connector.path=file:///Users/batch/test.json > connector.property-version=1 > connector.type=filesystem > format.derive-schema=true > format.fail-on-missing-field=true > format.property-version=1 > format.type=json > The following factories have been considered: > org.apache.flink.table.sources.CsvBatchTableSourceFactory > org.apache.flink.table.sources.CsvAppendTableSourceFactory > org.apache.flink.table.sinks.CsvBatchTableSinkFactory > org.apache.flink.table.sinks.CsvAppendTableSinkFactory > org.apache.flink.formats.avro.AvroRowFormatFactory > org.apache.flink.formats.json.JsonRowFormatFactory > org.apache.flink.streaming.connectors.kafka.Kafka010TableSourceSinkFactory > org.apache.flink.streaming.connectors.kafka.Kafka09TableSourceSinkFactory > at > org.apache.flink.table.factories.TableFactoryService$.filterByContext(TableFactoryService.scala:214) > at > org.apache.flink.table.factories.TableFactoryService$.findInternal(TableFactoryService.scala:130) > at > org.apache.flink.table.factories.TableFactoryService$.find(TableFactoryService.scala:81) > at > org.apache.flink.table.factories.TableFactoryUtil$.findAndCreateTableSource(TableFactoryUtil.scala:44) > at > org.apache.flink.table.descriptors.ConnectTableDescriptor.registerTableSource(ConnectTableDescriptor.scala:46) > at com.meitu.mlink.sql.batch.JsonExample.main(JsonExample.java:36){code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] HuangZhenQiu commented on issue #7978: [FLINK-11910] [Yarn] add customizable yarn application type
HuangZhenQiu commented on issue #7978: [FLINK-11910] [Yarn] add customizable yarn application type URL: https://github.com/apache/flink/pull/7978#issuecomment-475471120 @rmetzger I saw a test module is cancelled. Should I look deeper into it? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] Clarkkkkk commented on a change in pull request #7780: [FLINK-11593][tests] Check & port TaskManagerTest to new code base
Clark commented on a change in pull request #7780: [FLINK-11593][tests] Check & port TaskManagerTest to new code base URL: https://github.com/apache/flink/pull/7780#discussion_r268019967 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/taskexecutor/TaskExecutorTest.java ## @@ -1771,7 +2415,349 @@ private TaskExecutor createTaskExecutor(TaskManagerServices taskManagerServices) UnregisteredMetricGroups.createUnregisteredTaskManagerMetricGroup(), null, dummyBlobCacheService, - testingFatalErrorHandler); + testingFatalErrorHandler + ); + } + + private void runTestAfterTaskSubmission( + List tdds, + Function> updateTaskExecutionStateFunction, + BiConsumerWithException testAfterSubmission) + throws Throwable { + runTestAfterTaskSubmission(tdds, this.configuration, true, updateTaskExecutionStateFunction, null, testAfterSubmission); + } + + private void runTestAfterTaskSubmission( + List tdds, + Configuration configuration, + boolean localCommunication, + Function> updateTaskExecutionStateFunction, + Function> scheduleOrUpdateConsumersFunction, + BiConsumerWithException testAfterSubmission) + throws Throwable { + final JobMasterId jobMasterId = JobMasterId.generate(); + + final LibraryCacheManager libraryCacheManager = mock(LibraryCacheManager.class); + when(libraryCacheManager.getClassLoader(any(JobID.class))).thenReturn(ClassLoader.getSystemClassLoader()); + + TestTaskManagerActions taskManagerActions = new TestTaskManagerActions(); + + final JobMasterGateway jobMasterGateway; + TestingJobMasterGatewayBuilder testingJobMasterGatewayBuilder = + new TestingJobMasterGatewayBuilder() + .setFencingTokenSupplier(() -> jobMasterId); + if (scheduleOrUpdateConsumersFunction != null) { + testingJobMasterGatewayBuilder + .setScheduleOrUpdateConsumersFunction(scheduleOrUpdateConsumersFunction); + } else { + testingJobMasterGatewayBuilder + .setScheduleOrUpdateConsumersFunction(resultPartitionID -> CompletableFuture.completedFuture(Acknowledge.get())); + } + if (updateTaskExecutionStateFunction != null) { + testingJobMasterGatewayBuilder.setUpdateTaskExecutionStateFunction(updateTaskExecutionStateFunction); + jobMasterGateway = testingJobMasterGatewayBuilder.build(); + taskManagerActions.setJobMasterGateway(jobMasterGateway); + } else { + jobMasterGateway = testingJobMasterGatewayBuilder.build(); + } + + final PartitionProducerStateChecker partitionProducerStateChecker = mock(PartitionProducerStateChecker.class); + when(partitionProducerStateChecker.requestPartitionProducerState(any(), any(), any())) + .thenReturn(CompletableFuture.completedFuture(ExecutionState.RUNNING)); + + final JobManagerConnection jobManagerConnection = new JobManagerConnection( + jobId, + ResourceID.generate(), + jobMasterGateway, + taskManagerActions, + mock(CheckpointResponder.class), + new TestGlobalAggregateManager(), + libraryCacheManager, + new RpcResultPartitionConsumableNotifier(jobMasterGateway, rpc.getExecutor(), timeout), + partitionProducerStateChecker + ); + + final JobManagerTable jobManagerTable = new JobManagerTable(); + jobManagerTable.put(jobId, jobManagerConnection); + + Collection resourceProfiles = new ArrayList<>(); + for (int i = 0; i < tdds.size(); i++) { + resourceProfiles.add(ResourceProfile.UNKNOWN); + } + if (resourceProfiles.size() == 0) { + resourceProfiles.add(ResourceProfile.UNKNOWN); + } + final TaskSlotTable taskSlotTable = new TaskSlotTable(resourceProfiles, timerService); + taskManagerActions.setTaskSlotTable(taskSlotTable); + + TaskExecutorLocalStateStoresManager localStateStoresManager = new TaskExecutorLocalStateStoresManager( + false, + new File[]{tmp.newFolder()}, + Executors.directExecutor()); + +
[GitHub] [flink] Clarkkkkk commented on a change in pull request #7780: [FLINK-11593][tests] Check & port TaskManagerTest to new code base
Clark commented on a change in pull request #7780: [FLINK-11593][tests] Check & port TaskManagerTest to new code base URL: https://github.com/apache/flink/pull/7780#discussion_r268018759 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/jobmaster/TestingAbstractInvokables.java ## @@ -82,4 +84,51 @@ public void invoke() throws Exception { } } } + + public static final class TestInvokableRecordCancel extends AbstractInvokable { + + private static final Object lock = new Object(); + private static CompletableFuture gotCanceledFuture = new CompletableFuture<>(); + + public TestInvokableRecordCancel(Environment environment) { + super(environment); + } + + @Override + public void invoke() throws Exception { + final Object o = new Object(); + RecordWriter recordWriter = new RecordWriter<>(getEnvironment().getWriter(0)); + + for (int i = 0; i < 1024; i++) { + recordWriter.emit(new IntValue(42)); + } + + synchronized (o) { + //noinspection InfiniteLoopStatement + while (true) { + o.wait(); + } + } + + } + + @Override + public void cancel() { + synchronized (lock) { Review comment: You are right. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] Clarkkkkk commented on a change in pull request #7780: [FLINK-11593][tests] Check & port TaskManagerTest to new code base
Clark commented on a change in pull request #7780: [FLINK-11593][tests] Check & port TaskManagerTest to new code base URL: https://github.com/apache/flink/pull/7780#discussion_r268018734 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/jobmaster/TestingAbstractInvokables.java ## @@ -82,4 +84,51 @@ public void invoke() throws Exception { } } } + + public static final class TestInvokableRecordCancel extends AbstractInvokable { + + private static final Object lock = new Object(); + private static CompletableFuture gotCanceledFuture = new CompletableFuture<>(); + + public TestInvokableRecordCancel(Environment environment) { + super(environment); + } + + @Override + public void invoke() throws Exception { + final Object o = new Object(); + RecordWriter recordWriter = new RecordWriter<>(getEnvironment().getWriter(0)); + + for (int i = 0; i < 1024; i++) { + recordWriter.emit(new IntValue(42)); + } + + synchronized (o) { + //noinspection InfiniteLoopStatement + while (true) { + o.wait(); Review comment: Thanks, will do that. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-11982) BatchTableSourceFactory support Json Format File
[ https://issues.apache.org/jira/browse/FLINK-11982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798628#comment-16798628 ] frank wang commented on FLINK-11982: format.type and format.property-version is integrant > BatchTableSourceFactory support Json Format File > > > Key: FLINK-11982 > URL: https://issues.apache.org/jira/browse/FLINK-11982 > Project: Flink > Issue Type: Bug > Components: API / Table SQL >Affects Versions: 1.6.4, 1.7.2 >Reporter: pingle wang >Assignee: frank wang >Priority: Major > > java code : > {code:java} > val connector = FileSystem().path("data/in/test.json") > val desc = tEnv.connect(connector) > .withFormat( > new Json() > .schema( > Types.ROW( > Array[String]("id", "name", "age"), > Array[TypeInformation[_]](Types.STRING, Types.STRING, Types.INT)) > ) > .failOnMissingField(true) > ).registerTableSource("persion") > val sql = "select * from person" > val result = tEnv.sqlQuery(sql) > {code} > Exception info : > {code:java} > Exception in thread "main" > org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a > suitable table factory for > 'org.apache.flink.table.factories.BatchTableSourceFactory' in > the classpath. > Reason: No context matches. > The following properties are requested: > connector.path=file:///Users/batch/test.json > connector.property-version=1 > connector.type=filesystem > format.derive-schema=true > format.fail-on-missing-field=true > format.property-version=1 > format.type=json > The following factories have been considered: > org.apache.flink.table.sources.CsvBatchTableSourceFactory > org.apache.flink.table.sources.CsvAppendTableSourceFactory > org.apache.flink.table.sinks.CsvBatchTableSinkFactory > org.apache.flink.table.sinks.CsvAppendTableSinkFactory > org.apache.flink.formats.avro.AvroRowFormatFactory > org.apache.flink.formats.json.JsonRowFormatFactory > org.apache.flink.streaming.connectors.kafka.Kafka010TableSourceSinkFactory > org.apache.flink.streaming.connectors.kafka.Kafka09TableSourceSinkFactory > at > org.apache.flink.table.factories.TableFactoryService$.filterByContext(TableFactoryService.scala:214) > at > org.apache.flink.table.factories.TableFactoryService$.findInternal(TableFactoryService.scala:130) > at > org.apache.flink.table.factories.TableFactoryService$.find(TableFactoryService.scala:81) > at > org.apache.flink.table.factories.TableFactoryUtil$.findAndCreateTableSource(TableFactoryUtil.scala:44) > at > org.apache.flink.table.descriptors.ConnectTableDescriptor.registerTableSource(ConnectTableDescriptor.scala:46) > at com.meitu.mlink.sql.batch.JsonExample.main(JsonExample.java:36){code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (FLINK-10897) Support POJO state schema evolution
[ https://issues.apache.org/jira/browse/FLINK-10897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798569#comment-16798569 ] william hesch edited comment on FLINK-10897 at 3/22/19 12:06 AM: - Does this also cover scala case classes? was (Author: whesch): Does this also cover scala case classes? * [|https://issues.apache.org/jira/secure/AddComment!default.jspa?id=13213003] > Support POJO state schema evolution > --- > > Key: FLINK-10897 > URL: https://issues.apache.org/jira/browse/FLINK-10897 > Project: Flink > Issue Type: Sub-task > Components: API / Type Serialization System >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai >Priority: Major > Fix For: 1.8.0 > > > Main action point for this is to implement a separate POJO serializer that is > specifically used as the restore serializer. > This restore POJO serializer should be able to read and dump values of fields > that no longer exists in the updated POJO schema, and assign default values > to newly added fields. Snapshot of the {{PojoSerializer}} should contain > sufficient information so that on restore, the information can be compared > with the adapted POJO class to figure out which fields have been removed / > added. > Changing fields types is out of scope and should not be supported. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10897) Support POJO state schema evolution
[ https://issues.apache.org/jira/browse/FLINK-10897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798569#comment-16798569 ] william hesch commented on FLINK-10897: --- Does this also cover scala case classes? * [|https://issues.apache.org/jira/secure/AddComment!default.jspa?id=13213003] > Support POJO state schema evolution > --- > > Key: FLINK-10897 > URL: https://issues.apache.org/jira/browse/FLINK-10897 > Project: Flink > Issue Type: Sub-task > Components: API / Type Serialization System >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai >Priority: Major > Fix For: 1.8.0 > > > Main action point for this is to implement a separate POJO serializer that is > specifically used as the restore serializer. > This restore POJO serializer should be able to read and dump values of fields > that no longer exists in the updated POJO schema, and assign default values > to newly added fields. Snapshot of the {{PojoSerializer}} should contain > sufficient information so that on restore, the information can be compared > with the adapted POJO class to figure out which fields have been removed / > added. > Changing fields types is out of scope and should not be supported. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11485) Migrate PojoSerializer to use new serialization compatibility abstractions
[ https://issues.apache.org/jira/browse/FLINK-11485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798553#comment-16798553 ] william hesch commented on FLINK-11485: --- Does this also cover scala case classes? > Migrate PojoSerializer to use new serialization compatibility abstractions > -- > > Key: FLINK-11485 > URL: https://issues.apache.org/jira/browse/FLINK-11485 > Project: Flink > Issue Type: Sub-task > Components: API / Type Serialization System >Reporter: Tzu-Li (Gordon) Tai >Assignee: Tzu-Li (Gordon) Tai >Priority: Major > Labels: pull-request-available > Fix For: 1.8.0 > > Time Spent: 1h > Remaining Estimate: 0h > > This subtask covers migration of the {{PojoSerializer}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] SuXingLee commented on a change in pull request #7966: [FLINK-11887][metrics] Fixed latency metrics drift apart
SuXingLee commented on a change in pull request #7966: [FLINK-11887][metrics] Fixed latency metrics drift apart URL: https://github.com/apache/flink/pull/7966#discussion_r267976961 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/StreamSource.java ## @@ -154,7 +154,7 @@ public LatencyMarksEmitter( public void onProcessingTime(long timestamp) throws Exception { try { // ProcessingTimeService callbacks are executed under the checkpointing lock - output.emitLatencyMarker(new LatencyMarker(timestamp, operatorId, subtaskIndex)); + output.emitLatencyMarker(new LatencyMarker(System.currentTimeMillis(), operatorId, subtaskIndex)); Review comment: I read the test case carefully, and found that this test case is use to verify ```SystemProcessingTimeService``` , but we don't accumulate a fixed time interval periodicity to producer millisecond time. so the test logic is not useful, i remove this test again,if keep it,will cause BUILD FAILURE. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-11058) FlinkKafkaProducer011 fails when kafka broker crash
[ https://issues.apache.org/jira/browse/FLINK-11058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798426#comment-16798426 ] Zack Bartel commented on FLINK-11058: - Hi [~till.rohrmann], Ideally the job would continue if there are more brokers in the cluster. > FlinkKafkaProducer011 fails when kafka broker crash > --- > > Key: FLINK-11058 > URL: https://issues.apache.org/jira/browse/FLINK-11058 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.6.2 > Environment: flink version:1.6.0 > >Reporter: cz >Priority: Major > > Caused by: > org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: > Could not forward element to next operator > at > org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:596) > at > org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554) > at > org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534) > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:689) > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:667) > at > org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:41) > at > org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:579) > ... 14 more > Caused by: > org.apache.flink.streaming.connectors.kafka.FlinkKafka011Exception: Failed to > send data to Kafka: Expiring 1 record(s) for topic-5: 30091 ms has passed > since batch creation plus linger time > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011.checkErroneous(FlinkKafkaProducer011.java:1010) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011.invoke(FlinkKafkaProducer011.java:614) > at > org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011.invoke(FlinkKafkaProducer011.java:93) > at > org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction.invoke(TwoPhaseCommitSinkFunction.java:219) > at > org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:56) > at > org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:579) > ... 20 more > Caused by: org.apache.kafka.common.errors.TimeoutException: Expiring 1 > record(s) for sec-yrdun-fwalarm-topic-5: 30091 ms has passed since batch > creation plus linger time -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11990) Streaming bucketing end-to-end test fail with hadoop 2.8
[ https://issues.apache.org/jira/browse/FLINK-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798408#comment-16798408 ] Yu Li commented on FLINK-11990: --- bq. The reason for this is that truncate support was first introduced in Hadoop 2.7.0 (HDFS-3107). For versions after 2.7.0 the BucketingSink will not write .valid-length files but directly truncate the file I see, then everything explains. Thanks for the clarification boss [~aljoscha] > Streaming bucketing end-to-end test fail with hadoop 2.8 > > > Key: FLINK-11990 > URL: https://issues.apache.org/jira/browse/FLINK-11990 > Project: Flink > Issue Type: Bug > Components: Connectors / Hadoop Compatibility >Reporter: Yu Li >Priority: Critical > > As titled, running the {{test_streaming_bucketing.sh}} case with hadoop 2.8 > bundles always fail, while running with 2.6 bundles could pass. > Command to run the case: > {{FLINK_DIR= flink-end-to-end-tests/run-single-test.sh > test-scripts/test_streaming_bucketing.sh skip_check_exceptions}} > The output with hadoop 2.8 > [bundle|https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.8.3-1.8.0/flink-shaded-hadoop2-uber-2.8.3-1.8.0.jar] > or [dist|http://archive.apache.org/dist/hadoop/core/hadoop-2.8.5]: > {noformat} > Starting taskexecutor daemon on host z05f06378.sqa.zth. > Waiting for job (905ae10bae4b99031e724b9c29f0ca7b) to reach terminal state > FINISHED ... > Truncating buckets > Truncating to > {noformat} > The output of the success run with hadoop 2.6 > [bundle|https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.6.5-1.8.0/flink-shaded-hadoop2-uber-2.6.5-1.8.0.jar] > or [dist|http://archive.apache.org/dist/hadoop/core/hadoop-2.6.5]: > {noformat} > Truncating > /home/jueding.ly/flink_rc_check/flink-1.8.0-src/flink-end-to-end-tests/test-scripts/temp-test-directory-06210353709/out/result3/part-3-0 > to 51250 > 1+0 records in > 1+0 records out > 51250 bytes (51 kB) copied, 0.000377998 s, 136 MB/s > Truncating > /home/jueding.ly/flink_rc_check/flink-1.8.0-src/flink-end-to-end-tests/test-scripts/temp-test-directory-06210353709/out/result7/part-3-0 > to 51250 > 1+0 records in > 1+0 records out > 51250 bytes (51 kB) copied, 0.00033118 s, 155 MB/s > pass Bucketing Sink > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-10584) Add support for state retention to the Event Time versioned joins.
[ https://issues.apache.org/jira/browse/FLINK-10584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kostas Kloudas closed FLINK-10584. -- Resolution: Fixed > Add support for state retention to the Event Time versioned joins. > -- > > Key: FLINK-10584 > URL: https://issues.apache.org/jira/browse/FLINK-10584 > Project: Flink > Issue Type: Sub-task > Components: API / Table SQL >Affects Versions: 1.7.0 >Reporter: Kostas Kloudas >Assignee: Kostas Kloudas >Priority: Major > Labels: pull-request-available > > Open PR [here|https://github.com/apache/flink/pull/6871] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] flinkbot commented on issue #8031: [FLINK-9007] [kinesis] [e2e] Add Kinesis end-to-end test
flinkbot commented on issue #8031: [FLINK-9007] [kinesis] [e2e] Add Kinesis end-to-end test URL: https://github.com/apache/flink/pull/8031#issuecomment-475346824 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] tweise opened a new pull request #8031: [FLINK-9007] [kinesis] [e2e] Add Kinesis end-to-end test
tweise opened a new pull request #8031: [FLINK-9007] [kinesis] [e2e] Add Kinesis end-to-end test URL: https://github.com/apache/flink/pull/8031 Backport from master. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] azagrebin commented on a change in pull request #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment
azagrebin commented on a change in pull request #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment URL: https://github.com/apache/flink/pull/7822#discussion_r267808414 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/NetworkEnvironment.java ## @@ -88,6 +104,12 @@ private final boolean enableCreditBased; + private final boolean enableNetworkDetailedMetrics; + + private final ConcurrentHashMap allPartitions; Review comment: At the first glance, it looks like we could include task name into `ResultPartitionDeploymentDescriptor` as well in #7835 along with `executionId`. Alternatively, we could add `ShuffleTaskInfo` produced in Task from `TaskInfo` which would include debug name and execution id. Task could pass this `ShuffleTaskInfo` along with RPDDs/IGDDs to partition/gate create methods. This way we do not need to duplicate its info in all RPDDs/IGDDs, like in #7835. I would still index by `executionId` if needed (add to method at the moment before #7835), because it is more reliable and concise than task name + id used for debug. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment
flinkbot edited a comment on issue #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment URL: https://github.com/apache/flink/pull/7822#issuecomment-466952983 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ✅ 1. The [description] looks good. - Approved by @azagrebin * ✅ 2. There is [consensus] that the contribution should go into to Flink. - Approved by @azagrebin * ❓ 3. Needs [attention] from. * ✅ 4. The change fits into the overall [architecture]. - Approved by @azagrebin * ✅ 5. Overall code [quality] is good. - Approved by @azagrebin Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] azagrebin commented on issue #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment
azagrebin commented on issue #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment URL: https://github.com/apache/flink/pull/7822#issuecomment-475315173 @flinkbot approve all This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] azagrebin commented on a change in pull request #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment
azagrebin commented on a change in pull request #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment URL: https://github.com/apache/flink/pull/7822#discussion_r267852416 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/taskmanager/Task.java ## @@ -299,7 +296,7 @@ public Task( @Nonnull TaskMetricGroup metricGroup, ResultPartitionConsumableNotifier resultPartitionConsumableNotifier, PartitionProducerStateChecker partitionProducerStateChecker, - Executor executor) { + Executor executor) throws IOException { Review comment: It seems that partition/gate setup causes the exception. Usually, constructors should be able always just construct objects without doing any IO and throwing exceptions. If partition/gate had a lifecycle `setup()` interface method, which would do the same as `NetworkEnviroment.setupPartition/Gate`, then the setup could stay in the same place in `Task.run` as originally, do IO and throw exceptions while running. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Closed] (FLINK-11990) Streaming bucketing end-to-end test fail with hadoop 2.8
[ https://issues.apache.org/jira/browse/FLINK-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek closed FLINK-11990. Resolution: Not A Problem > Streaming bucketing end-to-end test fail with hadoop 2.8 > > > Key: FLINK-11990 > URL: https://issues.apache.org/jira/browse/FLINK-11990 > Project: Flink > Issue Type: Bug > Components: Connectors / Hadoop Compatibility >Reporter: Yu Li >Priority: Critical > > As titled, running the {{test_streaming_bucketing.sh}} case with hadoop 2.8 > bundles always fail, while running with 2.6 bundles could pass. > Command to run the case: > {{FLINK_DIR= flink-end-to-end-tests/run-single-test.sh > test-scripts/test_streaming_bucketing.sh skip_check_exceptions}} > The output with hadoop 2.8 > [bundle|https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.8.3-1.8.0/flink-shaded-hadoop2-uber-2.8.3-1.8.0.jar] > or [dist|http://archive.apache.org/dist/hadoop/core/hadoop-2.8.5]: > {noformat} > Starting taskexecutor daemon on host z05f06378.sqa.zth. > Waiting for job (905ae10bae4b99031e724b9c29f0ca7b) to reach terminal state > FINISHED ... > Truncating buckets > Truncating to > {noformat} > The output of the success run with hadoop 2.6 > [bundle|https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.6.5-1.8.0/flink-shaded-hadoop2-uber-2.6.5-1.8.0.jar] > or [dist|http://archive.apache.org/dist/hadoop/core/hadoop-2.6.5]: > {noformat} > Truncating > /home/jueding.ly/flink_rc_check/flink-1.8.0-src/flink-end-to-end-tests/test-scripts/temp-test-directory-06210353709/out/result3/part-3-0 > to 51250 > 1+0 records in > 1+0 records out > 51250 bytes (51 kB) copied, 0.000377998 s, 136 MB/s > Truncating > /home/jueding.ly/flink_rc_check/flink-1.8.0-src/flink-end-to-end-tests/test-scripts/temp-test-directory-06210353709/out/result7/part-3-0 > to 51250 > 1+0 records in > 1+0 records out > 51250 bytes (51 kB) copied, 0.00033118 s, 155 MB/s > pass Bucketing Sink > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] SuXingLee commented on a change in pull request #7966: [FLINK-11887][metrics] Fixed latency metrics drift apart
SuXingLee commented on a change in pull request #7966: [FLINK-11887][metrics] Fixed latency metrics drift apart URL: https://github.com/apache/flink/pull/7966#discussion_r267829161 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/StreamSource.java ## @@ -154,7 +154,7 @@ public LatencyMarksEmitter( public void onProcessingTime(long timestamp) throws Exception { try { // ProcessingTimeService callbacks are executed under the checkpointing lock - output.emitLatencyMarker(new LatencyMarker(timestamp, operatorId, subtaskIndex)); + output.emitLatencyMarker(new LatencyMarker(System.currentTimeMillis(), operatorId, subtaskIndex)); Review comment: Thanks for your review, the test case has been reverted. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-11990) Streaming bucketing end-to-end test fail with hadoop 2.8
[ https://issues.apache.org/jira/browse/FLINK-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798221#comment-16798221 ] Aljoscha Krettek commented on FLINK-11990: -- The reason for this is that truncate support was first introduced in Hadoop 2.7.0 (HDFS-3107). For versions after 2.7.0 the {{BucketingSink}} will not write {{.valid-length}} files but directly truncate the file. The test simulates this behaviour by manually enumerating the {{.valid-length}} files by looking at the log entries. For Hadoop 2.8.x {{LOG_LINES}} is empty. It seems bash is a bit strange here and will have one iteration of the loop with an empty string, that's why you see {{Truncating to}}, i.e. its truncating nothing. That's also why you see the output from {{mv}} and {{rm}}. > Streaming bucketing end-to-end test fail with hadoop 2.8 > > > Key: FLINK-11990 > URL: https://issues.apache.org/jira/browse/FLINK-11990 > Project: Flink > Issue Type: Bug > Components: Connectors / Hadoop Compatibility >Reporter: Yu Li >Priority: Critical > > As titled, running the {{test_streaming_bucketing.sh}} case with hadoop 2.8 > bundles always fail, while running with 2.6 bundles could pass. > Command to run the case: > {{FLINK_DIR= flink-end-to-end-tests/run-single-test.sh > test-scripts/test_streaming_bucketing.sh skip_check_exceptions}} > The output with hadoop 2.8 > [bundle|https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.8.3-1.8.0/flink-shaded-hadoop2-uber-2.8.3-1.8.0.jar] > or [dist|http://archive.apache.org/dist/hadoop/core/hadoop-2.8.5]: > {noformat} > Starting taskexecutor daemon on host z05f06378.sqa.zth. > Waiting for job (905ae10bae4b99031e724b9c29f0ca7b) to reach terminal state > FINISHED ... > Truncating buckets > Truncating to > {noformat} > The output of the success run with hadoop 2.6 > [bundle|https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.6.5-1.8.0/flink-shaded-hadoop2-uber-2.6.5-1.8.0.jar] > or [dist|http://archive.apache.org/dist/hadoop/core/hadoop-2.6.5]: > {noformat} > Truncating > /home/jueding.ly/flink_rc_check/flink-1.8.0-src/flink-end-to-end-tests/test-scripts/temp-test-directory-06210353709/out/result3/part-3-0 > to 51250 > 1+0 records in > 1+0 records out > 51250 bytes (51 kB) copied, 0.000377998 s, 136 MB/s > Truncating > /home/jueding.ly/flink_rc_check/flink-1.8.0-src/flink-end-to-end-tests/test-scripts/temp-test-directory-06210353709/out/result7/part-3-0 > to 51250 > 1+0 records in > 1+0 records out > 51250 bytes (51 kB) copied, 0.00033118 s, 155 MB/s > pass Bucketing Sink > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] flinkbot edited a comment on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard
flinkbot edited a comment on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard URL: https://github.com/apache/flink/pull/8016#issuecomment-474744643 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ✅ 1. The [description] looks good. - Approved by @rmetzger [PMC] * ✅ 2. There is [consensus] that the contribution should go into to Flink. - Approved by @rmetzger [PMC] * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] rmetzger commented on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard
rmetzger commented on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard URL: https://github.com/apache/flink/pull/8016#issuecomment-475287672 @flinkbot approve description consensus This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] rmetzger commented on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard
rmetzger commented on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard URL: https://github.com/apache/flink/pull/8016#issuecomment-475287430 Thanks a lot for addressing my feedback so quickly! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] azagrebin commented on a change in pull request #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment
azagrebin commented on a change in pull request #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment URL: https://github.com/apache/flink/pull/7822#discussion_r267825533 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/NetworkEnvironment.java ## @@ -202,28 +233,97 @@ public TaskKvStateRegistry createKvStateTaskRegistry(JobID jobId, JobVertexID jo return kvStateRegistry.createTaskRegistry(jobId, jobVertexId); } + public ResultPartition[] getResultPartitionWriters(String taskId) { + return allPartitions.get(taskId); + } + + public SingleInputGate[] getInputGates(String taskId) { + return allInputGates.get(taskId); + } + // - // Task operations + // Create Output Writers and Input Readers // - public void registerTask(Task task) throws IOException { - final ResultPartition[] producedPartitions = task.getProducedPartitions(); + public ResultPartitionWriter[] createResultPartitionWriters( + String taskId, + JobID jobId, + ExecutionAttemptID executionId, + TaskActions taskActions, + ResultPartitionConsumableNotifier partitionConsumableNotifier, + TaskIOMetricGroup metrics, + Collection resultPartitionDeploymentDescriptors) throws IOException { + + if (isShutdown) { + throw new IllegalStateException("NetworkEnvironment is shut down"); + } - synchronized (lock) { - if (isShutdown) { - throw new IllegalStateException("NetworkEnvironment is shut down"); - } + ResultPartition[] parititons = new ResultPartition[resultPartitionDeploymentDescriptors.size()]; + int counter = 0; + for (ResultPartitionDeploymentDescriptor rpdd : resultPartitionDeploymentDescriptors) { + parititons[counter] = new ResultPartition( + taskId, + taskActions, + jobId, + new ResultPartitionID(rpdd.getPartitionId(), executionId), + rpdd.getPartitionType(), + rpdd.getNumberOfSubpartitions(), + rpdd.getMaxParallelism(), + resultPartitionManager, + partitionConsumableNotifier, + ioManager, + rpdd.sendScheduleOrUpdateConsumersMessage()); + + setupPartition(parititons[counter]); + counter++; + } - for (final ResultPartition partition : producedPartitions) { - setupPartition(partition); + allPartitions.put(taskId, parititons); + + metrics.initializeOutputBufferMetrics(parititons); Review comment: Actually. looking more into `ResultPartitionManager`, we use indexing by execution id eventually only from task and task already knows its partitions/gates. Maybe, if we move all these partition/gate release/close for-loops from `NetworkEnvironment.unregisterTask` and `ResultPartitionManager.releasePartitionsProducedBy` to `Task`, this would help further to simplify future ShuffleService. If needed, netty `ResultPartition` could remove itself from `ResultPartitionManager` in `release()` method. This would be one step forward to get rid of "by execution id" indexing in `NetworkEnvironment`/`ResultPartitionManager`. Then we have only metrics with "task to partitions/gates" dependency. Here, it does not seem to be trivial. In general, looking into `TaskIOMetricGroup.initializeOutputBufferMetrics`, `createResultPartitionWriters/gates` could just accept `ProxyMetricGroup` instead of `TaskIOMetricGroup`. Then `initializeOutputBufferMetrics` could go into `NetworkEnvironment` using `ProxyMetricGroup.addGroup` with all netty specific gauge classes. Thinking more about register "per task" VS "per partition/gate", "per task" is actually more restrictive and we do not need Shuffle API to be more flexible for Task. Other shuffle services might also benefit from this way of registration. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go
[GitHub] [flink] rmetzger commented on issue #7978: [FLINK-11910] [Yarn] add customizable yarn application type
rmetzger commented on issue #7978: [FLINK-11910] [Yarn] add customizable yarn application type URL: https://github.com/apache/flink/pull/7978#issuecomment-475286535 I've restarted the failed job, let's see This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] vthinkxie edited a comment on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard
vthinkxie edited a comment on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard URL: https://github.com/apache/flink/pull/8016#issuecomment-475276888 > Okay, thanks a lot for your feedback. > Is it possible to show new messages as a notification on the screen (similar to notifications on OSX?), so that people trying to submit a job immediately see the error? > > I will mention this PR on the dev@ list to get more feedback for it. Hi, I just disable the mouse wheel zoom behavior and add notification box on the screen now thanks a lot for your advice. https://user-images.githubusercontent.com/1506722/54763731-7580e200-4c31-11e9-9ce2-4717dae2bdc4.png;> This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] vthinkxie commented on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard
vthinkxie commented on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard URL: https://github.com/apache/flink/pull/8016#issuecomment-475277901 > I'm against removing the old UI. We can make the new one the default but for at least 1 release the old one should be kept as a backup in case some weird issue arises. This should not incur any maintenance overhead as the old UI can be kept as is and does not have to maintain feature parity with the new one. Thanks a lot for your comments It will be ok if not have to maintain feature parity with the new one, I will add the old UI back and give users the option to switch between the new and old UI later :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] azagrebin commented on a change in pull request #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment
azagrebin commented on a change in pull request #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment URL: https://github.com/apache/flink/pull/7822#discussion_r267814622 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/NetworkEnvironment.java ## @@ -421,6 +509,25 @@ public void shutdown() { LOG.warn("Network buffer pool did not shut down properly.", t); } + for (ResultPartition[] resultPartitions : allPartitions.values()) { + for (ResultPartition partition : resultPartitions) { + partition.destroyBufferPool(); Review comment: Looks like, these partition/gates releases were supposed to happen on task cancel or finish. Do you think we should have them again here? to double-check everything is released? :) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] vthinkxie commented on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard
vthinkxie commented on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard URL: https://github.com/apache/flink/pull/8016#issuecomment-475276888 > Okay, thanks a lot for your feedback. > Is it possible to show new messages as a notification on the screen (similar to notifications on OSX?), so that people trying to submit a job immediately see the error? > > I will mention this PR on the dev@ list to get more feedback for it. Hi, I just disable the mouse wheel zoom behavior and add notification box on the screen now thanks a lot for your advice. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zentol commented on issue #7980: [FLINK-11913] Shadding cassandra driver dependencies in cassandra conector
zentol commented on issue #7980: [FLINK-11913] Shadding cassandra driver dependencies in cassandra conector URL: https://github.com/apache/flink/pull/7980#issuecomment-475273289 So far we never relocated dependencies that are visible to users. It makes it impossible for users to know which version of the dependency they are working against without looking at the source code. It is also a terrible flink dev experience since the IDE aren't aware of the relocations, and one has to be careful to apply the same relocation in every module that uses the connector. Finally, doing so now would break the API. As such I'm currently inclined to reject the PR. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-6116) Watermarks don't work when unioning with same DataStream
[ https://issues.apache.org/jira/browse/FLINK-6116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-6116: -- Labels: pull-request-available (was: ) > Watermarks don't work when unioning with same DataStream > > > Key: FLINK-6116 > URL: https://issues.apache.org/jira/browse/FLINK-6116 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.2.0, 1.3.0 >Reporter: Aljoscha Krettek >Priority: Critical > Labels: pull-request-available > > In this example job we don't get any watermarks in the {{WatermarkObserver}}: > {code} > public class WatermarkTest { > public static void main(String[] args) throws Exception { > final StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > > env.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime); > env.getConfig().setAutoWatermarkInterval(1000); > env.setParallelism(1); > DataStreamSource input = env.addSource(new > SourceFunction() { > @Override > public void run(SourceContext ctx) throws > Exception { > while (true) { > ctx.collect("hello!"); > Thread.sleep(800); > } > } > @Override > public void cancel() { > } > }); > input.union(input) > .flatMap(new IdentityFlatMap()) > .transform("WatermarkOp", > BasicTypeInfo.STRING_TYPE_INFO, new WatermarkObserver()); > env.execute(); > } > public static class WatermarkObserver > extends AbstractStreamOperator > implements OneInputStreamOperator { > @Override > public void processElement(StreamRecord element) throws > Exception { > System.out.println("GOT ELEMENT: " + element); > } > @Override > public void processWatermark(Watermark mark) throws Exception { > super.processWatermark(mark); > System.out.println("GOT WATERMARK: " + mark); > } > } > private static class IdentityFlatMap > extends RichFlatMapFunction { > @Override > public void flatMap(String value, Collector out) throws > Exception { > out.collect(value); > } > } > } > {code} > When commenting out the `union` it works. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] zentol merged pull request #8014: [hostfix][docs]replace tabs with spaces in template code
zentol merged pull request #8014: [hostfix][docs]replace tabs with spaces in template code URL: https://github.com/apache/flink/pull/8014 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zentol commented on issue #8014: [hostfix][docs]replace tabs with spaces in template code
zentol commented on issue #8014: [hostfix][docs]replace tabs with spaces in template code URL: https://github.com/apache/flink/pull/8014#issuecomment-475271712 I merged the PR since the existing docs does indeed look horrible. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] azagrebin commented on a change in pull request #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment
azagrebin commented on a change in pull request #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment URL: https://github.com/apache/flink/pull/7822#discussion_r267808414 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/NetworkEnvironment.java ## @@ -88,6 +104,12 @@ private final boolean enableCreditBased; + private final boolean enableNetworkDetailedMetrics; + + private final ConcurrentHashMap allPartitions; Review comment: At the first glance, it looks like we could include task name into `ResultPartitionDeploymentDescriptor` as well in #7835 along with `executionId`. I would still index by `executionId` if needed (add to method at the moment before #7835), because it is more reliable and concise than task name + id used for debug. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] pnowojski closed pull request #4649: [FLINK-6116] Watermarks don't work when unioning with same DataStream.
pnowojski closed pull request #4649: [FLINK-6116] Watermarks don't work when unioning with same DataStream. URL: https://github.com/apache/flink/pull/4649 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] pnowojski commented on issue #4649: [FLINK-6116] Watermarks don't work when unioning with same DataStream.
pnowojski commented on issue #4649: [FLINK-6116] Watermarks don't work when unioning with same DataStream. URL: https://github.com/apache/flink/pull/4649#issuecomment-475268686 I'm closing this since it wasn't updated for very long time. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] vthinkxie edited a comment on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard
vthinkxie edited a comment on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard URL: https://github.com/apache/flink/pull/8016#issuecomment-475266066 Hi @sorahn Thanks a lot for your comment I just add the offline tip and indefinite retry when user lost connection, you can check it on the right top. https://user-images.githubusercontent.com/1506722/54761641-6ef06b80-4c2d-11e9-875e-da316931397b.png;> This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] HuangZhenQiu commented on issue #7978: [FLINK-11910] [Yarn] add customizable yarn application type
HuangZhenQiu commented on issue #7978: [FLINK-11910] [Yarn] add customizable yarn application type URL: https://github.com/apache/flink/pull/7978#issuecomment-475266804 @rmetzger It looks there are some random test failures after rebase master. Any suggestion for this situation? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] vthinkxie commented on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard
vthinkxie commented on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard URL: https://github.com/apache/flink/pull/8016#issuecomment-475266066 Hi @sorahn Thanks a lot for your comment, I just add the offline tip and indefinite retry when user lost connection, you can check it on the right top. https://user-images.githubusercontent.com/1506722/54761641-6ef06b80-4c2d-11e9-875e-da316931397b.png;> This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] azagrebin commented on a change in pull request #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment
azagrebin commented on a change in pull request #7822: [FLINK-11726][network] Refactor the creation of ResultPartition and InputGate into NetworkEnvironment URL: https://github.com/apache/flink/pull/7822#discussion_r267794310 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/NetworkEnvironment.java ## @@ -292,40 +392,31 @@ public void unregisterTask(Task task) { LOG.debug("Unregister task {} from network environment (state: {}).", task.getTaskInfo().getTaskNameWithSubtasks(), task.getExecutionState()); - final ExecutionAttemptID executionId = task.getExecutionId(); - synchronized (lock) { if (isShutdown) { // no need to do anything when we are not operational return; } if (task.isCanceledOrFailed()) { - resultPartitionManager.releasePartitionsProducedBy(executionId, task.getFailureCause()); + resultPartitionManager.releasePartitionsProducedBy(task.getExecutionId(), task.getFailureCause()); } - for (ResultPartition partition : task.getProducedPartitions()) { + for (ResultPartition partition : allPartitions.get(task.getTaskInfo().getTaskId())) { Review comment: the method could also accept `cause` additionally to id. `cause != null` would mean failure case. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-11990) Streaming bucketing end-to-end test fail with hadoop 2.8
[ https://issues.apache.org/jira/browse/FLINK-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798106#comment-16798106 ] Yu Li commented on FLINK-11990: --- Thanks for the double check [~aljoscha]. Just ran locally with {{skip}} ahead in the command and confirmed 2.8.3 could also pass. However, I think this {{skip}} may has given some misleading information since it also ignores the error in the output of the test script. Allow me to repeat the output with hadoop 2.8.3: {noformat} Waiting for job (905ae10bae4b99031e724b9c29f0ca7b) to reach terminal state FINISHED ... Truncating buckets Truncating to {noformat} Where the relative lines of {{test_streaming_bucketing.sh}} script are: {noformat} LOG_LINES=$(grep -rnw $FLINK_DIR/log -e 'Writing valid-length file') # perform truncate on every line echo "Truncating buckets" while read -r LOG_LINE; do PART=$(echo "$LOG_LINE" | awk '{ print $10 }' FS=" ") LENGTH=$(echo "$LOG_LINE" | awk '{ print $15 }' FS=" ") echo "Truncating $PART to $LENGTH" dd if=$PART of="$PART.truncated" bs=$LENGTH count=1 rm $PART mv "$PART.truncated" $PART done <<< "$LOG_LINES" {noformat} So it means there's no message like "Writing valid-length file" in logs of the flink dist. And I think this indicates some problem happens. Wdyt? Thanks. > Streaming bucketing end-to-end test fail with hadoop 2.8 > > > Key: FLINK-11990 > URL: https://issues.apache.org/jira/browse/FLINK-11990 > Project: Flink > Issue Type: Bug > Components: Connectors / Hadoop Compatibility >Reporter: Yu Li >Priority: Critical > > As titled, running the {{test_streaming_bucketing.sh}} case with hadoop 2.8 > bundles always fail, while running with 2.6 bundles could pass. > Command to run the case: > {{FLINK_DIR= flink-end-to-end-tests/run-single-test.sh > test-scripts/test_streaming_bucketing.sh skip_check_exceptions}} > The output with hadoop 2.8 > [bundle|https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.8.3-1.8.0/flink-shaded-hadoop2-uber-2.8.3-1.8.0.jar] > or [dist|http://archive.apache.org/dist/hadoop/core/hadoop-2.8.5]: > {noformat} > Starting taskexecutor daemon on host z05f06378.sqa.zth. > Waiting for job (905ae10bae4b99031e724b9c29f0ca7b) to reach terminal state > FINISHED ... > Truncating buckets > Truncating to > {noformat} > The output of the success run with hadoop 2.6 > [bundle|https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.6.5-1.8.0/flink-shaded-hadoop2-uber-2.6.5-1.8.0.jar] > or [dist|http://archive.apache.org/dist/hadoop/core/hadoop-2.6.5]: > {noformat} > Truncating > /home/jueding.ly/flink_rc_check/flink-1.8.0-src/flink-end-to-end-tests/test-scripts/temp-test-directory-06210353709/out/result3/part-3-0 > to 51250 > 1+0 records in > 1+0 records out > 51250 bytes (51 kB) copied, 0.000377998 s, 136 MB/s > Truncating > /home/jueding.ly/flink_rc_check/flink-1.8.0-src/flink-end-to-end-tests/test-scripts/temp-test-directory-06210353709/out/result7/part-3-0 > to 51250 > 1+0 records in > 1+0 records out > 51250 bytes (51 kB) copied, 0.00033118 s, 155 MB/s > pass Bucketing Sink > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (FLINK-11990) Streaming bucketing end-to-end test fail with hadoop 2.8
[ https://issues.apache.org/jira/browse/FLINK-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798106#comment-16798106 ] Yu Li edited comment on FLINK-11990 at 3/21/19 1:47 PM: Thanks for the double check [~aljoscha]. Just ran locally with {{skip}} ahead in the command and confirmed 2.8.3 could also pass. However, I think this {{skip}} may has given some misleading information since it also ignores the error in the output of the test script. Allow me to repeat the output with hadoop 2.8.3: {noformat} Waiting for job (905ae10bae4b99031e724b9c29f0ca7b) to reach terminal state FINISHED ... Truncating buckets Truncating to {noformat} Where the relative lines of {{test_streaming_bucketing.sh}} script are: {noformat} LOG_LINES=$(grep -rnw $FLINK_DIR/log -e 'Writing valid-length file') # perform truncate on every line echo "Truncating buckets" while read -r LOG_LINE; do PART=$(echo "$LOG_LINE" | awk '{ print $10 }' FS=" ") LENGTH=$(echo "$LOG_LINE" | awk '{ print $15 }' FS=" ") echo "Truncating $PART to $LENGTH" dd if=$PART of="$PART.truncated" bs=$LENGTH count=1 rm $PART mv "$PART.truncated" $PART done <<< "$LOG_LINES" {noformat} So it means there's no message like "Writing valid-length file" in flink task logs. And I think this indicates some problem happens. Wdyt? Thanks. was (Author: carp84): Thanks for the double check [~aljoscha]. Just ran locally with {{skip}} ahead in the command and confirmed 2.8.3 could also pass. However, I think this {{skip}} may has given some misleading information since it also ignores the error in the output of the test script. Allow me to repeat the output with hadoop 2.8.3: {noformat} Waiting for job (905ae10bae4b99031e724b9c29f0ca7b) to reach terminal state FINISHED ... Truncating buckets Truncating to {noformat} Where the relative lines of {{test_streaming_bucketing.sh}} script are: {noformat} LOG_LINES=$(grep -rnw $FLINK_DIR/log -e 'Writing valid-length file') # perform truncate on every line echo "Truncating buckets" while read -r LOG_LINE; do PART=$(echo "$LOG_LINE" | awk '{ print $10 }' FS=" ") LENGTH=$(echo "$LOG_LINE" | awk '{ print $15 }' FS=" ") echo "Truncating $PART to $LENGTH" dd if=$PART of="$PART.truncated" bs=$LENGTH count=1 rm $PART mv "$PART.truncated" $PART done <<< "$LOG_LINES" {noformat} So it means there's no message like "Writing valid-length file" in logs of the flink dist. And I think this indicates some problem happens. Wdyt? Thanks. > Streaming bucketing end-to-end test fail with hadoop 2.8 > > > Key: FLINK-11990 > URL: https://issues.apache.org/jira/browse/FLINK-11990 > Project: Flink > Issue Type: Bug > Components: Connectors / Hadoop Compatibility >Reporter: Yu Li >Priority: Critical > > As titled, running the {{test_streaming_bucketing.sh}} case with hadoop 2.8 > bundles always fail, while running with 2.6 bundles could pass. > Command to run the case: > {{FLINK_DIR= flink-end-to-end-tests/run-single-test.sh > test-scripts/test_streaming_bucketing.sh skip_check_exceptions}} > The output with hadoop 2.8 > [bundle|https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.8.3-1.8.0/flink-shaded-hadoop2-uber-2.8.3-1.8.0.jar] > or [dist|http://archive.apache.org/dist/hadoop/core/hadoop-2.8.5]: > {noformat} > Starting taskexecutor daemon on host z05f06378.sqa.zth. > Waiting for job (905ae10bae4b99031e724b9c29f0ca7b) to reach terminal state > FINISHED ... > Truncating buckets > Truncating to > {noformat} > The output of the success run with hadoop 2.6 > [bundle|https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.6.5-1.8.0/flink-shaded-hadoop2-uber-2.6.5-1.8.0.jar] > or [dist|http://archive.apache.org/dist/hadoop/core/hadoop-2.6.5]: > {noformat} > Truncating > /home/jueding.ly/flink_rc_check/flink-1.8.0-src/flink-end-to-end-tests/test-scripts/temp-test-directory-06210353709/out/result3/part-3-0 > to 51250 > 1+0 records in > 1+0 records out > 51250 bytes (51 kB) copied, 0.000377998 s, 136 MB/s > Truncating > /home/jueding.ly/flink_rc_check/flink-1.8.0-src/flink-end-to-end-tests/test-scripts/temp-test-directory-06210353709/out/result7/part-3-0 > to 51250 > 1+0 records in > 1+0 records out > 51250 bytes (51 kB) copied, 0.00033118 s, 155 MB/s > pass Bucketing Sink > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-11990) Streaming bucketing end-to-end test fail with hadoop 2.8
[ https://issues.apache.org/jira/browse/FLINK-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798081#comment-16798081 ] Aljoscha Krettek commented on FLINK-11990: -- For me, this works with both the shaded Hadoop 2.8.3 and shaded Hadoop 2.4.1. However, the command I have to use is: {code} FLINK_DIR=build-target flink-end-to-end-tests/run-single-test.sh skip "flink-end-to-end-tests/test-scripts/test_streaming_bucketing.sh" {code} Note how *skip* is the first parameter of {{run-single-test.sh}}. Without that it fails because of log checking. > Streaming bucketing end-to-end test fail with hadoop 2.8 > > > Key: FLINK-11990 > URL: https://issues.apache.org/jira/browse/FLINK-11990 > Project: Flink > Issue Type: Bug > Components: Connectors / Hadoop Compatibility >Reporter: Yu Li >Priority: Critical > > As titled, running the {{test_streaming_bucketing.sh}} case with hadoop 2.8 > bundles always fail, while running with 2.6 bundles could pass. > Command to run the case: > {{FLINK_DIR= flink-end-to-end-tests/run-single-test.sh > test-scripts/test_streaming_bucketing.sh skip_check_exceptions}} > The output with hadoop 2.8 > [bundle|https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.8.3-1.8.0/flink-shaded-hadoop2-uber-2.8.3-1.8.0.jar] > or [dist|http://archive.apache.org/dist/hadoop/core/hadoop-2.8.5]: > {noformat} > Starting taskexecutor daemon on host z05f06378.sqa.zth. > Waiting for job (905ae10bae4b99031e724b9c29f0ca7b) to reach terminal state > FINISHED ... > Truncating buckets > Truncating to > {noformat} > The output of the success run with hadoop 2.6 > [bundle|https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.6.5-1.8.0/flink-shaded-hadoop2-uber-2.6.5-1.8.0.jar] > or [dist|http://archive.apache.org/dist/hadoop/core/hadoop-2.6.5]: > {noformat} > Truncating > /home/jueding.ly/flink_rc_check/flink-1.8.0-src/flink-end-to-end-tests/test-scripts/temp-test-directory-06210353709/out/result3/part-3-0 > to 51250 > 1+0 records in > 1+0 records out > 51250 bytes (51 kB) copied, 0.000377998 s, 136 MB/s > Truncating > /home/jueding.ly/flink_rc_check/flink-1.8.0-src/flink-end-to-end-tests/test-scripts/temp-test-directory-06210353709/out/result7/part-3-0 > to 51250 > 1+0 records in > 1+0 records out > 51250 bytes (51 kB) copied, 0.00033118 s, 155 MB/s > pass Bucketing Sink > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11991) Set headers to use for CSV output
[ https://issues.apache.org/jira/browse/FLINK-11991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-11991: --- Labels: pull-request-available (was: ) > Set headers to use for CSV output > - > > Key: FLINK-11991 > URL: https://issues.apache.org/jira/browse/FLINK-11991 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Reporter: Julien Nioche >Priority: Minor > Labels: pull-request-available > Fix For: 1.9.0 > > > As discussed in > [https://stackoverflow.com/questions/54530755/flink-write-tuples-with-csv-header-into-file/54536586?noredirect=1#comment97248717_54536586], > it would be nice to be able to specify headers to print out at the beginning > of a CSV output. > I've written a patch for this and will add submit it as a PR. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] flinkbot commented on issue #8030: [FLINK-11991] Set headers to use for CSV output
flinkbot commented on issue #8030: [FLINK-11991] Set headers to use for CSV output URL: https://github.com/apache/flink/pull/8030#issuecomment-475212183 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] jniocheCF opened a new pull request #8030: [FLINK-11991] Set headers to use for CSV output
jniocheCF opened a new pull request #8030: [FLINK-11991] Set headers to use for CSV output URL: https://github.com/apache/flink/pull/8030 ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* This PR adds a new functionality allowing to specify headers for CSV files in output. ## Brief change log Added a method to the Dataset class and modified the CSVOutputFormat. ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no - The serializers: don't know - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? yes - If yes, how is the feature documented? JavaDocs This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-11781) Reject "DISABLED" as value for yarn.per-job-cluster.include-user-jar
[ https://issues.apache.org/jira/browse/FLINK-11781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek updated FLINK-11781: - Fix Version/s: (was: 1.9.0) > Reject "DISABLED" as value for yarn.per-job-cluster.include-user-jar > > > Key: FLINK-11781 > URL: https://issues.apache.org/jira/browse/FLINK-11781 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.6.4, 1.7.2, 1.8.0 >Reporter: Gary Yao >Assignee: Gary Yao >Priority: Major > Labels: pull-request-available > Fix For: 1.8.0 > > Time Spent: 20m > Remaining Estimate: 0h > > *Description* > Setting {{yarn.per-job-cluster.include-user-jar: DISABLED}} in > {{flink-conf.yaml}} is not supported (anymore). Doing so will lead to the job > jar not being on the system classpath, which is mandatory if Flink is > deployed in job mode. The job will never run. > *Expected behavior* > Documentation should reflect that setting > {{yarn.per-job-cluster.include-user-jar: DISABLED}} does not work. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11980) Improve efficiency of iterating KeySelectionListener on notification
[ https://issues.apache.org/jira/browse/FLINK-11980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek updated FLINK-11980: - Fix Version/s: (was: 1.9.0) > Improve efficiency of iterating KeySelectionListener on notification > > > Key: FLINK-11980 > URL: https://issues.apache.org/jira/browse/FLINK-11980 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 1.8.0 >Reporter: Stefan Richter >Assignee: Stefan Richter >Priority: Major > Labels: pull-request-available > Fix For: 1.8.0 > > Time Spent: 20m > Remaining Estimate: 0h > > {{KeySelectionListener}} was introduced for incremental TTL state cleanup as > a driver of the cleanup process. Listeners are notified whenever the current > key in the backend is set (i.e. for every event). The current implementation > of the collection that holds the listener is a {{HashSet}}, iterated via > `forEach` on each key change. This method comes with the overhead of creating > temporaray objects, e.g. iterators, on every invocation and even if there is > no listener registered. We should rather use an {{ArrayList}} with for-loop > iteration in this hot code path to i) minimize overhead and ii) minimize > costs for the very likely case that there is no listener at all. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11991) Set headers to use for CSV output
[ https://issues.apache.org/jira/browse/FLINK-11991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated FLINK-11991: -- Description: As discussed in [https://stackoverflow.com/questions/54530755/flink-write-tuples-with-csv-header-into-file/54536586?noredirect=1#comment97248717_54536586], it would be nice to be able to specify headers to print out at the beginning of a CSV output. I've written a patch for this and will add submit it as a PR. was: As discussed in [https://stackoverflow.com/questions/54530755/flink-write-tuples-with-csv-header-into-file/54536586?noredirect=1#comment97248717_54536586|[http://stackoverflow.com],|http://stackoverflow.com]%2C/] it would be nice to be able to specify headers to print out at the beginning of a CSV output. I've written a patch for this and will add submit it as a PR. > Set headers to use for CSV output > - > > Key: FLINK-11991 > URL: https://issues.apache.org/jira/browse/FLINK-11991 > Project: Flink > Issue Type: Improvement > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Reporter: Julien Nioche >Priority: Minor > Fix For: 1.9.0 > > > As discussed in > [https://stackoverflow.com/questions/54530755/flink-write-tuples-with-csv-header-into-file/54536586?noredirect=1#comment97248717_54536586], > it would be nice to be able to specify headers to print out at the beginning > of a CSV output. > I've written a patch for this and will add submit it as a PR. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11745) TTL end-to-end test restores from the savepoint after the job cancelation
[ https://issues.apache.org/jira/browse/FLINK-11745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek updated FLINK-11745: - Fix Version/s: (was: 1.9.0) > TTL end-to-end test restores from the savepoint after the job cancelation > - > > Key: FLINK-11745 > URL: https://issues.apache.org/jira/browse/FLINK-11745 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends, Tests >Affects Versions: 1.6.4, 1.7.2, 1.8.0, 1.9.0 >Reporter: Andrey Zagrebin >Assignee: Andrey Zagrebin >Priority: Major > Labels: pull-request-available > Fix For: 1.7.3, 1.8.0, 1.6.5 > > Time Spent: 1h > Remaining Estimate: 0h > > The state TTL end-to-end test currently cancels the first running job, takes > savepoint and starts the job again from stratch without using the savepoint. > The second job should start from the previously taken savepoint. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11950) Add missing dependencies in NOTICE file of flink-dist.
[ https://issues.apache.org/jira/browse/FLINK-11950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek updated FLINK-11950: - Fix Version/s: (was: 1.9.0) > Add missing dependencies in NOTICE file of flink-dist. > -- > > Key: FLINK-11950 > URL: https://issues.apache.org/jira/browse/FLINK-11950 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.8.0, 1.9.0 >Reporter: sunjincheng >Assignee: sunjincheng >Priority: Blocker > Labels: pull-request-available > Fix For: 1.8.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Add Missing dependencies in NOTICE file of flink-dist. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11972) The classpath is missing the `flink-shaded-hadoop2-uber-2.8.3-1.8.0.jar` JAR during the end-to-end test.
[ https://issues.apache.org/jira/browse/FLINK-11972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek updated FLINK-11972: - Fix Version/s: (was: 1.9.0) > The classpath is missing the `flink-shaded-hadoop2-uber-2.8.3-1.8.0.jar` JAR > during the end-to-end test. > > > Key: FLINK-11972 > URL: https://issues.apache.org/jira/browse/FLINK-11972 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 1.8.0, 1.9.0 >Reporter: sunjincheng >Assignee: Yu Li >Priority: Major > Labels: pull-request-available > Fix For: 1.8.0 > > Attachments: image-2019-03-21-06-26-49-787.png, screenshot-1.png > > Time Spent: 20m > Remaining Estimate: 0h > > Since the difference between 1.8.0 and 1.7.x is that 1.8.x does not put the > `hadoop-shaded` JAR integrated into the dist. It will cause an error when > the end-to-end test cannot be found with `Hadoop` Related classes, such as: > `java.lang.NoClassDefFoundError: Lorg/apache/hadoop/fs/FileSystem`. So we > need to improve the end-to-end test script, or explicitly stated in the > README, i.e. end-to-end test need to add `flink-shaded-hadoop2-uber-.jar` > to the classpath. So, we will get the exception something like: > {code:java} > [INFO] 3 instance(s) of taskexecutor are already running on > jinchengsunjcs-iMac.local. > Starting taskexecutor daemon on host jinchengsunjcs-iMac.local. > java.lang.NoClassDefFoundError: Lorg/apache/hadoop/fs/FileSystem; > at java.lang.Class.getDeclaredFields0(Native Method) > at java.lang.Class.privateGetDeclaredFields(Class.java:2583) > at java.lang.Class.getDeclaredFields(Class.java:1916) > at org.apache.flink.api.java.ClosureCleaner.clean(ClosureCleaner.java:72) > at > org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:1558) > at > org.apache.flink.streaming.api.datastream.DataStream.clean(DataStream.java:185) > at > org.apache.flink.streaming.api.datastream.DataStream.addSink(DataStream.java:1227) > at > org.apache.flink.streaming.tests.BucketingSinkTestProgram.main(BucketingSinkTestProgram.java:80) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529) > at > org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421) > at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:423) > at > org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813) > at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287) > at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213) > at > org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050) > at > org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126) > at > org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30) > at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126) > Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FileSystem > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > ... 22 more > Job () is running.{code} > So, I think we can import the test script or improve the README. > What do you think? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11992) Update Apache Parquet 1.10.1
[ https://issues.apache.org/jira/browse/FLINK-11992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-11992: --- Labels: pull-request-available (was: ) > Update Apache Parquet 1.10.1 > > > Key: FLINK-11992 > URL: https://issues.apache.org/jira/browse/FLINK-11992 > Project: Flink > Issue Type: Bug > Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) >Reporter: Fokko Driesprong >Assignee: Fokko Driesprong >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-11992) Update Apache Parquet 1.10.1
Fokko Driesprong created FLINK-11992: Summary: Update Apache Parquet 1.10.1 Key: FLINK-11992 URL: https://issues.apache.org/jira/browse/FLINK-11992 Project: Flink Issue Type: Bug Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) Reporter: Fokko Driesprong Assignee: Fokko Driesprong -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-11676) Make ResultPartitionWriter extend AutoCloseable
[ https://issues.apache.org/jira/browse/FLINK-11676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-11676. Resolution: Fixed master: d23704d9ca58f4866840f63ced1aef21dff10436 > Make ResultPartitionWriter extend AutoCloseable > --- > > Key: FLINK-11676 > URL: https://issues.apache.org/jira/browse/FLINK-11676 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Network >Reporter: zhijiang >Assignee: zhijiang >Priority: Minor > Labels: pull-request-available > Fix For: 1.9.0 > > Time Spent: 20m > Remaining Estimate: 0h > > This is a sub-task for refactoring the interface of {{ResultPartitionWriter}}. > We can make {{ResultPartitionWriter}} extend {{AutoCloseable}} in order to > replace calling {{destroyBufferPool}} in specific {{ResultPartition}}. This > is a small step for making {{Task}} reference with general > {{ResultPartitionWriter}} instead of {{ResultPartition}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-11721) Remove IOMode from NetworkEnvironment
[ https://issues.apache.org/jira/browse/FLINK-11721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-11721. Resolution: Fixed master: edebb108d0d0477efba81e55b07339755739dd39 > Remove IOMode from NetworkEnvironment > - > > Key: FLINK-11721 > URL: https://issues.apache.org/jira/browse/FLINK-11721 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Network >Reporter: zhijiang >Assignee: zhijiang >Priority: Minor > Labels: pull-request-available > Fix For: 1.9.0 > > Time Spent: 20m > Remaining Estimate: 0h > > The {{NetworkEnvironment}} would be refactored into {{NetworkShuffleService}} > created by {{ShuffleManager}} future. So the {{NetworkEnvironment}} should > only cover shuffle related components to make the preparation. > Currently {{IOMode}} is only used for getting in tests, so it is not a > necessary component in {{NetworkEnvironment to be removed directly.}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] zentol merged pull request #7767: [FLINK-11676][network] Make ResultPartitionWriter extend AutoCloseable
zentol merged pull request #7767: [FLINK-11676][network] Make ResultPartitionWriter extend AutoCloseable URL: https://github.com/apache/flink/pull/7767 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zentol merged pull request #7800: [FLINK-11721][network] Remove IOMode from NetworkEnvironment
zentol merged pull request #7800: [FLINK-11721][network] Remove IOMode from NetworkEnvironment URL: https://github.com/apache/flink/pull/7800 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Created] (FLINK-11991) Set headers to use for CSV output
Julien Nioche created FLINK-11991: - Summary: Set headers to use for CSV output Key: FLINK-11991 URL: https://issues.apache.org/jira/browse/FLINK-11991 Project: Flink Issue Type: Improvement Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile) Reporter: Julien Nioche Fix For: 1.9.0 As discussed in [https://stackoverflow.com/questions/54530755/flink-write-tuples-with-csv-header-into-file/54536586?noredirect=1#comment97248717_54536586|[http://stackoverflow.com],|http://stackoverflow.com]%2C/] it would be nice to be able to specify headers to print out at the beginning of a CSV output. I've written a patch for this and will add submit it as a PR. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11898) Support code generation for all Blink built-in functions and operators
[ https://issues.apache.org/jira/browse/FLINK-11898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu updated FLINK-11898: Summary: Support code generation for all Blink built-in functions and operators (was: Support code generation for built-in functions and operators) > Support code generation for all Blink built-in functions and operators > -- > > Key: FLINK-11898 > URL: https://issues.apache.org/jira/browse/FLINK-11898 > Project: Flink > Issue Type: New Feature > Components: SQL / Planner >Reporter: Jark Wu >Assignee: Jark Wu >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Support code generation for built-in functions and operators. > FLINK-11788 has supported some of the operators. This issue is aiming to > complement the functions and operators supported in Flink SQL. > This should inlclude: CONCAT, LIKE, SUBSTRING, UPPER, LOWER, and so on. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-11990) Streaming bucketing end-to-end test fail with hadoop 2.8
Yu Li created FLINK-11990: - Summary: Streaming bucketing end-to-end test fail with hadoop 2.8 Key: FLINK-11990 URL: https://issues.apache.org/jira/browse/FLINK-11990 Project: Flink Issue Type: Bug Components: Connectors / Hadoop Compatibility Reporter: Yu Li As titled, running the {{test_streaming_bucketing.sh}} case with hadoop 2.8 bundles always fail, while running with 2.6 bundles could pass. Command to run the case: {{FLINK_DIR= flink-end-to-end-tests/run-single-test.sh test-scripts/test_streaming_bucketing.sh skip_check_exceptions}} The output with hadoop 2.8 [bundle|https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.8.3-1.8.0/flink-shaded-hadoop2-uber-2.8.3-1.8.0.jar] or [dist|http://archive.apache.org/dist/hadoop/core/hadoop-2.8.5]: {noformat} Starting taskexecutor daemon on host z05f06378.sqa.zth. Waiting for job (905ae10bae4b99031e724b9c29f0ca7b) to reach terminal state FINISHED ... Truncating buckets Truncating to {noformat} The output of the success run with hadoop 2.6 [bundle|https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.6.5-1.8.0/flink-shaded-hadoop2-uber-2.6.5-1.8.0.jar] or [dist|http://archive.apache.org/dist/hadoop/core/hadoop-2.6.5]: {noformat} Truncating /home/jueding.ly/flink_rc_check/flink-1.8.0-src/flink-end-to-end-tests/test-scripts/temp-test-directory-06210353709/out/result3/part-3-0 to 51250 1+0 records in 1+0 records out 51250 bytes (51 kB) copied, 0.000377998 s, 136 MB/s Truncating /home/jueding.ly/flink_rc_check/flink-1.8.0-src/flink-end-to-end-tests/test-scripts/temp-test-directory-06210353709/out/result7/part-3-0 to 51250 1+0 records in 1+0 records out 51250 bytes (51 kB) copied, 0.00033118 s, 155 MB/s pass Bucketing Sink {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] sorahn edited a comment on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard
sorahn edited a comment on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard URL: https://github.com/apache/flink/pull/8016#issuecomment-475184015 >>What are the "Messages" at the top for? >It is an error message collector, any error message return from the server will display in the box now, not only when user submit a job. I stopped my local cluster while the UI was still running, and I saw the requests to /overview and /jobs/overview fail, but it did not produce any errors in the UI. Also, it stops polling if one request fails. I would expect it to keep trying indefinitely to try to recover from any transient outages. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zentol merged pull request #7192: [hotfix][test][streaming] Fix BucketStateSerializerTest failures on Windows7
zentol merged pull request #7192: [hotfix][test][streaming] Fix BucketStateSerializerTest failures on Windows7 URL: https://github.com/apache/flink/pull/7192 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zentol closed pull request #7330: [FLINK-11193] [state] repair rocksDb config Covered
zentol closed pull request #7330: [FLINK-11193] [state] repair rocksDb config Covered URL: https://github.com/apache/flink/pull/7330 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-11340) Bump commons-configuration from 1.7 to 1.10
[ https://issues.apache.org/jira/browse/FLINK-11340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-11340: - Component/s: (was: Runtime / Configuration) Build System > Bump commons-configuration from 1.7 to 1.10 > --- > > Key: FLINK-11340 > URL: https://issues.apache.org/jira/browse/FLINK-11340 > Project: Flink > Issue Type: Improvement > Components: Build System >Reporter: Fokko Driesprong >Assignee: Fokko Driesprong >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Bump commons-configuration from 1.7 to 1.10 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] zentol closed pull request #7956: fix Environment.java catalogs lose
zentol closed pull request #7956: fix Environment.java catalogs lose URL: https://github.com/apache/flink/pull/7956 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (FLINK-11900) Flink on Kubernetes sensitive about arguments placement
[ https://issues.apache.org/jira/browse/FLINK-11900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16797991#comment-16797991 ] frank wang commented on FLINK-11900: {code:java} job-cluster --job-classname --fromSavepoint --allowNonRestoredState does not pick the savepoint path and does not start from it job-cluster --fromSavepoint -n --job-classname -p works for savepoint retrieval{code} maybe you should keep them accordance, try that > Flink on Kubernetes sensitive about arguments placement > --- > > Key: FLINK-11900 > URL: https://issues.apache.org/jira/browse/FLINK-11900 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.7.2 >Reporter: Mario Georgiev >Assignee: frank wang >Priority: Major > > Hello guys, > I've discovered that when deploying the job cluster on Kubernetes, the Job > Cluster Manager seems sensitive about the placement of arguments. > For instance if i put the savepoint argument last, it never reads it. > For instance if arguments are : > {code:java} > job-cluster --job-classname --fromSavepoint > --allowNonRestoredState does not pick the savepoint path and does not start > from it > job-cluster --fromSavepoint -n --job-classname > -p works for savepoint retrieval{code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] sorahn commented on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard
sorahn commented on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard URL: https://github.com/apache/flink/pull/8016#issuecomment-475184015 >>What are the "Messages" at the top for? >It is an error message collector, any error message return from the server will display in the box now, not only when user submit a job. I stopped my local cluster while the UI was still running, and I saw the requests to /overview and /jobs/overview fail, but it did not produce any errors in the UI. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #7807: [FLINK-11723][network] Remove KvState related components from Network…
flinkbot edited a comment on issue #7807: [FLINK-11723][network] Remove KvState related components from Network… URL: https://github.com/apache/flink/pull/7807#issuecomment-466351407 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-11971) Fix `Command: start_kubernetes_if_not_ruunning failed` error
[ https://issues.apache.org/jira/browse/FLINK-11971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek updated FLINK-11971: - Fix Version/s: (was: 1.9.0) > Fix `Command: start_kubernetes_if_not_ruunning failed` error > > > Key: FLINK-11971 > URL: https://issues.apache.org/jira/browse/FLINK-11971 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 1.8.0, 1.9.0 >Reporter: sunjincheng >Assignee: Hequn Cheng >Priority: Major > Labels: pull-request-available > Fix For: 1.8.0 > > Attachments: screenshot-1.png > > Time Spent: 20m > Remaining Estimate: 0h > > When I did the end-to-end test under Mac OS, I found the following two > problems: > 1. The verification returned for different `minikube status` is not enough > for the robustness. The strings returned by different versions of different > platforms are different. the following misjudgment is caused: > When the `Command: start_kubernetes_if_not_ruunning failed` error occurs, > the `minikube` has actually started successfully. The core reason is that > there is a bug in the `test_kubernetes_embedded_job.sh` script. The error > message as follows: > !screenshot-1.png! > {code:java} > Current check logic: echo ${status} | grep -q "minikube: Running cluster: > Running kubectl: Correctly Configured" > My local info > jinchengsunjcs-iMac:flink-1.8.0 jincheng$ minikube status > host: Running > kubelet: Running > apiserver: Running > kubectl: Correctly Configured: pointing to minikube-vm at 192.168.99.101{code} > So, I think we should improve the check logic of `minikube status`, What do > you think? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] azagrebin commented on a change in pull request #7807: [FLINK-11723][network] Remove KvState related components from Network…
azagrebin commented on a change in pull request #7807: [FLINK-11723][network] Remove KvState related components from Network… URL: https://github.com/apache/flink/pull/7807#discussion_r267378917 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/KvStateService.java ## @@ -0,0 +1,172 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.taskexecutor; + +import org.apache.flink.api.common.JobID; +import org.apache.flink.queryablestate.network.stats.DisabledKvStateRequestStats; +import org.apache.flink.runtime.jobgraph.JobVertexID; +import org.apache.flink.runtime.query.KvStateClientProxy; +import org.apache.flink.runtime.query.KvStateRegistry; +import org.apache.flink.runtime.query.KvStateServer; +import org.apache.flink.runtime.query.QueryableStateUtils; +import org.apache.flink.runtime.query.TaskKvStateRegistry; +import org.apache.flink.runtime.state.internal.InternalKvState; +import org.apache.flink.util.Preconditions; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * KvState related components of each {@link TaskExecutor} instance. This service can + * create the kvState registration for a single task. + */ +public class KvStateService { + private static final Logger LOG = LoggerFactory.getLogger(KvStateService.class); + + /** Registry for {@link InternalKvState} instances. */ + private final KvStateRegistry kvStateRegistry; + + /** Server for {@link InternalKvState} requests. */ + private KvStateServer kvStateServer; + + /** Proxy for the queryable state client. */ + private KvStateClientProxy kvStateClientProxy; + + KvStateService(KvStateRegistry kvStateRegistry, KvStateServer kvStateServer, KvStateClientProxy kvStateClientProxy) { + this.kvStateRegistry = Preconditions.checkNotNull(kvStateRegistry); + this.kvStateServer = kvStateServer; + this.kvStateClientProxy = kvStateClientProxy; + } + + // + // Getter/Setter + // + + public KvStateRegistry getKvStateRegistry() { + return kvStateRegistry; + } + + public KvStateServer getKvStateServer() { + return kvStateServer; + } + + public KvStateClientProxy getKvStateClientProxy() { + return kvStateClientProxy; + } + + public TaskKvStateRegistry createKvStateTaskRegistry(JobID jobId, JobVertexID jobVertexId) { + return kvStateRegistry.createTaskRegistry(jobId, jobVertexId); + } + + // + // Start and shut down methods + // + + public void start() { Review comment: Previously start and shutdown methods were synchronised by lock in `NetworkEnviroment`. It seems KV objects are independent from connectionManager. Do we need the similar lock in start and shutdown in `KvStateService` to make them mutually exclusive as in `NetworkEnviroment`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] aljoscha commented on issue #8024: [FLINK-11971] Fix kubernetes check in end-to-end test
aljoscha commented on issue #8024: [FLINK-11971] Fix kubernetes check in end-to-end test URL: https://github.com/apache/flink/pull/8024#issuecomment-475182456 Thanks for merging! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #8022: [FLINK-11968] [runtime] Fixed SingleElementIterator and removed duplicate
flinkbot edited a comment on issue #8022: [FLINK-11968] [runtime] Fixed SingleElementIterator and removed duplicate URL: https://github.com/apache/flink/pull/8022#issuecomment-474830295 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ✅ 1. The [description] looks good. - Approved by @zentol [PMC] * ✅ 2. There is [consensus] that the contribution should go into to Flink. - Approved by @zentol [PMC] * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zentol commented on issue #8022: [FLINK-11968] [runtime] Fixed SingleElementIterator and removed duplicate
zentol commented on issue #8022: [FLINK-11968] [runtime] Fixed SingleElementIterator and removed duplicate URL: https://github.com/apache/flink/pull/8022#issuecomment-475181500 @flinkbot approve-until consensus This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Updated] (FLINK-11987) Kafka producer occasionally throws NullpointerException
[ https://issues.apache.org/jira/browse/FLINK-11987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] LIU Xiao updated FLINK-11987: - Description: We are using Flink 1.6.2 in our production environment, and kafka producer occasionally throws NullpointerException. We found in line 175 of flink/flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer011.java, NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR was created as a static variable. Then in line 837, "context.getOperatorStateStore().getUnionListState(NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR);" was called, and that leads to line 734 of flink/flink-runtime/src/main/java/org/apache/flink/runtime/state/DefaultOperatorStateBackend.java: "stateDescriptor.initializeSerializerUnlessSet(getExecutionConfig());" In function initializeSerializerUnlessSet(line 283 of flink/flink-core/src/main/java/org/apache/flink/api/common/state/StateDescriptor.java): if (serializer == null) { checkState(typeInfo != null, "no serializer and no type info"); // instantiate the serializer serializer = typeInfo.createSerializer(executionConfig); // we can drop the type info now, no longer needed typeInfo = null; } "serializer = typeInfo.createSerializer(executionConfig);" is the line which throws the exception. We think that's because multiple subtasks of the same producer in a same TaskManager share a same NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR. was: We are using Flink 1.6.2 in our production environment, and kafka producer occasionally throws NullpointerException. We found in line 175 of flink/flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer011.java, NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR was created as a static variable. Then in line 837, "context.getOperatorStateStore().getUnionListState(NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR);" was called, and that leads to line 734 of flink/flink-runtime/src/main/java/org/apache/flink/runtime/state/DefaultOperatorStateBackend.java: "stateDescriptor.initializeSerializerUnlessSet(getExecutionConfig());" In function initializeSerializerUnlessSet(line 283 of flink/flink-core/src/main/java/org/apache/flink/api/common/state/StateDescriptor.java): bq. bq. {quote}if (serializer == null) { bq. checkState(typeInfo != null, "no serializer and no type info"); bq. // instantiate the serializer bq. serializer = typeInfo.createSerializer(executionConfig); bq. // we can drop the type info now, no longer needed bq. typeInfo = null; bq. }{quote} "serializer = typeInfo.createSerializer(executionConfig);" is the line which throws the exception. We think that's because multiple subtasks of the same producer in a same TaskManager share a same NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR. > Kafka producer occasionally throws NullpointerException > --- > > Key: FLINK-11987 > URL: https://issues.apache.org/jira/browse/FLINK-11987 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.6.3, 1.6.4, 1.7.2 > Environment: Flink 1.6.2 (Standalone Cluster) > Oracle JDK 1.8u151 > Centos 7.4 >Reporter: LIU Xiao >Priority: Major > > We are using Flink 1.6.2 in our production environment, and kafka producer > occasionally throws NullpointerException. > We found in line 175 of > flink/flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer011.java, > NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR was created as a static variable. > Then in line 837, > "context.getOperatorStateStore().getUnionListState(NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR);" > was called, and that leads to line 734 of > > flink/flink-runtime/src/main/java/org/apache/flink/runtime/state/DefaultOperatorStateBackend.java: > "stateDescriptor.initializeSerializerUnlessSet(getExecutionConfig());" > In function initializeSerializerUnlessSet(line 283 of > flink/flink-core/src/main/java/org/apache/flink/api/common/state/StateDescriptor.java): > if (serializer == null) { > checkState(typeInfo != null, "no serializer and no type info"); > // instantiate the serializer > serializer = typeInfo.createSerializer(executionConfig); > // we can drop the type info now, no longer needed > typeInfo = null; > } > "serializer = typeInfo.createSerializer(executionConfig);" is the line which > throws the exception. > We think that's because multiple subtasks of the same producer in a same > TaskManager share a same NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11987) Kafka producer occasionally throws NullpointerException
[ https://issues.apache.org/jira/browse/FLINK-11987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] LIU Xiao updated FLINK-11987: - Description: We are using Flink 1.6.2 in our production environment, and kafka producer occasionally throws NullpointerException. We found in line 175 of flink/flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer011.java, NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR was created as a static variable. Then in line 837, "context.getOperatorStateStore().getUnionListState(NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR);" was called, and that leads to line 734 of flink/flink-runtime/src/main/java/org/apache/flink/runtime/state/DefaultOperatorStateBackend.java: "stateDescriptor.initializeSerializerUnlessSet(getExecutionConfig());" In function initializeSerializerUnlessSet(line 283 of flink/flink-core/src/main/java/org/apache/flink/api/common/state/StateDescriptor.java): bq. bq. {quote}if (serializer == null) { bq. checkState(typeInfo != null, "no serializer and no type info"); bq. // instantiate the serializer bq. serializer = typeInfo.createSerializer(executionConfig); bq. // we can drop the type info now, no longer needed bq. typeInfo = null; bq. }{quote} "serializer = typeInfo.createSerializer(executionConfig);" is the line which throws the exception. We think that's because multiple subtasks of the same producer in a same TaskManager share a same NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR. was: We are using Flink 1.6.2 in our production environment, and kafka producer occasionally throws NullpointerException. We found in line 175 of flink/flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer011.java, NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR was created as a static variable. Then in line 837, "context.getOperatorStateStore().getUnionListState(NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR);" was called, and that leads to line 734 of flink/flink-runtime/src/main/java/org/apache/flink/runtime/state/DefaultOperatorStateBackend.java: "stateDescriptor.initializeSerializerUnlessSet(getExecutionConfig());" In function initializeSerializerUnlessSet(line 283 of flink/flink-core/src/main/java/org/apache/flink/api/common/state/StateDescriptor.java): if (serializer == null) { checkState(typeInfo != null, "no serializer and no type info"); // instantiate the serializer serializer = typeInfo.createSerializer(executionConfig); // we can drop the type info now, no longer needed typeInfo = null; } "serializer = typeInfo.createSerializer(executionConfig);" is the line which throws the exception. We think that's because multiple subtasks of the same producer in a same TaskManager share a same NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR. > Kafka producer occasionally throws NullpointerException > --- > > Key: FLINK-11987 > URL: https://issues.apache.org/jira/browse/FLINK-11987 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.6.3, 1.6.4, 1.7.2 > Environment: Flink 1.6.2 (Standalone Cluster) > Oracle JDK 1.8u151 > Centos 7.4 >Reporter: LIU Xiao >Priority: Major > > We are using Flink 1.6.2 in our production environment, and kafka producer > occasionally throws NullpointerException. > We found in line 175 of > flink/flink-connectors/flink-connector-kafka-0.11/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer011.java, > NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR was created as a static variable. > Then in line 837, > "context.getOperatorStateStore().getUnionListState(NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR);" > was called, and that leads to line 734 of > > flink/flink-runtime/src/main/java/org/apache/flink/runtime/state/DefaultOperatorStateBackend.java: > "stateDescriptor.initializeSerializerUnlessSet(getExecutionConfig());" > In function initializeSerializerUnlessSet(line 283 of > flink/flink-core/src/main/java/org/apache/flink/api/common/state/StateDescriptor.java): > bq. > bq. {quote}if (serializer == null) { > bq. checkState(typeInfo != null, "no serializer and no type info"); > bq. // instantiate the serializer > bq. serializer = typeInfo.createSerializer(executionConfig); > bq. // we can drop the type info now, no longer needed > bq. typeInfo = null; > bq. }{quote} > "serializer = typeInfo.createSerializer(executionConfig);" is the line which > throws the exception. > We think that's because multiple subtasks of the same producer in a same > TaskManager share a same NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (FLINK-11989) Enable metric reporter modules in jdk9 runs
[ https://issues.apache.org/jira/browse/FLINK-11989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-11989. Resolution: Fixed master: 2d045731da193d28bb3e6b15fbee514ec9a0c69f > Enable metric reporter modules in jdk9 runs > --- > > Key: FLINK-11989 > URL: https://issues.apache.org/jira/browse/FLINK-11989 > Project: Flink > Issue Type: Improvement > Components: Runtime / Metrics, Travis >Affects Versions: 1.9.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.9.0 > > > The Reporter modules are currently disabled on the travis jdk9 jobs as we ran > into some issues in the MetricRegistry that prevented them from suceeding. > It appears that this issue no longer exists, so let's enable them again. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-11989) Enable metric reporter modules in jdk9 runs
Chesnay Schepler created FLINK-11989: Summary: Enable metric reporter modules in jdk9 runs Key: FLINK-11989 URL: https://issues.apache.org/jira/browse/FLINK-11989 Project: Flink Issue Type: Improvement Components: Runtime / Metrics, Travis Affects Versions: 1.9.0 Reporter: Chesnay Schepler Assignee: Chesnay Schepler Fix For: 1.9.0 The Reporter modules are currently disabled on the travis jdk9 jobs as we ran into some issues in the MetricRegistry that prevented them from suceeding. It appears that this issue no longer exists, so let's enable them again. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] flinkbot commented on issue #8029: [FLINK-11898] [table-planner-blink] Support code generation for all Blink built-in functions and operators
flinkbot commented on issue #8029: [FLINK-11898] [table-planner-blink] Support code generation for all Blink built-in functions and operators URL: https://github.com/apache/flink/pull/8029#issuecomment-475179633 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Assigned] (FLINK-11900) Flink on Kubernetes sensitive about arguments placement
[ https://issues.apache.org/jira/browse/FLINK-11900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] frank wang reassigned FLINK-11900: -- Assignee: frank wang > Flink on Kubernetes sensitive about arguments placement > --- > > Key: FLINK-11900 > URL: https://issues.apache.org/jira/browse/FLINK-11900 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.7.2 >Reporter: Mario Georgiev >Assignee: frank wang >Priority: Major > > Hello guys, > I've discovered that when deploying the job cluster on Kubernetes, the Job > Cluster Manager seems sensitive about the placement of arguments. > For instance if i put the savepoint argument last, it never reads it. > For instance if arguments are : > {code:java} > job-cluster --job-classname --fromSavepoint > --allowNonRestoredState does not pick the savepoint path and does not start > from it > job-cluster --fromSavepoint -n --job-classname > -p works for savepoint retrieval{code} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-11898) Support code generation for built-in functions and operators
[ https://issues.apache.org/jira/browse/FLINK-11898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-11898: --- Labels: pull-request-available (was: ) > Support code generation for built-in functions and operators > > > Key: FLINK-11898 > URL: https://issues.apache.org/jira/browse/FLINK-11898 > Project: Flink > Issue Type: New Feature > Components: SQL / Planner >Reporter: Jark Wu >Assignee: Jark Wu >Priority: Major > Labels: pull-request-available > > Support code generation for built-in functions and operators. > FLINK-11788 has supported some of the operators. This issue is aiming to > complement the functions and operators supported in Flink SQL. > This should inlclude: CONCAT, LIKE, SUBSTRING, UPPER, LOWER, and so on. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] wuchong opened a new pull request #8029: [FLINK-11898] [table-planner-blink] Support code generation for all Blink built-in functions and operators
wuchong opened a new pull request #8029: [FLINK-11898] [table-planner-blink] Support code generation for all Blink built-in functions and operators URL: https://github.com/apache/flink/pull/8029 ## What is the purpose of the change Support code generation for all Blink built-in functions and operators. ## Brief change log changes to `flink-table-planner-blink` - Add `BinaryStringCallGen` include `generate...` methods for BinaryString. - Add `ScalarFunctionGens` for generate function using `BuiltinMethods`. - Register all the Blink built-in SqlFunctions to `FlinkSqlOperatorTable` changes to `flink-table-runtime-blink` - Add `SqlDateTimeUtils` to place all the date time related methods. - Add `SqlFunctionUtils` to place all the other builtin methods. - Add Calcite Avatica dependency to `flink-table-runtime-blink` because we need Avatica's `TimeUnitRange` and `DateTimeUtils` when in runtime `SqlDateTimeUtils`. ## Verifying this change - We add a lot of expression tests under `org/apache/flink/table/expressions` to test built-in functions. - Move all the IT tests in `BuiltinScalarFunctionITCase` (in blink repository) to `ScalarFunctionsTest` unit tests to save test time. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (not applicable) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] zentol merged pull request #7934: [FLINK-11853] [rest] Minimal implementation of POST for /jars/:jarid/plan
zentol merged pull request #7934: [FLINK-11853] [rest] Minimal implementation of POST for /jars/:jarid/plan URL: https://github.com/apache/flink/pull/7934 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Closed] (FLINK-11786) Merge cron branches into respective release branches
[ https://issues.apache.org/jira/browse/FLINK-11786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-11786. Resolution: Fixed Fix Version/s: 1.8.1 1.9.0 master: a5d64e630fd69ec4250499a56a2c6e8fa817d445 1.8: 33be6c24d4dad1e11f88aa7ad0cdbd76600e7c45 1.7: 17509d82ec0250e2e491abb86ea3b4941726d174 > Merge cron branches into respective release branches > > > Key: FLINK-11786 > URL: https://issues.apache.org/jira/browse/FLINK-11786 > Project: Flink > Issue Type: Improvement > Components: Travis >Affects Versions: 1.7.2, 1.8.0 >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.7.3, 1.9.0, 1.8.1 > > Time Spent: 20m > Remaining Estimate: 0h > > Apparently it is possible for jobs to have distinct {{script}} entries > without affecting the caching (which we need for our stage setup to work). > This allows us to merge all cron branches into the respective release branch. > So far we couldn't do it since we relied on the order of jobs to determine > what a particular job should be doing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [flink] zentol commented on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard
zentol commented on issue #8016: [FLINK-10705][web]: Rework Flink Web Dashboard URL: https://github.com/apache/flink/pull/8016#issuecomment-475161504 I'm against removing the old UI. We can make the new one the default but for at least 1 release the old one should be kept as a backup in case some weird issue arises. This should not incur any maintenance overhead as the old UI can be kept as is and does not have to maintain feature parity with the new one. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] flinkbot edited a comment on issue #7978: [FLINK-11910] [Yarn] add customizable yarn application type
flinkbot edited a comment on issue #7978: [FLINK-11910] [Yarn] add customizable yarn application type URL: https://github.com/apache/flink/pull/7978#issuecomment-472607121 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Review Progress * ✅ 1. The [description] looks good. - Approved by @rmetzger [PMC], @suez1224 [committer] * ✅ 2. There is [consensus] that the contribution should go into to Flink. - Approved by @rmetzger [PMC] * ❗ 3. Needs [attention] from. - Needs attention by @gjl * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] rmetzger commented on issue #7978: [FLINK-11910] [Yarn] add customizable yarn application type
rmetzger commented on issue #7978: [FLINK-11910] [Yarn] add customizable yarn application type URL: https://github.com/apache/flink/pull/7978#issuecomment-475159811 Thank you for addressing our comments. @flinkbot attention @GJL This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [flink] lamber-ken edited a comment on issue #7987: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null
lamber-ken edited a comment on issue #7987: [FLINK-11820][serialization] SimpleStringSchema handle message record which value is null URL: https://github.com/apache/flink/pull/7987#issuecomment-473745024 @klion26, hi, I have updated the pr as you suggest, thanks This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services