[GitHub] [flink] flinkbot edited a comment on pull request #18346: [DO NOT MERGE][FLINK-25085] Add scheduled executor service in MainThreadExecutor and close it when the server reaches termination

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18346:
URL: https://github.com/apache/flink/pull/18346#issuecomment-1011842175


   
   ## CI report:
   
   * 195e75ed899f947fbb8b535f9bfc16f501d0f86e Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29363)
 
   * d5e74a33d64d43c2d0b5c45551de399f880602ff Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29366)
 
   * b96af7d7646dae5707dbc318239a457d328e926a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011809365


   
   ## CI report:
   
   * 86046d08986f3ed691c9766d104f97163c445192 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29364)
 
   *  Unknown: [CANCELED](TBD) 
   * 111d05bc4d409c357223c238e67bcb34880a1951 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] lichang-bd commented on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


lichang-bd commented on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011883993


   @flinkbot  run azure






[GitHub] [flink] flinkbot edited a comment on pull request #18346: [DO NOT MERGE][FLINK-25085] Add scheduled executor service in MainThreadExecutor and close it when the server reaches termination

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18346:
URL: https://github.com/apache/flink/pull/18346#issuecomment-1011842175


   
   ## CI report:
   
   * 195e75ed899f947fbb8b535f9bfc16f501d0f86e Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29363)
 
   * d5e74a33d64d43c2d0b5c45551de399f880602ff Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29366)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot edited a comment on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011809365


   
   ## CI report:
   
   * 86046d08986f3ed691c9766d104f97163c445192 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29364)
 
   *  Unknown: [CANCELED](TBD) 
   * 8337b385d6b96699e33fe7668d8b34bfc38beb2d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot edited a comment on pull request #18102: [FLINK-25033][runtime] Let some scheduler components updatable

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18102:
URL: https://github.com/apache/flink/pull/18102#issuecomment-993332110


   
   ## CI report:
   
   * 114a3c714e91f1075ddc647417ff59bc1d2cc907 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29365)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] fapaul commented on a change in pull request #18302: [FLINK-25569][core] Add decomposed Sink V2 interface

2022-01-12 Thread GitBox


fapaul commented on a change in pull request #18302:
URL: https://github.com/apache/flink/pull/18302#discussion_r783706614



##
File path: 
flink-core/src/main/java/org/apache/flink/api/connector/sink2/StatefulSink.java
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.connector.sink2;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.core.io.SimpleVersionedSerializer;
+
+import java.io.IOException;
+import java.util.Collection;
+import java.util.List;
+
+/**
+ * A {@link Sink} with a stateful {@link SinkWriter}.
+ *
+ * <p>The {@link StatefulSink} needs to be serializable. All configuration should be validated
+ * eagerly. The respective sink writers are transient and will only be created in the subtasks on
+ * the taskmanagers.
+ *
+ * @param <InputT> The type of the sink's input
+ * @param <WriterStateT> The type of the sink writer's state
+ */
+@PublicEvolving
+public interface StatefulSink<InputT, WriterStateT> extends Sink<InputT> {
+
+    /**
+     * Create a {@link StatefulSinkWriter}.
+     *
+     * @param context the runtime context.
+     * @return A sink writer.
+     * @throws IOException for any failure during creation.
+     */
+    StatefulSinkWriter<InputT, WriterStateT> createWriter(InitContext context) throws IOException;
+
+    /**
+     * Create a {@link StatefulSinkWriter} from a recovered state.
+     *
+     * @param context the runtime context.
+     * @return A sink writer.
+     * @throws IOException for any failure during creation.
+     */
+    StatefulSinkWriter<InputT, WriterStateT> restoreWriter(
+            InitContext context, Collection<WriterStateT> recoveredState) throws IOException;
+
+    /**
+     * Any stateful sink needs to provide this state serializer and implement {@link
+     * StatefulSinkWriter#snapshotState(long)} properly. The respective state is used in {@link
+     * #restoreWriter(InitContext, Collection)} on recovery.
+     *
+     * @return the serializer of the writer's state type.
+     */
+    SimpleVersionedSerializer<WriterStateT> getWriterStateSerializer();

Review comment:
   Removing all these optionals was one of the intentions behind designing 
the new interfaces. Sink developers can now explicitly decide which 
functionality they want to support and implement the corresponding interfaces 
[1]. With the Sink V1 interfaces they effectively had to implement everything, 
except that some of the methods had default implementations.
   
   [1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-191%3A+Extend+unified+Sink+interface+to+support+small+file+compaction#FLIP191:ExtendunifiedSinkinterfacetosupportsmallfilecompaction-SimpleSink
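The opt-in design described above, where capabilities are expressed as sub-interfaces rather than Optional-returning methods, can be sketched with simplified stand-in types (the names below are illustrative only, not the real Flink Sink V2 API):

```java
import java.util.List;

public class CapabilityOptInSketch {

    // Minimal stand-in for a sink writer; a functional interface on purpose.
    interface SinkWriter<InputT> {
        void write(InputT element);
    }

    // The base interface: every sink can create a writer.
    interface Sink<InputT> {
        SinkWriter<InputT> createWriter();
    }

    // Statefulness is an opt-in capability: only sinks implementing this
    // sub-interface must provide restore logic; nothing is Optional.
    interface StatefulSink<InputT, StateT> extends Sink<InputT> {
        SinkWriter<InputT> restoreWriter(List<StateT> recoveredState);
    }

    // A sink that opts out of state: it implements only the base interface.
    static class PrintSink implements Sink<String> {
        public SinkWriter<String> createWriter() {
            return System.out::println;
        }
    }

    public static void main(String[] args) {
        Sink<String> sink = new PrintSink();
        // The runtime discovers capabilities via instanceof checks instead of
        // Optional-returning methods on one monolithic interface.
        System.out.println(sink instanceof StatefulSink);  // false: no state support
    }
}
```

A runtime built on this pattern can probe capabilities with `instanceof` and fail fast at job submission if a required capability is missing.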








[GitHub] [flink] fapaul commented on a change in pull request #18302: [FLINK-25569][core] Add decomposed Sink V2 interface

2022-01-12 Thread GitBox


fapaul commented on a change in pull request #18302:
URL: https://github.com/apache/flink/pull/18302#discussion_r783704812



##
File path: 
flink-core/src/main/java/org/apache/flink/api/connector/sink2/Committer.java
##
@@ -0,0 +1,103 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.api.connector.sink2;
+
+import org.apache.flink.annotation.PublicEvolving;
+
+import java.io.IOException;
+import java.util.Collection;
+
+/**
+ * The {@code Committer} is responsible for committing the data staged by the {@link
+ * TwoPhaseCommittingSink.PrecommittingSinkWriter} in the second step of a two-phase commit
+ * protocol.
+ *
+ * <p>A commit must be idempotent: If some failure occurs in Flink during commit phase, Flink will
+ * restart from previous checkpoint and re-attempt to commit all committables. Thus, some or all
+ * committables may have already been committed. These {@link CommitRequest}s must not change the
+ * external system and implementers are asked to signal {@link
+ * CommitRequest#signalAlreadyCommitted()}.
+ *
+ * @param <CommT> The type of information needed to commit the staged data
+ */
+@PublicEvolving
+public interface Committer<CommT> extends AutoCloseable {
+    /**
+     * Commit the given list of {@link CommT}.
+     *
+     * @param committables A list of commit requests staged by the sink writer.
+     * @throws IOException for reasons that may yield a complete restart of the job.
+     */
+    void commit(Collection<CommitRequest<CommT>> committables)
+            throws IOException, InterruptedException;
+
+    /**
+     * A request to commit a specific committable.
+     *
+     * @param <CommT> The type of the committable.
+     */
+    @PublicEvolving
+    interface CommitRequest<CommT> {
+
+        /** Returns the committable. */
+        CommT getCommittable();
+
+        /**
+         * Returns how many times this particular committable has been retried. Starts at 0 for
+         * the first attempt.
+         */
+        int getNumberOfRetries();
+
+        /**
+         * The commit failed for known reason and should not be retried.
+         *
+         * <p>Currently calling this method only logs the error, discards the committable and
+         * continues. In the future the behaviour might be configurable.
+         */
+        void signalFailedWithKnownReason(Throwable t);
+
+        /**
+         * The commit failed for unknown reason and should not be retried.
+         *
+         * <p>Currently calling this method fails the job. In the future the behaviour might be
+         * configurable.
+         */
+        void signalFailedWithUnknownReason(Throwable t);
+
+        /**
+         * The commit failed for a retriable reason. If the sink supports a retry maximum, this
+         * may permanently fail after reaching that maximum. Else the committable will be retried
+         * as long as this method is invoked after each attempt.
+         */
+        void retryLater();

Review comment:
   So far none of our sinks sets a maximum number of retries, but we might 
consider it in the future. The retry mechanism will work internally much like 
the current implementation [1]: as soon as a committable is marked for retry, 
we enqueue it in the mailbox, which is polled "periodically", and retry it. 
The committable is also retried during the next checkpoint.
   
   [1] 
https://github.com/apache/flink/blob/dbbf2a36111da1faea5c901e3b008cc788913bf8/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/sink/CommitterOperator.java#L96
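The requeue behaviour described here can be illustrated with a toy mailbox loop. This is a sketch that mimics, not reproduces, the CommitterOperator logic, and the `maxRetries` cap is hypothetical (no current sink sets one):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class RetryQueueSketch {
    static class Request {
        final String committable;
        int retries = 0;  // analogous to CommitRequest#getNumberOfRetries()
        Request(String committable) { this.committable = committable; }
    }

    public static void main(String[] args) {
        Deque<Request> mailbox = new ArrayDeque<>();
        mailbox.add(new Request("file-1"));
        int maxRetries = 3;  // hypothetical cap; current sinks set none

        while (!mailbox.isEmpty()) {
            Request r = mailbox.poll();
            boolean committed = r.retries >= 2;  // pretend the third attempt succeeds
            if (!committed) {
                r.retries++;
                if (r.retries <= maxRetries) {
                    mailbox.add(r);  // retryLater(): requeue into the mailbox
                }
                // beyond maxRetries the committable would fail permanently
            }
        }
        System.out.println("done");
    }
}
```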








[GitHub] [flink] flinkbot edited a comment on pull request #18346: [DO NOT MERGE][FLINK-25085] Add scheduled executor service in MainThreadExecutor and close it when the server reaches termination

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18346:
URL: https://github.com/apache/flink/pull/18346#issuecomment-1011842175


   
   ## CI report:
   
   * 195e75ed899f947fbb8b535f9bfc16f501d0f86e Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29363)
 
   * d5e74a33d64d43c2d0b5c45551de399f880602ff UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot edited a comment on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011809365


   
   ## CI report:
   
   * 86046d08986f3ed691c9766d104f97163c445192 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29364)
 
   *  Unknown: [CANCELED](TBD) 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Commented] (FLINK-25615) FlinkKafkaProducer fail to correctly migrate pre Flink 1.9 state

2022-01-12 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475159#comment-17475159
 ] 

Martijn Visser commented on FLINK-25615:


[~Matthias Schwalbe] Definitely. Thanks for reaching out!

> FlinkKafkaProducer fail to correctly migrate pre Flink 1.9 state
> 
>
> Key: FLINK-25615
> URL: https://issues.apache.org/jira/browse/FLINK-25615
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.9.0
>Reporter: Matthias Schwalbe
>Priority: Major
>
> I've found an unnoticed error in FlinkKafkaProducer when migrating from pre 
> Flink 1.9 state to versions starting with Flink 1.9:
>  * the operator state for next-transactional-id-hint should be deleted and 
> replaced by operator state next-transactional-id-hint-v2, however
>  * operator state next-transactional-id-hint is never deleted
>  * see here: [1] :
> {quote}        if (context.getOperatorStateStore()
>                 .getRegisteredStateNames()
>                 .contains(NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR)) {
>             migrateNextTransactionalIdHindState(context);
>         }{quote} * migrateNextTransactionalIdHindState is never called, as 
> the condition cannot become true:
>  ** getRegisteredStateNames returns a list of String, whereas 
> NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR is ListStateDescriptor (type mismatch)
> The effect is:
>  * because NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR is for a UnionListState, and
>  * the state is not cleared,
>  * each time the job restarts from a savepoint or checkpoint, the state size 
> multiplies by the parallelism
>  * then, because each entry leaves an offset in the metadata, akka.framesize 
> becomes too small before we run into memory overflow
>  
> The breaking change has been introduced in commit 
> 70fa80e3862b367be22b593db685f9898a2838ef
>  
> A simple fix would be to change the code to:
> {quote}        if (context.getOperatorStateStore()
>                 .getRegisteredStateNames()
>                 .contains(NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR.getName())) {
>             migrateNextTransactionalIdHindState(context);
>         }
> {quote}
>  
> Although FlinkKafkaProducer is marked as deprecated, it will probably be here 
> to stay for a while
>  
> Greetings,
> Matthias (Thias) Schwalbe
>  
> [1] 
> https://github.com/apache/flink/blob/d7cf2c10f8d4fba81173854cbd8be27c657c7c7f/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer.java#L1167-L1171
>  
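The root cause is that `Collection.contains(Object)` accepts any type, so comparing a descriptor object against a set of state *names* compiles but is always false. A minimal reproduction, with `Descriptor` as a hypothetical stand-in for `ListStateDescriptor`:

```java
import java.util.Set;

public class ContainsTypeMismatch {
    // Stand-in for ListStateDescriptor: wraps a state name.
    static class Descriptor {
        private final String name;
        Descriptor(String name) { this.name = name; }
        String getName() { return name; }
    }

    public static void main(String[] args) {
        Set<String> registeredStateNames = Set.of("next-transactional-id-hint");
        Descriptor descriptor = new Descriptor("next-transactional-id-hint");

        // Bug: compiles because contains() takes Object, but a Descriptor
        // never equals a String, so the migration branch is never entered.
        System.out.println(registeredStateNames.contains(descriptor));            // false

        // Fix: compare the name, as the suggested patch does.
        System.out.println(registeredStateNames.contains(descriptor.getName()));  // true
    }
}
```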



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot edited a comment on pull request #18102: [FLINK-25033][runtime] Let some scheduler components updatable

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18102:
URL: https://github.com/apache/flink/pull/18102#issuecomment-993332110


   
   ## CI report:
   
   * 3bb15a5495d1626bded0929f766c179edfafd995 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28557)
 
   * 114a3c714e91f1075ddc647417ff59bc1d2cc907 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29365)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Updated] (FLINK-25628) Introduce RecordReader and related classes for table store

2022-01-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-25628:
---
Labels: pull-request-available  (was: )

> Introduce RecordReader and related classes for table store
> --
>
> Key: FLINK-25628
> URL: https://issues.apache.org/jira/browse/FLINK-25628
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.1.0
>
>
> - Introduce RecordReader interface: The reader that reads the batches of 
> records. 
> - Introduce SortMergeReader: This reader is to read a list of `RecordReader`, 
> which is already sorted by key and sequence number, and perform a sort merge 
> algorithm. `KeyValue` with the same key will also be combined during sort 
> merging.
> - Introduce ConcatRecordReader: This reader is to concatenate a list of 
> `RecordReader` and read them sequentially. The input list is already sorted 
> by key and sequence number, and the key intervals do not overlap each other.
> - Introduce FieldStats: Statistics for each field.
> - Introduce SstPathFactory: Factory which produces new Path for sst files.
> - Introduce SstFile and SstFileMeta: This SstFile includes several 
> `KeyValue`, representing the changes inserted into the file storage.





[jira] [Updated] (FLINK-25628) Introduce RecordReader and related classes for table store

2022-01-12 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-25628:
-
Description: 
- Introduce RecordReader interface: the reader that reads batches of records.
- Introduce SortMergeReader: reads a list of `RecordReader`s that are already 
sorted by key and sequence number, and performs a sort-merge algorithm. 
`KeyValue`s with the same key are combined during sort merging.
- Introduce ConcatRecordReader: concatenates a list of `RecordReader`s and 
reads them sequentially. The input list is already sorted by key and sequence 
number, and the key intervals do not overlap.
- Introduce FieldStats: statistics for each field.
- Introduce SstPathFactory: a factory that produces new `Path`s for sst files.
- Introduce SstFile and SstFileMeta: an `SstFile` contains several `KeyValue`s, 
representing the changes inserted into the file storage.

> Introduce RecordReader and related classes for table store
> --
>
> Key: FLINK-25628
> URL: https://issues.apache.org/jira/browse/FLINK-25628
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
> Fix For: table-store-0.1.0
>
>
> - Introduce RecordReader interface: The reader that reads the batches of 
> records. 
> - Introduce SortMergeReader: This reader is to read a list of `RecordReader`, 
> which is already sorted by key and sequence number, and perform a sort merge 
> algorithm. `KeyValue` with the same key will also be combined during sort 
> merging.
> - Introduce ConcatRecordReader: This reader is to concatenate a list of 
> `RecordReader` and read them sequentially. The input list is already sorted 
> by key and sequence number, and the key intervals do not overlap each other.
> - Introduce FieldStats: Statistics for each field.
> - Introduce SstPathFactory: Factory which produces new Path for sst files.
> - Introduce SstFile and SstFileMeta: This SstFile includes several 
> `KeyValue`, representing the changes inserted into the file storage.





[GitHub] [flink-web] gaoyunhaii closed pull request #499: Rebuild website

2022-01-12 Thread GitBox


gaoyunhaii closed pull request #499:
URL: https://github.com/apache/flink-web/pull/499


   






[GitHub] [flink-table-store] JingsongLi opened a new pull request #4: [FLINK-25628] Introduce RecordReader and related classes for table store

2022-01-12 Thread GitBox


JingsongLi opened a new pull request #4:
URL: https://github.com/apache/flink-table-store/pull/4


   - Introduce RecordReader interface: the reader that reads batches of 
records.
   - Introduce SortMergeReader: reads a list of `RecordReader`s that are 
already sorted by key and sequence number, and performs a sort-merge 
algorithm. `KeyValue`s with the same key are combined during sort merging.
   - Introduce ConcatRecordReader: concatenates a list of `RecordReader`s and 
reads them sequentially. The input list is already sorted by key and sequence 
number, and the key intervals do not overlap.
   - Introduce FieldStats: statistics for each field.
   - Introduce SstPathFactory: a factory that produces new `Path`s for sst files.
   - Introduce SstFile and SstFileMeta: an `SstFile` contains several 
`KeyValue`s, representing the changes inserted into the file storage.
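The ConcatRecordReader idea can be sketched with plain iterators standing in for the `RecordReader` interface introduced by this PR (a simplified illustration, not the actual table-store code):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class ConcatSketch {
    // Lazily chains iterators; each input is exhausted before the next starts.
    // Because the inputs' key intervals do not overlap, the concatenation of
    // sorted inputs is itself sorted.
    static <T> Iterator<T> concat(List<Iterator<T>> inputs) {
        Iterator<Iterator<T>> outer = inputs.iterator();
        return new Iterator<T>() {
            Iterator<T> current = Collections.emptyIterator();
            public boolean hasNext() {
                while (!current.hasNext() && outer.hasNext()) {
                    current = outer.next();
                }
                return current.hasNext();
            }
            public T next() { return current.next(); }
        };
    }

    public static void main(String[] args) {
        // Key intervals do not overlap: [1, 2] strictly precedes [3, 4].
        Iterator<Integer> it = concat(List.of(
                List.of(1, 2).iterator(), List.of(3, 4).iterator()));
        List<Integer> out = new ArrayList<>();
        it.forEachRemaining(out::add);
        System.out.println(out);  // [1, 2, 3, 4]
    }
}
```

SortMergeReader differs in that its inputs may overlap, so it must repeatedly pick the smallest head element across all readers (typically via a priority queue) instead of simply chaining them.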






[jira] [Closed] (FLINK-25627) Add basic structures of file store in table-store

2022-01-12 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-25627.

Resolution: Fixed

master:

40b8a57e71ca833bb5e7045d06b4621010762435

bda887cec0b78f991663e140b26888c507964664

8845586c37550b6b56446ad8208b732650fe426a

> Add basic structures of file store in table-store
> -
>
> Key: FLINK-25627
> URL: https://issues.apache.org/jira/browse/FLINK-25627
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.1.0
>
>
> Add basic structures of file store in table-store:
>  * Add OffsetRowData
>  * Add KeyValue and KeyValueSerializer
>  * Add MemTable and Accumulator





[GitHub] [flink] flinkbot edited a comment on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011809365


   
   ## CI report:
   
   * 86046d08986f3ed691c9766d104f97163c445192 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29364)
 
   *  Unknown: [CANCELED](TBD) 
   * 8337b385d6b96699e33fe7668d8b34bfc38beb2d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot edited a comment on pull request #18102: [FLINK-25033][runtime] Let some scheduler components updatable

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18102:
URL: https://github.com/apache/flink/pull/18102#issuecomment-993332110


   
   ## CI report:
   
   * 3bb15a5495d1626bded0929f766c179edfafd995 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=28557)
 
   * 114a3c714e91f1075ddc647417ff59bc1d2cc907 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Assigned] (FLINK-25638) Increase the default write buffer size of sort-shuffle to 16M

2022-01-12 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao reassigned FLINK-25638:
---

Assignee: Yingjie Cao

> Increase the default write buffer size of sort-shuffle to 16M
> -
>
> Key: FLINK-25638
> URL: https://issues.apache.org/jira/browse/FLINK-25638
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Yingjie Cao
>Assignee: Yingjie Cao
>Priority: Major
> Fix For: 1.15.0
>
>
> As discussed in 
> [https://lists.apache.org/thread/pt2b1f17x2l5rlvggwxs6m265lo4ly7p], this 
> ticket aims to increase the default write buffer size of sort-shuffle to 16M.





[jira] [Assigned] (FLINK-25637) Make sort-shuffle the default shuffle implementation for batch jobs

2022-01-12 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao reassigned FLINK-25637:
---

Assignee: Yingjie Cao

> Make sort-shuffle the default shuffle implementation for batch jobs
> ---
>
> Key: FLINK-25637
> URL: https://issues.apache.org/jira/browse/FLINK-25637
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Yingjie Cao
>Assignee: Yingjie Cao
>Priority: Major
> Fix For: 1.15.0
>
>
> As discussed in 
> [https://lists.apache.org/thread/pt2b1f17x2l5rlvggwxs6m265lo4ly7p], this 
> ticket aims to make sort-shuffle the default shuffle implementation for batch 
> jobs.





[jira] [Assigned] (FLINK-25639) Increase the default read buffer size of sort-shuffle to 64M

2022-01-12 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao reassigned FLINK-25639:
---

Assignee: Yingjie Cao

> Increase the default read buffer size of sort-shuffle to 64M
> 
>
> Key: FLINK-25639
> URL: https://issues.apache.org/jira/browse/FLINK-25639
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Yingjie Cao
>Assignee: Yingjie Cao
>Priority: Major
> Fix For: 1.15.0
>
>
> As discussed in 
> [https://lists.apache.org/thread/pt2b1f17x2l5rlvggwxs6m265lo4ly7p], this 
> ticket aims to increase the default read buffer size of sort-shuffle to 64M.





[jira] [Updated] (FLINK-25627) Add basic structures of file store in table-store

2022-01-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-25627:
---
Labels: pull-request-available  (was: )

> Add basic structures of file store in table-store
> -
>
> Key: FLINK-25627
> URL: https://issues.apache.org/jira/browse/FLINK-25627
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.1.0
>
>
> Add basic structures of file store in table-store:
>  * Add OffsetRowData
>  * Add KeyValue and KeyValueSerializer
>  * Add MemTable and Accumulator





[GitHub] [flink-table-store] JingsongLi merged pull request #3: [FLINK-25627] Add basic structures of file store in table-store

2022-01-12 Thread GitBox


JingsongLi merged pull request #3:
URL: https://github.com/apache/flink-table-store/pull/3


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011809365


   
   ## CI report:
   
   * 86046d08986f3ed691c9766d104f97163c445192 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29364)
 
   *  Unknown: [CANCELED](TBD) 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Commented] (FLINK-24207) Add support of KeyedState in TwoPhaseCommitSinkFunction

2022-01-12 Thread zlzhang0122 (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475155#comment-17475155
 ] 

zlzhang0122 commented on FLINK-24207:
-

[~roman]  ok, I will check for this, thx!!

> Add support of KeyedState in TwoPhaseCommitSinkFunction
> ---
>
> Key: FLINK-24207
> URL: https://issues.apache.org/jira/browse/FLINK-24207
> Project: Flink
>  Issue Type: New Feature
>  Components: API / DataStream
>Affects Versions: 1.12.2, 1.13.1
>Reporter: zlzhang0122
>Priority: Major
>
> Now, the implementation of TwoPhaseCommitSinkFunction is based on operator 
> state, but operator state does a deep copy when taking a checkpoint, so a 
> large operator state may produce an OOM error. Adding support for KeyedState in 
> TwoPhaseCommitSinkFunction may be a good choice to avoid the OOM error and 
> give users more convenience.





[jira] [Assigned] (FLINK-25484) TableRollingPolicy do not support inactivityInterval config which is supported in datastream api

2022-01-12 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee reassigned FLINK-25484:


Assignee: LiChang

> TableRollingPolicy do not support inactivityInterval config which is 
> supported in datastream api
> 
>
> Key: FLINK-25484
> URL: https://issues.apache.org/jira/browse/FLINK-25484
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem, Table SQL / Ecosystem
>Affects Versions: 1.15.0
>Reporter: LiChang
>Assignee: LiChang
>Priority: Major
> Fix For: 1.15.0
>
> Attachments: image-2022-01-13-11-36-24-102.png
>
>
> TableRollingPolicy does not support an inactivityInterval config:
> public static class TableRollingPolicy extends CheckpointRollingPolicy {
>     private final boolean rollOnCheckpoint;
>     private final long rollingFileSize;
>     private final long rollingTimeInterval;
>
>     public TableRollingPolicy(
>             boolean rollOnCheckpoint, long rollingFileSize, long rollingTimeInterval) {
>         this.rollOnCheckpoint = rollOnCheckpoint;
>         Preconditions.checkArgument(rollingFileSize > 0L);
>         Preconditions.checkArgument(rollingTimeInterval > 0L);
>         this.rollingFileSize = rollingFileSize;
>         this.rollingTimeInterval = rollingTimeInterval;
>     }
>  
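For comparison, a minimal self-contained sketch of a rolling decision that does include an inactivity interval. All names here are illustrative, not the actual Flink API; the real policy extends CheckpointRollingPolicy and reads sizes and timestamps from PartFileInfo callbacks:

```java
// Sketch of a rolling policy decision with an inactivity interval.
// Hypothetical simplification: no Flink dependencies, plain timestamps.
public class RollingPolicySketch {
    private final long rollingFileSize;      // roll when the part file exceeds this size
    private final long rollingTimeInterval;  // roll when the file has been open this long
    private final long inactivityInterval;   // roll when no writes happened for this long

    public RollingPolicySketch(long rollingFileSize, long rollingTimeInterval, long inactivityInterval) {
        this.rollingFileSize = rollingFileSize;
        this.rollingTimeInterval = rollingTimeInterval;
        this.inactivityInterval = inactivityInterval;
    }

    /** Decide whether the in-progress part file should be rolled. */
    public boolean shouldRoll(long currentSize, long creationTime, long lastUpdateTime, long now) {
        return currentSize >= rollingFileSize
                || now - creationTime >= rollingTimeInterval
                || now - lastUpdateTime >= inactivityInterval;
    }

    public static void main(String[] args) {
        RollingPolicySketch policy = new RollingPolicySketch(128, 60_000, 10_000);
        // File is small and young, but idle for 15s -> the inactivity check triggers a roll.
        System.out.println(policy.shouldRoll(10, 1_000, 5_000, 20_000)); // prints "true"
    }
}
```

The third condition is exactly what the quoted TableRollingPolicy is missing compared to the DataStream API.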





[GitHub] [flink] flinkbot edited a comment on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011809365


   
   ## CI report:
   
   * 86046d08986f3ed691c9766d104f97163c445192 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29364)
 
   *  Unknown: [CANCELED](TBD) 
   * 8337b385d6b96699e33fe7668d8b34bfc38beb2d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot edited a comment on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011809365


   
   ## CI report:
   
   * 86046d08986f3ed691c9766d104f97163c445192 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29364)
 
   *  Unknown: [CANCELED](TBD) 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Commented] (FLINK-25057) Streaming File Sink writing to HDFS

2022-01-12 Thread hanjie (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475147#comment-17475147
 ] 

hanjie commented on FLINK-25057:


hi [~zhisheng], we can set the file part prefix, for example: _part-uuid-_

When we restart the job, the file names will not be duplicated.
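The suggestion can be sketched as follows: generate a unique part-file prefix per job start so that files from a fresh run cannot collide with leftovers from a previous run. With Flink's StreamingFileSink the prefix would typically be passed via OutputFileConfig.builder().withPartPrefix(...); to keep the example self-contained, only the prefix construction is shown here:

```java
import java.util.UUID;

// Sketch: build a unique part-file prefix per job start. Files written
// after a restart without checkpoint/savepoint then cannot collide with
// files from the previous run, because the prefix differs.
public class PartPrefixSketch {
    public static String uniquePartPrefix() {
        return "part-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        // e.g. part-3f2504e0-4f89-11d3-9a0c-0305e82c3301 (random each run)
        System.out.println(uniquePartPrefix());
    }
}
```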

 

 

> Streaming File Sink writing to  HDFS
> 
>
> Key: FLINK-25057
> URL: https://issues.apache.org/jira/browse/FLINK-25057
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem
>Affects Versions: 1.12.1
>Reporter: hanjie
>Priority: Major
>
> Env: Flink 1.12.1
> kafka --> hdfs 
> hdfs : Streaming File Sink
> When I first start the Flink task:
>     *First part file example:*
>        part-0-0
>        part-0-1
>       .part-0-2.inprogress.952eb958-dac9-4f2c-b92f-9084ed536a1c
>   I cancel the Flink task, then restart it without a savepoint or checkpoint. 
> The task runs for a while.
>    *Second part file example:*
>           part-0-0
>           part-0-1
>           .part-0-2.inprogress.952eb958-dac9-4f2c-b92f-9084ed536a1c
>           .part-0-0.inprogress.0e2f234b-042d-4232-a5f7-c980f04ca82d
>     'part-0-2.inprogress.952eb958-dac9-4f2c-b92f-9084ed536a1c' is not renamed 
> and the bucket index restarts from zero.
>      I viewed the related code: starting the task needs a savepoint or checkpoint. 
> I chose a savepoint, and the above problem disappears when I start a third test. 
>     But if I use an expired savepoint, the task throws an exception.
>      java.io.FileNotFoundException: File does not exist: 
> /ns-hotel/hotel_sa_log/stream/sa_cpc_ad_log_list_detail_dwd/2021-11-25/.part-6-1537.inprogress.cd9c756a-1756-4dc5-9325-485fe99a2803\n\tat
>  
> org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)\n\tat
>  
> org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)\n\tat
>  
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)\n\tat
>  
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)\n\tat
>  org.apache.hadoop.fs.FileSystem.resolvePath(FileSystem.java:752)\n\tat 
> org.apache.hadoop.fs.FilterFileSystem.resolvePath(FilterFileSystem.java:153)\n\tat
>  
> org.apache.hadoop.fs.viewfs.ChRootedFileSystem.resolvePath(ChRootedFileSystem.java:373)\n\tat
>  
> org.apache.hadoop.fs.viewfs.ViewFileSystem.resolvePath(ViewFileSystem.java:243)\n\tat
>  
> org.apache.flink.runtime.fs.hdfs.HadoopRecoverableFsDataOutputStream.revokeLeaseByFileSystem(HadoopRecoverableFsDataOutputStream.java:327)\n\tat
>  
> org.apache.flink.runtime.fs.hdfs.HadoopRecoverableFsDataOutputStream.safelyTruncateFile(HadoopRecoverableFsDataOutputStream.java:163)\n\tat
>  
> org.apache.flink.runtime.fs.hdfs.HadoopRecoverableFsDataOutputStream.(HadoopRecoverableFsDataOutputStream.java:88)\n\tat
>  
> org.apache.flink.runtime.fs.hdfs.HadoopRecoverableWriter.recover(HadoopRecoverableWriter.java:86)\n\tat
>  
> org.apache.flink.streaming.api.functions.sink.filesystem.OutputStreamBasedPartFileWriter$OutputStreamBasedBucketWriter.resumeInProgressFileFrom(OutputStreamBasedPartFileWriter.java:104)\n\tat
>  org.apache.flink.streaming.api.functions.sink.filesyst
>   The task sets 'execution.checkpointing.interval' to 1min, and I trigger a 
> savepoint every five minutes.
>    I would like to consult everybody for a solution.





[jira] [Created] (FLINK-25640) Enhance the document for blocking shuffle

2022-01-12 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-25640:
---

 Summary: Enhance the document for blocking shuffle
 Key: FLINK-25640
 URL: https://issues.apache.org/jira/browse/FLINK-25640
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Yingjie Cao
 Fix For: 1.15.0


As discussed in 
[https://lists.apache.org/thread/pt2b1f17x2l5rlvggwxs6m265lo4ly7p], this ticket 
aims to enhance the document for blocking shuffle and add more operation 
guidelines.





[GitHub] [flink] flinkbot edited a comment on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011809365


   
   ## CI report:
   
   * ffcee2854060f08f1f5ee3821f76823872010a15 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29359)
 
   * 86046d08986f3ed691c9766d104f97163c445192 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29364)
 
   *  Unknown: [CANCELED](TBD) 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] lichang-bd commented on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


lichang-bd commented on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011863390


   @flinkbot run azure






[GitHub] [flink-ml] lindong28 commented on pull request #55: [hotfix] Remove redundant directories

2022-01-12 Thread GitBox


lindong28 commented on pull request #55:
URL: https://github.com/apache/flink-ml/pull/55#issuecomment-1011863063


   @gaoyunhaii @zhipeng93 Could you help review this PR? Thanks.






[GitHub] [flink-ml] lindong28 opened a new pull request #55: [hotfix] Remove redundant directories

2022-01-12 Thread GitBox


lindong28 opened a new pull request #55:
URL: https://github.com/apache/flink-ml/pull/55


   ## What is the purpose of the change
   
   Remove redundant directories and make the directory structure consistent.
   
   ## Brief change log
   
   Moved `KnnTest.java` and `LogisticRegressionTest.java` to their parent 
directories.
   
   ## Verifying this change
   
   N/A
   
   ## Does this pull request potentially affect one of the following parts:
   
   - Dependencies (does it add or upgrade a dependency): (no)
   - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
   
   ## Documentation
   
   - Does this pull request introduce a new feature? (no)
   - If yes, how is the feature documented? (N/A)
   






[jira] [Created] (FLINK-25639) Increase the default read buffer size of sort-shuffle to 64M

2022-01-12 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-25639:
---

 Summary: Increase the default read buffer size of sort-shuffle to 
64M
 Key: FLINK-25639
 URL: https://issues.apache.org/jira/browse/FLINK-25639
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Network
Reporter: Yingjie Cao
 Fix For: 1.15.0


As discussed in 
[https://lists.apache.org/thread/pt2b1f17x2l5rlvggwxs6m265lo4ly7p], this ticket 
aims to increase the default read buffer size of sort-shuffle to 64M.
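Until the defaults change, the behavior this ticket (and its sibling FLINK-25637) targets can already be configured explicitly. A hedged flink-conf.yaml sketch — option names as of Flink 1.14; verify against the configuration reference for your version:

```yaml
# Use sort-shuffle for batch jobs from parallelism 1 upward,
# i.e. effectively make it the default shuffle implementation.
taskmanager.network.sort-shuffle.min-parallelism: 1

# Raise the off-heap memory used for sort-shuffle data reading to 64 MB,
# the value this ticket proposes as the new default.
taskmanager.memory.framework.off-heap.batch-shuffle.size: 64m
```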





[GitHub] [flink] flinkbot edited a comment on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011809365


   
   ## CI report:
   
   * ffcee2854060f08f1f5ee3821f76823872010a15 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29359)
 
   * 86046d08986f3ed691c9766d104f97163c445192 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29364)
 
   *  Unknown: [CANCELED](TBD) 
   * 8337b385d6b96699e33fe7668d8b34bfc38beb2d UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot edited a comment on pull request #18336: [FLINK-25441][network] Wrap failure cause with ProducerFailedException only for PipelinedSubpartitionView

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18336:
URL: https://github.com/apache/flink/pull/18336#issuecomment-1010777588


   
   ## CI report:
   
   * 1a960b7f82e85d42e2a4f5eb9e9d63a203ed9de5 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29358)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[jira] [Closed] (FLINK-25559) SQL JOIN causes data loss

2022-01-12 Thread Zongwen Li (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zongwen Li closed FLINK-25559.
--
Fix Version/s: 1.15.0
   1.13.6
   1.14.3
   Resolution: Fixed

> SQL JOIN causes data loss
> -
>
> Key: FLINK-25559
> URL: https://issues.apache.org/jira/browse/FLINK-25559
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.14.0, 1.13.2, 1.13.3, 1.13.5, 1.14.2
>Reporter: Zongwen Li
>Priority: Major
> Fix For: 1.15.0, 1.13.6, 1.14.3
>
> Attachments: image-2022-01-07-11-27-01-010.png
>
>
> {code:java}
> //sink table,omits some physical fields
> CREATE TABLE kd_product_info (
>   productId BIGINT COMMENT 'product ID',
>   productSaleId BIGINT COMMENT 'product sale ID',
>   PRIMARY KEY (productSaleId) NOT ENFORCED
> )
> // sql omits some selected fields
> INSERT INTO kd_product_info
> SELECT
>  ps.product AS productId,
>  ps.productsaleid AS productSaleId,
>  CAST(p.complex AS INT) AS complex,
>  p.createtime AS createTime,
>  p.updatetime AS updateTime,
>  p.ean AS ean,
>  ts.availablequantity AS totalAvailableStock,
> IF
>  (
>   ts.availablequantity - ts.lockoccupy - ts.lock_available_quantity > 0,
>   ts.availablequantity - ts.lockoccupy - ts.lock_available_quantity,
>   0
>  ) AS sharedStock
>  ,rps.purchase AS purchase
>  ,v.`name` AS vendorName
>  FROM
>  product_sale ps
>  JOIN product p ON ps.product = p.id
>  LEFT JOIN rate_product_sale rps ON ps.productsaleid = rps.id
>  LEFT JOIN pss_total_stock ts ON ps.productsaleid = ts.productsale
>  LEFT JOIN vendor v ON ps.merchant_id = v.merchant_id AND ps.vendor = v.vendor
>  LEFT JOIN mccategory mc ON ps.merchant_id = mc.merchant_id AND p.mccategory 
> = mc.id
>  LEFT JOIN new_mccategory nmc ON p.mccategory = nmc.mc
>  LEFT JOIN product_sale_grade_plus psgp ON ps.productsaleid = psgp.productsale
>  LEFT JOIN product_sale_extend pse359 ON ps.productsaleid = 
> pse359.product_sale AND pse359.meta = 359
>  LEFT JOIN product_image_url piu ON ps.product = piu.product {code}
> All table sources are upsert-kafka. I have ensured that the associated columns 
> are of the same type:
> {code:java}
> CREATE TABLE product_sale (
>   id BIGINT COMMENT 'primary key',
>   productsaleid BIGINT COMMENT 'product sale ID',
>   product BIGINT COMMENT 'product ID',
>   merchant_id DECIMAL(20, 0) COMMENT 'merchant ID',
>   vendor STRING COMMENT 'vendor',
>   PRIMARY KEY (productsaleid) NOT ENFORCED
> )
> // No computed columns
> // Just plain physical columns
> WITH (
> 'connector' = 'upsert-kafka',
> 'topic' = 'XXX',
> 'group.id' = '%s',
> 'properties.bootstrap.servers' = '%s',
> 'key.format' = 'json',
> 'value.format' = 'json'
> ) 
> CREATE TABLE product (
>   id BIGINT,
>   mccategory STRING,
>   PRIMARY KEY (id) NOT ENFORCED
> )
> CREATE TABLE rate_product_sale ( 
>   id BIGINT COMMENT 'primary key',
>   PRIMARY KEY (id) NOT ENFORCED
> )
> CREATE TABLE pss_total_stock (
>   id INT COMMENT 'ID',
>   productsale BIGINT COMMENT 'product sale code',
>   PRIMARY KEY (id) NOT ENFORCED
> )
> CREATE TABLE vendor (
>   merchant_id DECIMAL(20, 0) COMMENT 'merchant ID',
>   vendor STRING COMMENT 'vendor code',
>   PRIMARY KEY (merchant_id, vendor) NOT ENFORCED
> )
> CREATE TABLE mccategory (
>   id STRING COMMENT 'MC category ID',
>   merchant_id DECIMAL(20, 0) COMMENT 'merchant ID',
>   PRIMARY KEY (merchant_id, id) NOT ENFORCED
> )
> CREATE TABLE new_mccategory (
>   mc STRING,
>   PRIMARY KEY (mc) NOT ENFORCED
> )
> CREATE TABLE product_sale_grade_plus (
>   productsale BIGINT,
>   PRIMARY KEY (productsale) NOT ENFORCED
> )
> CREATE TABLE product_sale_extend (
>   id BIGINT,
>   product_sale BIGINT,
>   meta BIGINT,
>   PRIMARY KEY (id) NOT ENFORCED
> )
> CREATE TABLE product_image_url(
>   product BIGINT,
>   PRIMARY KEY (product) NOT ENFORCED 
> ){code}
> The data in each table is between 5 million and 10 million rows; parallelism: 24.
> TTL is not set; in fact, we can notice data loss within as little as 30 minutes.
>  
> The data flow:
> MySQL -> Flink CDC -> ODS (Upsert Kafka) -> the job -> sink
> I'm sure the ODS data in Kafka is correct;
> I have also tried using the flink-cdc source directly, but it did not solve the 
> problem;
>  
> We tested sinking to Kudu, Kafka, and ES,
> and tested multiple versions: 1.13.2, 1.13.3, 1.13.5, 1.14.0, 1.14.2;
> the lost data appears out of order on Kafka, which we guess is a bug in the 
> retraction stream:
> !image-2022-01-07-11-27-01-010.png!
>  
> After many tests, we found that the more tables are LEFT JOINed, or the 
> greater the operator parallelism, the more easily the data is lost.





[GitHub] [flink] lichang-bd commented on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


lichang-bd commented on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011858591


   @flinkbot run azure






[GitHub] [flink] flinkbot edited a comment on pull request #18215: [FLINK-25392][table-planner]Support new StatementSet syntax in planner and parser

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18215:
URL: https://github.com/apache/flink/pull/18215#issuecomment-1001883080


   
   ## CI report:
   
   * 795338a6386db7ede06778e819c48295463edc87 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29356)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] zlzhang0122 commented on pull request #18337: [hotfix][tests] Fix the error parameter

2022-01-12 Thread GitBox


zlzhang0122 commented on pull request #18337:
URL: https://github.com/apache/flink/pull/18337#issuecomment-1011857011


   > Please open a ticket for issue.
   
   ok, I will create a ticket for this issue.






[jira] [Commented] (FLINK-25634) flink-read-onyarn-configuration

2022-01-12 Thread zlzhang0122 (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475136#comment-17475136
 ] 

zlzhang0122 commented on FLINK-25634:
-

You can find the config yarn.application-attempt-failures-validity-interval in 
YarnConfigOptions; it is a Flink config option.

> flink-read-onyarn-configuration
> ---
>
> Key: FLINK-25634
> URL: https://issues.apache.org/jira/browse/FLINK-25634
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.11.2
>Reporter: 宇宙先生
>Priority: Major
> Attachments: image-2022-01-13-10-02-57-803.png, 
> image-2022-01-13-10-03-28-230.png, image-2022-01-13-10-07-02-908.png, 
> image-2022-01-13-10-09-36-890.png, image-2022-01-13-10-10-44-945.png, 
> image-2022-01-13-10-14-18-918.png, image-2022-01-13-14-19-43-539.png
>
>
> In the Flink source code:
> !image-2022-01-13-10-14-18-918.png!
> Set the number of retries for failed YARN ApplicationMasters/JobManagers in 
> high
>  availability mode. This value is usually limited by YARN.
> By default, it's 1 in the standalone case and 2 in the high availability case
>  
> In my cluster, the number of retries for failed YARN ApplicationMasters is 2;
> YARN's configuration also looks like this:
> !image-2022-01-13-10-07-02-908.png!
> But the task keeps restarting when it fails:
> !image-2022-01-13-10-10-44-945.png!
> I would like to know why the configuration is not taking effect.
> Sincere thanks!





[jira] [Commented] (FLINK-25615) FlinkKafkaProducer fail to correctly migrate pre Flink 1.9 state

2022-01-12 Thread Matthias Schwalbe (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475139#comment-17475139
 ] 

Matthias Schwalbe commented on FLINK-25615:
---

Thought so :)

However, it is now documented in case someone runs into the problem, which was not 
easy to locate.

... next time then :)

 

Happy Flinking

Thias

> FlinkKafkaProducer fail to correctly migrate pre Flink 1.9 state
> 
>
> Key: FLINK-25615
> URL: https://issues.apache.org/jira/browse/FLINK-25615
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.9.0
>Reporter: Matthias Schwalbe
>Priority: Major
>
> I've found an unnoticed error in FlinkKafkaProducer when migrating from pre 
> Flink 1.9 state to versions starting with Flink 1.9:
>  * the operator state for next-transactional-id-hint should be deleted and 
> replaced by operator state next-transactional-id-hint-v2, however
>  * operator state next-transactional-id-hint is never deleted
>  * see here: [1] :
> {quote}        if (context.getOperatorStateStore()
>                 .getRegisteredStateNames()
>                 .contains(NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR)) {
>             migrateNextTransactionalIdHindState(context);
>         }{quote} * migrateNextTransactionalIdHindState is never called, as 
> the condition cannot become true:
>  ** getRegisteredStateNames returns a list of String, whereas 
> NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR is ListStateDescriptor (type mismatch)
> The Effect is:
>  * because NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR is for a UnionListState, and
>  * the state is not cleared,
>  * each time the job restarts from a savepoint or checkpoint the size 
> multiplies times the parallelism
>  * then because each entry leaves an offset in metadata, akka.framesize 
> becomes too small, before we run into memory overflow
>  
> The breaking change has been introduced in commit 
> 70fa80e3862b367be22b593db685f9898a2838ef
>  
> A simple fix would be to change the code to:
> {quote}        if (context.getOperatorStateStore()
>                 .getRegisteredStateNames()
>                 .contains(NEXT_TRANSACTIONAL_ID_HINT_DESCRIPTOR.getName())) {
>             migrateNextTransactionalIdHindState(context);
>         }
> {quote}
>  
> Although FlinkKafkaProducer is marked as deprecated, it is probably here to 
> stay for a while.
>  
> Greetings
> Matthias (Thias) Schwalbe
>  
> [1] 
> https://github.com/apache/flink/blob/d7cf2c10f8d4fba81173854cbd8be27c657c7c7f/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer.java#L1167-L1171
>  
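The type mismatch described above can be reproduced with plain Java: getRegisteredStateNames() yields strings, so calling contains(...) with a descriptor object compiles (Collection.contains takes Object) but is always false. A self-contained sketch, with a hypothetical stand-in for Flink's ListStateDescriptor:

```java
import java.util.Set;

public class DescriptorContainsBug {
    // Stand-in for Flink's ListStateDescriptor (hypothetical simplification).
    static class StateDescriptor {
        private final String name;
        StateDescriptor(String name) { this.name = name; }
        String getName() { return name; }
    }

    public static void main(String[] args) {
        Set<String> registeredStateNames = Set.of("next-transactional-id-hint");
        StateDescriptor descriptor = new StateDescriptor("next-transactional-id-hint");

        // Buggy check: a Set<String> can never contain a descriptor object,
        // so the migration branch guarded by this condition never runs.
        System.out.println(registeredStateNames.contains(descriptor)); // prints "false"

        // Fixed check: compare against the descriptor's name, as the
        // proposed patch does with .getName().
        System.out.println(registeredStateNames.contains(descriptor.getName())); // prints "true"
    }
}
```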





[GitHub] [flink] BruceWong96 edited a comment on pull request #18293: [FLINK-25560][Connectors/HBase] Add "sink.delete.mode" in HBase sql connector for retracting the latest version or all versions in

2022-01-12 Thread GitBox


BruceWong96 edited a comment on pull request #18293:
URL: https://github.com/apache/flink/pull/18293#issuecomment-1009895174


   > Thanks for the contribution. The change should be working for your case 
and the code itself looks overall ok. It looks like you want to fix the HBase 
bug by adding work-around into the connector and I have following concerns:
   > 
   > 1. Extra customized config will be introduced to users which turns out 
that an internal bug effects users' behaviour.
   > 2. The HBaseConnectorOptions has been extended for specific process logic, 
beyond the neutral HBase Lookup Options, HBase Configuration Options, and Flink 
Operator Options. This smells like the start of the hacking.
   > 3. HBase should be the right one to control the versions. With the change 
of this PR, there are two roles who can do the same job, a SRP violation. It 
should be used very carefully to avoid data lost in the production. I could 
already imagine the situation where user would complain the unexpected data 
lost. I understood that the default deleteMode is trying to reduce such 
incident, thanks for considering about that, but it is now possible for e.g. 
human mistake on the SQL level to make the data lost happen and, once it 
happens, it is expensive in the production env.
   > 4. This change does not cover multiple CF scenario, since the scope of 
version has been changed/enlarged from CF to table. Modifying the deleteMode 
for one CF might end up with the data lost of another CF. if not modify it, it 
will still have the deletion issue for the related CF. This could be solved by 
splitting the table, which turns out again an internal bug effects users' 
behaviour. There will be extra effort to maintain two maybe even more tables 
that conceptually belong together and keep the data synched.
   > 5. Shotgun surgery - too many classes have been touched for building the 
work-around of simple functionality.
   > 6. Classes defined for common case got involved into specific deletion 
case, i.e. HBaseWriteOptions, HBaseMutationConverter, etc.
   > 
   > Looking forward to your thoughts.
   
   Thank you for your reply
   1. Users need to understand the usage of the custom configuration before 
using it.
   2. In change log mode, op= 'd' means deleting data, even before this PR.
   3. In change log mode, Flink SQL calls HBase Client to modify data, which 
does not violate the single responsibility principle. Flink SQL can also delete 
data before this PR. If the user selects' sink.delete.mode '=' all-versions', 
the data is not accidentally lost but deleted normally. Before the PR, only the 
last version is deleted.
   4. After the test, we found that the deleted CF was the CF specified in the 
Flink SQL DML statement, without affecting other CF. The specific test results 
can be found in JIRA.
   5. I think avoiding data loss while not providing a correct way to delete 
data is not a good approach.
   
   What we need is that the HBase Connector can meet the requirements of 
adding, deleting, modifying and querying in the change log mode. It seems that 
the deletion is not perfect, which leads to incorrect data.
   
   Looking forward to your thoughts.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-25631) Support enhanced `show tables` statement

2022-01-12 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475138#comment-17475138
 ] 

Martijn Visser commented on FLINK-25631:


Thanks [~jark] and [~liyubin117]!

> Support enhanced `show tables` statement
> 
>
> Key: FLINK-25631
> URL: https://issues.apache.org/jira/browse/FLINK-25631
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.14.4
>Reporter: Yubin Li
>Assignee: Yubin Li
>Priority: Major
>
> Enhanced `show tables` statements like ` show tables from db1 like 't%' ` are 
> broadly supported in many popular data processing engines such as 
> Presto/Trino/Spark:
> [https://spark.apache.org/docs/latest/sql-ref-syntax-aux-show-tables.html]
> I have investigated the syntax of the engines mentioned above.
>  
> We could use such a statement to easily show the tables of a specified 
> database without switching databases frequently; we could also use a regexp 
> pattern to quickly find the tables of interest among many. Besides, the new 
> statement is fully compatible with the old one, so users can use `show tables` 
> as before.
> h3. SHOW TABLES [ ( FROM | IN ) [catalog.]db ] [ [NOT] LIKE regex_pattern ]
>  
> I have implemented the syntax, demo as below:
> {code:java}
> Flink SQL> create database d1;
> [INFO] Execute statement succeed.
> Flink SQL> create table d1.b1(id int) with ('connector'='print');
> [INFO] Execute statement succeed.
> Flink SQL> create table t1(id int) with ('connector'='print');
> [INFO] Execute statement succeed.
> Flink SQL> create table m1(id int) with ('connector'='print');
> [INFO] Execute statement succeed.
> Flink SQL> show tables like 'm%';
> ++
> | table name |
> ++
> |         m1 |
> ++
> 1 row in set
> Flink SQL> show tables from d1 like 'b%';
> ++
> | table name |
> ++
> |         b1 |
> ++
> 1 row in set{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25559) SQL JOIN causes data loss

2022-01-12 Thread Zongwen Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475137#comment-17475137
 ] 

Zongwen Li commented on FLINK-25559:


[~monster#12] Following Lincoln's suggestion, I tried using flink 1.14.3-rc1 
and so far the problem has not recurred.

> SQL JOIN causes data loss
> -
>
> Key: FLINK-25559
> URL: https://issues.apache.org/jira/browse/FLINK-25559
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.14.0, 1.13.2, 1.13.3, 1.13.5, 1.14.2
>Reporter: Zongwen Li
>Priority: Major
> Attachments: image-2022-01-07-11-27-01-010.png
>
>
> {code:java}
> //sink table,omits some physical fields
> CREATE TABLE kd_product_info (
>   productId BIGINT COMMENT 'product id',
>   productSaleId BIGINT COMMENT 'item id',
>   PRIMARY KEY (productSaleId) NOT ENFORCED
> )
> // sql omits some selected fields
> INSERT INTO kd_product_info
> SELECT
>  ps.product AS productId,
>  ps.productsaleid AS productSaleId,
>  CAST(p.complex AS INT) AS complex,
>  p.createtime AS createTime,
>  p.updatetime AS updateTime,
>  p.ean AS ean,
>  ts.availablequantity AS totalAvailableStock,
> IF
>  (
>   ts.availablequantity - ts.lockoccupy - ts.lock_available_quantity > 0,
>   ts.availablequantity - ts.lockoccupy - ts.lock_available_quantity,
>   0
>  ) AS sharedStock
>  ,rps.purchase AS purchase
>  ,v.`name` AS vendorName
>  FROM
>  product_sale ps
>  JOIN product p ON ps.product = p.id
>  LEFT JOIN rate_product_sale rps ON ps.productsaleid = rps.id
>  LEFT JOIN pss_total_stock ts ON ps.productsaleid = ts.productsale
>  LEFT JOIN vendor v ON ps.merchant_id = v.merchant_id AND ps.vendor = v.vendor
>  LEFT JOIN mccategory mc ON ps.merchant_id = mc.merchant_id AND p.mccategory 
> = mc.id
>  LEFT JOIN new_mccategory nmc ON p.mccategory = nmc.mc
>  LEFT JOIN product_sale_grade_plus psgp ON ps.productsaleid = psgp.productsale
>  LEFT JOIN product_sale_extend pse359 ON ps.productsaleid = 
> pse359.product_sale AND pse359.meta = 359
>  LEFT JOIN product_image_url piu ON ps.product = piu.product {code}
> All table sources are upsert-kafka,I have ensured that the associated columns 
> are of the same type:
> {code:java}
> CREATE TABLE product_sale (
>   id BIGINT COMMENT 'primary key',
>   productsaleid BIGINT COMMENT 'item id',
>   product BIGINT COMMENT 'product id',
>   merchant_id DECIMAL(20, 0) COMMENT 'merchant id',
>   vendor STRING COMMENT 'vendor',
>   PRIMARY KEY (productsaleid) NOT ENFORCED
> )
> // No computed columns
> // Just plain physical columns
> WITH (
> 'connector' = 'upsert-kafka',
> 'topic' = 'XXX',
> 'group.id' = '%s',
> 'properties.bootstrap.servers' = '%s',
> 'key.format' = 'json',
> 'value.format' = 'json'
> ) 
> CREATE TABLE product (
>   id BIGINT,
>   mccategory STRING,
>   PRIMARY KEY (id) NOT ENFORCED
> )
> CREATE TABLE rate_product_sale ( 
>   id BIGINT COMMENT 'primary key',
>   PRIMARY KEY (id) NOT ENFORCED
> )
> CREATE TABLE pss_total_stock (
>   id INT COMMENT 'ID',
>   productsale BIGINT COMMENT 'item code',
>   PRIMARY KEY (id) NOT ENFORCED
> )
> CREATE TABLE vendor (
>   merchant_id DECIMAL(20, 0) COMMENT 'merchant id',
>   vendor STRING COMMENT 'vendor code',
>   PRIMARY KEY (merchant_id, vendor) NOT ENFORCED
> )
> CREATE TABLE mccategory (
>   id STRING COMMENT 'mc id',
>   merchant_id DECIMAL(20, 0) COMMENT 'merchant id',
>   PRIMARY KEY (merchant_id, id) NOT ENFORCED
> )
> CREATE TABLE new_mccategory (
>   mc STRING,
>   PRIMARY KEY (mc) NOT ENFORCED
> )
> CREATE TABLE product_sale_grade_plus (
>   productsale BIGINT,
>   PRIMARY KEY (productsale) NOT ENFORCED
> )
> CREATE TABLE product_sale_extend (
>   id BIGINT,
>   product_sale BIGINT,
>   meta BIGINT,
>   PRIMARY KEY (id) NOT ENFORCED
> )
> CREATE TABLE product_image_url(
>   product BIGINT,
>   PRIMARY KEY (product) NOT ENFORCED 
> ){code}
> The data in each table is between 5 million and 10 million rows; parallelism: 24.
> TTL is not set; in fact, we can notice data loss after as little as 30 minutes.
>  
> The data flow:
> MySQL -> Flink CDC -> ODS (Upsert Kafka) -> the job -> sink
> I'm sure the ODS data in Kafka is correct;
> I have also tried to use the flink-cdc source directly, it didn't solve the 
> problem;
>  
> We tested sinking to kudu, Kafka or ES;
> Also tested multiple versions: 1.13.2, 1.13.3, 1.13.5, 1.14.0, 1.14.2;
> The lost data appears out of order in Kafka; we guess this is a bug in the 
> retraction stream:
> !image-2022-01-07-11-27-01-010.png!
>  
> After many tests, we found that the more tables are LEFT JOINed, or the 
> greater the operator parallelism, the more easily data is lost.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot edited a comment on pull request #17676: [FLINK-14100][connectors] Add Oracle dialect

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #17676:
URL: https://github.com/apache/flink/pull/17676#issuecomment-960641444


   
   ## CI report:
   
   * 35238e2baa5a30a5a5e047aceddedbfae6cfcd75 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29357)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Comment Edited] (FLINK-25634) flink-read-onyarn-configuration

2022-01-12 Thread zlzhang0122 (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475136#comment-17475136
 ] 

zlzhang0122 edited comment on FLINK-25634 at 1/13/22, 7:04 AM:
---

You can find the config yarn.application-attempt-failures-validity-interval in 
[YarnConfigOptions|https://github.com/apache/flink/blob/dbbf2a36111da1faea5c901e3b008cc788913bf8/flink-yarn/src/main/java/org/apache/flink/yarn/configuration/YarnConfigOptions.java#L106];
 it is a Flink config option, not a YARN one.


was (Author: zlzhang0122):
You can find the config yarn.application-attempt-failures-validity-interval in 
YarnConfigOptions, this is a flink conf.
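For reference, the option mentioned above can be set in flink-conf.yaml; a sketch with illustrative values (the interval is in milliseconds):

```yaml
# Flink configuration, not YARN's yarn-site.xml:
yarn.application-attempts: 2
# Failures older than this window (ms) do not count against the attempt limit.
yarn.application-attempt-failures-validity-interval: 10000
```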

> flink-read-onyarn-configuration
> ---
>
> Key: FLINK-25634
> URL: https://issues.apache.org/jira/browse/FLINK-25634
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.11.2
>Reporter: 宇宙先生
>Priority: Major
> Attachments: image-2022-01-13-10-02-57-803.png, 
> image-2022-01-13-10-03-28-230.png, image-2022-01-13-10-07-02-908.png, 
> image-2022-01-13-10-09-36-890.png, image-2022-01-13-10-10-44-945.png, 
> image-2022-01-13-10-14-18-918.png, image-2022-01-13-14-19-43-539.png
>
>
> In the Flink source code:
> !image-2022-01-13-10-14-18-918.png!
> Set the number of retries for failed YARN ApplicationMasters/JobManagers in 
> high availability mode. This value is usually limited by YARN.
> By default, it's 1 in the standalone case and 2 in the high availability case.
>  
> In my cluster, the number of retries for failed YARN ApplicationMasters is 2.
> YARN's configuration also looks like this:
> !image-2022-01-13-10-07-02-908.png!
> But it keeps restarting when my task fails:
> !image-2022-01-13-10-10-44-945.png!
> I would like to know why the configuration is not taking effect.
> Sincere thanks!



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot edited a comment on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011809365


   
   ## CI report:
   
   * ffcee2854060f08f1f5ee3821f76823872010a15 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29359)
 
   * 86046d08986f3ed691c9766d104f97163c445192 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29364)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Comment Edited] (FLINK-25484) TableRollingPolicy do not support inactivityInterval config which is supported in datastream api

2022-01-12 Thread LiChang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475088#comment-17475088
 ] 

LiChang edited comment on FLINK-25484 at 1/13/22, 7:02 AM:
---

Yes, I am working on it, [~fpaul].

Could you assign this task to me?
 


was (Author: lichang):
yes,I am doing on it  Fabian Paul
 

> TableRollingPolicy do not support inactivityInterval config which is 
> supported in datastream api
> 
>
> Key: FLINK-25484
> URL: https://issues.apache.org/jira/browse/FLINK-25484
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / FileSystem, Table SQL / Ecosystem
>Affects Versions: 1.15.0
>Reporter: LiChang
>Priority: Major
> Fix For: 1.15.0
>
> Attachments: image-2022-01-13-11-36-24-102.png
>
>
> TableRollingPolicy does not support the inactivityInterval config:
> public static class TableRollingPolicy extends 
> CheckpointRollingPolicy {
>     private final boolean rollOnCheckpoint;
>     private final long rollingFileSize;
>     private final long rollingTimeInterval;
>
>     public TableRollingPolicy(
>             boolean rollOnCheckpoint, long rollingFileSize, long rollingTimeInterval) {
>         this.rollOnCheckpoint = rollOnCheckpoint;
>         Preconditions.checkArgument(rollingFileSize > 0L);
>         Preconditions.checkArgument(rollingTimeInterval > 0L);
>         this.rollingFileSize = rollingFileSize;
>         this.rollingTimeInterval = rollingTimeInterval;
>     }
>  
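As a hedged sketch of the requested feature (hypothetical class and field names, not the actual Flink API), the missing inactivity check could sit next to the existing size and time checks, mirroring DataStream's DefaultRollingPolicy#withInactivityInterval:

```java
// Hypothetical sketch: a rolling decision that also honors an inactivity
// interval. Field and method names are illustrative, not real Flink classes.
class InactivityAwareRollingPolicy {
    private final long rollingFileSize;     // bytes
    private final long rollingTimeInterval; // ms since part-file creation
    private final long inactivityInterval;  // ms since last write

    InactivityAwareRollingPolicy(
            long rollingFileSize, long rollingTimeInterval, long inactivityInterval) {
        this.rollingFileSize = rollingFileSize;
        this.rollingTimeInterval = rollingTimeInterval;
        this.inactivityInterval = inactivityInterval;
    }

    /** Roll when the part file is too large, too old, or idle for too long. */
    boolean shouldRollOnProcessingTime(
            long fileSize, long creationTime, long lastUpdateTime, long now) {
        return fileSize >= rollingFileSize
                || now - creationTime >= rollingTimeInterval
                || now - lastUpdateTime >= inactivityInterval;
    }
}
```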



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25638) Increase the default write buffer size of sort-shuffle to 16M

2022-01-12 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-25638:
---

 Summary: Increase the default write buffer size of sort-shuffle to 
16M
 Key: FLINK-25638
 URL: https://issues.apache.org/jira/browse/FLINK-25638
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Network
Reporter: Yingjie Cao
 Fix For: 1.15.0


As discussed in 
[https://lists.apache.org/thread/pt2b1f17x2l5rlvggwxs6m265lo4ly7p], this ticket 
aims to increase the default write buffer size of sort-shuffle to 16M.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (FLINK-25637) Make sort-shuffle the default shuffle implementation for batch jobs

2022-01-12 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-25637:
---

 Summary: Make sort-shuffle the default shuffle implementation for 
batch jobs
 Key: FLINK-25637
 URL: https://issues.apache.org/jira/browse/FLINK-25637
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Network
Reporter: Yingjie Cao
 Fix For: 1.15.0


As discussed in 
[https://lists.apache.org/thread/pt2b1f17x2l5rlvggwxs6m265lo4ly7p], this ticket 
aims to make sort-shuffle the default shuffle implementation for batch jobs.
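Until that default lands, users can already opt in via configuration; a flink-conf.yaml sketch assuming the pre-1.15 option name (verify against the target release's documentation):

```yaml
# Use sort-shuffle for every parallelism (the pre-1.15 default threshold is higher).
taskmanager.network.sort-shuffle.min-parallelism: 1
```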



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (FLINK-25631) Support enhanced `show tables` statement

2022-01-12 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475133#comment-17475133
 ] 

Jark Wu commented on FLINK-25631:
-

Thanks [~liyubin117], this feature sounds good to me. You can take FLINK-22885 
as an example to implement it (including the tests and docs). 


> Support enhanced `show tables` statement
> 
>
> Key: FLINK-25631
> URL: https://issues.apache.org/jira/browse/FLINK-25631
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.14.4
>Reporter: Yubin Li
>Assignee: Yubin Li
>Priority: Major
>
> Enhanced `show tables` statements like ` show tables from db1 like 't%' ` are 
> broadly supported in many popular data processing engines such as 
> Presto/Trino/Spark:
> [https://spark.apache.org/docs/latest/sql-ref-syntax-aux-show-tables.html]
> I have investigated the syntax of the engines mentioned above.
>  
> We could use such a statement to easily show the tables of a specified 
> database without switching databases frequently; we could also use a regexp 
> pattern to quickly find the tables of interest among many. Besides, the new 
> statement is fully compatible with the old one, so users can use `show tables` 
> as before.
> h3. SHOW TABLES [ ( FROM | IN ) [catalog.]db ] [ [NOT] LIKE regex_pattern ]
>  
> I have implemented the syntax, demo as below:
> {code:java}
> Flink SQL> create database d1;
> [INFO] Execute statement succeed.
> Flink SQL> create table d1.b1(id int) with ('connector'='print');
> [INFO] Execute statement succeed.
> Flink SQL> create table t1(id int) with ('connector'='print');
> [INFO] Execute statement succeed.
> Flink SQL> create table m1(id int) with ('connector'='print');
> [INFO] Execute statement succeed.
> Flink SQL> show tables like 'm%';
> ++
> | table name |
> ++
> |         m1 |
> ++
> 1 row in set
> Flink SQL> show tables from d1 like 'b%';
> ++
> | table name |
> ++
> |         b1 |
> ++
> 1 row in set{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot edited a comment on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011809365


   
   ## CI report:
   
   * ffcee2854060f08f1f5ee3821f76823872010a15 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29359)
 
   * 86046d08986f3ed691c9766d104f97163c445192 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-25631) Support enhanced `show tables` statement

2022-01-12 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu reassigned FLINK-25631:
---

Assignee: Yubin Li

> Support enhanced `show tables` statement
> 
>
> Key: FLINK-25631
> URL: https://issues.apache.org/jira/browse/FLINK-25631
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Affects Versions: 1.14.4
>Reporter: Yubin Li
>Assignee: Yubin Li
>Priority: Major
>
> Enhanced `show tables` statements like ` show tables from db1 like 't%' ` are 
> broadly supported in many popular data processing engines such as 
> Presto/Trino/Spark:
> [https://spark.apache.org/docs/latest/sql-ref-syntax-aux-show-tables.html]
> I have investigated the syntax of the engines mentioned above.
>  
> We could use such a statement to easily show the tables of a specified 
> database without switching databases frequently; we could also use a regexp 
> pattern to quickly find the tables of interest among many. Besides, the new 
> statement is fully compatible with the old one, so users can use `show tables` 
> as before.
> h3. SHOW TABLES [ ( FROM | IN ) [catalog.]db ] [ [NOT] LIKE regex_pattern ]
>  
> I have implemented the syntax, demo as below:
> {code:java}
> Flink SQL> create database d1;
> [INFO] Execute statement succeed.
> Flink SQL> create table d1.b1(id int) with ('connector'='print');
> [INFO] Execute statement succeed.
> Flink SQL> create table t1(id int) with ('connector'='print');
> [INFO] Execute statement succeed.
> Flink SQL> create table m1(id int) with ('connector'='print');
> [INFO] Execute statement succeed.
> Flink SQL> show tables like 'm%';
> ++
> | table name |
> ++
> |         m1 |
> ++
> 1 row in set
> Flink SQL> show tables from d1 like 'b%';
> ++
> | table name |
> ++
> |         b1 |
> ++
> 1 row in set{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] wsry commented on a change in pull request #17814: [FLINK-24899][runtime] Enable data compression for blocking shuffle by default

2022-01-12 Thread GitBox


wsry commented on a change in pull request #17814:
URL: https://github.com/apache/flink/pull/17814#discussion_r783667878



##
File path: 
flink-tests/src/test/java/org/apache/flink/test/runtime/ShuffleCompressionITCase.java
##
@@ -84,8 +84,6 @@
 @Test
 public void testDataCompressionForBoundedBlockingShuffle() throws 
Exception {

Review comment:
   After changing the default value to true, I guess many test cases, 
including BlockingShuffleITCase, already cover the data compression scenario. 
What we really need here is a test case that covers the scenario where data is 
not compressed. Could you please rename this method and the one below to 
something like 
testBoundedBlockingShuffleWithoutCompression/testSortMergeBlockingShuffleWithoutCompression,
 and set the data compression flag to false?
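For the no-compression scenario, the relevant switch is presumably the blocking-shuffle compression option; a flink-conf.yaml sketch (option name taken from the 1.14 docs, verify before use):

```yaml
taskmanager.network.blocking-shuffle.compression.enabled: false
```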




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] wsry commented on pull request #17814: [FLINK-24899][runtime] Enable data compression for blocking shuffle by default

2022-01-12 Thread GitBox


wsry commented on pull request #17814:
URL: https://github.com/apache/flink/pull/17814#issuecomment-1011848043


   @SteNicholas Thanks for your contribution, I have left a small comment.
   BTW, could you please also update the corresponding document for blocking 
shuffle? (docs/content/docs/ops/batch/blocking_shuffle.md & 
docs/content.zh/docs/ops/batch/blocking_shuffle.md)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #18201: [FLINK-25167][API / DataStream]Expose StreamOperatorFactory in Connec…

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18201:
URL: https://github.com/apache/flink/pull/18201#issuecomment-1000767373


   
   ## CI report:
   
   * ae038dbc84bb1cc7d6654e40081c01b9c1211411 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29355)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot edited a comment on pull request #18346: [DO NOT MERGE][FLINK-25085] Add scheduled executor service in MainThreadExecutor and close it when the server reaches termination

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18346:
URL: https://github.com/apache/flink/pull/18346#issuecomment-1011842175


   
   ## CI report:
   
   * 195e75ed899f947fbb8b535f9bfc16f501d0f86e Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29363)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] (FLINK-25634) flink-read-onyarn-configuration

2022-01-12 Thread Jira


[ https://issues.apache.org/jira/browse/FLINK-25634 ]


宇宙先生 deleted comment on FLINK-25634:
--

was (Author: JIRAUSER283011):
firstly,thanks for your answer,But I can't find this config 
yarn.application-attempt-failures-validity-interval,can you help me check it 

> flink-read-onyarn-configuration
> ---
>
> Key: FLINK-25634
> URL: https://issues.apache.org/jira/browse/FLINK-25634
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.11.2
>Reporter: 宇宙先生
>Priority: Major
> Attachments: image-2022-01-13-10-02-57-803.png, 
> image-2022-01-13-10-03-28-230.png, image-2022-01-13-10-07-02-908.png, 
> image-2022-01-13-10-09-36-890.png, image-2022-01-13-10-10-44-945.png, 
> image-2022-01-13-10-14-18-918.png, image-2022-01-13-14-19-43-539.png
>
>
> In the Flink source code:
> !image-2022-01-13-10-14-18-918.png!
> Set the number of retries for failed YARN ApplicationMasters/JobManagers in 
> high availability mode. This value is usually limited by YARN.
> By default, it's 1 in the standalone case and 2 in the high availability case.
>  
> In my cluster, the number of retries for failed YARN ApplicationMasters is 2.
> YARN's configuration also looks like this:
> !image-2022-01-13-10-07-02-908.png!
> But it keeps restarting when my task fails:
> !image-2022-01-13-10-10-44-945.png!
> I would like to know why the configuration is not taking effect.
> Sincere thanks!



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] flinkbot commented on pull request #18346: [DO NOT MERGE][FLINK-25085] Add scheduled executor service in MainThreadExecutor and close it when the server reaches termination

2022-01-12 Thread GitBox


flinkbot commented on pull request #18346:
URL: https://github.com/apache/flink/pull/18346#issuecomment-1011842594


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 195e75ed899f947fbb8b535f9bfc16f501d0f86e (Thu Jan 13 
06:44:11 UTC 2022)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required. Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] flinkbot commented on pull request #18346: [DO NOT MERGE][FLINK-25085] Add scheduled executor service in MainThreadExecutor and close it when the server reaches termination

2022-01-12 Thread GitBox


flinkbot commented on pull request #18346:
URL: https://github.com/apache/flink/pull/18346#issuecomment-1011842175


   
   ## CI report:
   
   * 195e75ed899f947fbb8b535f9bfc16f501d0f86e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [flink] zjureel opened a new pull request #18346: [DO NOT MERGE] Add scheduled executor service in MainThreadExecutor and close it when the server reaches termination

2022-01-12 Thread GitBox


zjureel opened a new pull request #18346:
URL: https://github.com/apache/flink/pull/18346


   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (FLINK-25634) flink-read-onyarn-configuration

2022-01-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-25634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475127#comment-17475127
 ] 

宇宙先生 commented on FLINK-25634:
--

Firstly, thanks for your answer. But I can't find the config 
yarn.application-attempt-failures-validity-interval; can you help me check it? 

> flink-read-onyarn-configuration
> ---
>
> Key: FLINK-25634
> URL: https://issues.apache.org/jira/browse/FLINK-25634
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.11.2
>Reporter: 宇宙先生
>Priority: Major
> Attachments: image-2022-01-13-10-02-57-803.png, 
> image-2022-01-13-10-03-28-230.png, image-2022-01-13-10-07-02-908.png, 
> image-2022-01-13-10-09-36-890.png, image-2022-01-13-10-10-44-945.png, 
> image-2022-01-13-10-14-18-918.png, image-2022-01-13-14-19-43-539.png
>
>
> in flink-src.code :
> !image-2022-01-13-10-14-18-918.png!
> Set the number of retries for failed YARN ApplicationMasters/JobManagers in 
> high
>  availability mode. This value is usually limited by YARN.
> By default, it's 1 in the standalone case and 2 in the high availability case
>  
> in my cluster,the number of retries for failed YARN ApplicationMasters is 2
> yarn's configuration  also like this
> !image-2022-01-13-10-07-02-908.png!
> But it keeps restarting when my task fails,
> !image-2022-01-13-10-10-44-945.png!
> I would like to know the reason why the configuration is not taking effect.
> sincere thanks!
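For reference, a minimal sketch of how the two settings under discussion are typically set on the Flink side (flink-conf.yaml option names as documented for recent Flink releases; the values here are illustrative, not a recommendation). Note that YARN itself additionally caps attempts via `yarn.resourcemanager.am.max-attempts` in yarn-site.xml, and that failures older than the validity interval stop counting towards the limit, which can make a job appear to "keep restarting" past the configured attempt count:

```yaml
# flink-conf.yaml (illustrative values)

# Upper bound on ApplicationMaster/JobManager restart attempts on YARN.
# Also capped by YARN's own yarn.resourcemanager.am.max-attempts.
yarn.application-attempts: 2

# Window (milliseconds) after which old AM failures no longer count
# towards the attempt limit. With a small window, widely spaced failures
# each look like the "first" failure, so restarts can exceed 2 overall.
yarn.application-attempt-failures-validity-interval: 10000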



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[GitHub] [flink] shouweikun commented on pull request #18201: [FLINK-25167][API / DataStream]Expose StreamOperatorFactory in Connec…

2022-01-12 Thread GitBox


shouweikun commented on pull request #18201:
URL: https://github.com/apache/flink/pull/18201#issuecomment-1011838739


   Hi, @fapaul @pnowojski 
   I‘d say that I encountered some problems. I tried to mark 
`StreamOperatorFactory`  and `StreamOperatorParameters` as  `@PublicEvoliving`, 
tests still failed. I checked the cause and intended to fix them but I didn't 
know what to do.
   Asking for help.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Assigned] (FLINK-24899) Enable data compression for blocking shuffle by default

2022-01-12 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao reassigned FLINK-24899:
---

Assignee: Nicholas Jiang

> Enable data compression for blocking shuffle by default
> ---
>
> Key: FLINK-24899
> URL: https://issues.apache.org/jira/browse/FLINK-24899
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Yingjie Cao
>Assignee: Nicholas Jiang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> Currently, shuffle data compression is not enabled by default. Shuffle data 
> compression is important for blocking data shuffle and enabling shuffle data 
> compression by default can improve the usability.
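For context, opting in to the behavior proposed here is already a one-line config change today (option name as it appears in the Flink network configuration docs around 1.14; treat the snippet as illustrative):

```yaml
# flink-conf.yaml — enable compression of blocking (batch) shuffle data.
# This issue proposes flipping the default of this option to true.
taskmanager.network.blocking-shuffle.compression.enabled: true
```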





[jira] (FLINK-25634) flink-read-onyarn-configuration

2022-01-12 Thread Jira


[ https://issues.apache.org/jira/browse/FLINK-25634 ]


宇宙先生 deleted comment on FLINK-25634:
--

was (Author: JIRAUSER283011):
First, thanks for your answer, but I can't find the config 
yarn.application-attempt-failures-validity-interval.

Can you help me check it? Thanks a lot!

 

 

https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

!image-2022-01-13-14-19-43-539.png!

> flink-read-onyarn-configuration
> ---
>
> Key: FLINK-25634
> URL: https://issues.apache.org/jira/browse/FLINK-25634
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.11.2
>Reporter: 宇宙先生
>Priority: Major
> Attachments: image-2022-01-13-10-02-57-803.png, 
> image-2022-01-13-10-03-28-230.png, image-2022-01-13-10-07-02-908.png, 
> image-2022-01-13-10-09-36-890.png, image-2022-01-13-10-10-44-945.png, 
> image-2022-01-13-10-14-18-918.png, image-2022-01-13-14-19-43-539.png
>
>
> in flink-src.code :
> !image-2022-01-13-10-14-18-918.png!
> Set the number of retries for failed YARN ApplicationMasters/JobManagers in 
> high
>  availability mode. This value is usually limited by YARN.
> By default, it's 1 in the standalone case and 2 in the high availability case
>  
> in my cluster,the number of retries for failed YARN ApplicationMasters is 2
> yarn's configuration  also like this
> !image-2022-01-13-10-07-02-908.png!
> But it keeps restarting when my task fails,
> !image-2022-01-13-10-10-44-945.png!
> I would like to know the reason why the configuration is not taking effect.
> sincere thanks!





[jira] [Updated] (FLINK-24899) Enable data compression for blocking shuffle by default

2022-01-12 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao updated FLINK-24899:

Parent: (was: FLINK-24898)
Issue Type: Improvement  (was: Sub-task)

> Enable data compression for blocking shuffle by default
> ---
>
> Key: FLINK-24899
> URL: https://issues.apache.org/jira/browse/FLINK-24899
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Network
>Reporter: Yingjie Cao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> Currently, shuffle data compression is not enabled by default. Shuffle data 
> compression is important for blocking data shuffle and enabling shuffle data 
> compression by default can improve the usability.





[jira] [Updated] (FLINK-24899) Enable data compression for blocking shuffle by default

2022-01-12 Thread Yingjie Cao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingjie Cao updated FLINK-24899:

Parent: FLINK-25636
Issue Type: Sub-task  (was: Improvement)

> Enable data compression for blocking shuffle by default
> ---
>
> Key: FLINK-24899
> URL: https://issues.apache.org/jira/browse/FLINK-24899
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Reporter: Yingjie Cao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> Currently, shuffle data compression is not enabled by default. Shuffle data 
> compression is important for blocking data shuffle and enabling shuffle data 
> compression by default can improve the usability.





[jira] [Created] (FLINK-25636) FLIP-199: Change some default config values of blocking shuffle for better usability

2022-01-12 Thread Yingjie Cao (Jira)
Yingjie Cao created FLINK-25636:
---

 Summary: FLIP-199: Change some default config values of blocking 
shuffle for better usability
 Key: FLINK-25636
 URL: https://issues.apache.org/jira/browse/FLINK-25636
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Network
Reporter: Yingjie Cao
 Fix For: 1.15.0


This is the umbrella issue for FLIP-199. We will change several default 
config values for batch shuffle and update the documentation accordingly.





[jira] [Commented] (FLINK-25634) flink-read-onyarn-configuration

2022-01-12 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-25634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17475124#comment-17475124
 ] 

宇宙先生 commented on FLINK-25634:
--

First, thanks for your answer, but I can't find the config 
yarn.application-attempt-failures-validity-interval.

Can you help me check it? Thanks a lot!

 

 

https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-common/yarn-default.xml

!image-2022-01-13-14-19-43-539.png!

> flink-read-onyarn-configuration
> ---
>
> Key: FLINK-25634
> URL: https://issues.apache.org/jira/browse/FLINK-25634
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.11.2
>Reporter: 宇宙先生
>Priority: Major
> Attachments: image-2022-01-13-10-02-57-803.png, 
> image-2022-01-13-10-03-28-230.png, image-2022-01-13-10-07-02-908.png, 
> image-2022-01-13-10-09-36-890.png, image-2022-01-13-10-10-44-945.png, 
> image-2022-01-13-10-14-18-918.png, image-2022-01-13-14-19-43-539.png
>
>
> in flink-src.code :
> !image-2022-01-13-10-14-18-918.png!
> Set the number of retries for failed YARN ApplicationMasters/JobManagers in 
> high
>  availability mode. This value is usually limited by YARN.
> By default, it's 1 in the standalone case and 2 in the high availability case
>  
> in my cluster,the number of retries for failed YARN ApplicationMasters is 2
> yarn's configuration  also like this
> !image-2022-01-13-10-07-02-908.png!
> But it keeps restarting when my task fails,
> !image-2022-01-13-10-10-44-945.png!
> I would like to know the reason why the configuration is not taking effect.
> sincere thanks!





[jira] [Updated] (FLINK-25634) flink-read-onyarn-configuration

2022-01-12 Thread Jira


 [ 
https://issues.apache.org/jira/browse/FLINK-25634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

宇宙先生 updated FLINK-25634:
-
Attachment: image-2022-01-13-14-19-43-539.png

> flink-read-onyarn-configuration
> ---
>
> Key: FLINK-25634
> URL: https://issues.apache.org/jira/browse/FLINK-25634
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.11.2
>Reporter: 宇宙先生
>Priority: Major
> Attachments: image-2022-01-13-10-02-57-803.png, 
> image-2022-01-13-10-03-28-230.png, image-2022-01-13-10-07-02-908.png, 
> image-2022-01-13-10-09-36-890.png, image-2022-01-13-10-10-44-945.png, 
> image-2022-01-13-10-14-18-918.png, image-2022-01-13-14-19-43-539.png
>
>
> in flink-src.code :
> !image-2022-01-13-10-14-18-918.png!
> Set the number of retries for failed YARN ApplicationMasters/JobManagers in 
> high
>  availability mode. This value is usually limited by YARN.
> By default, it's 1 in the standalone case and 2 in the high availability case
>  
> in my cluster,the number of retries for failed YARN ApplicationMasters is 2
> yarn's configuration  also like this
> !image-2022-01-13-10-07-02-908.png!
> But it keeps restarting when my task fails,
> !image-2022-01-13-10-10-44-945.png!
> I would like to know the reason why the configuration is not taking effect.
> sincere thanks!





[GitHub] [flink] flinkbot edited a comment on pull request #17936: [FLINK-24954][network] Refresh read buffer request timeout on buffer recycling/requesting for sort-shuffle

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #17936:
URL: https://github.com/apache/flink/pull/17936#issuecomment-980533423


   
   ## CI report:
   
   * 1ec81b2f7c5abbf5a2bd470dc804977085787e4c Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29303)
 
   * 52013eb761deb1f972813d863269ae959fa11267 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29362)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] wsry commented on pull request #17936: [FLINK-24954][network] Refresh read buffer request timeout on buffer recycling/requesting for sort-shuffle

2022-01-12 Thread GitBox


wsry commented on pull request #17936:
URL: https://github.com/apache/flink/pull/17936#issuecomment-1011830027


   @TanYuxin-tyx Thanks for the contribution, changes LGTM. I will merge it 
after build succeeds.






[GitHub] [flink] flinkbot edited a comment on pull request #17936: [FLINK-24954][network] Refresh read buffer request timeout on buffer recycling/requesting for sort-shuffle

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #17936:
URL: https://github.com/apache/flink/pull/17936#issuecomment-980533423


   
   ## CI report:
   
   * 1ec81b2f7c5abbf5a2bd470dc804977085787e4c Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29303)
 
   * 52013eb761deb1f972813d863269ae959fa11267 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] imaffe commented on pull request #17937: [FLINK-25044][testing][Pulsar Connector] Add More Unit Test For Pulsar Source

2022-01-12 Thread GitBox


imaffe commented on pull request #17937:
URL: https://github.com/apache/flink/pull/17937#issuecomment-1011826129


   > I think two comments are not addressed but everything else looks good now, 
thanks for the patience. Can you squash all the commits together that the PR 
only includes two commits one for the bug fix and one for the tests? Nit: Next 
time it would be great if you mark the comments you have addressed :)
   
   The first case can be deleted. The edge case it tries to test is covered 
by the SplitReaderTest. Basically, what we want to do here is wait for a 
certain timeout between two messages in the Pulsar client and see how it 
behaves. Do we have suggestions on how to test this kind of behaviour? (It 
needs to wait for an external system to time out, so it feels like we can't 
use a blocking queue to signal event completion here.) 
   
   For the second place, I'll set the polling timeout to a large number and 
rely on the global timeout to terminate execution if the end condition is 
not met. 
   






[jira] [Updated] (FLINK-25634) flink-read-onyarn-configuration

2022-01-12 Thread Jira


 [ 
https://issues.apache.org/jira/browse/FLINK-25634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

宇宙先生 updated FLINK-25634:
-
Description: 
in flink-src.code :

!image-2022-01-13-10-14-18-918.png!

Set the number of retries for failed YARN ApplicationMasters/JobManagers in high
 availability mode. This value is usually limited by YARN.
By default, it's 1 in the standalone case and 2 in the high availability case

 

in my cluster,the number of retries for failed YARN ApplicationMasters is 2

yarn's configuration  also like this

!image-2022-01-13-10-07-02-908.png!

But it keeps restarting when my task fails,

!image-2022-01-13-10-10-44-945.png!

I would like to know the reason why the configuration is not taking effect.

sincere thanks!

  was:
in flink-src.code :

!image-2022-01-13-10-14-18-918.png!

Set the number of retries for failed YARN ApplicationMasters/JobManagers in high
 availability mode. This value is usually limited by YARN.
By default, it's 1 in the standalone case and 2 in the high availability case

 

in my cluster,the number of retries for failed YARN ApplicationMasters is 2

yarn's configuration  also like this

!image-2022-01-13-10-07-02-908.png!

But it keeps restarting when my taskfails,

!image-2022-01-13-10-10-44-945.png!

I would like to know the reason why the configuration is not taking effect.

sincere thanks!


> flink-read-onyarn-configuration
> ---
>
> Key: FLINK-25634
> URL: https://issues.apache.org/jira/browse/FLINK-25634
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.11.2
>Reporter: 宇宙先生
>Priority: Major
> Attachments: image-2022-01-13-10-02-57-803.png, 
> image-2022-01-13-10-03-28-230.png, image-2022-01-13-10-07-02-908.png, 
> image-2022-01-13-10-09-36-890.png, image-2022-01-13-10-10-44-945.png, 
> image-2022-01-13-10-14-18-918.png
>
>
> in flink-src.code :
> !image-2022-01-13-10-14-18-918.png!
> Set the number of retries for failed YARN ApplicationMasters/JobManagers in 
> high
>  availability mode. This value is usually limited by YARN.
> By default, it's 1 in the standalone case and 2 in the high availability case
>  
> in my cluster,the number of retries for failed YARN ApplicationMasters is 2
> yarn's configuration  also like this
> !image-2022-01-13-10-07-02-908.png!
> But it keeps restarting when my task fails,
> !image-2022-01-13-10-10-44-945.png!
> I would like to know the reason why the configuration is not taking effect.
> sincere thanks!





[GitHub] [flink] flinkbot edited a comment on pull request #18345: [FLINK-25440][doc] startCurosr only supports publishTime not eventTime

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18345:
URL: https://github.com/apache/flink/pull/18345#issuecomment-1011821419


   
   ## CI report:
   
   * 1dbd224db334939fc5337ac994d1b24efdf99657 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29361)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot commented on pull request #18345: [FLINK-25440][doc] startCurosr only supports publishTime not eventTime

2022-01-12 Thread GitBox


flinkbot commented on pull request #18345:
URL: https://github.com/apache/flink/pull/18345#issuecomment-1011821419


   
   ## CI report:
   
   * 1dbd224db334939fc5337ac994d1b24efdf99657 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot commented on pull request #18345: [FLINK-25440][doc] startCurosr only supports publishTime not eventTime

2022-01-12 Thread GitBox


flinkbot commented on pull request #18345:
URL: https://github.com/apache/flink/pull/18345#issuecomment-1011821142


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit 1dbd224db334939fc5337ac994d1b24efdf99657 (Thu Jan 13 
05:58:27 UTC 2022)
   
   **Warnings:**
* Documentation files were touched, but no `docs/content.zh/` files: Update 
Chinese documentation or file Jira ticket.
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-25440).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required. Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   






[GitHub] [flink] springMoon removed a comment on pull request #18309: [FLINK-25582] [TABLE SQL / API] flink sql kafka source cannot special custom parallelism

2022-01-12 Thread GitBox


springMoon removed a comment on pull request #18309:
URL: https://github.com/apache/flink/pull/18309#issuecomment-1011819956


   @alpreu thanks for the review, but @lmagic233 is right: a generic solution 
for custom source parallelism is better.






[GitHub] [flink] springMoon closed pull request #18309: [FLINK-25582] [TABLE SQL / API] flink sql kafka source cannot special custom parallelism

2022-01-12 Thread GitBox


springMoon closed pull request #18309:
URL: https://github.com/apache/flink/pull/18309


   






[GitHub] [flink] springMoon commented on pull request #18309: [FLINK-25582] [TABLE SQL / API] flink sql kafka source cannot special custom parallelism

2022-01-12 Thread GitBox


springMoon commented on pull request #18309:
URL: https://github.com/apache/flink/pull/18309#issuecomment-1011819956










[jira] [Updated] (FLINK-25440) Apache Pulsar Connector Document description error about 'Starting Position'.

2022-01-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-25440:
---
Labels: pull-request-available  (was: )

> Apache Pulsar Connector Document description error about 'Starting Position'.
> -
>
> Key: FLINK-25440
> URL: https://issues.apache.org/jira/browse/FLINK-25440
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 1.14.2
>Reporter: xiechenling
>Priority: Minor
>  Labels: pull-request-available
>
> Starting Position description error.
> Start from the specified message time by Message.getEventTime().
> StartCursor.fromMessageTime(long)
> it should be 'Start from the specified message time by publishTime.'





[GitHub] [flink] imaffe opened a new pull request #18345: [FLINK-25440][doc] startCurosr only supports publishTime not eventTime

2022-01-12 Thread GitBox


imaffe opened a new pull request #18345:
URL: https://github.com/apache/flink/pull/18345


   

[GitHub] [flink] flinkbot edited a comment on pull request #17777: [FLINK-24886][core] TimeUtils supports the form of m.

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #17777:
URL: https://github.com/apache/flink/pull/17777#issuecomment-966971402


   
   ## CI report:
   
   * b00b49e806baf132df4f13124b98469cef9ebb91 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29288)
 
   * 34f5bb8e51b6a57f934ff58d53eeb87252911c84 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29360)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] springMoon commented on pull request #18309: [FLINK-25582] [TABLE SQL / API] flink sql kafka source cannot special custom parallelism

2022-01-12 Thread GitBox


springMoon commented on pull request #18309:
URL: https://github.com/apache/flink/pull/18309#issuecomment-1011819020


   @alpreu thanks for the review, but @lmagic233 is right: a generic solution 
for custom source parallelism is better.






[GitHub] [flink] flinkbot edited a comment on pull request #17777: [FLINK-24886][core] TimeUtils supports the form of m.

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #17777:
URL: https://github.com/apache/flink/pull/17777#issuecomment-966971402


   
   ## CI report:
   
   * b00b49e806baf132df4f13124b98469cef9ebb91 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29288)
 
   * 34f5bb8e51b6a57f934ff58d53eeb87252911c84 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] liuyongvs commented on pull request #17777: [FLINK-24886][core] TimeUtils supports the form of m.

2022-01-12 Thread GitBox


liuyongvs commented on pull request #17777:
URL: https://github.com/apache/flink/pull/17777#issuecomment-1011814363


   > ```
   > MINUTES(ChronoUnit.MINUTES, singular("min"), singular("m"), 
plural("minute"))
   > ```
   
   Yes, but if we change it to `MINUTES(ChronoUnit.MINUTES, singular("min"), 
singular("m"), plural("minute"))`, I worry that users may later find it 
confusing. If you think it is OK, I will change the order.
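To make the ordering concern concrete, here is a small self-contained sketch (this is not Flink's actual `TimeUtils`; the `TimeParseSketch` class, its unit table, and the millisecond default are assumptions for illustration). It shows why the order in which unit suffixes are tried matters when both `"min"` and `"m"` map to minutes, and why `"ms"` must be checked before `"s"`:

```java
import java.time.Duration;
import java.util.LinkedHashMap;
import java.util.Map;

public class TimeParseSketch {

    // Insertion-ordered unit table: longer/overlapping suffixes first,
    // so "min" wins over "m" and "ms" wins over "s".
    private static final Map<String, Duration> UNITS = new LinkedHashMap<>();
    static {
        UNITS.put("min", Duration.ofMinutes(1));
        UNITS.put("ms", Duration.ofMillis(1));
        UNITS.put("s", Duration.ofSeconds(1));
        UNITS.put("m", Duration.ofMinutes(1));
        UNITS.put("h", Duration.ofHours(1));
    }

    public static Duration parse(String text) {
        String trimmed = text.trim();
        // First matching suffix in table order wins.
        for (Map.Entry<String, Duration> e : UNITS.entrySet()) {
            if (trimmed.endsWith(e.getKey())) {
                long value = Long.parseLong(
                        trimmed.substring(0, trimmed.length() - e.getKey().length()).trim());
                return e.getValue().multipliedBy(value);
            }
        }
        // No unit suffix at all: assume milliseconds.
        return Duration.ofMillis(Long.parseLong(trimmed));
    }

    public static void main(String[] args) {
        System.out.println(parse("5 m").toMinutes());   // 5
        System.out.println(parse("5 min").toMinutes()); // 5
        System.out.println(parse("500 ms").toMillis()); // 500
    }
}
```

If `"s"` were tried before `"ms"`, the input `"500 ms"` would match the `"s"` suffix and leave the non-numeric remainder `"500 m"`, so lookup order is not just cosmetic.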






[GitHub] [flink] wsry closed pull request #18235: Test change default batch configuration

2022-01-12 Thread GitBox


wsry closed pull request #18235:
URL: https://github.com/apache/flink/pull/18235


   






[GitHub] [flink] flinkbot edited a comment on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011809365


   
   ## CI report:
   
   * ffcee2854060f08f1f5ee3821f76823872010a15 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29359)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot commented on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


flinkbot commented on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011809921


   Thanks a lot for your contribution to the Apache Flink project. I'm the 
@flinkbot. I help the community
   to review your pull request. We will use this comment to track the progress 
of the review.
   
   
   ## Automated Checks
   Last check on commit ffcee2854060f08f1f5ee3821f76823872010a15 (Thu Jan 13 
05:30:15 UTC 2022)
   
   **Warnings:**
* No documentation files were touched! Remember to keep the Flink docs up 
to date!
* **This pull request references an unassigned [Jira 
ticket](https://issues.apache.org/jira/browse/FLINK-25484).** According to the 
[code contribution 
guide](https://flink.apache.org/contributing/contribute-code.html), tickets 
need to be assigned before starting with the implementation work.
   
   
   Mention the bot in a comment to re-run the automated checks.
   ## Review Progress
   
   * ❓ 1. The [description] looks good.
   * ❓ 2. There is [consensus] that the contribution should go into to Flink.
   * ❓ 3. Needs [attention] from.
   * ❓ 4. The change fits into the overall [architecture].
   * ❓ 5. Overall code [quality] is good.
   
   Please see the [Pull Request Review 
Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full 
explanation of the review process.
The Bot is tracking the review progress through labels. Labels are applied 
according to the order of the review items. For consensus, approval by a Flink 
committer or PMC member is required. Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot approve description` to approve one or more aspects (aspects: 
`description`, `consensus`, `architecture` and `quality`)
- `@flinkbot approve all` to approve all aspects
- `@flinkbot approve-until architecture` to approve everything until 
`architecture`
- `@flinkbot attention @username1 [@username2 ..]` to require somebody's 
attention
- `@flinkbot disapprove architecture` to remove an approval you gave earlier
   






[GitHub] [flink] flinkbot commented on pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


flinkbot commented on pull request #18344:
URL: https://github.com/apache/flink/pull/18344#issuecomment-1011809365


   
   ## CI report:
   
   * ffcee2854060f08f1f5ee3821f76823872010a15 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] lichang-bd opened a new pull request #18344: [Flink-25484][Filesystem] Support inactivityInterval config in table api

2022-01-12 Thread GitBox


lichang-bd opened a new pull request #18344:
URL: https://github.com/apache/flink/pull/18344


   Change-Id: Ica8ba43307476198c4f525ac40a54ab862b58ea7
   
   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4-node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   






[GitHub] [flink] flinkbot edited a comment on pull request #18336: [FLINK-25441][network] Wrap failure cause with ProducerFailedException only for PipelinedSubpartitionView

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18336:
URL: https://github.com/apache/flink/pull/18336#issuecomment-1010777588


   
   ## CI report:
   
   * 86b6155caba9160268797213db5e9c0a18c95a09 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29312)
 
   * 1a960b7f82e85d42e2a4f5eb9e9d63a203ed9de5 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29358)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] flinkbot edited a comment on pull request #18336: [FLINK-25441][network] Wrap failure cause with ProducerFailedException only for PipelinedSubpartitionView

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #18336:
URL: https://github.com/apache/flink/pull/18336#issuecomment-1010777588


   
   ## CI report:
   
   * 86b6155caba9160268797213db5e9c0a18c95a09 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29312)
 
   * 1a960b7f82e85d42e2a4f5eb9e9d63a203ed9de5 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   






[GitHub] [flink] wanglijie95 commented on pull request #18336: [FLINK-25441][network] Wrap failure cause with ProducerFailedException only for PipelinedSubpartitionView

2022-01-12 Thread GitBox


wanglijie95 commented on pull request #18336:
URL: https://github.com/apache/flink/pull/18336#issuecomment-1011797394


   > Changes LGTM. Though the test failure seems unrelated, could you please 
re-trigger it and get a green build? @wanglijie95
   
   Sure, and I will squash the commits.






[GitHub] [flink] wsry commented on pull request #18336: [FLINK-25441][network] Wrap failure cause with ProducerFailedException only for PipelinedSubpartitionView

2022-01-12 Thread GitBox


wsry commented on pull request #18336:
URL: https://github.com/apache/flink/pull/18336#issuecomment-1011790899


   Changes LGTM. Though the test failure seems unrelated, could you please 
re-trigger it and get a green build? @wanglijie95






[GitHub] [flink] flinkbot edited a comment on pull request #17676: [FLINK-14100][connectors] Add Oracle dialect

2022-01-12 Thread GitBox


flinkbot edited a comment on pull request #17676:
URL: https://github.com/apache/flink/pull/17676#issuecomment-960641444


   
   ## CI report:
   
   * 715383d5d8ec50ae6fd41db834d800bb65b99c80 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29095)
 
   * 35238e2baa5a30a5a5e047aceddedbfae6cfcd75 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=29357)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





