[GitHub] [flink] wangyang0918 commented on a change in pull request #13644: [FLINK-19542][k8s]Implement LeaderElectionService and LeaderRetrievalService based on Kubernetes API
wangyang0918 commented on a change in pull request #13644: URL: https://github.com/apache/flink/pull/13644#discussion_r513218189 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/leaderretrieval/ZooKeeperLeaderRetrievalDriver.java ## @@ -0,0 +1,195 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.leaderretrieval; + +import org.apache.flink.runtime.leaderelection.LeaderInformation; +import org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionDriver; +import org.apache.flink.util.ExceptionUtils; +import org.apache.flink.util.FlinkException; + +import org.apache.flink.shaded.curator4.org.apache.curator.framework.CuratorFramework; +import org.apache.flink.shaded.curator4.org.apache.curator.framework.api.UnhandledErrorListener; +import org.apache.flink.shaded.curator4.org.apache.curator.framework.recipes.cache.ChildData; +import org.apache.flink.shaded.curator4.org.apache.curator.framework.recipes.cache.NodeCache; +import org.apache.flink.shaded.curator4.org.apache.curator.framework.recipes.cache.NodeCacheListener; +import org.apache.flink.shaded.curator4.org.apache.curator.framework.state.ConnectionState; +import org.apache.flink.shaded.curator4.org.apache.curator.framework.state.ConnectionStateListener; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayInputStream; +import java.io.IOException; +import java.io.ObjectInputStream; +import java.util.UUID; + +import static org.apache.flink.util.Preconditions.checkNotNull; + +/** + * The counterpart to the {@link ZooKeeperLeaderElectionDriver}. + * {@link LeaderRetrievalService} implementation for Zookeeper. It retrieves the current leader which has + * been elected by the {@link ZooKeeperLeaderElectionDriver}. + * The leader address as well as the current leader session ID is retrieved from ZooKeeper. + */ +public class ZooKeeperLeaderRetrievalDriver implements LeaderRetrievalDriver, NodeCacheListener, UnhandledErrorListener { + private static final Logger LOG = LoggerFactory.getLogger(ZooKeeperLeaderRetrievalDriver.class); + + /** Connection to the used ZooKeeper quorum. */ + private final CuratorFramework client; + + /** Curator recipe to watch changes of a specific ZooKeeper node. */ + private final NodeCache cache; + + private final String retrievalPath; + + private final ConnectionStateListener connectionStateListener = (client, newState) -> handleStateChange(newState); + + private final LeaderRetrievalEventHandler leaderRetrievalEventHandler; + + private volatile boolean running; + + /** +* Creates a leader retrieval service which uses ZooKeeper to retrieve the leader information. +* +* @param client Client which constitutes the connection to the ZooKeeper quorum +* @param retrievalPath Path of the ZooKeeper node which contains the leader information +* @param leaderRetrievalEventHandler handler to notify the leader changes. +*/ + public ZooKeeperLeaderRetrievalDriver( + CuratorFramework client, + String retrievalPath, + LeaderRetrievalEventHandler leaderRetrievalEventHandler) throws Exception { + this.client = checkNotNull(client, "CuratorFramework client"); + this.cache = new NodeCache(client, retrievalPath); + this.retrievalPath = checkNotNull(retrievalPath); + + this.leaderRetrievalEventHandler = checkNotNull(leaderRetrievalEventHandler); + + client.getUnhandledErrorListenable().addListener(this); + cache.getListenable().addListener(this); + cache.start(); + + client.getConnectionStateListenable().addListener(connectionStateListener); + + running = true; + } + + @Override + public void close() throws Exception { + if (!running) { + return; + } + + running = false; Review comment: At the very beginning, I think the `volatile` is enough. But after a careful consider
[GitHub] [flink] tzulitai commented on pull request #13761: [FLINK-19741] Allow AbstractStreamOperator to skip restoring timers if subclasses are using raw keyed state
tzulitai commented on pull request #13761: URL: https://github.com/apache/flink/pull/13761#issuecomment-717743321 @pnowojski @banmoy Yes, I think it won't hurt to add checking the `isUsingCustomRawKeyedState` flag to the snapshot path of legacy raw keyed state timers. As of now, if users are already using raw keyed state, snapshotting the timers (in the case of RocksDB + heap-based timers) would have already failed with `IOException("Key group X already registered!")`. Additionally checking `isUsingCustomRawKeyedState` on the snapshot path gives us the opportunity to throw with a more meaningful error message. > Maybe we could go even one step further? throw new UnsupportedOperationException() when operator with isUsingCustomRawKeyedState() is trying to register a timer? Not quite sure about this. Registering timers should work regardless of the `isUsingCustomRawKeyedState` IF the user is not using the RocksDB + heap-based timers configuration. Technically, it is possible to do this check in the `InternalTimerServiceImpl`, I don't have a strong opinion on this. @banmoy what do you think? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wangyang0918 commented on a change in pull request #13644: [FLINK-19542][k8s]Implement LeaderElectionService and LeaderRetrievalService based on Kubernetes API
wangyang0918 commented on a change in pull request #13644: URL: https://github.com/apache/flink/pull/13644#discussion_r513218321 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/leaderelection/DefaultLeaderElectionServiceTest.java ## @@ -0,0 +1,92 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.leaderelection; + +import org.junit.Test; + +import java.util.UUID; + +import static org.hamcrest.Matchers.is; +import static org.hamcrest.Matchers.notNullValue; +import static org.junit.Assert.assertThat; + +/** + * Tests for {@link DefaultLeaderElectionService}. + */ +public class DefaultLeaderElectionServiceTest { + + private static final String TEST_URL = "akka//user/jobmanager"; + private static final long timeout = 30L * 1000L; + + @Test + public void testOnGrantAndRevokeLeadership() throws Exception { + final TestingLeaderElectionDriver.TestingLeaderElectionDriverFactory testingLeaderElectionDriverFactory = + new TestingLeaderElectionDriver.TestingLeaderElectionDriverFactory(); + final DefaultLeaderElectionService leaderElectionService = new DefaultLeaderElectionService( + testingLeaderElectionDriverFactory); + final TestingContender testingContender = new TestingContender(TEST_URL, leaderElectionService); + leaderElectionService.start(testingContender); + + // grant leadership + final TestingLeaderElectionDriver testingLeaderElectionDriver = + testingLeaderElectionDriverFactory.getCurrentLeaderDriver(); + assertThat(testingLeaderElectionDriver, is(notNullValue())); + testingLeaderElectionDriver.isLeader(); + + testingContender.waitForLeader(timeout); + assertThat(testingContender.isLeader(), is(true)); + assertThat(testingContender.getDescription(), is(TEST_URL)); + + // Check the external storage + assertThat(testingLeaderElectionDriver.getLeaderInformation().getLeaderAddress(), is(TEST_URL)); + + // revoke leadership + testingLeaderElectionDriver.notLeader(); + testingContender.waitForRevokeLeader(timeout); + assertThat(testingContender.isLeader(), is(false)); + + leaderElectionService.stop(); + } + + @Test + public void testLeaderInformationChangedAndShouldBeCorrected() throws Exception { + final TestingLeaderElectionDriver.TestingLeaderElectionDriverFactory testingLeaderElectionDriverFactory = + new TestingLeaderElectionDriver.TestingLeaderElectionDriverFactory(); + final DefaultLeaderElectionService leaderElectionService = new DefaultLeaderElectionService( + testingLeaderElectionDriverFactory); + final TestingContender testingContender = new TestingContender(TEST_URL, leaderElectionService); + leaderElectionService.start(testingContender); + + final TestingLeaderElectionDriver testingLeaderElectionDriver = + testingLeaderElectionDriverFactory.getCurrentLeaderDriver(); + assertThat(testingLeaderElectionDriver, is(notNullValue())); + testingLeaderElectionDriver.isLeader(); + testingContender.waitForLeader(timeout); + + // Leader information changed and should be corrected + testingLeaderElectionDriver.leaderInformationChanged(LeaderInformation.empty()); + assertThat(testingLeaderElectionDriver.getLeaderInformation().getLeaderAddress(), is(TEST_URL)); Review comment: Will check it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tzulitai edited a comment on pull request #13761: [FLINK-19741] Allow AbstractStreamOperator to skip restoring timers if subclasses are using raw keyed state
tzulitai edited a comment on pull request #13761: URL: https://github.com/apache/flink/pull/13761#issuecomment-717743321 @pnowojski @banmoy Yes, I think it won't hurt to add checking the `isUsingCustomRawKeyedState` flag to the snapshot path of legacy raw keyed state timers. As of now, if users are already using raw keyed state, snapshotting the timers (in the case of RocksDB + heap-based timers) would have already failed with `IOException("Key group X already registered!")`. Additionally checking `isUsingCustomRawKeyedState` on the snapshot path gives us the opportunity to throw with a more meaningful error message. Note that we can't just silently skip snapshotting timers, because that would result in data loss; we have to throw and fail the snapshot in this case. > Maybe we could go even one step further? throw new UnsupportedOperationException() when operator with isUsingCustomRawKeyedState() is trying to register a timer? Not quite sure about this. Registering timers should work regardless of the `isUsingCustomRawKeyedState` IF the user is not using the RocksDB + heap-based timers configuration. Technically, it is possible to do this check in the `InternalTimerServiceImpl`, I don't have a strong opinion on this. @banmoy what do you think? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tzulitai edited a comment on pull request #13761: [FLINK-19741] Allow AbstractStreamOperator to skip restoring timers if subclasses are using raw keyed state
tzulitai edited a comment on pull request #13761: URL: https://github.com/apache/flink/pull/13761#issuecomment-717743321 @pnowojski @banmoy Yes, I think it won't hurt to add checking the `isUsingCustomRawKeyedState` flag to the snapshot path of legacy raw keyed state timers. As of now, if users are already using raw keyed state, snapshotting the timers (in the case of RocksDB + heap-based timers) would have already failed with `IOException("Key group X already registered!")`. Additionally checking `isUsingCustomRawKeyedState` on the snapshot path gives us the opportunity to throw with a more meaningful error message. Note that we can't just silently skip snapshotting timers, because that would result in data loss; we have to throw and fail the snapshot in this case. > Maybe we could go even one step further? throw new UnsupportedOperationException() when operator with isUsingCustomRawKeyedState() is trying to register a timer? @pnowojski Not quite sure about this. Registering timers should work regardless of the `isUsingCustomRawKeyedState` IF the user is not using the RocksDB + heap-based timers configuration. Technically, it is possible to do this check in the `InternalTimerServiceImpl`, I don't have a strong opinion on this. @banmoy what do you think? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wangyang0918 commented on a change in pull request #13644: [FLINK-19542][k8s]Implement LeaderElectionService and LeaderRetrievalService based on Kubernetes API
wangyang0918 commented on a change in pull request #13644: URL: https://github.com/apache/flink/pull/13644#discussion_r513219224 ## File path: flink-runtime/src/test/java/org/apache/flink/runtime/leaderelection/TestingLeaderBase.java ## @@ -0,0 +1,116 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.leaderelection; + +import java.util.concurrent.TimeoutException; +import java.util.function.Supplier; + +/** + * Base class which provides some convenience functions for testing purposes of {@link LeaderContender} and + * {@link LeaderElectionEventHandler}. + */ +public class TestingLeaderBase { + + protected boolean leader = false; + protected Throwable error = null; + + protected final Object lock = new Object(); + private final Object errorLock = new Object(); + + /** +* Waits until the contender becomes the leader or until the timeout has been exceeded. +* +* @param timeout +* @throws TimeoutException +*/ + public void waitForLeader(long timeout) throws TimeoutException { + waitFor(this::isLeader, timeout, "Contender was not elected as the leader within " + timeout + "ms"); + } + + /** +* Waits until the contender revokes the leader or until the timeout has been exceeded. +* +* @param timeout +* @throws TimeoutException +*/ + public void waitForRevokeLeader(long timeout) throws TimeoutException { + waitFor(() -> !isLeader(), timeout, "Contender was not revoked within " + timeout + "ms"); + } + + protected void waitFor(Supplier supplier, long timeout, String msg) throws TimeoutException { + long start = System.currentTimeMillis(); + long curTimeout; + + while (!supplier.get() && (curTimeout = timeout - System.currentTimeMillis() + start) > 0) { + synchronized (lock) { + try { + lock.wait(curTimeout); + } catch (InterruptedException e) { + // we got interrupted so check again for the condition + } + } + } + + if (!supplier.get()) { + throw new TimeoutException(msg); + } Review comment: Hmm. Actually, this is just the old codes I moved from `TestingContender`. It is indeed could be optimized. I will do it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] JingsongLi commented on a change in pull request #13605: [FLINK-19599][table] Introduce Filesystem format factories to integrate new FileSource to table
JingsongLi commented on a change in pull request #13605: URL: https://github.com/apache/flink/pull/13605#discussion_r513221615 ## File path: flink-table/flink-table-runtime-blink/src/main/java/org/apache/flink/table/filesystem/AbstractFileSystemTable.java ## @@ -53,15 +62,75 @@ context.getCatalogTable().getOptions().forEach(tableOptions::setString); this.schema = TableSchemaUtils.getPhysicalSchema(context.getCatalogTable().getSchema()); this.partitionKeys = context.getCatalogTable().getPartitionKeys(); - this.path = new Path(context.getCatalogTable().getOptions().getOrDefault(PATH.key(), PATH.defaultValue())); - this.defaultPartName = context.getCatalogTable().getOptions().getOrDefault( - PARTITION_DEFAULT_NAME.key(), PARTITION_DEFAULT_NAME.defaultValue()); + this.path = new Path(tableOptions.get(PATH)); + this.defaultPartName = tableOptions.get(PARTITION_DEFAULT_NAME); } - static FileSystemFormatFactory createFormatFactory(ReadableConfig tableOptions) { + ReadableConfig formatOptions(String identifier) { + return new DelegatingConfiguration(tableOptions, identifier + "."); + } + + FileSystemFormatFactory createFormatFactory() { return FactoryUtil.discoverFactory( Thread.currentThread().getContextClassLoader(), FileSystemFormatFactory.class, tableOptions.get(FactoryUtil.FORMAT)); } + + @SuppressWarnings("rawtypes") + > Optional discoverOptionalEncodingFormat( Review comment: I will move these logic to `FileSystemTableFactory` and use `FactoryUtil` to create formats. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-19154) Application mode deletes HA data in case of suspended ZooKeeper connection
[ https://issues.apache.org/jira/browse/FLINK-19154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17221998#comment-17221998 ] Tzu-Li (Gordon) Tai commented on FLINK-19154: - Hi [~casidiablo], there is an ongoing discussion for releasing Flink 1.11.3 that would include this fix. Hopefully this should happen soon: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Releasing-Apache-Flink-1-11-3-td45989.html > Application mode deletes HA data in case of suspended ZooKeeper connection > -- > > Key: FLINK-19154 > URL: https://issues.apache.org/jira/browse/FLINK-19154 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.12.0, 1.11.1 > Environment: Run a stand-alone cluster that runs a single job (if you > are familiar with the way Ververica Platform runs Flink jobs, we use a very > similar approach). It runs Flink 1.11.1 straight from the official docker > image. >Reporter: Husky Zeng >Assignee: Kostas Kloudas >Priority: Blocker > Labels: pull-request-available > Fix For: 1.12.0, 1.11.3 > > > A user reported that Flink's application mode deletes HA data in case of a > suspended ZooKeeper connection [1]. > The problem seems to be that the {{ApplicationDispatcherBootstrap}} class > produces an exception (that the request job can no longer be found because of > a lost ZooKeeper connection) which will be interpreted as a job failure. Due > to this interpretation, the cluster will be shut down with a terminal state > of FAILED which will cause the HA data to be cleaned up. The exact problem > occurs in the {{JobStatusPollingUtils.getJobResult}} which is called by > {{ApplicationDispatcherBootstrap.getJobResult()}}. > The above described behaviour can be found in this log [2]. > [1] > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoint-metadata-deleted-by-Flink-after-ZK-connection-issues-td37937.html > [2] https://pastebin.com/raw/uH9KDU2L -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] pnowojski commented on pull request #13718: [FLINK-18811] Pick another tmpDir if an IOException occurs when creating spill file
pnowojski commented on pull request #13718: URL: https://github.com/apache/flink/pull/13718#issuecomment-717750630 Thanks @yuchuanchen . Note for the future, you didn't have to open new PR :) It would be fine to update this one. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pnowojski merged pull request #13810: [FLINK-18811][network] Pick another tmpDir if an IOException occurs w…
pnowojski merged pull request #13810: URL: https://github.com/apache/flink/pull/13810 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-18811) if a disk is damaged, taskmanager should choose another disk for temp dir , rather than throw an IOException, which causes flink job restart over and over again
[ https://issues.apache.org/jira/browse/FLINK-18811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski reassigned FLINK-18811: -- Assignee: Kai Chen > if a disk is damaged, taskmanager should choose another disk for temp dir , > rather than throw an IOException, which causes flink job restart over and > over again > > > Key: FLINK-18811 > URL: https://issues.apache.org/jira/browse/FLINK-18811 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network > Environment: flink-1.10 >Reporter: Kai Chen >Assignee: Kai Chen >Priority: Major > Labels: pull-request-available > Attachments: flink_disk_error.png > > > I met this Exception when a hard disk was damaged: > !flink_disk_error.png! > I checked the code and found that flink will create a temp file when Record > length > 5 MB: > > {code:java} > // SpillingAdaptiveSpanningRecordDeserializer.java > if (nextRecordLength > THRESHOLD_FOR_SPILLING) { >// create a spilling channel and put the data there >this.spillingChannel = createSpillingChannel(); >ByteBuffer toWrite = partial.segment.wrap(partial.position, numBytesChunk); >FileUtils.writeCompletely(this.spillingChannel, toWrite); > } > {code} > The tempDir is random picked from all `tempDirs`. Well on yarn mode, one > `tempDir` usually represents one hard disk. > > In may opinion, if a hard disk is damaged, taskmanager should pick another > disk(tmpDir) for Spilling Channel, rather than throw an IOException, which > causes flink job restart over and over again. > If we could just change “SpillingAdaptiveSpanningRecordDeserializer" like > this: > {code:java} > // SpillingAdaptiveSpanningRecordDeserializer.java > private FileChannel createSpillingChannel() throws IOException { >if (spillFile != null) { > throw new IllegalStateException("Spilling file already exists."); >} >// try to find a unique file name for the spilling channel >int maxAttempts = 10; >String[] tempDirs = this.tempDirs; >for (int attempt = 0; attempt < maxAttempts; attempt++) { > int dirIndex = rnd.nextInt(tempDirs.length); > String directory = tempDirs[dirIndex]; > spillFile = new File(directory, randomString(rnd) + ".inputchannel"); > try { > if (spillFile.createNewFile()) { > return new RandomAccessFile(spillFile, "rw").getChannel(); > } > } catch (IOException e) { > // if there is no tempDir left to try > if(tempDirs.length <= 1) { > throw e; > } > LOG.warn("Caught an IOException when creating spill file: " + > directory + ". Attempt " + attempt, e); > tempDirs = (String[])ArrayUtils.remove(tempDirs,dirIndex); > } >} > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-18811) if a disk is damaged, taskmanager should choose another disk for temp dir , rather than throw an IOException, which causes flink job restart over and over again
[ https://issues.apache.org/jira/browse/FLINK-18811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski closed FLINK-18811. -- Fix Version/s: 1.12.0 Resolution: Fixed merged commit e38716f into apache:master > if a disk is damaged, taskmanager should choose another disk for temp dir , > rather than throw an IOException, which causes flink job restart over and > over again > > > Key: FLINK-18811 > URL: https://issues.apache.org/jira/browse/FLINK-18811 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network > Environment: flink-1.10 >Reporter: Kai Chen >Assignee: Kai Chen >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0 > > Attachments: flink_disk_error.png > > > I met this Exception when a hard disk was damaged: > !flink_disk_error.png! > I checked the code and found that flink will create a temp file when Record > length > 5 MB: > > {code:java} > // SpillingAdaptiveSpanningRecordDeserializer.java > if (nextRecordLength > THRESHOLD_FOR_SPILLING) { >// create a spilling channel and put the data there >this.spillingChannel = createSpillingChannel(); >ByteBuffer toWrite = partial.segment.wrap(partial.position, numBytesChunk); >FileUtils.writeCompletely(this.spillingChannel, toWrite); > } > {code} > The tempDir is random picked from all `tempDirs`. Well on yarn mode, one > `tempDir` usually represents one hard disk. > > In may opinion, if a hard disk is damaged, taskmanager should pick another > disk(tmpDir) for Spilling Channel, rather than throw an IOException, which > causes flink job restart over and over again. > If we could just change “SpillingAdaptiveSpanningRecordDeserializer" like > this: > {code:java} > // SpillingAdaptiveSpanningRecordDeserializer.java > private FileChannel createSpillingChannel() throws IOException { >if (spillFile != null) { > throw new IllegalStateException("Spilling file already exists."); >} >// try to find a unique file name for the spilling channel >int maxAttempts = 10; >String[] tempDirs = this.tempDirs; >for (int attempt = 0; attempt < maxAttempts; attempt++) { > int dirIndex = rnd.nextInt(tempDirs.length); > String directory = tempDirs[dirIndex]; > spillFile = new File(directory, randomString(rnd) + ".inputchannel"); > try { > if (spillFile.createNewFile()) { > return new RandomAccessFile(spillFile, "rw").getChannel(); > } > } catch (IOException e) { > // if there is no tempDir left to try > if(tempDirs.length <= 1) { > throw e; > } > LOG.warn("Caught an IOException when creating spill file: " + > directory + ". Attempt " + attempt, e); > tempDirs = (String[])ArrayUtils.remove(tempDirs,dirIndex); > } >} > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] pnowojski commented on pull request #13761: [FLINK-19741] Allow AbstractStreamOperator to skip restoring timers if subclasses are using raw keyed state
pnowojski commented on pull request #13761: URL: https://github.com/apache/flink/pull/13761#issuecomment-717752111 Note: I don't have a strong opinion about adding some `checkState` somewhere. It would be nice to have it, but if it's not that easy to provide it, I'm fine with the PR as it is. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] Antti-Kaikkonen commented on pull request #13773: [backport-1.11] [FLINK-19748] Iterating key groups in raw keyed stream on restore fails if some key groups weren't written
Antti-Kaikkonen commented on pull request #13773: URL: https://github.com/apache/flink/pull/13773#issuecomment-717756123 @tzulitai I tried it and restoring from a savepoint worked. As FlinkStatefunCountTo1M doesn't actually use state I also tried with my other app that uses statefun-flink-datastream and it was able to restore from a savepoint without errors. Thank you very much! I only tested with rocksdb state backend and rocksdb timers. The Flink version tested was 1.11.2 from https://flink.apache.org/downloads.html. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-19822) Remove redundant shuffle for streaming
[ https://issues.apache.org/jira/browse/FLINK-19822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] godfrey he updated FLINK-19822: --- Summary: Remove redundant shuffle for streaming (was: Remove redundant shuffle for stream) > Remove redundant shuffle for streaming > -- > > Key: FLINK-19822 > URL: https://issues.apache.org/jira/browse/FLINK-19822 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: godfrey he >Assignee: godfrey he >Priority: Major > Fix For: 1.12.0 > > > This is similar > [FLINK-12575|https://issues.apache.org/jira/browse/FLINK-12575], we could > implement {{satisfyTraits}} method for stream nodes to remove redundant > shuffle. This could add more possibilities that more operators can be merged > into multiple input operator. > Note: This feature should only be enabled only when multiple input operator > is enabled because currently key selector is applied only at the header of > the chain, the state access of other operators in the chain may be not > correct. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] wuchong opened a new pull request #13815: [BP-1.11][FLINK-19587][table-planner-blink] Fix error result when casting binary as varchar
wuchong opened a new pull request #13815: URL: https://github.com/apache/flink/pull/13815 This is a cherry pick to release-1.11 branch. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #13815: [BP-1.11][FLINK-19587][table-planner-blink] Fix error result when casting binary as varchar
flinkbot commented on pull request #13815: URL: https://github.com/apache/flink/pull/13815#issuecomment-717757502 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 60f1b09b2eaca728dcc6a3ca930ec88067d2ef90 (Wed Oct 28 07:39:22 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wuchong merged pull request #13801: [FLINK-19213][docs-zh] Update the Chinese documentation
wuchong merged pull request #13801: URL: https://github.com/apache/flink/pull/13801 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-19213) Update the Chinese documentation
[ https://issues.apache.org/jira/browse/FLINK-19213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu closed FLINK-19213. --- Fix Version/s: 1.12.0 Resolution: Fixed Translated in master: f48ffe55c6b337a772bfbc26ea8da3e3e70b09ec > Update the Chinese documentation > > > Key: FLINK-19213 > URL: https://issues.apache.org/jira/browse/FLINK-19213 > Project: Flink > Issue Type: Sub-task > Components: chinese-translation, Documentation >Reporter: Dawid Wysakowicz >Assignee: jiawen xiao >Priority: Trivial > Labels: pull-request-available > Fix For: 1.12.0 > > Time Spent: 168h > > We should update the Chinese documentation with the changes introduced in > FLINK-18802 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] tzulitai commented on pull request #13773: [backport-1.11] [FLINK-19748] Iterating key groups in raw keyed stream on restore fails if some key groups weren't written
tzulitai commented on pull request #13773: URL: https://github.com/apache/flink/pull/13773#issuecomment-717757992 @Antti-Kaikkonen great news, thank you. We'll keep you updated on the JIRA regarding an official release candidate that includes the fixes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] yuchuanchen commented on pull request #13718: [FLINK-18811] Pick another tmpDir if an IOException occurs when creating spill file
yuchuanchen commented on pull request #13718: URL: https://github.com/apache/flink/pull/13718#issuecomment-717758921 Thanks @pnowojski This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wangxlong opened a new pull request #13816: [BP-1.11][FLINK-19587][table-planner-blink] Fix error result when casting bina…
wangxlong opened a new pull request #13816: URL: https://github.com/apache/flink/pull/13816 ## What is the purpose of the change This is a backport of FLINK-19587 cherry picked from commit d9b0ac97ee4675aebdab1592af663b95fdc5051b This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-19587) Error result when casting binary type as varchar
[ https://issues.apache.org/jira/browse/FLINK-19587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222007#comment-17222007 ] hailong wang commented on FLINK-19587: -- Hi [~jark], I open a pr for 1.11. https://github.com/apache/flink/pull/13816 > Error result when casting binary type as varchar > > > Key: FLINK-19587 > URL: https://issues.apache.org/jira/browse/FLINK-19587 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.11.0 >Reporter: hailong wang >Assignee: hailong wang >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0, 1.11.3 > > > The result is error when casting binary type as varchar type. > For example, > {code:java} > @Test > def testCast1(): Unit = { > testSqlApi( > "CAST(X'68656c6c6f' as varchar)", > "hello") > } > {code} > The result is > {code:java} > Expected :hello > Actual :[B@57fae983 > {code} > It is right as follow, > {code:java} > @Test > def testCast(): Unit = { > testSqlApi( > "CAST(CAST(X'68656c6c6f' as varbinary) as varchar)", > "hello") > } > {code} > We just need to change > {code:java} > case (VARBINARY, VARCHAR | CHAR){code} > to > {code:java} > case (BINARY | VARBINARY, VARCHAR | CHAR) > {code} > in ScalarOperatorGens#generateCast. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-19587) Error result when casting binary type as varchar
[ https://issues.apache.org/jira/browse/FLINK-19587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222009#comment-17222009 ] hailong wang commented on FLINK-19587: -- I just saw you have did it. I closed it. :D > Error result when casting binary type as varchar > > > Key: FLINK-19587 > URL: https://issues.apache.org/jira/browse/FLINK-19587 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.11.0 >Reporter: hailong wang >Assignee: hailong wang >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0, 1.11.3 > > > The result is error when casting binary type as varchar type. > For example, > {code:java} > @Test > def testCast1(): Unit = { > testSqlApi( > "CAST(X'68656c6c6f' as varchar)", > "hello") > } > {code} > The result is > {code:java} > Expected :hello > Actual :[B@57fae983 > {code} > It is right as follow, > {code:java} > @Test > def testCast(): Unit = { > testSqlApi( > "CAST(CAST(X'68656c6c6f' as varbinary) as varchar)", > "hello") > } > {code} > We just need to change > {code:java} > case (VARBINARY, VARCHAR | CHAR){code} > to > {code:java} > case (BINARY | VARBINARY, VARCHAR | CHAR) > {code} > in ScalarOperatorGens#generateCast. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] wangxlong closed pull request #13816: [BP-1.11][FLINK-19587][table-planner-blink] Fix error result when casting bina…
wangxlong closed pull request #13816: URL: https://github.com/apache/flink/pull/13816 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #13816: [BP-1.11][FLINK-19587][table-planner-blink] Fix error result when casting bina…
flinkbot commented on pull request #13816: URL: https://github.com/apache/flink/pull/13816#issuecomment-717760310 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 8d9b3d077a930c47d1f9846850a3f54134262007 (Wed Oct 28 07:46:41 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-19587) Error result when casting binary type as varchar
[ https://issues.apache.org/jira/browse/FLINK-19587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222011#comment-17222011 ] Jark Wu commented on FLINK-19587: - :D > Error result when casting binary type as varchar > > > Key: FLINK-19587 > URL: https://issues.apache.org/jira/browse/FLINK-19587 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.11.0 >Reporter: hailong wang >Assignee: hailong wang >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0, 1.11.3 > > > The result is error when casting binary type as varchar type. > For example, > {code:java} > @Test > def testCast1(): Unit = { > testSqlApi( > "CAST(X'68656c6c6f' as varchar)", > "hello") > } > {code} > The result is > {code:java} > Expected :hello > Actual :[B@57fae983 > {code} > It is right as follow, > {code:java} > @Test > def testCast(): Unit = { > testSqlApi( > "CAST(CAST(X'68656c6c6f' as varbinary) as varchar)", > "hello") > } > {code} > We just need to change > {code:java} > case (VARBINARY, VARCHAR | CHAR){code} > to > {code:java} > case (BINARY | VARBINARY, VARCHAR | CHAR) > {code} > in ScalarOperatorGens#generateCast. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #13744: [FLINK-19766][table-runtime] Introduce File streaming compaction operators
flinkbot edited a comment on pull request #13744: URL: https://github.com/apache/flink/pull/13744#issuecomment-714369975 ## CI report: * 164c8b5b7750ed567813d70f160ef9f064d0a3b0 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8458) * 322bc2c3c11a0f5735db4a475864f894f38866da Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8466) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-19846) Grammar mistakes in annotations and log
zhouchao created FLINK-19846: Summary: Grammar mistakes in annotations and log Key: FLINK-19846 URL: https://issues.apache.org/jira/browse/FLINK-19846 Project: Flink Issue Type: Wish Affects Versions: 1.11.2 Reporter: zhouchao Fix For: 1.12.0 There exit some grammar mistakes in annotations and documents. The mistakes include but are not limited to the following examples: * a entry in WebLogAnalysis.java [246:34] and adm-zip.js [291:33](which should be an entry) * a input in JobGraphGenerator.java [1125:69] etc(which should be an input) * a intersection * an user-* in Table.java etc. (which should be a user) using global search in intellij idea, more mistakes could be foud like this. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #13756: [FLINK-19770][python][test] Changed the PythonProgramOptionTest to be an ITCase.
flinkbot edited a comment on pull request #13756: URL: https://github.com/apache/flink/pull/13756#issuecomment-714895505 ## CI report: * 9c215d1339aebe0b7249dc6e06b5a5d5d74c25c5 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8368) * 37447c7adbd68701dbce2f515ff69af119be5a5b UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] rmetzger commented on pull request #13796: [FLINK-19810][CI] Automatically run a basic NOTICE file check on CI
rmetzger commented on pull request #13796: URL: https://github.com/apache/flink/pull/13796#issuecomment-717761562 Thanks a lot for taking a look. I believe the statement by the tool is correct: The notice file contains the following dependency: `com.apache.commons:commons-compress:1.20`, but the expected dependency is `org.apache.commons:commons-compress:1.20`. Notice the `com.` vs `org.` Do you agree to how the tool is integrated into the build process in priciple? If so, I will check and fix all dependency issues found by the tool, so that we can merge it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13812: [FLINK-19674][docs-zh] Translate "Docker" of "Clusters & Depolyment" page into Chinese
flinkbot edited a comment on pull request #13812: URL: https://github.com/apache/flink/pull/13812#issuecomment-717687160 ## CI report: * 4f66617c4540a4987a5852cf0168f7e3a552c91a Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8457) * 74a934ea4dd99e6bc44ac84a9f314468faf8376b Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8470) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13808: [FLINK-19834] Make the TestSink reusable in all the sink related tests.
flinkbot edited a comment on pull request #13808: URL: https://github.com/apache/flink/pull/13808#issuecomment-717348429 ## CI report: * 202d1f9a25defee69d632dbd8cc497646d9580f6 UNKNOWN * 6386d601a02b699a26feb0b1565a68939262076d Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8422) * dac456ff8b4af822ba3c883f253d71fb3d5d4c9f Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8469) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] ShubinRuan opened a new pull request #13817: [HotFix][docs] Fix broken link of docker.md
ShubinRuan opened a new pull request #13817: URL: https://github.com/apache/flink/pull/13817 ## What is the purpose of the change Fix broken link of docker.md The page url is https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/docker.html The markdown file is located in flink/docs/ops/deployment/docker.md The "how to run the Flink image" link in the Advanced customization chapter of docker.md cannot be redirected normally。 ## Brief change log - Fix broken link of `flink/docs/ops/deployment/docker.md` ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): no - The public API, i.e., is any changed class annotated with @public(Evolving): no - The serializers: no - The runtime per-record code paths (performance sensitive): no - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no - The S3 file system connector: no ## Documentation - Does this pull request introduce a new feature? no - If yes, how is the feature documented? no This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] AHeise commented on a change in pull request #13741: [FLINK-19680][checkpointing] Announce timeoutable CheckpointBarriers
AHeise commented on a change in pull request #13741: URL: https://github.com/apache/flink/pull/13741#discussion_r513238810 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/RemoteInputChannel.java ## @@ -437,19 +440,20 @@ public void onBuffer(Buffer buffer, int sequenceNumber, int backlog) throws IOEx wasEmpty = receivedBuffers.isEmpty(); - if (buffer.getDataType().hasPriority()) { - receivedBuffers.addPriorityElement(new SequenceBuffer(buffer, sequenceNumber)); - if (channelStatePersister.checkForBarrier(buffer)) { - // checkpoint was not yet started by task thread, - // so remember the numbers of buffers to spill for the time when it will be started - numBuffersOvertaken = receivedBuffers.getNumUnprioritizedElements(); - } - firstPriorityEvent = receivedBuffers.getNumPriorityElements() == 1; + SequenceBuffer sequenceBuffer = new SequenceBuffer(buffer, sequenceNumber); + DataType dataType = buffer.getDataType(); + if (dataType.hasPriority()) { + checkPriorityXorAnnouncement(buffer); + firstPriorityEvent = addPriorityBuffer(sequenceBuffer); } else { - receivedBuffers.add(new SequenceBuffer(buffer, sequenceNumber)); + receivedBuffers.add(sequenceBuffer); channelStatePersister.maybePersist(buffer); - } + if (dataType.requiresAnnouncement()) { + checkPriorityXorAnnouncement(buffer); + firstPriorityEvent = addPriorityBuffer(announce(sequenceBuffer)); + } Review comment: Missed the `announce` part. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #13817: [HotFix][docs] Fix broken link of docker.md
flinkbot commented on pull request #13817: URL: https://github.com/apache/flink/pull/13817#issuecomment-717763924 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit f7fcad2291f072819e6576b291327abd16947445 (Wed Oct 28 07:55:51 UTC 2020) **Warnings:** * Documentation files were touched, but no `.zh.md` files: Update Chinese documentation or file Jira ticket. * **Invalid pull request title: No valid Jira ID provided** Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wuchong commented on pull request #13300: [FLINK-19077][table-runtime] Import process time temporal join operator.
wuchong commented on pull request #13300: URL: https://github.com/apache/flink/pull/13300#issuecomment-717767478 Build is passed: https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=8452&view=results This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wuchong edited a comment on pull request #13300: [FLINK-19077][table-runtime] Import process time temporal join operator.
wuchong edited a comment on pull request #13300: URL: https://github.com/apache/flink/pull/13300#issuecomment-717767478 Build is passed: https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=8452&view=results Merging... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13300: [FLINK-19077][table-runtime] Import process time temporal join operator.
flinkbot edited a comment on pull request #13300: URL: https://github.com/apache/flink/pull/13300#issuecomment-684957567 ## CI report: * 6bf5dc2e71b14086d5f908cc300060595adc46ea Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8452) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wuchong merged pull request #13300: [FLINK-19077][table-runtime] Import process time temporal join operator.
wuchong merged pull request #13300: URL: https://github.com/apache/flink/pull/13300 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13605: [FLINK-19599][table] Introduce Filesystem format factories to integrate new FileSource to table
flinkbot edited a comment on pull request #13605: URL: https://github.com/apache/flink/pull/13605#issuecomment-707521684 ## CI report: * a9df8a1384eb6306656b4fd952edd4be5d7a857d UNKNOWN * 05116d4768052821a37adfa725e60e40f4f71176 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8403) * 61fed9445fbd820423b73ef520d5f9ffc9e0606d UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-19077) Import process time temporal join operator
[ https://issues.apache.org/jira/browse/FLINK-19077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jark Wu closed FLINK-19077. --- Fix Version/s: 1.12.0 Resolution: Fixed Implemented in master: faf500d1bb34a17e520f0739c9fd79f7958756ea > Import process time temporal join operator > -- > > Key: FLINK-19077 > URL: https://issues.apache.org/jira/browse/FLINK-19077 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Runtime >Reporter: Leonard Xu >Assignee: Leonard Xu >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0 > > > import *TemporalProcessTimeJoinOperator* for Processing-Time temporal join. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19847) Can we create a fast support on the Nested table join?
xiaogang zhou created FLINK-19847: - Summary: Can we create a fast support on the Nested table join? Key: FLINK-19847 URL: https://issues.apache.org/jira/browse/FLINK-19847 Project: Flink Issue Type: Wish Components: API / DataStream Affects Versions: 1.11.1 Reporter: xiaogang zhou In CommonLookupJoin, one TODO is support nested lookup keys in the future, // currently we only support top-level lookup keys can we create a fast support on the Array join? thx -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #13331: [FLINK-19079][table-runtime] Import rowtime deduplicate operator
flinkbot edited a comment on pull request #13331: URL: https://github.com/apache/flink/pull/13331#issuecomment-687585649 ## CI report: * 008e4ae2c2e10b712d84a4f324634a71fe5f84ac Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8235) * 115ad9e4e8c7258d5831848a7144bfbb46a5b6f9 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-19848) flink docs of Building Flink from Source bug
jackylau created FLINK-19848: Summary: flink docs of Building Flink from Source bug Key: FLINK-19848 URL: https://issues.apache.org/jira/browse/FLINK-19848 Project: Flink Issue Type: Bug Components: Documentation Affects Versions: 1.11.0 Reporter: jackylau Fix For: 1.12.0 To speed up the build you can skip tests, QA plugins, and JavaDocs: {{mvn clean install -DskipTests -Dfast}} {{mvn clean install -DskipTests -Dscala-2.12}} {{fast and }}{{scala-2.12}}{{ is profile, not properties}} {{}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-19847) Can we create a fast support on the Nested table join?
[ https://issues.apache.org/jira/browse/FLINK-19847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222026#comment-17222026 ] xiaogang zhou commented on FLINK-19847: --- [~jark] Please help review, thx > Can we create a fast support on the Nested table join? > -- > > Key: FLINK-19847 > URL: https://issues.apache.org/jira/browse/FLINK-19847 > Project: Flink > Issue Type: Wish > Components: API / DataStream >Affects Versions: 1.11.1 >Reporter: xiaogang zhou >Priority: Major > > In CommonLookupJoin, one TODO is > support nested lookup keys in the future, > // currently we only support top-level lookup keys > > can we create a fast support on the Array join? thx -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #13763: [FLINK-19779][avro] Remove the record_ field name prefix for Confluen…
flinkbot edited a comment on pull request #13763: URL: https://github.com/apache/flink/pull/13763#issuecomment-715195599 ## CI report: * f004220668e20dcd9860026b69566868d473db33 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8455) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13756: [FLINK-19770][python][test] Changed the PythonProgramOptionTest to be an ITCase.
flinkbot edited a comment on pull request #13756: URL: https://github.com/apache/flink/pull/13756#issuecomment-714895505 ## CI report: * 9c215d1339aebe0b7249dc6e06b5a5d5d74c25c5 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8368) * 37447c7adbd68701dbce2f515ff69af119be5a5b Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8477) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-19847) Can we create a fast support on the Nested table join?
[ https://issues.apache.org/jira/browse/FLINK-19847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222027#comment-17222027 ] Jark Wu commented on FLINK-19847: - [~zhoujira86] could you describe your use case and why you need the nested lookup key join? > Can we create a fast support on the Nested table join? > -- > > Key: FLINK-19847 > URL: https://issues.apache.org/jira/browse/FLINK-19847 > Project: Flink > Issue Type: Wish > Components: API / DataStream >Affects Versions: 1.11.1 >Reporter: xiaogang zhou >Priority: Major > > In CommonLookupJoin, one TODO is > support nested lookup keys in the future, > // currently we only support top-level lookup keys > > can we create a fast support on the Array join? thx -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] rmetzger commented on pull request #13779: [FLINK-18122][e2e] Make K8s test more resilient by retrying and failing docker iamge build
rmetzger commented on pull request #13779: URL: https://github.com/apache/flink/pull/13779#issuecomment-717770788 Thanks. I'll fix the typo in the commit message & merge it. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #13815: [BP-1.11][FLINK-19587][table-planner-blink] Fix error result when casting binary as varchar
flinkbot commented on pull request #13815: URL: https://github.com/apache/flink/pull/13815#issuecomment-717771070 ## CI report: * 60f1b09b2eaca728dcc6a3ca930ec88067d2ef90 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-19848) flink docs of Building Flink from Source bug
[ https://issues.apache.org/jira/browse/FLINK-19848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jackylau updated FLINK-19848: - Description: To speed up the build you can skip tests, QA plugins, and JavaDocs: {{mvn clean install -DskipTests -Dfast}} {{mvn clean install -DskipTests -Dscala-2.12}} {{fast and }}{{scala-2.12}}{{ is profile, not properties}} was: To speed up the build you can skip tests, QA plugins, and JavaDocs: {{mvn clean install -DskipTests -Dfast}} {{mvn clean install -DskipTests -Dscala-2.12}} {{fast and }}{{scala-2.12}}{{ is profile, not properties}} {{}} > flink docs of Building Flink from Source bug > > > Key: FLINK-19848 > URL: https://issues.apache.org/jira/browse/FLINK-19848 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.11.0 >Reporter: jackylau >Priority: Major > Fix For: 1.12.0 > > > To speed up the build you can skip tests, QA plugins, and JavaDocs: > > {{mvn clean install -DskipTests -Dfast}} > > {{mvn clean install -DskipTests -Dscala-2.12}} > {{fast and }}{{scala-2.12}}{{ is profile, not properties}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (FLINK-18122) Kubernetes test fails with "error: timed out waiting for the condition on jobs/flink-job-cluster"
[ https://issues.apache.org/jira/browse/FLINK-18122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger closed FLINK-18122. -- Fix Version/s: 1.12.0 Resolution: Fixed Improved test stability in https://github.com/apache/flink/commit/d47433a9384b109cba4d83ab4cf4b9a3457de885 > Kubernetes test fails with "error: timed out waiting for the condition on > jobs/flink-job-cluster" > - > > Key: FLINK-18122 > URL: https://issues.apache.org/jira/browse/FLINK-18122 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes, Tests >Affects Versions: 1.11.0 >Reporter: Robert Metzger >Assignee: Robert Metzger >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.12.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=2697&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5 > {code} > 2020-06-04T09:25:40.7205843Z service/flink-job-cluster created > 2020-06-04T09:25:40.9661515Z job.batch/flink-job-cluster created > 2020-06-04T09:25:41.2189123Z deployment.apps/flink-task-manager created > 2020-06-04T10:32:32.6402983Z error: timed out waiting for the condition on > jobs/flink-job-cluster > 2020-06-04T10:32:33.8057757Z error: unable to upgrade connection: container > not found ("flink-task-manager") > 2020-06-04T10:32:33.8111302Z sort: cannot read: > '/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-56335570120/out/kubernetes_wc_out*': > No such file or directory > 2020-06-04T10:32:33.8124455Z FAIL WordCount: Output hash mismatch. Got > d41d8cd98f00b204e9800998ecf8427e, expected e682ec6622b5e83f2eb614617d5ab2cf. > 2020-06-04T10:32:33.8125379Z head hexdump of actual: > 2020-06-04T10:32:33.8136133Z head: cannot open > '/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-56335570120/out/kubernetes_wc_out*' > for reading: No such file or directory > 2020-06-04T10:32:33.8344715Z Debugging failed Kubernetes test: > 2020-06-04T10:32:33.8345469Z Currently existing Kubernetes resources > 2020-06-04T10:32:36.4977853Z I0604 10:32:36.497383 13191 request.go:621] > Throttling request took 1.198606989s, request: > GET:https://10.1.0.4:8443/apis/rbac.authorization.k8s.io/v1?timeout=32s > 2020-06-04T10:32:46.6975735Z I0604 10:32:46.697234 13191 request.go:621] > Throttling request took 4.398107353s, request: > GET:https://10.1.0.4:8443/apis/authorization.k8s.io/v1?timeout=32s > 2020-06-04T10:32:57.4978637Z I0604 10:32:57.497209 13191 request.go:621] > Throttling request took 1.198449167s, request: > GET:https://10.1.0.4:8443/apis/apps/v1?timeout=32s > 2020-06-04T10:33:07.4980104Z I0604 10:33:07.497320 13191 request.go:621] > Throttling request took 4.198274438s, request: > GET:https://10.1.0.4:8443/apis/apiextensions.k8s.io/v1?timeout=32s > 2020-06-04T10:33:18.4976060Z I0604 10:33:18.497258 13191 request.go:621] > Throttling request took 1.19871495s, request: > GET:https://10.1.0.4:8443/apis/apps/v1?timeout=32s > 2020-06-04T10:33:28.4979129Z I0604 10:33:28.497276 13191 request.go:621] > Throttling request took 4.198369672s, request: > GET:https://10.1.0.4:8443/apis/rbac.authorization.k8s.io/v1?timeout=32s > 2020-06-04T10:33:30.9182069Z NAME READY > STATUS RESTARTS AGE > 2020-06-04T10:33:30.9184099Z pod/flink-job-cluster-dtb67 0/1 > ErrImageNeverPull 0 67m > 2020-06-04T10:33:30.9184869Z pod/flink-task-manager-74ccc9bd9-psqwm 0/1 > ErrImageNeverPull 0 67m > 2020-06-04T10:33:30.9185226Z > 2020-06-04T10:33:30.9185926Z NAMETYPE > CLUSTER-IP EXTERNAL-IP PORT(S) > AGE > 2020-06-04T10:33:30.9186832Z service/flink-job-cluster NodePort > 10.111.92.199 > 6123:32501/TCP,6124:31360/TCP,6125:30025/TCP,8081:30081/TCP 67m > 2020-06-04T10:33:30.9187545Z service/kubernetes ClusterIP > 10.96.0.1 443/TCP > 68m > 2020-06-04T10:33:30.9187976Z > 2020-06-04T10:33:30.9188472Z NAME READY > UP-TO-DATE AVAILABLE AGE > 2020-06-04T10:33:30.9189179Z deployment.apps/flink-task-manager 0/1 1 > 0 67m > 2020-06-04T10:33:30.9189508Z > 2020-06-04T10:33:30.9189815Z NAME > DESIRED CURRENT READY AGE > 2020-06-04T10:33:30.9190418Z replicaset.apps/flink-task-manager-74ccc9bd9 1 > 1 0 67m > 2020-06-04T10:33:30.9190662Z > 2020-06-04T10:33:30.9190891Z NAME
[jira] [Commented] (FLINK-19587) Error result when casting binary type as varchar
[ https://issues.apache.org/jira/browse/FLINK-19587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222029#comment-17222029 ] Jark Wu commented on FLINK-19587: - Hi [~hailong wang], I found it's not compatible to just cherry pick the commit. Could you adapt your code in release-1.11 and create a new pull request? I have closed mine. > Error result when casting binary type as varchar > > > Key: FLINK-19587 > URL: https://issues.apache.org/jira/browse/FLINK-19587 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.11.0 >Reporter: hailong wang >Assignee: hailong wang >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0, 1.11.3 > > > The result is error when casting binary type as varchar type. > For example, > {code:java} > @Test > def testCast1(): Unit = { > testSqlApi( > "CAST(X'68656c6c6f' as varchar)", > "hello") > } > {code} > The result is > {code:java} > Expected :hello > Actual :[B@57fae983 > {code} > It is right as follow, > {code:java} > @Test > def testCast(): Unit = { > testSqlApi( > "CAST(CAST(X'68656c6c6f' as varbinary) as varchar)", > "hello") > } > {code} > We just need to change > {code:java} > case (VARBINARY, VARCHAR | CHAR){code} > to > {code:java} > case (BINARY | VARBINARY, VARCHAR | CHAR) > {code} > in ScalarOperatorGens#generateCast. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] rmetzger closed pull request #13779: [FLINK-18122][e2e] Make K8s test more resilient by retrying and failing docker iamge build
rmetzger closed pull request #13779: URL: https://github.com/apache/flink/pull/13779 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #13817: [HotFix][docs] Fix broken link of docker.md
flinkbot commented on pull request #13817: URL: https://github.com/apache/flink/pull/13817#issuecomment-717771629 ## CI report: * f7fcad2291f072819e6576b291327abd16947445 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] wuchong closed pull request #13815: [BP-1.11][FLINK-19587][table-planner-blink] Fix error result when casting binary as varchar
wuchong closed pull request #13815: URL: https://github.com/apache/flink/pull/13815 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] StephanEwen commented on pull request #13813: [FLINK-19491][avro] AvroSerializerSnapshot cannot handle large schema
StephanEwen commented on pull request #13813: URL: https://github.com/apache/flink/pull/13813#issuecomment-717772823 Thanks for taking a look at this. The feature code is fine, but I think the tests need some improvements. - This misses adding a V2 config snapshot to the test resources and ensuring we can resume from there. - The test for large schema re-engineers the V2 method in the tests. - This is not necessary. We don't care whether the previous version fails, we only want the new version to succeed - Rebuilding functionality in tests and testing that does not help tp guard anything in the production code, so it is ineffective. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] dawidwys commented on a change in pull request #13763: [FLINK-19779][avro] Remove the record_ field name prefix for Confluen…
dawidwys commented on a change in pull request #13763: URL: https://github.com/apache/flink/pull/13763#discussion_r513252724 ## File path: flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/typeutils/AvroSchemaConverter.java ## @@ -297,17 +299,22 @@ private static DataType convertToDataType(Schema schema) { * @return Avro's {@link Schema} matching this logical type. */ public static Schema convertToSchema(LogicalType logicalType) { - return convertToSchema(logicalType, "record"); + return convertToSchema(logicalType, "record", true); } /** * Converts Flink SQL {@link LogicalType} (can be nested) into an Avro schema. * * @param logicalType logical type * @param rowName the record name +* @param top whether it is parsing the root record, +*if it is, the logical type nullability would be ignored * @return Avro's {@link Schema} matching this logical type. */ - public static Schema convertToSchema(LogicalType logicalType, String rowName) { + public static Schema convertToSchema( + LogicalType logicalType, + String rowName, + boolean top) { Review comment: > It is not true, a row (null, null) is nullable true. And i don't think it makes sense to change the planner behavior in general in order to fix a specific use case. Excuse me, but I wholeheartedly disagree with your statement. null =/= (null, null). (null, null) is still `NOT NULL`. A whole row in a Table can not be null. Only particular columns can be null. Therefore the top level row of a Table is always `NOT NULL`. I am not suggesting changing planner behaviour for a particular use case. The planner should always produce `NOT NULL` type for a top level row of a Table. If it doesn't, it is a bug. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] dawidwys commented on a change in pull request #13763: [FLINK-19779][avro] Remove the record_ field name prefix for Confluen…
dawidwys commented on a change in pull request #13763: URL: https://github.com/apache/flink/pull/13763#discussion_r513253682 ## File path: flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/typeutils/AvroSchemaConverter.java ## @@ -297,17 +299,22 @@ private static DataType convertToDataType(Schema schema) { * @return Avro's {@link Schema} matching this logical type. */ public static Schema convertToSchema(LogicalType logicalType) { - return convertToSchema(logicalType, "record"); + return convertToSchema(logicalType, "record", true); } /** * Converts Flink SQL {@link LogicalType} (can be nested) into an Avro schema. * * @param logicalType logical type * @param rowName the record name +* @param top whether it is parsing the root record, +*if it is, the logical type nullability would be ignored * @return Avro's {@link Schema} matching this logical type. */ - public static Schema convertToSchema(LogicalType logicalType, String rowName) { + public static Schema convertToSchema( + LogicalType logicalType, + String rowName, + boolean top) { Review comment: > I don't think we should let each invoker to decide whether to make the data type not null, because in current codebase, we should always do that, make the decision everyone is error-prone and hard to maintain. I agree making the same decision over and over again at multiple location is error prone and hard to maintain and that's what I want to avoid. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-19847) Can we create a fast support on the Nested table join?
[ https://issues.apache.org/jira/browse/FLINK-19847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222033#comment-17222033 ] xiaogang zhou edited comment on FLINK-19847 at 10/28/20, 8:20 AM: -- In our situation, we use a Asynclookup for the redis and some inner rpc service. we combine multikey lookup to save the network cost. so the DDL uses the array type. when i try to join on array, some exception happens. and I notice the todo. looks like something need to be done in the extractConstantField? was (Author: zhoujira86): In our situation, we use a Asynclookup for the redis and some inner rpc service. we combine multikey lookup to save the network cost. so the DDL uses the array type. where i try to join on array, some exception happens. and I notice the todo. looks like something need to be done in the extractConstantField? > Can we create a fast support on the Nested table join? > -- > > Key: FLINK-19847 > URL: https://issues.apache.org/jira/browse/FLINK-19847 > Project: Flink > Issue Type: Wish > Components: API / DataStream >Affects Versions: 1.11.1 >Reporter: xiaogang zhou >Priority: Major > > In CommonLookupJoin, one TODO is > support nested lookup keys in the future, > // currently we only support top-level lookup keys > > can we create a fast support on the Array join? thx -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-19847) Can we create a fast support on the Nested table join?
[ https://issues.apache.org/jira/browse/FLINK-19847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222033#comment-17222033 ] xiaogang zhou commented on FLINK-19847: --- In our situation, we use a Asynclookup for the redis and some inner rpc service. we combine multikey lookup to save the network cost. so the DDL uses the array type. where i try to join on array, some exception happens. and I notice the todo. looks like something need to be done in the extractConstantField? > Can we create a fast support on the Nested table join? > -- > > Key: FLINK-19847 > URL: https://issues.apache.org/jira/browse/FLINK-19847 > Project: Flink > Issue Type: Wish > Components: API / DataStream >Affects Versions: 1.11.1 >Reporter: xiaogang zhou >Priority: Major > > In CommonLookupJoin, one TODO is > support nested lookup keys in the future, > // currently we only support top-level lookup keys > > can we create a fast support on the Array join? thx -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19849) Check NOTICE files for 1.12 release
[ https://issues.apache.org/jira/browse/FLINK-19849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Metzger updated FLINK-19849: --- Description: This will be automated through FLINK-19810 > Check NOTICE files for 1.12 release > --- > > Key: FLINK-19849 > URL: https://issues.apache.org/jira/browse/FLINK-19849 > Project: Flink > Issue Type: Bug > Components: Build System >Affects Versions: 1.12.0 >Reporter: Robert Metzger >Assignee: Robert Metzger >Priority: Blocker > Fix For: 1.12.0 > > > This will be automated through FLINK-19810 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-19849) Check NOTICE files for 1.12 release
Robert Metzger created FLINK-19849: -- Summary: Check NOTICE files for 1.12 release Key: FLINK-19849 URL: https://issues.apache.org/jira/browse/FLINK-19849 Project: Flink Issue Type: Bug Components: Build System Affects Versions: 1.12.0 Reporter: Robert Metzger Assignee: Robert Metzger Fix For: 1.12.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] gm7y8 commented on pull request #13458: FLINK-18851 [runtime] Add checkpoint type to checkpoint history entries in Web UI
gm7y8 commented on pull request #13458: URL: https://github.com/apache/flink/pull/13458#issuecomment-717783238 @flinkbot run azure This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] danny0405 commented on a change in pull request #13763: [FLINK-19779][avro] Remove the record_ field name prefix for Confluen…
danny0405 commented on a change in pull request #13763: URL: https://github.com/apache/flink/pull/13763#discussion_r513264987 ## File path: flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/typeutils/AvroSchemaConverter.java ## @@ -297,17 +299,22 @@ private static DataType convertToDataType(Schema schema) { * @return Avro's {@link Schema} matching this logical type. */ public static Schema convertToSchema(LogicalType logicalType) { - return convertToSchema(logicalType, "record"); + return convertToSchema(logicalType, "record", true); } /** * Converts Flink SQL {@link LogicalType} (can be nested) into an Avro schema. * * @param logicalType logical type * @param rowName the record name +* @param top whether it is parsing the root record, +*if it is, the logical type nullability would be ignored * @return Avro's {@link Schema} matching this logical type. */ - public static Schema convertToSchema(LogicalType logicalType, String rowName) { + public static Schema convertToSchema( + LogicalType logicalType, + String rowName, + boolean top) { Review comment: >Excuse me, but I wholeheartedly disagree with your statement. null =/= (null, null). (null, null) is still NOT NULL. A whole row in a Table can not be null. Only particular columns can be null. Therefore the top level row of a Table is always NOT NULL. You can test it in PostgreSQL with the following SQL: ```sql create type my_type as (a int, b varchar(20)); create table t1( f0 my_type, f1 varchar(20) ); insert into t1 values((null, null), 'def'); select f0 is null from t1; -- it returns true ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-19847) Can we create a fast support on the Nested table join?
[ https://issues.apache.org/jira/browse/FLINK-19847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222042#comment-17222042 ] xiaogang zhou commented on FLINK-19847: --- looks like splitJoinCondition already has some problem, Maybe i try multiple fields in ddl first. thx > Can we create a fast support on the Nested table join? > -- > > Key: FLINK-19847 > URL: https://issues.apache.org/jira/browse/FLINK-19847 > Project: Flink > Issue Type: Wish > Components: API / DataStream >Affects Versions: 1.11.1 >Reporter: xiaogang zhou >Priority: Major > > In CommonLookupJoin, one TODO is > support nested lookup keys in the future, > // currently we only support top-level lookup keys > > can we create a fast support on the Array join? thx -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] kl0u commented on pull request #13808: [FLINK-19834] Make the TestSink reusable in all the sink related tests.
kl0u commented on pull request #13808: URL: https://github.com/apache/flink/pull/13808#issuecomment-717784829 +1 for merging from me as soon as AZP given green, but I think @aljoscha 's comments are not in the branch. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] kl0u commented on pull request #13808: [FLINK-19834] Make the TestSink reusable in all the sink related tests.
kl0u commented on pull request #13808: URL: https://github.com/apache/flink/pull/13808#issuecomment-717785112 Thanks for the work @guoweiM ! This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-19848) flink docs of Building Flink from Source bug
[ https://issues.apache.org/jira/browse/FLINK-19848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-19848: --- Labels: pull-request-available (was: ) > flink docs of Building Flink from Source bug > > > Key: FLINK-19848 > URL: https://issues.apache.org/jira/browse/FLINK-19848 > Project: Flink > Issue Type: Bug > Components: Documentation >Affects Versions: 1.11.0 >Reporter: jackylau >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0 > > > To speed up the build you can skip tests, QA plugins, and JavaDocs: > > {{mvn clean install -DskipTests -Dfast}} > > {{mvn clean install -DskipTests -Dscala-2.12}} > {{fast and }}{{scala-2.12}}{{ is profile, not properties}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] liuyongvs opened a new pull request #13818: [FLINK-19848][docs] fix docs bug of build flink.
liuyongvs opened a new pull request #13818: URL: https://github.com/apache/flink/pull/13818 ## What is the purpose of the change *fix docs bug of build flink.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (no ) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? ( no) - If yes, how is the feature documented? (docs) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13331: [FLINK-19079][table-runtime] Import rowtime deduplicate operator
flinkbot edited a comment on pull request #13331: URL: https://github.com/apache/flink/pull/13331#issuecomment-687585649 ## CI report: * 115ad9e4e8c7258d5831848a7144bfbb46a5b6f9 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8478) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13307: [FLINK-19078][table-runtime] Import rowtime join temporal operator
flinkbot edited a comment on pull request #13307: URL: https://github.com/apache/flink/pull/13307#issuecomment-685763264 ## CI report: * dee9ebd0a1947e6580064910a0dace6f5e349c52 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8460) * a558cb0fd984dc5f256d7710f67239d0851204b3 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #13458: FLINK-18851 [runtime] Add checkpoint type to checkpoint history entries in Web UI
flinkbot edited a comment on pull request #13458: URL: https://github.com/apache/flink/pull/13458#issuecomment-697041219 ## CI report: * 7311b0d12d19a645391ea0359a9aa6318806363b UNKNOWN * Unknown: [CANCELED](TBD) * c416edd7f8696bcf2ed6b77d0238ae7312282a51 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #13818: [FLINK-19848][docs] fix docs bug of build flink.
flinkbot commented on pull request #13818: URL: https://github.com/apache/flink/pull/13818#issuecomment-717787755 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 80c826228cfa96218b23bd087f4047673e91f734 (Wed Oct 28 08:47:53 UTC 2020) **Warnings:** * **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-19848).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink-statefun] igalshilman commented on a change in pull request #168: [FLINK-19692] Write all assigned key groups into the keyed raw stream with a Header.
igalshilman commented on a change in pull request #168: URL: https://github.com/apache/flink-statefun/pull/168#discussion_r513269330 ## File path: statefun-flink/statefun-flink-core/src/test/java/org/apache/flink/statefun/flink/core/logger/UnboundedFeedbackLoggerTest.java ## @@ -79,6 +81,21 @@ public void roundTripWithSpill() throws Exception { roundTrip(1_000_000, 0); } + @Test + public void testHeader() throws IOException { Review comment: Yes definitely. Would add that. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] dawidwys commented on a change in pull request #13763: [FLINK-19779][avro] Remove the record_ field name prefix for Confluen…
dawidwys commented on a change in pull request #13763: URL: https://github.com/apache/flink/pull/13763#discussion_r513269624 ## File path: flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/typeutils/AvroSchemaConverter.java ## @@ -297,17 +299,22 @@ private static DataType convertToDataType(Schema schema) { * @return Avro's {@link Schema} matching this logical type. */ public static Schema convertToSchema(LogicalType logicalType) { - return convertToSchema(logicalType, "record"); + return convertToSchema(logicalType, "record", true); } /** * Converts Flink SQL {@link LogicalType} (can be nested) into an Avro schema. * * @param logicalType logical type * @param rowName the record name +* @param top whether it is parsing the root record, +*if it is, the logical type nullability would be ignored * @return Avro's {@link Schema} matching this logical type. */ - public static Schema convertToSchema(LogicalType logicalType, String rowName) { + public static Schema convertToSchema( + LogicalType logicalType, + String rowName, + boolean top) { Review comment: But the `my_type` is not a top level row in your example. It is a type of a column. It has nothing to do with the case we're discussing. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-19587) Error result when casting binary type as varchar
[ https://issues.apache.org/jira/browse/FLINK-19587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222044#comment-17222044 ] hailong wang commented on FLINK-19587: -- [~jark] OK, I will open a new pr soon. > Error result when casting binary type as varchar > > > Key: FLINK-19587 > URL: https://issues.apache.org/jira/browse/FLINK-19587 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.11.0 >Reporter: hailong wang >Assignee: hailong wang >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0, 1.11.3 > > > The result is error when casting binary type as varchar type. > For example, > {code:java} > @Test > def testCast1(): Unit = { > testSqlApi( > "CAST(X'68656c6c6f' as varchar)", > "hello") > } > {code} > The result is > {code:java} > Expected :hello > Actual :[B@57fae983 > {code} > It is right as follow, > {code:java} > @Test > def testCast(): Unit = { > testSqlApi( > "CAST(CAST(X'68656c6c6f' as varbinary) as varchar)", > "hello") > } > {code} > We just need to change > {code:java} > case (VARBINARY, VARCHAR | CHAR){code} > to > {code:java} > case (BINARY | VARBINARY, VARCHAR | CHAR) > {code} > in ScalarOperatorGens#generateCast. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #13605: [FLINK-19599][table] Introduce Filesystem format factories to integrate new FileSource to table
flinkbot edited a comment on pull request #13605: URL: https://github.com/apache/flink/pull/13605#issuecomment-707521684 ## CI report: * a9df8a1384eb6306656b4fd952edd4be5d7a857d UNKNOWN * 61fed9445fbd820423b73ef520d5f9ffc9e0606d Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8475) * fa0dc913f56aa2567c30073c044f688f9ed74fee UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-19154) Application mode deletes HA data in case of suspended ZooKeeper connection
[ https://issues.apache.org/jira/browse/FLINK-19154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222045#comment-17222045 ] Till Rohrmann commented on FLINK-19154: --- Thanks for trying this fix out [~casidiablo] and happy to hear that it solved your problem. If you want to use Flink's snapshot artifacts, then you have to add {code} snapshot Apache Snapshot repository https://repository.apache.org/content/repositories/snapshots/ always {code} to your {pom.xml}. If you do not bundle Flink dependencies with your user jar which you put into your image, then it should actually not be necessary to recompile the user jar (unless we introduced an incompatible change with Flink 1.12). > Application mode deletes HA data in case of suspended ZooKeeper connection > -- > > Key: FLINK-19154 > URL: https://issues.apache.org/jira/browse/FLINK-19154 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.12.0, 1.11.1 > Environment: Run a stand-alone cluster that runs a single job (if you > are familiar with the way Ververica Platform runs Flink jobs, we use a very > similar approach). It runs Flink 1.11.1 straight from the official docker > image. >Reporter: Husky Zeng >Assignee: Kostas Kloudas >Priority: Blocker > Labels: pull-request-available > Fix For: 1.12.0, 1.11.3 > > > A user reported that Flink's application mode deletes HA data in case of a > suspended ZooKeeper connection [1]. > The problem seems to be that the {{ApplicationDispatcherBootstrap}} class > produces an exception (that the request job can no longer be found because of > a lost ZooKeeper connection) which will be interpreted as a job failure. Due > to this interpretation, the cluster will be shut down with a terminal state > of FAILED which will cause the HA data to be cleaned up. The exact problem > occurs in the {{JobStatusPollingUtils.getJobResult}} which is called by > {{ApplicationDispatcherBootstrap.getJobResult()}}. > The above described behaviour can be found in this log [2]. > [1] > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoint-metadata-deleted-by-Flink-after-ZK-connection-issues-td37937.html > [2] https://pastebin.com/raw/uH9KDU2L -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-19154) Application mode deletes HA data in case of suspended ZooKeeper connection
[ https://issues.apache.org/jira/browse/FLINK-19154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222045#comment-17222045 ] Till Rohrmann edited comment on FLINK-19154 at 10/28/20, 8:56 AM: -- Thanks for trying this fix out [~casidiablo] and happy to hear that it solved your problem. If you want to use Flink's snapshot artifacts, then you have to add {code} snapshot Apache Snapshot repository https://repository.apache.org/content/repositories/snapshots/ always {code} to your {{pom.xml}}. If you do not bundle Flink dependencies with your user jar which you put into your image, then it should actually not be necessary to recompile the user jar (unless we introduced an incompatible change with Flink 1.12). was (Author: till.rohrmann): Thanks for trying this fix out [~casidiablo] and happy to hear that it solved your problem. If you want to use Flink's snapshot artifacts, then you have to add {code} snapshot Apache Snapshot repository https://repository.apache.org/content/repositories/snapshots/ always {code} to your {pom.xml}. If you do not bundle Flink dependencies with your user jar which you put into your image, then it should actually not be necessary to recompile the user jar (unless we introduced an incompatible change with Flink 1.12). > Application mode deletes HA data in case of suspended ZooKeeper connection > -- > > Key: FLINK-19154 > URL: https://issues.apache.org/jira/browse/FLINK-19154 > Project: Flink > Issue Type: Bug > Components: Client / Job Submission >Affects Versions: 1.12.0, 1.11.1 > Environment: Run a stand-alone cluster that runs a single job (if you > are familiar with the way Ververica Platform runs Flink jobs, we use a very > similar approach). It runs Flink 1.11.1 straight from the official docker > image. >Reporter: Husky Zeng >Assignee: Kostas Kloudas >Priority: Blocker > Labels: pull-request-available > Fix For: 1.12.0, 1.11.3 > > > A user reported that Flink's application mode deletes HA data in case of a > suspended ZooKeeper connection [1]. > The problem seems to be that the {{ApplicationDispatcherBootstrap}} class > produces an exception (that the request job can no longer be found because of > a lost ZooKeeper connection) which will be interpreted as a job failure. Due > to this interpretation, the cluster will be shut down with a terminal state > of FAILED which will cause the HA data to be cleaned up. The exact problem > occurs in the {{JobStatusPollingUtils.getJobResult}} which is called by > {{ApplicationDispatcherBootstrap.getJobResult()}}. > The above described behaviour can be found in this log [2]. > [1] > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoint-metadata-deleted-by-Flink-after-ZK-connection-issues-td37937.html > [2] https://pastebin.com/raw/uH9KDU2L -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-12884) FLIP-144: Native Kubernetes HA Service
[ https://issues.apache.org/jira/browse/FLINK-12884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222047#comment-17222047 ] Till Rohrmann commented on FLINK-12884: --- Hi [~shravan.adharapurapu], we are currently working on merging the K8s HA services into the master. We hope that we can ship it with the {{1.12}} release which should come in the next couple of weeks. The best thing to do is to follow this ticket in order to see feature's progress. > FLIP-144: Native Kubernetes HA Service > -- > > Key: FLINK-12884 > URL: https://issues.apache.org/jira/browse/FLINK-12884 > Project: Flink > Issue Type: New Feature > Components: Deployment / Kubernetes, Runtime / Coordination >Reporter: MalcolmSanders >Assignee: Yang Wang >Priority: Major > Fix For: 1.12.0 > > > Currently flink only supports HighAvailabilityService using zookeeper. As a > result, it requires a zookeeper cluster to be deployed on k8s cluster if our > customers needs high availability for flink. If we support > HighAvailabilityService based on native k8s APIs, it will save the efforts of > zookeeper deployment as well as the resources used by zookeeper cluster. It > might be especially helpful for customers who run small-scale k8s clusters so > that flink HighAvailabilityService may not cause too much overhead on k8s > clusters. > Previously [FLINK-11105|https://issues.apache.org/jira/browse/FLINK-11105] > has proposed a HighAvailabilityService using etcd. As [~NathanHowell] > suggested in FLINK-11105, since k8s doesn't expose its own etcd cluster by > design (see [Securing etcd > clusters|https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/#securing-etcd-clusters]), > it also requires the deployment of etcd cluster if flink uses etcd to > achieve HA. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19834) Make the TestSink reusable in all the sink related tests.
[ https://issues.apache.org/jira/browse/FLINK-19834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kostas Kloudas updated FLINK-19834: --- Affects Version/s: 1.12.0 > Make the TestSink reusable in all the sink related tests. > - > > Key: FLINK-19834 > URL: https://issues.apache.org/jira/browse/FLINK-19834 > Project: Flink > Issue Type: Sub-task > Components: API / DataStream >Affects Versions: 1.12.0 >Reporter: Guowei Ma >Assignee: Guowei Ma >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19834) Make the TestSink reusable in all the sink related tests.
[ https://issues.apache.org/jira/browse/FLINK-19834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kostas Kloudas updated FLINK-19834: --- Fix Version/s: 1.12.0 > Make the TestSink reusable in all the sink related tests. > - > > Key: FLINK-19834 > URL: https://issues.apache.org/jira/browse/FLINK-19834 > Project: Flink > Issue Type: Sub-task > Components: API / DataStream >Affects Versions: 1.12.0 >Reporter: Guowei Ma >Assignee: Guowei Ma >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-19834) Make the TestSink reusable in all the sink related tests.
[ https://issues.apache.org/jira/browse/FLINK-19834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kostas Kloudas reassigned FLINK-19834: -- Assignee: Guowei Ma > Make the TestSink reusable in all the sink related tests. > - > > Key: FLINK-19834 > URL: https://issues.apache.org/jira/browse/FLINK-19834 > Project: Flink > Issue Type: Sub-task > Components: API / DataStream >Reporter: Guowei Ma >Assignee: Guowei Ma >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] gm7y8 commented on pull request #13458: FLINK-18851 [runtime] Add checkpoint type to checkpoint history entries in Web UI
gm7y8 commented on pull request #13458: URL: https://github.com/apache/flink/pull/13458#issuecomment-717796453 > Thanks @gm7y8 . Additionally, I verified manually that the changes result in the expected behavior. Let's wait for @vthinkxie to get back to us to review the code itself. @XComp @vthinkxie I just verified the fix after making suggested enhancements. it is working as expected. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-19834) Make the TestSink reusable in all the sink related tests.
[ https://issues.apache.org/jira/browse/FLINK-19834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kostas Kloudas updated FLINK-19834: --- Component/s: Tests > Make the TestSink reusable in all the sink related tests. > - > > Key: FLINK-19834 > URL: https://issues.apache.org/jira/browse/FLINK-19834 > Project: Flink > Issue Type: Sub-task > Components: API / DataStream, Tests >Affects Versions: 1.12.0 >Reporter: Guowei Ma >Assignee: Guowei Ma >Priority: Major > Labels: pull-request-available > Fix For: 1.12.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] shouweikun commented on pull request #13789: [FLINK-19727][table-runtime] Implement ParallelismProvider for sink i…
shouweikun commented on pull request #13789: URL: https://github.com/apache/flink/pull/13789#issuecomment-717800065 @flinkbot run azure This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zhuxiaoshang opened a new pull request #13819: [hotfix][docs]fix some mistakes in StreamingFileSink
zhuxiaoshang opened a new pull request #13819: URL: https://github.com/apache/flink/pull/13819 ## What is the purpose of the change *fix some mistakes in StreamingFileSink* ## Brief change log - *fix some mistakes in StreamingFileSink* ## Verifying this change This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (no) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zhuxiaoshang commented on pull request #13620: [hotfix][doc] the first FSDataOutputStream should be FSDataInputStream in FileSystem#L178
zhuxiaoshang commented on pull request #13620: URL: https://github.com/apache/flink/pull/13620#issuecomment-717801129 @JingsongLi plz have a review. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #13819: [hotfix][docs]fix some mistakes in StreamingFileSink
flinkbot commented on pull request #13819: URL: https://github.com/apache/flink/pull/13819#issuecomment-717802407 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit e4cf3eb0b429e4ffe12a5053e0d8110c2eb50693 (Wed Oct 28 09:15:53 UTC 2020) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into to Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann commented on pull request #13648: [FLINK-19632] Introduce a new ResultPartitionType for Approximate Local Recovery
tillrohrmann commented on pull request #13648: URL: https://github.com/apache/flink/pull/13648#issuecomment-717803324 Sorry for joining the discussion so late but a couple of questions came up when discussing scheduler changes with Yuan offline. I wanted to ask why we need a special `ResultPartitionType` for the approximate local recovery? Shouldn't it be conceptually possible that we support the normal and approximative recovery behaviour with the same pipelined partitions? If we say that we can reconnect to every pipelined result partition (including dropping partially consumed results), then it can be the responsibility of the scheduler to make sure that producers are restarted as well in order to ensure exactly/at-least once processing guarantees. If not, then we would simply consume from where we have left off. As far as I understand the existing `ResultPartitionType.PIPELINED(_BOUNDED)` cannot be used because we release the result partition if the downstream consumer disconnects. I believe that this is not a strict contract of pipelined result partitions but more of an implementation artefact. Couldn't we solve the problem of disappearing pipelined result partitions by binding the lifecyle of a pipelined result partition to the lifecycle of a `Task`? We could say that a `Task` can only terminate once the pipelined result partition has been consumed. Moreover, a `Task` will clean up the result partition if it fails or gets canceled. That way, we have a clearly defined lifecycle and make sure that these results get cleaned up (iff the `Task` reaches a terminal state). I would love to hear your feedback @pnowojski, @zhijiangW and @rkhachatryan and also learn more about you reasoning to introduce a new result partition type. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann edited a comment on pull request #13648: [FLINK-19632] Introduce a new ResultPartitionType for Approximate Local Recovery
tillrohrmann edited a comment on pull request #13648: URL: https://github.com/apache/flink/pull/13648#issuecomment-717803324 Sorry for joining the discussion so late but a couple of questions came up when discussing scheduler changes with Yuan offline. I wanted to ask why we need a special `ResultPartitionType` for the approximate local recovery? Shouldn't it be conceptually possible that we support the normal and approximative recovery behaviour with the same pipelined partitions? If we say that we can reconnect to every pipelined result partition (including dropping partially consumed results), then it can be the responsibility of the scheduler to make sure that producers are restarted as well in order to ensure exactly/at-least once processing guarantees. If not, then we would simply consume from where we have left off. As far as I understand the existing `ResultPartitionType.PIPELINED(_BOUNDED)` cannot be used because we release the result partition if the downstream consumer disconnects. I believe that this is not a strict contract of pipelined result partitions but more of an implementation artefact. Couldn't we solve the problem of disappearing pipelined result partitions by binding the lifecyle of a pipelined result partition to the lifecycle of a `Task`? We could say that a `Task` can only terminate once the pipelined result partition has been consumed. Moreover, a `Task` will clean up the result partition if it fails or gets canceled. That way, we have a clearly defined lifecycle and make sure that these results get cleaned up (iff the `Task` reaches a terminal state). I would love to hear your feedback @pnowojski, @zhijiangW, @curcur and @rkhachatryan and also learn more about you reasoning to introduce a new result partition type. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann edited a comment on pull request #13648: [FLINK-19632] Introduce a new ResultPartitionType for Approximate Local Recovery
tillrohrmann edited a comment on pull request #13648: URL: https://github.com/apache/flink/pull/13648#issuecomment-717803324 Sorry for joining the discussion so late but a couple of questions came up when discussing scheduler changes with Yuan offline. I wanted to ask why we need a special `ResultPartitionType` for the approximate local recovery? Shouldn't it be conceptually possible that we support the normal and approximative recovery behaviour with the same pipelined partitions? If we say that we can reconnect to every pipelined result partition (including dropping partially consumed results), then it can be the responsibility of the scheduler to make sure that producers are restarted as well in order to ensure exactly/at-least once processing guarantees. If not, then we would simply consume from where we have left off. As far as I understand the existing `ResultPartitionType.PIPELINED(_BOUNDED)` cannot be used because we release the result partition if the downstream consumer disconnects. I believe that this is not a strict contract of pipelined result partitions but more of an implementation artefact. Couldn't we solve the problem of disappearing pipelined result partitions by binding the lifecyle of a pipelined result partition to the lifecycle of a `Task`? We could say that a `Task` can only terminate once the pipelined result partition has been consumed. Moreover, a `Task` will clean up the result partition if it fails or gets canceled. That way, we have a clearly defined lifecycle and make sure that these results get cleaned up (iff the `Task` reaches a terminal state). I would love to hear your feedback @pnowojski, @zhijiangW, @curcur and @rkhachatryan and also learn more about your reasoning to introduce a new result partition type. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] shouweikun commented on pull request #13789: [FLINK-19727][table-runtime] Implement ParallelismProvider for sink i…
shouweikun commented on pull request #13789: URL: https://github.com/apache/flink/pull/13789#issuecomment-717805408 @JingsongLi Hi, comments addressed! Would u plz review the new changes? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-19850) Add E2E tests for the new streaming case of the new FileSink
Yun Gao created FLINK-19850: --- Summary: Add E2E tests for the new streaming case of the new FileSink Key: FLINK-19850 URL: https://issues.apache.org/jira/browse/FLINK-19850 Project: Flink Issue Type: Sub-task Reporter: Yun Gao -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-19847) Can we create a fast support on the Nested table join?
[ https://issues.apache.org/jira/browse/FLINK-19847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek updated FLINK-19847: - Component/s: (was: API / DataStream) Table SQL / Runtime Table SQL / Planner > Can we create a fast support on the Nested table join? > -- > > Key: FLINK-19847 > URL: https://issues.apache.org/jira/browse/FLINK-19847 > Project: Flink > Issue Type: Wish > Components: Table SQL / Planner, Table SQL / Runtime >Affects Versions: 1.11.1 >Reporter: xiaogang zhou >Priority: Major > > In CommonLookupJoin, one TODO is > support nested lookup keys in the future, > // currently we only support top-level lookup keys > > can we create a fast support on the Array join? thx -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #13744: [FLINK-19766][table-runtime] Introduce File streaming compaction operators
flinkbot edited a comment on pull request #13744: URL: https://github.com/apache/flink/pull/13744#issuecomment-714369975 ## CI report: * 164c8b5b7750ed567813d70f160ef9f064d0a3b0 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8458) * 322bc2c3c11a0f5735db4a475864f894f38866da Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=8466) * a01ecf0bade9f8e4a56052fb5b5d25c5034fe511 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-19833) Rename Sink API Writer interface to SinkWriter
[ https://issues.apache.org/jira/browse/FLINK-19833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222056#comment-17222056 ] Aljoscha Krettek commented on FLINK-19833: -- I believe [~kkl0u] or [~maguowei] will work on this because they're still working on adding all the components for the new Sink API. > Rename Sink API Writer interface to SinkWriter > -- > > Key: FLINK-19833 > URL: https://issues.apache.org/jira/browse/FLINK-19833 > Project: Flink > Issue Type: Sub-task > Components: API / DataStream >Reporter: Aljoscha Krettek >Priority: Major > > This makes it more consistent with {{SourceReader}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-19835) Don't emit intermediate watermarks from sources in BATCH execution mode
[ https://issues.apache.org/jira/browse/FLINK-19835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aljoscha Krettek reassigned FLINK-19835: Assignee: Aljoscha Krettek > Don't emit intermediate watermarks from sources in BATCH execution mode > --- > > Key: FLINK-19835 > URL: https://issues.apache.org/jira/browse/FLINK-19835 > Project: Flink > Issue Type: Sub-task >Reporter: Aljoscha Krettek >Assignee: Aljoscha Krettek >Priority: Major > > Currently, both sources and watermark/timestamp operators can emit watermarks > that we don't really need. We only need a final watermark in BATCH execution > mode. -- This message was sent by Atlassian Jira (v8.3.4#803005)