Thanks for testing the RC and for the feedback, Thomas. The problem with the taskmanager options is that the old option (taskmanager.initial-registration-pause) and the new option (cluster.registration.initial-timeout) don't have the same type. The old options have not been used for a long time (since version 1.5.0), and we wanted to remove them. As part of the removal, we added the old keys as deprecated keys for the new options. I believe this was a mistake. I've opened a PR to remove the deprecated keys from the new ConfigOptions [1].
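To make the type mismatch concrete, here is a minimal sketch. It does not use Flink's Configuration class: the class name and the `lookup` and `parseAsLong` helpers are hypothetical stand-ins, and the only thing taken from the stack trace is that the new key is read as a long. An old duration-style value such as "500ms" reaches the new key through the deprecated-key fallback and then fails to parse:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not Flink's actual Configuration code.
class RegistrationTimeoutSketch {

    // Look up the new key first and fall back to the deprecated key,
    // which is roughly what a deprecated-key mapping does.
    static String lookup(Map<String, String> conf, String key, String deprecatedKey) {
        String value = conf.get(key);
        return value != null ? value : conf.get(deprecatedKey);
    }

    // The new option is read as a plain long, so unit suffixes are not accepted.
    static long parseAsLong(String raw) {
        return Long.parseLong(raw.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Old-style value with a time unit, as in the reported configuration.
        conf.put("taskmanager.initial-registration-pause", "500ms");

        String newKey = "cluster.registration.initial-timeout";
        String raw = lookup(conf, newKey, "taskmanager.initial-registration-pause");
        try {
            System.out.println("timeout = " + parseAsLong(raw));
        } catch (NumberFormatException e) {
            // The old value was picked up via the fallback but has the wrong type.
            System.out.println("Could not parse value '" + raw + "' for key '" + newKey + "'");
        }
    }
}
```

Without the deprecated-key fallback, the old entries are simply never read, which matches the intent of removing them.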
Please be aware that

"taskmanager.initial-registration-pause": "500ms",
"taskmanager.max-registration-pause": "5s",
"taskmanager.refused-registration-pause": "5s",

shouldn't have any effect anymore (since version 1.5.0).

[1] https://github.com/apache/flink/pull/12763

Cheers,
Till

On Wed, Jun 24, 2020 at 4:17 AM Zhijiang <wangzhijiang...@aliyun.com.invalid> wrote:

> Hi Thomas,
>
> Thanks for this valuable feedback and these suggestions; I think they are
> very helpful for making us better.
>
> I can give a direct answer for this issue:
>
> checkpoint alignment buffered metric missing - note that this job isn't
> using the new unaligned checkpointing that should be opt-in.
>
> The checkpoint alignment buffered metric would now always be 0, with or
> without unaligned checkpointing, so we removed it.
> The motivation for this change is to reduce in-flight buffers in order to
> speed up checkpoints. After sending the barrier, the upstream side stops
> sending further buffers until it receives the alignment notification from
> the downstream side. Therefore, the downstream side never needs to cache
> buffers for blocked channels during alignment. We also described this
> change in the release notes; see link [1].
>
> [1]
> https://github.com/apache/flink/pull/12699/files#diff-eaa874e007e88f283e96de2d61cc4140R174
>
> Best,
> Zhijiang
> ------------------------------------------------------------------
> From: Thomas Weise <t...@apache.org>
> Send Time: Wednesday, June 24, 2020, 06:51
> To: dev <dev@flink.apache.org>
> Cc: zhijiang <zhiji...@apache.org>
> Subject: Re: [ANNOUNCE] Apache Flink 1.11.0, release candidate #2
>
> Hi,
>
> Thanks for putting together the RC!
>
> I have some preliminary feedback from testing with commit
> 934f91ead00fd658333f65ffa37ab60bd5ffd99b
>
> An internal benchmark application that reads from Kinesis and checkpoints
> ~12GB performs comparably to 1.10.1
>
> There were a few issues hit upgrading our codebase that may be worthwhile
> considering, please see details below.
>
> Given my observations over the past few releases, I would like to suggest
> that the community introduces a log of incompatible changes to be published
> with the release notes. Though it is possible to analyze git history when
> hitting compile errors, there are more subtle changes that can make
> upgrades unnecessarily time-consuming. Contributors introducing such
> changes are probably in the best position to document.
>
> I'm planning to try this or the next RC with a couple more applications.
>
> Cheers,
> Thomas
>
> * notifyCheckpointAborted needed to be implemented
>   for org.apache.flink.runtime.state.CheckpointListener - can we have the
>   default implementation in the interface so that users aren't forced to
>   change their implementations
>
> * following deprecated configuration values had to be modified to get
>   the job running:
>
>   "taskmanager.initial-registration-pause": "500ms",
>   "taskmanager.max-registration-pause": "5s",
>   "taskmanager.refused-registration-pause": "5s",
>
>   The error message was:
>
>   Could not parse value '500ms' for key 'cluster.registration.initial-timeout'.\n\tat
>   org.apache.flink.configuration.Configuration.getOptional(Configuration.java:753)\n\tat
>   org.apache.flink.configuration.Configuration.getLong(Configuration.java:298)\n\tat
>   org.apache.flink.runtime.registration.RetryingRegistrationConfiguration.fromConfiguration(RetryingRegistrationConfiguration.java:72)\n\tat
>   org.apache.flink.runtime.taskexecutor.TaskManagerServicesConfiguration.fromConfiguration(TaskManagerServicesConfiguration.java:262)\n\tat
>
>   Though easy to fix, it's unfortunate that values are now treated
>   differently.
>
> * checkpoint alignment buffered metric missing - note that this job isn't
>   using the new unaligned checkpointing that should be opt-in.
>
> * -import org.apache.flink.table.api.java.StreamTableEnvironment;
>   +import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
>
> * -ClientUtils.executeProgram(DefaultExecutorServiceLoader.INSTANCE,
>   config, program.build());
>   +ClientUtils.executeProgram(DefaultExecutorServiceLoader.INSTANCE,
>   config, program.build(), false, false);
>
> * ProcessingTimeCallback removed from StreamingFileSink
>
>
> On Wed, Jun 17, 2020 at 6:29 AM Piotr Nowojski <pnowoj...@apache.org>
> wrote:
>
> > Hi all,
> >
> > I would like to give an update about the RC2 status. We are now waiting for
> > a green azure build on one final bug fix before creating RC2. This bug fix
> > should be merged late afternoon/early evening Berlin time, so RC2 will be
> > hopefully created tomorrow morning. Until then I would ask to not
> > merge/backport commits to release-1.11 branch, including bug fixes. If you
> > have something that's truly essential and should be treated as a release
> > blocker, please reach out to me or Zhijiang.
> >
> > Best,
> > Piotr Nowojski
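
On the CheckpointListener point in Thomas's list above: a default method in the interface would let existing implementations compile unchanged. A minimal sketch of the idea follows. `CheckpointListenerSketch` and `CountingListener` are hypothetical stand-ins, not the real org.apache.flink.runtime.state.CheckpointListener, and the method signatures are simplified assumptions:

```java
// Hypothetical stand-in for the listener interface; signatures are simplified.
interface CheckpointListenerSketch {

    // Existing callback that every implementation already provides.
    void notifyCheckpointComplete(long checkpointId);

    // Newly added callback with a no-op default, so that existing
    // implementations keep compiling without any change.
    default void notifyCheckpointAborted(long checkpointId) {
        // intentionally empty
    }
}

// An "old" user implementation that knows nothing about the abort callback.
class CountingListener implements CheckpointListenerSketch {
    int completed = 0;

    @Override
    public void notifyCheckpointComplete(long checkpointId) {
        completed++;
    }
}

class DefaultMethodDemo {
    public static void main(String[] args) {
        CountingListener listener = new CountingListener();
        listener.notifyCheckpointComplete(1L);
        // Resolves to the interface default and does nothing.
        listener.notifyCheckpointAborted(2L);
        System.out.println("completed checkpoints: " + listener.completed);
    }
}
```

Implementations that care about aborted checkpoints can still override the default.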