Re: [RESULT] [VOTE] Apache Flink 1.9.0, release candidate #3
@Chesnay No. Users will have to manually build and install PyFlink themselves in 1.9.0:
https://ci.apache.org/projects/flink/flink-docs-release-1.9/flinkDev/building.html#build-pyflink

This is also mentioned in the announcement blog post (to-be-merged):
https://github.com/apache/flink-web/pull/244/files#diff-0cc840a590f5cab2485934278134c9baR291

On Thu, Aug 22, 2019 at 10:03 AM Chesnay Schepler wrote:
> Are we also releasing python artifacts for 1.9?
>
> On 21/08/2019 19:23, Tzu-Li (Gordon) Tai wrote:
> > I'm happy to announce that we have unanimously approved this candidate
> > as the 1.9.0 release.
> > [...]
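Since 1.9.0 publishes no PyFlink package to PyPI, a downstream project can only detect whether a locally built install is present. A minimal sketch of such a check (the helper name and fallback message are illustrative, not from the Flink docs):

```python
# Hedged sketch: Flink 1.9.0 ships no PyFlink wheel, so users must build
# it from source; this helper only checks whether such a locally built
# package is importable in the current environment.
import importlib.util


def pyflink_installed() -> bool:
    """Return True if a pyflink package is importable."""
    return importlib.util.find_spec("pyflink") is not None


if not pyflink_installed():
    print("PyFlink not found; build and install it from the Flink source "
          "tree as described in the 1.9 'Build PyFlink' docs.")
```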
Re: [RESULT] [VOTE] Apache Flink 1.9.0, release candidate #3
Are we also releasing python artifacts for 1.9?

On 21/08/2019 19:23, Tzu-Li (Gordon) Tai wrote:
> I'm happy to announce that we have unanimously approved this candidate
> as the 1.9.0 release.
> [...]
>
> On Wed, Aug 21, 2019 at 4:20 AM, Stephan Ewen wrote:
> > +1 (binding)
> >
> > - Downloaded the binary release tarball
> > - started a standalone cluster with four nodes
> > - ran some examples through the Web UI
> > - checked the logs
> > - created a project from the Java quickstarts maven archetype
> > - ran a multi-stage DataSet job in batch mode
> > - killed a TaskManager and verified correct restart behavior,
> >   including failover region backtracking
> >
> > I found a few issues, and a common theme here is confusing error
> > reporting and logging.
> >
> > (1) When testing batch failover and killing a TaskManager, the job
> > reports as the failure cause "org.apache.flink.util.FlinkException:
> > The assigned slot 6d0e469d55a2630871f43ad0f89c786c_0 was removed."
> > I think that is a pretty bad error message; as a user I don't know
> > what that means. Some internal bookkeeping thing? You need to know a
> > lot about Flink to understand that this means "TaskManager failure".
> > https://issues.apache.org/jira/browse/FLINK-13805
> > I would not block the release on this, but think this should get
> > pretty urgent attention.
> >
> > (2) The Metric Fetcher floods the log with error messages when a
> > TaskManager is lost. There are many exceptions being logged by the
> > Metrics Fetcher due to not reaching the TM any more. This pollutes
> > the log and drowns out the original exception and the meaningful logs
> > from the scheduler/execution graph.
> > https://issues.apache.org/jira/browse/FLINK-13806
> > Again, I would not block the release on this, but think this should
> > get pretty urgent attention.
> >
> > (3) If you put "web.submit.enable: false" into the configuration, the
> > web UI will still display the "SubmitJob" page, but errors will
> > continuously pop up, stating "Unable to load requested file /jars."
> > https://issues.apache.org/jira/browse/FLINK-13799
> >
> > (4) The REST endpoint logs ERROR-level messages when selecting the
> > "Checkpoints" tab for batch jobs. That does not seem correct.
> > https://issues.apache.org/jira/browse/FLINK-13795
> >
> > Best,
> > Stephan
> >
> > On Tue, Aug 20, 2019 at 11:32 AM Tzu-Li (Gordon) Tai
> > <tzuli...@apache.org> wrote:
> > > +1
> > >
> > > Legal checks:
> > > - verified signatures and
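For context on issue (3) above: the setting Stephan tested is a single flink-conf.yaml entry (key name exactly as quoted in his report):

```yaml
# flink-conf.yaml: disable job submission through the web UI / REST endpoint.
# Per the report above, the 1.9.0 UI still rendered the submit page with
# this set (tracked as FLINK-13799).
web.submit.enable: false
```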
Re: [RESULT] [VOTE] Apache Flink 1.9.0, release candidate #3
Congratulations, and thanks all for the great efforts on release 1.9.

I have verified RC#3 with the following items:

- Verified signatures and hashes. (OK)
- Built from source archive. (OK)
- Repository contains all artifacts. (OK)
- Tested WordCount on a local cluster. (OK)
  a. Both streaming and batch
  b. Web UI works fine
- Tested WordCount on a YARN cluster. (OK)
  a. Both streaming and batch
  b. Web UI works fine
  c. Tested session mode and non-session mode

So +1 (binding) from my side.

Regards,
Shaoxuan

On Thu, Aug 22, 2019 at 1:23 AM Tzu-Li (Gordon) Tai wrote:
> I'm happy to announce that we have unanimously approved this candidate
> as the 1.9.0 release.
> [...]
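Several votes in this thread list "verified signatures and hashes" as a check. A minimal sketch of the checksum half in Python, using a scratch file in place of the real release tarball (file name and contents are illustrative; the actual check compares against the digest published in the artifact's .sha512 file):

```python
# Hedged sketch of the "verified hashes" release check: compute a SHA-512
# digest of an artifact and compare it with a recorded digest, much like
# `sha512sum -c`. A temp file stands in for the release tarball.
import hashlib
import tempfile
from pathlib import Path


def sha512_hex(path: Path) -> str:
    """Return the hex SHA-512 digest of a file's contents."""
    return hashlib.sha512(path.read_bytes()).hexdigest()


with tempfile.TemporaryDirectory() as d:
    artifact = Path(d) / "flink-1.9.0-example.tgz"  # illustrative name
    artifact.write_bytes(b"stand-in for the release tarball")

    recorded = sha512_hex(artifact)  # what the published .sha512 would hold
    print("OK" if sha512_hex(artifact) == recorded else "FAILED")
```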
[RESULT] [VOTE] Apache Flink 1.9.0, release candidate #3
I'm happy to announce that we have unanimously approved this candidate as
the 1.9.0 release.

There are 12 approving votes, 5 of which are binding:
- Yu Li
- Zili Chen
- Gordon Tai
- Stephan Ewen
- Jark Wu
- Vino Yang
- Gary Yao
- Bowen Li
- Chesnay Schepler
- Till Rohrmann
- Aljoscha Krettek
- David Anderson

There are no disapproving votes.

Thanks everyone who has contributed to this release!

I will wait until tomorrow morning for the artifacts to be available in
Maven central before announcing the release in a separate thread.

The release blog post will also be merged tomorrow along with the official
announcement.

Cheers,
Gordon

On Wed, Aug 21, 2019, 5:37 PM David Anderson wrote:
> +1 (non-binding)
>
> I upgraded the flink-training-exercises project.
>
> I encountered a few rough edges, including problems in the docs, but
> nothing serious.
>
> I had to make some modifications to deal with changes in the Table API:
>
> ExternalCatalogTable.builder became new ExternalCatalogTableBuilder
> TableEnvironment.getTableEnvironment became StreamTableEnvironment.create
> StreamTableDescriptorValidator.UPDATE_MODE() became
> StreamTableDescriptorValidator.UPDATE_MODE
> org.apache.flink.table.api.java.Slide moved to
> org.apache.flink.table.api.Slide
>
> I also found myself forced to change a CoProcessFunction to a
> KeyedCoProcessFunction (which it should have been).
>
> I also tried a few complex queries in the SQL console, and wrote a
> simple job using the State Processor API. Everything worked.
>
> David
>
> David Anderson | Training Coordinator
>
> Follow us @VervericaData
>
> --
> Join Flink Forward - The Apache Flink Conference
> Stream Processing | Event Driven | Real Time
>
> On Wed, Aug 21, 2019 at 1:45 PM Aljoscha Krettek wrote:
> > +1
> >
> > I checked the last RC on a GCE cluster and was satisfied with the
> > testing. The cherry-picked commits didn't change anything related, so
> > I'm forwarding my vote from there.
> >
> > Aljoscha
> >
> > On 21. Aug 2019, at 13:34, Chesnay Schepler wrote:
> > > +1 (binding)
> > >
> > > On 21/08/2019 08:09, Bowen Li wrote:
> > > > +1 non-binding
> > > >
> > > > - built from source with default profile
> > > > - manually ran SQL and Table API tests for Flink's metadata
> > > >   integration with Hive Metastore in local cluster
> > > > - manually ran SQL tests for batch capability with Blink planner
> > > >   and Hive integration (source/sink/udf) in local cluster
> > > > - file formats include: csv, orc, parquet
> > > >
> > > > On Tue, Aug 20, 2019 at 10:23 PM Gary Yao wrote:
> > > > > +1 (non-binding)
> > > > >
> > > > > Reran Jepsen tests 10 times.
> > > > >
> > > > > On Wed, Aug 21, 2019 at 5:35 AM vino yang wrote:
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > - checked out source code and built successfully
> > > > > > - started a local cluster and ran some example jobs
> > > > > >   successfully
> > > > > > - verified signatures and hashes
> > > > > > - checked release notes and post
> > > > > >
> > > > > > Best,
> > > > > > Vino
> > > > > >
> > > > > > On Wed, Aug 21, 2019 at 4:20 AM, Stephan Ewen wrote:
> > > > > > > +1 (binding)
> > > > > > >
> > > > > > > - Downloaded the binary release tarball
> > > > > > > - started a standalone cluster with four nodes
> > > > > > > - ran some examples through the Web UI
> > > > > > > - checked the logs
> > > > > > > - created a project from the Java quickstarts maven archetype
> > > > > > > - ran a multi-stage DataSet job in batch mode
> > > > > > > - killed a TaskManager and verified correct restart behavior,
> > > > > > >   including failover region backtracking
> > > > > > >
> > > > > > > I found a few issues, and a common theme here is confusing
> > > > > > > error reporting and logging.
> > > > > > >
> > > > > > > (1) When testing batch failover and killing a TaskManager,
> > > > > > > the job reports as the failure cause
> > > > > > > "org.apache.flink.util.FlinkException: The assigned slot
> > > > > > > 6d0e469d55a2630871f43ad0f89c786c_0 was removed."
> > > > > > > I think that is a pretty bad error message; as a user I
> > > > > > > don't know what that means. Some internal bookkeeping thing?
> > > > > > > You need to know a lot about Flink to understand that this
> > > > > > > means "TaskManager failure".
> > > > > > > https://issues.apache.org/jira/browse/FLINK-13805
> > > > > > > I would not block the release on this, but think this should
> > > > > > > get pretty urgent attention.
> > > > > > >
> > > > > > > (2) The Metric Fetcher floods the log with error messages
> > > > > > > when a TaskManager is lost. There are many exceptions being
> > > > > > > logged by the Metrics Fetcher due to not reaching the TM any
> > > > > > > more. This pollutes the log and drowns out the original
> > > > > > > exception and the meaningful logs from the
> > > > > > > scheduler/execution graph.
> > > > > > > https://issues.apache.org/jira/browse/FLINK-13806
> > > > > > > Again, I would not block the release on this, but think this
> > > > > > > should get pretty urgent attention.