sjwiesman commented on a change in pull request #478: URL: https://github.com/apache/flink-web/pull/478#discussion_r740498181
########## File path: _posts/2021-11-03-flink-backward.md ########## @@ -0,0 +1,79 @@ +--- +layout: post +title: "Flink Backward - The Apache Flink Retrospective" +date: 2021-11-03 00:00:00 +authors: +- joemoe: + name: "Johannes Moser" +excerpt: A look back at the development cycle for Flink 1.14 +--- + +It has now been a month since [Apache Flink 1.14](https://flink.apache.org/downloads.html#apache-flink-1140) has been released into the wild. :partying_face: We had a comprehensive look at the enhancements, additions, and fixups in the [release announcement blog post](https://flink.apache.org/news/2021/09/29/release-1.14.0.html) and now we will have a look at the development cycle from a different angle. Based on feedback collected from contributors involved in this release, we will explore the experiences and processes behind it all. + +{% toc %} + +# A retrospective on the release cycle + +From the team, we collected emotions that have been attributed to points in time of the 1.14 release cycle: + +<center> +<img src="{{site.baseurl}}/img/blog/2021-11-03-flink-backward/1.14-weather.png" width="70%"/> +</center> + +The overall sentiment seems to be quite good. A ship crushed a robot two times, someone felt sick towards the end, an octopus causing negative emotions appeared in June... + +We looked at the origin of these emotions and came up with an analysis of what went well and what could be improved. We also incorporated some feedback gathered from the community. + +## Problems faced + +From a content perspective, the community is still ironing out processes around documentation and blog posts. There have been a lot of test instabilities and issues with some of the efforts to fix them. We also had to push the feature freeze by two weeks, which might actually be considered early and not actually be living up to the Apache Flink release tradition. 
:p + +## Things enjoyed + +The implementation of some features, such as [buffer debloating](https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/memory/network_mem_tuning/#the-buffer-debloating-mechanism) and [fine-grained resource management](https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/finegrained_resource/), went really smoothly. Though a few issues are now popping up. We enjoyed the moments while they last. :smiling_face_with_tear: + +We also said goodbye to some components. It always feels refreshing to remove code and reduce complexity. :slightly_smiling_face: + + +# What we want to achieve through process changes + +## Transparency - let the community participate + +When approaching a release (usually a couple of weeks after the previous release has been done) we set up bi-weekly meetings for the community to discuss any issues regarding the release. The usefulness of those meetings varied a lot and so we started to [track the efforts](https://cwiki.apache.org/confluence/display/FLINK/1.14+Release) in the Apache Flink Confluence wiki. + +We came up with a system to label the current states of each feature: "independent", "won’t make it", "very unlikely", "will make it", "done", and "done done". We introduced the "done done" state since we were lacking a shared understanding of the definition of done. To qualify for "done done", the feature is manually tested by someone who has not been involved in the implementation of the specific effort and there exists comprehensive documentation that enables users to use the feature. + +After each meeting, we provided updates on the mailing list and created a corresponding burn down chart. Those efforts have been perceived positively although they might still require some improvements. + +The meeting used to only be for those who have been driving the main efforts, but we opened it up to the whole community for this release. 
Nobody ended up joining :smiling_face_with_tear: but we will continue to make the meetings open to everyone. + + +## Stability - reduce building and testing pain + +At one point as we were coming close to the feature freeze, the stability of the master branch became quite unstable. Although we have encountered this issue in the past, building and testing Flink under such conditions was not ideal. Let's not count how often Kafka integrations tests have failed.

Review comment: I don't like the Kafka comment. I understand it's a joke, but it doesn't come across well. We can certainly talk about why Kafka causes us so much pain, but that feels separate from this.

```suggestion
At one point, as we were coming close to the feature freeze, the stability of the master branch became quite unstable. Although we have encountered this issue in the past, building and testing Flink under such conditions was not ideal.
```
