Re: [Softwarefactory-dev] Tech previews and upstream (zuul, nodepool)

Tristan Cacqueray Sat, 14 Jul 2018 02:51:36 -0700

On July 13, 2018 9:16 pm, Paul Belanger wrote:

On Fri, Jul 13, 2018 at 09:27:23PM +0200, Matthieu Huin wrote:

Hello,

Thanks for starting this discussion; I believe it to be important to the future
relationship between the SF and Zuul communities.

I would like to poke the collective brainpower about the way we ship
upstream components to which we also contribute, namely zuul and
nodepool, and see what we can do to improve if possible.

Thank you for starting this thread.

As you probably all know, we choose to package these components as
RPMs, which allows us to ship zuul and nodepool with extra patches
that we call *tech previews*. These patches are always related to
features we are either contributing upstream ourselves or closely
follow, but that are not merged yet. The important point is also that
we have some relative confidence that these patches will eventually
make it into upstream. Unfortunately, "relative" and "eventually" must
be taken with a grain of salt:

My main objection to this concept of 'Tech Preview' is that it follows the
development processes of OpenShift / Kubernetes more than those of OpenStack[1]
(NOTE: zuul isn't actually an OpenStack project anymore, but its governance will
likely follow the 4 opens of OpenStack). We don't want to be releasing things
ahead of zuul; I point to the issues between the k8s and openshift projects.

I disagree, please see:
http://crunchtools.com/is-openshift-a-fork-of-kubernetes-short-answer-no-longer-answer-heres-a-ton-of-technical-reasons/
And:
http://www.softwarefactory-project.io/what-is-the-difference-between-software-factory-and-zuul.html

The biggest red flag for me is that downstream Red Hat does not follow this
process for OSP (Red Hat's product for OpenStack). Teams are committed to pushing
code into master first, then either making a release from it or, in some cases,
backporting into a downstream branch. I am 100% confident these teams had the
same struggles we have highlighted before, and I would like us to query some
folks about how they solved this issue, because the latest version of OSP does
not contain downstream features outside of upstream master branches.

OSP has many core devs to push changes; SF only has Paul, who is core on Zuul.

By having SoftwareFactory do the opposite of this, it may mean new features
faster, but as you point out below, it also means more technical debt and, worse,
that SF is the only group on the hook to support this 'tech release' version.

SF only does best effort support to fix bugs in zuul source code as well as
in the code the team produces.

Given the size and workload of the current team, I assert we don't want to do
this for that reason alone. By shipping something different from upstream, I also
don't believe upstream will be happy with this. In fact, I encourage us to take
the results of this discussion to the Zuul Discuss mailing list, informing them
of this topic. Raising our concerns with them would be a great opportunity to
work closer together as a team.

To me, Zuul in SF isn't different from upstream. The
tech-preview features don't actually have to be part of the Zuul code.

If doing tech-preview obstructs the SF relationship with the upstream Zuul
communities, then SF can drop the patches. I would prefer to keep the features
integrated in Zuul so that it enables direct Zuul improvements, but if this
is making upstream unhappy, then we can revisit SF integration.

1. We usually cannot tell how long it'll take for our patches to land
upstream. I don't have numbers to support it, but from memory it can
vary between days (some bug fixes made it quickly into master
upstream) to months (we've had OCI as tech preview for months, and
it's still not merged upstream). Fabien or Tristan can certainly
provide numbers on that point.

I took some time to run the numbers using reviewday[2], a tool we have in
openstack-infra. For today, I've compared nova[3] with zuul[4] (using the official
zuul projects[5]). As you can see, while zuul does have a high open-review number,
it is nowhere near nova. I also ran the numbers using tripleo[5] with zuul, and
again zuul does come out ahead of tripleo here. Given the large number of Red Hat
developers on tripleo, I think it is fair to dive deeper into why a 'tech
preview' release of tripleo is not done.

I can understand how it feels like zuul takes a long time for patches to land,
but I believe the fix here is to double down on our commitment upstream and
encourage more time for SF developers to be free to aid / assist with upstream
zuul development. This includes code reviews of other patches unrelated to SF.

2. We cannot guarantee that the feature in tech preview will land
as-is. The upstream reviews are usually being discussed and can be
subject to change, meaning users should not consider any of the tech
preview features to be stable.

+1000

This ties in with my comment above, and I completely agree. This is a very large
concern, as it means more work for SF alone to deal with migration issues.

It seems like we need to be even clearer about tech previews:
their migration is also best effort.

Looking at our distgit for Zuul, we currently ship 12 extra patches as
tech preview (5 of which are merged or about to be merged, so the spec
must be updated), and this is bound to increase if we keep
contributing things faster than they can be reviewed and accepted
upstream. It can become quite hard to maintain the patch chain as
upstream evolves. We also face the very real risk that one of our use
cases (and thus our upstream contributions) might contradict
upstream's roadmap, leading to rejected patches: do we become a fork
then? Are we actually effectively a fork, providing a "Zuul that
could be someday" but definitely not the current Zuul?

For me, the path forward here is straightforward: we keep the patches that we
have in our zuul RPM file. There is no point removing them now, as it would be
very disruptive.

Actually, I don't think any patches are needed to pass the sf-ci integration
tests. We would lose some useful features (like pipeline listing to generate
grafana dashboards), but there are no blockers that can't be worked around.

We stop working on any new features that SF requires and spend the
effort working upstream to land these changes. This likely means getting
approval from management to be allowed to help push on these efforts upstream.

Let me list the actual patches we are talking about here...

# 0001-executor-change-execution-log-to-INFO.patch
 https://review.openstack.org/578704
improvement, merged, took 13 days

# 0001-gerrit-use-baseurl-for-change-uris-lookup.patch
 https://review.openstack.org/579086
bugfix, still under review (12 days)

# 0001-Add-tenant-yaml-validation-option-to-scheduler.patch
 https://review.openstack.org/574265
feature, still under review (5 weeks)

# 0001-angular-call-enableProdMode.patch
 https://review.openstack.org/573494
bugfix, no clear solution, reported 5 weeks ago

# 0001-model-fix-AttributeError-exception-in-freezeJobGraph.patch
 https://review.openstack.org/579428
bugfix, merged, took a day

# 0001-gerrit-add-support-for-report-only-connection.patch
 https://review.openstack.org/568216
improvement (divide merger load by 2 for rdo), still under review (2 months)

# 0001-executor-add-support-for-resource-connection-type.patch
 https://review.openstack.org/570668
improvement, minor modification to enable nodepool resources,
part of the container spec implementation

# 0001-executor-add-log_stream_port-and-log_stream_file-set.patch
 https://review.openstack.org/535538
feature, waiting for zuul_stream support ssh port forward

# 0001-zk-use-kazoo-retry-facilities.patch
 https://review.openstack.org/536209 / https://review.openstack.org/535537
feature/bugfix to enable zookeeper restart and reduce the amount of node
requests, reported 6 months ago, discussed here:
http://lists.zuul-ci.org/pipermail/zuul-discuss/2018-March/000047.html

# 0001-angular6-dashboard.patch / 0001-web-new-routes.patch
 https://review.openstack.org/#/q/topic:zuul-ui-pages
 https://review.openstack.org/#/q/topic:zuul-web-routes
feature, webui improvements


Then regarding nodepool:

# 0001-config-add-statsd-server-config-parameter.patch
 https://review.openstack.org/535560
feature to remove sysconfig for statsd, proposed 7 months ago.

# 0001-zk-skip-node-already-being-deleted-in-cleanup-leaked.patch
 https://review.openstack.org/576288
bugfix, proposed 1 month ago.

# 0001-builder-do-not-configure-provider-that-doesn-t-manag.patch
 https://review.openstack.org/578642
bugfix, proposed 16 days ago.

# 0001-driver-runc.patch / 0001-driver-k8s.patch / 0001-driver-openshift.patch
 https://review.openstack.org/#/q/topic:nodepool-drivers
drivers, not actual nodepool modification, proposed 14 months ago.


TL;DR: most are either cosmetic (e.g. statsd option in nodepool.yaml)
or critical bug fixes that are already approved upstream.
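
For anyone who wants to check that breakdown, here is a minimal sketch that
tallies the entries listed above by project and kind. The kinds are transcribed
from the list as written; the PATCHES structure and the tally() helper are
illustrative names only, not part of any SF or Zuul tooling, and paired patches
(e.g. the web-ui ones) are counted as a single entry, as in the list.

```python
from collections import Counter

# (project, patch entry, kind) transcribed from the list above.
# Entries that the list groups together are kept as one row.
PATCHES = [
    ("zuul", "executor-change-execution-log-to-INFO", "improvement"),
    ("zuul", "gerrit-use-baseurl-for-change-uris-lookup", "bugfix"),
    ("zuul", "Add-tenant-yaml-validation-option-to-scheduler", "feature"),
    ("zuul", "angular-call-enableProdMode", "bugfix"),
    ("zuul", "model-fix-AttributeError-exception-in-freezeJobGraph", "bugfix"),
    ("zuul", "gerrit-add-support-for-report-only-connection", "improvement"),
    ("zuul", "executor-add-support-for-resource-connection-type", "improvement"),
    ("zuul", "executor-add-log_stream_port-and-log_stream_file-set", "feature"),
    ("zuul", "zk-use-kazoo-retry-facilities", "feature/bugfix"),
    ("zuul", "angular6-dashboard / web-new-routes", "feature"),
    ("nodepool", "config-add-statsd-server-config-parameter", "feature"),
    ("nodepool", "zk-skip-node-already-being-deleted-in-cleanup-leaked", "bugfix"),
    ("nodepool", "builder-do-not-configure-provider-that-doesn-t-manag", "bugfix"),
    ("nodepool", "driver-runc / driver-k8s / driver-openshift", "drivers"),
]


def tally(patches):
    """Count list entries per (project, kind) pair."""
    counts = Counter()
    for project, _name, kind in patches:
        counts[(project, kind)] += 1
    return counts


if __name__ == "__main__":
    for (project, kind), count in sorted(tally(PATCHES).items()):
        print(f"{project:9} {kind:15} {count}")
```

This only counts what is written in the list; it says nothing about review
state, which changes daily on review.openstack.org.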

Then there are 2 "tech-preview":

The first one is a better web-ui. At first, it was developed in
managesf to ease the migration from jenkins. Then the code was
contributed to Zuul and upstream merged the builds and jobs API.
Now SF has added more pages to list the projects, the labels, the pipeline
conf, etc. If needed, these new pages could be removed from Zuul and added
back to the legacy managesf interface.

The second one is the nodepool drivers. With the new driver API, those
are just folders that do not modify the nodepool logic.
I keep the upstream review in sync, and I'm willing to do the legwork
to get them merged. But on the other hand, those drivers don't have to be
merged upstream.

At the same time, we agree the SF workflow process will stop including these
patches in the RPM, and land everything upstream in master first. It isn't
enough to just submit the patch to review.o.o; we must get it merged
before including it in the zuul RPM.

+1

Yet the tech preview system has obvious advantages that make it
difficult to just drop this model, namely providing much needed
features that make Zuul and Nodepool much more serious and versatile
contenders in the world of CI, CD and code quality control - this is
also why we believe our changes will eventually make it upstream.

I hoped this tech preview system would benefit other Zuul communities,
but if this is causing trouble, then we can look into incubating the
features in managesf.

This is important feedback for the zuul project, and something I think we could
discuss upstream on the zuul discuss ML. If we as a team believe this is
important, then other zuul users upstream should too?

SF-3.1 is currently waiting for upstream to tag master.
Then we could report the tech-preview as we did for SF-3.0:
http://lists.zuul-ci.org/pipermail/zuul-discuss/2018-March/000092.html

I guess the question we need to answer is: why is it so hard or long
to have upstream land the features we propose? And what can we do to
change that? If we can improve on this, the need for patching will
decrease until we can ship code as close as possible to master, or
even tagged releases.

What are your thoughts on this?

This is not limited to just zuul; I've seen this problem time and time again in
other open source projects. Patience is required, and we need to adjust our
expectations. If we want to influence change, we can, and now is the time to do
so. However, we need to spend the time and energy doing so. This may mean
slowing down our feature development work to properly land what we have already
done.

I think we need to do more code reviews. How many do you think we need
to do to gain influence?

Thanks,
-Tristan

[1] https://governance.openstack.org/tc/reference/opens.html
[2] http://git.openstack.org/cgit/openstack-infra/reviewday
[3] http://paste.openstack.org/raw/725858/
[4] http://paste.openstack.org/raw/725860/
[5] https://git.zuul-ci.org/cgit

_______________________________________________
Softwarefactory-dev mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/softwarefactory-dev
