On Wednesday, 28 January 2015 10:08:54 CEST, Ben Cooksley wrote:
1) Most applications integrate extremely poorly with LDAP. They
basically take the details once on first login and don't sync the
details again after that (this is what both Chiliproject and
Reviewboard do). How does Gerrit perform here?

Data are fetched from LDAP as needed. There's a local cache for speedup (with configurable TTL and support for explicit flushes).
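For illustration, the cache TTL can be tuned per cache in gerrit.config; the cache names below are the ones Gerrit documents for LDAP, the values are made-up examples:

```ini
[cache "ldap_usernames"]
        maxAge = 1 hour
[cache "ldap_groups"]
        maxAge = 30 minutes
```

An explicit flush is a one-liner over the admin SSH interface, e.g. `ssh -p 29418 admin@gerrit-host gerrit flush-caches --cache ldap_groups`.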

2) For this trivial scratch repository script, will it offer its own
user interface or will developers be required to pass arguments to it
through some other means? The code you've presented thus far makes it
appear some other means will be required.

I might not fully understand this question -- I thought we had already discussed this. The simplest method of invocation can be as easy as `ssh user@somehost create-personal-project foobar`, with SSH keys verified by OpenSSH. This is the same UI as our current setup. There are other options, some of them with fancy, web-based UIs.

3) We've used cGit in the past, and found it suffered from performance
problems with our level of scale. Note that just because a solution
scales with the size of repositories does not necessarily mean it
scales with the number of repositories, which is what bites cGit. In
light of this, what do you propose?

An option which is suggested in the document is to use our current quickgit setup, i.e. GitPHP. If it works, there's no need to change it, IMHO, and sticking with it looks like a safe thing to me. But there are many additional choices (including gitiles).

4) Has Gerrit's replication been demonstrated to handle 2000 Git
repositories which consume 30gb of disk space? How is metadata such as
repository descriptions (for repository browsers) replicated?

Yes, Gerrit scales far beyond that. See e.g. the thread at https://groups.google.com/forum/#!topic/repo-discuss/5JHwzednYkc for real users' feedback about large-scale deployments.
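As a sketch of how this would look for the anongit mirrors, the replication plugin is driven by a replication.config along these lines (hostnames and paths are placeholders):

```ini
[remote "anongit"]
        url = git@anongit.example.org:/srv/git/${name}.git
        # mirror all branches and tags, force-updating on rewinds
        push = +refs/heads/*:refs/heads/*
        push = +refs/tags/*:refs/tags/*
        # delete refs on the mirror that no longer exist upstream
        mirror = true
        # number of parallel pushes towards this mirror
        threads = 4
```

Repository descriptions live in each repo's description file and in the project config, so they travel with the replicated Git data.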

5) If Gerrit or its hosting system were to develop problems how would
the replication system handle this? From what I see it seems highly
automated and bugs or glitches within Gerrit could rapidly be
inflicted upon the anongit nodes with no apparent safeguards (which
our present custom system has). Bear in mind that failure scenarios
can always occur in the most unexpected ways, and the integrity of our
repositories is of paramount importance.

I agree that one needs proper, offline and off-site backups for critical data, and that any online Git replication is not a proper substitute for this. The plan for disaster recovery therefore is "restore from backup".

In terms of Gerrit, this means backing up all of the Git repositories and dumping the PostgreSQL database, and storing both in a location which cannot be wiped out or modified by an attacker who has root on the main Git server, or by a software bug in our Git hosting. One cannot get that with just Git replication, of course.

What are the safeguard mechanisms that you mentioned? What threats do they mitigate? I'm asking because, for example, the need for frequent branch deletion is minimized by Gerrit's code review process, which tracks proposed changes on internal refs rather than on short-lived branches. What risks do you expect to see here?

6) Notifications: Does it support running various checks that our
hooks do at the moment for license validity and the like? When these
rules are tripped the author is emailed back on their own commits.

Yes, the proposed setup supports these. The best place to implement them is in the CI, invoked through the ref-updated hook. My personal preference would be a ref-updated event handler in Zuul, to ensure proper scalability, but there are other options.
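To make this concrete, here is a minimal sketch (function name is hypothetical) of the filtering such a handler would do on Gerrit's documented ref-updated stream-events payload before launching the license checks:

```python
import json

def wants_license_check(event_line):
    """Return (project, old_rev, new_rev) if this stream-events line
    should trigger the license/commit checks, else None."""
    event = json.loads(event_line)
    if event.get("type") != "ref-updated":
        return None
    update = event["refUpdate"]
    # Only scan pushes to real branches; skip Gerrit's internal
    # code-review refs (refs/changes/...) and tags.
    if not update["refName"].startswith("refs/heads/"):
        return None
    return (update["project"], update["oldRev"], update["newRev"])
```

The handler would then walk the oldRev..newRev range and e-mail the authors of offending commits, much as the current hooks do.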

7) Storing information such as tree metadata location within
individual Git repositories is a recipe for delivering a system that
will eventually fail to scale, and will abuse resources. Due to the
time it takes to fork out to Git,

Gerrit uses JGit, a Java implementation of Git. There are no forks.

plus the disk access necessary for
it to retrieve the information in question, I suspect your generation
script will take several load intensive minutes to complete even if it
only covers mainline repositories. This is comparable to the
performance of Chiliproject in terms of generation at the moment.

Gerrit 2.10, released yesterday, adds a REST API for fetching arbitrary data from files stored in Git, with aggressive caching. I would like to use that for generating the kde_projects.xml file.
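As a sketch of what the generator would do (host, project and file names are placeholders; the endpoint is Gerrit's "get file content" REST call, which returns the file base64-encoded):

```python
import base64
import urllib.parse

def content_url(host, project, branch, path):
    # Gerrit expects project and file names to be URL-encoded,
    # including any slashes they contain.
    quote = lambda s: urllib.parse.quote(s, safe="")
    return ("https://%s/projects/%s/branches/%s/files/%s/content"
            % (host, quote(project), quote(branch), quote(path)))

def decode_content(body):
    # The endpoint returns the raw file content base64-encoded.
    return base64.b64decode(body).decode("utf-8")
```

The generator would fetch the metadata files this way and emit kde_projects.xml from them, with Gerrit's caches absorbing the load.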

The original generation of our Git hooks invoked Git several times per
commit, which meant the amount of time taken to process 1000 commits
easily reached 10 minutes. I rebuilt them to invoke git only a handful
of times per push - which is what we have now.

Gerrit has a different architecture with no forks and aggressive caching. I'm all for benchmarking, though. Do you want a test repository to run your benchmarks against?

8) Shifting information such as branch assignments in the same manner
will necessitate that someone have access to a copy of the Git
repository to determine the branch to use. This is something the CI
system cannot ensure, as it needs to determine this information for
dependencies, and a given node may not have a workspace for the
repository in question. It also makes it difficult to update rules
which are common among a set of repositories such as those for
Frameworks and Plasma (Workspace). I've no idea if it would cause
problems for kdesrc-build, but that is also a possibility.

The kde_projects.xml file which stores a copy of these data will remain unchanged, and it should remain the place consulted by e.g. the CI scripts or kdesrc-build. These tools will need no change.

What the proposal says is to generate that file from data stored in Git rather than from a custom webapp.

9) You've essentially said you are going to eliminate our existing
hooks.

The proposal said that it might be possible to replace a large part of the functionality with Gerrit's native features at zero maintenance cost. If the remaining functionality (CRLF line-ending checks and human author names for direct pushes) is important enough to warrant ongoing maintenance of the custom hooks, they can be run without a problem.

Does Gerrit support:
    a) line ending checks, with exceptions for certain file types and
repositories?

The proposal says to handle this via the CI setup. This means that pushing CRLF data to our repos would be allowed, with a follow-up e-mail saying "hey, you're doing a bad thing". That's a trade-off for not having to maintain these scripts.

Alternative options for this include:
- preserving this part of the hooks and running them from Gerrit,
- extending an existing Git validation plugin to do this.
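A CI-side check could be as small as the following sketch (the exemption list is a made-up example of the per-file-type exceptions; the real list would mirror what the current hooks allow):

```python
# Hypothetical per-file-type exemptions from the CRLF rule.
CRLF_EXEMPT_SUFFIXES = (".bat", ".vcproj", ".sln")

def has_forbidden_crlf(path, blob):
    """Return True if this blob contains CRLF line endings that the
    rules forbid for the given path."""
    if path.endswith(CRLF_EXEMPT_SUFFIXES):
        return False
    try:
        text = blob.decode("utf-8")
    except UnicodeDecodeError:
        # Undecodable content is treated as binary and skipped.
        return False
    return "\r\n" in text
```

The CI job would run this over the blobs touched by a push and mail the author, or downvote the change in the pre-merge case.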

    b) Convenient deactivation of these checks if necessary.

Yes, this is configurable.

    c) Checks on the author name to ensure it bears a resemblance to
something actual.

No, the author's name is not checked at the moment. If we decide to change this, it's going to be a couple-line patch, or a custom hook.

However, I do not think that checking names the way the hooks do it now is actually a good thing. Please read http://wookware.org/name.html for an example of a real person from the UK who cannot commit to KDE.

The potential for mistakes is largely mitigated by checks for e-mail validity. In order for this to be a problem, one would have to push a commit with a valid e-mail address, but wrong name ("jkt <j...@kde.org>"). We should evaluate whether risking this is worth the reduced maintenance.

Also, this only affects direct pushes and KDE developers. Patch proposals from third parties can be easily and immediately downvoted by the CI, with a helpful message on what to fix.

    d) Prohibiting anyone from pushing certain types of email address
such as *@localhost.localdomain?

Yes:

A similar check applies to e-mail validation. An ACL verifies whether an e-mail matches one of the user's registered addresses. These addresses are either read from LDAP, or validated by a mail probe to make sure that they actually exist and belong to the user in question. This validation can be configured on an LDAP-group basis, so it is possible to allow KDE developers to push commits on behalf of third-party contributors while preventing regular users from faking their identity.
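In Gerrit's access control model this is expressed through the forge-identity permissions in project.config; a hypothetical fragment (the group name is a placeholder):

```ini
[access "refs/heads/*"]
        # KDE developers may push commits authored by third-party
        # contributors under the contributor's own identity
        forgeAuthor = group KDE Developers
```

Without this permission, Gerrit rejects pushes whose author e-mail does not belong to the uploader.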

10) You were aware of the DSL work Scarlett is doing and the fact this
is Jenkins specific (as it generates Jenkins configuration). How can
this work remain relevant?
Additionally, Scarlett's work will introduce declarative configuration
for jobs to KDE.

My understanding of Scarlett's work is that it aims at cleaning up our current configuration, making it work on Windows and OS X, and introducing a declarative language for preparing job descriptions. AFAIK, the only part which might be Jenkins-specific is the last bit, and I fully expect that a declarative generator will be able to generate job descriptions for another system just by adding a proper output format. Moving to a declarative approach is the big change here; adding another output format is much less work.

11) We actually do use some of Jenkins advanced features, and it
offers quite a lot more than just a visual view of the last failure.
As a quick overview:
    a) Tracked history for tests (you can determine if a single test
is flaky and view a graph of its pass/fail history).

Please see section 3.3.2, which discusses possible ways to deal with flaky tests. IMHO, the key feature and our ultimate goal is "let's handle flaky tests efficiently", not "let's have a graph of failing tests" (how would that work with the non-linear history of pre-merge CI?).

    b) Log parsing to track the history of compiler warnings and other
matters of significance (this is fully configurable based on regexes)

That's in section 3.3.3. One way to use this is to make the build warning-free on one well-known platform and enforce -Werror there.

    c) Integrated cppcheck and code coverage reports, actively used by
some projects within KDE.

The Zuul-based CI setup launches KDE's existing build scripts and delivers their output. I chose to disable cppcheck for simplicity, and because none of the projects currently in Gerrit are covered by Jenkins' cppcheck on build.kde.org at this time. There is no reason for not enabling cppcheck runs again, of course. When I last looked at it, however, the include paths did not seem to be passed properly and the data I got back were clearly bogus, so I decided to skip it for now. The same applies to coverage reports. Both will be provided, of course.

    d) Intelligent dashboards which allow you to get an overview of a
number of jobs easily.

Bear in mind that these intelligent dashboards can be set up by anyone
and are able to filter on a number of conditions. They can also
provide RSS feeds and update automatically when a build completes.

How would Zuul offer any of this? And how custom would this all have
to be? Custom == maintenance cost.

The report explicitly acknowledges the need for future work on this status matrix, and proposes how to get there (section 3.3.4).

Regarding the maintenance costs, let's wait until it is ready and evaluate the maintenance burden at that point.

Addendum: the variations, etc. offered by the Zuul instance which
already exists in the Gerrit clone are made possible by the hardware
resources Jan has made available to that system. Jenkins is fully
capable of offering such builds as well with the appropriate setup,
some of which are already used - see the Multi Configuration jobs such
as the ones used by Trojita and Plasma Framework.

I believe that this is not about HW resources, but about how the services are configured. Does KDE's Jenkins, as-is, support building against a systemwide version of Qt, for example?

You've lost me i'm afraid with the third party integration - please
clarify what you're intending here.

I am pointing out that it is easy to plug a third-party testing system into Gerrit/Zuul, mainly due to the open APIs and the system's architecture. If e.g. one of the FreeBSD guys wanted to help, they would have a way of getting involved without an explicit action from sysadmins. To me, that lowers the barrier to entry a bit, and it also frees up some sysadmin time for more important tasks, so I think that it's a benefit of such a setup.

12) The tone of the way the event stream feature is mentioned makes it
sound like sysadmin actively prevents people from receiving the
information they need. We have never in the past prevented people from
receiving notifications they've requested - you yourself have one that
triggers builds on the OBS for Trojita.

It was never my intention to imply anything like that; sorry about that. That section says that the current approach requires manual effort from sysadmins and custom code. In contrast, the proposed setup enables anyone to listen for events in a machine-readable way, without any prior effort from sysadmins.

13) You've used the terminology "we" throughout your document. Who are
the other author(s)?

I think this is similar to the previous report. I received feedback about this paper from several developers. Due to the rather heated nature of the previous rounds of the discussion and some personal attacks, they preferred to not be credited as authors. The actual wording is mine, I wrote the text, so I'm listed as the only author.

Anyway, I hope that we'll be able to judge the merits of the individual proposals, and that this won't deteriorate into a popularity contest.

Cheers,
Jan

--
Trojitá, a fast Qt IMAP e-mail client -- http://trojita.flaska.net/
