> > Before we start sending fedmsgs we need to discuss a few things. We
> > don't have to find solutions to all these problems, just keep them in
> > mind when designing the solution we're going to start with:
> > 
> > 1. How often do we send fedmsg
> > a) per-task
> > b) per-update
> > c) per-build
...
> 
> That leaves us with c)

Seems reasonable to me.

> 
> > I guess c) allows for easier filtering in FMN.
> 
> c) not only allows for easier filtering in FMN but it's also more
> compatible with how I think that releng would like to see build gating
> done. Assuming that we eventually get into the rawhide space, we'll
> have to start emitting stuff per-build anyways :)
> 
> I'm of the opinion that c) is going to be best here. In the past, we've
> reported a lot of results on a per-update basis but, unless I'm forgetting
> something, we could transition to more of a per-build system.
> 
> For example - depcheck processes updates - if one build in that update
> fails, the whole update fails. While I think that this is the best choice,
> I also think that logic should be handled in bodhi instead of us trying
> to emulate what bodhi is doing. As far as I know, this is happening
> with bodhi2 - they're assuming that we'll be emitting per-build fedmsgs
> and the logic for failing/passing an update will lie in bodhi and not
> rely on our emulation of bodhi's processes.

That's great to hear.

> 
> > 2. Who do we target: users, systems or both
> > 
> > The issue here is with tasks that repeatedly test one update.
> > Currently we check if there's a bodhi update comment with the same
> > result already and if so, we don't post the comment again. To do
> > something like that with fedmsgs we'd have to have code running
> > somewhere that would check against its database whether an incoming
> > result is a duplicate or not. The question is where the code would
> > run. Bodhi comes to mind since it already has information about
> > updates and so is good for tasks that work with bodhi updates.
> > However, there might be tasks that work with something else, like
> > composes. In this case we'd probably have the code on taskotron
> > systems.
> 
> I think that how we handle scheduling of some of our current checks
> (depcheck and upgradepath) is a byproduct of trying to make a
> repo-level check look like a build/update-level check. I can't think of
> many more tasks that would run into the same problem of repeated runs.

I agree that depcheck and upgradepath are somewhat "special" here. Once Fedora 
Infra sees how many results we publish daily, especially during freeze periods 
(when there are lots of packages pending stable), I'm afraid they might ask us 
to come up with a better solution. I'm not sure whether there are better ways 
to handle it; either way, there will probably always be some kind of check that 
requires this sort of constant re-running. But it seems reasonable to assume 
such checks will be a small minority of the overall task pool.

> 
> For the majority of tasks, I see the process as being similar to:
> 
>   1. trigger task $x for $y
>   2. run task $x with $y as input
>   3. report result for $x($y)
> 
> With this, we'd be running $x for each $y and the reporting would only
> happen for each unique ($x, $y) assuming that something wasn't
> rescheduled or forced to re-run.
> 
> I think it would be best to have consistent behavior for our fedmsg
> emitting. If most tasks will only emit fedmsgs once, we should take our
> minority tasks that emit more than one fedmsg per item and deduplicate
> before the messages are emitted.

Or you could say that most tasks always emit fedmsgs (even though for them that 
means just once), and therefore the minority tasks should also always emit 
them :) I agree on having consistent behavior, but I think it's possible to 
find a solution that side-steps this. See below.

> 
> > So if we target systems we'd just send all results in fedmsgs and let
> > the systems consume them and do whatever they want to do with them
> > (e.g. bodhi can squash all the tasks relevant to a specific update and
> > notify the maintainer of the package via fedmsg about the result). If
> > we target users, we'd have to have some logic to limit the rate of fedmsgs
> > ourselves but that would mean hiding some of the results (although
> > duplicates) from the world.
> 
> I'd like to see us do the deduplication in resultsdb (assuming that's
> where the fedmsg emission will be happening). I think that we already
> have a table for items and I don't think that keeping track of
> "is_emitted" and the last state emitted (so we can track changes in
> state) would be too bad. Then again, I'm not the one working in the
> code and I could be wrong :)

Martin and I talked about this at length some time ago, and I raised the 
question of different consumers. I see two groups here: machines and humans. 
If I understand you correctly, what you propose above is to hardcode the 
system to fit human preferences. If I misunderstood, then the whole rest of 
this mail is based on a wrong assumption, but it's still an interesting topic :)

When targeting humans, I believe we will cut off some use cases for machines, 
which can benefit from duplicated (and thus very up-to-date) information. Some 
ideas off the top of my head describing what duplicated messages allow:
* For some checks like depcheck, the machine (i.e. Bodhi) can display not only 
the outcome for a certain package, but also the time when that package was last 
tested (which might be a very interesting piece of information: was it OK 
yesterday, or 14 days ago and never run since?).
* Or maybe show a graph of all the outcomes in the last week, so that you can 
see that it passed 10 times and failed once, and decide that the failure was 
probably a random fluke not worth investigating further.
* If the message passes through another system (e.g. Bodhi, Koji), the system 
in question can e.g. allow users to configure how they want to receive results: 
duplicated or deduplicated, how much deduplicated, how often, etc. This is 
mostly relevant for email, RSS or other communication channels, because the 
fedmsg bus itself is not configurable per individual user's needs.
* It's possible to create some kind of live package testing stats overview, 
without regular queries.

You can argue that most of this is achievable without duplicated messages, by 
querying ResultsDB. Yes, but that often means a bigger performance hit, and you 
lose the "live" status. For example, to display the graph from the second point 
you can either query ResultsDB on every page view, which means a lot of 
computing demand, or cache the result and refresh it once an hour, which loses 
the live status. With notifications, it is always perfectly up to date and you 
never refresh it needlessly. You can still add a safeguard against lost fedmsgs 
like "refresh the graph if it's older than a week, just to be safe", but that's 
it.
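
To make that concrete, here is a minimal sketch of such a tool: a small fedmsg 
consumer that keeps a live per-package tally of outcomes in memory, so a graph 
can be drawn without querying ResultsDB on every page view. The topic and 
message fields are placeholders, not a settled format:

    # Minimal sketch only; topic and field names are assumptions.
    from collections import Counter, defaultdict

    from fedmsg.consumers import FedmsgConsumer


    class LiveStatsConsumer(FedmsgConsumer):
        topic = 'org.fedoraproject.prod.taskotron.result.new'  # hypothetical
        config_key = 'taskotron.livestats.enabled'

        def __init__(self, hub):
            super(LiveStatsConsumer, self).__init__(hub)
            # package name -> Counter of outcomes seen so far
            self.tallies = defaultdict(Counter)

        def consume(self, msg):
            body = msg['body']['msg']
            # 'item' and 'outcome' are assumed field names
            self.tallies[body.get('item', 'unknown')][body.get('outcome')] += 1
            # a real tool would also expire old entries and expose the
            # tallies to whatever renders the graph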

So, for machine processing, I see duplicated messages as a benefit. I don't 
insist we need them, but they seem to allow interesting tools to be written. 
(A different question is whether the volume would be too high for the fedmsg 
bus to handle, but that is a separate, technical issue.)

If some machine didn't want to see duplicated messages and wanted to filter 
them out easily, without keeping its own database or querying ours, we could 
add something like "duplicate=True" to the message body (see the sketch below). 
A simple solution, for machines.
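
For illustration, the emitting side could then look roughly like this (topic 
and field names are placeholders again, not a settled format):

    import fedmsg

    def publish_result(item, check, arch, outcome, previously_reported):
        """Emit one result; 'duplicate' only flags that the same
        (item, check, arch, outcome) has been reported before."""
        fedmsg.publish(
            topic='taskotron.result.new',   # hypothetical topic
            modname='taskotron',
            msg={
                'item': item,
                'checkname': check,
                'arch': arch,
                'outcome': outcome,
                'duplicate': previously_reported,
            },
        )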


Now, let's imagine we still decide on message deduplication and choose the 
human as our primary notification target. There are further issues with that. 
Consider a simple scenario:

1. A maintainer submits update U1 consisting of builds B1 and B2.
2. Depcheck x86_64 runs on U1, reports results.
3. Maintainer receives two fedmsg notifications, one for B1 and one for B2, 
from FMN (email or irc).
4. Depcheck i386 runs on U1, reports results.
5. Maintainer receives two fedmsg notifications, one for B1 and one for B2, 
from FMN (email or irc).
6. Depcheck armhfp runs on U1, reports results.
7. Maintainer receives two fedmsg notifications, one for B1 and one for B2, 
from FMN (email or irc).
8. Upgradepath noarch runs on U1, reports results.
9. Maintainer receives two fedmsg notifications, one for B1 and one for B2, 
from FMN (email or irc).

As you can see, the maintainer receives "number of builds x number of 
architectures (except for noarch checks) x number of checks" notifications. 
And they are spread out in time, not sent all at once.

So, if we really want to do a good job of informing the maintainer here, 
deduplicating repeated results is just one part of the story. We also need to 
combine:
* individual build results, if they are part of a bigger object (update)
* architecture results, for checks which are architecture dependent
* individual check results, if we run multiple checks

So that ideally:
1. A maintainer submits update U1 consisting of builds B1 and B2.
2. Depcheck x86_64 runs on U1, reports results.
3. Depcheck i386 runs on U1, reports results.
4. Depcheck armhfp runs on U1, reports results.
5. Upgradepath noarch runs on U1, reports results.
6. Maintainer receives a single fedmsg notification about U1, from FMN (email 
or irc).

Unfortunately, this means we would have to implement a lot of external logic 
(i.e. Bodhi's "what is an update" logic), which is something we're trying to 
get away from (we have unpleasant experience with the bodhi comments feature, 
which deals with a lot of this).
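
Just to show how much of that logic we'd be pulling in, a rough sketch (the 
arch list, the check list and the outcome ordering are made up for the 
example):

    # Sketch of the "what is an update" knowledge we'd have to duplicate.
    SEVERITY = ['PASSED', 'INFO', 'FAILED']
    CHECKS = {'depcheck': ('x86_64', 'i386', 'armhfp'),
              'upgradepath': ('noarch',)}

    def expected_results(builds):
        """All (build, arch, check) combinations to wait for; knowing which
        builds form an update is exactly Bodhi's knowledge, not ours."""
        return {(build, arch, check)
                for build in builds               # e.g. ['B1', 'B2']
                for check, arches in CHECKS.items()
                for arch in arches}

    def combined_outcome(received):
        """received: dict mapping (build, arch, check) -> outcome.
        Collapse everything into one update-level outcome; worst one wins."""
        if not received:
            return 'PASSED'
        return max(received.values(), key=SEVERITY.index)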


Taking all of this into account, it seems easier and more sensible to me to 
target machines with taskotron fedmsgs. Let's see:

1. A maintainer submits update U1 consisting of builds B1 and B2.
2. Taskotron gradually executes all available checks on B1 and B2.
3. Taskotron emits fedmsgs for every completed check, for every architecture, 
for every build.
4. Bodhi listens for Taskotron fedmsgs, marks internally (and possibly in the 
web UI) which builds were tested with what result, adds/updates links to logs.
5. Once results for all builds x archs x checks have been received, or once 
some timeout occurs (e.g. "wait at most 8 hours for test results"), Bodhi sends 
its own fedmsg (sketched after this list).
6. Maintainer listens for _Bodhi_ fedmsgs and receives a single notification 
that U1 testing is complete.
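
A very rough sketch of what steps 4 and 5 could look like on the Bodhi side; 
the topic, the field names and the Bodhi internals are all assumptions, not 
real Bodhi code:

    import time
    from collections import defaultdict

    import fedmsg
    from fedmsg.consumers import FedmsgConsumer

    WAIT_AT_MOST = 8 * 60 * 60  # "wait at most 8 hours for test results"


    class TaskotronResultsConsumer(FedmsgConsumer):
        topic = 'org.fedoraproject.prod.taskotron.result.new'  # hypothetical
        config_key = 'bodhi.taskotron_results.enabled'

        def __init__(self, hub):
            super(TaskotronResultsConsumer, self).__init__(hub)
            self.received = defaultdict(dict)  # update -> {(build, arch, check): outcome}
            self.started = {}                  # update -> time of first result

        def consume(self, msg):
            body = msg['body']['msg']
            update = self.update_for_build(body['item'])  # Bodhi knows this
            key = (body['item'], body['arch'], body['checkname'])
            self.received[update][key] = body['outcome']
            self.started.setdefault(update, time.time())

            done = self.expected(update) <= set(self.received[update])
            late = time.time() - self.started[update] > WAIT_AT_MOST
            if done or late:
                # step 5: Bodhi emits its own, update-level fedmsg
                fedmsg.publish(topic='update.testing.complete',  # hypothetical
                               modname='bodhi',
                               msg={'update': update,
                                    'results': self.received.pop(update)})
            # note: a real implementation would also need a periodic timer,
            # since the timeout above is only checked when a result arrives

        # Bodhi's own "what is an update" knowledge:
        def update_for_build(self, build):
            raise NotImplementedError

        def expected(self, update):
            raise NotImplementedError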

Now, because Bodhi is designed for publishing updates, it can tailor the 
messaging behavior nicely. It can notify after all testing is complete, or 
immediately after the first failure. It can have timeouts in case some tests 
get stuck. I'm not sure whether it can make some of these things configurable 
per maintainer; I think that is no longer possible when using fedmsgs instead 
of emails. But it can publish under different topics (e.g. first failure vs. 
testing complete) and maintainers can subscribe to whatever suits them. (And 
if they're feeling particularly tough, they can of course also subscribe to 
the flood of core taskotron fedmsgs.)

Furthermore, Bodhi can put additional logic into this, splitting checks into 
an essential and a non-essential group, i.e. depcheck + upgradepath vs. rpmlint 
+ rpmgrill. The notifications can fire off once the essential testing is 
complete, or they can wait for all testing but ignore failures in the 
non-essential group (and set the overall outcome to something like INFO if 
e.g. only rpmlint failed).
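
For illustration, the grouping logic could be as simple as this (the check 
names come from the example above, everything else is made up):

    ESSENTIAL = {'depcheck', 'upgradepath'}
    NON_ESSENTIAL = {'rpmlint', 'rpmgrill'}

    def overall_outcome(results):
        """results: dict mapping check name -> 'PASSED' or 'FAILED'."""
        if any(results.get(check) == 'FAILED' for check in ESSENTIAL):
            return 'FAILED'
        if any(results.get(check) == 'FAILED' for check in NON_ESSENTIAL):
            return 'INFO'   # e.g. only rpmlint failed
        return 'PASSED'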

With this approach, I like that the Bodhi logic is configured in Bodhi and 
we're not trying to emulate it; we just supply raw data. People subscribe to 
Bodhi notifications. The same approach can be used with Koji or any other 
service: we supply the data, they decide what to do with it and what is or 
isn't important, and they send the final result notifications (or even partial 
ones, if they want to and it makes sense).



But what about results which don't belong to a specific service, you ask? What 
if a new glibc is submitted and the existing firefox is tested against it using 
a firefox-regression-suite check, where do those results go? Great question.

I think the raw Taskotron fedmsgs are the answer here. Hopefully most of these 
checks will be one-shot executions (unlike continuously re-run checks such as 
depcheck). So if maintainers subscribe to our messages, they should receive at 
worst one result per arch, i.e. 3 separate notifications for a single 
execution. Or, if they have some really special kind of check, they can process 
the notifications on their own. Once we're there and checks like these are more 
common, we can talk about providing services for further deduplication. But 
even if we really need to do that in some specific cases, I think the general 
approach should be the one outlined above, where we don't notify people 
directly but send results through middle-man services with their own logic and 
special needs.
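
For example, a maintainer's own tooling could watch the raw messages for a 
single package with something as small as this (again, the topic and field 
names are placeholders):

    import fedmsg

    INTERESTING = 'firefox'

    # needs a local fedmsg endpoint configuration to connect to the bus
    for name, endpoint, topic, msg in fedmsg.tail_messages():
        if not topic.endswith('taskotron.result.new'):   # hypothetical topic
            continue
        body = msg['msg']
        if body.get('item', '').startswith(INTERESTING):
            print('%s: %s on %s -> %s' % (body.get('item'),
                                          body.get('checkname'),
                                          body.get('arch'),
                                          body.get('outcome')))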


Now, after seeing the wall of text I've written, I wonder, have I actually kept 
to the original topic, or strayed away into a completely different area? :-)