----- Original Message -----
> From: "Tim Flink" <[email protected]>
> To: [email protected]
> Sent: Friday, May 15, 2015 1:14:56 AM
> Subject: Re: Fedmsg Emitting
>
> On Thu, 14 May 2015 20:02:29 +0200
> Martin Krizek <[email protected]> wrote:
>
> > Before we start sending fedmsgs, we need to discuss a few things. We
> > don't have to find solutions to all of these problems, just keep them
> > in mind when designing the solution we're going to start with:
> >
> > 1. How often do we send fedmsgs:
> >    a) per-task
> >    b) per-update
> >    c) per-build
> >
> > With a) and b), we can list the affected packages in the fedmsg.
> >
> > I am not sure if there are any limits when it comes to fedmsg size,
> > or whether the infra folks would be happier with fewer larger fedmsgs
> > or more smaller ones (or whether it doesn't matter).
>
> a) doesn't make a lot of sense to me - yeah, it fits better into our
> execution model, but I don't think that anyone outside of taskotron
> cares much about what was done in a task. That being said, once we have
> more diverse tasks this could change, but I'm not really looking to
> design for something that hasn't even started happening yet.

Looking at it now, I have no idea why I wrote "per-task". What I meant is
that we could send one fedmsg per build (or update) that would contain the
results of all tasks executed on that build (or update). Just a thought.
Sorry for the confusion. :/
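
Roughly something like this is what I had in mind -- one message per item,
carrying all of its task results. Purely an illustration; the topic and
field names below are made up, not a proposal for the actual schema:

    # Illustration only -- topic, modname and field names are invented
    # for this sketch, not the final schema.
    import fedmsg

    fedmsg.publish(
        topic='taskotron.item.complete',      # hypothetical topic
        modname='taskotron',
        msg={
            'item': 'foo-1.2-3.fc22',         # the build (or update) under test
            'item_type': 'koji_build',
            'results': [
                {'task': 'depcheck', 'outcome': 'PASSED'},
                {'task': 'upgradepath', 'outcome': 'PASSED'},
                {'task': 'rpmlint', 'outcome': 'FAILED'},
            ],
        },
    )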

> b) hits a similar issue - outside of bodhi, there isn't much that works
> on updates, and my suspicion is that most of the folks consuming the
> output will fall into 1 of 2 categories:
>   - people who have small updates that only contain packages that
>     they're responsible for
>   - people who have packages in one of the megaupdates
>
> There are plenty of exceptions to either of those, but I suspect that
> _most_ people will fall into one of those categories.
>
> That leaves us with c).
>
> > I guess c) allows for easier filtering in FMN.
>
> c) not only allows for easier filtering in FMN, it's also more
> compatible with how I think releng would like to see build gating
> done. Assuming that we eventually get into the rawhide space, we'll
> have to start emitting stuff per-build anyways :)
>
> I'm of the opinion that c) is going to be best here. In the past, we've
> done a lot of results on a per-update basis, but unless I'm forgetting
> something, we could transition to more of a per-build system.
>
> For example - depcheck processes updates - if one build in that update
> fails, the whole update fails. While I think that this is the best
> choice, I also think that logic should be handled in bodhi instead of
> us trying to emulate what bodhi is doing. As far as I know, this is
> happening with bodhi2 - they're assuming that we'll be emitting
> per-build fedmsgs and the logic for failing/passing an update will lie
> in bodhi and not rely on our emulation of bodhi's processes.

That does make sense to me.

> > 2. Who do we target: users, systems, or both?
> >
> > The issue here is with tasks that repeatedly test one update.
> > Currently we check if there's already a bodhi update comment with the
> > same result, and if so, we don't post the comment again. To do
> > something like that with fedmsgs, we'd have to have code running
> > somewhere that would check against its database whether an incoming
> > result is a duplicate or not. The question is where that code would
> > run. Bodhi comes to mind since it already has information about
> > updates, so it's a good fit for tasks that work with bodhi updates.
> > However, there might be tasks that work with something else, like
> > composes. In that case we'd probably have the code on taskotron
> > systems.
>
> I think that how we handle scheduling of some of our current checks
> (depcheck and upgradepath) is a byproduct of trying to make a
> repo-level check look like a build/update-level check. I can't think of
> many more tasks that would run into the same problem of repeated runs.
>
> For the majority of tasks, I see the process as being similar to:
>
> 1. trigger task $x for $y
> 2. run task $x with $y as input
> 3. report result for $x($y)
>
> With this, we'd be running $x for each $y, and the reporting would only
> happen for each unique ($x, $y), assuming that something wasn't
> rescheduled or forced to re-run.
>
> I think it would be best to have consistent behavior for our fedmsg
> emitting. If most tasks will only emit fedmsgs once, we should take our
> minority tasks that emit more than one fedmsg per item and deduplicate
> before the messages are emitted.
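
To make sure we mean the same thing by step 3 -- for the common case I
picture one fedmsg per ($x, $y) result, roughly along these lines (the
topic and field names are invented here, not a proposal for the final
schema):

    # Sketch of "3. report result for $x($y)" -- one fedmsg per result.
    # Topic, modname and field names are made up for illustration.
    import fedmsg

    def report_result(task, item, item_type, outcome):
        fedmsg.publish(
            topic='taskotron.result.new',   # hypothetical topic
            modname='taskotron',
            msg={
                'task': task,               # $x, e.g. 'rpmlint'
                'item': item,               # $y, e.g. 'foo-1.2-3.fc22'
                'item_type': item_type,     # e.g. 'koji_build'
                'outcome': outcome,         # e.g. 'PASSED' or 'FAILED'
            },
        )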

> > So if we target systems, we'd just send all results in fedmsgs and
> > let the systems consume them and do whatever they want with them
> > (e.g. bodhi can squash all the tasks relevant to a specific update
> > and notify the package maintainer about the result via fedmsg). If
> > we target users, we'd have to have some logic to limit the rate of
> > fedmsgs ourselves, but that would mean hiding some of the results
> > (albeit duplicates) from the world.
>
> I'd like to see us do the deduplication in resultsdb (assuming that's
> where the fedmsg emission will be happening). I think that we already
> have a table for items, and I don't think that keeping track of
> "is_emitted" and the last state emitted (so we can track changes in
> state) would be too bad. Then again, I'm not the one working in the
> code and I could be wrong :)

Can you think of a use case where someone would want to receive all
results, including duplicates? (If not, I tried to sketch the last-state
tracking you describe at the bottom of this mail.)

> > So the question here is where to put the 'deduplication logic'.
> >
> > Emitting all results is the simplest solution as a starting point.
>
> Simpler, but I don't think it serves our end goals very well unless
> deduplication is going to be more expensive than I think it will be.
>
> Tim
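
Here is that sketch of the "is_emitted" / last-state tracking. This is
just the logic, not actual resultsdb code -- I'm assuming the state would
in practice live in a column on the items table rather than an in-memory
dict, and the topic is the same made-up one as above:

    # Sketch of "emit only when the outcome for a (task, item) pair
    # changes". Not resultsdb code; the topic and names are made up.
    import fedmsg

    last_emitted = {}   # (task, item) -> outcome we last emitted for it

    def maybe_emit(task, item, item_type, outcome):
        key = (task, item)
        if last_emitted.get(key) == outcome:
            return      # same outcome as last time -> duplicate, stay quiet
        last_emitted[key] = outcome
        fedmsg.publish(
            topic='taskotron.result.new',   # hypothetical topic, as above
            modname='taskotron',
            msg={'task': task, 'item': item,
                 'item_type': item_type, 'outcome': outcome},
        )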

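P.S. Just to spell out the "bodhi can squash all the tasks relevant to a
specific update" / "one failed build fails the whole update" idea in code
-- purely an illustration of the consumer side, not a suggestion for how
bodhi should actually implement it (the data layout is made up):

    # Consumer-side squashing of per-build results into an update-level
    # outcome. Illustration only; this is not bodhi code.
    def update_outcome(build_results):
        """build_results maps each build in an update to the outcomes of
        the tasks that ran on it, e.g.:
            {'foo-1.2-3.fc22': {'depcheck': 'PASSED', 'rpmlint': 'PASSED'},
             'bar-0.5-1.fc22': {'depcheck': 'FAILED'}}
        The update passes only if every task on every build passed.
        """
        for task_outcomes in build_results.values():
            if any(outcome != 'PASSED' for outcome in task_outcomes.values()):
                return 'FAILED'
        return 'PASSED'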