I don't have much to add, just that I agree with Kamil. I see some minor problems even in what he wrote, but that's well beyond what I think we should be solving now.
j. On Tue, Jan 10, 2017 at 5:19 PM, Kamil Paral <kpa...@redhat.com> wrote: > > Couldn't we use the task ID as the primary identifier but use the srpm > > for readability sake since there is no "build" NVR for scratch builds? > > Systems like new hotness will need to query according to the task ID, > though (to get reliable results). So we're talking about hacking just > resultsdb *frontend* here, e.g. by having "item: task_id" and > "item_display: nvr" in the results yaml. I don't like it much. Searching in > the frontend would have to search both in item and item_display. > > Or we could use our existing "note" field to put NEVR inside, and it would > be easily visible in the frontend without any ugly hacks. People would have > to know that if they want to search by NEVR for scratch builds, they need > to put the NEVR in the note search field (we'd have to add it). > > I assume you proposed using task id as primary identifier just to scratch > builds (but not standard builds). If you also mean it for standard builds, > then scheduling tasks locally starts to be quite inconvenient (for each > build you want to run your task on, you need to find its task id first; > it's hard to re-run something from shell history). We would also be > changing something that's a de-facto standard in our infra (using NEVR as a > unique identifier across projects). > > > > Either that or make the primary ID a tuple of (srpm name, koji task > > start time) either stored that way or rendered as a string e.g > > foo-1.2-3.fc99.src.rpm built at 201701092100 (YYYYMMDDHHMM) would become > > foo-1.2-3.fc99-201701092100. > > The same problem with inconvenient local execution. The command line > starts to be hard to read (in tutorials, etc). > > Also, the srpm name doesn't have to be anything reasonable, Koji will > happily accept anything (it doesn't care, for scratch builds). So we can > easily receive "1.srpm" or "my-new-patch.srpm" from the fedmsg (we actually > tried this [1]). Deriving any useful info from this input is probably a > mistake. We would have to download the srpm and look into the included spec > file to reliably decide which package this is related to. > > [1] https://apps.fedoraproject.org/datagrepper/id?id=2017- > 2fe001d2-32d4-434a-b3a9-29ab31eebbb0&is_raw=true&size=extra-large > > > > > > During a discussion with Kamil, a few solutions were mentioned (none > > > of them is pretty): > > > > > > 1. We can ask koji developers if there is a way to add a method that > > > would return all koji scratch builds for a given NVR - we would then > > > take the latest one and work with it. > > > > How would we get the NVR if there's no build identifier on scratch > > builds? Are you talking about the SRPM name in the fedmsg? > > Yes. Not from the fedmsg, but from the Koji task (the fedmsg gets this > value from the srpm filename included in the Koji task, I believe). But see > the problem described above with srpm file naming (can be arbitrary). So > even better would be if Koji could search and return scratch build info > containing the metadata of the srpm or from the spec file included. > > But I'm honestly not sure it is a good idea even if they wanted to > implement it. More on that below. > > > > > > 2. We can use "koji download-task" which works for both. That would > > > mean koji_build item type would eat koji task IDs instead of NVRs. > > > This would lead to having koji task IDs in resultsdb instead of NVR's > > > which kills readability. Unless libtaskotron finds the NVR from the > > > koji task ID in the case of "real" build" and stores it in a field in > > > resultsdb...Also, we need NVRs for fedmsgs. So add code to the fedmsg > > > layer that would take care of somehow adding a NVR to fedmsg of > > > completed scratch builds tasks... > > > > Can't we tell the difference between an NVR and a task ID just by the > > form that the content takes? Wouldn't the following work: > > > > 1. attempt to parse input as NVR, if there are no issues, assume that > > it's a build ID > > 2. if the NVR parse fails, check to see if the input is an int. if > > assume that it's a koji task ID > > The reverse might be easier, but yes, that's of course possible. There are > other caveats, though. > > > > > It'd mean that our koji downloading code would get more complicated but > > it seems like nothing that more unit tests couldn't mitigate. Unless > > I'm mis-thinking here, the changes needed to pull this off in trigger > > would be even smaller. > > I'm afraid our code could get infested with if clauses pretty soon. This > is not just about downloading rpms in the koji directive. We have more koji > build-related functionality and we'll sure get more in the future. For > example, we currently support "download_latest_stable", but that might be > very tricky for scratch builds with "1.srpm" files (which package is > that?). If we ever needed to work with package epoch, it is unavailable for > scratch builds (since there is no build info object to be returned). > Another example, we have dist-git directive. Without clear identification > of the package you can't use dist-git directive. Even if we download 1.srpm > and look into it, you can still find something that's not available in > dist-git (because you can build something in Koji that is not packaged in > Fedora). > > What I'm trying to illustrate here is that this is not just a few if > clauses in a single piece of code (koji utils). This can influence the > whole workflow through the formula. Some directives might return different > data based on whether it is a regular build or a scratch build. Some might > completely fail to work (distgit directive). And you might want to have > different standards and approaches for scratch builds. > > As an example, you might want to use distgit directive to download > rpmlint.config for that particular package when running rpmlint. If that > git repo doesn't exist, you want to error out (should never happen). But > for scratch builds, it's helpful if you figure out which package it is and > the config file is used, but you don't mind if such a repo doesn't exist - > the rpmlint output without any personalized rpmlint.config is still > valuable. How do you implement that? Will we depend on heuristics > (recognizing standard vs scratch build) and hard-code whether it errors out > or not? We can, but will we be always to do the right thing? What if some > tasks have different preferences? Will we not end up with monster code full > of if-then-else? It seems to me that it's a better approach to allow the > task to decide what it wants to do and fully inform it in advance whether > this is a standard build or a scratch build. That means using different > item_type and therefore different formula. The advantage is that you can > easily implement something like "fail_on_error: True/False" option for a > directive and the task can set what it needs. Or the task will know whether > it's possible to retrieve epoch for a build and will act accordingly. > > As a last example, different logic might be desirable for certain tasks > even in the "task code" (not libtaskotron code) itself. With abicheck, we > currently compare new koji build with the latest build in stable updates. > But for scratch builds, different approach might be desired - e.g. > comparing to the last standard/scratch build, regardless of the koji tag it > is in. We actually talked about very this topic the last time I saw Sinny > and Dodji. If we decided to merge standard builds and scratch builds into a > single workflow, we don't allow tasks to take a different approach even if > they wanted (unless they go into such extents like determining whether the > current "item" is a standard or a scratch build, and then emulating e.g. a > koji directive in their own code). Btw, what about tasks that want/need to > run exclusively on just standard or just scratch builds? > > > > > > 3. We can add a new action to the koji directive "download_task". And > > > then have for each koji task two formulas, one for scratch build > > > (using "download_task") and one for "real" build" (using > > > "download_build). This would require writing code for supporting > > > multiple formulas for one task and would result in having two almost > > > identical formulas... > > > > I'm not a huge fan of 3 if there is any other reasonable way to do > > things. Code duplication is not the way to go. > > I've been thinking about this and I actually like it the most. I already > described much of the issues with compounding scratch+standard builds in > the text above, but let me expand on that. I currently believe having a > separate koji_build and koji_scratch_build type is the best and the most > technically clean way to go. These two events look very similar on the > first (and second and third) glance, but they are implemented very > differently. There's a good reason that their fedmsgs look not alike at > all. Scratch builds have no restrictions on NEVRs, no uniqueness > constraints, no dist-git constraints (anything can be built), they do not > have any API (except a completely generic "task" API), we can't get any > structured metadata about them. They are very different beasts. Compare > this to e.g. DNF repo compose, ostree compose, and docker compose. They are > all "composes", but sufficiently different to get different item types and > different handling in directives (or probably even some exclusive > directives). I believe this is the same case. > > The code duplication doesn't have to be a problem, I think. In package > distgit, you'll have koji_build/mytest/runtask.yaml and > koji_scratch_build/mytest/runtask.yaml. You can symlink any code between > dirs, if your task is able to handle both. Or you can have separate code, > if your task needs it (!). The formulas are duplicated, yes, but not > really. First, they don't change much, so it's not really a problem, and > second you might want to have a different behavior anyway (e.g. remove > dist-git directive to download rpmlint.conf, since it's problematic for > scratch builds). > > For non-distgit (generic) tasks, there would be two formulas under > different names in the same repo (say runtask.yaml and > runtask_scratch.yaml). We would configure trigger to run the task with > 'runtask.yaml' for koji builds and 'runtask_scratch.yaml' for scratch > builds (that should be hopefully trivial). And this is in line with our > future ideas how scheduling should work - users should be able to configure > when to run their task in a web UI, and a formula name would be part of > that. That makes it flexible and would allow them to do awesome things, > like experiment with a new formula while keeping the old formula still > working. In short, I believe we want to go this route (multiple formula > support) anyway. Having it currently "hard-coded" in trigger is not > necessarily a bad thing, it's a step forward towards having it in a > database. > > Ugh, such a long email. Bad kparal. > _______________________________________________ > qa-devel mailing list -- qa-devel@lists.fedoraproject.org > To unsubscribe send an email to qa-devel-le...@lists.fedoraproject.org >
_______________________________________________ qa-devel mailing list -- qa-devel@lists.fedoraproject.org To unsubscribe send an email to qa-devel-le...@lists.fedoraproject.org