Re: Non-image blocker process change proposal

2016-03-02 Thread Kamil Paral
> > Here we go:
> > https://git.fedorahosted.org/cgit/fedora-qa.git/tree/track-previous-release-blocker.py
> 
> #!/usr/bin/env python2
> 
> there's no excuse for this. ;)

I'll do better in future, I promise! :)

> 
> What's the deal with making args.nvr a list then requiring it to be
> exactly one item long?

I thought that's how argparse always works, returning a list for positional 
arguments. It turns out it does that only if you use nargs. Thanks, fixed.
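
For anyone following along, here is a minimal sketch of the argparse behaviour in question. Only the argument name `nvr` comes from the script; the two helper functions are illustrations and are not the script's actual code.

```python
import argparse


def parse_plain(argv):
    # Positional argument without nargs: the value is stored as a plain string.
    parser = argparse.ArgumentParser()
    parser.add_argument("nvr")
    return parser.parse_args(argv).nvr


def parse_nargs(argv):
    # Positional argument with nargs=1: the value is wrapped in a one-item list.
    parser = argparse.ArgumentParser()
    parser.add_argument("nvr", nargs=1)
    return parser.parse_args(argv).nvr


if __name__ == "__main__":
    print(parse_plain(["tmw-20130201-6.fc23"]))
    print(parse_nargs(["tmw-20130201-6.fc23"]))
```

So dropping nargs is enough when exactly one value is expected.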


> 
> > A bit more complicated than I expected. But now we can simplify the
> > instructions to this:
> > https://fedoraproject.org/w/index.php?title=User:Kparal/Draft:SOP_blocker_bug_process=436628#Tracking_AcceptedPreviousRelease_blocker_bugs
> > 
> > What do you think?
> 
> +1!

OK, I'll push the changes live.
--
test mailing list
test@lists.fedoraproject.org
To unsubscribe:
http://lists.fedoraproject.org/admin/lists/test@lists.fedoraproject.org

Re: Non-image blocker process change proposal

2016-03-02 Thread Adam Williamson
On Wed, 2016-03-02 at 09:45 -0500, Kamil Paral wrote:
> > 
> > > 
> > > It would be nice of course if there was a tool for doing this, then the
> > > text could be reduced to 'run the magic tool and make sure it says OK'.
> > I'll do some thinking whether I can write such a magical tool.
> Here we go:
> https://git.fedorahosted.org/cgit/fedora-qa.git/tree/track-previous-release-blocker.py

#!/usr/bin/env python2

there's no excuse for this. ;)

What's the deal with making args.nvr a list then requiring it to be
exactly one item long?

> A bit more complicated than I expected. But now we can simplify the
> instructions to this:
> https://fedoraproject.org/w/index.php?title=User:Kparal/Draft:SOP_blocker_bug_process=436628#Tracking_AcceptedPreviousRelease_blocker_bugs
> 
> What do you think?

+1! And of course eventually we can integrate it into making this
process more automated.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net


Re: Non-image blocker process change proposal

2016-03-02 Thread Kamil Paral
> > > It would be nice of course if there was a tool for doing this, then the
> > > text could be reduced to 'run the magic tool and make sure it says OK'.
> > 
> > I'll do some thinking whether I can write such a magical tool.
> 
> Here we go:
> https://git.fedorahosted.org/cgit/fedora-qa.git/tree/track-previous-release-blocker.py

Also, we could run this from blockerbugs and track it automatically (for 
example not hiding even closed previousrelease blockers, until they satisfy 
that condition). The script itself can be easily adjusted for that. But I'm not 
convinced this is worth the effort on the blockerbugs side, I imagine it could 
require a few non-trivial changes. I don't expect this tool to be used that 
often even manually. So I'd keep it like this (manual execution) for this 
cycle, we'll see how often we need it, and then we can decide whether 
blockerbugs integration is worth the effort.

Thoughts?

Re: Non-image blocker process change proposal

2016-03-02 Thread Kamil Paral
> > It would be nice of course if there was a tool for doing this, then the
> > text could be reduced to 'run the magic tool and make sure it says OK'.
> 
> I'll do some thinking whether I can write such a magical tool.

Here we go:
https://git.fedorahosted.org/cgit/fedora-qa.git/tree/track-previous-release-blocker.py

A bit more complicated than I expected. But now we can simplify the 
instructions to this:
https://fedoraproject.org/w/index.php?title=User:Kparal/Draft:SOP_blocker_bug_process=436628#Tracking_AcceptedPreviousRelease_blocker_bugs

What do you think?


The tool itself prints output like this:

$ ./track-previous-release-blocker.py tmw-20130201-6.fc23
INFO    Querying Koji for tmw-20130201-6.fc23 in f23-updates ...
INFO    Build tmw-20130201-6.fc23 was tagged into f23-updates at:
        2016-02-29 18:03:33 UTC (1456769013.68)
INFO    Downloading metalink for updates-released-f23 ...
INFO    Metalink contains metadata with these timestamps:
2016-02-29 04:47:02 UTC (1456721222.0)   ✘ older than pushed package
2016-02-29 21:44:02 UTC (1456782242.0)   ✔ sufficiently recent
WARNING ✘ FAILED Some metadata referenced in metalink is still older than the 
time when tmw-20130201-6.fc23 was tagged into f23-updates. Some users would not 
receive this update if they chose to update now.


$ ./track-previous-release-blocker.py dmidecode-3.0-1.fc23
INFO    Querying Koji for dmidecode-3.0-1.fc23 in f23-updates ...
INFO    Build dmidecode-3.0-1.fc23 was tagged into f23-updates at:
        2016-01-24 23:02:17 UTC (1453676537.74)
INFO    Downloading metalink for updates-released-f23 ...
INFO    Metalink contains metadata with these timestamps:
2016-02-29 04:47:02 UTC (1456721222.0)   ✔ sufficiently recent
2016-02-29 21:44:02 UTC (1456782242.0)   ✔ sufficiently recent
INFO    ✔ PASSED All metadata referenced in metalink is sufficiently newer than 
the time when dmidecode-3.0-1.fc23 was tagged into f23-updates. All users 
should be able to receive the update now.


$ ./track-previous-release-blocker.py datovka-4.5.0-1.fc23 \
    --metalink ~/tmp/metalink.xml  # hacked for demonstration purposes
INFO    Querying Koji for datovka-4.5.0-1.fc23 in f23-updates ...
INFO    Build datovka-4.5.0-1.fc23 was tagged into f23-updates at:
        2016-02-25 10:56:37 UTC (1456397797.11)
INFO    Metalink contains metadata with these timestamps:
2016-03-02 13:29:06 UTC (1456925346.0)   ✘ DNF cache not yet expired (until 2016-03-02 19:29:06 UTC)
WARNING ✘ FAILED All metadata referenced in metalink is newer than the time 
when datovka-4.5.0-1.fc23 was tagged into f23-updates. However, it still hasn't 
been at least 6 hours (the default DNF metadata expire duration) since any of 
the listed metadata were pushed, and so some users would not receive this 
update if they chose to update now. Please wait until the expiration time 
listed above.


$ ./track-previous-release-blocker.py sendmail-8.15.2-2.fc22
INFO    Querying Koji for sendmail-8.15.2-2.fc22 in f22-updates ...
WARNING It seems that sendmail-8.15.2-2.fc22 hasn't yet been tagged into 
f22-updates! Please verify that your NVR is correct or that something else is 
not wrong.
Hint: You can see complete tag history by running:
koji list-tag-history --build sendmail-8.15.2-2.fc22


It also has a --debug option if needed.
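
The decision visible in the output above can be sketched roughly as follows. This is a simplified reading of the printed messages, not the script's real code: the function names are made up for illustration, the 6-hour value is the DNF expiry quoted in the output, and the real tool may apply extra margins that this sketch ignores.

```python
# Assumption: 6 hours, the DNF metadata expiry quoted in the tool's output.
DNF_EXPIRE_SECONDS = 6 * 60 * 60


def classify(tagged_at, metadata_ts, now):
    """Classify one metalink metadata timestamp (all values are epoch seconds)."""
    if metadata_ts < tagged_at:
        return "older than pushed package"
    if now < metadata_ts + DNF_EXPIRE_SECONDS:
        return "DNF cache not yet expired"
    return "sufficiently recent"


def check(tagged_at, metadata_timestamps, now):
    """Return (passed, verdicts); pass only if every timestamp is recent enough."""
    verdicts = [classify(tagged_at, ts, now) for ts in metadata_timestamps]
    return all(v == "sufficiently recent" for v in verdicts), verdicts


if __name__ == "__main__":
    # Epoch values taken from the tmw example; "now" is an arbitrary later time.
    print(check(1456769013.68, [1456721222.0, 1456782242.0], 1456925346.0))
```

Requiring every timestamp to pass matches the FAILED case above, where one stale entry in the metalink is enough to leave some users without the update.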

Re: Non-image blocker process change proposal

2016-03-01 Thread Adam Williamson
On Tue, 2016-03-01 at 06:28 -0500, Kamil Paral wrote:
> > 
> > IIRC the text in fact uses "unaddressed" specifically *instead* of
> > saying "open", as a slight fudge for cases where a bug might still be
> > open but is in fact fully 'addressed'. We *are* reducing the likelihood
> > of that scenario with this change (i.e. we can't say "go" if a fix is
> > in the compose but not yet pushed stable any more), but I'm not 100%
> > sure we've removed any possibility of a bug being in this state
> > somehow. So I'm not 100% against this change but a bit worried by it.

> OK, so what about this?
> https://fedoraproject.org/w/index.php?title=User%3AKparal%2FDraft%3AGo_No_Go_Meeting=436435=436434

Looks fine.

> > > I also switched GOLD to GO, which seems to be an oversight from the
> > > past.
> > It's not, exactly. The two terms sort of coexist, it wasn't just that
> > we switched from saying "gold" to saying "go" at some point.
> > Conceptually it's the *release candidate* specifically that gets
> > declared "gold", while the *release process* is "go" (or we are "go for
> > release") if the candidate is declared "gold". I think we could at
> > least *conceptually* declare a release candidate "GOLD" but not be "go"
> > for release. It's kinda unnecessary to have both concepts, but the text
> > reads slightly awkwardly if you simply do s/GOLD/GO/g/ as you did,
> > because we don't really "declare the release "GO"", that's a somewhat
> > odd phrasing.
> > 
> > I don't really mind if we want to rephrase this a bit to drop the
> > 'gold' concept - we barely use it anywhere else but on this page - but
> > it feels like it should be a separate change, not part of this
> > revision, and it should be slightly more than just a search-n-replace
> > so the text doesn't read weird :)

> OK, I reverted this.

Thanks! So for me this is good to go.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net


Re: Non-image blocker process change proposal

2016-03-01 Thread Kamil Paral
> > Then I added the process of tracking AcceptedPreviousRelease fixes
> > and verifying that the related updates repo metalink is in shape.
> 
> This generally looks fine, my only concern is that it's extremely
> specific stuff that might go stale quite easily. But since it's not at
> all an 'obvious' process, explaining it in detail is important.
> 
> It would be nice of course if there was a tool for doing this, then the
> text could be reduced to 'run the magic tool and make sure it says OK'.

I'll do some thinking whether I can write such a magical tool.

> > 2. Go No Go Meeting
> > https://fedoraproject.org/w/index.php?title=User%3AKparal%2FDraft%3AGo_No_Go_Meeting=436242=435628
> > 
> > I wanted to avoid enumerating different types of blockers and their
> > conditions here, so I use the previously described fact that any open
> > blocker bugs should mean No Go, otherwise it means Go. But since
> > people are not machines and mistakes will happen, I used "open or
> > otherwise unaddressed accepted blocker bugs" to cover the case where
> > we closed some bug sooner than it should have been, and it's still
> > not addressed.
> 
> IIRC the text in fact uses "unaddressed" specifically *instead* of
> saying "open", as a slight fudge for cases where a bug might still be
> open but is in fact fully 'addressed'. We *are* reducing the likelihood
> of that scenario with this change (i.e. we can't say "go" if a fix is
> in the compose but not yet pushed stable any more), but I'm not 100%
> sure we've removed any possibility of a bug being in this state
> somehow. So I'm not 100% against this change but a bit worried by it.

OK, so what about this?
https://fedoraproject.org/w/index.php?title=User%3AKparal%2FDraft%3AGo_No_Go_Meeting=436435=436434

> 
> > I also switched GOLD to GO, which seems to be an oversight from the
> > past.
> 
> It's not, exactly. The two terms sort of coexist, it wasn't just that
> we switched from saying "gold" to saying "go" at some point.
> Conceptually it's the *release candidate* specifically that gets
> declared "gold", while the *release process* is "go" (or we are "go for
> release") if the candidate is declared "gold". I think we could at
> least *conceptually* declare a release candidate "GOLD" but not be "go"
> for release. It's kinda unnecessary to have both concepts, but the text
> reads slightly awkwardly if you simply do s/GOLD/GO/g/ as you did,
> because we don't really "declare the release "GO"", that's a somewhat
> odd phrasing.
> 
> I don't really mind if we want to rephrase this a bit to drop the
> 'gold' concept - we barely use it anywhere else but on this page - but
> it feels like it should be a separate change, not part of this
> revision, and it should be slightly more than just a search-n-replace
> so the text doesn't read weird :)

OK, I reverted this.

Re: Non-image blocker process change proposal

2016-02-29 Thread Kamil Paral
> Hello folks,
> 
> I'd like to return to the high-level overview for this topic and discuss the
> changes we plan to do in our SOPs.

Since there was not much feedback, we agreed at a QA meeting that I'll just go 
ahead and draft all related SOP changes. Here we go:

1. SOP blocker bug process:
https://fedoraproject.org/w/index.php?title=User%3AKparal%2FDraft%3ASOP_blocker_bug_process=436236=435641

I've clarified that we should not close any blocker bug until the fix is pushed 
to a stable repo *and* also part of a TC/RC (if appropriate). If this is kept, 
it means we can easily look at the list of blockers and decide whether any type 
of blocker is still blocking us (some blocker bugs open) or not (all blocker 
bugs closed). This will be important in other SOPs.

Then I added the process of tracking AcceptedPreviousRelease fixes and 
verifying that the related updates repo metalink is in shape.
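
To make "metalink is in shape" a bit more concrete, here is a rough sketch of pulling the metadata timestamps out of a metalink document. The sample XML is hand-written and simplified; its namespace URIs and element names are assumptions about what MirrorManager serves, so the code deliberately matches any element whose local name is `timestamp`.

```python
import xml.etree.ElementTree as ET

# Hand-written, simplified sample (an assumption, not a real server response).
SAMPLE = """<metalink xmlns="http://www.metalinker.org/"
          xmlns:mm0="http://fedorahosted.org/mirrormanager">
  <files>
    <file name="repomd.xml">
      <mm0:timestamp>1456782242</mm0:timestamp>
      <mm0:alternates>
        <mm0:alternate>
          <mm0:timestamp>1456721222</mm0:timestamp>
        </mm0:alternate>
      </mm0:alternates>
    </file>
  </files>
</metalink>"""


def metadata_timestamps(xml_text):
    """Return every timestamp found in the metalink, oldest first."""
    root = ET.fromstring(xml_text)
    stamps = []
    for element in root.iter():
        # Strip the "{namespace}" prefix ElementTree puts in front of tag names.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "timestamp" and element.text:
            stamps.append(int(element.text.strip()))
    return sorted(stamps)


if __name__ == "__main__":
    print(metadata_timestamps(SAMPLE))
```

Each timestamp can then be compared against the time the fixed build was pushed stable.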

This document shares many templates with 
https://fedoraproject.org/wiki/QA:SOP_freeze_exception_bug_process , but I do 
not intend to modify that one, so I might need your help, Adam, to modify the 
templates in such a way that we adjust only one of the documents.


2. Go No Go Meeting
https://fedoraproject.org/w/index.php?title=User%3AKparal%2FDraft%3AGo_No_Go_Meeting=436242=435628

I wanted to avoid enumerating different types of blockers and their conditions 
here, so I use the previously described fact that any open blocker bugs should 
mean No Go, otherwise it means Go. But since people are not machines and 
mistakes will happen, I used "open or otherwise unaddressed accepted blocker 
bugs" to cover the case where we closed some bug sooner than it should have 
been, and it's still not addressed.
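
Put as a small sketch, the two rules described in this mail (when a blocker may be closed, and what "open or otherwise unaddressed" means for the decision) look roughly like this. The `Blocker` class and its field names are invented for illustration and do not correspond to any Bugzilla or blockerbugs data model.

```python
from dataclasses import dataclass


@dataclass
class Blocker:
    bug_id: int                  # hypothetical bug number
    closed: bool                 # bug state in the tracker
    fix_in_stable: bool          # fix has been pushed to a stable repo
    fix_in_compose: bool = True  # set False if a TC/RC is relevant and lacks the fix


def may_close(bug):
    # A blocker may be closed only when the fix is stable *and* in the compose.
    return bug.fix_in_stable and bug.fix_in_compose


def is_unaddressed(bug):
    # "Open or otherwise unaddressed": still open, or closed too early.
    return (not bug.closed) or (not may_close(bug))


def go_no_go(blockers):
    return "No Go" if any(is_unaddressed(b) for b in blockers) else "Go"


if __name__ == "__main__":
    print(go_no_go([Blocker(1, closed=True, fix_in_stable=True)]))
    print(go_no_go([Blocker(2, closed=True, fix_in_stable=False)]))
```

The second call shows the "closed sooner than it should have been" case, which still yields No Go.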

I also switched GOLD to GO, which seems to be an oversight from the past.



I went through these documents and found them not needing any adjustments:
https://fedoraproject.org/wiki/QA:SOP_Blocker_Bug_Meeting
https://fedoraproject.org/wiki/QA:SOP_compose_request
https://fedoraproject.org/wiki/Blocker_Bug_FAQ


Do the changes look OK to you?

Thanks,
Kamil

Re: Non-image blocker process change proposal

2016-01-13 Thread Kamil Paral
> > Do you have some other ideas/proposals, in general or in some specific
> > situations regarding the slip length?
> >
> I'm wondering if there would be interest in hosting a file containing
> upgrade requirements for each version.  For example it could have the
> package version requirements needed for a successful upgrade.  The
> upgrade tool could check that and warn the user.

From the blocker bug process perspective, I think this doesn't change much. 
Instead of ensuring the updated build is pushed to FN/FN-1/FN-2, we would need 
to ensure this requirements file is updated and pushed (probably as part of the 
system-upgrade package). So, the same process for us, really.

This would allow us to provide a better experience for important bugs which 
were not approved as blockers, though. If this definition file contained a 
package version that is not found (this or a later version) while computing the 
system upgrade, system-upgrade could visibly warn the user that a particular 
bug still affects the upgrade process and that reviewing it is recommended 
before commencing. Basically this is the same thing we do in Common Bugs [1], 
but visible to every user, not just those who stumble upon Common Bugs. I'd 
like that. But this case is not really related to the blocker bug process and 
this particular proposal, and it should be suggested as a feature request to 
system-upgrade developers. (I'm somewhat skeptical that Will will want to 
maintain this functionality, even though it would be nice for users. But maybe 
system-upgrade could simply suggest that users look at the Common Bugs page? 
That sounds simple enough and maintenance-free.)
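
Just to illustrate the idea, a requirements check of this kind might look like the sketch below. Everything here is hypothetical: no such file format exists, the package name and bug reference are placeholders, and the version comparison is a deliberate simplification (real RPM comparison also handles epochs, releases and non-numeric segments).

```python
def parse_version(version):
    # Simplification: dotted integers only, e.g. "1.2.3" -> (1, 2, 3).
    return tuple(int(part) for part in version.split("."))


def upgrade_warnings(requirements, available):
    """Compare required minimum versions against what the upgrade would install.

    requirements: {package: (minimum_version, bug_reference)}
    available:    {package: version found while computing the upgrade}
    """
    warnings = []
    for package, (minimum, bug) in sorted(requirements.items()):
        found = available.get(package)
        # A missing package counts as "not found", following the description above.
        if found is None or parse_version(found) < parse_version(minimum):
            warnings.append(
                "%s: need >= %s, found %s (see %s)"
                % (package, minimum, found or "nothing", bug)
            )
    return warnings


if __name__ == "__main__":
    reqs = {"examplepkg": ("1.1.5", "placeholder-bug")}
    for line in upgrade_warnings(reqs, {"examplepkg": "1.1.4"}):
        print(line)
```

The tool would only warn, not abort, which keeps this separate from the blocker process.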

So overall a nice idea, but I think it does not directly affect either of the 
two decisions we need to make. But maybe I missed something :)


[1] https://fedoraproject.org/wiki/Common_F23_bugs#Upgrade_issues

Re: Non-image blocker process change proposal

2016-01-13 Thread Adam Williamson
On Wed, 2016-01-13 at 10:13 -0700, Chris Murphy wrote:
> On Tue, Jan 12, 2016 at 6:06 PM, Samuel Sieb wrote:
> > On 01/12/2016 08:37 AM, Kamil Paral wrote:
> > > 
> > > Do you have some other ideas/proposals, in general or in some
> > > specific situations regarding the slip length?
> > > 
> > I'm wondering if there would be interest in hosting a file
> > containing upgrade requirements for each version.  For example it
> > could have the package version requirements needed for a successful
> > upgrade.  The upgrade tool could check that and warn the user.
> 
> One of my concerns is the state of ca-legacy, and whether and how
> this gets disabled on upgrades. I'm sure there are some other things
> that have ex post facto unsafe defaults that just stick around through
> upgrades rather than being reset to new defaults. In my opinion that
> would violate the Workstation PRD "Upgrading the system multiple
> times through the upgrade process should give a result that is the
> same as an original install of Fedora Workstation."

This all seems out of scope, as Kamil said. Can we please stick to the
non-media blocker policy discussion here? General concerns / ideas for
upgrades, and specific potential upgrade issues, should get their own
threads.

It's very likely true that upgraded systems get increasingly out of
whack with freshly installed ones when it comes to default
configurations of various packages - especially ones which don't use
the modular, multiply-overridden configuration style, and thus can't
easily update the distribution defaults post-install - but this doesn't
really seem to have much to do with the question of what the release
process policies WRT non-media blockers should be.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | identi.ca: adamwfedora
http://www.happyassassin.net


Re: Non-image blocker process change proposal

2016-01-13 Thread Chris Murphy
On Wed, Jan 13, 2016 at 2:55 PM, Adam Williamson wrote:
> On Wed, 2016-01-13 at 10:13 -0700, Chris Murphy wrote:
>> On Tue, Jan 12, 2016 at 6:06 PM, Samuel Sieb wrote:
>> > On 01/12/2016 08:37 AM, Kamil Paral wrote:
>> > >
>> > > Do you have some other ideas/proposals, in general or in some
>> > > specific situations regarding the slip length?
>> > >
>> > I'm wondering if there would be interest in hosting a file
>> > containing upgrade requirements for each version.  For example it
>> > could have the package version requirements needed for a successful
>> > upgrade.  The upgrade tool could check that and warn the user.
>>
>> One of my concerns is the state of ca-legacy, and whether and how
>> this gets disabled on upgrades. I'm sure there are some other things
>> that have ex post facto unsafe defaults that just stick around through
>> upgrades rather than being reset to new defaults. In my opinion that
>> would violate the Workstation PRD "Upgrading the system multiple
>> times through the upgrade process should give a result that is the
>> same as an original install of Fedora Workstation."
>
> This all seems out of scope, as Kamil said. Can we please stick to the
> non-media blocker policy discussion here? General concerns / ideas for
> upgrades, and specific potential upgrade issues, should get their own
> threads.
>
> It's very likely true that upgraded systems get increasingly out of
> whack with freshly installed ones when it comes to default
> configurations of various packages - especially ones which don't use
> the modular, multiply-overridden configuration style, and thus can't
> easily update the distribution defaults post-install - but this doesn't
> really seem to have much to do with the question of what the release
> process policies WRT non-media blockers should be.

Yep, it's true my post was vaguely, unintentionally, in the vicinity of a
drive-by / hijacking.


-- 
Chris Murphy

Re: Non-image blocker process change proposal

2016-01-13 Thread Chris Murphy
On Tue, Jan 12, 2016 at 6:06 PM, Samuel Sieb wrote:
> On 01/12/2016 08:37 AM, Kamil Paral wrote:
>>
>> Do you have some other ideas/proposals, in general or in some specific
>> situations regarding the slip length?
>>
> I'm wondering if there would be interest in hosting a file containing
> upgrade requirements for each version.  For example it could have the
> package version requirements needed for a successful upgrade.  The upgrade
> tool could check that and warn the user.

One of my concerns is the state of ca-legacy, and whether and how this
gets disabled on upgrades. I'm sure there are some other things that
have ex post facto unsafe defaults that just stick around through
upgrades rather than being reset to new defaults. In my opinion that
would violate the Workstation PRD "Upgrading the system multiple times
through the upgrade process should give a result that is the same as
an original install of Fedora Workstation."


-- 
Chris Murphy

Re: Non-image blocker process change proposal

2016-01-12 Thread Samuel Sieb

On 01/12/2016 08:37 AM, Kamil Paral wrote:

> Do you have some other ideas/proposals, in general or in some specific
> situations regarding the slip length?

I'm wondering if there would be interest in hosting a file containing 
upgrade requirements for each version.  For example it could have the 
package version requirements needed for a successful upgrade.  The 
upgrade tool could check that and warn the user.


Re: Non-image blocker process change proposal

2016-01-12 Thread Kamil Paral
> Up top I'd like to note there are really kind of two buckets of
> 'special blockers' for any given release. If the release being
> validated is N, they are:
> 
> 1) Bugs for which the fix must be in the 0-day update set for N
> 2) Bugs for which the fix must be stable in N-1 and N-2 by N release day

Hello folks,

I'd like to return to the high-level overview for this topic and discuss the 
changes we plan to do in our SOPs. 

So far, we decided to call bugs from bucket 1) Accepted0Day and bugs from 
bucket 2) AcceptedPreviousRelease. I also worked on some technical details 
for ensuring AcceptedPreviousRelease updates get pushed on time. Now we need to 
discuss what *happens* when we have a bug of either kind.


== Question #1: Do we slip always? ==

With media blockers, we need to create new media, which ensures a slip (there 
were a few exceptional situations in the past where we managed to build and 
test fixed media in a day, and therefore postponed the Go/NoGo decision for a 
day). With non-media blockers, the affected artifact is either the repository 
tree (we need to push a new build for Accepted0Day), or a previous release 
repository tree (we need to push an update for AcceptedPreviousRelease). For 
Accepted0Day, this will most likely involve critical bugs in components which 
are not on the default installation media (but for example negatively influence 
them, or prevent some other important functionality). For 
AcceptedPreviousRelease, this will most likely involve bugs in upgrading the 
system, or a few other specific cases like creating a bootable media of the new 
release or booting the new release in a VM.

Now the question is whether exactly the same rules apply (i.e. if this is not 
fixed at go/no-go, we slip), or whether in certain cases we would decide to not 
slip.

Since pushing an update can be done relatively fast, I can imagine that people 
would propose to not slip if an update is prepared and tested, but not yet 
pushed stable. Earlier in this thread, I tried to point out that this is not 
enough, because things can go wrong on multiple levels and we really need to 
insist that the update is already stable (and the metalink metadata adjusted to 
not allow usage of older pushes). Of course this can sometimes be handled with 
the same trick as media blockers, i.e. postponing the Go/NoGo decision for a 
day, provided RelEng approves. But in general I think we should not avoid 
slipping and just "hope for the best". These bugs were accepted as blockers and 
we need to make sure people don't hit them, even if they have the bad luck of 
being assigned to an older mirror.

Do you see any other cases where we should either not slip, or where it would 
be tempting to not slip and we should discuss and define such a use case 
explicitly?


== Question #2: For how long do we slip? ==

Earlier in this thread, I suggested some kind of a dynamic slip that would 
reflect how fast we can resolve things (for example perform a push). But both 
Kevin from Infra and Dennis from RelEng didn't think it was a good idea, and 
claimed we should slip as usual, i.e. a week (if I misunderstood something, 
please correct me). Of course this is their field, not QA's, so I definitely 
believe their judgment.

Do you have some other ideas/proposals, in general or in some specific 
situations regarding the slip length?


Thanks a lot for feedback,
Kamil

Re: Non-image blocker process change proposal

2015-12-07 Thread Kamil Paral
> I definitely like the idea of tracking this better and making sure to
> tell releng what's needed. In fact, I think that may be all we need,
> especially with bodhi2 pushing updates faster than 1 did.

I see 3 parts here:
1. tracking these issues (that will be done by QA with our new trackers)
2. knowing what needs to be done (that I'm trying to figure out in this 
thread), and implementing necessary tools if needed
3. making sure it's done (it needs to be part of some QA or RelEng SOP to make 
sure it's not forgotten when pushing out the new release)


> I am with Kevin here: we have things tightly coupled with mirrors and
> mirroring, and making changes by a day or two throws timings way off, purely
> because we have a built-in sync buffer of the weekend. To slip the go/no-go
> decision to Monday we would need to push out the ship date from Tuesday to
> Friday to give mirrors syncing time, and that is making things somewhat tight.
> We really need to slip a week for any slip.

OK, a week slip it is.

> 
> > > Leaves less time to sync mirrors,
> > > update common bugs, etc etc.
> > 
> > I would say the opposite - all of that can start happening right away, it's
> > not blocked on waiting for the FN-x push. So in case the announcement gets
> > out on Tuesday as usual, it's the same time, but if it gets pushed back to
> > Wednesday or later day, it's more time for these tasks to happen. The only
> > exception is that FN-x updates repo, which will get shorter sync time
> > because we want to make sure people download the fixed packages, not old
> > ones. Currently that behavior is undefined.
> 
> we can not put the bits onto the mirrors until we are sure they are the bits,
> otherwise we offer the mirrors lots of churn, wasted iops and bandwidth, and
> we lose mirrors.

I think there's still some misunderstanding here, but I don't want to spend 
more time on this. My core topic is to figure out what steps we need to take in 
order to deliver the updated (fixed) packages in FN-1/FN-2 repos to all our 
users on the FN release day. Slip duration is secondary.

> > Looking into existing MM scripts, I have the same opinion, but I can
> > contact Adrian to confirm. If we want to make it even simpler, we can drop
> > all alternative metadata and leave just the current hash (that script would
> > be run once the push containing that critical update is performed).
>
> I am okay with having a way to say ship only the latest metadata.

Great, so this could be the technical means to do what we're looking for. Now I 
need to discuss this with MirrorManager devs and ideally convince them to 
implement this.

> we have no way of ensuring always that people are getting the latest data, or
> that they have the latest bits installed. but people can always shoot
> themselves in the foot. people can and will do a distro update without
> updating the running os first. 

Yes, they can. We're trying to limit the unintentional foot shooting in this 
thread. I.e. once we tell them it's safe to upgrade ("Fedora N+1 is here!"), it 
should really be safe to upgrade.

> I would suggest not filing a rel-eng ticket and telling us what to do, as
> that will not go over well. We should now sit down and work out a process.
> Then likely a ticket needs to be filed asking that the process be followed.

That's exactly what I was trying to say, in different words. I hope that 
means we agree with each other :-) This very thread is my attempt to "sit down 
and work out a process", and once we have it, and we implement the tool to do 
the necessary tasks, RelEng's "New Release SOP" can then include something like:
"If there are any accepted blockers for previous stable releases [link], please 
look at the date when the last of them was pushed stable, and run 
`./mirror-manager-drop-older-metadata REPO DATE`"
Or QA can ask you to please not forget about this step, if that's what you 
prefer.

Does that sound OK?
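
To show what I have in mind, here is a sketch of the filtering such a tool would do. Note that `mirror-manager-drop-older-metadata` does not exist yet; the entry layout (a `timestamp` plus a `sha256`) and both function names are assumptions for illustration, not MirrorManager's real data model.

```python
def drop_older_metadata(entries, fix_pushed_at):
    """Keep only metadata generated at or after the fix was pushed stable."""
    return [e for e in entries if e["timestamp"] >= fix_pushed_at]


def latest_only(entries):
    """Stricter variant: keep just the newest metadata (empty input stays empty)."""
    if not entries:
        return []
    return [max(entries, key=lambda e: e["timestamp"])]


if __name__ == "__main__":
    # Placeholder hashes; timestamps are epoch seconds.
    entries = [
        {"timestamp": 1456721222, "sha256": "old-hash"},
        {"timestamp": 1456782242, "sha256": "new-hash"},
    ]
    print(drop_older_metadata(entries, 1456769013))
    print(latest_only(entries))
```

The first variant corresponds to the REPO DATE invocation above; the second is the "ship only the latest metadata" option.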


Re: Non-image blocker process change proposal

2015-12-04 Thread Kevin Fenzi
On Wed, 2 Dec 2015 06:42:09 -0500 (EST)
Kamil Paral  wrote:

> If the update is pending stable and just not pushed, it might make sense
> to move it one day, yes (most probably skipping weekends, though). If
> it needs more testing, we might decide to postpone it several days.
> If it's not available at all yet, waiting an extra week might be the
> right choice. So it would depend on the situation and best guess of
> folks at Go/No-Go.

But then no one knows what's happening. People could assume it's go and
do a bunch of release prep, but find out it's not and then have wasted
their time (or at least rushed when they had more time). 

> > Leaves less time to sync mirrors,
> > update common bugs, etc etc.  
> 
> I would say the opposite - all of that can start happening right
> away, it's not blocked on waiting for the FN-x push. So in case the
> announcement gets out on Tuesday as usual, it's the same time, but if
> it gets pushed back to Wednesday or later day, it's more time for
> these tasks to happen. The only exception is that FN-x updates repo,
> which will get shorter sync time because we want to make sure people
> download the fixed packages, not old ones. Currently that behavior is
> undefined.

There's a bunch of things that coordinate for the release. IMHO
shifting it a few days would make those more difficult. 
"Hey mirrors, here's f23 content, our release is... sometime next week,
we don't have our crap together enough to tell you what day, but expect
someday the bit will flip and you will get a bunch more traffic. Hope
that helps you"

> 
> > 
> > So, the alternative there would be to slip a week to get it pushed,
> > but some people may find that excessive.  
> 
> That's why I wanted to propose something more flexible, but hey, it's
> just an idea.

Sure. ;) 

...snip...

> In the future, these issues should be tracked by blocker bugs app
> using bugzilla tracker and a specific keyword, so we should not lose
> track of this. But as mentioned, pushing to stable is not enough, we
> also need to make sure old content is not served to users. That's why
> the "dropping alternative metadata from metalink" idea. We can file a
> releng ticket for this, and either include a description of what
> needs to be done, or link to some wiki SOP. QA can take care of all
> of that. The only thing that we need to ensure is that it really is
> handled before the announcement goes live, so it needs to be listed
> somewhere in RelEng/Infra "new release" SOP. 

I definitely like the idea of tracking this better and making sure to
tell releng what's needed. In fact, I think that may be all we need,
especially with bodhi2 pushing updates faster than bodhi 1 did. 

kevin






Re: Non-image blocker process change proposal

2015-12-04 Thread Adam Williamson
On Fri, 2015-12-04 at 12:20 -0600, Dennis Gilmore wrote:
> there is always the potential of hitting issues with upgrades: an older 
> release gets a higher nvr and things get messy. It is not an issue just at 
> release time.

This is true, but release time is important, because we're very
publicly visible, and a *lot* of people try to upgrade right around
release time. It's a bad look when there's a major bug that breaks all
or most upgrades at release time.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net



Re: Non-image blocker process change proposal

2015-12-04 Thread Dennis Gilmore
On Wednesday, December 02, 2015 06:42:09 AM Kamil Paral wrote:
> > > Taking all of this into account, would this be a reasonable idea?
> > > 1. At Go/No-Go voting time, all updates which block F-N release but
> > > belong to F-M (M<N) [...] pushed stable. If this
> > > is not the case and it's the last blocking issue, selected tasks
> > > (like copying compose trees into appropriate places) can be
> > > performed, and Go/No-Go will be rescheduled to the day and time when
> > > it is expected that those updates will have been pushed.
> > 
> > > I think that's not a great idea. It gets back to why we only slip in one
> > > week increments. If say we have a go/no-go on a thursday and the only
> > > thing blocking it is some update that's not pushed stable all the way
> > yet, we reschedule for friday and if it's not done then we schedule for
> > saturday? This means everyone has to work extra hours without even
> > being sure when the release will be.
> 
> If the update is pending stable and just not pushed, it might make sense to move
> it one day, yes (most probably skipping weekends, though). If it needs more
> testing, we might decide to postpone it several days. If it's not
> available at all yet, waiting an extra week might be the right choice. So
> it would depend on the situation and best guess of folks at Go/No-Go.

I am with Kevin here. We have things tightly coupled with mirrors and 
mirroring; making changes by a day or two throws timings way off, purely 
because we have a built-in sync buffer of the weekend. To slip the Go/No-Go 
decision to Monday we would need to push out the ship date from Tuesday to 
Friday to give mirrors syncing time, and that makes things somewhat tight. 
We really need to slip a week for any slip.

> > Leaves less time to sync mirrors,
> > update common bugs, etc etc.
> 
> I would say the opposite - all of that can start happening right away, it's
> not blocked on waiting for the FN-x push. So in case the announcement gets
> out on Tuesday as usual, it's the same time, but if it gets pushed back to
> Wednesday or later day, it's more time for these tasks to happen. The only
> exception is that FN-x updates repo, which will get shorter sync time
> because we want to make sure people download the fixed packages, not old
> ones. Currently that behavior is undefined.

We cannot put the bits onto the mirrors until we are sure they are the bits; 
otherwise we cause the mirrors lots of churn, wasted IOPS and bandwidth, and we 
lose mirrors.

> > So, the alternative there would be to slip a week to get it pushed, but
> > some people may find that excessive.
> 
> That's why I wanted to propose something more flexible, but hey, it's just
> an idea.

To be more flexible here we would really need to fundamentally change 
how we push bits to the mirrors. If we had a CDN of our own under our control 
we would have more options available, but the cost of that would be massive.
 
> > > 2. We will
> > > create a new mirrormanager script which will go through the specified
> > > metalink(s) and remove all metadata hashes which are older than
> > > provided timestamp/hash.
> > 
> > Something like that should be pretty easy to do I would think.
> > (Although I am not a mm developer)
> 
> Looking into existing MM scripts, I have the same opinion, but I can contact
> Adrian to confirm. If we want to make it even simpler, we can drop all
> alternative metadata and leave just the current hash (that script would be
> run once the push containing that critical update is performed).

I am okay with having a way to say "ship only the latest metadata".

> > > 3. If there are such updates as mentioned in
> > > point 1., RelEng will use this script to remove old metadata
> > > alternatives from the metalink, which means only metadata from the
> > > day this update was pushed or newer will be kept. In order to not
> > > increase mirror strain too much, this doesn't need to be used
> > > immediately, but just shortly before the release announcement (so
> > > that mirrors have time to sync latest packages, and the user load is
> > > distributed among more mirrors including those with current-1 or
> > > current-2 trees as long as possible). 4. Once the script is run in
> > > point 3., we can post the release announcement in 6 hours.
> > > 
> > > I know there still one manual step involved (figuring out in which
> > > push the blocker update was included), but I don't know how to better
> > > solve it, especially if we don't want to wait for too long.
> > > 
> > > I would be interested in Infra/RelEng feedback for the technical part
> > > of this (CCing Kevin and Dennis). Do you think this is reasonable
> > > solution, or am I completely off the track here? Do you see any
> > > better options?
> > 
> > So, looking back, we had the case of that dnf-system-upgrade. Are there
> > any others in the past, or are we making a bigger than life deal out of
> > one case?
> 
> I don't want to 

Re: Non-image blocker process change proposal

2015-12-02 Thread Stephen Gallagher
On 12/02/2015 06:42 AM, Kamil Paral wrote:
>>> Taking all of this into account, would this be a reasonable 
>>> idea? 1. At Go/No-Go voting time, all updates which block F-N 
>>> release but belong to F-M (M<N) [...]
>>> pushed stable. If this is not the case and it's the last
>>> blocking issue, selected tasks (like copying compose trees into
>>> appropriate places) can be performed, and Go/No-Go will be
>>> rescheduled to the day and time when it is expected that those
>>> updates will have been pushed.
>> 
>> I think that's not a great idea. It gets back to why we only slip
>> in one week increments. If say we have a go/no-go on a thursday
>> and the only thing blocking it is some update that's not pushed
>> stable all the way yet, we reschedule for friday and if it's not
>> done then we schedule for saturday? This means everyone has to
>> work extra hours without even being sure when the release will
>> be.
> 
> If the update is pending stable and just not pushed, it might make
> sense to move it one day, yes (most probably skipping weekends,
> though).

Sure, this I think is sane.


> If it needs more testing, we might decide to postpone it several
> days. If it's not available at all yet, waiting an extra week might
> be the right choice. So it would depend on the situation and best
> guess of folks at Go/No-Go.

I think this shouldn't be conditional: if at Go/No-Go the update isn't
at least ready to hit the button, then we slip a week. Period.
"Waiting a couple days for testing" is just adding unnecessary
uncertainty.


Re: Non-image blocker process change proposal

2015-12-02 Thread Kamil Paral
> > Taking all of this into account, would this be a reasonable idea?
> > 1. At Go/No-Go voting time, all updates which block F-N release but
> > belong to F-M (M<N) [...] pushed stable. If this
> > is not the case and it's the last blocking issue, selected tasks
> > (like copying compose trees into appropriate places) can be
> > performed, and Go/No-Go will be rescheduled to the day and time when
> > it is expected that those updates will have been pushed.
> 
> I think that's not a great idea. It gets back to why we only slip in one
> week increments. If say we have a go/no-go on a thursday and the only
> thing blocking it is some update that's not pushed stable all the way
> yet, we reschedule for friday and if it's not done then we schedule for
> saturday? This means everyone has to work extra hours without even
> being sure when the release will be. 

If the update is pending stable and just not pushed, it might make sense to move it 
one day, yes (most probably skipping weekends, though). If it needs more 
testing, we might decide to postpone it several days. If it's not available 
at all yet, waiting an extra week might be the right choice. So it would depend 
on the situation and best guess of folks at Go/No-Go.

> Leaves less time to sync mirrors,
> update common bugs, etc etc.

I would say the opposite - all of that can start happening right away, it's not 
blocked on waiting for the FN-x push. So in case the announcement gets out on 
Tuesday as usual, it's the same time, but if it gets pushed back to Wednesday 
or later day, it's more time for these tasks to happen. The only exception is 
that FN-x updates repo, which will get shorter sync time because we want to 
make sure people download the fixed packages, not old ones. Currently that 
behavior is undefined.

> 
> So, the alternative there would be to slip a week to get it pushed, but
> some people may find that excessive.

That's why I wanted to propose something more flexible, but hey, it's just an 
idea.

> 
> > 2. We will
> > create a new mirrormanager script which will go through the specified
> > metalink(s) and remove all metadata hashes which are older than
> > provided timestamp/hash.
> 
> Something like that should be pretty easy to do I would think.
> (Although I am not a mm developer)

Looking into existing MM scripts, I have the same opinion, but I can contact 
Adrian to confirm. If we want to make it even simpler, we can drop all 
alternative metadata and leave just the current hash (that script would be run 
once the push containing that critical update is performed).
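
To make the idea concrete, here is a rough sketch of what such a filter could do to a metalink document. It is not the mirrormanager tool and not an existing script: the function name, the mm0 namespace URI and the element names (alternates, alternate, timestamp) are my assumptions about the metalink layout, and the sample document is made up.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the mirrormanager extension elements (hypothetical).
MM0 = "http://fedorahosted.org/mirrormanager"

# Made-up sample: one current repomd.xml entry plus two older alternates.
SAMPLE = """<metalink xmlns="http://www.metalinker.org/"
          xmlns:mm0="http://fedorahosted.org/mirrormanager">
 <files><file name="repomd.xml">
  <mm0:timestamp>1448900000</mm0:timestamp>
  <mm0:alternates>
   <mm0:alternate><mm0:timestamp>1448800000</mm0:timestamp></mm0:alternate>
   <mm0:alternate><mm0:timestamp>1448600000</mm0:timestamp></mm0:alternate>
  </mm0:alternates>
 </file></files>
</metalink>"""


def drop_older_alternates(metalink_xml, min_timestamp):
    """Remove alternates whose timestamp is older than min_timestamp.

    Returns the filtered XML text and the number of alternates removed.
    The current (non-alternate) entry is never touched.
    """
    root = ET.fromstring(metalink_xml)
    removed = 0
    # Materialise the iterator first so removal does not disturb traversal.
    for alternates in list(root.iter("{%s}alternates" % MM0)):
        for alt in list(alternates):
            ts = alt.find("{%s}timestamp" % MM0)
            if ts is not None and int(ts.text) < min_timestamp:
                alternates.remove(alt)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

Passing a timestamp newer than every alternate gives the "even simpler" variant: all alternates go and only the current hash stays.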

> 
> > 3. If there are such updates as mentioned in
> > point 1., RelEng will use this script to remove old metadata
> > alternatives from the metalink, which means only metadata from the
> > day this update was pushed or newer will be kept. In order to not
> > increase mirror strain too much, this doesn't need to be used
> > immediately, but just shortly before the release announcement (so
> > that mirrors have time to sync latest packages, and the user load is
> > distributed among more mirrors including those with current-1 or
> > current-2 trees as long as possible). 4. Once the script is run in
> > point 3., we can post the release announcement in 6 hours.
> > 
> > I know there still one manual step involved (figuring out in which
> > push the blocker update was included), but I don't know how to better
> > solve it, especially if we don't want to wait for too long.
> > 
> > I would be interested in Infra/RelEng feedback for the technical part
> > of this (CCing Kevin and Dennis). Do you think this is reasonable
> > solution, or am I completely off the track here? Do you see any
> > better options?
> 
> So, looking back, we had the case of that dnf-system-upgrade. Are there
> any others in the past, or are we making a bigger than life deal out of
> one case?

I don't want to exaggerate the topic, but I'd also like to find and describe a 
process by which we can avoid it next time. It will be needed twice a year at 
most.

I believe there were a few similar issues in the past, but I can't really point 
to any other examples. In the majority of cases, this is likely to be related to 
system upgrade (system-upgrade, dnf, plymouth, systemd, gpg keys).

> 
> Also, that case could have been solved by dropping the alternates in
> metalink as you suggest above at 2 right?

Yes.

> 
> One thing that perhaps we could improve is to somehow note these sorts
> of things to releng. I just checked irc logs and I didn't see any
> mention of that dnf-system-upgrade plugin update being important until
> nov 3rd. Would a tracker ticket help this?

In the future, these issues should be tracked by blocker bugs app using 
bugzilla tracker and a specific keyword, so we should not lose track of this. 
But as mentioned, pushing to stable is not enough, we also need to make sure 
old content is not served to users. That's why the "dropping alternative 
metadata from metalink" 

Re: Non-image blocker process change proposal

2015-12-01 Thread Kevin Fenzi
On Tue, 1 Dec 2015 07:21:04 -0500 (EST)
Kamil Paral wrote:

> Taking all of this into account, would this be a reasonable idea?
> 1. At Go/No-Go voting time, all updates which block F-N release but
> belong to F-M (M<N) [...] pushed stable. If this
> is not the case and it's the last blocking issue, selected tasks
> (like copying compose trees into appropriate places) can be
> performed, and Go/No-Go will be rescheduled to the day and time when
> it is expected that those updates will have been pushed. 

I think that's not a great idea. It gets back to why we only slip in one
week increments. If say we have a go/no-go on a thursday and the only
thing blocking it is some update that's not pushed stable all the way
yet, we reschedule for friday and if it's not done then we schedule for
saturday? This means everyone has to work extra hours without even
being sure when the release will be. Leaves less time to sync mirrors,
update common bugs, etc etc. 

So, the alternative there would be to slip a week to get it pushed, but
some people may find that excessive. 

> 2. We will
> create a new mirrormanager script which will go through the specified
> metalink(s) and remove all metadata hashes which are older than
> provided timestamp/hash. 

Something like that should be pretty easy to do I would think. 
(Although I am not a mm developer)

> 3. If there are such updates as mentioned in
> point 1., RelEng will use this script to remove old metadata
> alternatives from the metalink, which means only metadata from the
> day this update was pushed or newer will be kept. In order to not
> increase mirror strain too much, this doesn't need to be used
> immediately, but just shortly before the release announcement (so
> that mirrors have time to sync latest packages, and the user load is
> distributed among more mirrors including those with current-1 or
> current-2 trees as long as possible). 4. Once the script is run in
> point 3., we can post the release announcement in 6 hours.
> 
> I know there still one manual step involved (figuring out in which
> push the blocker update was included), but I don't know how to better
> solve it, especially if we don't want to wait for too long.
> 
> I would be interested in Infra/RelEng feedback for the technical part
> of this (CCing Kevin and Dennis). Do you think this is reasonable
> solution, or am I completely off the track here? Do you see any
> better options?

So, looking back, we had the case of that dnf-system-upgrade. Are there
any others in the past, or are we making a bigger than life deal out of
one case?

Also, that case could have been solved by dropping the alternates in
metalink as you suggest above in point 2, right?

One thing that perhaps we could improve is to somehow note these sorts
of things to releng. I just checked irc logs and I didn't see any
mention of that dnf-system-upgrade plugin update being important until
nov 3rd. Would a tracker ticket help this? 

kevin




Re: Non-image blocker process change proposal

2015-12-01 Thread Adam Williamson
On Tue, 2015-12-01 at 19:40 -0700, Kevin Fenzi wrote:
> One thing that perhaps we could improve is to somehow note these
> sorts
> of things to releng. I just checked irc logs and I didn't see any
> mention of that dnf-system-upgrade plugin update being important
> until
> nov 3rd. Would a tracker ticket help this?

We're already working on that side of things, see elsewhere in the
thread.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | identi.ca: adamwfedora
http://www.happyassassin.net



Re: Non-image blocker process change proposal

2015-12-01 Thread Kamil Paral
> I've tried to find out some of the technical details of this. Mirrormanager
> publishes the current hash of repomd.xml, and also hashes of usually up to
> two older repomd.xml files. You can see it here:
> https://mirrors.fedoraproject.org/metalink?repo=rawhide&arch=x86_64
> It's the  tag in  and  tags.
> 
> Here's a nice graph showing how often our mirrors distribute current, or
> older content:
> https://adrian.fedorapeople.org/f22-updates-repomd-propagation.svg
> 
> The time restraints are defined here:
> https://github.com/fedora-infra/mirrormanager2/blob/master/mirrormanager2/lib/model.py#L314
> If the current push is older than 2 days, there should be no alternate hashes
> older than 3 days. If the current push is younger, there can be one hash
> arbitrarily old, but no further hashes older than 3 days. I hope I read the
> code correctly (the docstring doesn't seem to match it exactly).
> 
> However, it also depends on how often metalink is regenerated, the old items
> will not disappear on their own. I learned that all metalinks are
> regenerated based on any of these fedmsg events:
> org.fedoraproject.prod.bodhi.updates.fedora.sync
> org.fedoraproject.prod.compose.rawhide.rsync.complete
> org.fedoraproject.prod.compose.branched.rsync.complete
> So if there is no new push (in any repository), metalinks are not regenerated
> and old hashes are not dropped. Theoretically releng could send out one of
> those fedmsg events artificially to trigger metalink regeneration, if
> needed.
> 
> Currently, there are no means to generate a new metalink with alternative
> hashes disabled, or removing those alternatives from the metalink
> intentionally at some point of time afterwards. That would require patching
> our tools. This would of course lead to a larger load on our master mirror
> and those mirrors which managed to get synced quickly, because that would
> disqualify any other mirrors which are not completely current. But it could
> get handy in some situations, unfortunately the tools do not allow it at the
> moment.
> 
> The second part of the story is dnf. In dnf, metadata_expire= option defines
> how often metalink is pulled again and new metadata are downloaded if the
> cached metadata hash differs from the current hash in the metalink. However,
> if the top-listed repository is not completely up-to-date (it contains
> current-1 or current-2 metadata), but its hash is listed among alternate
> hashes in the metalink, dnf is fine with that and does not attempt to query
> different repos to retrieve the very current metadata. That means that as
> long as the metalink contains some older hashes, and some repository offers
> that older metadata, some users might not get latest metadata. The default
> value for metadata_expire is 6 hours for stable updates.
> 
> So, the outcome of this exercise is:
> If we want to be sure the latest updates are available to _all_ our users, we
> need to wait until there are no older metadata hashes in the metalink and
> then 6 more hours. There will be no older metadata hashes in the metalink
> when 3 days pass since the push of the important update, *once* there is a
> new push after that time (which will regenerate the metalink), or if releng
> send out a fake event manually.
> 
> This is, uh, a) quite a long time and b) complex. I'll be very glad if you
> can point out anything that I've described or understood wrong.
> 


Taking all of this into account, would this be a reasonable idea?
1. At Go/No-Go voting time, all updates which block F-N release but belong to 
F-M (M

Re: Non-image blocker process change proposal

2015-11-26 Thread Kamil Paral
> Accepted0Day (for bugs where the fix must appear in 0-day updates for
> the new release)
> AcceptedStable (for bugs where the fix must appear in updates for the
> previous stable release(s) by release day of the new release)
> 
> I'm not 100% married to either of those, especially the second. If
> anyone has a better idea, please send it!

It works, but I'm not completely satisfied with 'AcceptedStable' either; there 
are too many different interpretations. Potential other keywords could be 
'AcceptedPrevious' or 'AcceptedPrevRelease'. The last one is the longest, but 
it's also the most obvious. It would probably be my choice, unless we come up 
with something even better.


Re: Non-image blocker process change proposal

2015-11-26 Thread Kamil Paral
> OK. I was just trying to point out that there needs to be about 1-2 day
> period (releng knows best) between the 0 day push and the actual
> announcement. Maybe that's why we usually have 4 days between Go and the
> announcement? I don't know whether there's a releng process for it or not,
> but since QA wants to track 0 day updates more reliably, it would make sense
> to also have a similar process in releng space to ensure the updates are
> ready for everyone when the announcement goes out. I'd like to avoid the
> system-upgrade sad story next time.

I've tried to find out some of the technical details of this. Mirrormanager 
publishes the current hash of repomd.xml, and also hashes of usually up to two 
older repomd.xml files. You can see it here:
https://mirrors.fedoraproject.org/metalink?repo=rawhide&arch=x86_64
It's the  tag in  and  tags.

Here's a nice graph showing how often our mirrors distribute current, or older 
content:
https://adrian.fedorapeople.org/f22-updates-repomd-propagation.svg

The time constraints are defined here:
https://github.com/fedora-infra/mirrormanager2/blob/master/mirrormanager2/lib/model.py#L314
If the current push is older than 2 days, there should be no alternate hashes 
older than 3 days. If the current push is younger, there can be one hash 
arbitrarily old, but no further hashes older than 3 days. I hope I read the 
code correctly (the docstring doesn't seem to match it exactly).
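
To spell out the rule as I read it, here is a small sketch. It is not the mirrormanager2 code; the function name and parameters are made up for illustration, and it assumes that the "one arbitrarily old" hash is the newest of the hashes that are past the 3-day limit.

```python
from datetime import datetime, timedelta


def keep_alternates(now, current_push, alternates,
                    fresh=timedelta(days=2), max_age=timedelta(days=3)):
    """Return the alternate push times that stay in the metalink.

    alternates: push datetimes of older repomd.xml files, newest first.
    """
    recent = [t for t in alternates if now - t <= max_age]
    old = [t for t in alternates if now - t > max_age]
    if now - current_push < fresh and old:
        # Young current push: tolerate exactly one alternate of any age
        # (assumption: the newest of the old ones).
        return recent + [max(old)]
    # Current push older than 2 days: nothing older than 3 days survives.
    return recent
```

With a current push from yesterday and alternates from 2, 5 and 9 days ago, this keeps the 2-day-old and the 5-day-old hash; once the current push is itself older than 2 days, only alternates within 3 days remain.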

However, it also depends on how often the metalink is regenerated; the old items 
will not disappear on their own. I learned that all metalinks are regenerated 
based on any of these fedmsg events:
org.fedoraproject.prod.bodhi.updates.fedora.sync
org.fedoraproject.prod.compose.rawhide.rsync.complete
org.fedoraproject.prod.compose.branched.rsync.complete
So if there is no new push (in any repository), metalinks are not regenerated 
and old hashes are not dropped. Theoretically releng could send out one of 
those fedmsg events artificially to trigger metalink regeneration, if needed.
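
As a tiny illustration of that trigger logic (only the three topic names come from the list above; the function names and the (timestamp, topic) event format are made up for this sketch):

```python
# Topics that, per the list above, cause all metalinks to be regenerated.
REGENERATION_TOPICS = frozenset([
    "org.fedoraproject.prod.bodhi.updates.fedora.sync",
    "org.fedoraproject.prod.compose.rawhide.rsync.complete",
    "org.fedoraproject.prod.compose.branched.rsync.complete",
])


def regenerates_metalinks(topic):
    """True if a message with this topic triggers metalink regeneration."""
    return topic in REGENERATION_TOPICS


def last_regeneration(events):
    """Latest timestamp among (timestamp, topic) events that regenerated
    the metalinks, or None if no such event was seen."""
    times = [ts for ts, topic in events if regenerates_metalinks(topic)]
    return max(times) if times else None
```

If last_regeneration() returns None, or a time before the old hashes aged out, the stale alternates are still being served.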

Currently, there is no way to generate a new metalink with alternative 
hashes disabled, or to remove those alternatives from the metalink intentionally 
at some later point in time. That would require patching our tools. Dropping 
the alternatives would of course lead to a larger load on our master mirror and 
on those mirrors which managed to get synced quickly, because it would 
disqualify any other mirrors which are not completely current. It could come in 
handy in some situations, but the tools do not allow it at the moment.

The second part of the story is dnf. In dnf, the metadata_expire= option defines 
how often metalink is pulled again and new metadata are downloaded if the 
cached metadata hash differs from the current hash in the metalink. However, if 
the top-listed repository is not completely up-to-date (it contains current-1 
or current-2 metadata), but its hash is listed among alternate hashes in the 
metalink, dnf is fine with that and does not attempt to query different repos 
to retrieve the very current metadata. That means that as long as the metalink 
contains some older hashes, and some repository offers that older metadata, 
some users might not get latest metadata. The default value for metadata_expire 
is 6 hours for stable updates.
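
A toy model of that behaviour, as I understand it. This is not dnf's code; the function names are invented and the hashes are just placeholder strings.

```python
# Default metadata_expire for stable updates, per the paragraph above.
SIX_HOURS = 6 * 60 * 60


def cache_expired(cache_age_seconds, metadata_expire=SIX_HOURS):
    """True once the cached metadata is old enough that the metalink
    gets pulled again."""
    return cache_age_seconds >= metadata_expire


def mirror_acceptable(mirror_repomd_hash, current_hash, alternate_hashes):
    """A mirror is accepted if it serves the current repomd.xml or any
    alternate still listed in the metalink."""
    return (mirror_repomd_hash == current_hash
            or mirror_repomd_hash in alternate_hashes)
```

So a mirror serving "current-1" is accepted as long as that hash is listed as an alternate, and rejected as soon as the alternates list is empty.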

So, the outcome of this exercise is:
If we want to be sure the latest updates are available to _all_ our users, we 
need to wait until there are no older metadata hashes in the metalink and then 
6 more hours. There will be no older metadata hashes in the metalink when 3 
days pass since the push of the important update, *once* there is a new push 
after that time (which will regenerate the metalink), or if releng sends out a 
fake event manually.
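
Put as arithmetic, under my reading above (3-day limit, 6-hour metadata_expire), a sketch could look like this. The function name and the list of regeneration times are hypothetical inputs, not something our tools provide today.

```python
from datetime import datetime, timedelta

ALTERNATE_MAX_AGE = timedelta(days=3)   # limit for old hashes in the metalink
METADATA_EXPIRE = timedelta(hours=6)    # dnf default for stable updates


def earliest_safe_announcement(fix_push, regenerations):
    """Earliest time at which all users should see the fixed update.

    fix_push: when the important update was pushed.
    regenerations: datetimes at which the metalink was (or will be)
    regenerated, by a real push or a manually sent event.
    Returns None if no regeneration happens late enough.
    """
    threshold = fix_push + ALTERNATE_MAX_AGE
    late_enough = [t for t in regenerations if t >= threshold]
    if not late_enough:
        return None
    return min(late_enough) + METADATA_EXPIRE
```

For a fix pushed on Monday noon, with the next regeneration after the 3-day mark on Thursday 18:00, the result is Friday 00:00.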

This is, uh, a) quite a long time and b) complex. I'll be very glad if you can 
point out anything that I've described or understood wrong.


Re: Non-image blocker process change proposal

2015-11-25 Thread Adam Williamson
On Fri, 2015-11-20 at 16:18 +0530, Sudhir D wrote:
> 
> > I don't think it's appropriate to turn FEs into blockers automatically,
> > in fact there are obvious cases where it certainly wouldn't be
> > appropriate: bugs in non-blocking desktops are typically taken as FEs,
> > for instance, as are bugs in secondary arches. Neither of those can
> > ever be blockers by policy.
> 
> Ok. We should probably stop calling them as FEs in that case :) and have 
> a mechanism to track them on basis of priority and have them fixed 
> before RC.

So just to be clear on the terms here: in Fedora we have 'blocker' and
'freeze exception' bugs. A 'blocker' bug must be fixed for the release
to go ahead. A 'freeze exception' bug doesn't *have* to be fixed, but
we will accept a fix during the milestone freeze period.

Sudhir explained his broader meaning to me on the phone, and it's a
good point: so long as we don't actually have any process/mechanism for
ensuring that 'special' blockers are fixed on time, calling them
blockers is really misleading, because we aren't really having them
'block' the release. You're certainly right about that. At Monday's
meeting, though, we agreed that we're generally in favour of changing
the release process such that we *do* effectively block the release for
these bugs, so if we do go ahead and implement that, it will still be
correct to refer to them as 'blockers'.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net



Re: Non-image blocker process change proposal

2015-11-25 Thread Adam Williamson
On Wed, 2015-11-18 at 16:36 -0800, Adam Williamson wrote:

> #2 MOAR METADATA
> 
> 
> The alternative is to make the existing Blocker trackers do more work.
> In this model we wouldn't add any new tracker bugs; we'd just add new
> 'magic words' in the Whiteboard field. Right now, an accepted blocker
> is identified by the string 'AcceptedBlocker' appearing in the
> whiteboard field. We could simply add some more magical strings like
> that: 'Accepted0Day' and 'AcceptedStable', say (better suggestions
> welcome).
> 
> I kind of like this idea as it's less change and involves creating
> fewer new bugs. We'd have to make some changes to blockerbugs either
> way - tflink can say if either approach would be more work in
> blockerbugs, but I'm gonna guess they'd be fairly similar.

Hi again folks!

So it sounds like this option was preferred by everyone who expressed a
preference, and it's my choice too, so I figure we should just go for
it.

I think we still have some more research/discussion/co-ordination to do
before we can propose changes to the release process (especially the
go/no-go process) to 'enforce' special blockers, but I think we can go
ahead and implement the *tracking* side of the changes now. So I'm
gonna propose that we add these new terms for the Whiteboard field:

Accepted0Day (for bugs where the fix must appear in 0-day updates for
the new release)
AcceptedStable (for bugs where the fix must appear in updates for the
previous stable release(s) by release day of the new release)

I'm not 100% married to either of those, especially the second. If
anyone has a better idea, please send it!
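
For the tooling side, picking these out of the Whiteboard field could look roughly like the sketch below. It is not blockerbugs code; the function name is made up, and it assumes the whiteboard is a whitespace-separated list of words.

```python
# Keyword -> what it means, using the terms proposed above.
TRACKING_KEYWORDS = {
    "AcceptedBlocker": "fix must be in the release itself",
    "Accepted0Day": "fix must be in 0-day updates for the new release",
    "AcceptedStable": "fix must be in updates for previous stable release(s)",
}


def classify_whiteboard(whiteboard):
    """Return the tracking keywords present in a whiteboard string,
    in the order they are defined in TRACKING_KEYWORDS."""
    words = whiteboard.split()
    return [kw for kw in TRACKING_KEYWORDS if kw in words]
```

Matching whole words rather than substrings keeps a term like 'AcceptedStable' from matching inside some longer, unrelated whiteboard entry.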

Once we decide on the terms, the next step would be to edit the blocker
SOPs:

https://fedoraproject.org/wiki/QA:SOP_Blocker_Bug_Meeting
https://fedoraproject.org/wiki/QA:SOP_blocker_bug_process

- the changes shouldn't be too onerous - and to update blockerbugs for
the new world order. I know tflink has a lot on his plate, so I might
take a cut at that to try and save him the work.

Comments, thoughts, questions as always - thanks folks!
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net



Re: Non-image blocker process change proposal

2015-11-23 Thread Kamil Paral
> >> Well, here's our latest mess-up:
> >> https://bodhi.fedoraproject.org/updates/FEDORA-2015-e00b75e39f
> >> dnf-plugin-system-upgrade-0.7.0-1.fc22 had enough karma for stable
> >> on
> > Oct 29, which was Go/No-Go day. Therefore it was considered "resolved".
> >
> > "Had enough karma" != queued for stable. When I say "queued for
> > stable", I mean that it needs to be "submitted for stable" and
> > awaiting a push (if not already pushed). According to the history on
> > that bug, it was not actually submitted for stable until November 2nd.
> > That would have failed my criterion above, since that was after Go/No-Go.

Hmm, however, that update had karma autopush set, and its threshold was
reached. So what exactly is the difference between a maintainer requesting a
push and Bodhi's autokarma requesting one? I assume they should behave exactly
the same. Maybe the autokarma push was not triggered, or was delayed for some
reason? We need to find the difference and define that condition more
precisely, or fix a bug somewhere; otherwise it's quite likely we'll have a
similar problem in the future.

> 
> Yup, I think "queued for stable" is the right thing to require here.
> Releng always does a 0 day push; we just need to ensure during the
> blocker review process that things that need to be included in that push
> are actually queued for stable.
> 
> That should be enough for all practical purposes. I mean, releng's 0 day
> push may fail of course or take longer than expected, but I don't think
> that needs to be tracked with the blocker review process. Releng is
> going to be painfully aware if their pushes are failing anyway and
> working as fast as they can to fix them.

OK. I was just trying to point out that there needs to be a period of about
1-2 days (releng knows best) between the 0 day push and the actual
announcement. Maybe that's why we usually have 4 days between Go and the
announcement? I don't know whether there's a releng process for it or not, but
since QA wants to track 0 day updates more reliably, it would make sense to
also have a similar process in releng space to ensure the updates are ready for
everyone when the announcement goes out. I'd like to avoid a repeat of the
system-upgrade sad story.


Re: Non-image blocker process change proposal

2015-11-23 Thread Tim Flink
On Wed, 18 Nov 2015 16:36:16 -0800
Adam Williamson wrote:



> #2 MOAR METADATA
> 
> 
> The alternative is to make the existing Blocker trackers do more work.
> In this model we wouldn't add any new tracker bugs; we'd just add new
> 'magic words' in the Whiteboard field. Right now, an accepted blocker
> is identified by the string 'AcceptedBlocker' appearing in the
> whiteboard field. We could simply add some more magical strings like
> that: 'Accepted0Day' and 'AcceptedStable', say (better suggestions
> welcome).
> 
> I kind of like this idea as it's less change and involves creating
> fewer new bugs. We'd have to make some changes to blockerbugs either
> way - tflink can say if either approach would be more work in
> blockerbugs, but I'm gonna guess they'd be fairly similar.

Yeah, I think that either approach would involve a similar amount of
effort to change blockerbugs. #2 would be slightly less effort but
it's not much.

Once the approach is decided, we can file RFEs for blockerbugs and get
that work landed before F24 alpha.

Tim



Re: Non-image blocker process change proposal

2015-11-20 Thread Stephen Gallagher

On 11/20/2015 07:16 AM, Kamil Paral wrote:
>> The biggest issue is this, I think. We probably need to encode 
>> "Special Blockers" into the Go/No-Go process. I don't think that 
>> assurance that it will be fixed on time is necessarily good
>> enough. Particularly given the time that it takes stable updates
>> to make it to the mirrors, I'd say that we probably want to say
>> that any such special blockers have to be queued for stable
>> before the Go/No-Go decision is made. (This may in some cases
>> mean *during* the Go/No-Go meeting, of course.)
> 
> Well, here's our latest mess-up: 
> https://bodhi.fedoraproject.org/updates/FEDORA-2015-e00b75e39f 
> dnf-plugin-system-upgrade-0.7.0-1.fc22 had enough karma for stable on
> Oct 29, which was Go/No-Go day. Therefore it was considered "resolved".

"Had enough karma" != queued for stable. When I say "queued for
stable", I mean that it needs to be "submitted for stable" and
awaiting a push (if not already pushed). According to the history on
that bug, it was not actually submitted for stable until November 2nd.
That would have failed my criterion above, since that was after Go/No-Go.
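
The distinction drawn here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout and the field names `karma_threshold_reached` and `submitted_for_stable` are hypothetical and are not a confirmed Bodhi schema. The dates are the ones given in this thread.

```python
from datetime import date


def queued_for_stable(update, go_no_go):
    """True if the update was submitted for stable on or before Go/No-Go day.

    Reaching the karma threshold alone does not count.
    """
    submitted = update.get("submitted_for_stable")  # hypothetical field name
    return submitted is not None and submitted <= go_no_go


go_no_go = date(2015, 10, 29)
system_upgrade = {
    "nvr": "dnf-plugin-system-upgrade-0.7.0-1.fc22",
    "karma_threshold_reached": date(2015, 10, 29),  # hypothetical field name
    "submitted_for_stable": date(2015, 11, 2),
}

print(queued_for_stable(system_upgrade, go_no_go))  # False
```

The check deliberately ignores the karma date, so an update that merely "had enough karma" on Go/No-Go day fails it.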

Re: Non-image blocker process change proposal

2015-11-20 Thread Kamil Paral
> > My suggestion would be that we make sure 'blockerbugs' includes
> > lists of each type of blocker. Ahead of and at Go/No-Go meetings,
> > we would want to have a formal assurance from the person
> > responsible for fixing the bug that the fix would be provided by a
> > certain time - say, one day or two days ahead of the release date -
> > and it would be QA's responsibility to ensure the updates are
> > tested promptly, and releng's responsibility to ensure they are
> > pushed on time after being tested. I would suggest the Program
> > Manager ought to have overall responsibility for keeping an eye on
> > the 0Day and Stable blocker lists and making sure the maintainer,
> > QA, and releng all did their jobs on time.
> 
> The biggest issue is this, I think. We probably need to encode
> "Special Blockers" into the Go/No-Go process. I don't think that
> assurance that it will be fixed on time is necessarily good enough.
> Particularly given the time that it takes stable updates to make it to
> the mirrors, I'd say that we probably want to say that any such
> special blockers have to be queued for stable before the Go/No-Go
> decision is made. (This may in some cases mean *during* the Go/No-Go
> meeting, of course.)

Well, here's our latest mess-up:
https://bodhi.fedoraproject.org/updates/FEDORA-2015-e00b75e39f
dnf-plugin-system-upgrade-0.7.0-1.fc22 had enough karma for stable on Oct 29, 
which was Go/No-Go day. Therefore it was considered "resolved". However, it was 
pushed to testing on Nov 2 (4 days later) and to stable on Nov 5 (5 days 
later!), which was the public release day. Since mirrormanager is configured to 
serve even last-but-one metadata (i.e. even 1-2 days old, relengs can provide a 
more precise value), many of our users upgraded on Nov 5 and Nov 6 using an 
older version of system-upgrade which broke their systems. Just read the 
comments:
https://fedoramagazine.org/upgrading-from-fedora-22-to-fedora-23/#comments
I was very unhappy. We solved most of the issues, it was a lot of work, and yet 
a large group of people was hit by those old, long-resolved problems, just 
because of bad timing and slow repo pushes (for whatever reason).

So, that update was "queued for stable before the Go/No-Go" as you proposed, 
and yet we have failed to deliver it. So if we really want to avoid such 
problems in the future, we either need to insist on "pushed to stable by 
Go/No-Go, no exceptions", or we need to have another check on release day and 
verify that all required builds were pushed to stable at least 2 days before -- 
if not, do not announce the release and wait for more days. The first approach 
is slightly impractical (we don't want to wait another week, it might be 
resolved in 2 more days; do we lift final freeze or not?), the second approach 
is confusing for media (media announce we're Go, and then nothing happens on 
the proclaimed release day).
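
The second option, a release-day check, could look roughly like the sketch below. It assumes a 2-day buffer (the real value is something releng would have to supply), and the function name and input mapping are made up for illustration. The example uses the dates as stated above.

```python
from datetime import date, timedelta

# Assumed buffer for mirrors to pick up new metadata; releng knows the real value.
MIRROR_BUFFER = timedelta(days=2)


def late_builds(stable_push_dates, announcement_day, buffer=MIRROR_BUFFER):
    """Return builds not pushed to stable at least `buffer` before the announcement.

    `stable_push_dates` maps a build NVR to the date it reached stable,
    or None if it has not been pushed yet.
    """
    cutoff = announcement_day - buffer
    return sorted(
        nvr
        for nvr, pushed in stable_push_dates.items()
        if pushed is None or pushed > cutoff
    )


pushes = {"dnf-plugin-system-upgrade-0.7.0-1.fc22": date(2015, 11, 5)}
print(late_builds(pushes, date(2015, 11, 5)))
# ['dnf-plugin-system-upgrade-0.7.0-1.fc22']
```

An empty list would mean the announcement can go out; anything else means waiting.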

What I see as a potential solution here is decoupling tasks that need to wait 
for the 0day blockers and those which don't. So, at the Go/No-Go meeting, we 
can decide that it is No-Go in general, but composes are final now and can be 
uploaded to proper locations for mirrors to pick them up. I don't know exactly 
what else relengs need to do, but I guess there will be other tasks that can be 
done. And in 2-3 days, we can have Go/No-Go again, where we decide that even 
0day blockers have been addressed, pushed to stable, and we can pronounce the
whole release Go, and publish the announcement immediately or the next day or
whatever's appropriate (bearing in mind that there should be a 2-day period
after the 0day blockers are pushed stable).

WDYT? Reasonable? Complicated? Bonkers? Off the mark?

Re: Non-image blocker process change proposal

2015-11-20 Thread Kamil Paral
> #2 MOAR METADATA
> 
> 
> The alternative is to make the existing Blocker trackers do more work.
> In this model we wouldn't add any new tracker bugs; we'd just add new
> 'magic words' in the Whiteboard field. Right now, an accepted blocker
> is identified by the string 'AcceptedBlocker' appearing in the
> whiteboard field. We could simply add some more magical strings like
> that: 'Accepted0Day' and 'AcceptedStable', say (better suggestions
> welcome).

I'd use this approach and distinguish by whiteboard metadata. It's easier for
people to propose (you need to remember just 2 different tracker aliases), and
it should not require too many blockerbugs modifications (the Propose page can
stay exactly the same, and in the list view we'd just add some column or tag
to distinguish different blocker types).
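
To make the whiteboard idea concrete, here is a minimal sketch of how a tool might tell the blocker types apart. The three tag strings are the ones suggested in the proposal (and still open to renaming); the function name and the short labels it returns are hypothetical and do not reflect how blockerbugs actually works.

```python
# Tag strings from the proposal; the labels on the right are illustrative only.
ACCEPTED_TAGS = {
    "AcceptedBlocker": "blocker",
    "Accepted0Day": "0day",
    "AcceptedStable": "stable",
}


def blocker_types(whiteboard):
    """Return the set of accepted blocker types named in a whiteboard string."""
    words = whiteboard.replace(",", " ").split()
    return {ACCEPTED_TAGS[word] for word in words if word in ACCEPTED_TAGS}


print(sorted(blocker_types("AcceptedBlocker")))               # ['blocker']
print(sorted(blocker_types("RejectedBlocker Accepted0Day")))  # ['0day']
```

Matching whole words rather than substrings keeps an unrelated tag that merely contains one of these strings from being counted.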

Re: Non-image blocker process change proposal

2015-11-20 Thread Sudhir D



On 11/19/2015 11:40 PM, Adam Williamson wrote:
> On Thu, 2015-11-19 at 19:09 +0530, Sudhir D wrote:
> > I suggest we have only one ZeroDay i.e., for Final and do away with
> > intermediate ones.
> > As I see it, ZeroDay comes with cost and we also need to have basic
> > sanity testcases automated to ensure ZeroDay fixes won't
> > introduce/regress blocker.
> >
> > How about automatically qualifying any freeze exception in current phase
> > as blocker for next phase and keep 0day only for RC?
> > AlphaBlocker --> AlphaFreezeException --> BetaBlocker -->
> > BetaFreezeException --> FinalBlocker --> FinalBlockerException --> ZeroDay
> >
> > This would mean we will be not so liberal in allowing blockers linger
> > around in a phase for more time, but I think that is okay tradeoff.
> >
> > From tracking perspective, I think we may just want to have trackers
> > for phaseBlocker for each milestone and FinalBlocker and 0Day for Final
> > along with backPortfix tracker one for the pending release, and one for
> > previous stable releases.
>
> Well, the thing is, the criteria are organized by milestone, and we hit
> this situation quite often at Beta: the upgrade criteria kick in at
> Beta, for instance. So if upgrade from F23 to F24 Beta is completely
> broken, but the fix has to go out as an F23 update, we should really be
> tracking that to make sure it does. If we only make sure the fix goes
> out by Final, are we really honouring the criteria properly?


If a blocker bug breaks phase criteria, then there is no phase exit
unless the bug is fixed. The exception is when we are ready to take the risk,
as might happen in certain cases earlier in the cycle, but such instances
should be zero once we are in Beta. That way, we would still be honoring the
phase exit criteria. By definition, BlockerExceptions should not contain any
phase exit criteria bugs; they can be related to an important feature
which is partially broken. For the Final phase though, all identified
blockers and BlockerExceptions that were carried over from earlier phases
would be fixed before GA, and if there is any exception in this phase out of
that list (after risk assessment), we can consider it for 0day.




> I don't think it's appropriate to turn FEs into blockers automatically,
> in fact there are obvious cases where it certainly wouldn't be
> appropriate: bugs in non-blocking desktops are typically taken as FEs,
> for instance, as are bugs in secondary arches. Neither of those can
> ever be blockers by policy.


Ok. We should probably stop calling them FEs in that case :) and have
a mechanism to track them on the basis of priority and have them fixed
before RC.


Regards,
Sudhir



Re: Non-image blocker process change proposal

2015-11-20 Thread Petr Schindler
Stephen Gallagher wrote on Thu, 19 Nov 2015 at 15:50 -0500:
> On 11/18/2015 07:36 PM, Adam Williamson wrote:
> 
> > With either approach, the basic goal is to make it more feasible
> > to keep an eye on each of the different categories of 'release
> > blocker' bugs; right now we have solid processes in place for
> > ensuring the 'normal' blockers are all addressed in the release
> > media, but we don't have any processes in place for ensuring 0Day
> > and Stable bugs actually get updates shipped when we say they
> > must.
> > 
> > My suggestion would be that we make sure 'blockerbugs' includes
> > lists of each type of blocker. Ahead of and at Go/No-Go meetings,
> > we would want to have a formal assurance from the person
> > responsible for fixing the bug that the fix would be provided by a
> > certain time - say, one day or two days ahead of the release date -
> > and it would be QA's responsibility to ensure the updates are
> > tested promptly, and releng's responsibility to ensure they are
> > pushed on time after being tested. I would suggest the Program
> > Manager ought to have overall responsibility for keeping an eye on
> > the 0Day and Stable blocker lists and making sure the maintainer,
> > QA, and releng all did their jobs on time.
> 
> The biggest issue is this, I think. We probably need to encode
> "Special Blockers" into the Go/No-Go process. I don't think that
> assurance that it will be fixed on time is necessarily good enough.
> Particularly given the time that it takes stable updates to make it to
> the mirrors, I'd say that we probably want to say that any such
> special blockers have to be queued for stable before the Go/No-Go
> decision is made. (This may in some cases mean *during* the Go/No-Go
> meeting, of course.)

+1. I think that even 0day bugs should block the release. If the
update for a 0day bug became available only after Go/No-Go and we then
found that it was buggy (it doesn't fix the bug properly, or it creates
other problems), we would end up with an unresolved blocker at release time.
Maybe we could add some rule for delaying the release in such cases
(something like a big red button - stop release now!).

Of the two proposed solutions, I like the second more: adding more
magic words to the Whiteboard. As I said, I think that 0day
blockers should still block the release. But I agree that we should
treat them differently, so we should track them somehow.


Re: Non-image blocker process change proposal

2015-11-20 Thread Kalev Lember

On 11/20/2015 03:56 PM, Stephen Gallagher wrote:
> On 11/20/2015 07:16 AM, Kamil Paral wrote:
> >> The biggest issue is this, I think. We probably need to encode
> >> "Special Blockers" into the Go/No-Go process. I don't think that
> >> assurance that it will be fixed on time is necessarily good
> >> enough. Particularly given the time that it takes stable updates
> >> to make it to the mirrors, I'd say that we probably want to say
> >> that any such special blockers have to be queued for stable
> >> before the Go/No-Go decision is made. (This may in some cases
> >> mean *during* the Go/No-Go meeting, of course.)
> >
> > Well, here's our latest mess-up:
> > https://bodhi.fedoraproject.org/updates/FEDORA-2015-e00b75e39f
> > dnf-plugin-system-upgrade-0.7.0-1.fc22 had enough karma for stable on
> > Oct 29, which was Go/No-Go day. Therefore it was considered "resolved".
>
> "Had enough karma" != queued for stable. When I say "queued for
> stable", I mean that it needs to be "submitted for stable" and
> awaiting a push (if not already pushed). According to the history on
> that bug, it was not actually submitted for stable until November 2nd.
> That would have failed my criterion above, since that was after Go/No-Go.


Yup, I think "queued for stable" is the right thing to require here.
Releng always does a 0 day push; we just need to ensure during the
blocker review process that things that need to be included in that push
are actually queued for stable.

That should be enough for all practical purposes. I mean, releng's 0 day
push may fail of course or take longer than expected, but I don't think
that needs to be tracked with the blocker review process. Releng is
going to be painfully aware if their pushes are failing anyway and
working as fast as they can to fix them.

--
Kalev

Re: Non-image blocker process change proposal

2015-11-19 Thread Adam Williamson
On Thu, 2015-11-19 at 19:09 +0530, Sudhir D wrote:
> 
> I suggest we have only one ZeroDay i.e., for Final and do away with 
> intermediate ones.
> As I see it, ZeroDay comes with cost and we also need to have basic 
> sanity testcases automated to ensure ZeroDay fixes won't 
> introduce/regress blocker.
> 
> How about automatically qualifying any freeze exception in current phase 
> as blocker for next phase and keep 0day only for RC?
> AlphaBlocker --> AlphaFreezeException --> BetaBlocker --> 
> BetaFreezeException --> FinalBlocker --> FinalBlockerException --> ZeroDay
> 
> This would mean we will be not so liberal in allowing blockers linger 
> around in a phase for more time, but I think that is okay tradeoff.
> 
>  From tracking perspective, I think we may just want to have trackers 
> for phaseBlocker for each milestone and FinalBlocker and 0Day for Final 
> along with backPortfix tracker one for the pending release, and one for 
> previous stable releases.

Well, the thing is, the criteria are organized by milestone, and we hit
this situation quite often at Beta: the upgrade criteria kick in at
Beta, for instance. So if upgrade from F23 to F24 Beta is completely
broken, but the fix has to go out as an F23 update, we should really be
tracking that to make sure it does. If we only make sure the fix goes
out by Final, are we really honouring the criteria properly?

I don't think it's appropriate to turn FEs into blockers automatically,
in fact there are obvious cases where it certainly wouldn't be
appropriate: bugs in non-blocking desktops are typically taken as FEs,
for instance, as are bugs in secondary arches. Neither of those can
ever be blockers by policy.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net



Re: Non-image blocker process change proposal

2015-11-19 Thread Stephen Gallagher

On 11/18/2015 07:36 PM, Adam Williamson wrote:

> With either approach, the basic goal is to make it more feasible
> to keep an eye on each of the different categories of 'release
> blocker' bugs; right now we have solid processes in place for
> ensuring the 'normal' blockers are all addressed in the release
> media, but we don't have any processes in place for ensuring 0Day
> and Stable bugs actually get updates shipped when we say they
> must.
> 
> My suggestion would be that we make sure 'blockerbugs' includes
> lists of each type of blocker. Ahead of and at Go/No-Go meetings,
> we would want to have a formal assurance from the person
> responsible for fixing the bug that the fix would be provided by a
> certain time - say, one day or two days ahead of the release date -
> and it would be QA's responsibility to ensure the updates are
> tested promptly, and releng's responsibility to ensure they are
> pushed on time after being tested. I would suggest the Program
> Manager ought to have overall responsibility for keeping an eye on
> the 0Day and Stable blocker lists and making sure the maintainer,
> QA, and releng all did their jobs on time.

The biggest issue is this, I think. We probably need to encode
"Special Blockers" into the Go/No-Go process. I don't think that
assurance that it will be fixed on time is necessarily good enough.
Particularly given the time that it takes stable updates to make it to
the mirrors, I'd say that we probably want to say that any such
special blockers have to be queued for stable before the Go/No-Go
decision is made. (This may in some cases mean *during* the Go/No-Go
meeting, of course.)

Re: Non-image blocker process change proposal

2015-11-19 Thread Sudhir D



On 11/19/2015 06:06 AM, Adam Williamson wrote:

Hi, folks! It's been a recurring issue in the blocker review / release
validation process in recent times that we run across bugs that qualify
as blockers, but for which the fix does not need to be in the final
frozen media or install trees.

Common cases are bugs related to upgrading, especially since the
introduction of fedup and even more so of dnf-system-upgrade; most
upgrade-related issues can now be sufficiently fixed by package updates
to either the source or target release. Bugs to do with writing USB
media from the previous release, for instance, also often fall in this
category.

Up until now we've been sort of handwaving these as 'special blockers',
following the regular blocker process but noting in comments that they
don't block the compose. We haven't been tracking very hard if they
actually *are* being fixed with updates promptly, we've just been sort
of waving a magic wand and assuming it will happen. I just found one
which was supposed to be fixed with a 0-day update for Beta, but hadn't
been fixed yet: https://bugzilla.redhat.com/show_bug.cgi?id=1263230

So, we kinda need to do this better.

Up top I'd like to note there are really kind of two buckets of
'special blockers' for any given release. If the release being
validated is N, they are:

1) Bugs for which the fix must be in the 0-day update set for N
2) Bugs for which the fix must be stable in N-1 and N-2 by N release day

There will be a lot of process nerd detail involved in any fix, but
before any detailed drafts I'd like to suggest two broad possible
approaches and see what people think:

#1 Separate trackers


As a sort of on-the-spot PoC for F23 Beta, I created a new tracker bug
for bucket 1: https://bugzilla.redhat.com/show_bug.cgi?id=1264167

We could formalize that approach, and have a '0-day' blocker tracker
for each milestone. We could either just have one '0-day' tracker and
throw bugs for both the pending release and previous stable releases on
the same tracker and keep track of what needs updating where in each
bug, or we could have two 0-day trackers for each milestone, one for
the pending release, and one for previous stable releases.

So we'd have something like:

F24AlphaBlocker
F24AlphaFreezeException
F24Alpha0Day
(F24AlphaStable) (? - better alias suggestions welcome)

and so on. This is a lot of bugs, but there is a script to create them,
so we're not adding a bunch of onerous work.
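
As a rough illustration of what the tracker set would look like under this option, the sketch below generates the alias names. It is not the existing creation script: the function name is invented, and the "Stable" suffix is only the tentative name floated above.

```python
# Suffixes per milestone; "Stable" is tentative, per the proposal above.
SUFFIXES = ("Blocker", "FreezeException", "0Day", "Stable")


def tracker_aliases(release, milestones=("Alpha", "Beta", "Final")):
    """Return the tracker bug aliases for every milestone of a release."""
    return [
        f"F{release}{milestone}{suffix}"
        for milestone in milestones
        for suffix in SUFFIXES
    ]


print(tracker_aliases(24, ("Alpha",)))
# ['F24AlphaBlocker', 'F24AlphaFreezeException', 'F24Alpha0Day', 'F24AlphaStable']
```

For a full cycle this yields twelve trackers per release, compared with six today.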

So far as proposing bugs goes, I think we'd probably want to extend the
current somewhat flexible approach; formally you would nominate a bug
as a particular type of blocker/FE (by marking it as blocking the
appropriate tracker), but we would move things around in blocker review
meetings sensibly, as we currently do (when something is nominated as
FE but should really be a blocker, or vice versa). In blocker review
we'd go through all bugs nominated for any of the trackers, probably
starting with 'Blocker', then '0Day' and 'Stable', then
'FreezeException'.

#2 MOAR METADATA


The alternative is to make the existing Blocker trackers do more work.
In this model we wouldn't add any new tracker bugs; we'd just add new
'magic words' in the Whiteboard field. Right now, an accepted blocker
is identified by the string 'AcceptedBlocker' appearing in the
whiteboard field. We could simply add some more magical strings like
that: 'Accepted0Day' and 'AcceptedStable', say (better suggestions
welcome).

I kind of like this idea as it's less change and involves creating
fewer new bugs. We'd have to make some changes to blockerbugs either
way - tflink can say if either approach would be more work in
blockerbugs, but I'm gonna guess they'd be fairly similar.

With either approach, the basic goal is to make it more feasible to
keep an eye on each of the different categories of 'release blocker'
bugs; right now we have solid processes in place for ensuring the
'normal' blockers are all addressed in the release media, but we don't
have any processes in place for ensuring 0Day and Stable bugs actually
get updates shipped when we say they must.

My suggestion would be that we make sure 'blockerbugs' includes lists
of each type of blocker. Ahead of and at Go/No-Go meetings, we would
want to have a formal assurance from the person responsible for fixing
the bug that the fix would be provided by a certain time - say, one day
or two days ahead of the release date - and it would be QA's
responsibility to ensure the updates are tested promptly, and releng's
responsibility to ensure they are pushed on time after being tested. I
would suggest the Program Manager ought to have overall responsibility
for keeping an eye on the 0Day and Stable blocker lists and making sure
the maintainer, QA, and releng all did their jobs on time.

It'd be great if folks could post their general thoughts on this, and
any preference for option 1 or option 2. Thanks!


I suggest we have only one