Hi all, 

We would like to kick off the Yocto SWAT team this week. Please see the 
following for the purpose of the SWAT team and let me know if you have any 
questions or concerns. We welcome any community participation on the SWAT team. 
At the same time, I will work with the team to make sure things get started.

Thanks, 
Song

YOCTO SWAT TEAM

GOAL

The Yocto Project SWAT team is being assembled mainly to tackle, in a timely 
manner, urgent technical problems that break the build on the master branch or 
major release branches, and so to maintain the stability of the master and 
release branches. The SWAT team includes volunteers and appointed members of 
the Yocto Project team. Community members can also volunteer to be part of the 
SWAT team.

SCOPE OF RESPONSIBILITY

Whenever a build (nightly build, weekly build, release build) fails, the SWAT 
team is responsible for ensuring the necessary debugging occurs and organizing 
resources to solve the issue and ensure successful builds. If resolving the 
issues requires schedule or resource adjustment, the SWAT team should work with 
program and development management to accommodate the change in the overall 
planning. 

MEMBERS:

* Darren Hart (US)
* Elizabeth Flanagan (US)
* Paul Eggleton (UK)
* Jessica Zhang (US)
* Dexuan Cui (CN)
* Saul Wold (US)
* Richard Purdie (UK)

ROTATING CHAIR:

The Chairperson role will rotate among team members each week. The 
Chairperson should monitor the build status for the entire week. Whenever a 
build is broken, the Chairperson should do the necessary debugging and organize 
resources to solve the problems in a timely manner to meet the overall project 
and release schedule. The Chairperson serves as the focal point of the SWAT 
team to external people such as program managers or development managers. 

ROTATING PROCESS

Each week on a specific day (Monday is proposed), a SWAT team meeting can be 
called at the Chairperson's discretion to discuss current issues and status. 
Either during the meeting or offline, the previous week's Chairperson will 
identify the next Chairperson and pass the role to that person. The program 
manager should be notified at the same time. Usually, this will follow a simple 
round-robin order. If the next person cannot take the role because of a tight 
schedule, vacation, or some other reason, the role will be passed to the person 
after them.
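
The round-robin hand-off described above can be sketched in a few lines of 
Python. This is only an illustration, not an agreed tool: the function name 
next_chair, the MEMBERS constant, and the idea of passing a set of unavailable 
people are all assumptions, and the order used is simply the order of the 
member list in this mail, which the process itself does not mandate.

```python
from typing import Collection, Sequence

# Assumption: rotation order follows the member list as written in this mail.
MEMBERS = [
    "Darren Hart",
    "Elizabeth Flanagan",
    "Paul Eggleton",
    "Jessica Zhang",
    "Dexuan Cui",
    "Saul Wold",
    "Richard Purdie",
]


def next_chair(
    members: Sequence[str],
    current: str,
    unavailable: Collection[str] = (),
) -> str:
    """Return the next available member after `current`, wrapping around."""
    start = members.index(current)
    count = len(members)
    # Walk the list once, starting just after the current Chairperson.
    # The final step lands on `current` again, so the current Chairperson
    # keeps the role if nobody else is available.
    for offset in range(1, count + 1):
        candidate = members[(start + offset) % count]
        if candidate not in unavailable:
            return candidate
    raise ValueError("no team member is available to take the chair")


if __name__ == "__main__":
    # The next person in the list is away, so the role skips to the one after.
    print(next_chair(MEMBERS, "Darren Hart", unavailable={"Elizabeth Flanagan"}))
```

Skipping unavailable people inside the same loop keeps the order stable: the 
person who was skipped is simply next in line again on the following rotation 
past them.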

The current Chairperson's full name and email address will be published on the 
project status wiki page, 
https://wiki.yoctoproject.org/wiki/Yocto_Project_v1.2_Status, under the 
"Current SWAT team Chairperson" section.

BEST KNOWN METHODS (RICHARD PURDIE)

When looking at a failure, the first question is what the baseline was and what 
changed. If there were recent known-good builds, they help narrow down the 
changes that were likely responsible for the failure. It's also useful to note 
whether the build was from scratch or from existing sstate files. You can tell 
by seeing which "setscene" tasks run in the log.
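
As a rough aid for that last check, a short Python sketch can pull out the log 
lines that mention setscene tasks. The function name setscene_lines and the 
sample text are made up for illustration; the sample is deliberately simplified 
and is not real autobuilder output, so the only thing relied on here is that 
setscene task names contain the string "setscene".

```python
from typing import List


def setscene_lines(log_text: str) -> List[str]:
    """Return every log line that mentions a setscene task."""
    return [line for line in log_text.splitlines() if "setscene" in line]


# Simplified, invented sample; real build logs have a different layout.
SAMPLE_LOG = """\
task do_fetch started
task do_populate_sysroot_setscene started
task do_compile started
"""

if __name__ == "__main__":
    hits = setscene_lines(SAMPLE_LOG)
    if hits:
        print("build reused existing sstate for %d task line(s)" % len(hits))
    else:
        print("no setscene tasks seen; the build looks like it was from scratch")
```

A log with no matching lines suggests a from-scratch build, while many matches 
suggest the build leaned on existing sstate files.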

The primary responsibility is to ensure that any failures are categorized 
correctly and that the right people get to know about them.

It's important that *someone* is then tasked with fixing it. Image failures are 
particularly tricky, since it's likely that some component of the image failed, 
and the question is then whether that component changed recently, whether some 
kind of core functionality was at fault, and so on.

Ideally we want to get the failure reported to the person who knows something 
about the area and can come up with a fix without it distracting them too much.

As a secondary responsibility, it's often helpful to triage the failure. This 
might mean documenting a way to reproduce the failure outside a full build 
and/or documenting how the failure is happening, and maybe even proposing a fix.

Sometimes failures are difficult to understand and can require direct ssh 
access to the autobuilder so the issue can be debugged passively on the system, 
examining the contents of files and so forth. If doing this, ensure you don't 
change anything on the file system, for example by adding files that the 
autobuilder then couldn't delete when it rebuilds.

Rarely, "live" debugging might be needed, where you'd su to the pokybuild user 
and run a build manually to see the failure in real time. If doing this, ensure 
you only create files as the pokybuild user and are careful not to generate 
sstate packages that shouldn't be present or any other bad state that might 
get reused. In general, it's recommended not to do "live" debugging. This can 
be escalated to RP/Saul/Beth if needed.

To fulfill the primary responsibility, it's suggested that a bug is opened in 
Bugzilla for each type of failure. This way, the appropriate people can be 
brought into the discussion and a specific owner of the failure can be 
assigned. Replying to the build failure mail with the bug ID and bringing the 
bug to the attention of anyone you suspect was responsible for the problem are 
also good practices. 

_______________________________________________
yocto mailing list
yocto@yoctoproject.org
https://lists.yoctoproject.org/listinfo/yocto
