+1 on this in general

A comment regarding "make test": required Python packages (e.g. scapy)
are installed in a virtualenv, which is the Python way of having your
own controlled playground. Whatever Python packages are installed
system-wide, "make test" ignores them. This lets us pin the exact
version in test/Makefile (as we do for scapy: we apply our patches
every time we install it, so we don't want an upstream change to break
our tests). So at least for the Python parts of "make test", this
concern can be more or less ignored.
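
To make the isolation concrete, here is a minimal sketch. It uses the
standard-library venv module rather than the virtualenv tool that
"make test" relies on, so treat it as an illustration of the idea, not
of what test/Makefile does. The helper names are made up, and the
version string is a placeholder, not the version we actually pin.

```python
import os
import tempfile
import venv


def create_isolated_env(env_dir):
    """Create an environment that does not see system site-packages."""
    builder = venv.EnvBuilder(
        system_site_packages=False,   # ignore whatever the OS has installed
        with_pip=False,               # keep the sketch offline and fast
        symlinks=(os.name != "nt"),
    )
    builder.create(env_dir)
    return env_dir


def read_venv_config(env_dir):
    """Parse pyvenv.cfg, which records how the environment was set up."""
    config = {}
    with open(os.path.join(env_dir, "pyvenv.cfg")) as handle:
        for line in handle:
            key, sep, value = line.partition("=")
            if sep:
                config[key.strip()] = value.strip()
    return config


def pinned_requirement(name, version):
    """Build an exact-version requirement string for pip."""
    return f"{name}=={version}"


def demo():
    """Create a throwaway environment and return its configuration."""
    with tempfile.TemporaryDirectory() as env_dir:
        create_isolated_env(env_dir)
        return read_venv_config(env_dir)


if __name__ == "__main__":
    print(demo()["include-system-site-packages"])
    # "X.Y.Z" is a placeholder version, not the one in test/Makefile.
    print(pinned_requirement("scapy", "X.Y.Z"))
```

Because the requirement is an exact pin, an upstream release does not
reach the environment until someone changes the pin on purpose.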

Thanks,
Klement

Quoting Dave Wallace (2017-01-20 04:38:33)
>    Ed, Thanh, Vanessa,
> 
>    IMHO, updating the ubuntu packages every time a VM is spun up is a bug
>    wrt. being able to reproduce some (hopefully rare) build/test issues. 
>    Since every VM is potentially running with different versions of OS
>    components, when a failure occurs (e.g. in "make test"), then it may be
>    necessary to recreate the exact run-time environment in order to reproduce
>    the failure. Unless the complete package list is being archived for every
>    VM instance that is spun up, this may not be possible. 
> 
>    My experience is that in those rare cases where a tool or environment issue
>    causes a failure, the cost to find the issue is extraordinarily high if
>    you do not have the ability to recreate the EXACT build/run-time
>    environment.  This is why CSIT does not update OS components in the VM
>    initialization scripts and the VM images are built from a specific package
>    list instead of pulling the latest versions from the apt repositories.
> 
>    My recommendation is that the VM images be updated periodically (weekly or
>    whenever a new security update is released) and the package lists archived
>    for each VM image version.  Each VM image should also be verified against
>    a known good VPP commit version as is done with CSIT branches.  Ideally we
>    should build a fully automated continuous deployment model to reduce the
>    amount of work to update the VM images to running a Jenkins job to
>    build/test/deploy a new VM image from the latest package versions.
> 
>    With that automation in place, this mechanism could be extended for use by
>    CSIT as well as "make test", thus ensuring that all of our testing was
>    done with the same OS component version.  Ideally, all projects should be
>    using the same OS components to ensure that everything is tested in the
>    same run-time environment.
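
As a rough illustration of what archiving the package list per image
buys us, the sketch below compares two archived manifests. It assumes
each manifest is the plain output of "dpkg-query -W" (one
"name<TAB>version" line per package). The function names and the
sample package data are invented for the example.

```python
def parse_manifest(text):
    """Parse 'name<TAB>version' lines into a {name: version} dict."""
    manifest = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, version = line.partition("\t")
        manifest[name.strip()] = version.strip()
    return manifest


def diff_manifests(old, new):
    """Return {name: (old_version, new_version)} for every difference.

    A version of None means the package is absent from that manifest.
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changes[name] = (old.get(name), new.get(name))
    return changes


# Invented sample data standing in for two archived image manifests.
OLD_IMAGE = "libexample1\t1.0-1\nexampled\t2.4-1\n"
NEW_IMAGE = "libexample1\t1.0-2\nexampled\t2.4-1\nnewtool\t0.3-1\n"

if __name__ == "__main__":
    old = parse_manifest(OLD_IMAGE)
    new = parse_manifest(NEW_IMAGE)
    for name, (before, after) in diff_manifests(old, new).items():
        print(f"{name}: {before} -> {after}")
```

With a manifest stored next to every image version, a failure that
shows up on one image but not another can be narrowed down to the
packages that actually changed.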
> 
>    Thanks,
>    -daw-
>    On 1/19/2017 8:31 PM, Thanh Ha via RT wrote:
> 
>  The issue with the 16.04 Ubuntu image is fixed now (but we may require some 
> additional actions which I'll send to Vanessa in case this issue comes up 
> again). We fixed this issue tonight by rebuilding ubuntu1604 and deploying 
> the new image.
> 
>  I'm going to close this ticket as resolved and we'll take the additional 
> task to find a way to ensure this doesn't appear again off of this ticket.
> 
>  If you're not interested in the detailed analysis you can stop reading now.
> 
>  For those interested I suspect that the lock issue will appear again 
> (although I could be wrong). The reason I believe so is that our vm init 
> script runs "apt-get update" as an initialization step when the VM boots up 
> at creation time via this script [0]. Ed mentioned that we didn't see this in 
> the past and it only started to appear again recently as we deployed another 
> patch to disable Ubuntu's unattended updates.
> 
>  I believe a possible reason we will see this issue appear again due to [0] 
> is that we switched from the JClouds to the OpenStack Jenkins plugin for 
> node spin-up, and there is a difference in how the init-script is executed 
> depending on which plugin is being used.
> 
>  JClouds Plugin:
> 
>  1) boot vm
>  2) wait for ssh access
>  3) copies init-script into vm via ssh
>  4) executes init-script, and doesn't continue processing until script is 
> complete
>  5) once init-script is complete, passes vm over to job and job starts
> 
>  OpenStack Plugin:
> 
>  1) boot vm and passes init-script in as User Data
>  2) init-script runs inside vm without Jenkins intervention, thus is a 
> non-blocking function
>  3) in parallel jenkins waits for ssh access to vm
>  4) ssh's into vm and passes vm over to job and job starts running
> 
>  In the OpenStack plugin case step 4 can execute while step 2 is still 
> running apt-get update in the background because it was a non-blocking 
> function.
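
The difference between the two sequences above can be modelled in a
few lines. This is only a toy simulation: a threading.Lock stands in
for /var/lib/dpkg/lock, a thread stands in for the init-script, and
none of it talks to Jenkins or apt. All names are invented.

```python
import threading


def job_gets_lock(blocking_init):
    """Simulate a VM boot and report whether the job can take the lock.

    blocking_init=True models the JClouds sequence (the job starts only
    after the init-script has finished); False models the OpenStack
    sequence (the job starts while the init-script is still running).
    """
    dpkg_lock = threading.Lock()          # stand-in for /var/lib/dpkg/lock
    init_has_lock = threading.Event()
    init_may_finish = threading.Event()

    def init_script():
        with dpkg_lock:                   # "apt-get update" holds the lock
            init_has_lock.set()
            init_may_finish.wait()

    worker = threading.Thread(target=init_script)
    worker.start()
    init_has_lock.wait()                  # the init-script is now running

    if blocking_init:
        init_may_finish.set()             # wait for the script to complete
        worker.join()

    # The job starts here and tries to install build dependencies.
    acquired = dpkg_lock.acquire(blocking=False)
    if acquired:
        dpkg_lock.release()

    init_may_finish.set()                 # let the script finish either way
    worker.join()
    return acquired


if __name__ == "__main__":
    print("JClouds-style boot, job gets lock:", job_gets_lock(True))
    print("OpenStack-style boot, job gets lock:", job_gets_lock(False))
```

In the blocking case the job always finds the lock free. In the
non-blocking case it fails whenever the init-script is still holding
it, which matches the "Could not get lock" error in the console log.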
> 
>  A few ideas I have to get around this.
> 
>  a) Allow init-script to continue running apt-get update however have a shell 
> script at the start of Ubuntu jobs that waits for the lock to get released 
> before allowing the job to start
> 
>  b) Remove apt-get update from init-script and make the job run apt-get 
> update at the beginning of its execution
> 
>  c) Regularly update VMs to ensure that apt-get update always runs quickly
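
Idea (a) could look roughly like the sketch below. It assumes that
dpkg guards /var/lib/dpkg/lock with an fcntl-style advisory lock, which
is consistent with the "11: Resource temporarily unavailable" error in
the log, and it would have to run as root against the real lock file.
The function names, timeout and polling interval are made up.

```python
import errno
import fcntl
import os
import tempfile
import time


def wait_for_lock(path, timeout=300.0, interval=2.0):
    """Poll until an advisory lock on `path` can be taken, or time out.

    Returns True once the lock was free, False if `timeout` seconds
    passed first. The lock is released again immediately; this only
    probes, it does not reserve the lock for the caller.
    """
    deadline = time.monotonic() + timeout
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o640)
    try:
        while True:
            try:
                fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError as exc:
                # EACCES/EAGAIN mean another process holds the lock.
                if exc.errno not in (errno.EACCES, errno.EAGAIN):
                    raise
                if time.monotonic() >= deadline:
                    return False
                time.sleep(interval)
            else:
                fcntl.lockf(fd, fcntl.LOCK_UN)
                return True
    finally:
        os.close(fd)


def demo():
    """Probe a temporary file that nobody else is locking."""
    with tempfile.NamedTemporaryFile() as tmp:
        return wait_for_lock(tmp.name, timeout=1.0, interval=0.1)


if __name__ == "__main__":
    print(demo())
```

Since the probe releases the lock before returning, another apt
process could still grab it before the job does. A retry around the
actual apt-get call would close that gap.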
> 
>   Regards,
>  Thanh
> 
>  [0] 
> [1]https://git.fd.io/ci-management/tree/jenkins-scripts/basic_settings.sh#n14
> 
> 
>  On Thu Jan 19 19:23:59 2017, hagbard wrote:
> 
>  FYI... helpdesk is on it, and it's being worked in #fdio-infra on IRC
> 
>  Ed
> 
>  On Thu, Jan 19, 2017 at 4:31 PM, Ed Warnicke [2]<hagb...@gmail.com> wrote:
> 
> 
>  Looping in help desk.
>  On Thu, Jan 19, 2017 at 4:16 PM Dave Barach (dbarach) [3]<dbar...@cisco.com>
>  wrote:
> 
> 
>  Folks,
> 
> 
> 
>  See [4]https://jenkins.fd.io/job/vpp-verify-master-ubuntu1604/3378/console
> 
> 
> 
>  11:00:46 E: Could not get lock /var/lib/dpkg/lock - open (11: Resource
>  temporarily unavailable)
> 
>  11:00:46 E: Unable to lock the administration directory (/var/lib/dpkg/),
>  is another process using it?
> 
> 
> 
>  I recognize this failure from my own Ubuntu 16.04 system: a cron-job
>  starts “apt-get -q”, which for whatever reason does not terminate. As a
>  workaround, “sudo killall apt-get || true” before trying to acquire build
>  dependencies...
> 
> 
> 
>  HTH... Dave
> 
> 
>  _______________________________________________
> 
>  vpp-dev mailing list
> 
>  [5]vpp-dev@lists.fd.io
> 
>  [6]https://lists.fd.io/mailman/listinfo/vpp-dev
> 
> 
> 
> 
> 
>  _______________________________________________
>  vpp-dev mailing list
>  [7]vpp-dev@lists.fd.io
>  [8]https://lists.fd.io/mailman/listinfo/vpp-dev
> 
> References
> 
>    Visible links
>    1. 
> https://git.fd.io/ci-management/tree/jenkins-scripts/basic_settings.sh#n14
>    2. mailto:hagb...@gmail.com
>    3. mailto:dbar...@cisco.com
>    4. https://jenkins.fd.io/job/vpp-verify-master-ubuntu1604/3378/console
>    5. mailto:vpp-dev@lists.fd.io
>    6. https://lists.fd.io/mailman/listinfo/vpp-dev
>    7. mailto:vpp-dev@lists.fd.io
>    8. https://lists.fd.io/mailman/listinfo/vpp-dev