If we have consensus that running apt-get update during VM init is a bad
idea, then this patch might be a good quick solution [0].

Regards,
Thanh

[0] https://gerrit.fd.io/r/4797

On Thu, Jan 19, 2017 at 10:47 PM, Ed Warnicke <hagb...@gmail.com> wrote:

> Thanh,
>
> I'm not quite sure of the logic of having it at that particular point
> either.  Something to investigate.
>
> Ed
>
> On Thu, Jan 19, 2017 at 8:44 PM, Thanh Ha <thanh...@linuxfoundation.org>
> wrote:
>
>> FWIW in OpenDaylight we don't typically run yum update or apt-get update
>> in our init-scripts on VM spinup. At the job level we only install
>> dependencies needed by the build. I'm not sure why fd.io is running
>> upgrades, but it was already in the script when I looked at it. System
>> upgrades during VM spinup are not something the OpenDaylight project does,
>> at least.
>>
>> Regards,
>> Thanh
>>
>>
>> On Thu, Jan 19, 2017 at 10:38 PM, Dave Wallace <dwallac...@gmail.com>
>> wrote:
>>
>>> Ed, Thanh, Vanessa,
>>>
>>> IMHO, updating the Ubuntu packages every time a VM is spun up is a bug
>>> with respect to reproducing some (hopefully rare) build/test issues.
>>> Since every VM is potentially running with different versions of OS
>>> components, when a failure occurs (e.g. in "make test"), then it may be
>>> necessary to recreate the exact run-time environment in order to reproduce
>>> the failure. Unless the complete package list is being archived for every
>>> VM instance that is spun up, this may not be possible.
>>>
>>> In my experience, in those rare cases where a tool or environment issue
>>> causes a failure, the cost of finding the issue is extraordinarily high if
>>> you do not have the ability to recreate the EXACT build/run-time
>>> environment.  This is why CSIT does not update OS components in the VM
>>> initialization scripts and the VM images are built from a specific package
>>> list instead of pulling the latest versions from the apt repositories.
>>>
>>> My recommendation is that the VM images be updated periodically (weekly
>>> or whenever a new security update is released) and the package lists
>>> archived for each VM image version.  Each VM image should also be verified
>>> against a known good VPP commit version as is done with CSIT branches.
>>> Ideally we should build a fully automated continuous deployment model, so
>>> that updating the VM images is reduced to running a Jenkins job that
>>> builds, tests and deploys a new VM image from the latest package versions.
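
A minimal sketch of what archiving and comparing package lists could look
like, in Python. This is an illustration only, not something CSIT or the
fd.io image builds are known to use; the function names are made up. It
assumes a Debian/Ubuntu image where dpkg-query is available, and relies on
dpkg-query's -W/-f options to print one "name<TAB>version" line per package.

```python
import subprocess


def parse_package_list(text):
    """Parse 'name<TAB>version' lines into a {name: version} dict."""
    packages = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, version = line.partition("\t")
        packages[name] = version
    return packages


def installed_packages():
    """Return {name: version} for every package dpkg knows about on this host."""
    result = subprocess.run(
        ["dpkg-query", "-W", "-f=${binary:Package}\t${Version}\n"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_package_list(result.stdout)


def diff_package_lists(old, new):
    """Return {name: (old_version, new_version)} for packages that differ.

    A version of None means the package is absent from that list.
    """
    changed = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changed[name] = (old.get(name), new.get(name))
    return changed
```

Saving the output of installed_packages() next to each image version would
make it possible to diff the environment of a failing run against a known
good one with diff_package_lists().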
>>>
>>> With that automation in place, this mechanism could be extended for use
>>> by CSIT as well as "make test", thus ensuring that all of our testing is
>>> done with the same OS component versions.  Ideally, all projects should be
>>> using the same OS components to ensure that everything is tested in the
>>> same run-time environment.
>>>
>>> Thanks,
>>> -daw-
>>>
>>> On 1/19/2017 8:31 PM, Thanh Ha via RT wrote:
>>>
>>> The issue with the 16.04 Ubuntu image is fixed now (but we may require some
>>> additional actions, which I'll send to Vanessa in case this issue comes
>>> up again). We fixed this issue tonight by rebuilding ubuntu1604 and
>>> deploying the new image.
>>>
>>> I'm going to close this ticket as resolved, and we'll handle the additional
>>> task of finding a way to ensure this doesn't appear again outside of this
>>> ticket.
>>>
>>> If you're not interested in the detailed analysis you can stop reading now.
>>>
>>> For those interested, I suspect that the lock issue will appear again
>>> (although I could be wrong). The reason I believe so is that our VM init
>>> script runs "apt-get update" as an initialization step when the VM boots up
>>> at creation time via this script [0]. Ed mentioned that we didn't see this
>>> in the past and that it only started to appear again recently, as we
>>> deployed another patch to disable Ubuntu's unattended updates.
>>>
>>> I believe a possible reason we will see this issue appear again due to [0]
>>> is that we switched from the JClouds to the OpenStack Jenkins plugin for
>>> node spinup, and there's a difference in how the init-script is executed
>>> depending on which plugin is being used.
>>>
>>> JClouds Plugin:
>>>
>>> 1) boots vm
>>> 2) waits for ssh access
>>> 3) copies init-script into vm via ssh
>>> 4) executes init-script, and doesn't continue processing until script is
>>> complete
>>> 5) once init-script is complete, passes vm over to job and job starts
>>>
>>> OpenStack Plugin:
>>>
>>> 1) boots vm and passes init-script in as User Data
>>> 2) init-script runs inside vm without Jenkins intervention, so it does not
>>> block Jenkins
>>> 3) in parallel, Jenkins waits for ssh access to vm
>>> 4) ssh's into vm, passes vm over to job, and job starts running
>>>
>>> In the OpenStack plugin case step 4 can execute while step 2 is still 
>>> running apt-get update in the background because it was a non-blocking 
>>> function.
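
A toy model of that difference, using Python threads in place of the VM and
Jenkins. Nothing here talks to Jenkins, OpenStack or apt: simulate() and
all names in it are invented for illustration, and the in-process lock
merely stands in for /var/lib/dpkg/lock.

```python
import threading


def simulate(blocking_init):
    """Return True if the 'job' can take the lock when it starts.

    blocking_init=True  models the JClouds flow (wait for init-script first).
    blocking_init=False models the OpenStack flow (job starts regardless).
    """
    dpkg_lock = threading.Lock()        # stand-in for /var/lib/dpkg/lock
    init_started = threading.Event()
    init_may_finish = threading.Event()

    def init_script():
        # Stand-in for "apt-get update": hold the lock until told to finish.
        with dpkg_lock:
            init_started.set()
            init_may_finish.wait()

    worker = threading.Thread(target=init_script)
    worker.start()
    init_started.wait()                 # the VM is up and init-script is running

    if blocking_init:
        # JClouds-style: do not hand the VM to the job until init-script is done.
        init_may_finish.set()
        worker.join()

    # The job starts and immediately needs the lock (non-blocking attempt).
    job_got_lock = dpkg_lock.acquire(blocking=False)
    if job_got_lock:
        dpkg_lock.release()

    # Let the init-script finish in the non-blocking case and clean up.
    init_may_finish.set()
    worker.join()
    return job_got_lock
```

With blocking_init=True the job always gets the lock; with
blocking_init=False it does not, which is the "Could not get lock" failure
in miniature.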
>>>
>>> A few ideas I have to get around this.
>>>
>>> a) Allow init-script to continue running apt-get update, but have a
>>> shell script at the start of Ubuntu jobs that waits for the lock to be
>>> released before allowing the job to start
>>>
>>> b) Remove apt-get update from init-script and make the job run apt-get
>>> update at the beginning of its execution
>>>
>>> c) Regularly update VMs to ensure that apt-get update always runs quickly
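
A rough sketch of idea (a), written in Python rather than shell. The
"Resource temporarily unavailable" error in the console log is what a failed
non-blocking fcntl-style lock attempt looks like, so a job could probe the
same lock and wait for it. This is an illustration only, not what the fd.io
jobs run: the function names, timeout and poll interval are made up, and the
default path is simply the one from the log. It has to run as root to open
the real lock file, and it only narrows the race, since another apt process
could still take the lock between the check and the job's own apt-get call.

```python
import errno
import fcntl
import time

# Path taken from the error message in this thread.
DPKG_LOCK = "/var/lib/dpkg/lock"


def lock_is_free(path):
    """Return True if no other process holds an fcntl lock on path."""
    with open(path, "r+") as fh:
        try:
            # Non-blocking exclusive lock attempt; fails if another process holds it.
            fcntl.lockf(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno in (errno.EACCES, errno.EAGAIN):
                return False
            raise
        # We only wanted to probe, so give the lock straight back.
        fcntl.lockf(fh, fcntl.LOCK_UN)
        return True


def wait_for_lock(path=DPKG_LOCK, timeout=600.0, poll=2.0):
    """Poll until the lock is free; return False if timeout seconds pass first."""
    deadline = time.monotonic() + timeout
    while True:
        if lock_is_free(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

A job wrapper could call wait_for_lock() before installing build
dependencies and fail early with a clear message if it returns False.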
>>>
>>> Regards,
>>> Thanh
>>>
>>> [0] 
>>> https://git.fd.io/ci-management/tree/jenkins-scripts/basic_settings.sh#n14
>>>
>>>
>>> On Thu Jan 19 19:23:59 2017, hagbard wrote:
>>>
>>> FYI... helpdesk is on it, and it's being worked in #fdio-infra on IRC
>>>
>>> Ed
>>>
>>> On Thu, Jan 19, 2017 at 4:31 PM, Ed Warnicke <hagb...@gmail.com> wrote:
>>>
>>>
>>> Looping in help desk.
>>> On Thu, Jan 19, 2017 at 4:16 PM Dave Barach (dbarach) <dbar...@cisco.com>
>>> wrote:
>>>
>>>
>>> Folks,
>>>
>>>
>>>
>>> See https://jenkins.fd.io/job/vpp-verify-master-ubuntu1604/3378/console
>>>
>>>
>>>
>>> 11:00:46 E: Could not get lock /var/lib/dpkg/lock - open (11: Resource
>>> temporarily unavailable)
>>>
>>> 11:00:46 E: Unable to lock the administration directory (/var/lib/dpkg/),
>>> is another process using it?
>>>
>>>
>>>
>>> I recognize this failure from my own Ubuntu 16.04 system: a cron-job
>>> starts “apt-get -q”, which for whatever reason does not terminate. As a
>>> workaround, “sudo killall apt-get || true” before trying to acquire build
>>> dependencies...
>>>
>>>
>>>
>>> HTH... Dave
>>>
>>>
>>> _______________________________________________
>>>
>>> vpp-dev mailing list
>>> vpp-dev@lists.fd.io
>>> https://lists.fd.io/mailman/listinfo/vpp-dev
>>>
>>>
>>>
>>>
>>
>