Re: [openstack-dev] [sahara][heat][infra] breakage of Sahara gate and images from openstack.org

2016-08-01 Thread Steve Baker

On 02/08/16 03:11, Luigi Toscano wrote:

On Monday, 1 August 2016 10:56:21 CEST Zane Bitter wrote:

On 29/07/16 13:12, Luigi Toscano wrote:

Hi all,
the Sahara jobs on the gate run the scenario tests (from sahara-tests)
using the fake plugin, so no real Hadoop/Spark/BigData operations are
performed, but the other expected operations are executed on the
image. To do this, we have long used this image:
http://tarballs.openstack.org/heat-test-image/fedora-heat-test-image.qcow2

which was updated early this Friday (July 29th) from Fedora 22 to
Fedora 24, breaking our jobs with a cryptic error, maybe something
related to the repositories:
http://logs.openstack.org/46/335946/12/check/gate-sahara-tests-dsvm-scenario-nova-heat/5eeff52/logs/screen-sahara-eng.txt.gz?level=WARNING

So AFAICT from the log:

"rpm -q xfsprogs" prints "package xfsprogs is not installed" which is
expected if xfsprogs is not installed.

"yum install -y xfsprogs" redirects to "/usr/bin/dnf install -y
xfsprogs" which is expected on F24.

dnf fails with "Error: Failed to synchronize cache for repo 'fedora'"
which means it couldn't download the Fedora repository data.

"sudo mount -o data=writeback,noatime,nodiratime /dev/vdb
/volumes/disk1" then fails, doubtlessly because xfsprogs is not installed.

The absence of "sudo" in the yum command (when it does appear in the
mount command) is suspicious, but unlikely to be the reason it can't
sync the cache.

This is why I mentioned the repositories, yes.


It's not obvious why this change of image would suddenly result in not
being able to install packages. It seems more likely that you've never
been able to install packages, but the previous image had xfsprogs
preinstalled and the new one doesn't. I don't know the specifics of how
that image is built, but certainly Fedora has been making an ongoing
effort to strip the cloud image back to basics.

But this is not a normal Fedora image. If I read project-config correctly,
this is generated by this job:

http://git.openstack.org/cgit/openstack-infra/project-config/tree/jenkins/jobs/heat.yaml#n34

From a brief chat on #heat on Friday it seems that the image is not gated or
checked or even used right now. Is that the case? The image is almost a plain
Fedora with a few extra packages:
http://git.openstack.org/cgit/openstack/heat-templates/tree/hot/software-config/test-image/build-heat-test-image.sh

We've stopped using this image recently because the download failure 
rate from tarballs.openstack.org was impacting heat's gate job 
reliability. We've switched to a vanilla Fedora image for now because none of
our tests actually require a customized image. When we do have such 
tests we'll likely do boot-time install of packages from an AFS infra 
mirror.


We had no idea that Sahara was using this image in their gate, and it 
was certainly never intended for broader consumption.


Sahara would have a few options for an alternative:

- change the test to work on a vanilla image

- do boot-time installation of the required packages

- work with infra on creating and hosting a custom image
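
For the boot-time installation option, one possible shape is cloud-init
user data passed when the instance is created. This is only a minimal
sketch, assuming the image runs cloud-init and can reach a package
repository at boot (the dnf failure discussed in this thread suggests the
second assumption may not hold on the gate today). build_user_data is a
hypothetical helper, and xfsprogs is used because it is the package named
in the thread:

```python
def build_user_data(packages):
    """Return cloud-config user data that installs packages at first boot.

    Hypothetical helper for illustration only; it is not part of Sahara,
    Heat, or the infra tooling.
    """
    lines = [
        "#cloud-config",
        # Refresh repository metadata before installing anything.
        "package_update: true",
        "packages:",
    ]
    # One YAML list entry per requested package.
    lines += [f"  - {name}" for name in packages]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_user_data(["xfsprogs"]), end="")
```

The resulting text would be handed to the instance as user data; how that
is wired into the Sahara scenario tests is left open here.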

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [sahara][heat][infra] breakage of Sahara gate and images from openstack.org

2016-08-01 Thread Luigi Toscano
On Monday, 1 August 2016 10:56:21 CEST Zane Bitter wrote:
> On 29/07/16 13:12, Luigi Toscano wrote:
> > Hi all,
> > the Sahara jobs on the gate run the scenario tests (from sahara-tests)
> > using the fake plugin, so no real Hadoop/Spark/BigData operations are
> > performed, but the other expected operations are executed on the
> > image. To do this, we have long used this image:
> > http://tarballs.openstack.org/heat-test-image/fedora-heat-test-image.qcow2
> > 
> > which was updated early this Friday (July 29th) from Fedora 22 to
> > Fedora 24, breaking our jobs with a cryptic error, maybe something
> > related to the repositories:
> > http://logs.openstack.org/46/335946/12/check/gate-sahara-tests-dsvm-scenario-nova-heat/5eeff52/logs/screen-sahara-eng.txt.gz?level=WARNING
> So AFAICT from the log:
> 
> "rpm -q xfsprogs" prints "package xfsprogs is not installed" which is
> expected if xfsprogs is not installed.
> 
> "yum install -y xfsprogs" redirects to "/usr/bin/dnf install -y
> xfsprogs" which is expected on F24.
> 
> dnf fails with "Error: Failed to synchronize cache for repo 'fedora'"
> which means it couldn't download the Fedora repository data.
> 
> "sudo mount -o data=writeback,noatime,nodiratime /dev/vdb
> /volumes/disk1" then fails, doubtlessly because xfsprogs is not installed.
> 
> The absence of "sudo" in the yum command (when it does appear in the
> mount command) is suspicious, but unlikely to be the reason it can't
> sync the cache.

This is why I mentioned the repositories, yes. 

> It's not obvious why this change of image would suddenly result in not
> being able to install packages. It seems more likely that you've never
> been able to install packages, but the previous image had xfsprogs
> preinstalled and the new one doesn't. I don't know the specifics of how
> that image is built, but certainly Fedora has been making an ongoing
> effort to strip the cloud image back to basics.

But this is not a normal Fedora image. If I read project-config correctly, 
this is generated by this job:

http://git.openstack.org/cgit/openstack-infra/project-config/tree/jenkins/jobs/heat.yaml#n34

From a brief chat on #heat on Friday it seems that the image is not gated or
checked or even used right now. Is that the case? The image is almost a plain
Fedora with a few extra packages:
http://git.openstack.org/cgit/openstack/heat-templates/tree/hot/software-config/test-image/build-heat-test-image.sh

-- 
Luigi



Re: [openstack-dev] [sahara][heat][infra] breakage of Sahara gate and images from openstack.org

2016-08-01 Thread Zane Bitter

On 29/07/16 13:12, Luigi Toscano wrote:

Hi all,
the Sahara jobs on the gate run the scenario tests (from sahara-tests) using
the fake plugin, so no real Hadoop/Spark/BigData operations are performed, but
the other expected operations are executed on the image. To do this, we have
long used this image:
http://tarballs.openstack.org/heat-test-image/fedora-heat-test-image.qcow2

which was updated early this Friday (July 29th) from Fedora 22 to Fedora 24,
breaking our jobs with a cryptic error, maybe something related to the
repositories:
http://logs.openstack.org/46/335946/12/check/gate-sahara-tests-dsvm-scenario-nova-heat/5eeff52/logs/screen-sahara-eng.txt.gz?level=WARNING


So AFAICT from the log:

"rpm -q xfsprogs" prints "package xfsprogs is not installed" which is 
expected if xfsprogs is not installed.


"yum install -y xfsprogs" redirects to "/usr/bin/dnf install -y 
xfsprogs" which is expected on F24.


dnf fails with "Error: Failed to synchronize cache for repo 'fedora'" 
which means it couldn't download the Fedora repository data.


"sudo mount -o data=writeback,noatime,nodiratime /dev/vdb 
/volumes/disk1" then fails, doubtlessly because xfsprogs is not installed.


The absence of "sudo" in the yum command (when it does appear in the 
mount command) is suspicious, but unlikely to be the reason it can't 
sync the cache.
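
The reading of the log above can be expressed as a small check. This is
only a sketch: the function name diagnose and its return strings are made
up for illustration, while the two message patterns are the ones quoted
from the log in this message.

```python
import re


def diagnose(log_text):
    """Summarise which of the two failure messages appear in a log.

    Hypothetical helper; the patterns match the rpm and dnf messages
    quoted in this thread, nothing else.
    """
    # "rpm -q <pkg>" output when the package is absent.
    missing = re.search(r"package (\S+) is not installed", log_text)
    # dnf output when repository metadata cannot be downloaded.
    repo = re.search(r"Failed to synchronize cache for repo '([^']+)'", log_text)

    findings = []
    if missing:
        findings.append(f"{missing.group(1)} missing")
    if repo:
        findings.append(f"repo '{repo.group(1)}' unreachable")
    return "; ".join(findings) if findings else "no known failure pattern found"
```

If both patterns are present, the package was absent from the image and
could not be installed afterwards, which matches the sequence described
above.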


It's not obvious why this change of image would suddenly result in not 
being able to install packages. It seems more likely that you've never 
been able to install packages, but the previous image had xfsprogs 
preinstalled and the new one doesn't. I don't know the specifics of how 
that image is built, but certainly Fedora has been making an ongoing 
effort to strip the cloud image back to basics.


cheers,
Zane.



Re: [openstack-dev] [sahara][heat][infra] breakage of Sahara gate and images from openstack.org

2016-07-29 Thread Jeremy Stanley
On 2016-07-29 19:12:35 +0200 (+0200), Luigi Toscano wrote:
[...]
> - would it be possible to use the nodepool cloud images
> (qcow2, raw) from the jobs, if they contain lsb_release (and
> possibly other tools), and if so, how?

We don't currently publish them as they lack a simple mechanism for
granting access other than with our baked-in keys/accounts, and also
because they're quite large due to pre-caching of all our git repos
and any distro packages our CI jobs are likely to try installing
(around 5GiB in compressed qcow2 format the last time I looked).
-- 
Jeremy Stanley



[openstack-dev] [sahara][heat][infra] breakage of Sahara gate and images from openstack.org

2016-07-29 Thread Luigi Toscano
Hi all,
the Sahara jobs on the gate run the scenario tests (from sahara-tests) using
the fake plugin, so no real Hadoop/Spark/BigData operations are performed, but
the other expected operations are executed on the image. To do this, we have
long used this image:
http://tarballs.openstack.org/heat-test-image/fedora-heat-test-image.qcow2

which was updated early this Friday (July 29th) from Fedora 22 to Fedora 24,
breaking our jobs with a cryptic error, maybe something related to the
repositories:
http://logs.openstack.org/46/335946/12/check/gate-sahara-tests-dsvm-scenario-nova-heat/5eeff52/logs/screen-sahara-eng.txt.gz?level=WARNING

Now we are trying to quickly find another image; the standard Fedora 24 and 
CentOS 7 images have no lsb_release (used in Sahara):
https://review.openstack.org/#/c/348849/
https://review.openstack.org/#/c/348894/

but the Ubuntu 16.04 cloud image seems to contain it, so this change *may*
solve the issue (the gate jobs are still pending right now):
https://review.openstack.org/#/c/348952/
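
To illustrate the lsb_release dependency: a candidate image can be checked
for the tool, with /etc/os-release as a possible fallback source of the
same distribution information. This is a sketch under assumptions, not how
Sahara does it today: has_lsb_release and parse_os_release are hypothetical
names, and the fallback assumes the image ships /etc/os-release.

```python
import shutil


def has_lsb_release():
    """Return True if an lsb_release executable is on PATH."""
    return shutil.which("lsb_release") is not None


def parse_os_release(text):
    """Parse KEY=value lines in the os-release format into a dict.

    Hypothetical helper; blank lines, comments, and lines without '='
    are skipped, and surrounding quotes are removed from values.
    """
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


if __name__ == "__main__":
    if has_lsb_release():
        print("lsb_release is available")
    else:
        # Fallback path; assumes /etc/os-release exists on the image.
        with open("/etc/os-release") as handle:
            print(parse_os_release(handle.read()).get("ID", "unknown"))
```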

Nevertheless, it would be nice to not rely on something external, so my 
questions are:

- could someone from the heat side help investigate whether the image is still 
valid?
- would it be possible to use the nodepool cloud images (qcow2, raw) from
the jobs, if they contain lsb_release (and possibly other tools), and if so,
how?

Ciao
-- 
Luigi
