Re: [openstack-dev] [tripleo] Help needed on debugging upgrade jobs on Pike

2017-11-06 Thread Emilien Macchi
Thanks folks :-) you rock!

On Mon, Nov 6, 2017 at 5:05 AM, Jiří Stránský  wrote:
> On 6.11.2017 11:17, Jiří Stránský wrote:
>>
>> On 6.11.2017 10:52, Marios Andreou wrote:
>>>
>>> On Mon, Nov 6, 2017 at 11:09 AM, Marius Cornea 
>>> wrote:
>>>
>>>> On Sat, Nov 4, 2017 at 2:27 AM, Emilien Macchi
>>>> wrote:
>>>>>
>>>>> Since we've got promotion, we can now properly test upgrades from ocata
>>>>> to pike.
>>>>>
>>>>> It's now failing for various reasons, as you can see on:
>>>>> https://review.openstack.org/#/c/500625/
>>>>>
>>>>> I haven't filed a bug yet but this is the kind of thing I see now:
>>>>>
>>>>> http://logs.openstack.org/25/500625/20/check/legacy-tripleo-ci-centos-7-scenario002-multinode-oooq-container-upgrades/62e7f14/logs/undercloud/home/zuul/overcloud_upgrade_console.log.txt.gz#_2017-11-04_00_14_17
>>>>
>>>> I think this is related to https://review.openstack.org/#/c/510577/
>>>> which introduced running os-net-config during the major upgrade
>>>> composable step. In case of environments without network isolation
>>>> /etc/os-net-config/config.json doesn't exist so the os-net-config
>>>> command fails. I filed https://bugs.launchpad.net/tripleo/+bug/1730328
>>>> to keep track of it.


>>> heh, beat me to it :) I was about to file that. Indeed from logs @ [0]
>>> you
>>> can see the step3 ansible-playbook failing for
>>>
>>> https://github.com/openstack/tripleo-heat-templates/blob/e463ca15fb2189fde7e7e2de136cfb2303d3171f/puppet/services/tripleo-packages.yaml#L56-L64
>>>
>>> I had a poke at one of the other jobs too since there are apparently
>>> multiple issues - I found a different one
>>> for legacy-tripleo-ci-centos-7-containers-multinode-upgrades and filed
>>> https://bugs.launchpad.net/tripleo/+bug/1730349 for that. It looks like
>>> all
>>> the upgrade_tasks pass there but then fails on docker-puppet
>>
>>
>> I'm not sure if it's related to that ^ error in particular
>
>
> Yea the backport [2] seems to have fixed that issue. The upgrade now
> completed successfully, but the job failed on validation. I've +A'd the
> backport as it gets us closer to green.
>
>
>> , but since we
>> landed deploy/upgrade scenario separation [1], the upgrade job on Pike
>> effectively started testing non-pacemaker to pacemaker upgrade, which
>> won't work. Due to a chicken-and-egg issue with landing related patches
>> we could not set the dependencies properly. There's a patch fixing this
>> issue and making the Pike upgrade pacemaker-to-pacemaker [2]. This may
>> not solve all the issues, but i think we need it merged to at least have
>> a chance at a green result.
>>
>>>
>>> [0]
>>>
>>> http://logs.openstack.org/25/500625/20/check/legacy-tripleo-ci-centos-7-scenario002-multinode-oooq-container-upgrades/62e7f14/logs/subnode-2/var/log/messages.txt.gz#_Nov__4_00_13_55
>>
>> [1] https://review.openstack.org/#/c/500552
>> [2] https://review.openstack.org/#/c/512305
>>>
>>>
>>> thanks,
>>>
>>> marios
>>>
>>>
>>>>> I'm requesting some help from the upgrades squad, if they already saw
>>>>> the failures, etc. It would be great to have the jobs passing at some
>>>>> point, now the framework is in place and we had promotion.
>>>>>
>>>>> Thanks,
>>>>> --
>>>>> Emilien Macchi
>>>
>>> __
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>

-- 
Emilien Macchi



Re: [openstack-dev] [tripleo] Help needed on debugging upgrade jobs on Pike

2017-11-06 Thread Jiří Stránský

On 6.11.2017 11:17, Jiří Stránský wrote:
> On 6.11.2017 10:52, Marios Andreou wrote:
>> On Mon, Nov 6, 2017 at 11:09 AM, Marius Cornea wrote:
>>> On Sat, Nov 4, 2017 at 2:27 AM, Emilien Macchi wrote:
>>>> Since we've got promotion, we can now properly test upgrades from ocata
>>>> to pike.
>>>> It's now failing for various reasons, as you can see on:
>>>> https://review.openstack.org/#/c/500625/
>>>>
>>>> I haven't filed a bug yet but this is the kind of thing I see now:
>>>> http://logs.openstack.org/25/500625/20/check/legacy-tripleo-ci-centos-7-scenario002-multinode-oooq-container-upgrades/62e7f14/logs/undercloud/home/zuul/overcloud_upgrade_console.log.txt.gz#_2017-11-04_00_14_17
>>>
>>> I think this is related to https://review.openstack.org/#/c/510577/
>>> which introduced running os-net-config during the major upgrade
>>> composable step. In case of environments without network isolation
>>> /etc/os-net-config/config.json doesn't exist so the os-net-config
>>> command fails. I filed https://bugs.launchpad.net/tripleo/+bug/1730328
>>> to keep track of it.
>>
>> heh, beat me to it :) I was about to file that. Indeed from logs @ [0] you
>> can see the step3 ansible-playbook failing for
>> https://github.com/openstack/tripleo-heat-templates/blob/e463ca15fb2189fde7e7e2de136cfb2303d3171f/puppet/services/tripleo-packages.yaml#L56-L64
>>
>> I had a poke at one of the other jobs too since there are apparently
>> multiple issues - I found a different one
>> for legacy-tripleo-ci-centos-7-containers-multinode-upgrades and filed
>> https://bugs.launchpad.net/tripleo/+bug/1730349 for that. It looks like all
>> the upgrade_tasks pass there but then fails on docker-puppet
>
> I'm not sure if it's related to that ^ error in particular

Yea the backport [2] seems to have fixed that issue. The upgrade now
completed successfully, but the job failed on validation. I've +A'd the
backport as it gets us closer to green.

> , but since we
> landed deploy/upgrade scenario separation [1], the upgrade job on Pike
> effectively started testing non-pacemaker to pacemaker upgrade, which
> won't work. Due to a chicken-and-egg issue with landing related patches
> we could not set the dependencies properly. There's a patch fixing this
> issue and making the Pike upgrade pacemaker-to-pacemaker [2]. This may
> not solve all the issues, but i think we need it merged to at least have
> a chance at a green result.
>
>> [0]
>> http://logs.openstack.org/25/500625/20/check/legacy-tripleo-ci-centos-7-scenario002-multinode-oooq-container-upgrades/62e7f14/logs/subnode-2/var/log/messages.txt.gz#_Nov__4_00_13_55
>
> [1] https://review.openstack.org/#/c/500552
> [2] https://review.openstack.org/#/c/512305
>
>> thanks,
>>
>> marios
>>
>>>> I'm requesting some help from the upgrades squad, if they already saw
>>>> the failures, etc. It would be great to have the jobs passing at some
>>>> point, now the framework is in place and we had promotion.
>>>>
>>>> Thanks,
>>>> --
>>>> Emilien Macchi







Re: [openstack-dev] [tripleo] Help needed on debugging upgrade jobs on Pike

2017-11-06 Thread Jiří Stránský

On 6.11.2017 10:52, Marios Andreou wrote:
> On Mon, Nov 6, 2017 at 11:09 AM, Marius Cornea wrote:
>> On Sat, Nov 4, 2017 at 2:27 AM, Emilien Macchi wrote:
>>> Since we've got promotion, we can now properly test upgrades from ocata
>>> to pike.
>>> It's now failing for various reasons, as you can see on:
>>> https://review.openstack.org/#/c/500625/
>>>
>>> I haven't filed a bug yet but this is the kind of thing I see now:
>>> http://logs.openstack.org/25/500625/20/check/legacy-tripleo-ci-centos-7-scenario002-multinode-oooq-container-upgrades/62e7f14/logs/undercloud/home/zuul/overcloud_upgrade_console.log.txt.gz#_2017-11-04_00_14_17
>>
>> I think this is related to https://review.openstack.org/#/c/510577/
>> which introduced running os-net-config during the major upgrade
>> composable step. In case of environments without network isolation
>> /etc/os-net-config/config.json doesn't exist so the os-net-config
>> command fails. I filed https://bugs.launchpad.net/tripleo/+bug/1730328
>> to keep track of it.
>
> heh, beat me to it :) I was about to file that. Indeed from logs @ [0] you
> can see the step3 ansible-playbook failing for
> https://github.com/openstack/tripleo-heat-templates/blob/e463ca15fb2189fde7e7e2de136cfb2303d3171f/puppet/services/tripleo-packages.yaml#L56-L64
>
> I had a poke at one of the other jobs too since there are apparently
> multiple issues - I found a different one
> for legacy-tripleo-ci-centos-7-containers-multinode-upgrades and filed
> https://bugs.launchpad.net/tripleo/+bug/1730349 for that. It looks like all
> the upgrade_tasks pass there but then fails on docker-puppet

I'm not sure if it's related to that ^ error in particular, but since we
landed deploy/upgrade scenario separation [1], the upgrade job on Pike
effectively started testing non-pacemaker to pacemaker upgrade, which
won't work. Due to a chicken-and-egg issue with landing related patches
we could not set the dependencies properly. There's a patch fixing this
issue and making the Pike upgrade pacemaker-to-pacemaker [2]. This may
not solve all the issues, but i think we need it merged to at least have
a chance at a green result.

> [0]
> http://logs.openstack.org/25/500625/20/check/legacy-tripleo-ci-centos-7-scenario002-multinode-oooq-container-upgrades/62e7f14/logs/subnode-2/var/log/messages.txt.gz#_Nov__4_00_13_55

[1] https://review.openstack.org/#/c/500552
[2] https://review.openstack.org/#/c/512305

> thanks,
>
> marios
>
>>> I'm requesting some help from the upgrades squad, if they already saw
>>> the failures, etc. It would be great to have the jobs passing at some
>>> point, now the framework is in place and we had promotion.
>>>
>>> Thanks,
>>> --
>>> Emilien Macchi







Re: [openstack-dev] [tripleo] Help needed on debugging upgrade jobs on Pike

2017-11-06 Thread Marios Andreou
On Mon, Nov 6, 2017 at 11:09 AM, Marius Cornea wrote:

> On Sat, Nov 4, 2017 at 2:27 AM, Emilien Macchi wrote:
> > Since we've got promotion, we can now properly test upgrades from ocata
> > to pike.
> > It's now failing for various reasons, as you can see on:
> > https://review.openstack.org/#/c/500625/
> >
> > I haven't filed a bug yet but this is the kind of thing I see now:
> > http://logs.openstack.org/25/500625/20/check/legacy-tripleo-ci-centos-7-scenario002-multinode-oooq-container-upgrades/62e7f14/logs/undercloud/home/zuul/overcloud_upgrade_console.log.txt.gz#_2017-11-04_00_14_17
>
> I think this is related to https://review.openstack.org/#/c/510577/
> which introduced running os-net-config during the major upgrade
> composable step. In case of environments without network isolation
> /etc/os-net-config/config.json doesn't exist so the os-net-config
> command fails. I filed https://bugs.launchpad.net/tripleo/+bug/1730328
> to keep track of it.
>
>
heh, beat me to it :) I was about to file that. Indeed from logs @ [0] you
can see the step3 ansible-playbook failing for
https://github.com/openstack/tripleo-heat-templates/blob/e463ca15fb2189fde7e7e2de136cfb2303d3171f/puppet/services/tripleo-packages.yaml#L56-L64

I had a poke at one of the other jobs too since there are apparently
multiple issues - I found a different one
for legacy-tripleo-ci-centos-7-containers-multinode-upgrades and filed
https://bugs.launchpad.net/tripleo/+bug/1730349 for that. It looks like all
the upgrade_tasks pass there but then fails on docker-puppet

[0]
http://logs.openstack.org/25/500625/20/check/legacy-tripleo-ci-centos-7-scenario002-multinode-oooq-container-upgrades/62e7f14/logs/subnode-2/var/log/messages.txt.gz#_Nov__4_00_13_55

thanks,

marios


> > I'm requesting some help from the upgrades squad, if they already saw
> > the failures, etc. It would be great to have the jobs passing at some
> > point, now the framework is in place and we had promotion.
> >
> > Thanks,
> > --
> > Emilien Macchi


Re: [openstack-dev] [tripleo] Help needed on debugging upgrade jobs on Pike

2017-11-06 Thread Marius Cornea
On Sat, Nov 4, 2017 at 2:27 AM, Emilien Macchi wrote:
> Since we've got promotion, we can now properly test upgrades from ocata to
> pike.
> It's now failing for various reasons, as you can see on:
> https://review.openstack.org/#/c/500625/
>
> I haven't filed a bug yet but this is the kind of thing I see now:
> http://logs.openstack.org/25/500625/20/check/legacy-tripleo-ci-centos-7-scenario002-multinode-oooq-container-upgrades/62e7f14/logs/undercloud/home/zuul/overcloud_upgrade_console.log.txt.gz#_2017-11-04_00_14_17

I think this is related to https://review.openstack.org/#/c/510577/
which introduced running os-net-config during the major upgrade
composable step. In case of environments without network isolation
/etc/os-net-config/config.json doesn't exist so the os-net-config
command fails. I filed https://bugs.launchpad.net/tripleo/+bug/1730328
to keep track of it.
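
To illustrate the failure mode, here is a minimal Python sketch of a guard that only invokes os-net-config when the config file is present. This is not the change that addressed bug 1730328; the function names are hypothetical, and the `-c` config-file option is assumed to match the os-net-config CLI on the node.

```python
import os
import subprocess
from typing import Callable, List, Optional

# Path taken from the failure description above.
DEFAULT_CONFIG = "/etc/os-net-config/config.json"


def os_net_config_command(config_path: str = DEFAULT_CONFIG) -> Optional[List[str]]:
    """Return the command to run, or None when there is no config to apply.

    Hypothetical helper: environments deployed without network isolation
    have no config.json, so there is nothing for os-net-config to do.
    """
    if not os.path.isfile(config_path):
        return None
    # Assumption: "-c" selects the config file for os-net-config.
    return ["os-net-config", "-c", config_path]


def run_os_net_config_if_configured(
    config_path: str = DEFAULT_CONFIG,
    runner: Callable[[List[str]], int] = subprocess.call,
) -> Optional[int]:
    """Run os-net-config only if its config exists.

    Returns the runner's exit code, or None when the step was skipped.
    The runner is injectable so the guard can be exercised without the
    real binary installed.
    """
    cmd = os_net_config_command(config_path)
    if cmd is None:
        return None
    return runner(cmd)
```

With a guard of this kind, a node without /etc/os-net-config/config.json skips the step instead of failing the upgrade task.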

> I'm requesting some help from the upgrades squad, if they already saw
> the failures, etc. It would be great to have the jobs passing at some
> point, now the framework is in place and we had promotion.
>
> Thanks,
> --
> Emilien Macchi



[openstack-dev] [tripleo] Help needed on debugging upgrade jobs on Pike

2017-11-03 Thread Emilien Macchi
Since we've got promotion, we can now properly test upgrades from ocata to pike.
It's now failing for various reasons, as you can see on:
https://review.openstack.org/#/c/500625/

I haven't filed a bug yet but this is the kind of thing I see now:
http://logs.openstack.org/25/500625/20/check/legacy-tripleo-ci-centos-7-scenario002-multinode-oooq-container-upgrades/62e7f14/logs/undercloud/home/zuul/overcloud_upgrade_console.log.txt.gz#_2017-11-04_00_14_17

I'm requesting some help from the upgrades squad, if they already saw
the failures, etc. It would be great to have the jobs passing at some
point, now the framework is in place and we had promotion.

Thanks,
-- 
Emilien Macchi
