[Openstack-operators] [kolla][nova] Upgrade leaves nova_compute at earlier version

2018-06-14 Thread Dave Williams
I am using kolla-ansible 6.0.0 with openstack_release set to master in
globals.yml in a production environment.

I am trying to fix an oslo_messaging.rpc.client.RemoteError when
undertaking server add volume
(https://bugs.launchpad.net/nova/+bug/1773393) and appear to have
tracked it down to an inconsistency of container versions.

kolla-ansible upgrade runs to completion without error but leaves the
running nova_compute containers (and possibly others) at an earlier
version.

The version running is:
CONTAINER ID   IMAGE          COMMAND         CREATED       STATUS       PORTS   NAMES
944f620a445c   7dc2d4695962   "kolla_start"   4 weeks ago   Up 4 hours           nova_compute

whereas docker images -a shows:
REPOSITORY                         TAG      IMAGE ID       CREATED        SIZE
kolla/ubuntu-source-nova-compute   master   f2df8187f14e   15 hours ago   1.29GB
kolla/ubuntu-source-nova-compute   <none>   582561ac010f   39 hours ago   1.29GB
kolla/ubuntu-source-nova-compute   <none>   7dc2d4695962   3 months ago   1.22GB

This implies f2df8187f14e is the one I should be using.
The image 582561ac010f was after I tried to switch to queens from master
but without success due to a bootstrap_cinder problem:
Error during database migration: 
  "Database schema file with version 122 doesn't exist."
I tried to investigate this but without any obvious resolution.
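Given the two listings above, one way to confirm the mismatch mechanically is to compare the image ID that the master tag points to with the image ID of the running container. A minimal illustrative sketch (the helper name `image_id_for_tag` is hypothetical; the sample text is taken from the `docker images` output above, with `<none>` as docker prints it for untagged images):

```python
def image_id_for_tag(images_output, repository, tag):
    """Parse `docker images` text and return the IMAGE ID for repository:tag."""
    for line in images_output.splitlines()[1:]:   # skip the header row
        fields = line.split()
        if len(fields) >= 3 and fields[0] == repository and fields[1] == tag:
            return fields[2]
    return None

images = """\
REPOSITORY                         TAG      IMAGE ID       CREATED        SIZE
kolla/ubuntu-source-nova-compute   master   f2df8187f14e   15 hours ago   1.29GB
kolla/ubuntu-source-nova-compute   <none>   582561ac010f   39 hours ago   1.29GB
kolla/ubuntu-source-nova-compute   <none>   7dc2d4695962   3 months ago   1.22GB
"""

running_image = "7dc2d4695962"   # image column from `docker ps` above
tagged = image_id_for_tag(images, "kolla/ubuntu-source-nova-compute", "master")
print(tagged)                    # f2df8187f14e
print(running_image == tagged)   # False: the container predates the pull
```

A mismatch here means the container was created from an image the tag no longer points to, which is exactly the symptom described.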

All compute nodes show the same issue.

As per the notes on
https://docs.openstack.org/kolla-ansible/latest/user/operating-kolla.html
I have checked that virt_type is set to kvm in nova.conf, so I cannot
see why the upgrade shouldn't have been successful.

How do I get kolla to use the latest version pulled?

Given that I have running instances, I am a little nervous about doing a
deploy or reconfigure.

Thanks for your help.

Dave


___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


Re: [Openstack-operators] [kolla][nova] Upgrade leaves nova_compute at earlier version

2018-06-15 Thread Dave Williams
CONTAINER ID   IMAGE                                      COMMAND         CREATED       STATUS      NAMES
                                                                                       Up 8 days   cron
a775d01eabe8   kolla/ubuntu-source-kolla-toolbox:queens   "kolla_start"   8 days ago    Up 8 days   kolla_toolbox
475b8900989b   kolla/ubuntu-source-fluentd:queens         "kolla_start"   8 days ago    Up 8 days   fluentd
a48c59fd969e   d3cc178ce8c0                               "kolla_start"   9 days ago    Up 9 days   neutron_openvswitch_agent
7369e5368602   800896c34a2e                               "kolla_start"   9 days ago    Up 9 days   openvswitch_vswitchd
0ca90d6a09e8   551909d28c50                               "kolla_start"   9 days ago    Up 9 days   openvswitch_db
cc5228efb01b   6307a42d968d                               "kolla_start"   9 days ago    Up 9 days   nova_libvirt
bc3a36aa480f   0e00da6547d0                               "kolla_start"   9 days ago    Up 9 days   nova_ssh
dd21cf3c1b15   89c2ec59b2b1                               "kolla_start"   9 days ago    Up 9 days   cinder_backup
82b56aa60102   52ac8cf90164                               "kolla_start"   9 days ago    Up 9 days   cinder_volume
0feafa193c05   64c10510ae70                               "kolla_start"   6 weeks ago   Up 9 days   nova_compute


The various ceph startups were due to a serious network failure I suffered
while using iSCSI-based OSDs. None of the kolla instructions to recreate the
storage worked; I had to resort to copious dd'ing to zero all the disk
partitions before I could finally get it working afresh. Painful!!

The bootstrap_cinder exit 1 was from my attempt to downgrade from master
to queens, which failed due to the database schema issue described below.

The last kolla-ansible upgrade (back to master) was reported successful,
with ansible.log from the fluentd container showing only lines like:

2018-06-07 09:18:46,265 p=767 u=ansible |  localhost | SUCCESS => {
"changed": false, 
"user": "haproxy"
}
2018-06-07 09:19:14,436 p=796 u=ansible |  localhost | SUCCESS => {
"changed": true, 
"msg": "Variable change succeeded prev_value=OFF"
}
2018-06-07 09:19:29,562 p=826 u=ansible |  localhost | SUCCESS => {
"changed": true, 
"msg": "Variable change succeeded prev_value=ON"
}
2018-06-07 09:23:11,834 p=860 u=root |  localhost | SUCCESS => {
"changed": false, 
"disks": "[]"
}
2018-06-07 09:23:59,247 p=892 u=ansible |  localhost | SUCCESS => {
"changed": true, 
"msg": "Variable change succeeded prev_value=OFF"
}
2018-06-07 09:24:08,053 p=921 u=ansible |  localhost | SUCCESS => {
"changed": true, 
"msg": "Variable change succeeded prev_value=ON"
}
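Since an upgrade can finish "without error" while individual containers are left untouched, one sanity check is to count the task result statuses in the log rather than just looking for an exception at the end. A small illustrative sketch (the function name `summarize_results` is hypothetical; the status strings follow the ansible.log lines above):

```python
def summarize_results(log_text):
    """Count ansible task results (SUCCESS/FAILED/UNREACHABLE) in a log."""
    counts = {"SUCCESS": 0, "FAILED": 0, "UNREACHABLE": 0}
    for line in log_text.splitlines():
        for status in counts:
            if "| %s" % status in line:   # matches e.g. "localhost | SUCCESS => {"
                counts[status] += 1
    return counts

sample = """\
2018-06-07 09:18:46,265 p=767 u=ansible |  localhost | SUCCESS => {
2018-06-07 09:19:14,436 p=796 u=ansible |  localhost | SUCCESS => {
"""
print(summarize_results(sample))   # {'SUCCESS': 2, 'FAILED': 0, 'UNREACHABLE': 0}
```

A run with zero FAILED and UNREACHABLE entries still only means ansible believes it succeeded; it does not by itself prove the containers were recreated from the new images.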

Not sure what else I can provide beyond the bootstrap_cinder logs, which
show more fully:
++ cat /run_command
+ CMD='apache2 -DFOREGROUND'
+ ARGS=
+ [[ ! -n '' ]]
+ . kolla_extend_start
++ [[ ! -d /var/log/kolla/cinder ]]
+++ stat -c %a /var/log/kolla/cinder
++ [[ 755 != \7\5\5 ]]
++ . /usr/local/bin/kolla_cinder_extend_start
+++ set -o errexit
+++ [[ -n 0 ]]
+++ cinder-manage db sync
Error during database migration: "Database schema file with version 122 doesn't 
exist."

I am happy running a stable version rather than master; using master was
an attempt to fix problems with ceph. I appear to be stuck, unable to
upgrade or downgrade at present. Whilst my cloud is mainly working,
it's just some secondary features like adding volumes that no longer work.

I am not in the position to destroy the system and restart without
significant disruption.


Thank you for your attention.

Dave




On 21:09, Thu 14 Jun 18, Eduardo Gonzalez wrote:
> Hi,
> 
> could you share your globals file without secrets, plus a docker ps -a and
> docker images from all compute hosts? If you are able to get an upgrade log
> that would be helpful too.
> 
> By the way, using master is not really recommended, many changes from other
> projects and kolla may break the deployment.
> 
> Regards
> 




[Openstack-operators] [kolla][ceph]

2018-08-28 Thread Dave Williams
What is the best practice for adding more Ceph OSDs to kolla
in a production environment? Does deploy do anything to the existing
data, or does it simply add the OSDs (and potentially increase the placement
groups if re-configured)?
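On the placement-group part of the question, a widely quoted rule of thumb (not specific to kolla, and to be adjusted pool by pool) targets roughly 100 PGs per OSD divided by the replica count, rounded up to the next power of two. A hedged sketch, with all values illustrative assumptions:

```python
import math

def target_pg_count(num_osds, replica_count=3, pgs_per_osd=100):
    """Rule-of-thumb total PG target: (OSDs * PGs-per-OSD) / replicas,
    rounded up to the next power of two."""
    raw = num_osds * pgs_per_osd / replica_count
    return 2 ** math.ceil(math.log2(raw))

print(target_pg_count(9))    # 512  (raw 300 -> next power of two)
print(target_pg_count(15))   # 512  (raw 500 -> next power of two)
```

Note that in the Ceph releases of this era pg_num could only be increased, never decreased, which is one reason to plan the target before adding OSDs.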

reconfigure doesn't touch newly prepared disks from what I can see in the
code, which is where I was expecting this might have been done.

I am running kolla-ansible queens.

Thanks
Dave



Re: [Openstack-operators] [kolla][ceph] Adding OSD's to production Ceph

2018-08-31 Thread Dave Williams
On 16:51, Tue 28 Aug 18, Dave Williams wrote:
Sorry the email subject got lost.
> What is the best practice for adding more Ceph OSD's to kolla-ansible
> in a production environment? Does "deploy" do anything to the existing
> data or does it simply add the OSD's (and potentially increase the placement
> groups if re-configured)? 
> 
> "reconfigure" doesnt touch newly prepared disks from what I see from the
> code which is where I was expecting this might have been undertaken.
> 
> I am running kolla-ansible queens.
> 
> Thanks
> Dave
> 
