Bug#933652: Missing files in package: SQL upgrade scripts.

2019-08-01 Thread Louis Schmitz
Package: python3-cinder
Version: 2:13.0.3-1

OpenStack components regularly prune their SQL upgrade scripts.

Here's an example of why this can be a problem (this isn't an actual case, 
just for illustration): say release 14 of OpenStack only ships SQL upgrade 
scripts that work from release 12 onwards. Upgrading a release-11 database 
directly to release 14 will then not work; it takes two separate steps: 
install release 12 and run the SQL update, then install release 14 and run 
the SQL update again.

When upgrading an OpenStack deployment from Debian 9 to Debian 10, you can see 
why this is an issue:

For example, the 'cinder' package is updated from 'Newton' (DB version 79) 
directly to 'Rocky', whose database updates start at version 86 (the 'full 
database' script creates version 86).

That means the 79-80, 80-81, 81-82, ..., 84-85 scripts are missing from the 
/usr/lib/python3/dist-packages/cinder/db/sqlalchemy/migrate_repo/versions/ 
directory.

To solve this, you would probably have to copy 073_cinder_init.py through 
085_modify_workers_updated_at.py from 
https://github.com/openstack/cinder/tree/stable/pike/cinder/db/sqlalchemy/migrate_repo/versions
into the package, and remove the existing 085_cinder_init.py. ('Pike' is an 
OpenStack release in between Newton (Debian 9) and Rocky (Debian 10).) That 
gives an unbroken chain of SQL update scripts that can take the database from 
version 79 all the way to version 123.
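The gap described above can be detected mechanically. Below is a minimal 
sketch (assuming the directory layout from this report; 'check_chain' is a 
hypothetical helper, not part of any package) that scans the versions 
directory for holes in the numeric script prefixes:

```shell
# check_chain: read numeric migration-script prefixes on stdin and report
# any break in the sequence (i.e. a missing upgrade script).
check_chain() {
    sort -n | awk 'NR > 1 && $1 != prev + 1 {
        printf "gap: no script between %03d and %03d\n", prev, $1
    } { prev = $1 }'
}

# Path from this report; other OpenStack components have analogous directories.
versions_dir=/usr/lib/python3/dist-packages/cinder/db/sqlalchemy/migrate_repo/versions
ls "$versions_dir" 2>/dev/null | grep -oE '^[0-9]+' | check_chain
```

On an affected Debian 10 system this should report a gap between 079 and 086.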

This problem is likely to occur with various other OpenStack components as well.




Bug#932813: RabbitMQ init scripts are broken

2019-07-23 Thread Louis Schmitz
Package: rabbitmq-server
Version: 3.7.8-4
Severity: minor
Problem encountered while trying to upgrade to Debian 10.

Steps to reproduce:

1. Uninstall rabbitmq with 'apt-get remove rabbitmq-server'

2. Install rabbitmq with 'apt-get install rabbitmq-server'

Expected result:
RabbitMQ is installed and immediately running via its systemd unit file and 
init script.

Actual result:

Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libfile-copy-recursive-perl libjs-swfobject libjs-twitter-bootstrap libodbc1 
tcpd
Use 'apt autoremove' to remove them.
The following NEW packages will be installed:
  rabbitmq-server
0 upgraded, 1 newly installed, 0 to remove and 359 not upgraded.
Need to get 0 B/9,227 kB of archives.
After this operation, 14.3 MB of additional disk space will be used.
Selecting previously unselected package rabbitmq-server.
(Reading database ... 105086 files and directories currently installed.)
Preparing to unpack .../rabbitmq-server_3.7.8-4_all.deb ...
Unpacking rabbitmq-server (3.7.8-4) ...
Setting up rabbitmq-server (3.7.8-4) ...
Job for rabbitmq-server.service failed because the control process exited with 
error code.
See "systemctl status rabbitmq-server.service" and "journalctl -xe" for details.
invoke-rc.d: initscript rabbitmq-server, action "restart" failed.
● rabbitmq-server.service - RabbitMQ Messaging Server
   Loaded: loaded (/lib/systemd/system/rabbitmq-server.service; enabled; vendor 
preset: enabled)
  Drop-In: /etc/systemd/system/rabbitmq-server.service.d
   └─limits.conf
   Active: activating (auto-restart) (Result: exit-code) since Tue 2019-07-23 
17:04:53 CEST; 8ms ago
  Process: 31923 ExecStop=/usr/sbin/rabbitmqctl stop (code=exited, status=127)
  Process: 31921 ExecStart=/usr/sbin/rabbitmq-server (code=exited, status=127)
Main PID: 31921 (code=exited, status=127)

Jul 23 17:04:53 st05 systemd[1]: rabbitmq-server.service: Unit entered failed 
state.
Jul 23 17:04:53 st05 systemd[1]: rabbitmq-server.service: Failed with result 
'exit-code'.
dpkg: error processing package rabbitmq-server (--configure):
subprocess installed post-installation script returned error exit status 1
Processing triggers for systemd (232-25+deb9u11) ...
Errors were encountered while processing:
rabbitmq-server
E: Sub-process /usr/bin/dpkg returned an error code (1)

Something is broken in the init script. If you later start RabbitMQ manually 
by running the executable named in the unit (/usr/sbin/rabbitmq-server), the 
program (so far) appears to run correctly. Starting it via systemd later 
(systemctl start rabbitmq-server) produces the same error.

There is no log output when RabbitMQ fails this way: no startup log, no crash 
log, simply nothing, so the program is most likely never run.
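One hint in the output above: status=127 is the shell's conventional "command 
not found" code, which would point at something invoked from inside 
/usr/sbin/rabbitmq-server (or from the limits.conf drop-in shown in the 
status output) rather than at the binary itself. This is an assumption, not a 
confirmed diagnosis. A tiny decoder for the usual shell exit-code conventions:

```shell
# decode_exec_status: map the common shell exit-code conventions that systemd
# surfaces in its Process= lines to a human-readable explanation.
decode_exec_status() {
    case "$1" in
        126) echo "command found but not executable" ;;
        127) echo "command not found" ;;
        *)   echo "exited with status $1" ;;
    esac
}

decode_exec_status 127   # the code reported for both ExecStart and ExecStop
```

Under that assumption, inspecting 'systemctl cat rabbitmq-server' and the 
drop-in at /etc/systemd/system/rabbitmq-server.service.d/limits.conf for a 
stale or wrong path would be a reasonable next step.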




Bug#881034: Sid: Galera-3 default configuration; nodes beyond primary will not connect.

2017-11-07 Thread Louis Schmitz
Package: galera-3
Version: 25.3.19-2
Severity: minor
Tags: sid

The default configuration of the Galera-3 systemd scripts for the MariaDB 
server in Debian Sid is incorrect: the secondary (tertiary, etc.) nodes do 
not connect to the primary node. The problem may instead lie in the 'systemd' 
or 'mariadb' packages, and the scripts may be provided upstream; I was unable 
to determine this.

Reproducing the issue:
1) Create 3 connected VMs or connect 3 physical machines on a LAN with a 
default debian install.
2) Install the galera-3, mariadb-client, mariadb-server, and rsync packages.
3) Configure /etc/mysql/conf.d/galera.cnf with the recommended settings. In my 
case, each of the 3 servers had a configuration file like this (the bottom two 
lines differ from server to server):

[mysqld]
#mysql settings
binlog_format=ROW
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_doublewrite=1
query_cache_size=0
query_cache_type=0
bind-address=0.0.0.0

#galera settings
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="osdb_cluster"
wsrep_cluster_address=gcomm://10.0.40.111,10.0.40.112,10.0.40.113
wsrep_sst_method=rsync

wsrep_node_address="10.0.40.111"
wsrep_node_name="galera01"

4) Bootstrap the cluster by using the recommended script on the primary node, 
by calling
$ galera_new_cluster
5) Using the recommended way, check whether the cluster is started:

MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_cluster_size';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 1     |
+--------------------+-------+
1 row in set (0.01 sec)

6) Now, on the second node, try the recommended way of starting this node (as 
described on Galera's home page):
$ systemctl start mysql
Here is where the bug happens: the operation fails and the MySQL server does 
not start on the second node, where we expect a two-node cluster. Below is the 
output written to '/var/log/mysql/error.log'.

2017-11-06 17:10:13 139930521723456 [Note] WSREP: Read nil XID from storage 
engines, skipping position init
2017-11-06 17:10:13 139930521723456 [Note] WSREP: wsrep_load(): loading 
provider library '/usr/lib/galera/libgalera_smm.so'
2017-11-06 17:10:13 139930521723456 [Note] WSREP: wsrep_load(): Galera 
3.19(rb98f92f) by Codership Oy  loaded successfully.
2017-11-06 17:10:13 139930521723456 [Note] WSREP: CRC-32C: using hardware 
acceleration.
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Found saved state: 
----:-1, safe_to_bootsrap: 1
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Passing config to GCS: 
base_dir = /var/lib/mysql/; base_host = 10.0.40.113; base_port = 4567; 
cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = 
PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; 
evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; 
evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = 
PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; 
evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; 
gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = 
/var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = no; 
gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 
1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; 
gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; 
gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; 
gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ig
2017-11-06 17:10:13 139930521723456 [Note] WSREP: GCache history reset: 
old(----:0) -> 
new(----:-1)
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Assign initial position for 
certification: -1, protocol version: -1
2017-11-06 17:10:13 139930521723456 [Note] WSREP: wsrep_sst_grab()
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Start replication
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Setting initial position to 
----:-1
2017-11-06 17:10:13 139930521723456 [Note] WSREP: protonet asio version 0
2017-11-06 17:10:13 139930521723456 [Note] WSREP: Using CRC-32C for message 
checksums.
2017-11-06 17:10:13 139930521723456 [Note] WSREP: backend: asio
2017-11-06 17:10:13 139930521723456 [Note] WSREP: gcomm thread scheduling 
priority set to other:0
2017-11-06 17:10:13 139930521723456 [Warning] WSREP: access 
file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
2017-11-06 17:10:13 139930521723456 [Note] WSREP: restore pc from disk failed
2017-11-06 17:10:13 139930521723456 [Note] WSREP: GMCast version 0
2017-11-06 17:10:13 139930521723456 [Note] WSREP: (f877f818, 
'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2017-11-06 17:10:13 139930521723456
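Since the log above ends shortly after the node starts listening on 
tcp://0.0.0.0:4567, a first sanity check when reproducing this is whether the 
nodes can reach each other on the Galera ports at all. A minimal sketch (the 
IPs are the ones from the example configuration above; 'probe' is a 
hypothetical helper using bash's /dev/tcp, and the port roles are the usual 
Galera conventions: 4567 replication, 4568 IST, 4444 SST):

```shell
# probe: succeed and report "reachable" only if a TCP connection to
# host $1, port $2 can be opened within one second.
probe() {
    if timeout 1 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
        echo "$1:$2 reachable"
    else
        echo "$1:$2 NOT reachable"
    fi
}

# IPs from the example galera.cnf above; adjust to your own cluster.
for host in 10.0.40.111 10.0.40.112 10.0.40.113; do
    for port in 4567 4568 4444; do
        probe "$host" "$port"
    done
done
```

If all ports are reachable between nodes, that rules out a firewall problem 
and points back at the packaged systemd scripts.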