This bug was fixed in the package keystone - 2:10.0.1-0ubuntu2 --------------- keystone (2:10.0.1-0ubuntu2) yakkety; urgency=high
* d/p/0001-Make-flushing-tokens-more-robust.patch: Commit token flushes between batches in order to lower resource consumption and make flushing more robust for replication (LP: #1649616). -- Jorge Niedbalski <jorge.niedbal...@canonical.com> Wed, 07 Jun 2017 13:07:50 +0100 ** Changed in: keystone (Ubuntu Yakkety) Status: Fix Committed => Fix Released ** Changed in: keystone (Ubuntu Xenial) Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of नेपाली भाषा समायोजकहरुको समूह, which is subscribed to Xenial. Matching subscriptions: Ubuntu 16.04 Bugs https://bugs.launchpad.net/bugs/1649616 Title: Keystone Token Flush job does not complete in HA deployed environment Status in Ubuntu Cloud Archive: In Progress Status in Ubuntu Cloud Archive mitaka series: Fix Committed Status in Ubuntu Cloud Archive newton series: Fix Committed Status in Ubuntu Cloud Archive ocata series: Fix Committed Status in OpenStack Identity (keystone): Fix Released Status in OpenStack Identity (keystone) newton series: In Progress Status in OpenStack Identity (keystone) ocata series: In Progress Status in puppet-keystone: Triaged Status in tripleo: In Progress Status in keystone package in Ubuntu: In Progress Status in keystone source package in Xenial: Fix Released Status in keystone source package in Yakkety: Fix Released Status in keystone source package in Zesty: Fix Released Bug description: [Impact] * The Keystone token flush job can get into a state where it will never complete because the transaction size exceeds the mysql galara transaction size - wsrep_max_ws_size (1073741824). [Test Case] 1. Authenticate many times 2. Observe that keystone token flush job runs (should be a very long time depending on disk) >20 hours in my environment 3. Observe errors in mysql.log indicating a transaction that is too large Actual results: Expired tokens are not actually flushed from the database without any errors in keystone.log. Only errors appear in mysql.log. Expected results: Expired tokens to be removed from the database [Additional info:] It is likely that you can demonstrate this with less than 1 million tokens as the >1 million token table is larger than 13GiB and the max transaction size is 1GiB, my token bench-marking Browbeat job creates more than needed. Once the token flush job can not complete the token table will never decrease in size and eventually the cloud will run out of disk space. Furthermore the flush job will consume disk utilization resources. This was demonstrated on slow disks (Single 7.2K SATA disk). On faster disks you will have more capacity to generate tokens, you can then generate the number of tokens to exceed the transaction size even faster. Log evidence: [root@overcloud-controller-0 log]# grep " Total expired" /var/log/keystone/keystone.log 2016-12-08 01:33:40.530 21614 INFO keystone.token.persistence.backends.sql [-] Total expired tokens removed: 1082434 2016-12-09 09:31:25.301 14120 INFO keystone.token.persistence.backends.sql [-] Total expired tokens removed: 1084241 2016-12-11 01:35:39.082 4223 INFO keystone.token.persistence.backends.sql [-] Total expired tokens removed: 1086504 2016-12-12 01:08:16.170 32575 INFO keystone.token.persistence.backends.sql [-] Total expired tokens removed: 1087823 2016-12-13 01:22:18.121 28669 INFO keystone.token.persistence.backends.sql [-] Total expired tokens removed: 1089202 [root@overcloud-controller-0 log]# tail mysqld.log 161208 1:33:41 [Warning] WSREP: transaction size limit (1073741824) exceeded: 1073774592 161208 1:33:41 [ERROR] WSREP: rbr write fail, data_len: 0, 2 161209 9:31:26 [Warning] WSREP: transaction size limit (1073741824) exceeded: 1073774592 161209 9:31:26 [ERROR] WSREP: rbr write fail, data_len: 0, 2 161211 1:35:39 [Warning] WSREP: transaction size limit (1073741824) exceeded: 1073774592 161211 1:35:40 [ERROR] WSREP: rbr write fail, data_len: 0, 2 161212 1:08:16 [Warning] WSREP: transaction size limit (1073741824) exceeded: 1073774592 161212 1:08:17 [ERROR] WSREP: rbr write fail, data_len: 0, 2 161213 1:22:18 [Warning] WSREP: transaction size limit (1073741824) exceeded: 1073774592 161213 1:22:19 [ERROR] WSREP: rbr write fail, data_len: 0, 2 Disk utilization issue graph is attached. The entire job in that graph takes from the first spike is disk util(~5:18UTC) and culminates in about ~90 minutes of pegging the disk (between 1:09utc to 2:43utc). [Regression Potential] * Not identified To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-archive/+bug/1649616/+subscriptions _______________________________________________ Mailing list: https://launchpad.net/~group.of.nepali.translators Post to : group.of.nepali.translators@lists.launchpad.net Unsubscribe : https://launchpad.net/~group.of.nepali.translators More help : https://help.launchpad.net/ListHelp