No factory here, just the virtuozzolinux-base and openvz-os repos:

# yum repolist | grep virt
virtuozzolinux-base      VirtuozzoLinux Base      15 415+189
virtuozzolinux-updates   VirtuozzoLinux Updates            0
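
For what it's worth, a quick way to confirm whether any factory repository is
configured at all (I'm not assuming a particular repo id here, any match would
show up):

# yum repolist all | grep -i factory    # lists matching repos, enabled or disabled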

Jehan . 


De: "jjs - mainphrame" <[email protected]> 
À: "OpenVZ users" <[email protected]> 
Cc: "Kevin Drysdale" <[email protected]> 
Envoyé: Jeudi 2 Juillet 2020 18:22:33 
Objet: Re: [Users] Issues after updating to 7.0.14 (136) 

Jehan, are you running factory? 

My ovz hosts are up to date, and I see: 

[root@annie ~]# cat /etc/virtuozzo-release 
OpenVZ release 7.0.15 (222) 

Jake 


On Thu, Jul 2, 2020 at 9:08 AM Jehan Procaccia IMT <[email protected]> wrote:



"updating to 7.0.14 (136)" !? 

I did an update yesterday , I am far behind that version 

# cat /etc/vzlinux-release 
Virtuozzo Linux release 7.8.0 (609) 

# uname -a 
Linux localhost 3.10.0-1127.8.2.vz7.151.14 #1 SMP Tue Jun 9 12:58:54 MSK 2020 
x86_64 x86_64 x86_64 GNU/Linux 

Why don't you try updating to the latest version?
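
The update itself should be nothing more exotic than a plain yum run plus a
reboot if a new vz7 kernel comes in; roughly this generic sketch:

# yum clean all                 # optional: drop cached repo metadata first
# yum update                    # pulls whatever 7.0.x the enabled repos currently ship
# reboot                        # only needed if a new vz7 kernel was installed
# cat /etc/virtuozzo-release    # confirm the resulting release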


On 29/06/2020 at 12:30, Kevin Drysdale wrote:

Hello, 

After updating one of our OpenVZ VPS hosting nodes at the end of last week, 
we've started to have issues with corruption apparently occurring inside 
containers. Issues of this nature have never affected the node previously, and 
there do not appear to be any hardware issues that could explain this. 

Specifically, a few hours after updating, we began to see containers 
experiencing errors such as this in the logs: 

[90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25 
[90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255: 
ext4_ext_find_extent:904: inode 136399 
[90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922: 
ext4_ext_find_extent:904: inode 136399 
[95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67 
[95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174: 
htree_dirblock_to_tree:918: inode 926441: block 3683060 
[95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902: 
ext4_iget:4435: inode 1849777 
[95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42 
[95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489: 
ext4_ext_find_extent:904: inode 136272 
[95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063: 
ext4_ext_find_extent:904: inode 136272 

Shutting the containers down and manually mounting and e2fsck'ing their
filesystems did clear these errors, but each of the containers (which were
mostly used for running Plesk) had widespread problems with corrupt or missing
files after the fscks completed, so they had to be restored from backup.
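
For anyone wanting to reproduce the check, this is roughly what the offline
check amounts to (sketched here with container 8288448 as an example; the
actual /dev/ploopNNNNN device is whatever 'ploop mount' reports):

# vzctl stop 8288448
# ploop mount /vz/private/8288448/root.hdd/DiskDescriptor.xml   # attach the image, note the ploop device printed
# e2fsck -f /dev/ploopNNNNNp1     # first partition of that device; NNNNN is a placeholder
# ploop umount /vz/private/8288448/root.hdd/DiskDescriptor.xml
# vzctl start 8288448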

Concurrently, we also began to see messages like the following appearing in
/var/log/vzctl.log, which again had never appeared at any point prior to this
update being installed:

/var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole (check.c:240): 
Warning: ploop image '/vz/private/8288448/root.hdd/root.hds' is sparse 
/var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole (check.c:240): 
Warning: ploop image '/vz/private/8288450/root.hdd/root.hds' is sparse 
/var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole (check.c:240): 
Warning: ploop image '/vz/private/8288451/root.hdd/root.hds' is sparse 
/var/log/vzctl.log:2020-06-26T21:19:57+0100 : Error in fill_hole (check.c:240): 
Warning: ploop image '/vz/private/8288452/root.hdd/root.hds' is sparse 
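
A simple way to see the sparseness directly is to compare the apparent size of
a root.hds file with what is actually allocated on disk; a large gap between
the two means the file contains holes:

# du --block-size=1 --apparent-size /vz/private/8288448/root.hdd/root.hds   # logical size in bytes
# du --block-size=1 /vz/private/8288448/root.hdd/root.hds                   # bytes actually allocated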

The basic procedure we follow when updating our nodes is as follows: 

1. Update the standby node we keep spare for this process
2. vzmigrate all containers from the live node being updated to the standby
node (a sketch of this step follows the list)
3. Update the live node
4. Reboot the live node
5. vzmigrate the containers from the standby node back to the live node they
originally came from
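
The vzmigrate step in (2) and (5) is nothing unusual; a minimal sketch,
assuming an online migration and with the standby hostname as a placeholder:

# vzmigrate --online standby.example.net 8288448    # live-migrate CT 8288448 to the standby node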

The only tool that has touched these containers is 'vzmigrate' itself, so I'm
at something of a loss to explain how the root.hdd images for these containers
ended up containing sparse gaps. Creating sparse images is something we have
never done, as we have always been aware that OpenVZ does not support sparse
files inside a container's hard drive image. The fact that these images
suddenly became sparse at the same time they started to exhibit filesystem
corruption is somewhat concerning.

We can restore all affected containers from backups, but I wanted to get in 
touch with the list to see if anyone else at any other site has experienced 
these or similar issues after applying the 7.0.14 (136) update. 

Thank you, 
Kevin Drysdale. 






_______________________________________________ 
Users mailing list 
[email protected] 
https://lists.openvz.org/mailman/listinfo/users 