Re: [osol-discuss] snv_134 slow down every monday morning
I've got a bit of an odd bug with my OS NAS: every Monday morning I come into work and I have to reboot it. SSH doesn't respond and nor do the CIFS shares. And the ping response times are very high!

Snowpooch,

Have you checked whether suspend/resume (i.e. suspend to RAM) is enabled on this box? It may be enabled as a power-saving feature, and your hardware may not quiesce properly under OpenSolaris (i.e. suspend to RAM is buggy). In that case the system would suspend to RAM over the weekend when no one is using it and then fail to wake up properly.

Also, do you have VirtualBox installed on this system? I had some problems with the VirtualBox driver causing everything to hang on snv_129 with VirtualBox 3.2.8, so check what was logged in /var/adm/messages right before the server hangs and see whether there are any VirtualBox-related messages in there. The VirtualBox hang also seemed to happen for me mostly when people weren't using the machine (i.e. over the weekend, etc.)

-- This message posted from opensolaris.org ___ opensolaris-discuss mailing list opensolaris-discuss@opensolaris.org
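As a rough illustration of the log check suggested above, here is a minimal Python sketch that pulls out lines mentioning VirtualBox from a syslog-style file. The keyword list and the function name `find_suspect_lines` are assumptions made for this example; the exact strings the driver logs are not given in the thread, so adjust them to whatever actually appears in /var/adm/messages.

```python
import re
import sys

# Hypothetical keywords: the thread does not say what the driver logs.
KEYWORDS = ("vbox", "virtualbox")


def find_suspect_lines(lines, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines mentioning any keyword.

    Matching is case-insensitive; line numbers start at 1.
    """
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if pattern.search(line)
    ]


if __name__ == "__main__":
    # Default to the file named in the message; another path may be passed instead.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/adm/messages"
    try:
        with open(path, errors="replace") as fh:
            hits = find_suspect_lines(fh)
    except OSError as exc:
        print(f"cannot read {path}: {exc}")
        hits = []
    for number, text in hits:
        print(f"{number}: {text}")
```

The line numbers make it easier to see how close each match sits to the last entries logged before the reboot.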
Re: [osol-discuss] snv_134 slow down every monday morning
Do you have time-slider enabled? On Monday is the system slow or does it not respond at all? Any messages?
Re: [osol-discuss] snv_134 slow down every monday morning
Will,

I'll quickly blast through the answers:

1) A mysqldump of the remote server, an rsync of files, then a ZFS snapshot. It does this for 4 servers. There is also a Windows backup over CIFS.
2) I have done a scrub. I don't have it scheduled, but I did one after the first 'crash' and have done it since. It doesn't seem to have helped.
3) We have two 2TB disks in a mirror.
4) I'm going to skip this Sunday evening's backups and see how it goes.
5) We don't use NFS; we do, however, have one server backing up over CIFS using wbadmin. I thought briefly it was that, since it does a full backup on Sunday, BUT I moved that backup to last night to try to replicate the issue and no joy. (And there was a heavier backup load last night since it was Monday, so if anything was going to happen it should have been then.)
6) The hardware is a brand new Dell T110. (I'm not saying it's impossible there's something wrong with it, but I think it's unlikely.)

In the meantime I'll scrub the pool again.

Thanks for your reply,
- Daniel

On 24 Aug 2010, at 03:13, William Bauer wrote:

What exactly are your backup scripts doing? Have you run a zpool scrub on your pool(s), and do so regularly? Do you have a mirror or other RAID config? Have you tried going one Sunday without any of your backups running to see if they're the culprit? Details may help someone help you. My 134 desktop is backed up every night with Veritas and it's stable. Plus it's an NFS client and server and NIS client and is used by several people all day. The platform seems stable, so perhaps your hardware can't handle the backup stress or there's some major problem with your ZFS pool. Only speculating.
[osol-discuss] snv_134 slow down every monday morning
Hello,

I've got a bit of an odd bug with my OS NAS: every Monday morning I come into work and I have to reboot it. SSH doesn't respond and nor do the CIFS shares. And the ping response times are very high!

All the backup scripts I have feeding information into this machine run twice a day, every day. (I have no problems on any other day.) The backups at midnight Sunday and early Monday morning run fine, and after that the system seems to grind to a halt. I know it can still make outgoing connections, as it connects to a remote server at 4am just fine.

I've been through the crontabs for root and for my user account and removed everything that isn't one of my backup scripts. Yet still it happens!

I should also say that the Sunday night backup is the lightest backup I do in a week, so there's no reason the load of that would do it. (It's less than 100 MB combined, vs 5 GB on a busy weekday.)

Last week I decided to stagger all the backups by 15 minutes to make sure I wasn't overloading the machine in some way; it had no effect.

Is there some other set of scheduled tasks I should be looking for? Any ideas on how to debug it?

Thanks!
- Daniel Taylor
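One way to approach the "other scheduled tasks" question is to look for cron entries that only fire on particular weekdays, since a job that runs every day would not explain a Monday-only problem. The sketch below is a hedged illustration, not a complete crontab parser: it assumes the classic five-field layout with numeric day-of-week values (0 = Sunday), handles only `*`, comma lists, and ranges, and the function names are invented for this example. On Solaris the per-user crontab files typically live under /var/spool/cron/crontabs, which needs root to read.

```python
import sys


def expand_field(field):
    """Expand a numeric day-of-week field ('0', '1-5', '0,3') into a set of ints.

    Returns None for '*', meaning the entry is not restricted by weekday.
    Raises ValueError for syntax this sketch does not handle.
    """
    if field == "*":
        return None
    days = set()
    for part in field.split(","):
        if "-" in part:
            low, high = part.split("-", 1)
            days.update(range(int(low), int(high) + 1))
        else:
            days.add(int(part))
    return days


def weekday_specific_jobs(crontab_text, wanted=(0, 1)):
    """Return entries whose day-of-week field is restricted and includes a wanted day.

    The default looks for Sunday (0) and Monday (1).
    """
    matches = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # not a "min hour dom month dow command" entry
        try:
            days = expand_field(fields[4])
        except ValueError:
            continue  # unsupported syntax; skipped rather than guessed at
        if days is not None and days & set(wanted):
            matches.append(line)
    return matches


if __name__ == "__main__":
    # Pass one or more crontab files on the command line.
    for path in sys.argv[1:]:
        try:
            with open(path) as fh:
                text = fh.read()
        except OSError as exc:
            print(f"cannot read {path}: {exc}")
            continue
        for entry in weekday_specific_jobs(text):
            print(f"{path}: {entry}")
```

Running it over every user's crontab, not just root's and your own, would show any weekday-restricted job owned by a system account.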
Re: [osol-discuss] snv_134 slow down every monday morning
What exactly are your backup scripts doing? Have you run a zpool scrub on your pool(s), and do so regularly? Do you have a mirror or other RAID config? Have you tried going one Sunday without any of your backups running to see if they're the culprit? Details may help someone help you. My 134 desktop is backed up every night with Veritas and it's stable. Plus it's an NFS client and server and NIS client and is used by several people all day. The platform seems stable, so perhaps your hardware can't handle the backup stress or there's some major problem with your ZFS pool. Only speculating.