Dear all,
The real questions are at the bottom; the rest is an introduction
that gives you the background to those questions.
Two days ago, we had a power outage in our department which caused a
rather brutal shutdown of the computers. All of the computers survived,
which is a good thing. But one of them came back with a peculiar character, and of
course it had to be the backup server.
At the current point I am not blaming BackupPC at all, I'm just trying
to isolate the problem, and that is why I would need your help in this.
Okay, so what does the bastard (read: server) do now? Well, not much: it
just hangs or reboots from time to time, seemingly at random.
The first thing we noticed in /var/log/messages was that, after the
power outage, the ntpd daemon could no longer set its clock correctly.
<snip /var/log/messages>
# cat /var/log/messages | grep ntpd
Mar 29 10:39:43 inwtheo1 ntpd: ntpd startup succeeded
Mar 29 10:39:43 inwtheo1 ntpd[5689]: ntp engine ready
Mar 29 08:40:06 inwtheo1 ntpd[5689]: peer 157.193.40.37 now valid
Mar 29 10:40:57 inwtheo1 ntpd[5688]: adjusting local clock by 166.241134s
Mar 29 10:41:59 inwtheo1 ntpd[5688]: adjusting local clock by 166.240065s
Mar 29 10:44:13 inwtheo1 ntpd[5688]: adjusting local clock by 166.238681s
Mar 29 10:45:13 inwtheo1 ntpd[5688]: adjusting local clock by 166.174413s
Mar 29 10:46:15 inwtheo1 ntpd[5688]: adjusting local clock by 187.903248s
Mar 29 10:55:11 inwtheo1 ntpd: ntpd startup succeeded
Mar 29 10:55:11 inwtheo1 ntpd[5607]: ntp engine ready
Mar 29 08:55:32 inwtheo1 ntpd[5607]: peer 157.193.40.37 now valid
<end snip>
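For anyone who wants to pull the same information out of their own logs, here is a minimal sketch that extracts the clock corrections from syslog lines. It assumes the openntpd message format shown in the snippet above; the function name clock_adjustments is made up for this example.

```python
import re

# Matches lines in the format shown above, for example:
# "Mar 29 10:40:57 inwtheo1 ntpd[5688]: adjusting local clock by 166.241134s"
# (assumption: other ntpd implementations may word this message differently)
ADJUST_RE = re.compile(
    r"^(\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+ntpd\[\d+\]:\s+"
    r"adjusting local clock by (-?\d+(?:\.\d+)?)s"
)


def clock_adjustments(lines):
    """Return a list of (timestamp, offset_in_seconds) tuples,
    one for every 'adjusting local clock' line found."""
    found = []
    for line in lines:
        match = ADJUST_RE.match(line)
        if match:
            found.append((match.group(1), float(match.group(2))))
    return found
```

Feeding it an open file handle on /var/log/messages (or any iterable of lines) gives the list of corrections, which makes it easier to see whether the offset shrinks over time or, as here, stays around 166 seconds and then jumps.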
While trying to understand this problem, I noticed that switching from
openntpd to ntp got the time correct again. Since we were unsure
about this solution, we switched off the daemon to be 100% sure it was
not the cause of the reboots and/or crashes.
init 1 and 2 ran stable (backuppc is not running in init 2).
init 3 didn't (backuppc runs there).
Starting all services by hand to go from 2 to 3 also did not give any
problem, but using the command "init 3" does. If we remove backuppc
from init 3, the server is stable.
So at this point we started to suspect something is going on when
backuppc is running, but we also noticed that sometimes something was
going on when backuppc was not running. So no conclusion yet.
Since the crashes frequently seem to start while backuppc is running, we
are wondering why this could be, which is why I am writing here.
Our server is very basic. We are running version 3.0.0; the whole
system is located on /dev/hda in several partitions, and the backup
config files and data are on a RAID 5 array of 3 separate disks:
[EMAIL PROTECTED] ~]$ df
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/hda7   9.9G  1.4G  8.1G   15%   /
/dev/hda1   479M  12M   443M   3%    /boot
/dev/hda8   44G   172M  44G    1%    /home
/dev/hda6   20G   729M  18G    4%    /var
/dev/md0    461G  194G  243G   45%   /var/backups
[EMAIL PROTECTED] ~]$ cat /proc/mdstat
Personalities : [raid5]
md0 : active raid5 hdb1[0] hdg1[2] hde1[1]
490223232 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
unused devices: <none>
A test also showed that init 3 without backuppc and without /dev/md0
mounted was very stable.
I should also mention that one time when the system rebooted
unexpectedly, the RAID array lost 2 of its drives for no apparent reason.
The next bootup just repaired the array. Hence we are starting to think
something is wrong with the RAID. fsck reports no problems whatsoever.
** If you skipped the top, here are the questions **
What we are wondering now is: does backuppc run any other system
commands which could trigger the hang?
The power outage happened in the middle of some full backups. Is it
possible that this causes problems? For example, a couple of client
directories contain a new/ directory, even when no backup is running.
Can I safely delete this directory?
Is there more going on that I am not aware of, and how can I see it?
Has anybody had the same problem? And if so, how did you solve it?
Regards
klaas
--
"Several billion trillion tons of superhot
exploding hydrogen nuclei rose slowly above
the horizon and managed to look small, cold
and slightly damp."
Douglas Adams - The Hitchhiker's
Guide to the Galaxy