tags 482643 moreinfo unreproducible
thanks

On Sat, May 24, 2008 at 9:57 AM, Raphael Manfredi <[EMAIL PROTECTED]> wrote:
> Apparently, uptimed is not updating its /var/spool/uptimed/records file
> in a safe way: after a system crash, I have seen on many occasions fsck
> clearing the corresponding inode and losing all the uptime records.
>
> The update procedure should be something like:
>
>     rename /var/spool/uptimed/records as /var/spool/uptimed/records.old
>     write new data to /var/spool/uptimed/records.new
>     fsync()
>     rename /var/spool/uptimed/records.new as /var/spool/uptimed/records
>     unlink /var/spool/uptimed/records.old

It's pretty much what is done, save for the fsync():

    void save_records(int max, time_t log_threshold) {
        FILE *f;
        Urec *u;
        int i = 0;

        f = fopen(FILE_RECORDS".tmp", "w");
        if (!f) {
            printf("uptimed: cannot write to %s\n", FILE_RECORDS);
            return;
        }

        for(u=urec_list; u; u = u->next) {
            /* Ignore everything below the threshold */
            if (u->utime >= log_threshold) {
                fprintf(f, "%lu:%lu:%s\n", (unsigned long)u->utime,
                        (unsigned long)u->btime, u->sys);
                /* Stop processing when we've logged the max number specified. */
                if ((max > 0) && (++i >= max))
                    break;
            }
        }
        fclose(f);
        rename(FILE_RECORDS".tmp", FILE_RECORDS);
    }

In my opinion, you don't want a process running fsync() every 60s, anyway.
At least certainly not a subcritical process such as uptimed.

> And upon startup, if a /var/spool/uptimed/records.old file is present
> it should be used as the main database because it means the above procedure
> was somehow interrupted by a crash.

Keeping the old records db as a fallback is an idea worth investigating,
indeed. Cc'ing upstream maintainer.

> To minimise disruption, I've increased the frequency of database savings
> to 600 seconds on my systems. Note that my machines do not crash frequently
> but when they do, it is usually because one of the IDE disks loses an
> interrupt and linux then hangs, requiring a reboot (the software watchdog
> is of no help here, hangup seems to be at the kernel level).

Yeah, I've been planning to increase the default save interval in a future upload.
60s is insane.

> I'm flagging this bug as "important" because it is somehow defeating the
> purpose of having a tool record uptimes if a sudden crash causes years of
> history to get trashed. (Data is not valuable enough to go through a tape
> restore).

Agreed. What filesystem are you using on /var?

Thanks

T-Bone

-- 
Thibaut VARENE
http://www.parisc-linux.org/~varenet/
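
For reference, the sequence discussed in this report (write a temporary file, fsync(), rename it into place, keep the previous file as a fallback) could be sketched roughly as below. This is only an illustration under assumptions, not uptimed's actual code: save_file_safely() and open_with_fallback() are hypothetical names, the ".old" suffix follows the reporter's proposal, and the fixed 4096-byte path buffers are an arbitrary choice. Whether an fsync() on every save is acceptable for uptimed is a separate question.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() and fsync() */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of uptimed): build "<path><suffix>". */
static int make_name(char *out, size_t len, const char *path, const char *suffix)
{
    int n = snprintf(out, len, "%s%s", path, suffix);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;   /* -1: error or truncated */
}

/* Hypothetical helper: write `data` to `path` so that a crash leaves a
 * complete old or new file on disk rather than a truncated one. */
int save_file_safely(const char *path, const char *data)
{
    char tmp[4096], old[4096];
    FILE *f;

    assert(path != NULL && data != NULL);
    if (make_name(tmp, sizeof tmp, path, ".tmp") != 0 ||
        make_name(old, sizeof old, path, ".old") != 0)
        return -1;

    f = fopen(tmp, "w");
    if (!f)
        return -1;

    /* Flush stdio buffers, then ask the kernel to write the data to disk. */
    if (fputs(data, f) == EOF || fflush(f) == EOF || fsync(fileno(f)) != 0) {
        fclose(f);
        remove(tmp);
        return -1;
    }
    if (fclose(f) == EOF) {
        remove(tmp);
        return -1;
    }

    /* Keep the previous version as a fallback. This fails harmlessly on the
     * very first save, when there is no previous file yet. */
    rename(path, old);

    /* Put the fully written file in place in one step. */
    return rename(tmp, path);
}

/* Hypothetical helper: open the main file, or the ".old" fallback if the
 * main file is missing or empty (for instance after an interrupted save). */
FILE *open_with_fallback(const char *path)
{
    char old[4096];
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "r");
    if (f) {
        int c = fgetc(f);
        if (c != EOF) {
            ungetc(c, f);       /* non-empty: hand it back unread */
            return f;
        }
        fclose(f);
    }
    if (make_name(old, sizeof old, path, ".old") != 0)
        return NULL;
    return fopen(old, "r");
}
```

Unlike the procedure quoted in the report, this sketch never unlinks the ".old" copy, so the fallback is always one save behind. It also does not fsync() the containing directory, so the rename itself may not be durable across a crash on every filesystem.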