On Saturday, 19 March 2011, Thibaut VARENE wrote:
> On Sat, Mar 19, 2011 at 5:47 PM, Martin Steigerwald
> <mar...@lichtvoll.de> wrote:
> > On Sunday, 06 March 2011, Thibaut VARENE wrote:
> >> Would you be kind enough to test it? I don't crash my systems as
> >> often as you do, and they're setup in a way that apparently makes
> >> it impossible for me to reproduce this bug.
> > 
> > I wanted to integrate it into the package well - by splitting up to
> > different quilt patches, before adding this patch -, but then other
> > things
> 
> Doesn't make sense to me: your patch was a 1-liner. Anyway...
> 
> > I am not likely to invest much work into this the next time as I will
> > be holding trainings and do lots of other stuff as my holidays end
> > on Monday.
> 
> Good for you. I take it it's a negative answer to my previous inquiry?
> 
> > You wrote you build it already. Do you have that package still
> > available? Then I'd test it after I am convinced that the fsync()
> > based version does what it should.
> 
> Well no, I don't have the test build anymore. Since you were able to
> test your own patch, I assumed you'd be capable of testing another
> one.
> 
> > I am still not convinced that adding those checks alone is the
> > correct solution. The original problem is that the records file is
> > truncated to
> 
> [snipped blah]
> 
> > That said I am still willing to test whether those checks will work
> > as reliable as the fsync() did so far.
> 
> Then please do so, and kindly report when you've done it.
> 
> > I think a software should be written with the irregular case in mind
> > and that this is a key factor that differentiates mediocre or quite
> > good software from excellent software. Humans make errors, thus
> > computers, their power sources, and programs constructed and
> > developed by humans will fail, too. Software without that in mind is
> > asking for trouble.
> 
> Whatever. Even though you have a point, it would be silly to knock
> down a fly with a hammer. When "fixes" for "irregular cases" get in
> the way of "regular cases functioning", there is a problem. uptimed
> must run on many platforms, not just on Linux/ext4.
> 
> Bottomline: My patch affects uptimed at startup. Your patch affects
> uptimed on /each and every write/.

While I still think fsync() on Linux is a good thing, I have to admit
that my patch does *not* work. Maybe I should have called fsync() in all
places, but I am not convinced that this would have worked. Maybe with
current Ext4 the fsync() guarantee that I thought it gave is really
borked, even on Linux.

The console snippet below - partly truncated to 70 characters - also
clearly shows that a patch like the one proposed, which tells uptimed
never to overwrite its backup with an empty file, *is* necessary.
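
To make concrete what I understand by such a check, here is a minimal
sketch in Python. It is not uptimed's code and not the proposed patch:
the function name backup_if_sane is made up, and refusing a backup that
has *fewer* entries than the existing one is my own extra assumption on
top of the "never overwrite with an empty file" rule.

```python
import shutil


def count_entries(path):
    """Return the number of non-blank lines in path (0 if it is missing)."""
    try:
        with open(path, "rb") as handle:
            return sum(1 for line in handle if line.strip())
    except FileNotFoundError:
        return 0


def backup_if_sane(records_path, backup_path):
    """Copy records over the backup only if that cannot lose entries.

    Hypothetical helper for illustration. Returns True when the backup
    was refreshed, False when the old backup was left untouched.
    """
    new_count = count_entries(records_path)
    old_count = count_entries(backup_path)
    # An empty or shrunken records file looks like truncation, so keep
    # the old backup instead of overwriting it.
    if new_count == 0 or new_count < old_count:
        return False
    shutil.copy2(records_path, backup_path)
    return True
```

With the files from the transcript below, a one-entry records file would
not have replaced a backup holding 44 entries.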

So my approach failed. But I wonder whether your approach would have done
more good than my daily backups, since uptimed doesn't make a regular
backup of the records file, but only when it is stopped, and maybe also
when it is started. Thus I could easily have lost more than about one day
of my uptime statistics. And I just cannot get my head around the idea
that it isn't possible to write a file of a few KiB safely enough that it
doesn't get truncated. This is just insane.
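
For reference, the pattern usually recommended for this on POSIX systems
is: write a temporary file, fsync() it, rename it over the target, then
fsync() the directory. Below is a minimal sketch in Python, not what
uptimed or either patch does. The function name and the ".tmp" suffix
are made up for illustration, and whether the data really survives a
crash still depends on the filesystem and the hardware.

```python
import os


def write_records_atomically(path, data):
    """Replace path with data (bytes) via a temp file and a rename.

    Hypothetical helper, POSIX only: after a crash the target should
    hold either the complete old content or the complete new content.
    """
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"  # assumed temporary name, same directory

    with open(tmp_path, "wb") as handle:
        handle.write(data)
        handle.flush()             # push Python's buffer to the kernel
        os.fsync(handle.fileno())  # ask the kernel to flush to disk

    # rename() within one filesystem atomically replaces the target.
    os.replace(tmp_path, path)

    # Flush the directory entry so the rename itself is persisted.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The point of the rename is that the old file is never truncated in
place, so there is no window in which the target exists but is empty.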

Now trying to fix up the last boot record manually.


shambhala:~> uprecords
     #               Uptime | System                                    
----------------------------+-------------------------------------------
->   1     0 days, 00:12:04 | Linux 2.6.38.5-tp42-snap  Wed May 11 20:20
----------------------------+-------------------------------------------
NewRec     0 days, 00:12:03 | since                     Wed May 11 20:20
    up     0 days, 00:12:04 | since                     Wed May 11 20:20
  down     0 days, 00:00:00 | since                     Wed May 11 20:20
   %up              100.000 | since                     Wed May 11 20:20


shambhala:~> cd /var/spool/uptimed 
shambhala:/var/spool/uptimed> ls -lh
insgesamt 16K
-rw-r--r-- 1 daemon daemon  11 11. Mai 20:20 bootid
-rw-r--r-- 1 daemon daemon  62 11. Mai 20:31 records
-rw-rw-rw- 1 daemon daemon 757  4. Mär 21:10 records-2011-03-04
-rw-r--r-- 1 daemon daemon  62 11. Mai 20:26 records.old
shambhala:/var/spool/uptimed> /etc/init.d/uptimed stop
Stopping uptime daemon: uptimed.
shambhala:/var/spool/uptimed> cp -p /home/martin/Backup/uptimed/records-2011-05-10 .
shambhala:/var/spool/uptimed> ls -l
insgesamt 20
-rw-r--r-- 1 daemon daemon   11 11. Mai 20:20 bootid
-rw-r--r-- 1 daemon daemon   62 11. Mai 20:32 records
-rw-rw-rw- 1 daemon daemon  757  4. Mär 21:10 records-2011-03-04
-rw-rw-rw- 1 martin martin 3015 10. Mai 22:10 records-2011-05-10
-rw-r--r-- 1 daemon daemon   62 11. Mai 20:31 records.old
shambhala:/var/spool/uptimed> cp -p cp -p records.old records-2011-05-11
cp: angegebenes Ziel „records-2011-05-11“ ist kein Verzeichnis
shambhala:/var/spool/uptimed#1> cp -p records.old records-2011-05-11 
shambhala:/var/spool/uptimed> diff -u records records.old
--- records     2011-05-11 20:32:43.995898664 +0200
+++ records.old 2011-05-11 20:31:13.308372746 +0200
@@ -1 +1 @@
-751:1305138013:Linux 2.6.38.5-tp42-snap-debug+resv-size-dirty
+661:1305138013:Linux 2.6.38.5-tp42-snap-debug+resv-size-dirty
shambhala:/var/spool/uptimed#1> cp records-2011-05-10 records
shambhala:/var/spool/uptimed> /etc/init.d/uptimed start
Starting uptime daemon: uptimed.

shambhala:/var/spool/uptimed> uprecords
     #               Uptime | System                                    
----------------------------+-------------------------------------------
     1    18 days, 11:00:44 | Linux 2.6.37-tp42-rtime-  Thu Jan 13 12:44
     2    13 days, 20:58:39 | Linux 2.6.37-rc8-tp42     Thu Dec 30 15:44
     3    12 days, 00:05:29 | Linux 2.6.37-tp42-rtime-  Mon Jan 31 23:50
     4     9 days, 17:09:09 | Linux 2.6.37-tp42-rtime-  Sat Feb 12 23:57
     5     8 days, 20:53:21 | Linux 2.6.38.3-tp42-snap  Mon Apr 18 21:51
     6     8 days, 15:40:00 | Linux 2.6.37-tp42-rtime-  Tue Feb 22 21:18
     7     7 days, 20:04:48 | Linux 2.6.38-tp42-snapsh  Thu Mar 17 23:47
     8     7 days, 08:19:50 | Linux 2.6.37-rc7-tp42-at  Wed Dec 22 13:02
     9     6 days, 13:27:02 | Linux 2.6.38-rc7-tp42-sn  Tue Mar  8 10:23
    10     5 days, 23:32:20 | Linux 2.6.38.2-tp42-snap  Tue Mar 29 22:18
----------------------------+-------------------------------------------
->  44     0 days, 00:15:24 | Linux 2.6.38.5-tp42-snap  Wed May 11 20:20
----------------------------+-------------------------------------------
1up in     0 days, 00:00:55 | at                        Wed May 11 20:36
t10 in     5 days, 23:16:57 | at                        Tue May 17 19:52
no1 in    18 days, 10:45:21 | at                        Mon May 30 07:20
    up   149 days, 08:08:47 | since                     Sat Dec 11 13:27
  down     1 day , 21:59:49 | since                     Sat Dec 11 13:27
   %up               98.733 | since                     Sat Dec 11 13:27
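
Judging only from the diff above, each line of the records file seems to
be "uptime in seconds:boot time as Unix epoch:system string". That is
inferred from this transcript, not from uptimed's documentation: 751
seconds is about 12.5 minutes after the 20:20 boot. A small Python
sketch under that assumption, with made-up names Record and
parse_record:

```python
from datetime import datetime, timezone
from typing import NamedTuple


class Record(NamedTuple):
    uptime_seconds: int
    boot_time: datetime
    system: str


def parse_record(line):
    """Parse one records line, assuming the inferred three-field layout."""
    # Split on the first two colons only; the system string is the rest.
    uptime, boot_epoch, system = line.rstrip("\n").split(":", 2)
    boot_time = datetime.fromtimestamp(int(boot_epoch), tz=timezone.utc)
    return Record(int(uptime), boot_time, system)


rec = parse_record(
    "751:1305138013:Linux 2.6.38.5-tp42-snap-debug+resv-size-dirty"
)
print(rec.uptime_seconds // 60, rec.boot_time.isoformat())
# prints: 12 2011-05-11T18:20:13+00:00
```

18:20:13 UTC is 20:20:13 in my +0200 timezone, which matches the boot
time uprecords shows.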

(Well, the easiest would be to drop uptimed from my notebooks and be done
with it. I might consider that.)

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7
