Bug#515653: fsync() based test version

2011-05-11 Thread Martin Steigerwald
On Wednesday, 11 May 2011, Thibaut VARENE wrote:
> On Wed, May 11, 2011 at 8:48 PM, Martin Steigerwald wrote:
> > So my approach failed. But I wonder whether your approach would have
> > done more good than my daily backups, since uptimed doesn't do a
> > regular backup of the configuration, but only on stopping it, maybe
> > also on starting it.
> 
> If by "configuration" you mean its database, then you're wrong. The
> database (and backup) is written every UPDATE_INTERVAL seconds.

Well of course. Sorry. Didn't think of it at the moment. Well then your 
patch should do the trick. I will try it as soon as I find time.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7




Bug#515653: fsync() based test version

2011-05-11 Thread Thibaut VARENE
On Wed, May 11, 2011 at 8:48 PM, Martin Steigerwald  wrote:

> So my approach failed. But I wonder whether your approach would have done
> more good than my daily backups, since uptimed doesn't do a regular backup
> of the configuration, but only on stopping it, maybe also on starting it.

If by "configuration" you mean its database, then you're wrong. The
database (and backup) is written every UPDATE_INTERVAL seconds.

-- 
Thibaut VARENE
http://www.parisc-linux.org/~varenet/






Bug#515653: fsync() based test version

2011-05-11 Thread Martin Steigerwald
On Saturday, 19 March 2011, Thibaut VARENE wrote:
> On Sat, Mar 19, 2011 at 5:47 PM, Martin Steigerwald wrote:
> > On Sunday, 06 March 2011, Thibaut VARENE wrote:
> >> Would you be kind enough to test it? I don't crash my systems as
> >> often as you do, and they're setup in a way that apparently makes
> >> it impossible for me to reproduce this bug.
> > 
> > I wanted to integrate it into the package properly - by splitting it
> > up into different quilt patches before adding this patch - but then
> > other things
> 
> Doesn't make sense to me: your patch was a 1-liner. Anyway...
> 
> > I am not likely to invest much work in this in the near future, as I
> > will be holding trainings and doing lots of other things once my
> > holidays end on Monday.
> 
> Good for you. I take it it's a negative answer to my previous inquiry?
> 
> > You wrote you built it already. Do you still have that package
> > available? Then I'd test it after I am convinced that the fsync()
> > based version does what it should.
> 
> Well no, I don't have the test build anymore. Since you were able to
> test your own patch, I assumed you'd be capable of testing another
> one.
> 
> > I am still not convinced that adding those checks alone is the
> > correct solution. The original problem is that the records file is
> > truncated to
> 
> [snipped blah]
> 
> > That said, I am still willing to test whether those checks will work
> > as reliably as the fsync() did so far.
> 
> Then please do so, and kindly report when you've done it.
> 
> > I think software should be written with the irregular case in mind,
> > and that this is a key factor that differentiates mediocre or quite
> > good software from excellent software. Humans make errors, thus
> > computers, their power sources, and programs constructed and
> > developed by humans will fail, too. Software written without that in
> > mind is asking for trouble.
> 
> Whatever. Even though you have a point, it would be silly to knock
> down a fly with a hammer. When "fixes" for "irregular cases" get in
> the way of "regular cases functioning", there is a problem. uptimed
> must run on many platforms, not just on Linux/ext4.
> 
> Bottom line: My patch affects uptimed at startup. Your patch affects
> uptimed on /each and every write/.

While I still think fsync() on Linux is a good thing, I have to admit
that my patch does *not* work. Maybe I should have done fsync() in all
places, but I am not convinced that this would have worked. Maybe with
current Ext4 the fsync() guarantee that I thought it gave is really
broken, even on Linux.

The console snippet below - partly stripped to 70 characters - also
clearly shows that a patch like the one proposed, which tells uptimed
never to overwrite its backup with an empty file, *is* necessary.

So my approach failed. But I wonder whether your approach would have done 
more good than my daily backups, since uptimed doesn't do a regular backup 
of the configuration, but only on stopping it, maybe also on starting it. 
Thus I would easily have lost more than about one day of my uptime
statistics. And I just cannot accept that it isn't possible to write a
few-KiB file in such a safe manner that it doesn't get truncated. This is
just insane.
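
For reference, a minimal sketch of the conventional write-to-temp, fsync(),
rename sequence for a small file like this (the function name, parameters and
error handling are illustrative assumptions, not uptimed's actual code):

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

/* Sketch: write the new content to a temporary file, force it to stable
 * storage, then atomically rename it over the live file.  A crash before
 * the rename leaves the old file untouched; a crash after it leaves the
 * new, already-synced file in place. */
static int write_small_file_safely(const char *dir, const char *path,
                                   const char *tmppath,
                                   const char *data, size_t len)
{
	FILE *f = fopen(tmppath, "w");
	int dirfd;

	if (!f)
		return -1;
	if (fwrite(data, 1, len, f) != len || fflush(f) != 0) {
		fclose(f);
		return -1;
	}
	if (fsync(fileno(f)) != 0) {	/* push data to stable storage */
		fclose(f);
		return -1;
	}
	if (fclose(f) != 0)
		return -1;
	if (rename(tmppath, path) != 0)	/* atomic replacement of the old file */
		return -1;

	/* Also sync the containing directory so the rename entry itself
	 * survives a crash; whether this is needed depends on the filesystem. */
	dirfd = open(dir, O_RDONLY);
	if (dirfd >= 0) {
		fsync(dirfd);
		close(dirfd);
	}
	return 0;
}

Whether a filesystem preserves that ordering without the explicit fsync()
before the rename is exactly what this thread is arguing about.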

Now trying to fix up the last boot record manually.


shambhala:~> uprecords
 #   Uptime | System
+---
->   1 0 days, 00:12:04 | Linux 2.6.38.5-tp42-snap  Wed May 11 20:20
+---
NewRec 0 days, 00:12:03 | since Wed May 11 20:20
up 0 days, 00:12:04 | since Wed May 11 20:20
  down 0 days, 00:00:00 | since Wed May 11 20:20
   %up  100.000 | since Wed May 11 20:20


shambhala:~> cd /var/spool/uptimed 
shambhala:/var/spool/uptimed> ls -lh
total 16K
-rw-r--r-- 1 daemon daemon  11 May 11 20:20 bootid
-rw-r--r-- 1 daemon daemon  62 May 11 20:31 records
-rw-rw-rw- 1 daemon daemon 757 Mar  4 21:10 records-2011-03-04
-rw-r--r-- 1 daemon daemon  62 May 11 20:26 records.old
shambhala:/var/spool/uptimed> /etc/init.d/uptimed stop
Stopping uptime daemon: uptimed.
shambhala:/var/spool/uptimed> cp -p 
/home/martin/Backup/uptimed/records-2011-05-10 .
shambhala:/var/spool/uptimed> ls -l
total 20
-rw-r--r-- 1 daemon daemon   11 May 11 20:20 bootid
-rw-r--r-- 1 daemon daemon   62 May 11 20:32 records
-rw-rw-rw- 1 daemon daemon  757 Mar  4 21:10 records-2011-03-04
-rw-rw-rw- 1 martin martin 3015 May 10 22:10 records-2011-05-10
-rw-r--r-- 1 daemon daemon   62 May 11 20:31 records.old
shambhala:/var/spool/uptimed> cp -p cp -p records.old records-2011-05-11
cp: target 'records-2011-05-11' is not a directory
shambhala:/var/spool/uptimed#1> cp -p records.old records-2011-05-11 
shambhala:/var/spool/uptimed> diff -u records

Bug#515653: fsync() based test version

2011-03-19 Thread Martin Steigerwald
On Saturday, 19 March 2011, Thibaut VARENE wrote:
> > You wrote you built it already. Do you still have that package
> > available? Then I'd test it after I am convinced that the fsync()
> > based version does what it should.
> 
> Well no, I don't have the test build anymore. Since you were able to
> test your own patch, I assumed you'd be capable of testing another
> one.

It's not about capability, it's about time. But since you don't have that
build anymore, you'd have to invest that time as well - not because you are
bothered by this issue, but because you have some responsibility for the
package, since you are the maintainer.

> Now, you may think I'm an arse, I don't care. The point is, you have
> your opinion and I have mine. In the end I'm not taking any action
> until the patch I took the pain to offer has been tested, and/or
> upstream author makes a call. After all, it's his piece of software.
> 
> I too, have better things to do than arguing over and over on the
> comparative merits of different approaches.

Well, it's your package after all. I might come back to testing the patch.
But considering the tone of this discussion - which I contributed to, I
admit, because I was so utterly fed up with this data loss issue remaining
unfixed for such a long time that it is not even funny anymore - I might
also take a break and just enjoy the fsync() version I made that works for
me.

I considered splitting the package's all-in-one patch into several patches
because your patch is no one-liner - mine was, and thus I thought I could
just get away with it. But then, I am not the package maintainer, and for
testing I could just have applied it on top.

Well, maybe when the next version of the package comes and I have to build
a new package anyway, I can take your patch instead of adding the fsync().
Or when I have enough spare time again that I feel like spending some of it
on this issue.

Aside from that, I can just continue maintaining my own version of the
package - adding the fsync() is easy enough, and it's free software after
all.

So the issue is settled for me; you might hear back from me with some test
results of your patch, or not.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7




Bug#515653: fsync() based test version

2011-03-19 Thread Thibaut VARENE
On Sat, Mar 19, 2011 at 5:47 PM, Martin Steigerwald  wrote:
> On Sunday, 06 March 2011, Thibaut VARENE wrote:

>> Would you be kind enough to test it? I don't crash my systems as often
>> as you do, and they're setup in a way that apparently makes it
>> impossible for me to reproduce this bug.
>
> I wanted to integrate it into the package properly - by splitting it up
> into different quilt patches before adding this patch - but then other things

Doesn't make sense to me: your patch was a 1-liner. Anyway...

> I am not likely to invest much work in this in the near future, as I will be
> holding trainings and doing lots of other things once my holidays end on Monday.

Good for you. I take it it's a negative answer to my previous inquiry?

> You wrote you built it already. Do you still have that package available?
> Then I'd test it after I am convinced that the fsync() based version does
> what it should.

Well no, I don't have the test build anymore. Since you were able to
test your own patch, I assumed you'd be capable of testing another
one.

> I am still not convinced that adding those checks alone is the correct
> solution. The original problem is that the records file is truncated to

[snipped blah]

> That said, I am still willing to test whether those checks will work as
> reliably as the fsync() did so far.

Then please do so, and kindly report when you've done it.

> I think software should be written with the irregular case in mind, and
> that this is a key factor that differentiates mediocre or quite good
> software from excellent software. Humans make errors, thus computers,
> their power sources, and programs constructed and developed by humans will
> fail, too. Software written without that in mind is asking for trouble.

Whatever. Even though you have a point, it would be silly to knock
down a fly with a hammer. When "fixes" for "irregular cases" get in
the way of "regular cases functioning", there is a problem. uptimed
must run on many platforms, not just on Linux/ext4.

Bottom line: My patch affects uptimed at startup. Your patch affects
uptimed on /each and every write/.

Now, you may think I'm an arse, I don't care. The point is, you have
your opinion and I have mine. In the end I'm not taking any action
until the patch I took the pain to offer has been tested, and/or
upstream author makes a call. After all, it's his piece of software.

I too, have better things to do than arguing over and over on the
comparative merits of different approaches.

kthxbye.






Bug#515653: fsync() based test version

2011-03-19 Thread Martin Steigerwald
On Sunday, 06 March 2011, Thibaut VARENE wrote:
> On Sun, Mar 6, 2011 at 6:52 PM, Martin Steigerwald wrote:
> > On Sunday, 06 March 2011, Thibaut VARENE wrote:
> >> As stated before, the correct solution would be to add another layer
> >> of checks during daemon startup, which would assert that the file
> >> it's reading is valid (i.e. to begin with "not empty" and "has
> >> parseable data"), and fall back to the backup copy otherwise. This,
> >> by design, is the correct approach and has /none/ of the drawbacks
> >> of your fsync() patch.
> >> 
> >> I would gladly review such a patch.
> > 
> > But that's at least an offer. And it would help BSD as well. Let's see
> > what I will do on the next rainy day during my holidays in Sevilla.
> > 
> > But until then I likely have a version that works, and I will fork it
> > that way as long as it serves the purpose of keeping my uptime
> > records data safe in an easy and simple way. This is likely to fix
> > the issue *now*, not sometime in the future. And at least I have
> > done something effective about it in the time I had at hand.
> 
> w/e
> 
> Here's an ugly half-cooked, built but otherwise TOTALLY UNTESTED
> braindump patch that tries to implement The Right Thing (tm).

Ok, fair enough. Thanks for the patch.

> Would you be kind enough to test it? I don't crash my systems as often
> as you do, and they're setup in a way that apparently makes it
> impossible for me to reproduce this bug.

I wanted to integrate it into the package properly - by splitting it up into
different quilt patches before adding this patch - but then other things
were more important to me. And since it is my holidays and my time, and the
fsync() based version has fared well up to now and survived some hibernation
hangs - I have more than 99% uptime again - I didn't do any work on this.

I am not likely to invest much work in this in the near future, as I will be
holding trainings and doing lots of other things once my holidays end on Monday.

You wrote you built it already. Do you still have that package available?
Then I'd test it after I am convinced that the fsync() based version does
what it should.


I am still not convinced that adding those checks alone is the correct
solution. The original problem is that the records file is truncated to
null bytes and then renamed before the data is written out. The fix
advertised for that - at least on Linux - by several Linux filesystem
developers and quite a few other Linux people is fsync().
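
As a hedged illustration of that failure mode (FILE_RECORDS, the paths and
the record format are taken from the patches and output in this thread; the
function name and everything else is a sketch, not uptimed's exact code):

#include <stdio.h>

#define FILE_RECORDS "/var/spool/uptimed/records"

/* Sketch of the write path as described: records go to a temp file,
 * which is then renamed over the live file without an fsync().  With
 * delayed allocation, a crash right after the renames can leave
 * FILE_RECORDS as a zero-length/null-byte file, because the new name
 * already points at data that only ever existed in the page cache. */
static void save_records_without_fsync(void)
{
	FILE *f = fopen(FILE_RECORDS".tmp", "w");

	if (!f)
		return;
	/* ... fprintf() one "uptime:boottime:sysname" line per record ... */
	fclose(f);                                /* data may still be cached  */
	rename(FILE_RECORDS, FILE_RECORDS".old"); /* old records become backup */
	rename(FILE_RECORDS".tmp", FILE_RECORDS); /* crash window: name updated
	                                             before data reaches disk  */
}

The fsync() patch and the startup checks discussed here attack that same
window from two different sides.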

I think - at least on Linux - fsync() is the way to go. Another approach
would be to find out why Ext4 is producing null-byte files even though it is
advertised as having a workaround for the rename case. But then, Ted has
always advised using fsync() even in the presence of this workaround.
Nonetheless I am willing to report an upstream kernel bug with a code
excerpt from uptimed and a hint to the advertised flush-on-rename workaround.

Those checks for a corrupt records file IMHO are actually a workaround for
this original problem. They could serve well as an addition to the fsync(),
because there may be other reasons for a corrupted records file, but they
also add even more complexity to the code than the fsync() approach does.

Also I think that the fsync() is more likely to give me *recent* uptime
data in case of a crash than the sanity checks are. I am not completely
sure about that yet, but that's what I currently think.

That said, I am still willing to test whether those checks will work as
reliably as the fsync() did so far.

I think software should be written with the irregular case in mind, and
that this is a key factor that differentiates mediocre or quite good
software from excellent software. Humans make errors, thus computers,
their power sources, and programs constructed and developed by humans will
fail, too. Software written without that in mind is asking for trouble.

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7




Bug#515653: fsync() based test version

2011-03-06 Thread Thibaut VARENE
On Sun, Mar 6, 2011 at 9:05 PM, Thibaut VARENE  wrote:
> On Sun, Mar 6, 2011 at 6:52 PM, Martin Steigerwald wrote:
>> On Sunday, 06 March 2011, Thibaut VARENE wrote:
>
>>> As stated before, the correct solution would be to add another layer of
>>> checks during daemon startup, which would assert that the file it's
>>> reading is valid (i.e. to begin with "not empty" and "has parseable
>>> data"), and fall back to the backup copy otherwise. This, by design,
>>> is the correct approach and has /none/ of the drawbacks of your
>>> fsync() patch.
>>>
>>> I would gladly review such a patch.
>>
>> But that's at least an offer. And it would help BSD as well. Let's see what
>> I will do on the next rainy day during my holidays in Sevilla.
>>
>> But until then I likely have a version that works, and I will fork it that
>> way as long as it serves the purpose of keeping my uptime records data
>> safe in an easy and simple way. This is likely to fix the issue *now*, not
>> sometime in the future. And at least I have done something effective about
>> it in the time I had at hand.
>
> w/e
>
> Here's an ugly half-cooked, built but otherwise TOTALLY UNTESTED
> braindump patch that tries to implement The Right Thing (tm).

For good measure, this line:
+   if (!useold && (filestat.st_size <= filestatold.st_size))
should be:
+   if (!useold && (filestat.st_size < filestatold.st_size))

but it's no big deal
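
A hedged reading of that one-character change: on a normal restart the
current and backup databases usually hold the same records and therefore the
same size, so equality must not be treated as corruption. Annotated excerpt
of the corrected line (the comment is an interpretation, not part of the
patch):

/* With "<=", st_size equality - the common case on a clean restart -
 * would set useold and make uptimed read the stale backup every time.
 * With "<", the backup is only preferred when the live records file is
 * strictly smaller, i.e. plausibly truncated. */
if (!useold && (filestat.st_size < filestatold.st_size))
	useold = 1;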

-- 
Thibaut VARENE
http://www.parisc-linux.org/~varenet/






Bug#515653: fsync() based test version

2011-03-06 Thread Thibaut VARENE
On Sun, Mar 6, 2011 at 6:52 PM, Martin Steigerwald  wrote:
> On Sunday, 06 March 2011, Thibaut VARENE wrote:

>> As stated before, the correct solution would be to add another layer of
>> checks during daemon startup, which would assert that the file it's
>> reading is valid (i.e. to begin with "not empty" and "has parseable
>> data"), and fall back to the backup copy otherwise. This, by design,
>> is the correct approach and has /none/ of the drawbacks of your
>> fsync() patch.
>>
>> I would gladly review such a patch.
>
> But that's at least an offer. And it would help BSD as well. Let's see what
> I will do on the next rainy day during my holidays in Sevilla.
>
> But until then I likely have a version that works, and I will fork it that
> way as long as it serves the purpose of keeping my uptime records data
> safe in an easy and simple way. This is likely to fix the issue *now*, not
> sometime in the future. And at least I have done something effective about
> it in the time I had at hand.

w/e

Here's an ugly half-cooked, built but otherwise TOTALLY UNTESTED
braindump patch that tries to implement The Right Thing (tm).

Would you be kind enough to test it? I don't crash my systems as often
as you do, and they're setup in a way that apparently makes it
impossible for me to reproduce this bug.

-- 
Thibaut VARENE
http://www.parisc-linux.org/~varenet/
diff -Nru uptimed-0.3.16.orig/libuptimed/urec.c uptimed-0.3.16/libuptimed/urec.c
--- uptimed-0.3.16.orig/libuptimed/urec.c	2009-01-02 00:46:00.0 +0100
+++ uptimed-0.3.16/libuptimed/urec.c	2011-03-06 21:01:28.0 +0100
@@ -215,20 +215,46 @@
 	time_t utime, btime;
 	long l_utime, l_btime;
 	char buf[256], sys[SYSMAX+1];
+	struct stat filestat, filestatold;
+	int useold = 0;
 	
-	f = fopen(FILE_RECORDS, "r");
-	if (!f) {
-		f = fopen(FILE_RECORDS".old", "r");
-		if (!f) return;
+	if (stat(FILE_RECORDS, &filestat))
+		useold = 1;
+	if (stat(FILE_RECORDS".old", &filestatold))
+		useold = -1;
+
+	/* assume that backupdb larger than normal db means normal is corrupted */
+	if (!useold && (filestat.st_size <= filestatold.st_size))
+		useold = 1;
 
-		printf("uptimed: reading from backup database %s.old\n", FILE_RECORDS);
+dbtry:
+	switch (useold) {
+		case 0:
+			f = fopen(FILE_RECORDS, "r");
+			break;
+		case 1:
+			f = fopen(FILE_RECORDS".old", "r");
+			printf("uptimed: reading from backup database %s.old\n", FILE_RECORDS);
+			break;
+		default:
+			/* this should probably terminate uptimed somehow */
+			printf("uptimed: no useable database found.\n");
+			return;
+	}
+			
+	if (!f) {
+		printf("uptimed: error opening database for reading.\n");
+		return;
 	}
 	
 	fgets(str, sizeof(str), f);
 	while (!feof(f)) {
 		/* Check for validity of input string. */
 		if (sscanf(str, "%ld:%ld:%[^]\n]", &l_utime, &l_btime, buf) != 3) {
-			/* Skip this entry. Do we want feedback here? */
+			/* database is corrupted */
+			fclose(f);
+			useold++;
+			goto dbtry;
 		} else {
 			utime = (time_t)l_utime;
 			btime = (time_t)l_btime;
diff -Nru uptimed-0.3.16.orig/libuptimed/urec.h uptimed-0.3.16/libuptimed/urec.h
--- uptimed-0.3.16.orig/libuptimed/urec.h	2009-01-02 00:46:00.0 +0100
+++ uptimed-0.3.16/libuptimed/urec.h	2011-03-06 20:48:49.0 +0100
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef PLATFORM_LINUX
 #include 


Bug#515653: fsync() based test version

2011-03-06 Thread Martin Steigerwald
On Sunday, 06 March 2011, Thibaut VARENE wrote:
> On Sat, Mar 5, 2011 at 4:22 PM, Martin Steigerwald wrote:
> > Since, after losing uptime records once again due to a crash while
> > testing kernels, I am so through with it that it is not even funny
> > anymore, Theodore Ts'o said that one should not fear the fsync() [1]
> > - especially not with Ext4 - and I prefer not losing uptime records
> > over and over and over again, I built a test version using fsync() at
> > the location this bug report is about:
> 
> As far as I understand it, upstream isn't willing to consider a fsync()
> patch which is /bad/ for all the previously mentioned reasons, which
> the post you quote only marginally addresses. Uptimed doesn't need
> atomicity nor durability: it keeps a *backup* of its previous
> database. Also, to put things in a little more perspective: uptimed
> doesn't only run on Linux. It runs on a variety of other platforms,
> where fsync() may have a greater cost than what Ted suggests. And
> finally, and probably more to the point:
> 
> FSYNC(2)                 BSD System Calls Manual                 FSYNC(2)
> 
> [...]
> 
>  Note that while fsync() will flush all data from the host to the
> drive (i.e. the "permanent storage device"), the drive itself may not
> physically write the data to the platters for quite some time and it
> may be written in an out-of-order sequence.
> 
>  Specifically, if the drive loses power or the OS crashes, the
> application may find that only some or none of their data was written.
> The disk drive may also re-order the data so that later writes may be
> present, while earlier writes are not.

This explains that BSD does not have any notion of barriers or explicit
cache flushes. Which is a pity. But to some extent it is also an OS problem.
If the OS doesn't guarantee anything, how could the application?

Honestly, what's the point of an fsync() if it doesn't work? Then they could
just as well implement it as return 0.

> This explains that even fsync() cannot /certify/ that the data will hit
> the disk in the event of a crash (this is especially true with the
> larger disk caches nowadays).

Well, that the data is *immediately written* is not the point of it at all,
as I understand it. But to my understanding, at least Linux will guarantee
that the fsync() has happened *before* the file is renamed. And that's the
only guarantee that is important here. That's what barriers were made for up
to Linux 2.6.36, and that's also what the explicit cache flushes since
kernel 2.6.37 were made for.

That's the whole point of it: a guaranteed order of writes down to the
disk, including explicit cache flushes of the disk cache or FUA
requests.

> I'm absolutely against such a patch, which is the wrong solution to this
> problem (and no, I'm not going to add a patch to tune for a
> specific filesystem - not everyone uses ext4 - especially not to work
> around system crashes, which, *again*, do not constitute a "normal use
> of the system").

Systems crash. They aren't perfect. That's a reality. Desktop machines,
where people test this and that, likely crash more often than servers, but
a server could also face a power outage or whatnot. Either software is
written with something like that in mind, or the software is broken. That's
at least my opinion on the matter.

Rejecting a simple fix that potentially fixes a data loss issue like this
doesn't help to solve the problem. Maybe the fix could be guarded by a
conditional define so that it is only compiled in on Linux targets.

> As stated before, the correct solution would be to add another layer of
> checks during daemon startup, which would assert that the file it's
> reading is valid (i.e. to begin with "not empty" and "has parseable
> data"), and fall back to the backup copy otherwise. This, by design,
> is the correct approach and has /none/ of the drawbacks of your
> fsync() patch.
> 
> I would gladly review such a patch.

But that's at least an offer. And it would help BSD as well. Let's see what
I will do on the next rainy day during my holidays in Sevilla.

But until then I likely have a version that works, and I will fork it that
way as long as it serves the purpose of keeping my uptime records data
safe in an easy and simple way. This is likely to fix the issue *now*, not
sometime in the future. And at least I have done something effective about
it in the time I had at hand.

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7




Bug#515653: fsync() based test version

2011-03-06 Thread Thibaut VARENE
On Sat, Mar 5, 2011 at 4:22 PM, Martin Steigerwald wrote:

> Since, after losing uptime records once again due to a crash while testing
> kernels, I am so through with it that it is not even funny anymore,
> Theodore Ts'o said that one should not fear the fsync() [1] - especially
> not with Ext4 - and I prefer not losing uptime records over and over and
> over again, I built a test version using fsync() at the location this bug
> report is about:
>

As far as I understand it, upstream isn't willing to consider a fsync()
patch which is /bad/ for all the previously mentioned reasons, which the
post you quote only marginally addresses. Uptimed doesn't need atomicity nor
durability: it keeps a *backup* of its previous database. Also, to put
things in a little more perspective: uptimed doesn't only run on Linux. It
runs on a variety of other platforms, where fsync() may have a greater cost
than what Ted suggests. And finally, and probably more to the point:

FSYNC(2)                 BSD System Calls Manual                 FSYNC(2)

[...]

 Note that while fsync() will flush all data from the host to the drive
 (i.e. the "permanent storage device"), the drive itself may not physi-
 cally write the data to the platters for quite some time and it may be
 written in an out-of-order sequence.

 Specifically, if the drive loses power or the OS crashes, the application
 may find that only some or none of their data was written.  The disk
 drive may also re-order the data so that later writes may be present,
 while earlier writes are not.

This explains that even fsync() cannot /certify/ that the data will hit the
disk in the event of a crash (this is especially true with the larger disk
caches nowadays).

I'm absolutely against such a patch, which is the wrong solution to this
problem (and no, I'm not going to add a patch to tune for a specific
filesystem - not everyone uses ext4 - especially not to work around system
crashes, which, *again*, do not constitute a "normal use of the system").

As stated before, the correct solution would be to add another layer of
checks during daemon startup, which would assert that the file it's reading
is valid (i.e. to begin with "not empty" and "has parseable data"), and fall
back to the backup copy otherwise. This, by design, is the correct approach
and has /none/ of the drawbacks of your fsync() patch.

I would gladly review such a patch.

HTH

-- 
Thibaut VARENE
http://www.parisc-linux.org/~varenet/


Bug#515653: fsync() based test version

2011-03-05 Thread Martin Steigerwald
Since, after losing uptime records once again due to a crash while testing
kernels, I am so through with it that it is not even funny anymore,
Theodore Ts'o said that one should not fear the fsync() [1] - especially
not with Ext4 - and I prefer not losing uptime records over and over and
over again, I built a test version using fsync() at the location this bug
report is about:

--- uptimed-0.3.16.orig/libuptimed/urec.c
+++ uptimed-0.3.16/libuptimed/urec.c
@@ -263,6 +263,7 @@
if ((max > 0) && (++i >= max)) break;
}
}
+   fsync(f);
fclose(f);
rename(FILE_RECORDS, FILE_RECORDS".old");
rename(FILE_RECORDS".tmp", FILE_RECORDS);
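
One general observation, not a claim about why this particular build behaves
the way it does: fsync(2) takes a file descriptor rather than a stdio stream,
so with a FILE * the usual pattern is to flush the stream first and then sync
its descriptor, roughly like this sketch (flush_and_sync is a hypothetical
helper, not part of the patch):

#include <stdio.h>
#include <unistd.h>

/* Sketch: push stdio's user-space buffers to the kernel, then ask the
 * kernel to push the file's data to stable storage, before fclose()
 * and the renames. */
static int flush_and_sync(FILE *f)
{
	if (fflush(f) != 0)		/* stdio buffers -> kernel */
		return -1;
	return fsync(fileno(f));	/* page cache -> disk */
}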

If you like to see whether it helps, please find it on my server:

http://martin-steigerwald.de/tmp/debian/uptimed-fsync-bug-515653/

I will do some long term testing with it.

Additionally, uptimed also should not take a shorter records file if a
longer backup is available, but I leave this alone for now.

[1] http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/

Ciao,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

