http://bugzilla.spamassassin.org/show_bug.cgi?id=3872

------- Additional Comments From [EMAIL PROTECTED]  2004-10-09 07:19 -------
Subject: Re:  SA 3.0 creates randomly extreme big bayes_journal

On Sat, Oct 09, 2004 at 12:02:32AM -0700, [EMAIL PROTECTED] wrote:
> I wonder if there is any relationship between creating random huge journals 
> and SA randomly growing suddenly to 250MB+.

BTW: from someone experiencing this issue, I'd be interested in getting a
copy (compressed, of course!) of the extremely large journal, and/or seeing
the output of "ls -las", "stat", etc.

There's only one way I can think of for a journal file to "instantly"
grow to a large size (in normal usage, it grows at a rate proportional
to the amount of mail being processed).  When a journal write occurs
and a failure is detected, the code truncate()s the file back to
its size before the write (see BayesStore/DBM::cleanup()).  The
truncate() is accompanied by a warning:

"bayes: partial write to bayes journal $path ($len of $nbytes), recovering"

If this process somehow gets screwed up, the truncate() could actually make
the file larger and create what is known as a sparse file, i.e.: a bunch of
data, a bunch of nothing (the OS typically returns nulls when reading from
the section that doesn't actually exist on disk), and a bunch of data.

This behavior is new in 3.0.0.  In 2.6, a partial write of journal data
would be detected, and the code would internally jump ahead to the part
that wasn't written and try again.  That approach could potentially lead
to multiple writers clobbering each other.  Clobbering is still possible
on a partial write in 3.0.0, but at least the journal file should be
truncate()d back to a known good state.
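
For contrast, a sketch of the 2.6-style retry (illustrative only, not the
actual 2.6 code):

  # keep retrying from wherever the last write stopped
  my $written = 0;
  my $total   = length($entry);
  while ($written < $total) {
    my $len = syswrite(JOURNAL, $entry, $total - $written, $written);
    last unless defined $len;    # hard error: give up
    $written += $len;            # jump ahead past what made it to disk
  }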

For examples of creating sparse files:

$ perl -e 'open(T, ">foo"); print T "hi"; truncate(T,256*1048576); close(T);'
$ ls -las foo
   4 -rw-r--r--    1 tvd      wheel    268435456 Oct  9 09:47 foo
$ stat foo
  File: `foo'
  Size: 268435456       Blocks: 8          IO Block: 4096   Regular File
Device: 803h/2051d      Inode: 212997      Links: 1    
Access: (0644/-rw-r--r--)  Uid: ( 1113/     tvd)   Gid: (   10/   wheel)
Access: 2004-10-09 09:47:55.000000000 -0400
Modify: 2004-10-09 09:47:55.000000000 -0400
Change: 2004-10-09 09:47:55.000000000 -0400

So the file only actually has a small number of blocks used on disk
to store "hi" (see below for comments about the actual space usage),
but the filesystem reports the file as 256MB.

In contrast:

$ perl -e 'open(T, ">foo"); print T "hi"x(128*1048576); close(T);'
$ ls -las foo
262404 -rw-r--r--    1 tvd      wheel    268435456 Oct  9 09:50 foo
$ stat foo
  File: `foo'
  Size: 268435456       Blocks: 524808     IO Block: 4096   Regular File
Device: 803h/2051d      Inode: 212997      Links: 1    
Access: (0644/-rw-r--r--)  Uid: ( 1113/     tvd)   Gid: (   10/   wheel)
Access: 2004-10-09 09:47:55.000000000 -0400
Modify: 2004-10-09 09:50:52.000000000 -0400
Change: 2004-10-09 09:50:52.000000000 -0400

This file actually contains 256MB of data.  Notice that the space used
on disk is correspondingly much higher.

(The ls vs. stat output may be a little confusing with respect to blocks:
ls reports 1k blocks, stat reports 512-byte blocks, and the actual
filesystem block size (the smallest allocatable unit in the FS) is 4096
bytes.  So in the truncate() version there is only 1 FS block allocated
for the file (4 x 1k == 1 x 4k), while the second version has 65601 FS
blocks allocated (262404 x 1k blocks == 65601 x 4k blocks).)
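
A quick way to spot a sparse file, then, is to compare the apparent size
against the allocated size.  Assuming Perl's stat() reports blocks in
512-byte units (true on Linux and most Unixes):

$ perl -e '@s = stat("foo"); printf "size=%d bytes, on disk=%d bytes\n", $s[7], $s[12]*512;'

For the truncate() version above, that prints "size=268435456 bytes, on
disk=4096 bytes"; for the fully-written version, both numbers come out
close to 256MB.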
