Re: kmail corrupts emails [solved]

2005-09-28 Thread Rigo Wenning

On Wednesday 28 September 2005 18:59, Reinhold Kainhofer wrote:
> He? MBox is just the text of all messages concatenated together. Are
> you sure that other MUAs store the message status in there? Unless
> I'm mistaken, the mbox format doesn't support this.

Come on, there are all sorts of X-Headers in emails. The format is quite
old and the
>
> BTW, about which version of mbox are you talking exactly? There are
> several different mbox formats out there:
> http://homepages.tesco.net./~J.deBoynePollard/FGA/mail-mbox-formats.html
> Each of them is a bit different so you will run into problems when
> you access them with different MUAs that support a different mbox
> format.

I just learned recently that mbox is not a very well-defined format.
That's why in my email, if you read it carefully, I said I will switch
to maildir for all my folders.

Nevertheless, the thread in KDE's bugzilla was rather revealing:
http://bugs.kde.org/show_bug.cgi?id=37898

In fact there is not much to do to actually fix it. I should start a 
fund collection to fix the issue.
>
> > > Valid info
> > > should be written back to the maildir
>
> Are you now talking about maildir or mbox?

That means I talked about maildir. And no, it is not ok to force people 
to only touch email with KMail by storing stuff in the index when those 
indexes get corrupted by touching stuff with other agents. 

> Anyway, neither maildir nor mbox store the status flags directly in
> the message. mbox uses the index file, maildir uses the file name
> (for those flags supported by Maildir). And some info, like the
> "To-Do" flag in kmail, cannot be saved into it at all. They can only
> be stored in the local index file.

I don't think this is correct, as they can be stored in X-Headers.
Whatever format (index or X-Header) one would choose, it would have to
respect the common practice that one touches email archives with more
than one user agent.
>
> > > and KMail should not complain
> > > when opening a maildir touched by another agent.
> >
> > You could open a bugzilla change request and get us all to vote for
> > it !
>
> Even better: Provide a patch...

IAAL, so a patch from me would be bad. If I could, I would have provided
one a year ago. It all started with corrupted files... (see the KDE
bugzilla entry above). If I provided a patch, it would not only be the
files that get corrupted ;)

I think I will create a bugzilla change request and see what happens.
When I complained last time, the only argument was missing support and
developer time (a very relevant argument in open source). But I haven't
managed to find someone sufficiently literate in the KMail sources to
support my request without funding involved. I will have to keep
looking.

Best, 

Rigo




Re: kmail corrupts emails [solved]

2005-09-28 Thread Derek Broughton
Reinhold Kainhofer wrote:

> On Wednesday 28 September 2005 18:31, Nick Leverton wrote:
>> On Wed, Sep 28, 2005 at 06:26:29PM +0200, Rigo Wenning wrote:
>> > Thanks a lot, will do too,
>> >
>> > but it remains that I need allies in convincing KMail developers to use
>> > their index only for caching and not for valid information.
> 
> He? MBox is just the text of all messages concatenated together. Are you
> sure that other MUAs store the message status in there? Unless I'm
> mistaken, the mbox format doesn't support this.

It could be stored as an X- header in the message - but kmail can hardly be
blamed if somebody else uses a completely non-standard way to mark status.
-- 
derek


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]



Re: kmail corrupts emails [solved]

2005-09-28 Thread Pete Jewell
Derek Broughton wrote:
> Pete Jewell wrote:
> 
> 
>>However, ReiserFS is *much* more efficient when you have thousands of
>>files in one directory, because it uses a hashing algorithm to determine
>>where the required file is (or starts) in the filesystem.  This is
>>something I know about (hashing) based on my experience with Pick
>>database systems, which also use hashing and are incredibly fast at
>>keyed record retrieval (as well as entire file/table traversal).
> 
> 
> There's a blast from the past.  I did too much work with Pick databases... I
> managed not to learn anything more about hashing, too.

There's no such thing as /too much work with Pick databases/ ;-)

Actually, the company I work for are just about to start a pilot of a
system that will replace the D3 one I maintain.  If all goes well it'll
get turned off in a couple of years.  To be honest I'm looking forward
to cross-training onto the new system - Pick permanent jobs are becoming
harder and harder to come by (not to mention having to travel - I do a
150 mile round trip each day to get to work and back).  Contract work is
a little easier to find (lots of tending existing systems, while they're
being replaced!).

That being said I still really enjoy working in Pick - you can go from
the client's initial idea to something tangible so quickly, they think
that you're some sort of genius (considering how long it takes their
requests to get actioned on the SQL backed systems!) :-)

-- 
PeteJ





Re: kmail corrupts emails [solved]

2005-09-28 Thread Reinhold Kainhofer
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Wednesday 28 September 2005 18:59, Reinhold Kainhofer wrote:
> On Wednesday 28 September 2005 18:31, Nick Leverton wrote:
> > On Wed, Sep 28, 2005 at 06:26:29PM +0200, Rigo Wenning wrote:
> > > Thanks a lot, will do too,
> > >
> > > but it remains that I need allies in convincing KMail developers to use
> > > their index only for caching and not for valid information.
>
> He? MBox is just the text of all messages concatenated together. Are you
> sure that other MUAs store the message status in there? Unless I'm
> mistaken, the mbox format doesn't support this.

Sorry, I should be more exact: The mbox format (or rather the email message 
format, since mbox is just messages concatenated together) has a Status field 
for some status flags, but not for others such as "message was deleted" or 
"message is a to-do". Those can only be stored in the index file or not at 
all.
Mutt for example doesn't store that at all. You can either purge the messages 
when you leave mutt (in which case, the mbox file is rewritten... Great for a 
huge folder!) or if you don't purge them, they are there again when you next 
start mutt.
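
To make the Status point concrete, here is a minimal Python sketch of flags travelling inside the message headers of an mbox file. It uses Python's standard mailbox module, which keeps R (read) and O (old) in the Status header and D, F and A in a separate X-Status header; whether a given MUA honours X-Status is another matter. The file name, address and subject are made up for the illustration, and none of this is KMail or Mutt code.

```python
import mailbox
import os
import tempfile


def write_demo_mbox(path):
    """Create a one-message mbox whose flags live in the message headers."""
    box = mailbox.mbox(path)
    msg = mailbox.mboxMessage()
    msg["From"] = "alice@example.org"   # made-up address
    msg["Subject"] = "hello"
    msg.set_payload("body\n")
    msg.set_flags("RO")   # R (read) and O (old) end up in the Status header
    msg.add_flag("A")     # A (answered) ends up in the X-Status header
    box.add(msg)
    box.close()           # close() flushes pending changes to disk


def flags_by_subject(path):
    """Return {subject: flag string} as read back from the mbox file."""
    box = mailbox.mbox(path)
    try:
        return {msg["Subject"]: msg.get_flags() for msg in box}
    finally:
        box.close()


def demo():
    """Write the demo mbox into a temporary directory and read its flags back."""
    path = os.path.join(tempfile.mkdtemp(), "demo.mbox")
    write_demo_mbox(path)
    return flags_by_subject(path)
```

Calling demo() reads the flags back from the file alone, with no separate index involved.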

Cheers,
Reinhold


- -- 
- --
Reinhold Kainhofer, Vienna University of Technology, Austria
email: [EMAIL PROTECTED], http://reinhold.kainhofer.com/
 * Financial and Actuarial Mathematics, TU Wien, http://www.fam.tuwien.ac.at/
 * K Desktop Environment, http://www.kde.org, KOrganizer maintainer
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFDOtBqTqjEwhXvPN0RAnL9AKCJFRCJ4/mqbOuvPLnjRGR76JebpACfZmpU
QtK6E1T0bnvYA//W3vqzF/k=
=vIPO
-END PGP SIGNATURE-





Re: kmail corrupts emails [solved]

2005-09-28 Thread Reinhold Kainhofer
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Wednesday 28 September 2005 18:31, Nick Leverton wrote:
> On Wed, Sep 28, 2005 at 06:26:29PM +0200, Rigo Wenning wrote:
> > Thanks a lot, will do too,
> >
> > but it remains that I need allies in convincing KMail developers to use
> > their index only for caching and not for valid information. 

He? MBox is just the text of all messages concatenated together. Are you sure 
that other MUAs store the message status in there? Unless I'm mistaken, the 
mbox format doesn't support this. 

BTW, about which version of mbox are you talking exactly? There are several 
different mbox formats out there: 
http://homepages.tesco.net./~J.deBoynePollard/FGA/mail-mbox-formats.html
Each of them is a bit different so you will run into problems when you access 
them with different MUAs that support a different mbox format.

> > Valid info 
> > should be written back to the maildir 

Are you now talking about maildir or mbox?
Anyway, neither maildir nor mbox store the status flags directly in the 
message. mbox uses the index file, maildir uses the file name (for those 
flags supported by Maildir). And some info, like the "To-Do" flag in kmail, 
cannot be saved into it at all. They can only be stored in the local index 
file.
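
As a rough illustration of "maildir uses the file name", the sketch below stores a message with Python's standard mailbox module and returns the name it lands under in cur/. The subject, the flag choice and the temporary path are assumptions for the example; this is not how KMail itself writes maildirs.

```python
import mailbox
import os
import tempfile


def add_with_flags(maildir_path, subject, flags):
    """Add a message to a Maildir and return the file name it was stored under."""
    box = mailbox.Maildir(maildir_path, create=True)
    msg = mailbox.MaildirMessage()
    msg["Subject"] = subject
    msg.set_payload("body\n")
    msg.set_subdir("cur")   # flags belong to messages that live in cur/
    msg.set_flags(flags)    # e.g. "RS" = replied + seen
    box.add(msg)
    box.close()
    # The maildir was empty before, so cur/ now holds exactly one file.
    return os.listdir(os.path.join(maildir_path, "cur"))[0]


def demo():
    """Store one replied+seen message in a fresh temporary Maildir."""
    path = os.path.join(tempfile.mkdtemp(), "Maildir")
    return add_with_flags(path, "hello", "SR")
```

The returned name ends in ":2,RS": the flag letters are sorted and appended after the ":2," marker, so any agent that can list the directory can see them.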

> > and KMail should not complain 
> > when opening a maildir touched by another agent.
>
> You could open a bugzilla change request and get us all to vote for it !

Even better: Provide a patch...

Cheers,
Reinhold

PS: Regarding the mbox is better/faster/whatever: 
http://www.courier-mta.org/mbox-vs-maildir/
"The final conclusion is that -- except in some specific instances -- using 
maildirs will be just as fast -- and in sometimes much faster -- than mbox 
files, while placing less of a load on the rest of the mail system."

- -- 
- --
Reinhold Kainhofer, Vienna University of Technology, Austria
email: [EMAIL PROTECTED], http://reinhold.kainhofer.com/
 * Financial and Actuarial Mathematics, TU Wien, http://www.fam.tuwien.ac.at/
 * K Desktop Environment, http://www.kde.org, KOrganizer maintainer
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFDOswGTqjEwhXvPN0RAle4AKDNid5R83FacfOE/eqw2as85S8XOQCgtANL
Ga5IvjdD8CrKbWlcKTiInXk=
=/zmr
-END PGP SIGNATURE-





Re: kmail corrupts emails [solved]

2005-09-28 Thread Nick Leverton
On Wed, Sep 28, 2005 at 06:26:29PM +0200, Rigo Wenning wrote:
> Thanks a lot, will do too, 
> 
> but it remains that I need allies in convincing KMail developers to use 
> their index only for caching and not for valid information. Valid info 
> should be written back to the maildir and KMail should not complain 
> when opening a maildir touched by another agent.

You could open a bugzilla change request and get us all to vote for it !
It's the major reason why I don't use kmail, even though I love the rest
of KDE.

Nick





Re: kmail corrupts emails [solved]

2005-09-28 Thread Rigo Wenning
Thanks a lot, will do too, 

but it remains that I need allies in convincing KMail developers to use 
their index only for caching and not for valid information. Valid info 
should be written back to the maildir and KMail should not complain 
when opening a maildir touched by another agent.

Best, 

Rigo

On Wednesday 28 September 2005 16:38, David Martínez Moreno wrote:
> I changed all my KMail folders to Maildir format a couple of
> years ago in order to avoid corruption.




Re: kmail corrupts emails [solved]

2005-09-28 Thread David Martínez Moreno
On Wednesday, 28 September 2005 15:24, Rigo Wenning wrote:
> I complained about this a long time ago to KMail developers. KMail does
> not store the metadata (Flags and the like) back into the file, but
> only in the index files. (For some performance reason, but Mutt is
> faster and does it)
>
> Now once touched with Mutt, Mutt writes back the metadata into the mbox,
> the KMail indexes are corrupt and one loses all the flags (read-replied
> deleted, etc). I don't know if this is also true for the
> maildir-format, but KMail is definitely locking in the user with this
> behavior. If you only have a terminal, you can't read your mail remote.
> The only way out is using IMAP and accessing it from different clients.

No, it changes the filenames, as far as I know:

-rw-rw-r--  1 ender ender  1469 sep 18  2002 1032160529.8617.MYBT:2,RS
-rw-rw-r--  1 ender ender  2083 sep 18  2002 1032370719.16260.amNs:2,S
-rw-rw-r--  1 ender ender  2527 ene 17  2003 1032970004.13325.X1L8:2,S
-rw-rw-r--  1 ender ender  2035 abr 28  2003 1051531899.16760.dQzI:2,S
-rw-rw-r--  1 ender ender  2444 sep 17  2003 1052062900.16760.DkV0:2,S
-rw-rw-r--  1 ender ender  3111 oct 15  2004 1097478594.6170.ZTHEL:2,S
-rw-rw-r--  1 ender ender 10272 jul 26 16:20 1122387654.3012.qLs7u:2,S

See the stuff added at the end. With it you can recreate the state of your
folder without problem.
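
A small sketch of how that suffix can be decoded, assuming the standard Maildir flag letters (D, F, P, R, S, T); the function name is made up for the illustration.

```python
# Standard Maildir flag letters and their meanings.
MAILDIR_FLAGS = {
    "D": "draft",
    "F": "flagged",
    "P": "passed",
    "R": "replied",
    "S": "seen",
    "T": "trashed",
}


def flags_from_filename(filename):
    """Decode the ':2,<flags>' suffix of a Maildir file name into flag names."""
    _, marker, letters = filename.rpartition(":2,")
    if not marker:   # no info suffix, so no flags are recorded in the name
        return []
    # Unknown letters are passed through unchanged rather than dropped.
    return [MAILDIR_FLAGS.get(letter, letter) for letter in letters]
```

For the first file in the listing, flags_from_filename("1032160529.8617.MYBT:2,RS") gives ["replied", "seen"].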

I changed all my KMail folders to Maildir format a couple of years ago in
order to avoid corruption.

Regards,


Ender.
-- 
I'm packing your extra pair of shoes and your ANGRY eyes!
-- Mrs. Potato to Mr. Potato (Toy Story 2).
--
Debian developer




Re: kmail corrupts emails [solved]

2005-09-28 Thread Rigo Wenning
I complained about this a long time ago to KMail developers. KMail does 
not store the metadata (Flags and the like) back into the file, but 
only in the index files. (For some performance reason, but Mutt is 
faster and does it)

Now once touched with Mutt, Mutt writes back the metadata into the mbox, 
the KMail indexes are corrupt and one loses all the flags (read, replied, 
deleted, etc.). I don't know if this is also true for the 
maildir format, but KMail is definitely locking in the user with this 
behavior. If you only have a terminal, you can't read your mail remotely. 
The only way out is using IMAP and accessing it from different clients.

If it weren't for the Kontact integration, I would have abandoned 
KMail for that reason, as I hate lock-in scenarios. AFAIK, this hasn't 
been fixed. If I get more time, I will try again with samples to use 
Mutt and KMail together over a folder in maildir format.

Best, 

Rigo


On Friday 23 September 2005 19:29, Theo Schmidt wrote:
> >Here, from directory ~/.Mail, is the ls -al listing for one (mbox)
> > folder (named tldp) with the three indexes.  To start, you might
> > want to experiment with just one mail folder.  Delete all three of
> > the .index files, then restart kmail (assuming you shut it down)
> > and try to access the mail in that folder.  After a short delay,
> > I'm hopeful that it will be ok.
> >
> >-rw-------   1 rhk rhk   868526 Sep 19 16:05 tldp
> >-rw-------   1 rhk rhk    99043 Sep 19 16:05 .tldp.index
> >-rw-r--r--   1 rhk rhk      937 Sep 19 16:05 .tldp.index.ids
> >-rw-r--r--   1 rhk rhk     5425 Aug 24 07:28 .tldp.index.sorted
> >  
>
> This has solved it, thanks!  Kmail used to compact mailboxes on
> closing; it looks like it no longer does so.
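
For anyone who wants to try the quoted index reset without deleting anything outright, here is a cautious Python sketch. The three suffixes are taken from the listing above; the backup directory and the function names are assumptions of mine, and KMail should be shut down first, as the quoted advice says.

```python
import os
import shutil

# Index suffixes as they appear in the quoted ls listing.
INDEX_SUFFIXES = (".index", ".index.ids", ".index.sorted")


def index_files_for(folder_path):
    """Return the hidden index file paths that sit next to an mbox folder file."""
    directory, name = os.path.split(folder_path)
    return [os.path.join(directory, "." + name + suffix) for suffix in INDEX_SUFFIXES]


def move_indexes_aside(folder_path, backup_dir):
    """Move existing index files into backup_dir; return the names that were moved."""
    os.makedirs(backup_dir, exist_ok=True)
    moved = []
    for path in index_files_for(folder_path):
        if os.path.exists(path):
            shutil.move(path, os.path.join(backup_dir, os.path.basename(path)))
            moved.append(os.path.basename(path))
    return moved
```

The mbox file itself (tldp in the listing) is never touched; only the index files next to it are moved, so they can be put back if the rebuild goes wrong.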




Re: kmail corrupts emails [solved]

2005-09-28 Thread Derek Broughton
Pete Jewell wrote:

> However, ReiserFS is *much* more efficient when you have thousands of
> files in one directory, because it uses a hashing algorithm to determine
> where the required file is (or starts) in the filesystem.  This is
> something I know about (hashing) based on my experience with Pick
> database systems, which also use hashing and are incredibly fast at
> keyed record retrieval (as well as entire file/table traversal).

There's a blast from the past.  I did too much work with Pick databases... I
managed not to learn anything more about hashing, too.
-- 
derek





Re: kmail corrupts emails [solved]

2005-09-27 Thread Pete Jewell
Randy Kramer wrote:
> On Tuesday 27 September 2005 01:44 pm, Pete Jewell wrote:
> 
>>However, ReiserFS is *much* more efficient when you have thousands of
>>files in one directory, because it uses a hashing algorithm to determine
>>where the required file is (or starts) in the filesystem.  This is
>>something I know about (hashing) based on my experience with Pick
>>database systems, which also use hashing and are incredibly fast at
>>keyed record retrieval (as well as entire file/table traversal).
>>
> 
> 
> Thanks to Derek, Hendrik, and Pete for the replies!
> 
> Is there a chance that the hash for a ReiserFS can become corrupted like the 
> index for a mbox file can be?  Or maybe I should ask it differently, because 
> presumably something can happen to make it corrupted--does Reisers have some 
> better error detection / correction / recovery for the hash than is typical 
> of an index for an mbox file?
> 
> (Maybe I need to go read up on Reiser, and join a Reiser list. ;-)

The beauty of a hash that is used to locate a file, or record, is that
it is based on the key, or filename in the case of ReiserFS (actually,
not having looked at the internals of ReiserFS I'm assuming it's the
filename - it could conceivably be anything that is stored against the
inode).  The important thing is that we have something that we can use
within a mathematical expression to determine where on the filesystem
the beginning of the file can be found.


In Pick the filesystem (and memory) is organised into frames.  The
location of a piece of data within that space can be determined
mathematically, at least to a pointer that shows where the data starts.
The frames are usually quite small so that there's no performance hit
in loading it into memory and scanning it for the exact location within
it (helped with the use of delimiters between tables, records, fields,
and even fields within fields).  The version of Pick I use at work (D3)
uses 4K frames, but they've ranged from 512 bytes upwards in various
implementations over the years.


As such, finding the file is not as susceptible to corruption (unless
the inode gets trashed, same risk there), as there is no index.  Time to
retrieval is drastically reduced as there's no index to follow, just
some integer math.  However the files themselves wouldn't inherently be
less susceptible to corruption - but that's where a (full) journal can
help out.
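
The idea of computing a location from the key, rather than following an index, can be sketched as a toy in Python. This is only an illustration of hashed lookup in general: the class, the bucket count and the use of CRC32 are my own assumptions, and it makes no claim about the actual on-disk layout of ReiserFS or Pick.

```python
import zlib


class HashedDirectory:
    """Toy directory: the bucket ("frame") for a name is computed, not searched for."""

    def __init__(self, bucket_count=64):
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket_for(self, name):
        # Integer math on the key picks the bucket directly.
        index = zlib.crc32(name.encode("utf-8")) % len(self.buckets)
        return self.buckets[index]

    def add(self, name, data):
        # Duplicate names are not checked in this toy version.
        self._bucket_for(name).append((name, data))

    def lookup(self, name):
        # Only one small bucket is scanned, however many entries exist overall.
        for entry_name, data in self._bucket_for(name):
            if entry_name == name:
                return data
        raise KeyError(name)
```

With a thousand entries spread over 64 buckets, a lookup scans only the handful of entries in one bucket, which is the "small frame, then scan within it" behaviour described above.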

HTH

-- 
PeteJ





Re: kmail corrupts emails [solved]

2005-09-27 Thread Randy Kramer
On Tuesday 27 September 2005 01:44 pm, Pete Jewell wrote:
> However, ReiserFS is *much* more efficient when you have thousands of
> files in one directory, because it uses a hashing algorithm to determine
> where the required file is (or starts) in the filesystem.  This is
> something I know about (hashing) based on my experience with Pick
> database systems, which also use hashing and are incredibly fast at
> keyed record retrieval (as well as entire file/table traversal).
>
> I've used ReiserFS in the past mainly for its journalling capability,
> which at the time was more complete than ext3's (this was on a RH6.2
> system with the 2.4.x series kernels).  As my customers at the time were
> very likely to simply switch the system off (for any reason, including
> not knowing how a particular application works that they'd wandered
> into), this feature saw a lot of (successful) use!

Thanks to Derek, Hendrik, and Pete for the replies!

Is there a chance that the hash for a ReiserFS can become corrupted like the 
index for a mbox file can be?  Or maybe I should ask it differently, because 
presumably something can happen to make it corrupted--does Reisers have some 
better error detection / correction / recovery for the hash than is typical 
of an index for an mbox file?

(Maybe I need to go read up on Reiser, and join a Reiser list. ;-)

Randy Kramer







Re: kmail corrupts emails [solved]

2005-09-27 Thread Pete Jewell
Derek Broughton wrote:
> Randy Kramer wrote:
> 
> 
>>On Tuesday 27 September 2005 09:27 am, Derek Broughton wrote:
>>
>>>Why do you think Maildir would perform worse for folders with thousands
>>>of
>>>emails?  Everything I've read suggests it will perform better - and more
>>>reliably.
>>
>>First a quick (but dumb, I should look it up) question.  Does Linux do the
>>thing that Dos/Windows does (used to do?) of each file requiring a minimum
>>space (one cluster?), or does it vary by filesystem?
> 
> 
> It varies decidedly between the different filesystems.

I think the restriction that you're more likely to run into is the
number of inodes available (which are predetermined when you setup an
ext2/3 filesystem).

However, ReiserFS is *much* more efficient when you have thousands of
files in one directory, because it uses a hashing algorithm to determine
where the required file is (or starts) in the filesystem.  This is
something I know about (hashing) based on my experience with Pick
database systems, which also use hashing and are incredibly fast at
keyed record retrieval (as well as entire file/table traversal).

I've used ReiserFS in the past mainly for its journalling capability,
which at the time was more complete than ext3's (this was on a RH6.2
system with the 2.4.x series kernels).  As my customers at the time were
very likely to simply switch the system off (for any reason, including
not knowing how a particular application works that they'd wandered
into), this feature saw a lot of (successful) use!

[snip]

-- 
PeteJ





Re: kmail corrupts emails [solved]

2005-09-27 Thread Derek Broughton
André Wöbbeking wrote:

> On Tuesday 27 September 2005 15:27, Derek Broughton wrote:
>> Randy Kramer wrote:
>> > I suppose maildir will be OK (and maybe even better) for my inbox,
>> > which I generally keep "trimmed" (not too many emails).
>> >
>> > I don't think I want to do that for my mail folders which often
>> > have a lot (thousands) of archived emails (usually short).
>> >
>> > Is it the general consensus that mbox is more subject to corruption
>> > than maildir?
>>
>> By definition.
>>
>> Why do you think Maildir would perform worse for folders with
>> thousands of emails?  Everything I've read suggests it will perform
>> better - and more reliably.
> 
> Because you need a file system which handles small files efficiently
> (e.g. ReiserFS); otherwise you waste a lot of space on your hard drive.

Space is cheap :-)  It's still going to be _faster_ to use maildir than
mbox. Of course, all my partitions are Reiser, anyway.
-- 
derek





Re: kmail corrupts emails [solved]

2005-09-27 Thread Derek Broughton
Randy Kramer wrote:

> On Tuesday 27 September 2005 09:27 am, Derek Broughton wrote:
>>
>> Why do you think Maildir would perform worse for folders with thousands
>> of
>> emails?  Everything I've read suggests it will perform better - and more
>> reliably.
> 
> First a quick (but dumb, I should look it up) question.  Does Linux do the
> thing that Dos/Windows does (used to do?) of each file requiring a minimum
> space (one cluster?), or does it vary by filesystem?

It varies decidedly between the different filesystems.
> 
> Attempting to answer my own question:  Presumably (V)FAT(16,32) must be
> the same as MS for compatibility.  

Yes.

> I don't have any idea, though, about ext2 and ext3, 

Ext3 is just ext2 with journalling.  You can even mount an ext3 partition as
ext2.  That said, I've never pried into the internals but it does store
files more efficiently than a FAT system.

> Then, I can remember (again, back in my dos/Windows days) two problems
> that I
> may mix up a little bit.  I guess the first was the limitation on the
> number of files on a disk based on the size of the (primary?) FAT, which
> was, iirc, overcome by allowing secondary or virtual FATs (or something
> along those
> lines).  I'm sure that's not a problem in Linux.

Not in a long time.
> 
> The 2nd problem, referenced above--I did run into applications where the
> number of files in a directory was so large that the access time for a
> file became unacceptable because (I guess) of the time required to search
> the FAT?
>
> I don't know if Linux can run into the same problem.  

Again, I haven't looked into the internals but by all reports it isn't an
issue where the number of files is on the order of "thousands".  Actually,
since I _do_ know how FAT works, I don't see why a properly written
application should have that much difficulty finding a file in a FAT
directory either.
 
> It seems that an 
> indexed file (e.g., mbox with index) is an alternative to that, ahh, but I
> guess only if the index can be searched very efficiently.

And only if you can guarantee that the index is in sync with the file - per
Reinhold's email.
 -- 
derek





Re: kmail corrupts emails [solved]

2005-09-27 Thread Hendrik Sattler
On Tuesday, 27 September 2005 16:48, Randy Kramer wrote:
> First a quick (but dumb, I should look it up) question.  Does Linux do the
> thing that Dos/Windows does (used to do?) of each file requiring a minimum
> space (one cluster?), or does it vary by filesystem?

That never had anything to do with DOS or Linux; it's always a matter of the 
filesystem. ReiserFS can handle that, all others (e.g. XFS, ext3) don't. 
However, that's not a reason for me to use ReiserFS: hard drives are big enough 
these days (this may be different for a REALLY large number of small files, 
like an NNTP spool).

> Then, I can remember (again, back in my dos/Windows days) two problems that
> I may mix up a little bit.  I guess the first was the limitation on the
> number of files on a disk based on the size of the (primary?) FAT, which
> was, iirc, overcome by allowing secondary or virtual FATs (or something
> along those lines).  I'm sure that's not a problem in Linux.

Again, don't think in terms of the OS; stay with the filesystem, please!
ext2/ext3 do have an inode limit (you can define that at FS creation time); 
XFS and ReiserFS don't.
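
To see how close a given filesystem is to its inode limit, something like the following Python sketch can help. It relies on os.statvfs, which is only available on Unix-like systems; the function name and the default path are assumptions for the example. Filesystems without a fixed inode table may report zero or a very large number here.

```python
import os


def inode_usage(path="/"):
    """Report total and free inodes for the filesystem containing path (Unix only)."""
    stats = os.statvfs(path)
    return {"total": stats.f_files, "free": stats.f_ffree}
```

Pointing it at the directory that holds the maildir (for example inode_usage(os.path.expanduser("~"))) shows whether tens of thousands of extra small files would be a problem there.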

> The 2nd problem, referenced above--I did run into applications where the
> number of files in a directory was so large that the access time for a file
> became unacceptable because (I guess) of the time required to search the
> FAT?

Maybe. Again, XFS and ReiserFS win here; ext[23] also need quite some time for 
large directories (there is caching support in ext3 with later kernels, IIRC, 
that speeds that up).

> How is a search for a file name done in Linux--is it a linear type thing?
> (Without being very conversant in big O notation, I'm trying to ask if it's
> proportional to the number of entries (file names) in the directory (would
> that be O[n]?), or does it do something more clever (would that be O[1]?).
> Or, again, does it depend on the file system, (and maybe Reiser's has that
> (potential) problem licked?)

It's again dependent on the filesystem.

HS





Re: kmail corrupts emails [solved]

2005-09-27 Thread Randy Kramer
On Tuesday 27 September 2005 09:27 am, Derek Broughton wrote:
> Randy Kramer wrote:
> > I suppose maildir will be OK (and maybe even better) for my inbox, which
> > I generally keep "trimmed" (not too many emails).
> >
> > I don't think I want to do that for my mail folders which often have a
> > lot (thousands) of archived emails (usually short).
> >
> > Is it the general consensus that mbox is more subject to corruption than
> > maildir?
>
> By definition.
>
> Why do you think Maildir would perform worse for folders with thousands of
> emails?  Everything I've read suggests it will perform better - and more
> reliably.

Thanks for asking!

I may have to exorcise some old MS Dos/Windows demons from my thinking.

First a quick (but dumb, I should look it up) question.  Does Linux do the 
thing that Dos/Windows does (used to do?) of each file requiring a minimum 
space (one cluster?), or does it vary by filesystem?  

Attempting to answer my own question:  Presumably (V)FAT(16,32) must be the 
same as MS for compatibility.  I don't have any idea, though, about ext2 and 
ext3, and I'm guessing (and may have read) that one of the ways Reiser's 
achieves its greater efficiency for small files is by not doing that?

Then, I can remember (again, back in my dos/Windows days) two problems that I 
may mix up a little bit.  I guess the first was the limitation on the number 
of files on a disk based on the size of the (primary?) FAT, which was, iirc, 
overcome by allowing secondary or virtual FATs (or something along those 
lines).  I'm sure that's not a problem in Linux.

The 2nd problem, referenced above--I did run into applications where the 
number of files in a directory was so large that the access time for a file 
became unacceptable because (I guess) of the time required to search the FAT?  

(Aside: I can't remember when that number became a problem in dos/Windows, but 
I'm sure it was before 10's of thousands of files in a directory.)   

I don't know if Linux can run into the same problem.  It seems that an indexed 
file (e.g., mbox with index) is an alternative to that, ahh, but I guess only 
if the index can be searched very efficiently.  

How is a search for a file name done in Linux--is it a linear type thing? 
(Without being very conversant in big O notation, I'm trying to ask if it's 
proportional to the number of entries (file names) in the directory (would 
that be O[n]?), or does it do something more clever (would that be O[1]?).  
Or, again, does it depend on the file system, (and maybe Reiser's has that 
(potential) problem licked?)

Finally (for now ;-), without fully understanding inodes, I guess I'd be 
worried about running out (or needing to allocate an excessive number in 
advance) to allow for super large quantities of files.

Or, I guess one might say all of these problems might be controlled by proper 
administration/tuning of the system, which might include picking an 
appropriate type of filesystem for the requirements.  Still, I was hoping to 
get a little more insight by mentioning these points for discussion.

Thanks!
Randy Kramer





Re: kmail corrupts emails [solved]

2005-09-27 Thread André Wöbbeking
On Tuesday 27 September 2005 15:27, Derek Broughton wrote:
> Randy Kramer wrote:
> > I suppose maildir will be OK (and maybe even better) for my inbox,
> > which I generally keep "trimmed" (not too many emails).
> >
> > I don't think I want to do that for my mail folders which often
> > have a lot (thousands) of archived emails (usually short).
> >
> > Is it the general consensus that mbox is more subject to corruption
> > than maildir?
>
> By definition.
>
> Why do you think Maildir would perform worse for folders with
> thousands of emails?  Everything I've read suggests it will perform
> better - and more reliably.

Because you need a file system which handles small files efficiently 
(e.g. ReiserFS); otherwise you waste a lot of space on your hard drive.
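
A back-of-the-envelope Python sketch of that waste, assuming the filesystem hands out space in whole blocks. The 4096-byte block size is an assumption, and the calculation ignores inode and directory overhead as well as any tail packing of small files.

```python
def allocated_bytes(file_size, block_size):
    """Bytes actually consumed when space is handed out in whole blocks."""
    blocks = -(-file_size // block_size)   # integer ceiling division
    return blocks * block_size


def wasted_bytes(file_sizes, block_size=4096):
    """Total slack (allocated minus used) for a set of files."""
    return sum(allocated_bytes(size, block_size) - size for size in file_sizes)
```

For three short mails of 1469, 2083 and 2527 bytes, each occupies one 4096-byte block, so 12288 bytes are allocated for 6079 bytes of mail and 6209 bytes are slack.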


Cheers,
André



Re: kmail corrupts emails [solved]

2005-09-27 Thread Derek Broughton
Randy Kramer wrote:

> I suppose maildir will be OK (and maybe even better) for my inbox, which I
> generally keep "trimmed" (not too many emails).
> 
> I don't think I want to do that for my mail folders which often have a lot
> (thousands) of archived emails (usually short).
> 
> Is it the general consensus that mbox is more subject to corruption than
> maildir?

By definition.  

Why do you think Maildir would perform worse for folders with thousands of
emails?  Everything I've read suggests it will perform better - and more
reliably.
-- 
derek





Re: kmail corrupts emails [solved]

2005-09-26 Thread Larry Garfield
On Monday 26 September 2005 11:30 am, Hendrik Sattler wrote:
> On Monday, 26 September 2005 14:07, Randy Kramer wrote:
> > Is there / will there ever be a fix for this?
>
> Sure, rename your inbox to some other folder and let kmail recreate it as
> maildir. maildir does not have such problems because every e-mail is a
> separate file. Speed then depends on the filesystem used (e.g. for lots of
> small files if you only have short text mails). Maildir is usually faster
> for mailboxes with lots of (big) attachments.
>
> HS

I'm having/had a similar problem (hasn't happened recently, but I'm still 
concerned about it), but I use Maildir locally in KMail.  It's an IMAP 
account, and the IMAP server (which I also run) is also using Maildir.  
There's no mbox anywhere, AFAIK.  Why would I be having similar "nullified" 
email problems, if mbox is the culprit?

-- 
Larry Garfield  AIM: LOLG42
[EMAIL PROTECTED]   ICQ: 6817012

"If nature has made any one thing less susceptible than all others of 
exclusive property, it is the action of the thinking power called an idea, 
which an individual may exclusively possess as long as he keeps it to 
himself; but the moment it is divulged, it forces itself into the possession 
of every one, and the receiver cannot dispossess himself of it."  -- Thomas 
Jefferson





Re: kmail corrupts emails [solved]

2005-09-26 Thread Reinhold Kainhofer

On Monday, 26 September 2005 23:03, Reinhold Kainhofer wrote:
> On Monday, 26 September 2005 21:23, Randy Kramer wrote:
> > I suppose maildir will be OK (and maybe even better) for my inbox, which
> > I generally keep "trimmed" (not too many emails).
>
> Maildir is the recommended format. It's even the default for a new kmail
> installation.
>
> > I don't think I want to do that for my mail folders which often have a
> > lot (thousands) of archived emails (usually short).
>
> So what? Our department's mail server uses maildir, and there are far more
> than only thousands of messages.
>
> On the other hand, for lots of messages, the mbox file needs to be searched
> / loaded in whole, while in maildir only one file needs to be loaded.

Oh, I forgot: you said that you accessed the same mbox file from two different 
machines. The mbox format is known to suffer massive file corruption in this 
case: if one app adds or changes a message, the mbox file needs to be 
changed (rewritten). If the second app changes the folder at the same time, 
it either loses that message or ends up with wrong offsets into the mbox file. 
The latter is really critical, since many mail applications don't rewrite the 
whole mbox file (which would be madness for large folders, when you often 
change message flags like read, answered, etc.) but rather keep a binary 
index of the offset at which each message starts.


With Maildir, concurrent access is possible, since each message is its own 
file. At most, the index might be out of sync, and in that case the mail 
application can simply re-generate the index.
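
As a small illustration of the one-file-per-message layout, here is a sketch using the `mailbox` module from Python's standard library. The folder location and the message contents are placeholders invented for the example; it says nothing about how KMail itself delivers mail.

```python
import mailbox
import tempfile
from email.message import EmailMessage
from pathlib import Path


def deliver(maildir_path: str, subject: str, body: str) -> str:
    """Add one message to a Maildir and return its key."""
    box = mailbox.Maildir(maildir_path, create=True)
    msg = EmailMessage()
    msg["Subject"] = subject
    msg.set_content(body)
    # Maildir.add() writes the message under tmp/ first and then moves the
    # finished file into new/, so readers never see a half-written message.
    return box.add(msg)


# Scratch location for the demonstration (assumed, not a real mail folder).
root = Path(tempfile.mkdtemp()) / "inbox"
key_a = deliver(str(root), "first", "hello")
key_b = deliver(str(root), "second", "world")
files_in_new = sorted(p.name for p in (root / "new").iterdir())
print(files_in_new)
```

Each delivery creates its own uniquely named file, so two writers never have to modify the same file, unlike the shared mbox case described above.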

Cheers,
Reinhold

- -- 
- --
Reinhold Kainhofer, Vienna, Austria
email: [EMAIL PROTECTED], http://reinhold.kainhofer.com/
 * Financial and Actuarial Mathematics, TU Wien, http://www.fam.tuwien.ac.at
 * K Desktop Environment, http://www.kde.org/, KOrganizer maintainer





Re: kmail corrupts emails [solved]

2005-09-26 Thread Reinhold Kainhofer

On Monday, 26 September 2005 21:23, Randy Kramer wrote:
> I suppose maildir will be OK (and maybe even better) for my inbox, which I
> generally keep "trimmed" (not too many emails).

Maildir is the recommended format. It's even the default for a new kmail 
installation.

> I don't think I want to do that for my mail folders which often have a lot
> (thousands) of archived emails (usually short).

So what? Our department's mail server uses maildir, and there are far more 
than only thousands of messages. 

On the other hand, for lots of messages, the mbox file needs to be searched / 
loaded in whole, while in maildir only one file needs to be loaded.

> Is it the general consensus that mbox is more subject to corruption than
> maildir?

In mbox, all messages are just concatenated together, so if there is even only 
one broken byte (e.g. a null byte), it might possibly mess up all messages 
that come afterwards. In maildir, each message is one file, so at most this 
one message can be corrupted.
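
A minimal sketch of that failure mode, using a deliberately naive splitter that treats every line starting with "From " as a message boundary. Real mbox variants add quoting rules on top of this, and the addresses and bodies below are invented for the example.

```python
def split_mbox(data: bytes) -> list:
    """Split mbox bytes into messages at lines that start with b'From '."""
    messages, current = [], []
    for line in data.splitlines(keepends=True):
        if line.startswith(b"From ") and current:
            messages.append(b"".join(current))
            current = []
        current.append(line)
    if current:
        messages.append(b"".join(current))
    return messages


good = (
    b"From a@example.org Mon Sep 26 10:00:00 2005\nSubject: one\n\nbody one\n\n"
    b"From b@example.org Mon Sep 26 11:00:00 2005\nSubject: two\n\nbody two\n\n"
    b"From c@example.org Mon Sep 26 12:00:00 2005\nSubject: three\n\nbody three\n\n"
)

# Damage a single byte: the 'F' of the second separator becomes a null byte.
pos = good.index(b"From b@")
bad = good[:pos] + b"\x00" + good[pos + 1:]

print(len(split_mbox(good)), "messages before,", len(split_mbox(bad)), "after")
```

With the separator damaged, the second message is swallowed by the first, and any bookkeeping that counts messages by position is shifted for everything that follows.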


> I use a text record separator to separate records (currently "\n---++ "). 
> My largest file at the moment is about 3000 records with 3 M characters.  I
> haven't seen the need to create an index yet, but that may be coming.
>
> I guess if I get some corruption without the index, I'll eventually notice
> it in some record that I look at it, and may be able to fix it, as
> everything is visible plain text, at least when viewed in a (plain) text
> editor.

Sure, but if you have a broken sector or something, some sequence might be 
filled with random data, and while you might be able to manually spot this 
and guess whether it's between two mails (i.e. a mail end/begin corrupted) or 
halfway through a message, it's impossible for a computer.

This means that the index entries (some message states need to be stored in a 
separate file, since mbox doesn't store them) of all later messages might 
be off by one. In particular, for every message after that point you can't 
really be sure whether that message is the deleted one or the one after it. 
KMail stores each message's offset into the file in the index file, so this 
problem can be worked around, but compacting may still remove too many or too 
few messages.
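
One way to picture the offset check is the sketch below. It is not KMail's index format; it only assumes an index that is a plain list of byte offsets and tests whether each offset still points at a "From " separator line. The sample messages are made up.

```python
def build_offset_index(data: bytes) -> list:
    """Return the byte offset of every line that starts with b'From '."""
    offsets, pos = [], 0
    for line in data.splitlines(keepends=True):
        if line.startswith(b"From "):
            offsets.append(pos)
        pos += len(line)
    return offsets


def stale_entries(data: bytes, offsets) -> list:
    """Return the offsets that no longer point at a 'From ' separator line."""
    return [
        off for off in offsets
        if not (data[off:off + 5] == b"From "
                and (off == 0 or data[off - 1:off] == b"\n"))
    ]


m1 = b"From a@example.org Mon Sep 26 10:00:00 2005\nSubject: one\n\nshort\n\n"
m2 = b"From b@example.org Mon Sep 26 11:00:00 2005\nSubject: two\n\na somewhat longer body\n\n"
m3 = b"From c@example.org Mon Sep 26 12:00:00 2005\nSubject: three\n\nmid\n\n"

original = m1 + m2 + m3
index = build_offset_index(original)

# Another program removes the first message without updating the index.
changed = m2 + m3
print("stale offsets:", stale_entries(changed, index))
```

Note that offset 0 still passes the check even though it now belongs to a different message, so a check like this can only reject an index, never prove it correct. That is one reason refusing to compact is the cautious choice.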

Reinhold

- -- 
- --
Reinhold Kainhofer, Vienna, Austria
email: [EMAIL PROTECTED], http://reinhold.kainhofer.com/
 * Financial and Actuarial Mathematics, TU Wien, http://www.fam.tuwien.ac.at
 * K Desktop Environment, http://www.kde.org/, KOrganizer maintainer





Re: kmail corrupts emails [solved]

2005-09-26 Thread Randy Kramer
On Monday 26 September 2005 12:30 pm, Hendrik Sattler wrote:
> On Monday, 26 September 2005 14:07, Randy Kramer wrote:
> > Is there / will there ever be a fix for this?
>
> Sure, rename your inbox to some other folder and let kmail recreate it as
> maildir. maildir does not have such problems because every e-mail is a
> separate file. Speed then depends on the filesystem used (e.g. for lots of
> small files if you only have short text mails). Maildir is usually faster
> for mailboxes with lots of (big) attachments.

Thanks!

I suppose maildir will be OK (and maybe even better) for my inbox, which I 
generally keep "trimmed" (not too many emails).

I don't think I want to do that for my mail folders which often have a lot 
(thousands) of archived emails (usually short).

Is it the general consensus that mbox is more subject to corruption than 
maildir?

I guess I'll want to dig into that some more, not so much for email, but 
because I'm building a plain text database with multiple records per file.  

I use a text record separator to separate records (currently "\n---++ ").  My 
largest file at the moment is about 3000 records with 3 M characters.  I 
haven't seen the need to create an index yet, but that may be coming.

I guess if I get some corruption without the index, I'll eventually notice it 
in some record that I look at, and may be able to fix it, as everything is 
visible plain text, at least when viewed in a (plain) text editor.
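
For what such an index might look like, here is a minimal Python sketch. Only the separator string comes from the description above; the function names and the sample records are invented for illustration.

```python
SEPARATOR = "\n---++ "


def record_offsets(text: str) -> list:
    """Return the character offset at which each record's heading starts."""
    offsets = []
    if text.startswith(SEPARATOR[1:]):
        offsets.append(0)  # the first record has no preceding newline
    pos = text.find(SEPARATOR)
    while pos != -1:
        offsets.append(pos + 1)  # skip the newline; point at '---++ '
        pos = text.find(SEPARATOR, pos + 1)
    return offsets


def read_record(text: str, offsets, i: int) -> str:
    """Slice out record i using the offset index."""
    end = offsets[i + 1] if i + 1 < len(offsets) else len(text)
    return text[offsets[i]:end]


sample = "---++ first\nbody a\n---++ second\nbody b\n---++ third\nbody c\n"
index = record_offsets(sample)
print(read_record(sample, index, 1))
```

Because the index can always be rebuilt by scanning for the separator, losing it costs only time, which is the same property that makes regenerating a mail index safe.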

Well, anyway, I guess there's something for me to think about here.

regards,
Randy Kramer





Re: kmail corrupts emails [solved]

2005-09-26 Thread Hendrik Sattler
On Monday, 26 September 2005 14:07, Randy Kramer wrote:
> Is there / will there ever be a fix for this?

Sure, rename your inbox to some other folder and let kmail recreate it as 
maildir. maildir does not have such problems because every e-mail is a 
separate file. Speed then depends on the filesystem used (e.g. for lots of 
small files if you only have short text mails). Maildir is usually faster for 
mailboxes with lots of (big) attachments.

HS





Re: kmail corrupts emails [solved]

2005-09-26 Thread Randy Kramer
On Monday 26 September 2005 03:29 am, Reinhold Kainhofer wrote:
> On Friday, 23 September 2005 20:30, Randy Kramer wrote:
> > On Friday 23 September 2005 01:29 pm, Theo Schmidt wrote:
> > > This has solved it, thanks!  Kmail used to compact mailboxes on
> > > closing; it looks like it no longer does so.
> >
> > You're welcome!  But there is something else I should have
> > mentioned--hope you haven't unindexed your inbox yet--
> >
> > on my kmail system, compaction was totally disabled "for safety
> > reasons"--when I removed the index, all of a sudden I got thousands of
> > old emails that had been "marked for deletion" (my words) (and no longer
> > visible) but never actually deleted.
>
> When kmail detects a corrupted mbox file for the inbox (or for any other
> folder), compacting it might eat all your mail; that is the safety reason.
> Without compacting, kmail can simply ignore the corrupted parts of the mbox
> file, and it still works for everything else. But a compaction would mess
> this all up.

Reinhold,

Thanks for the information!

I don't know if you are the right person to ask, but I'll ask here anyway; 
maybe someone will have additional information.

Is there / will there ever be a fix for this?  I mean, I certainly don't want 
to lose emails (from my inbox or anywhere else) but also I don't want my 
inbox to:
   * grow without bounds, filled with the text of removed emails
   * bring back those deleted emails when I have to reindex the inbox to fix
     some other problem (as discussed in this thread)

What do other email clients do?  Do they:
   * have a system of compaction that never fails, despite a perhaps corrupted
     mbox file
   * have a method to detect mbox corruption, perhaps when compaction is
     invoked, warn the operator, stop/prevent the compaction, and suggest
     corrective action to the user
   * have the same potential problem (of potentially losing emails on
     compaction of a corrupted mbox file), but just let the user learn the hard
     way
   * other?

regards,
Randy Kramer





Re: kmail corrupts emails [solved]

2005-09-26 Thread Reinhold Kainhofer

On Friday, 23 September 2005 20:30, Randy Kramer wrote:
> On Friday 23 September 2005 01:29 pm, Theo Schmidt wrote:
> > This has solved it, thanks!  Kmail used to compact mailboxes on closing;
> > it looks like it no longer does so.
>
> You're welcome!  But there is something else I should have mentioned--hope
> you haven't unindexed your inbox yet--
>
> on my kmail system, compaction was totally disabled "for safety
> reasons"--when I removed the index, all of a sudden I got thousands of old
> emails that had been "marked for deletion" (my words) (and no longer
> visible) but never actually deleted.

When kmail detects a corrupted mbox file for the inbox (or for any other 
folder), compacting it might eat all your mail; that is the safety reason. 
Without compacting, kmail can simply ignore the corrupted parts of the mbox 
file, and it still works for everything else. But a compaction would mess this 
all up.

Cheers,
Reinhold

- -- 
- --
Reinhold Kainhofer, Vienna, Austria
email: [EMAIL PROTECTED], http://reinhold.kainhofer.com/
 * Financial and Actuarial Mathematics, TU Wien, http://www.fam.tuwien.ac.at
 * K Desktop Environment, http://www.kde.org/, KOrganizer maintainer





Re: kmail corrupts emails [solved]

2005-09-23 Thread Randy Kramer
On Friday 23 September 2005 01:29 pm, Theo Schmidt wrote:
> This has solved it, thanks!  Kmail used to compact mailboxes on closing;
> it looks like it no longer does so.

You're welcome!  But there is something else I should have mentioned--hope you 
haven't unindexed your inbox yet--

on my kmail system, compaction was totally disabled "for safety reasons"--when 
I removed the index, all of a sudden I got thousands of old emails that had 
been "marked for deletion" (my words) (and no longer visible) but never 
actually deleted.

The note below (from my offline TWiki-like thing) tells how to enable 
compaction on your inbox.  I now try to remember to do that several times a 
day.

regards,
Randy Kramer

---++ kmail: compact inbox disabled

   * [[http://mail.kde.org/pipermail/kmail-devel/2005-June/019581.html][]]

On Wednesday 29 June 2005 14:43, Edwin Schepers wrote:
> Hi,
> When I try to compact my inbox, I get the message that for safety reasons, 
> compaction has been disabled for inbox.
> But is there some method to do the compaction ? I have an inbox of 140M now 
> containing zero messages. I couldn't find any option. I guess `>inbox` is
> not the proper way to do this.

Quit kmail (including the systray icon if you use it), open 
~/.kde/share/config/kmailrc, 
look for lines containing "Compactable=false", remove those lines, restart 
kmail.
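
For anyone who would rather script that edit, here is a hedged Python sketch. It matches only lines that consist exactly of "Compactable=false", keeps a backup copy, and is demonstrated on a throwaway file; the section name in the demo is made up. Quit kmail first, as described above, before pointing it at the real kmailrc.

```python
import shutil
import tempfile
from pathlib import Path


def enable_compaction(config_path: Path) -> int:
    """Remove 'Compactable=false' lines; return how many were removed."""
    backup = config_path.with_name(config_path.name + ".bak")
    shutil.copy2(config_path, backup)  # keep a copy before touching anything
    lines = config_path.read_text().splitlines(keepends=True)
    kept = [ln for ln in lines if ln.strip() != "Compactable=false"]
    config_path.write_text("".join(kept))
    return len(lines) - len(kept)


# Demonstration on a scratch file; the section name is an assumption.
demo = Path(tempfile.mkdtemp()) / "kmailrc"
demo.write_text("[Folder-inbox]\nCompactable=false\nLabel=inbox\n")
removed = enable_compaction(demo)
print("lines removed:", removed)
```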

The previous discussions on how to improve this issue didn't lead to a 
solution yet, it seems.






Re: kmail corrupts emails [solved]

2005-09-23 Thread Theo Schmidt

Randy Kramer wrote:

> Well, a little bit like the 2nd problem. Re the first, is kmail
> generating those emails automatically, or are you generating them and
> they get sent out with unknown subject, ...?

No, not sending them, but I see there was also a problem with the server 
of my provider, which may have set this off.

> ...
>   * the solution to my problem is typically to regenerate the indexes.  To
>     do that, I go into the mail directory (.Mail, iirc, although you may
>     have it somewhere else), delete the existing indexes, and let kmail
>     regenerate them.
>
> ...
> Here, from directory ~/.Mail, is the ls -al listing for one (mbox) folder
> (named tldp) with the three indexes.  To start, you might want to experiment
> with just one mail folder.  Delete all three of the .index files, then
> restart kmail (assuming you shut it down) and try to access the mail in that
> folder.  After a short delay, I'm hopeful that it will be ok.
>
> -rw---   1 rhk rhk   868526 Sep 19 16:05 tldp
> -rw---   1 rhk rhk    99043 Sep 19 16:05 .tldp.index
> -rw-r--r--   1 rhk rhk      937 Sep 19 16:05 .tldp.index.ids
> -rw-r--r--   1 rhk rhk     5425 Aug 24 07:28 .tldp.index.sorted
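
A slightly more cautious variant of that step is to rename the index files instead of deleting them, so they can be restored if something goes wrong. The Python sketch below does that in a scratch directory that mimics the listing above; the `.bak` suffix and the function name are assumptions made for the example. As with the manual procedure, kmail should not be running while the files are moved.

```python
import tempfile
from pathlib import Path


def set_aside_indexes(mail_dir: Path, folder: str) -> list:
    """Rename .<folder>.index* files to *.bak; return the new names."""
    moved = []
    for path in sorted(mail_dir.glob(f".{folder}.index*")):
        if path.suffix == ".bak":
            continue  # already set aside on an earlier run
        target = path.with_name(path.name + ".bak")
        path.rename(target)
        moved.append(target.name)
    return moved


# Scratch directory with empty stand-ins for the mbox file and its indexes.
scratch = Path(tempfile.mkdtemp())
for name in ("tldp", ".tldp.index", ".tldp.index.ids", ".tldp.index.sorted"):
    (scratch / name).write_bytes(b"")
moved = set_aside_indexes(scratch, "tldp")
print(moved)
```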
 

This has solved it, thanks!  Kmail used to compact mailboxes on closing; 
it looks like it no longer does so.


Theo Schmidt

