Re: Reliably detecting/counting new mail. WAS:[Re: Default mailbox display? [partially solved]]

2001-01-23 Thread Heinrich Langos

On Mon, Jan 22, 2001 at 03:53:44PM +0000, Dave Pearson wrote:
 On Mon, Jan 22, 2001 at 04:16:25PM +0100, Heinrich Langos wrote:
 
  Dave, you may stop reading. The rest will only bother you and further
  waste your time.
 
 With an attitude like that it's not surprising that you're confused about
 what I've been saying. Read what I've actually said, look for the reasonable
 reason this time, then you might find that I've said nothing against the
 idea of providing the feature you'd like to see.

Well, then I'm sorry. I guess I took your repeating of "the docs are
right" the wrong way. I was not trying to convince you that the docs
are wrong, and I took your insisting on their correctness as conviction
that there is nothing to improve.
That really upset me. Sorry, once again.

  ok ok .. just wanted to make sure we talk about the same thing. i guess
  most users don't have huge mailboxes since mbox-hooks are such a nice
  way to move older mail out of mailboxes after some time. anyway...
 
 But we can't actually make such a guess can we? You also seem to be under
 the impression that I don't archive older mail. I do. Arguing that "most
 users" do what you'd like them to do doesn't detract from the point that
 there can and will be large mailboxes defined by the mailbox command. When
 designing new features it makes sense to work to the extreme case, not the
 average case (especially when you can't really know what the average is).
 

I was just under the impression that incoming mailboxes usually are
small. Though my choice of words may have been inappropriate and insulting.

I admit that arguing for "most users" in the absence of statistical data
is, well, arguable. :-)

In order to deal with extremely big mailboxes I proposed that "jump to
the previously known end" strategy. But dealing with large amounts of
incoming mail could be a problem.

Any ideas? I mean other than limiting the amount to scan like
"max_growth_to_scan_for_new_mail=200k"
If a mailbox had grown more than 200k since the last scan you would
only mark the previously known amount of new mail with a "+" and
postpone exact scanning and updating of numbers till the user entered
that folder and you had to scan it anyway.
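
The size-limit idea above can be sketched roughly as follows. This is only
an illustration in Python, not mutt code: the function name `plan_check`,
the returned labels and the constant (taken from the hypothetical
"max_growth_to_scan_for_new_mail=200k" setting) are all assumptions.

```python
# Hypothetical limit, mirroring the "200k" figure suggested in the message.
MAX_GROWTH_TO_SCAN = 200 * 1024


def plan_check(known_size, current_size, max_growth=MAX_GROWTH_TO_SCAN):
    """Decide how to treat a mailbox, given its last known and current size.

    Returns one of four illustrative labels:
      "unchanged" - same size as at the last scan, nothing to do
      "changed"   - file shrank, so the saved numbers can't be trusted
      "mark-plus" - grew by more than the limit: keep the old count,
                    show a "+" and postpone the exact scan
      "scan-tail" - grew a little: cheap enough to count new mail now
    """
    growth = current_size - known_size
    if growth == 0:
        return "unchanged"
    if growth < 0:
        return "changed"
    if growth > max_growth:
        return "mark-plus"
    return "scan-tail"
```

With these assumptions a mailbox that grew by 500 bytes would be scanned
right away, while one that grew by 300k would only get the "+" until the
folder is actually opened.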

  jump to the previously known end of the file and scan how many new mails
  arrived. that shouldn't be too much of a burden for a system, since mails
  usually don't arrive in large batches. (i know, fetchmail users will hate
  me.)
 
 I don't use fetchmail but the above still isn't true. email does arrive in
 large batches for me.

How come? Are these large batches usually source initiated (like
moderated mailing lists) or are they collected at another MX host
before they get transferred to your mail server? In other words: Will
they affect all/many of your mailboxes or usually just one or two?

  to prevent errors due to other programs maliciously changing your
  mailboxes and increasing their size, you could check for that.
  depending on your level of paranoia you could do anything.
  from
  A) checking if a new mail starts exactly where it is supposed to start
     (at the previously known file end, which you do anyway by starting
     parsing there)
 
 That sounds like a good solution for the problem I highlight above.

I hope it is. It would save a lot of trouble. 

 
 PS: You still appear to have a configuration problem with your copy of mutt:
 
 ,
 | Mail-Followup-To: heinrich, [EMAIL PROTECTED]
 `

yep .. unfortunately i don't know how to review the headers before
sending an email. 

could it be a problem with the local sendmail configuration? 

i've got these in my muttrc but it doesn't really help.
-
set edit_hdrs
ignore *
unignore from: subject to cc mail-followup-to \
date x-mailer x-url
-

furthermore i've got "[EMAIL PROTECTED]" on my "lists".  but not on
"subscribe" since i like to see the sender instead of the list address
in the index. if putting it on "subscribe" will solve the
"Mail-Followup-To:"-problem i will do that and change index_format
accordingly.

TIA
-heinrich



Re: Reliably detecting/counting new mail. WAS:[Re: Default mailbox display? [partially solved]]

2001-01-23 Thread Dave Pearson

On Tue, Jan 23, 2001 at 11:30:57AM +0100, Heinrich Langos wrote:
 On Mon, Jan 22, 2001 at 03:53:44PM +0000, Dave Pearson wrote:

  With an attitude like that it's not surprising that you're confused
  about what I've been saying. Read what I've actually said, look for the
  reasonable reason this time, then you might find that I've said nothing
  against the idea of providing the feature you'd like to see.
 
 Well, then I'm sorry. I guess I took your repeating of "the docs are right"
 the wrong way. I was not trying to convince you that the docs are wrong,
 and I took your insisting on their correctness as conviction that there
 is nothing to improve.

I think I see the problem and the source of the misunderstanding. I wasn't
saying that the documentation was correct, I was saying that the "new mail"
detection *was* reliable within the limits of its reported abilities. While
I don't have a problem with those limits (I've configured my backup utility,
for example, to preserve time stamps) it doesn't follow that I think that
development should stop right there.

 In order to deal with extremely big mailboxes I proposed that "jump to
 the previously known end" strategy. But dealing with large amounts of
 incoming mail could be a problem. 
 
 Any ideas? I mean other than limiting the amount to scan like
 "max_growth_to_scan_for_new_mail=200k"
 If a mailbox had grown more than 200k since the last scan you would
 only mark the previously known amount of new mail with a "+" and
 postpone exact scanning and updating of numbers till the user entered
 that folder and you had to scan it anyway.

That would seem like an obvious solution. However, I suspect you're starting
to get into design problems that have caused the mutt developers to dismiss
extending this before. The point being that you still end up with a "it
sometimes works, but not always" solution.

In other words you still get something that is "unreliable" for certain
values of "unreliable".

  I don't use fetchmail but the above still isn't true. email does arrive
  in large batches for me.
 
 How come? 

I'm on a dialup connection. Mail only flows into my box when I dial into my
ISP <URL:http://www.demon.net/>.

 Are these large batches usually source initiated (like
 moderated mailing lists) or are they collected at another MX host before
 they get transferred to your mail server? In other words: Will they affect
 all/many of your mailboxes or usually just one or two?

They affect all mailboxes.

  ,
  | Mail-Followup-To: heinrich, [EMAIL PROTECTED]
  `
 
 yep .. unfortunately i don't know how to review the headers before sending
 an email.
 
 could it be a problem with the local sendmail configuration? 

I think it's more likely you've got a mutt configuration problem.

 furthermore i've got "[EMAIL PROTECTED]" on my "lists". but not on
 "subscribe" since i like to see the sender instead of the list address in
 the index. if putting it on "subscribe" will solve the
 "Mail-Followup-To:"-problem i will do that and change index_format
 accordingly.

This seems to be your problem. The mutt list should be in your `subscribe'
list, not your `lists' list. If the `index_format' isn't what you'd like it
to be that's what you should change.

Here's the index format I use for mailing lists:

,
| # This is the index format for mailing lists.
| set index_format="%4C %Z %{%b %d} %-15.15n (%4l) %s"
`

You can see the whole of my ~/.muttrc setup on my web site. I'm not
suggesting it's actually useful but the list/non-list handling that I have
might give you some ideas.

-- 
Dave Pearson:  | mutt.octet.filter - autoview octet-streams
http://www.davep.org/  | mutt.vcard.filter - autoview simple vcards
Mutt:  | muttrc2html   - muttrc -> HTML utility
http://www.davep.org/mutt/ | muttrc.sl - Jed muttrc mode



Re: Default mailbox display? [partially solved]

2001-01-22 Thread Heinrich Langos



first for the important part: 

while reading the source to find a place to put Brandon Long's "folder
count" patch, i found a configure switch named "--enable-buffy-size"

that seems to solve the detection issue. i only browsed through the
source since it was quite late, but it seems to read the end of the
mbox to find out if the last message is a new one. so it partially
scans the mbox file. i guess i can extend that to a full scan to
report real numbers.

but at least it partially solves the detection issue.

On Sun, Jan 21, 2001 at 08:07:58PM +0000, Dave Pearson wrote:
 On Sun, Jan 21, 2001 at 06:44:54PM +0100, Heinrich Langos wrote:
  On Sun, Jan 21, 2001 at 03:59:22PM +0000, Dave Pearson wrote:
[...] 
   Saving such information won't help you work out how many new mails there
   are, or if there is new mail at all. It would let you know if the
   mailbox had been modified in some way, which is pretty much what mutt
   does right now.
  
  nope ... right now mutt only shows that the mailbox has been accessed. not
  if it has been modified.
 
 It would appear that we have different definitions of "accessed" and
 "modified". My copy of mutt shows me when an mbox has been modified, not
 when it has been accessed.

do me a favor and check your "mutt -v" output. if it says
"+BUFFY_SIZE" then your mutt is not just checking access times but
much more than that. if it says "-BUFFY_SIZE" (like mine did) and
after grep-ing your mailbox still shows an "N", i don't get it.  

grep doesn't modify your mbox.
so says strace: 
open("/var/spool/mail/heinrich", O_RDONLY|0x8000) = 3
(though i have to admit i didn't find out what the 0x8000 part means
in a quick look at /usr/include/*)
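
To make the accessed/modified distinction concrete, here is a minimal
Python sketch of the kind of timestamp comparison that a "-BUFFY_SIZE"
build appears to rely on. It is an assumption based on the behaviour
described in this thread, not a copy of mutt's code, and the function name
`looks_like_new_mail` is made up for the example.

```python
import os


def looks_like_new_mail(path):
    """Timestamp heuristic: report new mail if the file was modified
    more recently than it was last read.

    Delivery appends to the mbox and bumps the modification time.
    Any later read (grep, cat, a backup tool) bumps the access time,
    after which this test no longer fires even though nobody has
    looked at the new message.
    """
    st = os.stat(path)
    return st.st_mtime > st.st_atime
```

This is why the read-only open() shown by strace is enough to make the "N"
disappear: the file content is untouched, but its access time is not.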

BTW: is your mutt running 24/7 or do you start it anew in the morning
like i do... ( and several times a day whenever i've found a new
feature in mutt that i would like to try out ? :) )

  right now a simple grep will screw up new mail detection.
  
  try this: 
  $ echo blah | mail yourself@localhost
  $ grep something /var/spool/mail/yourself
  $ mutt -y
  and you see no "N" ... pretty sad, isn't it?
 
 No, I don't find it sad, I find it consistent with the documentation.

consistent? yes! i admit it is
consistent with a documentation that says: in some cases new mail
detection is not working as one would expect, because there is evil in
the world and "Something is rotten in the state of Denmark" :-)

 Obviously you're more than happy to find it an itch worth scratching, feel
 free to scratch it. All I've been saying is that it *is* reliable, it does
 exactly what it says it does. That it doesn't do what you'd like is a
 different matter, I'm not commenting on that.

it's not only me who wants mutt to behave that way, i guess. 
if it was not the intention to make mutt detect new mail it would say
"... the main menu status bar displays how many of these folders have
been modified since they were accessed by some program." ;-)

  i'm not saying that mutt should constantly scan the whole mailboxes or
  anything like that. i just say it could do so on request. or on startup.
 
 The problem with such a scan is that it could take ages. I've got a lot of
 mailboxes, some of which can be huge.

it only scans mailboxes that are marked as incoming mailboxes. 
so it would do the same thing it does, when opening that mbox. only to
all of them at once... ok ok .. that would be some overhead.

but it will not scan all your mailboxes. especially not those
archive-like things where you keep several years of the kernel developers list
or bugtraq :-)

if you keep your incoming mboxes down to a month back or two, it
shouldn't be such a problem. and the results of scanning can be cached
(just in memory or in a status file) and only refreshed if the file
was modified.

with _you_ choosing whichever way you find most reliable to
detect modification, be it access/modification time, file size
or md5sum.

any volunteers to go for it? i have to finish a studies project before
i can go for more than a quick'n'dirty hack.

-heinrich

-- 
Heinrich Langos [EMAIL PROTECTED]
 pgp: http://wh9.tu-dresden.de/~heinrich/pub_pgp_key.asc
 _
|o| The reason we come up with new versions is not to fix bugs. |o|
|o| It's absolutely not. It's the stupidest reason to buy a new |o|
|o| version I ever heard. -- Bill Gates,  Microsoft Corporation |o|
 ~



Re: Default mailbox display? [partially solved]

2001-01-22 Thread Dave Pearson

On Mon, Jan 22, 2001 at 01:34:03PM +0100, Heinrich Langos wrote:

 On Sun, Jan 21, 2001 at 08:07:58PM +0000, Dave Pearson wrote:

  It would appear that we have different definitions of "accessed" and
  "modified". My copy of mutt shows me when an mbox has been modified, not
  when it has been accessed.
 
 do me a favor and check your "mutt -v" output. if it says
 "+BUFFY_SIZE" then your mutt is not just checking access times but
 much more than that. if it says "-BUFFY_SIZE" (like mine did) and
 after grep-ing your mailbox still shows an "N", i don't get it.  

I don't use the buffy feature. Neither have I ever said that an external
access of a given mailbox won't cause mutt to fail to show the "N". I keep
saying that it works as documented.

 BTW: is your mutt running 24/7 or do you start it anew in the morning like
 i do... ( and several times a day whenever i've found a new feature in
 mutt that i would like to try out ? :) )

24/7. I also run other instances of mutt on an ad-hoc basis.

 it's not only me who wants mutt to behave that way, i guess. if it was not
 the intention to make mutt detect new mail it would say "... the main menu
 status bar displays how many of these folders have been modified since they
 were accessed by some program." ;-)

I agree that this can be viewed as a documentation problem.

  The problem with such a scan is that it could take ages. I've got a lot of
  mailboxes, some of which can be huge.
 
 it only scans mailboxes that are marked as incoming mailboxes. 

That's why I said "mailboxes".

 so it would do the same thing it does, when opening that mbox. only to all
 of them at once... ok ok .. that would be some overhead.

That could and would be a *lot* of overhead.

 but it will not scan all your mailboxes. especially not those archive-like
 things where you keep several years of the kernel developers list or
 bugtraq :-)

I said "mailboxes", not archives.




Reliably detecting/counting new mail. WAS:[Re: Default mailbox display? [partially solved]]

2001-01-22 Thread Heinrich Langos

On Mon, Jan 22, 2001 at 01:10:46PM +0000, Dave Pearson wrote:
 On Mon, Jan 22, 2001 at 01:34:03PM +0100, Heinrich Langos wrote:
 
  On Sun, Jan 21, 2001 at 08:07:58PM +0000, Dave Pearson wrote:
 [...]
  it's not only me who wants mutt to behave that way, i guess. if it was not
  the intention to make mutt detect new mail it would say "... the main menu
  status bar displays how many of these folders have been modified since they
  were accessed by some program." ;-)
 
 I agree that this can be viewed as a documentation problem.

In your opinion the documentation is wrong and should be changed to
something like the above?  Well, ok then I really misunderstood you
all the time and I misunderstood the documentation. I'm sorry I
wasted your time.

I guess continuing this discussion will not get us anywhere then.
Anyway I will continue this mail since I have some ideas that may 
improve the performance in case somebody, maybe me, wants to fix
the non-existing problem :-)

Dave, you may stop reading. The rest will only bother you and further
waste your time.

   The problem with such a scan is that it could take ages. I've got a lot of
   mailboxes, some of which can be huge.
  
  it only scans mailboxes that are marked as incoming mailboxes. 
 
 That's why I said "mailboxes".

ok ok .. just wanted to make sure we talk about the same thing.  
i guess most users don't have huge mailboxes since mbox-hooks are
such a nice way to move older mail out of mailboxes after some time.
anyway...

  so it would do the same thing it does, when opening that mbox. only to all
  of them at once... ok ok .. that would be some overhead.
 
 That could and would be a *lot* of overhead.

really? let's see...

and keep in mind that this does not need to be mutt's standard way to
detect new mail. just the one it uses when you TAB TAB. or start mutt
with -y. 

when you save modification date, filesize, known amount of new mail,
and an md5sum they will only be generated once for each mailbox. so
assuming that you don't add mailboxes on an hourly basis I will forget
about that one-time load.

if a change occurs (detection may be done by modification time,
filesize, md5sum (sorted ascending by paranoia)) and the filesize
increases you could assume that there is new mail (if it decreased you
could mark that mailbox as "C" for changed or something like that and
stop here).
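
The saved state and the change check described above could look roughly
like this. It is a sketch under stated assumptions: `Snapshot`,
`take_snapshot` and `classify` are names invented for the example, only
modification time and size are compared (the md5sum level of paranoia is
left out), and "C" is the marker proposed in the paragraph above.

```python
import os
from collections import namedtuple

# Hypothetical per-mailbox record saved after each scan.
Snapshot = namedtuple("Snapshot", ["mtime_ns", "size", "new_count"])


def take_snapshot(path, new_count):
    """Record what is known about the mailbox right after scanning it."""
    st = os.stat(path)
    return Snapshot(st.st_mtime_ns, st.st_size, new_count)


def classify(path, snap):
    """Compare the mailbox on disk with its saved snapshot.

    "unchanged" - same mtime and size, the saved count is still good
    "grown"     - file is bigger, so probably (not certainly) new mail
    "C"         - file shrank or was rewritten in place; stop here
    """
    st = os.stat(path)
    if st.st_mtime_ns == snap.mtime_ns and st.st_size == snap.size:
        return "unchanged"
    if st.st_size > snap.size:
        return "grown"
    return "C"
```

Only the "grown" case would go on to the tail scan; the other two cost a
single stat() call per mailbox.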

jump to the previously known end of the file and scan how many new mails
arrived. that shouldn't be too much of a burden for a system, since
mails usually don't arrive in large batches. (i know, fetchmail users 
will hate me.)

to prevent errors due to other programs maliciously changing your
mailboxes and increasing their size, you could check for that. 
depending on your level of paranoia you could do anything. 
from
A) checking if a new mail starts exactly where it is supposed to start
   (at the previously known file end, which you do anyway by starting
   parsing there) 
to 
Z) checksumming the mailbox up to the last known
   file end and comparing with the saved checksum.
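
The "jump to the previously known end" scan, together with checks A) and
Z), can be sketched as below. This is a simplified illustration, not mutt's
mbox parser: the function names are invented, and a message is counted
whenever a line starts with "From ", which assumes that such lines inside
message bodies have been escaped on delivery.

```python
import hashlib


def md5_prefix(path, length):
    """MD5 of the first `length` bytes of the file, read in chunks."""
    digest = hashlib.md5()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(65536, remaining))
            if not chunk:
                break
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def count_appended(path, known_end, known_md5=None):
    """Count messages appended after byte offset `known_end`.

    Returns None when a sanity check fails and a full rescan is needed.
    """
    # Check Z (optional, expensive): the old part must be untouched.
    if known_md5 is not None and md5_prefix(path, known_end) != known_md5:
        return None
    with open(path, "rb") as f:
        f.seek(known_end)
        first = f.readline()
        if not first:
            return 0  # nothing was appended
        # Check A (cheap): new data must begin with a message separator.
        if not first.startswith(b"From "):
            return None
        return 1 + sum(1 for line in f if line.startswith(b"From "))
```

Check A costs nothing extra because the scan has to read that line anyway;
check Z re-reads the whole old part of the file, so it only makes sense at
the highest level of paranoia.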

assuming that increased size usually means that new mail has been
added the overhead would be very small. 

if you still think it is too much overhead go for this one:

cache the information that you gathered during those scans to skip the
initial scan that mutt does when you enter a folder.

this will reduce the overhead to almost zero. only if you don't read
the folder that has new mail you will have wasted time. but why do you
get that mail at all if you don't read it ? :-)

for maildir environments the solution seems straightforward. 
what about imap? i don't have a clue. could somebody enlighten me?

-heinrich






Re: Reliably detecting/counting new mail. WAS:[Re: Default mailbox display? [partially solved]]

2001-01-22 Thread Dave Pearson

On Mon, Jan 22, 2001 at 04:16:25PM +0100, Heinrich Langos wrote:

 Dave, you may stop reading. The rest will only bother you and further
 waste your time.

With an attitude like that it's not surprising that you're confused about
what I've been saying. Read what I've actually said, look for the reasonable
reason this time, then you might find that I've said nothing against the
idea of providing the feature you'd like to see.

IOW, try being less insulting and use a little more comprehension.

  That's why I said "mailboxes".
 
 ok ok .. just wanted to make sure we talk about the same thing. i guess
 most users don't have huge mailboxes since mbox-hooks are such a nice
 way to move older mail out of mailboxes after some time. anyway...

But we can't actually make such a guess can we? You also seem to be under
the impression that I don't archive older mail. I do. Arguing that "most
users" do what you'd like them to do doesn't detract from the point that
there can and will be large mailboxes defined by the mailbox command. When
designing new features it makes sense to work to the extreme case, not the
average case (especially when you can't really know what the average is).

 if a change occurs (detection may be done by modification time, filesize,
 md5sum (sorted ascending by paranoia)) and the filesize increases you
 could assume that there is new mail (if it decreased you could mark that
 mailbox as "C" for changed or something like that and stop here).

Note that an increase in size might be a reduction in the number of actual
emails. I might have deleted a load but saved one large email from somewhere
else to the mailbox in question. IOW, an increase in size can't be assumed
to be new mail.

 jump to the previously known end of the file and scan how many new mails
 arrived. that shouldn't be too much of a burden for a system, since mails
 usually don't arrive in large batches. (i know, fetchmail users will hate
 me.)

I don't use fetchmail but the above still isn't true. email does arrive in
large batches for me.

 to prevent errors due to other programs maliciously changing your
 mailboxes and increasing their size, you could check for that.
 depending on your level of paranoia you could do anything.
 from
 A) checking if a new mail starts exactly where it is supposed to start
    (at the previously known file end, which you do anyway by starting
    parsing there)

That sounds like a good solution for the problem I highlight above.

PS: You still appear to have a configuration problem with your copy of mutt:

,
| Mail-Followup-To: heinrich, [EMAIL PROTECTED]
`
