Re: Very large folder
>Starting in late 2014 I have stopped deleting messages, putting them in a
>directory, +gone, which now contains 465,147 messages and uses about 17
>gigabytes. The bulk of these messages were of transitory or of less interest
>to me. But they include 1,702 messages from my daughter. They were almost all
>of no interest or use to me within a day or two of when she sent them. But she
>recently died (the worst thing by far that's ever happened to me). Now every
>byte she ever wrote is precious to me. So I am glad that I stopped deleting
>messages that I no longer care about.

First off, please accept my sympathies for this unimaginable tragedy.

>So, what is the likelihood of such a bug? Does anybody have any experience
>dealing with such large folders?

I can't think of any _buffer overflows_ that might happen; this isn't
anything out of the ordinary, except that it's a very large number of
messages. What I think you might bump up against are virtual memory
limits, but even then I suspect you're fine.

There's a number of things that are allocated when a folder is read (in
the function folder_read()). From what I see, the ones that are affected
by the number of messages in the folder are:

- The "message number" array, which holds the message number for each
  message. That's an int, so 4 bytes per message on most platforms. But
  it is free()d after folder_read() is done, which seems sub-optimal?
  Doing better here might be hard, though. It would certainly be more
  complex. We could do something smarter about message numbers that are
  contiguous that would cut down on this memory usage a lot.

- The msgstats array, which is ... an array of struct bvector. A struct
  bvector looks like .. a pointer, size_t, two unsigned long. Call it
  32 bytes on a 64 bit platform, maybe? It looks like we only set 4
  possible bits for each message, so we don't use anything more than
  that size, with the exception of sequence membership flags. If you
  have a lot of sequences in that folder, it's possible you could get
  something more than that (you'd need ... more than 60 sequences in a
  single folder before it affected anything).

It's possible my quick math is wrong, but I think that it's probably
close. So by my count, that's 1.9 MB of memory that gets free()d and
14.9 MB of memory for that folder's structure. Which, in 2021, does not
seem like a lot!

MH and nmh were always a bit casual with memory management since all of
the programs are short-lived, but I think you should be fine. All of the
calls to malloc() are wrapped using mh_xmalloc() and friends which call
die() if a call to malloc() fails.

--Ken
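
For anyone who wants to redo the quick math above for their own folder, here is a minimal sketch in Python. It does not call nmh at all; it just multiplies the message count by the per-message sizes quoted in the message. The 4-byte int and the 32-byte struct bvector are the estimates given above (assumptions, not measured values), and the names used here are made up for illustration.

```python
# Rough per-folder memory estimate, following the arithmetic in the message above.
# All sizes are the message's own estimates, not values read from nmh.

MESSAGES = 465_147      # message count reported for +gone
INT_BYTES = 4           # assumed sizeof(int) on most platforms
BVECTOR_BYTES = 32      # assumed: 8 (pointer) + 8 (size_t) + 2 * 8 (unsigned long)


def estimate(messages: int, per_message_bytes: int) -> int:
    """Bytes needed for one array with a fixed-size entry per message."""
    return messages * per_message_bytes


def megabytes(n_bytes: int) -> float:
    """Convert bytes to decimal megabytes (1 MB = 1,000,000 bytes)."""
    return n_bytes / 1_000_000


# Temporary message-number array, released once the folder has been read.
msgnum_bytes = estimate(MESSAGES, INT_BYTES)
# Per-message status array that stays allocated with the folder structure.
msgstats_bytes = estimate(MESSAGES, BVECTOR_BYTES)

print(f"message number array: {megabytes(msgnum_bytes):.1f} MB")
print(f"msgstats array:       {megabytes(msgstats_bytes):.1f} MB")
```

Changing MESSAGES is enough to see how the estimate scales; both arrays grow linearly with the number of messages in the folder.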
Re: Message header formatting
Ken Hornstein writes:
> >> Brilliant idea! I too would use an inverse match logic. Shorter rules,
> >> easier to apply, probably faster.
> >>
> >> G
> >
> >I guess that I should interpret that as no, there isn't such an incantation
> >but since I brought it up then it's my job to write the code so I will when
> >I get a chance.
>
> You don't need to write anything. From mhl(1):
>
>The component "Extras" will output all of the components of the message
>which were not matched by explicit components, or included in the
>ignore list. If this component is not specified, an ignore list is not
>needed since all non-specified components will be ignored.
>
> So just remove "extras".
>
> --Ken

Ah, thanks.
Re: Message header formatting
Hi

[2021-06-05 15:16] Jon Steinhart
> Is there any incantation for "show only the headers explicitly listed
> in mhl.format" so that new and uninteresting headers from everybody's
> latest spam filter, mailing list manager, and internal tracking don't
> fill the screen.

You can remove the ``extras'' component from your format file. See
mhl(1):

> The component "Extras" will output all of the components of the
> message which were not matched by explicit components, or included in
> the ignore list.

Philipp
Re: Message header formatting
>> Brilliant idea! I too would use an inverse match logic. Shorter rules,
>> easier to apply, probably faster.
>>
>> G
>
>I guess that I should interpret that as no, there isn't such an incantation
>but since I brought it up then it's my job to write the code so I will when
>I get a chance.

You don't need to write anything. From mhl(1):

    The component "Extras" will output all of the components of the message
    which were not matched by explicit components, or included in the
    ignore list. If this component is not specified, an ignore list is not
    needed since all non-specified components will be ignored.

So just remove "extras".

--Ken
Re: Message header formatting
George Michaelson writes:
>
> Brilliant idea! I too would use an inverse match logic. Shorter rules,
> easier to apply, probably faster.
>
> G

I guess that I should interpret that as no, there isn't such an incantation
but since I brought it up then it's my job to write the code so I will when
I get a chance.

Jon
Re: Message header formatting
Brilliant idea! I too would use an inverse match logic. Shorter rules,
easier to apply, probably faster.

G

On Sun, 6 Jun 2021, 8:17 am Jon Steinhart wrote:

> I've been getting increasingly annoyed at the number of header lines
> that fill a screen or two before I can even see message contents. I
> keep adding to an already huge ignores line in my mhl.format but new
> headers seem to be created daily.
>
> Is there any incantation for "show only the headers explicitly listed
> in mhl.format" so that new and uninteresting headers from everybody's
> latest spam filter, mailing list manager, and internal tracking don't
> fill the screen.
>
> BTW, when looking at this I noticed that while the mhl man page has
> examples for the ignores variable that it's missing from the list of
> variables on that page.
>
> Jon
Message header formatting
I've been getting increasingly annoyed at the number of header lines
that fill a screen or two before I can even see message contents. I
keep adding to an already huge ignores line in my mhl.format but new
headers seem to be created daily.

Is there any incantation for "show only the headers explicitly listed
in mhl.format" so that new and uninteresting headers from everybody's
latest spam filter, mailing list manager, and internal tracking don't
fill the screen?

BTW, when looking at this I noticed that while the mhl man page has
examples for the ignores variable, it's missing from the list of
variables on that page.

Jon
Re: Very large folder
It's always been my belief that large folders cause multi-level directory
block chaining in traditional UNIX filesystems. This itself incurs costs and
has consequences for how the cross-system file buffer cache works. Basically,
any operation which requires all the directory blocks to be walked in
sequence floods the kernel file buffers. It has impacts on other uses of the
OS. It is likely that more modern filesystems like ZFS handle this
differently, but I don't know; I've never seen an analysis.

Your system has cronjobs doing things like find . -type f -mtime which may
run slower, so you may be causing general system slowdowns.

I think it would make sense to filter out the things you want. I share your
problem: mails from now-dead relatives that are exquisitely painful for me
to read but that I am unwilling to delete, and the thought of having to
write filters to find and file them doesn't fill me with joy.

On the other hand, I have replicated the data, because you have other
risks: disk media is fragile. Don't have only one copy of these mails. A
cloud mail provider like Google might be a good backup, and has filter,
search and tag options.

Cheers

G

On Sun, 6 Jun 2021, 7:10 am , wrote:

> Starting in late 2014 I have stopped deleting messages, putting them in a
> directory, +gone, which now contains 465,147 messages and uses about 17
> gigabytes. The bulk of these messages were of transitory or of less
> interest to me. But they include 1,702 messages from my daughter. They
> were almost all of no interest or use to me within a day or two of when
> she sent them. But she recently died (the worst thing by far that's ever
> happened to me). Now every byte she ever wrote is precious to me. So I am
> glad that I stopped deleting messages that I no longer care about.
>
> In practice this large folder has little impact on performance. For
> example, whenever I do a pick which is, or in a script which might be,
> +gone, I give it an argument like last:10. I could, if necessary split
> +gone into several smaller folders, but I would rather not. But I'm
> concerned that a bug in nmh might cause a problem. For example, some kind
> of a buffer overflow.
>
> So, what is the likelihood of such a bug? Does anybody have any experience
> dealing with such large folders?
>
> Norman Shapiro
Re: Very large folder
Hi Norm,

> Starting in late 2014 I have stopped deleting messages, putting them
> in a directory, +gone, which now contains 465,147 messages and uses
> about 17 gigabytes.

That's far larger than my +cor (for correspondence) which is only
42,229 emails consuming just over 1 GiB.

> But they include 1,702 messages from my daughter. They were almost all
> of no interest or use to me within a day or two of when she sent them.
> But she recently died

I'm so very sorry to hear that.

> But I'm concerned that a bug in nmh might cause a problem. For
> example, some kind of a buffer overflow.

From the kind of severe bugs I see, like segmentation-violation causing
ones, they typically affect the display of data about a message, or
perhaps sometimes the inc-orporation of a message. I can't think of any
which have affected an existing folder of many messages. Others will
pipe up with their views.

Given those 1,702 messages are precious, you may want to pick them and
refile them to a dedicated folder. You could use ‘refile -link
+newfolder ...’ to keep them in +gone and just create hard links to the
new folder's version, or without the -link to have them move from +gone.

-- 
Ralph.
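
To illustrate what keeping a message in +gone while hard-linking it into a new folder means at the filesystem level, here is a small Python sketch. It does not run refile or any other nmh command; it uses os.link in a throwaway temporary directory on a POSIX filesystem, and the folder name "keep" and the message contents are invented for the example.

```python
import os
import tempfile

# Build a throwaway mail tree with two folder directories. The names are
# hypothetical stand-ins for +gone and a dedicated folder.
with tempfile.TemporaryDirectory() as mail:
    gone = os.path.join(mail, "gone")
    keep = os.path.join(mail, "keep")
    os.mkdir(gone)
    os.mkdir(keep)

    # One message file in the large folder.
    original = os.path.join(gone, "1")
    with open(original, "w") as f:
        f.write("Subject: hello\n\nbody\n")

    # Hard-link it into the dedicated folder: a second name for the same
    # file, so no message data is copied.
    linked = os.path.join(keep, "1")
    os.link(original, linked)

    same_inode = os.stat(original).st_ino == os.stat(linked).st_ino
    link_count = os.stat(original).st_nlink

    # Removing the name in the large folder leaves the data reachable
    # through the other name.
    os.remove(original)
    with open(linked) as f:
        survived = f.read()

print(same_inode, link_count, survived.splitlines()[0])
```

The point of the sketch is that a hard link costs a directory entry rather than a second copy of the message, and that neither name is "the real one". It is not a backup, though: both names refer to the same data on the same disk.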
Very large folder
Starting in late 2014 I have stopped deleting messages, putting them in a
directory, +gone, which now contains 465,147 messages and uses about 17
gigabytes. The bulk of these messages were of transitory or of less interest
to me. But they include 1,702 messages from my daughter. They were almost all
of no interest or use to me within a day or two of when she sent them. But she
recently died (the worst thing by far that's ever happened to me). Now every
byte she ever wrote is precious to me. So I am glad that I stopped deleting
messages that I no longer care about.

In practice this large folder has little impact on performance. For example,
whenever I do a pick which is, or in a script which might be, +gone, I give it
an argument like last:10. I could, if necessary split +gone into several
smaller folders, but I would rather not. But I'm concerned that a bug in nmh
might cause a problem. For example, some kind of a buffer overflow.

So, what is the likelihood of such a bug? Does anybody have any experience
dealing with such large folders?

Norman Shapiro