On 2012-12-11, at 4:54 PM, Ralph Corderoy wrote:

> mmap seems fine if the program is accessing files under its sole
> control, perhaps not such a good fit for nmh.
For now, perhaps not an issue. But there has been some minor grumbling about concurrent access issues on the list. If we do lean towards encouraging concurrent access to things, mmap() can cause a lot of grief, in that it presents the appearance of a single view of the data that doesn't necessarily exist. Back in the Esys days, when I was maintaining our version of the Cyrus IMAP server, I had no end of grief dealing with mmap() coherency issues in relation to, e.g., the globally shared master mailboxes file. Things may have improved since then, but I doubt that anyone can truly implement a read-write, multi-process-coherent mmap() on top of UNIX using the existing API.

And if you do switch to mmap(), you must do it everywhere. I wouldn't even *think* of mixing mmap() access with FD-based I/O on the same object, even on the systems that claim to have full coherency between FS buffers and mmap() page views. (They don't. I know this all too well ...)

Ultimately, though, mmap() is just a micro-optimization in the context of a *822/MIME parser. The effort that would be expended on mmap() I/O would be better spent on writing a bullet-proof parser. No matter what, you will end up copying data to and from user space on the way to its eventual display.

The solution here is to write a good one-pass MIME parser that can collect the structure of the message as it's read into memory. In many cases, once you hit the body you really can just parse as you go. Most MUAs want to start with the first MIME part – almost always a text/ part. Picking that off is easy, and leaves you pointed at the remainder, should you need it. If the message is a multipart, you can continue the body scan, parsing out the MIME structure along the way. As you build the in-memory representation of the message, you store the file offsets of the starting points of each part. You can also opportunistically cache some of the body parts along the way.
At worst, you pay the price of reading the entire message once, plus the overhead of re-reading specific MIME parts the MUA requests later. In the case of mmap(), you do the same thing, just pretending the entire message is already in a memory buffer. But it isn't. You still pay the penalty of reading the data from disk into the FS cache, and then exposing that to the user process.

These days it's debatable whether there is more overhead in just copying from the kernel into user space vs. all the mucking about with page tables and the like that mmap() requires. TLB flushes are *expensive*, even more so when you're running under a hypervisor. A straight copy from a kernel buffer to user memory can often take place within the CPU's L1/L2 cache. So before anyone claims mmap() is faster (not that they were), you really need to sit down and benchmark how your particular CPU and OS perform.

But again, these really are micro-optimizations. mmap() won't make anything run perceptibly faster, but it will introduce the potential for many subtle bugs. Best to leave well enough alone.

Now if you *really* want to speed things up, let's talk about lazy folder indexes.

--lyndon

_______________________________________________
Nmh-workers mailing list
Nmh-workers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/nmh-workers