Re: Using mmap(2) in sort(1) instead of temp files

2024-04-05 Thread Mouse
>> (4) Are there still incoherencies between mmap and read/write
>> access?  At one time there were, [...]
> This bug was fixed nearly a quarter century ago, in November 2000,
> with the merge of the unified buffer cache.

Ah, I recall UBC being brought in.

> I think using any version of NetBSD released in this millennium
> should be good to avoid the bug.

For use cases for which such a thing is appropriate, if such a thing
exists, yes, I daresay it would be.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Using mmap(2) in sort(1) instead of temp files

2024-04-05 Thread Taylor R Campbell
> Date: Fri, 5 Apr 2024 07:36:42 -0400 (EDT)
> From: Mouse 
> 
> (4) Are there still incoherencies between mmap and read/write access?
> At one time there were, and I never got a good handle on what needed to
> be done to avoid them.

This bug was fixed nearly a quarter century ago, in November 2000,
with the merge of the unified buffer cache.  You can read about it
here:

http://www.usenix.org/event/usenix2000/freenix/full_papers/silvers/silvers.pdf

I think using any version of NetBSD released in this millennium should
be good to avoid the bug.
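
For anyone who wants to check the coherence property on a particular
system, here is a minimal sketch in C. It is not taken from the sort(1)
sources. The function name mmap_sees_write is hypothetical, and the check
assumes a file system where MAP_SHARED mappings and write(2) share the
same page cache, which is what the unified buffer cache provides. It
maps a file, overwrites the file through the descriptor, and reports
whether the mapping shows the new bytes.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if data written with pwrite(2) is
 * visible through an already-established MAP_SHARED mapping of the
 * same file, 0 if the mapping still shows stale data, -1 on error.
 * The file at `path` is created or truncated.
 */
static int
mmap_sees_write(const char *path)
{
	static const char before[] = "aaaa", after[] = "bbbb";
	const size_t len = sizeof(before) - 1;
	char *map;
	int fd, result = -1;

	if ((fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600)) == -1)
		return -1;
	if (pwrite(fd, before, len, 0) != (ssize_t)len)
		goto out;

	map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out;

	/* Read through the mapping first so the page is resident. */
	if (memcmp(map, before, len) == 0 &&
	    pwrite(fd, after, len, 0) == (ssize_t)len)
		result = memcmp(map, after, len) == 0;

	munmap(map, len);
out:
	close(fd);
	return result;
}
```

A result of 1 only shows that this one access pattern is coherent on the
system under test; it is not a proof for every file system or every
ordering of accesses.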


Re: Using mmap(2) in sort(1) instead of temp files

2024-04-05 Thread Mouse
>> [...]
> Why not stat the input file and decide to use in memory iff the file
> is small enough?  This way sort will handle large sorts on small
> memory machines automatically.

Well, I'm not the one (putatively) doing the work.  But my answers to
that are:

(1) Small sorts are not the issue, IMO.  Even a speedup as great as
halving the time taken is not enough to worry about when it's on a par
with the cost of starting sort(1) at all.

(2) Using mmap versus read provides minimal speedup in this sort of
case: a small file which is being read sequentially.

(3) Code complexity: two paths means twice the testing, twice the
opportunities for bugs, (slightly more than) twice the maintenance,
etc.

(4) Are there still incoherencies between mmap and read/write access?
At one time there were, and I never got a good handle on what needed to
be done to avoid them.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Using mmap(2) in sort(1) instead of temp files

2024-04-05 Thread Brett Lymn
On Thu, Apr 04, 2024 at 02:38:02PM +0200, Martin Husemann wrote:
> 
> Since the original comment hints at "instead of temp files" it is pretty
> clear that the second variant is meant. This avoids all file system operations
> and if the machine you run on has enough free memory it might not even
> actually touch swap space.
> 

Why not stat the input file and decide to use in memory iff the file is
small enough?  This way sort will handle large sorts on small memory
machines automatically.
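
A rough sketch of that check in C, under stated assumptions: the name
fits_in_memory and the INMEM_LIMIT cutoff are hypothetical and do not
come from the sort(1) sources, and a real implementation would probably
derive the limit from available memory rather than a constant. Input
that is not a regular file (a pipe, say) has no meaningful st_size, so
the sketch reports false for it and the caller would keep using temp
files.

```c
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical cutoff for illustration only: 64 MiB. */
#define INMEM_LIMIT ((off_t)64 * 1024 * 1024)

/*
 * Return true if `path` names a regular file of at most `limit` bytes.
 * Returns false for anything that cannot be stat'ed or is not a
 * regular file, so the caller falls back to the temp-file path.
 */
static bool
fits_in_memory(const char *path, off_t limit)
{
	struct stat st;

	if (stat(path, &st) == -1)
		return false;
	if (!S_ISREG(st.st_mode))
		return false;
	return st.st_size <= limit;
}
```

A caller would do something like fits_in_memory(argv[1], INMEM_LIMIT)
before choosing between the in-memory and temp-file paths. The file can
still grow between the stat and the read, so the in-memory path needs to
cope with more data than st_size promised.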

-- 
Brett Lymn
--
Sent from my NetBSD device.

"We are were wolves",
"You mean werewolves?",
"No we were wolves, now we are something else entirely",
"Oh"