Re: Using mmap(2) in sort(1) instead of temp files
>> (4) Are there still incoherencies between mmap and read/write
>> access?  At one time there were, [...]

> This bug was fixed nearly a quarter century ago, in November 2000,
> with the merge of the unified buffer cache.

Ah, I recall UBC being brought in.

> I think using any version of NetBSD released in this millennium
> should be good to avoid the bug.

For use cases for which such a thing is appropriate, if such a thing
exists, yes, I daresay it would be.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: Using mmap(2) in sort(1) instead of temp files
> Date: Fri, 5 Apr 2024 07:36:42 -0400 (EDT)
> From: Mouse
>
> (4) Are there still incoherencies between mmap and read/write access?
> At one time there were, and I never got a good handle on what needed to
> be done to avoid them.

This bug was fixed nearly a quarter century ago, in November 2000,
with the merge of the unified buffer cache.  You can read about it
here:

http://www.usenix.org/event/usenix2000/freenix/full_papers/silvers/silvers.pdf

I think using any version of NetBSD released in this millennium should
be good to avoid the bug.
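For anyone who wants to spot-check the coherency claim on a particular
system, here is a minimal sketch in C.  It is not from the thread or
from sort(1); the function name mmap_sees_write is made up for
illustration.  It maps a file MAP_SHARED, overwrites the same bytes
with pwrite(2), and checks that the mapping shows the new contents.  A
passing result is only a spot check on one file, not a proof of
coherency in general.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if data written with pwrite(2) is visible through an
 * existing MAP_SHARED mapping of the same file, 0 if the mapping
 * shows stale data, and -1 on error.  fd must be open read/write
 * on a regular file; the first bytes of the file are overwritten.
 */
static int
mmap_sees_write(int fd)
{
	static const char before[] = "old data";
	static const char after[] = "new data";
	const size_t len = sizeof(before);	/* same size as after[] */
	char *map;
	int coherent;

	if (pwrite(fd, before, len, 0) != (ssize_t)len)
		return -1;

	map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		return -1;

	/* Touch the mapping first so the page is resident before the write. */
	if (memcmp(map, before, len) != 0) {
		munmap(map, len);
		return 0;
	}

	if (pwrite(fd, after, len, 0) != (ssize_t)len) {
		munmap(map, len);
		return -1;
	}

	/* With a unified buffer cache the mapping reflects the write. */
	coherent = memcmp(map, after, len) == 0;
	munmap(map, len);
	return coherent;
}
```

Call it with a descriptor opened O_RDWR on a scratch file (for example
one from mkstemp(3)); it clobbers the start of that file.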
Re: Using mmap(2) in sort(1) instead of temp files
>> [...]

> Why not stat the input file and decide to use in memory iff the file
> is small enough?  This way sort will handle large sorts on small
> memory machines automatically.

Well, I'm not the one (putatively) doing the work.  But my answers to
that are:

(1) Small sorts are not the issue, IMO.  Even a speedup as great as
halving the time taken is not enough to worry about when it's on a par
with the cost of starting sort(1) at all.

(2) Using mmap versus read provides minimal speedup in this sort of
case: a small file which is being read sequentially.

(3) Code complexity: two paths means twice the testing, twice the
opportunities for bugs, (slightly more than) twice the maintenance,
etc.

(4) Are there still incoherencies between mmap and read/write access?
At one time there were, and I never got a good handle on what needed to
be done to avoid them.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: Using mmap(2) in sort(1) instead of temp files
On Thu, Apr 04, 2024 at 02:38:02PM +0200, Martin Husemann wrote:
>
> Since the original comment hints at "instead of temp files" it is pretty
> clear that the second variant is meant. This avoids all file system operations
> and if the machine you run on has enough free memory it might not even
> actually touch swap space.
>

Why not stat the input file and decide to use in memory iff the file is
small enough?  This way sort will handle large sorts on small memory
machines automatically.

-- 
Brett Lymn
--
Sent from my NetBSD device.

"We are were wolves",
"You mean werewolves?",
"No we were wolves, now we are something else entirely",
"Oh"
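The stat-and-decide idea in this message can be sketched roughly as
follows.  This is an illustration only, not code from sort(1): the
helper name fits_in_memory and the 64 MB cutoff IN_MEMORY_LIMIT are
assumptions, and a real implementation would presumably derive the
limit from available memory rather than a constant.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical cutoff for illustration; not taken from sort(1). */
#define IN_MEMORY_LIMIT ((off_t)64 * 1024 * 1024)

/*
 * Hypothetical helper: return true if `path` names a regular file no
 * larger than `limit` bytes, i.e. a candidate for sorting entirely in
 * memory.  Anything that cannot be stat'ed, or that is not a regular
 * file (pipe, tty, directory), returns false so the caller falls back
 * to the temp-file path.
 */
static bool
fits_in_memory(const char *path, off_t limit)
{
	struct stat st;

	if (stat(path, &st) == -1)
		return false;
	if (!S_ISREG(st.st_mode))
		return false;	/* st_size is not a usable input size here */
	return st.st_size <= limit;
}
```

A caller would test fits_in_memory(file, IN_MEMORY_LIMIT) once per
input and pick the code path accordingly.  Note that this only covers
named regular files; input arriving on a pipe has no size to stat, so
it always takes the fallback path.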