Re: problems with mmap() and disk caching

2012-05-23 Thread Andrey Zonov

On 4/30/12 3:49 AM, Alan Cox wrote:

On 04/11/2012 01:07, Andrey Zonov wrote:

On 10.04.2012 20:19, Alan Cox wrote:

On 04/09/2012 10:26, John Baldwin wrote:

On Thursday, April 05, 2012 11:54:31 am Alan Cox wrote:

On 04/04/2012 02:17, Konstantin Belousov wrote:

On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:

Hi,

I open the file, then call mmap() on the whole file and get a pointer,
then I work with this pointer.  I expect that each page should be
touched only once to get it into memory (the disk cache?), but this
doesn't work!

I wrote the test (attached) and ran it for the 1G file generated from
/dev/random, the result is the following:

Prepare file:
# swapoff -a
# newfs /dev/ada0b
# mount /dev/ada0b /mnt
# dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024

Purge cache:
# umount /mnt
# mount /dev/ada0b /mnt

Run test:
$ ./mmap /mnt/random-1024 30
mmap: 1 pass took: 7.431046 (none: 262112; res: 32; super:
0; other: 0)
mmap: 2 pass took: 7.356670 (none: 261648; res: 496; super:
0; other: 0)
mmap: 3 pass took: 7.307094 (none: 260521; res: 1623; super:
0; other: 0)
mmap: 4 pass took: 7.350239 (none: 258904; res: 3240; super:
0; other: 0)
mmap: 5 pass took: 7.392480 (none: 257286; res: 4858; super:
0; other: 0)
mmap: 6 pass took: 7.292069 (none: 255584; res: 6560; super:
0; other: 0)
mmap: 7 pass took: 7.048980 (none: 251142; res: 11002; super:
0; other: 0)
mmap: 8 pass took: 6.899387 (none: 247584; res: 14560; super:
0; other: 0)
mmap: 9 pass took: 7.190579 (none: 242992; res: 19152; super:
0; other: 0)
mmap: 10 pass took: 6.915482 (none: 239308; res: 22836; super:
0; other: 0)
mmap: 11 pass took: 6.565909 (none: 232835; res: 29309; super:
0; other: 0)
mmap: 12 pass took: 6.423945 (none: 226160; res: 35984; super:
0; other: 0)
mmap: 13 pass took: 6.315385 (none: 208555; res: 53589; super:
0; other: 0)
mmap: 14 pass took: 6.760780 (none: 192805; res: 69339; super:
0; other: 0)
mmap: 15 pass took: 5.721513 (none: 174497; res: 87647; super:
0; other: 0)
mmap: 16 pass took: 5.004424 (none: 155938; res: 106206; super:
0; other: 0)
mmap: 17 pass took: 4.224926 (none: 135639; res: 126505; super:
0; other: 0)
mmap: 18 pass took: 3.749608 (none: 117952; res: 144192; super:
0; other: 0)
mmap: 19 pass took: 3.398084 (none: 99066; res: 163078; super:
0; other: 0)
mmap: 20 pass took: 3.029557 (none: 74994; res: 187150; super:
0; other: 0)
mmap: 21 pass took: 2.379430 (none: 55231; res: 206913; super:
0; other: 0)
mmap: 22 pass took: 2.046521 (none: 40786; res: 221358; super:
0; other: 0)
mmap: 23 pass took: 1.152797 (none: 30311; res: 231833; super:
0; other: 0)
mmap: 24 pass took: 0.972617 (none: 16196; res: 245948; super:
0; other: 0)
mmap: 25 pass took: 0.577515 (none: 8286; res: 253858; super:
0; other: 0)
mmap: 26 pass took: 0.380738 (none: 3712; res: 258432; super:
0; other: 0)
mmap: 27 pass took: 0.253583 (none: 1193; res: 260951; super:
0; other: 0)
mmap: 28 pass took: 0.157508 (none: 0; res: 262144; super:
0; other: 0)
mmap: 29 pass took: 0.156169 (none: 0; res: 262144; super:
0; other: 0)
mmap: 30 pass took: 0.156550 (none: 0; res: 262144; super:
0; other: 0)

If I run this:
$ cat /mnt/random-1024 > /dev/null
before the test, then the result is the following:

$ ./mmap /mnt/random-1024 5
mmap: 1 pass took: 0.337657 (none: 0; res: 262144; super:
0; other: 0)
mmap: 2 pass took: 0.186137 (none: 0; res: 262144; super:
0; other: 0)
mmap: 3 pass took: 0.186132 (none: 0; res: 262144; super:
0; other: 0)
mmap: 4 pass took: 0.186535 (none: 0; res: 262144; super:
0; other: 0)
mmap: 5 pass took: 0.190353 (none: 0; res: 262144; super:
0; other: 0)

This is what I expect.  But why doesn't this work without reading the
file manually?
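
The test program itself was an attachment and is not reproduced here.  A
minimal sketch of what it presumably does (map the file, touch every page
on each pass, and classify pages with mincore(2)) could look like the
following; the exact counter definitions, in particular how the "super"
count is derived, and the output format are assumptions:

/*
 * Sketch of the test: mmap a file, touch every page once per pass,
 * then classify pages with mincore(2).
 */
#include <sys/mman.h>
#include <sys/stat.h>
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	struct stat st;
	struct timespec t0, t1;
	size_t i, npages, none, res, super, other;
	long pgsz;
	char *p, *vec;
	volatile char c;
	int fd, pass, passes;

	if (argc < 2)
		errx(1, "usage: mmap file [passes]");
	passes = argc > 2 ? atoi(argv[2]) : 1;
	pgsz = sysconf(_SC_PAGESIZE);
	if ((fd = open(argv[1], O_RDONLY)) == -1)
		err(1, "open");
	if (fstat(fd, &st) == -1)
		err(1, "fstat");
	p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		err(1, "mmap");
	npages = (st.st_size + pgsz - 1) / pgsz;
	vec = malloc(npages);

	for (pass = 1; pass <= passes; pass++) {
		clock_gettime(CLOCK_MONOTONIC, &t0);
		for (i = 0; i < npages; i++)
			c = p[i * pgsz];	/* touch every page */
		clock_gettime(CLOCK_MONOTONIC, &t1);

		if (mincore(p, st.st_size, vec) == -1)
			err(1, "mincore");
		none = res = super = other = 0;
		for (i = 0; i < npages; i++) {
			if (vec[i] == 0)
				none++;		/* not resident */
			else if (vec[i] & MINCORE_SUPER)
				super++;	/* resident, superpage mapping */
			else if (vec[i] & MINCORE_INCORE)
				res++;		/* resident, ordinary mapping */
			else
				other++;
		}
		printf("mmap: %2d pass took: %10.6f "
		    "(none: %6zu; res: %6zu; super: %6zu; other: %6zu)\n",
		    pass, (t1.tv_sec - t0.tv_sec) +
		    (t1.tv_nsec - t0.tv_nsec) / 1e9, none, res, super, other);
	}
	return (0);
}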

The issue seems to be some change in the behaviour of the reserv or
phys allocator.  I Cc:ed Alan.

I'm pretty sure that the behavior here hasn't significantly changed in
about twelve years. Otherwise, I agree with your analysis.

On more than one occasion, I've been tempted to change:

	pmap_remove_all(mt);
	if (mt->dirty != 0)
		vm_page_deactivate(mt);
	else
		vm_page_cache(mt);

to:

vm_page_dontneed(mt);

because I suspect that the current code does more harm than good. In
theory, it saves activations of the page daemon. However, more often
than not, I suspect that we are spending more on page reactivations than
we are saving on page daemon activations. The sequential access
detection heuristic is just too easily triggered. For example, I've
seen it triggered by demand paging of the gcc text segment. Also, I
think that pmap_remove_all() and especially vm_page_cache() are too
severe for a detection heuristic that is so easily triggered.

Are you planning to commit this?



Not yet. I did some tests with a file that was several times larger than
DRAM, and I didn't like what I saw. Initially, everything behaved as
expected, but about halfway through the test the bulk of the pages were
active. Despite the call to pmap_clear_reference() in
vm_page_dontneed(), the page daemon is finding the pages to be

Re: problems with mmap() and disk caching

2012-04-29 Thread Alan Cox

On 04/11/2012 01:07, Andrey Zonov wrote:

On 10.04.2012 20:19, Alan Cox wrote:

On 04/09/2012 10:26, John Baldwin wrote:

On Thursday, April 05, 2012 11:54:31 am Alan Cox wrote:

On 04/04/2012 02:17, Konstantin Belousov wrote:

On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:

Hi,

I open the file, then call mmap() on the whole file and get a pointer,
then I work with this pointer.  I expect that each page should be
touched only once to get it into memory (the disk cache?), but this
doesn't work!

I wrote the test (attached) and ran it for the 1G file generated from
/dev/random, the result is the following:

Prepare file:
# swapoff -a
# newfs /dev/ada0b
# mount /dev/ada0b /mnt
# dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024

Purge cache:
# umount /mnt
# mount /dev/ada0b /mnt

Run test:
$ ./mmap /mnt/random-1024 30
mmap: 1 pass took: 7.431046 (none: 262112; res: 32; super:
0; other: 0)
mmap: 2 pass took: 7.356670 (none: 261648; res: 496; super:
0; other: 0)
mmap: 3 pass took: 7.307094 (none: 260521; res: 1623; super:
0; other: 0)
mmap: 4 pass took: 7.350239 (none: 258904; res: 3240; super:
0; other: 0)
mmap: 5 pass took: 7.392480 (none: 257286; res: 4858; super:
0; other: 0)
mmap: 6 pass took: 7.292069 (none: 255584; res: 6560; super:
0; other: 0)
mmap: 7 pass took: 7.048980 (none: 251142; res: 11002; super:
0; other: 0)
mmap: 8 pass took: 6.899387 (none: 247584; res: 14560; super:
0; other: 0)
mmap: 9 pass took: 7.190579 (none: 242992; res: 19152; super:
0; other: 0)
mmap: 10 pass took: 6.915482 (none: 239308; res: 22836; super:
0; other: 0)
mmap: 11 pass took: 6.565909 (none: 232835; res: 29309; super:
0; other: 0)
mmap: 12 pass took: 6.423945 (none: 226160; res: 35984; super:
0; other: 0)
mmap: 13 pass took: 6.315385 (none: 208555; res: 53589; super:
0; other: 0)
mmap: 14 pass took: 6.760780 (none: 192805; res: 69339; super:
0; other: 0)
mmap: 15 pass took: 5.721513 (none: 174497; res: 87647; super:
0; other: 0)
mmap: 16 pass took: 5.004424 (none: 155938; res: 106206; super:
0; other: 0)
mmap: 17 pass took: 4.224926 (none: 135639; res: 126505; super:
0; other: 0)
mmap: 18 pass took: 3.749608 (none: 117952; res: 144192; super:
0; other: 0)
mmap: 19 pass took: 3.398084 (none: 99066; res: 163078; super:
0; other: 0)
mmap: 20 pass took: 3.029557 (none: 74994; res: 187150; super:
0; other: 0)
mmap: 21 pass took: 2.379430 (none: 55231; res: 206913; super:
0; other: 0)
mmap: 22 pass took: 2.046521 (none: 40786; res: 221358; super:
0; other: 0)
mmap: 23 pass took: 1.152797 (none: 30311; res: 231833; super:
0; other: 0)
mmap: 24 pass took: 0.972617 (none: 16196; res: 245948; super:
0; other: 0)
mmap: 25 pass took: 0.577515 (none: 8286; res: 253858; super:
0; other: 0)
mmap: 26 pass took: 0.380738 (none: 3712; res: 258432; super:
0; other: 0)
mmap: 27 pass took: 0.253583 (none: 1193; res: 260951; super:
0; other: 0)
mmap: 28 pass took: 0.157508 (none: 0; res: 262144; super:
0; other: 0)
mmap: 29 pass took: 0.156169 (none: 0; res: 262144; super:
0; other: 0)
mmap: 30 pass took: 0.156550 (none: 0; res: 262144; super:
0; other: 0)

If I run this:
$ cat /mnt/random-1024 > /dev/null
before the test, then the result is the following:

$ ./mmap /mnt/random-1024 5
mmap: 1 pass took: 0.337657 (none: 0; res: 262144; super:
0; other: 0)
mmap: 2 pass took: 0.186137 (none: 0; res: 262144; super:
0; other: 0)
mmap: 3 pass took: 0.186132 (none: 0; res: 262144; super:
0; other: 0)
mmap: 4 pass took: 0.186535 (none: 0; res: 262144; super:
0; other: 0)
mmap: 5 pass took: 0.190353 (none: 0; res: 262144; super:
0; other: 0)

This is what I expect.  But why doesn't this work without reading the
file manually?

Issue seems to be in some change of the behaviour of the reserv or
phys allocator. I Cc:ed Alan.

I'm pretty sure that the behavior here hasn't significantly changed in
about twelve years. Otherwise, I agree with your analysis.

On more than one occasion, I've been tempted to change:

	pmap_remove_all(mt);
	if (mt->dirty != 0)
		vm_page_deactivate(mt);
	else
		vm_page_cache(mt);

to:

vm_page_dontneed(mt);

because I suspect that the current code does more harm than good. In
theory, it saves activations of the page daemon. However, more often
than not, I suspect that we are spending more on page reactivations than
we are saving on page daemon activations. The sequential access
detection heuristic is just too easily triggered. For example, I've
seen it triggered by demand paging of the gcc text segment. Also, I
think that pmap_remove_all() and especially vm_page_cache() are too
severe for a detection heuristic that is so easily triggered.

Are you planning to commit this?



Not yet. I did some tests with a file that was several times larger than
DRAM, and I didn't like what I saw. Initially, everything behaved as
expected, but about halfway through the test the bulk of the pages were
active. Despite the call to pmap_clear_reference() in
vm_page_dontneed(), the page daemon is finding the pages to be
referenced and reactivating 

Re: problems with mmap() and disk caching

2012-04-11 Thread Andrey Zonov

On 10.04.2012 20:19, Alan Cox wrote:

On 04/09/2012 10:26, John Baldwin wrote:

On Thursday, April 05, 2012 11:54:31 am Alan Cox wrote:

On 04/04/2012 02:17, Konstantin Belousov wrote:

On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:

Hi,

I open the file, then call mmap() on the whole file and get a pointer,
then I work with this pointer.  I expect that each page should be touched
only once to get it into memory (disk cache?), but this doesn't work!

I wrote the test (attached) and ran it for the 1G file generated from
/dev/random, the result is the following:

Prepare file:
# swapoff -a
# newfs /dev/ada0b
# mount /dev/ada0b /mnt
# dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024

Purge cache:
# umount /mnt
# mount /dev/ada0b /mnt

Run test:
$ ./mmap /mnt/random-1024 30
mmap: 1 pass took: 7.431046 (none: 262112; res: 32; super:
0; other: 0)
mmap: 2 pass took: 7.356670 (none: 261648; res: 496; super:
0; other: 0)
mmap: 3 pass took: 7.307094 (none: 260521; res: 1623; super:
0; other: 0)
mmap: 4 pass took: 7.350239 (none: 258904; res: 3240; super:
0; other: 0)
mmap: 5 pass took: 7.392480 (none: 257286; res: 4858; super:
0; other: 0)
mmap: 6 pass took: 7.292069 (none: 255584; res: 6560; super:
0; other: 0)
mmap: 7 pass took: 7.048980 (none: 251142; res: 11002; super:
0; other: 0)
mmap: 8 pass took: 6.899387 (none: 247584; res: 14560; super:
0; other: 0)
mmap: 9 pass took: 7.190579 (none: 242992; res: 19152; super:
0; other: 0)
mmap: 10 pass took: 6.915482 (none: 239308; res: 22836; super:
0; other: 0)
mmap: 11 pass took: 6.565909 (none: 232835; res: 29309; super:
0; other: 0)
mmap: 12 pass took: 6.423945 (none: 226160; res: 35984; super:
0; other: 0)
mmap: 13 pass took: 6.315385 (none: 208555; res: 53589; super:
0; other: 0)
mmap: 14 pass took: 6.760780 (none: 192805; res: 69339; super:
0; other: 0)
mmap: 15 pass took: 5.721513 (none: 174497; res: 87647; super:
0; other: 0)
mmap: 16 pass took: 5.004424 (none: 155938; res: 106206; super:
0; other: 0)
mmap: 17 pass took: 4.224926 (none: 135639; res: 126505; super:
0; other: 0)
mmap: 18 pass took: 3.749608 (none: 117952; res: 144192; super:
0; other: 0)
mmap: 19 pass took: 3.398084 (none: 99066; res: 163078; super:
0; other: 0)
mmap: 20 pass took: 3.029557 (none: 74994; res: 187150; super:
0; other: 0)
mmap: 21 pass took: 2.379430 (none: 55231; res: 206913; super:
0; other: 0)
mmap: 22 pass took: 2.046521 (none: 40786; res: 221358; super:
0; other: 0)
mmap: 23 pass took: 1.152797 (none: 30311; res: 231833; super:
0; other: 0)
mmap: 24 pass took: 0.972617 (none: 16196; res: 245948; super:
0; other: 0)
mmap: 25 pass took: 0.577515 (none: 8286; res: 253858; super:
0; other: 0)
mmap: 26 pass took: 0.380738 (none: 3712; res: 258432; super:
0; other: 0)
mmap: 27 pass took: 0.253583 (none: 1193; res: 260951; super:
0; other: 0)
mmap: 28 pass took: 0.157508 (none: 0; res: 262144; super:
0; other: 0)
mmap: 29 pass took: 0.156169 (none: 0; res: 262144; super:
0; other: 0)
mmap: 30 pass took: 0.156550 (none: 0; res: 262144; super:
0; other: 0)

If I run this:
$ cat /mnt/random-1024 > /dev/null
before the test, then the result is the following:

$ ./mmap /mnt/random-1024 5
mmap: 1 pass took: 0.337657 (none: 0; res: 262144; super:
0; other: 0)
mmap: 2 pass took: 0.186137 (none: 0; res: 262144; super:
0; other: 0)
mmap: 3 pass took: 0.186132 (none: 0; res: 262144; super:
0; other: 0)
mmap: 4 pass took: 0.186535 (none: 0; res: 262144; super:
0; other: 0)
mmap: 5 pass took: 0.190353 (none: 0; res: 262144; super:
0; other: 0)

This is what I expect.  But why doesn't this work without reading the
file manually?

Issue seems to be in some change of the behaviour of the reserv or
phys allocator. I Cc:ed Alan.

I'm pretty sure that the behavior here hasn't significantly changed in
about twelve years. Otherwise, I agree with your analysis.

On more than one occasion, I've been tempted to change:

	pmap_remove_all(mt);
	if (mt->dirty != 0)
		vm_page_deactivate(mt);
	else
		vm_page_cache(mt);

to:

vm_page_dontneed(mt);

because I suspect that the current code does more harm than good. In
theory, it saves activations of the page daemon. However, more often
than not, I suspect that we are spending more on page reactivations than
we are saving on page daemon activations. The sequential access
detection heuristic is just too easily triggered. For example, I've
seen it triggered by demand paging of the gcc text segment. Also, I
think that pmap_remove_all() and especially vm_page_cache() are too
severe for a detection heuristic that is so easily triggered.

Are you planning to commit this?



Not yet. I did some tests with a file that was several times larger than
DRAM, and I didn't like what I saw. Initially, everything behaved as
expected, but about halfway through the test the bulk of the pages were
active. Despite the call to pmap_clear_reference() in
vm_page_dontneed(), the page daemon is finding the pages to be
referenced and reactivating them. The net result is that the time it
takes to 

Re: problems with mmap() and disk caching

2012-04-10 Thread Alan Cox

On 04/09/2012 10:26, John Baldwin wrote:

On Thursday, April 05, 2012 11:54:31 am Alan Cox wrote:

On 04/04/2012 02:17, Konstantin Belousov wrote:

On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:

Hi,

I open the file, then call mmap() on the whole file and get a pointer,
then I work with this pointer.  I expect that each page should be touched
only once to get it into memory (disk cache?), but this doesn't work!

I wrote the test (attached) and ran it for the 1G file generated from
/dev/random, the result is the following:

Prepare file:
# swapoff -a
# newfs /dev/ada0b
# mount /dev/ada0b /mnt
# dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024

Purge cache:
# umount /mnt
# mount /dev/ada0b /mnt

Run test:
$ ./mmap /mnt/random-1024 30
mmap:  1 pass took:   7.431046 (none: 262112; res: 32; super:
0; other:  0)
mmap:  2 pass took:   7.356670 (none: 261648; res:496; super:
0; other:  0)
mmap:  3 pass took:   7.307094 (none: 260521; res:   1623; super:
0; other:  0)
mmap:  4 pass took:   7.350239 (none: 258904; res:   3240; super:
0; other:  0)
mmap:  5 pass took:   7.392480 (none: 257286; res:   4858; super:
0; other:  0)
mmap:  6 pass took:   7.292069 (none: 255584; res:   6560; super:
0; other:  0)
mmap:  7 pass took:   7.048980 (none: 251142; res:  11002; super:
0; other:  0)
mmap:  8 pass took:   6.899387 (none: 247584; res:  14560; super:
0; other:  0)
mmap:  9 pass took:   7.190579 (none: 242992; res:  19152; super:
0; other:  0)
mmap: 10 pass took:   6.915482 (none: 239308; res:  22836; super:
0; other:  0)
mmap: 11 pass took:   6.565909 (none: 232835; res:  29309; super:
0; other:  0)
mmap: 12 pass took:   6.423945 (none: 226160; res:  35984; super:
0; other:  0)
mmap: 13 pass took:   6.315385 (none: 208555; res:  53589; super:
0; other:  0)
mmap: 14 pass took:   6.760780 (none: 192805; res:  69339; super:
0; other:  0)
mmap: 15 pass took:   5.721513 (none: 174497; res:  87647; super:
0; other:  0)
mmap: 16 pass took:   5.004424 (none: 155938; res: 106206; super:
0; other:  0)
mmap: 17 pass took:   4.224926 (none: 135639; res: 126505; super:
0; other:  0)
mmap: 18 pass took:   3.749608 (none: 117952; res: 144192; super:
0; other:  0)
mmap: 19 pass took:   3.398084 (none:  99066; res: 163078; super:
0; other:  0)
mmap: 20 pass took:   3.029557 (none:  74994; res: 187150; super:
0; other:  0)
mmap: 21 pass took:   2.379430 (none:  55231; res: 206913; super:
0; other:  0)
mmap: 22 pass took:   2.046521 (none:  40786; res: 221358; super:
0; other:  0)
mmap: 23 pass took:   1.152797 (none:  30311; res: 231833; super:
0; other:  0)
mmap: 24 pass took:   0.972617 (none:  16196; res: 245948; super:
0; other:  0)
mmap: 25 pass took:   0.577515 (none:   8286; res: 253858; super:
0; other:  0)
mmap: 26 pass took:   0.380738 (none:   3712; res: 258432; super:
0; other:  0)
mmap: 27 pass took:   0.253583 (none:   1193; res: 260951; super:
0; other:  0)
mmap: 28 pass took:   0.157508 (none:  0; res: 262144; super:
0; other:  0)
mmap: 29 pass took:   0.156169 (none:  0; res: 262144; super:
0; other:  0)
mmap: 30 pass took:   0.156550 (none:  0; res: 262144; super:
0; other:  0)

If I run this:
$ cat /mnt/random-1024 > /dev/null
before the test, then the result is the following:

$ ./mmap /mnt/random-1024 5
mmap:  1 pass took:   0.337657 (none:  0; res: 262144; super:
0; other:  0)
mmap:  2 pass took:   0.186137 (none:  0; res: 262144; super:
0; other:  0)
mmap:  3 pass took:   0.186132 (none:  0; res: 262144; super:
0; other:  0)
mmap:  4 pass took:   0.186535 (none:  0; res: 262144; super:
0; other:  0)
mmap:  5 pass took:   0.190353 (none:  0; res: 262144; super:
0; other:  0)

This is what I expect.  But why doesn't this work without reading the
file manually?

Issue seems to be in some change of the behaviour of the reserv or
phys allocator. I Cc:ed Alan.

I'm pretty sure that the behavior here hasn't significantly changed in
about twelve years.  Otherwise, I agree with your analysis.

On more than one occasion, I've been tempted to change:

  pmap_remove_all(mt);
  if (mt->dirty != 0)
  vm_page_deactivate(mt);
  else
  vm_page_cache(mt);

to:

  vm_page_dontneed(mt);

because I suspect that the current code does more harm than good.  In
theory, it saves activations of the page daemon.  However, more often
than not, I suspect that we are spending more on page reactivations than
we are saving on page daemon activations.  The sequential access
detection heuristic is just too easily triggered.  For example, I've
seen it triggered by demand paging of the gcc text segment.  Also, I

mlock/mlockall (was: Re: problems with mmap() and disk caching)

2012-04-10 Thread Dieter BSD
Andrey writes:
 Wired memory: kernel memory and yes, application may get wired memory
 through mlock()/mlockall(), but I haven't seen any real application
 which calls mlock().

Apps with real-time considerations may need to lock memory to avoid
having to wait for paging or swapping.
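
As an illustration of that point, a minimal sketch of how such a process
might pin its address space; the function name and error handling are
illustrative only:

#include <sys/mman.h>
#include <stdio.h>

/* Pin all current and future pages so the process never blocks on page-in. */
static int
pin_all_memory(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1) {
		/* Typically needs root or a sufficient RLIMIT_MEMLOCK. */
		perror("mlockall");
		return (-1);
	}
	return (0);
}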


Re: problems with mmap() and disk caching

2012-04-09 Thread Andrey Zonov

On 06.04.2012 12:13, Konstantin Belousov wrote:

On Thu, Apr 05, 2012 at 11:54:53PM +0400, Andrey Zonov wrote:

On 05.04.2012 23:41, Konstantin Belousov wrote:

On Thu, Apr 05, 2012 at 11:33:46PM +0400, Andrey Zonov wrote:

On 05.04.2012 19:54, Alan Cox wrote:

On 04/04/2012 02:17, Konstantin Belousov wrote:

On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:

[snip]

This is what I expect.  But why doesn't this work without reading the
file manually?

Issue seems to be in some change of the behaviour of the reserv or
phys allocator. I Cc:ed Alan.


I'm pretty sure that the behavior here hasn't significantly changed in
about twelve years. Otherwise, I agree with your analysis.

On more than one occasion, I've been tempted to change:

	pmap_remove_all(mt);
	if (mt->dirty != 0)
		vm_page_deactivate(mt);
	else
		vm_page_cache(mt);

to:

vm_page_dontneed(mt);



Thanks Alan!  Now it works as I expect!

But I have more questions to you and kib@.  They are in my test below.

So, prepare the file as before, and take memory usage information from
top(1).  After preparation, but before the test:

Mem: 80M Active, 55M Inact, 721M Wired, 215M Buf, 46G Free

First run:
$ ./mmap /mnt/random
mmap:  1 pass took:   7.462865 (none:  0; res: 262144; super:
0; other:  0)

No super pages after first run, why?..
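
One way to check whether any superpage promotions happened during such a
run is to look at the pmap counters; the sysctl names below are from the
amd64 pmap of that era and are given here only as a hint:

$ sysctl vm.pmap.pde.promotions vm.pmap.pde.p_failures vm.pmap.pde.demotions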

Mem: 79M Active, 1079M Inact, 722M Wired, 216M Buf, 45G Free

Now the file is in inactive memory, that's good.

Second run:
$ ./mmap /mnt/random
mmap:  1 pass took:   0.004191 (none:  0; res: 262144; super:
511; other:  0)

All super pages are here, nice.

Mem: 1103M Active, 55M Inact, 722M Wired, 216M Buf, 45G Free

Wow, all inactive pages moved to active and sit there even after process
was terminated, that's not good, what do you think?

Why do you think this is 'not good' ? You have plenty of free memory,
there is no memory pressure, and all pages were referenced recently.
There is no reason for them to be deactivated.



I always thought that active memory is the sum of the resident memory of
all processes, inactive shows the disk cache, and wired shows the kernel
itself.

So you are wrong.  Both active and inactive memory can be mapped and
not mapped, and both can belong to a vnode or to anonymous objects, etc.
The active/inactive distinction is only about the amount of references
noted by the pagedaemon, or some other page history, like the way the
page was unwired.

Wired does not necessarily mean kernel-used pages; user processes can
wire their pages as well.


Let's talk about that in detail.

My understanding is the following:

Active memory: the memory which is referenced by an application.  An
application may get memory only through mmap() (allocators don't use
brk()/sbrk() any more).  The resident memory of an application is the
sum of the physical memory it uses.  So, the sum of RSS is active memory.

Inactive memory: the memory which has no references.  Once we call
read() on a file, the file is in inactive memory, because we have no
references to this object, we just read it.  This is also memory
released by free().

Cache memory: I don't know what it is.  It's always small enough not to
think about it.

Wired memory: kernel memory and, yes, an application may get wired memory
through mlock()/mlockall(), but I haven't seen any real application
which calls mlock().






Read the file:
$ cat /mnt/random > /dev/null

Mem: 79M Active, 55M Inact, 1746M Wired, 1240M Buf, 45G Free

Now the file is in wired memory.  I do not understand why so.

You do use UFS, right ?


Yes.


There are enough buffer headers and buffer KVA
to have buffers allocated for the whole file content.  Since buffers wire
the corresponding pages, you get pages migrated to wired.

When buffer pressure appears (i.e., any other I/O is started),
the buffers will be repurposed and the pages moved to inactive.



OK, how can I get the amount of disk cache?

You cannot.  At least I am not aware of any counter that keeps track
of the resident pages belonging to the vnode pager.

Buffers should not be thought of as the disk cache; pages cache disk
content.  Instead, VMIO buffers only provide a bread()/bwrite()-compatible
interface to the page cache (*) for filesystems.
(*) - The term cache is used here in a generic sense, not to be confused
with the cached pages counter from top(1), etc.



Yes, I know that.  Let me ask my question about buffers once again: is
it reasonable for them to use 10% of the physical memory, or could we
set a rational upper limit automatically?
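
For reference, the limits being discussed can be inspected at run time;
the sysctl and tunable names below are assumed from FreeBSD of that
vintage:

$ sysctl vfs.bufspace vfs.maxbufspace kern.nbuf

kern.nbuf and kern.maxbcache are the corresponding loader tunables.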






Could you please give me an explanation of active/inactive/wired memory?



because I suspect that the current code does more harm than good. In
theory, it saves activations of the page daemon. However, more often
than not, I suspect that we are spending more on page reactivations than
we are saving on page daemon activations. The sequential access
detection heuristic is just too easily triggered. For example, I've seen
it triggered by demand paging of the gcc text segment. Also, I think
that pmap_remove_all() and especially 

Re: problems with mmap() and disk caching

2012-04-09 Thread Konstantin Belousov
On Mon, Apr 09, 2012 at 11:17:41AM +0400, Andrey Zonov wrote:
 On 06.04.2012 12:13, Konstantin Belousov wrote:
 On Thu, Apr 05, 2012 at 11:54:53PM +0400, Andrey Zonov wrote:
 On 05.04.2012 23:41, Konstantin Belousov wrote:
 On Thu, Apr 05, 2012 at 11:33:46PM +0400, Andrey Zonov wrote:
 On 05.04.2012 19:54, Alan Cox wrote:
 On 04/04/2012 02:17, Konstantin Belousov wrote:
 On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:
 [snip]
 This is what I expect.  But why doesn't this work without reading the
 file manually?
 Issue seems to be in some change of the behaviour of the reserv or
 phys allocator. I Cc:ed Alan.
 
 I'm pretty sure that the behavior here hasn't significantly changed in
 about twelve years. Otherwise, I agree with your analysis.
 
 On more than one occasion, I've been tempted to change:
 
 pmap_remove_all(mt);
 if (mt->dirty != 0)
 vm_page_deactivate(mt);
 else
 vm_page_cache(mt);
 
 to:
 
 vm_page_dontneed(mt);
 
 
 Thanks Alan!  Now it works as I expect!
 
 But I have more questions to you and kib@.  They are in my test below.
 
 So, prepare file as earlier, and take information about memory usage
 from top(1).  After preparation, but before test:
 Mem: 80M Active, 55M Inact, 721M Wired, 215M Buf, 46G Free
 
 First run:
 $ ./mmap /mnt/random
 mmap:  1 pass took:   7.462865 (none:  0; res: 262144; super:
 0; other:  0)
 
 No super pages after first run, why?..
 
 Mem: 79M Active, 1079M Inact, 722M Wired, 216M Buf, 45G Free
 
 Now the file is in inactive memory, that's good.
 
 Second run:
 $ ./mmap /mnt/random
 mmap:  1 pass took:   0.004191 (none:  0; res: 262144; super:
 511; other:  0)
 
 All super pages are here, nice.
 
 Mem: 1103M Active, 55M Inact, 722M Wired, 216M Buf, 45G Free
 
 Wow, all inactive pages moved to active and sit there even after process
 was terminated, that's not good, what do you think?
 Why do you think this is 'not good' ? You have plenty of free memory,
 there is no memory pressure, and all pages were referenced recently.
 There is no reason for them to be deactivated.
 
 
 I always thought that active memory this is a sum of resident memory of
 all processes, inactive shows disk cache and wired shows kernel itself.
 So you are wrong. Both active and inactive memory can be mapped and
 not mapped, both can belong to vnode or to anonymous objects etc.
 Active/inactive distinction is only the amount of references that was
 noted by pagedaemon, or some other page history like the way it was
 unwired.
 
 Wired does not necessarily mean kernel-used pages; user processes can
 wire their pages as well.
 
 Let's talk about that in details.
 
 My understanding is the following:
 
 Active memory: the memory which is referenced by application.  An 
Assuming the part 'by application' is removed, this sentence is almost right.
Any managed mapping of the page participates in the active references.

 application may get memory only through mmap() (allocator don't use 
 brk()/sbrk() any more).  The resident memory of an application is the 
 sum of physical used memory.  So, sum of RSS is active memory.
First, brk/sbrk is still used.  Second, there is no requirement that
resident pages are referenced.  E.g. a page could have participated in a
buffer, and unwiring on the buffer dissolve put it into the inactive state.
Or the pagedaemon cleared the reference and moved the page to the inactive
queue.  Or the page was prefaulted by various optimizations.

Moreover, there is a subtle difference between 'resident' and 'not causing
a fault on access'.  A page may be resident, but the pte was not
preinstalled, or the pte was flushed, etc.
 
 Inactive memory: the memory which has no references.  Once we call 
 read() on the file, the file is in inactive memory, because we have no 
 references to this object, we just read it.  This is also released 
 memory by free().
On buffer dissolve, the buffer cache explicitly puts the pages constituting
the buffer into the inactive queue.  In fact, this is not quite right,
e.g. if the same pages are mapped and actively referenced, then the
pagedaemon has slightly more work now to move the page from inactive
to active.

And free(3) operates at so much higher a level than the VM subsystem that
describing the interaction between the two is impossible in any definitive
way.  Old naive mallocs put the block description at the beginning of the
block, actually causing free() to reference at least the first page of the
block.  Jemalloc often does madvise(MADV_FREE) for large freed allocations.
MADV_FREE moves pages between queues probabilistically.
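
For illustration, a sketch of the MADV_FREE usage described above, as an
allocator might apply it to a large freed region (the function name is
illustrative, not jemalloc's actual code):

#include <sys/types.h>
#include <sys/mman.h>

/*
 * Tell the VM that the contents of a large freed region are no longer
 * needed; the pages stay mapped but may be reclaimed lazily, without
 * ever being paged out.
 */
static void
release_region_contents(void *addr, size_t len)
{
	(void)madvise(addr, len, MADV_FREE);
}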

 
 Cache memory: I don't know what is it. It's always small enough to not 
 think about it.
This was the bug you reported, and which Alan fixed on Sunday.

 
 Wired memory: kernel memory and yes, application may get wired memory 
 through mlock()/mlockall(), but I haven't seen any real application 
 which calls mlock().
ntpd, amd from the base system. gpg and similar programs try to mlock
key store to avoid sensitive material leakage to the 

Re: problems with mmap() and disk caching

2012-04-09 Thread Andrey Zonov
On Mon, Apr 9, 2012 at 1:18 PM, Konstantin Belousov kostik...@gmail.com wrote:
 On Mon, Apr 09, 2012 at 11:17:41AM +0400, Andrey Zonov wrote:
 On 06.04.2012 12:13, Konstantin Belousov wrote:
 On Thu, Apr 05, 2012 at 11:54:53PM +0400, Andrey Zonov wrote:
[snip]
 I always thought that active memory this is a sum of resident memory of
 all processes, inactive shows disk cache and wired shows kernel itself.
 So you are wrong. Both active and inactive memory can be mapped and
 not mapped, both can belong to vnode or to anonymous objects etc.
 Active/inactive distinction is only the amount of references that was
 noted by pagedaemon, or some other page history like the way it was
 unwired.
 
 Wired does not necessarily mean kernel-used pages; user processes can
 wire their pages as well.

 Let's talk about that in details.

 My understanding is the following:

 Active memory: the memory which is referenced by application.  An
 Assuming the part 'by application' is removed, this sentence is almost right.
 Any managed mapping of the page participates in the active references.

 application may get memory only through mmap() (allocator don't use
 brk()/sbrk() any more).  The resident memory of an application is the
 sum of physical used memory.  So, sum of RSS is active memory.
 First, brk/sbrk is still used. Second, there is no requirement that
 resident pages are referenced. E.g. page could have participated in the
 buffer, and unwiring on the buffer dissolve put it into inactive state.
 Or pagedaemon cleared the reference and moved the page to inactive queue.
 Or the page was prefaulted by different optimizations.

 More, there is subtle difference between 'resident' and 'not causing fault
 on access'. Page may be resident, but pte was not preinstalled, or pte
 was flushed etc.

From the user's point of view: how can the memory be active if no one (I
mean no application) uses it?

What I have actually seen, more than once, is that a program which had
worked with a big mmap()'ed file for a long time couldn't work well (many
page faults) with a new version of the file until I manually flushed
active memory by re-mounting the FS.  The new version couldn't force out
the old one.  In my opinion, if the VM moved cached objects to the
inactive queue after program termination, I wouldn't see this problem.


 Inactive memory: the memory which has no references.  Once we call
 read() on the file, the file is in inactive memory, because we have no
 references to this object, we just read it.  This is also released
 memory by free().
 On buffers dissolve, buffer cache explicitely puts pages constituing
 the buffer, into the inactive queue. In fact, this is not quite right,
 e.g. if the same pages are mapped and actively referenced, then
 pagedaemon has slightly more work now to move the page from inactive
 to active.


Yes, sure, if someone else uses the object it should be active; it would
be even better to introduce a new SHARED counter, like the one in MacOSX
and Linux.

 And, free(3) operates at so much higher level then vm subsystem that
 describing the interaction between these two is impossible in any
 definitive mood. Old naive mallocs put block description at the beggining
 of the block, actually causing free() to reference at least the first
 page of the block. Jemalloc often does madvise(MADV_FREE) for large
 freed allocations. MADV_FREE  moves pages between queues probabalistically.


That's exactly what I meant by free().  We drop act_count to 0 and
move the page to the inactive queue via vm_page_dontneed().


 Cache memory: I don't know what is it. It's always small enough to not
 think about it.
 This was the bug you reported, and which Alan fixed on Sunday.


I've tested this patch under 9.0-STABLE and have to say that it
introduces problems with interactivity on heavily disk-loaded machines.
With the patch that I tested before, I didn't observe such problems.


 Wired memory: kernel memory and yes, application may get wired memory
 through mlock()/mlockall(), but I haven't seen any real application
 which calls mlock().
 ntpd, amd from the base system. gpg and similar programs try to mlock
 key store to avoid sensitive material leakage to the swap. cdrecord(8)
 tried to mlock itself to avoid indefinite stalls during write.


Nice catch ;-)



 
 
 Read the file:
 $ cat /mnt/random > /dev/null
 
 Mem: 79M Active, 55M Inact, 1746M Wired, 1240M Buf, 45G Free
 
 Now the file is in wired memory.  I do not understand why so.
 You do use UFS, right ?
 
 Yes.
 
 There is enough buffer headers and buffer KVA
 to have buffers allocated for the whole file content. Since buffers wire
 corresponding pages, you get pages migrated to wired.
 
 When there appears a buffer pressure (i.e., any other i/o started),
 the buffers will be repurposed and pages moved to inactive.
 
 
 OK, how can I get amount of disk cache?
 You cannot. At least I am not aware of any counter that keeps track
 of the resident pages belonging to vnode pager.
 
 Buffers should not be thought as disk cache, pages cache disk 

Re: problems with mmap() and disk caching

2012-04-09 Thread Konstantin Belousov
On Mon, Apr 09, 2012 at 03:35:30PM +0400, Andrey Zonov wrote:
 On Mon, Apr 9, 2012 at 1:18 PM, Konstantin Belousov kostik...@gmail.com 
 wrote:
  On Mon, Apr 09, 2012 at 11:17:41AM +0400, Andrey Zonov wrote:
  On 06.04.2012 12:13, Konstantin Belousov wrote:
  On Thu, Apr 05, 2012 at 11:54:53PM +0400, Andrey Zonov wrote:
 [snip]
  I always thought that active memory this is a sum of resident memory of
  all processes, inactive shows disk cache and wired shows kernel itself.
  So you are wrong. Both active and inactive memory can be mapped and
  not mapped, both can belong to vnode or to anonymous objects etc.
  Active/inactive distinction is only the amount of references that was
  noted by pagedaemon, or some other page history like the way it was
  unwired.
  
  Wired does not necessarily mean kernel-used pages; user processes can
  wire their pages as well.
 
  Let's talk about that in details.
 
  My understanding is the following:
 
  Active memory: the memory which is referenced by application.  An
  Assuming the part 'by application' is removed, this sentence is almost 
  right.
  Any managed mapping of the page participates in the active references.
 
  application may get memory only through mmap() (allocator don't use
  brk()/sbrk() any more).  The resident memory of an application is the
  sum of physical used memory.  So, sum of RSS is active memory.
  First, brk/sbrk is still used. Second, there is no requirement that
  resident pages are referenced. E.g. page could have participated in the
  buffer, and unwiring on the buffer dissolve put it into inactive state.
  Or pagedaemon cleared the reference and moved the page to inactive queue.
  Or the page was prefaulted by different optimizations.
 
  More, there is subtle difference between 'resident' and 'not causing fault
  on access'. Page may be resident, but pte was not preinstalled, or pte
  was flushed etc.
 
 From the user point of view: how can the memory be active if no-one (I
 mean application) use it?
 
 What I really saw not at once is that the program for a long time
 worked with big mmap()'ed file, couldn't work well (many page faults)
 with new version of the file, until I manually flushed active memory
 by FS re-mounting.  New version couldn't force out the old one.  In my
 opinion if VM moved cached objects to inactive queue after program
 termination I wouldn't see this problem.
Moving pages to inactive just because some mapping was destroyed is plain
silly. The pages migrate between active/inactive/cache/free by the
pagedaemon algorithms.

BTW, you do not need to actually remount the filesystem to flush the pages
of its vnodes.  It is enough to try to unmount it while cd'ed into the
filesystem root.
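
I.e., something like the following; per the above, the failed unmount
attempt is what flushes the cached pages of the filesystem's vnodes:

# cd /mnt       # stay inside the mount point so the unmount cannot succeed
# umount /mnt   # fails (device busy), but the attempt flushes the vnode pages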
 
 
  Inactive memory: the memory which has no references.  Once we call
  read() on the file, the file is in inactive memory, because we have no
  references to this object, we just read it.  This is also released
  memory by free().
  On buffers dissolve, buffer cache explicitely puts pages constituing
  the buffer, into the inactive queue. In fact, this is not quite right,
  e.g. if the same pages are mapped and actively referenced, then
  pagedaemon has slightly more work now to move the page from inactive
  to active.
 
 
 Yes, sure, if someone else use the object it should be active and even
 better to introduce new SHARED counter, like one is in MacOSX and
 Linux.
Counter for what ? There is already the ref counter for a vm object.

 
  And, free(3) operates at so much higher level then vm subsystem that
  describing the interaction between these two is impossible in any
  definitive mood. Old naive mallocs put block description at the beggining
  of the block, actually causing free() to reference at least the first
  page of the block. Jemalloc often does madvise(MADV_FREE) for large
  freed allocations. MADV_FREE  moves pages between queues probabalistically.
 
 
 That's exactly what I meant by free().  We drop act_count to 0 and
 move page to inactive queue by vm_page_dontneed()
 
 
  Cache memory: I don't know what is it. It's always small enough to not
  think about it.
  This was the bug you reported, and which Alan fixed on Sunday.
 
 
 I've tested this patch under 9.0-STABLE and should say that it
 introduces problems with interactivity on heavy disk loaded machines.
 With the patch that I tested before I didn't observe such problems.
 
 
  Wired memory: kernel memory and yes, application may get wired memory
  through mlock()/mlockall(), but I haven't seen any real application
  which calls mlock().
  ntpd, amd from the base system. gpg and similar programs try to mlock
  key store to avoid sensitive material leakage to the swap. cdrecord(8)
  tried to mlock itself to avoid indefinite stalls during write.
 
 
 Nice catch ;-)
 
 
 
  
  
  Read the file:
  $ cat /mnt/random > /dev/null
  
  Mem: 79M Active, 55M Inact, 1746M Wired, 1240M Buf, 45G Free
  
  Now the file is in wired memory.  I do not understand why so.
  You do use UFS, 

Re: problems with mmap() and disk caching

2012-04-09 Thread John Baldwin
On Thursday, April 05, 2012 11:54:31 am Alan Cox wrote:
 On 04/04/2012 02:17, Konstantin Belousov wrote:
  On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:
  Hi,
 
  I open the file, then call mmap() on the whole file and get a pointer,
  then I work with this pointer.  I expect that each page should be touched
  only once to get it into memory (disk cache?), but this doesn't work!
 
  I wrote the test (attached) and ran it for the 1G file generated from
  /dev/random, the result is the following:
 
  Prepare file:
  # swapoff -a
  # newfs /dev/ada0b
  # mount /dev/ada0b /mnt
  # dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024
 
  Purge cache:
  # umount /mnt
  # mount /dev/ada0b /mnt
 
  Run test:
  $ ./mmap /mnt/random-1024 30
  mmap:  1 pass took:   7.431046 (none: 262112; res: 32; super:
  0; other:  0)
  mmap:  2 pass took:   7.356670 (none: 261648; res:496; super:
  0; other:  0)
  mmap:  3 pass took:   7.307094 (none: 260521; res:   1623; super:
  0; other:  0)
  mmap:  4 pass took:   7.350239 (none: 258904; res:   3240; super:
  0; other:  0)
  mmap:  5 pass took:   7.392480 (none: 257286; res:   4858; super:
  0; other:  0)
  mmap:  6 pass took:   7.292069 (none: 255584; res:   6560; super:
  0; other:  0)
  mmap:  7 pass took:   7.048980 (none: 251142; res:  11002; super:
  0; other:  0)
  mmap:  8 pass took:   6.899387 (none: 247584; res:  14560; super:
  0; other:  0)
  mmap:  9 pass took:   7.190579 (none: 242992; res:  19152; super:
  0; other:  0)
  mmap: 10 pass took:   6.915482 (none: 239308; res:  22836; super:
  0; other:  0)
  mmap: 11 pass took:   6.565909 (none: 232835; res:  29309; super:
  0; other:  0)
  mmap: 12 pass took:   6.423945 (none: 226160; res:  35984; super:
  0; other:  0)
  mmap: 13 pass took:   6.315385 (none: 208555; res:  53589; super:
  0; other:  0)
  mmap: 14 pass took:   6.760780 (none: 192805; res:  69339; super:
  0; other:  0)
  mmap: 15 pass took:   5.721513 (none: 174497; res:  87647; super:
  0; other:  0)
  mmap: 16 pass took:   5.004424 (none: 155938; res: 106206; super:
  0; other:  0)
  mmap: 17 pass took:   4.224926 (none: 135639; res: 126505; super:
  0; other:  0)
  mmap: 18 pass took:   3.749608 (none: 117952; res: 144192; super:
  0; other:  0)
  mmap: 19 pass took:   3.398084 (none:  99066; res: 163078; super:
  0; other:  0)
  mmap: 20 pass took:   3.029557 (none:  74994; res: 187150; super:
  0; other:  0)
  mmap: 21 pass took:   2.379430 (none:  55231; res: 206913; super:
  0; other:  0)
  mmap: 22 pass took:   2.046521 (none:  40786; res: 221358; super:
  0; other:  0)
  mmap: 23 pass took:   1.152797 (none:  30311; res: 231833; super:
  0; other:  0)
  mmap: 24 pass took:   0.972617 (none:  16196; res: 245948; super:
  0; other:  0)
  mmap: 25 pass took:   0.577515 (none:   8286; res: 253858; super:
  0; other:  0)
  mmap: 26 pass took:   0.380738 (none:   3712; res: 258432; super:
  0; other:  0)
  mmap: 27 pass took:   0.253583 (none:   1193; res: 260951; super:
  0; other:  0)
  mmap: 28 pass took:   0.157508 (none:  0; res: 262144; super:
  0; other:  0)
  mmap: 29 pass took:   0.156169 (none:  0; res: 262144; super:
  0; other:  0)
  mmap: 30 pass took:   0.156550 (none:  0; res: 262144; super:
  0; other:  0)
 
  If I run this:
  $ cat /mnt/random-1024 > /dev/null
  before the test, then the result is the following:
 
  $ ./mmap /mnt/random-1024 5
  mmap:  1 pass took:   0.337657 (none:  0; res: 262144; super:
  0; other:  0)
  mmap:  2 pass took:   0.186137 (none:  0; res: 262144; super:
  0; other:  0)
  mmap:  3 pass took:   0.186132 (none:  0; res: 262144; super:
  0; other:  0)
  mmap:  4 pass took:   0.186535 (none:  0; res: 262144; super:
  0; other:  0)
  mmap:  5 pass took:   0.190353 (none:  0; res: 262144; super:
  0; other:  0)
 
  This is what I expect.  But why doesn't this work without reading the
  file manually?
  Issue seems to be in some change of the behaviour of the reserv or
  phys allocator. I Cc:ed Alan.
 
 I'm pretty sure that the behavior here hasn't significantly changed in 
 about twelve years.  Otherwise, I agree with your analysis.
 
 On more than one occasion, I've been tempted to change:
 
  pmap_remove_all(mt);
  if (mt->dirty != 0)
  vm_page_deactivate(mt);
  else
  vm_page_cache(mt);
 
 to:
 
  vm_page_dontneed(mt);
 
 because I suspect that the current code does more harm than good.  In 
 theory, it saves activations of the page daemon.  However, more often 
 than not, I suspect that we are spending more on page reactivations than 
 we are saving on page daemon 

Re: problems with mmap() and disk caching

2012-04-06 Thread Andrey Zonov

On 05.04.2012 23:54, Andrey Zonov wrote:

On 05.04.2012 23:41, Konstantin Belousov wrote:

You do use UFS, right ?


Yes.



I've run test on ZFS.

Mem: 2645M Active, 363M Inact, 2042M Wired, 1406M Buf, 42G Free

$ ./mmap /mnt/random

Mem: 3669M Active, 363M Inact, 3067M Wired, 1406M Buf, 40G Free

It eats 2GB, as I understand it.

# umount /mnt
# zfs mount -a

Mem: 2645M Active, 363M Inact, 2042M Wired, 1406M Buf, 42G Free

$ cat /mnt/random > /dev/null

Mem: 2645M Active, 363M Inact, 3067M Wired, 1406M Buf, 41G Free

That's correct - 1GB.

About Buf memory: is it reasonable to set it to 10% of physical memory?
I've lost 10GB by default on machines with 96GB.


--
Andrey Zonov


Re: problems with mmap() and disk caching

2012-04-06 Thread Konstantin Belousov
On Thu, Apr 05, 2012 at 11:54:53PM +0400, Andrey Zonov wrote:
 On 05.04.2012 23:41, Konstantin Belousov wrote:
 On Thu, Apr 05, 2012 at 11:33:46PM +0400, Andrey Zonov wrote:
 On 05.04.2012 19:54, Alan Cox wrote:
 On 04/04/2012 02:17, Konstantin Belousov wrote:
 On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:
 [snip]
 This is what I expect.  But why doesn't this work without reading the
 file manually?
 Issue seems to be in some change of the behaviour of the reserv or
 phys allocator. I Cc:ed Alan.
 
 I'm pretty sure that the behavior here hasn't significantly changed in
 about twelve years. Otherwise, I agree with your analysis.
 
 On more than one occasion, I've been tempted to change:
 
 pmap_remove_all(mt);
 if (mt->dirty != 0)
 vm_page_deactivate(mt);
 else
 vm_page_cache(mt);
 
 to:
 
 vm_page_dontneed(mt);
 
 
 Thanks Alan!  Now it works as I expect!
 
 But I have more questions to you and kib@.  They are in my test below.
 
 So, prepare file as earlier, and take information about memory usage
 from top(1).  After preparation, but before test:
 Mem: 80M Active, 55M Inact, 721M Wired, 215M Buf, 46G Free
 
 First run:
 $ ./mmap /mnt/random
 mmap:  1 pass took:   7.462865 (none:  0; res: 262144; super:
 0; other:  0)
 
 No super pages after first run, why?..
 
 Mem: 79M Active, 1079M Inact, 722M Wired, 216M Buf, 45G Free
 
 Now the file is in inactive memory, that's good.
 
 Second run:
 $ ./mmap /mnt/random
 mmap:  1 pass took:   0.004191 (none:  0; res: 262144; super:
 511; other:  0)
 
 All super pages are here, nice.
 
 Mem: 1103M Active, 55M Inact, 722M Wired, 216M Buf, 45G Free
 
 Wow, all inactive pages moved to active and sit there even after process
 was terminated, that's not good, what do you think?
 Why do you think this is 'not good' ? You have plenty of free memory,
 there is no memory pressure, and all pages were referenced recently.
 There is no reason for them to be deactivated.
 
 
 I always thought that active memory this is a sum of resident memory of 
 all processes, inactive shows disk cache and wired shows kernel itself.
So you are wrong. Both active and inactive memory can be mapped and
not mapped, both can belong to vnode or to anonymous objects etc.
Active/inactive distinction is only the amount of references that was
noted by pagedaemon, or some other page history like the way it was
unwired.

Wired does not necessarily mean kernel-used pages; user processes can
wire their pages as well.
 
 
 Read the file:
 $ cat /mnt/random > /dev/null
 
 Mem: 79M Active, 55M Inact, 1746M Wired, 1240M Buf, 45G Free
 
 Now the file is in wired memory.  I do not understand why so.
 You do use UFS, right ?
 
 Yes.
 
 There is enough buffer headers and buffer KVA
 to have buffers allocated for the whole file content. Since buffers wire
 corresponding pages, you get pages migrated to wired.
 
 When there appears a buffer pressure (i.e., any other i/o started),
 the buffers will be repurposed and pages moved to inactive.
 
 
 OK, how can I get amount of disk cache?
You cannot. At least I am not aware of any counter that keeps track
of the resident pages belonging to vnode pager.

Buffers should not be thought as disk cache, pages cache disk content.
Instead, VMIO buffers only provide bread()/bwrite() compatible interface
to the page cache (*) for filesystems.
(*) - The cache term is used in generic term, not to confuse with
cached pages counter from top etc.

 
 
 Could you please give me explanation about active/inactive/wired memory?
 
 
 because I suspect that the current code does more harm than good. In
 theory, it saves activations of the page daemon. However, more often
 than not, I suspect that we are spending more on page reactivations than
 we are saving on page daemon activations. The sequential access
 detection heuristic is just too easily triggered. For example, I've seen
 it triggered by demand paging of the gcc text segment. Also, I think
 that pmap_remove_all() and especially vm_page_cache() are too severe for
 a detection heuristic that is so easily triggered.
 
 [snip]
 
 --
 Andrey Zonov
 
 -- 
 Andrey Zonov




Re: problems with mmap() and disk caching

2012-04-06 Thread Konstantin Belousov
On Thu, Apr 05, 2012 at 01:25:49PM -0500, Alan Cox wrote:
 On 04/05/2012 12:31, Konstantin Belousov wrote:
 On Thu, Apr 05, 2012 at 10:54:31AM -0500, Alan Cox wrote:
 On 04/04/2012 02:17, Konstantin Belousov wrote:
 On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:
 Hi,
 
 I open the file, then call mmap() on the whole file and get a pointer,
 then I work with this pointer.  I expect that each page should be touched
 only once to get it into memory (disk cache?), but this doesn't work!
 
 I wrote the test (attached) and ran it for the 1G file generated from
 /dev/random, the result is the following:
 
 Prepare file:
 # swapoff -a
 # newfs /dev/ada0b
 # mount /dev/ada0b /mnt
 # dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024
 
 Purge cache:
 # umount /mnt
 # mount /dev/ada0b /mnt
 
 Run test:
 $ ./mmap /mnt/random-1024 30
 mmap:  1 pass took:   7.431046 (none: 262112; res: 32; super:
 0; other:  0)
 mmap:  2 pass took:   7.356670 (none: 261648; res:496; super:
 0; other:  0)
 mmap:  3 pass took:   7.307094 (none: 260521; res:   1623; super:
 0; other:  0)
 mmap:  4 pass took:   7.350239 (none: 258904; res:   3240; super:
 0; other:  0)
 mmap:  5 pass took:   7.392480 (none: 257286; res:   4858; super:
 0; other:  0)
 mmap:  6 pass took:   7.292069 (none: 255584; res:   6560; super:
 0; other:  0)
 mmap:  7 pass took:   7.048980 (none: 251142; res:  11002; super:
 0; other:  0)
 mmap:  8 pass took:   6.899387 (none: 247584; res:  14560; super:
 0; other:  0)
 mmap:  9 pass took:   7.190579 (none: 242992; res:  19152; super:
 0; other:  0)
 mmap: 10 pass took:   6.915482 (none: 239308; res:  22836; super:
 0; other:  0)
 mmap: 11 pass took:   6.565909 (none: 232835; res:  29309; super:
 0; other:  0)
 mmap: 12 pass took:   6.423945 (none: 226160; res:  35984; super:
 0; other:  0)
 mmap: 13 pass took:   6.315385 (none: 208555; res:  53589; super:
 0; other:  0)
 mmap: 14 pass took:   6.760780 (none: 192805; res:  69339; super:
 0; other:  0)
 mmap: 15 pass took:   5.721513 (none: 174497; res:  87647; super:
 0; other:  0)
 mmap: 16 pass took:   5.004424 (none: 155938; res: 106206; super:
 0; other:  0)
 mmap: 17 pass took:   4.224926 (none: 135639; res: 126505; super:
 0; other:  0)
 mmap: 18 pass took:   3.749608 (none: 117952; res: 144192; super:
 0; other:  0)
 mmap: 19 pass took:   3.398084 (none:  99066; res: 163078; super:
 0; other:  0)
 mmap: 20 pass took:   3.029557 (none:  74994; res: 187150; super:
 0; other:  0)
 mmap: 21 pass took:   2.379430 (none:  55231; res: 206913; super:
 0; other:  0)
 mmap: 22 pass took:   2.046521 (none:  40786; res: 221358; super:
 0; other:  0)
 mmap: 23 pass took:   1.152797 (none:  30311; res: 231833; super:
 0; other:  0)
 mmap: 24 pass took:   0.972617 (none:  16196; res: 245948; super:
 0; other:  0)
 mmap: 25 pass took:   0.577515 (none:   8286; res: 253858; super:
 0; other:  0)
 mmap: 26 pass took:   0.380738 (none:   3712; res: 258432; super:
 0; other:  0)
 mmap: 27 pass took:   0.253583 (none:   1193; res: 260951; super:
 0; other:  0)
 mmap: 28 pass took:   0.157508 (none:  0; res: 262144; super:
 0; other:  0)
 mmap: 29 pass took:   0.156169 (none:  0; res: 262144; super:
 0; other:  0)
 mmap: 30 pass took:   0.156550 (none:  0; res: 262144; super:
 0; other:  0)
 
 If I run this:
 $ cat /mnt/random-1024 > /dev/null
 before the test, then the result is the following:
 
 $ ./mmap /mnt/random-1024 5
 mmap:  1 pass took:   0.337657 (none:  0; res: 262144; super:
 0; other:  0)
 mmap:  2 pass took:   0.186137 (none:  0; res: 262144; super:
 0; other:  0)
 mmap:  3 pass took:   0.186132 (none:  0; res: 262144; super:
 0; other:  0)
 mmap:  4 pass took:   0.186535 (none:  0; res: 262144; super:
 0; other:  0)
 mmap:  5 pass took:   0.190353 (none:  0; res: 262144; super:
 0; other:  0)
 
 This is what I expect.  But why doesn't this work without reading the
 file manually?
 Issue seems to be in some change of the behaviour of the reserv or
 phys allocator. I Cc:ed Alan.
 I'm pretty sure that the behavior here hasn't significantly changed in
 about twelve years.  Otherwise, I agree with your analysis.
 
 On more than one occasion, I've been tempted to change:
 
  pmap_remove_all(mt);
  if (mt->dirty != 0)
  vm_page_deactivate(mt);
  else
  vm_page_cache(mt);
 
 to:
 
  vm_page_dontneed(mt);
 
 because I suspect that the current code does more harm than good.  In
 theory, it saves activations of the page daemon.  However, more often
 than not, I suspect that we are spending more on page reactivations than
 we are saving on page 

Re: problems with mmap() and disk caching

2012-04-06 Thread Alan Cox

On 04/04/2012 02:17, Konstantin Belousov wrote:

On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:

Hi,

I open the file, then call mmap() on the whole file and get a pointer,
then I work with this pointer.  I expect that each page should be touched
only once to get it into memory (disk cache?), but this doesn't work!

I wrote the test (attached) and ran it for the 1G file generated from
/dev/random, the result is the following:

Prepare file:
# swapoff -a
# newfs /dev/ada0b
# mount /dev/ada0b /mnt
# dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024

Purge cache:
# umount /mnt
# mount /dev/ada0b /mnt

Run test:
$ ./mmap /mnt/random-1024 30
mmap:  1 pass took:   7.431046 (none: 262112; res: 32; super:
0; other:  0)
mmap:  2 pass took:   7.356670 (none: 261648; res:496; super:
0; other:  0)
mmap:  3 pass took:   7.307094 (none: 260521; res:   1623; super:
0; other:  0)
mmap:  4 pass took:   7.350239 (none: 258904; res:   3240; super:
0; other:  0)
mmap:  5 pass took:   7.392480 (none: 257286; res:   4858; super:
0; other:  0)
mmap:  6 pass took:   7.292069 (none: 255584; res:   6560; super:
0; other:  0)
mmap:  7 pass took:   7.048980 (none: 251142; res:  11002; super:
0; other:  0)
mmap:  8 pass took:   6.899387 (none: 247584; res:  14560; super:
0; other:  0)
mmap:  9 pass took:   7.190579 (none: 242992; res:  19152; super:
0; other:  0)
mmap: 10 pass took:   6.915482 (none: 239308; res:  22836; super:
0; other:  0)
mmap: 11 pass took:   6.565909 (none: 232835; res:  29309; super:
0; other:  0)
mmap: 12 pass took:   6.423945 (none: 226160; res:  35984; super:
0; other:  0)
mmap: 13 pass took:   6.315385 (none: 208555; res:  53589; super:
0; other:  0)
mmap: 14 pass took:   6.760780 (none: 192805; res:  69339; super:
0; other:  0)
mmap: 15 pass took:   5.721513 (none: 174497; res:  87647; super:
0; other:  0)
mmap: 16 pass took:   5.004424 (none: 155938; res: 106206; super:
0; other:  0)
mmap: 17 pass took:   4.224926 (none: 135639; res: 126505; super:
0; other:  0)
mmap: 18 pass took:   3.749608 (none: 117952; res: 144192; super:
0; other:  0)
mmap: 19 pass took:   3.398084 (none:  99066; res: 163078; super:
0; other:  0)
mmap: 20 pass took:   3.029557 (none:  74994; res: 187150; super:
0; other:  0)
mmap: 21 pass took:   2.379430 (none:  55231; res: 206913; super:
0; other:  0)
mmap: 22 pass took:   2.046521 (none:  40786; res: 221358; super:
0; other:  0)
mmap: 23 pass took:   1.152797 (none:  30311; res: 231833; super:
0; other:  0)
mmap: 24 pass took:   0.972617 (none:  16196; res: 245948; super:
0; other:  0)
mmap: 25 pass took:   0.577515 (none:   8286; res: 253858; super:
0; other:  0)
mmap: 26 pass took:   0.380738 (none:   3712; res: 258432; super:
0; other:  0)
mmap: 27 pass took:   0.253583 (none:   1193; res: 260951; super:
0; other:  0)
mmap: 28 pass took:   0.157508 (none:  0; res: 262144; super:
0; other:  0)
mmap: 29 pass took:   0.156169 (none:  0; res: 262144; super:
0; other:  0)
mmap: 30 pass took:   0.156550 (none:  0; res: 262144; super:
0; other:  0)

If I run this:
$ cat /mnt/random-1024 > /dev/null
before the test, then the result is the following:

$ ./mmap /mnt/random-1024 5
mmap:  1 pass took:   0.337657 (none:  0; res: 262144; super:
0; other:  0)
mmap:  2 pass took:   0.186137 (none:  0; res: 262144; super:
0; other:  0)
mmap:  3 pass took:   0.186132 (none:  0; res: 262144; super:
0; other:  0)
mmap:  4 pass took:   0.186535 (none:  0; res: 262144; super:
0; other:  0)
mmap:  5 pass took:   0.190353 (none:  0; res: 262144; super:
0; other:  0)

This is what I expect.  But why doesn't this work without reading the file
manually?

Issue seems to be in some change of the behaviour of the reserv or
phys allocator. I Cc:ed Alan.

What happens is that the fault handler deactivates or caches the pages
previous to the one which would satisfy the fault. See the if()
statement starting at line 463 of vm/vm_fault.c. Since all pages
of the object in your test are clean, the pages are cached.

The next fault would need to allocate some more pages for a different index
of the same object. What I see is that vm_reserv_alloc_page() returns a
page that is from the cache for the same object, but with a different pindex.
As an obvious result, the page is invalidated and repurposed. When the next
loop starts, the page is not resident anymore, so it has to be re-read
from disk.


I'm pretty sure that the pages aren't being repurposed this quickly.  
Instead, I believe that the explanation is to be found in mincore().  
mincore() is only reporting pages that are in the object's memq as 
resident.  It is not reporting cache pages as resident.
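
For reference, here is a minimal sketch of the kind of mincore(2) bookkeeping
behind counters like the none/res/super columns above.  It is an illustration,
not the attached test program, and it assumes FreeBSD's MINCORE_INCORE and
MINCORE_SUPER flags; note that it counts pages flagged MINCORE_SUPER, which is
not necessarily the same unit as the "super" column in the output.  The point
is that pages sitting only in the cache queue do not get MINCORE_INCORE set,
so they land in the "none" bucket even though their contents are still in
memory:

/*
 * Hypothetical sketch, not the attached test program: classify the pages of
 * an existing mapping with mincore(2).
 */
#include <sys/mman.h>
#include <err.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void
report(void *p, size_t size)
{
        size_t i, npages, none, incore, super;
        char *vec;

        npages = (size + getpagesize() - 1) / getpagesize();
        if ((vec = malloc(npages)) == NULL)
                err(1, "malloc()");
        if (mincore(p, size, vec) == -1)
                err(1, "mincore()");
        none = incore = super = 0;
        for (i = 0; i < npages; i++) {
                if ((vec[i] & MINCORE_INCORE) == 0)
                        none++;         /* not resident as far as mincore() knows */
                else {
                        incore++;       /* resident on the object's memq */
                        if (vec[i] & MINCORE_SUPER)
                                super++;        /* mapped by a superpage */
                }
        }
        printf("none: %zu; res: %zu; super (pages): %zu\n", none, incore, super);
        free(vec);
}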



The behaviour of the allocator is not consistent, so some pages are not
reused, allowing the test to converge and to collect all pages of the
object eventually.

Calling madvise(MADV_RANDOM) fixes 

Re: problems with mmap() and disk caching

2012-04-06 Thread Alan Cox

On 04/06/2012 03:38, Konstantin Belousov wrote:

On Thu, Apr 05, 2012 at 01:25:49PM -0500, Alan Cox wrote:

On 04/05/2012 12:31, Konstantin Belousov wrote:

On Thu, Apr 05, 2012 at 10:54:31AM -0500, Alan Cox wrote:

On 04/04/2012 02:17, Konstantin Belousov wrote:

On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:

Hi,

I open the file, then call mmap() on the whole file and get a pointer,
then I work with this pointer.  I expect that a page should be touched only
once to get it into memory (the disk cache?), but this doesn't work!

I wrote the test (attached) and ran it for the 1G file generated from
/dev/random, the result is the following:

Prepare file:
# swapoff -a
# newfs /dev/ada0b
# mount /dev/ada0b /mnt
# dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024

Purge cache:
# umount /mnt
# mount /dev/ada0b /mnt

Run test:
$ ./mmap /mnt/random-1024 30
mmap:  1 pass took:   7.431046 (none: 262112; res: 32; super:
0; other:  0)
mmap:  2 pass took:   7.356670 (none: 261648; res:496; super:
0; other:  0)
mmap:  3 pass took:   7.307094 (none: 260521; res:   1623; super:
0; other:  0)
mmap:  4 pass took:   7.350239 (none: 258904; res:   3240; super:
0; other:  0)
mmap:  5 pass took:   7.392480 (none: 257286; res:   4858; super:
0; other:  0)
mmap:  6 pass took:   7.292069 (none: 255584; res:   6560; super:
0; other:  0)
mmap:  7 pass took:   7.048980 (none: 251142; res:  11002; super:
0; other:  0)
mmap:  8 pass took:   6.899387 (none: 247584; res:  14560; super:
0; other:  0)
mmap:  9 pass took:   7.190579 (none: 242992; res:  19152; super:
0; other:  0)
mmap: 10 pass took:   6.915482 (none: 239308; res:  22836; super:
0; other:  0)
mmap: 11 pass took:   6.565909 (none: 232835; res:  29309; super:
0; other:  0)
mmap: 12 pass took:   6.423945 (none: 226160; res:  35984; super:
0; other:  0)
mmap: 13 pass took:   6.315385 (none: 208555; res:  53589; super:
0; other:  0)
mmap: 14 pass took:   6.760780 (none: 192805; res:  69339; super:
0; other:  0)
mmap: 15 pass took:   5.721513 (none: 174497; res:  87647; super:
0; other:  0)
mmap: 16 pass took:   5.004424 (none: 155938; res: 106206; super:
0; other:  0)
mmap: 17 pass took:   4.224926 (none: 135639; res: 126505; super:
0; other:  0)
mmap: 18 pass took:   3.749608 (none: 117952; res: 144192; super:
0; other:  0)
mmap: 19 pass took:   3.398084 (none:  99066; res: 163078; super:
0; other:  0)
mmap: 20 pass took:   3.029557 (none:  74994; res: 187150; super:
0; other:  0)
mmap: 21 pass took:   2.379430 (none:  55231; res: 206913; super:
0; other:  0)
mmap: 22 pass took:   2.046521 (none:  40786; res: 221358; super:
0; other:  0)
mmap: 23 pass took:   1.152797 (none:  30311; res: 231833; super:
0; other:  0)
mmap: 24 pass took:   0.972617 (none:  16196; res: 245948; super:
0; other:  0)
mmap: 25 pass took:   0.577515 (none:   8286; res: 253858; super:
0; other:  0)
mmap: 26 pass took:   0.380738 (none:   3712; res: 258432; super:
0; other:  0)
mmap: 27 pass took:   0.253583 (none:   1193; res: 260951; super:
0; other:  0)
mmap: 28 pass took:   0.157508 (none:  0; res: 262144; super:
0; other:  0)
mmap: 29 pass took:   0.156169 (none:  0; res: 262144; super:
0; other:  0)
mmap: 30 pass took:   0.156550 (none:  0; res: 262144; super:
0; other:  0)

If I run this:
$ cat /mnt/random-1024 > /dev/null
before the test, then the result is the following:

$ ./mmap /mnt/random-1024 5
mmap:  1 pass took:   0.337657 (none:  0; res: 262144; super:
0; other:  0)
mmap:  2 pass took:   0.186137 (none:  0; res: 262144; super:
0; other:  0)
mmap:  3 pass took:   0.186132 (none:  0; res: 262144; super:
0; other:  0)
mmap:  4 pass took:   0.186535 (none:  0; res: 262144; super:
0; other:  0)
mmap:  5 pass took:   0.190353 (none:  0; res: 262144; super:
0; other:  0)

This is what I expect.  But why doesn't this work without reading the file
manually?

Issue seems to be in some change of the behaviour of the reserv or
phys allocator. I Cc:ed Alan.

I'm pretty sure that the behavior here hasn't significantly changed in
about twelve years.  Otherwise, I agree with your analysis.

On more than one occasion, I've been tempted to change:

 pmap_remove_all(mt);
 if (mt->dirty != 0)
 vm_page_deactivate(mt);
 else
 vm_page_cache(mt);

to:

 vm_page_dontneed(mt);

because I suspect that the current code does more harm than good.  In
theory, it saves activations of the page daemon.  However, more often
than not, I suspect that we are spending more on page reactivations than
we are saving on page daemon activations.  The sequential access
detection heuristic is just 

Re: problems with mmap() and disk caching

2012-04-05 Thread Alan Cox

On 04/04/2012 02:17, Konstantin Belousov wrote:

On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:

Hi,

I open the file, then call mmap() on the whole file and get a pointer,
then I work with this pointer.  I expect that a page should be touched only
once to get it into memory (the disk cache?), but this doesn't work!

I wrote the test (attached) and ran it for the 1G file generated from
/dev/random, the result is the following:

Prepare file:
# swapoff -a
# newfs /dev/ada0b
# mount /dev/ada0b /mnt
# dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024

Purge cache:
# umount /mnt
# mount /dev/ada0b /mnt

Run test:
$ ./mmap /mnt/random-1024 30
mmap:  1 pass took:   7.431046 (none: 262112; res: 32; super:
0; other:  0)
mmap:  2 pass took:   7.356670 (none: 261648; res:496; super:
0; other:  0)
mmap:  3 pass took:   7.307094 (none: 260521; res:   1623; super:
0; other:  0)
mmap:  4 pass took:   7.350239 (none: 258904; res:   3240; super:
0; other:  0)
mmap:  5 pass took:   7.392480 (none: 257286; res:   4858; super:
0; other:  0)
mmap:  6 pass took:   7.292069 (none: 255584; res:   6560; super:
0; other:  0)
mmap:  7 pass took:   7.048980 (none: 251142; res:  11002; super:
0; other:  0)
mmap:  8 pass took:   6.899387 (none: 247584; res:  14560; super:
0; other:  0)
mmap:  9 pass took:   7.190579 (none: 242992; res:  19152; super:
0; other:  0)
mmap: 10 pass took:   6.915482 (none: 239308; res:  22836; super:
0; other:  0)
mmap: 11 pass took:   6.565909 (none: 232835; res:  29309; super:
0; other:  0)
mmap: 12 pass took:   6.423945 (none: 226160; res:  35984; super:
0; other:  0)
mmap: 13 pass took:   6.315385 (none: 208555; res:  53589; super:
0; other:  0)
mmap: 14 pass took:   6.760780 (none: 192805; res:  69339; super:
0; other:  0)
mmap: 15 pass took:   5.721513 (none: 174497; res:  87647; super:
0; other:  0)
mmap: 16 pass took:   5.004424 (none: 155938; res: 106206; super:
0; other:  0)
mmap: 17 pass took:   4.224926 (none: 135639; res: 126505; super:
0; other:  0)
mmap: 18 pass took:   3.749608 (none: 117952; res: 144192; super:
0; other:  0)
mmap: 19 pass took:   3.398084 (none:  99066; res: 163078; super:
0; other:  0)
mmap: 20 pass took:   3.029557 (none:  74994; res: 187150; super:
0; other:  0)
mmap: 21 pass took:   2.379430 (none:  55231; res: 206913; super:
0; other:  0)
mmap: 22 pass took:   2.046521 (none:  40786; res: 221358; super:
0; other:  0)
mmap: 23 pass took:   1.152797 (none:  30311; res: 231833; super:
0; other:  0)
mmap: 24 pass took:   0.972617 (none:  16196; res: 245948; super:
0; other:  0)
mmap: 25 pass took:   0.577515 (none:   8286; res: 253858; super:
0; other:  0)
mmap: 26 pass took:   0.380738 (none:   3712; res: 258432; super:
0; other:  0)
mmap: 27 pass took:   0.253583 (none:   1193; res: 260951; super:
0; other:  0)
mmap: 28 pass took:   0.157508 (none:  0; res: 262144; super:
0; other:  0)
mmap: 29 pass took:   0.156169 (none:  0; res: 262144; super:
0; other:  0)
mmap: 30 pass took:   0.156550 (none:  0; res: 262144; super:
0; other:  0)

If I run this:
$ cat /mnt/random-1024 > /dev/null
before the test, then the result is the following:

$ ./mmap /mnt/random-1024 5
mmap:  1 pass took:   0.337657 (none:  0; res: 262144; super:
0; other:  0)
mmap:  2 pass took:   0.186137 (none:  0; res: 262144; super:
0; other:  0)
mmap:  3 pass took:   0.186132 (none:  0; res: 262144; super:
0; other:  0)
mmap:  4 pass took:   0.186535 (none:  0; res: 262144; super:
0; other:  0)
mmap:  5 pass took:   0.190353 (none:  0; res: 262144; super:
0; other:  0)

This is what I expect.  But why doesn't this work without reading the file
manually?

Issue seems to be in some change of the behaviour of the reserv or
phys allocator. I Cc:ed Alan.


I'm pretty sure that the behavior here hasn't significantly changed in 
about twelve years.  Otherwise, I agree with your analysis.


On more than one occasion, I've been tempted to change:

pmap_remove_all(mt);
if (mt->dirty != 0)
vm_page_deactivate(mt);
else
vm_page_cache(mt);

to:

vm_page_dontneed(mt);

because I suspect that the current code does more harm than good.  In 
theory, it saves activations of the page daemon.  However, more often 
than not, I suspect that we are spending more on page reactivations than 
we are saving on page daemon activations.  The sequential access 
detection heuristic is just too easily triggered.  For example, I've 
seen it triggered by demand paging of the gcc text segment.  Also, I 
think that pmap_remove_all() and especially vm_page_cache() are too 
severe for a detection heuristic 

Re: problems with mmap() and disk caching

2012-04-05 Thread Alan Cox

On 04/04/2012 04:36, Andrey Zonov wrote:

On 04.04.2012 11:17, Konstantin Belousov wrote:


Calling madvise(MADV_RANDOM) fixes the issue, because the code to
deactivate/cache the pages is turned off. On the other hand, it also
turns off read-ahead for faulting, and the first loop becomes eternally
long.


Now it takes 5 times longer.  Anyway, thanks for explanation.



Doing MADV_WILLNEED does not fix the problem indeed, since willneed
reactivates the pages of the object at the time of call. To use
MADV_WILLNEED, you would need to call it between faults/memcpy.
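
As a sketch of what calling it between faults/memcpy could look like in a copy
loop like the test's (an illustration only; p, tmp and block here just mirror
the test program's variable names):

/*
 * Hypothetical sketch: walk the mapping block by block and ask for the
 * next block with MADV_WILLNEED while the current one is being copied,
 * so the advice is issued between the faults rather than once up front.
 */
#include <sys/mman.h>
#include <string.h>

static void
copy_with_willneed(char *p, char *tmp, size_t size, size_t block)
{
        size_t off, chunk;

        for (off = 0; off < size; off += block) {
                if (off + block < size)
                        (void)madvise(p + off + block, block, MADV_WILLNEED);
                chunk = (size - off < block) ? size - off : block;
                memcpy(tmp, p + off, chunk);
        }
}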



I played with it, but no luck so far.



I've also never seen super pages; how do I make them work?

They just work, at least for me. Look at the output of procstat -v
after enough loops finished to not cause disk activity.



The problem was in my test program.  I fixed it, now I see super pages 
but I'm still not satisfied.  There are several tests below:


1. With madvise(MADV_RANDOM) I see almost all super pages:
$ ./mmap /mnt/random-1024 5
mmap:  1 pass took:  26.438535 (none:  0; res: 262144; super: 511; 
other:  0)
mmap:  2 pass took:   0.187311 (none:  0; res: 262144; super: 511; 
other:  0)
mmap:  3 pass took:   0.184953 (none:  0; res: 262144; super: 511; 
other:  0)
mmap:  4 pass took:   0.186007 (none:  0; res: 262144; super: 511; 
other:  0)
mmap:  5 pass took:   0.185790 (none:  0; res: 262144; super: 511; 
other:  0)


Should it be 512?



Check the starting virtual address.  It is probably not aligned on a 
superpage boundary.  Hence, a few pages at the start and end of your 
mapped region are not in a superpage.
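
A quick way to check that is to look at the mapping's offsets relative to the
superpage size (a sketch; the 2MB size is an assumption about amd64 here):

/*
 * Hypothetical sketch: report how a mapping sits relative to superpage
 * boundaries.  A start or end that is not aligned explains the few 4K
 * pages at the edges of the mapping.
 */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SUPERPAGE       (2UL * 1024 * 1024)

static void
check_alignment(const void *p, size_t size)
{
        uintptr_t start = (uintptr_t)p;
        uintptr_t end = start + size;
        uintptr_t first = (start + SUPERPAGE - 1) / SUPERPAGE;  /* first fully covered */
        uintptr_t last = end / SUPERPAGE;                       /* one past the last */

        printf("offset into first superpage: %#lx\n",
            (unsigned long)(start % SUPERPAGE));
        printf("fully covered superpages: %lu\n",
            (unsigned long)(last > first ? last - first : 0));
}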



2. Without madvise(MADV_RANDOM):
$ ./mmap /mnt/random-1024 50
mmap:  1 pass took:   7.629745 (none: 262112; res: 32; super: 0; 
other:  0)
mmap:  2 pass took:   7.301720 (none: 261202; res:942; super: 0; 
other:  0)
mmap:  3 pass took:   7.261416 (none: 260226; res:   1918; super: 1; 
other:  0)

[skip]
mmap: 49 pass took:   0.155368 (none:  0; res: 262144; super: 323; 
other:  0)
mmap: 50 pass took:   0.155438 (none:  0; res: 262144; super: 323; 
other:  0)


Only 323 pages.

3. If I just re-run the test, I don't see super pages with any block size.


$ ./mmap /mnt/random-1024 5 $((1<<30))
mmap:  1 pass took:   1.013939 (none:  0; res: 262144; super: 0; 
other:  0)
mmap:  2 pass took:   0.267082 (none:  0; res: 262144; super: 0; 
other:  0)
mmap:  3 pass took:   0.270711 (none:  0; res: 262144; super: 0; 
other:  0)
mmap:  4 pass took:   0.268940 (none:  0; res: 262144; super: 0; 
other:  0)
mmap:  5 pass took:   0.269634 (none:  0; res: 262144; super: 0; 
other:  0)


4. If I activate madvise(MADV_WILLNEED) in the copy loop and re-run the 
test, then I see super pages only if I use a block greater than 2MB.


$ ./mmap /mnt/random-1024 1 $((1<<21))
mmap:  1 pass took:   0.299722 (none:  0; res: 262144; super: 0; 
other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<22))
mmap:  1 pass took:   0.271828 (none:  0; res: 262144; super: 170; 
other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<23))
mmap:  1 pass took:   0.333188 (none:  0; res: 262144; super: 258; 
other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<24))
mmap:  1 pass took:   0.339250 (none:  0; res: 262144; super: 303; 
other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<25))
mmap:  1 pass took:   0.418812 (none:  0; res: 262144; super: 324; 
other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<26))
mmap:  1 pass took:   0.360892 (none:  0; res: 262144; super: 335; 
other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<27))
mmap:  1 pass took:   0.401122 (none:  0; res: 262144; super: 342; 
other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<28))
mmap:  1 pass took:   0.478764 (none:  0; res: 262144; super: 345; 
other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<29))
mmap:  1 pass took:   0.607266 (none:  0; res: 262144; super: 346; 
other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<30))
mmap:  1 pass took:   0.901269 (none:  0; res: 262144; super: 347; 
other:  0)


5. If I activate madvise(MADV_WILLNEED) immediately after mmap() then 
I see some number of super pages (the number from test #2).


$ ./mmap /mnt/random-1024 5
mmap:  1 pass took:   0.178666 (none:  0; res: 262144; super: 323; 
other:  0)
mmap:  2 pass took:   0.158889 (none:  0; res: 262144; super: 323; 
other:  0)
mmap:  3 pass took:   0.157229 (none:  0; res: 262144; super: 323; 
other:  0)
mmap:  4 pass took:   0.156895 (none:  0; res: 262144; super: 323; 
other:  0)
mmap:  5 pass took:   0.162938 (none:  0; res: 262144; super: 323; 
other:  0)


6. If I read the file manually before the test, then I don't see super pages 
with any block size, and madvise(MADV_WILLNEED) doesn't help.


$ ./mmap /mnt/random-1024 5 $((1<<30))
mmap:  1 pass took:   0.996767 (none:  0; res: 262144; super: 0; 
other:  0)
mmap:  2 pass took:   0.311129 (none:  

Re: problems with mmap() and disk caching

2012-04-05 Thread Konstantin Belousov
On Thu, Apr 05, 2012 at 10:54:31AM -0500, Alan Cox wrote:
 On 04/04/2012 02:17, Konstantin Belousov wrote:
 On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:
 Hi,
 
 I open the file, then call mmap() on the whole file and get a pointer,
 then I work with this pointer.  I expect that a page should be touched only
 once to get it into memory (the disk cache?), but this doesn't work!
 
 I wrote the test (attached) and ran it for the 1G file generated from
 /dev/random, the result is the following:
 
 Prepare file:
 # swapoff -a
 # newfs /dev/ada0b
 # mount /dev/ada0b /mnt
 # dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024
 
 Purge cache:
 # umount /mnt
 # mount /dev/ada0b /mnt
 
 Run test:
 $ ./mmap /mnt/random-1024 30
 mmap:  1 pass took:   7.431046 (none: 262112; res: 32; super:
 0; other:  0)
 mmap:  2 pass took:   7.356670 (none: 261648; res:496; super:
 0; other:  0)
 mmap:  3 pass took:   7.307094 (none: 260521; res:   1623; super:
 0; other:  0)
 mmap:  4 pass took:   7.350239 (none: 258904; res:   3240; super:
 0; other:  0)
 mmap:  5 pass took:   7.392480 (none: 257286; res:   4858; super:
 0; other:  0)
 mmap:  6 pass took:   7.292069 (none: 255584; res:   6560; super:
 0; other:  0)
 mmap:  7 pass took:   7.048980 (none: 251142; res:  11002; super:
 0; other:  0)
 mmap:  8 pass took:   6.899387 (none: 247584; res:  14560; super:
 0; other:  0)
 mmap:  9 pass took:   7.190579 (none: 242992; res:  19152; super:
 0; other:  0)
 mmap: 10 pass took:   6.915482 (none: 239308; res:  22836; super:
 0; other:  0)
 mmap: 11 pass took:   6.565909 (none: 232835; res:  29309; super:
 0; other:  0)
 mmap: 12 pass took:   6.423945 (none: 226160; res:  35984; super:
 0; other:  0)
 mmap: 13 pass took:   6.315385 (none: 208555; res:  53589; super:
 0; other:  0)
 mmap: 14 pass took:   6.760780 (none: 192805; res:  69339; super:
 0; other:  0)
 mmap: 15 pass took:   5.721513 (none: 174497; res:  87647; super:
 0; other:  0)
 mmap: 16 pass took:   5.004424 (none: 155938; res: 106206; super:
 0; other:  0)
 mmap: 17 pass took:   4.224926 (none: 135639; res: 126505; super:
 0; other:  0)
 mmap: 18 pass took:   3.749608 (none: 117952; res: 144192; super:
 0; other:  0)
 mmap: 19 pass took:   3.398084 (none:  99066; res: 163078; super:
 0; other:  0)
 mmap: 20 pass took:   3.029557 (none:  74994; res: 187150; super:
 0; other:  0)
 mmap: 21 pass took:   2.379430 (none:  55231; res: 206913; super:
 0; other:  0)
 mmap: 22 pass took:   2.046521 (none:  40786; res: 221358; super:
 0; other:  0)
 mmap: 23 pass took:   1.152797 (none:  30311; res: 231833; super:
 0; other:  0)
 mmap: 24 pass took:   0.972617 (none:  16196; res: 245948; super:
 0; other:  0)
 mmap: 25 pass took:   0.577515 (none:   8286; res: 253858; super:
 0; other:  0)
 mmap: 26 pass took:   0.380738 (none:   3712; res: 258432; super:
 0; other:  0)
 mmap: 27 pass took:   0.253583 (none:   1193; res: 260951; super:
 0; other:  0)
 mmap: 28 pass took:   0.157508 (none:  0; res: 262144; super:
 0; other:  0)
 mmap: 29 pass took:   0.156169 (none:  0; res: 262144; super:
 0; other:  0)
 mmap: 30 pass took:   0.156550 (none:  0; res: 262144; super:
 0; other:  0)
 
 If I run this:
 $ cat /mnt/random-1024 > /dev/null
 before the test, then the result is the following:
 
 $ ./mmap /mnt/random-1024 5
 mmap:  1 pass took:   0.337657 (none:  0; res: 262144; super:
 0; other:  0)
 mmap:  2 pass took:   0.186137 (none:  0; res: 262144; super:
 0; other:  0)
 mmap:  3 pass took:   0.186132 (none:  0; res: 262144; super:
 0; other:  0)
 mmap:  4 pass took:   0.186535 (none:  0; res: 262144; super:
 0; other:  0)
 mmap:  5 pass took:   0.190353 (none:  0; res: 262144; super:
 0; other:  0)
 
 This is what I expect.  But why doesn't this work without reading the file
 manually?
 Issue seems to be in some change of the behaviour of the reserv or
 phys allocator. I Cc:ed Alan.
 
 I'm pretty sure that the behavior here hasn't significantly changed in 
 about twelve years.  Otherwise, I agree with your analysis.
 
 On more than one occasion, I've been tempted to change:
 
 pmap_remove_all(mt);
 if (mt->dirty != 0)
 vm_page_deactivate(mt);
 else
 vm_page_cache(mt);
 
 to:
 
 vm_page_dontneed(mt);
 
 because I suspect that the current code does more harm than good.  In 
 theory, it saves activations of the page daemon.  However, more often 
 than not, I suspect that we are spending more on page reactivations than 
 we are saving on page daemon activations.  The sequential access 
 detection heuristic is just too easily triggered.  For example, I've 

Re: problems with mmap() and disk caching

2012-04-05 Thread Alan Cox

On 04/05/2012 12:31, Konstantin Belousov wrote:

On Thu, Apr 05, 2012 at 10:54:31AM -0500, Alan Cox wrote:

On 04/04/2012 02:17, Konstantin Belousov wrote:

On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:

Hi,

I open the file, then call mmap() on the whole file and get a pointer,
then I work with this pointer.  I expect that a page should be touched only
once to get it into memory (the disk cache?), but this doesn't work!

I wrote the test (attached) and ran it for the 1G file generated from
/dev/random, the result is the following:

Prepare file:
# swapoff -a
# newfs /dev/ada0b
# mount /dev/ada0b /mnt
# dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024

Purge cache:
# umount /mnt
# mount /dev/ada0b /mnt

Run test:
$ ./mmap /mnt/random-1024 30
mmap:  1 pass took:   7.431046 (none: 262112; res: 32; super:
0; other:  0)
mmap:  2 pass took:   7.356670 (none: 261648; res:496; super:
0; other:  0)
mmap:  3 pass took:   7.307094 (none: 260521; res:   1623; super:
0; other:  0)
mmap:  4 pass took:   7.350239 (none: 258904; res:   3240; super:
0; other:  0)
mmap:  5 pass took:   7.392480 (none: 257286; res:   4858; super:
0; other:  0)
mmap:  6 pass took:   7.292069 (none: 255584; res:   6560; super:
0; other:  0)
mmap:  7 pass took:   7.048980 (none: 251142; res:  11002; super:
0; other:  0)
mmap:  8 pass took:   6.899387 (none: 247584; res:  14560; super:
0; other:  0)
mmap:  9 pass took:   7.190579 (none: 242992; res:  19152; super:
0; other:  0)
mmap: 10 pass took:   6.915482 (none: 239308; res:  22836; super:
0; other:  0)
mmap: 11 pass took:   6.565909 (none: 232835; res:  29309; super:
0; other:  0)
mmap: 12 pass took:   6.423945 (none: 226160; res:  35984; super:
0; other:  0)
mmap: 13 pass took:   6.315385 (none: 208555; res:  53589; super:
0; other:  0)
mmap: 14 pass took:   6.760780 (none: 192805; res:  69339; super:
0; other:  0)
mmap: 15 pass took:   5.721513 (none: 174497; res:  87647; super:
0; other:  0)
mmap: 16 pass took:   5.004424 (none: 155938; res: 106206; super:
0; other:  0)
mmap: 17 pass took:   4.224926 (none: 135639; res: 126505; super:
0; other:  0)
mmap: 18 pass took:   3.749608 (none: 117952; res: 144192; super:
0; other:  0)
mmap: 19 pass took:   3.398084 (none:  99066; res: 163078; super:
0; other:  0)
mmap: 20 pass took:   3.029557 (none:  74994; res: 187150; super:
0; other:  0)
mmap: 21 pass took:   2.379430 (none:  55231; res: 206913; super:
0; other:  0)
mmap: 22 pass took:   2.046521 (none:  40786; res: 221358; super:
0; other:  0)
mmap: 23 pass took:   1.152797 (none:  30311; res: 231833; super:
0; other:  0)
mmap: 24 pass took:   0.972617 (none:  16196; res: 245948; super:
0; other:  0)
mmap: 25 pass took:   0.577515 (none:   8286; res: 253858; super:
0; other:  0)
mmap: 26 pass took:   0.380738 (none:   3712; res: 258432; super:
0; other:  0)
mmap: 27 pass took:   0.253583 (none:   1193; res: 260951; super:
0; other:  0)
mmap: 28 pass took:   0.157508 (none:  0; res: 262144; super:
0; other:  0)
mmap: 29 pass took:   0.156169 (none:  0; res: 262144; super:
0; other:  0)
mmap: 30 pass took:   0.156550 (none:  0; res: 262144; super:
0; other:  0)

If I run this:
$ cat /mnt/random-1024 > /dev/null
before the test, then the result is the following:

$ ./mmap /mnt/random-1024 5
mmap:  1 pass took:   0.337657 (none:  0; res: 262144; super:
0; other:  0)
mmap:  2 pass took:   0.186137 (none:  0; res: 262144; super:
0; other:  0)
mmap:  3 pass took:   0.186132 (none:  0; res: 262144; super:
0; other:  0)
mmap:  4 pass took:   0.186535 (none:  0; res: 262144; super:
0; other:  0)
mmap:  5 pass took:   0.190353 (none:  0; res: 262144; super:
0; other:  0)

This is what I expect.  But why doesn't this work without reading the file
manually?

Issue seems to be in some change of the behaviour of the reserv or
phys allocator. I Cc:ed Alan.

I'm pretty sure that the behavior here hasn't significantly changed in
about twelve years.  Otherwise, I agree with your analysis.

On more than one occasion, I've been tempted to change:

 pmap_remove_all(mt);
 if (mt->dirty != 0)
 vm_page_deactivate(mt);
 else
 vm_page_cache(mt);

to:

 vm_page_dontneed(mt);

because I suspect that the current code does more harm than good.  In
theory, it saves activations of the page daemon.  However, more often
than not, I suspect that we are spending more on page reactivations than
we are saving on page daemon activations.  The sequential access
detection heuristic is just too easily triggered.  For example, I've
seen it triggered by demand paging of the gcc text segment.  Also, I

Re: problems with mmap() and disk caching

2012-04-05 Thread Andrey Zonov

On 05.04.2012 19:54, Alan Cox wrote:

On 04/04/2012 02:17, Konstantin Belousov wrote:

On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:

[snip]

This is what I expect. But why doesn't this work without reading the file
manually?

Issue seems to be in some change of the behaviour of the reserv or
phys allocator. I Cc:ed Alan.


I'm pretty sure that the behavior here hasn't significantly changed in
about twelve years. Otherwise, I agree with your analysis.

On more than one occasion, I've been tempted to change:

pmap_remove_all(mt);
if (mt->dirty != 0)
vm_page_deactivate(mt);
else
vm_page_cache(mt);

to:

vm_page_dontneed(mt);



Thanks Alan!  Now it works as I expect!

But I have more questions to you and kib@.  They are in my test below.

So, prepare file as earlier, and take information about memory usage 
from top(1).  After preparation, but before test:

Mem: 80M Active, 55M Inact, 721M Wired, 215M Buf, 46G Free

First run:
$ ./mmap /mnt/random
mmap:  1 pass took:   7.462865 (none:  0; res: 262144; super: 
0; other:  0)


No super pages after first run, why?..

Mem: 79M Active, 1079M Inact, 722M Wired, 216M Buf, 45G Free

Now the file is in inactive memory, that's good.

Second run:
$ ./mmap /mnt/random
mmap:  1 pass took:   0.004191 (none:  0; res: 262144; super: 
511; other:  0)


All super pages are here, nice.

Mem: 1103M Active, 55M Inact, 722M Wired, 216M Buf, 45G Free

Wow, all inactive pages moved to active and sit there even after process 
was terminated, that's not good, what do you think?


Read the file:
$ cat /mnt/random > /dev/null

Mem: 79M Active, 55M Inact, 1746M Wired, 1240M Buf, 45G Free

Now the file is in wired memory.  I do not understand why so.

Could you please give me an explanation of active/inactive/wired memory?



because I suspect that the current code does more harm than good. In
theory, it saves activations of the page daemon. However, more often
than not, I suspect that we are spending more on page reactivations than
we are saving on page daemon activations. The sequential access
detection heuristic is just too easily triggered. For example, I've seen
it triggered by demand paging of the gcc text segment. Also, I think
that pmap_remove_all() and especially vm_page_cache() are too severe for
a detection heuristic that is so easily triggered.


[snip]

--
Andrey Zonov


Re: problems with mmap() and disk caching

2012-04-05 Thread Konstantin Belousov
On Thu, Apr 05, 2012 at 11:33:46PM +0400, Andrey Zonov wrote:
 On 05.04.2012 19:54, Alan Cox wrote:
 On 04/04/2012 02:17, Konstantin Belousov wrote:
 On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:
 [snip]
 This is what I expect. But why doesn't this work without reading the file
 manually?
 Issue seems to be in some change of the behaviour of the reserv or
 phys allocator. I Cc:ed Alan.
 
 I'm pretty sure that the behavior here hasn't significantly changed in
 about twelve years. Otherwise, I agree with your analysis.
 
 On more than one occasion, I've been tempted to change:
 
 pmap_remove_all(mt);
 if (mt->dirty != 0)
 vm_page_deactivate(mt);
 else
 vm_page_cache(mt);
 
 to:
 
 vm_page_dontneed(mt);
 
 
 Thanks Alan!  Now it works as I expect!
 
 But I have more questions to you and kib@.  They are in my test below.
 
 So, prepare file as earlier, and take information about memory usage 
 from top(1).  After preparation, but before test:
 Mem: 80M Active, 55M Inact, 721M Wired, 215M Buf, 46G Free
 
 First run:
 $ ./mmap /mnt/random
 mmap:  1 pass took:   7.462865 (none:  0; res: 262144; super: 
 0; other:  0)
 
 No super pages after first run, why?..
 
 Mem: 79M Active, 1079M Inact, 722M Wired, 216M Buf, 45G Free
 
 Now the file is in inactive memory, that's good.
 
 Second run:
 $ ./mmap /mnt/random
 mmap:  1 pass took:   0.004191 (none:  0; res: 262144; super: 
 511; other:  0)
 
 All super pages are here, nice.
 
 Mem: 1103M Active, 55M Inact, 722M Wired, 216M Buf, 45G Free
 
 Wow, all inactive pages moved to active and sit there even after process 
 was terminated, that's not good, what do you think?
Why do you think this is 'not good' ? You have plenty of free memory,
there is no memory pressure, and all pages were referenced recently.
There is no reason for them to be deactivated.

 
 Read the file:
 $ cat /mnt/random > /dev/null
 
 Mem: 79M Active, 55M Inact, 1746M Wired, 1240M Buf, 45G Free
 
 Now the file is in wired memory.  I do not understand why so.
You do use UFS, right?  There are enough buffer headers and buffer KVA
to have buffers allocated for the whole file content. Since buffers wire
corresponding pages, you get pages migrated to wired.

When buffer pressure appears (i.e., any other i/o is started),
the buffers will be repurposed and the pages moved to inactive.
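
For what it's worth, here is a sketch of one way to see how much memory the
buffer cache is holding at a given moment; the exact vfs.bufspace and
vfs.maxbufspace sysctls, and their long type, are assumptions about this
kernel rather than something stated in this thread:

/*
 * Hypothetical sketch: read the current and maximum buffer-cache usage via
 * sysctl.  The set of vfs.*bufspace knobs and their types are assumed here.
 */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <err.h>
#include <stdio.h>

int
main(void)
{
        long bufspace, maxbufspace;
        size_t len;

        len = sizeof(bufspace);
        if (sysctlbyname("vfs.bufspace", &bufspace, &len, NULL, 0) == -1)
                err(1, "sysctl vfs.bufspace");
        len = sizeof(maxbufspace);
        if (sysctlbyname("vfs.maxbufspace", &maxbufspace, &len, NULL, 0) == -1)
                err(1, "sysctl vfs.maxbufspace");
        printf("buffer cache: %ld of %ld bytes\n", bufspace, maxbufspace);
        return (0);
}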

 
 Could you please give me an explanation of active/inactive/wired memory?
 
 
 because I suspect that the current code does more harm than good. In
 theory, it saves activations of the page daemon. However, more often
 than not, I suspect that we are spending more on page reactivations than
 we are saving on page daemon activations. The sequential access
 detection heuristic is just too easily triggered. For example, I've seen
 it triggered by demand paging of the gcc text segment. Also, I think
 that pmap_remove_all() and especially vm_page_cache() are too severe for
 a detection heuristic that is so easily triggered.
 
 [snip]
 
 -- 
 Andrey Zonov




Re: problems with mmap() and disk caching

2012-04-05 Thread Andrey Zonov

On 05.04.2012 23:41, Konstantin Belousov wrote:

On Thu, Apr 05, 2012 at 11:33:46PM +0400, Andrey Zonov wrote:

On 05.04.2012 19:54, Alan Cox wrote:

On 04/04/2012 02:17, Konstantin Belousov wrote:

On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:

[snip]

This is what I expect. But why doesn't this work without reading the file
manually?

Issue seems to be in some change of the behaviour of the reserv or
phys allocator. I Cc:ed Alan.


I'm pretty sure that the behavior here hasn't significantly changed in
about twelve years. Otherwise, I agree with your analysis.

On more than one occasion, I've been tempted to change:

pmap_remove_all(mt);
if (mt->dirty != 0)
vm_page_deactivate(mt);
else
vm_page_cache(mt);

to:

vm_page_dontneed(mt);



Thanks Alan!  Now it works as I expect!

But I have more questions to you and kib@.  They are in my test below.

So, prepare file as earlier, and take information about memory usage
from top(1).  After preparation, but before test:
Mem: 80M Active, 55M Inact, 721M Wired, 215M Buf, 46G Free

First run:
$ ./mmap /mnt/random
mmap:  1 pass took:   7.462865 (none:  0; res: 262144; super:
0; other:  0)

No super pages after first run, why?..

Mem: 79M Active, 1079M Inact, 722M Wired, 216M Buf, 45G Free

Now the file is in inactive memory, that's good.

Second run:
$ ./mmap /mnt/random
mmap:  1 pass took:   0.004191 (none:  0; res: 262144; super:
511; other:  0)

All super pages are here, nice.

Mem: 1103M Active, 55M Inact, 722M Wired, 216M Buf, 45G Free

Wow, all inactive pages moved to active and sit there even after process
was terminated, that's not good, what do you think?

Why do you think this is 'not good' ? You have plenty of free memory,
there is no memory pressure, and all pages were referenced recently.
There is no reason for them to be deactivated.



I always thought that active memory is the sum of the resident memory of 
all processes, inactive shows the disk cache, and wired shows the kernel itself.




Read the file:
$ cat /mnt/random > /dev/null

Mem: 79M Active, 55M Inact, 1746M Wired, 1240M Buf, 45G Free

Now the file is in wired memory.  I do not understand why so.

You do use UFS, right ?


Yes.


There are enough buffer headers and buffer KVA
to have buffers allocated for the whole file content. Since buffers wire
corresponding pages, you get pages migrated to wired.

When buffer pressure appears (i.e., any other i/o is started),
the buffers will be repurposed and the pages moved to inactive.



OK, how can I get the amount of disk cache?



Could you please give me an explanation of active/inactive/wired memory?



because I suspect that the current code does more harm than good. In
theory, it saves activations of the page daemon. However, more often
than not, I suspect that we are spending more on page reactivations than
we are saving on page daemon activations. The sequential access
detection heuristic is just too easily triggered. For example, I've seen
it triggered by demand paging of the gcc text segment. Also, I think
that pmap_remove_all() and especially vm_page_cache() are too severe for
a detection heuristic that is so easily triggered.


[snip]

--
Andrey Zonov


--
Andrey Zonov


Re: problems with mmap() and disk caching

2012-04-04 Thread Konstantin Belousov
On Tue, Apr 03, 2012 at 11:02:53PM +0400, Andrey Zonov wrote:
 Hi,
 
 I open the file, then call mmap() on the whole file and get a pointer, 
 then I work with this pointer.  I expect that a page should be touched only 
 once to get it into memory (the disk cache?), but this doesn't work!
 
 I wrote the test (attached) and ran it for the 1G file generated from 
 /dev/random, the result is the following:
 
 Prepare file:
 # swapoff -a
 # newfs /dev/ada0b
 # mount /dev/ada0b /mnt
 # dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024
 
 Purge cache:
 # umount /mnt
 # mount /dev/ada0b /mnt
 
 Run test:
 $ ./mmap /mnt/random-1024 30
 mmap:  1 pass took:   7.431046 (none: 262112; res: 32; super: 
 0; other:  0)
 mmap:  2 pass took:   7.356670 (none: 261648; res:496; super: 
 0; other:  0)
 mmap:  3 pass took:   7.307094 (none: 260521; res:   1623; super: 
 0; other:  0)
 mmap:  4 pass took:   7.350239 (none: 258904; res:   3240; super: 
 0; other:  0)
 mmap:  5 pass took:   7.392480 (none: 257286; res:   4858; super: 
 0; other:  0)
 mmap:  6 pass took:   7.292069 (none: 255584; res:   6560; super: 
 0; other:  0)
 mmap:  7 pass took:   7.048980 (none: 251142; res:  11002; super: 
 0; other:  0)
 mmap:  8 pass took:   6.899387 (none: 247584; res:  14560; super: 
 0; other:  0)
 mmap:  9 pass took:   7.190579 (none: 242992; res:  19152; super: 
 0; other:  0)
 mmap: 10 pass took:   6.915482 (none: 239308; res:  22836; super: 
 0; other:  0)
 mmap: 11 pass took:   6.565909 (none: 232835; res:  29309; super: 
 0; other:  0)
 mmap: 12 pass took:   6.423945 (none: 226160; res:  35984; super: 
 0; other:  0)
 mmap: 13 pass took:   6.315385 (none: 208555; res:  53589; super: 
 0; other:  0)
 mmap: 14 pass took:   6.760780 (none: 192805; res:  69339; super: 
 0; other:  0)
 mmap: 15 pass took:   5.721513 (none: 174497; res:  87647; super: 
 0; other:  0)
 mmap: 16 pass took:   5.004424 (none: 155938; res: 106206; super: 
 0; other:  0)
 mmap: 17 pass took:   4.224926 (none: 135639; res: 126505; super: 
 0; other:  0)
 mmap: 18 pass took:   3.749608 (none: 117952; res: 144192; super: 
 0; other:  0)
 mmap: 19 pass took:   3.398084 (none:  99066; res: 163078; super: 
 0; other:  0)
 mmap: 20 pass took:   3.029557 (none:  74994; res: 187150; super: 
 0; other:  0)
 mmap: 21 pass took:   2.379430 (none:  55231; res: 206913; super: 
 0; other:  0)
 mmap: 22 pass took:   2.046521 (none:  40786; res: 221358; super: 
 0; other:  0)
 mmap: 23 pass took:   1.152797 (none:  30311; res: 231833; super: 
 0; other:  0)
 mmap: 24 pass took:   0.972617 (none:  16196; res: 245948; super: 
 0; other:  0)
 mmap: 25 pass took:   0.577515 (none:   8286; res: 253858; super: 
 0; other:  0)
 mmap: 26 pass took:   0.380738 (none:   3712; res: 258432; super: 
 0; other:  0)
 mmap: 27 pass took:   0.253583 (none:   1193; res: 260951; super: 
 0; other:  0)
 mmap: 28 pass took:   0.157508 (none:  0; res: 262144; super: 
 0; other:  0)
 mmap: 29 pass took:   0.156169 (none:  0; res: 262144; super: 
 0; other:  0)
 mmap: 30 pass took:   0.156550 (none:  0; res: 262144; super: 
 0; other:  0)
 
 If I run this:
 $ cat /mnt/random-1024 > /dev/null
 before the test, then the result is the following:
 
 $ ./mmap /mnt/random-1024 5
 mmap:  1 pass took:   0.337657 (none:  0; res: 262144; super: 
 0; other:  0)
 mmap:  2 pass took:   0.186137 (none:  0; res: 262144; super: 
 0; other:  0)
 mmap:  3 pass took:   0.186132 (none:  0; res: 262144; super: 
 0; other:  0)
 mmap:  4 pass took:   0.186535 (none:  0; res: 262144; super: 
 0; other:  0)
 mmap:  5 pass took:   0.190353 (none:  0; res: 262144; super: 
 0; other:  0)
 
 This is what I expect.  But why doesn't this work without reading the file 
 manually?
Issue seems to be in some change of the behaviour of the reserv or
phys allocator. I Cc:ed Alan.

What happens is that the fault handler deactivates or caches the pages
previous to the one which would satisfy the fault. See the if()
statement starting at line 463 of vm/vm_fault.c. Since all pages
of the object in your test are clean, the pages are cached.

The next fault would need to allocate some more pages for a different index
of the same object. What I see is that vm_reserv_alloc_page() returns a
page that is from the cache for the same object, but with a different pindex.
As an obvious result, the page is invalidated and repurposed. When the next
loop starts, the page is not resident anymore, so it has to be re-read
from disk.

The behaviour of the allocator is not consistent, so some pages are not
reused, allowing the test to converge and to collect all pages of the
object eventually.

Calling madvise(MADV_RANDOM) fixes the issue, because the code to
deactivate/cache the pages is turned off. On the other hand, it also
turns off read-ahead for faulting, and the first loop becomes eternally
long.

Doing 

Re: problems with mmap() and disk caching

2012-04-04 Thread Andrey Zonov

On 04.04.2012 11:17, Konstantin Belousov wrote:


Calling madvise(MADV_RANDOM) fixes the issue, because the code to
deactivate/cache the pages is turned off. On the other hand, it also
turns off read-ahead for faulting, and the first loop becomes eternally
long.


Now it takes 5 times longer.  Anyway, thanks for explanation.



Doing MADV_WILLNEED does not fix the problem indeed, since willneed
reactivates the pages of the object at the time of call. To use
MADV_WILLNEED, you would need to call it between faults/memcpy.



I played with it, but no luck so far.



I've also never seen super pages; how do I make them work?

They just work, at least for me. Look at the output of procstat -v
after enough loops finished to not cause disk activity.



The problem was in my test program.  I fixed it, now I see super pages 
but I'm still not satisfied.  There are several tests below:


1. With madvise(MADV_RANDOM) I see almost all super pages:
$ ./mmap /mnt/random-1024 5
mmap:  1 pass took:  26.438535 (none:  0; res: 262144; super: 
511; other:  0)
mmap:  2 pass took:   0.187311 (none:  0; res: 262144; super: 
511; other:  0)
mmap:  3 pass took:   0.184953 (none:  0; res: 262144; super: 
511; other:  0)
mmap:  4 pass took:   0.186007 (none:  0; res: 262144; super: 
511; other:  0)
mmap:  5 pass took:   0.185790 (none:  0; res: 262144; super: 
511; other:  0)


Should it be 512?

2. Without madvise(MADV_RANDOM):
$ ./mmap /mnt/random-1024 50
mmap:  1 pass took:   7.629745 (none: 262112; res: 32; super: 
0; other:  0)
mmap:  2 pass took:   7.301720 (none: 261202; res:942; super: 
0; other:  0)
mmap:  3 pass took:   7.261416 (none: 260226; res:   1918; super: 
1; other:  0)

[skip]
mmap: 49 pass took:   0.155368 (none:  0; res: 262144; super: 
323; other:  0)
mmap: 50 pass took:   0.155438 (none:  0; res: 262144; super: 
323; other:  0)


Only 323 pages.

3. If I just re-run the test, I don't see super pages with any block size.

$ ./mmap /mnt/random-1024 5 $((1<<30))
mmap:  1 pass took:   1.013939 (none:  0; res: 262144; super: 
0; other:  0)
mmap:  2 pass took:   0.267082 (none:  0; res: 262144; super: 
0; other:  0)
mmap:  3 pass took:   0.270711 (none:  0; res: 262144; super: 
0; other:  0)
mmap:  4 pass took:   0.268940 (none:  0; res: 262144; super: 
0; other:  0)
mmap:  5 pass took:   0.269634 (none:  0; res: 262144; super: 
0; other:  0)


4. If I activate madvise(MADV_WILLNEED) in the copy loop and re-run the test, 
then I see super pages only if I use a block greater than 2MB.


$ ./mmap /mnt/random-1024 1 $((1<<21))
mmap:  1 pass took:   0.299722 (none:  0; res: 262144; super: 
0; other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<22))
mmap:  1 pass took:   0.271828 (none:  0; res: 262144; super: 
170; other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<23))
mmap:  1 pass took:   0.333188 (none:  0; res: 262144; super: 
258; other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<24))
mmap:  1 pass took:   0.339250 (none:  0; res: 262144; super: 
303; other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<25))
mmap:  1 pass took:   0.418812 (none:  0; res: 262144; super: 
324; other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<26))
mmap:  1 pass took:   0.360892 (none:  0; res: 262144; super: 
335; other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<27))
mmap:  1 pass took:   0.401122 (none:  0; res: 262144; super: 
342; other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<28))
mmap:  1 pass took:   0.478764 (none:  0; res: 262144; super: 
345; other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<29))
mmap:  1 pass took:   0.607266 (none:  0; res: 262144; super: 
346; other:  0)

$ ./mmap /mnt/random-1024 1 $((1<<30))
mmap:  1 pass took:   0.901269 (none:  0; res: 262144; super: 
347; other:  0)


5. If I activate madvise(MADV_WILLNEED) immediately after mmap() then I 
see some number of super pages (the number from test #2).


$ ./mmap /mnt/random-1024 5
mmap:  1 pass took:   0.178666 (none:  0; res: 262144; super: 
323; other:  0)
mmap:  2 pass took:   0.158889 (none:  0; res: 262144; super: 
323; other:  0)
mmap:  3 pass took:   0.157229 (none:  0; res: 262144; super: 
323; other:  0)
mmap:  4 pass took:   0.156895 (none:  0; res: 262144; super: 
323; other:  0)
mmap:  5 pass took:   0.162938 (none:  0; res: 262144; super: 
323; other:  0)


6. If I read the file manually before the test, then I don't see super pages 
with any block size, and madvise(MADV_WILLNEED) doesn't help.


$ ./mmap /mnt/random-1024 5 $((1<<30))
mmap:  1 pass took:   0.996767 (none:  0; res: 262144; super: 
0; other:  0)
mmap:  2 pass took:   0.311129 (none:  0; res: 262144; super: 
0; other:  0)
mmap:  3 pass took:   0.317430 (none:  0; res: 262144; super: 
0; other:  0)
mmap:  4 pass took:   0.314437 (none:  0; res: 262144; super: 
0; other:  0)
mmap:  5 pass 

Re: problems with mmap() and disk caching

2012-04-04 Thread Andrey Zonov

I forgot to attach my test program.

On 04.04.2012 13:36, Andrey Zonov wrote:

On 04.04.2012 11:17, Konstantin Belousov wrote:


Calling madvise(MADV_RANDOM) fixes the issue, because the code to
deactivate/cache the pages is turned off. On the other hand, it also
turns off read-ahead for faulting, and the first loop becomes eternally
long.


Now it takes 5 times longer. Anyway, thanks for explanation.



Doing MADV_WILLNEED does not fix the problem indeed, since willneed
reactivates the pages of the object at the time of call. To use
MADV_WILLNEED, you would need to call it between faults/memcpy.



I played with it, but no luck so far.



I've also never seen super pages; how do I make them work?

They just work, at least for me. Look at the output of procstat -v
after enough loops finished to not cause disk activity.



The problem was in my test program. I fixed it, now I see super pages
but I'm still not satisfied. There are several tests below:

1. With madvise(MADV_RANDOM) I see almost all super pages:
$ ./mmap /mnt/random-1024 5
mmap: 1 pass took: 26.438535 (none: 0; res: 262144; super: 511; other: 0)
mmap: 2 pass took: 0.187311 (none: 0; res: 262144; super: 511; other: 0)
mmap: 3 pass took: 0.184953 (none: 0; res: 262144; super: 511; other: 0)
mmap: 4 pass took: 0.186007 (none: 0; res: 262144; super: 511; other: 0)
mmap: 5 pass took: 0.185790 (none: 0; res: 262144; super: 511; other: 0)

Should it be 512?

2. Without madvise(MADV_RANDOM):
$ ./mmap /mnt/random-1024 50
mmap: 1 pass took: 7.629745 (none: 262112; res: 32; super: 0; other: 0)
mmap: 2 pass took: 7.301720 (none: 261202; res: 942; super: 0; other: 0)
mmap: 3 pass took: 7.261416 (none: 260226; res: 1918; super: 1; other: 0)
[skip]
mmap: 49 pass took: 0.155368 (none: 0; res: 262144; super: 323; other: 0)
mmap: 50 pass took: 0.155438 (none: 0; res: 262144; super: 323; other: 0)

Only 323 pages.

3. If I just re-run the test, I don't see super pages with any block size.

$ ./mmap /mnt/random-1024 5 $((1<<30))
mmap: 1 pass took: 1.013939 (none: 0; res: 262144; super: 0; other: 0)
mmap: 2 pass took: 0.267082 (none: 0; res: 262144; super: 0; other: 0)
mmap: 3 pass took: 0.270711 (none: 0; res: 262144; super: 0; other: 0)
mmap: 4 pass took: 0.268940 (none: 0; res: 262144; super: 0; other: 0)
mmap: 5 pass took: 0.269634 (none: 0; res: 262144; super: 0; other: 0)

4. If I activate madvise(MADV_WILLNEED) in the copy loop and re-run the test
then I see super pages only if I use a block greater than 2MB.

$ ./mmap /mnt/random-1024 1 $((1<<21))
mmap: 1 pass took: 0.299722 (none: 0; res: 262144; super: 0; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<22))
mmap: 1 pass took: 0.271828 (none: 0; res: 262144; super: 170; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<23))
mmap: 1 pass took: 0.333188 (none: 0; res: 262144; super: 258; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<24))
mmap: 1 pass took: 0.339250 (none: 0; res: 262144; super: 303; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<25))
mmap: 1 pass took: 0.418812 (none: 0; res: 262144; super: 324; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<26))
mmap: 1 pass took: 0.360892 (none: 0; res: 262144; super: 335; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<27))
mmap: 1 pass took: 0.401122 (none: 0; res: 262144; super: 342; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<28))
mmap: 1 pass took: 0.478764 (none: 0; res: 262144; super: 345; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<29))
mmap: 1 pass took: 0.607266 (none: 0; res: 262144; super: 346; other: 0)
$ ./mmap /mnt/random-1024 1 $((1<<30))
mmap: 1 pass took: 0.901269 (none: 0; res: 262144; super: 347; other: 0)

5. If I activate madvise(MADV_WILLNEED) immediately after mmap() then I
see some number of super pages (the number from test #2).

$ ./mmap /mnt/random-1024 5
mmap: 1 pass took: 0.178666 (none: 0; res: 262144; super: 323; other: 0)
mmap: 2 pass took: 0.158889 (none: 0; res: 262144; super: 323; other: 0)
mmap: 3 pass took: 0.157229 (none: 0; res: 262144; super: 323; other: 0)
mmap: 4 pass took: 0.156895 (none: 0; res: 262144; super: 323; other: 0)
mmap: 5 pass took: 0.162938 (none: 0; res: 262144; super: 323; other: 0)

6. If I read the file manually before the test, then I don't see super pages
with any block size, and madvise(MADV_WILLNEED) doesn't help.

$ ./mmap /mnt/random-1024 5 $((1<<30))
mmap: 1 pass took: 0.996767 (none: 0; res: 262144; super: 0; other: 0)
mmap: 2 pass took: 0.311129 (none: 0; res: 262144; super: 0; other: 0)
mmap: 3 pass took: 0.317430 (none: 0; res: 262144; super: 0; other: 0)
mmap: 4 pass took: 0.314437 (none: 0; res: 262144; super: 0; other: 0)
mmap: 5 pass took: 0.310757 (none: 0; res: 262144; super: 0; other: 0)




--
Andrey Zonov
/*_
 * Andrey Zonov (c) 2011
 */

#include <sys/mman.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/stat.h>
#include <err.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
        int i;
        int fd;
        int num;
        int block;
        int pagesize;

problems with mmap() and disk caching

2012-04-03 Thread Andrey Zonov

Hi,

I open the file, then call mmap() on the whole file and get a pointer, 
then I work with this pointer.  I expect that a page should be touched only 
once to get it into memory (the disk cache?), but this doesn't work!


I wrote the test (attached) and ran it for the 1G file generated from 
/dev/random, the result is the following:


Prepare file:
# swapoff -a
# newfs /dev/ada0b
# mount /dev/ada0b /mnt
# dd if=/dev/random of=/mnt/random-1024 bs=1m count=1024

Purge cache:
# umount /mnt
# mount /dev/ada0b /mnt

Run test:
$ ./mmap /mnt/random-1024 30
mmap:  1 pass took:   7.431046 (none: 262112; res: 32; super: 
0; other:  0)
mmap:  2 pass took:   7.356670 (none: 261648; res:496; super: 
0; other:  0)
mmap:  3 pass took:   7.307094 (none: 260521; res:   1623; super: 
0; other:  0)
mmap:  4 pass took:   7.350239 (none: 258904; res:   3240; super: 
0; other:  0)
mmap:  5 pass took:   7.392480 (none: 257286; res:   4858; super: 
0; other:  0)
mmap:  6 pass took:   7.292069 (none: 255584; res:   6560; super: 
0; other:  0)
mmap:  7 pass took:   7.048980 (none: 251142; res:  11002; super: 
0; other:  0)
mmap:  8 pass took:   6.899387 (none: 247584; res:  14560; super: 
0; other:  0)
mmap:  9 pass took:   7.190579 (none: 242992; res:  19152; super: 
0; other:  0)
mmap: 10 pass took:   6.915482 (none: 239308; res:  22836; super: 
0; other:  0)
mmap: 11 pass took:   6.565909 (none: 232835; res:  29309; super: 
0; other:  0)
mmap: 12 pass took:   6.423945 (none: 226160; res:  35984; super: 
0; other:  0)
mmap: 13 pass took:   6.315385 (none: 208555; res:  53589; super: 
0; other:  0)
mmap: 14 pass took:   6.760780 (none: 192805; res:  69339; super: 
0; other:  0)
mmap: 15 pass took:   5.721513 (none: 174497; res:  87647; super: 
0; other:  0)
mmap: 16 pass took:   5.004424 (none: 155938; res: 106206; super: 
0; other:  0)
mmap: 17 pass took:   4.224926 (none: 135639; res: 126505; super: 
0; other:  0)
mmap: 18 pass took:   3.749608 (none: 117952; res: 144192; super: 
0; other:  0)
mmap: 19 pass took:   3.398084 (none:  99066; res: 163078; super: 
0; other:  0)
mmap: 20 pass took:   3.029557 (none:  74994; res: 187150; super: 
0; other:  0)
mmap: 21 pass took:   2.379430 (none:  55231; res: 206913; super: 
0; other:  0)
mmap: 22 pass took:   2.046521 (none:  40786; res: 221358; super: 
0; other:  0)
mmap: 23 pass took:   1.152797 (none:  30311; res: 231833; super: 
0; other:  0)
mmap: 24 pass took:   0.972617 (none:  16196; res: 245948; super: 
0; other:  0)
mmap: 25 pass took:   0.577515 (none:   8286; res: 253858; super: 
0; other:  0)
mmap: 26 pass took:   0.380738 (none:   3712; res: 258432; super: 
0; other:  0)
mmap: 27 pass took:   0.253583 (none:   1193; res: 260951; super: 
0; other:  0)
mmap: 28 pass took:   0.157508 (none:  0; res: 262144; super: 
0; other:  0)
mmap: 29 pass took:   0.156169 (none:  0; res: 262144; super: 
0; other:  0)
mmap: 30 pass took:   0.156550 (none:  0; res: 262144; super: 
0; other:  0)


If I run this:
$ cat /mnt/random-1024 > /dev/null
before the test, then the result is the following:

$ ./mmap /mnt/random-1024 5
mmap:  1 pass took:   0.337657 (none:  0; res: 262144; super: 
0; other:  0)
mmap:  2 pass took:   0.186137 (none:  0; res: 262144; super: 
0; other:  0)
mmap:  3 pass took:   0.186132 (none:  0; res: 262144; super: 
0; other:  0)
mmap:  4 pass took:   0.186535 (none:  0; res: 262144; super: 
0; other:  0)
mmap:  5 pass took:   0.190353 (none:  0; res: 262144; super: 
0; other:  0)


This is what I expect.  But why doesn't this work without reading the file 
manually?


I've also never seen super pages; how do I make them work?

I've been playing with madvise and posix_fadvise but no luck.  BTW, 
posix_fadvise(POSIX_FADV_WILLNEED) does nothing as the commentary says, 
shouldn't this be documented in the manual page?


All tests were run under 9.0-STABLE (r233744).

--
Andrey Zonov
/*_
 * Andrey Zonov (c) 2011
 */

#include <sys/mman.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/stat.h>
#include <err.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
        int i;
        int fd;
        int num;
        int block;
        int pagesize;
        size_t n;
        size_t size;
        size_t none, incore, super, other;
        char *p;
        char *tmp;
        char *vec;
        char *vecp;
        struct stat sb;
        struct timeval tp, tp1, tp2;

        if (argc < 2 || argc > 4)
                errx(1, "usage: mmap filename [num] [block]");

        fd = open(argv[1], O_RDONLY);
        if (fd == -1)
                err(1, "open()");

        num = 1;
        if (argc >= 3)
                num = atoi(argv[2]);

        pagesize = getpagesize();
        block = pagesize;
        if (argc == 4)
                block = atoi(argv[3]);

if
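
/*
 * [The attachment is truncated here by the archive.  What follows is a
 * hypothetical sketch of how such a measurement loop might continue,
 * reconstructed only from the output format seen in the thread; it is not
 * the original code.  It additionally assumes <stdio.h> for printf(), and
 * it counts "super" per 4K page, which may not match the original
 * program's accounting.]
 */
        if (fstat(fd, &sb) == -1)
                err(1, "fstat()");
        size = sb.st_size;

        p = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED)
                err(1, "mmap()");

        tmp = malloc(block);
        vec = malloc((size + pagesize - 1) / pagesize);
        if (tmp == NULL || vec == NULL)
                err(1, "malloc()");

        for (i = 0; i < num; i++) {
                gettimeofday(&tp1, NULL);
                for (n = 0; n < size; n += block)
                        memcpy(tmp, p + n, (size - n < (size_t)block) ?
                            size - n : (size_t)block);
                gettimeofday(&tp2, NULL);
                timersub(&tp2, &tp1, &tp);

                none = incore = super = other = 0;
                if (mincore(p, size, vec) == -1)
                        err(1, "mincore()");
                for (vecp = vec; vecp < vec + (size + pagesize - 1) / pagesize;
                    vecp++) {
                        if ((*vecp & MINCORE_INCORE) == 0)
                                none++;
                        else if (*vecp & MINCORE_SUPER)
                                super++;
                        else
                                incore++;
                }
                printf("mmap: %2d pass took: %ld.%06ld "
                    "(none: %zu; res: %zu; super: %zu; other: %zu)\n",
                    i + 1, (long)tp.tv_sec, (long)tp.tv_usec,
                    none, incore + super, super, other);
        }

        munmap(p, size);
        close(fd);
        return (0);
}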