Re: [zfs-discuss] C'mon ARC, stay small...
Jason J. W. Williams writes:

Hi Guys, Rather than starting a new thread I thought I'd continue this one. I've been running Build 54 on a Thumper since mid-January and wanted to ask a question about the zfs_arc_max setting. We set it to 0x100000000 # 4GB, however it's creeping over that until our kernel memory usage is nearly 7GB (::memstat inserted below). This is a database server, so I was curious whether the DNLC would have this effect over time, as it does quite quickly when dealing with small files? Would it be worth upgrading to Build 59?

Another possibility is that there is a portion of memory in the kmem caches, ready to be reclaimed and returned to the OS free space. Such reclaims currently occur only on memory shortage. I think we should do it under some more conditions... This might fall under:

CR Number: 6416757
Synopsis: zfs should return memory eventually
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6416757

If you induce some temporary memory pressure, it would be nice to see if your kernel shrinks down to ~4GB.

-r

Thank you in advance!

Best Regards,
Jason

Page Summary       Pages      MB   %Tot
Kernel           1750044    6836    42%
Anon             1211203    4731    29%
Exec and libs       7648      29     0%
Page cache        220434     861     5%
Free (cachelist)  318625    1244     8%
Free (freelist)   659607    2576    16%
Total            4167561   16279
Physical         4078747   15932

On 3/23/07, Roch - PAE [EMAIL PROTECTED] wrote: With latest Nevada, setting zfs_arc_max in /etc/system is sufficient. Playing with mdb on a live system is more tricky and is what caused the problem here. -r [...]

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
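Roch's note that "setting zfs_arc_max in /etc/system is sufficient" on recent Nevada builds also answers Darren's question elsewhere in this thread about avoiding the per-boot mdb dance. As a sketch (assuming the Nevada-era tunable name is honored on your build; the 512MB value is just the example used in this thread, not a recommendation):

```
* /etc/system: cap the ZFS ARC at 512MB (0x20000000 bytes) at boot
set zfs:zfs_arc_max = 0x20000000
```

A reboot is required for /etc/system changes to take effect.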
Re: [zfs-discuss] C'mon ARC, stay small...
So you're not really sure it's the ARC growing, but only that the kernel is growing to 6.8GB. Print the arc values via mdb:

# mdb -k
Loading modules: [ unix krtld genunix specfs dtrace uppc scsi_vhci ufs ip hook neti sctp arp usba nca lofs zfs random sppp crypto ptm ipc ]
arc::print -t size c p c_max
uint64_t size = 0x2a8000
uint64_t c = 0x1cdfe800
uint64_t p = 0xe707400
uint64_t c_max = 0x1cdfe800

Is size = c_max? Assuming it is, you need to look through the kmastats and see where the kernel memory is being used (again, inside mdb):

::kmastat

The above generates a LOT of output that's not completely painless to parse, but it's not too bad either. If you think it's DNLC related, you can monitor the number of entries with:

# kstat -p unix:0:dnlcstats:dir_entries_cached_current
unix:0:dnlcstats:dir_entries_cached_current 9374
#

You can also monitor kernel memory for the dnlc (just using grep with the kmastat in mdb):

::kmastat ! grep dnlc
dnlc_space_cache 16104254 4096 104 0

The 5th column starting from the left is mem in use, in this example 4096. I'm not sure if the dnlc_space_cache represents all of the kernel memory used for the dnlc. It might, but I need to look at the code to be sure... Let's start with this...

/jim

Jason J. W. Williams wrote: Hi Guys, Rather than starting a new thread I thought I'd continue this one. I've been running Build 54 on a Thumper since mid-January and wanted to ask a question about the zfs_arc_max setting. [...]
Re: [zfs-discuss] C'mon ARC, stay small...
Hi Guys, Rather than starting a new thread I thought I'd continue this one. I've been running Build 54 on a Thumper since mid-January and wanted to ask a question about the zfs_arc_max setting. We set it to 0x100000000 # 4GB, however it's creeping over that until our kernel memory usage is nearly 7GB (::memstat inserted below). This is a database server, so I was curious whether the DNLC would have this effect over time, as it does quite quickly when dealing with small files? Would it be worth upgrading to Build 59?

Thank you in advance!

Best Regards,
Jason

Page Summary       Pages      MB   %Tot
Kernel           1750044    6836    42%
Anon             1211203    4731    29%
Exec and libs       7648      29     0%
Page cache        220434     861     5%
Free (cachelist)  318625    1244     8%
Free (freelist)   659607    2576    16%
Total            4167561   16279
Physical         4078747   15932

On 3/23/07, Roch - PAE [EMAIL PROTECTED] wrote: With latest Nevada, setting zfs_arc_max in /etc/system is sufficient. Playing with mdb on a live system is more tricky and is what caused the problem here. -r [...]
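The MB column in the ::memstat output above follows directly from the page counts; assuming 4KB pages on this x64 box (an assumption, but standard for a Thumper), the numbers can be sanity-checked with a little shell arithmetic:

```shell
# Convert ::memstat page counts to MB, assuming 4KB pages (x64).
kernel_pages=1750044
total_pages=4167561
echo "Kernel: $((kernel_pages * 4 / 1024)) MB"   # matches the 6836 MB in the table
echo "Total:  $((total_pages * 4 / 1024)) MB"    # matches the 16279 MB in the table
```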
Re: [zfs-discuss] C'mon ARC, stay small...
Jim Mauro wrote: All righty...I set c_max to 512MB, c to 512MB, and p to 256MB... [...] After a few runs of the workload ...

arc::print -d size
size = 0t536788992

Ah - looks like we're out of the woods. The ARC remains clamped at 512MB.

Is there a way to set these fields using /etc/system? Or does this require a new or modified init script to run and do the above with each boot?

Darren
Re: [zfs-discuss] C'mon ARC, stay small...
FYI - After a few more runs, ARC size hit 10GB, which is now 10X c_max:

arc::print -tad
{
. . .
c02e29e8 uint64_t size = 0t10527883264
c02e29f0 uint64_t p = 0t16381819904
c02e29f8 uint64_t c = 0t1070318720
c02e2a00 uint64_t c_min = 0t1070318720
c02e2a08 uint64_t c_max = 0t1070318720
. . .

Perhaps c_max does not do what I think it does?

Thanks,
/jim

Jim Mauro wrote: Running an mmap-intensive workload on ZFS on an X4500, Solaris 10 11/06 (update 3). All file IO is mmap(file), read memory segment, unmap, close. Tweaked the arc size down via mdb to 1GB. I used that value because c_min was also 1GB, and I was not sure if c_max could be smaller than c_min. Anyway, I set c_max to 1GB. After a workload run:

arc::print -tad
{
. . .
c02e29e8 uint64_t size = 0t3099832832
c02e29f0 uint64_t p = 0t16540761088
c02e29f8 uint64_t c = 0t1070318720
c02e2a00 uint64_t c_min = 0t1070318720
c02e2a08 uint64_t c_max = 0t1070318720
. . .

size is at 3GB, with c_max at 1GB. What gives? I'm looking at the code now, but was under the impression c_max would limit ARC growth. Granted, it's not a factor of 10, and it's certainly much better than the out-of-the-box growth to 24GB (this is a 32GB x4500), so clearly ARC growth is being limited, but it still grew to 3X c_max.

Thanks,
/jim
Re: [zfs-discuss] C'mon ARC, stay small...
This seems a bit strange. What's the workload, and also, what's the output for:

ARC_mru::print size lsize
ARC_mfu::print size lsize

and

ARC_anon::print size

For obvious reasons, the ARC can't evict buffers that are in use. Buffers that are available to be evicted should be on the mru or mfu list, so this output should be instructive.

-j

On Thu, Mar 15, 2007 at 02:08:37PM -0400, Jim Mauro wrote: FYI - After a few more runs, ARC size hit 10GB, which is now 10X c_max: [...]
Re: [zfs-discuss] C'mon ARC, stay small...
Gar. This isn't what I was hoping to see. Buffers that aren't available for eviction aren't listed in the lsize count. It looks like the MRU has grown to 10GB and most of this could be successfully evicted.

The calculation for determining if we evict from the MRU is in arc_adjust() and looks something like:

top_sz = ARC_anon.size + ARC_mru.size

Then if top_sz > arc.p and ARC_mru.lsize > 0, we evict the smaller of ARC_mru.lsize and top_sz - arc.p.

In your previous message it looks like arc.p is > (ARC_mru.size + ARC_anon.size). It might make sense to double-check these numbers together, so when you check the size and lsize again, also check arc.p.

How/when did you configure arc_c_max? arc.p is supposed to be initialized to half of arc.c. Also, I assume that there's a reliable test case for reproducing this problem?

Thanks,
-j

On Thu, Mar 15, 2007 at 06:57:12PM -0400, Jim Mauro wrote:

ARC_mru::print -d size lsize
size = 0t10224433152
lsize = 0t10218960896
ARC_mfu::print -d size lsize
size = 0t303450112
lsize = 0t289998848
ARC_anon::print -d size
size = 0

So it looks like the MRU is running at 10GB... What does this tell us?

Thanks,
/jim

[...]
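The arc_adjust() decision described above can be replayed against the numbers posted in this thread. This is a paraphrase in shell arithmetic, not the actual kernel code; the variable names mirror the fields quoted here:

```shell
# Replay the MRU-eviction decision from arc_adjust(), as described above,
# using the values posted earlier in the thread.
p=16381819904          # arc.p (~16GB)
anon_size=0            # ARC_anon.size
mru_size=10224433152   # ARC_mru.size (~10GB)
mru_lsize=10218960896  # ARC_mru.lsize

top_sz=$((anon_size + mru_size))
if [ "$top_sz" -gt "$p" ] && [ "$mru_lsize" -gt 0 ]; then
    # Evict the smaller of ARC_mru.lsize and (top_sz - arc.p).
    want=$((top_sz - p))
    if [ "$mru_lsize" -lt "$want" ]; then evict=$mru_lsize; else evict=$want; fi
    echo "evict $evict bytes from the MRU"
else
    # This is the branch taken here: arc.p (~16GB) exceeds anon+mru (~10GB),
    # so nothing is evicted even though c_max is only 1GB.
    echo "no MRU eviction"
fi
```

This shows why the bogus 16GB arc.p matters: with p larger than the resident MRU, the eviction condition never fires.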
Re: [zfs-discuss] C'mon ARC, stay small...
Something else to consider: depending upon how you set arc_c_max, you may just want to set arc_c and arc_p at the same time. If you try setting arc_c_max, then setting arc_c to arc_c_max, and then setting arc_p to arc_c / 2, do you still get this problem?

-j

On Thu, Mar 15, 2007 at 05:18:12PM -0700, [EMAIL PROTECTED] wrote: Gar. This isn't what I was hoping to see. [...]
Re: [zfs-discuss] C'mon ARC, stay small...
> How/when did you configure arc_c_max?

Immediately following a reboot, I set arc.c_max using mdb, then verified by reading the arc structure again.

> arc.p is supposed to be initialized to half of arc.c. Also, I assume
> that there's a reliable test case for reproducing this problem?

Yep. I'm using an x4500 in-house to sort out performance of a customer test case that uses mmap. We acquired the new DIMMs to bring the x4500 to 32GB, since the workload has a 64GB working set size, and we were clobbering a 16GB thumper. We wanted to see how doubling memory may help.

I'm trying to clamp the ARC size because for mmap-intensive workloads, it seems to hurt more than help (although, based on experiments up to this point, it's not hurting a lot).

I'll do another reboot, and run it all down for you serially...

/jim

[...]
Re: [zfs-discuss] C'mon ARC, stay small...
I suppose I should have been more forward about making my last point. If arc_c_max isn't set in /etc/system, I don't believe that the ARC will initialize arc.p to the correct value. I could be wrong about this; however, next time you set c_max, set c to the same value as c_max and set p to half of c. Let me know if this addresses the problem or not.

-j

[...]
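The suggestion above (set c_max, then c = c_max, then p = c / 2) needs three values for the mdb /Z writes. For the 512MB cap used in this thread, the hex values can be computed quickly (a convenience sketch, not part of the original posts):

```shell
# Values to poke into arc.c_max, arc.c, and arc.p for a 512MB ARC cap,
# in the order suggested above: c_max first, then c, then p = c / 2.
cmax=$((512 * 1024 * 1024))   # 536870912 bytes
p=$((cmax / 2))               # 268435456 bytes
printf 'c_max = c = 0x%x (%u)\n' "$cmax" "$cmax"   # 0x20000000
printf 'p         = 0x%x (%u)\n' "$p" "$p"         # 0x10000000
```

These match the 0t536870912 and 0t268435456 values that show up in the arc::print output later in the thread.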
Re: [zfs-discuss] C'mon ARC, stay small...
Following a reboot:

arc::print -tad
{
. . .
c02e29e8 uint64_t size = 0t299008
c02e29f0 uint64_t p = 0t16588228608
c02e29f8 uint64_t c = 0t33176457216
c02e2a00 uint64_t c_min = 0t1070318720
c02e2a08 uint64_t c_max = 0t33176457216
. . .
}
c02e2a08 /Z 0x20000000                  <--- set c_max to 512MB
arc+0x48: 0x7b9789000 = 0x20000000

arc::print -tad
{
. . .
c02e29e8 uint64_t size = 0t299008
c02e29f0 uint64_t p = 0t16588228608
c02e29f8 uint64_t c = 0t33176457216
c02e2a00 uint64_t c_min = 0t1070318720
c02e2a08 uint64_t c_max = 0t536870912   <--- c_max is 512MB
. . .
}
ARC_mru::print -d size lsize
size = 0t294912
lsize = 0t32768

Run the workload a couple times...

c02e29e8 uint64_t size = 0t27121205248  <--- ARC size is 27GB
c02e29f0 uint64_t p = 0t10551351442
c02e29f8 uint64_t c = 0t27121332576
c02e2a00 uint64_t c_min = 0t1070318720
c02e2a08 uint64_t c_max = 0t536870912   <--- c_max is 512MB

ARC_mru::print -d size lsize
size = 0t223985664
lsize = 0t221839360
ARC_mfu::print -d size lsize
size = 0t26897219584                    <--- MFU list is almost 27GB ...
lsize = 0t26869121024

Thanks,
/jim
Re: [zfs-discuss] C'mon ARC, stay small...
Will try that now... /jim [EMAIL PROTECTED] wrote: I suppose I should have been more forward about making my last point. If the arc_c_max isn't set in /etc/system, I don't believe that the ARC will initialize arc.p to the correct value. I could be wrong about this; however, next time you set c_max, set c to the same value as c_max and set p to half of c. Let me know if this addresses the problem or not. -j How/when did you configure arc_c_max? Immediately following a reboot, I set arc.c_max using mdb, then verified reading the arc structure again. arc.p is supposed to be initialized to half of arc.c. Also, I assume that there's a reliable test case for reproducing this problem? Yep. I'm using a x4500 in-house to sort out performance of a customer test case that uses mmap. We acquired the new DIMMs to bring the x4500 to 32GB, since the workload has a 64GB working set size, and we were clobbering a 16GB thumper. We wanted to see how doubling memory may help. I'm trying clamp the ARC size because for mmap-intensive workloads, it seems to hurt more than help (although, based on experiments up to this point, it's not hurting a lot). I'll do another reboot, and run it all down for you serially... /jim Thanks, -j On Thu, Mar 15, 2007 at 06:57:12PM -0400, Jim Mauro wrote: ARC_mru::print -d size lsize size = 0t10224433152 lsize = 0t10218960896 ARC_mfu::print -d size lsize size = 0t303450112 lsize = 0t289998848 ARC_anon::print -d size size = 0 So it looks like the MRU is running at 10GB... What does this tell us? Thanks, /jim [EMAIL PROTECTED] wrote: This seems a bit strange. What's the workload, and also, what's the output for: ARC_mru::print size lsize ARC_mfu::print size lsize and ARC_anon::print size For obvious reasons, the ARC can't evict buffers that are in use. Buffers that are available to be evicted should be on the mru or mfu list, so this output should be instructive. 
-j On Thu, Mar 15, 2007 at 02:08:37PM -0400, Jim Mauro wrote: FYI - After a few more runs, ARC size hit 10GB, which is now 10X c_max: arc::print -tad { . . . c02e29e8 uint64_t size = 0t10527883264 c02e29f0 uint64_t p = 0t16381819904 c02e29f8 uint64_t c = 0t1070318720 c02e2a00 uint64_t c_min = 0t1070318720 c02e2a08 uint64_t c_max = 0t1070318720 . . . Perhaps c_max does not do what I think it does? Thanks, /jim Jim Mauro wrote: Running an mmap-intensive workload on ZFS on a X4500, Solaris 10 11/06 (update 3). All file IO is mmap(file), read memory segment, unmap, close. Tweaked the arc size down via mdb to 1GB. I used that value because c_min was also 1GB, and I was not sure if c_max could be larger than c_minAnyway, I set c_max to 1GB. After a workload run: arc::print -tad { . . . c02e29e8 uint64_t size = 0t3099832832 c02e29f0 uint64_t p = 0t16540761088 c02e29f8 uint64_t c = 0t1070318720 c02e2a00 uint64_t c_min = 0t1070318720 c02e2a08 uint64_t c_max = 0t1070318720 . . . size is at 3GB, with c_max at 1GB. What gives? I'm looking at the code now, but was under the impression c_max would limit ARC growth. Granted, it's not a factor of 10, and it's certainly much better than the out-of-the-box growth to 24GB (this is a 32GB x4500), so clearly ARC growth is being limited, but it still grew to 3X c_max. 
Thanks, /jim

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss