On 12.01.11 16:32, Jeff Savit wrote:
Stephan,

There are a bunch of tools you can use, mostly provided with Solaris 11 Express, plus arcstat and arc_summary, which are available as downloads. The latter tools will tell you the size and state of the ARC, which may be relevant to your issue since you mention memory. For the list, could you describe the ZFS pool configuration (zpool status) and summarize output from vmstat, iostat, and zpool iostat? It might also be helpful to issue 'prstat -s rss' to see if any process is growing its resident memory size. An excellent source of information is the "ZFS Evil Tuning Guide" (just Google those words), which has a wealth of information.
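
For example (just a sketch to get you started; these are the standard Solaris 11 Express commands, and the intervals are only illustrative):

zpool status -v          # pool layout and error counters
zpool iostat -v 10       # per-vdev bandwidth, sampled every 10 seconds
vmstat 10                # free memory, paging and scan rate
prstat -s rss 10         # processes sorted by resident set size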

I hope that helps (for a start at least)
  Jeff



On 01/12/11 08:21 AM, Stephan Budach wrote:
Hi all,

I have exchanged my Dell R610 for a Sun Fire 4170 M2, which has 32 GB RAM installed. I am running Solaris 11 Express on this host and use it primarily to serve Netatalk AFP shares. From day one, I have noticed that the amount of free RAM decreases over time and that, along with it, the overall performance of ZFS decreases as well.

Now, since I am still quite a Solaris newbie, I can't seem to track down where the heck all the memory has gone and why ZFS performs so poorly after an uptime of only 5 days. Rebooting Solaris, which I did for testing, brings the performance back to reasonable levels, but otherwise I am quite at my wits' end. To give some numbers: the ZFS throughput drops to about 1/10th of its initial level, for both reads and writes.

Does anybody have some tips up their sleeve on where I should start looking for the missing memory?

Cheers,
budy

Sure - here we go. First of all, the zpool configuration:

zpool status -v
  pool: obelixData
 state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
    still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
    pool will no longer be accessible on older software versions.
 scan: scrub repaired 0 in 15h29m with 0 errors on Mon Nov 15 21:42:52 2010
config:

    NAME                     STATE     READ WRITE CKSUM
    obelixData               ONLINE       0     0     0
      c9t210000D023038FA8d0  ONLINE       0     0     0
      c9t210000D02305FF42d0  ONLINE       0     0     0

errors: No known data errors

This pool consists of two FC LUNs which are exported from two FC RAIDs (no comments on that one, please; I am still working on the transition to another zpool config! ;) )

Next up are arcstat.pl and arc_summary.pl:

perl /usr/local/de.jvm.scripts/arcstat.pl
time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
17:13:33     0     0      0     0    0     0    0     0    0    15G   16G
17:13:34    71     0      0     0    0     0    0     0    0    15G   16G
17:13:35     3     0      0     0    0     0    0     0    0    15G   16G
17:13:36   30K     0      0     0    0     0    0     0    0    15G   16G
17:13:37   13K     0      0     0    0     0    0     0    0    15G   16G
17:13:38    72     0      0     0    0     0    0     0    0    15G   16G
17:13:39    12     0      0     0    0     0    0     0    0    15G   16G
17:13:40    45     0      0     0    0     0    0     0    0    15G   16G
17:13:41    57     0      0     0    0     0    0     0    0    15G   16G
<State Changed>
17:13:42  1.3K     8      0     8    0     0    0     6    0    15G   16G
17:13:43    45     0      0     0    0     0    0     0    0    15G   16G
17:13:44  1.5K    15      1    13    0     2   50     4    0    15G   16G
17:13:45   122     0      0     0    0     0    0     0    0    15G   16G
17:13:46    74     0      0     0    0     0    0     0    0    15G   16G
17:13:47    88     0      0     0    0     0    0     0    0    15G   16G
17:13:48   19K    67      0    25    0    42    4    24    0    16G   16G
17:13:49   24K    31      0     0    0    31    9     0    0    15G   16G
17:13:50    41     0      0     0    0     0    0     0    0    15G   16G

perl /usr/local/de.jvm.scripts/arc_summary.pl
System Memory:
     Physical RAM:     32751 MB
     Free Memory :     5615 MB
     LotsFree:     511 MB

ZFS Tunables (/etc/system):
     set zfs:zfs_arc_max = 17179869184

ARC Size:
     Current Size:             16383 MB (arcsize)
     Target Size (Adaptive):   16384 MB (c)
     Min Size (Hard Limit):    2048 MB (zfs_arc_min)
     Max Size (Hard Limit):    16384 MB (zfs_arc_max)

ARC Size Breakdown:
     Most Recently Used Cache Size:      73%     12015 MB (p)
     Most Frequently Used Cache Size:      26%     4368 MB (c-p)

ARC Efficency:
     Cache Access Total:             300030668
     Cache Hit Ratio:      92%     277102547       [Defined State for buffer]
     Cache Miss Ratio:      7%     22928121        [Undefined State for Buffer]
     REAL Hit Ratio:       84%     253621864       [MRU/MFU Hits Only]

     Data Demand   Efficiency:    98%
     Data Prefetch Efficiency:    26%

    CACHE HITS BY CACHE LIST:
      Anon:                         4%     12706439               [ New Customer, First Cache Hit ]
      Most Recently Used:          12%     35919656 (mru)         [ Return Customer ]
      Most Frequently Used:        78%     217702208 (mfu)        [ Frequent Customer ]
      Most Recently Used Ghost:     0%     2469558 (mru_ghost)    [ Return Customer Evicted, Now Back ]
      Most Frequently Used Ghost:   2%     8304686 (mfu_ghost)    [ Frequent Customer Evicted, Now Back ]
    CACHE HITS BY DATA TYPE:
      Demand Data:                38%      105648747
      Prefetch Data:               1%      3607719
      Demand Metadata:            53%      147826874
      Prefetch Metadata:           7%      20019207
    CACHE MISSES BY DATA TYPE:
      Demand Data:                 9%      2130351
      Prefetch Data:              43%      9907868
      Demand Metadata:            23%      5396413
      Prefetch Metadata:          23%      5493489
---------------------------------------------

So, these show me that the ZFS ARC actually shouldn't have grown beyond 16 GB, right?
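
To double-check that outside the Perl scripts, the same numbers can be read from the arcstats kstats directly (a quick sketch; zfs:0:arcstats is the standard kstat path for the ZFS module):

kstat -p zfs:0:arcstats:size zfs:0:arcstats:c_max     # current ARC size vs. hard limit, in bytes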

Now for vmstat/iostat/prstat:

vmstat 10
 kthr      memory            page            disk          faults      cpu
 r b w   swap    free   re  mf pi po fr de sr s0 s1 s2 s3   in    sy   cs us sy id
 0 0 0 22210512 9676332 29 215  0  0  0  0  1 -0  8  8 40 6460 13462 5409  2  1 97
 0 1 0 18193100 5752352 32 305  0  0  0  0  0  1 15 15 19 3245 29489 1786  0  1 98
 0 1 0 18193112 5752084 11 223  0  0  0  0  0  0  5  5 19 3628 30364 2219  0  2 97
 0 1 0 18193132 5752568  1 139  0  0  0  0  0  0 21 20 24 4125 31415 2432  0  2 98
 0 1 0 18193140 5752592 25  89  0  0  0  0  0  0  0  0 19 3070 23238 1537  0  1 99
 0 1 0 18193204 5752316 10 167  0  0  0  0  0  0  8  8 29 4117 14217 2810  0  2 98


As you can see, the free memory shrank from 32 GB to approx. 5 GB. Subtracting the ZFS ARC from the 32 GB would leave another 16 GB for Solaris and everything else, so roughly 11 GB have actually been consumed, which I can't pinpoint using prstat. And although 5 GB of free RAM should still be enough, the performance drops as soon as the free memory falls below 10 GB, which is pretty much the case after 2 work days.
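
One thing I haven't tried yet, since prstat only accounts for user processes, is the kernel-side memory summary (a sketch; assumes mdb is run as root):

echo ::memstat | mdb -k     # breaks physical memory down by consumer (Kernel, ZFS File Data, Anon, Page cache, Free)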

iostat 10
   tty        sd0           sd1           sd2           sd3            cpu
 tin tout kps tps serv  kps tps serv  kps tps serv  kps  tps serv   us sy wt id
   0   24   0   0    0   61   8    3   61   8    3 1107   40    6    2  1  0 97
   0   97   0   0    0  160  16    3  160  16    3  170   15    5    0  2  0 98
   0  175   0   0    0  168  20    3  168  20    3  105   15    3    0  2  0 98
   0  111   0   0    0    0   0    0    0   0    0   90   14    4    0  1  0 99
   0   62   0   0    0  146  13    3  146  13    3  422   31    5    0  2  0 98

iostat seems to show a fairly idle system...

Last, but not least: zpool iostat barely sees read or write throughputs above 35 MB/s.
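
(If it matters, the per-LUN breakdown of that can be watched with something along these lines; the interval is just illustrative:

zpool iostat -v obelixData 10 )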

Btw, this happens regardless of whether I use netatalk or SMB for sharing.

Thanks,
budy