Hi, I have a FreeBSD 9 system with a ZFS root. It is actually a VM under Xen on a beefy piece of hardware (4-core 3 GHz Sandy Bridge Xeon, 32 GB total host memory; the VM has 4 vCPUs and 6 GB RAM). The pool is a mirror of two gpart partitions. I am after data integrity more than performance, as long as performance stays reasonable (which it has more than been for the last 3 months).
The other VMs on the same hardware are set up the same way and don't have this problem. There are 4 other FreeBSD VMs: one runs email for a one-man company and a few of his friends, plus some static web pages and other odds and ends for him; one runs a few low-use web apps for various customers; and one runs about 30 websites under Apache and nginx, mostly static sites. None are heavily used. There is also one Linux VM running a couple of low-use FrontBase databases -- not high-use databases, low-use ones.

The troublesome VM has been running fine for over 3 months since I installed it, and its level of use has been pretty much constant. It runs 4 jails, each dedicated to a different bit of email processing for a small number of users: one is a secondary DNS, one runs clamav and spamassassin, one runs exim for incoming and outgoing mail, and one runs dovecot for IMAP and POP. There is no web server, database, or anything else running on it. Total number of mail users is approximately 50, plus or minus, and total mail traffic is very low compared to "real" mail servers.

Earlier this week things started "freezing up". Processes become unresponsive; an episode can last a few minutes or half an hour or more. It eventually resolves itself and things are good for anywhere from 10 minutes to 3 hours until it happens again. While it is happening, lots of processes are listed in "top" in the "zfs zio->i", "zfs tx->tx", and "db->db" states, and they only show up in those states when there are problems. What are these states indicative of? Eventually things get going again, these states drop off, and the system hums along.

Based on some stuff I found via Google (from a person who had a different but somewhat similar problem), I tried setting "zfs set primarycache=metadata zroot" and then "zfs set primarycache=none zroot", to see whether the system was churning on cache upkeep, but the problem recurred with approximately the same severity and frequency. What is strange is that this server ran fine for 3 months straight with the same level of work.
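For reference, these are the exact commands I ran, plus how I verified the setting took and how I put it back afterwards (primarycache=all is the default, as far as I know):

# zfs set primarycache=metadata zroot     (what I tried first, per those Google hits)
# zfs set primarycache=none zroot         (when that made no difference)
# zfs get -r primarycache zroot           (verify the property took; child datasets inherit it)
# zfs set primarycache=all zroot          (restore the default)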
Thanks for any hints or clues.

Chad

Some data points below.

# uname -a
FreeBSD newbagend 9.0-STABLE FreeBSD 9.0-STABLE #1: Wed Mar 21 15:22:14 MDT 2012 chad@underhill:/usr/obj/usr/src/sys/UNDERHILL-XEN amd64

# zpool status
  pool: zroot
 state: ONLINE
  scan: scrub repaired 0 in 6h13m with 0 errors on Fri Aug 10 19:33:23 2012
config:

        NAME                                            STATE     READ WRITE CKSUM
        zroot                                           ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/f0da8263-8a52-11e1-b3ae-aa00003efccd  ONLINE       0     0     0
            gptid/0f24ab58-8a53-11e1-b3ae-aa00003efccd  ONLINE       0     0     0

errors: No known data errors

Representative output from running zfs-stats during a trouble period:

# zfs-stats -a

------------------------------------------------------------------------
ZFS Subsystem Report                            Sat Aug 11 13:40:07 2012
------------------------------------------------------------------------

System Information:

        Kernel Version:                         900505 (osreldate)
        Hardware Platform:                      amd64
        Processor Architecture:                 amd64

        ZFS Storage pool Version:               28
        ZFS Filesystem Version:                 5

FreeBSD 9.0-STABLE #1: Wed Mar 21 15:22:14 MDT 2012 chad
 1:40PM  up  2:54, 3 users, load averages: 0.23, 0.19, 0.14

------------------------------------------------------------------------

System Memory:

        11.49%  681.92  MiB Active,    4.03%   238.97  MiB Inact
        33.37%  1.93    GiB Wired,     0.05%   3.04    MiB Cache
        51.04%  2.96    GiB Free,      0.01%   808.00  KiB Gap

        Real Installed:                         6.00    GiB
        Real Available:                 99.65%  5.98    GiB
        Real Managed:                   96.93%  5.80    GiB

        Logical Total:                          6.00    GiB
        Logical Used:                   46.76%  2.81    GiB
        Logical Free:                   53.24%  3.19    GiB

Kernel Memory:                                  1.25    GiB
        Data:                           98.38%  1.23    GiB
        Text:                           1.62%   20.75   MiB

Kernel Memory Map:                              5.68    GiB
        Size:                           17.27%  1003.75 MiB
        Free:                           82.73%  4.70    GiB

------------------------------------------------------------------------

ARC Summary: (HEALTHY)
        Memory Throttle Count:                  0

ARC Misc:
        Deleted:                                9
        Recycle Misses:                         64.30k
        Mutex Misses:                           10
        Evict Skips:                            58.80k

ARC Size:                               39.98%  1.20    GiB
        Target Size: (Adaptive)         100.00% 3.00    GiB
        Min Size (Hard Limit):          12.50%  384.00  MiB
        Max Size (High Water):          8:1     3.00    GiB

ARC Size Breakdown:
        Recently Used Cache Size:       25.56%  785.15  MiB
        Frequently Used Cache Size:     74.44%  2.23    GiB

ARC Hash Breakdown:
        Elements Max:                           223.30k
        Elements Current:               99.93%  223.15k
        Collisions:                             418.23k
        Chain Max:                              9
        Chains:                                 66.67k

------------------------------------------------------------------------

ARC Efficiency:                                 3.17m
        Cache Hit Ratio:                89.07%  2.82m
        Cache Miss Ratio:               10.93%  346.27k
        Actual Hit Ratio:               86.49%  2.74m

        Data Demand Efficiency:         99.50%  1.09m
        Data Prefetch Efficiency:       60.54%  1.78k

        CACHE HITS BY CACHE LIST:
          Most Recently Used:           23.72%  669.34k
          Most Frequently Used:         73.38%  2.07m
          Most Recently Used Ghost:     1.92%   54.33k
          Most Frequently Used Ghost:   3.30%   93.02k

        CACHE HITS BY DATA TYPE:
          Demand Data:                  38.35%  1.08m
          Prefetch Data:                0.04%   1.08k
          Demand Metadata:              58.75%  1.66m
          Prefetch Metadata:            2.87%   80.97k

        CACHE MISSES BY DATA TYPE:
          Demand Data:                  1.56%   5.39k
          Prefetch Data:                0.20%   704
          Demand Metadata:              55.46%  192.02k
          Prefetch Metadata:            42.78%  148.15k

------------------------------------------------------------------------

L2ARC is disabled

------------------------------------------------------------------------

File-Level Prefetch: (HEALTHY)

DMU Efficiency:                                 6.05m
        Hit Ratio:                      66.59%  4.03m
        Miss Ratio:                     33.41%  2.02m

        Colinear:                               2.02m
          Hit Ratio:                    0.04%   725
          Miss Ratio:                   99.96%  2.02m

        Stride:                                 3.90m
          Hit Ratio:                    99.98%  3.90m
          Miss Ratio:                   0.02%   826

DMU Misc:
        Reclaim:                                2.02m
          Successes:                    2.02%   40.86k
          Failures:                     97.98%  1.98m

        Streams:                                125.81k
          +Resets:                      0.36%   453
          -Resets:                      99.64%  125.36k
          Bogus:                                0

------------------------------------------------------------------------
VDEV Cache Summary:                             530.68k
        Hit Ratio:                      15.30%  81.21k
        Miss Ratio:                     70.40%  373.57k
        Delegations:                    14.30%  75.89k

------------------------------------------------------------------------

ZFS Tunables (sysctl):
        kern.maxusers                           512
        vm.kmem_size                            6222712832
        vm.kmem_size_scale                      1
        vm.kmem_size_min                        0
        vm.kmem_size_max                        329853485875
        vfs.zfs.l2c_only_size                   0
        vfs.zfs.mfu_ghost_data_lsize            91367424
        vfs.zfs.mfu_ghost_metadata_lsize        128350208
        vfs.zfs.mfu_ghost_size                  219717632
        vfs.zfs.mfu_data_lsize                  132299264
        vfs.zfs.mfu_metadata_lsize              20034048
        vfs.zfs.mfu_size                        160949760
        vfs.zfs.mru_ghost_data_lsize            45155328
        vfs.zfs.mru_ghost_metadata_lsize        642998784
        vfs.zfs.mru_ghost_size                  688154112
        vfs.zfs.mru_data_lsize                  347115520
        vfs.zfs.mru_metadata_lsize              10907136
        vfs.zfs.mru_size                        794174976
        vfs.zfs.anon_data_lsize                 0
        vfs.zfs.anon_metadata_lsize             0
        vfs.zfs.anon_size                       29469696
        vfs.zfs.l2arc_norw                      1
        vfs.zfs.l2arc_feed_again                1
        vfs.zfs.l2arc_noprefetch                1
        vfs.zfs.l2arc_feed_min_ms               200
        vfs.zfs.l2arc_feed_secs                 1
        vfs.zfs.l2arc_headroom                  2
        vfs.zfs.l2arc_write_boost               8388608
        vfs.zfs.l2arc_write_max                 8388608
        vfs.zfs.arc_meta_limit                  805306368
        vfs.zfs.arc_meta_used                   805310296
        vfs.zfs.arc_min                         402653184
        vfs.zfs.arc_max                         3221225472
        vfs.zfs.dedup.prefetch                  1
        vfs.zfs.mdcomp_disable                  0
        vfs.zfs.write_limit_override            0
        vfs.zfs.write_limit_inflated            19260174336
        vfs.zfs.write_limit_max                 802507264
        vfs.zfs.write_limit_min                 33554432
        vfs.zfs.write_limit_shift               3
        vfs.zfs.no_write_throttle               0
        vfs.zfs.zfetch.array_rd_sz              1048576
        vfs.zfs.zfetch.block_cap                256
        vfs.zfs.zfetch.min_sec_reap             2
        vfs.zfs.zfetch.max_streams              8
        vfs.zfs.prefetch_disable                0
        vfs.zfs.mg_alloc_failures               8
        vfs.zfs.check_hostid                    1
        vfs.zfs.recover                         0
        vfs.zfs.txg.synctime_ms                 1000
        vfs.zfs.txg.timeout                     5
        vfs.zfs.scrub_limit                     10
        vfs.zfs.vdev.cache.bshift               16
        vfs.zfs.vdev.cache.size                 10485760
        vfs.zfs.vdev.cache.max                  16384
        vfs.zfs.vdev.write_gap_limit            4096
        vfs.zfs.vdev.read_gap_limit             32768
        vfs.zfs.vdev.aggregation_limit          131072
        vfs.zfs.vdev.ramp_rate                  2
        vfs.zfs.vdev.time_shift                 6
        vfs.zfs.vdev.min_pending                4
        vfs.zfs.vdev.max_pending                10
        vfs.zfs.vdev.bio_flush_disable          0
        vfs.zfs.cache_flush_disable             0
        vfs.zfs.zil_replay_disable              0
        vfs.zfs.zio.use_uma                     0
        vfs.zfs.snapshot_list_prefetch          0
        vfs.zfs.version.zpl                     5
        vfs.zfs.version.spa                     28
        vfs.zfs.version.acl                     1
        vfs.zfs.debug                           0
        vfs.zfs.super_owner                     0

------------------------------------------------------------------------

Representative zpool iostat output, captured during a trouble period. As you can see, not much is going on (the load is low), and iostat during a calm period looks about the same.

# zpool iostat zroot 1
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zroot        107G  41.9G      7    261  23.8K  1.52M
zroot        107G  41.9G     10    140  7.42K   272K
zroot        107G  41.9G      8    176  14.4K   547K
zroot        107G  41.9G      0     59      0   188K
zroot        107G  41.9G      5    171  6.44K  1.73M
zroot        107G  41.9G      4    284  8.42K  1006K
zroot        107G  41.9G      5    118  2.97K   260K
zroot        107G  41.9G     25    194  27.7K   623K
zroot        107G  41.9G      0    132      0   764K
zroot        107G  41.9G      1     95  6.44K  1.16M
zroot        107G  41.9G      8    272  16.3K   829K
zroot        107G  41.9G     56    212   103K   213K
zroot        107G  41.9G     22    221  27.7K   204K
zroot        107G  41.9G      2    455  1.48K   509K
zroot        107G  41.9G     14    198  7.42K   132K
zroot        107G  41.9G     14    270  7.42K   306K
zroot        107G  41.9G      6    273  3.46K   670K
zroot        107G  41.9G     21    175  10.9K   570K
zroot        107G  41.9G     17    179  8.91K   591K
zroot        107G  41.9G     11    289  17.3K   902K
zroot        107G  41.9G     13    121  6.93K   230K
zroot        107G  41.9G     18    238  9.41K   734K
zroot        107G  41.9G     99     61  50.5K   188K
zroot        107G  41.9G      0    222      0   862K
zroot        107G  41.9G     11    149  13.4K  1.12M
zroot        107G  41.9G     15    319  10.9K  1.05M
zroot        107G  41.9G      0    127      0   392K
zroot        107G  41.9G      0    159      0  1.70M
zroot        107G  41.9G     68    196   212K   601K
zroot        107G  41.9G     17    144  18.8K   295K
zroot        107G  41.9G     12    187  17.3K   588K
zroot        107G  41.9G      0    136      0  1.23M
zroot        107G  41.9G      6    209  23.8K   564K
zroot        107G  41.9G     11    199  12.4K   422K
zroot        107G  41.9G     12    178  9.41K   553K
zroot        107G  41.9G      0    140  1.48K  1.17M
zroot        107G  41.9G     48    200   128K   411K
zroot        107G  41.9G      8    191  16.8K   121K
zroot        107G  41.9G      1    397   1013   375K
zroot        107G  41.9G      0    263      0   132K
zroot        107G  41.9G     14    228  13.4K   235K
zroot        107G  41.9G      7     21  4.46K  10.9K
zroot        107G  41.9G      2    161  1.48K   156K
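P.S. I can gather more detail the next time it wedges. I'm planning to leave a crude capture loop running so the next episode gets recorded as it happens -- just a rough sketch, and the 10-second interval and log path are arbitrary placeholders I picked, not anything official:

#!/bin/sh
# Every 10 seconds, append kernel stacks of all threads plus a pool
# activity snapshot to a log, so the next stall is captured live.
while true; do
    date >> /var/tmp/stall.log
    # -kk prints each thread's kernel stack; the zio->i / tx->tx /
    # db->db states should show up here with their full wait chains
    procstat -kk -a >> /var/tmp/stall.log 2>&1
    zpool iostat zroot >> /var/tmp/stall.log
    sleep 10
done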