On 15/08/2013 15:34, Udo Grabowski (IMK) wrote:
Hi, as a sidenote to the recent discussion on [developer] [zfs] "zpool should not accept devices with a larger (incompatible) ashift": we just had the case that adding an SSD (Sun F20) to a traditional ashift=9 pool (40 disks, 8 x raidz1 (5 disks) vdevs, striped) results in a 4k-blocked ashift=12 SSD (oi_151a7) used as log and cache (each a 2-disk stripe). The machine is a Sun X4540 (48 2TB SATA disks on 6 LSI SAS1068E JBOD controllers), mpt driver. About 800 processes on 96 machines fetch and write data (in smaller 10k chunks) over a 10 GbE (ixgbe) network, plus a few local processes usually reading 10k files at ~12 MB/s. This config worked with more than 100 MB/s sustained inflow on OpenSolaris 2009.06 with an ashift=9 log/cache (the same F20, two disks mirrored; the disk is marked as battery-nvcache enabled in sd.conf).

The question is: does this new configuration harm pool performance? What we saw over the last two days was a heavy impact on write performance that brought especially zfs_dirent_unlock to a grinding halt, regardless of whether the load came in over NFSv4 or locally (see below). We also saw long times in txg_hold_open (incomplete txgs?). Users could not write in time; even a simple remove of an empty directory took minutes. Nevertheless, the zpool-* workers were reading and writing heavily, driving the disks to over 200 IOPS (cheaper desktop variants, not that powerful...) constantly, but mean read/write was only ~12-15 MB/s. So a major drawback compared to the good old OSOL 2009.06. Nothing peculiar otherwise: disks were in even use, 8-45 ms, no outliers or FMA entries, just unbelievably slow; even the zpool iostat output was hard to watch, one line every two seconds instead of 50 lines per second...

Today we removed log and cache, exported the pool, rebooted the machine and reimported the pool, but left out log/cache. And, surprise surprise, performance is back in the hundreds of MB/s read/write, and the longest lockstat entries are down by a factor of ten, from seconds to tenths of seconds (see picked list below)!

...... [long lockstat statistics deleted] ......
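For anyone who wants to verify the same situation on their own pool: the ashift each vdev actually got can be read back with zdb. A minimal sketch only; the pool name "tank" and the device path are placeholders, not our real names:

  # ashift recorded per vdev in the cached pool configuration
  # ("tank" is a placeholder pool name)
  zdb -C tank | grep ashift

  # or read it straight off one device's label, e.g. an F20 module
  # (device path is hypothetical)
  zdb -l /dev/dsk/c4t1d0s0 | grep ashift

In a setup like ours the raidz vdevs should come back with ashift: 9, while the freshly added 4k F20 devices show ashift: 12.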
Anyone any ideas? Experimented again:

- If I export, boot and import with 4k log and cache (512B pool), things run smoothly until some point (this time nearly 4 days of mostly continuous write load); then suddenly write performance goes down and does not recover (a few MB/s on a 1 GB/s capable pool over a 10 Gb/s net).
- If I then remove log and cache, the pool recovers after some time.
- If I add back the cache, performance drops again to low levels; removed, it recovers again.
- If I add back the 4k log without cache, the pool gets its speed back.

So as a workaround for that problem, I dropped the cache and configured the F20 in "logzilla" mode (4-device stripe); this gives at least decent NFS write performance (a command sketch follows after the log excerpt below). It seems that a 4k cache attached to a 512B pool will lead to a problem after a while: zpool iostat shows that the cache crawls at a continuous 300-500 KB/s per device and does not recover from that work mode.

The interesting thing is that the trigger seemed to be a DNS hiccup; I got a couple of these enigmatic TLI transport errors at the same time the performance dropped:

Aug 20 02:14:24 imksunth4 /usr/lib/nfs/nfsd[1548]: [ID 791759 daemon.error] t_rcvrel(file descriptor 240/transport tcp) Resource temporarily unavailable
Aug 20 04:35:06 imksunth4 /usr/lib/nfs/nfsd[1548]: [ID 396295 daemon.error] t_rcvrel(file descriptor 252/transport tcp) TLI error 17
Aug 20 04:42:36 imksunth4 /usr/lib/nfs/nfsd[1548]: [ID 396295 daemon.error] t_rcvrel(file descriptor 240/transport tcp) TLI error 17
Aug 20 05:08:06 imksunth4 /usr/lib/nfs/nfsd[1548]: [ID 791759 daemon.error] t_rcvrel(file descriptor 246/transport tcp) Resource temporarily unavailable
Aug 20 05:14:30 imksunth4 /usr/lib/nfs/nfsd[1548]: [ID 396295 daemon.error] t_rcvrel(file descriptor 244/transport tcp) TLI error 17
Aug 20 05:43:04 imksunth4 /usr/lib/nfs/nfsd[1548]: [ID 396295 daemon.error] t_rcvrel(file descriptor 248/transport tcp) TLI error 17
Aug 20 07:08:24 imksunth4 /usr/lib/nfs/nfsd[1548]: [ID 396295 daemon.error] t_rcvrel(file descriptor 231/transport tcp) TLI error 17
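For reference, roughly the commands behind the workaround mentioned above, as a sketch only; "tank" and the cXtYd0 names are placeholders, not our actual pool or F20 module names:

  # drop the two 4k cache devices and the old 2-device log
  # (pool and device names are hypothetical)
  zpool remove tank c4t3d0 c4t4d0
  zpool remove tank c4t1d0 c4t2d0

  # re-add all four F20 flash modules as a striped log ("logzilla" mode)
  zpool add tank log c4t1d0 c4t2d0 c4t3d0 c4t4d0

  # then watch per-device traffic to see whether anything stalls again
  zpool iostat -v tank 5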
Not sure if the DNS hiccup was actually the trigger, a symptom of another problem, or just pure coincidence...

--
Dr. Udo Grabowski
Inst. f. Meteorology a. Climate Research, IMK-ASF-SAT
www.imk-asf.kit.edu/english/sat.php
KIT - Karlsruhe Institute of Technology  http://www.kit.edu
Postfach 3640, 76021 Karlsruhe, Germany
T: (+49) 721 608-26026  F: -926026
