> And it started replacement/resilvering... after a few minutes the system became unavailable. Rebooting only gives me a few minutes, then resilvering makes the system unresponsive.
>
> Is there any workaround or patch for this problem???
Argh, sorry -- the problem is that we don't do aggressive enough scrub/resilver throttling. The effect is most pronounced on 32-bit or low-memory systems. We're working on it.

One thing you might try is reducing txg_time to 1 second (the default is 5 seconds) by saying this:

    echo txg_time/W1 | mdb -kw

Let me describe what's happening, and why this may help.

When we kick off a scrub (same code path as a resilver, so I'll use the term generically), we traverse the entire block tree looking for blocks that need scrubbing. The tree traversal itself is single-threaded, but the work it generates is not -- each time we find a block that needs scrubbing, we schedule an async I/O to do it. As you've discovered, we can generate work faster than the I/O subsystem can process it. To avoid overloading the disks, we throttle I/O downstream, but we don't (yet) have an upstream throttle. If we discover blocks really fast, we can end up scheduling lots of I/O -- and sitting on lots of memory -- before the downstream throttle kicks in.

The reason this relates to txg_time is that every time we sync a transaction group, we suspend the scrub thread and wait for all pending scrub I/Os to complete. This ensures that we won't asynchronously scrub a block that was freed and reallocated in a future txg; coupled with the COW nature of ZFS, this allows us to run scrubs entirely independently of all filesystem-level structure (e.g. directories) and locking rules. This little trick makes the scrubbing algorithms *much* simpler. The key point is that each spa_sync() throttles the scrub down to zero pending I/Os.

By lowering txg_time from 5 seconds to 1, you're cutting down the maximum number of pending scrub I/Os by roughly 5x. The unresponsiveness you're seeing is a threshold effect; I'm hoping that by running spa_sync() more often, we can keep you below that threshold.

Please let me know if this works for you.

Jeff
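
P.S. To spell the workaround out a little: the first two commands below are just the one-liner above plus mdb's decimal read of the variable; the /etc/system line is an untested guess at making the change survive a reboot, so treat it accordingly.

    # Read the current value of txg_time (mdb prints it in decimal with /D):
    echo txg_time/D | mdb -k

    # Drop it to 1 second on the running kernel -- this does not persist
    # across a reboot:
    echo txg_time/W1 | mdb -kw

    # Untested guess at making the setting persistent, via /etc/system:
    #   set zfs:txg_time = 1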
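
P.P.S. For anyone curious what an "upstream throttle" would look like concretely, here's a little userland sketch -- emphatically not the real ZFS code, with made-up names and a made-up cap -- of the idea: the traversal (producer) blocks once too many async scrub I/Os are in flight, and the spa_sync()-style quiesce at the end waits for everything to drain. Build with something like "cc scrub_throttle.c -lpthread" (or -pthread, depending on your compiler).

    /*
     * Userland sketch of an "upstream" scrub throttle.  This is NOT the
     * actual ZFS code; scrub_inflight, SCRUB_INFLIGHT_MAX, etc. are made
     * up for illustration.  The producer (the traversal) sleeps once too
     * many async I/Os are in flight, instead of queueing unbounded work
     * and memory ahead of the downstream throttle.
     */
    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    #define SCRUB_INFLIGHT_MAX  64      /* hypothetical cap on pending scrub I/Os */
    #define BLOCKS_TO_SCRUB     1000

    static pthread_mutex_t scrub_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  scrub_cv   = PTHREAD_COND_INITIALIZER;
    static int scrub_inflight;          /* I/Os issued but not yet completed */

    /* "I/O completion": drop the in-flight count and wake any waiters. */
    static void *
    scrub_io_done(void *arg)
    {
        usleep(1000);                   /* stand-in for the actual disk I/O */
        pthread_mutex_lock(&scrub_lock);
        scrub_inflight--;
        pthread_cond_broadcast(&scrub_cv);
        pthread_mutex_unlock(&scrub_lock);
        return (arg);
    }

    /* Traversal side: throttle *before* issuing the async I/O. */
    static void
    scrub_issue_block(void)
    {
        pthread_t io;

        pthread_mutex_lock(&scrub_lock);
        while (scrub_inflight >= SCRUB_INFLIGHT_MAX)
            pthread_cond_wait(&scrub_cv, &scrub_lock);  /* upstream throttle */
        scrub_inflight++;
        pthread_mutex_unlock(&scrub_lock);

        (void) pthread_create(&io, NULL, scrub_io_done, NULL);
        (void) pthread_detach(io);
    }

    int
    main(void)
    {
        for (int i = 0; i < BLOCKS_TO_SCRUB; i++)
            scrub_issue_block();

        /* spa_sync()-style quiesce: wait for all pending scrub I/Os. */
        pthread_mutex_lock(&scrub_lock);
        while (scrub_inflight > 0)
            pthread_cond_wait(&scrub_cv, &scrub_lock);
        pthread_mutex_unlock(&scrub_lock);

        printf("scrubbed %d blocks; never more than %d in flight\n",
            BLOCKS_TO_SCRUB, SCRUB_INFLIGHT_MAX);
        return (0);
    }

The only point of the sketch is that the producer sleeps instead of piling up work (and memory) faster than the disks can absorb it; where exactly such a cap would live in the real scrub traversal is a separate question.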