On Mon, 2017-05-01 at 09:16 -0700, Dan Williams wrote:
> On Mon, May 1, 2017 at 9:12 AM, Kani, Toshimitsu <toshi.k...@hpe.com> wrote:
> > On Mon, 2017-05-01 at 08:52 -0700, Dan Williams wrote:
> > > On Mon, May 1, 2017 at 8:43 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> > > > On Mon, May 1, 2017 at 8:34 AM, Kani, Toshimitsu <toshi.kani@hpe.com> wrote:
> > > > > On Sun, 2017-04-30 at 05:39 -0700, Dan Williams wrote:
> > > >  :
> > > > >
> > > > > Hi Dan,
> > > > >
> > > > > I was testing the change with CONFIG_DEBUG_ATOMIC_SLEEP set
> > > > > this time, and hit the following BUG with BTT. This is a
> > > > > separate issue (not introduced by this patch), but it shows
> > > > > that we have an issue with the DSM call path as well.
> > > >
> > > > Ah, great find, thanks! We don't see this in the unit tests
> > > > because the nfit_test infrastructure takes no sleeping actions
> > > > in its simulated DSM path. Outside of converting btt to use
> > > > sleeping locks I'm not sure I see a path forward. I wonder how
> > > > bad the performance impact of that would be? Perhaps with
> > > > opportunistic spinning it won't be so bad, but I don't see
> > > > another choice.
> > >
> > > It's worse than that. Part of the performance optimization of BTT
> > > I/O was to avoid locking altogether when we could rely on a BTT
> > > lane percpu, so that would also need to be removed.
> >
> > I do not have a good idea either, but I'd rather disable this
> > clearing in the regular BTT write path than adding sleeping locks
> > to BTT. Clearing a bad block in the BTT write path is
> > difficult/challenging since it allocates a new block.
>
> Actually, that may make things easier. Can we teach BTT to track
> error blocks and clear them before they are reassigned?

I was thinking the same after sending it. I think we should be able to
do that.

Thanks,
-Toshi
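
To make the "track now, clear before reassignment" idea concrete, here
is a rough user-space sketch in C. It is not BTT code: the names
(struct free_entry, note_error, reassign_block, clear_media_error) and
the fixed-size table are hypothetical, and clear_media_error() only
stands in for the sleeping DSM call. Locking and lane handling are not
modeled at all. The only point is that the atomic path just sets a
flag, and the potentially sleeping clear is deferred until the block is
about to be handed out again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NFREE 8 /* hypothetical number of tracked free blocks */

/* Hypothetical free-block entry: block number plus a "saw an error" flag. */
struct free_entry {
	uint32_t block;
	bool has_error; /* set from the I/O path; setting it never sleeps */
};

struct tracker {
	struct free_entry free[NFREE];
	unsigned int clears_issued; /* counts simulated clear operations */
};

/* Stand-in for the sleeping clear-error call; here it only counts calls. */
static void clear_media_error(struct tracker *t, uint32_t block)
{
	(void)block;
	t->clears_issued++;
}

/* Atomic-context side: record the error and return immediately. */
static void note_error(struct tracker *t, size_t slot)
{
	assert(slot < NFREE);
	t->free[slot].has_error = true;
}

/*
 * Sleepable-context side: if an error was recorded for this free block,
 * clear it before the block is reassigned, then drop the flag.
 */
static uint32_t reassign_block(struct tracker *t, size_t slot)
{
	struct free_entry *e;

	assert(slot < NFREE);
	e = &t->free[slot];
	if (e->has_error) {
		clear_media_error(t, e->block);
		e->has_error = false;
	}
	return e->block;
}
```

In this sketch a caller would invoke note_error() wherever the error is
detected and reassign_block() only from a context that is allowed to
sleep. Whether BTT can arrange for the reassignment step to run in such
a context is exactly the open question in this thread.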