On Mon, 2008-07-21 at 14:29 -0400, Ric Wheeler wrote:
> Chris Mason wrote:
> > On Sun, 2008-07-20 at 09:46 -0400, Ric Wheeler wrote:
> >
> >>>>>>>> Just to kick the tires, I tried the same test that I ran last week
> >>>>>>>> on ext4. Everything was going great, I decided to kill it after 6
> >>>>>>>> million files or so and restart.
> >>>>>>>>
> >>>>> Well, it looks like I neglected to push all the changesets, especially
> >>>>> the last one that made it less racey. So, I've just done another push,
> >>>>> sorry. For the fs_mark workload, it shouldn't change anything.
> >>>>>
> >>>>> This code still hasn't really survived an overnight run, hopefully this
> >>>>> commit will.
> >>>>>
> >>>> The test is still running, but slowly, with a (slow) stream of messages
> >>>> about:
> >>>>
> >
> > [ lock timeouts and stalls ]
> >
> > Ok, I've made a few changes that should lower overall contention on the
> > allocation mutex. I'm getting better performance on a 3 million file
> > run, please give it a shot.
> >
> > -chris
> >
> Hi Chris,
>
> After an update, clean rebuild & reboot, the test is running along and
> has hit about 10 million files. I still see some messages like:
>
> INFO: task pdflush:4051 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> pdflush       D ffffffff8129c5b0     0  4051      2
>  ffff81002ae77870 0000000000000046 0000000000000000 ffff81002ae77834
>  0000000000000001 ffffffff814b2280 ffffffff814b2280 0000000100000001
>  0000000000000000 ffff81003f188000 ffff81003fac5980 ffff81003f188350
>
> but not as many as before.
>
> I will attach the messages file,

I'll try running with soft-lockup detection here to see if I can hunt down the cause of these stalls. Good to know I've made progress though ;)

-chris