Cool! I could compare the builds before and after these changes to see what difference it makes. Of course it also prevents future issues.
Huan On Mon, Nov 2, 2009 at 9:50 PM, John Abd-El-Malek <j...@chromium.org> wrote: > Over the last week, I've been making some changes to how threads are used in > the browser process, with the goal of simplifying cross-thread access and > improving stability. > The problem > We were using a number of patterns that were problematic: > 1) using ChromeThread::GetMessageLoop > -this isn't safe, since it could return a valid pointer, but since the > caller isn't holding on to a lock anymore, the target MessageLoop could be > destructed while it's being used > 2) caching of MessageLoop pointers in order to use them later for PostTask > and friends > -this was more efficient previously (more on that later) since using > ChromeThread::GetMessageLoop involved a lock > -but it spread logic about the order of thread destruction all over the > code. Code that moved from the IO thread to the file thread and back, in > order to avoid doing disk access on the IO thread, ended up having to do an > extra hop through the UI thread on the way back to the IO thread since the > file thread outlives the Io thread. Of course, most code learnt this the > hard way, after doing the straight forward IO->file->IO thread hop and > updating the code after seeing reliability or user crashes > -it made the browser shutdown fragile and hence difficult to update > 3) file thread hops using RefCountedThreadSafe objects which have > non-trivial destructors > -to reduce jank, frequently an object on the UI or IO thread would execute > some code on the file thread, then post the result back to the original > thread. We make this easy using RefCountedThreadSafe and NewRunnableMethod > so this pattern happened often, but it's not always safe: NewRunnableMethod > holds an extra reference on the object to ensure that it doesn't invoke a > method on a deleted object. But it's quite possible that before the file > thread's stack unwinds and it releases the extra reference, that the > response task on the original thread executed and released its own > additional reference. The file thread is then left with the remaining > reference and the object gets destructed there. While for most objects this > is ok, many have non-trivial destructors, with the worst being ones that > register with the per-thread NotificationService. Dangling pointers would > be left behind and tracking these crashes from ChromeBot or the crash dumps > has wasted several days at least for me. > 4) having browser code take different code paths if a thread didn't exist > -this could be either deceptively harmless (i.e. execute synchronously > when it was normally asynchronous), when in fact it makes shutdown slower > because disk access is moved to the UI thread > -it could lead to data loss, if tasks are silently not posted because the > code assumes this only happens in unit tests, when it could occur on browser > shutdown as well > > The solution > 1+2: Use ChromeThread::PostTask and friends (i.e. PostDelayedTask, > DeleteSoon, ReleaseSoon) which are safe and efficient: no locks are grabbed > if the target thread is known to outlive the current thread. The four > static methods have the same signature as the ones from MessageLoop, with > the addition of the first parameter to indicate the target thread. > ChromeThread::PostTask(ChromeThread::FILE, FROM_HERE, task); > 3: If your object must be destructed on a specific thread, use a trait from > ChromeThread: > class Foo : public base::RefCountedThreadSafe<Foo, > ChromeThread::DeleteOnIOThread> > 4: I've removed all the special casing and always made the objects in the > browser code behave in one way. If you're writing a unit test and need to > use an object that goes to a file thread (where before it would proceed > synchronously), you just need: > ChromeThread file_thread(ChromeThread::FILE, MessageLoop::current()); > foo->StartAsyncTaskOnFileThread(); > MessageLoop::current()->RunAllPending(); > There are plenty of examples now in the tests. > > Gotchas > -PostTask will silently delete a task if the thread doesn't exist. This is > done to avoid having all the code that uses it have special cases if the > target thread exists or not, and to make Valgrind happy. As usual, the task > for DeleteSoon/ReleaseSoon doesn't do anything in its destructor, so this > won't cause unexpected behavior with them. But be wary of posted Task > objects which have non-trivial destructors or smart pointers as members. > I'm still on the fence about this, since while the latter is theoretical > now, it could lead to problems in the future. I might change it so that the > tasks are not deleted when I'm ready for more Valgrind fun. > If you absolutely must know if a task was posted or not, you can check the > return value of PostTask and friends. But note that even if the task was > posted successfully, there's no guarantee that it will run because the > target thread could already have a QuitTask queued up. > g_browser->io_thread() and file_thread() are still around (the former for > IPC code, and the latter for Linux proxy code which is in net and so can't > use ChromeThread). Don't use them unless absolutely necessary. > > The Catch > Ensuring we don't fall back into our old patterns through code you write and > review. > > More information > http://code.google.com/p/chromium/issues/detail?id=25354 > > > --~--~---------~--~----~------------~-------~--~----~ Chromium Developers mailing list: chromium-dev@googlegroups.com View archives, change email options, or unsubscribe: http://groups.google.com/group/chromium-dev -~----------~----~----~----~------~----~------~--~---