Cool! I could compare the builds before and after these changes to see
what difference it makes. Of course it also prevents future issues.

Huan

On Mon, Nov 2, 2009 at 9:50 PM, John Abd-El-Malek <j...@chromium.org> wrote:
> Over the last week, I've been making some changes to how threads are used in
> the browser process, with the goal of simplifying cross-thread access and
> improving stability.
> The problem
> We were using a number of patterns that were problematic:
> 1) using ChromeThread::GetMessageLoop
>   -this isn't safe, since it could return a valid pointer, but since the
> caller isn't holding on to a lock anymore, the target MessageLoop could be
> destructed while it's being used
> 2) caching of MessageLoop pointers in order to use them later for PostTask
> and friends
>   -this was more efficient previously (more on that later) since using
> ChromeThread::GetMessageLoop involved a lock
>   -but it spread logic about the order of thread destruction all over the
> code.  Code that moved from the IO thread to the file thread and back, in
> order to avoid doing disk access on the IO thread, ended up having to do an
> extra hop through the UI thread on the way back to the IO thread since the
> file thread outlives the Io thread.  Of course, most code learnt this the
> hard way, after doing the straight forward IO->file->IO thread hop and
> updating the code after seeing reliability or user crashes
>   -it made the browser shutdown fragile and hence difficult to update
> 3) file thread hops using RefCountedThreadSafe objects which have
> non-trivial destructors
>   -to reduce jank, frequently an object on the UI or IO thread would execute
> some code on the file thread, then post the result back to the original
> thread.  We make this easy using RefCountedThreadSafe and NewRunnableMethod
> so this pattern happened often, but it's not always safe: NewRunnableMethod
> holds an extra reference on the object to ensure that it doesn't invoke a
> method on a deleted object.  But it's quite possible that before the file
> thread's stack unwinds and it releases the extra reference, that the
> response task on the original thread executed and released its own
> additional reference.  The file thread is then left with the remaining
> reference and the object gets destructed there.  While for most objects this
> is ok, many have non-trivial destructors, with the worst being ones that
> register with the per-thread NotificationService.  Dangling pointers would
> be left behind and tracking these crashes from ChromeBot or the crash dumps
> has wasted several days at least for me.
> 4) having browser code take different code paths if a thread didn't exist
>   -this could be either deceptively harmless  (i.e. execute synchronously
> when it was normally asynchronous), when in fact it makes shutdown slower
> because disk access is moved to the UI thread
>   -it could lead to data loss, if tasks are silently not posted because the
> code assumes this only happens in unit tests, when it could occur on browser
> shutdown as well
>
> The solution
> 1+2: Use ChromeThread::PostTask and friends (i.e. PostDelayedTask,
> DeleteSoon, ReleaseSoon) which are safe and efficient: no locks are grabbed
> if the target thread is known to outlive the current thread.  The four
> static methods have the same signature as the ones from MessageLoop, with
> the addition of the first parameter to indicate the target thread.
> ChromeThread::PostTask(ChromeThread::FILE, FROM_HERE, task);
> 3: If your object must be destructed on a specific thread, use a trait from
> ChromeThread:
> class Foo : public base::RefCountedThreadSafe<Foo,
> ChromeThread::DeleteOnIOThread>
> 4: I've removed all the special casing and always made the objects in the
> browser code behave in one way.  If you're writing a unit test and need to
> use an object that goes to a file thread (where before it would proceed
> synchronously), you just need:
> ChromeThread file_thread(ChromeThread::FILE, MessageLoop::current());
> foo->StartAsyncTaskOnFileThread();
> MessageLoop::current()->RunAllPending();
> There are plenty of examples now in the tests.
>
> Gotchas
> -PostTask will silently delete a task if the thread doesn't exist.  This is
> done to avoid having all the code that uses it have special cases if the
> target thread exists or not, and to make Valgrind happy.  As usual, the task
> for DeleteSoon/ReleaseSoon doesn't do anything in its destructor, so this
> won't cause unexpected behavior with them.  But be wary of posted Task
> objects which have non-trivial destructors or smart pointers as members.
>  I'm still on the fence about this, since while the latter is theoretical
> now, it could lead to problems in the future.  I might change it so that the
> tasks are not deleted when I'm ready for more Valgrind fun.
> If you absolutely must know if a task was posted or not, you can check the
> return value of PostTask and friends.  But note that even if the task was
> posted successfully, there's no guarantee that it will run because the
> target thread could already have a QuitTask queued up.
> g_browser->io_thread() and file_thread() are still around (the former for
> IPC code, and the latter for Linux proxy code which is in net and so can't
> use ChromeThread).  Don't use them unless absolutely necessary.
>
> The Catch
> Ensuring we don't fall back into our old patterns through code you write and
> review.
>
> More information
> http://code.google.com/p/chromium/issues/detail?id=25354
> >
>

--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: chromium-dev@googlegroups.com 
View archives, change email options, or unsubscribe: 
    http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---

Reply via email to