On Thu, Nov 5, 2009 at 2:01 PM, John Abd-El-Malek <j...@chromium.org> wrote:

>
>
> On Thu, Nov 5, 2009 at 1:42 PM, Jeremy Orlow <jor...@chromium.org> wrote:
>
>> On Thu, Nov 5, 2009 at 1:34 PM, John Abd-El-Malek <j...@chromium.org>wrote:
>>
>>>
>>>
>>> On Thu, Nov 5, 2009 at 1:15 PM, Jeremy Orlow <jor...@chromium.org>wrote:
>>>
>>>> On Mon, Nov 2, 2009 at 9:50 PM, John Abd-El-Malek <j...@chromium.org>wrote:
>>>>
>>>>> Over the last week, I've been making some changes to how threads are
>>>>> used in the browser process, with the goal of simplifying cross-thread
>>>>> access and improving stability.
>>>>>
>>>>> *The problem*
>>>>> We were using a number of patterns that were problematic:
>>>>>
>>>>> 1) using ChromeThread::GetMessageLoop
>>>>>   -this isn't safe, since it could return a valid pointer, but since
>>>>> the caller isn't holding on to a lock anymore, the target MessageLoop 
>>>>> could
>>>>> be destructed while it's being used
>>>>>
>>>>> 2) caching of MessageLoop pointers in order to use them later for
>>>>> PostTask and friends
>>>>>   -this was more efficient previously (more on that later) since using
>>>>> ChromeThread::GetMessageLoop involved a lock
>>>>>   -but it spread logic about the order of thread destruction all over
>>>>> the code.  Code that moved from the IO thread to the file thread and back,
>>>>> in order to avoid doing disk access on the IO thread, ended up having to 
>>>>> do
>>>>> an extra hop through the UI thread on the way back to the IO thread since
>>>>> the file thread outlives the Io thread.  Of course, most code learnt this
>>>>> the hard way, after doing the straight forward IO->file->IO thread hop and
>>>>> updating the code after seeing reliability or user crashes
>>>>>   -it made the browser shutdown fragile and hence difficult to update
>>>>>
>>>>> 3) file thread hops using RefCountedThreadSafe objects which have
>>>>> non-trivial destructors
>>>>>   -to reduce jank, frequently an object on the UI or IO thread would
>>>>> execute some code on the file thread, then post the result back to the
>>>>> original thread.  We make this easy using RefCountedThreadSafe and
>>>>> NewRunnableMethod so this pattern happened often, but it's not always
>>>>> safe: NewRunnableMethod holds an extra reference on the object to ensure
>>>>> that it doesn't invoke a method on a deleted object.  But it's quite
>>>>> possible that before the file thread's stack unwinds and it releases the
>>>>> extra reference, that the response task on the original thread executed 
>>>>> and
>>>>> released its own additional reference.  The file thread is then left with
>>>>> the remaining reference and the object gets destructed there.  While for
>>>>> most objects this is ok, many have non-trivial destructors, with the worst
>>>>> being ones that register with the per-thread NotificationService.  
>>>>> Dangling
>>>>> pointers would be left behind and tracking these crashes from ChromeBot or
>>>>> the crash dumps has wasted several days at least for me.
>>>>>
>>>>> 4) having browser code take different code paths if a thread didn't
>>>>> exist
>>>>>   -this could be either deceptively harmless  (i.e. execute
>>>>> synchronously when it was normally asynchronous), when in fact it makes
>>>>> shutdown slower because disk access is moved to the UI thread
>>>>>   -it could lead to data loss, if tasks are silently not posted because
>>>>> the code assumes this only happens in unit tests, when it could occur on
>>>>> browser shutdown as well
>>>>>
>>>>>
>>>>> *The solution*
>>>>> 1+2: Use ChromeThread::PostTask and friends (i.e. PostDelayedTask,
>>>>> DeleteSoon, ReleaseSoon) which are safe and efficient: no locks are 
>>>>> grabbed
>>>>> if the target thread is known to outlive the current thread.  The four
>>>>> static methods have the same signature as the ones from MessageLoop, with
>>>>> the addition of the first parameter to indicate the target thread.
>>>>>
>>>>> ChromeThread::PostTask(ChromeThread::FILE, FROM_HERE, task);
>>>>>
>>>>> 3: If your object must be destructed on a specific thread, use a trait
>>>>> from ChromeThread:
>>>>>
>>>>> class Foo : public base::RefCountedThreadSafe<Foo,
>>>>> ChromeThread::DeleteOnIOThread>
>>>>>
>>>>
>>>> This is really cool and a great idea, but I'm a little concerned about
>>>> what happens when the thread has already been torn down.  It seems that it
>>>> calls DeleteSoon which calls PostNonNestableTask which calls 
>>>> PostTaskHelper.
>>>>  PostTaskHelper deletes the task if the thread is shut down.
>>>>
>>>> This works fine if something is only supposed to touch 2 threads.  But
>>>> what if 2 threads simultaneously delete something which is supposed to be
>>>> deleted on a third thread that's already been shut down?
>>>>
>>>
>>> I'm not sure which object that you're referring to?  The PostTaskHelper
>>> will delete the task.  But if you have a DeleteSoon task, deleting the task
>>> (i.e. because the target thread is gone) doesn't delete the object.
>>>
>>
>> So this means the memory will just be leaked?  That won't work for DOM
>> Storage since the backing database is shut down in destructors.
>>
>
> If the object doesn't need to be closed on a specific thread, and you want
> to ensure that it's deleted on any thread, then ReleaseSoon and
> ChromeThread::DeleteOnIOThread aren't going to serve your need.  You'll
> probably want to have either a singleton or an object like
> ResourceDispatcherHost that BrowserProcess can delete directly.
>
>>
>> As for simultaneously deleting an object, if more than 1 thread are
>>> accessing an object, they should each have a reference to it.  They can't
>>> ensure that releasing their reference will cause deletion, since there could
>>> be other refences.
>>>
>>
>> Sure, but what if the ref count is not thread safe?
>>
>
> If different threads have references to an object, it should derive from
> RefCountedThreadSafe.  Doing manual ref counting on one thread and handing
> out pointers to other threads is recipe for disaster..
>
>
>>  Or they access an object that's not thread safe.  I often do this because
>> I only touch those members on specific threads, but if I
>> can't guarantee that only one thread could be deleting an object at a time,
>> things fall on the floor.
>>
>
> How exactly is your object deleted on more than one thread at a time?  Do
> you mean that one will try to delete it, and it has a lock and so the second
> will early return?  But back to the original point, what exactly is breaking
> down here?
>

You're not understanding me correctly, but I think I have my answer.  Feel
free to ping me in person if you want me to explain what's going on.


>
>
>>
>>   And what if they both call a method on another class that's not thread
>>>> safe?
>>>>
>>>
>>> If they use NewRunnableMethod on an object that's not thread safe, then
>>> the result task would only execute on the target thread.  If the task
>>> couldn't execute because the target thread is gone, then the method won't be
>>> invoked.
>>>
>>>
>>>>  I ask because this scenario is going to prevent me from using this for
>>>> DOM Storage.
>>>>
>>>> A possible solution would be to guarantee that, if the thread is dead,
>>>> destruction happens on the UI thread.  At least in this case, I think it
>>>> would work.
>>>>
>>>> Thoughts?
>>>>
>>>
>>> DeleteOnIOThread and the other variants were added specifically to
>>> enforce that an object is deleted on a specific thread.  Without this, as an
>>> example objects that use the NotificationService (which is per thread) won't
>>> be able to unregister themselves, which would lead to corruption later since
>>> the service will call out to deleted objects.  So no, we wouldn't want to
>>> delete these objects on the UI thread because the problem would still exist.
>>>
>>
>> My question was based on the assumption I laid out above.  Apparently
>> that's not true and the memory is just leaked?
>>
>>
>>>
>>>>
>>>>> 4: I've removed all the special casing and always made the objects in
>>>>> the browser code behave in one way.  If you're writing a unit test and 
>>>>> need
>>>>> to use an object that goes to a file thread (where before it would proceed
>>>>> synchronously), you just need:
>>>>>
>>>>> ChromeThread file_thread(ChromeThread::FILE, MessageLoop::current());
>>>>> foo->StartAsyncTaskOnFileThread();
>>>>> MessageLoop::current()->RunAllPending();
>>>>>
>>>>> There are plenty of examples now in the tests.
>>>>>
>>>>>
>>>>> *Gotchas*
>>>>> -PostTask will silently delete a task if the thread doesn't exist.
>>>>>  This is done to avoid having all the code that uses it have special cases
>>>>> if the target thread exists or not, and to make Valgrind happy.  As usual,
>>>>> the task for DeleteSoon/ReleaseSoon doesn't do anything in its destructor,
>>>>> so this won't cause unexpected behavior with them.  But be wary of posted
>>>>> Task objects which have non-trivial destructors or smart pointers as
>>>>> members.  I'm still on the fence about this, since while the latter is
>>>>> theoretical now, it could lead to problems in the future.  I might change 
>>>>> it
>>>>> so that the tasks are not deleted when I'm ready for more Valgrind fun.
>>>>>
>>>>> If you absolutely must know if a task was posted or not, you can check
>>>>> the return value of PostTask and friends.  But note that even if the task
>>>>> was posted successfully, there's no guarantee that it will run because the
>>>>> target thread could already have a QuitTask queued up.
>>>>>
>>>>> g_browser->io_thread() and file_thread() are still around (the former
>>>>> for IPC code, and the latter for Linux proxy code which is in net and so
>>>>> can't use ChromeThread).  Don't use them unless absolutely necessary.
>>>>>
>>>>>
>>>>> *The Catch*
>>>>> Ensuring we don't fall back into our old patterns through code you
>>>>> write and review.
>>>>>
>>>>>
>>>>> *More information*
>>>>> http://code.google.com/p/chromium/issues/detail?id=25354
>>>>>
>>>>> >>>>>
>>>>>
>>>>
>>>
>>
>

--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: chromium-dev@googlegroups.com 
View archives, change email options, or unsubscribe: 
    http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---

Reply via email to