Hi all, I've put together a proposal for implementing SharedWorkers in WebKit. The worker lifecycle issues turned out to be thornier than I originally expected, mostly because the implications of the spec aren't obvious right away (to me, anyway :)
Any feedback would be appreciated, especially for some of the cross-threading and worker lifecycle issues.

Cheers,

-atw

WebKit SharedWorker design

Shared workers (http://dev.w3.org/html5/workers/#shared-workers) are similar to the existing dedicated workers, with a few API differences.

- SharedWorkers are shared - if an application creates a SharedWorker() while there's already a non-closing instance of that worker anywhere in the browser, then it gets a reference to the existing worker thread.
- All communication is via explicit MessagePorts. SharedWorkers receive new MessagePorts via onconnect() rather than raw messages via onmessage().
- SharedWorkers have the same lifecycle as dedicated workers according to the spec. SharedWorkers use explicit MessagePorts for communication instead of implicit MessagePorts like dedicated Workers, but the issues are the same (especially since dedicated Workers can use MessagePorts as well, if entangled ports are sent to/from the worker via postMessage()). New code will be needed here, since WebKit doesn't currently implement all aspects of the worker lifecycle (it's not needed yet because sending MessagePorts to workers is not yet supported).
- SharedWorkers have explicit access to the ApplicationCache APIs, while dedicated Workers merely inherit the ApplicationCache from their parent window.

From the browser point of view, SharedWorkers are largely indistinguishable from dedicated Workers. They run in their own SharedWorkerThread with a SharedWorkerContext, both of which derive from common base classes shared with dedicated WorkerThreads/WorkerContexts. In Chrome, SharedWorkers will run in a separate process (not in the renderer process) just like dedicated Workers.

Creating SharedWorkers

The core of our support for SharedWorkers is the SharedWorkerRepository, which provides a thread-safe interface to a map whose keys are a combination of SecurityOrigin and workerName, and whose values are references to SharedWorkerContext objects. The SharedWorkerRepository is also responsible for tracking which SharedWorker objects are associated with a given SharedWorkerContext, for the purposes of sending close events when the worker shuts down.

This section describes the default WebKit implementation of the repository - Chrome will provide its own implementation whose behavior is similar, but whose internals are different because it runs in the browser process (required because it's the only way to provide the necessary cross-render-process synchronization). We define the SharedWorkerContextProxy as an interface to allow the Chrome implementation to vary - there is no similar SharedWorkerObjectProxy interface since this would only be used internally by the Repository, which will be Chrome-specific code anyway.

    class SharedWorkerRepository {
    public:
        // Does a synchronous get-or-create of a worker with the specified name.
        static SharedWorkerContextProxy* addWorker(SharedWorker* worker, SecurityOrigin* origin, const String& url, const String& name);

        // Marks a worker as closing (removes it from the map). A close event is
        // propagated to all SharedWorker objects associated with this context.
        static void workerThreadClosed(SharedWorkerThread* worker);

        // TODO: Add way to send console messages to parent window contexts?
    };

    class SharedWorkerContextProxy {
    public:
        // Sends a connect event to the worker passing this port.
        void connect(MessagePort* port);

        // Invoked when a SharedWorker object is destroyed. This causes the
        // SharedWorker to be removed from the repository.
        // If we have a close event in the queue for this worker, will that be enough
        // to keep it from being GC'd? Or is it possible for the worker to get deleted
        // while there are events queued for it?
        void workerObjectDestroyed(SharedWorker*);
    };

As noted above, the SharedWorkerRepository refers (via the SharedWorkerContextProxy interface) to the set of all SharedWorkerContext objects whose *closing* flag is false, in addition to all SharedWorker objects associated with each SharedWorkerContext.

SharedWorkerRepository::addWorker()

The SharedWorker constructor passes a pointer to the newly-created object into SharedWorkerRepository::addWorker(). This grabs the repository mutex, and then performs the following steps:

    If SharedWorkerContextProxy for passed origin/name does not exist in map:
        create new SharedWorkerContextProxy
    If SharedWorkerContextProxy has no worker thread:
        initiate code load (within current context)
    Add SharedWorker to list of objects associated with SharedWorkerContextProxy
    return SharedWorkerContextProxy*

Open issue: Do we need to do anything special in the code load step re: the ApplicationCache, to make sure we load from the most recent cache rather than from the current context's cache?

The SharedWorker constructor stores away a reference to the SharedWorkerContextProxy. It then creates a new entangled MessagePort pair, exposes one end via its *port* attribute and passes the other end into the SharedWorkerContextProxy::connect() handler, then returns to the caller.
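
To make the get-or-create steps above concrete, here is a rough standalone sketch. It is only an illustration under some assumptions: the types are simplified stand-ins rather than the real WebKit classes, std::string replaces SecurityOrigin and String, the url parameter is dropped, and the loadStarted flag is a made-up placeholder for "initiate code load".

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real WebKit types.
struct SharedWorker {};

struct SharedWorkerContextProxy {
    bool hasWorkerThread = false;              // would be set once the thread exists
    bool loadStarted = false;                  // placeholder for "initiate code load"
    std::vector<SharedWorker*> workerObjects;  // SharedWorker objects tied to this context
};

class SharedWorkerRepository {
public:
    // Synchronous get-or-create, keyed by (origin, name).
    static SharedWorkerContextProxy* addWorker(SharedWorker* worker,
                                               const std::string& origin,
                                               const std::string& name)
    {
        assert(worker);
        std::lock_guard<std::mutex> lock(mutex());  // the repository mutex
        std::unique_ptr<SharedWorkerContextProxy>& slot = map()[Key(origin, name)];
        if (!slot)
            slot = std::make_unique<SharedWorkerContextProxy>();
        if (!slot->hasWorkerThread)
            slot->loadStarted = true;               // real code would start the script load here
        slot->workerObjects.push_back(worker);
        return slot.get();
    }

private:
    using Key = std::pair<std::string, std::string>;
    using Map = std::map<Key, std::unique_ptr<SharedWorkerContextProxy>>;

    static std::mutex& mutex() { static std::mutex m; return m; }
    static Map& map() { static Map m; return m; }
};
```

Two SharedWorker objects created with the same origin and name end up with the same proxy pointer, while a different name yields a separate proxy.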

SharedWorker::notifyFinished() (code is loaded)

    When the code load is complete:
        if code load error:
            invoke MessagePort.close() on the port
            invoke app error handler directly on SharedWorker object
            call SharedWorkerContextProxy::workerObjectDestroyed() to remove association
            clear reference to SharedWorkerContextProxy
        else: // code load success
            call SharedWorkerContextProxy::scriptLoaded()

SharedWorkerContextProxy::scriptLoaded():

    Grab repository mutex
    if workerThread == null:
        create workerThread
        pass in script
        send queued up connect events

SharedWorkerContextProxy::connect()

This is responsible for sending the connect event to a given worker thread. Like WorkerMessagingProxy::postMessageToWorkerContext(), it needs to handle the case where the worker thread has not yet been created (waiting on script to load):

    Grab repository mutex
    if workerThread != null:
        send connect event to worker thread
    else:
        add connect event to queue (sent in scriptLoaded() above)

To send a connect event to the worker thread, we queue up a SharedWorkerConnectTask. This task associates the MessagePort with the worker's execution context (via MessagePort::attachToContext()) and then invokes the worker's onconnect() handler on the worker thread.

Open issue: What about console/inspector messages generated by SharedWorkers? Can we send them off to the console/inspector directly, or do we have to expose API on SharedWorkerRepository for forwarding them to a SharedWorker's document (possibly to all associated documents)? In the case of nested workers, do we have to continue to fan out these console messages? Seems like we might get loops as well if you have two shared workers referring to one another.

Closing SharedWorkers

Shared workers can be closed through various means: by becoming unreachable, through user action, or by invoking SharedWorkerContext::close().
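
Before going further into closing, here is a rough sketch of the connect()/scriptLoaded() queueing described above, since the ordering is easy to get wrong. It rests on several assumptions: MessagePort and SharedWorkerThread are simplified stand-ins, postConnectTask() is a made-up name standing in for queueing a SharedWorkerConnectTask, and a per-proxy mutex replaces the repository mutex to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

struct MessagePort { int id; };  // hypothetical stand-in

// Hypothetical worker thread that only records the ports it is handed.
struct SharedWorkerThread {
    std::string script;
    std::vector<MessagePort*> connected;
    void postConnectTask(MessagePort* port) { connected.push_back(port); }
};

class SharedWorkerContextProxy {
public:
    // Delivers the connect event now, or queues it until the script has loaded.
    void connect(MessagePort* port)
    {
        assert(port);
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_thread)
            m_thread->postConnectTask(port);
        else
            m_pendingPorts.push_back(port);
    }

    // Creates the thread on the first successful load and flushes the queue in order.
    void scriptLoaded(const std::string& script)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_thread)
            return;  // a load from another SharedWorker already started the thread
        m_thread = std::make_unique<SharedWorkerThread>();
        m_thread->script = script;
        for (MessagePort* port : m_pendingPorts)
            m_thread->postConnectTask(port);
        m_pendingPorts.clear();
    }

    // Unsynchronized accessors, for illustration and testing only.
    const SharedWorkerThread* thread() const { return m_thread.get(); }
    std::size_t pendingCount() const { return m_pendingPorts.size(); }

private:
    std::mutex m_mutex;  // stand-in for the repository mutex
    std::unique_ptr<SharedWorkerThread> m_thread;
    std::vector<MessagePort*> m_pendingPorts;
};
```

Ports passed to connect() before scriptLoaded() are held in the queue and delivered in arrival order once the thread exists; later ports go straight to the thread.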
When a worker is closed by the worker itself calling close(), it is first disassociated from the repository by invoking SharedWorkerRepository::workerThreadClosed(), which grabs the repository mutex and performs the following actions:

    Get SharedWorkerContextProxy associated with WorkerThread
    For each SharedWorker associated with this SharedWorkerContextProxy object:
        Queue up close event (SharedWorker->scriptExecutionContext()->postTask())
    Remove SharedWorkerContextProxy from the map

This ensures that all existing SharedWorker objects receive the proper close notifications, but that no new SharedWorker objects are associated with the SharedWorkerContext.

Open issue: Is it OK for the repository to maintain explicit pointers to objects like SharedWorker and send events via workerObj->scriptExecutionContext()->postTask()? Is there a safer way to do this (say, via some kind of wrapper, à la WorkerMessagingProxy)?

At this point, the SharedWorkerContext is left to manage its own demise, by queueing a task that fires a close event at the worker global scope. Once the close event has been fired, WorkerRunLoop.terminate() is invoked to drop all remaining tasks for the worker and cause the thread to exit, freeing the SharedWorkerContext.

Open issue: The "kill a worker" algorithm described in section 4.6 of the WebWorkers spec suggests that timeouts may be imposed by the UserAgent for the close() handler as well as for any tasks that are executing before the close task is executed. How can we enforce these timeouts by aborting currently executing script?

Reuse/Refactoring of existing dedicated Worker code

Both WorkerThread and WorkerRunLoop can be re-used nearly entirely - we'll need to refactor out the code in WorkerThread that deals with "PendingActivity" (since we don't care about that for SharedWorkers) and create a factory method for creating the WorkerContext, but the rest of the code should work largely verbatim.
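
As a rough illustration of the factory method idea, the sketch below has a shared start() path in WorkerThread, with only context creation deferred to subclasses. Everything here is a simplified assumption rather than the existing WebKit code: createWorkerContext(), start(), kind(), DedicatedWorkerThread and DedicatedWorkerContext are made-up names, and BaseWorkerContext stands for the common context base class.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical common base for both kinds of worker context.
class BaseWorkerContext {
public:
    virtual ~BaseWorkerContext() = default;
    virtual std::string kind() const = 0;
};

class DedicatedWorkerContext : public BaseWorkerContext {
public:
    std::string kind() const override { return "dedicated"; }
};

class SharedWorkerContext : public BaseWorkerContext {
public:
    explicit SharedWorkerContext(std::string name) : m_name(std::move(name)) {}
    std::string kind() const override { return "shared"; }
    const std::string& name() const { return m_name; }

private:
    std::string m_name;
};

class WorkerThread {
public:
    virtual ~WorkerThread() = default;

    // Shared startup path: only the context creation differs per subclass.
    BaseWorkerContext* start()
    {
        assert(!m_context);  // start() is expected to run once
        m_context = createWorkerContext();
        return m_context.get();
    }

protected:
    // The factory method each thread type overrides.
    virtual std::unique_ptr<BaseWorkerContext> createWorkerContext() = 0;

private:
    std::unique_ptr<BaseWorkerContext> m_context;
};

class DedicatedWorkerThread : public WorkerThread {
protected:
    std::unique_ptr<BaseWorkerContext> createWorkerContext() override
    {
        return std::make_unique<DedicatedWorkerContext>();
    }
};

class SharedWorkerThread : public WorkerThread {
public:
    explicit SharedWorkerThread(std::string name) : m_name(std::move(name)) {}

protected:
    std::unique_ptr<BaseWorkerContext> createWorkerContext() override
    {
        return std::make_unique<SharedWorkerContext>(m_name);
    }

private:
    std::string m_name;
};
```

With this shape, the run loop and startup code in WorkerThread never need to know which kind of context they are driving.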
Most of WorkerContext should be common between shared and dedicated workers - there are a few APIs (like postMessage() and dispatchMessage()) that aren't needed for SharedWorkers, so we'll create a common base class that contains the base functionality and support for items in WorkerGlobalScope, without any of the dedicated-specific or shared-specific functionality.

    class SharedWorkerContext : public BaseWorkerContext {
    public:
        // Support for specific items in SharedWorkerGlobalScope
        String name() const;
        void setOnconnect(PassRefPtr<EventListener> eventListener) { m_onconnectListener = eventListener; }
        EventListener* onconnect() const { return m_onconnectListener.get(); }

        // TODO: Add applicationCache functionality
    };

Worker Lifecycle

On the DOM side, the SharedWorker object should remain live as long as it's reachable by JavaScript, or the SharedWorkerRepository holds a reference to it (the SharedWorkerRepository releases the reference once SharedWorkerRepository::workerThreadClosed() is invoked for the associated SharedWorkerContext). As a future optimization, we could probably GC the object earlier if it has no close event handler, but we won't do that initially.

The current dedicated Worker code keys the reachability of the worker thread to the reachability of the parent Worker object itself - the Worker object destructor calls WorkerContextProxy::workerObjectDestroyed(), which terminates the worker thread (no close event is currently generated). The current dedicated Worker implementation will suffice for dedicated Workers until we support posting MessagePorts to the dedicated worker thread. At that point we should probably change the dedicated worker code to use a real MessagePort behind the scenes, and use the same MessagePort-reachability mechanism that SharedWorkers will use.

Worker Lifecycle spec explained in non-normative language

The spec describes 3 states for workers: permissible, active needed, or suspendable.
Only workers that are active needed should be able to execute. Suspendable workers should be suspended. All other workers should be closed.

Note: The HTML5 spec refers to a 4th *protected* worker state, but I believe this to be unnecessary - I'm working with Ian Hickson to clarify this.

Permissible

The spec specifies that a worker is *permissible* based on whether it has a reachable MessagePort that has been entangled *at some point in the past* with an active window (or with a worker that is itself permissible). Basically, if a worker has *ever* been entangled with an active window, or if it's ever been entangled with a worker that is itself permissible (i.e. it's associated with an active window via a chain of workers that have been entangled at some point in the past) then it's permissible.

The reason why the "at some point in the past" language is present is to allow a page to create a fire-and-forget worker (for example, a worker that does a set of long network operations) without having to keep a reference to that worker around. Once the referent windows close, the worker should also close, as being permissible is a necessary (but not sufficient) criterion for being runnable.

Active needed

A permissible worker is *active needed* if:

1. it has pending timers/network requests/DB activity, or
2. it is currently entangled with an active window, or another active needed worker.

The intent behind #1 is to enable fire-and-forget workers that don't exit until they are idle. The intent behind #2 is that an idle worker shouldn't exit as long as it's reachable by an active window (possibly chained through other workers).

Suspendable

A *suspendable* worker is entangled with a non-active window object, or is entangled with another suspendable worker. When a worker is suspendable it should stop running (stop processing events) until it returns to the active needed state.

How do we handle this with dedicated workers currently?
If you navigate away from a window, does a dedicated worker get suspended, and resumed again when you hit the back button?

Tracking permissible state

We will create a global map (protected by a mutex) keyed by WorkerContext whose value is a set of active Documents associated with that WorkerContext. When a Document has a port which becomes entangled with a WorkerContext, we add that Document to the list of documents associated with that WorkerContext in the map. When a Document becomes inactive, we remove it from the map. When a WorkerContext becomes entangled with another WorkerContext, the two sets of associated Documents are merged, and the combined set is used for each context - in effect, both workers inherit the Window associations of the other.

When a document closes, we walk the map and remove each reference to the document. If a given WorkerContext has no more items, then the worker is no longer permissible and we should close it.

Tracking active needed state

A worker is *active needed* if it's permissible and has pending activity, or is reachable via a chain of MessagePorts from an active window or worker. If we view the set of ScriptExecutionContexts linked by MessagePorts as a graph, then a worker is reachable if its subgraph is connected to an active window.

Determining whether a worker is active only requires a simple breadth-first search of this graph, triggered when a WorkerContext has one of its ports unentangled (currently the owning ScriptExecutionContext is not notified when a port is unentangled, so we'll need to add code to generate this notification). When the WorkerContext receives this notification, it can grab a global mutex and traverse the graph - if the WorkerContext is no longer connected to an active window it can initiate a close. When the WorkerContext is closed, its own MessagePorts will be unentangled, which will cascade to cause any related Workers to be shut down as appropriate.
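
The breadth-first search above could look roughly like the sketch below. It is an illustration under stated assumptions, not a proposed implementation: ContextNode, entangle(), unentangle() and isConnectedToActiveWindow() are all made-up names, each node stands for one ScriptExecutionContext, and the global mutex and the "pending activity" half of the active needed test are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <queue>
#include <unordered_set>
#include <vector>

// Hypothetical node: one ScriptExecutionContext (a window or a worker).
struct ContextNode {
    bool isActiveWindow = false;
    std::vector<ContextNode*> entangledWith;  // contexts linked by an entangled port pair
};

// Adds an undirected edge for a newly entangled port pair.
inline void entangle(ContextNode& a, ContextNode& b)
{
    a.entangledWith.push_back(&b);
    b.entangledWith.push_back(&a);
}

// Removes one edge in each direction when a port pair is unentangled.
inline void unentangle(ContextNode& a, ContextNode& b)
{
    auto dropOne = [](std::vector<ContextNode*>& edges, ContextNode* target) {
        auto it = std::find(edges.begin(), edges.end(), target);
        if (it != edges.end())
            edges.erase(it);
    };
    dropOne(a.entangledWith, &b);
    dropOne(b.entangledWith, &a);
}

// Breadth-first search from start; true if any active window is reachable.
inline bool isConnectedToActiveWindow(const ContextNode* start)
{
    assert(start);
    std::unordered_set<const ContextNode*> visited{start};
    std::queue<const ContextNode*> frontier;
    frontier.push(start);
    while (!frontier.empty()) {
        const ContextNode* node = frontier.front();
        frontier.pop();
        if (node->isActiveWindow)
            return true;
        for (const ContextNode* next : node->entangledWith) {
            if (visited.insert(next).second)  // true only the first time we see next
                frontier.push(next);
        }
    }
    return false;
}
```

In this model, a worker chained to a window through another worker is reachable until the link to the window is unentangled, at which point both workers in the chain become candidates for closing. The visited set also keeps the search from looping when two workers refer to one another.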
_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev