Hi all, I've put together a design doc for my upcoming SharedWorker implementation. The original lives here:
http://docs.google.com/a/chromium.org/Doc?id=dgfs7fcc_0f32q7xdd and I'll migrate it to dev.chromium.org once it's less draft-y.

I'd appreciate any feedback - there are a few highlighted open issues (how do we report worker exceptions, how do we report unexpected worker process termination) that I'd be particularly interested in getting some input on.

Thanks,
-atw

-=-=-

Chromium SharedWorkers design

This document describes the Chromium implementation of SharedWorkers. A useful background document is the WebKit SharedWorkers <Doc?docid=0AaSJ7ekxGiStZGNuazV2MnZfMThjYjI0M2hnNw&hl=en> design document, which goes into more depth about the differences between SharedWorkers and dedicated Workers and the underlying WebKit implementation/interfaces. Additionally, the HTML5 Web Workers <http://www.whatwg.org/specs/web-workers/current-work/#shared-workers> specification contains the formal definition of SharedWorker behavior.

Overview

SharedWorkers <http://www.whatwg.org/specs/web-workers/current-work/#shared-workers> are similar to dedicated workers, but with a simplified interface and lifecycle definition. SharedWorkers do not need their own messaging implementations - instead, they use MessagePorts to communicate with their parents. Likewise, SharedWorkers do not need to report their pending activity back to their parents to enable garbage collection of unreferenced workers - instead, the lifecycle of SharedWorkers depends solely on the lifecycle of their parent documents and not on the reachability of the associated SharedWorker objects. Ultimately, we will adopt this same lifecycle mechanism for all workers, because the current "reachability" mechanism used for dedicated workers starts to break down when you have nested workers.

The WebKit page-context code interacts with SharedWorkers via the SharedWorkerRepository class, which has a very simple interface:

    // Connects the passed SharedWorker object with the specified worker thread, creating a new thread if necessary.
    static void connect(PassRefPtr<SharedWorker>, PassOwnPtr<MessagePortChannel>, const KURL&, const String& name, ExceptionCode&);

    // Invoked when a document has been detached.
    static void documentDetached(Document*);

    // Returns true if the passed document is associated with any SharedWorkers.
    static bool hasSharedWorkers(Document*);

SharedWorker threads interact with the system through WorkerReportingProxy and WorkerLoaderProxy objects, just like the current dedicated Worker objects do.

SharedWorkerRepositoryImpl

The WebKit code is structured to allow different ports (like Chrome) to supply their own implementations of the SharedWorkerRepository interface (it consists only of a set of static functions). The Chromium implementation will split the SharedWorkerRepository implementation into two pieces: a SharedWorkerRepositoryImpl singleton in each renderer process, and the WorkerService, which runs in the browser process, controls the lifecycle of all SharedWorkers, and ensures that only a single instance of a given SharedWorker exists at any time.

SharedWorkerRepositoryImpl::hasSharedWorkers()

This is invoked by the FrameLoader code in WebKit to determine whether a given page is cacheable (for the purposes of the user clicking the back button) - essentially, a page is not cacheable if it has ever created a SharedWorker. Ideally, when all of a SharedWorker's pages are in the cache, we would suspend the worker, then resume it if the user goes back to the page. Since we cannot suspend workers, we treat Documents that have associated SharedWorkers as uncacheable.

This is implemented by keeping a HashMap of every document in the current process that has called SharedWorkerRepository::connect(), then returning true if the map contains an entry for a given document. When a document is detached (documentDetached() is called) we can remove it from the map.
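A minimal sketch of this bookkeeping, assuming a std::map keyed by the document's address in place of WebKit's HashMap. The class DocumentTracker, its method names, and the empty Document stand-in are hypothetical, for illustration only; the per-document ID counter is the document_id described under connect().

```cpp
#include <cassert>
#include <map>

// Stand-in for WebCore::Document; only its address is used as a key.
struct Document {};

// Hypothetical per-renderer-process tracker (illustrative names only).
class DocumentTracker {
 public:
  DocumentTracker() : next_document_id_(1) {}

  // Called from connect(): records the document and returns its document_id,
  // taking a fresh ID from the counter the first time the document is seen.
  int NoteConnect(const Document* doc) {
    assert(doc);
    std::map<const Document*, int>::iterator it = ids_.find(doc);
    if (it != ids_.end())
      return it->second;
    int id = next_document_id_++;
    ids_[doc] = id;
    return id;
  }

  // Backs hasSharedWorkers(): true while the document has an entry.
  bool HasSharedWorkers(const Document* doc) const {
    return ids_.count(doc) != 0;
  }

  // Called from documentDetached(): removes the entry and returns its
  // document_id, or 0 if the document was never tracked.
  int NoteDetached(const Document* doc) {
    std::map<const Document*, int>::iterator it = ids_.find(doc);
    if (it == ids_.end())
      return 0;
    int id = it->second;
    ids_.erase(it);
    return id;
  }

 private:
  std::map<const Document*, int> ids_;
  int next_document_id_;
};
```

A non-zero return from NoteDetached() is the case where the browser process would need to be told about the detach; a zero return means the document never used SharedWorkers and nothing further is required.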

SharedWorkerRepositoryImpl::documentDetached()

When a document is detached, we update our "hasSharedWorkersHashMap" as described previously. If the Document was in the hasSharedWorkersHashMap, we also notify the browser process via an IPC (passing the associated document_id - see below) so the WorkerService can shut down any associated workers.

SharedWorkerRepositoryImpl::connect()

Invoked when the page script creates a SharedWorker via the SharedWorker() constructor. This makes a blocking ConnectToSharedWorker IPC to the SharedWorkerRepositoryHost to see if the SharedWorker already exists, and to validate the parameters (the HTML5 spec requires that we throw an exception if the client passes the wrong URL to a named SharedWorker, so this has to be a blocking call). The ConnectToSharedWorker IPC will have the following parameters:

    int message_port_route_id
    int document_id
    std::string script_url
    std::string worker_name
    std::string script_data  // Always empty during the initial call

The message_port_route_id is the ID associated with the MessagePort (generated when the port was created by the WebKit code). The document_id is a unique ID (unique to a given process) associated with the document - this ID is generated by a counter each time a new entry is added to the hasSharedWorkersHashMap.

If the SharedWorker exists already, we can just return to the caller and all future communication will go through the entangled MessagePort.

If the SharedWorker does not exist, the SharedWorkerRepositoryHost creates a placeholder (more on this below), and returns a NOT_FOUND flag to the SharedWorkerRepositoryImpl. The Impl singleton creates a SharedWorkerScriptLoader instance and kicks off a load for the worker script in the context of the parent page, then returns to the caller.
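To make the renderer-side flow concrete, here is a rough sketch of the IPC parameters and of how connect() might react to each reply. NOT_FOUND and URL_MISMATCH are the flags named in this document; CONNECTED, the ConnectAction enum, and both helper functions are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <string>

// Parameters of the ConnectToSharedWorker IPC, as listed above.
struct ConnectToSharedWorkerParams {
  int message_port_route_id;
  int document_id;
  std::string script_url;
  std::string worker_name;
  std::string script_data;  // Always empty during the initial call.
};

// NOT_FOUND and URL_MISMATCH come from the design; CONNECTED is an assumed
// name for the "worker already exists" reply.
enum ConnectResult { CONNECTED, NOT_FOUND, URL_MISMATCH };

// Hypothetical names for what connect() does after the blocking IPC returns.
enum ConnectAction { RETURN_TO_CALLER, START_SCRIPT_LOAD, THROW_EXCEPTION };

// Builds the parameters for the initial call, which carries no script.
ConnectToSharedWorkerParams MakeInitialParams(int message_port_route_id,
                                              int document_id,
                                              const std::string& script_url,
                                              const std::string& worker_name) {
  ConnectToSharedWorkerParams params;
  params.message_port_route_id = message_port_route_id;
  params.document_id = document_id;
  params.script_url = script_url;
  params.worker_name = worker_name;
  // script_data is left empty; it is only filled in after the script loads.
  return params;
}

// Maps the browser's reply to the renderer's next step.
ConnectAction ActionForResult(ConnectResult result) {
  switch (result) {
    case CONNECTED:
      return RETURN_TO_CALLER;   // Traffic now flows over the MessagePort.
    case NOT_FOUND:
      return START_SCRIPT_LOAD;  // Create a SharedWorkerScriptLoader.
    case URL_MISMATCH:
      return THROW_EXCEPTION;    // The constructor throws in page context.
  }
  assert(false);  // Unreachable for valid enum values.
  return THROW_EXCEPTION;
}
```

Because the exception has to be raised synchronously from the SharedWorker() constructor, the reply must be available before connect() returns, which is why the IPC is blocking.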

SharedWorkerScriptLoader

SharedWorkerScriptLoader has two purposes: it initiates the load for the worker script, and it keeps the parent SharedWorker object from being GC'd until the script load has completed, to enable onerror() callbacks to be invoked. The SharedWorkerScriptLoader uses WorkerScriptLoader to accomplish the script loading. Once script loading is complete, it either generates a load error (via worker->dispatchLoadErrorEvent()) or passes off the script to the SharedWorkerRepositoryHost by sending another ConnectToSharedWorker IPC, this time with script_data filled in.

WorkerService changes

The WorkerService is currently responsible for starting up and shutting down dedicated workers, queueing up workers when the process limit has been reached, and cleaning up when processes prematurely exit. We will extend this class to also keep track of all currently running SharedWorker threads, and it will be responsible for managing the lifecycle of SharedWorkers (ensuring that multiple requests for the same SharedWorker are mapped to a single instance of that SharedWorker, and tracking when the parent documents for a SharedWorker have detached so we can shut down the thread). Parent Documents are tracked by the pair consisting of the document_id and the parent process ID.

Most information about workers is currently kept with the WorkerProcessHost object in its instances_ list (list of WorkerInstance structs).
WorkerInstance already contains most of the data needed to interact with a worker (worker process ID, worker route id, script url) - we'll extend WorkerInstance to contain information needed by SharedWorkers as well:

    std::string name;
    std::list<std::pair<int, int> > document_set;  // List of parent_process_id/parent_document_id pairs

The WorkerService will handle the following IPCs sent from the renderer processes:

    ConnectToSharedWorker(int message_port_route_id, int document_id, std::string script_url, std::string worker_name, std::string script_data)
    DocumentDetached(int document_id)

The WorkerService also handles a WorkerContextClosed() IPC from the worker process to notify it when the worker has been closed (SharedWorkerContext::close() is invoked) so it can remove its references to this worker.

ConnectToSharedWorker

The first step is to look up the origin of the passed script_url and worker_name in the instances_ list to see if a matching SharedWorker already exists. At this point, one of several actions is taken:

- If an entry already exists for this name/origin, and the URL parameter matches, and there is already a routing_id for this worker (meaning the thread has already been started), we add the passed document_id/process_id pair to the WorkerInstance.document_set and send a Connect message to the worker process with the message_port_route_id to generate a connect event with an entangled MessagePort.
- If a matching entry already exists, but there is no routing id (the thread has not started yet) then we add the passed document to the WorkerInstance's document_set. We then check to see whether the client has passed up script_data. If so, then we start up a worker thread (passing in the script), and fire off a Connect message to the new thread with the message_port_route_id to generate a connect event with an entangled MessagePort.
  If the client has not passed up script_data, we return NOT_FOUND to the caller to cause the caller to load script for this worker (since there's already a WorkerInstance, another document is already trying to load script for this worker, but that attempt may fail or the other document may be closed before the load completes, so we issue a parallel load request).
- If an entry exists in the repository for this name/origin, but the URL does not match, then we return URL_MISMATCH to the caller to cause the SharedWorker constructor in page context to generate an exception.
- If an entry does not exist in the repository, then a new placeholder entry is created - this is necessary to arbitrate conflicts if two pages try to create different workers with the same name simultaneously; one has to fail. If script_data is passed up, we fire off a new worker and send off a Connect message to it. Otherwise, we return NOT_FOUND to the caller to cause the script to be loaded.

DocumentDetached

When the browser process receives a notification that a document has been detached, it walks through all of the WorkerInstances and removes the specified document (the document_id paired with the sending process's ID) from their document_set. If the document_set for a worker becomes empty as the result of this operation, we remove the WorkerInstance from the instances_ list and send a Close message to the shared worker to cause it to shut down.

(Question: should we remove the WorkerInstance from the instances_ list immediately, or should we just mark it as "closing" so we don't connect to it any more and wait until the worker context is destroyed before removing it? Is there a problem with removing the WorkerInstance while the worker is technically still alive?)

WorkerContextClosed

The worker can close itself by calling WorkerGlobalScope::close() - when this happens the SharedWorkerContext::close() handler notifies the WorkerInstance via its WorkerReportingProxy.
When receiving WorkerContextClosed for a worker, the WorkerService removes the WorkerInstance from the instances_ list (same question as above - is this OK, or do we need to keep the WorkerInstance around until the context actually is destroyed, rather than just closed?).

Note that there is inherently a race condition in the Web Workers spec (a document can connect to a shared worker, but the shared worker can invoke close() before the connect event ever arrives) - addressing this is left up to the webapp (I presume using timeouts/retries if the worker becomes unresponsive) and is outside the scope of our implementation.

Handling unexpected termination

The WorkerService is notified when a worker or a tab process prematurely exits by listening for RESOURCE_MESSAGE_FILTER_SHUTDOWN and WORKER_PROCESS_HOST_SHUTDOWN notifications. When a worker process exits, any associated WorkerInstance objects will be removed from the repository. In the case that a tab process exits, we will walk all of the WorkerInstances and remove any documents from that process from their document_set, removing those workers from the repository and closing them as appropriate.

Question: If a worker process terminates, how should we notify the user? For dedicated workers, we display an info bar on the parent document's tab. What should we do for SharedWorkers - should we spam every parent document with info bars? Maybe just pick one? We currently just have document ids - we probably also need to track render_view_route_ids for every parent document if we need to display info bars.

Limits

We will continue to have a per-tab limit on the number of workers, using the existing WorkerService code. The code will need to queue up shared worker create requests (as it currently does) and also queue up connect requests for the workers (destroying queued connect requests if the sender exits, similar to how dedicated worker create requests are managed).
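The ConnectToSharedWorker cases and the document_set cleanup described in the sections above can be sketched together as follows. This is only an illustration under several assumptions: the origin is passed in by the caller rather than derived from script_url, a `started` flag stands in for "has a routing id", the requesting document is recorded in a new placeholder's document_set, and a worker is removed as soon as its document_set empties (one side of the open question above). SharedWorkerTable, SharedWorkerEntry, and CONNECTED are hypothetical names, and std::set replaces the std::list proposed for WorkerInstance.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

enum ConnectResult { CONNECTED, NOT_FOUND, URL_MISMATCH };
typedef std::pair<int, int> DocumentKey;  // (parent_process_id, document_id)

struct SharedWorkerEntry {
  std::string origin, name, url;
  bool started;  // Stands in for "a routing id has been assigned".
  std::set<DocumentKey> document_set;
};

class SharedWorkerTable {
 public:
  ConnectResult Connect(int process_id, int document_id,
                        const std::string& origin, const std::string& url,
                        const std::string& name,
                        const std::string& script_data) {
    SharedWorkerEntry* entry = Find(origin, name);
    if (entry && entry->url != url)
      return URL_MISMATCH;  // Same name/origin, different script URL.
    if (!entry) {
      // No entry yet: create the placeholder that arbitrates name conflicts.
      SharedWorkerEntry placeholder;
      placeholder.origin = origin;
      placeholder.name = name;
      placeholder.url = url;
      placeholder.started = false;
      entries_.push_back(placeholder);
      entry = &entries_.back();
    }
    assert(entry);
    entry->document_set.insert(DocumentKey(process_id, document_id));
    if (!entry->started) {
      if (script_data.empty())
        return NOT_FOUND;     // Caller loads the script and calls again.
      entry->started = true;  // Real code would launch the worker thread.
    }
    return CONNECTED;  // Real code would send Connect to the worker here.
  }

  // Both return the number of workers removed because no documents remain.
  std::size_t DocumentDetached(int process_id, int document_id) {
    return RemoveDocuments(process_id, true, document_id);
  }
  std::size_t RendererExited(int process_id) {
    return RemoveDocuments(process_id, false, 0);
  }
  std::size_t size() const { return entries_.size(); }

 private:
  SharedWorkerEntry* Find(const std::string& origin, const std::string& name) {
    for (std::size_t i = 0; i < entries_.size(); ++i) {
      if (entries_[i].origin == origin && entries_[i].name == name)
        return &entries_[i];
    }
    return NULL;
  }

  std::size_t RemoveDocuments(int process_id, bool match_document,
                              int document_id) {
    std::size_t closed = 0;
    for (std::size_t i = 0; i < entries_.size();) {
      std::set<DocumentKey>& docs = entries_[i].document_set;
      bool removed_any = false;
      for (std::set<DocumentKey>::iterator it = docs.begin();
           it != docs.end();) {
        if (it->first == process_id &&
            (!match_document || it->second == document_id)) {
          docs.erase(it++);
          removed_any = true;
        } else {
          ++it;
        }
      }
      if (removed_any && docs.empty()) {
        entries_.erase(entries_.begin() + i);  // Real code would send Close.
        ++closed;
      } else {
        ++i;
      }
    }
    return closed;
  }

  std::vector<SharedWorkerEntry> entries_;
};
```

Treating a tab process exit as "detach every document from that process" lets both paths share one removal routine, which is why DocumentDetached() and RendererExited() differ only in whether the document_id is matched.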

Error Reporting

The WebKit implementation broadcasts any uncaught SharedWorker exceptions to every document in the worker's document_set. We could also do this for Chromium, but it seems like it might be better to interact directly with the developer tab (write exceptions directly to the console a single time, rather than logging them multiple times via individual windows). Can someone point me in the right direction here (how to log exceptions to the console from worker context, if possible)?

Launching workers/code re-use

The SharedWorker threads are not particularly different from existing dedicated worker threads. The only new functionality is the "connect" event fired when a new client connects; otherwise the SharedWorker functionality is a subset of the dedicated worker functionality, and much of it (loading, shutdown) can be reused. I'll use the existing classes as a base class, and override the few APIs where shared workers are different (primarily error reporting). The WorkerService will send a StartSharedWorkerContext message to the WebWorkerClientProxy rather than a StartWorkerContext to ensure the proper type of WebCore::WorkerContext is created.

--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: chromium-dev@googlegroups.com
View archives, change email options, or unsubscribe: http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---