[chromium-dev] Chromium SharedWorker design doc

Drew Wilson Wed, 09 Sep 2009 13:27:47 -0700

Hi all,
I've put together a design doc for my upcoming SharedWorker implementation.
The original lives here:


http://docs.google.com/a/chromium.org/Doc?id=dgfs7fcc_0f32q7xdd

and I'll migrate it to dev.chromium.org once it's less draft-y.

I'd appreciate any feedback - there are a few highlighted open issues (how
do we report worker exceptions, how do we report unexpected worker process
termination) that I'd be particularly interested in getting some input on.

Thanks,

-atw

-=-=-
Chromium SharedWorkers design
This document describes the Chromium
implementation of SharedWorkers. A useful background document is the WebKit
SharedWorkers <Doc?docid=0AaSJ7ekxGiStZGNuazV2MnZfMThjYjI0M2hnNw&hl=en> design
document, which goes into more depth about the differences between
SharedWorkers and dedicated Workers and the underlying WebKit
implementation/interfaces. Additionally, the HTML5 Web Workers
specification
<http://www.whatwg.org/specs/web-workers/current-work/#shared-workers>
contains the formal definition of SharedWorker behavior.

Overview
SharedWorkers
<http://www.whatwg.org/specs/web-workers/current-work/#shared-workers> are
similar to dedicated workers, but with a simplified interface and lifecycle
definition. SharedWorkers do not need their own messaging implementations -
instead, they use MessagePorts to communicate with their parents. Likewise,
SharedWorkers do not need to report their pending activity back to their
parents to enable garbage collection of unreferenced workers - instead, the
lifecycle of SharedWorkers depends solely on the lifecycle of their parent
documents and not on the reachability of the associated SharedWorker
objects. Ultimately, we will adopt this same lifecycle mechanism for all
workers, because the current "reachability" mechanism used for dedicated
workers starts to break down when you have nested workers.

The WebKit page-context code interacts with SharedWorkers via the
SharedWorkerRepository class, which has a very simple interface:

        // Connects the passed SharedWorker object with the specified worker
        // thread, creating a new thread if necessary.
        static void connect(PassRefPtr<SharedWorker>,
                            PassOwnPtr<MessagePortChannel>, const KURL&,
                            const String& name, ExceptionCode&);

        // Invoked when a document has been detached.
        static void documentDetached(Document*);

        // Returns true if the passed document is associated with any
        // SharedWorkers.
        static bool hasSharedWorkers(Document*);

SharedWorker threads interact with the system through WorkerReportingProxy
and WorkerLoaderProxy objects, just like the current dedicated Worker
objects do.

SharedWorkerRepositoryImpl
The WebKit code is structured to allow different
ports (like Chrome) to supply their own implementations of the
SharedWorkerRepository interface (it consists only of a set of static
functions). The Chromium implementation will split the
SharedWorkerRepository implementation into two pieces: multiple
SharedWorkerRepositoryImpl instances, one of which runs in each of the
renderer processes as a singleton, and the WorkerService which runs in the
browser process and controls the lifecycle of all SharedWorkers, and ensures
that only a single instance of a given SharedWorker exists at any time.

SharedWorkerRepositoryImpl::hasSharedWorkers()
This is invoked by the FrameLoader code in WebKit to determine whether a
given page is cacheable (for the purposes of the user clicking the back
button). Ideally, when all of a SharedWorker's pages are in the cache, we
would suspend the worker, then resume it if the user goes back to the page.
Since we cannot suspend workers, we treat Documents that have associated
SharedWorkers as uncacheable: a page is not cacheable if it has ever created
a SharedWorker.

This is implemented by just keeping a HashMap of every document in the
current process that has called SharedWorkerRepository::connect(), then
returning true if the map contains an entry for a given document. When a
document is detached (documentDetached() is called) we can remove it from
the map.

SharedWorkerRepositoryImpl::documentDetached()
When a document is detached, we update our "hasSharedWorkersHashMap" as
described previously. If the Document was in the hasSharedWorkersHashMap, we
also notify the browser process via an IPC (passing the associated
document_id - see below) so the WorkerService can shut down any associated
workers.
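
As a rough illustration of the bookkeeping described in the two sections above, here is a minimal sketch. The class name SharedWorkerDocumentTracker, the noteConnect() method and the empty Document struct are made up for this example and are not existing WebKit or Chromium code; it also assumes a modern C++ compiler. The document_id counter is the one described in the connect() section.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-in for WebCore::Document; only its address is used.
struct Document {};

// Sketch of the per-renderer map of documents that have called connect().
class SharedWorkerDocumentTracker {
public:
    // Records the document on its first connect() and returns its
    // process-unique document_id; later calls return the same ID.
    int noteConnect(const Document* doc) {
        assert(doc);
        auto it = ids_.find(doc);
        if (it != ids_.end())
            return it->second;
        int id = next_id_++;
        ids_.emplace(doc, id);
        return id;
    }

    // Mirrors hasSharedWorkers(): true while the document is in the map.
    bool hasSharedWorkers(const Document* doc) const {
        return ids_.count(doc) != 0;
    }

    // Mirrors documentDetached(): removes the entry and returns the
    // document_id to send to the browser process, or -1 if the document
    // never connected (in which case no IPC is needed).
    int documentDetached(const Document* doc) {
        auto it = ids_.find(doc);
        if (it == ids_.end())
            return -1;
        int id = it->second;
        ids_.erase(it);
        return id;
    }

private:
    std::unordered_map<const Document*, int> ids_;
    int next_id_ = 1;
};
```

The -1 return value is only a convenience of this sketch; the real code could just as well return a bool plus an out parameter.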

SharedWorkerRepositoryImpl::connect()
Invoked when the page script creates a SharedWorker via the SharedWorker()
constructor. This makes a blocking ConnectToSharedWorker IPC to the
SharedWorkerRepositoryHost to see if the SharedWorker already exists, and to
validate the parameters (the HTML5 spec requires that we throw an exception
if the client passes the wrong URL to a named SharedWorker, so this has to
be a blocking call).

The ConnectToSharedWorker IPC will have the following parameters:

int message_port_route_id
int document_id
std::string script_url
std::string worker_name
std::string script_data   // Always empty during the initial call

The message_port_route_id is the ID associated with the MessagePort
(generated when the port was created by the WebKit code). The document_id is
a unique ID (unique to a given process) associated with the document - this
ID is generated by a counter each time a new entry is added to the
hasSharedWorkersHashMap.

If the SharedWorker exists already, we can just return to the caller and all
future communication will go through the entangled MessagePort. If the
SharedWorker does not exist, the SharedWorkerRepositoryHost creates a
placeholder (more on this below), and returns a NOT_FOUND flag to the
SharedWorkerRepositoryImpl. The Impl singleton creates a
SharedWorkerScriptLoader instance and kicks off a load for the worker script
in the context of the parent page, then returns to the caller.
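
The renderer-side flow above might look roughly like the sketch below. The ConnectParams struct mirrors the IPC parameter list; the ConnectResult enum, the two std::function hooks (standing in for the blocking IPC and for SharedWorkerScriptLoader) and the function name connectFromRenderer are assumptions made for illustration, not existing APIs. URL_MISMATCH is the result described in the WorkerService section.

```cpp
#include <cassert>
#include <functional>
#include <string>

enum class ConnectResult { CONNECTED, NOT_FOUND, URL_MISMATCH };

// Mirrors the ConnectToSharedWorker IPC parameters.
struct ConnectParams {
    int message_port_route_id;
    int document_id;
    std::string script_url;
    std::string worker_name;
    std::string script_data;  // Always empty during the initial call
};

// Hypothetical hooks standing in for the blocking IPC and the script loader.
typedef std::function<ConnectResult(const ConnectParams&)> SendConnectIpc;
typedef std::function<void(const std::string& url)> StartScriptLoad;

// Returns false if the SharedWorker constructor should raise an exception.
bool connectFromRenderer(const ConnectParams& params,
                         const SendConnectIpc& send,
                         const StartScriptLoad& load) {
    assert(params.script_data.empty());  // initial call carries no script
    switch (send(params)) {
    case ConnectResult::CONNECTED:
        return true;  // worker exists; the MessagePort takes over from here
    case ConnectResult::NOT_FOUND:
        load(params.script_url);  // kick off the load, then return to caller
        return true;
    case ConnectResult::URL_MISMATCH:
        return false;
    }
    return false;
}
```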

SharedWorkerScriptLoader
SharedWorkerScriptLoader has two purposes: it
initiates the load for the worker script, and it keeps the parent
SharedWorker object from being GC'd until the script load has completed, to
enable onerror() callbacks to be invoked. The SharedWorkerScriptLoader uses
WorkerScriptLoader to accomplish the script loading. Once script loading is
complete, it either generates a load error (via
worker->dispatchLoadErrorEvent()) or passes off the script to the
SharedWorkerRepositoryHost by sending another ConnectToSharedWorker IPC,
this time with script_data populated.

WorkerService changes
The WorkerService is currently responsible for starting
up and shutting down dedicated workers, queueing up workers when the process
limit has been reached, and cleaning up when processes prematurely exit. We
will extend this class to also keep track of all currently running
SharedWorker threads, and it will be responsible for managing the lifecycle
of SharedWorkers (ensuring that multiple requests for the same SharedWorker
are mapped to a single instance of that SharedWorker, and tracking when the
parent documents for a SharedWorker have detached so we can shut down the
thread). Parent Documents are tracked by the pair consisting of the
document_id and the parent process ID.

Most information about workers currently is kept with the WorkerProcessHost
object in its instances_ list (list of WorkerInstance structs).
WorkerInstance already contains most of the data needed to interact with a
worker (worker process ID, worker route id, script url) - we'll extend
WorkerInstance to contain information needed by SharedWorkers as well:

  std::string name;
  // List of parent_process_id/parent_document_id pairs
  std::list<std::pair<int, int> > document_set;

The WorkerService will handle the following IPCs sent from the renderer
processes:

ConnectToSharedWorker(int message_port_route_id, int document_id,
std::string script_url, std::string worker_name, std::string script_data)
DocumentDetached(int document_id)

The WorkerService also handles a WorkerContextClosed() IPC from the worker
process to notify it when the worker has been closed
(SharedWorkerContext::close() is invoked) so it can remove its references to
this worker.

ConnectToSharedWorker
The first step is to look up the origin of the passed script_url and
worker_name in the instances_ list to see if a matching SharedWorker already
exists. At this point, one of several actions is taken:


   - If an entry already exists for this name/origin, and the URL parameter
   matches, and there is already a routing_id for this worker (meaning the
   thread has already been started), we add the passed document_id/process_id
   pair to the WorkerInstance.document_set and send a Connect message to the
   worker process with the message_port_route_id to generate a connect event
   with an entangled MessagePort.
   - If a matching entry already exists, but there is no routing id (the
   thread has not started yet) then we add the passed document to the
   WorkerInstance's document_set. We then check to see whether the client has
   passed up script_data. If so, then we start up a worker thread (passing in
   the script), and fire off a Connect message to the new thread with the
   message_port_route_id to generate a connect event with an entangled
   MessagePort. If the client has not passed up script_data, we return NOT_FOUND
   to the caller to cause the caller to load script for this worker (since
   there's already a WorkerInstance, another document is already trying to load
   script for this worker, but that attempt may fail or the other document may
   be closed before the load completes, so we issue a parallel load request).
   - If an entry exists in the repository for this name/origin, but the url
   does not match, then we return URL_MISMATCH to the caller to cause the
   SharedWorker constructor in page context to generate an exception.
   - If an entry does not exist in the repository, then a new placeholder
   entry is created - this is necessary to arbitrate conflicts if two pages try
   to create different workers with the same name simultaneously; one has to
   fail. If script_data is passed up, we fire off a new worker and send off a
   Connect message to it. Otherwise, we return NOT_FOUND to the caller to cause
   the script to be loaded.
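
The four cases above can be folded into one function, roughly as in the sketch below. This is an illustration, not the planned implementation: the SharedWorkerTable class, the simplified WorkerInstance, the `started` flag (standing in for "has a routing id") and the `connected_ports` vector (standing in for sending a Connect message) are all made up here. It also assumes the origin has already been extracted from script_url, that the placeholder case records the calling document just as the second case does, and it uses a std::set rather than a list so that a document making both the initial call and the follow-up call is recorded once.

```cpp
#include <cassert>
#include <list>
#include <set>
#include <string>
#include <utility>
#include <vector>

enum class ConnectResult { CONNECTED, NOT_FOUND, URL_MISMATCH };

// Simplified, hypothetical stand-in for the extended WorkerInstance.
struct WorkerInstance {
    std::string origin;
    std::string name;
    std::string url;
    bool started;                                 // true once a routing id exists
    std::set<std::pair<int, int> > document_set;  // (process_id, document_id)
    std::vector<int> connected_ports;             // ports sent a Connect message
};

class SharedWorkerTable {
public:
    // Returns the entry for this origin/name, or nullptr if there is none.
    WorkerInstance* find(const std::string& origin, const std::string& name) {
        for (WorkerInstance& w : instances_)
            if (w.origin == origin && w.name == name)
                return &w;
        return nullptr;
    }

    ConnectResult connect(int process_id, int document_id, int port_route_id,
                          const std::string& origin, const std::string& url,
                          const std::string& name,
                          const std::string& script_data) {
        WorkerInstance* w = find(origin, name);
        if (w && w->url != url)
            return ConnectResult::URL_MISMATCH;
        if (!w) {
            // Placeholder entry: arbitrates between simultaneous creators.
            instances_.push_back(WorkerInstance{origin, name, url, false, {}, {}});
            w = &instances_.back();
        }
        // Assumption: the document is recorded even when NOT_FOUND is returned.
        w->document_set.insert(std::make_pair(process_id, document_id));
        if (!w->started) {
            if (script_data.empty())
                return ConnectResult::NOT_FOUND;  // caller loads the script
            w->started = true;  // stands in for starting the worker thread
        }
        w->connected_ports.push_back(port_route_id);  // stands in for Connect
        return ConnectResult::CONNECTED;
    }

private:
    std::list<WorkerInstance> instances_;  // list keeps pointers stable
};
```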

DocumentDetached
When the browser process receives a notification that a
document has been detached, it walks through all of the WorkerInstances and
removes the specified document_id from their document_set. If the
document_set for a worker becomes empty as the result of this operation, we
remove the WorkerInstance from the instances_ list and send a Close message
to the shared worker to cause it to shut down (Question: should we remove
the WorkerInstance from the instances_ list immediately, or should we just
mark it as "closing" so we don't connect to it any more and wait until the
worker context is destroyed before removing it? Is there a problem with
removing the WorkerInstance while the worker is technically still alive?)
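
For concreteness, here is a small sketch of the "remove immediately" variant of the question above. The trimmed-down WorkerInstance and the free function documentDetached are hypothetical; the returned route ids stand in for the workers that would be sent a Close message. A "closing" flag variant would mark the entry instead of erasing it.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Trimmed-down, hypothetical WorkerInstance.
struct WorkerInstance {
    int worker_route_id;
    std::set<std::pair<int, int> > document_set;  // (process_id, document_id)
};

// Removes the document from every worker's document_set. Workers whose set
// becomes empty as a result are erased from the list, and their route ids
// are returned so the caller can send each one a Close message.
std::vector<int> documentDetached(std::vector<WorkerInstance>& instances,
                                  int process_id, int document_id) {
    std::vector<int> to_close;
    const std::pair<int, int> key(process_id, document_id);
    std::vector<WorkerInstance>::iterator it = instances.begin();
    while (it != instances.end()) {
        const bool removed = it->document_set.erase(key) > 0;
        if (removed && it->document_set.empty()) {
            to_close.push_back(it->worker_route_id);
            it = instances.erase(it);
        } else {
            ++it;
        }
    }
    return to_close;
}
```

Note that an entry whose set was already empty is left alone, since only a removal caused by this call triggers a close.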

WorkerContextClosed
The worker can close itself by calling
WorkerGlobalScope::close() - when this happens the
SharedWorkerContext::close() handler notifies the WorkerInstance via its
WorkerReportingProxy. When receiving WorkerContextClosed for a worker, the
WorkerService removes the WorkerInstance from the instances_ list (same
question as above - is this OK, or do we need to keep the WorkerInstance
around until the context actually is destroyed, rather than just closed?).
Note that there is inherently a race condition in the WebWorkers spec (a
document can connect to a shared worker, but the shared worker can invoke
close() before the connect event ever arrives) - addressing this is left up
to the webapp (I presume using timeouts/retries if the worker becomes
unresponsive) and is outside the scope of our implementation.

Handling unexpected termination
The WorkerService is notified when a worker
or a tab process prematurely exits by listening for
RESOURCE_MESSAGE_FILTER_SHUTDOWN and WORKER_PROCESS_HOST_SHUTDOWN
notifications. When a worker process exits, any associated WorkerInstance
objects will be removed from the repository. In the case that a tab process
exits, we will walk all of the WorkerInstances and remove any documents from
that process from their document_set, removing those workers from the
repository and closing them as appropriate.
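
The two cleanup paths might be sketched as follows. Both function names and the reduced WorkerInstance are invented for this example; the notification plumbing is omitted, and the returned route ids again stand in for workers that need a Close message.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Reduced, hypothetical WorkerInstance.
struct WorkerInstance {
    int worker_process_id;
    int worker_route_id;
    std::set<std::pair<int, int> > document_set;  // (parent_process_id, document_id)
};

// A worker process exited: drop every instance that lived in it.
void workerProcessExited(std::vector<WorkerInstance>& instances,
                         int worker_process_id) {
    instances.erase(
        std::remove_if(instances.begin(), instances.end(),
                       [worker_process_id](const WorkerInstance& w) {
                           return w.worker_process_id == worker_process_id;
                       }),
        instances.end());
}

// A tab (renderer) process exited: remove all of its documents from every
// document_set, erase workers left with no documents, and return their
// route ids so the caller can close them.
std::vector<int> rendererProcessExited(std::vector<WorkerInstance>& instances,
                                       int renderer_process_id) {
    std::vector<int> to_close;
    for (auto it = instances.begin(); it != instances.end();) {
        bool removed = false;
        auto& docs = it->document_set;
        for (auto d = docs.begin(); d != docs.end();) {
            if (d->first == renderer_process_id) {
                d = docs.erase(d);
                removed = true;
            } else {
                ++d;
            }
        }
        if (removed && docs.empty()) {
            to_close.push_back(it->worker_route_id);
            it = instances.erase(it);
        } else {
            ++it;
        }
    }
    return to_close;
}
```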

Question: If a worker process terminates, how should we notify the user? For
dedicated workers, we display an info bar on the parent document's tab. What
should we do for SharedWorkers - should we spam every parent document with
info bars? Maybe just pick one? We currently just have document ids - we
probably also need to track render_view_route_ids for every parent document
if we need to display info bars.

Limits
We will continue to have a per-tab limit on the number of workers,
using the existing WorkerService code. The code will need to queue up shared
worker create requests (as it currently does) and also queue up connect
requests for the workers (destroying queued connect requests if the sender
exits, similar to how dedicated worker create requests are managed).
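
A minimal sketch of the connect-request queue, assuming a simple FIFO. PendingConnect and PendingConnectQueue are hypothetical names, and the existing WorkerService queueing code may well be structured differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical record of a connect request waiting for a worker slot.
struct PendingConnect {
    int sender_process_id;
    int message_port_route_id;
    int document_id;
};

class PendingConnectQueue {
public:
    void enqueue(const PendingConnect& request) { queue_.push_back(request); }

    // Pops the oldest request into *out; returns false if the queue is empty.
    bool dequeue(PendingConnect* out) {
        assert(out);
        if (queue_.empty())
            return false;
        *out = queue_.front();
        queue_.pop_front();
        return true;
    }

    // Destroys all requests queued by a sender that has exited and returns
    // how many were dropped.
    std::size_t senderExited(int process_id) {
        const std::size_t before = queue_.size();
        queue_.erase(
            std::remove_if(queue_.begin(), queue_.end(),
                           [process_id](const PendingConnect& r) {
                               return r.sender_process_id == process_id;
                           }),
            queue_.end());
        return before - queue_.size();
    }

    std::size_t size() const { return queue_.size(); }

private:
    std::deque<PendingConnect> queue_;
};
```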

Error Reporting
The WebKit implementation broadcasts any uncaught SharedWorker exceptions to
every document in the worker's document_set. We could also do this for
Chromium, but it seems like it might be better to interact directly with the
developer tab (write exceptions directly to the console a single time,
rather than logging them multiple times via individual windows). Can someone
point me in the right direction here (how to log exceptions to the console
from worker context, if possible)?

Launching workers/code re-use
The SharedWorker threads are not particularly different from existing
dedicated worker threads - the only new functionality is the addition of the
"connect" event when a new client connects; otherwise the SharedWorker
functionality is a subset of the dedicated worker functionality, so much of
the existing functionality (loading, shutdown) can be reused.

I'll use the existing classes as a base class, and override the few APIs
where shared workers are different (primarily error reporting). The
WorkerService will send a StartSharedWorkerContext message to the
WebWorkerClientProxy rather than a StartWorkerContext to ensure the proper
type of WebCore::WorkerContext is created.

--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: chromium-dev@googlegroups.com 
View archives, change email options, or unsubscribe: 
    http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---
