Re: [whatwg] Workers feedback
I'll mention that the Chrome team is experimenting with something like this (as a Chrome extensions API) - certain extensions will be able to do:

    window.open("my_bg_page.html", name, "background");

...and the associated window will be opened offscreen. These windows share a process with other pages under that domain, which means they can't be used as a worker (for long-lived operations). But I agree, there's some value in having the full set of page APIs available.

-atw

On Fri, Feb 11, 2011 at 5:58 PM, Gregg Tavares (wrk) <g...@google.com> wrote:
> On Fri, Feb 11, 2011 at 5:45 PM, Ian Hickson <i...@hixie.ch> wrote:
>> On Fri, 11 Feb 2011, Gregg Tavares (wrk) wrote:
>>> On Fri, 7 Jan 2011, Berend-Jan Wever wrote:
>>>> 1) To give WebWorkers access to the DOM API so they can create their own elements such as img, canvas, etc...?
>>>
>>> It's the API itself that isn't thread-safe, unfortunately.
>>>
>>> I didn't see the original thread but how is a WebWorker any different from another webpage? Those run just fine in other threads and use the DOM API.
>>
>> Web pages do not run in a different thread.
>
> Oh, sorry. I meant they run in a different process. At least in some browsers.
>
>> --
>> Ian Hickson
>> http://ln.hixie.ch/
[whatwg] SharedWorkers and document discarded
Hi all,

Jonas brought up an interesting point regarding SharedWorkers in an unrelated thread that I wanted to clarify here. His contention is that the current SharedWorker spec ties the lifecycle of a SharedWorker to the GC behavior of the underlying VM - specifically, that a SharedWorker is shut down after its last parent document has been GC'd.

The relevant spec language (from http://www.whatwg.org/specs/web-workers/current-work/#the-worker's-lifetime) is:

    "Whenever a Document d is added to the worker's Documents, the user agent must, for each worker q in the list of the worker's workers whose list of the worker's Documents does not contain d, add d to q's WorkerGlobalScope owner's list of the worker's Documents.

    Whenever a Document object is discarded, it must be removed from the list of the worker's Documents of each worker whose list contains that Document."

I'm not an expert on Document lifecycles, so I don't entirely understand under which circumstances the spec requires that a Document object be discarded. For example, if I have a top-level window with a child iframe, and that child iframe creates a SharedWorker, then reloads itself or navigates, could that cause the original document to be discarded/suspended, or does this depend on GC (i.e. whether script in the top-level window maintains a reference to the document javascript object)? My understanding from previous discussions was that the only thing affecting whether a document is discarded is whether the UA decided to keep it suspended in the history cache - can javascript-level references also prevent a document from being discarded?

-atw
Re: [whatwg] WebWorkers and images
I would recommend that people review this thread: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2010-February/025254.html to understand the objections previously raised to this idea.

-atw

On Fri, Jan 7, 2011 at 4:08 PM, Glenn Maynard <gl...@zewt.org> wrote:
> On Fri, Jan 7, 2011 at 6:22 PM, David Levin <le...@chromium.org> wrote:
>> fwiw, ImageData can be used in a worker. Many folks have argued that canvas isn't that useful in a worker and that the gpu acceleration will make it less useful (and that most image manipulation would be able to use ImageData for its needs).
>
> This is wrong. You can't download, decompress and blit images to a canvas in realtime from the main thread without causing UI hitches. For example, if you load an HTMLImageElement to blit into a canvas, image decompression often doesn't happen asynchronously during the download, but synchronously when the image data is first used. I often see the UI freeze for 500-700ms to blit and scale a large image in Chrome. Implementations can alleviate this, but generally at a cost elsewhere. The real fix is to stop doing these expensive operations from the UI thread--which is what threads are for.
>
> GPU acceleration won't magically fix every case where synchronous canvas and canvas-related operations take too much time in the UI thread.
>
> ImageData is only useful for a limited set of image operations. For image blitting and transformations--the most common canvas operations--you need the canvas itself. I'd suspect there are even more important cases with WebGL; for example, long-running computations using fragment shaders.
>
> I understand why Canvas isn't available right now. There are sticky issues to make all of the parts practical to expose to threads, and I'm sure getting the basic web workers API in place is much higher priority. But I hope this will be revisited seriously at an appropriate time, and not dismissed as not useful. Image manipulation is one of the most obvious candidates for threading.
>
> --
> Glenn Maynard
Re: [whatwg] Inline Web Worker
I believe it's a security feature. Imagine that you download foo.html onto your C:/ - according to the logic below, script running in foo.html would be able to read *any file on your C:/ drive*. That seems scary to me.

FWIW, Chrome allows passing the --allow-file-access-from-files command line flag to make it easier for developers to work locally without running an HTTP server.

-atw

On Sat, Oct 16, 2010 at 8:19 AM, Samuel Ytterbrink <sam...@ytterbrink.nu> wrote:
> Good news. :D
>
> But then I have another problem: why is file:///some_directory_where_the_html_are/ not the same domain as file:///some_directory_where_the_html_are/child_directory_with_ajax_stuff/? I would understand if it were not okay to go closer to root when doing ajax - file:///where_all_secrete_stuff_are/ or /../../ - but you see, I wonder why I need a web server to try some things. And I'm sure that there are more developers than me who think that local single-page Ajax applications have a future.
>
> One thing that could probably solve this is if the File API will support folders. Then the user could select the files for the program...
>
> /Samuel Ytterbrink
>
> 2010/10/16 Simon Pieters <sim...@opera.com>
>> On Sat, 16 Oct 2010 03:12:38 +0200, Jonas Sicking <jo...@sicking.cc> wrote:
>>> Allowing both blob URLs and data URLs for workers sounds like a great idea.
>>
>> FWIW, Opera supports data URLs for Worker (but not SharedWorker, since it could be used to cross the same-origin policy if two pages opened a SharedWorker with the same data URL -- this could be solved while still supporting data URLs but we decided to just drop it for now).
>>
>> --
>> Simon Pieters
>> Opera Software
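[Editor's note: a minimal sketch of the Blob-URL technique Jonas mentions above - building a worker from script text instead of a separate file. `buildWorkerSource` is a hypothetical helper, not part of any spec, and Worker/Blob/URL.createObjectURL are browser-only APIs, so the construction step is guarded.]

```javascript
// Hypothetical helper: assemble the script text that will back an inline worker.
function buildWorkerSource(handlerBody) {
  return "self.onmessage = function (e) { " + handlerBody + " };";
}

var src = buildWorkerSource("self.postMessage(e.data * 2);");

// Browser-only portion: wrap the source in a Blob and hand its URL to Worker.
if (typeof window !== "undefined" && typeof Worker !== "undefined") {
  var url = URL.createObjectURL(new Blob([src], { type: "text/javascript" }));
  var worker = new Worker(url);
  worker.onmessage = function (e) { console.log(e.data); };
  worker.postMessage(21);
}
```

Because the Blob URL inherits the creating page's origin, this avoids the cross-origin concerns Simon raises about sharing a SharedWorker between pages via identical data URLs.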
Re: [whatwg] Web Workers
On Wed, Jul 21, 2010 at 1:11 PM, Ryan Heise <r...@ryanheise.com> wrote:
> For all of the reasons above, I would like to see something like threads in Javascript. Yes, threads give rise to race conditions and deadlocks, but this seems to be in line with Javascript's apparent philosophy of doing very little static error checking, and letting things just happen at runtime (e.g. nonexistent static type system). In other words, this may be simply a case of: yes, javascript allows runtime errors to happen. Is not allowing deadlocks important enough that we should make it impossible for a certain class of algorithms to exploit multi-core CPUs?

Rather than trying to shoehorn concurrent functionality into Javascript (where many implementations don't support multi-threaded access down at the VM level anyway, so the obstacles to implementation seem fairly large), it seems like a better option is to use a different language entirely.

> Before I sign off, there is one more feature which (correct me if I'm wrong) is lacking from the current specification. There is currently no way for a program to find out how many cores are present in the host system. Without this, there is no way to know how many Web Workers to create for an algorithm that could easily be parallelised to any number of Web Workers / threads. Even, say, a parallel quicksort should not just create a new thread for each recursive invocation, as deep as it goes. For efficiency, this thread creation should stop as soon as enough threads have been created to match the number of physical cores. After this point, each core should handle its load by reverting to a single-threaded quicksort.

There have been a few discussions on this issue - for example:

http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-November/024058.html
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-November/023993.html

Not sure if any conclusions were drawn - I think we may have kept this open as an option for v2 of the spec.
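[Editor's note: a sketch of the sizing logic Ryan asks for - cap the worker count at the core count, then revert to sequential work. No API exposed the core count when this thread was written; navigator.hardwareConcurrency was added to browsers years later, so the fallback value here is an assumption, and `pickWorkerCount` is an illustrative helper, not part of any spec.]

```javascript
// Decide how many workers to spawn for a parallelisable algorithm: never more
// than there are tasks, never more than there are (reported) cores.
function pickWorkerCount(taskCount, reportedCores) {
  // reportedCores would be navigator.hardwareConcurrency in later browsers;
  // fall back to an assumed 4 cores when nothing is reported.
  var cores = reportedCores || 4;
  return Math.max(1, Math.min(taskCount, cores));
}

console.log(pickWorkerCount(16, 8));        // 8: enough tasks to saturate all cores
console.log(pickWorkerCount(2, 8));         // 2: no point spawning idle workers
console.log(pickWorkerCount(16, undefined)); // 4: fall back to the assumed core count
```

For the parallel quicksort case, each recursion level would consult this cap and stop forking once the budget is spent, finishing each partition single-threaded.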
Re: [whatwg] Workers: What can be done in a worker after call to close()?
How does the GC-initiated close() event work in Firefox in the case of a fire-and-forget worker? For example:

foo.html:

    <script>
    new Worker("forget.js");
    </script>

forget.js:

    self.setInterval(function() { /* ...do something... */ }, 1000);

In this case, it seems incorrect to ever fire a close() event until the parent window leaves the bfcache. I'm guessing you must do something to prevent the worker object from being GC'd in the case that there's pending activity in the worker?

-atw

On Thu, Apr 1, 2010 at 3:31 PM, Jonas Sicking <jo...@sicking.cc> wrote:
> On Wed, Mar 31, 2010 at 10:03 AM, ben turner <bent.mozi...@gmail.com> wrote:
>> Hi,
>>
>> When implementing the close() function for Firefox we chose to set the closing flag and clear pending events only. As the worker script is calling close() on itself, we figured that the worker should retain maximum functionality until it has finished execution (otherwise it could just not call close() and rely on some kind of postMessage() and terminate() combo). Therefore we do not enforce any timeout for the currently executing script, and we continue to allow postMessage() calls and synchronous XHR to proceed. Since the closing flag is set in response to close(), the worker is guaranteed to finish as soon as the currently running script finishes. We always enforce a timeout for any code that runs in response to the close event that gets fired after the current script finishes, though.
>>
>> If the code that calls close() never returns (like the while(1) {} example above) then the worker will never finish, as pointed out above, but that's no different from having a worker script that consists only of a while(1) {} loop, and we don't think it's important to prevent. If a worker script is written in this way then a terminate() call is still a valid solution.
>>
>> Also, since we try to retain maximum functionality after close(), we also allow errors to propagate as shown above.
>> If anyone is curious, the basic strategy we use in response to close functions (like close(), terminate(), and for UA-generated events like when the main worker object is GC'd) can be found in the following table: http://mxr.mozilla.org/mozilla-central/source/dom/src/threads/nsDOMWorker.h#202
>
> For what it's worth, I think the behavior that Firefox has makes a lot of sense and I think it should be mandated by the spec. (I know, big shocker :) )
>
> The one thing that we do that is somewhat iffy is the close event. Ben actually gets it a bit wrong in the description above. This is how it works:
>
> We fire the close event handler in four situations:
> * After close() is called by the worker, once it finishes its current execution.
> * After terminate() is called from outside the worker and any code running has been aborted.
> * If the worker is garbage collected.
> * Once the user leaves the page (or specifically, once the page falls out of the bfcache).
>
> Only in the last case do we give the close handler a time limit, after which any currently running close handler is aborted and no more close handlers are run. Though of course the user can leave the page *while* the close event is getting fired. If so, we start the time limit at that point.
>
> The iffy part is the third bullet above, since it exposes GC behavior. This is very unfortunate indeed, and because of it I feel that our implementation is somewhat experimental. We could simply not fire the close event in that case; however, this would seem to reduce the usefulness of the close event quite a bit.
>
> So I think for now I don't care if the close event is put in the spec or not. But I wanted to let you know what we're doing. We don't currently have any plans to remove it.
>
> / Jonas
Re: [whatwg] Workers: What can be done in a worker after call to close()?
I'll note that the spec gives the UA a significant amount of latitude about its behavior after close() is called:

    "User agents may invoke the 'kill a worker' processing model on a worker at any time, e.g. in response to user requests, in response to CPU quota management, or when a worker stops being an active needed worker if the worker continues executing even after its closing flag was set to true."

Essentially, UAs can kill a worker at any time, and since the "kill a worker" algorithm allows UAs to abort the script after a user-agent-defined amount of time (including zero), it seems like almost any behavior post-close is compliant. This seems like a guaranteed source of cross-browser incompatibilities.

I've always operated under the impression that the intent of the spec is to allow pending worker operations to complete, but still give UAs the ability to abort scripts that don't exit in a timely manner (so close() should not immediately abort the script), but I don't see anything in the spec regarding this.

For #2 below, I believe that exceptions in worker context should *always* be reported, regardless of closing state. Section 4.6 (Runtime script errors) makes no mention of tying this behavior to the closing flag.

-atw

On Tue, Mar 30, 2010 at 4:44 PM, Dmitry Titov <dim...@chromium.org> wrote:
> Hi!
>
> Trying to fix some bugs for Workers, I've got some questions about the close() method on WorkerGlobalScope (http://www.whatwg.org/specs/web-workers/current-work/#workerglobalscope). In particular, the spec seems to imply that after calling close() inside the worker, the JS does not get terminated right away, but rather continues to execute, while an internal 'closing' flag is set and the message queue associated with the worker discards all the tasks, existing and future. Also, all ports are immediately disentangled.
> This seems to leave some questions without explicit answers, with differences in current implementations:
>
> 1. Does this code in a worker continue looping until the parent page unloads?
>
>     ...
>     close();
>     while(true) {}
>
> WebKit V8 terminates, WebKit JSC terminates after a timeout, FF does not terminate.
>
> 2. Do errors propagate back to the Worker object after close()?
>
>     ...
>     close();
>     nonExistingFunction(); // throws; if not processed locally, posts error info to the Worker object
>
> In WebKit and FF errors propagate, although it does not seem consistent given that the worker has closed all its ports and is in a dormant state.
>
> 3. Should synchronous operations work after close()? Asynchronous ones perhaps should not, because of the event loop queue which is stopped...
>
>     ...
>     close();
>     xhr.open("GET", "foo.com", false);
>     xhr.send();
>
> WebKit: does not work (mostly); FF: does work.
>
> Perhaps it would be simpler to either say nothing is executed/posted/fired after close() (immediate termination), or to let the worker run unimpeded (with ports open, etc) until it naturally yields from JS.
>
> Any opinions?
>
> Thanks,
> Dmitry
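[Editor's note: a toy model of the 'closing' flag semantics Dmitry describes - after close(), the running script continues but the event queue discards tasks. This simulates the spec's processing model with a plain object; it is not a real WorkerGlobalScope, and the names are illustrative.]

```javascript
// Simulate a worker scope with a closing flag and a task queue.
function makeScope() {
  var closing = false;
  var queue = [];
  return {
    close: function () {
      // Per the spec text quoted above: set the flag and discard pending tasks.
      closing = true;
      queue.length = 0;
    },
    post: function (task) {
      // A closing scope's queue discards all tasks, existing and future.
      if (!closing) queue.push(task);
    },
    pending: function () { return queue.length; }
  };
}

var scope = makeScope();
scope.post("task1");
scope.post("task2");
scope.close();        // currently running script keeps executing...
scope.post("task3");  // ...but the queue no longer accepts tasks
console.log(scope.pending()); // 0
```

The open question in the thread is what the still-running script may do between close() and yielding - this model only captures the queue behavior both implementations agree on.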
Re: [whatwg] Define MessagePort.isConnected or MessagePort.ondisconnect
Agreed, there's not a good way to determine that a port is disentangled. Currently the main solution I know of is to have your documents post a message to your shared worker in their onunload handlers.

I think some kind of MessagePort.onclose event or "entangled" attribute could be useful - this was originally part of the spec, and the issue with that was that it's hard to define onclose in such a way that doesn't make it highly dependent on garbage collection. As an example:

    var channel = new MessageChannel();
    channel.port1.onclose = channel.port2.onclose = function() { alert("port closed"); };
    channel = null;

What should happen in this case? At what point (if ever) should the onclose handler be invoked? I'm just leery of any situation where the garbage-collected state of an unreferenced object is exposed to script, as it seems like this causes interoperability issues. For example, if you ran the script above in Chrome, the onclose handler would likely not be invoked until the parent Document was closed. In Safari, it would get invoked when the JS heap is next garbage collected. An application that relied on onclose being called in a timely manner would break on Chrome.

The only option that comes to mind that doesn't expose compatibility issues would be to only fire onclose events if close() is explicitly called on the entangled port, but if you're doing that you might as well just have the code calling close() post an "I'm closing" message first.

-atw

On Mon, Mar 15, 2010 at 5:13 PM, ATSUSHI TAKAYAMA <taka.atsu...@googlemail.com> wrote:
> Hi all,
>
> Consider a case where I have a SharedWorker script like below, and I open two tabs that use this SharedWorker. Now myPorts.length is 2. If I reload one of the two tabs, then myPorts.length is 3, isn't it? But one of the three ports is already disconnected from its counterpart, so postMessage'ing to that port is meaningless and I want to discard the reference to it.
> === JS ===
> var myPorts = [];
> onconnect = function(e) {
>   var port = e.ports[0];
>   myPorts.push(port);
>   port.onmessage = function(e) {
>     myPorts.forEach(function(p) {
>       if (p !== port) p.postMessage(e.data);
>     });
>   };
> };
> === /JS ===
>
> It seems like the only way to know if a MessagePort is connected is to actually send a message and wait for a reply. So MessagePort.isConnected or MessagePort.ondisconnect would be nice to have.
>
> A. TAKAYAMA
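[Editor's note: a sketch of the onunload workaround Drew describes, applied to Takayama's broadcast example - each document posts an explicit "goodbye" from its unload handler and the shared worker drops that port. Ports are simulated as plain objects so the bookkeeping is visible; in a real SharedWorker, `port` would come from e.ports[0] in onconnect and messages would arrive as events.]

```javascript
var myPorts = [];

// Stand-in for the SharedWorker's onconnect handler.
function onConnect(port) {
  myPorts.push(port);
  port.onmessage = function (data) {
    if (data === "goodbye") {
      // The page is unloading: forget its port instead of broadcasting to it.
      myPorts = myPorts.filter(function (p) { return p !== port; });
      return;
    }
    // Broadcast to every other connected document.
    myPorts.forEach(function (p) {
      if (p !== port) p.postMessage(data);
    });
  };
}

// Simulated ports: tab A connects, tab B connects, "reloads", and reconnects.
function makeFakePort() {
  return { received: [], postMessage: function (m) { this.received.push(m); } };
}
var a = makeFakePort(), b = makeFakePort();
onConnect(a);
onConnect(b);
b.onmessage("goodbye");      // tab B's onunload handler fired before reload
var b2 = makeFakePort();
onConnect(b2);               // tab B reconnects after the reload
b2.onmessage("hi");          // broadcast reaches only live ports
console.log(myPorts.length); // 2, not 3
console.log(a.received);     // ["hi"] - the stale port never hears it
```

The limitation Drew notes still applies: a crashed or killed tab never runs onunload, so this cannot replace a real ondisconnect signal.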
Re: [whatwg] Offscreen canvas (or canvas for web workers).
On Mon, Feb 22, 2010 at 11:13 AM, David Levin <le...@google.com> wrote:
> I've talked with some other folks on WebKit (Maciej and Oliver) about having a canvas that is available to workers. They suggested some nice modifications to make it an offscreen canvas, which may be used in the Document or in a Worker.
>
> Proposal:
> Introduce an OffscreenCanvas which may be created from a Document or a Worker context.
>
>     interface OffscreenCanvas {
>         attribute unsigned long width;
>         attribute unsigned long height;
>         DOMString toDataURL(in optional DOMString type, in any... args);
>         object getContext(in DOMString contextId);
>     };
>
> When it is created in the Worker context, OffscreenCanvas.getContext("2d") returns a CanvasWorkerContext2D. In the Document context, it returns a CanvasRenderingContext2D.
>
> The base class for both CanvasWorkerContext2D and CanvasRenderingContext2D is CanvasContext2D. CanvasContext2D is just like a CanvasRenderingContext2D except for omitting the font methods and any method which uses HTML elements. It does have some replacement methods for createPattern/drawImage which take an OffscreenCanvas. The canvas object attribute is either an HTMLCanvasElement or an OffscreenCanvas, depending on where the canvas context came from.
>     interface CanvasContext2D {
>         readonly attribute object canvas;
>
>         void save();
>         void restore();
>
>         void scale(in float sx, in float sy);
>         void rotate(in float angle);
>         void translate(in float tx, in float ty);
>         void transform(in float m11, in float m12, in float m21, in float m22, in float dx, in float dy);
>         void setTransform(in float m11, in float m12, in float m21, in float m22, in float dx, in float dy);
>
>         attribute float globalAlpha;
>         attribute [ConvertNullToNullString] DOMString globalCompositeOperation;
>
>         CanvasGradient createLinearGradient(in float x0, in float y0, in float x1, in float y1) raises (DOMException);
>         CanvasGradient createRadialGradient(in float x0, in float y0, in float r0, in float x1, in float y1, in float r1) raises (DOMException);
>         CanvasPattern createPattern(in OffscreenCanvas image, in DOMString repetition);
>
>         attribute float lineWidth;
>         attribute [ConvertNullToNullString] DOMString lineCap;
>         attribute [ConvertNullToNullString] DOMString lineJoin;
>         attribute float miterLimit;
>
>         attribute float shadowOffsetX;
>         attribute float shadowOffsetY;
>         attribute float shadowBlur;
>         attribute [ConvertNullToNullString] DOMString shadowColor;
>
>         void clearRect(in float x, in float y, in float width, in float height);
>         void fillRect(in float x, in float y, in float width, in float height);
>         void strokeRect(in float x, in float y, in float w, in float h);
>
>         void beginPath();
>         void closePath();
>         void moveTo(in float x, in float y);
>         void lineTo(in float x, in float y);
>         void quadraticCurveTo(in float cpx, in float cpy, in float x, in float y);
>         void bezierCurveTo(in float cp1x, in float cp1y, in float cp2x, in float cp2y, in float x, in float y);
>         void arcTo(in float x1, in float y1, in float x2, in float y2, in float radius);
>         void rect(in float x, in float y, in float width, in float height);
>         void arc(in float x, in float y, in float radius, in float startAngle, in float endAngle, in boolean anticlockwise);
>         void fill();
>         void stroke();
>         void clip();
>         boolean isPointInPath(in float x, in float y);
>
>         void drawImage(in OffscreenCanvas image, in float dx, in float dy, in optional float dw, in optional float dh);
>         void drawImage(in OffscreenCanvas image, in float sx, in float sy, in float sw, in float sh, in float dx, in float dy, in float dw, in float dh);
>
>         // pixel manipulation
>         ImageData createImageData(in float sw, in float sh) raises (DOMException);
>         ImageData getImageData(in float sx, in float sy, in float sw, in float sh) raises (DOMException);
>         void putImageData(in ImageData imagedata, in float dx, in float dy, in optional float dirtyX, in optional float dirtyY, in optional float dirtyWidth, in optional float dirtyHeight);
>     };
>
>     interface CanvasWorkerContext2D : CanvasContext2D { };
>
>     interface CanvasRenderingContext2D : CanvasContext2D {
>         CanvasPattern createPattern(in HTMLImageElement image, in DOMString repetition);
>         CanvasPattern createPattern(in HTMLCanvasElement image, in DOMString repetition);
>         CanvasPattern createPattern(in HTMLVideoElement image, in DOMString repetition);
>
>         // focus management
>         boolean drawFocusRing(in Element element, in float xCaret,
Re: [whatwg] Notification API
BTW, I would highly recommend that we move this conversation to the public-webapps list. I'm not sure about the best way to do this other than to stop posting here, starting...um...right after my reply :)

Anyhow, your question below is why there are two exposed notification APIs - one is simple, lowest-common-denominator text + icon notifications (createNotification()), and the other is HTML (createHTMLNotification()). Platforms (such as mobile devices) that can't support popup HTML notifications can just not expose the createHTMLNotification attribute. Likewise, if the user on a given system just wants to force everyone to use text + icon notifications because he finds the mix of (say) Growl and HTML notifications jarring, then the UA could provide an option for that (which would cause the UA to not expose the createHTMLNotification attribute).

-atw

On Wed, Feb 3, 2010 at 12:27 PM, Robert O'Callahan <rob...@ocallahan.org> wrote:
> On Thu, Feb 4, 2010 at 6:17 AM, John Gregg <john...@google.com> wrote:
>> The Webapps WG is working on a spec for a Web Notification API. You can see the current draft at http://dev.w3.org/2006/webapi/WebNotifications/publish/, and I would suggest sending comments to the public-webapps mailing list.
>>
>> That spec attempts to address the icon+title+text use case, and allows a user agent to use a third party presentation system as long as that system can notify of notifications being acknowledged, but also allows HTML as an option if the device supports it. I disagree with the claim that HTML notifications are overkill: as long as they can be done securely, it opens up a lot of benefit to have dynamic interactive notifications. Even for the simple case of Calendar reminders, which might have multiple forms of acknowledgement: snooze for N minutes (a select would be nice), or dismiss.
>
> If the underlying platform notification system (e.g. Growl or libnotification) doesn't support that functionality, how should the UA behave?
> I suppose the UA could distinguish between notifications that can be supported by the platform and those that can't, and use the platform notification system when possible, otherwise fall back to its own notifications, but that could be a jarring user experience.
>
> Rob
> --
> "He was pierced for our transgressions, he was crushed for our iniquities; the punishment that brought us peace was upon him, and by his wounds we are healed. We all, like sheep, have gone astray, each of us has turned to his own way; and the LORD has laid on him the iniquity of us all." [Isaiah 53:5-6]
Re: [whatwg] Notification API
Apps on the iPhone using SQL data storage might disagree with you about the value of optional web features :)

But I do understand your point, and perhaps there's a better way to achieve the goals of the notification API. The goals as I understand them are:

1) Support simple text + icon notifications on devices that are unable to support full HTML notifications (I'm thinking of mobile devices specifically, such as the Palm Pre, which exposes a similar JS notification API, but some system notification frameworks also fall under this category).

2) Allow more full-featured HTML notifications on the overwhelming majority of platforms that support them.

3) Give web applications the ability to tell whether a given UA supports HTML notifications, so they can choose not to display any notification at all if the system does not support HTML.

A corollary to #3 may be that a given UA could make it possible for the user to disable HTML notifications even though they would theoretically be possible to support on that platform, if it is believed that there are user benefits to only allowing text + icon notifications on that specific installation (e.g. tighter integration with system notification frameworks).

It's possible that Goal #3 is unimportant, or that it could be satisfied through some other mechanism (like a capabilities attribute? ick?) - if so, then it seems like an option to fold createNotification() and createHTMLNotification() together by adding an optional htmlUrl parameter to createNotification(), which would be ignored on systems that don't want to display HTML notifications.

-atw

On Wed, Feb 3, 2010 at 1:27 PM, Jonas Sicking <jo...@sicking.cc> wrote:
> On Wed, Feb 3, 2010 at 1:00 PM, John Gregg <john...@google.com> wrote:
>> On Wed, Feb 3, 2010 at 12:27 PM, Robert O'Callahan <rob...@ocallahan.org> wrote:
>>> On Thu, Feb 4, 2010 at 6:17 AM, John Gregg <john...@google.com> wrote:
>>>> The Webapps WG is working on a spec for a Web Notification API.
>>>> You can see the current draft at http://dev.w3.org/2006/webapi/WebNotifications/publish/, and I would suggest sending comments to the public-webapps mailing list.
>>>>
>>>> That spec attempts to address the icon+title+text use case, and allows a user agent to use a third party presentation system as long as that system can notify of notifications being acknowledged, but also allows HTML as an option if the device supports it. I disagree with the claim that HTML notifications are overkill: as long as they can be done securely, it opens up a lot of benefit to have dynamic interactive notifications. Even for the simple case of Calendar reminders, which might have multiple forms of acknowledgement: snooze for N minutes (a select would be nice), or dismiss.
>>>
>>> If the underlying platform notification system (e.g. Growl or libnotification) doesn't support that functionality, how should the UA behave? I suppose the UA could distinguish between notifications that can be supported by the platform and those that can't, and use the platform notification system when possible, otherwise fall back to its own notifications, but that could be a jarring user experience.
>>
>> The spec states that HTML is an optional part of the implementation. If the UA intends to use a presentation system that doesn't support HTML, it should not expose the HTML API and just expose the plain one. This isn't ideal, as it requires authors to check the capabilities of the UA, but it does provide consistency for the user.
>
> This is a very bad idea. Sites are going to forget to do this, or rather not know that they need to do this. At some point a high-profile site is going to not do this check, and from that point on all UAs will effectively be forced to support HTML notifications or not be compatible with the web.
>
> I can't think of a single time when optional web features have succeeded.
>
> / Jonas
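[Editor's note: a sketch of the feature-detection burden Jonas objects to, assuming the two-method API shape discussed in this thread (createNotification / createHTMLNotification). The `notifier` argument and the simulated UA objects are stand-ins for illustration; no shipped API is implied.]

```javascript
// Pick the richest notification the UA exposes, falling back to text + icon.
// This is exactly the per-site check Jonas argues sites will forget to write.
function createBestNotification(notifier, htmlUrl, icon, title, text) {
  if (typeof notifier.createHTMLNotification === "function") {
    return notifier.createHTMLNotification(htmlUrl);
  }
  // UA does not expose HTML notifications; use the simple form.
  return notifier.createNotification(icon, title, text);
}

// Simulated UAs: one exposes only the simple API, one exposes both.
var simpleUA = {
  createNotification: function (icon, title, text) { return "text:" + title; }
};
var richUA = {
  createNotification: function (icon, title, text) { return "text:" + title; },
  createHTMLNotification: function (url) { return "html:" + url; }
};

console.log(createBestNotification(richUA, "alert.html", "i.png", "Hi", "..."));   // html:alert.html
console.log(createBestNotification(simpleUA, "alert.html", "i.png", "Hi", "...")); // text:Hi
```

Drew's proposed fold - a single createNotification() with an optional htmlUrl parameter that non-HTML platforms ignore - would move this branch from every site into the UA.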
Re: [whatwg] using postMessage() to send to a newly-created window
I ran into this early last year when I was first investigating MessagePorts: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-March/018893.html

I dimly recall that the response I got off-list (via IRC?) was that this was expected behavior (the intended use for window.postMessage() is for communicating with child iframes, where you can register an onload handler to be notified when it's safe to post messages), but I agree that it would be worthwhile to explicitly state this somewhere in the spec.

-atw

On Mon, Dec 21, 2009 at 7:24 PM, Dirk Pranke <dpra...@chromium.org> wrote:
> Hi all,
>
> In the course of testing something today, I attempted to create a window and immediately post a message to it, and was surprised that it didn't seem to work. E.g.:
>
>     var w = window.open("http://x");
>     w.postMessage("hello, world", "*");
>
> w never got the message - this seemed to be consistent across Safari, Chrome, and FF (all I had installed on my Mac at the time, so apologies to Opera, IE, and anyone else I've left out).
>
> Is this supposed to work? If not, is there a reliable way for the source to know when it is safe to send a message to the target? The only way I can think of is for the target to send a message back to the source, which only works if the target can get a reference to the source using window.opener, which may or may not be possible or desirable...
>
> If this isn't supposed to work, can we state this explicitly in the spec?
>
> -- dirk
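[Editor's note: a sketch of the handshake pattern implied by the replies above - buffer outgoing messages until the target signals readiness. Only the queueing logic is shown, standalone; in a browser, `deliver` would wrap targetWindow.postMessage, and `markReady` would be triggered by a "ready" message the child posts back via window.opener. `makeBufferedChannel` is a hypothetical helper.]

```javascript
// Buffer messages for a target that may not be listening yet; flush once it is.
function makeBufferedChannel(deliver) {
  var ready = false;
  var queue = [];
  return {
    send: function (msg) {
      if (ready) { deliver(msg); } else { queue.push(msg); }
    },
    markReady: function () {
      ready = true;
      // Flush in order, so the target sees messages as if none were early.
      while (queue.length) { deliver(queue.shift()); }
    }
  };
}

var delivered = [];
var channel = makeBufferedChannel(function (m) { delivered.push(m); });
channel.send("hello");   // buffered: the new window hasn't loaded yet
channel.markReady();     // the child posted "ready" back to its opener
channel.send("world");   // delivered immediately
console.log(delivered);  // ["hello", "world"]
```

As Dirk notes, this only works when the target can reach the source (e.g. via window.opener) to send the "ready" signal in the first place.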
Re: [whatwg] Passing more than JSON data to workers
That was my point earlier - for this to work, if you post getX and setX over separately, they need to share a closure, otherwise they don't work. Realistically, given Javascript's dynamic nature, you need to copy everything reachable via the function's scope chain, and recursively copy everything reachable via any reachable function's scope chain. In effect, copying a function to worker context means bringing over the entire reachable heap. Anything else you try to do is going to break in subtle ways when something in the source context's scope chain isn't in the destination context's scope chain.

I understand that we might like to treat code running in a worker as if it were running in the same context as the parent page, so you could pass all the same things to it that you could pass to any Javascript function, but the situations are not identical - the only way something like this would be feasible would be by adding limitations (i.e. not copying over the scope chain of a function) that IMO fundamentally break Javascript semantics; once you do that, you might as well just use the existing string + eval() mechanisms.

-atw

On Thu, Dec 17, 2009 at 2:06 AM, Oliver Hunt <oli...@apple.com> wrote:
> On Dec 17, 2009, at 10:03 PM, Boris Zbarsky wrote:
>> On 12/17/09 12:48 AM, Boris Zbarsky wrote:
>>> It seems very difficult to me to come up with a function cloning solution that won't break in subtle ways when such functions are passed to it...
>>
>> I should clarify this. It seems to me eminently possible to clone functions that only reference local (declared with var) variables and their arguments. And maybe explicit |this| access; not sure. As soon as you're talking anything else, the situation gets very complicated, it seems to me. That includes implicit property access on the global object.
To make that clearer, consider these two functions, defined at global scope: var x = 1; function f() { return x; } function g() { return Math; } If I understand your proposal correctly, passing f to a worker would pass across a function which always returns 1. Passing g to a worker would do what? Pass across a function that always returns the Math object from the web page scope? If not, then how is Math different from x, exactly? If yes, then why are we baking anything at all in at pass time? How is the f() example above affected if x is bound to an object, not to a number? I think a more interesting case is the relatively common idiom of closures for access protection, e.g. function MyObject() { var x; this.setX = function(_x) { x = _x }; this.getX = function() { return x } } What should worker.postMessage(new MyObject) do if we were to try and serialise the functions? Obviously you don't want them each to have (effectively) separate closures, and you can't just substitute their containing scope with the global object. -Boris --Oliver
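The closure-loss problem the thread is circling around can be demonstrated in plain script: the obvious way to ship a function to another context is its source text, and Function.prototype.toString captures only the code, never the scope chain, so a revived copy rebinds free variables to whatever happens to exist at the destination.

```javascript
// Why serializing a function by source text loses its closure.
var x = 1;
function f() { return x; } // f closes over this scope's x

// "Reviving" f in another scope, as a worker-side copy would have to do:
function reviveElsewhere(src) {
  var x = 42; // the destination context has its own, unrelated x
  // Direct eval: the revived function closes over THIS scope instead.
  return eval("(" + src + ")");
}

var revived = reviveElsewhere(f.toString());
console.log(f());       // 1  - reads the original x
console.log(revived()); // 42 - rebinds to the destination's x
```

This is exactly the subtle breakage Boris predicts: the copy is syntactically identical but semantically a different function.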
Re: [whatwg] Passing more than JSON data to workers
I'm not certain what a deep copy of the function means - would you need to copy the entire lexical scope of the function? For example, let's say you do this: var foo = 1; function setFoo(val) { foo = val; } function getFoo() { return foo; } worker.postMessage(setFoo); worker.postMessage(getFoo); foo = 2; Then, from worker code, I call the copy of getFoo() - what should it return (undefined? Does it pull over a copy of foo from the original lexical scope, in which case it's 1)? What if foo is defined in a lexical closure that is shared by both setFoo() and getFoo() - it seems like the separate copies of setFoo() and getFoo() passed to the worker would need to reconstruct a shared closure on the worker side, which seems difficult if not impossible. I think that some variant of data URLs and/or eval() gets you most of what you really need here without requiring extensive JS VM gymnastics. -atw On Wed, Dec 16, 2009 at 9:23 AM, Jan Fabry jan.fa...@monkeyman.be wrote: Hello, Has it been considered to pass more than JSON data to workers? I could not find a rationale behind this in the FAQ, or in other places I looked. I understand the need for separation because of concurrency issues, but aren't there other ways to accomplish this? (The following text was already posted to the forum, but zcorpan suggested I also post it here) [ http://forums.whatwg.org/viewtopic.php?t=4185 ] I am not a Javascript VM developer, so if the following does not make sense, please don't be too hard on me. A reply of Sorry, we thought about this longer than you did, and there are still cases where this is impossible is perfectly valid, but the more I can learn from this conversation, the better. Would it be possible to do a deep copy of the function (object) you pass to the constructor? So copy everything (or mark it for copy-on-write), but remove references to DOM elements if they exist.
This way, I think you can create a parallel data structure, so the original one remains untouched (avoiding concurrency issues). The important difference between this and the usual JSON-serializing of objects that the examples talk about, is that functions can be passed through too in an easy manner. If you have to simulate this using only Javascript, you have to somehow bind the free variables, which requires some introspection, and thus is not easy (if even possible?) to simulate in user space. The Google Gears API seems to provide both createWorker(scriptText) and createWorkerFromUrl(scriptUrl). Why was only the URL variant retained in the Web Workers spec? With the script variant, there would have been at least a little basis for more dynamic programming. Greetings, Jan Fabry
Re: [whatwg] [WebWorkers] Maximum number of workers (was About the delegation example)
We discussed this previously ( http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-July/020865.html) - the consensus was that since the Worker APIs are inherently asynchronous, user agents were free to impose limits on worker creation (and queue up new creation requests when the limit has been hit), but the spec should be silent on this issue. FWIW, Chrome does this - we limit each tab to a maximum of 16 workers, with an overall limit of 64 - the 17th worker creation request for a tab is queued up until another worker exits to make room for it. Chrome currently has the limitation that each worker runs in its own process - when we address that limitation, we expect to adjust the limits accordingly. -atw On Mon, Dec 7, 2009 at 1:39 PM, David Bruant bru...@enseirb-matmeca.fr wrote: David Bruant: In the delegation example, the number of workers chosen is an arbitrary 10. But, in a single-core processor, having only one worker will result in more or less the same running time, because at the end, each worker runs on the only core. Ian Hickson: That depends on the algorithm. If the algorithm uses a lot of data, then a single hardware thread might be able to run two workers in the same time as it runs one, with one worker waiting for data while the other runs code, and with the workers trading back and forth. Personally I would recommend basing the number of workers on the number of shards that the input data is split into, and then relying on the UA to avoid thrashing. I would expect UAs to notice when a script spawns a bazillion workers, and have the UA run them in a staggered fashion, so as to not starve the system resources. This is almost certainly needed anyway, to prevent pages from DOSing the user's system. = Wouldn't it be preferable to have an implementation-dependent maximum number of workers and to raise a security exception when this number is reached? Maximum per domain? Per document? Maybe a different maximum for shared and dedicated workers.
Any opinion on this? David
Re: [whatwg] Example wrong in web workers
FWIW, I've usually looked at self.postMessage when trying to determine whether running in dedicated or shared worker context, although Anne's suggestion (using in) is better. -atw On Mon, Oct 26, 2009 at 7:00 AM, Anne van Kesteren ann...@opera.com wrote: On Mon, 26 Oct 2009 13:57:10 +0100, Simon Pieters sim...@opera.com wrote: Web Workers has the following in some example (twice): // support being used as a shared worker as well as a dedicated worker if (this.onmessage) // dedicated worker This ought to be doing something like (typeof this.onmessage != 'undefined') , as the event property is presumably 'null' by the time of the test. (onmessage in this) might be somewhat safer fwiw. -- Anne van Kesteren http://annevankesteren.nl/
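The difference between the two detection styles can be shown with plain objects standing in for the real worker global scopes (an illustration, not actual worker code): onmessage exists but is null before any handler is assigned, so a truthiness test misclassifies a dedicated worker, while the `in` test checks for the property's existence rather than its value.

```javascript
// Stand-ins for the two worker global scopes (assumption: simplified
// to just the property that distinguishes them).
var dedicatedScope = { onmessage: null }; // DedicatedWorkerGlobalScope has onmessage
var sharedScope = { onconnect: null };    // SharedWorkerGlobalScope has onconnect

// The spec example's check: fails, because onmessage is null until set.
console.log(Boolean(dedicatedScope.onmessage)); // false - wrongly looks "shared"

// Anne's suggestion: test for the property's existence instead.
console.log("onmessage" in dedicatedScope);     // true  - dedicated detected
console.log("onmessage" in sharedScope);        // false - shared detected
```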
Re: [whatwg] postMessage: max length / size
As a data point, the WebKit implementation (used by Safari and Chrome) doesn't currently enforce any limits (other than those imposed by running out of memory). -atw On Fri, Oct 23, 2009 at 12:02 AM, Ian Hickson i...@hixie.ch wrote: On Thu, 22 Oct 2009, Brian Kuhn wrote: Is there any limit to the length of message you can send with postMessage (HTML5 Cross-document messaging)? I didn't see anything in the spec about this. I thought this might be one area where implementations might end up differing. There are probably implementation-specific limits, but HTML tries to not say what the limits should be, since it's hard to know what they should be. It might vary from platform to platform and device to device, and will almost certainly vary over time. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Workers and addEventListener
On Wed, Oct 14, 2009 at 3:33 AM, Ian Hickson i...@hixie.ch wrote: There's no start() to call, since there's no explicit pointer to the MessagePort in dedicated workers. The example in the worker spec refers to shared workers, which *do* have an explicit port, and do not automatically start the port. Search for addEventListener in this document: http://www.whatwg.org/specs/web-workers/current-work/#shared-workers And you'll see what I mean. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Workers and addEventListener
To be absolutely clear (since there's some confusion about whether we are talking about explicit MessagePorts, or about implicit ports for dedicated workers). Are you saying that this: var channel = new MessageChannel(); channel.port1.postMessage("hi mom"); channel.port2.addEventListener("message", function(event) { alert(event.data); }, false); Should result in an alert dialog being fired? Because that does not match my reading of the spec at all - the spec explicitly states that a port's message queue is only enabled via start() or via setting the onmessage IDL attribute. I would not be opposed to changing the spec to include enabling a port's message queue when addEventListener("message") is invoked. -atw On Wed, Oct 14, 2009 at 3:32 AM, Ian Hickson i...@hixie.ch wrote: On Tue, 29 Sep 2009, Zoltan Herczeg wrote: In WebKit implementation of MessagePort the addEventListener("message", ...) does not enable the transmitting of messages. All messages are actually discarded until a dummy function is assigned to onmessage. That is a bug. The port message queue is explicitly enabled during the creation of the dedicated worker (step 12). And in the normative text, it is not mentioned that addEventListener should also enable message transmitting. The normative text just says to fire an event; the DOM Events spec makes it clear that events can be handled using addEventListener()-added handlers. Anyway, my question is: - should addEventListener enable message transmitting? Yes. - Should it do it in all cases, or only when "message" is passed as the first argument It should only receive 'message' events if you say 'message' as the first argument. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Workers and addEventListener
The intent of the spec is fairly clear that addEventListener("message") should not start the message queue dispatch - only setting the onmessage attribute does that: The first time a MessagePort object's onmessage IDL attribute is set, the port's port message queue must be enabled, as if the start() method had been called. In fact, the only reason for the existence of the MessagePort.start() method is to enable applications to start message queue dispatch when using addEventListener(). I don't have a strong opinion as to whether we should change the spec, though. I suspect not, given Anne's email. We should instead change the example in the workers spec to call start(). -atw On Tue, Sep 29, 2009 at 2:13 AM, Anne van Kesteren ann...@opera.com wrote: On Tue, 29 Sep 2009 09:13:17 +0200, Zoltan Herczeg zherc...@inf.u-szeged.hu wrote: Anyway, my question is: - should addEventListener enable message transmitting? - Should it do it in all cases, or only when "message" is passed as the first argument I don't think it should. Web Workers should not modify the semantics of addEventListener. -- Anne van Kesteren http://annevankesteren.nl/
Re: [whatwg] Please always use utf-8 for Web Workers
Are you saying that if I load a script via a script tag in a web page, then load it via importScripts() in a worker, that the result of loading that script in those two cases should/could be different because of different decoding mechanisms? If that's what's being proposed, that seems bad. -atw On Fri, Sep 25, 2009 at 6:45 AM, Simon Pieters sim...@opera.com wrote: On Fri, 25 Sep 2009 15:31:41 +0200, Jonathan Cook jonathan.j5.c...@gmail.com wrote: The importScripts portion of the Web Workers API is compatible with existing scripts, Only if those scripts don't use any of the banned interfaces and constructors, right? but I'm all for more UTF-8 :) If the restriction is added to the spec, I'd want to know that a very clear error was going to be thrown explaining the problem. I'm not sure that throwing an error is a good idea. Would you throw an error when there's no declared encoding? That seems to be annoying for the common case of just using ASCII characters. Throwing an error when there is a declared encoding that is not utf-8 might work, but are there many scripts that have a declared encoding and are not utf-8? I think it is better to just ignore any declared encoding and assume utf-8. If people are using non-ascii in another encoding, then they would notice by seeing that their text looks like garbage. Browsers could also log messages to their error consoles about encoding declarations declaring non-utf-8 and/or sequences of bytes that are not valid utf-8. -- Simon Pieters Opera Software
Re: [whatwg] Please always use utf-8 for Web Workers
Certainly. If I explicitly override the charset, then that seems like reasonable behavior. Having the default decoding vary between importScripts() and script seems bad, especially since you can't override charsets with importScripts(). -atw On Fri, Sep 25, 2009 at 10:08 AM, Anne van Kesteren ann...@opera.com wrote: On Fri, 25 Sep 2009 18:39:48 +0200, Drew Wilson atwil...@google.com wrote: Are you saying that if I load a script via a script tag in a web page, then load it via importScripts() in a worker, that the result of loading that script in those two cases should/could be different because of different decoding mechanisms? If that's what's being proposed, that seems bad. That could happen already if the script loaded via script did not have an encoding set and got it from the script tag's charset attribute. -- Anne van Kesteren http://annevankesteren.nl/
Re: [whatwg] Please always use utf-8 for Web Workers
Then I'm misunderstanding the suggestion. My reading of: Therefore, we should be able to always use utf-8 for workers. Always using utf-8 is simpler to implement and test and encourages people to switch to utf-8 elsewhere. ...was we should ignore charset headers coming from the server and always treat script data imported via importScripts() as if it were encoded as utf-8 (i.e. skip step 3 of section 4.3 of the web workers spec), which seems like it's effectively changing the default decoding. Which means that someone naively serving up an existing Big5-encoded script (containing, say, string resources) with the appropriate charset header will find it fails when loaded into workers. Again, apologies if I'm misunderstanding the suggestion. -atw On Fri, Sep 25, 2009 at 10:21 AM, Anne van Kesteren ann...@opera.com wrote: On Fri, 25 Sep 2009 19:16:47 +0200, Drew Wilson atwil...@google.com wrote: Certainly. If I explicitly override the charset, then that seems like reasonable behavior. It does not need to be overridden per se. If the document character encoding is different from UTF-8 then a script loaded through script will be decoded differently from a script loaded through importScripts() as well. Having the default decoding vary between importScripts() and script seems bad, especially since you can't override charsets with importScripts(). This is already the case. The suggestion was not about changing the default. -- Anne van Kesteren http://annevankesteren.nl/
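What "fails" means here is worth making concrete: decoding bytes with the wrong charset doesn't throw, it silently produces mojibake. A small demonstration (using Node's Buffer as a stand-in for the browser's decoder; latin1 stands in for any non-UTF-8 charset):

```javascript
// The same bytes decoded under two different charsets.
const bytes = Buffer.from([0xc3, 0xa9]); // the UTF-8 encoding of "é"

console.log(bytes.toString("utf8"));   // "é"  - correct under UTF-8
console.log(bytes.toString("latin1")); // "Ã©" - garbage, but no exception
```

So a script served with a non-UTF-8 charset header and force-decoded as UTF-8 (or vice versa) would load and run, with its string literals quietly corrupted - which is exactly why the thread debates logging to the error console rather than throwing.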
Re: [whatwg] LocalStorage in workers
Jeremy, what's the use case here - do developers want workers to have access to shared local storage with pages? Or do they just want workers to have access to their own non-shared local storage? Because we could just give workers their own separate WorkerLocalStorage and let them have at it. A worker could block all the other accesses to WorkerLocalStorage within that domain, but so be it - it wouldn't affect page access, and we already had that issue with the (now removed?) synchronous SQL API. I think a much better case can be made for WorkerLocalStorage than for give workers access to page LocalStorage, and the design issues are much simpler. -atw On Tue, Sep 15, 2009 at 8:27 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Sep 15, 2009 at 6:56 PM, Jeremy Orlow jor...@chromium.org wrote: One possible solution is to add an asynchronous callback interface for LocalStorage into workers. For example: function myCallback(localStorage) { localStorage.accountBalance = localStorage.accountBalance + 100; } executeLocalStorageCallback(myCallback); // TODO: Make this name better :-) The interface is simple. You can only access localStorage via a callback. Any use outside of the callback is illegal and would raise an exception. The callback would acquire the storage mutex during execution, but the worker's execution would not block during this time. Of course, it's still possible for a poorly behaving worker to do large amounts of computation in the callback, but hopefully the fact they're executing in a callback makes the developer more aware of the problem. First off, I agree that not having localStorage in workers is a big problem that we need to address. If I were designing the localStorage interface today I would use the above interface that you suggest. Grabbing localStorage can only be done asynchronously, and while you're using it, no one else can get a reference to it. This way there are no race conditions, but also no way for anyone to have to lock. 
So one solution is to do that in parallel to the current localStorage interface. Let's say we introduce a 'clientStorage' object. You can only get a reference to it using a 'getClientStorage' function. This function is available both to workers and windows. The storage is separate from localStorage so no need to worry about the 'storage mutex'. There is of course a risk that a worker grabs on to the clientStorage and holds it indefinitely. This would result in the main window (or another worker) never getting a reference to it. However it doesn't affect responsiveness of that window, it's just that the callback will never happen. While that's not ideal, it seems like a smaller problem than any other solution that I can think of. And the WebDatabase interfaces are suffering from the same problem if I understand things correctly. There's a couple of other interesting things we could expose on top of this: First, a synchronous API for workers. We could allow workers to synchronously get a reference to clientStorage. If someone is currently using clientStorage then the worker blocks until the storage becomes available. We could either use a callback as the above, which blocks until the clientStorage is acquired and only holds the storage until the callback exits. Or we could expose clientStorage as a property which holds the storage until control is returned to the worker eventloop, or until some explicit release API is called. The latter would be how localStorage is now defined, with the important difference that localStorage exposes the synchronous API to windows. Second, allow several named storage areas. We could add an API like getNamedClientStorage(name, callback). This would allow two different workers to simultaneously store things in storage areas, as long as they don't need to use the *same* storage area. It would also allow a worker and the main window to simultaneously use separate storage areas.
However we need to be careful if we add both above features. We can't allow a worker to grab multiple storage areas at the same time since that could cause deadlocks. However with proper APIs I believe we can avoid that. / Jonas
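A rough sketch of the getClientStorage shape Jonas and Jeremy are proposing (all names here are taken from the thread's hypothetical API, not a real one): access is only granted inside a callback, and requests made while another callback holds the storage are queued, so no two callers ever see the storage concurrently. For brevity this sketch dispatches synchronously when the storage is free; a real UA would presumably always defer the callback.

```javascript
// Hypothetical single-owner storage with queued callback access.
function makeClientStorage() {
  var data = {};   // the backing store
  var busy = false;
  var queue = [];  // callbacks waiting for exclusive access

  function run(callback) {
    busy = true;
    try {
      callback(data); // exclusive access for the callback's duration
    } finally {
      busy = false;
      if (queue.length > 0) run(queue.shift()); // hand off to next waiter
    }
  }

  return function getClientStorage(callback) {
    if (busy) queue.push(callback); // someone else holds it: wait
    else run(callback);
  };
}

var getClientStorage = makeClientStorage();
getClientStorage(function (storage) {
  storage.accountBalance = (storage.accountBalance || 0) + 100;
});
getClientStorage(function (storage) {
  console.log(storage.accountBalance); // 100
});
```

Note how this gives the atomicity the thread wants without a script-visible mutex: the read-modify-write of accountBalance happens entirely inside one callback, so no other caller can interleave with it.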
Re: [whatwg] LocalStorage in workers
I'm saying that an async API is overkill and unwieldy if all you need is WorkerLocalStorage. If you're going to route your localStorage access through an async API anyway, then you might as well proxy it to the parent page - there's very little advantage to doing it otherwise, other than access to lexically scoped resources from within your callback. But, yeah, if you want to provide access to shared worker/page storage, then an async API would be the way to go - I'm just saying that if you don't actually need shared storage, then you could maintain a more convenient synchronous silo'd API. Since Jeremy didn't really elaborate on the use case, and he's getting feedback from app developers that I'm not privy to, I figured I'd ask him. -atw On Wed, Sep 16, 2009 at 11:34 AM, Michael Nordman micha...@google.com wrote: On Wed, Sep 16, 2009 at 11:24 AM, James Robinson jam...@google.com wrote: On Wed, Sep 16, 2009 at 10:53 AM, Michael Nordman micha...@google.com wrote: On Wed, Sep 16, 2009 at 9:58 AM, Drew Wilson atwil...@google.com wrote: Jeremy, what's the use case here - do developers want workers to have access to shared local storage with pages? Or do they just want workers to have access to their own non-shared local storage? Because we could just give workers their own separate WorkerLocalStorage and let them have at it. A worker could block all the other accesses to WorkerLocalStorage within that domain, but so be it - it wouldn't affect page access, and we already had that issue with the (now removed?) synchronous SQL API. I think a much better case can be made for WorkerLocalStorage than for give workers access to page LocalStorage, and the design issues are much simpler. Putting workers in their own storage silo doesn't really make much sense?
Sure it may be simpler for browser vendors, but does that make life simpler for app developers, or just have them scratching their heads about how to read/write the same data set from either flavor of context in their application? I see no rhyme or reason for the arbitrary barrier except for browser vendors to work around the awkward implicit locks on LocalStorage (the source of much grief). Consider this... would it make sense to cordon off the databases workers vs pages can see? I would think not, and I would hope others agree. The difference is that the database interface is purely asynchronous whereas storage is synchronous. Sure... we're talking about adding an async api that allows workers to access a local storage repository... should such a thing exist, why should it not provide access to the same repository as seen by pages? If multiple threads have synchronous access to the same shared resource then there has to be a consistency model. ECMAScript does not provide for one so it has to be done at a higher level. Since there was not a solution in the first versions that shipped, the awkward implicit locks you mention were suggested as a workaround. However it's far from clear that these solve the problem and are implementable. It seems like the only logical continuation of this path would be to add explicit, blocking synchronization primitives for developers to deal with - which I think everyone agrees would be a terrible idea. If you're worried about developers scratching their heads about how to pass data between workers just think about happens-before relationships and multi-threaded memory models. In a hypothetical world without synchronous access to LocalStorage/cookies from workers, there is no shared memory between threads except via message passing. This can seem a bit tricky for developers but is very easy to reason about and prove correctness and the absence of deadlocks.
- James -atw
Re: [whatwg] LocalStorage in workers
Thanks, Robert - I didn't want to second my own proposal :) I think that #4 is probably a reasonable bridge API until we come up with a consensus API for #3. For myself, I see this API as being very useful for persistent workers (yes, I'm still banging that drum :). -atw On Wed, Sep 16, 2009 at 3:21 PM, Robert O'Callahan rob...@ocallahan.org wrote: On Thu, Sep 17, 2009 at 9:56 AM, Jeremy Orlow jor...@chromium.org wrote: 1) Create a LocalStorage like API that can only be accessed in an async way via pages (kind of like WebDatabase). 2) Remove any atomicity/consistency guarantees from synchronous LocalStorage access within pages (like IE8 currently does) and add an async interface for when pages do need atomicity/consistency. 3) Come up with a completely different storage API that all the browser vendors are willing to implement that only allows Async access from within pages. WebSimpleDatabase might be a good starting point for this. 4) Create WorkerStorage so that shared workers have exclusive, synchronous access to their own persistent storage via an API compatible with LocalStorage. This sounds like it has a low implementation cost and solves many use cases in a very simple way, right? Rob -- He was pierced for our transgressions, he was crushed for our iniquities; the punishment that brought us peace was upon him, and by his wounds we are healed. We all, like sheep, have gone astray, each of us has turned to his own way; and the LORD has laid on him the iniquity of us all. [Isaiah 53:5-6]
Re: [whatwg] Inter-window communication beyond window.postMessage()
Agreed - I've always felt like having to have a reference to a window would be an obstacle. I feel obliged to point out that cross-domain SharedWorkers might be another option for two completely unrelated windows to interact, although the suggestions I've heard for APIs for x-domain access seem to center around CORS, which may or may not be sufficient for this type of case. -atw On Mon, Sep 14, 2009 at 7:36 AM, Sidney San Martín s...@sidneysm.com wrote: The cross-document messaging API solves a lot of problems and is overall an Awesome Thing, but requiring a reference to the target window is hugely limiting. When a window wants to talk to another window and didn't create it, there are basically three options: 1. window.open with a window name argument, which is a hack because the target window has to reload. 2. Comet, which is a horrible hack because a trip to the server is required. 3. LocalStorage and storage events, which wasn't designed for anything remotely like this. Unless there's a reason to prevent free communication between windows, there must be a better solution. I can think of a couple of possibilities. The most obvious one would be an API similar to postMessage that allows broadcasting of messages to all windows, windows by name, and windows by domain. Another one (which I haven't developed very far) would be to stick with window.postMessage but provide an API to ask for windows. So, I could say, Can I please have a reference to the window named 'x', or, ...to windows at 'example.com', or, ...to any window who'll give me one. Each window would obviously have to opt into this. What do you all think?
Re: [whatwg] Storage mutex and cookies can lead to browser deadlock
I think the canonical racy case is the page wants to keep a counter for the number of times event X occurs in a cookie or local storage. It doesn't seem to be possible to achieve this without the mutex - the proposal below would break down if two pages tried to increment the cookie value simultaneously (if both pages changed cookieValue=3 to cookieValue=4 at the same time, the result of your merge step would likely be cookieValue=4, not cookieValue=5 as one might intend). -atw On Thu, Sep 3, 2009 at 1:08 PM, Benjamin Smedberg bsmedb...@mozilla.com wrote: On 9/1/09 7:31 PM, Jeremy Orlow wrote: Does the silence mean that no one has any intention of implementing this? If so, maybe we should resign ourselves to a break in the single threaded illusion for cookies. This doesn't seem too outlandish considering that web servers working with cookies will never have such a guarantee and given that we have no evidence of widespread breakage with IE 8 and Chrome. We (Firefox) just started looking at this seriously: the implications of the global lock are pretty unpleasant. The major race condition appears to be code on the web that gets document.cookie, parses and modifies the string it to add or remove a particular cookie, and sets document.cookie again. This pattern could race against HTTP requests which also set cookies. Chris Jones proposed that we behave in a script-consistent manner without actually doing a global mutex: * When a script gets document.cookie, check out a consistent view of the cookie data. While the script runs to completion, its view of document.cookie does not change. * When the script sets document.cookie and runs to completion, calculate the delta with the original data and commit the changes. * HTTP Set-Cookie headers stomp on prior data at any time, but don't interfere with the consistent script view above. It would be nice to provide a web API to perform the operation of setting a cookie atomically, just as the Set-Cookie HTTP header does.
That is: document.setCookie('foo=bar; domain=subd.example.com'). It's not clear whether/how this same algorithm could be applied to localStorage, since the amount of data required to create a consistent state is potentially much larger. Is there an inherently racy API in .localStorage which we need to protect with complicated mutex/transactional schemes? --BDS
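The lost-update race Drew describes can be simulated in plain script: two "pages" each read the counter, increment it, and write it back, with both reads happening before either write. One increment vanishes, and no per-cookie last-write-wins merge can recover it.

```javascript
// Simulation of the cookie counter race (shared value as a cookie
// would store it: a string).
var cookieValue = "3";

function increment(value) {
  return String(Number(value) + 1); // read-modify-write, not atomic
}

// Both pages read the same snapshot before either commits...
var pageA = increment(cookieValue); // "4"
var pageB = increment(cookieValue); // "4" - never saw pageA's update
// ...then both write back; the merge step just keeps the last write.
cookieValue = pageA;
cookieValue = pageB;

console.log(cookieValue); // "4", not the "5" the two increments intended
```

With the storage-mutex semantics, the second page's read would have been forced to wait for the first page's write, yielding 5; under the delta-commit proposal both deltas are "3 became 4", so the lost update is invisible to the merge.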
Re: [whatwg] Storage mutex and cookies can lead to browser deadlock
On Thu, Sep 3, 2009 at 1:32 PM, Benjamin Smedberg bsmedb...@mozilla.com wrote: On 9/3/09 4:24 PM, Drew Wilson wrote: I think the canonical racy case is the page wants to keep a counter for the number of times event X occurs in a cookie or local storage. It doesn't seem to be possible to achieve this without the mutex - the proposal below would break down if two pages tried to increment the cookie value simultaneously (if both pages changed cookieValue=3 to cookieValue=4 at the same time, the result of your merge step would likely be cookieValue=4, not cookieValue=5 as one might intend). Is that case important? I think that it's more important to make sure that script doesn't inadvertently remove or modify cookies due to races in document.cookie, but that maintaining a single-threaded view of the cookie data is not desirable. You're asking the wrong guy - I'm in the don't bother trying to ensure consistency camp :) I'm just parroting the arguments that were offered previously. If I hadn't lost the arguments about providing workers access to cookies, then I'd say that pages that care about cookie consistency should set cookies via a SharedWorker. But failing that, it seems like the choices are fairly stark: ensure consistency and take the performance hit in all multi-threaded browsers, or explicitly disavow consistency, and console ourselves by making our future APIs more multi-process friendly. I do understand the point of view of the correctness over performance camp - however, as Jeremy points out, it doesn't sound like those arguments have gotten much traction with actual implementors yet. --BDS
Re: [whatwg] Storage mutex and cookies can lead to browser deadlock
To be clear, I'm not trying to reopen the topic of giving cookie access to workers - I'm happy to restrict cookie access to document context (I probably shouldn't have brought it up again). I do agree with Jeremy that we should rethink the spec language around cookie consistency to reflect what implementors are actually willing to build. -atw On Thu, Sep 3, 2009 at 6:17 PM, Jeremy Orlow jor...@chromium.org wrote: On Fri, Sep 4, 2009 at 8:17 AM, Benjamin Smedberg benja...@smedbergs.us wrote: What kind of conflict? There is no need to merge individual cookies: whichever one was set (or removed) last wins. I think this strategy would work fine for cookies since the HTTP side of them is inherently racy. I think such behavior would be pretty counter-intuitive for localStorage, though. If we did go with this strategy, I think we could give access to shared workers, and someone could use those if they needed better atomicity.
Re: [whatwg] Storage mutex and cookies can lead to browser deadlock
When we had this discussion last, there was significant pushback on this - the argument was basically we have no evidence that cookie-based race conditions *aren't* causing sporadic breakages, which is true. It's inherently difficult to measure. As an aside, I'll note that the majority of pushback came from developers of platforms that were inherently single-threaded, and so enforcing synchronicity had no impact on the performance of their platforms. It's easy to be a purist when there's no cost. Now that more browsers are moving to multi-process architectures and will soon be faced with having to reduce the performance of their platforms to enforce cookie coherence, I wonder if people's attitudes have changed. I, too, would be interested in hearing if the folks working on multi-process firefox are planning to implement this piece of the spec. -atw On Wed, Sep 2, 2009 at 9:55 AM, Darin Fisher da...@chromium.org wrote: On Tue, Sep 1, 2009 at 4:31 PM, Jeremy Orlow jor...@chromium.org wrote: On Wed, Aug 26, 2009 at 3:24 PM, Jeremy Orlow jor...@chromium.orgwrote: On Wed, Aug 26, 2009 at 3:05 PM, Robert O'Callahan rob...@ocallahan.org wrote: On Wed, Aug 26, 2009 at 2:54 PM, Jeremy Orlow jor...@chromium.orgwrote: Is there any data (or any way to collect the data) on how much of the web IE and Chrome's current behavior has broken? Given that there hasn't been panic in the streets, I'm assuming approximately 0%? We previously had a lengthy discussion about this. If a site has a cookie race that causes a problem in IE/Chrome one in every 10,000 page loads, are you comfortable with that? I'm much more comfortable with that than the cost of a global mutex that all cookies and LocalStorage share. There are other ways to come about this problem (like developer tools). I'm pretty sure Chromium has no intention of implementing a global storage mutex and putting all cookie access under it. Has anyone heard anything (either way) from Microsoft? 
Are there any browsers moving to a multi-event-loop (be it multi-threaded or multi-process) based model that intend to implement this? If not, then it would seem like the spec is not grounded in reality. Does the silence mean that no one has any intention of implementing this? If so, maybe we should resign ourselves to a break in the single-threaded illusion for cookies. This doesn't seem too outlandish considering that web servers working with cookies will never have such a guarantee, and given that we have no evidence of widespread breakage with IE 8 and Chrome. IE 6 -- it is also multi-process. You can poke at wininet from any application and change the cookies for IE. -darin If we were to get rid of the storage mutex for cookie manipulation (as I believe we should), maybe we should re-examine it for local storage. At a minimum, it could be implemented as a per-origin mutex. But I feel strongly we should go further. Why not have an asynchronous mechanism for atomic updates? For example, if I wanted to write an ATM application, I would have to do the following:

var accountDelta = /* something */;
window.localStorage.executeAtomic(function() {
  localStorage.accountBalance = localStorage.accountBalance + accountDelta;
});

Alternatively, we could make it so that each statement is atomic, but that you have to use such a mechanism for anything more complicated. For example:

localStorage.accountBalance = localStorage.accountBalance + accountDelta; // It's atomic, so no worries!

var balance = localStorage.accountBalance; /* Oh no! This isn't safe since it's implemented via multiple statements... */
localStorage.accountBalance = balance + accountDelta; /* We should have used localStorage.executeAtomic! */

Such ideas would definitely lighten lock contention and possibly eliminate the need for yieldForStorageUpdates (formerly getStorageUpdates).
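The lost-update race that motivates the executeAtomic proposal can be sketched as follows. This is a hypothetical illustration only: executeAtomic was never a real API, and localStorage here is a plain object standing in for the real storage.

```javascript
// Hypothetical sketch: "localStorage" is simulated with a plain object so
// the race can be shown synchronously; executeAtomic (in the comment at the
// end) is the proposed API from the thread, not something that exists.
const localStorage = { accountBalance: "100" };

// Two "pages" each read the balance, then each write back read + 10.
const snapshotA = Number(localStorage.accountBalance); // page A reads 100
const snapshotB = Number(localStorage.accountBalance); // page B also reads 100
localStorage.accountBalance = String(snapshotA + 10);  // page A writes 110
localStorage.accountBalance = String(snapshotB + 10);  // page B also writes 110

// One deposit is lost: the final balance is 110, not 120.
console.log(localStorage.accountBalance);

// The proposal would instead serialize the whole read-modify-write:
//   window.localStorage.executeAtomic(function() {
//     localStorage.accountBalance =
//         Number(localStorage.accountBalance) + accountDelta;
//   });
```

This is exactly the counter-in-a-cookie case from earlier in the thread: without some atomic update primitive (or a mutex), interleaved read-modify-write cycles silently drop updates.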
Another major bonus is that it'd allow us to expose localStorage to workers again, which is one of the top complaints I've gotten when talking to web developers about localStorage. I know this is radical stuff, but the way things are speced currently just are not practical. J
Re: [whatwg] Changing postMessage() to allow sending unentangled ports
I'm saying that we should differentiate between the closed state and cloned state. Implementors effectively need to do this anyway, because the spec says that closed ports are still task sources, while cloned ports are not. It makes sense to be able to post closed ports via postMessage() because they are still task sources, so the recipient could attach an onmessage handler and pull messages off them. It makes no sense to re-send an already-cloned port since it's not a task source and can't ever be a task source again (no way to send messages to it). Likewise it is no longer entangled and so you can't send messages via it. Re-sending a cloned port is an error, and we should treat it as such. -atw On Fri, Aug 28, 2009 at 12:11 PM, Ian Hickson i...@hixie.ch wrote: On Mon, 17 Aug 2009, Drew Wilson wrote: Following up on this issue: Currently, the checks specified for MessagePort.postMessage() are different from the checks done in window.postMessage() (as described in section 7.2.4 Posting messages with message ports). In particular, step 4 of section 7.2.4 says: If any of the entries in ports are null, if any of the entries in ports are not entangled MessagePort objects, or if any MessagePort object is listed in ports more than once, then throw an INVALID_STATE_ERR exception. It appears that this is fixed. Also, as written, the spec now incorrectly lets us send a cloned port multiple times. So code like this would not generate an error:

var channel = new MessageChannel();
otherWindow.postMessage("message1", channel.port1);
otherWindow.postMessage("message2", channel.port1); // Sent the same port again

That's intentional. By the second call, channel.port1 is not entangled; the 'message2' event will have a lame duck port as its port. The current WebKit behavior is to throw an INVALID_STATE_ERR in this case, while still allowing closed ports to be sent, which I believe is the intended behavior based on previous discussions.
If this is correct, we should update the spec to prohibit resending cloned ports. I don't see how this could be correct. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Storage mutex and cookies can lead to browser deadlock
My recollection is that we prohibit worker access to cookies for exactly this reason (WorkerGlobalScope does not expose a cookies attribute). -atw On Wed, Aug 26, 2009 at 2:05 PM, Jens Alfke s...@google.com wrote: I know that one of the design issues with worker threads and local storage has been how to resolve concurrency issues, and that for this reason, in the current spec worker threads can't access local storage. However, there's a scenario under the current spec that *doesn't* involve local storage, whereby a worker thread can deadlock the browser. This is because access to cookies, by workers or the browser itself, is also subject to that global mutex. Consider these steps:

1. A worker thread accesses document.cookie. This causes it to acquire the mutex (sec. 3.1.3).
2. The thread now performs some long-lasting operation without exiting. In the simplest case it just goes into an infinite loop.
3. Meanwhile, the user loads a new web page in the browser.
4. The resulting HTTP response contains a Set-Cookie: header. The spec requires that the browser's loader temporarily acquire the mutex while updating the cookie (sec. 2.6, item 4).
5. The page load blocks indefinitely because the worker thread still has the mutex and never lets go of it.

The result is that the browser becomes incapable of loading any web pages that use cookies. Assuming the worker thread never exits, the only way to recover from this is to quit the browser. A worker thread like this could very easily be created by a malicious website, resulting in a DoS attack on the browser. And of course, a merely poorly-written script could create the same effect without intending to. I honestly can't think of any safe way of tweaking the semantics of the existing 'document.cookie' API to make it transactional. :( Has anyone implemented this portion of the spec yet? —Jens
Re: [whatwg] Storage mutex and cookies can lead to browser deadlock
We discussed this in more detail here: http://www.mail-archive.com/whatwg@lists.whatwg.org/msg13799.html At the time, I suggested not protecting cookies with a mutex (allow asynchronous access - the current behavior on IE and Chrome), which made the monocles pop out of everyone's eyes :) -atw On Wed, Aug 26, 2009 at 2:21 PM, Jens Alfke s...@google.com wrote: On Aug 26, 2009, at 2:11 PM, Drew Wilson wrote: My recollection is that we prohibit worker access to cookies for exactly this reason (WorkerGlobalScope does not expose a cookies attribute). Looks like you're right; section 5 of the Web Workers spec says: The DOM APIs (Node objects, Document objects, etc) are not available to workers in this version of this specification. and there's no defined way to access cookies except through Document. Crisis averted. (If the spec does get modified to allow local-storage access from worker threads, though, this same problem will arise, since they use the same lock.) —Jens
Re: [whatwg] Web Storage: apparent contradiction in spec
This is one of those times when I *really* wish that the application developer community was more active on this list. I absolutely understand Linus' point of view, but I also feel like we are really hamstringing applications when we make choices like this, and I wish that those developers were more vocally represented in these types of discussions. Going down this path would basically kill the ability to have offline web applications, because there would be no guarantees that the data would persist until the user comes back online. But since that point's already been made several times, I guess it's not a compelling argument. -atw On Wed, Aug 26, 2009 at 8:23 PM, Linus Upson li...@google.com wrote: I simply want clicking on links to be safe. In a previous thread I wrote "safe and stateless" but I'm coming to the opinion that stateless is a corollary of safe. Clicking on links shouldn't, either by filling my disk or hitting my global quota, someday lead to a dialog which reads, "Please choose what to delete so that web sites will continue to work." The candidate delete list will be thousands long and hidden in that haystack will be a few precious needles. I also want to avoid any [Yes] [No] dialogs. "Can I do something scary? [Yes] [No]" "Can I do something innocuous? [Yes] [No]" Users shouldn't be forced to make those kinds of safety judgements. I'm guilty of instigating at least one of those dialogs. As shamed politicians do, I'll retreat to the passive voice: Mistakes were made. I'm not opposed to web apps manipulating files on the user's computer, but the user should be in explicit control. I'd support input type=open and input type=save that worked similarly to input type=file. User agents are already registering for file types so that double clicking a file with a certain extension can be automatically sent to a URL, perhaps residing in an AppCache. In addition, I'd like to see the pop-up dialogs for the location API removed. I find the "Can I know where you are?"
dialogs on the iPhone very annoying. Mistakes were made. Perhaps we can find a way to make input type=location work well instead. Linus On Wed, Aug 26, 2009 at 5:14 PM, Brady Eidson beid...@apple.com wrote: I started writing a detailed rebuttal to Linus's reply, but by the time I was finished, many others had already delivered more targeted replies. So I'll cut the rebuttal format and make a few specific points. - Many apps act as a shoebox for managing specific types of data, and users are used to using these apps to manage that data directly. See iTunes, Windows Media Player, iPhoto, and desktop mail clients as examples. This trend is growing, not waning. Browsers are already a shoebox for history, bookmarks, and other types of data. Claiming that this data is hidden from users who are used to handling obscure file management scenarios and therefore we shouldn't fully respect it is trying to fit in with the past, not trying to make the future better. - No one is suggesting that UAs not have whatever freedom they want in deciding *what* or *how much* to store. We're only suggesting that once the UA has committed to storing it, it *not* be allowed to arbitrarily purge it. - One use of LocalStorage is as a cache for data that is transient and non-critical in nature, or that will live on a server. But another, just-as-valid use of LocalStorage is for persistent, predictable storage in the client UA that will never rely on anything in the cloud. - And on that note, if developers don't have faith that data in LocalStorage is actually persistent and safe, they won't use it. I've given talks on this point 4 times in the last year, and I am stating this as a fact, based on real-world feedback from actual, real-world web developers: If LocalStorage is defined in the standard to be a purgable cache, developers will continue to use what they're already comfortable with, which is Flash's LocalStorage.
When a developer is willing to instantiate a plug-in just to reliably store simple nuggets of data - like user preferences and settings - because they don't trust the browser, then I think we've failed in moving the web forward. I truly hope we can sway the LocalStorage is a cache crowd. But if we can't, then I would have to suggest something crazy - that we add a third Storage object. (Note that Jens - from Google - has already loosely suggested this) So we'd have something like: -SessionStorage - That fills the per browsing context role and whose optionally transient nature is already well spec'ed -CachedStorage - That fills Google's interpretation of the LocalStorage role in that it's global, and will probably be around on the disk in the future, maybe -FileStorage - That fills Apple's interpretation of the LocalStorage role in that it's global, and is as sacred as a file on the disk (or a song in your media library, or a photo in your photo library, or a bookmark,
Re: [whatwg] Storage mutex
On Tue, Aug 25, 2009 at 11:51 AM, Jeremy Orlow jor...@chromium.org wrote: On Sun, Aug 23, 2009 at 11:33 PM, Robert O'Callahan rob...@ocallahan.org wrote: On Sat, Aug 22, 2009 at 10:22 PM, Jeremy Orlow jor...@chromium.org wrote: On Sat, Aug 22, 2009 at 5:54 AM, Robert O'Callahan rob...@ocallahan.org wrote: On Wed, Aug 19, 2009 at 11:26 AM, Jeremy Orlow jor...@chromium.org wrote: First of all, I was wondering why all user prompts are specified as "must release the storage mutex" ( http://dev.w3.org/html5/spec/Overview.html#user-prompts). Should this really say must instead of may? IIRC (I couldn't find the original thread, unfortunately) this was added because of deadlock concerns. It seems like there might be some UA implementation specific ways this could deadlock and there is the question of whether we'd want an alert() while holding the lock to block other execution requiring the lock, but I don't see why the language should be must. For Chromium, I don't think we'll need to release the lock for any of these, unless there's some deadlock scenario I'm missing here. So if one page grabs the lock and then does an alert(), and another page in the same domain tries to get the lock, you're going to let the latter page hang until the user dismisses the alert in the first page? Yes. And I agree this is sub-optimal, but shouldn't it be left up to the UAs what to do? I feel like this is somewhat of an odd case to begin with since alerts lock up most (all?) browsers to varying degrees even without using localStorage. That behaviour sounds worse than what Firefox currently does, where an alert disables input to all tabs in the window (which is already pretty bad), because it will make applications in visually unrelated tabs and windows hang. OK...I guess it makes sense to leave this as is. One thing I just realized that kind of sucks though: This makes alert based debugging much more difficult. I guess that's acceptable, though.
I'm not sure why, unless you are saying that alert based debugging while another document is updating the same database simultaneously, then yeah. But that seems like an obscure case for alert debugging. The problem with leaving this up to the UA is it becomes a point of incompatibility - on one browser, it's safe to put up an alert, on another it isn't. So if applications have to fall back to the LCD behavior, then we might as well codify it in the spec, which we have :)
Re: [whatwg] Global Script proposal
BTW, the WorkerGlobalScope.importScripts() API blocks the current thread of execution, which is probably not acceptable for code executed from page context. So for global scripts we'll need some way to do async notifications when the loading is complete, and report errors. We may also want to have some way to automatically enforce ordering (so if I call GlobalScript.importScripts() twice in a row, the second script is not executed until after the first script is loaded/executed, to deal with dependencies between scripts). The alternative is to force applications to do their own manual ordering. -atw On Mon, Aug 24, 2009 at 11:32 AM, Michael Nordman micha...@google.com wrote: Dmitry had a later note which combined creation of the context and loading of the script. But I suspect one thing people will want to do, in development anyway, is load multiple scripts into a context - like you can in workers. Which would mean we'd still need a function to load a script, or the only way to load a script would be by also creating a new context - which is much like the serverJS module concept. I think the plan is to provide an importScripts(...) function to global scripts as is done for workers... http://www.whatwg.org/specs/web-workers/current-work/#importing-scripts-and-libraries
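The fetch-in-parallel, execute-in-call-order guarantee described above can be sketched with promises. This is illustrative only: importScriptsAsync, loadScript, and executeScript are made-up stubs, not a real or proposed API.

```javascript
// Illustrative sketch: loadScript simulates a network fetch and
// executeScript simulates evaluation; neither is a real API.
const executed = [];

function loadScript(url) {
  // Simulate out-of-order network completion: "a.js" arrives after "b.js".
  const delay = url === "a.js" ? 20 : 5;
  return new Promise((resolve) => setTimeout(() => resolve(url), delay));
}

function executeScript(src) {
  executed.push(src);
}

let chain = Promise.resolve();

function importScriptsAsync(url) {
  const fetched = loadScript(url); // start fetching immediately...
  // ...but execute strictly in call order, so later scripts can depend on
  // earlier ones even if their fetches finish first.
  chain = chain.then(() => fetched).then(executeScript);
  return chain;
}

importScriptsAsync("a.js");
importScriptsAsync("b.js").then(() => {
  console.log(executed); // "a.js" executes before "b.js" despite arriving later
});
```

The chaining here is the "automatic ordering" option from the email; dropping the chain and executing each script as its fetch completes would be the manual-ordering alternative, pushing dependency management onto the application.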
Re: [whatwg] SharedWorkers and the name parameter
An alternative would be to make the name parameter optional, where omitting the name would create an unnamed worker that is identified/shared only by its url. So pages would only specify the name in cases where they actually want to have multiple instances of a shared worker. -atw On Tue, Aug 18, 2009 at 7:01 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Aug 18, 2009 at 3:08 PM, Jeremy Orlow jor...@chromium.org wrote: On Tue, Aug 18, 2009 at 1:22 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Aug 18, 2009 at 12:00 AM, Darin Fisher da...@chromium.org wrote: I agree. Moreover, since a shared worker identified by a given name cannot be navigated elsewhere, the name isn't all that synonymous with other usages of names (e.g., window.open). At the very least, it would seem helpful to scope the name to the URL to avoid the name conflict issue. -Darin Technically, that can already be done by using the current URL as the name. I don't quite understand. Are you suggesting that you can work around this by passing the same parameter twice when creating a shared worker? If so, that seems ugly...and a sign that it should be changed. No, what I mean is that if you want to create a worker shared with other instances of the same page, without having to worry about collisions from other pages on your site, you can do: worker = new SharedWorker("/scripts/workerJSFile.js", document.location); This way you can be sure that no other page on your site happens to use the same name. / Jonas
Re: [whatwg] Changing postMessage() to allow sending unentangled ports
Following up on this issue: Currently, the checks specified for MessagePort.postMessage() are different from the checks done in window.postMessage() (as described in section 7.2.4 Posting messages with message ports). In particular, step 4 of section 7.2.4 says: If any of the entries in ports are null, if any of the entries in ports are not entangled MessagePort objects, or if any MessagePort object is listed in ports more than once, then throw an INVALID_STATE_ERR exception. The spec for MessagePort.postMessage() does not throw an exception if any of the entries in ports are not entangled (per this thread). We should probably update the spec for window.postMessage() to define the same behavior there as well. Also, as written, the spec now incorrectly lets us send a cloned port multiple times. So code like this would not generate an error:

var channel = new MessageChannel();
otherWindow.postMessage("message1", channel.port1);
otherWindow.postMessage("message2", channel.port1); // Sent the same port again

The current WebKit behavior is to throw an INVALID_STATE_ERR in this case, while still allowing closed ports to be sent, which I believe is the intended behavior based on previous discussions. If this is correct, we should update the spec to prohibit resending cloned ports. -atw On Thu, Jun 4, 2009 at 10:30 AM, Drew Wilson atwil...@google.com wrote: Hi all, I'd like to propose a change to the spec for postMessage(). Currently the spec reads: Throws an INVALID_STATE_ERR (http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#invalid_state_err) if the ports array is not null and it contains either null entries, duplicate ports, or ports that are not entangled. I'd like to suggest that we allow sending ports that are not entangled (i.e.
ports that have been closed) - the rationale is two-fold: 1) We removed MessagePort.active because it exposes details about garbage collection (i.e. an application could determine whether the other side of a MessagePort was collected or not based on testing the active attribute of a port). Throwing an exception in postMessage() is the same thing - it provides details about whether the other end of the port has been collected. 2) Imagine the following scenario: Window W has two workers, A and B. Worker A wants to send a set of messages to Worker B by queuing those messages on a MessagePort, then asking Window W to forward that port to Worker B:

Window W code:

workerA.onmessage = function(evt) {
  if (evt.data == "forward") {
    // Currently this would throw an exception if the passed port is closed/unentangled.
    workerB.postMessage("messageFromA", evt.ports);
  }
}

Worker A code:

function sendMessagesToB() {
  var channel = new MessageChannel();
  channel.port1.postMessage("message 1");
  channel.port1.postMessage("message 2");
  channel.port1.postMessage("message 3");
  // Send port to worker B via Window W
  postMessage("forward", [channel.port2]);
}

Now Worker A is done with its port - it wants to close the port. But it can't safely do so until it knows that Window W has forwarded the port to Worker B, so it needs to build in some kind of ack mechanism to know when it's safe to close the port. Even worse, what if Worker A wants to shut down - it can't safely shut down until it knows that its message has been delivered, because the port would get closed when the owner closes. Since the port still acts as a task source even when it is closed, there seems to be no reason not to allow passing unentangled ports around - it's a reasonable way to represent a set of messages.
And if you think about it, there's no reason why this is allowed:

postMessage(msg, port);
port.close();

while this is prohibited:

port.close();
postMessage(msg, port);

Given that in both cases the port will almost certainly be closed before the message is delivered to the target. -atw
Re: [whatwg] SharedWorkers and the name parameter
That suggestion has also been floating around in some internal discussions. I'd have to objections to this approach either, although I'm not familiar enough with URL semantics to know if this is a valid use of URL fragments. -atw On Sat, Aug 15, 2009 at 5:29 PM, Jim Jewett jimjjew...@gmail.com wrote: Currently, SharedWorkers accept both a url parameter and a name parameter - the purpose is to let pages run multiple SharedWorkers using the same script resource without having to load separate resources from the server. [ request that name be scoped to the URL, rather than the entire origin, because not all parts of example.com can easily co-ordinate.] Would there be a problem with using URL fragments to distinguish the workers? Instead of: new SharedWorker("url.js", "name"); Use new SharedWorker("url.js#name"); and if you want a duplicate, call it new SharedWorker("url.js#name2"); The normal semantics of fragments should prevent the repeated server fetch. -jJ
Re: [whatwg] SharedWorkers and the name parameter
On Sun, Aug 16, 2009 at 12:51 PM, Michael Nordman micha...@google.comwrote: I'd have to objections to this Did you mean to say i'd have no objectsion to this? Yes, I have *no* objections to either approach. Apparently the coffee hadn't quite hit my fingers yet. -atw
[whatwg] SharedWorkers and the name parameter
Currently, SharedWorkers accept both a url parameter and a name parameter - the purpose is to let pages run multiple SharedWorkers using the same script resource without having to load separate resources from the server. Per section 4.8.3 of the SharedWorkers spec, if a page loads a shared worker with a url and name, it is illegal for any other page under the same origin to load a worker with the same name but a different URL -- the SharedWorker name becomes essentially a shared global namespace across all pages in a single origin. This causes problems when you have multiple pages under the same domain (a la geocities.com) - the pages all need to coordinate in their use of name. Additionally, a typo in one page (i.e. invoking SharedWorker("mypagescript?", "name") instead of SharedWorker("mypagescript", "name")) will keep all subsequent pages in that domain from loading a worker under that name so long as the original page resides in the page cache. I'd* like to propose changing the spec so that the name is not associated with the origin, but instead with the URL itself.
So if a page wanted to have multiple instances of a SharedWorker using the same URL, it could do this:

new SharedWorker("url.js", "name");
new SharedWorker("url.js", "name2");

Nothing would prevent a page from also doing this, however:

new SharedWorker("other_url.js", "name");

So step 4 in section 4.8.3 would change from this: If there exists a SharedWorkerGlobalScope #sharedworkerglobalscope object whose closing #dom-workerglobalscope-closing flag is false, whose name attribute is exactly equal to the name argument, and whose location #dom-workerglobalscope-location attribute represents an absolute URL that has the same origin as the resulting absolute URL, then run these substeps: to this: If there exists a SharedWorkerGlobalScope #sharedworkerglobalscope object whose closing #dom-workerglobalscope-closing flag is false, whose name attribute is exactly equal to the name argument, and whose location #dom-workerglobalscope-location attribute represents an absolute URL that exactly matches the resulting absolute URL, then run these substeps: The downside of this change is pages might inadvertently create a second instance of a SharedWorker if they inadvertently use the wrong URL. It seems like this is an acceptable tradeoff given the problems described above. What do people think of this? -atw * Thanks to Darin Adler for suggesting this solution
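The behavioral difference between the two lookup rules can be modeled with a plain map. This is an illustrative sketch only: the function names, keys, and error message are assumptions for demonstration, not spec text.

```javascript
// Model of shared-worker lookup. Current spec: workers are keyed by
// (origin, name), and a same-name request with a different URL is an error.
// Proposal: workers are keyed by (url, name), so the conflict disappears.
const byOriginName = new Map();
const byUrlName = new Map();

function getSharedWorkerCurrent(origin, name, url) {
  const key = origin + "|" + name;
  const existing = byOriginName.get(key);
  if (existing && existing.url !== url) {
    // A typo'd URL under the same origin+name is a hard error today.
    throw new Error("existing worker with this name has a different URL");
  }
  if (!existing) byOriginName.set(key, { url });
  return byOriginName.get(key);
}

function getSharedWorkerProposed(url, name) {
  const key = url + "|" + name;
  if (!byUrlName.has(key)) byUrlName.set(key, { url });
  // Same name with a different URL simply yields a separate worker.
  return byUrlName.get(key);
}
```

Under the current rule, getSharedWorkerCurrent("https://example.com", "name", "other_url.js") throws once "name" is taken by "url.js" anywhere in the origin; under the proposed rule, the same two calls just produce two independent workers, which is the tradeoff the email describes.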
Re: [whatwg] Installed Apps
On Thu, Aug 13, 2009 at 4:07 AM, Ian Hickson i...@hixie.ch wrote: Sure, although I'd say that persistent storage is addressed by the Web Storage and Web Database features. Shared state is also addressed, but that's not the primary goal. If I have a tree of objects that I'd like to share between two pages, telling me to serialize it into name/value string pairs, write it into Web Storage, and then have the remote side read it out is not a satisfying (or performant) solution. Web Storage supports structured data now. Yeah, the fact that the UA will automatically jsonify my (cycle-free) data structures does not really make this a great solution, for many of the reasons Mike Wilson mentioned. That said, once you've architected your application around having only asynchronous access to your data structures, there are lots of tools available in HTML5 to do sharing (use WebStorage as you describe, push all data access through a SharedWorker, keep duplicate copies of data structures in each page and update them via DB or SharedWorker messages, etc). A system that displays rich/scripted content on server demand rather than on user demand is a massive security nightmare. It turns a scripting security bug and an XSS bug into an instant malware deployment vector. Another name for a system that displays rich/scripted content on server demand is an open web page :) The only difference is the user has UI to close a web page when he's done interacting with it, while the UI to enable/disable notifications from a domain is probably less obvious. Scriptable notifications are a use case that none of these proposals currently satisfy. I understand the security concerns. I just don't (yet :) share the general belief that they are insurmountable which is why I want to continue experimenting in this area. Additionally, any server-side-feed-based solution has the implication that it won't work for offline apps. 
If I am using a web calendar, I want my event notifications regardless of whether I'm online or offline (the event may have been added while I'm offline and never synced to the server at all). I think on the long term we may wish to consider adding a feature to queue up POSTs for when the UA finds it is back online. That would address a number of problems, including this one. I'll just note that to get a narrow subset of the behavior that simple background scripting would provide (static notifications and static data synchronization without client-side reconciliation), we're having to have: 1) A server-controlled push notification stream, as well as infrastructure for applications to insert/remove notifications into the stream for offline use. 2) Some kind of server database push-sync protocol. 3) Some kind of queued up posts feature (with assumedly yet more infrastructure to deal with errors/return values from these delayed POSTs). What you really want for #2/#3 is a general-purpose sync protocol, but I don't see how you do it without some form of client-side conflict resolution. I hope that people understand why application scripting seems like a more attractive, general-purpose solution. I'm unable to muster much enthusiasm for a set of convoluted server-and-client-side mechanisms that cover such a narrow set of use cases without any way for client applications to customize this behavior through scripting. I really don't feel right allowing script to run like that. Why can't the server send the data to the client in a near-final form and let the script figure it out when the user finally opens the app? What if there are things the application wants to do to act upon this data immediately (add items to the notification stream, for example)? What you're saying is we need to push all of this functionality up to the server, then provide a very narrow set of APIs (essentially, static notifications) that the server can use to act on that functionality. 
What other use cases are there? Those were the ones given. We're very much use-case-driven here. I won't claim to understand all of the potential use cases yet, but I have a preference for general-purpose solutions rather than solutions that narrowly target a set of specific use cases, although I recognize that more general-purpose solutions have commensurate security implications. I'd like to just experiment with background scripting/scriptable notifications in a way that people find acceptable (either without persistence, or using extensions for persistence), see how applications actually use them, then continue this conversation. People are certainly welcome to do parallel experimentation with other approaches such as the ones you've outlined above. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Installed Apps
On Thu, Aug 13, 2009 at 5:13 AM, Mike Wilson mike...@hotmail.com wrote: Maybe I'm mistaken, but I think Drew wanted handling of live objects, where each window gets access to the same concrete objects (possibly protected by private proxy objects) so the graph can be walked without cloning. To be honest, I'm not really a good spokesperson for this issue, as most of my thinking has been around shared workers which have all the same drawbacks for data sharing that WebStorage has. I was just saying that I understand the problem that the shared context is trying to address. I personally think that part of the problem can be overcome using the existing tools, although with more effort on the app side than a simple shared context solution. Drew: are you thinking that the same object graph also makes up the data cache between sessions? If not, then persistence is not a must-have for this use case so the area of ideas could well expand outside webstorage. I think ideally serialization would happen only when data needs to be persisted. If I have a data structure that I want to share with other open windows, I shouldn't have to persist it to accomplish this, and I certainly shouldn't have to re-serialize it every time I want to make a minor change. But, again, I'm just speaking in the abstract - the folks proposing shared context (Dmitry) should probably chime in here as they've thought about this problem much more than I have. -atw
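To make the serialization objection concrete, here is a small sketch of what sharing a structure through a string-valued store implies (a `Map` stands in for Web Storage here; this is illustrative, not a real API):

```javascript
// Illustration of the cost described above: sharing a live object graph
// through a string-based store (Web Storage style) forces a full
// serialize/deserialize on every change. A Map stands in for localStorage.
const store = new Map(); // string keys -> string values, like Web Storage

const inbox = {
  threads: [
    { id: 1, subject: "Design review", unread: true },
    { id: 2, subject: "Lunch?", unread: false },
  ],
};

// "Sharing" the structure means serializing the whole graph...
store.set("inbox", JSON.stringify(inbox));

// ...and every reader gets a disconnected copy, not the live object.
const copy = JSON.parse(store.get("inbox"));
copy.threads[0].unread = false;

// The original is untouched; to publish even this one-bit change, the
// whole tree has to be re-serialized and written back.
const originalStillUnread = inbox.threads[0].unread; // still true
```

This is the round trip a minor change pays today, which is why a shared live graph (however it is protected) is attractive for this use case.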
Re: [whatwg] Installed Apps
2009/8/7 Michael Kozakewich mkozakew...@icosidodecahedron.com

TO SUMMARIZE:
- There are many other existing ways to notify
- I'd suggest browsers have a Notification process with which open tabs register.
- Registered open tabs could tell the browser to pop up a notification, perhaps with title text, body text, and image
- Clicking the notification would set focus to the browser and to the notifying tab.

To solve the lifetime issue:
- Torn-off tabs run in separate processes
- Processes may be re-skinned to appear as applications, but are really tabs.
- Minimized/docked processes are taken off the taskbar/dock and into a notification area or Application Manager
- If the rest of the browser is closed, the main process will stay on until the application tabs are closed
- Browser's 'Application Manager' in notification area or taskbar/dock (and as a button in the main browser) holds all open application tabs

Have I forgotten anything? Even without the application tabs, the notification process would be great to implement.

One of the reasons I'm a proponent of running application script is that the track record of one-size-fits-all, generic push solutions at fulfilling a broad range of use cases is fairly poor. If we are going to have the browser manage an incoming feed of push notifications, then I'd say at the very least we should allow web apps to register script handlers for those notifications, rather than try to build in static browser behavior. One could put limits on the execution of this script (perhaps only allow it to run for a limited period of time, for example), and provide UX to the end user such that there is transparency about the script running in the background of their browser. I agree that the UX and security issues around always-on persistent script are formidable, but I still think it's valuable to experiment in this area to see whether they can be overcome. 
Regardless, I think it's premature to latch onto a single potential use case for persistent script (static text + icon calendar notifications) and start building extensive alternative mechanisms to satisfy that case. I think the takeaway from this thread is the recommendation that vendors could experiment in this area through the extension mechanism. I think once we've had a chance to build some app functionality on top of this, we'll have a better idea of the real-life use cases, and it would then be appropriate to see how/if those use cases could be satisfied in an alternate manner. -atw
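For what it's worth, the "register script handlers for notifications" idea might take a shape like the following dispatch model. Every name here (`registerNotificationHandler`, `dispatchNotification`) is hypothetical, not a real or proposed API; this only sketches the control flow:

```javascript
// Hypothetical sketch of app-registered script handlers for push
// notifications, instead of fixed text+icon behavior. None of these
// names are a real or proposed API.
const handlers = new Map();

function registerNotificationHandler(type, fn) {
  handlers.set(type, fn);
}

// Called by the (hypothetical) UA when a push notification arrives.
// The handler runs briefly and returns what the UA should display;
// the UA could enforce a time limit on this script, as suggested above.
function dispatchNotification(type, payload) {
  const fn = handlers.get(type);
  if (!fn) return { title: type, body: "" }; // fall back to static display
  return fn(payload);
}

// An app could compute the displayed text from local state:
registerNotificationHandler("new-mail", (payload) => ({
  title: "New mail",
  body: payload.count + " unread message(s)",
}));

const shown = dispatchNotification("new-mail", { count: 3 });
```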
[whatwg] MessagePorts and message delivery ordering
I was writing some unit tests for SharedWorkers, and I saw some behavior that seems to be spec compliant, but which was counter-intuitive from a developer standpoint. Let's say that you have two message ports - some other window or a shared worker owns the other end of those ports. You then do this: port1.postMessage("foo"); port2.postMessage("bar"); At the other end, the order of the delivery of these messages is entirely arbitrary (could be foo-bar, or bar-foo) even though those ports share the same owner. This is because each port is an individual task source, and ordering within a given task source is guaranteed by the spec, but ordering across task sources is intentionally not guaranteed (this allows UAs to prioritize task sources). Anyhow, just thought I'd point it out, and make sure there isn't something in the spec that I missed that should affect delivery ordering in this case. -atw
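One way to picture the rule: model each port as its own FIFO task source and let the "UA" pick which source to service next. Per-source order is preserved; cross-source order is not. This is a deterministic simulation of the spec's model, not the real event loop:

```javascript
// Deterministic model of the ordering rules above: each port is its own
// task source (a FIFO queue), and the scheduler is free to interleave
// task sources. This is a simulation, not the real HTML event loop.
function deliver(sources, pickOrder) {
  // pickOrder is the sequence of source indices the "UA" chooses to service.
  const queues = sources.map((q) => q.slice()); // don't mutate the inputs
  const delivered = [];
  for (const i of pickOrder) {
    if (queues[i].length > 0) delivered.push(queues[i].shift());
  }
  return delivered;
}

const port1 = ["foo"]; // port1.postMessage("foo")
const port2 = ["bar"]; // port2.postMessage("bar")

// Both interleavings are spec-compliant, because each port is a
// separate task source:
const orderA = deliver([port1, port2], [0, 1]); // foo, then bar
const orderB = deliver([port1, port2], [1, 0]); // bar, then foo
```

Within a single port, though, FIFO order is guaranteed: `deliver([["a", "b"]], [0, 0])` can only yield `["a", "b"]`.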
Re: [whatwg] Installed Apps
On Mon, Aug 3, 2009 at 8:58 PM, Ian Hickson i...@hixie.ch wrote: It seems like a direct solution to these would be to have a way for Web pages to expose a service that clients could subscribe to, which could send up database updates and notifications. That way, no script runs on the client, but the server can still update the client whenever needed. Yeah, this is an interesting idea, although I'm envisioning apps that support offline use requiring what amounts to a sync process with app-specific conflict resolution, etc - I think this would be difficult to support with this kind of general platform. But, agreed, this is better than nothing - at the very least, an application could use it to sync down changes to a staging area for later resolution once the app itself starts up. I don't understand why workers can't reduce latency. What is the latency that must be reduced which workers can't help with? As I understand it, the big problems are 1) loading/parsing/executing N-hundred K of javascript for an app like gmail is slow, 2) initializing application data structures from data that must be sent down from the server (or loaded from the database) is also slow (and by slow, we may only be talking on the order of hundreds of milliseconds). Workers don't do much for either of these, especially when you are launching a web app for the first time. SharedWorkers are overloaded to provide a way for pages under the same domain to share state, but this seems like an orthogonal goal to parallel execution and I suspect that we may have ended up with a cleaner solution had we decided to address the shared state issue via a separate mechanism. Shared state is addressed by the Web Storage and Web Database features; the shared workers are intended to provide shared computation. Sure, although I'd say that persistent storage is addressed by the Web Storage and Web Database features. Shared state is also addressed, but that's not the primary goal. 
If I have a tree of objects that I'd like to share between two pages, telling me to serialize it into name/value string pairs, write it into Web Storage, and then have the remote side read it out is not a satisfying (or performant) solution. So (and forgive me for restating), it seems like hidden page addresses the following problems that gmail and other large web apps are having: 1) Loading large amounts of Javascript is slow, even from cache. The solution is to make loading the JS faster, not to load it and use the user's resources regardless. I agree. I don't think that forcing pages to stay resident to reduce JS load times is a good solution. An ideal solution would be for browsers to be inherently faster, to enable this for the entire web, not just for installed apps. 2) Loading application state from the database is slow. I don't see how a hidden window can solve this. Could you elaborate? The data would be always cached in memory and shared across instances. But, yeah, Netscape 6, etc. My expectation for persistent workers was that these workers would try to have a minimal ram footprint, for exactly that reason (used primarily for generating notifications and keeping the data store in sync with the server, not as a persistent memory cache). 4) There's no way to do things like new mail notifications, calendar notifications, local updates of your email inbox, etc when the browser is not open. It seems a system dedicated to this could solve this problem in a much simpler and more efficient way than running background windows. Yeah, I'm somewhat leery of the canned RSS-feed-style solution to notifications (our vision for notifications is that they are scriptable and more interactive than just a dumb text + icon). But it's possible that a simple static feed may cover some portion of the use cases - hopefully we'll have more real-world use cases once the webkit notification API ships in a browser. 
Additionally, any server-side-feed-based solution has the implication that it won't work for offline apps. If I am using a web calendar, I want my event notifications regardless of whether I'm online or offline (the event may have been added while I'm offline and never synced to the server at all). -atw
Re: [whatwg] Installed Apps
On Tue, Aug 4, 2009 at 10:47 AM, Jeremy Orlow jor...@chromium.org wrote: Which use case is this related to? If the shared worker is creating UI elements for the page, then composing HTML and sticking it into a div's .innerHTML is actually (sadly) the fastest way to go at the moment. Besides that, I can't think of why you'd have some huge tree of information for the gmail use case. OK, imagine your inbox. It contains a set of emails organized into threads, with various attributes and tags associated with them. Imagine your contacts, which has a set of individual contact entities, with chat status information updated dynamically, as well as group information for subsets of those contacts. I mean, look at the internals of any modern web app - they have data structures of a similar complexity to traditional apps. They aren't just conduits for HTML. Yeah, I'm somewhat leery of the canned RSS-feed-style solution to notifications (our vision for notifications is that they are scriptable and more interactive than just a dumb text + icon). What if the notification could have embedded links? If you make them too powerful, you'll definitely see spam/ads/phishing/etc showing up in them. For spam/ads, I think the problem is identical whether the ad is static or not static - the user has to enable notifications from some source, and when something shows up the user needs to have a way to block it if it's inappropriate. Agreed that scripting potentially enables some phishing exploits, depending on how the user is able to interact with the notification. If the notification can pop up and say Your gmail credentials have expired - please enter them here: and allow you to type into the notification, then that's a potential phishing exploit. But a scriptable notification with restricted user interaction (i.e. no keyboard input allowed) would seem to be no more phish-able than a static notification. -atw
Re: [whatwg] Web Workers and postMessage(): Questions
On Mon, Aug 3, 2009 at 10:34 AM, Daniel Gredler daniel.gred...@gmail.com wrote: I know Anne VK (Opera) and ROC (Mozilla) appear to read this list... any comments, guys? Should I just file bugs? Any Safari / Chrome / IE guys out there with comments? I've often had the same thought (that there's no reason we shouldn't handle cycles when implementing structured clones). That said, I'm compelled to point out that WebKit browsers only support string messages currently (they don't yet implement the structured clone part of the spec). And none of the currently shipping browsers support MessagePorts or SharedWorkers (although WebKit browsers are getting these soon). So given that there's a workaround for the lack of support for cycles in structured clones (applications can do their own serialization) but there's no app-level workaround for the lack of SharedWorkers, I'd rather see vendors concentrate on implementing the current spec before adding greater support for cloning message parameters. I agree that once you've made the decision to not clone functions, cloning the prototype chain becomes (nearly?) useless. However, I'd be interested to know the rationale behind this decision, since Web Workers appear to follow the same-origin policy (e.g. If the origin of the resulting absolute URL is not the same as the origin of the script that invoked the constructor, then throw a security exception, etc). I assume there's a security concern lurking somewhere? It's not clear to me how you'd clone the lexical scope of a function and carry it over to the worker in a way that doesn't cause synchronization issues. Case in point: var obj = {}; var foo = "abc"; obj.bar = function() { foo = "def"; }; sharedWorker.port.postMessage(obj); Now, from shared worker scope, you have the ability to directly access the variable foo from a different thread/process, which is not really implementable. -atw
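Here is a runnable version of the snippet above (with the postMessage call omitted), showing why the function's lexical scope is the hard part: the closure writes to a variable in the sending context's scope, which a cloned copy running in a worker could not safely do.

```javascript
// Why cloning a function's lexical scope across threads is problematic:
// obj.bar closes over `foo`, so calling a clone of it in a worker would
// have to mutate a variable that lives in the sending page's scope.
var foo = "abc";
var obj = {};
obj.bar = function () {
  foo = "def"; // writes to the *enclosing* scope, not a local variable
};

// Structured clone refuses to clone functions at all. If it did, a
// worker calling obj.bar() would need synchronized shared access to foo:
obj.bar();
var fooAfterCall = foo; // the outer variable was mutated
```

This is the synchronization issue in miniature: two threads observing and mutating `foo` would need shared memory and locking, which the postMessage model deliberately avoids.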
Re: [whatwg] Installed Apps
I think the error here is viewing this as a UX issue - if it were just a UX issue, then the responses from people would be along the lines of Oh, this sounds dangerous - make sure you wrap it with the same permissions UI that we have for extensions, plugins, and binary downloads. The realization I came to this morning is that the core of the objections are not primarily about protecting users (although this is one goal), but more about protecting the current secure web browsing model (Linus explicitly said this yesterday in his email to the list, but I only got it when thinking about it today). This is why people are OK with supporting this via extensions but not OK with supporting this as part of the core HTML APIs even if the UX was exactly the same. It's more about keeping the model pristine. Doing crazy stuff in extensions and plugins is OK because they are viewed as falling outside the model (they are just random scary things that user agents choose to do that don't conform to the specification). So arguing but it's the same UI either way! is not going to convince anyone. -atw On Thu, Jul 30, 2009 at 12:51 PM, Dmitry Titov dim...@google.com wrote: It seems the biggest concern in this discussion is around BotNet Construction Kit as Maciej succinctly called it, or an ability to run full-powered platform API persistently in the background, w/o a visible 'page' in some window. This concern is clear. But what could be a direction to the solution? Assuming one of the goals for html5 is reducing a gap in capabilities between web apps and native apps, how do we move forward with more powerful APIs? So far, multiple ways exist to gain access to the user's machine - nearly all of them based on some dialog that asks the user to make an impossible decision - as bad as it is, binary downloads, plugins, browser extensions, ActiveX controls or Gears modules are all but a dialog away from the user's computer. 
Basically, if malicious dudes are willing to write native apps, they can have their botnet relatively easily. The ongoing fight with malware and viruses will continue - not because the platforms have the wrong API, but because it's really hard to give power to the apps and not to the malware, since they, in essence, do very similar things. As controversial as it sounds, it might be that if a web platform API can't be used to write a botnet, then it can't be used to write a wide class of powerful applications as well :-) I don't have a botnet example, but when Safari 4 visits the sites in the background (to keep the 'new tab page' site snapshots up-to-date) w/o ever asking my permission - it looks a bit scary, because I'm not sure I want it to visit websites at random times from my IP with I don't know what cookies and then snapshot the result in jpg and store it locally... But I sort of like the feature anyways. Now, how can I make a web app that does this? Some sort of background shared page could be handy. It can pop up the same dialog when installed and live in an Applications folder, but it should be possible. Now if we make it possible, would it be possible to write a botnet on top of the API? Of course! Same exact way as it's possible to write even better botnet on OSX API in which Safari is written. Now, what if I want the same feature but implemented not as a native app, but as a web app? We would need to give it specific rights locally, and make the process transparent - not only on 'install' time but when it runs too - so the user could peek into some 'task manager' and clearly see if such thing is running. Browser could periodically download 'malware lists' and kill those web apps that are in it. 
But for now, it should be ok to have it 'installed' with a specific browser dialog that asks the user to make a decision the user may not understand - it is not the ideal way but it is the common way today, users know they are asked these questions, admins and IT teach users what to do when asked, so it's the best we can do now. Having a 'task manager' (as in Chrome) reflecting those things is good too. Btw, if it can only do window.open() on the url from the same domain, then if it's from Gmail then it can't be used or hijacked. If it is from a site that says install this and I'll show you a pretty picture and the user clicks through a dialog, I'd say it's not a new vector for malware. Dmitry On Thu, Jul 30, 2009 at 10:26 AM, Michael Davidson m...@google.com wrote: On Wed, Jul 29, 2009 at 5:38 PM, Maciej Stachowiak m...@apple.com wrote: * Notification Feeds * Often, web applications would like to give users the option to subscribe to notifications that occur at specific times or in response to server-side events, and for the user to get these UI notifications without a prerequisite that the web app is open or that the browser is running. There may be a desire to do client-side computation as well, but often just the ability to give the user a
Re: [whatwg] Installed Apps
On Wed, Jul 29, 2009 at 7:19 AM, Tab Atkins Jr. jackalm...@gmail.com wrote: Firefox's Jetpack addon (essentially Greasemonkey turned up to 11) exposes a super-convenient jetpack.notifications.show() function for doing exactly that. It pops up an attractive notification in the lower right-hand corner of the screen for a few seconds with a custom title, text, and icon. I'd like to have something like this as a general feature. ~TJ Something similar to this is in the works for WebKit as well: https://bugs.webkit.org/show_bug.cgi?id=25463 It's experimental and so would be exposed via window.webkitNotifications. In addition to the text+icon functionality, this also enables scripted HTML notifications, to allow things like notifications that display the number of unread emails, a dynamic countdown for an event reminder, etc. It sounds like enough people are prototyping in this area (us, mozilla jetpack, Palm) that we might get consensus on a general API at some point. -atw
Re: [whatwg] Installed Apps
On Wed, Jul 29, 2009 at 6:32 AM, Michael Kozakewich mkozakew...@icosidodecahedron.com wrote: It sounds like the hidden page idea is just the solution you thought up to the problem of keeping a page running. How many other reasons are there for it? Not sure what other motivations there may be, but one shouldn't underestimate the value of keeping a page running. It's one of the major differences between desktop and web apps. - Data up-to-date: Even Outlook checks online every X minutes, and has an options panel where you can set that value. Google Reader checks for new feeds, for me, *if I just leave it open on my desktop.* It works great. Exactly - but you have to leave it open on your desktop. I can't tell you how many meetings I've missed because I've inadvertently closed (or crashed :) my browser, and forgotten to start up my web calendar when I restart. What I'd like, as a user, is some way to pin selected apps to run in the background - whether that's something I initiate through the UI myself, or via a prompt from the application is really a matter of UX. -- Notifications: I don't think I've ever had Outlook notify me of new mail when it's not running. It usually starts up with Windows, and it runs in the background. If you turn it off from the tray, it stops. The way I've envisioned any of these persistent running workers/pages operating is the browser would have a status bar icon which would allow background apps to display status, and also give the user the opportunity to exit the browser or (possibly) close down individual apps. So it's a very similar situation. If browsers could tear off tabs, minimize them to tray and allow them to send pop-up notifications, I think it would solve your main problem. Chrome seems to be halfway there, with the Create Application Shortcuts... option, but I believe only Chrome and Firefox support tear-away tabs. This sounds largely like a browser issue. If Chrome does it first, I'm sure the others will see and follow along. 
Agreed. I like this way of looking at the issue - framed in this manner, it highlights this as primarily a UX challenge (how to present the idea of 'pinned' tabs to the user).
Re: [whatwg] Security risks of persistent background content (Re: Installed Apps)
Maciej, thanks for sending this out. These are great points - I have a few responses below. The main thrust of your argument seems to be that allowing web applications to run persistently opens us up to some of the same vulnerabilities that native (desktop and mobile) apps have, and I agree with that. The question (as with native apps) is whether we can mitigate those vulnerabilities, and whether the functionality that persistence provides is worth the larger attack surface. On Tue, Jul 28, 2009 at 10:58 PM, Maciej Stachowiak m...@apple.com wrote: On Jul 28, 2009, at 10:01 AM, Drew Wilson wrote: I've been kicking around some ideas in this area. One thing you could do with persistent workers is restrict network access to the domain of that worker if you were concerned about botnets. That doesn't address the I installed something in my browser and now it's constantly sucking up my CPU issue, but that makes us no different than Flash :-P Here's some security risks I've thought about, for persistent workers and persistent background pages: 1) If they have general-purpose network access, they are a tool to build a DDOS botnet, or a botnet to launch attacks against vulnerable servers. Indeed. There are mitigations against this (basically, leveraging some of the same infrastructure we have in place to warn users of malware), although not all browsers have this protection currently. But, yes, this (intentionally) makes the browser more similar to the desktop environment, and so more vulnerable to desktop-style attacks. 2) If they do not have general-purpose network access, this can be worked around with DNS rebinding. Note that ordinarily, DNS rebinding is only considered a risk for content protected by network position. But in the case of a DDOS or attempt to hunt for server vulnerabilities, this doesn't matter - the attack doesn't depend on the DDOS node sending credentials. That's an interesting point. 
Basically, once I've gotten a farm of people to install persistent workers, I can just rebind my domain to any arbitrary IP address, and now that domain could get a flood of HTTP connections. 3) If they have notification capabilities, they can be used for advertising spam. Yes, although the point of notifications is that 1) they are opt-in and 2) they are easy to opt out of (there's a block button on the notification). So I don't know that this is a real issue - the point of notifications is that it's really easy to undo your decision to grant access. I'd say that rather than this being a security issue, it's a UX issue to make sure that users have a way to get rid of annoying notifications easily and permanently. 4) If they have general network access only while a page from the same domain is displayed, then they can use a misleading notification to trick the user into going to a page on that domain, to gain general network access at the moment it's needed. Good point, although I don't think this would be an acceptable restriction anyway. One of the whole points behind persistent workers is that they can keep a local data cache up-to-date (e.g. a list of upcoming calendar events) regardless of whether a page is open. 5) Even if they only have same-domain network access, they can be used to create a botnet for computation - for example for purposes like distributed password cracking. Agreed. Once you have your software running on many machines, there are many things you could do with those cycles. Attackers probably won't be folding proteins :) 6) They can be used to greatly extend the window of vulnerability from visiting a malicious site once. Consider the model where a browser patches a security vulnerability, and users apply the patch over some period after it's released. Assuming the vulnerability wasn't already known to attackers, users are at risk if they visit a malicious site in the period between release of the patch and install of the patch. 
But with persistent workers (or background pages) in the picture, users can be vulnerable if they have *ever* visited a malicious site - because it could have installed a persistent worker that periodically phones home for exploit code to try. This can greatly increase the number of people who can be affected by a malicious web page, and therefore greatly increases the incentive to try such a thing. This works even with just same-domain network access. I think this risk is really serious because it makes every future browser vulnerability much more dangerous. Agreed that this is a big deal, and is a problem I hadn't considered previously. I would assume that browser malware detection would blacklist these sites, but I hate to lean on some magical malware detection infrastructure too heavily. This seems like an issue that Apple and Microsoft have dealt with for years in their OS offerings - how do they handle this? This list isn't necessarily exhaustive, I'm sure there's more risks I haven't thought of.
Re: [whatwg] Installed Apps
My understanding (when I looked at Prism a while back) was that it was essentially no different than a desktop shortcut that ran the page in a separate profile. Has this changed? -atw On Wed, Jul 29, 2009 at 10:21 AM, timeless timel...@gmail.com wrote: On Wed, Jul 29, 2009 at 7:56 PM, Drew Wilson atwil...@google.com wrote: What I'd like, as a user, is some way to pin selected apps to run in the background - whether that's something I initiate through the UI myself, or via a prompt from the application is really a matter of UX. in my book, you're definitely asking for prism. http://labs.mozilla.com/projects/prism/ https://wiki.mozilla.org/prism and here's a prism link for google calendar: http://starkravingfinkle.org/projects/webrunner/gcalendar.webapp there should and will be more documentation about how these bundles are exposed.
Re: [whatwg] Security risks of persistent background content (Re: Installed Apps)
I'd agree with #1, for some given value of safe - we've all heard tales of search engines inadvertently deleting data on people's sites by following links. Note that web storage violates #2 and #3 (and even cookies could be viewed as a violation of #2, depending on how broadly you view caches). But I would agree that what you've mentioned below are characteristics of traditional web browsing. If we went back in time several years, we might have added in some restrictions about how data is only posted to the server in response to explicit user activity/form submittal. I think we should be open to the possibility that the characteristics of web browsing today are not necessarily inherent to the web browsing experience, and may change over time. Should web browsing in 2020 look like web browsing in 2010? Will web pages still be restricted to a sandbox with a close button? It seems like the tenets below are quite appropriate for the browser as content-delivery platform. But they are already starting to change for browser as application platform. The challenge is to balance the safety of a content-delivery platform while still giving applications the power they need. -atw On Wed, Jul 29, 2009 at 10:48 AM, Linus Upson li...@google.com wrote: This is a good analysis. I agree that it is important for the web to maintain some important properties that are in conflict with persistent background processing: 1. All links are safe to click 2. When a page is closed, the only artifacts left behind are items in various caches 3. The user agent is free to evict items from its various caches at any time For apps that desire capabilities that are not safe and stateless I like your suggestion to use the browser's extension mechanism (or runtimes such as prism or air). Those services usually involve some combination of multiple affirmative steps, vetting, reputation and revocation. 
Linus On Tue, Jul 28, 2009 at 10:58 PM, Maciej Stachowiak m...@apple.com wrote: On Jul 28, 2009, at 10:01 AM, Drew Wilson wrote: I've been kicking around some ideas in this area. One thing you could do with persistent workers is restrict network access to the domain of that worker if you were concerned about botnets. That doesn't address the I installed something in my browser and now it's constantly sucking up my CPU issue, but that makes us no different than Flash :-P Here's some security risks I've thought about, for persistent workers and persistent background pages: 1) If they have general-purpose network access, they are a tool to build a DDOS botnet, or a botnet to launch attacks against vulnerable servers. 2) If they do not have general-purpose network access, this can be worked around with DNS rebinding. Note that ordinarily, DNS rebinding is only considered a risk for content protected by network position. But in the case of a DDOS or attempt to hunt for server vulnerabilities, this doesn't matter - the attack doesn't depend on the DDOS node sending credentials. 3) If they have notification capabilities, they can be used for advertising spam. 4) If they have general network access only while a page from the same domain is displayed, then they can use a misleading notification to trick the user into going to a page on that domain, to gain general network access at the moment it's needed. 5) Even if they only have same-domain network access, they can be used to create a botnet for computation - for example for purposes like distributed password cracking. 6) They can be used to greatly extend the window of vulnerability from visiting a malicious site once. Consider the model where a browser patches a security vulnerability, and users apply the patch over some period after it's released. 
Assuming the vulnerability wasn't already known to attackers, users are at risk if they visit a malicious site in the period between release of the patch and install of the patch. But with persistent workers (or background pages) in the picture, users can be vulnerable if they have *ever* visited a malicious site - because it could have installed a persistent worker that periodically phones home for exploit code to try. This can greatly increase the number of people who can be affected by a malicious web page, and therefore greatly increases the incentive to try such a thing. This works even with just same-domain network access. I think this risk is really serious because it makes every future browser vulnerability much more dangerous. 7) Even with only same-domain network access, the persistent worker could periodically phone home to allow tracking of the user by IP, which can be mapped to an approximate physical location. Normally, a page you don't have open can't do that to you. This list isn't necessarily exhaustive, I'm sure there's more risks I haven't thought of, but note that most of these problems are not resolved by limiting networking to same-domain. I don't
Re: [whatwg] Issues with Web Sockets API
On Wed, Jul 29, 2009 at 1:33 AM, Ian Hickson i...@hixie.ch wrote: Yes. But that's the case anyway -- events are asynchronous, so consider the case of receiving two messages. Both are queued up, then eventually the first is dispatched. If in response to that you close the connection, that doesn't stop the second being dispatched, since it was already queued up. I'd note that this conforms to the behavior of MessagePorts - close disentangles the ports, but already-received/queued messages are still delivered.
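To make the close-then-deliver ordering above concrete, here is a minimal sketch (a hypothetical mock, not the real WebSocket or MessagePort machinery): close() stops *new* messages from being queued, but events already in the queue are still dispatched.

```javascript
// Hypothetical mock illustrating the semantics described above: close()
// prevents new messages from being queued, but does not clear events
// that were already queued before the close.
class QueuedChannel {
  constructor() {
    this.queue = [];
    this.closed = false;
    this.onmessage = null;
  }
  receive(data) { // network side enqueues arriving messages
    if (!this.closed) this.queue.push(data);
  }
  close() {
    this.closed = true; // does not clear already-queued events
  }
  dispatchPending() { // event loop drains the queue
    while (this.queue.length > 0) {
      const data = this.queue.shift();
      if (this.onmessage) this.onmessage({ data });
    }
  }
}

const ch = new QueuedChannel();
ch.receive('first');
ch.receive('second');
const seen = [];
ch.onmessage = (e) => {
  seen.push(e.data);
  if (e.data === 'first') ch.close(); // too late: 'second' is already queued
};
ch.dispatchPending();
// seen ends up as ['first', 'second'] - the second message still arrives
```
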
Re: [whatwg] Installed Apps
I've been kicking around some ideas in this area. One thing you could do with persistent workers is restrict network access to the domain of that worker if you were concerned about botnets. That doesn't address the I installed something in my browser and now it's constantly sucking up my CPU issue, but that makes us no different than Flash :-P Anyhow, addressing some of the other comments - I don't think that there's necessarily a problem with the async worker APIs as they stand, and I don't think we can easily retrofit synchronous APIs on top of their current execution model. The issue is that the core problem that Workers solve (parallel execution in a separate context from the page) is different than the problem that large web apps are trying to address (reduced latency). SharedWorkers are overloaded to provide a way for pages under the same domain to share state, but this seems like an orthogonal goal to parallel execution and I suspect that we may have ended up with a cleaner solution had we decided to address the shared state issue via a separate mechanism. Similarly, the hidden page mechanism seems to address a bunch of issues at once, and I'm wondering if we explicitly laid out the problems it's trying to solve, whether we might find a set of distinct smaller solutions that were more generally applicable. I just don't want to make the same design choices that we did with SharedWorkers, and end up with a monolithic solution that doesn't address the individual goals (i.e. cross-page sharing) in a developer-friendly manner. So (and forgive me for restating), it seems like hidden page addresses the following problems that gmail and other large web apps are having: 1) Loading large amounts of Javascript is slow, even from cache. 2) Loading application state from the database is slow. 3) Sharing between pages requires going through the database or shared worker - you can't just party on a big shared datastructure. 
4) There's no way to do things like new mail notifications, calendar notifications, local updates of your email inbox, etc when the browser is not open. Currently, even the most brain-dead shareware desktop calendar app can display an event notification, while web-based calendars are forced to rely on the user remembering to keep a browser window open. Am I missing any other issues that hidden page is supposed to address? A persistent worker could address #4 (perhaps with some limitations on network access to address security concerns). For #1/#2/#3, are we saying that web applications *must* install themselves (with the requisite user flow) just to get fast load times? That seems unfortunate - if I don't care about #4, I'd really like to be able to get the benefits of #1/2/3 without jumping through a user install. -atw On Mon, Jul 27, 2009 at 6:39 PM, Maciej Stachowiak m...@apple.com wrote: On Jul 27, 2009, at 7:13 PM, Aryeh Gregor wrote: I'm not clear how the UI requirements here are different from persistent workers, though. Those also persist after the user navigates away, right? Persistent workers are even more of a security risk, since they are supposed to persist even after the browser has been restarted, or after the system has been rebooted. Persistent workers should be renamed to BotNet Construction Kit. Regards, Maciej
Re: [whatwg] Installed Apps
To clarify - I said that *persistent workers* could restrict x-domain network access. I didn't mean to imply that you could apply this same reasoning to hidden pages - I haven't thought about hidden pages enough to comment about the implications of that, since as you mention there are many more network access methods for hidden pages. You do have a good point, though, and that is that if hidden pages *or* persistent workers need to be able to display UI to the user (for example, to prompt the user to enter their gmail credentials when they first start up their computer), it has some implications for popup spam. -atw On Tue, Jul 28, 2009 at 10:09 AM, Aryeh Gregor simetrical+...@gmail.com wrote: On Tue, Jul 28, 2009 at 1:01 PM, Drew Wilson atwil...@google.com wrote: I've been kicking around some ideas in this area. One thing you could do with persistent workers is restrict network access to the domain of that worker if you were concerned about botnets. How would that work for background pages, though? It couldn't include any files from other domains in any form (image, script, style, etc.)? But it could still spawn a regular tab and load whatever it wanted in that. Have it spawn a popunder window, say, quickly open a bunch of things from foreign sites, and close it before the user notices anything more than a sudden odd flicker. Or whatever. Workers, if I understand right (I haven't read the spec . . .), can't do things like open new tabs, but it's been explicitly stated that these background pages should be able to do just that.
Re: [whatwg] Issues with Web Sockets API
On Mon, Jul 27, 2009 at 1:14 PM, Alexey Proskuryakov a...@webkit.org wrote: On 27.07.2009, at 12:35, Maciej Stachowiak wrote: However, I do not think that raising an exception is an appropriate answer. Often, the TCP implementation takes a part of data given to it, and asks to resubmit the rest later. So, just returning an integer result from send() would be best in my opinion. With WebSocket, another possibility is for the implementation to buffer pending data that could not yet be sent to the TCP layer, so that the client of WebSocket doesn't have to be exposed to system limitations. At that point, an exception is only needed if the implementation runs out of memory for buffering. With a system TCP implementation, the buffering would be in kernel space, which is a scarce resource, but user space memory inside the implementation is no more scarce than user space memory held by the Web application waiting to send to the WebSocket. I agree that this will help if the application sends data in burst mode, but what if it just constantly sends more than the network can transmit? It will never learn that it's misbehaving, and will just take more and more memory. I would suggest that the solution to this situation is an appropriate application-level protocol (i.e. acks) to allow the application to have no more than (say) 1MB of data outstanding. I'm just afraid that we're burdening the API to handle degenerative cases that the vast majority of users won't encounter. Specifying in the API that any arbitrary send() invocation could throw some kind of retry exception or return some kind of error code is really really cumbersome. An example where adapting to network bandwidth is needed is of course file uploading, but even if we dismiss it as a special case that can be served with custom code, there's also e.g. captured video or audio that can be downgraded in quality for slow connections. - WBR, Alexey Proskuryakov
Re: [whatwg] Issues with Web Sockets API
On Mon, Jul 27, 2009 at 1:36 PM, Alexey Proskuryakov a...@webkit.org wrote: On 27.07.2009, at 13:20, Jeremy Orlow wrote: I agree that this will help if the application sends data in burst mode, but what if it just constantly sends more than the network can transmit? It will never learn that it's misbehaving, and will just take more and more memory. An example where adapting to network bandwidth is needed is of course file uploading, but even if we dismiss it as a special case that can be served with custom code, there's also e.g. captured video or audio that can be downgraded in quality for slow connections. Maybe the right behavior is to buffer in user-space (like Maciej explained) up until a limit (left up to the UA) and then anything beyond that results in an exception. This seems like it'd handle bursty communication and would keep the failure model simple. This sounds like the best approach to me. On 27.07.2009, at 13:27, Drew Wilson wrote: I would suggest that the solution to this situation is an appropriate application-level protocol (i.e. acks) to allow the application to have no more than (say) 1MB of data outstanding. I'm just afraid that we're burdening the API to handle degenerative cases that the vast majority of users won't encounter. Specifying in the API that any arbitrary send() invocation could throw some kind of retry exception or return some kind of error code is really really cumbersome. Having a send() that doesn't return anything and doesn't raise exceptions would be a clear signal that send() just blocks until it's possible to send data to me, and I'm sure to many others, as well. There is no reason to silently drop data sent over a TCP connection - after all, we could as well base the protocol on UDP if we did, and lose nothing. There's another option besides blocking, raising an exception, and dropping data: unlimited buffering in user space. 
So I'm saying we should not put any limits on the amount of user-space buffering we're willing to do, any more than we put any limits on the amount of other types of user-space memory allocation a page can perform. - WBR, Alexey Proskuryakov
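One way to reconcile these positions without making send() throw is app-side flow control against the amount of locally buffered data. A minimal sketch, assuming a bufferedAmount attribute (bytes queued locally but not yet transmitted, which the WebSocket API does expose) and an arbitrary app-chosen cap:

```javascript
// Hedged sketch of app-side flow control. bufferedAmount (bytes queued
// but not yet transmitted) is a real WebSocket attribute; MAX_BUFFERED
// is an arbitrary app-chosen cap, not anything from the spec.
const MAX_BUFFERED = 1024 * 1024; // 1 MB

function trySend(socket, payload, onBackpressure) {
  if (socket.bufferedAmount + payload.length > MAX_BUFFERED) {
    onBackpressure(payload); // defer: retry on a timer or after the buffer drains
    return false;
  }
  socket.send(payload);
  return true;
}

// Usage with a stand-in socket object:
const sent = [];
const fakeSocket = {
  bufferedAmount: 0,
  send(p) { sent.push(p); this.bufferedAmount += p.length; },
};
trySend(fakeSocket, 'x'.repeat(100), () => {});          // fits: sent
trySend(fakeSocket, 'y'.repeat(MAX_BUFFERED), () => {}); // would overflow: deferred
```

A real deployment would also need to retry the deferred payloads, e.g. on a timer, once the buffer drains.
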
Re: [whatwg] Issues with Web Sockets API
On Mon, Jul 27, 2009 at 2:02 PM, Jeremy Orlow jor...@chromium.org wrote: On Mon, Jul 27, 2009 at 1:44 PM, Drew Wilson atwil...@google.com wrote: On Mon, Jul 27, 2009 at 1:36 PM, Alexey Proskuryakov a...@webkit.org wrote: On 27.07.2009, at 13:20, Jeremy Orlow wrote: I agree that this will help if the application sends data in burst mode, but what if it just constantly sends more than the network can transmit? It will never learn that it's misbehaving, and will just take more and more memory. An example where adapting to network bandwidth is needed is of course file uploading, but even if we dismiss it as a special case that can be served with custom code, there's also e.g. captured video or audio that can be downgraded in quality for slow connections. Maybe the right behavior is to buffer in user-space (like Maciej explained) up until a limit (left up to the UA) and then anything beyond that results in an exception. This seems like it'd handle bursty communication and would keep the failure model simple. This sounds like the best approach to me. On 27.07.2009, at 13:27, Drew Wilson wrote: I would suggest that the solution to this situation is an appropriate application-level protocol (i.e. acks) to allow the application to have no more than (say) 1MB of data outstanding. I'm just afraid that we're burdening the API to handle degenerative cases that the vast majority of users won't encounter. Specifying in the API that any arbitrary send() invocation could throw some kind of retry exception or return some kind of error code is really really cumbersome. Having a send() that doesn't return anything and doesn't raise exceptions would be a clear signal that send() just blocks until it's possible to send data to me, and I'm sure to many others, as well. There is no reason to silently drop data sent over a TCP connection - after all, we could as well base the protocol on UDP if we did, and lose nothing. 
There's another option besides blocking, raising an exception, and dropping data: unlimited buffering in user space. So I'm saying we should not put any limits on the amount of user-space buffering we're willing to do, any more than we put any limits on the amount of other types of user-space memory allocation a page can perform. I agree with Alexey that applications need feedback when they're consistently exceeding what your net connection can handle. I think an application getting an exception rather than filling up its buffer until it OOMs is a much better experience for the user and the web developer. I'm assuming that no actual limits would be specified in the specification, so it would be entirely up to a given user agent to decide how much buffering it is willing to provide. Doesn't that imply that a well-behaved web application would be forced to check for exceptions from all send() invocations, since there's no way to know a priori whether limits imposed by an application via its app-level protocol would be sufficient to stay under a given user agent's internal limits? Even worse, to be broadly deployable the app-level protocol would have to enforce the lowest-common-denominator buffering limit, which would inhibit throughput on platforms that support higher buffers. In practice, I suspect most implementations would adopt a "just blast out as much data as possible until the system throws an exception, then set a timer to retry the send in 100ms" approach. But perhaps that's your intention? If so, then I'd suggest changing the API to just have a canWrite notification like other async socket APIs provide (or something similar) to avoid the clunky catch-and-retry idiom. Personally, I think that's overkill for the vast majority of use cases, which would be more than happy with a simple send(), and I'm not sure why we're obsessing over limiting memory usage in this case when we allow pages to use arbitrary amounts of memory elsewhere. 
If you have application level ACKs (which you probably should--especially in high-throughput uses), you really shouldn't even hit the buffer limits that a UA might have in place. I don't really think that having a limit on the buffer size is a problem and that, if anything, it'll promote better application level flow control. J
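The application-level ack scheme discussed above could look roughly like this (a hedged sketch; the window size, message framing, and ack signaling are all illustrative assumptions, not anything from the WebSocket spec):

```javascript
// Hedged sketch of application-level flow control via acks. WINDOW and
// the ack callback are illustrative assumptions, not part of any spec.
const WINDOW = 4; // max un-acked messages in flight

class AckWindow {
  constructor(send) {
    this.send = send;  // e.g. (msg) => webSocket.send(msg)
    this.inFlight = 0; // messages sent but not yet acked
    this.pending = []; // messages waiting for the window to open
  }
  post(msg) {
    if (this.inFlight < WINDOW) {
      this.inFlight++;
      this.send(msg);
    } else {
      this.pending.push(msg);
    }
  }
  onAck() { // call when the peer acknowledges one message
    this.inFlight--;
    while (this.inFlight < WINDOW && this.pending.length > 0) {
      this.inFlight++;
      this.send(this.pending.shift());
    }
  }
}

// Usage with a stand-in transport:
const out = [];
const win = new AckWindow((m) => out.push(m));
for (let i = 0; i < 6; i++) win.post('msg' + i);
// out now holds 4 messages; 2 are parked in win.pending until acks arrive
```
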
Re: [whatwg] Installed Apps
This sounds really powerful, and seems like a natural evolution of some of the stuff we've discussed previously for persistent workers. A few comments/notes: 1) It sounds like this background page would act like any other web page with respect to its processing model (i.e. like other pages, script running in this page would be limited as to how long it can run, as opposed to workers which can run for any arbitrary length of time). This seems reasonable, especially since this page could assumedly still create workers if it needs to do true background processing. It's really more of a hidden page than a background page? 2) For multi-process browsers like Chrome, there seem to be limitations as to what can actually be accessed between processes (direct DOM access across process boundaries seems problematic for example). Do you have ideas about how to address this, since assumedly the page calling getInstalledApp() could be running under some arbitrary process? 3) This approach has another advantage over something like workers in that a hidden page can do cross-domain access/sharing via iframes, whereas workers don't really have any facility for cross-domain access. 4) I had a quick question/clarification about the motivation behind this - aside from the advantages described above, it sounds like the specific problem you are solving by a hidden page is a) you don't have to load javascript in a new page (which I'm assuming must be slow), and b) you don't have to load client state in the new page. For a) - Having some way to load large amounts of cached javascript quickly in a new page seems like an issue that would be nice to address in general, not just for pages that install hidden pages. Are there other approaches worth trying here? For b) - How much client state are we talking about? If you were to pursue this approach using workers to maintain client state, how much data would you expect to be transferred to the client app on startup? 
We're seeing fairly low latency for client-worker communication, so in theory it shouldn't be a huge source of slowdown. I agree that the programming model of the hidden page is much cleaner/more familiar than rewriting applications to use asynchronous messaging, so that may be sufficient motivation for this. -atw On Mon, Jul 27, 2009 at 11:50 AM, Michael Davidson m...@google.com wrote: Hello folks - I'm an engineer on the Gmail team. We've been working on a prototype with the Chrome team to make the Gmail experience better. We thought we'd throw out our ideas to the list to get some feedback. THE PROBLEM We would like to enable rich internet applications to achieve feature parity with desktop applications. I will use Gmail and Outlook as examples for stating the problems we hope to solve. -- Slow startup: When a user navigates to mail.google.com, multiple server requests are required to render the page. The Javascript is cacheable, but personal data (e.g. the list of emails to show) is not. New releases of Gmail that require JS downloads are even slower to load. -- Native apps like Outlook can (and do) run background processes on the user's machine to make sure that data is always up-to-date. -- Notifications: Likewise, Outlook can notify users (via a background process) when new mail comes in even if it's not running. A SOLUTION Our proposed solution has two parts. The first, which should be generally useful, is the ability to have a hidden HTML/JS page running in the background that can access the DOM of visible windows. This page should be accessible from windows that the user navigates to. We call this background Javascript window a shared context or a background page. This will enable multiple instances of a web app (e.g. tearoff windows in Gmail) to cleanly access the same user state no matter which windows are open. 
Additionally, we'd like this background page to continue to run after the user has navigated away from the site, and preferably after the user has closed the browser. This will enable us to keep client-side data up-to-date on the user's machine. It will also enable us to download JS in advance. When the user navigates to a web app, all the background page has to do is draw the DOM in the visible window. This should significantly speed up app startup. Additionally, when something happens that requires notification, the background page can launch a visible page with a notification (or use other rich APIs for showing notifications). WHY NOT SHARED WORKERS Shared workers and persistent workers are designed to solve similar problems, but don't meet our needs. The key difference between what we're proposing and earlier proposals for persistent workers is that background pages would be able to launch visible windows and have full DOM access. This is different from the model of workers where all interaction with the DOM has to be done through asynchronous message passing. We would like background pages
[whatwg] Close events and workers
I noticed that Section 4.6 of the Web Workers spec still refers to the close event, which has been removed: "If the script gets aborted by the 'kill a worker' algorithm, then that same algorithm will cause there to only be a single task in the event loop at the next step, namely the task for the close event." The 'terminate a worker' algorithm removes all the events. Seems like we should update this language. -atw
Re: [whatwg] AppCache can't serve different contents for different users at the same URL
Not sure what you are suggesting, Anne - it sounds like they want to tie the AppCache to a specific cookie/value combination, which I don't believe is supported by the current spec. -atw On Wed, Jul 22, 2009 at 3:32 AM, Anne van Kesteren ann...@opera.com wrote: On Wed, 15 Jul 2009 00:30:05 +0200, Aaron Whyte awh...@google.com wrote: Most apps provide different contents for the same uncacheable main-page URL, depending on the identity of the user, which is typically stored in a cookie and read by the server. However, the HTML5 AppCache spec doesn't allow cookies to influence the choice of AppCaches or the contents of a response returned by the cache. Why not? I cannot find anything like that in the specification. It seems to me that the generic fetching algorithm is used which does not forbid sending cookies and even explicitly calls out setting them. -- Anne van Kesteren http://annevankesteren.nl/
Re: [whatwg] AppCache can't serve different contents for different users at the same URL
On Wed, Jul 22, 2009 at 9:46 AM, Anne van Kesteren ann...@opera.com wrote: On Wed, 22 Jul 2009 18:10:52 +0200, Drew Wilson atwil...@google.com wrote: Not sure what you are suggesting, Anne - it sounds like they want to tie the AppCache to a specific cookie/value combination, which I don't believe is supported by the current spec. Well, as far as I can tell cookies are part of the request to the manifest file so you could serve up a different one from the server based on cookie data. That's an interesting idea (send down a different manifest), although I don't see how you'd leverage that technique to support two different users/manifests and use the appropriate app cache depending on which user is logged in. I think this boils down to: the Gears 'requiredCookie' attribute was really useful. Is the problem supporting multiple users completely client-side? I can see how that might not work very well. Yeah, I think that's the use case they are trying to support - offline access to web apps where any one of multiple users can log in. I have to say that I'm somewhat fuzzy on the precise use case, though. -- Anne van Kesteren http://annevankesteren.nl/
Re: [whatwg] Issues with Web Sockets API
On Fri, Jun 26, 2009 at 9:18 AM, James Robinson jam...@google.com wrote: However, users can't usefully check the readyState to see if the WebSocket is still open because there are not and cannot be any synchronization guarantees about when the WebSocket may close. Is this true? Based on our prior discussion surrounding cookies, it seems like as a general rule we try to keep state from changing dynamically while JS code is executing for exactly these reasons.
Re: [whatwg] Issues with Web Sockets API
Yes, but the closed state of a given WebSocket doesn't have to exactly match the state of the underlying TCP connection, in the same way that document.cookie doesn't exactly match the current set of cookies that the network stack may be tracking (they can differ when HTTP responses are received in the background while JS is executing). So if the remote server closes the TCP connection, it generates a close event which marks the WebSocket as closed. It means that you could have a situation where you post messages to a WebSocket which aren't received by the server because the connection is closed, but that's true regardless due to the asynchronous nature of the networking protocol. -atw On Fri, Jun 26, 2009 at 9:52 AM, Darin Fisher da...@chromium.org wrote: On Fri, Jun 26, 2009 at 9:46 AM, Drew Wilson atwil...@google.com wrote: On Fri, Jun 26, 2009 at 9:18 AM, James Robinson jam...@google.com wrote: However, users can't usefully check the readyState to see if the WebSocket is still open because there are not and cannot be any synchronization guarantees about when the WebSocket may close. Is this true? Based on our prior discussion surrounding cookies, it seems like as a general rule we try to keep state from changing dynamically while JS code is executing for exactly these reasons. I think this is a very different beast. The state of a network connection may change asynchronously whether we like it or not. Unlike who may access cookies or local storage, the state of the network connection is not something we solely control. -Darin
Re: [whatwg] Issues with Web Sockets API
On Fri, Jun 26, 2009 at 1:14 PM, Kelly Norton knor...@google.com wrote: One thing about postMessage that I'm curious about. Since it has to report failure synchronously by throwing an INVALID_STATE_ERR, that seems to imply that all data must be written to a socket before returning and cannot be asynchronously delivered to an I/O thread without adding some risk of silently dropping messages. I don't think that's the intent of the spec - the intent is that INVALID_STATE_ERR is thrown if the port is in a closed state, not if there's an I/O error after send. But Michael's right, I don't think there's any way to determine that the server received the message - I guess the intent is that applications will build their own send/ack protocol on top of postMessage(), as you note. -atw
Re: [whatwg] Issues with Web Sockets API
On Fri, Jun 26, 2009 at 2:11 PM, James Robinson jam...@google.com wrote: Forcing applications to build their own send/ack functionality would be pretty tragic considering that WebSockets are built on top of TCP. - James Every time I've written a request/reply protocol on TCP I've needed to put in my own acks - how else do you know your message has been delivered to the remote app layer? One could argue that WebSockets should do this for you, but I like leaving this up to the app as it gives them more flexibility. -atw
Re: [whatwg] Issues with Web Sockets API
On Fri, Jun 26, 2009 at 3:25 PM, Michael Nordman micha...@google.com wrote: On Fri, Jun 26, 2009 at 3:16 PM, Drew Wilson atwil...@google.com wrote: On Fri, Jun 26, 2009 at 2:11 PM, James Robinson jam...@google.com wrote: Forcing applications to build their own send/ack functionality would be pretty tragic considering that WebSockets are built on top of TCP. - James Every time I've written a request/reply protocol on TCP I've needed to put in my own acks - how else do you know your message has been delivered to the remote app layer? Classic networking problem... if you do send the ack... how does the ack sender know the other side has received it... and so on. Precisely, and complicated by the fact that the app layers I've worked with don't actually expose TCP acks to the app, so you can't even tell that the remote side has acked your packets. One could argue that WebSockets should do this for you, but I like leaving this up to the app as it gives them more flexibility. Yes. But knowing if the data you're queuing to be sent is backing up in your local system instead of being pushed out is different than knowing if the remote side has received it and processed it. The former can be done w/o changing the websocket network protocol, the latter cannot. Is the "queued-up data is backing up" problem any different from somebody doing a ton of async XHR requests, some of which may need to be queued before being sent?
Re: [whatwg] Issues with Web Sockets API
On Fri, Jun 26, 2009 at 3:47 PM, Michael Nordman micha...@google.com wrote: No. But the difference is each XHR tells you when it's been sent and gives you the response when it's received. With this info, apps can rate limit things. WebSocket.postMessage doesn't tell you when that message has been sent. Well, yes and no. You know when you get a response back because readyState = HEADERS_RECEIVED. But there's nothing between OPEN and HEADERS_RECEIVED that tells you anything about bits on the wire. Suppose you're sending 'i'm alive' messages. If the message you sent 5 minutes ago hasn't really been sent, you wouldn't want to queue another 'i'm alive'. If your goal is never to send another heartbeat until you know your previous one has been delivered, then that seems like a motivation to add an app-layer heartbeat ack. Treating "queued but not yet put on the wire" any differently from "put on the wire but not yet acked" or "acked, but still queued for delivery to the app layer on the remote end" seems like a false distinction. If you're uploading a large data set incrementally across many distinct postMessage calls (perhaps to leave room for other control messages interspersed amongst them, or to present progress info), how do you know when to queue more data to be sent? I could keep saying app-level acks! but I don't want to beat a dead horse, and honestly, I'm not entirely certain that I'm right :)
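For the heartbeat case specifically, one way to approximate "don't queue another heartbeat while the last one is still local" without any protocol changes is to inspect the local send buffer. A hedged sketch, assuming a bufferedAmount attribute like the one the WebSocket API exposes; note it only distinguishes "queued locally" from "handed to the network stack" and says nothing about remote delivery:

```javascript
// Hedged sketch for the heartbeat case: suppress the next "i'm alive"
// while anything is still sitting in the local send buffer. Assumes a
// bufferedAmount attribute (bytes queued locally, not yet transmitted).
function maybeHeartbeat(socket) {
  if (socket.bufferedAmount > 0) {
    return false; // earlier data (perhaps the last heartbeat) not yet flushed
  }
  socket.send("i'm alive");
  return true;
}

// Stand-in socket that never flushes, for illustration:
const hbSocket = {
  bufferedAmount: 0,
  send(p) { this.bufferedAmount += p.length; },
};
const first = maybeHeartbeat(hbSocket);  // sends: buffer was empty
const second = maybeHeartbeat(hbSocket); // suppressed: first still buffered
```

Knowing the remote app layer actually received the heartbeat, of course, still requires an app-level ack.
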
Re: [whatwg] Limit on number of parallel Workers.
That's a great approach. Is the pool of OS threads per-domain, or per browser instance (i.e. can a domain DoS the workers of other domains by firing off several infinite-loop workers)? Seems like having a per-domain thread pool is an ideal solution to this problem. -atw On Tue, Jun 9, 2009 at 9:33 PM, Dmitry Titov dim...@chromium.org wrote: On Tue, Jun 9, 2009 at 7:07 PM, Michael Nordman micha...@google.com wrote: This is the solution that Firefox 3.5 uses. We use a pool of relatively few OS threads (5 or so iirc). This pool is then scheduled to run worker tasks as they are scheduled. So for example if you create 1000 worker objects, those 5 threads will take turns to execute the initial scripts one at a time. If you then send a message using postMessage to 500 of those workers, and the other 500 call setTimeout in their initial script, the same threads will take turns to run those 1000 tasks (500 message events, and 500 timer callbacks). This is somewhat simplified, and things are a little more complicated due to how we handle synchronous network loads (during which we freeze an OS thread and remove it from the pool), but the above is the basic idea. / Jonas That's a really good model. Scalable and degrades nicely. The only problem is with very long running operations where a worker script doesn't return in a timely fashion. If enough of them do that, all others starve. What does FF do about that, or in practice do you anticipate that not being an issue? Webkit dedicates an OS thread per worker. Chrome goes even further (for now at least) with a process per worker. The 1:1 mapping is probably overkill as most workers will probably spend most of their life asleep just waiting for a message. Indeed, it seems FF has a pretty good solution for this (at least for the non-multiprocess case). 1:1 is not scaling well in case of threads and especially in case of processes. 
Here http://figushki.com/test/workers/workers.html is a page that can create a variable number of workers to observe the effects; the curious can run it in FF3.5, in Safari 4, or in Chromium with the '--enable-web-workers' flag. Don't click the 'add 1000' button in Safari 4 or Chromium if you are not prepared to kill the unresponsive browser while the whole system gets half-frozen. FF continues to work just fine, well done guys :-) Dmitry
Re: [whatwg] Limit on number of parallel Workers.
This is a bit of an aside, but section 4.5 of the Web Workers spec no longer makes any guarantees regarding GC of workers. I would expect user agents to make some kind of best effort to detect unreachability in the simplest cases, but supporting MessagePorts and SharedWorkers makes authoritatively determining worker reachability exceedingly difficult except in simpler cases (DedicatedWorkers with no MessagePorts or nested workers, for example). It seems like we should be encouraging developers to call WorkerGlobalScope.close() when they are done with their workers, which in the case below makes the number of running threads less nondeterministic. Back on topic, I believe what Dmitry was suggesting was not that we specify a specific limit in the specification, but rather that we have some sort of general agreement on how a UA might handle limits (what should it do when the limit is reached). His suggestion of delaying the startup of the worker seems like a better solution than other approaches like throwing an exception from the Worker constructor. -atw On Tue, Jun 9, 2009 at 6:28 PM, Oliver Hunt oli...@apple.com wrote: I believe that this will be difficult to have such a limit as sites may rely on GC to collect Workers that are no longer running (so the number of running threads is non-deterministic), and in the context of mixed source content (mash-ups) it will be difficult for any content source to be sure it isn't going to contribute to that limit. Obviously a UA shouldn't crash, but I believe that it is up to the UA to determine how to achieve this -- e.g. having a limit to allow a 1:1 relationship between workers and processes will have a much lower limit than an implementation that has a worker per thread model, or an m:n relationship between workers and threads/processes. Having the specification limited simply because one implementation mechanism has certain limits when there are many alternative implementation models seems like a bad idea. 
I believe if there's going to be any worker-related limit, it should realistically be a lower limit on the number of workers rather than an upper one. --Oliver On Jun 9, 2009, at 6:13 PM, Dmitry Titov wrote: Hi WHATWG! In Chromium, workers are going to have their separate processes, at least for now. So we quickly found that while(true) foo = new Worker(...) quickly consumes the OS resources :-) In fact, this will kill other browsers too, and on some systems the unbounded number of threads will effectively freeze the system beyond the browser. We are thinking about how to reasonably place limits on the resources consumed by a 'sea of workers'. Obviously, one could just limit the maximum number of parallel workers available to a page, domain, or browser. But what do you do when the limit is reached? The Worker() constructor could return null or throw an exception. However, that seems to go against the spirit of the spec, since it usually does not deal with resource constraints. So it makes sense to look for the most sensible implementation that tries its best to behave. The current idea is to let pages create as many Worker objects as requested, but not necessarily start them right away, so no resources are allocated except the thin JS wrapper. As workers terminate and the number of them drops below the limit, more workers from the ready queue can be started. This allows supporting implementation limits without exposing them. This is similar to how a 'sea of XHRs' would behave. The test page here (http://www.figushki.com/test/xhr/xhr1.html) creates 10,000 async XHR requests to distinct URLs and then waits for all of them to complete. While it's obviously impossible to have 10K HTTP connections in parallel, all XHRs will complete, given time. Does it sound like a good way to avoid the resource crunch due to a high number of workers? Thanks, Dmitry
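Dmitry's delayed-start idea can be illustrated with a small queue. This is only a sketch of the proposed UA behavior, not anything from the spec: the `WorkerPool` name, the `startFn` hook (standing in for the UA's real thread-spawning machinery), and the wrapper objects are all invented for illustration.

```javascript
// Sketch of the proposal: hand back a thin Worker wrapper immediately,
// but defer actually starting the underlying thread until a slot frees up.
class WorkerPool {
  constructor(limit, startFn) {
    this.limit = limit;     // max concurrently running workers
    this.startFn = startFn; // invoked when a queued worker may really start
    this.running = 0;
    this.queue = [];        // thin wrappers awaiting a slot
  }
  create(url) {
    const wrapper = { url, started: false };
    this.queue.push(wrapper);
    this.pump();
    return wrapper;         // script gets its Worker object right away
  }
  terminate(wrapper) {
    if (wrapper.started) {
      this.running--;
      this.pump();          // a slot freed up; start the next in line
    } else {
      this.queue = this.queue.filter(w => w !== wrapper);
    }
  }
  pump() {
    while (this.running < this.limit && this.queue.length > 0) {
      const w = this.queue.shift();
      w.started = true;
      this.running++;
      this.startFn(w);
    }
  }
}
```

The key property, as with the 'sea of XHRs', is that the limit never surfaces to script: every `create()` succeeds, and queued workers simply start later.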
Re: [whatwg] Limit on number of parallel Workers.
It occurs to me that my statement was a bit stronger than I intended - the spec *does* indeed make guarantees regarding GC of workers, but they are fairly loose and typically tied to the parent Document becoming inactive. -atw On Tue, Jun 9, 2009 at 6:42 PM, Drew Wilson atwil...@google.com wrote: This is a bit of an aside, but section 4.5 of the Web Workers spec no longer makes any guarantees regarding GC of workers. I would expect user agents to make some kind of best effort to detect unreachability in the simplest cases, but supporting MessagePorts and SharedWorkers makes authoritatively determining worker reachability exceedingly difficult except in simpler cases (DedicatedWorkers with no MessagePorts or nested workers, for example). It seems like we should be encouraging developers to call WorkerGlobalScope.close() when they are done with their workers, which in the case below makes the number of running threads more deterministic. Back on topic, I believe what Dmitry was suggesting was not that we specify a specific limit in the specification, but rather that we have some sort of general agreement on how a UA might handle limits (i.e. what it should do when the limit is reached). His suggestion of delaying the startup of the worker seems like a better solution than other approaches, like throwing an exception from the Worker constructor. -atw On Tue, Jun 9, 2009 at 6:28 PM, Oliver Hunt oli...@apple.com wrote: I believe that it will be difficult to have such a limit, as sites may rely on GC to collect Workers that are no longer running (so the number of running threads is non-deterministic), and in the context of mixed-source content (mash-ups) it will be difficult for any content source to be sure it isn't going to contribute to that limit. Obviously a UA shouldn't crash, but I believe that it is up to the UA to determine how to achieve this -- e.g. 
an implementation with a 1:1 relationship between workers and processes will have a much lower limit than an implementation with a worker-per-thread model, or an m:n relationship between workers and threads/processes. Limiting the specification simply because one implementation mechanism has certain limits, when there are many alternative implementation models, seems like a bad idea. I believe if there's going to be any worker-related limit, it should realistically be a lower limit on the number of workers rather than an upper one. --Oliver On Jun 9, 2009, at 6:13 PM, Dmitry Titov wrote: Hi WHATWG! In Chromium, workers are going to have their separate processes, at least for now. So we quickly found that while(true) foo = new Worker(...) quickly consumes the OS resources :-) In fact, this will kill other browsers too, and on some systems the unbounded number of threads will effectively freeze the system beyond the browser. We are thinking about how to reasonably place limits on the resources consumed by a 'sea of workers'. Obviously, one could just limit the maximum number of parallel workers available to a page, domain, or browser. But what do you do when the limit is reached? The Worker() constructor could return null or throw an exception. However, that seems to go against the spirit of the spec, since it usually does not deal with resource constraints. So it makes sense to look for the most sensible implementation that tries its best to behave. The current idea is to let pages create as many Worker objects as requested, but not necessarily start them right away, so no resources are allocated except the thin JS wrapper. As workers terminate and the number of them drops below the limit, more workers from the ready queue can be started. This allows supporting implementation limits without exposing them. This is similar to how a 'sea of XHRs' would behave. 
The test page here (http://www.figushki.com/test/xhr/xhr1.html) creates 10,000 async XHR requests to distinct URLs and then waits for all of them to complete. While it's obviously impossible to have 10K HTTP connections in parallel, all XHRs will complete, given time. Does it sound like a good way to avoid the resource crunch due to a high number of workers? Thanks, Dmitry
[whatwg] Changing postMessage() to allow sending unentangled ports
Hi all, I'd like to propose a change to the spec for postMessage(). Currently the spec reads: Throws an INVALID_STATE_ERR (http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#invalid_state_err) if the ports array is not null and it contains either null entries, duplicate ports, or ports that are not entangled. I'd like to suggest that we allow sending ports that are not entangled (i.e. ports that have been closed) - the rationale is two-fold: 1) We removed MessagePort.active because it exposes details about garbage collection (i.e. an application could determine whether the other side of a MessagePort had been collected or not by testing the active attribute of a port). Throwing an exception in postMessage() is the same thing - it provides details about whether the other end of the port has been collected. 2) Imagine the following scenario: Window W has two workers, A and B. Worker A wants to send a set of messages to Worker B by queuing those messages on a MessagePort, then asking Window W to forward that port to Worker B: Window W code: workerA.onmessage = function(evt) { if (evt.data == 'forward') { // Currently this would throw an exception if the passed port is closed/unentangled. workerB.postMessage('messageFromA', evt.ports); } } Worker A code: function sendMessagesToB() { var channel = new MessageChannel(); channel.port1.postMessage('message 1'); channel.port1.postMessage('message 2'); channel.port1.postMessage('message 3'); // Send port to worker B via Window W postMessage('forward', [channel.port2]); } Now Worker A is done with its port - it wants to close the port. But it can't safely do so until it knows that Window W has forwarded the port to Worker B, so it needs to build in some kind of ack mechanism to know when it's safe to close the port. Even worse, what if Worker A wants to shut down - it can't safely shut down until it knows that its message has been delivered, because the port would get closed when its owner closes. 
Since the port still acts as a task source even when it is closed, there seems to be no reason not to allow passing unentangled ports around - it's a reasonable way to represent a set of messages. And if you think about it, there's no reason why this is allowed: postMessage(msg, [port]); port.close(); while this is prohibited: port.close(); postMessage(msg, [port]); given that in both cases the port will almost certainly be closed before the message is delivered to the target. -atw
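The symmetry argument above can be made concrete with a toy model of the *proposed* semantics: a closed port still drains the messages queued on it. Everything here is invented for illustration (`makeChannel`, the explicit `deliver()` standing in for the event loop); it models the proposal, not any real MessagePort implementation.

```javascript
// Toy two-port channel where close() stops nothing already queued, and
// posting on a closed port is allowed - the behavior being proposed.
function makeChannel() {
  const q12 = [], q21 = []; // messages port1->port2 and port2->port1
  function makePort(sendQ, recvQ) {
    return {
      closed: false,
      onmessage: null,
      postMessage(msg) { sendQ.push(msg); }, // allowed even when closed
      close() { this.closed = true; },
      deliver() { // stand-in for the event loop draining the task queue
        while (recvQ.length) this.onmessage({ data: recvQ.shift() });
      },
    };
  }
  return { port1: makePort(q12, q21), port2: makePort(q21, q12) };
}

const { port1, port2 } = makeChannel();
const received = [];
port2.onmessage = evt => received.push(evt.data);

// Under the proposal these two orderings behave identically:
port1.postMessage('message 1');
port1.close();
port1.postMessage('message 2'); // would throw under the spec text quoted above
port2.deliver();
```

After `deliver()`, both messages reach `port2` regardless of which side of the `close()` call they were posted on, which is the point of the proposal.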
Re: [whatwg] Worker lifecycle
On Thu, May 28, 2009 at 7:47 PM, Maciej Stachowiak m...@apple.com wrote: On May 28, 2009, at 5:17 PM, Ian Hickson wrote: On Thu, 28 May 2009, Maciej Stachowiak wrote: On May 28, 2009, at 1:08 PM, Ian Hickson wrote: On Thu, 28 May 2009, Maciej Stachowiak wrote: If so, that seems like it could create unbounded memory leaks in long-running Web applications that use MessagePorts, even if all references to both endpoints of the MessageChannel are dropped. That seems unacceptable to me, unless I misunderstood. The requirement is actually indistinguishable from the UA using the other alternative and just having a really slow garbage collector that only runs at page-closing time. So it's exactly equivalent to the old requirement, except the spec now specifically points out that you can just leak forever instead. I don't think that addresses the original concern at all. I've tweaked the text some to make it clear that once the port is not entangled, it doesn't continue being protected in this way. The new text seems to be this: When a MessagePort object is entangled, user agents must either act as if the object has a strong reference to its entangled MessagePort object, or as if the MessagePort object's owner has a strong reference to the MessagePort object. It seems to me this allows the following case: two message ports A and B are entangled. A is treated as having a strong reference to B, but is not treated as if its owner has a strong reference to it. However, B is not treated as having a strong reference to A, but is treated as if its owner has a strong reference to it. Is that intended? I think this behavior would be practically implementable and quite useful in many cases, even though it is asymmetric. But I am not sure if the text intended to allow it. Can you elaborate on this a bit? Where would this asymmetric behavior be useful? It seems like in the specific case you cite, B would be doubly-referenced, while A would be unreferenced. Regards, Maciej
Re: [whatwg] Workers and URL origin check
On Thu, May 28, 2009 at 1:11 AM, Jonas Sicking jo...@sicking.cc wrote: On Wed, May 27, 2009 at 6:15 PM, Drew Wilson atwil...@google.com wrote: Along the same lines, I'm wondering why we require a same-domain check for initial worker URLs, but not for script imported via importScripts(). This is because workers run in the security context of the initial worker URL. So this is the origin that is used for security checks whenever the worker does something, like load data using XMLHttpRequest. I'm not quite sure why importScripts() should behave more like a script tag than the Worker constructor itself. There's no reason why I shouldn't be able to do Worker("http://foo.com/worker.js") if I can do importScripts("http://foo.com/worker.js"). importScripts(), however, behaves more like a script tag in that it runs the loaded script in the security context of the worker that loaded it. Seems like we ought to have workers inherit the origin of the script context that invoked the Worker constructor, but allow the script URL passed to the constructor to point at any domain. That would be another solution to this problem; however, some people preferred the solution that is currently in the spec. Can anyone explain the motivation? A search of the archives isn't yielding anything particularly useful - the earliest mention I could find was from August of last year, someone saying essentially the same thing: some people thought it was not safe, with no details.
Re: [whatwg] Worker lifecycle
Is your concern that an ill-behaved app could leak ports (since obviously an ill-behaved app could leak ports anyway just by stuffing them in some array), or is it that a well-behaved app can't release ports? I still need to review the new spec in detail, but from previous conversations I'd assumed that calling MessagePort.close() on either end would allow the ports to be freed - perhaps we should clarify the language in the spec to state that the strong reference is only in place for *entangled* ports. The alternative is to force applications to keep explicit references to all of their ports, which seems unwieldy, and also worse given that there's now no way for applications to determine whether a given port is entangled or not (since .active was removed for exposing the behavior of the garbage collector). -atw On Thu, May 28, 2009 at 3:34 AM, Maciej Stachowiak m...@apple.com wrote: On May 28, 2009, at 2:29 AM, Ian Hickson wrote: I just checked in a substantial change to the lifetime model for workers. Instead of being bound to their ports, which became especially hard to implement for shared workers, they now just live as long as the Document that created them (all of the Documents that obtained them, for shared workers), with this ownership inheriting to nested workers. I also removed the various ways to observe the lifetime, namely .active and the 'close' events. I hope this will make the shared workers easier to implement. Please let me know if this screws anything up for dedicated workers. I'm assuming this is one of the changes: User agents must either act as if MessagePort objects have a strong reference to their entangled MessagePort object or as if each MessagePort object's owner has a strong reference to the MessagePort object. It seems to me the second alternative prevents MessagePorts created by a Window from ever being garbage collected until the user leaves the page. Is that a correct understanding? 
If so, that seems like it could create unbounded memory leaks in long-running Web applications that use MessagePorts, even if all references to both endpoints of the MessageChannel are dropped. That seems unacceptable to me, unless I misunderstood. Regards, Maciej
Re: [whatwg] Worker lifecycle
So I got a chance to review the latest changes (hurray for the tracking view!: http://html5.org/tools/web-workers-tracker?from=139&to=140). Do we still need the concept of a protected worker? We define what a protected worker is, but we don't actually reference that definition anywhere in the spec anymore, since active needed/permissible status is entirely driven by the existence of active/inactive documents. Overall it looks good, and I think the steps taken to remove GC behavior from the spec will greatly facilitate implementation. However, I did have a question about the definition of orphaned workers and active needed workers: A worker is said to be an *active needed worker* if any of the Document objects in the worker's Documents (http://dev.w3.org/html5/workers/#the-worker-s-documents) are fully active. *Closing orphan workers*: Start monitoring the worker such that no sooner than it stops being either an active needed worker (http://dev.w3.org/html5/workers/#active-needed-worker) or a suspendable worker (http://dev.w3.org/html5/workers/#suspendable-worker), and no later than it stops being a permissible worker (http://dev.w3.org/html5/workers/#permissible-worker), *worker global scope*'s closing (http://dev.w3.org/html5/workers/#dom-workerglobalscope-closing) flag is set to true. It sounds like the worker is guaranteed not to be orphaned as long as the parent window is active, even if the user agent is able to identify that the worker is not reachable, which might be a stronger guarantee than was intended. Perhaps the spec already has an implicit assumption that UAs are able to do whatever they want with unreachable items? 
-atw On Thu, May 28, 2009 at 1:08 PM, Ian Hickson i...@hixie.ch wrote: On Thu, 28 May 2009, Maciej Stachowiak wrote: I'm assuming this is one of the changes: User agents must either act as if MessagePort objects have a strong reference to their entangled MessagePort object or as if each MessagePort object's owner has a strong reference to the MessagePort object. It seems to me the second alternative prevents MessagePorts created by a Window from ever being garbage collected until the user leaves the page. Is that a correct understanding? Yes. If so, that seems like it could create unbounded memory leaks in long-running Web applications that use MessagePorts, even if all references to both endpoints of the MessageChannel are dropped. That seems unacceptable to me, unless I misunderstood. The requirement is actually indistinguishable from the UA using the other alternative and just having a really slow garbage collector that only runs at page-closing time. On Thu, 28 May 2009, Drew Wilson wrote: Is your concern that an ill-behaved app could leak ports (since obviously an ill-behaved app could leak ports anyway just by stuffing them in some array), or is it that a well-behaved app can't release ports? Still need to review the new spec in detail, but from previous conversations I'd assumed that calling MessagePort.close() on either end would allow the ports to be freed - perhaps we should clarify the language in the spec to state that the strong reference is only in place for *entangled* ports. The UA can at any time switch to the other mechanism, which only has a strong reference through the entanglement, which basically means that the UA can be as aggressive as the UA wants to be. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Workers and URL origin check
Along the same lines, I'm wondering why we require a same-domain check for initial worker URLs, but not for script imported via importScripts(). Seems like we ought to have workers inherit the origin of the script context that invoked the Worker constructor, but allow the script URL passed to the constructor to point at any domain. Is there a motivating security concern for this language from section 4.8.2?: If the origin of the resulting absolute URL is not the same as the origin of the script that invoked the constructor, then throw a security exception. It seems like it makes it harder for people to share script across domains without actually providing any security. -atw On Wed, May 27, 2009 at 5:13 PM, Dmitry Titov dim...@chromium.org wrote: Hi WHATWG! I have a question about the URL origin check for Workers: the spec, in 4.8.2, mandates a check for the Worker URL to be the 'same origin' as the parent document's URL. At the same time, 4.2 says the origin of the worker is derived later from the URL represented by the 'location' object of the worker context. However, the spec doesn't say how redirects should be processed. If a browser gets a 30x redirect response, the final URL of a worker can be different from the original one that passed the check before loading. The current spec ignores the fact that the origin can be changed via redirect. If the origin of the loaded worker is based on the final (potentially redirected) URL that the 'location' object represents, then subsequent XHR requests, nested workers, and importScripts() will work in the origin of that final URL. As specified, in case of redirect a page from "http://applicationInternals.com" can use a worker from "http://application.com" (via redirect) to access APIs of application.com that were not necessarily intended for such consumption. Should the spec simply require the redirected, final URL to be checked against the parent's, and reject the script if redirection results in a different origin? Thanks, Dmitry
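Dmitry's suggested fix - re-checking the origin of the *final*, post-redirect URL against the parent document's origin - can be sketched as a small function. The function name and parameters are invented for illustration; origin comparison here uses the WHATWG URL parser's `origin` property.

```javascript
// Sketch of the two checks: the constructor-time check the spec already
// mandates (section 4.8.2), plus the proposed post-redirect check.
function workerScriptAllowed(parentOrigin, requestedUrl, finalUrl) {
  const origin = u => new URL(u).origin;
  // Existing check: the URL passed to Worker() must match the parent.
  if (origin(requestedUrl) !== parentOrigin) return false;
  // Proposed check: reject if a 30x redirect crossed origins.
  return origin(finalUrl) === parentOrigin;
}

// The scenario from the post: applicationInternals.com loads a same-origin
// worker URL that redirects to application.com - this would now be rejected.
workerScriptAllowed('http://applicationinternals.com',
                    'http://applicationinternals.com/worker.js',
                    'http://application.com/worker.js'); // false
```

Without the second check, the worker's `location`-derived origin silently becomes application.com, which is the hole being described.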
[whatwg] MessagePorts in Web Workers: implementation feedback
Hi all, I've been hashing through a bunch of the design issues around using MessagePorts within Workers with IanH and the Chrome/WebKit teams, and I wanted to follow up with the list with my progress. The problems we've encountered are all solvable, but I've been surprised at the amount of work involved in implementing worker MessagePorts (and the resulting implications that MessagePorts have on worker lifecycles/reachability). My concern is that the amount of work to implement MessagePorts within Worker context may be so high that it will prevent vendors from implementing the SharedWorker API. Have other implementers started working on this part of the spec yet? Let me quickly run down some of the implementation issues I've run into - some of these may be WebKit/Chrome specific, but other browsers may run into some of them as well: 1) MessagePort reachability is challenging in the context of separate Worker heaps. In WebKit, each worker has its own heap (in Chrome, they will have their own process as well). The spec reads: User agents must act as if MessagePort (http://www.w3.org/TR/html5/comms.html#messageport) objects have a strong reference to their entangled MessagePort object. Thus, a message port can be received, given an event listener, and then forgotten, and so long as that event listener could receive a message, the channel will be maintained. Of course, if this was to occur on both sides of the channel, then both ports would be garbage collected, since they would not be reachable from live code, despite having a strong reference to each other. 
Furthermore, a MessagePort object must not be garbage collected while there exists a message in a task queue (http://www.w3.org/TR/html5/browsers.html#task-queue) that is to be dispatched on that MessagePort object, or while the MessagePort object's port message queue (http://www.w3.org/TR/html5/comms.html#port-message-queue) is open and there exists a message (http://www.w3.org/TR/html5/comms.html#event-message) event in that queue. The end result of this is the need to track some common state across an entangled MessagePort pair, such as: the number of outstanding messages, the open state of each end, and the number of active references to each port (zero or non-zero). It turns out this last bit will require adding new hooks to the JavaScriptCore garbage collector to detect transitioning between 1 and 0 references without actually freeing the object - not that difficult, but possibly something that other implementers should keep in mind. 2) MessagePorts dramatically change the worker lifecycle. Having MessagePorts in worker context means that Workers can outlive their parent window(s) - I can create a worker, pass off an entangled MessagePort to another window (say, to a different domain), then close the original window, and the worker should stay alive. In the case of WebKit, this causes some problems for things like worker-initiated network requests - if workers can continue to run even though there are no open windows for that origin, then it becomes problematic to perform network requests (part of this is due to the architecture of WebKit, which requires proxying network requests to window context, but part of this is just a general problem of: how do you handle things like HTTP Auth when there are no open windows for that origin?) 
Finally, the spec defines a fairly broad definition of what makes a worker reachable - here's an excerpt from my WebKit Shared Worker design doc, where I summarize the spec (possibly incorrectly - feel free to correct any misconceptions): Permissible: The spec specifies that a worker is *permissible* based on whether it has a reachable MessagePort that has been entangled *at some point in the past* with an active window (or with a worker that is itself permissible). Basically, if a worker has *ever* been entangled with an active window, or if it's ever been entangled with a worker that is itself permissible (i.e. it's associated with an active window via a chain of workers that have been entangled at some point in the past), then it's permissible. The reason the at-some-point-in-the-past language is present is to allow a page to create a fire-and-forget worker (for example, a worker that does a set of long network operations) without having to keep a reference to that worker around. Once the referent windows close, the worker should also close, as being permissible is a necessary (but not sufficient) criterion for being runnable. Active needed: A permissible worker is *active needed* if: 1. it has pending timers/network requests/DB activity, or 2. it is currently entangled with an active window, or another active needed worker. The intent behind #1 is to enable fire-and-forget workers that don't exit until they are
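The cross-heap bookkeeping described in point 1 above can be sketched as a small shared record between the two ends of an entangled pair. Every name here (`makeEntangledPair`, `pendingFor`, `dropLastRef`, and so on) is invented for illustration; a real engine would drive `dropLastRef` from the 1-to-0 reference-count GC hook mentioned in the post.

```javascript
// Toy model of the shared state an entangled MessagePort pair must track:
// in-flight message counts, per-side open state, and whether each heap
// still holds live references to its port.
function makeEntangledPair() {
  const shared = {
    pendingFor: [0, 0],     // messages queued for side 0 / side 1
    open: [true, true],     // has each side's queue been close()d?
    refsAlive: [true, true] // does each heap still reference its port?
  };
  const side = i => ({
    post() { shared.pendingFor[1 - i]++; },   // queue a message for the peer
    dispatched() { shared.pendingFor[i]--; }, // one of my messages delivered
    close() { shared.open[i] = false; },
    dropLastRef() { shared.refsAlive[i] = false; }, // the 1 -> 0 GC hook
    // A port must not be collected while messages destined for it are
    // still pending dispatch (a simplification of the spec text above).
    collectible() {
      return !shared.refsAlive[i] && shared.pendingFor[i] === 0;
    },
  });
  return { a: side(0), b: side(1), shared };
}
```

The point of the sketch is that none of these fields live naturally in either heap alone, which is why the post describes this as shared state across the pair.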
Re: [whatwg] MessagePorts in Web Workers: implementation feedback
It looks like WebKit binds the XMLHttpRequest object to its parent document at instantiation time, so the source of the constructor doesn't make a difference. And it looks like that binding is cleared when the document is closed so invoking xhr.send() on an XHR object whose parent document is no longer open fails silently. I'm basing this on code inspection, not on extensive knowledge of the codebase, so the webkit folks should feel free to correct me here. Why is having the window proxy on behalf of its workers a poor workaround for worker MessagePorts? I totally understand the utility of MessagePorts for things like cross-window and cross-iframe communication, but it seems like the use cases for external access to workers are far more obscure. -atw On Thu, May 7, 2009 at 2:47 PM, Ian Hickson i...@hixie.ch wrote: On Thu, 7 May 2009, Drew Wilson wrote: Having MessagePorts in worker context means that Workers can outlive their parent window(s) - I can create a worker, pass off an entangled MessagePort to another window (say, to a different domain), then close the original window, and the worker should stay alive. In the case of WebKit, this causes some problems for things like worker-initiated network requests - if workers can continue to run even though there are no open windows for that origin, then it becomes problematic to perform network requests (part of this is due to the architecture of WebKit which requires proxying network requests to window context, but part of this is just a general problem of how do you handle things like HTTP Auth when there are no open windows for that origin?) How does WebKit handle this case for regular Windows? (e.g. if a script does x=window.open(), grabs the x.XMLHttpRequest constructor, calls x.close(), and then invokes the constructor.) The thing we'd give up is the capabilities-based API that MessagePorts provide, but I'd argue that the workaround is simple: the creating window can just act as a proxy for the worker. 
That's a rather poor workaround. :-) -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Worker lifecycle
OK, here's a more focused question - let's imagine that one is implementing SharedWorkers in a new browser. One seemingly reasonable way to proceed would be to have a Worker stay alive as long as there are *any* reachable entangled ports between any window and that worker. Does this cause some kind of problem that's addressed by the more complex behavior described in the spec? The 'any reachable port' implementation seems like it's more along the lines of what people would expect, and doesn't require a new concept to be tracked (the port's original creator). I just have the nagging feeling that I'm not properly translating from spec language into actual behavior, and perhaps the intended behavior is not as complex as the spec implies. -atw On Fri, Apr 10, 2009 at 6:32 PM, Drew Wilson atwil...@google.com wrote: Hi all, A couple of quick questions about the lifecycle of workers - specifically I'm trying to grok the body of text at section 4.5 of the Web Workers spec. It seems like it's saying that if I create a shared worker, then hand off its port to another window, that shared worker will be orphaned once the original window is closed. It seems like instead of just using the normal message port reachability algorithm (once a worker's message ports are no longer reachable, it is considered orphaned) we instead have to track the original allocator of all ports, and only count a worker as reachable if the window that allocated the original port is still active. [a worker is permissible if] at some point past or present a MessagePort owned by the worker was entangled with a MessagePort *p* whose owner is a Window object whose active document is the Document that was that browsing context's active document when *p* was created, and that Document is fully active Am I reading this correctly? It just seems wonky that if I create a shared worker from window A, hand the port off to window B, then close window A, that worker is now orphaned despite being reachable from window B. 
But if I do the same thing, but before window A closes, window B creates a new port and sends it to the worker via the port that window A gave it, the worker won't be orphaned when window A closes. What's the intent here? Also, one other thing: I was previously under the impression that SharedWorkers had a different lifecycle from dedicated Workers - SharedWorkers would not exit as long as there were any windows open to that domain. In retrospect, I'm pretty sure I just made that up - SharedWorkers and dedicated Workers have identical lifecycles, correct? -atw
[whatwg] Worker lifecycle
Hi all, A couple of quick questions about the lifecycle of workers - specifically I'm trying to grok the body of text at section 4.5 of the Web Workers spec. It seems like it's saying that if I create a shared worker, then hand off its port to another window, that shared worker will be orphaned once the original window is closed. It seems like instead of just using the normal message port reachability algorithm (once a worker's message ports are no longer reachable, it is considered orphaned) we instead have to track the original allocator of all ports, and only count a worker as reachable if the window that allocated the original port is still active. [a worker is permissible if] at some point past or present a MessagePort owned by the worker was entangled with a MessagePort *p* whose owner is a Window object whose active document is the Document that was that browsing context's active document when *p* was created, and that Document is fully active Am I reading this correctly? It just seems wonky that if I create a shared worker from window A, hand the port off to window B, then close window A, that worker is now orphaned despite being reachable from window B. But if I do the same thing, but before window A closes, window B creates a new port and sends it to the worker via the port that window A gave it, the worker won't be orphaned when window A closes. What's the intent here? Also, one other thing: I was previously under the impression that SharedWorkers had a different lifecycle from dedicated Workers - SharedWorkers would not exit as long as there were any windows open to that domain. In retrospect, I'm pretty sure I just made that up - SharedWorkers and dedicated Workers have identical lifecycles, correct? -atw
Re: [whatwg] Worker feedback
I know I said I would stay out of this conversation, but I feel obliged to share a data point that's pertinent to our API design. The structured storage spec has an asynchronous API currently. There is no shortage of experienced javascript programmers at Google, and yet the single biggest piece of feedback I've gotten from the internal app community has been (essentially): The asynchronous APIs are too cumbersome. We are going to delay porting over to use the HTML5 APIs until we have synchronous APIs, like the ones in Gears. So, we should all take the whining of pampered Google engineers with a grain of salt :), but the point remains that even though callbacks are conceptually familiar and easy to use, it's not always convenient (or possible!) for an application to stop an operation in the middle and resume it via an asynchronous callback. Imagine you're a library author that exposes a synchronous API for your clients - now you'd like to use localStorage within your library, but there's no way to do it while maintaining your existing synchronous APIs. If we try to force everyone to use asynchronous APIs to access local storage, the first thing everyone is going to do is build their own write-through caching wrapper objects around local storage to give them synchronous read access and lazy writes, which generates precisely the type of racy behavior we're trying to avoid. If we can capture the correct behavior using synchronous APIs, we should. -atw On Fri, Apr 3, 2009 at 11:44 AM, Tab Atkins Jr. jackalm...@gmail.com wrote: On Thu, Apr 2, 2009 at 8:37 PM, Robert O'Callahan rob...@ocallahan.org wrote: I agree it would make sense for new APIs to impose much greater constraints on consumers, such as requiring them to factor code into transactions, declare up-front the entire scope of resources that will be accessed, and enforce those restrictions, preferably syntactically --- Jonas' asynchronous multi-resource-acquisition callback, for example. 
Speaking as a novice javascript developer, this feels like the cleanest, simplest, most easily comprehensible way to solve this problem. We define what needs to be locked all at once, provide a callback, and within the dynamic context of the callback no further locks are acquirable. You have to completely exit the callback and start a new lock block if you need more resources. This prevents deadlocks, while still giving us developers a simple way to express what we need. As well, callbacks are by this point a familiar concept even to relatively novice developers, as every major javascript library makes heavy use of them. ~TJ
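The "declare all locks up front" pattern Tab endorses could look roughly like the following. `acquireLocks` is a hypothetical API (nothing like it exists in HTML5), and a single flag stands in for a real per-resource lock table; the point is only that rejecting nested acquisition makes deadlock structurally impossible:

```javascript
// Toy model of a declare-everything-up-front lock block. acquireLocks()
// is hypothetical; a real UA would block on per-resource locks instead of
// using one boolean flag.
const held = { active: false };

function acquireLocks(resources, callback) {
  if (held.active) {
    throw new Error('no further locks may be acquired inside a lock block');
  }
  held.active = true; // a real UA would block here until all locks are free
  try {
    callback(resources);
  } finally {
    held.active = false; // everything is released when the callback exits
  }
}

acquireLocks(['cookies', 'localStorage'], function (resources) {
  // Both declared resources are usable here...
  let nestedRejected = false;
  try {
    acquireLocks(['database'], function () {}); // ...but nesting is rejected
  } catch (e) {
    nestedRejected = true;
  }
  console.log(nestedRejected); // true
});
```

Since no lock can be requested while another is held, the circular wait that produces deadlock can never form.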
Re: [whatwg] Worker feedback
On Mon, Mar 30, 2009 at 6:45 PM, Robert O'Callahan rob...@ocallahan.org wrote: We have no way of knowing how much trouble this has caused so far; non-reproducibility means you probably won't get a good bug report for any given incident. It's even plausible that people are getting lucky with cookie races almost all the time, or maybe cookies are usually used in a way that makes them a non-issue. That doesn't mean designing cookie races in is a good idea.

So, the first argument against cookie races was "this is the way the web works now - if we introduce cookie races, we'll break the web". When this was proven to be incorrect (IE does not enforce exclusive access to cookies), the argument morphed into "the web is breaking right now and nobody notices", which is more an article of faith than anything else. I agree that designing in cookie races is not a good idea. If we could go back in time, we might design a better API for cookies that didn't introduce race conditions. However, given where we are today, I'd say that sacrificing performance (in the form of preventing parallel network calls/script execution) in order to provide theoretical correctness for an API that is already quite happily racy is not a good tradeoff. In this case, I think the spec should describe the current implementation of cookies, warts and all. -atw
Re: [whatwg] Worker feedback
On Tue, Mar 31, 2009 at 6:25 PM, Robert O'Callahan rob...@ocallahan.org wrote: We know for sure it's possible to write scripts with racy behaviour, so the question is whether this ever occurs in the wild. You're claiming it does not, and I'm questioning whether you really have that data.

I'm not claiming it *never* occurs, because in the vasty depths of the internet I suspect *anything* can be found. Also, my rhetorical powers aren't up to the task of constructing a negative proof :)

We don't know how much (if any) performance must be sacrificed, because no-one's tried to implement parallel cookie access with serializability guarantees. So I don't think we can say what the correct tradeoff is.

The spec as proposed states that script that accesses cookies cannot operate in parallel with network access on those same domains. The performance impact of something like this is pretty clear, IMO - we don't need to implement it and measure it to know it exists and in some situations could be significant.

You mean IE and Chrome's implementation, I presume, since Firefox and Safari do not allow cookies to be modified during script execution AFAIK. I think the old spec language captured the intent quite well - document.cookie is a snapshot of an inherently racy state, which is the set of cookies that would be sent with a network call at that precise instant. Due to varying browser implementations, that state may be less racy on some browsers than on others, but the general model was one without guarantees. I understand the philosophy behind serializing access to shared state, and I agree with it in general. But I think we need to make an exception in the case of document.cookie based on current usage and expected performance impact (since it impacts our ability to parallelize network access and script execution).
In this case, the burden of proof has to fall on those trying to change the spec - I think we need a compelling real-world argument why we should be making our browsers slower. The pragmatic part of my brain suggests that we're trying to solve a problem that exists in theory, but which doesn't actually happen in practice. Anyhow, at this point I think we're just going around in circles about this - I'm not sure that either of us are going to convince the other, so I'll shut up now and let others have the last word :) -atw
Re: [whatwg] Worker feedback
On Fri, Mar 27, 2009 at 6:23 PM, Ian Hickson i...@hixie.ch wrote: Another use case would be keeping track of what has been done so far; for this I guess it would make sense to have a localStorage API for shared workers (scoped to their name). I haven't added this yet, though.

On a related note, I totally understand the desire to protect developers from race conditions, so I understand why we've removed localStorage access from dedicated workers. In the past we've discussed having synchronous APIs for structured storage that only workers can use - it's a much more convenient API, particularly for applications porting to HTML5 structured storage from Gears. It sounds like if we want to support these APIs in workers, we'd need to enforce the same kind of serializability guarantees that we have for localStorage in browser windows (i.e. add some kind of structured storage mutex similar to the localStorage mutex).

Gears had an explicit permissions variable applications could check, which seems valuable - do we do anything similar elsewhere in HTML5 that we could use as a model here? HTML5 so far has avoided anything that requires explicit permission grants, because they are generally a bad idea from a security perspective (users will grant any permissions the system asks them for).

The Database spec has a strong implication that applications can request a larger DB quota, which will result in the user being prompted for permission either immediately, or at the point that the default quota is exceeded. So it's not without precedent, I think. Or maybe I'm just misreading this: User agents are expected to use the display name and the estimated database size to optimize the user experience. For example, a user agent could use the estimated size to suggest an initial quota to the user.
This allows a site that is aware that it will try to use hundreds of megabytes to declare this upfront, instead of the user agent prompting the user for permission to increase the quota every five megabytes. There are many ways to expose this, e.g. asynchronously as a drop-down infobar, or as a pie chart showing the disk usage that the user can click on to increase the allocation whenever they want, etc.

Certainly. I actually think we're in agreement here - my point is not that you need a synchronous permission grant (since starting up a worker is an inherently asynchronous operation anyway), just that there's precedent in the spec for applications to request access to resources (storage space, persistent workers) that are not necessarily granted to all sites by default. It sounds like the specifics of how the UA chooses to expose this access control (pie charts, async dropdowns, domain whitelists, trusted zones with security levels) are left to the individual implementation.

Re: cookies: I suppose that network activity should also wait for the lock. I've made that happen.

Seems like that would restrict parallelism between network loads and executing javascript, which seems like the wrong direction to go. It feels like we are jumping through hoops to protect running script from having document.cookie modified out from underneath it, and now some of the ramifications may have real performance impacts. From a pragmatic point of view, I just want to remind people that many current browsers do not make these types of guarantees about document.cookie, and yet the tubes have not imploded.

Cookies have a cross-domain aspect (multiple subdomains can share cookie state at the top domain) - does this impact the specification of the storage mutex, since we need to lock out multiple domains? There's only one lock, so that should work fine. OK, I was assuming a single per-domain lock (a la localStorage), but it sounds like there's a group lock, cross-domain.
This makes it even more onerous if network activity across all related domains has to serialize on a single lock. -atw
Re: [whatwg] AppCache and SharedWorkers?
On Thu, Mar 26, 2009 at 3:58 AM, Alexey Proskuryakov a...@webkit.org wrote: Letting faceless background processes update themselves without user consent is not necessarily desirable. I think that they need browser UI for this, and/or associated HTML configuration pages that could (among other duties) trigger application cache update. I'd be curious about why you think this is a problem, especially given the existence of importScripts() and XHR which allow workers to load scripts dynamically anyway. ApplicationCache for persistent workers would enable them to continue running even when offline - I don't see that it introduces any new security/permission wrinkles, though. If you don't provide something like that, then you'll have workers doing things like using XHR to download script, store it in the data store, then eval() it at load time to roll their own manual offline support.
Re: [whatwg] AppCache and SharedWorkers?
On Thu, Mar 26, 2009 at 1:19 PM, Alexey Proskuryakov a...@webkit.org wrote: But I was looking at this in terms of a model for users, not any specific security threats - if we think of persistent workers as an equivalent of native applications that need installation, then we should consider that native applications don't usually update themselves without user consent. It seems like a common model is for offline-enabled applications to store their javascript in the ApplicationCache, and encourage users to create desktop links to access those apps even when offline. Should these applications (which for all intents are installed) also prompt users before updating? Are you suggesting that user agents may want to require explicit user permission when any application invokes ApplicationCache.update()? That might be a reasonable approach if a given user agent wants to enforce some kind of no silent update policy... -atw
Re: [whatwg] AppCache and SharedWorkers?
On Wed, Mar 25, 2009 at 2:11 PM, Michael Nordman micha...@google.com wrote: The appcache spec has changed since Ian and I sent these old messages. Child browsing contexts (nested iframes) no longer inherit the appcache of their parent context (frame) by default. How's this for a starting point for how these things interact...

* Dedicated worker contexts should be associated with an appcache according to the same resource loading and cache selection logic used for child browsing contexts. (So just like navigating an iframe.)

Since dedicated workers are tightly tied (1:1) to a specific top-level browsing context, I'd say that they should use the same appcache as the document that started them.

* Shared (or persistent) worker contexts should be associated with an appcache according to the same resource loading and cache selection logic used for top-level browsing contexts. (So just like navigating a window.)

That may make sense for shared workers, I think. For persistent workers I think this is a problem - persistent workers need a way to manage their own app cache, since they are not guaranteed to have any open windows/documents associated with them. My concern is that app cache manifests are only specified via manifest html tags, which makes them applicable only to HTML documents (you can't associate a manifest with a worker, since there's no document to put the manifest tag in).

At least one question, I'm sure there are others... What does a shared (or persistent) worker do when the appcache it's associated with is updated? Is there a way to reload itself with the new script in the latest version of the appcache? What about message ports between the worker and other contexts? One could imagine that the worker would reload its javascript via importScripts(). It kind of assumes that the script is idempotent, though. -atw
Re: [whatwg] AppCache and SharedWorkers?
Good point - I like the idea of nested workers, especially if the SharedWorker uses the pattern where it just passes off all incoming message ports directly to the nested worker so it doesn't have to proxy messages. It'd have to have some app-specific mechanism to get them all back when it wants to restart the nested worker, though :) -atw

On Wed, Mar 25, 2009 at 5:09 PM, David Levin le...@google.com wrote: On Wed, Mar 25, 2009 at 3:01 PM, Drew Wilson atwil...@google.com wrote: [snip] One could imagine that the worker would reload its javascript via importScripts(). It kind of assumes that the script is idempotent, though.

Similarly one could use nested workers (which I like because it gives the new script a new global object). The shared/persistent worker would start a nested worker. Then for a reload, it could shut down the current nested worker and start up a new one. Regarding message ports, it would be up to the implementation to decide if the shared/persistent worker followed a pointer-to-implementation pattern or if it handed out message ports directly to the nested worker. Dave -atw
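David's nested-worker restart pattern can be modeled as follows. Workers and message ports are simulated with plain objects (the real Worker/MessagePort APIs are browser-only), so only the ownership dance is shown: the outer worker hands every incoming port to a nested worker, reclaims them when the appcache updates, and re-adopts them into a freshly started nested worker:

```javascript
// Node-runnable model of the nested-worker restart pattern. All names
// (makeNestedWorker, onAppCacheUpdate, etc.) are illustrative, not real APIs.
function makeNestedWorker(version) {
  return {
    version,
    ports: [],
    adoptPort(port) { this.ports.push(port); },
    terminate() { const ports = this.ports; this.ports = []; return ports; },
  };
}

function makeOuterWorker() {
  let inner = makeNestedWorker(1);
  return {
    // onconnect: forward the incoming port straight to the nested worker,
    // so the outer worker never has to proxy messages.
    connect(port) { inner.adoptPort(port); },
    // appcache updated: restart the nested worker with the same ports
    // (the app-specific "get the ports back" mechanism Drew mentions).
    onAppCacheUpdate() {
      const ports = inner.terminate();
      inner = makeNestedWorker(inner.version + 1);
      ports.forEach((p) => inner.adoptPort(p));
    },
    innerVersion() { return inner.version; },
    portCount() { return inner.ports.length; },
  };
}

const outer = makeOuterWorker();
outer.connect({ id: 'window-1' });
outer.connect({ id: 'window-2' });
outer.onAppCacheUpdate(); // simulate the update event
console.log(outer.innerVersion(), outer.portCount()); // 2 2
```

The attraction of the pattern is the one David names: the replacement script gets a brand-new global object instead of re-running inside a dirty one.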
[whatwg] AppCache and SharedWorkers?
I'm trying to understand the ApplicationCache spec as it applies to workers, but I didn't find anything promising when I searched the archives. Is ApplicationCache intended to apply to workers? The application cache API isn't available to workers, but I'm guessing the intent is that if an application creates a dedicated worker, then worker requests (like importScripts()) would come out of the cache inherited from the parent document. If not, then it seems impossible to support running workers in offline mode. Since SharedWorkers are shared by multiple windows, there's some ambiguity about which app cache they should use (perhaps always the one from the creator window?) - it seems like an app might get different SharedWorkers loading from different app caches depending on the order in which different windows create them, which seems like a dubious outcome. Has this been discussed previously? -atw
Re: [whatwg] localStorage + worker processes
The problem is that .length is basically useless without some kind of immutability guarantees. I've thought about this more, and I'm afraid that if you start making the API cumbersome (forcing only async access) then apps are just going to use document.cookie instead of localStorage. I'd hate to see us radically change the API to support the worker case - I'd rather get rid of localStorage support from workers, or else just enforce a max time that a worker can hold the lock. -atw

On Sun, Mar 22, 2009 at 10:46 AM, Michael Nordman micha...@google.com wrote: On Sat, Mar 21, 2009 at 3:25 PM, Aaron Boodman a...@google.com wrote: On Sat, Mar 21, 2009 at 1:51 PM, Jonas Sicking jo...@sicking.cc wrote: The problem with synchronously grabbing the lock is that we can only ever have one feature that uses synchronous locks, otherwise we'll risk dead-locks. Say that we make document.cookie behave the same way (to prevent multi-process browsers like IE8 and Chrome from having race conditions), so that if you call document.getCookiesWithLock(callback) we'll synchronously grab a lock and call the callback function. This would cause two pages like the ones below to potentially deadlock:

Page 1:
getLocalStorage(function(storage) {
  document.getCookiesWithLock(function(cookieContainer) {
    storage.foo = cookieContainer.getCookie('cookieName');
  });
});

Page 2:
document.getCookiesWithLock(function(cookieContainer) {
  getLocalStorage(function(storage) {
    cookieContainer.setCookie('cookieName', storage.bar);
  });
});

Good point. Ok, I agree that an asynchronous callback makes most sense for this API. Given an async api, would it be possible to store values into localStorage at onunload time? I expect that could be a useful time to use this API.

function onunload() {
  getLocalStorage(function(storage) {
    // Will this ever execute?
  });
}

Locking the storage until script completion isn't really necessary in many cases. Maybe we're over-engineering this?
Suppose immutability across calls was generally not guaranteed by the existing API. And we add an async getLocalStorage(callback) which does provide immutability for the duration of the callback if that is desired.
Re: [whatwg] localStorage + worker processes
If you deny workers, you can enforce exclusive access to localStorage by applying a lock that extends from the first access of localStorage until the script re-enters the event loop. Page script is guaranteed to re-enter the event loop fairly quickly (lest it trigger the browser's "this script is taking too long to run" protection) so you won't get starvation. Since worker script never has to re-enter the event loop, this isn't a feasible solution for workers. That's why I'm proposing that the most reasonable implementation is just to have a simple lock like I describe above, and then either deny access to localStorage to dedicated workers (shared workers can silo the storage as I described previously), or else just enforce a limit to how long workers can hold the localStorage lock (if they hold it beyond some period, they get terminated just like page script that doesn't re-enter the event loop). -atw

On Sun, Mar 22, 2009 at 12:07 PM, Michael Nordman micha...@google.com wrote: I don't see how denying workers solves the problem. In a multi-threaded browser, this has to be resolved reasonably even in the absence of workers.
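The enforcement Drew proposes - terminate any worker that holds the storage lock past a limit - can be sketched as a simulation. `makeStorageLock` and its policy are hypothetical; nothing like this was specified:

```javascript
// Sketch of a bounded-hold storage lock: the worker-side analogue of the
// "script is taking too long" dialog. A real UA would kill the worker;
// here we just throw after the fact.
function makeStorageLock(maxHoldMs) {
  return {
    run(script) {
      const acquiredAt = Date.now();   // lock taken on first storage access
      const result = script();         // worker runs without yielding...
      const heldFor = Date.now() - acquiredAt;
      if (heldFor > maxHoldMs) {
        // ...and is terminated if it held the lock past the limit.
        throw new Error('worker terminated: lock held ' + heldFor + 'ms');
      }
      return result;                   // lock released on return
    },
  };
}

const lock = makeStorageLock(100);
console.log(lock.run(() => 'quick access is fine'));
```

Page script never hits the limit because it must yield to the event loop anyway; only a worker that spins while holding the lock gets killed.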
Re: [whatwg] Worker and message port feedback
FWIW, I wrote my tests using IE7, not IE8. The original argument I was countering was "browsers currently enforce synchronous access to cookies, so we can't add asynchronous access via workers because that will break existing sites". Clearly, this argument was incorrect, since the core assumption about current browser behavior was wrong - in point of fact, the majority of browsers in use today make no such guarantees. So giving workers access to document.cookie is compatible both with the current language in the spec *and* the current behavior of the majority of browser implementations. That said, if we don't think this behavior is acceptable (and there are good arguments against it), then we should change the spec for cookies to disallow it. -atw

On Sat, Mar 21, 2009 at 2:13 PM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Mar 20, 2009 at 3:29 PM, Ian Hickson i...@hixie.ch wrote: On Sat, 7 Mar 2009, Jonas Sicking wrote: document.cookie can't change in the middle of an execution. I.e. a script like: a = document.cookie; b = document.cookie; alert(a === b); will always show 'true'.

On Mon, 9 Mar 2009, Drew Wilson wrote: Following up on this. I created two pages, one that tests cookies in a loop, and one that sets cookies in a loop, and ran them in separate windows in Firefox 3, IE7, and Chrome. Chrome and IE7 currently allow concurrent modification of document.cookie (i.e. the test loop throws up an alert). Firefox does not.

I do not think there is a problem with providing self.cookie in workers, exposing the cookie of the script. However, currently there doesn't seem to be much support for this. What do other browser vendors think of this? Jonas, given the above information regarding IE's behaviour, do you still think that providing such an API in workers is a problem?

It's the vendors that have exposed their users to this inconsistency that you should ask. Or maybe sites that use document.cookie a lot and that have a lot of Chrome or IE8 users.
Though both of those browsers might be too new to have received a lot of feedback regarding this. Note that this is only really a problem on sites that modify document.cookie a lot, and where users have multiple tabs open to the same site. Personally I don't see how this couldn't be a problem. The only thing that'd save us is that cookies are generally not heavily used. But I bet there are sites out there that do use document.cookie a lot. / Jonas
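The race that Drew's two test pages exercised can be reduced to a minimal interleaving. `cookieJar` stands in for the browser's shared cookie store, and the statements are ordered the way two pages in separate processes can interleave; this is a simulation, not browser code:

```javascript
// Minimal model of the cookie race: two reads of document.cookie in one
// page, with another page's write landing between them.
const cookieJar = { value: 'x=0' };
const readCookie = () => cookieJar.value;                    // page 1's read
const writeCookie = (n) => { cookieJar.value = 'x=' + n; };  // page 2's write

const first = readCookie();    // page 1 samples the cookie...
writeCookie(1);                // ...page 2 rewrites it between the samples...
const second = readCookie();   // ...so page 1's two reads disagree.
console.log(first === second); // false -- the mismatch Drew's test loop alerted on
```

Firefox forbids this interleaving during script execution; IE7 and Chrome allow it, which is the empirical point under dispute.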
Re: [whatwg] localStorage + worker processes
That might work. Is it feasible for user agents to enforce limits on how long a callback is allowed to run holding the lock? That way workers can't starve normal pages from accessing their local storage. -atw

On Sat, Mar 21, 2009 at 12:48 AM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Mar 20, 2009 at 3:10 PM, Aaron Boodman a...@google.com wrote: I think the best option is to make access to localstorage asynchronous for workers. This reduces the amount of time a worker can hold the localstore lock so that it shouldn't be a problem for normal pages. It sucks to make such a simple and useful API async though.

As I understand the current API (on the main window) to be defined, as soon as someone accesses the .localStorage property, the implementation is supposed to acquire a lock. This lock would be held on to until that script returns to the event loop for that thread. So if javascript in another window, running in another thread or process, tries to access .localStorage for the same origin, the .localStorage getter would try to acquire the same lock and block until the first thread releases the lock. This could in theory be applied to workers as well. However, as Jeremy points out, that could result in a worker script running for a very long time blocking the window thread.

What we could do is to have an API like getLocalStorage(callback); This function returns immediately, but will then call the callback function as soon as the localStorage becomes available and the lock has been acquired. This would always happen asynchronously off the event loop, which means that once the callback returns, the lock is released again. Of course, it would still mean that a window context or worker could hold on to the lock for an indefinite time, but as long as the async getLocalStorage API is used, this means that no thread is blocked, just that they aren't able to get access to the localStorage.
So for example, the following code would safely add 1 to the 'foo' property in localStorage:

getLocalStorage(function(store) {
  store.foo = parseInt(store.foo, 10) + 1;
});

Additionally, we would have to define that if the store object passed to the callback function is accessed after the callback has ended, this will throw an exception. Whether the object is reactivated the next time a callback is entered, or a new storage object is created, also needs to be defined. This new API I believe is good enough to be used both from workers and window contexts. We could even keep the current API implemented in IE8, or we could just ask people to write a wrapper for IE8 like:

function getLocalStorage(callback) {
  setTimeout(function() { callback(localStorage); }, 0);
}

In an implementation that implements correct locking for the synchronous API, this will even produce the correct locking behavior for the new API. / Jonas
Re: [whatwg] localStorage + worker processes
I agree with Jeremy that the spec is currently unimplementable if we give localStorage access to workers. I'd like to point out that "workers who want to access localStorage just need to send a message to their main window" breaks down for persistent workers (where you don't necessarily have an open browser window) and is pretty wonky for shared workers (you can send a message to a window, but that window may go away before your message is processed, so you end up having to build some kind of "send message, timeout, pick a new window to send it to, etc." message layer). One alternative I'd like to propose is to remove access to localStorage for dedicated workers, and give SharedWorkers access to localStorage, but have that storage be partitioned by the worker name (i.e. the worker can access it, but it's not shared with web pages or any other workers, so you don't have any synchronicity issues). I don't see how this would work for dedicated workers, though, since there's no name to partition storage access by, but they could always fall back to postMessage(). -atw

On Fri, Mar 20, 2009 at 2:19 PM, Oliver Hunt oli...@apple.com wrote: When discussing this standard we have to recognize that not all browsers actually have a main thread. Time will tell whether more or fewer browsers of the future will have multi-threaded architectures, but the trend has been toward more, I think. Any aspect of the spec that asserts or assumes a main thread is questionable. Yes they do -- we're talking about the main thread from the point of view of javascript, which is not necessarily the UI thread. The important thing with the current model is that JS on any thread may be blocked by js executing in a worker, which leads to a page (in effect) locking up -- the UI may still be functional, but that particular page will have hung. --Oliver
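The partition-by-name proposal above can be sketched as follows. `storageForWorker` is hypothetical (no such API shipped); the point is that because each shared worker name maps to a private area, no state is shared across contexts and no locking is needed:

```javascript
// Sketch of name-partitioned worker storage. Each shared worker name gets
// its own private area, invisible to pages and to other workers.
const partitions = new Map();

function storageForWorker(name) {
  if (!partitions.has(name)) partitions.set(name, new Map());
  const area = partitions.get(name);
  return {
    // localStorage-style accessors over the private partition
    getItem: (key) => (area.has(key) ? area.get(key) : null),
    setItem: (key, value) => { area.set(key, String(value)); },
  };
}

const mailStore = storageForWorker('mail-sync');  // illustrative worker names
const calStore = storageForWorker('calendar');
mailStore.setItem('cursor', 42);
console.log(mailStore.getItem('cursor')); // "42"
console.log(calStore.getItem('cursor'));  // null -- a different partition
```

Since a named shared worker is a single thread of execution and nothing else can see its partition, the synchronous API stays safe without any mutex.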
Re: [whatwg] Worker and message port feedback
Thanks, Ian - this was great feedback. On Fri, Mar 20, 2009 at 3:29 PM, Ian Hickson i...@hixie.ch wrote: It is unclear to me why you would need access to the cookies from script to do cookie-based authentication. Isn't the server the one that sets the cookie and the one that uses it when it is returned? Could you elaborate on how you see the cookie API being used? Good point. Cookie-based auth is not a great use case, because as you point out, you could just do this by passing credentials to the server via an XHR request and have it set your cookies. I guess the motivation for allowing cookies to be set from workers is the same as the motivation for allowing web-page script to set cookies - perhaps this motivation is deprecated now that we have localStorage but even localStorage doesn't seem to have the nice cross-sub-domain sharing that cookies allow. (notifications): These seem like good use cases, but it's not clear what the user interface would look like, which is probably the hardest problem here. Agreed. This will be tricky especially given the restrictions on some platforms - I'm hoping we can do some experiments and come up with something that's compelling. Gears had an explicit permissions variable applications could check, which seems valuable - do we do anything similar elsewhere in HTML5 that we could use as a model here? HTML5 so far has avoided anything that requires explicit permission grants, because they are generally a bad idea from a security perspective (users will grant any permissions the system asks them for). The Database spec has a strong implication that applications can request a larger DB quota, which will result in the user being prompted for permission either immediately, or at the point that the default quota is exceeded. So it's not without precedent, I think. Or maybe I'm just misreading this: User agents are expected to use the display name and the estimated database size to optimize the user experience. 
For example, a user agent could use the estimated size to suggest an initial quota to the user. This allows a site that is aware that it will try to use hundreds of megabytes to declare this upfront, instead of the user agent prompting the user for permission to increase the quota every five megabytes. To be clear, are you saying that our philosophy is to leave any permissions granting up to the individual user agent (i.e. not described in the spec)? Or that we're trying to avoid specifying functionality that might be invasive enough to require permissions? The namespace for PersistentWorkers is identical to that of SharedWorkers - for example, if you already have a PersistentWorker named 'core' under a domain and a window tries to create a SharedWorker named 'core', a security exception will be thrown, just as if a different URL had been specified for two identically-named SharedWorkers. Why would we not want them to use different namespaces? I've rethought this, and I actually agree that they should have different namespaces. In fact, I'd go further - I don't think we should even *have* names for persistent workers (the use case for having names is "what if I want to run the same worker multiple times without having to host multiple scripts", which I don't think really applies to persistent workers). Also, one of the things I'd like to experiment with in my implementation is allowing cross-domain access to workers (this is required if you want workers to be able to communicate/share resources across domains, since workers don't have access to any of the cross-domain functionality that window-based script has) - getting rid of the name and always having persistent workers identified by their script url helps enable this, and avoids some security issues, such as the ones described in this old Gears proposal I came across: http://code.google.com/p/gears/wiki/CrossOriginAPI I would be very concerned about this getting abused for popups.
I don't think we want to allow arbitrary windows to be opened. I could see allowing a kind of simple toast popup that pops up a link which, when clicked, _then_ opens a window; would that work? Something like: void notify(message, url); That could definitely work. Agreed that we'd want to force some kind of user interaction first to avoid popup spam. Additionally, there's no good way for workers under different domains to talk to one another (a window can use the cross-domain messaging functionality to talk to other domains, but there's no analog for this for workers). This has been intentionally delayed while we wait for more implementation experience. I'm hoping to experiment with this some (per my earlier comment), so hopefully I'll be able to report back with some interesting data points (or at least my miserable failure will serve as an object lesson for future implementors :). To give people a more concrete example of one of the use cases that is driving
Re: [whatwg] Proposal for enhancing postMessage
Yes, it sends a clone, but the source port becomes unentangled (inactive) - step 5 of the "clone a port" specification reads: Entangle (http://www.whatwg.org/specs/web-apps/current-work/multipage/comms.html#entangle) the remote port and new port objects. The original port object will be unentangled by this process. So, cloning a port has the effect of killing the original port - the intent, I think, is for the sender to permanently hand off ownership of the port to the recipient, not to duplicate the port itself. I think this is a side issue, though - I agree that since you can effectively pass a message consisting of multiple objects, you probably ought to be able to pass multiple MessagePorts along with them. As you point out, it's primarily a matter of convenience/efficiency - you could still get the same functionality by individually serializing each object/port pair. -atw

On Fri, Mar 13, 2009 at 2:06 PM, Mark S. Miller erig...@google.com wrote: On Wed, Mar 11, 2009 at 2:30 PM, Drew Wilson atwil...@google.com wrote: Mark, I won't pretend to completely understand the use cases you're describing, as I'm not familiar with the prior work you've cited. But my understanding of the postMessage() API is that it is primarily useful for handing off ports to new owners - your idea of a pass-by-copy serialization of a proxy object implies that there's some way to copy the message port and pass that along with the proxy to the new owner, which I don't think is possible in general (you can create a new port as part of a MessageChannel, but you can't really duplicate an existing port). I may be misunderstanding the use case that's driving your proposal, though. And I may be misunderstanding the postMessage draft spec.
But step 4 of 7.4.4 at http://www.whatwg.org/specs/web-apps/current-work/multipage/comms.html#posting-messages-with-message-ports reads: Try to obtain a new port by cloning the messagePort argument with the Window object on which the method was invoked as the owner of the clone. If this returns an exception, then throw that exception and abort these steps. Doesn't this mean that sending a MessagePort actually sends a clone? -- Cheers, --MarkM
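The semantics the two messages are converging on can be modeled in a few lines. These are plain objects, not the real MessagePort API: transferring a port entangles the remote end with a *new* port and deactivates the original, so the sender permanently gives the port up:

```javascript
// Toy model of MessageChannel entanglement and the "clone a port" handoff.
function makeChannel() {
  const port1 = { peer: null, active: true };
  const port2 = { peer: null, active: true };
  port1.peer = port2; // entangle the pair
  port2.peer = port1;
  return { port1, port2 };
}

// Models step 5 of the clone-a-port algorithm: re-entangle the remote port
// with the clone, and unentangle (kill) the original.
function transferPort(port) {
  const clone = { peer: port.peer, active: true };
  port.peer.peer = clone; // the remote end now points at the clone
  port.active = false;    // the sender's copy goes dead
  port.peer = null;
  return clone;
}

const { port1, port2 } = makeChannel();
const received = transferPort(port2); // what postMessage(msg, port2) hands over
console.log(port2.active);            // false: the sender's port is killed
console.log(received.peer === port1); // true: the clone is entangled instead
```

This is why "sends a clone" and "hands off ownership" are both accurate: the recipient gets a functioning clone, and the original is useless afterwards.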