Re: [webkit-dev] Passing data structures through postMessage()
On Thu, Sep 10, 2009 at 5:21 PM, Maciej Stachowiak m...@apple.com wrote:

On Sep 10, 2009, at 3:12 PM, Chris Campbell wrote:

Hi All,

I had it in mind to implement support for passing data structures through postMessage() using the structured clone algorithm laid out in the HTML5 spec:

http://dev.w3.org/html5/spec/Overview.html#posting-messages
http://dev.w3.org/html5/spec/Overview.html#safe-passing-of-structured-data

I've had some brief discussion with Dave Levin and Drew Wilson on #chromium IRC about this, and have an approach in mind that follows and elaborates on their suggestions, but there are still some holes in it and I'd very much like input from people familiar with this area.

Currently, there are several postMessage() handlers (in MessagePort.idl, DOMWindow.idl, DedicatedWorkerContext.idl, and Worker.idl), and they all take a DOMString for the message parameter.

The general idea is to change that message parameter from a DOMString to a new AttributeIterator type that allows walking of any sort of JS structure/type. AttributeIterator would essentially be an adapter to translate native V8 or JSC JavaScript objects to an agnostic interface suitable for walking the structure and serializing it. I'm thinking it would look something like this:

    interface AttributeIterator {
        bool isUndefined();
        bool isNull();
        bool isFalse();
        bool isTrue();
        bool isNumber();
        String getNumber();
        bool isString();
        // ... cover the other types including Date, RegExp, ImageData,
        // File, FileData, and FileList ...

        // Retrieve the key for the next property (for Objects and Arrays)
        String nextEnumerableProperty();
        AttributeIterator getPropertyValue(key);
    }

Some thoughts off the cuff:

I think it would be better to write custom code for doing the cloning for each JS engine. For one thing, turning everything into a string is not necessarily the most efficient way to do things. It might be possible to clone the object graph more directly in a threadsafe way. Also, things like File and ImageData fundamentally can't be turned into a string (well ImageData can).

What does it mean to send a File along? Isn't that just a reference to a file? A handle of sorts. It seems like that is something that can be passed between threads easily enough.

What is the use case for sending ImageData? It seems like it could be to allow a worker to modify the pixels of a canvas or it could be to share a separate copy of the data (copy-on-write). It may be undesirable to make the canvas have to support having its data be modified by a background thread / background process.

-Darin

Second, to be really agnostic about it, postMessage should take a generic Message object that has some methods that can be used to do the cross-thread clone without building in the assumption that it serializes to a string in the middle.

Third, using a stateful iterator for this instead of an interface representing a value is not a great design. Iterators are not good when what you are traversing is in general a graph.

Fourth, this doesn't give a way to detect if an object graph is not in fact legal to clone.

It's kind of a shame that we have the baggage of multiple JavaScript engines to contend with. Trying to do things in a generic way will make this task needlessly more difficult.

You may also want to get input on this from Oliver Hunt, who wrote the JSC JSON parse/stringify code; from past discussions I know he has opinions on how cross-thread cloning should work.

I'm also thinking that depending on compile-time flags, the constructor for AttributeIterator would either take a v8::Handle<v8::Value> or JSC::JSValue value. Then in each implementation of postMessage() the AttributeIterator instance could be passed to the structured clone serializer, which would return a string. Thereafter, no changes would be required to WebCore internals since they already pass strings around... until on the receiving end we get to MessageEvent.data where we would do the deserialization in a custom getter.

Open questions:

(1) Is passing an AttributeIterator type into postMessage() really the best way to go? Drew mentioned that this might incur a bunch of ObjC binding work on the JSC side...
(2) Where should AttributeIterator live in the source tree?
(3) Where should the serialization and deserialization routines live in the source tree?
(4) I haven't addressed the specifics of the serialized string format. Plain JSON is not quite sufficient since it doesn't retain type information for Date, RegExp, etc. However, I'm not too worried about coming up with a suitable format for this.

Comments, advice, admonitions welcome! :)

In general I'm not sure this approach is workable. At the very least File, FileData, FileList and ImageData need to be passed as something other than strings. And I think the value of making the object graph traversal code
Re: [webkit-dev] Passing data structures through postMessage()
Note that ImageData is cloned/copied when sent via postMessage(). So you end up with a copy of the ImageData, obviating the need for locks.

I think (per my private mail to Darin which I'm restating here) that the use case is doing some kind of raytracing or something in a Worker, posting the result to a page, then blitting the result back to the canvas via putImageData().

-atw

On Fri, Sep 11, 2009 at 10:54 AM, Darin Fisher da...@google.com wrote:

What does it mean to send a File along? Isn't that just a reference to a file? A handle of sorts. It seems like that is something that can be passed between threads easily enough.

What is the use case for sending ImageData? It seems like it could be to allow a worker to modify the pixels of a canvas or it could be to share a separate copy of the data (copy-on-write). It may be undesirable to make the canvas have to support having its data be modified by a background thread / background process.

-Darin
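Drew's point that the receiver gets a copy, so no locks are needed, can be illustrated with a minimal sketch. It assumes a hypothetical `PixelBuffer` type as a stand-in for ImageData (so it runs without a DOM) and a hypothetical `clonePixels` helper; neither name comes from the thread or from any browser API.

```typescript
// Hypothetical stand-in for ImageData so the sketch runs without a DOM.
interface PixelBuffer {
  width: number;
  height: number;
  data: Uint8ClampedArray; // RGBA bytes, four per pixel
}

// Copy the pixels into a new buffer, mirroring the copy that is made
// when ImageData is sent through postMessage().
function clonePixels(source: PixelBuffer): PixelBuffer {
  return {
    width: source.width,
    height: source.height,
    data: new Uint8ClampedArray(source.data), // allocates and copies the bytes
  };
}

// "Worker" side: render one red pixel into a private buffer.
const rendered: PixelBuffer = {
  width: 1,
  height: 1,
  data: new Uint8ClampedArray([255, 0, 0, 255]),
};

// "Page" side: receives its own copy and may change it freely.
const received = clonePixels(rendered);
received.data[0] = 0;

// The two buffers are independent, so neither side needs a lock.
console.log(rendered.data[0], received.data[0]);
```

Because the two sides never share a buffer, the canvas does not have to tolerate a background thread writing to its pixels; the cost is one copy per message.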
Re: [webkit-dev] Passing data structures through postMessage()
The other approach we discussed was leaving the postMessage() APIs in WebCore as they are (taking a DOMString parameter) and doing the serialization/de-serialization in the JS bindings instead.

My one concern about building the serialization into the WebCore postMessage impls is I didn't quite understand how that would map to ObjC (although for ObjC we could just continue exposing only postMessage(DOMString) and have the ObjC bindings wrap the string in an AttributeIterator).

My main concern with any approach is that we find a way to share serialization code between V8 and JSC.

-atw

On Thu, Sep 10, 2009 at 3:12 PM, Chris Campbell campb...@flock.com wrote:

Hi All,

I had it in mind to implement support for passing data structures through postMessage() using the structured clone algorithm laid out in the HTML5 spec:

http://dev.w3.org/html5/spec/Overview.html#posting-messages
http://dev.w3.org/html5/spec/Overview.html#safe-passing-of-structured-data

I've had some brief discussion with Dave Levin and Drew Wilson on #chromium IRC about this, and have an approach in mind that follows and elaborates on their suggestions, but there are still some holes in it and I'd very much like input from people familiar with this area.

Currently, there are several postMessage() handlers (in MessagePort.idl, DOMWindow.idl, DedicatedWorkerContext.idl, and Worker.idl), and they all take a DOMString for the message parameter.

The general idea is to change that message parameter from a DOMString to a new AttributeIterator type that allows walking of any sort of JS structure/type. AttributeIterator would essentially be an adapter to translate native V8 or JSC JavaScript objects to an agnostic interface suitable for walking the structure and serializing it. I'm thinking it would look something like this:

    interface AttributeIterator {
        bool isUndefined();
        bool isNull();
        bool isFalse();
        bool isTrue();
        bool isNumber();
        String getNumber();
        bool isString();
        // ... cover the other types including Date, RegExp, ImageData,
        // File, FileData, and FileList ...

        // Retrieve the key for the next property (for Objects and Arrays)
        String nextEnumerableProperty();
        AttributeIterator getPropertyValue(key);
    }

I'm also thinking that depending on compile-time flags, the constructor for AttributeIterator would either take a v8::Handle<v8::Value> or JSC::JSValue value. Then in each implementation of postMessage() the AttributeIterator instance could be passed to the structured clone serializer, which would return a string. Thereafter, no changes would be required to WebCore internals since they already pass strings around... until on the receiving end we get to MessageEvent.data where we would do the deserialization in a custom getter.

Open questions:

(1) Is passing an AttributeIterator type into postMessage() really the best way to go? Drew mentioned that this might incur a bunch of ObjC binding work on the JSC side...
(2) Where should AttributeIterator live in the source tree?
(3) Where should the serialization and deserialization routines live in the source tree?
(4) I haven't addressed the specifics of the serialized string format. Plain JSON is not quite sufficient since it doesn't retain type information for Date, RegExp, etc. However, I'm not too worried about coming up with a suitable format for this.

Comments, advice, admonitions welcome! :)

Regards,
--Chris

___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
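Chris's remark that plain JSON does not retain type information for Date and RegExp can be checked directly. The sketch below first shows the loss, then one possible way to carry the type along. The tagged form and the names `Tagged`, `encodeTagged`, and `decodeTagged` are assumptions made up for illustration; the thread does not specify any format.

```typescript
// What plain JSON does to values the structured clone algorithm must preserve.
const original = { when: new Date(0), pattern: /ab+c/gi };
const roundTripped = JSON.parse(JSON.stringify(original));

console.log(typeof roundTripped.when); // the Date comes back as a string
console.log(roundTripped.pattern);     // the RegExp comes back as an empty object

// Hypothetical tagged form (not a format from the thread): each value
// records its own type alongside its data. Only the g, i, and m flags
// are handled in this sketch.
type Tagged =
  | { tag: "date"; ms: number }
  | { tag: "regexp"; source: string; global: boolean; ignoreCase: boolean; multiline: boolean };

function encodeTagged(value: Date | RegExp): Tagged {
  if (value instanceof Date) {
    return { tag: "date", ms: value.getTime() };
  }
  return {
    tag: "regexp",
    source: value.source,
    global: value.global,
    ignoreCase: value.ignoreCase,
    multiline: value.multiline,
  };
}

function decodeTagged(tagged: Tagged): Date | RegExp {
  if (tagged.tag === "date") {
    return new Date(tagged.ms);
  }
  const flags =
    (tagged.global ? "g" : "") +
    (tagged.ignoreCase ? "i" : "") +
    (tagged.multiline ? "m" : "");
  return new RegExp(tagged.source, flags);
}

// The tagged form survives a JSON round trip with its type intact.
const revived = decodeTagged(JSON.parse(JSON.stringify(encodeTagged(original.pattern))));
console.log(revived instanceof RegExp);
```

A scheme like this still produces a string, so it does not address the later objections in the thread about File, ImageData, and avoiding serialization altogether.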
Re: [webkit-dev] Passing data structures through postMessage()
On Thu, Sep 10, 2009 at 3:12 PM, Chris Campbell campb...@flock.com wrote:

Then in each implementation of postMessage() the AttributeIterator instance could be passed to the structured clone serializer, which would return a string. Thereafter, no changes would be required to WebCore internals since they already pass strings around...

It should not be necessary to serialize to a string just to pass the structured clones across thread boundaries. This would be an especially bad idea for things like CanvasPixelArray.

I am also not sure I understand the name AttributeIterator.

-Sam
Re: [webkit-dev] Passing data structures through postMessage()
On Thu, Sep 10, 2009 at 3:55 PM, Drew Wilson atwil...@google.com wrote:

The other approach we discussed was leaving the postMessage() APIs in WebCore as they are (taking a DOMString parameter) and doing the serialization/de-serialization in the JS bindings instead.

My one concern about building the serialization into the WebCore postMessage impls is I didn't quite understand how that would map to ObjC (although for ObjC we could just continue exposing only postMessage(DOMString) and have the ObjC bindings wrap the string in an AttributeIterator).

For the time being, ObjC can be ignored as basic concepts like RegExp are not defined in that context. ObjC can continue living with the old style postMessage and be happy doing so :).

-Sam
Re: [webkit-dev] Passing data structures through postMessage()
On Thu, Sep 10, 2009 at 5:29 PM, Oliver Hunt oli...@apple.com wrote:

This is incorrect; from the bindings point of view the type here should be any, which in the JS bindings means ScriptValue. The actual serialisation is by definition bindings dependent, be that JSC or ObjC.

Certainly, from a WebIDL standpoint this is correct. I'm not certain that our current code generator will accept this (the JS bindings are quite flexible, especially if we are talking about custom attributes, but my experience with the ObjC generated bindings is that you need to specify concrete classes in the .idl files or else you end up with type errors at compile time - this could obviously be changed).

--Oliver
Re: [webkit-dev] Passing data structures through postMessage()
No, it is more restrictive now than it was 6 months ago -- I was attempting to implement this back then, and the ambiguity in handling the more excitingly complex objects (which now simply return null) made it substantially more complex. That is the only reason the implementation does not currently match the object clone semantics.

JSON was never sufficient for the purpose of postMessage, and is also relatively trivially breakable.

--Oliver

On Sep 10, 2009, at 5:29 PM, Drew Wilson wrote:

Good point - re-reviewing the Structured Clone spec, I see all kinds of crazy stuff is cloneable now, so string/JSON may not be a particularly good basis. It seems that we'll need to support File access from Workers, since you can clone/send those objects over from page context.

I had expected that having a common serialization format would be useful, but I agree - it probably is better to just send opaque objects around, which might enable WebKit to send actual cloned object instances without requiring any serialization, while chromium can do the serialization itself when sending the data cross-process.

-atw
Re: [webkit-dev] Passing data structures through postMessage()
I'm not sure that it is actually more restrictive now than it was 6 months ago. I assume you're looking at this:

http://www.w3.org/TR/2009/WD-html5-20090212/infrastructure.html#safe-passing-of-structured-data

where it seems to say that all host objects (DOM objects, etc) are cloned as null. Sounds like you looked into it in much more detail than I did, so undoubtedly there are ambiguities that aren't obvious at first glance.

Anyhow, from April up until two weeks ago, ImageData was the only serialized host object:

http://www.w3.org/TR/2009/WD-html5-20090423/infrastructure.html#safe-passing-of-structured-data

...so I'm not totally crazy :)

It's a moot point now - while I could (and did :) imagine a reasonably compact one-off serialization for something like ImageData, the addition of the File types on 8/25 (and who knows what else in the future) makes it clear that serialization is not the way to go.

-atw

On Thu, Sep 10, 2009 at 5:38 PM, Oliver Hunt oli...@apple.com wrote:

No, it is more restrictive now than it was 6 months ago -- I was attempting to implement this back then, and the ambiguity in handling the more excitingly complex objects (which now simply return null) made it substantially more complex. That is the only reason the implementation does not currently match the object clone semantics.

JSON was never sufficient for the purpose of postMessage, and is also relatively trivially breakable.

--Oliver

On Sep 10, 2009, at 5:29 PM, Drew Wilson wrote:

Good point - re-reviewing the Structured Clone spec, I see all kinds of crazy stuff is cloneable now, so string/JSON may not be a particularly good basis. It seems that we'll need to support File access from Workers, since you can clone/send those objects over from page context.

I had expected that having a common serialization format would be useful, but I agree - it probably is better to just send opaque objects around, which might enable WebKit to send actual cloned object instances without requiring any serialization, while chromium can do the serialization itself when sending the data cross-process.

-atw
Re: [webkit-dev] Passing data structures through postMessage()
Given shared workers (and indeed Chromium's out-of-process dedicated workers), it seems like we also have cross-process boundaries to consider.

-Darin

On Thu, Sep 10, 2009 at 5:21 PM, Maciej Stachowiak m...@apple.com wrote:

On Sep 10, 2009, at 3:12 PM, Chris Campbell wrote:

Hi All,

I had it in mind to implement support for passing data structures through postMessage() using the structured clone algorithm laid out in the HTML5 spec:

http://dev.w3.org/html5/spec/Overview.html#posting-messages
http://dev.w3.org/html5/spec/Overview.html#safe-passing-of-structured-data

I've had some brief discussion with Dave Levin and Drew Wilson on #chromium IRC about this, and have an approach in mind that follows and elaborates on their suggestions, but there are still some holes in it and I'd very much like input from people familiar with this area.

Currently, there are several postMessage() handlers (in MessagePort.idl, DOMWindow.idl, DedicatedWorkerContext.idl, and Worker.idl), and they all take a DOMString for the message parameter.

The general idea is to change that message parameter from a DOMString to a new AttributeIterator type that allows walking of any sort of JS structure/type. AttributeIterator would essentially be an adapter to translate native V8 or JSC JavaScript objects to an agnostic interface suitable for walking the structure and serializing it. I'm thinking it would look something like this:

    interface AttributeIterator {
        bool isUndefined();
        bool isNull();
        bool isFalse();
        bool isTrue();
        bool isNumber();
        String getNumber();
        bool isString();
        // ... cover the other types including Date, RegExp, ImageData,
        // File, FileData, and FileList ...

        // Retrieve the key for the next property (for Objects and Arrays)
        String nextEnumerableProperty();
        AttributeIterator getPropertyValue(key);
    }

Some thoughts off the cuff:

I think it would be better to write custom code for doing the cloning for each JS engine. For one thing, turning everything into a string is not necessarily the most efficient way to do things. It might be possible to clone the object graph more directly in a threadsafe way. Also, things like File and ImageData fundamentally can't be turned into a string (well ImageData can).

Second, to be really agnostic about it, postMessage should take a generic Message object that has some methods that can be used to do the cross-thread clone without building in the assumption that it serializes to a string in the middle.

Third, using a stateful iterator for this instead of an interface representing a value is not a great design. Iterators are not good when what you are traversing is in general a graph.

Fourth, this doesn't give a way to detect if an object graph is not in fact legal to clone.

It's kind of a shame that we have the baggage of multiple JavaScript engines to contend with. Trying to do things in a generic way will make this task needlessly more difficult.

You may also want to get input on this from Oliver Hunt, who wrote the JSC JSON parse/stringify code; from past discussions I know he has opinions on how cross-thread cloning should work.

I'm also thinking that depending on compile-time flags, the constructor for AttributeIterator would either take a v8::Handle<v8::Value> or JSC::JSValue value. Then in each implementation of postMessage() the AttributeIterator instance could be passed to the structured clone serializer, which would return a string. Thereafter, no changes would be required to WebCore internals since they already pass strings around... until on the receiving end we get to MessageEvent.data where we would do the deserialization in a custom getter.

Open questions:

(1) Is passing an AttributeIterator type into postMessage() really the best way to go? Drew mentioned that this might incur a bunch of ObjC binding work on the JSC side...
(2) Where should AttributeIterator live in the source tree?
(3) Where should the serialization and deserialization routines live in the source tree?
(4) I haven't addressed the specifics of the serialized string format. Plain JSON is not quite sufficient since it doesn't retain type information for Date, RegExp, etc. However, I'm not too worried about coming up with a suitable format for this.

Comments, advice, admonitions welcome! :)

In general I'm not sure this approach is workable. At the very least File, FileData, FileList and ImageData need to be passed as something other than strings. And I think the value of making the object graph traversal code generic while everything else is JS-engine-specific is pretty low. In addition, I do not think the proposed interface is adequate to implement the cloning algorithm.

Regards,
Maciej
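Maciej's third and fourth points (the data is a graph, and illegal graphs must be detectable) can be made concrete with a rough sketch of a value-based clone that goes through no intermediate string. This is an illustration only: it is written in TypeScript rather than engine-level code, the name `cloneGraph` is hypothetical, and it handles only primitives, Date, arrays, and plain objects, whereas the structured clone algorithm in the spec covers more types. Rejecting functions is this sketch's stand-in for "not legal to clone".

```typescript
// Hypothetical sketch: copy a value graph directly, with no string in between.
function cloneGraph(root: unknown): unknown {
  const originals: object[] = []; // nodes already visited
  const copies: unknown[] = [];   // copies[i] is the copy of originals[i]

  function visit(value: unknown): unknown {
    if (typeof value === "function") {
      // The walk itself detects a graph that cannot be cloned.
      throw new Error("value graph contains a function and cannot be cloned");
    }
    if (typeof value !== "object" || value === null) {
      return value; // primitives are copied by value
    }
    const known = originals.indexOf(value);
    if (known !== -1) {
      return copies[known]; // shared node or cycle: reuse the existing copy
    }
    if (value instanceof Date) {
      const dateCopy = new Date(value.getTime());
      originals.push(value);
      copies.push(dateCopy);
      return dateCopy;
    }
    if (Array.isArray(value)) {
      const arrayCopy: unknown[] = [];
      originals.push(value);
      copies.push(arrayCopy); // register before recursing so cycles resolve
      value.forEach((item) => {
        arrayCopy.push(visit(item));
      });
      return arrayCopy;
    }
    const source = value as { [key: string]: unknown };
    const objectCopy: { [key: string]: unknown } = {};
    originals.push(value);
    copies.push(objectCopy); // register before recursing so cycles resolve
    Object.keys(source).forEach((key) => {
      objectCopy[key] = visit(source[key]);
    });
    return objectCopy;
  }

  return visit(root);
}

// A graph with a cycle: the copy points back at itself, not at the original.
const graphNode: { [key: string]: unknown } = { label: "a", children: [1, 2] };
graphNode.me = graphNode;
const graphCopy = cloneGraph(graphNode) as { [key: string]: unknown };
console.log(graphCopy !== graphNode, graphCopy.me === graphCopy);
```

The table of visited nodes is what a stateful property iterator lacks: without it, a cycle is walked forever and a node reachable by two paths is copied twice. The parallel arrays keep the sketch short; a real implementation would want a faster lookup.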