Re: [whatwg] XML data islands related question
2013-08-06 2:27, Ian Hickson wrote:

> On Thu, 7 Feb 2013, Jukka K. Korpela wrote:
> [...]
>> It's a bit odd that if you wish to set up a standalone application running in a browser (often called HTML5 application, without implying any particular version of HTML5), you can include e.g. scripts and images in separate files but not plain text or XML data
>
> Why can't you put plain text or XML data in other files? So long as everything is same origin, you can read anything you want via XHR.

A standalone application should be as self-contained as possible, without needing HTTP connections or any network connections to access its own data. When no connections are needed for other reasons, an HTML5 application should run in any client capable of just interpreting HTML and JavaScript (and, in practice, CSS).

If such an application needs some bulk of text data, it can be included e.g. in <script type=text/plain>...</script> but not in a separate plain text file (included in the application distribution, along with other files) referred to via <script src=...></script>. This is a frustrating restriction and makes it more difficult to maintain and customize the application. If an external plain text file could be used, the data content could be separately managed (requiring knowledge only about the format used).

Yucca
Re: [whatwg] BinaryEncoding for Typed Arrays using window.btoa and window.atob
On Tue, Aug 6, 2013 at 1:41 AM, Kenneth Russell k...@google.com wrote: The Encoding spec at http://encoding.spec.whatwg.org/ seems to have handled issues like these. Perhaps a better route would be to fold this functionality into that spec. Yeah, I think my preference would be at this point to expose API-only encodings there. One of those could be base64. Labels for those encodings would simply not be recognized for form and URL. We could even give them labels that suggest that, e.g. api-base64. Another one I've heard requests for is true latin1 which we also use in XMLHttpRequest for various HTTP-related things. -- http://annevankesteren.nl/
Re: [whatwg] XML data islands related question
On Tue, 6 Aug 2013, Jukka K. Korpela wrote:

> 2013-08-06 2:27, Ian Hickson wrote:
>> On Thu, 7 Feb 2013, Jukka K. Korpela wrote:
>> [...]
>>> It's a bit odd that if you wish to set up a standalone application running in a browser (often called HTML5 application, without implying any particular version of HTML5), you can include e.g. scripts and images in separate files but not plain text or XML data
>>
>> Why can't you put plain text or XML data in other files? So long as everything is same origin, you can read anything you want via XHR.
>
> A standalone application should be as self-contained as possible, without needing HTTP connections or any network connections to access its own data. When no connections are needed for other reasons, an HTML5 application should run in any client capable of just interpreting HTML and JavaScript (and, in practice, CSS).
>
> If such an application needs some bulk of text data, it can be included e.g. in <script type=text/plain>...</script> but not in a separate plain text file (included into the application distribution, along with other files) referred to via <script src=...></script>. This is a frustrating restriction and makes it more difficult to maintain and customize application. If an external plain text file could be used, the data content could be separately managed (requiring knowledge only about the format used).

I'm not sure what you mean by "application distribution". Why can't a text/plain file be included the same way an image/png file is included?

-- 
Ian Hickson               U+1047E                )\._.,--,'``.fL
http://ln.hixie.ch/       U+263A                /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] XML data islands related question
2013-08-06 17:45, Ian Hickson wrote:

>> If such an application needs some bulk of text data, it can be included e.g. in <script type=text/plain>...</script> but not in a separate plain text file (included into the application distribution, along with other files) referred to via <script src=...></script>. This is a frustrating restriction and makes it more difficult to maintain and customize application. If an external plain text file could be used, the data content could be separately managed (requiring knowledge only about the format used).
>
> I'm not sure what you mean by "application distribution". Why can't a text/plain file be included the same way an image/png file is included?

It can be included as a file, but it cannot be used. I can't read it. That is the point. I can use an img element referring to an image file, but I cannot refer to a simple plain text file (or an XML file) in an HTML document in a manner that lets me process its content in scripting. I can only include it via iframe or object, but that's different from accessing its content.

Yucca
Re: [whatwg] Window and WindowProxy
On Tue, 6 Aug 2013, Boris Zbarsky wrote: As currently specified, the setup for Window/WindowProxy is as follows: 1) WindowProxy is specified as all operations that would be performed on it must be performed on the Window object of the browsing context's active document instead, whatever that means in ES-spec terms. 2) Window has an indexed getter on it and does security checks of various sorts on property access. There is a somewhat different way to specify this: 1) WindowProxy has the indexed getter behavior and does security checks as needed. 2) Window has no magic at all. Right now, these two ways of specifying it are black-box equivalent, but this equivalence relies on the following three invariants holding: A) var foo; is not valid ES for any value of foo that would be considered a valid argument to the indexed getter. B) Bareword foo is not valid ES for any value of foo that would be considered a valid argument to the indexed getter. C) Script can never get its hands directly on a Window object. Invariants B and C together mean that the only way to invoke the indexed getter is via the WindowProxy. Invariant A means that there is no contradiction between the way ES specifies var (as creating non-configurable properties) and the WebIDL requirements for an object with an indexed getter (not allowing definition of any expando indexed properties at all). I think there are other invariants that make them equivalent that are relevant here. In particular: D) When a Window is a script's global object, that script is always going to be same-origin with the Window, so it will always pass the security checks. (So, it's ok to not do the checks on Window and do them on WindowProxy instead.) I think actually invariants A and B are mooted by invariant D. That is, if they weren't true, we'd still be ok, because the security check is always going to be safe given D. 
But if invariant D was broken, then it seems like A and B would become problematic if we moved the security checks to the WindowProxy rather than to the Window. If invariant C is broken, e.g. because in some new language we don't have a WindowProxy and instead return the real Window for the current Document, or whatnot, whenever you access the Window object, it seems like we'd also actually want the security checks on Window. Do these last two points affect your conclusions? I believe the model that puts all the magic in the WindowProxy, which has to be quite magical already, is much easier for implementors to understand and reason about, and more clearly maps onto actual implementations with an actual proxy for the WindowProxy. It has the benefit of not depending on hidden invariants to avoid contradicting the ES spec, and of making it clear exactly where the magic is, as well as the small but tangible side benefit of making the global (in the ES sense) not be an exotic object (also in the ES sense), thus reducing the likelihood that future ES changes to how the global behaves will in any way affect the behavior of window. The drawback is that it needs a bit more prose defining the behavior of WindowProxy It doesn't seem like that much more prose, at least, not if we're keeping the same level of precision. (If we want more, that's a different matter.) What do other vendors think? This is in principle a purely editorial change. It would be cool if there was a WebIDL way to define WindowProxy, so that it could be unambiguously defined for all languages, but since it's a one-off object, maybe it's not worth it. -- Ian Hickson U+1047E)\._.,--,'``.fL http://ln.hixie.ch/ U+263A/, _.. \ _\ ;`._ ,. Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'
Re: [whatwg] Window and WindowProxy
On 8/6/13 2:30 PM, Ian Hickson wrote:

> I think there are other invariants that make them equivalent that are relevant here. In particular:
>
> D) When a Window is a script's global object, that script is always going to be same-origin with the Window

Ah, yes. Yes, that one is important too. ;)

> I think actually invariants A and B are mooted by invariant D. That is, if they weren't true, we'd still be ok, because the security check is always going to be safe given D.

Invariant A is needed because otherwise the behavior of objects with indexed properties (wherein they disallow adding indexed properties to them) would conflict with the ES-spec behavior of var.

Invariant B is needed because otherwise you could look up a property named 0 on a Window directly, and if the indexed props live on the WindowProxy you would unexpectedly get undefined instead of the first child window.

Neither one of those is about the security check situation, afaict.

> But if invariant D was broken, then it seems like A and B would become problematic if we moved the security checks to the WindowProxy rather than to the Window.

Yes, agreed. There are two somewhat-orthogonal concerns here:

1) Where do the security checks live?
2) Where do the indexed properties live?

> If invariant C is broken, e.g. because in some new language we don't have a WindowProxy and instead return the real Window for the current Document, or whatnot, whenever you access the Window object, it seems like we'd also actually want the security checks on Window.

Yes.

> Do these last two points affect your conclusions?

I don't think they affect what I want to happen for indexed properties. That part is actually more important to me right now than the much more underspecified security check story; I expect as we fully specify the security checks in terms of the MOP (which we need to do) it'll become more obvious whether they need to live on the Window or the WindowProxy or both.

> It doesn't seem like that much more prose, at least, not if we're keeping the same level of precision. (If we want more, that's a different matter.)

Oh, I want more precision for sure. ;)

> What do other vendors think?

I'd love to know this too.

> but since it's a one-off object, maybe it's not worth it.

I don't think it's worth it at all, frankly.

-Boris
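The "indexed properties live on the WindowProxy" model discussed above can be illustrated with an ES Proxy. This is only a rough sketch under stated assumptions: makeWindow, makeWindowProxy, childFrames and navigate are hypothetical names invented for the illustration, the "Window" is an ordinary object with no exotic behavior, and the security checks are left out entirely.

```javascript
// Matches canonical array-index-like property names such as "0", "1", "42".
const INDEX = /^(0|[1-9]\d*)$/;

// Hypothetical stand-in for a Window: a plain object with no magic at all.
function makeWindow(name, childFrames) {
  return { name, childFrames };
}

// Hypothetical stand-in for a WindowProxy: it owns the indexed getter and
// forwards every other operation to whichever Window is currently active.
function makeWindowProxy(initialWindow) {
  let current = initialWindow;
  const proxy = new Proxy({}, {
    get(_target, prop) {
      if (typeof prop === "string" && INDEX.test(prop)) {
        // Indexed access is answered by the proxy, not by the Window.
        return current.childFrames[Number(prop)];
      }
      return current[prop];
    },
    set(_target, prop, value) {
      if (typeof prop === "string" && INDEX.test(prop)) {
        // No expando indexed properties may be defined.
        return false;
      }
      current[prop] = value;
      return true;
    },
  });
  // navigate() swaps the active Window, as a browsing context would.
  return { proxy, navigate(nextWindow) { current = nextWindow; } };
}

const first = makeWindow("first", ["frameA"]);
const second = makeWindow("second", ["frameB", "frameC"]);
const { proxy, navigate } = makeWindowProxy(first);
console.log(proxy[0], proxy.name);
navigate(second);
console.log(proxy[1], proxy.name);
```

In this sketch the object playing the role of the global never has to reject indexed definitions itself, which is the property the thread cares about with respect to var.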
Re: [whatwg] BinaryEncoding for Typed Arrays using window.btoa and window.atob
If there is technically no benefit to passing an ArrayBufferView as a 2nd parameter to atob, I think returning an ArrayBuffer is a good way to go.

Enhancing btoa/atob would be an easy solution, though I am open to enhancing the Encoding spec. But it appears to me we would have to introduce another pair of coders, say BinaryDecoder/BinaryEncoder, in addition to TextDecoder/TextEncoder, since the signatures of the decode/encode functions are different.

Chang

On Tue, Aug 6, 2013 at 8:28 AM, Kornel Lesiński kor...@geekhood.net wrote:

> On Mon, 05 Aug 2013 21:39:22 +0100, Chang Shu csh...@gmail.com wrote:
>> I see your point now, Simon. Technically both approaches should work. As you said, yours has the limitation that the implementation does not know which view to return unless you provide an enum type of parameter instead of boolean to atob.
>
> In that case it'd be better to return ArrayBuffer, so the user can wrap it in any type they want (including DataView).
>
> --
> regards, Kornel
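What "return an ArrayBuffer and let the user wrap it" would look like from script can be approximated with today's string-returning atob/btoa. A minimal sketch; base64ToArrayBuffer and arrayBufferToBase64 are hypothetical helper names, not anything proposed in this thread or defined by a spec.

```javascript
// Hypothetical helper: decode a base64 string into an ArrayBuffer.
function base64ToArrayBuffer(b64) {
  const binary = atob(b64); // binary string, one character per byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes.buffer;
}

// Hypothetical inverse: encode the bytes of an ArrayBuffer as base64.
function arrayBufferToBase64(buffer) {
  const bytes = new Uint8Array(buffer);
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// The caller picks whichever view suits them over the same buffer.
const buf = base64ToArrayBuffer("aGVsbG8=");
const asBytes = new Uint8Array(buf);
const asView = new DataView(buf);
console.log(asBytes.length, asView.getUint8(0), arrayBufferToBase64(buf));
```

Returning the buffer rather than a particular view sidesteps the question of which view type the implementation should choose.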
[whatwg] Form-associated elements and the parser
Hixie opened my eyes last week to parser-association behavior of the sort found at http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2428. In that case, an input in a detached tree is associated with a form in the main document. This causes badness in WebKit and Blink because the association between the form and the input (e.g., as exposed in the HTMLFormElement.elements collection) is only weakly held to avoid reference loops (and thus memory leaks). And that weakness occasionally results in crashes when one of these objects is collected before the other. While all modern HTML parser implementations I tested seemed to agree on their treatment of the above example (they all return 1 as elements.length), this feature doesn't strike me as terribly useful. And for what it's worth, it doesn't seem to be present in legacy IE. I'm interested what others would think about changing the parser to only associate a form with an input if both are in the same home subtree (http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#home-subtree). Or is there some deep web-compat reason for this parsing oddity? - Adam
[whatwg] Microdata feedback
On Wed, 13 Feb 2013, Ed Summers wrote:

> I am looking for some guidance about the use of multiple itemtypes in microdata [1], specifically the phrase "defined to use the same vocabulary" in:
>
>   The item types must all be types defined in applicable specifications and must all be defined to use the same vocabulary.
>
> For example, does this mean that I can't say:
>
>   <div itemscope itemtype="http://acme.com/Foo http://zenith.com/Bar"> ... </div>

It depends on what http://acme.com/Foo and http://zenith.com/Bar are. If they use the same vocabulary, then you can do it. If they're separate vocabularies, then no.

> The reason I ask is that there is some desire over in the schema.org community [2] to provide a mechanism for schema.org to be specialized. For example, in the case of an audiobook:
>
>   <div itemscope itemtype="http://schema.org/Book http://www.productontology.org/id/Audiobook"> ... </div>
>
> The idea being not to overload schema.org with more vocabulary, and to let vocabularies grow a bit more organically.

If they're the same vocabulary -- that is, the properties on this .../Book vocabulary and this .../Audiobook vocabulary don't clash -- properties mean the same thing in both -- then it's fine.

> This schema.org group is currently thinking of using a one off property additionalType that would be used like so:
>
>   <div itemscope itemtype="http://schema.org/Book"> <link itemprop="additionalType" href="http://www.productontology.org/id/Audiobook"> ... </div>
>
> I personally find this to be kind of distasteful since it replicates the mechanics that microdata's itemtype already offers.

It's essentially equivalent, yes.

> So, my question: is it the case that itemtype cannot reference types in different vocabularies like the example above? If so, I'm curious to know what the rationale was, and if perhaps it could be relaxed.

If they're different vocabularies (i.e. the same terms are used to mean different things), then you wouldn't know which was meant, so it would be ambiguous.
There's an open bug about this topic with an open question:

   https://www.w3.org/Bugs/Public/show_bug.cgi?id=13527

On Thu, 14 Feb 2013, Ed Summers wrote:

> In John's email [1] he proposed limiting multiple types to being from the same origin domain, not the "same vocabulary" as is stated in the Microdata spec. It sounds like an obvious question, but is there a precise definition of what is meant by "same vocabulary"? Or is it just a hand wavy way of talking about what humans understand when putting the itemtype URLs in their browsers, reading, and understanding that they are types that are part of some larger coherent whole?

"Vocabulary" means the set of properties that are defined. There's some non-normative text in the HTML spec that talks about this:

# The type gives the context for the properties, thus selecting a
# vocabulary: a property named "class" given for an item with the type
# "http://census.example/person" might refer to the economic class of
# an individual, while a property named "class" given for an item with
# the type "http://example.com/school/teacher" might refer to the
# classroom a teacher has been assigned. Several types can share a
# vocabulary. For example, the types
# "http://example.org/people/teacher" and
# "http://example.org/people/engineer" could be defined to use the
# same vocabulary (though maybe some properties would not be
# especially useful in both cases, e.g. maybe the
# "http://example.org/people/engineer" type might not typically be
# used with the "classroom" property). Multiple types defined to use
# the same vocabulary can be given for a single item by listing the
# URLs as a space-separated list in the attribute's value. An item
# cannot be given two types if they do not use the same vocabulary,
# however.
On Tue, 19 Feb 2013, Judson Lester wrote: There was an email from last year suggesting that the values of input elements be derived from their value attributes - the purpose there being to be able to control the form via the microdata interface. I've only been able to read it in the archives - the brief exchange was between Igor Nikolev and Ian Hickson, who was curious about use cases. Conversely, it would be useful to be able to use input elements to contain item values, and at the moment, since their values would be derived from their textContent, they're useless for that. Specifically, it's often reasonable to present a representation as the default values in a form and allow for updates simply by posting the changed values. It seems unwieldy to need to replicate that information in e.g. data elements. While it would be simple to treat the defaultValue as the item property value for elements (and for radio inputs, let the representation mark the selected input as the itemprop), it seems counter to the spirit of the proposal. The alternative would be to do something like excluding unsuccessful input elements during
Re: [whatwg] Should video controls generate click events?
On Thu, 27 Jun 2013, Philip Jägenstedt wrote:

> In a discussion about a "click to play/pause" feature for Opera on Android, the issue of click event handlers came up.[1] The problem is that pages can do things like this:
>
>   v.onclick = function() {
>     if (v.paused) {
>       v.play();
>     } else {
>       v.pause();
>     }
>     // no preventDefault()
>   }
>
> I created a demo [2] and it is indeed the case that this makes video controls unusable in both Presto and Chromium based browsers. Simon Pieters has brought this up before, but the spec wasn't changed at that point.[3]
>
> While my demo may be on the hypothetical side, we do want users to be able to bring up the native controls via a context menu and be able to use them regardless of what the page does in its event handlers. So, I request that the spec be explicit that interacting with the video controls does not cause the normal script-visible events to be fired.
>
> [1] https://codereview.chromium.org/17391015
> [2] http://people.opera.com/~philipj/click.html
> [3] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-June/031916.html (search for "As with the post Simon cites above")

I've made the spec say this is a valid (and recommended) implementation strategy.

HTH,
-- 
Ian Hickson
Re: [whatwg] asynchronous JSON.parse and sending large structured data between threads without compromising responsiveness
On Thu, 7 Mar 2013, j...@mailb.org wrote: right now JSON.parse blocks the mainloop, this gets more and more of an issue as JSON documents get bigger and are also used as serialization format to communicate with web workers. I think it would make sense to have a Promise-based API for JSON parsing. This probably belongs either in the JS spec or the DOM spec; Anne, Ms2ger, and any JS people, is anyone interested in taking this? On Thu, 7 Mar 2013, David Rajchenbach-Teller wrote: Actually, communicating large JSON objects between threads may cause performance issues. I do not have the means to measure reception speed simply (which would be used to implement asynchronous JSON.parse), but it is easy to measure main thread blocks caused by sending (which would be used to implement asynchronous JSON.stringify). I don't understand why there'd be any difficulty in sending large objects between workers or from a worker to the main thread. It's possible this is not well-implemented today, but isn't that just an implementation detail? One could imagine an implementation strategy where the cloning is done on the sending side, or even on a third thread altogether, and just passed straight to the receiving side in one go. On Thu, 7 Mar 2013, Tobie Langel wrote: Even if an async API for JSON existed, wouldn't the perf bottleneck then simply fall on whatever processing needs to be done afterwards? That was my initial reaction as well, I must admit. On Fri, 8 Mar 2013, David Rajchenbach-Teller wrote: For the moment, the main use case I see is for asynchronous serialization of JSON is that of snapshoting the world without stopping it, for backup purposes, e.g.: a. saving the state of the current region in an open world RPG; b. saving the state of an ongoing physics simulation; c. saving the state of the browser itself in case of crash/power loss (that's assuming a FirefoxOS-style browser implemented as a web application); d. 
backing up state and history of the browser itself to a server (again, assuming that the browser is a web application). Serialising is hard to do async, since you fundamentally have to walk the data structure, and the actual serialisation at that point is not especially more expensive than a copy. The natural course of action would be to do the following: 1. collect data to a JSON object (possibly a noop); I'm not sure what you mean by "JSON object". JSON is a string format. Do you mean a JS object data structure? 2. send the object to a worker; 3. apply some post-treatment to the object (possibly a noop); 4. write/upload the object. Having an asynchronous JSON serialization to some Transferable form would considerably simplify the task of implementing step 2 without janking if data ends up very heavy. I don't understand what JSON has to do with sending data to a worker. You can just send the actual JS object; MessagePorts and postMessage() support raw JS objects. So far, I have discussed serializing JSON, not deserializing it, but I believe that the symmetric scenarios also hold. No, they are quite asymmetric. Serialising requires stalling the code that is interacting with the data structure, to guarantee integrity. Parsing is easy to do on a separate worker, because it has no dependencies -- you can do it all in isolation. On Fri, 8 Mar 2013, David Rajchenbach-Teller wrote: If I am correct, this means that we need some mechanism to provide efficient serialization of non-Transferable data into something Transferable. I don't understand what this means. Transferable is about neutering objects on one side and creating new versions on the other. It's the equivalent of a move. Your use cases were about making copies, as far as I can tell (saving and backing up). As a general rule, JSON has nothing to do with Transferable objects, as far as I can tell. On Fri, 8 Mar 2013, David Rajchenbach-Teller wrote: Intuitively, this sounds like: 1. collect data to a JSON; 2. 
serialize JSON (hopefully asynchronously) to a Transferable (or several Transferables). I really don't understand this. Are you asking for a way to move a JS object from one thread to another, killing references to it in the first thread? What's the use case? (What would this have to do with JSON?) On Fri, 8 Mar 2013, David Bruant wrote: Why not collect the data in a Transferable like an ArrayBuffer directly? It skips the additional serialization part. Writing a byte stream directly is a bit hardcore I admit, but an object full of setters can give the impression to create an object while actually filling an ArrayBuffer as a backend. I feel that could work efficiently. It's not clear to me what the use case is, but if the desire is to move a batch of data from one thread to another, then this is certainly one way to do it. Another would be to just copy the data in the first place, no need to move it -- since you have to pay the cost of reading
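The copy-versus-move distinction drawn in this message can be seen without a worker. A small sketch, with one loud assumption: it uses the global structuredClone(), which postdates this thread, as a stand-in for what postMessage() does to its argument. The state object and its fields are made up for the illustration.

```javascript
// Copy: a raw JS object is cloned, no JSON string involved.
// (structuredClone() stands in for postMessage(); it did not exist in 2013.)
const state = { region: "north", positions: new Float64Array([1.5, 2.5, 3.5]) };
const copy = structuredClone(state);
copy.positions[0] = 99; // the sender's data is unaffected by this write

// Move: transferring an ArrayBuffer neuters it on the sending side.
const buffer = new ArrayBuffer(16);
const moved = structuredClone(buffer, { transfer: [buffer] });

console.log(state.positions[0], copy.positions[0]);
console.log(buffer.byteLength, moved.byteLength);
```

Saving or backing up state corresponds to the first half (the original stays usable); Transferable corresponds to the second half, where the original buffer is left empty.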
Re: [whatwg] Form-associated elements and the parser
On Aug 6, 2013, at 2:01 PM, Adam Klein ad...@chromium.org wrote: Hixie opened my eyes last week to parser-association behavior of the sort found at http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2428. In that case, an input in a detached tree is associated with a form in the main document. This causes badness in WebKit and Blink because the association between the form and the input (e.g., as exposed in the HTMLFormElement.elements collection) is only weakly held to avoid reference loops (and thus memory leaks). And that weakness occasionally results in crashes when one of these objects is collected before the other. While all modern HTML parser implementations I tested seemed to agree on their treatment of the above example (they all return 1 as elements.length), this feature doesn't strike me as terribly useful. And for what it's worth, it doesn't seem to be present in legacy IE. What is the behavior of the old IE? - R. Niwa
Re: [whatwg] Form-associated elements and the parser
On Tue, Aug 6, 2013 at 4:09 PM, Ryosuke Niwa rn...@apple.com wrote: On Aug 6, 2013, at 2:01 PM, Adam Klein ad...@chromium.org wrote: Hixie opened my eyes last week to parser-association behavior of the sort found at http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2428. In that case, an input in a detached tree is associated with a form in the main document. This causes badness in WebKit and Blink because the association between the form and the input (e.g., as exposed in the HTMLFormElement.elements collection) is only weakly held to avoid reference loops (and thus memory leaks). And that weakness occasionally results in crashes when one of these objects is collected before the other. While all modern HTML parser implementations I tested seemed to agree on their treatment of the above example (they all return 1 as elements.length), this feature doesn't strike me as terribly useful. And for what it's worth, it doesn't seem to be present in legacy IE. What is the behavior of the old IE? form.elements.length == 0 in IE 9. - Adam
Re: [whatwg] Form-associated elements and the parser
As I recall it (it was ages since I dealt with this), the tricky case that you need to handle is this one: http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2432 In this case, web compatibility requires that the input is associated with the form. Specifically hidden input elements would often end up moved, but still had to show up in form.elements as well as get submitted along with the form. / Jonas / Jonas On Tue, Aug 6, 2013 at 2:01 PM, Adam Klein ad...@chromium.org wrote: Hixie opened my eyes last week to parser-association behavior of the sort found at http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2428. In that case, an input in a detached tree is associated with a form in the main document. This causes badness in WebKit and Blink because the association between the form and the input (e.g., as exposed in the HTMLFormElement.elements collection) is only weakly held to avoid reference loops (and thus memory leaks). And that weakness occasionally results in crashes when one of these objects is collected before the other. While all modern HTML parser implementations I tested seemed to agree on their treatment of the above example (they all return 1 as elements.length), this feature doesn't strike me as terribly useful. And for what it's worth, it doesn't seem to be present in legacy IE. I'm interested what others would think about changing the parser to only associate a form with an input if both are in the same home subtree (http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#home-subtree). Or is there some deep web-compat reason for this parsing oddity? - Adam
Re: [whatwg] Form-associated elements and the parser
On Tue, Aug 6, 2013 at 4:21 PM, Jonas Sicking jo...@sicking.cc wrote: As I recall it (it was ages since I dealt with this), the tricky case that you need to handle is this one: http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2432 In this case, web compatibility requires that the input is associated with the form. Specifically hidden input elements would often end up moved, but still had to show up in form.elements as well as get submitted along with the form. That case definitely makes sense to me, and I think it's fine to keep that behavior for compat. The only one I'm asking to change is the case when the input and form end up in different trees. On Tue, Aug 6, 2013 at 2:01 PM, Adam Klein ad...@chromium.org wrote: Hixie opened my eyes last week to parser-association behavior of the sort found at http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2428. In that case, an input in a detached tree is associated with a form in the main document. This causes badness in WebKit and Blink because the association between the form and the input (e.g., as exposed in the HTMLFormElement.elements collection) is only weakly held to avoid reference loops (and thus memory leaks). And that weakness occasionally results in crashes when one of these objects is collected before the other. While all modern HTML parser implementations I tested seemed to agree on their treatment of the above example (they all return 1 as elements.length), this feature doesn't strike me as terribly useful. And for what it's worth, it doesn't seem to be present in legacy IE. I'm interested what others would think about changing the parser to only associate a form with an input if both are in the same home subtree (http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#home-subtree). Or is there some deep web-compat reason for this parsing oddity? - Adam
Re: [whatwg] Form-associated elements and the parser
On Tue, Aug 6, 2013 at 4:27 PM, Adam Klein ad...@chromium.org wrote:
> On Tue, Aug 6, 2013 at 4:21 PM, Jonas Sicking jo...@sicking.cc wrote:
>> As I recall it (it was ages since I dealt with this), the tricky case that you need to handle is this one:
>>
>> http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2432
>>
>> In this case, web compatibility requires that the input is associated with the form. Specifically hidden input elements would often end up moved, but still had to show up in form.elements as well as get submitted along with the form.
>
> That case definitely makes sense to me, and I think it's fine to keep that behavior for compat. The only one I'm asking to change is the case when the input and form end up in different trees.

Sure, as long as you come up with a formalized algorithm for when there is an association and when there isn't.

Keep in mind that by the time that the input-element is inserted, the form-element might have been moved elsewhere. We likely don't need the association in that case, but detecting that that has happened sounds tricky.

The way that Gecko currently works IIRC is that it creates the association any time it has seen a <form> without seeing a </form>. And it breaks the association anytime an input-element's parent chain changes and the associated form-element is no longer in the parent chain.

On a related note, when are you guys going to add a cycle collector or other not-plain-refcounting memory manager :-)

/ Jonas

>> On Tue, Aug 6, 2013 at 2:01 PM, Adam Klein ad...@chromium.org wrote:
>>> Hixie opened my eyes last week to parser-association behavior of the sort found at http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2428. In that case, an input in a detached tree is associated with a form in the main document.
>>>
>>> This causes badness in WebKit and Blink because the association between the form and the input (e.g., as exposed in the HTMLFormElement.elements collection) is only weakly held to avoid reference loops (and thus memory leaks). And that weakness occasionally results in crashes when one of these objects is collected before the other.
>>>
>>> While all modern HTML parser implementations I tested seemed to agree on their treatment of the above example (they all return 1 as elements.length), this feature doesn't strike me as terribly useful. And for what it's worth, it doesn't seem to be present in legacy IE.
>>>
>>> I'm interested what others would think about changing the parser to only associate a form with an input if both are in the same home subtree (http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#home-subtree). Or is there some deep web-compat reason for this parsing oddity?
>>>
>>> - Adam
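For illustration, the Gecko behavior described above (recalled "IIRC", so treat it as approximate) might be sketched as follows. This is a toy model under stated assumptions: ToyNode, ToyParser, and onParentChainChanged are hypothetical names invented for the sketch, not Gecko internals or DOM APIs.

```javascript
// Toy model only: ToyNode and ToyParser are hypothetical, not DOM or Gecko APIs.
class ToyNode {
  constructor(name, parent = null) {
    this.name = name;
    this.parent = parent;
    this.form = null; // associated form, if any
  }
}

// True if `ancestor` is somewhere in `node`'s parent chain.
function hasAncestor(node, ancestor) {
  for (let p = node.parent; p !== null; p = p.parent) {
    if (p === ancestor) return true;
  }
  return false;
}

class ToyParser {
  constructor() {
    this.openForm = null; // a <form> that has been seen with no </form> yet
  }
  sawFormStart(form) { this.openForm = form; }
  sawFormEnd() { this.openForm = null; }
  // Associate the control with the open form, wherever the control is inserted.
  insertControl(control, parent) {
    control.parent = parent;
    control.form = this.openForm;
  }
}

// Run whenever a control's parent chain changes after insertion.
function onParentChainChanged(control) {
  if (control.form !== null && !hasAncestor(control, control.form)) {
    control.form = null; // the form is no longer an ancestor: drop the association
  }
}

// Demo: the control is inserted outside the still-open form, then moved.
const body = new ToyNode("body");
const form = new ToyNode("form", body);
const parser = new ToyParser();
parser.sawFormStart(form);
const input = new ToyNode("input");
parser.insertControl(input, body);
console.log(input.form === form); // true: associated although not a descendant
input.parent = new ToyNode("div");
onParentChainChanged(input);
console.log(input.form === null); // true: association broken after the move
```

Note that in this model the association is only re-checked when the control moves, which is why a control that was never a descendant of the form can stay associated until its parent chain changes.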
Re: [whatwg] Form-associated elements and the parser
On Tue, Aug 6, 2013 at 4:38 PM, Jonas Sicking jo...@sicking.cc wrote:
> On Tue, Aug 6, 2013 at 4:27 PM, Adam Klein ad...@chromium.org wrote:
>> On Tue, Aug 6, 2013 at 4:21 PM, Jonas Sicking jo...@sicking.cc wrote:
>>> As I recall it (it was ages since I dealt with this), the tricky case that you need to handle is this one:
>>>
>>> http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2432
>>>
>>> In this case, web compatibility requires that the input is associated with the form. Specifically hidden input elements would often end up moved, but still had to show up in form.elements as well as get submitted along with the form.
>>
>> That case definitely makes sense to me, and I think it's fine to keep that behavior for compat. The only one I'm asking to change is the case when the input and form end up in different trees.
>
> Sure, as long as you come up with a formalized algorithm for when there is an association and when there isn't.
>
> Keep in mind that by the time that the input-element is inserted, the form-element might have been moved elsewhere. We likely don't need the association in that case, but detecting that that has happened sounds tricky.

My concrete proposal would be something like this:

In step 4 of http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#create-an-element-for-the-token, add a requirement that intended parent and the form element pointer be part of the same home subtree (defined at http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#home-subtree).

> The way that Gecko currently works IIRC is that it creates the association any time it has seen a <form> without seeing a </form>. And it breaks the association anytime an input-element's parent chain changes and the associated form-element is no longer in the parent chain.

This is basically the same thing Blink and WebKit do, with the caveat that we also avoid associating forms with elements inside templates (this is now reflected in step 4 of the algorithm, see above).

> On a related note, when are you guys going to add a cycle collector or other not-plain-refcounting memory manager :-)

Yes, that would be nice :)

- Adam

> / Jonas
>
>>> On Tue, Aug 6, 2013 at 2:01 PM, Adam Klein ad...@chromium.org wrote:
>>>> Hixie opened my eyes last week to parser-association behavior of the sort found at http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2428. In that case, an input in a detached tree is associated with a form in the main document.
>>>>
>>>> This causes badness in WebKit and Blink because the association between the form and the input (e.g., as exposed in the HTMLFormElement.elements collection) is only weakly held to avoid reference loops (and thus memory leaks). And that weakness occasionally results in crashes when one of these objects is collected before the other.
>>>>
>>>> While all modern HTML parser implementations I tested seemed to agree on their treatment of the above example (they all return 1 as elements.length), this feature doesn't strike me as terribly useful. And for what it's worth, it doesn't seem to be present in legacy IE.
>>>>
>>>> I'm interested what others would think about changing the parser to only associate a form with an input if both are in the same home subtree (http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#home-subtree). Or is there some deep web-compat reason for this parsing oddity?
>>>>
>>>> - Adam
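To make the proposed extra requirement concrete, here is a minimal sketch of the check, assuming "same home subtree" can be approximated as "same topmost ancestor". The helpers makeNode, rootOf, inSameHomeSubtree, and formOwnerFor are hypothetical names for this sketch only; they are not spec algorithms or browser APIs, and the spec text at the links above remains the authority on what a home subtree is.

```javascript
// Toy nodes are plain objects of the form { name, parent }; none of this is a browser API.
function makeNode(name, parent = null) {
  return { name, parent };
}

// Topmost ancestor of a node (the node itself if it has no parent).
function rootOf(node) {
  let root = node;
  while (root.parent !== null) root = root.parent;
  return root;
}

// Assumption: "same home subtree" is approximated as "same topmost ancestor".
function inSameHomeSubtree(a, b) {
  return rootOf(a) === rootOf(b);
}

// Hypothetical stand-in for the proposed requirement: only associate the new
// element with the form element pointer when the intended parent and the
// form share a tree.
function formOwnerFor(intendedParent, formElementPointer) {
  if (formElementPointer === null) return null;
  return inSameHomeSubtree(intendedParent, formElementPointer)
    ? formElementPointer
    : null;
}

// Demo: one form in the main document, one detached tree.
const doc = makeNode("document");
const mainBody = makeNode("body", doc);
const mainForm = makeNode("form", mainBody);
const detachedDiv = makeNode("div"); // never attached to doc

console.log(formOwnerFor(mainBody, mainForm) === mainForm); // true: same tree
console.log(formOwnerFor(detachedDiv, mainForm)); // null: different trees
```

Under this check the compat case from earlier in the thread (the input is moved but stays in the same document as the form) keeps its association, while the detached-tree case does not.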
Re: [whatwg] asynchronous JSON.parse and sending large structured data between threads without compromising responsiveness
On 8/6/13 5:58 PM, Ian Hickson wrote:
> One could imagine an implementation strategy where the cloning is done on the sending side, or even on a third thread altogether

The cloning needs to run to completion (in the sense of capturing an immutable representation) before anyone can change the data structure being cloned.

That means either serializing the whole data structure in some way before returning control to JS or doing something where you start serializing it async and block until finished as soon as someone tries to modify any of those objects in any way, right? The latter is rather nontrivial to implement, so UAs do the former at the moment.

> Serialising is hard to do async, since you fundamentally have to walk the data structure, and the actual serialisation at that point is not especially more expensive than a copy.

Right, that's what I said above... ;)

> Parsing is easy to do on a separate worker, because it has no dependencies -- you can do it all in isolation.

Sadly, that may not be the case. Actual JS implementations have various thread-local data that objects depend on (starting with interned property names), such that it's not actually possible to create an object on one thread and use it on another in many of them.

>> For instance, how would you serialize something as simple as the following?
>>
>>   {
>>     name: "The One",
>>     hp: 1000,
>>     achievements: ["achiever", "overachiever", "extreme overachiever"]
>>     // Length of the list is unpredictable
>>   }
>
> Why serialise it? If you want to post this across a MessagePort to a worker, or back from a worker, why not just post it?
>
>   var a = { ... }; // from above
>   port.postMessage(a);

This in practice does some sort of serialization in UAs.

> Assuming by Firefox Desktop you mean the browser for desktop OSes called Firefox, then, why not just do this in C++?

Let's start with because writing C++ code without memory errors is harder than writing JS code without memory errors?

> I don't understand why you would constrain yourself to using Web APIs in JavaScript to write a browser.

Simplicity of implementation? Sandboxing of the code? Eating your own dogfood? I can come up with some more reasons if you want.

-Boris
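The point about the clone having to capture an immutable representation before control returns to JS can be illustrated with a small sketch. It assumes a hypothetical ToyPort class whose postMessage snapshots its argument synchronously; JSON.stringify stands in for whatever serialization a UA actually performs, which it is not (real UAs use the structured clone algorithm, and JSON cannot represent everything that algorithm can).

```javascript
// ToyPort is a hypothetical stand-in for a MessagePort; it is not a browser API.
class ToyPort {
  constructor() {
    this.queue = []; // serialized snapshots awaiting delivery
  }

  // Snapshot the value before returning, so later mutations by the sender
  // cannot affect what the receiver sees.
  postMessage(value) {
    this.queue.push(JSON.stringify(value)); // JSON is only a stand-in serializer
  }

  // Deserialize and hand over everything queued so far, then empty the queue.
  deliver() {
    const messages = this.queue.map((s) => JSON.parse(s));
    this.queue = [];
    return messages;
  }
}

const port = new ToyPort();
const a = {
  name: "The One",
  hp: 1000,
  achievements: ["achiever", "overachiever", "extreme overachiever"],
};

port.postMessage(a);

// The sender keeps mutating the object after posting it.
a.hp = 0;
a.achievements.push("late addition");

const received = port.deliver();
console.log(received[0].hp); // 1000
console.log(received[0].achievements.length); // 3
```

The cost being discussed in the thread is the synchronous walk inside postMessage: the sender pays for the whole serialization before it gets control back.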