Re: [whatwg] XML data islands related question

2013-08-06 Thread Jukka K. Korpela

2013-08-06 2:27, Ian Hickson wrote:


On Thu, 7 Feb 2013, Jukka K. Korpela wrote:

[...]

It's a bit odd that if you wish to set up a standalone application
running in a browser (often called HTML5 application, without implying
any particular version of HTML5), you can include e.g. scripts and
images in separate files but not plain text or XML data


Why can't you put plain text or XML data in other files? So long as
everything is same origin, you can read anything you want via XHR.


A standalone application should be as self-contained as possible, 
without needing HTTP connections or any network connections to access 
its own data. When no connections are needed for other reasons, an HTML5 
application should run in any client capable of just interpreting HTML 
and JavaScript (and, in practice, CSS).


If such an application needs some bulk of text data, it can be included 
e.g. in <script type=text/plain>...</script> but not in a separate plain 
text file (included in the application distribution, along with other 
files) referred to via <script src=...></script>. This is a frustrating 
restriction and makes it more difficult to maintain and customize the 
application. If an external plain text file could be used, the data 
content could be separately managed (requiring knowledge only about the 
format used).


Yucca




Re: [whatwg] BinaryEncoding for Typed Arrays using window.btoa and window.atob

2013-08-06 Thread Anne van Kesteren
On Tue, Aug 6, 2013 at 1:41 AM, Kenneth Russell k...@google.com wrote:
 The Encoding spec at http://encoding.spec.whatwg.org/ seems to have
 handled issues like these. Perhaps a better route would be to fold
 this functionality into that spec.

Yeah, I think my preference would be at this point to expose API-only
encodings there. One of those could be base64. Labels for those
encodings would simply not be recognized for form and URL. We could
even give them labels that suggest that, e.g. api-base64. Another
one I've heard requests for is true latin1 which we also use in
XMLHttpRequest for various HTTP-related things.
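
As a rough illustration of what a base64 facility for binary data has to do, here is a minimal sketch built only on the existing btoa. The helper name bytesToBase64 is hypothetical, and no "api-base64" label exists in the Encoding spec; this only shows the byte-to-binary-string step that btoa requires today.

```javascript
// Hypothetical helper: base64-encode the bytes of a typed array with btoa.
// btoa only accepts "binary strings" (code units 0..255), so each byte is
// first mapped to exactly one character.
function bytesToBase64(bytes) {
  let binary = '';
  for (const b of bytes) {
    binary += String.fromCharCode(b);
  }
  return btoa(binary);
}

console.log(bytesToBase64(new Uint8Array([104, 105]))); // "aGk="
```

The intermediate binary string is the awkward part that a dedicated encoder would hide.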


-- 
http://annevankesteren.nl/


Re: [whatwg] XML data islands related question

2013-08-06 Thread Ian Hickson
On Tue, 6 Aug 2013, Jukka K. Korpela wrote:
 2013-08-06 2:27, Ian Hickson wrote:
  On Thu, 7 Feb 2013, Jukka K. Korpela wrote:
 [...]
   It's a bit odd that if you wish to set up a standalone application 
   running in a browser (often called HTML5 application, without 
   implying any particular version of HTML5), you can include e.g. 
   scripts and images in separate files but not plain text or XML data
  
  Why can't you put plain text or XML data in other files? So long as 
  everything is same origin, you can read anything you want via XHR.
 
 A standalone application should be as self-contained as possible, 
 without needing HTTP connections or any network connections to access 
 its own data. When no connections are needed for other reasons, an HTML5 
 application should run in any client capable of just interpreting HTML 
 and JavaScript (and, in practice, CSS).
 
 If such an application needs some bulk of text data, it can be included 
 e.g. in <script type=text/plain>...</script> but not in a separate plain 
 text file (included in the application distribution, along with other 
 files) referred to via <script src=...></script>. This is a frustrating 
 restriction and makes it more difficult to maintain and customize the 
 application. If an external plain text file could be used, the data 
 content could be separately managed (requiring knowledge only about the 
 format used).

I'm not sure what you mean by "application distribution". Why can't a 
text/plain file be included the same way an image/png file is included?

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] XML data islands related question

2013-08-06 Thread Jukka K. Korpela

2013-08-06 17:45, Ian Hickson wrote:


If such an application needs some bulk of text data, it can be included
e.g. in <script type=text/plain>...</script> but not in a separate plain
text file (included in the application distribution, along with other
files) referred to via <script src=...></script>. This is a frustrating
restriction and makes it more difficult to maintain and customize the
application. If an external plain text file could be used, the data
content could be separately managed (requiring knowledge only about the
format used).


I'm not sure what you mean by "application distribution". Why can't a
text/plain file be included the same way an image/png file is included?


It can be included as a file, but it cannot be used. I can't read it. 
That is the point. I can use an img element referring to an image 
file, but I cannot refer to a simple plain text file (or an XML file) in 
an HTML document in a manner that lets me process its content in 
scripting. I can only include it via iframe or object, but that's 
different from accessing its content.
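
To make the contrast concrete, here is a minimal sketch of the inline case that does work: data placed inside a script element with a non-script type can be read back through textContent. The function name readInlineData, the element id "words", and the one-item-per-line format are assumptions for illustration; the same read gives nothing useful for a text file referenced through src.

```javascript
// Sketch: read line-oriented data from an inline data block such as
//   <script type="text/plain" id="words"> alpha (newline) beta </script>
// `doc` is anything with getElementById: the browser's `document`, or a stub.
function readInlineData(doc, id) {
  const el = doc.getElementById(id);
  if (!el) return [];
  return el.textContent
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Stub document so the sketch also runs outside a browser.
const stubDoc = {
  getElementById: (id) =>
    id === 'words' ? { textContent: '\n  alpha\n  beta\n' } : null,
};
console.log(readInlineData(stubDoc, 'words')); // [ 'alpha', 'beta' ]
```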


Yucca




Re: [whatwg] Window and WindowProxy

2013-08-06 Thread Ian Hickson
On Tue, 6 Aug 2013, Boris Zbarsky wrote:
 
 As currently specified, the setup for Window/WindowProxy is as follows:
 
 1) WindowProxy is specified as "all operations that would be performed 
    on it must be performed on the Window object of the browsing 
    context's active document instead", whatever that means in ES-spec 
    terms.
 
 2) Window has an indexed getter on it and does security checks of 
    various sorts on property access.
 
 There is a somewhat different way to specify this:
 
 1) WindowProxy has the indexed getter behavior and does security checks 
    as needed.
 
 2) Window has no magic at all.
 
 Right now, these two ways of specifying it are black-box equivalent, but 
 this equivalence relies on the following three invariants holding:
 
 A) "var foo;" is not valid ES for any value of "foo" that would be 
    considered a valid argument to the indexed getter.
 
 B) Bareword "foo" is not valid ES for any value of "foo" that would be 
    considered a valid argument to the indexed getter.
 
 C) Script can never get its hands directly on a Window object.
 
 Invariants B and C together mean that the only way to invoke the indexed 
 getter is via the WindowProxy.  Invariant A means that there is no 
 contradiction between the way ES specifies var (as creating 
 non-configurable properties) and the WebIDL requirements for an object 
 with an indexed getter (not allowing definition of any expando indexed 
 properties at all).

I think there are other invariants that make them equivalent that are 
relevant here. In particular:

D) When a Window is a script's global object, that script is always going 
   to be same-origin with the Window, so it will always pass the security 
   checks. (So, it's ok to not do the checks on Window and do them on 
   WindowProxy instead.)

I think actually invariants A and B are mooted by invariant D. That is, if 
they weren't true, we'd still be ok, because the security check is always 
going to be safe given D.

But if invariant D was broken, then it seems like A and B would become 
problematic if we moved the security checks to the WindowProxy rather than 
to the Window.

If invariant C is broken, e.g. because in some new language we don't have 
a WindowProxy and instead return the real Window for the current Document, 
or whatnot, whenever you access the Window object, it seems like we'd also 
actually want the security checks on Window.

Do these last two points affect your conclusions?


 I believe the model that puts all the magic in the WindowProxy, which 
 has to be quite magical already, is much easier for implementors to 
 understand and reason about, and more clearly maps onto actual 
 implementations with an actual proxy for the WindowProxy. It has the 
 benefit of not depending on hidden invariants to avoid contradicting the 
 ES spec, and of making it clear exactly where the magic is, as well as 
 the small but tangible side benefit of making the global (in the ES 
 sense) not be an exotic object (also in the ES sense), thus reducing 
 the likelihood that future ES changes to how the global behaves will in 
 any way affect the behavior of window.
 
 The drawback is that it needs a bit more prose defining the behavior of 
 WindowProxy

It doesn't seem like that much more prose, at least, not if we're keeping 
the same level of precision. (If we want more, that's a different matter.)


What do other vendors think? This is in principle a purely editorial 
change. It would be cool if there was a WebIDL way to define WindowProxy, 
so that it could be unambiguously defined for all languages, but since 
it's a one-off object, maybe it's not worth it.



Re: [whatwg] Window and WindowProxy

2013-08-06 Thread Boris Zbarsky

On 8/6/13 2:30 PM, Ian Hickson wrote:

I think there are other invariants that make them equivalent that are
relevant here. In particular:

D) When a Window is a script's global object, that script is always going
to be same-origin with the Window


Ah, yes.  Yes, that one is important too.  ;)


I think actually invariants A and B are mooted by invariant D. That is, if
they weren't true, we'd still be ok, because the security check is always
going to be safe given D.


Invariant A is needed because otherwise the behavior of objects with 
indexed properties (wherein they disallow adding indexed properties to 
them) would conflict with the ES-spec behavior of var.


Invariant B is needed because otherwise you could look up a property 
named 0 on a Window directly, and if the indexed props live on the 
WindowProxy you would unexpectedly get undefined instead of the first 
child window.


Neither one of those is about the security check situation, afaict.
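
A toy model of the invariant B concern, using an ES Proxy. This is only a sketch under strong simplifications: realWindow stands in for a Window with no indexed behavior, childWindows for the child browsing contexts, and the proxy always forwards to the same object (a real WindowProxy forwards to whichever Window is current). All names are hypothetical.

```javascript
// Toy model only: `realWindow` stands in for a Window with no indexed magic,
// and `childWindows` for the browsing context's child windows.
const childWindows = [{ name: 'first-child' }];
const realWindow = { name: 'top' };

const windowProxy = new Proxy(realWindow, {
  get(target, prop, receiver) {
    // Indexed getter lives on the proxy: "0", "1", ... map to child windows.
    if (typeof prop === 'string' && /^(0|[1-9]\d*)$/.test(prop)) {
      return childWindows[Number(prop)];
    }
    // Everything else is forwarded to the underlying Window.
    return Reflect.get(target, prop, receiver);
  },
});

console.log(windowProxy[0].name); // first-child
console.log(realWindow[0]);       // undefined
console.log(windowProxy.name);    // top
```

If script could reach realWindow directly, or a bareword lookup of 0 were possible, it would see undefined rather than the first child, which is the mismatch described above.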


But if invariant D was broken, then it seems like A and B would become
problematic if we moved the security checks to the WindowProxy rather than
to the Window.


Yes, agreed.

There are two somewhat-orthogonal concerns here:

1)  Where do the security checks live?
2)  Where do the indexed properties live?


If invariant C is broken, e.g. because in some new language we don't have
a WindowProxy and instead return the real Window for the current Document,
or whatnot, whenever you access the Window object, it seems like we'd also
actually want the security checks on Window.


Yes.


Do these last two points affect your conclusions?


I don't think they affect what I want to happen for indexed properties. 
That part is actually more important to me right now than the much 
more underspecified security check story; I expect as we fully specify 
the security checks in terms of the MOP (which we need to do) it'll 
become more obvious whether they need to live on the Window or the 
WindowProxy or both.



It doesn't seem like that much more prose, at least, not if we're keeping
the same level of precision. (If we want more, that's a different matter.)


Oh, I want more precision for sure.  ;)


What do other vendors think?


I'd love to know this too.


but since it's a one-off object, maybe it's not worth it.


I don't think it's worth it at all, frankly.

-Boris


Re: [whatwg] BinaryEncoding for Typed Arrays using window.btoa and window.atob

2013-08-06 Thread Chang Shu
If there is technically no benefit to passing an ArrayBufferView as a 2nd
parameter to atob, I think returning an ArrayBuffer is a good way to
go. Enhancing btoa/atob would be an easy solution, though I am open to
enhancing the Encoding spec. But it appears to me we would have to introduce
another pair of coders, say BinaryDecoder/BinaryEncoder, in addition
to TextDecoder/TextEncoder, since the signatures of the decode/encode
functions are different.
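
A minimal sketch of the "return an ArrayBuffer" option, written against today's atob. The function name base64ToArrayBuffer is hypothetical and is not a proposed API; it only shows that a plain ArrayBuffer lets the caller pick any view afterwards.

```javascript
// Hypothetical decoder: returns a plain ArrayBuffer so callers choose the view.
function base64ToArrayBuffer(b64) {
  const binary = atob(b64); // one character per decoded byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes.buffer;
}

const buffer = base64ToArrayBuffer('AQIDBA=='); // bytes 1, 2, 3, 4
const asBytes = new Uint8Array(buffer);
const asView = new DataView(buffer);
console.log(asBytes.length, asView.getUint16(0)); // 4 258
```

DataView reads big-endian by default, so the first two bytes 0x01 0x02 come back as 258.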

Chang

On Tue, Aug 6, 2013 at 8:28 AM, Kornel Lesiński kor...@geekhood.net wrote:
 On Mon, 05 Aug 2013 21:39:22 +0100, Chang Shu csh...@gmail.com wrote:

 I see your point now, Simon. Technically both approaches should work.
 As you said, yours has the limitation that the implementation does not
 know which view to return unless you provide an enum type of parameter
 instead of boolean to atob.


 In that case it'd be better to return ArrayBuffer, so the user can wrap it
 in any type they want (including DataView).

 --
 regards, Kornel


[whatwg] Form-associated elements and the parser

2013-08-06 Thread Adam Klein
Hixie opened my eyes last week to parser-association behavior of the
sort found at http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2428.
In that case, an input in a detached tree is associated with a
form in the main document. This causes badness in WebKit and Blink
because the association between the form and the input (e.g., as
exposed in the HTMLFormElement.elements collection) is only weakly
held to avoid reference loops (and thus memory leaks). And that
weakness occasionally results in crashes when one of these objects is
collected before the other.

While all modern HTML parser implementations I tested seemed to agree
on their treatment of the above example (they all return 1 as
elements.length), this feature doesn't strike me as terribly useful.
And for what it's worth, it doesn't seem to be present in legacy IE.

I'm interested in what others would think about changing the parser to
only associate a form with an input if both are in the same home
subtree 
(http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#home-subtree).
Or is there some deep web-compat reason for this parsing oddity?
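
The proposed rule can be sketched on toy nodes. This is an approximation, not parser code: nodes are plain objects with a parentNode link, "same home subtree" is modeled simply as "same root", and the names rootOf and shouldAssociate are hypothetical.

```javascript
// Toy nodes: plain objects with a parentNode link (null at the root).
function rootOf(node) {
  let current = node;
  while (current.parentNode) {
    current = current.parentNode;
  }
  return current;
}

// Hypothetical predicate for the proposed rule: associate only when the
// form and the input share a root.
function shouldAssociate(form, input) {
  return rootOf(form) === rootOf(input);
}

const doc = { parentNode: null };
const form = { parentNode: doc };
const attachedInput = { parentNode: form };
const detachedRoot = { parentNode: null };
const detachedInput = { parentNode: detachedRoot };

console.log(shouldAssociate(form, attachedInput)); // true
console.log(shouldAssociate(form, detachedInput)); // false
```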

- Adam


[whatwg] Microdata feedback

2013-08-06 Thread Ian Hickson
On Wed, 13 Feb 2013, Ed Summers wrote:
 
 I am looking for some guidance about the use of multiple itemtypes in 
 microdata [1], specifically the phrase "defined to use the same 
 vocabulary" in:
 
 
 The item types must all be types defined in applicable specifications
 and must all be defined to use the same vocabulary.
 
 
 For example, does this mean that I can't say:
 
 <div itemscope itemtype="http://acme.com/Foo http://zenith.com/Bar"> ... 
 </div>

It depends on what http://acme.com/Foo and http://zenith.com/Bar are. If 
they use the same vocabulary, then you can do it. If they're separate 
vocabularies, then no.


 The reason I ask is that there is some desire over in the schema.org 
 community [2] to provide a mechanism for schema.org to be specialized. 
 For example, in the case of an audiobook:
 
 <div itemscope itemtype="http://schema.org/Book
 http://www.productontology.org/id/Audiobook"> ... </div>
 
 The idea being not to overload schema.org with more vocabulary, and to 
 let vocabularies grow a bit more organically.

If they're the same vocabulary -- that is, the properties on this .../Book 
vocabulary and this .../Audiobook vocabulary don't clash -- properties 
mean the same thing in both -- then it's fine.


 This schema.org group is currently thinking of using a one off property 
 additionalType that would be used like so:
 
 <div itemscope itemtype="http://schema.org/Book">
   <link itemprop="additionalType"
         href="http://www.productontology.org/id/Audiobook">
   ...
 </div>
 
 I personally find this to be kind of distasteful since it replicates the 
 mechanics that microdata's itemtype already offers.

It's essentially equivalent, yes.


 So, my question: is it the case that itemtype cannot reference types in 
 different vocabularies like the example above? If so, I'm curious to 
 know what the rationale was, and if perhaps it could be relaxed.

If they're different vocabularies (i.e. the same terms are used to mean 
different things), then you wouldn't know which was meant, so it would be 
ambiguous. There's an open bug about this topic with an open question:

   https://www.w3.org/Bugs/Public/show_bug.cgi?id=13527


On Thu, 14 Feb 2013, Ed Summers wrote:
 
 In John's email [1] he proposed limiting multiple types to being from 
 the same origin domain, not the same vocabulary as is stated in the 
 Microdata spec. It sounds like an obvious question, but is there a 
 precise definition of what is meant by same vocabulary? Or is it just 
 a hand wavy way of talking about what humans understand when putting the 
 itemtype URLs in their browsers, reading, and understanding that they 
 are types that are part of some larger coherent whole?

Vocabulary means the set of properties that are defined. There's some 
non-normative text in the HTML spec that talks about this:

# The type gives the context for the properties, thus selecting a
# vocabulary: a property named "class" given for an item with the type
# "http://census.example/person" might refer to the economic class of
# an individual, while a property named "class" given for an item with
# the type "http://example.com/school/teacher" might refer to the
# classroom a teacher has been assigned. Several types can share a
# vocabulary. For example, the types
# "http://example.org/people/teacher" and
# "http://example.org/people/engineer" could be defined to use the
# same vocabulary (though maybe some properties would not be
# especially useful in both cases, e.g. maybe the
# "http://example.org/people/engineer" type might not typically be
# used with the "classroom" property). Multiple types defined to use
# the same vocabulary can be given for a single item by listing the
# URLs as a space-separated list in the attribute's value. An item
# cannot be given two types if they do not use the same vocabulary,
# however.
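
A small sketch of how a consumer might apply this rule. Splitting the attribute on HTML space characters follows from the "space-separated list" wording; the vocabularyOf table is a made-up registry for illustration, since in practice only the specifications defining the types can say which vocabulary they use.

```javascript
// Split an itemtype attribute value into its space-separated type URLs.
function itemTypes(attributeValue) {
  return attributeValue.split(/[ \t\n\f\r]+/).filter((token) => token !== '');
}

// Hypothetical registry mapping each type URL to a vocabulary name; a real
// consumer would get this from the specifications defining the types.
const vocabularyOf = new Map([
  ['http://example.org/people/teacher', 'people'],
  ['http://example.org/people/engineer', 'people'],
  ['http://census.example/person', 'census'],
]);

// True only when every listed type is known and all map to one vocabulary.
function sharesOneVocabulary(attributeValue) {
  const vocabularies = new Set(
    itemTypes(attributeValue).map((type) => vocabularyOf.get(type))
  );
  return vocabularies.size === 1 && !vocabularies.has(undefined);
}

console.log(sharesOneVocabulary(
  'http://example.org/people/teacher http://example.org/people/engineer'
)); // true
console.log(sharesOneVocabulary(
  'http://example.org/people/teacher http://census.example/person'
)); // false
```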


On Tue, 19 Feb 2013, Judson Lester wrote:

 There was an email from last year suggesting that the values of input 
 elements be derived from their value attributes - the purpose there 
 being to be able to control the form via the microdata interface.  I've 
 only been able to read it in the archives - the brief exchange was 
 between Igor Nikolev and Ian Hickson, who was curious about use cases.
 
 Conversely, it would be useful to be able to use input elements to 
 contain item values, and at the moment, since their values would be 
 derived from their textContent, they're useless for that.  
 Specifically, it's often reasonable to present a representation as the 
 default values in a form and allow for updates simply by posting the 
 changed values.  It seems unwieldy to need to replicate that information 
 in e.g. data elements.
 
 While it would be simple to treat the defaultValue as the item property 
 value for elements (and for radio inputs, let the representation mark 
 the selected input as the itemprop), it seems counter to the spirit of 
 the proposal.  The alternative would be to do something like excluding 
 unsuccessful input elements during 

Re: [whatwg] Should video controls generate click events?

2013-08-06 Thread Ian Hickson
On Thu, 27 Jun 2013, Philip Jägenstedt wrote:

 In a discussion about a click to play/pause feature for Opera on 
 Android, the issue of click event handlers came up.[1] The problem is 
 that pages can do things like this:
 
 v.onclick = function() {
   if (v.paused) {
     v.play();
   } else {
     v.pause();
   }
   // no preventDefault()
 }
 
 I created a demo [2] and it is indeed the case that this makes video 
 controls unusable in both Presto and Chromium based browsers. Simon 
 Pieters has brought this up before, but the spec wasn't changed at that 
 point.[3]
 
 While my demo may be on the hypothetical side, we do want users to be 
 able to bring up the native controls via a context menu and be able to 
 use them regardless of what the page does in its event handlers. So, I 
 request that the spec be explicit that interacting with the video 
 controls does not cause the normal script-visible events to be fired.
 
 [1] https://codereview.chromium.org/17391015
 [2] http://people.opera.com/~philipj/click.html
 [3] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-June/031916.html
 (search for "As with the post Simon cites above")

I've made the spec say this is a valid (and recommended) implementation 
strategy.

HTH,

Re: [whatwg] asynchronous JSON.parse and sending large structured data between threads without compromising responsiveness

2013-08-06 Thread Ian Hickson
On Thu, 7 Mar 2013, j...@mailb.org wrote:

 right now JSON.parse blocks the mainloop, this gets more and more of an 
 issue as JSON documents get bigger and are also used as serialization 
 format to communicate with web workers.

I think it would make sense to have a Promise-based API for JSON parsing. 
This probably belongs either in the JS spec or the DOM spec; Anne, Ms2ger, 
and any JS people, is anyone interested in taking this?
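
Purely to show what the calling side of such an API could look like, here is a sketch with a hypothetical parseJSONAsync. It only models the Promise shape: the parse in this version still runs on the calling thread, so it does not solve the blocking problem by itself.

```javascript
// Sketch of the API shape only: the parse still runs on the calling thread.
function parseJSONAsync(text) {
  return new Promise((resolve) => {
    // A SyntaxError thrown inside the executor rejects the promise.
    resolve(JSON.parse(text));
  });
}

parseJSONAsync('{"level": 3}').then((value) => console.log(value.level)); // 3
parseJSONAsync('{oops').catch((err) => console.log(err.name)); // SyntaxError
```

A real implementation would hand the text to another thread and resolve once parsing finishes there.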


On Thu, 7 Mar 2013, David Rajchenbach-Teller wrote:
 
 Actually, communicating large JSON objects between threads may cause 
 performance issues. I do not have the means to measure reception speed 
 simply (which would be used to implement asynchronous JSON.parse), but 
 it is easy to measure main thread blocks caused by sending (which would 
 be used to implement asynchronous JSON.stringify).

I don't understand why there'd be any difficulty in sending large objects 
between workers or from a worker to the main thread. It's possible this is 
not well-implemented today, but isn't that just an implementation detail?

One could imagine an implementation strategy where the cloning is done on 
the sending side, or even on a third thread altogether, and just passed 
straight to the receiving side in one go.


On Thu, 7 Mar 2013, Tobie Langel wrote:
 
 Even if an async API for JSON existed, wouldn't the perf bottleneck then 
 simply fall on whatever processing needs to be done afterwards?

That was my initial reaction as well, I must admit.


On Fri, 8 Mar 2013, David Rajchenbach-Teller wrote:

 For the moment, the main use case I see for asynchronous
 serialization of JSON is that of snapshotting the world without stopping
 it, for backup purposes, e.g.:
 a. saving the state of the current region in an open world RPG;
 b. saving the state of an ongoing physics simulation;
 c. saving the state of the browser itself in case of crash/power loss
 (that's assuming a FirefoxOS-style browser implemented as a web
 application);
 d. backing up state and history of the browser itself to a server
 (again, assuming that the browser is a web application).

Serialising is hard to do async, since you fundamentally have to walk the 
data structure, and the actual serialisation at that point is not 
especially more expensive than a copy.


 The natural course of action would be to do the following:
 1. collect data to a JSON object (possibly a noop);

I'm not sure what you mean by JSON object. JSON is a string format. Do you 
mean a JS object data structure?

 2. send the object to a worker;
 3. apply some post-treatment to the object (possibly a noop);
 4. write/upload the object.
 
 Having an asynchronous JSON serialization to some Transferable form 
 would considerably simplify the task of implementing step 2 without 
 janking if the data ends up very heavy.

I don't understand what JSON has to do with sending data to a worker. You 
can just send the actual JS object; MessagePorts and postMessage() support 
raw JS objects.
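
A short sketch of the difference. It assumes a runtime that exposes the global structuredClone function, which applies the same structured clone algorithm that postMessage() uses; the sample state object is made up.

```javascript
const state = { savedAt: new Date(0), scores: new Map([['alice', 10]]) };

// Structured clone keeps Date and Map as Date and Map.
const cloned = structuredClone(state);

// A JSON round trip turns the Date into a string and the Map into {}.
const viaJson = JSON.parse(JSON.stringify(state));

console.log(cloned.savedAt instanceof Date, cloned.scores.get('alice')); // true 10
console.log(typeof viaJson.savedAt, viaJson.scores);                     // string {}
```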


 So far, I have discussed serializing JSON, not deserializing it, but I 
 believe that the symmetric scenarios also hold.

No, they are quite asymmetric. Serialising requires stalling the code that 
is interacting with the data structure, to guarantee integrity. Parsing is 
easy to do on a separate worker, because it has no dependencies -- you can 
do it all in isolation.


On Fri, 8 Mar 2013, David Rajchenbach-Teller wrote:
 
 If I am correct, this means that we need some mechanism to provide 
 efficient serialization of non-Transferable data into something 
 Transferable.

I don't understand what this means. Transferable is about neutering 
objects on one side and creating new versions on the other. It's the 
equivalent of a move. Your use cases were about making copies, as far as 
I can tell (saving and backing up).

As a general rule, JSON has nothing to do with Transferable objects, as 
far as I can tell.
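
The "move" semantics can be seen in a few lines. This sketch assumes a runtime with the global structuredClone function and its transfer option; in a page the same thing happens when an ArrayBuffer is listed in the transfer argument of postMessage().

```javascript
const original = new ArrayBuffer(8);

// Transfer (move) the buffer instead of copying it.
const moved = structuredClone(original, { transfer: [original] });

console.log(original.byteLength); // 0: the source buffer is now detached
console.log(moved.byteLength);    // 8
```

After the transfer the sending side no longer has the data, which is why it does not match the save and backup use cases.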


On Fri, 8 Mar 2013, David Rajchenbach-Teller wrote:
 
 Intuitively, this sounds like:
 1. collect data to a JSON;
 2. serialize JSON (hopefully asynchronously) to a Transferable (or
 several Transferables).

I really don't understand this. Are you asking for a way to move a JS 
object from one thread to another, killing references to it in the first 
thread? What's the use case? (What would this have to do with JSON?)


On Fri, 8 Mar 2013, David Bruant wrote:

 Why not collect the data in a Transferable like an ArrayBuffer directly? 
 It skips the additional serialization part. Writing a byte stream 
 directly is a bit hardcore I admit, but an object full of setters can 
 give the impression to create an object while actually filling an 
 ArrayBuffer as a backend. I feel that could work efficiently.
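
A minimal sketch of that idea, with a made-up PointRecord type holding two numeric fields. The class name, field names, and byte layout are all assumptions for illustration; the point is only that ordinary-looking property writes land directly in an ArrayBuffer.

```javascript
// Hypothetical record type: looks like an object with x/y fields, but every
// write goes straight into an ArrayBuffer that could later be transferred.
class PointRecord {
  constructor() {
    this.buffer = new ArrayBuffer(16); // two float64 fields
    this.view = new DataView(this.buffer);
  }
  set x(value) { this.view.setFloat64(0, value); }
  get x() { return this.view.getFloat64(0); }
  set y(value) { this.view.setFloat64(8, value); }
  get y() { return this.view.getFloat64(8); }
}

const point = new PointRecord();
point.x = 1.5;
point.y = -2;
console.log(point.x, point.y, point.buffer.byteLength); // 1.5 -2 16
```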

It's not clear to me what the use case is, but if the desire is to move a 
batch of data from one thread to another, then this is certainly one way 
to do it. Another would be to just copy the data in the first place, no 
need to move it -- since you have to pay the cost of reading 

Re: [whatwg] Form-associated elements and the parser

2013-08-06 Thread Ryosuke Niwa

On Aug 6, 2013, at 2:01 PM, Adam Klein ad...@chromium.org wrote:

 Hixie opened my eyes last week to parser-association behavior of the
 sort found at 
 http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2428.
 In that case, an input in a detached tree is associated with a
 form in the main document. This causes badness in WebKit and Blink
 because the association between the form and the input (e.g., as
 exposed in the HTMLFormElement.elements collection) is only weakly
 held to avoid reference loops (and thus memory leaks). And that
 weakness occasionally results in crashes when one of these objects is
 collected before the other.
 
 While all modern HTML parser implementations I tested seemed to agree
 on their treatment of the above example (they all return 1 as
 elements.length), this feature doesn't strike me as terribly useful.
 And for what it's worth, it doesn't seem to be present in legacy IE.

What is the behavior of the old IE?

- R. Niwa



Re: [whatwg] Form-associated elements and the parser

2013-08-06 Thread Adam Klein
On Tue, Aug 6, 2013 at 4:09 PM, Ryosuke Niwa rn...@apple.com wrote:

 On Aug 6, 2013, at 2:01 PM, Adam Klein ad...@chromium.org wrote:

 Hixie opened my eyes last week to parser-association behavior of the
 sort found at 
 http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2428.
 In that case, an input in a detached tree is associated with a
 form in the main document. This causes badness in WebKit and Blink
 because the association between the form and the input (e.g., as
 exposed in the HTMLFormElement.elements collection) is only weakly
 held to avoid reference loops (and thus memory leaks). And that
 weakness occasionally results in crashes when one of these objects is
 collected before the other.

 While all modern HTML parser implementations I tested seemed to agree
 on their treatment of the above example (they all return 1 as
 elements.length), this feature doesn't strike me as terribly useful.
 And for what it's worth, it doesn't seem to be present in legacy IE.

 What is the behavior of the old IE?

form.elements.length == 0 in IE 9.

- Adam


Re: [whatwg] Form-associated elements and the parser

2013-08-06 Thread Jonas Sicking
As I recall it (it was ages since I dealt with this), the tricky case
that you need to handle is this one:

http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2432

In this case, web compatibility requires that the input is
associated with the form. Specifically hidden input elements would
often end up moved, but still had to show up in form.elements as well
as get submitted along with the form.

/ Jonas



Re: [whatwg] Form-associated elements and the parser

2013-08-06 Thread Adam Klein
On Tue, Aug 6, 2013 at 4:21 PM, Jonas Sicking jo...@sicking.cc wrote:
 As I recall it (it was ages since I dealt with this), the tricky case
 that you need to handle is this one:

 http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2432

 In this case, web compatibility requires that the input is
 associated with the form. Specifically hidden input elements would
 often end up moved, but still had to show up in form.elements as well
 as get submitted along with the form.

That case definitely makes sense to me, and I think it's fine to keep
that behavior for compat. The only one I'm asking to change is the
case when the input and form end up in different trees.



Re: [whatwg] Form-associated elements and the parser

2013-08-06 Thread Jonas Sicking
On Tue, Aug 6, 2013 at 4:27 PM, Adam Klein ad...@chromium.org wrote:
 On Tue, Aug 6, 2013 at 4:21 PM, Jonas Sicking jo...@sicking.cc wrote:
 As I recall it (it was ages since I dealt with this), the tricky case
 that you need to handle is this one:

 http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2432

 In this case, web compatibility requires that the input is
 associated with the form. Specifically hidden input elements would
 often end up moved, but still had to show up in form.elements as well
 as get submitted along with the form.

 That case definitely makes sense to me, and I think it's fine to keep
 that behavior for compat. The only one I'm asking to change is the
 case when the input and form end up in different trees.

Sure, as long as you come up with a formalized algorithm for when
there is an association and when there isn't. Keep in mind that by the
time that the input-element is inserted, the form-element might have
been moved elsewhere. We likely don't need the association in that
case, but detecting that that has happened sounds tricky.

The way that Gecko currently works IIRC is that it creates the
association any time it has seen a <form> without seeing a
</form>. And it breaks the association anytime an input-element's
parent chain changes and the associated form-element is no longer in
the parent chain.
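
To make that rule concrete, here is a rough sketch on a toy tree rather
than the real DOM. It is not Gecko's code; TreeNode, parserInsert and
parentChainChanged are hypothetical names used only for illustration.

```javascript
// Toy tree node; a stand-in for a DOM element (hypothetical, not a real API).
class TreeNode {
  constructor(name) {
    this.name = name;
    this.parent = null;
    this.form = null; // associated form, if any
  }
  appendTo(parent) {
    this.parent = parent;
    return this;
  }
}

// True if `ancestor` appears somewhere in `node`'s parent chain.
function hasAncestor(node, ancestor) {
  for (let p = node.parent; p !== null; p = p.parent) {
    if (p === ancestor) return true;
  }
  return false;
}

// Parser side: `openForm` models "a <form> was seen and no </form> yet".
// The association is made even if the input lands outside the form.
function parserInsert(input, parent, openForm) {
  input.appendTo(parent);
  if (openForm !== null) input.form = openForm;
}

// To be called whenever the input's parent chain changes after parsing:
// drop the association if the form is no longer an ancestor.
function parentChainChanged(input) {
  if (input.form !== null && !hasAncestor(input, input.form)) {
    input.form = null;
  }
}
```

Note that the ancestor check only runs on later moves, so an input that
the parser placed outside its form keeps the association until its
parent chain changes.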

On a related note, when are you guys going to add a cycle collector or
other not-plain-refcounting memory manager :-)

/ Jonas

 On Tue, Aug 6, 2013 at 2:01 PM, Adam Klein ad...@chromium.org wrote:
 Hixie opened my eyes last week to parser-association behavior of the
 sort found at 
 http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2428.
 In that case, an input in a detached tree is associated with a
 form in the main document. This causes badness in WebKit and Blink
 because the association between the form and the input (e.g., as
 exposed in the HTMLFormElement.elements collection) is only weakly
 held to avoid reference loops (and thus memory leaks). And that
 weakness occasionally results in crashes when one of these objects is
 collected before the other.

 While all modern HTML parser implementations I tested seemed to agree
 on their treatment of the above example (they all return 1 as
 elements.length), this feature doesn't strike me as terribly useful.
 And for what it's worth, it doesn't seem to be present in legacy IE.

 I'm interested what others would think about changing the parser to
 only associate a form with an input if both are in the same home
 subtree 
 (http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#home-subtree).
 Or is there some deep web-compat reason for this parsing oddity?

 - Adam


Re: [whatwg] Form-associated elements and the parser

2013-08-06 Thread Adam Klein
On Tue, Aug 6, 2013 at 4:38 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Tue, Aug 6, 2013 at 4:27 PM, Adam Klein ad...@chromium.org wrote:
 On Tue, Aug 6, 2013 at 4:21 PM, Jonas Sicking jo...@sicking.cc wrote:
 As I recall it (it was ages since I dealt with this), the tricky case
 that you need to handle is this one:

 http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2432

 In this case, web compatibility requires that the input is
 associated with the form. Specifically hidden input elements would
 often end up moved, but still had to show up in form.elements as well
 as get submitted along with the form.

 That case definitely makes sense to me, and I think it's fine to keep
 that behavior for compat. The only one I'm asking to change is the
 case when the input and form end up in different trees.

 Sure, as long as you come up with a formalized algorithm for when
 there is an association and when there isn't. Keep in mind that by the
 time that the input-element is inserted, the form-element might have
 been moved elsewhere. We likely don't need the association in that
 case, but detecting that that has happened sounds tricky.

My concrete proposal would be something like this:

In step 4 of 
http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#create-an-element-for-the-token,
add a requirement that intended parent and the form element
pointer be part of the same home subtree (defined at
http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#home-subtree).
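
As a rough sketch of the extra check (not spec text or engine code), on
plain objects with a parent link: rootOf and shouldAssociate are
hypothetical names, and the home subtree is approximated here by the
topmost ancestor of a node.

```javascript
// Walk up to the topmost ancestor; used as a stand-in for the home subtree.
function rootOf(node) {
  let n = node;
  while (n.parent !== null) n = n.parent;
  return n;
}

// Associate only if a form element pointer is set and it shares a
// root with the intended parent of the element being created.
function shouldAssociate(intendedParent, formPointer) {
  return formPointer !== null &&
         rootOf(intendedParent) === rootOf(formPointer);
}
```

With this check, an element created in a detached tree would not pick up
a form that lives in the main document.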

 The way that Gecko currently works IIRC is that it creates the
 association any time it has seen a form without seeing a
 /form. And it breaks the association anytime an input-element's
 parent chain changes and the associated form-element is no longer in
 the parent chain.

This is basically the same thing Blink & WebKit do, with the caveat
that we also avoid associating forms with elements inside
templates (this is now reflected in step 4 of the algorithm, see
above).

 On a related note, when are you guys going to add a cycle collector or
 other not-plain-refcounting memory manager :-)

Yes, that would be nice :)

- Adam

 / Jonas

 On Tue, Aug 6, 2013 at 2:01 PM, Adam Klein ad...@chromium.org wrote:
 Hixie opened my eyes last week to parser-association behavior of the
 sort found at 
 http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=2428.
 In that case, an input in a detached tree is associated with a
 form in the main document. This causes badness in WebKit and Blink
 because the association between the form and the input (e.g., as
 exposed in the HTMLFormElement.elements collection) is only weakly
 held to avoid reference loops (and thus memory leaks). And that
 weakness occasionally results in crashes when one of these objects is
 collected before the other.

 While all modern HTML parser implementations I tested seemed to agree
 on their treatment of the above example (they all return 1 as
 elements.length), this feature doesn't strike me as terribly useful.
 And for what it's worth, it doesn't seem to be present in legacy IE.

 I'm interested what others would think about changing the parser to
 only associate a form with an input if both are in the same home
 subtree 
 (http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#home-subtree).
 Or is there some deep web-compat reason for this parsing oddity?

 - Adam


Re: [whatwg] asynchronous JSON.parse and sending large structured data between threads without compromising responsiveness

2013-08-06 Thread Boris Zbarsky

On 8/6/13 5:58 PM, Ian Hickson wrote:

One could imagine an implementation strategy where the cloning is done on
the sending side, or even on a third thread altogether


The cloning needs to run to completion (in the sense of capturing an 
immutable representation) before anyone can change the data structure 
being cloned.


That means either serializing the whole data structure in some way 
before returning control to JS, or doing something where you start 
serializing it async and block until finished as soon as someone tries 
to modify any of those objects in any way, right?


The latter is rather nontrivial to implement, so UAs do the former at 
the moment.
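
A toy illustration of the former approach, where the snapshot is taken
before control returns to the caller. SnapshotPort is a hypothetical
stand-in for a message port, and JSON is used only as a simple
substitute for structured cloning; real UAs do not work this way
internally.

```javascript
// Hypothetical port: post() serializes synchronously, so later
// mutations of the posted object cannot affect what is delivered.
class SnapshotPort {
  constructor() {
    this.queue = [];
  }
  post(value) {
    // Snapshot taken here, before post() returns.
    this.queue.push(JSON.stringify(value));
  }
  receive() {
    // Deserialize the oldest pending snapshot.
    return JSON.parse(this.queue.shift());
  }
}

const port = new SnapshotPort();
const player = { name: "The One", hp: 1000 };
port.post(player);
player.hp = 0;                     // mutation after post() has returned
const received = port.receive();
console.log(received.hp);          // 1000
```

The cost is that the whole walk of the data structure happens on the
sending thread, which is the responsiveness problem under discussion.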



Serialising is hard to do async, since you fundamentally have to walk the
data structure, and the actual serialisation at that point is not
especially more expensive than a copy.


Right, that's what I said above...  ;)


Parsing is easy to do on a separate worker, because it has no
dependencies -- you can do it all in isolation.


Sadly, that may not be the case.

Actual JS implementations have various thread-local data that objects 
depend on (starting with interned property names), such that it's not 
actually possible to create an object on one thread and use it on 
another in many of them.



For instance, how would you serialize something as simple as the following?

{
   name: "The One",
   hp: 1000,
   achievements: ["achiever", "overachiever", "extreme overachiever"]
   // Length of the list is unpredictable
}


Why serialise it? If you want to post this across a MessagePort to a
worker, or back from a worker, why not just post it?

var a = { ... }; // from above
port.postMessage(a);


This in practice does some sort of serialization in UAs.


Assuming by "Firefox Desktop" you mean the browser for desktop OSes called
Firefox, then, why not just do this in C++?


Let's start with because writing C++ code without memory errors is 
harder than writing JS code without memory errors?



I don't understand why you
would constrain yourself to using Web APIs in JavaScript to write a browser.


Simplicity of implementation?  Sandboxing of the code?  Eating your own 
dogfood?


I can come up with some more reasons if you want.

-Boris