Re: [whatwg] register*Handler and Web Intents
Hi,

Is there any ability to pass a MessageChannel port in as an IntentSettings member, or out in the success handler? Is there any facility to allow multi-part communications with an activity? For example, Sony does this in their Local UPnP Service Discovery Web Intents scheme: http://www.w3.org/wiki/images/2/2e/V4_W3C_Web_Intents_-_Local_UPnP_Service_Discovery.pdf#page=15

This is really the only way to get out of the one-shot request/response model, which is extremely important to me and to the general versatility of this interoperability mechanism.

The callbacks given in the method, if provided, are invoked asynchronously in reaction to the handler calling success() or failure() on the Intent object. We would just allow one success or failure message per Intent, for sanity's sake.

I'd far prefer a model not based up front on the one-shot model: an Intent ought to be a SharedWorker in terms of its interactions with the page (although more Intent-oriented in instantiation), crossed with the recent notion of Chrome's Packaged App: a stand-alone contained experience. This is a radical turn I would justify by its more general-purpose interaction model. It also invents far less: SharedWorkers just need some interface, and presto-chango, we have the perfect Intents, rather than making an entirely new custom suite of interaction models tailored to a more limited one-shot use case. I would be happy to make this proposal more concrete.
Although I reference Packaged App as a good model, the ultimate implementation could be merely a new web browser page whose Window implements SharedWorker:

    interface SharedWorker : EventTarget {
      readonly attribute MessagePort port;
    };
    SharedWorker implements AbstractWorker;

    interface AbstractWorker {
      attribute EventHandler onerror;
    };

If we want to stick with the current one-shot model, I'd recommend chainable Intents:

    callback AnyCallback = void (any data, Intentable continuator);

    interface Intentable {
      void startIntent(IntentSettings intent,
                       optional AnyCallback success,
                       optional DOMStringCallback failure);
    };
    Window implements Intentable;

This is for the multi-part use case:

    window.startIntent({action: "control-point"}, function(cpData, myPanel) {
      myPanel.startIntent({action: "play", data: {}});
    });

Note that if the registration page does not have both of these, the nested startIntent will fail:

    <intent action="control-point" scheme=? title="RCA Control Panel"/>
    <intent action="play" scheme=? title="Play on RCA TV"/>

The desire I wish to express is creating a context which can be continued. The explicit use of Intentable ensures that only the previous handler will be able to handle the new request. I'd seek a more formal mechanic to officially carry on the continuation: informally, cpData could hold a token which could be passed into the data of myPanel's play startIntent, but this ad-hoc continuation is a weak way of being able to hold a reasonable conversation.

In parting, I wish to thank Sony for showing the utmost pragmatism in their design. I appreciate their two approaches to this problem, and for showing what a real service discovery use case looks like.

Fair regards, delighted to see this topic being talked about, please let me know how I can aid,
rektide
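As a thought experiment, the continuation semantics sketched above can be modeled in a few lines of plain JavaScript. This is purely an illustrative mock, not a real API: makeIntentable, the handler-registry shape, and the data/continuations convention are all invented for the sketch; only the startIntent(settings, success, failure) signature comes from the proposal above.

```javascript
// Mock of the chainable Intentable idea: each handler returns its result
// data plus the set of follow-up actions it is willing to continue with.
function makeIntentable(handlers) {
  return {
    startIntent(settings, success, failure) {
      const handler = handlers[settings.action];
      if (!handler) {
        if (failure) failure('no handler for ' + settings.action);
        return;
      }
      const { data, continuations = {} } = handler(settings);
      // The continuator only knows the follow-ups this handler offered,
      // so only the previous handler can field the next request.
      if (success) success(data, makeIntentable(continuations));
    }
  };
}

// A control-point handler that offers a follow-up "play" intent,
// mirroring the UPnP control-panel example in the message.
const win = makeIntentable({
  'control-point': () => ({
    data: { device: 'RCA TV' },
    continuations: { play: () => ({ data: { playing: true } }) }
  })
});

let log = [];
win.startIntent({ action: 'control-point' }, (cpData, myPanel) => {
  log.push(cpData.device);
  // Continuation handled by the previous handler's offered follow-up:
  myPanel.startIntent({ action: 'play', data: {} }, (r) => log.push(r.playing));
  // An action the continuator never offered fails, as described above:
  myPanel.startIntent({ action: 'discover' }, null, (err) => log.push(err));
});
```

The point of the sketch is that the continuation context is a capability handed to the success callback, rather than an ad-hoc token smuggled through the data.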
Re: [whatwg] Missing alt attribute name bikeshedding (was Re: alt= and the meta name=generator exception)
On Mon, Aug 6, 2012 at 4:17 AM, Odin Hørthe Omdal odi...@opera.com wrote:

IMHO generator-unable-to-provide-required-alt in all its ugliness is a really nice feature, because how would anyone in their right mind write that. It's really made for a corner case, and if you really really want that, you should be prepared to deal with the ugliness, because what you are doing is ugly in the first place...

Making things ugly on purpose is always a bad idea. Either it has valid use cases, and it should be a clean, well-designed feature, or it doesn't, and it shouldn't be there at all. Please don't go down this path; we have more than enough ugliness by accident without doing it on purpose.

-- Glenn Maynard
Re: [whatwg] register*Handler and Web Intents
On Fri, Aug 3, 2012 at 12:00 PM, James Graham jgra...@opera.com wrote:

I agree with Henri that it is extremely worrying to allow aesthetic concerns to trump backward compatibility here.

Letting aesthetic concerns trump backward compat is indeed troubling. It's also troubling that this even needs to be debated, considering that we're supposed to have a common understanding of the design principles, and the design principles pretty clearly uphold backward compatibility over aesthetics.

I would also advise strongly against using position in the DOM to detect intents support; if you insist on adding a new void element I will strongly recommend that we add it to the parser asap to try and mitigate the above breakage, irrespective of our plans for the rest of the intent mechanism.

I think the compat story for new void elements is so bad that we shouldn't add new void elements. (<source> gets away with being a void element, because the damage is limited by the </video> or </audio> end tag that comes soon enough after <source>.) I think we also shouldn't add new elements that don't imply <body> when appearing in <head>.

It's great that browsers have converged on the parsing algorithm. Let's not break what we've achieved to cater to aesthetics.

-- Henri Sivonen hsivo...@iki.fi http://hsivonen.iki.fi/
Re: [whatwg] StringEncoding: encode() return type looks weird in the IDL
On Sun, Aug 5, 2012 at 11:44 AM, Boris Zbarsky bzbar...@mit.edu wrote:

On 8/5/12 1:39 PM, Glenn Maynard wrote: I didn't say it was extensibility, just a leftover from something that was either considered and dropped or forgotten about.

Oh, I see. I thought you were talking about leaving the return value as-is so that Uint16Array return values can be added later.

I'd vote for changing the return type to Uint8Array as things stand, and if we ever change what the function can return, we change the return type at that point.

Thanks. Yes, having the return type be ArrayBufferView in the IDL is just a leftover. Fixing it now to be Uint8Array.

I'll start another thread on StringEncoding shortly summarizing the open issues, but anyone reading this thread is encouraged to take a look at http://wiki.whatwg.org/wiki/StringEncoding and craft opinions.
[whatwg] StringEncoding open issues
Regarding the API proposal at: http://wiki.whatwg.org/wiki/StringEncoding

It looks like we've got some developer interest in implementing this, and need to nail down the open issues. I encourage folks to look over the Resolved issues in the wiki page and make sure the resolutions - gathered from loose consensus here and offline discussion - are truly resolved, or flag anything that is not future-proof and should block implementations from proceeding. Also, look at the Notes to Implementers section; this should be non-controversial but may be non-obvious.

This leaves two open issues: behavior on encoding error, and handling of Byte Order Marks (BOMs).

== Encoding Errors ==

The proposal builds on Anne's encoding spec, http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html , which defines when encodings should emit an encoder error. In that spec (which describes the existing behavior of Web browsers) encoders are used in a limited fashion, e.g. for encoding form results before submission via HTTP, and hence the cases are much more restricted than the errors encountered when browsers are asked to decode content from the wild. As noted there, the encoding process could terminate when an error is emitted. Alternately (and as is necessary for forms, etc.) there is a use-case-specific escaping mechanism for non-encodable code points.

The proposed TextDecoder object takes a TextDecoderOptions options object with a |fatal| flag that controls the decode behavior in case of error: if |fatal| is unset (the default), a decode error produces a fallback character (U+FFFD); if |fatal| is set, a DOMException is raised instead. No such option is currently proposed for the TextEncoder object; the proposal dictates that a DOMException is thrown if the encoder emits an error. I believe this is sufficient for V1, but want feedback.
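For reference, the |fatal| behavior proposed above can be exercised directly against the TextDecoder implementations that eventually shipped in browsers and Node (which kept the option name, though they throw TypeError rather than DOMException); a minimal sketch:

```javascript
// 0xFF can never appear in well-formed UTF-8.
const bad = new Uint8Array([0x68, 0x69, 0xFF]); // "hi" + one invalid byte

// Default (|fatal| unset): the bad byte becomes the fallback U+FFFD.
const lossy = new TextDecoder('utf-8').decode(bad);
console.log(lossy === 'hi\uFFFD'); // true

// |fatal| set: decoding throws instead of substituting.
let threw = false;
try {
  new TextDecoder('utf-8', { fatal: true }).decode(bad);
} catch (e) {
  threw = true;
}
console.log(threw); // true
```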
For V2 (or now, if desired), the API could be extended to accept an options object allowing for some/all of these cases:

* Don't throw; instead emit a standard/encoding-specific replacement character (e.g. '?')
* Don't throw; instead emit a fixed placeholder character (byte?) sequence
* Don't throw; instead call a user-defined callback and allow it to produce a replacement escaped character sequence, e.g. a &#x...; numeric character reference

The latter seems the most flexible (a superset of the rest) but is probably overkill for now. Since it can be added in easily later, can we defer until we have implementer and user feedback?

== Byte Order Marks (BOMs) ==

Once again, the proposal builds on Anne's encoding spec, http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html , which describes the existing behavior of Web browsers. In the wild, browsers deal with a variety of mechanisms for indicating the encoding of documents (server headers, meta tags, XML preludes, etc.), many of which are blatantly incorrect or contradictory. One form is fortunately rarely wrong: if the document is encoded in UTF-8, UTF-16LE or UTF-16BE and includes the byte order mark (the encoding-specific serialization of U+FEFF). This is built into the Encoding spec - given a byte sequence to decode and an encoding label, the label is ignored if the sequence starts with one of the three UTF BOMs, and the BOM-indicated encoding is used to decode the rest of the stream.

The proposed API will have different uses, so it is unclear that this is necessary or desirable. At a minimum, it is clear that:

* If one of the UTF encodings is specified AND the BOM matches, then the leading BOM character (U+FEFF) MUST NOT be emitted in the output character sequence (i.e. it is silently consumed)

Less clear is the behavior in these two cases:

* If one of the UTF encodings is specified AND a different BOM is present (e.g. UTF-16LE specified, but a UTF-16BE BOM present)
* If one of the non-UTF encodings is specified AND a UTF BOM is present

Options include:

* Nothing special - the decoder does what it will with the bytes, possibly emitting garbage, possibly throwing
* Raise a DOMException
* Switch the decoder from the user-specified encoding to the BOM-specified encoding

The latter seems the most helpful when the proposed API is used as follows:

    var s = new TextDecoder().decode(bytes); // handles UTF-8 w/o BOM, and any UTF w/ BOM

...but it does seem a little weird when used like this:

    var d = new TextDecoder('euc-jp');
    assert(d.encoding === 'euc-jp');
    var s = d.decode(new Uint8Array([0xFE]), {stream: true});
    assert(d.encoding === 'euc-jp');
    assert(s.length === 0); // can't emit anything until the BOM is definitely passed
    s += d.decode(new Uint8Array([0xFF]), {stream: true});
    assert(d.encoding === 'utf-16be'); // really?
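For what it's worth, the one point called out above as clear - a matching BOM is silently consumed - is testable against the TextDecoder that eventually shipped in browsers and Node. (The BOM-overrides-the-label switching debated here was ultimately not adopted for this API: the decoder never changes encodings mid-stream.) A small sketch:

```javascript
// UTF-8 BOM (EF BB BF) followed by "hi": the BOM is consumed, not emitted.
const utf8 = new Uint8Array([0xEF, 0xBB, 0xBF, 0x68, 0x69]);
const s1 = new TextDecoder('utf-8').decode(utf8);
console.log(s1 === 'hi'); // true - no leading U+FEFF in the output

// UTF-16LE BOM (FF FE) followed by "hi" in UTF-16LE: same story.
const utf16le = new Uint8Array([0xFF, 0xFE, 0x68, 0x00, 0x69, 0x00]);
const s2 = new TextDecoder('utf-16le').decode(utf16le);
console.log(s2 === 'hi'); // true
```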
Re: [whatwg] StringEncoding: encode() return type looks weird in the IDL
On Sun, Aug 5, 2012 at 10:29 AM, Glenn Maynard gl...@zewt.org wrote:

I guess the brokenness of Uint16Array (e.g. the current lack of Uint16LEArray) could be sidestepped by just always returning Uint8Array, even if encoding to a 16-bit encoding (which is what it currently says to do). Maybe that's better anyway, since it avoids making UTF-16 a special case.

+1 - which is why I pushed back on returning a Uint16Array earlier in the discussion.

I guess that if you're converting a string to a UTF-16 ArrayBuffer, you're probably doing it to quickly dump it into a binary field somewhere anyway - if you wanted to *examine* the code points, you'd just look at the DOMString you started with.

+1 again, and nicely stated. When I was a potential consumer of such an API, I was happy to treat the encoded form as a black box.
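The always-return-Uint8Array resolution discussed here is what ultimately shipped; a one-line sketch against today's TextEncoder:

```javascript
// encode() always yields bytes, never a Uint16Array,
// regardless of how the string will eventually be used.
const bytes = new TextEncoder().encode('hi');
console.log(bytes instanceof Uint8Array); // true
console.log(bytes.length); // 2 ("h" and "i" are one UTF-8 byte each)
```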
Re: [whatwg] alt= and the meta name=generator exception
On 5.8.2012 15:52, Henri Sivonen wrote:

People who are not the developer of the generator use validators to assess the quality of the markup generated by the generator.

People can use tools in various ways. We cannot prevent that. But it does not need to dictate the design of tools. People can use hammers as toothpicks, but hammer manufacturers don't make hammers softer for this reason.

Or, alternatively, Alice anticipates Bob's reaction and preemptively makes her generator output alt="" before Bob ever gets to badmouth the invalidity of the generator's output.

So? Whose problem is this? Generators have generated nonsensical alt attributes for years, e.g. inserting the filename and number of bytes. Keeping the attribute required won't make much difference.

Even if we wanted to position validators as tools for the people who write markup, we can't prevent other people from using validators to judge markup output by a generator written by others. And it is appropriate to judge that the generation of HTML has problems when the markup contains img elements without alt attributes. There is no reason why this possibility should be taken away.

It is true that generator vendors can cheat by emitting alt="". We can't really prevent that. You seem to be worried about the possibility that keeping the alt attribute required somehow pushes or forces vendors into doing such things to stay competitive. But this sounds highly speculative.

We know that generators and other software may produce documents without a title element, or with a dummy or bogus title element like <title>New document</title>. And surely there are situations where an automatic generator has no way of deciding on an appropriate title element without consulting the user. So should there also be an exception allowing the omission of the title element, to avoid the assumed reaction by Alice, making her generator produce <title></title> or something worse?

Yucca
Re: [whatwg] alt= and the meta name=generator exception
Jukka K. Korpela writes:

On 5.8.2012 15:52, Henri Sivonen wrote: Alice anticipates Bob's reaction and preemptively makes her generator output alt=""

So? Whose problem is this?

It hurts users browsing without images on pages generated by that generator. If the validator can do something different which wouldn't nudge developers into writing software which produces such mark-up, end-users benefit.

Smylers -- http://twitter.com/Smylers2
Re: [whatwg] StringEncoding open issues
On Mon, Aug 6, 2012 at 11:29 AM, Joshua Bell jsb...@chromium.org wrote: [full StringEncoding open issues proposal quoted - see the original message above; snipped]
For V2 (or now, if desired), the API could be extended to accept an options object allowing for some/all of these cases;

Not introducing options for the encoder for V1 sounds like a good idea to me. However, I would definitely prefer if the default for encoding matched the default for decoding and used replacement characters rather than threw an exception. This also matches the WebSocket spec, which recently changed from throwing to using replacement characters when encoding.

The reason WebSocket was changed is that it's relatively easy to make a mistake and cause a UTF-16 surrogate pair to be cut in two, which results in an invalidly encoded DOMString. The problem with this is that it's very data-dependent and so might not happen on the developer's computer, but only in the wild when people write text which uses non-BMP characters. In such cases throwing an exception will likely result in more breakage than using a replacement character.

* Don't throw, instead emit a standard/encoding-specific replacement character (e.g. '?')

Yes, using the replacement character sounds good to me.

* Don't throw, instead emit a fixed placeholder character (byte?) sequence
* Don't throw, instead call a user-defined callback and allow it to produce a replacement escaped character sequence

The latter seems the most flexible (superset of the rest) but is probably overkill for now. Since it can be added in easily later, can we defer until we have implementer and user feedback?

Indeed, we can explore these options if the need arises.

== Byte Order Marks (BOMs) ==

Once again, the proposal builds on Anne's encoding spec, http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html , which describes the existing behavior of Web browsers. In the wild, browsers deal with a variety of mechanisms for indicating the encoding of documents (server headers, meta tags, XML preludes, etc.), many of which are blatantly incorrect or contradictory.
One form is fortunately rarely wrong - if the document is encoded in UTF-8, UTF-16LE or UTF-16BE and includes the byte order mark (the encoding-specific serialization of U+FEFF). This is built into the Encoding spec - given a byte sequence to decode and an encoding label, the label is ignored if the sequence starts with one of the three UTF BOMs, and the BOM-indicated encoding is used to decode the rest of the stream. The proposed API will have different uses, so it is unclear that this is necessary or desirable. At a minimum, it is clear that:

* If one of the UTF encodings is specified AND the BOM matches, then the leading BOM character (U+FEFF) MUST NOT be emitted in the output character sequence (i.e. it is silently consumed)

Agreed.

Less clear is the behavior in these two cases:

* If one of the UTF encodings is specified AND a different BOM is present (e.g. UTF-16LE specified, but a UTF-16BE BOM present)
* If one of the non-UTF encodings is specified AND a UTF BOM is present

Options include:

* Nothing special - the decoder does what it will with the bytes, possibly emitting garbage, possibly throwing
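As it happens, the replacement-character default Jonas argues for in this message is what implementations eventually shipped for TextEncoder: an unpaired surrogate in the DOMString is encoded as U+FFFD rather than throwing. A minimal sketch, runnable in modern browsers or Node:

```javascript
// '\uD800' is a lone high surrogate - an ill-formed DOMString, exactly
// the "surrogate pair cut in two" case described above.
const bytes = new TextEncoder().encode('\uD800');

// No exception: the lone surrogate became the UTF-8 bytes of U+FFFD.
console.log(Array.from(bytes)); // [239, 191, 189], i.e. EF BF BD
```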
Re: [whatwg] StringEncoding open issues
I agree with Jonas that encoding should just use a replacement character (U+FFFD for Unicode encodings, '?' otherwise), and that we should put off other modes (e.g. exceptions and user-specified replacement characters) until there's a clear need.

My intuition is that encoding a DOMString to UTF-16 should never have errors; if there are dangling surrogates, pass them through unchanged. There's no point in using a placeholder that says "an error occurred here" when the error can be passed through in exactly the same form (not possible with e.g. DOMString-to-SJIS). I don't feel strongly about this, only because outputting UTF-16 is so rare to begin with.

On Mon, Aug 6, 2012 at 1:29 PM, Joshua Bell jsb...@chromium.org wrote:

- if the document is encoded in UTF-8, UTF-16LE or UTF-16BE and includes the byte order mark (the encoding-specific serialization of U+FEFF).

This rarely detects the wrong type, but that doesn't mean it's not the wrong answer. If my input is meant to be UTF-8, and someone hands me BOM-marked UTF-16, I want it to fail in the same way it would if someone passed in SJIS. I don't want it silently translated. On the other hand, it probably does make sense for "UTF-16" to switch to UTF-16BE, since that's by definition the original purpose of the BOM. The convention iconv uses, which I think is a useful one, is: decoding from "UTF-16" means "try to figure out the encoding from the BOM, if any", while "UTF-16LE" and "UTF-16BE" mean "always use this exact encoding".

* If one of the UTF encodings is specified AND the BOM matches then the leading BOM character (U+FEFF) MUST NOT be emitted in the output character sequence (i.e. it is silently consumed)

It's a little weird that

    data = readFile("user-supplied-file.txt"); // shortcutting for brevity
    var s = new TextDecoder("utf-16").decode(data); // or "utf-8"
    s = s.replace("a", "b");
    var data2 = new TextEncoder("utf-16").encode(s);
    writeFile("user-supplied-file.txt", data2);

causes the BOM to be quietly stripped away.
Normally if you're modifying a file, you want to pass through the BOM (or lack thereof) untouched. One way to deal with this could be:

    var decoder = new TextDecoder("utf-16");
    var s = decoder.decode(data);
    s = s.replace("a", "b");
    var data2 = new TextEncoder(decoder.encoding).encode(s);

where decoder.encoding is e.g. "UTF-16LE-BOM" if a BOM was present, thus preserving both the BOM and (for UTF-16) the endianness. I don't actually like this, though, because I don't like the idea of decoder.encoding changing after the decoder has already been constructed. I think I agree with just stripping it, and people who want to preserve BOMs on write-through can jump through the hoops manually (which aren't terribly hard).

Another issue is new TextDecoder('ascii').encoding (and 'ISO-8859-1') giving .encoding = "windows-1252". That's strange, even when you know why it's happening. Is there any reason to expose the actual primary names? It's not clear that the name column in the Encoding spec is even intended to be exposed to APIs; the names look more like labels for specs to refer to internally. (Anne?) If there's no pressing reason to expose this, I'd suggest that the .encoding attribute simply return the name that was passed to the constructor. It's still not ideal (it's weird that asking for ASCII gives you something other than ASCII in the first place), but it at least seems a bit less strange. The nice fix would be to implement actual ASCII, ISO-8859-1, ISO-8859-9, etc. charsets, but that just means extra implementation work (and some charset proliferation) without use cases.

-- Glenn Maynard
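Glenn's 'ascii' observation is easy to reproduce against the API that eventually shipped, which kept the behavior he finds strange: .encoding reports the Encoding Standard's canonical name for whatever the label resolved to, not the label you passed. A sketch (assumes an implementation with the full encoding set, e.g. Node built with full ICU):

```javascript
// 'ascii' is merely a label for the windows-1252 encoding...
const d = new TextDecoder('ascii');
console.log(d.encoding); // "windows-1252" - the canonical name wins

// ...and so is 'iso-8859-1'.
const d2 = new TextDecoder('iso-8859-1');
console.log(d2.encoding); // also "windows-1252"
```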
[whatwg] iframe sandbox and indexedDB
Hi,

the spec at http://www.whatwg.org/specs/web-apps/current-work/multipage/origin-0.html#sandboxed-origin-browsing-context-flag says:

"This flag also prevents script from reading from or writing to the document.cookie IDL attribute, and blocks access to localStorage."

it seems that indexedDB access should also be blocked when this flag is set (i.e. when 'allow-same-origin' is NOT specified for the sandbox attribute). i intend to implement this restriction in Gecko; feedback from other implementors is welcome :)

thanks!
Ian
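For context, a sketch of the markup in question (widget.html is a hypothetical framed page): with allow-scripts but without allow-same-origin, the framed document runs its script in a unique origin, so under this proposal its storage APIs - localStorage and, per the suggestion here, indexedDB - would be blocked.

```html
<!-- widget.html gets a unique (opaque) origin: cookies, localStorage,
     and (per this thread) indexedDB are all unavailable to it. -->
<iframe sandbox="allow-scripts" src="widget.html"></iframe>

<!-- adding allow-same-origin restores the normal origin, and with it
     origin-keyed storage access. -->
<iframe sandbox="allow-scripts allow-same-origin" src="widget.html"></iframe>
```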
Re: [whatwg] iframe sandbox and indexedDB
On Mon, Aug 6, 2012 at 5:08 PM, Ian Melven imel...@mozilla.com wrote:

the spec at http://www.whatwg.org/specs/web-apps/current-work/multipage/origin-0.html#sandboxed-origin-browsing-context-flag says: "This flag also prevents script from reading from or writing to the document.cookie IDL attribute, and blocks access to localStorage." it seems that indexedDB access should also be blocked when this flag is set (i.e. when 'allow-same-origin' is NOT specified for the sandbox attribute).

Yes. I think this is actually a consequence of having a unique origin and doesn't need to be stated explicitly in the spec. (Although we might want to state it explicitly for the avoidance of doubt.)

The reason document.cookie needs to be called out explicitly is that it doesn't use the document's origin to determine which cookies to access: it uses the document's URL. We need to do that because cookies ignore the port but do care about the path part of the document's URL. (The better pattern for new APIs is to use the origin, which is what IndexedDB does.)

i intend to implement this restriction in Gecko; feedback from other implementors is welcome :)

Great.

Adam
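The distinction Adam draws can be made concrete by looking at which pieces of a URL each mechanism keys on. A simplified sketch (real cookie matching involves domain attributes and more; this only illustrates the port-vs-path difference):

```javascript
const url = new URL('https://example.com:8443/app/page.html');

// Origin-keyed APIs (IndexedDB, localStorage) use scheme://host:port.
const origin = url.origin;
console.log(origin); // "https://example.com:8443"

// Cookies ignore the port but do care about the path.
const cookieScope = url.hostname + url.pathname;
console.log(cookieScope); // "example.com/app/page.html"
```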
Re: [whatwg] iframe sandbox and indexedDB
Hi,

- Original Message -
From: Adam Barth w...@adambarth.com
To: Ian Melven imel...@mozilla.com
Cc: whatwg@lists.whatwg.org
Sent: Monday, August 6, 2012 5:12:40 PM
Subject: Re: [whatwg] iframe sandbox and indexedDB

Yes. I think this is actually a consequence of having a unique origin and doesn't need to be stated explicitly in the spec. (Although we might want to state it explicitly for the avoidance of doubt.)

yeah, i can see how this situation's behavior could be implementation-dependent - some implementations might allow storing data for the unique origin, which seems undesirable. So it might be worth stating the restriction explicitly, as it is for localStorage.

thank you for the clarification on why document.cookie is explicitly called out :)

thanks,
ian
[whatwg] StringEncoding: Allowed encodings for TextEncoder
Hi All,

I seem to have a recollection that we discussed only allowing encoding to UTF-8, UTF-16LE and UTF-16BE. This was in order to promote these formats, as well as to stay in sync with other APIs like XMLHttpRequest. However, I currently can't find any restrictions on which target encodings are supported in the current drafts.

One wrinkle is that if we want to support arbitrary encodings when encoding, we can't use the replacement character as the default error handling, since that character isn't available in a lot of encoding formats.

/ Jonas
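For the record, the API as eventually shipped resolved this even more strictly than the recollection above: TextEncoder encodes to UTF-8 only - the constructor takes no encoding label at all, so UTF-16LE/BE were dropped too - which sidesteps the replacement-character wrinkle entirely. A sketch against today's implementations:

```javascript
const enc = new TextEncoder();
console.log(enc.encoding); // always "utf-8"

// There is no way to request another target encoding; an argument,
// if passed, is simply ignored by shipped implementations.
console.log(new TextEncoder('utf-16le').encoding); // still "utf-8"
```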