Re: [whatwg] register*Handler and Web Intents

2012-08-06 Thread rektide
Hi,

Is there any ability to pass a MessageChannel Port in as an IntentSetting, or
out in the success handler? Is there any facility to allow multi-part
communications to an activity? For example, Sony does this in their Local UPnP
Service Discovery Web Intents scheme:
http://www.w3.org/wiki/images/2/2e/V4_W3C_Web_Intents_-_Local_UPnP_Service_Discovery.pdf#page=15

This is really the only way to get out of the one-shot request/response model,
which is extremely important to me and to the general versatility of this
interoperability mechanism.

 The callbacks given in the method, if provided, are invoked asynchronously 
 in reaction to the handler calling success() or failure() on the Intent 
 object. We would just allow one success or failure message per Intent, 
 for sanity's sake.

I'd far prefer a model not based up front on the one-shot model: an Intent
ought to be a SharedWorker in terms of interactions with the page (although
more Intent-oriented in instantiation), crossed with the recent notion of
Chrome's Packaged App: a stand-alone, contained experience. This is a radical
turn I would justify as due to its more general-purpose interaction model. It
also invents far less: SharedWorkers just need some interface, and
presto-chango, we have the perfect Intents, rather than an entirely new custom
suite of interaction models tailored to a more limited one-shot use case.

I would be happy to make this proposal more concrete. Although I reference
Packaged App as a good model, the ultimate implementation could be merely
a new web browser page whose Window implements SharedWorker:

interface SharedWorker : EventTarget {
  readonly attribute MessagePort port;
};
SharedWorker implements AbstractWorker;

interface AbstractWorker {
  attribute EventHandler onerror;
};


If we want to stick with the current one-shot model, I'd recommend chainable
Intents:

callback AnyCallback = void (any data, Intentable continuator);
interface Intentable {
  void startIntent(IntentSettings intent,
                   optional AnyCallback success,
                   optional DOMStringCallback failure);
};
Window implements Intentable;

This serves the multi-part use case:

window.startIntent({action: 'control-point'}, function(cpData, myPanel) {
  myPanel.startIntent({action: 'play', data: {}});
});

Note that if the registration page does not register both of these intents,
the nested startIntent will fail:

<intent action="control-point" scheme="?" title="RCA Control Panel" />
<intent action="play" scheme="?" title="Play on RCA TV" />

The desire I wish to express is creating a context which can be continued. The
explicit use of Intentable ensures that only the previous handler will be
able to handle the new request. I'd seek a more formal mechanism to officially
carry on the continuation: informally, cpData could hold a token which could
be passed into the data of myPanel's play startIntent, but this ad-hoc
continuation is a weak way to hold a reasonable conversation.

In parting, I wish to thank Sony for showing the utmost pragmatism in their
design. I appreciate their two approaches to this problem, and for showing
what a real service discovery use case looks like.

Fair regards, delighted to see this topic being talked about, please let me
know how I can aid,
rektide


Re: [whatwg] Missing alt attribute name bikeshedding (was Re: alt= and the meta name=generator exception)

2012-08-06 Thread Glenn Maynard
On Mon, Aug 6, 2012 at 4:17 AM, Odin Hørthe Omdal odi...@opera.com wrote:

 IMHO generator-unable-to-provide-required-alt in all its ugliness is a
 really nice feature, because how would anyone in their sane mind write
 that. It's really made for a corner case, and if you really really want
 that, you should be prepared to deal with the ugliness, because what you
 are doing is ugly in the first place...


Making things ugly on purpose is always a bad idea.  Either it has valid
use cases, and it should be a clean, well-designed feature, or it doesn't,
and it shouldn't be there at all.  Please don't go down this path; we have
more than enough ugliness by accident without doing it on purpose.

-- 
Glenn Maynard


Re: [whatwg] register*Handler and Web Intents

2012-08-06 Thread Henri Sivonen
On Fri, Aug 3, 2012 at 12:00 PM, James Graham jgra...@opera.com wrote:
 I agree with Henri that it is
 extremely worrying to allow aesthetic concerns to trump backward
 compatibility here.

Letting aesthetic concerns trump backward compat is indeed troubling.
It's also troubling that this even needs to be debated, considering
that we're supposed to have a common understanding of the design
principles and the design principles pretty clearly uphold backward
compatibility over aesthetics.

 I would also advise strongly against using position in DOM to detect intents
 support; if you insist on adding a new void element I will strongly
 recommend that we add it to the parser asap to try and mitigate the above
 breakage, irrespective of whether our plans for the rest of the intent
 mechanism.

I think the compat story for new void elements is so bad that we
shouldn't add new void elements. (<source> gets away with being a void
element, because the damage is limited by the </video> or </audio> end
tag that comes soon enough after <source>.) I think we also shouldn't
add new elements that don't imply <body> when appearing in <head>.

It's great that browsers have converged on the parsing algorithm.
Let's not break what we've achieved to cater to aesthetics.

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/


Re: [whatwg] StringEncoding: encode() return type looks weird in the IDL

2012-08-06 Thread Joshua Bell
On Sun, Aug 5, 2012 at 11:44 AM, Boris Zbarsky bzbar...@mit.edu wrote:

 On 8/5/12 1:39 PM, Glenn Maynard wrote:

 I didn't say it was extensibility, just a leftover from something that
 was either considered and dropped or forgotten about.


 Oh, I see.  I thought you were talking about leaving the return value
 as-is so that Uint16Array return values can be added later.

 I'd vote for changing the return type to Uint8Array as things stand, and
 if we ever change what the function can return, we change the return type
 at that point.


Thanks. Yes, having the return type be ArrayBufferView in the IDL is just a
leftover. Fixing it now to be Uint8Array.

I'll start another thread on StringEncoding shortly summarizing open
issues, but anyone reading this thread is encouraged to take a look at
http://wiki.whatwg.org/wiki/StringEncoding and craft opinions.


[whatwg] StringEncoding open issues

2012-08-06 Thread Joshua Bell
Regarding the API proposal at: http://wiki.whatwg.org/wiki/StringEncoding

It looks like we've got some developer interest in implementing this, and
need to nail down the open issues. I encourage folks to look over the
"Resolved" issues in the wiki page and make sure the resolutions - gathered
from loose consensus here and offline discussion - are truly resolved, or
flag anything that is not future-proof and should block implementations from
proceeding. Also, look at the "Notes to Implementers" section; this should
be non-controversial but may be non-obvious.

This leaves two open issues: behavior on encoding error, and handling of
Byte Order Marks (BOMs).

== Encoding Errors ==

The proposal builds on Anne's encoding spec
(http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html),
which defines when encodings should emit an encoder error. In that spec
(which describes the existing behavior of Web browsers) encoders are used
in a limited fashion, e.g. for encoding form results before submission via
HTTP, and hence the cases are much more restricted than the errors
encountered when browsers are asked to decode content from the wild. As
noted, the encoding process could terminate when an error is emitted.
Alternately (and as is necessary for forms, etc) there is a
use-case-specific escaping mechanism for non-encodable code points.

The proposed TextDecoder object takes a TextDecoderOptions options object with a
|fatal| flag that controls the decode behavior in case of error - if
|fatal| is unset (default) a decode error produces a fallback character
(U+FFFD); if |fatal| is set then a DOMException is raised instead.
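A sketch of the proposed decode behavior, using the constructor and option names from the wiki proposal (note: implementations that later shipped this throw a TypeError rather than a DOMException in the fatal case, so the sketch only inspects the error's name):

```javascript
// 0xFF can never appear in well-formed UTF-8.
const bytes = new Uint8Array([0x41, 0xFF, 0x42]);

// Default: errors become U+FFFD replacement characters.
const lenient = new TextDecoder('utf-8');
console.log(lenient.decode(bytes)); // "A\uFFFDB"

// With |fatal| set, the same input throws instead.
const strict = new TextDecoder('utf-8', { fatal: true });
try {
  strict.decode(bytes);
} catch (e) {
  console.log('decode failed: ' + e.name);
}
```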

No such option is currently proposed for the TextEncoder object; the
proposal dictates that a DOMException is thrown if the encoder emits an
error. I believe this is sufficient for V1, but want feedback. For V2 (or
now, if desired), the API could be extended to accept an options object
allowing for some/all of these cases:

* Don't throw, instead emit a standard/encoding-specific replacement
character (e.g. '?')
* Don't throw, instead emit a fixed placeholder character (byte?) sequence
* Don't throw, instead call a user-defined callback and allow it to produce
a replacement escaped character sequence, e.g. &#x...;

The latter seems the most flexible (a superset of the rest) but is probably
overkill for now. Since it can easily be added later, can we defer until
we have implementer and user feedback?
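As a sketch of what the callback option could look like if layered on top of a plain UTF-8 encoder - the wrapper name and callback signature below are illustrative, not part of the proposal:

```javascript
// Hypothetical sketch: a fallback-callback wrapper around TextEncoder.
// Lone surrogates are treated as non-encodable; the callback supplies a
// replacement sequence (e.g. a numeric character reference).
function encodeWithFallback(str, fallback) {
  const enc = new TextEncoder(); // UTF-8
  let out = '';
  for (const ch of str) {
    const cp = ch.codePointAt(0);
    if (cp >= 0xD800 && cp <= 0xDFFF) {
      out += fallback(cp); // non-encodable: ask the callback
    } else {
      out += ch;
    }
  }
  return enc.encode(out);
}

// Escape non-encodable code points as &#x...;-style references.
const bytes = encodeWithFallback('a\uD800b',
    cp => '&#x' + cp.toString(16).toUpperCase() + ';');
```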


== Byte Order Marks (BOMs) ==

Once again, the proposal builds on Anne's encoding spec
(http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html),
which describes the existing behavior of Web browsers. In the wild,
browsers deal with a variety of mechanisms for indicating the encoding of
documents (server headers, meta tags, XML preludes, etc), many of which are
blatantly incorrect or contradictory. One form is fortunately rarely wrong
- if the document is encoded in UTF-8, UTF-16LE or UTF-16BE and includes
the byte order mark (the encoding-specific serialization of U+FEFF). This
is built into the Encoding spec - given a byte sequence to decode and an
encoding label, the label is ignored if the sequence starts with one of the
three UTF BOMs, and the BOM-indicated encoding is used to decode the rest
of the stream.

The proposed API will have different uses, so it is unclear that this is
necessary or desirable.

At a minimum, it is clear that:

* If one of the UTF encodings is specified AND the BOM matches then the
leading BOM character (U+FEFF) MUST NOT be emitted in the output character
sequence (i.e. it is silently consumed)
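For UTF-8, this minimum behavior can be demonstrated with the proposed API shape:

```javascript
// UTF-8 BOM (EF BB BF) followed by "hi": the BOM is consumed, and no
// U+FEFF appears in the decoded output.
const withBom = new Uint8Array([0xEF, 0xBB, 0xBF, 0x68, 0x69]);
const s = new TextDecoder('utf-8').decode(withBom);
console.log(s);        // "hi"
console.log(s.length); // 2
```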

Less clear is the behavior in these two cases:

* If one of the UTF encodings is specified AND a different BOM is
present (e.g. UTF-16LE specified but a UTF-16BE BOM found)
* If one of the non-UTF encodings is specified AND a UTF BOM is present

Options include:
* Nothing special - decoder does what it will with the bytes, possibly
emitting garbage, possibly throwing
* Raise a DOMException
* Switch the decoder from the user-specified encoding to the BOM-specified
encoding

The latter seems the most helpful when the proposed API is used as follows:

var s = new TextDecoder().decode(bytes); // handles UTF-8 w/o BOM and any
UTF w/ BOM

... but it does seem a little weird when used like this:

var d = new TextDecoder('euc-jp');
assert(d.encoding === 'euc-jp');
var s = d.decode(new Uint8Array([0xFE]), {stream: true});
assert(d.encoding === 'euc-jp');
assert(s.length === 0); // can't emit anything until BOM is definitely
passed
s += d.decode(new Uint8Array([0xFF]), {stream: true});
assert(d.encoding === 'utf-16be'); // really?


Re: [whatwg] StringEncoding: encode() return type looks weird in the IDL

2012-08-06 Thread Joshua Bell
On Sun, Aug 5, 2012 at 10:29 AM, Glenn Maynard gl...@zewt.org wrote:

 I guess the brokenness of Uint16Array (eg. the current lack of
 Uint16LEArray) could be sidestepped by just always returning Uint8Array,
 even if encoding to a 16-bit encoding (which is what it currently says to
 do).  Maybe that's better anyway, since it avoids making UTF-16 a special
 case.


+1 - which is why I pushed back on returning a Uint16Array earlier in the
discussion.


  I guess that if you're converting a string to a UTF-16 ArrayBuffer,
 you're probably doing it to quickly dump it into a binary field somewhere
 anyway--if you wanted to *examine* the codepoints, you'd just look at the
 DOMString you started with.


+1 again, and nicely stated. When I was a potential consumer of such an
API, I was happy to treat the encoded form as a black box.


Re: [whatwg] alt= and the meta name=generator exception

2012-08-06 Thread Jukka K. Korpela

On 5.8.2012 15:52, Henri Sivonen wrote:

> People who are not the developer of the generator use validators to
> assess the quality of the markup generated by the generator.

People can use tools in various ways. We cannot prevent that. But it 
does not need to dictate the design of tools. People can use hammers as 
toothpicks, but hammer manufacturers don't make hammers softer for this 
reason.



> Or, alternatively, Alice anticipates Bob's reaction and preemptively
> makes her generator output alt="" before Bob ever gets to badmouth
> about the invalidity of the generator's output.

So? Whose problem is this? Generators have generated nonsensical alt 
attributes for years, e.g. inserting the filename and number of bytes. 
Keeping the attribute required won't make much difference.



> Even if we wanted to position validators as tools for the people who
> write markup, we can't prevent other people from using validators to
> judge markup output by generators written by others.

And it is appropriate to judge that generation of HTML has problems, 
when the markup contains img elements without alt attributes. There is 
no reason why this possibility should be taken away. It is true that 
generator vendors can cheat by emitting alt="". We can't really prevent 
that. You seem to be worried that keeping the alt attribute required 
somehow pushes or forces vendors into doing such things to stay 
competitive. But this sounds highly speculative.


We know that generators and other software may produce documents without 
a title element or with a dummy or bogus title element like <title>New 
document</title>. And surely there are situations where an automatic 
generator has no way of deciding on an appropriate title element without 
consulting the user. So should there also be an exception allowing the 
omission of the title element, to avoid the assumed reaction by Alice, 
making her generator produce <title></title> or something worse?


Yucca


Re: [whatwg] alt= and the meta name=generator exception

2012-08-06 Thread Smylers
Jukka K. Korpela writes:

 On 5.8.2012 15:52, Henri Sivonen wrote:
 
  Alice anticipates Bob's reaction and preemptively makes her
  generator output alt=""
 
 So? Whose problem is this?

It hurts users browsing without images of pages generated by that
generator.

If the validator can do something different which wouldn't nudge
developers into writing software which produces such mark-up, end-users
benefit.

Smylers
-- 
http://twitter.com/Smylers2


Re: [whatwg] StringEncoding open issues

2012-08-06 Thread Jonas Sicking
On Mon, Aug 6, 2012 at 11:29 AM, Joshua Bell jsb...@chromium.org wrote:
 Regarding the API proposal at: http://wiki.whatwg.org/wiki/StringEncoding

 It looks like we've got some developer interest in implementing this, and
 need to nail down the open issues. I encourage folks to look over the
 Resolved issues in the wiki page and make sure the resolutions - gathered
 from loose consensus here and offline discussion - are truly resolved or if
 anything is not future-proof and should block implementations from
 proceeding. Also, look at the Notes to Implementers section; this should
 be non-controversial but may be non-obvious.

 This leaves two open issues: behavior on encoding error, and handling of
 Byte Order Marks (BOMs)

 == Encoding Errors ==

 The proposal builds on Anne's
 http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html encoding spec,
 which defines when encodings should emit an encoder error. In that spec
 (which describes the existing behavior of Web browsers) encoders are used
 in a limited fashion, e.g. for encoding form results before submission via
 HTTP, and hence the cases are much more restricted than the errors
 encountered when browsers are asked to decode content from the wild. As
 noted, the encoding process could terminate when an error is emitted.
 Alternately (and as is necessary for forms, etc) there is a
 use-case-specific escaping mechanism for non-encodable code points.

 The proposed TextDecoder object takes a TextDecoderOptions options with a
 |fatal| flag that controls the decode behavior in case of error - if
 |fatal| is unset (default) a decode error produces a fallback character
 (U+FFFD); if |fatal| is set then a DOMException is raised instead.

 No such option is currently proposed for the TextEncoder object; the
 proposal dictates that a DOMException is thrown if the encoder emits an
 error. I believe this is sufficient for V1, but want feedback. For V2 (or
 now, if desired), the API could be extended to accept an options object
 allowing for some/all of these cases;

Not introducing options for the encoder for V1 sounds like a good idea
to me. However I would definitely prefer it if the default for encoding
matched the default for decoding and used replacement characters
rather than throwing an exception.

This also matches the WebSocket spec, which recently
changed from throwing to using replacement characters when encoding.

The reason WebSocket was changed was that it's relatively easy to
make a mistake and cause a UTF-16 surrogate pair to be cut in two, which
results in an invalidly encoded DOMString. The problem with this is
that it's very data-dependent, and so might not happen on the
developer's computer but only in the wild, when people write text
which uses non-BMP characters. In such cases throwing an exception
will likely result in more breakage than using a replacement
character.
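The failure mode described here is easy to reproduce; the sketch below assumes the replacement behavior being argued for (which is what shipped TextEncoder implementations do):

```javascript
// Chunking a JS string by UTF-16 code units can split a non-BMP
// character (here U+1F600, stored as a surrogate pair).
const msg = 'abc\u{1F600}';      // length 5: 3 chars + surrogate pair
const chunk = msg.slice(0, 4);   // ends with a lone high surrogate
// With replacement semantics, the unpaired surrogate becomes U+FFFD
// (EF BF BD in UTF-8) instead of throwing.
const encoded = new TextEncoder().encode(chunk);
const roundTrip = new TextDecoder().decode(encoded);
console.log(roundTrip === 'abc\uFFFD'); // true
```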

 * Don't throw, instead emit a standard/encoding-specific replacement
 character (e.g. '?')

Yes, using the replacement character sounds good to me.

 * Don't throw, instead emit a fixed placeholder character (byte?) sequence
 * Don't throw, instead call a user-defined callback and allow it to produce
 a replacement escaped character sequence, e.g. #x;

 The latter seems the most flexible (superset of the rest) but is probably
 overkill for now. Since it can be added in easily later, can we defer until
 we have implementer and user feedback?

Indeed, we can explore these options if the need arises.

 == Byte Order Marks (BOMs) ==

 Once again, the proposal builds on Anne's
 http://dvcs.w3.org/hg/encoding/raw-file/tip/Overview.html encoding spec,
 which describes the existing behavior of Web browsers. In the wild,
 browsers deal with a variety of mechanisms for indicating the encoding of
 documents (server headers, meta tags, XML preludes, etc), many of which are
 blatantly incorrect or contradictory. One form is fortunately rarely wrong
 - if the document is encoded in UTF-8, UTF-16LE or UTF-16BE and includes
 the byte order mark (the encoding-specific serialization of U+FEFF). This
 is built into the Encoding spec - given a byte sequence to decode and an
 encoding label, the label is ignored if the sequence starts with one of the
 three UTF BOMs, and the BOM-indicated encoding is used to decode the rest
 of the stream.

 The proposed API will have different uses, so it is unclear that this is
 necessary or desirable.

 At a minimum, it is clear that:

 * If one of the UTF encodings is specified AND the BOM matches then the
 leading BOM character (U+FEFF) MUST NOT be emitted in the output character
 sequence (i.e. it is silently consumed)

Agreed.

 Less clear is this behavior in these two cases.

 * If one of the UTF encodings is specified AND a different BOM is
 present (e.g. UTF-16LE but a UTF-16BE BOM)
 * If one of the non-UTF encodings is specified AND a UTF BOM is present

 Options include:
 * Nothing special - decoder does what it will with the bytes, 

Re: [whatwg] StringEncoding open issues

2012-08-06 Thread Glenn Maynard
I agree with Jonas that encoding should just use a replacement character
(U+FFFD for Unicode encodings, '?' otherwise), and that we should put off
other modes (eg. exceptions and user-specified replacement characters)
until there's a clear need.

My intuition is that encoding a DOMString to UTF-16 should never have errors:
if there are dangling surrogates, pass them through unchanged.  There's no
point in using a placeholder that says "an error occurred here" when the
error can be passed through in exactly the same form (not possible with e.g.
DOMString→SJIS).  I don't feel strongly about this only because outputting
UTF-16 is so rare to begin with.

On Mon, Aug 6, 2012 at 1:29 PM, Joshua Bell jsb...@chromium.org wrote:

 - if the document is encoded in UTF-8, UTF-16LE or UTF-16BE and includes
 the byte order mark (the encoding-specific serialization of U+FEFF).


This rarely detects the wrong type, but that doesn't mean it's not the
wrong answer.  If my input is meant to be UTF-8, and someone hands me
BOM-marked UTF-16, I want it to fail in the same way it would if someone
passed in SJIS.  I don't want it silently translated.

On the other hand, it probably does make sense for UTF-16 to switch to
UTF-16BE, since that's by definition the original purpose of the BOM.

The convention iconv uses, which I think is a useful one, is that decoding
from "UTF-16" means "try to figure out the encoding from the BOM, if any",
while "UTF-16LE" and "UTF-16BE" mean "always use this exact encoding".

 * If one of the UTF encodings is specified AND the BOM matches then the
 leading BOM character (U+FEFF) MUST NOT be emitted in the output character
 sequence (i.e. it is silently consumed)


It's a little weird that

data = readFile("user-supplied-file.txt"); // shortcutting for brevity
var s = new TextDecoder("utf-16").decode(data); // or "utf-8"
s = s.replace("a", "b");
var data2 = new TextEncoder("utf-16").encode(s);
writeFile("user-supplied-file.txt", data2);

causes the BOM to be quietly stripped away.  Normally if you're modifying a
file, you want to pass through the BOM (or lack thereof) untouched.

One way to deal with this could be:

var decoder = new TextDecoder("utf-16");
var s = decoder.decode(data);
s = s.replace("a", "b");
var data2 = new TextEncoder(decoder.encoding).encode(s);

where decoder.encoding is e.g. "UTF-16LE-BOM" if a BOM was present, thus
preserving both the BOM and (for UTF-16) endianness.  I don't actually like
this, though, because I don't like the idea of decoder.encoding changing
after the decoder has already been constructed.

I think I agree with just stripping it, and people who want to preserve
BOMs on write-through can jump the hoops manually (which aren't terribly
hard).
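Those hoops might look something like this for UTF-8 (the helper names are illustrative, not a proposed API):

```javascript
// Remember whether the input began with a BOM, and re-prepend it when
// writing back out. Shown for the UTF-8 BOM (EF BB BF) only.
function decodePreservingBom(bytes) {
  const hadBom = bytes.length >= 3 &&
    bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF;
  return { text: new TextDecoder('utf-8').decode(bytes), hadBom };
}

function encodeRestoringBom(text, hadBom) {
  const body = new TextEncoder().encode(text);
  if (!hadBom) return body;
  const out = new Uint8Array(body.length + 3);
  out.set([0xEF, 0xBB, 0xBF], 0); // restore the BOM
  out.set(body, 3);
  return out;
}
```

With this, a BOM-marked file round-trips through a decode/modify/encode cycle with its BOM intact.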


Another issue is new TextDecoder('ascii').encoding (and 'ISO-8859-1')
giving .encoding === "windows-1252".  That's strange, even when you know why
it's happening.
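Concretely, several legacy labels all normalize to the same encoding under the Encoding spec's label table:

```javascript
// Label-to-encoding normalization: these distinct labels all resolve
// to the windows-1252 encoding.
console.log(new TextDecoder('ascii').encoding);      // "windows-1252"
console.log(new TextDecoder('iso-8859-1').encoding); // "windows-1252"
console.log(new TextDecoder('latin1').encoding);     // "windows-1252"
```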

Is there any reason to expose the actual primary names?  It's not clear
that the "name" column in the Encoding spec is even intended to be exposed
to APIs; the names look more like labels for specs to refer to internally.
(Anne?)  If there's no pressing reason to expose this, I'd suggest that the
.encoding attribute simply return the name that was passed to the
constructor.

It's still not ideal (it's weird that asking for ASCII gives you something
other than ASCII in the first place), but it at least seems a bit less
strange.  The nice fix would be to implement actual ASCII, ISO-8859-1,
ISO-8859-9, etc. charsets, but that just means extra implementation work
(and some charset proliferation) without use cases.

-- 
Glenn Maynard


[whatwg] iframe sandbox and indexedDB

2012-08-06 Thread Ian Melven

Hi,

the spec at 
http://www.whatwg.org/specs/web-apps/current-work/multipage/origin-0.html#sandboxed-origin-browsing-context-flag
says :

This flag also prevents script from reading from or writing to the 
document.cookie IDL attribute, and blocks access to localStorage.

it seems that indexedDB access should also be blocked when this flag is set (i.e. 
when 'allow-same-origin' is NOT specified for the sandbox attribute).

i intend to implement this restriction in Gecko, feedback from other 
implementors is welcome :)

thanks !
Ian


Re: [whatwg] iframe sandbox and indexedDB

2012-08-06 Thread Adam Barth
On Mon, Aug 6, 2012 at 5:08 PM, Ian Melven imel...@mozilla.com wrote:
 the spec at 
 http://www.whatwg.org/specs/web-apps/current-work/multipage/origin-0.html#sandboxed-origin-browsing-context-flag
 says :

 This flag also prevents script from reading from or writing to the 
 document.cookie IDL attribute, and blocks access to localStorage.

 it seems that indexedDB access should also be blocked when this flag is set 
 (ie when 'allow-same-origin' is NOT specified for the sandbox attribute).

Yes.  I think this is actually a consequence of having a unique origin
and doesn't need to be stated explicitly in the spec.  (Although we
might want to state it explicitly for the avoidance of doubt.)

The reason document.cookie needs to be called out explicitly is that it
doesn't use the document's origin to determine which cookies to
access: it uses the document's URL.  We need to do that because cookies
ignore the port but do care about the path part of the document's URL.
 (The better pattern for new APIs is to use the origin, which is what
IndexedDB does.)

 i intend to implement this restriction in Gecko, feedback from other 
 implementors is welcome :)

Great.

Adam


Re: [whatwg] iframe sandbox and indexedDB

2012-08-06 Thread Ian Melven

Hi,

- Original Message -
From: Adam Barth w...@adambarth.com
To: Ian Melven imel...@mozilla.com
Cc: whatwg@lists.whatwg.org
Sent: Monday, August 6, 2012 5:12:40 PM
Subject: Re: [whatwg] iframe sandbox and indexedDB

 Yes.  I think this is actually a consequence of having a unique origin
 and doesn't need to be stated explicitly in the spec.  (Although we
 might want to state it explicitly for the avoidance of doubt.)

yeah, i can see how the behavior in this situation could be 
implementation-dependent - some implementations might allow storing data 
for the unique origin, which seems undesirable. So it might be worth 
stating the restriction explicitly, as it is for localStorage. 

thank you for the clarification on why document.cookie is explicitly called out 
:)

thanks,
ian



[whatwg] StringEncoding: Allowed encodings for TextEncoder

2012-08-06 Thread Jonas Sicking
Hi All,

I seem to recall that we discussed only allowing encoding
to UTF-8, UTF-16LE and UTF-16BE. This was in order to promote these formats
as well as to stay in sync with other APIs like XMLHttpRequest.

However I currently can't find any restrictions on which target
encodings are supported in the current drafts.

One wrinkle is that if we want to support arbitrary target encodings
when encoding, we can't use the replacement character as the default
error handling, since that character isn't available in a lot of
encoding formats.
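A sketch of the issue: U+FFFD simply has no byte sequence in a non-Unicode target, so the only in-band fallback is something like '?'. The helper below is illustrative, not a proposed API:

```javascript
// For an ASCII-only target, any code point above 0x7F has no
// representation; fall back to '?' instead of U+FFFD.
function toAsciiWithQuestionMarks(str) {
  let out = '';
  for (const ch of str) {
    out += ch.codePointAt(0) <= 0x7F ? ch : '?';
  }
  return out;
}
console.log(toAsciiWithQuestionMarks('snow\u2603man')); // "snow?man"
```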

/ Jonas