Re: String to ArrayBuffer

2012-01-12 Thread Charles Pritchard

On 1/12/2012 10:03 AM, Tab Atkins Jr. wrote:
> On Thu, Jan 12, 2012 at 9:54 AM, Charles Pritchard wrote:
>> I don't see it being a particularly bad thing if vendors expose more
>> translation encodings. I've only come across one project that would use
>> them. Binary and utf8 handle everything else I've come across, and I can use
>> them to build character maps for the rest, if I ever hit another strange
>> project that needs them.
>
> As always, the problem is that if one browser supports an encoding
> that no one else does, then content will be written that depends on
> that encoding, and thus is locked into that browser.  Other browsers
> will then feel competitive pressure to support the encoding, so that
> the content works on them as well.  Repeat this for the union of
> encodings that every browser supports.
>
> It's not necessarily true that this will happen for every single
> encoding.  History shows us that it will probably happen with at least
> *several* encodings, if nothing is done to prevent it.  But there's no
> reason to risk it, when we can legislate against it and even test for
> common things that browsers *might* support.



Count me as agnostic. I'm fine with simple. I'd like to see MS and Apple 
chime in on this issue.


Here's the "worst case" as I understand it being presented:
http://www.php.net/manual/en/mbstring.supported-encodings.php
http://php.net/manual/en/function.mb-convert-encoding.php


-Charles




Re: String to ArrayBuffer

2012-01-12 Thread Charles Pritchard
On Jan 12, 2012, at 9:17 AM, Glenn Adams  wrote:

> 
> 
> On Thu, Jan 12, 2012 at 10:10 AM, Tab Atkins Jr.  wrote:
> On Thu, Jan 12, 2012 at 8:59 AM, Glenn Adams  wrote:
> > On Thu, Jan 12, 2012 at 3:49 AM, Henri Sivonen  wrote:
> >> On Thu, Jan 12, 2012 at 1:12 AM, Kenneth Russell  wrote:
> >> > The StringEncoding proposal is the best path forward because it
> >> > provides correct behavior in all cases.
> >>
> >> Do you mean this one? http://wiki.whatwg.org/wiki/StringEncoding
> >>
> >> I see the following problems after a cursory glance:
> >>  4) It says "Browsers MAY support additional encodings." This is a
> >> huge non-interoperability loophole. The spec should have a small and
> >> fixed set of supported encodings that everyone MUST support and
> >> supporting other encodings should be a "MUST NOT".
> >
> >
> > In practice, it will be impractical if not impossible to enforce such a
> > dictum "MUST NOT support other encodings". Implementers will support
> > whatever they like when it comes to character encodings, both for
> > interchange, runtime storage, and persistent storage.
> 
> Actually, such requirements often work relatively well.  Many
> implementors recognize the pain caused by race-to-the-bottom support
> for random encodings.
> 
> I speak of enforcement. Will there be test cases to check for support of a 
> random encoding not included in a blessed list? I suspect not.

If it is an issue, an array of strings from a getSupportedEncodings method 
would solve that one.
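
A minimal sketch of how a page might use such a list. getSupportedEncodings is hypothetical (it is only floated in this thread and is not taken from the StringEncoding proposal), so the list is passed in as a plain array; pickEncoding and the fallback to UTF-8 are likewise illustrative assumptions.

```javascript
// Hypothetical: `supported` stands in for the result of a
// getSupportedEncodings() call, which is only an idea from this thread.
function pickEncoding(preferred, supported) {
  // Compare encoding names case-insensitively.
  var known = supported.map(function (name) { return name.toLowerCase(); });
  for (var i = 0; i < preferred.length; i++) {
    if (known.indexOf(preferred[i].toLowerCase()) !== -1) {
      return preferred[i];
    }
  }
  // Assumed baseline: UTF-8 is taken to be supported everywhere.
  return 'utf-8';
}
```

Content written this way degrades to UTF-8 instead of depending on one browser's extra encodings.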

I don't see it being a particularly bad thing if vendors expose more 
translation encodings. I've only come across one project that would use them. 
Binary and utf8 handle everything else I've come across, and I can use them to 
build character maps for the rest, if I ever hit another strange project that 
needs them.

-Charles

Re: String to ArrayBuffer

2012-01-12 Thread Tab Atkins Jr.
On Thu, Jan 12, 2012 at 9:54 AM, Charles Pritchard  wrote:
> I don't see it being a particularly bad thing if vendors expose more
> translation encodings. I've only come across one project that would use
> them. Binary and utf8 handle everything else I've come across, and I can use
> them to build character maps for the rest, if I ever hit another strange
> project that needs them.

As always, the problem is that if one browser supports an encoding
that no one else does, then content will be written that depends on
that encoding, and thus is locked into that browser.  Other browsers
will then feel competitive pressure to support the encoding, so that
the content works on them as well.  Repeat this for the union of
encodings that every browser supports.

It's not necessarily true that this will happen for every single
encoding.  History shows us that it will probably happen with at least
*several* encodings, if nothing is done to prevent it.  But there's no
reason to risk it, when we can legislate against it and even test for
common things that browsers *might* support.

~TJ



Re: String to ArrayBuffer

2012-01-12 Thread Glenn Adams
On Thu, Jan 12, 2012 at 10:10 AM, Tab Atkins Jr. wrote:

> On Thu, Jan 12, 2012 at 8:59 AM, Glenn Adams  wrote:
> > On Thu, Jan 12, 2012 at 3:49 AM, Henri Sivonen  wrote:
> >> On Thu, Jan 12, 2012 at 1:12 AM, Kenneth Russell wrote:
> >> > The StringEncoding proposal is the best path forward because it
> >> > provides correct behavior in all cases.
> >>
> >> Do you mean this one? http://wiki.whatwg.org/wiki/StringEncoding
> >>
> >> I see the following problems after a cursory glance:
> >>  4) It says "Browsers MAY support additional encodings." This is a
> >> huge non-interoperability loophole. The spec should have a small and
> >> fixed set of supported encodings that everyone MUST support and
> >> supporting other encodings should be a "MUST NOT".
> >
> >
> > In practice, it will be impractical if not impossible to enforce such a
> > dictum "MUST NOT support other encodings". Implementers will support
> > whatever they like when it comes to character encodings, both for
> > interchange, runtime storage, and persistent storage.
>
> Actually, such requirements often work relatively well.  Many
> implementors recognize the pain caused by race-to-the-bottom support
> for random encodings.


I speak of enforcement. Will there be test cases to check for support of a
random encoding not included in a blessed list? I suspect not.


Re: String to ArrayBuffer

2012-01-12 Thread Tab Atkins Jr.
On Thu, Jan 12, 2012 at 8:59 AM, Glenn Adams  wrote:
> On Thu, Jan 12, 2012 at 3:49 AM, Henri Sivonen  wrote:
>> On Thu, Jan 12, 2012 at 1:12 AM, Kenneth Russell  wrote:
>> > The StringEncoding proposal is the best path forward because it
>> > provides correct behavior in all cases.
>>
>> Do you mean this one? http://wiki.whatwg.org/wiki/StringEncoding
>>
>> I see the following problems after a cursory glance:
>>  4) It says "Browsers MAY support additional encodings." This is a
>> huge non-interoperability loophole. The spec should have a small and
>> fixed set of supported encodings that everyone MUST support and
>> supporting other encodings should be a "MUST NOT".
>
>
> In practice, it will be impractical if not impossible to enforce such a
> dictum "MUST NOT support other encodings". Implementers will support
> whatever they like when it comes to character encodings, both for
> interchange, runtime storage, and persistent storage.

Actually, such requirements often work relatively well.  Many
implementors recognize the pain caused by race-to-the-bottom support
for random encodings.

~TJ



Re: String to ArrayBuffer

2012-01-12 Thread Glenn Adams
On Thu, Jan 12, 2012 at 3:49 AM, Henri Sivonen  wrote:

> On Thu, Jan 12, 2012 at 1:12 AM, Kenneth Russell  wrote:
> > The StringEncoding proposal is the best path forward because it
> > provides correct behavior in all cases.
>
> Do you mean this one? http://wiki.whatwg.org/wiki/StringEncoding
>
> I see the following problems after a cursory glance:
>  4) It says "Browsers MAY support additional encodings." This is a
> huge non-interoperability loophole. The spec should have a small and
> fixed set of supported encodings that everyone MUST support and
> supporting other encodings should be a "MUST NOT".
>

In practice, it will be impractical if not impossible to enforce such a
dictum "MUST NOT support other encodings". Implementers will support
whatever they like when it comes to character encodings, both for
interchange, runtime storage, and persistent storage.

Regarding use of the word "support" in the context of character encodings,
it would be useful if folks would explicitly qualify support as applying to
one of these three uses (interchange, runtime storage, persistent storage).


Re: String to ArrayBuffer

2012-01-12 Thread Henri Sivonen
On Thu, Jan 12, 2012 at 1:12 AM, Kenneth Russell  wrote:
> The StringEncoding proposal is the best path forward because it
> provides correct behavior in all cases.

Do you mean this one? http://wiki.whatwg.org/wiki/StringEncoding

I see the following problems after a cursory glance:
 1) It doesn't support streaming encoding/decoding.
 2) BINARY and ISO-8859-1 are defined as functionally equivalent. It
would be better to keep BINARY and get rid of real ISO-8859-1, because
normally the Web platform doesn't support real ISO-8859-1 and
ISO-8859-1 is an alias for Windows-1252.
 3) UTF-16 is supported, which is bad, because it's a terrible idea to
use UTF-16 for interchange.
 4) It says "Browsers MAY support additional encodings." This is a
huge non-interoperability loophole. The spec should have a small and
fixed set of supported encodings that everyone MUST support and
supporting other encodings should be a "MUST NOT".
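
To illustrate point 1: a multi-byte sequence can straddle a chunk boundary, so a decoder without a streaming mode mangles it. The sketch below uses TextDecoder and its `stream` option, which are an assumption here: they are not part of the proposal under discussion in this thread. The chunk contents are made up for illustration.

```javascript
// U+20AC EURO SIGN is E2 82 AC in UTF-8; split it across two chunks.
var chunks = [new Uint8Array([0xE2, 0x82]), new Uint8Array([0xAC])];

// One-shot decoding of each chunk: the partial sequence cannot be completed,
// so replacement characters (U+FFFD) end up in the output.
var oneShot = chunks.map(function (chunk) {
  return new TextDecoder('utf-8').decode(chunk);
}).join('');

// Streaming decoding: the decoder keeps the incomplete sequence between calls.
var decoder = new TextDecoder('utf-8');
var streamed = chunks.map(function (chunk) {
  return decoder.decode(chunk, { stream: true });
}).join('') + decoder.decode(); // the final call flushes any buffered bytes
```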

What's the motivation for supporting encodings other than UTF-8 and BINARY?

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/



Re: String to ArrayBuffer

2012-01-11 Thread Charles Pritchard

On 1/11/2012 7:44 PM, Charles Pritchard wrote:

On 1/11/2012 4:22 PM, Boris Zbarsky wrote:

On 1/11/12 6:03 PM, Charles Pritchard wrote:

Web Storage, also, only works with unicode.


I'm not familiar with the relevant part of Web Storage.  Can you cite 
the relevant part please?


The character code conversion gets weird. If you'd explain this in the 
proper terms, I'd appreciate it.


Load a binary resource via the old charset hack.

Save the resulting string into localStorage. There are some conversion 
issues. I am not using the right vocabulary.
I know the list has seen the issue before, and I'll bet someone here 
can explain it succinctly.


Example:
// Image files are easiest to try this with.
https://developer.mozilla.org/En/XMLHttpRequest/Using_XMLHttpRequest#Receiving_binary_data_in_older_browsers 


// From the article:
function load_binary_resource(url) {
  var req = new XMLHttpRequest();
  req.open('GET', url, false);
  // XHR binary charset opt by Marcus Granado 2006 [http://mgran.blogspot.com]

  req.overrideMimeType('text\/plain; charset=x-user-defined');
  req.send(null);
  if (req.status != 200) return '';
  return req.responseText;
}
var x = load_binary_resource('imageurl.png');
localStorage.fail = x;
localStorage.fail == x.fail; // will return false.



I'm sorry, I'm just digging myself in a hole with this one. First, that 
should be localStorage.fail == x;


Second, it seems to be working fine in my console. I know that I had 
some heavy issues with the technique last year.
There are many posts on the Net talking about base64 encoding binary 
before putting it into Web Storage.
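
A minimal sketch of that base64 workaround, assuming the string came from the x-user-defined charset hack above, where each byte sits in the low 8 bits of a code unit. binaryStringToBase64 is an illustrative name, and btoa is assumed to be available.

```javascript
function binaryStringToBase64(str) {
  var bytes = '';
  for (var i = 0; i < str.length; i++) {
    // x-user-defined maps bytes 0x80-0xFF to U+F780-U+F7FF;
    // masking with 0xFF recovers the original byte.
    bytes += String.fromCharCode(str.charCodeAt(i) & 0xFF);
  }
  // Every code unit is now <= 0xFF, so btoa will not throw.
  return btoa(bytes);
}

// Usage (browser only): localStorage.image = binaryStringToBase64(x);
```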


But, more importantly, this thread is being resolved via the 
StringEncoding API for Typed Arrays.


I'm not sure what the plan is for Blob storage with IndexedDB; other 
than it's a v2 consideration.


Anyway, sorry for the confusion. If I do come across an old thread or 
explanation of the issue, I'll post it.
I'm certain there were issues with storing arbitrary binary in 
localStorage in prior versions of browsers.



-Charles





Re: String to ArrayBuffer

2012-01-11 Thread Charles Pritchard

On 1/11/2012 4:22 PM, Boris Zbarsky wrote:

On 1/11/12 6:03 PM, Charles Pritchard wrote:

Is there any instance in practice where DOMString as exposed to the
scripting environment is not implemented as a unicode string?


I don't know what you mean by that.

The point is, it's trivial to construct JS strings that contain 
arbitrary sequences of 16-bit units (using fromCharCode or \u 
escapes).  Nothing anywhere in JS or the DOM per se enforces that 
strings are valid UTF-16 (which is the way that an actual Unicode 
string would be encoded as a JS string).



My [wrong] understanding was that DOMString referred to valid unicode.

WebIDL:
"The DOMString type corresponds to the set of all possible sequences of 
16 bit unsigned integer code units. Such sequences are commonly 
interpreted as UTF-16 encoded strings [RFC2781] although this is not 
required... Nothing in this specification requires a DOMString value to 
be a valid UTF-16 string."

http://www.w3.org/TR/WebIDL/#idl-DOMString

DOM3:
"The DOMString type is used to store [Unicode] characters as a sequence 
of 16-bit units using UTF-16 as defined in [Unicode] and Amendment 1 of 
[ISO/IEC 10646]." There are some normalization notes, but otherwise, 
it's close enough to saying it stores Unicode, but it can handle all 
16bit combinations.

http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#ID-C74D1578

For "historic reasons" WindowBase64 throws an error if input is not 
within Unicode range.

http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html#atob



I realize that internally, DOMString may be implemented as a 16 bit
integer + length;


Not just internally.  The JS spec and the DOM spec both explicitly say 
that this is what strings are: an array of 16-bit integers.


WebIDL and DOM define "DOMString", of course. JS defines "The String 
Type" in 8.4. They are intended to be the same.

http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf

"The  String type is the set of all finite ordered sequences of zero or 
more 16-bit unsigned integer values ... When a String contains actual 
textual data, each element is considered to be a single UTF-16 code 
unit.  Whether or not this is the actual storage format of a String, the 
characters within a String are numbered by their initial code unit 
element position as though they were represented using UTF-16."
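
As a concrete illustration of that numbering, a character outside the Basic Multilingual Plane occupies two elements of a String. A small sketch (the variable names are arbitrary):

```javascript
// U+1F600 is stored as the surrogate pair D83D DE00.
var s = 'a\uD83D\uDE00b';
var units = [];
for (var i = 0; i < s.length; i++) {
  units.push(s.charCodeAt(i).toString(16));
}
// s.length counts 16-bit elements, not characters: it is 4 here,
// although the string holds 3 characters.
// 'b' sits at index 3, because the pair occupies positions 1 and 2.
```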



Browsers do the same thing with WindowBase64, though it's specified as
DOMString, in practice (as the notes say), it's unicode.
http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html#atob 



If you look at the actual processing model, you take the input array 
of 16-bit integers, throw if any is not in the set { 0x2B, 0x2F } union 
[0x30,0x39] union [0x41,0x5A] union [0x61,0x7A] and then treat the rest as ASCII 
data (which at that point it is).


It defines this in terms of "Unicode" but that's just because any JS 
string that satisfies the above constraints can be considered a 
"Unicode" string if one wishes.



Web Storage, also, only works with unicode.


I'm not familiar with the relevant part of Web Storage.  Can you cite 
the relevant part please?


The character code conversion gets weird. If you'd explain this in the 
proper terms, I'd appreciate it.


Load a binary resource via the old charset hack.

Save the resulting string into localStorage. There are some conversion 
issues. I am not using the right vocabulary.
I know the list has seen the issue before, and I'll bet someone here can 
explain it succinctly.


Example:
// Image files are easiest to try this with.
https://developer.mozilla.org/En/XMLHttpRequest/Using_XMLHttpRequest#Receiving_binary_data_in_older_browsers
// From the article:
function load_binary_resource(url) {
  var req = new XMLHttpRequest();
  req.open('GET', url, false);
  // XHR binary charset opt by Marcus Granado 2006 [http://mgran.blogspot.com]

  req.overrideMimeType('text\/plain; charset=x-user-defined');
  req.send(null);
  if (req.status != 200) return '';
  return req.responseText;
}
var x = load_binary_resource('imageurl.png');
localStorage.fail = x;
localStorage.fail == x.fail; // will return false.





Re: String to ArrayBuffer

2012-01-11 Thread Boris Zbarsky

On 1/11/12 6:03 PM, Charles Pritchard wrote:

> Is there any instance in practice where DOMString as exposed to the
> scripting environment is not implemented as a unicode string?


I don't know what you mean by that.

The point is, it's trivial to construct JS strings that contain 
arbitrary sequences of 16-bit units (using fromCharCode or \u escapes). 
 Nothing anywhere in JS or the DOM per se enforces that strings are 
valid UTF-16 (which is the way that an actual Unicode string would be 
encoded as a JS string).
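
For example, the following sketch builds such a string and checks it for well-formedness. isWellFormedUTF16 is an illustrative helper, not an API from any spec cited in this thread.

```javascript
function isWellFormedUTF16(str) {
  for (var i = 0; i < str.length; i++) {
    var unit = str.charCodeAt(i);
    if (unit >= 0xD800 && unit <= 0xDBFF) {
      // High surrogate: must be followed by a low surrogate.
      var next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (next < 0xDC00 || next > 0xDFFF) return false;
      i++; // skip the low surrogate that completes the pair
    } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
      return false; // low surrogate with no preceding high surrogate
    }
  }
  return true;
}

var lone = String.fromCharCode(0xD800); // a lone high surrogate: a legal JS string
var pair = '\uD83D\uDE00';              // a valid surrogate pair
```

Here `lone` is a perfectly ordinary JS string of length 1, yet it is not valid UTF-16.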



> I realize that internally, DOMString may be implemented as a 16 bit
> integer + length;


Not just internally.  The JS spec and the DOM spec both explicitly say 
that this is what strings are: an array of 16-bit integers.



> Browsers do the same thing with WindowBase64, though it's specified as
> DOMString, in practice (as the notes say), it's unicode.
> http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html#atob


If you look at the actual processing model, you take the input array of 
16-bit integers, throw if any is not in the set { 0x2B, 0x2F } union 
[0x30,0x39] union [0x41,0x5A] union [0x61,0x7A] and then treat the rest as ASCII 
data (which at that point it is).
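
A rough sketch of that check, using the base64 alphabet ('+', '/', digits, and ASCII letters). It deliberately ignores padding and whitespace handling, and isBase64Alphabet is an illustrative name, not something from the spec.

```javascript
function isBase64Alphabet(str) {
  for (var i = 0; i < str.length; i++) {
    var c = str.charCodeAt(i);
    var ok = c === 0x2B || c === 0x2F ||  // '+' and '/'
             (c >= 0x30 && c <= 0x39) ||  // '0'-'9'
             (c >= 0x41 && c <= 0x5A) ||  // 'A'-'Z'
             (c >= 0x61 && c <= 0x7A);    // 'a'-'z'
    if (!ok) return false;
  }
  return true;
}
```

The check works on 16-bit code units and never needs to know whether the string as a whole is valid Unicode.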


It defines this in terms of "Unicode" but that's just because any JS 
string that satisfies the above constraints can be considered a 
"Unicode" string if one wishes.



> Web Storage, also, only works with unicode.


I'm not familiar with the relevant part of Web Storage.  Can you cite 
the relevant part please?


-Boris



Re: String to ArrayBuffer

2012-01-11 Thread Joshua Bell
On Wed, Jan 11, 2012 at 3:12 PM, Kenneth Russell  wrote:

> The StringEncoding proposal is the best path forward because it
> provides correct behavior in all cases. Adding String conversions
> directly to the typed array spec will introduce dependencies that are
> strongly undesirable, and make it much harder to implement the core
> spec. Hopefully Josh can provide an update on how the StringEncoding
> proposal is going.
>
> -Ken
>

Thanks for the cue, Ken. :)

As background for folks on public-webapps, the StringEncoding proposal
linked to by Charles grew out of discussions similar to this one on the
public_we...@khronos.org list. The most recent thread can be found at
http://www.khronos.org/webgl/public-mailing-list/archives//msg00017.html


If you read that thread it should be clear why the proposal is as "heavy"
as it is (although, being mired in IndexedDB lately, it looks so tiny).
Dealing with text encoding is also never as trivial or easy as it seems.

As far as current status: I haven't done much work on the proposal in the
last month or so, but plan to pick that up again soon, and it should be
shopped around for the appropriate WG (public-webapps or otherwise) for
feedback, gauging implementer interest, etc. Anne's work over on whatwg
around encoding detection and BOM handling in browsers is valuable so I've
been watching that closely, although this is a new API and callers will
have access to the raw bits so we don't have to spec the kitchen sink or
match legacy behavior. There are a few open issues called out in the
proposal, perhaps most notably the default handling of invalid data.


>
> On Wed, Jan 11, 2012 at 3:05 PM, Charles Pritchard wrote:
> > On 1/11/2012 2:49 PM, James Robinson wrote:
> >
> >
> >
> > On Wed, Jan 11, 2012 at 2:45 PM, Charles Pritchard wrote:
> >>
> >> Currently, we can asynchronously use BlobBuilder with FileReader to get an
> >> array buffer from a string.
> >> We can of course, use code to convert String.fromCharCode into a
> >> Uint8Array, but it's ugly.
> >>
> >> The StringEncoding proposal seems a bit much for most web use:
> >> http://wiki.whatwg.org/wiki/StringEncoding
> >>
> >> All we really ever do is work on DOMString, and that's covered by UTF8.
> >
> >
> > DOMString is not UTF8 or necessarily unicode.  It's a sequence of 16 bit
> > integers and a length.
> >
> >
> >
> > To clarify, I'd want ArrayBuffer(DOMString) to work with unicode and throw
> > an error if the DOMString is not valid unicode.
> > This is consistent with other Web Apps APIs.
> >
> > For feature detection, the method should be wrapped in a try-catch block
> > anyway.
> >
> > -Charles
>


Re: String to ArrayBuffer

2012-01-11 Thread Charles Pritchard

On 1/11/2012 3:12 PM, Kenneth Russell wrote:
> The StringEncoding proposal is the best path forward because it
> provides correct behavior in all cases. Adding String conversions
> directly to the typed array spec will introduce dependencies that are
> strongly undesirable, and make it much harder to implement the core
> spec. Hopefully Josh can provide an update on how the StringEncoding
> proposal is going.


Looking forward to it.
I'm not particularly worried about the dependencies, but, what I 
proposed is likely to do the wrong thing.
I'd want the DOMString processed as a UTF8 string, and at that point, 
we're stepping out of the way that other Web Apps APIs operate.


Is base64 encoding at all appropriate for a StringEncoding type?
Browser implementations of atob are not very good, and it's an extra 
step to run  StringEncoding(atob()).
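
For reference, the two-step route looks roughly like the sketch below: atob yields a string with one byte per code unit, which then has to be copied into a typed array by hand. base64ToBytes is an illustrative name, not part of any proposal.

```javascript
function base64ToBytes(b64) {
  // atob returns a "binary string": every code unit is in the range 0-255.
  var binary = atob(b64);
  var bytes = new Uint8Array(binary.length);
  for (var i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```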



-Charles





Re: String to ArrayBuffer

2012-01-11 Thread Kenneth Russell
The StringEncoding proposal is the best path forward because it
provides correct behavior in all cases. Adding String conversions
directly to the typed array spec will introduce dependencies that are
strongly undesirable, and make it much harder to implement the core
spec. Hopefully Josh can provide an update on how the StringEncoding
proposal is going.

-Ken

On Wed, Jan 11, 2012 at 3:05 PM, Charles Pritchard  wrote:
> On 1/11/2012 2:49 PM, James Robinson wrote:
>
>
>
> On Wed, Jan 11, 2012 at 2:45 PM, Charles Pritchard  wrote:
>>
>> Currently, we can asynchronously use BlobBuilder with FileReader to get an
>> array buffer from a string.
>> We can of course, use code to convert String.fromCharCode into a
>> Uint8Array, but it's ugly.
>>
>> The StringEncoding proposal seems a bit much for most web use:
>> http://wiki.whatwg.org/wiki/StringEncoding
>>
>> All we really ever do is work on DOMString, and that's covered by UTF8.
>
>
> DOMString is not UTF8 or necessarily unicode.  It's a sequence of 16 bit
> integers and a length.
>
>
>
> To clarify, I'd want ArrayBuffer(DOMString) to work with unicode and throw
> an error if the DOMString is not valid unicode.
> This is consistent with other Web Apps APIs.
>
> For feature detection, the method should be wrapped in a try-catch block
> anyway.
>
> -Charles



Re: String to ArrayBuffer

2012-01-11 Thread James Robinson
On Wed, Jan 11, 2012 at 2:45 PM, Charles Pritchard  wrote:

> Currently, we can asynchronously use BlobBuilder with FileReader to get an
> array buffer from a string.
> We can of course, use code to convert String.fromCharCode into a
> Uint8Array, but it's ugly.
>
> The StringEncoding proposal seems a bit much for most web use:
> http://wiki.whatwg.org/wiki/StringEncoding
>
> All we really ever do is work on DOMString, and that's covered by UTF8.
>

DOMString is not UTF8 or necessarily unicode.  It's a sequence of 16 bit
integers and a length.


>
> As the following file shows, DOMString to ArrayBuffer conversion is about 30
> lines of code (start at line 125):
> http://code.google.com/p/stringencoding/source/browse/encoding.js


This only seems correct for valid unicode strings, which does not cover all
DOMStrings.

- James

>
>
> It seems like this kind of type conversion could be handled more
> efficiently and be less error prone on programmers like myself, who often
> forget to test with multibyte strings.
>
> I'm sure this has popped up many times before on the list. Thought I'd put
> it out there again.
> We could just tweak the ArrayBuffer constructor to support DOMString as an
> argument.
> Currently, it supports length.
>
> -Charles
>
>


Re: String to ArrayBuffer

2012-01-11 Thread Charles Pritchard

On 1/11/2012 2:49 PM, James Robinson wrote:



On Wed, Jan 11, 2012 at 2:45 PM, Charles Pritchard wrote:


Currently, we can asynchronously use BlobBuilder with FileReader
to get an array buffer from a string.
We can of course, use code to convert String.fromCharCode into a
Uint8Array, but it's ugly.

The StringEncoding proposal seems a bit much for most web use:
http://wiki.whatwg.org/wiki/StringEncoding

All we really ever do is work on DOMString, and that's covered by
UTF8.


DOMString is not UTF8 or necessarily unicode.  It's a sequence of 16 
bit integers and a length.


Is there any instance in practice where DOMString as exposed to the 
scripting environment is not implemented as a unicode string?
I realize that internally, DOMString may be implemented as a 16 bit 
integer + length;



As the following file shows, DOMString to ArrayBuffer conversion is
about 30 lines of code (start at line 125):
http://code.google.com/p/stringencoding/source/browse/encoding.js


This only seems correct for valid unicode strings, which does not 
cover all DOMStrings.




Sure, they're checking for correctness. And it's really only about 15 lines.

Browsers do the same thing with WindowBase64, though it's specified as 
DOMString, in practice (as the notes say), it's unicode.

http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html#atob

Web Storage, also, only works with unicode.

-Charles


Re: String to ArrayBuffer

2012-01-11 Thread Charles Pritchard

On 1/11/2012 2:49 PM, James Robinson wrote:



On Wed, Jan 11, 2012 at 2:45 PM, Charles Pritchard wrote:


Currently, we can asynchronously use BlobBuilder with FileReader
to get an array buffer from a string.
We can of course, use code to convert String.fromCharCode into a
Uint8Array, but it's ugly.

The StringEncoding proposal seems a bit much for most web use:
http://wiki.whatwg.org/wiki/StringEncoding

All we really ever do is work on DOMString, and that's covered by
UTF8.


DOMString is not UTF8 or necessarily unicode.  It's a sequence of 16 
bit integers and a length.




To clarify, I'd want ArrayBuffer(DOMString) to work with unicode and 
throw an error if the DOMString is not valid unicode.

This is consistent with other Web Apps APIs.

For feature detection, the method should be wrapped in a try-catch block 
anyway.
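
A sketch of what that detection might look like. The string-accepting ArrayBuffer constructor is only a suggestion in this thread, so the overload probed here is hypothetical; arrayBufferAcceptsString is an illustrative name.

```javascript
function arrayBufferAcceptsString() {
  try {
    // Hypothetical overload: an implementation of the suggestion
    // would encode the three characters into three bytes.
    var buf = new ArrayBuffer('abc');
    // Engines without the overload may coerce the string to a length
    // instead of throwing, so the result has to be checked as well.
    return buf.byteLength === 3;
  } catch (e) {
    return false;
  }
}
```

The byteLength check matters because a try-catch alone cannot tell a silently coerced argument from a real conversion.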


-Charles


String to ArrayBuffer

2012-01-11 Thread Charles Pritchard
Currently, we can asynchronously use BlobBuilder with FileReader to get 
an array buffer from a string.
We can of course, use code to convert String.fromCharCode into a 
Uint8Array, but it's ugly.


The StringEncoding proposal seems a bit much for most web use:
http://wiki.whatwg.org/wiki/StringEncoding

All we really ever do is work on DOMString, and that's covered by UTF8.

As the following file shows, DOMString to ArrayBuffer conversion is about 30 
lines of code (start at line 125):

http://code.google.com/p/stringencoding/source/browse/encoding.js

It seems like this kind of type conversion could be handled more 
efficiently and be less error prone on programmers like myself, who 
often forget to test with multibyte strings.
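
For reference, a hand-rolled conversion looks roughly like the sketch below. It is not the code from encoding.js; stringToUTF8Buffer is an illustrative name, and throwing on unpaired surrogates is one possible policy, assumed here.

```javascript
function stringToUTF8Buffer(str) {
  var bytes = [];
  for (var i = 0; i < str.length; i++) {
    var cp = str.charCodeAt(i);
    if (cp >= 0xD800 && cp <= 0xDBFF) {
      // High surrogate: combine it with the following low surrogate.
      var low = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (low < 0xDC00 || low > 0xDFFF) throw new Error('unpaired surrogate');
      cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
      i++;
    } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
      throw new Error('unpaired surrogate');
    }
    if (cp < 0x80) {
      bytes.push(cp);                                   // 1 byte
    } else if (cp < 0x800) {
      bytes.push(0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)); // 2 bytes
    } else if (cp < 0x10000) {
      bytes.push(0xE0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3F),
                 0x80 | (cp & 0x3F));                   // 3 bytes
    } else {
      bytes.push(0xF0 | (cp >> 18), 0x80 | ((cp >> 12) & 0x3F),
                 0x80 | ((cp >> 6) & 0x3F), 0x80 | (cp & 0x3F)); // 4 bytes
    }
  }
  return new Uint8Array(bytes).buffer;
}
```

The multibyte branches are exactly the part that is easy to leave untested.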


I'm sure this has popped up many times before on the list. Thought I'd 
put it out there again.
We could just tweak the ArrayBuffer constructor to support DOMString as 
an argument.

Currently, it supports length.

-Charles