Re: [whatwg] Please always use utf-8 for Web Workers
On Fri, 25 Sep 2009, Simon Pieters wrote:
> Workers are new and seem very likely to be incompatible with existing scripts, so they are not subject to legacy content with legacy encodings. Therefore, we should be able to always use utf-8 for workers. Always using utf-8 is simpler to implement and test and encourages people to switch to utf-8 elsewhere.

On Fri, 25 Sep 2009, Jonathan Cook wrote:
> The importScripts portion of the Web Workers API is compatible with existing scripts, but I'm all for more UTF-8 :) If the restriction is added to the spec, I'd want to know that a very clear error was going to be thrown explaining the problem.

On Fri, 25 Sep 2009, Simon Pieters wrote:
> I'm not sure that throwing an error is a good idea. Would you throw an error when there's no declared encoding? That seems to be annoying for the common case of just using ASCII characters. Throwing an error when there is a declared encoding that is not utf-8 might work, but are there many scripts that have a declared encoding and are not utf-8? I think it is better to just ignore any declared encoding and assume utf-8. If people are using non-ASCII in another encoding, then they would notice by seeing that their text looks like garbage. Browsers could also log messages to their error consoles about encoding declarations declaring non-utf-8 and/or sequences of bytes that are not valid utf-8.

On Fri, 25 Sep 2009, Drew Wilson wrote:
> Are you saying that if I load a script via a script tag in a web page, then load it via importScripts() in a worker, that the result of loading that script in those two cases should/could be different because of different decoding mechanisms? If that's what's being proposed, that seems bad.

On Fri, 25 Sep 2009, Anne van Kesteren wrote:
> That could happen already if the script loaded via <script> did not have an encoding set and got it from <script charset>.

On Fri, 25 Sep 2009, Drew Wilson wrote:
> Certainly. If I explicitly override the charset, then that seems like reasonable behavior. Having the default decoding vary between importScripts() and <script> seems bad, especially since you can't override charsets with importScripts().

On Fri, 25 Sep 2009, Anne van Kesteren wrote:
> It does not need to be overridden per se. If the document character encoding is different from UTF-8 then a script loaded through <script> will be decoded differently from a script loaded through importScripts() as well.

On Mon, 28 Sep 2009, Michael Nordman wrote:
> Leaving legacy encodings behind would be a good thing if we can get away with it... jmho.

Ok, I've made workers assume UTF-8 always.

-- 
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] Please always use utf-8 for Web Workers
On Fri, 25 Sep 2009 19:34:18 +0200, Drew Wilson atwil...@google.com wrote:
> Again, apologies if I'm misunderstanding the suggestion.

I thought that by "default encoding" you meant the encoding that would be used if other means of getting the encoding failed. If there is only one encoding it is not exactly the default, since it cannot be changed.

-- 
Anne van Kesteren
http://annevankesteren.nl/
Re: [whatwg] Please always use utf-8 for Web Workers
Leaving legacy encodings behind would be a good thing if we can get away with it... jmho.

On Mon, Sep 28, 2009 at 9:59 AM, Drew Wilson atwil...@google.com wrote:
> Ah, sorry for the confusion - my use of "default" was indeed sloppy. I'm saying that if the server is explicitly specifying the charset either via a header or via BOMs, it seems bad to ignore it since there's no other way to override the charset. I understand your point, though - since workers don't inherit the document encoding from their parent, they may indeed decode a given resource differently if the server isn't specifying a charset in some way.
>
> -atw
>
> On Mon, Sep 28, 2009 at 4:47 AM, Anne van Kesteren ann...@opera.com wrote:
>> On Fri, 25 Sep 2009 19:34:18 +0200, Drew Wilson atwil...@google.com wrote:
>>> Again, apologies if I'm misunderstanding the suggestion.
>>
>> I thought that by "default encoding" you meant the encoding that would be used if other means of getting the encoding failed. If there is only one encoding it is not exactly the default, since it cannot be changed.
>>
>> -- 
>> Anne van Kesteren
>> http://annevankesteren.nl/
[whatwg] Please always use utf-8 for Web Workers
Workers are new and seem very likely to be incompatible with existing scripts, so they are not subject to legacy content with legacy encodings. Therefore, we should be able to always use utf-8 for workers. Always using utf-8 is simpler to implement and test and encourages people to switch to utf-8 elsewhere.

-- 
Simon Pieters
Opera Software
Re: [whatwg] Please always use utf-8 for Web Workers
The importScripts portion of the Web Workers API is compatible with existing scripts, but I'm all for more UTF-8 :) If the restriction is added to the spec, I'd want to know that a very clear error was going to be thrown explaining the problem.

Regards,
Jonathan 'J5' Cook

Simon Pieters wrote:
> Workers are new and seem very likely to be incompatible with existing scripts, so they are not subject to legacy content with legacy encodings. Therefore, we should be able to always use utf-8 for workers. Always using utf-8 is simpler to implement and test and encourages people to switch to utf-8 elsewhere.
Re: [whatwg] Please always use utf-8 for Web Workers
On Fri, 25 Sep 2009 15:31:41 +0200, Jonathan Cook jonathan.j5.c...@gmail.com wrote:
> The importScripts portion of the Web Workers API is compatible with existing scripts,

Only if those scripts don't use any of the banned interfaces and constructors, right?

> but I'm all for more UTF-8 :) If the restriction is added to the spec, I'd want to know that a very clear error was going to be thrown explaining the problem.

I'm not sure that throwing an error is a good idea. Would you throw an error when there's no declared encoding? That seems to be annoying for the common case of just using ASCII characters. Throwing an error when there is a declared encoding that is not utf-8 might work, but are there many scripts that have a declared encoding and are not utf-8?

I think it is better to just ignore any declared encoding and assume utf-8. If people are using non-ASCII in another encoding, then they would notice by seeing that their text looks like garbage. Browsers could also log messages to their error consoles about encoding declarations declaring non-utf-8 and/or sequences of bytes that are not valid utf-8.

-- 
Simon Pieters
Opera Software
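
The behaviour proposed above (ignore the declaration, decode as UTF-8, and warn rather than throw) can be sketched roughly as follows. This is an illustration in Python, not browser code: `decode_worker_script` and its warning strings are hypothetical names invented for the example, and the sample scripts are made up.

```python
from typing import List, Optional, Tuple


def decode_worker_script(raw: bytes, declared: Optional[str] = None) -> Tuple[str, List[str]]:
    """Decode script bytes as UTF-8 regardless of any declared encoding.

    Hypothetical helper for illustration only. Returns the decoded text
    and a list of console-style warnings instead of raising an error.
    """
    warnings: List[str] = []
    # Warn about (but otherwise ignore) a declaration that is not UTF-8.
    if declared is not None and declared.lower() not in ("utf-8", "utf8"):
        warnings.append(f"ignoring declared encoding {declared!r}; decoding as UTF-8")
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Invalid byte sequences are reported, then replaced with U+FFFD.
        warnings.append(f"invalid UTF-8 byte sequence at offset {exc.start}")
        text = raw.decode("utf-8", errors="replace")
    return text, warnings


# Pure-ASCII script with no declared encoding: nothing to complain about.
ascii_text, ascii_warnings = decode_worker_script(b"var greeting = 'hello';")

# Latin-1 script with a declaration: decoded anyway, with visible garbage.
latin1_bytes = "var s = 'caf\u00e9';".encode("latin-1")
latin1_text, latin1_warnings = decode_worker_script(latin1_bytes, "iso-8859-1")
```

In this sketch the ASCII-only case passes through silently, which matches the concern about not annoying the common case, while the Latin-1 script yields a replacement character where the accented letter was, plus two warnings.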
Re: [whatwg] Please always use utf-8 for Web Workers
Are you saying that if I load a script via a script tag in a web page, then load it via importScripts() in a worker, that the result of loading that script in those two cases should/could be different because of different decoding mechanisms? If that's what's being proposed, that seems bad.

-atw

On Fri, Sep 25, 2009 at 6:45 AM, Simon Pieters sim...@opera.com wrote:
> On Fri, 25 Sep 2009 15:31:41 +0200, Jonathan Cook jonathan.j5.c...@gmail.com wrote:
>> The importScripts portion of the Web Workers API is compatible with existing scripts,
>
> Only if those scripts don't use any of the banned interfaces and constructors, right?
>
>> but I'm all for more UTF-8 :) If the restriction is added to the spec, I'd want to know that a very clear error was going to be thrown explaining the problem.
>
> I'm not sure that throwing an error is a good idea. Would you throw an error when there's no declared encoding? That seems to be annoying for the common case of just using ASCII characters. Throwing an error when there is a declared encoding that is not utf-8 might work, but are there many scripts that have a declared encoding and are not utf-8?
>
> I think it is better to just ignore any declared encoding and assume utf-8. If people are using non-ASCII in another encoding, then they would notice by seeing that their text looks like garbage. Browsers could also log messages to their error consoles about encoding declarations declaring non-utf-8 and/or sequences of bytes that are not valid utf-8.
>
> -- 
> Simon Pieters
> Opera Software
Re: [whatwg] Please always use utf-8 for Web Workers
On Fri, 25 Sep 2009 18:39:48 +0200, Drew Wilson atwil...@google.com wrote:
> Are you saying that if I load a script via a script tag in a web page, then load it via importScripts() in a worker, that the result of loading that script in those two cases should/could be different because of different decoding mechanisms? If that's what's being proposed, that seems bad.

That could happen already if the script loaded via <script> did not have an encoding set and got it from <script charset>.

-- 
Anne van Kesteren
http://annevankesteren.nl/
Re: [whatwg] Please always use utf-8 for Web Workers
Certainly. If I explicitly override the charset, then that seems like reasonable behavior. Having the default decoding vary between importScripts() and <script> seems bad, especially since you can't override charsets with importScripts().

-atw

On Fri, Sep 25, 2009 at 10:08 AM, Anne van Kesteren ann...@opera.com wrote:
> On Fri, 25 Sep 2009 18:39:48 +0200, Drew Wilson atwil...@google.com wrote:
>> Are you saying that if I load a script via a script tag in a web page, then load it via importScripts() in a worker, that the result of loading that script in those two cases should/could be different because of different decoding mechanisms? If that's what's being proposed, that seems bad.
>
> That could happen already if the script loaded via <script> did not have an encoding set and got it from <script charset>.
>
> -- 
> Anne van Kesteren
> http://annevankesteren.nl/
Re: [whatwg] Please always use utf-8 for Web Workers
On Fri, 25 Sep 2009 19:16:47 +0200, Drew Wilson atwil...@google.com wrote:
> Certainly. If I explicitly override the charset, then that seems like reasonable behavior.

It does not need to be overridden per se. If the document character encoding is different from UTF-8 then a script loaded through <script> will be decoded differently from a script loaded through importScripts() as well.

> Having the default decoding vary between importScripts() and <script> seems bad, especially since you can't override charsets with importScripts().

This is already the case. The suggestion was not about changing the default.

-- 
Anne van Kesteren
http://annevankesteren.nl/
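
The distinction being drawn here, between the fallback encoding and a server-declared one, may be easier to see as a small model. The following Python sketch is a deliberate simplification and not the algorithm from either spec: the function names, the `always_utf8` flag, and the precedence order (server declaration, then charset attribute, then document encoding) are assumptions made for illustration.

```python
from typing import Optional


def script_element_encoding(
    header_charset: Optional[str],
    charset_attr: Optional[str],
    document_encoding: str,
) -> str:
    """Encoding for a script loaded through a script element (simplified).

    Assumed order: charset declared by the server, then the element's
    charset attribute, then the document's character encoding.
    """
    return header_charset or charset_attr or document_encoding


def import_scripts_encoding(header_charset: Optional[str], always_utf8: bool) -> str:
    """Encoding for a script loaded through importScripts() (simplified).

    With always_utf8=False a server-declared charset wins and UTF-8 is
    only the fallback. With always_utf8=True (the proposal) any
    declaration is ignored.
    """
    if always_utf8 or header_charset is None:
        return "utf-8"
    return header_charset


# No server declaration, document encoded as windows-1252: the two
# loading paths already disagree, with or without the proposal.
page = script_element_encoding(None, None, "windows-1252")
worker_before = import_scripts_encoding(None, always_utf8=False)
worker_after = import_scripts_encoding(None, always_utf8=True)

# Server declares Big5: only here does the proposal change the result.
declared_before = import_scripts_encoding("big5", always_utf8=False)
declared_after = import_scripts_encoding("big5", always_utf8=True)
```

Under these assumptions the undeclared case differs between page and worker regardless of the proposal, and the proposal only changes the outcome when the server declares a non-UTF-8 charset.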
Re: [whatwg] Please always use utf-8 for Web Workers
Then I'm misunderstanding the suggestion. My reading of:

> Therefore, we should be able to always use utf-8 for workers. Always using utf-8 is simpler to implement and test and encourages people to switch to utf-8 elsewhere.

...was that we should ignore charset headers coming from the server and always treat script data imported via importScripts() as if it were encoded as utf-8 (i.e. skip step 3 of section 4.3 of the web workers spec), which seems like it's effectively changing the default decoding. That means someone naively serving up an existing Big5-encoded script (containing, say, string resources) with the appropriate charset header will find it fails when loaded into workers.

Again, apologies if I'm misunderstanding the suggestion.

-atw

On Fri, Sep 25, 2009 at 10:21 AM, Anne van Kesteren ann...@opera.com wrote:
> On Fri, 25 Sep 2009 19:16:47 +0200, Drew Wilson atwil...@google.com wrote:
>> Certainly. If I explicitly override the charset, then that seems like reasonable behavior.
>
> It does not need to be overridden per se. If the document character encoding is different from UTF-8 then a script loaded through <script> will be decoded differently from a script loaded through importScripts() as well.
>
>> Having the default decoding vary between importScripts() and <script> seems bad, especially since you can't override charsets with importScripts().
>
> This is already the case. The suggestion was not about changing the default.
>
> -- 
> Anne van Kesteren
> http://annevankesteren.nl/
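
The Big5 scenario can be reproduced outside a browser. The sketch below uses Python's built-in codecs to show what happens to a string resource when Big5 bytes are decoded as UTF-8; the sample script text is invented for the example, and the exact failure mode in a real worker may differ.

```python
# Hypothetical string resource, as it might appear in an existing script.
source = 'var title = "\u4e2d\u6587";'  # the two characters mean "Chinese"

# The bytes the server would send for a Big5-encoded file.
big5_bytes = source.encode("big5")

# Honouring the server's charset=Big5 declaration recovers the text.
as_declared = big5_bytes.decode("big5")

# Ignoring the declaration and assuming UTF-8 does not: the non-ASCII
# bytes are not valid UTF-8 and come out as U+FFFD replacement characters.
as_utf8 = big5_bytes.decode("utf-8", errors="replace")

print(as_declared)
print(as_utf8)
```

The ASCII portion of the script survives either way, so the script would likely still parse; it is the string contents that are silently corrupted.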