Re: browsers and unicode surrogates

2002-04-25 Thread Lars Marius Garshol
* Lars Marius Garshol | | It doesn't make much sense to have the meta statement there, as I | would expect most browsers to assume ASCII compatibility, but I agree | that "must only be used" sounds too harsh. | | [...] | | it struck us: if we can see that the page claims to be UTF-16, it | can't

Re: variations of UTF-16/UTF-32 and browsers' interpretation (was Re: browsers and unicode surrogates)

2002-04-24 Thread Tom Gewecke
Following is the result for Mac OS X and the OmniWeb browser on http://jshin.net/i18n/utftest. 5 cases of proper display: +Y, BE, 16, UTF-16 and UTF-16BE +Y, LE, 16, UTF-16 +N, BE, 16, UTF-16 and UTF-16BE. All the rest showed only the ASCII correctly.

Re: variations of UTF-16/UTF-32 and browsers' interpretation (was Re: browsers and unicode surrogates)

2002-04-24 Thread Michael Everson
At 01:42 +0100 2002-04-25, Michael Everson wrote: On http://jshin.net/i18n/utftest/bom_utf16be.utf16.html under OS X you don't see just question marks, though -- you see the Last Resort font showing that Korean characters not present in the font are in the text. Awesome. In OmniWeb at least.

RE: browsers and unicode surrogates

2002-04-23 Thread Yves Arrouye
| I am surprised by the "must only be used". It seems I am not | conforming by including a meta statement in the UTF-16 HTML page. I | should either remove the statement or encode the HTML up to and | including that statement as ASCII. I'll check on this. It doesn't make much sense to have

Re: browsers and unicode surrogates

2002-04-23 Thread Martin Duerst
At 22:25 02/04/19 +0100, Steffen Kamp wrote: However, when giving the validator an ASCII-only document with a META tag specifying UTF-16 as encoding (just for testing) it says that it does not yet support this encoding, so I don't fully trust the validator in this case. The validator indeed

Re: browsers and unicode surrogates

2002-04-23 Thread Martin Duerst
Just a very small correction: At 07:19 02/04/22 -0400, James H. Cloos Jr. wrote: There are other ways as well. Apache will already (if you use the default configs) add the Content-Language header if you use a filename like foo.en.html. You could have it also add the charset via a similar

Re: browsers and unicode surrogates

2002-04-22 Thread Tex Texin
Jungshik Shin, Hi! Just a couple of minor comments. Opera 6 lists UTF-16 as an encoding. Netscape 6.2 lists UTF-16LE. IE 6 does not list any UTF other than UTF-8. I haven't noticed any encodings becoming available or unavailable when I access pages with different encodings, but maybe there is a

Re: browsers and unicode surrogates

2002-04-22 Thread Lars Marius Garshol
* Tex Texin | | In looking at the HTML 4.01 spec to quote the above, I noted an | interesting sentence: | The META declaration must only be used when the character encoding | is organized such that ASCII-valued bytes stand for ASCII characters | (at least until the META element is parsed). | |
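The quoted HTML 4.01 sentence can be illustrated with a minimal Python sketch. It assumes a browser that looks for the META tag by scanning raw bytes as if they were ASCII; the sample markup and the helper name `meta_visible_to_ascii_scan` are hypothetical and not taken from the thread.

```python
# Sketch: why an ASCII-assuming scan for <meta ...> fails on UTF-16 bytes.
# The markup below is an illustrative sample, not one of the test pages.
HTML = (
    '<html><head><meta http-equiv="Content-Type" '
    'content="text/html; charset=utf-16"></head></html>'
)


def meta_visible_to_ascii_scan(raw: bytes) -> bool:
    """Return True if a byte-level, ASCII-assuming scan would find the META tag."""
    return b"<meta" in raw.lower()


# In UTF-8, ASCII-valued bytes stand for ASCII characters, so the tag is found.
utf8_bytes = HTML.encode("utf-8")

# In UTF-16, every ASCII character is padded with a zero byte, so the
# byte sequence "<meta" never appears contiguously.
utf16_bytes = HTML.encode("utf-16-be")

print(meta_visible_to_ascii_scan(utf8_bytes))
print(meta_visible_to_ascii_scan(utf16_bytes))
```

Under that assumption, the META declaration in a UTF-16 page can only be read by a parser that has already worked out the encoding some other way, such as a BOM or an HTTP header.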

Re: browsers and unicode surrogates

2002-04-22 Thread James H. Cloos Jr.
Tex == Tex Texin [EMAIL PROTECTED] writes: Tex I am surprised by the "must only be used". It seems I am not Tex conforming by including a meta statement in the UTF-16 HTML Tex page. I should either remove the statement or encode the HTML up Tex to and including that statement as ASCII. I'll check

Re: browsers and unicode surrogates

2002-04-22 Thread Stefan Persson
--- Tex Texin [EMAIL PROTECTED] skrev: Jungshik Shin, Opera 6 lists UTF-16 as an encoding. Netscape 6.2 lists UTF-16LE. IE 6 does not list any UTF other than UTF-8. I haven't noticed any encodings becoming available or unavailable when I access pages with different encodings, but maybe

Re: browsers and unicode surrogates

2002-04-22 Thread Tex Texin
Jim, thanks for all the info. I would prefer to not clutter filenames with encodings and locales, and the remainder I need to coordinate with my ISP. I'll talk to them and see what they let me do. tex James H. Cloos Jr. wrote: Tex == Tex Texin [EMAIL PROTECTED] writes: Tex I am surprised

Re: browsers and unicode surrogates

2002-04-22 Thread Tex Texin
Thanks Stefan, that's good to know. Seems a bit odd to hide the encoding abilities, especially when there is already a "more" menu pick... thanks for the info. tex Stefan Persson wrote: IE 5 supports more encodings than listed. For example, the western European DOS encoding is supported

Re: browsers and unicode surrogates

2002-04-22 Thread jshin
On Fri, 19 Apr 2002, Tom Gewecke wrote: With BOM at the beginning, Netscape 4.x, Netscape 6.x/Mozilla and MS IE 5.x/6.x can handle them without much problem except that support for characters above BMP varies from browser to browser as Tex tried to demonstrate in his test pages. The
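As a rough illustration of BOM-based detection, the following Python sketch checks the leading bytes of a document against the standard byte order marks. The function name `sniff_bom` is hypothetical, and this is not a description of how any of the browsers named above actually work.

```python
import codecs

# UTF-32 marks are checked before UTF-16 because the UTF-32LE BOM
# (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def sniff_bom(raw: bytes):
    """Return (encoding name, BOM length), or (None, 0) if no BOM is present."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name, len(bom)
    return None, 0


# Usage: strip the BOM and decode with the detected encoding.
data = codecs.BOM_UTF16_BE + "한글".encode("utf-16-be")
encoding, skip = sniff_bom(data)
text = data[skip:].decode(encoding)
```

One caveat of this ordering: a UTF-16LE document whose first character after the BOM is U+0000 would be misread as UTF-32LE, which is assumed here to be rare enough to ignore.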

Re: browsers and unicode surrogates

2002-04-22 Thread jshin
James H. Cloos Jr. wrote: Since you are using Apache, it is quite easy to get the extra headers sent at the protocol level rather than having to use meta tags. You can use a Header directive in an .htaccess file a la: <Files foobar.html> Header set Content-Language en-US

Re: browsers and unicode surrogates

2002-04-22 Thread David Starner
On Mon, Apr 22, 2002 at 12:19:15PM -0400, [EMAIL PROTECTED] wrote: My test pages don't yet have characters beyond the BMP (I just recycled a page I made a long time ago for Korean testing). I may later add them. (Tex, can I use your sample page? I'd rather put up a page with some content

Re: browsers and unicode surrogates

2002-04-19 Thread jshin
On Fri, 19 Apr 2002, Tom Gewecke wrote: I have added a couple more variations of the Unicode supplementary characters example page, for utf-16 and utf-32. I had the impression that it was not really practical to use web pages with these encodings over the internet, because they do not

Re: browsers and unicode surrogates

2002-04-19 Thread Steffen Kamp
I have added a couple more variations of the Unicode supplementary characters example page, for utf-16 and utf-32. I am not sure if your UTF-16 and UTF-32 test pages really conform to the HTML standard. The server states a content type of text/html without charset information. From the content

browsers and unicode surrogates

2002-04-17 Thread Tex Texin
I have added a couple more variations of the Unicode supplementary characters example page, for utf-16 and utf-32. I thought that because support for utf-8 supplementary characters was weak, that utf-16 would be weaker. So I was surprised that Netscape 4.7 displays the UTF-16 page. I didn't
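For readers unfamiliar with the surrogates in the subject line, here is a minimal Python sketch of how a supplementary character (above U+FFFF) is split into a UTF-16 surrogate pair. The function name `to_surrogate_pair` and the choice of U+1D11E as the sample character are illustrative assumptions, not taken from the test pages.

```python
def to_surrogate_pair(cp: int):
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    v = cp - 0x10000                  # 20-bit value
    high = 0xD800 + (v >> 10)         # top 10 bits
    low = 0xDC00 + (v & 0x3FF)        # bottom 10 bits
    return high, low


# U+1D11E (MUSICAL SYMBOL G CLEF) as a sample supplementary character.
high, low = to_surrogate_pair(0x1D11E)

# Python's own UTF-16 encoder produces the same two 16-bit units,
# while UTF-32 stores the code point directly in one 32-bit unit.
utf16 = chr(0x1D11E).encode("utf-16-be")
utf32 = chr(0x1D11E).encode("utf-32-be")
```

A browser that treats UTF-16 as a fixed two-byte encoding would see two unassigned code units here rather than one character, which is one plausible way support for supplementary characters can differ between encodings.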