On Sunday, Jun 8, 2003, at 11:16 US/Pacific, Greenhalgh David wrote: [..]
This isn't foolproof, unfortunately. A large part of the target audience
is English (or other) speakers in Japan. Unless they've manually set the
preferences in their browsers, they will show up as asking for the Japanese
content (assuming they bought the computer in Japan) since a Japanese browser
will happily display English, but the other way round isn't necessarily true.
Excuse me while I giggle a bit first, but nothing is foolproof but a Fool.
For they alone are free of the perception of their folly...
First off, it is really not the 'national origin' of the browser that
gives it the ability to render 'bit streams' into specific 'character
sets'; it is more a case of whether or not that bit stream flips the
right trigger in the browser, and whether the browser has the ability
to present the data in other than 7-bit ascii...[1]
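By way of illustration, a wee sketch of that 'trigger' - merely a
hypothetical CGI, where the charset declared in the Content-Type header is
what tells the browser how to treat the octets that follow (the body text
is just for show):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # the declared charset, not the browser's 'national origin', is the
    # trigger that decides how the bit stream gets rendered
    print "Content-Type: text/html; charset=UTF-8\r\n\r\n";
    print "<html><head><title>charset demo</title></head>\n";
    print "<body><p>anything beyond 7-bit ascii can ride along here</p></body></html>\n";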
We agree completely that it is reasonably simple to simply hand out mere
ascii, and/or its unicode equivalent. What I was thinking along the lines
of was caching the 'variations'
/some/path/<language>/
and based upon either a 'default' value or a 'preference', looking for the
'source' to be presented to the user in their specific 'rendered glyph set'.
Since Unicode has both dwarvish and klingon, you have less work to do as you
expand out from English and Japanese to include things like American, and
Cyrillic, and .... Since the method by which you switch the 'base directory'
to acquire the source data is basically set with the 'URL' style association,
all you need is someone who can translate and verify that your 'text' reads
in american as it would in english.... 8-)
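Something along these lines, say - a minimal sketch, assuming the variants
are cached under /some/path/<language>/ as above; the 'en'/'ja' tags and the
'default' are illustrative, and it does not bother with the full q-value
weighting a real negotiation would want:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my %have   = map { $_ => 1 } qw(en ja);        # variants we actually cache
    my $accept = $ENV{HTTP_ACCEPT_LANGUAGE} || ''; # e.g. "ja,en-us;q=0.7"

    my $lang = 'en';                               # the 'default' value
    for my $tag (split /\s*,\s*/, $accept) {
        $tag =~ s/;.*$//;                          # drop the ;q=... weighting
        ($tag) = $tag =~ /^([A-Za-z]+)/;           # "en-us" -> "en"
        next unless defined $tag;
        if ( $have{ lc $tag } ) { $lang = lc $tag; last }
    }

    my $base_dir = "/some/path/$lang";             # where to go fetch the source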
[..] Remember that the stuff between the html tags, eg
<p>$stuff</p>
So am I taking things too literally here in assuming the editor I use must be
set to plain text with no fancy encoding?
Perchance a bit too literally. What is required is that the editor you use
be able to save the 'document' in a way that the data itself carries with it
the 'mark up rules'.
A part of the problem being raised here is the interoperability of various tools with 'unicode' vice 'flat ascii'.
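For instance, a small sketch of that distinction in perl 5.8 - the same
(hypothetical) UTF-8 encoded file read once as plain octets and once through
an explicit encoding layer:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # 'page_ja.html' is a hypothetical UTF-8 encoded file
    open my $bytes, '<', 'page_ja.html' or die "open: $!";
    my $raw = do { local $/; <$bytes> };           # undecoded octets
    close $bytes;

    open my $chars, '<:encoding(UTF-8)', 'page_ja.html' or die "open: $!";
    my $text = do { local $/; <$chars> };          # decoded characters
    close $chars;

    printf "%d octets, %d characters\n", length($raw), length($text);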
When I have created the Japanese text in another app and tried to copy paste
it into the text editor, I lose one of the two bytes/character. If I set
BBEdit to correctly display Japanese I get compile errors. I'm obviously
missing something obvious here.
Remember that what you see on the glass tube may not be flat ascii; it may
well be the rendered product of things like 'rich text' - and hence your
'cut and paste' picked up part of the process and not all of it. One of the
allegations for using 'pdf' formatted data was that it provided the rendering
engine - as well as a 'well defined' set of unicode to pass.
As a general rule I send email in 'ascii mode' or as 'plain text', whichever
the MUA says, so that I do not have to ship along with the email the squiggly
stuff that explains how the 'text' should be rendered. If you have something
like the Mail.app and flip to the 'view raw source' you will find, as more
people ship 'rich text' and the like around, that there is 'interesting
header stuff' that explains how to 'decode' the email. In some cases the
actual 'email bit' is a block of gibberish that requires one to unwrap it
with a base64 decode to get the block into a format that can then be
presented as 'flat ascii text'...
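A minimal sketch of that unwrapping, using MIME::Base64 - the $gibberish
string is merely a stand-in for the block you would lift out of the raw
message source:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use MIME::Base64 qw(decode_base64);

    my $gibberish = "SGVsbG8sIHdvcmxkIQ==\n";   # stand-in for the encoded body part
    my $plain     = decode_base64($gibberish);
    print "$plain\n";                           # "Hello, world!"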
The same holds true with passing 'bit streams' using the HTTP protocol, vice smtp, and how you want that stuff rendered; eg: as a 'unicode' style, or as flat text, or....
HTH...
ciao drieux
---
[1] For fun, you may wish to get into the
LWP::UserAgent
and build a small 'browser' to 'fetch web pages', since that may help you
understand the 'advantages' and disadvantages of page fetching...
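Something along the lines of this sketch, say - the URL is illustrative,
and note that you can ask for a language the same way a 'real' browser would:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new( agent => 'tiny-fetcher/0.1' );
    my $response = $ua->get(
        'http://www.example.com/some/path/',
        'Accept-Language' => 'ja, en;q=0.5',
    );

    if ( $response->is_success ) {
        print $response->header('Content-Type'), "\n";   # see what charset came back
        print $response->content;
    }
    else {
        die "fetch failed: ", $response->status_line, "\n";
    }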