On Sunday, Jun 8, 2003, at 11:16 US/Pacific, Greenhalgh David wrote: [..]

This isn't foolproof, unfortunately. A large part of the target audience
is English (or other) speakers in Japan. Unless they've manually set the
preferences in their browsers, they will show up as asking for the Japanese
content (assuming they bought the computer in Japan) since a Japanese browser
will happily display English, but the other way round isn't necessarily true.

Excuse me while I giggle a bit first, but nothing is foolproof but a Fool.
For they alone are free of the perception of their folly...


First off, it is not really the 'national origin' of the browser that
gives it the ability to render 'bit streams' into specific 'character sets';
it is more a case of whether or not that bit stream flips the right
trigger in the browser, and whether the browser has the ability to present
the data in something other than 7-bit ASCII... [1]


We agree completely that it is reasonably simple to simply hand
out mere ASCII, and/or its Unicode equivalent. What I was thinking
along the lines of was caching the 'variations'

/some/path/<language>/

and, based upon either a 'default' value or a 'preference',
looking up the 'source' to be presented to the user in their
specific 'rendered glyph set'. Since Unicode has both dwarvish
and klingon, you have less work to do as you expand out from
English and Japanese to include things like American, and
Cyrillic, and .... Since the method by which you switch the
'base directory' to acquire the source data is basically set
with the 'URL'-style association, all you then need is someone
who can translate and verify that your 'text' reads in American
as it would in English.... 8-)
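
Something along these lines - a sketch only, assuming a CGI-ish environment
where the browser's Accept-Language header is the 'preference', and where the
language list and the 'en' default are purely illustrative - is what I mean:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my %have    = map { $_ => 1 } qw(en ja);   # the 'variations' we have cached
    my $default = 'en';                        # the 'default' value

    # take the first two-letter tag the browser offers, if any
    my ($want) = ( lc( $ENV{HTTP_ACCEPT_LANGUAGE} || '' ) =~ /([a-z]{2})/ );
    my $lang   = ( $want && $have{$want} ) ? $want : $default;

    my $base_dir = "/some/path/$lang";         # where to go get the 'source'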

Remember that the stuff between the HTML tags, e.g.

<p>$stuff</p>

is the part that actually has to vary per language; the markup
around it can stay plain ASCII.
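
To make that concrete - a trivial, purely hypothetical lookup, with $lang
hard-wired here where it would really come from the sketch above:

    use strict;
    use warnings;
    binmode STDOUT, ':encoding(UTF-8)';

    my $lang  = 'ja';                    # would come from the lookup above
    my %stuff = (
        en => 'Hello',
        ja => "\x{3053}\x{3093}\x{306B}\x{3061}\x{306F}",   # konnichiwa
    );
    my $stuff = $stuff{$lang} || $stuff{en};
    print "<p>$stuff</p>\n";             # the markup never changes, only $stuff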

So am I taking things too literally here in assuming the editor I use must be
set to plain text with no fancy encoding?
[..]

Perchance a bit too literally. What is required is that
the 'document' you create in an editor be saved in a way
that the data itself carries its 'mark up rules' along with it.
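
For instance - and this is only a sketch, with a made-up file name - perl
will happily write the data out with an explicit encoding layer, so that
the bytes on disk really are what the markup says they are:

    use strict;
    use warnings;

    open my $fh, '>:encoding(UTF-8)', 'greeting_ja.html'
        or die "can not open greeting_ja.html: $!";
    print {$fh} qq{<meta http-equiv="Content-Type" }
              . qq{content="text/html; charset=UTF-8">\n};
    print {$fh} "<p>\x{3053}\x{3093}\x{306B}\x{3061}\x{306F}</p>\n";
    close $fh or die "close failed: $!";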

A part of the problem being raised here is the interoperability
of various tools with 'Unicode' vice 'flat ASCII'.

When I have created the Japanese text in another app and tried to copy-paste
it into the text editor, I lose one of the two bytes/character. If I set
BBEdit to correctly display Japanese I get compile errors. I'm obviously
missing something obvious here.


Remember that what you see on the glass tube may not be flat
ASCII; it may well be the rendered product of things like
'rich text' - and hence your 'cut and paste' picked up part
of the process and not all of it. One of the arguments for
using 'pdf'-formatted data was that it provided the rendering
engine - as well as a 'well-defined' set of Unicode to pass.
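
Which is probably what is happening to the pasted Japanese text: it is
arriving as raw bytes in some encoding and being treated as single-byte
characters. A small sketch with the Encode module, assuming - and it is only
an assumption - that the bytes arrived as Shift-JIS:

    use strict;
    use warnings;
    use Encode qw(decode encode);

    # hypothetical sample: the Shift-JIS byte pairs for 'ko' and 'n'
    my $bytes = "\x82\xB1\x82\xF1";
    my $chars = decode('shiftjis', $bytes);   # now 2 characters, not 4 bytes
    my $utf8  = encode('UTF-8', $chars);      # re-encode for the page you serve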

As a general rule I send email in 'ASCII mode' or as 'plain text',
whichever the MUA calls it, so that I do not have to ship along with
the email the squiggly stuff that explains how the 'text' should be
rendered. If you have something like Mail.app and flip to
'view raw source', you will find that, as more people ship 'rich text'
and the like around, there is 'interesting header stuff' that
explains how to 'decode' the email. In some cases the actual
'email bit' is a block of gibberish that requires one to
unwrap it with base64 to get the block into a format that
can then be presented as 'flat ASCII text'...
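
Unwrapping such a block is not much work in perl either; a minimal sketch,
where the base64 string is simply 'hello, world' encoded by hand:

    use strict;
    use warnings;
    use MIME::Base64 qw(decode_base64);

    my $gibberish = "aGVsbG8sIHdvcmxk\n";
    my $plain     = decode_base64($gibberish);   # back to 'flat ASCII text'
    print "$plain\n";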

The same holds true for passing 'bit streams' using the
HTTP protocol, vice SMTP, and how you want that stuff rendered;
e.g. as a 'Unicode' style, or as flat text, or....
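
With HTTP the 'how to render this' hint rides in the Content-Type header,
much as it does in the mail headers; a CGI handing out a page would lead
with something like this (the charset is just an example - swap in
Shift_JIS, EUC-JP, etc. as the case may be):

    #!/usr/bin/perl
    use strict;
    use warnings;

    # the header tells the browser which charset the body is in
    print "Content-Type: text/html; charset=UTF-8\r\n\r\n";
    print "<p>plain old ASCII works in any of them</p>\n";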

HTH...


ciao drieux

---


[1] For fun, you may wish to get into

    LWP::UserAgent

and build a small 'browser' to 'fetch web pages', since that may
help you understand the 'advantages' and disadvantages of page fetching...
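
Something along these lines - the URL here is only a placeholder - will show
you what a server actually hands back for a given Accept-Language header:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua  = LWP::UserAgent->new;
    my $res = $ua->get(
        'http://www.example.com/',
        'Accept-Language' => 'ja, en;q=0.5',   # pretend to be a Japanese browser
    );

    if ( $res->is_success ) {
        print "Content-Type:     ", $res->header('Content-Type'), "\n";
        print "Content-Language: ", $res->header('Content-Language') || 'none', "\n";
    }
    else {
        print "fetch failed: ", $res->status_line, "\n";
    }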



