Re: [increasingly OT--but it's Saturday night] Re: Unicode HTML, download
E. Keown k underscore isoetc at yahoo dot com wrote:

> What's the point, really, of going far beyond, even beyond CSS, into XHTML, where few computational Hebraists have gone before?
>
> Sorry, but I think this stuff is the least interesting thing one can do on a computer (no offense). Well, COBOL was my worst experience so far...

You are right. There shouldn't be any need to resort to fancy tricks, or even XHTML (which is by no means fancy), just to display Hebrew properly on a variety of browsers. That was your original question.

I think the most important thing, if you want to ensure correct operation on as many platforms as possible, is to validate your HTML using the W3C Markup Validation Service:

http://validator.w3.org/

That will keep you from accidentally using browser-specific tricks and ensure that your HTML is clean. Most browsers will behave correctly when handed clean HTML.

Beyond that, you might want to specify a font family using CSS (doesn't have to be in a separate CSS file, either) to improve the odds that the reader will see Hebrew instead of hollow boxes, but this is optional.

-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/
Re: [increasingly OT--but it's Saturday night] Re: Unicode HTML, download
Doug Ewell wrote:

> Beyond that, you might want to specify a font family using CSS (doesn't have to be in a separate CSS file, either) to improve the odds that the reader will see Hebrew instead of hollow boxes, but this is optional.

While we are on the (off) topic of HTML, browsers, etc.:

I've noticed that with Windows and IE, when going to a page with characters for a script for which fonts are not installed on my system, IE will sometimes ask whether or not I want to download and install fonts for that script from Microsoft's web site.

This only happens in some cases, even where the same script is involved. I've looked at the source of some of these pages but I've never been able to identify just what triggers this. Does anyone know?

- Chris
Re: [even more increasingly OT-- into Sunday morning] Re: Unicode HTML, download
From: Stefan Persson [EMAIL PROTECTED]

> I haven't used M$ IE for many years, though, and my memory might be wrong.

Blinded by the misspelling of the product name, maybe? :-)

See http://msdn.microsoft.com/msdnmag/issues/0700/localize/ and the section entitled "Choosing Character Sets" for info on what is going on here, particularly Figures 3 and 4 for info on how to script the behavior for the UTF-8 case.

MichKa [MS]
NLS Collation/Locale/Keyboard Technical Lead
Globalization Infrastructure, Fonts, and Tools
Windows International Division
Re: [even more increasingly OT-- into Sunday morning] Re: Unicode HTML, download
Michael (michka) Kaplan scripsit:

>> I haven't used M$ IE for many years, though, and my memory might be wrong.
>
> Blinded by the misspelling of the product name, maybe? :-)

No, that's just a glyph difference. :-)

> See http://msdn.microsoft.com/msdnmag/issues/0700/localize/ and the section entitled "Choosing Character Sets" for info on what is going on here, particularly Figures 3 and 4 for info on how to script the behavior for the UTF-8 case.

Nice article, though it's obnoxious that the figures will only open in a pop-up window.

--
Ambassador Trentino: I've said enough. I'm a man of few words.
Rufus T. Firefly: I'm a man of one word: scram!
        --Duck Soup
John Cowan [EMAIL PROTECTED]
Re: [even more increasingly OT-- into Sunday morning] Re: Unicode HTML, download
From: Christopher Fynn [EMAIL PROTECTED]

> I'd also like to figure out a way to trigger this kind of behavior in other browsers as well as in IE (using JavaScript or Java rather than VB) as not quite everyone uses IE - (but I guess you are not going to give me any more clues on how to do that :-) )

If only there were a portable way to determine in JavaScript that a string can be rendered with the existing fonts, or to enumerate the installed fonts and get some of their properties... We could prompt the user to install some fonts or change their browser settings, or we could auto-adapt the CSS style rules, notably the list of fonts inserted in the font-family: or abbreviated font: CSS properties...

There are limited controls with the CSS @-rules that allow building virtual font names, but not enough to tune the font selections by script or by code point ranges. And JavaScript is of little help in working around this.

Certainly there's a need to include, in a refined standard DOM for styles, the properties needed to manage preferred font stacks associated with a virtual font name (for example, in a way similar to what Java2D v1.5 allows), which can then be referenced directly within legacy HTML <font face="virtualname"> or in CSS font-family: virtualname properties. (Some examples of virtual font names are standardized in HTML: serif, sans-serif, monospace; Java2D or AWT adds dialog and dialoginput; but other virtual names could be defined as well, like decorated or handscript or ocr.)

The key issue here is to create documents that refer to font families according to their usage rather than their exact appearance and the limited set of languages and scripts they support.

Another possibility would be to create a portable but easily tunable font format (XML-based, so that fonts can be created or tuned by scripting through DOM?) which would be a list of references to various external but actual fonts or glyph collections, with parameters that allow selecting among them with various priorities.

For now this is not implemented in font technologies (OpenType, Graphite, ...) but within vendor-specific renderer APIs (which contain some rules to create such font mappings).
Re: [increasingly OT--but it's Saturday night] Re: Unicode HTML, download
Elaine Keown
Seattle (only 11 hours now...)

Dear Doug Ewell, fantasai and List:

I will try to sort out these diverse pieces of advice.

What's the point, really, of going far beyond, even beyond CSS, into XHTML, where few computational Hebraists have gone before?

Sorry, but I think this stuff is the least interesting thing one can do on a computer (no offense). Well, COBOL was my worst experience so far...

I've partly learned CSS, I guess---elegant placement options!!--much better than HTML (clunky).

But I discovered that the Web is full of bad CSS, even by supposed gurus, and they never tell you which browser/operating system/whatever their code might be good for.

EK
Re: [even more increasingly OT-- into Sunday morning] Re: Unicode HTML, download
These are JScript tags in HTML -- client-side script.

I do not know if other browsers have solutions for this problem.

Michael

- Original Message -
From: Christopher Fynn [EMAIL PROTECTED]
Cc: Michael (michka) Kaplan [EMAIL PROTECTED]; Unicode List [EMAIL PROTECTED]
Sent: Sunday, November 21, 2004 7:49 AM
Subject: Re: [even more increasingly OT-- into Sunday morning] Re: Unicode HTML, download

> Thanks Michael
>
> This is useful information. Unfortunately I usually need to use static HTML, so I can't use the ASP parts.
>
> It would be nice to see something like this working on UTF-8 encoded web pages where lang is defined. In most cases knowing the text is in a specific language and knowing the page is Unicode would let you know which script is being used.
>
> I'd also like to figure out a way to trigger this kind of behavior in other browsers as well as in IE (using JavaScript or Java rather than VB) as not quite everyone uses IE - (but I guess you are not going to give me any more clues on how to do that :-) )
>
> regards
>
> - Chris
>
> Michael (michka) Kaplan wrote:
>
>> From: Stefan Persson [EMAIL PROTECTED]
>>
>>> I haven't used M$ IE for many years, though, and my memory might be wrong.
>>
>> Blinded by the misspelling of the product name, maybe? :-)
>>
>> See http://msdn.microsoft.com/msdnmag/issues/0700/localize/ and the section entitled "Choosing Character Sets" for info on what is going on here, particularly Figures 3 and 4 for info on how to script the behavior for the UTF-8 case.
>>
>> MichKa [MS]
>> NLS Collation/Locale/Keyboard Technical Lead
>> Globalization Infrastructure, Fonts, and Tools
>> Windows International Division
Re: Unicode HTML, download
On 21/11/2004 00:05, Edward H. Trager wrote:

> ...
> A better CSS class would additionally specify the font-family, for example, something like the SIL Ezra font (http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=EzraSIL_Home)
>
> (4) Since your readers may not have certain fonts: in the case of legally downloadable fonts like SIL Ezra, I would definitely put a link to the download site so readers can download the (Hebrew) fonts if they need them to view your page.

Please don't use SIL Ezra for such purposes. It is a legacy-encoded and visually ordered Hebrew font, and is not rendered correctly in IE6. Instead, please use Ezra SIL, which is basically the same outlines but properly Unicode encoded. The URL given is for Ezra SIL, and it is a free download.

By the way, this font mostly works fine with any Windows (95+) system. Office 2003 is required only for ideal placement of certain accents etc.

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/
Re: Unicode HTML, download
On 21/11/2004 15:28, Philippe Verdy wrote:

> From: Peter Kirk [EMAIL PROTECTED]
>
>> On 21/11/2004 00:50, Philippe Verdy wrote:
>>
>>> ...
>>> <style type="text/css"><!--
>>> .he { font-family: SIL Ezra, Arial Unicode MS, David, Miriam, Tahoma, Arial, sans-serif; direction: rtl; }
>>
>> This will absolutely NOT work because SIL Ezra is legacy encoded and the others are Unicode encoded. You should be using Ezra SIL. See my previous posting.
>
> Thanks for this correction. I thought that this font was Unicode too...

Please read my earlier posting. Of course it does make things rather difficult that none of my postings ever get approved on a Sunday, especially when I am trying to correct seriously misleading factual errors.

> But this creates an even more complicated case for creating a portable HTML page: as the font uses a specific encoding, how can characters be selected in that font, given that the page will be UTF-8 encoded and thus will contain numeric references to Unicode code points?
>
> Does this font work as if it were assigning ISO-8859-1 characters? If so, Elaine will need to use only Latin-1, which will be rendered as expected only if the specific font is installed. If it is not, readers will see Latin-1 characters, and none of the Hebrew characters present in most classic core fonts of their browser...

If you really want to know, the font SIL Ezra (which was never intended for Unicode use) uses PUA characters F020 to F0FF only. It is totally unsuitable for web use because it uses some of these PUA characters as combining marks, and this usage is not supported (for some reason which has never been explained) by the world's most popular browser (although it was supported by previous versions, hence breaking a large number of existing web pages using legacy encodings for Hebrew, Greek etc. with diacritics). So please don't even think of how to trick browsers into using SIL Ezra, which would also require support for visual encoding.

> So if she really wants to include character compositions which are only possible with Ezra SIL, she will need these two classes:
>
> <style type="text/css"><!--
> .he { font-family: Arial Unicode MS, David, Miriam, Tahoma, Arial, sans-serif; }
> .heb { font-family: Ezra SIL }
> .he, .heb { direction: rtl; }
> //--></style>

No problem if you are using Ezra SIL, which is a different font from SIL Ezra, and is Unicode mapped and so can be mixed with the others you mention.

> ...
> I still doubt that you need such a specialized font for Biblical Hebrew and Canaanite languages, to create a technical translation glossary, which would probably use modern Hebrew only (so the he class above would probably be enough...)

David is a very adequate font for Hebrew with consonants and vowel points, as long as accents are not required, and Elaine is very unlikely to require them. Times New Roman is fine for unpointed consonantal Hebrew only, as its Holam point is unfortunately broken. Arial and Arial Unicode MS are probably OK for modern Hebrew but look odd to those of us more used to the ancient language, and their Holam is also broken. Miriam doesn't look good at all, to me.

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/
Re: Unicode HTML, download
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:

> So if she really wants to include character compositions which are only possible with Ezra SIL, she will need these two classes:
>
> <style type="text/css"><!--
> .he { font-family: Arial Unicode MS, David, Miriam, Tahoma, Arial, sans-serif; }
> .heb { font-family: Ezra SIL }
> .he, .heb { direction: rtl; }
> //--></style>
>
> and use preferably the he class name for all Hebrew characters which can be represented with Unicode code points and Unicode fonts found in common browsers, surrounding only the specific sections requiring the SIL encoding mapped on ISO-8859-1 within <span class="heb"> elements.

Absolutely not. No way. A document should NEVER contain text in two or more character encodings with changes indicated only by font suggestions.

This approach will destroy searching capabilities, and will not ensure proper rendering in any event. The user who has Miriam but not Ezra SIL (or vice versa) will see some Hebrew text rendered properly and some improperly, for no apparent reason. This is worse than either the all-Unicode or all-Ezra approach. Don't do it, Elaine.

The only time a document should EVER be presented in mixed encodings is for direct illustration of encoding issues (intended for Unicode weenies) or in a MIME-like setting where the document is divided into logical sections, with the encoding of each section clearly indicated. This is true for all types of documents, not just Web pages.

If Elaine suspects that some of her HTML will not be displayed properly with commonly available Unicode fonts, she will have to bite the bullet and either:

(a) code the whole page in Unicode, and provide a link to a comprehensive-enough Hebrew Unicode font, OR

(b) code the whole page in the legacy encoding, and provide a link to Ezra SIL.

Cryptically naming these two CSS classes .he and .heb, which provides no indication of which is the Unicode encoding and which is the Latin-1 hack, merely makes a bad suggestion worse.

-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/
Re: Unicode HTML, download
Peter Kirk peterkirk at qaya dot org wrote:

> Please don't use SIL Ezra for such purposes, which is a legacy encoded and visually ordered Hebrew font, and is not rendered correctly in IE6. Instead, please use Ezra SIL, which is basically the same outlines but properly Unicode encoded. The URL given is for Ezra SIL, and it is a free download.

This makes things a little clearer: Philippe's bad advice to mix encodings was based on bad information, that doing so would be necessary.

The best advice for Elaine's situation becomes simpler. To maximize the likelihood that readers will see the right glyphs, add a font-family style line that lists a variety of available fonts, in decreasing order of coverage and attractiveness.

-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/
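[Editorial sketch] For anyone who generates pages by script, the font list Doug describes can be kept in one place and emitted as a single style rule. The Python below is a minimal sketch under stated assumptions: the helper name css_font_family, the class name .hebrew, and the particular ordering (Ezra SIL, then David, then Times New Roman, then the generic serif family) are illustrative choices drawn loosely from the fonts mentioned in this thread, not a recommendation made by any poster.

```python
def css_font_family(selector, fonts, generic="serif"):
    """Build one CSS rule listing fonts in priority order.

    The selector, the font order, and the generic fallback are
    assumptions for illustration, not anything from the thread.
    """
    names = []
    for name in fonts:
        # Quote family names that contain spaces, as CSS recommends.
        names.append(f'"{name}"' if " " in name else name)
    # Always end with a generic family so the browser has a last resort.
    names.append(generic)
    return f"{selector} {{ font-family: {', '.join(names)}; direction: rtl; }}"


# Hypothetical ordering: most complete Hebrew coverage first.
HEBREW_FONTS = ["Ezra SIL", "David", "Times New Roman"]

rule = css_font_family(".hebrew", HEBREW_FONTS)
print(rule)
```

The browser walks the list from left to right and uses the first family it can find, so the order of the Python list is the order of preference.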
Font selection, font downloads, and (writing system) scripts
This discussion belongs on www-style, so setting Reply-To to there.

Philippe, could you explain what you meant by

> The key issue here is to create documents that refer to font families according to their usage rather than their exact appearance and the limited set of languages and scripts they support.

?

~fantasai

Philippe Verdy wrote:

> From: Christopher Fynn [EMAIL PROTECTED]
>
> ...
>
>> Christopher Fynn wrote:
>>
>>> I've noticed that with Windows and IE, when going to a page with characters for a script for which fonts are not installed on my system, IE will sometimes ask whether or not I want to download and install fonts for that script from Microsoft's web site.
>>>
>>> This only happens in some cases, even where the same script is involved. I've looked at the source of some of these pages but I've never been able to identify just what triggers this. Does anyone know?
>>
>> ...
>>
>> I'd also like to figure out a way to trigger this kind of behavior in other browsers as well as in IE (using JavaScript or Java rather than VB) as not quite everyone uses IE - (but I guess you are not going to give me any more clues on how to do that :-) )
>
> If only there were a portable way to determine in JavaScript that a string can be rendered with the existing fonts, or to enumerate the installed fonts and get some of their properties... We could prompt the user to install some fonts or change their browser settings, or we could auto-adapt the CSS style rules, notably the list of fonts inserted in the font-family: or abbreviated font: CSS properties...
>
> There are limited controls with the CSS @-rules that allow building virtual font names, but not enough to tune the font selections by script or by code point ranges. And JavaScript is of little help in working around this.
>
> Certainly there's a need to include, in a refined standard DOM for styles, the properties needed to manage preferred font stacks associated with a virtual font name (for example, in a way similar to what Java2D v1.5 allows), which can then be referenced directly within legacy HTML <font face="virtualname"> or in CSS font-family: virtualname properties. (Some examples of virtual font names are standardized in HTML: serif, sans-serif, monospace; Java2D or AWT adds dialog and dialoginput; but other virtual names could be defined as well, like decorated or handscript or ocr.)
>
> The key issue here is to create documents that refer to font families according to their usage rather than their exact appearance and the limited set of languages and scripts they support.
>
> Another possibility would be to create a portable but easily tunable font format (XML-based, so that fonts can be created or tuned by scripting through DOM?) which would be a list of references to various external but actual fonts or glyph collections, with parameters that allow selecting among them with various priorities.
>
> For now this is not implemented in font technologies (OpenType, Graphite, ...) but within vendor-specific renderer APIs (which contain some rules to create such font mappings).
Re: Unicode HTML, download
I wrote:

> The best advice for Elaine's situation becomes simpler. To maximize the likelihood that readers will see the right glyphs, add a font-family style line that lists a variety of available fonts, in decreasing order of coverage and attractiveness.

Actually, of course, the only way to *guarantee* that readers will see the right glyphs is to chuck HTML altogether and create a PDF file.

-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/
Re: Font selection, font downloads, and (writing system) scripts
Fantasai wrote,

> This discussion belongs on www-style, so setting Reply-To to there.

And if you're going to do that then, as a matter of etiquette, please don't CC the Unicode list.

When you CC the Unicode list and some other list, people on the other list may try to "reply all" and include both lists. For hot topics, this can result in a cross-posting mess and people seeing half the story. And some people may get "you can't post here because you're not subscribed" messages.

Thanks,
Rick
Re: Unicode HTML, download
From: Doug Ewell [EMAIL PROTECTED]

> The best advice for Elaine's situation becomes simpler. To maximize the likelihood that readers will see the right glyphs, add a font-family style line that lists a variety of available fonts, in decreasing order of coverage and attractiveness.

My bad advice did indeed come from confusing two SIL-related fonts: one with a legacy encoding (handled in browsers as if it were ISO-8859-1 encoded, so that you need to insert text in the HTML page using only the code points in the Latin-1 page starting at U+, even though they do not represent the correct Unicode characters), and the other coded with Unicode (for which you need to encode your text with Hebrew code points...).

But your advice, Doug, still won't work when multiple fonts in the font-family style use distinct encodings: mixing SIL Ezra with Arial, or similar Unicode-encoded fonts, will never produce the intended fallbacks if users don't have SIL Ezra actually installed and selectable in their browser environment.

Legacy-encoded fonts contain only a codepage/charset identifier (most often ISO-8859-1) and no character-to-glyph translation table. They also don't work properly with browsers configured for accessibility, where only the user-defined preferred fonts are allowed and fonts specified in HTML pages must be ignored by the browser, user styles having been set to a higher priority (even if one uses the !important CSS style rule marker). The exception is when the default font mapping associated with the codepage/charset identifier actually corresponds to what would be found in a regular char-to-glyph mapping table present in that font.
Re: Unicode HTML, download
From: Doug Ewell [EMAIL PROTECTED]

> Cryptically naming these two CSS classes .he and .heb, which provides no indication of which is the Unicode encoding and which is the Latin-1 hack, merely makes a bad suggestion worse.

It was not cryptographic: he was meant for Hebrew (generic, properly Unicode encoded, suitable for any modern Hebrew), and heb for Biblical Hebrew, where a legacy encoding may still be needed in the absence of workable Unicode support for now. This won't be the same language, however, so a change of encoding may be justified.

I was not advocating mixing encodings within the same text for the same language... But I was nearly sure that a technical jargon in Hebrew would probably not need Biblical Hebrew, except for illustration purposes within small delimited block quotes or spans, where there will be simultaneous changes of:

- language level
- needed character set, some characters not being encodable with Unicode
- encoding (from Unicode to the Latin-1 override hack)
- specific font to render the legacy encoding.

In that case, it is acceptable to have the general text in modern Hebrew properly coded with Unicode, even if the small illustrative quotes remain fully in a non-standard mapping and won't appear correctly without the necessary font.

Note that PDF files DO mix encodings within the embedded fonts that PDF writers dynamically create for only the necessary glyphs. These encodings are specific to the document, for each embedded font... This is why PDF files can encode text that still doesn't have Unicode character mappings. You can see that when you attempt to copy/paste text fragments from PDF files in sections using embedded fonts: the pasted text will not reproduce the same characters as what you can see in the PDF reader. Copy/pasting does work, however, for PDF files using external fonts with standard mappings.
Re: Ezra
On Sunday 2004.11.21 14:38:09 +, Peter Kirk wrote:

> On 21/11/2004 00:05, Edward H. Trager wrote:
>
>> ...
>> A better CSS class would additionally specify the font-family, for example, something like the SIL Ezra font (http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=EzraSIL_Home)
>>
>> (4) Since your readers may not have certain fonts: in the case of legally downloadable fonts like SIL Ezra, I would definitely put a link to the download site so readers can download the (Hebrew) fonts if they need them to view your page.
>
> Please don't use SIL Ezra for such purposes, which is a legacy encoded and visually ordered Hebrew font, and is not rendered correctly in IE6. Instead, please use Ezra SIL, which is basically the same outlines but properly Unicode encoded. The URL given is for Ezra SIL, and it is a free download.

Are you saying the difference in names is "SIL Ezra" vs. "Ezra SIL"? That's too confusing! When I gave the URL, I checked that it was referring to an OpenType font, and since it was OpenType, I assumed that it was the newer version with a Unicode CMAP. If SIL still has links to legacy non-Unicode versions of fonts which now also have Unicode versions, then they should make this really clear to people. My apologies if I provided the wrong URL.

> By the way, this font mostly works fine with any Windows (95+) system. Office 2003 is required only for ideal placement of certain accents etc.
>
> --
> Peter Kirk
> [EMAIL PROTECTED] (personal)
> [EMAIL PROTECTED] (work)
> http://www.qaya.org/
Re: [increasingly OT--but it's Saturday night] Re: Unicode HTML, download
On Sunday 2004.11.21 00:06:31 -0800, Doug Ewell wrote:

> E. Keown k underscore isoetc at yahoo dot com wrote:
>
>> What's the point, really, of going far beyond, even beyond CSS, into XHTML, where few computational Hebraists have gone before?
>>
>> Sorry, but I think this stuff is the least interesting thing one can do on a computer (no offense). Well, COBOL was my worst experience so far...
>
> You are right. There shouldn't be any need to resort to fancy tricks, or even XHTML (which is by no means fancy), just to display Hebrew properly on a variety of browsers. That was your original question.

I beg to differ with Doug Ewell here: using XHTML and some very basic CSS1 is not, in my opinion, resorting to fancy tricks. XHTML is very simple to do correctly, and more consistent than HTML 4.01. Philippe Verdy also provided some good advice on what a CSS class for Hebrew might look like.

XHTML has a consistent set of rules that apply across all tags: I would argue that this is *easier* to learn and stick to than old-style HTML. And proper use of CSS really allows one to separate one's content from the display of that content. For me, the combination of XHTML and CSS is so much easier than what I used to suffer through in the bad old days of HTML before CSS came along ...

I do agree with Doug that validation using the W3C.org or similar validator is absolutely essential.

But this thread is getting off-topic. The intent of my original post was merely to suggest Elaine take a look at using XHTML, CSS, and UTF-8 for her documents.

> I think the most important thing, if you want to ensure correct operation on as many platforms as possible, is to validate your HTML using the W3C Markup Validation Service:
>
> http://validator.w3.org/
>
> That will keep you from accidentally using browser-specific tricks and ensure that your HTML is clean. Most browsers will behave correctly when handed clean HTML.
>
> Beyond that, you might want to specify a font family using CSS (doesn't have to be in a separate CSS file, either) to improve the odds that the reader will see Hebrew instead of hollow boxes, but this is optional.
>
> -Doug Ewell
> Fullerton, California
> http://users.adelphia.net/~dewell/
Re: [increasingly OT--but it's Saturday night] Re: Unicode HTML, download
From: E. Keown [EMAIL PROTECTED]

> Dear Doug Ewell, fantasai and List: I will try to sort out these diverse pieces of advice. What's the point, really, of going far beyond, even beyond CSS, into XHTML, where few computational Hebraists have gone before?

You're right, Elaine, the web is full of non-XHTML-conforming documents. You probably don't need full XHTML conformance either, but having your document respect the XML nesting and closure of elements is certainly a must today, because it avoids most interoperability problems in browsers.

So: make sure all your HTML elements and attributes are lowercase, and close ALL elements (even empty elements, which should be closed by /> instead of just >, for example <br /> instead of <br>, and even <li>...</li>, or <p>...</p>).

And then don't embed structural block elements (like <p>...</p> or <div>...</div> or <blockquote>...</blockquote> or <li>...</li> or <table>...</table>) within inline elements (like <b>...</b> or <font>...</font> or <a href=...>...</a> or <span>...</span>).

Note that most inline elements are related to style, and they fit better outside of the body, by assigning style classes to the structural elements (most of them are block elements). XHTML has deprecated most inline style elements, in favor of external specification of style through the class attribute added to structural block elements.

XHTML has excellent interoperability with a wider range of browsers, including old ones, except for the actual rendering of some CSS styles.

The cost to convert an HTML file to full XML well-formedness is minor for you, but this allows you to use XML editors to make sure the document is properly nested, a precondition that will greatly help its interoperable interpretation.

If you have FrontPage XP or 2003, you can use its "apply XML formatting rules" option to do this job nearly automatically, and make sure that all elements are properly nested and closed.
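[Editorial sketch] The two rules Philippe lists first (close every element, keep element names lowercase) can be checked mechanically with any XML parser. The Python below is a rough sketch using the standard library's xml.etree.ElementTree; the function name check_wellformed and the sample fragments are hypothetical. It tests XML well-formedness only, not XHTML validity, so it complements rather than replaces the W3C validator mentioned earlier in the thread. Note too that this parser rejects HTML named entities such as &nbsp; unless they are declared.

```python
import xml.etree.ElementTree as ET


def check_wellformed(markup):
    """Return (ok, message) for a fragment with a single root element.

    Checks only XML well-formedness and lowercase element names;
    it does not validate against any XHTML DTD.
    """
    try:
        root = ET.fromstring(markup)
    except ET.ParseError as err:
        # Unclosed or badly nested elements end up here.
        return False, f"not well-formed: {err}"
    # XHTML element names are case-sensitive and must be lowercase.
    bad = sorted({el.tag for el in root.iter() if el.tag != el.tag.lower()})
    if bad:
        return False, "uppercase element names: " + ", ".join(bad)
    return True, "ok"


# Hypothetical fragments for illustration.
print(check_wellformed("<p>shalom<br /></p>"))  # closed empty element
print(check_wellformed("<p>shalom<br></p>"))    # <br> never closed
print(check_wellformed("<P>shalom</P>"))        # uppercase element name
```

The second fragment fails in the parser itself, while the third parses as XML but is flagged by the lowercase check.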
Re: Ezra
From: Edward H. Trager [EMAIL PROTECTED]

> Are you saying the difference in names is "SIL Ezra" vs. "Ezra SIL"? That's too confusing!

You're not alone in being confused. I had completely forgotten the existence of two versions of the same font design. I may have just seen that it used PUAs, so I did not install it. (I did not remember that it used PUAs, and the wording of the sentence that introduced it in this discussion made me think that it was NOT using Unicode, and thus not PUAs, which are Unicode things; that's why I supposed it was using some legacy Latin-1 override or similar hacks found in some special-purpose fonts, or in legacy non-TrueType-based font formats, like PostScript mappings within a 0-based indexed vector or hashed dictionary of glyph names...)
Re: Unicode HTML, download
Peter Kirk scripsit:

> Please read my earlier posting. Of course it does make things rather difficult that none of my postings ever get approved on a Sunday, especially when I am trying to correct seriously misleading factual errors.

Yr hble Hebrew Moderator attempts to work 24/7, but occasionally the need to sleep or to engage in business (I was at a conference all last week) or family business (a death in a friend's family) interferes with this otherwise laudable goal.

--
John Cowan [EMAIL PROTECTED] www.ccil.org/~cowan www.reutershealth.com
In computer science, we stand on each other's feet.
        --Brian K. Reid
Re: Unicode HTML, download
On 21/11/2004 17:35, Doug Ewell wrote: ... This approach will destroy searching capabilities, and will not ensure proper rendering in any event. The user who has Miriam but not Ezra SIL (or vice versa) will see some Hebrew text rendered properly and some improperly, for no apparent reason. ... Not true of Ezra SIL, only of SIL Ezra. Sorry to keep repeating myself, but these errors keep being perpetuated. ... (b) code the whole page in the legacy encoding, and provide a link to Ezra SIL. Ezra SIL does not use a legacy encoding, it is a Unicode font. Later, Doug wrote: This makes things a little clearer: Philippe's bad advice to mix encodings was based on bad information, that doing so would be necessary. I had already corrected the bad information, and Philippe quoted my correction. He simply failed to recognise that SIL Ezra != Ezra SIL. Not my choice of naming conventions, but it is consistent with several SIL fonts: SIL xxx is a legacy encoded version, xxx SIL is the Unicode version of it. Philippe wrote: two SIL related fonts: one with legacy encoding (handled in browsers as if it was ISO-8859-1 encoded, so that you need to insert text in the HTML page using only the code points in the Latin-1 page starting at U+, even though they do not represent the correct Unicode characters) ... More bad information. As I already wrote, SIL Ezra is encoded in the PUA and not as if it was ISO-8859-1 encoded. So this technique will not work. Mixing SIL Ezra with Arial, or similar Unicode encoded fonts will never produce the intended fallbacks if users don't have SIL Ezra effectively installed and selectable in their browser environment. Mixing SIL Ezra with Arial, or similar Unicode encoded fonts, is A BAD THING. Period. Don't even think of trying it, especially in HTML. Instead, use Ezra SIL. And use Times New Roman rather than Arial as the fallback because it looks much more similar - or David, which is less similar but gives better results. 
-- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/
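A rough way to tell which kind of text a page actually contains is to look at the code points themselves. The Python sketch below is only an illustration under stated assumptions. It takes the statement above that SIL Ezra text sits in the PUA to mean the Basic Multilingual Plane Private Use Area (U+E000 to U+F8FF), and it treats anything in the Hebrew block (U+0590 to U+05FF) as properly encoded Unicode Hebrew. The function name classify_hebrew_text is made up for this example and is not part of any SIL tool.

```python
def classify_hebrew_text(text: str) -> str:
    """Return 'unicode', 'pua', 'mixed', or 'none' for the given text.

    Assumption: legacy-font text lives in the BMP Private Use Area,
    while real Unicode Hebrew lives in the Hebrew block. Hebrew
    presentation forms (U+FB1D to U+FB4F) are deliberately ignored
    to keep the sketch short.
    """
    has_hebrew = any(0x0590 <= ord(ch) <= 0x05FF for ch in text)
    has_pua = any(0xE000 <= ord(ch) <= 0xF8FF for ch in text)
    if has_hebrew and has_pua:
        return "mixed"
    if has_pua:
        return "pua"
    if has_hebrew:
        return "unicode"
    return "none"
```

A result of "mixed" would flag exactly the situation warned against above: one page combining Unicode Hebrew with PUA-encoded text that only a specific legacy font can display.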
Re: Unicode HTML, download
On 21/11/2004 22:23, Philippe Verdy wrote:

> From: Doug Ewell [EMAIL PROTECTED]
>
>> Cryptically naming these two CSS classes .he and .heb, which provides no indication of which is the Unicode encoding and which is the Latin-1 hack, merely makes a bad suggestion worse.
>
> It was not cryptographic: he was meant for Hebrew (generic, properly Unicode encoded, suitable for any modern Hebrew), and heb for Biblical Hebrew where a legacy encoding may still be needed, in the absence of workable Unicode support for now: ...

A good point, Philippe. Modern and biblical Hebrew are slightly different languages, and in principle may need different encodings. There are still some small holes in Unicode support for biblical Hebrew, most of which will be plugged (in some kind of way) when the current pipeline empties itself. (Sorry for mixing my liquid container metaphors.)

But the current results of displaying biblical Hebrew in browsers, at least on Windows, are already much better with Unicode than with the legacy encoding, because at least IE6 converts all legacy encoded combining marks into spacing marks. Think what French would look like if every accent were spacing, and then think much worse for Hebrew because almost every base character has one or more combining marks.

-- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/
Re: Ezra
On 21/11/2004 23:02, Edward H. Trager wrote:

> ... Are you saying the difference in names is SIL Ezra vs. Ezra SIL? That's too confusing!

Confusing, but true.

> When I gave the URL, I checked that it was referring to an OpenType font -- and since it was OpenType, I assumed that it was the newer version with a Unicode CMAP. If SIL still has links to legacy non-Unicode versions of fonts which now also have Unicode versions, then they should make this really clear to people. My apologies if I provided the wrong URL.

No, you provided the right URL, but the wrong font name. For reference:

For the Unicode font Ezra SIL, go to http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=EzraSIL_Home. Do use this for web pages, but be aware that Hebrew accents may not display properly on some systems.

For the legacy font SIL Ezra, go to http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=SILEzra. Don't use this for web pages because it doesn't work with IE6. The situation is clearly explained at this URL.

-- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/
Re: Unicode HTML, download
At 11:10 AM 11/21/2004, Doug Ewell wrote:

> Actually, of course, the only way to *guarantee* that readers will see the right glyphs is to chuck HTML altogether and create a PDF file.

And that's a task that needs to be approached with some care as well. The UTC and WG2 constantly get PDF documents with all the interesting glyphs trashed in them. For the code charts, I have long given up on embedding fonts and am using a two-step process of creating a PS file and using Distiller. For the PS driver I select "convert TT fonts to outline" or a similar setting, which extracts and embeds the specific outline information. I disable all embedding. That makes for poorer font quality at small magnification, but absolutely guarantees that what I put together is what people see. So far, that has worked well for the purpose.

A./
Re: Unicode HTML, download
Peter Kirk peterkirk at qaya dot org wrote:

> A good point, Philippe. Modern and biblical Hebrew are slightly different languages, and in principle may need different encodings.

English and Russian and Chinese and Hebrew are *very* different languages, and that still does not justify the confusion of using different encodings for each within the same document.

-Doug Ewell Fullerton, California http://users.adelphia.net/~dewell/
Re: Unicode HTML, download
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:

> But your advice, Doug, still won't work when multiple fonts in the font-family style use distinct encodings: Mixing SIL Ezra with Arial, or similar Unicode encoded fonts will never produce the intended fallbacks if users don't have SIL Ezra effectively installed and selectable in their browser environment.

Don't use multiple fonts in the same font-family style that use different encodings. That way lies madness.

> It was not cryptographic: he was meant for Hebrew (generic, properly Unicode encoded, suitable for any modern Hebrew), and heb for Biblical Hebrew where a legacy encoding may still be needed, in the absence of workable Unicode support for now: this won't be the same language however, so a change of encoding may be justified. I was not advocating for mixing encodings within the same text for the same language...

Don't mix encodings within the same text REGARDLESS of the languages involved. If Unicode support (meaning font and rendering-engine support) is inadequate for one of the languages, then the same non-Unicode encoding should be used for the whole document.

Documents that used different 8-bit encodings for French and Russian, or French and Hebrew, or whatever, were central to the ISO 2022-based chaos of the 1980s. Rendering these properly was difficult and painful. Let's not start recommending that path again.

I do see your logic in choosing "he" and "heb", but "heb" looks like it could also stand for just Hebrew. In fact, "he" and "heb" are actually the ISO 639 alpha-2 and alpha-3 codes, respectively, for Hebrew, with no difference in meaning. Class names (or other identifiers) should not be so short that they become, well, cryptic. "hebrew" and "biblical" are possible class names that might be more easily recognized.

-Doug Ewell Fullerton, California http://users.adelphia.net/~dewell/
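The trouble with mixing 8-bit encodings in one document can be seen with a single byte. The Python sketch below is an illustration, not something posted in the thread: it decodes the same byte under ISO-8859-1 and ISO-8859-8 using Python's standard codecs. The helper name decode_variants is hypothetical.

```python
def decode_variants(data: bytes, encodings: list[str]) -> dict[str, str]:
    """Decode the same bytes under each named encoding."""
    return {name: data.decode(name) for name in encodings}


# 0xE0 is a-grave in ISO-8859-1 but the Hebrew letter alef in ISO-8859-8.
variants = decode_variants(b"\xe0", ["iso-8859-1", "iso-8859-8"])
for name, text in variants.items():
    print(f"{name}: U+{ord(text):04X}")
# iso-8859-1: U+00E0
# iso-8859-8: U+05D0
```

Nothing in the byte itself says which reading is intended, so a document that changes encoding partway through depends on every renderer knowing exactly where each switch happens. Using one encoding for the whole document removes that dependency.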