Re: [svg-developers] Unicode and SVG
I think you are confused about UTF-8. Declaring ISO-8859-1 text to be UTF-8 doesn't make it UTF-8. If you want it to be UTF-8, you need to convert it to that.

Terry

On Oct 21, 2010, at 11:53 PM, JC Ahangama ahang...@gmail.com wrote:

I believe I know what is going on. The treatment of ISO-8859-1 set by Unicode is the culprit, at least in the Windows machines. [...]

- To unsubscribe send a message to: svg-developers-unsubscr...@yahoogroups.com -or- visit http://groups.yahoo.com/group/svg-developers and click edit my membership

Yahoo! Groups Links
* To visit your group on the web, go to: http://groups.yahoo.com/group/svg-developers/
* Your email settings: Individual Email | Traditional
* To change settings online go to: http://groups.yahoo.com/group/svg-developers/join (Yahoo! ID required)
* To change settings via email: svg-developers-dig...@yahoogroups.com svg-developers-fullfeatu...@yahoogroups.com
* To unsubscribe from this group, send an email to: svg-developers-unsubscr...@yahoogroups.com
* Your use of Yahoo! Groups is subject to: http://docs.yahoo.com/info/terms/
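To make the difference between declaring and converting concrete, here is a minimal Python sketch. The sample letters come from the test pages in this thread; the helper names `is_valid_utf8` and `latin1_to_utf8` are made up for illustration and are not from any message here.

```python
# A few of the letters used in the test pages in this thread.
text = "ðþæéúíóá"

latin1_bytes = text.encode("iso-8859-1")  # one byte per letter
utf8_bytes = text.encode("utf-8")         # two bytes per letter for these characters


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_to_utf8(data: bytes) -> bytes:
    """Convert ISO-8859-1 bytes to UTF-8 bytes (hypothetical helper)."""
    return data.decode("iso-8859-1").encode("utf-8")


# Same letters, different byte sequences.
print(len(latin1_bytes), len(utf8_bytes))
# Relabelling alone does not help: the Latin-1 bytes are not valid UTF-8.
print(is_valid_utf8(latin1_bytes))
# After an actual conversion the bytes are valid UTF-8.
print(is_valid_utf8(latin1_to_utf8(latin1_bytes)))
```

The point of the sketch is only that the charset label and the bytes on disk are two separate things; changing the label leaves the bytes untouched.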
Re: [svg-developers] Unicode and SVG
I agree, sort of. The question is still why US-ASCII letters show inside an HTML file declared as charset utf-8 while letters like ð, þ, á show as glyph not found. I did not *convert* the US-ASCII letters. You will understand the problem only if you open the attached HTML files in 3 tabs and compare. I still believe that somebody forgot something somewhere about 2004.

Thanks.
JC

On Fri, Oct 22, 2010 at 5:29 AM, Terry Riegel rie...@clearimageonline.com wrote:

I think you are confused about UTF-8. Declaring ISO-8859-1 text to be UTF-8 doesn't make it UTF-8. If you want it to be UTF-8, you need to convert it to that.

Terry
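One way to explore the question above is to decode the same bytes under each declared charset. The following is a rough Python sketch under the assumption that a reader substitutes U+FFFD for byte sequences that are invalid in the declared encoding; `decode_as_declared` is a hypothetical helper, not a browser API.

```python
def decode_as_declared(data: bytes, declared: str) -> str:
    """Decode bytes under the declared charset, replacing invalid sequences with U+FFFD."""
    return data.decode(declared, errors="replace")


# Bytes as they would be stored in a file saved as ISO-8859-1.
ascii_bytes = "abcdefghijklmnopqrstuvwxyz".encode("iso-8859-1")
latin_bytes = "ðþá".encode("iso-8859-1")

# ASCII letters have the same byte values in ISO-8859-1 and UTF-8,
# so they survive a wrong utf-8 label without any conversion.
ascii_seen = decode_as_declared(ascii_bytes, "utf-8")

# The bytes for ð, þ, á are above 0x7F and do not form valid UTF-8 sequences here.
latin_seen = decode_as_declared(latin_bytes, "utf-8")

# With the matching label the same bytes decode to the intended letters.
latin_ok = decode_as_declared(latin_bytes, "iso-8859-1")

print(ascii_seen == "abcdefghijklmnopqrstuvwxyz")
print([f"U+{ord(c):04X}" for c in latin_seen])
print(latin_ok == "ðþá")
```

Under that assumption, nothing needs to be "forgotten" for the ASCII letters to show: they simply occupy the same byte values in both encodings.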
Re: [svg-developers] Unicode and SVG
US-ASCII will always be US-ASCII in both encodings. UTF-8 is what you want, so convert any non-UTF-8 text to UTF-8 and you'll be fine.

Terry

Sent from my iPhone

On Oct 22, 2010, at 12:44 PM, JC Ahangama ahang...@gmail.com wrote:

I agree, sort of. The question is still why US-ASCII letters show inside an HTML file declared as charset utf-8 while letters like ð, þ, á show as glyph not found. [...]
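A conversion along the lines suggested above might look like the following Python sketch. It assumes the source file really is ISO-8859-1; the function name `convert_html_to_utf8`, the file names, and the shortened sample markup are all invented for illustration.

```python
import re
import tempfile
from pathlib import Path


def convert_html_to_utf8(src: Path, dst: Path) -> None:
    """Re-encode an ISO-8859-1 HTML file as UTF-8 and update its charset label."""
    # Interpret the stored bytes as ISO-8859-1 (assumed to be the true encoding).
    text = src.read_bytes().decode("iso-8859-1")
    # Make the declaration match the bytes that will be written.
    text = re.sub(r"charset\s*=\s*iso-8859-1", "charset=utf-8", text, flags=re.IGNORECASE)
    dst.write_bytes(text.encode("utf-8"))


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "latin1.html"
    dst = Path(tmp) / "utf8.html"
    sample = (
        '<meta http-equiv="Content-Type" content="text/html; Charset=iso-8859-1">\n'
        "<span>ðþá</span>\n"
    )
    src.write_bytes(sample.encode("iso-8859-1"))
    convert_html_to_utf8(src, dst)
    converted = dst.read_bytes()

print(converted.decode("utf-8"))
```

Most text editors offer the same operation as a "save with encoding" option; the sketch just spells out the two steps, re-encoding the bytes and fixing the label.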
[svg-developers] Unicode and SVG
Hi folks,

I'm currently teaching a course with a lot of international students, and a few have complained about not being able to get special characters (such as those with umlauts and the like) to display well in SVG.

I haven't played with the issue much, though we've seen fairly sizable variation in support for various font-families as a function of:

* operating systems
* browsers
* the fonts chosen
* whether or not the fonts are installed on the system (and how the browsers react when they aren't)

even when the font-families are intentionally generic. Some of the oddities observed in the screen shot comparisons at http://www.w3.org/Graphics/SVG/IG/resources/svgprimer.html#text from 4-5 years ago, alas, still persist, even with so many more browsers now competing.

So my questions:

1. Any generic advice you might have about how best to make non-English and especially Unicode character sets display consistently across browsers in SVG?
2. Do you know of any nice essays on the subject that I could point folks toward?

cheers
David
[svg-developers] Unicode and SVG
I believe I know what is going on. The treatment of the ISO-8859-1 set by Unicode is the culprit, at least on Windows machines. Please check the three versions of an HTML file for the same text given at the bottom of the page.

Characters outside ASCII that are still within ISO-8859-1 (codepoints 128 thru 255) are not included in the Unicode repertoire (as the last sample illustrates). The HTML pages do not declare a font, and therefore use the Last Resort font of the system, which demonstrably *has* the letters that the UTF-8 set thinks are missing.

If you are working with people that want to use Indic, it is best that they transliterate their languages to ISO-8859-1 and display them by means of orthographic fonts. Here are two web sites that illustrate it (the language is Sinhala):

http://www.ahangama.com/ -- My Wordpress blog has both English and Sinhala
http://www.lovatasinhala.com/ -- has only *one* graphic, that of the lion.

Regards,
JC

1. No character set declaration. Shows the text correctly because ISO-8859-1 is the default charset.
==
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title>ISO-8859-1 but no character set declared</title>
</head>
<body>
ASCII lc:<br>
<span style="font-size:20px;letter-spacing:8px;">
abcdefghijklmnopqrstuvwxyz
</span><br><br>
Some non-English letters:<br>
<span style="font-size:20px;letter-spacing:8px;">
ðþææéúíóáðçøçµëûüïöäÐçôçñ
</span>
</body>
</html>
==

2. Character set declared as iso-8859-1. The text shows correctly.
==
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title>ISO-8859-1 charset explicitly declared</title>
<meta http-equiv="Content-Type" content="text/html; Charset=iso-8859-1">
</head>
<body>
ASCII lc:<br>
<span style="font-size:20px;letter-spacing:8px;">
abcdefghijklmnopqrstuvwxyz
</span><br><br>
Some non-English letters:<br>
<span style="font-size:20px;letter-spacing:8px;">
ðþææéúíóáðçøçµëûüïöäÐçôçñ
</span>
</body>
</html>
==

3. Character set declared as UTF-8. No European characters!
==
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title>ISO-8859-1 charset explicitly declared</title>
<meta http-equiv="Content-Type" content="text/html; Charset=utf-8">
</head>
<body>
ASCII lc:<br>
<span style="font-size:20px;letter-spacing:8px;">
abcdefghijklmnopqrstuvwxyz
</span><br><br>
Some non-English letters:<br>
<span style="font-size:20px;letter-spacing:8px;">
ðþææéúíóáðçøçµëûüïöäÐçôçñ
</span>
</body>
</html>
===
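The statement about codepoints 128 thru 255 can be checked directly against the Unicode character database. This is a small Python sketch using the standard `unicodedata` module; the variable names are illustrative only, and it says nothing about fonts or about any particular browser.

```python
import unicodedata

# Check whether each ISO-8859-1 byte value decodes to the Unicode
# code point with the same number (0 through 255).
same_numbers = all(
    bytes([value]).decode("iso-8859-1") == chr(value) for value in range(256)
)

# Look up the Unicode names of a few letters from the third sample.
names = {letter: unicodedata.name(letter) for letter in "ðþæ"}

print(same_numbers)
for letter, name in names.items():
    print(f"U+{ord(letter):04X} {name}")
```

If the lookups succeed, the letters are part of the Unicode repertoire, which would point back at the stored bytes rather than at missing characters as the reason the third sample fails.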