Re: [PHP] actually the &egrave; not the ampersand
On Thu, 2005-10-13 at 09:25 -0400, John Nichel wrote:
> jonathan wrote:
> > Do you then have to do the reverse operation to get it back for rendering? Since it was erroring on me during DOM creation, I feel like I'm working around it by putting it into a format it likes, but then on display via XSL transformation I will have to convert it back. Or am I missing something?
>
> I'm only on the sending end; I don't know if the people I send the documents to have to convert it back. Don't quote me on this, but if you're going to display the information in a web browser, *I think* it will display the decimal value properly.

Don't worry about being quoted on that -- it most definitely will. Decimal numeric character references are very common in HTML, especially for writing characters that are outside ASCII and have no commonly used named entity (such as CJK Kanji, mathematical operators or Cyrillic glyphs) without worrying about the transfer charset.

Personally though, I usually convert into proper UTF-8 instead of using entities. It takes a *lot* less space, provided that the web server is configured not to declare the document as ISO-8859-1 or some other charset.
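The UTF-8 route described above could look roughly like the following. This is only a sketch under assumptions: it targets a current PHP (7 or later, not the 5.0.3 mentioned in the thread), the helper name entitiesToUtf8() is made up for illustration, and it assumes the input uses only HTML 4.01 entity names.

```php
<?php
// Sketch: turn named HTML entities into literal UTF-8 characters so an XML
// parser never meets an undeclared entity such as &egrave;.
// entitiesToUtf8() is a hypothetical helper name, not a PHP built-in.
function entitiesToUtf8(string $text): string
{
    // Decode every HTML 4.01 entity (named or numeric) into UTF-8 ...
    $decoded = html_entity_decode($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');
    // ... then re-escape only the characters that XML itself reserves.
    return htmlspecialchars($decoded, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

$name = 'farm lettuces with cr&egrave;me fra&icirc;che &amp; radish';
echo '<item_name>' . entitiesToUtf8($name) . '</item_name>', "\n";
```

The result contains the accented letters as raw UTF-8 bytes, so the document has to be declared and served as UTF-8 for them to display correctly.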
Personally, I use a seven-liner Perl script to convert the official HTML entity DTDs into a PHP include file:

    print("<?php\n\$htmlentities = array(\n");
    while(<>) {
        if(/<!ENTITY\s+(\w+)\s+CDATA\s+\"&\#(\d+);\"/) {
            print "\t\"$1\" => $2,\n";
        }
    }
    print("\t);\n?>\n");

The result is as follows:

    <?php
    $htmlentities = array(
        "nbsp" => 160, "iexcl" => 161, "cent" => 162, "pound" => 163,
        "curren" => 164, "yen" => 165, "brvbar" => 166, "sect" => 167,
        "uml" => 168, "copy" => 169, "ordf" => 170, "laquo" => 171,
        "not" => 172, "shy" => 173, "reg" => 174, "macr" => 175,
        "deg" => 176, "plusmn" => 177, "sup2" => 178, "sup3" => 179,
        "acute" => 180, "micro" => 181, "para" => 182, "middot" => 183,
        "cedil" => 184, "sup1" => 185, "ordm" => 186, "raquo" => 187,
        "frac14" => 188, "frac12" => 189, "frac34" => 190, "iquest" => 191,
        "Agrave" => 192, "Aacute" => 193, "Acirc" => 194, "Atilde" => 195,
        "Auml" => 196, "Aring" => 197, "AElig" => 198, "Ccedil" => 199,
        "Egrave" => 200, "Eacute" => 201, "Ecirc" => 202, "Euml" => 203,
        "Igrave" => 204, "Iacute" => 205, "Icirc" => 206, "Iuml" => 207,
        "ETH" => 208, "Ntilde" => 209, "Ograve" => 210, "Oacute" => 211,
        "Ocirc" => 212, "Otilde" => 213, "Ouml" => 214, "times" => 215,
        "Oslash" => 216, "Ugrave" => 217, "Uacute" => 218, "Ucirc" => 219,
        "Uuml" => 220, "Yacute" => 221, "THORN" => 222, "szlig" => 223,
        "agrave" => 224, "aacute" => 225, "acirc" => 226, "atilde" => 227,
        "auml" => 228, "aring" => 229, "aelig" => 230, "ccedil" => 231,
        "egrave" => 232, "eacute" => 233, "ecirc" => 234, "euml" => 235,
        "igrave" => 236, "iacute" => 237, "icirc" => 238, "iuml" => 239,
        "eth" => 240, "ntilde" => 241, "ograve" => 242, "oacute" => 243,
        "ocirc" => 244, "otilde" => 245, "ouml" => 246, "divide" => 247,
        "oslash" => 248, "ugrave" => 249, "uacute" => 250, "ucirc" => 251,
        "uuml" => 252, "yacute" => 253, "thorn" => 254, "yuml" => 255,
        "fnof" => 402,
        "Alpha" => 913, "Beta" => 914, "Gamma" => 915, "Delta" => 916,
        "Epsilon" => 917, "Zeta" => 918, "Eta" => 919, "Theta" => 920,
        "Iota" => 921, "Kappa" => 922, "Lambda" => 923, "Mu" => 924,
        "Nu" => 925, "Xi" => 926, "Omicron" => 927, "Pi" => 928,
        "Rho" => 929, "Sigma" => 931, "Tau" => 932, "Upsilon" => 933,
        "Phi" => 934, "Chi" => 935, "Psi" => 936, "Omega" => 937,
        "alpha" => 945, "beta" => 946, "gamma" => 947, "delta" => 948,
        "epsilon" => 949, "zeta" => 950, "eta" => 951, "theta" => 952,
        "iota" => 953, "kappa" => 954, "lambda" => 955, "mu" => 956,
        "nu" => 957, "xi" => 958, "omicron" => 959, "pi" => 960,
        "rho" => 961, "sigmaf" => 962, "sigma" => 963, "tau" => 964,
        "upsilon" => 965, "phi" => 966, "chi" => 967, "psi" => 968,
        "omega" => 969, "thetasym" => 977, "upsih" => 978, "piv" => 982,
        "bull" => 8226, "hellip" => 8230, "prime" => 8242, "Prime" => 8243,
        "oline" => 8254, "frasl" => 8260, "weierp" => 8472, "image" => 8465,
        "real" => 8476, "trade" => 8482, "alefsym" => 8501,
        "larr" => 8592, "uarr" => 8593, "rarr" => 8594, "darr" => 8595,
        "harr" => 8596, "crarr" => 8629, "lArr" => 8656, "uArr" => 8657,
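One way such a name-to-code-point table might be put to use is to rewrite named references as decimal ones before the text reaches the DOM. The sketch below is an assumption about usage, not something shown in the thread: the helper name namedToDecimal() is hypothetical, only four rows of the table are repeated to keep it short, and it is written for a current PHP (7 or later).

```php
<?php
// A few rows of the generated table; the real include file holds them all.
$htmlentities = array(
    "nbsp"   => 160,
    "egrave" => 232,
    "icirc"  => 238,
    "hellip" => 8230,
);

// Hypothetical helper: replace &name; with &#NNN; whenever the name is known.
function namedToDecimal(string $text, array $table): string
{
    return preg_replace_callback(
        '/&([A-Za-z][A-Za-z0-9]*);/',
        function (array $m) use ($table): string {
            // Names missing from the table (amp, lt, gt, ...) are left untouched.
            return isset($table[$m[1]]) ? '&#' . $table[$m[1]] . ';' : $m[0];
        },
        $text
    );
}

echo namedToDecimal('cr&egrave;me fra&icirc;che &amp; more&hellip;', $htmlentities), "\n";
// prints: cr&#232;me fra&#238;che &amp; more&#8230;
```

Entity names are case-sensitive (Egrave and egrave are different characters), which is why the lookup uses the name exactly as matched.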
Re: [PHP] actually the &egrave; not the ampersand
jonathan wrote:
> Do you then have to do the reverse operation to get it back for rendering? Since it was erroring on me during DOM creation, I feel like I'm working around it by putting it into a format it likes, but then on display via XSL transformation I will have to convert it back. Or am I missing something?

I'm only on the sending end; I don't know if the people I send the documents to have to convert it back. Don't quote me on this, but if you're going to display the information in a web browser, *I think* it will display the decimal value properly.

--
John C. Nichel
ÜberGeek
KegWorks.com
716.856.9675
[EMAIL PROTECTED]

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
[PHP] actually the &egrave; not the ampersand
So, the problem isn't the ampersand but rather the &egrave; in the following:

    <item_name>farm lettuces with reed avocado, cr&egrave;me fra&icirc;che, radish and cilantro</item_name>

I'm not sure how PHP / DOM handles these non-standard other entities. How would / could I escape this? Do I need to convert it to something else to make DOM happy (say grave) and then convert it back when outputted?

It seems like there might be a way to use createEntityReference, but searching through Google shows no examples. arg.

I'm using PHP 5.0.3.

-jonathan
Re: [PHP] actually the &egrave; not the ampersand
jonathan wrote:
> So, the problem isn't the ampersand but rather the &egrave; in the following:
>
>     <item_name>farm lettuces with reed avocado, cr&egrave;me fra&icirc;che, radish and cilantro</item_name>
>
> I'm not sure how PHP / DOM handles these non-standard other entities. How would / could I escape this? Do I need to convert it to something else to make DOM happy (say grave) and then convert it back when outputted?
>
> It seems like there might be a way to use createEntityReference, but searching through Google shows no examples. arg.

I have to do this to create XML documents for our company, and built this function for it. You're welcome to use/modify it to fit your needs.

    function convertString ( $string ) {
        // Named entities as produced by htmlentities(), one pattern each.
        $find_array = array (
            "/&quot;/", "/&amp;/", "/&lt;/", "/&gt;/", "/&nbsp;/", "/&iexcl;/", "/&cent;/", "/&pound;/",
            "/&curren;/", "/&yen;/", "/&brvbar;/", "/&sect;/", "/&uml;/", "/&copy;/", "/&ordf;/", "/&laquo;/",
            "/&not;/", "/&shy;/", "/&reg;/", "/&macr;/", "/&deg;/", "/&plusmn;/", "/&sup2;/", "/&sup3;/",
            "/&acute;/", "/&micro;/", "/&para;/", "/&middot;/", "/&cedil;/", "/&sup1;/", "/&ordm;/", "/&raquo;/",
            "/&frac14;/", "/&frac12;/", "/&frac34;/", "/&iquest;/", "/&Agrave;/", "/&Aacute;/", "/&Acirc;/", "/&Atilde;/",
            "/&Auml;/", "/&Aring;/", "/&AElig;/", "/&Ccedil;/", "/&Egrave;/", "/&Eacute;/", "/&Ecirc;/", "/&Euml;/",
            "/&Igrave;/", "/&Iacute;/", "/&Icirc;/", "/&Iuml;/", "/&ETH;/", "/&Ntilde;/", "/&Ograve;/", "/&Oacute;/",
            "/&Ocirc;/", "/&Otilde;/", "/&Ouml;/", "/&times;/", "/&Oslash;/", "/&Ugrave;/", "/&Uacute;/", "/&Ucirc;/",
            "/&Uuml;/", "/&Yacute;/", "/&THORN;/", "/&szlig;/", "/&agrave;/", "/&aacute;/", "/&acirc;/", "/&atilde;/",
            "/&auml;/", "/&aring;/", "/&aelig;/", "/&ccedil;/", "/&egrave;/", "/&eacute;/", "/&ecirc;/", "/&euml;/",
            "/&igrave;/", "/&iacute;/", "/&icirc;/", "/&iuml;/", "/&eth;/", "/&ntilde;/", "/&ograve;/", "/&oacute;/",
            "/&ocirc;/", "/&otilde;/", "/&ouml;/", "/&divide;/", "/&oslash;/", "/&ugrave;/", "/&uacute;/", "/&ucirc;/",
            "/&uuml;/", "/&yacute;/", "/&thorn;/", "/&yuml;/"
        );
        // Decimal character references, in the same order as $find_array.
        $replace_array = array (
            '&#034;', '&#038;', '&#060;', '&#062;', '&#160;', '&#161;', '&#162;', '&#163;',
            '&#164;', '&#165;', '&#166;', '&#167;', '&#168;', '&#169;', '&#170;', '&#171;',
            '&#172;', '&#173;', '&#174;', '&#175;', '&#176;', '&#177;', '&#178;', '&#179;',
            '&#180;', '&#181;', '&#182;', '&#183;', '&#184;', '&#185;', '&#186;', '&#187;',
            '&#188;', '&#189;', '&#190;', '&#191;', '&#192;', '&#193;', '&#194;', '&#195;',
            '&#196;', '&#197;', '&#198;', '&#199;', '&#200;', '&#201;', '&#202;', '&#203;',
            '&#204;', '&#205;', '&#206;', '&#207;', '&#208;', '&#209;', '&#210;', '&#211;',
            '&#212;', '&#213;', '&#214;', '&#215;', '&#216;', '&#217;', '&#218;', '&#219;',
Re: [PHP] actually the &egrave; not the ampersand
Do you then have to do the reverse operation to get it back for rendering? Since it was erroring on me during DOM creation, I feel like I'm working around it by putting it into a format it likes, but then on display via XSL transformation I will have to convert it back. Or am I missing something?

Thanks again for your help.

jonathan

On Oct 12, 2005, at 2:17 PM, John Nichel wrote:
> jonathan wrote:
> > So, the problem isn't the ampersand but rather the &egrave; in the following:
> >
> >     <item_name>farm lettuces with reed avocado, cr&egrave;me fra&icirc;che, radish and cilantro</item_name>
> >
> > I'm not sure how PHP / DOM handles these non-standard other entities. How would / could I escape this? Do I need to convert it to something else to make DOM happy (say grave) and then convert it back when outputted?
> >
> > It seems like there might be a way to use createEntityReference, but searching through Google shows no examples. arg.
>
> I have to do this to create XML documents for our company, and built this function for it. You're welcome to use/modify it to fit your needs.
>
>     function convertString ( $string ) {
>         // Named entities as produced by htmlentities(), one pattern each.
>         $find_array = array (
>             "/&quot;/", "/&amp;/", "/&lt;/", "/&gt;/", "/&nbsp;/", "/&iexcl;/", "/&cent;/", "/&pound;/",
>             "/&curren;/", "/&yen;/", "/&brvbar;/", "/&sect;/", "/&uml;/", "/&copy;/", "/&ordf;/", "/&laquo;/",
>             "/&not;/", "/&shy;/", "/&reg;/", "/&macr;/", "/&deg;/", "/&plusmn;/", "/&sup2;/", "/&sup3;/",
>             "/&acute;/", "/&micro;/", "/&para;/", "/&middot;/", "/&cedil;/", "/&sup1;/", "/&ordm;/", "/&raquo;/",
>             "/&frac14;/", "/&frac12;/", "/&frac34;/", "/&iquest;/", "/&Agrave;/", "/&Aacute;/", "/&Acirc;/", "/&Atilde;/",
>             "/&Auml;/", "/&Aring;/", "/&AElig;/", "/&Ccedil;/", "/&Egrave;/", "/&Eacute;/", "/&Ecirc;/", "/&Euml;/",
>             "/&Igrave;/", "/&Iacute;/", "/&Icirc;/", "/&Iuml;/", "/&ETH;/", "/&Ntilde;/", "/&Ograve;/", "/&Oacute;/",
>             "/&Ocirc;/", "/&Otilde;/", "/&Ouml;/", "/&times;/", "/&Oslash;/", "/&Ugrave;/", "/&Uacute;/", "/&Ucirc;/",
>             "/&Uuml;/", "/&Yacute;/", "/&THORN;/", "/&szlig;/", "/&agrave;/", "/&aacute;/", "/&acirc;/", "/&atilde;/",
>             "/&auml;/", "/&aring;/", "/&aelig;/", "/&ccedil;/", "/&egrave;/", "/&eacute;/", "/&ecirc;/", "/&euml;/",
>             "/&igrave;/", "/&iacute;/", "/&icirc;/", "/&iuml;/", "/&eth;/", "/&ntilde;/", "/&ograve;/", "/&oacute;/",
>             "/&ocirc;/", "/&otilde;/", "/&ouml;/", "/&divide;/", "/&oslash;/", "/&ugrave;/", "/&uacute;/", "/&ucirc;/",
>             "/&uuml;/", "/&yacute;/", "/&thorn;/", "/&yuml;/"
>         );
>         // Decimal character references, in the same order as $find_array.
>         $replace_array = array (
>             '&#034;', '&#038;', '&#060;', '&#062;', '&#160;', '&#161;', '&#162;', '&#163;',
>             '&#164;', '&#165;', '&#166;', '&#167;', '&#168;', '&#169;', '&#170;', '&#171;',
>             '&#172;', '&#173;', '&#174;', '&#175;', '&#176;', '&#177;', '&#178;', '&#179;',
>             '&#180;', '&#181;', '&#182;', '&#183;', '&#184;', '&#185;', '&#186;', '&#187;',
>             '&#188;', '&#189;', '&#190;', '&#191;', '&#192;', '&#193;', '&#194;', '&#195;',
>             '&#196;', '&#197;', '&#198;', '&#199;', '&#200;', '&#201;', '&#202;', '&#203;',
>             '&#204;', '&#205;', '&#206;', '&#207;', '&#208;', '&#209;', '&#210;', '&#211;',
>             '&#212;', '&#213;', '&#214;', '&#215;', '&#216;', '&#217;', '&#218;', '&#219;',
>             '&#220;', '&#221;', '&#222;', '&#223;', '&#224;', '&#225;', '&#226;', '&#227;',
>             '&#228;', '&#229;', '&#230;', '&#231;', '&#232;', '&#233;', '&#234;', '&#235;',
>             '&#236;', '&#237;', '&#238;', '&#239;', '&#240;', '&#241;', '&#242;', '&#243;',
>             '&#244;', '&#245;', '&#246;', '&#247;', '&#248;', '&#249;', '&#250;', '&#251;',
>             '&#252;', '&#253;', '&#254;', '&#255;'
>         );
>         // Drop line breaks and tags, encode to named entities, then swap names for numbers.
>         $string = htmlentities ( strip_tags ( preg_replace ( "/\n|\r|\r\n/", "", $string ) ), ENT_QUOTES );
>         $string = preg_replace ( $find_array, $replace_array, $string );
>         return $string;
>     }
>
> --
> John C. Nichel
> ÜberGeek
> KegWorks.com
> 716.856.9675
> [EMAIL PROTECTED]
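The two hand-typed arrays in convertString could probably be derived from PHP's own entity table instead. The following is a rough sketch of that idea, not John's method: it assumes a current PHP (7 or later), assumes the input is ISO-8859-1 like the Latin-1 range the function covers, leaves out the strip_tags and line-break handling, and uses the made-up names numericEntityMap() and toNumericEntities().

```php
<?php
// Sketch: build the "&name;" => "&#NNN;" map from PHP's built-in table
// instead of typing it by hand. Both function names are hypothetical.
function numericEntityMap(): array
{
    // For ISO-8859-1 the keys are single bytes, the values named entities.
    $table = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES | ENT_HTML401, 'ISO-8859-1');
    $map = array();
    foreach ($table as $char => $entity) {
        $char = (string) $char;
        if (strlen($char) === 1) {
            // In Latin-1 the byte value is the Unicode code point.
            $map[$entity] = sprintf('&#%03d;', ord($char));
        }
    }
    return $map;
}

function toNumericEntities(string $latin1Text): string
{
    // First encode to named entities, then swap each name for its number.
    $named = htmlentities($latin1Text, ENT_QUOTES | ENT_HTML401, 'ISO-8859-1');
    return strtr($named, numericEntityMap());
}

echo toNumericEntities("cr\xE8me & radish"), "\n";
```

strtr() with an array replaces each entity once without rescanning its own output, so the '&' inside a freshly written '&#232;' is not converted a second time.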