Re: [PHP] actually the &egrave; not the ampersand

2005-11-04 Thread Fredrik Tolf
On Thu, 2005-10-13 at 09:25 -0400, John Nichel wrote:
> jonathan wrote:
> > do you then have to do the reverse operation to get it back for
> > rendering. Since it was erroring on me during DOM creation, I feel like
> > I'm going around it to put it into a format it likes but then on
> > display via XSL transformation, I will have to convert it back. Or am I
> > missing something?
>
> I'm only on the sending end; I don't know if the people I send the
> documents to have to convert it back.  Don't quote me on this, but if
> you're going to display the information in a web browser, *I think* it
> will display the decimal value properly.

Don't worry about being quoted on that -- it most definitely will. Decimal
character references (such as &#232; for an e with a grave accent) are
very common in HTML, especially for writing characters that are neither
ASCII nor covered by the usual HTML entities (such as CJK Kanji,
mathematical operators or Cyrillic glyphs) without worrying about the
transfer charset. Personally though, I usually convert into proper UTF-8
instead of using entities. It takes a *lot* less space, provided that
the web server is properly configured not to say that the document is
ISO-8859-1 or something.
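
For what it's worth, the UTF-8 route might look roughly like the
following in PHP. This is only a sketch under a few assumptions: the
helper name entitiesToUtf8 is made up for illustration, the code needs
PHP 7 or later with the DOM extension, and the incoming text is assumed
to use HTML 4.01 entity names.

```php
<?php
// Hypothetical helper (not from the thread): turn HTML entities such as
// &egrave; into the real characters, encoded as UTF-8.
function entitiesToUtf8(string $text): string
{
    return html_entity_decode($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');
}

$doc  = new DOMDocument('1.0', 'UTF-8');
$item = $doc->createElement('item_name');

// createTextNode() escapes &, < and > on output by itself, so the decoded
// text can be handed over without any manual escaping.
$text = entitiesToUtf8('farm lettuces, cr&egrave;me fra&icirc;che &amp; radish');
$item->appendChild($doc->createTextNode($text));
$doc->appendChild($item);

echo $doc->saveXML();
```

Since the document then carries the characters themselves, nothing has
to be converted back before an XSL transformation.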

For the entity names, I use a seven-line Perl script to convert the
official HTML entity DTDs into a PHP include file:

# Reads the entity DTD files named on the command line (or STDIN).
print("<?php\n\$htmlentities = array(\n");
while(<>) {
    if(/<!ENTITY\s+(\w+)\s+CDATA\s+\"\&\#(\d+);\"/) {
        print "\t\"$1\" => $2,\n";
    }
}
print("\t);\n?>\n");

The result is as follows:
<?php
$htmlentities = array(
    "nbsp" => 160,
    "iexcl" => 161,
    "cent" => 162,
    "pound" => 163,
    "curren" => 164,
    "yen" => 165,
    "brvbar" => 166,
    "sect" => 167,
    "uml" => 168,
    "copy" => 169,
    "ordf" => 170,
    "laquo" => 171,
    "not" => 172,
    "shy" => 173,
    "reg" => 174,
    "macr" => 175,
    "deg" => 176,
    "plusmn" => 177,
    "sup2" => 178,
    "sup3" => 179,
    "acute" => 180,
    "micro" => 181,
    "para" => 182,
    "middot" => 183,
    "cedil" => 184,
    "sup1" => 185,
    "ordm" => 186,
    "raquo" => 187,
    "frac14" => 188,
    "frac12" => 189,
    "frac34" => 190,
    "iquest" => 191,
    "Agrave" => 192,
    "Aacute" => 193,
    "Acirc" => 194,
    "Atilde" => 195,
    "Auml" => 196,
    "Aring" => 197,
    "AElig" => 198,
    "Ccedil" => 199,
    "Egrave" => 200,
    "Eacute" => 201,
    "Ecirc" => 202,
    "Euml" => 203,
    "Igrave" => 204,
    "Iacute" => 205,
    "Icirc" => 206,
    "Iuml" => 207,
    "ETH" => 208,
    "Ntilde" => 209,
    "Ograve" => 210,
    "Oacute" => 211,
    "Ocirc" => 212,
    "Otilde" => 213,
    "Ouml" => 214,
    "times" => 215,
    "Oslash" => 216,
    "Ugrave" => 217,
    "Uacute" => 218,
    "Ucirc" => 219,
    "Uuml" => 220,
    "Yacute" => 221,
    "THORN" => 222,
    "szlig" => 223,
    "agrave" => 224,
    "aacute" => 225,
    "acirc" => 226,
    "atilde" => 227,
    "auml" => 228,
    "aring" => 229,
    "aelig" => 230,
    "ccedil" => 231,
    "egrave" => 232,
    "eacute" => 233,
    "ecirc" => 234,
    "euml" => 235,
    "igrave" => 236,
    "iacute" => 237,
    "icirc" => 238,
    "iuml" => 239,
    "eth" => 240,
    "ntilde" => 241,
    "ograve" => 242,
    "oacute" => 243,
    "ocirc" => 244,
    "otilde" => 245,
    "ouml" => 246,
    "divide" => 247,
    "oslash" => 248,
    "ugrave" => 249,
    "uacute" => 250,
    "ucirc" => 251,
    "uuml" => 252,
    "yacute" => 253,
    "thorn" => 254,
    "yuml" => 255,
    "fnof" => 402,
    "Alpha" => 913,
    "Beta" => 914,
    "Gamma" => 915,
    "Delta" => 916,
    "Epsilon" => 917,
    "Zeta" => 918,
    "Eta" => 919,
    "Theta" => 920,
    "Iota" => 921,
    "Kappa" => 922,
    "Lambda" => 923,
    "Mu" => 924,
    "Nu" => 925,
    "Xi" => 926,
    "Omicron" => 927,
    "Pi" => 928,
    "Rho" => 929,
    "Sigma" => 931,
    "Tau" => 932,
    "Upsilon" => 933,
    "Phi" => 934,
    "Chi" => 935,
    "Psi" => 936,
    "Omega" => 937,
    "alpha" => 945,
    "beta" => 946,
    "gamma" => 947,
    "delta" => 948,
    "epsilon" => 949,
    "zeta" => 950,
    "eta" => 951,
    "theta" => 952,
    "iota" => 953,
    "kappa" => 954,
    "lambda" => 955,
    "mu" => 956,
    "nu" => 957,
    "xi" => 958,
    "omicron" => 959,
    "pi" => 960,
    "rho" => 961,
    "sigmaf" => 962,
    "sigma" => 963,
    "tau" => 964,
    "upsilon" => 965,
    "phi" => 966,
    "chi" => 967,
    "psi" => 968,
    "omega" => 969,
    "thetasym" => 977,
    "upsih" => 978,
    "piv" => 982,
    "bull" => 8226,
    "hellip" => 8230,
    "prime" => 8242,
    "Prime" => 8243,
    "oline" => 8254,
    "frasl" => 8260,
    "weierp" => 8472,
    "image" => 8465,
    "real" => 8476,
    "trade" => 8482,
    "alefsym" => 8501,
    "larr" => 8592,
    "uarr" => 8593,
    "rarr" => 8594,
    "darr" => 8595,
    "harr" => 8596,
    "crarr" => 8629,
    "lArr" => 8656,
    "uArr" => 8657,
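
A table like this can then drive the conversion from named entities to
decimal references. The following is a minimal sketch and not part of
the script above: namedToNumeric is a made-up name, and the three-entry
array stands in for the full generated include file.

```php
<?php
// Short excerpt of the generated table; the real include file would
// carry every entry.
$htmlentities = array(
    "egrave" => 232,
    "icirc"  => 238,
    "hellip" => 8230,
);

// Hypothetical helper: rewrite &name; as &#NNN; when the name is in the
// table. Names that are not in the table are left untouched.
function namedToNumeric(string $text, array $table): string
{
    return preg_replace_callback(
        '/&([A-Za-z][A-Za-z0-9]*);/',
        function (array $m) use ($table) {
            return isset($table[$m[1]]) ? '&#' . $table[$m[1]] . ';' : $m[0];
        },
        $text
    );
}

echo namedToNumeric('cr&egrave;me fra&icirc;che &amp; radish', $htmlentities), "\n";
// prints: cr&#232;me fra&#238;che &amp; radish
```

Looking names up in an array keeps the replacement to a single pass over
the string, instead of one regular expression per entity.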
  

Re: [PHP] actually the &egrave; not the ampersand

2005-10-13 Thread John Nichel

jonathan wrote:
do you then have to do the reverse operation to get it back for  
rendering. Since it was erroring on me during DOM creation, I feel  like 
I'm going around it to put it into a format it likes but then on  
display via XSL transformation, I will have to convert it back. Or am  I 
missing something?


I'm only on the sending end; I don't know if the people I send the 
documents to have to convert it back.  Don't quote me on this, but if 
you're going to display the information in a web browser, *I think* it 
will display the decimal value properly.


--
John C. Nichel
ÜberGeek
KegWorks.com
716.856.9675
[EMAIL PROTECTED]

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



[PHP] actually the &egrave; not the ampersand

2005-10-12 Thread jonathan
so, the problem isn't the ampersand but rather the &egrave; in the
following:
<item_name>farm lettuces with reed avocado, cr&egrave;me fra&icirc;che, radish and cilantro</item_name>


I'm not sure how PHP / DOM handles these other, non-standard entities.
How would / could I escape this? Do I need to convert it to something
else to make DOM happy (say grave) and then convert it back on
output? It seems like there might be a way to use
createEntityReference, but searching through Google shows no examples.
arg.


I'm using PHP 5.0.3

-jonathan 
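
To make the failure concrete, here is a small sketch of the difference
between the named entity and its decimal form when the string goes
through DOM. The helper name loadsAsXml is made up for illustration, and
the code assumes PHP 7 or later with the DOM extension; it is not meant
as a reproduction of the exact setup described above.

```php
<?php
// Hypothetical helper: report whether a string parses as well-formed XML.
function loadsAsXml(string $xml): bool
{
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok  = $doc->loadXML($xml);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok !== false;
}

// &egrave; and &icirc; are HTML entities. XML itself only predefines
// amp, lt, gt, quot and apos, so these two are undeclared here.
var_dump(loadsAsXml('<item_name>cr&egrave;me fra&icirc;che</item_name>'));

// Decimal character references need no declaration at all.
var_dump(loadsAsXml('<item_name>cr&#232;me fra&#238;che</item_name>'));
```

The first call is expected to report false and the second true, which is
why converting to the decimal form before building the document gets
around the error.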





Re: [PHP] actually the &egrave; not the ampersand

2005-10-12 Thread John Nichel

jonathan wrote:
so, the problem isn't the ampersand but rather the &egrave; in the
following:
<item_name>farm lettuces with reed avocado, cr&egrave;me fra&icirc;che, radish and cilantro</item_name>


I'm not sure how php / DOM handles these non-standard other entities.  
How would / could I escape this? do I need to convert it to something  
else to make DOM happy (say grave) and then convert it back when  
outputted? It seems like there might be a way to use  
createEntityReference but searching through google shows no examples.  
arg.


I have to do this to create XML documents for our company, and built 
this function for it.  You're welcome to use/modify it to fit your needs.


function convertString ( $string ) {
    $find_array = array (
        "/&quot;/",
        "/&amp;/",
        "/&lt;/",
        "/&gt;/",
        "/&nbsp;/",
        "/&iexcl;/",
        "/&cent;/",
        "/&pound;/",
        "/&curren;/",
        "/&yen;/",
        "/&brvbar;/",
        "/&sect;/",
        "/&uml;/",
        "/&copy;/",
        "/&ordf;/",
        "/&laquo;/",
        "/&not;/",
        "/&shy;/",
        "/&reg;/",
        "/&macr;/",
        "/&deg;/",
        "/&plusmn;/",
        "/&sup2;/",
        "/&sup3;/",
        "/&acute;/",
        "/&micro;/",
        "/&para;/",
        "/&middot;/",
        "/&cedil;/",
        "/&sup1;/",
        "/&ordm;/",
        "/&raquo;/",
        "/&frac14;/",
        "/&frac12;/",
        "/&frac34;/",
        "/&iquest;/",
        "/&Agrave;/",
        "/&Aacute;/",
        "/&Acirc;/",
        "/&Atilde;/",
        "/&Auml;/",
        "/&Aring;/",
        "/&AElig;/",
        "/&Ccedil;/",
        "/&Egrave;/",
        "/&Eacute;/",
        "/&Ecirc;/",
        "/&Euml;/",
        "/&Igrave;/",
        "/&Iacute;/",
        "/&Icirc;/",
        "/&Iuml;/",
        "/&ETH;/",
        "/&Ntilde;/",
        "/&Ograve;/",
        "/&Oacute;/",
        "/&Ocirc;/",
        "/&Otilde;/",
        "/&Ouml;/",
        "/&times;/",
        "/&Oslash;/",
        "/&Ugrave;/",
        "/&Uacute;/",
        "/&Ucirc;/",
        "/&Uuml;/",
        "/&Yacute;/",
        "/&THORN;/",
        "/&szlig;/",
        "/&agrave;/",
        "/&aacute;/",
        "/&acirc;/",
        "/&atilde;/",
        "/&auml;/",
        "/&aring;/",
        "/&aelig;/",
        "/&ccedil;/",
        "/&egrave;/",
        "/&eacute;/",
        "/&ecirc;/",
        "/&euml;/",
        "/&igrave;/",
        "/&iacute;/",
        "/&icirc;/",
        "/&iuml;/",
        "/&eth;/",
        "/&ntilde;/",
        "/&ograve;/",
        "/&oacute;/",
        "/&ocirc;/",
        "/&otilde;/",
        "/&ouml;/",
        "/&divide;/",
        "/&oslash;/",
        "/&ugrave;/",
        "/&uacute;/",
        "/&ucirc;/",
        "/&uuml;/",
        "/&yacute;/",
        "/&thorn;/",
        "/&yuml;/"
    );
    $replace_array = array (
        '&#034;',
        '&#038;',
        '&#060;',
        '&#062;',
        '&#160;',
        '&#161;',
        '&#162;',
        '&#163;',
        '&#164;',
        '&#165;',
        '&#166;',
        '&#167;',
        '&#168;',
        '&#169;',
        '&#170;',
        '&#171;',
        '&#172;',
        '&#173;',
        '&#174;',
        '&#175;',
        '&#176;',
        '&#177;',
        '&#178;',
        '&#179;',
        '&#180;',
        '&#181;',
        '&#182;',
        '&#183;',
        '&#184;',
        '&#185;',
        '&#186;',
        '&#187;',
        '&#188;',
        '&#189;',
        '&#190;',
        '&#191;',
        '&#192;',
        '&#193;',
        '&#194;',
        '&#195;',
        '&#196;',
        '&#197;',
        '&#198;',
        '&#199;',
        '&#200;',
        '&#201;',
        '&#202;',
        '&#203;',
        '&#204;',
        '&#205;',
        '&#206;',
        '&#207;',
        '&#208;',
        '&#209;',
        '&#210;',
        '&#211;',
        '&#212;',
        '&#213;',
        '&#214;',
        '&#215;',
        '&#216;',
        '&#217;',
        '&#218;',
        '&#219;',
  

Re: [PHP] actually the &egrave; not the ampersand

2005-10-12 Thread jonathan
do you then have to do the reverse operation to get it back for
rendering? Since it was erroring on me during DOM creation, I feel
like I'm going around it to put it into a format it likes but then on
display via XSL transformation, I will have to convert it back. Or am
I missing something?



thanks again for your help.

jonathan

On Oct 12, 2005, at 2:17 PM, John Nichel wrote:


jonathan wrote:

so, the problem isn't the ampersand but rather the &egrave; in
the following:
<item_name>farm lettuces with reed avocado, cr&egrave;me fra&icirc;che, radish and cilantro</item_name>
I'm not sure how php / DOM handles these non-standard other  
entities.  How would / could I escape this? do I need to convert  
it to something  else to make DOM happy (say grave) and then  
convert it back when  outputted? It seems like there might be a  
way to use  createEntityReference but searching through google  
shows no examples.  arg.




I have to do this to create XML documents for our company, and  
built this function for it.  You're welcome to use/modify it to fit  
your needs.


function convertString ( $string ) {
    // Named entities to look for, in the same order as $replace_array.
    $find_array = array (
        "/&quot;/",
        "/&amp;/",
        "/&lt;/",
        "/&gt;/",
        "/&nbsp;/",
        "/&iexcl;/",
        "/&cent;/",
        "/&pound;/",
        "/&curren;/",
        "/&yen;/",
        "/&brvbar;/",
        "/&sect;/",
        "/&uml;/",
        "/&copy;/",
        "/&ordf;/",
        "/&laquo;/",
        "/&not;/",
        "/&shy;/",
        "/&reg;/",
        "/&macr;/",
        "/&deg;/",
        "/&plusmn;/",
        "/&sup2;/",
        "/&sup3;/",
        "/&acute;/",
        "/&micro;/",
        "/&para;/",
        "/&middot;/",
        "/&cedil;/",
        "/&sup1;/",
        "/&ordm;/",
        "/&raquo;/",
        "/&frac14;/",
        "/&frac12;/",
        "/&frac34;/",
        "/&iquest;/",
        "/&Agrave;/",
        "/&Aacute;/",
        "/&Acirc;/",
        "/&Atilde;/",
        "/&Auml;/",
        "/&Aring;/",
        "/&AElig;/",
        "/&Ccedil;/",
        "/&Egrave;/",
        "/&Eacute;/",
        "/&Ecirc;/",
        "/&Euml;/",
        "/&Igrave;/",
        "/&Iacute;/",
        "/&Icirc;/",
        "/&Iuml;/",
        "/&ETH;/",
        "/&Ntilde;/",
        "/&Ograve;/",
        "/&Oacute;/",
        "/&Ocirc;/",
        "/&Otilde;/",
        "/&Ouml;/",
        "/&times;/",
        "/&Oslash;/",
        "/&Ugrave;/",
        "/&Uacute;/",
        "/&Ucirc;/",
        "/&Uuml;/",
        "/&Yacute;/",
        "/&THORN;/",
        "/&szlig;/",
        "/&agrave;/",
        "/&aacute;/",
        "/&acirc;/",
        "/&atilde;/",
        "/&auml;/",
        "/&aring;/",
        "/&aelig;/",
        "/&ccedil;/",
        "/&egrave;/",
        "/&eacute;/",
        "/&ecirc;/",
        "/&euml;/",
        "/&igrave;/",
        "/&iacute;/",
        "/&icirc;/",
        "/&iuml;/",
        "/&eth;/",
        "/&ntilde;/",
        "/&ograve;/",
        "/&oacute;/",
        "/&ocirc;/",
        "/&otilde;/",
        "/&ouml;/",
        "/&divide;/",
        "/&oslash;/",
        "/&ugrave;/",
        "/&uacute;/",
        "/&ucirc;/",
        "/&uuml;/",
        "/&yacute;/",
        "/&thorn;/",
        "/&yuml;/"
    );
    // Decimal character references, one per pattern above.
    $replace_array = array (
        '&#034;',
        '&#038;',
        '&#060;',
        '&#062;',
        '&#160;',
        '&#161;',
        '&#162;',
        '&#163;',
        '&#164;',
        '&#165;',
        '&#166;',
        '&#167;',
        '&#168;',
        '&#169;',
        '&#170;',
        '&#171;',
        '&#172;',
        '&#173;',
        '&#174;',
        '&#175;',
        '&#176;',
        '&#177;',
        '&#178;',
        '&#179;',
        '&#180;',
        '&#181;',
        '&#182;',
        '&#183;',
        '&#184;',
        '&#185;',
        '&#186;',
        '&#187;',
        '&#188;',
        '&#189;',
        '&#190;',
        '&#191;',
        '&#192;',
        '&#193;',
        '&#194;',
        '&#195;',
        '&#196;',
        '&#197;',
        '&#198;',
        '&#199;',
        '&#200;',
        '&#201;',
        '&#202;',
        '&#203;',
        '&#204;',
        '&#205;',
        '&#206;',
        '&#207;',
        '&#208;',
        '&#209;',
        '&#210;',
        '&#211;',
        '&#212;',
        '&#213;',
        '&#214;',
        '&#215;',
        '&#216;',
        '&#217;',
        '&#218;',
        '&#219;',
        '&#220;',
        '&#221;',
        '&#222;',
        '&#223;',
        '&#224;',
        '&#225;',
        '&#226;',
        '&#227;',
        '&#228;',
        '&#229;',
        '&#230;',
        '&#231;',
        '&#232;',
        '&#233;',
        '&#234;',
        '&#235;',
        '&#236;',
        '&#237;',
        '&#238;',
        '&#239;',
        '&#240;',
        '&#241;',
        '&#242;',
        '&#243;',
        '&#244;',
        '&#245;',
        '&#246;',
        '&#247;',
        '&#248;',
        '&#249;',
        '&#250;',
        '&#251;',
        '&#252;',
        '&#253;',
        '&#254;',
        '&#255;'
    );
    // Turn each line break (CRLF, CR or LF) into one space, drop any
    // markup, then let htmlentities() produce the named entities.
    $string = htmlentities ( strip_tags ( preg_replace ( "/\r\n|\r|\n/", " ", $string ) ), ENT_QUOTES );

    // Swap every named entity for its decimal character reference.
    $string = preg_replace ( $find_array, $replace_array, $string );
    return $string;
}
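
One thing to watch when the input already contains entities, as in the
item_name example: htmlentities() escapes the ampersand of an existing
entity again by default, so &egrave; would come out of the function
above as &#038;egrave;. A minimal sketch of one way around that,
assuming PHP 7 or later; escapeOnce is a made-up name, and the fourth
argument is the double_encode flag of htmlentities():

```php
<?php
// Hypothetical wrapper: escape text, but leave entities that are
// already present alone (double_encode = false).
function escapeOnce(string $text): string
{
    return htmlentities($text, ENT_QUOTES, 'UTF-8', false);
}

$input = 'cr&egrave;me';

// Default behaviour: the existing ampersand is escaped a second time.
echo htmlentities($input, ENT_QUOTES, 'UTF-8'), "\n"; // cr&amp;egrave;me

// With double_encode switched off the entity survives as it is.
echo escapeOnce($input), "\n";                         // cr&egrave;me
```

The named entity that survives can then be mapped to its decimal form by
the replacement step as before.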

--
John C. Nichel
ÜberGeek
KegWorks.com
716.856.9675
[EMAIL PROTECTED]
