[PHP] Cleaning bad characters from var
Dear All,

I'm trying to set up an XML feed from our news articles. My XML is validating. The issue is that some of the articles have a weird encoding. It seems to be single quotes. For example:

    the world92s largest live event producer

Notice the 92. I already have this to clean vars, but it's not doing the trick:

    // clean bad chars for valid XML
    //$patterns[0] = '/=/';
    $patterns[1] = '/</';
    $patterns[2] = '/>/';
    $patterns[3] = '/\'/';
    $patterns[4] = '/\"/';
    $patterns[5] = '/&/';
    //$replacements[0] = '/eq/';
    $replacements[1] = '/&lt/';
    $replacements[2] = '/&gt/';
    $replacements[3] = '/&apos;/';
    $replacements[4] = '/&quot;/';
    $replacements[5] = '/&amp;/';

    // chars to replace
    $badwordchars=array(
        "\xe2\x80\x98", // left single quote
        "\xe2\x80\x99", // right single quote
        "\xe2\x80\x9c", // left double quote
        "\xe2\x80\x9d", // right double quote
        "\xe2\x80\x94", // em dash
        "\xe2\x80\xa6"  // ellipsis
    );
    $fixedwordchars=array(
        "&#8216;",
        "&#8217;",
        '&#8220;',
        '&#8221;',
        '&mdash;',
        '&#8230;'
    );

Any thoughts would be very helpful.

Thank You,
--
Paul Nowosielski
Webmaster

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
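For the markup characters themselves (the lt, gt, apos, quot and amp replacements listed above), PHP already ships an escaper, so a hand-written pattern list may not be needed. A minimal sketch, assuming PHP 5.4 or later (for ENT_XML1) and input that is already valid UTF-8; the function name xmlEscape is made up for illustration:

```php
<?php
// Hypothetical helper: escape the five XML special characters in one call.
// Assumes $text is valid UTF-8; the thread does not confirm the feed's encoding.
function xmlEscape($text)
{
    // ENT_QUOTES also escapes single quotes; ENT_XML1 emits &apos; for them.
    return htmlspecialchars($text, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

echo xmlEscape("Tom & Jerry's <show>"), "\n";
```

Note that htmlspecialchars() returns an empty string when the input is not valid in the given charset (unless ENT_SUBSTITUTE or ENT_IGNORE is passed), so stray single-byte characters have to be converted or replaced before this step.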
Re: [PHP] Cleaning bad characters from var
Paul Nowosielski wrote:

Dear All, I'm trying to set up an XML feed from our news articles. My XML is validating. The issue is some of the articles have a weird encoding.

I wrote a function to do this for our product descriptions when sending them in an XML doc to certain vendors. It's old, crude, but you're welcome to it.

    function convertString ( $string, $reverse = false ) {
        $find_array = array (
            "/&quot;/", "/&amp;/", "/&lt;/", "/&gt;/", "/&nbsp;/", "/&iexcl;/",
            "/&cent;/", "/&pound;/", "/&curren;/", "/&yen;/", "/&brvbar;/",
            "/&sect;/", "/&uml;/", "/&copy;/", "/&ordf;/", "/&laquo;/", "/&not;/",
            "/&shy;/", "/&reg;/", "/&macr;/", "/&deg;/", "/&plusmn;/", "/&sup2;/",
            "/&sup3;/", "/&acute;/", "/&micro;/", "/&para;/", "/&middot;/",
            "/&cedil;/", "/&sup1;/", "/&ordm;/", "/&raquo;/", "/&frac14;/",
            "/&frac12;/", "/&frac34;/", "/&iquest;/", "/&Agrave;/", "/&Aacute;/",
            "/&Acirc;/", "/&Atilde;/", "/&Auml;/", "/&Aring;/", "/&AElig;/",
            "/&Ccedil;/", "/&Egrave;/", "/&Eacute;/", "/&Ecirc;/", "/&Euml;/",
            "/&Igrave;/", "/&Iacute;/", "/&Icirc;/", "/&Iuml;/", "/&ETH;/",
            "/&Ntilde;/", "/&Ograve;/", "/&Oacute;/", "/&Ocirc;/", "/&Otilde;/",
            "/&Ouml;/", "/&times;/", "/&Oslash;/", "/&Ugrave;/", "/&Uacute;/",
            "/&Ucirc;/", "/&Uuml;/", "/&Yacute;/", "/&THORN;/", "/&szlig;/",
            "/&agrave;/", "/&aacute;/", "/&acirc;/", "/&atilde;/", "/&auml;/",
            "/&aring;/", "/&aelig;/", "/&ccedil;/", "/&egrave;/", "/&eacute;/",
            "/&ecirc;/", "/&euml;/", "/&igrave;/", "/&iacute;/", "/&icirc;/",
            "/&iuml;/", "/&eth;/", "/&ntilde;/", "/&ograve;/", "/&oacute;/",
            "/&ocirc;/", "/&otilde;/", "/&ouml;/", "/&divide;/", "/&oslash;/",
            "/&ugrave;/", "/&uacute;/", "/&ucirc;/", "/&uuml;/", "/&yacute;/",
            "/&thorn;/", "/&yuml;/"
        );
        $replace_array = array (
            '&#034;', '&#038;', '&#060;', '&#062;', '&#160;', '&#161;', '&#162;',
            '&#163;', '&#164;', '&#165;', '&#166;', '&#167;', '&#168;', '&#169;',
            '&#170;', '&#171;', '&#172;', '&#173;', '&#174;', '&#175;', '&#176;',
            '&#177;', '&#178;', '&#179;', '&#180;', '&#181;', '&#182;', '&#183;',
            '&#184;', '&#185;', '&#186;', '&#187;', '&#188;', '&#189;', '&#190;',
            '&#191;', '&#192;', '&#193;', '&#194;', '&#195;', '&#196;', '&#197;',
            '&#198;', '&#199;', '&#200;', '&#201;', '&#202;', '&#203;', '&#204;',
            '&#205;', '&#206;', '&#207;', '&#208;', '&#209;', '&#210;', '&#211;',
            '&#212;', '&#213;', '&#214;', '&#215;', '&#216;', '&#217;', '&#218;',
            '&#219;', '&#220;', '&#221;', '&#222;', '&#223;', '&#224;', '&#225;',
            '&#226;', '&#227;', '&#228;', '&#229;', '&#230;', '&#231;', '&#232;',
Re: [PHP] Cleaning bad characters from var
John Nichel wrote:

Paul Nowosielski wrote:

I wrote a function to do this for our product descriptions when sending them in an XML doc to certain vendors. It's old, crude, but you're welcome to it.

[snip: function that uses regular expressions for simple string replacements]

Using regular expressions for that is a very bad idea. You're using regular expressions for simple string matches, which is going to be much slower than just using the simple string replace functions. The manual even warns against doing what you're doing.

Here's a function that demonstrates replacing several characters with something else:

    function replaceChars($string)
    {
        // Each entry in $find_array is replaced by the entry at the same
        // position in $replace_array.
        $find_array = array("a", "b", "c");
        $replace_array = array("apple", "banana", "carambola");
        return str_replace($find_array, $replace_array, $string);
    }

You can, of course, throw other string processing functions in there.

Regards,
Adam Zey.
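Applied to the original question, the same str_replace() approach might look like the sketch below. It assumes the stray 92 in the feed is the single byte 0x92 (the right single quote in Windows-1252) and that the article text is stored in that single-byte encoding, which the thread does not confirm. The function name fixWordChars is hypothetical.

```php
<?php
// Hypothetical helper: replace Windows-1252 "smart" punctuation bytes with
// numeric character references, which are valid in any XML document.
// Assumption: $text is single-byte Windows-1252, not UTF-8.
function fixWordChars($text)
{
    $find_array = array(
        "\x91", // left single quote
        "\x92", // right single quote
        "\x93", // left double quote
        "\x94", // right double quote
        "\x97", // em dash
        "\x85", // ellipsis
    );
    $replace_array = array(
        '&#8216;',
        '&#8217;',
        '&#8220;',
        '&#8221;',
        '&#8212;',
        '&#8230;',
    );
    // Plain string replacement; no regular expressions involved.
    return str_replace($find_array, $replace_array, $text);
}

echo fixWordChars("the world\x92s largest live event producer"), "\n";
```

This would be wrong for UTF-8 text, because bytes such as 0x91 also occur inside multi-byte characters; in that case the three-byte UTF-8 sequences from the first post are the right search strings.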