[PHP] Cleaning bad characters from var

2006-07-21 Thread Paul Nowosielski
Dear All,

I'm trying to set up an XML feed from our news articles. My XML is validating. 
The issue is that some of the articles have a weird encoding.

It seems to be single quotes. For example:
the world92s largest live event producer

Notice the 92.

I already have this to clean vars but it's not doing the trick:

// clean bad chars for valid XML
//$patterns[0] = '/=/';
$patterns[1] = '/</';
$patterns[2] = '/>/';
$patterns[3] = '/\'/';
$patterns[4] = '/\"/';
$patterns[5] = '/&/';

//$replacements[0] = '/eq/';
$replacements[1] = '/&lt;/';
$replacements[2] = '/&gt;/';
$replacements[3] = '/&apos;/';
$replacements[4] = '/&quot;/';
$replacements[5] = '/&amp;/';





// chars to replace
$badwordchars=array(
    "\xe2\x80\x98", // left single quote
    "\xe2\x80\x99", // right single quote
    "\xe2\x80\x9c", // left double quote
    "\xe2\x80\x9d", // right double quote
    "\xe2\x80\x94", // em dash
    "\xe2\x80\xa6" // ellipsis
);

$fixedwordchars=array(
    "&#8216;",
    "&#8217;",
    '&#8220;',
    '&#8221;',
    '&mdash;',
    '&#8230;'
);

Any thoughts would be very helpful.

Thank You,

-- 
Paul Nowosielski
Webmaster

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Cleaning bad characters from var

2006-07-21 Thread John Nichel

Paul Nowosielski wrote:

> Dear All,
>
> I'm trying to set up an XML feed from our news articles. My XML is validating.
> The issue is that some of the articles have a weird encoding.




I wrote a function to do this for our product descriptions when sending 
them in an XML doc to certain vendors.  It's old, crude, but you're 
welcome to it.


function convertString ( $string, $reverse = false ) {
$find_array = array (
    "/&quot;/",
    "/&amp;/",
    "/&lt;/",
    "/&gt;/",
    "/&nbsp;/",
    "/&iexcl;/",
    "/&cent;/",
    "/&pound;/",
    "/&curren;/",
    "/&yen;/",
    "/&brvbar;/",
    "/&sect;/",
    "/&uml;/",
    "/&copy;/",
    "/&ordf;/",
    "/&laquo;/",
    "/&not;/",
    "/&shy;/",
    "/&reg;/",
    "/&macr;/",
    "/&deg;/",
    "/&plusmn;/",
    "/&sup2;/",
    "/&sup3;/",
    "/&acute;/",
    "/&micro;/",
    "/&para;/",
    "/&middot;/",
    "/&cedil;/",
    "/&sup1;/",
    "/&ordm;/",
    "/&raquo;/",
    "/&frac14;/",
    "/&frac12;/",
    "/&frac34;/",
    "/&iquest;/",
    "/&Agrave;/",
    "/&Aacute;/",
    "/&Acirc;/",
    "/&Atilde;/",
    "/&Auml;/",
    "/&Aring;/",
    "/&AElig;/",
    "/&Ccedil;/",
    "/&Egrave;/",
    "/&Eacute;/",
    "/&Ecirc;/",
    "/&Euml;/",
    "/&Igrave;/",
    "/&Iacute;/",
    "/&Icirc;/",
    "/&Iuml;/",
    "/&ETH;/",
    "/&Ntilde;/",
    "/&Ograve;/",
    "/&Oacute;/",
    "/&Ocirc;/",
    "/&Otilde;/",
    "/&Ouml;/",
    "/&times;/",
    "/&Oslash;/",
    "/&Ugrave;/",
    "/&Uacute;/",
    "/&Ucirc;/",
    "/&Uuml;/",
    "/&Yacute;/",
    "/&THORN;/",
    "/&szlig;/",
    "/&agrave;/",
    "/&aacute;/",
    "/&acirc;/",
    "/&atilde;/",
    "/&auml;/",
    "/&aring;/",
    "/&aelig;/",
    "/&ccedil;/",
    "/&egrave;/",
    "/&eacute;/",
    "/&ecirc;/",
    "/&euml;/",
    "/&igrave;/",
    "/&iacute;/",
    "/&icirc;/",
    "/&iuml;/",
    "/&eth;/",
    "/&ntilde;/",
    "/&ograve;/",
    "/&oacute;/",
    "/&ocirc;/",
    "/&otilde;/",
    "/&ouml;/",
    "/&divide;/",
    "/&oslash;/",
    "/&ugrave;/",
    "/&uacute;/",
    "/&ucirc;/",
    "/&uuml;/",
    "/&yacute;/",
    "/&thorn;/",
    "/&yuml;/"
);
$replace_array = array (
    '&#034;',
    '&#038;',
    '&#060;',
    '&#062;',
    '&#160;',
    '&#161;',
    '&#162;',
    '&#163;',
    '&#164;',
    '&#165;',
    '&#166;',
    '&#167;',
    '&#168;',
    '&#169;',
    '&#170;',
    '&#171;',
    '&#172;',
    '&#173;',
    '&#174;',
    '&#175;',
    '&#176;',
    '&#177;',
    '&#178;',
    '&#179;',
    '&#180;',
    '&#181;',
    '&#182;',
    '&#183;',
    '&#184;',
    '&#185;',
    '&#186;',
    '&#187;',
    '&#188;',
    '&#189;',
    '&#190;',
    '&#191;',
    '&#192;',
    '&#193;',
    '&#194;',
    '&#195;',
    '&#196;',
    '&#197;',
    '&#198;',
    '&#199;',
    '&#200;',
    '&#201;',
    '&#202;',
    '&#203;',
    '&#204;',
    '&#205;',
    '&#206;',
    '&#207;',
    '&#208;',
    '&#209;',
    '&#210;',
    '&#211;',
    '&#212;',
    '&#213;',
    '&#214;',
    '&#215;',
    '&#216;',
    '&#217;',
    '&#218;',
    '&#219;',
    '&#220;',
    '&#221;',
    '&#222;',
    '&#223;',
    '&#224;',
    '&#225;',
    '&#226;',
    '&#227;',
    '&#228;',
    '&#229;',
    '&#230;',
    '&#231;',
    '&#232;',
   

Re: [PHP] Cleaning bad characters from var

2006-07-21 Thread Adam Zey

John Nichel wrote:

> I wrote a function to do this for our product descriptions when sending
> them in an XML doc to certain vendors.  It's old, crude, but you're
> welcome to it.
>
> <snip function that uses regular expressions for simple string replacements>



Using regular expressions for that is a very bad idea. You're using 
regular expressions for simple string matches, which is going to be much 
slower than just using the simple string replace functions. It even 
warns you against doing what you're doing in the manual.
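
As a minimal sketch of the difference (the sample string and variable names here are illustrative, not taken from the snipped function): for literal matches, preg_replace and str_replace give the same result, but the regex version needs delimiters around every pattern and runs each one through the PCRE engine.

```php
<?php
// Illustrative input; not taken from the original function.
$text = 'Fish & chips <b>today</b>';

// Regex version: every pattern needs delimiters and is handled by PCRE.
$viaRegex = preg_replace(
    array('/&/', '/</', '/>/'),
    array('&amp;', '&lt;', '&gt;'),
    $text
);

// Plain string version: the same literal matches, no pattern compilation.
$viaString = str_replace(
    array('&', '<', '>'),
    array('&amp;', '&lt;', '&gt;'),
    $text
);

// Both calls apply their pairs in order, so the results are identical.
var_dump($viaRegex === $viaString);
```

Note that both functions apply the pairs in array order, so the ampersand has to be replaced before any entity is introduced.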


Here's a function that demonstrates replacing several characters with 
something else:


function replaceChars($string)
{
    $find_array = array("a", "b", "c");
    $replace_array = array("apple", "banana", "carambola");

    return str_replace($find_array, $replace_array, $string);
}

You can, of course, throw other string processing functions in there.
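
Applied to the original problem, a sketch along these lines might work. It assumes the stray "92" is the single byte 0x92, which is the right single quote in Windows-1252, rather than the three-byte UTF-8 sequence that the $badwordchars array looks for. The function name cleanForXml and the choice of numeric character references are illustrative, not something from the original code.

```php
<?php
// Hypothetical helper; assumes $string holds Windows-1252 bytes.
function cleanForXml($string)
{
    $find_array = array(
        '&',    // must come first so the entities added below are not re-escaped
        '<',
        '>',
        '"',
        "'",
        "\x91", // left single quote
        "\x92", // right single quote
        "\x93", // left double quote
        "\x94", // right double quote
        "\x96", // en dash
        "\x97", // em dash
        "\x85", // ellipsis
    );
    $replace_array = array(
        '&amp;',
        '&lt;',
        '&gt;',
        '&quot;',
        '&apos;',
        '&#8216;',
        '&#8217;',
        '&#8220;',
        '&#8221;',
        '&#8211;',
        '&#8212;',
        '&#8230;',
    );

    // str_replace works through the pairs in array order.
    return str_replace($find_array, $replace_array, $string);
}

echo cleanForXml("the world\x92s largest live event producer"), "\n";
```

If the articles turn out to be stored as UTF-8, the find list would need the multi-byte sequences instead of the single bytes.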

Regards, Adam Zey.
