(Sorry for posting this here, but [EMAIL PROTECTED] bounced this mail as "off topic" and suggested that i post it here.)
Hello, mysqlers... (first post from a long-time mysql user)

i recently learned that mysqldump has an --xml flag to dump a db as XML. COOL! Except that it mangles data by converting >, <, &, etc. into their XML equivalents.

This conversion is Evil because:

The data may or may not have been stored encoded that way originally, and when converting from XML back into some other format, the user cannot reliably convert XML entities back into "normal characters" when both types of data (normal characters and XML entities) are in the same fields (an example is below). In short, it's an unannounced alteration of the user's data, and one which can cause problems later on during conversion from XML to [data format X].

For example, i store PHP code in a database, and the <?php php?> tags get mangled with the --xml flag, as do the & signs in the code. Upon conversion from XML back into any other format, the data is useless: i would have to look through every &amp; entity and see if it's in a string (and thus is okay as '&amp;') or not (and thus it's a programming operator) and edit accordingly.

Here's a real-world example of a piece of data mangled by --xml:

<content>&lt;?php echo '&lt;hr&gt;&lt;b&gt;Session table:&lt;/b&gt;&lt;br&gt;'; $db=classload('DebugUtil'); echo $db-&gt;dumpArray( r_session(), '&lt;br&gt;' ); php?&gt;</content>

That cannot be 100% automatically/reliably converted back into its original form.

Coincidentally, a couple of months ago i wrote a Perl script which dumps a mysql db into XML, and the approach i took to this problem seems to be less intrusive, and keeps the user's data exactly as it is in the db: if the data of a dumped field contains any non-word characters, simply wrap the output in a <![CDATA[...]]> block.
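A rough sketch of that CDATA idea, in Python rather than the Perl of the original script (which isn't shown here). The function name wrap_field and the exact "safe" character set (word characters plus ",.-") are assumptions made for illustration. One wrinkle any such dumper probably has to handle: a literal "]]>" inside the data would end the CDATA section early, so the sketch splits it across two sections.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a value is "safe" if it holds only word characters or , . -
SAFE = re.compile(r'[\w,.-]*')

def wrap_field(name, value):
    """Return <name>value</name>, using a CDATA section for unsafe values."""
    if SAFE.fullmatch(value):
        # Nothing that XML could misread, so emit the value as-is.
        return f"<{name}>{value}</{name}>"
    # "]]>" would close the CDATA section early; split it over two sections.
    body = value.replace("]]>", "]]]]><![CDATA[>")
    return f"<{name}><![CDATA[{body}]]></{name}>"

code = '<?php echo "some code goes here, & some code goes there."; php?>'
dumped = wrap_field("myfield", code)
print(dumped)

# Round trip: a standard XML parser hands back the stored text unchanged.
assert ET.fromstring(dumped).text == code
```

The round-trip check at the end is the point: whatever was in the field comes back byte for byte, with no guessing about which entities were there originally.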
Using mysqldump --xml:

<myfield>&lt;?php echo "some code goes here, &amp; some code goes there."; php?&gt;</myfield>

Proposed method:

<myfield><![CDATA[<?php echo "some code goes here, & some code goes there."; php?>]]></myfield>

So, the mangled example from above becomes:

<content><![CDATA[<?php echo '<hr><b>Session table:</b><br>'; $db=classload('DebugUtil'); echo $db->dumpArray( r_session(), '<br>' ); php?>]]></content>

Fields with only word characters (or word characters and any of ",.-") are left intact.

i strongly recommend a similar change in mysqldump's --xml behaviour, as the current behaviour seems downright evil.

Take care, :)

-----
stephan [EMAIL PROTECTED] - http://www.einsurance.de
Office: +49 (89) 552 92 862
Mobile: +49 (179) 211 97 67

This email is encrypted with ROT26 encoding. Decoding it is in violation of the Digital Millennium Copyright Act.