From:             pd at xobase dot com
Operating system: All
PHP version:      5.1.6
PHP Bug Type:     *XML functions
Bug description:  xml_parser_to_struct hoses html content

Description:
------------
This relates to the XML/PHP content management system XObase.

XOb allows users to embed HTML content into the XML docs used to store
data in the system.  The HTML is escaped and wrapped in a CDATA section.
This all worked perfectly in versions of PHP prior to 5.1.x (5.0.x is ok).

What happens is that even if the file is saved correctly and is well formed,
in some instances the HTML content in the file causes a badly formed XML
structure to be created.  The system calls xml_parse_into_struct on a string
read from disk.

We were using implode(file($filearg),"");

In the newer PHP versions, the array returned from file() is not complete.
So there may be an issue with the file() function (specifically when its
result is then passed to xml_parse_into_struct).

We changed this to 
$fp = fopen($filearg,"r");
$data = fread($fp,filesize($filearg));
fclose($fp);

$data = file_get_contents($filearg); also returns an incomplete string
from the XML doc.

The fread() method creates a good string, however.  Even so, the $xmlObj
created in the code example below cuts off at the node that contains the
HTML.  With certain HTML this happens every time.  With other HTML it does
not happen at all, or it shows up only after content that parsed correctly
once has been pasted in again more than 1 or 2 times.
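
One way to confirm which read method comes up short is to compare the byte count each one returns against filesize().  The following is only a minimal sketch: compareReadLengths() is a hypothetical helper name, not part of XObase, and the modern implode(separator, array) argument order is used.

```php
<?php
// Hypothetical helper: report how many bytes each read method returns
// for the same file, next to the size the filesystem reports.
function compareReadLengths($filearg) {
    clearstatcache();                 // avoid a stale cached filesize()
    $expected = filesize($filearg);

    // fread() with an explicit length, as in the workaround above.
    $fp = fopen($filearg, "rb");
    $viaFread = ($expected > 0) ? fread($fp, $expected) : "";
    fclose($fp);

    return array(
        'filesize'          => $expected,
        'fread'             => strlen($viaFread),
        'file_get_contents' => strlen(file_get_contents($filearg)),
        'file+implode'      => strlen(implode("", file($filearg))),
    );
}
```

If any of the three lengths is smaller than the 'filesize' entry, the string was already truncated before it reached the XML parser.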

I've pasted an example xml string that contains the issue on opening.
Note that the offending html lives in node

<BRANDINGHTML_01>





Reproduce code:
---------------
function createXMLStruc($filearg){
        if (is_file($filearg)){
                // Read the whole file in one fread() call; the two
                // commented-out alternatives returned an incomplete string.
                $fp = fopen($filearg,"r");
                $data = fread($fp,filesize($filearg));
                fclose($fp);
                //$data = file_get_contents($filearg);
                //$data = implode(file($filearg),"");
                $p = xml_parser_create();
                xml_parser_set_option($p, XML_OPTION_SKIP_WHITE, 1);
                //xml_parser_set_option($p, XML_OPTION_CASE_FOLDING, false);
                // $structure and $index are declared by-reference in the
                // function signature, so no call-time "&" is needed.
                xml_parse_into_struct($p,$data,$structure,$index);
                xml_parser_free($p);
                $xmlObj[0]=$index;
                $xmlObj[1]=$structure;
                return $xmlObj;
        }else{
                //echo "Array createXMLstruc() function error :$filearg does not exist<br>";
        }
}
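
To narrow down where the structure is cut off, the return value of xml_parse_into_struct and the parser's error state can be inspected after the call.  This is a minimal sketch under assumptions: parseWithDiagnostics() is a hypothetical helper that is not part of XObase, and it takes the already-read string rather than a file name.

```php
<?php
// Hypothetical helper: parse a string and report whether the parser
// finished cleanly and, if not, where it stopped.
function parseWithDiagnostics($data) {
    $p = xml_parser_create();
    xml_parser_set_option($p, XML_OPTION_SKIP_WHITE, 1);

    $structure = array();
    $index = array();
    // Returns 1 on success and 0 on failure.
    $ok = xml_parse_into_struct($p, $data, $structure, $index);

    return array(
        'ok'        => ($ok == 1),
        'error'     => xml_error_string(xml_get_error_code($p)),
        'line'      => xml_get_current_line_number($p),
        'byte'      => xml_get_current_byte_index($p),
        'nodes'     => count($structure),
        'structure' => $structure,
        'index'     => $index,
    );
}
```

When 'ok' is false, the 'line' and 'byte' entries indicate how far into the string the parser got, which can be compared with the position of the BRANDINGHTML_01 node in the example doc.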

>>>>>>>>>>>>> Example doc
<?xml version="1.0"?>
<ARRAY_CLASS CLASS="content" TYPE="venue" STATUS="pending"
ISCONTAINER="false" PARENTCLASS="venue.xml">
  <GENERAL TYPE="property-catagory" LABEL="General" DATAFORMAT="">
    <NAME TYPE="text" LABEL="Name" DATAFORMAT="singleline"
SEARCH="true"><![CDATA[TestVenue]]></NAME>
    <ALIAS1 TYPE="alias" LABEL="Class Attributes"
PROPERTY="ARRAY_CLASS"></ALIAS1>
    <RELEASE TYPE="date" LABEL="Release Date" DATAFORMAT="MTYHMS"
SEARCH="false"></RELEASE>
    <EXPIRE TYPE="date" LABEL="Expire Date" DATAFORMAT="MTYHMS"
SEARCH="false"></EXPIRE>
    <CLASS_TEMPLATES TYPE="classtemplates" DATAFORMAT="path"
SEARCH="false" LABEL="Templates" SELECTED="0"></CLASS_TEMPLATES>
    <CREATIONDATE LABEL="Creation Date" TYPE="date"
DATAFORMAT="mdyhms">1158958165</CREATIONDATE>
    <OWNER TYPE="resourceproperty" LABEL="Owner" PROPERTY="EMAIL"
WHERE="users" SEARCH="false">/pdempsey.xml</OWNER>
    <VENUECODE TYPE="text" LABEL="Venue Code"
DATAFORMAT="singleline"></VENUECODE>
    <ADDRESS1 TYPE="text" LABEL="Address 1"
DATAFORMAT="singleline"></ADDRESS1>
    <ADDRESS2 TYPE="text" LABEL="Address 2"
DATAFORMAT="singleline"></ADDRESS2>
    <CITY TYPE="text" LABEL="City" DATAFORMAT="singleline"></CITY>
    <STATE TYPE="text" LABEL="State" DATAFORMAT="simplestate"></STATE>
    <POSTALCODE TYPE="text" LABEL="Postal Code (Zip)"
DATAFORMAT="singleline"></POSTALCODE>
  </GENERAL>
  <SUBVENUES TYPE="list" LABEL="Sub Venues" DATAFORMAT="menu"
ROLES="BD,SB,RD,TM"></SUBVENUES>
  <BRANDING TYPE="property-catagory" LABEL="Branding Resources">
    <BRANDRECS TYPE="resourcelist" LABEL="Branding
Resources"></BRANDRECS>
  </BRANDING>
  <BRANDINGHTML TYPE="userfields" LABEL="Branding HTML"
DATAFORMAT="blob">
    <BRANDINGHTML_00 TYPE="html" LABEL="Branding HTML"
DATAFORMAT="blob"><![CDATA[]]></BRANDINGHTML_00>
    <BRANDINGHTML_01 TYPE="html" LABEL="Discount Offer For Venue"
DATAFORMAT="blob"><![CDATA[<table width="280" border="0" align="center"
cellpadding="0" cellspacing="0">_carrige_return_newline 
<tr>_carrige_return_newline    <td><img src="/ui-img/rule_tl.gif"
width="10" height="10"></td>_carrige_return_newline    <td
background="/ui-img/rule_ht.gif"><img src="/ui-img/blank.gif" width="5"
height="5"></td>_carrige_return_newline    <td><img
src="/ui-img/rule_tr.gif" width="10"
height="10"></td>_carrige_return_newline  </tr>_carrige_return_newline 
<tr>_carrige_return_newline    <td
background="/ui-img/rule_vl.gif"> </td>_carrige_return_newline   
<td><table cellpadding="0" cellspacing="0">_carrige_return_newline       
<tr>_carrige_return_newline          <td valign="top"><font
size="3"><strong>Winter Park Resort customers_carrige_return_newline      
         get <u><font size="4">25% Off</font></u> all
orders!</strong></font>*<br>_carrige_return_newline            <img
src="/ui-img/blank.gif" width="10" height="5"><br>_carrige_return_newline 
          <span class="small">Copy this code. You will be prompted to
enter_carrige_return_newline            it during the checkout process to
receive your special discount._carrige_return_newline            *Offer
available for a limited time only. <br>_carrige_return_newline           
<img src="/ui-img/blank.gif" width="10"
height="5"><br>_carrige_return_newline            <em>special offer
code:</em> </span>_carrige_return_newline            <table width="20"
border="1" cellpadding="5" cellspacing="0"
bgcolor="#FFFF99">_carrige_return_newline             
<tr>_carrige_return_newline                <td><strong><font
size="3">wpnewsite</font></strong></td>_carrige_return_newline            
 </tr>_carrige_return_newline            </table>_carrige_return_newline   
      </td>_carrige_return_newline        </tr>_carrige_return_newline     
</table>_carrige_return_newline    </td>_carrige_return_newline    <td
background="/ui-img/rule_vr.gif"> </td>_carrige_return_newline 
</tr>_carrige_return_newline  <tr>_carrige_return_newline    <td><img
src="/ui-img/rule_bl.gif" width="10"
height="10"></td>_carrige_return_newline    <td
background="/ui-img/rule_hb.gif"><img src="/ui-img/blank.gif" width="5"
height="5"></td>_carrige_return_newline    <td><img
src="/ui-img/rule_br.gif" width="10"
height="10"></td>_carrige_return_newline 
</tr>_carrige_return_newline</table>]]></BRANDINGHTML_01>
  </BRANDINGHTML>
  <SEARCH TYPE="property-catagory" LABEL="Search Keys" DATAFORMAT="">
    <KEYWORDS TYPE="keywords" DATAFORMAT="multiline" LABEL="Automatic
Keywords" SEARCH="false">,TestVenue,</KEYWORDS>
    <SEARCHKEYPAIRS TYPE="keywords" DATAFORMAT="multiline"
LABEL="Automatic Property-pairs" SEARCH="false"></SEARCHKEYPAIRS>
    <USERKEYWORDS TYPE="text" DATAFORMAT="multiline" LABEL="Addtional
Keywords" SEARCH="false"></USERKEYWORDS>
    <DESCRIPTION TYPE="text" LABEL="description"
DATAFORMAT="multiline"></DESCRIPTION>
  </SEARCH>
  <ROLES TYPE="rolelist" SEARCH="false">
    <ADMIN TYPE="role" LABEL="administrator">654321</ADMIN>
    <CONTENT_MANAGER TYPE="role" LABEL="Content
Manager">654321</CONTENT_MANAGER>
  </ROLES>
</ARRAY_CLASS>


-- 
Edit bug report at http://bugs.php.net/?id=38930&edit=1