ID: 50156
Comment by: edwin at bitstorm dot org
Reported By: edwin at bitstorm dot org
Status: Open
Bug Type: XML Reader
Operating System: Ubuntu 9.04
PHP Version: 5.2SVN-2009-11-12 (SVN)
New Comment:
It turns out I can use the isEmptyElement property to detect when I am
dealing with an <a/> element.
This is a bit unfortunate because SAX, for example (the mother of all
XML readers?), does not use this mechanism and works as I would expect.
It's just how libxml seems to work, so it should probably not be marked
as a PHP-bug.
This bug can be closed: "not a bug"... :-/
A little parsing example in the documentation might be a very good
idea, though.
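A minimal sketch of what such an example could look like, assuming the
sample XML from this report. The variable names ($stack, $paths) are
illustrative only; the one thing it relies on is reading isEmptyElement
while the reader is still positioned on the start tag:

```php
<?php
// Sketch: track the element path with XMLReader, treating <a/> as an
// element that opens and closes within the same read() step.
$xml = '<Titles><Title><ID>429</ID><Type/><Barcode></Barcode></Title></Titles>';

$reader = new XMLReader();
$reader->XML($xml);

$stack = array(); // names of the currently open elements
$paths = array(); // path recorded at each start tag

while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT) {
        $stack[] = $reader->name;
        $paths[] = implode(' - ', $stack);
        // No END_ELEMENT follows a self-closing tag, so pop right away.
        if ($reader->isEmptyElement) {
            array_pop($stack);
        }
    } elseif ($reader->nodeType === XMLReader::END_ELEMENT) {
        array_pop($stack);
    }
}
$reader->close();

echo implode("\n", $paths), "\n";
```

With this handling, Barcode is recorded as "Titles - Title - Barcode"
rather than as a child of Type.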
Previous Comments:
------------------------------------------------------------------------
[2009-11-12 21:14:17] edwin at bitstorm dot org
The XML parser used is libxml 2.7.3.
------------------------------------------------------------------------
[2009-11-12 21:02:52] edwin at bitstorm dot org
<?php
// Source code for bug #50156
// The following code will output the text below, which
// is not what you expect when you see the xml.
//
// Barcode is not a child of Type, but how can you know?
//
// Titles
// Titles
// Titles - Title
// Titles - Title
// Titles - Title - ID
// Titles - Title - ID
// Titles - Title
// Titles - Title
// Titles - Title - Type
// Titles - Title - Type
// Titles - Title - Type - Barcode
// Titles - Title - Type
// Titles - Title - Type
// Titles - Title
// Titles - Title
// Titles
$xml = "
<Titles>
<Title>
<ID>429</ID>
<Type/>
<Barcode></Barcode>
</Title>
</Titles>
";
$expected = "
Titles
Titles
Titles - Title
Titles - Title
Titles - Title - ID
Titles - Title - ID
Titles - Title
Titles - Title
Titles - Title - Type
Titles - Title
Titles - Title
Titles - Title - Barcode
Titles - Title
Titles - Title
Titles
Titles
";
$reader = new XMLReader();
$reader->xml($xml);
$actual = '';
// Make a stack for every element
$stack = array();
while ($reader->read()) {
    switch ($reader->nodeType) {
        case XMLReader::ELEMENT:
            array_push($stack, $reader->name);
            break;
        case XMLReader::END_ELEMENT:
            array_pop($stack);
            break;
    }
    $actual .= join(' - ', $stack)."\n";
}
// Clean up and make it OS-agnostic
$expected = preg_replace('/\\r/', '', trim($expected));
$actual = preg_replace('/\\r/', '', trim($actual));
// Print result
echo "<h3>Expected</h3>\n";
echo "<pre>$expected</pre>\n";
echo "<h3>Actual</h3>\n";
echo "<pre>$actual</pre>\n";
// Test it
if ($expected == $actual) {
    echo "<strong>Good</strong>";
} else {
    echo "<strong>Not good</strong>";
}
?>
------------------------------------------------------------------------
[2009-11-12 13:48:14] edwin at bitstorm dot org
Description:
------------
An element written as <a></a> is reported twice: once as
XMLReader::ELEMENT and once as XMLReader::END_ELEMENT.
An element written as <a/> is reported only once, as XMLReader::ELEMENT.
It should be reported as XMLReader::END_ELEMENT too, because the end tag
is implicit.
The problem is that you now cannot distinguish between <a><b> and
<a/><b>, and that's a bug.
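For comparison, a sketch of how the two cases can be told apart in user
code by adding the implicit end event by hand. balancedEvents() is a
hypothetical helper, not part of XMLReader; it assumes the
isEmptyElement property is read while the reader is still on the start
tag:

```php
<?php
// Hypothetical helper: list (nodeType, name) pairs for element events,
// adding a synthetic END_ELEMENT after every self-closing element.
function balancedEvents($xml)
{
    $reader = new XMLReader();
    $reader->XML($xml);
    $events = array();
    while ($reader->read()) {
        $type = $reader->nodeType;
        if ($type !== XMLReader::ELEMENT && $type !== XMLReader::END_ELEMENT) {
            continue; // skip text and whitespace nodes
        }
        $events[] = array($type, $reader->name);
        if ($type === XMLReader::ELEMENT && $reader->isEmptyElement) {
            // <a/> produces no END_ELEMENT of its own, so add one here.
            $events[] = array(XMLReader::END_ELEMENT, $reader->name);
        }
    }
    $reader->close();
    return $events;
}

$nested  = balancedEvents('<r><a><b/></a></r>'); // b inside a
$sibling = balancedEvents('<r><a/><b/></r>');    // b after a

foreach ($sibling as $event) {
    echo $event[0], ' ', $event[1], "\n";
}
```

In the nested case the third event is the start of b; in the sibling
case it is the synthetic end of a, so the two inputs no longer look the
same.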
Reproduce code:
---------------
$reader = new XMLReader();
$reader->open($file);
echo "<table>\n";
while ($reader->read()) {
    echo "<tr><td>".$reader->nodeType."</td><td>".$reader->name
        ."</td><td>".$reader->value."</td></tr>\n";
}
echo "</table>\n";
Input:
<Titles>
<Title>
<ID>429</ID>
<Type />
<Barcode>
</Barcode>
Expected result:
----------------
1 Titles
14 #text
1 Title
14 #text
1 ID
3 #text 429
15 ID
14 #text
1 Type
14 #text
15 Type
14 #text
1 Barcode
14 #text
15 Barcode
14 #text
Actual result:
--------------
1 Titles
14 #text
1 Title
14 #text
1 ID
3 #text 429
15 ID
14 #text
1 Type
14 #text
1 Barcode
14 #text
15 Barcode
14 #text
------------------------------------------------------------------------