RE: [uug] hard returns in xml

Gary Thornock Wed, 17 Sep 2003 07:45:13 -0700

The answer to your question depends in part on what you're doing with 
the XML, and in part on how you're parsing it.


Given that you seem to do most of your stuff in PHP, you're probably 
using a SAX-based parser, since that's the default XML parsing method 
in PHP.  An advantage of SAX-based parsers is that they work very 
quickly, even when the XML is very large: they hand you the document 
a piece at a time instead of loading it all into memory, so the size 
of the XML is essentially unlimited.  One option that may work out 
well for you is to enclose the document body as HTML in a CDATA 
section, thus:

<body>
  <![CDATA[<p>Since every penny I earn depends on copyright protection, 
           I'm all in favor of reasonable laws to do the job.</p>

           <p>But there's something kind of sad about the recording 
           industry's indecent passion to punish the "criminals" who 
           are violating their rights.</p>

           <p>Copyright is a temporary monopoly granted by the 
           government -- it creates the legal fiction that a piece of 
           writing or composing (or, as technologies were created, a 
           recorded performance) is property and can only be sold by 
           those who have been licensed to do so by the copyright 
           holder.</p>]]>
</body>
  
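This prevents the XML parser from trying to interpret the HTML tags
in the document body.  The parser hands you the whole section as
plain character data, line breaks included, so the <p> tags take
care of the formatting when you write the body out to a page.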
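
For what it's worth, here is a minimal sketch of what an event-based
parser gives you back for a CDATA body.  It's in Python rather than
PHP purely to keep it short and self-contained; the callbacks
correspond roughly to what you'd register in PHP with
xml_set_element_handler() and xml_set_character_data_handler().  The
element names (document, id, title, body) are only my guess at your
layout, since the XML was snipped.

```python
import xml.sax

# Hypothetical document: the element names are assumptions, not the
# poster's actual schema.
DOC = """<document>
  <id>42</id>
  <title>Example title</title>
  <body><![CDATA[<p>First paragraph,
wrapped across two lines.</p>

<p>Second paragraph.</p>]]></body>
</document>"""


class BodyHandler(xml.sax.ContentHandler):
    """Collects the character data found inside <body> elements."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.chunks = []

    def startElement(self, name, attrs):
        # The <p> tags inside the CDATA section never trigger this
        # callback; only real elements such as <body> do.
        if name == "body":
            self.in_body = True

    def endElement(self, name):
        if name == "body":
            self.in_body = False

    def characters(self, content):
        # The parser may deliver one element's text in several pieces,
        # so collect them and join at the end.
        if self.in_body:
            self.chunks.append(content)


handler = BodyHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
body = "".join(handler.chunks)
print(body)
```

The body comes back as one string with the markup and the hard
returns intact, so there's no need for a special end-of-line
character in the XML itself.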

Another option you would do well to consider is XSL.  Essentially,
an XSL stylesheet describes how to transform XML into some other
format (HTML, for instance) for presentation.
You can produce multiple XSL transforms for the same XML content;
for instance, you could do one that displays all of the document
titles and ignores the bodies, and another that displays the
documents in some HTML format.

There's a lot of information available on doing different things with 
XML and PHP.  I'd suggest taking a look at "Professional PHP4 XML" by 
Argerich et al., from Wrox Press -- it's the most complete reference 
I've ever seen on the combination of PHP and XML.

- Gary
  
> -----Original Message-----
> From: Wade Preston Shearer [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, September 16, 2003 18:04
> To: BYU UUG Post; Utah PHP User Group
> Subject: [uug] hard returns in xml
> 
> 
> i have a question for all of you xml parsers.
> 
> i have an xml file like...
> 
> <snip xml>
> 
> ...that has many entries. as you can see, this xml file is acting 
> like a database of documents. the information in the xml is what i 
> will call the "title" information for the document. my question 
> is... what is the best way to put in the "body" of the document?
> 
> the reason that i ask is because, putting it all in a tag, like 
> this...
> 
> <snip more xml>
> 
> 
> ...doesn't seem to make much sense, because...
> 
>   1.  it is so much bigger than the other content
> 
>   2.  you'd have to put a <br> in there for it to display 
>       correctly... or perhaps a "\n" would work... or something...?
> 
>   3.  if i want to parse the entire directory... it'll get really 
>       big if the "body" tags contain more than a sentence or two
> 
> 
> How should i approach this? The best thing that I have come up with 
> is to use the info in the "id" tag to reference a text file by the 
> same name ($id . ".txt"). So, the xml file would then be a list of the 
> documents "title" info while the "body" data is in separate text 
> files.
>
> Although everything isn't in the same "database," it seems like a 
> good idea because I want to be able to use the data in the xml 
> file for both displaying one document at a time and also displaying 
> a listing of all of the "titles."
>
> If all of the data was the same size (ie: one sentence or less), 
> this would be easy, but, hey... it's never easy. For those of you 
> that have done an XML parsing project like this before, with data 
> that requires a hard return within data that is within a tag...
>
> how have you done it?
>
> does xml have a special character for end-of-lines like this?
>
> does this approach (separate txt files for the "body") sound good?
> 
> 
> eager for your input,
> 
> wade

____________________
BYU Unix Users Group
http://uug.byu.edu/
___________________________________________________________________
List Info: http://uug.byu.edu/cgi-bin/mailman/listinfo/uug-list
