> First, when most people say "the internet" now, they pretty much mean
> "the web" and e-mail. When people say "the web" they mean HTTP and HTML
> (with a little SSL thrown in for e-commerce). HTTP is the transport protocol
> (how it is delivered) and HTML is the markup language (the message). XML
> attempts to replace and supersede HTML without saying anything about HTTP
> (though one can assume that most of the delivery will be done via HTTP, much
> to the chagrin of many security administrators who depend on firewalls).

That is not really the case. XML cannot be a replacement for html because
xml is not a set of defined elements like html is.

Think of XML as the _rules_ for defining your own markup languages.

XHTML is an attempt to bring HTML in line with the XML standard, but there
are gazillions of XML DTDs (document type definition, basically says what
elements and attributes you can have in a particular XML document) for all
kinds of things.

XML is fantastic specifically because it is _not_ a markup language: it
enables companies and industries to create _their_own_ markup languages for
specific purposes.

> XML is a markup language like HTML. Unlike HTML, the markup language is
> extensible (basically think of it as saying you can define your own tags and
> attributes). This means you can make descriptive tags such as

Ah, from the above, I thought you did not understand, but you do :)

> <book type="paperback"><AUTHOR>Joe Blogs</AUTHOR><TITLE>SATs - How to be
> beaten by the system</TITLE><SUBJECT>Test preparation</SUBJECT></book>
> 
> Which looks a lot like HTML but isn't. Interestingly, the tags are
> descriptive of the content which beats the hell out of UN/EDIFACT if you've
> ever had to do any work for big business. Other differences are the rules
> are more rigid than HTML: all tags must close, all attributes must be
> quoted, all reserved characters must be escaped properly, all tags and
> attributes are case sensitive. Documents are Unicode: parsers must accept
> UTF-8 and UTF-16, and assume UTF-8 when no encoding is declared (Note: The
> default used by PHP seems to be UTF-8 so you should name that charset in
> the XML declaration line).
> So basically what you have when you are done is a text based
> hierarchical data structure that's extensible and machine readable. That's
> all XML is.

yeppers. I luv xml. :)
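
A rough sketch of those stricter rules in action. It is in Python rather
than PHP just to keep it short, and it only uses the standard library
parser; the helper names read_book and is_well_formed are made up for this
example. It reads the book record quoted above into a plain dict, then shows
the parser refusing an unquoted attribute and a close tag whose case does
not match.

```python
import xml.etree.ElementTree as ET

# The book record from the quoted message, joined onto one line.
BOOK_XML = (
    '<book type="paperback">'
    '<AUTHOR>Joe Blogs</AUTHOR>'
    '<TITLE>SATs - How to be beaten by the system</TITLE>'
    '<SUBJECT>Test preparation</SUBJECT>'
    '</book>'
)


def read_book(xml_text):
    """Return the record as a plain dict (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    record = {"type": root.get("type")}
    for child in root:
        # Tag names are case sensitive, so "AUTHOR" stays "AUTHOR".
        record[child.tag] = child.text
    return record


def is_well_formed(xml_text):
    """True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(read_book(BOOK_XML))
    # Unquoted attribute value: rejected.
    print(is_well_formed('<book type=paperback></book>'))
    # Close tag differs in case from the open tag: rejected.
    print(is_well_formed('<book><TITLE>x</title></book>'))
```

An HTML parser would shrug at both of the broken inputs; an XML parser
stops at the first one it hits, which is the whole point.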

> Now the things you can do with it. Obviously for one I can use this to
> serialize objects in PHP very easily since I can store objects in XML
> representation which is just a string to be saved. The WDDX module does that
> in some standard way.

yes, and you can use it for document storage, remote procedure calls, EDI,
_anything_ that has a data payload :)!

> A note about standards. Since XML is extensible, each format needs to be
> specified so that I can communicate with you and we understand each other.
> XML is really more like a markup language FORMAT than a language (or seen
> another way, it's a standard but not a specification). There are various
> specifications and attempts at specifications out there, and they are
> usually referred to as DTDs (Document Type Definitions) or Schemas. It used
> to be you specified your Schema in another markup language called SGML but
> then some people figured if XML is so extensible you should be able to
> specify your own Schema in a Schema language which itself is XML. This is
> known, not surprisingly, as XML Schema, which represents another thing you
> can do with XML: use XML to specify XML data formats.

for example, a DTD for that book example above is only useful if a bunch
of people agree on it as a standard, otherwise it doesn't do any real good.


> A useful one for web programmers right now is you can use XML to turn
> XML into other XML formats. This is done through XSL-T (eXtensible
> Stylesheet Language - Transform) which is built into a PHP module called
> Sablotron (Side Note: I couldn't compile Sablotron 0.50 in PHP yet, it

yes, this is a slightly more sad situation. (feel free to flame away - but
you have to do it on this list, not privately :)

XSL (and XSLT) are fairly good for transforming between two document types,
but some people have started using it as a template mechanism for XML ->
HTML.

Problem is, you have to write like _ten_times_ as much XSL as html just to
get a basic template going.

XSL is like CFML but the design is even worse (gah) and it can't talk to a
database :)! (ok, that's not a perfect example...)

> fails during the linking step in Apache and claims that it can't find some
> library that is in Expat). Sablotron (and many XSLT processors) is a little
> more robust in that you can use it to transform XML into HTML and text too.
> This warrants a bit larger description...
> 
> Basically XSL works by taking an input XML file (we'll call this the
> "data store") and using another XML file written using the XSL specification
> (we'll call this the "rules file") to create another file in a different XML
> format (we'll call this the "presentation file"). Obviously when the
> presentation file is in XML, we can chain another rules file to it to make
> another presentation file and so on. XSLT parsers such as Sablotron allow us
> to do just that. Why is this powerful? The best way is through examples
> 
> (1) Our company builds a search engine that goes out and does a
> real-time travel comparison engine of 25 separate travel websites. Given
> that each search does this, we offload this to a business rules server that
> creates this and returns the results. Because we add sites and features
> almost at will, this messaging standard had to be extensible. The webserver
> has to communicate with this business rules server and understand it. A
> stylesheet can ensure that the message that gets sent to the web server is
> always in line with what the webserver can understand even if we upgrade our
> features on the business rules server.

That is cool, but why XSLT? Why not a bit of php code that is comparably
quite small?
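
Roughly what I have in mind, sketched in Python instead of PHP to keep it
short. Everything specific here is made up for illustration: the <results>
and <fare> element names, the KNOWN_FARE_FIELDS list, and the
normalize_results helper. It rebuilds the message with only the fields the
web server understands, so anything new the rules server adds is dropped
instead of breaking the front end.

```python
import xml.etree.ElementTree as ET

# Hypothetical: the only fields the web server knows how to render.
KNOWN_FARE_FIELDS = ("site", "price")


def normalize_results(xml_text):
    """Rebuild a <results> message keeping only the known fields."""
    source = ET.fromstring(xml_text)
    out = ET.Element("results")
    for fare in source.iter("fare"):
        kept = ET.SubElement(out, "fare")
        for name in KNOWN_FARE_FIELDS:
            field = fare.find(name)
            if field is not None:
                ET.SubElement(kept, name).text = field.text
    # encoding="unicode" returns a str with no XML declaration.
    return ET.tostring(out, encoding="unicode")


if __name__ == "__main__":
    upgraded = (
        '<results version="2"><fare>'
        '<site>a.example</site><price>199</price>'
        '<loyalty>gold</loyalty>'
        '</fare></results>'
    )
    print(normalize_results(upgraded))
```

That is the whole "stylesheet", and it is ordinary code you can debug with
ordinary tools.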

> (2) Furthermore, we have some nasty internal business rules embedded in
> our XML data store on the business rules server. An XSLT filter allows us to
> remove these internal business rules before delivery. This makes our
> business objects resellable to third parties as an application service
> without compromising our internal ones and requiring much coding. We can use
> the same XML data store to store private and public information.

I don't understand this, but that isn't surprising because there isn't
enough context :)

> (3) The webserver itself needs to parse and deliver the data. That data
> may vary on our site vs. a cobranded site. With XSLT you can transform XML
> on the fly to XHTML (HTML reformulated as XML) and tack on your presentation layer
> (nice little font tags and setting the color and whatnot). A different XSLT
> for a different browser or cobrand, yet the same datastore for all of them.
> This is called "separating your presentation from your data". Microsoft
> calls this 3-tiering, n-tiering, DNA, NetDocs, and now dotNet. (Well some of
> the latter ones are a bit more than just 3-tiering, but the basic idea is
> intact).

yes, this is a central argument I've had with a few people: using XML and
parsing all that data coming out of a database is a huge hassle. XML should
be an output format and a storage format, used somewhat similarly in a
system as HTML.

for example, if you're talking to a browser, spit out html.
if you're talking to an XML-RPC server, spit out XML
if you're talking to a cell phone, spit out WML, etc.
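
In code, that is just one renderer per kind of client, all fed the same
rows from the database. A minimal sketch in Python (not PHP, only to keep
it short); the client labels, the row fields, and the render_* function
names are all invented for this example. The XML-RPC branch uses the
standard library's xmlrpc.client.dumps to build the response.

```python
from html import escape
from xml.sax.saxutils import escape as xml_escape
import xmlrpc.client


def render_html(rows):
    """Plain HTML list for a browser."""
    items = "".join(
        f"<li>{escape(r['site'])}: {escape(r['price'])}</li>" for r in rows
    )
    return f"<ul>{items}</ul>"


def render_xmlrpc(rows):
    """XML-RPC method response carrying the rows as an array of structs."""
    return xmlrpc.client.dumps((rows,), methodresponse=True)


def render_wml(rows):
    """Bare-bones WML deck for a phone."""
    lines = "".join(
        f"<p>{xml_escape(r['site'])}: {xml_escape(r['price'])}</p>"
        for r in rows
    )
    return f"<wml><card>{lines}</card></wml>"


# Hypothetical client labels mapped to output formats.
RENDERERS = {
    "browser": render_html,
    "xml-rpc": render_xmlrpc,
    "phone": render_wml,
}


def render(client, rows):
    """Pick the output format from the kind of client we are talking to."""
    return RENDERERS[client](rows)


if __name__ == "__main__":
    rows = [{"site": "a.example", "price": "199"}]
    for client in RENDERERS:
        print(render(client, rows))
```

The data never passes through an intermediate XML file; XML only shows up
when the thing on the other end actually wants XML.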

I disagree with the idea that using schema-defined files as a base, paired
with gobs and gobs of XSL is the groovy way to go.

anyway.

> Well I hope you get the idea. I'm sure you can think of other things
> such as...
> 
> (4) Oracle, Microsoft and others now allow you to query their databases
> in XML. So do companies (in my field) such as Apollo and Sabre. These are
> hardly compatible, nor do they in any way represent something that is
> comfortable to manipulate. An XSLT layer as a data abstraction layer allows
> you to transform someone else's standard into an internal one you can
> manipulate in a known way.

aargh :)

10x the code, and it doesn't get you anything :)

I heard about that oracle XML-query thing, gotta check it out...

but I'm rather fond of SQL.

> (5) I mentioned XML is machine readable. Imagine: web pages are not very
> machine readable. Need I say more?

yes, but only if the machine knows the rules of the document. that's been a
problem: foofy non-standard or proprietary formats that no one bothers to
document. :)

> I could go on, but I'm not an imaginative fellow.

seem pretty well informed to me :).

> Let's see. There's also PDF. PDF isn't in XML but there is something
> called XSL-FO (formatting objects) which some PDF generators (perhaps the
> two that PHP has modules for?) understand. So writing XSL-FO for output for
> a PDF generator is "doing XML" also.

yeah, apparently this is a complete nightmare. (speaking as someone who has
not actually done it, mind you :)

> Then there is the fact that it is hierarchical data which sometimes,
> despite all these neat tools, needs to be parsed and understood. If we had a
> standardized API for manipulating it, then all the knowledge in learning the
> API for say Visual Basic on a Windows box can be transferred to doing it in
> C++ on AIX or perhaps PHP? Yes, there are two such standards, one is an
> event driven one (reads a tag and calls a callback function) known as "SAX"
> (Simple API for XML) and implemented in Expat (--with-xml in PHP which
> is compiled by default in PHP4) and the other reads the whole thing into a
> hierarchical (treelike) object structure called the DOM (Document Object
> Model) which is implemented in libxml (--with-domxml or somesuch in PHP).

read:

SAX is fast as hell, and works really well for simple, not-very-nested xml
documents. it is event based:

Ohhhhh! A tag!
I'll call a function now to handle the fact that I have encountered a tag!
have a nice day!
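
Here is that conversation as a sketch, using Python's standard xml.sax
module (Python rather than PHP just for brevity). The TagLogger class and
the list_tags helper are names I made up; the parser calls startElement
once for every opening tag it meets, in document order, and never builds a
tree.

```python
import xml.sax


class TagLogger(xml.sax.ContentHandler):
    """Remember every opening tag the parser reports."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def startElement(self, name, attrs):
        # Called by the parser each time it encounters an opening tag.
        self.seen.append(name)


def list_tags(xml_text):
    """Return the tag names in the order the parser met them."""
    handler = TagLogger()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.seen


if __name__ == "__main__":
    doc = "<book><AUTHOR>Joe Blogs</AUTHOR><TITLE>SATs</TITLE></book>"
    print(list_tags(doc))
```

Nothing is kept unless the handler keeps it, which is why this stays cheap
on big documents.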

DOM is the serious, bad-ass big brother of SAX. A DOM parser, if it is a
validating one, reads the DTD associated with that document and enforces
all kinds of inconvenient rules about form and correctness :) - it then
takes the _entire_ document, and turns it into a hierarchy in memory (which
is what an XML doc is).
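
For contrast with the event style, a small sketch with Python's standard
xml.dom.minidom (again Python only for brevity, and minidom does not
validate against a DTD). The function name retitle_and_drop_subject is
invented for the example. The whole document is loaded as a tree first,
then a text node is changed and a child node is removed.

```python
from xml.dom import minidom


def retitle_and_drop_subject(xml_text, new_title):
    """Load the whole tree, edit TITLE text, remove SUBJECT nodes."""
    doc = minidom.parseString(xml_text)
    book = doc.documentElement

    # Change the text node inside the first <TITLE>.
    title = book.getElementsByTagName("TITLE")[0]
    title.firstChild.data = new_title

    # Remove every <SUBJECT> element from its parent.
    for subject in book.getElementsByTagName("SUBJECT"):
        subject.parentNode.removeChild(subject)

    return book.toxml()


if __name__ == "__main__":
    doc = (
        '<book type="paperback"><AUTHOR>Joe Blogs</AUTHOR>'
        '<TITLE>Old</TITLE><SUBJECT>Test preparation</SUBJECT></book>'
    )
    print(retitle_and_drop_subject(doc, "New"))
```

Handy when you need to move around in the document, but the entire thing
sits in memory while you do.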

I like dom too, but it has problems right now.

Also, most XML stuff I'm doing is _so_ simple (which is nice) that using SAX
is all that's really necessary.

once again, convenience reigns over correctness :)

> I personally prefer the DOM version of looking at things (it's a bit
> slower and chews more memory). Unfortunately the dom-xml in PHP doesn't much
> resemble anyone else's DOM (at least not Oracle's or Microsoft's), it's a bit
> buggy (for instance, you can't remove a node, nor can you seem to modify the
> text in a node) and it chews a whole slew of memory (much more than I'd
> expect and that amount has almost doubled since they've incorporated XPath
> support with PHP 4.0.4+).
>
> Now a final reason why PHP and XML should go hand in hand. (Because if
> you haven't figured out by now, I'm really big on PHP and on XML). A study I
> read estimates that by the end of this year over 50% of the Fortune 500 will
> be using XML in some "test bed" situation and by 2004 80% of all business
> communication on the internet will use XML.

totally.

virtually all business communications at the machine level are being
converted to XML.


> I hope this gives you some motivation to pick up XML and use PHP to do
> so. Because the more of us that are doing so, the more developers there will
> be working (or pushing others to work) on improving XML support in PHP. With
> more robust cool tools like the PHP developers have already given me (and
> hopefully will continue to do so), I won't feel so bad about studying
> condensed matter physics and neuroscience for the last 9 years instead of
> majoring in something real like computer science and learning how to code in
> Java and C++.

nah, the former's more interesting anyway :)

> Take care,
> 
> terry chay

_a

(oh yes, the obligatory ad for binarycloud:

go to http://www.binarycloud.com - it's a beefy web application platform in
PHP!)


-- 
PHP General Mailing List (http://www.php.net/)