Re: [OT] RE: web.xml DTD for Servlet 2.3 & Struts 1.1

2004-03-04 Thread Craig R. McClanahan
Quoting Wendy Smoak <[EMAIL PROTECTED]>:

> A long time ago, Craig McClanahan wrote:
> > It is a common misconception that the public identifiers of a DTD like
> > this *must* actually be working URLs [...].
> > They are just unique strings of characters that
> > (often) happen to look like URLs.  
> > Blame the XML community for that :-).
> 
> 
> And then Yuan Saul asked:
> > If a local copy of DTD is not available, then an Internet connection is
> > required, in this case, does the URI has to be pointing to a working URL
> > where the DTD file can be retrieved?
> 
> Which is also my question, but there is no reply in the archives.
> Anyone?
> 
> Today we got a note from campus IT saying that they believed some
> problems in their J2EE apps were related to "code that connects to
> http://java.sun.com behind-the-scenes to download various DTD files
> related to parsing XML documents."
> 
> In addition to whether it happens at all (going out to the internet to
> retrieve the DTD) I'm also curious if it's the XML parser, or the
> Servlet container, etc.  What component would make the call out to get
> the DTD?
> 
> I've always wondered...
> 

Since Wendy spends quite a bit of time answering questions for users, it's only
fair that I answer this one for her :-).

The answer depends on your XML parser, and you can get involved in the
process if you want to, but for simple use cases the answer is "yes".  If
you're interested, here are a few details about how Struts (and Tomcat, for
that matter) uses the commons-digester module to parse configuration files:

* Your XML document includes a DOCTYPE header defining the DTD.  For a
  Struts config file, it would look like:

  <!DOCTYPE struts-config PUBLIC
    "-//Apache Software Foundation//DTD Struts Configuration 1.1//EN"
    "http://jakarta.apache.org/struts/dtds/struts-config_1_1.dtd">

* This header includes two identifiers for the DTD ... the *public* identifier
  "-//Apache ..." and the *system* identifier (in this case, a URL).

* Commons Digester uses the SAX parsing APIs provided by the parser.
  Included in these APIs is the EntityResolver interface, which Digester
  implements.

* The parser calls the resolveEntity() method of the EntityResolver
  (i.e. the Digester instance), asking it to return an InputSource
  so the parser can read the DTD's contents.

* The default EntityResolver defined by SAX simply uses the system id
  as a URL and attempts to retrieve it.  With the system identifier
  given above, it will go to the jakarta.apache.org site across the
  Internet.

* Digester, however, is "smarter than the average bear" (if you remember
  Yogi Bear from growing up days :-).  It allows you to register a
  mapping from a public identifier ("-//Apache ...") to an *internal*
  resource inside the JAR file.  When the parser calls resolveEntity() with
  one of the registered public ids, Digester ignores the system id and
  returns a stream to the internal resource.  If the public id is not
  recognized, it will do the usual thing (using the system id instead).

* Struts pre-registers the public ids for the various versions of the
  DTD, pointing at internal resources (see the initConfigDigester() method
  of ActionServlet), so that it will never need to use the system id.

In this way, you can run Struts based applications (and Tomcat, which does the
same thing for the DTDs for web.xml files) completely disconnected from the
Internet, without changing the system ids in your XML documents.
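
As a rough illustration of the lookup described above (not Digester's actual
code), here is a minimal sketch that uses only the JDK's SAX classes.  The
class name LocalDtdResolver, its register() and demo() methods, and the
one-line stand-in DTD are all made up for this example; a real setup would
point at a DTD file bundled inside the JAR.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical resolver: maps registered public ids to local DTD text.
class LocalDtdResolver implements EntityResolver {
    // public id -> DTD text (stands in for a resource inside a JAR)
    private final Map<String, String> registered = new HashMap<>();
    boolean resolvedLocally = false;

    void register(String publicId, String dtdText) {
        registered.put(publicId, dtdText);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String dtd = (publicId == null) ? null : registered.get(publicId);
        if (dtd == null) {
            // Unknown public id: returning null tells the parser to fall
            // back to its default behaviour (open the system id as a URL).
            return null;
        }
        resolvedLocally = true;
        InputSource source = new InputSource(new StringReader(dtd));
        source.setSystemId(systemId);
        return source;
    }

    // Parses a tiny document whose DOCTYPE uses the Struts 1.1 identifiers
    // and reports whether the DTD was served from the local registration.
    static boolean demo() {
        String publicId =
            "-//Apache Software Foundation//DTD Struts Configuration 1.1//EN";
        String systemId =
            "http://jakarta.apache.org/struts/dtds/struts-config_1_1.dtd";
        String xml = "<!DOCTYPE struts-config PUBLIC \"" + publicId + "\" \""
                   + systemId + "\"><struts-config/>";

        LocalDtdResolver resolver = new LocalDtdResolver();
        // Stand-in DTD for illustration only, not the real Struts DTD.
        resolver.register(publicId, "<!ELEMENT struts-config EMPTY>");
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setEntityResolver(resolver);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return resolver.resolvedLocally;
    }

    public static void main(String[] args) {
        System.out.println("resolved locally: " + demo());
    }
}
```

The lookup is keyed on the public id alone, which is why a misspelled public
id silently falls through to the system id.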

If you find that your application is attempting to go to the Internet for DTDs
anyway, the most likely explanation is that you have a typo in the public
identifier in your config file.

> -- 
> Wendy Smoak
> Application Systems Analyst, Sr.
> ASU IA Information Resources Management 
> 

Craig


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



[OT] RE: web.xml DTD for Servlet 2.3 & Struts 1.1

2004-03-04 Thread Joe Germuska
> Today we got a note from campus IT saying that they believed some
> problems in their J2EE apps were related to "code that connects to
> http://java.sun.com behind-the-scenes to download various DTD files
> related to parsing XML documents."
>
> In addition to whether it happens at all (going out to the internet to
> retrieve the DTD) I'm also curious if it's the XML parser, or the
> Servlet container, etc.  What component would make the call out to get
> the DTD?
>
> I've always wondered...

It's the XML parser that is used by the Servlet container which goes 
looking for the DTD so that it can perform a validating parse of the 
XML.  In most cases, a validating parse is not strictly required 
(although prior to Struts 1.1, Struts depended on a validating parse 
of struts-config.xml to fill in certain default values, etc.)

With yesterday's outage of java.sun.com, we solved the hang-on-launch 
problem by disabling validation.  For Jetty (which is the servlet 
container for our JBoss deployment), you can do this by defining a 
system property:

org.mortbay.xml.XmlParser.NotValidating=true

How you actually control this would be dependent on the servlet 
container and its own configuration.

If you are processing XML using SAX, you can also intervene in the 
entity resolution.  Commons Digester does this, for instance, by 
implementing the org.xml.sax.EntityResolver interface.

http://java.sun.com/j2se/1.4.2/docs/api/org/xml/sax/EntityResolver.html
http://java.sun.com/j2se/1.4.2/docs/api/org/xml/sax/helpers/DefaultHandler.html
http://jakarta.apache.org/commons/digester/xref/org/apache/commons/digester/Digester.html#1661
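
For what it's worth, here is one way that intervention might look when you
just want the parse to stay offline: a sketch, assuming a non-validating
parse is acceptable.  The class name OfflineHandler and its parseOffline()
helper are invented for this example.  It extends DefaultHandler (which
already implements EntityResolver) and hands the parser an empty DTD instead
of letting it open the system id.  Note that with an empty DTD you lose any
default attribute values or entities the real DTD would have supplied.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records which DTDs were requested and answers
// every request with an empty stream, so nothing is fetched remotely.
class OfflineHandler extends DefaultHandler {
    final List<String> requestedSystemIds = new ArrayList<>();

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        requestedSystemIds.add(systemId);
        // An empty InputSource is read as an empty external DTD subset.
        return new InputSource(new StringReader(""));
    }

    // Parses the given XML without validation and returns the system ids
    // the parser asked for along the way.
    static List<String> parseOffline(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false);
            SAXParser parser = factory.newSAXParser();
            OfflineHandler handler = new OfflineHandler();
            // SAXParser.parse() installs the handler as entity resolver too.
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.requestedSystemIds;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE web-app PUBLIC "
                   + "\"-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN\" "
                   + "\"http://java.sun.com/dtd/web-app_2_3.dtd\"><web-app/>";
        System.out.println("DTDs requested: " + parseOffline(xml));
    }
}
```
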
Joe

--
Joe Germuska
[EMAIL PROTECTED]  
http://blog.germuska.com
  "Imagine if every Thursday your shoes exploded if you tied them 
the usual way.  This happens to us all the time with computers, and 
nobody thinks of complaining."
-- Jef Raskin



Re: [OT] RE: web.xml DTD for Servlet 2.3 & Struts 1.1

2004-03-04 Thread nicolas De Loof
Assuming these resources are right:
http://www.searchwin.net/doctype.htm
http://www.blooberry.com/indexdot/html/tagpages/d/doctype.htm
... I think the last (optional) part of a DOCTYPE declaration has to be a 
valid URL to the referenced DTD. Perhaps I'm wrong, because I didn't 
find a normative document about this.

Actually, XML parsers use this URL to look on the Internet for a DTD 
when they don't have a local one for the PUBLIC identifier found in the 
document.

Nico.

Wendy Smoak wrote:

> A long time ago, Craig McClanahan wrote:
> > It is a common misconception that the public identifiers of a DTD like
> > this *must* actually be working URLs [...].
> > They are just unique strings of characters that
> > (often) happen to look like URLs.
> > Blame the XML community for that :-).
>
> And then Yuan Saul asked:
> > If a local copy of DTD is not available, then an Internet connection is
> > required, in this case, does the URI has to be pointing to a working URL
> > where the DTD file can be retrieved?
>
> Which is also my question, but there is no reply in the archives.
> Anyone?
>
> Today we got a note from campus IT saying that they believed some
> problems in their J2EE apps were related to "code that connects to
> http://java.sun.com behind-the-scenes to download various DTD files
> related to parsing XML documents."
>
> In addition to whether it happens at all (going out to the internet to
> retrieve the DTD) I'm also curious if it's the XML parser, or the
> Servlet container, etc.  What component would make the call out to get
> the DTD?
>
> I've always wondered...

 



[OT] RE: web.xml DTD for Servlet 2.3 & Struts 1.1

2004-03-04 Thread Wendy Smoak
A long time ago, Craig McClanahan wrote:
> It is a common misconception that the public identifiers of a DTD like
> this *must* actually be working URLs [...].
> They are just unique strings of characters that
> (often) happen to look like URLs.  
> Blame the XML community for that :-).


And then Yuan Saul asked:
> If a local copy of DTD is not available, then an Internet connection is
> required, in this case, does the URI has to be pointing to a working URL
> where the DTD file can be retrieved?

Which is also my question, but there is no reply in the archives.
Anyone?

Today we got a note from campus IT saying that they believed some
problems in their J2EE apps were related to "code that connects to
http://java.sun.com behind-the-scenes to download various DTD files
related to parsing XML documents."

In addition to whether it happens at all (going out to the internet to
retrieve the DTD) I'm also curious if it's the XML parser, or the
Servlet container, etc.  What component would make the call out to get
the DTD?

I've always wondered...

-- 
Wendy Smoak
Application Systems Analyst, Sr.
ASU IA Information Resources Management 
