Neil,
I believe you should be _more_ strict regarding URIs. That, in fact, is
the "spirit" of XML in the first place. By the mid-90s, browser vendors
were encouraging all kinds of sloppy (i.e., not well-formed, etc.) HTML.
So I think strictness should be enforced in XML processors of all types.
On the other hand, not everybody (including myself) has a strong background
in SGML and the concept of systemIds, etc., so, as a resource, it would be
very helpful, in my opinion, to have a very good FAQ page or something,
with examples of real-world URIs, both relative and absolute, and
particularly such things as representation of filenames/locations on
different operating systems, platforms, whatever.
For example, below you provide the example:
"c:\myfile.xml" maps to file:///c/myfile.xml
I have found that in order to provide XML processors (such as XML Spy and
Xerces, among others) with a common-denominator URI, the following syntax
works (and many others do not):
"h:\myfile.xml" maps to file:////devmsg1/xml/myfile.xml
where "//devmsg1/xml" is the UNC name "\\devmsg1\xml" (i.e., a "network
share") that is "mapped" to the drive letter "H:" (which, of course, is
never going to be a good way of identifying a resource).
But I don't know if that's appropriate, and/or why it works.
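One way to see why the four-slash form behaves differently is to look at how a URI parser splits it. The sketch below uses java.net.URI from the JDK; the class name FileUriParts and the helper describe() are made up for illustration, and this is only a guess at the explanation, not a statement of what XML Spy or Xerces do internally. With four slashes the authority is empty and the path itself begins with "//", which Windows file APIs tend to read as a UNC name; with two slashes "devmsg1" is taken as the host instead.

```java
import java.net.URI;

// Illustrative only: shows how java.net.URI splits the two spellings.
final class FileUriParts {

    // Hypothetical helper: returns "authority|path" for a URI string.
    static String describe(String uriText) {
        URI uri = URI.create(uriText);
        return uri.getAuthority() + "|" + uri.getPath();
    }

    public static void main(String[] args) {
        // Empty authority, path keeps the UNC-style leading "//".
        System.out.println(describe("file:////devmsg1/xml/myfile.xml"));
        // "devmsg1" is parsed as the authority, path is "/xml/myfile.xml".
        System.out.println(describe("file://devmsg1/xml/myfile.xml"));
    }
}
```

Whether a given processor then opens the share correctly still depends on how it turns the path component back into a filename.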
Regards,
Michael McDonough
[EMAIL PROTECTED]
To: [EMAIL PROTECTED], [email protected]
cc:
Date: 09/10/2002 12:10 PM
Subject: filenames versus URI's
Please respond to xerces-j-user
Hi all,
There are a number of places where the parser has to interact with the file
system (e.g., in resolving systemId's, schemaLocation hints and Strings
supplied to our JAXP #parse methods.) To my knowledge, all of these
situations are expecting a URI--possibly relative--rather than a filename.
Historically--at least in recent history--we've been more and more
permissive in what we'll accept here. We can usually figure out, for
instance, that "c:\myfile.xml" maps to file:///c/myfile.xml. But recently,
there has been a deluge of reports that we can't handle filenames with
spaces or other characters disallowed by the URI spec, or that non-ASCII
characters can't be processed.
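As a rough illustration of what an application would have to do on its own side, the sketch below converts a drive-letter path to a file URI with java.net.URI, which percent-encodes the space, and uses toASCIIString() to escape non-ASCII characters as UTF-8. The class name WindowsPathToUri and the helper toFileUri() are hypothetical, the helper covers drive-letter paths only (not UNC names), and it keeps the colon after the drive letter. None of this is meant to describe the parser's own code.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative only: a caller-side conversion from a Windows path to a URI.
final class WindowsPathToUri {

    // Hypothetical helper for drive-letter paths such as "c:\dir\file.xml".
    static URI toFileUri(String windowsPath) {
        String path = windowsPath.replace('\\', '/');
        if (!path.startsWith("/")) {
            path = "/" + path; // "c:/x" becomes "/c:/x"
        }
        try {
            // The multi-argument constructor quotes illegal characters.
            return new URI("file", null, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot convert: " + windowsPath, e);
        }
    }

    public static void main(String[] args) {
        // The space is escaped as %20.
        System.out.println(toFileUri("c:\\my file.xml"));
        // Non-ASCII characters are escaped only by toASCIIString().
        System.out.println(toFileUri("c:\\docs\\\u6587\u4EF6.xml").toASCIIString());
    }
}
```

The one-slash spelling "file:/c:/..." that Java prints carries no authority, just like "file:///c:/...".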
It would be possible--in principle--to keep on becoming more accommodating.
It would make our code more complex, and for things like Chinese characters
the added complexity could well be rather substantial. Or, we
could change course and decide to allow only true URI's to be used
consistently, and restrict ourselves to making sure we can absolutize
relative URI's correctly in whatever context they're given.
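To make the "true URI's only" option concrete, here is a minimal sketch of absolutizing a relative reference against a base URI with java.net.URI.resolve. The class name ResolveSystemId, the helper absolutize(), and the sample base "file:///c:/schemas/main.xsd" are all invented for the example; it is not the parser's actual resolution code. Under this approach a raw filename containing a backslash or a space is simply rejected.

```java
import java.net.URI;

// Illustrative only: strict resolution of a systemId against a base URI.
final class ResolveSystemId {

    // Hypothetical helper: throws IllegalArgumentException if either
    // argument is not a syntactically valid URI reference.
    static URI absolutize(String baseUri, String systemId) {
        return URI.create(baseUri).resolve(systemId);
    }

    public static void main(String[] args) {
        String base = "file:///c:/schemas/main.xsd"; // assumed base URI
        System.out.println(absolutize(base, "types/common.xsd").getPath());
        System.out.println(absolutize(base, "../dtd/doc.dtd").getPath());
        // An absolute URI is returned unchanged.
        System.out.println(absolutize(base, "http://example.org/a.xsd"));
        try {
            absolutize(base, "c:\\my file.xml"); // a filename, not a URI
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The trade-off is the one described above: the code stays small, but callers have to supply well-formed URIs themselves.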
What do people think? Is it too much to ask of applications to provide
URI's rather than platform-dependent filenames? Do people think increasing
the complexity of our stream-processing code is worth whatever convenience
is gained? Is it acceptable that, by allowing filenames, we're violating
the letter of many specifications and probably not aiding the cause of
platform/parser independence, since we're being more permissive than other
products are likely to be?
All thoughts appreciated!
Cheers,
Neil
Neil Graham
XML Parser Development
IBM Toronto Lab
Phone: 905-413-3519, T/L 969-3519
E-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]