RE: EntityResolver not allowed to do enough

Brian Quinlan Fri, 18 Oct 2002 00:30:21 -0700

> Not right now, but that may be why Xalan-J created the notion of a
> URIResolver.


> Again, I think the job of a URIResolver, although I don't know how this
> fits in with EntityResolvers, because I don't think URIResolvers are
> only responsible for string manipulation of URIs -- I think they
> actually do that, _plus_ do the work of an EntityResolver.  In
> other words, they return an InputSource.

So, if the user installs a URIResolver, they must calculate the complete
URL themselves or choose not to create the InputSource. That's a bit
annoying but good enough for me.

If the user is responsible for URI resolution, are they also responsible
for maintaining the base URI? Is the base URI just the URI of the
"parent" of the thing that we are resolving, or must it be normalized in
some way? For example:

document('http://xml.apache.org/foo/foo.xml')

Is the base for relative document calls now
<http://xml.apache.org/foo/foo.xml> or <http://xml.apache.org/foo/>?

I ask because generating base URIs might be difficult for custom
schemes.
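
For hierarchical schemes, generic relative-reference resolution (RFC 3986)
drops everything after the last "/" in the base path before merging, so
either form of the base above leads to the same result. A minimal sketch
using Python's standard urllib.parse.urljoin; "bar.xml" is a made-up
relative reference, and whether Xalan applies the same rule is an
assumption on my part:

```python
from urllib.parse import urljoin

# "bar.xml" is a hypothetical relative reference, used only for illustration.
doc_base = "http://xml.apache.org/foo/foo.xml"  # URI of the parent document
dir_base = "http://xml.apache.org/foo/"         # parent URI cut back to its directory

# RFC 3986 merging removes the last path segment of the base before
# appending the relative reference, so both bases resolve identically.
from_doc = urljoin(doc_base, "bar.xml")
from_dir = urljoin(dir_base, "bar.xml")

print(from_doc)
print(from_dir)
```

A custom scheme with no hierarchical path has no such rule to fall back
on, which is where generating a base gets difficult.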

> This should only happen if the URI has the file scheme or doesn't have a
> scheme, in which case we assume it's a file URL.  If we're trying to do
> the realpath() thing when there's a scheme present, then that's a bug.

In both XMLURL and URISupport, unrecognized schemes (i.e. schemes other
than http, ftp and file) are considered to be no scheme at all. That was
the original problem that I reported:

scheme:foo/bar => file://<base-path>/scheme:foo/bar

On Linux, that will explode due to realpath. On Windows you can at least
try to extract your original URL amongst the garbage.
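
To make the failure concrete, here is a sketch in Python. It is not the
XMLURL or URISupport code: buggy_resolve, has_scheme and resolve are
hypothetical names, "/home/me" is a made-up base path, and the scheme
test follows the RFC 3986 scheme grammar.

```python
import re

# RFC 3986 scheme grammar: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) ":"
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")

KNOWN_SCHEMES = ("http", "ftp", "file")


def buggy_resolve(uri, base_path):
    """Mimic the reported behavior: an unrecognized scheme counts as no scheme."""
    scheme = uri.split(":", 1)[0].lower() if ":" in uri else ""
    if scheme in KNOWN_SCHEMES:
        return uri
    # Anything else is glued onto the base path as if it were a file name.
    return "file://" + base_path + "/" + uri


def has_scheme(uri):
    """True if the URI starts with any syntactically valid scheme."""
    return _SCHEME.match(uri) is not None


def resolve(uri, base_path):
    """Fall back to a file URL only when there is no scheme at all."""
    if has_scheme(uri):
        return uri
    return "file://" + base_path + "/" + uri


print(buggy_resolve("scheme:foo/bar", "/home/me"))
print(resolve("scheme:foo/bar", "/home/me"))
print(resolve("foo/bar.xml", "/home/me"))
```

One caveat: a Windows path such as C:\foo also matches the scheme
grammar, so a real implementation would need to special-case drive
letters.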

Of course, XMLUri is totally different, as one would expect given its
totally different name :-)

> OK, now I'm _really_ offended!  ;-)

I doubt it :-) 

Do you really understand the relationship between the various URI
processing systems?

> I believe everything is supposed to funnel through
> XPathEnvSupport::parseXML().  That way, the core code  doesn't need 
> to know about how the document was created.  This should apply to 
> parsing stylesheets from PIs, and also for xsl:include and xsl:import.

> I think this does work, 

Isn't the first concrete implementation of XPathEnvSupport::parseXML(),
XSLTProcessorEnvSupportDefault::parseXML()? 

In any case, isn't XSL compilation still done using a SAX stream? Since
...::parseXML() returns a XalanDocument, it seems like that cannot be
the bottleneck for XSL documents.


> although from the perspective of the XalanTransformer class, we don't
> really provide a hook to do this.

I don't use XalanTransformer so I don't really care :-)

> Isn't the problem really that you want to have a chance to see URIs 
> before they're used to parse something, so you provide an alternate 
> source for the byte stream?

Yes.

> In that case, I think you want more than
> just a hook to munge strings.  Don't you want a generic hook to 
> provide an InputSource based on some URI or relative URI/base URI 
> input?

I'm assuming that EntityResolver won't be taken away. Given the ability
to munge the base and relative URI into the final URI AND
EntityResolver, I can do what I want.

Or EntityResolver can go away or change such that I get the base and
relative URI and just return an InputSource directly.
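
The second option might look like the following sketch. MemoryResolver
and its resolve(href, base) method are hypothetical names, not an
existing Xalan interface; a file-like object stands in for an
InputSource, and returning None stands in for "fall back to the default
handling".

```python
import io
from urllib.parse import urljoin


class MemoryResolver:
    """Hypothetical hook: gets the relative and base URI, returns a byte stream."""

    def __init__(self, documents):
        # Maps absolute URIs to document bytes (illustrative data only).
        self._documents = dict(documents)

    def resolve(self, href, base):
        # The hook decides for itself how href and base are combined.
        absolute = urljoin(base, href)
        data = self._documents.get(absolute)
        if data is None:
            return None  # decline, so the caller can use its default behavior
        return io.BytesIO(data)


resolver = MemoryResolver({"http://xml.apache.org/foo/bar.xml": b"<bar/>"})
source = resolver.resolve("bar.xml", "http://xml.apache.org/foo/foo.xml")
print(source.read())
```

Because the hook sees both URIs, it can apply its own rules for custom
schemes instead of relying on the library to compute the final URI.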

> It sounds like we really should add something to do URI resolution like
> Xalan-J/TrAX has.  What do you think?

I don't know anything about Xalan-J/TrAX so I couldn't say.

Cheers,
Brian
