RE: EntityResolver not allowed to do enough

David N Bertoni/Cambridge/IBM Tue, 22 Oct 2002 17:31:18 -0700



> > According to the XSLT recommendation, when you call the document()
> > function with a single string argument, any relative URIs are
> > resolved using the base URI of the stylesheet.
>
> I am not speaking of the XSLT language, but of the Xalan implementation.
> Namely, where base munging should be done when constructing a full URI.
>
> If the current document is <http://www.w3.org/TR/xslt#document> and
> <foo.xml> is requested, should our URIResolver be passed, by Xalan,
> <http://www.w3.org/TR/xslt#document> as a base or
> <http://www.w3.org/TR/xslt>?
>
> In the first case, the URIResolver is tasked with the responsibility of
> finding the base path of the URI.
>
> In the second, Xalan does the work.

OK, I misunderstood the question by not noticing the fragment ID in the
URI.  I'm pretty much not interested in supporting fragment IDs.  Or let's
just say that I'm not interested in doing the work that's required to do it
properly.

So I guess the answer to your question is that your resolver will get
http://www.w3.org/TR/xslt#document and it's up to your code to "do the
right thing."
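
As a rough sketch of what "doing the right thing" might look like on the
resolver side: the helpers below are hypothetical (stripFragment and
resolveAgainstBase are not part of Xalan), they use std::string rather than
XalanDOMString, and they only handle a simple relative path, not full
RFC 3986 resolution.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: drop the fragment identifier ("#...") from a URI.
std::string stripFragment(const std::string& uri)
{
    const std::string::size_type pos = uri.find('#');
    return pos == std::string::npos ? uri : uri.substr(0, pos);
}

// Hypothetical helper: naive resolution of a simple relative path against
// a base URI. The fragment is removed first, then everything after the
// last '/' of the base is replaced by the relative path. It does not
// handle "..", absolute references, or a base without a path component.
std::string resolveAgainstBase(const std::string& relative,
                               const std::string& base)
{
    const std::string cleanBase = stripFragment(base);
    const std::string::size_type slash = cleanBase.rfind('/');
    if (slash == std::string::npos)
        return relative;
    return cleanBase.substr(0, slash + 1) + relative;
}

// Small usage check built on the URIs from this thread.
void exampleUsage()
{
    assert(stripFragment("http://www.w3.org/TR/xslt#document")
           == "http://www.w3.org/TR/xslt");
    assert(resolveAgainstBase("foo.xml", "http://www.w3.org/TR/xslt#document")
           == "http://www.w3.org/TR/foo.xml");
}
```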

> > http://xml.apache.org/xalan-j/apidocs/javax/xml/transform/Source.html
> >
> > How this would all work in C++ without all the run-time type
> > introspection of Java is still up in the air. One way would be for the
> > URIResolver to return an instance of a XalanDocument.
>
> Then we would have to use a different technique for XSL documents since
> they are parsed using SAX, right?

Well, one thing we could do is add the ability to the URI resolver to send
events to a SAX handler, if one is supplied, rather than return an instance
of a document.  This is something like the multiple overloads for
XMLParserLiaison::parseXMLStream.  Or, the stylesheet compiler could simply
accept the document and walk it to compile the stylesheet.  We already do
this when the stylesheet is embedded in the source document, although I
think that's really ugly, and the code is pretty rickety.

So, I guess what I'm saying is let's consider something like this:

class URIResolver
{
public:

    enum eResult
    {
        eOK,
        eFail,
        eNotSupported
    };

    virtual eResult
    resolve(
            const XalanDOMString&  theURI,
            const XalanDOMString&  theBaseURI,
            XalanDOMString&        theUniqueURI) = 0;

    virtual eResult
    resolve(
            const XalanDOMString&  theURI,
            const XalanDOMString&  theBaseURI,
            XalanDOMString&        theUniqueURI,
            XalanDocument*&        theDocument) = 0;

    virtual eResult
    resolve(
            const XalanDOMString&  theURI,
            const XalanDOMString&  theBaseURI,
            XalanDOMString&        theUniqueURI,
            ContentHandler*        theContentHandler,
            LexicalHandler*        theLexicalHandler,
            DTDHandler*            theDTDHandler) = 0;

    virtual eResult
    resolve(
            const XalanDOMString&  theURI,
            const XalanDOMString&  theBaseURI,
            XalanDOMString&        theUniqueURI,
            FormatterListener*     theFormatterListener) = 0;
};

The first one covers the case where you're just doing some fancy URI
hacking, or completely replacing the URI.  The others cover the cases where
you're not actually parsing a stream.
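
To make the first case a bit more concrete, here is a minimal sketch of a
resolver that only rewrites URIs. Everything in it is an assumption for
illustration: XalanDOMString is replaced by a std::string stand-in so the
code compiles on its own, the interface is trimmed to the first overload,
PrefixMappingResolver is a made-up class, and the reading of the
"not supported" result as "fall back to default handling" is my guess at
the semantics, not something settled.

```cpp
#include <cassert>
#include <string>

// Stand-in for Xalan's string class so the sketch is self-contained.
typedef std::string XalanDOMString;

// Trimmed-down version of the proposed interface (first overload only).
class URIResolver
{
public:
    enum eResult { eOK, eFail, eNotSupported };

    virtual ~URIResolver() {}

    virtual eResult
    resolve(
            const XalanDOMString&  theURI,
            const XalanDOMString&  theBaseURI,
            XalanDOMString&        theUniqueURI) = 0;
};

// Hypothetical resolver that redirects one URI prefix to another,
// for example to point at a local mirror.
class PrefixMappingResolver : public URIResolver
{
public:
    PrefixMappingResolver(const XalanDOMString& from, const XalanDOMString& to) :
        m_from(from), m_to(to)
    {
    }

    virtual eResult
    resolve(
            const XalanDOMString&  theURI,
            const XalanDOMString&  /* theBaseURI */,
            XalanDOMString&        theUniqueURI)
    {
        // URIs outside the mapped prefix are left to the caller.
        if (theURI.compare(0, m_from.size(), m_from) != 0)
            return eNotSupported;

        theUniqueURI = m_to + theURI.substr(m_from.size());
        return eOK;
    }

private:
    XalanDOMString  m_from;
    XalanDOMString  m_to;
};

// Small usage check.
void exampleUsage()
{
    PrefixMappingResolver resolver("http://www.w3.org/TR/", "file:///mirror/TR/");
    XalanDOMString unique;
    assert(resolver.resolve("http://www.w3.org/TR/xslt", "", unique) == URIResolver::eOK);
    assert(unique == "file:///mirror/TR/xslt");
}
```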

Of course, all of this is off the top of my head, so I may be spouting
garbage/gibberish...

> > Other choices would be for it to act as a sort-of URL constructor,
> > which is what I think you're thinking of.
>
> Yep.

Good. I'm glad I'm starting to grok where you're coming from.

> > That is, it can take two strings and determine what sort of URL can be
> > used to construct an InputSource that the parser can understand. That's
> > good for stream sources, but not for other sorts of sources.
>
> How so?

Because the "source" of the data is not a stream of bytes addressable by a
URL, so the parser has nothing to parse.  Remember, an EntityResolver must
return an InputSource, which pretty much means a stream of bytes to feed
the parser.  Consider the case where the user wants to construct a document
on-the-fly in response to a URI.  For example, imagine a custom URI scheme
that specifies a SQL query where the result set is used to create a
document.
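
A sketch of that scenario, under loudly stated assumptions: the "sql:"
scheme, the SimpleHandler interface (a stand-in, not the real SAX
ContentHandler), and all function names are invented for illustration, and
the result set is passed in rather than fetched from a database. The point
is only that the document arrives as events, with no byte stream for a
parser to read.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for a SAX-style handler; not the real Xerces/Xalan interface.
struct SimpleHandler
{
    virtual ~SimpleHandler() {}
    virtual void startElement(const std::string& name) = 0;
    virtual void characters(const std::string& text) = 0;
    virtual void endElement(const std::string& name) = 0;
};

// Hypothetical "sql:" scheme: if the URI uses it, copy the query text out.
bool extractSqlQuery(const std::string& uri, std::string& query)
{
    const std::string scheme("sql:");
    if (uri.compare(0, scheme.size(), scheme) != 0)
        return false;
    query = uri.substr(scheme.size());
    return true;
}

typedef std::vector<std::vector<std::string> > ResultSet;

// Turn an already-fetched result set into events. No escaping is done.
void emitResultSet(const ResultSet& rows, SimpleHandler& handler)
{
    handler.startElement("resultset");
    for (ResultSet::size_type r = 0; r < rows.size(); ++r)
    {
        handler.startElement("row");
        for (std::vector<std::string>::size_type c = 0; c < rows[r].size(); ++c)
        {
            handler.startElement("col");
            handler.characters(rows[r][c]);
            handler.endElement("col");
        }
        handler.endElement("row");
    }
    handler.endElement("resultset");
}

// Handler that records the events as text so the result can be inspected.
struct RecordingHandler : SimpleHandler
{
    std::string out;
    virtual void startElement(const std::string& name) { out += "<" + name + ">"; }
    virtual void characters(const std::string& text)   { out += text; }
    virtual void endElement(const std::string& name)   { out += "</" + name + ">"; }
};
```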

> > In all cases, we'd need a way to get a URI that's unique for that
> > document.
>
> Agreed.

Dave