"Terje Bless" <[EMAIL PROTECTED]> writes:

> Ok. I dug around XML-Deviant and XML-Dev archives a little, and I
> came across a curious reference from April Y2K or so that claims
> Xerces (-J, presumably) supports EntityResolution using XML Catalogs
> of the Cowan variety.
>
> Question: Is this correct?
>
> Question: Does Xerces-C++ -- and, by extension, Xerces-P -- do that?

Here's the email that talks about xerces-j support for XML Catalog:

http://marc.theaimsgroup.com/?l=xerces-j-dev&m=98272746310789&w=2

So the answer to #1 seems to be "Yes".

Here's a very nice article by Norm Walsh about the whole issue, with
pointers to the Java classes:

http://www.arbortext.com/html/issue%5Fthree.html

However, for xerces-c this is as close as we get:

http://marc.theaimsgroup.com/?l=xerces-c-dev&m=97966491924777&w=2

So, as far as I can tell, the answer to #2 is 'No': Xerces-c doesn't
have support for XML Catalog, so we'll have to add it to Xerces.pm.

> I'm guessing that the answer to the former is "Yes" but the answer
> to the latter is "No", or you would have used that instead of
> futzing with your own version using DOMParse in EntityResolver.t.
> 
> Given that, I'm continuing to tinker with my own Entity Manager in Perl.
>
> A little study has also revealed that OASIS has a running Committee
> on Entity Resolution using XML Catalogs that is about to publish a
> final specification for the mechanism using a significantly
> different (from the Cowan draft) syntax.
> 
> Question: Is it safe to assume that Xerces-* has not been updated to
>           use the new OASIS (draft) specification?

I'm assuming that the XML Catalog draft you're looking at is:

http://www.oasis-open.org/committees/entity/spec-2001-08-02.html

My searches turned up nothing indicating that either xerces-c or
xerces-j has begun to use this spec.

> Another issue is that the above specification seems to require that XML
> Catalogs be processed sequentially (i.e. the physical order in the file is
> significant).
> 
> Question: Will this require the use of SAX rather than DOM
> interfaces?

No, DOM preserves order. The tree structure that it builds in memory
exactly models the underlying structure of the document. So you can
traverse the DOM tree using a tree walker and get the elements in
exactly the same order as you would using SAX.
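
To make that concrete, here is a rough sketch of the comparison. It is
in Python with the standard xml.sax and xml.dom.minidom modules, not
Xerces.pm, so take it as an illustration of the DOM/SAX behaviour
rather than of our API. The sample document and the helper names
(Recorder, sax_order, dom_order) are made up for the example.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, with one <foo> nested a level deeper.
DOC = '<top><foo att="one"/><bar><foo att="two"/></bar><foo att="three"/></top>'


class Recorder(xml.sax.ContentHandler):
    """Record (element name, att value) for every start-element event."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def startElement(self, name, attrs):
        self.seen.append((name, attrs.get("att")))


def sax_order(text):
    """Element order as reported by SAX start-element events."""
    handler = Recorder()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.seen


def dom_order(text):
    """Element order from a depth-first walk of the DOM tree."""
    seen = []

    def walk(node):
        if node.nodeType == node.ELEMENT_NODE:
            # getAttribute returns "" when the attribute is absent.
            seen.append((node.tagName, node.getAttribute("att") or None))
        for child in node.childNodes:
            walk(child)

    walk(minidom.parseString(text).documentElement)
    return seen


if __name__ == "__main__":
    print(sax_order(DOC) == dom_order(DOC))
```

Both functions visit the elements in document order, so the two lists
should compare equal.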

The important question I had was whether the DOM accessor methods,
like getElementsByTagName(), preserve order, and a simple test
suggests that they do. Given:

<top>
 <foo att="one"/>
 <foo att="two"/>
 <foo att="three"/>
</top>

the debugger shows:

  DB<9> @e = $doc->getElementsByTagName('foo')
  DB<10> x map {$_->getAttribute('att')} @e
0  'one'
1  'two'
2  'three'

There is another important issue, though. I suspected that the catalog
entries needed to be handled in order, but to get something out with
1.5.4 I just dumped each element collection into a hash table (one for
PUBLIC ids, and another for SYSTEM ids). That throws away the order of
the entries in the file, so it might not work. I'll have to think
about it once I read the spec.
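
For what an order-preserving alternative to the two hash tables might
look like, here is a rough sketch, again in Python's xml.dom.minidom
rather than Xerces.pm. Everything in it is an assumption for
illustration: the element and attribute names (public/publicId,
system/systemId, uri), the sample catalog, the helpers load_entries
and resolve, and above all the "first match in document order wins"
rule, which I have not checked against the spec.

```python
from xml.dom import minidom

# Hypothetical catalog; element and attribute names are assumptions.
CATALOG = """<catalog>
  <public publicId="-//EXAMPLE//DTD One//EN" uri="one.dtd"/>
  <system systemId="http://example.org/two.dtd" uri="two.dtd"/>
  <public publicId="-//EXAMPLE//DTD Three//EN" uri="three.dtd"/>
</catalog>"""


def load_entries(text):
    """Return catalog entries as (kind, identifier, uri) in document order."""
    entries = []
    root = minidom.parseString(text).documentElement
    for node in root.childNodes:
        if node.nodeType != node.ELEMENT_NODE:
            continue  # skip whitespace text nodes between entries
        if node.tagName == "public":
            entries.append(("public", node.getAttribute("publicId"),
                            node.getAttribute("uri")))
        elif node.tagName == "system":
            entries.append(("system", node.getAttribute("systemId"),
                            node.getAttribute("uri")))
    return entries


def resolve(entries, public_id=None, system_id=None):
    """Return the uri of the first matching entry, scanning in file order.

    The first-match rule is an assumption, not taken from the spec.
    """
    for kind, ident, uri in entries:
        if kind == "public" and ident == public_id:
            return uri
        if kind == "system" and ident == system_id:
            return uri
    return None


if __name__ == "__main__":
    entries = load_entries(CATALOG)
    print(resolve(entries, public_id="-//EXAMPLE//DTD Three//EN"))
```

Keeping a single list means PUBLIC and SYSTEM entries stay interleaved
as they were in the file, which two separate hash tables cannot
record.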

> And as if I hadn't exposed enough of my ignorance on the subject yet;
> is there any reason to use SAX1 for new applications instead of SAX2? :-)

I'm not much of a SAX person, but as far as I can tell, no. The only
issue is that SAX1 might be better supported by Xerces.pm. I can't say
that for certain, but I just don't have very many tests in place for
SAX2.

I'm glad that you're working on this, Terje. When we add support for
this, Xerces.pm is likely to be the only Perl module to have support
for it.

jas.
