Petr Pajas wrote:
> Depends on what you call slow...

About 10 seconds to process a document with 72318 nodes, or 40500 nodes after you skip the non-element, non-attribute nodes that don't require a schema check.

> ...and it does not make much sense using XPath lookup for this.

What would you suggest then?

> If your XPath expressions involve sorting of largish node sets (.//
> may trigger sorting)....

Looks like I am using .// in a couple spots, but for relatively small fragments of the schema, not the whole document. Here are the expressions I am using:

  /xs:schema/xs:eleme...@name='$name']
  ./xs:complexType | ./xs:simpleType
  /xs:schema/xs:complexty...@name='$type_name'] | /xs:schema/xs:simplety...@name='$type_name']
  ../xs:complexType/xs:simpleContent/xs:extension | ../xs:simpleType
  /xs:schema/xs:simplety...@name='$base_name']
  ../xs:simpleType/xs:restriction/xs:enumeration/@value
  .//xs:attribu...@name='$att_name']
  .//xs:restriction/xs:enumeration/@value

> then calling $doc->indexElements() once for the
> WSD schema file could help...

Makes sense, but no appreciable difference when looping over my test document 10 times (2nd one w/indexElements()):

  timethis 10: 110 wallclock secs (107.84 usr + 0.25 sys = 108.09 CPU) @ 0.09/s (n=10)
  timethis 10: 109 wallclock secs (108.11 usr + 0.17 sys = 108.28 CPU) @ 0.09/s (n=10)

> Traversing the tree in Perl and using $node->attributes may or may not
> be faster (note that the Perl-XS-Perl transitions are surprisingly
> expensive even if the function you call in C is a noop, so introducing
> more such calls may slow your program down as well).

If the schema is traversed only once, and cached in a Perl data structure, then it should result in fewer overall XS calls. (The schema is small - about 350 nodes - relative to the document.)

> I'd suggest: if the above don't help (and even if they do), look at
> the cases where you use the XPath and for stuff like the one you
> mentioned consider scanning once through all xs:attribute using @name
> and $type_def as keys.

Yeah, I think some sort of native Perl caching is going to be the way to go.

Thanks for the reply. (I forwarded a copy to the list.)

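For illustration, here is a minimal sketch of the "scan the schema once, then look things up in Perl" idea discussed above. It assumes XML::LibXML with XML::LibXML::XPathContext; the miniature schema, the helper name build_schema_cache(), and the layout of the cache hash are all made up for this example and are not taken from the actual validator.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical miniature schema, for illustration only.
my $xsd = <<'XSD';
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="color" type="ColorType"/>
  <xs:simpleType name="ColorType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="red"/>
      <xs:enumeration value="green"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType name="ItemType">
    <xs:attribute name="id" type="xs:string"/>
  </xs:complexType>
</xs:schema>
XSD

my $schema = XML::LibXML->load_xml(string => $xsd);
my $xpc    = XML::LibXML::XPathContext->new($schema);
$xpc->registerNs('xs', 'http://www.w3.org/2001/XMLSchema');

# Hypothetical helper: run each XPath query once over the (small) schema
# and keep the results in plain Perl hashes keyed by @name, so that the
# per-node checks on the large document need no further XPath calls.
sub build_schema_cache {
    my ($xpc) = @_;
    my %cache = (element => {}, type => {}, attribute => {}, enum => {});

    # Global element declarations, keyed by name.
    for my $el ($xpc->findnodes('/xs:schema/xs:element')) {
        $cache{element}{ $el->getAttribute('name') } = $el;
    }

    # Global type definitions, plus their attributes and enumerations.
    my $types = '/xs:schema/xs:complexType | /xs:schema/xs:simpleType';
    for my $type ($xpc->findnodes($types)) {
        my $name = $type->getAttribute('name');
        $cache{type}{$name} = $type;

        for my $att ($xpc->findnodes('.//xs:attribute', $type)) {
            my $att_name = $att->getAttribute('name');
            next unless defined $att_name;    # skip ref= declarations
            $cache{attribute}{$name}{$att_name} = $att;
        }
        for my $val ($xpc->findnodes(
                './/xs:restriction/xs:enumeration/@value', $type)) {
            $cache{enum}{$name}{ $val->getValue } = 1;
        }
    }
    return \%cache;
}

my $cache = build_schema_cache($xpc);

# Later lookups are ordinary hash accesses, with no XS round trip.
print "red is allowed\n" if $cache->{enum}{ColorType}{red};
print join(',', sort keys %{ $cache->{enum}{ColorType} }), "\n";
```

Whether this beats repeated findnodes() calls in practice would need to be measured, for example with the same timethis loop used above.
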
 -Tom

-- 
Tom Metro
Venture Logic, Newton, MA, USA
"Enterprise solutions through open source."
Professional Profile: http://tmetro.venturelogic.com/
_______________________________________________
Boston-pm mailing list
Boston-pm@mail.pm.org
http://mail.pm.org/mailman/listinfo/boston-pm