I agree that we should re-use the XML parser. While we are generally careful to keep per-tuple memory allocation minimal, we are generously allocating XMLReaders and SAXContentHandlers. I'm not sure that this will account for the difference (do you have more details on the time spent during parsing?), but this would certainly be a better approach. To make sure that we're not stepping on each other's toes, we should however have one XMLParser object for each NodePushable/ScalarEvaluator instance.
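To make the idea concrete, here is a rough sketch of what I mean by one reusable parser per instance. It is not taken from the VXQuery code base: the class name ReusableXmlParser, the CountingHandler, and the countElements method are hypothetical and only stand in for our real parser wrapper and content handler. It relies only on the standard SAX guarantee that an XMLReader may be reused once a parse has completed.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: each NodePushable/ScalarEvaluator would own exactly
// one instance, so no parser state is shared between evaluators.
class ReusableXmlParser {
    // Stand-in for the real content handler. It resets its own state at the
    // start of each document instead of being reallocated per document.
    private static final class CountingHandler extends DefaultHandler {
        int elements;

        @Override
        public void startDocument() {
            elements = 0;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            elements++;
        }
    }

    // Allocated once in the constructor and reused for every parse call.
    private final XMLReader reader;
    private final CountingHandler handler = new CountingHandler();

    ReusableXmlParser() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        reader = factory.newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
    }

    // Parses one document with the shared reader and handler and returns
    // the number of elements seen in that document.
    int countElements(String xml) throws Exception {
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.elements;
    }
}
```

Because the instance is not shared across evaluators, it needs no synchronization. The cost is one reader and one handler per evaluator instead of one per tuple.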
Wrt. the cost of the select expression, I guess that we still have a number of functions in there that are not strictly necessary. Is that right?
In any case, I think that we should now try to focus on parallelization and parallel performance and not necessarily on single-thread performance. Does this make sense?

Cheers,
Till

On Feb 11, 2014, at 8:46 AM, Eldon Carman <[email protected]> wrote:

> The compiling and parsing for both Saxon and VXQuery consume a large amount
> of the query time. Saxon definitely has improved their parsing efficiency
> and later query processing. Take a look at these numbers:
>
> VXQuery compile time 700 to 1600ms (<1% of total query time)
> Saxon compile time 230 to 260ms (<3% of total query time)
>
> Using a profiler...
> VXQuery parsing time 335,000ms (43% of total query time)
> Saxon parsing time 17,500ms (88% of total query time)
>
> VXQuery remaining time (56% of total query time)
> Saxon remaining time (11% of total query time)
>
> Notice the huge difference in time dedicated to parsing for VXQuery. Also
> not the time not outlined as the rest of the query time. Most of that time
> for VXQuery is in the select expression (50% of total query time) while
> saxon has really no noticeable time spent on evaluating the select
> expression.
>
>
> Seeing the difference in parsing, I found these two articles about
> improving the XML Parser:
> http://www.ibm.com/developerworks/xml/library/x-perfap1/index.html
> http://www.ibm.com/developerworks/library/x-perfap2/
>
> I think the section on reusing the parser would be a big help for us.
