So it is as expected: memory behavior is better, but it doesn't fix the performance. Not ideal, but a good step in the right direction :)
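
For reference, the per-instance reuse discussed in the thread below could look roughly like the following. This is only a minimal sketch under stated assumptions, not the actual VXQuery XMLParser: the class name ReusableXmlParser, the method countElements, and the element-counting handler are hypothetical and stand in for whatever the real content handler does. The only point it illustrates is that the XMLReader and its handler are created once per owning instance and reset per document, instead of being allocated per tuple.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not the VXQuery implementation.
final class ReusableXmlParser {

    // The handler is reset at the start of every document instead of being reallocated.
    private static final class CountingHandler extends DefaultHandler {
        int elements;

        @Override
        public void startDocument() {
            elements = 0;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            elements++;
        }
    }

    // Created once per owning instance and reused for every parse call.
    private final XMLReader reader;
    private final CountingHandler handler = new CountingHandler();

    ReusableXmlParser() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("could not create XML reader", e);
        }
    }

    // Parses one document with the shared reader and returns its element count.
    int countElements(String xml) {
        try {
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.elements;
        } catch (IOException | SAXException e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }
}
```

An object like this is not thread-safe, which is why each NodePushable/ScalarEvaluator instance would hold its own rather than sharing one.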

On Feb 12, 2014, at 12:40 PM, Eldon Carman <[email protected]> wrote:

> I updated the XMLParser to be reused by each NodePushable/ScalarEvaluator
> instance. The query time did not change much, although the heap memory
> usage is more steady and does not grow significantly while processing the
> query.
>
> Yes, a few remaining functions can be removed.
>
>
> On Tue, Feb 11, 2014 at 11:51 PM, Till Westmann <[email protected]> wrote:
>
>> I agree that we should re-use the XML parser. While we generally are
>> careful to keep per-tuple memory allocation minimal, we are generously
>> allocating XMLReaders and SAXContentHandlers. I'm not sure that this will
>> account for the difference (do you have more details on the time spent
>> during parsing?), but this would certainly be a better approach.
>> To make sure that we're not stepping on each other's feet, we should
>> however have one XMLParser object for each NodePushable/ScalarEvaluator
>> instance.
>>
>> Wrt. the cost of the select expression I guess that we still have a number
>> of functions in there that are not strictly necessary. Is that right?
>>
>> In any case I think that we should now try to focus on parallelization and
>> parallel performance and not necessarily on single thread performance.
>>
>> Does this make sense?
>>
>> Cheers,
>> Till
>>
>> On Feb 11, 2014, at 8:46 AM, Eldon Carman <[email protected]> wrote:
>>
>>> The compiling and parsing for both Saxon and VXQuery consume a large
>>> amount of the query time. Saxon definitely has improved their parsing
>>> efficiency and later query processing. Take a look at these numbers:
>>>
>>> VXQuery compile time 700 to 1600ms (<1% of total query time)
>>> Saxon compile time 230 to 260ms (<3% of total query time)
>>>
>>> Using a profiler...
>>> VXQuery parsing time 335,000ms (43% of total query time)
>>> Saxon parsing time 17,500ms (88% of total query time)
>>>
>>> VXQuery remaining time (56% of total query time)
>>> Saxon remaining time (11% of total query time)
>>>
>>> Notice the huge difference in time dedicated to parsing for VXQuery. Also
>>> note the remaining time, i.e. the rest of the query time. Most of that
>>> time for VXQuery is in the select expression (50% of total query time)
>>> while Saxon has no noticeable time spent on evaluating the select
>>> expression.
>>>
>>>
>>> Seeing the difference in parsing, I found these two articles about
>>> improving the XML Parser:
>>> http://www.ibm.com/developerworks/xml/library/x-perfap1/index.html
>>> http://www.ibm.com/developerworks/library/x-perfap2/
>>>
>>> I think the section on reusing the parser would be a big help for us.
