I updated the XMLParser to be reused by each NodePushable/ScalarEvaluator
instance. The query time did not change much, although the heap memory
usage is more steady and does not grow significantly while processing the
query.
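
For reference, the reuse pattern looks roughly like the sketch below. This is
not the actual VXQuery code: ReusableXmlParser, CountingHandler and
countElements are made-up names for illustration, and the element-counting
handler only stands in for the real content handler. The idea is that each
instance creates its XMLReader and handler once and reuses them for every
document, rather than allocating new ones per tuple.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical wrapper: one instance per evaluator, never shared across threads.
public class ReusableXmlParser {
    private final XMLReader reader;
    private final CountingHandler handler = new CountingHandler();

    public ReusableXmlParser() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        // Created once; SAX allows reusing an XMLReader after a parse completes.
        reader = factory.newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
    }

    // Parses one document with the shared reader and handler.
    public int countElements(String xml) throws Exception {
        handler.reset(); // clear per-document state instead of reallocating
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.count;
    }

    // Stand-in for the real content handler (assumption for this sketch).
    private static final class CountingHandler extends DefaultHandler {
        int count;

        void reset() {
            count = 0;
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) {
            count++;
        }
    }
}
```

Because the reader and handler hold per-parse state, keeping one such object
per NodePushable/ScalarEvaluator instance avoids any sharing between threads.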

Yes, a few remaining functions can be removed.


On Tue, Feb 11, 2014 at 11:51 PM, Till Westmann <[email protected]> wrote:

> I agree that we should re-use the XML parser. While we generally are
> careful to keep per-tuple memory allocation minimal, we are generously
> allocating XMLReaders and SAXContentHandlers. I'm not sure that this will
> account for the difference (do you have more details on the time spent
> during parsing?), but this would certainly be a better approach.
> To make sure that we're not stepping on each other's toes, we should
> however have one XMLParser object for each NodePushable/ScalarEvaluator
> instance.
>
> Wrt. the cost of the select expression I guess that we still have a number
> of functions in there that are not strictly necessary. Is that right?
>
> In any case I think that we should now try to focus on parallelization and
> parallel performance and not necessarily on single thread performance.
>
> Does this make sense?
>
> Cheers,
> Till
>
> On Feb 11, 2014, at 8:46 AM, Eldon Carman <[email protected]> wrote:
>
> > The compiling and parsing for both Saxon and VXQuery consume a large amount
> > of the query time. Saxon has definitely improved its parsing efficiency
> > and later query processing. Take a look at these numbers:
> >
> > VXQuery compile time 700 to 1600ms (<1% of total query time)
> > Saxon compile time 230 to 260ms (<3% of total query time)
> >
> > Using a profiler...
> > VXQuery parsing time 335,000ms (43% of total query time)
> > Saxon parsing time 17,500ms (88% of total query time)
> >
> > VXQuery remaining time (56% of total query time)
> > Saxon remaining time (11% of total query time)
> >
> > Notice the huge difference in time dedicated to parsing for VXQuery. Also
> > note the remaining time, i.e. the rest of the query time. Most of that time
> > for VXQuery is in the select expression (50% of total query time), while
> > Saxon spends no noticeable time evaluating the select expression.
> >
> >
> > Seeing the difference in parsing, I found these two articles about
> > improving the XML Parser:
> > http://www.ibm.com/developerworks/xml/library/x-perfap1/index.html
> > http://www.ibm.com/developerworks/library/x-perfap2/
> >
> > I think the section on reusing the parser would be a big help for us.
>
>
