So it is as expected: memory behavior is better, but that alone doesn't fix
the performance.
Not ideal, but a good step in the right direction :)
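
For reference, the reuse pattern discussed in the quoted thread below can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual VXQuery XMLParser: the names ReusableXmlParser, CountingHandler and countElements are hypothetical, and the element-counting handler stands in for the real content handler. It relies only on the standard JAXP/SAX API, where an XMLReader may be reused once a parse has completed. Since an XMLReader is not safe for concurrent use, each NodePushable/ScalarEvaluator instance would hold its own copy.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical wrapper: one instance per evaluator, created once and reused
// for every document instead of allocating a reader and handler per tuple.
final class ReusableXmlParser {

    // Hypothetical handler that counts elements; its state is reset at the
    // start of each document so the same object can serve many parses.
    static final class CountingHandler extends DefaultHandler {
        int elements;

        @Override
        public void startDocument() {
            elements = 0;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            elements++;
        }
    }

    private final XMLReader reader;
    private final CountingHandler handler = new CountingHandler();

    ReusableXmlParser() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            // Allocated once; every later parse goes through this reader.
            reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("could not create SAX parser", e);
        }
    }

    // Parses one document with the shared reader and returns its element count.
    int countElements(String xml) {
        try {
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.elements;
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        ReusableXmlParser parser = new ReusableXmlParser();
        System.out.println(parser.countElements("<a><b/><c/></a>")); // prints 3
        System.out.println(parser.countElements("<x/>"));            // prints 1
    }
}
```

As the numbers in this thread suggest, this mainly reduces per-document allocation, so it would be expected to help heap behavior more than raw parsing time.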

On Feb 12, 2014, at 12:40 PM, Eldon Carman <[email protected]> wrote:

> I updated the XMLParser to be reused by each NodePushable/ScalarEvaluator
> instance. The query time did not change much, although the heap memory
> usage is more steady and does not grow significantly while processing the
> query.
> 
> Yes, a few remaining functions can be removed.
> 
> 
> On Tue, Feb 11, 2014 at 11:51 PM, Till Westmann <[email protected]> wrote:
> 
>> I agree that we should re-use the XML parser. While we generally are
>> careful to keep per-tuple memory allocation minimal, we are generously
>> allocating XMLReaders and SAXContentHandlers. I'm not sure that this will
>> account for the difference (do you have more details on the time spent
>> during parsing?), but this would certainly be a better approach.
>> To make sure that we're not stepping on each other's toes, we should
>> however have one XMLParser object for each NodePushable/ScalarEvaluator
>> instance.
>> 
>> Regarding the cost of the select expression, I guess that we still have a
>> number of functions in there that are not strictly necessary. Is that right?
>> 
>> In any case, I think that we should now try to focus on parallelization and
>> parallel performance and not necessarily on single-thread performance.
>> 
>> Does this make sense?
>> 
>> Cheers,
>> Till
>> 
>> On Feb 11, 2014, at 8:46 AM, Eldon Carman <[email protected]> wrote:
>> 
>>> The compiling and parsing for both Saxon and VXQuery consume a large amount
>>> of the query time. Saxon definitely has improved their parsing efficiency
>>> and later query processing. Take a look at these numbers:
>>> 
>>> VXQuery compile time 700 to 1600ms (<1% of total query time)
>>> Saxon compile time 230 to 260ms (<3% of total query time)
>>> 
>>> Using a profiler...
>>> VXQuery parsing time 335,000ms (43% of total query time)
>>> Saxon parsing time 17,500ms (88% of total query time)
>>> 
>>> VXQuery remaining time (56% of total query time)
>>> Saxon remaining time (11% of total query time)
>>> 
>>> Notice the huge difference in time dedicated to parsing for VXQuery. Also
>>> note the time listed above as remaining query time. Most of that time
>>> for VXQuery is in the select expression (50% of total query time), while
>>> Saxon spends no noticeable time evaluating the select
>>> expression.
>>> 
>>> 
>>> Seeing the difference in parsing, I found these two articles about
>>> improving the XML Parser:
>>> http://www.ibm.com/developerworks/xml/library/x-perfap1/index.html
>>> http://www.ibm.com/developerworks/library/x-perfap2/
>>> 
>>> I think the section on reusing the parser would be a big help for us.
>> 
>> 
