[ 
https://issues.apache.org/jira/browse/XERCESJ-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16265613#comment-16265613
 ] 

Mukul Gandhi commented on XERCESJ-1684:
---------------------------------------

I have a few questions related to your project.

I counted the elements that repeat in your sample XML document with this 
small XSLT stylesheet:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:ns0="urn:OECD:StandardAuditFile-Tax:PT_1.04_01"
                version="2.0">

    <xsl:output method="text"/>

    <!-- Print one count per line for each repeating element -->
    <xsl:template match="/">
        <xsl:value-of select="count(ns0:AuditFile/ns0:MasterFiles/ns0:Customer)"/>
        <xsl:text>&#xa;</xsl:text>
        <xsl:value-of select="count(ns0:AuditFile/ns0:MasterFiles/ns0:Product)"/>
        <xsl:text>&#xa;</xsl:text>
        <xsl:value-of select="count(ns0:AuditFile/ns0:SourceDocuments/ns0:SalesInvoices/ns0:Invoice)"/>
    </xsl:template>
</xsl:stylesheet>

This gives me the following counts (Customer, Product and Invoice, in that order):
144
160
30720

Will the number of these elements keep growing over time, as the business 
keeps adding these objects to your XML? If so, it is a false expectation 
that an <assert> placed near the root of the XML will not break due to 
memory problems. The same holds, for that matter, for the DOM parser 
provided by XercesJ.
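
To illustrate the memory point, here is a minimal sketch, assuming a plain StAX (javax.xml.stream) pass over the file is acceptable for counting. It is not part of XercesJ's XSD 1.1 processor and does not validate anything; the class name StreamingCount and the method count are made up for this example. It also simplifies the stylesheet above by matching on namespace and local name only, not on the full path from the root. Because it never builds a tree, its memory use should stay roughly flat as the number of Invoice elements grows, which is the opposite of what a tree-based evaluation does.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration only; not an XercesJ API.
class StreamingCount {
    // Namespace taken from the stylesheet in this comment.
    static final String NS = "urn:OECD:StandardAuditFile-Tax:PT_1.04_01";

    // Counts Customer, Product and Invoice start tags in one streaming pass.
    // Simplification: elements are matched by namespace and local name,
    // regardless of where they occur in the document.
    static Map<String, Integer> count(InputStream in) throws XMLStreamException {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("Customer", 0);
        counts.put("Product", 0);
        counts.put("Invoice", 0);

        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && NS.equals(reader.getNamespaceURI())) {
                    // Only the three tracked names are present as keys.
                    counts.computeIfPresent(reader.getLocalName(), (k, v) -> v + 1);
                }
            }
        } finally {
            reader.close();
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java StreamingCount <file.xml>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            count(in).forEach((name, n) -> System.out.println(name + ": " + n));
        }
    }
}
```

Running it against the sample file (for example, java StreamingCount tmp.xml) prints one line per element name. An <assert> near the root cannot work this way, since its XPath expression needs the whole subtree available at once.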

Please consider using any of the suggestions from my previous comment.

I hope you will be able to use XercesJ (and its XSD 1.1 processor) within the 
technical and other project constraints you have. The Xerces XSD 1.1 
implementation provides very good compliance with the XSD 1.1 language.

> Very high memory usage validating XSD 1.1 (+memory leak)
> --------------------------------------------------------
>
>                 Key: XERCESJ-1684
>                 URL: https://issues.apache.org/jira/browse/XERCESJ-1684
>             Project: Xerces2-J
>          Issue Type: Bug
>          Components: JAXP (javax.xml.validation)
>         Environment: windows java 1.8
>            Reporter: Simon Sprott
>            Priority: Critical
>         Attachments: SAFTPT_1_04_01_XSD11_Full.zip, tmp.zip
>
>
> using the 1.1 code branch built from
> http://svn.apache.org/repos/asf/xerces/java/branches/xml-schema-1.1-dev
> Using the built in sample validator 
> java jaxp.SourceValidator -xsd11 -a "SAFTPT_1_04_01_XSD11_Full.xsd" -i 
> "tmp.xml"
> The validation consumes huge amounts of memory (breaks at 10 GB on my 
> system), on smaller sample files validation does complete, but still consumes 
> a very large quantity of memory.
> Furthermore it seems to retain the memory allocated via a reference in 
> org.eclipse.wst.xml.xpath2.processor.internal.DefaultRSFactory._factory after 
> the validator has completed (this is not evident in the cmd line sample as 
> the process ends, but I have observed it in my own code).
> If I was to guess I would say that the results of the XPath queries are being 
> cached via DefaultRSFactory and never released.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)