[
https://issues.apache.org/jira/browse/XERCESJ-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542182#comment-17542182
]
Mike Beckerle commented on XERCESJ-1745:
----------------------------------------
Thank you for the link. I looked into the faq-grammars page and related example
source code.
Alas, none of these data structures are serializable, so using grammar pools
and preloaded grammars seems to achieve only "compile once at start-up"
behavior, which I believe we're already getting via the factory patterns that
support providing the schema once and then creating parsers from that factory.
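
For reference, the "provide the schema once, then create parsers from the
factory" pattern looks roughly like the sketch below. It uses only the standard
JAXP javax.xml.validation API; the class name CompileOnce, the helper methods,
and the tiny inline schema are made up for illustration and are not our actual
code. The timing in main() shows where the compile cost lands in each process.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class CompileOnce {
    // Hypothetical stand-in for a very large multi-file schema.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='a' type='xs:int'/>"
      + "</xs:schema>";

    // The schema is "compiled" here, once; the resulting Schema object is
    // immutable and reusable, but it is not java.io.Serializable.
    static Schema compile(String xsd) throws SAXException {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        return factory.newSchema(new StreamSource(new StringReader(xsd)));
    }

    // Creating a Validator from an existing Schema does not recompile it.
    static boolean isValid(Schema schema, String xml) {
        try {
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        long start = System.nanoTime();
        Schema schema = compile(XSD);
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println("schema compile took " + micros + " us");
        System.out.println(isValid(schema, "<a>1</a>"));
        System.out.println(isValid(schema, "<a>not-an-int</a>"));
    }
}
```

This saves repeated compilation within one JVM, but every new process still
pays the full compile cost at start-up.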
With respect to the serialization form, I wanted to clarify that our need is
simpler than what many people would think is needed for serializability. We do
not need any compatibility of the serializations across Xerces versions/builds.
For our needs the saved serialized representation can be completely tied to
exactly the same version/build of Xerces that created it. Reloading a
serialization created from a different version/build of Xerces can just be a
fatal error.
This actually removes the need for a great deal of the maintenance complexity
associated with serializability.
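
To make the "tied to exactly one build" idea concrete, here is a minimal sketch
using plain java.io serialization. Everything in it is an assumption for
illustration: the class VersionTiedStore, the save/restore methods, and the
buildId string are hypothetical and are not existing Xerces APIs, and the
payload stands in for whatever compiled object Xerces would expose.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;

public class VersionTiedStore {
    // Write the build identifier first, then the object itself.
    static void save(Serializable compiled, String buildId, OutputStream out)
            throws IOException {
        ObjectOutputStream oos = new ObjectOutputStream(out);
        oos.writeUTF(buildId);
        oos.writeObject(compiled);
        oos.flush();
    }

    // Refuse to deserialize anything written by a different build.
    // No migration or compatibility logic is attempted.
    static Object restore(String buildId, InputStream in)
            throws IOException, ClassNotFoundException {
        ObjectInputStream ois = new ObjectInputStream(in);
        String savedBuildId = ois.readUTF();
        if (!savedBuildId.equals(buildId)) {
            throw new InvalidObjectException(
                "saved by build " + savedBuildId + ", running build " + buildId);
        }
        return ois.readObject();
    }
}
```

Because a mismatch is simply a fatal error, the serialized classes never need
stable serialVersionUIDs or cross-version readObject handling.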
> Save/Restore serialized "compiled" parser-validator
> ---------------------------------------------------
>
> Key: XERCESJ-1745
> URL: https://issues.apache.org/jira/browse/XERCESJ-1745
> Project: Xerces2-J
> Issue Type: New Feature
> Components: Other, Serialization
> Affects Versions: 2.12.2
> Reporter: Mike Beckerle
> Priority: Major
>
> Feature requested by Apache Daffodil project PMC.
>
> We use Xerces-J to validate XML files.
>
> The schemas of these files are huge. Think 300+ fairly large XSD files all
> included/imported together. Megabytes of XSD.
>
> In order to validate+parse faster, we know Xerces does something akin to
> "compiling" the XSD into lower-level data structures.
>
> The requested feature is to make this "compilation" step of the large XSD
> schema explicit, and then be able to serialize the resulting java object to a
> file. Subsequently one can reload this pre-compiled object so as not to face
> this compiling overhead at startup time.
>
> An API call to explicitly force this compilation step, so that the time taken
> to do it can be measured, is an important part of this feature. This
> compilation can also occur automatically on first use, without requiring an
> explicit "compile it now" API call, and that would retain perfect
> compatibility with Xerces APIs today.
>
> But for very large XSD, it is of value to be able to time this compile
> activity, so a new API method to cause Xerces to do this compilation step
> explicitly (and which is separate from the serialization of the resulting
> object) is of value.
>
> In summary I think numerous internal data structures within Xerces would have
> to be made Serializable, and a compileParser(),
> saveParser(java.io.OutputStream) and restoreParser(java.io.InputStream) or
> something along those lines are needed.
>
>