stevedlawrence opened a new pull request, #1253:
URL: https://github.com/apache/daffodil/pull/1253

   Each DataProcessor currently creates and stores a unique instance of its 
Validator. When a DataProcessor is copied with one of the withXYZ functions, 
the validator is not copied and must be created again when the new processor 
performs validation, even though it uses the same schema. In most normal 
uses this isn't a big deal, since withXYZ functions are not called 
frequently and the validator isn't created until validation is 
needed.
   
   However, the TDML Runner often calls withXYZ for every test, which means if 
validation is enabled then every test will recreate a unique Validator. This 
can be very slow and expensive, especially for large schemas.
   
   To avoid this, this change modifies the DataProcessor so that the withXYZ 
functions copy the validator, which is then shared among DataProcessors. In 
addition, the withValidationMode function only creates a new validator if the 
mode actually changes, avoiding the creation of unnecessary, expensive Validators.
   
   This also modifies the TDML runner so that the cached DataProcessors are 
built using the value of defaultValidation. This way each cached DataProcessor 
contains the pre-built Validator, and any test cases that use the same 
validation mode will not need to rebuild the Validator.
   
   Note that this means tests should run much quicker if you set 
defaultValidation="on" and validation="off" for tests that don't need 
validation, rather than setting defaultValidation="off" and validation="on" for 
tests that do need it, since the former will build the Validator once and share 
it with the tests that do not turn off validation.
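   
   A back-of-the-envelope model of that trade-off, as a sketch only: it assumes that turning validation off is cheap and that each test whose mode differs from the cached default builds its own validator. `Mode` and `expensiveBuilds` are hypothetical names, not part of the TDML runner.

```scala
object TdmlValidationCostSketch {
  // Hypothetical two-state model of the validation setting.
  sealed trait Mode
  case object On extends Mode
  case object Off extends Mode

  // Estimates how many expensive validators are built for one suite run.
  def expensiveBuilds(defaultValidation: Mode, testModes: Seq[Mode]): Int = {
    // The cached processor is built once, using the suite default.
    val cachedBuild = if (defaultValidation == On) 1 else 0
    // A test triggers a build only when its mode differs from the cached
    // default and that mode needs a real validator (assumption: Off is cheap).
    val perTestBuilds = testModes.count(m => m != defaultValidation && m == On)
    cachedBuild + perTestBuilds
  }
}
```

   For a suite of ten tests where three need validation, this model gives one build with defaultValidation="on" and three builds with defaultValidation="off", and the gap grows with the number of validating tests.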
   
   Another side effect of this change is that we now build the Validator 
immediately when withValidationMode is called, rather than lazily waiting for 
the validator to be used. This is arguably better, since it avoids a possible 
hiccup on the first parse.
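   
   The difference between the two construction strategies can be illustrated with plain Scala `lazy val` versus `val`. This is a sketch under stated assumptions: `LazyProcessor`, `EagerProcessor`, and the logging `Validator` below are hypothetical and only record when construction happens.

```scala
object EagerVsLazySketch {
  import scala.collection.mutable.ArrayBuffer

  // Hypothetical validator that records when it is constructed.
  final class Validator(log: ArrayBuffer[String]) {
    log += "validator built"
  }

  // Old behaviour in spirit: construction is deferred until first use,
  // so the first parse pays the cost.
  final class LazyProcessor(log: ArrayBuffer[String]) {
    lazy val validator: Validator = new Validator(log)
    def parse(): Unit = {
      val v = validator // forces construction on the first call
      log += (if (v != null) "parse" else "parse without validator")
      ()
    }
  }

  // New behaviour in spirit: construction happens when the processor is made.
  final class EagerProcessor(log: ArrayBuffer[String]) {
    val validator: Validator = new Validator(log)
    def parse(): Unit = {
      log += "parse"
      ()
    }
  }
}
```

   With the lazy variant nothing is logged until `parse()` is first called, whereas the eager variant logs the build at construction time, which moves the cost out of the first parse.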
   
   DAFFODIL-2901


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

