stevedlawrence opened a new pull request, #1253: URL: https://github.com/apache/daffodil/pull/1253
Each DataProcessor currently creates and stores a unique instance of its Validator. When a DataProcessor is copied with one of the withXYZ functions, the validator is not copied and must be created again when the new processor performs validation, even though it uses the same schema. In most normal uses this isn't a big deal, since the withXYZ functions are not called frequently and the validator isn't created until validation is needed. However, the TDML Runner often calls withXYZ for every test, which means that if validation is enabled, every test recreates a unique Validator. This can be very slow and expensive, especially for large schemas.

To avoid this, this modifies the DataProcessor so that the withXYZ functions copy the validator, so it is shared among DataProcessors. The withValidationMode function also ensures we only create a new validator if the mode actually changes, avoiding the creation of unnecessary, expensive Validators.

This also modifies the TDML Runner so that the cached DataProcessors are built using the value of defaultValidation. This way the cached DataProcessor contains the pre-built Validator, and any test cases that use the same validation mode will not need to rebuild it.

Note that this means tests should run much quicker if you set defaultValidation="on" and validation="off" for tests that don't need validation, rather than setting defaultValidation="off" and validation="on" for tests that do need it, since the former builds the Validator once and shares it with all tests that do not turn off validation.

Another side effect of this change is that we now build the Validator immediately when withValidationMode is called, rather than lazily waiting for the validator to be used. This is arguably better, since it means there won't be a possible hiccup on the first parse.

DAFFODIL-2901
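The sharing behavior described above can be illustrated with a small toy model. This is only a sketch in Python, not Daffodil's actual Scala implementation: the class bodies, the `with_debugging` stand-in for an arbitrary withXYZ function, the mode strings, and the choice to hold no validator when the mode is "off" are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


class Validator:
    """Hypothetical stand-in for a validator that is expensive to build."""

    builds = 0  # counts how many validators have been constructed

    def __init__(self, mode: str):
        Validator.builds += 1
        self.mode = mode


@dataclass(frozen=True)
class DataProcessor:
    """Toy immutable processor; every with_* call returns a copy."""

    validation_mode: str = "off"
    validator: Optional[Validator] = None
    debugging: bool = False

    def with_debugging(self, flag: bool) -> "DataProcessor":
        # Stand-in for any withXYZ function: the copy keeps a reference
        # to the same validator instead of rebuilding it.
        return replace(self, debugging=flag)

    def with_validation_mode(self, mode: str) -> "DataProcessor":
        # Unchanged mode: reuse this processor and its validator as-is.
        if mode == self.validation_mode:
            return self
        # Changed mode: build the new validator eagerly, right here,
        # rather than lazily on first use. Assumption of this sketch:
        # "off" needs no validator at all.
        validator = Validator(mode) if mode != "off" else None
        return replace(self, validation_mode=mode, validator=validator)
```

In this model, building one processor with `with_validation_mode("on")` and then deriving a copy per test with `with_debugging(...)` constructs a single `Validator` that every copy shares, which mirrors the defaultValidation="on" recommendation above.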
