stevedlawrence commented on code in PR #1514: URL: https://github.com/apache/daffodil/pull/1514#discussion_r2229265305
########## daffodil-schematron/src/main/scala/org/apache/daffodil/validation/schematron/Schematron.scala: ########## @@ -48,11 +48,15 @@ object Schematron { } final class Schematron private (reader: XMLReader, templates: Templates) { Review Comment: I think this XMLReader actually isn't used in a thread-safe way? `SchematronValidatorFactory.makeValidator` calls `Schematron.fromRules(...)` to create a single `Schematron` instance that is used whenever `validateXML` is called. But `fromRules` calls `xmlReader.get()`, so that single `Schematron` instance always uses the same `xmlReader` that it got when it created it. So even though xmlReader is a ThreadLocal, we always use the same instance regardless of the thread, which means multiple threads could use the same xmlreader and cause issues. I think part of what is making this difficult to make sense of is the Schematron class, where some of its parameters aren't thread safe. I wonder if things would be more clear if we just removed it, and moved its logic and thread locals into the SchematronValidator? Since SchematronValidator must be thread-safe it's essentially a requirement that its parameters must all be thread safe, and any local members must either be thread safe or thread locals. So if we removed Schematron, we could do something like this: ```scala // Template is thread safe so can be class member class SchematronValidator(template: Template, svrlPath: Option[URI]) { // XMLReader and Transformer are not thread safe so are ThreadLocal's. Alternatively they could be // created every time validateXML, but ThreadLocal allows reuse of objects that are potentially expensive to create val readerTransformerTL = new ThreadLocal[(XMLReader, Transformer)] = { override def initialValue(): (XMLReader, Transformer) = { val reader = DaffodilSaxParserFactory() ... val transformer = template.newTransformer (reader, transformer) } } override def validateXML(...) = { val (reader, transformer) = readerTransformerTL.get val writer = ... transformer.transform(new SAXSource(reader, ...), writer) ... } ``` Note that this puts XMLReader and Transformer in the same ThreadLocal as a tuple, so we only need to do one ThreadLocal lookup per validateXML. Also note there's sort of an issue that multiple threads could write to the same svrlPath if it was provided, but maybe we should just document in package-info.java that schematron.svrl.file is not thread safe and will be overwritten -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@daffodil.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org