stevedlawrence commented on code in PR #1514:
URL: https://github.com/apache/daffodil/pull/1514#discussion_r2229265305


##########
daffodil-schematron/src/main/scala/org/apache/daffodil/validation/schematron/Schematron.scala:
##########
@@ -48,11 +48,15 @@ object Schematron {
 }
 
 final class Schematron private (reader: XMLReader, templates: Templates) {

Review Comment:
   I think this XMLReader actually isn't used in a thread-safe way?
   
   `SchematronValidatorFactory.makeValidator` calls `Schematron.fromRules(...)` 
to create a single `Schematron` instance that is used whenever `validateXML` is 
called. But `fromRules` calls `xmlReader.get()`, so that single `Schematron` 
instance always uses the same `xmlReader` that it got when it created it. So 
even though xmlReader is a ThreadLocal, we always use the same instance 
regardless of the thread, which means multiple threads could use the same 
xmlreader and cause issues.
   
   I think part of what is making this difficult to make sense of is the 
Schematron class, where some of its parameters aren't thread safe. I wonder if 
things would be more clear if we just removed it, and moved its logic and 
thread locals into the SchematronValidator? Since SchematronValidator must be 
thread-safe it's essentially a requirement that its parameters must all be 
thread safe, and any local members must either be thread safe or thread locals. 
So if we removed Schematron, we could do something like this:
   
   ```scala
   // Template is thread safe so can be class member
   class SchematronValidator(template: Template, svrlPath: Option[URI]) {
     
     // XMLReader and Transformer are not thread safe so are ThreadLocal's. 
Alternatively they could be
    // created every time validateXML, but ThreadLocal allows reuse of objects 
that are potentially expensive to create 
     val readerTransformerTL = new ThreadLocal[(XMLReader, Transformer)] = {
       override def initialValue(): (XMLReader, Transformer) = {
         val reader = DaffodilSaxParserFactory()
         ...
         val transformer = template.newTransformer
         (reader, transformer)
       }
     }
   
     override def validateXML(...) = {
       val (reader, transformer) = readerTransformerTL.get
       val writer = ...
       transformer.transform(new SAXSource(reader, ...), writer)
       ...
   }
   ```
   Note that this puts XMLReader and Transformer in the same ThreadLocal as a 
tuple, so we only need to do one ThreadLocal lookup per validateXML.
   
   Also note there's sort of an issue that multiple threads could write to the 
same svrlPath if it was provided, but maybe we should just document in 
package-info.java that schematron.svrl.file is not thread safe and will be 
overwritten



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@daffodil.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to