Thanks for the review John.

> Then how do you combine your forked Daffodil and sample Schematron
> implementation/application together so that your simplest usage example
> actually works?

Sure. The missing instructions are below, and they were also added to the readme in the sample app repo.

---

1. From the root of daffodil, stage the cli package:
   `sbt daffodil-cli/universal:stage`
2. From `daffodil-cli/target/universal/stage`, run the application and verify that it fails as expected because the schematron validator jar is missing:
   `./bin/daffodil parse --schema $data_dir/bmp.dfdl.xsd --validate sch=$data_dir/bmp.sch $data_dir/MARBLES.BMP`
   This should result in:
   `[error] Bad arguments for option 'validate': 'sch=/sample/data/bmp.sch' - Unrecognized ValidationMode sch=/sample/data/bmp.sch. Must be 'on', 'limited', 'off', or name of spi validator.`
3. From the root of the schematron validator, create an assembly jar:
   `sbt assembly`
4. From `daffodil-schematron-validator/target/scala-2.12`, copy the validator jar to the staged daffodil-cli application lib dir, `daffodil-cli/target/universal/stage/lib`.
5. From `daffodil-cli/target/universal/stage`, run the application:
   `./bin/daffodil parse --schema $data_dir/bmp.dfdl.xsd --validate sch=$data_dir/bmp.sch $data_dir/MARBLES.BMP`
6. See the parsed BMP with schematron validation status dumped to stdout.

Note that `data_dir` in the examples above is the exported path to the schematron validator data dir.

---

> I'd have to do some research before I could say something about your bullet
> items for discussion:

Thanks. I am actively thinking about these and welcome input. Will follow up with additional thoughts...

> FYI, Daffodil already uses the ServiceLoader API to load user defined
> functions (daffodil-udf)

Excellent. I did search for SPI related things when first looking at this but somehow missed that implementation. I'll review it.

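For anyone who wants to script the numbered steps above, here is a rough sketch rather than a tested build script. `DAFFODIL_HOME` and `VALIDATOR_HOME` are assumed checkout locations, and `copy_validator_jars` and `build_and_install` are hypothetical helper names. The exact file name of the assembly jar is not stated in this thread, so the sketch copies every jar it finds in the build output directory, which may be more than is strictly needed.

```shell
#!/bin/sh
# Sketch only. DAFFODIL_HOME and VALIDATOR_HOME are assumptions about
# where the two projects were checked out; adjust to your layout.
DAFFODIL_HOME=${DAFFODIL_HOME:-$HOME/incubator-daffodil}
VALIDATOR_HOME=${VALIDATOR_HOME:-$HOME/daffodil-schematron-validator}
stage_dir=$DAFFODIL_HOME/daffodil-cli/target/universal/stage

# Step 4: copy each jar in the build output dir ($1) into the lib dir ($2).
# Returns non-zero when no jar was found or a copy failed.
copy_validator_jars() {
  from=$1
  to=$2
  copied=0
  for jar in "$from"/*.jar; do
    [ -f "$jar" ] || continue        # an unmatched glob stays literal; skip it
    cp "$jar" "$to/" || return 1
    copied=$((copied + 1))
  done
  [ "$copied" -gt 0 ]
}

# Steps 1, 3 and 4 in order; stops at the first failure.
build_and_install() {
  (cd "$DAFFODIL_HOME" && sbt daffodil-cli/universal:stage) || return 1
  (cd "$VALIDATOR_HOME" && sbt assembly) || return 1
  copy_validator_jars "$VALIDATOR_HOME/target/scala-2.12" "$stage_dir/lib"
}

# Build only when asked, so sourcing this file has no side effects.
if [ "${1:-}" = "run" ]; then
  build_and_install
fi
```

Running the script with the single argument `run` performs the build and copy; step 5 is then the same `./bin/daffodil parse ...` command as above, run from the stage directory.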
Appreciate the feedback, and looking forward to hearing if the app runs for you :)

john

________________________________
From: Interrante, John A (GE Research, US) <inter...@research.ge.com>
Sent: Tuesday, September 29, 2020 5:34 PM
To: dev@daffodil.apache.org
Subject: RE: Validator SPI proposal

Hello John,

Using ServiceLoader looks reasonable. I looked at your reference implementation and sample application, but can you clarify a question for me? First you build your forked Daffodil and your sample application separately in different directories. Then how do you combine your forked Daffodil and sample Schematron implementation/application together so that your simplest usage example actually works? That is, do you need to do step 1 below?

1. Copy a jar from daffodil-schematron-validator/target/... to incubator-daffodil/daffodil-cli/target/universal/stage/lib?

    $ <please fill in this step>

2. Define an alias (or create a symbolic link) to allow you to run your freshly built daffodil executable?

    $ alias daffodil="$HOME/incubator-daffodil/daffodil-cli/target/universal/stage/bin/daffodil"

3. Run your simplest usage example?

    $ cd daffodil-schematron-validator
    $ daffodil parse --schema data/bmp.dfdl.xsd --validate sch=data/bmp.sch data/MARBLES.BMP

I'd have to do some research before I could say something about your bullet items for discussion:

- How to approach breaking changes in the Validator API
- How to evolve serialized API objects to prevent breakage in existing serialized objects (specifically from daffodil.api.ValidationMode)
- Is there a better overall approach to this

FYI, Daffodil already uses the ServiceLoader API to load user defined functions (daffodil-udf). I don't know much about the UDF files; I found them only because I searched for any occurrences of ServiceLoader in Daffodil. I don't know if you have seen these files and whether any of them informed your implementation, but I'll append a list of the UDF files for you to look at.

interran@GH3WPL13E:~/apache/incubator-daffodil-asf$ fd udf
daffodil-cli/src/it/scala/org/apache/daffodil/udf
daffodil-cli/src/it/scala/org/apache/daffodil/udf/TestCLIUdfs.scala
daffodil-runtime1/src/main/scala/org/apache/daffodil/udf
daffodil-test/src/test/resources/META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider
daffodil-test/src/test/resources/org/apache/daffodil/udf
daffodil-test/src/test/resources/org/apache/daffodil/udf/udfs.tdml
daffodil-test/src/test/scala/org/apache/daffodil/udf
daffodil-test/src/test/scala/org/apache/daffodil/udf/TestUdfsInSchemas.scala
daffodil-udf
daffodil-udf/src/main/java/org/apache/daffodil/udf
daffodil-udf/src/test/java/org/badudfs
daffodil-udf/src/test/java/org/badudfs/annotations/StringFunctions/META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider
daffodil-udf/src/test/java/org/badudfs/evaluate/StringFunctions/META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider
daffodil-udf/src/test/java/org/badudfs/functionclasses1/StringFunctions/META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider
daffodil-udf/src/test/java/org/badudfs/functionclasses2/StringFunctions/META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider
daffodil-udf/src/test/java/org/badudfs/nonUDF
daffodil-udf/src/test/java/org/badudfs/nonUDF/StringFunctions/META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider
daffodil-udf/src/test/java/org/jgoodudfs
daffodil-udf/src/test/resources/org/apache/daffodil/udf
daffodil-udf/src/test/resources/org/apache/daffodil/udf/genericUdfSchema.xsd
daffodil-udf/src/test/resources/org/badmetainf/nonexistentclass/META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider
daffodil-udf/src/test/resources/org/goodmetainf/IntegerFunctions/META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider
daffodil-udf/src/test/resources/org/goodmetainf/StringFunctions/META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider
daffodil-udf/src/test/scala/org/sbadudfs
daffodil-udf/src/test/scala/org/sbadudfs/functionclasses/StringFunctions/META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider
daffodil-udf/src/test/scala/org/sbadudfs/functionclasses2/StringFunctions/META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider
daffodil-udf/src/test/scala/org/sbadudfs/udfexceptions
daffodil-udf/src/test/scala/org/sbadudfs/udfexceptions/evaluating/StringFunctions/META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider
daffodil-udf/src/test/scala/org/sbadudfs/udfexceptions2
daffodil-udf/src/test/scala/org/sbadudfs/udfexceptions2/StringFunctions/META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider
daffodil-udf/src/test/scala/org/sbadudfs/udfpexceptions
daffodil-udf/src/test/scala/org/sbadudfs/udfpexceptions/StringFunctions/META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider
daffodil-udf/src/test/scala/org/sbadudfs/udfpexceptions2
daffodil-udf/src/test/scala/org/sbadudfs/udfpexceptions2/StringFunctions/META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider
daffodil-udf/src/test/scala/org/sgoodudfs
interran@GH3WPL13E:~/apache/incubator-daffodil-asf$ rg ServiceLoader
daffodil-udf/README.md
36:This class will act as a traditional service provider as explained in the ServiceLoader API, and must have a *META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider* file in its project. This file must contain the fully qualified name(s) of the **provider class(es)** in the JAR. Without that file, neither this class nor any of the User Defined Function classes it provides will be visible to Daffodil.
86:Each UDF is registered by including the fully qualified name of its provider in a text file named `META-INF/services/org.apache.daffodil.udf.UserDefinedFunctionProvider`. The META-INF folder must be accessible from the root of whatever paths are on the classpath, otherwise it won't be picked up by ServiceLoader.

daffodil-udf/src/main/java/org/apache/daffodil/udf/UserDefinedFunctionProvider.java
21: * Abstract class used by ServiceLoader to poll for UDF providers on classpath.

daffodil-runtime1/src/main/scala/org/apache/daffodil/udf/UserDefinedFunctionService.scala
22:import java.util.ServiceLoader
83:  val loader: ServiceLoader[UserDefinedFunctionProvider] = ServiceLoader.load(classOf[UserDefinedFunctionProvider])
185:     * We catch any errors thrown by the ServiceLoader here. This usually means UDFP
interran@GH3WPL13E:~/apache/incubator-daffodil-asf$

John

-----Original Message-----
From: Wass, John L <wa...@ctc.com>
Sent: Tuesday, September 29, 2020 10:20 AM
To: dev@daffodil.apache.org
Subject: EXT: Validator SPI proposal

Greetings,

Please consider the following proposal to extend the Daffodil Infoset Validation API. The proposed changes support deploying custom validation implementations that are not built as part of the Daffodil distribution but are instead made available at runtime as Java Service Provider Interface (SPI) [1] "plug-ins".

The intent here is to enable a wide range of validation approaches without increasing overhead for the Daffodil project, while increasing the velocity at which such implementations can be deployed.

To support the discussion there is a minimally functional reference implementation for Daffodil[2] and sample application using Schematron in a standalone project[3].

I look forward to discussing the approach in more detail.

Approach
---
1. Extract a Validator interface that describes validation behavior.
2. Detect implementations of this interface at runtime using SPI.
3. Parse additional validation arguments from CLI
4. Pass "Custom" validators through the existing api.ValidationMode.
5. Change ParseResult to execute validation through a SPI provided instance.

- Instances of the Validator are accessed at runtime using SPI metadata from META-INF.
- The existing Validator behavior remains and is installed as the "default" behavior.
- The current CLI arguments for validation would not change, but an extended set of parse patterns is added.

CLI Usage
---
In the Schematron sample application there are a few CLI patterns implemented for reference. The simplest usage, using the BMP schema, is

`daffodil parse --schema data/bmp.dfdl.xsd --validate sch=data/bmp.sch data/MARBLES.BMP`

Here 'sch' is the lookup name for the SPI validator, and the text following the '=' is an argument that points to the schematron to use. There are other argument configurations that will need to be discussed.

Unknowns
---
- How to approach breaking changes in the Validator API
- How to evolve serialized API objects to prevent breakage in existing serialized objects (specifically from daffodil.api.ValidationMode)
- Is there a better overall approach to this :P

1. https://docs.oracle.com/javase/tutorial/ext/basics/spi.html
2. https://github.com/ctc-oss/incubator-daffodil
3. https://github.com/ctc-oss/daffodil-schematron-validator

--
John Wass
Software Engineer
Concurrent Technologies Corporation
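
As a small illustration of the `--validate sch=data/bmp.sch` pattern described under CLI Usage, the value can be split at the first '=' into the validator lookup name and its argument. This is only a shell sketch of the idea, not how the reference implementation parses its arguments; `split_validate_arg`, `validator_name` and `validator_arg` are hypothetical names introduced here.

```shell
#!/bin/sh
# Illustration only: split a --validate value such as "sch=data/bmp.sch"
# into a lookup name and an argument at the first '='.
# split_validate_arg is a hypothetical helper, not part of Daffodil.
split_validate_arg() {
  value=$1
  case $value in
    *=*)
      validator_name=${value%%=*}   # everything before the first '='
      validator_arg=${value#*=}     # everything after the first '='
      ;;
    *)
      validator_name=$value         # plain modes such as on, limited, off
      validator_arg=
      ;;
  esac
}

split_validate_arg "sch=data/bmp.sch"
echo "name=$validator_name arg=$validator_arg"
```

Splitting at the first '=' rather than the last keeps any later '=' characters inside the argument, which matters if the argument is a path or contains further key/value text.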