tuxji commented on code in PR #1191:
URL: https://github.com/apache/daffodil/pull/1191#discussion_r1541213629
##########
daffodil-cli/src/main/scala/org/apache/daffodil/cli/Main.scala:
##########
@@ -901,7 +901,7 @@ object Main {
val BadExternalVariable = Value(33)
val UserDefinedFunctionError = Value(34)
val UnableToCreateProcessor = Value(35)
- val LayerExecutionError = Value(36)
+ val LayerExecutionError = Value(36) // TODO: remove if remains unused.
Review Comment:
It is hard to tell from the diff whether `LayerExecutionError` is still unused.
If it is, please remove it; otherwise, remove the misleading comment.
##########
daffodil-lib/src/main/scala/org/apache/daffodil/lib/schema/annotation/props/ByHandMixins.scala:
##########
@@ -394,7 +394,7 @@ object BinaryBooleanTrueRepType {
if (i < 0)
element.schemaDefinitionError(
- "For property 'binaryBooleanFalseRep', value must be an empty string
or a non-negative integer. Found: %d",
+ "For property 'binaryBooleanFalseRep', value must be an empty string
or a non-negative integer. Found: %s",
Review Comment:
Does replacing %d with %s make any difference when i is an Int?
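To make the question concrete, here is a minimal sketch in Java. It assumes
the message is ultimately rendered with `java.util.Formatter`-style formatting;
I have not checked how `schemaDefinitionError` formats its arguments. The class
name `FormatSpecifierCheck` and the shortened message prefix are hypothetical.
`Locale.ROOT` is passed explicitly because `%d` is locale-sensitive while `%s`
simply uses the argument's string form.

```java
import java.util.Locale;

// Hypothetical helper, not part of Daffodil: compares %d and %s for an int.
class FormatSpecifierCheck {
    // Shortened stand-in for the real error message text.
    static final String PREFIX = "Found: ";

    // Renders the value with the given conversion under a fixed locale.
    static String render(String spec, int i) {
        return String.format(Locale.ROOT, PREFIX + spec, i);
    }

    public static void main(String[] args) {
        int i = -1;
        System.out.println(render("%d", i));
        System.out.println(render("%s", i));
        // Compare the two renderings directly.
        System.out.println(render("%d", i).equals(render("%s", i)));
    }
}
```

Under these assumptions the two conversions render an `Int` identically. The
practical difference is that `%d` rejects non-integral arguments with an
exception, while `%s` accepts any argument.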
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/dpath/NodeInfo.scala:
##########
@@ -932,6 +936,33 @@ object NodeInfo extends Enum {
DFDLTimeConversion.fromXMLString(s)
}
}
+
+ def toJavaType(dfdlType: DFDLPrimType): Class[_] = {
Review Comment:
Please add scaladocs explaining why the layer API needs `toJavaType` and
`toJavaTypeString` to perform type checking on DFDL variables and layers'
setters and getters to make sure the correct Java types are used.
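As a rough illustration of the kind of check I have in mind, here is a sketch
in Java. Only the `getLayerVariableResult_` naming convention and the pairing
of xs:unsignedShort with `int` come from the Layer javadoc in this PR; the
names `GetterTypeCheck`, `ChecksumLayerSketch`, and `getterMatches` are
hypothetical and are not the PR's actual implementation.

```java
import java.lang.reflect.Method;

// Hypothetical sketch, not Daffodil code: checks a result getter's Java type.
class GetterTypeCheck {

    // Stand-in for a layer class with a result variable named 'checksum'.
    static class ChecksumLayerSketch {
        public int getLayerVariableResult_checksum() {
            return 0xBEEF;
        }
    }

    // True if cls has a public no-arg getter for varName whose return type
    // is exactly the expected Java type; false if missing or mismatched.
    static boolean getterMatches(Class<?> cls, String varName, Class<?> expected) {
        try {
            Method m = cls.getMethod("getLayerVariableResult_" + varName);
            return m.getReturnType() == expected;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The expected type would come from something like toJavaType.
        System.out.println(getterMatches(ChecksumLayerSketch.class, "checksum", int.class));
        System.out.println(getterMatches(ChecksumLayerSketch.class, "checksum", long.class));
    }
}
```

The scaladoc could say something along these lines: the expected Java class
for each DFDL primitive type is looked up once, then compared against what
reflection reports for the layer's setter arguments and getter return types.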
##########
daffodil-io/src/main/scala/org/apache/daffodil/io/RegexLimitingInputStream.scala:
##########
@@ -263,14 +159,18 @@ class RegexLimitingStream(
// Decoding errors, and the complexities they create over how big the
string
// is, vs. how many bytes were consumed... those can't happen with
iso-8859-1.
//
- in.reset() // might have to backup farther than the matchString length
- in.skip(matchString.length + matchLength) // advance exactly the right
number of bytes
- if (matchLength > 0)
+ // We need to position our stream exactly after the characters that are
terminated by the match
+ // being found.
+ in.reset()
+ in.skip(
+ beforeMatchString.length + delimMatchLength,
+ ) // advance exactly the right number of bytes (past the regex match)
+ if (delimMatchLength > 0)
noMoreChunks = true
- if (matchString.isEmpty())
+ if (beforeMatchString.isEmpty)
Stream()
else
- matchString #:: chunks
+ beforeMatchString #:: chunks
Review Comment:
Please add a comment explaining what the unusual `#::` operator does, since I
don't recall it. I could look it up if I were reading the code in an IDE, but
the code may be viewed in other ways.
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/layers/LayerUtils.scala:
##########
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.daffodil.runtime1.layers
+
+import java.net.URI
+import java.net.URISyntaxException
+import java.util.Objects
+
+import org.apache.daffodil.lib.exceptions.Assert
+import org.apache.daffodil.lib.xml.NS
+import org.apache.daffodil.lib.xml.QName
+
+object LayerUtils {
Review Comment:
The functions defined in this object need scaladocs.
##########
daffodil-lib/src/main/scala/org/apache/daffodil/lib/xml/XMLUtils.scala:
##########
@@ -153,6 +153,9 @@ object XMLUtils {
list
}
+ // FIXME: DAFFODIL-2883 - this needs checkForExistingPUA to be false so that
data
+ // which contains unicode PUA characters doesn't cause an SDE. Needs to be
either
+ // accepted or optionally cause a ParseError.
Review Comment:
Are we deferring that FIXME change to another PR? Consider adding this file
name and line number to <https://issues.apache.org/jira/browse/DAFFODIL-2883>
so that someone can find this location more easily.
##########
daffodil-io/src/main/scala/org/apache/daffodil/io/BoundaryMarkStreams.scala:
##########
@@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.daffodil.io
+
+import java.io.FilterOutputStream
+import java.io.InputStream
+import java.nio.charset.Charset
+import java.nio.charset.StandardCharsets
+import java.util.regex.Pattern
+
+import org.apache.daffodil.lib.exceptions.Assert
+
+/**
+ * Can be used with any InputStream to restrict what is
+ * read from it to stop before a boundary mark string.
+ *
+ * The boundary mark string is exactly that, a string of characters. Not a
+ * regex, nor anything involving DFDL Character Entities or Character Class
+ * Entities. (No %WSP; no %NL; )
+ *
+ * This can be used to forcibly stop consumption of data from a stream at
+ * a length obtained from a delimiter.
+ *
+ * The boundary mark string is consumed from the underlying stream (if found),
and
+ * the underlying stream is left positioned at the byte after the boundary mark
+ * string.
+ *
+ * Thread safety: This is inherently stateful - so not thread safe to use
+ * this object from more than one thread.
+ */
+class BoundaryMarkLimitingInputStream(
+ inputStream: InputStream,
+ boundaryMark: String,
+ charset: Charset,
+ targetChunkSize: Int = 32 * 1024,
+) extends InputStream {
+
+ Assert.usage(targetChunkSize >= 1)
+ Assert.usage(boundaryMark.nonEmpty)
+
+ private lazy val boundaryMarkIn8859 =
+ new String(boundaryMark.getBytes(charset), StandardCharsets.ISO_8859_1)
+
+ private lazy val quotedBoundaryMark =
+ Pattern.quote(boundaryMarkIn8859) // in case pattern has non-regex-safe
characters in it
+
+ private lazy val delegateStream = new RegexLimitingInputStream(
+ inputStream,
+ quotedBoundaryMark,
+ boundaryMarkIn8859,
+ charset,
+ targetChunkSize,
+ )
+
+ override def read(): Int = delegateStream.read()
+
+}
+
+class BoundaryMarkInsertingJavaOutputStream(
Review Comment:
It would be nice to add scaladocs to this class to explain what its purpose
is (to insert the boundary mark into the output stream while unparsing an
infoset to its native data format).
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/infoset/InfosetWalker.scala:
##########
@@ -453,6 +453,10 @@ class InfosetWalker private (
outputterFunc
} catch {
case e: Exception => {
+ // FIXME: DAFFODIL-2884 This escalates a parser data exception to an
SDE
+ // Which breaks if string-as-xml encounters a string that is
malformed XML.
+ // We get the error thrown by the xml parser here outside of parsing,
which is
+ // too late.
val cause = e.getCause
Review Comment:
Please add this comment's file name and line number to
<https://issues.apache.org/jira/browse/DAFFODIL-2884> so that whoever fixes
the bug can remove this comment. Alternatively, just note in the issue that
one can grep the Daffodil source for "DAFFODIL-2884" to find this comment.
##########
daffodil-core/src/main/scala/org/apache/daffodil/core/compiler/Compiler.scala:
##########
@@ -273,6 +273,9 @@ class Compiler private (
val dpObj = objInput.readObject()
objInput.close()
val dp = dpObj.asInstanceOf[DataProcessor]
+ // must recompile the layers since the data structure that creates
+ // is not serializable
+ dp.ssrd.compileLayers()
Review Comment:
Unclear grammar. Suggest rewording as:
```
// must recompile the layers since their data structure in memory
// is not serializable
```
##########
daffodil-runtime1-layers/src/main/resources/org/apache/daffodil/layers/xsd/lineFoldedLayer.dfdl.xsd:
##########
@@ -0,0 +1,36 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+<schema
+ xmlns:xs="http://www.w3.org/2001/XMLSchema"
+ xmlns="http://www.w3.org/2001/XMLSchema"
+ xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/"
+ xmlns:dfdlx="http://www.ogf.org/dfdl/dfdl-1.0/extensions"
+ xmlns:lf="urn:org.apache.daffodil.layers.lineFolded"
+ targetNamespace="urn:org.apache.daffodil.layers.lineFolded">
+
+ <annotation>
+ <appinfo source="http://www.ogf.org/dfdl/">
+
+ <!-- lineFolded layers do not have any DFDL variables for parameter or
results
+
+ The two layers are named lineFolded_IMF and lineFolded_iCalendar -->
+
+ </appinfo>
+ </annotation>
+
+</schema>
Review Comment:
All of the new layer DFDL schema files need to have a final newline appended
to them. The sentences saying "The xxx layer has no parameter variables nor
return result variables" would read better if they said "The xxx layer has no
parameter or result variables".
##########
daffodil-test/src/test/resources/META-INF/services/org.apache.daffodil.runtime1.layers.api.Layer:
##########
@@ -27,8 +27,31 @@
# of the layering features of Daffodil. They can be reused to create "real"
# layering, but one should expect to have to adapt their code substantially.
#
-org.apache.daffodil.runtime1.layers.AISPayloadArmoringLayerCompiler
-org.apache.daffodil.runtime1.layers.CheckDigitLayerCompiler
-org.apache.daffodil.runtime1.layers.IPv4ChecksumLayerCompiler
+org.apache.daffodil.runtime1.layers.AISPayloadArmoringLayer
+org.apache.daffodil.runtime1.layers.CheckDigitLayer
+org.apache.daffodil.runtime1.layers.ipv4.IPv4ChecksumLayer
+org.apache.daffodil.runtime1.layers.AllTypesLayer
+org.apache.daffodil.runtime1.layers.STL_Ok1
+org.apache.daffodil.runtime1.layers.STL_Ok2
+org.apache.daffodil.runtime1.layers.STL_Ok3
+org.apache.daffodil.runtime1.layers.STL_Ok4
+org.apache.daffodil.runtime1.layers.STL_BadTypeInLayerCode1
+org.apache.daffodil.runtime1.layers.STL_BadTypeInLayerCode2
+org.apache.daffodil.runtime1.layers.STL_BadMissingSetter
+org.apache.daffodil.runtime1.layers.STL_BadMissingSetterArg
+org.apache.daffodil.runtime1.layers.STL_BadMissingGetter
+org.apache.daffodil.runtime1.layers.STL_BadMissingSetterVar
+org.apache.daffodil.runtime1.layers.STL_BadMissingGetterVar
+org.apache.daffodil.runtime1.layers.STL_BadMissingDefaultConstrutor
+org.apache.daffodil.runtime1.layers.STL_BadNotALayer # error intentional. This
isn't a layer class
+org.apache.daffodil.runtime1.layers.STL_BombOutLayer
+
+
+
+
+
+
+
+
Review Comment:
Are the empty lines at the end intentional?
##########
daffodil-runtime1/src/main/scala/org/apache/daffodil/runtime1/layers/api/Layer.java:
##########
@@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.daffodil.runtime1.layers.api;
+
+import org.apache.daffodil.runtime1.layers.LayerRuntime;
+import org.apache.daffodil.runtime1.layers.LayerUtils;
+
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.util.ArrayList;
+import java.util.List;
+
+/**
+ * This is the primary API class for writing layers.
+ * <p/>
+ * All layers are derived from this class, and must have no-args default
constructors.
+ * <p/>
+ * Derived classes will be dynamically loaded by Java's SPI system.
+ * The names of concrete classes derived from Layer are listed in a
resources/META-INF/services file
+ * so that they can be found and dynamically loaded.
+ * <p/>
+ * The SPI creates an instance the class by calling a default (no-arg)
constructor, which should be
+ * the only constructor.
+ * <p/>
+ * Instances of derived layer classes can be stateful. They are private to
threads, and each time a layer
+ * is encountered during parse/unparse, an instance is created for that
situation.
+ * <p/>
+ * Layer instances should not share mutable state (such as via singleton
objects)
+ * <p/>
+ * The rest of the Layer class implements the
+ * layer decode/encode logic, which is done as part of deriving one's Layer
class from the
+ * Layer base class.
+ * <p/>
+ * About variables: Layer logic may read and write DFDL variables.
+ * <p/>
+ * Every DFDL Variable in the layer's targetNamespace is used either at the
start of the
+ * layer algorithm as a parameter to the layer or at the end of the layer
algorithm it is
+ * assigned as a return value (such as a checksum) from the layer.
+ * <p/>
+ * Variables being written must be undefined, since variables in DFDL are
single-assignment.
+ * <p/>
+ * Variables being read must be defined before being read by the layer, and
this is true for both
+ * parsing and unparsing. When unparsing, variables being read cannot be
forward-referencing to parts
+ * of the DFDL infoset that have not yet been unparsed.
+ * <p/>
+ * A layer that wants to read parameters declares special setter named
'setLayerVariableParameters'
+ * which has args such that each has a name and type that match a corresponding
+ * dfdl:defineVariable in the layer's namespace.
+ * <p/>
+ * A layer that wants to return a value after the layer algorithm completes
defines a special recognizable
+ * getter method. The name of the getter is formed from prefixing the DFDL
variable name with the string
+ * 'getLayerVariableResult_'. The return type of the getter must match the
type of the variable.
+ * <p/>
+ * For example, a result value getter for a DFDL variable named 'checksum' of
type xs:unsignedShort would be:
+ * <pre>
+ * int getLayerVariableResult_checksum() {
+ * // returns the value created by the checksum algorithm.
+ * }
+ * </pre>
+ * <p/>
+ */
+public abstract class Layer {
+
+ private final String localName;
+ private final String targetNamespace;
+
+ private LayerRuntime layerRuntime;
+
+ /**
+ * Constructs a new Layer object with the given layer name and namespace.
+ *
+ * @param localName the local NCName of the layer. Must be usable as a
Java identifier.
+ * @param targetNamespace the namespace of the layer. Must obey URI syntax.
+ * @throws IllegalArgumentException if arguments are null or do not obey
required syntax.
+ */
+ public Layer(String localName, String targetNamespace) {
+
+ LayerUtils.requireJavaIdCompatible(localName, "layerLocalName");
+ LayerUtils.requireURICompatible(targetNamespace, "layerNamespace");
+
+ this.localName = localName;
+ this.targetNamespace = targetNamespace;
+ }
+
+ /** The spiName of the Layer class.
+ * <p/>
+ * This method and the string it returns are required by the SPI loader.
+ * @return A unique indentifier for the kind of layer. Contains both local
and namespace components of the layer's complete name.
+ */
+ public final String name() { return LayerUtils.spiName(localName,
targetNamespace); }
+
+ public final String localName() { return this.localName; }
+ public final String namespace() { return this.targetNamespace; }
+
+ /**
+ * Called by the execution framework to give the context for reporting
errors.
+ * @param lr runtime data structure used by the framework
+ */
+ public final void setLayerRuntime(LayerRuntime lr) {
+ this.layerRuntime = lr;
+ }
+ public final LayerRuntime getLayerRuntime() { return layerRuntime; }
+
+ /**
+ * Use to report a processing error.
+ * <p/>
+ * When parsing a processing error can cause backtracking so that the parse
+ * can often recover from the error.
+ * <p/>
+ * When unparsing a processing error is fatal.
+ * @param msg describes the error
+ */
+ public void processingError(String msg) { layerRuntime.processingError(msg);
}
+
+ /**
+ * Use to report a processing error.
+ * <p/>
+ * When parsing a processing error can cause backtracking so that the parse
+ * can often recover from the error.
+ * <p/>
+ * When unparsing a processing error is fatal.
+ * @param cause a throwable object that describes the error
+ */
+ public void processingError(Throwable cause) {
layerRuntime.processingError(cause); }
+ /**
+ * Use to report a runtime schema definition error.
+ * <p/>
+ * This indicates that the layer is unable to
+ * work meaningfully because of the way it is configured. The schema itself
is not well defined due to
+ * the way the layer is configured.
+ * <p/>
+ * This error type is always fatal whether parsing or unparsing.
+ * <p/>
+ * @param msg describes the error
+ */
+ public void runtimeSchemaDefinitionError(String msg) {
layerRuntime.runtimeSchemaDefinitionError(msg); }
+
+ /**
+ * Use to report a runtime schema definition error.
+ * <p/>
+ * This indicates that the layer is unable to
+ * work meaningfully because of the way it is configured. The schema itself
is not well defined due to
+ * the way the layer is configured.
+ * <p/>
+ * This error type is always fatal whether parsing or unparsing.
+ * <p/>
+ * @param cause a throwable object that describes the error
+ */
+ public void runtimeSchemaDefinitionError(Throwable cause) {
layerRuntime.runtimeSchemaDefinitionError(cause); }
+
+ /**
+ * Wraps a layer input interpreter around an input stream, using the
provided LayerRuntimeFoo for runtime information and stateful services.
+ *
+ * @param jis The input stream to be wrapped.
+ * @return An input stream with the layer wrapped around it.
+ */
+ public abstract InputStream wrapLayerInput(InputStream jis) throws Exception;
+
+ /**
+ * Wraps a layer output interpreter around an output stream, using the
provided LayerRuntimeFoo for runtime information and stateful services.
+ *
+ * @param jos The output stream to be wrapped.
+ * @return An output stream with the layer wrapped around it.
+ */
+ public abstract OutputStream wrapLayerOutput(OutputStream jos) throws
Exception;
+
+ private ArrayList<Class<? extends Exception>> peExceptions = new
ArrayList<>();
+
+ /**
+ * Adds an exception class to the list of exceptions that will be
automatically converted
+ * into processing errors.
+ * <p/>
+ * The purpose of this is to allow one to use java/scala libraries that may
throw
+ * exceptions when encountering bad data. Such exceptions should be
translated into
+ * processing errors, which will allow the parser to backtrack and try other
alternatives
+ * which may work for that data.
+ * <p/>
+ * When considering whether a thrown Exception is to be converted to a
processing error
+ * RuntimeException classes are handled separately from Exception classes.
+ * Hence calling
+ * <pre>
+ * setProcessingErrorException(Exception.class);
+ * </pre>
+ * will NOT cause all RuntimeExceptions to also be converted into processing
errors.
+ * It will, however, cause all classes derived from Exception that are NOT
RuntimeExceptions
+ * to be caught and converted into Processing Errors.
+ *
+ * @param e the exception class to be added to the list of processing error
exceptions
+ */
+ public final void setProcessingErrorException(Class<? extends Exception> e) {
+ peExceptions.add(e);
+ }
+
+ public final List<Class<? extends Exception>> getProcessingErrorExceptions()
{
+ return peExceptions;
+ }
+
+}
Review Comment:
Many new files need a final newline - please check your IDE settings to
ensure new files get a final newline upon saving them.
##########
daffodil-test/src/test/scala/org/apache/daffodil/runtime1/layers/AISPayloadArmoringLayer.scala:
##########
@@ -183,17 +113,25 @@ class AISPayloadArmoringOutputStream(jos:
java.io.OutputStream) extends OutputSt
override def close(): Unit = {
if (!closed) {
- val ba = baos.toByteArray()
- val dis = InputSourceDataInputStream(ba)
- val finfo = new FormatInfoForAISDecode()
- val cb = CharBuffer.allocate(256)
- while ({ val numDecoded = dec.decode(dis, finfo, cb); numDecoded > 0 }) {
- cb.flip()
- IOUtils.write(cb, jos, iso8859)
- cb.clear()
+ val ba = baos.toByteArray
+ using(InputSourceDataInputStream(ba)) { dis =>
+ val finfo = FormatInfoForAISDecode
+ val cb = CharBuffer.allocate(256)
+ //
+ // TODO: This is not a supportable API. We want to reuse the
bitsCharset features of daffodil
+ // for this non-byte-sized charset decoding. But this finfo object (a
trait on the ParseOrUnparseState)
+ // should not be part of the API
Review Comment:
Are you talking only about the AIS Payload Armoring Layer API, or about the
layer API in general? I haven't noticed the finfo object used elsewhere in
this PR, just in this function.
##########
daffodil-runtime1-unparser/src/main/scala/org/apache/daffodil/unparsers/runtime1/LayeredSequenceUnparser.scala:
##########
@@ -52,10 +45,14 @@ class LayeredSequenceUnparser(
// since the flushing of the layer might be delayed due to suspensions. By
// getting an immutable state, we ensure that the flushing of the layer
// occurs with the state at this point.
+ //
+ // TODO: we're not unparsing here, just writing bytes, so perhaps we do not
+ // need this cloned state? Everything in layers is byte-centric, so there
is
+ // no issue of fragment bytes.
val formatInfoPre =
state.asInstanceOf[UStateMain].cloneForSuspension(layerUnderlyingDOS)
Review Comment:
Let's revisit this TODO. Now that this PR is complete, if you know whether
`formatInfoPre` is needed, please either remove the TODO (if it is needed) or
remove the cloned state (if it is not).