The Short Version:
I believe that we should change PState.setSuccess to take as an argument a
PState.Mark and revert the diagnostics that have occurred since that Mark. The
current uses of setSuccess leave diagnostics in the PState that should be
discarded. Does this ruffle anyone’s feathers?
Long Version:
Pursuing Steve Lawrence’s comment in the pull request (#444) – I attempted to
add an assert to the TDMLRunner’s doParseExpectSuccess that requires a
successful test case to produce no error level diagnostics if it returns a
success and an infoset.
e.g.
val actual = processor.parse(…)
val diagObjs = actual.getDiagnostics
if (actual.isProcessingError) {
…
// unrelated failure stuff
…
} else {
// If we think we've succeeded, verify there are no errors
// captured in the diagnostics. Otherwise there's probably
// an internal bug causing us to miss setting isProcessingError
val hasErrorDiags = diagObjs.exists { diag =>
diag.isError && !diag.isValidation
}
Assert.invariant(!hasErrorDiags)
}
This revealed that at least 35 test cases produce errors even though they
succeed and provide an infoset. At first glance, I thought we just had more
occurrences of AbsentRep and MissingSeparator being misused, but it appears the
actual cause might be the use of
PState.setSuccess() to mask errors.
setSuccess appears to reset only the overall error/success state, but not
affect diagnostics. This seems like an oversight. I would like to modify
setSuccess:
final def setSuccess(pou: PState.Mark): Unit = {
_processorStatus = Success
this.diagnostics = pou.diagnostics
}
Modifying the state without fully reverting to a PoU seems dangerous, but the
other option seems to be rethinking our 7 uses of setSuccess entirely. Thoughts
on this change?
[A picture containing object, clock Description automatically generated]
Ian Carlson | Software Engineer
[Owl Cyber Defense]
W [email protected]<https://owlcyberdefense.com/>
Connect with us!
Facebook<https://www.facebook.com/owlcyberdefense/> |
LinkedIn<https://www.linkedin.com/company/owlcyberdefense/> |
Twitter<https://twitter.com/owlcyberdefense>
[Find us at our next event. Click
Here.]<https://owlcyberdefense.com/resources/events/?utm_source=owl-cyber-defense&utm_medium=email&utm_content=banner&utm_campaign=2020-events>
The information contained in this transmission is for the personal and
confidential use of the individual or entity to which it is addressed.
If the reader is not the intended recipient, you are hereby notified that any
review, dissemination, or copying of this communication is strictly prohibited.
If you have received this transmission in error, please notify the sender
immediately
From: Beckerle, Mike<mailto:[email protected]>
Sent: Monday, October 26, 2020 11:27 AM
To: [email protected]<mailto:[email protected]>
Subject: Re: 2399 - Changing error code for missing separators
There's some discussion in the PR, but for this mailing list I wanted to
comment that this is indeed a deep area of algorithmic complexity in the
Daffodil runtime.
Whether the code is actually factored adequately such that it enables
implementing the right DFDL behavior is not entirely clear, but it is
encouraging that this fix seems to be a small change without requiring
refactoring.
I am sure there are other bugs in this general area of the code and getting
everything exactly right so that every nuance of the DFDL spec is properly
implemented is quite difficult and will require development of an extensive
collection of tests that cover all these various combinations of features of
DFDL.
I think right now, the important thing is no regression on any known existing
DFDL schemas.
________________________________
From: Carlson, Ian <[email protected]>
Sent: Wednesday, October 21, 2020 6:04 PM
To: [email protected] <[email protected]>
Subject: 2399 - Changing error code for missing separators
All,
I’ve done a great deal of digging to find the cause of the bug DAFFODIL-2399.
It appears to come down to
SeparatedSequenceChildParseResultHelper.scala line 282. In this case, we have
found an out of scope delimiter %NL; before finding any elements, and as a
result we abort parsing the sequence. We sensibly set the return status to
ParseAttemptStatus.MissingSeparator. However, since the requiredOptional field
is true, we have a status of success.
So far so good – everything works out until we roll back up to
SequenceParserBases.scala, line 270. The MissingSeparator status again sets a
status to success and terminates parsing of the sequence, but does not reset to
a PoU. This is the reason our errors aren’t being discarded.
My suggested solution is simply to replace ParseAttemptStatus.MissingSeparator
with ParseAttemptStatus.AbsentRep.
I bring this up to the mailing list because while this does seem to solve the
problem without breaking any other tests, this is a very deep behavior in our
parsing routine, and I want to see if anyone sees any potential dangers with
the change.
I’ve created a pull request:
https://github.com/apache/incubator-daffodil/pull/444 with this change for
review.
[A picture containing object, clock Description automatically generated]
Ian Carlson | Software Engineer
[Owl Cyber Defense]
W [email protected]<https://owlcyberdefense.com/>
Connect with us!
Facebook<https://www.facebook.com/owlcyberdefense/> |
LinkedIn<https://www.linkedin.com/company/owlcyberdefense/> |
Twitter<https://twitter.com/owlcyberdefense>
[Find us at our next event. Click
Here.]<https://owlcyberdefense.com/resources/events/?utm_source=owl-cyber-defense&utm_medium=email&utm_content=banner&utm_campaign=2020-events>
The information contained in this transmission is for the personal and
confidential use of the individual or entity to which it is addressed.
If the reader is not the intended recipient, you are hereby notified that any
review, dissemination, or copying of this communication is strictly prohibited.
If you have received this transmission in error, please notify the sender
immediately