Given the recent discussion re: releases, is there any inclination to
implement a validation strategy for reconstituted data in a lossless
environment?
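
To make the question concrete, here is a minimal Python sketch of the kind of
check I have in mind. It is not Daffodil functionality: the marker name
daffodil-sha256 and the helpers embed_digest / verify_digest are hypothetical,
and the actual parse and unparse steps are assumed to happen elsewhere. It only
shows a digest of the source bytes travelling with the infoset as an XML
comment and being compared against the reconstituted bytes.

```python
import hashlib
import re

# Hypothetical marker name; Daffodil does not define or emit such a comment.
MARKER = "daffodil-sha256"


def embed_digest(infoset_xml: str, source: bytes) -> str:
    """Append an XML comment carrying the SHA-256 of the original source data."""
    digest = hashlib.sha256(source).hexdigest()
    return f"{infoset_xml.rstrip()}\n<!-- {MARKER}: {digest} -->\n"


def verify_digest(infoset_xml: str, reconstituted: bytes) -> bool:
    """Return True if the embedded digest matches the reconstituted bytes."""
    match = re.search(rf"<!-- {MARKER}: ([0-9a-f]{{64}}) -->", infoset_xml)
    if match is None:
        raise ValueError("no embedded digest found in infoset")
    return match.group(1) == hashlib.sha256(reconstituted).hexdigest()
```

The source gateway would call embed_digest after a successful parse, and the
destination gateway would call verify_digest on the bytes produced by unparse,
so the two ends never have to talk to each other directly.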

Thx - Attila

On 2021/09/17 20:11:54, "Beckerle, Mike" <mbecke...@owlcyberdefense.com> wrote: 
> Apologies for the tardy reply. I missed parts of this thread because of my
> spam email filter.
> 
> (I learned that MS Outlook 365 is misclassifying some Apache email as junk
> email.)
> 
> Here's the link to what is proposed for checksum calculations, and it has 
> links to some mock-ups showing how this checksum/crc stuff is supposed to 
> work.
> 
> https://cwiki.apache.org/confluence/display/DAFFODIL/Proposal%3A+Checksums%2C+CRC%2C+Parity+-+Layering+Enhancements
> 
> I do think this could be used to couple a generic hash into data that is 
> verified at unparse.
> 
> 
> ________________________________
> From: Steve Lawrence <slawre...@apache.org>
> Sent: Monday, August 30, 2021 9:50 AM
> To: dev@daffodil.apache.org <dev@daffodil.apache.org>
> Subject: Re: Fwd: FW: DFDL: potential problem
> 
> Interesting idea.
> 
> I was thinking you could do something like this once we have this new
> feature implemented:
> 
>   <xs:element name="FormatAndChecksum">
>     <xs:sequence>
>       <xs:sequence dfdlx:layer="checksum">
>         <xs:element ref="Format" />
>       </xs:sequence>
>       <xs:element name="checksum" type="xs:string"
>         dfdl:inputValueCalc="$checksum" />
>       <xs:sequence>
>         <xs:annotation>
>           <xs:appinfo source="http://www.ogf.org/dfdl/">
>             <dfdl:assert test="{ ./checksum eq $checksum }" />
>           </xs:appinfo>
>         </xs:annotation>
>       </xs:sequence>
>     </xs:sequence>
>   </xs:element>
> 
> So we parse and checksum the entire data format, add the checksum to the
> infoset via input value calc, and then add an assert that the calculated
> checksum matches the value in the infoset.
> 
> On parse, these two should always be the same. But on unparse, it's
> possible they could be different and the assert would fail.
> Unfortunately, this doesn't actually work because asserts are not
> evaluated during unparse.
> 
> This seems like a reasonable use case for asserts during unparse, and I
> imagine there are others, so maybe that's a feature worth considering to
> allow this type of unparse validation.
> 
> 
> 
> 
> On 8/25/21 9:20 AM, Attila Horvath wrote:
> >
> > *Subject:* DFDL: potential problem
> >
> > ALCON
> >
> > re: idea for checksum calculations in DFDL
> > <https://lists.apache.org/thread.html/r85112d45e552a1f5b467406aeeee0f0a4bcaf143372b95c8e72f2669%40%3Cdev.daffodil.apache.org%3E>
> >
> > We may have a potential ‘situation’ as part of our DFDL/Daffodil offering as
> > follows…
> >
> > My DFDL schema development process consists of examining the exit codes of a
> > four (4) part mechanism:
> >
> >  1. DFDL parsing – “Houston, we have a go.”
> >  2. DFDL unparsing – “Houston, we have a go.”
> >  3. *End-to-end source/destination data comparison – “Houston, we have a
> >     problem.”*
> >  4. Intermediate xml validation against reconstituted data – “Houston, we
> >     have a go.”
> >
> > I have an *_unintentional_* error in my DFDL schema; unfortunately, the
> > data/schema that created this situation is lost. Per above, both parse and
> > unparse execute successfully, and xmllint validates Daffodil’s intermediate
> > XML file successfully against the reconstituted/unparsed data as well as
> > against the DFDL [erroneous] schema.
> >
> > However, the source and target data are *_NOT_* congruent. This is one
> > situation I did not anticipate.
> >
> > This means our model and incorporation of Daffodil in our situation leaves
> > [albeit] a /possibility/ of an erroneous DFDL schema that will ultimately
> > send data end-to-end, but because the two [gateway] ends do not communicate
> > directly w/ each other, there is no way for the destination gateway to
> > verify that its data is identical w/ the data received by the source
> > gateway.
> >
> > To address the above, and perhaps along the lines of 'checksum
> > calculations' re: IPV4 element, what is the collective opinion of having a
> > SHASUM capability added to Daffodil, allowing the parser to optionally
> > ("invisibly") incorporate a SHASUM in the intermediate XML file so that the
> > destination unparser can validate the reconstituted data against the
> > incorporated SHASUM?
> >
> > Perhaps a lame suggestion: could Daffodil optionally insert a comment tag
> > while parsing, identifying it as a Daffodil-inserted shasum comment, which
> > the unparser can identify and use to validate the reconstituted data?
> >
> > Thx in advance,
> >
> > v/r
> >
> > Attila
> >
> >
> 
> 
