Interesting. The fact that they call it safe "Documents" suggests that they are 
worried about the file formats people typically use/create.

Note that DFDL cannot express most such formats due to not having recursion.

This was 100% intentional. It keeps DFDL not-turing-complete, which is 
theoretically good from a safety/security point of view.

However, people have requested adding recursion to DFDL as a feature one can 
turn on/off so that one could describe "document" style formats using DFDL. 
There's real hard work to changing daffodil to allow that. We'd like to have it 
as an experimental DFDL feature someday.

________________________________
From: Costello, Roger L. <[email protected]>
Sent: Wednesday, September 4, 2019 12:48 PM
To: [email protected] <[email protected]>
Subject: Are you familiar with Safe Documents (SafeDocs)?


Today, code for input data validation is typically written manually in an 
ad-hoc manner. For commonly-used electronic data formats, input validation is, 
at a minimum, a problem of scale whereby specifications of these formats 
comprise hundreds to thousands of pages. Input validation thus translates to 
thousands or more conditions to be checked against the input data before the 
data can be safely processed. Manually writing the code to parse and validate 
input, and then manually auditing whether that code implements all the 
necessary checks completely and correctly, does not scale. Moreover, manual 
parser coding and auditing typically fails even for electronic data formats 
specifically designed to be easier to perform such tasks, e.g., JSON and XML. A 
variety of critical vulnerabilities have been found in major parser 
implementations for these formats.



Widely deployed mitigations against crafted input attacks include (a) trying to 
prevent the flow of untrusted data to vulnerable software; and (b) testing 
software with randomized inputs to find and patch flaws that could be triggered 
by maliciously created inputs. Unfortunately, neither of these approaches offer 
security assurance guarantees.



Mitigations for preventing the flow of untrusted data to vulnerable software, 
which can be implemented via network or host-based measures such as firewalls, 
application proxies, antivirus scanners, etc., neither remove the underlying 
vulnerability from the target, nor encode complete knowledge of document or 
message format internals. Attacker bypasses of such mitigations exploit 
incompleteness of the mitigations' understanding of the data format to exploit 
the still-vulnerable targets.



The Safe Documents (SafeDocs) program will develop novel verified programming 
methodologies for building high assurance parsers for extant electronic data 
formats, and novel methodologies for comprehending, simplifying, and reducing 
these formats to their safe, unambiguous, verification-friendly subsets (“safe 
sub-setting”). SafeDocs will address the ambiguity and complexity obstacles 
that hinder the application of verified programming posed by extant electronic 
data formats. SafeDocs’ multi-pronged approach will combine:



  *   extracting the extant formats’ de facto syntax (including any 
non-compliant syntax deliberately accepted and substantially used in the wild);
  *   identifying a syntactically simpler subset of this syntax that yields 
itself to use in verified programming while preserving the format's essential 
functionality; and
  *   creating software construction kits for building secure, verified parsers 
for this syntactically simpler subset, and high-assurance translators for 
converting extant instances of the format to this subset.



The parser construction kits developed by SafeDocs will be usable by industry 
programmers who understand the syntax of electronic data formats but lack the 
theoretical background in verified programming. These tools will enable 
developers to construct verifiable parsers for new electronic data formats as 
well as extant ones. The tools will guide the syntactic design of new formats, 
by making verification-friendly format syntax easy to express, and vice versa.



https://www.darpa.mil/program/safe-documents

Reply via email to