My thoughts:

Currently we keep the C source files in daffodil-runtime2/src/main/resources/c. 
 I'm fine with us keeping the c directory while splitting and moving the C 
source files into two new subdirectories called "libruntime2" and 
"libruntime2cli".  I want to keep the "c" directory because it gives us one 
place to find all the C source files in Daffodil's classpath resources and 
unpack them from the jar file into a corresponding "<outdir>/c" directory.

We had a discussion on this list about renaming the daffodil subprojects to 
make them easier for newcomers to understand than "runtime1" and "runtime2" 
(such as daffodil-backend-scala and daffodil-backend-c-generator) so we might 
want to make these subdirectories' names simpler too (like "libruntime" and 
"libcli" since they'll be nested under "c" anyway).

Immediately we would have to change CodeGenerator.compileCode so it uses two -I 
options (one for each subdirectory) instead of one -I option for the c directory 
itself (note we run the C compiler within "<outdir>" instead of "<outdir>/c" in 
order to make room for a cache directory that "zig cc" might create).  Later we 
might want to make CodeGenerator.compileCode build two static libraries and 
cache them somewhere if the number of files grows much larger.  Right now 
CodeGenerator.compileCode has simple clean code and runs quickly (and even more 
quickly with zig's caching which Daffodil takes advantage of while running TDML 
tests).
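
To make the -I change concrete, here is a rough Python sketch of how the 
compiler command line might be assembled.  It is only an illustration, not the 
Scala code in CodeGenerator.compileCode: the function name, the "daffodil" 
output name, and the example source path are assumptions, the subdirectory 
names are the simpler ones floated above, and it leaves out any warning or 
library flags the real build passes.

```python
def build_compile_command(compiler, sources,
                          include_dirs=("c/libruntime", "c/libcli"),
                          exe="daffodil"):
    """Return an argv list meant to be run with <outdir> as the working directory.

    compiler     -- argv prefix, e.g. ["zig", "cc"] or ["gcc"]
    sources      -- C source paths relative to <outdir>
    include_dirs -- hypothetical subdirectory names; one -I option is emitted per entry
    exe          -- hypothetical name of the executable to produce
    """
    cmd = list(compiler)
    for inc in include_dirs:
        # One -I per subdirectory replaces the single "-I c" option.
        cmd += ["-I", inc]
    cmd += list(sources)
    cmd += ["-o", exe]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_compile_command(["zig", "cc"],
                                         ["c/libruntime/infoset.c"])))
```

All paths stay relative to "<outdir>", so a cache directory created by 
"zig cc" would still land next to "c" rather than inside it.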

Note that I recommend people download a zig tarball 
(https://ziglang.org/download/) and put the zig executable on their PATH in 
order to allow compileCode to delegate caching of compiled files to zig 
(https://andrewkelley.me/post/zig-cc-powerful-drop-in-replacement-gcc-clang.html).
  The caching will speed up running a large suite of TDML tests, we won't have 
to cache a static library somewhere, and zig will invalidate compiled files 
intelligently whenever we change C source or header files.  The hard part of 
caching is cache invalidation, not cache filling.  FYI, 
CodeGenerator.pickCompiler looks for any available C compiler on the system in 
this order (the first executable actually found in PATH wins):

   - value of env var `CC` if there is one
   - zig cc
   - gcc
   - clang
   - cc
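
The lookup order above can be sketched in a few lines of Python.  Again this 
is only an illustration of the search order, not the Scala code in 
CodeGenerator.pickCompiler: the function name, its parameters, and the choice 
to split `CC` into words (so that it may carry extra arguments) are my 
assumptions.

```python
import os
import shlex
import shutil


def pick_compiler(environ=None, which=shutil.which):
    """Return the argv prefix of the first C compiler found in PATH, or None.

    environ -- mapping to read CC from (defaults to os.environ)
    which   -- lookup function; the default shutil.which searches the real PATH
    """
    environ = os.environ if environ is None else environ
    candidates = []
    cc = environ.get("CC")
    if cc:
        # Assumption: CC may hold a command plus arguments, e.g. "zig cc".
        candidates.append(shlex.split(cc))
    candidates += [["zig", "cc"], ["gcc"], ["clang"], ["cc"]]
    for argv in candidates:
        # Only the executable (first word) has to exist in PATH.
        if argv and which(argv[0]) is not None:
            return argv
    return None


if __name__ == "__main__":
    print(pick_compiler())
```

In this sketch a `CC` value whose executable is not in PATH is skipped and the 
search continues with zig, gcc, clang, and cc.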

New topic: when would be a good time to rename the daffodil subprojects as we 
previously discussed?  See https://issues.apache.org/jira/browse/DAFFODIL-2406. 
 We don't have to do the renaming anytime soon; it's orthogonal and can wait 
until closer to the time we merge the new backend into Daffodil's main branch.

John

From: Beckerle, Mike <mbecke...@owlcyberdefense.com>
Sent: Monday, November 30, 2020 7:19 PM
To: dev@daffodil.apache.org
Subject: EXT: thoughts on runtime2-2202

I'd like to split the C code into (a) essential runtime files (b) optional/test 
runtime files.

Of the list of current files:

daffodil_argp.c
daffodil_argp.h
daffodil_main.c
infoset.c
infoset.h
stack.c
stack.h
xml_reader.c
xml_reader.h
xml_writer.c
xml_writer.h

I believe the infoset.c/h are "essential" in that 99% of all applications using 
C-generated code will want what is in them.

The other files provide a command line allowing one to request parse/unparse 
behavior and interaction with an XML representation. These are "artifacts" of 
interacting seamlessly (and frankly, wonderfully) with the Daffodil test 
infrastructure based on TDML.

While users might want to couple this C-based code generation with XML, I think 
there's some natural tension between the ultra-lightweight nature of C-code and 
the expensive verbosity of XML. Certainly, some people will want to use the 
C-code and almost nothing else. Just parse data and fill in C-structures, and 
equivalent unparse maybe.

So while all this code is small by modern standards, I think infoset.c/h should 
end up in libruntime2 (essential for applications), and the remainder 
can/should end up in libruntime2cli which is 100% optional for applications.

I expect both libraries to grow substantially over time as more runtime2 
functionality is filled in. That's why I think we should separate them now: to 
clarify what goes where, to keep test infrastructure isolated so it doesn't 
"leak" into the essential runtime, and so on.

Thoughts?



Mike Beckerle | Principal Engineer
P +1-781-330-0412
W owlcyberdefense.com