My thoughts: Currently we keep the C source files in daffodil-runtime2/src/main/resources/c. I'm fine with us keeping the c directory while splitting and moving the C source files into two new subdirectories called "libruntime2" and "libruntime2cli". I want to keep the "c" directory because it gives us one place to find all the C source files in Daffodil's classpath resources and unpack them from the jar file into a corresponding "<outdir>/c" directory.
We had a discussion on this list about renaming the daffodil subprojects to make them easier for newcomers to understand than "runtime1" and "runtime2" (such as daffodil-backend-scala and daffodil-backend-c-generator), so we might want to make these subdirectories' names simpler too (like "libruntime" and "libcli", since they'll be nested under "c" anyway).

Immediately, we would have to change CodeGenerator.compileCode so it uses two -I options, one for each subdirectory, instead of one -I option for the c directory itself (note we run the C compiler within "<outdir>" instead of "<outdir>/c" in order to make room for a cache directory that "zig cc" might create). Later we might want to make CodeGenerator.compileCode build two static libraries and cache them somewhere if the number of files grows much larger. Right now CodeGenerator.compileCode has simple, clean code and runs quickly (and even more quickly with zig's caching, which Daffodil takes advantage of while running TDML tests).

Note that I recommend people download a zig tarball (https://ziglang.org/download/) and put the zig executable on their PATH in order to allow compileCode to delegate caching of compiled files to zig (https://andrewkelley.me/post/zig-cc-powerful-drop-in-replacement-gcc-clang.html). The caching will speed up running a large suite of TDML tests, we won't have to cache a static library somewhere, and zig will invalidate compiled files intelligently whenever we change C source or header files. The hard part of caching is cache invalidation, not cache filling.

FYI, CodeGenerator.pickCompiler looks for any available C compiler on the system in this order (the first executable actually found in PATH wins):

- value of env var `CC` if there is one
- zig cc
- gcc
- clang
- cc

New topic: when would be a good time to rename the daffodil subprojects as we previously discussed? See https://issues.apache.org/jira/browse/DAFFODIL-2406.
We don't have to do the renaming anytime soon; it's orthogonal and can wait until closer to the time we merge the new backend into Daffodil's main branch.

John

From: Beckerle, Mike <mbecke...@owlcyberdefense.com>
Sent: Monday, November 30, 2020 7:19 PM
To: dev@daffodil.apache.org
Subject: EXT: thoughts on runtime2-2202

I'd like to split the C code into (a) essential runtime files and (b) optional/test runtime files.

Of the list of current files:

daffodil_argp.c
daffodil_argp.h
daffodil_main.c
infoset.c
infoset.h
stack.c
stack.h
xml_reader.c
xml_reader.h
xml_writer.c
xml_writer.h

I believe the infoset.c/h are "essential" in that 99% of all applications using C-generated code will want what is in them.

The other files provide a command line allowing one to request parse/unparse behavior and interaction with an XML representation. These are "artifacts" of interacting seamlessly (and frankly, wonderfully) with the Daffodil test infrastructure based on TDML.

While users might want to couple this C-based code generation with XML, I think there's some natural tension between the ultra-lightweight nature of C-code and the expensive verbosity of XML. Certainly, some people will want to use the C-code and almost nothing else. Just parse data and fill in C-structures, and equivalent unparse maybe.

So while all this code is small by modern standards, I think infoset.c/h should end up in libruntime2 (essential for applications), and the remainder can/should end up in libruntime2cli, which is 100% optional for applications.

I expect both libraries to grow substantially over time as more runtime2 functionality is filled in. That's why I think we should separate them now: to clarify what goes where, keep test infrastructure isolated so it doesn't "leak" into the essential runtime, etc.

Thoughts?

Mike Beckerle | Principal Engineer
OWL Cyber Defense
P +1-781-330-0412
W owlcyberdefense.com <http://www.owlcyberdefense.com>