Keep compiled C code or throw it away?

Interrante, John A (GE Research, US) Mon, 05 Oct 2020 11:12:17 -0700

The timing of when to compile the C source files that we will be adding to the
Daffodil source tree is another topic I would like to discuss on the dev list.
I am using an sbt C compiler plugin in my runtime2 pull request to allow
Daffodil's sbt build to compile C source files as well as Scala source files.
We would have to include both the libraries built by the C compiler (there
would be several, not just one, as Mike pointed out) and some corresponding C
header/source files in a Daffodil distribution and/or the output directory of a
"daffodil generate C" command.

The current discussion in the pull request is now wavering between:

  1) Build the C libraries and distribute them with Daffodil in its
daffodil/include and daffodil/lib directories
  2) Build the C libraries, put them along with source files in a jar, and
distribute the jar with Daffodil
  3) Put just the C source files in a jar and distribute the jar with Daffodil;
the "daffodil generate C" and "daffodil test <.tdml>" commands will compile
and/or execute the C files on the fly

The question comes down to this: what is the best time to build the C source 
files?  

  - Before distribution: This lets us verify that the C source files build, and
test them, before we distribute them
  - After distribution: We simplify the sbt build and don't need to build
multiple Daffodil distributions for different platforms

Are there other choices too?  Actually, I think we need to do BOTH.  We can fix
compilation errors more quickly if we can build C source files immediately
after editing them.  We also need to test the C code by running TDML tests
every time we run sbt test or sbt c-generator/test, which implies we need to
build the C source files before distribution as well as after distribution.
However, throwing away the compiled C libraries at distribution time does mean
that we need to either compile 50K lines of C code, possibly multiple times, or
cache built C libraries somewhere in order to improve the user's experience.
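
To make the "cache built C libraries somewhere" idea concrete, here is a
minimal sketch of one way a cache lookup could work.  It is only an
illustration under stated assumptions, not something in the pull request: the
function names (cache_key, cached_library), the cache directory layout, and the
library name libruntime.a are all hypothetical, and compiler_id is assumed to
be a string that identifies both the compiler and the target platform.

```python
import hashlib
from pathlib import Path
from typing import List, Optional


def cache_key(source_dir: Path, compiler_id: str, flags: List[str]) -> str:
    """Hash the compiler identity, the flags, and every .c/.h file.

    Any change to a source file, the compiler, or the flags yields a
    different key, so a stale library is never picked up by mistake.
    """
    h = hashlib.sha256()
    h.update(compiler_id.encode())
    h.update(b"\0")
    h.update(" ".join(flags).encode())
    # Sort the paths so the key does not depend on directory listing order.
    for path in sorted(source_dir.rglob("*")):
        if path.is_file() and path.suffix in {".c", ".h"}:
            h.update(b"\0")
            h.update(path.relative_to(source_dir).as_posix().encode())
            h.update(b"\0")
            h.update(path.read_bytes())
    return h.hexdigest()


def cached_library(cache_root: Path, key: str, lib_name: str) -> Optional[Path]:
    """Return the cached library for this key, or None if it must be built."""
    candidate = cache_root / key / lib_name
    return candidate if candidate.is_file() else None
```

With something like this, the C sources would only be recompiled when
cached_library returns None, and the freshly built ".a" file would then be
copied into the directory named by the key.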

So the question really is this - do we want to throw away the compiled 
libraries (".a" files) and distribute only the C source code in 
platform-independent jars, or distribute compiled machine binary files along 
with the C source files in or with the platform-independent jars?

-----Original Message-----
From: Steve Lawrence <[email protected]> 
Sent: Monday, October 5, 2020 10:49 AM
To: [email protected]
Subject: EXT: Re: Subproject names proposed for discussion

A handful of unrelated thoughts, maybe overthinking things and I don't feel
strongly about anything below, but renaming is always a pain so it'd be nice to
ensure we have something future-proof.

1) Is there any benefit organizationally to having all backends being in the 
same directory?

2) From a sorting perspective, it'd be nice if the scala projects were 
together, so having it be scala-parser and scala-unparser rather than 
parser-scala and unparser-scala has advantages.

3) Maybe the scala parser/unparser should be considered the same "scala"
runtime, and so parser/unparser should be subdirectories of a 
"daffodil-backend-scala" subdirectory?

4) Is there even a benefit to separating parser/unparser into separate jars? 
There's so much shared logic between the two, and there's even a bunch of 
unparsing stuff in the parser jar. Should we just combine them under the same 
backend?

Taking all of the above into account, perhaps something like this:

...
|-- daffodil-backends
|   |-- daffodil-scala
|   |   `-- src
|   `-- daffodil-generator-c
|       `-- src
|-- daffodil-lib
|   `-- src
|-- daffodil-schema-compiler
|   `-- src
...

5) Is there something better than "backend" for describing these? I can't think
of anything. Does the DFDL spec have a concept of this?

6) Are there any benefits to using "codenames"? My thinking is maybe someday
there could be multiple "scala" backends with different goals/extensions, and
so "daffodil-scala" is too generic. Codenames would be more like what we have
today, except real codenames might be easier to remember than "runtime1" and
"runtime2". The disadvantage is there's less discoverability, but a README
could be added with short descriptions of what the backends try to accomplish.
Not sure I like this, but thought I'd throw it out there.

On 10/5/20 10:23 AM, Beckerle, Mike wrote:
> +1 from me.
> 
> ________________________________
> From: Interrante, John A (GE Research, US) <[email protected]>
> Sent: Monday, October 5, 2020 9:28 AM
> To: [email protected] <[email protected]>
> Subject: Subproject names proposed for discussion
> 
> Steve Lawrence and I would like to bring a topic to the dev list for 
> discussion since not everyone is paying attention to the review of my 
> runtime2 pull request.  Steve suggested, and I agree, that renaming some of 
> the Daffodil subprojects might make their meanings more obvious to newcomer 
> devs.  If we do rename some subprojects after discussing it on this list, we 
> will do it immediately in its own pull request since mixing changes with 
> renames makes it difficult to see which changes are just renames instead of 
> actual changes.
> 
> What do devs think about us renaming some subprojects like this?
> 
>     rename daffodil-core to daffodil-schema-compiler
>     leave daffodil-lib alone
>     rename daffodil-runtime1 to daffodil-backend-parser-scala
>     rename daffodil-runtime1-unparser to daffodil-backend-unparser-scala
>     rename daffodil-runtime2 to daffodil-backend-generator-c
> 
>
