Re: Keep compiled C code or throw it away?

Beckerle, Mike Mon, 05 Oct 2020 14:03:17 -0700

There are 3 different kinds of code in Daffodil:

1) static code humans write - compiler code, runtime code, and test rig 
code. This includes Scala, Java, TDML, and soon enough, C code.

2) code that is generated and becomes part of daffodil itself. This is 
generated by code in the daffodil-propgen library, which creates src-managed 
and resource-managed code and resources in daffodil-lib.

The above are taken care of by SBT, whether scala, java, or C code.

None of the above has anything to do with a DFDL schema created by a user.

3) The C-code generator creates C code from a user's schema.

I would expect that generator to perhaps lay down not just the C code, but 
make/build files so the user can build and run their code stand-alone.

But I think of this as 100% separate from daffodil's build.sbt build system. It 
could use sbt even, but it's not daffodil's build.

The place where things get confusing is that in order to test (3) above, we 
need to incorporate generating, compiling, linking, and running the generated C 
code into daffodil's build, for testing purposes.

So I think as a part of daffodil's build, analogous to how daffodil-propgen 
puts Scala code into daffodil-lib/src-managed/scala/... the C-code generator's 
src/test/scala code can be used to put C code into 
daffodil-runtime2/test-managed/C/....

That test-managed/C code would only be for test, but sbt would see it there and 
compile it almost as if it were hand-written C code.

This would work quite like daffodil-propgen then. Just at test-compile time, 
not regular compile time.
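
To make the idea concrete, here is a rough build.sbt sketch of a generator step that runs only before test compilation. It is not Daffodil's actual build: the task name genTestC, the project value runtime2, the file name generated_example.c, and the placeholder C content are all hypothetical, and it assumes sbt 1.x slash syntax.

```scala
// build.sbt fragment: a sketch only; task, project, and file names are hypothetical.

// Hypothetical task that writes generated C sources used only by tests.
lazy val genTestC = taskKey[Seq[File]]("Generate C sources used only by tests")

lazy val runtime2 = (project in file("daffodil-runtime2"))
  .settings(
    genTestC := {
      // Assumed output location, mirroring the test-managed/C idea above.
      val outDir = baseDirectory.value / "test-managed" / "C"
      val outFile = outDir / "generated_example.c"
      // Placeholder content; a real task would invoke the C-code generator here.
      IO.write(outFile, "int generated_example(void) { return 0; }\n")
      Seq(outFile)
    },
    // Hook the generator in front of test compilation, not regular compilation.
    Test / compile := (Test / compile).dependsOn(genTestC).value
  )
```

Compiling and linking the generated C files would still need a C compiler plugin or a further task; this fragment only shows where the generation step could attach.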

Does that make sense?

________________________________
From: Interrante, John A (GE Research, US) <[email protected]>
Sent: Monday, October 5, 2020 2:11 PM
To: [email protected] <[email protected]>
Subject: Keep compiled C code or throw it away?

The timing of when to compile the C source files that we will be adding to the 
Daffodil source tree is another topic I would like to discuss on the dev list.  
I am using an sbt C compiler plugin in my runtime2 pull request to allow 
Daffodil's sbt build to compile C source files as well as Scala source files.  
We would have to include both the libraries built by the C compiler (there 
would be several, not just one, as Mike pointed out) and some corresponding C 
header/source files in a Daffodil distribution and/or the output directory of a 
"daffodil generate C" command.

The current discussion in the pull request is now wavering between:

  1) Build the C libraries and distribute them with daffodil in its 
daffodil/include and daffodil/lib directories
  2) Build the C libraries, put them along with source files in a jar, and 
distribute the jar with Daffodil
  3) Put just the C source files in a jar and distribute the jar with Daffodil; 
the "daffodil generate C" and "daffodil test <.tdml>" commands will compile 
and/or execute the C files on the fly

The question comes down to this: what is the best time to build the C source 
files?

  - Before distribution: This allows us to verify that C source files build and 
we can test them before we distribute them
  - After distribution: We simplify the sbt build and don't need to build 
multiple daffodil distributions for different platforms

Are there other choices too?  Actually, I think we need to do BOTH.  We can fix 
compilation errors more quickly if we can build C source files immediately after 
editing them.  We also need to test the C code by running TDML tests every time 
we run sbt test or sbt c-generator/test, which implies we need to build the C 
source files before distribution as well as after distribution.  However, 
throwing away the C-code libraries during distribution time does mean that we 
need to compile 50K lines of C code possibly multiple times or cache built C 
libraries somewhere in order to improve the user's experience.

So the question really is this - do we want to throw away the compiled 
libraries (".a" files) and distribute only the C source code in 
platform-independent jars, or distribute compiled machine binary files along 
with the C source files in or with the platform-independent jars?

-----Original Message-----
From: Steve Lawrence <[email protected]>
Sent: Monday, October 5, 2020 10:49 AM
To: [email protected]
Subject: EXT: Re: Subproject names proposed for discussion

A handful of unrelated thoughts, maybe overthinking things and I don't feel 
strongly about anything below, but renaming is always a pain so it'd be nice to 
ensure we have something future-proof.

1) Is there any benefit organizationally to having all backends being in the 
same directory?

2) From a sorting perspective, it'd be nice if the scala projects were 
together, so having it be scala-parser and scala-unparser rather than 
parser-scala and unparser-scala has advantages.

3) Maybe the scala parser/unparser should be considered the same "scala"
runtime, and so parser/unparser should be subdirectories of a 
"daffodil-backend-scala" subdirectory?

4) Is there even a benefit to separating parser/unparser into separate jars? 
There's so much shared logic between the two, and there's even a bunch of 
unparsing stuff in the parser jar. Should we just combine them under the same 
backend?

Taking all of the above into account, perhaps something like this:

...
|-- daffodil-backends
|   |-- daffodil-scala
|   |   `-- src
|   `-- daffodil-generator-c
|       `-- src
|-- daffodil-lib
|   `-- src
|-- daffodil-schema-compiler
|   `-- src
...

5) Is there something better than "backend" for describing these? I can't think 
of anything. Does the DFDL spec have a concept of this?

6) Are there any benefits to using "codenames"? My thinking is maybe someday 
there could be multiple "scala" backends with different goals/extensions, and 
so "daffodil-scala" is too generic. Codenames would be more like what we have 
today, except real code names might be easier to remember than "runtime1" and 
"runtime2". Disadvantage is there's less discoverability, but a README could be 
added with short descriptions about what the backends try to accomplish. Not 
sure I like this, but thought I'd throw it out there.

On 10/5/20 10:23 AM, Beckerle, Mike wrote:
> +1 from me.
>
> ________________________________
> From: Interrante, John A (GE Research, US) <[email protected]>
> Sent: Monday, October 5, 2020 9:28 AM
> To: [email protected] <[email protected]>
> Subject: Subproject names proposed for discussion
>
> Steve Lawrence and I would like to bring a topic to the dev list for 
> discussion since not everyone is paying attention to the review of my 
> runtime2 pull request.  Steve suggested, and I agree, that renaming some of 
> the Daffodil subprojects might make their meanings more obvious to newcomer 
> devs.  If we do rename some subprojects after discussing it on this list, we 
> will do it immediately in its own pull request since mixing changes with 
> renames makes it difficult to see which changes are just renames instead of 
> actual changes.
>
> What do devs think about us renaming some subprojects like this?
>
>     rename daffodil-core to daffodil-schema-compiler
>     leave daffodil-lib alone
>     rename daffodil-runtime1 to daffodil-backend-parser-scala
>     rename daffodil-runtime1-unparser to daffodil-backend-unparser-scala
>     rename daffodil-runtime2 to daffodil-backend-generator-c
>
>
