Understood. We only plan to include .c/.h files in the source release. If we do distribute a precompiled binary, it would only be in a convenience jar/zip/tar. Though it sounds like we're leaning towards not distributing any pre-compiled code at all; it would always be compiled on the target machine, with the build configuration providing a simple way to build the library for dev testing purposes.
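For the dev-testing piece, something as small as the sbt task sketched below might be enough. This is only a rough illustration of the idea, not the actual sbt-cc plugin API; the task name, the test-managed/c location, and the direct gcc/ar invocations are all assumptions:

import sbt._
import Keys._
import scala.sys.process._

// Hypothetical dev-only task: compile whatever C code the generator wrote
// under test-managed/c into a single static library we can link tests against.
lazy val buildRuntime2Lib = taskKey[File]("Build the generated C code as a static library for dev testing")

buildRuntime2Lib := {
  val srcDir = baseDirectory.value / "test-managed" / "c"   // assumed output dir of the C generator
  val objDir = target.value / "c-objs"
  IO.createDirectory(objDir)
  // Compile each .c file to an object file (assumes gcc is on the PATH).
  val objs = (srcDir ** "*.c").get.map { c =>
    val obj = objDir / (c.base + ".o")
    Seq("gcc", "-c", "-I", srcDir.getAbsolutePath, c.getAbsolutePath, "-o", obj.getAbsolutePath).!!
    obj
  }
  // Archive the objects into libruntime2.a for the TDML tests to use.
  val lib = objDir / "libruntime2.a"
  (Seq("ar", "rcs", lib.getAbsolutePath) ++ objs.map(_.getAbsolutePath)).!!
  lib
}

In a real build we'd presumably let the sbt-cc plugin (or make/build files emitted by the generator) handle this; a task like that just keeps the edit-compile-test loop quick while working on the generator itself.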
On 10/6/20 12:45 PM, Dave Fisher wrote:
> However the community decides, Apache open source releases must not include binary code.
>
> Binary convenience releases made by the community can be made in addition to the source release.
>
> Sent from my iPhone
>
>> On Oct 5, 2020, at 2:26 PM, Beckerle, Mike <mbecke...@owlcyberdefense.com> wrote:
>>
>> re: new TDMLDFDLProcessor is needed. This exists and I think works as you have mentioned.
>>
>> ________________________________
>> From: Steve Lawrence <slawre...@apache.org>
>> Sent: Monday, October 5, 2020 5:12 PM
>> To: dev@daffodil.apache.org <dev@daffodil.apache.org>
>> Subject: Re: Keep compiled C code or throw it away?
>>
>>> This would work quite like daffodil-propgen then. Just at test-compile time, not regular compile time.
>>
>> That requires an sbt source/resource generator, which means it depends on the sbt configuration in order to test. That might make testing with other IDEs more difficult. It also means something like the "daffodil test" CLI command couldn't work, since that doesn't use sbt.
>>
>> What if we just create a new TDMLDFDLProcessor that is specific to the new C generator backend? This new TDMLDFDLProcessor can generate code based on the schema being tested, compile the schemas (using caching where possible), execute whatever is compiled to parse/unparse, capture the result, and return it as a Parse/UnparseResult that the TDMLDaffodilProcessor can use. This TDMLDFDLProcessor essentially mimics how a normal user would use it, just like the current TDMLDFDLProcessor does.
>>
>> This is analogous to how the IBM DFDL implementation works. This TDMLDFDLProcessor just happens to use the same Daffodil frontend, but with a different Daffodil backend.
>>
>>
>>> On 10/5/20 5:02 PM, Beckerle, Mike wrote:
>>> There are 3 different kinds of code in Daffodil:
>>>
>>> 1) Static code humans write - compiler code, runtime code, and test rig code. This includes Scala, Java, TDML, and soon enough, C code.
>>>
>>> 2) Code that is generated and becomes part of Daffodil itself. This is generated by code in the daffodil-propgen library and creates src-managed and resource-managed code and resources in daffodil-lib.
>>>
>>> The above are taken care of by SBT, whether Scala, Java, or C code.
>>>
>>> None of the above has anything to do with a DFDL schema created by a user.
>>>
>>> 3) The C-code generator creates C code from a user's schema.
>>>
>>> I would expect that generator to lay down not just the C code, but perhaps also make/build files, so the user can build and run their code stand-alone.
>>>
>>> But I think of this as 100% separate from Daffodil's build.sbt build system. It could even use sbt, but it's not Daffodil's build.
>>>
>>> The place where things get confusing is that in order to test (3) above, we need to incorporate generating, compiling, linking, and running the generated C code into Daffodil's build, for testing purposes.
>>>
>>> So I think, as part of Daffodil's build, analogous to how daffodil-propgen puts Scala code into daffodil-lib/src-managed/scala/..., the C-code generator's src/test/scala code can be used to put C code into daffodil-runtime2/test-managed/C/....
>>>
>>> That test-managed/C code would only be for test, but sbt would see it there and compile it almost as if it were hand-written C code.
>>>
>>> This would work quite like daffodil-propgen then. Just at test-compile time, not regular compile time.
>>>
>>> Does that make sense?
>>>
>>>
>>> ________________________________
>>> From: Interrante, John A (GE Research, US) <inter...@research.ge.com>
>>> Sent: Monday, October 5, 2020 2:11 PM
>>> To: dev@daffodil.apache.org <dev@daffodil.apache.org>
>>> Subject: Keep compiled C code or throw it away?
>>>
>>> The timing of when to compile the C source files that we will be adding to the Daffodil source tree is another topic I would like to discuss on the dev list. I am using an sbt C compiler plugin in my runtime2 pull request to allow Daffodil's sbt build to compile C source files as well as Scala source files. We would have to include both the libraries built by the C compiler (there would be several, not just one, as Mike pointed out) and some corresponding C header/source files in a Daffodil distribution and/or the output directory of a "daffodil generate C" command.
>>>
>>> The current discussion in the pull request is now wavering between:
>>>
>>> 1) Build the C libraries and distribute them with Daffodil in its daffodil/include and daffodil/lib directories
>>> 2) Build the C libraries, put them along with source files in a jar, and distribute the jar with Daffodil
>>> 3) Put just the C source files in a jar and distribute the jar with Daffodil; the "daffodil generate C" and "daffodil test <.tdml>" commands will compile and/or execute the C files on the fly
>>>
>>> The question comes down to this: what is the best time to build the C source files?
>>>
>>> - Before distribution: This allows us to verify that the C source files build, and we can test them before we distribute them
>>> - After distribution: We simplify the sbt build and don't need to build multiple Daffodil distributions for different platforms
>>>
>>> Are there other choices too? Actually, I think we need to do BOTH. We can fix compilation errors more quickly if we can build C source files immediately after editing them. We also need to test the C code by running TDML tests every time we run sbt test or sbt c-generator/test, which implies we need to build the C source files before distribution as well as after distribution. However, throwing away the C-code libraries at distribution time does mean that we need to compile 50K lines of C code possibly multiple times, or cache built C libraries somewhere, in order to improve the user's experience.
>>>
>>> So the question really is this: do we want to throw away the compiled libraries (".a" files) and distribute only the C source code in platform-independent jars, or distribute compiled machine binary files along with the C source files in or with the platform-independent jars?
>>>
>>> -----Original Message-----
>>> From: Steve Lawrence <slawre...@apache.org>
>>> Sent: Monday, October 5, 2020 10:49 AM
>>> To: dev@daffodil.apache.org
>>> Subject: EXT: Re: Subproject names proposed for discussion
>>>
>>> A handful of unrelated thoughts; maybe I'm overthinking things, and I don't feel strongly about anything below, but renaming is always a pain, so it'd be nice to ensure we have something future proof.
>>>
>>> 1) Is there any benefit organizationally to having all backends be in the same directory?
>>>
>>> 2) From a sorting perspective, it'd be nice if the Scala projects were together, so having them be scala-parser and scala-unparser rather than parser-scala and unparser-scala has advantages.
>>>
>>> 3) Maybe the Scala parser/unparser should be considered the same "scala" runtime, and so parser/unparser should be subdirectories of a "daffodil-backend-scala" subdirectory?
>>>
>>> 4) Is there even a benefit to separating parser/unparser into separate jars? There's so much shared logic between the two, and there's even a bunch of unparsing stuff in the parser jar. Should we just combine them under the same backend?
>>>
>>> Taking all of the above into account, perhaps something like this:
>>>
>>> ...
>>> |-- daffodil-backends
>>> |   |-- daffodil-scala
>>> |   |   `-- src
>>> |   `-- daffodil-generator-c
>>> |       `-- src
>>> |-- daffodil-lib
>>> |   `-- src
>>> |-- daffodil-schema-compiler
>>> |   `-- src
>>> ...
>>>
>>> 5) Is there something better than "backend" for describing these? I can't think of anything. Does the DFDL spec have a concept of this?
>>>
>>> 6) Are there any benefits to using "codenames"? My thinking is that maybe someday there could be multiple "scala" backends with different goals/extensions, and so "daffodil-scala" is too generic. Codenames would be more like what we have today, except real code names might be easier to remember than "runtime1" and "runtime2". The disadvantage is that there's less discoverability, but a README could be added with short descriptions of what the backends try to accomplish. Not sure I like this, but I thought I'd throw it out there.
>>>
>>>
>>>
>>>> On 10/5/20 10:23 AM, Beckerle, Mike wrote:
>>>> +1 from me.
>>>>
>>>> ________________________________
>>>> From: Interrante, John A (GE Research, US) <inter...@research.ge.com>
>>>> Sent: Monday, October 5, 2020 9:28 AM
>>>> To: dev@daffodil.apache.org <dev@daffodil.apache.org>
>>>> Subject: Subproject names proposed for discussion
>>>>
>>>> Steve Lawrence and I would like to bring a topic to the dev list for discussion, since not everyone is paying attention to the review of my runtime2 pull request. Steve suggested, and I agree, that renaming some of the Daffodil subprojects might make their meanings more obvious to newcomer devs. If we do rename some subprojects after discussing it on this list, we will do it immediately in its own pull request, since mixing changes with renames makes it difficult to see which changes are just renames instead of actual changes.
>>>>
>>>> What do devs think about us renaming some subprojects like this?
>>>>
>>>> rename daffodil-core to daffodil-schema-compiler
>>>> leave daffodil-lib alone
>>>> rename daffodil-runtime1 to daffodil-backend-parser-scala
>>>> rename daffodil-runtime1-unparser to daffodil-backend-unparser-scala
>>>> rename daffodil-runtime2 to daffodil-backend-generator-c
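Coming back to Steve's suggestion upthread of a TDMLDFDLProcessor specific to the C generator backend, a very rough sketch of the shape it might take is below. The class and method names (Runtime2TDMLDFDLProcessor, generateCode, the make-based build, the presence-of-executable caching) are guesses for illustration only, not the real org.apache.daffodil.tdml.processor API:

import java.nio.file.{ Files, Path }
import scala.sys.process._

// Sketch only: the real TDML processor trait has its own methods and result
// types; this just shows the generate/compile/run-as-a-user flow.
class Runtime2TDMLDFDLProcessor(schema: Path, workDir: Path) {

  // Generate C code from the schema and compile it, reusing a previous build
  // when the executable is already present (a stand-in for real caching).
  private lazy val executable: Path = {
    val codeDir = workDir.resolve("c")
    generateCode(schema, codeDir)
    val exe = codeDir.resolve("daffodil_parse")
    if (!Files.exists(exe)) {
      Process(Seq("make", "-C", codeDir.toString)).!!
    }
    exe
  }

  // Parse by running the compiled program exactly as an end user would, then
  // hand the captured infoset and exit code back for the TDML runner to compare.
  def parse(data: Path): (Int, String) = {
    val out = new StringBuilder
    val rc = Process(Seq(executable.toString, "parse", data.toString)) ! ProcessLogger(line => out.append(line).append('\n'))
    (rc, out.toString)
  }

  // Delegates to the C generator backend; omitted here.
  private def generateCode(schema: Path, outDir: Path): Unit = ???
}

The point is just that the TDML layer would drive the generated code the same way a user would (generate, compile, run), with the frontend shared and only the backend swapped, as Steve described.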