Re: [GENERATION] Code generation with Daffodil

Dave Fisher Tue, 08 Oct 2019 10:21:18 -0700

Hi -

Any answers to Julian’s question?


Regards,
Dave

> On Sep 16, 2019, at 9:53 AM, Julian Feinauer <j.feina...@pragmaticminds.de> 
> wrote:
> 
> Hi,
> 
> I went through the source code a bit more, and I wonder whether we could 
> restructure it a bit so that separate parts of the "3 Trees" workflow that 
> Mike explained are easier to take out.
> 
> @Mike, Steve: Do you see any possibility for that? I still don't understand 
> the internals well enough to have any idea about that.
> 
> Julian
> 
> On 13.09.19, 20:21, "Julian Feinauer" <j.feina...@pragmaticminds.de> wrote:
> 
>    Hi Dave,
> 
>    It really was fun and it's really nice to be able to contribute something 
> as cool as code gen : )
>    In fact, your idea could work in theory, although our current idea is 
> to still have the generated code depend on a library.
> 
>    But, if I consider it... We do not strictly need that if we just include 
> the core stuff also in the generated code. In fact we could make single 
> functions like that... But I'm unsure if it's worth the effort.
> 
>    But let's think through this together in some design docs after Mike has 
> made the initial setup.
> 
>    Julian
>    ________________________________
>    From: Dave Fisher <wave4d...@comcast.net>
>    Sent: Friday, September 13, 2019 7:24:36 PM
>    To: dev@daffodil.apache.org <dev@daffodil.apache.org>
>    Subject: Re: [GENERATION] Code generation with Daffodil
> 
>    Hi -
> 
>    First, it was very cool to see the three of you interacting at ApacheCon 
> this week. It’s really great to see the conversation coming back to the list 
> to share with everyone who is interested.
> 
>    Second, while flying back home from ACNA I thought of a simple use case 
> I’d like to pursue that is similar to the PLC4X case. I’d like to write 
> Pulsar Functions that parse or unparse infosets.
> 
>    Compiling this to either Java or Python would ensure the best resource 
> utilization.
> 
>    I’m not sure at this point whether this capability would belong in 
> Daffodil, Pulsar, PLC4X, and/or something else.
> 
>    Regards,
>    Dave
> 
>> On Sep 13, 2019, at 6:58 PM, Julian Feinauer <j.feina...@pragmaticminds.de> 
>> wrote:
>> 
>> Hi Mike,
>> 
>> thanks for the email, looking forward to the design doc.
>> One aspect I also see is that we COULD provide Daffodil functionality in 
>> other languages as well.
>> Think of the possibilities of a DFDL to C converter... or popular languages 
>> like Python or even JavaScript (I will definitely NOT do that one).
>> 
>> Julian
>> 
>> On 13.09.19, 18:55, "Christofer Dutz" <christofer.d...@c-ware.de> wrote:
>> 
>>   Hi Mike,
>> 
>>   You might want to have a look at how we use the expressions in PLC4X, as I 
>> have a dedicated parser and model for expressions. Perhaps you can re-use 
>> that for Daffodil.
>> 
>>   Chris
>> 
>>   ________________________________
>>   From: Beckerle, Mike <mbecke...@tresys.com>
>>   Sent: Friday, September 13, 2019 3:14:04 PM
>>   To: dev@daffodil.apache.org <dev@daffodil.apache.org>
>>   Subject: Re: [GENERATION] Code generation with Daffodil
>> 
>>   I have been looking at the code branch Julian F created (while we were at 
>> ApacheCon NA 2019) based on modifying the BinaryIntegerKnownLengthParser 
>> class to have a code-generator.
>> 
>>   Previously I thought implementing this on the parser/unparser classes was 
>> ok, but having refreshed my knowledge of Daffodil internals in the grammar 
>> and parser/unparsers, I think the approach needs to evolve.
>> 
>>   The BinaryIntegerKnownLengthParser already is a somewhat specialized 
>> parser. It is selected by the schema compiler based on
>>   (a) binary (twos complement) integer
>>   (b) length is a known constant
>> 
>>   What we want is for the compiler to select parsers that are further 
>> specialized for:
>>   (a) signed/unsigned
>>   (b) known bitOrder unchanged from prior element
>>   (c) known byteOrder (not an expression) unchanged from prior element
>>   (d) alignment known to be 8-bit aligned
>>   (e) length known to be 8, 16, 32, or 64 (signed) only (or at least, a 
>> multiple of 8 bits)
>> 
>>   So the reduction in "interpretation" overhead we seek here comes from 
>> moving all the conditionals related to (a) through (e) from run time to 
>> compile time.
>> 
>>   That's my first cut at everything Daffodil must prove in its compiler 
>> about the format in order to achieve the same performance as hand-written 
>> code that makes all these same assumptions.
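
A minimal Java sketch of the specialization described above: the conditionals on signedness, byte order, and length move out of the per-call path and into a one-time selection step. All class names here (IntParser, GenericIntParser, SignedInt32BigEndianParser, ParserSelector) are hypothetical and are not Daffodil classes; java.nio.ByteBuffer stands in for the Daffodil I/O layer, and bit order and alignment are assumed to be byte-oriented rather than modeled.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical names throughout; none of these are Daffodil classes.
interface IntParser {
    long parse(ByteBuffer in);
}

// Generic form: length, signedness, and byte order are examined on every call.
final class GenericIntParser implements IntParser {
    private final int lengthInBits; // assumed: multiple of 8, at most 64
    private final boolean signed;
    private final ByteOrder byteOrder;

    GenericIntParser(int lengthInBits, boolean signed, ByteOrder byteOrder) {
        this.lengthInBits = lengthInBits;
        this.signed = signed;
        this.byteOrder = byteOrder;
    }

    @Override
    public long parse(ByteBuffer in) {
        int nBytes = lengthInBits / 8;
        long value = 0;
        for (int i = 0; i < nBytes; i++) {
            long b = in.get() & 0xFFL;
            if (byteOrder == ByteOrder.BIG_ENDIAN) {
                value = (value << 8) | b;
            } else {
                value |= b << (8 * i);
            }
        }
        if (signed && lengthInBits < 64) {
            int shift = 64 - lengthInBits;
            value = (value << shift) >> shift; // sign-extend to 64 bits
        }
        return value;
    }
}

// Specialized form: signed, 32-bit, big-endian, byte-aligned are all assumed,
// so no conditionals remain in the parse path.
final class SignedInt32BigEndianParser implements IntParser {
    @Override
    public long parse(ByteBuffer in) {
        // Note: order(...) also sets the byte order on the buffer itself.
        return in.order(ByteOrder.BIG_ENDIAN).getInt();
    }
}

final class ParserSelector {
    // Stand-in for the schema compiler's choice, made once per element.
    static IntParser select(int lengthInBits, boolean signed, ByteOrder order) {
        if (signed && lengthInBits == 32 && order == ByteOrder.BIG_ENDIAN) {
            return new SignedInt32BigEndianParser();
        }
        return new GenericIntParser(lengthInBits, signed, order);
    }
}
```

With this shape, ParserSelector.select(32, true, ByteOrder.BIG_ENDIAN) returns the specialized parser, and any other combination falls back to the generic one. As noted in the message, nothing here requires code generation.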
>> 
>>   Then the runtime library has to be factored such that given this 
>> information you can generate calls to primitive parsers that actually are 
>> specialized on these things and so avoid overhead. The daffodil I/O library 
>> currently doesn't have these operations called out.
>> 
>>   None of the above requires code-generation for a Java implementation.
>>   It's just about enabling the compiler to select more specialized 
>> parse/unparse primitive operations.
>> 
>>   The reason to generate separate code is really more about:
>> 
>>   1) reducing the footprint for all the primitives and runtime aspects that 
>> are unused by a given format. This is more like an issue of selective 
>> linking.
>> 
>>   2) populating different non-generic infoset slots corresponding to named 
>> elements (e.g., pojo data members) without using reflection. This requires 
>> generating code that literally contains assignments to object members. This 
>> requires inline code generation so that an assignment can be ordinary 
>> non-reflective code.
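
To illustrate point 2, a small Java sketch contrasting a reflective slot assignment with the kind of inline assignment a generator could emit. The Header POJO, its two fields, and both helper classes are hypothetical, and the one-byte and two-byte field layout is an assumption made only for the example.

```java
import java.lang.reflect.Field;
import java.nio.ByteBuffer;

// Hypothetical POJO for a format with two named elements.
final class Header {
    public int version;
    public int length;
}

final class ReflectiveFiller {
    // Generic runtime: the slot is looked up by element name on every assignment.
    static void set(Object target, String elementName, int value)
            throws ReflectiveOperationException {
        Field f = target.getClass().getField(elementName);
        f.setInt(target, value);
    }
}

// What a generator could emit for this one format: ordinary assignments,
// with no name lookup and no reflection at run time.
final class GeneratedHeaderParser {
    static Header parse(ByteBuffer in) {
        Header h = new Header();
        h.version = in.get() & 0xFF;        // assumed: 1-byte unsigned
        h.length = in.getShort() & 0xFFFF;  // assumed: 2-byte unsigned, big-endian
        return h;
    }
}
```

The generated form only works because the member names are known when the code is written out, which is why this case calls for inline code generation rather than a more specialized library.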
>> 
>>   Expression evaluation is something further we need to consider. E.g., if 
>> the length of something is to be computed, doing that in generated code 
>> requires that we compile DPath expressions into the generated code language.
>> 
>>   Next step is I plan to write up a design note on the wiki and get some 
>> feedback on it to solidify the requirements and approach. It is definitely 
>> time we considered all angles on this code-generation notion since numerous 
>> people have expressed interest in this means of using Daffodil.
>> 
>> 
>> 
>> 
>> 
>>   ________________________________
>>   From: Julian Feinauer <j.feina...@pragmaticminds.de>
>>   Sent: Wednesday, September 11, 2019 1:30 PM
>>   To: dev@daffodil.apache.org <dev@daffodil.apache.org>
>>   Subject: [GENERATION] Code generation with Daffodil
>> 
>>   Hi guys,
>> 
>>   I had a discussion yesterday with Mike and Steve, and we had already had 
>> several discussions before in the PLC4X project.
>>   We like Daffodil, but our problem is that we do not fit with the 
>> “Interpreter” runtime it currently is,
>>   mainly for two reasons: performance and interoperability.
>>   So ideally, I would like to have a piece of code which takes a DFDL schema 
>> file and generates code specifically for parsing the given schema, 
>> probably in a given output. Ideally in multiple languages.
>> 
>>   As it's not (yet) Christmas, I guess I will not get that for free, so I 
>> played around a bit with the code and tried to understand it as well as 
>> possible, and to me it seems that it is not as undoable as I initially 
>> thought (I had already checked some months ago).
>>   In fact, if I get it right, the key would be to add another method 
>> `translate: AstNode` to the `Parser` trait.
>>   This should then generate an AST (sub-)node which represents all the 
>> actions that would be done in the regular `parse` method.
>>   Finally, we could try to translate this AST to code and dump it to a 
>> file (I guess this is the rather easy part).
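
A rough Java sketch of that idea, assuming a much-reduced stand-in for the `Parser` trait. Only the names `Parser`, `translate`, and `AstNode` come from the message; the node types, the `emit` method, and the `in.readInt(...)` call in the emitted text are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical AST: the message names `AstNode` but not its shape.
interface AstNode {
    String emit(); // render this node as Java source text
}

final class ReadInt implements AstNode {
    private final String target;
    private final int bits;

    ReadInt(String target, int bits) {
        this.target = target;
        this.bits = bits;
    }

    @Override
    public String emit() {
        // `in.readInt` is an assumed runtime-library call, not a real API.
        return target + " = in.readInt(" + bits + ");";
    }
}

final class Block implements AstNode {
    private final List<AstNode> children;

    Block(List<AstNode> children) {
        this.children = children;
    }

    @Override
    public String emit() {
        StringBuilder sb = new StringBuilder();
        for (AstNode child : children) {
            sb.append(child.emit()).append('\n');
        }
        return sb.toString();
    }
}

// Stand-in for the `Parser` trait, reduced to the proposed method.
interface Parser {
    AstNode translate(); // describes what `parse` would do, without doing it
}

final class IntElementParser implements Parser {
    private final String name;
    private final int bits;

    IntElementParser(String name, int bits) {
        this.name = name;
        this.bits = bits;
    }

    @Override
    public AstNode translate() {
        return new ReadInt(name, bits);
    }
}

final class SequenceParser implements Parser {
    private final List<Parser> children;

    SequenceParser(List<Parser> children) {
        this.children = children;
    }

    @Override
    public AstNode translate() {
        List<AstNode> nodes = new ArrayList<>();
        for (Parser p : children) {
            nodes.add(p.translate());
        }
        return new Block(nodes);
    }
}
```

Calling translate() on the root parser and then emit() on the result yields source text that can be written to a file. Keeping the AST separate from the text rendering is what would allow other target languages to reuse the same tree.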
>> 
>>   This is just a rough thought, but I wanted to get it to the list and 
>> probably we will find some time to discuss it at ACNA.
>> 
>>   Julian
>> 
>> 
> 
> 
>
