As a very achievable alternative...

How about a simple switch on the current compiler to output the parsed data 
model as an XML or JSON file? Checks like reserved words should be suppressed, 
but otherwise it's just another really simple generator.

The point about XML is that it's easily loaded as a DOM or manipulated by XSL. 
You can then write a code generator for a new language as an entirely separate 
standalone project, without needing to hack the C++ every time. XSL experts can 
do their thing, or there is probably a StringTemplate driver out there already 
that can load an XML data model and read a template from standard input.
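
To make that concrete, here is a minimal sketch in Python of such a standalone generator. It assumes the compiler can dump its parsed model as JSON. The layout of that dump (`structs`, `fields`, `id`, `type`, `name`) is purely hypothetical, as are the type map and the two templates; none of it is taken from the actual Thrift compiler.

```python
import json
from string import Template

# Hypothetical dump format: the key names below are assumptions made for
# illustration, not something the Thrift compiler is known to emit.
MODEL_JSON = """
{
  "structs": [
    {"name": "User",
     "fields": [{"id": 1, "type": "i32", "name": "uid"},
                {"id": 2, "type": "string", "name": "email"}]}
  ]
}
"""

# The per-language type mapping and templates live outside the C++ compiler.
TYPE_MAP = {"i32": "int", "string": "str"}
STRUCT_TMPL = Template("class $name:\n    def __init__(self):\n$body")
FIELD_TMPL = Template("        self.$name: $type = None  # field $id\n")


def generate(model_text):
    """Render one class per struct found in the dumped data model."""
    model = json.loads(model_text)
    chunks = []
    for struct in model["structs"]:
        body = "".join(
            FIELD_TMPL.substitute(
                name=field["name"], type=TYPE_MAP[field["type"]], id=field["id"]
            )
            for field in struct["fields"]
        )
        chunks.append(STRUCT_TMPL.substitute(name=struct["name"], body=body))
    return "\n".join(chunks)


if __name__ == "__main__":
    print(generate(MODEL_JSON))
```

Changing the generated code then means editing the templates or the type map, with no C++ rebuild involved.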

Yes I know it's another step in the toolchain, but we're getting used to that 
for the benefits it can bring.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org


-----Original Message-----
From: BCG [mailto:bgo...@hushmail.com] 
Sent: Thursday, 8 October 2015 1:55 PM
To: dev@thrift.apache.org
Subject: Re: Major feature suggestion/observation

Perhaps an approach that wouldn't require completely rearchitecting the 
compiler would be to implement a mechanism that allows filtering the generated 
code as it is being written out. For example, the compiler could make a call to 
some sort of filtering callback that has the capability of modifying the 
"default" code that is generated, or even replacing it entirely. Information 
about the current state of the parsing could (should?) also be passed into the 
callback.  If you want a somewhat cheeky eat-your-own-dogfood approach, this 
could even be defined in IDL as a Thrift service, with an optional command line 
flag to the compiler for specifying a protocol and transport to an 
implementation (in that case, people could tweak the code generation using 
their language of choice, or even just consume the events to feed into their 
own completely separate template engine if they choose to do so).
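
A rough sketch of the callback idea, in Python rather than the compiler's C++, might look like the following. Everything here is illustrative: `emit_struct`, the `context` dictionary and the `make_private` filter are hypothetical names, and the "default" output is a made-up Java-like line, not what the Thrift compiler generates.

```python
def emit_struct(name, fields, code_filter=None):
    """Generate default code per field, letting a filter modify or replace it."""
    lines = []
    for field in fields:
        default = f"public {field['type']} {field['name']};"
        # State of the parse handed to the callback (keys are assumptions).
        context = {"struct": name, "field": field}
        replaced = code_filter(context, default) if code_filter else None
        # None means "keep the default"; any string replaces it entirely.
        lines.append(default if replaced is None else replaced)
    body = "\n".join("  " + line for line in lines)
    return f"class {name} {{\n{body}\n}}"


def make_private(context, default):
    """Example filter: tweak one field and leave everything else untouched."""
    if context["field"]["name"] == "password":
        return default.replace("public", "private", 1)
    return None


if __name__ == "__main__":
    fields = [
        {"type": "String", "name": "email"},
        {"type": "String", "name": "password"},
    ]
    print(emit_struct("User", fields, make_private))
```

The same hook could equally be a remote call to a Thrift service, as suggested above; the in-process function is just the simplest way to show the contract.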

I'm sure that a templating tool could be a great approach with definite 
advantages but the Thrift compiler seems pretty baked at this point and ripping 
it apart to rebuild it seems like a monumental effort and a huge risk.

I've been using Thrift for a while now and I'm interested in contributing to 
the project.  If this is an area that you all think would be valuable to work 
on, I'd be willing to try to help out however I can.  Or if there is another 
area of the project that has a more urgent need of attention, I'd be glad to 
try to help out there instead, just let me know.  Mostly I know Java, C, PHP 
and JavaScript and a few other tricks I've learned over the years.

-- Ben

On 10/07/2015 09:12 PM, David Bennett wrote:
> [Sorry -- I only just subscribed so missed any earlier comments on 
> the dev list]
>
> Your experience parallels mine, except that I'm a compiler guy so I've leant 
> more towards language-based solutions, and I've written a couple of template 
> engines.
>
> Re simple stuff: agreed. Simple stuff is simple.
> Re performance: not interested. There are situations where the speed of code 
> generation matters, and this is not one of them.
> Re features of the template language: absolutely. A language that is not 
> 'Turing Complete' (whatever that means in this context) will run into 
> problems it cannot solve.
>
> FWIW TC means state, iteration and alternation, which covers your loops and 
> filters. The only way to get there is to include a full-blown macro 
> processing language or equivalent (I've written one of those too). Look at 
> TeX and m4 for examples. Good page here too: 
> https://en.wikipedia.org/wiki/Template_processor. The key thing is Model View 
> separation: the C++ parse provides the data model and the template language 
> generates the source code View. With this separation and a suitable data 
> model, there should (almost) never be a need to change anything except 
> individual templates.
>
> In practice what I have done is to write special purpose functions in the 
> host language and call them from the template. Your keyword example would 
> require a language-specific callable function for each supported language to 
> check and perhaps mangle identifiers.
>
> But this project is only reasonable if the templating tool exists, and it is 
> sufficiently powerful that the conversion is largely mechanical, and there 
> are sufficient regression tests to check the results.
>
> Of all the tools I know, this one 
> https://theantlrguy.atlassian.net/wiki/display/ST4/StringTemplate+4+Documentation
>  is the one that is most likely to be suitable.
>
> Regards
> David M Bennett FACS
>
> Andl - A New Database Language - andl.org
>
>
> -----Original Message-----
> From: Jens Geyer [mailto:jensge...@hotmail.com]
> Sent: Thursday, 8 October 2015 6:30 AM
> To: Thrift-Dev <dev@thrift.apache.org>
> Cc: u...@thrift.apache.org
> Subject: Re: Major feature suggestion/observation
>
> Hi *,
>
> Please, FUP @ dev list. Thank you.
>
> I agree that the existing code generation code has some potential, in many 
> ways. I even agree that it could be a good idea to rethink some of the 
> concepts. But the question I raised a few hours earlier (on the dev list) was 
> precisely targeted at what I think is the key here: How much will it cost and 
> how much will we really benefit from converting everything into a 
> template-based generator?
>
> Having a good portion of (production code) experience in both template based 
> and non-template based codegen worlds, I believe I can speak with enough 
> authority regarding this whole matter. From my experiences, both ways have 
> their pros and cons. For simple, example-like stuff, everything is easy, with 
> or without templates. But in the real world, you will face lots of special 
> cases making your life harder. The good thing about a code-based generator 
> is that there are typically more options to deal with such things in a 
> performant and convenient way. Trying to express these in a template language 
> can become a pain very quickly. Templates are only as good as the template 
> language and system allow. It typically starts to get complex with things 
> that need to be enumerated and filtered. Bringing loops and conditions into a 
> template-based engine is a challenging task; this is where the good, the bad 
> and the ugly start to become separated.
>
> In fact, given a fairly complex project, there is not much difference in what 
> you do when there is need to add features that are not supported by your 
> coded generator or template language: You change the implementation.
>
> Just one example that is still sort of a general issue across all
> languages: reserved keywords. Besides the few obvious Thrift IDL related 
> keywords, each language has its own special set of reserved keywords.
> Putting all of them into one single global list that is used by all (!) 
> languages is something that I don't like very much, yet we still have it in 
> the Thrift compiler. Furthermore, each language has its own way of dealing 
> with reserved keywords: some allow a prefix like @ or &. We also have 
> some additional per-language treatment in the Thrift compiler to 
> deal with these subtleties. Although more to the point in my opinion, these 
> solutions are by no means perfect either.
>
> Now think about how a template-based generator could help with that specific 
> issue. I don't mean the question of whether it is possible /somehow/ - it 
> should be a neat and clean solution, a significant improvement 
> over what we already have.
>
> You may get the impression that I'm against templates, but that's not true, I 
> am not. Templates are a very powerful tool. But I strongly doubt that 
> switching Thrift from one to the other just because it is possible will 
> produce enough net gain to justify the effort needed. In my humble opinion 
> we should spend that time and developer power more wisely.
>
> $0,02,
> JensG
>
>
> -----Original Message-----
> From: David Bennett
> Sent: Wednesday, October 7, 2015 12:26 PM
> To: u...@thrift.apache.org
> Subject: RE: Major feature suggestion/observation
>
> [I'm wary of Boost. It's quite a commitment. But if needs must...]
>
> I had a quick look: it seems that the generation is achieved while compiling 
> the code using C++ templates. This is not what I had in mind at all. It 
> should be possible to edit a template without a C++ recompile.
>
> Here is a simple program in T4. You can probably see how it works with no 
> further explanation.
>
> <table class="detailstable">
>    <# foreach (var prop in data.Properties) { #>
>    <tr>
>    <th>
>      <#= prop.Name #>
>    </th>
>    <td>
>    <asp:DynamicControl DataField="<#= prop.Name #>" runat="server" />
>    </td>
>    </tr>
>    <# } #>
> </table>
>
> But this is only suitable for C#, and rewriting the compiler is definitely a 
> step too far. There is Cheetah for Python and lots of other HTML template 
> engines, but in a quick review I could find nothing suitable. Maybe I just 
> imagined there was a solution...
>
> Regards
> David M Bennett FACS
>
> Andl - A New Database Language - andl.org
>
>
> -----Original Message-----
> From: Philip Polkovnikov [mailto:polkovnikov...@gmail.com]
> Sent: Wednesday, 7 October 2015 8:00 PM
> To: u...@thrift.apache.org
> Subject: Re: Major feature suggestion/observation
>
> David,
>
> The default codegen solution in the C++ world is Boost Karma, though I'm 
> unsure if it is OK to make users who want to compile the Thrift compiler set 
> Boost up and wait several minutes until Thrift compiles.
>
> 2015-10-07 3:11 GMT+03:00 David Bennett <da...@yorkage.com>:
>>
>> Regards
>> David M Bennett FACS
>>
>> Andl - A New Database Language - andl.org
>>
>>
>> -----Original Message-----
>> From: Roger Meier [mailto:ro...@bufferoverflow.ch]
>> Sent: Wednesday, 7 October 2015 5:34 AM
>> To: u...@thrift.apache.org
>> Cc: dev@thrift.apache.org
>> Subject: Re: Major feature suggestion/observation
>>
>> Hi David
>>
>> Quoting David Bennett <da...@yorkage.com>:
>>
>>> I'm a compiler guy (amongst other scars). I was somewhat surprised 
>>> when I opened up the Thrift compiler to discover that it uses 
>>> industrial strength parsing (for a very slim language) and a 
>>> hand-rolled, ad hoc source code generator (for a serious backend 
>>> problem). I had expected the exact opposite.
>>>
>>> After reading a few comments on this list I think a number of the 
>>> shortcomings of Thrift result from this. The compiler may be 
>>> 'tweakable' but it sure ain't configurable. The precise content of 
>>> the generated code (and how to alter it) is an ever present problem.
>>>
>>> My suggestion is that the backend of the compiler should be entirely 
>>> rewritten using modern code generation technology and a selection of 
>>> 'skeletons' provided as separate text files. Anyone who wanted to 
>>> tweak the output for any of their special use cases could easily 
>>> copy and modify an individual skeleton without having to venture 
>>> into the dark recesses of the C++ compiler.
>>
>>>>> Did you have a look at the JIRA issues related to rewrite and 
>>>>> changes on the compiler?
>> I found this one: https://issues.apache.org/jira/browse/THRIFT-1173.
>> It's right on the money, but seems to have been silently abandoned 4 years 
>> ago.
>> Looks like the guy who tackled it didn't know enough about template 
>> tools to make it happen, despite the best of intentions.
>>
>> I didn't find anything else remotely similar, but lots of requests 
>> for little tweaks that would become no-brainers with a template system.
>>
>>>>> Have you seen the python variant? This was another try to do it again.
>> No. Which issue?
>>
>>>>> I have seldom seen successful rewrites; usually it takes too 
>>>>> long to bring them to the same level. Personally, I'm a fan of evolution.
>> Agree absolutely. The only way to tackle this kind of transformation 
>> is to treat the existing compiler as the spec and set out to 
>> replicate it, to the point of being able to pass identical regression tests.
>> That works, but it takes a while just to get back where you started.
>>
>>> With luck, the initial batch of skeletons could be extracted 
>>> directly from the existing compiler. It's still a biggish job.
>>>
>>> [Side digression: for some languages code generation is not really 
>>> needed. The language has sufficient abstraction capability to 
>>> implement the IDL directly. Since there are other languages that do 
>>> not, we are stuck with code generation.]
>>>
>>> The biggest choice is: which product to use for the code generation?
>>> I have a little familiarity with T4 and the ANTLR StringTemplate, 
>>> and I've hand-rolled a couple of my own but there are heaps of 
>>> others out there. Maybe it all comes down to what you're used to.
>>> I'm not sure I'm quite ready for the investment of time.
>>>>> Feel free to rewrite the compiler and provide a test suite for review.
>> Probably not -- Andl is keeping me busy enough for now. I was kind of 
>> hoping someone with C++/compiler experience could at least nominate a 
>> suitable template product. I don't know one, and a quick look at 
>> cpptemplate does not leave me filled with joy. Without this, it's 
>> just far too much work.
>>
>>>>> Improving the test suites across languages, improving CMake, 
>>>>> fixing bugs and many other topics to improve on Thrift has much 
>>>>> higher priority than rewriting something we already have.
>> I get that. What Thrift does and what it needs don't really overlap 
>> my skill set (or my interests) all that well, but I will keep an eye 
>> out for somewhere I can help.
>>
>> best!
>> Roger
>>
>>>>> PS: dev list is a better place for such discussions.
>> Thanks. I'll look into that.
>>
>>
>>> Regards
>>> David M Bennett FACS
>


