Thanks for the suggestion.

I think both of these have been written as generators -- that is, they have 
been written with the intention that the output will be directly usable in some 
application. [Which is BTW perfectly reasonable.] The XSD seems intended as the 
schema of an actual data structure, but comes with a PHP file. The JSON 
contains things like "type=list<xxx>" and exceptions, so I'm not sure what the 
intended purpose was.

They are not quite suitable for the purpose I had in mind, which is a 
representation of the parsed input, suitable for processing by a subsequent 
generator. The XML would be processed by XSL (which can generate any kind of 
text you can think of); either could be loaded into a driver for StringTemplate 
(which would need to be enhanced by some computational and formatting routines).
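
As a rough sketch of what I mean by a separate, standalone generator: the 
snippet below loads a JSON dump of the data model and renders it through a 
text template. The JSON shape, the field names, the mangle() helper and the 
use of Python's string.Template (standing in for StringTemplate) are all my 
own assumptions for illustration; none of this is the compiler's actual output.

```python
import json
from string import Template

# Hypothetical JSON dump of a parsed IDL file.
# This shape is an assumption, NOT the real Thrift compiler output format.
MODEL_JSON = """
{
  "structs": [
    {"name": "User",
     "fields": [
       {"id": 1, "type": "i32", "name": "id"},
       {"id": 2, "type": "string", "name": "class"}
     ]}
  ]
}
"""

# "Computational routines" live in the host language, not in the template.
# A small, deliberately incomplete set of Python reserved words.
PY_RESERVED = {"class", "def", "from", "import", "lambda", "pass"}


def mangle(identifier):
    """Append an underscore when the identifier collides with a reserved word."""
    return identifier + "_" if identifier in PY_RESERVED else identifier


# The template is plain text, so it can be edited without recompiling anything.
STRUCT_TEMPLATE = Template("class $name:\n    __slots__ = ($slots,)\n")


def render(model):
    """Render every struct in the (hypothetical) data model to source text."""
    chunks = []
    for struct in model["structs"]:
        slots = ", ".join(repr(mangle(f["name"])) for f in struct["fields"])
        chunks.append(STRUCT_TEMPLATE.substitute(name=struct["name"], slots=slots))
    return "\n".join(chunks)


if __name__ == "__main__":
    print(render(json.loads(MODEL_JSON)))
```

The point of the split is that the model file is the only contract: the 
template and the helper routines can change per target language without 
touching the C++.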

But it is a feasible path, with a good part of the work done. I could raise 
issues, but perhaps this is not something that will attract as much interest as 
wrestling bugs or supporting new languages.


Regards
David M Bennett FACS

Andl - A New Database Language - andl.org


-----Original Message-----
From: Roger Meier [mailto:[email protected]] 
Sent: Friday, 9 October 2015 10:59 PM
To: [email protected]
Subject: Re: Major feature suggestion/observation

You can use json or xsd output of the Apache Thrift compiler if you need this.

cheers
roger

Quoting David Bennett <[email protected]>:

> As a very achievable alternative...
>
> How about a simple switch on the current compiler to output the parsed 
> data model as an XML or JSON file? Checks like reserved words should 
> be suppressed, but otherwise it's just another really simple 
> generator.
>
> The point about XML is that it's easily loaded as a DOM or manipulated 
> by XSL. You can then write a code generator for a new language as an 
> entirely separate standalone project, without needing to hack the C++ 
> every time. XSL experts can do their thing, or there is probably a 
> StringTemplate driver out there already that can load an XML data 
> model and read a template from standard input.
>
> Yes I know it's another step in the toolchain, but we're getting used 
> to that for the benefits it can bring.
>
> Regards
> David M Bennett FACS
>
>
> -----Original Message-----
> From: BCG [mailto:[email protected]]
> Sent: Thursday, 8 October 2015 1:55 PM
> To: [email protected]
> Subject: Re: Major feature suggestion/observation
>
> Perhaps an approach that wouldn't require completely rearchitecting 
> the compiler could be to implement a mechanism that allows filtering the 
> generated code as it is being written out. For example, the compiler 
> could make a call to some sort of filtering callback that has the 
> capability of modifying the "default" code that is generated, or even 
> replacing it entirely. Information about the current state of the 
> parsing could(should?) also be passed into the callback.  If you want 
> a somewhat cheeky eat-your-own-dogfood approach, this could even be 
> defined in IDL as a Thrift service, with an optional command line flag 
> to the compiler for specifying a protocol and transport to an 
> implementation (in that case, people could tweak the code generation 
> using their language of choice, or even just consume the events to 
> feed into their own completely separate template engine if they choose 
> to do so).
>
> I'm sure that a templating tool could be a great approach with 
> definite advantages but the Thrift compiler seems pretty baked at this 
> point and ripping it apart to rebuild it seems like a monumental 
> effort and a huge risk.
>
> I've been using Thrift for a while now and I'm interested in 
> contributing to the project.  If this is an area that you all think 
> would be valuable to work on, I'd be willing to try to help out 
> however I can.  Or if there is another area of the project that has a 
> more urgent need of attention, I'd be glad to try to help out there 
> instead, just let me know.  Mostly I know Java, C, PHP and Javascript 
> and a few other tricks I've learned over the years.
>
> -- Ben
>
> On 10/07/2015 09:12 PM, David Bennett wrote:
>> [Sorry -- I only just subscribed so missed any earlier comments on 
>> the dev list]
>>
>> Your experience parallels mine, except that I'm a compiler guy so 
>> I've leant more towards language-based solutions, and I've written a 
>> couple of template engines.
>>
>> Re simple stuff: agreed. Simple stuff is simple.
>> Re performance: not interested. There are situations where the speed 
>> of code generation matters, and this is not one of them.
>> Re features of the template language: absolutely. A language that is 
>> not 'Turing Complete' (whatever that means in this context) will run 
>> into problems it cannot solve.
>>
>> FWIW TC means state, iteration and alternation, which covers your 
>> loops and filters. The only way to get there is to include a 
>> full-blown macro processing language or equivalent (I've written one 
>> of those too). Look at TeX and m4 for examples. Good page here
>> too: https://en.wikipedia.org/wiki/Template_processor. The key thing 
>> is Model View separation: the C++ parse provides the data model and 
>> the template language generates the source code View.
>> With this separation and a suitable data model, there should
>> (almost) never be a need to change anything except individual 
>> templates.
>>
>> In practice what I have done is to write special purpose functions in 
>> the host language and call them from the template. Your keyword 
>> example would require a language-specific callable function for each 
>> supported language to check and perhaps mangle identifiers.
>>
>> But this project is only reasonable if the templating tool exists, 
>> and it is sufficiently powerful that the conversion is largely 
>> mechanical, and there are sufficient regression tests to check the 
>> results.
>>
>> Of all the tools I know, this one
>> https://theantlrguy.atlassian.net/wiki/display/ST4/StringTemplate+4+Documentation
>> is the one that is most likely to be suitable.
>>
>> Regards
>> David M Bennett FACS
>>
>> Andl - A New Database Language - andl.org
>>
>>
>> -----Original Message-----
>> From: Jens Geyer [mailto:[email protected]]
>> Sent: Thursday, 8 October 2015 6:30 AM
>> To: Thrift-Dev <[email protected]>
>> Cc: [email protected]
>> Subject: Re: Major feature suggestion/observation
>>
>> Hi *,
>>
>> Please, FUP @ dev list. Thank you.
>>
>> I agree that the existing code generation code has some potential, in 
>> many ways. I even agree that it could be a good idea to rethink some 
>> of the concepts. But the question I raised a few hours earlier (on 
>> the dev list) was precisely targeted at what I think is the key
>> here: How much will it cost and how much will we really benefit from 
>> converting everything into a template-based generator?
>>
>> Having a good portion of (production code) experience in both 
>> template based and non-template based codegen worlds, I believe I can 
>> speak with enough authority regarding this whole matter. From my 
>> experiences, both ways have their pros and cons. For simple, 
>> example-like stuff, everything is easy, with or without templates.
>> But in the real world, you will face lots of special cases making 
>> your life harder. The good thing about a code-based generator is, 
>> that there are typically more options to deal with such things in a 
>> performant and convenient way. Trying to express these in a template 
>> language can become a pain very quickly. Templates are as good as the 
>> template language and system reaches. It typically starts to get 
>> complex with things that need to be enumerated and filtered. Bringing 
>> loops and conditions into a template-based engine is a challenging 
>> task, this is where the good, the bad and the ugly start to become 
>> separated.
>>
>> In fact, given a fairly complex project, there is not much difference 
>> in what you do when there is need to add features that are not 
>> supported by your coded generator or template language: You change 
>> the implementation.
>>
>> Just one example, that is still sort of a general issue across all
>> languages: reserved keywords. Besides the few obvious Thrift IDL 
>> related keywords, each language has its own special set of reserved 
>> keywords.
>> Putting all of them into one single global list that is used by all
>> (!) languages is something that I don't like very much, yet we still 
>> have it in the Thrift compiler. Furthermore, each language has its 
>> own way how to deal with reserved keywords: Some allow for a prefix 
>> like @ or &. We also have some additional, per-language treatment in 
>> the Thrift compiler as well to deal with these subtleties. Although 
>> more to the point in my opinion, these solutions are by no means 
>> perfect either.
>>
>> Now think about this: how could a template-based generator help with that 
>> specific issue? I don't mean the question whether or not it is 
>> possible /somehow/ - it should indeed be sort of a neat and clean 
>> solution, a significant improvement over what we already have.
>>
>> You may get the impression that I'm against templates, but that's not 
>> true, I am not. Templates are a very powerful tool. But I strongly 
>> doubt that switching Thrift from one to the other just because it is 
>> possible will produce enough net gain to justify the efforts needed. 
>> In my humble opinion we should spend that time and developer-power 
>> more wisely.
>>
>> $0,02,
>> JensG
>>
>>
>> -----Ursprüngliche Nachricht-----
>> From: David Bennett
>> Sent: Wednesday, October 7, 2015 12:26 PM
>> To: [email protected]
>> Subject: RE: Major feature suggestion/observation
>>
>> [I'm wary of Boost. It's quite a commitment. But if needs must...]
>>
>> I had a quick look: it seems that the generation is achieved while 
>> compiling the code using C++ templates. This is not what I had in 
>> mind at all. It should be possible to edit a template without a C++ 
>> recompile.
>>
>> Here is a simple program in T4. You can probably see how it works 
>> with no further explanation.
>>
>> <table class="detailstable">
>>    <# foreach (var prop in data.Properties) { #>
>>    <tr>
>>    <th>
>>      <#= prop.Name #>
>>    </th>
>>    <td>
>>    <asp:DynamicControl DataField="<#= prop.Name #>" runat="server" />
>>    </td>
>>    </tr>
>>    <# } #>
>> </table>
>>
>> But this is only suitable for C#, and rewriting the compiler is 
>> definitely a step too far. There is Cheetah for Python and lots of 
>> other HTML template engines, but in a quick review I could find 
>> nothing suitable. Maybe I just imagined there was a solution...
>>
>> Regards
>> David M Bennett FACS
>>
>> Andl - A New Database Language - andl.org
>>
>>
>> -----Original Message-----
>> From: Philip Polkovnikov [mailto:[email protected]]
>> Sent: Wednesday, 7 October 2015 8:00 PM
>> To: [email protected]
>> Subject: Re: Major feature suggestion/observation
>>
>> David,
>>
>> Default codegen solution in C++ world is Boost Karma. Though I'm 
>> unsure if it is OK to make users that would like to compile thrift 
>> compiler set boost up and wait several minutes until thrift compiles.
>>
>> 2015-10-07 3:11 GMT+03:00 David Bennett <[email protected]>:
>>>
>>> Regards
>>> David M Bennett FACS
>>>
>>> Andl - A New Database Language - andl.org
>>>
>>>
>>> -----Original Message-----
>>> From: Roger Meier [mailto:[email protected]]
>>> Sent: Wednesday, 7 October 2015 5:34 AM
>>> To: [email protected]
>>> Cc: [email protected]
>>> Subject: Re: Major feature suggestion/observation
>>>
>>> Hi David
>>>
>>> Quoting David Bennett <[email protected]>:
>>>
>>>> I'm a compiler guy (amongst other scars). I was somewhat surprised 
>>>> when I opened up the Thrift compiler to discover that it uses 
>>>> industrial strength parsing (for a very slim language) and a 
>>>> hand-rolled, ad hoc source code generator (for a serious backend 
>>>> problem). I had expected the exact opposite.
>>>>
>>>> After reading a few comments on this list I think a number of the 
>>>> shortcomings of Thrift result from this. The compiler may be 
>>>> 'tweakable' but it sure ain't configurable. The precise content of 
>>>> the generated code (and how to alter it) is an ever present problem.
>>>>
>>>> My suggestion is that the backend of the compiler should be 
>>>> entirely rewritten using modern code generation technology and a 
>>>> selection of 'skeletons' provided as separate text files. Anyone 
>>>> who wanted to tweak the output for any of their special use cases 
>>>> could easily copy and modify an individual skeleton without having 
>>>> to venture into the dark recesses of the C++ compiler.
>>>
>>>>>> Did you have a look at the JIRA issues related to rewrite and 
>>>>>> changes on the compiler?
>>> I found this one: https://issues.apache.org/jira/browse/THRIFT-1173.
>>> It's right on the money, but seems to have been silently abandoned
>>> 4 years ago.
>>> Looks like the guy who tackled it didn't know enough about template 
>>> tools to make it happen, despite the best of intentions.
>>>
>>> I didn't find anything else remotely similar, but lots of requests 
>>> for little tweaks that would become no-brainers with a template system.
>>>
>>>>>> Have you seen the python variant? This was another try to do it again.
>>> No. Which issue?
>>>
>>>>>> I have seldom seen some successful rewrites, usually it takes too 
>>>>>> long to bring them to the same level. Personally, I'm a fan of 
>>>>>> evolution.
>>> Agree absolutely. The only way to tackle this kind of transformation 
>>> is to treat the existing compiler as the spec and set out to 
>>> replicate it, to the point of being able to pass identical 
>>> regression tests.
>>> That works, but it takes a while just to get back where you started.
>>>
>>>> With luck, the initial batch of skeletons could be extracted 
>>>> directly from the existing compiler. It's still a biggish job.
>>>>
>>>> [Side digression: for some languages code generation is not really 
>>>> needed. The language has sufficient abstraction capability to 
>>>> implement the IDL directly. Since there are other languages that do 
>>>> not, we are stuck with code generation.]
>>>>
>>>> The biggest choice is: which product to use for the code generation?
>>>> I have a little familiarity with T4 and the ANTLR StringTemplate, 
>>>> and I've hand-rolled a couple of my own but there are heaps of 
>>>> others out there. Maybe it all comes down to what you're used to.
>>>> I'm not sure I'm quite ready for the investment of time.
>>>>>> Feel free to rewrite the compiler and provide a test suite for review.
>>> Probably not -- Andl is keeping me busy enough for now. I was kind 
>>> of hoping someone with C++/compiler experience could at least 
>>> nominate a suitable template product. I don't know one, and a quick 
>>> look at cpptemplate does not leave me filled with joy. Without this, 
>>> it's just far too much work.
>>>
>>>>>> Improving the test suites across languages, improving CMake, 
>>>>>> fixing bugs and many other topics to improve on Thrift has much 
>>>>>> higher priority than rewriting something we already have.
>>> I get that. What Thrift does and what it needs don't really overlap 
>>> my skill set (or my interests) all that well, but I will keep an eye 
>>> out for somewhere I can help.
>>>
>>> best!
>>> Roger
>>>
>>>>>> PS: dev list is a better place for such discussions.
>>> Thanks. I'll look into that.
>>>
>>>
>>>> Regards
>>>> David M Bennett FACS
>>

