Thanks for the suggestion. I think both of these have been written as generators -- that is, they have been written with the intention that the output will be directly usable in some application. [Which is BTW perfectly reasonable.] The XSD seems intended as the schema of an actual data structure, but comes with a PHP file. The JSON contains things like "type=list<xxx>" and exceptions, so I'm not sure what the intended purpose was.
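For contrast, here is a rough sketch in Python of the arrangement I have in mind: a neutral dump of the parsed model, consumed by a separate standalone generator. Everything in it is an assumption made up for illustration: the JSON shape, the names MODEL_JSON, PY_RESERVED, mangle, render_struct and generate, and the reserved-word list. It is not what the Thrift compiler's json generator emits. The standard library's string.Template is only a stand-in for a real template engine such as StringTemplate.

```python
import json
from string import Template

# Hypothetical shape for a dumped parse tree. This is NOT the actual
# JSON output of the Thrift compiler; it is invented for this sketch.
MODEL_JSON = """
{
  "structs": [
    {"name": "Point",
     "fields": [
       {"id": 1, "type": "i32", "name": "x"},
       {"id": 2, "type": "i32", "name": "class"}
     ]}
  ]
}
"""

# Per-language helper written in the host language and called while
# rendering. The word list is a small illustrative subset, not complete.
PY_RESERVED = {"class", "def", "from", "import", "lambda"}


def mangle(name):
    """Append an underscore to identifiers that collide with a reserved word."""
    return name + "_" if name in PY_RESERVED else name


# The "view": plain-text templates that can be edited without touching
# the code that parses the IDL.
STRUCT_TEMPLATE = Template("class $name:\n    def __init__(self, $args):\n$body")
FIELD_TEMPLATE = Template("        self.$field = $field\n")


def render_struct(struct):
    """Render one struct from the data model as a Python class."""
    fields = [mangle(f["name"]) for f in struct["fields"]]
    body = "".join(FIELD_TEMPLATE.substitute(field=f) for f in fields)
    return STRUCT_TEMPLATE.substitute(
        name=struct["name"], args=", ".join(fields), body=body
    )


def generate(model_text):
    """Load the dumped model and render every struct it contains."""
    model = json.loads(model_text)
    return "\n".join(render_struct(s) for s in model["structs"])


if __name__ == "__main__":
    print(generate(MODEL_JSON))
```

The model carries no language-specific decisions (the field really is called "class"); the reserved-word handling lives entirely on the generator side, so a generator for another language would swap in its own word list and templates.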
Neither the XSD nor the JSON output is quite suitable for the purpose I had in mind, which is a representation of the parsed input, suitable for processing by a subsequent generator. The XML would be processed by XSL (which can generate any kind of text you can think of); either could be loaded into a driver for StringTemplate (which would need to be enhanced by some computational and formatting routines).

But it is a feasible path, with a good part of the work done. I could raise issues, but perhaps this is not something that will attract as much interest as wrestling bugs or supporting new languages.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org


-----Original Message-----
From: Roger Meier [mailto:[email protected]]
Sent: Friday, 9 October 2015 10:59 PM
To: [email protected]
Subject: Re: Major feature suggestion/observation

You can use json or xsd output of the Apache Thrift compiler if you need this.

cheers
roger

Quoting David Bennett <[email protected]>:

> As a very achievable alternative...
>
> How about a simple switch on the current compiler to output the parsed
> data model as an XML or JSON file? Checks like reserved words should
> be suppressed, but otherwise it's just another really simple
> generator.
>
> The point about XML is that it's easily loaded as a DOM or manipulated
> by XSL. You can then write a code generator for a new language as an
> entirely separate standalone project, without needing to hack the C++
> every time. XSL experts can do their thing, or there is probably a
> StringTemplate driver out there already that can load an XML data
> model and read a template from standard input.
>
> Yes I know it's another step in the toolchain, but we're getting used
> to that for the benefits it can bring.
>
> Regards
> David M Bennett FACS
>
>
> -----Original Message-----
> From: BCG [mailto:[email protected]]
> Sent: Thursday, 8 October 2015 1:55 PM
> To: [email protected]
> Subject: Re: Major feature suggestion/observation
>
> Perhaps an approach that wouldn't require completely rearchitecting
> the compiler could be to implement a mechanism that allows filtering
> the generated code as it is being written out. For example, the compiler
> could make a call to some sort of filtering callback that has the
> capability of modifying the "default" code that is generated, or even
> replacing it entirely. Information about the current state of the
> parsing could (should?) also be passed into the callback. If you want
> a somewhat cheeky eat-your-own-dogfood approach, this could even be
> defined in IDL as a Thrift service, with an optional command line flag
> to the compiler for specifying a protocol and transport to an
> implementation (in that case, people could tweak the code generation
> using their language of choice, or even just consume the events to
> feed into their own completely separate template engine if they choose
> to do so).
>
> I'm sure that a templating tool could be a great approach with
> definite advantages but the Thrift compiler seems pretty baked at this
> point and ripping it apart to rebuild it seems like a monumental
> effort and a huge risk.
>
> I've been using Thrift for a while now and I'm interested in
> contributing to the project. If this is an area that you all think
> would be valuable to work on, I'd be willing to try to help out
> however I can. Or if there is another area of the project that has a
> more urgent need of attention, I'd be glad to try to help out there
> instead, just let me know. Mostly I know Java, C, PHP and Javascript
> and a few other tricks I've learned over the years.
>
> -- Ben
>
> On 10/07/2015 09:12 PM, David Bennett wrote:
>> [Sorry -- I only just subscribed so missed any earlier comments on
>> the dev list]
>>
>> Your experience parallels mine, except that I'm a compiler guy so
>> I've leant more towards language-based solutions, and I've written a
>> couple of template engines.
>>
>> Re simple stuff: agreed. Simple stuff is simple.
>> Re performance: not interested. There are situations where the speed
>> of code generation matters, and this is not one of them.
>> Re features of the template language: absolutely. A language that is
>> not 'Turing Complete' (whatever that means in this context) will run
>> into problems it cannot solve.
>>
>> FWIW TC means state, iteration and alternation, which covers your
>> loops and filters. The only way to get there is to include a
>> full-blown macro processing language or equivalent (I've written one
>> of those too). Look at TeX and m4 for examples. Good page here
>> too: https://en.wikipedia.org/wiki/Template_processor. The key thing
>> is Model View separation: the C++ parse provides the data model and
>> the template language generates the source code View.
>> With this separation and a suitable data model, there should
>> (almost) never be a need to change anything except individual
>> templates.
>>
>> In practice what I have done is to write special purpose functions in
>> the host language and call them from the template. Your keyword
>> example would require a language-specific callable function for each
>> supported language to check and perhaps mangle identifiers.
>>
>> But this project is only reasonable if the templating tool exists,
>> and it is sufficiently powerful that the conversion is largely
>> mechanical, and there are sufficient regression tests to check the
>> results.
>>
>> Of all the tools I know, this one
>> https://theantlrguy.atlassian.net/wiki/display/ST4/StringTemplate+4+Documentation
>> is the one that is most likely to be suitable.
>>
>> Regards
>> David M Bennett FACS
>>
>> Andl - A New Database Language - andl.org
>>
>>
>> -----Original Message-----
>> From: Jens Geyer [mailto:[email protected]]
>> Sent: Thursday, 8 October 2015 6:30 AM
>> To: Thrift-Dev <[email protected]>
>> Cc: [email protected]
>> Subject: Re: Major feature suggestion/observation
>>
>> Hi *,
>>
>> Please follow up on the dev list. Thank you.
>>
>> I agree that the existing code generation code has some potential, in
>> many ways. I even agree that it could be a good idea to rethink some
>> of the concepts. But the question I raised a few hours earlier (on
>> the dev list) was precisely targeted at what I think is the key
>> here: How much will it cost and how much will we really benefit from
>> converting everything into a template-based generator?
>>
>> Having a good portion of (production code) experience in both
>> template based and non-template based codegen worlds, I believe I can
>> speak with enough authority regarding this whole matter. From my
>> experiences, both ways have their pros and cons. For simple,
>> example-like stuff, everything is easy, with or without templates.
>> But in the real world, you will face lots of special cases making
>> your life harder. The good thing about a code-based generator is
>> that there are typically more options to deal with such things in a
>> performant and convenient way. Trying to express these in a template
>> language can become a pain very quickly. Templates are only as good
>> as the reach of the template language and system. It typically starts
>> to get complex with things that need to be enumerated and filtered.
>> Bringing loops and conditions into a template-based engine is a
>> challenging task; this is where the good, the bad and the ugly start
>> to become separated.
>>
>> In fact, given a fairly complex project, there is not much difference
>> in what you do when there is need to add features that are not
>> supported by your coded generator or template language: You change
>> the implementation.
>>
>> Just one example that is still something of a general issue across
>> all languages: reserved keywords. Besides the few obvious Thrift IDL
>> related keywords, each language has its own special set of reserved
>> keywords.
>> Putting all of them into one single global list that is used by all
>> (!) languages is something that I don't like very much, yet we still
>> have it in the Thrift compiler. Furthermore, each language has its
>> own way of dealing with reserved keywords: Some allow for a prefix
>> like @ or &. We also have some additional, per-language treatment in
>> the Thrift compiler as well to deal with these subtleties. Although
>> more to the point in my opinion, these solutions are by no means
>> perfect either.
>>
>> Now think about how a template-based generator could help with that
>> specific issue. I don't mean the question whether or not it is
>> possible /somehow/ - it should indeed be sort of a neat and clean
>> solution, a significant improvement over what we already have.
>>
>> You may get the impression that I'm against templates, but that's not
>> true, I am not. Templates are a very powerful tool. But I strongly
>> doubt that switching Thrift from one to the other just because it is
>> possible will produce enough net gain to justify the efforts needed.
>> In my humble opinion we should spend that time and developer-power
>> more wisely.
>>
>> $0,02,
>> JensG
>>
>>
>> -----Original Message-----
>> From: David Bennett
>> Sent: Wednesday, October 7, 2015 12:26 PM
>> To: [email protected]
>> Subject: RE: Major feature suggestion/observation
>>
>> [I'm wary of Boost. It's quite a commitment. But if needs must...]
>>
>> I had a quick look: it seems that the generation is achieved while
>> compiling the code using C++ templates. This is not what I had in
>> mind at all. It should be possible to edit a template without a C++
>> recompile.
>>
>> Here is a simple program in T4. You can probably see how it works
>> with no further explanation.
>>
>> <table class="detailstable">
>> <# foreach (var prop in data.Properties) { #>
>>   <tr>
>>     <th>
>>       <#= prop.Name #>
>>     </th>
>>     <td>
>>       <asp:DynamicControl DataField="<#= prop.Name #>" runat="server" />
>>     </td>
>>   </tr>
>> <# } #>
>> </table>
>>
>> But this is only suitable for C#, and rewriting the compiler is
>> definitely a step too far. There is Cheetah for Python and lots of
>> other HTML template engines, but in a quick review I could find
>> nothing suitable. Maybe I just imagined there was a solution...
>>
>> Regards
>> David M Bennett FACS
>>
>> Andl - A New Database Language - andl.org
>>
>>
>> -----Original Message-----
>> From: Philip Polkovnikov [mailto:[email protected]]
>> Sent: Wednesday, 7 October 2015 8:00 PM
>> To: [email protected]
>> Subject: Re: Major feature suggestion/observation
>>
>> David,
>>
>> Default codegen solution in C++ world is Boost Karma. Though I'm
>> unsure if it is OK to make users that would like to compile thrift
>> compiler set boost up and wait several minutes until thrift compiles.
>>
>> 2015-10-07 3:11 GMT+03:00 David Bennett <[email protected]>:
>>>
>>> Regards
>>> David M Bennett FACS
>>>
>>> Andl - A New Database Language - andl.org
>>>
>>>
>>> -----Original Message-----
>>> From: Roger Meier [mailto:[email protected]]
>>> Sent: Wednesday, 7 October 2015 5:34 AM
>>> To: [email protected]
>>> Cc: [email protected]
>>> Subject: Re: Major feature suggestion/observation
>>>
>>> Hi David
>>>
>>> Quoting David Bennett <[email protected]>:
>>>
>>>> I'm a compiler guy (amongst other scars).
>>>> I was somewhat surprised
>>>> when I opened up the Thrift compiler to discover that it uses
>>>> industrial strength parsing (for a very slim language) and a
>>>> hand-rolled, ad hoc source code generator (for a serious backend
>>>> problem). I had expected the exact opposite.
>>>>
>>>> After reading a few comments on this list I think a number of the
>>>> shortcomings of Thrift result from this. The compiler may be
>>>> 'tweakable' but it sure ain't configurable. The precise content of
>>>> the generated code (and how to alter it) is an ever present problem.
>>>>
>>>> My suggestion is that the backend of the compiler should be
>>>> entirely rewritten using modern code generation technology and a
>>>> selection of 'skeletons' provided as separate text files. Anyone
>>>> who wanted to tweak the output for any of their special use cases
>>>> could easily copy and modify an individual skeleton without having
>>>> to venture into the dark recesses of the C++ compiler.
>>>
>>>>>> Did you have a look at the JIRA issues related to rewrite and
>>>>>> changes on the compiler?
>>> I found this one: https://issues.apache.org/jira/browse/THRIFT-1173.
>>> It's right on the money, but seems to have been silently abandoned
>>> 4 years ago.
>>> Looks like the guy who tackled it didn't know enough about template
>>> tools to make it happen, despite the best of intentions.
>>>
>>> I didn't find anything else remotely similar, but lots of requests
>>> for little tweaks that would become no-brainers with a template system.
>>>
>>>>>> Have you seen the python variant? This was another try to do it again.
>>> No. Which issue?
>>>
>>>>>> I have seldom seen successful rewrites, usually it takes too
>>>>>> long to bring them to the same level. Personally, I'm a fan of
>>>>>> evolution.
>>> Agree absolutely.
>>> The only way to tackle this kind of transformation
>>> is to treat the existing compiler as the spec and set out to
>>> replicate it, to the point of being able to pass identical
>>> regression tests.
>>> That works, but it takes a while just to get back where you started.
>>>
>>>> With luck, the initial batch of skeletons could be extracted
>>>> directly from the existing compiler. It's still a biggish job.
>>>>
>>>> [Side digression: for some languages code generation is not really
>>>> needed. The language has sufficient abstraction capability to
>>>> implement the IDL directly. Since there are other languages that do
>>>> not, we are stuck with code generation.]
>>>>
>>>> The biggest choice is: which product to use for the code generation?
>>>> I have a little familiarity with T4 and the ANTLR StringTemplate,
>>>> and I've hand-rolled a couple of my own but there are heaps of
>>>> others out there. Maybe it all comes down to what you're used to.
>>>> I'm not sure I'm quite ready for the investment of time.
>>>>>> Feel free to rewrite the compiler and provide a test suite for review.
>>> Probably not -- Andl is keeping me busy enough for now. I was kind
>>> of hoping someone with C++/compiler experience could at least
>>> nominate a suitable template product. I don't know one, and a quick
>>> look at cpptemplate does not leave me filled with joy. Without this,
>>> it's just far too much work.
>>>
>>>>>> Improving the test suites across languages, improving CMake,
>>>>>> fixing bugs and many other topics to improve on Thrift has much
>>>>>> higher priority than rewriting something we already have.
>>> I get that. What Thrift does and what it needs don't really overlap
>>> my skill set (or my interests) all that well, but I will keep an eye
>>> out for somewhere I can help.
>>>
>>> best!
>>> Roger
>>>
>>>>>> PS: dev list is a better place for such discussions.
>>> Thanks. I'll look into that.
>>>
>>>
>>>> Regards
>>>> David M Bennett FACS
>>
