On Sat, Aug 15, 2009 at 06:04:39PM -0400, Joshua Cranmer wrote:
> Pippijn van Steenhoven wrote:
>> not needed for pork.
>
> If by that you mean removal of the various programs, you again have my
> support; some of the stuff that involves ASTs is useful, IMO.

What stuff? XML, maybe? Yes.. I am unsure about that, also because of
your next point:

>> Then, I would like to have perl bindings (maybe
>> python bindings) directly into the AST and build some tools with that. I
>> would also like to rewrite the deparser (pretty printer) in a scripting
>> language. For that, I would like to add intrusive support for perl.
>
> This is the part that I am most wary about. Elsa can output the AST into
> an XML string, which can easily be read in by nearly every major
> scripting language.

That is a valid point and I have thought about it. Not very much, but
when I first did this (yes, I already did all of this, but I lost it
during a hdd crash (I did it in 5 days' time and was done before my
weekly backup)), I just removed everything that was not strictly
necessary for the parser to work. I was planning to re-add it as an
external library not included with the actual parser. Intrusive support
is in fact against that idea, so I was also thinking about just writing
a visitor building a perl data structure. That would probably be
cleaner, too. On the whole, I think moving stuff out of the actual ast
into visitors is a good idea (including the deparser, which is currently
intrusive).

> Initializing an interpreter for every run of pork
> can hurt performance;

Initialising a perl interpreter takes 0.000 to 0.002 seconds according
to time(1). Since elsa can take an amount of time varying between 0.5
and 10 seconds for C++ files, I don't think it would be that much of a
performance hit.

> I have also had experience with poor performance
> crossing language boundaries, which means core APIs implemented in a
> scripting language could be painful in terms of performance.

I don't think of deparsing or xml production as a core API. I think of
core APIs being the least amount of code needed to have a parser and
typechecker running. I have no plans to rewrite the typechecker in perl.
That would be nonsense.. typechecking already takes the largest amount
of time in the current parser.

> I should also note that Mozilla tends to prefer JS or Python these days
> to perl.

That is fine, I just don't know JS and python very well. Perl would be
for me anyway. It would be the last thing I do and probably have little
value to anybody but me.

>> Before I do all that, I want to restructure the source tree and use
>> automake, autoconf and libtool.
>
> In my opinion, there is no reason to use the autotools suite if the
> configuration arguments are simple enough: it adds complexity for so
> little value (IMO).

I have been using (gnu) make for a long time now and I feel comfortable
with autotools. The makefiles would definitely become much shorter (I
have already done this before the hdd crash, as well :\) and in my
opinion, more understandable. It's basically just 6 lines of
configure.ac and 10 lines of makefile in addition to the source file
lists.

> Indeed, if it's simple enough that a small script
> would handle it just fine, there's no need to rip the system out for the
> harder-to-understand autotools suite.

Again, I don't think that suite is so hard to understand, but opinions
may vary. Restructuring, by the way, would also include putting files
that have something in common (the simplest case: all .ast files, all
.gr files) into separate directories.

> Much of the standard arguments are
> rather meaningless, since pork tools are probably too case-specific to
> be amenable to an installation procedure,

That's actually what I would like to improve. I want elsa to become a
standard component for many applications, including pork and oink. That
is also why I am so much opposed to the astgen type of intrusive
*everything*. Deparsing could so nicely be a visitor, just like xml
generation and whatnot. astgen requires access to the source of elsa and
its recompilation for every change. I don't think that is very clean.

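To make the "deparsing as a visitor" idea concrete, here is a minimal sketch in C++. It assumes a toy two-node AST; `Expr`, `Literal`, `BinaryOp`, `Deparser`, `XmlWriter` and `sample_tree` are hypothetical names for illustration and are not elsa's real classes or astgen output. The point is only that the AST exposes one `accept` hook, and both the pretty printer and the XML writer live entirely outside it.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical toy AST; these are NOT elsa's real node classes.
struct Literal;
struct BinaryOp;

// The only interface the AST has to know about.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(const Literal &node) = 0;
    virtual void visit(const BinaryOp &node) = 0;
};

struct Expr {
    virtual ~Expr() = default;
    virtual void accept(Visitor &v) const = 0;
};

struct Literal : Expr {
    int value;
    explicit Literal(int v) : value(v) {}
    void accept(Visitor &v) const override { v.visit(*this); }
};

struct BinaryOp : Expr {
    char op;
    std::unique_ptr<Expr> lhs, rhs;
    BinaryOp(char o, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
        : op(o), lhs(std::move(l)), rhs(std::move(r)) {
        assert(lhs && rhs);  // both operands are required
    }
    void accept(Visitor &v) const override { v.visit(*this); }
};

// Deparser (pretty printer) kept outside the AST classes.
struct Deparser : Visitor {
    std::ostringstream out;
    void visit(const Literal &n) override { out << n.value; }
    void visit(const BinaryOp &n) override {
        out << '(';
        n.lhs->accept(*this);
        out << ' ' << n.op << ' ';
        n.rhs->accept(*this);
        out << ')';
    }
};

// XML output as a second, independent visitor (no escaping; sketch only).
struct XmlWriter : Visitor {
    std::ostringstream out;
    void visit(const Literal &n) override {
        out << "<lit value=\"" << n.value << "\"/>";
    }
    void visit(const BinaryOp &n) override {
        out << "<binop op=\"" << n.op << "\">";
        n.lhs->accept(*this);
        n.rhs->accept(*this);
        out << "</binop>";
    }
};

std::string deparse(const Expr &e) { Deparser d; e.accept(d); return d.out.str(); }
std::string to_xml(const Expr &e) { XmlWriter w; e.accept(w); return w.out.str(); }

// Builds the tree for 1 + (2 * 3).
std::unique_ptr<Expr> sample_tree() {
    return std::make_unique<BinaryOp>(
        '+', std::make_unique<Literal>(1),
        std::make_unique<BinaryOp>('*', std::make_unique<Literal>(2),
                                   std::make_unique<Literal>(3)));
}
```

Adding another output format (a perl data structure builder, say) would then mean writing one more `Visitor` subclass in a separate library, without touching or recompiling the node classes.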
> and there's no need to have
> pork work on systems that require arcane reconfiguration.

I don't primarily use autotools for strange systems with even stranger
compiler quirks, but they do relieve me of that duty. I do primarily use
them simply because they are so easy to use. I know there are many bad
examples of people not fully understanding the power of autotools and
hacking something into them that could easily be done much more cleanly
and efficiently if those people had actually consulted the manual.

For fairness, I need to add here that some things are really hard to add
to automake. For instance, if you have a code generator that generates
two files, make (not just automake) really can't cope with it. You end
up doing hacks like "the second file produced depends on the first file
produced and has a no-op production rule". That's an inconvenience and
ugliness I bite my tongue over every time I have to do it. Also, it may
just be me not reading the manual carefully enough (I have not read it
completely, yet), but I don't know how to easily add support for new
source types (such as .ast) and have automake generate the appropriate
makefile code for it. One time, I hacked up the automake script just for
that and I have seen other people do it. Mostly, I just use gnu make,
though, which does all of that very well (gnu make is brilliant, it's a
freaking *programming* language).

>> 1. restructure source tree, use autotools
>
> How do you intend to restructure the tree? I've also already given my
> distrust of autotools, so unless it's strictly necessary, you should
> probably save this step for later.

Hm.. yes, I could save it up for later. I could do 2, 3 and 4, first. No
arguments or feelings against it.

>> 2. remove smbase string
>> 3. remove other smbase ADTs and use std:: ones
>> 4. remove even more smbase ADTs and use boost:: ones
>
> Two things to note here. When I last spoke with Taras, one of the things
> he mentioned was that his biggest perceived barrier to removing
> sm::string was concerns over the performance of the codebase. The goal
> of porky is to be able to handle thousands of files of code, with header
> includes that can easily push files past the 100,000 line mark or more.
> It would be advisable to ensure that your removal does not adversely
> impact performance of code parsing.

I highly doubt that replacing sm::string with std::string would
adversely impact performance, especially of code parsing, since code
parsing does not in fact operate on sm::strings very much. Anyhow, my
tests before the hdd crash pointed the other way: it was faster. That
said, I didn't test it on very large code files, I mostly just #included
every C++ header and instantiated some of the std templates.

> The other thing to note is the usage of boost::. I think there are
> likely to be systems without boost (std::tr1 should be usable with the
> appropriate g++ options); it may be inadvisable to require external
> libraries if there is no real need for them.

TR1 has no intrusive_ptr, which would replace the refcounted pointer
class used in elkhound. TR1 has no ptr containers that would replace the
Obj containers of smbase. Boost has a very permissive licence that would
allow you to ship the required headers with porky. There is no real need
for it, but it would remove a code module from maintenance. I am
generally very much in favour of removing as much from my own
maintenance as possible, so I can concentrate on what is really my goal.
I have learned to not be afraid of adding library dependencies
(especially boost, since 1) it is very portable and 2) most people
involved with projects I work on (C++ ones, mostly) already have it
installed. Porky and elsa being C(++) parsing and manipulation projects
make it quite probable that its potential users are going to have boost
installed).

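For readers who have not used an intrusive pointer, the sketch below shows the idea in plain C++ without depending on boost. boost::intrusive_ptr finds the count through two free functions, `intrusive_ptr_add_ref` and `intrusive_ptr_release`, and the sketch uses the same hook names. Everything else (`RefCounted`, `IntrusivePtr`, `Node`, `live_nodes`, `demo_shared_lifetime`) is a hypothetical stand-in for illustration, not elkhound's refcounted pointer class and not boost's implementation.

```cpp
#include <cassert>
#include <utility>

// Hypothetical base class: the reference count lives inside the object,
// which is what makes the pointer "intrusive".
class RefCounted {
public:
    RefCounted() : refs_(0) {}
    RefCounted(const RefCounted &) : refs_(0) {}          // copies start uncounted
    RefCounted &operator=(const RefCounted &) { return *this; }
    int use_count() const { return refs_; }
protected:
    virtual ~RefCounted() = default;
private:
    friend void intrusive_ptr_add_ref(RefCounted *p);
    friend void intrusive_ptr_release(RefCounted *p);
    int refs_;
};

// Same hook names that boost::intrusive_ptr looks up.
inline void intrusive_ptr_add_ref(RefCounted *p) { ++p->refs_; }
inline void intrusive_ptr_release(RefCounted *p) {
    assert(p->refs_ > 0);
    if (--p->refs_ == 0) delete p;  // last owner destroys the object
}

// Minimal stand-in for the smart pointer itself (assumption: sketch only).
template <typename T>
class IntrusivePtr {
public:
    IntrusivePtr() : p_(nullptr) {}
    explicit IntrusivePtr(T *p) : p_(p) { if (p_) intrusive_ptr_add_ref(p_); }
    IntrusivePtr(const IntrusivePtr &o) : p_(o.p_) { if (p_) intrusive_ptr_add_ref(p_); }
    IntrusivePtr &operator=(IntrusivePtr o) { std::swap(p_, o.p_); return *this; }
    ~IntrusivePtr() { if (p_) intrusive_ptr_release(p_); }
    T *get() const { return p_; }
    T *operator->() const { return p_; }
    T &operator*() const { return *p_; }
private:
    T *p_;
};

int live_nodes = 0;  // counts Node objects currently alive

// Hypothetical refcounted node type.
struct Node : RefCounted {
    Node() { ++live_nodes; }
    ~Node() override { --live_nodes; }
};

// Returns (count while two owners exist) * 10 + (count after one is gone).
int demo_shared_lifetime() {
    IntrusivePtr<Node> a(new Node);
    int during = 0;
    {
        IntrusivePtr<Node> b = a;  // second owner, same object
        during = a->use_count();
    }
    return during * 10 + a->use_count();
}
```

Because the count sits in the object, a raw `Node *` can be turned back into an owning pointer at any time, which a non-intrusive shared pointer cannot do safely. That is the property a replacement for an existing refcounted class would need.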
I am of course willing to rearrange the order of my steps to meet your
requirements/wishes. I want to do most (all?) of it eventually
(excluding maybe the *intrusive* part of perl support, I have thought
about it some more and found that it in fact is against everything I
love and care for in programming (okay, that was exaggerated, but..),
namely separation of concerns. I like to keep things small, separate
modules appropriately. That would include no binary dependency or even
inclusion of deparsing or xml generation in the core parsing library).
All I want is to give you and other elsa users as much of my work as
possible. It's always nice to have as many people as possible profit
from work one does. At least that is my view on things.

>> 5. remove some unused or unneeded parts of elsa (they have some weird
>> macro support I don't think actually works)
>
> Macro support--in terms of mcpp-style unwinding--is very much a
> necessity for the fork of elsa that Mozilla maintains.

Macro support in terms of.. /*!foo/*/ /*<foo/>*/.. I mean.. what the
heck is this for? Does anybody even use that? Does it work at all? I
haven't tried very hard, since I was too annoyed by the apparently
senseless addition to the code that even uses an entire source file
(cppundolog.cc). That is what I want to remove. I can't currently think
of any other useless parts. It may be the only one, but I don't know
what it's for. Maybe SM can tell us something about it...

>> 6. remove XML in/output from elsa (yes, it's nice, but it's for the
>> best and can easily be replaced by "my $xml = XMLout($ast)" after
>> integrating elsa with perl
>
> Since I believe many people interested in pork will have no to little
> experience with perl, I do not think that requiring people to use a
> foreign language for a basic feature is a good idea.

Indeed. I have, as described above, given it some more thought and I
think having perl as visitor is more feasible.

>> 7. add (intrusive) perl support to elsa's AST
>> 8. rewrite the deparser in perl and remove it from the C++ code
>
> Again, I personally believe that adding perl support would be the wrong
> way to go.

Adding perl support would be a good thing, I think. Transforming the
XMLised AST into perl datastructures would be more work and less
maintainable than having a visitor produce the structures. Someone did
the same for ocaml, I believe, I just can't think of who it was or what
he did it for.

> Caveat: I am not the central maintainer of pork or the extant elsa fork.
> My opinions are wholly my own, and I can only speculate as to what
> others (including Taras) may think of this proposal.

Others, including Taras, can think about it for quite a while, as I am
busy for the next few weeks (maybe months). I am glad to see such a
rapid and positive response. I am more than willing to save changes for
later, forked efforts and do other changes earlier so porky and other
projects can benefit optimally. I have many plans and they are all very
limited in scope, so they can easily be done in separate steps, some
earlier, some later.

Please forgive me for producing this very badly structured mail. In my
defence, I can only say I just came from the airport, it is 5:22 in the
morning, I am very tired, but I just couldn't wait to answer.

Thanks again for your quick reply.

--
Pippijn van Steenhoven
_______________________________________________
dev-static-analysis mailing list
https://lists.mozilla.org/listinfo/dev-static-analysis
