On Sat, Aug 15, 2009 at 06:04:39PM -0400, Joshua Cranmer wrote:
> Pippijn van Steenhoven wrote:
>> not needed for pork.
>
> If by that you mean removal of the various programs, you again have my  
> support; some of the stuff that involves ASTs is useful, IMO.

What stuff? XML, maybe? Yes.. I am unsure about that, also because of
your next point:

>> Then, I would like to have perl bindings (maybe
>> python bindings) directly into the AST and build some tools with that. I
>> would also like to rewrite the deparser (pretty printer) in a scripting
>> language. For that, I would like to add intrusive support for perl.
>
> This is the part that I am most wary about. Elsa can output the AST into  
> an XML string, which can easily be read in by nearly every major  
> scripting language.

That is a valid point and I have thought about it, though not very much.
When I first did this (yes, I had already done all of it, but lost it in
an HDD crash; it took me five days and was finished before my weekly
backup), I just removed everything that was not strictly necessary for
the parser to work. I was planning to re-add it as an external library
not included with the actual parser. Intrusive support in fact goes
against that idea, so I was also thinking about just writing a visitor
that builds a perl data structure. That would probably be cleaner, too.
On the whole, I think moving stuff out of the actual AST into visitors is
a good idea (including the deparser, which is currently intrusive).
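
To make the visitor idea a bit more concrete, here is a rough sketch. It
is only an illustration under assumptions: the node classes (Literal,
BinaryOp) and the helper names are made up and are not elsa's real AST,
and a plain C++ tree (Value) stands in for the perl data structure. The
point is that the AST only needs an accept() hook; deparsing and
data-structure building live outside it.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature AST; these are NOT elsa's real node classes.
struct Literal;
struct BinaryOp;

// Visitor interface: one callback per node type.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(const Literal& n) = 0;
    virtual void visit(const BinaryOp& n) = 0;
};

struct Node {
    virtual ~Node() = default;
    virtual void accept(Visitor& v) const = 0;  // the only hook the AST needs
};

struct Literal : Node {
    int value;
    explicit Literal(int v) : value(v) {}
    void accept(Visitor& v) const override { v.visit(*this); }
};

struct BinaryOp : Node {
    char op;
    std::unique_ptr<Node> lhs, rhs;
    BinaryOp(char o, std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : op(o), lhs(std::move(l)), rhs(std::move(r)) {}
    void accept(Visitor& v) const override { v.visit(*this); }
};

// Deparser (pretty printer) written as a visitor, outside the AST classes.
struct Deparser : Visitor {
    std::ostringstream out;
    void visit(const Literal& n) override { out << n.value; }
    void visit(const BinaryOp& n) override {
        out << '(';
        n.lhs->accept(*this);
        out << ' ' << n.op << ' ';
        n.rhs->accept(*this);
        out << ')';
    }
};

// Generic tree standing in for a scripting-language data structure.
struct Value {
    std::string kind;
    std::string text;
    std::vector<Value> children;
};

// Second visitor: builds the generic tree bottom-up using a stack.
struct TreeBuilder : Visitor {
    std::vector<Value> stack;
    void visit(const Literal& n) override {
        stack.push_back(Value{"Literal", std::to_string(n.value), {}});
    }
    void visit(const BinaryOp& n) override {
        n.lhs->accept(*this);
        n.rhs->accept(*this);
        Value r = std::move(stack.back()); stack.pop_back();
        Value l = std::move(stack.back()); stack.pop_back();
        stack.push_back(Value{"BinaryOp", std::string(1, n.op),
                              {std::move(l), std::move(r)}});
    }
};

std::string deparse(const Node& root) {
    Deparser d;
    root.accept(d);
    return d.out.str();
}

Value to_tree(const Node& root) {
    TreeBuilder b;
    root.accept(b);
    assert(b.stack.size() == 1);  // exactly one result for a whole tree
    return b.stack.back();
}

// Builds the expression (1 + 2) * 3.
std::unique_ptr<Node> sample_ast() {
    auto sum = std::make_unique<BinaryOp>('+', std::make_unique<Literal>(1),
                                          std::make_unique<Literal>(2));
    return std::make_unique<BinaryOp>('*', std::move(sum),
                                      std::make_unique<Literal>(3));
}
```

Calling deparse(*sample_ast()) walks the tree and returns the
parenthesised expression, and to_tree() returns the same tree as nested
Value records. Adding another output format would mean adding another
visitor, without touching or recompiling the node classes.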

> Initializing an interpreter for every run of pork  
> can hurt performance;

Initialising a perl interpreter takes 0.000 to 0.002 seconds according to
time(1). Since elsa can take an amount of time varying between 0.5 and 10
seconds for C++ files, I don't think it would be that much of a
performance hit.

> I have also had experience with poor performance  
> crossing language boundaries, which means core APIs implemented in a  
> scripting language could be painful in terms of performance.

I don't think of deparsing or xml production as a core API. I think of
core APIs being the least amount of code needed to have a parser and
typechecker running. I have no plans to rewrite the typechecker in perl.
That would be nonsense.. typechecking already takes the largest amount of
time in the current parser.

> I should also note that Mozilla tends to prefer JS or Python these days  
> to perl.

That is fine, I just don't know JS and python very well. Perl would be
for me anyway. It would be the last thing I do and probably have little
value to anybody but me.

>> Before I do all that, I want to restructure the source tree and use
>> automake, autoconf and libtool.
>
> In my opinion, there is no reason to use the autotools suite if the  
> configuration arguments are simple enough: it adds complexity for so  
> little value (IMO).

I have been using (gnu) make for a long time now and I feel comfortable
with autotools. The makefiles would definitely become much shorter (I
had already done this before the HDD crash as well :\) and, in my
opinion, more understandable. It's basically just 6 lines of configure.ac
and 10 lines of makefile in addition to the source file lists.

> Indeed, if it's simple enough that a small script  
> would handle it just fine, there's no need to rip the system out for the  
> harder-to-understand autotools suite.

Again, I don't think that suite is so hard to understand, but opinions
may vary. Restructuring, by the way, would also include putting files
that have something in common (the simplest case: all .ast files, all .gr
files) into separate directories.

> Much of the standard arguments are  
> rather meaningless, since pork tools are probably too case-specific to  
> be amenable to an installation procedure,

That's actually what I would like to improve. I want elsa to become a
standard component for many applications, including pork and oink. That
is also why I am so much opposed to the astgen type of intrusive
*everything*. Deparsing could so nicely be a visitor, just like xml
generation and whatnot. astgen requires access to the source of elsa and
its recompilation for every change. I don't think that is very clean.

> and there's no need to have  
> pork work on systems that require arcane reconfiguration.

I don't primarily use autotools for strange systems with even stranger
compiler quirks, but they do relieve me of that duty. I primarily use
them simply because they are so easy to use. I know there are many bad
examples of people not fully understanding the power of autotools and
hacking something into them that could have been done much more cleanly
and efficiently if those people had actually consulted the manual.

In fairness, I need to add here that some things are really hard to add
to automake. For instance, if you have a code generator that generates
two files, make (not just automake) really can't cope with it. You end up
doing hacks like "the second file produced depends on the first file
produced and has a no-op production rule". That's an inconvenience and
ugliness I bite my tongue over every time I have to do it. Also, it may
just be me not reading the manual carefully enough (I have not read it
completely yet), but I don't know how to easily add support for new
source types (such as .ast) and have automake generate the appropriate
makefile code for them. One time, I hacked up the automake script just
for that, and I have seen other people do it. Mostly, I just use gnu
make, though, which does all of that very well (gnu make is brilliant,
it's a freaking *programming* language).

>>   1. restructure source tree, use autotools
>
> How do you intend to restructure the tree? I've also already given my  
> distrust of autotools, so unless it's strictly necessary, you should  
> probably save this step for later.

Hm.. yes, I could save it for later. I could do 2, 3 and 4 first. No
arguments or feelings against it.

>>   2. remove smbase string
>>   3. remove other smbase ADTs and use std:: ones
>>   4. remove even more smbase ADTs and use boost:: ones
>
> Two things to note here. When I last spoke with Taras, one of the things  
> he mentioned was that his biggest perceived barrier to removing  
> sm::string was concerns over the performance of the codebase. The goal  
> of porky is to be able to handle thousands of files of code, with header  
> includes that can easily push files past the 100,000 line mark or more.  
> It would be advisable to ensure that your removal does not adversely  
> impact performance of code parsing.

I highly doubt that replacing sm::string with std::string would adversely
impact performance, especially of code parsing, since code parsing does
not in fact operate on sm::strings very much. Anyhow, my tests before the
HDD crash showed, if anything, the opposite: it was faster. That said, I
didn't test it on very large code files; I mostly just #included every
C++ header and instantiated some of the std templates.

> The other thing to note is the usage of boost::. I think there are  
> likely to be systems without boost (std::tr1 should be usable with the  
> appropriate g++ options); it may be inadvisable to require external  
> libraries if there is no real need for them.

TR1 has no intrusive_ptr, which would replace the refcounted pointer
class used in elkhound. TR1 has no ptr containers that would replace the
Obj containers of smbase. Boost has a very permissive licence that would
allow you to ship the required headers with porky. There is no real need
for it, but it would remove a code module from maintenance. I am
generally very much in favour of removing as much from my own
maintenance as possible, so I can concentrate on what is really my goal.
I have learned not to be afraid of adding library dependencies,
especially boost, since 1) it is very portable and 2) most people
involved with the projects I work on (C++ ones, mostly) already have it
installed. Porky and elsa being C(++) parsing and manipulation projects
makes it quite probable that their potential users will have boost
installed.

I am of course willing to rearrange the order of my steps to meet your
requirements/wishes. I want to do most (all?) of it eventually, excluding
maybe the *intrusive* part of perl support. I have thought about it some
more and found that it in fact goes against everything I love and care
for in programming (okay, that was exaggerated, but..), namely separation
of concerns. I like to keep things small and to separate modules
appropriately. That would mean no binary dependency on, or even inclusion
of, deparsing or xml generation in the core parsing library.

All I want is to give you and other elsa users as much of my work as
possible. It's always nice to have as many people as possible profit from
the work one does. At least that is my view on things.
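
As a rough illustration of the intrusive_ptr point above: an intrusive
refcounted pointer keeps the count inside the object itself. The sketch
below is an assumption-laden stand-in, not elkhound's class and not
Boost's implementation. Counted and IntrusivePtr are made-up names; only
the two free functions follow the hook convention that
boost::intrusive_ptr uses (intrusive_ptr_add_ref and
intrusive_ptr_release), so that the stand-in class could in principle be
swapped for the Boost one.

```cpp
#include <cassert>
#include <utility>

// Hypothetical object with an embedded (intrusive) reference count.
struct Counted {
    static int live;  // number of objects currently alive, for checking
    int refs = 0;
    Counted() { ++live; }
    ~Counted() { --live; }
};
int Counted::live = 0;

// Hooks named after the convention boost::intrusive_ptr relies on.
inline void intrusive_ptr_add_ref(Counted* p) { ++p->refs; }
inline void intrusive_ptr_release(Counted* p) {
    if (--p->refs == 0) delete p;  // last owner gone: destroy the object
}

// Minimal stand-in for an intrusive smart pointer (illustrative only).
template <class T>
class IntrusivePtr {
    T* p_;
public:
    explicit IntrusivePtr(T* p = nullptr) : p_(p) {
        if (p_) intrusive_ptr_add_ref(p_);
    }
    IntrusivePtr(const IntrusivePtr& o) : p_(o.p_) {
        if (p_) intrusive_ptr_add_ref(p_);
    }
    // Copy-and-swap: the by-value parameter releases the old pointee.
    IntrusivePtr& operator=(IntrusivePtr o) {
        std::swap(p_, o.p_);
        return *this;
    }
    ~IntrusivePtr() {
        if (p_) intrusive_ptr_release(p_);
    }
    T* get() const { return p_; }
    T* operator->() const { return p_; }
};

// Shares one object between two pointers and returns the count that is
// left after the inner pointer has gone out of scope.
int refcount_demo() {
    IntrusivePtr<Counted> a(new Counted);
    {
        IntrusivePtr<Counted> b = a;  // second owner of the same object
        assert(a->refs == 2);
    }
    return a->refs;
}
```

Since the count lives in the object, no separate control block is
allocated, which is the property a shared_ptr-style TR1 pointer would
not give you.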

>>   5. remove some unused or unneeded parts of elsa (they have some weird
>>      macro support I don't think actually works)
>
> Macro support--in terms of mcpp-style unwinding--is very much a  
> necessity for the fork of elsa that Mozilla maintains.

Macro support in terms of.. /*!foo/*/ /*<foo/>*/.. I mean.. what the heck
is this for? Does anybody even use that? Does it work at all? I haven't
tried very hard, since I was too annoyed by the apparently senseless
addition to the code that even uses an entire source file
(cppundolog.cc). That is what I want to remove. I can't currently think
of any other useless parts. It may be the only one, but I don't know what
it's for. Maybe SM can tell us something about it...

>>   6. remove XML in/output from elsa (yes, it's nice, but it's for the
>>      best and can easily be replaced by "my $xml = XMLout($ast)" after
>>      integrating elsa with perl
>
> Since I believe many people interested in pork will have no to little  
> experience with perl, I do not think that requiring people to use a  
> foreign language for a basic feature is a good idea.

Indeed. I have, as described above, given it some more thought and I
think having perl support as a visitor is more feasible.

>>   7. add (intrusive) perl support to elsa's AST
>>   8. rewrite the deparser in perl and remove it from the C++ code
>
> Again, I personally believe that adding perl support would be the wrong  
> way to go.

Adding perl support would be a good thing, I think. Transforming the
XMLised AST into perl datastructures would be more work and less
maintainable than having a visitor produce the structures. Someone did
the same for ocaml, I believe, I just can't think of who it was or what
he did it for.

> Caveat: I am not the central maintainer of pork or the extant elsa fork.  
> My opinions are wholly my own, and I can only speculate as to what  
> others (including Taras) may think of this proposal.

Others, including Taras, can think about it for quite a while, as I am
busy for the next few weeks (maybe months). I am glad to see such a rapid
and positive response. I am more than willing to save some changes for
later, forked efforts and to do others earlier, so that porky and other
projects benefit as much as possible. I have many plans and they are all
very limited in scope, so they can easily be done in separate steps, some
earlier, some later.

Please forgive me for producing this very badly structured mail. In my
defence, I can only say I just came from the airport, it is 5:22 in the
morning, I am very tired, but I just couldn't wait to answer.

Thanks again for your quick reply.

-- 
Pippijn van Steenhoven

_______________________________________________
dev-static-analysis mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-static-analysis
