-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bruno Haible wrote:
> If gnulib-tool was to be rewritten in another programming language than
> shell + sed, what would be the good choices?
>
> The foremost criteria IMO should be the maintainability, i.e. the ability for
> us and for new contributors to gnulib to master this programming language.
> To get an estimate of this, there are various sources of information.
>
> 1) We can look at the number of developers who master one language or the
> other. This matters because we cannot force or expect gnulib contributors
> to learn a new programming language, just for gnulib-tool.
>
> I compared C, C++, Java, shell-script, Python, Perl in ohloh:
>
> <http://www.ohloh.net/languages/compare?commit=Update&l0=c&l1=java&l2=perl&l3=python&l4=shell&l5=cpp&l6=-1&measure=contributors&percent=>
> The result is the following ordered list:
> 1. C
> 2. Java
> 3. C++
> 4. Python
> 5. perl
> The comparison by number of projects rather than by number of developers
>
> <http://www.ohloh.net/languages/compare?commit=Update&l0=c&l1=java&l2=perl&l3=python&l4=shell&l5=cpp&l6=-1&measure=projects&percent=>
> yields the same result.

I don't find a number-of-projects metric to be all that terribly useful,
versus number-of-developers. I don't think the one is an accurate
stand-in for the other.

I suspect that these results represent the GNU and Free Software
development communities quite poorly. I would definitely place Java and
C++ much lower down the list based on my own personal experience with
developers' abilities, for Free Software developers I've met. I suspect
Perl and Python have similarly-sized communities; Python seems to be
growing steadily, but Perl seems to have more history with GNU Software
in particular.

> 2) We can also look at the level of familiarity of the current gnulib-tool
> maintainers with these languages. Among us recent contributors to
> gnulib-tool (Eric, Jim, Ralf, Simon, and me) two of us have made public
> their skills:
> <http://savannah.gnu.org/people/resume.php?user_id=1389>
> <http://savannah.gnu.org/people/resume.php?user_id=1871>
> making up for:
> C - 2 x master/expert
> Java - 2 x master/expert
> C++ - 1 x master/expert, 1 x good knowledge
> Python - 1 x base knowledge
> perl - 1 x base knowledge
>
> Also, I know that a few years ago Paul did not know C++ and was not
> inclined to learn it.
>
> So according this criteria, only C and Java remain possibilities. Python
> and perl have to be excluded because too few of us are skilled in these
> languages.

If one accepts that mastery of the language, as opposed to base
knowledge, is necessary. That is absolutely true for C and C++; but in
the case of Python, there isn't really a whole lot to know IME.

> 3) Long-term maintainability requires some degree of standardization, so
> that the amount of expected future changes in the language and its runtime
> library is small. This speaks in favour of C, Java, C++, and against
> Python and perl.

I disagree wrt Python. Python has fairly thorough language and modules
specifications, and has multiple implementations in use.
In addition, a good level of future-"resistance" and backwards
compatibility is maintained, through things such as the "__future__"
module.

And standardization seems to be a poor indicator of portability or of
how much a language will change. For instance, shell code is of course
standardized, and yet the existing standards often poorly represent
real-world implementations. Likewise, the latest C standard has been
around for about a decade now, and yet there are only one or two
implementations that conform well to it (gcc not being one of them,
though it implements most of the features I tend to care about).
Conversely, Perl doesn't really have a "standard", and yet old code has
continued to be portable in newer language versions (this won't be true
in 6.0, but somehow I don't think that'll matter much).

> 4) For comparing simply the syntactic complexity of the languages (yes this
> is only a small facet of maintainability, but nevertheless), one can take
> the amount of code needed for writing a superficial parser. Such parsers
> are implemented in gettext/gettext-tools/src/x-*.c, and x-perl.c is more
> than twice as large as the other parsers. This indicates that also for a
> human developer, perl syntax is harder to grok than the syntax of other
> programming and scripting languages.

Again, I think that's a poor measurement. What is easy for humans to
write is often complex to parse. The ideal way to write software would
be to provide a set of text instructions, in English. That would
obviously require a very large amount of parser logic. Also, I'm
guessing those gettext things are lexers, and not grammar parsers, as
AFAICT gettext shouldn't need grammars, so they're not the whole story.

In this case, I'd agree that Perl's is the most complex syntax; however,
syntax is a very small part of a language's complexity. Perl and Python
both offer quite a few high-level features that make the most common
programming tasks much, much easier.
By contrast, C's standard libraries are extremely low-level, and must be
supplemented a great deal. Of course, we have a lot of handy utilities
in gnulib itself to mitigate that a good deal...

Still, I'd say Python and Perl come to the fore in terms of
ease-of-implementation, followed by Java, and then C and C++ distantly.
In terms of how easy it would be for future maintainers to maintain
someone else's code, Python would probably come first, followed by Java,
C and C++. Perl can be written so that it's easier to maintain than
easily-maintained C; it can also be written so that it's more difficult.
Unless care is taken, it's probably more difficult in general.

I'm somewhat surprised that "suitability to the task" got a miss. I
definitely agree with those who have suggested that a scripting language
makes a great deal of sense, particularly scripting languages whose
implementations are likely to already be present on the machine. I
suspect Perl may have a somewhat larger install base on Unixen than
Python, though AFAICT that gap is closing in newer installs.

If, as Mike suggests, string processing is a significant part of
gnulib-tool's task, then Perl again seems the winner here: despite the
fact that Java and Python have strong string-parsing _libraries_, Perl's
built-into-the-syntax regex and string-manipulation operators are a big
win.

Obviously, the very biggest concern is what makes the most sense to
gnulib's current implementors/maintainers, followed by what will work
the best for future ones. Well, and ease-of-use to the user should be
way up there, too, and I consider a compiled language, or even worse, a
language with relatively poor install base on Unixen [Java], to have a
pretty severe impact on that (provided the alternatives are likely to
already be available on the system); but obviously no one can tell you
what you're most comfortable with.

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
GNU Maintainer: wget, screen, teseq
http://micah.cowan.name/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAklkLhQACgkQ7M8hyUobTrE6ygCZAd1M67rv31y+n7FCGWa7Y5ZH
w18AniTj5yvvDu/VNdIY0jKDq9imP9RB
=vYVO
-----END PGP SIGNATURE-----