Hi. First I want to say something about generated files in the git repo. Then I want to say something about a related but deeper issue: the trusting trust problem.

The repo contains the generated file src/parse-gram.c, and I think this is bad. You know why: it makes git merges harder, useless "regen" commits appear, and so on. There is a well-known rule: don't put generated files into the SCM.

You may say that removing this file will create problems, because the user will have to install Bison before building Bison from git. Well, yes, they will. So what? They already have to install lots of other packages to build from git, see README-hacking. What is wrong with adding one more dependency? There is no bootstrapping problem for the user: they simply install bison from their distro, or download a release tarball from gnu.org, and then build bison from git. Also, autoconf depends on itself when building from its SCM, and its developers don't see any problem with that: look at their repo, it has configure.ac but no configure.

And finally: today there are two states, "I cloned the git repo and it contains an up-to-date parse-gram.c" and "I cloned the git repo and it contains an out-of-date parse-gram.c". If you simply remove parse-gram.c, these two states go away, and the build system possibly becomes simpler.

I suggest not adding a build-system rule that rebuilds parse-gram.c with the just-built bison, i.e. parse-gram.c should always be built with the bison installed on the system. This will simplify everything. The rule for deciding whether parse-gram.c should be rebuilt becomes simply "it does not exist, or its modification time is older than the modification time of parse-gram.y".

You may say that the changes I suggest complicate the workflow when parse-gram.y uses features that were introduced into bison recently. Well, yes. And that is one of the reasons why you should not use recently introduced features in parse-gram.y in the first place.

Now I want to talk about a related issue. Bison uses a Bison-generated parser, and this is wrong.
Even if you choose to keep using a Bison-generated parser, you should use only POSIX-compatible features, and so make sure that alternative yacc implementations can process your parse-gram.y.

Why? NOT because this self-dependency will create problems when porting to a new architecture or a new operating system. It will not. Anybody who wants to port Bison to a new operating system will simply grab a release tarball, which already contains all the generated parsers.

OK, so why then? First, because this self-dependency complicates audit. If somebody wants to verify that Bison doesn't contain any malicious code, they will have a problem with parse-gram.c. How do you verify it? At first sight, we should just verify parse-gram.y, then process it to get parse-gram.c, and check that we get the same parse-gram.c as in the release tarball. But this means running an untrusted bison binary. So the only way to verify parse-gram.c is to verify the file itself, i.e. an actual human auditor will be forced to read and verify a generated, hard-to-read 103k parse-gram.c.

In this sense Bison is not free software at all. Why? I would say that being free software includes this: you can get it in source-only form ("source" being something written by a human, so that a human can read and modify it), and then build it using these sources only and nothing else. That way you can be sure you got a trusted binary. And so Bison is not free software. (Of course, I know that my understanding of free software differs from the usual one.)

You may ask: how is it possible to insert malicious code into parse-gram.c without inserting it into parse-gram.y? Well, it is possible. Read the beautiful article by Ken Thompson, "Reflections on Trusting Trust". This is his Turing Award lecture, published in 1984: https://www.archive.ece.cmu.edu/~ganger/712.fall02/papers/p761-thompson.pdf . Also read a related work: Dr. David A.
Wheeler's PhD thesis: https://dwheeler.com/trusting-trust/ . Despite the words "Fully Countering" in the thesis title, keep in mind that this "Fully Countering" requires that some "trusted compiler" is already available, which is a serious limitation of the presented method. Still, the thesis gives a lot of information on the topic.

Second, some distros have a policy of rebuilding all generated files. Debian is one of them: https://wiki.debian.org/UpstreamGuide says "we need to rebuild all generated files to make sure that they can really be built from source". This means that in Debian bison build-depends on itself. However, when I look at the actual build-dependencies of bison ( https://packages.debian.org/source/sid/bison ), I don't see bison there. This looks like a bug. I will think about it and probably report it. So, in theory, bison is self-build-dependent in Debian, and this is bad, because it creates a lot of problems for Debian: the package has to be handled specially.

OK, you will say that there are a lot of packages around that build-depend on themselves, for example gcc. Well, yes. And this is bad. Moreover, every operating system has its own cyclic build-dependency graph of core packages, and these graphs tend to be huge and complicated. For example, slide 44 of https://www.gnu.org/software/guix/guix-els-20130603.pdf shows this graph for GNU Guix SD. I want to note that Bison is present in this graph. Yes, the scale is too small to really see anything in the picture, but when I open the slides in my browser and press Ctrl-F, I can find the word "bison" in the graph. Alternatively, you can install guix and run "guix graph --type=bag hello | dot -Tsvg > /graph.svg". (You can replace "hello" with "gcc" or any other core package; the graph will still contain "bison", because you need bison to build glibc.) So such a big graph means problems for audit.
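
The Ctrl-F check described above can also be done mechanically on the DOT text that guix graph prints, before it is piped to dot. The following is only a rough sketch. It assumes that each node in that output carries a label = "name-version" attribute, which is an assumption about the output format and not something taken from the Guix documentation. The function name packages_matching and the script name find_pkg.py are made up for this example.

```python
import re
import sys

# Assumption: every node line has the form  "<id>" [label = "name-version", ...];
# Only the label text is looked at; edges without labels are ignored.
LABEL_RE = re.compile(r'label\s*=\s*"([^"]*)"')


def packages_matching(dot_text, needle):
    """Return the sorted, de-duplicated node labels that contain `needle`."""
    labels = LABEL_RE.findall(dot_text)
    return sorted({label for label in labels if needle in label})


if __name__ == "__main__":
    # The package name to look for; defaults to "bison".
    needle = sys.argv[1] if len(sys.argv) > 1 else "bison"
    for label in packages_matching(sys.stdin.read(), needle):
        print(label)
```

It would be used as "guix graph --type=bag hello | python3 find_pkg.py bison". If the label assumption does not hold for your guix version, the script prints nothing, so an empty result is not proof that bison is absent from the graph.
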
And I think we should all take every measure to keep this graph as small as possible. And, well, your package ended up in this graph. Even if the glibc developers remove their dependency on bison, bison will still be self-dependent.

OK, what to do? Ideally, just write an ordinary hand-written recursive descent parser. Or at least write a yacc-compatible grammar, so that bison can be checked using an alternative yacc-compatible implementation.

Two more useful links: bootstrappable.org and https://www.gnu.org/software/guix/manual/en/guix.html#Bootstrapping (the section on bootstrapping in the GNU Guix manual).

You may say "I will not do this, but it would be a valuable contribution". Well, I don't want to do it either. But you can write to the http://bootstrappable.org/ mailing list, and it is possible that someone will become enthusiastic and rewrite parse-gram without yacc/bison. It could be a simple contribution for anyone who is interested in bootstrapping problems but not up to doing something really hard (say, writing a Haskell compiler in C++).

I saw this patch: http://git.savannah.gnu.org/cgit/bison.git/commit/?id=3ae81aa338fb08be451f7ed106adf94e35f52e15 and it prompted me to write this long letter. As far as I know, api.value.type union is not POSIX, so this patch moves away from POSIX compatibility, and I think we should at least roll it back.

==
Askar Safin
http://vk.com/safinaskar