On Thu, Apr 12, 2012 at 6:30 PM, Richard Guenther <richard.guent...@gmail.com> wrote:
>
> Yes, files are too big - but splitting them is not easy unless you can figure out
> a hierarchy that you can expose. The largest file is dwarf2out.c with 22825 lines,
> but the average is more like 2000 (just looking at gcc/*.c files). There are only
> 23 files bigger than 6000 lines (out of 356), so the situation is not as bad as
> you paint it. But yes, looking at filenames hardly tells you about its contents
> anymore.
>

Average file size is not relevant here. What matters is how much of the code lives in big files.

In the gcc/ sub-directory there are about 600 source files (.c and .h). 63 of them (10%) exceed 100 KB, and together they contribute over 50% of the total source size of the directory. 75 of them (13%) are between 50 KB and 100 KB, and contribute 25% of the total. The rest, 440 or so, are below 50 KB and contribute only the remaining 25% or so. Some of these files are so small that some merging is needed.

So I can say that most of the GCC source code is in large files. This also holds for the language front-ends.

--
Chiheng Xu
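
P.S. For anyone who wants to check a breakdown like the one above on their own tree, here is a minimal sketch in Python. It is not the script that produced the numbers in this mail; the function names `size_buckets` and `source_sizes` are made up for illustration, and it assumes 1 KB = 1024 bytes and a non-recursive scan of a single directory.

```python
import os


def size_buckets(sizes, small=50 * 1024, large=100 * 1024):
    """Split file sizes (in bytes) into three buckets.

    Returns {bucket: (file_count, share_of_total_bytes)}.
    The 50 KB / 100 KB thresholds mirror the ones used in the mail;
    treating 1 KB as 1024 bytes is an assumption.
    """
    buckets = {"small": [], "medium": [], "large": []}
    for size in sizes:
        if size > large:
            buckets["large"].append(size)
        elif size >= small:
            buckets["medium"].append(size)
        else:
            buckets["small"].append(size)
    total = sum(sizes)
    return {
        name: (len(group), sum(group) / total if total else 0.0)
        for name, group in buckets.items()
    }


def source_sizes(directory, extensions=(".c", ".h")):
    """Sizes in bytes of the source files directly inside `directory`."""
    return [
        entry.stat().st_size
        for entry in os.scandir(directory)
        if entry.is_file() and entry.name.endswith(extensions)
    ]


# Hypothetical usage, run from the top of a GCC checkout:
#   for name, (count, share) in size_buckets(source_sizes("gcc")).items():
#       print(f"{name}: {count} files, {share:.0%} of bytes")
```

Comparing the count column with the share column is the point: a small fraction of the files can hold most of the bytes, which an average over all files hides.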