On Tue, Jan 20, 2015 at 05:26:41PM -0800, Jonathan M Davis via Digitalmars-d wrote:
> On Tuesday, January 20, 2015 16:10:26 Andrei Alexandrescu via Digitalmars-d wrote:
> > On 1/20/15 3:40 PM, H. S. Teoh via Digitalmars-d wrote:
> > > - Andrei doesn't approve because apparently some people think "big
> > >   files are not a problem".
> >
> > cc Jonathan M. Davis, Steve Schveighoffer if I remember correctly :o).
>
> I honestly think that many developers are overly interested in having
> small modules. Splitting stuff up too much causes maintenance problems
> (e.g. it becomes that much harder to find everything when there are a
> lot of files to look through, and it's that much less obvious which
> module something might live in),

Although I agree that we shouldn't be splitting up stuff just for the
sake of splitting it up, I don't agree that it's a problem to find
stuff. In this day and age, most software projects are too large to scan
through manually; you'd use search tools like grep or IDE search or
whatever. Besides, a "real" editor like vim :-P is most useful when you
navigate via the search function instead of the directional/paging keys
anyway, and with D support in ctags, I don't see why splitting things
into smaller files would be a problem. That on its own doesn't justify
splitting, of course, but neither does it count against splitting IMO.

> and in my experience, large modules like std.algorithm or std.datetime
> are actually quite maintainable. However, that doesn't mean that we
> wouldn't be better off splitting up the particularly large ones. I
> just started on splitting std.datetime again the other day, and
> hopefully I can find time enough to finish it before I end up having
> to deal with merging other changes in.

std.datetime is one of those things that has grown large enough that
it's causing a noticeable pause when I open it in my editor or search
for a symbol... I think that's nearing the point where splitting just on
the basis of size may become justifiable. :-P

(Having said that, though, std.datetime unittests actually compile and
run on my machine, in spite of their far larger number, yet
std.algorithm's don't. I think it's because of too many deeply-nested
templates in std.algorithm, which probably includes a problem of my own
making, namely one of the overloads of cartesianProduct, which causes an
exponential number of recursive template instantiations. I've been
meaning to fix that, and have in fact managed to fix it for the finite
range case, but the infinite range case thus far eludes me.)

> As for std.algorithm, I think that the fact that the unit tests take
> up too much memory on some of the Phobos developers' machines is
> enough to merit at least looking at splitting it up.

Yeah, I haven't been able to run Phobos unittests locally for months now
(perhaps even a year?). I think that's pretty near the point of being
ridiculous.

> It's an actual, objective problem rather than a subjective one. And
> std.algorithm contains enough disparate functions that it certainly
> wouldn't hurt us to split it up from an organizational point of view
> either. So, if H. S. Teoh has managed to split it in a sane way, it
> makes good sense for us to look it over and merge it if it looks good.

I think the disparate functions part is probably the biggest reason to
split it. Lumping disparate functions into one file means all the
disparate dependencies of said functions also get lumped into one file,
so if you import X, you're also forced to import Z just because Y, which
you don't need, happens to sit in the same file as X and Y imports Z.

I'm pretty sure this tangled web of interdependencies between Phobos
modules is responsible for a significant proportion of complaints about
Phobos template bloat / excessive executable sizes. There's also the
somewhat amusing finding of mine some time ago (dunno if it's still
true) that importing std.algorithm (and not actually referencing
anything in it) will introduce a dependency on std.complex to your
program, even though you never use anything that might remotely need to
reference std.complex. I wasn't able to track down the source of this
issue before, because std.algorithm was just far too big to manage; but
perhaps after the split it will become more tractable.

Lumping things together also introduces some inadvertent circular
dependencies that can cause hard-to-understand bugs, especially when
conditional compilation is involved.
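
To make the import-granularity point concrete, here is a rough toy model
in Python. It is only a sketch and not real D or Phobos code: the
"module.symbol" strings, the module names M, N and app, and both helper
functions are made up for illustration. It collapses symbol-level
dependencies into module-level imports and shows two effects: importing
the module that holds X drags in Z's module, and the A/B case (spelled
out in the text that follows) looks like a cycle at module level even
though no symbol depends on itself.

```python
# Toy model only: names are hypothetical "module.symbol" strings,
# not real D modules or Phobos symbols.

def module_edges(symbol_deps):
    """Collapse symbol-level dependencies into module-level import edges."""
    edges = set()
    for user, used in symbol_deps:
        src = user.split(".")[0]
        dst = used.split(".")[0]
        if src != dst:
            edges.add((src, dst))
    return edges

def reachable(edges, start):
    """Everything reachable from `start` by following edges transitively."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for src, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

# Case 1: X and Y live in the same module M; only Y needs Z (in module N).
# The user module `app` only wants X, but still ends up pulling in N.
case1 = [("app.main", "M.X"), ("M.Y", "N.Z")]
pulled_in = reachable(module_edges(case1), "app")

# Case 2: A.x depends on B.x, and B.y depends on A.y.
case2 = [("A.x", "B.x"), ("B.y", "A.y")]
symbols = {s for pair in case2 for s in pair}
# At symbol level, nothing can reach itself.
symbol_cycle = any(s in reachable(set(case2), s) for s in symbols)
# At module level, A imports B and B imports A.
module_cycle = "A" in reachable(module_edges(case2), "A")
```

In this model `pulled_in` contains both M and N, `symbol_cycle` is
false, and `module_cycle` is true. Splitting M so that X and Y sit in
separate modules removes the edge from `app` to N, which is the effect
the split is meant to have.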

If you have two modules A and B, and A.x depends on B.x and B.y depends
on A.y, then you have a circular dependency between A and B even though
in actuality they *aren't* circularly dependent. But since D's import
granularity is the module, the circular dependency is there, at which
point it becomes tricky to make sure things are instantiated in the
right order to resolve the apparent dependency loop; otherwise a static
if somewhere might fail where it shouldn't. (I've seen this problem
before but didn't have the patience to unravel it down to the actual
cause. Reducing the number of gratuitous dependencies would help a lot
in making this easier to track down.)

[...]
> I had thought that the consensus was already that we should split
> std.algorithm at some point. The trick was spending the time to do it
> and get it right.
[...]

That's what I thought too, which is why I was a bit taken aback when
Andrei seemed to disapprove of the PR.


T

--
I think the conspiracy theorists are out to get us...