On 28/04/16 7:15 PM, Dan Sumption wrote:
This is a subject that interests me greatly, and I'm keen to hear people's views on it.

My perspective is that of a working software developer (albeit with a background in psychology), not an academic. I have no experience of ML, and have worked only a little with functional languages.

I was a little taken aback by this line in the original post:

    Why can't I see the structure in a 3000-line module, or even a 1000-line module?

To me, a 1000-line module is a God Class. A 3000-line module is a complete disaster.

Accepted best practice is that a file too big to view on your screen is too long. Optimum file size is probably under 30 lines.

Really? I've heard that said about *function* size, but not about *module* size.

I just checked the Java 1.6 library source release.
7195 *.java files.
> summary(d)
  min = 9, 1st quartile = 55, median = 112, mean = 288.2,
  3rd quartile = 286, max = 9604.

Looking at the source code for the SML/NJ library, and excluding
files in .cm directories, there are
1121 *.sml files.
> summary(d)
  min = 1, 1st quartile = 36, median = 80, mean = 194.5,
  3rd quartile = 180, max = 10040
There were 29 files with >= 1000 lines.
The sizes are lower than the Java files because of the absence of
anything like JavaDoc and the paucity of other comments.

Looking at the Erlang 18.2 release, there are
7315 *.[hye]rl files.
> summary(d)
  min = 0, 1st quartile = 35, median = 135, mean = 481.4,
  3rd quartile = 420, max = 32020
I believe the superlarge file there was machine-generated.
There were 788 files with >= 1000 lines.

Looking at the Python 3.4.4 release, there were
1740 *.py files.
> summary(d)
  min = 0, 1st quartile = 53, median = 169, mean = 383.5,
  3rd quartile = 429, max = 6413.
There were 160 files with >= 1000 lines.
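
For anyone who wants to repeat this kind of measurement on their own source tree, here is a minimal sketch in Python. It is not the script behind the figures above (those are shown as R summary() output); the function names, the dictionary keys, and the default threshold of 1000 lines are my own choices for illustration. The "inclusive" quantile method is, as far as I know, the one that matches R's default, but treat that as an assumption if exact agreement matters.

```python
import statistics
from pathlib import Path


def count_lines(path):
    """Count the physical lines in one file, reading it as bytes."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def summarize(line_counts, threshold=1000):
    """Summarize a collection of per-file line counts.

    Returns the minimum, quartiles, mean, maximum, and how many
    files have at least `threshold` lines.
    """
    counts = sorted(line_counts)
    if len(counts) < 2:
        raise ValueError("need at least two files to compute quartiles")
    q1, median, q3 = statistics.quantiles(counts, n=4, method="inclusive")
    return {
        "files": len(counts),
        "min": counts[0],
        "q1": q1,
        "median": median,
        "mean": statistics.mean(counts),
        "q3": q3,
        "max": counts[-1],
        "large": sum(1 for c in counts if c >= threshold),
    }


def summarize_tree(root, patterns=("*.py",), threshold=1000):
    """Summarize every file under `root` matching any glob in `patterns`."""
    paths = {
        p
        for pattern in patterns
        for p in Path(root).rglob(pattern)
        if p.is_file()
    }
    return summarize([count_lines(p) for p in paths], threshold)
```

Something like summarize_tree("some/source/dir", patterns=("*.erl", "*.hrl", "*.yrl")) would cover the Erlang case. Note that this counts every physical line, comments and blanks included, so it measures file length rather than SLOC.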

By and large, the people who wrote these four different
systems were experienced and capable programmers
following practices that were adequate for developing
systems that would be used by huge numbers of other
programmers over a long period.

If large modules were *necessarily* a disaster, these systems
would not be so successful.

I have quite a few other large projects handy, but one has to
stop *somewhere*.  The take-home point is that far from being
disastrous scary monsters, large modules are *common*.

I don't think I've ever seen a *non-trivial* class that would fit on a screen,
at least not one with comments.  I've seen way too many Java classes
that could do practically nothing and you couldn't even *begin* to
understand them in isolation; the result of splitting things up into tiny
pieces in the service of an unrealistic dogma was spaghetti tangles of
objects with complex interactions.

Breaking a 3000-line module into 30-line pieces would simply make
things worse: you'd probably end up with a lot more than 100 pieces
and the pieces would be no more readable without their context
than they were *with* their context.  Overall, the code would be
HARDER to read.  Not hypothetical:  more than once I've rewritten
a tangle of small pieces sprayed across several directories into one
file with (for me) a drastic improvement in readability.

In the case of the FileName class I mentioned, putting the two files
that define it together means the *class* is 815 SLOC, but the
average *method* is 4.5 SLOC, one line for header and 3.5 lines
for body.
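
To make the class-size versus method-size distinction concrete, here is a rough sketch of how one might compare the two for Python source. It is only an analogue: it does not measure the FileName class itself, the function name class_report and the report keys are hypothetical, and it counts physical lines (blank lines and comments included) rather than SLOC.

```python
import ast


def class_report(source):
    """Report, for each class in `source`, its length and mean method length.

    Lengths are physical line spans taken from the parse tree, so they
    include blank lines and comments inside the class or method.
    Nested classes are reported too; classes sharing a name overwrite
    each other in the result.
    """
    report = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        # Only methods defined directly in the class body are counted.
        sizes = [
            item.end_lineno - item.lineno + 1
            for item in node.body
            if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
        ]
        report[node.name] = {
            "class_lines": node.end_lineno - node.lineno + 1,
            "methods": len(sizes),
            "mean_method_lines": sum(sizes) / len(sizes) if sizes else None,
        }
    return report
```

Run over a module's text, for example class_report(open("some_module.py").read()) with a file name of your choosing, this shows how a class can be long overall while its individual methods stay short.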

Chunking strategies, allowing an entire module to be thought of as 7+/-2 concepts, will certainly help.

It will help some things, but not others.
This is arguably what Smalltalk method categories and C# regions are about.

However, this assumes that there is one right way to chunk things
and that there are no "cross-cutting" relationships.

Even then, I struggle to conceive of a case where a 1,000 line file could be broken down into 7 clear, comprehensible concepts.

You seem to be talking about a major rewrite, which I'm not.
You also seem to be assuming that large modules aren't *already*
divided into topics, and that "clear comprehensible concepts"
aren't connected by relationships that are not immediately obvious
from the code.

Most of the large modules I've looked at *are* divided into chunks of
some sort, but the reason they are single files is that the chunks are
for some good reason tightly coupled.



