Chris Fedde writes:
> -throw memory at a problem. Scalars in Perl use more memory than
> -strings in C, arrays take more than that, and hashes use even more. While
> +throw memory at a problem. Scalars in Perl use more memory than strings
> +in C, arrays take more than that, and hashes use even more. While
This appears to be a non-change.
> -these issues. For example, as of 5.004, duplicate hash keys are
> -shared amongst all hashes using them, so require no reallocation.
> +these issues. For example, as of 5.004, duplicate hash keys are shared
> +amongst all hashes using them, so require no reallocation.
Likewise.
> + while (<FILE>) {
> + # ...
> + }
> +
> +instead of this:
> +
> + @data = <FILE>;
> + foreach (@data) {
> + # ...
> + }
> +
> +When the files you're processing are small, it doesn't much matter which
> +way you do it, but it makes a huge difference when they start getting
> +larger. The latter method keeps eating up more and more memory, while
> +the former method scales to files of any size.
Actually, 'latter' and 'former' look right as written: the latter
(slurping) method is the one that eats memory, and the former
(line-at-a-time) method is the one that scales.
> +of data clogging up RAM.
"of data taking up memory". RAM seems randomly technical (and is the
first use of that term).
> +=item * Localize!
Heh, technically this would be 'Privatize' :-)
> +Don't make anything global that doesn't have to be. Use my()
> +prodigously to localize variables to the smallest possible scope.
> +Memory freed by variables that have gone out of scope can be reused
> +elsewhere in the current program, preventing the need for additional
> +allocations from system memory.
Actually, this is not true. Perl has a ton of optimizations that end up
preventing it from freeing memory in strings and private variables in
non-recursive blocks.
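For illustration, here's a minimal sketch of the my()-scoping advice
(filename and sub name are made up; the comment hedges the freeing
claim per the above):

```perl
use strict;
use warnings;

# Lexical (my) variables are private to their enclosing block or sub.
sub count_lines {
    my ($file) = @_;                     # private to this sub
    open my $fh, '<', $file or die "open $file: $!";
    my $count = 0;
    $count++ while <$fh>;
    close $fh;
    return $count;
    # $fh and $count go out of scope here; note that perl may keep
    # their storage allocated for reuse on the next call rather than
    # handing it back, as discussed above.
}
```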
> +=item * Tie large variables to disk.
> +
> +For "big" data stores (i.e. ones that exceed available memory) consider
> +using one of the DB modules to store it on disk instead of in RAM. This
> +will incur a penalty in access time, but that's probably better that
> +causing your hard disk to thrash due to massive swapping.
I'd be tongue in cheek and say:
For big data stores, consider using one of the DBM modules to store
it on disk instead of in memory. Accesses of disk-based structures
will be slower than memory-based ones, but the goal of this section
is to tell you how to make your code take less memory, not how to
have your cake and eat it.
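And in case an example helps the section, here's a rough sketch of the
DBM advice (SDBM_File ships with perl; DB_File or GDBM_File would work
the same way, and the filename is just illustrative):

```perl
use strict;
use warnings;
use Fcntl;        # for the O_RDWR / O_CREAT flags
use SDBM_File;

# Tie the hash to an on-disk DBM file instead of holding it in memory.
my %big;
tie %big, 'SDBM_File', '/tmp/bigdata', O_RDWR | O_CREAT, 0644
    or die "tie: $!";

$big{somekey} = 'somevalue';   # goes to disk, not RAM
print $big{somekey}, "\n";

untie %big;                    # flush and release the DBM file
```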
Otherwise looks good. Thanks!
Nat