> On 30 Mar 2016, at 13:40, Tom Browder <[email protected]> wrote:
> On Tue, Mar 29, 2016 at 10:29 PM, Timo Paulssen <[email protected]> wrote:
>> On 03/30/2016 03:45 AM, Timo Paulssen wrote:
>>
>> Could you try using $filename.IO.slurp.lines instead of $filename.IO.lines
>> and see if that makes things any faster?
> ...
>> Actually, the method on an IO::Handle is called "slurp-rest"; slurp would
>> only work with a filename instead.
>> - Timo
> Timo, I'm trying to test a situation where I could process every line
> as it is read in. The situation assumes the file is too large to
> slurp into memory, thus the read of one line at a time. So is there
> another way to do that? According to the docs "slurp-rest" gets all
> the remaining file at one read.
That is correct.
The thing is that IO.lines basically depends on IO.get to fetch each line. That
is extra overhead that IO.slurp.lines doesn’t have.
If you know the line endings of the file, using IO::Handle.split($line-ending)
(note: the actual character(s), rather than a regular expression) might help.
That will read the file in chunks of 64K and then lazily serve lines from each
chunk.
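As a sketch of what that looks like in practice (assuming "\x0a" line endings; the
file name and the per-line handler are placeholders):

  my $fh = "words".IO.open;
  for $fh.split("\x0a") -> $line {
      # $line is served lazily from 64K chunks read off the
      # handle, so the whole file is never in memory at once
      process-line($line);  # hypothetical per-line handler
  }
  $fh.close;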
A simple test on /etc/dict/words:
$ 6 '"words".IO.lines.elems.say'
235886
real 0m0.645s
$ 6 '"words".IO.open.split("\x0a").elems.say'
235887
real 0m0.317s
Note that with .split you will get an extra empty line at the end, because the
file’s trailing newline produces one more (empty) element; hence 235887 instead
of 235886.
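If that trailing empty element is a problem, one option (a sketch, untested here)
is to slurp and chomp the contents before splitting, so the final newline does
not produce an empty string:

  "words".IO.slurp.chomp.split("\x0a").elems.say

For the file above this should report 235886, matching .IO.lines, since .chomp
only removes the single trailing newline.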
Hope this helps.
Liz