In that bioinformatics data, is there another logical record separator? If so,
and $lrs contains the logical record separator, you could do this:
for "filename".IO.lines(:nl-in($lrs)) {
    .say
}
The :nl-in named argument specifies the line separator to use when reading the file.
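A minimal, self-contained sketch of that (using '>' as a purely hypothetical record separator and a throwaway file in $*TMPDIR; substitute whatever separator the actual data uses):

```raku
# Write some sample records separated by '>' to a temporary file,
# then read them back one logical record at a time with :nl-in.
my $lrs  = '>';                              # hypothetical logical record separator
my $file = $*TMPDIR.add('records.txt');
$file.spurt: "first record>second record>third record";

for $file.lines(:nl-in($lrs)) -> $record {
    say $record;                             # the separator is chomped by default
}

$file.unlink;
```

Because lines is lazy, only one logical record needs to be in memory at a time, even if the file itself is huge.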
> On 21 Oct 2019, at 01:11, Joseph Brenner <[email protected]> wrote:
>
> Thanks, that looks good. At the moment I was thinking about cases
> where there's no neat division by lines or words (like, say,
> hypothetical bioinformatics data: very long strings, no line breaks).
>
> On 10/20/19, Elizabeth Mattijsen <[email protected]> wrote:
>>> On 20 Oct 2019, at 23:38, Joseph Brenner <[email protected]> wrote:
>>> I was just thinking about the case of processing a large file in
>>> chunks of an arbitrary size (where "lines" or "words" don't really
>>> work). I can think of a few approaches that would seem kind-of
>>> rakuish, but don't seem to be built-in anywhere... something like a
>>> variant of "slurp" with an argument to specify how much you want to
>>> slurp at a time, that'll move on to the next chunk the next time it's
>>> invoked...
>>>
>>> Is there anything like that kicking around that I've missed?
>>
>> Remember that lines and words are lazy, so do you really need to process in
>> chunks?
>>
>> If you do, you can easily build chunks of X lines with .batch; for chunks of 3 lines you could do:
>>
>> for "filename".IO.lines(:!chomp).batch(3) -> @batch {
>>     say @batch.join;
>> }
>>
>> Similarly for words, of course. Note the use of :!chomp to keep the
>> newlines. If you don't want them, just don't specify it: removing newlines
>> is the default.
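The words variant could be sketched like this (the batch size of 4 and the throwaway file in $*TMPDIR are arbitrary choices for illustration):

```raku
# Process a file's words in fixed-size batches; .words is lazy,
# so only one batch of words is materialized at a time.
my $file = $*TMPDIR.add('words.txt');
$file.spurt: "the quick brown fox jumps over the lazy dog";

for $file.words.batch(4) -> @batch {
    say @batch.join(' ');
}

$file.unlink;
```

Note that .batch, unlike .rotor, keeps a final partial batch ("dog" above) rather than dropping it, which is usually what you want when chunking a whole file.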