I have a file with two problems:
- It's too big to fit in memory (apparently; I thought 1.5 GB would fit, but I get an out-of-memory error when using `std.file.read`)
- It is dirty (contains invalid Unicode characters and null bytes in the middle of lines)

I want to write a program that splits it up into multiple files, with the splits happening every n lines. I keep encountering roadblocks though:

- You can't pass `Yes.useReplacementChar` to `byLine`, and `byLine` (or `readln`) throws an exception upon encountering an invalid character.
- `decodeFront` doesn't work on input ranges like `byChunk(4096).joiner`
- `std.algorithm.splitter` doesn't work on input ranges either
- When you convert chunks to arrays, you risk a chunk boundary falling in the middle of a character that spans multiple code units

Is there a simple way to do this?
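
For reference, here is a rough sketch of the byte-level direction I am considering, in case it helps frame the question. It assumes the file is (mostly) UTF-8 with `\n` line endings: in UTF-8 the byte 0x0A never occurs inside a multi-byte sequence, so splitting on that byte should not cut a character in half, and nothing is ever decoded, so invalid sequences and null bytes are copied through untouched. The function name `splitEveryNLines` and the `.partNNNN` output naming are made up for illustration.

```d
import std.format : format;
import std.stdio : File;

/// Copies inputPath into numbered part files, starting a new part
/// after every linesPerFile newline bytes. No decoding is performed.
void splitEveryNLines(string inputPath, size_t linesPerFile)
{
    assert(linesPerFile > 0, "linesPerFile must be positive");

    auto input = File(inputPath, "rb");
    File output;            // opened lazily so no empty trailing part is created
    size_t partIndex = 0;
    size_t linesInPart = 0;

    void writePiece(const(ubyte)[] piece)
    {
        if (piece.length == 0)
            return;
        if (!output.isOpen)
            output = File(format("%s.part%04d", inputPath, partIndex++), "wb");
        output.rawWrite(piece);
    }

    // byChunk reuses its buffer, so each piece is written before the next read.
    foreach (ubyte[] chunk; input.byChunk(64 * 1024))
    {
        size_t start = 0;
        foreach (i, b; chunk)
        {
            if (b != 0x0A)  // only the newline byte matters here
                continue;
            writePiece(chunk[start .. i + 1]);
            start = i + 1;
            if (++linesInPart == linesPerFile)
            {
                output.close();
                linesInPart = 0;
            }
        }
        // Tail of the chunk belongs to a line that continues in the next chunk.
        writePiece(chunk[start .. $]);
    }
}
```

Called as `splitEveryNLines("big.txt", 100_000)`, this would only ever hold one 64 KiB chunk in memory. It does not clean the data, though, so the invalid characters would still have to be dealt with per part afterwards, and it would not work for UTF-16 input.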
