On Thursday, 11 September 2014 at 10:19:17 UTC, monarch_dodra wrote:
Well, the issue is that this isn't very portable for *reading*, as even on Linux, you may read files with "\r\n" line endings (it's "standard" for CSV files, for example), or read "\n"-terminated files on Windows. The issue is that (currently) we don't have any splitter that operates on multiple needles. *That'd* be what needs to be written (probably not too hard either, since "find" already exists).

Good idea. So it's "just" a matter of extending splitter, via std.algorithm.find, to split on these three keys:
- \n
- \r
- \r\n
then? Or are there more line terminators to handle?
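
To make the idea concrete, here is a minimal sketch of a lazy range that splits on all three terminators and treats "\r\n" as a single one. `LineSplitter` is a hypothetical name, not an existing Phobos type. It also does not extend splitter or use find as proposed above; it scans code units by hand, which is safe for UTF-8 because '\r' and '\n' never occur inside a multi-byte sequence. It assumes the whole input is available as one string (for example a memory-mapped file).

```d
/// Lazily yields the lines of `text`, splitting on "\n", "\r" or "\r\n".
/// Hypothetical helper for illustration; not part of Phobos.
struct LineSplitter
{
    private string rest;  // input not yet consumed
    private string line;  // current line, a slice of the input (no copy)
    private bool done;    // set once the input is exhausted

    this(string text)
    {
        rest = text;
        popFront();       // prime the first line
    }

    @property bool empty() { return done; }
    @property string front() { return line; }

    void popFront()
    {
        if (rest.length == 0)
        {
            done = true;
            return;
        }
        // Scan up to the next terminator (or the end of the input).
        size_t i = 0;
        while (i < rest.length && rest[i] != '\n' && rest[i] != '\r')
            ++i;
        line = rest[0 .. i];
        if (i < rest.length)
        {
            // Consume the terminator; "\r\n" counts as one.
            immutable crlf = rest[i] == '\r'
                && i + 1 < rest.length && rest[i + 1] == '\n';
            i += crlf ? 2 : 1;
        }
        rest = rest[i .. $];
    }
}

void main()
{
    import std.algorithm.comparison : equal;

    // Mixed terminators in a single input.
    auto lines = LineSplitter("a,b\r\nc,d\ne,f\rg,h");
    assert(lines.equal(["a,b", "c,d", "e,f", "g,h"]));
}
```

Each `front` is a slice of the original input, so nothing is allocated per line. A trailing terminator does not produce an extra empty line at the end.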

We also have splitLines (http://dlang.org/phobos/std_string.html#.splitLines). Is that good enough for you by any chance? Or do you need it to actually be lazy?

Laziness is good in this case because my input files are gigabytes in size :) I'm playing around with single-pass parsing of ConceptNet5 CSV files at

https://github.com/nordlow/justd/blob/master/conceptnet5.d
