>   ... | deblock linend 0a | split 0d | ...

Could that cause extra lines?

If you know the data has CRLF, LFCR, or just LF, but never a bare CR, then:
… deblock linend 0a | strip both 0d | …
I suspect strip is more efficient than split, because split has to scan
the entire record, whereas strip starts at each end and stops at the first
mismatch, with no need to continue.
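
For anyone who wants to try the idea outside a pipeline, here is a rough Python model of the deblock-then-strip approach. It is only a sketch: the function name deblock_and_strip is made up for illustration, it works on a byte string already in memory rather than streaming, and it is meant for CRLF or bare-LF input. It is not a statement of how the CMS Pipelines stages behave in every edge case.

```python
from typing import List


def deblock_and_strip(data: bytes) -> List[bytes]:
    """Hypothetical model of '... | deblock linend 0a | strip both 0d | ...'."""
    # Break the byte stream into records at each LF (X'0A').
    records = data.split(b"\n")
    # A trailing LF ends the last record; it does not start a new empty one.
    if records and records[-1] == b"":
        records.pop()
    # Remove CR (X'0D') from both ends of each record. The scan stops at the
    # first non-CR byte on each side, so the middle of the record is not examined.
    return [record.strip(b"\r") for record in records]


sample = b"one\r\n\r\ntwo\nthree\r\n"
for record in deblock_and_strip(sample):
    print(record)
```

Note that the empty record between "one" and "two" survives as a null record, which is the case the original question was worried about.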


On Mon, Mar 13, 2023 at 15:43 Glenn Knickerbocker <n...@bestweb.net> wrote:

> (Copied here from an IBM internal discussion because I should have come
> here first anyway:)
>
> Anyone have an idiom for deblocking and translating a file in ASCII that
> may have either or both of CR and LF (*), and may be split into records,
> without unnecessarily buffering the whole file?  The possibility of null
> records that I might want to preserve makes this particularly confusing
> to think through.
>
> (*) in either order--I know LFCR is rare, but I'm positive I've run into
> it on some goofy old system, probably some BBS 30 years ago
>
>   ... | deblock linend 0a | split 0d | ...
>
> was good enough for the application at hand, but I was hoping someone
> might already have worked through getting it right.
>
> ¬R
>
