Re: [CMS-PIPELINES] deblocking with various possible linends

Paul Gilmartin Thu, 16 Mar 2023 14:33:20 -0700

On 3/16/23 14:46:04, Rob van der Heij wrote:

    ...
Yes, I think we all realized the ambiguity. I was considering these
alternatives (with preference for the first one)
- line end is any unique sequence of the specified characters, so if you
specify the CR and LF as candidate, then CR, LF, CR LF, and LF CR are all
one single end of line, but CR CR would imply a null line between (like CR
LF LF)
- the first string of characters from that set is taken as the line end
sequence, until eof. So when you start with a bare CR then the next CR will
cause LF to be the start of a new line.


Be careful.  As I read it:
    FOO<CR><LF><CR>BAR ...
is three records:
    FOO     (terminated by <CR><LF>)
    <EMPTY> (terminated by <CR>)
    BAR ...

If that behavior satisfies your taste, amen, seasoned with a dash of GIGO.
If not, amend the rules and let the devil pose a test case.
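
For anyone who wants to pose such test cases, here is a minimal Python sketch of one reading of the first rule quoted above: a line end is a run of distinct characters from the candidate set, and a character that repeats within the run begins a new line end. The names deblock and linends are made up for illustration, and this is not a description of how any CMS Pipelines stage behaves.

```python
def deblock(data: bytes, linends: bytes = b"\r\n") -> list[bytes]:
    """Split data into records; a line end is a run of distinct bytes from linends."""
    records = []
    current = bytearray()
    seen = set()  # terminator bytes already consumed by the line end in progress
    for byte in data:
        if byte in linends:
            if not seen or byte in seen:
                # Either the first terminator byte after data, or a repeat of
                # one already in this line end: close the current record
                # (empty in the repeat case) and start a new line end.
                records.append(bytes(current))
                current.clear()
                seen = {byte}
            else:
                # A different terminator byte extends the same line end.
                seen.add(byte)
        else:
            current.append(byte)
            seen = set()
    if current:
        # Unterminated trailing data is kept as a final record (an assumption).
        records.append(bytes(current))
    return records


print(deblock(b"FOO\r\n\rBAR"))  # [b'FOO', b'', b'BAR']
```

Under this reading CR CR and CR LF LF also yield an empty record in the middle, which agrees with the quoted description. Whether that is the desired behavior is exactly the question.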

OMVS provides for metadata specifying the line separator:
    <https://www.ibm.com/docs/en/zos/2.5.0?topic=descriptions-extattr-set-reset-display-extended-attributes-files>

If you expect your code to run under TSO (unlikely), it should respect the
extended attributes of its input file.

Otherwise, the format of the record separator might be an optional
parameter to your program.
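
As a rough illustration of the optional-parameter approach, here is a Python sketch in which the caller names the separator and nothing is guessed. The --linend option, the LINENDS table, and the function names are all hypothetical.

```python
import argparse

# Hypothetical names for the separators a caller may select.
LINENDS = {"lf": b"\n", "cr": b"\r", "crlf": b"\r\n"}


def parse_args(argv=None):
    parser = argparse.ArgumentParser(
        description="Deblock a byte stream into records.")
    parser.add_argument("--linend", choices=sorted(LINENDS), default="lf",
                        help="record separator to split on (default: lf)")
    return parser.parse_args(argv)


def split_records(data: bytes, sep: bytes) -> list[bytes]:
    """Split on exactly one separator; no other byte is treated as a line end."""
    records = data.split(sep)
    if records and records[-1] == b"":
        records.pop()  # a trailing separator terminates the last record
    return records


args = parse_args(["--linend", "crlf"])
print(split_records(b"FOO\r\n\rBAR\r\n", LINENDS[args.linend]))
# [b'FOO', b'\rBAR']
```

With the separator stated explicitly, the stray CR stays inside the second record as data rather than being interpreted as another line end.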

DWIM is likely to yield unexpected results.

--
gil
