From: "Doug Ewell" <[EMAIL PROTECTED]>
> Jill Ramonsky wrote:
>
> > Here's a better idea.
> > Let's just stick with the idea that ANY C0 or C1 control has no place
> > being anywhere in a line of text, and so any sequence of one or more of
> > them will be interpreted as a line-break!
>
> Tab
At 02:05 PM 10/24/03 +0100, Jill Ramonsky wrote:
Here's a better idea.
Let's just stick with the idea that ANY C0 or C1 control has no place
being anywhere in a line of text, and so any sequence of one or more of
them will be interpreted as a line-break!
Sorted once and for all!
I'm not sure you
> For completeness the definitions of \n and \r in C are:
>
> "\n (new line) Moves the active position to the initial
> position of the next line.
Hmm. If the output is to a terminal, and the OS is Unixy,
then to guarantee that behaviour, \n must generate both
a CR and an LF, not just an LF. T
Jill Ramonsky wrote:
> Here's a better idea.
> Let's just stick with the idea that ANY C0 or C1 control has no place
> being anywhere in a line of text, and so any sequence of one or more of
> them will be interpreted as a line-break!
Tab?
-Doug Ewell
Fullerton, California
http://users.adelp
Philippe Verdy scripsit:
> > On Mac Classic,
> > \n is 15, and on EBCDIC systems, it's also 15, though for a different
> > reason.
>
> Correction: On Mac Classic and in EBCDIC, \n is 015 (or 13), not 15:
> please don't mix in the same sentence the decimal,
> and octal notations.
It's worse than
From: <[EMAIL PROTECTED]>
> > > Still, I stand by saying that \n is defined in C++ as LF and \r as CR,
> > > because that's sitting in front of me in black and white.
> >
> > Yes, true. But that does *not* mean that (int)'\n' can be counted on to
> > be 10
>
> Of course, given that any of a v
hilippe Verdy [mailto:[EMAIL PROTECTED]
> Sent: Friday, October 24, 2003 1:33 PM
> To: John Cowan
> Cc: [EMAIL PROTECTED]
> Subject: Re: Backslash n [OT] was Line Separator and
> Paragraph Separator
>
>
> <...LOTS...>
From: "John Cowan" <[EMAIL PROTECTED]>
> > Still, I stand by saying that \n is defined in C++ as LF and \r as CR,
> > because that's sitting in front of me in black and white.
>
> Yes, true. But that does *not* mean that (int)'\n' can be counted on to
> be 10, any more than (int)'a' can be counte
> > Still, I stand by saying that \n is defined in C++ as LF and \r as CR,
> > because that's sitting in front of me in black and white.
>
> Yes, true. But that does *not* mean that (int)'\n' can be counted on to
> be 10
Of course, given that any of a variety of character encodings could be i
[EMAIL PROTECTED] scripsit:
> However, under closer examination we are both wrong. '\u000A' is not allowed!
Fair enough. Banning low-valued \u and \U escapes allows \u and \U
removal to be done at a very low level. Java in effect has the same
rule: it is legal to say '\u000A', but that is equiv
Quoting John Cowan <[EMAIL PROTECTED]>:
> [EMAIL PROTECTED] scripsit:
>
> > But if ('\n'=='\u000A') should always be true, because ISO 14882 defines \n as
> > LF and defines \u as "that character whose short name in ISO/IEC 10646 is
> > " and the character whose short name in IS
> Of course, indeed I just said that! If it were true then that would imply
> that '\x' == '\u' making the \u and \U escapes rather pointless.
That's not pointless:
- '\x' is interpreted by C compilers as '\xNN' and two uppercase letters
N, where '\xNN' is compiled according to the so
[EMAIL PROTECTED] scripsit:
> But if ('\n'=='\u000A') should always be true, because ISO 14882 defines \n as
> LF and defines \u as "that character whose short name in ISO/IEC 10646 is
> " and the character whose short name in ISO/IEC 10646 is A is
> LF.
It's not clear to m
> From: <[EMAIL PROTECTED]>
> > However because the universal-character-name escapes (\u and \U)
> > are defined relative to a particular encoding, namely ISO 10646, it would be an
> > error if ('\n' != '\u000A' || '\r' != '\u000D'). Whether this is implemented by
> > using the va
en should I worry about what happens when I
concatenate text files (unless I know in advance that they're just
fragments)?
Jill
> -Original Message-
> From: Kent Karlsson [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, October 22, 2003 4:44 PM
> To: 'Peter Kirk'; [EMAIL PROTECTED]
> Subject: RE: Backslash n [OT] was Line Separator and
> Paragraph Separator
>
> The first and last lines in a text file may well be partial.
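The separator-versus-terminator question matters as soon as files are concatenated. Here is a minimal Python sketch of the two readings, assuming LS is U+2028; the function names are hypothetical and do not come from any message in this thread:

```python
LS = "\u2028"  # LINE SEPARATOR


def concat_separated(fragments):
    """Join fragments whose lines are separated (not terminated) by LS.

    Under this reading an LS has to be inserted between fragments,
    otherwise the last line of one fuses with the first line of the next.
    """
    return LS.join(fragments)


def concat_terminated(fragments):
    """Join fragments in which every line, including the last, ends with LS.

    Under this reading plain concatenation already keeps the lines apart.
    """
    return "".join(fragments)
```

If the first and last lines of a file may be partial, as suggested above, then plain concatenation of unterminated fragments is the intended behaviour and no separator should be added.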
At 11:56 AM -0700 10/22/03, Jonathan Coxhead wrote:
> Don't know about LFCR. I think that should be two line ends.
I agree. I don't know any system that uses this sequence.
The BBC Micro---well-known to a generation of British schoolchildren---used
this sequence. You can probably find files in
From: <[EMAIL PROTECTED]>
> However because the universal-character-name escapes (\u and \U)
> are defined relative to a particular encoding, namely ISO 10646, it would be an
> error if ('\n' != '\u000A' || '\r' != '\u000D'). Whether this is implemented by
> using the values 0x0A and 0x
"Jonathan Coxhead" <[EMAIL PROTECTED]> wrote:
> On 22 Oct 2003, at 6:53, John Cowan wrote:
> > Kent Karlsson scripsit:
> >
> > > Don't know about LFCR. I think that should be two line ends.
> >
> > I agree. I don't know any system that uses this sequence.
>
> The BBC Micro---well-known to a generat
On 22 Oct 2003, at 6:53, John Cowan wrote:
> Kent Karlsson scripsit:
>
> > All of CR, LF, CRLF, NEL, LS, PS, and EOF(!). (Assuming that the
> > encoding of the text file is recognised.)
>
> XML 1.0 treats CR, LF, and CRLF as line terminators and reports
> them as LF.
>
> XML 1.1 will treat CR, LF, N
On 22/10/2003 08:36, John Cowan wrote:
Peter Kirk scripsit:
But if two files each consist of one or more lines of text separated by
LS (but with no final LS), when they are concatenated, surely LS must be
added as a separator. Similarly with paragraphs and PS.
But your protasis is a pe
Unicode UAX 14 (Line Breaking Properties) also has a bit to say on this
topic of line separators
From http://www.unicode.org/reports/tr14/
BK - Mandatory Break (A) - (normative)
Explicit breaks act independently of the surrounding characters.
000C
FORM FEED
Form Feed separates pages.
Peter Kirk wrote:
> But if two files each consist of one or more lines of text separated by
> LS (but with no final LS), when they are concatenated, surely LS must be
> added as a separator. Similarly with paragraphs and PS. And this applies
> even when each consists of one line or on
Peter Kirk scripsit:
> But if two files each consist of one or more lines of text separated by
> LS (but with no final LS), when they are concatenated, surely LS must be
> added as a separator. Similarly with paragraphs and PS.
But your protasis is a petitio principii. Files may or may not co
From: "Kent Karlsson" <[EMAIL PROTECTED]>
> And then, later, we got the
> 0088;;Cc;0;BN;N;CHARACTER TABULATION SET
> 008A;;Cc;0;BN;N;LINE TABULATION SET
> which I've never seen used. (The daisy wheel printers I did use
> long ago had some escape sequence for setting the HT positions.
> > > So this legacy encoding of end-of-lines is now quite obsolete
> > > even on MacOS.
> >
> > I don't think it can be called "obsolete" as long as files generated using
> > that line end convention exist. Or, at least, applications that have an
> > operation for "read a line" will have to cope
On 22/10/2003 05:19, Kent Karlsson wrote:
... And LS it's a separator, not a terminator, so EOF has to be a line
terminator.
Calling it a line terminator means that every
document is forced into the mold of being an integral number of lines
long, regardless of the facts.
?? If you mean
Kent Karlsson scripsit:
>
>
> John Cowan wrote:
> > XML 1.1 will treat CR, LF, NEL, CRLF, CR NEL, and LS as line
> > terminators and report them all as LF. PS is left alone, because of
> > the bare possibility that it is being used as quasi-markup.
>
> I'm not sure why should be seen as a single line en
Philippe Verdy scripsit:
> I also have some old documents that use VT=U+000B instead of
> LF=U+000A to increase the interparagraph spacing. This is still
> mapped to the source '\v' character constant in C/C++ (and Java
> as well, except that Java _requires_ that '\v' be mapped only to
> VT.
The XM
John Cowan wrote:
> XML 1.1 will treat CR, LF, NEL, CRLF, CR NEL, and LS as line
> terminators and report them all as LF. PS is left alone, because of
> the bare possibility that it is being used as quasi-markup.
I'm not sure why should be seen as a single line end.
And I think PS should be seen as a l
From: "John Cowan" <[EMAIL PROTECTED]>
> Kent Karlsson scripsit:
>
> > All of CR, LF, CRLF, NEL, LS, PS, and EOF(!). (Assuming that the
> > encoding of the text file is recognised.)
>
> XML 1.0 treats CR, LF, and CRLF as line terminators and reports
> them as LF.
> XML 1.1 will treat CR, LF, NEL, CRLF, CR NEL, an
Kent Karlsson scripsit:
> All of CR, LF, CRLF, NEL, LS, PS, and EOF(!). (Assuming that the
> encoding of the text file is recognised.)
XML 1.0 treats CR, LF, and CRLF as line terminators and reports
them as LF.
XML 1.1 will treat CR, LF, NEL, CRLF, CR NEL, and LS as line
terminators and report them all as LF. PS
>
> > all of the CR LF CRLF LFCR should mark an "end of line".)
>
> All of CR, LF, CRLF, NEL, LS, PS, and EOF(!). (Assuming that the
I was still staying within the ASCII and \r \n discussion, but yes,
if one goes Latin 1 / Unicode the NEL and LS PS (why not FF, then?),
and of course EOF.
> encoding
> all of the CR LF CRLF LFCR should mark an "end of line".)
All of CR, LF, CRLF, NEL, LS, PS, and EOF(!). (Assuming that the
encoding of the text file is recognised.)
Don't know about LFCR. I think that should be two line ends.
/kent k
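The list above can be sketched as a small line splitter. This is only an illustration in Python, assuming the text has already been decoded to Unicode; the name split_lines is hypothetical. CRLF counts as one line end, LFCR falls out as two, and EOF ends the last line without adding an empty one:

```python
import re

# CRLF must be tried before the single-character alternatives so that it
# is consumed as one line end. The single characters are CR, LF,
# NEL (U+0085), LS (U+2028) and PS (U+2029).
_LINE_END = re.compile("\r\n|[\r\n\x85\u2028\u2029]")


def split_lines(text):
    """Split text at CR, LF, CRLF, NEL, LS or PS; EOF ends the last line."""
    lines = _LINE_END.split(text)
    # If the text ended with a line end, split() leaves a trailing empty
    # string; drop it so that EOF does not contribute an extra empty line.
    if lines and lines[-1] == "":
        lines.pop()
    return lines
```

For example, split_lines("a\n\rb") yields three lines with an empty one in the middle, because LF followed by CR is read as two line ends.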
From: <[EMAIL PROTECTED]>
>
> I wrote:
>
> > So this legacy encoding of end-of-lines is now quite obsolete
> > even on MacOS.
>
> I don't think it can be called "obsolete" as long as files generated using
> that line end convention exist. Or, at least, applications that have an
> operation for "r
> So this legacy encoding of end-of-lines is now quite obsolete
> even on MacOS.
I don't think it can be called "obsolete" as long as files generated using
that line end convention exist. Or, at least, applications that have an
operation for "read a line" will have to cope with it. (In other wo
From: Jill Ramonsky
> I would be more than grateful if someone could point me
> in the direction of a DEFINITIVE specification which claims
> this is not the case, that the interpretation of "\n" as
> anything other than LF may be considered conformant
> behaviour.
If you had programmed for MacOS,
John Cowan wrote:
> In addition, the Rationale makes clear that internal newlines can be
> mapped to anything appropriate on output, including CR/LF and padding
> with blank spaces to fit into a card reader/punch environment:
Only for "text mode IO". Not for binary mode IO. If you want to
portabl
Arnold
==
-Original Message-
From: Jill Ramonsky [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 21, 2003 8:31 AM
To: [EMAIL PROTECTED]
Subject: RE: Backslash n [OT] was Line Separator and Paragraph Separator
On 21 Oct 2003, at 12:01, Jill Ramonsky wrote:
> I would be more than grateful if someone could point me in the direction
> of a DEFINITIVE specification which claims this is not the case, that the
> interpretation of "\n" as anything other than LF may be considered
> conformant behaviour.
nd my apologies John,
Jill
> -Original Message-
> From: John Cowan [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, October 21, 2003 1:19 PM
> To: Jill Ramonsky
> Cc: [EMAIL PROTECTED]
> Subject: Re: Backslash n [OT] was Line Separator and
> Paragraph Separator
>
>
Jill Ramonsky scripsit:
> This is axiomatically *THE* definition. Period. Everything else is
> merely quoting, rephrasing or reinterpreting this original.
Absolutely not. The *standard* for the C programming language is now
ISO/IEC 9899. The 2nd edition of K & R, much-beloved as it is, is jus
Call me pedantic, but
From "The C Programming Language", Second Edition, by Brian W.
Kernighan and Dennis M. Ritchie (the architects of C), page 193, which
explicitly lists all allowable escape sequences. This defines them as
follows:
newline NL (LF) \n
horizontal ta