Re: Long Lines

Anonymous Sat, 26 Jun 1999 23:19:05 -0700
On 25 Jun 99, at 15:41, Matthew N. Kleiman wrote:

> I recently coded a "robomoderator" ...
> ...to reject any message with any line longer
> than 80 characters (excluding headers).  To my surprise, this has been
> the most common problem "solved" by the robomoderator ... approximately
> 1 in 30 messages sent to the list have long lines.
> 
> I've always believed that a well-behaved email client will wrap long
> lines before sending email, on the premise that some email programs
> don't have a word wrap feature.  I simply assumed it was the _sender's_
> obligation to wrap long lines, not the recipient's.
> 
> However, upon further research, the internet email protocols clearly
> permit line lengths up to 1000 characters (rfc 821).  This perhaps
> suggests that it's the _recipient's_ job to wrap long lines, not the
> sender's.
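
[For concreteness, here is a minimal sketch in Python of the kind of
check described above.  The function name, the default limit, and the
"headers end at the first empty line" split are illustrative
assumptions, not the actual robomoderator's code.]

```python
def find_long_lines(message: str, limit: int = 80) -> list[tuple[int, int]]:
    """Return (body_line_number, length) for each body line longer than limit."""
    # Treat CRLF and bare LF alike so the check works on either form.
    normalized = message.replace("\r\n", "\n")
    # Headers end at the first empty line; only the body is checked.
    _headers, _separator, body = normalized.partition("\n\n")
    return [
        (number, len(line))
        for number, line in enumerate(body.split("\n"), start=1)
        if len(line) > limit
    ]


# Hypothetical message: a long header line is ignored, a long body line is not.
sample = "Subject: " + "x" * 100 + "\r\n\r\nshort line\r\n" + "y" * 81 + "\r\n"
print(find_long_lines(sample))
```

A moderator built on something like this would reject the message
whenever the returned list is non-empty; the line numbers count from
the first line of the body, not from the top of the message.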

Actually, you missed a part of that section.  That section of RFC 821 [= 
STD 10] is the "minimum maximum".  Yes, long lines are permitted, as they 
should be: the *transport* machinery shouldn't impose unnecessary 
arbitrary limits on the package it is transporting.  The place you're 
looking at is:

>       4.5.3.  SIZES
> 
>          There are several objects that have required minimum maximum
>          sizes.
> [...] 

And it goes on to say that:

>           ****************************************************
>           *                                                  *
>           *  TO THE MAXIMUM EXTENT POSSIBLE, IMPLEMENTATION  *
>           *  TECHNIQUES WHICH IMPOSE NO LIMITS ON THE LENGTH *
>           *  OF THESE OBJECTS SHOULD BE USED.                *
>           *                                                  *
>           ****************************************************

A limitation here [for SMTP] would preclude *all*other* possible internal 
data formats from using SMTP as their transport, and so SMTP, quite 
properly, should be generous [or unlimited!].

Now, if you go a level down, into the message within the wrapper, you 
turn to RFC 822 [STD-11] and things are a bit stranger.  Still: as it 
should be, the RFCs are fairly careful *not* to constrain the format 
overly, lest it cripple future uses for email, but to my reading we can 
make some deductions about what was intended...

Consider that the header field is explicitly set up with machinery for 
"folding" and having continuation lines, when it could easily have been 
specified to be [and indeed, it is specified to be handled *as*if* it 
were] a single line with the header-tag at its beginning.  Clearly they 
expected those header fields to be wrapped explicitly in the message, 
whereas if they had merely intended arbitrarily-long-lines they 
wouldn't have bothered [and indeed, it'd make parsing 822 headers easier 
NOT to have to deal with continuations..:o)].
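
To make the folding rule concrete: RFC 822 section 3.1.1 treats a CRLF
that is immediately followed by a space or tab as equivalent to that
whitespace alone.  A rough sketch of "unfolding" in Python follows; the
function name is hypothetical, and accepting a bare LF as well as CRLF
is my own leniency, not something the spec asks for.

```python
import re


def unfold_headers(header_block: str) -> list[str]:
    """Join folded continuation lines back into one logical line per field."""
    # A line break followed by a space or tab is a fold: drop the line
    # break and keep the whitespace that follows it.
    unfolded = re.sub(r"\r?\n(?=[ \t])", "", header_block)
    # Whatever line breaks remain separate one header field from the next.
    return [line for line in re.split(r"\r?\n", unfolded) if line]


# Hypothetical header block with one folded Subject field.
folded = (
    "Subject: a rather long\r\n"
    " subject line\r\n"
    "To: someone@example.org\r\n"
)
for field in unfold_headers(folded):
    print(field)
```

The extra pass is exactly the parsing cost mentioned above: a reader
cannot split fields on line breaks until the folds have been undone.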

You look farther on to the definition of "linear white space" and you see 
that the spec makes a distinction between the semantics of "SPACE" and 
"FOLDING" -- clearly [IMO] the intent for the semantics of LWSP that is 
_not_ a CR is different from an actual CRLF.  More indications, to my 
reading, that their -intent- was for regular text with CRLFs indicating 
where new lines should begin, but with _allowances_ for other, fancier, 
data formats...

At a practical level, another problem with wrapping long lines at the 
reader's end is that it is, in general, not really possible to do it 
quite right unless you have something like <PRE> </PRE> tags available to 
give the reader's client a clue *NOT* to mess with text that _has_ been 
formatted [it is rather WAAY too much of a restriction on email to 
mandate that _only_ unformatted, microsoft-style "paragraphs" should be 
sent!].  E.g., I get system messages with unix log snippets in them and 
histograms [basically, ascii-ized graphs], and all the nice neat columns 
are rather thoroughly trashed if my mail client folds gratuitously. I 
also get things with macsyma-generated equations in them [where 
superscripts and fractions and such are done via multi-line prints, like:
   2    2     2
  x  + y   = z
and you haven't seen a MESS until you've seen some of _that_ stuff 
re-wrapped by the reader's client! :o)]
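
A small illustration of the damage, assuming a client that reflows
text the way Python's textwrap.fill does [real clients differ in the
details; this is only a stand-in for the general behavior]:

```python
import textwrap

# The two-line "equation" from above, laid out the way the sender meant it.
equation = "   2    2     2\n  x  + y   = z"

# Stand-in for a reflowing client: the line break is treated as ordinary
# whitespace, so the exponent row and the variable row run together.
reflowed = textwrap.fill(equation, width=72)

print(equation)
print("--- after reflow ---")
print(reflowed)
```

Every character of the original survives the reflow, but the vertical
alignment that carried the meaning is gone, and nothing in the text
tells the client that the line break mattered.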

> At the same time, I hate to see subscriber's messages get bounced,
> especially novices, to whom the 80 character limitation may seem
> arbitrary and make little sense given modern email clients.  (After all,
> how many of you know the *original* derivation of the 80 character rule
> of thumb?  Answer at the end of this email.)

I do know that, but your derivation has it wrong: the reason for the 
80-char rule *IS* the size of terminals [both hardcopy [e.g., TTYs] and 
CRTs ["glass TTYs" and their descendants]].  The fact that Teletype Corp 
may have been looking elsewhere when they decided to make the platen on 
the KSR33 80 characters wide isn't really relevant...  *Those*devices*, 
for whatever reason, *did* standardize on 80 chars, and the constraint on 
email is *NOT* a homage to the IBM 407, but rather the reality of what 
the terminals *had* standardized on.  [And yes, I know there were a few 
terminals of other sizes [I think my old TI Silent 700 only had 72-char 
lines, and I used an IBM 2741 that had lots more than 80], but 
overwhelmingly, 80 was the number...]

Moreover, the proper limitation is *less* than 80 characters. For 
example, RFC 1855 [the netiquette RFC] simply says:  

>     - Limit line length to fewer than 65 characters and end a line
>       with a carriage return.

good advice then, good advice now...
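
For what it's worth, sender-side wrapping of ordinary prose is only a
few lines in, say, Python.  This is a sketch under my own assumptions:
the function name is made up, paragraphs are taken to be separated by
blank lines, and width=64 is my reading of "fewer than 65".

```python
import re
import textwrap


def wrap_body(body: str, width: int = 64) -> str:
    """Hard-wrap each prose paragraph so lines stay within width columns."""
    # Paragraphs are assumed to be separated by one or more blank lines.
    paragraphs = re.split(r"\n{2,}", body.strip("\n"))
    wrapped = [
        # Never split a single token (a URL, say); such a token may still
        # end up on a line of its own that is longer than width.
        textwrap.fill(p, width=width, break_long_words=False,
                      break_on_hyphens=False)
        for p in paragraphs
    ]
    return "\n\n".join(wrapped)


# Hypothetical body: one long unwrapped paragraph and one short one.
draft = " ".join(["word"] * 60) + "\n\n" + "A short second paragraph."
print(wrap_body(draft))
```

Note that this reflows whatever it is handed, so it belongs on the
sender's own prose only; pre-formatted material such as tables and log
snippets should be left out of it.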

  /Bernie\
-- 
Bernie Cosell                     Fantasy Farm Fibers
mailto:[EMAIL PROTECTED]     Pearisburg, VA
    -->  Too many people, too few sheep  <--