Here's a very naïve approach, which will probably work. It might mangle your
<PRE> sections though, since it collapses whitespace wherever it matches...

perl -p0
s/
>/>
/g;s/^[^>]{1,69}[>"]/join$",split' ',$&/mge


Including an end of line match might make it a little more resilient:
^[^>]{1,69}[>"]$

It puts the attributes on the same line as the tag name, but that looks
right to me, i.e. you get:
<META NAME="GENERATOR">
instead of:
<META
NAME="GENERATOR">
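For anyone who would rather read than golf, here is a rough ungolfed sketch of the same two substitutions. The sub name tidy_html and the sample input are made up for illustration, and join(' ', ...) stands in for the golfed join$" (which relies on $" defaulting to a single space). Treat it as one reading of the one-liner, not a tested tool.

```perl
use strict;
use warnings;

# Hypothetical helper name; it only wraps the two substitutions shown above.
sub tidy_html {
    my ($html) = @_;

    # Step 1: a '>' that starts a line is moved to the end of the
    # previous line.
    $html =~ s/\n>/>\n/g;

    # Step 2: starting at each line start, take up to 69 non-'>' characters
    # (newlines included) followed by '>' or '"', and collapse every run of
    # whitespace inside that span to a single space.
    $html =~ s/^[^>]{1,69}[>"]/join(' ', split(' ', $&))/mge;

    return $html;
}

# Made-up sample in the style of the docbook2html output.
my $ugly = <<'END';
<HEAD
><META
NAME="GENERATOR"
></HEAD
>
END

print tidy_html($ugly);
```

With this input the META element comes out on a single line, as in the example above. Anything inside <PRE> that happens to match the second pattern would be collapsed in the same way.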

Greg

-----Original Message-----
From: Phil Carmody [mailto:[EMAIL PROTECTED]]
Sent: Friday, October 10, 2003 1:34 PM
To: [EMAIL PROTECTED]
Subject: HTML de-uglifier in 2 lines of perl


#!/usr/bin/perl -n
chomp;if($#p>=0&&s/^(\"?>)//){$p[-1].="$1\n";print(join($w<70?' ':"\n",@p));@p=($_);$w=0}
[EMAIL PROTECTED],$_}$w+=length;}{print(join("\n",@p))if($#p>=0);


I wrote that because docbook2html produces ugly HTML:
<<<
<HTML
><HEAD
><TITLE
>A World Wide Web Interface to CTAN</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.76b+
"></HEAD
><BODY
...
>>>

and I wanted (IMHO) prettier HTML:
<<<
<HEAD>
<TITLE>
A World Wide Web Interface to CTAN</TITLE>
<META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.76b+">
</HEAD>
<BODY
...
>>>

The script also tries to join multiple attributes onto the same line, 
as long as the line wouldn't be too long (70 chars) as I also find that
improves the readability of HTML (by reducing the noise level).
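The buffering idea described here can be sketched in a more readable form. This is only one reading of the script, not the original: the sub name join_tags and its arguments are hypothetical, and it assumes that the part of the script mangled by the list archive simply pushes the current line onto the buffer.

```perl
use strict;
use warnings;

# Hypothetical name and interface; an ungolfed reading of the idea only.
sub join_tags {
    my ($max_width, @lines) = @_;
    my @pending;      # pieces of the tag currently being collected
    my $width = 0;    # total length of the pieces seen since the last flush
    my $out   = '';

    for my $line (@lines) {
        chomp $line;
        if (@pending && $line =~ s/^("?>)//) {
            # A leading '>' or '">' closes the pending tag.
            $pending[-1] .= "$1\n";
            # Join on one line if it is short enough, otherwise keep
            # one piece per line.
            $out .= join($width < $max_width ? ' ' : "\n", @pending);
            @pending = ($line);
            $width   = 0;
        }
        else {
            # Assumed behaviour of the mangled part: buffer the line.
            push @pending, $line;
        }
        $width += length $line;
    }

    # Flush whatever is still buffered at end of input.
    $out .= join("\n", @pending) if @pending;
    return $out;
}

print join_tags(70, '<META', 'NAME="GENERATOR"', '></HEAD', '>');
```

Passing a smaller width (for example 10) leaves the attributes on separate lines, which is the behaviour wanted for long tags such as the CONTENT attribute above.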

As I suck at perl, I reckon that something only half the length of that
might be possible. 

Don't spend more than 2 minutes on it. I didn't!

Phil

