I'm trying to write something that converts HTML into nicely formatted
text.

My current solution:

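    use strict;
    use warnings;
    use FileHandle;

    my $text = "<p>Hello World</p>";

    # Write the HTML to a temporary file for lynx to read.
    my $fh = FileHandle->new(">temp.txt")
        or die "Unable to write temporary file: $!";
    $fh->print($text);
    $fh->close;

    # Run lynx on the file and read the formatted text back from its output.
    $fh->open("lynx -dump -stdin < temp.txt |")
        or die "Unable to run HTML parser: $!";
    $text = join "", <$fh>;
    $fh->close;
    print $text;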

How can I improve it? I haven't found anything on CPAN that does what I
want (there are "remove HTML tags" scripts and the like, but nothing with
the formatting power of lynx that I can see).

I'd like to avoid the intermediate file, too. The text will come from the
outside world, so I'd like to be safe while I'm processing it....

Cheers,

Ian

--
------------------------------------------------------------------------

The soul would have no rainbows if the eyes held no tears.

Ian Malpass                     [EMAIL PROTECTED]
