converting html to text

2002-04-04 Thread Paul Tremblay
CPAN module already written. Converting html to text seems like such a common task, that there ought to be some robust scripts out there. Interestingly enough, I found many scripts to convert html to rtf and LaTeX and every other format, but not plain old text! Paul --

Converting HTML to text?

2002-01-06 Thread Andy Ransom
Hi, I have a requirement to convert HTML files to plain text within a Perl script, and I need to preserve the formatting of HTML tables as far as possible, say something like Netscape does when you do a "save as text" operation. I have looked on CPAN but could not find anything appropriate (altho

Re: converting html to text

2002-04-04 Thread Agustin Rivera
llstar.com - Original Message - From: "Paul Tremblay" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Thursday, April 04, 2002 9:10 AM Subject: converting html to text > I spent several hours last night trying to convert an html file > to text,

Re: converting html to text

2002-04-04 Thread Elaine -HFB- Ashton
Paul Tremblay [[EMAIL PROTECTED]] quoth: *> *>I am wondering if there isn't a CPAN module already written. *>Converting html to text seems like such a common task, that there *>ought to be some robust scripts out there. Interestingly enough, *>I found many scripts to convert h

Re: converting html to text

2002-04-04 Thread tom poe
am wondering if there isn't a CPAN module already written. > *>Converting html to text seems like such a common task, that there > *>ought to be some robust scripts out there. Interestingly enough, > *>I found many scripts to convert html to rtf and LaTeX and every

RE: converting html to text

2002-04-04 Thread murphy, daniel (BMC Eng)
TED] EMC Corp.508-249-3322 Hopkinton, MA 01748 EMC² where information lives -Original Message- From: Paul Tremblay [mailto:[EMAIL PROTECTED]] Sent: Thursday, April 04, 2002 12:11 PM To: [EMAIL PROTECTED] Subject: converting html to text

Re: converting html to text

2002-04-04 Thread drieux
On Thursday, April 4, 2002, at 12:12 , tom poe wrote: [..] >> >> That's what the search engine is for >> >> http://search.cpan.org/search?dist=HTML-Format >> >> e. worth remembering is also that this will require the Font-AFM distribution. ciao drieux --- -- To unsubscribe, e-mail: [EMAIL P

Re: converting html to text

2002-04-04 Thread Paul Tremblay
On Thu, Apr 04, 2002 at 10:36:36AM -0800, Agustin Rivera wrote:
>
> Are you looking to keep the basic formatting of the HTML intact during the
> conversion, or just want the HTML stripped? I wouldn't imagine that it
> would be that hard to convert the HTML to text if the HTML wasn't overly
> c

Re: converting html to text

2002-04-05 Thread drieux
On Thursday, April 4, 2002, at 01:19 , murphy, daniel (BMC Eng) wrote:
> Just did this with the help of "Perl Cookbook" (this book is great).
>
> Chapter 20.6 Extracting or Removing HTML tags
>
> use HTML::Parse;
> use HTML::FormatText;
> $plain_text = HTML::FormatText->new->format(parse_html($h
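The Cookbook recipe quoted above is cut off in this archive snippet. What follows is only a minimal sketch of the same idea, not the recipe itself: it assumes the HTML-Tree and HTML-Format distributions from CPAN are installed, and it uses HTML::TreeBuilder rather than the older HTML::Parse interface. The helper name html_to_text, the sample markup, and the margin values are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::FormatText;

# html_to_text is a hypothetical helper name, not something from the thread.
# It parses an HTML string into a tree and renders that tree as plain text.
sub html_to_text {
    my ($html) = @_;

    # Build a parse tree from the HTML string.
    my $tree = HTML::TreeBuilder->new_from_content($html);

    # Margins are arbitrary choices for this sketch.
    my $formatter = HTML::FormatText->new(leftmargin => 0, rightmargin => 72);
    my $text = $formatter->format($tree);

    # Release the tree explicitly once the text has been produced.
    $tree->delete;
    return $text;
}

# Sample input, made up for the example.
print html_to_text(
    '<html><body><h1>Title</h1><p>Some <b>bold</b> text.</p></body></html>'
);
```

To convert a file instead, read its contents into a string first and pass that string to html_to_text.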

Re: converting html to text

2002-04-05 Thread Paul Tremblay
On Fri, Apr 05, 2002 at 05:15:08AM -0800, drieux wrote:
>
> ### #!/usr/bin/perl
> ###
> ### use HTML::Parser;
> ### use HTML::FormatText;
> ### use HTML::TreeBuilder;
> ###
> ### my $html_text;
> ### my $filename = $ARGV[0];
> ### open(FH, $filename) or die "unable to open file $filename :$!\n";

Re: converting html to text

2002-04-06 Thread drieux
On Friday, April 5, 2002, at 10:43 , Paul Tremblay wrote: [..]
> The problem is that the filter deletes all of my text and outputs this:
>
> [TABLE NOT SHOWN][TABLE NOT SHOWN][TABLE NOT SHOWN][TABLE NOT
> SHOWN][TABLE NOT SHOWN]
Right! that is the big clue I should have seen - there is no 'plain
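The reply above is cut off in this archive, so the fix drieux went on to suggest is not known here. One possible workaround, offered purely as an assumption-laden sketch, is to skip the formatter for table content and pull the cell text out of the parse tree directly. The helper name table_rows_as_text and the tab-separated output layout are choices made for this example, and HTML::TreeBuilder (HTML-Tree on CPAN) is assumed to be installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

# table_rows_as_text is a hypothetical helper name, not from the thread.
# It returns one line per <tr>, with the text of its cells joined by tabs.
# Caveat: look_down searches all descendants, so cells of nested tables
# are also picked up by the enclosing row.
sub table_rows_as_text {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($html);

    my @lines;
    for my $row ($tree->look_down(_tag => 'tr')) {
        my @cells;
        for my $cell ($row->look_down(_tag => qr/^t[dh]$/)) {
            my $text = $cell->as_text;
            $text =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
            push @cells, $text;
        }
        push @lines, join("\t", @cells);
    }

    $tree->delete;
    return join("\n", @lines) . "\n";
}

# Sample table, made up for the example.
print table_rows_as_text(
    '<table><tr><th>Name</th><th>Qty</th></tr>'
  . '<tr><td>apple</td><td>3</td></tr></table>'
);
```

This keeps the row and column order but not the visual alignment that a browser's "save as text" would give; padding each column to a fixed width would be a further step.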