Superpages.com is looking for a team player to work in a dynamic group of
developers. We are a brand new company (recent spin-off). Regardless, we are
a large, profitable and stable company. The office is located in Waltham,
MA just next to exit 28b of I-95/128.
Both senior and quick-learning juni
That worked. Thanks!
Running lynx on my local copies of the *.html files works reasonably
well, although the output is not what IE produces, and is harder for me
to parse.
A minor follow-up question. Currently I have to run lynx from its own
directory. Otherwise I got
\lynx_w32\lynx.bat f
Thanks Jerrad,
I actually tried lynx first. However, the HTML files are on a server
that needs authentication. Even adding
-auth my-user-id:my-pw
to lynx was not enough.
Here is the lynx output (I added the # as these are comments in the Perl
program):
# Looking up [my proxy]
# Making HTTP
NTLM is bad, 'm-k?
--
Free map of local environmental resources: http://CambridgeMA.GreenMap.org
--
MOTD on Boomtime, the 49th of Discord, in the YOLD 3173:
It is useless for sheep to pass resolutions in favor of vegetarianism while
wolves remain of a different opinion.
On Wed, 2 May 2007, Tolkin, Steve wrote:
> Q1. Is there a way to automate IE or Mozilla Firefox to save 100's of
> files as text?
Probably, but might it be easier to automate using `lynx -dump` (or
better still, `links -dump`) ?
If those produce output the way you want them, automating it shoul
lynx -dump
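
A rough sketch of what that automation might look like, assuming the pages are local *.html files and that lynx (or links) is on the PATH. The helper names and the directory names are made up for illustration; only the -dump flag comes from the suggestion above.

```python
import subprocess
from pathlib import Path


def dump_command(html_path, browser="lynx"):
    """Build the command that renders one HTML file as plain text."""
    # Both `lynx -dump FILE` and `links -dump FILE` write the text to stdout.
    return [browser, "-dump", str(html_path)]


def text_path_for(html_path, out_dir):
    """Map page.html to <out_dir>/page.txt (hypothetical naming scheme)."""
    return Path(out_dir) / (Path(html_path).stem + ".txt")


def dump_all(src_dir, out_dir, browser="lynx"):
    """Run `<browser> -dump` on every *.html file directly inside src_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for html_path in sorted(Path(src_dir).glob("*.html")):
        # check=True raises CalledProcessError if the browser exits non-zero.
        result = subprocess.run(
            dump_command(html_path, browser),
            capture_output=True,
            check=True,
        )
        target = text_path_for(html_path, out_dir)
        target.write_bytes(result.stdout)
        written.append(target)
    return written
```

Something like `dump_all("pages", "pages_txt")` would then leave one .txt file per page; pass `browser="links"` to try links instead.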
I want to extract the text from several hundred *.html files. Many HTML
tags cause a newline to appear in the output, e.g. etc.
In Internet Explorer if I do "File > Save As..." and change "Save as
Type" to be "Text File (*.txt)" the output file preserves newlines (and
other whitespace) in a reas
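
For comparison, here is a minimal sketch of doing the extraction without a browser, using only Python's standard html.parser module. The set of tags that end a line is my own guess, not taken from the original message, and the result will not match IE's output exactly. Text inside script and style elements is not filtered out here.

```python
from html.parser import HTMLParser

# Assumption: closing any of these tags should end the current line.
LINE_ENDING_TAGS = {"p", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextWithNewlines(HTMLParser):
    """Collect text content, emitting a newline where a block-level tag ends."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # <br> has no closing tag, so it breaks the line when it opens.
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in LINE_ENDING_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return the text of an HTML string with line breaks preserved."""
    parser = TextWithNewlines()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

For example, `html_to_text("<p>one</p><p>two<br>three</p>")` gives three lines: one, two, three.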