wget (CentOS build) page-requisites leaves requisites in a bad location

2007-09-04 Thread Ed
Seen this twice now but unable to track down how it happens. I am crawling a list of websites which are being kept in a cache area. My args (in ruby) are ARGS = "--html-extension " + "--page-requisites " + "--force-directories " + "--convert-links " + "--directory-
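The snippet's Ruby string is cut off mid-flag. Assembled on one line, the command it appears to build would look roughly like the sketch below; the cache directory and URL are placeholders, and the truncated flag is assumed (not confirmed by the message) to be --directory-prefix.

```shell
# Sketch of the wget invocation the Ruby snippet appears to assemble.
# CACHE and the URL are placeholders; --directory-prefix is an assumption,
# since the original message is truncated at "--directory-".
CACHE=/tmp/wget-cache
args="--html-extension --page-requisites --force-directories \
--convert-links --directory-prefix=$CACHE"
echo "wget $args http://example.invalid/"   # prints the full one-line command
```

With --page-requisites and --convert-links together, wget rewrites page links to point at wherever it stored the requisites, so requisites landing outside the per-site directory (as the report describes) would break the cached copy.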

Wget 1.10.2 produces different log output on Mac OS X and CentOS

2007-08-15 Thread Ed
Note the line beginning '=>' giving the filename is missing on CentOS; unfortunately I was using this for validation. Any idea why there is a difference between the two platforms? Any idea how I can get the output filename shown on all platforms? Ed

Wget doesn't use characters after '&' when saving a URL as a filename

2007-03-28 Thread Ed
' + "--timestamping " + #{}"--wait=1 " + '--restrict-file-names=unix ' + '--user-agent="' + UA + '" ' + '--append-output="' + BASE + '/wget2.log" ' + "--server-response " How do I get the full file name? Mac OS X by the way Ed
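A classic cause of everything after '&' disappearing has nothing to do with wget itself: an unquoted '&' ends the command and backgrounds it, so wget never sees the rest of the URL. A minimal shell-only illustration (placeholder URL, no network access):

```shell
# Quoted, the '&' is just data and the whole query string survives.
# Unquoted (wget http://...?q=wget&hl=en), the shell would instead treat
# '&hl=en' as a separate background job and wget would see only '...?q=wget'.
url='http://example.invalid/search?q=wget&hl=en'
printf '%s\n' "$url"   # prints the full URL including &hl=en
```

Since the report is on Mac OS X with --restrict-file-names=unix, quoting is worth ruling out before suspecting wget's filename mangling.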

uploading a bloody form using multipart/form-data

2007-03-09 Thread Ed S. Peschko
d wrong or should I be looking at a different tool to do this? Whatever tool would have to support proxy authentication... Ed (ps - here are the headers I'm trying to use. I've gotten them to work with perl, but again without proxy. I can't get wget to work either with or with
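The message asks whether a different tool could do the multipart/form-data upload behind an authenticating proxy. As a sketch only: curl builds multipart requests with -F and speaks to proxies via --proxy/--proxy-user. The host names, file name, and credentials below are placeholders, and the command is only printed, not run:

```shell
# Placeholder sketch: constructs (does not execute) a curl multipart upload
# going through an authenticating proxy. All names and credentials are invented.
cmd="curl -F upload=@report.pdf \
--proxy http://proxy.invalid:8080 --proxy-user user:pass \
http://example.invalid/submit"
echo "$cmd"
```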

Excluded hosts ignored when wget'ing from google search results

2006-11-09 Thread Ed
othing happens regardless of the -F switch. If I save as web page complete I get the same outcome as above. Just out of desperation I also tried adding -R '*\?*' to the command line, this made no difference either Ed

OT: How do I join this list?

2006-11-09 Thread Ed
nd/or where/how I have to reply to join? I'm having problems with wget and this isn't helping. Ed

Re: exclude_domains doesn't work with google search results

2006-11-09 Thread Ed
Sorry for loss of threading but I can't join the list from this address. This is an own-built version of 1.10.2 on Mac OS X 10.4. Ed

exclude_domains doesn't work with google search results

2006-11-09 Thread Ed
s getting them read from $HOME/.wgetrc but that may be another problem Now some odd things happen when I run this script, all of which seem to be related to domains in the supposedly excluded list. E.g. here is a directory listing after running my script nemesis:~/wget-test ed$ ls 64.233.183.104/
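For what it's worth, exclude_domains (and --exclude-domains) only filters hosts during recursive retrieval, and one known wrinkle in wget of this era is that redirections may not be checked against the accept/reject lists, which could explain an "excluded" host's IP showing up anyway. A minimal .wgetrc sketch, with a domain list that is only a placeholder echoing the directory listing in the report:

```
# ~/.wgetrc sketch -- placeholder domains, not a confirmed fix
span_hosts = on
exclude_domains = 64.233.183.104,google.com
```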

wget -O writes empty file on failure

2006-01-17 Thread Avis, Ed
without -O, that is, no output file should be created in the case of 404 errors or other total failures to download anything. -- Ed Avis <[EMAIL PROTECTED]>
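Until wget behaves that way itself, the usual workaround is to let -O write to a temporary file and move it into place only when wget reports success; a sketch (the function name is mine, not from the thread):

```shell
# safe_fetch URL OUT: run wget -O against a temp file and only move the
# result into place when wget exits successfully, so a 404 or other total
# failure never leaves behind an empty OUT. (Function name is invented.)
safe_fetch() {
  url=$1 out=$2
  tmp=$(mktemp) || return 1
  if wget -q -O "$tmp" "$url"; then
    mv "$tmp" "$out"
  else
    rc=$?           # preserve wget's exit status
    rm -f "$tmp"    # discard the empty/partial temp file
    return "$rc"
  fi
}
```

wget exits non-zero on a failed download, so the caller still sees the error while no empty output file is created.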

Re: download files with foreign/"illegal" chars

2005-08-22 Thread ed
"I assume your default character set is UTF-8 or something that handles unicode characters? If not, perhaps that might help?" ed wrote: "Actually with the ftp client I can't even get into one of the subdirectories with a Chinese name. Does an

download files with foreign/"illegal" chars

2005-08-17 Thread ed
't necessarily question marks, but could be any "high ASCII" characters, I think. I tried logging in with a regular ftp client, and I'm seeing symbols like diamond shapes and such. Actually with the ftp client I can't even get into one of the subdirectories with a Chinese name. Does anyone have any idea if there's any way around this? Thanks! Ed

Patch: improve 303 See Other handling

2003-08-16 Thread Ed Avis
@@ -77,6 +77,7 @@
      retrieving? */
   int retr_symlinks;      /* Whether we retrieve symlinks in FTP. */
+  int follow_see_other;   /* Is 303 See Other treated as a redirect? */
   char *output_document;  /* The output file to which the documents
                              will be printed. */
   int od_known_regular;   /* whether output_document is a
-- Ed Avis <[EMAIL PROTECTED]>

'broken pipe' error message uses wrong filename with -O-

2003-06-07 Thread Ed Avis
should instead say "cannot write to standard output" or perhaps "cannot write to '-'". -- Ed Avis <[EMAIL PROTECTED]>

HAVE_SELECT bugette

2003-02-15 Thread Ed Avis
                         cmd_time },
+#endif /* HAVE_SELECT */
  { "timestamping",  &opt.timestamping,  cmd_boolean },
  { "tries",         &opt.ntry,          cmd_number_inf },
  { "useproxy",      &opt.use_proxy,     cmd_boolean },
-- Ed Avis <[EMAIL PROTECTED]>

'Redirection to itself', but okay with other fetchers

2002-09-28 Thread Ed Avis
self', yet somehow the same URL works with programs other than wget. -- Ed Avis <[EMAIL PROTECTED]>

Compile problem (and possible fix)

2001-11-05 Thread Ed Powell
to: assert (ch == '\'' || ch == '\"'); Otherwise, it would not compile... it was, I think, interpreting the ", rather than using it literally. Escaping it appears to have fixed the problem. The compiling process was simply doing a 'configure'