A forum has topics which are available only to members.
How can I use wget to download copies of those pages?
How do I obtain the proper cookies, and how do I get wget
to use them correctly? I use IE on a PC/Windows machine
and wget on a unix machine. I could use Lynx on the unix
machine if needed.
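One approach, sketched below under the assumption that the forum uses a session cookie (the host name forum.example.com and the cookie name session_id are made up for illustration): wget reads cookies from a Netscape-format file via --load-cookies, and Lynx on the unix machine can write that same format after you log in with it. Alternatively, copy the cookie value out of the browser and pass it with --header 'Cookie: ...'.

```shell
# Hypothetical cookie file; fields are tab-separated Netscape format.
# Export your real cookies (e.g. from Lynx's cookie jar) instead.
cat > cookies.txt <<'EOF'
# Netscape HTTP Cookie File
forum.example.com	FALSE	/	FALSE	0	session_id	abc123
EOF
# Recursive fetch of the members-only area, sending the cookie:
# wget --load-cookies cookies.txt -r -l 2 -np http://forum.example.com/topics/
grep -q 'session_id' cookies.txt
```

The --load-cookies route is usually less fragile than pasting a Cookie header by hand, since the file can hold several cookies at once.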
Hello. Wget 1.10.2 has the following bug compared to version 1.9.1.
First, bin/wgetdir is defined as
wget -p -E -k --proxy=off -e robots=off --passive-ftp \
    -o zlogwget`date +%Y%m%d%H%M%S` -r -l 0 -np -U Mozilla --tries=50 \
    --waitretry=10 $@
The download command is
wgetdir
Hello. How do I get wget to ignore urls containing any of the following
strings? Surprisingly, --help did not reveal a suitable option.
action=
printable=
redirect=
article=
returnto=
title=
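For what it's worth, newer wget releases (1.14 and later, so not 1.10.2) grew a --reject-regex option that covers exactly this case. A hedged sketch, with the wiki host name made up:

```shell
# Alternation of the unwanted query parameters:
REJECT='action=|printable=|redirect=|article=|returnto=|title='
# With wget >= 1.14 this skips matching urls during a recursive crawl:
# wget -r -l 2 -np --reject-regex "$REJECT" http://wiki.example.org/
# Offline check that the pattern matches what we intend:
printf '%s\n' 'http://w/index.php?title=Foo&action=edit' | grep -Eq "$REJECT" && echo matched
```

With 1.10.2 itself there is no direct equivalent, which matches the observation that --help shows nothing suitable.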
I would also like to recall the problems with the existing options:
(1) I downloaded an ftp
Hello. The following problem occurred recently. I started downloading
everything under the directory
http://site.edu/projects/software/
Then after a day I found that the subdirectory
http://site.edu/projects/software/program/manual/
had a wiki with millions of files. Because I wished that the download
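In case it helps others hitting the same runaway-wiki problem: wget's -X / --exclude-directories option prunes whole subtrees during a recursive download. A sketch only, using the paths from the report and not tested against that server:

```shell
# Restart the mirror but prune the wiki subtree:
# wget -r -l 0 -np -X '/projects/software/program/manual' \
#      http://site.edu/projects/software/
# -X does prefix matching on the directory part of each url;
# the same idea shown offline with a shell glob:
case '/projects/software/program/manual/page1.html' in
  /projects/software/program/manual/*) echo pruned ;;
esac
```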
Hello. The TODO lists the following:
* Add more protocols (e.g. gopher and news), implementing them in a
modular fashion.
Do you mean the nntp protocol? If so, I recently wrote an nntp
downloader:
http://www.funet.fi/~kouhia/nntppull20060409.tar.gz
I find it good for news archiving. I now
Hello. How would I write the -A option if I want both .pdf and .PDF
files from an ftp site? -A pdf,PDF failed -- only PDF files were
downloaded.
How would I write the -X option if I want multiple subdirectories
excluded? -X dir1,dir2 failed -- only one of the given dirs
was excluded. (E.g.
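A possible workaround, which I have not verified against 1.10.2's ftp code: -A also accepts glob patterns, not just suffixes, so spelling both cases out as quoted globs may behave better, and -X wants absolute, comma-separated directory paths. Sketch with made-up hosts and paths:

```shell
# Accept both lowercase and uppercase pdf files:
# wget -r -np -A '*.pdf,*.PDF' ftp://ftp.example.org/pub/docs/
# Exclude several directories (absolute paths, comma-separated):
# wget -r -np -X '/pub/dir1,/pub/dir2' ftp://ftp.example.org/pub/
# The accept-glob idea, checked offline:
for f in paper.pdf PAPER.PDF notes.txt; do
  case "$f" in *.pdf|*.PDF) echo "accept $f" ;; esac
done
```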
Hello. I would like wget to have an internal database. The database
would let wget know what it has downloaded earlier. Wget could then
download only new and changed files, and could continue a download
without keeping the old downloads on my disk.
The database would also be accessed by other
Hello.
The following document could not be downloaded at all:
http://www.greyc.ensicaen.fr/~dtschump/greycstoration/
If you succeed, please tell me how. I want all the html files
and the images.
Juhana
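For reference, the combination I would try first for "all the html files and the images" (not verified against this particular server): -p fetches each page's requisites such as inline images, -k rewrites links for local viewing, -E adds .html extensions, and -U sets a browser-like agent in case the server refuses wget's default.

```shell
# The attempt, stored as a string here so the sketch runs offline:
cmd='wget -p -k -E -U Mozilla http://www.greyc.ensicaen.fr/~dtschump/greycstoration/'
echo "$cmd"
```

Adding -r -l 1 -np on top would also pull pages linked one level down, if the images hang off subpages.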
Hello.
Does wget have nntp (Usenet newsgroups) support?
For example, I might want to download all articles between
numbers M and N. A date-based system could be useful too.
We just have to agree on how these queries are represented to
wget.
I can dig out some old Usenet news downloader code if wget
does
Hello.
I traced the url given on the command line, and it looks like there
is no difference whether one gives ~ or %7E. Is this true?
The urls end up in url_parse(), which switches ~ (as unsafe) to
%7E. If the original url is not used at all, as it appears,
then there is no difference. But mysteriously
Hello.
Wget could not download the images of the page
http://www.fusionindustries.com/alex/combustion/index.html
The image urls have %5C (backslash \) in them.
http://www.fusionindustries.com/alex/combustion/small%5C0103%20edgepoint-pressure%20small.png
Hello.
Has the ~ / %7E bug always been in wget? When was it added to wget?
Who wrote the code?
I would like to suggest that the person who introduced this severe bug
should fix it back immediately. It does not make sense that we waste
time trying to fix this bug if the person did not spend any moment
Hello.
Why does wget generate the following index files?
Why so many index files?
ftp1.sourceforge.net/gut/index.html
ftp1.sourceforge.net/gut/index.html?C=M;O=A
ftp1.sourceforge.net/gut/index.html?C=M;O=D
ftp1.sourceforge.net/gut/index.html?C=N;O=A
ftp1.sourceforge.net/gut/index.html?C=N;O=D
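Those extra files are Apache's column-sorting links on an auto-generated directory index (?C=M;O=A means "sort by modification date, ascending", and so on). One workaround, a sketch only: reject any index.html with a query string appended.

```shell
# Reject the sorted-listing variants during a recursive download:
# wget -r -np -R 'index.html?*' http://ftp1.sourceforge.net/gut/
# wget's -R patterns work like shell globs; the same pattern offline
# (note the ? wildcard here happens to match the literal '?'):
case 'index.html?C=M;O=A' in
  index.html?*) echo rejected ;;
esac
```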
Hello.
Wget could not follow dynsrc tags; the mpeg file was not downloaded:
<img dynsrc="Collision.mpg" CONTROLS LOOP=1>
at
http://www.wideopenwest.com/~nkuzmenko7225/Collision.htm
Regards,
Juhana
Hello.
When the url
http://zeus.fri.uni-lj.si/%7Ealeks/POIS/Kolaborativno%20delo.htm
is downloaded with -np -r -l 0 etc., the file
http://zeus.fri.uni-lj.si/~aleks/POIS/Kolaborativno delo_files/filelist.xml
is downloaded correctly. However, the hrefs in the xml file are not
followed:
Hello.
Recent mails have not been replied to, and the CVS may be stale.
Who are the developers of wget at the moment?
I just posted a couple of feature-loss reports, but my intent
is not to pour the tasks on the current developers. However,
without anyone giving hints on what to look at, the features
may go
Hello.
I have been thinking a little about how to make wget better, possibly.
We would need a scripting system so that features can be programmed
more easily. One way to incorporate scripting into wget would
be to rewrite wget as a data-flow system, much in the same way as
OpenGL (www.opengl.org)
Hello.
The file
http://www.cs.utah.edu/~gooch/JOT/index.html
is compressed, and wget could not follow the urls in it.
What can be done? Should wget uncompress compressed *.htm
and *.html files? What about *.asp and *.php?
Juhana
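Newer wget releases (1.19.2 and later) can negotiate and decode gzip themselves via --compression=auto; with an older wget the file can be decompressed by hand before a second pass over its links. A demonstration with a local stand-in for the compressed page:

```shell
# Stand-in for a compressed page fetched with e.g.
#   wget -O index.html.gz http://www.cs.utah.edu/~gooch/JOT/index.html
printf '<a href="page2.html">next</a>\n' | gzip > index.html.gz
# Decompress, after which wget (via -i on an extracted url list, or a
# second pass over the local file) can see the hrefs again:
gunzip -c index.html.gz > index.html
grep -o 'href="[^"]*"' index.html
```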
Hello.
The character encoding of ~ causes problems in downloading.
Example:
wget -p -E -k --proxy=off -e robots=off --passive-ftp -q -r -l 0 -np \
    http://www.stanford.edu/~dattorro/
However, not all of it was downloaded. The file machines.html has hrefs
http://www.stanford.edu/%7Edattorro/images/calloph.jpg
Hello.
One wget problem this time. I downloaded everything in
http://www.planetunreal.com/wod/tutorials/
but most of the files were not downloaded because the urls are
in the file
http://www.planetunreal.com/wod/tutorials/sidebar.js
in the following format:
FItem(Beginner's Guide to UnrealScript,
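wget does not parse JavaScript, so urls built inside FItem(...) calls are invisible to it. A workaround sketch: pull the candidate page names out of the script with grep and feed them back to wget with -i and --base. The FItem layout below is reconstructed from the fragment above, so treat the pattern as a starting point, not a fact about the real file.

```shell
# Local stand-in for the real sidebar.js:
cat > sidebar.js <<'EOF'
FItem(Beginner's Guide to UnrealScript, begguide.htm);
FItem(Advanced Topics, advanced.htm);
EOF
# Extract anything that looks like a page name:
grep -o '[A-Za-z0-9_./-]*\.htm' sidebar.js > urls.txt
cat urls.txt
# Then resolve the relative names against the site:
# wget --base=http://www.planetunreal.com/wod/tutorials/ -i urls.txt
```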
Hello.
Problem: When downloading everything in
http://udn.epicgames.com/Technical/MyFirstHUD
wget overwrites the downloaded MyFirstHUD file with the
MyFirstHUD directory (which comes later).
GNU Wget 1.9.1
wget -k --proxy=off -e robots=off --passive-ftp -q -r -l 0 -np -U Mozilla $@
Solution: Use of -E
Hello. This is a report on some wget bugs. My wgetdir command looks
like the following (wget 1.9.1):
wget -k --proxy=off -e robots=off --passive-ftp -q -r -l 0 -np -U Mozilla $@
Bugs:
Command: wgetdir http://www.directfb.org;.
Problem: In file www.directfb.org/index.html the hrefs of type
Hello.
I downloaded
http://agar.csoft.org/index.html
with the -k option, but the URL
http://agar.csoft.org/man.cgi?query=widgetamp;sektion=3
in the file was not converted to a relative link.
(The local filename is man.cgi?query=widgetsektion=3.)
Regards,
Juhana
--16:59:21--  http://www.maqamworld.com:80/
           => `index.html'
Connecting to www.maqamworld.com:80... connected!
It looks like you have http_proxy=80 in your wgetrc file.
I placed use_proxy = off in .wgetrc (which file I did not have earlier)
and in ~/wget/etc/wgetrc (which file
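If a stray http_proxy value really is the cause, two other ways to rule proxying out, sketched here (the wgetrc line is written to a local file only for illustration; in practice it belongs in ~/.wgetrc):

```shell
# Per-invocation override, no config file needed:
# wget -e use_proxy=off http://www.maqamworld.com/
# Persistent override, plus clearing the environment variable:
cat > wgetrc-example <<'EOF'
use_proxy = off
EOF
unset http_proxy
grep 'use_proxy' wgetrc-example
```

The -e flag applies any wgetrc-style setting for a single run, which is handy for checking whether the proxy setting is the culprit before editing files.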
Hello.
What goes wrong in the following? (I will read the replies from the list
archives.)
% wget http://www.maqamworld.com/
--16:59:21--  http://www.maqamworld.com:80/
           => `index.html'
Connecting to www.maqamworld.com:80... connected!
HTTP request sent, awaiting response... 503