A forum has topics which are available only to members.
How do I use wget to download a copy of the pages in that
case? How do I get the proper cookies, and how do I get wget
to use them correctly? I use IE on PC/Windows and wget on
a Unix computer. I could use Lynx on the Unix computer
if needed.
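One approach I have been considering (a sketch only; cookies.txt and the
forum address below are placeholders, and how the login is handled depends
on the forum): log in with the browser, export the session cookies to a
Netscape-format cookies.txt, copy that file to the Unix machine, and point
wget at it with --load-cookies. If Lynx saves its cookies in the same
format, a Lynx login on the Unix machine could produce the file directly.
  wget --load-cookies cookies.txt -p -E -k -r -l 0 -np \
       http://forum.example.org/members/topic123.html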
Hello. Wget 1.10.2 has the following bug compared to version 1.9.1.
First, bin/wgetdir is defined as
wget -p -E -k --proxy=off -e robots=off --passive-ftp
-o zlogwget`date +%Y%m%d%H%M%S` -r -l 0 -np -U Mozilla --tries=50
--waitretry=10 $@
The download command is
wgetdir http://udn.epi
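For completeness, the same wgetdir wrapper written out as a stand-alone
script (a sketch; the only additions to the command above are the shebang
line and the quotes around "$@" so that URLs with special characters
survive the shell):
  #!/bin/sh
  # bin/wgetdir -- recursive mirror with a timestamped log file
  wget -p -E -k --proxy=off -e robots=off --passive-ftp \
       -o zlogwget`date +%Y%m%d%H%M%S` -r -l 0 -np -U Mozilla --tries=50 \
       --waitretry=10 "$@"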
Hello. The TODO lists the following:
* Add more protocols (e.g. gopher and news), implementing them in a
modular fashion.
Do you mean the NNTP protocol? If yes, I recently wrote an NNTP
downloader:
http://www.funet.fi/~kouhia/nntppull20060409.tar.gz
I find it good for news archiving. I now arch
Hello. The following problem occurred recently. I started downloading
all under directory
http://site.edu/projects/software/
Then after a day I found that the subdirectory
http://site.edu/projects/software/program/manual/
had a wiki with millions of files. Because I wished that the download
con
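In case it helps, the workaround I would try is to restart the download
with the runaway subdirectory excluded via -X (a sketch, using the paths
from above):
  wget -r -l 0 -np -X /projects/software/program/manual \
       http://site.edu/projects/software/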
Hello. How do I get wget to ignore URLs containing one of the following
strings? The --help output did not reveal a suitable option, surprisingly.
action=
printable=
redirect=
article=
returnto=
title=
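As far as I can tell there is no such option in the wget versions I have;
later wget releases add --reject-regex, which is matched against the whole
URL, so there it could look like this (a sketch, with a placeholder URL):
  wget -r -l 0 -np \
       --reject-regex 'action=|printable=|redirect=|article=|returnto=|title=' \
       http://wiki.example.org/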
I would like to remind you about the problems with the existing options:
(1) I downloaded an ftp sit
Hello. How would I type the -A option if I want both .pdf and .PDF
files from an FTP site? "-A pdf,PDF" failed -- only PDF files were
downloaded.
How would I type the -X option if I want multiple subdirectories
excluded? "-X dir1,dir2" failed -- only one of the given dirs
was excluded. (E.g. www.site
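For reference, the alternative forms I would try next (a sketch; the host
and directory names are placeholders): the manual says that -A/-R elements
containing wildcards are treated as patterns rather than suffixes, and that
-X takes a comma-separated list of paths given from the server root.
  wget -r -np -A '*.pdf,*.PDF' ftp://ftp.example.org/pub/docs/
  wget -r -np -X /dir1,/dir2 ftp://ftp.example.org/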
Hello. I would like to have a database within wget. The database
would let wget know what it has downloaded earlier. Wget could
download only new and changed files, and could continue the download
without keeping the old downloads on my disk.
The database would also be accessed by other program
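The closest existing feature I know of is timestamping: with -N wget
fetches only files that are new or changed on the server, but it still
needs the old files (for their timestamps) on disk, which is exactly what
the database would avoid. A sketch of the -N form, with a placeholder URL:
  wget -N -r -l 0 -np http://www.example.org/archive/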
Hello.
The following document could not be downloaded at all:
http://www.greyc.ensicaen.fr/~dtschump/greycstoration/
If you succeed, please tell me how. I want all the HTML files
and the images.
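A command that sometimes helps with pages like this (a sketch only; I have
not verified it against this particular site): fetch the page requisites,
convert the links, ignore robots.txt, and send a browser-like User-Agent
in case the server refuses wget's default one.
  wget -p -E -k -e robots=off -U Mozilla \
       http://www.greyc.ensicaen.fr/~dtschump/greycstoration/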
Juhana
--
http://music.columbia.edu/mailman/listinfo/linux-graphics-dev
for developers of open
Hello.
Does wget have NNTP (Usenet newsgroups) support?
For example, I might want to download all articles between
numbers M and N. A date-based system could be useful too.
We should just agree on how these queries are represented to
wget.
I can dig out old Usenet news downloader code if wget
does
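Purely as a strawman for how the queries could be represented (nothing
below exists in wget today; the news:// form and the range/date syntax are
hypothetical):
  # hypothetical: articles M..N of a group from a given server
  wget 'news://news.example.org/comp.lang.c/1000-2000'
  # hypothetical: all articles since a given date
  wget 'news://news.example.org/comp.lang.c?since=2006-01-01'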
Hello.
Wget could not download the images of the page
http://www.fusionindustries.com/alex/combustion/index.html
The image URLs have %5C (an encoded backslash "\") in them.
http://www.fusionindustries.com/alex/combustion/small%5C0103%20edgepoint-pressure%20small.png
http://www.fusionindustries.com/ale
Hello.
I traced the URL given on the command line, and it looks like there
is no difference whether one gives "~" or "%7E". Is this true?
The URLs end up in url_parse(), which switches "~" (as unsafe) to
"%7E". If the original URL is not used at all, as it appears,
then there is no difference. But mysteri
Hello.
Recent mails have not been replied to and the CVS may be old.
Who are the developers of wget at the moment?
I just posted a couple of feature-loss reports, but my intent
is not to pour the tasks onto the current developers. However,
without anyone giving hints on what to look at, the features
may go
Hello.
When the URL
http://zeus.fri.uni-lj.si/%7Ealeks/POIS/Kolaborativno%20delo.htm
is downloaded with -np -r -l 0 etc., the file
http://zeus.fri.uni-lj.si/~aleks/POIS/Kolaborativno delo_files/filelist.xml
is downloaded correctly. However, the hrefs in the XML file are not
followed after that:
Hello.
Wget could not follow dynsrc attributes; the MPEG file was not
downloaded at
http://www.wideopenwest.com/~nkuzmenko7225/Collision.htm
Regards,
Juhana
Hello.
Why does wget generate the following index files?
Why so many index files?
ftp1.sourceforge.net/gut/index.html
ftp1.sourceforge.net/gut/index.html?C=M&O=A
ftp1.sourceforge.net/gut/index.html?C=M&O=D
ftp1.sourceforge.net/gut/index.html?C=N&O=A
ftp1.sourceforge.net/gut/index.html?C=N&O=
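Those index.html?C=...&O=... files come from the sorting links that the
server's auto-generated directory listing puts at the top of each column.
I do not know a clean way to drop them with this wget; later releases have
--reject-regex, which is matched against the whole URL including the query
string, so there something like this should work (a sketch; the URL is
reconstructed from the file names above):
  wget -r -l 0 -np --reject-regex '[?]C=' http://ftp1.sourceforge.net/gut/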
Hello.
Has the ~ / %7E bug always been in wget? When was it added to wget?
Who wrote the code?
I would like to suggest that the person who introduced this severe bug
should immediately fix it. It does not make sense that we waste
time trying to fix this bug if the person did not use any moment
Hello.
This bug with ~ and %7E is starting to be very, very annoying.
Somebody should check it immediately. The problem is that wget
interprets URLs such as
www.site.edu/~user/ and www.site.edu/%7Euser/
as different directory paths. With the -np option wget does not
get all pages.
It is unkno
Hello.
I have given some thought to how wget could be made better.
We would need a scripting system so that features can be programmed
more easily. One way to incorporate scripting into wget would
be to rewrite wget as a data-flow system, in much the same way that
OpenGL (www.opengl.org) is
Hello.
The file
http://www.cs.utah.edu/~gooch/JOT/index.html
is compressed, and wget could not follow the URLs in it.
What can be done? Should wget uncompress compressed *.htm
and *.html files? What about *.asp and *.php?
Juhana
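A manual workaround I would try (a sketch; that the page is gzip-compressed
and the file names used below are assumptions): fetch the page, uncompress
it by hand, and then let wget follow its links with --force-html and --base.
  wget -O index.html.gz http://www.cs.utah.edu/~gooch/JOT/index.html
  gunzip index.html.gz
  wget -p -F -B http://www.cs.utah.edu/~gooch/JOT/ -i index.html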
Hello.
The encoding of the "~" character causes problems in downloading.
Example:
wget -p -E -k --proxy=off -e robots=off --passive-ftp -q -r -l 0 -np
http://www.stanford.edu/~dattorro/
However, not everything was downloaded. The file "machines.html" has hrefs
http://www.stanford.edu/%7Edattorro/images/calloph.jp
Hello.
One wget problem this time. I downloaded all in
http://www.planetunreal.com/wod/tutorials/
but most of the files were not downloaded because the URLs are
in the file
http://www.planetunreal.com/wod/tutorials/sidebar.js
in the following format
FItem("Beginner's Guide to UnrealScript", "gu
Hello.
Problem: When downloading all in
http://udn.epicgames.com/Technical/MyFirstHUD
wget overwrites the downloaded MyFirstHUD file with the
MyFirstHUD directory (which comes later).
GNU Wget 1.9.1
wget -k --proxy=off -e robots=off --passive-ftp -q -r -l 0 -np -U Mozilla $@
Solution: Use of -E o
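Spelled out, the fix is to add -E (--html-extension in this wget version)
to the command, so the page is saved as MyFirstHUD.html and no longer
collides with the MyFirstHUD/ directory (a sketch):
  wget -k -E --proxy=off -e robots=off --passive-ftp -q -r -l 0 -np \
       -U Mozilla http://udn.epicgames.com/Technical/MyFirstHUD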
Hello. This is a report on some wget bugs. My wgetdir command looks
like the following (wget 1.9.1):
wget -k --proxy=off -e robots=off --passive-ftp -q -r -l 0 -np -U Mozilla $@
Bugs:
Command: "wgetdir http://www.directfb.org";.
Problem: In file "www.directfb.org/index.html" the hrefs of type
"/screen
Hello.
I downloaded
http://agar.csoft.org/index.html
with the -k option, but the URL
http://agar.csoft.org/man.cgi?query=widget&sektion=3
in the file was not converted to relative.
(The local filename is "man.cgi?query=widget&sektion=3".)
Regards,
Juhana
>From: Hrvoje Niksic
>
>This looks like the problem exhibited by older versions, which always
Hello. I got the latest 1.9.1 and the download is now working.
This list gets many requests for how to use wget. So, here is my
"wgetdir" command. I have used it for years and have added options to
it always w
>> --16:59:21-- http://www.maqamworld.com:80/
>> => `index.html'
>> Connecting to www.maqamworld.com:80... connected!
>
>It looks like you have http_proxy=80 in your wgetrc file.
I placed "use_proxy = off" to .wgetrc (which file I did not have earlier)
and to "~/wget/etc/wgetrc"
Hello.
What goes wrong in the following? (I will read replies from the list
archives.)
% wget http://www.maqamworld.com/
--16:59:21-- http://www.maqamworld.com:80/
=> `index.html'
Connecting to www.maqamworld.com:80... connected!
HTTP request sent, awaiting response... 503
Hello.
I was not able to download the following URLs. I will read
possible replies from the mail archives, as I'm not subscribed.
[By the way, the Mailman list server with its web interface makes
temporary subscriptions very easy. Digest mode can be turned on
at subscription time. It takes onl