forum download, cookies?

2007-09-13 Thread Juhana Sadeharju
A forum has topics that are available only to members. How can I use wget to download a copy of the pages in that case? How do I get the proper cookies, and how do I get wget to use them correctly? I use IE on PC/Windows and wget on a Unix computer. I could use Lynx on the Unix computer if needed. (P

Bug in 1.10.2 vs 1.9.1

2006-12-03 Thread Juhana Sadeharju
Hello. Wget 1.10.2 has the following bug compared to version 1.9.1. First, the bin/wgetdir is defined as wget -p -E -k --proxy=off -e robots=off --passive-ftp -o zlogwget`date +%Y%m%d%H%M%S` -r -l 0 -np -U Mozilla --tries=50 --waitretry=10 $@ The download command is wgetdir http://udn.epi

news protocol?

2006-08-04 Thread Juhana Sadeharju
Hello. The TODO lists the following: * Add more protocols (e.g. gopher and news), implementing them in a modular fashion. Do you mean the NNTP protocol? If yes, I recently wrote an NNTP downloader: http://www.funet.fi/~kouhia/nntppull20060409.tar.gz I find it good for news archiving. I now arch

wget server?

2006-08-04 Thread Juhana Sadeharju
Hello. The following problem occurred recently. I started downloading everything under the directory http://site.edu/projects/software/ Then after a day I found that the subdirectory http://site.edu/projects/software/program/manual/ had a wiki with millions of files. Because I wished that the download con

url accept/reject? accept scripts

2006-08-04 Thread Juhana Sadeharju
Hello. How do I get wget to ignore URLs containing one of the following strings? Surprisingly, --help did not reveal a suitable option. action= printable= redirect= article= returnto= title= I would like to remind you of the problems with the existing options: (1) I downloaded an ftp sit
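The kind of filtering asked for here can be sketched outside wget. The following is a minimal illustration in Python, not a wget feature: it drops any URL that contains one of the strings listed in the message. The function names and the sample URLs are hypothetical and exist only for this example.

```python
# Substrings taken from the message above; a URL containing any of them is skipped.
REJECT_SUBSTRINGS = (
    "action=", "printable=", "redirect=",
    "article=", "returnto=", "title=",
)


def is_rejected(url, needles=REJECT_SUBSTRINGS):
    """Return True if the URL contains at least one of the rejected substrings."""
    return any(needle in url for needle in needles)


def filter_urls(urls, needles=REJECT_SUBSTRINGS):
    """Keep only the URLs that contain none of the rejected substrings."""
    return [url for url in urls if not is_rejected(url, needles)]


if __name__ == "__main__":
    # Hypothetical sample URLs, for illustration only.
    sample = [
        "http://site.edu/wiki/Main_Page",
        "http://site.edu/wiki/index.php?title=Main_Page&action=edit",
        "http://site.edu/wiki/index.php?printable=yes",
    ]
    for url in filter_urls(sample):
        print(url)
```

This is plain substring matching, as the question describes, so a parameter such as "subtitle=" would also be rejected because it contains "title=". A stricter filter would parse the query string and compare parameter names instead.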

accepted and excluded?

2006-02-10 Thread Juhana Sadeharju
Hello. How would I type the -A option if I want both .pdf and .PDF files from an ftp site? "-A pdf,PDF" failed -- only PDF files were downloaded. How would I type the -X option if I want multiple subdirectories excluded? "-X dir1,dir2" failed -- only one of the given dirs was excluded. (E.g. www.site
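The behavior wanted for the accept list, matching .pdf and .PDF alike, amounts to a case-insensitive suffix comparison. The sketch below shows that comparison in Python under that assumption. It is not how wget implements -A, and the function name accepts is hypothetical.

```python
def accepts(filename, suffixes):
    """Return True if filename ends with one of the suffixes, ignoring case.

    Suffixes may be given with or without a leading dot ("pdf" or ".pdf").
    """
    name = filename.lower()
    return any(
        name.endswith("." + suffix.lower().lstrip("."))
        for suffix in suffixes
    )


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for candidate in ("report.pdf", "REPORT.PDF", "notes.txt"):
        print(candidate, accepts(candidate, ["pdf"]))
```

With a comparison like this, a single entry "pdf" covers both spellings, so the accept list does not need one entry per case variant.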

wget with a log database?

2005-11-30 Thread Juhana Sadeharju
Hello. I would like to have a database within wget. The database would let wget know what it has downloaded earlier. Wget could download only new and changed files, and could continue the download without having the old downloads on my disk. The database would also be accessed by other program

wget problem

2005-04-04 Thread Juhana Sadeharju
Hello. The following document could not be downloaded at all: http://www.greyc.ensicaen.fr/~dtschump/greycstoration/ If you succeed, please tell me how. I want all the HTML files and the images. Juhana -- http://music.columbia.edu/mailman/listinfo/linux-graphics-dev for developers of open

Re: Help Needed

2004-11-02 Thread Juhana Sadeharju
Hello. Does wget have NNTP (Usenet newsgroups) support? For example, I might want to download all articles between numbers M and N. A date-based system could be useful too. We just need to agree on how these queries are represented to wget. I can dig out an old Usenet news downloader code if wget does

char 5C problem

2004-11-01 Thread Juhana Sadeharju
Hello. Wget could not download the images of the page http://www.fusionindustries.com/alex/combustion/index.html The image urls have %5C (backslash "\") in them. http://www.fusionindustries.com/alex/combustion/small%5C0103%20edgepoint-pressure%20small.png http://www.fusionindustries.com/ale

on tilde bug

2004-11-01 Thread Juhana Sadeharju
Hello. I traced the URL given on the command line, and it looks like it makes no difference whether one gives "~" or "%7E". Is this true? The URLs end up in url_parse(), which switches "~" (as unsafe) to "%7E". If the original URL is not used at all, as it appears, then there is no difference. But mysteri

Developers here?

2004-10-16 Thread Juhana Sadeharju
Hello. Recent mails have not been replied to, and CVS may be old. Who are the developers of wget at the moment? I just posted a couple of featureloss reports, but my intent is not to dump the tasks on the current developers. However, without anyone giving hints on what to look at, the features may go

xml files not processed?

2004-10-16 Thread Juhana Sadeharju
Hello. When the url http://zeus.fri.uni-lj.si/%7Ealeks/POIS/Kolaborativno%20delo.htm is downloaded with -np -r -l 0 etc., the file http://zeus.fri.uni-lj.si/~aleks/POIS/Kolaborativno delo_files/filelist.xml is downloaded correctly. However, the hrefs in the xml file are not then followed:

img dynsrc not downloaded?

2004-10-16 Thread Juhana Sadeharju
Hello. Wget could not follow the dynsrc attribute of an img tag; the MPEG file was not downloaded at http://www.wideopenwest.com/~nkuzmenko7225/Collision.htm Regards, Juhana

Directory indecies?

2004-10-16 Thread Juhana Sadeharju
Hello. Why does wget generate the following index files? Why so many index files? ftp1.sourceforge.net/gut/index.html ftp1.sourceforge.net/gut/index.html?C=M&O=A ftp1.sourceforge.net/gut/index.html?C=M&O=D ftp1.sourceforge.net/gut/index.html?C=N&O=A ftp1.sourceforge.net/gut/index.html?C=N&O=

Tilde bug again

2004-10-16 Thread Juhana Sadeharju
Hello. Has the ~ / %7E bug always been in wget? When was it added to wget? Who wrote the code? I would like to suggest that the person who introduced this severe bug should fix it immediately. It does not make sense that we waste time in trying to fix this bug if the person did not use any moment

tilde bug??!!

2004-10-07 Thread Juhana Sadeharju
Hello. This bug with ~ and %7E is starting to be very, very annoying. Somebody should check it immediately. The problem is that wget interprets URLs such as www.site.edu/~user/ and www.site.edu/%7Euser/ as different directory paths. With the -np option wget does not get all pages. It is unkno
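The mismatch described here goes away if both spellings are mapped to one form before paths are compared. The following Python sketch shows the idea. It is an illustration, not wget's code: the names normalize_tilde and is_below are hypothetical, and the example file name is made up. Only the %7E escape is rewritten, so no other part of the URL changes meaning.

```python
import re
from urllib.parse import urlsplit


def normalize_tilde(url):
    """Rewrite %7E (or %7e) in the path to "~" and leave everything else untouched."""
    parts = urlsplit(url)
    path = re.sub(r"%7[Ee]", "~", parts.path)
    return parts._replace(path=path).geturl()


def is_below(url, parent):
    """Rough stand-in for a no-parent check: is url located under parent?

    Both URLs are normalized first, so "~user" and "%7Euser" compare equal.
    """
    return normalize_tilde(url).startswith(normalize_tilde(parent))


if __name__ == "__main__":
    parent = "http://www.site.edu/~user/"
    link = "http://www.site.edu/%7Euser/images/example.jpg"  # hypothetical link
    print(normalize_tilde(link))
    print(is_below(link, parent))
```

Without the normalization step, a plain prefix comparison of these two URLs fails, which matches the symptom of pages being skipped under -np.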

wget scripting?

2004-10-04 Thread Juhana Sadeharju
Hello. I have thought a little about how wget could be made better. We would need a scripting system so that features can be programmed more easily. One way to incorporate scripting into wget would be to rewrite wget as a data flow system, in much the same way as OpenGL (www.opengl.org) is

compressed html files?

2004-09-23 Thread Juhana Sadeharju
Hello. The file http://www.cs.utah.edu/~gooch/JOT/index.html is compressed and wget could not follow the urls in it. What can be done? Should wget uncompress the compressed *.htm and *.html files? *.asp, *.php?? Juhana

Character coding gives problems

2004-08-20 Thread Juhana Sadeharju
Hello. The character coding of "~" causes problems when downloading. Example: wget -p -E -k --proxy=off -e robots=off --passive-ftp -q -r -l 0 -np http://www.stanford.edu/~dattorro/ However, not everything was downloaded. The file "machines.html" has hrefs http://www.stanford.edu/%7Edattorro/images/calloph.jp

wget problem: urls behind script

2004-04-16 Thread Juhana Sadeharju
Hello. One wget problem this time. I downloaded everything in http://www.planetunreal.com/wod/tutorials/ but most of the files were not downloaded because the URLs are in the file http://www.planetunreal.com/wod/tutorials/sidebar.js in the following format FItem("Beginner's Guide to UnrealScript", "gu

wget bug: directory overwrite

2004-04-05 Thread Juhana Sadeharju
Hello. Problem: When downloading everything in http://udn.epicgames.com/Technical/MyFirstHUD wget overwrites the downloaded MyFirstHUD file with the MyFirstHUD directory (which comes later). GNU Wget 1.9.1 wget -k --proxy=off -e robots=off --passive-ftp -q -r -l 0 -np -U Mozilla $@ Solution: Use of -E o

Bug report

2004-03-24 Thread Juhana Sadeharju
Hello. This is a report on some wget bugs. My wgetdir command looks like the following (wget 1.9.1): wget -k --proxy=off -e robots=off --passive-ftp -q -r -l 0 -np -U Mozilla $@ Bugs: Command: "wgetdir http://www.directfb.org". Problem: In file "www.directfb.org/index.html" the hrefs of type "/screen

will mime coding make the site different?

2004-03-13 Thread Juhana Sadeharju
Hello. I downloaded http://agar.csoft.org/index.html with -k option, but the URL http://agar.csoft.org/man.cgi?query=widget&sektion=3 in the file was not converted to relative. (The local filename is "man.cgi?query=widget&sektion=3".) Regards, Juhana

Re: not downloading at all, help

2004-02-13 Thread Juhana Sadeharju
>From: Hrvoje Niksic > >This looks like the problem exhibited by older versions, which always Hello. Got the latest 1.9.1 and the download is now working. This list gets many requests for how to use wget. So, here is my "wgetdir" command. I have used it for years and have added options to it always w

Re: not downloading at all, help

2004-02-12 Thread Juhana Sadeharju
>> --16:59:21-- http://www.maqamworld.com:80/ >> => `index.html' >> Connecting to www.maqamworld.com:80... connected! > >It looks like you have http_proxy=80 in your wgetrc file. I put "use_proxy = off" in .wgetrc (a file I did not have earlier) and in "~/wget/etc/wgetrc"

not downloading at all, help

2004-02-11 Thread Juhana Sadeharju
Hello. What goes wrong in the following? (I will read replies from the list archives.) % wget http://www.maqamworld.com/ --16:59:21-- http://www.maqamworld.com:80/ => `index.html' Connecting to www.maqamworld.com:80... connected! HTTP request sent, awaiting response... 503

downloads failed

2003-12-23 Thread Juhana Sadeharju
Hello. I was not able to download the following URLs. I will read possible replies from the mail archives as I'm not subscribed. [By the way, Mailman list server with webpage interface makes temporary subscriptions very easy. Digest mode can be turned on at subscription time. It takes onl