Re: Another case that cause wget to crash

2002-07-22 Thread Hack Kampbjørn

Xuehua Shen wrote:
 
 Hi,there,
Another Crash case of wget.

Note that the latest wget version (1.8.2) doesn't segfault on this.

   when I use wget http://www.usmint.gov/what.cfm.

In the future, when reporting problems please include the --debug output
of the wget command.

   Resolving www.usmint.gov... done
   Connecting to
 www.usmint.gov[208.45.143.104]:80... connected
   Location: http:\\catalog.usmint.gov [following]
   http:\\catalog.usmint.gov: Unsupported scheme.

Wget nicely informs you that it does not support the http:\\ scheme (or
is it http:).
   Segmentation fault(core dump).

Wget shouldn't segfault on this, and as said before this is fixed in
wget 1.8.2.
 
 I think there are some problems when wget deals with
 the redirection.

More likely the problem is the redirection itself: if the webmaster intended
to redirect to http://catalog.usmint.gov (using a scheme (http://)
supported by wget) it should say so, instead of inventing a new scheme
not supported by any web clients.

If you would like to see support for this new scheme, please provide
links to RFCs and references to software already implementing it. Of
course patches will be considered 8-)
 
 Regards.
 
 Xuehua

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: pftp mode for wget?

2002-07-08 Thread Hack Kampbjørn

Joshua N Pritikin wrote:
 
 Does wget support passive ftp (pftp)?  i have wget 1.8.1-4 (debian i386).

Then look at --passive-ftp:
$ wget --help
GNU Wget 1.8.1, a non-interactive network retriever.
[...]
FTP options:
  -nr, --dont-remove-listing   don't remove `.listing' files.
  -g,  --glob=on/off   turn file name globbing on or off.
   --passive-ftp   use the passive transfer mode.
   --retr-symlinks when recursing, get linked-to files (not
dirs).
[...]
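
For example (a sketch; the host and path are made up), either pass the
option on the command line:

$ wget --passive-ftp ftp://ftp.example.org/pub/somefile.tar.gz

or put the line `passive_ftp = on' in your ~/.wgetrc so it is always used.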
 
 --
 Victory to the Divine Mother!! after all,
   http://sahajayoga.org  http://why-compete.org

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: dynamic calendar pages

2002-07-06 Thread Hack Kampbjørn


Stan Reeves wrote:
 
 I'm having trouble when recursively downloading sites with dynamic calendar
 pages.  It can take *forever* to get through several levels before hitting
 the recursion level limit.  I can reject pages with a .pl or .asp extension,
 but they're apparently still downloaded and scanned for links before being
 removed.  Is there a solution to this?  I'm using v. 1.8.1.

This is by design. The reject/accept options are applied after a text/html
page has been scanned for links, which, judging by the complaints on this
mailing list, is not what most people expect. And I don't think there's any
way to work around it.
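
For example (a sketch; the URL is made up), even with

$ wget -r -l5 -R '.pl,.asp' http://www.example.edu/calendar/

the .pl and .asp pages are still fetched and scanned for links; the
rejection only removes the files afterwards.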
 
 Stan Reeves
 Electrical and Computer Engineering Dept.
 Auburn University
 Auburn, AL  36849
 [EMAIL PROTECTED]
 http://www.eng.auburn.edu/~sjreeves

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: wget and meta name=robots content=noindex,nofollow

2002-07-06 Thread Hack Kampbjørn

Cédric Rosa wrote:
 
 Hello,
 
 Is it normal that wget saves web pages which contain <meta name="robots"
 content="noindex"> ?
 Or does wget consider that it is not a search engine and respect only
 the follow/nofollow rules ?
 Or is-it a bug ? :)

I don't think wget supports <meta name="robots"> tags. Robot support was
added to wget long before these tags were proposed.
 
 Thanks.
 
 Cedric.

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: hi

2002-07-06 Thread Hack Kampbjørn

 mp3TEAM wrote:
 
 My question is stupid, but i need HELP. When i connect with telnet to
 my server and i try to grab some link with WGET, and the link has
 characters like ( or ' in it, I get "syntax error near unexpected token".
 How to find a SUBSTITUTE for these symbols?

These characters have special meaning for your shell (it's a good
idea to become familiar with them). You can usually protect them by
either quoting the whole URL, e.g. 'http://...', or by putting a backslash
before each special character, e.g. http://host/cgi\?id=\'test\'
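
For example (a sketch; host and query are made up), these two invocations
pass the same URL to wget:

$ wget 'http://host/cgi?id=(test)&x=1'
$ wget http://host/cgi\?id=\(test\)\&x=1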
 
 I do not speak ENGLISH very well PLS excuse me !!
 
 
 
 http://www.MP3-BG.com
 
 

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: Honesly, wget as a webcrawler?

2002-06-24 Thread Hack Kampbjørn

 Jason Davis wrote:
 
 I'm trying to find the most efficient solution for mirroring,
 spidering and/or crawling (however I need to put this)
 of hundreds of thousands of websites, a solution that can handle
 literally millions of files.
 I've read that wget gets delayed on incremental mirroring of huge sites
 and I wonder if that's true.
 if so, can fwget (http://bay4.de/FWget/) be a solution? or is there a
 totally different place I should look into?

As the page says, FWget is Wget with hash tables. Since version 1.7 wget
has used hash tables internally.

Kalium: do you have anything to add to this? If not, would you mind adding
a note about wget now using hash tables internally?

 
 I appreciate your help and would love to hear any tip!

Some things to think about are:
- can you install software on the server, e.g. rsync (see the sketch after
this list)
- does the server offer the same files via a service better suited for
mirroring than HTTP
- do you access different webservers (wget only uses one connection)
- are the servers load balanced
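
For the rsync case, a sketch (the host and module name are made up):

$ rsync -av --delete rsync://ftp.example.org/www-mirror/ local-mirror/

An incremental run only transfers the files that have changed, which is
exactly where HTTP mirroring gets slow.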

 
 please keep me CC:d on the replies as I wasn't able to subscribe
 myself..
 
 Thanks!
 
 

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: Bug with wget ? I need help.

2002-06-21 Thread Hack Kampbjørn

Cédric Rosa wrote:
 
 Hello,
 
 First, scuse my english but I'm french.
 
 When I try with wget (v 1.8.1) to download an url which is behind a router,
 the software wait for ever even if I've specified a timeout.
 
 With ethereal, I've seen that there is no response from the server (ACK
 never appears).
 
This is a documented behavior: because of implementation issues the timeout
does not cover the connection itself, only the responses after a connection
has been established. In version 1.9 the timeout option will also cover the
connection.

http://cvs.sunsite.dk/viewcvs.cgi/*checkout*/wget/NEWS?rev=HEAD&content-type=text/plain

 Here is the debug output:
 rosa@r1:~/htmlparser1.1/lib$ wget www.sosi.cnrs.fr
 --16:30:54-- http://www.sosi.cnrs.fr/
 = `index.html'
 Resolving www.sosi.cnrs.fr... done.
 Connecting to www.sosi.cnrs.fr[193.55.87.37]:80...
 
 Thanks by advance for your help.
 Cedric Rosa.

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: interesting bug

2002-06-09 Thread Hack Kampbjørn

[EMAIL PROTECTED] wrote:
 
 I was using wget to suck a website, and found an interesting problem
 some of the URLs it found contained a question mark, after which it
 responded with cannot write to '... insert file/URL here?more
 text  ...'  (invalid argument).
 
 And - it didn't save any of those URLs to files (on my NTFS/windows XP
 machine) ...

It may also have said Illegal filename. Note that not all characters
are allowed in Windows filenames, among them '?'. As '?' is quite common
in data-driven web sites, most Windows binaries have included a patch to
deal with it.

The latest wget release, 1.8.2, now includes such a patch. But the rest of
the illegal characters are not dealt with, nor are other special Windows
features.
 
 what can I do in order to spider/crawl these pages and save them to my
 local disk ?

Use wget version 1.8.2
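
If upgrading is not an option, you can also work around it case by case
with -O to pick a safe local filename yourself (a sketch; the URL is made
up):

$ wget -O page.html "http://host/cgi?id=42&mode=list"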
 
 Alex

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: HTTP /1.1 500 Internal Server Error

2002-06-02 Thread Hack Kampbjørn

Mark Bucciarelli wrote:
 
 I am having trouble wgetting a samsung printer driver from their site.  Every
 time I try, I immediately get an HTTP/1.1 500 Internal Server Error.   The
 web browser initiates the download properly when I click on the link from the
 referer page.
 
 Here is the command I am running (I don't have a .wgetrc):
 
 wget --debug
 
--referer="http://www.samsungelectronics.com/printer/support/downloads/400329_844_file4.html"
 
"http://211.45.27.253/servlet/Downloader?path=%2Fprinter%2Fsupport%2Fdownloads%2Fattach_file%2F20020516175051spp-1.0.2.i386.tar.gz&amp;realname=spp-1.0.2.i386.tar.gz"
 
 and here is the debug output:
debug output skipped/

This seems to be yet another encoding problem. I have no problem if I
change the '&amp;' to '&'. IIRC URLs found in an HTML page should be HTML
decoded. A simple test (wget -F -i URL.html) shows that wget does this.
But I'm not sure wget should do it for URLs on the command line or in a
non-HTML file. In the past we had a lot of problems with wget being
overzealous in {en|de}coding URLs.

$ wget
"http://211.45.27.253/servlet/Downloader?path=%2Fprinter%2Fsupport%2Fdownloads%2Fattach_file%2F20020516175051spp-1.0.2.i386.tar.gz&realname=spp-1.0.2.i386.tar.gz"
--15:20:35-- 
http://211.45.27.253/servlet/Downloader?path=%2Fprinter%2Fsupport%2Fdownloads%2Fattach_file%2F20020516175051spp-1.0.2.i386.tar.gz&realname=spp-1.0.2.i386.tar.gz
   =>
`Downloader@path=%2Fprinter%2Fsupport%2Fdownloads%2Fattach_file%2F20020516175051spp-1.0.2.i386.tar.gz&realname=spp-1.0.2.i386.tar.gz'
Connecting to 211.45.27.253:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 28,864,218 [application/octet-stream]
Last-modified header missing -- time-stamps turned off.
--15:20:36-- 
http://211.45.27.253/servlet/Downloader?path=%2Fprinter%2Fsupport%2Fdownloads%2Fattach_file%2F20020516175051spp-1.0.2.i386.tar.gz&realname=spp-1.0.2.i386.tar.gz
   =>
`Downloader@path=%2Fprinter%2Fsupport%2Fdownloads%2Fattach_file%2F20020516175051spp-1.0.2.i386.tar.gz&realname=spp-1.0.2.i386.tar.gz'
Connecting to 211.45.27.253:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/octet-stream]

[ = ] 1,257,472 25.53K/s

 Thanks for a great tool!

And thank you for reading the instructions and actually including debug
output !

 
 Mark

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: 1.8.2 branch opened

2002-05-23 Thread Hack Kampbjørn

Hrvoje Niksic wrote:
 
 Since we need to have a release because of the OpenSSL legalese, we
 can as well fix the most important (crashing) bugs in 1.8.1.  I have
 opened a branch named `branch-1_8_2' where the 1.8.2-specific changes
 will be applied.
 
 Note that only bug fixes will be accepted for 1.8.2.  No new features.
 Here are the patches that I plan to apply initially.  Please let me
 know if you have more.

It seems you missed one of the || SCHEME_HTTPS patches:

Date: Mon, 11 Feb 2002 21:24:44 +0100
From: Christian Lackas [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: patch: recursive downloading for https
Message-ID: [EMAIL PROTECTED]

2002-02-11  Christian Lackas [EMAIL PROTECTED]

* recursive downloading for https fixed

Index: src/recur.c
===================================================================
RCS file: /pack/anoncvs/wget/src/recur.c,v
retrieving revision 1.41
diff -u -r1.41 recur.c
--- src/recur.c 2001/12/19 14:27:29 1.41
+++ src/recur.c 2002/02/11 20:15:54
@@ -438,6 +438,9 @@
 
   /* 1. Schemes other than HTTP are normally not recursed into. */
   if (u->scheme != SCHEME_HTTP
+#ifdef HAVE_SSL
+      && u->scheme != SCHEME_HTTPS
+#endif
       && !(u->scheme == SCHEME_FTP && opt.follow_ftp))
     {
       DEBUGP (("Not following non-HTTP schemes.\n"));
@@ -446,7 +449,11 @@
 
   /* 2. If it is an absolute link and they are not followed, throw it
      out.  */
-  if (u->scheme == SCHEME_HTTP)
+  if (u->scheme == SCHEME_HTTP
+#ifdef HAVE_SSL
+      || u->scheme == SCHEME_HTTPS
+#endif
+      )
     if (opt.relative_only && !upos->link_relative_p)
       {
        DEBUGP (("It doesn't really look like a relative link.\n"));
@@ -534,7 +541,12 @@
       }
 
   /* 8. */
-  if (opt.use_robots && u->scheme == SCHEME_HTTP)
+  if (opt.use_robots && (u->scheme == SCHEME_HTTP
+#ifdef HAVE_SSL
+                         || u->scheme == SCHEME_HTTPS
+#endif
+                        )
+     )
     {
       struct robot_specs *specs = res_get_specs (u->host, u->port);
       if (!specs)


-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: OK, time to moderate this list

2002-03-22 Thread Hack Kampbjørn

Ian Abbott wrote:
 
 On 22 Mar 2002 at 4:08, Hrvoje Niksic wrote:
 
  The suggestion of having more than one admin is good, as long as there
  are people who volunteer to do it besides me.
 
 I'd volunteer too, but don't want to be the only person moderating
 the lists for the same reasons as yourself. (I'm also completely
 clueless about the process of moderating mailing lists at the
 moment!)

I'll volunteer too, so now we have 4 moderators but all based in Europe
(If I've counted right: de, se, uk and dk). A couple of moderators from
other timezones (like America, Asia or Australia) would be nice.
 
  I also have to check with the sunsite.dk people whether the ML
  manager, ezmlm, can handle this.
 
 If it only handles a single moderator account, perhaps a secure
 web-based email account could be set up for moderation purposes
 which the real moderators could log into on a regular basis.

Now that we are talking about changing the ml configuration, some other
things I would like to have changed too:
- The current setup removes the original mail's headers. Change it so
that the original headers are preserved.
- Add a header with the receiver's subscribed email so {s,}he can
unsubscribe {her,him}self. The cygwin ml also running ezmlm adds this:
  List-Unsubscribe:
mailto:[EMAIL PROTECTED]
  List-Subscribe: mailto:[EMAIL PROTECTED]
  List-Archive: http://sources.redhat.com/ml/cygwin/
  List-Post: mailto:[EMAIL PROTECTED]
  List-Help: mailto:[EMAIL PROTECTED],
http://sources.redhat.com/ml/#faqs

If we switch to mailman can it be configured to not send a password
reminder every month? I unsubscribed from a really low-traffic list on
sunsite.dk just because of this.

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: wget core dump with recursive file transfer

2002-02-16 Thread Hack Kampbjørn

Paul Eggert wrote:
 
 (I built wget on Solaris 8 with GCC 3.0.3.)
 
 Here are the symptoms of the problem.
 
 184-shade $ wget --recursive file:///
 Segmentation Fault (core dumped)

Note that file:// is not supported.
$ wget -d file://
DEBUG output created by Wget 1.8.1 on cygwin.

file://: Unsupported scheme.

But the core dump is not limited to Solaris when combining --recursive
with file://
$ wget -d --recursive file://
DEBUG output created by Wget 1.8.1 on cygwin.

Segmentation fault (core dumped)

And this is not fixed in the current CVS code
$ wget-dev -d --recursive file://
DEBUG output created by Wget 1.8.1+cvs on cygwin.

Segmentation fault (core dumped)

A patch like this (would one of the C coders on the list check it?) seems
to fix it. I suppose the FINISHED and Downloaded part should be removed
too, to make it more clear that it ended with an error.

$ ./wget-dev.exe -d --recursive file://
DEBUG output created by Wget 1.8.1+cvs on cygwin.

file://: Unsupported scheme.

FINISHED --01:12:30--
Downloaded: 0 bytes in 0 files

Index: src/recur.c
===================================================================
RCS file: /pack/anoncvs/wget/src/recur.c,v
retrieving revision 1.41
diff -u -r1.41 recur.c
--- src/recur.c 2001/12/19 14:27:29 1.41
+++ src/recur.c 2002/02/17 00:15:25
@@ -184,6 +184,7 @@
 retrieve_tree (const char *start_url)
 {
   uerr_t status = RETROK;
+  int url_error_code;		/* url parse error code */
 
   /* The queue of URLs we need to load. */
   struct url_queue *queue = url_queue_new ();
@@ -194,7 +195,14 @@
 
   /* We'll need various components of this, so better get it over with
      now. */
-  struct url *start_url_parsed = url_parse (start_url, NULL);
+  struct url *start_url_parsed = url_parse (start_url, &url_error_code);
+  if (!start_url_parsed)
+    {
+      logprintf (LOG_NOTQUIET, "%s: %s.\n", start_url, url_error (url_error_code));
+      xfree (start_url);
+      return URLERROR;
+    }
+
 
   /* Enqueue the starting URL.  Use start_url_parsed->url rather than
      just URL so we enqueue the canonical form of the URL.  */

 185-shade $ wget --version
 GNU Wget 1.8.1
 
 Copyright (C) 1995, 1996, 1997, 1998, 2000, 2001 Free Software Foundation, Inc.
 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU General Public License for more details.
 
 Originally written by Hrvoje Niksic [EMAIL PROTECTED].
 186-shade $ uname -a
 SunOS shade.twinsun.com 5.8 Generic_108528-13 sun4u sparc SUNW,Ultra-1
 187-shade $ gdb /opt/reb/bin/wget core
 GNU gdb 5.1.1
 Copyright 2002 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain conditions.
 Type show copying to see the conditions.
 There is absolutely no warranty for GDB.  Type show warranty for details.
 This GDB was configured as sparc-sun-solaris2.8...
 Core was generated by `wget --recursive file:///'.
 Program terminated with signal 11, Segmentation fault.
 Reading symbols from /usr/lib/libmd5.so.1...done.
 Loaded symbols for /usr/lib/libmd5.so.1
 Reading symbols from /opt/reb/lib/libssl.so.0.9.6...done.
 Loaded symbols for /opt/reb/lib/libssl.so.0.9.6
 Reading symbols from /opt/reb/lib/libcrypto.so.0.9.6...done.
 Loaded symbols for /opt/reb/lib/libcrypto.so.0.9.6
 Reading symbols from /usr/lib/libdl.so.1...done.
 Loaded symbols for /usr/lib/libdl.so.1
 Reading symbols from /usr/lib/libsocket.so.1...done.
 Loaded symbols for /usr/lib/libsocket.so.1
 Reading symbols from /usr/lib/libnsl.so.1...done.
 Loaded symbols for /usr/lib/libnsl.so.1
 Reading symbols from /usr/lib/libc.so.1...done.
 Loaded symbols for /usr/lib/libc.so.1
 Reading symbols from /usr/lib/libmp.so.2...done.
 Loaded symbols for /usr/lib/libmp.so.2
 Reading symbols from /usr/platform/SUNW,Ultra-1/lib/libc_psr.so.1...done.
 Loaded symbols for /usr/platform/SUNW,Ultra-1/lib/libc_psr.so.1
 #0  0x0002a698 in retrieve_tree (start_url=0x4fb98 file:///) at recur.c:201
 201 recur.c: No such file or directory.
 in recur.c
 (gdb) where
 #0  0x0002a698 in retrieve_tree (start_url=0x4fb98 file:///) at recur.c:201
 #1  0x0002832c in main (argc=-4264136, argv=0xffbef054) at main.c:812

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: wget crash

2002-02-15 Thread Hack Kampbjørn

Steven Enderle wrote:
 
 short and dirty (in german):
 
 Größen stimmen nicht überein (lokal 6968552) -- erneuter Download.
 (sizes do not match) ... (retrieving)
 
 --00:08:03--
 ftp://ftp.scene.org/pub/music/artists/nutcase/mp3/timeofourlives.mp3
 =
 `ftp.scene.org/pub/music/artists/nutcase/mp3/timeofourlives.mp3'
 == CWD nicht erforderlich.
 == PORT ... fertig.== REST 6968552 ... fertig.
 == RETR timeofourlives.mp3 ... fertig.
 Länge: 5,574,867 [noch -1,393,685]


The already downloaded file (6,968,552 bytes) is bigger than the file
to be downloaded (5,574,867 bytes).

 
 assertion percentage <= 100 failed: file progress.c, line 552
 zsh: abort (core dumped)  wget -m -c --tries=0
 ftp://ftp.scene.org/pub/music/artists/nutcase/mp3/timeofourlives.mp3

progress.c
  int percentage = (int)(100.0 * size / bp->total_length);

  assert (percentage <= 100);
Of course the assert will fail, size is bigger than total_length !

 
 hope this helps in any way.

Yes, it did after I actually read it 8-)

To reproduce with wget-1.8.1
$ wget ftp://sunsite.dk/disk1/gnu/wget/wget-1.8{,.1}.tar.gz
$ cat wget-1.8.tar.gz >> wget-1.8.1.tar.gz
$ wget -d -c ftp://sunsite.dk/disk1/gnu/wget/wget-1.8.1.tar.gz

DEBUG output created by Wget 1.8.1 on cygwin.

Using `.listing' as listing tmp file.
--13:48:44--  ftp://sunsite.dk/disk1/gnu/wget/wget-1.8.1.tar.gz
   = `.listing'
Resolving sunsite.dk... done.
Caching sunsite.dk = 130.225.247.90
Connecting to sunsite.dk[130.225.247.90]:21... connected.
Created socket 3.
Releasing 0x100b07b8 (new refcount 1).
Logging in as anonymous ... 220 ProFTPD 1.2.4 Server (SunSITE Denmark
FTP-Server) [sunsite-int.sunsite.dk]

-- USER anonymous

331 Anonymous login ok, send your complete email address as your
password.

-- PASS -wget@

230-
Welcome to SunSITE.dk
 
 SunSITE.dk is located at Aalborg University, Denmark. It is a Sun
 Enterprise E3500 Server with 2 400MHz UltraSPARC-II CPUs, 2 GB Memory
 and 563 GB raw storage capacity.
 
 The server was kindly donated by Sun Microsystems. Aalborg University,
 SuSE GmbH, 3Com Nordic, Silcon Group, CLARiiON, FourLeaf Technologies
 and Infoseek are sponsoring the project.
 
More information on SunSITE.dk can be found at
 
  http://sunsite.dk/SunSITE/
 
 Note that if ftp hangs or dies, try putting a hyphen at the start of
 your password. All transfers are logged and any misuse will be acted
 upon.
 
 Please email suggestions and questions to [EMAIL PROTECTED]
 
230 Anonymous access granted, restrictions apply.
Logged in!
== SYST ... 
-- SYST

215 UNIX Type: L8
done.== PWD ... 
-- PWD
257 / is current directory.
done.
== TYPE I ... 
-- TYPE I

200 Type set to I.
done.  changing working directory
Prepended initial PWD to relative path:
  old: 'disk1/gnu/wget'
  new: '/disk1/gnu/wget'
== CWD /disk1/gnu/wget ... 
-- CWD /disk1/gnu/wget

250 CWD command successful.
done.
== PORT ... Master socket fd 4 bound.

-- PORT 192,168,1,131,13,205

200 PORT command successful.
done.== LIST ... 
-- LIST

150 Opening ASCII mode data connection for file list.
done.
Created socket fd 5.

[ = ] 514   
3.35K/s 

Closing fd 5
Closing fd 4
226 Transfer complete.
13:48:45 (3.35 KB/s) - `.listing' saved [514]

PLAINFILE; perms 644; month: Sep; day: 23; year: 1998 (no tm); 
PLAINFILE; perms 644; month: Dec; day: 31; year: 2000 (no tm); 
PLAINFILE; perms 644; month: Dec; day: 31; year: 2000 (no tm); 
PLAINFILE; perms 644; month: Nov; day: 18; time: 15:43:00 (no yr); 
PLAINFILE; perms 644; month: Jun; day: 4; year: 2001 (no tm); 
PLAINFILE; perms 644; month: Dec; day: 25; time: 21:04:00 (no yr); 
PLAINFILE; perms 644; month: Dec; day: 10; time: 08:00:00 (no yr); 
Removed `.listing'.
The sizes do not match (local 2185627) -- retrieving.

--13:48:45--  ftp://sunsite.dk/disk1/gnu/wget/wget-1.8.1.tar.gz
   = `wget-1.8.1.tar.gz'
== CWD not required.
== PORT ... Master socket fd 4 bound.

-- PORT 192,168,1,131,13,224

200 PORT command successful.
done.== REST 2185627 ... 
-- REST 2185627

350 Restarting at 2185627. Send STORE or RETRIEVE to initiate transfer.
done.
== RETR wget-1.8.1.tar.gz ... 
-- RETR wget-1.8.1.tar.gz

150 Opening BINARY mode data connection for wget-1.8.1.tar.gz
(4293879449 bytes).
done.
Lying FTP server found, adjusting.
Created socket fd 5.
Length: 1,097,780 [-1,087,847 to go]

assertion percentage <= 100 failed: file
/home/hack/projects/cygwin-wget/wget-1.8.1/src/progress.c, line 552
Aborted (core dumped)


 
 Thanks
 
 Steven
 --
 --
 - Steven Enderle - m d n Huebner GmbH 
 - [EMAIL PROTECTED] - + 49 911 93 90 90 -
 -  Digital Imaging  Documentmanagment   -
 --

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: Debian bug 21588 - inconsistent naming of directories created by wget

2002-02-06 Thread Hack Kampbjørn

Guillaume Morin wrote:
 
 Forward of
 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=21588repeatmerged=yes
 
 
 
 If I access a server not on the default port, wget does not write that
 port in the name of the directory it creates.  Here is an example:
 
 --13:43:40--  http://www.center.osaka-u.ac.jp:7080/center/contents.html
= `www.center.osaka-u.ac.jp/center/contents.html'

This was changed with version 1.8. Now it will be saved under
www.center.osaka-u.ac.jp:7080.

$ wget -l inf -r http://www.wsu.edu:8080/~brians/errors/errors.html
--19:24:23--  http://www.wsu.edu:8080/%7Ebrians/errors/errors.html
   = `www.wsu.edu:8080/%7Ebrians/errors/errors.html'
Resolving www.wsu.edu... done.
Connecting to www.wsu.edu[134.121.1.61]:8080... connected.
HTTP request sent, awaiting response... 200 OK
Length: 40,575 [text/html]

100%[] 40,57522.62K/s   
ETA 00:00


There can still be directory collisions, but now only for different
services on the same host all on their default port (or http and https
on the same non-default port), i.e. ftp://host, http://host and
https://host will all be saved under host.


 
 
 Please keep ,[EMAIL PROTECTED] CC'ed
 
 --
 Guillaume Morin [EMAIL PROTECTED]
 
 Oh, that is nice out there, I think I'll stay for a while (RHCP)

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: hola saludos

2002-02-06 Thread Hack Kampbjørn

Please in the future write in english if you expect to get any help.

En el futuro por favor escriba en inglés si quire recibir ayuda.

maromans wrote:
 
 estoy utilizando este programa y no puedo bajar sitios enteros
 desconosco si estoy equivocado en las opciones o modificadores pero no me
 baja los hipervinculos de los sitios que intento grabar si pueden ayudarme
 sinceramente gracias

Ademas de escribir en ingles y hasta que hallamos perfeccionado el
modulo der leer mentes, prodrias empezar por detallar que es lo que
haces, que esperabas que ocurriese (y por que) y que es lo que realmente
ocurre.

How To Ask Questions The Smart Way:
http://www.tuxedo.org/~esr/faqs/smart-questions.html

 
 _
 Do You Yahoo!?
 Get your free @yahoo.com address at http://mail.yahoo.com

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: Dynamic Images

2002-01-31 Thread Hack Kampbjørn

 RUI SHANTILAL wrote:
 
 Don´t know if you can consider it as a bug but reality is that I
 couldn´t retrieve images that are generated through a script.
 
 Example is :
 
 http://www.portugaldiario.iol.pt/idpd92.html
 
 try to retrieve all this page including all the images and all the
 images called as :
 
 
 <img
 
src="http://www.iol.pt/intermedia/iol/mediaget/get_iol_image/?ord_procedure_path=17441&nome_tabela=imagens_sitemaker&nome_campo=imagem&condicao=id"
 border=0>
 ..
 
 are not saved using wget !!
 
Most people think of it as a feature. Since most externally linked
images are banners, they are glad they are not downloaded -- right
troubleshooting, wrong conclusion: it's not the dynamic generation
(banners again?) that wget doesn't like but the different host
(www.portugaldiario.iol.pt != www.iol.pt).

 Hope u ppl get this funcionality working in next version !!
 Keep the good work !!

It's already there: tell wget to span hosts (--span-hosts). But use it
with care, as it would also follow links to banners ...
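
A sketch of what that could look like for this site (the exact options may
need tuning):

$ wget -r -p -H -D iol.pt http://www.portugaldiario.iol.pt/idpd92.html

-H spans hosts, -D limits the spanning to the iol.pt domain, and -p asks
for the page requisites such as inlined images.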

 smiler
 

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: Possible bugs when making https requests

2002-01-23 Thread Hack Kampbjørn

Sacha Mallais wrote:
 
 I'm having some problems with wget and SSL.  I have been getting the
 following output on occasion (meaning, the exact same command works
 sometimes and sometimes produces this), even when everything else (my web
 browser, etc) is able to connect with no problem:
 
 --
 DEBUG output created by Wget 1.8.1 on aix4.1.5.0.
 
 --14:30:01--  https://tpurs.oda.state.or.us/
= `/tmp/tPURS-apache-AYA-wget.output'
 Resolving tpurs.oda.state.or.us... done.
 Caching tpurs.oda.state.or.us = 192.152.7.27
 Connecting to tpurs.oda.state.or.us[192.152.7.27]:443... connected.
 Created socket 5.
 Releasing 20026168 (new refcount 1).
 
 Unable to establish SSL connection.
 Closing fd 5
 
 Unable to establish SSL connection.
 --
 
No problem here but then it's only version 1.7.1

$ wget -d https://tpurs.oda.state.or.us/
DEBUG output created by Wget 1.7.1 on cygwin.

parseurl (https://tpurs.oda.state.or.us/;) - host
tpurs.oda.state.or.us - opath  - dir  - file  - ndir 
newpath: /
--23:50:07--  https://tpurs.oda.state.or.us/
   = `index.html'
Connecting to tpurs.oda.state.or.us:443... Caching tpurs.oda.state.or.us
- 192.152.7.27
Created fd 3.
connected!
---request begin---
GET / HTTP/1.0
User-Agent: Wget/1.7.1
Host: tpurs.oda.state.or.us
Accept: */*
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... HTTP/1.1 200 OK
Date: Wed, 23 Jan 2002 22:50:29 GMT
Server: Apache/1.3.22 (Darwin) mod_ssl/2.8.5 OpenSSL/0.9.6b
Cache-Control: max-age=60
Expires: Wed, 23 Jan 2002 22:51:29 GMT
Last-Modified: Sat, 12 Jan 2002 00:30:02 GMT
ETag: d78f1-100-3c3f838a
Accept-Ranges: bytes
Content-Length: 256
Keep-Alive: timeout=15, max=500
Connection: Keep-Alive
Content-Type: text/html


Found tpurs.oda.state.or.us in host_name_address_map: 192.152.7.27
Registered fd 3 for persistent reuse.
Length: 256 [text/html]

0K   100% @
250.00 KB/s

23:50:09 (250.00 KB/s) - `index.html' saved [256/256]

 Also note the it does _not_ appear to be retrying the connection.  I have
 explicitly set --tries=5, and with a non-ssl connection, the above stuff
 appears 5 times when it cannot connect.  But, for SSL stuff, one failure
 kills the process.
 
 If there is any other info I can give you, let me know.

You have already done the exceptional: providing debug output !
 
 sacha
 
 --
 Sacha Michel Mallais   [EMAIL PROTECTED]
 Global Village Consulting Inc.  http://www.global-village.net/sacha
 Things won are done; joy's soul lies in the doing.
 -- William Shakespeare, Troilus and Cressida, Act 1, Scene 2

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



unsubscribing from list (WAS: Win ssl bug)

2001-12-04 Thread Hack Kampbjørn

Matt Pease wrote:
 
 somebody please get me off this list!  emailing [EMAIL PROTECTED]
 does not work

Sorry, that's not how it works.

YOU subscribed yourself to the list, you were warned that only YOU could
unsubscribe later on, and to save the welcome message in case you might
forget which address you subscribed with. It's obviously not
[EMAIL PROTECTED] or [EMAIL PROTECTED], but only YOU have a chance
of guessing it. Look at the headers of the mail you get from the list
maybe it's coming via some mail forwarder service you've forgotten
about.

Nobody on this list has the powers to unsubscribe other people. Those
that have are on [EMAIL PROTECTED] If you're a little more helpful than
your latest mail maybe they can help you i.e. you provide them with
_all_ the email addresses you used about the time you subscribed to this
list. But if you know them all you can just as easily unsubscribe the
right address yourself.

 
 Thanks -
 Matt

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: unsubscribing from list (WAS: Win ssl bug)

2001-12-04 Thread Hack Kampbjørn

James C. McMaster (Jim) wrote:
 
 What we cannot seem to get through to the thick-headed people is THE
 AUTOMATED UNSUBSCRIBE PROCEDURE IS BROKEN, AND HAS BEEN FOR A LONG TIME.
 FOLLOWING THE UNSUBSCRIBE INSTRUCTIONS YOU SO HELPFULLY EXPLAIN IS
 POINTLESS, BECAUSE THE AUTOMATED UNSUBSCRIBE PROCEDURE IS BROKEN, AND HAS
 BEEN FOR A LONG TIME.  SENDING EMAIL TO [EMAIL PROTECTED] WILL NOT
 DO THE TRICK BECAUSE THE AUTOMATED UNSUBSCRIBE PROCEDURE IS BROKEN, AND HAS
 BEEN FOR A LONG TIME.
 
 Do you get it now?  We are asking for the list admins to fix the list so we
 can stop bothering people on the list.  Do you get it now?

You're right, I hadn't got it. But if the unsubscribe procedure was broken,
it's fixed now (btw the list admins can be reached at [EMAIL PROTECTED]).
I had no problem unsubscribing myself ([EMAIL PROTECTED]) and
subscribing a new mail address ([EMAIL PROTECTED]).

From my Welcome msg:
 Please save this message so that you know the address you are
 subscribed under, in case you later want to unsubscribe or change your
 subscription address.
 [...]
 You can start a subscription for an alternate address,
 for example [EMAIL PROTECTED], just add a hyphen and your
 address (with '=' instead of '@') after the command word:
 [EMAIL PROTECTED]
 
 To stop subscription for this address, mail:
 [EMAIL PROTECTED]
 
 In both cases, I'll send a confirmation message to that address. When
 you receive it, simply reply to it to complete your subscription.
 
 If despite following these instructions, you do not get the
 desired results, please contact my owner at
 [EMAIL PROTECTED] Please be patient, my owner is a
 lot slower than I am ;-)

And here is the proof that I successfully unsubscribed my previous mail
address:

 Original Message 
Subject: GOODBYE from [EMAIL PROTECTED]
Date: 4 Dec 2001 17:08:53 -
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]


Hi! This is the ezmlm program. I'm managing the
[EMAIL PROTECTED] mailing list.

Acknowledgment: I have removed the address

   [EMAIL PROTECTED]

from the wget mailing list. This address
is no longer a subscriber.
[...]

 
 --
 Jim McMaster
 mailto:[EMAIL PROTECTED]
 

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: --only and --not

2001-11-16 Thread Hack Kampbjørn

Hrvoje Niksic wrote:
 
 First, my apologies for the long delay in answering.
 

Welcome back !

For those not on the wget-patches list, check the CVS ChangeLog 8-)
http://sunsite.dk/cvsweb/wget/src/ChangeLog

 The idea behind this patch, and the patch itself, are very
 interesting.  I'll look into it for Wget 1.8 (1.7.1 should be a
 bugfix-only release.)
 
 Several random musings:
 
 * It would be nice to have an option to use only one filter, so that
   people who want speed and/or retain state between filter invocations
   can get them.
 
 * \ is probably not the best choice for the escape character; it's
   easy to lose it.  Maybe %u et al. would be a better choice?
 
 -
 To unsubscribe, e-mail: [EMAIL PROTECTED]
 For additional commands, e-mail: [EMAIL PROTECTED]

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: Compile problem (and possible fix)

2001-11-11 Thread Hack Kampbjørn

Ian Abbott wrote:
 
 On 7 Nov 2001, at 23:07, Hack Kampbjørn wrote:
 
 Agreed that you don't want to use Apple's precompiler, but I
 couldn't tell from the links you posted what platform the fix
 fails to compile on. There was one reference to VC++ 5.0 breaking,
 but that was for the unfixed version.

On Monday, June 25, 2001, at 05:54 PM, Hrvoje Niksic wrote:
 Perhaps the problem is that the '"' constant is within the assert,
 which might indeed hurt some compilers.  In fact, I originally used
 '\"', but a Microsoft compiler couldn't swallow it in the `assert'
 expression.

I read that as '\"' (which the original poster proposed) breaks M$
Visual Studio.

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: Incorrect numbers with -c option

2001-11-11 Thread Hack Kampbjørn

Lukasz Bolikowski wrote:
 
 Hello!
 
I think the -c option in wget results in misleading output.
 I have been downloading
 
 ftp://ftp.kernel.org/pub/dist/superrescue/v2/superrescue-2.0.0a.iso.gz
 
 which is 514596847 bytes long. I aborted the downloading after 246345728
 bytes and then run:
 
 wget -c --passive-ftp -o super-log ftp://...
 
 This is the beginning of the logfile:
 
 --17:52:13--
 ftp://ftp.kernel.org/pub/dist/superrescue/v2/superrescue-2.0.0a.iso.gz
= `superrescue-2.0.0a.iso.gz'
 Connecting to ftp.kernel.org:21... connected!
 Logging in as anonymous ... Logged in!
 == SYST ... done.== PWD ... done.
 == TYPE I ... done.  == CWD /pub/dist/superrescue/v2 ... done.
 == PASV ... done.== REST 246345728 ... done.
 == RETR superrescue-2.0.0a.iso.gz ... done.
 Length: 268,251,119 [21,905,391 to go] (unauthoritative)
 
   [ skipping 240550K ]
 240550K ,, ,, ,, .. .. 91% @
 34.61 KB/s
 lots of lines
 328050K .. .. .. .. ..125% @
 50.81 KB/s
 snip!
 
 IMHO there is much more to go than 21,905,391 bytes. Besides, the
 percentages on the right are incorrect. I'm using GNU Wget 1.7

Yes, this has been an ongoing problem in wget (but it should be fixed
now, at least in CVS if not in 1.7), but I cannot reproduce it with wget
1.7 and the URL provided (so it looks fixed in 1.7).

$ wget -d --passive-ftp -c
ftp://ftp.kernel.org/pub/dist/superrescue/v2/superrescue-2.0.0a.iso.gz
DEBUG output created by Wget 1.7 on cygwin.

snip/

200 PORT command successful.
done.== REST 316224 ... 
-- REST 316224

350 Restarting at 316224. Send STORE or RETRIEVE to initiate transfer.
done.
== RETR superrescue-2.0.0a.iso.gz ... 
-- RETR superrescue-2.0.0a.iso.gz

150 Opening BINARY mode data connection for superrescue-2.0.0a.iso.gz
(514280623 bytes).
done.
Lying FTP server found, adjusting.
Created socket fd 5.
Length: 514,596,847 [514,280,623 to go]

  [ skipping 300K ]
  300K .. .. .. .. ..  0% @ 
28.56 KB/s
  350K .. .. .. .. ..  0% @ 
52.58 KB/s

 
 Best regards
 
 Lukasz Bolo Bolikowski

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: connect timeout

2001-11-07 Thread Hack Kampbjørn


Nic Ferrier wrote:
 
 Sorry if you're already aware of this... I couldn't find the archives of
 this list at GNU. Maybe you should put a link on the page:
 http://www.gnu.org/software/wget/

The official web-site is http://wget.sunsite.dk/
Yes, there should be a link from the GNU site.
[List] how can it be added ?

 
 I've discovered that wget doesn't do connection timeouts. That is if the
 host it is trying to connect to cannot be reached for some reason then wget
 simply hangs.
 
 I expected wget to return after T seconds after specifying the timeout
 option on the command line but it didn't.
 
 No control of connect timeouts is a serious weakness in a tool designed to
 be used for batched downloads... I've had to swap wget for curl for the
 particular task I'm working on (which is a pity because in all other
 respects I like wget and want to support GNU projects).

$ wget --help
GNU Wget 1.7, a non-interactive network retriever.
Usage: wget [OPTION]... [URL]...
[...]
Download:
[...]
  -t,  --tries=NUMBER   set number of retries to NUMBER (0
unlimits).
[...]
  -T,  --timeout=SECONDSset the read timeout to SECONDS.
  -w,  --wait=SECONDS   wait SECONDS between retrievals.
   --waitretry=SECONDS  wait 1...SECONDS between retries of a
retrieval.
[...]

Which of these were you using?
And please, next time send bug reports including debug output (wget -d
...)

 
 Nic Ferrier

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: connect timeout

2001-11-07 Thread Hack Kampbjørn

Nic Ferrier wrote:
 
 The official web-site is http://wget.sunsite.dk/
 Yes, there should be a link from the GNU site.
 [List] how can it be added ?
 
 sign the project up on savannah (http://savannah.gnu.org).
 
 That will provide you with a nice management interface (based on CVS)
 for changing the wget tree.
 
 $ wget --help
 GNU Wget 1.7, a non-interactive network retriever.
 Usage: wget [OPTION]... [URL]...
 Download:
   -t,  --tries=NUMBER   set number of retries to NUMBER (0
   unlimits).
   -T,  --timeout=SECONDSset the read timeout to SECONDS.
   -w,  --wait=SECONDS   wait SECONDS between retrievals.
--waitretry=SECONDS  wait 1...SECONDS between retries of a
   retrieval.
 Which of this were you using ?
 
 I tried sveral things, including:
 
wget -t 2 -T 10 -w 1
 
 But, be honest, is the -T option actually a *connect* timeout?

You're right it's not 8-(
$ wget -d -T 5 http://192.168.1.254/
DEBUG output created by Wget 1.7 on cygwin.

parseurl (http://192.168.1.254/;) - host 192.168.1.254 - opath  -
dir  - file  - ndir 
newpath: /
--22:23:46--  http://192.168.1.254/
   = `index.html'
Connecting to 192.168.1.254:80... 
connect: Attempt to connect timed out without establishing a connection
Closing fd 3
Retrying.

--22:24:08--  http://192.168.1.254/
  (try: 2) = `index.html'
Connecting to 192.168.1.254:80... 
[hack@DUR0N2000 webs]$ wget -d -T 5 http://192.168.1.254/
DEBUG output created by Wget 1.7 on cygwin.

parseurl (http://192.168.1.254/;) - host 192.168.1.254 - opath  -
dir  - file  - ndir 
newpath: /
--22:24:19--  http://192.168.1.254/
   = `index.html'
Connecting to 192.168.1.254:80... 
[...]

$ wget -d -T 5 http://hostname/
DEBUG output created by Wget 1.7-dev on linux-gnu.

parseurl (http://hostname/;) - host hostname - opath  - dir  -
file  - ndir 
newpath: /
--22:36:48--  http://hostname/
   = `index.html'
Connecting to hostname:80... Caching hostname - 192.168.1.254

connect: Connection timed out
Closing fd 3
Retrying.

--22:39:58--  http://hostname/
  (try: 2) = `index.html'
Connecting to hostname:80... Found hostname in host_name_address_map:
192.168.1.254

Note: output edited; hostname is a host that doesn't answer on port 80.
Two different systems on two different networks, which might explain the
difference in timeout times.

Well, Daniel Stenberg, maybe you should try to get your cURL
implementation accepted. It's bad when other packages' maintainers are
more active on the list than Wget's 8-(

 
 And please the next time send bugreports including debug output (wget
 -d ...)
 
 I don't think it would do you much good in this case... but I can send
 you one if you want.

Likely not, but it would include the Wget version, and if that's not 1.7
then the standard recommendation would be to update. You'd be surprised
how many bug reports there are related to older versions like 1.5.3 or
even a couple to 1.4.5 8-)

 
 Nic

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: Compile problem (and possible fix)

2001-11-07 Thread Hack Kampbjørn

Ed Powell wrote:
 
 I was compiling wget 1.7 on MacOS X 10.1 (Darwin 1.4).  Around line 435 in
 html-parse.c there's the section:
 
 case AC_S_QUOTE1:
   assert (ch == '\'' || ch == '"');
   quote_char = ch;  /* cheating -- I really don't feel like
introducing more different states for
different quote characters. */
   ch = *p++;
   state = AC_S_IN_QUOTE;
   break;
 
 I had to change:
 
 assert (ch == '\'' || ch == '"');
 
 to:
 
 assert (ch == '\'' || ch == '\"');
 
 Otherwise, it would not compile... it was, I think, interpreting the ",
 rather than using it literally.  Escaping it appears to have fixed the
 problem.

Right conclusion, wrong fix. Fix the broken software, not the correct
one; i.e. your fix breaks wget on another broken platform (read the
links below if you cannot guess which). You don't want to use Apple's
precompiler anyway.


http://www.mail-archive.com/cgi-bin/htsearch?method=and&format=short&config=wget_sunsite_dk&restrict=&exclude=&words=darwin
http://www.mail-archive.com/wget@sunsite.dk/msg01532.html
http://www.mail-archive.com/wget@sunsite.dk/msg01289.html

 
 The compiling process was simply doing a 'configure' then 'make'.  After
 making the change described above, I ran 'make' again, and everything was
 fine.
 
 --
 Ed Powell - Meus Navis Aerius est Plena Anguillarum
  http://www.visi.com/~epowell

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: Compilation problems

2001-11-07 Thread Hack Kampbjørn

Andrew Coggins wrote:
 
 Hi Wolfgang,
 
 Thanks, that cleared the problem. though  a search on freshmeat came up empty
 for texi2pod.

texi2pod isn't a separate package, but is included in the wget
distribution. It is run by make, but it failed; that's the error message
you got. Wget, as GNU software, is developed to use GNU make, and *BSD
systems use (you guessed it) Berkeley (or BSD) make. You should install
gmake (it's in ports) and run 'gmake' wherever the documentation says
'make'.

In general it's a good idea on *BSD systems to build on the ports tree's
shared knowledge (even when you're not using its build and installation
system). Information like this is already recorded there (and even some
patches). On my OpenBSD system:

$ cd /usr/ports/net/wget
$ make show=USE_GMAKE
Yes
$ ls patches/
CVS                     patch-configure_in      patch-doc_wget.texi
patch-configure         patch-doc_Makefile.in


 
 -Andrew
 
 On Sunday 04 November 2001 12:33, Andrew Coggins wrote:
  erm, well I went to freshmeat, looked up wget, and followed the links.
  http://www.freshmeat.net
 
  So, anyone able to help with my problem? :)
 
 Hi, I am also rather asker than answerer on this list. But: Looks like you
 
 have to install texi2pod?
 
 Second guess: texi2pod translates your docs from GNU texinfo to perl pod
 format (at least I guess so). So I guess for making the program work, you can
 
 do without.

Which is then converted to man format by pod2man from perl (usually
already on the system).

 
 make -k
 
 should get you farther than just till texi2pod.
 
 Cheers,
 Wolfgang
 
 --
 Dr. Wolfgang Müller, assistant == teaching assistant
 Personal page: http://cui.unige.ch/~vision/members/WolfgangMueller.html
 Maintainer, GNU Image Finding Tool (http://www.gnu.org/software/gift)
 
 
 

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: wget 1.5.3 suggestions

2001-10-25 Thread Hack Kampbjørn

Hermann Rugen wrote:
 
 Hallo folks,
 nice to use this software.
 I was looking for more than I got.
 Maybe I made a mistake.
 Downloading or trying to mirror a site does run, but without style sheets.
 For example: take my homepage.
 I did 'wget' it for testing and did not get the complete one. Style sheets
 were missing.

Version 1.5.3 is several years old now. Try using the latest version of
wget; it knows about style sheets. You can download the source from
the web-site (http://wget.sunsite.dk/).
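
With a current version, something like this (a sketch, using your homepage
as the example) should also pull in the style sheets:

$ wget -r -p --convert-links http://www.rugen-consultng.com/

-p (--page-requisites) tells wget to also fetch everything a page needs to
display, such as style sheets and inlined images.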


 What else, I don't know now.
 I would be happy to get a wget.conf sample for downloading a site for
 mirroring.
 Can you help me?
 wget is running on SUSE7.1 with linux2.2.18
 
 kind regards
 Hermann Rugen
 
 eMail: [EMAIL PROTECTED]
 Internet: www.rugen-consultng.com

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: problem with permanent connection.... (maybe a bug)

2001-10-24 Thread Hack Kampbjørn

Marcelo Taube wrote:
 
 wget was working OK with my ppp connection; now I installed a new connection
 through a NE2000 ETHERNET card. Most of the programs work OK and faster, but
 wget (and other programs that depend on it) doesn't work anymore
 
 this is what happens when I try to download a file (It happens with other files from 
other ftp servers as well)...
 **
 [root@localhost /root]# wget 
ftp://ftp.cs.tu-berlin.de/pub/X/XFree86/4.1.0/binaries/Linux-ix86-glibc22/Xf100.tgz

Next time please send the debug output (wget -d ...)

 --14:28:48--  
ftp://ftp.cs.tu-berlin.de/pub/X/XFree86/4.1.0/binaries/Linux-ix86-glibc22/Xf100.tgz
= `Xf100.tgz'
 Connecting to ftp.cs.tu-berlin.de:21... connected!
 Logging in as anonymous ... Logged in!
 == SYST ... done.== PWD ... done.
 == TYPE I ... done.  == CWD /pub/X/XFree86/4.1.0/binaries/Linux-ix86-glibc22 ... 
done.
 == PORT ... done.== RETR Xf100.tgz ... done.
 **
 After the last done it freezes and doesn't  download a single byte.
 I have already updated to the last version of wget (cvs version) but the problem was 
not fixed.
 

Just a guess, but is your new network connection using NAT? And if so,
does the router (or firewall) have an FTP proxy to allow active FTP
connections through?
If not, try using passive mode.

$ wget --help 
GNU Wget 1.7, a non-interactive network retriever.
Usage: wget [OPTION]... [URL]...
[...]
FTP options:
  -nr, --dont-remove-listing   don't remove `.listing' files.
  -g,  --glob=on/off   turn file name globbing on or off.
   --passive-ftp   use the passive transfer mode.
   --retr-symlinks when recursing, get linked-to files (not
dirs).
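
So, a sketch of the same download in passive mode:

$ wget --passive-ftp ftp://ftp.cs.tu-berlin.de/pub/X/XFree86/4.1.0/binaries/Linux-ix86-glibc22/Xf100.tgz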


 **
 [root@localhost /root]# wget --version
 GNU Wget 1.7.1-pre1
 
 Copyright (C) 1995, 1996, 1997, 1998, 2000, 2001 Free Software Foundation, Inc.
 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU General Public License for more details.
 
 Originally written by Hrvoje Niksic [EMAIL PROTECTED].
 **
 I have no idea of what's causing this...
 Thank you in advance...

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: convert-links doesn't work on directories

2001-10-22 Thread Hack Kampbjørn

Dan Christensen wrote:
By the way, I searched but couldn't find the cvs repository for
  wget.  I found the sunsite repository, but it seems like it hasn't
  been updated since June, so I'm guessing it moved somewhere else?
  (Or maybe I'm just not getting the right branch?)  Some of the
  web pages for wget that google turns up seem out of date.
 
  Take a look at the developement page for instructions to access to the
  CVS sources:
  http://wget.sunsite.dk/wgetdev.html
 
 I already did that and found a version which hasn't been updated since
 June.

That is the latest version; not much gets committed to CVS with the
committers MIA (or busy with real work). And AFAIK there's no archive for
the wget-patches list, where activity has continued 8-(

 
 Also, the cvsweb link on that page gets redirected to
 
   http://sunsite.dk/sunsite.css
 
 which says
 
   The requested URL /sunsite.css was not found on this server.
 

Then you're using Netscape Navigator with JavaScript and StyleSheets
activated, and there's a broken link in the page (well, the style sheet
link). Yes, it's annoying how Netscape won't show such a page, so
disable JavaScript and you'll get it without CSS.

I'll report this to [EMAIL PROTECTED]

 Thanks for the help.

You're welcome.

 
 Dan

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: ampersand troubles

2001-10-21 Thread Hack Kampbjørn

[EMAIL PROTECTED] wrote:
 
 When I am using wget for CGI-scripts with arguments,
 I need to use & (ampersand) between arguments;
 but wget changes & to %26 via quoting.
 How can I get
 http://find.infoart.ru/cgi-bin/yhs.pl?hidden=http%3A%2F%2Ffind.infoart.ru&word=wget
 ?


Which version of wget are you using?
I have no problem getting this page with wget 1.7. Note that I use -O, as
the filename wget proposes would be illegal on Windows (it contains '?:/'):

$ wget -O testing -S
'http://find.infoart.ru/cgi-bin/yhs.pl?hidden=http%3A%2F%2Ffind.infoart.ru&word=wget'
--12:06:44-- 
http://find.infoart.ru/cgi-bin/yhs.pl?hidden=http%3A//find.infoart.ru&word=wget
   = `testing'
Connecting to find.infoart.ru:80... connected!
HTTP request sent, awaiting response... 200 OK
2 Date: Sun, 21 Oct 2001 10:09:23 GMT
3 Server: Apache/1.3.20 (Unix) mod_fastcgi/2.2.8 rus/PL30.5
4 Connection: close
5 Content-Type: text/html; charset=windows-1251
6 

0K ...@  10.82
KB/s

Last-modified header missing -- time-stamps turned off.
12:06:44 (10.82 KB/s) - `testing' saved [7327]


 PS: Please, answer directly to [EMAIL PROTECTED].

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



[Cygwin: Updated: wget-1.7-1]

2001-10-21 Thread Hack Kampbjørn

Since some on this list have problems compiling wget on Cygwin (those who
don't know what cygwin is can just skip this message or read about it at
http://www.cygwin.com/), I'm forwarding this announcement to let them
know that they can just use the cygwin package.

Hack 8-)


 Original Message 
Subject: Updated: wget-1.7-1
Date: Thu, 18 Oct 2001 19:01:12 +0200
From: Hack Kampbjørn [EMAIL PROTECTED]
Reply-To: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]

I've updated wget in cygwin to version 1.7-1

DESCRIPTION:
GNU Wget is a free software package for retrieving files using HTTP,
HTTPS and FTP, the most widely-used Internet protocols. It is a
non-interactive commandline tool, so it may easily be called from
scripts, cron jobs, terminals without Xsupport, etc.


CHANGES:
 - SSL (or https) support
 - Cookies
 - HTTP/1.1 KeepAlive (persistent) connections
 - Many FTP improvements including support of NT and VMS servers
 - Internal structure changes resulting in big speedups when downloading
   big sites (thousands of documents)
For more changes see the NEWS file (/usr/doc/wget-1.7/NEWS)


WARNING:
wget-1.7-1 depends on an installed openssl package!


INSTALLATION:
To update your installation, click on the Install Cygwin now link on
the http://cygwin.com/ web page.  This downloads setup.exe to your
system.  Then, run setup and answer all of the questions.

Note that we do not allow downloads from sources.redhat.com (aka
cygwin.com) due to bandwidth limitations.  This means that you will need
to find a mirror which has this update.

In the US,
ftp://mirrors.rcn.net/mirrors/sources.redhat.com/cygwin/ is a
reliable high bandwidth connection.

In Germany,
ftp://ftp.uni-erlangen.de/pub/pc/gnuwin32/cygwin/mirrors/cygnus/ is
usually pretty good.

In the UK,
http://programming.ccp14.ac.uk/ftp-mirror/programming/cygwin/pub/cygwin/
is usually up-to-date within 48 hours.

If one of the above doesn't have the latest version of this package then
you can either wait for the site to be updated or find another mirror.

The setup.exe program will figure out what needs to be updated on your
system and will install newer packages automatically.

If you have questions or comments, please send them to the Cygwin
mailing list at: [EMAIL PROTECTED] .  I would appreciate if you would
use this mailing list rather than mailing me directly.  This includes
ideas and comments about the setup utility or Cygwin in general.

If you want to make a point or ask a question, the Cygwin mailing list
is the appropriate place.

  *** CYGWIN-ANNOUNCE UNSUBSCRIBE INFO ***

If you want to unsubscribe from the cygwin-announce mailing list, look
at the List-Unsubscribe:  tag in the email header of this message.
Send email to the address specified there.  It will be in the format:

[EMAIL PROTECTED]


NOTES:
Yes, we have a new maintainer 8-)

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: convert-links doesn't work on directories

2001-10-21 Thread Hack Kampbjørn

Dan Christensen wrote:
 
 Dear wget maintainers,

I hope the wget maintainers (Hrvoje Niksic and Dan Harkless) are reading
this, but they've been MIA for some months now.

 
   I noticed that the Debian wget maintainer isn't forwarding many
 bugs upstream to you.  If you are curious, you can find the list
 of bugs at
 
   http://bugs.debian.org/wget
 

I wasn't aware of this site. A quick scan of it shows that many of the
bugs are related to wget version 1.5.3 (which is the Debian stable
package); note that it is quite old and that many small quirks have been
fixed since then.
I'll try to find some time to go through it.

 The bug that bit me today is
 
   http://bugs.debian.org/62425
 
 which has been open for a year and a half.  In short, convert-links
 doesn't handle URL's of the form .../directory or .../directory/.
 If they are replaced with .../directory/index.html then it works,
 but otherwise it thinks it hasn't downloaded the URL's.

IIRC this is a known bug, but nobody has been annoyed enough to provide
a fix. Try searching the wget archives for a better answer (we really
need an archive for the wget-patches list)

 
   By the way, I searched but couldn't find the cvs repository for
 wget.  I found the sunsite repository, but it seems like it hasn't
 been updated since June, so I'm guessing it moved somewhere else?
 (Or maybe I'm just not getting the right branch?)  Some of the
 web pages for wget that google turns up seem out of date.

That is the official site (and has always been since wget got a
web-site): http://wget.sunsite.dk/ or http://sunsite.dk/wget (previously
known as http://sunsite.auc.dk/wget, but AUC (Aalborg Universitets
Center) has changed its name (they didn't want to be just a center but a
real university) and dropped the auc.dk domain).

Take a look at the developement page for instructions to access to the
CVS sources:
http://wget.sunsite.dk/wgetdev.html

 
   Thanks for a great program.  If you know any easy work-arounds
 for the above bug, I'd love to hear them.  Or if I can get access
 to a version with it fixed, that'd be great.  We have a website
 we're trying to put onto a cd, and this is the only thing in our
 way.
 
 Dan
 
 --
 Dan Christensen
 [EMAIL PROTECTED]

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn



Re: WGET multiple files?

2001-10-10 Thread Hack Kampbjørn


Ifj. Pentek Imre wrote:
 
 Dear Sir,
 
 I'm writing to you because I want to know if WGET can be used to
 download multiple files. So if I want to download files in the same dir?
 What to do in this case? Can your program handle wildcards (like *?)?
 
This is best answered by reading the documentation on the web-site
(http://sunsite.dk/wget/). I think all your questions are answered in the
first section, quoted below (with a small wildcard example after it). You
are of course welcome to suggest improvements to our documentation 8-)


Introduction to GNU wget

GNU Wget is a free software package for retrieving files using HTTP,
HTTPS and FTP, the most widely-used Internet protocols. It is a
non-interactive
commandline tool, so it may easily be called from scripts, cron jobs,
terminals without Xsupport, etc.

Wget has many features to make retrieving large files or mirroring
entire web or FTP sites easy, including:

Can resume aborted downloads, using REST and RANGE 
Can use filename wild cards and recursively mirror directories 
NLS-based message files for many different languages 
Optionally converts absolute links in downloaded documents to
relative, so that downloaded documents may link to each other locally 
Runs on most UNIX-like operating systems as well as Microsoft
Windows 
Supports HTTP and SOCKS proxies 
Supports HTTP cookies 
Supports persistent HTTP connections 
Unattended / background operation 
Uses local file timestamps to determine whether documents need to be
re-downloaded when mirroring 
GNU wget is distributed under the GNU General Public License. 
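
For the wildcard question specifically, a sketch (the server and path are
made up; file name globbing only works for FTP URLs):

$ wget "ftp://ftp.example.org/pub/sounds/*.mp3"

Quote the URL so your shell doesn't try to expand the * itself.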

 Thank you for your answer for my letter.
 Yours sincerely:Imre Pentek
 E-mail:[EMAIL PROTECTED]

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: cgi scripts and wget

2001-06-04 Thread Hack Kampbjørn

Samer Nassar wrote:
 
 Hello,
 
 I am an undergrad student in University of Alberta, and downloaded wget
 recently to mirror a site for research purposes. However, wget seems to be
 having trouble pulling pages whose urls are cgi. I went through wget
 manual and didn't see anything about this. Any hints?

Please include the debug output of running wget with the -d option.
There are a couple of problems you can have with CGI scripts and the
like (see the sketch after this list):
- On windows systems: '?' is an illegal character in filenames
- The CGI is filtering on UserAgent value
- Use of cookies
- Use of POST instead of GET
...
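
Once you know which of these it is, the first three can often be worked
around on the command line. A rough sketch (URL, user-agent string and
cookie value are made up):

$ wget -d -U 'Mozilla/4.0 (compatible)' \
       --header='Cookie: session=abc123' \
       -O result.html 'http://www.example.edu/cgi-bin/search?q=test'

POST is the one case current wget cannot do at all.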

 
 Thanks for your help.
 Samer

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Problem with https connection (sslv23 - ok with sslv3)

2001-06-03 Thread Hack Kampbjørn

Some time ago I came across this web-site with HTTPS connection
problems:

  $ wget -S https://www.ihi.dk/
  --23:34:50--  https://www.ihi.dk/
 = `index.html'
  Connecting to www.ihi.dk:443... connected!

  Unable to establish SSL connection.

  Unable to establish SSL connection.
  $

But it works in my browser. So I try with the openssl client:

  $ openssl s_client -connect www.ihi.dk:443
  CONNECTED(0003)
  592:error:140790E5:SSL routines:SSL23_WRITE:ssl handshake  
failure:s23_lib.c:216:
  $

Same problem 8-(
Now it's time to force the SSL protocols manually:

  $ openssl s_client -ssl3 -connect www.ihi.dk:443
  CONNECTED(0003)
  depth=0 /C=DK/ST=Copenhagen/L=Copenhagen/O=International Health  
Insurance/CN=www.ihi.dk
  verify error:num=20:unable to get local issuer certificate
  verify return:1
  depth=0 /C=DK/ST=Copenhagen/L=Copenhagen/O=International Health  
Insurance/CN=www.ihi.dk
  verify error:num=27:certificate not trusted
  verify return:1
  depth=0 /C=DK/ST=Copenhagen/L=Copenhagen/O=International Health  
Insurance/CN=www.ihi.dk
  verify error:num=21:unable to verify the first certificate
  verify return:1
  ---
  Certificate chain
   0 s:/C=DK/ST=Copenhagen/L=Copenhagen/O=International Health  
Insurance/CN=www.ihi.dk
 i:/C=US/O=RSA Data Security, Inc./OU=Secure Server Certification
Authority
  ---
  Server certificate
  -BEGIN CERTIFICATE-
  MIICCjCCAXcCEAUS4W9dOIDJk7K/MmOykJUwDQYJKoZIhvcNAQEEBQAwXzELMAkG
  A1UEBhMCVVMxIDAeBgNVBAoTF1JTQSBEYXRhIFNlY3VyaXR5LCBJbmMuMS4wLAYD
  VQQLEyVTZWN1cmUgU2VydmVyIENlcnRpZmljYXRpb24gQXV0aG9yaXR5MB4XDTAw
  MDYxMzAwMDAwMFoXDTAxMDYyNjIzNTk1OVowdTELMAkGA1UEBhMCREsxEzARBgNV
  BAgTCkNvcGVuaGFnZW4xEzARBgNVBAcUCkNvcGVuaGFnZW4xJzAlBgNVBAoUHklu
  dGVybmF0aW9uYWwgSGVhbHRoIEluc3VyYW5jZTETMBEGA1UEAxQKd3d3LmloaS5k
  azBcMA0GCSqGSIb3DQEBAQUAA0sAMEgCQQC8OGOR/9UZ6EFk8oGLVB5C3VbXG5T4
  V5zZJyPRFh7KTBtSnWQvGSxMBwES/n8kIowsX1cRZw2ot1aaU3X8k3KvAgMBAAEw
  DQYJKoZIhvcNAQEEBQADfgAM3sAMXClUWsrMM7Ztx/+HuqEi5rHs4MouKPmj93e0
  U8eV2QqsuwDKIkUxqyLFdiWKCmGbMasAOAOyS1wL7CIu2QCsNFINNBQX4LD19WYg
  +Vh3QHGB4EewkidIZ0Q9AD+DKMqAC45cB6JmbJ512gA3u9z1vpmiL8ZimmXPAg==
  -END CERTIFICATE-
  subject=/C=DK/ST=Copenhagen/L=Copenhagen/O=International Health  
Insurance/CN=www.ihi.dk
  issuer=/C=US/O=RSA Data Security, Inc./OU=Secure Server Certification
Authority
  ---
  No client certificate CA names sent
  ---
  SSL handshake has read 694 bytes and written 238 bytes
  ---
  New, TLSv1/SSLv3, Cipher is DES-CBC3-SHA
  Server public key is 512 bit
  SSL-Session:
  Protocol  : SSLv3
  Cipher: DES-CBC3-SHA
  Session-ID:
  114EFD511DE3F7FBDE1A8C917F7E4DC9CA7F66BA5D478FC82778ED923CBE43CA
  Session-ID-ctx: 
  Master-Key:
  509D485AC95363FA0F8C2786DFE1E90D78564CAF45F78082EFF81A8FED0E87C1D46B29
  824AE396EB953907BA0D07EB73
  Key-Arg   : None
  Start Time: 991604431
  Timeout   : 7200 (sec)
  Verify return code: 0 (ok)
  ---
  HEAD / HTTP/1.0

  HTTP/1.1 302 Found
  Server: Lotus-Domino/5.0.6
  Date: Sun, 03 Jun 2001 22:30:51 GMT
  Location: ihihome.nsf/all/e_main
  Connection: close
  Content-Type: text/html
  Content-Length: 310

  read:errno=0
  $ 

BINGO !
Now I change line 54 in src/gen_sslfunc.c
  /* meth = SSLv23_client_method (); */
  meth = SSLv3_client_method ();


  $ wget -S https://www.ihi.dk/
  --23:35:36--  https://www.ihi.dk/
 = `index.html'
  Connecting to www.ihi.dk:443... connected!
  HTTP request sent, awaiting response... 302 Found
  2 Server: Lotus-Domino/5.0.6
  3 Date: Sun, 03 Jun 2001 22:31:12 GMT
  4 Location: ihihome.nsf/all/e_main
  5 Connection: close
  6 Content-Type: text/html
  7 Content-Length: 310
  8 
  Location: ihihome.nsf/all/e_main [following]
  --23:35:37--  https://www.ihi.dk/ihihome.nsf/all/e_main
 = `e_main'
  Connecting to www.ihi.dk:443... connected!
  HTTP request sent, awaiting response... 200 OK
  2 Server: Lotus-Domino/5.0.6
  3 Date: Sun, 03 Jun 2001 22:36:54 GMT
  4 Connection: close
  5 Content-Type: text/html; charset=US-ASCII
  6 Content-Length: 1404
  7 Last-Modified: Wed, 23 May 2001 14:23:36 GMT
  8 

  0K . 100% @  
1.34 MB/s

  23:35:37 (1.34 MB/s) - `e_main' saved [1404/1404]


Now that was a really crude solution. I'm not so familiar with openssl,
but isn't it supposed to just pick the right SSL protocol? If this is the
expected behavior and not a bug in openssl, then we should allow the user
to override the SSL protocol used (maybe a --ssl-version=ssl3 option or
something). Or even better, cycle through them all until one clicks (if
that's possible).
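
In the meantime, a crude way to find out by hand which protocols a
server accepts (just a sketch using the openssl client, not wget) is:

$ for proto in -ssl2 -ssl3 -tls1; do
    echo "=== $proto ==="
    echo QUIT | openssl s_client $proto -connect www.ihi.dk:443 2>&1 \
      | grep -E 'Protocol|handshake failure'
  done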

  $ openssl version
  OpenSSL 0.9.6 24 Sep 2000

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: Is there a way to override wgetrc options on command line?

2001-06-01 Thread Hack Kampbjørn



Humes, David G. wrote:
 
 Hello,
 
 I have several cronjobs using wget and the wgetrc file turns on passive-ftp
 by default.  I have one site where strangely enough passive ftp does not
 work but active does work.  I'd rather leave the passive ftp default set and
 just change the one cronjob that requires active ftp.  Is there any way to
 tell wget to either disregard the wgetrc file or to override one or more of
 its options?
 
 Thanks.

What about --execute=COMMAND ?

$ wget --help
GNU Wget 1.7-pre1, a non-interactive network retriever.
Usage: wget [OPTION]... [URL]...

Mandatory arguments to long options are mandatory for short options too.

Startup:
  -V,  --version   display the version of Wget and exit.
  -h,  --help  print this help.
  -b,  --backgroundgo to background after startup.
  -e,  --execute=COMMAND   execute a `.wgetrc'-style command.
[...]
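
So for the one cron job that needs active FTP you could keep the wgetrc
default and just override it on that command line; something like this
(host and path are placeholders):

$ wget -e passive_ftp=off ftp://ftp.example.com/pub/somefile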

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: What do you think my chances are of getting wget to work on HP-UXare ?

2001-05-24 Thread Hack Kampbjørn



Alan Barrow wrote:
 
 This message contains information which may be confidential and privileged.  Unless 
you are the addressee (or authorised to receive for the addressee),  you should not 
use, copy or disclose to anyone the details or information contained in this message. 
 The content of the message and or attachments may not reflect the view and opinions 
of the originating company. If you have received this message in error, you should 
reply to the sender and copy [EMAIL PROTECTED] and delete the message from your 
system.  Thank you for your co-operation.


I think your chances are pretty good (if not 100%) of getting wget
working if you put enough effort into it.

<sarcasm>
I will even wager (without knowing HP-UX) that you have a 50% chance of
succeeding with less effort than was taken in sending this mail:
./configure
make
make install
</sarcasm>

PS: Watch your lines length, max. 72 chars please.

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: Change Request: Cookies 4 WGet

2001-05-15 Thread Hack Kampbjørn


Michael Klaus wrote:
 
 Dear WGet team,
 
 first of all, i want to say that WGet really is a _great_ program. My
 company is mostly using it for regression tests for different web
 servers and servlet engines. And there's the problem. Servlet engines
 meintain their sessions - which are critical for regression tests - via
 cookies. A functionality to hold cookies (one cookie would be sufficient
 for this task) and send them back with each request would really be
 helpful.
 
 Would it be able for someone of your team to support us getting this to
 work? We have a bit of c knowledge here and perhaps would even be able
 to write it ourselves...if we only had a clue where to change what :-/

Cookie support has been added in the current development code. You can
get it from CVS; see the Development section on the web-site
(http://sunsite.dk/wget/). Of course with all the usual warnings about
using development code.
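
Until then you can fake a single, fixed session cookie with wget 1.6 by
sending the header yourself. A sketch (host, path and cookie value are
made up; servlet engines typically call the cookie something like
JSESSIONID):

$ wget --header='Cookie: JSESSIONID=ABC123' http://testserver:8080/app/page.jsp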

 
 Many thanks in advance,
 
 Michael Klaus
 
 --
 Michael Klaus
 Entwickler / IT-Consultant
 
 orgafactory gmbh
 Hügelstraße 8
 60435 Frankfurt am Main
 
 Telefon (0 69) 90 54 66 35
 Telefax (0 69) 90 54 66 13
 mailto:[EMAIL PROTECTED]

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: How to put user name and password using wget?

2001-04-18 Thread Hack Kampbjørn


N SHU wrote:
 
 One(version1.6) is OK in Unix, another(version1.5) in Dos(windows) has
 problem,

Well then this bug has been fixed (at least) in version 1.6. Is there
any reason you cannot update your wget to that version? There are links
to binaries on the web-site (http://sunsite.dk/wget/).


 The wget debug out are like below:
  in Windows has problem===
 DEBUG output created by Wget 1.5.0 on Windows.

This is the second bug-report (in this Easter week-end) from "Wget
1.5.0 on Windows". Does anyone know where this file comes from? Is there
any chance that we can encourage them to update to the current version
of Wget (version 1.6)?

 
 parseurl ("ftp://shu:[EMAIL PROTECTED]/1.html") - host
 astro12.phy.
 ornl.gov - opath 1.html - dir  - file 1.html - ndir
 --10:18:58--  ftp://shu:[EMAIL PROTECTED]:21/1.html
= `1.html'
 wget: Cannot determine user-id.
 
 While in Unix , it is OK==
 DEBUG output created by Wget 1.6 on osf4.0.
 
 parseurl ("ftp://shu:[EMAIL PROTECTED]/1.html") - host
 astro12.phy.
 ornl.gov - opath 1.html - dir  - file 1.html - ndir
 newpath: /1.html
 --10:21:21--  ftp://shu:[EMAIL PROTECTED]/1.html
= `1.html.4'
 Connecting to astro12.phy.ornl.gov:21... Created fd 3.
 connected!
 Logging in as shu ... 220 astro12.phy.ornl.gov FTP server (Digital UNIX
 Version
 5.60) ready.
 
 -- USER shu
 
 331 Password required for shu.
 
 -- PASS mypasswd
 
 230 User shu logged in.
 Logged in!
 == TYPE I ...
 -- TYPE I
 
 200 Type set to I.
 done.  == CWD not needed.
 == PORT ... Master socket fd 5 bound.
 
 -- PORT 134,167,21,90,9,115
 
 200 PORT command successful.
 done.== RETR 1.html ...
 -- RETR 1.html
 
 150 Opening BINARY mode data connection for 1.html (134.167.21.90,2419)
 (2220 by
 tes).
 done.
 Created socket fd 6.
 Length: 2,220 (unauthoritative)
 
 0K - .. [100%]
 
 Closing fd 6
 Closing fd 5
 226 Transfer complete.
 Closing fd 3
 10:21:21 (541.99 KB/s) - `1.html.4' saved [2220]
 =END==
 Thanks.
 
 From: Hack Kampbjørn [EMAIL PROTECTED]
 Reply-To: Wget List [EMAIL PROTECTED]
 To: N SHU [EMAIL PROTECTED]
 CC: [EMAIL PROTECTED]
 Subject: Re: How to put user name and password using wget?
 Date: Mon, 16 Apr 2001 22:19:23 +0200
 
 N SHU wrote:
  
   Dear sir,
   I don't know how to put username and passwd using wget.
   When I used: wget ftp://username:[EMAIL PROTECTED]/file,
   it said: Can't dermine user-id.
 
 This sounds like a bug. Please send the output of running Wget in debug
 mode:
 wget -d ftp://username:password@ftp
 
   Thanks.
   N.Shu.
  
 _
   Get Your Private, Free E-mail from MSN Hotmail at
 http://www.hotmail.com.
 
 --
 Med venlig hilsen / Kind regards
 
 Hack Kampbjørn   [EMAIL PROTECTED]
 HackLine +45 2031 7799
 
 _
 Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com.

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: How to put user name and password using wget?

2001-04-16 Thread Hack Kampbjørn

N SHU wrote:
 
 Dear sir,
 I don't know how to put username and passwd using wget.
 When I used: wget ftp://username:[EMAIL PROTECTED]/file,
 it said: Can't dermine user-id.

This sounds like a bug. Please send the output of running Wget in debug
mode:
wget -d ftp://username:password@ftp
 
 Thanks.
 N.Shu.
 _
 Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com.

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: unsubsribe -- fishlet@hotmail.com

2001-04-16 Thread Hack Kampbjørn

Take a look at the web-site (http://sunsite.dk/wget/); there are
directions on how to unsubscribe from the mailing-lists.

Wei Xiong wrote:
 
 _
 Get your FREE download of MSN Explorer at http://explorer.msn.com

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: Changing links to absolute.

2001-04-14 Thread Hack Kampbjørn

Rakhesh Sasidharan wrote:
 
 Hi,
 
 I don't know if wget is the tool for this, but I still ask.
 
 I need to mirror some sites for offline viewing.  I usally use "wget" to
 recursively suck parts of the web, but in some cases it does not work. For
 example, say I want to suck the whole site (www.imap.org) to
 /mirrors/www.imap.org.  This means I want all links in the file of the
 form /pics/something.jpg should be modified to point to
 /mirrors/www.imap.org/pics/something.jpg -- is there some way to do this
 automatically, or do I have to visit each file and do it manually ?  Or
 maybe I'm using the wrong program for mirroring ... could somebody help ?

--convert-links should do this, but it had some problems with these
hostless absolute links. I'm not sure if this has been fixed for all
cases, but you can track the discussion in the archives or, even better,
try the current CVS source (wget-1.7-dev) and report any problems you
find.

Instructions on how to get and compile the CVS source are on the web-site
(http://sunsite.dk/wget) under development.
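
For the layout you describe, something like this should come close (just
a sketch): -m turns on mirroring, -k is --convert-links so the mirrored
pages reference each other locally, and -P puts the www.imap.org
directory under /mirrors:

$ wget -m -k -P /mirrors http://www.imap.org/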

 
 Thanks,

You're welcome

 __
 Rakhesh Sasidharan  rakhesh at cse.iitd.ac.in

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: -r without effect

2001-04-10 Thread Hack Kampbjørn



Micha Meier wrote:
 
 On Mon, Apr 09, 2001 at 07:03:49PM +0200, Hack Kampbjørn wrote:
  What is reclevel set to (if any) in /etc/wgetrc?
  Try setting -l (or --level) see if it helps.
 
 reclevel is not set  in /etc/wgetrc, setting -l makes no difference.
 Strangely, for some URL's -r works and for others not. It could be that
 if wget thinks the URL is not an HTML document, it won't do the recursive
 lookup, but this would be a bug, IMHO, especially when -F works
 only with the -i option, not with an URL. But still, all this
 has worked before...?!

I dunno ... maybe you should send the output of running wget in debug
mode:
wget -d -r http://

 
 Cheers,
 
 --Micha

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: -r without effect

2001-04-09 Thread Hack Kampbjørn



Micha Meier wrote:
 
 I was using wget with SuSE 6.2. After upgrading to 7.1, wget refuses
 to search recursively, even in the same script that used to work before.
 It is the same wget version 1.5.3 as before, /etc/wgetrc is the same and
 I'm not using .wgetrc. Can someone tell me what else could have changed?
 I've also tried to compile 1.6 sources, but the result is the same: I say
 
 wget -r -nc http://...
 
 and wget says the file is already there and does not look at the recursive
 links.

What is reclevel set to (if any) in /etc/wgetrc?
Try setting -l (or --level) see if it helps.

 
 --Micha

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: Mirrorinf a web site with FTP

2001-03-17 Thread Hack Kampbjørn



John Vorstermans wrote:
 
 Hi.
 
 I have a problem which I cannot work out and wonder if anyone can point
 our what I am doing wrong.
 
 I wish to mirror a WEB site between to machines.  To do this I am useing:
 
  wget -m -l50 -L ftp://www-data:[EMAIL PROTECTED]
 
 It  logs in just fine but will only collect files in the / directory of
 the server.  I wish it to copy files from subsdirectories (as in a
 recursive copy which the -m should do).  Here is an example of the sort of
 error I am getting:
 
 =
 also.--14:10:25--
 ftp://www-data:[EMAIL PROTECTED]:21/%2Fcommunications
 /
= `xena.website.org/_stats/index.html'
 Connecting to xena.website.org:21... connected!
 Logging in as www-data ... Logged in!
 == TYPE I ... done.  == CWD /communications ...
 No such directory `/communications'.

Hmmm, this could be that you're not placed in the root directory. Older
wget expected to be placed in / and couldn't handle it when placed
somewhere else. This should be fixed in the current 1.7-dev version.

There has been some discussion here in the list about what is the right
thing to do and what wget should do with URLs like:
ftp://ftp.somehost.com/path
ftp://ftp.somehost.com//path
ftp://ftp.somehost.com/%2Fpath

Try using wget version 1.6 or, even better, the current development
version (1.7-dev). Look at the web-site (http://sunsite.dk/wget/) for
instructions. There are also links to mail archives of the list.
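
Applied to your case the three forms would look like this (password
elided); the first is relative to the directory you are placed in after
login, the last two are meant to force the path from the server's root:

ftp://www-data:PASSWORD@xena.website.org/communications/
ftp://www-data:PASSWORD@xena.website.org//communications/
ftp://www-data:PASSWORD@xena.website.org/%2Fcommunications/

How each form is interpreted is exactly what the discussion is about, so
1.6 and 1.7-dev may not treat them the same way.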

 
 --14:10:25--  ftp://www-data:[EMAIL PROTECTED]:21/%2Fcouncil/
= `xena.website.org/council/.listing'
 Connecting to xena.website.org.:21... connected!
 Logging in as www-data ... Logged in!
 == TYPE I ... done.  == CWD /council ...
 No such directory `/council'.
 
 ===
 
 The directories are present on the server I am copying from and even of I
 create them on the destination server I get the same errors.
 
 I am using FTP because the files on the server contain "Sever Side
 Includes" which get translated when using HTTP which is not what we want.
 
 Any advice would be most welcome.
 
 Thanks
 John
 
 --
 John Vorstermans||   [EMAIL PROTECTED]
 Serion E-Commerce Solutions   ||   Ph (021) 432-987
 New Zealand

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: WGet v1.7 - Problems with Serv-U Win32 FTP Server?

2001-03-05 Thread Hack Kampbjørn



GoTo wrote:
 
 Hi,
 
 perhaps anyone of you can explain me the following:
 
 I tried to WGet a file from my local FTP-Server - don't ask me why :-)
 
 WGet v1.7-dev 2001/02/16 (with ssl support) for Win32
 Serv-U v2.5e for Win32
 
 -- WGet Output Start -
 
 D:\wget -d http://guest:[EMAIL PROTECTED]:2100/
  
Have you tried to use FTP? It usually helps when talking to an FTP server
8-)
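
That is, something like this (user, password and port taken from your
example):

$ wget -d 'ftp://guest:password@127.0.0.1:2100/'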

 DEBUG output created by Wget 1.7-dev on Windows.
 
 parseurl ("http://guest:[EMAIL PROTECTED]:2100/") - host 127.0.0.1 - port
 2100 -
   opath  - dir  - file  - ndir
 newpath: /
 --18:23:35--  http://guest:password@127.0.0.1:2100/
 = `index.html.3'
 Connecting to 127.0.0.1:2100... Created fd 36.
 connected!
 ---request begin---
 GET / HTTP/1.0
 User-Agent: Wget/1.7-dev
 Host: 127.0.0.1:2100
 Accept: */*
 Connection: Keep-Alive
 Authorization: Basic Z3Vlc3Q6Z3Vlc3Q=
 
 ---request end---
 HTTP request sent, awaiting response... 220-Serv-U FTP-Server v2.5e for WinSock
 ready...
 
 Closing fd 36
 18:23:35 ERROR -1: Malformed status line.
 
 -- WGet Output End -
 
 So I tried the usual console FTP:
 
 - Console FTP Output Start ---
 
 D:\ftp
 Ftp open localhost 2100
 Verbindung mit arcticblue.
 220-Serv-U FTP-Server v2.5e for WinSock ready...
 220-Welcome to ArcticBlue.
 220-
 some more text here
 220-
 220-Have Fun!
 220-
 220
 Benutzer (arcticblue:(none)): guest
 331 User name okay, need password.
 Kennwort:
 230 User logged in, proceed.
 Ftp dir
 200 PORT Command successful.
 150 Opening ASCII mode data connection for /bin/ls.
 -rwxrwxrwx   1 user group2048 Mar  5 14:59 1.r00
 226 Transfer complete.
 Ftp: 62 Bytes empfangen in 0.00Sekunden 62000.00KB/Sek.
 Ftp
 
 - Console FTP Output End ---
 
 As you can see this worked perfectly.
 
 My first thought was the startup message, but then I tried another
 FTP-Server (ftp.leo.org) which also has a startup-msg and that worked fine
 with WGet.
 
 Any hints?
 
 bye
 GoTo

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Problem accessing CVS server ...

2001-03-03 Thread Hack Kampbjørn

[wget-website]$ cvs update -d
cvs [update aborted]: recv() from server sunsite.dk: EOF
[wget-website]$

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: FTP retrieval not functioning

2001-02-25 Thread Hack Kampbjørn

Chunks wrote:
 
 I am attempting to retrieve all subdirecties on a specific host. It appears that the 
host is not using a standard FTP service as using a standard (win32) FTP client 
(CuteFTP) I will not get a visible directory listing. However, when using FTP 
straight from the command line (Win2k machine), it works without any problems (ls, 
dir, get, all work fine.)
 
 I enabled debug, logged it, and it seems the wget is successfully viewing the 
directory tree, yet for some reason (permissions?) it is not recursively entering 
these directories. The error looks as follows:
 
0Feb-21-2001  14:37:20   xxx.yyy.zzz   DIR UNKOWN; perms 0;
 Skipping.
 
 and this is what appears for all directories in the tree. The user I am logging in 
with definitely has permissions (I am able to down load files manually or by telling 
wget to retrieve a specific file by name, just not all files.)
 
 The tail end of the log, if this helps, is as follows (IP's and file names changed 
=)  ):
 
 
 
0Feb-21-2001  14:37:26   xx.yyy.zzz   DIR UNKOWN; perms 0;
 Skipping.
  --15:20:10--  ftp://blah:[EMAIL PROTECTED]:21/blah/
= `10.10.10.10/blah/index.html'
 == CWD not required.
 == PASV ...
 -- PASV
 227 Entering Passive Mode 10,10,10,10,4,1
 Will try connecting to 10.10.10.10:1025.
 Created fd 412.
 done.== RETR  ...
 -- RETR
 501 "" is a directory, not a file
 
 No such file `'.
 
 Closing fd 412
 Closing fd 384
 
 FINISHED --15:20:10--
 Downloaded: 0 bytes in 0 files
 
 
 
 I did RTFM, and the links to any mailing list archives I could find were broken. 
Please accept my apologies in advance if this is something covered elsewhere. Perhaps 
ignoring permissions will take care of it?

How come, then, that you didn't find this message:
http://www.mail-archive.com/wget@sunsite.dk/msg00326.html
At least the last remark is relevant in your case: "And please, in the
future include debug output when reporting problems !!!". If you had
followed it I could tell whether you're dealing with MS's so-called "FTP
Server" or not.


 
 I am running GNU Wget 1.5.3.1, win32 compilation and have also tried wget 1.5.3 
linux compilation with identical results.

BTW the latest release version is 1.6. And the web-site is at
http://sunsite.dk/wget

 
 I appreciate any and all help,
 
 Kit

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: wget feature request: mail when complete

2001-02-18 Thread Hack Kampbjørn

"Mordechai T. Abzug" wrote:
 
 Sometimes, I run wget in background to download a file that will take
 hours or days to complete.  It would be handy to have an option for
 wget to send me mail when it's done, so I can fire and forget.
 
 Thanks!
 
 - Morty

wget comes from the *nix world, where utilities try to be good at one
or two things and rely on other utilities to be good at other things, so
that they don't bloat their code. E.g. wget is good at downloading files
from the internet using http and ftp; adding other protocols might be a
natural thing for wget to do. But for sending mail there are already a
lot of other utilities that are good at that.

And there are also a bunch of utilities that are good at making other
utilities cooperate and intercommunicate: those are the shells.

I use the bash shell, so if I wanted this feature I'd type something
like:

$ (wget -r -l 0 http://www.vigilante.com/ | mail -s "wget run completed" `id -un`) &

Arghhh, wget sends its output to STDERR. Well then, something like:

$ (wget -r -l 0 http://www.vigilante.com/ 2>&1 | mail -s "wget run completed" `id -un`) &

Or if I used it a lot, make a little script for it:

$ cat > ~/bin/bwget
#!/bin/bash
# Background wget: runs wget with the given arguments in a background
# subshell and mails its output to the invoking user when it finishes.
(wget "$@" 2>&1 | mail -s "wget run completed" `id -un`) &
^D
$ chmod 0700 ~/bin/bwget

Look at the documentation for the shell you use.


-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: can wget do POSTing?

2001-02-18 Thread Hack Kampbjørn

Cyrus Adkisson wrote:
 
 I'm trying to retrieve information from a website, but to get to the
 page I want, there is a form submission using the POST method. I've
 tried everything I know to do, including using a --header="POST /
 HTTP/1.0" parameter, but with all the errors I'm getting, I'm starting
 to come to the conclusing that wget is only capable of GET http
 requests. That would explain why it's called wget and not wpost, right?
 Am I correct in this assumption?
 
As you already found out, wget can only do GET.

 If so, does anyone have any ideas how I might retrieve the webpage
 beyond the POST form? I'd really appreciate any help you might have for
 me.


You can try and use the GET method anyway. Many web-scripts don't really
care which method you use (or about cookies). But I suppose you already
have tried that 8-(

Next you can use your browser and make the POST request there. Save the
resulting page. And then use the `--force-html' and `--input-file'
options to retrieve all the remaining pages.
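
A sketch of that last step (the file name and base URL are placeholders
for the page you saved and the site it came from; -B/--base makes
relative links in the saved file resolve against the site):

$ wget --force-html --base=http://www.example.com/ \
       --input-file=saved-from-browser.html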

If those pages also require POST to access them, then you could consider
adding support for this method in wget 8-)

 
 Cyrus

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



compiling --with-ssl under cygwin

2001-02-18 Thread Hack Kampbjørn

I tried to compile wget with the ssl support under cygwin.
$ make realclean
$ make -f Makefile.cvs
$ ./configure --with-ssl
$ make
But I got a bunch of undefined references like:
/usr/lib/libssl.a(ssl_lib.o)(.text+0x585):ssl_lib.c: undefined reference
to `BIO_s_socket'

If I change the Makefile to link with crypto first
LIBS = -lintl -lcrypto -lssl
then it compiles fine.

Now since Makefile is an autogenerated one, I looked at where to fix
this. After trying a couple of things I ended up changing the order of
the AC_CHECK_LIB checks for crypto and ssl in configure.in:

  AC_CHECK_LIB(crypto,main,,ssl_lose=yes)
  AC_CHECK_LIB(ssl,SSL_new,,ssl_lose=yes,-lcrypto)

I'm not sure that this is the right solution (AFAIK it might break things
on other platforms) or that it is fixed in the right place. So if someone
on the list with more knowledge about crypto and ssl can help here, I
will try it out.


-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: FTP directory listing

2001-02-15 Thread Hack Kampbjørn

Florian Fuessl wrote:
 
 Hi,
 
 why does a directory listing on: wget ftp://ftp.mcafee.com or any
 subdirectories of this server not work.
 Other FTP servers seem to work fine.

$ wget -d ftp://ftp.mcafee.com
DEBUG output created by Wget 1.6.1-dev on cygwin32.

parseurl ("ftp://ftp.mcafee.com") - host ftp.mcafee.com - ftp_type I
- opath
 - dir  - file  - ndir
newpath: /
Using `.listing' as listing tmp file.
--18:02:59--  ftp://ftp.mcafee.com/
   = `.listing'
Connecting to ftp.mcafee.com:21... Created fd 3.
connected!
Logging in as anonymous ... 220 sncwebftp2 Microsoft FTP Service
(Version 5.0).
[...]

Aha, it's a Microsoft so-called "FTP Server" with DOS dirstyle listing.
This listing format is unsupported in wget 1.6. You can:
 - ask ftp.mcafee.com to change their default to unix listing, which is
supported
 - patch the 1.6 code so that it issues a DIRSTYLE command at
connection
or
 - use the current 1.7-dev CVS code, where support for this has been
added. Look at the web-site http://sunsite.dk/wget for instructions on
how to get it.

And please, in the future include debug output when reporting problems
!!!

 
 Greetings from Bavaria,
 Florian Füssl

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: Strange wget behaviour in mirroring

2001-02-10 Thread Hack Kampbjørn

First of all. Please wrap your lines at column 72 !!!

[EMAIL PROTECTED] wrote:
 
 I am trying to set up an automated mirroring of a small part of the NAI ftp server.
 The small part is just the directory of the antivirus updates.
 I tried to download the directory by automated ftp scripts (something like mget *.*) 
but that means lots of waste, so I decided to try wget.
 I found the port to windows of the version 1.6 so I just put the executable in a 
directory and tried to mirror a directory from a local ftp server here: it was ok.
 Then I prepared for the real mirroring, I just typed "wget -S 
ftp://ftp.nai.com/path/" and I noticed that at the end of the log, wget was trying to 
download index.html while there was no reference to that file in the .listing, so I 
had a look at the whole log using the -d option (and the attached file is the result 
of that operation).

That's all you're asking for: the listing of files in that
directory. wget converts it to an html index.

You most likely want to do something like:
$ wget-dev -S -d -r -l 1 -np
ftp://ftp.nai.com/pub/antivirus/datfiles/4.x/*
or
$ wget-dev -S -d ftp://ftp.nai.com/pub/antivirus/datfiles/4.x/*

That's it: retrieve all files referenced in the directory listing of
/datfiles/ or retrieve all files in the /datfiles/ directory.

Note that there seems to be an FTP parsing bug in wget 1.6 wrt the MS FTP
server. This has been fixed in the 1.7-dev branch. Again, look at the
web-site for instructions (http://sunsite.dk/wget).

$ wget-dev ftp://ftp.nai.com/pub/antivirus/datfiles/4.x/*
--23:08:16--  ftp://ftp.nai.com/pub/antivirus/datfiles/4.x/*
   = `.listing'
Connecting to ftp.nai.com:21... connected!
Logging in as anonymous ... Logged in!
== SYST ... done.== PWD ... done.
== TYPE I ... done.  == CWD pub/antivirus/datfiles/4.x ... done.
== PORT ... done.== LIST ... done.

0K - .

23:08:25 (54.00 KB/s) - `.listing' saved [1106]

Removed `.listing'.
The sizes do not match (local 33580) -- retrieving.
--23:08:25--  ftp://ftp.nai.com/pub/antivirus/datfiles/4.x/41054106.upd
   = `41054106.upd'
== CWD not required.
== PORT ... done.== RETR 41054106.upd ... done.
Length: 109,312

0K - .. ...

 Now I don't know if this is a bug or if there is the usual simple and big mistake I 
am doing, so I thought to email someone... I tried to look for the mailing list 
archives on the web but it seems I found an old link, so I thought to write here.
 I must say that I also thought that it was a problem of the win32 port of wget so I 
downloaded the tarball and compiled in a redhat 6.2 machine here but the result is 
exactely the same.
 I also tried to download some files in the listing, to see if the error was because 
of read permission, and the files were downloaded correctly, and also the .listing is 
created correctly, so I don't really know what to do.
 I hope I am not making you waste lots of time and excuse me for this terrible 
english, and obviously thanks a lot.
 
 Emiliano
 
   
Name: wget-log
wget-logType: unspecified type (application/octet-stream)
Encoding: base64

-- 
Med venlig hilsen / Kind regards

Hack Kampbjørn   [EMAIL PROTECTED]
HackLine +45 2031 7799



Re: f/up bug

2001-02-10 Thread Hack Kampbjørn

Clayton Vernon wrote:
 
 Hack-
 
 While it now "seems" to parse correctly, and while it displays "TYPE A" in
 its download dialog, it does NOT actually download the file in ASCII format,
 but in binary.
 
 Clayton

I have been testing this against MS's so-called "FTP Server". And I too
get the same file no matter which type I use. I have tried files with
CRLF line endings and files with NL line endings.

Quoting from RFC959 
3.1.1.1.  ASCII TYPE

This is the default type and must be accepted by all FTP
implementations.  It is intended primarily for the transfer
of text files, except when both hosts would find the EBCDIC
type more convenient.

The sender converts the data from an internal character
representation to the standard 8-bit NVT-ASCII
representation (see the Telnet specification).  The receiver
will convert the data from the standard form to his own
internal form.

In accordance with the NVT standard, the CRLF sequence
should be used where necessary to denote the end of a line
of text.  (See the discussion of file structure at the end
of the Section on Data Representation and Storage.)

Using the standard NVT-ASCII representation means that data
must be interpreted as 8-bit bytes.

The Format parameter for ASCII and EBCDIC types is discussed
below.


There seem to be two problems here:
  1. MS FTP Server always sends the file in binary mode.
  2. wget always saves the file in binary mode.


First I try on a MS FTP Server (look at how the byte count is the same
in both cases, this is expected if the file has only CRLF line endings
but we have the same result with binary files)
$ wget-dev -S -d -O readme-A.htm
ftp://ftp.nai.com/pub/antivirus/readme.htm\;type=A
DEBUG output created by Wget 1.7-dev on cygwin32.
[...]
220 sncwebftp2 Microsoft FTP Service (Version 5.0).
[...]
200 PORT command successful.
-- RETR readme.htm

150 Opening ASCII mode data connection for readme.htm(592 bytes).
Created socket fd 6.
Length: 592

0K -[100%]

Closing fd 6
Closing fd 5
226 Transfer complete.
23:27:36 (8.26 KB/s) - `readme-A.htm' saved [592]

Closing fd 4

$ wget-dev -S -d -O readme-I.htm
ftp://ftp.nai.com/pub/antivirus/readme.htm\;type=I
DEBUG output created by Wget 1.7-dev on cygwin32.
[...]
220 sncwebftp2 Microsoft FTP Service (Version 5.0).
[...]
200 PORT command successful.
-- RETR readme.htm

150 Opening BINARY mode data connection for readme.htm(592 bytes).
Created socket fd 6.
Length: 592

0K -[100%]

Closing fd 6
Closing fd 5
226 Transfer complete.
23:30:15 (28.91 KB/s) - `readme-I.htm' saved [592]

Closing fd 4

Now on a Unix FTP server: (look at how the byte count is different in
this case)
$ wget -S -d -O wget-A.html
ftp://sunsite.dk/projects/wget/wget.html\;type=A
DEBUG output created by Wget 1.6.1-dev on cygwin32.
[...]
220 ProFTPD 1.2.0rc3 Server (SunSITE Denmark) [sunsite.dk]
[...]
200 PORT command successful.
-- RETR wget.html

150 Opening ASCII mode data connection for wget.html (5689 bytes).
Created socket fd 6.
Length: 5,689

0K - .  [102%]

Closing fd 6
Closing fd 5
226 Transfer complete.
23:50:32 (6.35 KB/s) - `wget-A.html' saved [5804]

]$ wget -S -d -O wget-I.html
ftp://sunsite.dk/projects/wget/wget.html\;type=I
DEBUG output created by Wget 1.6.1-dev on cygwin32.
[...]
220 ProFTPD 1.2.0rc3 Server (SunSITE Denmark) [sunsite.dk]
[...]
200 PORT command successful.
-- RETR wget.html

150 Opening BINARY mode data connection for wget.html (5689 bytes).
Created socket fd 6.
Length: 5,689

0K - .  [100%]

Closing fd 6
Closing fd 5
226 Transfer complete.
23:50:47 (3.91 KB/s) - `wget-I.html' saved [5689]

$ ls -als
total 13
   2 drwxrwxrwx   2 administ Administ 4096 Feb 11 00:01 .
   4 drwxrwxrwx  13 administ Administ 8192 Feb 10 20:33 ..
   0 -rw-rw-rw-   1 hack Administ  592 Feb 10 23:27 readme-A.htm
   0 -rw-rw-rw-   1 hack Administ  592 Feb 10 23:30 readme-I.htm
   3 -rw-rw-rw-   1 hack Administ 5804 Feb 10 23:50 wget-A.html
   3 -rw-rw-rw-   1 hack Administ 5689 Feb 10 23:50 wget-I.html
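
A quick way to check whether any end-of-line conversion took place is to
count the carriage-return bytes in each copy (just a sketch):

$ tr -cd '\r' < readme-A.htm | wc -c
$ tr -cd '\r' < readme-I.htm | wc -c

If both report the same number, the ASCII transfer converted nothing.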


 
 -Original Message-
 From: Hack Kampbjørn [mailto:[EMAIL PROTECTED]]
 Sent: Sunday, February 04, 2001 11:38 AM
 To: Clayton Vernon
 Cc: [EMAIL PROTECTED]
 Subject: Re: simple ?/bug
 
 Clayton Vernon wrote:
 
  Sirs:
 
  Pardon my naivete, but  I can't get the ASCII mode FTP to work because
  my shell thinks the ';' delimits the command.  I can't put the  entire
  arg to wget in quotes because it then thinks the ';type=a' is a part of
  the URL.
 
 
 And it is! wget will parse it correctly. Obs the debug output doesn't
 include the ftp 

Re: Design issue

2001-02-09 Thread Hack Kampbjørn

Herold Heiko wrote:
 
 I think the most straightforward mapping would also be the
 most attractive:
 
 ftp/site/dir/file
 http/site/dir/file
 
 Wget should certainly have an option to make it behave this
 way.  In fact,
 I'd prefer it to behave that way by default, for the reasons
 you mention,
 and introduce an option to leave off the protocol.
 
 
 I agree. What about https ?

What about servers answering on more than one port, like java.sun.com
used to do, where :80 had a java menu and :81 did not? (This is a bad
example, as it was mostly the same web-site.)

 The files could be either in a separate https directory (logically more
 correct) or reside in the http directory in order to minimize
 ../../../../dir/dir/dir/something url rewriting (since I suppose those
 pages could share lots of inline pics and other links with the http
 structure).
 
 Speaking of https, I got exactly one report (in private mail) of
 successfully testing of the windows ssl enabled binary, nothing else.
 
 Could you commit the patch as
 http://www.mail-archive.com/wget@sunsite.dk/msg00142.html ?
 The changes in gen_sslfunc.c could be needed anyway for other operating
 systems (the are mirrored from similar code in sysdep.h and http.c,
 although I just noticed a inconditional include of time.h in
 ftpparse.c), while the changes in the VC makefile are as default
 commented out.
 
 Heiko
 
 --
 -- PREVINET S.p.A.[EMAIL PROTECTED]
 -- Via Ferretto, 1 ph  x39-041-5907073
 -- I-31021 Mogliano V.to (TV) fax x39-041-5907087
 -- ITALY


Hack 8-)