can wget be used over a local file system?

2003-11-04 Thread Kelley Terry
I installed a new hard drive.  I want to save the web sites I downloaded with 
wget.  The links in the files were automatically rewritten to be relative, 
pointing from the web sites into the local file structure.  If I just copy the 
directories over to the new drive, I'll lose all the functionality of the HTML 
links.  I tried using wget to retrieve from the one drive, copy to the new 
drive, and convert the links, like this:

wget -r --convert-links /mnt/hda4/home/kb/www.uuhome.de
or
wget -r --convert-links file://mnt/hda4/home/kb/www.uuhome.de
etc.

All I can get is:

Unsupported scheme.

Is there a way to use wget to do this?


TIA 

Since I'm not on this list, please CC all responses to me:
Kelley Terry [EMAIL PROTECTED]


Re: can wget be used over a local file system?

2003-11-04 Thread Hrvoje Niksic
Kelley Terry [EMAIL PROTECTED] writes:

 I installed a new hard drive.  I want to save the web sites I
 downloaded with wget.  The links in the files were automatically
 rewritten to be relative, pointing from the web sites into the
 local file structure.  If I just copy the directories over to the
 new drive, I'll lose all the functionality of the HTML links.  I
 tried using wget to retrieve from the one drive, copy to the new
 drive, and convert the links, like this:
[...]

Wget doesn't support copying from the file system.  But there should
be no need to do that just to convert links -- it is perfectly OK to
just move the files, as long as you used `--convert-links' when you
downloaded the files from the web.
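To illustrate the point: since `--convert-links` made the links relative, a plain recursive copy preserves them.  A minimal sketch, where the paths and file contents are illustrative stand-ins for the poster's actual `/mnt/hda4` layout:

```shell
# A --convert-links mirror uses relative links, so copying the tree
# preserves them.  Paths here are illustrative, not the originals.
src=$(mktemp -d)/www.uuhome.de        # stands in for /mnt/hda4/home/kb/...
mkdir -p "$src/sub"
printf '<a href="sub/page.html">x</a>\n' > "$src/index.html"
printf 'hello\n' > "$src/sub/page.html"

dst=$(mktemp -d)                      # stands in for the new drive
cp -a "$src" "$dst/"                  # -a keeps times, modes, structure

# The relative link is untouched, so it still resolves in the new tree.
grep -q 'sub/page.html' "$dst/www.uuhome.de/index.html"
```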


Re: wget v1.9 (Windows port) newbie needs help in download files recursively...

2003-11-04 Thread Hrvoje Niksic
Herold Heiko [EMAIL PROTECTED] writes:

 From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]
  
 To get the stable sources that have this bug fixed, you might want to
 check out the head of the wget-1_9 branch in CVS.  Heiko, how about
 creating a bugfix 1.9 release for Windows?

 No problem with that, but wouldn't a dot release be better?

A dot release will happen anyway.  I'm waiting until more bug reports
arrive, so that the 1.9.1 code base can get as mature as possible.  A
dedicated 1.9 branch maintainer would help matters, but no one has
volunteered, so I must maintain both branches myself.  That doesn't
help.

 I'm not too comfortable with the idea of a windows binary based on
 the cvs sources for an already released version; confusion could
 easily arise due to different behaviour.

There is no difference in behavior.  The 1.9 branch contains *only*
bug fixes and possibly build changes.  Yet I understand your
reluctance.


gettext and charsets

2003-11-04 Thread Gisle Vanem
This should go to the gettext people, but I couldn't find any
mailing list.

I've built Wget with NLS support on Win-XP, but the display 
char-set is wrong. Built with LOCALEDIR=g:/MingW32/share

BTW, this is IMHO so ugly.  Shouldn't there be a way to
set this at runtime (as Lynx does)?  E.g. have a $WGET_LOCALEDIR
and call bindtextdomain() on that.  $LANGUAGE doesn't
seem to handle drive letters and ':' on the Win32 version of gettext.

But the main problem I can solve by e.g.
  wget -h | iconv -f ISO-8859-1 -t CP850

Isn't there a better way?

--gv




Re: gettext and charsets

2003-11-04 Thread Hrvoje Niksic
Gisle Vanem [EMAIL PROTECTED] writes:

 This should go to the gettext people, but I couldn't find any
 mailing list.

Perhaps you could try posting to [EMAIL PROTECTED]?  Failing
that, you might want to try at the address of the Free Translation
Project and/or the Norwegian national team near you.

I'm not sure about the charset issues on Windows.  Does gettext detect
the presence of GNU iconv?  (I assume you have the latter if you have
the `iconv' command.)

As for the LOCALEDIR, I am not against being able to change it at run
time.


Re: gettext and charsets

2003-11-04 Thread Gisle Vanem
Hrvoje Niksic [EMAIL PROTECTED] said:

 I'm not sure about the charset issues on Windows.  Does gettext detect
 the presence of GNU iconv?  (I assume you have the latter if you have
 the `iconv' command.)

libintl depends on libiconv:

  cygcheck wget.exe
  ...
  f:\windows\System32\libintl-2.dll
  f:\windows\System32\libiconv-2.dll

Browsing the sources, I found the answer:
  set OUTPUT_CHARSET=CP850

--gv



About termination of wget and -T option

2003-11-04 Thread Luigi Stefano Sona (lsona)
Hello,

I'm trying to use the -T option, as I have to download a file (the result
of a cgi) which is big, and very often I cannot download it within 15
minutes.

I read that the default is 15 minutes (900 secs), and used -T 1800 to
have a timeout of 30 minutes.
However it seems not to work, and the timeout expires anyway after 15
minutes.

Can you give me any suggestion?

Thanks

Luigi Sona


Re: About termination of wget and -T option

2003-11-04 Thread Hrvoje Niksic
Luigi Stefano Sona (lsona) [EMAIL PROTECTED] writes:

 I'm trying to use the -T option, as I have to download a file (result
 of a cgi) which is big, and very often I cannot download it within 15
 minutes.

The -T option times out only if *no data* is read in the designated
period, not if the whole file fails to download in that time.

 I read that the default is 15 minutes (900 secs), and used -T 1800 to
 have a timeout of 30 minutes.  However it seems not to work, and the
 timeout expires anyway after 15 minutes.

Could you post the debug output?


RE: About termination of wget and -T option

2003-11-04 Thread Luigi Stefano Sona (lsona)
How do I get debug output?

Is there any other way to have a total timeout longer than 15 minutes?

Thanks
Luigi



Re: About termination of wget and -T option

2003-11-04 Thread Hrvoje Niksic
Luigi Stefano Sona (lsona) [EMAIL PROTECTED] writes:

 How do I get debug output?

By using the `-d' option.

 Is there any other way to have a total timeout longer than 15
 minutes?

The `-T' option can be used to specify a longer timeout value.
However, in many cases, the timeout is not forced by Wget, but by the
operating system routines that implement networking.  In that case
Wget can only retry the retrieval -- which is what it's designed to
do.

The debug output ought to provide more insight into what Wget might be
doing.


RE: About termination of wget and -T option

2003-11-04 Thread Luigi Stefano Sona (lsona)
I'll try with -d and provide the output.
About -T, can you confirm that the timeout applies to the start of the
answer, not to the finish?

Thanks a lot.
Luigi



Re: About termination of wget and -T option

2003-11-04 Thread Hrvoje Niksic
Luigi Stefano Sona (lsona) [EMAIL PROTECTED] writes:

 About -T, can you confirm that the timeout applies to the start of
 the answer, not to the finish?

Almost.  In fact, the timeout applies whenever the download stalls,
at any point when Wget waits for data to arrive.  You can think of it
this way: Wget reads data in a loop like this one:

while data_pending:
  read_chunk_from_network
  write_chunk_to_disk

If a read_chunk_from_network step takes longer than the timeout (15
minutes by default), the download is interrupted (and retried).  But
the whole download can take as long as it takes.
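The loop above can be made concrete.  This is a hedged sketch of the same per-read timeout semantics using Python sockets, not Wget's actual code; the sizes and timings are illustrative:

```python
import socket
import threading
import time

def download(conn, timeout=0.2):
    """Read until EOF; socket.timeout is raised only if a single
    recv() stalls longer than `timeout`, mirroring wget -T."""
    conn.settimeout(timeout)
    chunks = []
    while True:
        chunk = conn.recv(4096)   # may time out if the peer stalls
        if not chunk:             # empty read means EOF
            break
        chunks.append(chunk)
    return b"".join(chunks)

# A peer that trickles data: each gap stays under the timeout, even
# though the whole transfer takes longer than the timeout itself.
a, b = socket.socketpair()
def trickle():
    for _ in range(3):
        time.sleep(0.1)
        a.sendall(b"x" * 10)
    a.close()
threading.Thread(target=trickle).start()

# Succeeds: no single gap exceeds 0.2 s, even though the total does.
data = download(b, timeout=0.2)
```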



Re: Cookie options

2003-11-04 Thread Nicolas Schodet
* Hrvoje Niksic [EMAIL PROTECTED] [031101 01:25]:
 Nicolas, I started merging your patch for saving session cookies and
 need some advice.  The patch adds two options:

 [...]

 Any thoughts on this?

Actually, the more useful option is the one which allows keeping session
cookies, so that the web server treats several wget instances as the
same client.

The second option is useful when a server gives out a cookie to
authenticate you on the website for, say, one week.  This option
enables wget to always read the same cookie file to access a site.
However, this could cause weird behaviour if the data associated with
the cookie has been deleted on the server.

Your idea about an all-in-one option is quite interesting.

Nicolas.

-- 
I can trust your computer.
http://www.gnu.org/philosophy/can-you-trust.fr.html


Re: Cookie options

2003-11-04 Thread Hrvoje Niksic
Nicolas Schodet [EMAIL PROTECTED] writes:

 * Hrvoje Niksic [EMAIL PROTECTED] [031101 01:25]:
 Nicolas, I started merging your patch for saving session cookies and
 need some advice.  The patch adds two options:

 [...]

 Any thoughts on this?

 Actually, the more useful option is the one which allows keeping
 session cookies, so that the web server treats several wget
 instances as the same client.

Agreed.  For starters, let's add the `--keep-session-cookies' option.
If we get reports asking for special treatment of expired cookies, we
can resurrect the other one or incorporate it in a more general
keep-cookies or something.

Interestingly enough, curl has a `--junk-session-cookies', which
indicates that it keeps them by default (?).  Daniel, are you still
listening?  :-)
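The distinction under discussion can be illustrated with Python's standard-library cookie jar, whose `ignore_discard` flag plays the same role as the proposed `--keep-session-cookies`.  A sketch of the mechanics only, not Wget code; names and values are made up:

```python
import http.cookiejar
import os
import tempfile

def make_session_cookie(name, value, domain):
    # expires=None and discard=True are what mark a session cookie.
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={})

jar = http.cookiejar.MozillaCookieJar()
jar.set_cookie(make_session_cookie("SESSID", "abc123", "www.example.com"))

path = os.path.join(tempfile.mkdtemp(), "cookies.txt")
jar.save(path)                       # default: session cookies are dropped
dropped = open(path).read()
jar.save(path, ignore_discard=True)  # analogue of --keep-session-cookies
kept = open(path).read()
```

With the default save, the session cookie never reaches the file, so a second client instance loading the jar starts a fresh session; with `ignore_discard=True` it persists and the server sees one continuing client.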



Re: Cookie options

2003-11-04 Thread Nicolas Schodet
* Hrvoje Niksic [EMAIL PROTECTED] [031104 23:06]:
  Nicolas, I started merging your patch for saving session cookies and
  need some advice.  The patch adds two options:
  Any thoughts on this?
  Actually, the more useful option is the one which allows keeping
  session cookies, so that the web server treats several wget
  instances as the same client.
 Agreed.  For starters, let's add the `--keep-session-cookies' option.
 If we get reports asking for special treatment of expired cookies, we
 can resurrect the other one or incorporate it in a more general
 keep-cookies or something.

That's OK with me; I only use the --keep-session-cookies option.

Nicolas.

-- 
Help defend the right to write software, free or not:
http://www.gnu.org/philosophy/protecting.fr.html


The patch list

2003-11-04 Thread Hrvoje Niksic
I'm curious... is anyone using the patch list to track development?
I'm posting all my changes to that list, and sometimes it feels a lot
like talking to myself.  :-)


Factoid of the day: did you know that, as of this writing, `.wgetrc'
supports exactly 100 different options?



Re: Time Stamping and Daylight Savings Time

2003-11-04 Thread Fred Holmes
I am and have been using NTFS since the installation of the OS, on a brand 
new machine.

At 05:40 PM 11/4/2003, Gisle Vanem wrote:
Fred Holmes [EMAIL PROTECTED] said:

 OTOH, if anyone knows how to make Windows stop changing the time stamps,
 that would be even better.
You're using a FAT filesystem?  Convert to NTFS; it stores file times
in UTC (as a 64-bit count of 100-nanosecond intervals since 1 January 1601).
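Converting between that NTFS representation and Unix time is a fixed-offset affair; a sketch in Python, where the constant is the 1601-to-1970 gap in seconds:

```python
# Seconds between 1601-01-01 and 1970-01-01, in 100-ns ticks.
EPOCH_DELTA_100NS = 11_644_473_600 * 10_000_000

def filetime_to_unix(ft):
    """Convert a Windows FILETIME (100-ns ticks since 1601, UTC)
    to Unix seconds since 1970, UTC."""
    return (ft - EPOCH_DELTA_100NS) / 10_000_000

def unix_to_filetime(ts):
    """Inverse conversion, back to 100-ns ticks since 1601."""
    return int(ts * 10_000_000) + EPOCH_DELTA_100NS
```

Because both representations are UTC, no daylight-saving logic is involved, which is exactly why NTFS avoids the DST time-stamp shifts being discussed.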
--gv



FTP time-stamping and time zones

2003-11-04 Thread Jochen Hayek
That other thread, Time Stamping and Daylight Savings Time,
reminded me of an issue that I have been carrying around with me
for quite a while, and that I thought would be worth clarifying
and maybe also sorting out.

For a customer of mine I am keeping FTP mirrors of data files;
the customer is located in Europe, and the data files are fetched
in the U.S., let's say New England.

As there is no proper, standard way to obtain the exact time stamp
of a remote file through FTP, the respective directory listing is
parsed instead.

But as FTP directory listings do not include the time zone the
files live in, (I think) wget just assumes the local time zone
is identical to the remote one.  Am I right about this?

As long as this assumption is applied consistently, this is
probably the best that can be done.

But it leads to a situation where wget creates files with time
stamps much older than the files actually are.  E.g. let's assume
we now have 07:00 in Europe and 01:00 in New England, and the file
we are going to retrieve from New England to Europe has a
New England local time stamp of 00:00.

So when wget retrieves the file from New England to Europe,
it is still time-stamped by wget as 00:00, but that should rather
be a European local 06:00.

A couple of hours later *I* am going to be asked this: we can see
from the log files that wget only retrieved the files at 07:00
European local time, although they apparently were already ready
to be picked up at 00:00.  Why is that?

I wouldn't mind making wget believe (maybe by setting the
environment variable TZ) that it actually lives in New England,
although it lives here in Europe.

Would that be a reasonable approach, or rather nonsense?
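The six-hour skew in that example can be reproduced with Python's zoneinfo; the specific zones are illustrative stand-ins for "Europe" and "New England", and this only models the arithmetic, not what wget does internally:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# The naive timestamp as parsed from the FTP listing: 00:00, no zone.
listing_time = datetime(2003, 11, 4, 0, 0)

# Interpreted in the remote zone (correct) vs. the local zone
# (the assumed wget behaviour described above).
as_remote = listing_time.replace(tzinfo=ZoneInfo("America/New_York"))
as_local = listing_time.replace(tzinfo=ZoneInfo("Europe/Berlin"))

# Negative: the local interpretation makes the file look 6 hours older.
skew_hours = (as_local - as_remote).total_seconds() / 3600
```

Running wget with `TZ=America/New_York` in the environment would, under the same assumption, make both interpretations coincide, which is the approach the question proposes.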

JH


Re: The patch list

2003-11-04 Thread Tony Lewis
Hrvoje Niksic wrote:


 I'm curious... is anyone using the patch list to track development?
 I'm posting all my changes to that list, and sometimes it feels a lot
 like talking to myself.  :-)

I read the introductory stuff to see what's changed, but I never extract the
patches from the messages. From my perspective, the introductory stuff plus
a list of affected files would be sufficient.

Tony



Re: Time Stamping and Daylight Savings Time

2003-11-04 Thread Fred Holmes
At 07:24 PM 11/4/2003, Hrvoje Niksic wrote:
It continues to amaze me how many people use Wget on Windows.  Anyway,
thanks for the detailed bug report.
I would love to learn Linux and a whole bunch of computer stuff, but there 
are only so many hours in a day.  I'm not an IT guy, just a worker who has 
to learn the computer for himself and figure out the most efficient way to get 
stuff done, where efficiency includes the cost of capital and learning curves 
as well.  Many thanks to all who contribute to a very fine product.  I had 
messed with a couple of GUI site-snagging programs and found them lacking, and 
asked for a better recommendation on a local discussion list (WAMU 
ComputerGuys).  A gal by the name of Vicky Staubly recommended WGET, and 
the rest, as they say, is history.

v/r

Fred Holmes 



Re: Time Stamping and Daylight Savings Time

2003-11-04 Thread Fred Holmes
At 07:24 PM 11/4/2003, Hrvoje Niksic wrote:
Until then, if old files really never change, could you simply use
`-nc'?
Yes, that will do it quite nicely.  I missed that one.  I'll try it 
tomorrow, but a simple condition like that should work well.

Thanks for your help.

Fred Holmes