Re: Feedback feature request

2002-04-22 Thread Scott . Simpson

Thanks for the pointer.  It looks like a good product, but unfortunately 
not what I'm after.  I really need it to have no dependencies under win32, 
like wget, so I can just drop the exe in and make it go.

Scott






Alan E [EMAIL PROTECTED]
22/04/2002 17:06

 
To: [EMAIL PROTECTED]
cc: [EMAIL PROTECTED], (bcc: Mail Administrator/Newcastle/Computer Systems 
Australia), (bcc: )
Subject:Re: Feedback  feature request


On Mon, Apr 22, 2002 at 04:07:10PM +1000, [EMAIL PROTECTED] 
wrote:
definition download points onto LAN servers.  What I would like to see in 

the software is a switch to allow destructive mirroring of an ftp site, 
where files that no longer exist on the server are deleted from the 
download target directory.  I would appreciate if this type of feature 
could be included in a release somewhere in the future.

You've just described the original mirror.pl (a PITA to set up), or the
python based emirror, which is easy, and can produce all sorts of nice
html logs and such if you want.

See the emirror project on sourceforge.
-- 
AlanE
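
The "destructive mirroring" requested above amounts to a pruning pass after the download: compare the local tree with the list of files still on the server and delete whatever is no longer there. A minimal Python sketch of that pass follows. It assumes the remote listing has already been collected as a set of relative paths. The name `prune_local` and its parameters are hypothetical, and nothing here reflects how wget, mirror.pl, or emirror implement the feature.

```python
import os


def prune_local(target_dir, remote_paths):
    """Delete files under target_dir that are absent from remote_paths.

    remote_paths is an iterable of '/'-separated paths relative to
    target_dir (an assumption of this sketch). Returns the sorted list
    of relative paths that were removed.
    """
    keep = set(remote_paths)
    removed = []
    # Walk bottom-up so directories emptied by the pruning can be removed.
    for root, dirs, files in os.walk(target_dir, topdown=False):
        for name in files:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, target_dir).replace(os.sep, "/")
            if rel not in keep:
                os.remove(full)
                removed.append(rel)
        # Drop subdirectories that are now empty; never the target itself.
        if root != target_dir and not os.listdir(root):
            os.rmdir(root)
    return sorted(removed)
```

Calling `prune_local("mirror", {"a.dat", "sub/b.dat"})` would leave only those two files under `mirror`. A real tool would probably want a dry-run mode before deleting anything.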





---
The information transmitted is intended only for the person or entity to 
which it is addressed and may contain confidential and/or privileged 
material.  Any review, retransmission, dissemination or other use of, or 
taking of any action in reliance upon, this information by persons or 
entities other than the intended recipient is prohibited.   If you 
received this in error, please contact the sender and delete the material 
from any computer.



RE: Feedback feature request

2002-04-22 Thread Herold Heiko

Maybe you didn't know there are (at least two) ways to compile perl on w32
to an independent executable: perl2exe and a tool made by ActiveState. Both
are commercial.

Heiko Herold

-- 
-- PREVINET S.p.A.            [EMAIL PROTECTED]
-- Via Ferretto, 1            ph  x39-041-5907073
-- I-31021 Mogliano V.to (TV) fax x39-041-5907472
-- ITALY




spaces in file names

2002-04-22 Thread Jamie Zawinski

When a URL path component contains a space, is wget supposed to create
a file with a space in it, or not?  

Regardless of the answer to that question, I can't imagine how the
following behavior could be correct:

--04:40:07--  http://www.vieuxmac.com/DOWNLOAD/SYSTEM%20SOFTWARE/System%20Software%206.0.5/S6.0.5%20-%20F6.1.5%20-%20F.sit.bin
           => `System%20Software%206.0.5/S6.0.5 - F6.1.5 - F.sit.bin'

Note that the URL contained both directories and files with spaces
in them, and that it escaped all spaces as %20.

Note that wget created *directories* with literal %20 in them, but
created *files* with literal spaces.  Both can't be right.

wget 1.8.  Launched like so:
wget -m -np -nH --cut-dirs=2 http://www.vieuxmac.com/DOWNLOAD/SYSTEM%20SOFTWARE/
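
Whether wget should decode %20 at all is the open question here, but whichever answer is chosen should apply to directories and files alike. The Python sketch below shows one consistent option: decode every path component the same way. The function `local_path` and its `cut_dirs` parameter are hypothetical stand-ins that only imitate the effect of `--cut-dirs`. This is not wget's code.

```python
from urllib.parse import unquote, urlsplit


def local_path(url, cut_dirs=0):
    """Map a URL to a relative local path, decoding every component alike.

    Assumes the URL names a file (it does not end in '/').
    """
    # Split the path into components and drop empty ones (leading slash).
    parts = [p for p in urlsplit(url).path.split("/") if p]
    dirs, filename = parts[:-1], parts[-1]
    # Imitate --cut-dirs by dropping that many leading directories.
    dirs = dirs[cut_dirs:]
    # Decode %XX escapes in directories and in the file name the same way.
    return "/".join(unquote(p) for p in dirs + [filename])


url = ("http://www.vieuxmac.com/DOWNLOAD/SYSTEM%20SOFTWARE/"
       "System%20Software%206.0.5/S6.0.5%20-%20F6.1.5%20-%20F.sit.bin")
print(local_path(url, cut_dirs=2))
# System Software 6.0.5/S6.0.5 - F6.1.5 - F.sit.bin
```

The opposite choice, leaving %20 untouched everywhere, would be equally consistent. The reported output mixes the two.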

-- 
Jamie Zawinski
[EMAIL PROTECTED] http://www.jwz.org/
[EMAIL PROTECTED]   http://www.dnalounge.com/



Re: apache irritations

2002-04-22 Thread Maciej W. Rozycki

On Mon, 22 Apr 2002, Jamie Zawinski wrote:

 I know this would be somewhat evil, but can we have a special case in
 wget to assume that files named ?N=D and index.html?N=D are the same
 as index.html?  I'm tired of those dumb apache sorting directives
 showing up in my mirrors as if they were real files...

 How about using the -R option of wget?  A brief test proves -R
'*\?[A-Z]=[A-Z]' works as it should. 
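
The effect of that pattern can be approximated in Python with the standard `fnmatch` module. This is only a sketch of shell-style matching, not of wget's own matcher. Python's `fnmatch` has no backslash escape, so the literal question mark written as `\?` in the shell pattern is spelled `[?]` below. The helper `keep_names` and the sample listing are made up for illustration.

```python
from fnmatch import fnmatchcase

# [?] is a literal question mark; it stands in for the shell's \? here.
PATTERN = "*[?][A-Z]=[A-Z]"


def keep_names(names, pattern=PATTERN):
    """Return the names that do not match the reject pattern."""
    return [n for n in names if not fnmatchcase(n, pattern)]


listing = ["index.html", "index.html?N=D", "?M=A", "?S=A", "notes.txt"]
print(keep_names(listing))  # ['index.html', 'notes.txt']
```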

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+




Re: Validating cookie domains

2002-04-22 Thread Hrvoje Niksic

Ian Abbott [EMAIL PROTECTED] writes:

 I realized it was stupid after I posted it (I was about to leave!)
 when I remembered cc domains like .de don't need an extra period. I
 thought maybe a table of exceptions would sort that out

The problem is that new domains appear all the time, and policies
change.  Any static table is doomed to fail miserably.

 However, that doesn't work for your .fr example.

Nothing works for the .fr example.  :-)

 Ye gods!  If it was just a reflection of the common 3-letter TLDs
 such as .com, that would be a reasonable thing to check for.

That's what I eventually implemented.



Re: apache irritations

2002-04-22 Thread Hrvoje Niksic

Maciej W. Rozycki [EMAIL PROTECTED] writes:

 On Mon, 22 Apr 2002, Jamie Zawinski wrote:

 I know this would be somewhat evil, but can we have a special case in
 wget to assume that files named ?N=D and index.html?N=D are the same
 as index.html?  I'm tired of those dumb apache sorting directives
 showing up in my mirrors as if they were real files...

  How about using the -R option of wget?  A brief test proves -R
 '*\?[A-Z]=[A-Z]' works as it should.

Or maybe the default system wgetrc should ship with something like:

reject = *?[A-Z]=[A-Z]

Adding new reject patterns will correctly append to this.  If the user
wanted to nullify that in his `.wgetrc', he'd need to set `reject' to
empty string.



Re: apache irritations

2002-04-22 Thread Maciej W. Rozycki

On Mon, 22 Apr 2002, Hrvoje Niksic wrote:

   How about using the -R option of wget?  A brief test proves -R
  '*\?[A-Z]=[A-Z]' works as it should.
 
 Or maybe the default system wgetrc should ship with something like:
 
 reject = *?[A-Z]=[A-Z]

 Note the difference between strings! -- the backslash before the
quotation mark is essential as otherwise it's a glob character. 
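
The difference between the two strings can be seen with Python's `fnmatch`, used here as a stand-in for shell globbing. As an assumption of this sketch, the escaped `\?` is written `[?]` because Python's `fnmatch` does not treat a backslash as an escape. How wget parses the value in a wgetrc file is not modelled. The test names are invented.

```python
from fnmatch import fnmatchcase

ESCAPED = "*[?][A-Z]=[A-Z]"   # literal '?', like the shell pattern *\?[A-Z]=[A-Z]
UNESCAPED = "*?[A-Z]=[A-Z]"   # '?' is a wildcard for any single character

for name in ("index.html?N=D", "fooN=D", "x-A=B"):
    print(name, fnmatchcase(name, ESCAPED), fnmatchcase(name, UNESCAPED))
# index.html?N=D True True
# fooN=D False True
# x-A=B False True
```

With the unescaped form, any name ending in an uppercase letter, an equals sign, and another uppercase letter is rejected as long as at least one character precedes them.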

 Adding new reject patterns will correctly append to this.  If the user
 wanted to nullify that in his `.wgetrc', he'd need to set `reject' to
 empty string.

 Well, I don't think it's sane but adding a *commented-out* reject line
with an appropriate annotation to the default system wgetrc looks like a
good idea to me.

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+




ScanMail Message: To Recipient virus found or matched file blocking setting.

2002-04-22 Thread System Attendant

ScanMail for Microsoft Exchange has taken action on the message, please
refer to the contents of this message for further details.

Sender = [EMAIL PROTECTED]
Recipient(s) = [EMAIL PROTECTED];
Subject = CELLSPACING
Scanning Time = 04/22/2002 18:06:07
Engine/Pattern = 6.150-1001/267

Action on message:
The attachment border.bat matched file blocking settings. ScanMail has taken
the Deleted action. 

An attachment has been blocked which is classified as dangerous or a Virus
has been found in the mail received by you. The sender of this mail was
automatically informed. Among the attachments classified as dangerous are
all executable files like *.exe, *.bat, *.com, *.cmd, *.pif, *.scr. If you
need to send or receive such an attachment you should compress it first into
a *.zip archive by using Winzip.



ScanMail Message: To Recipient virus found or matched file blocking setting.

2002-04-22 Thread System Attendant

ScanMail for Microsoft Exchange has taken action on the message, please
refer to the contents of this message for further details.

Sender = [EMAIL PROTECTED]
Recipient(s) = [EMAIL PROTECTED];
Subject = How are you
Scanning Time = 04/22/2002 18:07:56
Engine/Pattern = 6.150-1001/267

Action on message:
The attachment WHAT1.exe matched file blocking settings. ScanMail has taken
the Deleted action. 

An attachment has been blocked which is classified as dangerous or a Virus
has been found in the mail received by you. The sender of this mail was
automatically informed. Among the attachments classified as dangerous are
all executable files like *.exe, *.bat, *.com, *.cmd, *.pif, *.scr. If you
need to send or receive such an attachment you should compress it first into
a *.zip archive by using Winzip.



Re: apache irritations

2002-04-22 Thread csaba . raduly


On 22/04/2002 16:38:15 Maciej W. Rozycki wrote:

On Mon, 22 Apr 2002, Hrvoje Niksic wrote:

   How about using the -R option of wget?  A brief test proves -R
  '*\?[A-Z]=[A-Z]' works as it should.

 Or maybe the default system wgetrc should ship with something like:

 reject = *?[A-Z]=[A-Z]

Note the difference between strings! -- the backslash before the
quotation mark is essential as otherwise it's a glob character.


[A-Z] is a bit extreme, IMHO. How about

reject = *\?[NMSD]=[AD]
          ^^ literal '?' needed here
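
The trade-off between the broad and the narrow pattern can be sketched in Python with `fnmatch` (again writing the literal question mark as `[?]`, since Python's `fnmatch` has no backslash escape). The names `report?X=Y` and `?C=N;O=D` are assumed examples: the first is an unrelated query that happens to fit the broad pattern, the second a differently shaped sort link that neither pattern catches.

```python
from fnmatch import fnmatchcase

BROAD = "*[?][A-Z]=[A-Z]"
NARROW = "*[?][NMSD]=[AD]"

for name in ("index.html?S=A", "report?X=Y", "?C=N;O=D"):
    print(name, fnmatchcase(name, BROAD), fnmatchcase(name, NARROW))
# index.html?S=A True True
# report?X=Y True False
# ?C=N;O=D False False
```

The narrow pattern avoids rejecting unrelated one-letter queries, but both depend on the exact shape of the links the server generates.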



Well, I don't think it's sane but adding a *commented-out* reject line
with an appropriate annotation to the default system wgetrc looks like a
good idea to me.


A good idea.

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: apache irritations

2002-04-22 Thread Maciej W. Rozycki

On Mon, 22 Apr 2002 [EMAIL PROTECTED] wrote:

  reject = *?[A-Z]=[A-Z]
 
 Note the difference between strings! -- the backslash before the
 quotation mark is essential as otherwise it's a glob character.
 
 [A-Z] is a bit extreme, IMHO. How about
 
 reject = *\?[NMSD]=[AD]

 Hmm, it's too fragile in my opinion.  What if a new version of Apache
defines a new format? 

   ^^ literal '?' needed here

 Exactly -- I've meant the question mark above, of course. 

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+




Re: apache irritations

2002-04-22 Thread Tony Lewis

Maciej W. Rozycki wrote:

  Hmm, it's too fragile in my opinion.  What if a new version of Apache
 defines a new format?

I think all of the expressions proposed thus far are too fragile. Consider
the following URL:

http://www.google.com/search?num=100&q=%2Bwget+-GNU

The regular expression needs to account for multiple arguments separated by
ampersands. It also needs to account for any valid URI character between an
equal sign and either end of string or an ampersand.

I'm not fluent enough in regular expressions to compose one myself. (Some
day I'll absorb all of Friedl's Mastering Regular Expressions, but not
today.)

Tony




Re: apache irritations

2002-04-22 Thread Maciej W. Rozycki

On Mon, 22 Apr 2002, Tony Lewis wrote:

 I think all of the expressions proposed thus far are too fragile. Consider
 the following URL:
 
 http://www.google.com/search?num=100&q=%2Bwget+-GNU
 
 The regular expression needs to account for multiple arguments separated by
 ampersands. It also needs to account for any valid URI character between an
 equal sign and either end of string or an ampersand.

 I'm not sure what you are referring to.  We are discussing a common
problem with the static pages that Apache generates by default as index.html
objects for filesystem directories on the server that provide no default page.
Any dynamic content should probably be protected by robots.txt and
otherwise dealt with by the user, depending on the content.

 BTW, wget's accept/reject rules are not regular expressions but simple
shell globbing patterns. 

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--+
+e-mail: [EMAIL PROTECTED], PGP key available+




segmentation fault on bad url

2002-04-22 Thread Renaud Saliou

Hi,

  wget -t 3 -d -r -l 3 -H --random-wait -nd --delete-after 
-A.jpg,.gif,.zip,.png,.pdf http://http://www.microsoft.com 

DEBUG output created by Wget 1.8.1 on linux-gnu.

zsh: segmentation fault  wget -t 3 -d -r -l 3 -H --random-wait -nd 
--delete-after

And that's all.

-- 
SALIOU Renaud  NoRSfall (icq: 61340098)
[EMAIL PROTECTED] 06.99.75.50.30




Re: apache irritations

2002-04-22 Thread Hrvoje Niksic

Tony Lewis [EMAIL PROTECTED] writes:

 Maciej W. Rozycki wrote:

  Hmm, it's too fragile in my opinion.  What if a new version of Apache
 defines a new format?

 I think all of the expressions proposed thus far are too fragile. Consider
 the following URL:

 http://www.google.com/search?num=100&q=%2Bwget+-GNU

That URL will not match the proposed pattern.

As Maciej said, Wget's reject feature implements shell-style
patterns that are much simpler than regexps.  Also, they always match
the entire string by default.
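
Both points can be illustrated with Python's `fnmatch`, which implements comparable shell-style patterns. This is a sketch of the matching semantics only: which part of the URL wget compares against the pattern is not modelled, and `[?]` replaces the shell's `\?` because Python's `fnmatch` has no backslash escape. The Google URL is written with the `&` separator that Tony's message describes.

```python
import re
from fnmatch import fnmatchcase, translate

PATTERN = "*[?][A-Z]=[A-Z]"
GOOGLE_URL = "http://www.google.com/search?num=100&q=%2Bwget+-GNU"

# translate() turns the glob into an equivalent regex anchored at the end.
glob_as_regex = re.compile(translate(PATTERN))

print(fnmatchcase(GOOGLE_URL, PATTERN))        # False
print(bool(glob_as_regex.match(GOOGLE_URL)))   # False
print(fnmatchcase("index.html?N=D", PATTERN))  # True

# Without the leading '*', the glob must cover the whole name, whereas an
# unanchored regex search is satisfied by a substring.
print(fnmatchcase("index.html?N=D", "[?][A-Z]=[A-Z]"))       # False
print(bool(re.search(r"[?][A-Z]=[A-Z]", "index.html?N=D")))  # True
```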



Re: apache irritations

2002-04-22 Thread Tony Lewis

Maciej W. Rozycki wrote:

  I'm not sure what you are referring to.  We are discussing a common
 problem with static pages generated by default by Apache as index.html
 objects for server's filesystem directories providing no default page.

Really? The original posting from Jamie Zawinski said:

 I know this would be somewhat evil, but can we have a special case in
 wget to assume that files named ?N=D and index.html?N=D are the same
 as index.html?  I'm tired of those dumb apache sorting directives
 showing up in my mirrors as if they were real files...

I understood the question to be about URLs containing query strings (which
Jamie called sorting directives) showing up as separate files. I thought the
discussion was related to that topic. Maybe it diverged from that later in
the chain and I missed the change of topic.

I think what Jamie wants is one copy of index.html no matter how many links
of the form index.html?N=D appear.
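
One way to get a single copy, as opposed to rejecting the links outright, is to canonicalise URLs before they are queued. The Python sketch below strips a query string only when it looks like one of the sort directives mentioned in this thread. `canonical`, `unique_urls`, and the example.com URLs are hypothetical, and wget is not known to work this way.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Sort queries as they appear in this thread: one of N, M, S, D, then A or D.
SORT_QUERY = re.compile(r"[NMSD]=[AD]")


def canonical(url):
    """Strip an Apache-style sort query so all variants map to one URL."""
    parts = urlsplit(url)
    if SORT_QUERY.fullmatch(parts.query):
        parts = parts._replace(query="")
    return urlunsplit(parts)


def unique_urls(urls):
    """Return canonical URLs in first-seen order, without repeats."""
    seen = []
    for u in urls:
        c = canonical(u)
        if c not in seen:
            seen.append(c)
    return seen


print(unique_urls([
    "http://example.com/dir/index.html",
    "http://example.com/dir/index.html?N=D",
    "http://example.com/dir/index.html?M=A",
    "http://example.com/search?num=100&q=x",
]))
# ['http://example.com/dir/index.html', 'http://example.com/search?num=100&q=x']
```

Real query strings such as the search URL are left alone. Treating `dir/` and `dir/index.html` as the same page is a separate decision that this sketch does not make.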

  BTW, wget's accept/reject rules are not regular expressions but simple
 shell globbing patterns.

OK.

Tony




Wget in Windows Filename Saving Problem

2002-04-22 Thread Jeff Creamer



To whom it may concern:

Wget works great, except when you follow a site that has a ? in the URL for
the query string: Windows cannot save a file with a ? in its name. What can
we do to correct this problem?
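
A common workaround is to map the characters Windows forbids in file names to something harmless before saving. The Python sketch below shows the idea. The function name `windows_safe` and the choice of `@` as the replacement character are assumptions made for illustration, not a description of what wget does.

```python
# Characters Windows does not allow in file names.
WINDOWS_FORBIDDEN = '\\/:*?"<>|'


def windows_safe(name, replacement="@"):
    """Replace characters that Windows rejects in a single file name."""
    return "".join(replacement if c in WINDOWS_FORBIDDEN else c for c in name)


print(windows_safe("index.html?N=D"))  # index.html@N=D
```

The function is meant for one path component at a time, since it also replaces `/` and `\`. Links inside downloaded pages would need the same mapping to keep pointing at the renamed files.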


Jeff Creamer
Fenwick Technologies, Inc.
Systems Administrator/Programmer
Phone: 304-623-5260 Ext. 16
Email: [EMAIL PROTECTED]
IM: JCreamer23
MSN: JCreamer23


Re: RFE:add tar option

2002-04-22 Thread Hrvoje Niksic

Max Waterman [EMAIL PROTECTED] writes:

 Someone (rudely) suggested it was unacceptable to ask for a 'cc'
 rather than joining the email list.

That is not the case -- it is perfectly acceptable to post a question
and ask for `Cc'.  Especially so when you're posting to
[EMAIL PROTECTED], an address specifically maintained for users'
questions.