Re: wget ftp wildcards

2002-04-15 Thread Hrvoje Niksic

Krish Krothapalli <[EMAIL PROTECTED]> writes:

> When using wget ftp://XXX, is it intentional to preserve server
> permissions when using wildcards?

It's intentional.  Wildcards force Wget to retrieve the full directory
listing, which has a side-effect that more information is available.
The remote permissions are an example of that information.  The
"exact" sizes of files (for an admittedly odd definition of exact) are
another -- that's why Wget doesn't call the length "unauthoritative"
when you're using wildcards.
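For instance, a wildcard fetch like the following forces Wget to request the full directory listing (the host and path here are placeholders, not from the original report):

```shell
# Hypothetical example: the wildcard must be quoted so the local
# shell does not expand it before Wget sees it; Wget then issues a
# LIST and matches the pattern against the listing itself.
wget 'ftp://ftp.example.com/pub/wget/*.tar.gz'
```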

But you have a good point.  If your groups and users differ from those
on the remote site, blindly copying permissions can do more harm than
good, and can in fact pose a security risk.  Mirroring remote
permissions should be controlled with an option, which should probably
default to "off".

I've added that to the TODO list; thanks for the report.



Re: problems with msgfmt making .gmo [v1.8.1]

2002-04-15 Thread Hrvoje Niksic

Thanks for the report.  The thing I don't quite understand is, how come
you are the only one to experience this?  My `msgfmt --version' says
"0.10.40", so I'm not sure what your "1.3" refers to.

Maybe you should upgrade gettext?



Re: Problem using SSL with client certificates and custom CA cert

2002-04-15 Thread Hrvoje Niksic

Hmm.  I'm afraid I cannot offer much help with SSL, since I don't
really understand how it is supposed to work.  The code was submitted
by a volunteer and seems to work for "obvious" sites, as you yourself
noticed.

I'm sorry I cannot help you more.  If you do find what the problem
was, please let me know.



Re: FTP options

2002-04-15 Thread Hrvoje Niksic

<[EMAIL PROTECTED]> writes:

> Good evening.  I'm trying to do an FTP transfer with Wget 1.7.  My
> problem is that my PC is behind a proxy, and there is no way to make
> an FTP connection unless you first FTP to that proxy; the proxy then
> opens an FTP session to the final machine you want to connect to.
> When connecting to the proxy, you give the user name as:
> anonymous@machine_where_you_want_ftp.  Is there any way to do this
> kind of FTP with Wget?

I've just recently added such functionality to the CVS version of
Wget.  (You have to download and compile it yourself, though; see
 for instructions on how to do that.)

The way it works -- in the CVS version -- is as simple as setting
ftp_proxy to an FTP URL representing your proxy; Wget does the rest.
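For example, something along these lines (the proxy host and port are placeholders; this applies to the CVS version only):

```shell
# Hypothetical setup: point ftp_proxy at the FTP proxy, then fetch
# through it as usual.  Wget logs in to the proxy and relays the
# anonymous@target user name itself.
export ftp_proxy=ftp://proxy.example.com:2121/
wget ftp://ftp.gnu.org/gnu/wget/
```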



Re: --html-extension and content type query

2002-04-15 Thread Hrvoje Niksic

Picot Chappell <[EMAIL PROTECTED]> writes:

> Why doesn't wget assume that files, which don't declare content
> type, are text/html files?

Good question.  I don't know, perhaps such brokenness never occurred
to me.  And I don't remember anyone reporting it until now.

> I'm looking into patching http.c, so that if type isn't defined it
> gets set to text/html.  Has this been done for 1.8.1 already?  If
> so, can someone pass that patch along to me?
>
> Also, if I do this, will it cause horrible wget hiccups?

I don't think it will make a difference, except improve user
experience in the case that you describe.  Correctly written pages
will not be affected adversely, and that's what truly matters.

Here is a patch that should implement what you need.  Please let me
know if it works for you.

2002-04-16  Hrvoje Niksic  <[EMAIL PROTECTED]>

* http.c (gethttp): If Content-Type is not given, assume
text/html.

Index: src/http.c
===
RCS file: /pack/anoncvs/wget/src/http.c,v
retrieving revision 1.90
diff -u -r1.90 http.c
--- src/http.c  2002/04/14 05:19:27 1.90
+++ src/http.c  2002/04/16 00:14:57
@@ -1308,10 +1308,12 @@
}
 }
 
-  if (type && !strncasecmp (type, TEXTHTML_S, strlen (TEXTHTML_S)))
+  /* If content-type is not given, assume text/html.  This is because
+ of the multitude of broken CGI's that "forget" to generate the
+ content-type.  */
+  if (!type || 0 == strncasecmp (type, TEXTHTML_S, strlen (TEXTHTML_S)))
 *dt |= TEXTHTML;
   else
-/* We don't assume text/html by default.  */
 *dt &= ~TEXTHTML;
 
   if (opt.html_extension && (*dt & TEXTHTML))



Re: Anyone maintaining RedHat 6.x RPM's for weget?

2002-04-15 Thread Hrvoje Niksic

"Jeroen W. Pluimers (mailings)" <[EMAIL PROTECTED]> writes:

> I wonder if anyone is maintaining RedHat 6.x RPM's for wget.

I have no idea.  But, Wget is fairly easy to build from source, so I
never really bothered to find out.

> I could not find a 1.8.1. RPM on the net using google nor using
> rpmfind, and it seems the version RedHat ships is really really old.
>
> Any pointers to a download place are welcome.

If you have a C compiler, building Wget should be as simple as running
`configure' and `make install'.
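Roughly, assuming you have fetched the 1.8.1 source tarball (the prefix is illustrative):

```shell
# Typical build from source; adjust the tarball name to the release
# you downloaded, and pick a prefix you can write to.
tar xzf wget-1.8.1.tar.gz
cd wget-1.8.1
./configure --prefix=/usr/local
make
make install    # installing under /usr/local may require root
```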



Re: wget-1.8: assert in register_download while mirroring big site

2002-04-15 Thread Hrvoje Niksic

I believe this bug has been fixed in Wget 1.8.1.



Re: selective proxy usage

2002-04-15 Thread Hrvoje Niksic

Velimir Kalik <[EMAIL PROTECTED]> writes:

> Is it posible to specify for wget not to use proxy for some IPs or
> domains? E.g. not to use proxy for www.nba.com, but use it for
> everything else.
>
> Thanks and please cc replies to my email address too!

Yes, that should work with the `no_proxy' environment variable.  For
instance:

$ no_proxy=nba.com wget ...
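To make the exclusion persistent, the same setting can go in your per-user wgetrc (the `no_proxy' command takes a comma-separated list of domains; the second domain below is just an illustration):

```shell
# Append to ~/.wgetrc; every subsequent Wget run will bypass the
# proxy for the listed domains.
echo 'no_proxy = nba.com,example.org' >> ~/.wgetrc
```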



Re: small bug in wget manpage: --progress

2002-04-15 Thread Hrvoje Niksic

Noel Koethe <[EMAIL PROTECTED]> writes:

> the wget 1.8.1 manpage tells me:
>
>--progress=type
>Select the type of the progress indicator you wish to
>use.  Legal indicators are ``dot'' and ``bar''.
>
>The ``dot'' indicator is used by default.  It traces
>the retrieval by printing dots on the screen, each dot
>representing a fixed amount of downloaded data.
>
> But it looks like the default is "bar".

Yes.  Thanks for the report; I'm about to apply this fix.


2002-04-15  Hrvoje Niksic  <[EMAIL PROTECTED]>

* wget.texi (Download Options): Fix the documentation of
`--progress'.

Index: doc/wget.texi
===
RCS file: /pack/anoncvs/wget/doc/wget.texi,v
retrieving revision 1.64
diff -u -r1.64 wget.texi
--- doc/wget.texi   2002/04/13 22:44:16 1.64
+++ doc/wget.texi   2002/04/15 20:52:28
@@ -625,10 +625,15 @@
 Select the type of the progress indicator you wish to use.  Legal
 indicators are ``dot'' and ``bar''.
 
-The ``dot'' indicator is used by default.  It traces the retrieval by
-printing dots on the screen, each dot representing a fixed amount of
-downloaded data.
+The ``bar'' indicator is used by default.  It draws an ASCII progress
+bar graphics (a.k.a ``thermometer'' display) indicating the status of
+retrieval.  If the output is not a TTY, the ``dot'' bar will be used by
+default.
 
+Use @samp{--progress=dot} to switch to the ``dot'' display.  It traces
+the retrieval by printing dots on the screen, each dot representing a
+fixed amount of downloaded data.
+
 When using the dotted retrieval, you may also set the @dfn{style} by
 specifying the type as @samp{dot:@var{style}}.  Different styles assign
 different meaning to one dot.  With the @code{default} style each dot
@@ -639,11 +644,11 @@
 files---each dot represents 64K retrieved, there are eight dots in a
 cluster, and 48 dots on each line (so each line contains 3M).
 
-Specifying @samp{--progress=bar} will draw a nice ASCII progress bar
-graphics (a.k.a ``thermometer'' display) to indicate retrieval.  If the
-output is not a TTY, this option will be ignored, and Wget will revert
-to the dot indicator.  If you want to force the bar indicator, use
-@samp{--progress=bar:force}.
+Note that you can set the default style using the @code{progress}
+command in @file{.wgetrc}.  That setting may be overridden from the
+command line.  The exception is that, when the output is not a TTY, the
+``dot'' progress will be favored over ``bar''.  To force the bar output,
+use @samp{--progress=bar:force}.
 
 @item -N
 @itemx --timestamping



Re: WGET malformed status line

2002-04-15 Thread Hrvoje Niksic

Löfstrand Thomas <[EMAIL PROTECTED]> writes:

> I have used wget with -d option to see what is going on, and it seems
> like the proxyserver returns the following response: "X-PLEASE_WAIT".
>
> After reading the source code in http.c it seems like wget expects
> the answer from the proxy to be HTTP/ and a version number.
>
> Is there any easy way to bypass this response part?

Maybe.  But what should the response be, then?  This sounds like
either a gross breach of HTTP or a completely different problem.  (We
had a report of a proxy server returning FTP status.)



timestamping

2002-04-15 Thread David C. Anderson

This isn't a bug, but the offer of a new feature.  The timestamping
feature doesn't quite work for us, as we don't keep just the latest
view of a website and we don't want to copy all those files around for
each update.

So I implemented a --changed-since=mmdd[hhmm] flag to only get
files that have changed since then according to the header.  It seems
to work okay, although your extra check for file-size equality for the
timestamping feature makes me wonder if the date isn't always a good
measure.

One oddity is that if you point wget at a file that's older than the
given date at the top level, it won't be retrieved, and there won't be
any URLs to recurse on.  (We're pointing it at a URL that changes
daily.)

I tested it under Solaris 7, but there is a dependency on time() and
gmtime() that I haven't conditionalized for autoconf, as I am not
familiar with that tool.

I would like this feature to get carried along with the rest of the
codebase; would you like it?

-dca




Re: wget bug (overflow)

2002-04-15 Thread Hrvoje Niksic

I'm afraid that downloading files larger than 2G is not supported by
Wget at the moment.



Re: dynamic IPs

2002-04-15 Thread Thomas Lussnig

Hrvoje Niksic wrote:

>You're probably right; there should be an option to disable DNS
>caching.  As a stop-gap measure, you can simply stop `lookup_host'
>from caching the information it retrieves, by commenting the call to
>`cache_host_lookup' at the end of `lookup_host'.
>
Hi,
I think the idea of disabling DNS caching in general is not such a good
one.  The right way would be an optional switch to set a "max TTL" for
the DNS entries, and the cache list would need to be rewritten so that
it also stores the TTL and the lookup time, in order to check whether
entries are still valid.  As it stands, I don't think we respect the
TTL at all.

Cu Thomas Lußnig





Re: typo in `man wget`

2002-04-15 Thread Hrvoje Niksic

[EMAIL PROTECTED] writes:

> Unfinished sentence...
>
>Another way to specify username and password is in the
>URL itself.  For more information about security
>issues with Wget,

If only that were a typo.  It's a bug in the ugly script that converts
the Texinfo manual to POD.  :-(

*sigh*



Re: dynamic IPs

2002-04-15 Thread Hrvoje Niksic

You're probably right; there should be an option to disable DNS
caching.  As a stop-gap measure, you can simply stop `lookup_host'
from caching the information it retrieves, by commenting the call to
`cache_host_lookup' at the end of `lookup_host'.



Re: Change in behaviour between 1.7 and 1.8.1

2002-04-15 Thread Hrvoje Niksic

Philipp Thomas <[EMAIL PROTECTED]> writes:

> When you issue
>
>  wget --recursive --level=1 --reject=.html www.suse.de
>
> wget 1.7 really omits downloading all the .html files except
> index.html (which is needed for --recursive), but wget 1.8.1 also
> downloads all .html files that are referenced from index.html and
> deletes them immediately.
>
> It is clear that the .html files are needed to find the next level
> of files when downloading recursively, but they should be omitted
> when the recursion depth is limited and the limit has been reached.

Yes.  Please let me know if this patch fixes things for you:

2002-04-15  Hrvoje Niksic  <[EMAIL PROTECTED]>

* recur.c (download_child_p): Don't ignore rejection of HTML
documents that are themselves leaves of recursion.

Index: src/recur.c
===
RCS file: /pack/anoncvs/wget/src/recur.c,v
retrieving revision 1.44
diff -u -r1.44 recur.c
--- src/recur.c 2002/04/12 18:53:38 1.44
+++ src/recur.c 2002/04/15 18:09:47
@@ -511,23 +511,13 @@
   /* 6. */
   {
 /* Check for acceptance/rejection rules.  We ignore these rules
-   for HTML documents because they might lead to other files which
-   need to be downloaded.  Of course, we don't know which
-   documents are HTML before downloading them, so we guess.
-
-   A file is subject to acceptance/rejection rules if:
-
-   * u->file is not "" (i.e. it is not a directory)
-   and either:
- + there is no file suffix,
-+ or there is a suffix, but is not "html" or "htm" or similar,
-+ both:
-  - recursion is not infinite,
-  - and we are at its very end. */
-
+   for directories (no file name to match) and for HTML documents,
+   which might lead to other files that do need to be downloaded.
+   That is, unless we've exhausted the recursion depth anyway.  */
 if (u->file[0] != '\0'
-   && (!has_html_suffix_p (url)
-   || (opt.reclevel != INFINITE_RECURSION && depth >= opt.reclevel)))
+   && !(has_html_suffix_p (u->file)
+&& depth < opt.reclevel - 1
+&& depth != INFINITE_RECURSION))
   {
if (!acceptable (u->file))
  {



Re: wget -r not following links

2002-04-15 Thread Hrvoje Niksic

Mika Tuupola <[EMAIL PROTECTED]> writes:

>   I have a site which has relative links like this:
>
>   link
>
>   I have been trying different switches to make wget -r follow
>   those links but have been unsuccessful. Is this possible with
>   the current version of wget?

Can you give more details?  The actual URL and a debug log would help;
failing that, more information would be nice.

To answer your question, yes, it is possible, and it should work.



Re: HTTP 1.1

2002-04-15 Thread Hrvoje Niksic

[EMAIL PROTECTED] writes:

>>Csaba Raduly's patch would break Wget because it doesn't support the
>>"chunked" transfer-encoding.  Also, its understanding of persistent
>>connection might not be compliant with HTTP/1.1.
>
> IT WAS A JOKE !

I know that.  But someone not acquainted with HTTP might not
understand why the patch doesn't work, especially with Wget seemingly
using a lot of HTTP/1.1 features.  That is why I felt compelled to
explain the joke, despite what experience taught me about explaining
jokes.  :-)

> Serves me right. I need to put bigger smilies :-(

Don't worry; jokes without smilies are usually the best ones.



Re: HTTP 1.1

2002-04-15 Thread csaba . raduly


On 12/04/2002 21:37:31 hniksic wrote:

>"Tony Lewis" <[EMAIL PROTECTED]> writes:
>
>> Hrvoje Niksic wrote:
>>
>>> > Is there any way to make Wget use HTTP/1.1 ?
>>>
>>> Unfortunately, no.
>>
>> In looking at the debug output, it appears to me that wget is really
>> sending HTTP/1.1 headers, but claiming that they are HTTP/1.0
>> headers. For example, the Host header was not defined in RFC 1945,
>> but wget is sending it.
>
>Yes.  That is by design -- HTTP was meant to be extended in that way.
>Wget is also requesting and accepting `Keep-Alive', using `Range', and
>so on.
>
>Csaba Raduly's patch would break Wget because it doesn't support the
>"chunked" transfer-encoding.  Also, its understanding of persistent
>connection might not be compliant with HTTP/1.1.

IT WAS A JOKE !
Serves me right. I need to put bigger smilies :-(


--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: Goodbye and good riddance

2002-04-15 Thread csaba . raduly


On 12/04/2002 19:21:41 "James C. McMaster (Jim)" wrote:

>My patience has reached an end.  Perhaps, now that you have (for the first
>time) indicated you will do something to fix the problem, the possible
light
>at the end of the tunnel will convince others to stay.

The light at the end of the tunnel is just the explosion around the Pu239 :-)

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




wget -r not following links

2002-04-15 Thread Mika Tuupola


I have a site which has relative links like this:

link

I have been trying different switches to make wget -r follow
those links but have been unsuccessful. Is this possible with
the current version of wget?

-- 
Mika Tuupola  http://www.appelsiini.net/~tuupola/




Re: Proposal for despamming the list

2002-04-15 Thread Karsten Thygesen

> "Daniel" == Daniel Stenberg <[EMAIL PROTECTED]> writes:

 Daniel> On 14 Apr 2002, Karsten Thygesen wrote:
 >> Anyway - spamassassin is now in place - let's give it a chance
 >> before doing radical movements - and I can assure, that ezmlm is
 >> far more mature and stable than Mailman - we (sunsite) have been
 >> running both systems for years, and there is no doubt about, which
 >> type we recommends!

 Daniel> As was very quickly proven, that just isn't enough. Or you
 Daniel> need to add much stricter rules or whatever.

 Daniel> I found it very ironic that the first mail after your
 Daniel> previous post here, was a... spam!

Yes - I have hardened the rules since.  But please bear in mind that
there is no 100% effective spam tool that does not require human
interaction.  Over the next weeks, I'm sure you will find that the
level of spam is reduced to only a few percent.

Karsten