Re: ** Nigerian Scam variation (Re: Co-operation Needed!)

2002-07-17 Thread csaba . raduly


On 16/07/2002 16:36:15 Fernando Cassia wrote:

>FYI and if someone has been living in a bottle This is a variation of
>the Nigerian scam.
>
>http://www.secretservice.gov/alert419.shtml
>http://www.fdic.gov/consumers/consumer/news/cnwin0102/TooGood.html
>
>Don't even bother contacting them.
>
>Regards
>Fernando
>
>"Jesse Ndoro." wrote:
>
>> Dear Sir,
[snip "Nigerian" scam quoted in its entirety]

Please don't do that.
1) This is a mailing list where the subscribers can actually think.
2) FYI, you should not top-post and quote the original *in its entirety*.
   This is a mailing list, where discussions are frequent. Replying
   at the top makes it difficult to follow who said what and in reply
   to whom.
3) We already received two copies of the scam. There was no need
   to send a third, unabridged copy.

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: user-agent string for IE

2002-06-26 Thread csaba . raduly


On 20/06/2002 10:03:13 jgrosman wrote:

>Hi all.
>
[snip question about emulating IE in the User-agent string]

Virtually all browsers start their User-Agent string with "Mozilla".
For IE 6, try something like:

"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: speed units

2002-06-11 Thread csaba . raduly


On 10/06/2002 23:07:47 Joonas Kortesalmi wrote:

>Wget seems to report speeds with wrong units. It uses for example "KB/s"
>rather than "kB/s" which would be correct. Any possibility to fix that? :)
>
>K = Kelvin
>k = Kilo
>
>Probably you want to use small k with download speeds, right?
>

Let's not go there again, lest wget end up having to report downloads in
kibibytes (ISTR wget using 1024 as the divisor).
k = kilo is reserved for dividing by 1000.

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: Can't get remote files - what am I doing wrong?

2002-06-05 Thread csaba . raduly


On 05/06/2002 13:08:05 drt - lists wrote:

>Thank for no help.
>
>If this is typical of how you reply to your customers

I do *not* reply to customers. I am a developer, and post here as a private
individual. Perhaps I should unsubscribe altogether.

[snip]
>>> The Mac machine I am using for testing is behind our firewall, but
>>> there is a "hole" opened to allow my internal IP to reach
>>> the specific remote IP.
>> [snip]
>>
>> Because you didn't include the output with the -d switch, I'm guessing.
>> Do you use a proxy to go through the firewall ? A lot of proxies issue
>> HTTP requests even for FTP. HTTP cannot glob.
>
>Yes we do,

So there is a proxy after all.

>and no, it doesn't issue an ftp request as I have an opening for
>this specific request - which if you had bothered to read my message
>instead of trying to attack you would know that.
>
>
>Here is the part that you ignored which addresses the accusation above.
   ^^
Huh ? I described a scenario which could have caused the failure you
described. I did not *accuse* you of using a proxy !

>---
>The Mac machine I am using for testing is behind our firewall, but there
>is a "hole" opened to allow my internal IP to reach the specific remote
IP.
>And using the first example above it does connect so I know I am getting
>through the firewall.
>---
>

Note that if wget is set up to use the proxy by default (env. var, wgetrc)
then it'll use the proxy even if it could connect directly through
the hole in the firewall. The first example (which I snipped) did not
use globbing. That would succeed regardless of whether wget connected
directly or through a HTML-ized proxy.

We're not getting any closer to a solution. Please post the output
of the request that fails, run in debugging mode
(be careful to obscure any passwords).

>
[ad hominem attack snipped]

I apologise. Although I consider what I've written to be valid, the tone
was not. I claim temporary loss of diplomatic abilities.


--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: Can't get remote files - what am I doing wrong?

2002-06-05 Thread csaba . raduly


On 03/06/2002 14:56:47 dale wrote:

[snip]
>wget ftp://user:[EMAIL PROTECTED]/folder1/folder2/*s.csv
>
>I get an error message of "no match" and if I use:
>
>wget --glob=on ftp://user:[EMAIL PROTECTED]/folder1/folder2/*s.csv
>
>I also get "no match"
>

In the future, please post the output with the -d switch added.
(did you read the instructions ?)

[snip]
>The Mac machine I am using for testing is behind our firewall, but there
>is
>a "hole" opened to allow my internal IP to reach the specific remote IP.
[snip]

Because you didn't include the output with the -d switch, I'm guessing.
Do you use a proxy to go through the firewall ? A lot of proxies issue
HTTP requests even for FTP. HTTP cannot glob.

>
>p.s. The reply-to address has been anti-spammed (I hope anyway), please
>post any replies to the list.
>

Somebody at Ultimate Search (the owner of nospam.net) will be mightily
surprised. What you did can be interpreted as email address forgery.
Please in the future use addresses which end in .invalid (this top level
domain is guaranteed to always be, err, invalid), e.g.
[EMAIL PROTECTED]

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: ? gets translated to @

2002-05-24 Thread csaba . raduly


On 24/05/2002 13:39:29 ladislav.gaspar wrote:

>Hi
>
>I do the following:
>wget http://killefiz.de/zaurus/showdetail.php?app=221
>
>but the file is saved as http://killefiz.de/zaurus/showdetail.php@app=221
>
>(*.php?app gets translated to *.php@app)
>
>Why is that and is there a workaround?
>

That *is* the workaround :-)
'?' is an invalid character for filenames on FAT, FAT32, NTFS.
Instead of giving an error message like this:
"Cannot open killefiz.de/zaurus/showdetail.php?app=221"
wget actually tries to do what you want (i.e. download the file).
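
For illustration only (this is not wget's actual url.c code, just the idea):

/* Map the '?' that FAT/FAT32/NTFS refuse in filenames to '@', e.g.
   showdetail.php?app=221 -> showdetail.php@app=221 */
static void make_filename_safe (char *name)
{
  char *p;
  for (p = name; *p; ++p)
    if (*p == '?')
      *p = '@';
}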

You can run wget on another platform (Linux, some Unix, etc.).
The filesystems there usually don't have this restriction.

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




JavaScript

2002-05-21 Thread csaba . raduly

Links, the (formerly) text-mode browser, has recently acquired the ability
to parse JavaScript.
Look at http://atrey.karlin.mff.cuni.cz/~clock/twibright/links/

It seems to use around four source files (two generated by lex and yacc,
respectively).
This might be usable to teach wget JavaScript. Alas, all comments are in
Czech :-(

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: crawling servlet based urls

2002-05-16 Thread csaba . raduly


On 16/05/2002 17:06:31 "Steve Mestdagh" wrote:

>Hi,
>I'm trying to get crawl intranet urls of form:
[snip, wget will try to save to filename like this:]
> `WKCCommand?command=getLesson&LessonId=137'
[snip]

The filename above is invalid on many filesystems used by Micros~1.
(It's the '?' causing the problem).

This has definitely been fixed in a newer version, either 1.8.1
or the current CVS.


Heiko Herold provides:
New CVS binary for windows at http://space.tin.it/computer/hherold




--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: cookie pb: download one file on member area

2002-05-15 Thread csaba . raduly


On 15/05/2002 13:34:29 "[EMAIL PROTECTED]" wrote:

[snip problem possibly related to cookies]
>
>Although i use wget with the option for using the mozilla's
>cookies file, i am not able to download that file. Could
>someone help me ??? If you want further information, just ask.

Without the output of "wget -d" our guess is actually worse than yours.
Please run wget with the -d option in addition to the existing ones.
Then post the results (if it's big it might be a better idea
to post it on a website and send just the link to the list).
If you send to the list, I'd prefer it pasted into the mail
(in-line, rather than an attachment).


--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re:

2002-05-01 Thread csaba . raduly


On 30/04/2002 16:31:17 "Tony Lewis" wrote:

>[EMAIL PROTECTED] wrote:
>
>> I want to get page http://www.boards.spb.ru/?3~sell with _all_ contents
>> as in browser. But i get only part of web page.
>> Page contains
>> 
>> that output data into page.

Which suggests that the include didn't work.
Probably because the # is missing before the include.
See below.

>
>When I view the source of that page in my browser, I also see the include
>tag. For what its worth, this almost looks like an Apache server side
>include command, but if it were, it would be:
>
>
>

It's the server's job to replace the <!--#include ... --> directive
with the output of temp.pl.
If you can see the <!-- ... --> comment (i.e. an HTML comment) in the
downloaded page, it means the server didn't do its job (maybe because
of the typo).

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: wget wild card

2002-04-26 Thread csaba . raduly


On 25/04/2002 21:55:26 "Zhao, David  [PRDUS Non J&J]" wrote:

>When I do:
>ftp://ftp.something.com/*, I've got "wget: No match".
>Any clue?
>Thanks in advance
>

It is likely that there are no files in the FTP root,
only directories. Run wget with

wget -d -nr ftp://ftp.something.com/*

for more information

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: Worm Klez.E immunity

2002-04-25 Thread csaba . raduly


On 25/04/2002 06:43:10 "Tony Lewis" wrote:

>Admin wrote:
>
>> Klez.E is the most common world-wide spreading worm.
>
>It's definitely a nasty little piece of code. Even though I had Outlook
>Express configured to disallow practially everything, the E-mail messages
>opened themselves and let the rogue code loose. If you've managed to avoid
>Klez so far and you're running Windows, I strongly recommend you read more
>about this at Microsoft and then install the recommended security patch:
>

http://www.pmail.com/  :-)



--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: apache irritations

2002-04-22 Thread csaba . raduly


On 22/04/2002 16:38:15 "Maciej W. Rozycki" wrote:

>On Mon, 22 Apr 2002, Hrvoje Niksic wrote:
>
>> >  How about using the "-R" option of wget?  A brief test proves "-R
>> > '*\?[A-Z]=[A-Z]'" works as it should.
>>
>> Or maybe the default system wgetrc should ship with something like:
>>
>> reject = *?[A-Z]=[A-Z]
>
>Note the difference between strings! -- the backslash before the
>quotation mark is essential as otherwise it's a glob character.
>

[A-Z] is a bit extreme, IMHO. How about

reject = *\?[NMSD]=[AD]
  ^^ literal '?' needed here


>
>Well, I don't think it's sane but adding a *commented-out* reject line
>with an appropriate annotation to the default system wgetrc looks like a
>good idea to me.
>

A good idea.

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: HTTP 1.1

2002-04-15 Thread csaba . raduly


On 12/04/2002 21:37:31 hniksic wrote:

>"Tony Lewis" <[EMAIL PROTECTED]> writes:
>
>> Hrvoje Niksic wrote:
>>
>>> > Is there any way to make Wget use HTTP/1.1 ?
>>>
>>> Unfortunately, no.
>>
>> In looking at the debug output, it appears to me that wget is really
>> sending HTTP/1.1 headers, but claiming that they are HTTP/1.0
>> headers. For example, the Host header was not defined in RFC 1945,
>> but wget is sending it.
>
>Yes.  That is by design -- HTTP was meant to be extended in that way.
>Wget is also requesting and accepting `Keep-Alive', using `Range', and
>so on.
>
>Csaba Raduly's patch would break Wget because it doesn't support the
>"chunked" transfer-encoding.  Also, its understanding of persistent
>connection might not be compliant with HTTP/1.1.

IT WAS A JOKE !
Serves me right. I need to put bigger smilies :-(


--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: Goodbye and good riddance

2002-04-15 Thread csaba . raduly


On 12/04/2002 19:21:41 "James C. McMaster (Jim)" wrote:

>My patience has reached an end.  Perhaps, now that you have (for the first
>time) indicated you will do something to fix the problem, the possible
light
>at the end of the tunnel will convince others to stay.

The light at the end of the tunnel is just the explosion around the Pu239 :-)

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: HTTP 1.1

2002-04-12 Thread csaba . raduly


On 11/04/2002 18:26:15 hniksic wrote:

>"Boaz Yahav" <[EMAIL PROTECTED]> writes:
>
>> Is there any way to make Wget use HTTP/1.1 ?
>
>Unfortunately, no.

Sure it can be made to use HTTP 1.1

--- http.c.orig   Wed Jan 30 14:10:42 2002
+++ http.c  Fri Apr 12 11:56:22 2002
@@ -838,7 +838,7 @@
  + 64);
   /* Construct the request.  */
   sprintf (request, "\
-%s %s HTTP/1.0\r\n\
+%s %s HTTP/1.1\r\n\
 User-Agent: %s\r\n\
 Host: %s%s%s%s\r\n\
 Accept: %s\r\n\




:-)

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: qestio

2002-04-05 Thread csaba . raduly


On 05/04/2002 12:44:22 Varga Gabor wrote:

>Hi
>
>I am gabor from hungary I have a qestion
>I have an URL ending like this */show.php?id=843
>I know how it works(correct me if I am wrong) the *.php
>(gets or posts) the arg. ID
>and the server returns the page 843 but why can't wget
>mirror these pages ?
>

Because it'll try to save with the filename "show.php?id=843", and '?' is
invalid in a filename on DOS/Windows/OS2

What version of wget are you using ? What platform (operating system) ?
What does the debug log say ? (run wget with the -d switch added)

CC'd to wget, not bug-wget

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




wget parsing JavaScript

2002-03-26 Thread csaba . raduly

wget stumbled upon the following HTML file:

--- >8 


foo





var sitems=new Array()
var sitemlinks=new Array()

///Edit below/

//extend or shorten this list
sitems[0]="15.html"
sitems[1]="16.html"
sitems[2]="17.html"
sitems[3]="18.html"
sitems[4]="19.html"
sitems[5]="20.html"
sitems[6]="21.html"
sitems[7]="22.html"
sitems[8]="23.html"
sitems[9]="24.html"
sitems[10]="25.html"
sitems[11]="26.html"
sitems[12]="27.html"


//These are the links pertaining to the above text.
sitemlinks[0]="31.html"
sitemlinks[1]="32.html"
sitemlinks[2]="33.html"
sitemlinks[3]="34.html"
sitemlinks[4]="35.html"
sitemlinks[5]="36.html"
sitemlinks[6]="37.html"
sitemlinks[7]="38.html"
sitemlinks[8]="39.html"
sitemlinks[9]="40.html"
sitemlinks[10]="41.html"
sitemlinks[11]="42.html"
sitemlinks[12]="43.html"

//If you want the links to load in another frame/window, specify name of
//target (ie: target="_new")
var target=""

for (i=0;i<=sitems.length-1;i++)
document.write('<a href="'+sitemlinks[i]+'" target="'+target+'">'+sitems[i]+'</a><br>')

Congratulations, you have turned off JavaScript.
--- >8 

I see that wget handles

Re: OK, time to moderate this list

2002-03-22 Thread csaba . raduly


On 22/03/2002 07:06:13 Daniel Stenberg wrote:

>On Fri, 22 Mar 2002, Hrvoje Niksic wrote:
[snip]
>> I think I agree with this.  The amount of spam is staggering.  I have no
>> explanation as to why this happens on this list, and not on other lists
>> which are *also* open to non-subscribers.
>
>Spammers work in mysterious ways. ;-)
>

No, they work in fairly predictable ways.
The wget mailing list address is advertised on the wget homepage.
According to empirical observations, if you publish a brand new email
address on a web page, it'll receive spam within eight *hours* of it
being published.

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: KB or kB

2002-02-08 Thread csaba . raduly


On 08/02/2002 13:58:55 Andre Majorel wrote:

>On 2002-02-08 08:54 +0100, Hrvoje Niksic wrote:
>
>> Wget currently uses "KB" as abbreviation for "kilobyte".  In a Debian
>> bug report someone suggested that "kB" should be used because it is
>> "more correct".  The reporter however failed to cite the reference for
>> this, and a search of the web has proven inconclusive.
>>
>> Does someone understand the spelling issues involved enough to point
>> out the "correct" spelling and back it up with arguments?
>
>The applicable standard is the SI (Système International)

[snip SI prefixes]

>Capital K is not a prefix, it's the SI abbreviation for the
>temperature unit, the kelvin (note : lower case k) named after
>Lord Kelvin.
>
>So it's definitely kB for kilobyte.

As long as it means 1000 and NOT 1024

>
>Whether that means 1000 bytes or 1024 bytes is another issue.

Not while claiming to conform to SI.

Csaba

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: wget not working

2002-02-08 Thread csaba . raduly


On 08/02/2002 15:34:53 Martin Schöneberger wrote:

>At 14:37 08.02.2002 +, Henderson, Daniel wrote:
>>#wget www.sophos.com/downloads/ide/ides.zip
>>--14:32:57--  http://www.sophos.com/downloads/ide/ides.zip
>>=> `ides.zip'
>>Connecting to www.sophos.com:80...
>>www.sophos.com: Host not found.
>>
>>Is there something else I should configure in Solaris to allow this to
>>work?
>
>First of all you should find out why you can't connect to "sophos.com".
>1) sophos is down -> try later
>Solution: get the file from another server
>2) dns lookup failed -> try if you can connect to other hosts like
>"google.com" or anything else, or if you can only connect to ip adresses.
>Solution1: try another DNS server
>Solution2: reconfigure your DNS settings or even your DNS-server (if you
>are running one)

Try
nslookup www.sophos.com
ping www.sophos.com
telnet   www.sophos.com 80

If these work, it's wget's fault.
If they don't, it's a connectivity problem.

>4) user root not allowed to connect to the internet (standard on BSD if i
>remember correctly) -> try if you can DL the file using another user
>Solution: change the user database or the firewall settings, or just don't
>connect to the internet using root :-)

Good point. Look at the prompt...

[snip]
>
>Last but not least: Try the "-d" switch with wget and have a look at the
>debug output of wget. Perhaps you find further information why you can't
>connect. If you don't, send it to this list, perhaps "we" find smth :-)
>

Very good advice indeed.
HTH,


--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: KB or kB

2002-02-08 Thread csaba . raduly


On 08/02/2002 08:30:59 Henrik van Ginhoven wrote:

>On Fri, Feb 08, 2002 at 08:54:06AM +0100, Hrvoje Niksic wrote:
>> Wget currently uses "KB" as abbreviation for "kilobyte".  In a Debian
>> bug report someone suggested that "kB" should be used because it is
>> "more correct".

This is the kind of stuff that leads to month-long flamewars :-)

>
>"kB" rather than "KB"? I think whoever filed that bugreport got it wrong,
as
>far as I know "kB" would always mean 1000 (bytes), since "k" = thousand,
and
>never ever 1024. If he'd said "KiB" I'd agree with him to a certain
degree,
>but "kB" simply can't be right.

Note that we can claim the distinction that k=1000 and K=1024
That won't work with 1E6 vs 2**20 because SI uses uppercase M for 1E6.

>
>Rather than me trying to sum it up and risk typing something wrong, this
>page seems to address the issue well:
>
>http://www.romulus2.com/articles/guides/misc/bitsbytes.shtml
>

Please, no kibibytes :-)
Maybe wget should just count 512-byte "blocks", a la df.
That would improve the understandability of the display ... NOT
But it would keep the terminally anal-retentives at bay :-)

Seriously, just ignore it. I can certainly live with 5%
"experimental error" ( 2**20 = 1.0486E6 ) at megabyte level.


--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: BUG https + index.html

2002-02-01 Thread csaba . raduly


On 01/02/2002 12:10:59 "Mr.Fritz" wrote:

>After the https/robots.txt bug, doing a recursive wget to an https-only
server
>gives me this error: it searches for http://servername/index.html but
there
>is no server on port 80, so wget receives a Connection refused error and
>quits.  It should search for https://servername/index.html 
>

Are you sure this was an SSL-enabled wget ?
Please provide a debug log by running wget with the -d parameter.


--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: mirroring vs -m

2002-01-29 Thread csaba . raduly


On 29/01/2002 15:54:17 Andre Majorel wrote:

[snip debate about following links in HTML retrieved by FTP]
>
>I'm inclined to think that recursive retrieval without parsing
>is a feature. HTML content is normally served over HTTP. If you
>want to retrieve HTML through FTP, it's likely because you do
>*not* want to follow the links.
>

I (client) don't get the choice. If the document at
http://foo.bar/index.html has all its links like this:

<a href="ftp://foo.bar/welcome.html">welcome</a>

the client has no choice but to retrieve them via FTP.
It would be nice if wget was able to follow all those links.


>If Wget always parsed HTML, even over FTP, it would be
>impossible to make a complete mirror a tree that has broken href
>links or hidden files.

Perhaps "If wget started with FTP, it should mirror FTP-like
(.listing and all that). If it started via HTTP, it should follow links,
regardless of future retrieval modes"

[snip]

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




RE: Bug report: 1) Small error 2) Improvement to Manual

2002-01-17 Thread csaba . raduly


On 17/01/2002 07:34:05 Herold Heiko wrote:
[proper order restored]
>> -Original Message-
>> From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]]
>> Sent: Thursday, January 17, 2002 2:15 AM
>> To: Michael Jennings
>> Cc: [EMAIL PROTECTED]
>> Subject: Re: Bug report: 1) Small error 2) Improvement to Manual
>>
>>
>> Michael Jennings <[EMAIL PROTECTED]> writes:
>>
>> > 1) There is a very small bug in WGet version 1.8.1. The bug occurs
>> >when a .wgetrc file is edited using an MS-DOS text editor:
>> >
>> > WGet returns an error message when the .wgetrc file is terminated
>> > with an MS-DOS end-of-file mark (Control-Z). MS-DOS is the
>> > command-line language for all versions of Windows, so ignoring the
>> > end-of-file mark would make sense.
>>
>> Ouch, I never thought of that.  Wget opens files in binary mode and
>> handles the line termination manually -- but I never thought to handle
>> ^Z.
>>
>> As much as I'd like to be helpful, I must admit I'm loath to encumber
>> the code with support for this particular thing.  I have never seen it
>> before; is it only an artifact of DOS editors, or is it used on
>> Windows too?
>>


[snip "copy con file.txt"]
>
>However in this case (at least when I just tried) the file won't contain
>the ^Z. OTOH some DOS programs still will work on NT4, NT2k and XP, and
>could be used, and would create files ending with ^Z. But do they really
>belong here and should wget be bothered ?
>
>What we really need to know is:
>
>Is ^Z still a valid, recognized character indicating end-of-file (for
>textmode files) for command shell programs on windows NT 4/2k/Xp ?
>Somebody with access to the *windows standards* could shed more light on
>this question ?
>
>My personal idea is:
>As a matter of fact no *windows* text editor I know of, even the
>supplied windows ones (notepad, wordpad) AFAIK will add the ^Z at the
>end of file.txt. Wget is a *windows* program (although running in
>console mode), not a *Dos* program (except for the real dos port I know
>exists but never tried out).
>

I don't think there's a distinction between DOS and Windows programs
in this regard. The C runtime library is most likely to play a
significant role here. For a file fopen-ed in "rt" mode, the RTL
would convert \r\n -> \n and silently eat the _first_ ^Z,
returning EOF at that point.

When writing, it goes the other way 'round WRT \n->\r\n.
I'm unsure about whether it writes ^Z at the end, though.

>So personally I'd say it would not be really necessary adding support
>for the ^Z, even in the win32 port; except possibly for the Dos port, if
>the porter of that beast thinks it would be useful.
>

Problem could be solved by opening .netrc in "rt"
However, the "t" is a non-standard extension.

However, this is not wget's problem IMO. Different editors may behave
differently. Example: on OS/2 (which isn't a DOS shell, but can run
DOS programs), the system editor (e.exe) *does* append a ^Z at the end
of every file it saves. People have patched the binary to remove this
feature :-) AFAIK no other OS/2 editor does this.


--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: A strange bit of HTML

2002-01-16 Thread csaba . raduly


On 16/01/2002 19:31:26 "Ian Abbott" wrote:

>I came across this extract from a table on a website:
>
>href="66B27885.htm" "msover1('Pic1','thumbnails/MO66B27885.jpg');"
>onMouseOut="msout1('Pic1','thumbnails/66B27885.jpg');">SRC="thumbnails/66B27885.jpg" NAME="Pic1" BORDER=0 >
>
>Note the string beginning "msover1(", which seems to be an
>attribute value without a name, so that makes it illegal HTML.
>

That sounds like they wanted onMouseOver="msover1(...)"
It's also likely that msover1 is a Javascript function :-(

>I haven't traced what Wget is actually doing when it encounters
>this, but it doesn't treat "66B27885.htm" as a URL to be
>downloaded.
>

in map_html_tags()
 /* Establish bounds of attribute name. */
 attr_name_begin = p; /* <foo bar=...> */
                      /*      ^        */
 while (NAME_CHAR_P (*p))
   ADVANCE (p);
 attr_name_end = p;   /* <foo bar=...> */
                      /*         ^     */
 if (attr_name_begin == attr_name_end)
   goto backout_tag;

When it sees "msover1(..." it doesn't ADVANCE
(because NAME_CHAR_P(") is false).
Hence attr_name_begin == attr_name_end, and it backs out:

  backout_tag:
#ifdef STANDALONE
++tag_backout_count;
#endif
/* The tag wasn't really a tag.  Treat its contents as ordinary
   data characters. */


>I can't call this a bug, but is Wget doing the right thing by
>ignoring the href altogether?
>

Until there's an ESP package that can guess what the author intended,
I doubt wget has any choice but to ignore the defective tag. In addition,
wget should send an email to webmaster@,
complaining about the invalid HTML :-)


--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: Is "wget --timestamping URL" working on Windows 2000?

2001-12-11 Thread csaba . raduly


>From main.c:


  /* Open the output filename if necessary.  */
  if (opt.output_document)
    {
      if (HYPHENP (opt.output_document))
        opt.dfp = stdout;
      else
        {
          struct stat st;
          opt.dfp = fopen (opt.output_document,
                           opt.always_rest ? "ab" : "wb");
          if (opt.dfp == NULL)
            {
              perror (opt.output_document);
              exit (1);
            }
          if (fstat (fileno (opt.dfp), &st) == 0 && S_ISREG (st.st_mode))
            opt.od_known_regular = 1;
        }
    }

It seems to me that if an output_document is specified, it is being
clobbered at the very beginning (unless always_rest is true). Later in
http_loop stat() comes up with zero length. Hence there's always a size
mismatch when --output-document is specified.

That doesn't sound good to me...

Csaba

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: log errors

2001-12-11 Thread csaba . raduly


On 11/12/2001 15:09:25 hniksic wrote:

>Summer Breeze <[EMAIL PROTECTED]> writes:
>
>> I want to know if Wget is a program similar to Mozilla, and if so is
>> there any way to make my pages available to Wget? I use Netscape to
>> create my web pages.
>
>Wget is a command-line downloading utility; it allows you to download
>a page or a part of the site without further user interaction.
>
>> Here is a sample entry:
>>
>> 66.28.29.44 - - [08/Dec/2001:18:21:20 -0500] "GET /index4.html%0A
>> HTTP/1.0" 403 280 "-" "Wget/1.6"
>
>"/index4.html%0A" looks like a page is trying to link to /index4.html,
>but the link contains a trailing newline.

That IP address is assigned to Road Runner (big cable ISP, I think)

Is /index4.html%0A the *first* error line in the log from 66...44 ?

Wget will try to download a URL in two cases: either because it was told to
explicitly, or because it was doing a recursive download and found that
link in a page downloaded earlier.

/index4.html%0A looks like something somewhere was misparsed. It might
conceivably be wget (unlikely, as this sort of problem would've surfaced
long ago).

If /index4.html%0A *is* the first URL requested by that IP address, then
the blame is clearly elsewhere (unless -i was used). If not, can you search
your site for a link to /index4.html that might be badly formatted HTML
(although wget should be able to defend itself against bad HTML).


(Please don't CC me; I'm on the list)
--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933





Re: Is "wget --timestamping URL" working on Windows 2000?

2001-12-11 Thread csaba . raduly


On 11/12/2001 14:03:54 Adrian Aichner wrote:

>Hi Wgeteers!
>
>Is
>  -N,  --timestamping   don't retrieve files if older than local.
>supposed to work on windows 2000?
>
[snip]
>
>cd c:\Hacking\SunSITE.dk\xemacsweb\Download\win32\
>%TEMP%\wget.wip\src\wget.exe --debug --timestamping
--output-document=setup.exe
>http://ftp.xemacs.org/windows/setup.exe
>Compilation started at Tue Dec 11 14:53:07 2001 +0100 (W. Europe Standard
Time)
>DEBUG output created by Wget 1.8 on Windows.
>
>--14:53:07--  http://ftp.xemacs.org/windows/setup.exe
>   => `setup.exe'
>Resolving ftp.xemacs.org... done.
>Caching ftp.xemacs.org => 207.96.122.9
>Connecting to ftp.xemacs.org[207.96.122.9]:80... connected.
>Created socket 420.
>Releasing 007D1C00 (new refcount 1).
>---request begin---
[snip HEAD request and response]
>
>
>Found ftp.xemacs.org in host_name_addresses_map (007D1C00)
>Registered fd 420 for persistent reuse.
>Length: 181,760 [application/octet-stream]
>Closing fd 420
>Releasing 007D1C00 (new refcount 1).
>Invalidating fd 420 from further reuse.
>The sizes do not match (local 0) -- retrieving.

 ^^^
 ^^^
Something is wrong there.
Try it without --output-document; it should put it in the current dir
anyway


>--14:53:08--  http://ftp.xemacs.org/windows/setup.exe
>   => `setup.exe'
>Found ftp.xemacs.org in host_name_addresses_map (007D1C00)
>Connecting to ftp.xemacs.org[207.96.122.9]:80... connected.
>Created socket 420.
>Releasing 007D1C00 (new refcount 1).
>---request begin---
>GET /windows/setup.exe HTTP/1.0
[snip]
>
>14:53:47 (6.14 KB/s) - `setup.exe' saved [181760/181760]
>
>
>Compilation finished at Tue Dec 11 14:53:47
>


--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: Uncoupling translations from source

2001-12-10 Thread csaba . raduly


On 10/12/2001 08:10:12 "Martin v. Loewis" wrote:

>> Maybe you wanted to say that many Europeans speak English so well,
>> that they do not need translations?
>
>It is my observation as well: Some users are hostile towards the
>notion of translated software. Those are typically not native English
>speakers, but people who found, at one time or the other, reason to
>complain about translations. They do so for all operating systems,
>making fun of erroneous translations (such as the infamous "Pfeife
>zerbrochen" of SINIX, or translations that an MS employee came up
>with).
>

From an ancient DR-DOS (version 3.something):

Nicht breit __reading__ laufwerk A:

This was clearly an oversight (the message was probably pasted together
from various places).

My native language is Hungarian, and I don't remember using ANY software in
Hungarian (with the possible exception of Recognita, which is written by
Hungarians). For the few I tried, I found the Hungarian translation
incredibly awkward (this is exacerbated by the fact that Hungarian is
neither a Germanic nor a Romance language), even if not at the level of
"all your base are belong to us" :-) It was easier to use the English
version (this was all commercial software).

Complaining about the *presence* of translation is silly, IMO. Presumably
gettext has a way to decide what language to use (LANG environment
variable, or suchlike; LANG=en_gb should do).

Decoupling translations is a good idea, if the logistics can be sorted out.

Csaba

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: Wget 1.8-beta1 now available

2001-12-03 Thread csaba . raduly


On 01/12/2001 19:44:44 John Poltorak wrote:

>On Sat, Dec 01, 2001 at 04:30:47PM +0100, Hrvoje Niksic wrote:
>> John Poltorak <[EMAIL PROTECTED]> writes:
>>
>> > Is it possible to include OBJEXT in Makefile.in to make this more
>> > cross-platform?
>>
>> I suppose so.  I mean, o is already defined to .@U@o, but I'm not
>> exactly sure what the U is supposed to stand for.
>
>
>It's looks to me as though @U@ is set up for some variable substitution,
>but I can't work out what for... Maybe it's getting replaced by NULL.
>
>

I know next to nothing about how Auto* is (supposed to be) working, but
I've seen lots of sed commands in


If @U@ is doing a variable substitution, then it'll expand to something
_before_ o
(if @U@ -> bar, then this will result in a dependency involving .baro)

(looking through configure)
Wget's configure contains this towards the end:

s%@U@%$U%g

U seems to be related to ansi2knr:

if(can use prototypes)
 U= ANSI2KNR=
else
 U=_ ANSI2KNR=./ansi2knr
endif

This will result in dependencies written as ._o if ansi2knr was run over
the sources.


This forces me to conclude that using @U@ _CAN_NOT_ and _WILL_NOT_ change
.o to .obj
I think .@U@o might need to be replaced with .@U@@objext@ (if there is such
a beast, in analogy with @exeext@)

Csaba



--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: wget1.7.1: Compilation Error (please Cc'ed to me :-)

2001-11-28 Thread csaba . raduly


On 28/11/2001 10:28:44 Daniel Stenberg wrote:

>On Wed, 28 Nov 2001, zefiro wrote:
>
>> ld: Undefined symbol
>>_memmove
>>
>> Do you have any suggestion ?
>
>SunOS 4 is known to not have memmove.
>

Isn't configure supposed to notice that ?


--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: minor memory leak risk

2001-11-20 Thread csaba . raduly


On 20/11/2001 10:12:05 Daniel Stenberg wrote:

>This subject says it all. The leak is minor, the fix could be made
>something like this:
>
>diff -u -r1.21 utils.c
>--- utils.c 2001/05/27 19:35:12 1.21
>+++ utils.c 2001/11/20 10:10:17
>@@ -903,7 +903,12 @@
>   while (fgets (line + length, bufsize - length, fp))
> {
>   length += strlen (line + length);
>-  assert (length > 0);
>+  if (0 == length)
>+{
>+  /* bad input file */
>+  xfree(line);
>+  return NULL;
>+}
>   if (line[length - 1] == '\n')
>break;
>   /* fgets() guarantees to read the whole line, or to use up the
>

It's not just a memory leak. Length <= 0 is declared as a "can't happen".
If length is zero, wget will suddenly end due to the assert.
If a bad input file can lead to length being zero, then using assert is bad
on principle. One should never assert external input.


--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: wget mirroring busted

2001-11-15 Thread csaba . raduly


On 14/11/2001 16:27:34 jwz wrote:

>[EMAIL PROTECTED] wrote:
>>
>> Can you post the entire debug log (on a web/ftp site, of course, not the
>> list).
>
>Done -- http://www.jwz.org/wget-log.gz
>
>Does this mean you can't reproduce this when you run wget the same
>way I did?
>

No, I just wanted to take a look at the surrounding lines in the log.

>wget -nv -m -nH -np \
>   http://www.dnalounge.com/flyers/
>   http://www.dnalounge.com/gallery/
>

I may try that myself.

P.S. Please *don't* CC in the future, I'm on the list.



--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: A tricky download

2001-10-12 Thread csaba . raduly


On 12/10/2001 16:49:07 "Edward J. Sabol" wrote:

[snip question about downloading a site with Javascript-only links]
>
>Probably not. If the only links to the other chapters are in JavaScript
>commands, then there's no way wget can do it. Wget does not interpret
>JavaScript and most likely never will.

Implementing it is left as an exercise for the reader.
;-)

--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: Recursive retrieval of page-requisites

2001-10-09 Thread csaba . raduly


On 09/10/2001 14:25:57 Andre Pang wrote:

>On Tue, Oct 09, 2001 at 03:46:52PM +0300, Mikko Kurki-Suonio wrote:
>
>> > To me that sounds like a logical combination of -r -np -p?
>> > Any correction appreciated.
>>
>> Doesn't work, apparently because -np overrides -p.
>>
>> I.e. with -np set, no document outside the selected subtree will be
>> loaded, whether it is referred to through regular link-traversal or as a
>> page-requisite element.
>>
>> My guess is that -p adds those links to the list of documents to load,
but
>> -np later rejects them because they're not within the selected subtree.
>>
>> What I'd basically like is a setting that loads page-requisites
REGARDLESS
>> OF ALL OTHER SETTINGS. I.e. you use the myriad of settings to fine tune
>> the exact set of pages requested, and then request "all requisites for
the
>> selected set of pages".
>
>Try this patch.  It should make -p _always_ get pre-requisites,
>even if you have -np on (which was the reason why i wrote the
>patch).  [snip]

Actually, case can be made for both ways.
Sometimes you might want -p to only get "images" conforming to -np. Perhaps
to skip (advertising)banners.
(those are usually served by another server, and thus ignored anyway unless
--span-hosts).

Perhaps make -p override -np, but have an "alternative" -p (e.g. -pnp )
which obeys -np.

I didn't see Andre's patch, so I cannot comment on it (stripped by my mail
system)-:
It modifies existing (admittedly confusing) behaviour; my suggestion would
permit getting the old behaviour back.

Another possibility would be to keep the existing behaviour (i.e. -np
overrides -p) and have a "stronger" -p (e.g. -pp ) which ignores -np.

Csaba


--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933




Re: GNU Wget 1.5.3

2001-09-04 Thread csaba . raduly


On 03/09/2001 23:18:14 Tomas Dalebjörk wrote:

>Hi,
>
>I like wget a lot.
>But I have found a bug in the program.
>
>If I want to download cgi-bin data (the output from a program on a
>server), it does not work.
>
>I issued the following command:
>
>daleto@modesty:~/slask > wget -d -O slask
>'http://www.atg.se/StartlistServlet?action=20&race=5&datum=&bankod=S

>ä&lopptyp=100&betType=V75'
[snip debug output from wget]
>Length: unspecified [text/html]
>
>0K -> .
>
>Closing fd 4
>00:14:43 (55.31 KB/s) - `slask' saved [9571]
>

Huh ?
Sounds like a success to me. You asked wget to save the page in a file
called slask, which it did.
What did you expect to happen ?

[snip]

>
>Is there a bug in the software, or just limited.
>

Every software has bugs, but some have more bugs than others ;-)
Wget is quite high quality software.

Note that wget 1.5.3 is quite old, the newest version is 1.7
But try to sort out your problem with 1.5.3 first.


--
Csaba Ráduly, Software Engineer   Sophos Anti-Virus
email: [EMAIL PROTECTED]http://www.sophos.com
US Support: +1 888 SOPHOS 9 UK Support: +44 1235 559933
Life is complex, with real and imaginary parts. +++ATH0 +++ATZ +++ARGGH




Re: Bus errors and recursion

2001-06-18 Thread csaba . raduly


[about alloca vs malloc]

If you allocate with malloc and then accidentally overwrite it, you get a
corrupted heap.
If you allocate with alloca and then accidentally overwrite it, you get a
corrupted stack.

Guess which is easier to notice :-)

Besides, alloca is a GCC builtin (IIRC), so you don't have to worry
about its implementation (the GCC folks do :). As long as you have the
stack to allocate from, it's as transparent as declaring automatic arrays
of variable length, e.g.

p = alloca( strlen(s) );

is almost the same as

char a[ strlen(s) ], *p = a;
/* this is a legal GCC extension */

--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933






Re: --spider downloads files

2001-05-18 Thread csaba . raduly


Are you doing --spider recursively ? In that case, wget HAS to download
HTML files, otherwise it can't find the links to recurse...

(Please DO NOT email me, I'm on the list)
--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933




   

From: Tom Gordon <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: --spider downloads files
Date: 17/05/01 16:53

   

   





When using the --spider option (GNU Wget 1.6), the URL is downloaded.
The doc says "it will not download the pages, just check that they are
there."

Please help, I need this functionality.

Thank you,

Tom Gordon
[EMAIL PROTECTED]







Yet another Makefile.watcom :-)

2001-05-16 Thread csaba . raduly


"An hour of careful debugging can save you five minutes of reading the
documentation"

(See attached file: Makefile.watcom)

This version gets rid of the ugly double list of object files (one for the
linker, one for the dependencies ).


WLINK expects the object files to be specified like this:

wlink FILE 1.obj,2.obj,etc_etc,n.obj NAME program.exe ...
  ^^

This is the format auto-generated by their IDE, BTW.
However, wlink also accepts an alternate way:

wlink FILE { 1.obj 2.obj etc_etc n.obj } NAME program.exe ...

What's more, this is actually present in the documentation (gasp)!



--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933




Re: WGET for OS/2 and Proxy-Server

2001-05-16 Thread csaba . raduly

   
 
From: Hrvoje Niksic
To: Thomas Bohn <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
Subject: Re: WGET for OS/2 and Proxy-Server
Date: 15/05/01 13:00
 
   
 
   
 









> Thomas Bohn <[EMAIL PROTECTED]> writes:
>
> > Hello,
> >
> > I tried to use WGET for OS/2 (tested V 1.5.3 and 1.6) with a proxy
> > server. Without proxy server all works fine. But with...
> >
> > In a OS/2 commandline session I type the following commands:
> >
> > SET HTTP_PROXY=62.52.17.1:80
>
> Your proxy setting gets ignored.  Try using lower-case `http_proxy'.
>

It seems to me that getenv has some "issues" on OS/2.
Workaround: use .wgetrc commands instead.

All environment variale names (i.e. the part before the '=') are uppercase
on OS/2

wget uses getenv("http_proxy"); the implementation of getenv seems to be
scanning _environ and doing a strncmp (i.e. case-sensitive comparison). If
getproxy in url.c is changed to getenv("HTTP_PROXY") then it does pick up
the environment setting.

Could we postulate that *ALL* environment vars influencing WGET be
uppercase ?
These are the places where getenv is used (excluding getopt.c)

init.c:237:  tmp = getenv ("no_proxy");
init.c:259:  char *home = getenv ("HOME");
init.c:292:  env = getenv ("WGETRC");
url.c:1292:proxy = opt.http_proxy ? opt.http_proxy : getenv ("http_proxy");
url.c:1294:proxy = opt.ftp_proxy ? opt.ftp_proxy : getenv ("ftp_proxy");
url.c:1297:proxy = opt.https_proxy ? opt.https_proxy : getenv ("https_proxy");
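
For illustration only (not a patch; the helper name is made up), a lookup
that falls back to the upper-case name would sidestep the OS/2 getenv
behaviour without forcing every variable name to uppercase:

#include <stdlib.h>
#include <string.h>
#include <ctype.h>

/* Try the name as given, then in all-uppercase. */
static char *getenv_ci (const char *name)
{
  char *value = getenv (name);
  if (!value)
    {
      char upper[64];
      size_t i, len = strlen (name);
      if (len < sizeof upper)
        {
          for (i = 0; i <= len; ++i)      /* copy including the NUL */
            upper[i] = toupper ((unsigned char) name[i]);
          value = getenv (upper);
        }
    }
  return value;
}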
--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933





RE: New and improved Makefile.watcom

2001-05-14 Thread csaba . raduly


   
   
From: Herold Heiko
To: Wget List <[EMAIL PROTECTED]>
Subject: RE: New and improved Makefile.watcom
Date: 14/05/01 12:05
   
   
   
   









> >-Original Message-
> >From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]]
> >Sent: Monday, May 14, 2001 11:23 AM
> >To: Wget List
> >Subject: Re: New and improved Makefile.watcom
> >
> >
> >[EMAIL PROTECTED] writes:
> >
> >> This is a rewrite of Makefile.watcom
> >
> >Thanks; I've put it in the repository.
> >
> >> # Copy this file to the ..\src directory (maybe rename to
> >Makefile). Also:
> >> # copy config.h.ms ..\src\config.h
> >
> >Maybe we should provide a "win-build" script (or something) that does
> >this automatically?
> >

How about this ?

config.h : ..\windows\config.h.ms
 copy $[@ $^@

(this would be "copy $< $@" for GNU make)

Yup, it works (for me ! :-)

>
> Isn't this what configure.bat is for ?

In theory, but...

> Default to VC (or use VC if --msvc is given), otherwise if env var
> BORPATH is present (or --borland is given) use borland, otherwise error.
>

I see no Watcom here :-) configure.bat doesn't know about Watcom C

Hrvoje also wrote:
> > #disabled for faster compiler
> > LFLAGS=sys nt op st=32767 op vers=1.7 op map op q op de 'GNU wget
1.7dev' de all
> > CFLAGS=/zp4 /d1 /w4 /fpd /5s /fp5 /bm /mf /os /bt=nt [snip]
> > # /zp4= pack structure members with this alignment
> > # /d1 = line number debug info
> > # /w4 = warning level
> > # /fpd= ??? no such switch !
> > # /5s = Pentium stack-based calling
> > # /fp5= Pentium floating point
> > # /bm = build multi-threaded
> > # /mf = flat memory model
> > # /os = optimize for size
> ^^^
> > # /bt = "build target" (nt)
>
> One thing I don't understand: why do you optimize for size?  Doesn't
> it almost always make sense to optimize for speed instead?>

Because I like small and sleek executables :-)
Are there any processor-intensive bits in wget ? Most of the time it'll
wait for the "Internet" anyway.


BTW, compiling with DEBUG_MALLOC reveals three memory leaks :
0x13830432: mswindows.c:72<-   *exec_name = xstrdup (*exec_name); in
windows_main_junk
0x13830496: mswindows.c:168   <-   wspathsave = (char*) xmalloc (strlen
(buffer) + 1); in ws_mypath
0x13830848: utils.c:1525  <-   (struct wget_timer *)xmalloc (sizeof
(struct wget_timer));

Here's another edition of Makefile.watcom
(See attached file: Makefile.watcom)
--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933



New and improved Makefile.watcom

2001-05-13 Thread csaba . raduly


This is a rewrite of Makefile.watcom.
It does away with the two separate OBJ file lists (one for dependencies, the
other for the linker command) which needed to be kept in sync.
The explicit dependency list is also gone (Watcom C can pass dependencies
to Watcom Make when using .AUTODEPEND).


wget/windows/(See attached file: Makefile.watcom)
--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933




Re: windows, continue bug

2001-05-10 Thread csaba . raduly


:-( Apologies for the top-posting. Please don't quote this message )-:

Yes, this patch produced correct behaviour (as far as I can tell :-)
Downloading a file (with -c) for the first time, regardless of whether the
server supported resume, succeeded.

Downloading a file (with -c) for the second time:
* If the server supports resume, then "File is fully downloaded, nothing to
do"
* If the server doesn't support resume, then "Refusing to truncate file"

Downloading a file (with -c) again, after manually truncating it:
* If the server supports resume, then it skips past the downloaded part
correctly, and gets the rest.
* If the server doesn't support resume, then "Refusing to truncate file"

It is up to somebody else to dream up more scenarios.
--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933




   
 
From: Hrvoje Niksic
Sent by: [EMAIL PROTECTED]
Subject: Re: windows, continue bug
Date: 09/05/01 19:26
 
   
 
   
 




[EMAIL PROTECTED] writes:

> At least the CVS version I downloaded on 9th of May still has the
problem:
>
> wget -c http://some.random.com/  results in
> "The file is already fully retrieved, nothing to do." and nothing is
> downloaded :-(

Ah, I see.  This is a different bug from the one Herold was seeing.
Thanks for the explanation.

Does this patch fix the problem?

 2001-05-09  Hrvoje Niksic  <[EMAIL PROTECTED]>

 * http.c (gethttp): Before concluding that the file is already
 fully retrieved, make sure that the file existed and `Range' was
 actually requested.

Index: src/http.c
===
RCS file: /pack/anoncvs/wget/src/http.c,v
retrieving revision 1.58
diff -u -r1.58 http.c
--- src/http.c 2001/05/08 11:47:05 1.58
+++ src/http.c 2001/05/09 18:25:41
@@ -1190,7 +1190,11 @@
   if (opt.always_rest)
 {
   /* Check for condition #2. */
-  if (hs->restval >= contlen)
+  if (hs->restval > 0/* restart was requested. */
+  && contlen != -1  /* we got content-length. */
+  && hs->restval >= contlen /* file fully downloaded
+   or has shrunk.  */
+  )
 {
   logputs (LOG_VERBOSE, _("\
 \nThe file is already fully retrieved; nothing to do.\n\n"));






Re: windows, continue bug

2001-05-09 Thread csaba . raduly

   
 
From: Hrvoje Niksic
Sent by: [EMAIL PROTECTED]
Subject: Re: windows, continue bug
Date: 08/05/01 12:52
 
   
 
   
 







> [EMAIL PROTECTED] writes:
>
> > I don't know about Heiko, but I got the sources from the CVS shortly
after
> > he posted his "windows, continue bug" message to the list.
>
> And yet the http.c code you showed looked different from what I
> assumed was the latest version.
>
> > It seems to me that your "fix" doesn't work.
>
> It wasn't supposed to fix the problem you had; it was a minor
> optimization.
>
> In the meantime I believe I found and fixed the real problem; updating
> to the latest CVS sources should fix the problem Heiko was seeing.
>


At least the CVS version I downloaded on 9th of May still has the problem:

wget -c http://some.random.com/  results in
"The file is already fully retrieved, nothing to do." and nothing is
downloaded :-(

I ran it under the debugger, this is what I saw:

In gethttp at http.c(1193), where the code is
 if( hs->restval >= contlen )
 {
  //say fully retrieved and bail with RETRUNNEEDED
 }

hs->restval is 0 and contlen is -1
contlen is -1 because the server didn't bother to send Content-Length :-(

(we got here because contrange was -1 [line 1172] and opt.always_rest was 1
[line 1190] )

Regardless what the comments say, wget didn't send any 'Range' request for
the server to honor.


It seems to me that:
IF the server doesn't send content-range, AND
   opt.always_reset==1 ( wget -c ) AND
   the server doesn't send Content-Length (so contlen==-1)
THEN
   hs->restval (at least 0) will always be >= contlen (-1),
   hence gethttp will abort with RETRUNNEEDED (why does it need a retrun ?
:-)
ENDIF

In other words, wget -c ... on a "lazy" (one that doesn't send
Content-Length) server will NOT download anything. This is not good, the
logic around here is faulty or the values aren't set up correctly.
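
A guard along these lines would avoid the false positive (sketch only;
Hrvoje's patch, quoted in the 2001-05-10 message above, does essentially
this):

  if (hs->restval > 0          /* a restart was actually requested */
      && contlen != -1         /* and the server sent Content-Length */
      && hs->restval >= contlen)
    {
      /* only then is "already fully retrieved" justified */
    }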

--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933





Re: windows, continue bug

2001-05-08 Thread csaba . raduly

   
 
From: Hrvoje Niksic
Sent by: [EMAIL PROTECTED]
Subject: Re: windows, continue bug
Date: 07/05/01 20:06
 
   
 
   
 







[EMAIL PROTECTED] writes:

> http_loop calls gethttp() at line 1539, but the following is only
> at line 1554:
>
> if( opt.always_rest )
> hstat.no_truncate = file_exists_p(locf);
>
> Moving these two lines *above* the call to gethttp() on line 1554,
> the file was downloaded correctly.

How are you guys getting this?  The latest source from the CVS should
look different, and should in fact work.

(I've just applied another fix, this time a small optimization.)
 end quoted 

I don't know about Heiko, but I got the sources from the CVS shortly after
he posted his "windows, continue bug" message to the list.

It seems to me that your "fix" doesn't work. Compiling and running (a CVS
checkout around 9:30 BST, +0100) on both OS/2 (gcc) and windows (watcom)
produced this:

->8
DEBUG output created by Wget 1.7-dev on os2-emx.

parseurl ("http://some.random.com/";) -> host some.random.com -> opath  -> dir  -> file 
 -> ndir
newpath: /
--11:09:30--  http://some.random.com/
   => `index.html'
Connecting to some.random.com:80... Caching some.random.com <-> 10.1.1.9
Created fd 3.
connected!
---request begin---
GET / HTTP/1.0

User-Agent: Wget/1.7-dev

Host: some.random.com

Accept: */*

Connection: Keep-Alive



---request end---
HTTP request sent, awaiting response... HTTP/1.1 200 OK
Date: Tue, 08 May 2001 10:09:28 GMT
Server: Apache/1.3.14 (Unix) PHP/4.0.4pl1
X-Powered-By: PHP/4.0.4pl1
Connection: close
Content-Type: text/html



The file is already fully retrieved; nothing to do.

Closing fd 3
->8

Note: the file was *NOT* retrieved before.

--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933





Re: windows, continue bug

2001-05-04 Thread csaba . raduly


You mean this ?

--->8---
DEBUG output created by Wget 1.7-dev on Windows.

parseurl ("http://turtle.power.org/";) -> host turtle.power.org -> opath  -> dir  -> 
file  -> ndir
newpath: /
Checking for turtle.power.org in host_name_address_map.
Checking for turtle.power.org in host_slave_master_map.
First time I hear about turtle.power.org by that name; looking it up.
Caching turtle.power.org <-> 10.1.1.9
Checking again for turtle.power.org in host_slave_master_map.
--10:35:49--  http://turtle.power.org/
   => `turtle.power.org/index.html'
Connecting to turtle.power.org:80... Found turtle.power.org in host_name_address_map: 
10.1.1.9
Created fd 88.
connected!
---request begin---
GET / HTTP/1.0

User-Agent: Wget/1.7-dev

Host: turtle.power.org

Accept: */*

Connection: Keep-Alive



HTTP request sent, awaiting response... HTTP/1.1 200 OK
Date: Fri, 04 May 2001 09:35:48 GMT
Server: Apache/1.3.14 (Unix) PHP/4.0.4pl1
X-Powered-By: PHP/4.0.4pl1
Connection: close
Content-Type: text/html



The server does not support continued download;
refusing to truncate `turtle.power.org/index.html'.


FINISHED --10:35:49--
Downloaded: 0 bytes in 0 files
--->8---

It's not just on Windows; happens on OS/2 ( compiled with GCC ) too.

Debugging it suggests that hstat.no_truncate doesn't get initialized
(no_truncate contains a dodgy, random-looking value):

http_loop calls gethttp() at line 1539, but the following is only
at line 1554:

if( opt.always_rest )
hstat.no_truncate = file_exists_p(locf);

Moving these two lines (from line 1554) to *above* the call to gethttp()
at line 1539, the file was downloaded correctly.

--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933






RE: Fix safe-ctype detection

2001-04-27 Thread csaba . raduly


(Re: defining inline to nothing) That's strange...VC understands inline.
Did you try to define it to __inline ?

Try substituting ( #defining ) ftruncate to chsize.
I had this problem when trying to compile FTE (fte.sourceforge.net) with
VisualAge C++ (which also doesn't have ftruncate).
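
Something like this would do it (sketch only; HAVE_FTRUNCATE is an
assumption, use whatever guard the port really has):

/* Map the missing POSIX ftruncate() onto the chsize() that
   MSVC/Watcom/VisualAge provide (declared in <io.h>). */
#ifndef HAVE_FTRUNCATE
# include <io.h>
# define ftruncate(fd, length) chsize ((fd), (length))
#endif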

--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933




   
   
From: Herold Heiko
cc: "List Wget (E-mail)" <[EMAIL PROTECTED]>
Subject: RE: Fix safe-ctype detection
Date: 27/04/01 10:13
   
   
   
   
   




It does work (I suppose this means no inline optimizations).
However then it stops later at linking stage. Either there is no
ftruncate (used in http.c, ftp.c) function or my compiler is not yet set
up correctly.

[snip]





Re: Anon FTP password

2001-04-26 Thread csaba . raduly


> Following the example set by lftp, I'll change Wget to send "-wget@"
> as anonymous FTP password, with the option of changing it.  That way
> we will have a decent default, and enable the users who know what
> they're doing to change it to their email address, if they're
> oldfashioned, or to something even more anonymizing, like "mozilla@".

You mean "-wget@" with no host ? Won't some FTP sites consider that
as invalid ?

--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933






Re: Bundling libtool

2001-03-27 Thread csaba . raduly

   

"Dan   

Harkless"To: Wget List <[EMAIL PROTECTED]>   

   Subject: Re: Bundling libtool 

   

27/03/01   

12:55  

   

   








[snipped]
> Hrvoje Niksic <[EMAIL PROTECTED]> writes:
> > Is it the standard configure caching mechanism, the "(cached)" thing?
> > I think that can be turned off.
>
> Per-check, you mean?  We wouldn't want to turn it off for the whole
> configure run just for the benefit of this check.  I looked at the
autoconf
> documentation but didn't see a way to turn it off for a particular
> (predefined) check.  Custom checks are not automatically cached, though,
so
> doing the AC_CHECK_LIB stuff manually would do the trick.

One can always manually delete the corresponding line in config.cache

--
Csaba Ráduly, Software Engineer Sophos Anti-Virus
email: [EMAIL PROTECTED]  http://www.sophos.com
US support: +1 888 SOPHOS 9   UK Support: +44 1235 559933






Re: wget bug - after closing control connection

2001-03-08 Thread csaba . raduly


Which version of wget do you use ? Are you aware that wget 1.6 has been
released and 1.7 is in development (and they contain a workaround for the
"Lying FTP server syndrome" you are seeing) ?
--
Csaba Ráduly, Software Engineer  Sophos Anti-Virus
email: [EMAIL PROTECTED]   http://www.sophos.com
US support: +1 888 SOPHOS 9UK Support: +44 1235 559933






Re: Wget

2001-03-06 Thread csaba . raduly


I'm confused. I thought 1.5.3 *did* display the dots, but I could be wrong.

Please send queries like this to the list ( [EMAIL PROTECTED] ), not to me
personally.
--
Csaba Ráduly, Software Engineer  Sophos Anti-Virus
email: [EMAIL PROTECTED]   http://www.sophos.com
US support: +1 888 SOPHOS 9UK Support: +44 1235 559933
:-( sorry for the top-posting )-:




   

From: [EMAIL PROTECTED] (Timo Maier)
To: [EMAIL PROTECTED]
Subject: Re: Wget
Date: 06/03/01 10:58

   

   





Hi!

>The newest wget is 1.6 release and 1.7 developer.

I have GNU Wget 1.5.3 which doesn't display the dots; it looks like
this:

>---
Connecting to www.telekom.de:80... connected!
HTTP request sent, awaiting response... 206 Partial content
Length: 4,509,742 (4,267,794 to go) [application/octet-stream]

3.05Mb (236.28kb) done at 5.19 KB/s. time: 0:09:16 (0:04:05 left)
>---

Is it possible to implement this in new versions, too?

TAM
--
OS/2 Warp4, Ducati 750SS '92
You still have the freedom to learn and say what you wanna say
http://tam.belchenstuermer.de






Re: Windows ssl enabled binary

2001-01-24 Thread csaba . raduly


>At http://www.geocities.com/heiko_herold you can find a ssl enabled
>windows binary, which however does still need thorough testing (please
>feedback on the list).
>


Me too, except s/windows/os2/g at http://www.geocities.com/csaba_22/
It needs the EMX runtime.


"configure --with-ssl" produced a Makefile with LIBS= -lcrypto -lssl
-lsocket
This caused lots of link errors. Changing it to LIBS= -lssl -lcrypto
-lsocket
then produced a wget.exe which successfully connected to and downloaded a
few files from my Apache+mod_ssl server via https.

--
Csaba Ráduly, Programmer - OS/2  Sophos Anti-Virus
email: [EMAIL PROTECTED]   http://www.sophos.com
US support: +1 888 SOPHOS 9UK Support: +44 1235 559933





RE: SUGGESTION: rollback like GetRight

2001-01-10 Thread Csaba Raduly

On 10/01/2001 08:50:18 ZIGLIO Frediano wrote:

>I suggest two parameter:
>- rollback-size
>- rollback-check-size
>where 0 <= rollback-check-size <= rollback-size
>The first for calculate the beginning of range (filesize - rollback-size)
>and the second for check (wget should check the range [filesize -
>rollback-size,filesize - rollback-size + rollback-check-size) )
>

I was thinking of making -c have an optional parameter specifying the rollback.
If this was defaulted to 0, it can be given to lseek( , , SEEK_END )
(it would be nice if it could accept a 'k' suffix)

The check size then could be specified separately.
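
A rough sketch of the idea (illustration only, names made up, not tied to
any particular wget source line): roll the local file back by N bytes
before asking the server to resume.

#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Discard the last `rollback' bytes of a partially downloaded file and
   return the offset to put into the Range request. */
static off_t prepare_restart (const char *file, off_t rollback)
{
  struct stat st;
  off_t restart_at;

  if (stat (file, &st) != 0)
    return 0;                        /* nothing downloaded yet */
  restart_at = st.st_size > rollback ? st.st_size - rollback : 0;
  truncate (file, restart_at);       /* drop the possibly-corrupt tail */
  return restart_at;
}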

Csaba


--
Csaba Ráduly, Programmer - OS/2  Sophos Anti-Virus
email: [EMAIL PROTECTED]   http://www.sophos.com
US support: +1 888 SOPHOS 9UK Support: +44 1235 559933