spanhost and recursive.

2001-08-24 Thread Anders Rosendal

Could you make an option to only fetch from other hosts what is directly referenced
from the orig page?

Is this a TODO?
-- 
Make software - not war
The box said win95 or better, so I installed linux



RE: Help me!!!

2001-08-24 Thread Dell, Kevin

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Say What?

-----Original Message-----
From: Irina [mailto:[EMAIL PROTECTED]]
Sent: 22 August 2001 23:18
To: [EMAIL PROTECTED]
Subject: Help me!!!


Please respond, anyone who can! I live in Ukraine. I am 37 years old.
My husband left me. I have no relatives. I am left alone with two
children. I became disabled, after which I was laid off from my job.
Nobody will hire me anywhere any more - nobody needs the disabled. The
pension they pay is very small; you cannot even buy bread with it. My
children do not get proper food, and I cannot give them an education -
tuition has to be paid for. They have no future at all. Please respond.
Help me and my children - you are my last hope! I am grateful to the
people who helped send this letter to the Internet.
P.S. This is a woman in a very difficult situation; if you can help,
respond to this address: [EMAIL PROTECTED]


-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 7.0.3 for non-commercial use http://www.pgp.com

iQA/AwUBO4ZKkwxuP44/+NFmEQIn+QCeKWz5Cqr9fFqnhDAvqoktESbvrI8AoL0M
d038RQ5CyTglCFeIQCmTCe6R
=ZngW
-----END PGP SIGNATURE-----

 PGPexch.rtf.asc


Re: wget -k crashes when converting a specific url

2001-08-24 Thread Ian Abbott

On 23 Aug 2001, at 13:33, Edward J. Sabol wrote:

 Nathan J. Yoder wrote:
  Please fix this soon,
  
  ***COMMAND***
  wget -k http://reality.sgi.com/fxgovers_houst/yama/panels/panelsIntro.html
 [snip]
  02:30:05 (23.54 KB/s) - `panelsIntro.html' saved [3061/3061]
  
  Converting panelsIntro.html... zsh: segmentation fault (core dumped)
 
 Ian Abbott replied:
  I cannot reproduce this failure on my RedHat 7.1 box.
 
 I was able to reproduce this pretty easily on both Irix 6.5.2 and Digital
 Unix 4.0d, using gcc 2.95.2. (I bet Linux's glibc has code to protect against
 fwrite() calls with negative lengths.)
 
 The problem occurs when you have a single tag with multiple attributes that
 specify links that need to be converted. In this case, it's an IMG tag with
 SRC and LOWSRC attributes. The urlpos structure passed to convert_links() is
 a linked list of pointers to where the links are that needed to be converted.
 The problem is that the links are not in positional order. The second
 attribute is in the linked list before the first attribute, causing the
 length of the string to be printed out to be a negative number.

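To make the failure Edward describes concrete, here is a minimal
standalone sketch of the arithmetic (hypothetical code, not wget's
actual convert_links(); the HTML string and offsets are invented for
illustration):

#include <stdio.h>

/* The converter writes the text between the previous link position and
   the current one.  If the recorded positions are not in ascending
   order, the computed length goes negative, and fwrite() receives it
   converted to a huge size_t.  */
int
main (void)
{
  const char *html = "<img src=\"a.png\" lowsrc=\"b.png\">";
  /* Offsets as recorded for this IMG tag, out of positional order:
     the LOWSRC value's offset first, then the SRC value's.  */
  int positions[] = { 25, 10 };
  int prev = 0, i;

  for (i = 0; i < 2; i++)
    {
      int len = positions[i] - prev;  /* -15 on the second iteration */
      if (len < 0)
        printf ("negative length %d would reach fwrite()\n", len);
      else
        fwrite (html + prev, 1, len, stdout);
      prev = positions[i];
    }
  return 0;
}
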
Thanks for tracking that down. I've now found the problem, fixed it and 
created a patch (attached) against the current CVS sources.

 Here's a diff (against the current CVS sources) which will prevent the core
 dump, but please note that it does not fix the problem. html-parse.c and
 html-url.c are some dense code, and I'm still wading through it. (Also, it's
 not clear if the linked list is supposed to be in positional order or if
 convert_links() is wrongly assuming that.)

[snipped the diff]

At least that extra code was a convenient place for me to stick a 
breakpoint on in gdb, and it also helped me verify that I've nailed the 
bug (I checked the converted HTML file too, of course!).

It's a shame Hrvoje Niksik's not around at the moment to apply all 
these patches to the repository.



Index: src/html-url.c
===================================================================
RCS file: /pack/anoncvs/wget/src/html-url.c,v
retrieving revision 1.10
diff -u -r1.10 html-url.c
--- src/html-url.c  2001/05/27 19:35:02 1.10
+++ src/html-url.c  2001/08/24 15:07:49
@@ -383,7 +383,7 @@
     {
     case TC_LINK:
       {
-       int i;
+       int i, id, first;
        int size = ARRAY_SIZE (url_tag_attr_map);
        for (i = 0; i < size; i++)
          if (url_tag_attr_map[i].tagid == tagid)
@@ -391,25 +391,34 @@
        /* We've found the index of url_tag_attr_map where the
           attributes of our tags begin.  Now, look for every one of
           them, and handle it.  */
-       for (; (i < size && url_tag_attr_map[i].tagid == tagid); i++)
+       /* Need to process the attributes in the order they appear in
+          the tag, as this is required if we convert links.  */
+       first = i;
+       for (id = 0; id < tag->nattrs; id++)
          {
-           char *attr_value;
-           int id;
-           if (closure->dash_p_leaf_HTML
-               && (url_tag_attr_map[i].flags & AF_EXTERNAL))
-             /* If we're at a -p leaf node, we don't want to retrieve
-                links to references we know are external to this document,
-                such as <a href=...>.  */
-             continue;
+           /* This nested loop may seem inefficient (O(n^2)), but it's
+              not, since the number of attributes (n) we loop over is
+              extremely small.  In the worst case of IMG with all its
+              possible attributes, n^2 will be only 9.  */
+           for (i = first; (i < size && url_tag_attr_map[i].tagid == tagid);
+                i++)
+             {
+               char *attr_value;
+               if (closure->dash_p_leaf_HTML
+                   && (url_tag_attr_map[i].flags & AF_EXTERNAL))
+                 /* If we're at a -p leaf node, we don't want to retrieve
+                    links to references we know are external to this document,
+                    such as <a href=...>.  */
+                 continue;
 
-           /* This find_attr() buried in a loop may seem inefficient
-              (O(n^2)), but it's not, since the number of attributes
-              (n) we loop over is extremely small.  In the worst case
-              of IMG with all its possible attributes, n^2 will be
-              only 9.  */
-           attr_value = find_attr (tag, url_tag_attr_map[i].attr_name, &id);
-           if (attr_value)
-             handle_link (closure, attr_value, tag, id);
+               if (!strcasecmp (tag->attrs[id].name,
+                                url_tag_attr_map[i].attr_name))
+                 {
+                   attr_value = tag->attrs[id].value;
+                   if (attr_value)
+                     handle_link (closure, attr_value, tag, id);
+                 }
+             }
          }
       }
       break;
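
For anyone wanting to test it: the patch applies from the top-level
wget source directory with plain patch(1) usage (assuming the diff
above is saved as html-url.diff):

     patch -p0 < html-url.diff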



Re: spanhost and recursive.

2001-08-24 Thread Edward J. Sabol

Anders Rosendal asked:
 Could you make an option to only fetch from other hosts what is directly
 referenced from the orig page?

Have you tried the --page-requisites (a.k.a. -p) command line option?

The info documentation says this:

 Actually, to download a single page and all its requisites (even
 if they exist on separate websites), and make sure the lot
 displays properly locally, this author likes to use a few options
 in addition to `-p':

      wget -E -H -k -K -nh -p http://SITE/DOCUMENT

 In one case you'll need to add a couple more options.  If DOCUMENT
 is a `<FRAMESET>' page, the "one more hop" that `-p' gives you
 won't be enough--you'll get the `<FRAME>' pages that are
 referenced, but you won't get _their_ requisites.  Therefore, in
 this case you'll need to add `-r -l1' to the commandline.  The `-r
 -l1' will recurse from the `<FRAMESET>' page to the `<FRAME>'
 pages, and the `-p' will get their requisites.  If you're already
 using a recursion level of 1 or more, you'll need to up it by one.
 In the future, `-p' may be made smarter so that it'll do "two
 more hops" in the case of a `<FRAMESET>' page.

 To finish off this topic, it's worth knowing that Wget's idea of an
 external document link is any URL specified in an `<A>' tag, an
 `<AREA>' tag, or a `<LINK>' tag other than `<LINK
 REL="stylesheet">'.
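
For the `<FRAMESET>' case described there, the combined command line
would therefore look like this (a sketch assembled from the options
quoted above, nothing more):

     wget -E -H -k -K -nh -p -r -l1 http://SITE/DOCUMENT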





Patch to make SocksV5 work

2001-08-24 Thread Peter_Korman




The attached patch will fix wget so that it works
through a SOCKS firewall. The client library
calls are changed from the form Rconnect to the form SOCKSconnect.
In addition, it works with some pretty cute C preprocessor
token-pasting techniques that must be seen to be believed.

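As a rough illustration of the kind of token pasting involved (a
hypothetical sketch with invented macro names, not the attached patch):

/* With e.g. -DSOCKS_PREFIX=SOCKS5 on the compile line, every
   connect() in the source becomes SOCKS5connect(), and likewise for
   the other socket calls the SOCKS client library wraps.  */
#ifdef SOCKS_PREFIX
# define SOCKS_PASTE2(a, b) a##b
# define SOCKS_PASTE(a, b) SOCKS_PASTE2 (a, b)
# define connect SOCKS_PASTE (SOCKS_PREFIX, connect)
# define getsockname SOCKS_PASTE (SOCKS_PREFIX, getsockname)
# define recv SOCKS_PASTE (SOCKS_PREFIX, recv)
# define send SOCKS_PASTE (SOCKS_PREFIX, send)
#endif
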
If you don't need wget to get past a SOCKS firewall
then you don't need this patch. It is possible someone
has already done this, but nobody had as of June 2001.

I'd like to get feedback on this because I have only tested
it on Red Hat 7.1 against an installation of socks5-v1.0r11.


If you need it and this fix does not work, let me know.
I'll try to fix it.


(See attached file: wget-1.7-socksv5.patch.sig) (See attached file:
wget-1.7-socksv5.patch)



Please note: this patch will break wget when built against
libraries that expect a call to Rconnect rather than
SOCKSconnect.


 wget-1.7-socksv5.patch.sig
 wget-1.7-socksv5.patch