At 08:20 PM 9/29/2002  -0400, David A. Desrosiers wrote:
>-----BEGIN PGP SIGNED MESSAGE-----
>Hash: SHA1
>
>
> > David, what did you see?  What I saw on my system was that it held the
> > pluck to the host, which is what I would expect given that --stayonhost is
> > more rigid than --stayondomain.  Other than that, it functioned normally.
>
>         Look even closer. images.slashdot.org is not fetched (though still
>is part of the --stayondomain construct of slashdot.org), when --stayonhost
>is used. If you omit --stayonhost, you will properly get all the pages from
>images.slashdot.org as well as slashdot.org content.

Yes, this is precisely what I expected. "images.slashdot.org" is not the 
host "slashdot.org".  Perhaps I'm not understanding... when I do 
--stayonhost without --stayondomain, specifying "http://slashdot.org", I do 
not get "images.slashdot.org".  This is pre-existing behavior.  If I do 
"--stayondomain" without "--stayonhost", I do get "images.slashdot.org", 
which is what I figured was desired behavior.  If I do both, I've told it 
to not just stay on the "slashdot.org" domain, but on the "slashdot.org" 
host, correct?

What behavior would you expect under that circumstance?  I'm willing to 
change it, but intuitively it seems to me that --stayonhost should trump 
--stayondomain, as it currently does.
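
To make the precedence concrete, here is a minimal sketch of the link filter as I understand the behavior described above. It is not the actual Plucker spider code: the names host_of, same_domain and should_follow are hypothetical, and the domain test is a naive suffix match against the starting host, which is an assumption on my part.

```python
from urllib.parse import urlparse


def host_of(url):
    """Return the lower-cased host name of a URL ('' if there is none)."""
    return (urlparse(url).hostname or "").lower()


def same_domain(host, start_host):
    """Naive domain test (assumption): the starting host is the domain,
    and any host that equals it or ends in '.<start_host>' belongs to it."""
    return host == start_host or host.endswith("." + start_host)


def should_follow(url, start_url, stay_on_host=False, stay_on_domain=False):
    """Decide whether a link is in scope.

    stay_on_host is checked first, so it trumps stay_on_domain when
    both are set.
    """
    host = host_of(url)
    start_host = host_of(start_url)
    if stay_on_host:
        return host == start_host
    if stay_on_domain:
        return same_domain(host, start_host)
    return True


if __name__ == "__main__":
    start = "http://slashdot.org"
    link = "http://images.slashdot.org/logo.gif"
    print(should_follow(link, start, stay_on_domain=True))
    print(should_follow(link, start, stay_on_host=True, stay_on_domain=True))
```

With only stay_on_domain set, the images.slashdot.org link is followed; with both set, the host check runs first and the link is skipped.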

>         "I don't see why you can't simply load http://slashdot.org/palm/.."
>
>         So basically, don't pound the main page. Use the lean page, always,
>or use the RSS content they make available (or use the slashpluck script,
>which does the same thing).

I haven't been, except for a few tests.  I pound my own website about three 
meters from me.

> > I just overlooked that possibility because it's a radio button (i.e.
> > mutually exclusive) in Desktop.
>
>         Bleh, radio button. My shell script doesn't include radio buttons,
>so remember that what you might use as your interface, may not be the same
>that others use for theirs. Your changes affect more than your GUI
>interface. Great work overall, keep it up.

Like I said, I'm happy to put an exclusion in there... I'm just 
unclear on how the current behavior differs from what you'd expect.

How's this for a solution... if both are specified, I'll print out a 
warning that stayonhost overrides stayondomain and spidering will be 
limited accordingly.  Sound good?
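
Roughly what I have in mind, as a sketch only: the function name resolve_scope, the scope labels and the exact warning wording are hypothetical and not taken from the current code.

```python
import sys

# Hypothetical warning text; the final wording may differ.
OVERRIDE_WARNING = (
    "Warning: --stayonhost overrides --stayondomain; "
    "spidering will be limited to the starting host."
)


def resolve_scope(stay_on_host, stay_on_domain):
    """Return (scope, warning) for the two flags.

    scope is 'host', 'domain' or 'any'; warning is a message string
    when both flags were given, otherwise None.
    """
    warning = OVERRIDE_WARNING if (stay_on_host and stay_on_domain) else None
    if stay_on_host:
        return "host", warning
    if stay_on_domain:
        return "domain", warning
    return "any", warning


if __name__ == "__main__":
    scope, warning = resolve_scope(stay_on_host=True, stay_on_domain=True)
    if warning:
        # Send the warning to stderr so it does not mix with normal output.
        print(warning, file=sys.stderr)
    print("effective scope:", scope)
```

Returning the warning instead of printing it inside the function lets the command-line tool and the Desktop GUI each report it in their own way.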

Regards
         Tony McNamara

_______________________________________________
plucker-list mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list
