Hey all,

I'm using the standard Python Plucker converter that was installed with
Plucker Desktop 1.2.0.4 on Win 2K.  I have a "Daily news" channel that
spiders a whole bunch of things, including the NYT, csmonitor, etc.  One
site I've been having particular trouble with is the Nature Science
Update, which seems to stall on occasion.  When this happens I'd like to
skip the document.  In the headers of the text window, I see a "Timeout:
never" setting, but I can't find any mention of the "Timeout" setting in
the Plucker documentation online.

So, I have two questions:

1.  Does the Timeout apply to individual HTTP requests, or is it an overall
"the whole document hasn't finished" kind of deal?

2.  Where do I implement that setting?  In the plucker.ini?  In the
home.html?

I already have a kludge to get around it:

In my home.html, I add <!-- slow --> to the lines containing links to the
content that fails most consistently, then I grep -v "<!-- slow -->" and
create another home.html in another directory, which is otherwise a clone
of the original (same DB name and everything).  I set up the Windows Task
Scheduler to run both at the same time, and the pruned version completes
earlier.  Then, if the unpruned one times out, I still get most of the
content I want, and if the unpruned one completes, it overwrites the DB of
the fast one.
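
For anyone without grep on Windows, the pruning step could be sketched in
Python roughly like this.  It is only an illustration of the grep -v idea:
the function names and the paths in the usage note are made up for the
example, and nothing here is part of Plucker itself.

```python
import os

# Marker comment placed on the lines of home.html that link to slow sites.
SLOW_MARKER = "<!-- slow -->"


def prune_slow_lines(lines, marker=SLOW_MARKER):
    """Return the lines that do NOT contain the marker (same idea as grep -v)."""
    return [line for line in lines if marker not in line]


def write_pruned_copy(src_path, dst_path, marker=SLOW_MARKER):
    """Copy src_path to dst_path, dropping every line that contains the marker.

    latin-1 maps every byte to a character and newline="" disables newline
    translation, so the lines that are kept are written back byte for byte.
    Returns the number of lines written.
    """
    with open(src_path, encoding="latin-1", newline="") as src:
        kept = prune_slow_lines(src.readlines(), marker)

    # Create the target directory if it does not exist yet.
    dst_dir = os.path.dirname(dst_path)
    if dst_dir:
        os.makedirs(dst_dir, exist_ok=True)

    with open(dst_path, "w", encoding="latin-1", newline="") as dst:
        dst.writelines(kept)
    return len(kept)
```

Calling write_pruned_copy("home.html", os.path.join("fast", "home.html"))
would then produce the pruned copy in a hypothetical "fast" directory, and
the second scheduled task would point at that copy.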

-alan

-- 
    Alan Hoyle  -  [EMAIL PROTECTED]  -  http://www.alanhoyle.com/
      "I don't want the world, I just want your half." -TMBG
                 Get Horizontal, Play Ultimate.




