Re: HELP plucking large website

Jeff Mikels Tue, 26 Aug 2003 06:29:54 +0000

David's message to me is at the bottom.

I have Plucker 1.4 for Windows, and I also have Python (2.3, I think) installed on my Windows 2000 machine. However, this version of Plucker did not come with the plucker-build command-line executable. I'm using Plucker Desktop.

After sending my message to this list, I used wget to capture a local mirror of the entire website. Then I plucked that mirror and got exactly the same error message. Oh, and I exclude the letters directory entirely with an exclusion in Plucker Desktop.

Incidentally, I would love to have the plucker-build executable again!

Jeff M.

ERROR MESSAGE -----------

Traceback (most recent call last):
  File "C:\Program Files\Plucker/parser/python/PyPlucker/Spider.py", line 1641, in ?
    sys.exit(realmain(None))
  File "C:\Program Files\Plucker/parser/python/PyPlucker/Spider.py", line 1626, in realmain
    retval = main (config, exclusion_lists)
  File "C:\Program Files\Plucker/parser/python/PyPlucker/Spider.py", line 1106, in main
    mapping = writer.write (verbose=verbosity, alias_list=alias_list)
  File "C:\Program Files\Plucker/parser/python\PyPlucker\Writer.py", line 520, in write
    result = Writer.write (self, verbose, alias_list=alias_list)
  File "C:\Program Files\Plucker/parser/python\PyPlucker\Writer.py", line 310, in write
    self._mapper = Mapper(self._collection, alias_list.as_dict())
  File "C:\Program Files\Plucker/parser/python\PyPlucker\Writer.py", line 102, in __init__
    self._get_id_for_doc(doc)
  File "C:\Program Files\Plucker/parser/python\PyPlucker\Writer.py", line 112, in _get_id_for_doc
    id = self._url_to_id_mapping.get(doc.get_url())
AttributeError: 'None' object has no attribute 'get_url'
Installing channel output to destinations...
Setting new due date...
Tasks completed for all channels.
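
For readers following along: the traceback suggests that the collection handed to Mapper contained a None entry where a document object was expected, so the call to doc.get_url() failed. The snippet below is only a minimal sketch of that failure mode and of one possible guard. It is not Plucker's actual code: FakeDoc, SketchMapper, the id scheme, and the sample URLs are hypothetical stand-ins chosen for illustration.

```python
class FakeDoc:
    """Hypothetical stand-in for a parsed document object."""

    def __init__(self, url):
        self._url = url

    def get_url(self):
        return self._url


class SketchMapper:
    """Hypothetical, simplified mapper that assigns an integer id per URL."""

    def __init__(self, collection):
        self._url_to_id_mapping = {}
        self.skipped = []  # keys whose document entry was None
        for key, doc in collection.items():
            if doc is None:
                # Without this guard, doc.get_url() raises AttributeError,
                # which is the kind of failure shown in the traceback.
                self.skipped.append(key)
                continue
            self._get_id_for_doc(doc)

    def _get_id_for_doc(self, doc):
        url = doc.get_url()
        doc_id = self._url_to_id_mapping.get(url)
        if doc_id is None:
            # Assumed id scheme: sequential integers starting at 1.
            doc_id = len(self._url_to_id_mapping) + 1
            self._url_to_id_mapping[url] = doc_id
        return doc_id


# Assumed sample data: one good document and one missing (None) entry.
collection = {
    "http://www.v-a.com/": FakeDoc("http://www.v-a.com/"),
    "http://www.v-a.com/missing.html": None,
}
mapper = SketchMapper(collection)
print("skipped:", mapper.skipped)
```

Reporting the skipped keys instead of crashing would at least show which URL produced the empty entry, which is the piece of information the traceback alone does not give.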

----------- YOUR MESSAGE

Message: 4
Date: Mon, 25 Aug 2003 13:24:26 -0400 (EDT)
From: "David A. Desrosiers" <[EMAIL PROTECTED]>
To: Plucker General List <[EMAIL PROTECTED]>
Subject: Re: HELP plucking large website
Reply-To: [EMAIL PROTECTED]

I have been trying to pluck the details of a very large website. But I
keep getting a python error. Here's the pertinent information.


    I just plucked this site with the Python-based parser tools on Linux
and FreeBSD, and it seems to work perfectly here, generating an 807,340-byte
file. Maybe something Windows-specific is broken?

The syntax I used was:

plucker-build -H "http://www.v-a.com"                  \
              --maxheight=150 --maxwidth=150    \
              --maxdepth=5 -f Bible --zlib-compression \
              --staybelow="http://www.v-a.com" --bpp=4
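
The same invocation can also be assembled as an argument list, which avoids shell quoting problems when it is launched from a script. This is only a sketch: build_plucker_command is a hypothetical helper, the flags are copied from the command above, and running the result assumes a plucker-build executable is on the PATH (which the Windows install described in this thread reportedly lacks).

```python
import shlex


def build_plucker_command(home_url, doc_name, max_depth=5, bpp=4, max_size=150):
    """Hypothetical helper: build the argument list for the command quoted above."""
    return [
        "plucker-build",
        "-H", home_url,
        f"--maxheight={max_size}",
        f"--maxwidth={max_size}",
        f"--maxdepth={max_depth}",
        "-f", doc_name,
        "--zlib-compression",
        f"--staybelow={home_url}",
        f"--bpp={bpp}",
    ]


cmd = build_plucker_command("http://www.v-a.com", "Bible")
# Print a copy-and-paste form of the command (POSIX shell quoting).
print(shlex.join(cmd))
# To actually run it (assumes plucker-build is installed and on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```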

        The only part I had to work around was that it prompts for a
username/password on /letters in the parse, but I just hit enter and
bypassed it. Other than that, it worked fine.

_______________________________________________
plucker-list mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list
