SORRY FOR THE LENGTH OF THIS EMAIL!

I have been trying to pluck a very large website, but I keep getting a
Python error. Here is the pertinent information.

FROM THE PLUCKER.INI FILE
---------------------------
[AramaicBibleTranslation]
after_command=
alt_maxheight=0
alt_maxwidth=0
alt_text=1
anchor_color=#0000FF
backup_bit=1
before_command=
big_icon=
bpp=4
category=
charset=
compression=zlib
copyprevention_bit=0
copy_to_dir=C:\Documents and Settings\Jeff Mikels\My Documents\Resources
depth_first=0
directory_on_card=
doc_file=channels/AramaicBibleTranslation/AramaicBibleTranslation
doc_name=Aramaic Bible Translation
exclusion_lists=channels/AramaicBibleTranslation/exclusionlist.txt
handheld_target_storage_mode=1
home_maxdepth=5
home_stayondomain=0
home_stayonhost=0
home_url=http://www.v-a.com
home_url_pattern=.*\.v-a\.com.*
image_compression_limit=0
indent_paragraphs=0
is_usb_pause=1
launchable_bit=0
maxheight=250
maxwidth=150
no_urlinfo=1
owner_id_build=
referrer=
small_icon=
status_line_length=60
tables=1
try_reduce_bpp=1
try_reduce_dimension=0
update_base=2003-08-25T16:30:00
update_enabled=0
update_frequency=1
update_period=daily
user=Jeffrey Mikels
user_agent=
verbosity=2
close_on_exit=1
close_on_error=1
---------------------------


The parsing works fine until the very end, when Spider.py tries to write
the .pdb file. This is the output I get at the end of the run.


Details:
---------------------------
---- all 400 pages retrieved and parsed ----
Writing out collected data...
Writing document 'Aramaic Bible Translation' to file C:\Program
Files\Plucker\channels/AramaicBibleTranslation/AramaicBibleTranslation.pdb
URL http://www.v-a.com/bible/john_15-21.html for doc <PluckerTextDocument
'http://www.v-a.com/bible/john_15-21.html' at 26208700> points to doc
<PluckerTextDocument 'http://www.v-a.com/bible/john_15-21.html' at
22400972>
URL http://www.v-a.com/bible/john_8-14.html for doc <PluckerTextDocument
'http://www.v-a.com/bible/john_8-14.html' at 26141940> points to doc
<PluckerTextDocument 'http://www.v-a.com/bible/john_8-14.html' at
25832548>
URL http://www.v-a.com/ashurai/ashur-1.GIF?width=235&height=124&depth=4
for doc <PluckerImageDocument
'http://www.v-a.com/ashurai/ashur-1.GIF?width=235&height=124&depth=4' at
29367580> points to doc <PluckerImageDocument
'http://www.v-a.com/ashurai/ashur-1.GIF?width=235&height=124&depth=4' at
28846652>
URL http://www.v-a.com/usflag.gif?width=150&height=83&depth=4 for doc
<PluckerImageDocument
'http://www.v-a.com/usflag.gif?width=150&height=83&depth=4' at 29827092>
points to doc <PluckerImageDocument
'http://www.v-a.com/usflag.gif?width=150&height=83&depth=4' at 29203548>
URL http://www.v-a.com/usflag.gif?width=150&height=83&depth=4 for doc
<PluckerImageDocument
'http://www.v-a.com/usflag.gif?width=150&height=83&depth=4' at 29203548>
points to doc <PluckerImageDocument
'http://www.v-a.com/usflag.gif?width=150&height=83&depth=4' at 29827092>
URL http://www.v-a.com/bible/john_15-21.html for doc <PluckerTextDocument
'http://www.v-a.com/bible/john_15-21.html' at 22400972> points to doc
<PluckerTextDocument 'http://www.v-a.com/bible/john_15-21.html' at
26208700>
URL http://www.v-a.com/ashurai/ashur-1.GIF?width=150&height=79&depth=4 for
doc <PluckerImageDocument
'http://www.v-a.com/ashurai/ashur-1.GIF?width=150&height=79&depth=4' at
29867052> points to doc <PluckerImageDocument
'http://www.v-a.com/ashurai/ashur-1.GIF?width=150&height=79&depth=4' at
23967300>
URL http://www.v-a.com/bible/john_8-14.html for doc <PluckerTextDocument
'http://www.v-a.com/bible/john_8-14.html' at 25832548> points to doc
<PluckerTextDocument 'http://www.v-a.com/bible/john_8-14.html' at
26141940>
URL http://www.v-a.com/usflag.gif?width=161&height=90&depth=4 for doc
<PluckerImageDocument
'http://www.v-a.com/usflag.gif?width=161&height=90&depth=4' at 28402948>
points to doc <PluckerImageDocument
'http://www.v-a.com/usflag.gif?width=161&height=90&depth=4' at 28204604>
URL http://www.v-a.com/bible/john_1-7.html for doc <PluckerTextDocument
'http://www.v-a.com/bible/john_1-7.html' at 24969060> points to doc
<PluckerTextDocument 'http://www.v-a.com/bible/john_1-7.html' at 18042372>
URL http://www.v-a.com/ashurai/ashur-1.GIF?width=235&height=124&depth=4
for doc <PluckerImageDocument
'http://www.v-a.com/ashurai/ashur-1.GIF?width=235&height=124&depth=4' at
28846652> points to doc <PluckerImageDocument
'http://www.v-a.com/ashurai/ashur-1.GIF?width=235&height=124&depth=4' at
29367580>
Traceback (most recent call last):
  File "C:\Program Files\Plucker/parser/python/PyPlucker/Spider.py", line
1641, in ?
    sys.exit(realmain(None))
  File "C:\Program Files\Plucker/parser/python/PyPlucker/Spider.py", line
1626, in realmain
    retval = main (config, exclusion_lists)
  File "C:\Program Files\Plucker/parser/python/PyPlucker/Spider.py", line
1106, in main
    mapping = writer.write (verbose=verbosity, alias_list=alias_list)
  File "C:\Program Files\Plucker/parser/python\PyPlucker\Writer.py", line
520, in write
    result = Writer.write (self, verbose, alias_list=alias_list)
  File "C:\Program Files\Plucker/parser/python\PyPlucker\Writer.py", line
310, in write
    self._mapper = Mapper(self._collection, alias_list.as_dict())
  File "C:\Program Files\Plucker/parser/python\PyPlucker\Writer.py", line
102, in __init__
    self._get_id_for_doc(doc)
  File "C:\Program Files\Plucker/parser/python\PyPlucker\Writer.py", line
112, in _get_id_for_doc
    id = self._url_to_id_mapping.get(doc.get_url())
AttributeError: 'None' object has no attribute 'get_url'
Installing channel output to destinations...
Setting new due date...
Tasks completed for all channels.
---------------------------
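
As far as I can read the traceback, the last frame calls doc.get_url() on
a value that is None, so one entry among the collected documents seems to
be None rather than a document object. Below is a minimal sketch of that
situation and of a guard that skips such entries. It is an assumption
about the cause, not Plucker's actual code: FakeDoc and assign_ids are
hypothetical names invented for illustration, and only get_url() comes
from the traceback above.

```python
# Hypothetical stand-in for a Plucker document; only get_url() is taken
# from the traceback, everything else here is invented for illustration.
class FakeDoc:
    def __init__(self, url):
        self._url = url

    def get_url(self):
        return self._url


def assign_ids(docs, first_id=2):
    """Give each distinct URL an id, skipping entries that are None."""
    url_to_id = {}
    skipped = 0
    next_id = first_id  # the starting id is arbitrary in this sketch
    for doc in docs:
        if doc is None:
            # Calling doc.get_url() on None is what raises AttributeError.
            skipped += 1
            continue
        url = doc.get_url()
        if url not in url_to_id:
            url_to_id[url] = next_id
            next_id += 1
    return url_to_id, skipped


docs = [
    FakeDoc("http://www.v-a.com/bible/john_1-7.html"),
    None,  # the kind of entry that would trigger the error
    FakeDoc("http://www.v-a.com/bible/john_8-14.html"),
]
mapping, skipped = assign_ids(docs)
print(mapping, skipped)
```

A guard like this would only hide the symptom. It does not explain why a
None entry ends up in the collection in the first place.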

Can someone please help?


Pastor Jeff Mikels
Northwest Baptist Church
www.nwbc-chicago.org
773-338-1111
