Jetus wrote:
I am able to download this page (enclosed code), but I then want to
download a pdf file that I can view in a regular browser by clicking
on the "view" link. I don't know how to automate this next part of my
script. It seems like it uses Javascript.
The line in the page source says
href="javascript:openimagewin('JCCOGetImage.jsp?refnum=DN2007036179');" tabindex=-1>

So, in summary, when I download this page, for each record, I would
like to initiate the "view" link.
Can anyone point me in the right direction?

When the "view" link is clicked on in IE or Firefox, it returns a pdf
file, so I should be able to download it with
urllib.urlretrieve('pdffile', r'c:\temp\pdffile')

Here is the code I have been using:
----------------------------------------------------------------
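    import urllib, urllib2

    # Form fields submitted by the search page.
    params = [
        ('booktype', 'L'),
        ('book', '930'),
        ('page', ''),
        ('hidPageName', 'S3Search'),
        ('DoItButton', 'Search'),
    ]

    data = urllib.urlencode(params)

    # Passing data makes urlopen issue a POST request.
    f = urllib2.urlopen(
        "http://www.landrecords.jcc.ky.gov/records/S3DataLKUP.jsp", data)

    s = f.read()
    f.close()

    # Save the result page for inspection.
    out = open('jcolib.html', 'w')
    out.write(s)
    out.close()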

Use something like the Firebug extension to see what request the openimagewin function ultimately issues. Then issue that request yourself, filling in its parameters from values parsed out of the href above.

There is no way to interpret the JavaScript in Python, let alone mimic possible browser DOM behavior.
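
A minimal sketch of that approach, written to run on Python 2 or 3. It ASSUMES that openimagewin() does nothing more than open its argument as a URL relative to the search page; that is not confirmed anywhere in this thread and should be checked with Firebug first. The names LINK_RE, image_urls and download_all are hypothetical helpers made up for illustration, and the ".pdf" file naming is likewise an assumption.

```python
import os
import re

try:
    from urllib.parse import urljoin          # Python 3
    from urllib.request import urlretrieve
except ImportError:
    from urlparse import urljoin              # Python 2
    from urllib import urlretrieve

# URL of the search page, taken from the script in the question.
BASE_URL = "http://www.landrecords.jcc.ky.gov/records/S3DataLKUP.jsp"

# Captures the single-quoted argument of javascript:openimagewin('...').
LINK_RE = re.compile(r"javascript:openimagewin\('([^']+)'\)")


def image_urls(html, base_url=BASE_URL):
    """Return an absolute URL for every openimagewin(...) link in html.

    ASSUMPTION: the argument is a URL relative to the current page.
    """
    urls = []
    for relative in LINK_RE.findall(html):
        # Remove whitespace that line wrapping may have put in the href.
        relative = "".join(relative.split())
        urls.append(urljoin(base_url, relative))
    return urls


def download_all(html, target_dir="."):
    """Fetch each linked document into target_dir, named by its refnum."""
    for url in image_urls(html):
        refnum = url.split("refnum=")[-1]
        urlretrieve(url, os.path.join(target_dir, refnum + ".pdf"))
```

With the page saved by the script above, calling download_all(s) would then attempt one download per record. If the server ties the image request to a session, the cookies from the search request may have to be sent along as well; that has not been verified here.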

Diez
--
http://mail.python.org/mailman/listinfo/python-list
