On Oct 24, 2009, at 6:17 PM, elca wrote:
Hello!
Thanks for your reply.
For example, I want to extract some text from the CNN website,
such as the 'Sponsored links' and 'Money' text.
Below is a sample of the script I want to write.
I want to add a function to my script that can extract text like
that.
Thanks in advance! :)

Unless I'm missing something, why do you need Internet Explorer at all? You can get the HTML using urllib2:

import urllib2
response = urllib2.urlopen('http://cnn.com/')
html = response.read()

Then extract what you're looking for with BeautifulSoup:

from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(html)

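# 'class' is a reserved word in Python, so pass the attribute as a dict
for content in soup.findAll('div', {'class': 'cnn_sectbincntnt2'}):
    # search the tag's markup as a string, not the Tag object itself
    if '<a href="/money?cnn=yes"' in str(content):
        print 'MONEY!'
        print content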

You can get a lot fancier finding the money section, but that's the gist of it. And, no IE necessary.

-Roberto.
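
The "find the div with a given class, then look at its links" idea in Roberto's reply can be tried without a network connection or any third-party package. What follows is only a rough sketch for Python 3 that swaps in the standard library's html.parser in place of BeautifulSoup. The class name comes from the message above; SAMPLE_HTML, SectionLinkFinder and find_section_links are made-up names for illustration, and the real CNN markup will differ.

```python
from html.parser import HTMLParser


class SectionLinkFinder(HTMLParser):
    """Collect links found inside <div> elements that carry a given class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0        # <div> nesting depth inside a matching <div>; 0 = outside
        self.in_link = False  # True while we are between <a> and </a> in a match
        self.links = []       # [href, text] pairs, filled in as data arrives

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            classes = (attrs.get("class") or "").split()
            if self.depth or self.target_class in classes:
                self.depth += 1
        elif tag == "a" and self.depth:
            self.in_link = True
            self.links.append([attrs.get("href"), ""])

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.links[-1][1] += data


def find_section_links(html, target_class):
    """Return (href, text) tuples for links inside divs of target_class."""
    parser = SectionLinkFinder(target_class)
    parser.feed(html)
    return [(href, text.strip()) for href, text in parser.links]


# Hypothetical markup, standing in for the page source.
SAMPLE_HTML = """
<div class="cnn_sectbincntnt2"><a href="/money?cnn=yes">Money</a></div>
<div class="other"><a href="/sports">Sports</a></div>
"""

links = find_section_links(SAMPLE_HTML, "cnn_sectbincntnt2")
for href, text in links:
    print(href, text)
```

Matching on parsed tags and attributes, rather than on a raw substring of the markup, should be less sensitive to attribute order and whitespace in the page source.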

import win32com.client
from time import sleep
from win32com.client import Dispatch
import urllib, urllib2
from BeautifulSoup import BeautifulSoup

ie = Dispatch("InternetExplorer.Application")
ie.Visible = 1
ie.Navigate("http://www.cnn.com")
sleep(15)  # crude fixed wait for the page to finish loading
ie.Quit()


ccurvey wrote:

You can definitely use IE and innerHTML to get the HTML, then use
BeautifulSoup to parse the HTML.  What are you having trouble with?
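
As a rough sketch of that suggestion (Python 3 with pywin32, Windows only): drive IE through COM, wait until it reports the page as loaded, then read innerHTML from the document body. The helper names wait_until_loaded and fetch_html_with_ie and the timeout values are assumptions made for illustration. Busy, ReadyState, Navigate and Quit are properties and methods of the InternetExplorer.Application COM object.

```python
import time

READYSTATE_COMPLETE = 4  # value IE reports once the document has finished loading


def wait_until_loaded(browser, timeout=30.0, poll=0.25):
    """Poll Busy/ReadyState until the page has loaded; return False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not browser.Busy and browser.ReadyState == READYSTATE_COMPLETE:
            return True
        time.sleep(poll)
    return False


def fetch_html_with_ie(url, timeout=30.0):
    """Load url in Internet Explorer and return the body's HTML (Windows only)."""
    # Imported here so the helper above can be used and tested without pywin32.
    from win32com.client import Dispatch

    ie = Dispatch("InternetExplorer.Application")
    ie.Visible = 0  # no need to show the browser window
    try:
        ie.Navigate(url)
        if not wait_until_loaded(ie, timeout):
            raise RuntimeError("page did not finish loading: %s" % url)
        return ie.Document.body.innerHTML
    finally:
        ie.Quit()  # always close the browser, even on errors
```

The string returned by fetch_html_with_ie("http://www.cnn.com") could then be handed to BeautifulSoup in the same way as HTML fetched by any other means. Polling the ready state instead of sleeping for a fixed time avoids both waiting longer than necessary and reading a page that has not finished loading.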



On Sat, Oct 24, 2009 at 8:34 PM, elca <high...@gmail.com> wrote:


Hello...
If anyone knows, please help me!
I really want to know. I have searched Google many times
but can't find a clear solution, partly because of my lack of Python
knowledge.
I want to use the IE Navigate function with BeautifulSoup or lxml.
If anyone knows about this or has a sample,
please help me!
Thanks in advance
--
View this message in context:
http://www.nabble.com/how-to-use-win32com-with-beautifulsoup-or-lxml--tp26044332p26044332.html
Sent from the Python - python-win32 mailing list archive at Nabble.com .

_______________________________________________
python-win32 mailing list
python-win32@python.org
http://mail.python.org/mailman/listinfo/python-win32




--
The source of your stress might be a moron




