On Oct 24, 2009, at 6:17 PM, elca wrote:
Hello!
Thanks for your reply.
For example, I want to extract some text from the CNN website,
such as the 'Sponsored links' and 'Money' text.
Below is a sample of the script I want to make.
I want to add a function to my script that can extract text like
that.
Thanks in advance! :)
Unless I'm missing something, why do you need Internet Explorer at
all? You can get the HTML using urllib2:
import urllib2
response = urllib2.urlopen('http://cnn.com/')
html = response.read()
Then extract what you're looking for with BeautifulSoup:
from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(html)
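# "class" is a reserved word in Python, so pass it in an attrs dict
for content in soup.findAll('div', {'class': 'cnn_sectbincntnt2'}):
    # content is a Tag object; test against its markup as a string
    if '<a href="/money?cnn=yes"' in str(content):
        print 'MONEY!'
        print content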
You can get a lot fancier finding the money section, but that's the
gist of it. And, no IE necessary.
-Roberto.
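
A minimal sketch of the extraction step described above, for readers who want to try it without a network connection. It is not Roberto's code: it targets Python 3 and swaps BeautifulSoup for the standard library's html.parser so that it runs with no extra packages. The class name cnn_sectbincntnt2 comes from the reply above; the SAMPLE markup and the names ClassTextExtractor and extract_text are made up for illustration.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect the text inside every <div> that carries a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0      # > 0 while inside a matching <div>
        self.chunks = []    # text pieces of the current match
        self.results = []   # one string per matching <div>

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.css_class in classes:
                self.depth = 1
                self.chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.results.append(" ".join(self.chunks))

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html, css_class):
    """Return the text of each <div class=css_class> found in html."""
    parser = ClassTextExtractor(css_class)
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical markup standing in for a downloaded page.
SAMPLE = """
<div class="cnn_sectbincntnt2"><a href="/money?cnn=yes">Money</a><br>
  <span>Markets</span></div>
<div class="other">Ignored</div>
"""

print(extract_text(SAMPLE, "cnn_sectbincntnt2"))  # ['Money Markets']
```

The depth counter assumes reasonably balanced markup. For real-world pages with broken HTML, a forgiving parser such as BeautifulSoup or lxml, as suggested in the thread, is likely the better choice.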
import win32com.client
from time import sleep
from win32com.client import Dispatch
import urllib, urllib2
from BeautifulSoup import BeautifulSoup
ie = Dispatch("InternetExplorer.Application")
ie.Visible = 1
ie.Navigate("http://www.cnn.com")
sleep(15)
ie.Quit()
ccurvey wrote:
You can definitely use IE and innerHTML to get the HTML, then use
BeautifulSoup to parse it. What are you having trouble with?
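
A rough sketch of the IE-plus-innerHTML approach mentioned above, written for Python 3 and assuming Windows with pywin32 installed. Navigate, Busy, ReadyState, Document and Quit are members of the InternetExplorer.Application COM object; the function name fetch_html_with_ie, the timeout and poll parameters, and the dispatch hook (there only so the function can be exercised without IE) are assumptions made for this example. It polls for completion rather than using a fixed sleep(15).

```python
import time

READYSTATE_COMPLETE = 4  # READYSTATE value IE reports once the page is loaded


def fetch_html_with_ie(url, timeout=30.0, poll=0.5, dispatch=None):
    """Load url in Internet Explorer and return the body's innerHTML.

    dispatch is a hypothetical test hook. When it is None, the real
    win32com.client.Dispatch is imported lazily (Windows + pywin32 only).
    """
    if dispatch is None:
        from win32com.client import Dispatch as dispatch

    ie = dispatch("InternetExplorer.Application")
    try:
        ie.Visible = 0
        ie.Navigate(url)
        deadline = time.time() + timeout
        # Wait until IE reports that it is idle and the document is complete.
        while ie.Busy or ie.ReadyState != READYSTATE_COMPLETE:
            if time.time() > deadline:
                raise RuntimeError("timed out loading %s" % url)
            time.sleep(poll)
        return ie.Document.body.innerHTML
    finally:
        # Always close the browser, even if loading failed.
        ie.Quit()


# Usage on Windows (not run here):
# html = fetch_html_with_ie("http://www.cnn.com")
# The returned string can then be handed to BeautifulSoup or lxml.
```

Because the HTML is read after the page has loaded, it can include content produced by scripts, which is one reason to drive IE instead of fetching the page with urllib2.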
On Sat, Oct 24, 2009 at 8:34 PM, elca <high...@gmail.com> wrote:
Hello,
If anyone knows, please help me!
I really want to know. I have searched Google many times but
couldn't find a clear solution, partly because of my lack of Python
knowledge.
I want to use the IE.Navigate function with BeautifulSoup or lxml.
If anyone knows how to do this or has a sample, please help me!
Thanks in advance.
--
View this message in context:
http://www.nabble.com/how-to-use-win32com-with-beautifulsoup-or-lxml--tp26044332p26044332.html
Sent from the Python - python-win32 mailing list archive at Nabble.com.
_______________________________________________
python-win32 mailing list
python-win32@python.org
http://mail.python.org/mailman/listinfo/python-win32
--
The source of your stress might be a moron