On Sat, Aug 22, 2015 at 5:05 PM, Anthony Papillion <papill...@gmail.com> wrote:
> Hello Everyone,
>
> I'm pretty new to lxml but I pretty much thought I'd understood the basics.
> However, for some reason, my first attempt at using it is failing miserably.
>
> Here's the deal:
>
> I'm parsing a specific page on Craigslist
> (http://joplin.craigslist.org/search/rea) and trying to retrieve the text of
> each link on that page. When I do an "inspect element" in Firefox, a sample
> anchor link looks like this:
>
> <a href="/reb/5185592209.html" data-id="5185592209" class="hdrlnk">FIRST
> OPEN HOUSE TOMORROW 2:00pm-4:00pm!!! (8-23-15)</a>
>
> The code I'm using to try to get the link text is this:
>
> from lxml import html
> import requests
>
> page = requests.get("http://joplin.craigslist.org/search/rea")
> titles = tree.xpath('//a[@title="hdrlnk"]/text()')
> print titles
>
> The last line, which is supposed to print the text of each anchor,
> returns [].
>
> I can't seem to figure out what I'm doing wrong. lxml seems pretty
> straightforward but I can't seem to get this down.
>
> Can anyone make any suggestions?
>
> Thanks!
> Anthony

Not an answer, but have you checked out Beautiful Soup? It is a great HTML
parsing tool, with a good tutorial:
http://www.crummy.com/software/BeautifulSoup/bs4/doc/

--
Joel Goldstick
http://joelgoldstick.com
_______________________________________________
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
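
For anyone following the thread, here is a rough sketch of what the Beautiful Soup suggestion could look like, next to an lxml version for comparison. Two things stand out in the quoted snippet: `tree` is never created from the downloaded page, and the XPath filters on `@title` although the sample anchor carries `class="hdrlnk"`. The sketch below assumes Python 3 and parses a small hard-coded sample instead of the live Craigslist page. The `SAMPLE` string, the second "About" link, and the two function names are made up for illustration; only the first anchor is taken from the question.

```python
from bs4 import BeautifulSoup
from lxml import html

# First anchor copied from the question; the second link is a hypothetical
# extra, added only to show that non-matching anchors are skipped.
SAMPLE = (
    '<html><body>'
    '<a href="/reb/5185592209.html" data-id="5185592209" class="hdrlnk">'
    'FIRST OPEN HOUSE TOMORROW 2:00pm-4:00pm!!! (8-23-15)</a>'
    '<a href="/about" class="other">About</a>'
    '</body></html>'
)


def titles_with_bs4(markup):
    """Return the text of every <a class="hdrlnk"> using Beautiful Soup."""
    soup = BeautifulSoup(markup, "html.parser")
    return [a.get_text(strip=True) for a in soup.find_all("a", class_="hdrlnk")]


def titles_with_lxml(markup):
    """Return the same text using lxml and an XPath on the class attribute."""
    # Build the element tree first; the quoted snippet skipped this step.
    tree = html.fromstring(markup)
    # Match on @class (not @title), since that is where "hdrlnk" appears.
    return tree.xpath('//a[@class="hdrlnk"]/text()')


if __name__ == "__main__":
    print(titles_with_bs4(SAMPLE))
    print(titles_with_lxml(SAMPLE))
```

To try this against the real page, something like `titles_with_lxml(requests.get(url).content)` should work in principle, though the sketch cannot confirm that the page still serves the same markup. Note also that `@class="hdrlnk"` only matches when the attribute is exactly that string; an anchor with several classes would need `contains(...)` or the Beautiful Soup version, which matches individual class names.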