On 2011.05.16 02:26 AM, Karim wrote:
Use regular expressions for bad HTML, or BeautifulSoup (google it); below
is an example that extracts all HTML links:
Actually, using regex wasn't so bad:
import re
import urllib.request
url = 'http://x264.nl/x264/?dir=./64bit/8bit_depth'
page =
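The preview cuts off here. A minimal sketch of how the regex approach might be completed in Python 3, with the URL taken from the snippet above; the `extract_links` helper, its pattern, and the sample input are my own illustration, not Andrew's actual code:

```python
import re
import urllib.request

def extract_links(html):
    """Pull href values out of anchor tags with a regex.

    Fine for simple machine-generated pages like a directory
    listing; fragile for arbitrary real-world HTML.
    """
    return re.findall(r'<a href="?(.*?)"?\s*>', html)

# Works on a literal snippet:
sample = '<p><a href="a.mkv">a</a> <a href="b.mkv">b</a></p>'
print(extract_links(sample))  # ['a.mkv', 'b.mkv']

# The same idea applied to the URL from the thread
# (network access required, so left commented out):
# url = 'http://x264.nl/x264/?dir=./64bit/8bit_depth'
# with urllib.request.urlopen(url) as resp:
#     page = resp.read().decode('utf-8', errors='replace')
#     for link in extract_links(page):
#         print(link)
```

Note that `urlopen` returns bytes in Python 3, so the response must be decoded before the regex is applied.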
Andrew Berg wrote:
ElementTree doesn't seem to have been updated in a long time, so I'll
assume it won't work with Python 3.
I don't know how to use it, but you'll find ElementTree as xml.etree in
Python 3.
~Ethan~
--
http://mail.python.org/mailman/listinfo/python-list
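The rename Ethan mentions is easy to check with a short snippet; the XHTML fragment here is my own sample, and note that `xml.etree` only accepts well-formed markup, so unlike lxml or BeautifulSoup it will choke on the tag soup this thread is about:

```python
import xml.etree.ElementTree as ET

# ElementTree shipped with Python 3 as xml.etree; it parses
# well-formed XML/XHTML only, not broken real-world HTML.
doc = ET.fromstring(
    '<html><body>'
    '<a href="a.mkv">first</a>'
    '<a href="b.mkv">second</a>'
    '</body></html>'
)

# iter('a') walks the whole tree and yields every anchor element.
hrefs = [a.get('href') for a in doc.iter('a')]
print(hrefs)  # ['a.mkv', 'b.mkv']
```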
Andrew Berg, 17.05.2011 03:05:
lxml looks promising, but it doesn't say anywhere whether it'll work on
Python 3 or not.
Well, it pretty clearly states that on the PyPI page, but I also added it
to the project home page now. lxml 2.3 works with any CPython version from
2.3 to 3.2.
Stefan
On 2011.05.18 03:30 AM, Stefan Behnel wrote:
Well, it pretty clearly states that on the PyPI page, but I also added it
to the project home page now. lxml 2.3 works with any CPython version from
2.3 to 3.2.
Thank you. I never would've looked at PyPI for info on a project that
has its own site.
On 05/16/2011 03:06 AM, David Robinow wrote:
On Sun, May 15, 2011 at 4:45 PM, Andrew Berg bahamutzero8...@gmail.com wrote:
I'm trying to understand why HTMLParser.feed() isn't returning the whole
page. My test script is this:
import urllib.request
import html.parser
class
On 2011.05.16 02:26 AM, Karim wrote:
Use regular expressions for bad HTML, or BeautifulSoup (google it); below
is an example that extracts all HTML links:
linksList = re.findall(r'<a href=(.*?)>.*?</a>', htmlSource)
for link in linksList:
    print(link)
I was afraid I might have to use regexes (mostly
I'm trying to understand why HTMLParser.feed() isn't returning the whole
page. My test script is this:
import urllib.request
import html.parser
class MyHTMLParser(html.parser.HTMLParser):
    def handle_starttag(self, tag, attrs):
        if tag == 'a' and attrs:
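The test script is truncated at this point in the archive. A hedged sketch of how such a subclass might be completed; the `LinkParser` name, the list attribute, and the sample inputs are my own, not Andrew's actual script:

```python
import html.parser

class LinkParser(html.parser.HTMLParser):
    """Collect href attributes from <a> start tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == 'a' and attrs:
            for name, value in attrs:
                if name == 'href':
                    self.links.append(value)

parser = LinkParser()
# feed() can be called repeatedly with chunks of input; the parser
# keeps state across calls, so feeding a page in pieces loses nothing.
parser.feed('<a href="a.mkv">a</a>')
parser.feed('<a href="b.mkv">b</a>')
print(parser.links)  # ['a.mkv', 'b.mkv']
```

Since feed() is incremental and returns None, the results have to be accumulated on the parser object itself, as with the `links` list here; that may be the source of the "isn't returning the whole page" confusion.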