Re: [Python-Dev] teaching the new urllib
Brett Cannon wrote:
> On Tue, Feb 3, 2009 at 15:50, Tres Seaver wrote:
> > Brett Cannon wrote:
> >> On Tue, Feb 3, 2009 at 11:08, Brad Miller wrote:
> >>> I'm just getting ready to start the semester using my new book (Python
> >>> Programming in Context) and noticed that I somehow missed all the changes
> >>> to urllib in python 3.0. ARGH to say the least. I like using urllib in the
> >>> intro class because we can get data from places that are more
> >>> interesting/motivating/relevant to the students.
> >>> Here are some of my observations on trying to do very basic stuff with
> >>> urllib:
> >>> 1. urllib.urlopen is now urllib.request.urlopen
> >>
> >> Technically urllib2.urlopen became urllib.request.urlopen. See PEP
> >> 3108 for the details of the reorganization.
> >>
> >>> 2. The object returned by urlopen is no longer iterable! no more for
> >>> line in url.
> >>
> >> That is probably a difference between urllib2 and urllib.
> >>
> >>> 3. read, readline, readlines now return bytes objects or arrays of bytes
> >>> instead of a str and array of str
> >>
> >> Correct.
> >>
> >>> 4. Taking the naive approach to converting a bytes object to a str does
> >>> not work as you would expect.
> >>>
> >>> >>> import urllib.request
> >>> >>> page = urllib.request.urlopen('http://knuth.luther.edu/test.html')
> >>> >>> page
> >>> <...>
> >>> >>> line = page.readline()
> >>> >>> line
> >>> b'...'
> >>> >>> str(line)
> >>> 'b\'...'
> >>>
> >>> As you can see from the example the 'b' becomes part of the string! It
> >>> seems like this should be a bug, is it?
> >>
> >> No because you are getting back the repr for the bytes object. Str
> >> does not know what the encoding is for the bytes so it has no way of
> >> performing the decoding.
> >
> > The encoding information *is* available in the response headers, e.g.:
> >
> > -- %< --
> > $ wget -S --spider http://knuth.luther.edu/test.html
> > --18:46:24--  http://knuth.luther.edu/test.html
> >            => `test.html'
> > Resolving knuth.luther.edu... 192.203.196.71
> > Connecting to knuth.luther.edu|192.203.196.71|:80... connected.
> > HTTP request sent, awaiting response...
> >   HTTP/1.1 200 OK
> >   Date: Tue, 03 Feb 2009 23:46:28 GMT
> >   Server: Apache/2.0.50 (Linux/SUSE)
> >   Last-Modified: Mon, 17 Sep 2007 23:35:49 GMT
> >   ETag: "2fcd8-1d8-43b2bf40"
> >   Accept-Ranges: bytes
> >   Content-Length: 472
> >   Keep-Alive: timeout=15, max=100
> >   Connection: Keep-Alive
> >   Content-Type: text/html; charset=ISO-8859-1
> > Length: 472 [text/html]
> > 200 OK
> > -- %< --
>
> Right, but he was asking about why passing bytes to str() led to it
> returning the repr.
>
> > So, the OP's use case *could* be satisfied, assuming that the Py3K
> > version of urllib sprouted a means of leveraging that header. In this
> > sense, fetching the resource over HTTP is *better* than loading it from
> > a file: information about the character set is explicit, and highly
> > likely to be correct, at least for any resource people expect to render
> > cleanly in a browser.
>
> Right. And even if the header lacks the info as Content-Type is not
> guaranteed to contain the charset there is also the chance for the
> HTML or DOCTYPE declaration to say.
>
> But as Bill pointed out, urllib just fetches data via HTTP, so a
> character encoding will not always be valuable. Best solution would be
> to provide something in html that can take what urllib.request.urlopen
> returns and handle the decoding.

Yes, that sounds like the right solution to me, too.

Bill
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Issue 4285 Review
On Tue, Feb 3, 2009 at 1:20 PM, Aahz wrote:
> When sending in a request like this, it's useful to summarize the issue;
> few people know bug reports by number, and at least some people who might
> be interested in looking probably won't bother if they have no clue
> whether it's in their area of expertise.

You're right. I'm sorry. I completely forgot. And thank you, Eric,
for reviewing and adding the subject for me.

Cheers,
Ross Light
Re: [Python-Dev] C API for appending to arrays
On 2-Feb-09, at 9:21 AM, Hrvoje Niksic wrote:
> It turns out that an even faster method of creating an array is by
> using the fromstring() method. fromstring() requires an actual string,
> not a buffer, so in C++ I created an std::vector with a contiguous
> array of doubles, passed that array to PyString_FromStringAndSize, and
> called array.fromstring with the resulting string. Despite all the
> unnecessary copying, the result was much faster than either of the
> previous versions.
>
> Would it be possible for the array module to define a C interface for
> the most frequent operations on array objects, such as appending an
> item, and getting/setting an item? Failing that, could we at least
> make fromstring() accept an arbitrary read buffer, not just an actual
> string?

Do you need to append, or are you just looking to create/manipulate an
array with a bunch of c-float values?

I find As{Write/Read}Buffer sufficient for most of these tasks. I've
included some example pyrex code that populates a new array.array at c
speed. (Note that you can get the size of the resulting c array more
easily than you are by using PyObject_Length.) Of course, this still
leaves appending to an already-created array difficult.

    def calcW0(W1, colTotal):
        """ Calculate a W0 array from a W1 array.

        @param W1: array.array of doubles
        @param colTotal: value to which each column should sum
        @return W0 = [colTotal] * NA - W1
        """
        cdef int NA
        NA = len(W1)
        W0 = array('d', [colTotal]) * NA

        cdef double *cW1, *cW0
        cdef int i
        cdef Py_ssize_t dummy

        PyObject_AsReadBuffer(W1, &cW1, &dummy)
        PyObject_AsWriteBuffer(W0, &cW0, &dummy)

        for i from 0 <= i < NA:
            cW0[i] = cW0[i] - cW1[i]

        return W0

regards,
-Mike
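On the request that fromstring() accept an arbitrary read buffer: a minimal sketch, assuming a current Python 3 where the method is spelled array.frombytes(). The packed `raw` buffer is made-up data standing in for the contiguous C++ vector of doubles described above.

```python
import struct
from array import array

# Hypothetical stand-in for a contiguous C buffer of three doubles.
raw = struct.pack('3d', 1.0, 2.0, 3.0)

values = array('d')
values.frombytes(raw)                    # append from a bytes object
values.frombytes(memoryview(raw)[:8])    # append from another buffer exporter

print(values.tolist())   # [1.0, 2.0, 3.0, 1.0]
```

This only covers bulk appends from Python code; it does not provide the per-item C interface that the original question asks about.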
Re: [Python-Dev] teaching the new urllib
On Tue, Feb 3, 2009 at 15:50, Tres Seaver wrote:
> Brett Cannon wrote:
>> On Tue, Feb 3, 2009 at 11:08, Brad Miller wrote:
>>> I'm just getting ready to start the semester using my new book (Python
>>> Programming in Context) and noticed that I somehow missed all the changes to
>>> urllib in python 3.0. ARGH to say the least. I like using urllib in the
>>> intro class because we can get data from places that are more
>>> interesting/motivating/relevant to the students.
>>> Here are some of my observations on trying to do very basic stuff with
>>> urllib:
>>> 1. urllib.urlopen is now urllib.request.urlopen
>>
>> Technically urllib2.urlopen became urllib.request.urlopen. See PEP
>> 3108 for the details of the reorganization.
>>
>>> 2. The object returned by urlopen is no longer iterable! no more for line
>>> in url.
>>
>> That is probably a difference between urllib2 and urllib.
>>
>>> 3. read, readline, readlines now return bytes objects or arrays of bytes
>>> instead of a str and array of str
>>
>> Correct.
>>
>>> 4. Taking the naive approach to converting a bytes object to a str does not
>>> work as you would expect.
>>>
>>> >>> import urllib.request
>>> >>> page = urllib.request.urlopen('http://knuth.luther.edu/test.html')
>>> >>> page
>>> <...>
>>> >>> line = page.readline()
>>> >>> line
>>> b'...'
>>> >>> str(line)
>>> 'b\'...'
>>>
>>> As you can see from the example the 'b' becomes part of the string! It
>>> seems like this should be a bug, is it?
>>
>> No because you are getting back the repr for the bytes object. Str
>> does not know what the encoding is for the bytes so it has no way of
>> performing the decoding.
>
> The encoding information *is* available in the response headers, e.g.:
>
> -- %< --
> $ wget -S --spider http://knuth.luther.edu/test.html
> --18:46:24--  http://knuth.luther.edu/test.html
>            => `test.html'
> Resolving knuth.luther.edu... 192.203.196.71
> Connecting to knuth.luther.edu|192.203.196.71|:80... connected.
> HTTP request sent, awaiting response...
>   HTTP/1.1 200 OK
>   Date: Tue, 03 Feb 2009 23:46:28 GMT
>   Server: Apache/2.0.50 (Linux/SUSE)
>   Last-Modified: Mon, 17 Sep 2007 23:35:49 GMT
>   ETag: "2fcd8-1d8-43b2bf40"
>   Accept-Ranges: bytes
>   Content-Length: 472
>   Keep-Alive: timeout=15, max=100
>   Connection: Keep-Alive
>   Content-Type: text/html; charset=ISO-8859-1
> Length: 472 [text/html]
> 200 OK
> -- %< --

Right, but he was asking about why passing bytes to str() led to it
returning the repr.

> So, the OP's use case *could* be satisfied, assuming that the Py3K
> version of urllib sprouted a means of leveraging that header. In this
> sense, fetching the resource over HTTP is *better* than loading it from
> a file: information about the character set is explicit, and highly
> likely to be correct, at least for any resource people expect to render
> cleanly in a browser.

Right. And even if the header lacks the info as Content-Type is not
guaranteed to contain the charset there is also the chance for the
HTML or DOCTYPE declaration to say.

But as Bill pointed out, urllib just fetches data via HTTP, so a
character encoding will not always be valuable. Best solution would be
to provide something in html that can take what urllib.request.urlopen
returns and handle the decoding.

-Brett
Re: [Python-Dev] teaching the new urllib
Tres Seaver wrote:
> Brett Cannon wrote:
>> No because you are getting back the repr for the bytes object. Str
>> does not know what the encoding is for the bytes so it has no way of
>> performing the decoding.
>
> The encoding information *is* available in the response headers, e.g.:
[snip]

That's the target of http://bugs.python.org/issue4733 cited by
Benjamin: 'Add a "decode to declared encoding" version of urlopen to
urllib'. I think it's an important use case, but the current patch is
pretty awful. Improvements/feedback welcome :)

Daniel
Re: [Python-Dev] teaching the new urllib
Brett Cannon wrote:
> On Tue, Feb 3, 2009 at 11:08, Brad Miller wrote:
>> I'm just getting ready to start the semester using my new book (Python
>> Programming in Context) and noticed that I somehow missed all the changes to
>> urllib in python 3.0. ARGH to say the least. I like using urllib in the
>> intro class because we can get data from places that are more
>> interesting/motivating/relevant to the students.
>> Here are some of my observations on trying to do very basic stuff with
>> urllib:
>> 1. urllib.urlopen is now urllib.request.urlopen
>
> Technically urllib2.urlopen became urllib.request.urlopen. See PEP
> 3108 for the details of the reorganization.
>
>> 2. The object returned by urlopen is no longer iterable! no more for line
>> in url.
>
> That is probably a difference between urllib2 and urllib.
>
>> 3. read, readline, readlines now return bytes objects or arrays of bytes
>> instead of a str and array of str
>
> Correct.
>
>> 4. Taking the naive approach to converting a bytes object to a str does not
>> work as you would expect.
>>
>> >>> import urllib.request
>> >>> page = urllib.request.urlopen('http://knuth.luther.edu/test.html')
>> >>> page
>> <...>
>> >>> line = page.readline()
>> >>> line
>> b'...'
>> >>> str(line)
>> 'b\'...'
>>
>> As you can see from the example the 'b' becomes part of the string! It
>> seems like this should be a bug, is it?
>
> No because you are getting back the repr for the bytes object. Str
> does not know what the encoding is for the bytes so it has no way of
> performing the decoding.

The encoding information *is* available in the response headers, e.g.:

-- %< --
$ wget -S --spider http://knuth.luther.edu/test.html
--18:46:24--  http://knuth.luther.edu/test.html
           => `test.html'
Resolving knuth.luther.edu... 192.203.196.71
Connecting to knuth.luther.edu|192.203.196.71|:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Date: Tue, 03 Feb 2009 23:46:28 GMT
  Server: Apache/2.0.50 (Linux/SUSE)
  Last-Modified: Mon, 17 Sep 2007 23:35:49 GMT
  ETag: "2fcd8-1d8-43b2bf40"
  Accept-Ranges: bytes
  Content-Length: 472
  Keep-Alive: timeout=15, max=100
  Connection: Keep-Alive
  Content-Type: text/html; charset=ISO-8859-1
Length: 472 [text/html]
200 OK
-- %< --

So, the OP's use case *could* be satisfied, assuming that the Py3K
version of urllib sprouted a means of leveraging that header. In this
sense, fetching the resource over HTTP is *better* than loading it from
a file: information about the character set is explicit, and highly
likely to be correct, at least for any resource people expect to render
cleanly in a browser.

Tres.
--
Tres Seaver +1 540-429-0999 tsea...@palladion.com
Palladion Software "Excellence by Design" http://palladion.com
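The header-driven decoding described above can be sketched as follows. This is only an illustration, not the patch under discussion: `decode_body` and its `default` fallback are hypothetical names, and the Content-Type value is parsed with the standard email.message.Message class.

```python
from email.message import Message

def decode_body(raw, content_type, default='utf-8'):
    """Decode raw bytes using the charset named in a Content-Type value.

    `default` is an assumed fallback for responses that declare no charset.
    """
    msg = Message()
    msg['Content-Type'] = content_type
    charset = msg.get_content_charset() or default
    return raw.decode(charset)

# 0xE9 is "e with acute accent" in ISO-8859-1.
text = decode_body(b'caf\xe9', 'text/html; charset=ISO-8859-1')
print(text)
```

With a real response in a current Python 3, the same lookup should be available as `page.headers.get_content_charset()`, since the headers object derives from the same Message class. A response that is not text at all (an image, a PDF) should not be decoded this way.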
Re: [Python-Dev] Issue 4285 Review
Aahz wrote:
> On Tue, Feb 03, 2009, Ross Light wrote:
>> Hello, python-dev. I submitted a patch a couple weeks ago for Issue
>> 4285, and it has been reviewed and revised. Would someone please
>> review/commit it? Thank you.
>>
>> http://bugs.python.org/issue4285
>
> When sending in a request like this, it's useful to summarize the issue;
> few people know bug reports by number, and at least some people who might
> be interested in looking probably won't bother if they have no clue
> whether it's in their area of expertise.

I'll review it with the intention of committing it. The subject is "Use
a named tuple for sys.version_info".

Eric.
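For readers who do not know the issue by number, here is a small sketch of what a named-tuple sys.version_info allows. It assumes a Python release where the change has landed, so that the fields major, minor, micro, releaselevel and serial are available by name.

```python
import sys

# Index-based access, which also works with a plain tuple.
major_by_index = sys.version_info[0]

# Attribute access, available once version_info has named fields.
major_by_name = sys.version_info.major

print(major_by_index == major_by_name)

# Tuple comparison keeps working unchanged.
print(sys.version_info >= (3, 0))
```

Existing code that unpacks or compares the tuple is unaffected; the names only add a more readable spelling.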
Re: [Python-Dev] Issue 4285 Review
On Tue, Feb 03, 2009, Ross Light wrote:
>
> Hello, python-dev. I submitted a patch a couple weeks ago for Issue
> 4285, and it has been reviewed and revised. Would someone please
> review/commit it? Thank you.
>
> http://bugs.python.org/issue4285

When sending in a request like this, it's useful to summarize the issue;
few people know bug reports by number, and at least some people who might
be interested in looking probably won't bother if they have no clue
whether it's in their area of expertise.
--
Aahz (a...@pythoncraft.com) <*> http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers wrote
programs, then the first woodpecker that came along would destroy civilization.
Re: [Python-Dev] Partial function application 'from the right'
> I still haven't seen any real code presented that would benefit from
> partial.skip or partial_right.

    # some Articles have timestamp attributes and some don't
    stamp = partial_right(getattr, 'timestamp', 0)
    lastupdate = max(map(stamp, articles))

    # some beautiful soup nodes have a name attribute and some don't
    name = partial_right(getattr, 'name', '')
    alltags = set(map(name, soup))

> The arguments for and against the patch could be brought against partial()
> itself, so I don't understand the -1's at all.

> Quite so, but that doesn't justify adding more capabilities to partial().

I concur with Collin. Lever arguments are a road to bloat.
"In for a penny, in for a pound" is not a language design principle.

One of the real problems with partial() and its variants is that they
provide almost no advantage over an equivalent lambda. IMO, lambda has
an advantage over partial.skip() because the lambda is easier to read:

    modcubes = lambda base, mod: pow(base, 3, mod)

Raymond
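To make the comparison concrete, here is a minimal sketch that runs the timestamp example both ways. `partial_right` is a hypothetical helper written for this illustration (it is not in functools), and the `Article` class and its sample data are invented.

```python
def partial_right(func, *fixed):
    """Hypothetical helper: bind `fixed` as the trailing positional arguments."""
    def wrapper(*args):
        return func(*args, *fixed)
    return wrapper

class Article:
    """Invented sample type: only some instances carry a timestamp."""
    def __init__(self, timestamp=None):
        if timestamp is not None:
            self.timestamp = timestamp

articles = [Article(5), Article(), Article(9)]

# Right-partial spelling, as in the example above.
stamp_partial = partial_right(getattr, 'timestamp', 0)

# Equivalent lambda spelling.
stamp_lambda = lambda article: getattr(article, 'timestamp', 0)

print(max(map(stamp_partial, articles)))   # 9
print(max(map(stamp_lambda, articles)))    # 9
```

Both spellings give the same result here, so the choice between them comes down to readability and call overhead rather than capability.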
[Python-Dev] Issue 4285 Review
Hello, python-dev. I submitted a patch a couple weeks ago for Issue
4285, and it has been reviewed and revised. Would someone please
review/commit it? Thank you.

http://bugs.python.org/issue4285

Cheers,
Ross Light
Re: [Python-Dev] Partial function application 'from the right'
On Tue, Feb 3, 2009 at 11:53 AM, Antoine Pitrou wrote:
> Collin Winter writes:
>>
>> Have any of the original objections to Calvin's patch
>> (http://bugs.python.org/issue1706256) been addressed? If not, I don't
>> see anything in these threads that justifies resurrecting it.
>>
>> I still haven't seen any real code presented that would benefit from
>> partial.skip or partial_right.
>
> The arguments for and against the patch could be brought against partial()
> itself, so I don't understand the -1's at all.

Quite so, but that doesn't justify adding more capabilities to partial().

Collin Winter
Re: [Python-Dev] teaching the new urllib
İsmail Dönmez wrote:
> Hi,
>
> On Tue, Feb 3, 2009 at 21:56, Brett Cannon wrote:
> > Probably the biggest issue will be having to explain string encoding.
> > Obviously you can gloss over it or provide students with a simple
> > library that just automatically converts the strings. Or even better,
> > provide some code for the standard library that can take the HTML,
> > figure out the encoding, and then return the decoded strings (might
> > actually already be something for that that I am not aware of).
>
> http://chardet.feedparser.org/ should work fine for most auto-encoding
> detection needs.

Remember that the return value from urlopen() need not be HTML or XML.
It could be, say, an image or PDF or Word, or pretty much anything.

Bill
Re: [Python-Dev] teaching the new urllib
On Tue, Feb 3, 2009 at 2:08 PM, Brad Miller wrote:
> Here's the iteration problem:
>
> >>> for line in page:
>         print(line)
> Traceback (most recent call last):
>   File "<...>", line 1, in <module>
>     for line in page:
> TypeError: 'addinfourl' object is not iterable
>
> Why is this not iterable anymore? Is this too a bug? What the heck is an
> addinfourl object?

See http://bugs.python.org/issue4608.

> 5. Finally, I see that a bytes object has some of the same methods as
> strings. But the error messages are confusing.
>
> >>> line
> b'   "http://www.w3.org/TR/html4/loose.dtd">\n'
> >>> line.find('www')
> Traceback (most recent call last):
>   File "<...>", line 1, in <module>
>     line.find('www')
> TypeError: expected an object with the buffer interface
> >>> line.find(b'www')
> 11
>
> Why couldn't find take string as a parameter?

See http://bugs.python.org/issue4733

--
Regards,
Benjamin
Re: [Python-Dev] teaching the new urllib
Hi,

On Tue, Feb 3, 2009 at 21:56, Brett Cannon wrote:
> Probably the biggest issue will be having to explain string encoding.
> Obviously you can gloss over it or provide students with a simple
> library that just automatically converts the strings. Or even better,
> provide some code for the standard library that can take the HTML,
> figure out the encoding, and then return the decoded strings (might
> actually already be something for that that I am not aware of).

http://chardet.feedparser.org/ should work fine for most auto-encoding
detection needs.

Regards,
ismail
Re: [Python-Dev] teaching the new urllib
On Tue, Feb 3, 2009 at 11:08, Brad Miller wrote:
> I'm just getting ready to start the semester using my new book (Python
> Programming in Context) and noticed that I somehow missed all the changes to
> urllib in python 3.0. ARGH to say the least. I like using urllib in the
> intro class because we can get data from places that are more
> interesting/motivating/relevant to the students.
> Here are some of my observations on trying to do very basic stuff with
> urllib:
> 1. urllib.urlopen is now urllib.request.urlopen

Technically urllib2.urlopen became urllib.request.urlopen. See PEP
3108 for the details of the reorganization.

> 2. The object returned by urlopen is no longer iterable! no more for line
> in url.

That is probably a difference between urllib2 and urllib.

> 3. read, readline, readlines now return bytes objects or arrays of bytes
> instead of a str and array of str

Correct.

> 4. Taking the naive approach to converting a bytes object to a str does not
> work as you would expect.
>
> >>> import urllib.request
> >>> page = urllib.request.urlopen('http://knuth.luther.edu/test.html')
> >>> page
> <...>
> >>> line = page.readline()
> >>> line
> b'...'
> >>> str(line)
> 'b\'...'
>
> As you can see from the example the 'b' becomes part of the string! It
> seems like this should be a bug, is it?

No because you are getting back the repr for the bytes object. Str
does not know what the encoding is for the bytes so it has no way of
performing the decoding.

> Here's the iteration problem:
>
> >>> for line in page:
>         print(line)
> Traceback (most recent call last):
>   File "<...>", line 1, in <module>
>     for line in page:
> TypeError: 'addinfourl' object is not iterable
>
> Why is this not iterable anymore? Is this too a bug? What the heck is an
> addinfourl object?
>
> 5. Finally, I see that a bytes object has some of the same methods as
> strings. But the error messages are confusing.
>
> >>> line
> b'   "http://www.w3.org/TR/html4/loose.dtd">\n'
> >>> line.find('www')
> Traceback (most recent call last):
>   File "<...>", line 1, in <module>
>     line.find('www')
> TypeError: expected an object with the buffer interface
> >>> line.find(b'www')
> 11
>
> Why couldn't find take string as a parameter?

Once again, encoding. The bytes object doesn't know what to encode the
string to in order to do an apples-to-apples search of bytes.

> If folks have advice on which, if any, of these are bugs please let me know
> and I'll file them, and if possible work on fixes for them too.

While not a bug, adding iterator support wouldn't hurt. And for the
better TypeError messages, you could try submitting a patch to change
to tack on something like "(e.g. bytes)", although I am not sure if
anyone else would agree on that decision.

> If you have advice on how I should better be teaching this new urllib that
> would be great to hear as well.

Probably the biggest issue will be having to explain string encoding.
Obviously you can gloss over it or provide students with a simple
library that just automatically converts the strings. Or even better,
provide some code for the standard library that can take the HTML,
figure out the encoding, and then return the decoded strings (might
actually already be something for that that I am not aware of).

-Brett
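The repr-versus-decode distinction can be shown without touching the network. In this minimal sketch the sample `line` is typed in by hand rather than fetched, and ASCII is assumed as its encoding.

```python
# Hand-written sample bytes; not fetched from any server.
line = b'   "http://www.w3.org/TR/html4/loose.dtd">\n'

as_repr = str(line)              # no encoding given: the repr of the bytes object
as_text = line.decode('ascii')   # explicit encoding: real text

print(as_repr.startswith("b'"))  # True, the b prefix is part of the string
print(as_text.startswith("b'"))  # False

# Searching works when both sides are the same type.
print(line.find(b'www'))         # 11
print(as_text.find('www'))       # 11
```

Mixing the two types, as in `line.find('www')`, still raises TypeError in current Python 3, although the wording of the message may differ from the one quoted above.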
Re: [Python-Dev] Partial function application 'from the right'
Collin Winter writes:
>
> Have any of the original objections to Calvin's patch
> (http://bugs.python.org/issue1706256) been addressed? If not, I don't
> see anything in these threads that justifies resurrecting it.
>
> I still haven't seen any real code presented that would benefit from
> partial.skip or partial_right.

The arguments for and against the patch could be brought against partial()
itself, so I don't understand the -1's at all.

I know I hardly ever use partial() (apart from the performance aspect, it
looks like a completely useless addition to me), but from a performance
standpoint, partial.skip has as much usefulness as partial() itself.

Regards

Antoine.
Re: [Python-Dev] Partial function application 'from the right'
http://bugs.python.org/issue1706256

Took me a couple days to catch up on this thread so here is the link
for any interested. Could it be possible to reevaluate this?

On Sat, Jan 31, 2009 at 2:40 PM, Leif Walsh wrote:
> On Fri, Jan 30, 2009 at 7:38 PM, Calvin Spealman wrote:
>> I am just replying to the end of this thread to throw in a reminder
>> about my partial.skip patch, which allows the following usage:
>>
>>     split_one = partial(str.split, partial.skip, 1)
>>
>> Not looking to say "mine is better", but if the idea is being given
>> merit, I like the skipping arguments method better than just the
>> "right partial", which I think is confusing combined with keyword and
>> optional arguments. And, this patch already exists. Could it be
>> re-evaluated?
>
> +1 but I don't know where the patch is.
>
> --
> Cheers,
> Leif

--
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy
[Python-Dev] teaching the new urllib
I'm just getting ready to start the semester using my new book (Python
Programming in Context) and noticed that I somehow missed all the changes to
urllib in python 3.0. ARGH to say the least. I like using urllib in the
intro class because we can get data from places that are more
interesting/motivating/relevant to the students.

Here are some of my observations on trying to do very basic stuff with
urllib:

1. urllib.urlopen is now urllib.request.urlopen

2. The object returned by urlopen is no longer iterable! no more for line
in url.

3. read, readline, readlines now return bytes objects or arrays of bytes
instead of a str and array of str

4. Taking the naive approach to converting a bytes object to a str does not
work as you would expect.

>>> import urllib.request
>>> page = urllib.request.urlopen('http://knuth.luther.edu/test.html')
>>> page
<...>
>>> line = page.readline()
>>> line
b'...'
>>> str(line)
'b\'...'

As you can see from the example the 'b' becomes part of the string! It
seems like this should be a bug, is it?

Here's the iteration problem:

>>> for line in page:
        print(line)
Traceback (most recent call last):
  File "<...>", line 1, in <module>
    for line in page:
TypeError: 'addinfourl' object is not iterable

Why is this not iterable anymore? Is this too a bug? What the heck is an
addinfourl object?

5. Finally, I see that a bytes object has some of the same methods as
strings. But the error messages are confusing.

>>> line
b'   "http://www.w3.org/TR/html4/loose.dtd">\n'
>>> line.find('www')
Traceback (most recent call last):
  File "<...>", line 1, in <module>
    line.find('www')
TypeError: expected an object with the buffer interface
>>> line.find(b'www')
11

Why couldn't find take string as a parameter?

If folks have advice on which, if any, of these are bugs please let me know
and I'll file them, and if possible work on fixes for them too.

If you have advice on how I should better be teaching this new urllib that
would be great to hear as well.

Thanks,
Brad

--
Brad Miller
Assistant Professor, Computer Science
Luther College
Re: [Python-Dev] Partial function application 'from the right'
On Tue, Feb 3, 2009 at 5:44 AM, Ben North wrote:
> Hi,
>
> Thanks for the further responses. Again, I'll try to summarise:
>
> Scott David Daniels pointed out an awkward interaction when chaining
> partial applications, such that it could become very unclear what was
> going to happen when the final function is called:
>
>> If you have:
>>     def button(root, position, action=None, text='*', color=None):
>>         ...
>>     ...
>>     blue_button = partial(button, my_root, color=(0,0,1))
>>
>> Should partial_right(blue_button, 'red') change the color or the text?
>
> Calvin Spealman mentioned a previous patch of his which took the 'hole'
> approach, i.e.:
>
>> [...] my partial.skip patch, which allows the following usage:
>>
>>     split_one = partial(str.split, partial.skip, 1)
>
> This would solve my original problems, and, continuing Scott's example,
>
>     def on_clicked(...): ...
>
>     _ = partial.skip
>     clickable_blue_button = partial(blue_button, _, on_clicked)
>
> has a clear enough meaning I think:
>
>     clickable_blue_button('top-left corner')
>     = blue_button('top-left corner', on_clicked)
>     = button(my_root, 'top-left corner', on_clicked, color=(0,0,1))
>
> Calvin's idea/patch sounds good to me, then. Others also liked it.
> Could it be re-considered, instead of the partial_right idea?

Have any of the original objections to Calvin's patch
(http://bugs.python.org/issue1706256) been addressed? If not, I don't
see anything in these threads that justifies resurrecting it.

I still haven't seen any real code presented that would benefit from
partial.skip or partial_right.

Collin Winter
Re: [Python-Dev] [patch] Duplicate sections detection in ConfigParser
>> The attached patch is compatible with both the 2.x and the 3.x
>> branches; it adds a `unique_sects` parameter to the constructor of
>> RawConfigParser and a test in the parser loop that raises
>> DuplicateSectionError if a section is seen more than once and
>> unique_sects is True.

http://bugs.python.org/issue2204 refers to the same issue. Perhaps you
can upload your patch there, in addition to adding any comments.

Thanks,
Raghu
Re: [Python-Dev] [patch] Duplicate sections detection in ConfigParser
On Tue, Feb 03, 2009, Yannick Gingras wrote:
>
> The attached patch is compatible with both the 2.x and the 3.x
> branches; it adds a `unique_sects` parameter to the constructor of
> RawConfigParser and a test in the parser loop that raises
> DuplicateSectionError if a section is seen more than once and
> unique_sects is True.

Please go ahead and post the patch to bugs.python.org; it can always be
revised later and this ensures that we have a record.
--
Aahz (a...@pythoncraft.com) <*> http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers
wrote programs, then the first woodpecker that came along would destroy
civilization.
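
For readers who want to see the behaviour under discussion, here is a
minimal sketch. It does not use the `unique_sects` parameter from the
patch above; it assumes the `strict` constructor argument that the
`configparser` module in Python 3.2 and later provides, which rejects a
section that appears twice in one source. The sample text and the
`load` helper are made up for illustration.

```python
import configparser

# Made-up sample input: the [server] section appears twice.
SAMPLE = """\
[server]
host = example.org

[server]
port = 8080
"""


def load(text, strict):
    """Parse `text`; with strict=True a repeated section raises."""
    parser = configparser.RawConfigParser(strict=strict)
    parser.read_string(text)
    return parser


# Strict mode: the second [server] header is reported as an error.
try:
    load(SAMPLE, strict=True)
    duplicate_name = None
except configparser.DuplicateSectionError as exc:
    duplicate_name = exc.section

# Lenient mode: the two [server] blocks are silently merged.
merged = load(SAMPLE, strict=False)
merged_options = sorted(merged.options("server"))

print(duplicate_name)
print(merged_options)
```

The lenient branch shows why an opt-in check is useful: without it,
a duplicated section is merged and the mistake goes unnoticed.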
[Python-Dev] Partial function application 'from the right'
Hi,

Thanks for the further responses. Again, I'll try to summarise:

Scott David Daniels pointed out an awkward interaction when chaining
partial applications, such that it could become very unclear what was
going to happen when the final function is called:

> If you have:
>     def button(root, position, action=None, text='*', color=None):
>         ...
>     ...
>     blue_button = partial(button, my_root, color=(0,0,1))
>
> Should partial_right(blue_button, 'red') change the color or the text?

Calvin Spealman mentioned a previous patch of his which took the 'hole'
approach, i.e.:

> [...] my partial.skip patch, which allows the following usage:
>
>     split_one = partial(str.split, partial.skip, 1)

This would solve my original problems, and, continuing Scott's example,

    def on_clicked(...): ...

    _ = partial.skip
    clickable_blue_button = partial(blue_button, _, on_clicked)

has a clear enough meaning I think:

    clickable_blue_button('top-left corner')
    = blue_button('top-left corner', on_clicked)
    = button(my_root, 'top-left corner', on_clicked, color=(0,0,1))

Calvin's idea/patch sounds good to me, then. Others also liked it.
Could it be re-considered, instead of the partial_right idea?

Ben.
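
The 'hole' approach described above can be sketched in pure Python. This
is an illustration only, not Calvin's patch: `SKIP` and
`partial_with_holes` are hypothetical names standing in for the proposed
`partial.skip`, and the `button` body is a dummy that just returns its
arguments so the result can be inspected.

```python
from functools import partial


class _Skip:
    """Sentinel marking a positional slot to be filled at call time."""

    def __repr__(self):
        return "SKIP"


SKIP = _Skip()  # hypothetical stand-in for the proposed partial.skip


def partial_with_holes(func, *bound_args, **bound_kwargs):
    """Like functools.partial, except that each SKIP in bound_args is
    filled, left to right, by a positional argument given at call time."""

    def wrapper(*call_args, **call_kwargs):
        remaining = iter(call_args)
        filled = []
        for arg in bound_args:
            if arg is SKIP:
                try:
                    filled.append(next(remaining))
                except StopIteration:
                    raise TypeError("too few arguments to fill the holes") from None
            else:
                filled.append(arg)
        filled.extend(remaining)  # leftover call-time arguments go last
        return func(*filled, **{**bound_kwargs, **call_kwargs})

    return wrapper


# Scott's example, with a dummy body so the call can be checked.
def button(root, position, action=None, text='*', color=None):
    return (root, position, action, text, color)


def on_clicked():
    pass


blue_button = partial(button, "my_root", color=(0, 0, 1))
clickable_blue_button = partial_with_holes(blue_button, SKIP, on_clicked)

print(clickable_blue_button('top-left corner'))
```

Because the hole names a specific position, the ambiguity in the
partial_right question does not arise here: 'top-left corner' can only
land in the `position` slot.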
Re: [Python-Dev] Should there be a source-code checksum in module objects?
Guido van Rossum writes:
> I suggest that you move this discussion to python-ideas to ferret out
> a possible implementation and API; or to find out work-arounds.

Okay. Done.
Re: [Python-Dev] C API for appending to arrays
Raymond Hettinger wrote:
> [Hrvoje Niksic]
>> The one thing missing from the array module is the ability to
>> directly access array values from C.
>
> Please put a feature request on the bug tracker.

Done, http://bugs.python.org/issue5141