Re: [Python-Dev] teaching the new urllib

2009-02-03 Thread Bill Janssen
Brett Cannon  wrote:

> On Tue, Feb 3, 2009 at 15:50, Tres Seaver  wrote:
> > -BEGIN PGP SIGNED MESSAGE-
> > Hash: SHA1
> >
> > Brett Cannon wrote:
> >> On Tue, Feb 3, 2009 at 11:08, Brad Miller  wrote:
> >>> I'm just getting ready to start the semester using my new book (Python
> > >>> Programming in Context) and noticed that I somehow missed all the changes to
> >>> urllib in python 3.0.  ARGH to say the least.  I like using urllib in the
> >>> intro class because we can get data from places that are more
> >>> interesting/motivating/relevant to the students.
> >>> Here are some of my observations on trying to do very basic stuff with
> >>> urllib:
> >>> 1.  urllib.urlopen  is now urllib.request.urlopen
> >>
> >> Technically urllib2.urlopen became urllib.request.urlopen. See PEP
> >> 3108 for the details of the reorganization.
> >>
> >>> 2.  The object returned by urlopen is no longer iterable!  no more for 
> >>> line
> >>> in url.
> >>
> >> That is probably a difference between urllib2 and urllib.
> >>
> >>> 3.  read, readline, readlines now return bytes objects or arrays of bytes
> >>> instead of a str and array of str
> >>
> >> Correct.
> >>
> >>> 4.  Taking the naive approach to converting a bytes object to a str does 
> >>> not
> >>> work as you would expect.
> >>>
> >> import urllib.request
> >> page = urllib.request.urlopen('http://knuth.luther.edu/test.html')
> >> page
> >>> >
> >> line = page.readline()
> >> line
> >>> b' >> str(line)
> >>> 'b\' >>> As you can see from the example the 'b' becomes part of the string!  It
> >>> seems like this should be a bug, is it?
> >>>
> >>
> >> No because you are getting back the repr for the bytes object. Str
> >> does not know what the encoding is for the bytes so it has no way of
> >> performing the decoding.
> >
> > The encoding information *is* available in the response headers, e.g.:
> >
> > - -- %< -
> > $ wget -S --spider http://knuth.luther.edu/test.html
> > - --18:46:24--  http://knuth.luther.edu/test.html
> >   => `test.html'
> > Resolving knuth.luther.edu... 192.203.196.71
> > Connecting to knuth.luther.edu|192.203.196.71|:80... connected.
> > HTTP request sent, awaiting response...
> >  HTTP/1.1 200 OK
> >  Date: Tue, 03 Feb 2009 23:46:28 GMT
> >  Server: Apache/2.0.50 (Linux/SUSE)
> >  Last-Modified: Mon, 17 Sep 2007 23:35:49 GMT
> >  ETag: "2fcd8-1d8-43b2bf40"
> >  Accept-Ranges: bytes
> >  Content-Length: 472
> >  Keep-Alive: timeout=15, max=100
> >  Connection: Keep-Alive
> >  Content-Type: text/html; charset=ISO-8859-1
> > Length: 472 [text/html]
> > 200 OK
> > - -- %< -
> >
> 
> Right, but he was asking about why passing bytes to str() led to it
> returning the repr.
> 
> > So, the OP's use case *could* be satisfied, assuming that the Py3K
> > version of urllib sprouted a means of leveraging that header.  In this
> > sense, fetching the resource over HTTP is *better* than loading it from
> > a file:  information about the character set is explicit, and highly
> > likely to be correct, at least for any resource people expect to render
> > cleanly in a browser.
> 
> Right. And even if the header lacks the info as Content-Type is not
> guaranteed to contain the charset there is also the chance for the
> HTML or DOCTYPE declaration to say.
> 
> But as Bill pointed out, urllib just fetches data via HTTP, so a
> character encoding will not always be valuable. Best solution would be
> to provide something in html that can take what urllib.request.urlopen
> returns and handle the decoding.

Yes, that sounds like the right solution to me, too.

Bill
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Issue 4285 Review

2009-02-03 Thread Ross Light
On Tue, Feb 3, 2009 at 1:20 PM, Aahz  wrote:
> When sending in a request like this, it's useful to summarize the issue;
> few people know bug reports by number, and at least some people who might
> be interested in looking probably won't bother if they have no clue
> whether it's in their area of expertise.

You're right.  I'm sorry.  I completely forgot.  And thank you, Eric,
for reviewing and adding the subject for me.

Cheers,
Ross Light


Re: [Python-Dev] C API for appending to arrays

2009-02-03 Thread Mike Klaas


On 2-Feb-09, at 9:21 AM, Hrvoje Niksic wrote:

> It turns out that an even faster method of creating an array is by
> using the fromstring() method.  fromstring() requires an actual
> string, not a buffer, so in C++ I created an std::vector<double>
> with a contiguous array of doubles, passed that array to
> PyString_FromStringAndSize, and called array.fromstring with the
> resulting string.  Despite all the unnecessary copying, the result
> was much faster than either of the previous versions.
>
> Would it be possible for the array module to define a C interface
> for the most frequent operations on array objects, such as appending
> an item, and getting/setting an item?  Failing that, could we at
> least make fromstring() accept an arbitrary read buffer, not just an
> actual string?

Do you need to append, or are you just looking to create/manipulate an
array with a bunch of c-float values?  I find As{Write/Read}Buffer
sufficient for most of these tasks.  I've included some example pyrex
code that populates a new array.array at c speed.  (Note that you can
get the size of the resulting c array more easily than you are by
using PyObject_Length).  Of course, this still leaves appending to an
already-created array difficult.


def calcW0(W1, colTotal):
    """ Calculate a W0 array from a W1 array.

    @param W1: array.array of doubles
    @param colTotal: value to which each column should sum

    @return W0 = [colTotal] * NA - W1
    """
    cdef int NA
    NA = len(W1)
    W0 = array('d', [colTotal]) * NA

    cdef double *cW1, *cW0
    cdef int i
    cdef Py_ssize_t dummy

    # The casts let the double* variables receive the buffer addresses.
    PyObject_AsReadBuffer(W1, <void **>&cW1, &dummy)
    PyObject_AsWriteBuffer(W0, <void **>&cW0, &dummy)

    for i from 0 <= i < NA:
        cW0[i] = cW0[i] - cW1[i]

    return W0
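
For comparison, here is a rough pure-Python sketch of the same calculation, plus the "fill an array from a raw buffer" step Hrvoje asked about.  It assumes Python 3.2 or later, where array.frombytes() (the successor to fromstring()) takes a bytes-like object; the name calc_w0 and the sample values are made up for illustration and are not part of the pyrex code above.

```python
from array import array
import struct


def calc_w0(w1, col_total):
    """Return an array('d') holding col_total - w1[i] for each item."""
    return array('d', (col_total - x for x in w1))


# Stand-in for a block of C doubles produced elsewhere (e.g. by C++ code).
raw = struct.pack('3d', 0.25, 0.5, 1.0)

# Append the raw doubles directly, without building a list first.
w1 = array('d')
w1.frombytes(raw)

w0 = calc_w0(w1, 1.0)
print(w1.tolist(), w0.tolist())
```

This does not replace a C-level interface; it only shows that the buffer-based route is reachable from Python itself.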

regards,
-Mike


Re: [Python-Dev] teaching the new urllib

2009-02-03 Thread Brett Cannon
On Tue, Feb 3, 2009 at 15:50, Tres Seaver  wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
>
> Brett Cannon wrote:
>> On Tue, Feb 3, 2009 at 11:08, Brad Miller  wrote:
>>> I'm just getting ready to start the semester using my new book (Python
>>> Programming in Context) and noticed that I somehow missed all the changes to
>>> urllib in python 3.0.  ARGH to say the least.  I like using urllib in the
>>> intro class because we can get data from places that are more
>>> interesting/motivating/relevant to the students.
>>> Here are some of my observations on trying to do very basic stuff with
>>> urllib:
>>> 1.  urllib.urlopen  is now urllib.request.urlopen
>>
>> Technically urllib2.urlopen became urllib.request.urlopen. See PEP
>> 3108 for the details of the reorganization.
>>
>>> 2.  The object returned by urlopen is no longer iterable!  no more for line
>>> in url.
>>
>> That is probably a difference between urllib2 and urllib.
>>
>>> 3.  read, readline, readlines now return bytes objects or arrays of bytes
>>> instead of a str and array of str
>>
>> Correct.
>>
>>> 4.  Taking the naive approach to converting a bytes object to a str does not
>>> work as you would expect.
>>>
>>> >>> import urllib.request
>>> >>> page = urllib.request.urlopen('http://knuth.luther.edu/test.html')
>>> >>> page
>>> <...>
>>> >>> line = page.readline()
>>> >>> line
>>> b'<...'
>>> >>> str(line)
>>> 'b\'<...'
>>> >>>
>>> As you can see from the example the 'b' becomes part of the string!  It
>>> seems like this should be a bug, is it?
>>>
>>
>> No because you are getting back the repr for the bytes object. Str
>> does not know what the encoding is for the bytes so it has no way of
>> performing the decoding.
>
> The encoding information *is* available in the response headers, e.g.:
>
> - -- %< -
> $ wget -S --spider http://knuth.luther.edu/test.html
> - --18:46:24--  http://knuth.luther.edu/test.html
>   => `test.html'
> Resolving knuth.luther.edu... 192.203.196.71
> Connecting to knuth.luther.edu|192.203.196.71|:80... connected.
> HTTP request sent, awaiting response...
>  HTTP/1.1 200 OK
>  Date: Tue, 03 Feb 2009 23:46:28 GMT
>  Server: Apache/2.0.50 (Linux/SUSE)
>  Last-Modified: Mon, 17 Sep 2007 23:35:49 GMT
>  ETag: "2fcd8-1d8-43b2bf40"
>  Accept-Ranges: bytes
>  Content-Length: 472
>  Keep-Alive: timeout=15, max=100
>  Connection: Keep-Alive
>  Content-Type: text/html; charset=ISO-8859-1
> Length: 472 [text/html]
> 200 OK
> - -- %< -
>

Right, but he was asking about why passing bytes to str() led to it
returning the repr.

> So, the OP's use case *could* be satisfied, assuming that the Py3K
> version of urllib sprouted a means of leveraging that header.  In this
> sense, fetching the resource over HTTP is *better* than loading it from
> a file:  information about the character set is explicit, and highly
> likely to be correct, at least for any resource people expect to render
> cleanly in a browser.

Right. And even if the header lacks the info as Content-Type is not
guaranteed to contain the charset there is also the chance for the
HTML or DOCTYPE declaration to say.

But as Bill pointed out, urllib just fetches data via HTTP, so a
character encoding will not always be valuable. Best solution would be
to provide something in html that can take what urllib.request.urlopen
returns and handle the decoding.

-Brett


Re: [Python-Dev] teaching the new urllib

2009-02-03 Thread Daniel (ajax) Diniz
Tres Seaver wrote:
> Brett Cannon wrote:
>> No because you are getting back the repr for the bytes object. Str
>> does not know what the encoding is for the bytes so it has no way of
>> performing the decoding.
>
> The encoding information *is* available in the response headers, e.g.:
[snip]

That's the target of http://bugs.python.org/issue4733 cited by
Benjamin: 'Add a "decode to declared encoding" version of urlopen to
urllib' . I think it's an important use case, but the current patch is
pretty awful. Improvements/feedback welcome :)

Daniel


Re: [Python-Dev] teaching the new urllib

2009-02-03 Thread Tres Seaver
Brett Cannon wrote:
> On Tue, Feb 3, 2009 at 11:08, Brad Miller  wrote:
>> I'm just getting ready to start the semester using my new book (Python
>> Programming in Context) and noticed that I somehow missed all the changes to
>> urllib in python 3.0.  ARGH to say the least.  I like using urllib in the
>> intro class because we can get data from places that are more
>> interesting/motivating/relevant to the students.
>> Here are some of my observations on trying to do very basic stuff with
>> urllib:
>> 1.  urllib.urlopen  is now urllib.request.urlopen
> 
> Technically urllib2.urlopen became urllib.request.urlopen. See PEP
> 3108 for the details of the reorganization.
> 
>> 2.  The object returned by urlopen is no longer iterable!  no more for line
>> in url.
> 
> That is probably a difference between urllib2 and urllib.
> 
>> 3.  read, readline, readlines now return bytes objects or arrays of bytes
>> instead of a str and array of str
> 
> Correct.
> 
>> 4.  Taking the naive approach to converting a bytes object to a str does not
>> work as you would expect.
>>
>> >>> import urllib.request
>> >>> page = urllib.request.urlopen('http://knuth.luther.edu/test.html')
>> >>> page
>> <...>
>> >>> line = page.readline()
>> >>> line
>> b'<...'
>> >>> str(line)
>> 'b\'<...'
>> >>>
>> As you can see from the example the 'b' becomes part of the string!  It
>> seems like this should be a bug, is it?
>>
> 
> No because you are getting back the repr for the bytes object. Str
> does not know what the encoding is for the bytes so it has no way of
> performing the decoding.

The encoding information *is* available in the response headers, e.g.:

-- %< -
$ wget -S --spider http://knuth.luther.edu/test.html
--18:46:24--  http://knuth.luther.edu/test.html
   => `test.html'
Resolving knuth.luther.edu... 192.203.196.71
Connecting to knuth.luther.edu|192.203.196.71|:80... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Date: Tue, 03 Feb 2009 23:46:28 GMT
  Server: Apache/2.0.50 (Linux/SUSE)
  Last-Modified: Mon, 17 Sep 2007 23:35:49 GMT
  ETag: "2fcd8-1d8-43b2bf40"
  Accept-Ranges: bytes
  Content-Length: 472
  Keep-Alive: timeout=15, max=100
  Connection: Keep-Alive
  Content-Type: text/html; charset=ISO-8859-1
Length: 472 [text/html]
200 OK
-- %< -

So, the OP's use case *could* be satisfied, assuming that the Py3K
version of urllib sprouted a means of leveraging that header.  In this
sense, fetching the resource over HTTP is *better* than loading it from
a file:  information about the character set is explicit, and highly
likely to be correct, at least for any resource people expect to render
cleanly in a browser.
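
As a minimal sketch of what "leveraging that header" could look like: the standard email package can parse a Content-Type header and report its charset.  The header text below is copied from the wget output above; decode_body and the utf-8 fallback are hypothetical choices made for this example, not an existing urllib API.  In current Python 3 the object returned by urlopen() exposes its headers as an email.message.Message subclass, so the same get_content_charset() call should apply there.

```python
from email import message_from_string

# Header block modelled on the wget output above.
raw_headers = (
    "Content-Length: 472\n"
    "Content-Type: text/html; charset=ISO-8859-1\n"
    "\n"
)
headers = message_from_string(raw_headers)


def decode_body(body, headers, fallback="utf-8"):
    """Decode body using the charset named in Content-Type, if any."""
    charset = headers.get_content_charset() or fallback
    return body.decode(charset)


declared = headers.get_content_charset()
text = decode_body(b"caf\xe9", headers)
print(declared)
```

The fallback only matters when the server omits the charset parameter, which is the case Brett raises in his reply.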


Tres.
--
===
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   "Excellence by Design"http://palladion.com


Re: [Python-Dev] Issue 4285 Review

2009-02-03 Thread Eric Smith

Aahz wrote:
> On Tue, Feb 03, 2009, Ross Light wrote:
>> Hello, python-dev.  I submitted a patch a couple weeks ago for Issue
>> 4285, and it has been reviewed and revised.  Would someone please
>> review/commit it?  Thank you.
>>
>> http://bugs.python.org/issue4285
>
> When sending in a request like this, it's useful to summarize the issue;
> few people know bug reports by number, and at least some people who might
> be interested in looking probably won't bother if they have no clue
> whether it's in their area of expertise.


I'll review it with the intention of committing it.

The subject is "Use a named tuple for sys.version_info".
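
For anyone unsure what that means in practice, here is a small sketch of the behaviour as it exists in released Pythons (2.7 and 3.1 onward, as far as I know): the tuple keeps working as a tuple and also gains named fields.

```python
import sys

# Tuple-style access keeps working.
major, minor = sys.version_info[:2]

# Named access reads better and gives the same values.
named = (sys.version_info.major, sys.version_info.minor)

print(major, minor, named)
```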

Eric.



Re: [Python-Dev] Issue 4285 Review

2009-02-03 Thread Aahz
On Tue, Feb 03, 2009, Ross Light wrote:
>
> Hello, python-dev.  I submitted a patch a couple weeks ago for Issue
> 4285, and it has been reviewed and revised.  Would someone please
> review/commit it?  Thank you.
> 
> http://bugs.python.org/issue4285

When sending in a request like this, it's useful to summarize the issue;
few people know bug reports by number, and at least some people who might
be interested in looking probably won't bother if they have no clue
whether it's in their area of expertise.
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers wrote 
programs, then the first woodpecker that came along would destroy civilization.


Re: [Python-Dev] Partial function application 'from the right'

2009-02-03 Thread Raymond Hettinger

>>> I still haven't seen any real code presented that would benefit from
>>> partial.skip or partial_right.


# some Articles have timestamp attributes and some don't
stamp = partial_right(getattr, 'timestamp', 0)
lastupdate = max(map(stamp, articles))

# some beautiful soup nodes have a name attribute and some don't
name = partial_right(getattr, 'name', '')
alltags = set(map(name, soup))




>> The arguments for and against the patch could be brought against partial()
>> itself, so I don't understand the -1's at all.
>
> Quite so, but that doesn't justify adding more capabilities to partial().


I concur with Collin.  Lever arguments are a road to bloat.
"In for a penny, in for a pound" is not a language design principle.

One of the real problems with partial() and its variants is that they provide
almost no advantage over an equivalent lambda.  IMO, lambda has
an advantage over partial.skip() because the lambda is easier to read:
 
  modcubes = lambda base, mod:   pow(base, 3, mod)
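
A short sketch of the comparison, using only functools.partial as it exists today.  The names powers_of_two and stamp are illustrative; partial_right and partial.skip are the proposals under discussion and are not used here.

```python
from functools import partial

# partial() can only fix leading positional arguments.
powers_of_two = partial(pow, 2)          # powers_of_two(e) == pow(2, e)

# Fixing the middle argument of pow(base, exp, mod) takes a lambda (or a def).
modcubes = lambda base, mod: pow(base, 3, mod)

# getattr with a default, the use case from the examples above, as a lambda.
stamp = lambda article: getattr(article, 'timestamp', 0)

print(powers_of_two(10), modcubes(4, 5), stamp(object()))
```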



Raymond


[Python-Dev] Issue 4285 Review

2009-02-03 Thread Ross Light
Hello, python-dev.  I submitted a patch a couple weeks ago for Issue
4285, and it has been reviewed and revised.  Would someone please
review/commit it?  Thank you.

http://bugs.python.org/issue4285

Cheers,
Ross Light


Re: [Python-Dev] Partial function application 'from the right'

2009-02-03 Thread Collin Winter
On Tue, Feb 3, 2009 at 11:53 AM, Antoine Pitrou  wrote:
> Collin Winter  gmail.com> writes:
>>
>> Have any of the original objections to Calvin's patch
>> (http://bugs.python.org/issue1706256) been addressed? If not, I don't
>> see anything in these threads that justify resurrecting it.
>>
>> I still haven't seen any real code presented that would benefit from
>> partial.skip or partial_right.
>
> The arguments for and against the patch could be brought against partial()
> itself, so I don't understand the -1's at all.

Quite so, but that doesn't justify adding more capabilities to partial().

Collin Winter


Re: [Python-Dev] teaching the new urllib

2009-02-03 Thread Bill Janssen
İsmail Dönmez  wrote:

> Hi,
> 
> On Tue, Feb 3, 2009 at 21:56, Brett Cannon  wrote:
> > Probably the biggest issue will be having to explain string encoding.
> > Obviously you can gloss over it or provide students with a simple
> > library that just automatically converts the strings. Or even better,
> > provide some code for the standard library that can take the HTML,
> > figure out the encoding, and then return the decoded strings (might
> > actually already be something for that that I am not aware of).
> 
> http://chardet.feedparser.org/ should work fine for most auto-encoding
> detection needs.

Remember that the return value from urlopen() need not be HTML or XML.
It could be, say, an image or PDF or Word, or pretty much anything.
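
A possible way to respect that in a helper is to decode only when the response says it is text.  This is a sketch, not an existing API: decode_if_text and the utf-8 fallback are hypothetical, and the headers are built by hand with the email package so the example runs without a network.

```python
from email import message_from_string


def decode_if_text(body, headers, fallback="utf-8"):
    """Return str for text/* payloads and the raw bytes otherwise."""
    if headers.get_content_maintype() != "text":
        return body
    return body.decode(headers.get_content_charset() or fallback)


html_headers = message_from_string("Content-Type: text/html; charset=ISO-8859-1\n\n")
pdf_headers = message_from_string("Content-Type: application/pdf\n\n")

page_text = decode_if_text(b"caf\xe9", html_headers)
pdf_bytes = decode_if_text(b"%PDF-1.4", pdf_headers)
print(type(page_text).__name__, type(pdf_bytes).__name__)
```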

Bill


Re: [Python-Dev] teaching the new urllib

2009-02-03 Thread Benjamin Peterson
On Tue, Feb 3, 2009 at 2:08 PM, Brad Miller  wrote:
> Here's the iteration problem:
> >>> for line in page:
>         print(line)
> Traceback (most recent call last):
>   File "<...>", line 1, in <module>
>     for line in page:
> TypeError: 'addinfourl' object is not iterable
> Why is this not iterable anymore?  Is this too a bug?  What the heck is an
> addinfourl object?

See http://bugs.python.org/issue4608.

>
> 5.  Finally, I see that a bytes object has some of the same methods as
> strings.  But the error messages are confusing.
> >>> line
> b'   "http://www.w3.org/TR/html4/loose.dtd">\n'
> >>> line.find('www')
> Traceback (most recent call last):
>   File "<...>", line 1, in <module>
>     line.find('www')
> TypeError: expected an object with the buffer interface
> >>> line.find(b'www')
> 11
> Why couldn't find take string as a parameter?

 See http://bugs.python.org/issue4733



-- 
Regards,
Benjamin


Re: [Python-Dev] teaching the new urllib

2009-02-03 Thread İsmail Dönmez
Hi,

On Tue, Feb 3, 2009 at 21:56, Brett Cannon  wrote:
> Probably the biggest issue will be having to explain string encoding.
> Obviously you can gloss over it or provide students with a simple
> library that just automatically converts the strings. Or even better,
> provide some code for the standard library that can take the HTML,
> figure out the encoding, and then return the decoded strings (might
> actually already be something for that that I am not aware of).

http://chardet.feedparser.org/ should work fine for most auto-encoding
detection needs.

Regards,
ismail


Re: [Python-Dev] teaching the new urllib

2009-02-03 Thread Brett Cannon
On Tue, Feb 3, 2009 at 11:08, Brad Miller  wrote:
> I'm just getting ready to start the semester using my new book (Python
> Programming in Context) and noticed that I somehow missed all the changes to
> urllib in python 3.0.  ARGH to say the least.  I like using urllib in the
> intro class because we can get data from places that are more
> interesting/motivating/relevant to the students.
> Here are some of my observations on trying to do very basic stuff with
> urllib:
> 1.  urllib.urlopen  is now urllib.request.urlopen

Technically urllib2.urlopen became urllib.request.urlopen. See PEP
3108 for the details of the reorganization.

> 2.  The object returned by urlopen is no longer iterable!  no more for line
> in url.

That is probably a difference between urllib2 and urllib.

> 3.  read, readline, readlines now return bytes objects or arrays of bytes
> instead of a str and array of str

Correct.

> 4.  Taking the naive approach to converting a bytes object to a str does not
> work as you would expect.
>
> >>> import urllib.request
> >>> page = urllib.request.urlopen('http://knuth.luther.edu/test.html')
> >>> page
> <...>
> >>> line = page.readline()
> >>> line
> b'<...'
> >>> str(line)
> 'b\'<...'
> >>>
> As you can see from the example the 'b' becomes part of the string!  It
> seems like this should be a bug, is it?
>

No because you are getting back the repr for the bytes object. Str
does not know what the encoding is for the bytes so it has no way of
performing the decoding.
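
To make the difference concrete, here is a small sketch with a made-up line of bytes standing in for what readline() returned.  The encoding name is the one the server declares for this page; for other pages it would have to come from the response.

```python
# Stand-in for one line read from the response.
line = b'<p>caf\xe9</p>\n'

# With no encoding, str() falls back to the repr, 'b' prefix included.
as_repr = str(line)

# With an encoding, str() and bytes.decode() both perform a real decode.
text = str(line, 'iso-8859-1')
same_text = line.decode('iso-8859-1')

print(as_repr)
```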

> Here's the iteration problem:
> >>> for line in page:
>         print(line)
> Traceback (most recent call last):
>   File "<...>", line 1, in <module>
>     for line in page:
> TypeError: 'addinfourl' object is not iterable
> Why is this not iterable anymore?  Is this too a bug?  What the heck is an
> addinfourl object?
>
> 5.  Finally, I see that a bytes object has some of the same methods as
> strings.  But the error messages are confusing.
> >>> line
> b'   "http://www.w3.org/TR/html4/loose.dtd">\n'
> >>> line.find('www')
> Traceback (most recent call last):
>   File "<...>", line 1, in <module>
>     line.find('www')
> TypeError: expected an object with the buffer interface
> >>> line.find(b'www')
> 11
> Why couldn't find take string as a parameter?

Once again, encoding. The bytes object doesn't know what to encode the
string to in order to do an apples-to-apples search of bytes.
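
In other words, both sides of the search have to be the same type.  A minimal sketch of the two ways to get there, using the line from the example (ASCII is assumed here because the line contains nothing else):

```python
line = b'   "http://www.w3.org/TR/html4/loose.dtd">\n'

# Option 1: encode the needle so both sides are bytes.
pos_bytes = line.find('www'.encode('ascii'))

# Option 2: decode the haystack so both sides are str.
pos_text = line.decode('ascii').find('www')

print(pos_bytes, pos_text)
```

For teaching, decoding once right after reading and working with str from then on is probably the less surprising route.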

> If folks have advice on which, if any, of these are bugs please let me know
> and I'll file them, and if possible work on fixes for them too.

While not a bug, adding iterator support wouldn't hurt. And for the
better TypeError messages, you could try submitting a patch to change
to tack on something like "(e.g. bytes)", although I am not sure if
anyone else would agree on that decision.

> If you have advice on how I should better be teaching this new urllib that
> would be great to hear as well.

Probably the biggest issue will be having to explain string encoding.
Obviously you can gloss over it or provide students with a simple
library that just automatically converts the strings. Or even better,
provide some code for the standard library that can take the HTML,
figure out the encoding, and then return the decoded strings (might
actually already be something for that that I am not aware of).

-Brett


Re: [Python-Dev] Partial function application 'from the right'

2009-02-03 Thread Antoine Pitrou
Collin Winter  gmail.com> writes:
> 
> Have any of the original objections to Calvin's patch
> (http://bugs.python.org/issue1706256) been addressed? If not, I don't
> see anything in these threads that justify resurrecting it.
> 
> I still haven't seen any real code presented that would benefit from
> partial.skip or partial_right.

The arguments for and against the patch could be brought against partial()
itself, so I don't understand the -1's at all.

I know I hardly ever use partial() (apart from the performance aspect, it looks
like a completely useless addition to me), but from a performance standpoint,
partial.skip has as much usefulness as partial() itself.

Regards

Antoine.




Re: [Python-Dev] Partial function application 'from the right'

2009-02-03 Thread Calvin Spealman
http://bugs.python.org/issue1706256

Took me a couple days to catch up on this thread so here is the link
for any interested. Could it be possible to reevaluate this?

On Sat, Jan 31, 2009 at 2:40 PM, Leif Walsh  wrote:
> On Fri, Jan 30, 2009 at 7:38 PM, Calvin Spealman  wrote:
>> I am just replying to the end of this thread to throw in a reminder
>> about my partial.skip patch, which allows the following usage:
>>
>> split_one = partial(str.split, partial.skip, 1)
>>
>> Not looking to say "mine is better", but if the idea is being given
>> merit, I like the skipping arguments method better than just the
>> "right partial", which I think is confusing combined with keyword and
>> optional arguments. And, this patch already exists. Could it be
>> re-evaluated?
>
> +1 but I don't know where the patch is.
>
> --
> Cheers,
> Leif
>
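
For readers who have not seen the patch, the 'hole' idea can be sketched in pure Python roughly as follows.  This is not the code from issue 1706256: SKIP and partial_skip are hypothetical names standing in for partial.skip and the patched partial(), and the split example passes sep explicitly as None, which is a choice made for this sketch.

```python
SKIP = object()  # hypothetical placeholder, standing in for partial.skip


def partial_skip(func, *fixed, **fixed_kw):
    """Like functools.partial, but SKIP marks positions filled at call time."""
    def wrapper(*args, **kwargs):
        remaining = list(args)
        merged = []
        for value in fixed:
            if value is SKIP:
                if not remaining:
                    raise TypeError("missing argument for skipped position")
                value = remaining.pop(0)
            merged.append(value)
        # Leftover call-time arguments go after the fixed ones.
        return func(*merged, *remaining, **{**fixed_kw, **kwargs})
    return wrapper


modcubes = partial_skip(pow, SKIP, 3)               # pow(base, 3, mod)
split_first = partial_skip(str.split, SKIP, None, 1)

print(modcubes(4, 5), split_first("a b c"))
```

Call-time arguments fill the holes left to right, and anything left over is appended, which is the reading Ben North gives of the clickable_blue_button example elsewhere in the thread.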



-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy


[Python-Dev] teaching the new urllib

2009-02-03 Thread Brad Miller
I'm just getting ready to start the semester using my new book (Python
Programming in Context) and noticed that I somehow missed all the changes to
urllib in python 3.0.  ARGH to say the least.  I like using urllib in the
intro class because we can get data from places that are more
interesting/motivating/relevant to the students.
Here are some of my observations on trying to do very basic stuff with
urllib:

1.  urllib.urlopen  is now urllib.request.urlopen
2.  The object returned by urlopen is no longer iterable!  no more for line
in url.
3.  read, readline, readlines now return bytes objects or arrays of bytes
instead of a str and array of str
4.  Taking the naive approach to converting a bytes object to a str does not
work as you would expect.

>>> import urllib.request
>>> page = urllib.request.urlopen('http://knuth.luther.edu/test.html')
>>> page
<...>
>>> line = page.readline()
>>> line
b'<...'
>>> str(line)
'b\'<...'
>>>

As you can see from the example the 'b' becomes part of the string!  It
seems like this should be a bug, is it?


Here's the iteration problem:

>>> for line in page:
        print(line)

Traceback (most recent call last):
  File "<...>", line 1, in <module>
    for line in page:
TypeError: 'addinfourl' object is not iterable

Why is this not iterable anymore?  Is this too a bug?  What the heck is an
addinfourl object?


5.  Finally, I see that a bytes object has some of the same methods as
strings.  But the error messages are confusing.

>>> line
b'   "http://www.w3.org/TR/html4/loose.dtd">\n'
>>> line.find('www')
Traceback (most recent call last):
  File "<...>", line 1, in <module>
    line.find('www')
TypeError: expected an object with the buffer interface
>>> line.find(b'www')
11

Why couldn't find take string as a parameter?

If folks have advice on which, if any, of these are bugs please let me know
and I'll file them, and if possible work on fixes for them too.

If you have advice on how I should better be teaching this new urllib that
would be great to hear as well.


Thanks,

Brad

-- 
Brad Miller
Assistant Professor, Computer Science
Luther College




Re: [Python-Dev] Partial function application 'from the right'

2009-02-03 Thread Collin Winter
On Tue, Feb 3, 2009 at 5:44 AM, Ben North  wrote:
> Hi,
>
> Thanks for the further responses.  Again, I'll try to summarise:
>
> Scott David Daniels pointed out an awkward interaction when chaining
> partial applications, such that it could become very unclear what was
> going to happen when the final function is called:
>
>> If you have:
>> def button(root, position, action=None, text='*', color=None):
>> ...
>> ...
>> blue_button = partial(button, my_root, color=(0,0,1))
>>
>> Should partial_right(blue_button, 'red') change the color or the text?
>
> Calvin Spealman mentioned a previous patch of his which took the 'hole'
> approach, i.e.:
>
>> [...] my partial.skip patch, which allows the following usage:
>>
>>split_one = partial(str.split, partial.skip, 1)
>
> This would solve my original problems, and, continuing Scott's example,
>
>   def on_clicked(...): ...
>
>   _ = partial.skip
>   clickable_blue_button = partial(blue_button, _, on_clicked)
>
> has a clear enough meaning I think:
>
>   clickable_blue_button('top-left corner')
>   = blue_button('top-left corner', on_clicked)
>   = button(my_root, 'top-left corner', on_clicked, color=(0,0,1))
>
> Calvin's idea/patch sounds good to me, then.  Others also liked it.
> Could it be re-considered, instead of the partial_right idea?

Have any of the original objections to Calvin's patch
(http://bugs.python.org/issue1706256) been addressed? If not, I don't
see anything in these threads that justifies resurrecting it.

I still haven't seen any real code presented that would benefit from
partial.skip or partial_right.

Collin Winter


Re: [Python-Dev] [patch] Duplicate sections detection in ConfigParser

2009-02-03 Thread Raghuram Devarakonda
>> The attached patch is compatible with both the 2.x and the 3.x
>> branches; it adds a `unique_sects` parameter to the constructor of
>> RawConfigParser and a test in the parser loop that raises
>> DuplicateSectionError if a section is seen more than once and
>> unique_sects is True.

http://bugs.python.org/issue2204 refers to the same issue. Perhaps you
can upload your patch there and add any comments.

Thanks,
Raghu


Re: [Python-Dev] [patch] Duplicate sections detection in ConfigParser

2009-02-03 Thread Aahz
On Tue, Feb 03, 2009, Yannick Gingras wrote:
>
> The attached patch is compatible with both the 2.x and the 3.x
> branches; it adds a `unique_sects` parameter to the constructor of
> RawConfigParser and a test in the parser loop that raises
> DuplicateSectionError if a section is seen more than once and
> unique_sects is True.

Please go ahead and post the patch to bugs.python.org; it can always be
revised later and this ensures that we have a record.
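
The patch itself is not reproduced in this thread.  For comparison only,
configparser in later Python 3 releases (3.2 and up) has a `strict`
constructor argument with similar behaviour; the sketch below uses that
argument, not the `unique_sects` parameter from the patch, and the sample
configuration text is made up:

```python
import configparser

# Hypothetical input with the same section appearing twice.
DUPLICATED = "[server]\nhost = a\n\n[server]\nport = 80\n"

# strict=True rejects a section that is repeated within one source.
strict_parser = configparser.RawConfigParser(strict=True)
try:
    strict_parser.read_string(DUPLICATED)
    raised = False
except configparser.DuplicateSectionError:
    raised = True

# strict=False merges the repeated sections instead.
lenient_parser = configparser.RawConfigParser(strict=False)
lenient_parser.read_string(DUPLICATED)
merged = dict(lenient_parser.items('server'))

print(raised, merged)
```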
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers wrote 
programs, then the first woodpecker that came along would destroy civilization.


[Python-Dev] Partial function application 'from the right'

2009-02-03 Thread Ben North
Hi,

Thanks for the further responses.  Again, I'll try to summarise:

Scott David Daniels pointed out an awkward interaction when chaining
partial applications, such that it could become very unclear what was
going to happen when the final function is called:

> If you have:
> def button(root, position, action=None, text='*', color=None):
> ...
> ...
> blue_button = partial(button, my_root, color=(0,0,1))
>
> Should partial_right(blue_button, 'red') change the color or the text?

Calvin Spealman mentioned a previous patch of his which took the 'hole'
approach, i.e.:

> [...] my partial.skip patch, which allows the following usage:
>
>split_one = partial(str.split, partial.skip, 1)

This would solve my original problems, and, continuing Scott's example,

   def on_clicked(...): ...

   _ = partial.skip
   clickable_blue_button = partial(blue_button, _, on_clicked)

has a clear enough meaning I think:

   clickable_blue_button('top-left corner')
   = blue_button('top-left corner', on_clicked)
   = button(my_root, 'top-left corner', on_clicked, color=(0,0,1))

Calvin's idea/patch sounds good to me, then.  Others also liked it.
Could it be re-considered, instead of the partial_right idea?
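
To make the intended semantics concrete, here is a rough pure-Python
sketch of the 'hole' approach.  It is not Calvin's patch: SKIP and
partial_with_holes are hypothetical names standing in for partial.skip,
and button() simply returns its arguments so the binding can be inspected:

```python
from functools import partial

SKIP = object()  # hypothetical placeholder, standing in for partial.skip


def partial_with_holes(func, *bound, **bound_kwargs):
    """Bind positional arguments, leaving SKIP slots to fill at call time."""
    def wrapper(*args, **kwargs):
        args = list(args)
        filled = []
        for slot in bound:
            if slot is SKIP:
                # Each hole consumes the next call-time positional argument.
                if not args:
                    raise TypeError('missing argument for a skipped position')
                filled.append(args.pop(0))
            else:
                filled.append(slot)
        merged = dict(bound_kwargs)
        merged.update(kwargs)
        # Any call-time arguments left over are appended after the bound ones.
        return func(*filled, *args, **merged)
    return wrapper


def button(root, position, action=None, text='*', color=None):
    # Returns its arguments so the result of the binding is visible.
    return (root, position, action, text, color)


def on_clicked():
    pass


my_root = 'root'  # made-up value for illustration
blue_button = partial(button, my_root, color=(0, 0, 1))
clickable_blue_button = partial_with_holes(blue_button, SKIP, on_clicked)

result = clickable_blue_button('top-left corner')
print(result)
```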

Ben.


Re: [Python-Dev] Should there be a source-code checksum in module objects?

2009-02-03 Thread rocky
Guido van Rossum writes:
 > I suggest that you move this discussion to python-ideas to ferret out
 > a possible implementation and API; or to find out work-arounds.

Okay. Done. 


Re: [Python-Dev] C API for appending to arrays

2009-02-03 Thread Hrvoje Niksic

Raymond Hettinger wrote:
> [Hrvoje Niksic]
>> The one thing missing from the array
>> module is the ability to directly access array values from C.
>
> Please put a feature request on the bug tracker.

Done, http://bugs.python.org/issue5141