Announcing python-libarchive.

2012-01-21 Thread Ben Timby
Python-libarchive is a wrapper around the excellent libarchive
library. It allows you to work with many different archive formats
using a single API.

http://code.google.com/p/python-libarchive/

Python-libarchive is a SWIG wrapper around the library as well as some
high-level Python classes to make usage easier. There are also
compatibility modules for zipfile and tarfile. These modules use
libarchive to deal with the archive files, but provide an interface as
similar as possible to the stdlib versions. The library is usable, but
by no means complete.

This library was written against libarchive 3.0.X and will not work
with libarchive 2. Some instructions on installing libarchive and
python-libarchive are available in the Wiki.

http://code.google.com/p/python-libarchive/wiki/Building

There is a Google Group for feedback.

http://groups.google.com/group/python-libarchive-users

Thank you!
-- 
http://mail.python.org/mailman/listinfo/python-announce-list

Support the Python Software Foundation:
http://www.python.org/psf/donations/


Tab-completion in tutorial

2012-01-21 Thread Steven D'Aprano
I'm reading the part of the tutorial that talks about tab-completion, and 
I think the docs are wrong.

http://docs.python.org/tutorial/interactive.html#key-bindings

The more capable startup file example given claims:

# Add auto-completion and a stored history file of commands to your Python
# interactive interpreter. Requires Python 2.0+, readline. Autocomplete is
# bound to the Esc key by default (you can change it - see readline docs).

but I have tried it, and it doesn't seem to actually bind autocomplete to 
anything.

Is this a documentation bug, or am I doing something wrong?



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Is a with on open always necessary?

2012-01-21 Thread Lie Ryan

On 01/21/2012 02:44 AM, Andrea Crotti wrote:

I normally didn't bother too much when reading from files, and for example
I always did a

content = open(filename).readlines()

But now I have the doubt that it's not a good idea, does the file
handler stays
open until the interpreter quits?


It is not necessary most of the time, and most likely not necessary 
for short-lived programs. The file handle stays open until the file 
object is garbage collected. In CPython, which uses reference counting, 
the file is closed when the last reference to the file object is 
deleted or goes out of scope; in Python implementations that use other 
garbage collection methods, the timing is nondeterministic.

It is only strictly necessary for programs that open thousands of files 
in a short while, since the operating system may limit the number of 
open file handles you can have.

However, it is considered best practice to close file handles; making 
it a habit will avoid problems when you least expect them.
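
For reference, a minimal sketch of the with form being discussed. The name read_lines is only illustrative and does not come from the thread:

```python
def read_lines(filename):
    # The with block closes the file as soon as the block exits,
    # even if readlines() raises, and regardless of how the
    # interpreter's garbage collector behaves.
    with open(filename) as f:
        return f.readlines()
```

This returns the same list as open(filename).readlines(), but the file is closed at a known point instead of whenever the file object happens to be collected.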




Re: Tab-completion in tutorial

2012-01-21 Thread Peter Otten
Steven D'Aprano wrote:

 I'm reading the part of the tutorial that talks about tab-completion, and
 I think the docs are wrong.
 
 http://docs.python.org/tutorial/interactive.html#key-bindings
 
 The more capable startup file example given claims:
 
 # Add auto-completion and a stored history file of commands to your Python
 # interactive interpreter. Requires Python 2.0+, readline. Autocomplete is
 # bound to the Esc key by default (you can change it - see readline docs).
 
 but I have tried it, and it doesn't seem to actually bind autocomplete to
 anything.
 
 Is this a documentation bug, or am I doing something wrong?

I've just tried it on Kubuntu's konsole. I see strange reactions:
After typing imp I have to hit ESC three times before ort is added, and 
afterwards a character is swallowed. However, I can get the expected 
behaviour after binding the key explicitly with

 import readline
 readline.parse_and_bind("esc: complete")

I'm not sure whether Python is to blame or Ubuntu; it may be an interference 
with konsole's key bindings.
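
A minimal startup-file sketch along the same lines, assuming GNU readline is what the interpreter was built against (libedit-based builds, e.g. on some macOS installs, use a different binding syntax). The function name enable_completion and the default of the Tab key are my own choices for illustration:

```python
try:
    import readline
except ImportError:
    readline = None  # e.g. builds without readline support


def enable_completion(key="tab"):
    """Bind `key` to completion; return False if readline is unavailable."""
    if readline is None:
        return False
    import rlcompleter
    # Use the standard completer for Python names and attributes.
    readline.set_completer(rlcompleter.Completer().complete)
    # Explicit binding, in the same style as the "esc: complete" call.
    readline.parse_and_bind("%s: complete" % key)
    return True
```

Calling enable_completion("esc") would mirror the explicit binding shown in this reply.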



while True or while 1

2012-01-21 Thread Andrea Crotti

I sometimes see while 1 instead of while True in other people's code.
I think using True is more pythonic, but I wanted to check if there is
any difference in practice.

So I tried the following, and the result is surprising.  From what
I can see, it looks like the interpreter can optimize away the boolean
conversion of 1 while it doesn't with True, the opposite of what I
supposed.

Can anyone explain why that is, or is my conclusion wrong?

  def f1():
      while 1:
          pass

  def f2():
      while True:
          pass

  In [10]: dis.dis(f)
    2           0 SETUP_LOOP               3 (to 6)

    3           3 JUMP_ABSOLUTE            3
                6 LOAD_CONST               0 (None)
                9 RETURN_VALUE

  In [9]: dis.dis(f1)
    2           0 SETUP_LOOP              10 (to 13)
                3 LOAD_GLOBAL              0 (True)
                6 POP_JUMP_IF_FALSE       12

    3           9 JUMP_ABSOLUTE            3
               12 POP_BLOCK
               13 LOAD_CONST               0 (None)
               16 RETURN_VALUE



Re: while True or while 1

2012-01-21 Thread Andrea Crotti

Actually, the same question was asked here (sorry, I should have looked first):
http://stackoverflow.com/questions/3815359/while-1-vs-for-whiletrue-why-is-there-a-difference

I think the main reason is that 1 is a constant, while True is not
one and can be reassigned.


Re: while True or while 1

2012-01-21 Thread Chris Angelico
On Sun, Jan 22, 2012 at 12:47 AM, Andrea Crotti
andrea.crott...@gmail.com wrote:
 So I tried to do the following, and the result is surprising.  For what
 I can see it looks like the interpreter can optimize away the 1 boolean
 conversion while it doesn't with the True, the opposite of what I
 supposed.

 Anyone can explain me why is that, or maybe is my conclusion wrong?

In Python 3, they compile to the same code, because 'True' is a
keyword. In Python 2, you can reassign True to be 0.
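
A small sketch for checking this on a Python 3 interpreter. The helper name looks_up_true is made up for this example; it only inspects the compiled bytecode and never runs the loop:

```python
import dis


def looks_up_true(source):
    """Return True if the compiled source does a name lookup for 'True'."""
    code = compile(source, "<example>", "exec")
    return any(
        ins.opname in ("LOAD_NAME", "LOAD_GLOBAL") and ins.argval == "True"
        for ins in dis.get_instructions(code)
    )


# In Python 3, True is a keyword, so neither loop needs a name lookup.
print(looks_up_true("while True: pass"))
print(looks_up_true("while 1: pass"))
```

On Python 2 the first call would be expected to find the lookup, since True is an ordinary name there (and dis.get_instructions does not exist on Python 2, so the check itself needs Python 3.4 or later).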

ChrisA


What's the very simplest way to run some Python from a button on a web page?

2012-01-21 Thread tinnews
I want to run a server side python script when a button on a web page
is clicked.  This is on a LAMP server - apache2 on xubuntu 11.10.

I know I *could* run it as a CGI script but I don't want to change the
web page at all when the button is clicked (I'll see the effect
elsewhere on the screen anyway) so normal CGI isn't ideal.

It's easy to show a button:-

<INPUT TYPE="submit" NAME="Button" ONCLICK="something">

Can I get away with something clever for 'something' that will somehow
hook through to a server side script?

Alternatively, seeing as both client and server are on the same
system, this *could* be done on the client side by breaking out of the
browser sandbox - is there any easy way to do this?


I'm just looking for the crudest, simplest possible way of doing this,
it's only for my own convenience to fire up a utility I want to use
when viewing certain of my local HTML pages.  These pages aren't
visible from the outside world so security isn't a big issue.

-- 
Chris Green


Re: while True or while 1

2012-01-21 Thread Matteo Landi
Probably because it is possible to set True equal to False and
thereby invalidate the loop logic, as shown below:

True = False
while True:
    ...

On the other hand, `1' will always be evaluated as a constant.

Don't know, just guessing.


Matteo


-- 
http://www.matteolandi.net


Re: etree/lxml/XSLT and dynamic stylesheet variables

2012-01-21 Thread Adam Tauno Williams
On Sat, 2012-01-21 at 05:56 +0100, Stefan Behnel wrote:
 Adam Tauno Williams, 20.01.2012 21:38:
  I'm using etree to perform XSLT transforms, such as -
  from lxml import etree
  source = etree.parse(self.rfile)
  xslt = etree.fromstring(self._xslt)
  transform = etree.XSLT(xslt)
  result = transform(source)
  according to the docs at
  http://lxml.de/xpathxslt.html#stylesheet-parameters I can pass a
  dictionary of parameters to transform, such as -
  result = transform(doc_root, **{'non-python-identifier': '5'})
  Can I pass a dictionary-like object?  That doesn't seem to be working.
 Yes it does, Python copies it into a plain dict at call time.

Ah, I wondered if that was happening.  In which case it suppresses all
the magic of my dict subclass.

  I need to perform dynamic lookup of variables for the stylesheet.
 Different story.
  I've subclassed dictionary and overloaded [], get, has_key, and in to
  perform the required lookups; these work in testing. But passing the
  object to transform doesn't work
 You should make the lookup explicit in your XSLT code using an XPath
 function. See here:
 http://lxml.de/extensions.html

Perfect thanks;  this provides everything I need.  

A stupid test case / example for anyone interested:

from lxml import etree


class MyExt:
    def __init__(self, languages):
        self._languages = languages

    def languagelookup(self, _, arg):
        # Called from the stylesheet as ext:languagelookup(); the first
        # argument is the evaluation context, which is not needed here.
        language = self._languages.get(arg)
        if not language:
            return 'undefined'
        return language


extensions = etree.Extension(MyExt(languages={'ES': 'Spanish',
                                              'EL': 'Greek',
                                              'DE': 'German',
                                              'EN': 'English'}),
                             ('languagelookup',),
                             ns='847fe241-df88-45c6-b4a7')

text = '''<documents>
  <document>
  <id>109</id>
  <category>OP</category>
  <title>Revolt Of The Masses</title>
  <author>Jose Ortega y Gasset</author>
  <published>1930</published>
  <language translator="anonymous">ES</language>
  </document>
  <document>
  <id>108</id>
  <category>P</category>
  <title>Meditations</title>
  <author>Marcus Aurelius</author>
  <language translator="Maxwell Staniforth">EL</language>
  <published>1930</published>
  </document>
  <document>
  <id>425</id>
  <category>OP</category>
  <title>The Communist Manifesto</title>
  <author>Karl Marx</author>
  <author>Friedrich Engels</author>
  <language translator="Samuel Moore">DE</language>
  <published>1914</published>
  </document>
  <document>
  <id>507</id>
  <category>POT</category>
  <title>The Cathedral &amp; The Bazaar</title>
  <author>Eric S. Raymond</author>
  <published>199</published>
  </document>
</documents>'''

source = etree.fromstring(text)

style = '''<xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:ext="847fe241-df88-45c6-b4a7">
   <xsl:output method="text"/>
   <xsl:template match="/documents/document">
     <xsl:value-of select="id"/>
     <xsl:text>,</xsl:text>
     <xsl:value-of select="author"/>
     <xsl:text>,</xsl:text>
     <xsl:value-of select="ext:languagelookup(string(language))"/>
     <xsl:text></xsl:text>
   </xsl:template>
</xsl:stylesheet>'''

xslt = etree.XSLT(etree.XML(style), extensions=extensions)
print(xslt(source))

-- 
Adam Tauno Williams http://www.whitemiceconsulting.com
System Administrator, OpenGroupware Developer, LPI / CNA
Fingerprint 8C08 209A FBE3 C41A DD2F A270 2D17 8FA4 D95E D383




Re: What's the very simplest way to run some Python from a button on a web page?

2012-01-21 Thread Ian Kelly
On Sat, Jan 21, 2012 at 7:58 AM, tinn...@isbd.co.uk wrote:

 I want to run a server side python script when a button on a web page
 is clicked.  This is on a LAMP server - apache2 on xubuntu 11.10.

 I know I *could* run it as a CGI script but I don't want to change the
 web page at all when the button is clicked (I'll see the effect
 elsewhere on the screen anyway) so normal CGI isn't ideal.

 It's easy to show a button:-

<INPUT TYPE="submit" NAME="Button" ONCLICK="something">

 Can I get away with something clever for 'something' that will somehow
 hook through to a server side script?


Yes, use AJAX to make an asynchronous request.  There are several AJAX
toolkits out there that will make this simple.


Re: while True or while 1

2012-01-21 Thread Erik Max Francis

Chris Angelico wrote:

On Sun, Jan 22, 2012 at 12:47 AM, Andrea Crotti
andrea.crott...@gmail.com wrote:

So I tried to do the following, and the result is surprising.  For what
I can see it looks like the interpreter can optimize away the 1 boolean
conversion while it doesn't with the True, the opposite of what I
supposed.

Anyone can explain me why is that, or maybe is my conclusion wrong?


In Python 3, they compile to the same code, because 'True' is a
keyword. In Python 2, you can reassign True to be 0.


Why this should concern anyone, I don't know; someone who's rebound 
`True` or `False` to evaluate to something other than true and false, 
respectively, is only doing so to be difficult (or very foolish).  One 
of the principles of Python programming is that We're All Adults Here, 
so this kind of defensive programming is really superfluous.  In other 
words, yes, it's quite reasonable to assume that (even in Python 2) 
`True` is bound to something which is, in fact, true.


The real reason people still use the `while 1` construct, I would 
imagine, is just inertia or habit, rather than a conscious, defensive 
decision.  If it's the latter, it's a case of being _way_ too defensive.


--
Erik Max Francis  m...@alcyone.com  http://www.alcyone.com/max/
 San Jose, CA, USA  37 18 N 121 57 W  AIM/Y!M/Jabber erikmaxfrancis
  Ambition can creep as well as soar.
   -- Edmund Burke


Re: while True or while 1

2012-01-21 Thread Erik Max Francis

Andrea Crotti wrote:

I see sometimes in other people code while 1 instead of while True.
I think using True is more pythonic, but I wanted to check if there is
any difference in practice.


No (with the exception of `True` and `False` being rebindable in Python 
2).  The idiomatic `while 1` notation comes from back in the pre-Boolean 
days.  In any reasonably modern implementation, `while True` is more 
self-documenting.  I would imagine the primary reason people still do 
it, any after-the-fact rationalizations aside, is simply habit.


--
Erik Max Francis  m...@alcyone.com  http://www.alcyone.com/max/
 San Jose, CA, USA  37 18 N 121 57 W  AIM/Y!M/Jabber erikmaxfrancis
  Ambition can creep as well as soar.
   -- Edmund Burke


Re: while True or while 1

2012-01-21 Thread Chris Angelico
On Sun, Jan 22, 2012 at 8:13 AM, Erik Max Francis m...@alcyone.com wrote:
 Why this should concern anyone, I don't know; someone who's rebound `True`
 or `False` to evaluate to something other than true and false, respectively,
 is only doing so to be difficult (or very foolish).  One of the principles
 of Python programming is that We're All Adults Here, so this kind of
 defensive programming is really superfluous.  In other words, yes, it's
 quite reasonable to assume that (even in Python 2) `True` is bound to
 something which is, in fact, true.

Yes, but there's no special code in the compiler to handle True - it's
just a name like any other. It finds a token that looks like a name,
so it puts a name lookup into the bytecode.

 The real reason people still use the `while 1` construct, I would imagine,
 is just inertia or habit, rather than a conscious, defensive decision.  If
 it's the latter, it's a case of being _way_ too defensive.

Ehh, 'while 1' is shorter too. I reckon some people are just lazy :)
Or have come from C where 'while (1)' is the normal thing to do.
According to the Eliza Effect, supporting 'while 1' is a Good Thing.

ChrisA


Re: Masking a dist package with a copy in my own package

2012-01-21 Thread Sam Simmons
I just installed 2.7... should have done this a while ago. pip finally works!

Thanks!


access address from object and vice versa

2012-01-21 Thread Tamer Higazi
Hi people!
I have asked myself the following thing.

How do I access the address of an object and later get the object from
that address ?!

I am very interested.


thank you



Tamer


Re: access address from object and vice versa

2012-01-21 Thread Chris Rebert
On Sat, Jan 21, 2012 at 7:04 PM, Tamer Higazi th9...@googlemail.com wrote:
 Hi people!
 I have asked myself the following thing.

 How do I access the address of an object

id(obj) happens to do that in CPython, but it's a mere implementation detail.

 and later get the object from
 that address ?!

Not possible.

 I am very interested.

Why?

Cheers,
Chris


Re: What's the very simplest way to run some Python from a button on a web page?

2012-01-21 Thread Tim Roberts
tinn...@isbd.co.uk wrote:

I want to run a server side python script when a button on a web page
is clicked.  This is on a LAMP server - apache2 on xubuntu 11.10.

I know I *could* run it as a CGI script but I don't want to change the
web page at all when the button is clicked (I'll see the effect
elsewhere on the screen anyway) so normal CGI isn't ideal.

It seems what you're after is AJAX.  If you are using a Javascript
framework like jQuery, it's easy to fire off an asynchronous request back
to your server that leaves the existing page alone.  If you aren't, then I
think the easiest method is to use an invisible iframe.  From Javascript,
you can set the src property of the iframe to fire off a request while
leaving the rest of the page alone.

You could spend the rest of your career reading all of the good web
material on AJAX.
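
On the server side, the piece that makes this work is a handler that does its job and answers without a page. Here is a rough sketch using only the standard library rather than Apache/CGI. The /run path, the run_utility placeholder and the calls list are assumptions made up for this example:

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

calls = []  # records each trigger; stands in for real side effects


def run_utility():
    # Hypothetical placeholder for whatever the button should start.
    calls.append("ran")


class TriggerHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/run":
            run_utility()
            # 204 No Content: the browser has nothing to display,
            # so the page that sent the request stays as it is.
            self.send_response(204)
        else:
            self.send_response(404)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def serve(port=0):
    """Start the server in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), TriggerHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The button's ONCLICK would then issue an asynchronous GET for /run (via jQuery, XMLHttpRequest, or the iframe trick). A CGI script that prints a "Status: 204 No Content" header should have a similar effect under Apache, though I have not tested that here.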
-- 
Tim Roberts, t...@probo.com
Providenza & Boekelheide, Inc.


Re: What's the very simplest way to run some Python from a button on a web page?

2012-01-21 Thread Chris Angelico
On Sun, Jan 22, 2012 at 3:36 PM, Tim Roberts t...@probo.com wrote:
 It seems what you're after is AJAX.  If you are using a Javascript
 framework like jQuery, it's easy to fire off an asynchronous request back
 to your server that leaves the existing page alone.

If you aren't using a framework, look up the XMLHttpRequest object -
that's what does the work. As Tim says, there's no lack of good
material on the subject.

ChrisA


Re: access address from object and vice versa

2012-01-21 Thread Chris Angelico
On Sun, Jan 22, 2012 at 2:04 PM, Tamer Higazi th9...@googlemail.com wrote:
 Hi people!
 I have asked myself the following thing.

 How do I access the address of an object and later get the object from
 that address ?!

The problem with that sort of idea is that it mucks up garbage
collection. CPython, for example, maintains a reference count for
every object; your address is, in a sense, another reference, but one
that the GC doesn't know about - so it might release the object and
reuse the memory.

What you can do, though, is simply have another name bound to the same
object. You can then manipulate the object through that name, and
it'll function just like a pointer would in C. The original name and
the new name will function exactly the same.
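
A tiny illustration of that point; the names original and alias are arbitrary:

```python
original = [1, 2, 3]
alias = original          # a second name bound to the same object, not a copy

alias.append(4)           # mutate through the new name...
print(original)           # ...and the change is visible through the old one
print(alias is original)  # both names refer to one and the same object
```

Because the second name is a real reference, the object is kept alive for as long as either name is bound to it, which is exactly what a raw address cannot guarantee.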

ChrisA


Re: while True or while 1

2012-01-21 Thread Steven D'Aprano
On Sun, 22 Jan 2012 09:13:23 +1100, Chris Angelico wrote:

 On Sun, Jan 22, 2012 at 8:13 AM, Erik Max Francis m...@alcyone.com
 wrote:
 Why this should concern anyone, I don't know; someone who's rebound
 `True` or `False` to evaluate to something other than true and false,
 respectively, is only doing so to be difficult (or very foolish).  One
 of the principles of Python programming is that We're All Adults Here,
 so this kind of defensive programming is really superfluous.  In other
 words, yes, it's quite reasonable to assume that (even in Python 2)
 `True` is bound to something which is, in fact, true.
 
 Yes, but there's no special code in the compiler to handle True - it's
 just a name like any other. It finds a token that looks like a name, so
 it puts a name lookup into the bytecode.
 
 The real reason people still use the `while 1` construct, I would
 imagine, is just inertia or habit, rather than a conscious, defensive
 decision.  If it's the latter, it's a case of being _way_ too
 defensive.
 
 Ehh, 'while 1' is shorter too. I reckon some people are just lazy :) 


Or they've been writing Python code since before version 2.2 when True 
and False were introduced, and so they are used to the while 1 idiom 
and never lost the habit.

In Python 2, while 1 is a micro-optimization over while True, because 
there is no need to look-up the name True. For extremely tight loops, 
that may make a difference.

In Python 3, there is no longer any real difference:


py> dis(compile('while 1: pass', '', 'exec'))
  1           0 SETUP_LOOP               3 (to 6)
              3 JUMP_ABSOLUTE            3
              6 LOAD_CONST               0 (None)
              9 RETURN_VALUE
py> dis(compile('while True: pass', '', 'exec'))
  1           0 SETUP_LOOP               3 (to 6)
              3 JUMP_ABSOLUTE            3
              6 LOAD_CONST               0 (None)
              9 RETURN_VALUE


Or perhaps they just like the look of while 1.



-- 
Steven


Re: access address from object and vice versa

2012-01-21 Thread Steven D'Aprano
On Sun, 22 Jan 2012 04:04:08 +0100, Tamer Higazi wrote:

 Hi people!
 I have asked myself the following thing.
 
 How do I access the address of an object and later get the object from
 that address ?!

Use another language.

By design, Python does not provide pointers. This is a good thing, 
because it makes a whole class of bugs and security vulnerabilities 
impossible in Python.


-- 
Steven


bufsize in subprocess

2012-01-21 Thread yves

Is this the expected behaviour?
When I run this script, it reads only once, but I expected once per line with 
bufsize=1.


What I am trying to do is display the output of a slow process in a tkinter 
window as it runs. Right now, the process runs to completion, then displays the 
result.


import subprocess

com = ['/bin/ls', '-l', '/usr/bin']
with subprocess.Popen(com, bufsize=1, stdout=subprocess.PIPE,
                      stderr=subprocess.STDOUT) as proc:
    print('out: ' + str(proc.stdout.read(), 'utf8'))


Thanks.

--
Yves.  http://www.SollerS.ca/
   http://ipv6.SollerS.ca
   http://blog.zioup.org/


Re: access address from object and vice versa

2012-01-21 Thread Steven D'Aprano
On Sat, 21 Jan 2012 19:36:32 -0800, Chris Rebert wrote:

 On Sat, Jan 21, 2012 at 7:04 PM, Tamer Higazi th9...@googlemail.com
 wrote:
 Hi people!
 I have asked myself the following thing.

 How do I access the address of an object
 
 id(obj) happens to do that in CPython, but it's a mere implementation
 detail.

I really wish that CPython didn't expose the fact that id happens to use 
the address. That simply gives people the wrong idea.

Jython has the right approach, in my opinion. Objects are given IDs on 
request, starting with 1, and no id is ever re-used:

steve@runes:~$ jython
Jython 2.5.1+ (Release_2_5_1, Aug 4 2010, 07:18:19)
[OpenJDK Client VM (Sun Microsystems Inc.)] on java1.6.0_18
Type "help", "copyright", "credits" or "license" for more information.
>>> id(None)
1
>>> id(True)
2

On the other hand, presumably this means that Jython objects need an 
extra field to store the ID, so the CPython approach is a space 
optimization.
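
To make the trade-off concrete, here is a toy sketch of handing out sequential IDs on request. It is not how Jython actually implements id(); the class name SequentialIds and its table are invented for illustration. The point is that the scheme needs extra storage per object that has been asked for its ID:

```python
import itertools


class SequentialIds:
    """Hand out 1, 2, 3, ... on first request; never reuse a number."""

    def __init__(self):
        self._counter = itertools.count(1)
        # Maps id(obj) -> (obj, number). Storing obj keeps it alive, so
        # the id(obj) key cannot be recycled for a different object.
        self._table = {}

    def get(self, obj):
        entry = self._table.get(id(obj))
        if entry is None:
            entry = (obj, next(self._counter))
            self._table[id(obj)] = entry
        return entry[1]


ids = SequentialIds()
print(ids.get(None), ids.get(True), ids.get(None))
```

This toy version keeps every registered object alive forever; a real implementation would need something smarter, which is part of the cost being weighed here.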



-- 
Steven


Re: bufsize in subprocess

2012-01-21 Thread Chris Rebert
On Sat, Jan 21, 2012 at 9:45 PM,  y...@zioup.com wrote:
 Is this the expected behavior?

Yes. `.read()` [with no argument] on a file-like object reads until
EOF. See http://docs.python.org/library/stdtypes.html#file.read

 When I run this script, it reads only once, but I expected once per line
 with bufsize=1.

You want proc.stdout.readline().
http://docs.python.org/library/stdtypes.html#file.readline
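
A sketch of the line-at-a-time approach, assuming Python 3. The generator name iter_output_lines is made up for this example, and it opens the pipe in text mode because line buffering (bufsize=1) only applies there:

```python
import subprocess
import sys


def iter_output_lines(cmd):
    """Yield the child's combined stdout/stderr one line at a time."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT,
                          bufsize=1, universal_newlines=True) as proc:
        # Iterating over the pipe returns each line as soon as it is
        # available, instead of waiting for EOF the way read() does.
        for line in proc.stdout:
            yield line.rstrip("\n")
```

Usage would be along the lines of `for line in iter_output_lines(com): print('out: ' + line)`. Note that bufsize only controls the parent's end of the pipe; a child that buffers its own output when not writing to a terminal can still deliver it in large chunks.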

 What I am trying to do is display the output of a slow process in a tkinter
 window as it runs. Right now, the process runs to completion, then display
 the result.

    import subprocess

    com = ['/bin/ls', '-l', '/usr/bin']
    with subprocess.Popen(com, bufsize=1, stdout=subprocess.PIPE,
 stderr=subprocess.STDOUT) as proc:
        print('out: ' + str(proc.stdout.read(), 'utf8'))

Cheers,
Chris


[issue6631] Disallow relative files paths in urllib*.open()

2012-01-21 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc amaur...@gmail.com added the comment:

Sorry, why was this change backported?
Does this fix a specific issue in 2.7 or 3.2?
On the contrary, it seems to me that code which (incorrectly) used 
urllib.urlopen() to allow both urls and local files will suddenly break.

--
nosy: +amaury.forgeotdarc
stage: committed/rejected -> commit review
status: closed -> pending

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6631
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue6631] Disallow relative files paths in urllib*.open()

2012-01-21 Thread Senthil Kumaran

Senthil Kumaran sent...@uthcode.com added the comment:

Actually, I saw this as a bug in urllib.urlopen; urllib2 had
exhibited the proper behaviour previously. Now the behaviour of both
will be consistent.

But you are right that an *incorrect* usage of urllib.urlopen would
break in 2.7.2. 

If we need to be lenient on that incorrect usage, then this change can
stay in the 3.x series, because urllib.request.urlopen is the
interface that users will be using there, and it can be reverted in 2.7.

Personally, I am +/- 0 on reverting this in 2.7. Initially, I saw this
as a bug, but later, when I added tests for ValueError and checked them in,
I realized that it can break some incorrect usages, as you say.
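
For code that really does want to accept both URLs and local paths, an explicit dispatch avoids relying on the old lenient behaviour. This is only a sketch in Python 3 spelling; the function name open_location and the list of accepted schemes are assumptions, not part of any patch on this issue:

```python
from urllib.parse import urlparse
from urllib.request import urlopen


def open_location(target):
    """Open `target` as a URL if it has a known scheme, else as a file."""
    scheme = urlparse(target).scheme
    if scheme in ("http", "https", "ftp", "file"):
        return urlopen(target)
    # No recognised scheme: treat it as a local (possibly relative) path.
    return open(target, "rb")
```

Checking against a fixed set of schemes, rather than "any scheme at all", also keeps Windows drive letters such as C: from being mistaken for URL schemes.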

--
status: pending -> open

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6631
___



[issue13790] In str.format an incorrect error message for list, tuple, dict, set

2012-01-21 Thread Eric V. Smith

Eric V. Smith e...@trueblade.com added the comment:

While looking at object.__format__, I recall that we've already addressed this, 
sort of. For a different reason, this is already deprecated in 3.3 and will 
become an error in 3.4. See issues 9856 and 7994.

$ ./python -Wd
Python 3.3.0a0 (default:40e1be1e0707, Jan 15 2012, 00:58:51) 
[GCC 4.6.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> format([], 'd')
__main__:1: DeprecationWarning: object.__format__ with a non-empty format 
string is deprecated
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Unknown format code 'd' for object of type 'str'
[67288 refs]
>>> 

We could still have object.__format__ catch and re-throw the ValueError with a 
better message. I'd have to think it through if we could catch all ValueErrors, 
or if it's possible for another ValueError to be thrown and we'd only catch and 
rethrow this specific ValueError.

But since this is deprecated, I'm not sure it's worth the hassle. I'd advocate 
closing this issue as won't fix.
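
For anyone wanting to see which exception a given interpreter raises, a small probe like the following may help. The helper name format_error_name is made up; on Python 3.4 and later I would expect the list case to report TypeError rather than the misleading ValueError shown above:

```python
def format_error_name(value, spec):
    """Return the name of the exception format() raises, or None."""
    try:
        format(value, spec)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None


print(format_error_name([], 'd'))   # a container with a non-empty spec
print(format_error_name('x', 'd'))  # a str with a code it does not support
print(format_error_name(3, 'd'))    # an int, for which 'd' is valid
```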

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13790
___



[issue13609] Add os.get_terminal_size() function

2012-01-21 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

 Does this need more discussion, code review, testing,
 or just more time?

As I already wrote, I would prefer a very simple os.get_terminal_size() 
function: don't read environment variables, use a simple tuple instead of a 
new type, and raise an error if the size cannot be read (so no need for default 
values). The os module is written as a thin wrapper between Python and the OS. 
A higher-level function (read environment variables, handle the error, use a 
namedtuple) can be written in your application, or maybe in another module.
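
A rough sketch of what such a higher-level helper could look like on top of the thin os call. os.get_terminal_size() is available from Python 3.3 on; the helper name terminal_size, the TerminalSize tuple and the 80x24 fallback are assumptions for illustration only:

```python
import os
from collections import namedtuple

TerminalSize = namedtuple("TerminalSize", "columns lines")


def terminal_size(fallback=(80, 24)):
    """Environment variables first, then the OS, then a default."""
    try:
        env = TerminalSize(int(os.environ["COLUMNS"]),
                           int(os.environ["LINES"]))
        if env.columns > 0 and env.lines > 0:
            return env
    except (KeyError, ValueError):
        pass  # variables missing or not integers
    try:
        size = os.get_terminal_size()  # the thin wrapper; may raise
    except (OSError, ValueError):
        return TerminalSize(*fallback)
    if size.columns > 0 and size.lines > 0:
        return TerminalSize(size.columns, size.lines)
    return TerminalSize(*fallback)
```

All the policy (environment lookup, error handling, defaults) lives in the helper, while the os function stays a plain "ask the OS or fail" call.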

This is just my opinion, other core developers may prefer your version :-)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13609
___



[issue13790] In str.format an incorrect error message for list, tuple, dict, set

2012-01-21 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

So the error is going to be something about the source type not supporting 
'__format__'?

That change will also address the OP's concern about truncated reprs when a 
fixed string length is specified, so I agree that the title issue can be 
closed.  Terry's patch with the ({}) removed should be committed, though.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13790
___



[issue13703] Hash collision security issue

2012-01-21 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 Thoughts? (apart from ugh! it's ugly! yes I know - it's late here)

Is it guaranteed that no usage pattern can render this protection
inefficient? What if a dict is constructed by intermingling lookups and
inserts?
Similarly, what happens with e.g. the common use case of
defaultdict(list), where you append() after the lookup/insert? Does some
key distribution allow the attack while circumventing the protection?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13703
___



[issue13706] non-ascii fill characters no longer work in formatting

2012-01-21 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 231c6042c40c by Victor Stinner in branch 'default':
Issue #13706: Support non-ASCII fill characters
http://hg.python.org/cpython/rev/231c6042c40c

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13706
___



[issue13706] non-ascii fill characters no longer work in formatting

2012-01-21 Thread STINNER Victor

STINNER Victor victor.stin...@haypocalc.com added the comment:

I fixed the original report, but there is still an issue with non-ASCII 
thousands separator.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13706
___



[issue13703] Hash collision security issue

2012-01-21 Thread Zbyszek Szmek

Zbyszek Szmek zbys...@in.waw.pl added the comment:

The hashing with random seed is only marginally slower or more 
complicated than the current version.

The patch is big because it moves random number generator initialization 
code around. There's no per-object tax, and the cost of the random 
number generator initialization is only significant on Windows. 
Basically, there's no tax.

Zbyszek

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13703
___



[issue13703] Hash collision security issue

2012-01-21 Thread Dave Malcolm

Dave Malcolm dmalc...@redhat.com added the comment:

On Sat, 2012-01-21 at 14:27 +, Antoine Pitrou wrote:
 Antoine Pitrou pit...@free.fr added the comment:
 
  Thoughts? (apart from ugh! it's ugly! yes I know - it's late here)
 
 Is it guaranteed that no usage pattern can render this protection
 inefficient? What if a dict is constructed by intermingling lookups and
 inserts?
 Similarly, what happens with e.g. the common use case of
 defaultdict(list), where you append() after the lookup/insert? Does some
 key distribution allow the attack while circumventing the protection?

Yes, I agree that I was making an unrealistic assumption about usage
patterns.  There was also some global state (the is_inserting
variable).

I've tweaked the approach somewhat, moved the global to be per-dict, and
am attaching a revised version of the patch:
   amortized-probe-counting-dmalcolm-2012-01-21-003.patch

In this patch, rather than reset the count each time, I keep track of
the total number of calls to insertdict() that have happened for each
large dict (i.e. for which ma_table != ma_smalltable), and the total
number of probe iterations that have been needed to service those
insertions/overwrites.  It raises the exception when the *number of
probe iterations per insertion* exceeds a threshold factor (rather than
merely comparing the number of iterations against the current ma_used of
the dict).  I believe this means that it's tracking and checking every
time the dict is modified, and (I hope) protects us against any data
that drives the dict implementation away from linear behavior (because
that's essentially what it's testing for).  [the per-dict stats are
reset each time that it shrinks down to using ma_smalltable again, but I
think at-risk usage patterns in which that occurs are uncommon relative
to those in which it doesn't].
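To make the idea concrete, here is a rough Python sketch of amortized probe counting. It is not the attached patch: the table is a toy fixed-size one with linear probing (CPython's dict uses a different probe sequence and resizes), the class names other than AlgorithmicComplexityError are made up, and the threshold of 32 is an arbitrary assumption.

```python
class AlgorithmicComplexityError(Exception):
    """Exception name taken from the message above; the rest is illustrative."""


class ProbeCountingTable:
    """Toy fixed-size open-addressing table that counts probes per insertion."""

    def __init__(self, size=128, threshold=32):
        self.slots = [None] * size   # each slot is None or a (key, value) pair
        self.used = 0                # number of occupied slots
        self.insertions = 0          # insert/overwrite calls seen so far
        self.probes = 0              # probe iterations needed to service them
        self.threshold = threshold   # hypothetical limit on probes per insertion

    def insert(self, key, value):
        size = len(self.slots)
        index = hash(key) % size
        self.insertions += 1
        for _ in range(size):
            self.probes += 1
            slot = self.slots[index]
            if slot is None or slot[0] == key:
                if slot is None:
                    self.used += 1
                self.slots[index] = (key, value)
                break
            index = (index + 1) % size   # linear probing, simpler than CPython's scheme
        else:
            raise RuntimeError("table is full")
        # Check the running ratio of probes to insertions, not a single insertion.
        if self.probes > self.threshold * self.insertions:
            raise AlgorithmicComplexityError(
                "used %d probes whilst performing %d insertions (len() == %d)"
                % (self.probes, self.insertions, self.used))


class Colliding:
    """Key whose hash is always 42, standing in for attacker-chosen data."""

    def __hash__(self):
        return 42
```

Well-distributed keys cost about one probe each here and never trip the check, while a stream of Colliding() keys pushes the ratio up steadily until the exception fires.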

When attacked, this leads to exceptions like this:
AlgorithmicComplexityError: dict construction used 1697 probes whilst
performing 53 insertions (len() == 58) at key 58 with hash 42

i.e. we have a dictionary containing 58 keys, which has seen 53
insert/overwrite operations since transitioning to the non-ma_smalltable
representation (at size 6); presumably it has 128 PyDictEntries.
Servicing those 53 operations has required a total of 1697 iterations
through the probing loop, or a little over 32 probes per insert.

I just did a full run of the test suite (using run_tests.py), and it
mostly passed the new tests I've added (including the test for scenario 2
from Frank's email).

There were two failures:
==
FAIL: test_inheritance (test.test_pep352.ExceptionClassTests)
--
AssertionError: 1 != 0 : {'AlgorithmicComplexityError'} not accounted
for
--
which is obviously fixable (given a decision on where the exception
lives in the hierarchy)

and this one:
test test_mutants crashed -- Traceback (most recent call last):
  File "/home/david/coding/python-hg/cpython-count-collisions/Lib/test/regrtest.py", line 1214, in runtest_inner
    the_package = __import__(abstest, globals(), locals(), [])
  File "/home/david/coding/python-hg/cpython-count-collisions/Lib/test/test_mutants.py", line 159, in <module>
    test(100)
  File "/home/david/coding/python-hg/cpython-count-collisions/Lib/test/test_mutants.py", line 156, in test
    test_one(random.randrange(1, 100))
  File "/home/david/coding/python-hg/cpython-count-collisions/Lib/test/test_mutants.py", line 132, in test_one
    dict2keys = fill_dict(dict2, range(n), n)
  File "/home/david/coding/python-hg/cpython-count-collisions/Lib/test/test_mutants.py", line 118, in fill_dict
    Horrid(random.choice(candidates))
AlgorithmicComplexityError: dict construction used 2753 probes whilst
performing 86 insertions (len() == 64) at key Horrid(86) with hash 42
though that seems to be deliberately degenerate code.

Caveats:
* no overflow handling (what happens after 2**32 modifications to a
long-lived dict on a 32-bit build?) - though that's fixable.
* no idea what the scaling factor for the threshold should be (there may
also be a deep mathematical objection here, based on how big-O notation
is defined in terms of an arbitrary scaling factor and limit)
* not optimized; I haven't looked at performance yet
* doesn't cover set(), though that also has spare space (I hope) via its
own smalltable array.

BTW, note that although I've been working on this variant of the
collision counting approach, I'm not opposed to the hash randomization
approach, or to adding extra checks in strategic places within the
stdlib: I'm keen to get some kind of appropriate fix approved by the
upstream Python development community so I can backport it to the
various recent-to-ancient versions of CPython I support in RHEL (and
Fedora), before we start seeing real-world attacks.

Hope this is helpful

[issue13829] exception error

2012-01-21 Thread Brett Cannon

Brett Cannon br...@python.org added the comment:

Then I'm going to assume the bug lies with Moviegrabber doing something wrong 
and it isn't Python's direct fault.

--
resolution:  -> invalid
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13829
___



[issue13703] Hash collision security issue

2012-01-21 Thread Dave Malcolm

Dave Malcolm dmalc...@redhat.com added the comment:

(or combination of fixes, of course)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13703
___



[issue13790] In str.format an incorrect error message for list, tuple, dict, set

2012-01-21 Thread Eric V. Smith

Eric V. Smith e...@trueblade.com added the comment:

The error message will be: non-empty format string passed to 
object.__format__.
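A quick way to see the behaviour under discussion, as a sketch only: the helper name format_error is made up, and the exact wording of the message differs between Python versions, so the code reports whatever text the running interpreter produces rather than assuming one. It assumes an interpreter where a non-empty format spec on such objects is an error (Python 3.4 or later).

```python
def format_error(obj, spec):
    """Return the TypeError message raised by format(obj, spec), or None."""
    try:
        format(obj, spec)
    except TypeError as exc:
        return str(exc)
    return None


# An empty spec falls back to str(); a non-empty one is rejected for list,
# tuple, dict and set because they inherit object.__format__.
empty_spec = format_error([1, 2], "")
width_spec = format_error([1, 2], "10")
```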

I agree with your comment about Terry's patch.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13790
___



[issue13703] Hash collision security issue

2012-01-21 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 In this patch, rather than reset the count each time, I keep track of
 the total number of calls to insertdict() that have happened for each
 large dict (i.e. for which ma_table != ma_smalltable), and the total
 number of probe iterations that have been needed to service those
 insertions/overwrites.  It raises the exception when the *number of
 probe iterations per insertion* exceeds a threshold factor (rather than
 merely comparing the number of iterations against the current ma_used of
 the dict).

This sounds much more robust than the previous attempt.

 When attacked, this leads to exceptions like this:
 AlgorithmicComplexityError: dict construction used 1697 probes whilst
 performing 53 insertions (len() == 58) at key 58 with hash 42

We'll have to discuss the name of the exception and the error message :)

 Caveats:
 * no overflow handling (what happens after 2**32 modifications to a
 long-lived dict on a 32-bit build?) - though that's fixable.

How do you suggest to fix it?

 * no idea what the scaling factor for the threshold should be (there may
 also be a deep mathematical objection here, based on how big-O notation
 is defined in terms of an arbitrary scaling factor and limit)

I'd make the threshold factor a constant, e.g. 64 or 128 (it should not
be too small, to avoid false positives).
We're interested in the actual slowdown factor, which a constant factor
models adequately. It's the slowdown factor which makes a DOS attack
using this technique efficient. Whether or not dict construction truly
degenerates into an O(n**2) operation is less relevant.

There needs to be a way to disable it: an environment variable would be
the minimum IMO.
Also, in 3.3 there should probably be a sys function to enable or
disable it at runtime. Not sure it should be backported since it's a new
API.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13703
___



[issue13609] Add os.get_terminal_size() function

2012-01-21 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

  Does this need more discussion, code review, testing,
  or just more time?
 
 As I already wrote, I would prefer a very simple
 os.get_terminal_size() function: don't read environment variables,
 use a simple tuple instead of a new type, and raise an error if the
 size cannot be read (so no need of default values). The os module is
 written as a thin wrapper between Python and the OS. A more high level
 function (read environment variables, handle the error, use a
 namedtuple) can be written in your application, or maybe in another
 module.

I think we have reached the point where we won't be in total agreement
over the API, so let's choose whatever is submitted as a patch.

I only have two remaining issues with the patch:
- the tests needn't be in a separate file, they can go in test_os
- there should be a test for get_terminal_size_raw as well (and of
course it should be skipped if the function doesn't exist)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13609
___



[issue12922] StringIO and seek()

2012-01-21 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 03e61104f7a2 by Antoine Pitrou in branch '3.2':
Issue #12922: fix the TextIOBase documentation to include a description of 
seek() and tell() methods.
http://hg.python.org/cpython/rev/03e61104f7a2

New changeset f7e5abfb31ea by Antoine Pitrou in branch 'default':
Issue #12922: fix the TextIOBase documentation to include a description of 
seek() and tell() methods.
http://hg.python.org/cpython/rev/f7e5abfb31ea

New changeset fcf4d547bed8 by Antoine Pitrou in branch '2.7':
Issue #12922: fix the TextIOBase documentation to include a description of 
seek() and tell() methods.
http://hg.python.org/cpython/rev/fcf4d547bed8

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12922
___



[issue12922] StringIO and seek()

2012-01-21 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
resolution:  -> fixed
stage: needs patch -> committed/rejected
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue12922
___



[issue13609] Add os.get_terminal_size() function

2012-01-21 Thread Giampaolo Rodola'

Giampaolo Rodola' g.rod...@gmail.com added the comment:

 read environment variables [...] and raise an error if the size cannot be
 read (so no need of default values). The os module is written as a thin
 wrapper between Python and the OS. A more high level function (read 
 environment variables, handle the error, use a namedtuple) can be written in 
 your application, or maybe in another module.

+1. I also find it weird that a function, especially one living in the os module, 
has such a high level of abstraction (basically this is why I was originally 
proposing shutil module for this to go in).

Given the different opinions about the API, I think it's best to expose the 
lowest level functionality as-is, and let the user decide what to do (read env 
vars first, suppress the exception, use a fallback, etc.). 
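To illustrate what "let the user decide" could look like in application code, here is a sketch of a high-level helper layered on a thin OS query. It assumes the low-level call ends up as os.get_terminal_size() raising OSError on failure (which is how it later shipped in Python 3.3); the helper name terminal_size, the COLUMNS/LINES variable names and the 80x24 fallback are my own choices for the example.

```python
import os


def terminal_size(fallback=(80, 24)):
    """Return (columns, lines): environment first, then the OS, then a fallback."""
    # 1. Honour explicit overrides from the environment, if both are valid.
    try:
        columns = int(os.environ.get("COLUMNS", ""))
        lines = int(os.environ.get("LINES", ""))
        if columns > 0 and lines > 0:
            return (columns, lines)
    except ValueError:
        pass  # unset or not a number: fall through to the OS query

    # 2. Ask the OS; this fails when stdout is not attached to a terminal.
    try:
        size = os.get_terminal_size()
        return (size.columns, size.lines)
    except OSError:
        # 3. Suppress the error and use the caller's default.
        return fallback
```

Each of the three steps is a policy decision (which variables to trust, whether to swallow the error, what default to use), which is the argument for keeping them out of the thin wrapper.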


 I think we have reached the point where we won't be in total 
 agreement over the API, so let's choose whatever is submitted 
 as a patch.

I'd be more careful. Once this gets in it will be too late for a change.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13609
___



[issue13816] Two typos in the docs

2012-01-21 Thread Stefan Krah

Stefan Krah stefan-use...@bytereef.org added the comment:

 ... with *n*th (italic n) as alternate form

Knuth uses that in TAOCP, too. I think with or without italics it's
the most frequently used form overall.

Also the Lisp function is called nth and not n-th, even though in Lisp
it is possible to use hyphens in function names.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13816
___



[issue13703] Hash collision security issue

2012-01-21 Thread Dave Malcolm

Dave Malcolm dmalc...@redhat.com added the comment:

Well, the old attempt was hardly robust :)

Can anyone see any vulnerabilities in this approach?

Yeah; I was mostly trying to add raw data (to help me debug the
implementation).

I wonder if the dict statistics should be exposed with extra attributes
or a method on the dict; e.g. a __stats__ attribute, something like
this:

LargeDictStats(keys=58, mask=127, insertions=53, iterations=1697)

SmallDictStats(keys=3, mask=7)

or somesuch. Though that's a detail, I think.

  Caveats:
  * no overflow handling (what happens after 2**32 modifications to a
  long-lived dict on a 32-bit build?) - though that's fixable.
 
 How do you suggest to fix it?

If the dict is heading towards overflow of these counters, it's either
long-lived, or *huge*.

Possible approaches:
(a) use 64-bit counters rather than 32-bit, though that's simply
delaying the inevitable
(b) when one of the counters gets large, divide both of them by a
constant (e.g. 2).  We're interested in their ratio, so dividing both by
a constant preserves this.
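Approach (b) could look roughly like the following sketch. The function name, the dict-based stats record and the rescale limit are all hypothetical; the real counters would be C integers on the dict object. Note that integer halving only preserves the ratio approximately, which should not matter once the counters are large.

```python
COUNTER_LIMIT = 2 ** 31  # hypothetical point at which both counters are rescaled


def record_insertion(stats, probes_used, limit=COUNTER_LIMIT):
    """Update the per-dict counters, halving both when either gets large."""
    stats["insertions"] += 1
    stats["probes"] += probes_used
    if stats["insertions"] >= limit or stats["probes"] >= limit:
        # Dividing both counters by the same constant keeps their ratio
        # (probes per insertion) roughly unchanged while avoiding overflow.
        stats["insertions"] //= 2
        stats["probes"] //= 2
    return stats
```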

By a constant do you mean from the perspective of big-O notation, or
do you mean that it should be hardcoded (I was wondering if it should be
a sys variable/environment variable etc?).

 We're interested in the actual slowdown factor, which a constant factor
 models adequately. It's the slowdown factor which makes a DOS attack
 using this technique efficient. Whether or not dict construction truly
 degenerates into an O(n**2) operation is less relevant.

OK.

 There needs to be a way to disable it: an environment variable would be
 the minimum IMO.

e.g. set it to 0 to enable it, set it to nonzero to set the scale
factor.
Any idea what to call it? 

PYTHONALGORITHMICCOMPLEXITYTHRESHOLD=0 would be quite a mouthful.

OK

BTW, presumably if we do it, we should do it for sets as well?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13703
___



[issue13703] Hash collision security issue

2012-01-21 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 I wonder if the dict statistics should be exposed with extra attributes
 or a method on the dict; e.g. a __stats__ attribute, something like
 this:
 
 LargeDictStats(keys=58, mask=127, insertions=53, iterations=1697)
 
 SmallDictStats(keys=3, mask=7)

Sounds a bit overkill, and it shouldn't be a public API (which
__methods__ are). Even a private API on dicts would quickly become
visible, since dicts are so pervasive.

   Caveats:
   * no overflow handling (what happens after 2**32 modifications to a
   long-lived dict on a 32-bit build?) - though that's fixable.
  
  How do you suggest to fix it?
 
 If the dict is heading towards overflow of these counters, it's either
 long-lived, or *huge*.
 
 Possible approaches:
 (a) use 64-bit counters rather than 32-bit, though that's simply
 delaying the inevitable

Well, even assuming one billion lookup probes per second on a single
dictionary, the inevitable will happen in 584 years with a 64-bit
counter (but only 4 seconds with a 32-bit counter).
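The arithmetic behind those two figures can be checked with a few lines. The rate of one billion probes per second is the assumption stated above; the constant names and the 365.25-day year are mine.

```python
PROBES_PER_SECOND = 10 ** 9             # assumed probe rate from the estimate above
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # average year length, an approximation


def seconds_to_overflow(bits, rate=PROBES_PER_SECOND):
    """Seconds until an unsigned counter of the given width wraps around."""
    return 2 ** bits / rate


seconds_32 = seconds_to_overflow(32)                    # a bit over 4 seconds
years_64 = seconds_to_overflow(64) / SECONDS_PER_YEAR   # a bit over 584 years
```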

A real issue, though, may be the cost of 64-bit arithmetic on 32-bit
CPUs.

 (b) when one of the counters gets large, divide both of them by a
 constant (e.g. 2).  We're interested in their ratio, so dividing both by
 a constant preserves this.

Sounds good, although we may want to pull this outside of the critical
loop.

 By a constant do you mean from the perspective of big-O notation, or
 do you mean that it should be hardcoded (I was wondering if it should be
 a sys variable/environment variable etc?).

Hardcoded, as in your patch.

  There needs to be a way to disable it: an environment variable would be
  the minimum IMO.
 
 e.g. set it to 0 to enable it, set it to nonzero to set the scale
 factor.

0 to enable it sounds misleading. I'd say:
- 0 to disable it
- 1 to enable it and use the default scaling factor
- >= 2 to enable it and set the scaling factor
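Spelled out as code, that proposal (0 disables, 1 uses the default, 2 or more sets the factor) might be parsed like this. It is only a sketch: the function name and the default factor of 64 are assumptions, and how an unset variable should be treated is left to the caller since that has not been settled here.

```python
DEFAULT_SCALE_FACTOR = 64  # hypothetical default; 64 or 128 were floated earlier


def parse_protection(value, default_factor=DEFAULT_SCALE_FACTOR):
    """Map the variable's string value to a scaling factor, or None if disabled."""
    setting = int(value)
    if setting < 0:
        raise ValueError("protection setting must be >= 0")
    if setting == 0:
        return None              # protection disabled
    if setting == 1:
        return default_factor    # enabled with the default scaling factor
    return setting               # enabled with an explicit scaling factor
```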

 Any idea what to call it? 

PYTHONDICTPROTECTION ?
Most people should either enable or disable it, not change the scaling
factor.

 BTW, presumably if we do it, we should do it for sets as well?

Yeah, and use the same env var / sys function.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13703
___



[issue8052] subprocess close_fds behavior should only close open fds

2012-01-21 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 61aa484a3e54 by Gregory P. Smith in branch '3.2':
Fixes issue #8052: The posix subprocess module's close_fds behavior was
http://hg.python.org/cpython/rev/61aa484a3e54

New changeset 8879874d66a2 by Gregory P. Smith in branch 'default':
Fixes issue #8052: The posix subprocess module's close_fds behavior was
http://hg.python.org/cpython/rev/8879874d66a2

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8052
___



[issue13703] Hash collision security issue

2012-01-21 Thread Dave Malcolm

Dave Malcolm dmalc...@redhat.com added the comment:

On Sat, 2012-01-21 at 22:20 +, Antoine Pitrou wrote:

 Sounds a bit overkill, and it shouldn't be a public API (which
 __methods__ are). Even a private API on dicts would quickly become
 visible, since dicts are so pervasive.

Fair enough.

Caveats:
* no overflow handling (what happens after 2**32 modifications to a
long-lived dict on a 32-bit build?) - though that's fixable.
   
   How do you suggest to fix it?
  
  If the dict is heading towards overflow of these counters, it's either
  long-lived, or *huge*.
  
  Possible approaches:
  (a) use 64-bit counters rather than 32-bit, though that's simply
  delaying the inevitable
 
 Well, even assuming one billion lookup probes per second on a single
 dictionary, the inevitable will happen in 584 years with a 64-bit
 counter (but only 4 seconds with a 32-bit counter).
 
 A real issue, though, may be the cost of 64-bit arithmetic on 32-bit
 CPUs.
 
  (b) when one of the counters gets large, divide both of them by a
  constant (e.g. 2).  We're interested in their ratio, so dividing both by
  a constant preserves this.
 
 Sounds good, although we may want to pull this outside of the critical
 loop.

OK; I'll look at implementing (b).

Oops, yeah, that was a typo; I meant 0 to disable.

 - 0 to disable it
 - 1 to enable it and use the default scaling factor
 - >= 2 to enable it and set the scaling factor

You said above that it should be hardcoded; if so, how can it be changed
at run-time from an environment variable?  Or am I misunderstanding.

Works for me.

  BTW, presumably if we do it, we should do it for sets as well?
 
 Yeah, and use the same env var / sys function.

Despite the DICT in the title?  OK.

Thanks for the feedback.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13703
___



[issue13703] Hash collision security issue

2012-01-21 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 You said above that it should be hardcoded; if so, how can it be changed
 at run-time from an environment variable?  Or am I misunderstanding.

You're right, I used the wrong word. I meant it should be a constant
independently of the dict size. But, indeed, not hard-coded in the
source.

   BTW, presumably if we do it, we should do it for sets as well?
  
  Yeah, and use the same env var / sys function.
 
 Despite the DICT in the title?  OK.

Well, dict is the most likely target for these attacks.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13703
___



[issue13609] Add os.get_terminal_size() function

2012-01-21 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 +1. I also find it weird that a function, especially one living in the os
 module, has such a high level of abstraction (basically this is why I
 was originally proposing shutil module for this to go in).
 
 Given the different opinions about the API, I think it's best to
 expose the lowest level functionality as-is, and let the user decide
 what to do (read env vars first, suppress the exception, use a
 fallback, etc.). 

Fair enough, but other people expressed sympathy for the two-function
approach :) I'm personally indifferent, although I find
get_terminal_size_raw a bit ugly and liked query_terminal_size
better.

(and looking up ROWS and COLUMNS make sense, since they are de facto
standards)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13609
___



[issue8052] subprocess close_fds behavior should only close open fds

2012-01-21 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 780992c9afea by Gregory P. Smith in branch '3.2':
Add a Misc/NEWS entry for issue 8052.
http://hg.python.org/cpython/rev/780992c9afea

New changeset 1f0a01dc723c by Gregory P. Smith in branch 'default':
A Misc/NEWS entry for issue 8052.
http://hg.python.org/cpython/rev/1f0a01dc723c

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8052
___



[issue13405] Add DTrace probes

2012-01-21 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

 So, yes. The code is intrusive. The code deals with a lot of internal
 machinery (PEP393 support in the ustack helper was quite difficult).
 It is going to break from time to time, sure. At the same time, I am
 committed to support it. And even if it is dropped in 3.4, no Python
 program will be affected.

To ease the concerns, I think you should make it so that dtrace-specific
code gets out of the way as much as possible.
I suggest you create a Python/ceval-dtrace.h header and put most
dtrace-specific code from ceval.c there. Its inclusion should be
conditional on WITH_DTRACE so that other core devs can ignore its
presence.

A couple other comments:
- in the makefile, DTRACEOBJS is inconsistent with DTRACE_STATIC and
LIBRARY_OBJS. You should make it DTRACE_OBJS.
- please add comments at the top of whatever header files you add, to
make it clear that they are dtrace-specific. Mentions of ustack helper
are a bit too specific to be helpful.
- some code lacks error checking, e.g. when calling PyUnicode_AsUTF8.
- is co_linenos ever freed or is it a memory leak?
- your indices and offsets should be Py_ssize_t, not int
- is an empty dtrace module really needed? a flag variable in the sys
module should be enough
- as you can see the Makefile uses -rm -f, you should probably do the
same instead of rm -f
- you have a rather strange if true in your configure.in additions

Thanks in advance.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13405
___



[issue8052] subprocess close_fds behavior should only close open fds

2012-01-21 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset d0acd8169c2a by Gregory P. Smith in branch '3.2':
Bugfix for issue #8052 fix on *BSD variants.
http://hg.python.org/cpython/rev/d0acd8169c2a

New changeset 5be3dadd2eef by Gregory P. Smith in branch '3.2':
Another issue #8052 bugfix (related to previous commit).
http://hg.python.org/cpython/rev/5be3dadd2eef

New changeset e52d81e0c750 by Gregory P. Smith in branch 'default':
bugfix for issue 8052 fixes on *BSD platforms.
http://hg.python.org/cpython/rev/e52d81e0c750

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8052
___



[issue13703] Hash collision security issue

2012-01-21 Thread Gregory P. Smith

Gregory P. Smith g...@krypto.org added the comment:

On Sat, Jan 21, 2012 at 2:45 PM, Antoine Pitrou rep...@bugs.python.org wrote:

 Antoine Pitrou pit...@free.fr added the comment:

 You said above that it should be hardcoded; if so, how can it be changed
 at run-time from an environment variable?  Or am I misunderstanding.

 You're right, I used the wrong word. I meant it should be a constant
 independently of the dict size. But, indeed, not hard-coded in the
 source.

   BTW, presumably if we do it, we should do it for sets as well?
 
  Yeah, and use the same env var / sys function.

 Despite the DICT in the title?  OK.

 Well, dict is the most likely target for these attacks.


While true I wouldn't make that claim as there will be applications
using a set in a vulnerable manner. I'd prefer to see any such
environment variable name used to configure this behavior not mention
DICT or SET but just say HASHTABLE.  That is a much better bikeshed
color. ;)

I'm still in the hash seed randomization camp but I'm finding it
interesting all of the creative ways others are trying to solve this
problem in a way that could be enabled by default in stable versions
regardless. :)

-gps

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13703
___



[issue13703] Hash collision security issue

2012-01-21 Thread Alex Gaynor

Alex Gaynor alex.gay...@gmail.com added the comment:

On Sat, Jan 21, 2012 at 5:42 PM, Gregory P. Smith rep...@bugs.python.org wrote:


 Gregory P. Smith g...@krypto.org added the comment:

 On Sat, Jan 21, 2012 at 2:45 PM, Antoine Pitrou rep...@bugs.python.org
 wrote:
 
  Antoine Pitrou pit...@free.fr added the comment:
 
  You said above that it should be hardcoded; if so, how can it be changed
  at run-time from an environment variable?  Or am I misunderstanding.
 
  You're right, I used the wrong word. I meant it should be a constant
  independently of the dict size. But, indeed, not hard-coded in the
  source.
 
BTW, presumably if we do it, we should do it for sets as well?
  
   Yeah, and use the same env var / sys function.
 
  Despite the DICT in the title?  OK.
 
  Well, dict is the most likely target for these attacks.
 

 While true I wouldn't make that claim as there will be applications
 using a set in a vulnerable manner. I'd prefer to see any such
 environment variable name used to configure this behavior not mention
 DICT or SET but just say HASHTABLE.  That is a much better bikeshed
 color. ;)

 I'm still in the hash seed randomization camp but I'm finding it
 interesting all of the creative ways others are trying to solve this
 problem in a way that could be enabled by default in stable versions
 regardless. :)

 -gps

 --

 ___
 Python tracker rep...@bugs.python.org
 http://bugs.python.org/issue13703
 ___


I'm a little slow, so bear with me, but David, does this counting scheme in
any way address the issue of:

I'm able to put N pieces of data into the database on successive requests,
but then *rendering* that data puts it in a dictionary, which renders that
page unviewable by anyone.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13703
___



[issue13609] Add os.get_terminal_size() function

2012-01-21 Thread Denilson Figueiredo de Sá

Denilson Figueiredo de Sá denilso...@gmail.com added the comment:

On Sat, Jan 21, 2012 at 17:40, Giampaolo Rodola' rep...@bugs.python.org wrote:

 Given the different opinions about the API, I think it's best to expose the 
 lowest
 level functionality as-is, and let the user decide what to do (read env vars 
 first,
 suppress the exception, use a fallback, etc.).

As a Python user (and not a committer), I disagree.

As a user, I don't care too much where the function should be placed
(although I believe os or sys are sensible choices). What I do care about is
that I want an extremely simple function that will just work. Don't
make me add code for handling all the extra cases, such code should be
inside the function.

All this discussion about the API made me remember this presentation:
http://python-for-humans.heroku.com/

Also, I see no downside to using a named tuple. Issue 4285 actually
turned sys.version_info into a named tuple.
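For reference, a named tuple stays compatible with code that expects a plain tuple, which is the point of the sys.version_info comparison. A small sketch (the TerminalSize name and its field names are hypothetical, not the proposed API):

```python
import sys
from collections import namedtuple

# Hypothetical result type for a terminal-size query.
TerminalSize = namedtuple("TerminalSize", ["columns", "lines"])

size = TerminalSize(columns=80, lines=24)
columns, lines = size            # unpacks like an ordinary 2-tuple
first_field = size[0]            # indexing still works
by_name = size.columns           # and so does attribute access

# sys.version_info behaves the same way: index 0 and .major are one value.
major_by_index = sys.version_info[0]
major_by_name = sys.version_info.major
```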

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13609
___



[issue13703] Hash collision security issue

2012-01-21 Thread Dave Malcolm

Dave Malcolm dmalc...@redhat.com added the comment:

5 more characters:
PYTHONHASHTABLEPROTECTION
or
PYHASHTABLEPROTECTION
maybe?

I'm in *both* camps: I like hash seed randomization fwiw.  I'm nervous
about enabling either of the approaches by default, but I can see myself
backporting both approaches into RHEL's ancient Python versions,
compiled in, disabled by default, but available at runtime via env vars
(assuming that no major flaws are discovered in my patch e.g.
performance).

I'm sorry if I'm muddying the waters by working on this approach.

Is the hash randomization approach ready to go, or is more work needed?
If the latter, is there a clear TODO list?
(for backporting to 2.*, presumably we'd want PyStringObject to be
randomized; I think this means that PyBytesObject needs to be randomized
also in 3.*; don't we need hash(b'foo') == hash('foo') ?).  Does the
patch need to also randomize the hashes of the numeric types? (I think
not; that may break too much 3rd-party code (NumPy?)).

[If we're bikeshedding,  I prefer the term "salt" to "seed" in the hash
randomization approach: there's a per-process "hash salt", which is
either randomly generated, or comes from the environment, set to 0 to
disable]
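
To make the per-process salt idea concrete, here is a rough sketch under stated assumptions. The environment variable name DEMO_HASH_SALT is made up for this example, and the FNV-1a-style function merely stands in for a string hash; neither is the patch's code nor CPython's actual hash.

```python
import os

MASK = (1 << 64) - 1
FNV_OFFSET = 0xcbf29ce484222325
FNV_PRIME = 0x100000001b3


def load_salt(environ=None, var="DEMO_HASH_SALT"):
    """Return the salt from the environment if set, else a random one."""
    environ = os.environ if environ is None else environ
    raw = environ.get(var)
    if raw is None:
        # No setting: pick a fresh random salt for this process.
        return int.from_bytes(os.urandom(8), "little")
    return int(raw) & MASK  # "0" disables salting


def salted_hash(data, salt):
    """FNV-1a-style 64-bit hash of a bytes object, perturbed by salt."""
    h = (FNV_OFFSET ^ salt) & MASK
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & MASK
    return h


salt = load_salt({"DEMO_HASH_SALT": "0"})
print(salted_hash(b"foo", salt))
```

With the salt set to 0 the function gives the same values in every process; any other salt changes them, so a precomputed set of colliding keys stops colliding.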

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13703
___



[issue13790] In str.format an incorrect error message for list, tuple, dict, set

2012-01-21 Thread Terry J. Reedy

Terry J. Reedy tjre...@udel.edu added the comment:

Looking further, I noticed that 'string' needed to be changed to 
'specification' in the following sentence also. Then I decided that the 
preceding sentence
 
"Most built-in types implement the following options for format specifications, 
although some of the formatting options are only supported by the numeric 
types."

should really follow the one about non-empty format specs. This positioning 
should make it more obvious that most of the options affect the string 
representation of the object after, not before, the string is produced, and are 
therefore applicable to all objects and not just string and number objects. I 
also propose to modify it so it is shorter and no longer contradictory, to read

"Most built-in types implement various options for such modifications, although 
some are only supported by the numeric types."

Further on, under "The available string presentation types are:", 
I think "``'s'`` String format. This is the default type for strings and may be 
omitted." should have 'and other non-numeric types ' inserted after "strings". 
New patch i13790b.diff attached.

The point of these additional changes is to make it clearer that the default 
formatting of non-number, non-string objects is to call str() and then apply 
the options to the resulting string. That makes something like
>>> format(range(5), '-^20s') # same with object.__format__(), 3.3.0a0
'----range(0, 5)-----'
predictable and comprehensible.
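
The "call str(), then apply the options" reading can be sketched directly. The example makes the str() step explicit, so it does not rely on object.__format__ accepting a non-empty format spec.

```python
text = str(range(5))                  # the string 'range(0, 5)'
centered = format(text, '-^20s')      # fill with '-', centre in 20 columns
print(centered)

# The !s conversion performs the same str() step inside str.format().
same = '{!s:-^20s}'.format(range(5))
print(same == centered)
```

Both spellings apply the fill, alignment and width to the string that str() produced, not to the range object itself.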

I agree with not making a temporary change (but see below ;-).

But it seems that the 3.4 message should at least be
"numeric format string passed to object.__format__" or
"format string with number-only options passed to object.__format__" or
"object.__format__ cannot handle number-only options"
as string formats work fine and, I presume, are not deprecated (?).

However, if the new ValueError message did not specify object.__format__ (which 
could still be confusing, even if more accurate), the change could be made now. 
For instance:
"Numeric option 'd' for non-number object".
It would not really matter if it is later raised in object.__format__ instead 
of str.__format__. I believe *all* of the format codes 'unknown' to str (and by 
extension, by default, to all other non-number types) *are* number codes.
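
A quick check of the claim that number codes are rejected for strings while string options work. The helper name try_format is made up for this sketch, and the exact wording of the error message is deliberately not shown, since that wording is what is under discussion.

```python
def try_format(value, spec):
    """Return the formatted value, or a short note if the spec is rejected."""
    try:
        return format(value, spec)
    except ValueError as exc:
        return "ValueError: {}".format(exc)


print(try_format(42, 'd'))       # number code on a number: accepted
print(try_format('abc', 'd'))    # number code on a string: rejected
print(try_format('abc', '>5s'))  # string options on a string: accepted
```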

--
assignee: docs@python -> terry.reedy
Added file: http://bugs.python.org/file24290/i13790b.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13790
___



[issue13703] Hash collision security issue

2012-01-21 Thread Paul McMillan

Paul McMillan p...@mcmillan.ws added the comment:

On Sat, Jan 21, 2012 at 3:47 PM, Alex Gaynor rep...@bugs.python.org wrote:
> I'm able to put N pieces of data into the database on successive requests,
> but then *rendering* that data puts it in a dictionary, which renders that
> page unviewable by anyone.

This and the problems Frank mentions are my primary concerns about the
counting approach. Without the original suggestion of modifying the
hash and continuing without an exception (which has its own set of
problems), the "valid data python can't process" problem is a pretty
big one. Allowing attackers to poison interactions for other users is
unacceptable.

The other thing I haven't seen mentioned yet is that while it is true
that most web applications do have robust error handling to produce
proper 500s, an unexpected error will usually result in restarting the
server process - something that can carry significant weight by
itself. I would consider it a serious problem if every attack request
required a complete application restart, a la original cgi.

I'm strongly in favor of randomization. While there are many broken
applications in the wild that depend on dictionary ordering, if we
ship with this feature disabled by default for security and bugfix
branches, and enable it for 3.3, users can opt-in to protection as
they need it and as they fix their applications. Users who have broken
applications can still safely apply the security fix (without even
reading the release notes) because it won't change the default
behavior. Distro managers can make an appropriate choice for their
user base. Most importantly, it negates the entire "compute once,
attack everywhere" class of collision problems, even if we haven't
explicitly discovered them.
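
For readers new to the issue, a rough sketch of why colliding keys are costly. The Key class is a toy with a caller-chosen hash, not a real attack string, and the counts are only meant to show the trend.

```python
class Key:
    """Toy key whose hash is chosen by the caller; counts equality checks."""

    eq_calls = 0

    def __init__(self, name, hash_value):
        self.name = name
        self.hash_value = hash_value

    def __hash__(self):
        return self.hash_value

    def __eq__(self, other):
        Key.eq_calls += 1
        return self.name == other.name


def count_eq_calls(n, colliding):
    """Insert n distinct keys into a dict and report how many __eq__ calls it took."""
    Key.eq_calls = 0
    table = {}
    for i in range(n):
        # Colliding keys all hash to 0; otherwise each key gets its own hash.
        table[Key(i, 0 if colliding else i)] = None
    return Key.eq_calls


colliding = count_eq_calls(200, colliding=True)
spread = count_eq_calls(200, colliding=False)
print(colliding, spread)
```

When every key shares one hash, each insert has to be compared against the keys already stored, so the work grows roughly quadratically with the number of keys. Randomizing the hash aims to stop an attacker from precomputing such a key set.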

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13703
___



[issue13783] Clean up PEP 380 C API additions

2012-01-21 Thread Meador Inge

Meador Inge mead...@gmail.com added the comment:

'PyStopIteration_Create' is just a trivial wrapper:

    PyObject *
    PyStopIteration_Create(PyObject *value)
    {
        return PyObject_CallFunctionObjArgs(PyExc_StopIteration, value, NULL);
    }

It is not needed.

As for 'PyGen_FetchStopIterationValue', does it really need to be public?  It 
is trivial to make it private because all calls to it are in 'genobject.c'.  
However, I am not sure if there is a strong use case for having it public.
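
Seen from Python, the behaviour behind these helpers looks like the sketch below: calling StopIteration(value) is all the wrapper does, and the value a generator returns travels on the exception's value attribute. This illustrates the PEP 380 semantics only; it is not the C API code.

```python
def worker():
    yield 1
    return "done"   # PEP 380: becomes StopIteration("done")


gen = worker()
first = next(gen)
try:
    next(gen)
except StopIteration as exc:
    result = exc.value   # the value a fetch helper would hand back

# Building the exception by hand is just a call with one argument.
manual = StopIteration("done")
print(first, result, manual.value)
```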

--
nosy: +meador.inge
stage:  -> needs patch
type:  -> enhancement

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13783
___



[issue11551] test_dummy_thread.py test coverage improvement

2012-01-21 Thread Denver Coneybeare

Denver Coneybeare denver.coneybe...@gmail.com added the comment:

I've looked at the review (thanks for the review) and can submit an updated 
patch.  I don't have the Python source code pulled down to my PC anymore so it 
might take a week or two before I'm able to update the patch and test it out.  
I imagine that's not too much of a problem though :)

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11551
___



[issue8052] subprocess close_fds behavior should only close open fds

2012-01-21 Thread Roundup Robot

Roundup Robot devn...@psf.upfronthosting.co.za added the comment:

New changeset 754c2eb0a92c by Gregory P. Smith in branch '3.2':
Fix FreeBSD, NetBSD and OpenBSD behavior of the issue #8052 fix.
http://hg.python.org/cpython/rev/754c2eb0a92c

New changeset 7d4658a8de96 by Gregory P. Smith in branch 'default':
Fix FreeBSD, NetBSD and OpenBSD behavior of the issue #8052 fix.
http://hg.python.org/cpython/rev/7d4658a8de96

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8052
___



[issue8052] subprocess close_fds behavior should only close open fds

2012-01-21 Thread Gregory P. Smith

Gregory P. Smith g...@krypto.org added the comment:

For FreeBSD, Python 3.2 and 3.3 now check to see if /dev/fd is valid.  Be sure 
to run "mount -t fdescfs none /dev/fd" on FreeBSD if you want faster subprocess 
launching.  Run a FreeBSD buildbot?  Please do it!

For Python 3.1 the fix for #13788 would fix this, but I believe 3.1 is in 
security-fix-only mode at this point, so we're not going to backport that 
os.closerange change that far.
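
The idea of closing only the descriptors that are actually open can be sketched in pure Python. This is an illustration under assumptions, not the subprocess implementation: the helper names and the fallback limit of 256 are made up, and it assumes /dev/fd (when readable) lists the process's open descriptors.

```python
import os


def open_fds(fd_dir="/dev/fd"):
    """Return the sorted fds listed in fd_dir, or None if it cannot be read."""
    try:
        names = os.listdir(fd_dir)
    except OSError:
        return None
    # The listing may include the fd used to read the directory itself.
    return sorted(int(name) for name in names if name.isdigit())


def close_fds_from(lowest, keep=(), fallback_max=256):
    """Close every fd >= lowest except those in keep."""
    fds = open_fds()
    if fds is None:
        # No usable listing: fall back to trying every number in a range.
        fds = range(lowest, fallback_max)
    for fd in fds:
        if fd >= lowest and fd not in keep:
            try:
                os.close(fd)
            except OSError:
                pass  # not open (or already closed): nothing to do


print(open_fds())
```

Walking the listing touches only descriptors that exist, which is why a valid /dev/fd makes launching faster than trying every possible number.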

--
resolution:  -> fixed
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue8052
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13814] Document why generators don't support the context management protocol

2012-01-21 Thread Nick Coghlan

Nick Coghlan ncogh...@gmail.com added the comment:

Generators deliberately don't support the context management protocol. This is 
so that they raise an explicit TypeError or AttributeError (pointing out that 
__exit__ is missing) if you leave out the @contextmanager decorator when you're 
using a generator to write an actual context manager.

Generators supporting the context management protocol natively would turn that 
into a far more subtle (and confusing) error: your code would silently fail to 
invoke the generator body.

Ensuring this common error remains easy to detect is far more important than 
making it easier to invoke close() on a generator object (particularly when 
contextlib.closing() already makes that very easy).
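
A short sketch of the three cases described above. The generator names are invented for the example, and which exception the bare generator raises depends on the Python version, so both are caught.

```python
from contextlib import closing, contextmanager


def numbers():
    yield 1
    yield 2


# Forgot @contextmanager: the with statement fails loudly.
try:
    with numbers():
        pass
except (TypeError, AttributeError) as exc:
    error = exc
else:
    error = None


@contextmanager
def managed():
    yield "resource"


# With the decorator, the generator body runs as the context manager.
with managed() as value:
    managed_value = value

# To merely guarantee close() on a generator, use contextlib.closing().
gen = numbers()
with closing(gen):
    first = next(gen)

print(type(error).__name__, managed_value, first)
```

The explicit error in the first case is the behaviour this issue proposes to document; closing() covers the "just call close() for me" use without hiding the mistake.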

--
assignee:  -> docs@python
components: +Documentation
nosy: +docs@python
stage: test needed -> needs patch
status: pending -> open
title: Generators as context managers. -> Document why generators don't support 
the context management protocol

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13814
___