[issue7946] Convoy effect with I/O bound threads and New GIL
Change by David Beazley : -- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue7946> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue16894] Function attribute access doesn't invoke methods in dict subclasses
Change by David Beazley : -- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue16894> ___
[issue24844] Python 3.5rc1 compilation error with Apple clang 4.2 included with Xcode 4
Change by David Beazley : -- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue24844> ___
[issue32810] Expose ags_gen and agt_gen in asynchronous generators
Change by David Beazley : -- stage: patch review -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue32810> ___
[issue27436] Strange code in selectors.KqueueSelector
Change by David Beazley : -- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue27436> ___
[issue16132] ctypes incorrectly encodes .format attribute of memory views
Change by David Beazley : -- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue16132> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: About nine years ago, I stood in front of a room of Python developers, including many core developers, and gave a talk about the problem described in this issue. It included some live demos and discussion of a possible fix. https://www.youtube.com/watch?v=fwzPF2JLoeU Based on subsequent interest, I think it's safe to say that this issue will never be fixed. Probably best to close this issue. -- ___ Python tracker <https://bugs.python.org/issue7946> ___
[issue33014] Clarify doc string for str.isidentifier()
David Beazley added the comment:

    s = 'Some String'
    s.isalnum()
    s.isalpha()
    s.isdecimal()
    s.isdigit()
    s.isidentifier()
    s.islower()
    s.isnumeric()
    s.isprintable()
    s.isspace()
    s.istitle()
    s.isupper()

Not really sure where I would have gotten the idea that it might be referring to s.iskeyword(). But what do I know? I'll stop submitting further suggestions.

--
___ Python tracker <https://bugs.python.org/issue33014> ___
[issue33014] Clarify doc string for str.isidentifier()
David Beazley added the comment: That wording isn't much better in my opinion. If I'm sitting there looking at methods like str.isdigit(), str.isnumeric(), str.isascii(), and str.isidentifier(), seeing keyword.iskeyword() makes me think it's a method regardless of whether you label it a function or method. Explicitly stating that "keyword" is actually the keyword module makes it much clearer. Or at least include the argument as well: keyword.iskeyword(kw). It really should be a string method though ;-) -- ___ Python tracker <https://bugs.python.org/issue33014> ___
[issue33014] Clarify doc string for str.isidentifier()
New submission from David Beazley :

This is a minor nit, but the doc string for str.isidentifier() states:

    Use keyword.iskeyword() to test for reserved identifiers such as "def" and "class".

At first glance, I thought that it meant you'd do this (doesn't work):

    'def'.iskeyword()

As opposed to this:

    import keyword
    keyword.iskeyword('def')

Perhaps a clarification that "keyword" refers to the keyword module could be added. Or better yet, just make iskeyword() a string method ;-).

--
assignee: docs@python
components: Documentation
messages: 313335
nosy: dabeaz, docs@python
priority: normal
severity: normal
status: open
title: Clarify doc string for str.isidentifier()
versions: Python 3.7
___ Python tracker <https://bugs.python.org/issue33014> ___
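A short sketch of the distinction the report is making — the keyword test lives in the keyword module, not on str, even though the method-like spelling in the doc string suggests otherwise:

```python
import keyword

# 'def' is lexically a valid identifier, but it is reserved:
print('def'.isidentifier())        # True
print(keyword.iskeyword('def'))    # True

# An ordinary name is an identifier but not a keyword:
print('spam'.isidentifier())       # True
print(keyword.iskeyword('spam'))   # False

# str has no iskeyword() method, which is the source of the confusion:
print(hasattr(str, 'iskeyword'))   # False
```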
[issue32810] Expose ags_gen and agt_gen in asynchronous generators
David Beazley added the comment: I've attached a file that illustrates the issue. (Side thought: this would be nice to have in inspect or traceback) -- Added file: https://bugs.python.org/file47434/agen.py ___ Python tracker <https://bugs.python.org/issue32810> ___
[issue32810] Expose ags_gen and agt_gen in asynchronous generators
New submission from David Beazley : Libraries such as Curio and asyncio provide a debugging facility that allows someone to view the call stack of generators/coroutines. For example, the _task_get_stack() function in asyncio/base_tasks.py. This works by manually walking up the chain of coroutines (by following cr_frame and gi_frame links as appropriate). The only problem is that it doesn't work if control flow falls into an async generator because an "async_generator_asend" instance is encountered and there is no meaningful way to proceed any further with stack inspection. This problem could be fixed if "async_generator_asend" and "async_generator_athrow" instances exposed the underlying "ags_gen" and "agt_gen" attribute that's held inside the corresponding C structures in Objects/genobject.c. Note: I made a quick and dirty "hack" to Python to extract "ags_gen" and verified that having this information would allow me to get complete stack traces in Curio. -- messages: 311906 nosy: dabeaz priority: normal severity: normal status: open title: Expose ags_gen and agt_gen in asynchronous generators type: enhancement versions: Python 3.7 ___ Python tracker <https://bugs.python.org/issue32810> ___
[issue32690] Return function locals() in order of creation?
David Beazley added the comment:

Some context: I noticed this while discussing (in a course) a programming trick involving instance initialization and locals() that I'd encountered in the past:

    def _init(locs):
        self = locs.pop('self')
        for name, val in locs.items():
            setattr(self, name, val)

    class Spam:
        def __init__(self, a, b, c, d):
            _init(locals())

In looking at locals(), it was coming back in reverse order of method arguments (d, c, b, a, self). To be honest, it wasn't a critical matter, but more of an odd curiosity in light of recent dictionary ordering. I could imagine writing a slightly more general version of _init() that didn't depend on a named 'self' argument if order was preserved:

    def _init(locs):
        items = list(locs.items())
        _, self = items[0]
        for name, val in items[1:]:
            setattr(self, name, val)

Personally, I don't think the issue Nathaniel brings up is worth worrying about because it would be such a weird edge case on something that is already an edge case. Returning variables in "lexical order"--meaning the order in which first encountered in the source--seems pretty sensible to me.

--
nosy: +dabeaz
___ Python tracker <https://bugs.python.org/issue32690> ___
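The first _init() variant from the comment can be run as-is today, since it keys off the 'self' name and therefore doesn't depend on the locals() ordering under discussion. A self-contained sketch:

```python
def _init(locs):
    # Pop 'self' out, then attach every remaining local as an attribute.
    self = locs.pop('self')
    for name, val in locs.items():
        setattr(self, name, val)

class Spam:
    def __init__(self, a, b, c, d):
        _init(locals())

s = Spam(1, 2, 3, 4)
print(s.a, s.b, s.c, s.d)   # 1 2 3 4
```

The order-dependent second variant is the one that motivates the question in this issue: it assumes items[0] is 'self', which only holds if locals() preserves creation order.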
[issue27436] Strange code in selectors.KqueueSelector
David Beazley added the comment: I don't see any possible way that you would ever get events = EVENT_READ | EVENT_WRITE if the flag is a single value (e.g., KQ_FILTER_READ) and the flag itself is not a bitmask. Only one of those == tests will ever be True. There is no need to use |=. Unless I'm missing something. -- ___ Python tracker <http://bugs.python.org/issue27436> ___
[issue27436] Strange code in selectors.KqueueSelector
David Beazley added the comment: If the KQ_FILTER constants aren't bitmasks, it seems that the code could be simplified to the last version then. At the least, it would remove a few unnecessary calculations. Again, a very minor thing (I only stumbled onto it by accident really). -- ___ Python tracker <http://bugs.python.org/issue27436> ___
[issue27436] Strange code in selectors.KqueueSelector
New submission from David Beazley:

Not so much a bug, but an observation based on reviewing the implementation of the selectors.KqueueSelector class. In that class there is the select() method:

    def select(self, timeout=None):
        timeout = None if timeout is None else max(timeout, 0)
        max_ev = len(self._fd_to_key)
        ready = []
        try:
            kev_list = self._kqueue.control(None, max_ev, timeout)
        except InterruptedError:
            return ready
        for kev in kev_list:
            fd = kev.ident
            flag = kev.filter
            events = 0
            if flag == select.KQ_FILTER_READ:
                events |= EVENT_READ
            if flag == select.KQ_FILTER_WRITE:
                events |= EVENT_WRITE
            key = self._key_from_fd(fd)
            if key:
                ready.append((key, events & key.events))
        return ready

The for-loop looks like it might be checking flags against some kind of bit-mask in order to build events. However, if so, the code just looks wrong. Wouldn't it use the '&' operator (or some variant) instead of '==' like this?

    for kev in kev_list:
        fd = kev.ident
        flag = kev.filter
        events = 0
        if flag & select.KQ_FILTER_READ:
            events |= EVENT_READ
        if flag & select.KQ_FILTER_WRITE:
            events |= EVENT_WRITE

If it's not a bit-mask, then wouldn't the code be simplified by something like this?

    for kev in kev_list:
        fd = kev.ident
        flag = kev.filter
        if flag == select.KQ_FILTER_READ:
            events = EVENT_READ
        elif flag == select.KQ_FILTER_WRITE:
            events = EVENT_WRITE

Again, not sure if this is a bug or not. It's just something that looks weirdly off.

--
components: Library (Lib)
messages: 269676
nosy: dabeaz
priority: normal
severity: normal
status: open
title: Strange code in selectors.KqueueSelector
type: enhancement
versions: Python 3.6
___ Python tracker <http://bugs.python.org/issue27436> ___
[issue25476] close() behavior on non-blocking BufferedIO objects with sockets
David Beazley added the comment: Please don't make flush() close the file on a BlockingIOError. That would be an unfortunate mistake and make it impossible to implement non-blocking I/O correctly with buffered I/O. -- ___ Python tracker <http://bugs.python.org/issue25476> ___
[issue25476] close() behavior on non-blocking BufferedIO objects with sockets
New submission from David Beazley:

First comment: In the I/O library, there is documented behavior for how things work in the presence of non-blocking I/O. For example, read/write methods returning None on raw file objects. Methods on BufferedIO instances raise a BlockingIOError for operations that can't complete.

However, the implementation of close() is currently broken. If buffered I/O is being used and a file is closed, it's possible that the close will fail due to a BlockingIOError occurring as buffered data is flushed to output. However, in this case, the file is closed anyways and there is no possibility to retry. Here is an example to illustrate:

    >>> from socket import *
    >>> s = socket(AF_INET, SOCK_STREAM)
    >>> s.connect(('somehost', port))
    >>> s.setblocking(False)
    >>> f = s.makefile('wb', buffering=1000)    # Large buffer
    >>> f.write(b'x'*100)
    >>>

Now, watch carefully:

    >>> f
    <_io.BufferedWriter name=4>
    >>> f.closed
    False
    >>> f.close()
    Traceback (most recent call last):
      File "", line 1, in
    BlockingIOError: [Errno 35] write could not complete without blocking
    >>> f
    <_io.BufferedWriter name=-1>
    >>> f.closed
    True
    >>>

I believe this can be fixed by changing a single line in Modules/_io/bufferedio.c:

    --- bufferedio_orig.c   2015-10-25 16:40:22.0 -0500
    +++ bufferedio.c        2015-10-25 16:40:35.0 -0500
    @@ -530,10 +530,10 @@
         res = PyObject_CallMethodObjArgs((PyObject *)self, _PyIO_str_flush, NULL);
         if (!ENTER_BUFFERED(self))
             return NULL;
    -    if (res == NULL)
    -        PyErr_Fetch(&exc, &val, &tb);
    -    else
    -        Py_DECREF(res);
    +    if (res == NULL)
    +        goto end;
    +    else
    +        Py_DECREF(res);
         res = PyObject_CallMethodObjArgs(self->raw, _PyIO_str_close, NULL);

With this patch, the close() method can be retried as appropriate until all buffered data is successfully written.
-- components: IO messages: 253438 nosy: dabeaz priority: normal severity: normal status: open title: close() behavior on non-blocking BufferedIO objects with sockets type: behavior versions: Python 3.5 ___ Python tracker <http://bugs.python.org/issue25476> ___
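With the patched behavior described in the report (close() leaves the file open and retryable on BlockingIOError), a caller could drive the retry. A minimal sketch; the helper name and the stand-in "file" class below are made up for illustration, taking the place of a real non-blocking BufferedWriter:

```python
import time

def close_with_retry(f, attempts=10, delay=0.01):
    """Retry f.close() until buffered data flushes or attempts run out.

    Assumes the patched behavior: close() raises BlockingIOError but
    leaves the file open, so a later retry can finish the flush.
    """
    for _ in range(attempts):
        try:
            f.close()
            return True
        except BlockingIOError:
            time.sleep(delay)   # wait for the fd to become writable again
    return False

# Stand-in that fails to flush twice before succeeding:
class FlakyFile:
    def __init__(self, failures=2):
        self.failures = failures
        self.closed = False
    def close(self):
        if self.failures > 0:
            self.failures -= 1
            raise BlockingIOError(35, 'write could not complete without blocking')
        self.closed = True

f = FlakyFile()
print(close_with_retry(f), f.closed)   # True True
```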
[issue7322] Socket timeout can cause file-like readline() method to lose data
David Beazley added the comment: This bug is still present in Python 3.5, but it occurs if you attempt to do a readline() on a socket that's in non-blocking mode. In that case, you probably DO want to retry at a later time (unlike the timeout case). -- ___ Python tracker <http://bugs.python.org/issue7322> ___
[issue24975] Python 3.5 can't compile AST involving PEP 448 unpacking
New submission from David Beazley:

The compile() function is not able to compile an AST created from code that uses some of the new unpacking generalizations in PEP 448. Example:

    code = '''
    a = { 'x':1, 'y':2 }
    b = { **a, 'z': 3 }
    '''

    # Works
    ccode = compile(code, '', 'exec')

    # Crashes
    import ast
    tree = ast.parse(code)
    ccode = compile(tree, '', 'exec')

Error traceback:

    Traceback (most recent call last):
      File "bug.py", line 11, in
        ccode = compile(tree, '', 'exec')
    ValueError: None disallowed in expression list

Note: This bug makes it impossible to try generalized unpacking examples interactively in IPython.

--
components: Library (Lib)
messages: 249442
nosy: dabeaz
priority: normal
severity: normal
status: open
title: Python 3.5 can't compile AST involving PEP 448 unpacking
type: crash
versions: Python 3.5
___ Python tracker <http://bugs.python.org/issue24975> ___
[issue24844] Python 3.5rc1 compilation error on OS X 10.8
New submission from David Beazley:

Just a note that Python-3.5.0rc1 fails to compile on Mac OS X 10.8.5 with the following compiler:

    bash$ clang --version
    Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn)
    Target: x86_64-apple-darwin12.6.0
    Thread model: posix
    bash$

Here is the resulting compilation error:

    /usr/bin/clang -c -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -Werror=declaration-after-statement -I. -IInclude -I./Include -DPy_BUILD_CORE -o Python/ceval.o Python/ceval.c
    fatal error: error in backend: Cannot select: 0x102725710: i8,ch = AtomicSwap 0x102c45ce0, 0x102725010, 0x102725510 [ID=7]
      0x102725010: i64 = X86ISD::WrapperRIP 0x102723710 [ID=6]
      0x102723710: i64 = TargetGlobalAddress 0 [ID=4]
      0x102725510: i8 = Constant<1> [ID=2]
    In function: take_gil
    make: *** [Python/ceval.o] Error 1

Problem can be fixed by commenting out the following line in pyconfig.h:

    /* Has builtin atomics */
    // #define HAVE_BUILTIN_ATOMIC 1

Not really sure what to advise. To my eyes, it looks like a bug in clang or Xcode. So, maybe this is more just an FYI that source builds might fail on certain older Mac systems.

--
messages: 248415
nosy: dabeaz
priority: normal
severity: normal
status: open
title: Python 3.5rc1 compilation error on OS X 10.8
type: compile error
versions: Python 3.5
___ Python tracker <http://bugs.python.org/issue24844> ___
[issue23441] rlcompleter: tab on empty prefix => insert spaces
David Beazley added the comment: It's still broken on Python 3.5b4. -- ___ Python tracker <http://bugs.python.org/issue23441> ___
[issue23441] rlcompleter: tab on empty prefix => insert spaces
David Beazley added the comment: Wanted to add: I see this as being about the same as having a broken window pane on the front of Python 3. Maybe there are awesome things inside, but it makes a bad first impression on anyone who dares to use the interactive console. -- ___ Python tracker <http://bugs.python.org/issue23441> ___
[issue23441] rlcompleter: tab on empty prefix => insert spaces
David Beazley added the comment: Frivolity aside, I really wish this issue would get more traction and a fix. Indentation is an important part of the Python language (obviously). A pretty standard way to indent is to hit "tab" in whatever environment you're using to edit Python code. Yet, at the interactive prompt, tab doesn't actually indent on a blank line. Instead, it autocompletes the builtins. Aside from it being highly annoying (as previously mentioned), it is also an embarrassment. Newcomers to Python will very often try things out using the stock interpreter before moving on to more sophisticated environments. The fact that tab is broken from the get-go leaves a pretty sour impression when not even the most basic tutorial examples work at the interactive console (and keep in mind that whitespace sensitivity is probably already an issue on their minds). Experienced Python users coming from Python 2 to Python 3 are going to find that tab is busted in Python 3. Well, of course it's busted because everything is busted in Python 3. "Wow, this really sucks as bad as everyone says" they'll say. So, with that as context, I'm really hoping I don't have to watch people use a busted tab key for another entire release cycle of Python 3 as I did for Python-3.4. I have no particular thoughts about the specifics (tabs vs. spaces) or the amount of indentation. It's the autocomplete on empty line that's the issue. -- ___ Python tracker <http://bugs.python.org/issue23441> ___
[issue23441] rlcompleter: tab on empty prefix => insert spaces
David Beazley added the comment: For what it's worth, I'm kind of tired having to hack site.py every time I upgrade Python in order to avoid being shown 6000 choices when hitting tab on an empty line. It is crazy annoying. -- ___ Python tracker <http://bugs.python.org/issue23441> ___
[issue23441] rlcompleter: tab on empty prefix => insert spaces
David Beazley added the comment: This is a problem that will never be fixed. Sure, it was a release blocker in Python 3.4. It wasn't fixed. It is a release blocker in Python 3.5. It won't be fixed. They'll just tell you to indent using the spacebar as generations of typists have done for centuries. It won't be fixed. Why don't you just use ipython or bpython? It won't be fixed. Doesn't your IDE take care of this? It won't be fixed. By the way, backspace will never work right either. No, that will never be fixed. Did we mention that this will never be fixed? You can fix it! Yes, you! No, I mean you! Yes, yes, you can. Simply edit the file Lib/site.py and comment out the line that does this: # enablerlcompleter() Problem solved. All is well. By the way. This problem will never be fixed. That is all. -- nosy: +dabeaz ___ Python tracker <http://bugs.python.org/issue23441> ___
[issue23642] Interaction of ModuleSpec and C Extension Modules
David Beazley added the comment: This is great news. Read the PEP draft and think this is a very good thing to be addressing. Thanks, Brett. -- ___ Python tracker <http://bugs.python.org/issue23642> ___
[issue23642] Interaction of ModuleSpec and C Extension Modules
David Beazley added the comment: Note: Might be related to Issue 19713. -- ___ Python tracker <http://bugs.python.org/issue23642> ___
[issue23642] Interaction of ModuleSpec and C Extension Modules
David Beazley added the comment: Sorry. I take back the previous message. It still doesn't quite do what I want. Anyways, any insight or thoughts about this would be appreciated ;-). -- ___ Python tracker <http://bugs.python.org/issue23642> ___
[issue23642] Interaction of ModuleSpec and C Extension Modules
David Beazley added the comment:

Final comment. It seems that one can generally avoid a lot of nastiness if importlib.reload() is used instead. For example:

    >>> mod = sys.modules[spec.name] = module_from_spec(spec)
    >>> importlib.reload(mod)

This works for both source and extension modules and completely avoids the need to worry about the exec_module()/load_module() warts. Wouldn't say it's an obvious approach though ;-).

--
___ Python tracker <http://bugs.python.org/issue23642> ___
[issue23642] Interaction of ModuleSpec and C Extension Modules
New submission from David Beazley:

I have been investigating some of the new importlib machinery and the addition of ModuleSpec objects. I am a little curious about the intended handling of C extension modules going forward.

Backing up for a moment, consider a pure Python module. It seems that I can do things like this to bring a module into existence (some steps involving sys.modules omitted):

    >>> from importlib.util import find_spec, module_from_spec
    >>> spec = find_spec('socket')
    >>> socket = module_from_spec(spec)
    >>> spec.loader.exec_module(socket)
    >>>

However, it all gets "weird" with C extension modules. For example, you can perform the first few steps:

    >>> spec = find_spec('math')
    >>> spec
    ModuleSpec(name='math', loader=<_frozen_importlib.ExtensionFileLoader object at 0x1012122b0>, origin='/usr/local/lib/python3.5/lib-dynload/math.so')
    >>> math = module_from_spec(spec)
    >>> math
    >>> dir(math)
    ['__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__']

As you can see, you get a fresh "unloaded" module here. However, if you try to bring in the module contents, things get screwy:

    >>> spec.loader.exec_module(math)
    Traceback (most recent call last):
      File "", line 1, in
    AttributeError: 'ExtensionFileLoader' object has no attribute 'exec_module'
    >>>

Yes, this is the old legacy interface in action--there is no exec_module() method. You can always fall back to load_module() like this:

    >>> spec.loader.load_module(spec.name)
    >>>

The problem here is that it creates a brand new module and ignores the one that was previously created by module_from_spec(). That module is still empty:

    >>> dir(math)
    ['__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__']
    >>>

I realize that I'm treading into a swamp of legacy interfaces and some pretty complex machinery here. However, here's my question: are C extension modules always going to be a special case that needs to be considered in code that interacts with the import system? Specifically, will it need to be special-cased to use load_module() instead of the module_from_spec()/exec_module() combination?

I suppose the question might also apply to built-in and frozen modules as well (although I haven't investigated that so much). Mainly, I'm just trying to gain some insight from the devs as to the overall direction where the import implementation is going with this.

P.S. ModuleSpecs are cool. +1

--
components: Interpreter Core
messages: 237872
nosy: dabeaz
priority: normal
severity: normal
status: open
title: Interaction of ModuleSpec and C Extension Modules
type: behavior
versions: Python 3.5
___ Python tracker <http://bugs.python.org/issue23642> ___
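The pure-Python loading dance described in the report runs end to end on modern Python; a self-contained sketch (using 'json' as an arbitrary source-backed stdlib module):

```python
import sys
from importlib.util import find_spec, module_from_spec

spec = find_spec('json')
mod = module_from_spec(spec)     # fresh, "unloaded" module object
sys.modules[spec.name] = mod     # register before executing, as importers do
spec.loader.exec_module(mod)     # source loaders implement exec_module()

print(mod.dumps({'x': 1}))   # {"x": 1}
```

Note that the later messages on this issue conclude that importlib.reload(mod) is a way to paper over the exec_module()/load_module() split for extension modules.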
[issue15986] memoryview: expose 'buf' attribute
David Beazley added the comment: One of the other goals of memoryviews is to make memory access less hacky. To that end, it would be nice to have the .buf attribute available given that all of the other attributes are already there. I don't see why people should need to do some even more hacky hack thing on top of hacks just to expose the pointer (which they'll figure out how to do anyway if they actually need to use it for something). -- ___ Python tracker <http://bugs.python.org/issue15986> ___
[issue15986] memoryview: expose 'buf' attribute
David Beazley added the comment: Well, a lot of things in this big bad world are dangerous. Don't see how this is any more dangerous than all of the peril that tools like ctypes and llvmpy already provide. -- ___ Python tracker <http://bugs.python.org/issue15986> ___
[issue15986] memoryview: expose 'buf' attribute
David Beazley added the comment: There are other kinds of libraries that might want to access the .buf attribute. For example, the llvmpy extension. Exposing it would be useful. -- ___ Python tracker <http://bugs.python.org/issue15986> ___
[issue5845] rlcompleter should be enabled automatically
David Beazley added the comment:

Funny thing, this feature breaks the interactive interpreter in the most basic way on OS X systems. For example, the tab key won't even work to indent. You can't even type the most basic programs into the interactive interpreter. For example:

    >>> for i in range(10):
    ...     print(i)

Oh sure, you can make it work by typing the space bar a bunch of times, but it's extremely annoying. The only way I was able to get a working interactive interpreter on my machine was to manually edit site.py and remove the call to enablerlcompleter() from main(). I hope someone reconsiders this feature and removes it as default behavior.

--
nosy: +dabeaz
___ Python tracker <http://bugs.python.org/issue5845> ___
[issue18111] Add a default argument to min & max
David Beazley added the comment:

To me, the fact that

    m = max(s) if s else default

doesn't work with iterators alone makes this worthy of consideration. I would also note that min/max are the only reduction functions that don't have the ability to work with a possibly empty sequence. For example:

    >>> sum([])
    0
    >>> any([])
    False
    >>> all([])
    True
    >>> functools.reduce(lambda x,y: x+y, [], 0)
    0
    >>> math.fsum([])
    0.0
    >>>

--
___ Python tracker <http://bugs.python.org/issue18111> ___
[issue18111] Add a default argument to min & max
David Beazley added the comment: I could have used this feature myself somewhat recently. It was in some code involving document matching where zero or more possible candidates were assigned a score and I was trying to find the max score. The fact that an empty list was a possibility complicated everything because I had to add extra checks for it. max(scores, default=0) would have been a lot simpler. -- nosy: +dabeaz ___ Python tracker <http://bugs.python.org/issue18111> ___
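The default argument requested in this issue did land (min() and max() accept default= as of Python 3.4), so the patterns from these comments can be written directly:

```python
# An empty iterable returns the default instead of raising ValueError:
print(max([], default=0))           # 0

# Crucially, this also works for iterators, where the
# `max(s) if s else default` idiom cannot:
scores = (s for s in [])            # generator that yields nothing
print(max(scores, default=0))       # 0

# The default is ignored when the iterable is non-empty:
print(min([3, 1, 2], default=99))   # 1
```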
[issue16723] io.TextIOWrapper on urllib.request.urlopen terminates prematurely
David Beazley added the comment: I have run into this bug myself. Agree that a file-like object should never report itself as closed unless .close() has been explicitly called on it. HTTPResponse should not return itself as closed after the end-of-file has been reached. I think there is also a bug in the implementation of TextIOWrapper as well. Even if the underlying file reports itself as closed, previously read and buffered data should be processed first before reporting an error about the file being closed. -- ___ Python tracker <http://bugs.python.org/issue16723> ___
[issue16894] Function attribute access doesn't invoke methods in dict subclasses
New submission from David Beazley:

Suppose you subclass a dictionary:

    class mdict(dict):
        def __getitem__(self, index):
            print('Getting:', index)
            return super().__getitem__(index)

Now, suppose you define a function and perform these steps that reassign the function's attribute dictionary:

    >>> def foo():
    ...     pass
    ...
    >>> foo.__dict__ = mdict()
    >>> foo.x = 23
    >>> foo.x     # Observe: No output from overridden __getitem__
    23
    >>> type(foo.__dict__)
    >>> foo.__dict__
    {'x': 23}
    >>>

Carefully observe that access to foo.x does not invoke the overridden __getitem__() method in mdict. Instead, it just directly accesses the default __getitem__() on dict. Admittedly, this is a really obscure corner case. However, if the __dict__ attribute of a function can be legally reassigned, it might be nice for inheritance to work ;-).

--
components: Interpreter Core
messages: 179364
nosy: dabeaz
priority: normal
severity: normal
status: open
title: Function attribute access doesn't invoke methods in dict subclasses
type: behavior
versions: Python 3.3
___ Python tracker <http://bugs.python.org/issue16894> ___
[issue14965] super() and property inheritance behavior
David Beazley added the comment:

Just as a note, there is a distinct possibility that a "property" in a superclass could be some other kind of descriptor object that's not a property. To handle that case, the solution of

    super(self.__class__, self.__class__).x.fset(self, value)

would actually have to be rewritten as

    super(self.__class__, self.__class__).x.__set__(self, value)

That said, I agree it would be nice to have a simplified means of accomplishing this.

--
nosy: +dabeaz
___ Python tracker <http://bugs.python.org/issue14965> ___
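A runnable sketch of the pattern under discussion (the Base/Child classes here are made up for illustration): a subclass setter reaches the parent's property through the class-bound super() proxy and invokes the descriptor protocol explicitly.

```python
class Base:
    def __init__(self):
        self._x = 0

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value

class Child(Base):
    @property
    def x(self):
        return super().x       # reading works through super() directly

    @x.setter
    def x(self, value):
        # Plain assignment through super() is not supported, so the
        # parent descriptor must be invoked explicitly.  Using __set__()
        # rather than .fset also handles non-property descriptors,
        # which is the point of the comment above.
        super(Child, Child).x.__set__(self, 2 * value)

c = Child()
c.x = 5
print(c.x)   # 10
```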
[issue16254] PyUnicode_AsWideCharString() increases string size
David Beazley added the comment: Another note: the PyUnicode_AsUTF8String() function doesn't leave the UTF-8 encoded byte string behind on the original string object. I got into this thinking that PyUnicode_AsWideCharString() might have similar behavior. -- ___ Python tracker <http://bugs.python.org/issue16254> ___
[issue16254] PyUnicode_AsWideCharString() increases string size
David Beazley added the comment:

Maybe it's not a bug, but I still think it's undesirable. Basically, you have a function that allocates a buffer, fills it with data, and allows the buffer to be destroyed. Yet, as a side effect, it allocates a second buffer, fills it, and permanently attaches it to the original string object. Thus it makes the size of the string object blow up to a size substantially larger than it was before, with no way to reclaim the memory other than to delete the whole string.

Maybe this is some sort of rare event that doesn't matter, but maybe there's some bit of C extension code that is trying to pass a wchar_t array off to some external library. The extension writer is using the PyUnicode_AsWideCharString() function with the understanding that it creates a new array and that you have to destroy it. They understand that it's not super fast to have to make a copy, but it's better than nothing. What's unfortunate is that all of this attention to memory management doesn't reward the programmer, as a copy gets left behind on the string object anyway. For instance, I start with a 10 megabyte string, I pass it through a C extension function, and now the string is mysteriously using 50 megabytes of memory.

I think the idea of filling wstr, returning it, and clearing it (if originally NULL) would definitely work here. Actually, that's exactly what I want--don't fill in the wstr member if it's not set already. That way, it's possible for C extensions to temporarily get the wstr buffer, do something, and then toss it away without affecting the original string.

Another suggestion: an API function to simply clear wstr and the UTF-8 representation could also work. Again, this is for extension writers who want to pull data out of strings, but don't want to leave these memory side effects behind.

-- ___ Python tracker <http://bugs.python.org/issue16254> ___
[issue16254] PyUnicode_AsWideCharString() increases string size
David Beazley added the comment: I should quickly add, is there any way to simply have this function not keep the wchar_t buffer around afterwards? That would be great. -- ___ Python tracker <http://bugs.python.org/issue16254> ___
[issue16254] PyUnicode_AsWideCharString() increases string size
New submission from David Beazley:

The PyUnicode_AsWideCharString() function is described as creating a new buffer of type wchar_t allocated by PyMem_Alloc() (which must be freed by the user). However, if you use this function, it causes the size of the original string object to permanently increase. For example, suppose you had some extension code like this:

static PyObject *py_receive_wchar(PyObject *self, PyObject *args) {
    PyObject *obj;
    wchar_t *s;
    Py_ssize_t len;

    if (!PyArg_ParseTuple(args, "U", &obj)) {
        return NULL;
    }
    if ((s = PyUnicode_AsWideCharString(obj, &len)) == NULL) {
        return NULL;
    }
    /* Do nothing */
    PyMem_Free(s);
    Py_RETURN_NONE;
}

Now, try an experiment (assume that the above extension function is available as 'receive_wchar'):

>>> s = "Hell"*1000
>>> len(s)
4000
>>> import sys
>>> sys.getsizeof(s)
4049
>>> receive_wchar(s)
>>> sys.getsizeof(s)
20053
>>>

It seems that PyUnicode_AsWideCharString() may be filling in the wstr field of the associated PyASCIIObject structure from PEP 393 (I haven't verified). Once filled, it never seems to be discarded.

Background: I am trying to figure out how to convert from Unicode to (wchar_t, int *) in a way that doesn't cause a permanent increase in the memory footprint of the original Unicode object. Also, I'm trying to stay away from deprecated Unicode APIs.

-- components: Extension Modules, Interpreter Core, Unicode messages: 173089 nosy: dabeaz, ezio.melotti priority: normal severity: normal status: open title: PyUnicode_AsWideCharString() increases string size versions: Python 3.3 ___ Python tracker <http://bugs.python.org/issue16254> ___
[issue16132] ctypes incorrectly encodes .format attribute of memory views
New submission from David Beazley:

This is somewhat related to an earlier bug report concerning memory views, but as far as I can tell, ctypes is not encoding the '.format' attribute correctly in most cases. Consider this example. First, create a ctypes array:

>>> a = (ctypes.c_double * 3)(1,2,3)
>>> len(a)
3
>>> a[0]
1.0
>>> a[1]
2.0
>>>

Now, create a memory view for it:

>>> m = memoryview(a)
>>> len(m)
3
>>> m.itemsize
8
>>> m.ndim
1
>>> m.shape
(3,)
>>>

All looks well. However, if you try to do anything with the .format or access the items, it's completely broken:

>>> m.format
'(3)<d'
>>> m[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NotImplementedError: memoryview: unsupported format (3)<d
>>>

This is quite inconsistent with the behavior observed elsewhere. For example:

>>> import array
>>> b = array.array('d',[1,2,3])
>>> memoryview(b).format
'd'
>>> import numpy
>>> c = numpy.array([1,2,3],dtype='d')
>>> memoryview(c).format
'd'
>>>

As you can see, array libraries are using .format to encode the format of a single array item. ctypes is encoding the format of the entire array (all items). ctypes also includes endianness, which presents additional difficulties. This behavior affects both Python code that wants to use memoryviews and C extension code that wants to use the underlying buffer protocol to work with arrays in a generic way. Essentially, it cuts the use of ctypes off entirely unless you modify the underlying buffer handling code to special case it.

Suggested fix: have ctypes only encode the format for a single item in the case of arrays. Also, for items that are encoded using the native byte ordering, don't include an endianness modifier ('<','>', etc.). Including the byte order just complicates all of the handling code, because it has to be modified to a) know what the native byte ordering is and b) check multiple cases such as for "d" and "<d".

___ Python tracker <http://bugs.python.org/issue16132> ___
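The comparison above can be sketched in a self-contained form. What the ctypes view reports varies by interpreter version (this issue was later addressed), so only the array.array side is firm:

```python
import array
import ctypes

# array.array advertises a single-item format code in native order...
a = array.array('d', [1.0, 2.0, 3.0])
print(memoryview(a).format)     # 'd'
print(memoryview(a).itemsize)   # 8

# ...while ctypes, at the time of this report, advertised the format of
# the whole array with a byte-order prefix, e.g. '(3)<d'. The exact
# string printed here depends on the interpreter version.
c = (ctypes.c_double * 3)(1.0, 2.0, 3.0)
print(memoryview(c).format)
```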
[issue15944] memoryviews and ctypes
David Beazley added the comment: One followup note---I think it's fine to punt on cast('B') if the memoryview is non-contiguous. That's a rare case that's probably not as common. -- ___ Python tracker <http://bugs.python.org/issue15944> ___
[issue15944] memoryviews and ctypes
David Beazley added the comment:

There's probably a bigger discussion about memoryviews for a rainy day. However, the number one thing that would save all of this in my book would be to make sure cast('B') is universally supported regardless of format, including endianness--especially in the standard library. For example, being able to do this:

>>> a = array.array('d',[1.0, 2.0, 3.0, 4.0])
>>> m = memoryview(a).cast('B')
>>> m[0:4] = b'\x00\x01\x02\x03'
>>> a
array('d', [1.000112050316, 2.0, 3.0, 4.0])
>>>

Right now, it doesn't work for ctypes. For example:

>>> import ctypes
>>> a = (ctypes.c_double * 4)(1,2,3,4)
>>> a
<__main__.c_double_Array_4 object at 0x1006a7cb0>
>>> m = memoryview(a).cast('B')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: memoryview: source format must be a native single character format prefixed with an optional '@'
>>>

As some background, being able to work with a "byte" view of memory is important for a lot of problems involving I/O, data interchange, and related problems where being able to accurately construct/deconstruct the underlying memory buffers is more useful than actually interpreting their contents.

-- ___ Python tracker <http://bugs.python.org/issue15944> ___
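The array.array byte-patching idiom above can be made self-contained; struct.pack is used here only to generate a well-defined replacement byte pattern:

```python
import array
import struct

# cast('B') exposes the raw bytes of the array; writing through the
# byte view is visible through the original 'd' array.
a = array.array('d', [1.0, 2.0, 3.0, 4.0])
m = memoryview(a).cast('B')

# Overwrite element 0's 8 bytes with the native representation of 99.5.
m[0:8] = struct.pack('=d', 99.5)
print(a[0])   # 99.5
print(a[1])   # 2.0 (untouched)
```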
[issue15944] memoryviews and ctypes
David Beazley added the comment:

I should add that 0-dim indexing doesn't work as described either:

>>> import ctypes
>>> d = ctypes.c_double()
>>> m = memoryview(d)
>>> m[()]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NotImplementedError: memoryview: unsupported format <d

-- ___ Python tracker <http://bugs.python.org/issue15944> ___
[issue15944] memoryviews and ctypes
David Beazley added the comment:

Just to be specific, why is something like this not possible?

>>> d = ctypes.c_double()
>>> m = memoryview(d)
>>> m[0:8] = b'abcdefgh'
>>> d.value
8.540883223036124e+194
>>>

(Doesn't have to be exactly like this, but what's wrong with overwriting bytes with bytes of a compatible size?)

-- ___ Python tracker <http://bugs.python.org/issue15944> ___
[issue15944] memoryviews and ctypes
David Beazley added the comment: No, I want to be able to access the raw bytes sitting behind a memoryview as bytes without all of this casting and reinterpretation. Just show me the raw bytes. Not doubles, not ints, not structure packing, not copying into byte strings, or whatever. Is this really impossible? It sure seems so. -- ___ Python tracker <http://bugs.python.org/issue15944> ___
[issue15944] memoryviews and ctypes
David Beazley added the comment: I don't think memoryviews should be imposing any casting restrictions at all. It's low level. Get out of the way. -- ___ Python tracker <http://bugs.python.org/issue15944> ___
[issue15944] memoryviews and ctypes
David Beazley added the comment: Even with the <http://bugs.python.org/issue15944> ___
[issue15944] memoryviews and ctypes
David Beazley added the comment: I don't want to read the representation by copying it into a bytes object. I want direct access to the underlying memory--including the ability to modify it. As it stands now, it's completely useless. -- ___ Python tracker <http://bugs.python.org/issue15944> ___
[issue15944] memoryviews and ctypes
New submission from David Beazley:

I've been playing with the interaction of ctypes and memoryviews and am curious about intended behavior. Consider the following:

>>> import ctypes
>>> d = ctypes.c_double()
>>> m = memoryview(d)
>>> m.ndim
0
>>> m.shape
()
>>> m.readonly
False
>>> m.itemsize
8
>>>

As you can see, you have a memory view for the ctypes double object. However, the fact that it has a 0-dimension and no shape seems to cause all sorts of weird behavior. For instance, indexing and slicing don't work:

>>> m[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: invalid indexing of 0-dim memory
>>> m[:]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: invalid indexing of 0-dim memory
>>>

As such, you can't really seem to do anything interesting with the resulting memory view. For example, you can't pull data out of it. Nor can you overwrite the contents (i.e., replacing the contents with an 8-byte byte string). Attempting to cast the memory view to something else doesn't work either:

>>> d = ctypes.c_double()
>>> m = memoryview(d)
>>> m2 = m.cast('c')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: memoryview: source format must be a native single character format prefixed with an optional '@'
>>>

I must be missing something really obvious here. Is there no way to get access to the memory behind a ctypes object?

-- messages: 170477 nosy: dabeaz priority: normal severity: normal status: open title: memoryviews and ctypes type: behavior versions: Python 3.3 ___ Python tracker <http://bugs.python.org/issue15944> ___
[issue15546] Iteration breaks with bz2.open(filename,'rt')
David Beazley added the comment: File attached. The file can be read in its entirety in binary mode. -- Added file: http://bugs.python.org/file26673/access-log-0108.bz2 ___ Python tracker <http://bugs.python.org/issue15546> ___
[issue15546] Iteration breaks with bz2.open(filename,'rt')
New submission from David Beazley:

The bz2 library in Python 3.3b1 doesn't support iteration in text mode properly. Example:

>>> f = bz2.open('access-log-0108.bz2')
>>> next(f)    # Works
b'140.180.132.213 - - [24/Feb/2008:00:08:59 -0600] "GET /ply/ply.html HTTP/1.1" 200 97238\n'
>>> g = bz2.open('access-log-0108.bz2','rt')
>>> next(g)    # Fails
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>

-- components: Library (Lib) messages: 167299 nosy: dabeaz priority: normal severity: normal status: open title: Iteration breaks with bz2.open(filename,'rt') type: behavior versions: Python 3.3 ___ Python tracker <http://bugs.python.org/issue15546> ___
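A self-contained reproduction sketch, substituting generated data for the attached access log; in current releases both modes iterate correctly:

```python
import bz2
import os
import tempfile

# Build a small .bz2 file on the fly instead of using the attached log.
data = b"line one\nline two\n"
tmp = tempfile.NamedTemporaryFile(suffix=".bz2", delete=False)
tmp.write(bz2.compress(data))
tmp.close()

with bz2.open(tmp.name) as f:          # binary mode
    first_binary = next(f)             # b'line one\n'

with bz2.open(tmp.name, "rt") as g:    # text mode; raised StopIteration in 3.3b1
    first_text = next(g)               # 'line one\n'

os.unlink(tmp.name)
print(first_binary, first_text)
```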
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley added the comment:

Python 3.2 (r32:88445, Feb 20 2011, 21:51:21)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import gzip
>>> import io
>>> f = io.TextIOWrapper(gzip.open("file.gz"),encoding='latin-1')
>>> f.readline()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
io.UnsupportedOperation: read1
>>>

-- ___ Python tracker <http://bugs.python.org/issue10791> ___
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley added the comment: If I can find some time, I may take a look at this. I just noticed that similar problems arise trying to wrap TextIOWrapper around the file-like objects returned by urllib.request.urlopen as well. In the big picture, some discussion of what it means to be "file-like" might be in order. If something is "file-like" and binary, should that always imply that I be able to wrap a TextIOWrapper object around it in order to encode/decode text? I would argue "yes", but I'd be curious to know what others think. -- ___ Python tracker <http://bugs.python.org/issue10791> ___
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley added the comment: Bump. This is still broken in Python 3.2. -- ___ Python tracker <http://bugs.python.org/issue10791> ___
[issue11117] Implementing Async IO
David Beazley added the comment:

Glad you liked it! I think there is a bit of a cautionary tale in there, though. With aio_*, there is the promise of better performance, but you're also going to need a *LOT* of advance planning and thought to avoid creating a tangled coding nightmare with it.

Just as an aside, one of the uses of aio_* related functions is to implement parts of user-level thread libraries in C (e.g., pthreads, etc.). A library might use the asynchronous I/O callbacks as part of implementing non-kernel (green) threads. The code for doing this tends to be very low level and hairy, with lots of signal handling--for example, if you want to context-switch between two user-level threads in C, you usually do it inside a signal handler (i.e., you thread-switch inside the signal handler called in response to aio_* completions).

Whether it's feasible to expose aio_* all the way up to Python or not is an open question. I suspect it will be fraught with lots of tricky issues. In the end, it might just be easier to use threads. Nevertheless, you'll learn a lot about Python internals by working on this :-).

-- ___ Python tracker <http://bugs.python.org/issue11117> ___
[issue11117] Implementing Async IO
David Beazley added the comment: Anyone contemplating the use of aio_* functions should first go read "The Story of Mel". http://www.catb.org/jargon/html/story-of-mel.html -- nosy: +dabeaz ___ Python tracker <http://bugs.python.org/issue11117> ___
[issue7322] Socket timeout can cause file-like readline() method to lose data
David Beazley added the comment: Just wanted to say that I agree it's nonsense to continue reading on a socket that timed out (I'm not even sure what I might have been thinking when I first submitted this bug other than just experimenting with edge cases of the socket interface). It's still probably good to precisely specify what the behavior is in any case. -- ___ Python tracker <http://bugs.python.org/issue7322> ___
[issue10907] OS X installer: warn users of buggy Tcl/Tk in OS X 10.6
David Beazley added the comment: A comment from the training world: The instability of IDLE on the Mac makes teaching introductory Python courses a nightmare at the moment. Sure, one might argue that students should install an alternative editor, but then you usually end up with two problems instead of one. It would be great if IDLE just "worked" out of the box for starting out. Glad to see someone looking at this. -- nosy: +dabeaz ___ Python tracker <http://bugs.python.org/issue10907> ___
[issue7322] Socket timeout can cause file-like readline() method to lose data
David Beazley added the comment:

Have any other programming environments ever had a feature where a socket timeout returns an exception containing partial data? I'm not aware of one offhand, and speaking as a systems programmer, something like this might be somewhat unexpected.

My concern is that in the presence of timeouts, the programmer will be forced to reassemble the message themselves from fragments returned in the exception. However, one reason for using readline() in the first place is precisely so that you don't have to do that sort of thing.

Is there any reason why the input buffer can't be preserved across calls? You've already got a file-like wrapper around the socket. Just keep the unconsumed buffer in that instance.

-- nosy: +dabeaz ___ Python tracker <http://bugs.python.org/issue7322> ___
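The buffering policy being argued for can be illustrated with a hypothetical sketch (this is not the socket module's API; BufferedLineReader and fake_recv are invented names). Unconsumed bytes live in the wrapper object, so a timeout can simply be retried without losing data:

```python
# Hypothetical sketch: a line reader that preserves its input buffer
# across timeouts. `recv` is any callable that returns a chunk of
# bytes (empty bytes = EOF) or raises TimeoutError.
class BufferedLineReader:
    def __init__(self, recv):
        self._recv = recv
        self._buf = b""

    def readline(self):
        while b"\n" not in self._buf:
            chunk = self._recv()           # may raise TimeoutError...
            if not chunk:                  # EOF: drain whatever is left
                line, self._buf = self._buf, b""
                return line
            self._buf += chunk             # ...but self._buf survives it
        line, _, self._buf = self._buf.partition(b"\n")
        return line + b"\n"

# Simulate a socket that times out mid-line, then delivers the rest.
chunks = iter([b"GET / HT", None, b"TP/1.1\n"])
def fake_recv():
    item = next(chunks)
    if item is None:
        raise TimeoutError("timed out")
    return item

reader = BufferedLineReader(fake_recv)
try:
    reader.readline()                      # times out mid-line
except TimeoutError:
    pass                                   # partial data stays buffered
line = reader.readline()                   # retry succeeds, nothing lost
print(line)   # b'GET / HTTP/1.1\n'
```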
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley added the comment: Hmmm. Interesting. In the big picture, it might be an interesting project for someone (not necessarily the core devs) to sit down and refactor both of these modules so that they play nice with the Python 3 I/O system. Obviously that's a project outside the scope of this bug or the 3.2 release for that matter. -- ___ Python tracker <http://bugs.python.org/issue10791> ___
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley added the comment: Do Python devs really view gzip and bz2 as two totally different animals? They both have the same functionality and would be used for the same kinds of things. Maybe I'm missing something. -- ___ Python tracker <http://bugs.python.org/issue10791> ___
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley added the comment: C or not, wrapping a BZ2File instance with a TextIOWrapper to get text still seems like something that someone might want to do. I doubt it would take much modification to give BZ2File instances the required set of methods. -- ___ Python tracker <http://bugs.python.org/issue10791> ___
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley added the comment: It goes without saying that this also needs to be checked with the bz2 module. A quick check seems to indicate that it has the same problem. While you're at it, maybe someone could add an 'open' function to bz2 to make it symmetrical with gzip as well :-). -- ___ Python tracker <http://bugs.python.org/issue10791> ___
[issue10791] Wrapping TextIOWrapper around gzip files
New submission from David Beazley:

Is something like this supposed to work:

>>> import gzip
>>> import io
>>> f = io.TextIOWrapper(gzip.open("foo.gz"),encoding='ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: readable

In a nutshell--reading a .gz file as text.

-- messages: 124870 nosy: dabeaz priority: normal severity: normal status: open title: Wrapping TextIOWrapper around gzip files type: behavior versions: Python 3.2 ___ Python tracker <http://bugs.python.org/issue10791> ___
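For reference, a self-contained variant of the example using an in-memory stream instead of foo.gz. In current releases GzipFile provides the buffered-I/O methods TextIOWrapper needs (readable(), read1(), ...), so this now works:

```python
import gzip
import io

# Compress some text in memory, then read it back through a
# TextIOWrapper layered on a GzipFile.
raw = io.BytesIO(gzip.compress(b"uno\ndos\n"))
f = io.TextIOWrapper(gzip.GzipFile(fileobj=raw), encoding="ascii")
line = f.readline()
print(line)   # 'uno\n'
```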
[issue10783] struct.pack() and Unicode strings
David Beazley added the comment: Thanks everyone for looking at this! -- ___ Python tracker <http://bugs.python.org/issue10783> ___
[issue10783] struct.pack() and Unicode strings
David Beazley added the comment: As a user of Python 3, I would like to echo Victor's comment about fixing the API right now as opposed to having to deal with it later. I can only speak for myself, but I would guess that anyone using Python 3 already understands that it's bleeding edge and that the bytes/strings distinction is really important. If fixing this breaks some third party libraries, I say good--they shouldn't have been blindly passing Unicode into struct in the first place. Better to deal with it now when the number of users is relatively small. -- ___ Python tracker <http://bugs.python.org/issue10783> ___
[issue10783] struct.pack() and Unicode strings
David Beazley added the comment:

Actually, here's another one of my favorite examples:

>>> import struct
>>> struct.pack("s","\xf1")
b'\xc3'
>>>

Not only does this not encode the correct value, it doesn't even encode the entire UTF-8 encoding (just the first byte of it). Like I said, pity the poor bastard who puts something like that in their code and then spends the whole day trying to figure out where in the hell '\xf1' magically got turned into '\xc3'.

-- ___ Python tracker <http://bugs.python.org/issue10783> ___
[issue10783] struct.pack() and Unicode strings
David Beazley added the comment:

I encountered this issue in the context of distributed computing/interprocess communication involving binary-encoded records (and encoding/decoding such records using struct). At its core, this is all about I/O--something where encoding and decoding matter a lot. Frankly, it was quite surprising that a unicode string would silently pass through struct and turn into bytes.

IMHO, the fact that this is even possible encourages a sloppy usage of struct that favors programming convenience over correctness--something that's only going to end badly for the poor soul who passes non-ASCII characters into struct without knowing it.

A default encoding might be okay as long as it was set to something like ASCII or Latin-1 (not UTF-8). At least then you'd get an encoding error for characters that don't fit into a byte.

-- ___ Python tracker <http://bugs.python.org/issue10783> ___
[issue10783] struct.pack() and Unicode strings
David Beazley added the comment:

Why is it even encoding at all? Almost every other part of Python 3 forces you to be explicit about bytes/string conversion. For example:

    struct.pack("10s", x.encode('utf-8'))

Given that automatic conversion is documented, it's not clear what can be done at this point. However, there are very few other parts of Python 3 that perform implicit string-byte conversions like this (at least that I know of off-hand).

-- ___ Python tracker <http://bugs.python.org/issue10783> ___
[issue10783] struct.pack() and Unicode strings
David Beazley added the comment: Hmmm. Well, the docs seem to say that it's allowed and that it will be encoded as UTF-8. Given the treatment of Unicode/bytes elsewhere in Python 3, all I can say is that this behavior is rather surprising. -- ___ Python tracker <http://bugs.python.org/issue10783> ___
[issue10783] struct.pack() and Unicode strings
David Beazley added the comment:

Note: This is what happens in Python 2.6.4:

>>> import struct
>>> struct.pack("10s",u"Jalape\u00f1o")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
struct.error: argument for 's' must be a string
>>>

-- ___ Python tracker <http://bugs.python.org/issue10783> ___
[issue10783] struct.pack() and Unicode strings
New submission from David Beazley:

Is the struct.pack() function supposed to automatically encode Unicode strings into binary? For example:

>>> struct.pack("10s","Jalape\u00f1o")
b'Jalape\xc3\xb1o\x00'
>>>

This is Python 3.2b1.

-- components: Library (Lib) messages: 124727 nosy: dabeaz priority: normal severity: normal status: open title: struct.pack() and Unicode strings type: behavior versions: Python 3.2 ___ Python tracker <http://bugs.python.org/issue10783> ___
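For context, later 3.x releases resolved this issue by removing the implicit encoding; a sketch of the behavior in current releases, where the str-to-bytes conversion must be explicit:

```python
import struct

# 's' now requires a bytes object; the encoding step is explicit.
packed = struct.pack("10s", "Jalape\u00f1o".encode("utf-8"))
print(packed)   # b'Jalape\xc3\xb1o\x00'

# Passing a str directly is rejected with struct.error.
rejected = False
try:
    struct.pack("10s", "Jalape\u00f1o")
except struct.error:
    rejected = True
print(rejected)   # True
```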
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: Wow, that is a *really* intriguing performance result with radically different behavior than Unix. Do you have any ideas of what might be causing it? -- ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: One more attempt at fixing tricky segfaults. Glad someone had some eagle eyes on this :-). -- Added file: http://bugs.python.org/file17106/dabeaz_gil.patch ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
Changes by David Beazley : Removed file: http://bugs.python.org/file17104/dabeaz_gil.patch ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment:

I stand corrected. However, I'm going to have to think of a completely different approach for carrying out that functionality, as I don't know how the take_gil() function is able to determine whether gil_last_holder has been deleted or not. Will think about it and post an updated patch later.

Do you have any examples or insight you can provide about how these segfaults have shown up in Python code? I'm not able to observe any such behavior on OS X or Linux. Is this happening while running the ccbench program? Some other program?

-- ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: That second access of gil_last_holder->cpu_bound is safe because that block of code is never entered unless some other thread currently holds the GIL. If a thread holds the GIL, then gil_last_holder is guaranteed to have a valid value. -- ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: Added extra pointer check to avoid possible segfault. -- Added file: http://bugs.python.org/file17104/dabeaz_gil.patch ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
Changes by David Beazley : Removed file: http://bugs.python.org/file17102/dabeaz_gil.patch ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: New version of patch that will probably fix Windows-XP problems. Was doing something stupid in the monitor (not sure how it worked on Unix). -- Added file: http://bugs.python.org/file17102/dabeaz_gil.patch ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
Changes by David Beazley : Removed file: http://bugs.python.org/file17094/dabeaz_gil.patch
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: I've also attached a new file schedtest.py that illustrates a subtle difference between having the GIL monitor thread and not having the monitor.

Without the monitor, every thread is responsible for its own scheduling. If you have a lot of threads running, you may have a lot of threads all performing a timed wait and then waking up only to find that the GIL is locked and that they have to go back to waiting. One side effect is that certain threads have a tendency to starve. For example, if you run schedtest.py with the original GIL, you get a trace where three CPU-bound threads run like this:

Thread-3 16632
Thread-2 16517
Thread-1 31669
Thread-2 16610
Thread-1 16256
Thread-2 16445
Thread-1 16643
Thread-2 16331
Thread-1 16494
Thread-3 16399
Thread-1 17090
Thread-1 20860
Thread-3 16306
Thread-1 19684
Thread-3 16258
Thread-1 16669
Thread-3 16515
Thread-1 16381
Thread-3 16600
Thread-1 16477
Thread-3 16507
Thread-1 16740
Thread-3 16626
Thread-1 16564
Thread-3 15954
Thread-2 16727
...

You will observe that Threads 1 and 2 alternate, but Thread 3 starves. Then at some point, Threads 1 and 3 alternate, but Thread 2 starves.

By having a separate GIL monitor, threads are no longer responsible for making scheduling decisions concerning timeouts. Instead, the monitor is what times out and yanks threads off the GIL. If you run the same test with the GIL monitor, you get scheduling like this:

Thread-1 33278
Thread-2 32278
Thread-3 31981
Thread-1 33760
Thread-2 32385
Thread-3 32019
Thread-1 32700
Thread-2 32085
Thread-3 32248
Thread-1 31630
Thread-2 32200
Thread-3 32054
Thread-1 32721
Thread-2 32659
Thread-3 34150

Threads nicely cycle round-robin. There also appears to be about half as much thread switching (for reasons I don't quite understand).
-- Added file: http://bugs.python.org/file17095/schedtest.py
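The FIFO hand-off behavior described above can be modeled with a deterministic toy scheduler. This is only a sketch: `monitor_schedule` and its inputs are illustrative names, not code from schedtest.py or the patch.

```python
from collections import deque

def monitor_schedule(threads, slices):
    """Model the monitor's FIFO hand-off: when the monitor forces a drop,
    the longest-waiting thread runs next, so execution cycles round-robin."""
    queue = deque(threads)
    trace = []
    for _ in range(slices):
        current = queue.popleft()   # longest waiter acquires the GIL
        trace.append(current)
        queue.append(current)       # after its slice, it rejoins at the back
    return trace

print(monitor_schedule(["Thread-1", "Thread-2", "Thread-3"], 6))
# → ['Thread-1', 'Thread-2', 'Thread-3', 'Thread-1', 'Thread-2', 'Thread-3']
```

Contrast this with the self-timed-wait regime, where whichever thread happens to win the wakeup race runs next and a thread can lose that race many times in a row.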
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: I've updated the GIL patch to reflect concerns about the monitor thread running forever. This version has a suspension mechanism where the monitor goes to sleep if nothing is going on for a while. It gets resumed if threads try to acquire the GIL but time out for some reason. -- Added file: http://bugs.python.org/file17094/dabeaz_gil.patch
[issue7946] Convoy effect with I/O bound threads and New GIL
Changes by David Beazley : Removed file: http://bugs.python.org/file17084/dabeaz_gil.patch
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: Greg, I like the idea of the monitor suspending if no thread owns the GIL. Let me work on that. Good point on embedded systems.

Antoine, yes, the GIL monitor is completely independent and simply ticks along every 5 ms. A worst-case scenario is that an I/O-bound thread is scheduled shortly after the 5 ms tick and then becomes CPU-bound afterwards. In that case, the monitor might let it run up to about 10 ms before switching it. Hard to say if it's a real problem though---the normal timeslice on many systems is 10 ms, so it doesn't seem out of line.

As for the priority part, this patch should have similar behavior to the gilinter patch except for very subtle differences in thread scheduling due to the use of the GIL monitor. For instance, since threads never time out on the condition variable anymore, they tend to cycle execution in a purely round-robin fashion.
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: Here is the result of running the writes.py test with the patch I submitted. This is on OS-X.

bash-3.2$ ./python.exe writes.py
t1 2.83990693092 0
t2 3.27937912941 0
t1 5.54346394539 1
t2 6.68237304688 1
t1 8.9648039341 2
t2 9.60041999817 2
t1 12.1856160164 3
t2 12.5866689682 3
t1 15.3869640827 4
t2 15.7042851448 4
t1 18.4115200043 5
t2 18.5771169662 5
t2 21.4922711849 6
t1 21.6835460663 6
t2 24.6117911339 7
t1 24.9126679897 7
t1 27.1683580875 8
t2 28.2728791237 8
t1 29.4513950348 9
t1 32.2438161373 10
t2 32.5283250809 9
t1 34.8905010223 11
t2 36.0952250957 10
t1 38.109760046 12
t2 39.3465380669 11
t1 41.5758800507 13
t2 42.587772131 12
t1 45.1536290646 14
t2 45.8339021206 13
t1 48.6495029926 15
t2 49.1581180096 14
t1 51.5414950848 16
t2 52.6768190861 15
t1 54.818582058 17
t2 56.1163961887 16
t1 58.1549630165 18
t2 59.6944830418 17
t1 61.4515309334 19
t2 62.7685520649 18
t1 64.3223180771 20
t2 65.8158640862 19
65.8578810692
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: One comment on that patch I just submitted. Basically, it's an attempt at an extremely simple tweak to the GIL that fixes most of the problems discussed here. I don't have any special religious attachment to it though. Would love to see a BFS comparison.
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: The attached patch makes two simple refinements to the new GIL implemented in Python 3.2. Each is briefly described below.

1. Changed mechanism for thread time expiration

In the current implementation, threads perform a timed wait on a condition variable. If time expires and no thread switches have occurred, the currently running thread is forced to drop the GIL.

In the patch, timeouts are now performed by a special "GIL monitor" thread. This thread runs independently of Python and simply handles time expiration. Basically, it records the number of thread switches, sleeps for a specified interval (5 ms), and then looks at the number of thread switches again. If no switches occurred, it forces the currently running thread to drop the GIL.

With this monitor thread, it is no longer necessary to perform any timed condition variable waits. This approach has a few subtle benefits. First, threads no longer sit in a wait/timeout cycle when trying to get the GIL (so there is less overhead). Second, you get FIFO scheduling of threads: when time expires, the thread that has been waiting the longest on the condition variable runs next. Generally, you want this.

2. A very simple two-level priority mechanism

A new attribute 'cpu_bound' is added to the PyThreadState structure. If a thread is ever forced to drop the GIL, this attribute is simply set True (1). If a thread gives up the GIL voluntarily, it is set back to False (0). This attribute is used to set up simple scheduling (described next).

There are now two separate condition variables (gil_cpu_cond and gil_io_cond) that separate waiting threads according to their cpu_bound attribute setting. CPU-bound threads wait on gil_cpu_cond whereas I/O-bound threads wait on gil_io_cond. Using the two condition variables, the following scheduling rules are enforced:

- If there are any waiting I/O bound threads, they are always signaled first, before any CPU-bound threads.
- If an I/O bound thread wants the GIL, but a CPU-bound thread is running, the CPU-bound thread is immediately forced to drop the GIL.
- If a CPU-bound thread wants the GIL, but another CPU-bound thread is running, the running thread is immediately forced to drop the GIL if its time period has already expired.

Results
-------

This patch gives excellent results for both the ccbench test and all of my previous I/O bound tests. Here is the output:

== CPython 3.2a0.0 (py3k:80470:80497M) ==
== i386 Darwin on 'i386' ==

--- Throughput ---

Pi calculation (Python)
threads=1: 871 iterations/s.
threads=2: 844 ( 96 %)
threads=3: 838 ( 96 %)
threads=4: 826 ( 94 %)

regular expression (C)
threads=1: 367 iterations/s.
threads=2: 345 ( 94 %)
threads=3: 339 ( 92 %)
threads=4: 327 ( 89 %)

bz2 compression (C)
threads=1: 384 iterations/s.
threads=2: 728 ( 189 %)
threads=3: 695 ( 180 %)
threads=4: 707 ( 184 %)

--- Latency ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 0 ms.)
CPU threads=2: 1 ms. (std dev: 2 ms.)
CPU threads=3: 0 ms. (std dev: 1 ms.)
CPU threads=4: 0 ms. (std dev: 1 ms.)

Background CPU task: regular expression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 2 ms. (std dev: 1 ms.)
CPU threads=2: 1 ms. (std dev: 1 ms.)
CPU threads=3: 1 ms. (std dev: 1 ms.)
CPU threads=4: 2 ms. (std dev: 1 ms.)

Background CPU task: bz2 compression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 2 ms.)
CPU threads=2: 2 ms. (std dev: 3 ms.)
CPU threads=3: 0 ms. (std dev: 1 ms.)
CPU threads=4: 0 ms. (std dev: 1 ms.)

--- I/O bandwidth ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 5850.9 packets/s.
CPU threads=1: 5246.8 ( 89 %)
CPU threads=2: 4228.9 ( 72 %)
CPU threads=3: 4222.8 ( 72 %)
CPU threads=4: 2959.5 ( 50 %)

Particular attention should be given to tests involving I/O performance.
In particular, here are the results of the I/O bandwidth test using the unmodified GIL:

--- I/O bandwidth ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 6007.1 packets/s.
CPU threads=1: 189.0 ( 3 %)
CPU threads=2: 19.7 ( 0 %)
CPU threads=3: 19.7 ( 0 %)
CPU threads=4: 5.1 ( 0 %)

Other Benefits
--------------

This patch does not involve any complicated libraries, platform-specific functionality, low-level lock twiddling, or mathematically complex priority scheduling algorithms. Emphasize: the code is simple.

Negative Aspects
----------------

This modification might introduce a starvation effect where CPU-bound threads never get to run if there is an extremely heavy load of I/O-bound threads competing for the GIL.

Comparison to BFS
-----------------

Still need to test. Would be curious.

-- Added file: http://bugs.python.org/file17084/dabeaz_gil.patch
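The monitor-thread idea described in the message can be sketched in Python. This is a hedged illustration only: `GilState`, `GilMonitor`, and their fields are hypothetical names standing in for the C-level state in the patch, and the real work happens inside CPython's GIL machinery, not in Python code.

```python
import threading
import time

class GilState:
    """Hypothetical shared state standing in for the patch's C globals."""
    def __init__(self):
        self.switch_count = 0    # incremented whenever a thread takes the GIL
        self.last_seen = 0       # value the monitor saw on its previous tick
        self.drop_request = False
        self.running = True

class GilMonitor(threading.Thread):
    """Sketch of the monitor loop: every INTERVAL seconds, compare the switch
    counter with the last value seen; if no thread switch occurred, request
    that the currently running thread drop the GIL."""
    INTERVAL = 0.005   # 5 ms, the interval named in the message

    def __init__(self, state):
        super().__init__(daemon=True)
        self.state = state

    def tick(self):
        # One monitor iteration, separated out so it can be exercised directly.
        if self.state.switch_count == self.state.last_seen:
            self.state.drop_request = True   # force the GIL holder to yield
        self.state.last_seen = self.state.switch_count

    def run(self):
        while self.state.running:
            time.sleep(self.INTERVAL)
            self.tick()
```

Because the monitor, not the waiting threads, performs the timeout, waiters can block indefinitely on a condition variable and be woken in FIFO order, which is where the round-robin behavior comes from.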
[issue8532] Refinements to Python 3 New GIL
David Beazley added the comment: Can't decide whether this should be attached to Issue 7946 or not. I will also post it there. (Feel free to close this issue if you want to keep 7946 alive.)
[issue8532] Refinements to Python 3 New GIL
New submission from David Beazley : The attached patch makes two simple refinements to the new GIL implemented in Python 3.2. Each is briefly described below.

1. Changed mechanism for thread time expiration

In the current implementation, threads perform a timed wait on a condition variable. If time expires and no thread switches have occurred, the currently running thread is forced to drop the GIL.

In the patch, timeouts are now performed by a special "GIL monitor" thread. This thread runs independently of Python and simply handles time expiration. Basically, it records the number of thread switches, sleeps for a specified interval (5 ms), and then looks at the number of thread switches again. If no switches occurred, it forces the currently running thread to drop the GIL.

With this monitor thread, it is no longer necessary to perform any timed condition variable waits. This approach has a few subtle benefits. First, threads no longer sit in a wait/timeout cycle when trying to get the GIL (so there is less overhead). Second, you get FIFO scheduling of threads: when time expires, the thread that has been waiting the longest on the condition variable runs next. Generally, you want this.

2. A very simple two-level priority mechanism

A new attribute 'cpu_bound' is added to the PyThreadState structure. If a thread is ever forced to drop the GIL, this attribute is simply set True (1). If a thread gives up the GIL voluntarily, it is set back to False (0). This attribute is used to set up simple scheduling (described next).

There are now two separate condition variables (gil_cpu_cond and gil_io_cond) that separate waiting threads according to their cpu_bound attribute setting. CPU-bound threads wait on gil_cpu_cond whereas I/O-bound threads wait on gil_io_cond. Using the two condition variables, the following scheduling rules are enforced:

- If there are any waiting I/O bound threads, they are always signaled first, before any CPU-bound threads.
- If an I/O bound thread wants the GIL, but a CPU-bound thread is running, the CPU-bound thread is immediately forced to drop the GIL.
- If a CPU-bound thread wants the GIL, but another CPU-bound thread is running, the running thread is immediately forced to drop the GIL if its time period has already expired.

Results
-------

This patch gives excellent results for both the ccbench test and all of my previous I/O bound tests. Here is the output:

== CPython 3.2a0.0 (py3k:80470:80497M) ==
== i386 Darwin on 'i386' ==

--- Throughput ---

Pi calculation (Python)
threads=1: 871 iterations/s.
threads=2: 844 ( 96 %)
threads=3: 838 ( 96 %)
threads=4: 826 ( 94 %)

regular expression (C)
threads=1: 367 iterations/s.
threads=2: 345 ( 94 %)
threads=3: 339 ( 92 %)
threads=4: 327 ( 89 %)

bz2 compression (C)
threads=1: 384 iterations/s.
threads=2: 728 ( 189 %)
threads=3: 695 ( 180 %)
threads=4: 707 ( 184 %)

--- Latency ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 0 ms.)
CPU threads=2: 1 ms. (std dev: 2 ms.)
CPU threads=3: 0 ms. (std dev: 1 ms.)
CPU threads=4: 0 ms. (std dev: 1 ms.)

Background CPU task: regular expression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 2 ms. (std dev: 1 ms.)
CPU threads=2: 1 ms. (std dev: 1 ms.)
CPU threads=3: 1 ms. (std dev: 1 ms.)
CPU threads=4: 2 ms. (std dev: 1 ms.)

Background CPU task: bz2 compression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 2 ms.)
CPU threads=2: 2 ms. (std dev: 3 ms.)
CPU threads=3: 0 ms. (std dev: 1 ms.)
CPU threads=4: 0 ms. (std dev: 1 ms.)

--- I/O bandwidth ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 5850.9 packets/s.
CPU threads=1: 5246.8 ( 89 %)
CPU threads=2: 4228.9 ( 72 %)
CPU threads=3: 4222.8 ( 72 %)
CPU threads=4: 2959.5 ( 50 %)

Particular attention should be given to tests involving I/O performance.
In particular, here are the results of the I/O bandwidth test using the unmodified GIL:

--- I/O bandwidth ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 6007.1 packets/s.
CPU threads=1: 189.0 ( 3 %)
CPU threads=2: 19.7 ( 0 %)
CPU threads=3: 19.7 ( 0 %)
CPU threads=4: 5.1 ( 0 %)

Other Benefits
--------------

This patch does not involve any complicated libraries, platform-specific functionality, low-level lock twiddling, or mathematically complex priority scheduling algorithms. Emphasize: the code is simple.

Negative Aspects
----------------

This modification might introduce a starvation effect where CPU-bound threads never get to run if there is an extremely heavy load of I/O-bound threads competing for the GIL. Is starvation a real problem or a theoretical problem? Hard to say. Would need study.

--
components: Interpreter Core
files: gil.patch
keywords: patch
messages: 104192
nosy: dabeaz
severity: normal
status: open
title: Refinements to Pyth
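The two-level priority rule above reduces to a pure scheduling decision, which can be sketched as a small function. This is an illustration with hypothetical names (`pick_next`, the deques): the patch implements the same rule in C by signaling gil_io_cond before gil_cpu_cond, not with Python queues.

```python
from collections import deque

def pick_next(io_waiters, cpu_waiters):
    """Return the next thread to receive the GIL.
    I/O-bound waiters are always signaled before CPU-bound ones;
    within each class, the longest waiter goes first (FIFO)."""
    if io_waiters:
        return io_waiters.popleft()
    if cpu_waiters:
        return cpu_waiters.popleft()
    return None                      # no one is waiting for the GIL

io = deque(["io-1"])
cpu = deque(["cpu-1", "cpu-2"])
print(pick_next(io, cpu))   # io-1: an I/O-bound waiter always wins
print(pick_next(io, cpu))   # cpu-1: only CPU-bound waiters remain
```

The starvation risk noted under "Negative Aspects" is visible in this sketch: as long as io_waiters keeps refilling, cpu_waiters is never consulted.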
[issue8410] Fix emulated lock to be 'fair'
David Beazley added the comment: I know that multicore processors are all the rage right now, but one thing that concerns me about this patch is its effect on single-core systems. If you apply this on a single CPU, are threads just going to sit there and thrash as they rapidly context switch? (Something that does not occur now.)

Also, I've done a few experiments and on a single-core Windows-XP machine, the GIL does not appear to have any kind of fairness to it (as previously claimed here). Yet, if I run the same experiments on a dual-core PC, the GIL is suddenly fair. So, somewhere in that lock implementation, it seems to adapt to the environment. Do we have to try to emulate that behavior in Unix? If so, how do you do it without it turning into a huge coding mess?

I'll just mention that the extra context switching introduced by fair locking has a rather pronounced effect on performance that should be considered even on multicore. I posted some benchmarks in Issue 8299 for Linux and OS-X. In those benchmarks, the introduction of fair GIL locking makes CPU-bound threads run about 2-5 times slower than before on Linux and OS-X.

-- nosy: +dabeaz
[issue8299] Improve GIL in 2.7
David Beazley added the comment: Here are the results of running the fair.py test on a Mac OS-X system using a "fair" GIL implementation (modified condition variable):

[ Fair GIL, Dual-Core, OS-X ]
Sequential execution
slow: 5.490943 (0 left)
fast: 0.369257 (0 left)
Threaded execution
slow: 6.122093 (0 left)
fast: 6.179179 (0 left)
Treaded, balanced execution:
fast C: 3.345452 (0 left)
fast B: 3.389235 (0 left)
fast A: 3.426407 (0 left)
Treaded, balanced execution, with quickstop:
fast C: 2.557972 (0 left)
fast B: 2.558551 (35087 left)
fast A: 2.558914 (13142 left)

Here is the same test with the original GIL:

[ Unfair GIL, original implementation ]
Sequential execution
slow: 5.444754 (0 left)
fast: 0.361340 (0 left)
Threaded execution
slow: 5.542008 (0 left)
fast: 5.225690 (0 left)
Treaded, balanced execution:
fast C: 1.381929 (0 left)
fast B: 1.499969 (0 left)
fast A: 1.549571 (0 left)
Treaded, balanced execution, with quickstop:
fast A: 1.284043 (0 left)
fast B: 1.295507 (32490 left)
fast C: 1.294981 (274777 left)

Please observe that the performance of threads under the "fair" GIL is significantly worse than with the "unfair" GIL. Having studied this in more depth, I have to say that I would much rather have fast-running unfair threads than slow-running fair threads. Although I agree that there are other benefits to fairness, they just aren't enough to compensate for the huge performance hit.
[issue8299] Improve GIL in 2.7
David Beazley added the comment: As a followup, since I'm not sure anyone here actually tried a fair GIL on Linux, I incorporated your suggested fairness patch into the condition-variable version of the GIL (using this pseudocode you wrote as a guide):

with gil.cond:
    if gil.n_waiting or gil.locked:
        gil.n_waiting += 1
        while True:
            gil.cond.wait()   # always wait at least once
            if not gil.locked:
                break
        gil.n_waiting -= 1
    gil.locked = True

I did some tests on this and it does appear to exhibit fairness. Here are the results of running the fair.py test with a fair GIL on my Linux system:

[ Fair GIL, Linux ]
Sequential execution
slow: 6.246764 (0 left)
fast: 0.465102 (0 left)
Threaded execution
slow: 7.534725 (0 left)
fast: 7.674448 (0 left)
Treaded, balanced execution:
fast A: 10.415756 (0 left)
fast B: 10.456502 (0 left)
fast C: 10.520457 (0 left)
Treaded, balanced execution, with quickstop:
fast B: 8.423304 (0 left)
fast A: 8.409794 (16016 left)
fast C: 8.381977 (9162 left)

If I switch back to the unfair GIL, this is the result:

[ Unfair GIL, original implementation, Linux ]
Sequential execution
slow: 6.164739 (0 left)
fast: 0.422626 (0 left)
Threaded execution
slow: 6.570084 (0 left)
fast: 6.690927 (0 left)
Treaded, balanced execution:
fast A: 1.994143 (0 left)
fast C: 2.014925 (0 left)
fast B: 2.073212 (0 left)
Treaded, balanced execution, with quickstop:
fast A: 1.614533 (0 left)
fast C: 1.607324 (377323 left)
fast B: 1.625987 (111451 left)

Probably the main thing to notice is the huge increase in performance over the fair GIL. For instance, the balanced execution test runs about 5 times faster. Here are the two tests repeated with checkinterval = 1000.
[ Fair GIL, checkinterval = 1000 ]
Sequential execution
slow: 6.175320 (0 left)
fast: 0.424410 (0 left)
Threaded execution
slow: 6.505094 (0 left)
fast: 6.746649 (0 left)
Treaded, balanced execution:
fast A: 2.243123 (0 left)
fast B: 2.416043 (0 left)
fast C: 2.442475 (0 left)
Treaded, balanced execution, with quickstop:
fast A: 1.565914 (0 left)
fast C: 1.514024 (81254 left)
fast B: 1.531937 (63740 left)

[ Unfair GIL, checkinterval = 1000 ]
Sequential execution
slow: 6.258882 (0 left)
fast: 0.411590 (0 left)
Threaded execution
slow: 6.255027 (0 left)
fast: 0.409412 (0 left)
Treaded, balanced execution:
fast A: 1.291007 (0 left)
fast C: 1.135373 (0 left)
fast B: 1.437205 (0 left)
Treaded, balanced execution, with quickstop:
fast C: 1.331775 (0 left)
fast A: 1.418670 (54841 left)
fast B: 1.403853 (208732 left)

Here, the unfair GIL is still quite a bit faster on raw performance. I tried kicking the check interval up to 1 and the unfair GIL still won by a pretty significant margin on raw speed of completing the different tasks. I've attached a copy of the thread_pthread.h file I modified for this test. It's from Python-2.6.4.

-- Added file: http://bugs.python.org/file16958/thread_pthread.h
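The fairness pseudocode quoted in this message translates directly into a small runnable Python class. This is a sketch of the same logic at the Python level, not the C code that went into thread_pthread.h; the names come from the pseudocode.

```python
import threading

class FairGIL:
    """Runnable version of the quoted fairness pseudocode. A thread that
    finds the lock held (or other threads already waiting) must wait on
    the condition at least once, so a releasing thread cannot immediately
    barge back in ahead of existing waiters."""

    def __init__(self):
        self.cond = threading.Condition()
        self.locked = False
        self.n_waiting = 0

    def acquire(self):
        with self.cond:
            if self.n_waiting or self.locked:
                self.n_waiting += 1
                while True:
                    self.cond.wait()    # always wait at least once
                    if not self.locked:
                        break
                self.n_waiting -= 1
            self.locked = True

    def release(self):
        with self.cond:
            self.locked = False
            self.cond.notify()          # wake one waiter, if any
```

The forced wait is exactly where the performance cost shown in the benchmarks comes from: every contended hand-off pays for a condition-variable sleep and wakeup instead of letting the releasing thread reacquire immediately.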
[issue8299] Improve GIL in 2.7
David Beazley added the comment: I'm definitely sure that semaphores were being used in my test---I stuck a print statement inside the code that creates locks just to make sure it was using the semaphore version :-). Unfortunately, at this point I think most of this discussion is academic since no change is likely to be incorporated into Python 2.7. I can definitely see where fairness might help I/O performance if there is only 1 CPU bound thread. I just don't know for other situations. For example, if you have a server where it's all I/O-bound threads, but it suddenly comes under extreme load (e.g., slashdot effect), does a fair GIL help or hurt with that? I just don't know. In the big picture, all of the issues raised here should be on the minds of people fixing the GIL in py3k though. It's just one more aspect of why fixing the GIL is hard.
[issue8299] Improve GIL in 2.7
David Beazley added the comment: One other comment. Running the modified fair.py file on my Linux system using Python compiled with semaphores shows that they are *definitely* not fair. Here's the relevant part of your test:

Treaded, balanced execution, with quickstop:
fast C: 1.580815 (0 left)
fast B: 1.636923 (158919 left)
fast A: 1.788634 (310323 left)