[issue7946] Convoy effect with I/O bound threads and New GIL
Change by David Beazley : -- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue7946> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue16894] Function attribute access doesn't invoke methods in dict subclasses
Change by David Beazley : -- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue16894> ___
[issue24844] Python 3.5rc1 compilation error with Apple clang 4.2 included with Xcode 4
Change by David Beazley : -- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue24844> ___
[issue32810] Expose ags_gen and agt_gen in asynchronous generators
Change by David Beazley : -- stage: patch review -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue32810> ___
[issue27436] Strange code in selectors.KqueueSelector
Change by David Beazley : -- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue27436> ___
[issue16132] ctypes incorrectly encodes .format attribute of memory views
Change by David Beazley : -- stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue16132> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: About nine years ago, I stood in front of a room of Python developers, including many core developers, and gave a talk about the problem described in this issue. It included some live demos and discussion of a possible fix. https://www.youtube.com/watch?v=fwzPF2JLoeU Based on subsequent interest, I think it's safe to say that this issue will never be fixed. Probably best to close this issue. -- ___ Python tracker <https://bugs.python.org/issue7946> ___
[issue33014] Clarify doc string for str.isidentifier()
David Beazley added the comment:

    s = 'Some String'
    s.isalnum()
    s.isalpha()
    s.isdecimal()
    s.isdigit()
    s.isidentifier()
    s.islower()
    s.isnumeric()
    s.isprintable()
    s.isspace()
    s.istitle()
    s.isupper()

Not really sure where I would have gotten the idea that it might be referring to s.iskeyword(). But what do I know? I'll stop submitting further suggestions.

--
___ Python tracker <https://bugs.python.org/issue33014> ___
[issue33014] Clarify doc string for str.isidentifier()
David Beazley added the comment: That wording isn't much better in my opinion. If I'm sitting there looking at methods like str.isdigit(), str.isnumeric(), str.isascii(), and str.isidentifier(), seeing keyword.iskeyword() makes me think it's a method regardless of whether you label it a function or method. Explicitly stating that "keyword" is actually the keyword module makes it much clearer. Or at least include the argument as well: keyword.iskeyword(kw). It really should be a string method though ;-) -- ___ Python tracker <https://bugs.python.org/issue33014> ___
[issue33014] Clarify doc string for str.isidentifier()
New submission from David Beazley :

This is a minor nit, but the doc string for str.isidentifier() states:

    Use keyword.iskeyword() to test for reserved identifiers such as "def" and "class".

At first glance, I thought that it meant you'd do this (doesn't work):

    'def'.iskeyword()

As opposed to this:

    import keyword
    keyword.iskeyword('def')

Perhaps a clarification that "keyword" refers to the keyword module could be added. Or better yet, just make iskeyword() a string method ;-).

--
assignee: docs@python
components: Documentation
messages: 313335
nosy: dabeaz, docs@python
priority: normal
severity: normal
status: open
title: Clarify doc string for str.isidentifier()
versions: Python 3.7
___ Python tracker <https://bugs.python.org/issue33014> ___
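A short sketch of the distinction the report is making — the keyword test lives in the keyword module, not on str, even though the method-like spelling in the doc string suggests otherwise:

```python
import keyword

# 'def' is lexically a valid identifier, but it is reserved:
print('def'.isidentifier())        # True
print(keyword.iskeyword('def'))    # True

# An ordinary name is an identifier but not a keyword:
print('spam'.isidentifier())       # True
print(keyword.iskeyword('spam'))   # False

# str has no iskeyword() method, which is the source of the confusion:
print(hasattr(str, 'iskeyword'))   # False
```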
[issue32810] Expose ags_gen and agt_gen in asynchronous generators
David Beazley added the comment: I've attached a file that illustrates the issue. (Side thought: this would be nice to have in inspect or traceback) -- Added file: https://bugs.python.org/file47434/agen.py ___ Python tracker <https://bugs.python.org/issue32810> ___
[issue32810] Expose ags_gen and agt_gen in asynchronous generators
New submission from David Beazley : Libraries such as Curio and asyncio provide a debugging facility that allows someone to view the call stack of generators/coroutines. For example, the _task_get_stack() function in asyncio/base_tasks.py. This works by manually walking up the chain of coroutines (by following cr_frame and gi_frame links as appropriate). The only problem is that it doesn't work if control flow falls into an async generator because an "async_generator_asend" instance is encountered and there is no meaningful way to proceed any further with stack inspection. This problem could be fixed if "async_generator_asend" and "async_generator_athrow" instances exposed the underlying "ags_gen" and "agt_gen" attribute that's held inside the corresponding C structures in Objects/genobject.c. Note: I made a quick and dirty "hack" to Python to extract "ags_gen" and verified that having this information would allow me to get complete stack traces in Curio. -- messages: 311906 nosy: dabeaz priority: normal severity: normal status: open title: Expose ags_gen and agt_gen in asynchronous generators type: enhancement versions: Python 3.7 ___ Python tracker <https://bugs.python.org/issue32810> ___
[issue32690] Return function locals() in order of creation?
David Beazley added the comment:

Some context: I noticed this while discussing (in a course) a programming trick involving instance initialization and locals() that I'd encountered in the past:

    def _init(locs):
        self = locs.pop('self')
        for name, val in locs.items():
            setattr(self, name, val)

    class Spam:
        def __init__(self, a, b, c, d):
            _init(locals())

In looking at locals(), it was coming back in reverse order of method arguments (d, c, b, a, self). To be honest, it wasn't a critical matter, but more of an odd curiosity in light of recent dictionary ordering. I could imagine writing a slightly more general version of _init() that didn't depend on a named 'self' argument if order was preserved:

    def _init(locs):
        items = list(locs.items())
        _, self = items[0]
        for name, val in items[1:]:
            setattr(self, name, val)

Personally, I don't think the issue Nathaniel brings up is worth worrying about because it would be such a weird edge case on something that is already an edge case. Returning variables in "lexical order"--meaning the order in which first encountered in the source--seems pretty sensible to me.

--
nosy: +dabeaz
___ Python tracker <https://bugs.python.org/issue32690> ___
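The first _init() variant from the comment can be run as-is today, since it keys off the 'self' name and therefore doesn't depend on the locals() ordering under discussion. A self-contained sketch:

```python
def _init(locs):
    # Pop 'self' out, then attach every remaining local as an attribute.
    self = locs.pop('self')
    for name, val in locs.items():
        setattr(self, name, val)

class Spam:
    def __init__(self, a, b, c, d):
        _init(locals())

s = Spam(1, 2, 3, 4)
print(s.a, s.b, s.c, s.d)   # 1 2 3 4
```

The order-dependent second variant is the one that motivates the question in this issue: it assumes items[0] is 'self', which only holds if locals() preserves creation order.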
[issue27436] Strange code in selectors.KqueueSelector
David Beazley added the comment: I don't see any possible way that you would ever get events = EVENT_READ | EVENT_WRITE if the flag is a single value (e.g., KQ_FILTER_READ) and the flag itself is not a bitmask. Only one of those == tests will ever be True. There is no need to use |=. Unless I'm missing something. -- ___ Python tracker <http://bugs.python.org/issue27436> ___
[issue27436] Strange code in selectors.KqueueSelector
David Beazley added the comment: If the KQ_FILTER constants aren't bitmasks, it seems that the code could be simplified to the last version then. At the least, it would remove a few unnecessary calculations. Again, a very minor thing (I only stumbled onto it by accident really). -- ___ Python tracker <http://bugs.python.org/issue27436> ___
[issue27436] Strange code in selectors.KqueueSelector
New submission from David Beazley:

Not so much a bug, but an observation based on reviewing the implementation of the selectors.KqueueSelector class. In that class there is the select() method:

    def select(self, timeout=None):
        timeout = None if timeout is None else max(timeout, 0)
        max_ev = len(self._fd_to_key)
        ready = []
        try:
            kev_list = self._kqueue.control(None, max_ev, timeout)
        except InterruptedError:
            return ready
        for kev in kev_list:
            fd = kev.ident
            flag = kev.filter
            events = 0
            if flag == select.KQ_FILTER_READ:
                events |= EVENT_READ
            if flag == select.KQ_FILTER_WRITE:
                events |= EVENT_WRITE
            key = self._key_from_fd(fd)
            if key:
                ready.append((key, events & key.events))
        return ready

The for-loop looks like it might be checking flags against some kind of bit-mask in order to build events. However, if so, the code just looks wrong. Wouldn't it use the '&' operator (or some variant) instead of '==' like this?

    for kev in kev_list:
        fd = kev.ident
        flag = kev.filter
        events = 0
        if flag & select.KQ_FILTER_READ:
            events |= EVENT_READ
        if flag & select.KQ_FILTER_WRITE:
            events |= EVENT_WRITE

If it's not a bit-mask, then wouldn't the code be simplified by something like this?

    for kev in kev_list:
        fd = kev.ident
        flag = kev.filter
        if flag == select.KQ_FILTER_READ:
            events = EVENT_READ
        elif flag == select.KQ_FILTER_WRITE:
            events = EVENT_WRITE

Again, not sure if this is a bug or not. It's just something that looks weirdly off.

--
components: Library (Lib)
messages: 269676
nosy: dabeaz
priority: normal
severity: normal
status: open
title: Strange code in selectors.KqueueSelector
type: enhancement
versions: Python 3.6
___ Python tracker <http://bugs.python.org/issue27436> ___
[issue25476] close() behavior on non-blocking BufferedIO objects with sockets
David Beazley added the comment: Please don't make flush() close the file on a BlockingIOError. That would be an unfortunate mistake and make it impossible to implement non-blocking I/O correctly with buffered I/O. -- ___ Python tracker <http://bugs.python.org/issue25476> ___
[issue25476] close() behavior on non-blocking BufferedIO objects with sockets
New submission from David Beazley:

First comment: In the I/O library, there is documented behavior for how things work in the presence of non-blocking I/O. For example, read/write methods returning None on raw file objects. Methods on BufferedIO instances raise a BlockingIOError for operations that can't complete.

However, the implementation of close() is currently broken. If buffered I/O is being used and a file is closed, it's possible that the close will fail due to a BlockingIOError occurring as buffered data is flushed to output. However, in this case, the file is closed anyways and there is no possibility to retry. Here is an example to illustrate:

    >>> from socket import *
    >>> s = socket(AF_INET, SOCK_STREAM)
    >>> s.connect(('somehost', port))
    >>> s.setblocking(False)
    >>> f = s.makefile('wb', buffering=1000)    # Large buffer
    >>> f.write(b'x'*100)
    >>>

Now, watch carefully:

    >>> f
    <_io.BufferedWriter name=4>
    >>> f.closed
    False
    >>> f.close()
    Traceback (most recent call last):
      File "", line 1, in
    BlockingIOError: [Errno 35] write could not complete without blocking
    >>> f
    <_io.BufferedWriter name=-1>
    >>> f.closed
    True
    >>>

I believe this can be fixed by changing a single line in Modules/_io/bufferedio.c:

    --- bufferedio_orig.c   2015-10-25 16:40:22.0 -0500
    +++ bufferedio.c        2015-10-25 16:40:35.0 -0500
    @@ -530,10 +530,10 @@
         res = PyObject_CallMethodObjArgs((PyObject *)self, _PyIO_str_flush, NULL);
         if (!ENTER_BUFFERED(self))
             return NULL;
    -    if (res == NULL)
    -        PyErr_Fetch(&exc, &val, &tb);
    -    else
    -        Py_DECREF(res);
    +    if (res == NULL)
    +        goto end;
    +    else
    +        Py_DECREF(res);
         res = PyObject_CallMethodObjArgs(self->raw, _PyIO_str_close, NULL);

With this patch, the close() method can be retried as appropriate until all buffered data is successfully written.
-- components: IO messages: 253438 nosy: dabeaz priority: normal severity: normal status: open title: close() behavior on non-blocking BufferedIO objects with sockets type: behavior versions: Python 3.5 ___ Python tracker <http://bugs.python.org/issue25476> ___
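With the patched behavior described in the report (close() leaves the file open and retryable on BlockingIOError), a caller could drive the retry. A minimal sketch; the helper name and the stand-in "file" class below are made up for illustration, taking the place of a real non-blocking BufferedWriter:

```python
import time

def close_with_retry(f, attempts=10, delay=0.01):
    """Retry f.close() until buffered data flushes or attempts run out.

    Assumes the patched behavior: close() raises BlockingIOError but
    leaves the file open, so a later retry can finish the flush.
    """
    for _ in range(attempts):
        try:
            f.close()
            return True
        except BlockingIOError:
            time.sleep(delay)   # wait for the fd to become writable again
    return False

# Stand-in that fails to flush twice before succeeding:
class FlakyFile:
    def __init__(self, failures=2):
        self.failures = failures
        self.closed = False
    def close(self):
        if self.failures > 0:
            self.failures -= 1
            raise BlockingIOError(35, 'write could not complete without blocking')
        self.closed = True

f = FlakyFile()
print(close_with_retry(f), f.closed)   # True True
```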
[issue7322] Socket timeout can cause file-like readline() method to lose data
David Beazley added the comment: This bug is still present in Python 3.5, but it occurs if you attempt to do a readline() on a socket that's in non-blocking mode. In that case, you probably DO want to retry at a later time (unlike the timeout case). -- ___ Python tracker <http://bugs.python.org/issue7322> ___
[issue24975] Python 3.5 can't compile AST involving PEP 448 unpacking
New submission from David Beazley:

The compile() function is not able to compile an AST created from code that uses some of the new unpacking generalizations in PEP 448. Example:

    code = '''
    a = { 'x':1, 'y':2 }
    b = { **a, 'z': 3 }
    '''

    # Works
    ccode = compile(code, '', 'exec')

    # Crashes
    import ast
    tree = ast.parse(code)
    ccode = compile(tree, '', 'exec')

Error traceback:

    Traceback (most recent call last):
      File "bug.py", line 11, in
        ccode = compile(tree, '', 'exec')
    ValueError: None disallowed in expression list

Note: This bug makes it impossible to try generalized unpacking examples interactively in IPython.

--
components: Library (Lib)
messages: 249442
nosy: dabeaz
priority: normal
severity: normal
status: open
title: Python 3.5 can't compile AST involving PEP 448 unpacking
type: crash
versions: Python 3.5
___ Python tracker <http://bugs.python.org/issue24975> ___
[issue24844] Python 3.5rc1 compilation error on OS X 10.8
New submission from David Beazley:

Just a note that Python-3.5.0rc1 fails to compile on Mac OS X 10.8.5 with the following compiler:

    bash$ clang --version
    Apple LLVM version 4.2 (clang-425.0.28) (based on LLVM 3.2svn)
    Target: x86_64-apple-darwin12.6.0
    Thread model: posix
    bash$

Here is the resulting compilation error:

    /usr/bin/clang -c -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -Werror=declaration-after-statement -I. -IInclude -I./Include -DPy_BUILD_CORE -o Python/ceval.o Python/ceval.c
    fatal error: error in backend: Cannot select: 0x102725710: i8,ch = AtomicSwap 0x102c45ce0, 0x102725010, 0x102725510 [ID=7]
      0x102725010: i64 = X86ISD::WrapperRIP 0x102723710 [ID=6]
      0x102723710: i64 = TargetGlobalAddress 0 [ID=4]
      0x102725510: i8 = Constant<1> [ID=2]
    In function: take_gil
    make: *** [Python/ceval.o] Error 1

Problem can be fixed by commenting out the following line in pyconfig.h:

    /* Has builtin atomics */
    // #define HAVE_BUILTIN_ATOMIC 1

Not really sure what to advise. To my eyes, it looks like a bug in clang or Xcode. So, maybe this is more just an FYI that source builds might fail on certain older Mac systems.

--
messages: 248415
nosy: dabeaz
priority: normal
severity: normal
status: open
title: Python 3.5rc1 compilation error on OS X 10.8
type: compile error
versions: Python 3.5
___ Python tracker <http://bugs.python.org/issue24844> ___
[issue23441] rlcompleter: tab on empty prefix => insert spaces
David Beazley added the comment: It's still broken on Python 3.5b4. -- ___ Python tracker <http://bugs.python.org/issue23441> ___
[issue23441] rlcompleter: tab on empty prefix => insert spaces
David Beazley added the comment: Wanted to add: I see this as being about the same as having a broken window pane on the front of Python 3. Maybe there are awesome things inside, but it makes a bad first impression on anyone who dares to use the interactive console. -- ___ Python tracker <http://bugs.python.org/issue23441> ___
[issue23441] rlcompleter: tab on empty prefix => insert spaces
David Beazley added the comment: Frivolity aside, I really wish this issue would get more traction and a fix. Indentation is an important part of the Python language (obviously). A pretty standard way to indent is to hit "tab" in whatever environment you're using to edit Python code. Yet, at the interactive prompt, tab doesn't actually indent on a blank line. Instead, it autocompletes the builtins. Aside from it being highly annoying (as previously mentioned), it is also an embarrassment. Newcomers to Python will very often try things out using the stock interpreter before moving on to more sophisticated environments. The fact that tab is broken from the get-go leaves a pretty sour impression when not even the most basic tutorial examples work at the interactive console (and keep in mind that whitespace sensitivity is probably already an issue on their minds). Experienced Python users coming from Python 2 to Python 3 are going to find that tab is busted in Python 3. Well, of course it's busted because everything is busted in Python 3. "Wow, this really sucks as bad as everyone says" they'll say. So, with that as context, I'm really hoping I don't have to watch people use a busted tab key for another entire release cycle of Python 3 as I did for Python-3.4. I have no particular thoughts about the specifics (tabs vs. spaces) or the amount of indentation. It's the autocomplete on empty line that's the issue. -- ___ Python tracker <http://bugs.python.org/issue23441> ___
[issue23441] rlcompleter: tab on empty prefix => insert spaces
David Beazley added the comment: For what it's worth, I'm kind of tired having to hack site.py every time I upgrade Python in order to avoid being shown 6000 choices when hitting tab on an empty line. It is crazy annoying. -- ___ Python tracker <http://bugs.python.org/issue23441> ___
[issue23441] rlcompleter: tab on empty prefix => insert spaces
David Beazley added the comment: This is a problem that will never be fixed. Sure, it was a release blocker in Python 3.4. It wasn't fixed. It is a release blocker in Python 3.5. It won't be fixed. They'll just tell you to indent using the spacebar as generations of typists have done for centuries. It won't be fixed. Why don't you just use ipython or bpython? It won't be fixed. Doesn't your IDE take care of this? It won't be fixed. By the way, backspace will never work right either. No, that will never be fixed. Did we mention that this will never be fixed? You can fix it! Yes, you! No, I mean you! Yes, yes, you can. Simply edit the file Lib/site.py and comment out the line that does this: # enablerlcompleter() Problem solved. All is well. By the way. This problem will never be fixed. That is all. -- nosy: +dabeaz ___ Python tracker <http://bugs.python.org/issue23441> ___
[issue23642] Interaction of ModuleSpec and C Extension Modules
David Beazley added the comment: This is great news. Read the PEP draft and think this is a very good thing to be addressing. Thanks, Brett. -- ___ Python tracker <http://bugs.python.org/issue23642> ___
[issue23642] Interaction of ModuleSpec and C Extension Modules
David Beazley added the comment: Note: Might be related to Issue 19713. -- ___ Python tracker <http://bugs.python.org/issue23642> ___
[issue23642] Interaction of ModuleSpec and C Extension Modules
David Beazley added the comment: Sorry. I take back the previous message. It still doesn't quite do what I want. Anyways, any insight or thoughts about this would be appreciated ;-). -- ___ Python tracker <http://bugs.python.org/issue23642> ___
[issue23642] Interaction of ModuleSpec and C Extension Modules
David Beazley added the comment:

Final comment. It seems that one can generally avoid a lot of nastiness if importlib.reload() is used instead. For example:

    >>> mod = sys.modules[spec.name] = module_from_spec(spec)
    >>> importlib.reload(mod)

This works for both source and extension modules and completely avoids the need to worry about the exec_module()/load_module() warts. Wouldn't say it's an obvious approach though ;-).

--
___ Python tracker <http://bugs.python.org/issue23642> ___
[issue23642] Interaction of ModuleSpec and C Extension Modules
New submission from David Beazley:

I have been investigating some of the new importlib machinery and the addition of ModuleSpec objects. I am a little curious about the intended handling of C extension modules going forward.

Backing up for a moment, consider a pure Python module. It seems that I can do things like this to bring a module into existence (some steps involving sys.modules omitted):

    >>> from importlib.util import find_spec, module_from_spec
    >>> spec = find_spec('socket')
    >>> socket = module_from_spec(spec)
    >>> spec.loader.exec_module(socket)
    >>>

However, it all gets "weird" with C extension modules. For example, you can perform the first few steps:

    >>> spec = find_spec('math')
    >>> spec
    ModuleSpec(name='math', loader=<_frozen_importlib.ExtensionFileLoader object at 0x1012122b0>, origin='/usr/local/lib/python3.5/lib-dynload/math.so')
    >>> math = module_from_spec(spec)
    >>> math
    >>> dir(math)
    ['__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__']

As you can see, you get a fresh "unloaded" module here. However, if you try to bring in the module contents, things get screwy:

    >>> spec.loader.exec_module(math)
    Traceback (most recent call last):
      File "", line 1, in
    AttributeError: 'ExtensionFileLoader' object has no attribute 'exec_module'
    >>>

Yes, this is the old legacy interface in action--there is no exec_module() method. You can always fall back to load_module() like this:

    >>> spec.loader.load_module(spec.name)
    >>>

The problem here is that it creates a brand new module and ignores the one that was previously created by module_from_spec(). That module is still empty:

    >>> dir(math)
    ['__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__']
    >>>

I realize that I'm treading into a swamp of legacy interfaces and some pretty complex machinery here. However, here's my question: are C extension modules always going to be a special case that needs to be considered in code that interacts with the import system? Specifically, will it need to be special-cased to use load_module() instead of the module_from_spec()/exec_module() combination?

I suppose the question might also apply to built-in and frozen modules as well (although I haven't investigated that so much). Mainly, I'm just trying to gain some insight from the devs as to the overall direction where the import implementation is going with this.

P.S. ModuleSpecs are cool. +1

--
components: Interpreter Core
messages: 237872
nosy: dabeaz
priority: normal
severity: normal
status: open
title: Interaction of ModuleSpec and C Extension Modules
type: behavior
versions: Python 3.5
___ Python tracker <http://bugs.python.org/issue23642> ___
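The pure-Python loading dance described in the report runs end to end on modern Python; a self-contained sketch (using 'json' as an arbitrary source-backed stdlib module):

```python
import sys
from importlib.util import find_spec, module_from_spec

spec = find_spec('json')
mod = module_from_spec(spec)     # fresh, "unloaded" module object
sys.modules[spec.name] = mod     # register before executing, as importers do
spec.loader.exec_module(mod)     # source loaders implement exec_module()

print(mod.dumps({'x': 1}))   # {"x": 1}
```

Note that the later messages on this issue conclude that importlib.reload(mod) is a way to paper over the exec_module()/load_module() split for extension modules.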
[issue15986] memoryview: expose 'buf' attribute
David Beazley added the comment: One of the other goals of memoryviews is to make memory access less hacky. To that end, it would be nice to have the .buf attribute available given that all of the other attributes are already there. I don't see why people should need to do some even more hacky hack thing on top of hacks just to expose the pointer (which they'll figure out how to do anyway if they actually need to use it for something). -- ___ Python tracker <http://bugs.python.org/issue15986> ___
[issue15986] memoryview: expose 'buf' attribute
David Beazley added the comment: Well, a lot of things in this big bad world are dangerous. Don't see how this is any more dangerous than all of the peril that tools like ctypes and llvmpy already provide. -- ___ Python tracker <http://bugs.python.org/issue15986> ___
[issue15986] memoryview: expose 'buf' attribute
David Beazley added the comment: There are other kinds of libraries that might want to access the .buf attribute. For example, the llvmpy extension. Exposing it would be useful. -- ___ Python tracker <http://bugs.python.org/issue15986> ___
[issue5845] rlcompleter should be enabled automatically
David Beazley added the comment:

Funny thing, this feature breaks the interactive interpreter in the most basic way on OS X systems. For example, the tab key won't even work to indent. You can't even type the most basic programs into the interactive interpreter. For example:

    >>> for i in range(10):
    ...     print(i)

Oh sure, you can make it work by typing the space bar a bunch of times, but it's extremely annoying. The only way I was able to get a working interactive interpreter on my machine was to manually edit site.py and remove the call to enablerlcompleter() from main(). I hope someone reconsiders this feature and removes it as default behavior.

--
nosy: +dabeaz
___ Python tracker <http://bugs.python.org/issue5845> ___
[issue18111] Add a default argument to min & max
David Beazley added the comment:

To me, the fact that

    m = max(s) if s else default

doesn't work with iterators alone makes this worthy of consideration. I would also note that min/max are the only reduction functions that don't have the ability to work with a possibly empty sequence. For example:

    >>> sum([])
    0
    >>> any([])
    False
    >>> all([])
    True
    >>> functools.reduce(lambda x,y: x+y, [], 0)
    0
    >>> math.fsum([])
    0.0
    >>>

--
___ Python tracker <http://bugs.python.org/issue18111> ___
[issue18111] Add a default argument to min & max
David Beazley added the comment: I could have used this feature myself somewhat recently. It was in some code involving document matching where zero or more possible candidates were assigned a score and I was trying to find the max score. The fact that an empty list was a possibility complicated everything because I had to add extra checks for it. max(scores, default=0) would have been a lot simpler. -- nosy: +dabeaz ___ Python tracker <http://bugs.python.org/issue18111> ___
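The default argument requested in this issue did land (min() and max() accept default= as of Python 3.4), so the patterns from these comments can be written directly:

```python
# An empty iterable returns the default instead of raising ValueError:
print(max([], default=0))           # 0

# Crucially, this also works for iterators, where the
# `max(s) if s else default` idiom cannot:
scores = (s for s in [])            # generator that yields nothing
print(max(scores, default=0))       # 0

# The default is ignored when the iterable is non-empty:
print(min([3, 1, 2], default=99))   # 1
```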
[issue16723] io.TextIOWrapper on urllib.request.urlopen terminates prematurely
David Beazley added the comment: I have run into this bug myself. Agree that a file-like object should never report itself as closed unless .close() has been explicitly called on it. HTTPResponse should not return itself as closed after the end-of-file has been reached. I think there is also a bug in the implementation of TextIOWrapper as well. Even if the underlying file reports itself as closed, previously read and buffered data should be processed first before reporting an error about the file being closed. -- ___ Python tracker <http://bugs.python.org/issue16723> ___
[issue16894] Function attribute access doesn't invoke methods in dict subclasses
New submission from David Beazley:

Suppose you subclass a dictionary:

    class mdict(dict):
        def __getitem__(self, index):
            print('Getting:', index)
            return super().__getitem__(index)

Now, suppose you define a function and perform these steps that reassign the function's attribute dictionary:

    >>> def foo():
    ...     pass
    ...
    >>> foo.__dict__ = mdict()
    >>> foo.x = 23
    >>> foo.x     # Observe: No output from overridden __getitem__
    23
    >>> type(foo.__dict__)
    >>> foo.__dict__
    {'x': 23}
    >>>

Carefully observe that access to foo.x does not invoke the overridden __getitem__() method in mdict. Instead, it just directly accesses the default __getitem__() on dict. Admittedly, this is a really obscure corner case. However, if the __dict__ attribute of a function can be legally reassigned, it might be nice for inheritance to work ;-).

--
components: Interpreter Core
messages: 179364
nosy: dabeaz
priority: normal
severity: normal
status: open
title: Function attribute access doesn't invoke methods in dict subclasses
type: behavior
versions: Python 3.3
___ Python tracker <http://bugs.python.org/issue16894> ___
[issue14965] super() and property inheritance behavior
David Beazley added the comment:

Just as a note, there is a distinct possibility that a "property" in a superclass could be some other kind of descriptor object that's not a property. To handle that case, the solution of

    super(self.__class__, self.__class__).x.fset(self, value)

would actually have to be rewritten as

    super(self.__class__, self.__class__).x.__set__(self, value)

That said, I agree it would be nice to have a simplified means of accomplishing this.

--
nosy: +dabeaz
___ Python tracker <http://bugs.python.org/issue14965> ___
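A runnable sketch of the pattern under discussion (the Base/Child classes here are made up for illustration): a subclass setter reaches the parent's property through the class-bound super() proxy and invokes the descriptor protocol explicitly.

```python
class Base:
    def __init__(self):
        self._x = 0

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value

class Child(Base):
    @property
    def x(self):
        return super().x       # reading works through super() directly

    @x.setter
    def x(self, value):
        # Plain assignment through super() is not supported, so the
        # parent descriptor must be invoked explicitly.  Using __set__()
        # rather than .fset also handles non-property descriptors,
        # which is the point of the comment above.
        super(Child, Child).x.__set__(self, 2 * value)

c = Child()
c.x = 5
print(c.x)   # 10
```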
[issue16254] PyUnicode_AsWideCharString() increases string size
David Beazley added the comment: Another note: the PyUnicode_AsUTF8String() function doesn't leave the UTF-8 encoded byte string behind on the original string object. I got into this thinking that PyUnicode_AsWideCharString() might have similar behavior. -- ___ Python tracker <http://bugs.python.org/issue16254> ___
[issue16254] PyUnicode_AsWideCharString() increases string size
David Beazley added the comment:

Maybe it's not a bug, but I still think it's undesirable. Basically, you have a function that allocates a buffer, fills it with data, and allows the buffer to be destroyed. Yet, as a side effect, it allocates a second buffer, fills it, and permanently attaches it to the original string object. Thus it makes the size of the string object blow up to a size substantially larger than it was before, with no way to reclaim the memory other than to delete the whole string.

Maybe this is some sort of rare event that doesn't matter, but maybe there's some bit of C extension code that is trying to pass a wchar_t array off to some external library. The extension writer is using the PyUnicode_AsWideCharString() function with the understanding that it creates a new array and that you have to destroy it. They understand that it's not super fast to have to make a copy, but it's better than nothing. What's unfortunate is that all of this attention to memory management doesn't reward the programmer, as a copy gets left behind on the string object anyway. For instance, I start with a 10 megabyte string, I pass it through a C extension function, and now the string is mysteriously using 50 megabytes of memory.

I think the idea of filling wstr, returning it, and clearing it (if originally NULL) would definitely work here. Actually, that's exactly what I want--don't fill in the wstr member if it's not set already. That way, it's possible for C extensions to temporarily get the wstr buffer, do something, and then toss it away without affecting the original string.

Another suggestion: an API function to simply clear wstr and the UTF-8 representation could also work. Again, this is for extension writers who want to pull data out of strings, but don't want to leave these memory side effects behind.

-- ___ Python tracker <http://bugs.python.org/issue16254> ___
[issue16254] PyUnicode_AsWideCharString() increases string size
David Beazley added the comment: I should quickly add, is there any way to simply have this function not keep the wchar_t buffer around afterwards? That would be great. -- ___ Python tracker <http://bugs.python.org/issue16254> ___
[issue16254] PyUnicode_AsWideCharString() increases string size
New submission from David Beazley:

The PyUnicode_AsWideCharString() function is described as creating a new buffer of type wchar_t allocated by PyMem_Alloc() (which must be freed by the user). However, if you use this function, it causes the size of the original string object to permanently increase. For example, suppose you had some extension code like this:

static PyObject *py_receive_wchar(PyObject *self, PyObject *args) {
    PyObject *obj;
    wchar_t *s;
    Py_ssize_t len;

    if (!PyArg_ParseTuple(args, "U", &obj)) {
        return NULL;
    }
    if ((s = PyUnicode_AsWideCharString(obj, &len)) == NULL) {
        return NULL;
    }
    /* Do nothing */
    PyMem_Free(s);
    Py_RETURN_NONE;
}

Now, try an experiment (assume that the above extension function is available as 'receive_wchar'):

>>> s = "Hell"*1000
>>> len(s)
4000
>>> import sys
>>> sys.getsizeof(s)
4049
>>> receive_wchar(s)
>>> sys.getsizeof(s)
20053
>>>

It seems that PyUnicode_AsWideCharString() may be filling in the wstr field of the associated PyASCIIObject structure from PEP 393 (I haven't verified). Once filled, it never seems to be discarded.

Background: I am trying to figure out how to convert from Unicode to (wchar_t, int *) in a way that doesn't cause a permanent increase in the memory footprint of the original Unicode object. Also, I'm trying to stay away from deprecated Unicode APIs.

-- components: Extension Modules, Interpreter Core, Unicode messages: 173089 nosy: dabeaz, ezio.melotti priority: normal severity: normal status: open title: PyUnicode_AsWideCharString() increases string size versions: Python 3.3 ___ Python tracker <http://bugs.python.org/issue16254> ___
[issue16132] ctypes incorrectly encodes .format attribute of memory views
New submission from David Beazley:

This is somewhat related to an earlier bug report concerning memory views, but as far as I can tell, ctypes is not encoding the '.format' attribute correctly in most cases. Consider this example. First, create a ctypes array:

>>> a = (ctypes.c_double * 3)(1,2,3)
>>> len(a)
3
>>> a[0]
1.0
>>> a[1]
2.0
>>>

Now, create a memory view for it:

>>> m = memoryview(a)
>>> len(m)
3
>>> m.itemsize
8
>>> m.ndim
1
>>> m.shape
(3,)
>>>

All looks well. However, if you try to do anything with the .format or access the items, it's completely broken:

>>> m.format
'(3)<d'
>>> m[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NotImplementedError: memoryview: unsupported format (3)<d
>>>

This is quite inconsistent with the behavior observed elsewhere. For example:

>>> import array
>>> b = array.array('d',[1,2,3])
>>> memoryview(b).format
'd'
>>> import numpy
>>> c = numpy.array([1,2,3],dtype='d')
>>> memoryview(c).format
'd'
>>>

As you can see, array libraries are using .format to encode the format of a single array item. ctypes is encoding the format of the entire array (all items). ctypes also includes endianness, which presents additional difficulties. This behavior affects both Python code that wants to use memoryviews and C extension code that wants to use the underlying buffer protocol to work with arrays in a generic way. Essentially, it cuts the use of ctypes off entirely unless you modify the underlying buffer handling code to special case it.

Suggested fix: have ctypes only encode the format for a single item in the case of arrays. Also, for items that are encoded using the native byte ordering, don't include an endianness modifier ('<','>', etc.). Including the byte order just complicates all of the handling code, because it has to be modified to a) know what the native byte ordering is and b) check multiple cases such as for "d" and "<d".

___ Python tracker <http://bugs.python.org/issue16132> ___
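The comparison above can be sketched in a self-contained form. What the ctypes view reports varies by interpreter version (this issue was later addressed), so only the array.array side is firm:

```python
import array
import ctypes

# array.array advertises a single-item format code in native order...
a = array.array('d', [1.0, 2.0, 3.0])
print(memoryview(a).format)     # 'd'
print(memoryview(a).itemsize)   # 8

# ...while ctypes, at the time of this report, advertised the format of
# the whole array with a byte-order prefix, e.g. '(3)<d'. The exact
# string printed here depends on the interpreter version.
c = (ctypes.c_double * 3)(1.0, 2.0, 3.0)
print(memoryview(c).format)
```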
[issue15944] memoryviews and ctypes
David Beazley added the comment: One followup note---I think it's fine to punt on cast('B') if the memoryview is non-contiguous. That's a rare case that's probably not as common. -- ___ Python tracker <http://bugs.python.org/issue15944> ___
[issue15944] memoryviews and ctypes
David Beazley added the comment:

There's probably a bigger discussion about memoryviews for a rainy day. However, the number one thing that would save all of this in my book would be to make sure cast('B') is universally supported regardless of format, including endianness--especially in the standard library. For example, being able to do this:

>>> a = array.array('d',[1.0, 2.0, 3.0, 4.0])
>>> m = memoryview(a).cast('B')
>>> m[0:4] = b'\x00\x01\x02\x03'
>>> a
array('d', [1.000112050316, 2.0, 3.0, 4.0])
>>>

Right now, it doesn't work for ctypes. For example:

>>> import ctypes
>>> a = (ctypes.c_double * 4)(1,2,3,4)
>>> a
<__main__.c_double_Array_4 object at 0x1006a7cb0>
>>> m = memoryview(a).cast('B')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: memoryview: source format must be a native single character format prefixed with an optional '@'
>>>

As some background, being able to work with a "byte" view of memory is important for a lot of problems involving I/O, data interchange, and related problems where being able to accurately construct/deconstruct the underlying memory buffers is more useful than actually interpreting their contents.

-- ___ Python tracker <http://bugs.python.org/issue15944> ___
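The array.array byte-patching idiom above can be made self-contained; struct.pack is used here only to generate a well-defined replacement byte pattern:

```python
import array
import struct

# cast('B') exposes the raw bytes of the array; writing through the
# byte view is visible through the original 'd' array.
a = array.array('d', [1.0, 2.0, 3.0, 4.0])
m = memoryview(a).cast('B')

# Overwrite element 0's 8 bytes with the native representation of 99.5.
m[0:8] = struct.pack('=d', 99.5)
print(a[0])   # 99.5
print(a[1])   # 2.0 (untouched)
```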
[issue15944] memoryviews and ctypes
David Beazley added the comment:

I should add that 0-dim indexing doesn't work as described either:

>>> import ctypes
>>> d = ctypes.c_double()
>>> m = memoryview(d)
>>> m[()]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NotImplementedError: memoryview: unsupported format <d

-- ___ Python tracker <http://bugs.python.org/issue15944> ___
[issue15944] memoryviews and ctypes
David Beazley added the comment:

Just to be specific, why is something like this not possible?

>>> d = ctypes.c_double()
>>> m = memoryview(d)
>>> m[0:8] = b'abcdefgh'
>>> d.value
8.540883223036124e+194
>>>

(Doesn't have to be exactly like this, but what's wrong with overwriting bytes with bytes of a compatible size?)

-- ___ Python tracker <http://bugs.python.org/issue15944> ___
[issue15944] memoryviews and ctypes
David Beazley added the comment: No, I want to be able to access the raw bytes sitting behind a memoryview as bytes without all of this casting and reinterpretation. Just show me the raw bytes. Not doubles, not ints, not structure packing, not copying into byte strings, or whatever. Is this really impossible? It sure seems so. -- ___ Python tracker <http://bugs.python.org/issue15944> ___
[issue15944] memoryviews and ctypes
David Beazley added the comment: I don't think memoryviews should be imposing any casting restrictions at all. It's low level. Get out of the way. -- ___ Python tracker <http://bugs.python.org/issue15944> ___
[issue15944] memoryviews and ctypes
David Beazley added the comment: Even with the <http://bugs.python.org/issue15944> ___
[issue15944] memoryviews and ctypes
David Beazley added the comment: I don't want to read the representation by copying it into a bytes object. I want direct access to the underlying memory--including the ability to modify it. As it stands now, it's completely useless. -- ___ Python tracker <http://bugs.python.org/issue15944> ___
[issue15944] memoryviews and ctypes
New submission from David Beazley:

I've been playing with the interaction of ctypes and memoryviews and am curious about intended behavior. Consider the following:

>>> import ctypes
>>> d = ctypes.c_double()
>>> m = memoryview(d)
>>> m.ndim
0
>>> m.shape
()
>>> m.readonly
False
>>> m.itemsize
8
>>>

As you can see, you have a memory view for the ctypes double object. However, the fact that it has a 0-dimension and no shape seems to cause all sorts of weird behavior. For instance, indexing and slicing don't work:

>>> m[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: invalid indexing of 0-dim memory
>>> m[:]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: invalid indexing of 0-dim memory
>>>

As such, you can't really seem to do anything interesting with the resulting memory view. For example, you can't pull data out of it. Nor can you overwrite the contents (i.e., replacing the contents with an 8-byte byte string). Attempting to cast the memory view to something else doesn't work either:

>>> d = ctypes.c_double()
>>> m = memoryview(d)
>>> m2 = m.cast('c')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: memoryview: source format must be a native single character format prefixed with an optional '@'
>>>

I must be missing something really obvious here. Is there no way to get access to the memory behind a ctypes object?

-- messages: 170477 nosy: dabeaz priority: normal severity: normal status: open title: memoryviews and ctypes type: behavior versions: Python 3.3 ___ Python tracker <http://bugs.python.org/issue15944> ___
[issue15546] Iteration breaks with bz2.open(filename,'rt')
David Beazley added the comment: File attached. The file can be read in its entirety in binary mode. -- Added file: http://bugs.python.org/file26673/access-log-0108.bz2 ___ Python tracker <http://bugs.python.org/issue15546> ___
[issue15546] Iteration breaks with bz2.open(filename,'rt')
New submission from David Beazley:

The bz2 library in Python 3.3b1 doesn't support iteration in text mode properly. Example:

>>> f = bz2.open('access-log-0108.bz2')
>>> next(f)    # Works
b'140.180.132.213 - - [24/Feb/2008:00:08:59 -0600] "GET /ply/ply.html HTTP/1.1" 200 97238\n'
>>> g = bz2.open('access-log-0108.bz2','rt')
>>> next(g)    # Fails
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>

-- components: Library (Lib) messages: 167299 nosy: dabeaz priority: normal severity: normal status: open title: Iteration breaks with bz2.open(filename,'rt') type: behavior versions: Python 3.3 ___ Python tracker <http://bugs.python.org/issue15546> ___
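A self-contained reproduction sketch, substituting generated data for the attached access log; in current releases both modes iterate correctly:

```python
import bz2
import os
import tempfile

# Build a small .bz2 file on the fly instead of using the attached log.
data = b"line one\nline two\n"
tmp = tempfile.NamedTemporaryFile(suffix=".bz2", delete=False)
tmp.write(bz2.compress(data))
tmp.close()

with bz2.open(tmp.name) as f:          # binary mode
    first_binary = next(f)             # b'line one\n'

with bz2.open(tmp.name, "rt") as g:    # text mode; raised StopIteration in 3.3b1
    first_text = next(g)               # 'line one\n'

os.unlink(tmp.name)
print(first_binary, first_text)
```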
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley added the comment:

Python 3.2 (r32:88445, Feb 20 2011, 21:51:21)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import gzip
>>> import io
>>> f = io.TextIOWrapper(gzip.open("file.gz"),encoding='latin-1')
>>> f.readline()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
io.UnsupportedOperation: read1
>>>

-- ___ Python tracker <http://bugs.python.org/issue10791> ___
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley added the comment: If I can find some time, I may take a look at this. I just noticed that similar problems arise trying to wrap TextIOWrapper around the file-like objects returned by urllib.request.urlopen as well. In the big picture, some discussion of what it means to be "file-like" might be in order. If something is "file-like" and binary, should that always imply that I be able to wrap a TextIOWrapper object around it in order to encode/decode text? I would argue "yes", but I'd be curious to know what others think. -- ___ Python tracker <http://bugs.python.org/issue10791> ___
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley added the comment: Bump. This is still broken in Python 3.2. -- ___ Python tracker <http://bugs.python.org/issue10791> ___
[issue11117] Implementing Async IO
David Beazley added the comment:

Glad you liked it! I think there is a bit of a cautionary tale in there, though. With aio_*, there is the promise of better performance, but you're also going to need a *LOT* of advance planning and thought to avoid creating a tangled coding nightmare with it.

Just as an aside, one of the uses of aio_* related functions is to implement parts of user-level thread libraries in C (e.g., pthreads, etc.). A library might use the asynchronous I/O callbacks as part of implementing non-kernel (green) threads. The code for doing this tends to be very low level and hairy, with lots of signal handling--for example, if you want to context-switch between two user-level threads in C, you usually do it inside a signal handler (i.e., you thread-switch inside the signal handler called in response to aio_* completions).

Whether it's feasible to expose aio_* all the way up to Python or not is an open question. I suspect it will be fraught with lots of tricky issues. In the end, it might just be easier to use threads. Nevertheless, you'll learn a lot about Python internals by working on this :-).

-- ___ Python tracker <http://bugs.python.org/issue11117> ___
[issue11117] Implementing Async IO
David Beazley added the comment: Anyone contemplating the use of aio_* functions should first go read "The Story of Mel". http://www.catb.org/jargon/html/story-of-mel.html -- nosy: +dabeaz ___ Python tracker <http://bugs.python.org/issue11117> ___
[issue7322] Socket timeout can cause file-like readline() method to lose data
David Beazley added the comment: Just wanted to say that I agree it's nonsense to continue reading on a socket that timed out (I'm not even sure what I might have been thinking when I first submitted this bug other than just experimenting with edge cases of the socket interface). It's still probably good to precisely specify what the behavior is in any case. -- ___ Python tracker <http://bugs.python.org/issue7322> ___
[issue10907] OS X installer: warn users of buggy Tcl/Tk in OS X 10.6
David Beazley added the comment: A comment from the training world: The instability of IDLE on the Mac makes teaching introductory Python courses a nightmare at the moment. Sure, one might argue that students should install an alternative editor, but then you usually end up with two problems instead of one. It would be great if IDLE just "worked" out of the box for starting out. Glad to see someone looking at this. -- nosy: +dabeaz ___ Python tracker <http://bugs.python.org/issue10907> ___
[issue7322] Socket timeout can cause file-like readline() method to lose data
David Beazley added the comment:

Have any other programming environments ever had a feature where a socket timeout returns an exception containing partial data? I'm not aware of one offhand, and speaking as a systems programmer, something like this might be somewhat unexpected.

My concern is that in the presence of timeouts, the programmer will be forced to reassemble the message themselves from fragments returned in the exception. However, one reason for using readline() in the first place is precisely so that you don't have to do that sort of thing.

Is there any reason why the input buffer can't be preserved across calls? You've already got a file-like wrapper around the socket. Just keep the unconsumed buffer in that instance.

-- nosy: +dabeaz ___ Python tracker <http://bugs.python.org/issue7322> ___
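The buffering policy being argued for can be illustrated with a hypothetical sketch (this is not the socket module's API; BufferedLineReader and fake_recv are invented names). Unconsumed bytes live in the wrapper object, so a timeout can simply be retried without losing data:

```python
# Hypothetical sketch: a line reader that preserves its input buffer
# across timeouts. `recv` is any callable that returns a chunk of
# bytes (empty bytes = EOF) or raises TimeoutError.
class BufferedLineReader:
    def __init__(self, recv):
        self._recv = recv
        self._buf = b""

    def readline(self):
        while b"\n" not in self._buf:
            chunk = self._recv()           # may raise TimeoutError...
            if not chunk:                  # EOF: drain whatever is left
                line, self._buf = self._buf, b""
                return line
            self._buf += chunk             # ...but self._buf survives it
        line, _, self._buf = self._buf.partition(b"\n")
        return line + b"\n"

# Simulate a socket that times out mid-line, then delivers the rest.
chunks = iter([b"GET / HT", None, b"TP/1.1\n"])
def fake_recv():
    item = next(chunks)
    if item is None:
        raise TimeoutError("timed out")
    return item

reader = BufferedLineReader(fake_recv)
try:
    reader.readline()                      # times out mid-line
except TimeoutError:
    pass                                   # partial data stays buffered
line = reader.readline()                   # retry succeeds, nothing lost
print(line)   # b'GET / HTTP/1.1\n'
```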
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley added the comment: Hmmm. Interesting. In the big picture, it might be an interesting project for someone (not necessarily the core devs) to sit down and refactor both of these modules so that they play nice with the Python 3 I/O system. Obviously that's a project outside the scope of this bug or the 3.2 release for that matter. -- ___ Python tracker <http://bugs.python.org/issue10791> ___
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley added the comment: Do Python devs really view gzip and bz2 as two totally different animals? They both have the same functionality and would be used for the same kinds of things. Maybe I'm missing something. -- ___ Python tracker <http://bugs.python.org/issue10791> ___
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley added the comment: C or not, wrapping a BZ2File instance with a TextIOWrapper to get text still seems like something that someone might want to do. I doubt it would take much modification to give BZ2File instances the required set of methods. -- ___ Python tracker <http://bugs.python.org/issue10791> ___
[issue10791] Wrapping TextIOWrapper around gzip files
David Beazley added the comment: It goes without saying that this also needs to be checked with the bz2 module. A quick check seems to indicate that it has the same problem. While you're at it, maybe someone could add an 'open' function to bz2 to make it symmetrical with gzip as well :-). -- ___ Python tracker <http://bugs.python.org/issue10791> ___
[issue10791] Wrapping TextIOWrapper around gzip files
New submission from David Beazley:

Is something like this supposed to work:

>>> import gzip
>>> import io
>>> f = io.TextIOWrapper(gzip.open("foo.gz"),encoding='ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: readable

In a nutshell--reading a .gz file as text.

-- messages: 124870 nosy: dabeaz priority: normal severity: normal status: open title: Wrapping TextIOWrapper around gzip files type: behavior versions: Python 3.2 ___ Python tracker <http://bugs.python.org/issue10791> ___
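For reference, a self-contained variant of the example using an in-memory stream instead of foo.gz. In current releases GzipFile provides the buffered-I/O methods TextIOWrapper needs (readable(), read1(), ...), so this now works:

```python
import gzip
import io

# Compress some text in memory, then read it back through a
# TextIOWrapper layered on a GzipFile.
raw = io.BytesIO(gzip.compress(b"uno\ndos\n"))
f = io.TextIOWrapper(gzip.GzipFile(fileobj=raw), encoding="ascii")
line = f.readline()
print(line)   # 'uno\n'
```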
[issue10783] struct.pack() and Unicode strings
David Beazley added the comment: Thanks everyone for looking at this! -- ___ Python tracker <http://bugs.python.org/issue10783> ___
[issue10783] struct.pack() and Unicode strings
David Beazley added the comment: As a user of Python 3, I would like to echo Victor's comment about fixing the API right now as opposed to having to deal with it later. I can only speak for myself, but I would guess that anyone using Python 3 already understands that it's bleeding edge and that the bytes/strings distinction is really important. If fixing this breaks some third party libraries, I say good--they shouldn't have been blindly passing Unicode into struct in the first place. Better to deal with it now when the number of users is relatively small. -- ___ Python tracker <http://bugs.python.org/issue10783> ___
[issue10783] struct.pack() and Unicode strings
David Beazley added the comment:

Actually, here's another one of my favorite examples:

>>> import struct
>>> struct.pack("s","\xf1")
b'\xc3'
>>>

Not only does this not encode the correct value, it doesn't even encode the entire UTF-8 encoding (just the first byte of it). Like I said, pity the poor bastard who puts something like that in their code and then spends the whole day trying to figure out where in the hell '\xf1' magically got turned into '\xc3'.

-- ___ Python tracker <http://bugs.python.org/issue10783> ___
[issue10783] struct.pack() and Unicode strings
David Beazley added the comment:

I encountered this issue in the context of distributed computing/interprocess communication involving binary-encoded records (and encoding/decoding such records using struct). At its core, this is all about I/O--something where encoding and decoding matter a lot. Frankly, it was quite surprising that a unicode string would silently pass through struct and turn into bytes.

IMHO, the fact that this is even possible encourages a sloppy usage of struct that favors programming convenience over correctness--something that's only going to end badly for the poor soul who passes non-ASCII characters into struct without knowing it.

A default encoding might be okay as long as it was set to something like ASCII or Latin-1 (not UTF-8). At least then you'd get an encoding error for characters that don't fit into a byte.

-- ___ Python tracker <http://bugs.python.org/issue10783> ___
[issue10783] struct.pack() and Unicode strings
David Beazley added the comment:

Why is it even encoding at all? Almost every other part of Python 3 forces you to be explicit about bytes/string conversion. For example:

    struct.pack("10s", x.encode('utf-8'))

Given that automatic conversion is documented, it's not clear what can be done at this point. However, there are very few other parts of Python 3 that perform implicit string-byte conversions like this (at least that I know of off-hand).

-- ___ Python tracker <http://bugs.python.org/issue10783> ___
[issue10783] struct.pack() and Unicode strings
David Beazley added the comment: Hmmm. Well, the docs seem to say that it's allowed and that it will be encoded as UTF-8. Given the treatment of Unicode/bytes elsewhere in Python 3, all I can say is that this behavior is rather surprising. -- ___ Python tracker <http://bugs.python.org/issue10783> ___
[issue10783] struct.pack() and Unicode strings
David Beazley added the comment:

Note: This is what happens in Python 2.6.4:

>>> import struct
>>> struct.pack("10s",u"Jalape\u00f1o")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
struct.error: argument for 's' must be a string
>>>

-- ___ Python tracker <http://bugs.python.org/issue10783> ___
[issue10783] struct.pack() and Unicode strings
New submission from David Beazley:

Is the struct.pack() function supposed to automatically encode Unicode strings into binary? For example:

>>> struct.pack("10s","Jalape\u00f1o")
b'Jalape\xc3\xb1o\x00'
>>>

This is Python 3.2b1.

-- components: Library (Lib) messages: 124727 nosy: dabeaz priority: normal severity: normal status: open title: struct.pack() and Unicode strings type: behavior versions: Python 3.2 ___ Python tracker <http://bugs.python.org/issue10783> ___
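For context, later 3.x releases resolved this issue by removing the implicit encoding; a sketch of the behavior in current releases, where the str-to-bytes conversion must be explicit:

```python
import struct

# 's' now requires a bytes object; the encoding step is explicit.
packed = struct.pack("10s", "Jalape\u00f1o".encode("utf-8"))
print(packed)   # b'Jalape\xc3\xb1o\x00'

# Passing a str directly is rejected with struct.error.
rejected = False
try:
    struct.pack("10s", "Jalape\u00f1o")
except struct.error:
    rejected = True
print(rejected)   # True
```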
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: Wow, that is a *really* intriguing performance result with radically different behavior than Unix. Do you have any ideas of what might be causing it? -- ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: One more attempt at fixing tricky segfaults. Glad someone had some eagle eyes on this :-). -- Added file: http://bugs.python.org/file17106/dabeaz_gil.patch ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
Changes by David Beazley : Removed file: http://bugs.python.org/file17104/dabeaz_gil.patch ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment:

I stand corrected. However, I'm going to have to think of a completely different approach for carrying out that functionality, as I don't know how the take_gil() function is able to determine whether gil_last_holder has been deleted or not. Will think about it and post an updated patch later.

Do you have any examples or insight you can provide about how these segfaults have shown up in Python code? I'm not able to observe any such behavior on OS X or Linux. Is this happening while running the ccbench program? Some other program?

-- ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: That second access of gil_last_holder->cpu_bound is safe because that block of code is never entered unless some other thread currently holds the GIL. If a thread holds the GIL, then gil_last_holder is guaranteed to have a valid value. -- ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: Added extra pointer check to avoid possible segfault. -- Added file: http://bugs.python.org/file17104/dabeaz_gil.patch ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
Changes by David Beazley : Removed file: http://bugs.python.org/file17102/dabeaz_gil.patch ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: New version of patch that will probably fix Windows-XP problems. Was doing something stupid in the monitor (not sure how it worked on Unix). -- Added file: http://bugs.python.org/file17102/dabeaz_gil.patch ___ Python tracker <http://bugs.python.org/issue7946> ___
[issue7946] Convoy effect with I/O bound threads and New GIL
Changes by David Beazley : Removed file: http://bugs.python.org/file17094/dabeaz_gil.patch
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: I've also attached a new file schedtest.py that illustrates a subtle difference between having the GIL monitor thread and not having the monitor.

Without the monitor, every thread is responsible for its own scheduling. If you have a lot of threads running, you may have a lot of threads all performing a timed wait and then waking up only to find that the GIL is locked and that they have to go back to waiting. One side effect is that certain threads have a tendency to starve. For example, if you run schedtest.py with the original GIL, you get a trace where three CPU-bound threads run like this:

Thread-3 16632
Thread-2 16517
Thread-1 31669
Thread-2 16610
Thread-1 16256
Thread-2 16445
Thread-1 16643
Thread-2 16331
Thread-1 16494
Thread-3 16399
Thread-1 17090
Thread-1 20860
Thread-3 16306
Thread-1 19684
Thread-3 16258
Thread-1 16669
Thread-3 16515
Thread-1 16381
Thread-3 16600
Thread-1 16477
Thread-3 16507
Thread-1 16740
Thread-3 16626
Thread-1 16564
Thread-3 15954
Thread-2 16727
...

You will observe that Threads 1 and 2 alternate, but Thread 3 starves. Then at some point, Threads 1 and 3 alternate, but Thread 2 starves.

By having a separate GIL monitor, threads are no longer responsible for making scheduling decisions concerning timeouts. Instead, the monitor is what times out and yanks threads off the GIL. If you run the same test with the GIL monitor, you get scheduling like this:

Thread-1 33278
Thread-2 32278
Thread-3 31981
Thread-1 33760
Thread-2 32385
Thread-3 32019
Thread-1 32700
Thread-2 32085
Thread-3 32248
Thread-1 31630
Thread-2 32200
Thread-3 32054
Thread-1 32721
Thread-2 32659
Thread-3 34150

Threads nicely cycle round-robin. There also appears to be about half as much thread switching (for reasons I don't quite understand).
-- Added file: http://bugs.python.org/file17095/schedtest.py
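The FIFO hand-off behavior described above can be modeled with a deterministic toy scheduler. This is only a sketch: `monitor_schedule` and its inputs are illustrative names, not code from schedtest.py or the patch.

```python
from collections import deque

def monitor_schedule(threads, slices):
    """Model the monitor's FIFO hand-off: when the monitor forces a drop,
    the longest-waiting thread runs next, so execution cycles round-robin."""
    queue = deque(threads)
    trace = []
    for _ in range(slices):
        current = queue.popleft()   # longest waiter acquires the GIL
        trace.append(current)
        queue.append(current)       # after its slice, it rejoins at the back
    return trace

print(monitor_schedule(["Thread-1", "Thread-2", "Thread-3"], 6))
# → ['Thread-1', 'Thread-2', 'Thread-3', 'Thread-1', 'Thread-2', 'Thread-3']
```

Contrast this with the self-timed-wait regime, where whichever thread happens to win the wakeup race runs next and a thread can lose that race many times in a row.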
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: I've updated the GIL patch to reflect concerns about the monitor thread running forever. This version has a suspension mechanism where the monitor goes to sleep if nothing is going on for a while. It gets resumed if threads try to acquire the GIL but time out for some reason. -- Added file: http://bugs.python.org/file17094/dabeaz_gil.patch
[issue7946] Convoy effect with I/O bound threads and New GIL
Changes by David Beazley : Removed file: http://bugs.python.org/file17084/dabeaz_gil.patch
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: Greg, I like the idea of the monitor suspending if no thread owns the GIL. Let me work on that. Good point on embedded systems.

Antoine, yes, the GIL monitor is completely independent and simply ticks along every 5 ms. A worst-case scenario is that an I/O-bound thread is scheduled shortly after the 5 ms tick and then becomes CPU-bound afterwards. In that case, the monitor might let it run up to about 10 ms before switching it. Hard to say if it's a real problem though---the normal timeslice on many systems is 10 ms, so it doesn't seem out of line.

As for the priority part, this patch should have similar behavior to the gilinter patch except for very subtle differences in thread scheduling due to the use of the GIL monitor. For instance, since threads never time out on the condition variable anymore, they tend to cycle execution in a purely round-robin fashion.
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: Here is the result of running the writes.py test with the patch I submitted. This is on OS-X.

bash-3.2$ ./python.exe writes.py
t1 2.83990693092 0
t2 3.27937912941 0
t1 5.54346394539 1
t2 6.68237304688 1
t1 8.9648039341 2
t2 9.60041999817 2
t1 12.1856160164 3
t2 12.5866689682 3
t1 15.3869640827 4
t2 15.7042851448 4
t1 18.4115200043 5
t2 18.5771169662 5
t2 21.4922711849 6
t1 21.6835460663 6
t2 24.6117911339 7
t1 24.9126679897 7
t1 27.1683580875 8
t2 28.2728791237 8
t1 29.4513950348 9
t1 32.2438161373 10
t2 32.5283250809 9
t1 34.8905010223 11
t2 36.0952250957 10
t1 38.109760046 12
t2 39.3465380669 11
t1 41.5758800507 13
t2 42.587772131 12
t1 45.1536290646 14
t2 45.8339021206 13
t1 48.6495029926 15
t2 49.1581180096 14
t1 51.5414950848 16
t2 52.6768190861 15
t1 54.818582058 17
t2 56.1163961887 16
t1 58.1549630165 18
t2 59.6944830418 17
t1 61.4515309334 19
t2 62.7685520649 18
t1 64.3223180771 20
t2 65.8158640862 19
65.8578810692
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: One comment on that patch I just submitted. Basically, it's an attempt at an extremely simple tweak to the GIL that fixes most of the problems discussed here. I don't have any special religious attachment to it though. Would love to see a BFS comparison.
[issue7946] Convoy effect with I/O bound threads and New GIL
David Beazley added the comment: The attached patch makes two simple refinements to the new GIL implemented in Python 3.2. Each is briefly described below.

1. Changed mechanism for thread time expiration

In the current implementation, threads perform a timed wait on a condition variable. If time expires and no thread switches have occurred, the currently running thread is forced to drop the GIL.

In the patch, timeouts are now performed by a special "GIL monitor" thread. This thread runs independently of Python and simply handles time expiration. Basically, it records the number of thread switches, sleeps for a specified interval (5 ms), and then looks at the number of thread switches again. If no switches occurred, it forces the currently running thread to drop the GIL.

With this monitor thread, it is no longer necessary to perform any timed condition variable waits. This approach has a few subtle benefits. First, threads no longer sit in a wait/timeout cycle when trying to get the GIL (so there is less overhead). Second, you get FIFO scheduling of threads: when time expires, the thread that has been waiting the longest on the condition variable runs next. Generally, you want this.

2. A very simple two-level priority mechanism

A new attribute 'cpu_bound' is added to the PyThreadState structure. If a thread is ever forced to drop the GIL, this attribute is simply set True (1). If a thread gives up the GIL voluntarily, it is set back to False (0). This attribute is used to set up simple scheduling (described next).

There are now two separate condition variables (gil_cpu_cond and gil_io_cond) that separate waiting threads according to their cpu_bound attribute setting. CPU-bound threads wait on gil_cpu_cond whereas I/O-bound threads wait on gil_io_cond. Using the two condition variables, the following scheduling rules are enforced:

- If there are any waiting I/O bound threads, they are always signaled first, before any CPU-bound threads.
- If an I/O bound thread wants the GIL, but a CPU-bound thread is running, the CPU-bound thread is immediately forced to drop the GIL.
- If a CPU-bound thread wants the GIL, but another CPU-bound thread is running, the running thread is immediately forced to drop the GIL if its time period has already expired.

Results
-------

This patch gives excellent results for both the ccbench test and all of my previous I/O bound tests. Here is the output:

== CPython 3.2a0.0 (py3k:80470:80497M) ==
== i386 Darwin on 'i386' ==

--- Throughput ---

Pi calculation (Python)
threads=1: 871 iterations/s.
threads=2: 844 ( 96 %)
threads=3: 838 ( 96 %)
threads=4: 826 ( 94 %)

regular expression (C)
threads=1: 367 iterations/s.
threads=2: 345 ( 94 %)
threads=3: 339 ( 92 %)
threads=4: 327 ( 89 %)

bz2 compression (C)
threads=1: 384 iterations/s.
threads=2: 728 ( 189 %)
threads=3: 695 ( 180 %)
threads=4: 707 ( 184 %)

--- Latency ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 0 ms.)
CPU threads=2: 1 ms. (std dev: 2 ms.)
CPU threads=3: 0 ms. (std dev: 1 ms.)
CPU threads=4: 0 ms. (std dev: 1 ms.)

Background CPU task: regular expression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 2 ms. (std dev: 1 ms.)
CPU threads=2: 1 ms. (std dev: 1 ms.)
CPU threads=3: 1 ms. (std dev: 1 ms.)
CPU threads=4: 2 ms. (std dev: 1 ms.)

Background CPU task: bz2 compression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 2 ms.)
CPU threads=2: 2 ms. (std dev: 3 ms.)
CPU threads=3: 0 ms. (std dev: 1 ms.)
CPU threads=4: 0 ms. (std dev: 1 ms.)

--- I/O bandwidth ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 5850.9 packets/s.
CPU threads=1: 5246.8 ( 89 %)
CPU threads=2: 4228.9 ( 72 %)
CPU threads=3: 4222.8 ( 72 %)
CPU threads=4: 2959.5 ( 50 %)

Particular attention should be given to tests involving I/O performance.
In particular, here are the results of the I/O bandwidth test using the unmodified GIL:

--- I/O bandwidth ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 6007.1 packets/s.
CPU threads=1: 189.0 ( 3 %)
CPU threads=2: 19.7 ( 0 %)
CPU threads=3: 19.7 ( 0 %)
CPU threads=4: 5.1 ( 0 %)

Other Benefits
--------------

This patch does not involve any complicated libraries, platform-specific functionality, low-level lock twiddling, or mathematically complex priority scheduling algorithms. Emphasize: the code is simple.

Negative Aspects
----------------

This modification might introduce a starvation effect where CPU-bound threads never get to run if there is an extremely heavy load of I/O-bound threads competing for the GIL.

Comparison to BFS
-----------------

Still need to test. Would be curious.

-- Added file: http://bugs.python.org/file17084/dabeaz_gil.patch
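The monitor-thread idea described in the message can be sketched in Python. This is a hedged illustration only: `GilState`, `GilMonitor`, and their fields are hypothetical names standing in for the C-level state in the patch, and the real work happens inside CPython's GIL machinery, not in Python code.

```python
import threading
import time

class GilState:
    """Hypothetical shared state standing in for the patch's C globals."""
    def __init__(self):
        self.switch_count = 0    # incremented whenever a thread takes the GIL
        self.last_seen = 0       # value the monitor saw on its previous tick
        self.drop_request = False
        self.running = True

class GilMonitor(threading.Thread):
    """Sketch of the monitor loop: every INTERVAL seconds, compare the switch
    counter with the last value seen; if no thread switch occurred, request
    that the currently running thread drop the GIL."""
    INTERVAL = 0.005   # 5 ms, the interval named in the message

    def __init__(self, state):
        super().__init__(daemon=True)
        self.state = state

    def tick(self):
        # One monitor iteration, separated out so it can be exercised directly.
        if self.state.switch_count == self.state.last_seen:
            self.state.drop_request = True   # force the GIL holder to yield
        self.state.last_seen = self.state.switch_count

    def run(self):
        while self.state.running:
            time.sleep(self.INTERVAL)
            self.tick()
```

Because the monitor, not the waiting threads, performs the timeout, waiters can block indefinitely on a condition variable and be woken in FIFO order, which is where the round-robin behavior comes from.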
[issue8532] Refinements to Python 3 New GIL
David Beazley added the comment: Can't decide whether this should be attached to Issue 7946 or not. I will also post it there. (Feel free to close this issue if you want to keep 7946 alive.)
[issue8532] Refinements to Python 3 New GIL
New submission from David Beazley : The attached patch makes two simple refinements to the new GIL implemented in Python 3.2. Each is briefly described below.

1. Changed mechanism for thread time expiration

In the current implementation, threads perform a timed wait on a condition variable. If time expires and no thread switches have occurred, the currently running thread is forced to drop the GIL.

In the patch, timeouts are now performed by a special "GIL monitor" thread. This thread runs independently of Python and simply handles time expiration. Basically, it records the number of thread switches, sleeps for a specified interval (5 ms), and then looks at the number of thread switches again. If no switches occurred, it forces the currently running thread to drop the GIL.

With this monitor thread, it is no longer necessary to perform any timed condition variable waits. This approach has a few subtle benefits. First, threads no longer sit in a wait/timeout cycle when trying to get the GIL (so there is less overhead). Second, you get FIFO scheduling of threads: when time expires, the thread that has been waiting the longest on the condition variable runs next. Generally, you want this.

2. A very simple two-level priority mechanism

A new attribute 'cpu_bound' is added to the PyThreadState structure. If a thread is ever forced to drop the GIL, this attribute is simply set True (1). If a thread gives up the GIL voluntarily, it is set back to False (0). This attribute is used to set up simple scheduling (described next).

There are now two separate condition variables (gil_cpu_cond and gil_io_cond) that separate waiting threads according to their cpu_bound attribute setting. CPU-bound threads wait on gil_cpu_cond whereas I/O-bound threads wait on gil_io_cond. Using the two condition variables, the following scheduling rules are enforced:

- If there are any waiting I/O bound threads, they are always signaled first, before any CPU-bound threads.
- If an I/O bound thread wants the GIL, but a CPU-bound thread is running, the CPU-bound thread is immediately forced to drop the GIL.
- If a CPU-bound thread wants the GIL, but another CPU-bound thread is running, the running thread is immediately forced to drop the GIL if its time period has already expired.

Results
-------

This patch gives excellent results for both the ccbench test and all of my previous I/O bound tests. Here is the output:

== CPython 3.2a0.0 (py3k:80470:80497M) ==
== i386 Darwin on 'i386' ==

--- Throughput ---

Pi calculation (Python)
threads=1: 871 iterations/s.
threads=2: 844 ( 96 %)
threads=3: 838 ( 96 %)
threads=4: 826 ( 94 %)

regular expression (C)
threads=1: 367 iterations/s.
threads=2: 345 ( 94 %)
threads=3: 339 ( 92 %)
threads=4: 327 ( 89 %)

bz2 compression (C)
threads=1: 384 iterations/s.
threads=2: 728 ( 189 %)
threads=3: 695 ( 180 %)
threads=4: 707 ( 184 %)

--- Latency ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 0 ms.)
CPU threads=2: 1 ms. (std dev: 2 ms.)
CPU threads=3: 0 ms. (std dev: 1 ms.)
CPU threads=4: 0 ms. (std dev: 1 ms.)

Background CPU task: regular expression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 2 ms. (std dev: 1 ms.)
CPU threads=2: 1 ms. (std dev: 1 ms.)
CPU threads=3: 1 ms. (std dev: 1 ms.)
CPU threads=4: 2 ms. (std dev: 1 ms.)

Background CPU task: bz2 compression (C)
CPU threads=0: 0 ms. (std dev: 0 ms.)
CPU threads=1: 0 ms. (std dev: 2 ms.)
CPU threads=2: 2 ms. (std dev: 3 ms.)
CPU threads=3: 0 ms. (std dev: 1 ms.)
CPU threads=4: 0 ms. (std dev: 1 ms.)

--- I/O bandwidth ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 5850.9 packets/s.
CPU threads=1: 5246.8 ( 89 %)
CPU threads=2: 4228.9 ( 72 %)
CPU threads=3: 4222.8 ( 72 %)
CPU threads=4: 2959.5 ( 50 %)

Particular attention should be given to tests involving I/O performance.
In particular, here are the results of the I/O bandwidth test using the unmodified GIL:

--- I/O bandwidth ---

Background CPU task: Pi calculation (Python)
CPU threads=0: 6007.1 packets/s.
CPU threads=1: 189.0 ( 3 %)
CPU threads=2: 19.7 ( 0 %)
CPU threads=3: 19.7 ( 0 %)
CPU threads=4: 5.1 ( 0 %)

Other Benefits
--------------

This patch does not involve any complicated libraries, platform-specific functionality, low-level lock twiddling, or mathematically complex priority scheduling algorithms. Emphasize: the code is simple.

Negative Aspects
----------------

This modification might introduce a starvation effect where CPU-bound threads never get to run if there is an extremely heavy load of I/O-bound threads competing for the GIL. Is starvation a real problem or a theoretical problem? Hard to say. Would need study.

--
components: Interpreter Core
files: gil.patch
keywords: patch
messages: 104192
nosy: dabeaz
severity: normal
status: open
title: Refinements to Pyth
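The two-level priority rule above reduces to a pure scheduling decision, which can be sketched as a small function. This is an illustration with hypothetical names (`pick_next`, the deques): the patch implements the same rule in C by signaling gil_io_cond before gil_cpu_cond, not with Python queues.

```python
from collections import deque

def pick_next(io_waiters, cpu_waiters):
    """Return the next thread to receive the GIL.
    I/O-bound waiters are always signaled before CPU-bound ones;
    within each class, the longest waiter goes first (FIFO)."""
    if io_waiters:
        return io_waiters.popleft()
    if cpu_waiters:
        return cpu_waiters.popleft()
    return None                      # no one is waiting for the GIL

io = deque(["io-1"])
cpu = deque(["cpu-1", "cpu-2"])
print(pick_next(io, cpu))   # io-1: an I/O-bound waiter always wins
print(pick_next(io, cpu))   # cpu-1: only CPU-bound waiters remain
```

The starvation risk noted under "Negative Aspects" is visible in this sketch: as long as io_waiters keeps refilling, cpu_waiters is never consulted.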
[issue8410] Fix emulated lock to be 'fair'
David Beazley added the comment: I know that multicore processors are all the rage right now, but one thing that concerns me about this patch is its effect on single-core systems. If you apply this on a single CPU, are threads just going to sit there and thrash as they rapidly context switch? (Something that does not occur now.)

Also, I've done a few experiments and on a single-core Windows-XP machine, the GIL does not appear to have any kind of fairness to it (as previously claimed here). Yet, if I run the same experiments on a dual-core PC, the GIL is suddenly fair. So, somewhere in that lock implementation, it seems to adapt to the environment. Do we have to try to emulate that behavior in Unix? If so, how do you do it without it turning into a huge coding mess?

I'll just mention that the extra context switching introduced by fair locking has a rather pronounced effect on performance that should be considered even on multicore. I posted some benchmarks in Issue 8299 for Linux and OS-X. In those benchmarks, the introduction of fair GIL locking makes CPU-bound threads run about 2-5 times slower than before on Linux and OS-X.

-- nosy: +dabeaz
[issue8299] Improve GIL in 2.7
David Beazley added the comment: Here are the results of running the fair.py test on a Mac OS-X system using a "fair" GIL implementation (modified condition variable):

[ Fair GIL, Dual-Core, OS-X ]
Sequential execution
slow: 5.490943 (0 left)
fast: 0.369257 (0 left)
Threaded execution
slow: 6.122093 (0 left)
fast: 6.179179 (0 left)
Treaded, balanced execution:
fast C: 3.345452 (0 left)
fast B: 3.389235 (0 left)
fast A: 3.426407 (0 left)
Treaded, balanced execution, with quickstop:
fast C: 2.557972 (0 left)
fast B: 2.558551 (35087 left)
fast A: 2.558914 (13142 left)

Here is the same test with the original GIL:

[ Unfair GIL, original implementation ]
Sequential execution
slow: 5.444754 (0 left)
fast: 0.361340 (0 left)
Threaded execution
slow: 5.542008 (0 left)
fast: 5.225690 (0 left)
Treaded, balanced execution:
fast C: 1.381929 (0 left)
fast B: 1.499969 (0 left)
fast A: 1.549571 (0 left)
Treaded, balanced execution, with quickstop:
fast A: 1.284043 (0 left)
fast B: 1.295507 (32490 left)
fast C: 1.294981 (274777 left)

Please observe that the performance of threads under the "fair" GIL is significantly worse than with the "unfair" GIL. Having studied this in more depth, I have to say that I would much rather have fast-running unfair threads than slow-running fair threads. Although I agree that there are other benefits to fairness, they just aren't enough to compensate for the huge performance hit.
[issue8299] Improve GIL in 2.7
David Beazley added the comment: As a followup, since I'm not sure anyone here actually tried a fair GIL on Linux, I incorporated your suggested fairness patch into the condition-variable version of the GIL (using this pseudocode you wrote as a guide):

with gil.cond:
    if gil.n_waiting or gil.locked:
        gil.n_waiting += 1
        while True:
            gil.cond.wait()   # always wait at least once
            if not gil.locked:
                break
        gil.n_waiting -= 1
    gil.locked = True

I did some tests on this and it does appear to exhibit fairness. Here are the results of running the fair.py test with a fair GIL on my Linux system:

[ Fair GIL, Linux ]
Sequential execution
slow: 6.246764 (0 left)
fast: 0.465102 (0 left)
Threaded execution
slow: 7.534725 (0 left)
fast: 7.674448 (0 left)
Treaded, balanced execution:
fast A: 10.415756 (0 left)
fast B: 10.456502 (0 left)
fast C: 10.520457 (0 left)
Treaded, balanced execution, with quickstop:
fast B: 8.423304 (0 left)
fast A: 8.409794 (16016 left)
fast C: 8.381977 (9162 left)

If I switch back to the unfair GIL, this is the result:

[ Unfair GIL, original implementation, Linux ]
Sequential execution
slow: 6.164739 (0 left)
fast: 0.422626 (0 left)
Threaded execution
slow: 6.570084 (0 left)
fast: 6.690927 (0 left)
Treaded, balanced execution:
fast A: 1.994143 (0 left)
fast C: 2.014925 (0 left)
fast B: 2.073212 (0 left)
Treaded, balanced execution, with quickstop:
fast A: 1.614533 (0 left)
fast C: 1.607324 (377323 left)
fast B: 1.625987 (111451 left)

Probably the main thing to notice is the huge increase in performance over the fair GIL. For instance, the balanced execution test runs about 5 times faster. Here are the two tests repeated with checkinterval = 1000.
[ Fair GIL, checkinterval = 1000 ]
Sequential execution
slow: 6.175320 (0 left)
fast: 0.424410 (0 left)
Threaded execution
slow: 6.505094 (0 left)
fast: 6.746649 (0 left)
Treaded, balanced execution:
fast A: 2.243123 (0 left)
fast B: 2.416043 (0 left)
fast C: 2.442475 (0 left)
Treaded, balanced execution, with quickstop:
fast A: 1.565914 (0 left)
fast C: 1.514024 (81254 left)
fast B: 1.531937 (63740 left)

[ Unfair GIL, checkinterval = 1000 ]
Sequential execution
slow: 6.258882 (0 left)
fast: 0.411590 (0 left)
Threaded execution
slow: 6.255027 (0 left)
fast: 0.409412 (0 left)
Treaded, balanced execution:
fast A: 1.291007 (0 left)
fast C: 1.135373 (0 left)
fast B: 1.437205 (0 left)
Treaded, balanced execution, with quickstop:
fast C: 1.331775 (0 left)
fast A: 1.418670 (54841 left)
fast B: 1.403853 (208732 left)

Here, the unfair GIL is still quite a bit faster on raw performance. I tried kicking the check interval up to 1 and the unfair GIL still won by a pretty significant margin on raw speed of completing the different tasks. I've attached a copy of the thread_pthread.h file I modified for this test. It's from Python-2.6.4.

-- Added file: http://bugs.python.org/file16958/thread_pthread.h
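The fairness pseudocode quoted in this message translates directly into a small runnable Python class. This is a sketch of the same logic at the Python level, not the C code that went into thread_pthread.h; the names come from the pseudocode.

```python
import threading

class FairGIL:
    """Runnable version of the quoted fairness pseudocode. A thread that
    finds the lock held (or other threads already waiting) must wait on
    the condition at least once, so a releasing thread cannot immediately
    barge back in ahead of existing waiters."""

    def __init__(self):
        self.cond = threading.Condition()
        self.locked = False
        self.n_waiting = 0

    def acquire(self):
        with self.cond:
            if self.n_waiting or self.locked:
                self.n_waiting += 1
                while True:
                    self.cond.wait()    # always wait at least once
                    if not self.locked:
                        break
                self.n_waiting -= 1
            self.locked = True

    def release(self):
        with self.cond:
            self.locked = False
            self.cond.notify()          # wake one waiter, if any
```

The forced wait is exactly where the performance cost shown in the benchmarks comes from: every contended hand-off pays for a condition-variable sleep and wakeup instead of letting the releasing thread reacquire immediately.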
[issue8299] Improve GIL in 2.7
David Beazley added the comment: I'm definitely sure that semaphores were being used in my test---I stuck a print statement inside the code that creates locks just to make sure it was using the semaphore version :-). Unfortunately, at this point I think most of this discussion is academic since no change is likely to be incorporated into Python 2.7. I can definitely see where fairness might help I/O performance if there is only 1 CPU bound thread. I just don't know for other situations. For example, if you have a server where it's all I/O-bound threads, but it suddenly comes under extreme load (e.g., slashdot effect), does a fair GIL help or hurt with that? I just don't know. In the big picture, all of the issues raised here should be on the minds of people fixing the GIL in py3k though. It's just one more aspect of why fixing the GIL is hard.
[issue8299] Improve GIL in 2.7
David Beazley added the comment: One other comment. Running the modified fair.py file on my Linux system using Python compiled with semaphores shows that they are *definitely* not fair. Here's the relevant part of your test:

Treaded, balanced execution, with quickstop:
fast C: 1.580815 (0 left)
fast B: 1.636923 (158919 left)
fast A: 1.788634 (310323 left)