ssl.SSLError: [SSL: BAD_WRITE_RETRY] bad write retry (_ssl.c:1636)
Hello,

Can someone help me understand what this exception means?

[...]
  File "/usr/local/lib/python3.4/dist-packages/dugong-3.2-py3.4.egg/dugong/__init__.py", line 584, in _co_send
    len_ = self._sock.send(buf)
  File "/usr/lib/python3.4/ssl.py", line 679, in send
    v = self._sslobj.write(data)
ssl.SSLError: [SSL: BAD_WRITE_RETRY] bad write retry (_ssl.c:1636)

Presumably this is generated by OpenSSL, but how do I figure out what it means? The best I found in the OpenSSL documentation is https://www.openssl.org/docs/crypto/err.html, and Google only brought me to https://stackoverflow.com/questions/2997218.

Best,
-Nikolaus

--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«
--
https://mail.python.org/mailman/listinfo/python-list
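[A note for archive readers: as far as I can tell, OpenSSL reports BAD_WRITE_RETRY when SSL_write() is retried with a different buffer than the one passed to the call that previously returned WANT_WRITE. If that reading is right, a blocked write must be retried with the same data. A minimal sketch of that pattern -- the retry loop is illustrative, not dugong's actual code:]

```python
import ssl

def send_all(sock, data):
    """Send all of *data* over an SSL socket, retrying a blocked
    write with the same buffer.  OpenSSL requires the retried
    SSL_write() to pass the same data as the failed call; handing
    it a different buffer is what produces BAD_WRITE_RETRY."""
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        try:
            # We re-slice from the *unchanged* offset, so a retry
            # after SSLWantWriteError sends the same bytes again.
            sent += sock.send(view[sent:])
        except ssl.SSLWantWriteError:
            pass  # non-blocking socket: wait (e.g. select), then retry
    return sent
```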
Re: Missing stack frames?
Chris Angelico writes:
> On Fri, Jun 6, 2014 at 12:16 PM, Nikolaus Rath wrote:
>> - Is there some way to make the call stack for destructors less confusing?
>
> First off, either don't have refloops, or explicitly break them.

The actual code isn't as simple as the example. I wasn't even aware there were any refloops. Guess I have to start hunting them down now.

> That'll at least make things a bit more predictable; in CPython,
> you'll generally see destructors called as soon as something goes out
> of scope. Secondly, avoid threads when you're debugging a problem! I
> said this right at the beginning. If you run into problems, the first
> thing to do is isolate the system down to a single thread. Debugging
> is way easier without threads getting in each other's ways.

I don't see how this would have made a difference in this case. I still would have gotten lots of apparently non-sensical backtraces. Only this time they would all have come from MainThread.

Best,
-Nikolaus
Re: Missing stack frames?
dieter writes:
[...]
> Someone else already mentioned that the "close" call
> can come from a destructor. Destructors can easily be called
> at not obvious places (e.g. "s = C(); ... x = [s for s in ...]";
> in this example the list comprehension calls the "C" destructor
> which is not obvious when one looks only locally).
> The destructor calls often have intervening C code (which
> one does not see). However, in your case, I do not see
> why the "cgi" module should cause a destructor call of one
> of your server components.

Paul, dieter, you are my heroes. It was indeed an issue with a destructor.

It turns out that the io.RawIOBase destructor calls self.close(). If the instance of a derived class is part of a reference cycle, close() gets called on the next routine run of the garbage collector -- with the stack trace originating at whatever statement was last executed before the gc run.

The following minimal example reproduces the problem:

#!/usr/bin/env python3
import io
import traceback
import threading

class Container:
    pass

class InnocentVictim(io.RawIOBase):
    def close(self):
        print('close called in %s by:' % threading.current_thread().name)
        traceback.print_stack()

def busywork():
    numbers = []
    for i in range(500):
        o = Container()
        o.l = numbers
        numbers.append(o)
        if i % 87 == 0:
            numbers = []

l = [ InnocentVictim() ]
l[0].cycle = l
del l

t = threading.Thread(target=busywork)
t.start()
t.join()

If you run this, you can see things like:

close called in Thread-1 by:
  File "/usr/lib/python3.4/threading.py", line 888, in _bootstrap
    self._bootstrap_inner()
  File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.4/threading.py", line 868, in run
    self._target(*self._args, **self._kwargs)
  File "./test.py", line 18, in busywork
    o = Container()
  File "./test.py", line 13, in close
    traceback.print_stack()

I.e., a method being called by a thread that doesn't have access to the object, and without any reference to the call in the source.
I am left wondering:

- Is there really a point in the RawIOBase destructor calling close?
- Is there some way to make the call stack for destructors less confusing?

Best,
-Nikolaus
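[A note for archive readers: one pragmatic workaround for the second question is to not let the gc be the one that closes -- call close() at a well-defined point and break the cycle yourself, so the destructor has nothing left to do. A minimal sketch; the Victim class is illustrative:]

```python
import gc
import io

events = []

class Victim(io.RawIOBase):
    def close(self):
        if not self.closed:          # guard: close() must be idempotent,
            events.append('closed')  # the destructor may call it again
        super().close()

v = Victim()
v.cycle = v          # reference cycle, as in the original example
v.close()            # close deterministically, at a known stack location
v.cycle = None       # break the cycle so the gc never gets involved
del v                # CPython: refcount drops to zero right here
gc.collect()         # nothing left for the collector to finalize
events.append('done')
```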
Re: Missing stack frames?
Nikolaus Rath writes:
> Chris Angelico writes:
>> On Wed, Jun 4, 2014 at 12:30 PM, Nikolaus Rath wrote:
>>> I've instrumented one of my unit tests with a conditional
>>> 'pdb.set_trace' in some circumstances (specifically, when a function is
>>> called by a thread other than MainThread).
>>
>> I think the likelihood of this being an issue with interactive
>> debugging and threads is sufficiently high that you should avoid
>> putting the two together, at least until you can verify that the same
>> problem occurs without that combination.
>
> [test output and tracebacks snipped -- see previous message]
>
> Still no context before the ominous close() call. I'm very confused.

It gets even funnier if I just repeat this exercise (without changing any code). Here are some more "tracebacks" that I got:

  File "/usr/bin/py.test-3", line 5, in <module>
    sys.exit(load_entry_point('pytest==2.5.2', 'console_scripts', 'py.test')())
[...]
  File "/usr/lib/python3/dist-packages/py/_code/code.py", line 524, in repr_traceback_entry
    source = self._getentrysource(entry)
  File "/usr/lib/python3/dist-packages/py/_code/code.py", line 450, in _getentrysource
    source = entry.getsource(self.astcache)
  File "/usr/lib/python3/dist-packages/py/_code/code.py", line 199, in getsource
    astnode=astnode)
  File "/usr/lib/python3/dist-packages/py/_code/source.py", line 367, in getstatementrange_ast
    astnode = compile(content, "source", "exec", 1024)  # 1024 for AST
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 1050, in close
    self.fh.close_real()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 702, in close_real
    traceback.print_stack(file=sys.stdout)

or also

  File "/usr/bin/py.test-3", line 5, in <module>
    sys.exit(load_entry_point('pytest==2.5.2', 'console_scripts', 'py.test')())
[...]
  File "/usr/lib/python3.4/logging/__init__.py", line 1474, in callHandlers
    hdlr.handle(record)
  File "/usr/lib/python3.4/logging/__init__.py", line 842, in handle
    rv = self.filter(record)
  File "/usr/lib/python3.4/logging/__init__.py", line 699, in filter
    for f in self.filters:
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 853, in close
    self.fh.close()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 1050, in close
    self.fh.close_real()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 702, in close_real
    traceback.print_stack(file=sys.stdout)

and this one actually looks the way I would expect:

[...]
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 335, in fetch
    return self.perform_read(do_read, key)
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 58, in wrapped
    return method(*a, **kw)
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 309, in perform_read
    return fn(fh)
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 859, in __exit__
    self.close()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 853, in close
    self.fh.close()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 1050, in close
    self.fh.close_real()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 702, in close_real
    traceback.print_stack(file=sys.stdout)

I am not using any C extension modules, but I guess I nevertheless have to assume that something has seriously messed up either the stack or the traceback printing routines?

Best,
-Nikolaus
Re: Missing stack frames?
Paul Rubin writes:
> Nikolaus Rath writes:
>> Still no context before the ominous close() call. I'm very confused.
>
> close() could be getting called from a destructor as the top level
> function of a thread exits, or something like that.

Shouldn't the destructor have its own stack frame then, i.e. shouldn't the first frame be in a __del__ function?

Best,
-Nikolaus
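[A note for archive readers: a quick experiment suggests the answer is "only for Python-level destructors". A __del__ written in Python does get its own frame; io.RawIOBase's destructor, however, is implemented in C (in CPython's _io module), so it leaves no Python frame on the stack. Sketch, assuming CPython, where the refcount hitting zero runs __del__ immediately:]

```python
import traceback

seen = []

class WithDel:
    def __del__(self):
        # Record the function names on the current stack:
        seen.extend(f.name for f in traceback.extract_stack())

o = WithDel()
del o  # __del__ runs here, and *does* appear as a frame
```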
Re: Missing stack frames?
Chris Angelico writes:
> On Wed, Jun 4, 2014 at 12:30 PM, Nikolaus Rath wrote:
>> I've instrumented one of my unit tests with a conditional
>> 'pdb.set_trace' in some circumstances (specifically, when a function is
>> called by a thread other than MainThread).
>
> I think the likelihood of this being an issue with interactive
> debugging and threads is sufficiently high that you should avoid
> putting the two together, at least until you can verify that the same
> problem occurs without that combination.

Here's the stacktrace as obtained by traceback.print_stack():

tests/t1_backends.py:563: test_extra_data[mock_s3c-zlib] PASSED
tests/t1_backends.py:563: test_extra_data[mock_s3c-bzip2] PASSED

87 tests deselected by '-kextra'
=== 5 passed, 1 skipped, 87 deselected in 0.65 seconds ===

  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 853, in close
    self.fh.close()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 691, in close
    traceback.print_stack(file=sys.stdout)
something is wrong
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 853, in close
    self.fh.close()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 691, in close
    traceback.print_stack(file=sys.stdout)
something is wrong
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 1050, in close
    self.fh.close()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 691, in close
    traceback.print_stack(file=sys.stdout)

Still no context before the ominous close() call. I'm very confused.

Best,
-Nikolaus
Re: Corrupted stacktrace?
Chris Angelico writes:
> On Wed, Jun 4, 2014 at 12:20 PM, Nikolaus Rath wrote:
>> File "/usr/lib/python3.3/threading.py", line 878 in _bootstrap
>
> Can you replicate the problem in a non-threaded environment? Threads
> make interactive debugging very hairy.

Hmm. I could try to run the server thread in a separate process. I'll try that and report back.

Thanks for the suggestion,
-Nikolaus
Re: Missing stack frames?
Chris Angelico writes:
> On Wed, Jun 4, 2014 at 12:30 PM, Nikolaus Rath wrote:
>> I've instrumented one of my unit tests with a conditional
>> 'pdb.set_trace' in some circumstances (specifically, when a function is
>> called by a thread other than MainThread).
>
> I think the likelihood of this being an issue with interactive
> debugging and threads is sufficiently high that you should avoid
> putting the two together, at least until you can verify that the same
> problem occurs without that combination.

Is there a way to produce a stacktrace without using the interactive debugger?

Best,
-Nikolaus
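[A note for archive readers: yes -- traceback.print_stack() prints the current thread's call stack without stopping in a debugger, and traceback.format_stack() returns it as a list of strings suitable for logging. A small sketch:]

```python
import sys
import traceback

def helper():
    # Print the current call stack to stderr, without pdb...
    traceback.print_stack(file=sys.stderr)
    # ...or capture it as a string instead of printing it:
    return ''.join(traceback.format_stack())

def caller():
    return helper()

stack_text = caller()
```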
Missing stack frames?
Hello,

(This may or may not be related to my mail about a "corrupted stack trace".)

I've instrumented one of my unit tests with a conditional 'pdb.set_trace' in some circumstances (specifically, when a function is called by a thread other than MainThread). However, when trying to print a back trace to figure out how this function got called, I get this:

$ py.test-3 tests/t1_backends.py -k extra
=== test session starts ===
platform linux -- Python 3.3.5 -- py-1.4.20 -- pytest-2.5.2 -- /usr/bin/python3
collected 33 items

tests/t1_backends.py:563: test_extra_data[mock_s3c-plain] SKIPPED
tests/t1_backends.py:563: test_extra_data[mock_s3c-zlib] PASSED

31 tests deselected by '-kextra'
=== 1 passed, 1 skipped, 31 deselected in 0.23 seconds ===

> /home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py(693)close()
-> if not self.md5_checked:
(Pdb) bt
  /home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py(853)close()
-> self.fh.close()
> /home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py(693)close()
-> if not self.md5_checked:
(Pdb) q

What does this mean? Why is there no caller above the backends/common.py code? At the very least, I would have expected some frames from threading.py...?

Best,
-Nikolaus
Corrupted stacktrace?
Hello,

I'm trying to debug a problem. As far as I can tell, one of my methods is called at a point where it really should not be called. When setting a breakpoint in the function, I'm getting this:

> /home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py(693)close()
-> if not self.md5_checked:
(Pdb) bt
  /usr/lib/python3.3/threading.py(878)_bootstrap()
-> self._bootstrap_inner()
  /usr/lib/python3.3/threading.py(901)_bootstrap_inner()
-> self.run()
  /usr/lib/python3.3/threading.py(858)run()
-> self._target(*self._args, **self._kwargs)
  /usr/lib/python3.3/socketserver.py(610)process_request_thread()
-> self.finish_request(request, client_address)
  /usr/lib/python3.3/socketserver.py(345)finish_request()
-> self.RequestHandlerClass(request, client_address, self)
  /usr/lib/python3.3/socketserver.py(666)__init__()
-> self.handle()
  /home/nikratio/in-progress/s3ql/tests/mock_server.py(77)handle()
-> return super().handle()
  /usr/lib/python3.3/http/server.py(402)handle()
-> self.handle_one_request()
  /usr/lib/python3.3/http/server.py(388)handle_one_request()
-> method()
  /home/nikratio/in-progress/s3ql/tests/mock_server.py(169)do_GET()
-> q = parse_url(self.path)
  /home/nikratio/in-progress/s3ql/tests/mock_server.py(52)parse_url()
-> p.params = urllib.parse.parse_qs(q.query)
  /usr/lib/python3.3/urllib/parse.py(553)parse_qs()
-> encoding=encoding, errors=errors)
  /usr/lib/python3.3/urllib/parse.py(585)parse_qsl()
-> pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
  /usr/lib/python3.3/urllib/parse.py(585)<listcomp>()
-> pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
  /home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py(853)close()
-> self.fh.close()
> /home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py(693)close()
-> if not self.md5_checked:

To me this does not make any sense.

Firstly, the thread that is (apparently) calling close() should never reach code in common.py. This thread is executing a socketserver handler that is entirely contained in mock_server.py and only communicates with the rest of the program via TCP.

Secondly, the backtrace itself does not make sense. How can evaluation of

pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]

in urllib/parse.py result in a method call in backends/common.py? There is no trickery going on, qs is a regular string:

(Pdb) up
(Pdb) up
(Pdb) up
(Pdb) l
580             into Unicode characters, as accepted by the bytes.decode() method.
581
582         Returns a list, as G-d intended.
583         """
584         qs, _coerce_result = _coerce_args(qs)
585  ->     pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
586         r = []
587         for name_value in pairs:
588             if not name_value and not strict_parsing:
589                 continue
590             nv = name_value.split('=', 1)
(Pdb) whatis qs
<class 'str'>
(Pdb) p qs
''
(Pdb)

I have also tried to get a backtrace with the faulthandler module, but it gives the same result:

Thread 0x7f7dafdb4700:
  File "/usr/lib/python3.3/cmd.py", line 126 in cmdloop
  File "/usr/lib/python3.3/pdb.py", line 318 in _cmdloop
  File "/usr/lib/python3.3/pdb.py", line 345 in interaction
  File "/usr/lib/python3.3/pdb.py", line 266 in user_line
  File "/usr/lib/python3.3/bdb.py", line 65 in dispatch_line
  File "/usr/lib/python3.3/bdb.py", line 47 in trace_dispatch
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 693 in close
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 853 in close
  File "/usr/lib/python3.3/urllib/parse.py", line 585 in <listcomp>
  File "/usr/lib/python3.3/urllib/parse.py", line 585 in parse_qsl
  File "/usr/lib/python3.3/urllib/parse.py", line 553 in parse_qs
  File "/home/nikratio/in-progress/s3ql/tests/mock_server.py", line 52 in parse_url
  File "/home/nikratio/in-progress/s3ql/tests/mock_server.py", line 169 in do_GET
  File "/usr/lib/python3.3/http/server.py", line 388 in handle_one_request
  File "/usr/lib/python3.3/http/server.py", line 402 in handle
  File "/home/nikratio/in-progress/s3ql/tests/mock_server.py", line 77 in handle
  File "/usr/lib/python3.3/socketserver.py", line 666 in __init__
  File "/usr/lib/python3.3/socketserver.py", line 345 in finish_request
  File "/usr/lib/python3.3/socketserver.py", line 610 in process_request_thread
  File "/usr/lib/python3.3/threading.py", line 858 in run
  File "/usr/lib/python3.3/threading.py", line 901 in _bootstrap_inner
  File "/usr/lib/python3.3/threading.py", line 878 in _bootstrap

Is it possible that the stack somehow got corrupted? Does anyone have a suggestion how I could go about debugging this? I am using Python 3.3.

Best,
-Nikolaus
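[A note for archive readers: faulthandler.dump_traceback() can take this kind of all-threads snapshot on demand, outside any crash handler. It writes straight to a file descriptor, so the target needs a real fileno(). A sketch, using a crude sleep to make sure the worker thread has parked itself:]

```python
import faulthandler
import tempfile
import threading
import time

def worker(started, release):
    started.set()
    release.wait()          # park here while the dump is taken

started, release = threading.Event(), threading.Event()
t = threading.Thread(target=worker, args=(started, release))
t.start()
started.wait()
time.sleep(0.1)             # crude: let the thread settle into wait()

with tempfile.TemporaryFile(mode='w+') as fh:
    # all_threads=True dumps every thread, not just the caller:
    faulthandler.dump_traceback(file=fh, all_threads=True)
    fh.seek(0)
    dump = fh.read()

release.set()
t.join()
```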
Re: select(sock) indicates not-ready, but sock.recv does not block
Nikolaus Rath writes:
> Hello,
>
> I have a problem with using select. I can reliably reproduce a situation
> where select.select((sock.fileno(),), (), (), 0) returns ((),(),())
> (i.e., no data ready for reading), but an immediately following
> sock.recv() returns data without blocking.
[...]

Turns out that I fell into the well-known (except to me) ssl-socket trap:
http://docs.python.org/3/library/ssl.html#notes-on-non-blocking-sockets

TL;DR: Relying on select() on an SSL socket is not a good idea because some internal buffering is done. Better to put the socket in non-blocking mode and try to read something, catching the ssl.Want* exception if nothing is ready.

Best,
-Nikolaus
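[A note for archive readers, making the TL;DR concrete: attempt the read on a non-blocking socket and treat the SSLWant* exceptions as "not ready". SSLSocket.pending() is also worth knowing -- it reports bytes already decrypted inside OpenSSL, which is exactly what select() cannot see. A sketch:]

```python
import ssl

def try_recv(ssl_sock, bufsize=4096):
    """Attempt a non-blocking read from an SSL socket.

    Returns the received bytes, or None if no application data is
    available yet.  This replaces select()-based readiness checks,
    which cannot see data buffered inside OpenSSL."""
    try:
        return ssl_sock.recv(bufsize)
    except (ssl.SSLWantReadError, ssl.SSLWantWriteError):
        # Renegotiation can make a *read* require a *write*, so
        # both exceptions just mean "retry later".
        return None
```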
select(sock) indicates not-ready, but sock.recv does not block
Hello,

I have a problem with using select. I can reliably reproduce a situation where

select.select((sock.fileno(),), (), (), 0)

returns ((),(),()) (i.e., no data ready for reading), but an immediately following sock.recv() returns data without blocking.

I am pretty sure that this is not a race condition. The behavior is 100% reproducible, the program is single threaded, and even waiting for 10 seconds before the select() call does not change the result.

I'm running Python 3.3.3 under Linux 3.12. Does anyone have an idea what might be going wrong here?

Thanks,
-Nikolaus
Dynamically change __del__
Hi,

I'm trying to be very clever:

class tst(object):
    def destroy(self):
        print 'Cleaning up.'
        self.__del__ = lambda: None

    def __del__(self):
        raise RuntimeError('Instance destroyed without running destroy! Hell may break loose!')

However, it doesn't work:

In [2]: t = tst()
In [3]: t = None
Exception RuntimeError: RuntimeError('Instance destroyed without running destroy! Hell may break loose!',) in <bound method tst.__del__ of <__main__.tst object at 0x...>> ignored

In [4]: t = tst()
In [5]: t.destroy()
Cleaning up.
In [6]: t = None
Exception RuntimeError: RuntimeError('Instance destroyed without running destroy! Hell may break loose!',) in <bound method tst.__del__ of <__main__.tst object at 0x...>> ignored

$ python -V
Python 2.6.4

Apparently Python calls the class attribute __del__ rather than the instance's __del__ attribute. Is that a bug or a feature? Is there any way to implement the desired functionality without introducing an additional destroy_has_been_called attribute?

(I know that invocation of __del__ is unreliable; this is just an additional safeguard to increase the likelihood of bugs getting noticed.)

Best,
-Nikolaus
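[A note for archive readers: it's a documented feature -- for new-style classes, special methods are looked up on the type, not the instance. One way to get the guard without an extra flag attribute is to swap the instance's class in destroy(). A sketch in Python 3 syntax, with illustrative class names:]

```python
events = []

class Guarded:
    def destroy(self):
        events.append('cleanup')
        # Special methods are looked up on the type, so swapping
        # the instance's class swaps which __del__ will run:
        self.__class__ = _Destroyed

    def __del__(self):
        events.append('LEAKED WITHOUT destroy()')

class _Destroyed(Guarded):
    def __del__(self):
        pass  # destroy() was called; nothing to complain about

g = Guarded()
g.destroy()
del g            # silent: _Destroyed.__del__ runs

h = Guarded()
del h            # complains: Guarded.__del__ runs
```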
Re: Profiling: Interpreting tottime
"Gabriel Genellina" writes:
> En Wed, 07 Apr 2010 18:44:39 -0300, Nikolaus Rath escribió:
>
>> [check_s3_refcounts code and profiler output snipped -- see previous message]
>>
>> So according to the profiler, the entire 7639 seconds were spent
>> executing the function itself.
>>
>> How is this possible? I really don't see how the above function can
>> consume any CPU time without spending it in one of the called
>> sub-functions.
>
> Is the conn object implemented as a C extension?

No, it's pure Python.

> The profiler does not
> detect calls to C functions, I think.

Hmm. Isn't this a C function?

    26    2.317    0.089    2.317    0.089 {method 'execute' of 'apsw.Cursor' objects}

> You may be interested in this package by Robert Kern:
> http://pypi.python.org/pypi/line_profiler
> "Line-by-line profiler.
> line_profiler will profile the time individual lines of code take to
> execute."

That looks interesting nevertheless, thanks!

-Nikolaus
Profiling: Interpreting tottime
Hello,

Consider the following function:

def check_s3_refcounts():
    """Check s3 object reference counts"""

    global found_errors
    log.info('Checking S3 object reference counts...')

    for (key, refcount) in conn.query("SELECT id, refcount FROM s3_objects"):

        refcount2 = conn.get_val("SELECT COUNT(inode) FROM blocks WHERE s3key=?",
                                 (key,))
        if refcount != refcount2:
            log_error("S3 object %s has invalid refcount, setting from %d to %d",
                      key, refcount, refcount2)
            found_errors = True
            if refcount2 != 0:
                conn.execute("UPDATE s3_objects SET refcount=? WHERE id=?",
                             (refcount2, key))
            else:
                # Orphaned object will be picked up by check_keylist
                conn.execute('DELETE FROM s3_objects WHERE id=?', (key,))

When I ran cProfile.Profile().runcall() on it, I got the following result:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1 7639.962 7639.962 7640.269 7640.269 fsck.py:270(check_s3_refcounts)

So according to the profiler, the entire 7639 seconds were spent executing the function itself.

How is this possible? I really don't see how the above function can consume any CPU time without spending it in one of the called sub-functions.

Puzzled,
-Nikolaus
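[A note for archive readers, illustrating what tottime does and does not count: tottime is time spent in the frame itself, excluding sub-calls, while cumtime includes them; calls into C code (such as apsw's execute method) normally get their own {method ...} or {built-in ...} rows rather than being folded into the caller's tottime. A self-contained sketch:]

```python
import cProfile
import io
import pstats
import time

def inner():
    time.sleep(0.02)      # all of inner's time is spent in a C call

def outer():
    for _ in range(3):
        inner()

prof = cProfile.Profile()
prof.runcall(outer)

buf = io.StringIO()
pstats.Stats(prof, stream=buf).sort_stats('cumulative').print_stats()
report = buf.getvalue()
# outer's cumtime is ~0.06s, but its tottime is near zero: the time
# is attributed to inner and to the time.sleep built-in's own row.
```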
Determine PyArg_ParseTuple parameters at runtime?
Hello,

I want to create an extension module that provides an interface to a couple of C functions that take arguments of type struct iovec, struct stat, struct flock, etc. (the FUSE library, in case it matters). Now the problem is that these structures contain attributes of type fsid_t, off_t, dev_t etc.

Since I am receiving the values for these attributes from the Python caller, I have to convert them to the correct type using PyArg_ParseTuple. However, since the structs are defined in a system-dependent header file, when writing the code I do not know whether, on the target system, e.g. off_t will be of type long, long long or unsigned long, so I don't know which format string to pass to PyArg_ParseTuple.

Are there any best practices for handling this kind of situation? I'm at a loss right now. The only thing that comes to my mind is to compare, e.g., sizeof(off_t) to sizeof(long) and sizeof(long long) and thereby determine the correct bit length at runtime. But this does not help me figure out whether it is signed or unsigned, or (in case of other attributes than off_t) whether it is an integer at all.

Best,
-Nikolaus
Re: [distutils] Install script under a different name
Wolodja Wentland writes:
> On Fri, Dec 04, 2009 at 19:34 -0500, Nikolaus Rath wrote:
>> All my Python files have extension .py. However, I would like to install
>> scripts that are meant to be called by the user without the suffix, i.e.
>> the file scripts/doit.py should end up as /usr/bin/doit.
>>
>> Apparently the scripts= option of the setup() function does not support
>> this directly. Is there a clever way to get what I want?
>
> You can also use entry points to create the executable at install time.
> Have a look at [1] which explains how this is done. This requires using
> Distribute/setuptools though, ...
>
> [1] http://packages.python.org/distribute/setuptools.html#automatic-script-creation

That looks perfect, thanks!

Best,
-Nikolaus
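[A note for archive readers: in setup.py, the suggestion looks roughly like the sketch below. The names mypkg/doit are placeholders; the scripts keep their .py suffix in the source tree, and the installed /usr/bin/doit is a small generated wrapper that imports and calls the named function.]

```python
# setup.py -- illustrative fragment, not a complete project
from setuptools import setup

setup(
    name='mypkg',
    version='0.1',
    packages=['mypkg'],
    entry_points={
        'console_scripts': [
            # "installed-name = package.module:function"
            'doit = mypkg.doit:main',
        ],
    },
)
```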
Re: [distutils] Install script under a different name
Lie Ryan writes:
> On 12/5/2009 11:34 AM, Nikolaus Rath wrote:
>> Hello,
>>
>> All my Python files have extension .py. However, I would like to install
>> scripts that are meant to be called by the user without the suffix, i.e.
>> the file scripts/doit.py should end up as /usr/bin/doit.
>>
>> Apparently the scripts= option of the setup() function does not support
>> this directly. Is there a clever way to get what I want?
>
> if this is on windows, you should add ".py" to the PATHEXT environment
> variable.
>
> on linux/unix, you need to add the proper #! line to the top of any
> executable scripts and of course set the executable bit permission
> (chmod +x scriptname). In linux/unix there is no need to have the .py
> extension for a file to be recognized as python script (i.e. just
> remove it).

Sorry, but I think this is totally unrelated to my question. I want to rename files during the setup process. This is not going to happen by adding/changing any #! lines or the permissions of the file.

I know that there is no need to have the .py extension; that's why I want to install the scripts without this suffix. But in my source distribution I want to keep the suffix for various reasons.

Best,
-Nikolaus
[distutils] Install script under a different name
Hello,

All my Python files have extension .py. However, I would like to install scripts that are meant to be called by the user without the suffix, i.e. the file scripts/doit.py should end up as /usr/bin/doit.

Apparently the scripts= option of the setup() function does not support this directly. Is there a clever way to get what I want?

Best,
-Nikolaus
Re: Monkeypatching an object to become callable
Bruno Desthuilliers writes:
> 7stud a écrit :
> (snip)
>> class Wrapper(object):
>>     def __init__(self, obj, func):
>>         self.obj = obj
>>         self.func = func
>>
>>     def __call__(self, *args):
>>         return self.func(*args)
>>
>>     def __getattr__(self, name):
>>         return object.__getattribute__(self.obj, name)
>
> This should be
>
>     return getattr(self.obj, name)
>
> directly calling object.__getattribute__ might skip redefinition of
> __getattribute__ in self.obj.__class__ or its mro.

Works nicely, thanks. I came up with the following shorter version which modifies the object in-place:

class Modifier(obj.__class__):
    def __call__(self):
        return fn()

obj.__class__ = Modifier

To me this seems a bit more elegant (less code, fewer underscores). Or are there some cases where the above would fail?

Best,
-Nikolaus
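[A note for archive readers: the class-swap pattern generalizes to a small helper -- type() builds the one-off subclass, so each instance gets its own __call__ without touching the original class. One caveat: assigning to __class__ only works when the old and new classes have compatible layouts; it can fail for classes using __slots__ and for many built-in types. A Python 3 sketch with an illustrative Foo:]

```python
class Foo:
    pass

def make_callable(obj, fn):
    """Give *obj* (alone, not its class) a __call__ by swapping in
    a one-off subclass.  staticmethod() stops Python from passing
    the instance as an implicit first argument to fn."""
    cls = type('Callable' + type(obj).__name__,
               (type(obj),),
               {'__call__': staticmethod(fn)})
    obj.__class__ = cls
    return obj

t = make_callable(Foo(), lambda: 'bar')
```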
Profiling a Callback Function
Hello,

I am trying to profile a Python program that primarily calls a C extension. From within the C extension, a callback Python function is then called concurrently in several threads. When I tried to profile this application with

import c_extension

def callback_fn(args):
    # Do all sorts of complicated, time consuming stuff
    pass

def main():
    c_extension.call_me_back(callback_fn, some_random_args)

cProfile.run('main()', 'profile.dat')

I only got results for main(), but no information at all about callback_fn. What is the proper way to profile such an application?

I already thought about this:

import c_extension

def callback_fn(args):
    # Do all sorts of complicated, time consuming stuff
    pass

def callback_wrapper(args):
    def doit():
        callback_fn(args)
    cProfile.run('doit()', 'profile.dat')

c_extension.call_me_back(callback_wrapper, some_random_args)

but that probably overwrites the profiling information whenever callback_wrapper is called, instead of accumulating it over several calls (with different arguments).

Best,
-Nikolaus
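[A note for archive readers: one way to accumulate across calls is to keep a single Profile object and bracket each callback with enable()/disable() instead of using cProfile.run(), which starts a fresh profile every time. A caveat worth checking for the multi-threaded case: a profiler only observes the thread it was enabled in, so concurrent callbacks would need one Profile per thread, merged afterwards with pstats.Stats.add(). A single-threaded sketch:]

```python
import cProfile
import io
import pstats

prof = cProfile.Profile()          # one profiler, reused across calls

def callback_fn(n):
    # stand-in for the real, time-consuming callback
    return sum(range(n))

def callback_wrapper(n):
    prof.enable()                  # accumulate instead of overwrite
    try:
        return callback_fn(n)
    finally:
        prof.disable()

for n in (10, 100, 1000):          # several calls, different arguments
    callback_wrapper(n)

buf = io.StringIO()
pstats.Stats(prof, stream=buf).print_stats()
report = buf.getvalue()            # shows callback_fn with ncalls == 3
```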
Monkeypatching an object to become callable
Hi, I want to monkeypatch an object so that it becomes callable, although originally it is not meant to be. (Yes, I think I do have a good reason to do so.) But simply adding a __call__ attribute to the object apparently isn't enough, and I do not want to touch the class object (since it would modify all the instances):

    >>> class foo(object):
    ...     pass
    ...
    >>> t = foo()
    >>> def test():
    ...     print 'bar'
    ...
    >>> t()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'foo' object is not callable
    >>> t.__call__ = test
    >>> t()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'foo' object is not callable
    >>> t.__call__()
    bar

Is there an additional trick to get it to work?

Best, -Nikolaus

-- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
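The reason the snippet fails is that implicit special-method invocation bypasses the instance: t() is resolved roughly as type(t).__call__(t), so only the class (not the instance dict) is consulted. A short Python 3 demonstration of the same effect:

```python
class Foo:
    pass

t = Foo()
t.__call__ = lambda: 'bar'   # stored in the instance dict, but...

try:
    t()                      # ...the call operator only looks at type(t)
except TypeError:
    instance_patch_worked = False
else:
    instance_patch_worked = True

direct = t.__call__()        # explicit attribute access does find it
print(instance_patch_worked, direct)
```

This is why the Wrapper and __class__-reassignment approaches discussed elsewhere in these threads work: both put __call__ on a class.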
Re: Python docs disappointing
Carl Banks writes: >> This is one area in which Perl still whips Python... > > No way. Perl's man pages are organized so poorly there is no > ergonomic pit deep enough to offset them. Quick, what man page is the > "do" statement documented in? Of course there is: $ perldoc -f do | head do BLOCK Not really a function. Returns the value of the last command in the sequence of commands indicated by BLOCK. When modified by the "while" or "until" loop modifier, executes the BLOCK once before testing the loop condition. (On other statements the loop modifiers test the conditional first.) "do BLOCK" does not count as a loop, so the loop control statements "next", "last", or "redo" cannot be used to leave or restart the block. See perlsyn for alternative strategies. $ Best, -Nikolaus -- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
Implementing a cache
Hello, I want to implement a caching data structure in Python that allows me to:

1. Quickly look up objects using a key
2. Keep track of the order in which the objects are accessed (most recently and least recently accessed one, not a complete history)
3. Quickly retrieve and remove the least recently accessed object.

Here's my idea for the implementation: The objects in the cache are encapsulated in wrapper objects:

    class OrderedDictElement(object):
        __slots__ = [ "next", "prev", "key", "value" ]

These wrapper objects are then kept in a linked list and in an ordinary dict (self.data) in parallel. Object access then works as follows:

    def __setitem__(self, key, value):
        if key in self.data:
            # Key already exists, just overwrite value
            self.data[key].value = value
        else:
            # New key, attach at head of list
            with self.lock:
                el = OrderedDictElement(key, value, next=self.head.next, prev=self.head)
                self.head.next.prev = el
                self.head.next = el
                self.data[key] = el

    def __getitem__(self, key):
        return self.data[key].value

To 'update the access time' of an object, I use

    def to_head(self, key):
        with self.lock:
            el = self.data[key]
            # Splice out
            el.prev.next = el.next
            el.next.prev = el.prev
            # Insert back at front
            el.next = self.head.next
            el.prev = self.head
            self.head.next.prev = el
            self.head.next = el

self.head and self.tail are special sentinel objects that only have a .next and a .prev attribute respectively. While this is probably going to work, I'm not sure if it's the best solution, so I'd appreciate any comments. Can it be done more elegantly? Or is there an entirely different way to construct the data structure that also fulfills my requirements? I already looked at the new OrderedDict class in Python 3.1, but apparently it does not allow me to change the ordering and is therefore not suitable for my purpose. (I can move something to one end by deleting and reinserting it, but I'd like to keep at least the option of also moving objects to the opposite end.)
Best, -Nikolaus -- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
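For reference, the reordering operation this post misses was later added to collections.OrderedDict (Python 3.2+) as move_to_end(key, last=...), which supports moving a key to either end. A minimal LRU-cache sketch on top of it:

```python
from collections import OrderedDict

class LRUCache:
    """Dict with LRU eviction; most recently used keys live at the right end."""
    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = OrderedDict()

    def __setitem__(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)           # mark as most recently used
        if len(self.data) > self.maxsize:
            self.data.popitem(last=False)    # evict least recently used

    def __getitem__(self, key):
        value = self.data[key]
        self.data.move_to_end(key)           # a lookup also refreshes the key
        return value

    def __contains__(self, key):
        return key in self.data

cache = LRUCache(2)
cache['a'] = 1
cache['b'] = 2
_ = cache['a']     # touch 'a', so 'b' becomes least recently used
cache['c'] = 3     # evicts 'b'
```

move_to_end(key, last=False) moves to the opposite end, covering the requirement in the parenthesis above.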
Logging multiple lines
Hi, Are there any best practices for handling multi-line log messages? For example, the program

    main = logging.getLogger()
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
    main.addHandler(handler)
    main.setLevel(logging.DEBUG)
    main.info("Starting")
    try:
        bla = 42/0
    except:
        main.exception("Oops...")

generates the log messages

    2009-06-16 22:19:57,515 INFO Starting
    2009-06-16 22:19:57,518 ERROR Oops...
    Traceback (most recent call last):
      File "/home/nikratio/lib/EclipseWorkspace/playground/src/mytests.py", line 27, in <module>
        bla = 42/0
    ZeroDivisionError: integer division or modulo by zero

which are a mess in any logfile because they make it really difficult to parse. How do you usually handle multi-line messages? Do you avoid them completely (and therefore also the exception logging facilities provided by logging)? Or is it possible to tweak the formatter so that it inserts the prefix at the beginning of every line?

Best, -Nikolaus

-- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
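It is possible to tweak the formatter: logging.Formatter.format() can be overridden to post-process the fully formatted record, including the appended traceback. A sketch that tags continuation lines (the "    | " marker is an arbitrary choice, not a logging convention):

```python
import io
import logging

class MultilineFormatter(logging.Formatter):
    """Mark continuation lines (incl. tracebacks) so each log line is parseable."""
    def format(self, record):
        text = super().format(record)
        # every embedded newline gets an explicit continuation marker
        return text.replace("\n", "\n    | ")

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(MultilineFormatter("%(levelname)s %(message)s"))
log = logging.getLogger("multiline-demo")
log.addHandler(handler)
log.propagate = False
log.setLevel(logging.DEBUG)

try:
    1 / 0
except ZeroDivisionError:
    log.exception("Oops...")

print(stream.getvalue())
```

A log parser can then treat lines starting with the marker as belonging to the previous record. Repeating the full timestamp prefix on every line is also possible, but needs slightly more work since the prefix length varies.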
Exceptions and Object Destruction (was: Problem with apsw and garbage collection)
Nikolaus Rath writes:
> Hi,
>
> Please consider this example: []

I think I managed to narrow down the problem a bit. It seems that when a function returns normally, its local variables are immediately destroyed. However, if the function is left due to an exception, the local variables remain alive:

    #!/usr/bin/env python
    import gc

    class testclass(object):
        def __init__(self):
            print "Initializing"

        def __del__(self):
            print "Destructing"

    def dostuff(fail):
        obj = testclass()
        if fail:
            raise TypeError

    print "Calling dostuff"
    dostuff(fail=False)
    print "dostuff returned"

    try:
        print "Calling dostuff"
        dostuff(fail=True)
    except TypeError:
        pass

    gc.collect()
    print "dostuff returned"

Prints out:

    Calling dostuff
    Initializing
    Destructing
    dostuff returned
    Calling dostuff
    Initializing
    dostuff returned
    Destructing

Is there a way to have the obj variable (that is created in dostuff()) destroyed earlier than at the end of the program? As you can see, I already tried to explicitly call the garbage collector, but this does not help.

Best, -Nikolaus

-- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
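The locals survive because the interpreter keeps a reference to the traceback of the handled exception (in Python 2 it stays reachable via sys.exc_info() until the next exception; Python 3 clears it when the except block exits). The following Python 3 sketch makes that reference explicit; the prompt finalization after `del tb` relies on CPython's reference counting:

```python
import sys

class Tracked:
    deleted = False
    def __del__(self):
        Tracked.deleted = True

def dostuff():
    obj = Tracked()          # lives in dostuff's stack frame
    raise TypeError

tb = None
try:
    dostuff()
except TypeError:
    tb = sys.exc_info()[2]   # keep the traceback (and its frames) alive

alive_while_held = not Tracked.deleted   # obj still reachable via tb's frames
del tb                                    # dropping the traceback frees the frames...
freed_after_del = Tracked.deleted         # ...so obj is finalized right away (CPython)
print(alive_while_held, freed_after_del)
```

In Python 2, the equivalent fix for the post's problem was to call sys.exc_clear() (removed in Python 3) after handling the exception, which drops the saved traceback and lets the function's locals die.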
Problem with apsw and garbage collection
Hi, Please consider this example:

    #!/usr/bin/env python
    import apsw
    import tempfile

    fh = tempfile.NamedTemporaryFile()
    conn = apsw.Connection(fh.name)

    # Fill the db with some data
    cur = conn.cursor()
    datafh = open("/dev/urandom", "rb")
    cur.execute("CREATE TABLE foo (no INT, data BLOB)")
    for i in range(42):
        cur.execute("INSERT INTO foo(no,data) VALUES(?,?)",
                    (i, buffer(datafh.read(4096))))
    del cur

    # Here comes the problem:
    def dostuff(conn, fail):
        cur = conn.cursor()
        # Ignore result, we just need to make some query
        cur.execute("SELECT data FROM foo WHERE no = ?", (33,))
        cur2 = conn.cursor()
        if fail:
            raise TypeError
        else:
            return

    # This works:
    dostuff(conn, fail=False)
    cur = conn.cursor()
    cur.execute("VACUUM")
    #cur.execute("CREATE TABLE test (no INT)")

    # This does not:
    try:
        dostuff(conn, fail=True)
    except TypeError:
        cur = conn.cursor()
        cur.execute("VACUUM")
        #cur.execute("CREATE TABLE test2 (no INT)")

While the first execute("VACUUM") call succeeds, the second does not but raises an apsw.BusyError (meaning that sqlite thinks that it cannot get an exclusive lock on the database). I suspect that the reason for that is that the cursor object that is created in the function is not destroyed when the function is left with raise (rather than return), which in turn prevents sqlite from obtaining the lock. However, if I exchange the VACUUM command for something else (e.g. CREATE TABLE), the program runs fine. I think this casts some doubt on the above explanation, since, AFAIK, sqlite always locks the entire file and should therefore have the same problem as before. Can someone explain what exactly is happening here?

Best, -Nikolaus

-- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
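Whatever the exact apsw semantics, the robust pattern is not to rely on destructors at all and to close cursors deterministically in a finally block. A sketch of that pattern using the stdlib sqlite3 module (chosen only because apsw may not be installed; apsw cursors can likewise be closed explicitly):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (no INT)")
conn.executemany("INSERT INTO foo VALUES (?)", [(i,) for i in range(42)])
conn.commit()

def dostuff(conn, fail):
    cur = conn.cursor()
    try:
        cur.execute("SELECT no FROM foo WHERE no = ?", (33,))
        if fail:
            raise TypeError
    finally:
        cur.close()   # released even when we leave via raise

try:
    dostuff(conn, fail=True)
except TypeError:
    pass

conn.execute("VACUUM")   # no lingering cursor is left around to hold locks
```

With the explicit close, the function's exit path no longer matters, so the interaction between exception tracebacks and cursor lifetimes from the follow-up post never comes into play.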
Re: [pyunit] Only run one specific test
Dave Angel writes: > Nikolaus Rath wrote: >> Hi, >> >> Consider these two files: >> >> , mytest.py - >> | #!/usr/bin/env python >> | import unittest >> | | class myTestCase(unittest.TestCase): >> | def test_foo(self): >> |pass >> | | # Somehow important according to pyunit documentation >> | def suite(): >> | return unittest.makeSuite(myTestCase) >> ` >> >> , runtest --- >> | #!/usr/bin/env python >> | import unittest >> | | # Find and import tests >> | modules_to_test = [ "mytest" ] >> | map(__import__, modules_to_test) >> | | # Runs all tests in test/ directory >> | def suite(): >> | alltests = unittest.TestSuite() >> | for name in modules_to_test: >> | alltests.addTest(unittest.findTestCases(sys.modules[name])) >> | return alltests >> | | if __name__ == '__main__': >> | unittest.main(defaultTest='suite') >> ` >> >> >> if I run runtest without arguments, it works. But according to runtest >> --help, I should also be able to do >> >> , >> | $ ./runtest mytest >> | Traceback (most recent call last): >> | File "./runtest", line 20, in >> | unittest.main() >> | File "/usr/lib/python2.6/unittest.py", line 816, in __init__ >> | self.parseArgs(argv) >> | File "/usr/lib/python2.6/unittest.py", line 843, in parseArgs >> | self.createTests() >> | File "/usr/lib/python2.6/unittest.py", line 849, in createTests >> | self.module) >> | File "/usr/lib/python2.6/unittest.py", line 613, in loadTestsFromNames >> | suites = [self.loadTestsFromName(name, module) for name in names] >> | File "/usr/lib/python2.6/unittest.py", line 584, in loadTestsFromName >> | parent, obj = obj, getattr(obj, part) >> | AttributeError: 'module' object has no attribute 'mytest' >> ` >> >> >> Why doesn't this work? >> >> Best, >> >>-Nikolaus >> >> >> > First, you're missing aimport sys in the runtest.py module. > Without that, it won't even start. Sorry, I must have accidentally deleted the line when I deleted empty lines to make the example more compact. 
> Now, I have no familiarity with unittest, but I took this as a > challenge. The way I read the code is that you need an explicit > import of mytest if you're > going to specify a commandline of >runtest mytest > > So I'd add two lines to the beginning of runtest.py: > > import sys > import mytest Yes, that works indeed. But in practice the modules_to_import list is filled by parsing the contents of a test/*.py directory. That's why I import dynamically with __import__. Nevertheless, you got me on the right track. After I explicitly added the modules to the global namespace (globals()["mytest"] = __import__("mytest")), it works fine. Thx! Best, -Nikolaus -- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
[pyunit] Only run one specific test
Hi, Consider these two files: , mytest.py - | #!/usr/bin/env python | import unittest | | class myTestCase(unittest.TestCase): | def test_foo(self): | pass | | # Somehow important according to pyunit documentation | def suite(): | return unittest.makeSuite(myTestCase) ` , runtest --- | #!/usr/bin/env python | import unittest | | # Find and import tests | modules_to_test = [ "mytest" ] | map(__import__, modules_to_test) | | # Runs all tests in test/ directory | def suite(): | alltests = unittest.TestSuite() | for name in modules_to_test: | alltests.addTest(unittest.findTestCases(sys.modules[name])) | return alltests | | if __name__ == '__main__': | unittest.main(defaultTest='suite') ` if I run runtest without arguments, it works. But according to runtest --help, I should also be able to do , | $ ./runtest mytest | Traceback (most recent call last): | File "./runtest", line 20, in | unittest.main() | File "/usr/lib/python2.6/unittest.py", line 816, in __init__ | self.parseArgs(argv) | File "/usr/lib/python2.6/unittest.py", line 843, in parseArgs | self.createTests() | File "/usr/lib/python2.6/unittest.py", line 849, in createTests | self.module) | File "/usr/lib/python2.6/unittest.py", line 613, in loadTestsFromNames | suites = [self.loadTestsFromName(name, module) for name in names] | File "/usr/lib/python2.6/unittest.py", line 584, in loadTestsFromName | parent, obj = obj, getattr(obj, part) | AttributeError: 'module' object has no attribute 'mytest' ` Why doesn't this work? Best, -Nikolaus -- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
Strange pexpect behaviour: just duplicates stdin
Hello, I have a strange problem with pexpect: $ cat test.py #!/usr/bin/python import pexpect child = pexpect.spawn("./test.pl") while True: try: line = raw_input() except EOFError: break child.sendline(line) print child.readline().rstrip("\r\n") child.close() $ cat test.pl #!/usr/bin/perl # Replace all digits by __ while (<>) { s/[0-9]+/__/g; print; } $ echo bla24fasel | ./test.pl bla__fasel $ echo bla24fasel | ./test.py bla24fasel Why doesn't the last command return bla__fasel too? I extracted this example from an even stranger problem in a bigger program. In there, pexpect sometimes returns both the string send with sendline() *and* the output of the child program, sometimes the correct output of the child, and sometimes only the input it has send to the child. I couldn't figure out a pattern, but the above example always produces the same result. Anyone able to help? Best, -Nikolaus -- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
Re: Calculate sha1 hash of a binary file
LaundroMat <[EMAIL PROTECTED]> writes:
> Hi -
>
> I'm trying to calculate unique hash values for binary files, independent
> of their location and filename, and I was wondering whether I'm going in
> the right direction.
>
> Basically, the hash values are calculated thusly:
>
>     f = open('binaryfile.bin')
>     import hashlib
>     h = hashlib.sha1()
>     h.update(f.read())
>     hash = h.hexdigest()
>     f.close()
>
> A quick try-out shows that effectively, after renaming a file, its hash
> remains the same as it was before.
>
> I have my doubts however as to the usefulness of this. As f.read() does
> not seem to read until the end of the file (for a 3.3MB file only a
> string of 639 bytes is being returned, perhaps a 00-byte counts as
> EOF?), is there a high danger of collisions?
>
> Are there better ways of calculating hash values of binary files?

Apart from opening the file in binary mode, I would consider reading and updating the hash in chunks of e.g. 512 KB. The above code is probably going to perform horribly for sufficiently large files, since you try to read the entire file into memory.

Best, -Nikolaus

-- »It is not worth an intelligent man's time to be in the majority. By definition, there are already enough people to do that.« -J.H. Hardy PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
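A chunked version of the suggestion might look like this (file_sha1 is an illustrative name):

```python
import hashlib

def file_sha1(path, chunk_size=512 * 1024):
    """Hash a file incrementally so large files never have to fit in memory."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:       # binary mode: no text-mode translation
        while True:
            chunk = f.read(chunk_size)
            if not chunk:             # b'' signals real EOF; NUL bytes do not
                break
            sha1.update(chunk)
    return sha1.hexdigest()
```

The 639-byte observation in the question was almost certainly caused by opening the file in text mode on Windows, where a Ctrl-Z (0x1A) byte terminates the read — another reason 'rb' matters.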
Re: Find class of an instance?
Neal Becker <[EMAIL PROTECTED]> writes: > Sounds simple, but how, given an instance, do I find the class? It does not only sound simple. When 'inst' is your instance, then inst.__class__ or type(inst) is the class. Best, -Nikolaus -- »It is not worth an intelligent man's time to be in the majority. By definition, there are already enough people to do that.« -J.H. Hardy PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
Re: Locking around
Tobiah <[EMAIL PROTECTED]> writes: > On Mon, 04 Aug 2008 15:30:51 +0200, Nikolaus Rath wrote: > >> Hello, >> >> I need to synchronize the access to a couple of hundred-thousand >> files[1]. It seems to me that creating one lock object for each of the >> files is a waste of resources, but I cannot use a global lock for all >> of them either (since the locked operations go over the network, this >> would make the whole application essentially single-threaded even >> though most operations act on different files). > > Do you think you could use an SQL database on the network to > handle the locking? Yeah, I could. It wouldn't even have to be over the network (I'm synchronizing access from within the same program). But I think that is even more resource-wasteful than my original idea. > Hey, maybe the files themselves should go into blobs. Nope, not possible. They're on Amazon S3. Best, -Nikolaus -- »It is not worth an intelligent man's time to be in the majority. By definition, there are already enough people to do that.« -J.H. Hardy PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
Re: Locking around
Carl Banks <[EMAIL PROTECTED]> writes: > Freaky... I just posted nearly this exact solution. > > I have a couple comments. First, the call to acquire should come > before the try block. If the acquire were to fail, you wouldn't want > to release the lock on cleanup. > > Second, you need to change notify() to notifyAll(); notify alone won't > cut it. Consider what happens if you have two threads waiting for > keys A and B respectively. When the thread that has B is done, it > releases B and calls notify, but notify happens to wake up the thread > waiting on A. Thus the thread waiting on B is starved. You're right. Thanks for pointing it out. Best, -Nikolaus -- »It is not worth an intelligent man's time to be in the majority. By definition, there are already enough people to do that.« -J.H. Hardy PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
Re: Locking around
Nikolaus Rath <[EMAIL PROTECTED]> writes:
>> This should work, at least the idea is not flawed. However, I'd say
>> there are too many locks involved. Rather, you just need a simple
>> flag and the global lock. Further, you need a condition/event that
>> tells waiting threads that you released some of the files so that it
>> should see again if the ones it wants are available.
>
> I have to agree that this sounds like an easier implementation. I
> just have to think about how to do the signalling.

Thanks a lot! Here's the code I use now. I think it's also significantly easier to understand (cv is a threading.Condition() object and cv.locked_keys a set()):

    def lock_s3key(s3key):
        cv = self.s3_lock
        try:
            # Lock set of locked s3 keys (global lock)
            cv.acquire()

            # Wait for given s3 key becoming unused
            while s3key in cv.locked_keys:
                cv.wait()

            # Mark it as used (local lock)
            cv.locked_keys.add(s3key)
        finally:
            # Release global lock
            cv.release()

    def unlock_s3key(s3key):
        cv = self.s3_lock
        try:
            # Lock set of locked s3 keys (global lock)
            cv.acquire()

            # Mark key as free (release local lock)
            cv.locked_keys.remove(s3key)

            # Notify other threads
            cv.notify()
        finally:
            # Release global lock
            cv.release()

Best, -Nikolaus

-- »It is not worth an intelligent man's time to be in the majority. By definition, there are already enough people to do that.« -J.H. Hardy PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
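In today's Python the same structure can be written with `with` blocks and notify_all() (incorporating Carl Banks's correction from elsewhere in this thread). A self-contained sketch that wraps the condition and the key set in a class instead of self attributes:

```python
import threading

class KeyLocker:
    """Serialize access per key using one Condition and a set of busy keys."""
    def __init__(self):
        self._cv = threading.Condition()
        self._locked_keys = set()

    def lock_key(self, key):
        with self._cv:                     # 'with' replaces acquire()/release()
            while key in self._locked_keys:
                self._cv.wait()
            self._locked_keys.add(key)

    def unlock_key(self, key):
        with self._cv:
            self._locked_keys.remove(key)
            self._cv.notify_all()          # wake waiters on *any* key

locker = KeyLocker()
log = []

def worker(key):
    locker.lock_key(key)
    try:
        log.append(key)                    # the per-key critical section
    finally:
        locker.unlock_key(key)

threads = [threading.Thread(target=worker, args=(k,)) for k in "abcd"]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

notify_all() avoids the starvation scenario Carl describes: a plain notify() may wake a thread waiting on a different key, which then goes straight back to sleep while the right waiter never runs.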
Re: Locking around
Ulrich Eckhardt <[EMAIL PROTECTED]> writes: > Nikolaus Rath wrote: >> I need to synchronize the access to a couple of hundred-thousand >> files[1]. It seems to me that creating one lock object for each of the >> files is a waste of resources, but I cannot use a global lock for all >> of them either (since the locked operations go over the network, this >> would make the whole application essentially single-threaded even >> though most operations act on different files). > > Just wondering, but at what time do you know what files are needed? As soon as I have read a client request. Also, I will only need one file per request, not multiple. > If you know that rather early, you could simply 'check out' the > required files, do whatever you want with them and then release them > again. If one of the requested files is marked as already in use, > you simply wait (without reserving the others) until someone > releases files and then try again. You could also wait for that > precise file to be available, but that would require that you > already reserve the other files, which might unnecessarily block > other accesses. > > Note that this idea requires that each access locks one set of files at the > beginning and releases them at the end, i.e. no attempts to lock files in > between, which would otherwise easily lead to deadlocks. I am not sure that I understand your idea. To me this sounds exactly like what I'm already doing, just replace 'check out' by 'lock' in your description... Am I missing something? >> My idea is therefore to create and destroy per-file locks "on-demand" >> and to protect the creation and destruction by a global lock >> (self.global_lock). For that, I add a "usage counter" >> (wlock.user_count) to each lock, and destroy the lock when it reaches >> zero. > [...code...] > >> - Does that look like a proper solution, or does anyone have a better >>one? > > This should work, at least the idea is not flawed. 
However, I'd say > there are too many locks involved. Rather, you just need a simple > flag and the global lock. Further, you need a condition/event that > tells waiting threads that you released some of the files so that it > should see again if the ones it wants are available. I have to agree that this sounds like an easier implementation. I just have to think about how to do the signalling. Thanks a lot! Best, -Nikolaus -- »It is not worth an intelligent man's time to be in the majority. By definition, there are already enough people to do that.« -J.H. Hardy PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
Locking around
Hello, I need to synchronize the access to a couple of hundred-thousand files[1]. It seems to me that creating one lock object for each of the files is a waste of resources, but I cannot use a global lock for all of them either (since the locked operations go over the network, this would make the whole application essentially single-threaded even though most operations act on different files). My idea is therefore to create and destroy per-file locks "on-demand" and to protect the creation and destruction by a global lock (self.global_lock). For that, I add a "usage counter" (wlock.user_count) to each lock, and destroy the lock when it reaches zero. The number of currently active lock objects is stored in a dict:

    def lock_s3key(s3key):
        self.global_lock.acquire()
        try:
            # If there is a lock object, use it
            if self.key_lock.has_key(s3key):
                wlock = self.key_lock[s3key]
                wlock.user_count += 1
                lock = wlock.lock
            # otherwise create a new lock object
            else:
                wlock = WrappedLock()
                wlock.lock = threading.Lock()
                wlock.user_count = 1
                self.key_lock[s3key] = wlock
                lock = wlock.lock
        finally:
            self.global_lock.release()

        # Lock the key itself
        lock.acquire()

and similarly

    def unlock_s3key(s3key):
        # Lock dictionary of lock objects
        self.global_lock.acquire()
        try:
            # Get lock object
            wlock = self.key_lock[s3key]

            # Unlock key
            wlock.lock.release()

            # We don't use the lock object any longer
            wlock.user_count -= 1

            # If no other thread uses the lock, dispose it
            if wlock.user_count == 0:
                del self.key_lock[s3key]
            assert wlock.user_count >= 0
        finally:
            self.global_lock.release()

WrappedLock is just an empty class that allows me to add the additional user_count attribute. My questions: - Does that look like a proper solution, or does anyone have a better one? - Did I overlook any deadlock possibilities?

Best, Nikolaus

[1] Actually, it's not really files (because in that case I could use fcntl) but blobs stored on Amazon S3.

-- »It is not worth an intelligent man's time to be in the majority.
By definition, there are already enough people to do that.« -J.H. Hardy PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
Re: Difference between type and class
Steven D'Aprano <[EMAIL PROTECTED]> writes: > So, to the Original Poster: > > In Python, new-style classes and types are the same, but it is > traditional to refer to customer objects as "class" and built-in > objects as "types". Old-style classes are different, but you are > discouraged from using old-style classes unless you have a specific > reason for needing them (e.g. backwards compatibility). I see. Thanks a lot for the explanation. Best, -Nikolaus -- »It is not worth an intelligent man's time to be in the majority. By definition, there are already enough people to do that.« -J.H. Hardy PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
Re: Difference between type and class
Miles <[EMAIL PROTECTED]> writes: > On Thu, Jul 31, 2008 at 1:59 PM, Nikolaus Rath wrote: >> If it is just a matter of different rendering, what's the reason for >> doing it like that? Wouldn't it be more consistent and straightforward >> to denote builtin types as classes as well? > > Yes, and in Python 3, it will be so: > [..] > http://svn.python.org/view?rev=23331&view=rev That makes matters absolutely clear. Thanks a lot. No more questions from my side ;-). Best, -Nikolaus -- »It is not worth an intelligent man's time to be in the majority. By definition, there are already enough people to do that.« -J.H. Hardy PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
Re: Difference between type and class
Maric Michaud <[EMAIL PROTECTED]> writes: >>> What the means is that int is not a user type but a >>> builtin type, instances of int are not types (or classes) but common >>> objects, so its nature is the same as any classes. >>> >>> The way it prints doesn't matter, it's just the __repr__ of any instance, >>> and the default behavior for instances of type is to return '', >>> but it can be easily customized. >> >> But 'int' is an instance of 'type' (the metaclass): >> >>> int.__class__ >> >> >> >> so it should also return '' if that's the default behavior >> of the 'type' metaclass. >> > > The fact that a class is an instance of type, which it is always true, > doesn't > mean its metaclass is "type", it could be any subclass of type : Yes, that is true. But it's not what I said above (and below). 'obj.__class__' gives the class of 'obj', so if 'int.__class__ is type' then 'type' is the class of 'int' and 'int' is *not* an instance of some metaclass derived from 'type'. >> I think that to get '' one would have to define a new >> metaclass like this: >> >> def type_meta(type): >> def __repr__(self) >> return "" % self.__name__ >> >> and then one should have int.__class__ is type_meta. But obviously >> that's not the case. Why? >> >> Moreover: >> >>> class myint(int): >> >> ... pass >> ... >> >> >>> myint.__class__ is int.__class__ >> True >> >> >>> int >> >> >> >>> myint >> >> >> despite int and myint having the same metaclass. So if the >> representation is really defined in the 'type' metaclass, then >> type.__repr__ has to make some kind of distinction between int and >> myint, so they cannot be on absolute equal footing. > > You're right, type(int) is type, the way it renders differently is a > detail of its implementation, you can do things with builtin types > (written in C) you coudn't do in pure python, exactly as you > couldn't write recursive types like 'object' and 'type'. 
If it is just a matter of different rendering, what's the reason for doing it like that? Wouldn't it be more consistent and straightforward to denote builtin types as classes as well? And where exactly is this different rendering implemented? Could I write my own type (in C, of course) and make it behave like e.g. 'int'? I.e. its rendering should be different and not inherited by subclasses:

    >>> my_type
    <type 'my_type'>
    >>> a = my_type(42)
    >>> a.__class__
    <type 'my_type'>
    >>> class derived(my_type):
    ...     pass

or would I have to change the implementation of 'type' for this (since it contains the __repr__ function that renders the type)? This is of course purely theoretical and probably without any practical relevance. I'm sorry if I just can't stop drilling, but I think this is really interesting.

Best, -Nikolaus

-- »It is not worth an intelligent man's time to be in the majority. By definition, there are already enough people to do that.« -J.H. Hardy PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
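To answer the "where is it implemented" part with a pure-Python approximation: the rendering lives in type.__repr__ (type_repr in CPython's typeobject.c, which special-cases statically allocated, i.e. C-level, types). A custom metaclass can emulate the builtin-style rendering, but note that, unlike for real builtins, subclasses inherit the metaclass and hence the rendering. This is a Python 3 sketch with illustrative names (TypeMeta, MyInt):

```python
class TypeMeta(type):
    # repr(cls) is resolved on type(cls), i.e. on the metaclass
    def __repr__(cls):
        return "<type '%s'>" % cls.__name__

class MyInt(int, metaclass=TypeMeta):
    pass

class Derived(MyInt):   # inherits the metaclass, hence the custom rendering
    pass

rendered = (repr(MyInt), repr(Derived), repr(MyInt(42)))
print(rendered)
```

Instances are unaffected: repr(MyInt(42)) still finds int.__repr__ on the class, not on the metaclass. Getting rendering that is *not* inherited by subclasses really would require the kind of C-level special-casing that type_repr does.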
Re: Difference between type and class
Maric Michaud <[EMAIL PROTECTED]> writes: > Le Thursday 31 July 2008 16:46:28 Nikolaus Rath, vous avez écrit : >> Maric Michaud <[EMAIL PROTECTED]> writes: >> >> > Can someone explain to me the difference between a type and a class? >> >> >> >> If your confusion is of a more general nature I suggest reading the >> >> introduction of `Design Patterns' (ISBN-10: 0201633612), under >> >> `Specifying Object Interfaces'. >> >> >> >> In short: A type denotes a certain interface, i.e. a set of signatures, >> >> whereas a class tells us how an object is implemented (like a >> >> blueprint). A class can have many types if it implements all their >> >> interfaces, and different classes can have the same type if they share a >> >> common interface. The following example should clarify matters: >> > >> > Of course, this is what a type means in certain literature about OO >> > (java-ish), but this absolutely not what type means in Python. Types >> > are a family of object with a certain status, and they're type is >> > "type", conventionnaly named a metatype in standard OO. >> >> [...] >> >> Hmm. Now you have said a lot about Python objects and their type, but >> you still haven't said what a type actually is (in Python) and in what >> way it is different from a class. Or did I miss something? > > This paragraph ? > > """ > - types, or classes, are all instance of type 'type' (or a subclass of it), Well, I couldn't quite make sense of '..are instance of type...' without knowing what you actually meant with "type* in this context, but... > Maybe it's still unclear that "types" and "classes" *are* synonyms > in Python. ..in this case it becomes clear. Then my question reduces to the one in the other post (why do 'int' and 'myint' have different __repr__ results). Already thanks for your help, -Nikolaus -- »It is not worth an intelligent man's time to be in the majority. By definition, there are already enough people to do that.« -J.H. 
Hardy PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C -- http://mail.python.org/mailman/listinfo/python-list
Re: Difference between type and class
Maric Michaud <[EMAIL PROTECTED]> writes: > Le Thursday 31 July 2008 14:30:19 Nikolaus Rath, vous avez écrit : >> oj <[EMAIL PROTECTED]> writes: >> > On Jul 31, 11:37 am, Nikolaus Rath <[EMAIL PROTECTED]> wrote: >> >> So why does Python distinguish between e.g. the type 'int' and the >> >> class 'myclass'? Why can't I say that 'int' is a class and 'myclass' >> >> is a type? >> > >> > I might be wrong here, but I think the point is that there is no >> > distinction. A class (lets call it SomeClass for this example) is an >> > object of type 'type', and an instance of a class is an object of type >> > 'SomeClass'. >> >> But there seems to be a distinction: >> >>> class int_class(object): >> >> ... pass >> ... >> >> >>> int_class >> >> >> >> >>> int >> >> >> >> why doesn't this print >> >> >>> class int_class(object): >> >> ... pass >> ... >> >> >>> int_class >> >> >> >> >>> int >> >> >> >> or >> >> >>> class int_class(object): >> >> ... pass >> ... >> >> >>> int_class >> >> >> >> >>> int >> >> >> >> If there is no distinction, how does the Python interpreter know when >> to print 'class' and when to print 'type'? >> > > There are some confusion about the terms here. > > Classes are instances of type 'type', Could you please clarify what you mean with 'instance of type X'? I guess you mean that 'y is an instance of type X' iif y is constructed by instantiating X. Is that correct? > What the means is that int is not a user type but a > builtin type, instances of int are not types (or classes) but common > objects, so its nature is the same as any classes. > > The way it prints doesn't matter, it's just the __repr__ of any instance, and > the default behavior for instances of type is to return '', but it > can be easily customized. But 'int' is an instance of 'type' (the metaclass): >>> int.__class__ so it should also return '' if that's the default behavior of the 'type' metaclass. 
I think that to get '<type 'int'>' one would have to define a new
metaclass like this:

    class type_meta(type):
        def __repr__(self):
            return "<type '%s'>" % self.__name__

and then one should have int.__class__ == type_meta. But obviously
that's not the case. Why?

Moreover:

    >>> class myint(int):
    ...     pass
    ...
    >>> myint.__class__ == int.__class__
    True
    >>> int
    <type 'int'>
    >>> myint
    <class '__main__.myint'>

despite int and myint having the same metaclass. So if the
representation is really defined in the 'type' metaclass, then
type.__repr__ has to make some kind of distinction between int and
myint, so they cannot be on absolutely equal footing.

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
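For what it's worth, the mechanism argued about here can be demonstrated directly. The sketch below uses today's Python 3 syntax (where the type/class split in reprs no longer exists): a metaclass's `__repr__` controls how the class object itself prints, which is exactly the kind of hook that allowed CPython 2 to print builtins and user classes differently. The metaclass name is made up for illustration.

```python
# A metaclass's __repr__ decides how the *class object* prints.
class TypeStyleMeta(type):
    def __repr__(cls):
        # Mimic the old CPython 2 repr for builtin types.
        return "<type '%s'>" % cls.__name__

class myint(int, metaclass=TypeStyleMeta):
    pass

print(repr(myint))                   # <type 'myint'>
print(type(myint) is TypeStyleMeta)  # True
print(myint(3) + 1)                  # 4 -- still a perfectly normal int subclass
```

So the thread's conclusion is right: in CPython 2 the distinction had to live inside `type.__repr__` itself, since `int` and user classes shared the same metaclass.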
Re: Difference between type and class
Maric Michaud <[EMAIL PROTECTED]> writes:

>>> Can someone explain to me the difference between a type and a
>>> class?
>>
>> If your confusion is of a more general nature I suggest reading the
>> introduction of `Design Patterns' (ISBN-10: 0201633612), under
>> `Specifying Object Interfaces'.
>>
>> In short: A type denotes a certain interface, i.e. a set of
>> signatures, whereas a class tells us how an object is implemented
>> (like a blueprint). A class can have many types if it implements
>> all their interfaces, and different classes can have the same type
>> if they share a common interface. The following example should
>> clarify matters:
>
> Of course, this is what a type means in certain literature about OO
> (java-ish), but this is absolutely not what type means in Python.
> Types are a family of objects with a certain status, and their type
> is "type", conventionally named a metatype in standard OO.
[...]

Hmm. Now you have said a lot about Python objects and their type, but
you still haven't said what a type actually is (in Python) and in what
way it is different from a class. Or did I miss something?

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
Re: Difference between type and class
oj <[EMAIL PROTECTED]> writes:

> On Jul 31, 11:37 am, Nikolaus Rath <[EMAIL PROTECTED]> wrote:
>> So why does Python distinguish between e.g. the type 'int' and the
>> class 'myclass'? Why can't I say that 'int' is a class and 'myclass'
>> is a type?
>
> I might be wrong here, but I think the point is that there is no
> distinction. A class (lets call it SomeClass for this example) is an
> object of type 'type', and an instance of a class is an object of
> type 'SomeClass'.

But there seems to be a distinction:

    >>> class int_class(object):
    ...     pass
    ...
    >>> int_class
    <class '__main__.int_class'>
    >>> int
    <type 'int'>

why doesn't this print

    >>> int_class
    <type '__main__.int_class'>
    >>> int
    <type 'int'>

or

    >>> int_class
    <class '__main__.int_class'>
    >>> int
    <class 'int'>

If there is no distinction, how does the Python interpreter know when
to print 'class' and when to print 'type'?

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
Re: Difference between type and class
Thomas Troeger <[EMAIL PROTECTED]> writes:

>> Can someone explain to me the difference between a type and a class?
>
> If your confusion is of a more general nature I suggest reading the
> introduction of `Design Patterns' (ISBN-10: 0201633612), under
> `Specifying Object Interfaces'.
>
> In short: A type denotes a certain interface, i.e. a set of
> signatures, whereas a class tells us how an object is implemented
> (like a blueprint). A class can have many types if it implements all
> their interfaces, and different classes can have the same type if
> they share a common interface. The following example should clarify
> matters:
>
>     class A:
>         def bar(self):
>             print "A"
>
>     class B:
>         def bar(self):
>             print "B"
>
>     class C:
>         def bla(self):
>             print "C"
>
>     def foo(x):
>         x.bar()
>
> you can call foo with instances of both A and B, because both classes
> share a common type (namely the type that has a `bar' method), but
> not with an instance of C because it has no method `bar'. Btw, this
> example shows the use of duck typing
> (http://en.wikipedia.org/wiki/Duck_typing).

That would imply that I cannot create instances of a type, only of a
class that implements the type, wouldn't it? But Python denotes 'int'
as a type *and* I can instantiate it.

Still confused,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
Difference between type and class
Hello,

Can someone explain to me the difference between a type and a class?
After reading http://www.cafepy.com/article/python_types_and_objects/
it seems to me that classes and types are actually the same thing:

 - both are instances of a metaclass, and the same metaclass ('type')
   can instantiate both classes and types.
 - both can be instantiated and yield an "ordinary" object
 - I can even inherit from a type and get a class

So why does Python distinguish between e.g. the type 'int' and the
class 'myclass'? Why can't I say that 'int' is a class and 'myclass'
is a type?

I hope I have managed to get across the point of my confusion...

Thanks in advance,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
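For later readers: the three observations above can be checked directly. The sketch below uses current Python 3, where the repr difference that prompted this thread has disappeared entirely and builtins and user classes print the same way.

```python
# In Python 3, builtin types and user-defined classes are the same
# kind of thing: both are instances of 'type'.
class MyClass:
    pass

print(type(int))        # <class 'type'>
print(type(MyClass))    # <class 'type'>
print(repr(int))        # <class 'int'>
print(repr(MyClass))    # e.g. <class '__main__.MyClass'>

# Both can be instantiated, and one can inherit from a builtin type:
class MyInt(int):
    pass

print(MyInt(41) + 1)    # 42
```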
Re: [unittest] Run setUp only once
Jean-Paul Calderone <[EMAIL PROTECTED]> writes:

> On Tue, 29 Jul 2008 19:26:09 +0200, Nikolaus Rath <[EMAIL PROTECTED]> wrote:
>> Jean-Paul Calderone <[EMAIL PROTECTED]> writes:
>>> On Tue, 29 Jul 2008 16:35:55 +0200, Nikolaus Rath <[EMAIL PROTECTED]> wrote:
>>>> Hello,
>>>>
>>>> I have a number of conceptually separate tests that nevertheless
>>>> need a common, complicated and expensive setup.
>>>>
>>>> Unfortunately, unittest runs the setUp method once for each
>>>> defined test, even if they're part of the same class as in
>>>>
>>>>     class TwoTests(unittest.TestCase):
>>>>         def setUp(self):
>>>>             # do something very time consuming
>>>>
>>>>         def testOneThing(self):
>>>>             ...
>>>>
>>>>         def testADifferentThing(self):
>>>>             ...
>>>>
>>>> which would call setUp twice.
>>>>
>>>> Is there any way to avoid this, without packing all the unrelated
>>>> tests into one big function?
>>>
>>>     class TwoTests(unittest.TestCase):
>>>         setUpResult = None
>>>
>>>         def setUp(self):
>>>             if self.setUpResult is None:
>>>                 self.setUpResult = computeIt()
>>>
>>>         ...
>>>
>>> There are plenty of variations on this pattern.
>>
>> But at least this variation doesn't work, because unittest
>> apparently also creates two separate TwoTests instances for the two
>> tests. Isn't there some way to convince unittest to reuse the same
>> instance instead of trying to solve the problem in the test code
>> itself?
>
> Eh sorry, you're right, the above is broken. `setUpResult` should be
> a class attribute instead of an instance attribute.

Yeah, well, I guess that would work. But to me this looks really more
like a nasty hack.. isn't there a proper solution?

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
Re: [unittest] Run setUp only once
Jean-Paul Calderone <[EMAIL PROTECTED]> writes:

> On Tue, 29 Jul 2008 16:35:55 +0200, Nikolaus Rath <[EMAIL PROTECTED]> wrote:
>> Hello,
>>
>> I have a number of conceptually separate tests that nevertheless
>> need a common, complicated and expensive setup.
>>
>> Unfortunately, unittest runs the setUp method once for each defined
>> test, even if they're part of the same class as in
>>
>>     class TwoTests(unittest.TestCase):
>>         def setUp(self):
>>             # do something very time consuming
>>
>>         def testOneThing(self):
>>             ...
>>
>>         def testADifferentThing(self):
>>             ...
>>
>> which would call setUp twice.
>>
>> Is there any way to avoid this, without packing all the unrelated
>> tests into one big function?
>
>     class TwoTests(unittest.TestCase):
>         setUpResult = None
>
>         def setUp(self):
>             if self.setUpResult is None:
>                 self.setUpResult = computeIt()
>
>         ...
>
> There are plenty of variations on this pattern.

But at least this variation doesn't work, because unittest apparently
also creates two separate TwoTests instances for the two tests. Isn't
there some way to convince unittest to reuse the same instance instead
of trying to solve the problem in the test code itself?

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
[unittest] Run setUp only once
Hello,

I have a number of conceptually separate tests that nevertheless need
a common, complicated and expensive setup.

Unfortunately, unittest runs the setUp method once for each defined
test, even if they're part of the same class as in

    class TwoTests(unittest.TestCase):
        def setUp(self):
            # do something very time consuming

        def testOneThing(self):
            ...

        def testADifferentThing(self):
            ...

which would call setUp twice.

Is there any way to avoid this, without packing all the unrelated
tests into one big function?

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
Re: Attack a sacred Python Cow
Bruno Desthuilliers <[EMAIL PROTECTED]> writes:

> Nikolaus Rath a écrit :
>> Michael Torrie <[EMAIL PROTECTED]> writes:
>
> (snip)
>
>>> In short, unlike what most of the implicit self advocates are
>>> saying, it's not just a simple change to the python parser to do
>>> this. It would require a change in the interpreter itself and how
>>> it deals with classes.
>>
>> That's true. But out of curiosity: why is changing the interpreter
>> such a bad thing? (If we suppose for now that the change itself is a
>> good idea).
>
> Because it would very seriously break a *lot* of code ?

Well, Python 3 will break lots of code anyway, won't it?

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
Re: Protecting instance variables
Hi,

Sorry for replying so late. Your MUA apparently messes up the
References: header, so I saw your reply only now and by coincidence.

"Diez B. Roggisch" <[EMAIL PROTECTED]> writes:

> Nikolaus Rath schrieb:
>> Hello,
>>
>> I am really surprised that I am asking this question on the mailing
>> list, but I really couldn't find it on python.org/doc.
>>
>> Why is there no proper way to protect an instance variable from
>> access in derived classes?
>>
>> I can perfectly understand the philosophy behind not protecting
>> them from access in external code ("protection by convention"), but
>> isn't it a major design flaw that when designing a derived class I
>> first have to study the base class's source code? Otherwise I may
>> always accidentally overwrite an instance variable used by the base
>> class...
>
> Here we go again...
>
> http://groups.google.com/group/comp.lang.python/browse_thread/thread/188467d724b48b32/
>
> To directly answer your question: that's what the __ (double
> underscore) name mangling is for.

I understand that it is desirable not to completely hide instance
variables. But it seems silly to me that I should generally prefix
almost all my instance variables with two underscores.

I am not so much concerned about data hiding, but about not
accidentally overwriting a variable of the class I'm inheriting from.
And, unless I misunderstood something, this is only possible if I'm
prefixing them with __.

How is this problem solved in practice? I probably don't have a
representative sample, but in the libraries that I have been using so
far, there were a lot of undocumented (in the sense of: not being part
of the public API) instance variables not prefixed with __. I have
therefore started to first grep the source of all base classes
whenever I introduce a new variable in my derived class. Is that
really the way it's supposed to be? What if one of the base classes
introduces a new variable at a later point?

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
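The double-underscore mangling referred to here is easy to see in action; a short sketch in Python 3 syntax:

```python
# Attributes spelled __x inside a class body are rewritten to
# _ClassName__x, so a subclass using the same name cannot
# accidentally overwrite the base class's attribute.
class Base:
    def __init__(self):
        self.__state = "base"        # stored as _Base__state

    def base_state(self):
        return self.__state          # reads _Base__state

class Derived(Base):
    def __init__(self):
        super().__init__()
        self.__state = "derived"     # stored as _Derived__state

d = Derived()
print(d.base_state())                # base -- not clobbered
print(sorted(vars(d)))               # ['_Base__state', '_Derived__state']
```

Without the underscores, both assignments would hit the same attribute and the base class's value would be silently overwritten, which is exactly the accident described above.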
Re: os.symlink()
"Diez B. Roggisch" <[EMAIL PROTECTED]> writes:

> Nikolaus Rath wrote:
>> Hello,
>>
>> From `pydoc os`:
>>
>>     symlink(...)
>>         symlink(src, dst)
>>
>>         Create a symbolic link pointing to src named dst.
>>
>> Is there any reason why this is so deliberately confusing? Why is
>> the target of the symlink, the thing where it points *to*, called
>> the `src`? It seems to me that the names of the parameters should
>> be reversed.
>
> I used the command the other day, and didn't feel the slightest
> confusion.
>
> To me, the process of creating a symlink is like a "virtual copy".
> Which the above parameter names reflect perfectly.

Is this interpretation really widespread? I couldn't find any other
sources using it. On the other hand, from ln(1):

,----
| SYNOPSIS
|     ln [OPTION]... [-T] TARGET LINK_NAME   (1st form)
|
| DESCRIPTION
|     In the 1st form, create a link to TARGET with the name
|     LINK_NAME.
`----

From Wikipedia:

,----
| A symbolic link merely contains a text string that is interpreted
| and followed by the operating system as a path to another file or
| directory. It is a file on its own and can exist independently of
| its target. If a symbolic link is deleted, its target remains
| unaffected. If the target is moved, renamed or deleted, any
| symbolic link that used to point to it continues to exist but now
| points to a non-existing file.
`----

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
os.symlink()
Hello,

From `pydoc os`:

    symlink(...)
        symlink(src, dst)

        Create a symbolic link pointing to src named dst.

Is there any reason why this is so deliberately confusing? Why is the
target of the symlink, the thing where it points *to*, called the
`src`? It seems to me that the names of the parameters should be
reversed.

Puzzled,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
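A quick way to keep the argument order straight is to try it. The sketch below (POSIX assumed; creating symlinks on Windows may require extra privileges) creates a link and reads it back:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "target.txt")
    link = os.path.join(d, "link.txt")
    with open(target, "w") as f:
        f.write("hello")

    # os.symlink(src, dst): dst is the link that gets created,
    # src is the path it points to.
    os.symlink(target, link)

    points_to = os.readlink(link)
    with open(link) as f:
        contents = f.read()

print(points_to == target, contents)  # True hello
```

So whatever one thinks of the naming, the second argument is always the new name on disk and the first is the (possibly not even existing) path stored inside the link.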
Re: Attack a sacred Python Cow
Michael Torrie <[EMAIL PROTECTED]> writes:

> I think the biggest reason why an implicit self is bad is because it
> prevents monkey-patching of existing class objects. Right now I can
> add a new method to any existing class just with a simple attribute
> like so (adding a new function to an existing instance object isn't
> so simple, but ah well):
>
>     def a(self, x, y):
>         self.x = x
>         self.y = y
>
>     class Test(object):
>         pass
>
>     Test.setxy = a
>
>     b = Test()
>     b.setxy(4,4)
>     print b.x, b.y
>
> If self was implicit, none of this would work.

No, but it could work like this:

    def a(x, y):
        self.x = x
        self.y = y

    class Test(object):
        pass

    Test.setxy = a

    b = Test()  # Still all the same until here

    # Since setxy is called as an instance method, it automatically
    # gets a 'self' variable and everything works nicely
    b.setxy(4,4)

    # This throws an exception, since self is undefined
    a(4,4)

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
Re: Attack a sacred Python Cow
"Russ P." <[EMAIL PROTECTED]> writes:

> The issue here has nothing to do with the inner workings of the
> Python interpreter. The issue is whether an arbitrary name such as
> "self" needs to be supplied by the programmer.
>
> All I am suggesting is that the programmer have the option of
> replacing "self.member" with simply ".member", since the word "self"
> is arbitrary and unnecessary. Otherwise, everything would work
> *EXACTLY* the same as it does now. This would be a shallow
> syntactical change with no effect on the inner workings of Python,
> but it could significantly unclutter code in many instances.
>
> The fact that you seem to think it would change the inner
> functioning of Python just shows that you don't understand the
> proposal.

So how would you translate this into a Python with implicit self, but
without changing the procedure for method resolution?

    def will_be_a_method(self, a):
        # Do something with self and a
        ...

    class A:
        pass

    a = A()
    a.method = will_be_a_method

It won't work unless you change the interpreter to magically insert a
'self' variable into the scope of a function when it is called as a
method. I'm not saying that that's a bad thing, but it certainly
requires some changes to Python's internals.

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
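In current Python the example above behaves exactly as described; a runnable sketch (the function and class names follow the post, the rest is illustrative):

```python
import types

def will_be_a_method(self, a):
    self.value = a

class A:
    pass

# Assigning to the *class*: attribute lookup on the instance goes
# through the descriptor protocol, so 'self' is bound automatically.
A.method = will_be_a_method
a = A()
a.method(42)
print(a.value)            # 42

# Assigning to the *instance* stores a plain function -- no binding
# happens, which is exactly the case the post describes.
a.method2 = will_be_a_method
try:
    a.method2(42)         # self=42, parameter 'a' missing
except TypeError:
    print("not bound")

# types.MethodType performs the binding explicitly:
a.method3 = types.MethodType(will_be_a_method, a)
a.method3(7)
print(a.value)            # 7
```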
Re: Attack a sacred Python Cow
Bruno Desthuilliers <[EMAIL PROTECTED]> writes:

> The fact that a function is defined within a class statement doesn't
> imply any "magic", it just creates a function object, binds it to a
> name, and makes that object an attribute of the class. You get the
> very same result by defining the function outside the class
> statement and binding it within the class statement, by defining the
> function outside the class and binding it to the class outside the
> class statement, by binding the name to a lambda within the class
> statement etc...

But why can't the current procedure to resolve method calls be changed
to automatically define a 'self' variable in the scope of the called
function, instead of binding its first argument?

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
Re: Attack a sacred Python Cow
Michael Torrie <[EMAIL PROTECTED]> writes:

> Colin J. Williams wrote:
>>> def fun( ., cat):
>>
>> I don't see the need for the comma in fun.
>
> It (the entire first variable!) is needed because a method object is
> constructed from a normal function object:
>
>     def method(self,a,b):
>         pass
>
>     class MyClass(object):
>         pass
>
>     MyClass.testmethod=method
>
> That's precisely the same as if you'd defined method inside of the
> class to begin with. A function becomes a method when the lookup
> procedure in the instance object looks up the attribute and returns
> (from what I understand) essentially a closure that binds the
> instance to the first variable of the function. The result is known
> as a bound method, which is a callable object:
>
>     instance=MyClass()
>     instance.testmethod
>     <bound method MyClass.method of <__main__.MyClass object at ...>>
>
> How would this work if there was no first parameter at all?
>
> In short, unlike what most of the implicit self advocates are
> saying, it's not just a simple change to the python parser to do
> this. It would require a change in the interpreter itself and how it
> deals with classes.

That's true. But out of curiosity: why is changing the interpreter
such a bad thing? (If we suppose for now that the change itself is a
good idea).

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
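The "closure that binds the instance" described above has a concrete name: it is produced by the function's __get__ method, i.e. the descriptor protocol. A short sketch reusing the thread's names:

```python
def method(self, a, b):
    return (self, a + b)

class MyClass:
    pass

MyClass.testmethod = method
instance = MyClass()

bound = instance.testmethod                   # attribute lookup binds self
explicit = method.__get__(instance, MyClass)  # the same thing, spelled out

print(bound.__self__ is instance)             # True
print(bound(1, 2) == explicit(1, 2))          # True
```

So attribute lookup on the instance and an explicit `__get__` call produce equivalent bound methods, which is why the first parameter has to exist on the underlying function.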
Re: Attack a sacred Python Cow
castironpi <[EMAIL PROTECTED]> writes:

>> I think you misunderstood him. What he wants is to write
>>
>>     class foo:
>>         def bar(arg):
>>             self.whatever = arg + 1
>>
>> instead of
>>
>>     class foo:
>>         def bar(self, arg):
>>             self.whatever = arg + 1
>>
>> so 'self' should *automatically* only be inserted in the function
>> declaration, and *manually* be typed for attributes.
>
> There's a further advantage:
>
>     class A:
>         def get_auxclass( self, b, c ):
>             class B:
>                 def auxmeth( self2, d, e ):
>                     #here, ...
>             return B

In auxmeth, self would refer to the B instance. In get_auxclass, it
would refer to the A instance. If you wanted to access the A instance
in auxmeth, you'd have to use

    class A:
        def get_auxclass(b, c):
            a_inst = self
            class B:
                def auxmeth(d, e):
                    self    # the B instance
                    a_inst  # the A instance
            return B

This seems pretty natural to me (innermost scope takes precedence),
and AFAIR this is also how it is done in Java.

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
Re: Attack a sacred Python Cow
Terry Reedy <[EMAIL PROTECTED]> writes:

>> What he wants is to write
>>
>>     class foo:
>>         def bar(arg):
>>             self.whatever = arg + 1
>>
>> instead of
>>
>>     class foo:
>>         def bar(self, arg):
>>             self.whatever = arg + 1
>>
>> so 'self' should *automatically* only be inserted in the function
>> declaration, and *manually* be typed for attributes.
>
> which means making 'self' a keyword just so it can be omitted. Silly
> and pernicious.

Well, I guess that's more a matter of personal preference. I would go
for it immediately (and also try to rename it to '@' at the same
time).

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
Re: Attack a sacred Python Cow
Terry Reedy <[EMAIL PROTECTED]> writes:

> Nikolaus Rath wrote:
>> Terry Reedy <[EMAIL PROTECTED]> writes:
>>> Torsten Bronger wrote:
>>>> Hallöchen!
>>>> And why does this make the implicit insertion of "self" difficult?
>>>> I could easily write a preprocessor which does it after all.
>>>
>>> class C():
>>>     def f():
>>>         a = 3
>>>
>>> Inserting self into the arg list is trivial. Mindlessly deciding
>>> correctly whether or not to insert 'self.' before 'a' is impossible
>>> when 'a' could ambiguously be either an attribute of self or a
>>> local variable of f. Or do you and/or Jordan plan to abolish local
>>> variables for methods?
>>
>> Why do you think that 'self' should be inserted anywhere except in
>> the arg list? AFAIU, the idea is to remove the need to write 'self'
>> in the arg list, not to get rid of it entirely.
>
> Because you must prefix self attributes with 'self.'. If you do not
> use any attributes of the instance of the class you are making the
> function an instance method of, then it is not really an instance
> method and need not and I would say should not be masqueraded as
> one. If the function is a static method, then it should be labeled
> as one and no 'self' is needed and auto insertion would be a
> mistake. In brief, I assume the OP wants 'self' inserted in the body
> because inserting it only in the parameter list and never using it
> in the body is either silly or wrong.

I think you misunderstood him. What he wants is to write

    class foo:
        def bar(arg):
            self.whatever = arg + 1

instead of

    class foo:
        def bar(self, arg):
            self.whatever = arg + 1

so 'self' should *automatically* only be inserted in the function
declaration, and *manually* be typed for attributes.

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
Re: Attack a sacred Python Cow
Terry Reedy <[EMAIL PROTECTED]> writes:

> Torsten Bronger wrote:
>> Hallöchen!
>> And why does this make the implicit insertion of "self" difficult?
>> I could easily write a preprocessor which does it after all.
>
> class C():
>     def f():
>         a = 3
>
> Inserting self into the arg list is trivial. Mindlessly deciding
> correctly whether or not to insert 'self.' before 'a' is impossible
> when 'a' could ambiguously be either an attribute of self or a local
> variable of f. Or do you and/or Jordan plan to abolish local
> variables for methods?

Why do you think that 'self' should be inserted anywhere except in the
arg list? AFAIU, the idea is to remove the need to write 'self' in the
arg list, not to get rid of it entirely.

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list
Protecting instance variables
Hello,

I am really surprised that I am asking this question on the mailing
list, but I really couldn't find it on python.org/doc.

Why is there no proper way to protect an instance variable from access
in derived classes?

I can perfectly understand the philosophy behind not protecting them
from access in external code ("protection by convention"), but isn't
it a major design flaw that when designing a derived class I first
have to study the base class's source code? Otherwise I may always
accidentally overwrite an instance variable used by the base class...

Best,
-Nikolaus

--
»It is not worth an intelligent man's time to be in the majority. By
definition, there are already enough people to do that.« -J.H. Hardy

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list