ssl.SSLError: [SSL: BAD_WRITE_RETRY] bad write retry (_ssl.c:1636)

2014-09-21 Thread Nikolaus Rath
Hello,

Can someone help me understand what this exception means?

[...]
  File 
"/usr/local/lib/python3.4/dist-packages/dugong-3.2-py3.4.egg/dugong/__init__.py",
 line 584, in _co_send
len_ = self._sock.send(buf)
  File "/usr/lib/python3.4/ssl.py", line 679, in send
v = self._sslobj.write(data)
ssl.SSLError: [SSL: BAD_WRITE_RETRY] bad write retry (_ssl.c:1636)

Presumably this is generated by OpenSSL, but how do I figure out what it
means? The best I found in the OpenSSL documentation is
https://www.openssl.org/docs/crypto/err.html, and Google only brought me
to https://stackoverflow.com/questions/2997218.
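
My current best guess (not verified against dugong): OpenSSL apparently
requires that a write which previously failed with WANT_WRITE be retried
with the very same buffer, and reports BAD_WRITE_RETRY if the retry passes
a different one. A minimal sketch of the send loop I believe is expected
(function and variable names made up for illustration, non-blocking socket
assumed):

import ssl

def send_all(ssl_sock, data):
    """Illustrative only: retry a failed SSL write with the *same* buffer."""
    pending = data
    while pending:
        try:
            sent = ssl_sock.send(pending)
        except ssl.SSLWantWriteError:
            continue                  # retry later with the same 'pending'
        pending = pending[sent:]      # advance only after a successful write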


Best,
-Nikolaus
-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Missing stack frames?

2014-06-06 Thread Nikolaus Rath
Chris Angelico  writes:
> On Fri, Jun 6, 2014 at 12:16 PM, Nikolaus Rath  wrote:
>>  - Is there some way to make the call stack for destructors less confusing?
>
> First off, either don't have refloops, or explicitly break them.

The actual code isn't as simple as the example. I wasn't even aware
there were any refloops. Guess I have to start hunting them down now.

> That'll at least make things a bit more predictable; in CPython,
> you'll generally see destructors called as soon as something goes out
> of scope. Secondly, avoid threads when you're debugging a problem! I
> said this right at the beginning. If you run into problems, the first
> thing to do is isolate the system down to a single thread. Debugging
> is way easier without threads getting in each other's ways.

I don't see how this would have made a difference in this case. I still
would have gotten lots of apparently non-sensical backtraces. Only this
time they would all come from MainThread.


Best,
-Nikolaus
-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Missing stack frames?

2014-06-05 Thread Nikolaus Rath
dieter  writes:
[...]
> Someone else already mentioned that the "close" call
> can come from a destructor. Destructors can easily be called
> at not obvious places (e.g. "s = C(); ... x = [s for s in ...]";
> in this example the list comprehension calls the "C" destructor
> which is not obvious when one looks only locally).
> The destructor calls often have intervening C code (which
> one does not see). However, in your case, I do not see
> why the "cgi" module should cause a destructor call of one
> of your server components.

Paul, dieter, you are my heroes. It was indeed an issue with a
destructor. It turns out that the io.RawIOBase destructor calls
self.close(). If the instance of a derived class is part of a reference
cycle, the destructor (and thus close()) gets called on the next routine
run of the garbage collector -- with the stack trace originating at
whatever statement was last executed before the gc run.

The following minimal example reproduces the problem:

#!/usr/bin/env python3
import io
import traceback
import threading

class Container:
    pass

class InnocentVictim(io.RawIOBase):
    def close(self):
        print('close called in %s by:'
              % threading.current_thread().name)
        traceback.print_stack()

def busywork():
    numbers = []
    for i in range(500):
        o = Container()
        o.l = numbers
        numbers.append(o)

        if i % 87 == 0:
            numbers = []

l = [ InnocentVictim() ]
l[0].cycle = l
del l

t = threading.Thread(target=busywork)
t.start()
t.join()


If you run this, you get output like:

close called in Thread-1 by:
  File "/usr/lib/python3.4/threading.py", line 888, in _bootstrap
self._bootstrap_inner()
  File "/usr/lib/python3.4/threading.py", line 920, in _bootstrap_inner
self.run()
  File "/usr/lib/python3.4/threading.py", line 868, in run
self._target(*self._args, **self._kwargs)
  File "./test.py", line 18, in busywork
o = Container()
  File "./test.py", line 13, in close
traceback.print_stack()


I.e., a method gets called by a thread that doesn't even have access to the
object, and without any reference to the call in the source.


I am left wondering:

 - Is there really a point in the RawIOBase destructor calling close?
 - Is there some way to make the call stack for destructors less confusing?


Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Missing stack frames?

2014-06-05 Thread Nikolaus Rath
Nikolaus Rath  writes:
> Chris Angelico  writes:
>> On Wed, Jun 4, 2014 at 12:30 PM, Nikolaus Rath  wrote:
>>> I've instrumented one of my unit tests with a conditional
>>> 'pdb.set_trace' in some circumstances (specifically, when a function is
>>> called by a thread other than MainThread).
>>
>> I think the likelihood of this being an issue with interactive
>> debugging and threads is sufficiently high that you should avoid
>> putting the two together, at least until you can verify that the same
>> problem occurs without that combination.
>
> Here's a stack trace as obtained by traceback.print_stack():
>
> tests/t1_backends.py:563: test_extra_data[mock_s3c-zlib] PASSED
> tests/t1_backends.py:563: test_extra_data[mock_s3c-bzip2] PASSED
>
>  87 tests deselected by '-kextra' 
> =
> === 5 passed, 1 skipped, 87 deselected in 0.65 seconds 
> 
>   File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 
> 853, in close
> self.fh.close()
>   File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 691, 
> in close
> traceback.print_stack(file=sys.stdout)
> something is wrong
>   File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 
> 853, in close
> self.fh.close()
>   File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 691, 
> in close
> traceback.print_stack(file=sys.stdout)
> something is wrong
>   File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 
> 1050, in close
> self.fh.close()
>   File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 691, 
> in close
> traceback.print_stack(file=sys.stdout)
>
> Still no context before the ominous close() call. I'm very confused.

It gets even funnier if I just repeat this exercise (without changing the
code). Here are some more "tracebacks" that I got:

  File "/usr/bin/py.test-3", line 5, in 
sys.exit(load_entry_point('pytest==2.5.2', 'console_scripts', 'py.test')())
 [...] 
  File "/usr/lib/python3/dist-packages/py/_code/code.py", line 524, in 
repr_traceback_entry
source = self._getentrysource(entry)
  File "/usr/lib/python3/dist-packages/py/_code/code.py", line 450, in 
_getentrysource
source = entry.getsource(self.astcache)
  File "/usr/lib/python3/dist-packages/py/_code/code.py", line 199, in getsource
astnode=astnode)
  File "/usr/lib/python3/dist-packages/py/_code/source.py", line 367, in 
getstatementrange_ast
astnode = compile(content, "source", "exec", 1024)  # 1024 for AST
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 
1050, in close
self.fh.close_real()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 702, in 
close_real
traceback.print_stack(file=sys.stdout)

or also

  File "/usr/bin/py.test-3", line 5, in 
sys.exit(load_entry_point('pytest==2.5.2', 'console_scripts', 'py.test')())
[...]
  File "/usr/lib/python3.4/logging/__init__.py", line 1474, in callHandlers
hdlr.handle(record)
  File "/usr/lib/python3.4/logging/__init__.py", line 842, in handle
rv = self.filter(record)
  File "/usr/lib/python3.4/logging/__init__.py", line 699, in filter
for f in self.filters:
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 853, 
in close
self.fh.close()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 
1050, in close
self.fh.close_real()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 702, in 
close_real
traceback.print_stack(file=sys.stdout)

and this one actually looks the way I would expect:

[...]
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 335, 
in fetch
return self.perform_read(do_read, key)
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 58, 
in wrapped
return method(*a, **kw)
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 309, 
in perform_read
return fn(fh)
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 859, 
in __exit__
self.close()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 853, 
in close
self.fh.close()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 
1050, in close
self.fh.close_real()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 702, in 
close_real
traceback.print_stack(file=sys.stdout)


I am not using any C extension modules, but I guess I nevertheless have
to assume that something has seriously messed up either the stack or the
traceback printing routines?


Best,
Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Missing stack frames?

2014-06-05 Thread Nikolaus Rath
Paul Rubin  writes:
> Nikolaus Rath  writes:
>> Still no context before the ominous close() call. I'm very confused.
>
> close() could be getting called from a destructor as the top level
> function of a thread exits, or something like that.

Shouldn't the destructor have its own stack frame then, i.e. shouldn't
the first frame be in a __del__ function?

Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Missing stack frames?

2014-06-04 Thread Nikolaus Rath
Chris Angelico  writes:
> On Wed, Jun 4, 2014 at 12:30 PM, Nikolaus Rath  wrote:
>> I've instrumented one of my unit tests with a conditional
>> 'pdb.set_trace' in some circumstances (specifically, when a function is
>> called by a thread other than MainThread).
>
> I think the likelihood of this being an issue with interactive
> debugging and threads is sufficiently high that you should avoid
> putting the two together, at least until you can verify that the same
> problem occurs without that combination.

Here's a stack trace as obtained by traceback.print_stack():

tests/t1_backends.py:563: test_extra_data[mock_s3c-zlib] PASSED
tests/t1_backends.py:563: test_extra_data[mock_s3c-bzip2] PASSED

 87 tests deselected by '-kextra' 
=
=== 5 passed, 1 skipped, 87 deselected in 0.65 seconds 

  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 853, 
in close
self.fh.close()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 691, in 
close
traceback.print_stack(file=sys.stdout)
something is wrong
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 853, 
in close
self.fh.close()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 691, in 
close
traceback.print_stack(file=sys.stdout)
something is wrong
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 
1050, in close
self.fh.close()
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 691, in 
close
traceback.print_stack(file=sys.stdout)


Still no context before the ominous close() call. I'm very confused.


Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Corrupted stacktrace?

2014-06-04 Thread Nikolaus Rath
Chris Angelico  writes:
> On Wed, Jun 4, 2014 at 12:20 PM, Nikolaus Rath  wrote:
>>   File "/usr/lib/python3.3/threading.py", line 878 in _bootstrap
>
> Can you replicate the problem in a non-threaded environment? Threads
> make interactive debugging very hairy.

Hmm. I could try to run the server thread in a separate process. I'll
try that and report back.


Thanks for the suggestion,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Missing stack frames?

2014-06-04 Thread Nikolaus Rath
Chris Angelico  writes:
> On Wed, Jun 4, 2014 at 12:30 PM, Nikolaus Rath  wrote:
>> I've instrumented one of my unit tests with a conditional
>> 'pdb.set_trace' in some circumstances (specifically, when a function is
>> called by a thread other than MainThread).
>
> I think the likelihood of this being an issue with interactive
> debugging and threads is sufficiently high that you should avoid
> putting the two together, at least until you can verify that the same
> problem occurs without that combination.

Is there a way to produce a stacktrace without using the interactive
debugger?


Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«
-- 
https://mail.python.org/mailman/listinfo/python-list


Missing stack frames?

2014-06-03 Thread Nikolaus Rath
Hello,

(This may or may not be related to my mail about a "corrupted stack
trace").

I've instrumented one of my unit tests with a conditional
'pdb.set_trace' in some circumstances (specifically, when a function is
called by a thread other than MainThread). However, when trying to print
a backtrace to figure out how this function got called, I get this:

$ py.test-3 tests/t1_backends.py -k extra
=== test session starts 
===
platform linux -- Python 3.3.5 -- py-1.4.20 -- pytest-2.5.2 -- /usr/bin/python3
collected 33 items 

tests/t1_backends.py:563: test_extra_data[mock_s3c-plain] SKIPPED
tests/t1_backends.py:563: test_extra_data[mock_s3c-zlib] PASSED

 31 tests deselected by '-kextra' 
=
=== 1 passed, 1 skipped, 31 deselected in 0.23 seconds 

> /home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py(693)close()
-> if not self.md5_checked:
(Pdb) bt
  /home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py(853)close()
-> self.fh.close()
> /home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py(693)close()
-> if not self.md5_checked:
(Pdb) q

What does this mean? Why is there no caller above the backends/common.py code?
At the very least, I would have expected some frames from threading.py...?


Best,
-Nikolaus
-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«
-- 
https://mail.python.org/mailman/listinfo/python-list


Corrupted stacktrace?

2014-06-03 Thread Nikolaus Rath
Hello,

I'm trying to debug a problem. As far as I can tell, one of my methods
is called at a point where it really should not be called. When setting
a breakpoint in the function, I'm getting this:

> /home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py(693)close()
-> if not self.md5_checked:
(Pdb) bt
  /usr/lib/python3.3/threading.py(878)_bootstrap()
-> self._bootstrap_inner()
  /usr/lib/python3.3/threading.py(901)_bootstrap_inner()
-> self.run()
  /usr/lib/python3.3/threading.py(858)run()
-> self._target(*self._args, **self._kwargs)
  /usr/lib/python3.3/socketserver.py(610)process_request_thread()
-> self.finish_request(request, client_address)
  /usr/lib/python3.3/socketserver.py(345)finish_request()
-> self.RequestHandlerClass(request, client_address, self)
  /usr/lib/python3.3/socketserver.py(666)__init__()
-> self.handle()
  /home/nikratio/in-progress/s3ql/tests/mock_server.py(77)handle()
-> return super().handle()
  /usr/lib/python3.3/http/server.py(402)handle()
-> self.handle_one_request()
  /usr/lib/python3.3/http/server.py(388)handle_one_request()
-> method()
  /home/nikratio/in-progress/s3ql/tests/mock_server.py(169)do_GET()
-> q = parse_url(self.path)
  /home/nikratio/in-progress/s3ql/tests/mock_server.py(52)parse_url()
-> p.params = urllib.parse.parse_qs(q.query)
  /usr/lib/python3.3/urllib/parse.py(553)parse_qs()
-> encoding=encoding, errors=errors)
  /usr/lib/python3.3/urllib/parse.py(585)parse_qsl()
-> pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
  /usr/lib/python3.3/urllib/parse.py(585)<listcomp>()
-> pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
  /home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py(853)close()
-> self.fh.close()
> /home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py(693)close()
-> if not self.md5_checked:


To me this does not make any sense.

Firstly, the thread that is (apparently) calling close should never ever
reach code in common.py. This thread is executing a socketserver handler
that is entirely contained in mock_server.py and only communicates with
the rest of the program via tcp.

Secondly, the backtrace does not make sense. How can evaluation of 

 pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]

in urllib/parse.py result in a method call in backends/common.py?
There is no trickery going on, qs is a regular string:

(Pdb) up
(Pdb) up
(Pdb) up
(Pdb) l
580         into Unicode characters, as accepted by the bytes.decode() method.
581 
582 Returns a list, as G-d intended.
583 """
584 qs, _coerce_result = _coerce_args(qs)
585  -> pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
586 r = []
587 for name_value in pairs:
588 if not name_value and not strict_parsing:
589 continue
590 nv = name_value.split('=', 1)
(Pdb) whatis qs
<class 'str'>
(Pdb) p qs
''
(Pdb)

I have also tried to get a backtrace with the faulthandler module, but
it gives the same result:

Thread 0x7f7dafdb4700:
  File "/usr/lib/python3.3/cmd.py", line 126 in cmdloop
  File "/usr/lib/python3.3/pdb.py", line 318 in _cmdloop
  File "/usr/lib/python3.3/pdb.py", line 345 in interaction
  File "/usr/lib/python3.3/pdb.py", line 266 in user_line
  File "/usr/lib/python3.3/bdb.py", line 65 in dispatch_line
  File "/usr/lib/python3.3/bdb.py", line 47 in trace_dispatch
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/s3c.py", line 693 in 
clos
  File "/home/nikratio/in-progress/s3ql/src/s3ql/backends/common.py", line 853 
in c
  File "/usr/lib/python3.3/urllib/parse.py", line 585 in 
  File "/usr/lib/python3.3/urllib/parse.py", line 585 in parse_qsl
  File "/usr/lib/python3.3/urllib/parse.py", line 553 in parse_qs
  File "/home/nikratio/in-progress/s3ql/tests/mock_server.py", line 52 in 
parse_url
  File "/home/nikratio/in-progress/s3ql/tests/mock_server.py", line 169 in 
do_GET
  File "/usr/lib/python3.3/http/server.py", line 388 in handle_one_request
  File "/usr/lib/python3.3/http/server.py", line 402 in handle
  File "/home/nikratio/in-progress/s3ql/tests/mock_server.py", line 77 in handle
  File "/usr/lib/python3.3/socketserver.py", line 666 in __init__
  File "/usr/lib/python3.3/socketserver.py", line 345 in finish_request
  File "/usr/lib/python3.3/socketserver.py", line 610 in process_request_thread
  File "/usr/lib/python3.3/threading.py", line 858 in run
  File "/usr/lib/python3.3/threading.py", line 901 in _bootstrap_inner
  File "/usr/lib/python3.3/threading.py", line 878 in _bootstrap


Is it possible that the stack got somehow corrupted?

Does anyone have a suggestion how I could go about debugging this?

I am using Python 3.3.

Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: select(sock) indicates not-ready, but sock.recv does not block

2014-02-17 Thread Nikolaus Rath
Nikolaus Rath  writes:
> Hello,
>
> I have a problem with using select. I can reliably reproduce a situation
> where select.select((sock.fileno(),), (), (), 0) returns ((),(),())
> (i.e., no data ready for reading), but an immediately following
> sock.recv() returns data without blocking.
[...]

Turns out that I fell into the well-known (except to me) ssl-socket
trap: http://docs.python.org/3/library/ssl.html#notes-on-non-blocking-sockets

TL;DR: Relying on select() on an SSL socket is not a good idea because
some internal buffering is done. Better put the socket in non-blocking
mode and try to read something, catching the ssl.Want* exception if
nothing is ready.
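
A minimal sketch of the pattern I mean (non-blocking SSL socket, names
shortened for illustration):

import ssl

def try_read(ssl_sock, bufsize=4096):
    """Attempt a read instead of trusting select() on an SSL socket."""
    try:
        return ssl_sock.recv(bufsize)
    except (ssl.SSLWantReadError, ssl.SSLWantWriteError):
        return None   # nothing decrypted yet; wait for the next select()

ssl_sock.pending() is also worth checking before calling select(): it
reports bytes that have already been received and decrypted but not yet
read from the SSL object.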

Best,
-Nikolaus

-- 
Encrypted emails preferred.
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

 »Time flies like an arrow, fruit flies like a Banana.«
-- 
https://mail.python.org/mailman/listinfo/python-list


select(sock) indicates not-ready, but sock.recv does not block

2014-02-16 Thread Nikolaus Rath
Hello,

I have a problem with using select. I can reliably reproduce a situation
where select.select((sock.fileno(),), (), (), 0) returns ((),(),())
(i.e., no data ready for reading), but an immediately following
sock.recv() returns data without blocking.

I am pretty sure that this is not a race condition. The behavior is 100%
reproducible, the program is single threaded, and even waiting for 10
seconds before the select() call does not change the result.

I'm running Python 3.3.3 under Linux 3.12.

Has anyone an idea what might be going wrong here?

Thanks,
-Nikolaus

-- 
Encrypted emails preferred.
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

 »Time flies like an arrow, fruit flies like a Banana.«
-- 
https://mail.python.org/mailman/listinfo/python-list


Dynamically change __del__

2010-04-30 Thread Nikolaus Rath
Hi,

I'm trying to be very clever:

class tst(object):
    def destroy(self):
        print 'Cleaning up.'
        self.__del__ = lambda: None
    def __del__(self):
        raise RuntimeError('Instance destroyed without running destroy! Hell may break loose!')

However, it doesn't work:

In [2]: t = tst()

In [3]: t = None
Exception RuntimeError: RuntimeError('Instance destroyed without running
destroy! Hell may break loose!',) in <bound method tst.__del__ of <tst object at 0x...>> ignored

In [4]: t = tst()

In [5]: t.destroy()
Cleaning up.

In [6]: t = None
Exception RuntimeError: RuntimeError('Instance destroyed without running
destroy! Hell may break loose!',) in <bound method tst.__del__ of <tst object at 0x...>> ignored

$ python -V
Python 2.6.4


Apparently Python calls the class attribute __del__ rather than the
instance's __del__ attribute. Is that a bug or a feature? Is there any
way to implement the desired functionality without introducing an
additional destroy_has_been_called attribute?


(I know that invocation of __del__ is unreliable, this is just an
additional safeguard to increase the likelihood that bugs get noticed).
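
The only workaround I can think of so far is to switch the instance over
to a throw-away subclass, since special methods seem to be looked up on
the type rather than on the instance. An untested sketch (CPython):

def disarm_del(obj):
    """Give obj a per-instance class whose __del__ does nothing."""
    cls = type(obj)
    quiet = type('Destroyed' + cls.__name__, (cls,), {'__del__': lambda self: None})
    obj.__class__ = quiet

so that destroy() would call disarm_del(self) instead of assigning to
self.__del__.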



Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Profiling: Interpreting tottime

2010-04-08 Thread Nikolaus Rath
"Gabriel Genellina"  writes:
> En Wed, 07 Apr 2010 18:44:39 -0300, Nikolaus Rath 
> escribió:
>
>> def check_s3_refcounts():
>> """Check s3 object reference counts"""
>>
>> global found_errors
>> log.info('Checking S3 object reference counts...')
>>
>> for (key, refcount) in conn.query("SELECT id, refcount FROM
>> s3_objects"):
>>
>> refcount2 = conn.get_val("SELECT COUNT(inode) FROM blocks
>> WHERE s3key=?",
>>  (key,))
>> if refcount != refcount2:
>> log_error("S3 object %s has invalid refcount, setting
>> from %d to %d",
>>   key, refcount, refcount2)
>> found_errors = True
>> if refcount2 != 0:
>> conn.execute("UPDATE s3_objects SET refcount=? WHERE
>> id=?",
>>  (refcount2, key))
>> else:
>> # Orphaned object will be picked up by check_keylist
>> conn.execute('DELETE FROM s3_objects WHERE id=?', (key,))
>>
>> When I ran cProfile.Profile().runcall() on it, I got the following
>> result:
>>
>>ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>> 1 7639.962 7639.962 7640.269 7640.269
>> fsck.py:270(check_s3_refcounts)
>>
>> So according to the profiler, the entire 7639 seconds where spent
>> executing the function itself.
>>
>> How is this possible? I really don't see how the above function can
>> consume any CPU time without spending it in one of the called
>> sub-functions.
>
> Is the conn object implemented as a C extension?

No, it's pure Python.

> The profiler does not
> detect calls to C functions, I think.

Hmm. Isn't this a C function?

        26    2.317    0.089    2.317    0.089 {method 'execute' of 'apsw.Cursor' objects}

> You may be interested in this package by Robert Kern:
> http://pypi.python.org/pypi/line_profiler
> "Line-by-line profiler.
> line_profiler will profile the time individual lines of code take to
> execute."

That looks interesting nevertheless, thanks!
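
For the record, usage appears to be as simple as decorating the function of
interest and running the script under kernprof (a sketch based on the
package description; exact flags may differ between versions):

# fsck.py -- 'profile' is injected as a builtin when run under kernprof;
# fall back to a no-op decorator so the script still runs normally.
try:
    profile
except NameError:
    profile = lambda f: f

@profile
def check_s3_refcounts():
    ...

and then invoking something like "kernprof.py -l -v fsck.py" to get the
per-line timings.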


   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


Profiling: Interpreting tottime

2010-04-07 Thread Nikolaus Rath
Hello,

Consider the following function:

def check_s3_refcounts():
    """Check s3 object reference counts"""

    global found_errors
    log.info('Checking S3 object reference counts...')

    for (key, refcount) in conn.query("SELECT id, refcount FROM s3_objects"):

        refcount2 = conn.get_val("SELECT COUNT(inode) FROM blocks WHERE s3key=?",
                                 (key,))
        if refcount != refcount2:
            log_error("S3 object %s has invalid refcount, setting from %d to %d",
                      key, refcount, refcount2)
            found_errors = True
            if refcount2 != 0:
                conn.execute("UPDATE s3_objects SET refcount=? WHERE id=?",
                             (refcount2, key))
            else:
                # Orphaned object will be picked up by check_keylist
                conn.execute('DELETE FROM s3_objects WHERE id=?', (key,))

When I ran cProfile.Profile().runcall() on it, I got the following
result:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
1 7639.962 7639.962 7640.269 7640.269 fsck.py:270(check_s3_refcounts)

So according to the profiler, the entire 7639 seconds were spent
executing the function itself.

How is this possible? I really don't see how the above function can
consume any CPU time without spending it in one of the called
sub-functions.


Puzzled,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


Determine PyArg_ParseTuple parameters at runtime?

2009-12-08 Thread Nikolaus Rath
Hello,

I want to create an extension module that provides an interface to a
couple of C functions that take arguments of type struct iovec, struct
stat, struct flock, etc (the FUSE library, in case it matters).

Now the problem is that these structures contain attributes of type
fsid_t, off_t, dev_t etc. Since I am receiving the values for these
attributes from the Python caller, I have to convert them to the correct
type using PyArg_ParseTuple.

However, since the structs are defined in a system-dependent header
file, when writing the code I do not know if on the target system, e.g.
off_t will be of type long, long long or unsigned long, so I don't know
which format string to pass to PyArg_ParseTuple.


Are there any best practices for handling this kind of situation? I'm at
a loss right now.

The only thing that comes to my mind is, e.g., to compare
sizeof(off_t) to sizeof(long) and sizeof(long long) and thereby
determine the correct bit length at runtime. But this does not help me
to figure out if it is signed or unsigned, or (in case of other
attributes than off_t) if it is an integer at all.


Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: [distutils] Install script under a different name

2009-12-05 Thread Nikolaus Rath
Wolodja Wentland  writes:
> On Fri, Dec 04, 2009 at 19:34 -0500, Nikolaus Rath wrote:
>> All my Python files have extension .py. However, I would like to install
>> scripts that are meant to be called by the user without the suffix, i.e.
>> the file scripts/doit.py should end up as /usr/bin/doit.
>
>> Apparently the scripts= option of the setup() function does not support
>> this directly. Is there a clever way to get what I want?
>
> You can also use entry points to create the executable at install time.
> Have a look at [1] which explains how this is done. This requires using
> Distribute/setuptools though, ...
>
> [1] 
> http://packages.python.org/distribute/setuptools.html#automatic-script-creation

That looks perfect, thanks!
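
For the archives, a minimal sketch of what this looks like in setup.py
(project and module names are placeholders, not the real ones):

from setuptools import setup

setup(
    name='mytools',
    version='0.1',
    packages=['mytools'],
    entry_points={
        'console_scripts': [
            # installs an executable named 'doit' that calls mytools.doit:main()
            'doit = mytools.doit:main',
        ],
    },
)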


Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: [distutils] Install script under a different name

2009-12-05 Thread Nikolaus Rath
Lie Ryan  writes:
> On 12/5/2009 11:34 AM, Nikolaus Rath wrote:
>> Hello,
>>
>> All my Python files have extension .py. However, I would like to install
>> scripts that are meant to be called by the user without the suffix, i.e.
>> the file scripts/doit.py should end up as /usr/bin/doit.
>>
>> Apparently the scripts= option of the setup() function does not support
>> this directly. Is there a clever way to get what I want?
>
> if this is on windows, you should add ".py" to the PATHEXT environment
> variable.
>
> on linux/unix, you need to add the proper #! line to the top of any
> executable scripts and of course set the executable bit permission
> (chmod +x scriptname). In linux/unix there is no need to have the .py
> extension for a file to be recognized as python script (i.e. just
> remove it).

Sorry, but I think this is totally unrelated to my question. I want to
rename files during the setup process. This is not going to happen by
adding/changing any #! lines or the permissions of the file.

I know that there is no need to have the .py extension, that's why I
want to install the scripts without this suffix. But in my source
distribution I want to keep the suffix for various reasons.

Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


[distutils] Install script under a different name

2009-12-04 Thread Nikolaus Rath
Hello,

All my Python files have extension .py. However, I would like to install
scripts that are meant to be called by the user without the suffix, i.e.
the file scripts/doit.py should end up as /usr/bin/doit.

Apparently the scripts= option of the setup() function does not support
this directly. Is there a clever way to get what I want?


Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Monkeypatching an object to become callable

2009-08-11 Thread Nikolaus Rath
Bruno Desthuilliers  writes:
> 7stud a écrit :
> (snip)
>> class Wrapper(object):
>> def __init__(self, obj, func):
>> self.obj = obj
>> self.func = func
>>
>> def __call__(self, *args):
>> return self.func(*args)
>>
>> def __getattr__(self, name):
>> return object.__getattribute__(self.obj, name)
>
> This should be
>
>   return getattr(self.obj, name)
>
> directly calling object.__getattribute__ might skip redefinition of
> __getattribute__ in self.obj.__class__ or it's mro.

Works nicely, thanks. I came up with the following shorter version which
modifies the object in-place:

class Modifier(obj.__class__):
    def __call__(self):
        return fn()

obj.__class__ = Modifier


To me this seems a bit more elegant (less code, less underscores). Or
are there some cases where the above would fail?


Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


Profiling a Callback Function

2009-08-11 Thread Nikolaus Rath
Hello,

I am trying to profile a Python program that primarily calls a C
extension. From within the C extension, a callback Python function is
then called concurrently in several threads.

When I tried to profile this application with

  import cProfile
  import c_extension

  def callback_fn(args):
      # Do all sorts of complicated, time consuming stuff
      pass

  def main():
      c_extension.call_me_back(callback_fn, some_random_args)

  cProfile.run('main()', 'profile.dat')

I only got results for main(), but no information at all about
callback_fn.


What is the proper way to profile such an application? 


I already thought about this:

  import cProfile
  import c_extension

  def callback_fn(args):
      # Do all sorts of complicated, time consuming stuff
      pass

  def callback_wrapper(args):
      def doit():
          callback_fn(args)

      cProfile.run('doit', 'profile.dat')

  c_extension.call_me_back(callback_wrapper, some_random_args)


but that probably overwrites the profiling information whenever
callback_wrapper is called, instead of accumulating it over several
calls (with different arguments).
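
One way around that might be to keep a single Profile instance alive and
only dump the accumulated statistics at the end (a sketch; cProfile tracks
one thread at a time, so with concurrent callbacks a per-thread Profile
object would probably be needed):

  import cProfile
  import c_extension

  prof = cProfile.Profile()

  def callback_fn(args):
      # Do all sorts of complicated, time consuming stuff
      pass

  def callback_wrapper(args):
      # runcall() adds this call's timings to the same Profile object,
      # so repeated invocations accumulate instead of overwriting.
      return prof.runcall(callback_fn, args)

  c_extension.call_me_back(callback_wrapper, some_random_args)
  prof.dump_stats('profile.dat')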


Best,


   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C


-- 
http://mail.python.org/mailman/listinfo/python-list


Monkeypatching an object to become callable

2009-08-09 Thread Nikolaus Rath
Hi,

I want to monkeypatch an object so that it becomes callable, although
originally it is not meant to be. (Yes, I think I do have a good reason
to do so).

But simply adding a __call__ attribute to the object apparently isn't
enough, and I do not want to touch the class object (since it would
modify all the instances):

>>> class foo(object):
...   pass
... 
>>> t = foo()
>>> def test():
...   print 'bar'
... 
>>> t()
Traceback (most recent call last):
  File "", line 1, in 
TypeError: 'foo' object is not callable
>>> t.__call__ = test
>>> t()
Traceback (most recent call last):
  File "", line 1, in 
TypeError: 'foo' object is not callable
>>> t.__call__()
bar


Is there an additional trick to get it to work?


Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python docs disappointing

2009-07-31 Thread Nikolaus Rath
Carl Banks  writes:
>> This is one area in which Perl still whips Python...
>
> No way.  Perl's man pages are organized so poorly there is no
> ergonomic pit deep enough to offset them.  Quick, what man page is the
> "do" statement documented in?

Of course there is:

$ perldoc -f do | head
   do BLOCK
   Not really a function.  Returns the value of the last command
   in the sequence of commands indicated by BLOCK.  When modified
   by the "while" or "until" loop modifier, executes the BLOCK
   once before testing the loop condition. (On other statements
   the loop modifiers test the conditional first.)

   "do BLOCK" does not count as a loop, so the loop control
   statements "next", "last", or "redo" cannot be used to leave or
   restart the block.  See perlsyn for alternative strategies.
$ 


Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


Implementing a cache

2009-07-10 Thread Nikolaus Rath
Hello,

I want to implement a caching data structure in Python that allows me
to:

 1. Quickly look up objects using a key
 2. Keep track of the order in which the objects are accessed (most
recently and least recently accessed one, not a complete history)
 3. Quickly retrieve and remove the least recently accessed object.


Here's my idea for the implementation:

The objects in the cache are encapsulated in wrapper objects:

class OrderedDictElement(object):
    __slots__ = [ "next", "prev", "key", "value" ]

These wrapper objects are then kept in a linked lists and in an ordinary
dict (self.data) in parallel. Object access then works as follows:

def __setitem__(self, key, value):
    if key in self.data:
        # Key already exists, just overwrite value
        self.data[key].value = value
    else:
        # New key, attach at head of list
        with self.lock:
            el = OrderedDictElement(key, value, next=self.head.next,
                                    prev=self.head)
            self.head.next.prev = el
            self.head.next = el
            self.data[key] = el

def __getitem__(self, key):
    return self.data[key].value

To 'update the access time' of an object, I use

def to_head(self, key):
    with self.lock:
        el = self.data[key]
        # Splice out
        el.prev.next = el.next
        el.next.prev = el.prev

        # Insert back at front
        el.next = self.head.next
        el.prev = self.head

        self.head.next.prev = el
        self.head.next = el


self.head and self.tail are special sentinel objects that only have a
.next and .prev attribute respectively.


While this is probably going to work, I'm not sure if it's the best
solution, so I'd appreciate any comments. Can it be done more elegantly?
Or is there an entirely different way to construct the data structure
that also fulfills my requirements?


I already looked at the new OrderedDict class in Python 3.1, but
apparently it does not allow me to change the ordering and is therefore
not suitable for my purpose. (I can move something to one end by
deleting and reinserting it, but I'd like to keep at least the option of also
moving objects to the opposite end).
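
For completeness: Python 3.2 later added OrderedDict.move_to_end(key,
last=True/False), which can move entries to either end and would cover all
three requirements. A minimal sketch of an LRU cache built on it:

from collections import OrderedDict

class LRUCache:
    """Minimal LRU sketch using OrderedDict.move_to_end (Python 3.2+)."""

    def __init__(self, maxsize=128):
        self.maxsize = maxsize
        self.data = OrderedDict()

    def __getitem__(self, key):
        value = self.data[key]
        self.data.move_to_end(key)          # mark as most recently used
        return value

    def __setitem__(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.maxsize:
            self.data.popitem(last=False)   # evict the least recently used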


Best,


   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


Logging multiple lines

2009-06-16 Thread Nikolaus Rath
Hi,

Are there any best practices for handling multi-line log messages?

For example, the program,

import logging

main = logging.getLogger()
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
main.addHandler(handler)
main.setLevel(logging.DEBUG)
main.info("Starting")
try:
    bla = 42/0
except:
    main.exception("Oops...")

generates the log messages

2009-06-16 22:19:57,515 INFO Starting
2009-06-16 22:19:57,518 ERROR Oops...
Traceback (most recent call last):
  File "/home/nikratio/lib/EclipseWorkspace/playground/src/mytests.py", line 
27, in 
bla = 42/0
ZeroDivisionError: integer division or modulo by zero


which are a mess in any logfile because they make it really difficult to
parse.


How do you usually handle multi-line messages? Do you avoid them
completely (and therefore also the exception logging facilities provided
by logging)? Or is it possible to tweak the formatter so that it inserts
the prefix at the beginning of every line? 
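
The closest thing I can come up with is a Formatter subclass that
re-applies the prefix to every line (a sketch, assuming the
'%(asctime)s %(levelname)s %(message)s' format used above):

import logging

class MultiLinePrefixFormatter(logging.Formatter):
    """Sketch: repeat the '%(asctime)s %(levelname)s ' prefix on every line."""

    def format(self, record):
        text = logging.Formatter.format(self, record)
        first, sep, rest = text.partition('\n')
        if not rest:
            return text
        # Rebuild the prefix from the same record fields used for the first line.
        prefix = '%s %s ' % (self.formatTime(record), record.levelname)
        return first + '\n' + '\n'.join(prefix + line for line in rest.split('\n'))

Passing an instance of this to handler.setFormatter() should give every
traceback line the same timestamp/level prefix, so each line can be parsed
independently.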


Best,


   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


Exceptions and Object Destruction (was: Problem with apsw and garbage collection)

2009-06-12 Thread Nikolaus Rath
Nikolaus Rath  writes:
> Hi,
>
> Please consider this example:
[...]

I think I managed to narrow down the problem a bit. It seems that when
a function returns normally, its local variables are immediately
destroyed. However, if the function is left due to an exception, the
local variables remain alive:

-snip-
#!/usr/bin/env python
import gc

class testclass(object):
    def __init__(self):
        print "Initializing"

    def __del__(self):
        print "Destructing"

def dostuff(fail):
    obj = testclass()

    if fail:
        raise TypeError

print "Calling dostuff"
dostuff(fail=False)
print "dostuff returned"

try:
    print "Calling dostuff"
    dostuff(fail=True)
except TypeError:
    pass

gc.collect()
print "dostuff returned"
-snip-


Prints out:


-snip-
Calling dostuff
Initializing
Destructing
dostuff returned
Calling dostuff
Initializing
dostuff returned
Destructing
-snip-


Is there a way to have the obj variable (that is created in dostuff())
destroyed earlier than at the end of the program? As you can see, I
already tried to explicitly call the garbage collector, but this does
not help.
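
One more data point that may be relevant (assuming CPython 2.x semantics):
after the except clause, sys.exc_info() still references the traceback, the
traceback references dostuff()'s frame, and that frame keeps obj alive.
Clearing the exception state explicitly might therefore release it earlier.
An untested sketch, replacing the tail of the script above:

import sys

try:
    print "Calling dostuff"
    dostuff(fail=True)
except TypeError:
    pass

sys.exc_clear()   # drop the saved traceback so the frame (and obj) can go away
gc.collect()
print "dostuff returned"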


Best,


   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


Problem with apsw and garbage collection

2009-06-11 Thread Nikolaus Rath
Hi,

Please consider this example:

#!/usr/bin/env python

import apsw
import tempfile

fh = tempfile.NamedTemporaryFile()
conn = apsw.Connection(fh.name)

# Fill the db with some data
cur = conn.cursor()
datafh = open("/dev/urandom", "rb")
cur.execute("CREATE TABLE foo (no INT, data BLOB)")
for i in range(42):
    cur.execute("INSERT INTO foo(no,data) VALUES(?,?)",
                (i, buffer(datafh.read(4096))))
del cur

# Here comes the problem:
def dostuff(conn, fail):
    cur = conn.cursor()
    # Ignore result, we just need to make some query
    cur.execute("SELECT data FROM foo WHERE no = ?", (33,))
    cur2 = conn.cursor()

    if fail:
        raise TypeError
    else:
        return

# This works:
dostuff(conn, fail=False)
cur = conn.cursor()
cur.execute("VACUUM")
#cur.execute("CREATE TABLE test (no INT)")

# This does not:
try:
    dostuff(conn, fail=True)
except TypeError:
    cur = conn.cursor()
    cur.execute("VACUUM")
    #cur.execute("CREATE TABLE test2 (no INT)")


While the first execute("VACUUM") call succeeds, the second does not but
raises an apsw.BusyError (meaning that sqlite thinks that it cannot get
an exclusive lock on the database).

I suspect that the reason for that is that the cursor object that is
created in the function is not destroyed when the function is left with
raise (rather than return), which in turn prevents sqlite from obtaining
the lock.

However, if I exchange the VACUUM command for something else (e.g. CREATE
TABLE), the program runs fine. I think this casts some doubt on the
above explanation, since AFAIK sqlite always locks the entire file and
should therefore have the same problem as before.


Can someone explain what exactly is happening here?


Best,


   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: [pyunit] Only run one specific test

2009-05-28 Thread Nikolaus Rath
Dave Angel  writes:
> Nikolaus Rath wrote:
>> Hi,
>>
>> Consider these two files:
>>
, mytest.py -
| #!/usr/bin/env python
| import unittest
| 
| class myTestCase(unittest.TestCase):
|     def test_foo(self):
|         pass
| 
| # Somehow important according to pyunit documentation
| def suite():
|     return unittest.makeSuite(myTestCase)
`

, runtest ---
| #!/usr/bin/env python
| import unittest
| 
| # Find and import tests
| modules_to_test =  [ "mytest" ]
| map(__import__, modules_to_test)
| 
| # Runs all tests in test/ directory
| def suite():
|     alltests = unittest.TestSuite()
|     for name in modules_to_test:
|         alltests.addTest(unittest.findTestCases(sys.modules[name]))
|     return alltests
| 
| if __name__ == '__main__':
|     unittest.main(defaultTest='suite')
`
>>
>>
>> if I run runtest without arguments, it works. But according to runtest
>> --help, I should also be able to do
>>
>> ,
>> | $ ./runtest mytest
>> | Traceback (most recent call last):
>> |   File "./runtest", line 20, in 
>> | unittest.main()
>> |   File "/usr/lib/python2.6/unittest.py", line 816, in __init__
>> | self.parseArgs(argv)
>> |   File "/usr/lib/python2.6/unittest.py", line 843, in parseArgs
>> | self.createTests()
>> |   File "/usr/lib/python2.6/unittest.py", line 849, in createTests
>> | self.module)
>> |   File "/usr/lib/python2.6/unittest.py", line 613, in loadTestsFromNames
>> | suites = [self.loadTestsFromName(name, module) for name in names]
>> |   File "/usr/lib/python2.6/unittest.py", line 584, in loadTestsFromName
>> | parent, obj = obj, getattr(obj, part)
>> | AttributeError: 'module' object has no attribute 'mytest'
>> `
>>
>>
>> Why doesn't this work?
>>
>> Best,
>>
>>-Nikolaus
>>
>>
>>   
> First, you're missing an import sys in the runtest.py module.
> Without that, it won't even start.

Sorry, I must have accidentally deleted the line when I deleted empty
lines to make the example more compact.


> Now, I have no familiarity with unittest, but I took this as a
> challenge.  The way I read the code is that you need an explicit
> import of mytest if you're
> going to specify a commandline of
>runtest mytest
>
> So I'd add two lines to the beginning of  runtest.py:
>
> import sys
> import mytest

Yes, that works indeed. But in practice the modules_to_import list is
filled by parsing the contents of a test/*.py directory. That's why I
import dynamically with __import__.

Nevertheless, you got me on the right track. After I explicitly added
the modules to the global namespace (globals()["mytest"] =
__import__("mytest")), it works fine. Thx!


Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


[pyunit] Only run one specific test

2009-05-28 Thread Nikolaus Rath
Hi,

Consider these two files:

, mytest.py -
| #!/usr/bin/env python
| import unittest
| 
| class myTestCase(unittest.TestCase):
|     def test_foo(self):
|         pass
| 
| # Somehow important according to pyunit documentation
| def suite():
|     return unittest.makeSuite(myTestCase)
`

, runtest ---
| #!/usr/bin/env python
| import unittest
| 
| # Find and import tests
| modules_to_test =  [ "mytest" ]
| map(__import__, modules_to_test)
| 
| # Runs all tests in test/ directory
| def suite():
|     alltests = unittest.TestSuite()
|     for name in modules_to_test:
|         alltests.addTest(unittest.findTestCases(sys.modules[name]))
|     return alltests
| 
| if __name__ == '__main__':
|     unittest.main(defaultTest='suite')
`


if I run runtest without arguments, it works. But according to runtest
--help, I should also be able to do

,
| $ ./runtest mytest
| Traceback (most recent call last):
|   File "./runtest", line 20, in 
| unittest.main()
|   File "/usr/lib/python2.6/unittest.py", line 816, in __init__
| self.parseArgs(argv)
|   File "/usr/lib/python2.6/unittest.py", line 843, in parseArgs
| self.createTests()
|   File "/usr/lib/python2.6/unittest.py", line 849, in createTests
| self.module)
|   File "/usr/lib/python2.6/unittest.py", line 613, in loadTestsFromNames
| suites = [self.loadTestsFromName(name, module) for name in names]
|   File "/usr/lib/python2.6/unittest.py", line 584, in loadTestsFromName
| parent, obj = obj, getattr(obj, part)
| AttributeError: 'module' object has no attribute 'mytest'
`


Why doesn't this work?

Best,

   -Nikolaus


-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
-- 
http://mail.python.org/mailman/listinfo/python-list


Strange pexpect behaviour: just duplicates stdin

2009-03-30 Thread Nikolaus Rath
Hello,

I have a strange problem with pexpect:

$ cat test.py
#!/usr/bin/python

import pexpect

child = pexpect.spawn("./test.pl")

while True:
    try:
        line = raw_input()
    except EOFError:
        break

    child.sendline(line)
    print child.readline().rstrip("\r\n")

child.close()

$ cat test.pl
#!/usr/bin/perl
# Replace all digits by __
while (<>) {
    s/[0-9]+/__/g;
    print;
}

$ echo bla24fasel | ./test.pl
bla__fasel

$ echo bla24fasel | ./test.py
bla24fasel


Why doesn't the last command return bla__fasel too?


I extracted this example from an even stranger problem in a bigger
program. In there, pexpect sometimes returns both the string send with
sendline() *and* the output of the child program, sometimes the
correct output of the child, and sometimes only the input it has send
to the child. I couldn't figure out a pattern, but the above example
always produces the same result.


Anyone able to help?
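
One thing I have not ruled out: pexpect runs the child on a pseudo-terminal,
and a pty echoes everything written to it by default, so the first
readline() may simply return the echoed input instead of the child's output.
A sketch of the workaround I would try (setecho() behaviour can depend on
the platform):

import pexpect

child = pexpect.spawn("./test.pl")
child.setecho(False)        # ask the pty not to echo what we send
child.sendline("bla24fasel")
print child.readline().rstrip("\r\n")    # hopefully "bla__fasel" now
child.close()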


Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list


Re: Calculate sha1 hash of a binary file

2008-08-07 Thread Nikolaus Rath
LaundroMat <[EMAIL PROTECTED]> writes:
> Hi -
>
> I'm trying to calculate unique hash values for binary files,
> independent of their location and filename, and I was wondering
> whether I'm going in the right direction.
>
> Basically, the hash values are calculated thusly:
>
> f = open('binaryfile.bin')
> import hashlib
> h = hashlib.sha1()
> h.update(f.read())
> hash = h.hexdigest()
> f.close()
>
> A quick try-out shows that effectively, after renaming a file, its
> hash remains the same as it was before.
>
> I have my doubts however as to the usefulness of this. As f.read()
> does not seem to read until the end of the file (for a 3.3MB file only
> a string of 639 bytes is being returned, perhaps a 00-byte counts as
> EOF?), is there a high danger for collision?
>
> Are there better ways of calculating hash values of binary files?


Apart from opening the file in binary mode, I would consider reading and
updating the hash in chunks of e.g. 512 KB. The above code is probably
going to perform horribly for sufficiently large files, since you try to
read the entire file into memory.
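
Something along these lines (a sketch):

import hashlib

def sha1_of_file(path, chunk_size=512 * 1024):
    """Hash a file in fixed-size chunks so it never has to fit into memory."""
    h = hashlib.sha1()
    f = open(path, 'rb')   # binary mode; in text mode Windows stops at Ctrl-Z
    try:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h.update(chunk)
    finally:
        f.close()
    return h.hexdigest()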


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Re: Find class of an instance?

2008-08-06 Thread Nikolaus Rath
Neal Becker <[EMAIL PROTECTED]> writes:
> Sounds simple, but how, given an instance, do I find the class?


It does not only sound simple. When 'inst' is your instance, then

  inst.__class__

or

  type(inst) 

is the class.

Best,


   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Re: Locking around

2008-08-06 Thread Nikolaus Rath
Tobiah <[EMAIL PROTECTED]> writes:
> On Mon, 04 Aug 2008 15:30:51 +0200, Nikolaus Rath wrote:
>
>> Hello,
>> 
>> I need to synchronize the access to a couple of hundred-thousand
>> files[1]. It seems to me that creating one lock object for each of the
>> files is a waste of resources, but I cannot use a global lock for all
>> of them either (since the locked operations go over the network, this
>> would make the whole application essentially single-threaded even
>> though most operations act on different files).
>
> Do you think you could use an SQL database on the network to
> handle the locking?

Yeah, I could. It wouldn't even have to be over the network (I'm
synchronizing access from within the same program). But I think that
is even more resource-wasteful than my original idea.

> Hey, maybe the files themselves should go into blobs.

Nope, not possible.  They're on Amazon S3.

Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Re: Locking around

2008-08-06 Thread Nikolaus Rath
Carl Banks <[EMAIL PROTECTED]> writes:
> Freaky... I just posted nearly this exact solution.
>
> I have a couple comments.  First, the call to acquire should come
> before the try block.  If the acquire were to fail, you wouldn't want
> to release the lock on cleanup.
>
> Second, you need to change notify() to notifyAll(); notify alone won't
> cut it.  Consider what happens if you have two threads waiting for
> keys A and B respectively.  When the thread that has B is done, it
> releases B and calls notify, but notify happens to wake up the thread
> waiting on A.  Thus the thread waiting on B is starved.

You're right. Thanks for pointing it out.

Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Re: Locking around

2008-08-06 Thread Nikolaus Rath
Nikolaus Rath <[EMAIL PROTECTED]> writes:
>> This should work, at least the idea is not flawed. However, I'd say
>> there are too many locks involved. Rather, you just need a simple
>> flag and the global lock. Further, you need a condition/event that
>> tells waiting threads that you released some of the files so that it
>> should see again if the ones it wants are available.
>
> I have to agree that this sounds like an easier implementation. I
> just have to think about how to do the signalling. Thanks a lot!

Here's the code I use now. I think it's also significantly easier to
understand (cv is a threading.Condition() object and cv.locked_keys a
set()).

def lock_s3key(s3key):
    cv = self.s3_lock

    try:
        # Lock set of locked s3 keys (global lock)
        cv.acquire()

        # Wait for given s3 key becoming unused
        while s3key in cv.locked_keys:
            cv.wait()

        # Mark it as used (local lock)
        cv.locked_keys.add(s3key)
    finally:
        # Release global lock
        cv.release()


def unlock_s3key(s3key):
    cv = self.s3_lock

    try:
        # Lock set of locked s3 keys (global lock)
        cv.acquire()

        # Mark key as free (release local lock)
        cv.locked_keys.remove(s3key)

        # Notify other threads
        cv.notify()

    finally:
        # Release global lock
        cv.release()


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Re: Locking around

2008-08-06 Thread Nikolaus Rath
Ulrich Eckhardt <[EMAIL PROTECTED]> writes:
> Nikolaus Rath wrote:
>> I need to synchronize the access to a couple of hundred-thousand
>> files[1]. It seems to me that creating one lock object for each of the
>> files is a waste of resources, but I cannot use a global lock for all
>> of them either (since the locked operations go over the network, this
>> would make the whole application essentially single-threaded even
>> though most operations act on different files).
>
> Just wondering, but at what time do you know what files are needed?

As soon as I have read a client request. Also, I will only need one
file per request, not multiple.

> If you know that rather early, you could simply 'check out' the
> required files, do whatever you want with them and then release them
> again. If one of the requested files is marked as already in use,
> you simply wait (without reserving the others) until someone
> releases files and then try again. You could also wait for that
> precise file to be available, but that would require that you
> already reserve the other files, which might unnecessarily block
> other accesses.
>
> Note that this idea requires that each access locks one set of files at the
> beginning and releases them at the end, i.e. no attempts to lock files in
> between, which would otherwise easily lead to deadlocks.

I am not sure that I understand your idea. To me this sounds exactly
like what I'm already doing, just replace 'check out' by 'lock' in
your description... Am I missing something?

>> My idea is therefore to create and destroy per-file locks "on-demand"
>> and to protect the creation and destruction by a global lock
>> (self.global_lock). For that, I add a "usage counter"
>> (wlock.user_count) to each lock, and destroy the lock when it reaches
>> zero.
> [...code...]
>
>>  - Does that look like a proper solution, or does anyone have a better
>>one?
>
> This should work, at least the idea is not flawed. However, I'd say
> there are too many locks involved. Rather, you just need a simple
> flag and the global lock. Further, you need a condition/event that
> tells waiting threads that you released some of the files so that it
> should see again if the ones it wants are available.

I have to agree that this sounds like an easier implementation. I just
have to think about how to do the signalling. Thanks a lot!


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Locking around

2008-08-04 Thread Nikolaus Rath
Hello,

I need to synchronize the access to a couple of hundred-thousand
files[1]. It seems to me that creating one lock object for each of the
files is a waste of resources, but I cannot use a global lock for all
of them either (since the locked operations go over the network, this
would make the whole application essentially single-threaded even
though most operations act on different files).

My idea is therefore to create and destroy per-file locks "on-demand"
and to protect the creation and destruction by a global lock
(self.global_lock). For that, I add a "usage counter"
(wlock.user_count) to each lock, and destroy the lock when it reaches
zero. The number of currently active lock objects is stored in a dict:

def lock_s3key(self, s3key):

    self.global_lock.acquire()
    try:

        # If there is a lock object, use it
        if s3key in self.key_lock:
            wlock = self.key_lock[s3key]
            wlock.user_count += 1
            lock = wlock.lock

        # otherwise create a new lock object
        else:
            wlock = WrappedLock()
            wlock.lock = threading.Lock()
            wlock.user_count = 1
            self.key_lock[s3key] = wlock
            lock = wlock.lock

    finally:
        self.global_lock.release()

    # Lock the key itself
    lock.acquire()


and similarly

def unlock_s3key(self, s3key):

    # Lock dictionary of lock objects
    self.global_lock.acquire()
    try:

        # Get lock object
        wlock = self.key_lock[s3key]

        # Unlock key
        wlock.lock.release()

        # We don't use the lock object any longer
        wlock.user_count -= 1
        assert wlock.user_count >= 0

        # If no other thread uses the lock, dispose it
        if wlock.user_count == 0:
            del self.key_lock[s3key]

    finally:
        self.global_lock.release()


WrappedLock is just an empty class that allows me to add the
additional user_count attribute.
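
For reference, a sketch of what that looks like (the exact definition is an
assumption based on the description above):

class WrappedLock(object):
    # Empty container; 'lock' and 'user_count' are attached dynamically
    # by lock_s3key().
    pass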


My questions:

 - Does that look like a proper solution, or does anyone have a better
   one?

 - Did I overlook any deadlock possibilities?
 

Best,
Nikolaus



[1] Actually, it's not really files (because in that case I could use
fcntl) but blobs stored on Amazon S3.


-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Re: Difference between type and class

2008-07-31 Thread Nikolaus Rath
Steven D'Aprano <[EMAIL PROTECTED]> writes:
> So, to the Original Poster:
>
> In Python, new-style classes and types are the same, but it is
> traditional to refer to customer objects as "class" and built-in
> objects as "types". Old-style classes are different, but you are
> discouraged from using old-style classes unless you have a specific
> reason for needing them (e.g. backwards compatibility).

I see. Thanks a lot for the explanation.


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Re: Difference between type and class

2008-07-31 Thread Nikolaus Rath
Miles <[EMAIL PROTECTED]> writes:
> On Thu, Jul 31, 2008 at 1:59 PM, Nikolaus Rath wrote:
>> If it is just a matter of different rendering, what's the reason for
>> doing it like that? Wouldn't it be more consistent and straightforward
>> to denote builtin types as classes as well?
>
> Yes, and in Python 3, it will be so:
>
[..]

> http://svn.python.org/view?rev=23331&view=rev

That makes matters absolutely clear. Thanks a lot. No more questions
from my side ;-).


Best,

   -Nikolaus


-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Re: Difference between type and class

2008-07-31 Thread Nikolaus Rath
Maric Michaud <[EMAIL PROTECTED]> writes:
>>> What the  means is that int is not a user type but a
>>> builtin type, instances of int are not types (or classes) but common
>>> objects, so its nature is the same as any classes.
>>>
>>> The way it prints doesn't matter, it's just the __repr__ of any instance,
>>> and the default behavior for instances of type is to return '',
>>> but it can be easily customized.
>>
>> But 'int' is an instance of 'type' (the metaclass):
>> >>> int.__class__
>>
>> 
>>
>> so it should also return '' if that's the default behavior
>> of the 'type' metaclass.
>>
>
> The fact that a class is an instance of type, which it is always true, 
> doesn't 
> mean its metaclass is "type", it could be any subclass of type :

Yes, that is true. But it's not what I said above (and below).
'obj.__class__' gives the class of 'obj', so if 'int.__class__ is
type' then 'type' is the class of 'int' and 'int' is *not* an instance
of some metaclass derived from 'type'.

>> I think that to get '' one would have to define a new
>> metaclass like this:
>>
>> def type_meta(type):
>>     def __repr__(self)
>>          return "" % self.__name__
>>
>> and then one should have int.__class__ is type_meta. But obviously
>> that's not the case. Why?
>>
>> Moreover:
>> >>> class myint(int):
>>
>> ...    pass
>> ...
>>
>> >>> myint.__class__ is int.__class__
>> True
>>
>> >>> int
>> 
>>
>> >>> myint
>> 
>>
>> despite int and myint having the same metaclass. So if the
>> representation is really defined in the 'type' metaclass, then
>> type.__repr__ has to make some kind of distinction between int and
>> myint, so they cannot be on absolute equal footing.
>
> You're right, type(int) is type, the way it renders differently is a
> detail of its implementation, you can do things with builtin types
> (written in C) you coudn't do in pure python, exactly as you
> couldn't write recursive types like 'object' and 'type'.


If it is just a matter of different rendering, what's the reason for
doing it like that? Wouldn't it be more consistent and straightforward
to denote builtin types as classes as well?

And where exactly is this different rendering implemented? Could I
write my own type (in C, of course) and make it behave like e.g.
'int'? I.e. its rendering should be different and not inherited to
subclasses:


>>> my_type
<type 'my_type'>
>>> a = my_type(42)
>>> a.__class__
<type 'my_type'>

>>> class derived(my_type):
...     pass
>>> derived
<class 'derived'>


or would I have to change the implementation of 'type' for this (since
it contains the __repr__ function that renders the type)?
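
As a partial answer in pure Python (Python 2.x syntax, all names made up): a
metaclass with its own __repr__ controls how the class object prints, but as
the sketch below shows, subclasses inherit the metaclass and therefore the
custom rendering as well, which is exactly where the behaviour of 'int'
differs:

class type_meta(type):
    def __repr__(cls):
        return "<type '%s'>" % cls.__name__

class my_type(object):
    __metaclass__ = type_meta

class derived(my_type):
    pass

print repr(my_type)   # <type 'my_type'>
print repr(derived)   # <type 'derived'> -- the rendering *is* inherited here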


This is of course purely theoretical and probably without any
practical relevance. I'm sorry if I just can't stop drilling, but I think
this is really interesting.


Best,


   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Re: Difference between type and class

2008-07-31 Thread Nikolaus Rath
Maric Michaud <[EMAIL PROTECTED]> writes:
> Le Thursday 31 July 2008 16:46:28 Nikolaus Rath, vous avez écrit :
>> Maric Michaud <[EMAIL PROTECTED]> writes:
>> >> > Can someone explain to me the difference between a type and a class?
>> >>
>> >> If your confusion is of a more general nature I suggest reading the
>> >> introduction of `Design Patterns' (ISBN-10: 0201633612), under
>> >> `Specifying Object Interfaces'.
>> >>
>> >> In short: A type denotes a certain interface, i.e. a set of signatures,
>> >> whereas a class tells us how an object is implemented (like a
>> >> blueprint). A class can have many types if it implements all their
>> >> interfaces, and different classes can have the same type if they share a
>> >> common interface. The following example should clarify matters:
>> >
>> > Of course, this is what a type means in certain literature about OO
>> > (java-ish), but this absolutely not what type means in Python. Types
>> > are a family of object with a certain status, and they're type is
>> > "type", conventionnaly named a metatype in standard OO.
>>
>> [...]
>>
>> Hmm. Now you have said a lot about Python objects and their type, but
>> you still haven't said what a type actually is (in Python) and in what
>> way it is different from a class. Or did I miss something?
>
> This paragraph ?
>
> """
> - types, or classes, are all instance of type 'type' (or a subclass of it), 

Well, I couldn't quite make sense of '..are instance of type...'
without knowing what you actually meant with "type* in this context,
but...

> Maybe it's still unclear that "types" and "classes" *are* synonyms
> in Python.

..in this case it becomes clear. Then my question reduces to the one
in the other post (why do 'int' and 'myint' have different __repr__
results).


Already thanks for your help,


   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Re: Difference between type and class

2008-07-31 Thread Nikolaus Rath
Maric Michaud <[EMAIL PROTECTED]> writes:
> Le Thursday 31 July 2008 14:30:19 Nikolaus Rath, vous avez écrit :
>> oj <[EMAIL PROTECTED]> writes:
>> > On Jul 31, 11:37 am, Nikolaus Rath <[EMAIL PROTECTED]> wrote:
>> >> So why does Python distinguish between e.g. the type 'int' and the
>> >> class 'myclass'? Why can't I say that 'int' is a class and 'myclass'
>> >> is a type?
>> >
>> > I might be wrong here, but I think the point is that there is no
>> > distinction. A class (lets call it SomeClass for this example) is an
>> > object of type 'type', and an instance of a class is an object of type
>> > 'SomeClass'.
>>
>> But there seems to be a distinction:
>> >>> class int_class(object):
>>
>> ...   pass
>> ...
>>
>> >>> int_class
>>
>> 
>>
>> >>> int
>>
>> 
>>
>> why doesn't this print
>>
>> >>> class int_class(object):
>>
>> ...   pass
>> ...
>>
>> >>> int_class
>>
>> 
>>
>> >>> int
>>
>> 
>>
>> or
>>
>> >>> class int_class(object):
>>
>> ...   pass
>> ...
>>
>> >>> int_class
>>
>> 
>>
>> >>> int
>>
>> 
>>
>> If there is no distinction, how does the Python interpreter know when
>> to print 'class' and when to print 'type'?
>>
>
> There are some confusion about the terms here.
>
> Classes are instances of type 'type',

Could you please clarify what you mean with 'instance of type X'? I
guess you mean that 'y is an instance of type X' iff y is constructed
by instantiating X. Is that correct?

> What the  means is that int is not a user type but a
> builtin type, instances of int are not types (or classes) but common
> objects, so its nature is the same as any classes.
>
> The way it prints doesn't matter, it's just the __repr__ of any instance, and 
> the default behavior for instances of type is to return '', but it 
> can be easily customized.

But 'int' is an instance of 'type' (the metaclass):

>>> int.__class__
<type 'type'>

so it should also return "<class 'int'>" if that's the default behavior
of the 'type' metaclass.

I think that to get "<type 'int'>" one would have to define a new
metaclass like this:

class type_meta(type):
    def __repr__(self):
        return "<type '%s'>" % self.__name__

and then one should have int.__class__ == type_meta. But obviously
that's not the case. Why?


Moreover:

>>> class myint(int):
...     pass
...
>>> myint.__class__ == int.__class__
True
>>> int
<type 'int'>
>>> myint
<class '__main__.myint'>

despite int and myint having the same metaclass. So if the
representation is really defined in the 'type' metaclass, then
type.__repr__ has to make some kind of distinction between int and
myint, so they cannot be on absolute equal footing.
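
For what it's worth, the distinction CPython 2.x actually makes can be
sketched in Python as follows; this is a simplified re-implementation of
type_repr() from Objects/typeobject.c (the real code also prepends the
module name for classes), not the actual implementation:

Py_TPFLAGS_HEAPTYPE = 1 << 9      # set for types created at runtime

def render(cls):
    if cls.__flags__ & Py_TPFLAGS_HEAPTYPE:
        kind = 'class'            # created by a class statement
    else:
        kind = 'type'             # statically allocated builtin, e.g. int
    return "<%s '%s'>" % (kind, cls.__name__)

class myint(int):
    pass

print render(int)     # <type 'int'>
print render(myint)   # <class 'myint'>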



Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Re: Difference between type and class

2008-07-31 Thread Nikolaus Rath
Maric Michaud <[EMAIL PROTECTED]> writes:
>> > Can someone explain to me the difference between a type and a class?
>>
>> If your confusion is of a more general nature I suggest reading the
>> introduction of `Design Patterns' (ISBN-10: 0201633612), under
>> `Specifying Object Interfaces'.
>>
>> In short: A type denotes a certain interface, i.e. a set of signatures,
>> whereas a class tells us how an object is implemented (like a
>> blueprint). A class can have many types if it implements all their
>> interfaces, and different classes can have the same type if they share a
>> common interface. The following example should clarify matters:
>>
>
> Of course, this is what a type means in certain literature about OO
> (java-ish), but this absolutely not what type means in Python. Types
> are a family of object with a certain status, and they're type is
> "type", conventionnaly named a metatype in standard OO.
[...]

Hmm. Now you have said a lot about Python objects and their type, but
you still haven't said what a type actually is (in Python) and in what
way it is different from a class. Or did I miss something?


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Re: Difference between type and class

2008-07-31 Thread Nikolaus Rath
oj <[EMAIL PROTECTED]> writes:
> On Jul 31, 11:37 am, Nikolaus Rath <[EMAIL PROTECTED]> wrote:
>> So why does Python distinguish between e.g. the type 'int' and the
>> class 'myclass'? Why can't I say that 'int' is a class and 'myclass'
>> is a type?
>
> I might be wrong here, but I think the point is that there is no
> distinction. A class (lets call it SomeClass for this example) is an
> object of type 'type', and an instance of a class is an object of type
> 'SomeClass'.

But there seems to be a distinction:

>>> class int_class(object):
...     pass
...
>>> int_class
<class '__main__.int_class'>
>>> int
<type 'int'>

why doesn't this print

>>> class int_class(object):
...     pass
...
>>> int_class
<class '__main__.int_class'>
>>> int
<class 'int'>

or

>>> class int_class(object):
...     pass
...
>>> int_class
<type '__main__.int_class'>
>>> int
<type 'int'>

If there is no distinction, how does the Python interpreter know when
to print 'class' and when to print 'type'?


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Re: Difference between type and class

2008-07-31 Thread Nikolaus Rath
Thomas Troeger <[EMAIL PROTECTED]> writes:
>> Can someone explain to me the difference between a type and a class?
>
> If your confusion is of a more general nature I suggest reading the
> introduction of `Design Patterns' (ISBN-10: 0201633612), under
> Specifying Object Interfaces'.
>
> In short: A type denotes a certain interface, i.e. a set of
> signatures, whereas a class tells us how an object is implemented
> (like a blueprint). A class can have many types if it implements all
> their interfaces, and different classes can have the same type if they
> share a common interface. The following example should clarify
> matters:
>
> class A:
>     def bar(self):
>         print "A"
>
> class B:
>     def bar(self):
>         print "B"
>
> class C:
>     def bla(self):
>         print "C"
>
> def foo(x):
>     x.bar()
>
> you can call foo with instances of both A and B, because both classes
> share a common type, namely the type that has a `bar' method), but not
> with an instance of C because it has no method `bar'. Btw, this
> example shows the use of duck typing
> (http://en.wikipedia.org/wiki/Duck_typing).

That would imply that I cannot create instances of a type, only of
a class that implements the type, wouldn't it?

But Python denotes 'int' as a type *and* I can instantiate it.
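
For instance, in a Python 2.x session:

>>> int
<type 'int'>
>>> int('42')
42
>>> isinstance(int('42'), int)
True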



Still confused,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Difference between type and class

2008-07-31 Thread Nikolaus Rath
Hello,

Can someone explain to me the difference between a type and a class?
After reading http://www.cafepy.com/article/python_types_and_objects/
it seems to me that classes and types are actually the same thing:

 - both are instances of a metaclass, and the same metaclass ('type')
   can instantiate both classes and types.
 - both can be instantiated and yield an "ordinary" object
 - I can even inherit from a type and get a class

So why does Python distinguish between e.g. the type 'int' and the
class 'myclass'? Why can't I say that 'int' is a class and 'myclass'
is a type?
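
For reference, this is the distinction at the Python 2.x prompt ('myclass'
is just a stand-in name):

>>> class myclass(object):
...     pass
...
>>> int
<type 'int'>
>>> myclass
<class '__main__.myclass'>
>>> type(int), type(myclass)
(<type 'type'>, <type 'type'>)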

I hope I have managed to get across the point of my confusion...


Thanks in advance,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Re: [unittest] Run setUp only once

2008-07-29 Thread Nikolaus Rath
Jean-Paul Calderone <[EMAIL PROTECTED]> writes:
> On Tue, 29 Jul 2008 19:26:09 +0200, Nikolaus Rath <[EMAIL PROTECTED]> wrote:
>>Jean-Paul Calderone <[EMAIL PROTECTED]> writes:
>>> On Tue, 29 Jul 2008 16:35:55 +0200, Nikolaus Rath <[EMAIL PROTECTED]> wrote:
>>>>Hello,
>>>>
>>>>I have a number of conceptually separate tests that nevertheless need
>>>>a common, complicated and expensive setup.
>>>>
>>>>Unfortunately, unittest runs the setUp method once for each defined
>>>>test, even if they're part of the same class as in
>>>>
>>>>class TwoTests(unittest.TestCase):
>>>>def setUp(self):
>>>># do something very time consuming
>>>>
>>>>def testOneThing(self):
>>>>
>>>>
>>>>def testADifferentThing(self):
>>>>
>>>>
>>>>which would call setUp twice.
>>>>
>>>>
>>>>Is there any way to avoid this, without packing all the unrelated
>>>>tests into one big function?
>>>>
>>>
>>>class TwoTests(unittest.TestCase):
>>>setUpResult = None
>>>
>>>def setUp(self):
>>>if self.setUpResult is None:
>>>self.setUpResult = computeIt()
>>>
>>>...
>>>
>>> There are plenty of variations on this pattern.
>>
>>
>>But at least this variation doesn't work, because unittest apparently
>>also creates two separate TwoTests instances for the two tests. Isn't
>>there some way to convince unittest to reuse the same instance instead
>>of trying to solve the problem in the test code itself?
>>
>
> Eh sorry, you're right, the above is broken. `setUpResult` should be
> a class attribute instead of an instance attribute.

Yeah, well, I guess that would work. But to me this looks really more
like a nasty hack.. isn't there a proper solution?
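
For reference, a runnable sketch of the class-attribute variant described
above; expensive_setup() is a stand-in for the real setup. (Later unittest
versions, Python 2.7 and up, added setUpClass() for exactly this use case.)

import unittest

def expensive_setup():
    return object()               # stand-in for the real, time-consuming setup

class TwoTests(unittest.TestCase):
    _shared = None                # class attribute, shared by all test instances

    def setUp(self):
        cls = type(self)
        if cls._shared is None:   # only the first test pays the cost
            cls._shared = expensive_setup()
        self.resource = cls._shared

    def testOneThing(self):
        self.assertTrue(self.resource is not None)

    def testADifferentThing(self):
        self.assertTrue(self.resource is not None)

if __name__ == '__main__':
    unittest.main()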

Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

Re: [unittest] Run setUp only once

2008-07-29 Thread Nikolaus Rath
Jean-Paul Calderone <[EMAIL PROTECTED]> writes:
> On Tue, 29 Jul 2008 16:35:55 +0200, Nikolaus Rath <[EMAIL PROTECTED]> wrote:
>>Hello,
>>
>>I have a number of conceptually separate tests that nevertheless need
>>a common, complicated and expensive setup.
>>
>>Unfortunately, unittest runs the setUp method once for each defined
>>test, even if they're part of the same class as in
>>
>>class TwoTests(unittest.TestCase):
>>def setUp(self):
>># do something very time consuming
>>
>>def testOneThing(self):
>>
>>
>>def testADifferentThing(self):
>>
>>
>>which would call setUp twice.
>>
>>
>>Is there any way to avoid this, without packing all the unrelated
>>tests into one big function?
>>
>
> class TwoTests(unittest.TestCase):
>     setUpResult = None
>
>     def setUp(self):
>         if self.setUpResult is None:
>             self.setUpResult = computeIt()
>
>     ...
>
> There are plenty of variations on this pattern.


But at least this variation doesn't work, because unittest apparently
also creates two separate TwoTests instances for the two tests. Isn't
there some way to convince unittest to reuse the same instance instead
of trying to solve the problem in the test code itself?


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
--
http://mail.python.org/mailman/listinfo/python-list

[unittest] Run setUp only once

2008-07-29 Thread Nikolaus Rath
Hello,

I have a number of conceptually separate tests that nevertheless need
a common, complicated and expensive setup.

Unfortunately, unittest runs the setUp method once for each defined
test, even if they're part of the same class as in

class TwoTests(unittest.TestCase):
    def setUp(self):
        pass  # do something very time consuming

    def testOneThing(self):
        pass  # first test

    def testADifferentThing(self):
        pass  # second test

which would call setUp twice.


Is there any way to avoid this, without packing all the unrelated
tests into one big function?


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

--
http://mail.python.org/mailman/listinfo/python-list

Re: Attack a sacred Python Cow

2008-07-29 Thread Nikolaus Rath
Bruno Desthuilliers <[EMAIL PROTECTED]> writes:
> Nikolaus Rath a écrit :
>> Michael Torrie <[EMAIL PROTECTED]> writes:
>
> (snip)
>
>>> In short, unlike what most of the implicit self advocates are
>>> saying, it's not just a simple change to the python parser to do
>>> this. It would require a change in the interpreter itself and how it
>>> deals with classes.
>>
>>
>> Thats true. But out of curiosity: why is changing the interpreter such
>> a bad thing? (If we suppose for now that the change itself is a good
>> idea).
>
> Because it would very seriously break a *lot* of code ?

Well, Python 3 will break lots of code anyway, won't it?


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

--
http://mail.python.org/mailman/listinfo/python-list

Re: Protecting instance variables

2008-07-28 Thread Nikolaus Rath
Hi,

Sorry for replying so late. Your MUA apparently messes up the
References: header, so I saw your reply only now and by coincidence.

"Diez B. Roggisch" <[EMAIL PROTECTED]> writes:
> Nikolaus Rath schrieb:
>> Hello,
>>
>> I am really surprised that I am asking this question on the mailing
>> list, but I really couldn't find it on python.org/doc.
>>
>> Why is there no proper way to protect an instance variable from access
>> in derived classes?
>>
>> I can perfectly understand the philosophy behind not protecting them
>> from access in external code ("protection by convention"), but isn't
>> it a major design flaw that when designing a derived class I first
>> have to study the base classes source code? Otherwise I may always
>> accidentally overwrite an instance variable used by the base class...
>
> Here we go again...
>
> http://groups.google.com/group/comp.lang.python/browse_thread/thread/188467d724b48b32/
>
> To directly answer your question: that's what the __ (double
> underscore) name mangling is for.


I understand that it is desirable not to completely hide instance
variables. But it seems silly to me that I should generally prefix
almost all my instance variables with two underscores.

I am not so much concerned about data hiding, but about not
accidentally overwriting a variable of the class I'm inheriting from.
And, unless I misunderstood something, this is only possible if I'm
prefixing them with __.
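
A small sketch of the name mangling in question (class names are made up):

class Base(object):
    def __init__(self):
        self.__count = 0                 # stored on the instance as _Base__count

class Derived(Base):
    def __init__(self):
        Base.__init__(self)
        self.__count = 99                # stored as _Derived__count, so no clash

d = Derived()
print d._Base__count, d._Derived__count  # -> 0 99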

How is this problem solved in practice? I probably don't have a
representative sample, but in the libraries that I have been using so
far, there were a lot of undocumented (in the sense of: not being part
of the public API) instance variables not prefixed with __. I have
therefore started to first grep the source of all base classes
whenever I introduce a new variable in my derived class. Is that
really the way it's supposed to be? What if one of the base classes
introduces a new variable at a later point?


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

--
http://mail.python.org/mailman/listinfo/python-list

Re: os.symlink()

2008-07-28 Thread Nikolaus Rath
"Diez B. Roggisch" <[EMAIL PROTECTED]> writes:
> Nikolaus Rath wrote:
>
>> Hello,
>> 
>>>From `pydoc os`:
>> 
>> symlink(...)
>> symlink(src, dst)
>> 
>> Create a symbolic link pointing to src named dst.
>> 
>> 
>> Is there any reason why this is so deliberately confusing? Why is the
>> target of the symlink, the think where it points *to*, called the
>> `src`? It seems to me that the names of the parameters should be
>> reversed.
>
> I used the command the other day, and didn't feel the slightest confusion.
>
> To me, the process of creating a symlink is like a "virtual copy".
> Which the above parameter names reflect perfectly.

Is this interpretation really widespread? I couldn't find any other
sources using it. On the other hand:

From ln(2):

,
| SYNOPSIS
|ln [OPTION]... [-T] TARGET LINK_NAME   (1st form)
| 
| DESCRIPTION
|In the 1st form, create a link to TARGET with the name LINK_NAME.  
`

From Wikipedia:

,
| A symbolic link merely contains a text string that is interpreted and
| followed by the operating system as a path to another file or
| directory. It is a file on its own and can exist independently of its
| target. If a symbolic link is deleted, its target remains unaffected.
| If the target is moved, renamed or deleted, any symbolic link that
| used to point to it continues to exist but now points to a
| non-existing file.
`


Best,


   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

--
http://mail.python.org/mailman/listinfo/python-list

os.symlink()

2008-07-28 Thread Nikolaus Rath
Hello,

From `pydoc os`:

symlink(...)
symlink(src, dst)

Create a symbolic link pointing to src named dst.


Is there any reason why this is so deliberately confusing? Why is the
target of the symlink, the thing where it points *to*, called the
`src`? It seems to me that the names of the parameters should be
reversed.
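
For concreteness, the call and its shell equivalent (file names are made up):

import os

# Creates a symlink named 'link_name' that points at 'target.txt':
os.symlink('target.txt', 'link_name')
# shell equivalent:  ln -s target.txt link_name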



Puzzled,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

--
http://mail.python.org/mailman/listinfo/python-list

Re: Attack a sacred Python Cow

2008-07-28 Thread Nikolaus Rath
Michael Torrie <[EMAIL PROTECTED]> writes:
> I think the biggest reason why an implicit self is bad is because it
> prevents monkey-patching of existing class objects.  Right now I can add
> a new method to any existing class just with a simple attribute like so
> (adding a new function to an existing instance object isn't so simple,
> but ah well):
>
> def a(self, x, y):
>     self.x = x
>     self.y = y
>
> class Test(object):
>     pass
>
> Test.setxy = a
>
> b = Test()
>
> b.setxy(4,4)
>
> print b.x, b.y
>
> If self was implicit, none of this would work.

No, but it could work like this:

def a(x, y):
    self.x = x
    self.y = y

class Test(object):
    pass

Test.setxy = a
b = Test()

# Still all the same until here

# Since setxy is called as an instance method, it automatically
# gets a 'self' variable and everything works nicely
b.setxy(4, 4)

# This throws an exception, since self is undefined
a(4, 4)


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

--
http://mail.python.org/mailman/listinfo/python-list

Re: Attack a sacred Python Cow

2008-07-28 Thread Nikolaus Rath
"Russ P." <[EMAIL PROTECTED]> writes:
> The issue here has nothing to do with the inner workings of the Python
> interpreter. The issue is whether an arbitrary name such as "self"
> needs to be supplied by the programmer.
>
> All I am suggesting is that the programmer have the option of
> replacing "self.member" with simply ".member", since the word "self"
> is arbitrary and unnecessary. Otherwise, everything would work
> *EXACTLY* the same as it does now. This would be a shallow syntactical
> change with no effect on the inner workings of Python, but it could
> significantly unclutter code in many instances.
>
> The fact that you seem to think it would change the inner
> functioning of Python just shows that you don't understand the
> proposal.


So how would you translate this into a Python with implicit self, but
without changing the procedure for method resolution?

def will_be_a_method(self, a):
    pass  # do something with self and a

class A:
    pass

a = A()
a.method = will_be_a_method


It won't work unless you change the interpreter to magically insert a
'self' variable into the scope of a function when it is called as a
method.

I'm not saying that that's a bad thing, but it certainly requires some
changes to Python's internals.
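
For contrast, a short sketch of today's mechanics when the function is
attached to the *class* rather than to an instance; the binding happens
through the descriptor protocol, and it is this step that an implicit-self
scheme would have to replace:

def will_be_a_method(self, a):
    self.value = a

class A(object):
    pass

A.method = will_be_a_method     # attach to the class, not the instance
inst = A()

# Attribute lookup on the instance invokes the function's __get__, which
# binds the instance to the explicit first parameter.
bound = A.__dict__['method'].__get__(inst, A)
bound(42)                       # equivalent to inst.method(42)
print inst.value                # -> 42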


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

--
http://mail.python.org/mailman/listinfo/python-list

Re: Attack a sacred Python Cow

2008-07-28 Thread Nikolaus Rath
Bruno Desthuilliers <[EMAIL PROTECTED]> writes:
> The fact that a function is defined within a class statement doesn't
> imply any "magic", it just creates a function object, bind it to a
> name, and make that object an attribute of the class. You have the
> very same result by defining the function outside the class statement
> and binding it within the class statement, by defining the function
> outside the class and binding it to the class outside the class
> statement, by binding the name to a lambda within the class statement
> etc...

But why can't the current procedure to resolve method calls be changed
to automatically define a 'self' variable in the scope of the called
function, instead of binding its first argument?


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

--
http://mail.python.org/mailman/listinfo/python-list

Re: Attack a sacred Python Cow

2008-07-28 Thread Nikolaus Rath
Michael Torrie <[EMAIL PROTECTED]> writes:
> Colin J. Williams wrote:
>>>
>>> def fun( ., cat):
>>>
>> I don't see the need for the comma in fun.
>
> It (the entire first variable!) is needed because a method object is
> constructed from a normal function object:
>
> def method(self,a,b):
>   pass
>
> class MyClass(object):
>   pass
>
> MyClass.testmethod=method
>
> That's precisely the same as if you'd defined method inside of the class
> to begin with.  A function becomes a method when the lookup procedure in
> the instance object looks up the attribute and returns (from what I
> understand) essentially a closure that binds the instance to the first
> variable of the function.  The result is known as a bound method, which
> is a callable object:
>
> >>> instance = MyClass()
> >>> instance.testmethod
> <bound method MyClass.method of <__main__.MyClass object at 0x...>>
>
>
> How would this work if there was no first parameter at all?
>
> In short, unlike what most of the implicit self advocates are
> saying, it's not just a simple change to the python parser to do
> this. It would require a change in the interpreter itself and how it
> deals with classes.


That's true. But out of curiosity: why is changing the interpreter such
a bad thing? (If we suppose for now that the change itself is a good
idea).


Best,


   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

--
http://mail.python.org/mailman/listinfo/python-list

Re: Attack a sacred Python Cow

2008-07-28 Thread Nikolaus Rath
castironpi <[EMAIL PROTECTED]> writes:
>> I think you misunderstood him. What he wants is to write
>>
>> class foo:
>>    def bar(arg):
>>        self.whatever = arg + 1
>>
>> instead of
>>
>> class foo:
>>    def bar(self, arg)
>>        self.whatever = arg + 1
>>
>> so 'self' should *automatically* only be inserted in the function
>> declaration, and *manually* be typed for attributes.
>>
>
> There's a further advantage:
>
> class A:
>     def get_auxclass(self, b, c):
>         class B:
>             def auxmeth(self2, d, e):
>                 # here, ...
>         return B

In auxmeth, self would refer to the B instance. In get_auxclass, it
would refer to the A instance. If you wanted to access the A instance
in auxmeth, you'd have to use

class A:
    def get_auxclass(b, c):
        a_inst = self
        class B:
            def auxmeth(d, e):
                self    # the B instance
                a_inst  # the A instance
        return B


This seems pretty natural to me (innermost scope takes precedence),
and AFAIR this is also how it is done in Java.
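
For comparison, the same nesting written with today's explicit self; the
outer instance is simply captured by a closure, no renaming tricks needed:

class A(object):
    def get_auxclass(self, b, c):
        a_inst = self                  # capture the A instance
        class B(object):
            def auxmeth(self, d, e):
                self                   # the B instance
                a_inst                 # the A instance, via the closure
        return B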


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

--
http://mail.python.org/mailman/listinfo/python-list

Re: Attack a sacred Python Cow

2008-07-27 Thread Nikolaus Rath
Terry Reedy <[EMAIL PROTECTED]> writes:
>> What he wants is to write
>>
>
>  > class foo:
>>def bar(arg):
>>self.whatever = arg + 1
>>
>> instead of
>>
>> class foo:
>>def bar(self, arg)
>>self.whatever = arg + 1
>>
>> so 'self' should *automatically* only be inserted in the function
>> declaration, and *manually* be typed for attributes.
>
> which means making 'self' a keyword just so it can be omitted. Silly
> and pernicious.

Well, I guess that's more a matter of personal preference. I would go
for it immediately (and also try to rename it to '@' at the same time).


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

--
http://mail.python.org/mailman/listinfo/python-list

Re: Attack a sacred Python Cow

2008-07-26 Thread Nikolaus Rath
Terry Reedy <[EMAIL PROTECTED]> writes:
> Nikolaus Rath wrote:
>> Terry Reedy <[EMAIL PROTECTED]> writes:
>>> Torsten Bronger wrote:
>>>> Hallöchen!
>>>  > And why does this make the implicit insertion of "self" difficult?
>>>> I could easily write a preprocessor which does it after all.
>>> class C():
>>>   def f():
>>> a = 3
>>>
>>> Inserting self into the arg list is trivial.  Mindlessly deciding
>>> correctly whether or not to insert 'self.' before 'a' is impossible
>>> when 'a' could ambiguously be either an attribute of self or a local
>>> variable of f.  Or do you and/or Jordan plan to abolish local
>>> variables for methods?
>>
>> Why do you think that 'self' should be inserted anywhere except in the
>> arg list? AFAIU, the idea is to remove the need to write 'self' in the
>> arg list, not to get rid of it entirely.
>
> Because you must prefix self attributes with 'self.'. If you do not
> use any attributes of the instance of the class you are making the
> function an instance method of, then it is not really an instance
> method and need not and I would say should not be masqueraded as
> one. If the function is a static method, then it should be labeled
> as one and no 'self' is not needed and auto insertion would be a
> mistake. In brief, I assume the OP wants 'self' inserted in the body
> because inserting it only in the parameter list and never using it
> in the body is either silly or wrong.


I think you misunderstood him. What he wants is to write


class foo:
    def bar(arg):
        self.whatever = arg + 1


instead of

class foo:
    def bar(self, arg):
        self.whatever = arg + 1


so 'self' should *automatically* only be inserted in the function
declaration, and *manually* be typed for attributes.

Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

--
http://mail.python.org/mailman/listinfo/python-list

Re: Attack a sacred Python Cow

2008-07-25 Thread Nikolaus Rath
Terry Reedy <[EMAIL PROTECTED]> writes:
> Torsten Bronger wrote:
>> Hallöchen!
>  > And why does this make the implicit insertion of "self" difficult?
>> I could easily write a preprocessor which does it after all.
>
> class C():
>   def f():
> a = 3
>
> Inserting self into the arg list is trivial.  Mindlessly deciding
> correctly whether or not to insert 'self.' before 'a' is impossible
> when 'a' could ambiguously be either an attribute of self or a local
> variable of f.  Or do you and/or Jordan plan to abolish local
> variables for methods?

Why do you think that 'self' should be inserted anywhere except in the
arg list? AFAIU, the idea is to remove the need to write 'self' in the
arg list, not to get rid of it entirely.


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

--
http://mail.python.org/mailman/listinfo/python-list

Protecting instance variables

2008-07-18 Thread Nikolaus Rath
Hello,

I am really surprised that I am asking this question on the mailing
list, but I really couldn't find it on python.org/doc.

Why is there no proper way to protect an instance variable from access
in derived classes?

I can perfectly understand the philosophy behind not protecting them
from access in external code ("protection by convention"), but isn't
it a major design flaw that when designing a derived class I first
have to study the base classes source code? Otherwise I may always
accidentally overwrite an instance variable used by the base class...


Best,

   -Nikolaus

-- 
 »It is not worth an intelligent man's time to be in the majority.
  By definition, there are already enough people to do that.«
 -J.H. Hardy

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C

--
http://mail.python.org/mailman/listinfo/python-list