Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
On Wed, Sep 21, 2011 at 07:41:50AM +0200, Martin v. Loewis wrote:
>> Is it just that nobody's implemented it, or is there a good reason for avoiding offering this sort of thing?
>
> I've considered implementing thread killing several times over the last 15 years (I think about it once every year), and every time I give up because it's too complex and just not implementable.
>
> To start with, a simple flag in the thread won't do any good.

I don't agree. If you had written that it wouldn't solve all problems, I could understand that. But I have been in circumstances where a simple flag in the thread implementation would have been helpful.

> It will not cancel blocking system calls, so people will complain that the threads they meant to cancel continue to run forever. Instead, you have to use some facility to interrupt blocking system calls. You then have to convince callers of those blocking system calls not to retry when they see that the first attempt to call it was interrupted. And so on.

But this is no longer an implementation problem but a usage problem. If someone gets an IOError for writing on a closed pipe, catches the exception and retries the write in a loop, then this is a problem of the author of that loop, not of exceptions.

So if one thread throws an exception to another thread, for instance to indicate a timeout for the latter, and the latter catches that exception and retries what it was doing in a loop, that is entirely the problem of the author of that loop and not of the ability of one thread to throw an exception in another.

Unless, of course, there are a lot of such problematic loops within the internal Python code.

-- Antoon Pardon
-- http://mail.python.org/mailman/listinfo/python-list
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
> Is it just that nobody's implemented it, or is there a good reason for avoiding offering this sort of thing?

I've considered implementing thread killing several times over the last 15 years (I think about it once every year), and every time I give up because it's too complex and just not implementable.

To start with, a simple flag in the thread won't do any good. It will not cancel blocking system calls, so people will complain that the threads they meant to cancel continue to run forever.

Instead, you have to use some facility to interrupt blocking system calls. You then have to convince callers of those blocking system calls not to retry when they see that the first attempt to call it was interrupted. And so on.

Regards,
Martin
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
Ian Kelly wrote:
> And what if the thread gets killed a second time while it's in the except block?
>
> And what if the thread gets killed in the middle of the commit?

For these kinds of reasons, any feature for raising asynchronous exceptions in another thread would need to come with some related facilities:

* A way of blocking asynchronous exceptions around a critical section would be needed.

* Once an asynchronous exception has been raised, further asynchronous exceptions should be blocked until explicitly re-enabled.

* Asynchronous exceptions should probably be disabled initially in a new thread until it explicitly enables them.

Some care would still be required to write code that is robust in the presence of asynchronous exceptions, but given these facilities, it ought to be possible.

-- Greg
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
The point of Java's Thread.stop() being deprecated seems to have very little to do with undeclared exceptions being raised and a lot to do with objects being left in a potentially damaged state.

As Ian said, it's a lot more complex than just adding try/catches. Killing a thread in the middle of some non-atomic operation with side effects that propagate beyond the thread is a recipe for trouble.

In fact, while a lot can be written about Java being a poor language, the specific article linked to about why Java deprecated Thread.stop() gives a pretty damn good explanation of why Thread.stop() and the like are a bad idea and what a better idea might be (signalling that a graceful halt should be attempted).
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
On Mon, Sep 19, 2011 at 3:41 PM, Ian Kelly ian.g.ke...@gmail.com wrote:
> And what if the thread gets killed in the middle of the commit?

Database managers solved this problem years ago. It's not done by preventing death until you're done - death can come from someone brutally pulling out your power cord. There's no "except PowerCordRemoved" to protect you from that!

There are various ways, and I'm sure one of them will work for whatever situation is needed.

ChrisA
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
On Sun, Sep 18, 2011 at 07:35:01AM +1000, Chris Angelico wrote:
> On Sun, Sep 18, 2011 at 5:00 AM, Nobody nob...@nowhere.com wrote:
>
> Forking a thread to discuss threads ahem.
>
> Why is it that threads can't be killed? Do Python threads correspond to OS-provided threads (eg POSIX threads on Linux)? Every OS threading library I've seen has some way of killing threads, although I've not looked in detail into POSIX threads there (there seem to be two options, pthread_kill and pthread_cancel, that could be used, but I've not used either). If nothing else, it ought to be possible to implement a high-level kill simply by setting a flag that the interpreter will inspect every few commands, the same way that KeyboardInterrupt is checked for.
>
> Is it just that nobody's implemented it, or is there a good reason for avoiding offering this sort of thing?

Python has a half-baked solution to this. If you go to

    http://docs.python.org/release/3.2.2/c-api/init.html

you will find the following:

    int PyThreadState_SetAsyncExc(long id, PyObject *exc)

    Asynchronously raise an exception in a thread. The id argument is the thread id of the target thread; exc is the exception object to be raised. This function does not steal any references to exc. To prevent naive misuse, you must write your own C extension to call this. Must be called with the GIL held. Returns the number of thread states modified; this is normally one, but will be zero if the thread id isn't found. If exc is NULL, the pending exception (if any) for the thread is cleared. This raises no exceptions.

Some recipes can be found at:

    http://www.google.com/search?ie=UTF-8&oe=utf-8&q=python+recipe+PyThreadState_SetAsyncExc

However, this doesn't work 100% correctly. Last time I tried using this, it didn't work with an exception instance but only with an exception class as parameter. There was a discussion about this at

    http://mail.python.org/pipermail/python-dev/2006-August/068158.html

I don't know how it was finally resolved.

-- Antoon Pardon
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
On Sun, 18 Sep 2011 23:41:29 -0600, Ian Kelly wrote:
>> If the transaction object doesn't get its commit() called, it does no actions at all, thus eliminating all issues of locks.
>
> And what if the thread gets killed in the middle of the commit?

The essence of a commit is that it involves an atomic operation, for which there is no "middle".
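To make the "no middle" point concrete, here is a minimal sketch for state kept in a file. It is an illustration under stated assumptions, not something proposed in the thread: the function name commit_atomically and the JSON format are made up for the example. It relies on os.replace(), which the Python documentation describes as an atomic rename on POSIX systems, so a reader sees either the old file or the new one.

```python
import json
import os
import tempfile


def commit_atomically(path, data):
    """Write data to path so that readers see the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Stage the new contents in a temporary file on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # The commit point: a single rename swaps the staged file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # Nothing was committed; discard the staged file if it still exists.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the writer dies before os.replace(), the old file is untouched and at worst a stray temporary file is left behind. This covers only a single file; coordinating several resources is the harder problem Ian is pointing at.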
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
On Mon, Sep 19, 2011 at 12:25 AM, Chris Angelico ros...@gmail.com wrote:
> On Mon, Sep 19, 2011 at 3:41 PM, Ian Kelly ian.g.ke...@gmail.com wrote:
>> And what if the thread gets killed in the middle of the commit?
>
> Database managers solved this problem years ago. It's not done by preventing death until you're done - death can come from someone brutally pulling out your power cord. There's no "except PowerCordRemoved" to protect you from that!

I'm aware of that. I'm not saying it's impossible, just that the example you gave is over-simplified, as writing atomic transactional logic is a rather complex topic. There may be an existing Python library to handle this, but I'm not aware of one.

"PowerCordRemoved" is not relevant here, as that would kill the entire process, which renders the issue of broken shared data within a continuing process rather moot.

Cheers,
Ian
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
On Tue, Sep 20, 2011 at 8:04 AM, Ian Kelly ian.g.ke...@gmail.com wrote:
> "PowerCordRemoved" is not relevant here, as that would kill the entire process, which renders the issue of broken shared data within a continuing process rather moot.

Assuming that the broken shared data exists only in RAM on one single machine, and has no impact on the state of anything on the hard disk or on any other computer, yes.

ChrisA
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
Antoon Pardon wrote:
> int PyThreadState_SetAsyncExc(long id, PyObject *exc)
>
> To prevent naive misuse, you must write your own C extension to call this.

Not if we use ctypes! Muahahahaaa!

-- Greg
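For reference, a minimal sketch of the ctypes route on CPython. The names StopRequested, async_raise and worker are invented for this illustration. An exception class is passed rather than an instance, in line with Antoon's observation that only a class worked for him. The exception is only delivered once the target thread is executing Python bytecode again, so a thread stuck in a long blocking call will not notice it until that call returns.

```python
import ctypes
import threading
import time


class StopRequested(Exception):
    """Hypothetical exception type used only for this illustration."""


def async_raise(thread, exc_type):
    """Ask CPython to raise exc_type in the given running thread."""
    modified = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_ulong(thread.ident), ctypes.py_object(exc_type)
    )
    if modified == 0:
        raise ValueError("thread id not found")
    return modified


def worker(started, result):
    try:
        started.set()
        while True:
            time.sleep(0.01)  # short sleeps, so bytecode runs frequently
    except StopRequested:
        result.append("stopped")


started = threading.Event()
result = []
t = threading.Thread(target=worker, args=(started, result))
t.start()
started.wait()          # make sure the worker is inside its try block
async_raise(t, StopRequested)
t.join(timeout=5)
```

None of the safeguards discussed elsewhere in this thread (masking around critical sections, protection against a second exception) exist here, so this is a demonstration of the mechanism, not a recommendation.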
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
On Sat, Sep 17, 2011 at 5:38 PM, Chris Angelico ros...@gmail.com wrote:
> But if it's done as an exception, all you need is to catch that exception and reraise it:
>
>     def threadWork(lock, a1, a2, rate):
>         try:
>             while True:
>                 time.sleep(rate)
>                 lock.acquire()
>                 t = a2.balance / 2
>                 a1.balance += t
>                 # say a thread.kill kills at this point
>                 a2.balance -= t
>                 lock.release()
>         except:
>             # roll back the transaction in some way
>             lock.release()
>             raise

And what if the thread gets killed a second time while it's in the except block?

> It'd require some care in coding, but it could be done. And if the lock/transaction object can be coded for it, it could even be done automatically:
>
>     def threadWork(lock, a1, a2, rate):
>         while True:
>             time.sleep(rate)
>             transaction.begin()
>             t = a2.balance / 2
>             transaction.apply(a1.balance, t)
>             # say a thread.kill kills at this point
>             transaction.apply(a2.balance, -t)
>             transaction.commit()
>
> If the transaction object doesn't get its commit() called, it does no actions at all, thus eliminating all issues of locks.

And what if the thread gets killed in the middle of the commit?

Getting the code right is going to be a lot more complicated than just adding a couple of try/excepts.

Cheers,
Ian
Killing threads (was Re: Cancel or timeout a long running regular expression)
On Sun, Sep 18, 2011 at 5:00 AM, Nobody nob...@nowhere.com wrote:
> The only robust solution is to use a separate process (threads won't suffice, as they don't have a .kill() method).

Forking a thread to discuss threads ahem.

Why is it that threads can't be killed? Do Python threads correspond to OS-provided threads (eg POSIX threads on Linux)? Every OS threading library I've seen has some way of killing threads, although I've not looked in detail into POSIX threads there (there seem to be two options, pthread_kill and pthread_cancel, that could be used, but I've not used either). If nothing else, it ought to be possible to implement a high-level kill simply by setting a flag that the interpreter will inspect every few commands, the same way that KeyboardInterrupt is checked for.

Is it just that nobody's implemented it, or is there a good reason for avoiding offering this sort of thing?

Chris Angelico
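As a point of comparison, the separate-process approach Nobody mentions can be sketched with the standard subprocess module. The helper name run_with_timeout and the child code strings are made up for the example. It relies on the documented behaviour of subprocess.run(): when the timeout expires, the child is killed and waited for before TimeoutExpired is raised.

```python
import subprocess
import sys


def run_with_timeout(code, timeout):
    """Run code in a child interpreter; return its stdout, or None on timeout."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            timeout=timeout,
            capture_output=True,
            text=True,
        )
    except subprocess.TimeoutExpired:
        # By the time this is raised, run() has already killed the child.
        return None
    return completed.stdout


# A busy loop stands in for a runaway computation such as a slow regex.
runaway = run_with_timeout("while True: pass", timeout=0.5)
quick = run_with_timeout("print(6 * 7)", timeout=30)
```

Because the child shares no memory with the parent, killing it cannot leave the parent's data structures half-updated, which is the property the rest of the thread finds so hard to get with threads.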
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
On Sat, Sep 17, 2011 at 2:35 PM, Chris Angelico ros...@gmail.com wrote:
> On Sun, Sep 18, 2011 at 5:00 AM, Nobody nob...@nowhere.com wrote:
>> The only robust solution is to use a separate process (threads won't suffice, as they don't have a .kill() method).
>
> Forking a thread to discuss threads ahem.
>
> Why is it that threads can't be killed? Do Python threads correspond to OS-provided threads (eg POSIX threads on Linux)? Every OS threading library I've seen has some way of killing threads, although I've not looked in detail into POSIX threads there (there seem to be two options, pthread_kill and pthread_cancel, that could be used, but I've not used either). If nothing else, it ought to be possible to implement a high-level kill simply by setting a flag that the interpreter will inspect every few commands, the same way that KeyboardInterrupt is checked for.
>
> Is it just that nobody's implemented it, or is there a good reason for avoiding offering this sort of thing?

It's possible that the reason is analogous to why Java has deprecated its equivalent, Thread.stop():
http://download.oracle.com/javase/1.4.2/docs/guide/misc/threadPrimitiveDeprecation.html

Cheers,
Chris
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
On Sun, Sep 18, 2011 at 8:27 AM, Chris Rebert c...@rebertia.com wrote:
> It's possible that the reason is analogous to why Java has deprecated its equivalent, Thread.stop():
> http://download.oracle.com/javase/1.4.2/docs/guide/misc/threadPrimitiveDeprecation.html

Interesting. The main argument against having a way to raise an arbitrary exception in a different thread is that it gets around Java's requirement to declare all exceptions that a routine might throw - a requirement that Python doesn't have. So does that mean it'd be reasonable to have a way to trigger a "TerminateThread" exception (like SystemExit but for one thread) remotely? The above article recommends polling a variable, but that's the exact sort of thing that exceptions are meant to save you from doing.

ChrisA
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
On 17Sep2011 15:27, Chris Rebert c...@rebertia.com wrote:
| On Sat, Sep 17, 2011 at 2:35 PM, Chris Angelico ros...@gmail.com wrote:
| > On Sun, Sep 18, 2011 at 5:00 AM, Nobody nob...@nowhere.com wrote:
| >> The only robust solution is to use a separate process (threads won't
| >> suffice, as they don't have a .kill() method).
| >
| > Forking a thread to discuss threads ahem.
| >
| > Why is it that threads can't be killed? Do Python threads correspond
| > to OS-provided threads (eg POSIX threads on Linux)? Every OS threading
| > library I've seen has some way of killing threads, although I've not
| > looked in detail into POSIX threads there (there seem to be two
| > options, pthread_kill and pthread_cancel, that could be used, but I've
| > not used either). If nothing else, it ought to be possible to
| > implement a high level kill simply by setting a flag that the
| > interpreter will inspect every few commands, the same way that
| > KeyboardInterrupt is checked for.
| >
| > Is it just that nobody's implemented it, or is there a good reason for
| > avoiding offering this sort of thing?
|
| It's possible that the reason is analogous to why Java has deprecated
| its equivalent, Thread.stop():
| http://download.oracle.com/javase/1.4.2/docs/guide/misc/threadPrimitiveDeprecation.html

Interesting. A lot of that discussion concerns exceptions that the Java app is unprepared for. Java's strong typing includes the throwable exceptions, so that's a quite legitimate concern. The aborting mutex regions thing is also very real. Conversely, Python can have unexpected exceptions anywhere, anytime because it is not strongly typed in this way. That doesn't make it magically robust against this, but does mean this is _already_ an issue in Python programs, threaded or otherwise.

Context managers can help a lot here, in that they offer a reliable exception handler in a less ad hoc fashion than try/except because it is tied to the locking object; but they won't magically step in and save your basic:

    with my_lock:
        stuff...

Personally I'm of the view that thread stopping should be part of the overt program logic, not a low level facility (such as causing a ThreadDeath exception asynchronously). The latter has all the troubles in the cited URL.

Doing it overtly would go like this:

    ... outside ...
    that_thread.stop()    # sets the "stopping" flag on the thread object
    that_thread.join()    # and now maybe we wait for it...

    ... thread code ...
    ... do stuff, eg:
    with my_lock:
        muck about ...
    if thread.stopping:
        abort now, _outside_ the mutex
    ...

This avoids the issue of aborting in the middle of supposedly mutex-safe code. It still requires scattering checks on thread.stopping through library code such as the OP's rogue regexp evaluator.

Cheers,
--
Cameron Simpson c...@zip.com.au DoD#743
http://www.cskk.ezoshosting.com/cs/

One measure of `programming complexity' is the number of mental objects you have to keep in mind simultaneously in order to understand a program. The mental juggling act is one of the most difficult aspects of programming and is the reason programming requires more concentration than other activities. It is the reason programmers get upset about `quick interruptions' -- such interruptions are tantamount to asking a juggler to keep three balls in the air and hold your groceries at the same time.
- Steve McConnell, _Code Complete_
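The overt scheme outlined above can be written as runnable Python roughly as follows. This is one possible rendering, assuming a threading.Event as the "stopping" flag; the class name StoppableThread, the stop() method and the counter dictionary are illustrative choices, not an existing standard-library API.

```python
import threading
import time


class StoppableThread(threading.Thread):
    """Hypothetical Thread subclass with a cooperative stop flag."""

    def __init__(self, lock, counter):
        super().__init__()
        self._stop_requested = threading.Event()
        self.lock = lock
        self.counter = counter

    def stop(self):
        # Only sets the flag; the thread decides when it is safe to exit.
        self._stop_requested.set()

    def run(self):
        while not self._stop_requested.is_set():
            with self.lock:
                # The critical section always runs to completion.
                self.counter["ticks"] += 1
            # The flag is checked outside the mutex; wait() doubles as an
            # interruptible sleep that returns early once stop() is called.
            self._stop_requested.wait(0.01)


lock = threading.Lock()
counter = {"ticks": 0}
that_thread = StoppableThread(lock, counter)
that_thread.start()
time.sleep(0.05)
that_thread.stop()   # sets the stopping flag
that_thread.join()   # and now we wait for it
```

The limitation noted in the message still applies: code that never checks the flag, such as a long regular expression match, cannot be stopped this way.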
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
On 9/17/2011 7:19 PM, Chris Angelico wrote:
> On Sun, Sep 18, 2011 at 8:27 AM, Chris Rebert c...@rebertia.com wrote:
>> It's possible that the reason is analogous to why Java has deprecated its equivalent, Thread.stop():
>> http://download.oracle.com/javase/1.4.2/docs/guide/misc/threadPrimitiveDeprecation.html
>
> Interesting. The main argument against having a way to raise an arbitrary exception in a different thread is that it gets around Java's requirement to declare all exceptions that a routine might throw - a requirement that Python doesn't have.

I saw the main argument as being that stopping a thread at an arbitrary point can have an arbitrary, unpredictable effect on all other threads. And more so than shutting down an independent process.

-- Terry Jan Reedy
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
On Sun, Sep 18, 2011 at 9:26 AM, Dennis Lee Bieber wlfr...@ix.netcom.com wrote:
>     def threadWork(lock, a1, a2, rate):
>         while True:
>             time.sleep(rate)
>             lock.acquire()
>             t = a2.balance / 2
>             a1.balance += t
>             # say a thread.kill kills at this point
>             a2.balance -= t
>             lock.release()

It's obviously going to be an issue with killing processes too, which is why database engines have so much code specifically to protect against this. But if it's done as an exception, all you need is to catch that exception and reraise it:

    def threadWork(lock, a1, a2, rate):
        try:
            while True:
                time.sleep(rate)
                lock.acquire()
                t = a2.balance / 2
                a1.balance += t
                # say a thread.kill kills at this point
                a2.balance -= t
                lock.release()
        except:
            # roll back the transaction in some way
            lock.release()
            raise

It'd require some care in coding, but it could be done. And if the lock/transaction object can be coded for it, it could even be done automatically:

    def threadWork(lock, a1, a2, rate):
        while True:
            time.sleep(rate)
            transaction.begin()
            t = a2.balance / 2
            transaction.apply(a1.balance, t)
            # say a thread.kill kills at this point
            transaction.apply(a2.balance, -t)
            transaction.commit()

If the transaction object doesn't get its commit() called, it does no actions at all, thus eliminating all issues of locks.

Obviously there won't be any problem with the Python interpreter itself (refcounts etc) if the kill is done by exception - that would be a potential risk if using OS-level kills.

ChrisA
Re: Killing threads (was Re: Cancel or timeout a long running regular expression)
On 18/09/2011 00:26, Dennis Lee Bieber wrote:
> On Sun, 18 Sep 2011 07:35:01 +1000, Chris Angelico ros...@gmail.com declaimed the following in gmane.comp.python.general:
>
>> Is it just that nobody's implemented it, or is there a good reason for avoiding offering this sort of thing?
>
> Any asynchronous kill runs the risk of leaving shared data structures in a corrupt state. {Stupid example, but, in pseudo-Python:
>
>     import threading
>     import time
>
>     class Account(object):
>         def __init__(self, initial=0.0):
>             self.balance = initial
>
>     myAccount = Account(100.0)
>     yourAccount = Account(100.0)
>
>     accountLock = threading.Lock()
>
>     def threadWork(lock, a1, a2, rate):
>         while True:
>             time.sleep(rate)
>             lock.acquire()
>             t = a2.balance / 2
>             a1.balance += t
>             # say a thread.kill kills at this point
>             a2.balance -= t
>             lock.release()
>
>     # create/start thread1 passing (accountLock, myAccount, yourAccount, 60)
>     # create/start thread2 passing (accountLock, yourAccount, myAccount, 120)
>
>     time.sleep(300)
>     thread1.kill()
>
> So... Thread1 may be killed after one account gets incremented but before the other is decremented... And what happens to the lock? If it doesn't get released as part of the .kill() processing, the program is deadlocked (and the magically appearing money will never be seen). If it does get released, then the sum total of money in the system will have increased.

[snip]

The lock won't be released if an exception is raised, for example, if 'a1' isn't an Account instance and has no 'balance' attribute. Using a context manager would help in that case.
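A minimal sketch of the context-manager version of that critical section, reusing the Account class from the quoted example; the function name transfer_half is made up here, and the sleep loop is left out to keep the example short. threading.Lock supports the with statement, which releases the lock on normal exit and when an exception propagates.

```python
import threading


class Account:
    def __init__(self, initial=0.0):
        self.balance = initial


def transfer_half(lock, a1, a2):
    """Move half of a2's balance to a1 while holding lock."""
    with lock:  # released on normal exit and on any exception
        t = a2.balance / 2
        a1.balance += t
        a2.balance -= t


lock = threading.Lock()
mine, yours = Account(100.0), Account(100.0)
transfer_half(lock, mine, yours)

try:
    # object() has no 'balance' attribute, so this raises AttributeError.
    transfer_half(lock, object(), yours)
except AttributeError:
    pass
# The lock is free again even though the second call failed.
```

This only guarantees that the lock is released. It does not undo a half-finished transfer: an exception arriving between the two balance updates would still leave the totals inconsistent, which is the rollback problem discussed in the rest of the thread.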