Re: [Rd] Interrupting C++ code execution
Peter,

On 25/04/11 10:22, schattenpfla...@arcor.de wrote:

1. Calling R_CheckUserInterrupt() interrupts immediately, so I have no possibility to exit my code gracefully. In particular, I suppose that objects created on the heap (e.g., STL containers) are not destructed properly.

Sorry not to have seen this thread sooner. You may like to give CXXR a try (http://www.cs.kent.ac.uk/projects/cxxr/). In CXXR the R interpreter is written in C++, and a user interrupt is handled by throwing a C++ exception, so the stack is unwound in an orderly fashion, destructors are invoked, etc.

However, it's fair to say that in using CXXR with a multi-threaded program you'll be on the bleeding edge...

Andrew

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Interrupting C++ code execution
Andrew,

You may like to give CXXR a try (http://www.cs.kent.ac.uk/projects/cxxr/). In CXXR the R interpreter is written in C++, and a user interrupt is handled by throwing a C++ exception, so the stack is unwound in an orderly fashion, destructors are invoked, etc.

Thank you for this suggestion. CXXR is a very interesting project! For my current project, however, I aim at distributing the program to other R users on pre-installed cluster nodes. Thus, I have no choice with respect to the underlying R interpreter.

Best regards,
Peter
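The orderly unwinding that Andrew describes can be illustrated in plain C++, independent of R or CXXR. The following is only a minimal sketch: `UserInterrupt`, `Tracked`, and the function names are hypothetical and exist purely for illustration. It shows that when an interrupt is delivered as a C++ exception, local objects in every frame between the throw and the catch are destroyed, innermost first.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical exception type standing in for "the user pressed ctrl+c".
struct UserInterrupt : std::runtime_error {
    UserInterrupt() : std::runtime_error("user interrupt") {}
};

// Appends its name to a log when destroyed, so unwinding becomes observable.
class Tracked {
public:
    Tracked(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {}
    ~Tracked() { log_.push_back(name_); }
    Tracked(const Tracked&) = delete;
    Tracked& operator=(const Tracked&) = delete;

private:
    std::string name_;
    std::vector<std::string>& log_;
};

void innerWork(std::vector<std::string>& log) {
    Tracked local("inner", log);
    throw UserInterrupt();  // simulated interrupt deep in the call stack
}

void outerWork(std::vector<std::string>& log) {
    Tracked local("outer", log);
    innerWork(log);
}

// Runs the nested calls and returns the order in which locals were destroyed.
std::vector<std::string> destructionOrder() {
    std::vector<std::string> log;
    log.reserve(2);  // avoid allocating inside the destructors
    try {
        outerWork(log);
    } catch (const UserInterrupt&) {
        // Both Tracked objects have already been destroyed at this point.
    }
    return log;
}
```

A longjmp-based interrupt, as used by the standard R interpreter, gives no such guarantee for the frames it skips, which is the concern raised in the original question.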
Re: [Rd] Interrupting C++ code execution
I have tested the solutions suggested by Simon and Thomas on a Linux machine. These are my findings:

On Windows you can look at the variable UserBreak, available from Rembedded.h. Outside of Windows, you can look at R_interrupts_pending, available from R_ext/GraphicsDevice.h. R_ext/GraphicsDevice.h also has R_interrupts_suspended, which you may or may not want to take into account, depending on your use-case.

I did not manage to get this to work. Neither R_interrupts_pending nor R_interrupts_suspended seems to change when I press ctrl+c. Perhaps this is due to the fact that I run R in a terminal without any graphical interface?

    static void chkIntFn(void *dummy) {
      R_CheckUserInterrupt();
    }

    // this will call the above in a top-level context so it won't longjmp-out of your context
    bool checkInterrupt() {
      return (R_ToplevelExec(chkIntFn, NULL) == FALSE);
    }

    // your code somewhere ...
    if (checkInterrupt()) {
      // user interrupted ...
    }

This solution works perfectly! It takes slightly longer to call this function than the plain R_CheckUserInterrupt() call, but in any reasonable scenario, the additional time is absolutely insignificant.

Inside OpenMP parallel for constructs, one has to make sure that only the thread satisfying omp_get_thread_num()==0 makes the call (the 'master' construct cannot be nested inside a loop). I can then set a flag, which is queried by every thread in every loop cycle, causing fast termination of the parallel loop. After the loop, I throw an exception. Thus, my code is terminated gracefully with minimal effort. I can do additional cleanup operations (which usually is not necessary, since I use smart pointers), and report details on the interrupt to the user.

With my limited testing, so far I have not noticed any downsides. Of course, there is the obvious drawback of not being supported officially (and thus maybe being subject to change), the question of portability, and the question of interoperability with other errors.

Moreover, I have found an old thread discussing almost the same topic: http://tolstoy.newcastle.edu.au/R/e4/devel/08/05/1686.html . The thread was created in 2008, so the issue is not really a new one. The solution proposed there is actually the same as the one suggested by Simon, namely using R_ToplevelExec().

An officially supported, portable solution would of course be much appreciated!

Best regards,
Peter
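The OpenMP pattern Peter describes (thread 0 polls, every thread checks a shared flag, an exception is thrown after the loop) might look roughly like the sketch below. It is not his actual code. The `poll` parameter is a hypothetical stand-in for checkInterrupt() so that the example compiles without the R headers, and `sumOfSquares`, `Interrupted`, and `currentThread` are illustrative names. The OpenMP parts are guarded so the code also builds serially.

```cpp
#include <atomic>
#include <stdexcept>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical stand-in for checkInterrupt(); must only be called on thread 0.
using InterruptPoll = bool (*)();

// Thrown after the parallel loop has wound down.
struct Interrupted : std::runtime_error {
    Interrupted() : std::runtime_error("computation interrupted by user") {}
};

inline int currentThread() {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;  // serial build: the only thread is the main thread
#endif
}

// Sums x[i]^2, giving up early if poll() reports an interrupt.
double sumOfSquares(const std::vector<double>& x, InterruptPoll poll) {
    std::atomic<bool> stop(false);
    double total = 0.0;
    const long n = static_cast<long>(x.size());

#ifdef _OPENMP
#pragma omp parallel for reduction(+ : total)
#endif
    for (long i = 0; i < n; ++i) {
        // A parallel for cannot be left with break, so skip remaining work.
        if (stop.load(std::memory_order_relaxed)) continue;
        // Only thread 0 (the thread that entered the region) polls.
        if (currentThread() == 0 && poll()) {
            stop.store(true, std::memory_order_relaxed);
            continue;
        }
        total += x[i] * x[i];
    }

    // Outside the parallel region it is safe to throw; locals unwind normally.
    if (stop.load()) throw Interrupted();
    return total;
}
```

Polling on every iteration is done here only to keep the sketch short. Since the check may run the event loop, a real loop would probably poll far less often, for example once per second as suggested elsewhere in this thread.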
Re: [Rd] Interrupting C++ code execution
On Apr 26, 2011, at 7:30 AM, schattenpfla...@arcor.de wrote:

I have tested the solutions suggested by Simon and Thomas on a Linux machine. These are my findings:

On Windows you can look at the variable UserBreak, available from Rembedded.h. Outside of Windows, you can look at R_interrupts_pending, available from R_ext/GraphicsDevice.h. R_ext/GraphicsDevice.h also has R_interrupts_suspended, which you may or may not want to take into account, depending on your use-case.

I did not manage to get this to work. Neither R_interrupts_pending nor R_interrupts_suspended seem to change when I press ctrl+c. Perhaps this is due to the fact that I run R in a terminal without any graphical interface?

Thomas' suggestion was not aimed at your problem - it was sort of the inverse (more at your Qt question). If you want to interrupt R you can mess with those flags and then let R run the event loop. It doesn't work in your (original) case.

    static void chkIntFn(void *dummy) {
      R_CheckUserInterrupt();
    }

    // this will call the above in a top-level context so it won't longjmp-out of your context
    bool checkInterrupt() {
      return (R_ToplevelExec(chkIntFn, NULL) == FALSE);
    }

    // your code somewhere ...
    if (checkInterrupt()) {
      // user interrupted ...
    }

This solution works perfectly! It takes slightly longer to call this function than the plain R_CheckUserInterrupt() call, but in any reasonable scenario, the additional time is absolutely insignificant.

Inside OpenMP parallel for constructs, one has to make sure that only the thread satisfying omp_get_thread_num()==0 makes the call (the 'master' construct cannot be nested inside a loop). I can then set a flag, which is queried by every thread in every loop cycle, causing fast termination of the parallel loop. After the loop, I throw an exception. Thus, my code is terminated gracefully with minimal effort. I can do additional cleanup operations (which usually is not necessary, since I use smart pointers), and report details on the interrupt to the user.

With my limited testing, so far I have not noticed any downsides. Of course, there is the obvious drawback of not being supported officially (and thus maybe being subject to change),

Actually, it is in the official API (Rinternals.h) so I don't think that is the issue.

the question of portability, and the question of interoperability with other errors.

It is portable as well, so I'd say the main concern is what happens when events trigger something that is not related to you and you eat those errors. They will act as user-interrupt to you even if it's not what the user intended. One could argue that it's the lesser of the evils, because if you don't do anything R will just block so those events would have to wait until you're done anyway.

Moreover, I have found an old thread discussing almost the same topic: http://tolstoy.newcastle.edu.au/R/e4/devel/08/05/1686.html . The thread was created in 2008, so the issue is not really a new one. The solution proposed there is actually the same as the one suggested by Simon, namely using R_ToplevelExec().

Interesting - I'm glad Luke also suggested C-level onexit back then - it is something I was thinking about before ..

Cheers,
Simon

An officially supported, portable solution would of course be much appreciated!

Best regards,
Peter
Re: [Rd] Interrupting C++ code execution
Hi,

I've been thinking about how to handle c++ threads that were started via Rcpp calls to some of my c++ libraries from R. My main obstacle is trying to make sure that users don't try to process files that are being generated by a thread before the thread finishes. One thing I am considering is having my threaded code return a class to R that contains a pointer that it remembers. Then maybe I could just change the value at that pointer when my thread finishes. Does that seem like a reasonable approach? I'm not completely sure if this is related to your issue or not, but it might be similar enough to be worth asking...

Thanks,
Sean
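Sean's idea of an object that remembers a pointer whose value changes when the thread finishes could be sketched in plain C++ as below. This is one possible reading of his description, not a tested Rcpp recipe: `BackgroundJob` and its members are hypothetical names, and how such an object would be exposed to R is left open. The flag is a `std::atomic<bool>` so that the worker's write and the caller's read do not race.

```cpp
#include <atomic>
#include <functional>
#include <memory>
#include <thread>

// Owns a worker thread and a shared "finished" flag that the worker sets
// once its work function has returned. If work throws, the program
// terminates, so callers would need to catch inside work themselves.
class BackgroundJob {
public:
    explicit BackgroundJob(std::function<void()> work)
        : done_(std::make_shared<std::atomic<bool>>(false)) {
        std::shared_ptr<std::atomic<bool>> done = done_;
        worker_ = std::thread([done, work]() {
            work();
            // Release pairs with the acquire in finished(): everything the
            // worker wrote is visible to a caller that observes true.
            done->store(true, std::memory_order_release);
        });
    }

    // Joining in the destructor keeps the thread from outliving the object.
    ~BackgroundJob() { wait(); }

    BackgroundJob(const BackgroundJob&) = delete;
    BackgroundJob& operator=(const BackgroundJob&) = delete;

    // Non-blocking query: has the worker finished?
    bool finished() const { return done_->load(std::memory_order_acquire); }

    // Blocks until the worker has finished.
    void wait() {
        if (worker_.joinable()) worker_.join();
    }

private:
    std::shared_ptr<std::atomic<bool>> done_;
    std::thread worker_;
};
```

Code that wants to read the generated files would call finished() first and proceed only when it returns true. The worker itself must still stay away from the R API, for the thread-safety reasons discussed in this thread.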
[Rd] Interrupting C++ code execution
Hello,

I am writing an R interface for one of my C++ programs. The computations in C++ are very time consuming (several hours), so the user needs to be able to interrupt them. Currently, the only way I found to do so is calling R_CheckUserInterrupt() frequently. Unfortunately, there are several problems with that:

1. Calling R_CheckUserInterrupt() interrupts immediately, so I have no possibility to exit my code gracefully. In particular, I suppose that objects created on the heap (e.g., STL containers) are not destructed properly.

2. Calling R_CheckUserInterrupt() within a parallel OpenMP loop causes memory corruption. Even if I do so within a critical section, it usually results in segfaults, crashes, or invalid variable contents afterwards. I suppose this is due to the threads not being destroyed properly. Since most of the time critical computations are done in parallel, this means I can hardly interrupt anything.

Having a function similar to R_CheckUserInterrupt() but returning a boolean variable (has an interrupt occurred or not?) would solve these problems. Is there a way to find out about user interrupt requests (the user pressing ctrl+c or maybe a different set of keys) without interrupting immediately?

I would appreciate your advice on this topic.

Best regards,
Peter
Re: [Rd] Interrupting C++ code execution
On Apr 25, 2011, at 11:09 AM, schattenpfla...@arcor.de wrote:

Thank you for your response, Simon.

1. Calling R_CheckUserInterrupt() interrupts immediately, so I have no possibility to exit my code gracefully. In particular, I suppose that objects created on the heap (e.g., STL containers) are not destructed properly.

In general, you're responsible for the cleanup. See R-devel archives for discussion on the interactions of C++ and R error handling. Generally, you should not use local objects and you should use on.exit to make sure you clean up.

I am using Rcpp (Rcpp-modules, to be precise). This means, I do actually not write any R code. Moreover, the C++ code does not use the R API. My C++ functions are 'exposed' to R via Rcpp, which creates suitable S4 classes. Rcpp does the exception handling. In particular, there is no obvious possibility for me to add an 'on.exit' statement to a particular exposed C++ method.

Generally, you should not use local objects

We are talking about large amounts of code, dozens of nested function calls, and even external libraries. So not using local objects is definitely no option.

But that would imply that the library calls R! Note that we're talking about the stack at the point of R API call, so you can do what you want until you call R API. At the moment you touch R API you should have no local C++ objects on the stack (all the way down) - that's what I meant.

2. Calling R_CheckUserInterrupt() within a parallel OpenMP loop causes memory corruptions. Even if I do so within a critical section, it usually results in segfaults, crashes, or invalid variable contents afterwards. I suppose this is due to the threads not being destroyed properly. Since most of the time critical computations are done in parallel, this means I can hardly interrupt anything.

As you know R is not thread-safe so you cannot call any R API from a thread - including OMP threads - so obviously you can't call R_CheckUserInterrupt().

That is very interesting. Not being thread safe does not necessarily imply that a function cannot be called from within a thread (as long as it is not done concurrently from several threads). In particular, the main program itself is also a thread, isn't it?

Yes, but each thread has a separate stack, and you can only enter R with the same stack you left (because the stack will be restored to the state of the calling context).

Since no cleanup is done, however, it is now clear that calling R_CheckUserInterrupt() _anywhere_ in my program, parallel section or not, is a bad idea.

Since you're using threads the safe way is to perform your computations on a separate thread and let R handle events so that you can abort your computation thread as part of on.exit.

Starting the computations in a separate thread is a nice idea. I could then call R_CheckUserInterrupt() every x milliseconds in the function which dispatches the worker thread. Unfortunately, I see no obvious way of adding an on.exit statement to an Rcpp module method. So I would probably have to call an R function from C++ (e.g., using RInside) which contains the on.exit statement, which in turn calls again a C++ function setting a global 'abort' flag and waits for the threads to be terminated.

Hmmm. How does on.exit work? It sets the conexit object of the current context structure to the closure to be evaluated when the context is left. endcontext() then simply evaluates that closure when the context is left. Could I mimic that behaviour directly in C++?

Unfortunately there is no C-level onexit hook and the internal structure of RCNTXT is not revealed to packages. So AFAICS the closest you can get is to use eval to call on.exit(). However, I think it would be useful to have a provision for creating a context with a C-level hook - the question is whether the others have the feeling that it's going to a too low level ...

Having a function similar to R_CheckUserInterrupt() but returning a boolean variable (has an interrupt occurred or not?) would solve these problems. Is there a way to find out about user interrupt requests (the user pressing ctrl+c or maybe a different set of keys) without interrupting immediately?

Checking for interrupts may involve running the OS event loop (to allow the user to interact with R) and thus is not guaranteed to return.

I see.

There is no general solution - if you're worried only about your local code, then on unix, for example, you could use custom signal handlers to set a flag and co-operatively interrupt your program. On Windows there is the UserBreak flag which can be set by a separate thread and thus you may check on it. That said, all this is very much platform-specific.

Being able to set a flag is all I need and would be the perfect solution imho. However, I do not yet see how I could achieve that.

It is GUI-specific, unfortunately. AFAIR the Windows GUI does that
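The unix-only option Simon mentions, a custom signal handler that sets a flag which the computation checks co-operatively, could be sketched as follows. This is an illustration under assumptions, not a recommendation for use inside R: `ScopedInterruptFlag`, `recordInterrupt`, and `runUntilInterrupted` are hypothetical names, and while the guard is alive it replaces whatever SIGINT handler was installed before (inside an R session that would be R's own handler), so R itself would presumably not see the interrupt during that time.

```cpp
#include <csignal>

// Written by the handler, read by the computation. sig_atomic_t is the
// integer type that may be safely assigned from a signal handler.
static volatile std::sig_atomic_t g_interruptRequested = 0;

// The handler only records that SIGINT arrived; it does nothing else.
extern "C" void recordInterrupt(int /*signum*/) {
    g_interruptRequested = 1;
}

// Installs the handler on construction, restores the previous one on exit.
class ScopedInterruptFlag {
public:
    ScopedInterruptFlag() {
        g_interruptRequested = 0;
        previous_ = std::signal(SIGINT, recordInterrupt);
    }
    ~ScopedInterruptFlag() {
        if (previous_ != SIG_ERR) std::signal(SIGINT, previous_);
    }
    ScopedInterruptFlag(const ScopedInterruptFlag&) = delete;
    ScopedInterruptFlag& operator=(const ScopedInterruptFlag&) = delete;

    bool requested() const { return g_interruptRequested != 0; }

private:
    void (*previous_)(int);
};

// Calls step(i) for i = 0, 1, ... and stops early once SIGINT was seen.
// Returns the number of steps that were actually executed.
template <class Step>
long runUntilInterrupted(long maxSteps, Step step) {
    ScopedInterruptFlag flag;
    long done = 0;
    while (done < maxSteps && !flag.requested()) {
        step(done);
        ++done;
    }
    return done;
}
```

Because the loop checks the flag only between steps, each step always runs to completion and local objects are cleaned up through normal scope exit. After the loop, the caller can decide how to report the interrupt.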
Re: [Rd] Interrupting C++ code execution
Actually, it just came to me that there is a hack you could use. The problem with it is that it will eat all errors, even if they were not yours (e.g. those resulting from events triggered by the event loop), so I would not recommend it for general use. But here we go:

    static void chkIntFn(void *dummy) {
      R_CheckUserInterrupt();
    }

    // this will call the above in a top-level context so it won't longjmp-out of your context
    bool checkInterrupt() {
      return (R_ToplevelExec(chkIntFn, NULL) == FALSE);
    }

    // your code somewhere ...
    if (checkInterrupt()) {
      // user interrupted ...
    }

You must call it on the main thread and you should be prepared that it may take some time and may interact with the OS...

Cheers,
Simon
Re: [Rd] Interrupting C++ code execution
On Monday 25 April 2011, Simon Urbanek wrote:

Actually, it just came to me that there is a hack you could use. The problem with it is that it will eat all errors, even if they were not yours (e.g. those resulting from events triggered the event loop), so I would not recommend it for general use.

Here's another option which is probably not recommendable for general use, since it is not part of the documented API: On Windows you can look at the variable UserBreak, available from Rembedded.h. Outside of Windows, you can look at R_interrupts_pending, available from R_ext/GraphicsDevice.h. R_ext/GraphicsDevice.h also has R_interrupts_suspended, which you may or may not want to take into account, depending on your use-case.

BTW, being able to check for a pending interrupt or to schedule an interrupt from a separate thread is something that can come in handy in GUI development as well, and personally, I would appreciate, if there was some slightly more official support for this.

Regards
Thomas
Re: [Rd] Interrupting C++ code execution
Dear Simon,

thanks again for your explanations. Your previous e-mail clarified several points for me.

Actually, it just came to me that there is a hack you could use. [...]

That actually looks quite nice. At least when compared to my only current alternative of not interrupting at all. I will test it, in particular with respect to computational speed. Perhaps I can at least call it once per second.

Best regards,
Peter