Re: Stage3 closing soon, call for patch pings
On Fri, Jan 16, 2015 at 7:05 AM, Jeff Law <l...@redhat.com> wrote:
> On 01/15/15 16:43, Nathaniel Smith wrote:
>>> Jakub, myself and management have discussed this issue extensively
>>> and those patches specifically. I'm painfully aware of how this
>>> affects the ability to utilize numerical packages in Python.
>>
>> Thanks for the response! I had no idea anyone was paying attention :-).
>
> We've got customers that care about this issue, so naturally it gets a
> goodly amount of attention up into the management chain.
>
>>> The fundamental problem is you're penalizing conformant code to make
>>> non-conformant code work. In the last iteration I think the problem
>>> was major memory leakage and nobody could see a way to avoid that.
>>
>> I'm afraid I'm a bit confused by this part.
>
> I'm going from memory rather than the patch itself (Jakub understands
> the technical side of this issue far better than I). Jakub, can you
> chime in on Nathaniel's clarifications below? If the leak is strictly
> in non-conformant code, that seems much less of a problem than I
> recall from our internal discussions.

Ping?

>> In the patch I linked, the costs imposed on conformant programs are:
>>
>> 1) One extra 'int' per thread pool.
>> 2) An increment/decrement pair during each call to fork().
>> 3) A single 'if (__builtin_expect(..., 0)) { ... }' in gomp_team_start.
>>
>> That's all. There is definitely no memory leakage for conformant code.
>>
>> There *is* a memory leak for non-conformant code: if you use OMP in
>> the parent, then fork, and then use OMP in the child, then without the
>> patch you get a deadlock; with the patch everything functions
>> correctly, but the child's COW copies of the parent's thread stacks
>> are leaked.
>>
>> There are a few things that somewhat mitigate this in practice. The
>> main one is just: leaking is a lot better than crashing -- especially
>> when the only other way to fix the crash is to completely rewrite your
>> code (or some third-party code!) to avoid OMP, which is rather
>> prohibitive.
>>
>> Of course I'd rather not leak at all, but this isn't really a
>> situation where one can say "well, that was a stupid idea, so don't do
>> that": a memory leak here really is a better outcome than any
>> currently available alternative, it enables use cases that just
>> aren't possible otherwise, and if OMP fixes its problems later then
>> we can update our fix to follow.
>>
>> It's also worth thinking a bit about the magnitude of the problem. In
>> practice the most common case where an OMP-using program forks and
>> the children then use OMP is going to be something like Python
>> multiprocessing, where the child processes form a worker pool. In
>> this case, the total effective memory leakage is somewhere between 0
>> and 1 copies of the parent's threads -- all the children will share a
>> single COW copy, and the parent may share some or all of that COW
>> copy as well, depending on its OMP usage.
>>
>> Conversely, it's very rare to have a process that forks, then the
>> child uses OMP and forks, then the grandchild uses OMP and forks, and
>> so on -- yet this is what would be required to get an unbounded
>> memory leak here. If you only have a single level of forking, then
>> each child only has a single leak of fixed size. There are many, many
>> situations where a few megabytes of overhead are a small price to pay
>> for a convenient way to process large datasets.
>>
>> Does this affect your opinion about this patch, or am I better off
>> giving up now?
>>
>> -n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
Re: Stage3 closing soon, call for patch pings
On 15/01/15 04:26 PM, H.J. Lu wrote:
> On Thu, Jan 15, 2015 at 1:04 PM, Jeff Law <l...@redhat.com> wrote:
>> Stage3 is closing rapidly. I've drained my queue of patches I was
>> tracking for gcc-5. However, note that I don't track everything. If
>> it's a patch for a backend, a language other than C, or it seemingly
>> has another maintainer engaged in review, then I haven't been
>> tracking the patch. So this is my final call for patch pings. I've
>> got some bandwidth and may be able to look at a few patches that have
>> otherwise stalled.
>
> This one was updated yesterday:
>
> https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00956.html
>
> I guess it won't hurt to list it here.

I find it pretty strange that this didn't land. It's a dead simple set
of changes, since the subtle fixes already landed. MSVC has offered
full ASLR by default since the early Vista era, but most Linux
distributions don't have it because GCC makes it difficult to use.

Anyway, here's to another year of unmitigated exploits...
Re: Stage3 closing soon, call for patch pings
On Thursday, 15 January 2015 13:26:43, H.J. Lu wrote:
> On Thu, Jan 15, 2015 at 1:04 PM, Jeff Law <l...@redhat.com> wrote:
>> Stage3 is closing rapidly. I've drained my queue of patches I was
>> tracking for gcc-5. However, note that I don't track everything. If
>> it's a patch for a backend, a language other than C, or it seemingly
>> has another maintainer engaged in review, then I haven't been
>> tracking the patch. So this is my final call for patch pings. I've
>> got some bandwidth and may be able to look at a few patches that have
>> otherwise stalled.
>
> This one was updated yesterday:
>
> https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00956.html
>
> I guess it won't hurt to list it here.
>
> ---
> H.J.

Jeff, can that be committed? Thank you, H.J., for the work on it.

/Magnus
Re: Stage3 closing soon, call for patch pings
On 01/16/15 13:14, Magnus Granberg wrote:
> On Thursday, 15 January 2015 13:26:43, H.J. Lu wrote:
>> On Thu, Jan 15, 2015 at 1:04 PM, Jeff Law <l...@redhat.com> wrote:
>>> Stage3 is closing rapidly. I've drained my queue of patches I was
>>> tracking for gcc-5. However, note that I don't track everything. If
>>> it's a patch for a backend, a language other than C, or it seemingly
>>> has another maintainer engaged in review, then I haven't been
>>> tracking the patch. So this is my final call for patch pings. I've
>>> got some bandwidth and may be able to look at a few patches that
>>> have otherwise stalled.
>>
>> This one was updated yesterday:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00956.html
>>
>> I guess it won't hurt to list it here.
>>
>> ---
>> H.J.
>
> Jeff, can that be committed? Thank you, H.J., for the work on it.

Hoping folks more familiar with it will wrap it up. I'm nowhere near
up to speed on this change yet.

jeff
Re: Stage3 closing soon, call for patch pings
On Thu, Jan 15, 2015 at 1:04 PM, Jeff Law <l...@redhat.com> wrote:
> Stage3 is closing rapidly. I've drained my queue of patches I was
> tracking for gcc-5. However, note that I don't track everything. If
> it's a patch for a backend, a language other than C, or it seemingly
> has another maintainer engaged in review, then I haven't been tracking
> the patch. So this is my final call for patch pings. I've got some
> bandwidth and may be able to look at a few patches that have otherwise
> stalled.

This one was updated yesterday:

https://gcc.gnu.org/ml/gcc-patches/2015-01/msg00956.html

I guess it won't hurt to list it here.

---
H.J.
Re: Stage3 closing soon, call for patch pings
On 01/15/15 15:34, Nathaniel Smith wrote:
> On Thu, Jan 15, 2015 at 9:04 PM, Jeff Law <l...@redhat.com> wrote:
>> Stage3 is closing rapidly. I've drained my queue of patches I was
>> tracking for gcc-5. However, note that I don't track everything. If
>> it's a patch for a backend, a language other than C, or it seemingly
>> has another maintainer engaged in review, then I haven't been
>> tracking the patch. So this is my final call for patch pings. I've
>> got some bandwidth and may be able to look at a few patches that have
>> otherwise stalled.
>
> I've been pinging this for about a year now in case it's of interest:
>
> https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01871.html
>
> It fixes a show-stopper for using GOMP in libraries -- currently you
> cannot use GOMP in any code where you don't control the whole program,
> because it breaks fork(). (GOMP is the only OMP implementation that
> has this problem.) This is particularly annoying for Python, since it
> manifests as person A writing a fancy numerical package that happens
> to use OMP internally (because they don't know any better), and then
> person B trying to use Python's standard 'multiprocessing' library and
> getting weird hangs with no idea why.

Jakub, myself and management have discussed this issue extensively and
those patches specifically. I'm painfully aware of how this affects the
ability to utilize numerical packages in Python.

The fundamental problem is that you're penalizing conformant code to
make non-conformant code work. In the last iteration I think the
problem was major memory leakage, and nobody could see a way to avoid
that.

It's highly unfortunate that POSIX, OpenMP, etc. haven't tackled any of
the composability problems yet. I heard rumblings that those issues
were going to be investigated by OpenMP back in 2013, but I don't think
any significant progress has been made.

I don't see any way any of those patches are going to get in.

jeff
Re: Stage3 closing soon, call for patch pings
On Thu, Jan 15, 2015 at 9:04 PM, Jeff Law <l...@redhat.com> wrote:
> Stage3 is closing rapidly. I've drained my queue of patches I was
> tracking for gcc-5. However, note that I don't track everything. If
> it's a patch for a backend, a language other than C, or it seemingly
> has another maintainer engaged in review, then I haven't been tracking
> the patch. So this is my final call for patch pings. I've got some
> bandwidth and may be able to look at a few patches that have otherwise
> stalled.

I've been pinging this for about a year now in case it's of interest:

https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01871.html

It fixes a show-stopper for using GOMP in libraries -- currently you
cannot use GOMP in any code where you don't control the whole program,
because it breaks fork(). (GOMP is the only OMP implementation that has
this problem.) This is particularly annoying for Python, since it
manifests as person A writing a fancy numerical package that happens to
use OMP internally (because they don't know any better), and then
person B trying to use Python's standard 'multiprocessing' library and
getting weird hangs with no idea why.

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
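The failure mode described above can be simulated without OpenMP at all. This is a hedged, POSIX-only sketch in pure Python (names are illustrative): a lock held by a thread at fork() time is duplicated in the child in its locked state, but the thread that would release it exists only in the parent, so the child blocks forever on acquire -- which is exactly the shape of the multiprocessing hangs.

```python
# Pure-Python simulation (no OpenMP needed) of the fork-after-threads
# hang: the child inherits a held lock with no owner to release it.
import os
import threading
import time

lock = threading.Lock()

def worker():
    # Stands in for an OMP runtime thread holding internal runtime state.
    with lock:
        time.sleep(2.0)

def forked_child_can_lock(timeout=0.5):
    """Fork while `worker` holds the lock; report whether the child could
    ever acquire it.  (It cannot -- a blocking acquire would hang.)"""
    t = threading.Thread(target=worker)
    t.start()
    time.sleep(0.2)                      # let worker take the lock first
    pid = os.fork()
    if pid == 0:
        got = lock.acquire(timeout=timeout)
        os._exit(0 if got else 42)       # 42 = would have hung forever
    _, status = os.waitpid(pid, 0)
    t.join()
    return os.WEXITSTATUS(status) == 0
```

Here `forked_child_can_lock()` returns False: the child's timed acquire fails because the releasing thread was never copied by fork(). With a plain blocking `lock.acquire()` the child would simply hang, mirroring what users of an OMP-using library see under multiprocessing.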
Re: Stage3 closing soon, call for patch pings
Hi Jeff,

On Thu, Jan 15, 2015 at 10:50 PM, Jeff Law <l...@redhat.com> wrote:
> On 01/15/15 15:34, Nathaniel Smith wrote:
>> On Thu, Jan 15, 2015 at 9:04 PM, Jeff Law <l...@redhat.com> wrote:
>>> Stage3 is closing rapidly. I've drained my queue of patches I was
>>> tracking for gcc-5. However, note that I don't track everything. If
>>> it's a patch for a backend, a language other than C, or it seemingly
>>> has another maintainer engaged in review, then I haven't been
>>> tracking the patch. So this is my final call for patch pings. I've
>>> got some bandwidth and may be able to look at a few patches that
>>> have otherwise stalled.
>>
>> I've been pinging this for about a year now in case it's of interest:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01871.html
>>
>> It fixes a show-stopper for using GOMP in libraries -- currently you
>> cannot use GOMP in any code where you don't control the whole
>> program, because it breaks fork(). (GOMP is the only OMP
>> implementation that has this problem.) This is particularly annoying
>> for Python, since it manifests as person A writing a fancy numerical
>> package that happens to use OMP internally (because they don't know
>> any better), and then person B trying to use Python's standard
>> 'multiprocessing' library and getting weird hangs with no idea why.
>
> Jakub, myself and management have discussed this issue extensively and
> those patches specifically. I'm painfully aware of how this affects
> the ability to utilize numerical packages in Python.

Thanks for the response! I had no idea anyone was paying attention :-).

> The fundamental problem is you're penalizing conformant code to make
> non-conformant code work. In the last iteration I think the problem
> was major memory leakage and nobody could see a way to avoid that.

I'm afraid I'm a bit confused by this part. In the patch I linked, the
costs imposed on conformant programs are:

1) One extra 'int' per thread pool.
2) An increment/decrement pair during each call to fork().
3) A single 'if (__builtin_expect(..., 0)) { ... }' in gomp_team_start.

That's all. There is definitely no memory leakage for conformant code.

There *is* a memory leak for non-conformant code: if you use OMP in the
parent, then fork, and then use OMP in the child, then without the
patch you get a deadlock; with the patch everything functions
correctly, but the child's COW copies of the parent's thread stacks are
leaked.

There are a few things that somewhat mitigate this in practice. The
main one is just: leaking is a lot better than crashing -- especially
when the only other way to fix the crash is to completely rewrite your
code (or some third-party code!) to avoid OMP, which is rather
prohibitive.

Of course I'd rather not leak at all, but this isn't really a situation
where one can say "well, that was a stupid idea, so don't do that": a
memory leak here really is a better outcome than any currently
available alternative, it enables use cases that just aren't possible
otherwise, and if OMP fixes its problems later then we can update our
fix to follow.

It's also worth thinking a bit about the magnitude of the problem. In
practice the most common case where an OMP-using program forks and the
children then use OMP is going to be something like Python
multiprocessing, where the child processes form a worker pool. In this
case, the total effective memory leakage is somewhere between 0 and 1
copies of the parent's threads -- all the children will share a single
COW copy, and the parent may share some or all of that COW copy as
well, depending on its OMP usage.

Conversely, it's very rare to have a process that forks, then the child
uses OMP and forks, then the grandchild uses OMP and forks, and so on
-- yet this is what would be required to get an unbounded memory leak
here. If you only have a single level of forking, then each child only
has a single leak of fixed size. There are many, many situations where
a few megabytes of overhead are a small price to pay for a convenient
way to process large datasets.

Does this affect your opinion about this patch, or am I better off
giving up now?

-n

--
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org
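The "few megabytes" magnitude argument above can be made concrete with a back-of-envelope calculation. The numbers here are assumed and purely illustrative (an 8-thread team and the common 8 MiB default pthread stack size on glibc/Linux); the thread itself quotes no specific figures.

```python
# Back-of-envelope bound on the leaked COW thread stacks described
# above.  Assumed, illustrative numbers: 8 worker threads, 8 MiB
# default pthread stack size (a common glibc/Linux default).
threads = 8
stack_mib = 8

# Upper bound on leaked *virtual* address space in one forked child.
per_process_bound_mib = threads * stack_mib   # 8 * 8 = 64 MiB

# All worker-pool children share one copy-on-write image of those
# stacks, so the extra *physical* memory across the whole pool stays
# between 0 and one copy, regardless of how many children there are.
pool_wide_physical_bound_mib = per_process_bound_mib
```

Under these assumptions the leak is bounded at roughly 64 MiB of virtual address space per forking level, and at most one physically-resident copy shared across the entire worker pool -- which is the "somewhere between 0 and 1 copies" claim above in numbers.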
Re: Stage3 closing soon, call for patch pings
On 01/15/15 16:43, Nathaniel Smith wrote:
>> Jakub, myself and management have discussed this issue extensively
>> and those patches specifically. I'm painfully aware of how this
>> affects the ability to utilize numerical packages in Python.
>
> Thanks for the response! I had no idea anyone was paying attention :-).

We've got customers that care about this issue, so naturally it gets a
goodly amount of attention up into the management chain.

>> The fundamental problem is you're penalizing conformant code to make
>> non-conformant code work. In the last iteration I think the problem
>> was major memory leakage and nobody could see a way to avoid that.
>
> I'm afraid I'm a bit confused by this part.

I'm going from memory rather than the patch itself (Jakub understands
the technical side of this issue far better than I). Jakub, can you
chime in on Nathaniel's clarifications below? If the leak is strictly
in non-conformant code, that seems much less of a problem than I recall
from our internal discussions.

> In the patch I linked, the costs imposed on conformant programs are:
>
> 1) One extra 'int' per thread pool.
> 2) An increment/decrement pair during each call to fork().
> 3) A single 'if (__builtin_expect(..., 0)) { ... }' in gomp_team_start.
>
> That's all. There is definitely no memory leakage for conformant code.
>
> There *is* a memory leak for non-conformant code: if you use OMP in
> the parent, then fork, and then use OMP in the child, then without the
> patch you get a deadlock; with the patch everything functions
> correctly, but the child's COW copies of the parent's thread stacks
> are leaked.
>
> There are a few things that somewhat mitigate this in practice. The
> main one is just: leaking is a lot better than crashing -- especially
> when the only other way to fix the crash is to completely rewrite your
> code (or some third-party code!) to avoid OMP, which is rather
> prohibitive.
>
> Of course I'd rather not leak at all, but this isn't really a
> situation where one can say "well, that was a stupid idea, so don't do
> that": a memory leak here really is a better outcome than any
> currently available alternative, it enables use cases that just aren't
> possible otherwise, and if OMP fixes its problems later then we can
> update our fix to follow.
>
> It's also worth thinking a bit about the magnitude of the problem. In
> practice the most common case where an OMP-using program forks and the
> children then use OMP is going to be something like Python
> multiprocessing, where the child processes form a worker pool. In this
> case, the total effective memory leakage is somewhere between 0 and 1
> copies of the parent's threads -- all the children will share a single
> COW copy, and the parent may share some or all of that COW copy as
> well, depending on its OMP usage.
>
> Conversely, it's very rare to have a process that forks, then the
> child uses OMP and forks, then the grandchild uses OMP and forks, and
> so on -- yet this is what would be required to get an unbounded memory
> leak here. If you only have a single level of forking, then each child
> only has a single leak of fixed size. There are many, many situations
> where a few megabytes of overhead are a small price to pay for a
> convenient way to process large datasets.
>
> Does this affect your opinion about this patch, or am I better off
> giving up now?
>
> -n
Re: Stage3 closing soon, call for patch pings
On Fri, Jan 16, 2015 at 5:04 AM, Jeff Law <l...@redhat.com> wrote:
> Stage3 is closing rapidly. I've drained my queue of patches I was
> tracking for gcc-5. However, note that I don't track everything. If
> it's a patch for a backend, a language other than C, or it seemingly
> has another maintainer engaged in review, then I haven't been tracking
> the patch. So this is my final call for patch pings. I've got some
> bandwidth and may be able to look at a few patches that have otherwise
> stalled.
>
> Jeff

Here is an ARM backend patch, CCing ARM maintainers.

https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01383.html

Thanks,
bin
Stage3 closing soon, call for patch pings
Stage3 is closing rapidly. I've drained my queue of patches I was
tracking for gcc-5. However, note that I don't track everything. If
it's a patch for a backend, a language other than C, or it seemingly
has another maintainer engaged in review, then I haven't been tracking
the patch.

So this is my final call for patch pings. I've got some bandwidth and
may be able to look at a few patches that have otherwise stalled.

Jeff