Re: User-space threads and NSAutoreleasePool
On 18 Mar 2010, at 06:41, BJ Homer wrote:

> On Wed, Mar 17, 2010 at 11:47 PM, Greg Guerin glgue...@amug.org wrote:
>
> doing one transaction updating 400-500 records.) Hence, we pipeline the HTTP requests, starting transfer of the second before the first one is finished. There are a large number of servers that don't handle pipelining, but we'll only be talking to one particular server, and we know it does. NSURLConnection does not (according to various mailing list messages) implement pipelining, allegedly due to the lack of server support. There's some suggestion that CFHTTPStream does support pipelining, but there's little to no documentation about it, and I don't know if it will handle 500 at once.

It seems to me that you can accomplish what you want using just two normal threads.

Open a TCP connection to port 80 on the server.

Create a read thread and a write thread.

The write thread takes all the requests passed to it and writes them to the TCP connection using synchronous I/O as fast as the connection will accept data. After each request has been written, the thread puts it in an "awaiting response" queue.

The read thread reads the TCP connection using synchronous I/O and deserialises the responses as they come in. For each response, it takes the first request off the "awaiting response" queue and deals with it.

___
Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com
Help/Unsubscribe/Update your Subscription: http://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com
This email sent to arch...@mail-archive.com
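The two-thread layout described in the message above can be sketched in C. This is a minimal illustration under stated assumptions, not a drop-in implementation: a pipe stands in for the TCP connection (each request byte written comes straight back as its "response"), and the names `pending_queue`, `pipeline` and `run_pipeline` are hypothetical. It relies on responses arriving in request order, which is what lets the read thread match each response to the head of the awaiting-response queue.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

#define MAX_PENDING 64

/* FIFO of request ids that were written but not yet answered. */
typedef struct {
    int ids[MAX_PENDING];
    int head, count;
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
} pending_queue;

typedef struct {
    int read_fd, write_fd;   /* both ends of the stand-in "connection" */
    int nrequests;
    int matched;             /* responses paired with a request so far */
    int *completed;          /* completed[i] = id matched to response i */
    pending_queue pending;
} pipeline;

static void pq_push(pending_queue *q, int id) {
    pthread_mutex_lock(&q->lock);
    assert(q->count < MAX_PENDING);
    q->ids[(q->head + q->count) % MAX_PENDING] = id;
    q->count++;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

static int pq_pop(pending_queue *q) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)    /* response arrived before the writer queued it */
        pthread_cond_wait(&q->nonempty, &q->lock);
    int id = q->ids[q->head];
    q->head = (q->head + 1) % MAX_PENDING;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return id;
}

/* Write thread: send each request, then queue it as awaiting a response. */
static void *writer_main(void *arg) {
    pipeline *p = arg;
    for (int id = 0; id < p->nrequests; id++) {
        char request = (char)id;
        if (write(p->write_fd, &request, 1) != 1)
            break;
        pq_push(&p->pending, id);
    }
    return NULL;
}

/* Read thread: each response belongs to the oldest outstanding request. */
static void *reader_main(void *arg) {
    pipeline *p = arg;
    for (int i = 0; i < p->nrequests; i++) {
        char response;
        if (read(p->read_fd, &response, 1) != 1)
            break;
        p->completed[i] = pq_pop(&p->pending);
        p->matched++;
    }
    return NULL;
}

/* Returns the number of responses matched, or -1 on setup failure. */
int run_pipeline(int nrequests, int *completed) {
    int fds[2];
    if (nrequests < 0 || nrequests > MAX_PENDING || pipe(fds) != 0)
        return -1;
    pipeline p = { .read_fd = fds[0], .write_fd = fds[1],
                   .nrequests = nrequests, .completed = completed };
    pthread_mutex_init(&p.pending.lock, NULL);
    pthread_cond_init(&p.pending.nonempty, NULL);
    pthread_t writer, reader;
    pthread_create(&writer, NULL, writer_main, &p);
    pthread_create(&reader, NULL, reader_main, &p);
    pthread_join(writer, NULL);
    pthread_join(reader, NULL);
    close(fds[0]);
    close(fds[1]);
    pthread_cond_destroy(&p.pending.nonempty);
    pthread_mutex_destroy(&p.pending.lock);
    return p.matched;
}
```

Calling `run_pipeline(8, done)` with an `int done[8]` array fills `done` with the request ids in the order their responses were matched. A real client would replace the pipe with a socket and the single bytes with serialised HTTP requests and parsed responses; error handling for `pthread_create` is omitted to keep the sketch short.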
Re: User-space threads and NSAutoreleasePool
On Wed, Mar 17, 2010 at 11:47 PM, Greg Guerin glgue...@amug.org wrote:

> Two main questions come to mind:
> Q1. What are you trying to accomplish?
> Q2. Why do you think this would work?
>
> More on Q1: You said you need user-space threads, but you gave no reason or rationale. If it's because you need 500 threads, then why do you need 500 threads? Exactly what are you trying to accomplish?

Q1: What am I trying to accomplish?

In this particular case, I'm working on an application that sends very large HTTP PUT requests over an HTTP connection in a pipelined fashion. For performance reasons, the server prefers to delay acknowledgement of these files until it can process them in large groups. (Before it can send an ack, it must update a database. Doing 400-500 single updates is far slower than doing one transaction updating 400-500 records.) Hence, we pipeline the HTTP requests, starting transfer of the second before the first one is finished. There are a large number of servers that don't handle pipelining, but we'll only be talking to one particular server, and we know it does. NSURLConnection does not (according to various mailing list messages) implement pipelining, allegedly due to the lack of server support. There's some suggestion that CFHTTPStream does support pipelining, but there's little to no documentation about it, and I don't know if it will handle 500 at once.

If you can use user-space threads, this becomes simple: you send the first file, and then while you're waiting for the ack, you swap out and let another file start sending. When the ack is received, the first user-thread is rescheduled. We've developed an extremely high-performance cross-platform library that handles all the scheduling directly. Then we've built another library around it that handles all the particulars of our server protocol. At the moment, I'm converting our OS X client to use this library.

That's my immediate motivation; if I can use an existing library, I'll save a large amount of time, get a big performance boost, and have less code to test. To a large extent, it is already working.

> More on Q2: The ucontext(3) functions appear to me to be more intended for signal-handling, specifically for alt-signal-stack handling. The nature of signals is that they eventually return (nested), they don't work in a round-robin or any other scheduling.
>
> That's not a question, just prep. The question is this: Are you sure the Objective-C runtime is compatible with ucontext user-threads?
>
> There are lots of things you can't do in signal-handlers, not even handlers written in plain C (ref: man sigaction, the list of safe functions). If you can't call malloc() in a signal-handler (and I'm pretty sure you can't), what do you expect to happen with your Objective-C user-threads, since object allocation is tied to malloc()?

Q2: Why do I think this would work?

Am I sure the Obj-C runtime is compatible with ucontext user-threads? No. There is, in general, relatively little information on the use of ucontext user-threads on OS X. Mostly, it exists in man pages. So far the only issue I've run into is the autorelease pool issue mentioned previously, which isn't (I suspect) actually a runtime issue at all. There may be other issues, but that's part of the reason I'm asking on the list. I can work around the autorelease pool issue if I have to (by limiting my use of Objective-C objects and autorelease pools in such a way that I can guarantee correctness), but if there are other known issues, I'd love to hear about them.

It's true that the ucontext functions are often used for signal handling, but that's not the only use case. My code does not relate to signal handling at all. As mentioned above, all this ucontext stuff is happening in a cross-platform library. It's already running on Linux, where I suspect the malloc-in-signal-handlers restrictions equally apply. I've also had it running (to some extent) on OS X. On neither Linux nor in my initial attempts on OS X have I seen anything to indicate that object allocation was not working. In fact, I have a fair amount of evidence that it is likely working just fine. If object allocation were failing or otherwise unstable, my code would be crashing all over the place, not just having autorelease pool issues.

-BJ
Re: User-space threads and NSAutoreleasePool
On 18 March 2010, at 07:41, BJ Homer wrote:

> On Wed, Mar 17, 2010 at 11:47 PM, Greg Guerin glgue...@amug.org wrote:
>
>> Two main questions come to mind:
>> Q1. What are you trying to accomplish?
>> Q2. Why do you think this would work?
>>
>> More on Q1: You said you need user-space threads, but you gave no reason or rationale. If it's because you need 500 threads, then why do you need 500 threads? Exactly what are you trying to accomplish?
>
> Q1: What am I trying to accomplish?
>
> In this particular case, I'm working on an application that sends very large HTTP PUT requests over an HTTP connection in a pipelined fashion. For performance reasons, the server prefers to delay acknowledgement of these files until it can process them in large groups. (Before it can send an ack, it must update a database. Doing 400-500 single updates is far slower than doing one transaction updating 400-500 records.) Hence, we pipeline the HTTP requests, starting transfer of the second before the first one is finished. There are a large number of servers that don't handle pipelining, but we'll only be talking to one particular server, and we know it does. NSURLConnection does not (according to various mailing list messages) implement pipelining, allegedly due to the lack of server support. There's some suggestion that CFHTTPStream does support pipelining, but there's little to no documentation about it, and I don't know if it will handle 500 at once.
>
> If you can use user-space threads, this becomes simple: you send the first file, and then while you're waiting for the ack, you swap out and let another file start sending. When the ack is received, the first user-thread is rescheduled. We've developed an extremely high-performance cross-platform library that handles all the scheduling directly. Then we've built another library around it that handles all the particulars of our server protocol. At the moment, I'm converting our OS X client to use this library.
>
> That's my immediate motivation; if I can use an existing library, I'll save a large amount of time, get a big performance boost, and have less code to test. To a large extent, it is already working.

That's why non-blocking/async APIs were developed. I don't see what prevents you from using a single thread and a kevent-like API (except that you would not be able to reuse the existing library).

>> More on Q2: The ucontext(3) functions appear to me to be more intended for signal-handling, specifically for alt-signal-stack handling. The nature of signals is that they eventually return (nested), they don't work in a round-robin or any other scheduling.
>>
>> That's not a question, just prep. The question is this: Are you sure the Objective-C runtime is compatible with ucontext user-threads?
>>
>> There are lots of things you can't do in signal-handlers, not even handlers written in plain C (ref: man sigaction, the list of safe functions). If you can't call malloc() in a signal-handler (and I'm pretty sure you can't), what do you expect to happen with your Objective-C user-threads, since object allocation is tied to malloc()?

Not only object allocation, but also message sending: on a cache miss while sending a message, the runtime may allocate additional cache space to store information. You cannot safely use Objective-C in a signal handler.

-- Jean-Daniel
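The single-thread, event-driven alternative suggested in the message above can be sketched as follows. The message names a kevent-like API; this sketch swaps in POSIX `poll()` as a portable stand-in, and the function name `drain_all` and the `MAX_CONNS` limit are hypothetical. One thread watches several descriptors and reads from whichever has data, so no thread (kernel or user-space) is parked per connection.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

#define MAX_CONNS 16

/* Read from every descriptor as data arrives, all on the calling thread.
 * totals[i] receives the number of bytes read from fds[i].
 * Returns 0 once every descriptor has reached end-of-file, -1 on error. */
int drain_all(const int *fds, int nfds, long *totals) {
    struct pollfd watched[MAX_CONNS];
    if (nfds < 0 || nfds > MAX_CONNS)
        return -1;
    assert(fds != NULL && totals != NULL);

    for (int i = 0; i < nfds; i++) {
        watched[i].fd = fds[i];
        watched[i].events = POLLIN;
        watched[i].revents = 0;
        totals[i] = 0;
    }

    int open_count = nfds;
    while (open_count > 0) {
        /* Block until at least one descriptor is readable or closed. */
        if (poll(watched, (nfds_t)nfds, -1) < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        for (int i = 0; i < nfds; i++) {
            if (watched[i].fd < 0)
                continue;            /* already finished */
            if (!(watched[i].revents & (POLLIN | POLLHUP | POLLERR)))
                continue;            /* nothing to do for this one yet */
            char buf[4096];
            ssize_t n = read(watched[i].fd, buf, sizeof buf);
            if (n > 0) {
                totals[i] += n;
            } else {
                /* EOF or error: poll() ignores negative descriptors. */
                watched[i].fd = -1;
                open_count--;
            }
        }
    }
    return 0;
}
```

In the upload scenario from the thread, the same loop would also watch for `POLLOUT` so that the next chunk of a PUT body is written only when the socket can accept it, with acknowledgements handled in the `POLLIN` branch.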
Re: User-space threads and NSAutoreleasePool
> The problem is that when you call swapcontext() to switch the user-thread running on a kernel-thread, the NSAutoreleasePool stack is not swapped out. It remains rooted in thread-local storage. As a result, serious problems result. Let me give an example.
>
>     - (void)doStuff {
>         NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
>         // do some stuff that calls swapcontext()
>         [pool drain];
>     }
>
> -doStuff calls swapcontext(), trusting that the other user-thread will eventually call swapcontext() and restore the flow to this user-thread. Further, assume that the second user-thread also allocates an autorelease pool before returning. Despite being on separate user-threads, there is still only one kernel-thread. Since the autorelease pool stack is in (kernel-)thread-local storage, the second user-thread's pool goes on top of the same stack:
>
> Autorelease pool stack: pool_from_uthread1 - pool_from_uthread2
>
> Now, when we swap back to the first user-thread, it will release its autorelease pool. This, naturally, releases the second pool as well. When we swap back to the second user-thread, anything that was autoreleased is now dead and gone.

In my opinion, the only solution is NOT to have any additional autorelease pools active when switching the context.

As far as I understand user-space contexts, you have total control over when a context switch happens. And you also have total control over the local autorelease pools in your code. So just don't let an autorelease pool span a context switch:

    - (void)doStuff {
        NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
        // do some stuff
        [pool drain];
        // make sure not to have additional Autorelease-Pool
        // do some stuff that calls swapcontext()
        pool = [[NSAutoreleasePool alloc] init];
        // now you can have one again
        // do some stuff
        [pool drain];
    }

Depending on the complexity of your app, the affected places may be a bit tricky to find. But local autorelease pools are usually kept around small amounts of code. Just don't switch contexts while inside these code areas. ;-)

Regards,
Mani
--
http://mani.de - friendly software
iVolume - listen to music hands-free
LittleSecrets - the encrypted notepad
Sahara - sand in your pocket
Watchdog - baffle the curious
Re: User-space threads and NSAutoreleasePool
On Wed, Mar 17, 2010 at 11:11 PM, BJ Homer bjho...@gmail.com wrote:

> Okay, so that's the setup. Obviously, the problem is that the two user-space threads are sharing an autorelease pool stack that is intended to be thread-local. My question, then, is whether there exists a way to get and set the autorelease pool stack, so that before calling swapcontext(), I could put it in a state appropriate for this user-level thread? I assume it is being stored in thread-local storage, but it's not in the NSThread threadDictionary, which means it's probably set using pthread_setspecific. Accessing that value would require the key used to store it, but naturally I don't have access to that. So is there some existing function call that allows such access?

I did some experimenting with this sort of thing back in the 10.4 days, using custom userspace threading code (but it should be pretty much the same as this library call). I came to the conclusion that it was impractical.

Cocoa keeps around a lot of thread-specific state. In addition to autorelease pools, you also have exception handlers, graphics contexts, and possibly others. You also have implicit thread-specific state like runloops and various non-reentrant code that may be calling yours when you switch to another user-level thread. And Cocoa assumes on a pretty deep level that the only threading you're doing is pthreads, and things built on top of pthreads like NSThread.

In short, I think you're doomed. Any code you call in your user-level threads needs to be minimally aware of them and compatible with them (at least to the extent of not assuming pthreads), so you can't just call arbitrary libraries. I'm afraid Cocoa falls into that category.

Since you mention in another message that this is portable, cross-platform code, why do you need to call Cocoa from the user-level threads at all? Separate your code into cross-platform code that does this swapcontext stuff, and Mac-specific code that doesn't, and you should be good.

Mike
Re: User-space threads and NSAutoreleasePool
On Mar 18, 2010, at 8:35 AM, Michael Ash wrote:

> Cocoa keeps around a lot of thread-specific state. In addition to autorelease pools, you also have exception handlers, graphics contexts, and possibly others.

Yup. I quickly ran into this in 2008 when experimenting with implementing coroutines (which use the same ucontext stuff).

Lightweight threads/coroutines can be very useful for highly scalable systems (that's one reason there's a lot of hype about Erlang these days), but you can't graft them on top of a runtime that doesn't know about them and already has its own threading support.

I haven't had a chance yet to use the Grand Central / dispatch-queue stuff in 10.6, but I believe that it offers some similar functionality, like being able to create huge numbers of concurrent operations without having each one create a kernel thread.

-Jens
re: User-space threads and NSAutoreleasePool
> The problem is that when you call swapcontext() to switch the user-thread running on a kernel-thread, the NSAutoreleasePool stack is not swapped out. It remains rooted in thread-local storage. As a result, serious problems result. Let me give an example.
>
>     - (void)doStuff {
>         NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
>         // do some stuff that calls swapcontext()
>         [pool drain];
>     }

That doesn't work correctly with regular threads, pthreads, GCD or any other concurrency pattern. The autorelease pool needs to be pushed and popped within the context of the asynchronous subtask. The original thread of control needs its own pool, and you cannot rely upon autorelease to keep objects alive across asynchronous task boundaries. You will need to be careful to ensure that the sending thread transfers ownership of a retain to the receiving thread, which releases it. Not autoreleases it.

It would need to conceptually be:

    - (void)doStuff {
        NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
        // some stuff Before
        [pool drain];
        // do some stuff that calls swapcontext(), and has its own scoped autorelease pool
        pool = [[NSAutoreleasePool alloc] init];
        // some more stuff After
        [pool drain];
    }

Objects you want to keep alive across drains should be retained and later released.

Autorelease pools are cheap. Make more. A lot more. If you have places where that doesn't work with coroutines, then don't leak important objects into autorelease pools. Either don't create them that way, or retain them again, and release them after you can verify the autorelease pool was destroyed.

> Using kernel-level threads is, naturally, simpler, but due to the cost of kernel-level context switches, they don't scale well. (You just can't run 500 threads at once very well.) User-space threads require more programmer attention, certainly, but also allow certain programming models that can't be done otherwise. For example, they can be used for coroutines, cooperative threading, or all sorts of esoteric-yet-sometimes-useful effects.

That's true, but I'm not sure it's relevant for the "I wish I had 500 threads to handle this web server communication" task. There are a lot of different ways to organize the tasks as inert data that can be processed cooperatively by a set of active worker threads.

The mental model of each task being its own little world and a thread is nice, but it's also artificial. You can reorganize the relationship between active workers and command data. Nothing is stopping you from that except a preconception that tasks and threads must have a 1:1 correspondence.

You could create a finite state machine to process a queue of the tasks as inert command structures and fire up 8 of those finite state machines on their own dedicated normal threads on your Mac Pro. The semantic effect will be similar to user threads, yet you won't run into all these crazy problems. Destroying thread-local storage and its effect on autorelease pools is just your first problem stuffing Cocoa into swapcontext().

This point from Mike is particularly apt:

> Since you mention in another message that this is portable, cross-platform code, why do you need to call Cocoa from the user-level threads at all? Separate your code into cross-platform code that does this swapcontext stuff, and Mac-specific code that doesn't, and you should be good.

You could use a sandboxing approach like Safari and have your cross-platform code run in a background CLI and talk to the GUI app over IPC. No Cocoa in the CLI and no user threads in the GUI app.

Pretty sure, though, you'd get better performance conceptually restructuring the task / actor relationship. The cooperative finite state machine approach along with a prioritized queue can provide soft real-time performance with thousands of task objects ... every second. That's basically what some older MMORPG servers did before people went wild with distributed computing.

- Ben
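The "tasks as inert command structures" idea from the message above can be sketched in C. This is a rough illustration under stated assumptions: the three states, the `upload_task` fields, the 1024-byte chunk size and the function names are all hypothetical, and the network I/O is replaced by simple counters. Each task is plain data; a worker advances every task by one non-blocking step per pass, so no task ever needs a stack of its own.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK 1024   /* bytes "sent" per step; an arbitrary illustrative value */

typedef enum { TASK_SENDING, TASK_AWAITING_ACK, TASK_DONE } task_state;

/* One upload, represented as inert data rather than as a thread. */
typedef struct {
    task_state state;
    size_t bytes_left;   /* payload still to send */
    int acked;           /* set once the server has acknowledged this upload */
} upload_task;

/* Advance one task by a single step. Returns 1 if it made progress. */
static int task_step(upload_task *t) {
    switch (t->state) {
    case TASK_SENDING:
        t->bytes_left -= (t->bytes_left < CHUNK) ? t->bytes_left : CHUNK;
        if (t->bytes_left == 0)
            t->state = TASK_AWAITING_ACK;
        return 1;
    case TASK_AWAITING_ACK:
        if (!t->acked)
            return 0;            /* nothing to do until the ack arrives */
        t->state = TASK_DONE;
        return 1;
    case TASK_DONE:
        return 0;
    }
    return 0;
}

/* One pass of a worker: step every task once, report how many progressed. */
int worker_pass(upload_task *tasks, size_t ntasks) {
    assert(tasks != NULL || ntasks == 0);
    int progressed = 0;
    for (size_t i = 0; i < ntasks; i++)
        progressed += task_step(&tasks[i]);
    return progressed;
}

/* Models the server acknowledging a whole batch at once, as in the thread. */
int ack_all_waiting(upload_task *tasks, size_t ntasks) {
    int acked = 0;
    for (size_t i = 0; i < ntasks; i++) {
        if (tasks[i].state == TASK_AWAITING_ACK && !tasks[i].acked) {
            tasks[i].acked = 1;
            acked++;
        }
    }
    return acked;
}
```

Several workers on ordinary kernel threads could each own a slice of the task array and call `worker_pass` in a loop. Because a task is only touched between steps, there is no suspended stack and therefore no autorelease pool left open across a switch.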
Re: User-space threads and NSAutoreleasePool
On Thu, Mar 18, 2010 at 12:05 PM, Jens Alfke j...@mooseyard.com wrote:

> On Mar 18, 2010, at 8:35 AM, Michael Ash wrote:
>
>> Cocoa keeps around a lot of thread-specific state. In addition to autorelease pools, you also have exception handlers, graphics contexts, and possibly others.
>
> Yup. I quickly ran into this in 2008 when experimenting with implementing coroutines (which use the same ucontext stuff).
>
> Lightweight threads/coroutines can be very useful for highly scalable systems (that's one reason there's a lot of hype about Erlang these days), but you can't graft them on top of a runtime that doesn't know about them and already has its own threading support.
>
> I haven't had a chance yet to use the Grand Central / dispatch-queue stuff in 10.6, but I believe that it offers some similar functionality, like being able to create huge numbers of concurrent operations without having each one create a kernel thread.

GCD sort of kind of offers this. The big difference with GCD is that the individual jobs submitted to GCD can't be preempted, except by the normal rules of preemptive multithreading (GCD worker threads are just pthreads). If you load up GCD with enough jobs to keep your CPU busy, then submit one more job, that job has to wait until a running job completes before it gets a chance to run. You can get more granularity by dividing your work into smaller jobs, of course.

Where GCD can really shine for tasks like the one the original poster mentioned is with dispatch sources. These allow you to, among other things, manage asynchronous I/O using callbacks in a way that's pretty easy to write and gives you good performance. You can basically just load up GCD with file descriptors to monitor, and it'll call you when data is available (or when space is available for writing). By using blocks, your code looks close to what it would look like with synchronous I/O, you don't have to use one kernel thread for each I/O source, and if you do intensive computations on multiple I/O sources, GCD will give you a multicore speedup pretty much for free.

Mike
User-space threads and NSAutoreleasePool
Hi everyone,

Some setup, first. If you just want to jump to the question, jump to the last paragraph.

OS X includes (as part of its UNIX heritage) functions for implementing user-space threading. (See, for example, the man page on swapcontext: http://developer.apple.com/Mac/library/documentation/Darwin/Reference/ManPages/man3/swapcontext.3.html.) This allows developers to swap out the current stack for another, in effect allowing two threads of execution on a single kernel thread.

Using kernel-level threads is, naturally, simpler, but due to the cost of kernel-level context switches, they don't scale well. (You just can't run 500 threads at once very well.) User-space threads require more programmer attention, certainly, but also allow certain programming models that can't be done otherwise. For example, they can be used for coroutines, cooperative threading, or all sorts of esoteric-yet-sometimes-useful effects.

I say all that in an effort to avoid the frequent "you're fighting the frameworks" responses. I'm well familiar with the frameworks, GCD, etc., and in this particular case, user-space threads are what I need.

The problem is that when you call swapcontext() to switch the user-thread running on a kernel-thread, the NSAutoreleasePool stack is not swapped out. It remains rooted in thread-local storage, and serious problems follow. Let me give an example.

    - (void)doStuff {
        NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
        // do some stuff that calls swapcontext()
        [pool drain];
    }

-doStuff calls swapcontext(), trusting that the other user-thread will eventually call swapcontext() and restore the flow to this user-thread. Further, assume that the second user-thread also allocates an autorelease pool before returning. Despite being on separate user-threads, there is still only one kernel-thread. Since the autorelease pool stack is in (kernel-)thread-local storage, the second user-thread's pool goes on top of the same stack:

Autorelease pool stack: pool_from_uthread1 - pool_from_uthread2

Now, when we swap back to the first user-thread, it will release its autorelease pool. This, naturally, releases the second pool as well. When we swap back to the second user-thread, anything that was autoreleased is now dead and gone.

Okay, so that's the setup. Obviously, the problem is that the two user-space threads are sharing an autorelease pool stack that is intended to be thread-local. My question, then, is whether there exists a way to get and set the autorelease pool stack, so that before calling swapcontext(), I could put it in a state appropriate for this user-level thread? I assume it is being stored in thread-local storage, but it's not in the NSThread threadDictionary, which means it's probably set using pthread_setspecific. Accessing that value would require the key used to store it, but naturally I don't have access to that. So is there some existing function call that allows such access?

Thanks for listening,

-BJ Homer
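The failure described in the post above can be reproduced in plain C with a toy model. This is a sketch under stated assumptions: `pool_stack`, `pool_push` and `pool_drain` are hypothetical stand-ins for NSAutoreleasePool bookkeeping (the real storage mechanism is not documented in this thread; `pthread_setspecific` is used here only because the post guesses at it), and `pool_drain` models the behaviour the post describes, where draining a pool also destroys every pool above it. Two ucontext user-threads run on one kernel thread and therefore share one stack of pools.

```c
#define _XOPEN_SOURCE 600    /* macOS requires this for the ucontext routines */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <ucontext.h>

#define MAX_POOLS  8
#define STACK_SIZE (64 * 1024)

/* Toy per-kernel-thread pool stack; it records only how many pools are open. */
typedef struct { int depth; } pool_stack;

static pthread_key_t pool_key;
static pthread_once_t pool_once = PTHREAD_ONCE_INIT;
static void make_key(void) { pthread_key_create(&pool_key, free); }

static pool_stack *current_stack(void) {
    pthread_once(&pool_once, make_key);
    pool_stack *s = pthread_getspecific(pool_key);
    if (s == NULL) {
        s = calloc(1, sizeof *s);
        assert(s != NULL);
        pthread_setspecific(pool_key, s);
    }
    return s;
}

/* Open a pool; the returned token is its position on the stack. */
static int pool_push(void) {
    pool_stack *s = current_stack();
    assert(s->depth < MAX_POOLS);
    return s->depth++;
}

/* Drain the pool at `token` and everything above it; returns pools destroyed. */
static int pool_drain(int token) {
    pool_stack *s = current_stack();
    int destroyed = s->depth - token;
    s->depth = token;
    return destroyed;
}

static ucontext_t main_ctx, ctx1, ctx2;
static int destroyed_by_uthread1, uthread2_pool_survived;

static void uthread1(void) {
    int pool = pool_push();
    swapcontext(&ctx1, &ctx2);                  /* yield with our pool open */
    destroyed_by_uthread1 = pool_drain(pool);   /* also pops uthread2's pool */
}                                               /* returning resumes ctx2 */

static void uthread2(void) {
    int pool = pool_push();                     /* lands on the SAME stack */
    swapcontext(&ctx2, &ctx1);                  /* yield with our pool open */
    uthread2_pool_survived = current_stack()->depth > pool;
}                                               /* returning resumes main_ctx */

/* Runs both user-threads on the calling kernel thread; returns 0 on success. */
int run_demo(int *destroyed, int *survived) {
    char *stack1 = malloc(STACK_SIZE), *stack2 = malloc(STACK_SIZE);
    if (stack1 == NULL || stack2 == NULL) { free(stack1); free(stack2); return -1; }

    getcontext(&ctx1);
    ctx1.uc_stack.ss_sp = stack1;
    ctx1.uc_stack.ss_size = STACK_SIZE;
    ctx1.uc_link = &ctx2;
    makecontext(&ctx1, uthread1, 0);

    getcontext(&ctx2);
    ctx2.uc_stack.ss_sp = stack2;
    ctx2.uc_stack.ss_size = STACK_SIZE;
    ctx2.uc_link = &main_ctx;
    makecontext(&ctx2, uthread2, 0);

    swapcontext(&main_ctx, &ctx1);              /* start uthread1 */

    free(stack1);
    free(stack2);
    *destroyed = destroyed_by_uthread1;
    *survived = uthread2_pool_survived;
    return 0;
}
```

In this model the first user-thread's drain destroys two pools rather than one, and the second user-thread resumes to find its pool already gone. Swapping a per-user-thread `pool_stack` pointer in and out around each `swapcontext()` call would fix the toy, which is exactly the hook the post is asking Cocoa to expose.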
Re: User-space threads and NSAutoreleasePool
BJ Homer wrote:

> I say all that in an effort to avoid the frequent "you're fighting the frameworks" responses. I'm well familiar with the frameworks, GCD, etc., and in this particular case, user-space threads is what I need.
>
> The problem is that when you call swapcontext() to switch the user-thread running on a kernel-thread, the NSAutoreleasePool stack is not swapped out.

Two main questions come to mind:
Q1. What are you trying to accomplish?
Q2. Why do you think this would work?

More on Q1: You said you need user-space threads, but you gave no reason or rationale. If it's because you need 500 threads, then why do you need 500 threads? Exactly what are you trying to accomplish?

More on Q2: The ucontext(3) functions appear to me to be more intended for signal-handling, specifically for alt-signal-stack handling. The nature of signals is that they eventually return (nested), they don't work in a round-robin or any other scheduling.

That's not a question, just prep. The question is this: Are you sure the Objective-C runtime is compatible with ucontext user-threads?

There are lots of things you can't do in signal-handlers, not even handlers written in plain C (ref: man sigaction, the list of safe functions). If you can't call malloc() in a signal-handler (and I'm pretty sure you can't), what do you expect to happen with your Objective-C user-threads, since object allocation is tied to malloc()?

Oh, and as for "Unix heritage", the ucontext functions don't appear until 10.5. Signal-handling with sigaction and alt-sig-stack are older, but their limitations are equally old: AFAIK it's never been safe to call malloc() in a signal handler.

-- GG