Re: [Mono-dev] NancyFX self hosting (HttpListener) locking up on linux
Nikita,

I'm curious whether this is the same issue. Have you tried applying this patch: https://github.com/mono/mono/pull/703/ ? Make sure you DO NOT set MONO_DISABLE_AIO, as only the epoll/kqueue backend is patched there. I'm not sure it is related, though.

Thank you,
-yuriy

On Tue, Aug 6, 2013 at 7:42 PM, Nikita Tsukanov kek...@gmail.com wrote:

Hello. I've written a simple SCGI server, configured nginx to work with it, hammered it with jmeter and... got the same issue. It works for a while and then stops accepting new connections (some existing ones are still waiting in CLOSE_WAIT). I get it with new NetworkStream(TcpListener.AcceptSocket()) and BeginWrite/BeginRead. MONO_DISABLE_AIO actually helps it survive for 20-30 more seconds, but the result is the same. Now I'll try some alternative way of working with sockets, maybe libuv or something like that.

Ubuntu 13.04, Mono JIT compiler version 3.2.0 (tarball Tue Jul 30 21:08:00 UTC 2013)

2013/8/5 Alfred Hall ah...@ahall.org

Getting similar issues when using FastCGI/XSP proxied via Nginx using tcp and unix sockets (tried both). After hammering it with jmeter it ends up locking up. I'm wondering if other mono webapps have this issue. It would be useful if someone else here could do a similar load test using jmeter and let me know if it happens to them.

-Original message-
From: Alfred Hall ah...@ahall.org
Sent: Sunday 4th August 2013 17:41
To: Nikita Tsukanov kek...@gmail.com; mono-devel-list@lists.ximian.com
Subject: Re: [Mono-dev] NancyFX self hosting (HttpListener) locking up on linux

Hi Nikita.
Full thread dump:

threadpool thread tid=0x0x7fc4ad29d700 this=0x0x7fc4ad584c70 thread handle 0x80f state : not waiting owns ()

IO Threadpool worker tid=0x0x7fc4ad25c700 this=0x0x7fc4ad584dd0 thread handle 0x810 state : interrupted state owns ()

IO Threadpool worker tid=0x0x7fc4a7567700 this=0x0x7fc4a741d350 thread handle 0x845 state : interrupted state owns ()

Threadpool worker tid=0x0x7fc4ac39a700 this=0x0x7fc4a6192270 thread handle 0x837 state : interrupted state owns ()
  at unknown 0x
  at (wrapper managed-to-native) object.__icall_wrapper_mono_gc_alloc_vector (intptr,intptr,intptr) 0x
  at (wrapper alloc) object.AllocVector (intptr,intptr) 0x
  at System.Net.HttpConnection.BeginReadRequest () 0x0003a
  at System.Net.EndPointListener.OnAccept (object,System.EventArgs) 0x00357
  at System.Net.Sockets.SocketAsyncEventArgs.OnCompleted (System.Net.Sockets.SocketAsyncEventArgs) 0x0002e
  at System.Net.Sockets.SocketAsyncEventArgs.AcceptCallback (System.IAsyncResult) 0x00336
  at System.Net.Sockets.SocketAsyncEventArgs.DispatcherCB (System.IAsyncResult) 0x0010f
  at (wrapper runtime-invoke) Module.runtime_invoke_void__this___object (object,intptr,intptr,intptr) 0x

I then tried the same again and only got this in my trace:

Full thread dump:

threadpool thread tid=0x0x7f31b8ac5700 this=0x0x7f31b8da4c70 thread handle 0x80e state : not waiting owns ()

IO Threadpool worker tid=0x0x7f31b8a84700 this=0x0x7f31b8da4dd0 thread handle 0x80f state : interrupted state owns ()

Not sure why I'm not getting any stack traces here. Is there any more debugging I can do? What seems to happen is that it copes well with the requests initially, and then all of a sudden it stops accepting connections and existing connections don't die off.

Cheers,
Alf

-Original message-
From: Nikita Tsukanov kek...@gmail.com
Sent: Sunday 4th August 2013 16:13
To: mono-devel-list@lists.ximian.com
Subject: Re: [Mono-dev] NancyFX self hosting (HttpListener) locking up on linux

Alfred, please try sending SIGQUIT to mono (i.e. kill -SIGQUIT {PID}) when it stops processing requests. It will force mono to write thread stack traces to stdout. Grab them and post them here; I suspect that the issue is similar to the one I experienced.

--
Yuriy Solodkyy (y.solod...@gmail.com)
___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list
Re: [Mono-dev] NancyFX self hosting (HttpListener) locking up on linux
Hi, two notes:

1) "That should have the fix in from https://github.com/ysw, but setting MONO_DISABLE_AIO should have worked around that anyway, as it's meant to bypass the epoll backend."

The problem we attempted to fix with async sockets is not limited to the epoll backend; I could reproduce it with the plain poll backend as well. Moreover, the patch we submitted only addresses it for epoll/kqueue and leaves the poll backend unpatched.

2) From what I understand, it is very unlikely to be related to the problem described here. The problem we fixed can only be observed if you have parallel read and write operations on the same socket, which is unlikely to happen in RPC-style protocols like HTTP unless the client does request pipelining. However, if it is the same problem, MONO_DISABLE_AIO won't help, as the poll backend is no better than epoll/kqueue in this case.

-yuriy

On Sun, Aug 4, 2013 at 2:17 PM, Alfred Hall ah...@ahall.org wrote:

I meant to say I tried master too:

root@mulder:~# /opt/ahall-mono/bin/mono -V
Mono JIT compiler version 3.3.0 (master/2354865 Sun Aug 4 00:42:51 BST 2013)
Copyright (C) 2002-2012 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
    TLS:           __thread
    SIGSEGV:       altstack
    Notifications: epoll
    Architecture:  amd64
    Disabled:      none
    Misc:          softdebug
    LLVM:          supported, not enabled.
    GC:            sgen

That should have the fix in from https://github.com/ysw, but setting MONO_DISABLE_AIO should have worked around that anyway, as it's meant to bypass the epoll backend.

My Nancy service is literally just returning a very simple JSON:

    public class HelloWorldModule : NancyModule
    {
        public HelloWorldModule()
        {
            Get["/"] = parameters =>
            {
                return Response.AsJson(new HomeResponse { Message = "Test" });
            };
        }
    }

In JMeter I'm using 100 threads and a loop count of 100, and it locks up after about 15 seconds, even over the network. Very odd.

-Original message-
From: "Andrés G. Aragoneses" kno...@gmail.com
Sent: Sunday 4th August 2013 10:03
To: mono-devel-list@lists.ximian.com
Subject: Re: [Mono-dev] NancyFX self hosting (HttpListener) locking up on linux

On 04/08/13 03:07, Alfred Hall wrote:

Hi there. I'm running the NancyFX web framework in self-hosting mode, which uses HttpListener. That basically means I deploy an executable, run it, and it starts a webserver on localhost on a TCP port of my choice. I then use nginx to proxy to it.

I first ran my executable on my MacBook Pro and bombarded it with jmeter; it coped fine with no issues. I then deployed it on my Debian wheezy 64-bit Linux box running on top of KVM, using mono 3.2.0, and performed the same jmeter experiment. It locks up fairly quickly and won't take any new requests. I've tried using both sgen and boehm; they behave similarly, although it seems to lock up faster when using sgen. I also tried enabling MONO_DISABLE_AIO, but it doesn't make any difference.

Has anyone had similar issues? I tried self-hosted ServiceStack, which also uses HttpListener, and had similar issues, so I find it unlikely that the bug is in NancyFX itself. I tried installing mono 2.10.8 to check whether this is a regression, but I get exactly the same results. Any idea how I can debug this further, or what could be going on?

Hey Alfred. Could you try mono master (3.3) instead of 3.2? I mention this because this pull request [1] has been merged recently, which could help in this situation (and wouldn't make much difference on Mac, since the problem on that platform is worked around by avoiding kqueue [2]).

[1] https://github.com/mono/mono/pull/703
[2] https://github.com/mono/mono/blob/master/configure.in#L1823

--
Yuriy Solodkyy (y.solod...@gmail.com)
___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list
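
For anyone wanting to repeat the jmeter test without Nancy or ServiceStack in the picture, a bare HttpListener responder along the following lines might help isolate the problem. This is only a sketch, not the hosting code either framework uses: the class name HttpListenerSketch, the prefix passed in, and the JSON body are assumptions made for illustration, and requests are handled one at a time on a single background thread.

```csharp
using System;
using System.Net;
using System.Text;
using System.Threading;

static class HttpListenerSketch
{
    // Starts a listener on the given prefix (e.g. "http://localhost:8080/")
    // and answers every request with a fixed JSON body.
    public static HttpListener Start(string prefix)
    {
        var listener = new HttpListener();
        listener.Prefixes.Add(prefix);
        listener.Start();

        var worker = new Thread(() =>
        {
            try
            {
                while (listener.IsListening)
                {
                    // Blocks until a request arrives.
                    HttpListenerContext ctx = listener.GetContext();
                    byte[] body = Encoding.UTF8.GetBytes("{\"message\":\"Test\"}");
                    ctx.Response.ContentType = "application/json";
                    ctx.Response.ContentLength64 = body.Length;
                    ctx.Response.OutputStream.Write(body, 0, body.Length);
                    ctx.Response.Close();
                }
            }
            // Stopping or closing the listener unblocks GetContext with one of these.
            catch (HttpListenerException) { }
            catch (ObjectDisposedException) { }
            catch (InvalidOperationException) { }
        });
        worker.IsBackground = true;
        worker.Start();
        return listener;
    }
}
```

Calling HttpListenerSketch.Start("http://localhost:8080/") and pointing jmeter (or nginx) at that port should show whether the lock-up also happens with HttpListener alone; call Stop() on the returned listener to shut it down.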
[Mono-dev] Fwd: Possible bug in System.Net.Sockets.Socket?
Hi,

We have a problem with mono sometimes not invoking a callback when completing an operation. We use a patch from gonzalo.m...@gmail.com which fixes the problem for us. You can try building mono with this patch and check if it helps.

https://github.com/ysw/mono-socket-problem/blob/master/Patches/cb_fix.patch

-yuriy

On Tue, Jan 29, 2013 at 1:47 PM, Esben Laursen hy...@hyber.dk wrote:

Hi Guys,

This is my first post to the development list, so I hope I have found the correct place for my questions.

I have a problem with sharpsnmplib (http://sharpsnmplib.codeplex.com/): when I run it on mono-3.0.3 it halts my program even though I have set a time-out of 3000 ms on the socket. I have a watchdog that resets the thread after about 10 minutes and tries again. I am fairly convinced that it is a problem in the framework, but of course I am not sure.

I tried to build a reproduction of my problem, where I implemented my method as async with a WaitOne time-out of twice the socket time-out. From time to time the WaitOne would trigger before the time-out of the socket. (I tested this on 2.6.7, 2.10.8 and 3.0.3, and they all had the issue.)

That led me to build a simple UDP socket client and server app. It works the following way:

1. The client sends a UDP packet to the server.
2. The server reads the socket and returns the same data back to the client.
3. The client reads the socket and closes it.
4. Then it loops back to the beginning.

This works great on Windows (and on mono if you run both apps locally). Then I tested with the server app running on my Windows 7 machine and the client app running on Debian (tried both 2.6.7 and 3.0.3), and it would only send out 20-30 UDP packets before it would halt, giving me this output on the console:

Operation on non-blocking socket would block
Operation on non-blocking socket would block
Operation on non-blocking socket would block
Operation on non-blocking socket would block
Operation on non-blocking socket would block

After a little while it would send out some more packets.

Here is a link to my code: http://pastebin.com/3ehqSpWM

My questions are the following:

1. I guess this is a bug; should I file it in bugzilla?
2. If I have a multithreaded app that sends out hundreds or thousands of UDP requests, do you think this could cause the socket to block for 10 minutes at a time?

I am not sure it's the same problem. The problem I have with my code seems to be worse on mono-3.0.3, but it is not worse in my reproduction app (although that one is not threaded).

What are your thoughts? Many thanks for your help.

Cheers,
Esben

--
Yuriy Solodkyy (y.solod...@gmail.com)
___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list
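
The echo loop Esben describes can be sketched roughly as below. This is not the pastebin code: it is a simplified, blocking variant that runs client and server in one process over loopback, and the names UdpEchoSketch and Run, the payload, and the use of ReceiveTimeout are all assumptions made for illustration. It may still be a handy starting point for comparing behavior across mono versions, since a round that times out is counted rather than hanging the loop.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

static class UdpEchoSketch
{
    // Runs `rounds` echo round-trips over loopback and returns how many
    // replies arrived within `timeoutMs` milliseconds.
    public static int Run(int rounds, int timeoutMs)
    {
        // Port 0 lets the OS pick a free port for the echo server.
        using (var server = new UdpClient(new IPEndPoint(IPAddress.Loopback, 0)))
        {
            var serverEndPoint = (IPEndPoint)server.Client.LocalEndPoint;

            var serverThread = new Thread(() =>
            {
                try
                {
                    while (true)
                    {
                        var from = new IPEndPoint(IPAddress.Any, 0);
                        byte[] data = server.Receive(ref from);   // read a packet
                        server.Send(data, data.Length, from);     // send it straight back
                    }
                }
                // Disposing the server socket ends the loop.
                catch (SocketException) { }
                catch (ObjectDisposedException) { }
            });
            serverThread.IsBackground = true;
            serverThread.Start();

            int succeeded = 0;
            for (int i = 0; i < rounds; i++)
            {
                // A fresh client socket per round, closed after the reply,
                // mirroring the send / read / close loop described above.
                using (var client = new UdpClient())
                {
                    client.Client.ReceiveTimeout = timeoutMs;
                    byte[] payload = Encoding.ASCII.GetBytes("ping " + i);
                    client.Send(payload, payload.Length, serverEndPoint);
                    try
                    {
                        var from = new IPEndPoint(IPAddress.Any, 0);
                        byte[] reply = client.Receive(ref from);
                        if (reply.Length == payload.Length) succeeded++;
                    }
                    catch (SocketException)
                    {
                        // Timed out (or failed); count the round as lost and move on.
                    }
                }
            }
            return succeeded;
        }
    }
}
```

For example, UdpEchoSketch.Run(1000, 3000) returns the number of rounds that completed; a result well below 1000 on loopback would point at the runtime rather than the network.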
Re: [Mono-dev] Marshal.GetFunctionPointerForDelegate and non-mono threads
Do you keep a reference to your delegate while using the pointer? I suspect the GC just collects your delegate and the function pointer becomes invalid.

-yuriy

On Tuesday, January 29, 2013, wrote:

We are not using the debugger. We're not sure how the library in question creates its threads. We don't have access to its source code and it's proprietary. Putting together a full repro would be hard. The callback is a simple function which picks up a logged string and sends it to NLog by way of an Rx Subject. That's a lot of moving parts, but they all work fine when the callback comes from one of our threads.

Am I correct in assuming that GetFunctionPointerForDelegate should automatically register the thread it's called on with mono? I have enough facts at hand to call the registration function manually if need be, but it would be awkward indeed.

On Jan 28, 2013, at 6:34 PM, Alan alan.mcgov...@gmail.com wrote:

Do you see these issues when running with the soft debugger attached? If so, that was a bug which was fixed a few days ago. If you're seeing the issue without the debugger, a small test case would be great for figuring this out.

Alan

On 28 January 2013 18:42, sebastian sebast...@palladiumconsulting.com wrote:

We run a program under mono which uses a 3rd party C++ library. Mono is responsible for running the application; that is, we are not using the mono_embed API, but rather just PInvoke to talk to the C++ library.

This library has some callbacks which we subscribe to using Marshal.GetFunctionPointerForDelegate. However, we get exotic concurrency problems (seg faults, inexplicable stack traces) when we use it. We only get these errors when the callback is made from a thread which was not started by us.

I know that when embedding mono, i.e. with C++ in the driver's seat, threads must be registered with mono using mono_thread_attach. However, that would be a funny thing for us to do, since we're not launching mono ourselves and would have to do some exploration to find all the right pointers.

Does the code in GetFunctionPointerForDelegate emit a managed wrapper that takes care of this detail? Once we are called back on this foreign thread, there's no telling what or how much .NET code will run on it, and it presumably needs to be properly registered with the garbage collector.

I looked at the code in mono_marshal_get_managed_wrapper and didn't see anything obviously related to threading, but I imagine it'd be taken care of at a lower level in any case. We could easily be convinced the bugs we saw were GC or threading related, as they could only be explained by corruption of things that shouldn't be corruptible, like reflection and array bounds.

Or is there additional code or attributes we should be using to ensure correct operation?

Thanks,
Sebastian

--
Yuriy Solodkyy (y.solod...@gmail.com)
___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list
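
To illustrate Yuriy's question about keeping a reference: one common way to make sure the delegate outlives the native function pointer is to store it in a static field for as long as native code may call it. The sketch below assumes a callback that takes a single string and uses the cdecl calling convention; the names NativeLogBridge and LogCallback are hypothetical, and the actual registration call into the third-party library is left as a comment because its API is not known here. It only addresses delegate lifetime, not the separate question of registering foreign threads with the runtime.

```csharp
using System;
using System.Runtime.InteropServices;

static class NativeLogBridge
{
    // Assumed callback shape; the real library's signature and calling
    // convention may differ.
    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    public delegate void LogCallback(string message);

    // The static field keeps the delegate reachable, so the GC cannot
    // collect it while native code still holds the function pointer.
    static LogCallback keepAlive;

    public static IntPtr Register(LogCallback handler)
    {
        if (handler == null) throw new ArgumentNullException("handler");

        keepAlive = handler;
        IntPtr fn = Marshal.GetFunctionPointerForDelegate(keepAlive);

        // Hypothetical: hand `fn` to the native library here through
        // whatever P/Invoke entry point it exposes for callbacks.
        return fn;
    }

    public static void Unregister()
    {
        // Hypothetical: first tell the native library to stop calling back,
        // and only then drop the reference.
        keepAlive = null;
    }
}
```

If the delegate is instead created inline at the P/Invoke call site and never stored, nothing on the managed side references it once the call returns, which matches the failure mode suggested above.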
[Mono-dev] Is #693996 fixed? -- UriTemplate doesn't support named wildcards
Hi,

The release notes for 2.10.3 [ http://www.mono-project.com/Release_Notes_Mono_2.10.3 ] say:

    Fixed #693996 UriTemplate doesn't support named wildcards

However, the behavior of 2.10.8 differs from .NET. See https://gist.github.com/3488351

These tests are green in .NET, but under mono I get:

Test(s) failed. Expected string length 3 but was 7. Strings differ at index 0.
Expected: 123
But was: a/b/123
---^
  at NUnit.Framework.Assert.That (System.Object actual, IResolveConstraint expression, System.String message, System.Object[] args) [0x0] in <filename unknown>:0
  at NUnit.Framework.Assert.AreEqual (System.Object expected, System.Object actual) [0x0] in <filename unknown>:0

Test(s) failed. Expected string length 7 but was 11. Strings differ at index 0.
Expected: 123/456
But was: a/b/123/456
---^
  at NUnit.Framework.Assert.That (System.Object actual, IResolveConstraint expression, System.String message, System.Object[] args) [0x0] in <filename unknown>:0
  at NUnit.Framework.Assert.AreEqual (System.Object expected, System.Object actual) [0x0] in <filename unknown>:0

Thank you

--
Yuriy Solodkyy (y.solod...@gmail.com)
___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list
Re: [Mono-dev] Is #693996 fixed? -- UriTemplate doesn't support named wildcards
Andres,

At the very least, the old bug description at novell.com is outdated. The exception message reported there is:

Unhandled Exception: System.FormatException: Wildcard in UriTemplate is valid only if it is placed at the last part of the path: '{*path}'

Nowadays, however, mono accepts this template, but the matching behavior is incorrect.

Thank you,
Yuriy

On Mon, Aug 27, 2012 at 5:23 PM, Andres G. Aragoneses kno...@gmail.com wrote:

Maybe it was a typo in the release notes? The bug is still open. You can look at the git log of the 2.10.3 tag to try to find the change: https://github.com/mono/mono/tree/2.10.3

On 27/08/12 14:27, Yuriy Solodkyy wrote:

Hi,

The release notes for 2.10.3 [ http://www.mono-project.com/Release_Notes_Mono_2.10.3 ] say:

    Fixed #693996 UriTemplate doesn't support named wildcards

However, the behavior of 2.10.8 differs from .NET. See https://gist.github.com/3488351

These tests are green in .NET, but under mono I get:

Test(s) failed. Expected string length 3 but was 7. Strings differ at index 0.
Expected: 123
But was: a/b/123
---^
  at NUnit.Framework.Assert.That (System.Object actual, IResolveConstraint expression, System.String message, System.Object[] args) [0x0] in <filename unknown>:0
  at NUnit.Framework.Assert.AreEqual (System.Object expected, System.Object actual) [0x0] in <filename unknown>:0

Test(s) failed. Expected string length 7 but was 11. Strings differ at index 0.
Expected: 123/456
But was: a/b/123/456
---^
  at NUnit.Framework.Assert.That (System.Object actual, IResolveConstraint expression, System.String message, System.Object[] args) [0x0] in <filename unknown>:0
  at NUnit.Framework.Assert.AreEqual (System.Object expected, System.Object actual) [0x0] in <filename unknown>:0

Thank you

--
Yuriy Solodkyy (y.solod...@gmail.com)
___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list
Re: [Mono-dev] ConcurrentStack with value type in 2.10
Hi All,

I just checked this on a freshly built mono from master. It is not a problem you can see on every run; you need to run on 8 cores to observe it frequently enough. My environment is Linux x64 (tried: ubuntu 10, ubuntu 12, opensuse 12). Typically, if you start the test app and it does not crash, it will not crash later. However, I just got it again; the stack trace is below. This is just a different flavor of the same problem: sometimes ConcurrentStack returns inconsistent data, sometimes it crashes.

See the code sample at https://github.com/ysw/mono-socket-problem/tree/master/ConcurrentTest

ubuntu@ip-10-244-0-134:~/mono-socket-problem/ConcurrentTest/bin/Debug$ m CocurrentTest.exe
Hello World!

Unhandled Exception:
mono() [0x49545d]
mono() [0x497079]
mono() [0x49918b]
mono() [0x4f0e67]
[0x4199f9ac]
[ERROR] FATAL UNHANDLED EXCEPTION: System.IndexOutOfRangeException: Array index is out of range.
  at (wrapper stelemref) object:virt_stelemref_class (intptr,object)
  at System.Collections.Concurrent.ObjectPool`1[T].Release (System.Collections.Concurrent.T obj) [0x0] in <filename unknown>:0
  at System.Collections.Concurrent.ConcurrentStack`1[CocurrentTest.MainClass+Data].TryPop (CocurrentTest.Data result) [0x0] in <filename unknown>:0
  at CocurrentTest.MainClass+Mainc__AnonStorey0.m__0 (System.Object v) [0x0] in <filename unknown>:0
  at System.Threading.Thread.StartInternal () [0x0] in <filename unknown>:0
ubuntu@ip-10-244-0-134:~/mono-socket-problem/ConcurrentTest/bin/Debug$

I am updating https://bugzilla.xamarin.com/show_bug.cgi?id=6229 as well.

--yuriy

On Mon, Jul 23, 2012 at 3:45 PM, Alan alan.mcgov...@gmail.com wrote:

I cannot reproduce the problem either. What exact version of 2.10 did you test against? It's possible the bug has already been fixed in a newer release of the 2.10 series.

Alan

On 23 July 2012 13:32, Rodrigo Kumpera kump...@gmail.com wrote:

Hi Yuriy,

With how many cores and on what CPU did you manage to reproduce this? I'm running this on my 4-core Nehalem Mac without any luck. I'll diff ConcurrentStack between 2.8 and 2.10 to see what it could be.

On Sun, Jul 22, 2012 at 5:10 AM, Yuriy Solodkyy yu...@couldbedone.com wrote:

Hi,

It looks like ConcurrentStack does not work with big enough structures anymore. A 12-byte struct is enough to reproduce the problem occasionally; a 16-byte struct reproduces it immediately. It worked fine in mono 2.8.

The following code shows that we may pop an inconsistent structure from the stack from time to time.

    using System;
    using System.Collections.Concurrent;

    namespace CocurrentTest
    {
        class MainClass
        {
            // 16-byte struct with the invariant A == -B == C == -D.
            struct Data
            {
                public int A;
                public int B;
                public int C;
                public int D;

                public Data(int v)
                {
                    A = v;
                    B = -v;
                    C = v;
                    D = -v;
                }
            }

            public static void Main (string[] args)
            {
                Console.WriteLine ("Hello World!");
                var data = new byte[1024 * 1024];
                var stack = new ConcurrentStack<Data> ();
                // 50 threads push and pop random numbers of samples forever.
                for (var i = 0; i < 50; i++) {
                    var thread = new System.Threading.Thread (v => {
                        var rnd = new Random ();
                        while (true) {
                            int pushCount = rnd.Next (50);
                            int popCount = rnd.Next (50);
                            for (var k = 0; k < pushCount; k++) {
                                var sample = new Data (rnd.Next (Int32.MaxValue));
                                CheckSample (sample);
                                stack.Push (sample);
                            }
                            for (var k = 0; k < popCount; k++) {
                                Data retrievedSample = new Data ();
                                if (stack.TryPop (out retrievedSample)) {
                                    CheckSample (retrievedSample);
                                }
                            }
                        }
                    });
                    thread.Start ();
                }
            }

            // Throws if the struct's invariant has been broken.
            static void CheckSample (Data sample)
            {
                if (sample.A != -sample.B || sample.A != sample.C || sample.B != sample.D)
                    throw new Exception (string.Format ("Invalid sample detected"));
            }
        }
    }

--
Yuriy Solodkyy
[Mono-dev] ConcurrentStack with value type in 2.10
Hi,

It looks like ConcurrentStack does not work with big enough structures anymore. A 12-byte struct is enough to reproduce the problem occasionally; a 16-byte struct reproduces it immediately. It worked fine in mono 2.8.

The following code shows that we may pop an inconsistent structure from the stack from time to time.

    using System;
    using System.Collections.Concurrent;

    namespace CocurrentTest
    {
        class MainClass
        {
            // 16-byte struct with the invariant A == -B == C == -D.
            struct Data
            {
                public int A;
                public int B;
                public int C;
                public int D;

                public Data(int v)
                {
                    A = v;
                    B = -v;
                    C = v;
                    D = -v;
                }
            }

            public static void Main (string[] args)
            {
                Console.WriteLine ("Hello World!");
                var data = new byte[1024 * 1024];
                var stack = new ConcurrentStack<Data> ();
                // 50 threads push and pop random numbers of samples forever.
                for (var i = 0; i < 50; i++) {
                    var thread = new System.Threading.Thread (v => {
                        var rnd = new Random ();
                        while (true) {
                            int pushCount = rnd.Next (50);
                            int popCount = rnd.Next (50);
                            for (var k = 0; k < pushCount; k++) {
                                var sample = new Data (rnd.Next (Int32.MaxValue));
                                CheckSample (sample);
                                stack.Push (sample);
                            }
                            for (var k = 0; k < popCount; k++) {
                                Data retrievedSample = new Data ();
                                if (stack.TryPop (out retrievedSample)) {
                                    CheckSample (retrievedSample);
                                }
                            }
                        }
                    });
                    thread.Start ();
                }
            }

            // Throws if the struct's invariant has been broken.
            static void CheckSample (Data sample)
            {
                if (sample.A != -sample.B || sample.A != sample.C || sample.B != sample.D)
                    throw new Exception (string.Format ("Invalid sample detected"));
            }
        }
    }

--
Yuriy Solodkyy
___
Mono-devel-list mailing list
Mono-devel-list@lists.ximian.com
http://lists.ximian.com/mailman/listinfo/mono-devel-list
Re: [Mono-dev] TCP Async
Brett,

"No completion callback" is what proves the problem, unless there is a problem in the sample code itself. I tried rebuilding mono with the thread/pool socket implementation instead of the epoll one and still get the same problem. So I concluded it is not an epoll-specific problem.

-yuriy

On Thu, Jul 19, 2012 at 12:08 AM, Brett Ernst brett.e.er...@gmail.com wrote:

I've had some strangeness with the thread pool in the past, but never enough to get a solid, consistent repro that I could file a bug for. I don't know if this is related or not, but I've actually seen a simple Timer fail to generate callbacks under very high load (and on old hardware). Again, not reproducible enough to file a bug for, but enough to make me nervous.

When I run your mono-socket-problem code, I start seeing the "No completion callback" messages within 5 seconds, and then regularly every 5-10 seconds or so. I can't say for sure if the issues are related, but if they are, this is the best repro I've seen.

As you can imagine, I've grown to distrust the thread pool and thus async socket operations. I put some effort into digging through the mono internals, but without a solid repro and lacking a good understanding of the thread pool implementation, my ultimate solution was to give up and stop using async sockets altogether.

I took a different approach: I wrapped libev and POSIX sockets. Manos de Mono is an excellent example of this approach. So far, this has been rock solid and performs extremely well. Of course, the major downsides are: 1) it's platform-specific, and 2) it's totally single-threaded. I get around #2 by simply running multiple load-balanced nodes, one for each core. I still make light use of the thread pool for long-running operations that shouldn't block the message loop.

I only throw this out there as a possible alternative if you don't have any success resolving this issue. Our architecture fit very well into the event loop paradigm, but that may not work for everyone.
On Tue, Jul 17, 2012 at 7:47 AM, Greg Young gregoryyou...@gmail.com wrote:

Btw, to avoid confusion and duplicated work if someone starts on this, could we just sync a bit in this thread?

On Tuesday, July 17, 2012, Greg Young wrote:

Hey all.

This is a big issue for us, and I feel it is a huge problem for mono in general at this point, as it means sockets basically don't work, which is a strong point of unix environments. I would like to propose something I have done in the past: I am willing to offer a bounty (personally) of $500 USD for a working fix to this section of code (more if done quickly). The acceptance criterion is the included test working in a stable fashion on Linux / BSD, but just Linux is acceptable as well.

I honestly wish more people would do this kind of thing with OSS projects.

Cheers,
Greg

On Saturday, July 7, 2012, Yuriy Solodkyy wrote:

Hi Rodrigo,

Please find a small sample app at https://github.com/ysw/mono-socket-problem

This app can start in either server or client mode. The modes differ only in whether it listens for connections on multiple ports or connects to the server on multiple ports. Upon connecting to or accepting a connection, it immediately sends some data, and then sends the next chunk of data in response to any data received from the other side. There are some random delays in the code, and we limit outgoing traffic on each connection so that it is not significantly higher than inbound traffic.

There is also a separate thread which regularly checks the status of every connection and reports any connections that are awaiting a callback although their status obtained with socket.poll is already READY. (A delay of several seconds is still allowed.) See also the README file.

Also, it seems that constantly changing the min/max threads in the ThreadPool increases the probability of the problem. See the code.

Please let me know if this sample app works for you. Hope it helps.

Thank you,
Yuriy

We've been aware of such issues; could you file a bug and attach a test case with it, please? This would really, really help us fix it.

On Wed, Jun 27, 2012 at 4:08 AM, Greg Young gregoryyoung1 at gmail.com wrote:

We are experiencing an issue with async behaviours in sockets (with SendAsync/callback, not Begin/End). Our visible issue is that when in a send loop we will fail on our heartbeats. After debugging and counting calls into/out of SendAsync/callback, we see that we are inside of a call to SendAsync (e.g. it never returns, in our case for 10 seconds before we declare the socket dead). We can duplicate this fairly regularly on mac/bsd/linux, though it's inconsistent (sometimes it may happen repeatedly, other times it works fine). The code does not have such issues on MS CLR. We are also running on loopback, so it's unlikely that an underlying network problem is causing the hang-up.

The code itself is fairly straightforward (not that different than the MS example of the API except that it's fully