T wrote:
Hi all,

I'm a developer working on software that makes a fairly large number of 
concurrent outbound TCP/IP connections: about 15,000 or so.

I'm running build 57 on a few different machines, and the problem I am having 
is that after running my software for ~15 minutes, I can no longer connect to 
anything past our router.  It can still talk to anything on the local network, 
just nothing past the router (this includes ping or any other program).

This only seems to occur when I get upwards of 15k connections. The CPU load 
is about 20%, and the actual bandwidth usage is minimal. Other machines on the 
local network continue to function normally.

Also keep in mind that this software works fine on FreeBSD and Linux. I might 
have botched my 'event ports' implementation for Solaris, but that still 
shouldn't make the whole machine go dark. Indeed, the program seems to run 
nicely until the OS starts dropping any outbound packet destined for the 
Internet.

Other symptoms:
- netstat -r hangs after the problem occurs (probably because it can't talk to 
a DNS server)
- The system doesn't recover unless I do 'svcadm restart' on network/service 
or network/physical
- Problem still occurs if I run this script:  
http://everythingsolaris.org/software/tune_tcp
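
One way to gather more detail on the symptoms above is to summarize the TCP connection states once the problem appears, for example to see whether new connections are piling up in a SYN state. The following is only a rough sketch, assuming Python 3 is available and that the output of `netstat -an` has been saved as text. The helper name `count_tcp_states` is hypothetical, and the rule it uses (treat the last column as a state when it is an all-uppercase token) is a heuristic assumption, not a documented netstat format.

```python
import re
from collections import Counter

# Heuristic (an assumption): a TCP state is an all-uppercase token,
# possibly with digits or underscores, in the last column of a line.
STATE_PATTERN = re.compile(r"^[A-Z][A-Z0-9_]+$")


def count_tcp_states(netstat_output):
    """Count lines per connection state in saved `netstat -an` text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip blank lines, headers, and anything without a state-like last column.
        if fields and STATE_PATTERN.match(fields[-1]):
            counts[fields[-1]] += 1
    return counts
```

Calling `count_tcp_states(text).most_common()` on a capture taken before the problem and another taken after it would show whether the mix of states changes when the machine stops reaching hosts past the router.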

Does anyone have any idea what's causing it to basically stop talking to our 
router?  I'm free to try anything and give feedback.  I'll be checking back 
here frequently as well.
Is in.routed running on the system when you see this problem?
You can run netstat -nr, which prints numeric addresses and so will not 
contact the name server.
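
To tell a truly hung netstat from a merely slow one, the command can be run under a timeout. This is a minimal sketch, assuming Python 3.7 or later is installed on the machine; `run_with_timeout` is a hypothetical helper name, and the 10 second default is an arbitrary choice.

```python
import subprocess


def run_with_timeout(cmd, timeout_s=10.0):
    """Run cmd and return (finished, stdout).

    finished is False when the command did not exit within timeout_s
    seconds; in that case the child is killed and stdout is empty.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return False, ""
    return True, result.stdout
```

For example, `run_with_timeout(["netstat", "-nr"])` and `run_with_timeout(["netstat", "-r"])` could be compared after the problem occurs. If only the second call times out, name resolution is the likely cause of the hang rather than the routing table itself.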

Hope this helps.

Regards,
Ramesh.
Thanks
This message posted from opensolaris.org
_______________________________________________
networking-discuss mailing list
[email protected]
