T wrote:
Hi all,
I'm a developer working on software that makes a fairly large number of
concurrent outbound TCP/IP connections, about 15,000 or so.
I'm running build 57 on a few different machines, and the problem I am having
is that after running my software for ~15 minutes, I can no longer connect to
anything past our router. It can still talk to anything on the local network,
just nothing past the router (this includes ping or any other program).
This only seems to occur when I get upwards of 15k connections. The CPU load is
about 20%, and the actual bandwidth usage is minimal. Other machines on the
local network continue to function normally. Also keep in mind that this
software works fine on FreeBSD and Linux. I might have botched my 'event ports'
implementation for Solaris, but that still shouldn't make the whole machine go
dark. Indeed, the program seems to run nicely until the OS starts dropping every
outbound packet destined for anything beyond the router.
Other symptoms:
- netstat -r hangs after the problem occurs (probably because it can't talk to
a DNS server)
- The system doesn't recover unless I run 'svcadm restart network/service' or
'svcadm restart network/physical'
- Problem still occurs if I run this script:
http://everythingsolaris.org/software/tune_tcp
Does anyone have any idea what's causing it to basically stop talking to our
router? I'm free to try anything and give feedback. I'll be checking back
here frequently as well.
Is in.routed running on the system when you see this problem?
You can run netstat -nr instead (the -n flag prints numeric addresses, so it
will not contact the name server and should not hang on DNS).
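To see where the ~15k connections are sitting when the machine goes dark, it
may also help to tally them by TCP state. Below is a minimal sketch in Python,
assuming the output of netstat -an is fed in on stdin and that the state is the
last column of each connection line. The function name count_tcp_states and the
list of state names are illustrative assumptions, so adjust them for your
platform.

```python
import sys
from collections import Counter

# State names as they usually appear in the last column of netstat output.
# Assumption: IDLE and BOUND are included for Solaris; other platforms may
# spell some states differently, so edit this set as needed.
TCP_STATES = {
    "ESTABLISHED", "SYN_SENT", "SYN_RCVD", "FIN_WAIT_1", "FIN_WAIT_2",
    "TIME_WAIT", "CLOSE_WAIT", "CLOSING", "LAST_ACK", "LISTEN",
    "IDLE", "BOUND",
}


def count_tcp_states(netstat_output: str) -> Counter:
    """Count connection lines per TCP state.

    A line is counted only if its last whitespace-separated field is one of
    the names in TCP_STATES; headers and blank lines are skipped.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if fields and fields[-1] in TCP_STATES:
            counts[fields[-1]] += 1
    return counts


if __name__ == "__main__":
    # Read saved or piped netstat output and print states, most common first.
    for state, n in count_tcp_states(sys.stdin.read()).most_common():
        print(f"{n:8d}  {state}")
```

Saved under a hypothetical name such as tally_states.py, it could be run as
"netstat -an | python tally_states.py" before and after the problem appears,
to compare the two tallies.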
Hope this helps.
Regards,
Ramesh.
Thanks
This message posted from opensolaris.org
_______________________________________________
networking-discuss mailing list
[email protected]