On 6 July 2011 16:19, Matt Brozowski <bro...@opennms.org> wrote:
> There is a property in opennms.properties that allows you to set the
> max number of async connections it can do..
>
> Have you tried setting that?
>
> org.opennms.netmgt.provision.maxConcurrentConnector

Yes, we are doing that to mitigate the effects of this. However, I think
the point is that provisiond's usage of the library is broken (or at
least fundamentally inefficient). It means OpenNMS is using up far more
FDs than needed for each provision thread, and with the default of 2000
provision instances it's quite easy to kill.

>
> Matt
>
>
>
> On Wed, Jul 6, 2011 at 8:20 AM, Duncan Mackintosh <dmackint...@cbnl.com>
> wrote:
>> I've been doing a lot of digging around various 'Too many open files'
>> crashes we've been seeing locally, and I think I've pinned down a big leak
>> of file descriptors in provisiond's use of org.apache.mina connectors.
>>
>> What it's currently doing in AsyncBasicDetector#isServiceDetected:
>> - For each service, create a new NioSocketConnector
>> - Configure that connector with a handler, filters etc
>> - Make a connection out, check for results etc
>>
>> There seem to be two problems with this approach:
>>
>> 1) Constructing an NioSocketConnector creates a lot of 'anon_inode' and
>> 'pipe' file descriptors - on one machine it was 8 & 12 respectively and on
>> another 4/8, so I'm not sure quite what the difference is there (under
>> linux, at least; I assume some equivalent under Windows). The actual
>> connect() call only uses one more handle. This causes it to run out of
>> descriptors a lot faster than expected.
>>
>> 2) If new NioSocketConnector() crashes due to a "Too many open files"
>> exception, Mina sometimes just sort of falls over dead with
>> "NoClassDefFoundError: Could not initialize class
>> sun.nio.ch.FileDispatcher". This class does exist in my JVM (openjdk 6) and
>> if I reflectively inspect it first, it sometimes stops the crashes
>> happening. I'm pretty baffled there, to be honest. If it does get itself
>> into this state, you can't close existing sockets, you can't open new ones;
>> all the anon_inode and pipe FDs just sit there.
>> This seems to tally with behaviour we've witnessed in opennms instances
>> where we've had a Too many open files crash - lsof shows a few thousand
>> pipe/anon_inode handles just sitting around long after the crash.
>>
>> For reference, I've attached a simple test class that just opens ~60
>> connections using the current methodology. If you lsof the process while it
>> pauses, you can see how many new file descriptors are being created each
>> time; if you drop the 60 down to 50 it cleans up gracefully but at 60 it
>> doesn't seem possible to free the descriptors again (you'll need mina-core
>> and slf4j-log4j12 in a project to run it). I'd be quite interested to see if
>> others get the same behaviour I do.
>>
>> What I think Mina wants you to be doing is creating a single
>> NioSocketConnector to reuse everywhere and using the optional
>> IoSessionInitializer in .connect() to configure filters and attach state
>> objects to the IoSession. This would take a moderate overhaul of
>> AsyncBasicDetector, as the handler would need to be rewritten to be a
>> singleton that takes some state using IoSession.get/setAttribute rather than
>> having one handler per service detect attempt and probably a fair chunk of
>> refactoring at the same time.
>>
>> Before I embark on making those changes I wanted to throw this out there for
>> comment, and to see if there's already a refactor of this code planned (I
>> couldn't see any changes on the provisiond-refactor branch yet).
>>
>> Thanks,
>> Duncan Mackintosh (dijm)
>> Cambridge Broadband Networks Limited Registered in England and Wales under
>> company number: 03879840 Registered office: Selwyn House, Cambridge Business
>> Park, Cowley Road, Cambridge CB4 0WZ, UK. VAT number: GB 741 0186 64
>>
>> ------------------------------------------------------------------------------
>> All of the data generated in your IT infrastructure is seriously valuable.
>> Why?
>> It contains a definitive record of application performance, security
>> threats, fraudulent activity, and more. Splunk takes this data and makes
>> sense of it. IT sense. And common sense.
>> http://p.sf.net/sfu/splunk-d2d-c2
>> _______________________________________________
>> Please read the OpenNMS Mailing List FAQ:
>> http://www.opennms.org/index.php/Mailing_List_FAQ
>>
>> opennms-devel mailing list
>>
>> To *unsubscribe* or change your subscription options, see the bottom of this
>> page:
>> https://lists.sourceforge.net/lists/listinfo/opennms-devel
>>

-- 
Alex, homepage: http://www.bennee.com/~alex/
http://www.half-llama.co.uk
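
The descriptor cost described in point 1 of the quoted report can be reproduced without MINA. The sketch below is not the test class attached to the original message. It is a minimal, JDK-only illustration, and it assumes that the 'anon_inode' and 'pipe' descriptors come from the java.nio Selectors that an NIO-based connector opens internally. The class name SelectorFdCost and its two helper methods are hypothetical. The descriptor count relies on the com.sun.management.UnixOperatingSystemMXBean extension, so it only reports real numbers on Unix-like JVMs.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: compares the FD cost of many selectors with one.
class SelectorFdCost {

    /** Open descriptors of this JVM, or -1 where the Unix bean is unavailable. */
    static long openFds() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return -1;
    }

    /** Opens `count` selectors at once and returns how many descriptors they hold. */
    static long fdsUsedBySelectors(int count) throws IOException {
        List<Selector> selectors = new ArrayList<>();
        long before = openFds();
        try {
            for (int i = 0; i < count; i++) {
                selectors.add(Selector.open());
            }
            return openFds() - before;
        } finally {
            // Always release the selectors so repeated runs start from a clean state.
            for (Selector s : selectors) {
                s.close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // One shared selector (the "single connector" case) ...
        System.out.println("FDs held by 1 selector:   " + fdsUsedBySelectors(1));
        // ... versus one selector per detection attempt (the current behaviour).
        System.out.println("FDs held by 20 selectors: " + fdsUsedBySelectors(20));
    }
}
```

On a Linux JVM the second figure should grow roughly in proportion to the number of selectors, while the sockets themselves add only one descriptor each. In MINA terms this corresponds to the suggestion in the report: create one NioSocketConnector, reuse it for every detection attempt, and carry per-attempt state on the IoSession instead of on a fresh connector and handler.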