Just take a look at this issue...
http://issues.opennms.org/browse/NMS-4631
Antonio
On 6 Jul 2011, at 20:03, Alex Bennee wrote:
> On 6 July 2011 16:19, Matt Brozowski <bro...@opennms.org> wrote:
>> There is a property in opennms.properties that allows you to set the
>> max number of async connections it can do..
>>
>> Have you tried setting that?
>>
>> org.opennms.netmgt.provision.maxConcurrentConnector
>
> Yes, we are doing that to mitigate the effects of this. However, I think
> the point is that provisiond's usage of the library is
> broken (or at least fundamentally inefficient). It means OpenNMS is
> using up far more FDs than needed for each provision thread,
> and with the default of 2000 provision instances it's quite easy to kill.
>
>>
>> Matt
>>
>>
>>
>> On Wed, Jul 6, 2011 at 8:20 AM, Duncan Mackintosh <dmackint...@cbnl.com>
>> wrote:
>>> I've been doing a lot of digging around various 'Too many open files'
>>> crashes we've been seeing locally, and I think I've pinned down a big leak
>>> of file descriptors in provisiond's use of org.apache.mina connectors.
>>>
>>> What it's currently doing in AsyncBasicDetector#isServiceDetected:
>>> - For each service, create a new NioSocketConnector
>>> - Configure that connector with a handler, filters etc
>>> - Make a connection out, check for results etc
>>>
>>> There seem to be two problems with this approach:
>>>
>>> 1) Constructing an NioSocketConnector creates a lot of 'anon_inode' and
>>> 'pipe' file descriptors (under Linux, at least; I assume there is some
>>> equivalent under Windows). On one machine it was 8 and 12 respectively,
>>> and on another 4 and 8, so I'm not sure quite what the difference is
>>> there. The actual connect() call only uses one more handle. This causes
>>> it to run out of descriptors a lot faster than expected.
>>>
>>> 2) If new NioSocketConnector() crashes due to a "Too many open files"
>>> exception, Mina sometimes just sort of falls over dead with
>>> "NoClassDefFoundError: Could not initialize class
>>> sun.nio.ch.FileDispatcher". This class does exist in my JVM (openjdk 6) and
>>> if I reflectively inspect it first, it sometimes stops the crashes
>>> happening. I'm pretty baffled there, to be honest. If it does get itself
>>> into this state, you can't close existing sockets, you can't open new ones;
>>> all the anon_inode and pipe FDs just sit there. This seems to tally with
>>> behaviour we've witnessed in opennms instances where we've had a Too many
>>> open files crash - lsof shows a few thousand pipe/anon_inode handles just
>>> sitting around long after the crash.
>>>
>>> For reference, I've attached a simple test class that just opens ~60
>>> connections using the current methodology. If you lsof the process while it
>>> pauses, you can see how many new file descriptors are being created each
>>> time; if you drop the 60 down to 50 it cleans up gracefully but at 60 it
>>> doesn't seem possible to free the descriptors again (you'll need mina-core
>>> and slf4j-log4j12 in a project to run it). I'd be quite interested to see
>>> if others get the same behaviour I do.
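The per-connector descriptor cost from point 1 can also be watched from inside the JVM instead of with lsof. The following is only a rough sketch under two assumptions: it runs on Linux (it reads /proc/self/fd), and MINA's NIO connector sits on top of java.nio selectors. It measures plain Selector.open() calls, not NioSocketConnector itself, and the class and method names (SelectorFdCost, openFdCount, fdsHeldBySelectors) are made up for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;

public class SelectorFdCost {

    /** Descriptors held by this process, read from /proc/self/fd (Linux only); -1 elsewhere. */
    public static int openFdCount() {
        String[] entries = new File("/proc/self/fd").list();
        return entries == null ? -1 : entries.length;
    }

    /** Opens `count` selectors and reports how many extra descriptors exist while they are open. */
    public static int fdsHeldBySelectors(int count) throws IOException {
        Selector.open().close(); // warm-up so class loading does not skew the count
        List<Selector> selectors = new ArrayList<>();
        int before = openFdCount();
        try {
            for (int i = 0; i < count; i++) {
                selectors.add(Selector.open());
            }
            return openFdCount() - before;
        } finally {
            for (Selector s : selectors) {
                s.close(); // releases the descriptors again
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (openFdCount() < 0) {
            System.out.println("/proc/self/fd not available; run this on Linux");
            return;
        }
        int n = 10;
        System.out.println(n + " selectors held " + fdsHeldBySelectors(n) + " descriptors");
    }
}
```

Dividing the reported number by the selector count gives a per-selector figure to compare against the anon_inode and pipe counts seen in lsof.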
>>>
>>> What I think Mina wants you to be doing is creating a single
>>> NioSocketConnector to reuse everywhere and using the optional
>>> IoSessionInitializer in .connect() to configure filters and attach state
>>> objects to the IoSession. This would take a moderate overhaul of
>>> AsyncBasicDetector: the handler would need to be rewritten as a
>>> singleton that keeps its state via IoSession.get/setAttribute, rather
>>> than having one handler per service detection attempt. It would probably
>>> mean a fair chunk of refactoring at the same time.
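The shape of that proposal (one shared connector, per-attempt state attached to the session) can be sketched with plain java.nio. This is not MINA code and not the AsyncBasicDetector implementation: a single Selector stands in for the shared NioSocketConnector, and SelectionKey attachments stand in for IoSession attributes. SharedSelectorDetect, DetectState, detectAll and countDetectedOnLoopback are hypothetical names used only for this example.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class SharedSelectorDetect {

    /** Hypothetical per-attempt state; in MINA this would live in IoSession attributes. */
    public static final class DetectState {
        public final InetSocketAddress target;
        public boolean detected;
        DetectState(InetSocketAddress target) { this.target = target; }
    }

    /** Runs every connect attempt through ONE selector instead of one per attempt. */
    public static List<DetectState> detectAll(List<InetSocketAddress> targets, long timeoutMillis)
            throws IOException {
        List<DetectState> states = new ArrayList<>();
        List<SocketChannel> channels = new ArrayList<>();
        Selector selector = Selector.open(); // the only selector, shared by all attempts
        try {
            int pending = 0;
            for (InetSocketAddress target : targets) {
                DetectState state = new DetectState(target);
                states.add(state);
                SocketChannel ch = SocketChannel.open();
                channels.add(ch);
                ch.configureBlocking(false);
                try {
                    if (ch.connect(target)) {
                        state.detected = true; // connected immediately
                    } else {
                        ch.register(selector, SelectionKey.OP_CONNECT, state); // attach state
                        pending++;
                    }
                } catch (IOException immediateFailure) {
                    // state.detected stays false
                }
            }
            long deadline = System.currentTimeMillis() + timeoutMillis;
            while (pending > 0) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) break; // give up on whatever is still pending
                selector.select(remaining);
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    DetectState state = (DetectState) key.attachment();
                    boolean done;
                    try {
                        done = ((SocketChannel) key.channel()).finishConnect();
                        state.detected = done;
                    } catch (IOException refused) {
                        done = true; // attempt finished, service not detected
                    }
                    if (done) { key.cancel(); pending--; }
                }
            }
        } finally {
            for (SocketChannel ch : channels) ch.close();
            selector.close();
        }
        return states;
    }

    /** Demo helper: connects `attempts` times to a throwaway loopback listener. */
    public static int countDetectedOnLoopback(int attempts) throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            InetAddress loopback = InetAddress.getLoopbackAddress();
            server.bind(new InetSocketAddress(loopback, 0)); // port 0 = any free port
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();
            List<InetSocketAddress> targets = new ArrayList<>();
            for (int i = 0; i < attempts; i++) targets.add(new InetSocketAddress(loopback, port));
            int detected = 0;
            for (DetectState s : detectAll(targets, 2000)) if (s.detected) detected++;
            return detected;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("detected: " + countDetectedOnLoopback(3));
    }
}
```

With this layout the selector's descriptors are paid for once, and each extra attempt costs only its own socket. In MINA terms the attachment step would presumably move into the IoSessionInitializer passed to connect().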
>>>
>>> Before I embark on making those changes I wanted to throw this out there
>>> for comment, and to see if there's already a refactor of this code planned
>>> (I couldn't see any changes on the provisiond-refactor branch yet).
>>>
>>> Thanks,
>>> Duncan Mackintosh (dijm)
>>> Cambridge Broadband Networks Limited Registered in England and Wales under
>>> company number: 03879840 Registered office: Selwyn House, Cambridge
>>> Business Park, Cowley Road, Cambridge CB4 0WZ, UK. VAT number: GB 741 0186
>>> 64
>>>
>>> ------------------------------------------------------------------------------
>>> All of the data generated in your IT infrastructure is seriously valuable.
>>> Why? It contains a definitive record of application performance, security
>>> threats, fraudulent activity, and more. Splunk takes this data and makes
>>> sense of it. IT sense. And common sense.
>>> http://p.sf.net/sfu/splunk-d2d-c2
>>> _______________________________________________
>>> Please read the OpenNMS Mailing List FAQ:
>>> http://www.opennms.org/index.php/Mailing_List_FAQ
>>>
>>> opennms-devel mailing list
>>>
>>> To *unsubscribe* or change your subscription options, see the bottom of
>>> this page:
>>> https://lists.sourceforge.net/lists/listinfo/opennms-devel
>>>
>>
>
>
>
> --
> Alex, homepage: http://www.bennee.com/~alex/
> http://www.half-llama.co.uk