Brock, I think you may have something there... nproc was in fact defaulting to 600. I'll bump it up and give her a restart tonight and see how things go.
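For anyone who finds this thread later, here is a rough sketch of how one might check the limit from a process running as the same user. It is only an illustration in Python, not anything from Flume itself: the helper name nproc_headroom is made up, and the 930 figure is just the thread count from the ps output quoted below. As far as I know, on Linux the nproc limit counts threads across all of a user's processes, so two instances running under one account would share it.

```python
import resource


def nproc_headroom(current_threads, soft_limit):
    """Hypothetical helper: how many more threads the user could start
    before reaching the soft nproc limit. Returns None if unlimited."""
    if soft_limit == resource.RLIM_INFINITY:
        return None
    # Clamp at zero: being over the limit just means no headroom is left.
    return max(soft_limit - current_threads, 0)


if __name__ == "__main__":
    # Soft and hard "max user processes" limits for the current process.
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    print("nproc soft limit:", soft, "hard limit:", hard)

    # 930 is the approximate thread count reported in the quoted message.
    print("headroom at 930 threads:", nproc_headroom(930, soft))
```

Running it as the account that owns the JVMs (rather than as root) should show the limit those JVMs actually see.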

On Wed, Nov 27, 2013 at 8:54 AM, Brock Noland <[email protected]> wrote:

> Sounds like the nproc ulimit...
>
> On Tue, Nov 26, 2013 at 8:50 PM, Jeff Lord <[email protected]> wrote:
> > Can you provide the logfile and config?
> >
> >
> > On Tue, Nov 26, 2013 at 12:20 PM, Cochran, David <[email protected]> wrote:
> >>
> >> I've got a pretty good sized box collecting logs for a number of sources
> >> (about a dozen or so).
> >> Actually two instances were running on this box (one production and the
> >> other a testing environment)
> >>
> >> After adding a dozen more log files i'm running into this error (87 logs
> >> for PROD and 70 or so for Testing)
> >>
> >> Nov 26, 2013 10:25:27 AM
> >> org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink
> >> WARNING: Failed to accept a connection.
> >> java.lang.OutOfMemoryError: unable to create new native thread
> >>     at java.lang.Thread.start0(Native Method)
> >>     at java.lang.Thread.start(Thread.java:640)
> >>     at java.util.concurrent.ThreadPoolExecutor.addIfUnderMaximumPoolSize(ThreadPoolExecutor.java:727)
> >>     at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:657)
> >>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.start(AbstractNioWorker.java:160)
> >>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.register(AbstractNioWorker.java:131)
> >>     at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.registerAcceptedChannel(NioServerSocketPipelineSink.java:269)
> >>     at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink$Boss.run(NioServerSocketPipelineSink.java:231)
> >>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >>     at java.lang.Thread.run(Thread.java:662)
> >>
> >> I have been able to confirm that they are not bumping into the OS ulimit,
> >> and increasing the sizes of the JVM does not make this go away
> >> (JAVA_OPTS="-Xms1g -Xmx6g -Dcom.sun.management.jmxremot"). I ended up
> >> having to shut down the testing environment instance so as not to lose any
> >> more production logs.
> >>
> >> The box is 16 CPU with 16G RAM using local disks, running RHEL5.x
> >>
> >> Anytime process threads exceed ~930 (the results of ps -eLF -u flume | wc
> >> -l )the above error appears...if allowed to continue ultimately crashes the
> >> box.
> >>
> >> Any ideas on resolving this issue or have I run into a thread limit that
> >> cannot be gotten around without adding another server?
> >>
> >> Thanks,
> >> Dave
> >>
> >
> >
>
> --
> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
