It seems that the thread reading the points file is blocked, waiting for a buffer from the global buffer pool that never arrives. What could be causing this?
java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x6b985888> (a java.util.ArrayDeque)
    at eu.stratosphere.runtime.io.network.bufferprovider.LocalBufferPool.requestBuffer(LocalBufferPool.java:160)
    - locked <0x6b985888> (a java.util.ArrayDeque)
    at eu.stratosphere.runtime.io.network.bufferprovider.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:101)
    at eu.stratosphere.runtime.io.gates.InputGate.requestBufferBlocking(InputGate.java:333)
    at eu.stratosphere.runtime.io.channels.InputChannel.requestBufferBlocking(InputChannel.java:426)
    at eu.stratosphere.runtime.io.network.ChannelManager.dispatchFromOutputChannel(ChannelManager.java:441)
    at eu.stratosphere.runtime.io.channels.OutputChannel.sendBuffer(OutputChannel.java:74)
    at eu.stratosphere.runtime.io.gates.OutputGate.sendBuffer(OutputGate.java:49)
    at eu.stratosphere.runtime.io.api.BufferWriter.sendBuffer(BufferWriter.java:35)
    at eu.stratosphere.runtime.io.api.RecordWriter.emit(RecordWriter.java:96)
    at eu.stratosphere.pact.runtime.shipping.OutputCollector.collect(OutputCollector.java:82)
    at eu.stratosphere.pact.runtime.task.chaining.ChainedMapDriver.collect(ChainedMapDriver.java:71)
    at eu.stratosphere.pact.runtime.task.DataSourceTask.invoke(DataSourceTask.java:228)
    at eu.stratosphere.nephele.execution.RuntimeEnvironment.run(RuntimeEnvironment.java:284)
    at java.lang.Thread.run(Thread.java:744)

Thanks for your help, Sebastian.

Regards // Saludos // Mit Freundlichen Grüßen // Bien cordialement,
Pino

On 22 June 2014 13:38, Sebastian Schelter <[email protected]> wrote:

> Have you looked at a jstack dump on one of the workers? That typically
> helps to find out where the processes are stuck.
>
> -s
> On 22.06.2014 13:32, "José Luis López Pino" <[email protected]> wrote:
>
> > Hi,
> >
> > I'm running the KMeans Java and Scala examples on two nodes. It works fine
> > with very small files (3MB), but when I try with files of 30MB or bigger
> > the process never ends. After several hours, the DataChain process that is
> > reading the input points is still working.
> >
> > I have tried before with way bigger files in the same environment and I
> > had no issue. I have already tried:
> > - Checking that the process is not locked using all the CPU time.
> > - Formatting the datanodes.
> > - Compiling the latest version available on GitHub.
> > - Enabling the debug log mode, which doesn't give any additional
> >   information.
> >
> > Could someone give me a hint about where to look? Thanks for your help!
> >
> > Regards // Saludos // Mit Freundlichen Grüßen // Bien cordialement,
> > Pino
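
For anyone reading the trace above: TIMED_WAITING (on object monitor) on a java.util.ArrayDeque is the state a thread shows when it repeatedly calls wait(timeout) on a queue of free buffers that stays empty. Below is a minimal sketch of that pattern. It is not the Stratosphere LocalBufferPool code; the class ToyBufferPool, its methods, and the 100 ms timeout are hypothetical and only illustrate how a requester can stay parked when no buffer is ever recycled.

```java
import java.util.ArrayDeque;

// Hypothetical toy pool for illustration only; NOT Stratosphere's LocalBufferPool.
final class ToyBufferPool {
    private final ArrayDeque<byte[]> buffers = new ArrayDeque<>();

    ToyBufferPool(int numBuffers, int bufferSize) {
        for (int i = 0; i < numBuffers; i++) {
            buffers.add(new byte[bufferSize]);
        }
    }

    // Blocks until a buffer is free, re-checking every 100 ms (assumed timeout).
    byte[] requestBufferBlocking() throws InterruptedException {
        synchronized (buffers) {
            while (buffers.isEmpty()) {
                // While parked here, a thread dump reports
                // TIMED_WAITING (on object monitor) for this thread.
                buffers.wait(100);
            }
            return buffers.poll();
        }
    }

    // Returns a buffer to the pool and wakes one waiting requester.
    void recycle(byte[] buffer) {
        synchronized (buffers) {
            buffers.add(buffer);
            buffers.notify();
        }
    }

    // Exhausts a one-buffer pool, starts a second requester, and returns the
    // state that requester is observed in (polling for up to about 5 seconds).
    static Thread.State observeStarvedRequester() throws InterruptedException {
        ToyBufferPool pool = new ToyBufferPool(1, 1024);
        pool.requestBufferBlocking(); // take the only buffer and never recycle it

        Thread requester = new Thread(() -> {
            try {
                pool.requestBufferBlocking();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "starved-requester");
        requester.setDaemon(true);
        requester.start();

        Thread.State state = requester.getState();
        for (int i = 0; i < 500 && state != Thread.State.TIMED_WAITING; i++) {
            Thread.sleep(10);
            state = requester.getState();
        }
        requester.interrupt(); // let the requester exit its wait loop
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observeStarvedRequester());
    }
}
```

In this sketch the requester only makes progress once some other thread calls recycle(). If the buffers are all held by a consumer that is itself stuck, the requester keeps cycling through wait(100) indefinitely, which would look like the dump above.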
