Re: JVM Non Heap Memory

2016-11-29 Thread Cliff Resnick
Are you using the RocksDB backend in native mode? If so, the off-heap memory may be there. On Tue, Nov 29, 2016 at 9:54 AM, wrote: > I have the same problem, but I put the Flink job into YARN. > But I put the job into YARN on computer 22, and the job can run > successfully, and the jobmanager is

Re: JVM Non Heap Memory

2016-11-29 Thread Daniel Santos
Hello, Nope, I am using Hadoop HDFS as the state backend, Kafka as the source, and an HttpClient as a sink, with Kafka as a sink as well. So it's possible that the state backend is the culprit? The curious thing is that even when no jobs are running, streaming or otherwise, the JVM non-heap usage stays the same. Which I find

Re: JVM Non Heap Memory

2016-11-29 Thread Ufuk Celebi
Hey Daniel! Thanks for reporting this. Unbounded growth of non-heap memory is not expected. What kind of threads are you seeing being spawned/lingering around? As a first step, could you try to disable checkpointing and see how it behaves afterwards? – Ufuk On 29 November 2016 at 17:32:32, D

Re: JVM Non Heap Memory

2016-12-05 Thread Daniel Santos
Hello, I have done some thread checking and dumps, and I have disabled checkpointing. Here are my findings. I did a thread dump a few hours after I booted up the whole cluster (on 2/12/2016; 5 TMs; 3 GB heap each; 7 GB total limit each). The dump shows that most threads are of 3 sour
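
The kind of thread check described above can also be done from inside a JVM. The following is only a minimal sketch, not what was used in this thread: the class `ThreadCensus` and its grouping rule (strip a trailing number from the thread name) are assumptions made for illustration. It counts live threads per name group, so a group whose count keeps growing between snapshots stands out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Flink: groups live threads by name.
public class ThreadCensus {

    // Strips a trailing number (and its separator) so that
    // "pool-1-thread-7" and "pool-1-thread-8" land in the same group.
    static String groupKey(String threadName) {
        return threadName.replaceAll("[-_ #]*\\d+$", "");
    }

    // Counts how many of the given thread names fall into each group.
    static Map<String, Integer> countByGroup(Iterable<String> names) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : names) {
            counts.merge(groupKey(name), 1, Integer::sum);
        }
        return counts;
    }

    // Takes a snapshot of all live threads in the current JVM.
    static Map<String, Integer> liveThreadCounts() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return countByGroup(names);
    }

    public static void main(String[] args) {
        liveThreadCounts().forEach((group, n) -> System.out.println(n + "\t" + group));
    }
}
```

Comparing two such snapshots taken some hours apart gives roughly the same information as comparing two external thread dumps, under the assumption that leaked threads share a common name prefix.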

Re: JVM Non Heap Memory

2016-12-05 Thread Stefan Richter
Hi Daniel, the behaviour you observe looks like some threads are not canceled. Thread cancelation in Flink (and Java in general) is always cooperative, where cooperative means that the thread you want to cancel should somehow check cancelation and react to it. Sometimes this also requires some
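
A minimal sketch of what cooperative cancellation means in plain Java, independent of Flink: the class and method names (`CooperativeCancel`, `Worker`, `startAndCancel`) are made up for illustration. The worker only stops because it checks a flag and the interrupt status itself; nothing forces it to.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example, not Flink code.
public class CooperativeCancel {

    static final class Worker implements Runnable {
        // volatile so that a write from the cancelling thread is seen here
        private volatile boolean running = true;
        final AtomicLong iterations = new AtomicLong();

        @Override
        public void run() {
            // The loop cooperates: it re-checks both signals on every pass.
            while (running && !Thread.currentThread().isInterrupted()) {
                iterations.incrementAndGet();
                try {
                    Thread.sleep(10); // stands in for blocking work
                } catch (InterruptedException e) {
                    // Restore the interrupt status and leave the loop.
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }

        void cancel() {
            running = false;
        }
    }

    // Starts a worker, cancels it, and reports whether its thread ended.
    static boolean startAndCancel() {
        Worker worker = new Worker();
        Thread thread = new Thread(worker, "cooperative-worker");
        thread.start();
        try {
            Thread.sleep(50);
            worker.cancel();     // ask the loop to stop
            thread.interrupt();  // wake it up if it is blocked
            thread.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !thread.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("terminated=" + startAndCancel());
    }
}
```

If `run()` ignored both the flag and the interrupt, for example by swallowing `InterruptedException` inside an endless loop, the thread would linger after cancellation, which matches the lingering-thread symptom discussed here.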

Re: JVM Non Heap Memory

2016-12-05 Thread Chesnay Schepler
Hello Daniel, I'm afraid you stumbled upon a bug in Flink. Meters were not properly cleaned up, causing the underlying dropwizard meter update threads not to be shut down either. I've opened a JIRA and will open a PR soon. Thank you for re
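
To illustrate the general failure mode (a metric object that owns a background update thread which is never shut down), here is a small sketch. It is not the Flink or Dropwizard implementation; `TickingMeter` and the thread name `hypothetical-meter-tick` are invented for this example. Each instance starts its own scheduler, so every instance that is dropped without `close()` leaves one thread behind.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical meter, for illustration only.
public class TickingMeter implements AutoCloseable {
    private final AtomicLong ticks = new AtomicLong();

    // One background thread per meter instance.
    private final ScheduledExecutorService ticker =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "hypothetical-meter-tick");
                t.setDaemon(true);
                return t;
            });

    public TickingMeter() {
        // Periodic update task; it keeps running until the executor is shut down.
        ticker.scheduleAtFixedRate(ticks::incrementAndGet, 0, 5, TimeUnit.MILLISECONDS);
    }

    public long ticks() {
        return ticks.get();
    }

    // The cleanup step: without this call the tick thread outlives the meter.
    @Override
    public void close() {
        ticker.shutdownNow();
    }

    // Creates a meter, closes it, and reports whether the tick thread ended.
    static boolean closeStopsTicker() {
        TickingMeter meter = new TickingMeter();
        boolean runningBefore = !meter.ticker.isShutdown();
        meter.close();
        try {
            return runningBefore && meter.ticker.awaitTermination(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("ticker stopped after close: " + closeStopsTicker());
    }
}
```

Using the meter in a try-with-resources block is one way to make sure `close()` runs when the owning component is torn down.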

Re: JVM Non Heap Memory

2016-12-05 Thread Ufuk Celebi
Just to note that the bug mentioned by Chesnay does not invalidate Stefan's comments. ;-) Chesnay's issue is here: https://issues.apache.org/jira/browse/FLINK-5261 I added an issue to improve the documentation about cancellation (https://issues.apache.org/jira/browse/FLINK-5260). Which version

Re: JVM Non Heap Memory

2016-12-05 Thread Chesnay Schepler
We don't have to include it in 1.1.4 since Meters do not exist in 1.1; my bad for tagging it in JIRA for 1.1.4. On 05.12.2016 14:18, Ufuk Celebi wrote: Just to note that the bug mentioned by Chesnay does not invalidate Stefan's comments. ;-) Chesnay's issue is here: https://issues.apache.org

Re: JVM Non Heap Memory

2016-12-05 Thread Daniel Santos
Hello, Thank you all for the kind replies. I've got the general idea. I am using Flink version 1.1.3. So it seems the Meter fix won't make it into 1.1.4? Best Regards, Daniel Santos On 12/05/2016 01:28 PM, Chesnay Schepler wrote: We don't have to include it in 1.1.4 since Meters do n

Re: JVM Non Heap Memory

2016-12-05 Thread Ufuk Celebi
Quick question: since the Meter issue does _not_ apply to 1.1.3, which Flink metrics are you using? – Ufuk On 5 December 2016 at 16:44:47, Daniel Santos (dsan...@cryptolab.net) wrote: > Hello, > > Thank you all for the kind replies. > > I've got the general idea. I am using Flink version 1.

Re: JVM Non Heap Memory

2016-12-05 Thread Chesnay Schepler
Hey Daniel, the fix won't make it into 1.1.4 since it is only relevant if you're using Flink Meters together with either the Graphite or Ganglia Reporter. The Meter metric is however not available in 1.1 at all, so it can't be the underlying cause. My fix is only for 1.2; the fixed issue coul