Sean/Junping-

Ignoring the epistemology, it's a problem. Let's figure out what's
causing memory to balloon and then we can work out the appropriate
remedy.
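
For anyone who wants to poke at this locally, something like the
following might be a starting point (the module path is the one from
the workspace link further down the thread; everything else is just a
sketch, not a recipe):

    # run the HDFS unit tests from an already-built tree
    cd hadoop-hdfs-project/hadoop-hdfs
    mvn test 2>&1 | tee /tmp/hdfs-unit.log

    # in a second terminal, watch resident memory of the forked test JVMs
    watch -n 10 'ps -o pid,rss,args -C java | sort -k2 -nr | head'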

Is this reproducible outside the CI environment? To Junping's point,
would YETUS-561 provide more detailed information to aid debugging? -C

On Tue, Oct 24, 2017 at 2:50 PM, Junping Du <j...@hortonworks.com> wrote:
> In general, the "solid evidence" of memory leak comes from analysis of 
> heapdump, jastack, gc log, etc. In many cases, we can locate/conclude which 
> piece of code are leaking memory from the analysis.
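>
> For reference, collecting those artifacts from a suspect test JVM usually 
> looks something like the following (the pid, file names, and JDK7 GC-log 
> flags are only illustrative):
>
>     jps -lm                                              # find the forked test JVM's pid
>     jstack <pid> > /tmp/threads.txt                      # thread dump
>     jmap -dump:live,format=b,file=/tmp/heap.hprof <pid>  # heap dump for MAT/jhat
>
>     # GC logging has to be enabled when the JVM starts, e.g. with:
>     #   -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/tmp/gc.log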
>
> Unfortunately, I cannot find any conclusion in the previous comments, and it 
> isn't even clear which HDFS daemons/components are consuming unexpectedly high 
> memory. That doesn't sound like a solid bug report to me.
>
>
>
> Thanks,
>
>
> Junping
>
>
> ________________________________
> From: Sean Busbey <bus...@cloudera.com>
> Sent: Tuesday, October 24, 2017 2:20 PM
> To: Junping Du
> Cc: Allen Wittenauer; Hadoop Common; Hdfs-dev; 
> mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
> Subject: Re: Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
>
> Just curious, Junping: what would "solid evidence" look like? Is the 
> supposition here that the memory leak is within HDFS test code rather than 
> library runtime code? How would such a distinction be shown?
>
> On Tue, Oct 24, 2017 at 4:06 PM, Junping Du <j...@hortonworks.com> wrote:
> Allen,
>      Do we have any solid evidence showing that the HDFS unit tests going 
> through the roof are due to a serious memory leak in HDFS? Normally, I don't 
> expect memory leaks to be identified in our UTs - mostly, a test JVM going 
> away is down to test or deployment issues.
>      Unless there is concrete evidence, my concern about a serious memory leak 
> in HDFS on 2.8 is relatively low, given that some companies (Yahoo, Alibaba, 
> etc.) have had 2.8 deployed in large production environments for months. 
> Non-serious memory leaks (like forgetting to close a stream on a non-critical 
> path) and other non-critical bugs always happen here and there, and we have to 
> live with them.
>
> Thanks,
>
> Junping
>
> ________________________________________
> From: Allen Wittenauer <a...@effectivemachines.com>
> Sent: Tuesday, October 24, 2017 8:27 AM
> To: Hadoop Common
> Cc: Hdfs-dev; mapreduce-...@hadoop.apache.org; yarn-...@hadoop.apache.org
> Subject: Re: Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
>
>> On Oct 23, 2017, at 12:50 PM, Allen Wittenauer <a...@effectivemachines.com> wrote:
>>
>>
>>
>> With no other information or access to go on, my current hunch is that one 
>> of the HDFS unit tests is ballooning in memory size.  The easiest way to 
>> kill a Linux machine is to eat all of the RAM, thanks to overcommit, and 
>> that's what this "feels" like.
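>>
>> Whether that's what happened here is usually visible from the kernel side; 
>> these checks are generic Linux, nothing Hadoop-specific:
>>
>>     sysctl vm.overcommit_memory vm.overcommit_ratio   # 0 = heuristic overcommit
>>     dmesg | grep -i 'killed process'                  # OOM-killer victims, if any
>>     free -m                                           # current headroom on the node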
>>
>> Someone should verify if 2.8.2 has the same issues before a release goes out 
>> ...
>
>
>         FWIW, I ran 2.8.2 last night and it has the same problems.
>
>         Also: the node didn't die!  Looking through the workspace (which the 
> next run will destroy), two sets of logs stand out:
>
> https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/ws/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
>
>                                                         and
>
> https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/ws/sourcedir/hadoop-hdfs-project/hadoop-hdfs/
>
>         It looks like my hunch is correct: RAM usage in the HDFS unit tests is 
> going through the roof.  It's also interesting how MANY log files there are.  
> Is surefire not picking up that jobs are dying?  Maybe not if memory is 
> getting tight.
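>
> One way to make that less mysterious would be to cap the forked JVMs 
> explicitly and have them dump on OOM.  The properties below are stock 
> surefire and the sizes are placeholders; note that -DargLine overrides 
> whatever argLine the pom already sets:
>
>     mvn test -DforkCount=1 -DreuseForks=false \
>         -DargLine="-Xmx2g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dumps"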
>
>         Anyway, at this point, branch-2.8 and higher are probably fubar'd. 
> Additionally, I've filed YETUS-561 so that Yetus-controlled Docker containers 
> can have their RAM limits set in order to prevent more nodes going catatonic.
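>
> The container-side limit is essentially what docker already exposes; the 
> numbers below are only an example of the kind of cap that would be applied:
>
>     docker run --memory=16g --memory-swap=16g ...   # hard cap; overruns are
>                                                     # OOM-killed inside the
>                                                     # container's cgroup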
>
>
>
> --
> busbey

