Hadoop-Mapreduce-trunk - Build # 2605 - Still Failing

2015-11-13 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2605/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 32274 lines...]
Running org.apache.hadoop.mapreduce.v2.app.TestKill
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.665 sec - in 
org.apache.hadoop.mapreduce.v2.app.TestKill
Running org.apache.hadoop.mapreduce.TestMapreduceConfigFields
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.926 sec - in 
org.apache.hadoop.mapreduce.TestMapreduceConfigFields
Running org.apache.hadoop.mapred.TestTaskAttemptListenerImpl
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.497 sec - in 
org.apache.hadoop.mapred.TestTaskAttemptListenerImpl
Running org.apache.hadoop.mapred.TestLocalContainerLauncher
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.09 sec - in 
org.apache.hadoop.mapred.TestLocalContainerLauncher
Running org.apache.hadoop.mapred.TestTaskAttemptFinishingMonitor
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.928 sec - in 
org.apache.hadoop.mapred.TestTaskAttemptFinishingMonitor

Results :

Failed tests: 
  TestJobImpl.testUnusableNodeTransition:629->assertJobState:1012 
expected:<...> but was:<...>

Tests run: 340, Failures: 1, Errors: 0, Skipped: 0

[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop MapReduce Client  SUCCESS [  4.417 s]
[INFO] Apache Hadoop MapReduce Core .. SUCCESS [02:18 min]
[INFO] Apache Hadoop MapReduce Common  SUCCESS [ 39.496 s]
[INFO] Apache Hadoop MapReduce Shuffle ... SUCCESS [  6.455 s]
[INFO] Apache Hadoop MapReduce App ... FAILURE [10:47 min]
[INFO] Apache Hadoop MapReduce HistoryServer . SKIPPED
[INFO] Apache Hadoop MapReduce JobClient . SKIPPED
[INFO] Apache Hadoop MapReduce HistoryServer Plugins . SKIPPED
[INFO] Apache Hadoop MapReduce NativeTask  SKIPPED
[INFO] Apache Hadoop MapReduce Examples .. SKIPPED
[INFO] Apache Hadoop MapReduce ... SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 13:58 min
[INFO] Finished at: 2015-11-13T08:21:27+00:00
[INFO] Final Memory: 39M/739M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on 
project hadoop-mapreduce-client-app: There are test failures.
[ERROR] 
[ERROR] Please refer to 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/target/surefire-reports
 for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :hadoop-mapreduce-client-app
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Recording test results
Updating HDFS-9410
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  
org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.testUnusableNodeTransition

Error Message:
expected:<...> but was:<...>

Stack Trace:
java.lang.AssertionError: expected:<...> but was:<...>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.assertJobState(TestJobImpl.java:1012)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.testUnusableNodeTransition(TestJobImpl.java:629)
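
For anyone triaging this locally, a minimal repro sketch (the module path is 
taken from the surefire-reports path above; the -Dtest class#method selector 
is standard Surefire usage):

    # Re-run only the failing test in the module where the reactor stopped
    cd hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app
    mvn test -Dtest=TestJobImpl#testUnusableNodeTransition
    # The full assertion output lands in target/surefire-reports/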




Re: Release votes and git-tags [Was Re: [VOTE] Release Apache Hadoop 2.7.2 RC0]

2015-11-13 Thread Steve Loughran

> On 12 Nov 2015, at 20:23, Vinod Kumar Vavilapalli  wrote:
> 
> We have always voted on release tar-balls, not svn branches / git commit-ids 
> or tags.
> 
> When we were on SVN, we used to paste in the voting thread the release branch 
> URL.
> 
> Since we moved to git, we stopped creating release branches and have always 
> used signed tags for snapshotting and posted tags in the voting threads.
> 
> To my knowledge, we never reuse tags - as they are themselves versioned, e.g. 
> hadoop-2.7.1-RC0 - so we don't run the risk of tags getting swapped out from 
> under us.
> 
> To me, tags are a simple way of going back to the code we ship in a release 
> without creating and maintaining explicit release branches. No one can 
> remember Commit IDs.
> 
> All that said, we can post the commit-IDs in future release votes for the 
> sake of convenience, but I disagree with the statement that we vote on 
> git-commits.
> 
> +Vinod
> 

I recognise that we vote on the src distro, but that source has an origin. And 
that has to be a commit #, not a tag, as somebody *may* change that tag later.

The ASF incubator will only approve git-based releases with that checksum, so 
it's the one we should all be using.

FWIW, the RC tag points at commit 6f38ccc; I've checked it out and am verifying 
that it builds on Windows, including all the native libs.
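
A minimal sketch of that verification flow, assuming the release-2.7.2-RC0 tag 
and the abbreviated hash quoted in this thread:

    git fetch origin --tags
    git tag -v release-2.7.2-RC0                 # check the tag's GPG signature, if it is signed
    git rev-parse 'release-2.7.2-RC0^{commit}'   # resolve the tag to its full commit hash (starts with 6f38ccc)
    git checkout 6f38ccc                         # build from the hash; a later tag move cannot affect this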

Re: [VOTE] Release Apache Hadoop 2.7.2 RC0

2015-11-13 Thread Jason Lowe
-1 (binding)
Ran into public localization issues and filed YARN-4354. We need that resolved 
before the release is ready.  We will either need a timely fix or may have to 
revert YARN-2902 to unblock the release if my root-cause analysis is correct.  
I'll dig into this more today.

Jason

  From: Vinod Kumar Vavilapalli 
 To: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org 
Cc: vino...@apache.org 
 Sent: Wednesday, November 11, 2015 10:31 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.2 RC0
   
Hi all,


I've created a release candidate RC0 for Apache Hadoop 2.7.2.


As discussed before, this is the next maintenance release to follow up
2.7.1.


The RC is available for validation at:

http://people.apache.org/~vinodkv/hadoop-2.7.2-RC0/


The RC tag in git is: release-2.7.2-RC0


The maven artifacts are available via repository.apache.org at

https://repository.apache.org/content/repositories/orgapachehadoop-1023/


As you may have noted, an unusually long 2.6.3 release caused 2.7.2 to slip
by quite a bit. This release's related discussion threads are linked below:
[1] and [2].


Please try the release and vote; the vote will run for the usual 5 days.


Thanks,

Vinod


[1]: 2.7.2 release plan: http://markmail.org/message/oozq3gvd4nhzsaes

[2]: Planning Apache Hadoop 2.7.2
http://markmail.org/message/iktqss2qdeykgpqk


  

Re: [VOTE] Release Apache Hadoop 2.7.2 RC0

2015-11-13 Thread Sunil Govind
+1 (non-binding)

- Built the tarball from source and deployed it.
- Ran a few MR jobs successfully, along with some basic node label and
preemption verification.
- Verified the RM web UI, AM UI and Timeline UI. All pages look fine.

Thanks and Regards
Sunil

On Thu, Nov 12, 2015 at 10:01 AM Vinod Kumar Vavilapalli 
wrote:

> Hi all,
>
>
> I've created a release candidate RC0 for Apache Hadoop 2.7.2.
>
>
> As discussed before, this is the next maintenance release to follow up
> 2.7.1.
>
>
> The RC is available for validation at:
>
> http://people.apache.org/~vinodkv/hadoop-2.7.2-RC0/
>
>
> The RC tag in git is: release-2.7.2-RC0
>
>
> The maven artifacts are available via repository.apache.org at
>
> https://repository.apache.org/content/repositories/orgapachehadoop-1023/
>
>
> As you may have noted, an unusually long 2.6.3 release caused 2.7.2 to slip
> by quite a bit. This release's related discussion threads are linked below:
> [1] and [2].
>
>
> Please try the release and vote; the vote will run for the usual 5 days.
>
>
> Thanks,
>
> Vinod
>
>
> [1]: 2.7.2 release plan: http://markmail.org/message/oozq3gvd4nhzsaes
>
> [2]: Planning Apache Hadoop 2.7.2
> http://markmail.org/message/iktqss2qdeykgpqk
>


Re: [VOTE] Release Apache Hadoop 2.7.2 RC0

2015-11-13 Thread Kihwal Lee
We found HDFS-9426. The rolling upgrade finalization is not backward 
compatible, i.e. 2.7.1 or 2.6.x datanodes will ignore finalization.

So -1.
Kihwal
  From: Vinod Kumar Vavilapalli 
 To: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org 
Cc: vino...@apache.org 
 Sent: Wednesday, November 11, 2015 10:31 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.2 RC0
   
Hi all,


I've created a release candidate RC0 for Apache Hadoop 2.7.2.


As discussed before, this is the next maintenance release to follow up
2.7.1.


The RC is available for validation at:

http://people.apache.org/~vinodkv/hadoop-2.7.2-RC0/


The RC tag in git is: release-2.7.2-RC0


The maven artifacts are available via repository.apache.org at

https://repository.apache.org/content/repositories/orgapachehadoop-1023/


As you may have noted, an unusually long 2.6.3 release caused 2.7.2 to slip
by quite a bit. This release's related discussion threads are linked below:
[1] and [2].


Please try the release and vote; the vote will run for the usual 5 days.


Thanks,

Vinod


[1]: 2.7.2 release plan: http://markmail.org/message/oozq3gvd4nhzsaes

[2]: Planning Apache Hadoop 2.7.2
http://markmail.org/message/iktqss2qdeykgpqk


   

Re: [VOTE] Release Apache Hadoop 2.7.2 RC0

2015-11-13 Thread Vinod Kumar Vavilapalli
Thanks for reporting this, Jason!

Everyone, I am canceling this RC given the feedback; we will go again after 
addressing the open issues.

Thanks
+Vinod

> On Nov 13, 2015, at 7:57 AM, Jason Lowe  wrote:
> 
> -1 (binding)
> 
> Ran into public localization issues and filed YARN-4354. We need that resolved 
> before the release is ready.  We will either need a timely fix or may have to 
> revert YARN-2902 to unblock the release if my root-cause analysis is correct. 
>  I'll dig into this more today.
> 
> Jason
> 
> From: Vinod Kumar Vavilapalli 
> To: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
> yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org 
> Cc: vino...@apache.org 
> Sent: Wednesday, November 11, 2015 10:31 PM
> Subject: [VOTE] Release Apache Hadoop 2.7.2 RC0
> 
> Hi all,
> 
> 
> I've created a release candidate RC0 for Apache Hadoop 2.7.2.
> 
> 
> As discussed before, this is the next maintenance release to follow up
> 2.7.1.
> 
> 
> The RC is available for validation at:
> 
> http://people.apache.org/~vinodkv/hadoop-2.7.2-RC0/
> 
> 
> The RC tag in git is: release-2.7.2-RC0
> 
> 
> The maven artifacts are available via repository.apache.org at
> 
> https://repository.apache.org/content/repositories/orgapachehadoop-1023/
> 
> 
> As you may have noted, an unusually long 2.6.3 release caused 2.7.2 to slip
> by quite a bit. This release's related discussion threads are linked below:
> [1] and [2].
> 
> 
> Please try the release and vote; the vote will run for the usual 5 days.
> 
> 
> Thanks,
> 
> Vinod
> 
> 
> [1]: 2.7.2 release plan: http://markmail.org/message/oozq3gvd4nhzsaes 
> 
> 
> [2]: Planning Apache Hadoop 2.7.2
> http://markmail.org/message/iktqss2qdeykgpqk 
> 
> 
> 



[jira] [Reopened] (MAPREDUCE-3065) ApplicationMaster killed by NodeManager due to excessive virtual memory consumption

2015-11-13 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reopened MAPREDUCE-3065:


> ApplicationMaster killed by NodeManager due to excessive virtual memory 
> consumption
> ---
>
> Key: MAPREDUCE-3065
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3065
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.0.0-alpha
>Reporter: Chris Riccomini
>
> > Hey Vinod,
> > 
> > OK, so I have a little more clarity into this.
> > 
> > When I bump my resource request for my AM to 4096, it runs. The important 
> > line in the NM logs is:
> > 
> > 2011-09-21 13:43:44,366 INFO  monitor.ContainersMonitorImpl 
> > (ContainersMonitorImpl.java:run(402)) - Memory usage of ProcessTree 25656 
> > for container-id container_1316637655278_0001_01_01 : Virtual 
> > 2260938752 bytes, limit : 4294967296 bytes; Physical 120860672 bytes, limit 
> > -1 bytes
> > 
> > The thing to note is the virtual memory, which is off the charts, even 
> > though my physical memory is almost nothing (12 megs). I'm still poking 
> > around the code, but I am noticing that there are two checks in the NM, one 
> > for virtual mem, and one for physical mem. The virtual memory check appears 
> > to be toggle-able, but is presumably defaulted to on.
> > 
> > At this point I'm trying to figure out exactly what the VMEM check is for, 
> > why YARN thinks my app is taking 2 gigs, and how to fix this.
> > 
> > Cheers,
> > Chris
> > 
> > From: Chris Riccomini [criccom...@linkedin.com]
> > Sent: Wednesday, September 21, 2011 1:42 PM
> > To: mapreduce-dev@hadoop.apache.org
> > Subject: Re: ApplicationMaster Memory Usage
> > 
> > For the record, I bumped to 4096 for memory resource request, and it works.
> > :(
> > 
> > 
> > On 9/21/11 1:32 PM, "Chris Riccomini"  wrote:
> > 
> >> Hey Vinod,
> >> 
> >> So, I ran my application master directly from the CLI. I commented out the
> >> YARN-specific code. It runs fine without leaking memory.
> >> 
> >> I then ran it from YARN, with all YARN-specific code commented out. It again
> >> ran fine.
> >> 
> >> I then uncommented JUST my registerWithResourceManager call. It then fails
> >> with OOM after a few seconds. I call registerWithResourceManager, and then 
> >> go
> >> into a while(true) { println("yeh") sleep(1000) }. Doing this prints:
> >> 
> >> yeh
> >> yeh
> >> yeh
> >> yeh
> >> yeh
> >> 
> >> At which point it dies, and in the NodeManager I see:
> >> 
> >> 2011-09-21 13:24:51,036 WARN  monitor.ContainersMonitorImpl
> >> (ContainersMonitorImpl.java:isProcessTreeOverLimit(289)) - Process tree for
> >> container: container_1316626117280_0005_01_01 has processes older than 
> >> 1
> >> iteration running over the configured limit. Limit=2147483648, current 
> >> usage =
> >> 2192773120
> >> 2011-09-21 13:24:51,037 WARN  monitor.ContainersMonitorImpl
> >> (ContainersMonitorImpl.java:run(453)) - Container
> >> [pid=23852,containerID=container_1316626117280_0005_01_01] is running
> >> beyond memory-limits. Current usage : 2192773120bytes. Limit :
> >> 2147483648bytes. Killing container.
> >> Dump of the process-tree for container_1316626117280_0005_01_01 :
> >> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
> >> SYSTEM_TIME(MILLIS)
> >> VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> >> |- 23852 20570 23852 23852 (bash) 0 0 108638208 303 /bin/bash -c java 
> >> -Xmx512M
> >> -cp './package/*' kafka.yarn.ApplicationMaster
> >> /home/criccomi/git/kafka-yarn/dist/kafka-streamer.tgz 5 1 1316626117280
> >> com.linkedin.TODO 1
> >> 1>/tmp/logs/application_1316626117280_0005/container_1316626117280_0005_01_000
> >> 001/stdout
> >> 2>/tmp/logs/application_1316626117280_0005/container_1316626117280_0005_01_000
> >> 001/stderr
> >> |- 23855 23852 23852 23852 (java) 81 4 2084134912 14772 java -Xmx512M -cp
> >> ./package/* kafka.yarn.ApplicationMaster
> >> /home/criccomi/git/kafka-yarn/dist/kafka-streamer.tgz 5 1 1316626117280
> >> com.linkedin.TODO 1
> >> 2011-09-21 13:24:51,037 INFO  monitor.ContainersMonitorImpl
> >> (ContainersMonitorImpl.java:run(463)) - Removed ProcessTree with root 23852
> >> 
> >> Either something is leaking in YARN, or my registerWithResourceManager code
> >> (see below) is doing something funky.
> >> 
> >> I'm trying to avoid going through all the pain of attaching a remote 
> >> debugger.
> >> Presumably things aren't leaking in YARN, which means it's likely that I'm
> >> doing something wrong in my registration code.
> >> 
> >> Incidentally, my NodeManager is running with 1000 megs. My application 
> >> master
> >> memory is set to 2048, and my -Xmx setting is 512M
> >> 
> >> Cheers,
>
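
The vmem-vs-pmem gap described above can be inspected directly on a Linux node; 
a hedged sketch, using the java child PID from the process-tree dump (in later 
Hadoop 2.x releases the check itself is toggled by 
yarn.nodemanager.vmem-check-enabled and scaled by 
yarn.nodemanager.vmem-pmem-ratio):

    PID=23855   # the java child from the dump above; substitute your own
    grep -E 'VmSize|VmRSS' /proc/$PID/status   # virtual vs resident size, the two figures the NM monitor compares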

[jira] [Resolved] (MAPREDUCE-3065) ApplicationMaster killed by NodeManager due to excessive virtual memory consumption

2015-11-13 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved MAPREDUCE-3065.

Resolution: Duplicate

Resolving correctly as a dup of MAPREDUCE-3068.

> ApplicationMaster killed by NodeManager due to excessive virtual memory 
> consumption
> ---
>
> Key: MAPREDUCE-3065
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3065
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.0.0-alpha
>Reporter: Chris Riccomini
>
> > Hey Vinod,
> > 
> > OK, so I have a little more clarity into this.
> > 
> > When I bump my resource request for my AM to 4096, it runs. The important 
> > line in the NM logs is:
> > 
> > 2011-09-21 13:43:44,366 INFO  monitor.ContainersMonitorImpl 
> > (ContainersMonitorImpl.java:run(402)) - Memory usage of ProcessTree 25656 
> > for container-id container_1316637655278_0001_01_01 : Virtual 
> > 2260938752 bytes, limit : 4294967296 bytes; Physical 120860672 bytes, limit 
> > -1 bytes
> > 
> > The thing to note is the virtual memory, which is off the charts, even 
> > though my physical memory is almost nothing (12 megs). I'm still poking 
> > around the code, but I am noticing that there are two checks in the NM, one 
> > for virtual mem, and one for physical mem. The virtual memory check appears 
> > to be toggle-able, but is presumably defaulted to on.
> > 
> > At this point I'm trying to figure out exactly what the VMEM check is for, 
> > why YARN thinks my app is taking 2 gigs, and how to fix this.
> > 
> > Cheers,
> > Chris
> > 
> > From: Chris Riccomini [criccom...@linkedin.com]
> > Sent: Wednesday, September 21, 2011 1:42 PM
> > To: mapreduce-dev@hadoop.apache.org
> > Subject: Re: ApplicationMaster Memory Usage
> > 
> > For the record, I bumped to 4096 for memory resource request, and it works.
> > :(
> > 
> > 
> > On 9/21/11 1:32 PM, "Chris Riccomini"  wrote:
> > 
> >> Hey Vinod,
> >> 
> >> So, I ran my application master directly from the CLI. I commented out the
> >> YARN-specific code. It runs fine without leaking memory.
> >> 
> >> I then ran it from YARN, with all YARN-specific code commented out. It again
> >> ran fine.
> >> 
> >> I then uncommented JUST my registerWithResourceManager call. It then fails
> >> with OOM after a few seconds. I call registerWithResourceManager, and then 
> >> go
> >> into a while(true) { println("yeh") sleep(1000) }. Doing this prints:
> >> 
> >> yeh
> >> yeh
> >> yeh
> >> yeh
> >> yeh
> >> 
> >> At which point it dies, and in the NodeManager I see:
> >> 
> >> 2011-09-21 13:24:51,036 WARN  monitor.ContainersMonitorImpl
> >> (ContainersMonitorImpl.java:isProcessTreeOverLimit(289)) - Process tree for
> >> container: container_1316626117280_0005_01_01 has processes older than 
> >> 1
> >> iteration running over the configured limit. Limit=2147483648, current 
> >> usage =
> >> 2192773120
> >> 2011-09-21 13:24:51,037 WARN  monitor.ContainersMonitorImpl
> >> (ContainersMonitorImpl.java:run(453)) - Container
> >> [pid=23852,containerID=container_1316626117280_0005_01_01] is running
> >> beyond memory-limits. Current usage : 2192773120bytes. Limit :
> >> 2147483648bytes. Killing container.
> >> Dump of the process-tree for container_1316626117280_0005_01_01 :
> >> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
> >> SYSTEM_TIME(MILLIS)
> >> VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
> >> |- 23852 20570 23852 23852 (bash) 0 0 108638208 303 /bin/bash -c java 
> >> -Xmx512M
> >> -cp './package/*' kafka.yarn.ApplicationMaster
> >> /home/criccomi/git/kafka-yarn/dist/kafka-streamer.tgz 5 1 1316626117280
> >> com.linkedin.TODO 1
> >> 1>/tmp/logs/application_1316626117280_0005/container_1316626117280_0005_01_000
> >> 001/stdout
> >> 2>/tmp/logs/application_1316626117280_0005/container_1316626117280_0005_01_000
> >> 001/stderr
> >> |- 23855 23852 23852 23852 (java) 81 4 2084134912 14772 java -Xmx512M -cp
> >> ./package/* kafka.yarn.ApplicationMaster
> >> /home/criccomi/git/kafka-yarn/dist/kafka-streamer.tgz 5 1 1316626117280
> >> com.linkedin.TODO 1
> >> 2011-09-21 13:24:51,037 INFO  monitor.ContainersMonitorImpl
> >> (ContainersMonitorImpl.java:run(463)) - Removed ProcessTree with root 23852
> >> 
> >> Either something is leaking in YARN, or my registerWithResourceManager code
> >> (see below) is doing something funky.
> >> 
> >> I'm trying to avoid going through all the pain of attaching a remote 
> >> debugger.
> >> Presumably things aren't leaking in YARN, which means it's likely that I'm
> >> doing something wrong in my registration code.
> >> 
> >> Incidentally, my NodeManager is running with 1000 megs. My application 
> >> master
>

[jira] [Resolved] (MAPREDUCE-1901) Jobs should not submit the same jar files over and over again

2015-11-13 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved MAPREDUCE-1901.

Resolution: Duplicate

YARN-1492 implemented a solution reasonably close to what [~jsensarma] 
proposed. Closing this very old JIRA as a dup; please reopen if you disagree.

> Jobs should not submit the same jar files over and over again
> -
>
> Key: MAPREDUCE-1901
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1901
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Joydeep Sen Sarma
> Attachments: 1901.PATCH, 1901.PATCH
>
>
> Currently each Hadoop job uploads the required resources 
> (jars/files/archives) to a new location in HDFS. Map-reduce nodes involved in 
> executing this job would then download these resources to local disk.
> In an environment where most of the users are using a standard set of jars 
> and files (because they are using a framework like Hive/Pig) - the same jars 
> keep getting uploaded and downloaded repeatedly. The overhead of this 
> protocol (primarily in terms of end-user latency) is significant when:
> - the jobs are small (and, conversely, large in number)
> - Namenode is under load (meaning hdfs latencies are high and made worse, in 
> part, by this protocol)
> Hadoop should provide a way for jobs in a cooperative environment to avoid 
> submitting the same files over and over again. Identifying and caching execution 
> resources by a content signature (md5/sha) would be a good alternative to 
> have available.
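
A minimal sketch of the content-signature idea, assuming a hypothetical shared 
/libcache directory in HDFS (hdfs dfs -test/-put are standard shell commands; 
YARN-1492, noted above, later implemented a production version of this):

    # Name the shared resource by its digest so identical jars resolve to one HDFS path
    DIGEST=$(sha256sum job.jar | awk '{print $1}')
    hdfs dfs -test -e /libcache/$DIGEST.jar || hdfs dfs -put job.jar /libcache/$DIGEST.jar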





Re: [DISCUSS] Looking to a 2.8.0 release

2015-11-13 Thread Sangjin Lee
I reviewed the current state of the YARN-2928 changes with respect to their
impact when the timeline service v.2 is disabled. It appears that a lot of
things still get created and enabled unconditionally, regardless of
configuration. That was understandable while we were implementing the
feature, but it clearly needs to be cleaned up so that the timeline service
v.2 has no impact on other components when disabled.

I filed a JIRA for that work:
https://issues.apache.org/jira/browse/YARN-4356

We need to complete it before we can merge.

Somewhat related is the status of the configuration and what it means in
various contexts (client/app-side vs. server-side, v.1 vs. v.2, etc.). I
know there is an ongoing discussion regarding YARN-4183. We'll need to
reflect the outcome of that discussion.

My overall impression is that getting this done for 2.8 looks rather
challenging given the suggested timeframe. We also need to complete several
major tasks before it is ready.

Sangjin


On Wed, Nov 11, 2015 at 5:49 PM, Sangjin Lee  wrote:

>
> On Wed, Nov 11, 2015 at 12:13 PM, Vinod Vavilapalli <
> vino...@hortonworks.com> wrote:
>
>> — YARN Timeline Service Next generation: YARN-2928: Lots of momentum,
>> but clearly a work in progress. Two options here
>> — If it is safe to ship it into 2.8 in a disabled manner, we can
>> get the early code into trunk and all the way into 2.8.
>> — If it is not safe, it organically rolls over into 2.9
>>
>
> I'll review the changes on YARN-2928 to see what impact it has (if any) if
> the timeline service v.2 is disabled.
>
> Another condition for it to make 2.8 is whether the branch will be in a
> shape in a couple of weeks such that it adds value for folks that want to
> test it. Hopefully it will become clearer soon.
>
> Sangjin
>


Hadoop-Mapreduce-trunk - Build # 2607 - Failure

2015-11-13 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2607/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 32266 lines...]
Running org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebApp
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.865 sec - 
in org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebApp
Running org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesJobConf
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.086 sec - in 
org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesJobConf
Running org.apache.hadoop.mapred.TestTaskAttemptListenerImpl
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.793 sec - in 
org.apache.hadoop.mapred.TestTaskAttemptListenerImpl
Running org.apache.hadoop.mapred.TestTaskAttemptFinishingMonitor
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.186 sec - in 
org.apache.hadoop.mapred.TestTaskAttemptFinishingMonitor
Running org.apache.hadoop.mapred.TestLocalContainerLauncher
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.46 sec - in 
org.apache.hadoop.mapred.TestLocalContainerLauncher

Results :

Failed tests: 
  
TestTaskAttempt.testMRAppHistoryForTAFailedInAssigned:177->testTaskAttemptAssignedKilledHistory:388
 No Ta Started JH Event

Tests run: 340, Failures: 1, Errors: 0, Skipped: 0

[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop MapReduce Client  SUCCESS [  8.718 s]
[INFO] Apache Hadoop MapReduce Core .. SUCCESS [04:56 min]
[INFO] Apache Hadoop MapReduce Common  SUCCESS [01:01 min]
[INFO] Apache Hadoop MapReduce Shuffle ... SUCCESS [ 11.685 s]
[INFO] Apache Hadoop MapReduce App ... FAILURE [11:04 min]
[INFO] Apache Hadoop MapReduce HistoryServer . SKIPPED
[INFO] Apache Hadoop MapReduce JobClient . SKIPPED
[INFO] Apache Hadoop MapReduce HistoryServer Plugins . SKIPPED
[INFO] Apache Hadoop MapReduce NativeTask  SKIPPED
[INFO] Apache Hadoop MapReduce Examples .. SKIPPED
[INFO] Apache Hadoop MapReduce ... SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 17:27 min
[INFO] Finished at: 2015-11-14T05:50:59+00:00
[INFO] Final Memory: 41M/825M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on 
project hadoop-mapreduce-client-app: There are test failures.
[ERROR] 
[ERROR] Please refer to 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/target/surefire-reports
 for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :hadoop-mapreduce-client-app
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Recording test results
Updating HADOOP-12374
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  
org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testMRAppHistoryForTAFailedInAssigned

Error Message:
No Ta Started JH Event

Stack Trace:
java.lang.AssertionError: No Ta Started JH Event
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testTaskAttemptAssignedKilledHistory(TestTaskAttempt.java:388)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testMRAppHistoryForTAFailedInAssigned(TestTaskAttempt.java:177)




Hadoop-Mapreduce-trunk - Build # 2608 - Still Failing

2015-11-13 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2608/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 32268 lines...]
Running org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesJobConf
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.964 sec - in 
org.apache.hadoop.mapreduce.v2.app.webapp.TestAMWebServicesJobConf
Running org.apache.hadoop.mapred.TestTaskAttemptListenerImpl
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.293 sec - in 
org.apache.hadoop.mapred.TestTaskAttemptListenerImpl
Running org.apache.hadoop.mapred.TestTaskAttemptFinishingMonitor
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.698 sec - in 
org.apache.hadoop.mapred.TestTaskAttemptFinishingMonitor
Running org.apache.hadoop.mapred.TestLocalContainerLauncher
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.299 sec - in 
org.apache.hadoop.mapred.TestLocalContainerLauncher

Results :

Failed tests: 
  
TestTaskAttempt.testMRAppHistoryForTAFailedInAssigned:177->testTaskAttemptAssignedKilledHistory:388
 No Ta Started JH Event

Tests run: 340, Failures: 1, Errors: 0, Skipped: 0

[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop MapReduce Client  SUCCESS [  5.427 s]
[INFO] Apache Hadoop MapReduce Core .. SUCCESS [02:59 min]
[INFO] Apache Hadoop MapReduce Common  SUCCESS [ 50.176 s]
[INFO] Apache Hadoop MapReduce Shuffle ... SUCCESS [  9.285 s]
[INFO] Apache Hadoop MapReduce App ... FAILURE [12:39 min]
[INFO] Apache Hadoop MapReduce HistoryServer . SKIPPED
[INFO] Apache Hadoop MapReduce JobClient . SKIPPED
[INFO] Apache Hadoop MapReduce HistoryServer Plugins . SKIPPED
[INFO] Apache Hadoop MapReduce NativeTask  SKIPPED
[INFO] Apache Hadoop MapReduce Examples .. SKIPPED
[INFO] Apache Hadoop MapReduce ... SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 16:46 min
[INFO] Finished at: 2015-11-14T07:26:32+00:00
[INFO] Final Memory: 49M/1215M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on 
project hadoop-mapreduce-client-app: There are test failures.
[ERROR] 
[ERROR] Please refer to 
/home/jenkins/jenkins-slave/workspace/Hadoop-Mapreduce-trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/target/surefire-reports
 for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :hadoop-mapreduce-client-app
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Recording test results
Updating HADOOP-12348
Updating HADOOP-12482
Updating HADOOP-11361
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  
org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testMRAppHistoryForTAFailedInAssigned

Error Message:
No Ta Started JH Event

Stack Trace:
java.lang.AssertionError: No Ta Started JH Event
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testTaskAttemptAssignedKilledHistory(TestTaskAttempt.java:388)
at 
org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testMRAppHistoryForTAFailedInAssigned(TestTaskAttempt.java:177)