Re: [VOTE] Release Apache Hadoop 3.2.0 - RC0

2018-11-28 Thread Jason Lowe
Thanks for driving this release, Sunil!

+1 (binding)

- Verified signatures and digests
- Successfully performed a native build
- Deployed a single-node cluster
- Ran some sample jobs

Jason
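
For anyone repeating the digest part of this check, the comparison can be sketched roughly as follows. This is a minimal illustration and not the procedure used for this vote: the function names, the SHA-512 default, and the file name in the usage note are assumptions, and signature verification (normally done with GnuPG against the project KEYS file) is not covered.

```python
import hashlib


def file_digest(path, algorithm="sha512", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digest_matches(path, expected_hex, algorithm="sha512"):
    """Compare a file against a published digest.

    Whitespace and letter case in the published value are ignored,
    since digest files are not always formatted the same way.
    """
    normalized = "".join(expected_hex.split()).lower()
    return file_digest(path, algorithm) == normalized
```

For example, digest_matches("hadoop-3.2.0.tar.gz", published_hex) (file name hypothetical) is true only when the local artifact hashes to the published value.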

On Fri, Nov 23, 2018 at 6:07 AM Sunil G  wrote:

> Hi folks,
>
>
>
> Thanks to all contributors who helped in this release [1]. I have created
> the first release candidate (RC0) for Apache Hadoop 3.2.0.
>
>
> Artifacts for this RC are available here:
>
> http://home.apache.org/~sunilg/hadoop-3.2.0-RC0/
>
>
>
> RC tag in git is release-3.2.0-RC0.
>
>
>
> The maven artifacts are available via repository.apache.org at
>
> https://repository.apache.org/content/repositories/orgapachehadoop-1174/
>
>
> This vote will run 7 days (5 weekdays), ending on Nov 30 at 11:59 pm PST.
>
>
>
> 3.2.0 contains 1079 [2] fixed JIRA issues since 3.1.0. The feature additions
> below are the highlights of this release.
>
> 1. Node Attributes Support in YARN
>
> 2. Hadoop Submarine project for running Deep Learning workloads on YARN
>
> 3. Support service upgrade via YARN Service API and CLI
>
> 4. HDFS Storage Policy Satisfier
>
> 5. Support Windows Azure Storage - Blob file system in Hadoop
>
> 6. Phase 3 improvements for S3Guard and Phase 5 improvements for S3A
>
> 7. Improvements in Router-based HDFS federation
>
>
>
> Thanks to Wangda, Vinod, Marton for helping me in preparing the release.
>
> I have done some testing with my pseudo cluster. My +1 to start.
>
>
>
> Regards,
>
> Sunil
>
>
>
> [1] https://lists.apache.org/thread.html/68c1745dcb65602aecce6f7e6b7f0af3d974b1bf0048e7823e58b06f@%3Cyarn-dev.hadoop.apache.org%3E
>
> [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND fixVersion in (3.2.0)
> AND fixVersion not in (3.1.0, 3.0.0, 3.0.0-beta1) AND status = Resolved
> ORDER BY fixVersion ASC
>


Re: [VOTE] Release Apache Hadoop 2.9.2 (RC0)

2018-11-19 Thread Jason Lowe
Thanks for driving this release, Akira!

+1 (binding)

- Verified signatures and digests
- Successfully performed native build from source
- Deployed a single-node cluster and ran some sample jobs

Jason

On Tue, Nov 13, 2018 at 7:02 PM Akira Ajisaka  wrote:

> Hi folks,
>
> I have put together a release candidate (RC0) for Hadoop 2.9.2. It
> includes 204 bug fixes and improvements since 2.9.1. [1]
>
> The RC is available at http://home.apache.org/~aajisaka/hadoop-2.9.2-RC0/
> Git signed tag is release-2.9.2-RC0 and the checksum is
> 826afbeae31ca687bc2f8471dc841b66ed2c6704
> The maven artifacts are staged at
> https://repository.apache.org/content/repositories/orgapachehadoop-1166/
>
> You can find my public key at:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Please try the release and vote. The vote will run for 5 days.
>
> [1] https://s.apache.org/2.9.2-fixed-jiras
>
> Thanks,
> Akira
>
> -
> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>
>


[jira] [Created] (MAPREDUCE-7155) TestHSAdminServer is failing

2018-10-22 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-7155:
-

 Summary: TestHSAdminServer is failing
 Key: MAPREDUCE-7155
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7155
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.3.0
Reporter: Jason Lowe


After HADOOP-15836 TestHSAdminServer has been failing consistently.  Sample 
stacktraces to follow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



[jira] [Resolved] (MAPREDUCE-6440) Duplicate Key in Json Output for Job details

2018-09-13 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6440.
---
  Resolution: Duplicate
Target Version/s:   (was: )

This has been fixed by MAPREDUCE-7133.

> Duplicate Key in Json Output for Job details
> 
>
> Key: MAPREDUCE-6440
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6440
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Anushri
>Priority: Minor
>
> Duplicate key in Json Output for Job details for the url : 
> http://:/ws/v1/history/mapreduce/jobs/job_id/tasks/task_id/attempts
> If the task type is "REDUCE" the json output for this url contains duplicate 
> key for "type".






Next Hadoop Contributors Meetup on September 25th

2018-09-13 Thread Jason Lowe
I am happy to announce that Oath will be hosting the next Hadoop
Contributors meetup on Tuesday, September 25th at Yahoo Building G, 589
Java Drive, Sunnyvale CA from 8:00AM to 6:00PM.

The agenda will look roughly as follows:

08:00AM - 08:30AM Arrival and Check-in
08:30AM - 12:00PM A series of brief talks with some of the topics including:
  - HDFS scalability and security
  - Use cases and future directions for Docker on YARN
  - Submarine (Deep Learning on YARN)
  - Hadoop in the cloud
  - Oath's use of machine learning, Vespa, and Storm
11:45AM - 12:30PM Lunch Break
12:30PM - 02:00PM Brief talks series resume
02:00PM - 04:30PM Parallel breakout sessions to discuss topics suggested by
attendees.  Some proposed topics include:
  - Improved security credentials management for long-running YARN
applications
  - Improved management of parallel development lines
  - Proposals for the next bug bash
  - Tez shuffle handler and DAG aware scheduler overview
04:30PM - 06:00PM Closing Reception

RSVP at https://www.meetup.com/Hadoop-Contributors/events/254012512/ is
REQUIRED to attend and spots are limited.  Security will be checking the
attendee list as you enter the building.

We will host a Google Hangouts/Meet so people who are interested but unable
to attend in person can participate remotely.  Details will be posted to
the meetup event.

Hope to see you there!

Jason


Re: [VOTE] Release Apache Hadoop 2.8.5 (RC0)

2018-09-10 Thread Jason Lowe
Thanks for driving the release, Junping!

+1 (binding)

- Verified signatures and digests
- Successfully performed a native build from source
- Successfully deployed a single-node cluster with the timeline server
- Ran some sample jobs and examined the web UI and job logs

Jason

On Mon, Sep 10, 2018 at 7:00 AM, 俊平堵  wrote:

> Hi all,
>
>  I've created the first release candidate (RC0) for Apache
> Hadoop 2.8.5. This is our next point release to follow up 2.8.4. It
> includes 33 important fixes and improvements.
>
>
> The RC artifacts are available at:
> http://home.apache.org/~junping_du/hadoop-2.8.5-RC0
>
>
> The RC tag in git is: release-2.8.5-RC0
>
>
>
> The maven artifacts are available via repository.apache.org at:
>
> https://repository.apache.org/content/repositories/orgapachehadoop-1140
>
>
> Please try the release and vote; the vote will run for the usual 5
> working days, ending on 9/15/2018 PST time.
>
>
> Thanks,
>
>
> Junping
>


[jira] [Resolved] (MAPREDUCE-6948) TestJobImpl.testUnusableNodeTransition failed

2018-07-17 Thread Jason Lowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6948.
---
Resolution: Cannot Reproduce

I agree as well.  I have not seen any recent precommit failures on 3.x releases 
for this unit test.

> TestJobImpl.testUnusableNodeTransition failed
> -
>
> Key: MAPREDUCE-6948
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6948
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha4
>Reporter: Haibo Chen
>Assignee: Jim Brennan
>Priority: Major
>  Labels: unit-test
>
> *Error Message*
> expected: but was:
> *Stacktrace*
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.assertJobState(TestJobImpl.java:1041)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.testUnusableNodeTransition(TestJobImpl.java:615)
> *Standard out*
> {code}
> 2017-08-30 10:12:21,928 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventType for class 
> org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler
> 2017-08-30 10:12:21,939 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl$StubbedJob
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.TaskEventType for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.jobhistory.EventType for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,940 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.JobFinishEvent$Type for class 
> org.apache.hadoop.yarn.event.EventHandler$$EnhancerByMockitoWithCGLIB$$79f96ebf
> 2017-08-30 10:12:21,941 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:setup(1534)) - Adding job token for job_123456789_0001 to 
> jobTokenSecretManager
> 2017-08-30 10:12:21,941 WARN  [Thread-49] impl.JobImpl 
> (JobImpl.java:setup(1540)) - Shuffle secret key missing from job credentials. 
> Using job token secret as shuffle secret.
> 2017-08-30 10:12:21,944 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:makeUberDecision(1305)) - Not uberizing job_123456789_0001 
> because: not enabled;
> 2017-08-30 10:12:21,944 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:createMapTasks(1562)) - Input size for job 
> job_123456789_0001 = 0. Number of splits = 2
> 2017-08-30 10:12:21,945 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:createReduceTasks(1579)) - Number of reduces for job 
> job_123456789_0001 = 1
> 2017-08-30 10:12:21,945 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from NEW 
> to INITED
> 2017-08-30 10:12:21,946 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from 
> INITED to SETUP
> 2017-08-30 10:12:21,954 INFO  [CommitterEvent Processor #0] 
> commit.CommitterEventHandler (CommitterEventHandler.java:run(231)) - 
> Processing the event EventType: JOB_SETUP
> 2017-08-30 10:12:21,978 INFO  [AsyncDispatcher event handler] impl.JobImpl 
> (JobImpl.java:handle(1017)) - job_123456789_0001Job Transitioned from 
> SETUP to RUNNING
> 2017-08-30 10:12:21,983 INFO  [Thread-49] event.AsyncDispatcher 
> (AsyncDispatcher.java:register(209)) - Registering class 
> org.apache.hadoop.mapreduce.v2.app.job.event.TaskAttemptEventType for class 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl$5
> 2017-08-30 10:12:22,000 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:transition(1953)) - Num completed Tasks: 1
> 2017-08-30 10:12:22,029 INFO  [Thread-49] impl.JobImpl 
> (JobImpl.java:transition(1953)) - Num completed Tasks: 2
> 2017-08-30 1

[jira] [Created] (MAPREDUCE-7118) Distributed cache conflicts break backwards compatibility

2018-07-03 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-7118:
-

 Summary: Distributed cache conflicts break backwards compatibility
 Key: MAPREDUCE-7118
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7118
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 3.1.0, 3.0.0, 3.2.0
Reporter: Jason Lowe
Assignee: Jason Lowe


MAPREDUCE-4503 made distributed cache conflicts break job submission, but this 
was quickly downgraded to a warning in MAPREDUCE-4549.  Unfortunately the 
latter did not go into trunk, so the fix is only in 0.23 and 2.x.  When Oozie, 
Pig, and other downstream projects that can occasionally generate distributed 
cache conflicts move to Hadoop 3.x the workflows that used to work on 0.23 and 
2.x no longer function.
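
To make the two behaviours concrete, here is a rough sketch of a conflict check that either fails the submission or only warns. It is an illustration under stated assumptions, not the Hadoop code: the function names, the rule that the link name is the URI fragment or else the file name, and the choice to keep the first entry in warning mode are all hypothetical.

```python
import posixpath
import warnings
from urllib.parse import urlparse


def link_name(uri):
    """Name the entry would be linked under (assumed: fragment, else file name)."""
    parsed = urlparse(uri)
    return parsed.fragment or posixpath.basename(parsed.path)


def resolve_cache_entries(uris, strict=False):
    """Map link names to sources, reporting different sources with one name.

    strict=True models a conflict breaking job submission;
    strict=False models the conflict being downgraded to a warning.
    """
    resolved = {}
    for uri in uris:
        name = link_name(uri)
        source = uri.split("#", 1)[0]
        previous = resolved.get(name)
        if previous is not None and previous != source:
            message = f"{source} conflicts with {previous} for name {name}"
            if strict:
                raise ValueError(message)
            warnings.warn(message)
            continue  # assumed: the first entry wins
        resolved[name] = source
    return resolved
```

With strict=False a workflow that registers two different util.jar files still submits and only logs a warning, which is the behaviour downstream projects relied on in 0.23 and 2.x.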






[jira] [Resolved] (MAPREDUCE-7080) Default speculator won't speculate the last several submitted reduce tasks if the total number of tasks is large

2018-04-17 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-7080.
---
Resolution: Duplicate

Closing as a duplicate of MAPREDUCE-7081.

> Default speculator won't speculate the last several submitted reduce tasks if
> the total number of tasks is large
> -
>
> Key: MAPREDUCE-7080
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7080
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 2.7.5
>Reporter: Zhizhen Hou
>Priority: Major
>
> DefaultSpeculator speculates a task only once.
> By default, the number of tasks that may be speculated is
> max(max(10, 0.01 * tasks.size), 0.1 * running tasks).
> I set mapreduce.job.reduce.slowstart.completedmaps = 1 so that reduces start
> only after all the map tasks have finished.
> The cluster has 1000 vcores, and the job has 5000 reduce tasks.
> At first, 1000 reduce tasks can run simultaneously, so the speculator can
> speculate at most 0.1 * 1000 = 100 tasks. Reduce tasks with less data
> finish quickly, and the speculator speculates one task per second by
> default. A task may be picked for speculation simply because it has more
> data to process. The speculator will speculate 100 tasks within 100 seconds.
> When 4900 reduces have finished, a reduce that has a lot of data to process
> and was placed on a slow machine will not be speculated, because the
> speculation opportunities have run out. This can increase the execution
> time of the job significantly.
> In short, the speculation opportunities may be wasted early on, only because
> the execution time of reduces with less data is taken as the average time.
> At the end of the job no speculation opportunity is left, especially for the
> last several running tasks, because the limit is judged by the number of
> running tasks.
>  
> In my opinion, the number of tasks that may be speculated should be scaled
> by the square of the fraction of finished tasks. For example, if ninety
> percent of the tasks have finished, only 0.9 * 0.9 = 0.81 of the speculation
> opportunities can be used. This leaves enough opportunities for later tasks.
>  
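
The arithmetic in the quoted report can be sketched as follows. This is only an illustration of the formula and the proposal as stated in the report, not the DefaultSpeculator code; the function and parameter names are hypothetical, and the defaults are the 10, 0.01 and 0.1 quoted above.

```python
def allowed_speculations(total_tasks, running_tasks,
                         minimum=10, total_fraction=0.01,
                         running_fraction=0.1):
    """Cap from the report: max(max(10, 0.01 * total), 0.1 * running)."""
    cap = max(max(minimum, total_fraction * total_tasks),
              running_fraction * running_tasks)
    return int(cap)


def usable_fraction(finished_fraction):
    """Proposal from the report: release the budget with the square of
    the fraction of finished tasks."""
    return finished_fraction ** 2


# Scenario from the report: 5000 reduces, 1000 of them running at first.
early_cap = allowed_speculations(total_tasks=5000, running_tasks=1000)

# Near the end of the job only a few tasks are still running.
late_cap = allowed_speculations(total_tasks=5000, running_tasks=100)
```

Under these numbers the early cap is 100 and the late cap falls back to the 0.01 * 5000 = 50 term, so a job that already speculated 100 tasks has nothing left for stragglers, which is the situation the reporter describes.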






Re: [VOTE] Release Apache Hadoop 2.7.6 (RC0)

2018-04-16 Thread Jason Lowe
Thanks for driving the release, Konstantin!

+1 (binding)

- Verified signatures and digests
- Completed a native build from source
- Deployed a single-node cluster
- Ran some sample jobs

Jason

On Mon, Apr 9, 2018 at 6:14 PM, Konstantin Shvachko
 wrote:
> Hi everybody,
>
> This is the next dot release of Apache Hadoop 2.7 line. The previous one 2.7.5
> was released on December 14, 2017.
> Release 2.7.6 includes critical bug fixes and optimizations. See more
> details in Release Note:
> http://home.apache.org/~shv/hadoop-2.7.6-RC0/releasenotes.html
>
> The RC0 is available at: http://home.apache.org/~shv/hadoop-2.7.6-RC0/
>
> Please give it a try and vote on this thread. The vote will run for 5 days
> ending 04/16/2018.
>
> My up to date public key is available from:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Thanks,
> --Konstantin




[jira] [Created] (MAPREDUCE-7079) TestMRIntermediateDataEncryption is failing in precommit builds

2018-04-12 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-7079:
-

 Summary: TestMRIntermediateDataEncryption is failing in precommit 
builds
 Key: MAPREDUCE-7079
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7079
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jason Lowe


TestMRIntermediateDataEncryption is either timing out or tearing down the JVM 
which causes the unit tests in jobclient to not pass cleanly during precommit 
builds. From sample precommit console output, note the lack of a test results 
line when the test is run:
{noformat}
[INFO] Running org.apache.hadoop.mapred.TestSequenceFileInputFormat
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.976 s 
- in org.apache.hadoop.mapred.TestSequenceFileInputFormat
[INFO] Running org.apache.hadoop.mapred.TestMRIntermediateDataEncryption
[INFO] Running org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.659 s 
- in org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
[...]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 02:14 h
[INFO] Finished at: 2018-04-12T04:27:06+00:00
[INFO] Final Memory: 24M/594M
[INFO] 
[WARNING] The requested profile "parallel-tests" could not be activated because 
it does not exist.
[WARNING] The requested profile "native" could not be activated because it does 
not exist.
[WARNING] The requested profile "yarn-ui" could not be activated because it 
does not exist.
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
project hadoop-mapreduce-client-jobclient: There was a timeout or other error 
in the fork -> [Help 1]
{noformat}







[jira] [Created] (MAPREDUCE-7078) TestPipeApplication is failing in precommit builds

2018-04-12 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-7078:
-

 Summary: TestPipeApplication is failing in precommit builds
 Key: MAPREDUCE-7078
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7078
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Jason Lowe


TestPipeApplication is either timing out or tearing down the JVM which causes 
the unit tests in jobclient to not pass cleanly during precommit builds.  From 
sample precommit console output, note the lack of a test results line when the 
test is run:
{noformat}
[INFO] Running org.apache.hadoop.mapred.TestIFile
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1 s - in 
org.apache.hadoop.mapred.TestIFile
[INFO] Running org.apache.hadoop.mapred.pipes.TestPipeApplication
[INFO] Running org.apache.hadoop.mapred.pipes.TestPipesNonJavaInputFormat
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.02 s - 
in org.apache.hadoop.mapred.pipes.TestPipesNonJavaInputFormat
[...]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 02:14 h
[INFO] Finished at: 2018-04-12T04:27:06+00:00
[INFO] Final Memory: 24M/594M
[INFO] 
[WARNING] The requested profile "parallel-tests" could not be activated because 
it does not exist.
[WARNING] The requested profile "native" could not be activated because it does 
not exist.
[WARNING] The requested profile "yarn-ui" could not be activated because it 
does not exist.
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
project hadoop-mapreduce-client-jobclient: There was a timeout or other error 
in the fork -> [Help 1]
{noformat}







[jira] [Created] (MAPREDUCE-7053) Timed out tasks can fail to produce thread dump

2018-02-14 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-7053:
-

 Summary: Timed out tasks can fail to produce thread dump
 Key: MAPREDUCE-7053
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7053
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 3.1.0, 3.0.1, 2.10.0, 2.9.1, 2.8.4, 2.7.6
Reporter: Jason Lowe


TestMRJobs#testThreadDumpOnTaskTimeout has been failing sporadically recently.  
When the AM times out a task it immediately removes it from the list of known 
tasks and then connects to the NM to request a thread dump followed by a kill.  
If the task heartbeats in after the task has been removed from the list of 
known tasks but before the thread dump signal arrives then the task can exit 
with a "org.apache.hadoop.mapred.Task: Parent died." message and no thread dump.
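
The ordering problem can be shown with a toy model. This is a deliberately simplified sketch and not the AM or NM code: the class, the function, the task id, and the event strings are hypothetical, and the only behaviour modelled is the one described above (the task is forgotten first, the thread dump signal arrives later).

```python
class AppMasterModel:
    """Toy stand-in for the AM's list of known tasks."""

    def __init__(self):
        self.known_tasks = set()

    def heartbeat(self, task_id):
        # A task that is no longer known is rejected.
        return task_id in self.known_tasks


def run_timeout(heartbeat_in_window):
    """Replay a task timeout, optionally with a heartbeat in the race window."""
    am = AppMasterModel()
    am.known_tasks.add("task_1")
    events = []
    task_alive = True

    # Step 1: the AM times out the task and forgets it immediately.
    am.known_tasks.discard("task_1")

    # Race window: the task may heartbeat before the signal arrives.
    if heartbeat_in_window and not am.heartbeat("task_1"):
        events.append("task exits: Parent died.")
        task_alive = False

    # Step 2: the thread dump and kill reach the task only if it still runs.
    if task_alive:
        events.append("thread dump")
        events.append("kill")
    return events
```

In this model run_timeout(False) yields the thread dump followed by the kill, while run_timeout(True) ends with the task exiting on its own and no thread dump, matching the sporadic failure.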






[jira] [Resolved] (MAPREDUCE-7049) Testcase TestMRJobs#testJobClassloaderWithCustomClasses fails

2018-02-06 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-7049.
---
Resolution: Duplicate

> Testcase TestMRJobs#testJobClassloaderWithCustomClasses fails 
> --
>
> Key: MAPREDUCE-7049
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7049
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, test
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
>
> The testcase TestMRJobs#testJobClassloaderWithCustomClasses fails 
> consistently with this error:
> {noformat}
> [INFO] ---
> [INFO]  T E S T S
> [INFO] ---
> [INFO] Running org.apache.hadoop.mapreduce.v2.TestMRJobs
> [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 
> 54.325 s <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobs
> [ERROR] 
> testJobClassloaderWithCustomClasses(org.apache.hadoop.mapreduce.v2.TestMRJobs)
>   Time elapsed: 10.531 s  <<< FAILURE!
> java.lang.AssertionError: 
> Job status: Application application_1517928628935_0001 failed 2 times due to 
> AM Container for appattempt_1517928628935_0001_02 exited with  exitCode: 1
> Failing this attempt.Diagnostics: [2018-02-06 15:50:38.688]Exception from 
> container-launch.
> Container id: container_1517928628935_0001_02_01
> Exit code: 1
> [2018-02-06 15:50:38.693]Container exited with a non-zero exit code 1. Error 
> file: prelaunch.err.
> Last 4096 bytes of prelaunch.err :
> Last 4096 bytes of stderr :
> log4j:WARN No appenders could be found for logger 
> (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
> info.
> [2018-02-06 15:50:38.694]Container exited with a non-zero exit code 1. Error 
> file: prelaunch.err.
> Last 4096 bytes of prelaunch.err :
> Last 4096 bytes of stderr :
> log4j:WARN No appenders could be found for logger 
> (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
> log4j:WARN Please initialize the log4j system properly.
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
> info.
> For more detailed output, check the application tracking page: 
> http://ubuntu:46235/cluster/app/application_1517928628935_0001 Then click on 
> links to logs of each attempt.
> . Failing the application.
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at 
> org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobClassloader(TestMRJobs.java:529)
>   at 
> org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobClassloaderWithCustomClasses(TestMRJobs.java:477)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}
> Today I found the offending commit with {{git bisect}} and this failure is 
> caused by {{YARN-2185}}.
> The application master fails because of the following error:
> {noformat}
> 2018-02-05 17:15:18,530 DEBUG [main] org.apache.hadoop.util.ExitUtil: Exiting 
> with status 1: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
> 1: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
> at org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:265)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1694)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
>

[jira] [Created] (MAPREDUCE-7033) Map outputs implicitly rely on permissive umask for shuffle

2018-01-11 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-7033:
-

 Summary: Map outputs implicitly rely on permissive umask for 
shuffle
 Key: MAPREDUCE-7033
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7033
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Jason Lowe


Map tasks do not explicitly set the permissions of their output files for 
shuffle.  In a secure cluster the shuffle service is running as a different 
user than the map task, so the output files require group readability in order 
to serve up the data during the shuffle phase.  If the user's UNIX umask is too 
restrictive (e.g.: 077) then the map task's file.out and file.out.index 
permissions can be too restrictive to allow the shuffle handler to access them.
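
The effect of the umask is plain bit arithmetic, sketched below. The requested mode of 0666 for a newly created file is an assumption for illustration (the report does not say which mode the map task requests), and the function names are hypothetical.

```python
import stat


def effective_mode(requested_mode, umask):
    """Mode a new file ends up with: the umask bits are cleared."""
    return requested_mode & ~umask


def group_can_read(mode):
    """True if the group-read bit is set."""
    return bool(mode & stat.S_IRGRP)


# Assumed requested mode for a new output file.
REQUESTED = 0o666

restrictive = effective_mode(REQUESTED, 0o077)  # owner-only access
permissive = effective_mode(REQUESTED, 0o027)   # group may still read
```

With a umask of 077 the file ends up as 0600, so a shuffle service running as a different user cannot read it through group membership; with 027 it ends up as 0640 and stays group readable.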



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




Re: Apache Hadoop 3.0.1 Release plan

2018-01-09 Thread Jason Lowe
Is it necessary to cut the branch so far ahead of the release?  branch-3.0
is already a maintenance line for 3.0.x releases.  Is there a known
feature/improvement planned to go into branch-3.0 that is not desirable for
the 3.0.1 release?

I have found in the past that branching so early leads to many useful fixes
being unnecessarily postponed to future releases because committers forget
to pick to the new, relatively long-lived patch branch.  This becomes
especially true if blockers end up dragging out the ultimate release date,
which has historically been quite common.  My preference would be to cut
this branch as close to the RC as possible.

Jason


On Tue, Jan 9, 2018 at 1:17 PM, Lei Xu  wrote:

> Hi, All
>
> We released Apache Hadoop 3.0.0 in December [1]. To further improve
> the quality of the release, we plan to cut branch-3.0.1 tomorrow in
> preparation for the Apache Hadoop 3.0.1 release. The focus of 3.0.1
> will be fixing blockers (3), critical bugs (1) and other bugs [2].
> No new features or improvements should be included.
>
> We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote on an RC on Feb
> 1st, targeting a Feb 9th release.
>
> Please feel free to share your insights.
>
> [1] https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
> [2] https://issues.apache.org/jira/issues/?filter=12342842
>
> Best,
> --
> Lei (Eddy) Xu
> Software Engineer, Cloudera


Re: [VOTE] Release Apache Hadoop 2.7.5 (RC1)

2017-12-12 Thread Jason Lowe
Thanks for driving the release, Konstantin!

+1 (binding)

- Verified signatures and digests
- Successfully performed a native build from source
- Deployed a single-node cluster
- Ran some sample jobs and checked the logs

Jason


On Thu, Dec 7, 2017 at 9:22 PM, Konstantin Shvachko 
wrote:

> Hi everybody,
>
> I updated CHANGES.txt and fixed documentation links.
> Also committed  MAPREDUCE-6165, which fixes a consistently failing test.
>
> This is RC1 for the next dot release of the Apache Hadoop 2.7 line. The
> previous one, 2.7.4, was released on August 4, 2017.
> Release 2.7.5 includes critical bug fixes and optimizations. See more
> details in Release Note:
> http://home.apache.org/~shv/hadoop-2.7.5-RC1/releasenotes.html
>
> RC1 is available at: http://home.apache.org/~shv/hadoop-2.7.5-RC1/
>
> Please give it a try and vote on this thread. The vote will run for 5 days
> ending 12/13/2017.
>
> My up to date public key is available from:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
>
> Thanks,
> --Konstantin
>


Re: [VOTE] Release Apache Hadoop 2.8.3 (RC0)

2017-12-12 Thread Jason Lowe
Thanks for driving this release, Junping!

+1 (binding)

- Verified signatures and digests
- Successfully performed native build from source
- Deployed a single-node cluster
- Ran some test jobs and examined the logs

Jason

On Tue, Dec 5, 2017 at 3:58 AM, Junping Du  wrote:

> Hi all,
>  I've created the first release candidate (RC0) for Apache Hadoop
> 2.8.3. This is our next maint release to follow up 2.8.2. It includes 79
> important fixes and improvements.
>
>   The RC artifacts are available at:
> http://home.apache.org/~junping_du/hadoop-2.8.3-RC0
>
>   The RC tag in git is: release-2.8.3-RC0
>
>   The maven artifacts are available via repository.apache.org at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1072
>
>   Please try the release and vote; the vote will run for the usual 5
> working days, ending on 12/12/2017 PST time.
>
> Thanks,
>
> Junping
>


[jira] [Resolved] (MAPREDUCE-7019) java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2

2017-12-08 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-7019.
---
Resolution: Invalid

Closing this since I believe the error is coming from the program being 
launched by the streaming job rather than an issue with the streaming framework 
code.  If this is incorrect, please provide details showing where the streaming 
framework code is going awry.


> java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed 
> with code 2
> -
>
> Key: MAPREDUCE-7019
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7019
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: shrutika sarda
>







Re: [VOTE] Release Apache Hadoop 2.9.0 (RC3)

2017-11-17 Thread Jason Lowe
Thanks for putting this release together!

+1 (binding)

- Verified signatures and digests
- Successfully built from source including native
- Deployed to single-node cluster and ran some test jobs

Jason


On Mon, Nov 13, 2017 at 6:10 PM, Arun Suresh  wrote:

> Hi Folks,
>
> Apache Hadoop 2.9.0 is the first release of the Hadoop 2.9 line and will be
> the starting release for the Apache Hadoop 2.9.x line - it includes 30 New
> Features with 500+ subtasks, 407 Improvements, and 790 Bug fixes newly fixed
> since 2.8.2.
>
> More information about the 2.9.0 release plan can be found here:
> https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
>
> New RC is available at:
> https://home.apache.org/~asuresh/hadoop-2.9.0-RC3/
>
> The RC tag in git is: release-2.9.0-RC3, and the latest commit id is:
> 756ebc8394e473ac25feac05fa493f6d612e6c50.
>
> The maven artifacts are available via repository.apache.org at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1068/
>
> We are carrying over the votes from the previous RC given that the delta is
> the license fix.
>
> Given the above - we are also going to stick with the original deadline for
> the vote : ending on Friday 17th November 2017 2pm PT time.
>
> Thanks,
> -Arun/Subru
>


Re: [VOTE] Release Apache Hadoop 2.8.2 (RC1)

2017-10-23 Thread Jason Lowe
My apologies, false alarm on the CHANGES.md and RELEASENOTES.md.  I was in
the process of reviewing the release and was interrupted, and when I
resumed I thought I had already downloaded the CHANGES and RELEASENOTES,
but in fact they were the old versions from a prior review of 2.8.0.  I
reviewed both of them for 2.8.2 (for real this time!) and they look
correct.  Again my apologies for the confusion.

Jason

On Mon, Oct 23, 2017 at 3:26 PM, Jason Lowe  wrote:

> +1 (binding)
>
> - Verified signatures and digests
> - Performed a native build from source
> - Deployed to a single-node cluster
> - Ran some sample jobs
>
> The CHANGES.md and RELEASENOTES.md both refer to release 2.8.0 instead of
> 2.8.2, and I do not see the list of JIRAs in CHANGES.md that have been
> committed since 2.8.1.  Since we're voting on the source bits rather than
> the change log I kept my vote as a +1 as I do see the 2.8.2 changes in the
> source code.
>
> Jason
>
>
> On Thu, Oct 19, 2017 at 7:42 PM, Junping Du  wrote:
>
>> Hi folks,
>>  I've created our new release candidate (RC1) for Apache Hadoop 2.8.2.
>>
>>  Apache Hadoop 2.8.2 is the first stable release of Hadoop 2.8 line
>> and will be the latest stable/production release for Apache Hadoop - it
>> includes 315 new fixed issues since 2.8.1 and 69 fixes are marked as
>> blocker/critical issues.
>>
>>   More information about the 2.8.2 release plan can be found here:
>> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release
>>
>>   New RC is available at:
>> http://home.apache.org/~junping_du/hadoop-2.8.2-RC1
>>
>>   The RC tag in git is: release-2.8.2-RC1, and the latest commit id
>> is: 66c47f2a01ad9637879e95f80c41f798373828fb
>>
>>   The maven artifacts are available via repository.apache.org at:
>> https://repository.apache.org/content/repositories/orgapachehadoop-1064
>>
>>   Please try the release and vote; the vote will run for the usual 5
>> days, ending on 10/24/2017 6pm PST time.
>>
>> Thanks,
>>
>> Junping
>>
>>
>


Re: [VOTE] Release Apache Hadoop 2.8.2 (RC1)

2017-10-23 Thread Jason Lowe
+1 (binding)

- Verified signatures and digests
- Performed a native build from source
- Deployed to a single-node cluster
- Ran some sample jobs

The CHANGES.md and RELEASENOTES.md both refer to release 2.8.0 instead of
2.8.2, and I do not see the list of JIRAs in CHANGES.md that have been
committed since 2.8.1.  Since we're voting on the source bits rather than
the change log I kept my vote as a +1 as I do see the 2.8.2 changes in the
source code.

Jason


On Thu, Oct 19, 2017 at 7:42 PM, Junping Du  wrote:

> Hi folks,
>  I've created our new release candidate (RC1) for Apache Hadoop 2.8.2.
>
>  Apache Hadoop 2.8.2 is the first stable release of Hadoop 2.8 line
> and will be the latest stable/production release for Apache Hadoop - it
> includes 315 new fixed issues since 2.8.1 and 69 fixes are marked as
> blocker/critical issues.
>
>   More information about the 2.8.2 release plan can be found here:
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release
>
>   New RC is available at:
> http://home.apache.org/~junping_du/hadoop-2.8.2-RC1
>
>   The RC tag in git is: release-2.8.2-RC1, and the latest commit id
> is: 66c47f2a01ad9637879e95f80c41f798373828fb
>
>   The maven artifacts are available via repository.apache.org at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1064
>
>   Please try the release and vote; the vote will run for the usual 5
> days, ending on 10/24/2017 6pm PST time.
>
> Thanks,
>
> Junping
>
>


[jira] [Resolved] (MAPREDUCE-6978) MR task counters deserialized through RPC throws OutOfBoundsException if Counter enum class version not match

2017-10-09 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6978.
---
Resolution: Duplicate

> MR task counters deserialized through RPC throws OutOfBoundsException if 
> Counter enum class version not match
> -
>
> Key: MAPREDUCE-6978
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6978
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mr-am, task
>Affects Versions: 3.0.0-alpha4
> Environment: NM1 TaskCounter.class old version; 
> NM2 TaskCounter.class new version (new Enumeration values appended); 
>Reporter: rangjiaheng
>
> Environment:
> NM1 TaskCounter.class old version; 
> NM2 TaskCounter.class new version (new Enumeration values appended); 
> Result:
> When an MR app's AM runs on NM1 and its containers run on NM2, the 
> containers on NM2 all fail and the AM throws an OutOfBoundsException.
> Reason:
> While the app is running, containers report their counters to the AM 
> through RPC. A container with the new version of TaskCounter.class writes 
> more Counter values to RPC than the AM, which has the old version of 
> TaskCounter.class, can read correctly.
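
The version-skew failure described in the report can be sketched in plain Java. This is an illustration only: OldCounter, NewCounter, and the ordinal-based wire format are hypothetical stand-ins, not Hadoop's actual TaskCounter class or its RPC serialization.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for two versions of the same counter enum.
// The newer version appends one value, as in the report above.
enum OldCounter { MAP_INPUT_RECORDS, MAP_OUTPUT_RECORDS }
enum NewCounter { MAP_INPUT_RECORDS, MAP_OUTPUT_RECORDS, NEW_METRIC }

final class CounterVersionSkew {
    // Writer on the newer node: emits one ordinal per counter it knows.
    static int[] writeOrdinals() {
        NewCounter[] all = NewCounter.values();
        int[] wire = new int[all.length];
        for (int i = 0; i < all.length; i++) {
            wire[i] = all[i].ordinal();
        }
        return wire;
    }

    // Strict reader on the older node: indexes directly into its own enum
    // values, so an unknown ordinal throws ArrayIndexOutOfBoundsException.
    static List<OldCounter> readStrict(int[] wire) {
        OldCounter[] known = OldCounter.values();
        List<OldCounter> out = new ArrayList<>();
        for (int ordinal : wire) {
            out.add(known[ordinal]);
        }
        return out;
    }

    // Tolerant reader: skips ordinals it does not recognise.
    static List<OldCounter> readTolerant(int[] wire) {
        OldCounter[] known = OldCounter.values();
        List<OldCounter> out = new ArrayList<>();
        for (int ordinal : wire) {
            if (ordinal >= 0 && ordinal < known.length) {
                out.add(known[ordinal]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] wire = writeOrdinals();
        System.out.println("tolerant reader saw: " + readTolerant(wire));
        try {
            readStrict(wire);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("strict reader failed: " + e);
        }
    }
}
```

Skipping unknown ordinals is only one possible policy; whether dropping counters silently is acceptable is a separate design question.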






[jira] [Created] (MAPREDUCE-6969) TestHSWebApp is failing

2017-09-26 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6969:
-

 Summary: TestHSWebApp is failing
 Key: MAPREDUCE-6969
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6969
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Jason Lowe


TestHSWebApp has been failing recently:
{noformat}
Running org.apache.hadoop.mapreduce.v2.hs.webapp.TestHSWebApp
Tests run: 17, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 5.57 sec <<< 
FAILURE! - in org.apache.hadoop.mapreduce.v2.hs.webapp.TestHSWebApp
testLogsViewBadStartEnd(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHSWebApp)  
Time elapsed: 0.076 sec  <<< FAILURE!
org.mockito.exceptions.verification.junit.ArgumentsAreDifferent: 
Argument(s) are different! Wanted:
printWriter.write(
"Invalid log end value: bar"
);
-> at 
org.apache.hadoop.mapreduce.v2.hs.webapp.TestHSWebApp.testLogsViewBadStartEnd(TestHSWebApp.java:261)
Actual invocation has different arguments:
printWriter.write(
"http://www.w3.org/TR/html4/strict.dtd";>"
);
-> at 
org.apache.hadoop.yarn.webapp.view.TextView.echoWithoutEscapeHtml(TextView.java:62)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at 
org.apache.hadoop.mapreduce.v2.hs.webapp.TestHSWebApp.testLogsViewBadStartEnd(TestHSWebApp.java:261)
{noformat}







[jira] [Created] (MAPREDUCE-6968) Staging directory erasure coding config property has a typo

2017-09-26 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6968:
-

 Summary: Staging directory erasure coding config property has a 
typo
 Key: MAPREDUCE-6968
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6968
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 3.0.0-beta1
Reporter: Jason Lowe
Assignee: Jason Lowe


TestMapreduceConfigFields has been failing since MAPREDUCE-6954. 
MRJobConfig#MR_AM_STAGING_DIR_ERASURECODING_ENABLED is defined as 
"yarn.app.mapreduce.am.staging-direrasurecoding.enabled"  but the property is 
listed as "yarn.app.mapreduce.am.staging-dir.erasurecoding.enabled" in 
mapred-default.xml.
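
A mismatch like this is what a code-versus-defaults key comparison catches. The following is a minimal sketch of such a comparison in plain Java; ConfigKeyCheck and missingFromDefaults are hypothetical names and this is not the actual TestMapreduceConfigFields implementation. The two key strings are the ones quoted above.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class ConfigKeyCheck {
    // Returns the keys declared in code that have no entry in the defaults file.
    static Set<String> missingFromDefaults(Set<String> declaredInCode,
                                           Set<String> inDefaultsXml) {
        Set<String> missing = new TreeSet<>(declaredInCode);
        missing.removeAll(inDefaultsXml);
        return missing;
    }

    public static void main(String[] args) {
        // The spelling defined in code versus the spelling in the defaults file.
        Set<String> code = new TreeSet<>(Arrays.asList(
            "yarn.app.mapreduce.am.staging-direrasurecoding.enabled"));
        Set<String> xml = new TreeSet<>(Arrays.asList(
            "yarn.app.mapreduce.am.staging-dir.erasurecoding.enabled"));
        System.out.println("declared in code but not in defaults: "
            + missingFromDefaults(code, xml));
    }
}
```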






[jira] [Resolved] (MAPREDUCE-6959) Understanding on process to start contribution

2017-09-18 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6959.
---
Resolution: Invalid

JIRA is for tracking features and bugs in Hadoop and not for general support.  
Questions such as these can be directed to the [mailing 
lists|http://hadoop.apache.org/mailing_lists.html].  Specifically if you're 
interested in contributing I highly recommend checking out the [How To 
Contribute|https://wiki.apache.org/hadoop/HowToContribute] wiki page.

Note that https://github.com/apache/hadoop-mapreduce is a mirror of just the 
MapReduce code from what looks like Hadoop 1.x or even earlier code that is no 
longer supported.  All active development is on Hadoop 2.x and Hadoop 3.x.


> Understanding on process to start contribution
> --
>
> Key: MAPREDUCE-6959
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6959
> Project: Hadoop Map/Reduce
>  Issue Type: Wish
>Reporter: Mehul
>Priority: Trivial
>
> I was trying to find process/steps to start with contribution into following 
> repo i.e. https://github.com/apache/hadoop-mapreduce. Can someone please help 
> with the detail so that I can create appropriate git/jira issue and start 
> working on it?
> Any direction would be really appreciated!
> Thanks,
> Mehul






[jira] [Created] (MAPREDUCE-6958) Shuffle audit logger should log size of shuffle transfer

2017-09-14 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6958:
-

 Summary: Shuffle audit logger should log size of shuffle transfer
 Key: MAPREDUCE-6958
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6958
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Minor


The shuffle audit logger currently logs the job ID and reducer ID but nothing 
about the size of the requested transfer.  It calculates this as part of the 
HTTP response headers, so it would be trivial to log the response size.  This 
would be very valuable for debugging network traffic storms from the shuffle 
handler.
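
As a rough sketch of the idea, assuming the response size is the sum of the per-map segment lengths being served: the class name, method names, and log line format below are all hypothetical and are not taken from the shuffle handler.

```java
import java.util.Arrays;
import java.util.List;

final class ShuffleAuditSketch {
    // Total bytes of the requested transfer, assumed here to be the sum of
    // the lengths of the map output segments being returned.
    static long responseSize(List<Long> segmentLengths) {
        long total = 0;
        for (long len : segmentLengths) {
            total += len;
        }
        return total;
    }

    // Hypothetical audit line: job ID and reducer ID as logged today,
    // plus the size of the response.
    static String auditLine(String jobId, int reduceId, long responseBytes) {
        return "shuffle job=" + jobId + " reducer=" + reduceId
            + " bytes=" + responseBytes;
    }

    public static void main(String[] args) {
        long size = responseSize(Arrays.asList(1024L, 2048L, 512L));
        // "job_0001" is a made-up job ID for illustration.
        System.out.println(auditLine("job_0001", 3, size));
    }
}
```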







[jira] [Created] (MAPREDUCE-6952) Using DistributedCache.addFileToClasspath with a rename fragment fails during job submit

2017-09-07 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6952:
-

 Summary: Using DistributedCache.addFileToClasspath with a rename 
fragment fails during job submit
 Key: MAPREDUCE-6952
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6952
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 2.8.1, 2.7.4
Reporter: Jason Lowe


Calling DistributedCache.addFileToClasspath with a Path that specifies a URI 
fragment, used to rename the file during localization, causes job submission to 
fail with a FileNotFoundException.
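
For readers unfamiliar with rename fragments, the sketch below uses only java.net.URI to show how the fragment separates from the real file location. It does not reproduce the bug or claim where the failure occurs; FragmentRename, its methods, and the hdfs://namenode/libs/foo.jar location are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class FragmentRename {
    // The URI with its fragment removed: the location that actually exists
    // on the filesystem and is safe to look up.
    static URI withoutFragment(URI uri) {
        try {
            return new URI(uri.getScheme(), uri.getAuthority(), uri.getPath(), null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Name to use after localization: the fragment if one is present,
    // otherwise the last path component.
    static String localName(URI uri) {
        if (uri.getFragment() != null) {
            return uri.getFragment();
        }
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        URI uri = URI.create("hdfs://namenode/libs/foo.jar#bar.jar");
        System.out.println("lookup:    " + withoutFragment(uri));
        System.out.println("localized: " + localName(uri));
    }
}
```

A lookup that uses the full string, fragment included, would search for a file that does not exist under that name.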






[jira] [Reopened] (MAPREDUCE-6641) TestTaskAttempt fails in trunk

2017-08-29 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe reopened MAPREDUCE-6641:
---

Seeing this fail the same way in 2.8 builds as well.  Unfortunately since the 
fix uses lambdas I can't just cherry-pick the fix down to other branches.  
Reopening so Jenkins can comment on a branch-2 version of the patch.
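
To illustrate why a lambda-based fix does not cherry-pick cleanly, assuming the older branches still build at a pre-Java 8 source level: the same logic has to be rewritten as an anonymous inner class. LambdaBackport and the comparator are hypothetical examples, not code from the patch.

```java
import java.util.Comparator;

final class LambdaBackport {
    // Java 8+ form: compiles only with source level 8 or later.
    static Comparator<String> byLengthLambda() {
        return (a, b) -> Integer.compare(a.length(), b.length());
    }

    // Equivalent form that also compiles at a Java 7 source level.
    static Comparator<String> byLengthAnonymous() {
        return new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(byLengthLambda().compare("aa", "b"));
        System.out.println(byLengthAnonymous().compare("aa", "b"));
    }
}
```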

> TestTaskAttempt fails in trunk
> --
>
> Key: MAPREDUCE-6641
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6641
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Tsuyoshi Ozawa
>Assignee: Haibo Chen
> Fix For: 3.0.0-alpha1
>
> Attachments: mapreduce6641.001.patch, mapreduce6641.002.patch, 
> MAPREDUCE-6641-branch-2.002.patch, 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt-output.txt
>
>
> {code}
> Running org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt
> Tests run: 23, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 24.917 sec 
> <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt
> testMRAppHistoryForTAFailedInAssigned(org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt)
>   Time elapsed: 12.732 sec  <<< FAILURE!
> java.lang.AssertionError: No Ta Started JH Event
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testTaskAttemptAssignedKilledHistory(TestTaskAttempt.java:388)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TestTaskAttempt.testMRAppHistoryForTAFailedInAssigned(TestTaskAttempt.java:177)
> {code}






Re: [VOTE] Merge feature branch YARN-5355 (Timeline Service v2) to trunk

2017-08-29 Thread Jason Lowe
+1 (binding)

I participated in the review for the reader authorization and verified that
ATSv2 has no significant impact when disabled.  Looking forward to seeing
the next increment in functionality in a release.  A big thank you to
everyone involved in this effort!

Jason


On Tue, Aug 22, 2017 at 1:32 AM, Vrushali Channapattan <
vrushalic2...@gmail.com> wrote:

> Hi folks,
>
> Per earlier discussion [1], I'd like to start a formal vote to merge
> feature branch YARN-5355 [2] (Timeline Service v.2) to trunk. The vote will
> run for 7 days, and will end August 29 11:00 PM PDT.
>
> We have previously completed one merge onto trunk [3] and Timeline Service
> v2 has been part of Hadoop release 3.0.0-alpha1.
>
> Since then, we have been working on extending the capabilities of Timeline
> Service v2 in a feature branch [2] for a while, and we are reasonably
> confident that the state of the feature meets the criteria to be merged
> onto trunk and we'd love folks to get their hands on it in a test capacity
> and provide valuable feedback so that we can make it production-ready.
>
> In a nutshell, Timeline Service v.2 delivers significant scalability and
> usability improvements based on a new architecture. What we would like to
> merge to trunk is termed "alpha 2" (milestone 2). The feature has a
> complete end-to-end read/write flow with security and read level
> authorization via whitelists. You should be able to start setting it up and
> testing it.
>
> At a high level, the following are the key features that have been
> implemented since alpha1:
> - Security via Kerberos Authentication and delegation tokens
> - Read side simple authorization via whitelist
> - Client configurable entity sort ordering
> - Richer REST APIs for apps, app attempts, containers, fetching metrics by
> timerange, pagination, sub-app entities
> - Support for storing sub-application entities (entities that exist outside
> the scope of an application)
> - Configurable TTLs (time-to-live) for tables, configurable table prefixes,
> configurable hbase cluster
> - Flow level aggregations done as dynamic (table level) coprocessors
> - Uses latest stable HBase release 1.2.6
>
> There are a total of 82 subtasks that were completed as part of this
> effort.
>
> We paid close attention to ensure that Timeline Service v.2 does not
> impact existing functionality when disabled (the default).
>
> Special thanks to a team of folks who worked hard and contributed towards
> this effort with patches, reviews and guidance: Rohith Sharma K S, Varun
> Saxena, Haibo Chen, Sangjin Lee, Li Lu, Vinod Kumar Vavilapalli, Joep
> Rottinghuis, Jason Lowe, Jian He, Robert Kanter, Micheal Stack.
>
> Regards,
> Vrushali
>
> [1] http://www.mail-archive.com/yarn-dev@hadoop.apache.org/msg27383.html
> [2] https://issues.apache.org/jira/browse/YARN-5355
> [3] https://issues.apache.org/jira/browse/YARN-2928
> [4] https://github.com/apache/hadoop/commits/YARN-5355
>


Re: [DISCUSS] Branches and versions for Hadoop 3

2017-08-28 Thread Jason Lowe
Allen Wittenauer wrote:


> > On Aug 25, 2017, at 1:23 PM, Jason Lowe  wrote:
> >
> > Allen Wittenauer wrote:
> >
> > > Doesn't this place an undue burden on the contributor with the first
> > > incompatible patch to prove worthiness?  What happens if it is decided
> > > that it's not good enough?
> >
> > It is a burden for that first, "this can't go anywhere else but 4.x"
> > change, but arguably that should not be a change done lightly anyway.
> > (Or any other backwards-incompatible change for that matter.)  If it's
> > worth committing then I think it's perfectly reasonable to send out the
> > dev announce that there's reason for trunk to diverge from 3.x, cut
> > branch-3, and move on.  This is no different than Andrew's recent
> > announcement that there's now a need for separating trunk and the 3.0
> > line based on what's about to go in.
>
> So, by this definition as soon as a patch comes in to remove
> deprecated bits there will be no issue with a branch-3 getting created,
> correct?
>

I think this gets back to the "if it's worth committing" part.  I feel the
community should collectively decide when it's worth taking the hit to
maintain the separate code line.  IMHO removing deprecated bits alone is
not reason enough to diverge the code base and the additional maintenance
that comes along with the extra code line.  A new feature is traditionally
the reason to diverge because that's something users would actually care
enough about to take the compatibility hit when moving to the version that
has it.  That also helps drive a timely release of the new code line
because users want the feature that went into it.


> > Otherwise if past trunk behavior is any indication, it ends up mostly
> > enabling people to commit to just trunk, forgetting that the thing they
> > are committing is perfectly valid for branch-3.
>
> I'm not sure there was any "forgetting" involved.  We likely
> wouldn't be talking about 3.x at all if it wasn't for the code diverging
> enough.
>

I don't think it was the myriad of small patches that went only into trunk
over the last 6 years that drove this.  Instead I think it was simply that
an "important enough" feature went in, like erasure coding, that gathered
momentum behind this release.  Trunk sat ignored for basically 5+ years,
and plenty of patches went into just trunk that should have gone into at
least branch-2 as well.  I don't think we as a community did the
contributors any favors by putting their changes into a code line that
didn't see a release for a very long time.  Yes 3.x could have released
sooner to help solve that issue, but given the complete lack of excitement
around 3.x until just recently is there any reason this won't happen again
with 4.x?  Seems to me 4.x will need to have something "interesting enough"
to drive people to release it relative to 3.x, which to me indicates we
shouldn't commit things only to there until we have an interest to do so.

> > > Given the number of committers that openly ignore discussions like
> > > this, who is going to verify that incompatible changes don't get in?
> >
> > The same entities who are verifying other bugs don't get in, i.e.: the
> > committers and the Hadoop QA bot running the tests.
> > Yes, I know that means it's inevitable that compatibility breakages
> > will happen, and we can and should improve the automation around
> > compatibility testing when possible.
>
> The automation only goes so far.  At least while investigating
> Yetus bugs, I've seen more than enough blatant and purposeful ignored
> errors and warnings that I'm not convinced it will be effective. ("That
> javadoc compile failure didn't come from my patch!"  Um, yes, yes it did.)
> PR for features has greatly trumped code correctness for a few years now.
>

I totally agree here.  We can and should do better about this outside of
automation.  I brought up automation since I see it as a useful part of the
total solution along with better developer education, oversight, etc.  I'm
thinking specifically about tools that can report on public API signature
changes, but that's just one aspect of compatibility.  Semantic behavior is
not something a static analysis tool can automatically detect, and the only
way to automate some of that is something like end-to-end compatibility
testing.  Bigtop may cover some of this with testing of older versions of
downstream projects like HBase, Hive, Oozie, etc., and we could set up some
tests that stand up two different Hadoop clusters and run tests that verify
interop between them.  But the tests will never be exhaustive and we will
still need educated commit

Re: [DISCUSS] Branches and versions for Hadoop 3

2017-08-25 Thread Jason Lowe
Allen Wittenauer wrote:


> Doesn't this place an undue burden on the contributor with the first
> incompatible patch to prove worthiness?  What happens if it is decided that
> it's not good enough?


It is a burden for that first, "this can't go anywhere else but 4.x"
change, but arguably that should not be a change done lightly anyway.  (Or
any other backwards-incompatible change for that matter.)  If it's worth
committing then I think it's perfectly reasonable to send out the dev
announce that there's reason for trunk to diverge from 3.x, cut branch-3,
and move on.  This is no different than Andrew's recent announcement that
there's now a need for separating trunk and the 3.0 line based on what's
about to go in.

I do not think it makes sense to pay for the maintenance overhead of two
nearly-identical lines with no backwards-incompatible changes between them
until we have the need.  Otherwise if past trunk behavior is any
indication, it ends up mostly enabling people to commit to just trunk,
forgetting that the thing they are committing is perfectly valid for
branch-3.  If we can agree that trunk and branch-3 should be equivalent
until an incompatible change goes into trunk, why pay for the commit
overhead and potential for accidentally missed commits until it is really
necessary?

> How many will it take before the dam will break?  Or is there a timeline
> going to be given before trunk gets set to 4.x?


I think the threshold count for the dam should be 1.  As soon as we have a
JIRA that needs to be committed to move the project forward and we cannot
ship it in a 3.x release then we create branch-3 and move trunk to 4.x.
As for a timeline going to 4.x, again I don't see it so much as a "baking
period" as a "when we need it" criteria.  If we need it in a week then we
should cut it in a week.  Or a year then a year.  It all depends upon when
that 4.x-only change is ready to go in.

> Given the number of committers that openly ignore discussions like this,
> who is going to verify that incompatible changes don't get in?
>

The same entities who are verifying other bugs don't get in, i.e.: the
committers and the Hadoop QA bot running the tests.  Yes, I know that means
it's inevitable that compatibility breakages will happen, and we can and
should improve the automation around compatibility testing when possible.
But I don't think there's a magic bullet for preventing all compatibility
bugs from being introduced, just like there isn't one for preventing
general bugs.  Does having a trunk branch separate but essentially similar
to branch-3 make this any better?

> Longer term:  what is the PMC doing to make sure we start doing major
> releases in a timely fashion again?  In other words, is this really an
> issue if we shoot for another major in (throws dart) 2 years?
>

If we're trying to do semantic versioning then we shouldn't have a regular
cadence for major releases unless we have a regular cadence of changes that
break compatibility.  I'd hope that's not something we would strive
towards.  I do agree that we should try to be better about shipping
releases, major or minor, in a more timely manner, but I don't agree that
we should cut 4.0 simply based on a duration since the last major release.
The release contents and community's desire for those contents should
dictate the release numbering and schedule, respectively.
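
As a small illustration of release contents driving the version number under semantic versioning (a sketch of the general rule, not a Hadoop release tool; VersionBump and its Change categories are hypothetical):

```java
import java.util.Arrays;
import java.util.List;

final class VersionBump {
    enum Change { BUG_FIX, FEATURE, INCOMPATIBLE }

    // Picks the next version from what is actually going into the release:
    // any incompatible change forces a new major version, a new feature a
    // new minor version, and anything else only a patch release.
    static String next(int major, int minor, int patch, List<Change> changes) {
        if (changes.contains(Change.INCOMPATIBLE)) {
            return (major + 1) + ".0.0";
        }
        if (changes.contains(Change.FEATURE)) {
            return major + "." + (minor + 1) + ".0";
        }
        return major + "." + minor + "." + (patch + 1);
    }

    public static void main(String[] args) {
        System.out.println(next(3, 0, 0, Arrays.asList(Change.BUG_FIX, Change.FEATURE)));
        System.out.println(next(3, 0, 0, Arrays.asList(Change.INCOMPATIBLE)));
    }
}
```

Under this rule a single incompatible change is enough to require the next major line, and elapsed time alone never is.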

Jason


On Fri, Aug 25, 2017 at 2:16 PM, Allen Wittenauer wrote:

>
> > On Aug 25, 2017, at 10:36 AM, Andrew Wang wrote:
>
> > Until we need to make incompatible changes, there's no need for
> > a Hadoop 4.0 version.
>
> Some questions:
>
> Doesn't this place an undue burden on the contributor with the
> first incompatible patch to prove worthiness?  What happens if it is
> decided that it's not good enough?
>
> How many will it take before the dam will break?  Or is there a
> timeline going to be given before trunk gets set to 4.x?
>
> Given the number of committers that openly ignore discussions like
> this, who is going to verify that incompatible changes don't get in?
>
> Longer term:  what is the PMC doing to make sure we start doing
> major releases in a timely fashion again?  In other words, is this really
> an issue if we shoot for another major in (throws dart) 2 years?
> -
> To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>
>


Re: Branch merges and 3.0.0-beta1 scope

2017-08-25 Thread Jason Lowe
Andrew Wang wrote:


> This means I'll cut branch-3 and
> branch-3.0, and move trunk to 4.0.0 before these VOTEs end. This will open
> up development for Hadoop 3.1.0 and 4.0.0.


I can see a need for branch-3.0, but please do not create branch-3.  Doing
so will relegate trunk back to the "patch purgatory" branch, a place where
patches won't see a release for years.  Unless something is imminently
going in that will break backwards compatibility and warrant a new 4.x
release, I don't see the need to distinguish trunk from the 3.x line.
Leaving trunk as the 3.x line means less branches to commit patches through
and more testing of every patch since trunk would remain an active area for
testing and releasing.  If we separate trunk and branch-3 then it's almost
certain only-trunk patches will start to accumulate and never get any
"real" testing until someone eventually decides it's time to go to Hadoop
4.x.  Looking back at trunk-as-3.x for an example, patches committed there
in the early days after branch-2 was cut didn't see a release for almost 6
years.

My apologies if I've missed a feature that is just going to miss the 3.0
release and will break compatibility when it goes in.  If so then we need
to cut branch-3, but if not then here's my plea to hold off until we do
need it.

Jason


On Thu, Aug 24, 2017 at 3:33 PM, Andrew Wang wrote:

> Glad to see the discussion continued in my absence :)
>
> From a release management perspective, it's *extremely* reasonable to block
> the inclusion of new features a month from the planned release date. A
> typical software development lifecycle includes weeks of feature freeze and
> weeks of code freeze. It is no knock on any developer or any feature to say
> that we should not include something in 3.0.0.
>
> I've been very open and clear about the goals, schedule, and scope of 3.0.0
> over the last year plus. The point of the extended alpha process was to get
> all our features in during alpha, and the alpha merge window has been open
> for a year. I'm unmoved by arguments about how long a feature has been
> worked on. None of these were part of the original 3.0.0 scope, and our
> users have been waiting even longer for big-ticket 3.0 items like JDK8 and
> HDFS EC that were part of the discussed scope.
>
> I see that two VOTEs have gone out since I was out. I still plan to follow
> the proposal in my original email. This means I'll cut branch-3 and
> branch-3.0, and move trunk to 4.0.0 before these VOTEs end. This will open
> up development for Hadoop 3.1.0 and 4.0.0.
>
> I'm reaching out to the lead contributor of each of these features
> individually to discuss. We need to close on this quickly, and email is too
> low bandwidth at this stage.
>
> Best,
> Andrew
>


[jira] [Resolved] (MAPREDUCE-6933) Invalid event: TA_CONTAINER_LAUNCH_FAILED at KILLED

2017-08-04 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6933.
---
Resolution: Duplicate

> Invalid event: TA_CONTAINER_LAUNCH_FAILED at KILLED
> ---
>
> Key: MAPREDUCE-6933
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6933
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.1, 3.0.0-alpha4
>Reporter: lujie
>
> When I ran a job on 0.23.1, I found an InvalidStateTransitonException:
> {code:java}
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> TA_CONTAINER_LAUNCH_FAILED at KILLED
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:301)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:926)
> at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:135)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:870)
> at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskAttemptEventDispatcher.handle(MRAppMaster.java:862)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:125)
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:82)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> After manually analysing the code of 3.0.0, I think this error may still 
> exist.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 2.7.4 (RC0)

2017-08-02 Thread Jason Lowe
Thanks for driving the 2.7.4 release!

+1 (binding)

- Verified signatures and digests
- Successfully built from source including native
- Deployed to a single-node cluster and ran sample MapReduce jobs

Jason

On Saturday, July 29, 2017 6:29 PM, Konstantin Shvachko 
 wrote:
 

 Hi everybody,

Here is the next release of the Apache Hadoop 2.7 line. The previous stable
release, 2.7.3, has been available since 25 August 2016.
Release 2.7.4 includes 264 issues fixed after release 2.7.3, which are
critical bug fixes and major optimizations. See more details in the Release
Notes:
http://home.apache.org/~shv/hadoop-2.7.4-RC0/releasenotes.html

The RC0 is available at: http://home.apache.org/~shv/hadoop-2.7.4-RC0/

Please give it a try and vote on this thread. The vote will run for 5 days
ending 08/04/2017.

Please note that my up-to-date public key is available from:
https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
Please don't forget to refresh the page if you've been there recently.
There are other places on Apache sites that may contain my outdated key.

Thanks,
--Konstantin


   

Re: Apache Hadoop 2.8.2 Release Plan

2017-07-21 Thread Jason Lowe
+1 to base the 2.8.2 release off of the more recent activity on branch-2.8.  
Because branch-2.8.2 was cut so long ago it is missing a lot of fixes that are 
in branch-2.8.  There also are a lot of JIRAs that claim they are fixed in 
2.8.2 but are not in branch-2.8.2.  Having the 2.8.2 release be based on recent 
activity in branch-2.8 would solve both of these issues, and we'd only need to 
move the handful of JIRAs that have marked themselves correctly as fixed in 
2.8.3 to be fixed in 2.8.2.

Jason
 

On Friday, July 21, 2017 10:01 AM, Kihwal Lee 
 wrote:
 

 Thanks for driving the next 2.8 release, Junping. While I was committing a 
blocker for 2.7.4, I noticed some of the jiras are back-ported to 2.7, but 
missing in branch-2.8.2.  Perhaps it is safer and easier to simply rebranch 
2.8.2.
Thanks,
Kihwal

On Thursday, July 20, 2017, 3:32:16 PM CDT, Junping Du  
wrote:

Hi all,
    Per Vinod's previous email, we just announced that Apache Hadoop 2.8.1 was 
released today; it is a special security release. Now we should work towards the 
2.8.2 release, which aims for production deployment. The focus obviously is on 
fixing blocker/critical issues [1] and other bugs, with *no* features / 
improvements. We currently have 13 blocker/critical issues, and 10 of them are 
Patch Available.

  I plan to cut an RC in a month, targeting a release before the end of August, to 
give enough time for outstanding blocker / critical issues. I will start moving 
out any tickets that are not blockers and/or won't fit the timeline. For the 
progress of the release effort, please refer to our release wiki [2].

  Please share thoughts if you have any. Thanks!

Thanks,

Junping

[1] 2.8.2 release Blockers/Criticals: https://s.apache.org/JM5x
[2] 2.8 Release wiki: 
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release


From: Vinod Kumar Vavilapalli 
Sent: Thursday, July 20, 2017 1:05 PM
To: gene...@hadoop.apache.org
Subject: [ANNOUNCE] Apache Hadoop 2.8.1 is released

Hi all,

The Apache Hadoop PMC has released version 2.8.1. You can get it from this 
page: http://hadoop.apache.org/releases.html#Download
This is a security release in the 2.8.0 release line. It consists of 2.8.0 plus 
security fixes. Users on 2.8.0 are encouraged to upgrade to 2.8.1.

Please note that the 2.8.x release line is still not ready for 
production use. Critical issues are being ironed out via testing and downstream 
adoption. Production users should wait for a subsequent release in the 2.8.x 
line.

Thanks
+Vinod



   

[jira] [Created] (MAPREDUCE-6916) History server scheduling tasks at fixed rate can be problematic when those tasks are slow

2017-07-18 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6916:
-

 Summary: History server scheduling tasks at fixed rate can be 
problematic when those tasks are slow
 Key: MAPREDUCE-6916
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6916
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.7.4
Reporter: Jason Lowe


The job history server currently schedules both the task of moving jobs from 
intermediate to done and the task of cleaning jobs at a fixed rate.  If those 
tasks take longer than the rate period to execute then a backlog of 
to-be-scheduled tasks can build up and cause a long storm of them to execute 
later when the blockage clears.
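To make the backlog effect concrete, here is a minimal sketch. It assumes the scheduling behaves like java.util.concurrent's ScheduledExecutorService, where scheduleAtFixedRate never overlaps runs of the same task but starts overdue runs back to back, while scheduleWithFixedDelay waits a full delay after each run finishes. The class and method names are hypothetical and only simulate start times; this is not the history server code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of start times (ms), not JobHistory code.
class SchedulingSketch {

    // Fixed rate: run k is due at k * period, but a run cannot start
    // before the previous run has finished, so overdue runs pile up
    // and then fire back to back.
    static List<Long> fixedRateStarts(long period, long[] durations) {
        List<Long> starts = new ArrayList<>();
        long prevEnd = 0;
        for (int k = 0; k < durations.length; k++) {
            long due = k * period;
            long start = Math.max(due, prevEnd);
            starts.add(start);
            prevEnd = start + durations[k];
        }
        return starts;
    }

    // Fixed delay: the next run starts a full delay after the previous
    // run ends, so a slow run pushes everything back without a storm.
    static List<Long> fixedDelayStarts(long delay, long[] durations) {
        List<Long> starts = new ArrayList<>();
        long start = 0;
        for (long duration : durations) {
            starts.add(start);
            start = start + duration + delay;
        }
        return starts;
    }

    public static void main(String[] args) {
        // One slow run (50 ms) followed by fast runs, period/delay of 10 ms.
        long[] durations = {50, 1, 1, 1};
        System.out.println("fixed rate:  " + fixedRateStarts(10, durations));
        System.out.println("fixed delay: " + fixedDelayStarts(10, durations));
    }
}
```

With one slow run, the fixed-rate schedule starts the next three runs almost immediately after each other, while the fixed-delay schedule keeps them spaced apart.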






[jira] [Created] (MAPREDUCE-6909) LocalJobRunner fails when run on a node from multiple users

2017-06-30 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6909:
-

 Summary: LocalJobRunner fails when run on a node from multiple 
users
 Key: MAPREDUCE-6909
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6909
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 2.8.1
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker


MAPREDUCE-5762 removed mapreduce.jobtracker.staging.root.dir from 
mapred-default.xml but the property is still being used by LocalJobRunner and 
the code default value does *not* match the value that was removed from 
mapred-default.xml.  This broke the use case where multiple users are running 
local mode jobs on the same node, since they now default to the same directory 
in /tmp.
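A rough illustration of the collision, using plain java.util.Properties rather than Hadoop's Configuration class. Only the property name comes from this report; the helper and both default paths are assumptions chosen for the example and may not match what was in mapred-default.xml or in LocalJobRunner.

```java
import java.util.Properties;

// Hypothetical sketch: how a shared code default collides across users
// while a default that embeds the user name does not.
class StagingDirSketch {
    static final String KEY = "mapreduce.jobtracker.staging.root.dir";

    // Assumed shared default: every user resolves to the same directory.
    static String sharedDefault(Properties conf) {
        return conf.getProperty(KEY, "/tmp/hadoop/mapred/staging");
    }

    // Assumed per-user default: the user name is part of the path.
    static String perUserDefault(Properties conf, String user) {
        return conf.getProperty(KEY, "/tmp/hadoop-" + user + "/mapred/staging");
    }

    public static void main(String[] args) {
        Properties conf = new Properties(); // property not set anywhere
        System.out.println(sharedDefault(conf));
        System.out.println(perUserDefault(conf, "alice"));
        System.out.println(perUserDefault(conf, "bob"));
    }
}
```

An explicit setting of the property still wins in both helpers, so sites that configure the staging root themselves would be unaffected either way.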






[jira] [Resolved] (MAPREDUCE-6898) TestKill.testKillTask is flaky

2017-06-16 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6898.
---
   Resolution: Duplicate
Fix Version/s: (was: 2.8.2)
   (was: 3.0.0-alpha4)
   (was: 2.9.0)

> TestKill.testKillTask is flaky
> --
>
> Key: MAPREDUCE-6898
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6898
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, test
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Attachments: MAPREDUCE-6898-001.patch
>
>
> TestKill.testKillTask() can fail if the async dispatcher thread is slower 
> than the test's thread.
> {noformat}
> 2017-05-26 11:43:26,532 INFO  [AsyncDispatcher event handler] impl.JobImpl 
> (JobImpl.java:handle(1006)) - job_0_Job Transitioned from INITED to SETUP
> Job State is : RUNNING
> Job State is : RUNNING Waiting for state : SUCCEEDED   map progress : 0.0   
> reduce progress : 0.0
> 2017-05-26 11:43:26,538 INFO  [CommitterEvent Processor #0] 
> commit.CommitterEventHandler (CommitterEventHandler.java:run(231)) - 
> Processing the event EventType: JOB_SETUP
> 2017-05-26 11:43:26,540 INFO  [AsyncDispatcher event handler] impl.TaskImpl 
> (TaskImpl.java:handle(661)) - task_0__m_00 Task Transitioned from NEW 
> to KILLED
> 2017-05-26 11:43:26,540 ERROR [AsyncDispatcher event handler] impl.JobImpl 
> (JobImpl.java:handle(998)) - Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> JOB_TASK_COMPLETED at SETUP
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1366)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1362)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>   at java.lang.Thread.run(Thread.java:745)
> 2017-05-26 11:43:26,541 INFO  [AsyncDispatcher event handler] impl.JobImpl 
> (JobImpl.java:handle(1006)) - job_0_Job Transitioned from SETUP to ERROR
> 2017-05-26 11:43:26,542 INFO  [AsyncDispatcher event handler] app.MRAppMaster 
> (MRAppMaster.java:serviceStop(978)) - Skipping cleaning up the staging dir. 
> assuming AM will be retried.
> {noformat}
> We have to wait until the job's internal state is 
> {{JobInternalState.RUNNING}} and not {{JobInternalState.SETUP}}.
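
The wait described above could look roughly like the sketch below. It is a generic polling helper under assumed names (the State enum and waitFor method are hypothetical stand-ins, not the MRApp test utilities), shown only to illustrate waiting on the internal state with a timeout before issuing the kill.

```java
import java.util.function.Supplier;

// Hypothetical polling helper, not the actual TestKill code.
class WaitForStateSketch {
    enum State { INITED, SETUP, RUNNING }

    // Polls the supplied state until it matches or the timeout expires.
    // Returns true if the wanted state was observed in time.
    static boolean waitFor(Supplier<State> current, State wanted, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (current.get() != wanted) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] polls = {0};
        // Simulated job: reports SETUP for three polls, then RUNNING.
        boolean ok = waitFor(
            () -> ++polls[0] < 4 ? State.SETUP : State.RUNNING,
            State.RUNNING, 5000);
        System.out.println("reached RUNNING: " + ok);
    }
}
```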






[jira] [Reopened] (MAPREDUCE-6898) TestKill.testKillTask is flaky

2017-06-16 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe reopened MAPREDUCE-6898:
---

No worries, I'll revert and mark this as a duplicate of MAPREDUCE-6815.

> TestKill.testKillTask is flaky
> --
>
> Key: MAPREDUCE-6898
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6898
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client, test
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
> Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2
>
> Attachments: MAPREDUCE-6898-001.patch
>
>
> TestKill.testKillTask() can fail if the async dispatcher thread is slower 
> than the test's thread.
> {noformat}
> 2017-05-26 11:43:26,532 INFO  [AsyncDispatcher event handler] impl.JobImpl 
> (JobImpl.java:handle(1006)) - job_0_Job Transitioned from INITED to SETUP
> Job State is : RUNNING
> Job State is : RUNNING Waiting for state : SUCCEEDED   map progress : 0.0   
> reduce progress : 0.0
> 2017-05-26 11:43:26,538 INFO  [CommitterEvent Processor #0] 
> commit.CommitterEventHandler (CommitterEventHandler.java:run(231)) - 
> Processing the event EventType: JOB_SETUP
> 2017-05-26 11:43:26,540 INFO  [AsyncDispatcher event handler] impl.TaskImpl 
> (TaskImpl.java:handle(661)) - task_0__m_00 Task Transitioned from NEW 
> to KILLED
> 2017-05-26 11:43:26,540 ERROR [AsyncDispatcher event handler] impl.JobImpl 
> (JobImpl.java:handle(998)) - Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> JOB_TASK_COMPLETED at SETUP
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996)
>   at 
> org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1366)
>   at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1362)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
>   at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109)
>   at java.lang.Thread.run(Thread.java:745)
> 2017-05-26 11:43:26,541 INFO  [AsyncDispatcher event handler] impl.JobImpl 
> (JobImpl.java:handle(1006)) - job_0_Job Transitioned from SETUP to ERROR
> 2017-05-26 11:43:26,542 INFO  [AsyncDispatcher event handler] app.MRAppMaster 
> (MRAppMaster.java:serviceStop(978)) - Skipping cleaning up the staging dir. 
> assuming AM will be retried.
> {noformat}
> We have to wait until the job's internal state is 
> {{JobInternalState.RUNNING}} and not {{JobInternalState.SETUP}}.






Re: Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-04-18 Thread Jason Lowe
Thanks for the pointers, Sean!  According to the infrastructure team, 
apparently it was a typo in the protection scheme that allowed the trunk force 
push to go through.  
 
https://issues.apache.org/jira/browse/INFRA-13902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971643#comment-15971643
   
Jason
 On Monday, April 17, 2017 3:05 PM, Sean Busbey  wrote:
 

 disallowing force pushes to trunk was done back in:

* August 2014: INFRA-8195
* February 2016: INFRA-11136

On Mon, Apr 17, 2017 at 11:18 AM, Jason Lowe
 wrote:
> I found at least one commit that was dropped, MAPREDUCE-6673.  I was able to 
> cherry-pick the original commit hash since it was recorded in the commit 
> email.
> This begs the question of why we're allowing force pushes to trunk.  I 
> thought we asked to have that disabled the last time trunk was accidentally 
> clobbered?
> Jason
>
>
>    On Monday, April 17, 2017 10:18 AM, Arun Suresh  wrote:
>
>
>  Hi
>
> I had the Apr-14 eve version of trunk on my local machine. I've pushed that.
> Don't know if anything was committed over the weekend though.
>
> Cheers
> -Arun
>
> On Mon, Apr 17, 2017 at 7:17 AM, Anu Engineer 
> wrote:
>
>> Hi Allen,
>>
>> https://issues.apache.org/jira/browse/INFRA-13902
>>
>> That happened with ozone branch too. It was an inadvertent force push.
>> Infra has advised us to force push the latest branch if you have it.
>>
>> Thanks
>> Anu
>>
>>
>> On 4/17/17, 7:10 AM, "Allen Wittenauer"  wrote:
>>
>> >Looks like someone reset HEAD back to Mar 31.
>> >
>> >Sent from my iPad
>> >
>> >> On Apr 16, 2017, at 12:08 AM, Apache Jenkins Server <jenk...@builds.apache.org> wrote:
>> >>
>> >> For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/378/
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> -1 overall
>> >>
>> >>
>> >> The following subsystems voted -1:
>> >>    docker
>> >>
>> >>
>> >> Powered by Apache Yetus 0.5.0-SNAPSHOT  http://yetus.apache.org
>> >>
>> >>
>> >>
>> >
>> >
>> >
>> >
>>
>>
>
>
>



-- 
busbey


   

Re: Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-04-17 Thread Jason Lowe
I found at least one commit that was dropped, MAPREDUCE-6673.  I was able to 
cherry-pick the original commit hash since it was recorded in the commit email.
This begs the question of why we're allowing force pushes to trunk.  I thought 
we asked to have that disabled the last time trunk was accidentally clobbered?
Jason
 

On Monday, April 17, 2017 10:18 AM, Arun Suresh  wrote:
 

 Hi

I had the Apr-14 eve version of trunk on my local machine. I've pushed that.
Don't know if anything was committed over the weekend though.

Cheers
-Arun

On Mon, Apr 17, 2017 at 7:17 AM, Anu Engineer 
wrote:

> Hi Allen,
>
> https://issues.apache.org/jira/browse/INFRA-13902
>
> That happened with ozone branch too. It was an inadvertent force push.
> Infra has advised us to force push the latest branch if you have it.
>
> Thanks
> Anu
>
>
> On 4/17/17, 7:10 AM, "Allen Wittenauer"  wrote:
>
> >Looks like someone reset HEAD back to Mar 31.
> >
> >Sent from my iPad
> >
> >> On Apr 16, 2017, at 12:08 AM, Apache Jenkins Server <jenk...@builds.apache.org> wrote:
> >>
> >> For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/378/
> >>
> >>
> >>
> >>
> >>
> >> -1 overall
> >>
> >>
> >> The following subsystems voted -1:
> >>    docker
> >>
> >>
> >> Powered by Apache Yetus 0.5.0-SNAPSHOT  http://yetus.apache.org
> >>
> >>
> >>
> >
> >
> >
> >
>
>


   

[jira] [Resolved] (MAPREDUCE-6869) org.apache.hadoop.mapred.ShuffleHandler: Shuffle error in populating headers :

2017-03-28 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6869.
---
Resolution: Not A Bug

Closing this since it does not appear to be a problem in Hadoop.  Please reopen 
with additional evidence if you find otherwise.

> org.apache.hadoop.mapred.ShuffleHandler: Shuffle error in populating headers :
> --
>
> Key: MAPREDUCE-6869
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6869
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2, yarn
>Affects Versions: 2.6.0
> Environment: hadoop 2.6.0-cdh5.8.2
>Reporter: 翟玉勇
>Priority: Minor
>
> nodemanager log
> 2017-03-25 21:07:03,071 ERROR org.apache.hadoop.mapred.ShuffleHandler: 
> Shuffle error in populating headers :
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find 
> usercache/master/appcache/application_1489067586592_930490/output/attempt_1489067586592_930490_m_002811_0/file.out.index
>  in any of the configured local directories
> at 
> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:488)
> at 
> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:165)
> at 
> org.apache.hadoop.mapred.ShuffleHandler$Shuffle.getMapOutputInfo(ShuffleHandler.java:1000)
> at 
> org.apache.hadoop.mapred.ShuffleHandler$Shuffle.populateHeaders(ShuffleHandler.java:1022)
> at 
> org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:908)
> at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787)
> at 
> org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:142)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787)
> at 
> org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:148)
> at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:787)
> at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
> at 
> org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:459)
> at 
> org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:536)
> at 
> org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:435)
> at 
> org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:560)
> at 
> org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:555)
> at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
> at 
> org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
> at 
> org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
> at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:107)
> at 
> org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
> at 
> org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
> at 
> org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
> at 
> org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
> at 
> org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
>

Re: [VOTE] Release Apache Hadoop 2.8.0 (RC3)

2017-03-17 Thread Jason Lowe
+1 (binding)

- Verified signatures and digests
- Performed a native build from the release tag
- Deployed to a single node cluster
- Ran some sample jobs

Jason
 

On Friday, March 17, 2017 4:18 AM, Junping Du  wrote:
 

 Hi all,
    With the fix for HDFS-11431 in, I've created a new release candidate (RC3) 
for Apache Hadoop 2.8.0.

    This is the next minor release to follow 2.7.0, which was released 
more than a year ago. It comprises 2,900+ fixes, improvements, and new 
features. Most of these commits are released for the first time in branch-2.

      More information about the 2.8.0 release plan can be found here: 
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release

      New RC is available at: 
http://home.apache.org/~junping_du/hadoop-2.8.0-RC3

      The RC tag in git is: release-2.8.0-RC3, and the latest commit id is: 
91f2b7a13d1e97be65db92ddabc627cc29ac0009

      The maven artifacts are available via repository.apache.org at: 
https://repository.apache.org/content/repositories/orgapachehadoop-1057

      Please try the release and vote; the vote will run for the usual 5 days, 
ending on 03/22/2017 PDT time.

Thanks,

Junping

   

Re: Updated 2.8.0-SNAPSHOT artifact

2016-11-04 Thread Jason Lowe
At this point my preference would be to do the most expeditious thing to 
release 2.8, whether that's sticking with the branch-2.8 we have today or 
re-cutting it on branch-2.  Doing a quick JIRA query, there have been almost 2,400 
JIRAs resolved in 2.8.0 (1).  For many of them, it's well past time they saw a 
release vehicle.  If re-cutting the branch means we have to wrap up a few extra 
things that are still in-progress on branch-2 or add a few more blockers to the 
list before we release then I'd rather stay where we're at and ship it ASAP.

Jason
(1) 
https://issues.apache.org/jira/issues/?jql=project%20in%20%28hadoop%2C%20yarn%2C%20mapreduce%2C%20hdfs%29%20and%20resolution%20%3D%20Fixed%20and%20fixVersion%20%3D%202.8.0





On Tuesday, October 25, 2016 5:31 PM, Karthik Kambatla  
wrote:
 

 Is there value in releasing current branch-2.8? Aren't we better off
re-cutting the branch off of branch-2?

On Tue, Oct 25, 2016 at 12:20 AM, Akira Ajisaka 
wrote:

> It's almost a year since branch-2.8 was cut.
> I think we need to release 2.8.0 ASAP.
>
> According to the following list, there are 5 blocker and 6 critical issues.
> https://issues.apache.org/jira/issues/?filter=12334985
>
> Regards,
> Akira
>
>
> On 10/18/16 10:47, Brahma Reddy Battula wrote:
>
>> Hi Vinod,
>>
>> Any plans for the first RC for branch-2.8? I think it has been a long time.
>>
>>
>>
>>
>> --Brahma Reddy Battula
>>
>> -Original Message-
>> From: Vinod Kumar Vavilapalli [mailto:vino...@apache.org]
>> Sent: 20 August 2016 00:56
>> To: Jonathan Eagles
>> Cc: common-...@hadoop.apache.org
>> Subject: Re: Updated 2.8.0-SNAPSHOT artifact
>>
>> Jon,
>>
>> That is around the time when I branched 2.8, so I guess you were getting
>> SNAPSHOT artifacts till then from the branch-2 nightly builds.
>>
>> If you need it, we can set up SNAPSHOT builds. Or just wait for the first
>> RC, which is around the corner.
>>
>> +Vinod
>>
>> On Jul 28, 2016, at 4:27 PM, Jonathan Eagles  wrote:
>>>
>>> The latest snapshot was uploaded in Nov 2015, but checkins are still coming
>>> in quite frequently.
>>> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-yarn-api/
>>>
>>> Are there any plans to start producing updated SNAPSHOT artifacts for
>>> current hadoop development lines?
>>>
>>
>>
>>
>>
>>
>>
>
>
>


   

Re: [VOTE] Release Apache Hadoop 2.6.5 (RC1)

2016-10-10 Thread Jason Lowe
+1 (binding)

- Verified signatures and digests
- Built native from source
- Deployed to a single-node cluster and ran some sample jobs

Jason
 

On Sunday, October 2, 2016 7:13 PM, Sangjin Lee  wrote:
 

 Hi folks,

I have pushed a new release candidate (RC1) for the Apache Hadoop 2.6.5
release (the next maintenance release in the 2.6.x release line). RC1
contains fixes to CHANGES.txt, and is otherwise identical to RC0.

Below are the details of this release candidate:

The RC is available for validation at:
http://home.apache.org/~sjlee/hadoop-2.6.5-RC1/.

The RC tag in git is release-2.6.5-RC1 and its git commit is
e8c9fe0b4c252caf2ebf1464220599650f119997.

The maven artifacts are staged via repository.apache.org at:
https://repository.apache.org/content/repositories/orgapachehadoop-1050/.

You can find my public key at
http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS.

Please try the release and vote. The vote will run for the usual 5 days. I
would greatly appreciate your timely vote. Thanks!

Regards,
Sangjin


   

Re: [VOTE] Release Apache Hadoop 2.7.3 RC2

2016-08-22 Thread Jason Lowe
+1 (binding)

- Verified signatures and digests
- Successfully built from source with native support
- Deployed a single-node cluster
- Ran some sample jobs successfully

Jason

  From: Vinod Kumar Vavilapalli 
 To: "common-...@hadoop.apache.org" ; 
hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org; 
"mapreduce-dev@hadoop.apache.org"  
Cc: Vinod Kumar Vavilapalli 
 Sent: Wednesday, August 17, 2016 9:05 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.3 RC2
   
Hi all,

I've created a new release candidate RC2 for Apache Hadoop 2.7.3.

As discussed before, this is the next maintenance release to follow up 2.7.2.

The RC is available for validation at: 
http://home.apache.org/~vinodkv/hadoop-2.7.3-RC2/ 


The RC tag in git is: release-2.7.3-RC2

The maven artifacts are available via repository.apache.org at 
https://repository.apache.org/content/repositories/orgapachehadoop-1046 


The release-notes are inside the tar-balls at location 
hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted 
this at http://home.apache.org/~vinodkv/hadoop-2.7.3-RC2/releasenotes.html for 
your quick perusal.

As you may have noted,
 - a few issues with RC0 forced an RC1 [1]
 - a few more issues with RC1 forced an RC2 [2]
 - a very long fix-cycle for the License & Notice issues (HADOOP-12893) caused 
2.7.3 (along with every other Hadoop release) to slip by quite a bit. This 
release's related discussion thread is linked below: [3].

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1] [VOTE] Release Apache Hadoop 2.7.3 RC0: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/index.html#26106 

[2] [VOTE] Release Apache Hadoop 2.7.3 RC1: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg26336.html 

[3] 2.7.3 release plan: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg24439.html 


   

[jira] [Created] (MAPREDUCE-6763) Shuffle server listen queue is too small

2016-08-19 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6763:
-

 Summary: Shuffle server listen queue is too small
 Key: MAPREDUCE-6763
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6763
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Jason Lowe
Assignee: Jason Lowe


ShuffleHandler doesn't specify a listen queue length for the server port, so it 
ends up getting the default listen queue length of 50.  This is too small to 
handle bursts of shuffle traffic on large clusters.  It's also inconsistent 
with the default Hadoop uses for RPC servers (default=128).
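
For illustration only, the sketch below shows where a listen queue length is specified when opening a server port with plain java.net.ServerSocket. It is an assumption-laden stand-in: ShuffleHandler itself sits on a different networking stack, and the class and method names here are hypothetical. The value 128 simply mirrors the RPC default mentioned above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Hypothetical sketch using java.net.ServerSocket, not ShuffleHandler code.
class BacklogSketch {
    // Mirrors the RPC server default cited in this report.
    static final int BACKLOG = 128;

    // Binds an ephemeral port with an explicit listen queue length and
    // returns the port that was assigned. The backlog is a hint: the
    // operating system may apply its own upper limit.
    static int bindAndReport(int backlog) {
        try (ServerSocket server = new ServerSocket(0, backlog)) {
            return server.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bound port: " + bindAndReport(BACKLOG));
    }
}
```

Without the second constructor argument, ServerSocket falls back to its documented default queue length of 50, which matches the number observed here.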






Re: [VOTE] Release Apache Hadoop 2.7.3 RC1

2016-08-15 Thread Jason Lowe
+1 (binding)

- Verified signatures and digests
- Built from source with native support
- Deployed a pseudo-distributed cluster
- Ran some sample jobs

Jason

  From: Vinod Kumar Vavilapalli 
 To: "common-...@hadoop.apache.org" ; 
hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org; 
"mapreduce-dev@hadoop.apache.org"  
Cc: Vinod Kumar Vavilapalli 
 Sent: Friday, August 12, 2016 11:45 AM
 Subject: [VOTE] Release Apache Hadoop 2.7.3 RC1
   
Hi all,

I've created a release candidate RC1 for Apache Hadoop 2.7.3.

As discussed before, this is the next maintenance release to follow up 2.7.2.

The RC is available for validation at: 
http://home.apache.org/~vinodkv/hadoop-2.7.3-RC1/ 


The RC tag in git is: release-2.7.3-RC1

The maven artifacts are available via repository.apache.org at 
https://repository.apache.org/content/repositories/orgapachehadoop-1045/ 


The release-notes are inside the tar-balls at location 
hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted 
this at home.apache.org/~vinodkv/hadoop-2.7.3-RC1/releasenotes.html for your 
quick perusal.

As you may have noted,
 - a few issues with RC0 forced an RC1 [1]
 - a very long fix-cycle for the License & Notice issues (HADOOP-12893) caused 
2.7.3 (along with every other Hadoop release) to slip by quite a bit. This 
release's related discussion thread is linked below: [2].

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1] [VOTE] Release Apache Hadoop 2.7.3 RC0: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/index.html#26106 

[2]: 2.7.3 release plan: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg24439.html 


   

Re: [Release thread] 2.6.5 release activities

2016-08-10 Thread Jason Lowe
Thanks for organizing this, Chris!
I don't believe HADOOP-13362 is needed since it's related to ContainerMetrics.  
ContainerMetrics weren't added until 2.7 by YARN-2984.
YARN-4794 looks applicable to 2.6.  The change drops right in except it has 
JDK7-isms (multi-catch clause), so it needs a slight change.
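
As a generic illustration of the kind of slight change meant here (this is not the YARN-4794 patch; the method and exception types are made up for the example), a JDK 7 multi-catch can be split into separate catch blocks so the code compiles on an older JDK:

```java
// Hypothetical example of removing a multi-catch clause.
class MultiCatchSketch {

    // JDK 7+ form, shown as a comment because it would not compile on JDK 6:
    //   } catch (NumberFormatException | ArithmeticException e) {
    //       return "error: " + e.getClass().getSimpleName();
    //   }
    static String divide(String a, String b) {
        try {
            return String.valueOf(Integer.parseInt(a) / Integer.parseInt(b));
        } catch (NumberFormatException e) {
            // First half of the former multi-catch.
            return "error: " + e.getClass().getSimpleName();
        } catch (ArithmeticException e) {
            // Second half, with the same handling duplicated.
            return "error: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(divide("6", "3"));
        System.out.println(divide("x", "3"));
        System.out.println(divide("1", "0"));
    }
}
```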

Jason

  From: Chris Trezzo 
 To: "common-...@hadoop.apache.org" ; 
hdfs-...@hadoop.apache.org; "mapreduce-dev@hadoop.apache.org" 
; "yarn-...@hadoop.apache.org" 
 
 Sent: Tuesday, August 9, 2016 7:32 PM
 Subject: [Release thread] 2.6.5 release activities
   
Based on the sentiment in the "[DISCUSS] 2.6.x line releases" thread, I
have moved forward with some of the initial effort in creating a 2.6.5
release. I am forking this thread so we have a dedicated 2.6.5 release
thread.

I have gone through the git logs and gathered a list of JIRAs that are in
branch-2.7 but are missing from branch-2.6. I limited the diff to issues
with a commit date after 1/26/2016. I did this because 2.6.4 was cut from
branch-2.6 around that date (http://markmail.org/message/xmy7ebs6l3643o5e)
and presumably issues that were committed to branch-2.7 before then were
already looked at as part of 2.6.4.
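
The branch comparison described above can be sketched as a set difference over JIRA keys pulled from commit subjects. This is a minimal sketch and not the procedure actually used: it assumes the commit subjects of each branch have already been collected into lists, the class and method names are hypothetical, and the sample subjects in `main` are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not the tooling used for the spreadsheet.
class BackportCandidates {

    // Matches JIRA keys of the four Hadoop projects inside a commit subject.
    private static final Pattern JIRA_KEY =
        Pattern.compile("\\b(HADOOP|HDFS|YARN|MAPREDUCE)-\\d+\\b");

    // Collects every JIRA key mentioned in the given commit subjects.
    static Set<String> jiraKeys(List<String> commitSubjects) {
        Set<String> keys = new TreeSet<>();
        for (String subject : commitSubjects) {
            Matcher m = JIRA_KEY.matcher(subject);
            while (m.find()) {
                keys.add(m.group());
            }
        }
        return keys;
    }

    // Keys present in the newer branch but absent from the older one.
    static Set<String> missingFromOlder(List<String> newerBranch, List<String> olderBranch) {
        Set<String> missing = jiraKeys(newerBranch);
        missing.removeAll(jiraKeys(olderBranch));
        return missing;
    }

    public static void main(String[] args) {
        // Made-up commit subjects, for illustration only.
        List<String> branch27 = Arrays.asList(
            "YARN-1. Example scheduler fix.",
            "HDFS-2. Example namenode fix.");
        List<String> branch26 = Arrays.asList(
            "HDFS-2. Example namenode fix. (cherry picked)");
        System.out.println(missingFromOlder(branch27, branch26));
    }
}
```

A key-based diff like this only finds candidates; each one still needs the manual triage described in the message.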

I have collected these issues in a spreadsheet and have given them an
initial triage on whether they are candidates for a backport to 2.6.5. The
spreadsheet is sorted by the status of the issues with the potential
backport candidates at the top. Here is a link to the spreadsheet:
https://docs.google.com/spreadsheets/d/1lfG2CYQ7W4q3olWpOCo6EBAey1WYC8hTRUemHvYPPzY/edit?usp=sharing

As of now, I have identified 16 potential backport candidates. Please take
a look at the list and let me know if there are any that you think should
not be on the list, or ones that you think I have missed. This was just an
initial high-level triage, so there could definitely be issues that are
mislabeled.

As a side note: we still need to look at the pre-commit build for 2.6 and
follow up with an addendum for HADOOP-12800.

Thanks everyone!
Chris Trezzo


  

Re: [VOTE] Release Apache Hadoop 2.7.3 RC0

2016-08-05 Thread Jason Lowe
Both sound like real problems to me, and I think it's appropriate to file JIRAs 
to track them.
Jason


  From: Andrew Wang 
 To: Karthik Kambatla  
Cc: larry mccay ; Vinod Kumar Vavilapalli 
; "common-...@hadoop.apache.org" 
; "hdfs-...@hadoop.apache.org" 
; "yarn-...@hadoop.apache.org" 
; "mapreduce-dev@hadoop.apache.org" 

 Sent: Thursday, August 4, 2016 5:56 PM
 Subject: Re: [VOTE] Release Apache Hadoop 2.7.3 RC0
   
Could a YARN person please comment on these two issues, one of which Vinay
also hit? If someone already triaged or filed JIRAs, I missed it.

On Mon, Jul 25, 2016 at 11:52 AM, Andrew Wang 
wrote:

> I'll also add that, as a YARN newbie, I did hit two usability issues.
> These are very unlikely to be regressions, and I can file JIRAs if they
> seem fixable.
>
> * I didn't have SSH to localhost set up (new laptop), and when I tried to
> run the Pi job, it'd exit my window manager session. I feel there must be a
> more developer-friendly solution here.
> * If you start the NodeManager and not the RM, the NM has a handler for
> SIGTERM and SIGINT that blocked my Ctrl-C and kill attempts during startup.
> I had to kill -9 it.
>
> On Mon, Jul 25, 2016 at 11:44 AM, Andrew Wang 
> wrote:
>
>> I got asked this off-list, so as a reminder, only PMC votes are binding
>> on releases. Everyone is encouraged to vote on releases though!
>>
>> +1 (binding)
>>
>> * Downloaded source, built
>> * Started up HDFS and YARN
>> * Ran Pi job which as usual returned 4, and a little teragen
>>
>> On Mon, Jul 25, 2016 at 11:08 AM, Karthik Kambatla 
>> wrote:
>>
>>> +1 (binding)
>>>
>>> * Downloaded and built from source
>>> * Checked LICENSE and NOTICE
>>> * Pseudo-distributed cluster with FairScheduler
>>> * Ran MR and HDFS tests
>>> * Verified basic UI
>>>
>>> On Sun, Jul 24, 2016 at 1:07 PM, larry mccay  wrote:
>>>
>>> > +1 binding
>>> >
>>> > * downloaded and built from source
>>> > * checked LICENSE and NOTICE files
>>> > * verified signatures
>>> > * ran standalone tests
>>> > * installed pseudo-distributed instance on my mac
>>> > * ran through HDFS and mapreduce tests
>>> > * tested credential command
>>> > * tested webhdfs access through Apache Knox
>>> >
>>> >
>>> > On Fri, Jul 22, 2016 at 10:15 PM, Vinod Kumar Vavilapalli <
>>> > vino...@apache.org> wrote:
>>> >
>>> > > Hi all,
>>> > >
>>> > > I've created a release candidate RC0 for Apache Hadoop 2.7.3.
>>> > >
>>> > > As discussed before, this is the next maintenance release to follow
>>> up
>>> > > 2.7.2.
>>> > >
>>> > > The RC is available for validation at:
>>> > > http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/ <
>>> > > http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/>
>>> > >
>>> > > The RC tag in git is: release-2.7.3-RC0
>>> > >
>>> > > The maven artifacts are available via repository.apache.org <
>>> > > http://repository.apache.org/> at
>>> > > https://repository.apache.org/content/repositories/
>>> orgapachehadoop-1040/
>>> > <
>>> > > https://repository.apache.org/content/repositories/
>>> orgapachehadoop-1040/
>>> > >
>>> > >
>>> > > The release-notes are inside the tar-balls at location
>>> > > hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html.
>>> I
>>> > > hosted this at
>>> > > http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/releasenotes.html <
>>> > > http://people.apache.org/~vinodkv/hadoop-2.7.2-RC1/releasenotes.html
>>> >
>>> > for
>>> > > your quick perusal.
>>> > >
>>> > > As you may have noted, a very long fix-cycle for the License & Notice
>>> > > issues (HADOOP-12893) caused 2.7.3 (along with every other Hadoop
>>> > release)
>>> > > to slip by quite a bit. This release's related discussion thread is
>>> > linked
>>> > > below: [1].
>>> > >
>>> > > Please try the release and vote; the vote will run for the usual 5
>>> days.
>>> > >
>>> > > Thanks,
>>> > > Vinod
>>> > >
>>> > > [1]: 2.7.3 release plan:
>>> > > https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/
>>> msg24439.html
>>> > <
>>> > > http://markmail.org/thread/6yv2fyrs4jlepmmr>
>>> >
>>>
>>
>>
>


   

Re: [VOTE] Release Apache Hadoop 2.7.3 RC0

2016-07-25 Thread Jason Lowe
+1 (binding)
- Verified signatures and digests
- Built from source with native support
- Deployed a pseudo-distributed cluster
- Ran some sample jobs

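
As an illustration of the digest half of "verified signatures and digests", here is a minimal sketch that hashes some bytes and compares the result with a published hex digest. The thread does not say which digest algorithms were published for this RC, so SHA-256 is an assumption, and the class name is hypothetical. Checking the GPG signature is a separate step that this sketch does not cover.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical helper; SHA-256 is an assumed choice of algorithm.
class DigestCheck {

    // Returns the lower-case hex SHA-256 digest of the given bytes.
    static String sha256Hex(byte[] data) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException("SHA-256 not available", e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(data)) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    // Compares against a published digest, ignoring whitespace and letter case.
    static boolean matches(byte[] data, String publishedHex) {
        String normalized = publishedHex.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
        return sha256Hex(data).equals(normalized);
    }

    public static void main(String[] args) {
        byte[] sample = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(sha256Hex(sample));
    }
}
```

For a real tarball, the bytes would come from the downloaded file rather than from a string literal.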
Jason

  From: Vinod Kumar Vavilapalli 
 To: "common-...@hadoop.apache.org" ; 
hdfs-...@hadoop.apache.org; yarn-...@hadoop.apache.org; 
"mapreduce-dev@hadoop.apache.org"  
Cc: Vinod Kumar Vavilapalli 
 Sent: Friday, July 22, 2016 9:15 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.3 RC0
   
Hi all,

I've created a release candidate RC0 for Apache Hadoop 2.7.3.

As discussed before, this is the next maintenance release to follow up 2.7.2.

The RC is available for validation at: 
http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/ 


The RC tag in git is: release-2.7.3-RC0

The maven artifacts are available via repository.apache.org at
https://repository.apache.org/content/repositories/orgapachehadoop-1040/


The release-notes are inside the tar-balls at location 
hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted 
this at http://home.apache.org/~vinodkv/hadoop-2.7.3-RC0/releasenotes.html 
for your quick perusal.

As you may have noted, a very long fix-cycle for the License & Notice issues 
(HADOOP-12893) caused 2.7.3 (along with every other Hadoop release) to slip by 
quite a bit. This release's related discussion thread is linked below: [1].

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1]: 2.7.3 release plan: 
https://www.mail-archive.com/hdfs-dev%40hadoop.apache.org/msg24439.html 


   

[jira] [Resolved] (MAPREDUCE-3294) Log the reason for killing a task during speculative execution

2016-06-20 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-3294.
---
Resolution: Duplicate

This was fixed by MAPREDUCE-5692.

> Log the reason for killing a task during speculative execution
> --
>
> Key: MAPREDUCE-3294
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3294
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Ramya Sunil
>
> The reason for killing a speculated task has to be logged. Currently, a 
> speculated task is killed with a note of "Container killed by the 
> ApplicationMaster. Container killed on request. Exit code is 137" which is 
> not very useful. Better logging of this message stating the task was killed 
> due to completion of its speculative task would be useful.
> Also, this message is lost once the app is moved to history. All we are left 
> with is a list of killed tasks without a reason being notified to the user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



[jira] [Created] (MAPREDUCE-6697) Concurrent task limits should only be applied when necessary

2016-05-16 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6697:
-

 Summary: Concurrent task limits should only be applied when 
necessary
 Key: MAPREDUCE-6697
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6697
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.7.0
Reporter: Jason Lowe


The concurrent task limit feature should only adjust the ANY portion of the AM 
heartbeat ask when a limit is truly necessary, otherwise extraneous containers 
could be allocated by the RM to the AM adding some overhead to both.  
Specifying a concurrent task limit that is beyond the total number of tasks in 
the job should be the same as asking for no limit.
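
The intended behaviour can be sketched as follows. This is a simplified model, not the MapReduce AM code: the class and method names are hypothetical, and treating a non-positive value as "no limit" is an assumption of the sketch. A limit that is at least the total task count can never constrain the job, so the ask is left untouched in that case.

```java
// Hypothetical model of the behaviour described above; not the AM's code.
class ConcurrentTaskLimit {

    // True only when the configured limit can actually constrain the job.
    // Assumption: a non-positive limit means "no limit".
    static boolean limitApplies(int configuredLimit, int totalTasks) {
        return configuredLimit > 0 && configuredLimit < totalTasks;
    }

    // Number of ANY containers to request in the next heartbeat.
    static int anyAsk(int configuredLimit, int totalTasks,
                      int pendingTasks, int runningTasks) {
        if (!limitApplies(configuredLimit, totalTasks)) {
            // No effective limit: leave the ask as it would be without the feature.
            return pendingTasks;
        }
        // Otherwise only ask for as many containers as the limit still allows.
        int headroom = Math.max(0, configuredLimit - runningTasks);
        return Math.min(pendingTasks, headroom);
    }

    public static void main(String[] args) {
        // 10 tasks in total, 6 pending, 2 running.
        System.out.println(anyAsk(0, 10, 6, 2));
        System.out.println(anyAsk(100, 10, 6, 2));
        System.out.println(anyAsk(4, 10, 6, 2));
    }
}
```

With this shape, a limit larger than the job and no limit at all follow the same code path, which is the equivalence the description asks for.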






[jira] [Resolved] (MAPREDUCE-4758) jobhistory web ui not showing correct # failed reducers

2016-05-12 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-4758.
---
Resolution: Duplicate

This is a duplicate of MAPREDUCE-5982 which was fixed in 2.7.2 and 2.6.4.

> jobhistory web ui not showing correct # failed reducers
> ---
>
> Key: MAPREDUCE-4758
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4758
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, webapps
>Affects Versions: 0.23.4
>Reporter: Thomas Graves
>
> We had a job fail due to a reducer failing 4 times.  Unfortunately the job 
> history UI didn't show this particular failed reducer, which led to 
> confusion as to why the job failed. 
> This reducer failed to launch all 4 task attempts with a Token Expiration 
> error and the jobhistory file only gets an event when the task attempt 
> transitions to launched.  The webapp JobInfo object only counts the task 
> attempts in the jobhistory file to display under the "Attempt Type" table, so 
> since this task didn't have an attempt with it, it did not show up on the UI.
> We need to reconcile the task list with the task attempts or also show more 
> stats for the tasks vs task attempts.
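
The gap between the two ways of counting can be shown with a small model. This is a hedged sketch only: the `Task` type and both counting methods are hypothetical and do not mirror the real JobInfo class. A task whose attempts all failed before launch has no recorded attempts, so an attempt-based count misses it while a task-based count does not.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model; the real JobInfo/jobhistory classes are not reproduced here.
class FailedTaskCount {

    static final class Task {
        final boolean failed;              // final state of the task itself
        final int recordedFailedAttempts;  // failed attempts that reached "launched"

        Task(boolean failed, int recordedFailedAttempts) {
            this.failed = failed;
            this.recordedFailedAttempts = recordedFailedAttempts;
        }
    }

    // Counts tasks with at least one recorded failed attempt.
    static int failedByAttempts(List<Task> tasks) {
        int count = 0;
        for (Task t : tasks) {
            if (t.recordedFailedAttempts > 0) {
                count++;
            }
        }
        return count;
    }

    // Counts tasks by their own final state, independent of recorded attempts.
    static int failedByTaskState(List<Task> tasks) {
        int count = 0;
        for (Task t : tasks) {
            if (t.failed) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Task> reducers = Arrays.asList(
            new Task(false, 0),  // succeeded
            new Task(true, 0));  // failed 4 times, but never launched
        System.out.println(failedByAttempts(reducers) + " vs " + failedByTaskState(reducers));
    }
}
```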






Re: [VOTE] Release Apache Hadoop 2.6.4 RC0

2016-02-08 Thread Jason Lowe
+1 (binding)
- verified signatures and digests
- built native from source
- deployed a single-node cluster and ran some sample MapReduce jobs.

Jason


  From: Junping Du 
 To: "hdfs-...@hadoop.apache.org" ; 
"yarn-...@hadoop.apache.org" ; 
"mapreduce-dev@hadoop.apache.org" ; 
"common-...@hadoop.apache.org"  
 Sent: Wednesday, February 3, 2016 1:01 AM
 Subject: [VOTE] Release Apache Hadoop 2.6.4 RC0
   
Hi community folks,
  I've created a release candidate RC0 for Apache Hadoop 2.6.4 (the next 
maintenance release to follow up 2.6.3) according to the email thread on the 
2.6.4 release plan [1]. Below are the details of this release candidate:

The RC is available for validation at:
http://people.apache.org/~junping_du/hadoop-2.6.4-RC0/

The RC tag in git is: release-2.6.4-RC0

The maven artifacts are staged via repository.apache.org at:
https://repository.apache.org/content/repositories/orgapachehadoop-1028/?

You can find my public key at:
http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS

Please try the release and vote. The vote will run for the usual 5 days.

Thanks!


Cheers,

Junping


[1]: 2.6.4 release plan: http://markmail.org/message/fk3ud3c665lscvx5?


  

[jira] [Created] (MAPREDUCE-6625) TestCLI#testGetJob fails occasionally

2016-02-02 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6625:
-

 Summary: TestCLI#testGetJob fails occasionally
 Key: MAPREDUCE-6625
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6625
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Reporter: Jason Lowe


Lately TestCLI has been failing sometimes in precommit builds:
{noformat}
Running org.apache.hadoop.mapreduce.tools.TestCLI
Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.883 sec <<< FAILURE! - in org.apache.hadoop.mapreduce.tools.TestCLI
testGetJob(org.apache.hadoop.mapreduce.tools.TestCLI)  Time elapsed: 0.037 sec  <<< FAILURE!
java.lang.AssertionError: null
    at org.junit.Assert.fail(Assert.java:86)
    at org.junit.Assert.assertTrue(Assert.java:41)
    at org.junit.Assert.assertTrue(Assert.java:52)
    at org.apache.hadoop.mapreduce.tools.TestCLI.testGetJob(TestCLI.java:175)
{noformat}






[jira] [Resolved] (MAPREDUCE-6623) TestRMNMInfo and TestNetworkedJob fails in trunk

2016-02-01 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6623.
---
Resolution: Duplicate

Resolving as a duplicate per the previous comment.

> TestRMNMInfo and TestNetworkedJob fails in trunk
> 
>
> Key: MAPREDUCE-6623
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6623
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Eric Badger
>
> TestRMNMInfo:
> {code}
> Running org.apache.hadoop.mapreduce.v2.TestRMNMInfo
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 32.347 sec <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestRMNMInfo
> testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo)  Time elapsed: 1.572 sec  <<< FAILURE!
> java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111)
> {code}
> TestNetworkedJob
> {code}
> testNetworkedJob:174 expected:<[[Thu Jan 28 22:41:20 + 2016] Application 
> is Activated, waiting for resources to be assigned for AM.  Details : AM 
> Partition =  ; Partition Resource =  vCores:16> ; Queue's Absolute capacity = 100.0 % ; Queue's Absolute used 
> capacity = 0.0 % ; Queue's Absolute max capacity = 100.0 % ; ]> but was:<[]>
>   TestRMNMInfo.testRMNMInfo:111 Unexpected number of live nodes: expected:<4> 
> but was:<0>
> {code}
> JDK version: JDK v1.8.0_66





Re: [VOTE] Release Apache Hadoop 2.7.2 RC2

2016-01-19 Thread Jason Lowe
That's reasonable, especially if we don't take nearly as long for 2.7.3.  Note 
that there are almost 50 JIRAs already committed to 2.7.3, so hopefully we'll 
have a plan for that soon.
+1 (binding) for 2.7.2 RC2.
Jason


  From: Vinod Kumar Vavilapalli 
 To: mapreduce-dev@hadoop.apache.org; Jason Lowe  
Cc: Hadoop Common ; "hdfs-...@hadoop.apache.org" 
; "yarn-...@hadoop.apache.org" 

 Sent: Tuesday, January 19, 2016 5:25 PM
 Subject: Re: [VOTE] Release Apache Hadoop 2.7.2 RC2
   
The JIRA YARN-4610 links YARN-3434 as the one causing the breakage, and 
YARN-3434 already exists in 2.7.1 itself. That categorizes the new issue as an 
existing bug.
If you agree with that sentiment, and given that there is a clear work-around, 
in the interest of progress of 2.7.2 (we have spent > 2 months on this now), 
I’d like to move forward.
Please LMK what you think.
Thanks
+Vinod


On Jan 19, 2016, at 3:13 PM, Jason Lowe  wrote:
-1 (binding)
We have been running a release derived from 2.7 on some of our clusters, and we 
recently hit a bug where an application making large container requests can 
drastically slow down container allocations for other users in the same queue.  
See YARN-4610 for details.  Since 
yarn.scheduler.capacity.reservations-continue-look-all-nodes is on by default, 
I think we should fix this.  If we decide to ship 2.7.2 without that fix then 
the release notes should call out that JIRA and mention the workaround of 
setting yarn.scheduler.capacity.reservations-continue-look-all-nodes to false.
Jason


  From: Vinod Kumar Vavilapalli 
 To: Hadoop Common ; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org 
 Sent: Thursday, January 14, 2016 10:57 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.2 RC2

Hi all,

I've created an updated release candidate RC2 for Apache Hadoop 2.7.2.

As discussed before, this is the next maintenance release to follow up 2.7.1.

The RC is available for validation at: 
http://people.apache.org/~vinodkv/hadoop-2.7.2-RC2/

The RC tag in git is: release-2.7.2-RC2

The maven artifacts are available via repository.apache.org at
https://repository.apache.org/content/repositories/orgapachehadoop-1027

The release-notes are inside the tar-balls at location 
hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted 
this at http://people.apache.org/~vinodkv/hadoop-2.7.2-RC2/releasenotes.html 
for your quick perusal.

As you may have noted,
 - I terminated the RC1 related voting thread after finding out that we didn’t 
have a bunch of patches that are already in the released 2.6.3 version. After a 
brief discussion, we decided to keep the parallel 2.6.x and 2.7.x releases 
incremental, see [4] for this discussion.
 - The RC0 related voting thread got halted due to some critical issues. It 
took a while again for getting all those blockers out of the way. See the 
previous voting thread [3] for details.
 - Before RC0, an unusually long 2.6.3 release caused 2.7.2 to slip by quite a 
bit. This release's related discussion threads are linked below: [1] and [2].

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1]: 2.7.2 release plan: http://markmail.org/message/oozq3gvd4nhzsaes
[2]: Planning Apache Hadoop 2.7.2 http://markmail.org/message/iktqss2qdeykgpqk
[3]: [VOTE] Release Apache Hadoop 2.7.2 RC0: 
http://markmail.org/message/5txhvr2qdiqglrwc
[4] Retracted [VOTE] Release Apache Hadoop 2.7.2 RC1: 
http://markmail.org/thread/n7ljbsnquihn3wlw





  

Re: [VOTE] Release Apache Hadoop 2.7.2 RC2

2016-01-19 Thread Jason Lowe
-1 (binding)
We have been running a release derived from 2.7 on some of our clusters, and we 
recently hit a bug where an application making large container requests can 
drastically slow down container allocations for other users in the same queue.  
See YARN-4610 for details.  Since 
yarn.scheduler.capacity.reservations-continue-look-all-nodes is on by default, 
I think we should fix this.  If we decide to ship 2.7.2 without that fix then 
the release notes should call out that JIRA and mention the workaround of 
setting yarn.scheduler.capacity.reservations-continue-look-all-nodes to false.
Jason


  From: Vinod Kumar Vavilapalli 
 To: Hadoop Common ; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org 
 Sent: Thursday, January 14, 2016 10:57 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.2 RC2
   
Hi all,

I've created an updated release candidate RC2 for Apache Hadoop 2.7.2.

As discussed before, this is the next maintenance release to follow up 2.7.1.

The RC is available for validation at: 
http://people.apache.org/~vinodkv/hadoop-2.7.2-RC2/

The RC tag in git is: release-2.7.2-RC2

The maven artifacts are available via repository.apache.org at
https://repository.apache.org/content/repositories/orgapachehadoop-1027


The release-notes are inside the tar-balls at location 
hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted 
this at http://people.apache.org/~vinodkv/hadoop-2.7.2-RC2/releasenotes.html 
for your quick perusal.

As you may have noted,
 - I terminated the RC1 related voting thread after finding out that we didn’t 
have a bunch of patches that are already in the released 2.6.3 version. After a 
brief discussion, we decided to keep the parallel 2.6.x and 2.7.x releases 
incremental, see [4] for this discussion.
 - The RC0 related voting thread got halted due to some critical issues. It 
took a while again for getting all those blockers out of the way. See the 
previous voting thread [3] for details.
 - Before RC0, an unusually long 2.6.3 release caused 2.7.2 to slip by quite a 
bit. This release's related discussion threads are linked below: [1] and [2].

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1]: 2.7.2 release plan: http://markmail.org/message/oozq3gvd4nhzsaes 

[2]: Planning Apache Hadoop 2.7.2 http://markmail.org/message/iktqss2qdeykgpqk 

[3]: [VOTE] Release Apache Hadoop 2.7.2 RC0: 
http://markmail.org/message/5txhvr2qdiqglrwc 

[4] Retracted [VOTE] Release Apache Hadoop 2.7.2 RC1: 
http://markmail.org/thread/n7ljbsnquihn3wlw

  

[jira] [Created] (MAPREDUCE-6599) ResourceManager crash due to scheduling opportunity overflow

2016-01-05 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6599:
-

 Summary: ResourceManager crash due to scheduling opportunity 
overflow
 Key: MAPREDUCE-6599
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6599
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.1
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Critical


If a resource request lingers long enough unsatisfied then the scheduling 
opportunities count for the request can overflow and cause an RM crash.
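
The overflow itself is easy to reproduce in isolation. The sketch below assumes the count is held in a 32-bit `int` that is incremented once per missed opportunity; the message does not show the scheduler code, so the class and method names are hypothetical, and the saturating increment is only one possible guard, not necessarily the fix that was committed.

```java
// Hypothetical illustration of the overflow; not the scheduler's code.
class SchedulingOpportunityOverflow {

    // Plain increment: wraps around to Integer.MIN_VALUE at the top of the range.
    static int unsafeIncrement(int opportunities) {
        return opportunities + 1;
    }

    // Saturating increment: stays at Integer.MAX_VALUE instead of wrapping.
    static int saturatingIncrement(int opportunities) {
        return opportunities == Integer.MAX_VALUE
            ? Integer.MAX_VALUE
            : opportunities + 1;
    }

    public static void main(String[] args) {
        System.out.println(unsafeIncrement(Integer.MAX_VALUE));
        System.out.println(saturatingIncrement(Integer.MAX_VALUE));
    }
}
```

A counter that silently turns negative is a plausible way for later comparisons or arithmetic to fail, which is why a request that lingers long enough matters.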





Re: [VOTE] Release Apache Hadoop 2.7.2 RC1

2015-12-18 Thread Jason Lowe
+1 (binding)
- Verified signatures and digests
- Spot checked CHANGES.txt files
- Successfully performed a native build from source
- Deployed to a single node cluster and ran sample jobs

We have been running with the fix for YARN-4354 on two of our clusters for some 
time with no issues, so I feel confident that the prior blocker is now fixed.
Jason
 

  From: Vinod Kumar Vavilapalli 
 To: Hadoop Common ; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org 
Cc: Vinod Kumar Vavilapalli 
 Sent: Wednesday, December 16, 2015 8:49 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.2 RC1
   
Hi all,

I've created a release candidate RC1 for Apache Hadoop 2.7.2.

As discussed before, this is the next maintenance release to follow up 2.7.1.

The RC is available for validation at: 
http://people.apache.org/~vinodkv/hadoop-2.7.2-RC1/ 


The RC tag in git is: release-2.7.2-RC1

The maven artifacts are available via repository.apache.org at
https://repository.apache.org/content/repositories/orgapachehadoop-1026/


The release-notes are inside the tar-balls at location 
hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html. I hosted 
this at http://people.apache.org/~vinodkv/hadoop-2.7.2-RC1/releasenotes.html 
for quick perusal.

As you may have noted,
 - The RC0 related voting thread got halted due to some critical issues. It 
took a while again for getting all those blockers out of the way. See the 
previous voting thread [3] for details.
 - Before RC0, an unusually long 2.6.3 release caused 2.7.2 to slip by quite a 
bit. This release's related discussion threads are linked below: [1] and [2].

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

[1]: 2.7.2 release plan: http://markmail.org/message/oozq3gvd4nhzsaes 

[2]: Planning Apache Hadoop 2.7.2 http://markmail.org/message/iktqss2qdeykgpqk 

[3]: [VOTE] Release Apache Hadoop 2.7.2 RC0: 
http://markmail.org/message/5txhvr2qdiqglrwc


   

Re: [VOTE] Release Apache Hadoop 2.6.3 RC0

2015-12-16 Thread Jason Lowe
+1 (binding)
- Verified signatures and digests
- Successfully built from source with native code support
- Deployed to a single-node cluster and ran some test jobs

Jason

  From: Junping Du 
 To: Hadoop Common ; "hdfs-...@hadoop.apache.org" 
; "mapreduce-dev@hadoop.apache.org" 
; "yarn-...@hadoop.apache.org" 
 
Cc: "junping...@apache.org" 
 Sent: Friday, December 11, 2015 6:16 PM
 Subject: [VOTE] Release Apache Hadoop 2.6.3 RC0
   

Hi all developers in the Hadoop community,
  I've created a release candidate RC0 for Apache Hadoop 2.6.3 (the next 
maintenance release to follow up 2.6.2) according to the email thread on the 
2.6.3 release plan [1]. Sorry for this RC coming a bit late, as several blocker 
issues were still being committed until yesterday. Below are the details:

The RC is available for validation at:
http://people.apache.org/~junping_du/hadoop-2.6.3-RC0/

The RC tag in git is: release-2.6.3-RC0

The maven artifacts are staged via repository.apache.org at:
https://repository.apache.org/content/repositories/orgapachehadoop-1025/?

You can find my public key at:
http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS

Please try the release and vote. The vote will run for the usual 5 days.

Thanks and happy weekend!


Cheers,

Junping


[1]: 2.6.3 release plan: http://markmail.org/thread/nc2jogbgni37vu6y


 

Re: [VOTE] Release Apache Hadoop 2.7.2 RC0

2015-11-13 Thread Jason Lowe
-1 (binding)
Ran into public localization issues and filed YARN-4354. We need that resolved 
before the release is ready.  We will either need a timely fix or may have to 
revert YARN-2902 to unblock the release if my root-cause analysis is correct.  
I'll dig into this more today.

Jason

  From: Vinod Kumar Vavilapalli 
 To: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org 
Cc: vino...@apache.org 
 Sent: Wednesday, November 11, 2015 10:31 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.2 RC0
   
Hi all,


I've created a release candidate RC0 for Apache Hadoop 2.7.2.


As discussed before, this is the next maintenance release to follow up
2.7.1.


The RC is available for validation at:

http://people.apache.org/~vinodkv/hadoop-2.7.2-RC0/


The RC tag in git is: release-2.7.2-RC0


The maven artifacts are available via repository.apache.org at

https://repository.apache.org/content/repositories/orgapachehadoop-1023/


As you may have noted, an unusually long 2.6.3 release caused 2.7.2 to slip
by quite a bit. This release's related discussion threads are linked below:
[1] and [2].


Please try the release and vote; the vote will run for the usual 5 days.


Thanks,

Vinod


[1]: 2.7.2 release plan: http://markmail.org/message/oozq3gvd4nhzsaes

[2]: Planning Apache Hadoop 2.7.2
http://markmail.org/message/iktqss2qdeykgpqk


  

Re: [VOTE] Release Apache Hadoop 2.6.2

2015-10-26 Thread Jason Lowe
+1 (binding)
- Verified signatures and digests
- Performed native build from source
- Deployed a single-node cluster and ran some test jobs

Jason
  From: Sangjin Lee 
 To: "common-...@hadoop.apache.org" ; 
"yarn-...@hadoop.apache.org" ; 
"hdfs-...@hadoop.apache.org" ; 
"mapreduce-dev@hadoop.apache.org"  
Cc: Vinod Kumar Vavilapalli  
 Sent: Thursday, October 22, 2015 4:14 PM
 Subject: [VOTE] Release Apache Hadoop 2.6.2
   
Hi all,

I have created a release candidate (RC0) for Hadoop 2.6.2.

The RC is available at: http://people.apache.org/~sjlee/hadoop-2.6.2-RC0/

The RC tag in git is: release-2.6.2-RC0

The list of JIRAs committed for 2.6.2:
https://issues.apache.org/jira/browse/YARN-4101?jql=project%20in%20(HADOOP%2C%20HDFS%2C%20YARN%2C%20MAPREDUCE)%20AND%20fixVersion%20%3D%202.6.2

The maven artifacts are staged at
https://repository.apache.org/content/repositories/orgapachehadoop-1022/

Please try out the release candidate and vote. The vote will run for 5 days.

Thanks,
Sangjin


   

[jira] [Resolved] (MAPREDUCE-4938) Job submission to unknown queue can leave staging directory behind

2015-10-15 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-4938.
---
Resolution: Duplicate

> Job submission to unknown queue can leave staging directory behind
> --
>
> Key: MAPREDUCE-4938
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4938
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.0.3-alpha, 0.23.5
>    Reporter: Jason Lowe
>
> There is a race where submitting a job to an unknown queue can appear to 
> succeed to the client and then subsequently fail later.  Since there was no 
> AM ever launched, there was nothing left to clean up the staging directory.  
> At that point the client is the only thing that can clean up the staging 
> directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6472) MapReduce AM should have java.io.tmpdir=./tmp to be consistent with tasks

2015-09-08 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6472:
-

 Summary: MapReduce AM should have java.io.tmpdir=./tmp to be 
consistent with tasks
 Key: MAPREDUCE-6472
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6472
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.6.0
Reporter: Jason Lowe


MapReduceChildJVM.getVMCommand ensures that all tasks have 
-Djava.io.tmpdir=./tmp set as part of the task command-line, but this is only 
used for tasks.  The AM itself does not have a corresponding java.io.tmpdir 
setting.  It should also use the same tmpdir setting to avoid cases where the 
AM JVM wants to place files in /tmp by default.
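
One way to picture the requested change is a helper that guarantees the AM's JVM options carry the same `-Djava.io.tmpdir` flag the tasks get. This is a minimal sketch under stated assumptions: the class and method are hypothetical and are not part of MapReduceChildJVM or the AM launch code, and keeping an explicit user-supplied setting is a design choice of the sketch rather than something the report specifies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of the MapReduce code base.
class TmpDirOption {

    static final String TMPDIR_FLAG = "-Djava.io.tmpdir=";

    // Returns a copy of the JVM options that is guaranteed to set java.io.tmpdir.
    // An explicit setting that is already present is kept as it is.
    static List<String> withTmpDir(List<String> jvmOpts, String tmpDir) {
        List<String> result = new ArrayList<>(jvmOpts);
        for (String opt : jvmOpts) {
            if (opt.startsWith(TMPDIR_FLAG)) {
                return result;
            }
        }
        result.add(TMPDIR_FLAG + tmpDir);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(withTmpDir(Arrays.asList("-Xmx1024m"), "./tmp"));
        // The value the current JVM would use when no flag is given:
        System.out.println(System.getProperty("java.io.tmpdir"));
    }
}
```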





Re: [VOTE] Release Apache Hadoop 2.7.1 RC0

2015-07-01 Thread Jason Lowe
+1 (binding)
- Verified signatures and digests
- Successfully performed a native build from source
- Deployed a single-node cluster
- Ran sample MapReduce jobs

Jason
  From: Vinod Kumar Vavilapalli 
 To: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org 
Cc: vino...@apache.org 
 Sent: Monday, June 29, 2015 3:45 AM
 Subject: [VOTE] Release Apache Hadoop 2.7.1 RC0
   
Hi all,

I've created a release candidate RC0 for Apache Hadoop 2.7.1.

As discussed before, this is the next stable release to follow up 2.6.0,
and the first stable one in the 2.7.x line.

The RC is available for validation at:
http://people.apache.org/~vinodkv/hadoop-2.7.1-RC0/

The RC tag in git is: release-2.7.1-RC0

The maven artifacts are available via repository.apache.org at
https://repository.apache.org/content/repositories/orgapachehadoop-1019/

Please try the release and vote; the vote will run for the usual 5 days.

Thanks,
Vinod

PS: It took 2 months instead of the planned [1] 2 weeks in getting this
release out: post-mortem in a separate thread.

[1]: A 2.7.1 release to follow up 2.7.0
http://markmail.org/thread/zwzze6cqqgwq4rmw


   

[jira] [Created] (MAPREDUCE-6413) TestLocalJobSubmission is failing

2015-06-23 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6413:
-

 Summary: TestLocalJobSubmission is failing
 Key: MAPREDUCE-6413
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6413
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.7.1
Reporter: Jason Lowe


TestLocalJobSubmission.testLocalJobLibjarsOption is failing with 
java.net.UnknownHostException: testcluster





[jira] [Created] (MAPREDUCE-6355) 2.5 client cannot communicate with 2.5 job on 2.6 cluster

2015-05-04 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6355:
-

 Summary: 2.5 client cannot communicate with 2.5 job on 2.6 cluster
 Key: MAPREDUCE-6355
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6355
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe


Trying to run a job on a Hadoop 2.6 cluster from a Hadoop 2.5 client submitting 
a job that uses Hadoop 2.5 jars results in a job that succeeds but the client 
cannot communicate with the AM while the job is running.





[jira] [Created] (MAPREDUCE-6324) Uber jobs fail to update AMRM token when it rolls over

2015-04-21 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6324:
-

 Summary: Uber jobs fail to update AMRM token when it rolls over
 Key: MAPREDUCE-6324
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6324
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.6.0
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker


When the RM rolls a new AMRM master key the AMs are supposed to receive a new 
AMRM token on subsequent heartbeats between the time when the new key is rolled 
and when it is activated.  This is not occurring for uber jobs.  If the 
connection to the RM needs to be re-established after the new key is activated 
(e.g.: RM restart or network hiccup) then the uber job AM will be unable to 
reconnect to the RM.
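
The failure mode can be illustrated with a toy model. This is a deliberately simplified sketch: none of the types below exist in Hadoop, key ids stand in for real tokens, and the only rule modeled is the one described above (a new connection is accepted only with a token from the active key).

```java
/** Toy model of AMRM key rollover; none of these types exist in Hadoop. */
final class AmrmRolloverSketch {
    /** RM side: one active key and, during a rollover, a rolled-but-not-yet-activated next key. */
    static final class ToyRm {
        private int activeKeyId = 1;
        private Integer nextKeyId = null;

        void rollNewKey() {
            nextKeyId = activeKeyId + 1;
        }

        void activateNewKey() {
            if (nextKeyId != null) {
                activeKeyId = nextKeyId;
                nextKeyId = null;
            }
        }

        /** Heartbeat response: key id of a fresh token while a rollover is pending, else null. */
        Integer heartbeat() {
            return nextKeyId;
        }

        /** A new connection is accepted only with a token from the active key. */
        boolean acceptsNewConnection(int tokenKeyId) {
            return tokenKeyId == activeKeyId;
        }
    }

    /** AM side: remembers the key id of the token it currently holds. */
    static final class ToyAm {
        int tokenKeyId = 1;
        final boolean appliesTokenUpdates;

        ToyAm(boolean appliesTokenUpdates) {
            this.appliesTokenUpdates = appliesTokenUpdates;
        }

        void onHeartbeat(Integer newTokenKeyId) {
            if (newTokenKeyId != null && appliesTokenUpdates) {
                tokenKeyId = newTokenKeyId;
            }
        }
    }

    /** Runs roll -> heartbeat -> activate -> reconnect and reports whether the reconnect succeeds. */
    static boolean canReconnectAfterRollover(boolean appliesTokenUpdates) {
        ToyRm rm = new ToyRm();
        ToyAm am = new ToyAm(appliesTokenUpdates);
        rm.rollNewKey();
        am.onHeartbeat(rm.heartbeat());
        rm.activateNewKey();
        return rm.acceptsNewConnection(am.tokenKeyId);
    }

    public static void main(String[] args) {
        System.out.println("AM that applies token updates: " + canReconnectAfterRollover(true));
        System.out.println("AM that ignores token updates: " + canReconnectAfterRollover(false));
    }
}
```

In this model the AM that ignores token updates plays the role of the uber job AM: it keeps working on its existing connection but cannot establish a new one once the new key is active.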





Re: [VOTE] Release Apache Hadoop 2.7.0 RC0

2015-04-14 Thread Jason Lowe
+1 (binding)
- Verified signatures and digests
- Built from source with native support
- Deployed to a single-node cluster and ran sample jobs

Jason

  From: Vinod Kumar Vavilapalli 
 To: common-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org 
Cc: vino...@apache.org 
 Sent: Friday, April 10, 2015 6:44 PM
 Subject: [VOTE] Release Apache Hadoop 2.7.0 RC0
   
Hi all,

I've created a release candidate RC0 for Apache Hadoop 2.7.0.

 The RC is available at: http://people.apache.org/~vinodkv/hadoop-2.7.0-RC0/

The RC tag in git is: release-2.7.0-RC0

 The maven artifacts are available via repository.apache.org at
https://repository.apache.org/content/repositories/orgapachehadoop-1017/

As discussed before
 - This release will only work with JDK 1.7 and above
 - I’d like to use this as a starting release for 2.7.x [1], depending on
how it goes, get it stabilized and potentially use a 2.7.1 in a few
weeks as the stable release.

 Please try the release and vote; the vote will run for the usual 5 days.

 Thanks,
 Vinod

 [1]: A 2.7.1 release to follow up 2.7.0
http://markmail.org/thread/zwzze6cqqgwq4rmw

  

[jira] [Created] (MAPREDUCE-6303) Read timeout when retrying a fetch error can be fatal to a reducer

2015-04-01 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6303:
-

 Summary: Read timeout when retrying a fetch error can be fatal to 
a reducer
 Key: MAPREDUCE-6303
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6303
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jason Lowe
Priority: Blocker


If a reducer encounters an error trying to fetch from a node and then encounters a 
read timeout when trying to re-establish the connection, the reducer can 
fail.  The read timeout exception can leak to the top of the Fetcher thread, 
which will cause the reduce task to tear down.  This type of error can repeat 
across reducer attempts, causing jobs to fail due to a single bad node.
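
A minimal sketch of the kind of handling that would avoid this, assuming a single retry: any IOException raised while re-establishing the connection, including a read timeout, is turned into a per-host failure instead of escaping the thread. The names (FetchRetrySketch, Connector, Outcome) are hypothetical and the real Fetcher logic is more involved.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

/** Hypothetical sketch; not the actual Fetcher code. */
final class FetchRetrySketch {
    /** Stand-in for opening a connection to a map output host. */
    interface Connector {
        void connect(String host) throws IOException;
    }

    enum Outcome { CONNECTED, HOST_FAILED }

    /**
     * Tries to connect, retrying once after a failure. An IOException from the
     * retry (SocketTimeoutException is one) is reported as HOST_FAILED rather
     * than being allowed to propagate out of the calling thread.
     */
    static Outcome connectWithRetry(Connector connector, String host) {
        try {
            connector.connect(host);
            return Outcome.CONNECTED;
        } catch (IOException firstFailure) {
            try {
                connector.connect(host); // retry once
                return Outcome.CONNECTED;
            } catch (IOException retryFailure) {
                return Outcome.HOST_FAILED; // penalize the host, keep the reducer alive
            }
        }
    }

    public static void main(String[] args) {
        Connector alwaysTimesOut = host -> {
            throw new SocketTimeoutException("Read timed out");
        };
        System.out.println(connectWithRetry(alwaysTimesOut, "badnode"));
    }
}
```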





[jira] [Created] (MAPREDUCE-6279) AM should explicity exit JVM after all services have stopped

2015-03-18 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6279:
-

 Summary: AM should explicity exit JVM after all services have 
stopped
 Key: MAPREDUCE-6279
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6279
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 2.5.0
Reporter: Jason Lowe


Occasionally the MapReduce AM can "get stuck" trying to shut down.  
MAPREDUCE-6049 and MAPREDUCE-5888 were specific instances that have been fixed, 
but this can also occur with uber jobs if the task code inadvertently leaves 
non-daemon threads lingering.

We should explicitly shut down the JVM after the MapReduce AM has unregistered 
and all services have been stopped.
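
A minimal sketch of the proposal, not the AM's actual shutdown path: stop the services, then always call an exit hook so that a lingering non-daemon thread cannot keep the JVM alive. The names (ExplicitExitSketch, stopThenExit) are hypothetical, and the exit hook is a parameter only so the behavior can be exercised without terminating the JVM.

```java
import java.util.function.IntConsumer;

/** Hypothetical sketch; not the MRAppMaster shutdown code. */
final class ExplicitExitSketch {
    /** Stops services, then always invokes the exit hook, even if stopping throws. */
    static void stopThenExit(Runnable stopServices, IntConsumer exitHook) {
        int status = 0;
        try {
            stopServices.run();
        } catch (RuntimeException e) {
            status = 1; // report the failure through the exit status
        } finally {
            exitHook.accept(status);
        }
    }

    public static void main(String[] args) {
        // Simulates task code that leaves a non-daemon thread behind.
        Thread leaked = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                // nothing to clean up in this sketch
            }
        }, "leaked-worker");
        leaked.start();

        // Without the explicit exit the JVM would wait on "leaked-worker" indefinitely.
        stopThenExit(() -> System.out.println("services stopped"), System::exit);
    }
}
```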





Re: Looking to a Hadoop 3 release

2015-03-05 Thread Jason Lowe
I'm OK with a 3.0.0 release as long as we are minimizing the pain of 
maintaining yet another release line and conscious of the incompatibilities 
going into that release line.
For the former, I would really rather not see a branch-3 cut so soon.  It's yet 
another line onto which to cherry-pick, and I don't see why we need to add this 
overhead at such an early phase.  We should only create branch-3 when there's 
an incompatible change that the community wants and it should _not_ go into the 
next major release (i.e.: it's for Hadoop 4.0).  We can develop 3.0 alphas and 
betas on trunk and release from trunk in the interim.  IMHO we need to stop 
treating trunk as a place to exile patches.

For the latter, I think as a community we need to evaluate the benefits of 
breaking compatibility against the costs of migrating.  Each time we break 
compatibility we create a hurdle for people to jump when they move to the new 
release, and we should make those hurdles worth their time.  For example, 
wire-compatibility has been mentioned as part of this.  Any feature that breaks 
wire compatibility better be absolutely amazing, as it creates a huge hurdle 
for people to jump.
To summarize:
+1 for a community-discussed roadmap of what we're breaking in 
Hadoop 3 and why it's worth it for users
-1 for creating branch-3 now, we can release from trunk until the next 
incompatibility for Hadoop 4 arrives
+1 for baking classpath isolation as opt-in on 2.x and eventually default on in 
3.0

Jason
  From: Andrew Wang 
 To: "hdfs-...@hadoop.apache.org"  
Cc: "common-...@hadoop.apache.org" ; 
"mapreduce-dev@hadoop.apache.org" ; 
"yarn-...@hadoop.apache.org"  
 Sent: Wednesday, March 4, 2015 12:15 PM
 Subject: Re: Looking to a Hadoop 3 release
   
Let's not dismiss this quite so handily.

Sean, Jason, and Stack replied on HADOOP-11656 pointing out that while we
could make classpath isolation opt-in via configuration, what we really
want longer term is to have it on by default (or just always on). Stack in
particular points out the practical difficulties in using an opt-in method
in 2.x from a downstream project perspective. It's not pretty.

The plan that both Sean and Jason propose (which I support) is to have an
opt-in solution in 2.x, bake it there, then turn it on by default
(incompatible) in a new major release. I think this lines up well with my
proposal of some alphas and betas leading up to a GA 3.x. I'm also willing
to help with 2.x release management if that would help with testing this
feature.

Even setting aside classpath isolation, a new major release is still
justified by JDK8. Somehow this is being ignored in the discussion. Allen,
historically the voice of the user in our community, just highlighted it as
a major compatibility issue, and myself and Tucu have also expressed our
very strong concerns about bumping this in a minor release. 2.7's bump is a
unique exception, but this is not something to be cited as precedent or
policy.

Where does this resistance to a new major release stem from? As I've
described from the beginning, this will look basically like a 2.x release,
except for the inclusion of classpath isolation by default and target
version JDK8. I've expressed my desire to maintain API and wire
compatibility, and we can audit the set of incompatible changes in trunk to
ensure this. My proposal for doing alpha and beta releases leading up to GA
also gives downstreams a nice amount of time for testing and validation.

Regards,
Andrew



On Tue, Mar 3, 2015 at 2:32 PM, Arun Murthy  wrote:

> Awesome, looks like we can just do this in a compatible manner - nothing
> else on the list seems like it warrants a (premature) major release.
>
> Thanks Vinod.
>
> Arun
>
> 
> From: Vinod Kumar Vavilapalli 
> Sent: Tuesday, March 03, 2015 2:30 PM
> To: common-...@hadoop.apache.org
> Cc: hdfs-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org;
> yarn-...@hadoop.apache.org
> Subject: Re: Looking to a Hadoop 3 release
>
> I started pitching in more on that JIRA.
>
> To add, I think we can and should strive for doing this in a compatible
> manner, whatever the approach. Marking and calling it incompatible before
> we see proposal/patch seems premature to me. Commented the same on JIRA:
> https://issues.apache.org/jira/browse/HADOOP-11656?focusedCommentId=14345875&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14345875
> .
>
> Thanks
> +Vinod
>
> On Mar 2, 2015, at 8:08 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:
>
> Regarding classpath isolation, based on what I hear from our customers,
> it's still a big problem (even after the MR classloader work). The latest
> Jackson version bump was quite painful for our downstream projects, and the
> HDFS client still leaks a lot of dependencies. Would welcome more
> discussion of this on HADOOP-11656; Steve, Colin, and Haohui have already
> chimed in.
>
>


  

[jira] [Created] (MAPREDUCE-6263) Large jobs can lose history when killed due to brief client timeout

2015-02-18 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6263:
-

 Summary: Large jobs can lose history when killed due to brief 
client timeout
 Key: MAPREDUCE-6263
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6263
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: client
Affects Versions: 2.6.0
Reporter: Jason Lowe


YARNRunner connects to the AM to send the kill job command then waits a 
hardcoded 10 seconds for the job to enter a terminal state.  If the job fails 
to enter a terminal state in that time then YARNRunner will tell YARN to kill 
the application forcefully.  The latter type of kill usually results in no job 
history, since the AM process is killed forcefully.

Ten seconds can be too short for large jobs in a large cluster, as it takes 
time to connect to all the nodemanagers, process the state machine events, and 
copy a large jhist file.  The timeout should be more lenient or configurable.
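
A minimal sketch of a configurable wait, assuming the timeout and poll interval are passed in by the caller (where they would come from is left open here). The names (KillWaitSketch, waitForTerminalState) are hypothetical and this is not the YARNRunner implementation.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch; not the YARNRunner implementation. */
final class KillWaitSketch {
    /**
     * Polls until the job reports a terminal state or the timeout elapses.
     * Returns true if the job finished on its own, false if the caller
     * should fall back to the forceful kill.
     */
    static boolean waitForTerminalState(BooleanSupplier isTerminal,
                                        long timeoutMillis,
                                        long pollIntervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!isTerminal.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out
            }
            try {
                Thread.sleep(pollIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Pretend the job reaches a terminal state 50 ms after the kill request.
        boolean clean = waitForTerminalState(
            () -> System.currentTimeMillis() - start >= 50, 10_000, 10);
        System.out.println(clean ? "job stopped cleanly" : "forcing kill via YARN");
    }
}
```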





[jira] [Created] (MAPREDUCE-6261) NullPointerException if MapOutputBuffer.flush invoked twice

2015-02-13 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6261:
-

 Summary: NullPointerException if MapOutputBuffer.flush invoked 
twice
 Key: MAPREDUCE-6261
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6261
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Affects Versions: 2.5.0
Reporter: Jason Lowe


MapOutputBuffer.flush will throw an NPE if it is invoked twice, since it 
blindly assumes kvbuffer is not null yet sets kvbuffer to null towards the end 
of the method.
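
The pattern and one possible guard can be shown on a toy class. This is a sketch under stated assumptions: ToyBuffer is hypothetical and only mimics the "use kvbuffer, then null it at the end" shape described above; it is not the MapOutputBuffer code or its eventual fix.

```java
/** Hypothetical sketch of the flush-twice pattern; not the MapOutputBuffer code. */
final class FlushTwiceSketch {
    static final class ToyBuffer {
        private byte[] kvbuffer = new byte[16];
        private int flushCount = 0;
        private long bytesReleased = 0;

        void flush() {
            if (kvbuffer == null) {
                return; // already flushed and released; a second call is a no-op
            }
            // Without the guard above, this dereference would throw a
            // NullPointerException on the second call.
            bytesReleased += kvbuffer.length;
            flushCount++;
            kvbuffer = null; // release the buffer towards the end of the method
        }

        int flushCount() {
            return flushCount;
        }

        long bytesReleased() {
            return bytesReleased;
        }
    }

    public static void main(String[] args) {
        ToyBuffer buffer = new ToyBuffer();
        buffer.flush();
        buffer.flush(); // safe because of the null guard
        System.out.println("flushes performed: " + buffer.flushCount());
    }
}
```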





[jira] [Resolved] (MAPREDUCE-5727) History server web page can filter without showing filter keyword

2015-02-11 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-5727.
---
Resolution: Duplicate

This is the same issue as described in YARN-2238, and there's more discussion 
there.

> History server web page can filter without showing filter keyword
> -
>
> Key: MAPREDUCE-5727
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5727
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, webapps
>Affects Versions: 2.3.0
>Reporter: Jason Lowe
>
> I loaded up a job conf page on the history server and used one of the search 
> boxes to narrow the results.  I then navigated to other pages (e.g.: map 
> tasks, logs, etc.) then navigated back to the job conf page using the job 
> configuration link on the left side of the page.  When I arrived it promptly 
> showed me just a few conf entries (the ones I had searched for earlier) but 
> my search term was missing.  At first glance it looked like those were the 
> only entries in the entire job conf, which can be very confusing.  Somehow 
> the search term is being remembered but not replotted when the configuration 
> page is revisited.





[jira] [Resolved] (MAPREDUCE-6249) Streaming task will not untar tgz uploaded with -archives

2015-02-10 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6249.
---
Resolution: Not a Problem

This is something better sent to the [Hadoop User mailing 
list|http://hadoop.apache.org/mailing_lists.html#User] rather than JIRA.

The archive was untarred as requested, but it was untarred into a directory 
(named "test" per the '#test' URI fragment in the archive argument).  An 
archive is always unpacked into a directory specific to that archive, and the 
distributed cache does not support unpacking directly into the task's working 
directory.  If you need files placed in the task working directory then you 
will need to specify them separately (e.g.: via the "-files" directive).
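
To make the naming concrete, here is a small sketch of how the directory name follows from the archive argument. The helper names (ArchiveLinkSketch, linkName, pathInTask) are hypothetical, and the fallback to the file name when no fragment is given is an assumption of the sketch rather than something stated above.

```java
import java.net.URI;

/** Hypothetical helper for illustration; not distributed cache code. */
final class ArchiveLinkSketch {
    /** Directory name for an archive argument: the URI fragment if present. */
    static String linkName(String archiveArg) {
        URI uri = URI.create(archiveArg);
        if (uri.getFragment() != null) {
            return uri.getFragment();
        }
        // Assumption: with no fragment, the archive's file name is used.
        String path = uri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Relative path of an archive member as seen from the task working directory. */
    static String pathInTask(String archiveArg, String member) {
        return linkName(archiveArg) + "/" + member;
    }

    public static void main(String[] args) {
        String arg = "/home/hadoop/tmp/test.tgz#test";
        System.out.println(pathInTask(arg, "test.1.txt"));
        System.out.println(pathInTask(arg, "test.2.txt"));
    }
}
```

For the job in this report, that means the mapper should look for the files under test/ (for example "ls -l test/" with a trailing slash, which follows the symlink) rather than in the working directory itself.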

> Streaming task will not untar tgz uploaded with -archives
> -
>
> Key: MAPREDUCE-6249
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6249
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Affects Versions: 2.5.2
> Environment: hadoop-2.5.2
> hadoop-streaming-2.5.2.jar
>Reporter: Liu Xiao
>
> When writing a Hadoop streaming task, I used -archives to upload a tgz from the 
> local machine to the HDFS task working directory, but it was not untarred as 
> the documentation says. I've searched a lot without any luck.
> Here is the command that starts the streaming task with hadoop-2.5.2:
> hadoop jar /opt/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.5.2.jar \
> -files mapper.sh \
> -archives /home/hadoop/tmp/test.tgz#test \
> -D mapreduce.job.maps=1 \
> -D mapreduce.job.reduces=1 \
> -input "/test/test.txt" \
> -output "/res/" \
> -mapper "sh mapper.sh" \
> -reducer "cat"
> and "mapper.sh":
> cat > /dev/null
> ls -l test
> exit 0
> "test.tgz" contains two files, "test.1.txt" and "test.2.txt":
> echo "abcd" > test.1.txt
> echo "efgh" > test.2.txt
> tar zcvf test.tgz test.1.txt test.2.txt
> The output from the above task:
> lrwxrwxrwx 1 hadoop hadoop 71 Feb  8 23:25 test -> 
> /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/filecache/116/test.tgz
> but the desired output is something like this:
> -rw-r--r-- 1 hadoop hadoop 5 Feb  8 23:25 test.1.txt
> -rw-r--r-- 1 hadoop hadoop 5 Feb  8 23:25 test.2.txt
> So why was test.tgz not untarred automatically as the documentation says, and 
> is there another way to get the "tgz" untarred?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6230) MR AM does not survive RM restart if RM activated a new AMRM secret key

2015-01-27 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6230:
-

 Summary: MR AM does not survive RM restart if RM activated a new 
AMRM secret key
 Key: MAPREDUCE-6230
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6230
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Reporter: Jason Lowe
Assignee: Jason Lowe
Priority: Blocker


A MapReduce AM will fail to reconnect to an RM that has restarted in the 
following scenario:

# MapReduce job launched with AMRM token generated from AMRM secret X
# RM rolls new AMRM secret Y and activates the new key
# RM performs a work-preserving restart
# MapReduce job AM now unable to connect to RM with "Invalid AMRMToken" 
exception





[jira] [Created] (MAPREDUCE-6225) Fix new findbug warnings in hadoop-mapreduce-client-core

2015-01-26 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6225:
-

 Summary: Fix new findbug warnings in hadoop-mapreduce-client-core
 Key: MAPREDUCE-6225
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6225
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Jason Lowe


Recent precommit builds in hadoop-mapreduce-client-core are flagging findbugs 
warnings that appear to be new with the recent findbugs upgrade.  These need to 
be cleaned up.





[jira] [Created] (MAPREDUCE-6219) Reduce memory required for FileInputFormat located status optimization

2015-01-20 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6219:
-

 Summary: Reduce memory required for FileInputFormat located status 
optimization
 Key: MAPREDUCE-6219
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6219
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.1.1-beta
Reporter: Jason Lowe
Priority: Minor


MAPREDUCE-1981 introduced an optimization to drastically reduce the number of 
namenode operations required to compute input splits when processing a 
directory.  However it requires more memory to perform this optimization as it 
retains the full LocatedFileStatus object for all input files while computing 
the splits.  This can lead to odd situations for users where using a directory 
as input can run the job client out of heap space but using directory/* as the 
input spec allows it to run within the original heap space.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MAPREDUCE-6172) TestDbClasses timeouts are too aggressive

2014-11-24 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6172:
-

 Summary: TestDbClasses timeouts are too aggressive
 Key: MAPREDUCE-6172
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6172
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 2.6.0
Reporter: Jason Lowe
Priority: Minor


Some of the TestDbClasses test timeouts are only 1 second, and some of those 
tests perform disk I/O which could easily exceed the test timeout if the disk 
is busy or there's some other hiccup on the system at the time.  We should 
increase these timeouts to something more reasonable (e.g.: 10 or 20 seconds).





Re: [VOTE] Release Apache Hadoop 2.6.0

2014-11-17 Thread Jason Lowe
+1 (binding)
- verified signatures and digests
- verified late-arriving fixes for YARN-2846 and MAPREDUCE-6156 were present
- built from source
- deployed to a single-node cluster
- ran some sample MapReduce jobs

Jason
  From: Arun C Murthy 
 To: "common-...@hadoop.apache.org" ; 
"hdfs-...@hadoop.apache.org" ; 
"yarn-...@hadoop.apache.org" ; 
"mapreduce-dev@hadoop.apache.org"  
 Sent: Thursday, November 13, 2014 5:08 PM
 Subject: [VOTE] Release Apache Hadoop 2.6.0 
   
Folks,

I've created another release candidate (rc1) for hadoop-2.6.0 based on the 
feedback.

The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.6.0-rc1
The RC tag in git is: release-2.6.0-rc1

The maven artifacts are available via repository.apache.org at 
https://repository.apache.org/content/repositories/orgapachehadoop-1013.

Please try the release and vote; the vote will run for the usual 5 days.

thanks,
Arun


-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


   

[jira] [Created] (MAPREDUCE-6162) mapred hsadmin fails on a secure cluster

2014-11-14 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6162:
-

 Summary: mapred hsadmin fails on a secure cluster
 Key: MAPREDUCE-6162
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6162
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.5.0
Reporter: Jason Lowe
Assignee: Jason Lowe


Attempts to use mapred hsadmin fail on a secure cluster for a couple of 
reasons. The HSAdmin client isn't configuring the principal config key for the 
protocol, resulting in a "Failed to specify server's Kerberos principal name" 
error.  The principal can be specified manually on the command-line via 
-Dhadoop.security.service.user.name.key, but then it results in a "Protocol 
interface ... is not known" error because HSAdminServer is not registering an 
appropriately configured policy provider when authorization is enabled.





[jira] [Created] (MAPREDUCE-6161) mapred hsadmin command missing from trunk

2014-11-13 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6161:
-

 Summary: mapred hsadmin command missing from trunk
 Key: MAPREDUCE-6161
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6161
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: scripts
Affects Versions: trunk
Reporter: Jason Lowe


The hsadmin subcommand of the mapred script is no longer present in trunk. It 
is present in branch-2.





Re: [VOTE] Release Apache Hadoop 2.6.0

2014-11-13 Thread Jason Lowe
I just committed 2.6 blockers YARN-2846 and MAPREDUCE-6156 which should also be 
in the 2.6.0 rc1 build.
Jason
  From: Arun C Murthy 
 To: yarn-...@hadoop.apache.org 
Cc: mapreduce-dev@hadoop.apache.org; Ravi Prakash ; 
"hdfs-...@hadoop.apache.org" ; 
"common-...@hadoop.apache.org"  
 Sent: Wednesday, November 12, 2014 10:58 AM
 Subject: Re: [VOTE] Release Apache Hadoop 2.6.0
   
Sounds good. I'll create an rc1. Thanks.

Arun

On Nov 11, 2014, at 2:06 PM, Robert Kanter  wrote:

> Hi Arun,
> 
> We were testing the RC and ran into a problem with the recent fixes that
> were done for POODLE for Tomcat (HADOOP-11217 for KMS and HDFS-7274 for
> HttpFS).  Basically, in disabling SSLv3, we also disabled SSLv2Hello, which
> is required for older clients (e.g. Java 6 with openssl 0.9.8x) so they
> can't connect without it.  Just to be clear, it does not mean SSLv2, which
> is insecure.  This also affects the MR shuffle in HADOOP-11243.
> 
> The fix is super simple, so I think we should reopen these 3 JIRAs and put
> in addendum patches and get them into 2.6.0.
> 
> thanks
> - Robert
> 
> On Tue, Nov 11, 2014 at 1:04 PM, Ravi Prakash  wrote:
> 
>> Hi Arun!
>> We are very close to completion on YARN-1964 (DockerContainerExecutor).
>> I'd also like HDFS-4882 to be checked in. Do you think these issues merit
>> another RC?
>> Thanks
>> Ravi
>> 
>> 
>>    On Tuesday, November 11, 2014 11:57 AM, Steve Loughran <
>> ste...@hortonworks.com> wrote:
>> 
>> 
>> +1 binding
>> 
>> -patched slider pom to build against 2.6.0
>> 
>> -verified build did download, which it did at up to ~8Mbps. Faster than a
>> local build.
>> 
>> -full clean test runs on OS/X & Linux
>> 
>> 
>> Windows 2012:
>> 
>> Same thing. I did have to first build my own set of the windows native
>> binaries, by checking out branch-2.6.0; doing a native build, copying the
>> binaries and then purging the local m2 repository of hadoop artifacts to be
>> confident I was building against. For anyone who wants those native libs
>> they will be up on
>> https://github.com/apache/incubator-slider/tree/develop/bin/windows/ once
>> it syncs with the ASF repos.
>> 
>> afterwards: the tests worked!
>> 
>> 
>> On 11 November 2014 02:52, Arun C Murthy  wrote:
>> 
>>> Folks,
>>> 
>>> I've created a release candidate (rc0) for hadoop-2.6.0 that I would like
>>> to see released.
>>> 
>>> The RC is available at:
>>> http://people.apache.org/~acmurthy/hadoop-2.6.0-rc0
>>> The RC tag in git is: release-2.6.0-rc0
>>> 
>>> The maven artifacts are available via repository.apache.org at
>>> https://repository.apache.org/content/repositories/orgapachehadoop-1012.
>>> 
>>> Please try the release and vote; the vote will run for the usual 5 days.
>>> 
>>> thanks,
>>> Arun
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> 
>> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/hdp/







  

[jira] [Resolved] (MAPREDUCE-6159) No log of JobHistory found in all logs files

2014-11-12 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6159.
---
Resolution: Invalid

The JobHistoryEventHandler is code that runs in the ApplicationMaster rather 
than the job history server.  You'll find those log messages in the AM logs of 
individual jobs which are either aggregated to HDFS (by default) or left on the 
nodes the AMs ran on if log aggregation is disabled.

> No log of JobHistory found in all logs files
> 
>
> Key: MAPREDUCE-6159
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6159
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, mrv2
>Affects Versions: 2.2.0
> Environment: Hadoop-2.2.0
>Reporter: JasonZhu
>
> I intend to dig into the 'mapreduce.jobhistory.intermediate-done-dir' argument, 
> which is defined at `JHAdminConfig:73`, to better understand the 
> history server. This argument is referenced in 
> `JobHistoryEventHandler.moveToDoneNow()`, where the history server moves the job 
> summary file 
> from "$[yarn.app.mapreduce.am.staging-dir]/$[user]/.staging" to 
> "$[mapreduce.jobhistory.intermediate-done-dir]/$[user]". 
> The following code snippet in `moveToDoneNow()` should definitely write some 
> log output, but I can find no sign of it in any of the logs in 
> $HADOOP_LOG_DIR via the command `grep "Copied to done location" *`.
> if (copied)
> LOG.info("Copied to done location: " + toPath);
> else 
> LOG.info("copy failed");
> Is there anything that I missed?





[jira] [Created] (MAPREDUCE-6141) History server leveldb recovery store

2014-10-28 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6141:
-

 Summary: History server leveldb recovery store
 Key: MAPREDUCE-6141
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6141
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Reporter: Jason Lowe
Assignee: Jason Lowe


It would be nice to have a leveldb option to the job history server recovery 
store.  Leveldb would provide some benefits over the existing filesystem store 
such as better support for atomic operations, fewer I/O ops per state update, 
and far fewer total files on the filesystem.





[jira] [Created] (MAPREDUCE-6119) Ability to disable node update processing in MR AM

2014-10-03 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6119:
-

 Summary: Ability to disable node update processing in MR AM
 Key: MAPREDUCE-6119
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6119
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mr-am
Affects Versions: 2.5.0
Reporter: Jason Lowe
Assignee: Jason Lowe








[jira] [Resolved] (MAPREDUCE-6114) TestMRCJCFileInputFormat#testAddInputPath fails in trunk

2014-09-29 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6114.
---
Resolution: Duplicate

Dup of MAPREDUCE-6094.

> TestMRCJCFileInputFormat#testAddInputPath fails in trunk
> 
>
> Key: MAPREDUCE-6114
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6114
> Project: Hadoop Map/Reduce
>  Issue Type: Test
>Reporter: Ted Yu
>Priority: Minor
>
> This can be reproduced locally:
> {code}
> Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.474 sec <<< 
> FAILURE! - in org.apache.hadoop.mapreduce.lib.input.TestMRCJCFileInputFormat
> testAddInputPath(org.apache.hadoop.mapreduce.lib.input.TestMRCJCFileInputFormat)
>   Time elapsed: 0.86 sec  <<< ERROR!
> java.io.IOException: No FileSystem for scheme: s3
>   at 
> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2583)
>   at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2590)
>   at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
>   at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2629)
>   at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2611)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
>   at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
>   at 
> org.apache.hadoop.mapreduce.lib.input.TestMRCJCFileInputFormat.testAddInputPath(TestMRCJCFileInputFormat.java:55)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (MAPREDUCE-6098) org.apache.hadoop.mapreduce.lib.input.TestMRCJCFileInputFormat intermittently failed in trunk

2014-09-19 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-6098.
---
Resolution: Duplicate

This is a duplicate of MAPREDUCE-6094.

> org.apache.hadoop.mapreduce.lib.input.TestMRCJCFileInputFormat intermittently failed in trunk
> -
>
> Key: MAPREDUCE-6098
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6098
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: test
> Reporter: Wangda Tan
> Fix For: trunk
>
>
> See: 
> https://issues.apache.org/jira/browse/YARN-611?focusedCommentId=14129761&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14129761
>  for details





Re: [VOTE] Release Apache Hadoop 2.5.1 RC0

2014-09-10 Thread Jason Lowe

+1 (binding)

- verified signatures and digests
- built from source
- examined CHANGES.txt for items fixed in 2.5.1
- deployed to a single-node cluster and ran some sample MR jobs

Jason

On 09/05/2014 07:18 PM, Karthik Kambatla wrote:

Hi folks,

I have put together a release candidate (RC0) for Hadoop 2.5.1.

The RC is available at: http://people.apache.org/~kasha/hadoop-2.5.1-RC0/
The RC git tag is release-2.5.1-RC0
The maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1010/

You can find my public key at:
http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS

Please try the release and vote. The vote will run for the now usual 5
days.

Thanks
Karthik





[jira] [Created] (MAPREDUCE-6075) HistoryServerFileSystemStateStore can create zero-length files

2014-09-05 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6075:
-

 Summary: HistoryServerFileSystemStateStore can create zero-length files
 Key: MAPREDUCE-6075
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6075
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.3.0
Reporter: Jason Lowe
Assignee: Jason Lowe


When the history server state store writes a token file, it closes the file 
with IOUtils.cleanup(), which silently ignores errors.  A failed close can 
therefore leave empty token files in the state store.
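
The failure mode can be sketched without Hadoop on the classpath. This is a minimal illustration, not the state store's actual code: closeQuietly() is a hypothetical stand-in for a quiet-close helper such as IOUtils.cleanup(), and FailingOnCloseStream is an invented stream whose final flush fails on close.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStream;

class TokenFileCloseSketch {
    /** Hypothetical stream: writes are buffered, and the flush on close fails. */
    static class FailingOnCloseStream extends OutputStream {
        @Override
        public void write(int b) {
            // pretend the byte was buffered successfully
        }

        @Override
        public void close() throws IOException {
            throw new IOException("flush on close failed");
        }
    }

    /** Stand-in for a quiet-close helper: the IOException from close() is dropped. */
    static void closeQuietly(Closeable c) {
        try {
            if (c != null) {
                c.close();
            }
        } catch (IOException ignored) {
            // the error is lost here, so the caller never learns the write failed
        }
    }

    /** Reports success even when close() failed, because the error was swallowed. */
    static boolean writeWithQuietClose(OutputStream out, byte[] data) {
        try {
            out.write(data);
        } catch (IOException e) {
            return false;
        } finally {
            closeQuietly(out);
        }
        return true;
    }

    /** try-with-resources routes a close() failure into the catch block. */
    static boolean writeWithCheckedClose(OutputStream out, byte[] data) {
        try (OutputStream o = out) {
            o.write(data);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

With the failing stream, writeWithQuietClose() reports success while writeWithCheckedClose() reports failure, which is the difference between leaving an empty file unnoticed and surfacing the error.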





Re: [VOTE] Release Apache Hadoop 2.5.0 RC2

2014-08-10 Thread Jason Lowe

+1 (binding)

- verified signatures and digests
- built from source
- deployed a single-node cluster
- ran some sample jobs

Jason

On 08/06/2014 03:59 PM, Karthik Kambatla wrote:

Hi folks,

I have put together a release candidate (rc2) for Hadoop 2.5.0.

The RC is available at: http://people.apache.org/~kasha/hadoop-2.5.0-RC2/
The RC tag in svn is here:
https://svn.apache.org/repos/asf/hadoop/common/tags/release-2.5.0-rc2/
The maven artifacts are staged at:
https://repository.apache.org/content/repositories/orgapachehadoop-1009/

You can find my public key at:
http://svn.apache.org/repos/asf/hadoop/common/dist/KEYS

Please try the release and vote. The vote will run for the now usual 5
days.

Thanks





Re: [VOTE] Migration from subversion to git for version control

2014-08-10 Thread Jason Lowe

+1

Jason

On 08/08/2014 09:57 PM, Karthik Kambatla wrote:

I have put together this proposal based on recent discussion on this topic.

Please vote on the proposal. The vote runs for 7 days.

1. Migrate from subversion to git for version control.
2. Force-push to be disabled on trunk and branch-* branches. Applying
changes from any of trunk/branch-* to any of branch-* should be through
"git cherry-pick -x".
3. Force-push on feature-branches is allowed. Before pulling in a
feature, the feature-branch should be rebased on latest trunk and the
changes applied to trunk through "git rebase --onto" or "git cherry-pick
".
4. Every time a feature branch is rebased on trunk, a tag that
identifies the state before the rebase needs to be created (e.g.
tag_feature_JIRA-2454_2014-08-07_rebase). These tags can be deleted once
the feature is pulled into trunk and the tags are no longer useful.
5. The relevance/use of tags stays the same after the migration.

Thanks
Karthik

PS: Per Andrew Wang, this should be an "Adoption of New Codebase" kind of
vote and will be Lazy 2/3 majority of PMC members.





[jira] [Created] (MAPREDUCE-6022) map_input_file is missing from streaming job environment

2014-08-01 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6022:
-

 Summary: map_input_file is missing from streaming job environment
 Key: MAPREDUCE-6022
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6022
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Jason Lowe


When running a streaming job, the 'map_input_file' environment variable is not 
being set.  This property is deprecated, but in the past deprecated properties 
still appeared in a streaming job's environment.
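
For context, a rough sketch of how a job property name could map to a streaming environment variable name. It assumes that non-alphanumeric characters are replaced with underscores and that the deprecated property behind 'map_input_file' is map.input.file; the class and method names are hypothetical, not the streaming module's own.

```java
class StreamingEnvNameSketch {
    /** Replaces every character outside [A-Za-z0-9] with an underscore. */
    static String toEnvName(String property) {
        StringBuilder sb = new StringBuilder(property.length());
        for (int i = 0; i < property.length(); i++) {
            char c = property.charAt(i);
            boolean alphanumeric = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            sb.append(alphanumeric ? c : '_');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed property names, used only for illustration.
        System.out.println(toEnvName("map.input.file"));
        System.out.println(toEnvName("mapreduce.map.input.file"));
    }
}
```

Under that assumption, a script that reads only the old variable name would see nothing if just the new-style property is exported.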



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-6021) MR AM should add working directory to LD_LIBRARY_PATH

2014-08-01 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6021:
-

 Summary: MR AM should add working directory to LD_LIBRARY_PATH
 Key: MAPREDUCE-6021
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6021
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mr-am
Affects Versions: 2.4.1
Reporter: Jason Lowe


Tasks implicitly pick up shared libraries added to the job because the task 
launch context explicitly adds the container working directory to 
LD_LIBRARY_PATH.  However, the same is not done for the AM container, which is 
inconsistent.  User code can run in the AM via output committer, speculator, 
uber job, etc., so the AM's LD_LIBRARY_PATH should include the container 
working directory for consistency with tasks.
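
A minimal sketch of the kind of change being asked for, assuming the launch environment is a plain map and ':' is the path separator. The helper name and the literal "$PWD" placeholder for the container working directory are illustrative assumptions, not the MR AM's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LibraryPathEnvSketch {
    /** Appends an entry to a ':'-separated path variable, creating it if absent. */
    static void appendToPath(Map<String, String> env, String name, String entry) {
        String current = env.get(name);
        if (current == null || current.isEmpty()) {
            env.put(name, entry);
        } else {
            env.put(name, current + ":" + entry);
        }
    }

    public static void main(String[] args) {
        Map<String, String> amEnv = new LinkedHashMap<>();
        // "$PWD" stands in for the container working directory, expanded at launch.
        appendToPath(amEnv, "LD_LIBRARY_PATH", "$PWD");
        System.out.println(amEnv);
    }
}
```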





[jira] [Created] (MAPREDUCE-6011) Improve history server behavior during a recovery error

2014-07-28 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6011:
-

 Summary: Improve history server behavior during a recovery error
 Key: MAPREDUCE-6011
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6011
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobhistoryserver
Affects Versions: 2.3.0
Reporter: Jason Lowe


Currently, when the history server encounters an error during recovery, the 
error is fatal and is reported without specific details (e.g. which token was 
involved).  We should either allow the history server to proceed past recovery 
errors or provide more specifics on the offending token involved in the fatal 
error to aid in manual recovery.
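
One way to picture both options together is sketched below, under the assumption that stored tokens can be iterated as id/bytes pairs. TokenLoader and recoverAll() are hypothetical names invented for the example, not history server APIs.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class RecoverySketch {
    /** Hypothetical callback that restores one token and may fail. */
    interface TokenLoader {
        void load(String tokenId, byte[] data) throws IOException;
    }

    /**
     * Tries every stored token, continues past failures, and returns one
     * message per failure naming the token that could not be recovered.
     */
    static List<String> recoverAll(Map<String, byte[]> stored, TokenLoader loader) {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, byte[]> entry : stored.entrySet()) {
            try {
                loader.load(entry.getKey(), entry.getValue());
            } catch (IOException ex) {
                failures.add(entry.getKey() + ": " + ex.getMessage());
            }
        }
        return failures;
    }
}
```

A caller could still treat a non-empty failure list as fatal, but the message would then identify which entries need manual cleanup.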



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MAPREDUCE-6010) HistoryServerFileSystemStateStore fails to update tokens

2014-07-28 Thread Jason Lowe (JIRA)
Jason Lowe created MAPREDUCE-6010:
-

 Summary: HistoryServerFileSystemStateStore fails to update tokens
 Key: MAPREDUCE-6010
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6010
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobhistoryserver
Affects Versions: 2.3.0
Reporter: Jason Lowe


When token recovery is enabled and the file system state store is being used, 
tokens fail to be updated due to a rename destination conflict.
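
The conflict can be reproduced in miniature on a local file system. This sketch uses java.nio.file as a stand-in rather than the Hadoop FileSystem API, and the class and method names are hypothetical: an update written to a temporary file cannot be renamed over an existing token file unless the rename is allowed to replace the destination.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class TokenUpdateSketch {
    /** Write-then-rename without overwrite: fails once the token file exists. */
    static boolean naiveUpdate(Path tokenFile, byte[] data) throws IOException {
        Path tmp = tokenFile.resolveSibling(tokenFile.getFileName() + ".tmp");
        Files.write(tmp, data);
        try {
            Files.move(tmp, tokenFile);
            return true;
        } catch (FileAlreadyExistsException e) {
            // destination conflict: the old token stays in place
            Files.delete(tmp);
            return false;
        }
    }

    /** Same approach, but the rename is permitted to replace the old file. */
    static void replacingUpdate(Path tokenFile, byte[] data) throws IOException {
        Path tmp = tokenFile.resolveSibling(tokenFile.getFileName() + ".tmp");
        Files.write(tmp, data);
        Files.move(tmp, tokenFile, StandardCopyOption.REPLACE_EXISTING);
    }
}
```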



--
This message was sent by Atlassian JIRA
(v6.2#6252)

