[jira] [Updated] (MAPREDUCE-6983) Moving logging APIs over to slf4j in hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jinjiang Ling updated MAPREDUCE-6983:
    Status: Patch Available (was: Open)

> Moving logging APIs over to slf4j in hadoop-mapreduce-client-core
>
> Key: MAPREDUCE-6983
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6983
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Reporter: Jinjiang Ling
> Assignee: Jinjiang Ling
> Attachments: MAPREDUCE-6983.001.patch, MAPREDUCE-6983.002.patch, MAPREDUCE-6983.003.patch, MAPREDUCE-6983.004.patch, MAPREDUCE-6983.005.patch

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6983) Moving logging APIs over to slf4j in hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jinjiang Ling updated MAPREDUCE-6983:
    Attachment: MAPREDUCE-6983.005.patch

> Moving logging APIs over to slf4j in hadoop-mapreduce-client-core
>
> Key: MAPREDUCE-6983
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6983
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Reporter: Jinjiang Ling
> Assignee: Jinjiang Ling
> Attachments: MAPREDUCE-6983.001.patch, MAPREDUCE-6983.002.patch, MAPREDUCE-6983.003.patch, MAPREDUCE-6983.004.patch, MAPREDUCE-6983.005.patch
[jira] [Updated] (MAPREDUCE-6983) Moving logging APIs over to slf4j in hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jinjiang Ling updated MAPREDUCE-6983:
    Status: Open (was: Patch Available)

> Moving logging APIs over to slf4j in hadoop-mapreduce-client-core
>
> Key: MAPREDUCE-6983
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6983
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Reporter: Jinjiang Ling
> Assignee: Jinjiang Ling
> Attachments: MAPREDUCE-6983.001.patch, MAPREDUCE-6983.002.patch, MAPREDUCE-6983.003.patch, MAPREDUCE-6983.004.patch
[jira] [Commented] (MAPREDUCE-6752) Bad logging practices in mapreduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16212073#comment-16212073 ]

Nemo Chen commented on MAPREDUCE-6752:

[~gribeler] Hi, I have no problems with that. Thanks for your effort!

> Bad logging practices in mapreduce
>
> Key: MAPREDUCE-6752
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6752
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client
> Affects Versions: 2.7.2
> Reporter: Nemo Chen
> Labels: easyfix, easytest, newbie
>
> Similar to previous issue HADOOP-3029, in the file:
> hadoop-rel-release-2.7.2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/speculate/DefaultSpeculator.java
> line 253, the whole method is for debugging purpose, therefore the log:
> {code:borderStyle=solid}
> LOG.info("We got asked to run a debug speculation scan.");
> {code}
> should be set to debug level instead of info level to avoid redundant information.
[jira] [Updated] (MAPREDUCE-6985) MapReduce native optimization does not work properly due to a shuffle error (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takanobu Asanuma updated MAPREDUCE-6985: Description: Reported by Shingo Furuyama: We confirmed that MapReduce native optimization (MAPREDUCE-2841) of Hadoop-3.0.0-beta1 does not work properly due to a shuffle error. {noformat} 2017-10-19 11:59:40,513 WARN mapred.Task: Could not find output size org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find output/file.out in any of the configured local directories at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:489) at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:166) at org.apache.hadoop.mapred.MROutputFiles.getOutputFile(MROutputFiles.java:57) at org.apache.hadoop.mapred.Task.calculateOutputSize(Task.java:1248) at org.apache.hadoop.mapred.Task.sendLastUpdate(Task.java:1228) at org.apache.hadoop.mapred.Task.done(Task.java:1174) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:345) at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) {noformat} My build and run environment is below: CentOS: 7.3 cmake: 3.6.3 Maven: 3.5.0 Java: 1.8.0_131 Hadoop: 3.0.0-beta1 mode: LocalJobRunner was: Reported by Shingo Furuyama: We confirmed that MapReduce native optimization (MAPREDUCE-2841) of Hadoop-3.0.0-beta1 does not work properly due to a shuffle error. 
{noformat} 2017-10-19 11:59:40,513 WARN mapred.Task: Could not find output size org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find output/file.out in any of the configured local directories at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:489) at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:166) at org.apache.hadoop.mapred.MROutputFiles.getOutputFile(MROutputFiles.java:57) at org.apache.hadoop.mapred.Task.calculateOutputSize(Task.java:1248) at org.apache.hadoop.mapred.Task.sendLastUpdate(Task.java:1228) at org.apache.hadoop.mapred.Task.done(Task.java:1174) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:345) at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) {noformat} My build and run environment is below: CentOS: 7.3 cmake: 3.6.3 Maven: 3.5.0 Java: 1.8.0_131 Hadoop: 3.0.0-beta1 > MapReduce native optimization does not work properly due to a shuffle error > (LocalJobRunner) > > > Key: MAPREDUCE-6985 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6985 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 3.0.0-beta1 >Reporter: Takanobu Asanuma >Priority: Critical > Attachments: mr_native_optimization_error.log > > > Reported by Shingo Furuyama: > We confirmed that MapReduce native optimization (MAPREDUCE-2841) of > Hadoop-3.0.0-beta1 does not work properly due to a shuffle error. 
> {noformat} > 2017-10-19 11:59:40,513 WARN mapred.Task: Could not find output size > org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find > output/file.out in any of the configured local directories > at > org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:489) > at > org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:166) > at > org.apache.hadoop.mapred.MROutputFiles.getOutputFile(MROutputFiles.java:57) > at org.apache.hadoop.mapred.Task.calculateOutputSize(Task.java:1248) > at org.apache.hadoop.mapred.Task.sendLastUpdate(Task.java:1228) > at org.apache.hadoop.mapred.Task.done(Task.java:1174) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:345) > at > org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(Loc
[jira] [Updated] (MAPREDUCE-6985) MapReduce native optimization does not work properly due to a shuffle error (LocalJobRunner)
[ https://issues.apache.org/jira/browse/MAPREDUCE-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takanobu Asanuma updated MAPREDUCE-6985:
    Summary: MapReduce native optimization does not work properly due to a shuffle error (LocalJobRunner) (was: MapReduce native optimization does not work properly due to a shuffle error)

> MapReduce native optimization does not work properly due to a shuffle error (LocalJobRunner)
>
> Key: MAPREDUCE-6985
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6985
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 3.0.0-beta1
> Reporter: Takanobu Asanuma
> Priority: Critical
> Attachments: mr_native_optimization_error.log
>
> Reported by Shingo Furuyama:
> We confirmed that MapReduce native optimization (MAPREDUCE-2841) of Hadoop-3.0.0-beta1 does not work properly due to a shuffle error.
> {noformat}
> 2017-10-19 11:59:40,513 WARN mapred.Task: Could not find output size
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find output/file.out in any of the configured local directories
>     at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:489)
>     at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:166)
>     at org.apache.hadoop.mapred.MROutputFiles.getOutputFile(MROutputFiles.java:57)
>     at org.apache.hadoop.mapred.Task.calculateOutputSize(Task.java:1248)
>     at org.apache.hadoop.mapred.Task.sendLastUpdate(Task.java:1228)
>     at org.apache.hadoop.mapred.Task.done(Task.java:1174)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:345)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}
> My build and run environment is below:
> CentOS: 7.3
> cmake: 3.6.3
> Maven: 3.5.0
> Java: 1.8.0_131
> Hadoop: 3.0.0-beta1
[jira] [Commented] (MAPREDUCE-5124) AM lacks flow control for task events
[ https://issues.apache.org/jira/browse/MAPREDUCE-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211818#comment-16211818 ]

Jason Lowe commented on MAPREDUCE-5124:

As you point out, it's probably going to be bad for the IPC server thread to block since that will likely cause all server threads to block shortly thereafter, preventing any RPC processing until the blockage clears. That is definitely a throttle, but it could be severe enough to cause task failures if they can't contact the AM in a timely manner.

For the throttling case we could do something similar to what was done in YARN to help mitigate NM heartbeats overwhelming the RM. We could still dispatch status update events to the task attempts for every heartbeat, but instead of attaching the status update directly to the event we could attach the status payload to the TaskAttempt directly. If there's already an unprocessed status event pending on the TaskAttempt we can then coalesce the two status updates into a single status update. Coalescing should be pretty straightforward since the newer status should clobber the older status for most of the payload. Then when the status update event arrives at the TaskAttempt it processes the status update object, if there is one, then clears it. This should largely mitigate the problem since the memory pressure from all these events is primarily from the status payload attached to each event. If the server thread makes sure there is at most one outstanding status payload per task then we have an upper limit on the number of outstanding status payloads the AM has to track.

With this approach I don't think it will be necessary to scan or otherwise track each status update event posted to the dispatcher. They're going to be very small once the status payload is removed, and they'll be quick to process if the corresponding status payload was coalesced into an earlier payload.

If desired we could also combine this idea with the client-side throttle hint in the RPC response, since the server thread will know whether it coalesced a status update or not. If it did then we could tell the client to throttle a bit for the next update. Deferred RPC response could be useful here, but I haven't thought through how tricky it would be to implement in practice.

I agree that switching the AM to be the one that drives heartbeats is not appropriate here. My feeling is it creates as many problems as it solves.

My current recommendation is to try the coalesce status updates approach as was done for NM heartbeats to the RM. That was pretty effective there at mitigating the backlog issues, and I think it could work well here too. As a bonus it makes it trivial to determine when we should tell a client to back off a bit if we choose to do that as well.

> AM lacks flow control for task events
>
> Key: MAPREDUCE-5124
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5124
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am
> Affects Versions: 2.0.3-alpha, 0.23.5
> Reporter: Jason Lowe
> Assignee: Haibo Chen
> Attachments: MAPREDUCE-5124-proto.2.txt, MAPREDUCE-5124-prototype.txt
>
> The AM does not have any flow control to limit the incoming rate of events from tasks. If the AM is unable to keep pace with the rate of incoming events for a sufficient period of time then it will eventually exhaust the heap and crash. MAPREDUCE-5043 addressed a major bottleneck for event processing, but the AM could still get behind if it's starved for CPU and/or handling a very large job with tens of thousands of active tasks.
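The coalescing idea in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from Hadoop or from any attached patch: the class name `PendingStatusSlot` and its methods `offer` and `take` are hypothetical, and the sketch lets the newer status replace the older one entirely, whereas the comment says this holds only "for most of the payload".

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;

/**
 * Hypothetical sketch: one pending-status slot per task attempt.
 * None of these names exist in the Hadoop code base.
 */
final class PendingStatusSlot<S> {
    // Holds at most one unprocessed status payload for this attempt.
    private final AtomicReference<S> pending = new AtomicReference<>();

    /**
     * Called by the RPC server thread on every heartbeat.
     * Returns true if an older, still unprocessed status was replaced,
     * which the server could use as a "please slow down" hint.
     */
    boolean offer(S newStatus) {
        Objects.requireNonNull(newStatus, "newStatus");
        return pending.getAndSet(newStatus) != null;
    }

    /**
     * Called by the dispatcher thread when a status event reaches the attempt.
     * Returns the latest payload and clears the slot, or null if an earlier
     * event already consumed it.
     */
    S take() {
        return pending.getAndSet(null);
    }

    public static void main(String[] args) {
        PendingStatusSlot<String> slot = new PendingStatusSlot<>();
        System.out.println(slot.offer("progress=0.10")); // false
        System.out.println(slot.offer("progress=0.25")); // true
        System.out.println(slot.take());                 // progress=0.25
        System.out.println(slot.take());                 // null
    }
}
```

With one such slot per attempt, the number of outstanding payloads is bounded by the number of attempts, and an event that finds the slot empty has nothing left to do.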
[jira] [Commented] (MAPREDUCE-6987) JHS Log Scanner and Cleaner blocked
[ https://issues.apache.org/jira/browse/MAPREDUCE-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211769#comment-16211769 ]

Jason Lowe commented on MAPREDUCE-6987:

Thanks for the report! Is this a problem in 2.8? If this really was caused by MAPREDUCE-6634 then I'm thinking this affects 2.9.0 instead of 2.8.0.

> JHS Log Scanner and Cleaner blocked
>
> Key: MAPREDUCE-6987
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6987
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobhistoryserver
> Affects Versions: 2.8.0, 3.0.0-alpha1
> Reporter: Bibin A Chundatt
> Assignee: Bibin A Chundatt
> Priority: Critical
>
> {code}
> "Log Scanner/Cleaner #1" #81 prio=5 os_prio=0 tid=0x7fd6c010f000 nid=0x11db waiting on condition [0x7fd6aa859000]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0xd6c88a80> (a java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>     at org.apache.hadoop.util.concurrent.ExecutorHelper.logThrowableFromAfterExecute(ExecutorHelper.java:47)
>     at org.apache.hadoop.util.concurrent.HadoopScheduledThreadPoolExecutor.afterExecute(HadoopScheduledThreadPoolExecutor.java:69)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1150)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> "Log Scanner/Cleaner #0" #80 prio=5 os_prio=0 tid=0x7fd6c010c800 nid=0x11da waiting on condition [0x7fd6aa95a000]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0xd6c8> (a java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>     at org.apache.hadoop.util.concurrent.ExecutorHelper.logThrowableFromAfterExecute(ExecutorHelper.java:47)
>     at org.apache.hadoop.util.concurrent.HadoopScheduledThreadPoolExecutor.afterExecute(HadoopScheduledThreadPoolExecutor.java:69)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1150)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
> Both threads waiting on {{FutureTask.get()}} for infinite time after first execution
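The stack traces above show worker threads parked in {{FutureTask.get()}} called from an {{afterExecute}} hook. A periodic task in a {{ScheduledThreadPoolExecutor}} is reset after each successful run and is therefore not "done" between runs, so an unconditional {{get()}} waits indefinitely. One possible guard, shown here as a sketch and not as the fix adopted for this issue, is to check {{isDone()}} first. The class `NonBlockingAfterExecute` and the helper `runsRepeatedly` are hypothetical names used for illustration.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: an afterExecute hook that never parks the worker. */
final class NonBlockingAfterExecute extends ScheduledThreadPoolExecutor {
    NonBlockingAfterExecute(int corePoolSize) {
        super(corePoolSize);
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        if (t != null || !(r instanceof Future<?>)) {
            return;
        }
        Future<?> future = (Future<?>) r;
        if (!future.isDone()) {
            // Periodic task between runs: get() would wait until cancellation.
            return;
        }
        try {
            future.get(); // returns immediately because the future is done
        } catch (ExecutionException e) {
            System.err.println("Task failed: " + e.getCause());
        } catch (CancellationException e) {
            // Task was cancelled; nothing to report.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns true if a periodic task ran at least `runs` times in time. */
    static boolean runsRepeatedly(int runs, long timeoutMs) {
        NonBlockingAfterExecute pool = new NonBlockingAfterExecute(1);
        CountDownLatch latch = new CountDownLatch(runs);
        pool.scheduleAtFixedRate(latch::countDown, 0, 10, TimeUnit.MILLISECONDS);
        try {
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runsRepeatedly(3, 5000));
    }
}
```

A periodic task that throws still becomes done with an exception, so its failure is still reported by the {{ExecutionException}} branch.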
[jira] [Commented] (MAPREDUCE-6752) Bad logging practices in mapreduce
[ https://issues.apache.org/jira/browse/MAPREDUCE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211535#comment-16211535 ]

Guilherme Gribeler commented on MAPREDUCE-6752:

I also agree that the System.out.println calls could be replaced by LOG.debug calls, as well as the LOG.info call. I can generate a patch for this issue if this solution is ok. Can I do it from the most recent released version (3.0.0-beta1)?

> Bad logging practices in mapreduce
>
> Key: MAPREDUCE-6752
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6752
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client
> Affects Versions: 2.7.2
> Reporter: Nemo Chen
> Labels: easyfix, easytest, newbie
>
> Similar to previous issue HADOOP-3029, in the file:
> hadoop-rel-release-2.7.2/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/speculate/DefaultSpeculator.java
> line 253, the whole method is for debugging purpose, therefore the log:
> {code:borderStyle=solid}
> LOG.info("We got asked to run a debug speculation scan.");
> {code}
> should be set to debug level instead of info level to avoid redundant information.
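To illustrate why moving the message from info to debug level reduces noise, here is a small stand-alone sketch. It deliberately uses the JDK's `java.util.logging` as a stand-in so that it runs without extra dependencies; Hadoop itself uses commons-logging or slf4j, where the corresponding calls are `LOG.debug(...)` and `LOG.info(...)`. The class `DebugLevelDemo` and the helper `wouldLog` are hypothetical, and `Level.FINE` plays the role of "debug".

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in using java.util.logging; FINE plays the role of DEBUG. */
final class DebugLevelDemo {
    private static final Logger LOG = Logger.getLogger(DebugLevelDemo.class.getName());

    /** Returns whether a message at `message` level is emitted when the logger is set to `configured`. */
    static boolean wouldLog(Level configured, Level message) {
        LOG.setLevel(configured);
        return LOG.isLoggable(message);
    }

    public static void main(String[] args) {
        // With the logger at INFO, an info-level message is emitted...
        System.out.println(wouldLog(Level.INFO, Level.INFO)); // true
        // ...while the same message at debug (FINE) level is suppressed.
        System.out.println(wouldLog(Level.INFO, Level.FINE)); // false
        // It only appears once debug output is explicitly enabled.
        System.out.println(wouldLog(Level.FINE, Level.FINE)); // true
    }
}
```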
[jira] [Comment Edited] (MAPREDUCE-5124) AM lacks flow control for task events
[ https://issues.apache.org/jira/browse/MAPREDUCE-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211212#comment-16211212 ]

Peter Bacsko edited comment on MAPREDUCE-5124 at 10/19/17 3:32 PM:

[~jlowe] yesterday I had a discussion with [~miklos.szeg...@cloudera.com] about the possible implementations of this. We came up with different solutions which I'll try to summarize here.

* Throttling: we try to determine whether an event for a particular task update has been processed or not. If not, then we don't try to process the new status update. We can examine the event queue of the AsyncDispatcher and try to find an update event that belongs to the same attempt ID. I can think of two approaches here:
*# Server-side throttling: block inside MRAppMaster until the status update is fully processed. If I'm not mistaken, this completely blocks the current RPC server thread, so too many status updates in parallel might make it impossible to process other RPC calls.
*# Client-side throttling: we return without dispatching {{TaskAttemptStatusUpdateEvent}} to the event queue, but set a field in {{AMFeedBack}}, indicating that the AM is busy. The client checks the result. If the flag is set, it doubles the status update interval, resulting in fewer status update calls to the AM.
* Use the deferred RPC response mechanism implemented in HADOOP-11552. This means that we have to retrieve the callback object from the current RPC calling context and pass it over until the full update logic is executed. This is doable, although one event might create another event and it's not entirely clear when the operation can be considered finished. Getting rid of some asynchronicity can help, although I'm not sure if this kind of change is dangerous or not.
* Let the AM drive the whole status update mechanism as explained by Miklos. This looks too complicated and the change would be too big, at least for this JIRA.

I haven't deeply considered the pros and cons of the proposed solutions. Personally I like the client-side throttling and the deferred RPC callback. If we go for throttling, we also have to think about how we determine when we need to push the client back to send updates less frequently. We can check the size of the current event queue, but Miklos had some convincing arguments against doing it. We can look for already existing TaskAttemptStatusUpdateEvent instances (what I suggested above) but that means iteration, which is more expensive.

I can't see a simple, silver-bullet solution right now.
[jira] [Commented] (MAPREDUCE-5124) AM lacks flow control for task events
[ https://issues.apache.org/jira/browse/MAPREDUCE-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211212#comment-16211212 ]

Peter Bacsko commented on MAPREDUCE-5124:

[~jlowe] yesterday I had a discussion with [~miklos.szeg...@cloudera.com] about the possible implementations of this. We came up with different solutions which I'll try to summarize here.

* Throttling: we try to determine whether an event for a particular task update has been processed or not. If not, then we don't try to process the new status update. We can examine the event queue of the AsyncDispatcher and try to find an update event that belongs to the same attempt ID. I can think of two approaches here:
*# Server-side throttling: block inside MRAppMaster until the status update is fully processed. If I'm not mistaken, this completely blocks the current RPC server thread, so too many status updates in parallel might make it impossible to process other RPC calls.
*# Client-side throttling: we return without dispatching {{TaskAttemptStatusUpdateEvent}} to the event queue, but set a field in {{AMFeedBack}}, indicating that the AM is busy. The client checks the result. If the flag is set, it doubles the status update interval, resulting in fewer status update calls to the AM.
* Use the deferred RPC response mechanism implemented in HADOOP-11552. This means that we have to retrieve the callback object from the current RPC calling context and pass it over until the full update logic is executed. This is doable, although one event might create another event and it's not entirely clear when the operation can be considered finished. Getting rid of some asynchronicity can help, although I'm not sure if this kind of change is dangerous or not.
* Let the AM drive the whole status update mechanism as explained by Miklos. This looks too complicated and the change would be too big, at least for this JIRA.

I haven't deeply considered the pros and cons of the proposed solutions. Personally I like the client-side throttling and the deferred RPC callback. If we go for throttling, we also have to think about how we determine when we need to push the client back to send updates less frequently. We can check the size of the current event queue, but Miklos had some convincing arguments against doing it. We can look for already existing {{TaskAttemptStatusUpdateEvent}}s (what I suggested above) but that means iteration, which is more expensive.

I can't see a simple, silver-bullet solution right now.

> AM lacks flow control for task events
>
> Key: MAPREDUCE-5124
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5124
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am
> Affects Versions: 2.0.3-alpha, 0.23.5
> Reporter: Jason Lowe
> Assignee: Haibo Chen
> Attachments: MAPREDUCE-5124-proto.2.txt, MAPREDUCE-5124-prototype.txt
>
> The AM does not have any flow control to limit the incoming rate of events from tasks. If the AM is unable to keep pace with the rate of incoming events for a sufficient period of time then it will eventually exhaust the heap and crash. MAPREDUCE-5043 addressed a major bottleneck for event processing, but the AM could still get behind if it's starved for CPU and/or handling a very large job with tens of thousands of active tasks.
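The client-side throttling option from the comment above (double the status update interval when the AM reports it is busy) could look roughly like this on the task side. This is a sketch under assumptions: the class `HeartbeatBackoff` and its method `onFeedback` are hypothetical, and both the upper bound on the interval and the reset to the base interval once the AM is no longer busy are additions that the comment does not specify.

```java
/**
 * Hypothetical sketch of client-side throttling: the interval doubles while
 * the AM signals "busy". The cap and the reset policy are assumptions.
 */
final class HeartbeatBackoff {
    private final long baseIntervalMs;
    private final long maxIntervalMs;
    private long currentIntervalMs;

    HeartbeatBackoff(long baseIntervalMs, long maxIntervalMs) {
        if (baseIntervalMs <= 0 || maxIntervalMs < baseIntervalMs) {
            throw new IllegalArgumentException("need 0 < base <= max");
        }
        this.baseIntervalMs = baseIntervalMs;
        this.maxIntervalMs = maxIntervalMs;
        this.currentIntervalMs = baseIntervalMs;
    }

    /**
     * Feed in the busy flag from the AM's response and get back the delay
     * to wait before sending the next status update.
     */
    long onFeedback(boolean amBusy) {
        if (!amBusy) {
            currentIntervalMs = baseIntervalMs;
        } else if (currentIntervalMs > maxIntervalMs / 2) {
            currentIntervalMs = maxIntervalMs; // doubling would exceed the cap
        } else {
            currentIntervalMs *= 2;
        }
        return currentIntervalMs;
    }

    public static void main(String[] args) {
        HeartbeatBackoff backoff = new HeartbeatBackoff(3000, 10000);
        System.out.println(backoff.onFeedback(true));  // 6000
        System.out.println(backoff.onFeedback(true));  // 10000
        System.out.println(backoff.onFeedback(false)); // 3000
    }
}
```

Capping the interval keeps a long busy period from starving the AM of progress reports altogether, which matters because tasks that stay silent for too long risk being treated as failed.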
[jira] [Updated] (MAPREDUCE-6987) JHS Log Scanner and Cleaner blocked
[ https://issues.apache.org/jira/browse/MAPREDUCE-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated MAPREDUCE-6987: Affects Version/s: (was: 3.0.0) 3.0.0-alpha1 > JHS Log Scanner and Cleaner blocked > --- > > Key: MAPREDUCE-6987 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6987 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobhistoryserver >Affects Versions: 2.8.0, 3.0.0-alpha1 >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > > {code} > "Log Scanner/Cleaner #1" #81 prio=5 os_prio=0 tid=0x7fd6c010f000 > nid=0x11db waiting on condition [0x7fd6aa859000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xd6c88a80> (a > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429) > at java.util.concurrent.FutureTask.get(FutureTask.java:191) > at > org.apache.hadoop.util.concurrent.ExecutorHelper.logThrowableFromAfterExecute(ExecutorHelper.java:47) > at > org.apache.hadoop.util.concurrent.HadoopScheduledThreadPoolExecutor.afterExecute(HadoopScheduledThreadPoolExecutor.java:69) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1150) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > "Log Scanner/Cleaner #0" #80 prio=5 os_prio=0 tid=0x7fd6c010c800 > nid=0x11da waiting on condition [0x7fd6aa95a000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xd6c8> (a > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429) > at 
java.util.concurrent.FutureTask.get(FutureTask.java:191) > at > org.apache.hadoop.util.concurrent.ExecutorHelper.logThrowableFromAfterExecute(ExecutorHelper.java:47) > at > org.apache.hadoop.util.concurrent.HadoopScheduledThreadPoolExecutor.afterExecute(HadoopScheduledThreadPoolExecutor.java:69) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1150) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} > Both threads waiting on {{FutureTask.get()}} for infinite time after first > execution -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6987) JHS Log Scanner and Cleaner blocked
[ https://issues.apache.org/jira/browse/MAPREDUCE-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated MAPREDUCE-6987: Affects Version/s: 3.0.0 2.8.0 Description: {code} "Log Scanner/Cleaner #1" #81 prio=5 os_prio=0 tid=0x7fd6c010f000 nid=0x11db waiting on condition [0x7fd6aa859000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xd6c88a80> (a java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429) at java.util.concurrent.FutureTask.get(FutureTask.java:191) at org.apache.hadoop.util.concurrent.ExecutorHelper.logThrowableFromAfterExecute(ExecutorHelper.java:47) at org.apache.hadoop.util.concurrent.HadoopScheduledThreadPoolExecutor.afterExecute(HadoopScheduledThreadPoolExecutor.java:69) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1150) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) "Log Scanner/Cleaner #0" #80 prio=5 os_prio=0 tid=0x7fd6c010c800 nid=0x11da waiting on condition [0x7fd6aa95a000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xd6c8> (a java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429) at java.util.concurrent.FutureTask.get(FutureTask.java:191) at org.apache.hadoop.util.concurrent.ExecutorHelper.logThrowableFromAfterExecute(ExecutorHelper.java:47) at org.apache.hadoop.util.concurrent.HadoopScheduledThreadPoolExecutor.afterExecute(HadoopScheduledThreadPoolExecutor.java:69) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1150) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {code} Both threads wait indefinitely on {{FutureTask.get()}} after the first execution. Component/s: jobhistoryserver > JHS Log Scanner and Cleaner blocked > --- > > Key: MAPREDUCE-6987 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6987 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: jobhistoryserver >Affects Versions: 2.8.0, 3.0.0 >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > > {code} > "Log Scanner/Cleaner #1" #81 prio=5 os_prio=0 tid=0x7fd6c010f000 > nid=0x11db waiting on condition [0x7fd6aa859000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xd6c88a80> (a > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429) > at java.util.concurrent.FutureTask.get(FutureTask.java:191) > at > org.apache.hadoop.util.concurrent.ExecutorHelper.logThrowableFromAfterExecute(ExecutorHelper.java:47) > at > org.apache.hadoop.util.concurrent.HadoopScheduledThreadPoolExecutor.afterExecute(HadoopScheduledThreadPoolExecutor.java:69) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1150) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > "Log Scanner/Cleaner #0" #80 prio=5 os_prio=0 tid=0x7fd6c010c800 > nid=0x11da waiting on condition [0x7fd6aa95a000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xd6c8> (a > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429) > at 
java.util.concurrent.FutureTask.get(FutureTask.java:191) > at > org.apache.hadoop.util.concurrent.ExecutorHelper.logThrowableFromAfterExecute(ExecutorHelper.java:47) > at > org.apache.hadoop.util.concurrent.HadoopScheduledThreadPoolExecutor.afterExecute(HadoopScheduledThreadPoolExecutor.java:69) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1150) > at > java.util.concurrent.Threa
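The stack traces above show both workers parked in {{FutureTask.get()}}, called from {{afterExecute}}. One plausible reading, which this thread does not confirm, is that the scheduled tasks are periodic: the future of a periodic task is not completed after a normal run, so an unguarded {{get()}} never returns. The sketch below uses only the JDK and a hypothetical {{GuardedExecutor}} class (it is not Hadoop's {{HadoopScheduledThreadPoolExecutor}} or {{ExecutorHelper}}) to show one way to guard the call with {{isDone()}}.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class AfterExecuteSketch {

  /** Hypothetical executor for illustration; not the Hadoop class from the stack trace. */
  static class GuardedExecutor extends ScheduledThreadPoolExecutor {
    GuardedExecutor(int corePoolSize) {
      super(corePoolSize);
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
      super.afterExecute(r, t);
      // A periodic task's future is not done after a normal run,
      // so only call get() when the future has actually completed.
      if (t == null && r instanceof Future<?> && ((Future<?>) r).isDone()) {
        try {
          ((Future<?>) r).get();
        } catch (ExecutionException e) {
          System.err.println("Task failed: " + e.getCause());
        } catch (CancellationException e) {
          // The task was cancelled; there is nothing to report.
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    }
  }

  /** Returns true if a periodic task ran at least {@code runs} times within 5 seconds. */
  static boolean runsRepeatedly(int runs) throws InterruptedException {
    GuardedExecutor executor = new GuardedExecutor(1);
    CountDownLatch latch = new CountDownLatch(runs);
    executor.scheduleAtFixedRate(latch::countDown, 0, 10, TimeUnit.MILLISECONDS);
    try {
      return latch.await(5, TimeUnit.SECONDS);
    } finally {
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("periodic task ran 3 times: " + runsRepeatedly(3));
  }
}
```

Without the {{isDone()}} check, the single worker in this sketch would park inside {{get()}} after the first run and the task would not be executed again, which matches the "after first execution" symptom described in the report.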
[jira] [Assigned] (MAPREDUCE-6987) JHS Log Scanner and Cleaner blocked
[ https://issues.apache.org/jira/browse/MAPREDUCE-6987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt reassigned MAPREDUCE-6987: --- Assignee: Bibin A Chundatt > JHS Log Scanner and Cleaner blocked > --- > > Key: MAPREDUCE-6987 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6987 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-6987) JHS Log Scanner and Cleaner blocked
Bibin A Chundatt created MAPREDUCE-6987: --- Summary: JHS Log Scanner and Cleaner blocked Key: MAPREDUCE-6987 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6987 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Bibin A Chundatt Priority: Critical -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6986) Fail to set job configuration by -D option in example WordMedian
[ https://issues.apache.org/jira/browse/MAPREDUCE-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16210723#comment-16210723 ] Hadoop QA commented on MAPREDUCE-6986: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 15s{color} | {color:red} Docker failed to build yetus/hadoop:0de40f0. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | MAPREDUCE-6986 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12893006/MAPREDUCE-6986.001.patch | | Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7195/console | | Powered by | Apache Yetus 0.5.0 http://yetus.apache.org | This message was automatically generated. > Fail to set job configuration by -D option in example WordMedian > - > > Key: MAPREDUCE-6986 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6986 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.8.1 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Trivial > Attachments: MAPREDUCE-6986.001.patch > > > I tried to submit mr example job wordmedian and specify queue by -D option: > {{bin/hadoop jar ~/hadoop-mapreduce-examples-2.8.2.jar wordmedian > -Dmapreduce.job.queuename=root.user1 /input /output6}}. This option did not > work in wordmedian while it works in other example jobs such Terasort, > Wordcount. > The cause is there is unnecessary {{setConf(new Configuration())}} in > {{run()}} method, which would override Configuration with properties by -D > options. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6986) Fail to set job configuration by -D option in example WordMedian
[ https://issues.apache.org/jira/browse/MAPREDUCE-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated MAPREDUCE-6986: --- Description: I tried to submit mr example job wordmedian and specify queue by -D option: {{bin/hadoop jar ~/hadoop-mapreduce-examples-2.8.2.jar wordmedian -Dmapreduce.job.queuename=root.user1 /input /output6}}. This option did not work in wordmedian while it works in other example jobs such Terasort, Wordcount. The cause is there is unnecessary {{setConf(new Configuration())}} in {{run()}} method, which would override Configuration with properties by -D options. was: I tried to submit mr example job wordmedian and specify queue by -D option: {{bin/hadoop jar ~/hadoop-mapreduce-examples-2.8.2-bc1.4.0.jar wordmedian -Dmapreduce.job.queuename=root.user1 /input /output6}}. This option did not work in wordmedian while it works in other example jobs such Terasort, Wordcount. The cause is there is unnecessary {{setConf(new Configuration())}} in {{run()}} method, which would override Configuration with properties by -D options. > Fail to set job configuration by -D option in example WordMedian > - > > Key: MAPREDUCE-6986 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6986 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.8.1 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Trivial > Attachments: MAPREDUCE-6986.001.patch > > > I tried to submit mr example job wordmedian and specify queue by -D option: > {{bin/hadoop jar ~/hadoop-mapreduce-examples-2.8.2.jar wordmedian > -Dmapreduce.job.queuename=root.user1 /input /output6}}. This option did not > work in wordmedian while it works in other example jobs such Terasort, > Wordcount. > The cause is there is unnecessary {{setConf(new Configuration())}} in > {{run()}} method, which would override Configuration with properties by -D > options. 
[jira] [Updated] (MAPREDUCE-6986) Fail to set job configuration by -D option in example WordMedian
[ https://issues.apache.org/jira/browse/MAPREDUCE-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated MAPREDUCE-6986: --- Attachment: MAPREDUCE-6986.001.patch > Fail to set job configuration by -D option in example WordMedian > - > > Key: MAPREDUCE-6986 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6986 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.8.1 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Trivial > Attachments: MAPREDUCE-6986.001.patch > > > I tried to submit mr example job wordmedian and specify queue by -D option: > {{bin/hadoop jar ~/hadoop-mapreduce-examples-2.8.2-bc1.4.0.jar wordmedian > -Dmapreduce.job.queuename=root.user1 /input /output6}}. This option did not > work in wordmedian while it works in other example jobs such Terasort, > Wordcount. > The cause is there is unnecessary {{setConf(new Configuration())}} in > {{run()}} method, which would override Configuration with properties by -D > options. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6986) Fail to set job configuration by -D option in example WordMedian
[ https://issues.apache.org/jira/browse/MAPREDUCE-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated MAPREDUCE-6986: --- Status: Patch Available (was: Open) > Fail to set job configuration by -D option in example WordMedian > - > > Key: MAPREDUCE-6986 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6986 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.8.1 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Trivial > Attachments: MAPREDUCE-6986.001.patch > > > I tried to submit mr example job wordmedian and specify queue by -D option: > {{bin/hadoop jar ~/hadoop-mapreduce-examples-2.8.2-bc1.4.0.jar wordmedian > -Dmapreduce.job.queuename=root.user1 /input /output6}}. This option did not > work in wordmedian while it works in other example jobs such Terasort, > Wordcount. > The cause is there is unnecessary {{setConf(new Configuration())}} in > {{run()}} method, which would override Configuration with properties by -D > options. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Work stopped] (MAPREDUCE-6986) Fail to set job configuration by -D option in example WordMedian
[ https://issues.apache.org/jira/browse/MAPREDUCE-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on MAPREDUCE-6986 stopped by Tao Jie. -- > Fail to set job configuration by -D option in example WordMedian > - > > Key: MAPREDUCE-6986 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6986 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.8.1 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Trivial > Attachments: MAPREDUCE-6986.001.patch > > > I tried to submit mr example job wordmedian and specify queue by -D option: > {{bin/hadoop jar ~/hadoop-mapreduce-examples-2.8.2-bc1.4.0.jar wordmedian > -Dmapreduce.job.queuename=root.user1 /input /output6}}. This option did not > work in wordmedian while it works in other example jobs such Terasort, > Wordcount. > The cause is there is unnecessary {{setConf(new Configuration())}} in > {{run()}} method, which would override Configuration with properties by -D > options. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Work started] (MAPREDUCE-6986) Fail to set job configuration by -D option in example WordMedian
[ https://issues.apache.org/jira/browse/MAPREDUCE-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on MAPREDUCE-6986 started by Tao Jie. -- > Fail to set job configuration by -D option in example WordMedian > - > > Key: MAPREDUCE-6986 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6986 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.8.1 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Trivial > Attachments: MAPREDUCE-6986.001.patch > > > I tried to submit mr example job wordmedian and specify queue by -D option: > {{bin/hadoop jar ~/hadoop-mapreduce-examples-2.8.2-bc1.4.0.jar wordmedian > -Dmapreduce.job.queuename=root.user1 /input /output6}}. This option did not > work in wordmedian while it works in other example jobs such Terasort, > Wordcount. > The cause is there is unnecessary {{setConf(new Configuration())}} in > {{run()}} method, which would override Configuration with properties by -D > options. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6986) Fail to set job configuration by -D option in example WordMedian
[ https://issues.apache.org/jira/browse/MAPREDUCE-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated MAPREDUCE-6986: --- Description: I tried to submit mr example job wordmedian and specify queue by -D option: {{bin/hadoop jar ~/hadoop-mapreduce-examples-2.8.2-bc1.4.0.jar wordmedian -Dmapreduce.job.queuename=root.user1 /input /output6}}. This option did not work in wordmedian while it works in other example jobs such Terasort, Wordcount. The cause is there is unnecessary {{setConf(new Configuration())}} in {{run()}} method, which would override Configuration with properties by -D options. was: I tried to submit mr example job wordmedian and specify queue by -D option: {{bin/hadoop jar ~/hadoop-mapreduce-examples-2.8.2-bc1.4.0.jar wordmedian -Dmapreduce.job.queuename=root.user1 /input /output6}}. This option did not work in wordmedian while it works in other example jobs such Terasort, Wordcount. The cause is there is unnecessary {{ setConf(new Configuration())}} in {{run()}} method, which would override Configuration with properties by -D options. > Fail to set job configuration by -D option in example WordMedian > - > > Key: MAPREDUCE-6986 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6986 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.8.1 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Trivial > > I tried to submit mr example job wordmedian and specify queue by -D option: > {{bin/hadoop jar ~/hadoop-mapreduce-examples-2.8.2-bc1.4.0.jar wordmedian > -Dmapreduce.job.queuename=root.user1 /input /output6}}. This option did not > work in wordmedian while it works in other example jobs such Terasort, > Wordcount. > The cause is there is unnecessary {{setConf(new Configuration())}} in > {{run()}} method, which would override Configuration with properties by -D > options. 
[jira] [Updated] (MAPREDUCE-6986) Fail to set job configuration by -D option in example WordMedian
[ https://issues.apache.org/jira/browse/MAPREDUCE-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated MAPREDUCE-6986: --- Affects Version/s: 2.8.1 > Fail to set job configuration by -D option in example WordMedian > - > > Key: MAPREDUCE-6986 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6986 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.8.1 >Reporter: Tao Jie >Assignee: Tao Jie >Priority: Trivial > > I tried to submit mr example job wordmedian and specify queue by -D option: > {{bin/hadoop jar ~/hadoop-mapreduce-examples-2.8.2-bc1.4.0.jar wordmedian > -Dmapreduce.job.queuename=root.user1 /input /output6}}. This option did not > work in wordmedian while it works in other example jobs such Terasort, > Wordcount. > The cause is there is unnecessary {{ setConf(new Configuration())}} in > {{run()}} method, which would override Configuration with properties by -D > options. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-6986) Fail to set job configuration by -D option in example WordMedian
Tao Jie created MAPREDUCE-6986: -- Summary: Fail to set job configuration by -D option in example WordMedian Key: MAPREDUCE-6986 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6986 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Tao Jie Assignee: Tao Jie Priority: Trivial I tried to submit the MR example job wordmedian and specify the queue with the -D option: {{bin/hadoop jar ~/hadoop-mapreduce-examples-2.8.2-bc1.4.0.jar wordmedian -Dmapreduce.job.queuename=root.user1 /input /output6}}. The option has no effect in wordmedian, although it works in other example jobs such as Terasort and Wordcount. The cause is an unnecessary {{setConf(new Configuration())}} call in the {{run()}} method, which replaces the Configuration that carries the properties set by -D options.
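The reported cause can be illustrated with a small JDK-only model. This is a sketch under stated assumptions, not the WordMedian source or the Hadoop API: {{MiniTool}} and {{launch}} are hypothetical stand-ins for a configurable tool and its launcher, a plain {{Map}} stands in for {{Configuration}}, and the fallback queue name "default" is assumed for the demonstration.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfOverrideSketch {

  /** Hypothetical stand-in for a tool that holds a configuration. */
  static class MiniTool {
    private Map<String, String> conf = new HashMap<>();
    private final boolean resetConfInRun;

    MiniTool(boolean resetConfInRun) {
      this.resetConfInRun = resetConfInRun;
    }

    void setConf(Map<String, String> conf) {
      this.conf = conf;
    }

    /** Returns the queue name the job would be submitted with. */
    String run() {
      if (resetConfInRun) {
        // Models the reported setConf(new Configuration()):
        // the values parsed from -D options are discarded here.
        setConf(new HashMap<>());
      }
      // "default" is an assumed fallback for this sketch.
      return conf.getOrDefault("mapreduce.job.queuename", "default");
    }
  }

  /** Hypothetical launcher: copies -Dkey=value arguments into the configuration, then runs the tool. */
  static String launch(MiniTool tool, String... args) {
    Map<String, String> conf = new HashMap<>();
    for (String arg : args) {
      int eq = arg.indexOf('=');
      if (arg.startsWith("-D") && eq > 2) {
        conf.put(arg.substring(2, eq), arg.substring(eq + 1));
      }
    }
    tool.setConf(conf);
    return tool.run();
  }

  public static void main(String[] args) {
    String opt = "-Dmapreduce.job.queuename=root.user1";
    System.out.println("with reset:    " + launch(new MiniTool(true), opt));
    System.out.println("without reset: " + launch(new MiniTool(false), opt));
  }
}
```

In this model the tool that resets its configuration inside {{run()}} falls back to the assumed default queue, while the one that keeps the configuration it was given submits to root.user1. Removing the reset is the change the description suggests.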