[jira] [Updated] (MAPREDUCE-4695) Fix LocalRunner on trunk after MAPREDUCE-3223 broke it

2016-05-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-4695:
---
Component/s: test

> Fix LocalRunner on trunk after MAPREDUCE-3223 broke it
> --
>
> Key: MAPREDUCE-4695
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4695
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0-alpha1
>Reporter: Harsh J
>Assignee: Harsh J
>Priority: Blocker
> Fix For: 3.0.0-alpha1
>
> Attachments: MAPREDUCE-4695.patch, MAPREDUCE-4695.patch
>
>
> MAPREDUCE-3223 removed the mapreduce.cluster.local.dir property from 
> mapred-default.xml (since NM local dirs are now used) but failed to account 
> for the fact that LocalJobRunner, etc. still use it.
> {code}
> mr-3223.txt:-  mapreduce.cluster.local.dir
> mr-3223.txt--  ${hadoop.tmp.dir}/mapred/local
> {code}
> All local job tests have been failing since then.
> This JIRA is to reintroduce it or provide an equivalent new config for fixing 
> it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org



[jira] [Updated] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records

2016-05-12 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-6558:
-
Attachment: MAPREDUCE-6558.3.patch

> multibyte delimiters with compressed input files generate duplicate records
> ---
>
> Key: MAPREDUCE-6558
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6558.1.patch, MAPREDUCE-6558.2.patch, 
> MAPREDUCE-6558.3.patch
>
>
> This is the follow up for MAPREDUCE-6549. Compressed files cause record 
> duplications as shown in different junit tests. The number of duplicated 
> records changes with the splitsize:
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 45062
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 41052
> Test passes with splitsize = 147445 which is the compressed file length. The 
> file is a bzip2 file with 100k blocks and a total of 11 blocks.






[jira] [Updated] (MAPREDUCE-3149) add a test to verify that buildDTAuthority works for cases with no authority.

2016-05-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-3149:
---
Component/s: test

> add a test to verify that buildDTAuthority works for cases with no authority.
> -
>
> Key: MAPREDUCE-3149
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3149
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.0-alpha
>Reporter: John George
>Assignee: John George
> Fix For: 3.0.0-alpha1
>
> Attachments: HADOOP-7602.patch
>
>
> Add a test to verify that buildDTAuthority works for cases with no Authority.






[jira] [Updated] (MAPREDUCE-2632) Avoid calling the partitioner when the numReduceTasks is 1.

2016-05-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-2632:
---
Target Version/s:   (was: )
Release Note: A partitioner is now only created if there are multiple 
reducers.

I added a release note based on my understanding of this patch, please update 
if something's off.

> Avoid calling the partitioner when the numReduceTasks is 1.
> ---
>
> Key: MAPREDUCE-2632
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2632
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.23.0
>Reporter: Ravi Teja Ch N V
>Assignee: Sunil G
> Fix For: 3.0.0-alpha1
>
> Attachments: 0001-MAPREDUCE-2632.patch, MAPREDUCE-2632-1.patch, 
> MAPREDUCE-2632.patch, mr-2632-2.patch, mr-2632-3.patch, mr-2632-4.patch
>
>
> We can avoid the call to the partitioner when the number of reducers is 
> 1. This will avoid unnecessary computation by the partitioner.
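
The idea in this description can be sketched as follows. This is only an illustration and not the code from the attached patches: the class name PartitionSketch, the methods hashPartition and choosePartition, and the call counter are all hypothetical names introduced here.

```java
/** Hypothetical sketch; not the Hadoop implementation. */
final class PartitionSketch {
    /** Counts how often the (stand-in) partitioner was actually invoked. */
    static int partitionerCalls = 0;

    /** Stand-in for a hash partitioner: maps a key to [0, numReduceTasks). */
    static int hashPartition(String key, int numReduceTasks) {
        partitionerCalls++;
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    /** Returns the target partition, skipping the partitioner when there is at most one reducer. */
    static int choosePartition(String key, int numReduceTasks) {
        if (numReduceTasks <= 1) {
            return 0; // only one possible destination, nothing to compute
        }
        return hashPartition(key, numReduceTasks);
    }

    public static void main(String[] args) {
        System.out.println(choosePartition("alpha", 1));
        System.out.println(choosePartition("alpha", 4));
        System.out.println("partitioner calls: " + partitionerCalls);
    }
}
```

With a single reducer every record goes to partition 0, so the shortcut changes no results and only saves the per-record call.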






[jira] [Updated] (MAPREDUCE-6223) TestJobConf#testNegativeValueForTaskVmem failures

2016-05-12 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated MAPREDUCE-6223:
---
Target Version/s:   (was: )
Hadoop Flags: Reviewed  (was: Incompatible change,Reviewed)

> TestJobConf#testNegativeValueForTaskVmem failures
> -
>
> Key: MAPREDUCE-6223
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6223
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.0.0-alpha1
>Reporter: Gera Shegalov
>Assignee: Varun Saxena
> Fix For: 3.0.0-alpha1
>
> Attachments: MAPREDUCE-6223.001.patch, MAPREDUCE-6223.002.patch, 
> MAPREDUCE-6223.003.patch, MAPREDUCE-6223.004.patch, MAPREDUCE-6223.005.patch, 
> MAPREDUCE-6223.006.patch
>
>
> {code}
> Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.328 sec <<< 
> FAILURE! - in org.apache.hadoop.conf.TestJobConf
> testNegativeValueForTaskVmem(org.apache.hadoop.conf.TestJobConf)  Time 
> elapsed: 0.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<1024> but was:<-1>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.conf.TestJobConf.testNegativeValueForTaskVmem(TestJobConf.java:111)
> {code}






[jira] [Updated] (MAPREDUCE-6558) multibyte delimiters with compressed input files generate duplicate records

2016-05-12 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated MAPREDUCE-6558:
-
Attachment: MAPREDUCE-6558.2.patch

> multibyte delimiters with compressed input files generate duplicate records
> ---
>
> Key: MAPREDUCE-6558
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6558
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv1, mrv2
>Affects Versions: 2.7.2
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: MAPREDUCE-6558.1.patch, MAPREDUCE-6558.2.patch
>
>
> This is the follow up for MAPREDUCE-6549. Compressed files cause record 
> duplications as shown in different junit tests. The number of duplicated 
> records changes with the splitsize:
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 45062
> Unexpected number of records in split (splitsize = 10)
> Expected: 41051
> Actual: 41052
> Test passes with splitsize = 147445 which is the compressed file length. The 
> file is a bzip2 file with 100k blocks and a total of 11 blocks.






[jira] [Updated] (MAPREDUCE-6146) Reduce tar ball size for MR over distributed cache

2016-05-12 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated MAPREDUCE-6146:
--
Labels:   (was: BB2015-05-TBR)

> Reduce tar ball size for MR over distributed cache
> --
>
> Key: MAPREDUCE-6146
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6146
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: build
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: HADOOP-11177.patch, MAPREDUCE-6146-v2.patch, 
> MAPREDUCE-6146.patch
>
>
> The current tar ball built from "mvn package -Pdist -DskipTests -Dtar" is 
> over 160M in size. We need smaller tar ball pieces for features like MR 
> over distributed cache to support rolling updates of a cluster.






[jira] [Updated] (MAPREDUCE-6146) Reduce tar ball size for MR over distributed cache

2016-05-12 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated MAPREDUCE-6146:
--
Target Version/s:   (was: 2.8.0)

> Reduce tar ball size for MR over distributed cache
> --
>
> Key: MAPREDUCE-6146
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6146
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: build
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Attachments: HADOOP-11177.patch, MAPREDUCE-6146-v2.patch, 
> MAPREDUCE-6146.patch
>
>
> The current tar ball built from "mvn package -Pdist -DskipTests -Dtar" is 
> over 160M in size. We need smaller tar ball pieces for features like MR 
> over distributed cache to support rolling updates of a cluster.






[jira] [Commented] (MAPREDUCE-6441) LocalDistributedCacheManager for concurrent sqoop processes fails to create unique directories

2016-05-12 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15282058#comment-15282058
 ] 

Ray Chiang commented on MAPREDUCE-6441:
---

[~wattsinabox], it looks like the old patch doesn't work now on trunk either.

> LocalDistributedCacheManager for concurrent sqoop processes fails to create 
> unique directories
> --
>
> Key: MAPREDUCE-6441
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6441
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: William Watson
>Assignee: William Watson
> Attachments: HADOOP-10924.02.patch, 
> HADOOP-10924.03.jobid-plus-uuid.patch
>
>
> Kicking off many sqoop processes in different threads results in:
> {code}
> 2014-08-01 13:47:24 -0400:  INFO - 14/08/01 13:47:22 ERROR tool.ImportTool: 
> Encountered IOException running import job: java.io.IOException: 
> java.util.concurrent.ExecutionException: java.io.IOException: Rename cannot 
> overwrite non empty destination directory 
> /tmp/hadoop-hadoop/mapred/local/1406915233073
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:149)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:432)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> java.security.AccessController.doPrivileged(Native Method)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> javax.security.auth.Subject.doAs(Subject.java:415)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:186)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:159)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:239)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.sqoop.manager.SqlManager.importQuery(SqlManager.java:645)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:415)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.sqoop.tool.ImportTool.run(ImportTool.java:502)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.sqoop.Sqoop.run(Sqoop.java:145)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
> 2014-08-01 13:47:24 -0400:  INFO -at 
> org.apache.sqoop.Sqoop.main(Sqoop.java:238)
> {code}
> This happens if two are kicked off in the same second. The issue is the following 
> lines of code in the org.apache.hadoop.mapred.LocalDistributedCacheManager class: 
> {code}
> // Generating unique numbers for FSDownload.
> AtomicLong uniqueNumberGenerator =
>new AtomicLong(System.currentTimeMillis());
> {code}
> and 
> {code}
> Long.toString(uniqueNumberGenerator.incrementAndGet())),
> {code}
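
A small sketch of why the quoted generator can collide across processes, assuming two JVMs read the same clock value at startup. The class UniqueDirSketch and both helper methods are hypothetical, and the UUID suffix is just one possible way to make the name unique; it is not necessarily what the attached patches do.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; not the LocalDistributedCacheManager code. */
final class UniqueDirSketch {
    /** Mimics the quoted code: a per-process counter seeded with a timestamp. */
    static String timestampName(long seedMillis) {
        AtomicLong gen = new AtomicLong(seedMillis);
        return Long.toString(gen.incrementAndGet());
    }

    /** Assumed alternative: append a random UUID to the counter value. */
    static String uuidName(long seedMillis) {
        AtomicLong gen = new AtomicLong(seedMillis);
        return gen.incrementAndGet() + "_" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        long sameInstant = 1406915233072L; // both "processes" read the same clock value
        // Each call builds its own generator, as two separate JVMs would.
        System.out.println(timestampName(sameInstant).equals(timestampName(sameInstant)));
        System.out.println(uuidName(sameInstant).equals(uuidName(sameInstant)));
    }
}
```

The AtomicLong only guarantees uniqueness within one process; two processes seeded with the same value produce the same sequence of names.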






[jira] [Commented] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time

2016-05-12 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281968#comment-15281968
 ] 

Varun Saxena commented on MAPREDUCE-6513:
-

Rebased 2.7 patch LGTM.

> MR job got hanged forever when one NM unstable for some time
> 
>
> Key: MAPREDUCE-6513
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6513
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Bob.zhao
>Assignee: Varun Saxena
>Priority: Critical
> Attachments: MAPREDUCE-6513.01.patch, MAPREDUCE-6513.02.patch, 
> MAPREDUCE-6513.03.patch, MAPREDUCE-6513.3.branch-2.8.patch, 
> MAPREDUCE-6513.3_1.branch-2.7.patch, MAPREDUCE-6513.3_1.branch-2.8.patch
>
>
> While a job with many tasks was in progress, one node became unstable 
> due to an OS issue. After the node became unstable, the maps on this node 
> changed to the KILLED state. 
> The maps that were running on the unstable node were rescheduled; all of them 
> are in the scheduled state, waiting for the RM to assign containers. Ask requests 
> for maps were seen until the node was good again (all of those failed), and there 
> were no ask requests after that. But the AM keeps on preempting the reducers 
> (it's recycling).
> Finally, the reducers are waiting for the mappers to complete, and the mappers 
> never get containers.
> My question is:
> 
> Why were map requests not sent by the AM once the node had recovered?






[jira] [Commented] (MAPREDUCE-6639) Process hangs in LocatedFileStatusFetcher if FileSystem.get throws

2016-05-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281910#comment-15281910
 ] 

Hudson commented on MAPREDUCE-6639:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #9753 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9753/])
MAPREDUCE-6639 Process hangs in LocatedFileStatusFetcher if (stevel: rev 
7eddecd357014d4793df4bf2e5d987add02289f5)
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/LocatedFileStatusFetcher.java


> Process hangs in LocatedFileStatusFetcher if FileSystem.get throws
> --
>
> Key: MAPREDUCE-6639
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6639
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.7.2
>Reporter: Ryan Blue
>Assignee: Ryan Blue
> Fix For: 2.8.0
>
> Attachments: MAPREDUCE-6639.2.patch
>
>
> ListLocatedFileStatusFetcher uses a thread pool, but one of the Callable 
> thread functions, 
> [{{ProcessInitialInputPathCallable}}|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/LocatedFileStatusFetcher.java#L306],
>  doesn't catch exceptions (the callbacks do). When an exception is thrown, 
> the thread exits and doesn't signal the error to the calling thread, which 
> continues waiting to be signaled. This can happen when a FS implementation 
> cannot be found.
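
The failure mode described here can be illustrated with a minimal sketch, assuming a caller that waits on a condition variable for a pool thread to report back. Everything below (SignalOnFailureSketch, finish, runAndWait) is hypothetical and is not the LocatedFileStatusFetcher code; it only shows that the worker has to catch the exception and signal the waiter.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch; not the LocatedFileStatusFetcher code. */
final class SignalOnFailureSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition done = lock.newCondition();
    private boolean finished = false;
    private Throwable failure = null;

    /** Records the outcome and wakes the waiting caller. */
    private void finish(Throwable t) {
        lock.lock();
        try {
            failure = t;
            finished = true;
            done.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Runs the task on a pool thread; returns its failure, or null on success. */
    Throwable runAndWait(Runnable task) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            pool.submit(() -> {
                try {
                    task.run();
                    finish(null);
                } catch (Throwable t) {
                    finish(t); // without this catch the caller below would wait forever
                }
            });
            lock.lock();
            try {
                while (!finished) {
                    done.awaitUninterruptibly();
                }
                return failure;
            } finally {
                lock.unlock();
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Throwable t = new SignalOnFailureSketch().runAndWait(() -> {
            throw new IllegalStateException("simulated: no FileSystem for scheme");
        });
        System.out.println("caller woke up with: " + t);
    }
}
```

If the catch block is removed, the exception is swallowed by the executor's Future and the waiting thread is never signaled, which matches the hang described above.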






[jira] [Updated] (MAPREDUCE-6639) Process hangs in LocatedFileStatusFetcher if FileSystem.get throws

2016-05-12 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-6639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated MAPREDUCE-6639:
--
   Resolution: Fixed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

I concur that it is a bug and that the patch fixes it. Although there's no test 
for it (it'd be possible, but tricky), the IDE highlights that the field 
{{unknownError}} is only set in the conditional clause. That is: the code which 
sets the field can only be reached if the field is set. Accordingly, the field 
can never be set. This patch fixes that.



> Process hangs in LocatedFileStatusFetcher if FileSystem.get throws
> --
>
> Key: MAPREDUCE-6639
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6639
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 2.7.2
>Reporter: Ryan Blue
>Assignee: Ryan Blue
> Fix For: 2.8.0
>
> Attachments: MAPREDUCE-6639.2.patch
>
>
> ListLocatedFileStatusFetcher uses a thread pool, but one of the Callable 
> thread functions, 
> [{{ProcessInitialInputPathCallable}}|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/LocatedFileStatusFetcher.java#L306],
>  doesn't catch exceptions (the callbacks do). When an exception is thrown, 
> the thread exits and doesn't signal the error to the calling thread, which 
> continues waiting to be signaled. This can happen when a FS implementation 
> cannot be found.






[jira] [Commented] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time

2016-05-12 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281625#comment-15281625
 ] 

Varun Saxena commented on MAPREDUCE-6513:
-

[~leftnoteasy], will check the 2.7 patch and let you know.

> MR job got hanged forever when one NM unstable for some time
> 
>
> Key: MAPREDUCE-6513
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6513
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Bob.zhao
>Assignee: Varun Saxena
>Priority: Critical
> Attachments: MAPREDUCE-6513.01.patch, MAPREDUCE-6513.02.patch, 
> MAPREDUCE-6513.03.patch, MAPREDUCE-6513.3.branch-2.8.patch, 
> MAPREDUCE-6513.3_1.branch-2.7.patch, MAPREDUCE-6513.3_1.branch-2.8.patch
>
>
> While a job with many tasks was in progress, one node became unstable 
> due to an OS issue. After the node became unstable, the maps on this node 
> changed to the KILLED state. 
> The maps that were running on the unstable node were rescheduled; all of them 
> are in the scheduled state, waiting for the RM to assign containers. Ask requests 
> for maps were seen until the node was good again (all of those failed), and there 
> were no ask requests after that. But the AM keeps on preempting the reducers 
> (it's recycling).
> Finally, the reducers are waiting for the mappers to complete, and the mappers 
> never get containers.
> My question is:
> 
> Why were map requests not sent by the AM once the node had recovered?






[jira] [Commented] (MAPREDUCE-6513) MR job got hanged forever when one NM unstable for some time

2016-05-12 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281622#comment-15281622
 ] 

Varun Saxena commented on MAPREDUCE-6513:
-

[~jianhe], the former is for rescheduling completed maps (as their output may be 
unusable) and the latter is for assigned maps.

> MR job got hanged forever when one NM unstable for some time
> 
>
> Key: MAPREDUCE-6513
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6513
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: applicationmaster, resourcemanager
>Affects Versions: 2.7.0
>Reporter: Bob.zhao
>Assignee: Varun Saxena
>Priority: Critical
> Attachments: MAPREDUCE-6513.01.patch, MAPREDUCE-6513.02.patch, 
> MAPREDUCE-6513.03.patch, MAPREDUCE-6513.3.branch-2.8.patch, 
> MAPREDUCE-6513.3_1.branch-2.7.patch, MAPREDUCE-6513.3_1.branch-2.8.patch
>
>
> While a job with many tasks was in progress, one node became unstable 
> due to an OS issue. After the node became unstable, the maps on this node 
> changed to the KILLED state. 
> The maps that were running on the unstable node were rescheduled; all of them 
> are in the scheduled state, waiting for the RM to assign containers. Ask requests 
> for maps were seen until the node was good again (all of those failed), and there 
> were no ask requests after that. But the AM keeps on preempting the reducers 
> (it's recycling).
> Finally, the reducers are waiting for the mappers to complete, and the mappers 
> never get containers.
> My question is:
> 
> Why were map requests not sent by the AM once the node had recovered?






[jira] [Resolved] (MAPREDUCE-4758) jobhistory web ui not showing correct # failed reducers

2016-05-12 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe resolved MAPREDUCE-4758.
---
Resolution: Duplicate

This is a duplicate of MAPREDUCE-5982 which was fixed in 2.7.2 and 2.6.4.

> jobhistory web ui not showing correct # failed reducers
> ---
>
> Key: MAPREDUCE-4758
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4758
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver, webapps
>Affects Versions: 0.23.4
>Reporter: Thomas Graves
>
> We had a job fail due to a reducer failing 4 times.  Unfortunately the job 
> history UI didn't show this particular failed reducer, which led to 
> confusion as to why the job failed. 
> This reducer failed to launch all 4 task attempts with a Token Expiration 
> error and the jobhistory file only gets an event when the task attempt 
> transitions to launched.  The webapp JobInfo object only counts the task 
> attempts in the jobhistory file to display under the "Attempt Type" table, so 
> since this task didn't have an attempt with it, it didn't show up on the UI.
> We need to reconcile the task list with the task attempts or also show more 
> stats for the tasks vs task attempts.






[jira] [Commented] (MAPREDUCE-6657) job history server can fail on startup when NameNode is in start phase

2016-05-12 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15281517#comment-15281517
 ] 

Junping Du commented on MAPREDUCE-6657:
---

bq. Do you think we should create a subclass of RetriableException for this 
instead?
It is up to you. IMO, it is not necessary to do so just for a special case, or 
there could be too many sub-exceptions.

bq. The message is derived from an instance method this.nn.getRole(), and doing 
string matching is probably not the cleanest way.
You can make a static method for {{this.nn.getRole() + " still not started"}} 
that takes the daemon's name ("NameNode" here) as input and is accessible from 
both HDFS and MAPREDUCE (JHS). In JHS, just pass "NameNode" (or move NamenodeRole 
from HdfsServerConstants to HdfsConstants and share it with JHS) and get the same 
string as HDFS. That could be much cleaner.

> job history server can fail on startup when NameNode is in start phase
> --
>
> Key: MAPREDUCE-6657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobhistoryserver
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: mapreduce6657.001.patch, mapreduce6657.002.patch, 
> mapreduce6657.003.patch, mapreduce6657.004.patch, mapreduce6657.005.patch
>
>
> Job history server will try to create a history directory in HDFS on startup. 
> When NameNode is in safe mode, it will keep retrying for a configurable time 
> period.  However, it should also keep retrying if the NameNode is in the startup 
> phase. Safe mode does not happen until the NN is out of the startup phase. 
> A RetriableException with the text "NameNode still not started" is thrown 
> when the NN is in its internal service startup phase. We should add the check 
> for this specific exception in isBecauseSafeMode() to account for that.
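
A rough sketch of the check discussed in this thread, assuming the message is built by one shared static helper so that the producer and the consumer cannot drift apart. The nested exception classes are stand-ins defined here so the example compiles on its own, and StartupRetrySketch, notStartedMessage and shouldRetry are hypothetical names, not the real HDFS or JHS API.

```java
/** Hypothetical sketch; not the JobHistoryServer or HDFS code. */
final class StartupRetrySketch {
    /** Stand-ins for the exception types named in the issue. */
    static class SafeModeException extends Exception {
        SafeModeException(String msg) { super(msg); }
    }

    static class RetriableException extends Exception {
        RetriableException(String msg) { super(msg); }
    }

    /** Single place that builds the message, shared by thrower and checker. */
    static String notStartedMessage(String daemonName) {
        return daemonName + " still not started";
    }

    /** True when the failure means the NameNode is in safe mode or still starting. */
    static boolean shouldRetry(Throwable t) {
        if (t instanceof SafeModeException) {
            return true;
        }
        return t instanceof RetriableException
                && notStartedMessage("NameNode").equals(t.getMessage());
    }

    public static void main(String[] args) {
        Throwable starting = new RetriableException(notStartedMessage("NameNode"));
        Throwable other = new RetriableException("some other transient failure");
        System.out.println(shouldRetry(starting));
        System.out.println(shouldRetry(other));
    }
}
```

Matching on the type first and on the shared message second keeps unrelated RetriableExceptions from being treated as a startup condition.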


