[jira] [Updated] (MAPREDUCE-2781) mr279 RM application finishtime not set

2011-08-08 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-2781:
-

Attachment: MAPREDUCE-2781-v2.patch

patch with various small fixes and unit tests.

> mr279 RM application finishtime not set
> ---
>
> Key: MAPREDUCE-2781
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2781
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Minor
> Fix For: 0.23.0
>
> Attachments: MAPREDUCE-2781-v2.patch, finishtime.patch
>
>
> The RM Application finishTime isn't being set.  Looks like it got lost in the 
> RM refactor.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (MAPREDUCE-2781) mr279 RM application finishtime not set

2011-08-08 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated MAPREDUCE-2781:
-

Fix Version/s: 0.23.0
Affects Version/s: 0.23.0
   Status: Patch Available  (was: Open)

> mr279 RM application finishtime not set
> ---
>
> Key: MAPREDUCE-2781
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2781
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Thomas Graves
>Assignee: Thomas Graves
>Priority: Minor
> Fix For: 0.23.0
>
> Attachments: MAPREDUCE-2781-v2.patch, finishtime.patch
>
>
> The RM Application finishTime isn't being set.  Looks like it got lost in the 
> RM refactor.





[jira] [Updated] (MAPREDUCE-2788) LeafQueue.assignContainer() can cause a crash if request.getCapability().getMemory() == 0

2011-08-08 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-2788:


Attachment: MAPREDUCE-2788.patch

The attached patch avoids the described crash and returns Resources.none() in 
case memory==0.
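
The guard described above can be sketched as follows. This is a minimal, self-contained illustration and not the attached patch: ZeroMemoryGuard and containersThatFit are hypothetical names, plain int megabytes stand in for the real resource types, and returning 0 stands in for the Resources.none() return mentioned above.

```java
// Hypothetical sketch of a divide-by-zero guard; not the actual LeafQueue code.
final class ZeroMemoryGuard {
    /**
     * Returns how many containers of requestedMb fit into availableMb.
     * A request of 0 MB (or less) yields 0 instead of dividing by zero.
     */
    static int containersThatFit(int availableMb, int requestedMb) {
        if (requestedMb <= 0) {
            return 0; // nothing sensible to assign; avoids ArithmeticException
        }
        return availableMb / requestedMb;
    }

    public static void main(String[] args) {
        System.out.println(containersThatFit(4096, 1024));
        System.out.println(containersThatFit(4096, 0));
    }
}
```

Checking the request before the division keeps the scheduler thread alive when a malformed request arrives; whether such a request should also be rejected earlier is a separate design question.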

> LeafQueue.assignContainer() can cause a crash if 
> request.getCapability().getMemory() == 0
> -
>
> Key: MAPREDUCE-2788
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2788
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: mrv2
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-2788.patch
>
>
> The assignContainer() method in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> can cause the scheduler to crash if the ResourceRequest capability memory == 
> 0 (divide by zero).





[jira] [Created] (MAPREDUCE-2788) LeafQueue.assignContainer() can cause a crash if request.getCapability().getMemory() == 0

2011-08-08 Thread Ahmed Radwan (JIRA)
LeafQueue.assignContainer() can cause a crash if 
request.getCapability().getMemory() == 0
-

 Key: MAPREDUCE-2788
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2788
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan


The assignContainer() method in 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue can 
cause the scheduler to crash if the ResourceRequest capability memory == 0 
(divide by zero).





[jira] [Commented] (MAPREDUCE-2780) Standardize the value of token service

2011-08-08 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081301#comment-13081301
 ] 

Jitendra Nath Pandey commented on MAPREDUCE-2780:
-

Daryn and I discussed it at length; the conclusions were:
It is better to treat the service as opaque, as in the current Token API.
A convenience method to set the service using an InetSocketAddress is useful. 
Therefore we add a method in SecurityUtil that takes a token and an 
InetSocketAddress, constructs a service and sets it in the token. The Token 
API remains unchanged.
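
The approach described above might look roughly like the sketch below. It is only an illustration: DemoToken is a hypothetical stand-in for Hadoop's Token class, TokenServiceUtil stands in for SecurityUtil, and the method names buildTokenService and setTokenService are assumptions rather than the names used in the patch. The "ip:port" format is taken from the issue description.

```java
import java.net.InetSocketAddress;

// Hypothetical stand-in for a token whose service field is an opaque string.
final class DemoToken {
    private String service = "";

    void setService(String service) {
        this.service = service;
    }

    String getService() {
        return service;
    }
}

// Hypothetical stand-in for a shared utility class.
final class TokenServiceUtil {
    /** Builds the "ip:port" service string in exactly one place. */
    static String buildTokenService(InetSocketAddress addr) {
        // Assumes addr is resolved; getAddress() returns null for unresolved addresses.
        return addr.getAddress().getHostAddress() + ":" + addr.getPort();
    }

    /** Convenience method: construct the service and set it on the token. */
    static void setTokenService(DemoToken token, InetSocketAddress addr) {
        token.setService(buildTokenService(addr));
    }
}
```

With this shape the token itself never interprets the service string, so a later change of format would touch only the utility method.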

> Standardize the value of token service
> --
>
> Key: MAPREDUCE-2780
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2780
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2780-2.patch, MAPREDUCE-2780.patch
>
>
> The token's service field must (currently) be set to "ip:port".  All the 
> producers of a token are independently building the service string.  This 
> should be done via a common method to reduce the chance of error, and to 
> facilitate the field value being easily changed in the (near) future.





[jira] [Commented] (MAPREDUCE-2676) MR-279: JobHistory Job page needs reformatted

2011-08-08 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081164#comment-13081164
 ] 

Luke Lu commented on MAPREDUCE-2676:


Since you're creating jobhistory specific pages, can you create unit tests 
similar to TestAMWebApp?

The simplest page test (that ensures structural integrity of the page) is a 
one-liner, e.g.:
{code}
  WebAppTests.testPage(HsView.class, AppContext.class, new TestHistoryContext());
{code}



> MR-279: JobHistory Job page needs reformatted
> -
>
> Key: MAPREDUCE-2676
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2676
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Affects Versions: 0.23.0
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
> Fix For: 0.23.0
>
> Attachments: MR-2676-v1.patch
>
>
> The Job page, the Maps page and the Reduces page for the job history server 
> need to be reformatted.
> The Job Overview needs to add in the User, a link to the Job Conf, and the 
> Job ACLs.
> It also needs Submitted at, Launched at, and Finished at, depending on how 
> they relate to Started and Elapsed.
> In the attempts table we need to remove the new and the running columns
> In the tasks table we need to remove progress, pending, and running columns 
> and add in a failed count column
> We also need to investigate what it would take to add in setup and cleanup 
> statistics.  Perhaps these should be more generally Application Master 
> statistics and links.
> The Maps page and Reduces page should have the progress column removed.





[jira] [Resolved] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-08-08 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy resolved MAPREDUCE-2729.
--

Resolution: Fixed

Sorry, some weird issue with my patch d/w.

I just committed this. Thanks Sherry!

> Reducers are always counted having "pending tasks" even if they can't be 
> scheduled yet because not enough of their mappers have completed
> -
>
> Key: MAPREDUCE-2729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2729
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.20.205.0
> Environment: 0.20.1xx-Secondary
>Reporter: Sherry Chen
>Assignee: Sherry Chen
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2729.patch
>
>
> In the capacity scheduler, the number of users in a queue needing slots is 
> calculated based on whether the users' jobs have any pending tasks.
> This works fine for map tasks. However, for reduce tasks, jobs do not need 
> reduce slots until the minimum number of map tasks has been completed.
> Here, we add a check for whether reduces are ready to schedule (i.e. whether 
> a job has completed enough map tasks) when we increment the number of users 
> in a queue needing reduce slots.
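
As a rough illustration of the check described in the issue (hypothetical names throughout, not the capacity scheduler's actual classes): a job counts toward the users needing reduce slots only if it has pending reduces and has completed at least the required number of maps.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified view of a job for this sketch.
final class DemoJob {
    final String user;
    final int pendingReduces;
    final int completedMaps;
    final int mapsRequiredBeforeReduces;

    DemoJob(String user, int pendingReduces, int completedMaps, int mapsRequiredBeforeReduces) {
        this.user = user;
        this.pendingReduces = pendingReduces;
        this.completedMaps = completedMaps;
        this.mapsRequiredBeforeReduces = mapsRequiredBeforeReduces;
    }

    /** True once enough maps have completed for reduces to be scheduled. */
    boolean reduceReadyToSchedule() {
        return completedMaps >= mapsRequiredBeforeReduces;
    }
}

final class ReduceDemand {
    /** Counts distinct users with at least one job that can use a reduce slot now. */
    static int usersNeedingReduceSlots(List<DemoJob> jobs) {
        Set<String> users = new HashSet<>();
        for (DemoJob job : jobs) {
            if (job.pendingReduces > 0 && job.reduceReadyToSchedule()) {
                users.add(job.user);
            }
        }
        return users.size();
    }
}
```

A job that still has pending reduces but has not completed enough maps is simply skipped, so it no longer inflates the per-queue user count.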





[jira] [Updated] (MAPREDUCE-2787) MR-279: Performance improvement in running Uber MapTasks

2011-08-08 Thread Ahmed Radwan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmed Radwan updated MAPREDUCE-2787:


Attachment: MAPREDUCE-2787.patch

The attached patch fixes the described issue by only creating the FileSystem 
and Configuration once for all task attempts.

All mapreduce unit tests ran successfully.
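
The general idea behind the fix, creating an expensive resource once and reusing it across attempts, can be sketched as below. ExpensiveResource is a hypothetical stand-in for the local FileSystem and Configuration; the counter exists only to make the number of constructions visible and is not part of the patch.

```java
// Hypothetical stand-in for an object that is costly to construct.
final class ExpensiveResource {
    static int created = 0; // counts constructions, for illustration only

    ExpensiveResource() {
        created++;
    }

    String run(int attempt) {
        return "attempt-" + attempt;
    }
}

final class UberLoopDemo {
    /** Before: a new resource is built for every attempt. */
    static void perAttempt(int attempts) {
        for (int i = 0; i < attempts; i++) {
            new ExpensiveResource().run(i);
        }
    }

    /** After: one resource is built up front and shared by all attempts. */
    static void sharedAcrossAttempts(int attempts) {
        ExpensiveResource shared = new ExpensiveResource();
        for (int i = 0; i < attempts; i++) {
            shared.run(i);
        }
    }
}
```

Hoisting the construction out of the loop is safe only if the shared object carries no per-attempt state, which is assumed here.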

> MR-279: Performance improvement in running Uber MapTasks
> 
>
> Key: MAPREDUCE-2787
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2787
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: mrv2
>Reporter: Ahmed Radwan
>Assignee: Ahmed Radwan
> Attachments: MAPREDUCE-2787.patch
>
>
> The runUberMapTasks() in org.apache.hadoop.mapred.UberTask obtains the local 
> fileSystem and local job configuration for every task attempt.  This will 
> have a negative performance impact.





[jira] [Updated] (MAPREDUCE-2780) Standardize the value of token service

2011-08-08 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated MAPREDUCE-2780:
---

Status: Patch Available  (was: Open)

> Standardize the value of token service
> --
>
> Key: MAPREDUCE-2780
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2780
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2780-2.patch, MAPREDUCE-2780.patch
>
>
> The token's service field must (currently) be set to "ip:port".  All the 
> producers of a token are independently building the service string.  This 
> should be done via a common method to reduce the chance of error, and to 
> facilitate the field value being easily changed in the (near) future.





[jira] [Commented] (MAPREDUCE-2780) Standardize the value of token service

2011-08-08 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081142#comment-13081142
 ] 

Daryn Sharp commented on MAPREDUCE-2780:


bq.  It forces to set ip/host/port in the service.

Per the title of this jira, that is the goal. :)  Every token producer is 
required to use the same service field format to comply with the format needed 
by the token's selector. Currently every token producer has a copy-n-paste 
chunk of the selector's code to construct the service format.  If the selector 
and the producer get out of sync, there's a big problem.

bq. If we later decide to store something else in the service for example, uri 
with scheme, we will have to change it again.

Please elaborate?  This appears to be a non sequitur, since token producers that 
choose to use a URI (someday), for instance, will require a change in either 
case.

Here is the evolution from the original code of:
{code}
token.setService(new Text(addr.getAddress().getHostAddress() + ":" + addr.getPort()));
selector.selectToken(new Text(addr.getAddress().getHostAddress() + ":" + addr.getPort()), tokens);
{code}
I think the following is your suggestion?  It's incrementally better, but 
continues to require a copy-n-paste in each token producer, and every token 
producer continues to have intimate knowledge of the service format.
{code}
token.setService(SecurityUtil.buildDTAuthority(addr));
selector.selectToken(SecurityUtil.buildDTAuthority(addr), tokens);
{code}
The patch applies another layer of abstraction.  The format is privatized to 
the token, instead of publicly diffused over all the tokens in hadoop.
{code}
token.setService(addr);
selector.selectToken(Token.createService(addr), tokens);
// I removed this due to your earlier concerns in the parent jira
// selector.selectToken(addr, tokens);
{code}


Do you believe this is a persuasive case for the patch?

> Standardize the value of token service
> --
>
> Key: MAPREDUCE-2780
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2780
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2780-2.patch, MAPREDUCE-2780.patch
>
>
> The token's service field must (currently) be set to "ip:port".  All the 
> producers of a token are independently building the service string.  This 
> should be done via a common method to reduce the chance of error, and to 
> facilitate the field value being easily changed in the (near) future.





[jira] [Created] (MAPREDUCE-2787) MR-279: Performance improvement in running Uber MapTasks

2011-08-08 Thread Ahmed Radwan (JIRA)
MR-279: Performance improvement in running Uber MapTasks


 Key: MAPREDUCE-2787
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2787
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: mrv2
Reporter: Ahmed Radwan
Assignee: Ahmed Radwan


The runUberMapTasks() in org.apache.hadoop.mapred.UberTask obtains the local 
fileSystem and local job configuration for every task attempt.  This will have 
a negative performance impact.





[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-08-08 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081122#comment-13081122
 ] 

Arun C Murthy commented on MAPREDUCE-2729:
--

Thomas, it doesn't make sense to port this to trunk - please don't bother, 
unless you want to look at this vis-a-vis MAPREDUCE-279.

> Reducers are always counted having "pending tasks" even if they can't be 
> scheduled yet because not enough of their mappers have completed
> -
>
> Key: MAPREDUCE-2729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2729
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.20.205.0
> Environment: 0.20.1xx-Secondary
>Reporter: Sherry Chen
>Assignee: Sherry Chen
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2729.patch
>
>
> In the capacity scheduler, the number of users in a queue needing slots is 
> calculated based on whether the users' jobs have any pending tasks.
> This works fine for map tasks. However, for reduce tasks, jobs do not need 
> reduce slots until the minimum number of map tasks has been completed.
> Here, we add a check for whether reduces are ready to schedule (i.e. whether 
> a job has completed enough map tasks) when we increment the number of users 
> in a queue needing reduce slots.





[jira] [Commented] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-08-08 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081118#comment-13081118
 ] 

Thomas Graves commented on MAPREDUCE-2729:
--

The patch is for the branch-0.20-security branch.  I will look at putting it on 
trunk.

> Reducers are always counted having "pending tasks" even if they can't be 
> scheduled yet because not enough of their mappers have completed
> -
>
> Key: MAPREDUCE-2729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2729
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.20.205.0
> Environment: 0.20.1xx-Secondary
>Reporter: Sherry Chen
>Assignee: Sherry Chen
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2729.patch
>
>
> In the capacity scheduler, the number of users in a queue needing slots is 
> calculated based on whether the users' jobs have any pending tasks.
> This works fine for map tasks. However, for reduce tasks, jobs do not need 
> reduce slots until the minimum number of map tasks has been completed.
> Here, we add a check for whether reduces are ready to schedule (i.e. whether 
> a job has completed enough map tasks) when we increment the number of users 
> in a queue needing reduce slots.





[jira] [Commented] (MAPREDUCE-2780) Standardize the value of token service

2011-08-08 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081091#comment-13081091
 ] 

Jitendra Nath Pandey commented on MAPREDUCE-2780:
-

  I see only one issue with changing the setService API to take an 
InetSocketAddress: it forces ip/host/port to be set in the service. If we later 
decide to store something else in the service, for example a URI with a scheme, 
we will have to change it again. 
  I would recommend keeping the setService API unchanged; however, a utility 
method in SecurityUtil to construct the DT service from a socket address is good.

> Standardize the value of token service
> --
>
> Key: MAPREDUCE-2780
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2780
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2780-2.patch, MAPREDUCE-2780.patch
>
>
> The token's service field must (currently) be set to "ip:port".  All the 
> producers of a token are independently building the service string.  This 
> should be done via a common method to reduce the chance of error, and to 
> facilitate the field value being easily changed in the (near) future.





[jira] [Commented] (MAPREDUCE-2364) Shouldn't hold lock on rjob while localizing resources.

2011-08-08 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081086#comment-13081086
 ] 

Devaraj Das commented on MAPREDUCE-2364:


Subroto, I see a significant difference in the patches attached to 
MAPREDUCE-2209 and the last one here. I'll need to look at the details but if 
you have time could you please take a look at the patch attached here and see 
if this makes sense (given this patch predates the patch on MAPREDUCE-2209; I 
am sorry that I didn't look at the patch here earlier). 

> Shouldn't hold lock on rjob while localizing resources.
> ---
>
> Key: MAPREDUCE-2364
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2364
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.20.203.0
>Reporter: Owen O'Malley
>Assignee: Devaraj Das
> Fix For: 0.20.204.0
>
> Attachments: MAPREDUCE-2364.patch, 
> no-lock-localize-branch-0.20-security.patch, no-lock-localize-trunk.patch
>
>
> There is a deadlock while localizing resources on the TaskTracker.
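
One common way to avoid blocking other threads during localization is to keep the slow copy outside the monitor and take the lock only to check and publish state. The sketch below shows that pattern with hypothetical names (DemoRunningJob, localize); it is an assumption about the general technique, not the attached patch.

```java
// Hypothetical sketch: do slow work without holding the object's monitor.
final class DemoRunningJob {
    private boolean localized = false;   // guarded by this
    private boolean localizing = false;  // guarded by this

    synchronized boolean isLocalized() {
        return localized;
    }

    /** Localizes at most once; the slow copy runs without holding the lock. */
    void localize(Runnable slowCopy) {
        synchronized (this) {
            while (localizing) {
                try {
                    wait(); // another thread is copying; wait for it to finish
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
            if (localized) {
                return; // already done by another thread
            }
            localizing = true;
        }
        boolean ok = false;
        try {
            slowCopy.run(); // long I/O happens here, lock not held
            ok = true;
        } finally {
            synchronized (this) {
                localizing = false;
                localized = ok;
                notifyAll(); // wake any threads waiting for localization
            }
        }
    }
}
```

Threads that only need to inspect the job, such as a status or heartbeat path, can take the monitor briefly while the copy is still in progress.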





[jira] [Commented] (MAPREDUCE-2209) TaskTracker's heartbeat hang for several minutes when copying large job.jar from HDFS

2011-08-08 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081084#comment-13081084
 ] 

Arun C Murthy commented on MAPREDUCE-2209:
--

Subroto/Liyin - In general the community is moving forward with a complete 
re-architecture of Hadoop MapReduce from 0.23 onwards: see MAPREDUCE-279 for 
more details.

Thus, you might be better off focussing your efforts either on hadoop-0.20.203 
and beyond (the classic mapreduce) or, better yet, helping out with 
MAPREDUCE-279 (new mapreduce). Given your valuable experience on 'classic' 
mapreduce, your help on MAPREDUCE-279 would be very welcome. Thanks!

> TaskTracker's heartbeat hang for several minutes when copying large job.jar 
> from HDFS
> -
>
> Key: MAPREDUCE-2209
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2209
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.23.0
> Environment: hadoop version: 0.19.1
>Reporter: Liyin Liang
>Priority: Blocker
> Attachments: 2209-1.diff, MAPREDUCE-2209.patch
>
>
> If a job's jar file is very large, e.g. 200m+, the TaskTracker's heartbeat 
> hangs for several minutes when localizing the job. The jstack output for the 
> related threads is as follows:
> {code:borderStyle=solid}
> "TaskLauncher for task" daemon prio=10 tid=0x002b05ee5000 nid=0x1adf 
> runnable [0x42e56000]
>    java.lang.Thread.State: RUNNABLE
>     at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>     at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
>     at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
>     at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
>     - locked <0x002afc892ec8> (a sun.nio.ch.Util$1)
>     - locked <0x002afc892eb0> (a java.util.Collections$UnmodifiableSet)
>     - locked <0x002afc8927d8> (a sun.nio.ch.EPollSelectorImpl)
>     at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>     at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:260)
>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:155)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>     - locked <0x002afce26158> (a java.io.BufferedInputStream)
>     at java.io.DataInputStream.readShort(DataInputStream.java:295)
>     at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1304)
>     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1556)
>     - locked <0x002afce26218> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
>     at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1673)
>     - locked <0x002afce26218> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
>     at java.io.DataInputStream.read(DataInputStream.java:83)
>     at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:47)
>     at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:209)
>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:142)
>     at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1214)
>     at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1195)
>     at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:824)
>     - locked <0x002afce2d260> (a org.apache.hadoop.mapred.TaskTracker$RunningJob)
>     at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1745)
>     at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:103)
>     at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1710)
> "Map-events fetcher for all reduce tasks on tracker_r01a08025:localhost/127.0.0.1:50050" daemon prio=10 tid=0x002b05ef8000 nid=0x1ada waiting for monitor entry [0x42d55000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>     at org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.reducesInShuffle(TaskTracker.java:582)
>     - waiting to lock <0x002afce2d260> (a org.apache.hadoop.mapred.TaskTracker$RunningJob)
>     at org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.run(TaskTracker.java:617)
>     - locked <0x002a9eefe1f8> (a java.util.TreeMap)
> "IPC Server handler 2 on 

[jira] [Updated] (MAPREDUCE-2729) Reducers are always counted having "pending tasks" even if they can't be scheduled yet because not enough of their mappers have completed

2011-08-08 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-2729:
-

Status: Open  (was: Patch Available)

Sherry, the patch doesn't apply cleanly - can you please re-generate it? Thanks.

> Reducers are always counted having "pending tasks" even if they can't be 
> scheduled yet because not enough of their mappers have completed
> -
>
> Key: MAPREDUCE-2729
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2729
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.20.205.0
> Environment: 0.20.1xx-Secondary
>Reporter: Sherry Chen
>Assignee: Sherry Chen
> Fix For: 0.20.205.0
>
> Attachments: MAPREDUCE-2729.patch
>
>
> In the capacity scheduler, the number of users in a queue needing slots is 
> calculated based on whether the users' jobs have any pending tasks.
> This works fine for map tasks. However, for reduce tasks, jobs do not need 
> reduce slots until the minimum number of map tasks has been completed.
> Here, we add a check for whether reduces are ready to schedule (i.e. whether 
> a job has completed enough map tasks) when we increment the number of users 
> in a queue needing reduce slots.





[jira] [Updated] (MAPREDUCE-2786) TestDFSIO should also test compression reading/writing from command-line.

2011-08-08 Thread Plamen Jeliazkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Plamen Jeliazkov updated MAPREDUCE-2786:


Attachment: MAPREDUCE-2786.patch

This is my work done so far. I'd like to move the codec into the mapper 
constructors but I have not been able to do it successfully because the 
CompressionOutputStream relies on the OutputStream within each mapper.

> TestDFSIO should also test compression reading/writing from command-line.
> -
>
> Key: MAPREDUCE-2786
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2786
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: benchmarks
>Reporter: Plamen Jeliazkov
>Priority: Minor
>  Labels: newbie
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-2786.patch
>
>   Original Estimate: 36h
>  Remaining Estimate: 36h
>
> After running into trouble dealing with the config files I thought it might 
> be easier to simply alter the code of TestDFSIO to accept any compression 
> codec and allow testing for compression by a command line argument instead of 
> having to change the config file every time. Something like "-compression" 
> would do.





[jira] [Created] (MAPREDUCE-2786) TestDFSIO should also test compression reading/writing from command-line.

2011-08-08 Thread Plamen Jeliazkov (JIRA)
TestDFSIO should also test compression reading/writing from command-line.
-

 Key: MAPREDUCE-2786
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2786
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: benchmarks
Reporter: Plamen Jeliazkov
Priority: Minor
 Fix For: 0.22.0


After running into trouble dealing with the config files I thought it might be 
easier to simply alter the code of TestDFSIO to accept any compression codec 
and allow testing for compression by a command line argument instead of having 
to change the config file every time. Something like "-compression" would do.
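
A flag of this kind could be parsed along the following lines. This is only a sketch: the class and method names are hypothetical, the flag is assumed to be followed by a codec class name, and the real TestDFSIO argument handling may differ.

```java
// Hypothetical sketch of command-line parsing for a "-compression" flag.
final class CompressionArgDemo {
    /** Returns the value following "-compression", or null if the flag is absent. */
    static String parseCompressionCodec(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if ("-compression".equals(args[i])) {
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException(
                        "-compression requires a codec class name");
                }
                return args[i + 1];
            }
        }
        return null; // no compression requested
    }
}
```

Returning null when the flag is absent lets the caller fall back to uncompressed reads and writes without touching the config file.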





[jira] [Commented] (MAPREDUCE-2627) guava-r09 JAR file needs to be added to mapreduce.

2011-08-08 Thread Plamen Jeliazkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081062#comment-13081062
 ] 

Plamen Jeliazkov commented on MAPREDUCE-2627:
-

Yes, it was due to this patch being submitted. We have reverted it and are 
waiting for a new POM file to be generated. Please read the related issue 
thread for more.

> guava-r09 JAR file needs to be added to mapreduce.
> --
>
> Key: MAPREDUCE-2627
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2627
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.22.0
>Reporter: Plamen Jeliazkov
>Priority: Blocker
> Fix For: 0.22.0
>
> Attachments: patch.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Need to add the guava-r09.jar file into the 
> "mapreduce/build/ivy/lib/Hadoop/common" directory; missing from build.





[jira] [Commented] (MAPREDUCE-377) Add serialization for Protocol Buffers

2011-08-08 Thread Pere Ferrera Bertran (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081031#comment-13081031
 ] 

Pere Ferrera Bertran commented on MAPREDUCE-377:


PB integration with Hadoop is now possible with Protostuff 
(http://code.google.com/p/protostuff/), by calling 
ProtobufIOUtil.writeDelimitedTo() and ProtobufIOUtil.mergeDelimitedFrom(). 
These methods avoid the problem of consuming too many bytes from the stream.
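
To illustrate why delimited (length-prefixed) writes avoid over-reading, here is a plain java.io sketch. It does not use Protostuff or the protobuf wire format: it uses a fixed 4-byte length prefix, whereas protobuf's delimited format uses a varint length. All names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of length-prefixed framing on a shared stream.
final class DelimitedDemo {
    /** Writes the message length, then the message bytes. */
    static void writeDelimited(DataOutputStream out, byte[] message) throws IOException {
        out.writeInt(message.length);
        out.write(message);
    }

    /** Reads exactly one message and leaves the rest of the stream untouched. */
    static byte[] readDelimited(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] message = new byte[length];
        in.readFully(message);
        return message;
    }

    /** Round-trips two records through one stream and joins them with '|'. */
    static String roundTrip(String first, String second) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            writeDelimited(out, first.getBytes(StandardCharsets.UTF_8));
            writeDelimited(out, second.getBytes(StandardCharsets.UTF_8));
            out.flush();

            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
            String a = new String(readDelimited(in), StandardCharsets.UTF_8);
            String b = new String(readDelimited(in), StandardCharsets.UTF_8);
            return a + "|" + b;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because each reader knows the record length up front, it never consumes bytes that belong to the next key or value on the same stream.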

> Add serialization for Protocol Buffers
> --
>
> Key: MAPREDUCE-377
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-377
> Project: Hadoop Map/Reduce
>  Issue Type: Wish
>Reporter: Tom White
>Assignee: Alex Loddengaard
> Attachments: hadoop-3788-v1.patch, hadoop-3788-v2.patch, 
> hadoop-3788-v3.patch, protobuf-java-2.0.1.jar, protobuf-java-2.0.2.jar
>
>
> Protocol Buffers (http://code.google.com/p/protobuf/) are a way of encoding 
> data in a compact binary format. This issue is to write a 
> ProtocolBuffersSerialization to support using Protocol Buffers types in 
> MapReduce programs, including an example program. This should probably go 
> into contrib. 





[jira] [Created] (MAPREDUCE-2785) MiniMR cluster thread crashes if no hadoop log dir set

2011-08-08 Thread Steve Loughran (JIRA)
MiniMR cluster thread crashes if no hadoop log dir set
--

 Key: MAPREDUCE-2785
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2785
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.20.203.0
Reporter: Steve Loughran
Priority: Minor


I'm marking this as minor as it is most obvious in the MiniMRCluster, but the 
root cause is in the JT. 

If you instantiate a MiniMRCluster without setting 
{{hadoop.job.history.location}} in the configuration and with the system 
property {{hadoop.log.dir}} unset, JobHistory throws an NPE. In production, 
that would be picked up as a failure to start the JT. In the MiniMRCluster, all 
it does is crash the JT thread, which the MiniMRCluster doesn't notice. You 
see the logged error, but the tests just time out waiting for things to 
come up:

2011/08/08 17:46:26:427 CEST [ERROR][Thread-44] org.apache.hadoop.mapred.MiniMRCluster - Job tracker crashed
java.lang.NullPointerException
    at java.io.File.<init>(File.java:222)
    at org.apache.hadoop.mapred.JobHistory.initLogDir(JobHistory.java:531)
    at org.apache.hadoop.mapred.JobHistory.init(JobHistory.java:499)
    at org.apache.hadoop.mapred.JobTracker$2.run(JobTracker.java:2316)
    at org.apache.hadoop.mapred.JobTracker$2.run(JobTracker.java:2313)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2313)
    at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:2171)
    at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:300)
    at org.apache.hadoop.mapred.MiniMRCluster$JobTrackerRunner$1.run(MiniMRCluster.java:114)
    at org.apache.hadoop.mapred.MiniMRCluster$JobTrackerRunner$1.run(MiniMRCluster.java:112)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.MiniMRCluster$JobTrackerRunner.run(MiniMRCluster.java:112)
    at java.lang.Thread.run(Thread.java:662)
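
The top frame of the trace is the java.io.File constructor, which throws a 
NullPointerException when given a null pathname. The sketch below shows one 
possible guard that fails with a descriptive message instead. It is not the 
JobHistory code and not a proposed patch: HistoryDirResolver and 
resolveHistoryDir are hypothetical names, and using a plain File with a 
"history" subdirectory is an assumption made for illustration.

```java
import java.io.File;

// Hypothetical helper, for illustration only (not the JobHistory implementation).
final class HistoryDirResolver {

    // configuredLocation stands in for the value of hadoop.job.history.location;
    // logDirProperty stands in for System.getProperty("hadoop.log.dir").
    static File resolveHistoryDir(String configuredLocation, String logDirProperty) {
        if (configuredLocation != null) {
            return new File(configuredLocation);
        }
        if (logDirProperty == null) {
            // new File((String) null) would throw a bare NullPointerException here;
            // naming the missing settings makes the failure easier to diagnose.
            throw new IllegalStateException(
                "neither hadoop.job.history.location nor hadoop.log.dir is set");
        }
        // Assumed layout: a "history" directory under the log directory.
        return new File(logDirProperty, "history");
    }
}
```

A caller would pass conf.get("hadoop.job.history.location") and 
System.getProperty("hadoop.log.dir"); a test harness could additionally 
catch the exception and fail immediately rather than wait for a timeout.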






[jira] [Created] (MAPREDUCE-2784) [Gridmix] TestGridmixSummary fails with NPE when run in DEBUG mode.

2011-08-08 Thread Amar Kamat (JIRA)
[Gridmix] TestGridmixSummary fails with NPE when run in DEBUG mode.
---

 Key: MAPREDUCE-2784
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2784
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/gridmix
Reporter: Amar Kamat
Assignee: Amar Kamat


TestGridmixSummary fails with NPE when run in debug mode. JobFactory tries to 
access the _createReaderThread()_ API of JobStoryProducer which returns null in 
TestGridmixSummary's FakeJobStoryProducer.





[jira] [Commented] (MAPREDUCE-2765) DistCp Rewrite

2011-08-08 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080917#comment-13080917
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-2765:


First of all, the code needs to go into a contrib project, so you need to 
regenerate the patch with the code placed in contrib.
Also, the build environment needs changes. Will this be blocked on 
mavenization of MapReduce?

Overall, design looks fine. Here are some comments on the code:
* CopyMapper:
  ** 
{noformat}
if (targetFS.exists(targetFinalPath) && targetFS.isFile(targetFinalPath)) {
  overWrite = true; // When target is an existing file, overwrite it.
}
{noformat}
Why is the target file overwritten irrespective of the overwrite configuration?

* Dynamic\*
  ** DynamicInputChunk is not public?
  ** DynamicInputFormat creates FileSplits with zero length. Should they 
instead be created with the chunk size as the split size?
  ** DynamicRecordReader has commented-out code. It should be removed.

* CopyCommitter:
  ** Atomic commit should not delete the final directory. Should throw out an 
error if it exists even before starting the job.
  ** deleteMissing() counts the files which do not exist at both source and 
target paths as deleted entries.
  ** Preserving status for the root folder does not happen at all? Can you 
check?
  ** If I’m not wrong, preserveFileAttributes() preserves attributes only for 
directories. Can we rename the method accordingly?
  ** The methods deleteMissing(), preserveFileAttributes(), etc. need more 
documentation.
  ** Deleting attempt temp files happens in each attempt. Why are we doing 
delete again in Committer? Committer should just delete the work path.

General comment:
All public classes and public methods need javadoc

Haven't looked at testcases.

> DistCp Rewrite
> --
>
> Key: MAPREDUCE-2765
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2765
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: distcp
>Affects Versions: 0.20.203.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
> Attachments: distcpv2.20.203.patch
>
>
> This is a slightly modified version of the DistCp rewrite that Yahoo uses in 
> production today. The rewrite was ground-up, with specific focus on:
> 1. improved startup time (postponing as much work as possible to the MR job)
> 2. support for multiple copy-strategies
> 3. new features (e.g. -atomic, -async, -bandwidth.)
> 4. improved programmatic use
> Some effort has gone into refactoring what used to be achieved by a single 
> large (1.7 KLOC) source file, into a design that (hopefully) reads better too.
> The proposed DistCpV2 preserves command-line-compatibility with the old 
> version, and should be a drop-in replacement.
> New to v2:
> 1. Copy-strategies and the DynamicInputFormat:
>   A copy-strategy determines the policy by which source-file-paths are 
> distributed between map-tasks. (These boil down to the choice of the 
> input-format.) 
>   If no strategy is explicitly specified on the command-line, the policy 
> chosen is "uniform size", where v2 behaves identically to old-DistCp. (The 
> number of bytes transferred by each map-task is roughly equal, at a per-file 
> granularity.) 
>   Alternatively, v2 ships with a "dynamic" copy-strategy (in the 
> DynamicInputFormat). This policy acknowledges that 
>   (a)  dividing files based only on file-size might not be an 
> even distribution (E.g. if some datanodes are slower than others, or if some 
> files are skipped.)
>   (b) a "static" association of a source-path to a map increases 
> the likelihood of long-tails during copy.
>   The "dynamic" strategy divides the list-of-source-paths into a number 
> (> nMaps) of smaller parts. When each map completes its current list of 
> paths, it picks up a new list to process, if available. So if a map-task is 
> stuck on a slow (and not necessarily large) file, other maps can pick up the 
> slack. The thinner the file-list is sliced, the greater the parallelism (and 
> the lower the chances of long-tails). Within reason, of course: the number of 
> these short-lived list-files is capped at an overridable maximum.
>   Internal benchmarks against source/target clusters with some slow(ish) 
> datanodes have indicated significant performance gains when using the 
> dynamic-strategy. Gains are most pronounced when nFiles greatly exceeds nMaps.
>   Please note that the DynamicInputFormat might prove useful outside of 
> DistCp. It is hence available as a mapred/lib, unfettered to DistCpV2. Also 
> note that the copy-strategies have no bearing on the CopyMapper.map() 
> implementation.
>   
> 2. Improved startup-time and programmatic use:
>   When the old-DistCp runs with -update, and cre
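
The "dynamic" copy-strategy described in the quoted issue text (slice the 
path list into more parts than there are maps, and let each map pick up a 
new part when it finishes its current one) can be sketched in-process as 
follows. This is not the DynamicInputFormat code: DynamicChunkDemo, slice 
and run are hypothetical names, threads stand in for map tasks, and an 
in-memory queue stands in for the list-files. Counting paths stands in for 
the actual copy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical illustration only (not the DistCpV2 implementation).
final class DynamicChunkDemo {

    // Splits the source paths round-robin into numChunks smaller lists.
    static List<List<String>> slice(List<String> paths, int numChunks) {
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < numChunks; i++) {
            chunks.add(new ArrayList<>());
        }
        for (int i = 0; i < paths.size(); i++) {
            chunks.get(i % numChunks).add(paths.get(i));
        }
        return chunks;
    }

    // Each worker keeps taking the next unclaimed chunk until none remain,
    // so a worker stuck on a slow chunk does not hold up the others.
    // Returns the number of paths handled by each worker.
    static int[] run(List<List<String>> chunks, int nWorkers) throws InterruptedException {
        Queue<List<String>> pending = new ConcurrentLinkedQueue<>(chunks);
        int[] handled = new int[nWorkers];
        Thread[] workers = new Thread[nWorkers];
        for (int w = 0; w < nWorkers; w++) {
            final int id = w;
            workers[w] = new Thread(() -> {
                List<String> chunk;
                while ((chunk = pending.poll()) != null) {
                    handled[id] += chunk.size(); // stand-in for copying the files
                }
            });
            workers[w].start();
        }
        for (Thread t : workers) {
            t.join(); // join makes each worker's count visible to the caller
        }
        return handled;
    }
}
```

How the work ends up divided between workers depends on timing, but every 
chunk is claimed exactly once, so the per-worker counts always sum to the 
number of input paths.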

[jira] [Commented] (MAPREDUCE-1834) TestSimulatorDeterministicReplay timesout on trunk

2011-08-08 Thread arunkumar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080804#comment-13080804
 ] 

arunkumar commented on MAPREDUCE-1834:
--

More info:
From the svn checkout, in 
HADOOP_HOME/mapreduce/src/contrib/mumak/src/test/org/apache/hadoop/mapred/
there are no DeterministicCollectionAspects.aj and FakeConcurrentHashMap.java 
files.

> TestSimulatorDeterministicReplay timesout on trunk
> --
>
> Key: MAPREDUCE-1834
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1834
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/mumak
>Affects Versions: 0.21.0
>Reporter: Amareshwari Sriramadasu
>Assignee: Hong Tang
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1834.patch, 
> TestSimulatorDeterministicReplay.log, mr-1834-20100727.patch, 
> mr-1834-20100729.patch, mr-1834-20100802.patch
>
>
> TestSimulatorDeterministicReplay timesout on trunk.
> See hudson patch build 
> http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/216/testReport/org.apache.hadoop.mapred/TestSimulatorDeterministicReplay/testMain/
