[jira] Created: (MAPREDUCE-1879) JobTracker web page showing &apos;

2010-06-18 Thread Tsz Wo (Nicholas), SZE (JIRA)
JobTracker web page showing &apos;
--

 Key: MAPREDUCE-1879
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1879
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker
Reporter: Tsz Wo (Nicholas), SZE


{noformat}
On the secure JobTracker web page, I saw

Active users:
User &apos;tsz&apos;: 1985 (100.0% of used capacity)

According to W3C (http://www.w3.org/TR/xhtml1/#C_16),
&#39; should be used instead of &apos;
{noformat}
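A minimal sketch of the escaping the reporter is suggesting (a hypothetical helper, not the actual JobTracker JSP code): emit apostrophes as the numeric reference `&#39;`, which both HTML and XHTML user agents understand, rather than the named `&apos;`, which HTML 4 does not define.

```java
public class QuoteEscape {
    // Emit apostrophes as the numeric reference &#39; (valid in both HTML
    // and XHTML) rather than the named entity &apos; (undefined in HTML 4).
    static String escapeApostrophes(String s) {
        return s.replace("'", "&#39;");
    }

    public static void main(String[] args) {
        System.out.println(escapeApostrophes("User 'tsz': 1985"));
        // → User &#39;tsz&#39;: 1985
    }
}
```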

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1845) FairScheduler.tasksToPreempt() can return a negative number

2010-06-18 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1845:
--

Status: Patch Available  (was: Open)

> FairScheduler.tasksToPreempt() can return a negative number
> 
>
> Key: MAPREDUCE-1845
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1845
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/fair-share
>Affects Versions: 0.22.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1845.20100717.txt
>
>
> This method can return a negative number, which causes the preemption
> logic to under-preempt.
> The bug was discovered by Joydeep.




[jira] Updated: (MAPREDUCE-1845) FairScheduler.tasksToPreempt() can return a negative number

2010-06-18 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1845:
--

Attachment: MAPREDUCE-1845.20100717.txt

The patch simply checks whether tasksToPreempt is negative and, if so, sets 
it back to zero.
The included unit test produced a negative number of tasksToPreempt before 
the fix was added.
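The clamp described above can be sketched as follows (the method and parameter names here are hypothetical; the actual patch applies the check inside FairScheduler):

```java
public class PreemptClamp {
    // Sketch of the fix: a job running at or above its fair share yields
    // zero tasks to preempt, never a negative deficit.
    static int tasksToPreempt(int fairShare, int runningTasks) {
        return Math.max(0, fairShare - runningTasks);
    }

    public static void main(String[] args) {
        System.out.println(tasksToPreempt(5, 8)); // over fair share → 0
        System.out.println(tasksToPreempt(8, 5)); // under fair share → 3
    }
}
```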

> FairScheduler.tasksToPreempt() can return a negative number
> 
>
> Key: MAPREDUCE-1845
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1845
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/fair-share
>Affects Versions: 0.22.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1845.20100717.txt
>
>
> This method can return a negative number, which causes the preemption
> logic to under-preempt.
> The bug was discovered by Joydeep.




[jira] Updated: (MAPREDUCE-1878) Add MRUnit documentation

2010-06-18 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-1878:
-

Status: Patch Available  (was: Open)

> Add MRUnit documentation
> 
>
> Key: MAPREDUCE-1878
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1878
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/mrunit
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-1878.patch
>
>
> A short user guide for MRUnit, written in asciidoc.




[jira] Updated: (MAPREDUCE-1878) Add MRUnit documentation

2010-06-18 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-1878:
-

Attachment: MAPREDUCE-1878.patch

Here's a short user guide document for MRUnit that I've written up. This is in 
asciidoc format.

> Add MRUnit documentation
> 
>
> Key: MAPREDUCE-1878
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1878
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/mrunit
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-1878.patch
>
>
> A short user guide for MRUnit, written in asciidoc.




[jira] Created: (MAPREDUCE-1878) Add MRUnit documentation

2010-06-18 Thread Aaron Kimball (JIRA)
Add MRUnit documentation


 Key: MAPREDUCE-1878
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1878
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/mrunit
Reporter: Aaron Kimball
Assignee: Aaron Kimball


A short user guide for MRUnit, written in asciidoc.




[jira] Updated: (MAPREDUCE-1559) The DelegationTokenRenewal timer task should use the jobtracker's credentials to create the filesystem

2010-06-18 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated MAPREDUCE-1559:
---

Status: Resolved  (was: Patch Available)
Resolution: Fixed

I just committed this. (the patch was manually tested; hard to write a unit 
test for this bug)

> The DelegationTokenRenewal timer task should use the jobtracker's credentials 
> to create the filesystem
> --
>
> Key: MAPREDUCE-1559
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1559
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 0.22.0
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.22.0
>
> Attachments: MR-1559.1.patch, MR-1559.2.patch, mr-1559.patch
>
>
> The submitJob RPC finally creates a timer task for renewing the delegation 
> tokens of the submitting user. This timer task inherits the context of the 
> RPC handler that runs in the context of the job submitting user, and when it 
> tries to create a filesystem, the RPC client tries to use the user's 
> credentials. This should instead use the JobTracker's credentials.
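The fix described in the quoted summary can be modeled with a toy sketch (stub doAs/createFileSystem methods, not Hadoop's real UserGroupInformation API): the renewal timer task explicitly runs filesystem creation in the JobTracker's login context instead of inheriting the RPC caller's context.

```java
import java.util.function.Supplier;

// Toy model (stub doAs and createFileSystem, not Hadoop's
// UserGroupInformation API) of the fix: run filesystem creation as the
// JobTracker's login user rather than the job-submitting user.
public class RenewalContextSketch {
    static final ThreadLocal<String> CURRENT_USER =
        ThreadLocal.withInitial(() -> "tsz");   // the job-submitting user
    static final String LOGIN_USER = "jobtracker";

    static <T> T doAs(String user, Supplier<T> action) {
        String prev = CURRENT_USER.get();
        CURRENT_USER.set(user);
        try { return action.get(); } finally { CURRENT_USER.set(prev); }
    }

    static String createFileSystem() {          // uses ambient credentials
        return "fs created as " + CURRENT_USER.get();
    }

    public static void main(String[] args) {
        // Buggy behavior: the timer task inherits the submitter's context.
        System.out.println(createFileSystem());
        // Fixed behavior: wrap the call in the JobTracker's context.
        System.out.println(doAs(LOGIN_USER, RenewalContextSketch::createFileSystem));
    }
}
```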




[jira] Updated: (MAPREDUCE-1845) FairScheduler.tasksToPreempt() can return a negative number

2010-06-18 Thread Scott Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Chen updated MAPREDUCE-1845:
--

Priority: Major  (was: Minor)

I am changing this to major because this is a serious problem for preemption.
Jobs that are supposed to be preempted (running tasks > fair share) actually 
generate a negative tasksToPreempt value.
That drags the total number of tasks to preempt down,
so those jobs are effectively able to escape preemption.
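With illustrative numbers (hypothetical sketch, not FairScheduler code), the effect on the cluster-wide total looks like this: a negative per-job value cancels out another job's legitimate demand.

```java
import java.util.List;

public class PreemptSum {
    // Total preemption demand is the sum over jobs; a negative per-job
    // value from an over-share job cancels another job's demand.
    static int totalToPreempt(List<Integer> perJob) {
        return perJob.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        // Job A is 4 tasks under fair share; job B is 3 tasks over it.
        System.out.println(totalToPreempt(List.of(4, -3))); // buggy: 1
        System.out.println(totalToPreempt(List.of(4, 0)));  // clamped: 4
    }
}
```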

> FairScheduler.tasksToPreempt() can return a negative number
> 
>
> Key: MAPREDUCE-1845
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1845
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/fair-share
>Affects Versions: 0.22.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.22.0
>
>
> This method can return a negative number, which causes the preemption
> logic to under-preempt.
> The bug was discovered by Joydeep.




[jira] Commented: (MAPREDUCE-1831) Delete the co-located replicas when raiding file

2010-06-18 Thread Rodrigo Schmidt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880389#action_12880389
 ] 

Rodrigo Schmidt commented on MAPREDUCE-1831:


Got it now. Thanks!

+1, patch looks good to me.

> Delete the co-located replicas when raiding file
> 
>
> Key: MAPREDUCE-1831
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1831.20100610.txt, MAPREDUCE-1831.txt, 
> MAPREDUCE-1831.v1.1.txt
>
>
> In raid, it is good to have the blocks on the same stripe located on 
> different machines.
> This way, when one machine is down, it does not break two blocks on the 
> stripe.
> By doing this, we can decrease the block error probability in raid from 
> O(p^3) to O(p^4), which can be a huge improvement (where p is the replica 
> missing probability).
> One way to do this is to add a new BlockPlacementPolicy which deletes the 
> replicas that are co-located.
> So when raiding the file, we can make the remaining replicas live on 
> different machines.
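As a quick sanity check of the claimed O(p^3) to O(p^4) gain, the improvement factor is roughly 1/p for small p (an illustrative computation under the quoted description's model, not a full failure analysis):

```java
public class RaidProbCheck {
    public static void main(String[] args) {
        double p = 0.01;                    // assumed per-replica missing probability
        double coLocated = Math.pow(p, 3);  // loss term with co-located replicas (O(p^3))
        double spread = Math.pow(p, 4);     // loss term with spread replicas (O(p^4))
        // The ratio is about 1/p, i.e. a ~100x improvement at p = 0.01.
        System.out.println(coLocated / spread);
    }
}
```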




[jira] Created: (MAPREDUCE-1877) getContentSummary() for HarFileSystem throws IllegalArgumentException

2010-06-18 Thread Paul Yang (JIRA)
getContentSummary() for HarFileSystem throws IllegalArgumentException
--

 Key: MAPREDUCE-1877
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1877
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: harchive
Affects Versions: 0.22.0
Reporter: Paul Yang


As HarFileSystem does not implement getContentSummary(), the implementation 
from FilterFileSystem is inherited by default. However, 
FilterFileSystem.getContentSummary() does not work for the HarFileSystem 
because the method attempts to use HarFileSystem's underlying FS to call 
getContentSummary(). In the case where the underlying filesystem is HDFS, 
an exception similar to the following is thrown:

{code}
java.lang.IllegalArgumentException: Wrong FS: 
har://hdfs-example.com:9000/tmp/data.har, expected: hdfs://example.com:9000
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:352)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:99)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:155)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getContentSummary(DistributedFileSystem.java:232)
at 
org.apache.hadoop.fs.FilterFileSystem.getContentSummary(FilterFileSystem.java:287)
at 
org.apache.hadoop.fs.FilterFileSystem.getContentSummary(FilterFileSystem.java:287)
{code}

One solution is to implement HarFileSystem.getContentSummary() using code 
similar to FileSystem.getContentSummary().
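That proposed override can be sketched with stub types (the Node class below is a hypothetical stand-in, not Hadoop's real API): compute the summary by recursing through the har filesystem's own listing instead of delegating to the underlying DFS, which rejects har:// paths with "Wrong FS".

```java
import java.util.Arrays;
import java.util.List;

// Self-contained sketch of the suggested fix, mirroring the recursion in
// FileSystem.getContentSummary() but staying inside the har filesystem.
public class HarSummarySketch {
    record Node(String name, long len, List<Node> children) {
        boolean isDir() { return children() != null; }
    }

    // Returns {length, fileCount, directoryCount} for the subtree -- the
    // same three quantities Hadoop's ContentSummary carries.
    static long[] contentSummary(Node n) {
        if (!n.isDir()) return new long[]{n.len(), 1, 0};
        long[] total = {0, 0, 1};
        for (Node child : n.children()) {
            long[] c = contentSummary(child);
            total[0] += c[0];
            total[1] += c[1];
            total[2] += c[2];
        }
        return total;
    }

    public static void main(String[] args) {
        Node root = new Node("/", 0, List.of(
            new Node("a", 10, null),
            new Node("dir", 0, List.of(new Node("b", 5, null)))));
        System.out.println(Arrays.toString(contentSummary(root))); // → [15, 2, 2]
    }
}
```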





[jira] Commented: (MAPREDUCE-1831) Delete the co-located replicas when raiding file

2010-06-18 Thread Scott Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880388#action_12880388
 ] 

Scott Chen commented on MAPREDUCE-1831:
---

Rodrigo: Thanks for the comment. In
{code}
chooseReplicaToDelete(inode, block, replicationFactor, first, second);
{code}
we always go with the "first" set first; the first set contains the replicas 
on the same rack. In the normal case, the "local" replica will be in the 
second set. In this patch, if there is a tie in the number of co-located 
replicas, we compare the free space, so for files with replication=3 this 
policy simply falls back to behaving like BlockPlacementPolicyDefault.

To solve this problem we need to create one more replica (making the 
replication 4) while raiding, as described in MAPREDUCE-1861. Once there is 
another off-rack replica, this policy kicks in to reduce the replication to 
2: it first removes a replica from the first set, and then removes the 
replica co-located on the same stripe. The final result is two replicas on 
different racks and no co-located replicas on the same stripe.

We need both this JIRA and MAPREDUCE-1861 to achieve that result.
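The tie-break described in the comment above can be sketched as follows (the Replica type and its fields are hypothetical stand-ins for the HDFS block-placement types):

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class ReplicaChoiceSketch {
    record Replica(String host, int coLocatedOnStripe, long freeSpace) {}

    // Prefer deleting the replica co-located with the most other blocks of
    // its stripe; on a tie, fall back to deleting from the node with the
    // least free space, like BlockPlacementPolicyDefault.
    static Replica chooseReplicaToDelete(List<Replica> candidates) {
        return Collections.max(candidates,
            Comparator.comparingInt(Replica::coLocatedOnStripe)
                .thenComparingLong(r -> -r.freeSpace()));
    }

    public static void main(String[] args) {
        List<Replica> c = List.of(
            new Replica("h1", 2, 100),   // shares its stripe with 2 blocks
            new Replica("h2", 1, 10));
        System.out.println(chooseReplicaToDelete(c).host()); // → h1
    }
}
```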

> Delete the co-located replicas when raiding file
> 
>
> Key: MAPREDUCE-1831
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1831.20100610.txt, MAPREDUCE-1831.txt, 
> MAPREDUCE-1831.v1.1.txt
>
>
> In raid, it is good to have the blocks on the same stripe located on 
> different machines.
> This way, when one machine is down, it does not break two blocks on the 
> stripe.
> By doing this, we can decrease the block error probability in raid from 
> O(p^3) to O(p^4), which can be a huge improvement (where p is the replica 
> missing probability).
> One way to do this is to add a new BlockPlacementPolicy which deletes the 
> replicas that are co-located.
> So when raiding the file, we can make the remaining replicas live on 
> different machines.




[jira] Updated: (MAPREDUCE-1855) refreshSuperUserGroupsConfiguration for MR should use server side configuration for the refresh (for HADOOP-6815)

2010-06-18 Thread Boris Shkolnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boris Shkolnik updated MAPREDUCE-1855:
--

Attachment: MAPREDUCE-1855-5.patch

> refreshSuperUserGroupsConfiguration for MR should use server side 
> configuration for the refresh (for HADOOP-6815)
> -
>
> Key: MAPREDUCE-1855
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1855
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Boris Shkolnik
>Assignee: Boris Shkolnik
> Attachments: MAPREDUCE-1855-1.patch, MAPREDUCE-1855-5.patch
>
>





[jira] Commented: (MAPREDUCE-1831) Delete the co-located replicas when raiding file

2010-06-18 Thread Rodrigo Schmidt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880339#action_12880339
 ] 

Rodrigo Schmidt commented on MAPREDUCE-1831:


This still has the problem of probably leaving both replicas on the same 
switch, right?

> Delete the co-located replicas when raiding file
> 
>
> Key: MAPREDUCE-1831
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1831.20100610.txt, MAPREDUCE-1831.txt, 
> MAPREDUCE-1831.v1.1.txt
>
>
> In raid, it is good to have the blocks on the same stripe located on 
> different machines.
> This way, when one machine is down, it does not break two blocks on the 
> stripe.
> By doing this, we can decrease the block error probability in raid from 
> O(p^3) to O(p^4), which can be a huge improvement (where p is the replica 
> missing probability).
> One way to do this is to add a new BlockPlacementPolicy which deletes the 
> replicas that are co-located.
> So when raiding the file, we can make the remaining replicas live on 
> different machines.




[jira] Updated: (MAPREDUCE-1735) Un-deprecate the old MapReduce API in the 0.21 branch

2010-06-18 Thread Tom White (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom White updated MAPREDUCE-1735:
-

Status: Resolved  (was: Patch Available)
Resolution: Fixed

I've just committed this to the 0.21 branch.

> Un-deprecate the old MapReduce API in the 0.21 branch
> -
>
> Key: MAPREDUCE-1735
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1735
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Tom White
>Assignee: Tom White
>Priority: Blocker
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-1735-0.21.patch
>
>
> This issue is to un-deprecate the "old" MapReduce API (in o.a.h.mapred) in 
> the next 0.21 release, as discussed at 
> http://www.mail-archive.com/mapreduce-...@hadoop.apache.org/msg01833.html




[jira] Commented: (MAPREDUCE-1831) Delete the co-located replicas when raiding file

2010-06-18 Thread Scott Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880312#action_12880312
 ] 

Scott Chen commented on MAPREDUCE-1831:
---

The failed contrib test is a known issue reported in MAPREDUCE-1834.
The patch is ready for review.

> Delete the co-located replicas when raiding file
> 
>
> Key: MAPREDUCE-1831
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1831.20100610.txt, MAPREDUCE-1831.txt, 
> MAPREDUCE-1831.v1.1.txt
>
>
> In raid, it is good to have the blocks on the same stripe located on 
> different machines.
> This way, when one machine is down, it does not break two blocks on the 
> stripe.
> By doing this, we can decrease the block error probability in raid from 
> O(p^3) to O(p^4), which can be a huge improvement (where p is the replica 
> missing probability).
> One way to do this is to add a new BlockPlacementPolicy which deletes the 
> replicas that are co-located.
> So when raiding the file, we can make the remaining replicas live on 
> different machines.




[jira] Commented: (MAPREDUCE-1848) Put number of speculative, data local, rack local tasks in JobTracker metrics

2010-06-18 Thread Scott Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880311#action_12880311
 ] 

Scott Chen commented on MAPREDUCE-1848:
---

The failed contrib test is a known issue reported in MAPREDUCE-1834.
The patch is ready for review.

The patch adds the number of speculative tasks, data-local maps, and 
rack-local maps to the JobTracker instrumentation.
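The counters described can be sketched with a toy class (not the Hadoop metrics API; names are illustrative):

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy stand-in for the counters the patch adds to the JobTracker
// instrumentation: speculative launches, data-local and rack-local maps.
public class SchedulingCounters {
    final AtomicLong speculativeTasks = new AtomicLong();
    final AtomicLong dataLocalMaps = new AtomicLong();
    final AtomicLong rackLocalMaps = new AtomicLong();

    void mapLaunched(boolean speculative, boolean dataLocal, boolean rackLocal) {
        if (speculative) speculativeTasks.incrementAndGet();
        if (dataLocal) dataLocalMaps.incrementAndGet();
        else if (rackLocal) rackLocalMaps.incrementAndGet();
    }

    public static void main(String[] args) {
        SchedulingCounters c = new SchedulingCounters();
        c.mapLaunched(false, true, false);  // data-local map
        c.mapLaunched(true, false, true);   // speculative, rack-local map
        System.out.println(c.dataLocalMaps + " " + c.rackLocalMaps
            + " " + c.speculativeTasks);    // → 1 1 1
    }
}
```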

> Put number of speculative, data local, rack local tasks in JobTracker metrics
> -
>
> Key: MAPREDUCE-1848
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1848
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Affects Versions: 0.22.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1848-20100614.txt, MAPREDUCE-1848-20100617.txt
>
>
> It would be nice to collect this information in the JobTracker metrics.




[jira] Commented: (MAPREDUCE-1559) The DelegationTokenRenewal timer task should use the jobtracker's credentials to create the filesystem

2010-06-18 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880310#action_12880310
 ] 

Owen O'Malley commented on MAPREDUCE-1559:
--

+1

> The DelegationTokenRenewal timer task should use the jobtracker's credentials 
> to create the filesystem
> --
>
> Key: MAPREDUCE-1559
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1559
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 0.22.0
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.22.0
>
> Attachments: MR-1559.1.patch, MR-1559.2.patch, mr-1559.patch
>
>
> The submitJob RPC finally creates a timer task for renewing the delegation 
> tokens of the submitting user. This timer task inherits the context of the 
> RPC handler that runs in the context of the job submitting user, and when it 
> tries to create a filesystem, the RPC client tries to use the user's 
> credentials. This should instead use the JobTracker's credentials.




[jira] Updated: (MAPREDUCE-647) Update the DistCp forrest doc to make it consistent with the latest changes (5472, 5620, 5762, 5826)

2010-06-18 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-647:
-

Status: Resolved  (was: Patch Available)
Resolution: Fixed

I have committed this.  Thanks, Rodrigo!

> Update the DistCp forrest doc to make it consistent with the latest changes 
> (5472, 5620, 5762, 5826)
> 
>
> Key: MAPREDUCE-647
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-647
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.22.0
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-647.1.patch, MAPREDUCE-647.patch
>
>
> New features have been added to DistCp and the documentation must be updated.




[jira] Updated: (MAPREDUCE-647) Update the DistCp forrest doc to make it consistent with the latest changes (5472, 5620, 5762, 5826)

2010-06-18 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-647:
-

Hadoop Flags: [Reviewed]
 Component/s: documentation
  (was: distcp)

+1 yes, it looks good.

> Update the DistCp forrest doc to make it consistent with the latest changes 
> (5472, 5620, 5762, 5826)
> 
>
> Key: MAPREDUCE-647
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-647
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 0.22.0
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-647.1.patch, MAPREDUCE-647.patch
>
>
> New features have been added to DistCp and the documentation must be updated.




[jira] Commented: (MAPREDUCE-1790) Herriot test cases failing in Hudson due to LzCode not found

2010-06-18 Thread Giridharan Kesavan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880195#action_12880195
 ] 

Giridharan Kesavan commented on MAPREDUCE-1790:
---

The idea is to get hadoop-gpl-compression.jar into the Maven repository and 
resolve it through Ivy. 



> Herriot test cases failing in Hudson due to LzCode not found
> 
>
> Key: MAPREDUCE-1790
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1790
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
> Environment: Herriot system test case automation
>Reporter: Balaji Rajagopalan
>Assignee: Giridharan Kesavan
> Attachments: lzcodec_fix.txt
>
>
> The test cases are failing due to the non-availability of the jar 
> hadoop-gpl-compression-0.1.0-1005060043.jar; changes to the aop xml are 
> needed to fix this. 




[jira] Updated: (MAPREDUCE-1876) TaskAttemptStartedEvent.java incorrectly logs MAP_ATTEMPT_STARTED as event type for reduce tasks

2010-06-18 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-1876:
--

Attachment: mapreduce-1876-v1.1.patch

Attaching a patch that uses the task-attempt id to distinguish between the 
map and reduce task types. Also added a test for the same.
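The approach can be sketched self-contained (the enum and method names below are hypothetical stand-ins for the Hadoop classes): pick the event type from the task type embedded in the attempt id instead of hard-coding MAP_ATTEMPT_STARTED.

```java
public class AttemptEventSketch {
    enum TaskType { MAP, REDUCE }
    enum EventType { MAP_ATTEMPT_STARTED, REDUCE_ATTEMPT_STARTED }

    // Derive the event type from the attempt's task type rather than
    // always returning MAP_ATTEMPT_STARTED.
    static EventType getEventType(TaskType t) {
        return t == TaskType.MAP ? EventType.MAP_ATTEMPT_STARTED
                                 : EventType.REDUCE_ATTEMPT_STARTED;
    }

    public static void main(String[] args) {
        System.out.println(getEventType(TaskType.REDUCE));
        // → REDUCE_ATTEMPT_STARTED
    }
}
```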

> TaskAttemptStartedEvent.java incorrectly logs MAP_ATTEMPT_STARTED as event 
> type for reduce tasks
> 
>
> Key: MAPREDUCE-1876
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1876
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 0.22.0
>Reporter: Amar Kamat
>Assignee: Amar Kamat
> Fix For: 0.22.0
>
> Attachments: mapreduce-1876-v1.1.patch
>
>
> {{TaskAttemptStartedEvent}} is used to log the start time of both the map and 
> reduce task attempts to {{JobHistory}}. Following is the implementation of 
> _getEventType()_ method of {{TaskAttemptStartedEvent}}
> {code}
> /** Get the event type */
>   public EventType getEventType() {
> return EventType.MAP_ATTEMPT_STARTED;
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (MAPREDUCE-617) Streaming should not throw java.lang.RuntimeException and ERROR while displaying help

2010-06-18 Thread Karam Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karam Singh resolved MAPREDUCE-617.
---

Fix Version/s: 0.20.2
   Resolution: Fixed

Streaming no longer throws java.lang.RuntimeException and ERROR while 
displaying help in Hadoop 0.20.2, so I am marking this as resolved/fixed. 


> Streaming should not throw java.lang.RuntimeException and ERROR while 
> displaying help
> -
>
> Key: MAPREDUCE-617
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-617
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Reporter: Karam Singh
>Priority: Minor
> Fix For: 0.20.2
>
>
> Run the streaming command as:
> bin/hadoop jar contrib/streaming/hadoop-*-streaming.jar -help 
> It will display the following:
> ERROR streaming.StreamJob: Missing required option -input
> Usage: $HADOOP_HOME/bin/hadoop [--config dir] jar \
>   $HADOOP_HOME/hadoop-streaming.jar [options]
> Options:
>   -input <path>  DFS input file(s) for the Map step
>   -output <path>  DFS output directory for the Reduce step
>   -mapper <cmd|JavaClassName>  The streaming command to run
>   -combiner <JavaClassName>  Combiner has to be a Java class
>   -reducer <cmd|JavaClassName>  The streaming command to run
>   -file <file>  File/dir to be shipped in the Job jar file
>   -dfs <h:p>|local  Optional. Override DFS configuration
>   -jt <h:p>|local  Optional. Override JobTracker configuration
>   -additionalconfspec specfile  Optional.
>   -inputformat 
> TextInputFormat(default)|SequenceFileAsTextInputFormat|JavaClassName Optional.
>   -outputformat TextOutputFormat(default)|JavaClassName  Optional.
>   -partitioner JavaClassName  Optional.
>   -numReduceTasks <num>  Optional.
>   -inputreader <spec>  Optional.
>   -jobconf <n>=<v>  Optional. Add or override a JobConf property
>   -cmdenv <n>=<v>  Optional. Pass env.var to streaming commands
>   -mapdebug <path>  Optional. To run this script when a map task fails
>   -reducedebug <path>  Optional. To run this script when a reduce task fails
>   -cacheFile fileNameURI
>   -cacheArchive fileNameURI
>   -verbose
> For more details about these options:
> Use $HADOOP_HOME/bin/hadoop jar build/hadoop-streaming.jar -info
> Exception in thread "main" java.lang.RuntimeException:
> at org.apache.hadoop.streaming.StreamJob.fail(StreamJob.java:542)
> at org.apache.hadoop.streaming.StreamJob.exitUsage(StreamJob.java:481)
> at org.apache.hadoop.streaming.StreamJob.parseArgv(StreamJob.java:203)
> at org.apache.hadoop.streaming.StreamJob.go(StreamJob.java:105)
> at 
> org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:33)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:585)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
> It should not display the exception, nor the error about the missing 
> -input option. 




[jira] Commented: (MAPREDUCE-577) Duplicate Mapper input when using StreamXmlRecordReader

2010-06-18 Thread Ravi Gummadi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880159#action_12880159
 ] 

Ravi Gummadi commented on MAPREDUCE-577:


Looks like the testcase TestStreamXmlMultiOuter is failing in trunk but passing 
in 0.20. Will investigate.

> Duplicate Mapper input when using StreamXmlRecordReader
> ---
>
> Key: MAPREDUCE-577
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-577
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
> Environment: HADOOP 0.17.0, Java 6.0
>Reporter: David Campbell
>Assignee: Ravi Gummadi
> Attachments: 0001-test-to-demonstrate-HADOOP-3484.patch, 
> 0002-patch-for-HADOOP-3484.patch, 577.patch, HADOOP-3484.combined.patch, 
> HADOOP-3484.try3.patch
>
>
> I have an XML file with 93626 rows.  A row is marked by 
> I've confirmed this with grep and the Grep example program included with 
> HADOOP.
> Here is the grep example output.  93626   
> I've set up my job configuration as follows:   
> conf.set("stream.recordreader.class", 
> "org.apache.hadoop.streaming.StreamXmlRecordReader");
> conf.set("stream.recordreader.begin", "");
> conf.set("stream.recordreader.end", "");
> conf.setInputFormat(StreamInputFormat.class);
> I have a fairly simple test Mapper.
> Here's the map method.
>   public void map(Text key, Text value, OutputCollector<Text, IntWritable> 
> output, Reporter reporter) throws IOException {
> try {
> output.collect(totalWord, one);
> if (key != null && key.toString().indexOf("01852") != -1) {
> output.collect(new Text("01852"), one);
> }
> } catch (Exception ex) {
> Logger.getLogger(TestMapper.class.getName()).log(Level.SEVERE, 
> null, ex);
> System.out.println(value);
> }
> }
> For totalWord ("TOTAL"), I get:
> TOTAL 140850
> and for 01852 I get.
> 01852 86
> There are 43 instances of 01852 in the file.
> I have the following setting in my config.  
>conf.setNumMapTasks(1);
> I have a total of six machines in my cluster.
> If I run without this, the result is 12x the actual value, not 2x.
> Here's some info from the cluster web page.
> Maps  Reduces Total Submissions   Nodes   Map Task Capacity   Reduce 
> Task CapacityAvg. Tasks/Node
> 0 0   1   6   12  12  4.00
> I've also noticed something really strange in the job's output.  It looks 
> like it's starting over or redoing things.
> This was run using all six nodes and no limitations on map or reduce tasks.  
> I haven't seen this behavior in any other case.
> 08/06/03 10:50:35 INFO mapred.FileInputFormat: Total input paths to process : 
> 1
> 08/06/03 10:50:36 INFO mapred.JobClient: Running job: job_200806030916_0018
> 08/06/03 10:50:37 INFO mapred.JobClient:  map 0% reduce 0%
> 08/06/03 10:50:42 INFO mapred.JobClient:  map 2% reduce 0%
> 08/06/03 10:50:45 INFO mapred.JobClient:  map 12% reduce 0%
> 08/06/03 10:50:47 INFO mapred.JobClient:  map 31% reduce 0%
> 08/06/03 10:50:48 INFO mapred.JobClient:  map 49% reduce 0%
> 08/06/03 10:50:49 INFO mapred.JobClient:  map 68% reduce 0%
> 08/06/03 10:50:50 INFO mapred.JobClient:  map 100% reduce 0%
> 08/06/03 10:50:54 INFO mapred.JobClient:  map 87% reduce 0%
> 08/06/03 10:50:55 INFO mapred.JobClient:  map 100% reduce 0%
> 08/06/03 10:50:56 INFO mapred.JobClient:  map 0% reduce 0%
> 08/06/03 10:51:00 INFO mapred.JobClient:  map 0% reduce 1%
> 08/06/03 10:51:05 INFO mapred.JobClient:  map 28% reduce 2%
> 08/06/03 10:51:07 INFO mapred.JobClient:  map 80% reduce 4%
> 08/06/03 10:51:08 INFO mapred.JobClient:  map 100% reduce 4%
> 08/06/03 10:51:09 INFO mapred.JobClient:  map 100% reduce 7%
> 08/06/03 10:51:10 INFO mapred.JobClient:  map 90% reduce 9%
> 08/06/03 10:51:11 INFO mapred.JobClient:  map 100% reduce 9%
> 08/06/03 10:51:12 INFO mapred.JobClient:  map 100% reduce 11%
> 08/06/03 10:51:13 INFO mapred.JobClient:  map 90% reduce 11%
> 08/06/03 10:51:14 INFO mapred.JobClient:  map 97% reduce 11%
> 08/06/03 10:51:15 INFO mapred.JobClient:  map 63% reduce 11%
> 08/06/03 10:51:16 INFO mapred.JobClient:  map 48% reduce 11%
> 08/06/03 10:51:17 INFO mapred.JobClient:  map 21% reduce 11%
> 08/06/03 10:51:19 INFO mapred.JobClient:  map 0% reduce 11%
> 08/06/03 10:51:20 INFO mapred.JobClient:  map 15% reduce 12%
> 08/06/03 10:51:21 INFO mapred.JobClient:  map 27% reduce 13%
> 08/06/03 10:51:22 INFO mapred.JobClient:  map 67% reduce 13%
> 08/06/03 10:51:24 INFO mapred.JobClient:  map 22% reduce 16%
> 08/06/03 10:51:25 INFO mapred.JobClient:  map 46% reduce 16%
> 08/06/03 10:51:26 INFO mapred.JobClient:  map 70% reduce 16%
> 08/06/03 10:51:27 INFO mapred.JobClient:  map 73% reduce 18%
> 08/06/03 10:51:28 INFO mapred.JobClient:  map 85% reduce 

[jira] Updated: (MAPREDUCE-577) Duplicate Mapper input when using StreamXmlRecordReader

2010-06-18 Thread Ravi Gummadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated MAPREDUCE-577:
---

Attachment: 577.patch

Attaching patch for trunk. This patch is the same as the earlier patch 
HADOOP-3484.try3.patch, except that TestStreamXmlMultiInner.java now uses 
fs.local.block.size=59 instead of 80, so that the test case fails without the fix 
in this patch.

> Duplicate Mapper input when using StreamXmlRecordReader
> ---
>
> Key: MAPREDUCE-577
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-577
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
> Environment: HADOOP 0.17.0, Java 6.0
>Reporter: David Campbell
> Attachments: 0001-test-to-demonstrate-HADOOP-3484.patch, 
> 0002-patch-for-HADOOP-3484.patch, 577.patch, HADOOP-3484.combined.patch, 
> HADOOP-3484.try3.patch
>
>
> I have an XML file with 93626 rows.  A row is marked by 
> I've confirmed this with grep and the Grep example program included with 
> HADOOP.
> Here is the grep example output.  93626   
> I've set up my job configuration as follows:
> conf.set("stream.recordreader.class", 
> "org.apache.hadoop.streaming.StreamXmlRecordReader");
> conf.set("stream.recordreader.begin", "");
> conf.set("stream.recordreader.end", "");
> conf.setInputFormat(StreamInputFormat.class);
> I have a fairly simple test Mapper.
> Here's the map method.
>   public void map(Text key, Text value, OutputCollector 
> output, Reporter reporter) throws IOException {
> try {
> output.collect(totalWord, one);
> if (key != null && key.toString().indexOf("01852") != -1) {
> output.collect(new Text("01852"), one);
> }
> } catch (Exception ex) {
> Logger.getLogger(TestMapper.class.getName()).log(Level.SEVERE, 
> null, ex);
> System.out.println(value);
> }
> }
> For totalWord ("TOTAL"), I get:
> TOTAL 140850
> and for 01852 I get.
> 01852 86
> There are 43 instances of 01852 in the file.
> I have the following setting in my config.  
>conf.setNumMapTasks(1);
> I have a total of six machines in my cluster.
> If I run without this, the result is 12x the actual value, not 2x.
> Here's some info from the cluster web page.
> Maps  Reduces Total Submissions   Nodes   Map Task Capacity   Reduce 
> Task CapacityAvg. Tasks/Node
> 0 0   1   6   12  12  4.00
> I've also noticed something really strange in the job's output.  It looks 
> like it's starting over or redoing things.
> This was run using all six nodes and no limitations on map or reduce tasks.  
> I haven't seen this behavior in any other case.
> 08/06/03 10:50:35 INFO mapred.FileInputFormat: Total input paths to process : 
> 1
> 08/06/03 10:50:36 INFO mapred.JobClient: Running job: job_200806030916_0018
> 08/06/03 10:50:37 INFO mapred.JobClient:  map 0% reduce 0%
> 08/06/03 10:50:42 INFO mapred.JobClient:  map 2% reduce 0%
> 08/06/03 10:50:45 INFO mapred.JobClient:  map 12% reduce 0%
> 08/06/03 10:50:47 INFO mapred.JobClient:  map 31% reduce 0%
> 08/06/03 10:50:48 INFO mapred.JobClient:  map 49% reduce 0%
> 08/06/03 10:50:49 INFO mapred.JobClient:  map 68% reduce 0%
> 08/06/03 10:50:50 INFO mapred.JobClient:  map 100% reduce 0%
> 08/06/03 10:50:54 INFO mapred.JobClient:  map 87% reduce 0%
> 08/06/03 10:50:55 INFO mapred.JobClient:  map 100% reduce 0%
> 08/06/03 10:50:56 INFO mapred.JobClient:  map 0% reduce 0%
> 08/06/03 10:51:00 INFO mapred.JobClient:  map 0% reduce 1%
> 08/06/03 10:51:05 INFO mapred.JobClient:  map 28% reduce 2%
> 08/06/03 10:51:07 INFO mapred.JobClient:  map 80% reduce 4%
> 08/06/03 10:51:08 INFO mapred.JobClient:  map 100% reduce 4%
> 08/06/03 10:51:09 INFO mapred.JobClient:  map 100% reduce 7%
> 08/06/03 10:51:10 INFO mapred.JobClient:  map 90% reduce 9%
> 08/06/03 10:51:11 INFO mapred.JobClient:  map 100% reduce 9%
> 08/06/03 10:51:12 INFO mapred.JobClient:  map 100% reduce 11%
> 08/06/03 10:51:13 INFO mapred.JobClient:  map 90% reduce 11%
> 08/06/03 10:51:14 INFO mapred.JobClient:  map 97% reduce 11%
> 08/06/03 10:51:15 INFO mapred.JobClient:  map 63% reduce 11%
> 08/06/03 10:51:16 INFO mapred.JobClient:  map 48% reduce 11%
> 08/06/03 10:51:17 INFO mapred.JobClient:  map 21% reduce 11%
> 08/06/03 10:51:19 INFO mapred.JobClient:  map 0% reduce 11%
> 08/06/03 10:51:20 INFO mapred.JobClient:  map 15% reduce 12%
> 08/06/03 10:51:21 INFO mapred.JobClient:  map 27% reduce 13%
> 08/06/03 10:51:22 INFO mapred.JobClient:  map 67% reduce 13%
> 08/06/03 10:51:24 INFO mapred.JobClient:  map 22% reduce 16%
> 08/06/03 10:51:25 INFO mapred.JobClient:  map 46% reduce 16%
> 08/06/03 10:51:26 INFO mapred.JobClient:  map 70% reduce 16%
> 08/06/03 10:51:27 INFO mapred.JobClient:  map 73% reduce 18%
> 08/06/03 10:51:28 INFO mapred.JobClient:  map 85% reduce 19%
> 08/06/03 10:51:29 INFO mapred.JobClient:  map 7% reduce 19%
> 08/06/03 10:51:32 INFO mapred.JobClient:  map 100% reduce 20%
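The symptoms above (roughly doubled counts, map progress bouncing backwards) are what one would expect if a record straddling a split boundary is read by both adjacent splits, which is the failure mode this issue tracks. A minimal, self-contained sketch of that double-counting (plain Java, hypothetical byte offsets, no Hadoop dependency):

```java
/**
 * Toy model of MAPREDUCE-577: a record that straddles a split boundary is
 * emitted by BOTH splits when a reader claims every record that merely
 * overlaps its byte range, instead of only the records that begin inside it.
 */
public class SplitOverlapDemo {

    // Buggy ownership rule: count records overlapping [start, end).
    static int countOverlapping(int[][] recs, int start, int end) {
        int n = 0;
        for (int[] r : recs)
            if (r[0] < end && r[1] > start) n++;
        return n;
    }

    // Conventional rule: count records whose begin offset lies in [start, end).
    static int countOwned(int[][] recs, int start, int end) {
        int n = 0;
        for (int[] r : recs)
            if (r[0] >= start && r[0] < end) n++;
        return n;
    }

    public static void main(String[] args) {
        // Three records as {begin, end} byte ranges; the middle record
        // straddles the split boundary at byte 100.
        int[][] recs = {{0, 60}, {60, 140}, {140, 200}};
        int buggy = countOverlapping(recs, 0, 100) + countOverlapping(recs, 100, 200);
        int ok = countOwned(recs, 0, 100) + countOwned(recs, 100, 200);
        // The straddling record is double-counted under the buggy rule.
        System.out.println(buggy + " records emitted vs " + ok + " actual");
        // prints "4 records emitted vs 3 actual"
    }
}
```

Under the overlap rule the record spanning bytes 60..140 is claimed by both splits, giving 4 emitted records for 3 actual ones; inflated per-key counts like the 2x TOTAL figure follow the same pattern when many records cross split boundaries.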

[jira] Assigned: (MAPREDUCE-577) Duplicate Mapper input when using StreamXmlRecordReader

2010-06-18 Thread Ravi Gummadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi reassigned MAPREDUCE-577:
--

Assignee: Ravi Gummadi

> Duplicate Mapper input when using StreamXmlRecordReader
> ---
>
> Key: MAPREDUCE-577
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-577
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
> Environment: HADOOP 0.17.0, Java 6.0
>Reporter: David Campbell
>Assignee: Ravi Gummadi
> Attachments: 0001-test-to-demonstrate-HADOOP-3484.patch, 
> 0002-patch-for-HADOOP-3484.patch, 577.patch, HADOOP-3484.combined.patch, 
> HADOOP-3484.try3.patch
>
>

[jira] Assigned: (MAPREDUCE-1790) Herriot test cases failing in Hudson due to LzCode not found

2010-06-18 Thread Balaji Rajagopalan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Balaji Rajagopalan reassigned MAPREDUCE-1790:
-

Assignee: Giridharan Kesavan  (was: Balaji Rajagopalan)

Giri is working towards a fix for this issue. 

> Herriot test cases failing in Hudson due to LzCode not found
> 
>
> Key: MAPREDUCE-1790
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1790
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
> Environment: Herriot system test case automation
>Reporter: Balaji Rajagopalan
>Assignee: Giridharan Kesavan
> Attachments: lzcodec_fix.txt
>
>
> The test cases are failing due to the non-availability of the jar 
> hadoop-gpl-compression-0.1.0-1005060043.jar; changes to the aop xml are needed 
> to fix this. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (MAPREDUCE-1122) streaming with custom input format does not support the new API

2010-06-18 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu reassigned MAPREDUCE-1122:
--

Assignee: Amareshwari Sriramadasu

> streaming with custom input format does not support the new API
> ---
>
> Key: MAPREDUCE-1122
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1122
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Affects Versions: 0.20.1
> Environment: any OS
>Reporter: Keith Jackson
>Assignee: Amareshwari Sriramadasu
>
> When trying to implement a custom input format for use with streaming, I have 
> found that streaming does not support the new API, 
> org.apache.hadoop.mapreduce.InputFormat, but requires the old API, 
> org.apache.hadoop.mapred.InputFormat.
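The mismatch is structural: in 0.20 the old API's org.apache.hadoop.mapred.InputFormat is an interface, while the new org.apache.hadoop.mapreduce.InputFormat is an abstract class in a different package, so a format written against one cannot be passed where the other is expected. A stripped-down sketch of the problem using mock stand-ins (no Hadoop dependency; names and signatures simplified):

```java
// Mock stand-ins for the two Hadoop APIs; the interface-vs-class and
// package split mirrors the real incompatibility.
interface OldInputFormat {                 // ~ org.apache.hadoop.mapred.InputFormat
    String getRecordReader(String split);
}

abstract class NewInputFormat {            // ~ org.apache.hadoop.mapreduce.InputFormat
    abstract String createRecordReader(String split);
}

// A custom format written against the new API...
class MyXmlInputFormat extends NewInputFormat {
    String createRecordReader(String split) {
        return "reader(" + split + ")";
    }
}

public class ApiMismatchDemo {
    // ...cannot be handed to streaming, which only accepts the old type.
    static String runStreaming(OldInputFormat f) {
        return f.getRecordReader("split-0");
    }

    public static void main(String[] args) {
        // runStreaming(new MyXmlInputFormat()); // does not compile: unrelated types
        // Until streaming supports the new API, the format must be written
        // (or wrapped) against the old interface:
        OldInputFormat bridged = split -> new MyXmlInputFormat().createRecordReader(split);
        System.out.println(runStreaming(bridged)); // prints "reader(split-0)"
    }
}
```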




[jira] Commented: (MAPREDUCE-1848) Put number of speculative, data local, rack local tasks in JobTracker metrics

2010-06-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880088#action_12880088
 ] 

Hadoop QA commented on MAPREDUCE-1848:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12447421/MAPREDUCE-1848-20100617.txt
  against trunk revision 955543.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 9 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/578/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/578/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/578/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/578/console

This message is automatically generated.

> Put number of speculative, data local, rack local tasks in JobTracker metrics
> -
>
> Key: MAPREDUCE-1848
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1848
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Affects Versions: 0.22.0
>Reporter: Scott Chen
>Assignee: Scott Chen
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1848-20100614.txt, MAPREDUCE-1848-20100617.txt
>
>
> It would be nice if we could collect this information in the JobTracker metrics.
