[jira] Commented: (MAPREDUCE-775) Add input/output formatters for Vertica clustered ADBMS.

2009-07-23 Thread Omer Trajman (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734913#action_12734913
 ] 

Omer Trajman commented on MAPREDUCE-775:


src/contrib/vertica and package org.apache.hadoop.vertica?

> Add input/output formatters for Vertica clustered ADBMS.
> 
>
> Key: MAPREDUCE-775
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-775
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Omer Trajman
> Fix For: 0.21.0
>
>
> Add native support for Vertica as an input or output format taking advantage 
> of parallel read and write properties of the DBMS.
>  
> On the input side allow for parametrized queries (a la prepared statements) 
> and create a split for each combination of parameters.  Also support 
> generating the parameter list from a SQL statement.  For example, return 
> metrics for all dimensions that meet criteria X with one input split for each 
> dimension.  Divide the read among any number of hosts in the Vertica cluster.
>  
> On the output side, support Vertica streaming load to any number of hosts in 
> the Vertica cluster.  Output may be to a different cluster than input.
>  
> Also includes Input and Output formatters that support the streaming interface.
> Code has been tested and run on live systems under Hadoop 0.19 and 0.20.  A patch for 0.21 
> with the new API will be ready at the end of this week.
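The one-split-per-parameter-set idea in the description can be sketched in plain Java. This is only an illustration: the names ParamSplit and ParameterizedSplitSketch, the query text and the table name are hypothetical and are not taken from the patch, and the Hadoop InputFormat/InputSplit plumbing is left out so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: one "split" per set of query parameters.
final class ParamSplit {
  final String query;         // parametrized query text with '?' placeholders
  final List<Object> params;  // the parameter values bound for this split

  ParamSplit(String query, List<Object> params) {
    this.query = query;
    this.params = params;
  }
}

final class ParameterizedSplitSketch {
  // Creates one split for each parameter set, so each set can be read by
  // its own task.
  static List<ParamSplit> splitsFor(String query, List<List<Object>> paramSets) {
    List<ParamSplit> splits = new ArrayList<>();
    for (List<Object> params : paramSets) {
      splits.add(new ParamSplit(query, params));
    }
    return splits;
  }

  public static void main(String[] args) {
    // Three dimensions, hence three splits (made-up values).
    List<List<Object>> dimensions = Arrays.asList(
        Arrays.<Object>asList("region"),
        Arrays.<Object>asList("product"),
        Arrays.<Object>asList("channel"));
    List<ParamSplit> splits =
        splitsFor("SELECT metric FROM facts WHERE dimension = ?", dimensions);
    System.out.println(splits.size() + " splits");
  }
}
```

In the case where the parameter list comes from a SQL statement, the paramSets argument would presumably be filled from that statement's result rows instead of a literal list.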

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-779) Add node health failures into JobTrackerStatistics

2009-07-23 Thread Sreekanth Ramakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734910#action_12734910
 ] 

Sreekanth Ramakrishnan commented on MAPREDUCE-779:
--

Output from ant test-patch:

{noformat}
 [exec] +1 overall.
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included.  The patch appears to include 3 new or modified tests.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs warnings.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the total number of release audit warnings.
{noformat}

> Add node health failures into JobTrackerStatistics
> --
>
> Key: MAPREDUCE-779
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-779
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Reporter: Sreekanth Ramakrishnan
>Assignee: Sreekanth Ramakrishnan
> Attachments: mapreduce-779-1.patch, mapreduce-779-2.patch
>
>
> Add the node health failure counts into {{JobTrackerStatistics}}.




[jira] Commented: (MAPREDUCE-779) Add node health failures into JobTrackerStatistics

2009-07-23 Thread rahul k singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734904#action_12734904
 ] 

rahul k singh commented on MAPREDUCE-779:
-

changes look fine to me 
+1.

> Add node health failures into JobTrackerStatistics
> --
>
> Key: MAPREDUCE-779
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-779
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Reporter: Sreekanth Ramakrishnan
>Assignee: Sreekanth Ramakrishnan
> Attachments: mapreduce-779-1.patch, mapreduce-779-2.patch
>
>
> Add the node health failure counts into {{JobTrackerStatistics}}.




[jira] Commented: (MAPREDUCE-798) MRUnit should be able to test a succession of MapReduce passes

2009-07-23 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734804#action_12734804
 ] 

Aaron Kimball commented on MAPREDUCE-798:
-

According to the comments in the ChainMapper source, it allows you to run jobs 
of the form {{M+RM*}}. This issue, by contrast, is about testing pipelines of 
the form {{(MR)+}}. Can you elaborate a bit more on how you think ChainMapper / 
ChainReducer fits here?
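To make the distinction concrete, the two job shapes can be written as regular expressions over the letters M (map) and R (reduce). This is only an illustration using java.util.regex; the class name PipelineShapes is made up for the example and has nothing to do with MRUnit's actual classes.

```java
import java.util.regex.Pattern;

final class PipelineShapes {
  // Shape attributed to ChainMapper/ChainReducer in the comment: M+RM*
  static final Pattern CHAIN = Pattern.compile("M+RM*");
  // A succession of complete MapReduce passes: (MR)+
  static final Pattern PASSES = Pattern.compile("(MR)+");

  static boolean isChain(String shape) {
    return CHAIN.matcher(shape).matches();
  }

  static boolean isPasses(String shape) {
    return PASSES.matcher(shape).matches();
  }

  public static void main(String[] args) {
    for (String s : new String[] {"MR", "MMRM", "MRMR"}) {
      System.out.println(s + " chain=" + isChain(s) + " passes=" + isPasses(s));
    }
  }
}
```

A shape such as MRMR contains two reduce phases, so it matches (MR)+ but not M+RM*, while MMRM matches only M+RM*.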

> MRUnit should be able to test a succession of MapReduce passes
> --
>
> Key: MAPREDUCE-798
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-798
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-798.patch
>
>
> MRUnit can currently test that the inputs to a given (mapper, reducer) "job" 
> produce certain outputs at the end of the reducer. It would be good to 
> support more end-to-end tests of a series of MapReduce jobs that form a 
> longer pipeline surrounding some data.




[jira] Commented: (MAPREDUCE-798) MRUnit should be able to test a succession of MapReduce passes

2009-07-23 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734802#action_12734802
 ] 

Owen O'Malley commented on MAPREDUCE-798:
-

Wouldn't it be better to have it work with the ChainMapper?

> MRUnit should be able to test a succession of MapReduce passes
> --
>
> Key: MAPREDUCE-798
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-798
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-798.patch
>
>
> MRUnit can currently test that the inputs to a given (mapper, reducer) "job" 
> produce certain outputs at the end of the reducer. It would be good to 
> support more end-to-end tests of a series of MapReduce jobs that form a 
> longer pipeline surrounding some data.




[jira] Updated: (MAPREDUCE-800) MRUnit should support the new API

2009-07-23 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-800:


Status: Patch Available  (was: Open)

> MRUnit should support the new API
> -
>
> Key: MAPREDUCE-800
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-800
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-800.patch
>
>
> MRUnit's TestDriver implementations use the old 
> org.apache.hadoop.mapred-based classes. TestDrivers and associated mock 
> object implementations are required for org.apache.hadoop.mapreduce-based 
> code.




[jira] Updated: (MAPREDUCE-800) MRUnit should support the new API

2009-07-23 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-800:


Attachment: MAPREDUCE-800.patch

Implementation of MRUnit over the new API. New code is in the 
org.apache.hadoop.mrunit.mapreduce package, which contains many classes 
named identically to those in the base org.apache.hadoop.mrunit package, 
similar to the mapred/mapreduce package style used in Hadoop MapReduce 
itself.

This adds mock implementations of Mapper.Context and Reducer.Context which are 
used as inputs to user-provided Mapper and Reducer classes. This takes 
advantage of the fact that even though Mapper.Context and Reducer.Context are 
not static classes, they make use of no state of their outer class Mapper or 
Reducer objects, merely the shared type signature.
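The inner-class point can be illustrated with a small stand-alone Java sketch. MiniMapper and MockContextSketch are hypothetical stand-ins, not Hadoop or MRUnit classes: the mock Context is created from a throwaway outer instance and then handed to a different mapper object, which works because only the type signature is shared.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a class like Mapper that has a non-static inner Context.
class MiniMapper<K, V> {
  abstract class Context {
    abstract void write(K key, V value);
  }

  // User code only talks to the Context it is handed.
  void map(K key, V value, Context context) {
    context.write(key, value);
  }
}

final class MockContextSketch {
  // Builds a recording Context from a throwaway outer instance. The inner
  // class uses no state of that outer object.
  static <K, V> MiniMapper<K, V>.Context recordingContext(final List<String> sink) {
    MiniMapper<K, V> throwawayOuter = new MiniMapper<K, V>();
    return throwawayOuter.new Context() {
      @Override
      void write(K key, V value) {
        sink.add(key + "=" + value);
      }
    };
  }

  public static void main(String[] args) {
    List<String> out = new ArrayList<>();
    MiniMapper<String, Integer> userMapper = new MiniMapper<>();
    MiniMapper<String, Integer>.Context ctx =
        MockContextSketch.<String, Integer>recordingContext(out);
    userMapper.map("a", 1, ctx);  // a different outer instance than the mock's
    System.out.println(out);
  }
}
```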

> MRUnit should support the new API
> -
>
> Key: MAPREDUCE-800
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-800
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-800.patch
>
>
> MRUnit's TestDriver implementations use the old 
> org.apache.hadoop.mapred-based classes. TestDrivers and associated mock 
> object implementations are required for org.apache.hadoop.mapreduce-based 
> code.




[jira] Created: (MAPREDUCE-800) MRUnit should support the new API

2009-07-23 Thread Aaron Kimball (JIRA)
MRUnit should support the new API
-

 Key: MAPREDUCE-800
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-800
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Aaron Kimball
Assignee: Aaron Kimball


MRUnit's TestDriver implementations use the old org.apache.hadoop.mapred-based 
classes. TestDrivers and associated mock object implementations are required 
for org.apache.hadoop.mapreduce-based code.




[jira] Updated: (MAPREDUCE-799) Some of MRUnit's self-tests were not being run

2009-07-23 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-799:


Attachment: MAPREDUCE-799.patch

> Some of MRUnit's self-tests were not being run
> --
>
> Key: MAPREDUCE-799
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-799
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-799.patch
>
>
> Due to method naming issues, some test cases were not being executed.




[jira] Updated: (MAPREDUCE-799) Some of MRUnit's self-tests were not being run

2009-07-23 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-799:


Status: Patch Available  (was: Open)

> Some of MRUnit's self-tests were not being run
> --
>
> Key: MAPREDUCE-799
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-799
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-799.patch
>
>
> Due to method naming issues, some test cases were not being executed.




[jira] Created: (MAPREDUCE-798) MRUnit should be able to test a succession of MapReduce passes

2009-07-23 Thread Aaron Kimball (JIRA)
MRUnit should be able to test a succession of MapReduce passes
--

 Key: MAPREDUCE-798
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-798
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: MAPREDUCE-798.patch

MRUnit can currently test that the inputs to a given (mapper, reducer) "job" 
produce certain outputs at the end of the reducer. It would be good to support 
more end-to-end tests of a series of MapReduce jobs that form a longer pipeline 
surrounding some data.




[jira] Updated: (MAPREDUCE-798) MRUnit should be able to test a succession of MapReduce passes

2009-07-23 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-798:


Attachment: MAPREDUCE-798.patch

Attaching implementation.

> MRUnit should be able to test a succession of MapReduce passes
> --
>
> Key: MAPREDUCE-798
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-798
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-798.patch
>
>
> MRUnit can currently test that the inputs to a given (mapper, reducer) "job" 
> produce certain outputs at the end of the reducer. It would be good to 
> support more end-to-end tests of a series of MapReduce jobs that form a 
> longer pipeline surrounding some data.




[jira] Updated: (MAPREDUCE-798) MRUnit should be able to test a succession of MapReduce passes

2009-07-23 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-798:


Status: Patch Available  (was: Open)

> MRUnit should be able to test a succession of MapReduce passes
> --
>
> Key: MAPREDUCE-798
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-798
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-798.patch
>
>
> MRUnit can currently test that the inputs to a given (mapper, reducer) "job" 
> produce certain outputs at the end of the reducer. It would be good to 
> support more end-to-end tests of a series of MapReduce jobs that form a 
> longer pipeline surrounding some data.




[jira] Created: (MAPREDUCE-799) Some of MRUnit's self-tests were not being run

2009-07-23 Thread Aaron Kimball (JIRA)
Some of MRUnit's self-tests were not being run
--

 Key: MAPREDUCE-799
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-799
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Aaron Kimball
Assignee: Aaron Kimball


Due to method naming issues, some test cases were not being executed.




[jira] Updated: (MAPREDUCE-797) MRUnit MapReduceDriver should support combiners

2009-07-23 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-797:


Status: Patch Available  (was: Open)

> MRUnit MapReduceDriver should support combiners
> ---
>
> Key: MAPREDUCE-797
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-797
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-797.patch
>
>
> The MapReduceDriver allows you to specify a mapper and a reducer class with a 
> simple sort/"shuffle" between the passes. It would be nice to also support 
> another Reducer implementation being used as a combiner in the middle.




[jira] Updated: (MAPREDUCE-797) MRUnit MapReduceDriver should support combiners

2009-07-23 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-797:


Attachment: MAPREDUCE-797.patch

Attaching implementation.

> MRUnit MapReduceDriver should support combiners
> ---
>
> Key: MAPREDUCE-797
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-797
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-797.patch
>
>
> The MapReduceDriver allows you to specify a mapper and a reducer class with a 
> simple sort/"shuffle" between the passes. It would be nice to also support 
> another Reducer implementation being used as a combiner in the middle.




[jira] Created: (MAPREDUCE-797) MRUnit MapReduceDriver should support combiners

2009-07-23 Thread Aaron Kimball (JIRA)
MRUnit MapReduceDriver should support combiners
---

 Key: MAPREDUCE-797
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-797
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: MAPREDUCE-797.patch

The MapReduceDriver allows you to specify a mapper and a reducer class with a 
simple sort/"shuffle" between the passes. It would be nice to also support 
another Reducer implementation being used as a combiner in the middle.
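As a rough sketch of what a combiner in the middle means, the following plain-Java word count runs map, a sort/"shuffle", a combine step, a second shuffle and a final reduce. It does not use MRUnit or Hadoop types; CombinerPipelineSketch and its methods are hypothetical names, and the same summing function is assumed to serve as both combiner and reducer.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiFunction;

final class CombinerPipelineSketch {
  // Groups (key, value) pairs by key in sorted key order: a simple sort/"shuffle".
  static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
    Map<String, List<Integer>> grouped = new TreeMap<>();
    for (Map.Entry<String, Integer> p : pairs) {
      grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
    }
    return grouped;
  }

  // Applies a reducer-style function to every group of values.
  static List<Map.Entry<String, Integer>> reduceAll(
      Map<String, List<Integer>> grouped,
      BiFunction<String, List<Integer>, Integer> reducer) {
    List<Map.Entry<String, Integer>> out = new ArrayList<>();
    for (Map.Entry<String, List<Integer>> g : grouped.entrySet()) {
      out.add(new SimpleEntry<>(g.getKey(), reducer.apply(g.getKey(), g.getValue())));
    }
    return out;
  }

  // map -> shuffle -> combine -> shuffle -> reduce over a list of words.
  static List<Map.Entry<String, Integer>> run(List<String> words) {
    BiFunction<String, List<Integer>, Integer> sum =
        (key, values) -> values.stream().mapToInt(Integer::intValue).sum();
    List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
    for (String w : words) {
      mapped.add(new SimpleEntry<>(w, 1));  // the "mapper" emits (word, 1)
    }
    List<Map.Entry<String, Integer>> combined = reduceAll(shuffle(mapped), sum);
    return reduceAll(shuffle(combined), sum);
  }
}
```

Because summing is associative, the result is the same with or without the combine step. A test driver that supports combiners would let a test check that property for a user's own Reducer.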




[jira] Updated: (MAPREDUCE-751) Rumen: a tool to extract job characterization data from job tracker logs

2009-07-23 Thread Dick King (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dick King updated MAPREDUCE-751:


Attachment: mapreduce-751--2009-07-23.patch

This is a preliminary patch to gather early feedback on this functionality.

It works, but there are some areas I'm still working on, mostly general code 
cleanup.  Its functionality is complete.  Although there are foreseeable 
enhancements, they will be called out in their own JIRAs.

> Rumen: a tool to extract job characterization data from job tracker logs
> 
>
> Key: MAPREDUCE-751
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-751
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Dick King
> Attachments: mapreduce-751--2009-07-23.patch
>
>
>  We propose a new map/reduce component, rumen, which can be used to process 
> job history logs to produce any or all of the following:
>   * Retrospective info describing the statistical behavior of the
> amount of time it would have taken to launch a job into a certain
> percentage of the number of mapper slots in the log's cluster, given the
> load over the period covered by the log
>   * Statistical info as to the runtimes and shuffle times, etc. of
> the tasks and jobs covered by the log
>   * files describing detailed job trace information, and the
> network topology as inferred from the host locations and rack IDs that
> arise in the job tracker log.  In addition to this facility, rumen
> includes readers for this information to return job and detailed task
> information to other tools.
> These other tools include a more advanced version of gridmix and 
> also mumak: see blocked issues.




[jira] Commented: (MAPREDUCE-796) Encountered "ClassCastException" on tasktracker while running wordcount with MultithreadedMapRunner

2009-07-23 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734651#action_12734651
 ] 

Arun C Murthy commented on MAPREDUCE-796:
-

I see the problem in 
org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run; the following 
change should fix it:

{noformat}
diff --git 
src/mapred/org/apache/hadoop/mapreduce/lib/map/MultithreadedMapper.java 
src/mapred/org/apach
index 2e0d6d9..95530f9 100644
--- src/mapred/org/apache/hadoop/mapreduce/lib/map/MultithreadedMapper.java
+++ src/mapred/org/apache/hadoop/mapreduce/lib/map/MultithreadedMapper.java
@@ -146,7 +146,7 @@ public class MultithreadedMapper
 } else if (th instanceof InterruptedException) {
   throw (InterruptedException) th;
 } else {
-  throw (RuntimeException) th;
+  throw new RuntimeException(th);
 }
   }
 }
{noformat}

The *else* block should therefore wrap the Throwable in a new RuntimeException rather than cast it, as in the diff above.
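The difference between casting and wrapping can be reproduced outside Hadoop with a few lines of Java. RethrowSketch is a hypothetical class written only for this illustration; it mirrors the two variants of the else branch and uses a simulated OutOfMemoryError rather than a real one.

```java
final class RethrowSketch {
  // Mirrors the original else branch: the cast fails for any Throwable that
  // is not a RuntimeException, such as OutOfMemoryError.
  static void rethrowWithCast(Throwable th) {
    throw (RuntimeException) th;
  }

  // Mirrors the proposed fix: wrap the Throwable instead of casting it.
  static void rethrowWrapped(Throwable th) {
    throw new RuntimeException(th);
  }

  public static void main(String[] args) {
    Throwable oom = new OutOfMemoryError("simulated");
    try {
      rethrowWithCast(oom);
    } catch (ClassCastException e) {
      System.out.println("cast failed: " + e.getClass().getSimpleName());
    }
    try {
      rethrowWrapped(oom);
    } catch (RuntimeException e) {
      System.out.println("wrapped cause: " + e.getCause().getClass().getSimpleName());
    }
  }
}
```

With the wrapped variant the original error stays reachable through getCause(), so the task log would show the OutOfMemoryError instead of a misleading ClassCastException.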

> Encountered "ClassCastException" on tasktracker while running wordcount with 
> MultithreadedMapRunner
> ---
>
> Key: MAPREDUCE-796
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-796
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: examples
>Affects Versions: 0.20.1
>Reporter: Suman Sehgal
>
> ClassCastException for OutOfMemoryError is encountered on tasktracker while 
> running wordcount example with MultithreadedMapRunner. 
> Stack trace :
> =
> java.lang.ClassCastException: java.lang.OutOfMemoryError cannot be cast to 
> java.lang.RuntimeException
>   at 
> org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:149)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:581)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303)
>   at org.apache.hadoop.mapred.Child.main(Child.java:170)




[jira] Updated: (MAPREDUCE-792) javac warnings in DBInputFormat

2009-07-23 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-792:
-

Hadoop Flags: [Reviewed]

+1 patch looks good.

> javac warnings in DBInputFormat
> ---
>
> Key: MAPREDUCE-792
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-792
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Minor
> Attachments: MAPREDUCE-792.2.patch, MAPREDUCE-792.patch
>
>
> MAPREDUCE-716 introduces javac warnings




[jira] Commented: (MAPREDUCE-793) Create a new test that consolidates a few tests to be included in the commit-test list

2009-07-23 Thread gary murry (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734648#action_12734648
 ] 

gary murry commented on MAPREDUCE-793:
--

This is beginning to sound more like a functional test than a unit test?

> Create a new test that consolidates a few tests to be included in the 
> commit-test list
> --
>
> Key: MAPREDUCE-793
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-793
> Project: Hadoop Map/Reduce
>  Issue Type: Test
>  Components: test
>Reporter: Jothi Padmanabhan
>Assignee: Jothi Padmanabhan
> Fix For: 0.21.0
>
>
> There are a few tests that just run similar jobs and test different 
> functionality. It would be useful to have a test that runs one job and tests 
> several of these pieces of functionality together so that this test can be 
> included in the fast commit-tests target.




[jira] Created: (MAPREDUCE-796) Encountered "ClassCastException" on tasktracker while running wordcount with MultithreadedMapRunner

2009-07-23 Thread Suman Sehgal (JIRA)
Encountered "ClassCastException" on tasktracker while running wordcount with 
MultithreadedMapRunner
---

 Key: MAPREDUCE-796
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-796
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: examples
Affects Versions: 0.20.1
Reporter: Suman Sehgal


ClassCastException for OutOfMemoryError is encountered on tasktracker while 
running wordcount example with MultithreadedMapRunner. 

Stack trace :
=
java.lang.ClassCastException: java.lang.OutOfMemoryError cannot be cast to 
java.lang.RuntimeException
at 
org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper.run(MultithreadedMapper.java:149)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:581)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:303)
at org.apache.hadoop.mapred.Child.main(Child.java:170)





[jira] Created: (MAPREDUCE-795) JobHistory.markCompleted() should be synchronized

2009-07-23 Thread Amar Kamat (JIRA)
JobHistory.markCompleted() should be synchronized
-

 Key: MAPREDUCE-795
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-795
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Amar Kamat
Assignee: Amar Kamat


Since other FS calls in JobHistory are synchronized, this call should also be 
synchronized. This method moves jobhistory files from the running folder to the 
done folder, so while other JobHistory methods search the running folder, this 
method might move the files in the FS, causing inconsistencies.
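A minimal sketch of the locking concern, using in-memory sets in place of the real filesystem folders. HistoryFoldersSketch and its methods are hypothetical and only loosely modeled on the description; they are not the JobHistory API.

```java
import java.util.HashSet;
import java.util.Set;

final class HistoryFoldersSketch {
  private final Set<String> running = new HashSet<>();
  private final Set<String> done = new HashSet<>();

  synchronized void start(String file) {
    running.add(file);
  }

  // Moves a file from running to done. Synchronized so that a concurrent
  // locate() never observes the file halfway through the move.
  synchronized void markCompleted(String file) {
    if (running.remove(file)) {
      done.add(file);
    }
  }

  // Searches under the same lock as markCompleted().
  synchronized String locate(String file) {
    if (running.contains(file)) {
      return "running";
    }
    if (done.contains(file)) {
      return "done";
    }
    return "missing";
  }
}
```

If markCompleted() were not synchronized, a locate() call running between the remove and the add could report "missing" for a file that exists. That is a small-scale version of the inconsistency described above.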




[jira] Created: (MAPREDUCE-794) JobTrackerInstrumentation data might be garbled

2009-07-23 Thread Amar Kamat (JIRA)
JobTrackerInstrumentation data might be garbled


 Key: MAPREDUCE-794
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-794
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Reporter: Amar Kamat


Here is the sequence of events:
1) Submit a job.
2) Kill it in the PREP state.

This should result in negative values for pending maps in 
JobTrackerInstrumentation.




[jira] Commented: (MAPREDUCE-794) JobTrackerInstrumentation data might be garbled

2009-07-23 Thread Amar Kamat (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734553#action_12734553
 ] 

Amar Kamat commented on MAPREDUCE-794:
--

Killing a job in the PREP state causes the job to invoke job.garbageCollect(), 
which does
{code}
jobtracker.getInstrumentation().decWaitingMaps(getJobID(), pendingMaps());
jobtracker.getInstrumentation().decWaitingReduces(getJobID(), pendingReduces());
{code}

This blindly subtracts _pendingMaps()_ (equal to the total number of maps) from 
_numWaitingMaps_. But pendingMaps() is added to the JobTrackerInstrumentation 
only upon init, which has not happened for a job still in PREP.

{code}
@Override
public synchronized void decWaitingMaps(JobID id, int task) {
  numWaitingMaps -= task;
}
{code}

{code}
initTasks() {
  ..
  ..
  jobtracker.getInstrumentation().addWaitingMaps(getJobID(), numMapTasks);
  jobtracker.getInstrumentation().addWaitingReduces(getJobID(), numReduceTasks);
  ..
  ..
}
{code}
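The arithmetic can be reproduced with a stand-alone counter. WaitingMapsSketch is a hypothetical stand-in for the instrumentation class: the job IDs are dropped, and a boolean flag simulates whether initTasks() ran before the kill.

```java
final class WaitingMapsSketch {
  private int numWaitingMaps;

  synchronized void addWaitingMaps(int tasks) {
    numWaitingMaps += tasks;
  }

  synchronized void decWaitingMaps(int tasks) {
    numWaitingMaps -= tasks;
  }

  synchronized int getWaitingMaps() {
    return numWaitingMaps;
  }

  // Simulates killing a job with totalMaps maps. If initialized is false the
  // job was still in PREP, so nothing was ever added to the counter.
  static int killJob(int totalMaps, boolean initialized) {
    WaitingMapsSketch instrumentation = new WaitingMapsSketch();
    if (initialized) {
      instrumentation.addWaitingMaps(totalMaps);  // what initTasks() would do
    }
    instrumentation.decWaitingMaps(totalMaps);    // what garbageCollect() does
    return instrumentation.getWaitingMaps();
  }

  public static void main(String[] args) {
    System.out.println("killed after init: " + killJob(10, true));
    System.out.println("killed in PREP:    " + killJob(10, false));
  }
}
```

For a 10-map job the counter ends at 0 when the job was initialized and at -10 when it was killed in PREP. One possible remedy, offered here only as an assumption, is to decrement only for jobs that were actually initialized.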

> JobTrackerInstrumentation data might be garbled
> 
>
> Key: MAPREDUCE-794
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-794
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Reporter: Amar Kamat
>
> Here is the sequence of events:
> 1) Submit a job.
> 2) Kill it in the PREP state.
> This should result in negative values for pending maps in 
> JobTrackerInstrumentation.




[jira] Created: (MAPREDUCE-793) Create a new test that consolidates a few tests to be included in the commit-test list

2009-07-23 Thread Jothi Padmanabhan (JIRA)
Create a new test that consolidates a few tests to be included in the 
commit-test list
--

 Key: MAPREDUCE-793
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-793
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Reporter: Jothi Padmanabhan
Assignee: Jothi Padmanabhan
 Fix For: 0.21.0


There are a few tests that just run similar jobs and test different 
functionality. It would be useful to have a test that runs one job and tests 
several of these pieces of functionality together so that this test can be 
included in the fast commit-tests target.




[jira] Commented: (MAPREDUCE-784) Modify TestUserDefinedCounters to use LocalJobRunner instead of MiniMR

2009-07-23 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734549#action_12734549
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-784:
---

+1

> Modify TestUserDefinedCounters to use LocalJobRunner instead of MiniMR
> --
>
> Key: MAPREDUCE-784
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-784
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: test
>Reporter: Jothi Padmanabhan
>Assignee: Jothi Padmanabhan
> Fix For: 0.21.0
>
> Attachments: mapred-784.patch
>
>
> This test can be modified to use LocalJobRunner.




[jira] Commented: (MAPREDUCE-786) Jobtracker history should be written asynchronously to the filesystem

2009-07-23 Thread Sharad Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734542#action_12734542
 ] 

Sharad Agarwal commented on MAPREDUCE-786:
--

I think we should clean up the way history is written, providing a cleaner API 
into which the async writer can be plugged. Perhaps that deserves a separate 
JIRA. Something like:
{code}
public interface Event {
  public Map getData();
}

public interface EventReader {
  public Iterator read();
  public void close();
}

public interface EventWriter {
  public void write(Event event);
  public void flush();
  public void close();
}
{code}

The async writer should be able to wrap an EventWriter object:
{code}
class AsyncEventWriter implements EventWriter {
  AsyncEventWriter(EventWriter writer){
  }
}
{code}
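One way such a wrapper could be filled in is sketched here with a single-threaded executor from java.util.concurrent. This is an assumption about a possible design, not the eventual implementation: the Map<String, String> type parameters, the 10-second drain timeout and the method bodies are all invented for the example, and EventReader is omitted.

```java
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

interface Event {
  Map<String, String> getData();  // type parameters are an assumption
}

interface EventWriter {
  void write(Event event);
  void flush();
  void close();
}

// Wraps any EventWriter and performs its calls on one background thread, so
// the caller (for example code holding a lock) returns immediately.
final class AsyncEventWriter implements EventWriter {
  private final EventWriter delegate;
  private final ExecutorService worker = Executors.newSingleThreadExecutor();

  AsyncEventWriter(EventWriter delegate) {
    this.delegate = delegate;
  }

  @Override
  public void write(final Event event) {
    worker.execute(() -> delegate.write(event));
  }

  @Override
  public void flush() {
    worker.execute(delegate::flush);
  }

  // Waits for queued work to drain, then closes the wrapped writer.
  @Override
  public void close() {
    worker.shutdown();
    try {
      worker.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    delegate.close();
  }
}
```

A single worker thread keeps events in submission order, which matters for a history log. A real version would also need a policy for write failures and for a queue that grows faster than the filesystem can absorb it.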

> Jobtracker history should be written asynchronously to the filesystem
> -
>
> Key: MAPREDUCE-786
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-786
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Reporter: Sharad Agarwal
>Assignee: Sharad Agarwal
>
> Jobtracker lock is held while writing the history events. This makes the 
> jobtracker slow on flushes, especially when history is written to HDFS. 
> History events should be written asynchronously to avoid this problem.




[jira] Assigned: (MAPREDUCE-786) Jobtracker history should be written asynchronously to the filesystem

2009-07-23 Thread Sharad Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharad Agarwal reassigned MAPREDUCE-786:


Assignee: Sharad Agarwal

> Jobtracker history should be written asynchronously to the filesystem
> -
>
> Key: MAPREDUCE-786
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-786
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Reporter: Sharad Agarwal
>Assignee: Sharad Agarwal
>
> Jobtracker lock is held while writing the history events. This makes the 
> jobtracker slow on flushes, especially when history is written to HDFS. 
> History events should be written asynchronously to avoid this problem.




[jira] Updated: (MAPREDUCE-779) Add node health failures into JobTrackerStatistics

2009-07-23 Thread Sreekanth Ramakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreekanth Ramakrishnan updated MAPREDUCE-779:
-

Attachment: mapreduce-779-2.patch

Attaching patch which addresses Rahul's comments.

> Add node health failures into JobTrackerStatistics
> --
>
> Key: MAPREDUCE-779
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-779
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Reporter: Sreekanth Ramakrishnan
>Assignee: Sreekanth Ramakrishnan
> Attachments: mapreduce-779-1.patch, mapreduce-779-2.patch
>
>
> Add the node health failure counts into {{JobTrackerStatistics}}.




[jira] Updated: (MAPREDUCE-766) Enhance -list-blacklisted-trackers to display host name, blacklisted reason and blacklist report.

2009-07-23 Thread Sreekanth Ramakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreekanth Ramakrishnan updated MAPREDUCE-766:
-

Attachment: blacklist3.png

Screenshot of the CLI output.

> Enhance -list-blacklisted-trackers to display host name, blacklisted reason 
> and blacklist report.
> -
>
> Key: MAPREDUCE-766
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-766
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Sreekanth Ramakrishnan
>Assignee: Sreekanth Ramakrishnan
> Attachments: blacklist3.png, mapreduce-766-1.patch, 
> mapreduce-766-2.patch, mapreduce-766-3.patch, mapreduce-766-4.patch
>
>
> Currently, the -list-blacklisted-trackers option of the mapred job command 
> lists only the tracker name. We should enhance it to display the hostname, 
> the reason for blacklisting, and the blacklist report.




[jira] Commented: (MAPREDUCE-779) Add node health failures into JobTrackerStatistics

2009-07-23 Thread rahul k singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734518#action_12734518
 ] 

rahul k singh commented on MAPREDUCE-779:
-

Some comments:

1. Comment the updateNodeHealthFailureStatistics method, especially the UN_HEALTHY handling.
2. The declaration of the above method is longer than 80 characters.
3. Make sure lines in the JSP file stay within 80 characters.
4. Show the blacklisted metrics all the time.

> Add node health failures into JobTrackerStatistics
> --
>
> Key: MAPREDUCE-779
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-779
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker
>Reporter: Sreekanth Ramakrishnan
>Assignee: Sreekanth Ramakrishnan
> Attachments: mapreduce-779-1.patch
>
>
> Add the node health failure counts into {{JobTrackerStatistics}}.




[jira] Commented: (MAPREDUCE-373) Change org.apache.hadoop.mapred.lib.FieldSelectionMapReduce to use new api.

2009-07-23 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734507#action_12734507
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-373:
---

test-patch result:
{noformat}
 [exec]
 [exec] +1 overall.
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included.  The patch appears to include 9 new or modified tests.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs warnings.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the total number of release audit warnings.
 [exec]
 [exec]
{noformat}

Both TestFieldSelection and TestMRFieldSelection passed on my machine.

> Change org.apache.hadoop.mapred.lib.FieldSelectionMapReduce to use new api.
> 
>
> Key: MAPREDUCE-373
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-373
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-373-1.txt, patch-373.txt
>
>





[jira] Updated: (MAPREDUCE-373) Change org.apache.hadoop.mapred.lib.FieldSelectionMapReduce to use new api.

2009-07-23 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-373:
--

Attachment: patch-373-1.txt

Patch with review comments incorporated. I left FieldSelectionHelper as a 
public class, since it is used by the old API class.

> Change org.apache.hadoop.mapred.lib.FieldSelectionMapReduce to use new api.
> 
>
> Key: MAPREDUCE-373
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-373
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-373-1.txt, patch-373.txt
>
>

