[jira] Commented: (MAPREDUCE-1864) PipeMapRed.java has uninitialized members log_ and LOGNAME

2010-06-29 Thread Ravi Gummadi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883805#action_12883805
 ] 

Ravi Gummadi commented on MAPREDUCE-1864:
-

Patch looks good.
+1

> PipeMapRed.java has uninitialized members log_ and LOGNAME 
> ---
>
> Key: MAPREDUCE-1864
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1864
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.22.0
>
> Attachments: patch-1864.txt
>
>
> PipeMapRed.java has members log_ and LOGNAME, which are never initialized but 
> are used for logging in several places. 
> They should be removed, and PipeMapRed should use commons LogFactory and Log 
> for logging. This would improve code maintainability.
> Also, as per [comment | 
> https://issues.apache.org/jira/browse/MAPREDUCE-1851?focusedCommentId=12878530&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12878530],
>  stream.joblog_ configuration property can be removed.
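
The logging change proposed above can be illustrated with a minimal sketch. It is not the attached patch: PipeSketch is a hypothetical stand-in for PipeMapRed, and java.util.logging is swapped in for commons-logging so the example runs without extra jars. With commons-logging, the field would typically be obtained from LogFactory.getLog(PipeSketch.class) instead.

```java
import java.util.logging.Logger;

// Hypothetical stand-in for PipeMapRed; not the real class.
class PipeSketch {
    // Initialized once when the class loads, so it is never null at a call site.
    // Assumption: java.util.logging is used here only to keep the sketch
    // dependency-free; the issue itself asks for commons LogFactory and Log.
    private static final Logger LOG = Logger.getLogger(PipeSketch.class.getName());

    // Logs a message and returns it so the behaviour is easy to check.
    static String describe(int records) {
        String msg = "processed " + records + " records";
        LOG.info(msg);
        return msg;
    }

    public static void main(String[] args) {
        describe(3);
    }
}
```

Because the logger is a static final field assigned at declaration, there is no code path on which it can be used before it is set, which is the failure mode the issue describes for log_.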

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-577) Duplicate Mapper input when using StreamXmlRecordReader

2010-06-29 Thread Ravi Gummadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated MAPREDUCE-577:
---

Attachment: 577.v3.patch

Attaching a new patch that fixes the FileSystem block size setting by calling 
FileSystem.closeAll() in the test cases. With this, the block size is properly 
set to 60, 80, 60 and 80 in the 4 test cases respectively.

> Duplicate Mapper input when using StreamXmlRecordReader
> ---
>
> Key: MAPREDUCE-577
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-577
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
> Environment: HADOOP 0.17.0, Java 6.0
>Reporter: David Campbell
>Assignee: Ravi Gummadi
> Attachments: 0001-test-to-demonstrate-HADOOP-3484.patch, 
> 0002-patch-for-HADOOP-3484.patch, 577.20S.patch, 577.patch, 577.v1.patch, 
> 577.v2.patch, 577.v3.patch, HADOOP-3484.combined.patch, HADOOP-3484.try3.patch
>
>
> I have an XML file with 93626 rows.  A row is marked by 
> I've confirmed this with grep and the Grep example program included with 
> HADOOP.
> Here is the grep example output.  93626   
> I've setup my job configuration as follows:   
> conf.set("stream.recordreader.class", 
> "org.apache.hadoop.streaming.StreamXmlRecordReader");
> conf.set("stream.recordreader.begin", "");
> conf.set("stream.recordreader.end", "");
> conf.setInputFormat(StreamInputFormat.class);
> I have a fairly simple test Mapper.
> Here's the map method.
>   public void map(Text key, Text value, OutputCollector output,
>       Reporter reporter) throws IOException {
>     try {
>       output.collect(totalWord, one);
>       if (key != null && key.toString().indexOf("01852") != -1) {
>         output.collect(new Text("01852"), one);
>       }
>     } catch (Exception ex) {
>       Logger.getLogger(TestMapper.class.getName()).log(Level.SEVERE,
>           null, ex);
>       System.out.println(value);
>     }
>   }
> For totalWord ("TOTAL"), I get:
> TOTAL 140850
> and for 01852 I get.
> 01852 86
> There are 43 instances of 01852 in the file.
> I have the following setting in my config.  
>conf.setNumMapTasks(1);
> I have a total of six machines in my cluster.
> If I run without this, the result is 12x the actual value, not 2x.
> Here's some info from the cluster web page.
> Maps  Reduces  Total Submissions  Nodes  Map Task Capacity  Reduce Task Capacity  Avg. Tasks/Node
> 0     0        1                  6      12                 12                    4.00
> I've also noticed something really strange in the job's output.  It looks 
> like it's starting over or redoing things.
> This was run using all six nodes and no limitations on map or reduce tasks.  
> I haven't seen this behavior in any other case.
> 08/06/03 10:50:35 INFO mapred.FileInputFormat: Total input paths to process : 1
> 08/06/03 10:50:36 INFO mapred.JobClient: Running job: job_200806030916_0018
> 08/06/03 10:50:37 INFO mapred.JobClient:  map 0% reduce 0%
> 08/06/03 10:50:42 INFO mapred.JobClient:  map 2% reduce 0%
> 08/06/03 10:50:45 INFO mapred.JobClient:  map 12% reduce 0%
> 08/06/03 10:50:47 INFO mapred.JobClient:  map 31% reduce 0%
> 08/06/03 10:50:48 INFO mapred.JobClient:  map 49% reduce 0%
> 08/06/03 10:50:49 INFO mapred.JobClient:  map 68% reduce 0%
> 08/06/03 10:50:50 INFO mapred.JobClient:  map 100% reduce 0%
> 08/06/03 10:50:54 INFO mapred.JobClient:  map 87% reduce 0%
> 08/06/03 10:50:55 INFO mapred.JobClient:  map 100% reduce 0%
> 08/06/03 10:50:56 INFO mapred.JobClient:  map 0% reduce 0%
> 08/06/03 10:51:00 INFO mapred.JobClient:  map 0% reduce 1%
> 08/06/03 10:51:05 INFO mapred.JobClient:  map 28% reduce 2%
> 08/06/03 10:51:07 INFO mapred.JobClient:  map 80% reduce 4%
> 08/06/03 10:51:08 INFO mapred.JobClient:  map 100% reduce 4%
> 08/06/03 10:51:09 INFO mapred.JobClient:  map 100% reduce 7%
> 08/06/03 10:51:10 INFO mapred.JobClient:  map 90% reduce 9%
> 08/06/03 10:51:11 INFO mapred.JobClient:  map 100% reduce 9%
> 08/06/03 10:51:12 INFO mapred.JobClient:  map 100% reduce 11%
> 08/06/03 10:51:13 INFO mapred.JobClient:  map 90% reduce 11%
> 08/06/03 10:51:14 INFO mapred.JobClient:  map 97% reduce 11%
> 08/06/03 10:51:15 INFO mapred.JobClient:  map 63% reduce 11%
> 08/06/03 10:51:16 INFO mapred.JobClient:  map 48% reduce 11%
> 08/06/03 10:51:17 INFO mapred.JobClient:  map 21% reduce 11%
> 08/06/03 10:51:19 INFO mapred.JobClient:  map 0% reduce 11%
> 08/06/03 10:51:20 INFO mapred.JobClient:  map 15% reduce 12%
> 08/06/03 10:51:21 INFO mapred.JobClient:  map 27% reduce 13%
> 08/06/03 10:51:22 INFO mapred.JobClient:  map 67% reduce 13%
> 08/06/03 10:51:24 INFO mapred.JobClient:  map 22% reduce 16%
> 08/06/03 10:51:25 INFO mapred.JobClient:  map 46% reduce 16%
> 08/06/03 10:51:26 INFO mapred.JobClient:  

[jira] Updated: (MAPREDUCE-577) Duplicate Mapper input when using StreamXmlRecordReader

2010-06-29 Thread Ravi Gummadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated MAPREDUCE-577:
---

Status: Patch Available  (was: Open)

> Duplicate Mapper input when using StreamXmlRecordReader
> ---
>
> Key: MAPREDUCE-577
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-577
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
> Environment: HADOOP 0.17.0, Java 6.0
>Reporter: David Campbell
>Assignee: Ravi Gummadi
> Attachments: 0001-test-to-demonstrate-HADOOP-3484.patch, 
> 0002-patch-for-HADOOP-3484.patch, 577.20S.patch, 577.patch, 577.v1.patch, 
> 577.v2.patch, 577.v3.patch, HADOOP-3484.combined.patch, HADOOP-3484.try3.patch
>
>
> I have an XML file with 93626 rows.  A row is marked by 
> I've confirmed this with grep and the Grep example program included with 
> HADOOP.
> Here is the grep example output.  93626   
> I've setup my job configuration as follows:   
> conf.set("stream.recordreader.class", 
> "org.apache.hadoop.streaming.StreamXmlRecordReader");
> conf.set("stream.recordreader.begin", "");
> conf.set("stream.recordreader.end", "");
> conf.setInputFormat(StreamInputFormat.class);
> I have a fairly simple test Mapper.
> Here's the map method.
>   public void map(Text key, Text value, OutputCollector output,
>       Reporter reporter) throws IOException {
>     try {
>       output.collect(totalWord, one);
>       if (key != null && key.toString().indexOf("01852") != -1) {
>         output.collect(new Text("01852"), one);
>       }
>     } catch (Exception ex) {
>       Logger.getLogger(TestMapper.class.getName()).log(Level.SEVERE,
>           null, ex);
>       System.out.println(value);
>     }
>   }
> For totalWord ("TOTAL"), I get:
> TOTAL 140850
> and for 01852 I get.
> 01852 86
> There are 43 instances of 01852 in the file.
> I have the following setting in my config.  
>conf.setNumMapTasks(1);
> I have a total of six machines in my cluster.
> If I run without this, the result is 12x the actual value, not 2x.
> Here's some info from the cluster web page.
> Maps  Reduces  Total Submissions  Nodes  Map Task Capacity  Reduce Task Capacity  Avg. Tasks/Node
> 0     0        1                  6      12                 12                    4.00
> I've also noticed something really strange in the job's output.  It looks 
> like it's starting over or redoing things.
> This was run using all six nodes and no limitations on map or reduce tasks.  
> I haven't seen this behavior in any other case.
> 08/06/03 10:50:35 INFO mapred.FileInputFormat: Total input paths to process : 1
> 08/06/03 10:50:36 INFO mapred.JobClient: Running job: job_200806030916_0018
> 08/06/03 10:50:37 INFO mapred.JobClient:  map 0% reduce 0%
> 08/06/03 10:50:42 INFO mapred.JobClient:  map 2% reduce 0%
> 08/06/03 10:50:45 INFO mapred.JobClient:  map 12% reduce 0%
> 08/06/03 10:50:47 INFO mapred.JobClient:  map 31% reduce 0%
> 08/06/03 10:50:48 INFO mapred.JobClient:  map 49% reduce 0%
> 08/06/03 10:50:49 INFO mapred.JobClient:  map 68% reduce 0%
> 08/06/03 10:50:50 INFO mapred.JobClient:  map 100% reduce 0%
> 08/06/03 10:50:54 INFO mapred.JobClient:  map 87% reduce 0%
> 08/06/03 10:50:55 INFO mapred.JobClient:  map 100% reduce 0%
> 08/06/03 10:50:56 INFO mapred.JobClient:  map 0% reduce 0%
> 08/06/03 10:51:00 INFO mapred.JobClient:  map 0% reduce 1%
> 08/06/03 10:51:05 INFO mapred.JobClient:  map 28% reduce 2%
> 08/06/03 10:51:07 INFO mapred.JobClient:  map 80% reduce 4%
> 08/06/03 10:51:08 INFO mapred.JobClient:  map 100% reduce 4%
> 08/06/03 10:51:09 INFO mapred.JobClient:  map 100% reduce 7%
> 08/06/03 10:51:10 INFO mapred.JobClient:  map 90% reduce 9%
> 08/06/03 10:51:11 INFO mapred.JobClient:  map 100% reduce 9%
> 08/06/03 10:51:12 INFO mapred.JobClient:  map 100% reduce 11%
> 08/06/03 10:51:13 INFO mapred.JobClient:  map 90% reduce 11%
> 08/06/03 10:51:14 INFO mapred.JobClient:  map 97% reduce 11%
> 08/06/03 10:51:15 INFO mapred.JobClient:  map 63% reduce 11%
> 08/06/03 10:51:16 INFO mapred.JobClient:  map 48% reduce 11%
> 08/06/03 10:51:17 INFO mapred.JobClient:  map 21% reduce 11%
> 08/06/03 10:51:19 INFO mapred.JobClient:  map 0% reduce 11%
> 08/06/03 10:51:20 INFO mapred.JobClient:  map 15% reduce 12%
> 08/06/03 10:51:21 INFO mapred.JobClient:  map 27% reduce 13%
> 08/06/03 10:51:22 INFO mapred.JobClient:  map 67% reduce 13%
> 08/06/03 10:51:24 INFO mapred.JobClient:  map 22% reduce 16%
> 08/06/03 10:51:25 INFO mapred.JobClient:  map 46% reduce 16%
> 08/06/03 10:51:26 INFO mapred.JobClient:  map 70% reduce 16%
> 08/06/03 10:51:27 INFO mapred.JobClient:  map 73% reduce 18%
> 08/06/03 10:51:28 INFO mapred.JobClient:  map 85% reduce 19%
> 08/06/03 10:51:29 INFO mapred.JobClient:  map 7% reduce 19%

[jira] Commented: (MAPREDUCE-1898) [Herriot] Implement a functionality for getting the job summary information of a job.

2010-06-29 Thread Vinay Kumar Thota (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883796#action_12883796
 ] 

Vinay Kumar Thota commented on MAPREDUCE-1898:
--

Cos, you are absolutely correct; it was my mistake. Somehow I was under the 
impression that the JobInfoImpl class belongs to mapreduce, so I implemented 
the functionality using AspectJ. I will modify the code and resubmit the 
patch.

> [Herriot] Implement a functionality for getting the job summary information 
> of a job.
> -
>
> Key: MAPREDUCE-1898
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1898
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1898-ydist-security.patch, 1898-ydist-security.patch
>
>
> Implement a method for getting the job summary details of a job. The job 
> summary should include:
> jobId, startTime, launchTime, finishTime, numMaps, numSlotsPerMap, 
> numReduces, numSlotsPerReduce, user, queue, status, mapSlotSeconds, 
> reduceSlotSeconds, clusterMapCapacity, clusterReduceCapacity.




[jira] Commented: (MAPREDUCE-1864) PipeMapRed.java has uninitialized members log_ and LOGNAME

2010-06-29 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883795#action_12883795
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-1864:


bq. -1 contrib tests.
Test failure is due to MAPREDUCE-1834.

bq. -1 tests included. 
The patch removes unused code. So, no new tests are added.

> PipeMapRed.java has uninitialized members log_ and LOGNAME 
> ---
>
> Key: MAPREDUCE-1864
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1864
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.22.0
>
> Attachments: patch-1864.txt
>
>
> PipeMapRed.java has members log_ and LOGNAME, which are never initialized but 
> are used for logging in several places. 
> They should be removed, and PipeMapRed should use commons LogFactory and Log 
> for logging. This would improve code maintainability.
> Also, as per [comment | 
> https://issues.apache.org/jira/browse/MAPREDUCE-1851?focusedCommentId=12878530&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12878530],
>  stream.joblog_ configuration property can be removed.




[jira] Updated: (MAPREDUCE-1888) Streaming overrides user given output key and value types.

2010-06-29 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1888:
---

  Status: Resolved  (was: Patch Available)
Hadoop Flags: [Reviewed]
  Resolution: Fixed

I just committed this. Thanks, Ravi!

> Streaming overrides user given output key and value types.
> --
>
> Key: MAPREDUCE-1888
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1888
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Affects Versions: 0.21.0
>Reporter: Amareshwari Sriramadasu
>Assignee: Ravi Gummadi
> Fix For: 0.22.0
>
> Attachments: 1888.patch, 1888.v1.patch, 1888.v2.patch, 1888.v3.patch
>
>
> The following code in StreamJob.java overrides user given output key and 
> value types.
> {code}
> idResolver.resolve(conf.get(StreamJobConfig.MAP_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.MAP_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setMapOutputKeyClass(idResolver.getOutputKeyClass());
> job.setMapOutputValueClass(idResolver.getOutputValueClass());
> 
> idResolver.resolve(conf.get(StreamJobConfig.REDUCE_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.REDUCE_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setOutputKeyClass(idResolver.getOutputKeyClass());
> job.setOutputValueClass(idResolver.getOutputValueClass());
> {code}
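
One way to avoid this kind of override is to apply the resolver's class only when the user has not configured one. The sketch below shows that idea in isolation and is not necessarily what the committed patch does. OutputTypeDefaults, chooseClass and the configuration key names are hypothetical, and a plain Map stands in for the job configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the helper and key names are illustrative, not Hadoop's API.
class OutputTypeDefaults {
    // Returns the user's value if one is set; otherwise stores and returns
    // the resolver's default for that key.
    static String chooseClass(Map<String, String> conf, String key, String resolverDefault) {
        String userValue = conf.get(key);
        if (userValue == null) {
            conf.put(key, resolverDefault);
            return resolverDefault;
        }
        return userValue;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // The user explicitly chose a key class; the value class is left unset.
        conf.put("example.output.key.class", "org.apache.hadoop.io.LongWritable");
        System.out.println(chooseClass(conf, "example.output.key.class", "org.apache.hadoop.io.Text"));
        System.out.println(chooseClass(conf, "example.output.value.class", "org.apache.hadoop.io.Text"));
    }
}
```

In this sketch the user's key class survives, while the unset value class falls back to the resolver's default.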




[jira] Commented: (MAPREDUCE-1888) Streaming overrides user given output key and value types.

2010-06-29 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883787#action_12883787
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-1888:


+1 Latest patch looks fine.
Test failure is because of MAPREDUCE-1834. 

> Streaming overrides user given output key and value types.
> --
>
> Key: MAPREDUCE-1888
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1888
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Affects Versions: 0.21.0
>Reporter: Amareshwari Sriramadasu
>Assignee: Ravi Gummadi
> Fix For: 0.22.0
>
> Attachments: 1888.patch, 1888.v1.patch, 1888.v2.patch, 1888.v3.patch
>
>
> The following code in StreamJob.java overrides user given output key and 
> value types.
> {code}
> idResolver.resolve(conf.get(StreamJobConfig.MAP_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.MAP_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setMapOutputKeyClass(idResolver.getOutputKeyClass());
> job.setMapOutputValueClass(idResolver.getOutputValueClass());
> 
> idResolver.resolve(conf.get(StreamJobConfig.REDUCE_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.REDUCE_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setOutputKeyClass(idResolver.getOutputKeyClass());
> job.setOutputValueClass(idResolver.getOutputValueClass());
> {code}




[jira] Commented: (MAPREDUCE-1854) [herriot] Automate health script system test

2010-06-29 Thread Balaji Rajagopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883780#action_12883780
 ] 

Balaji Rajagopalan commented on MAPREDUCE-1854:
---

Cos, is this a +1 for this jira? Can I commit?

> [herriot] Automate health script system test
> 
>
> Key: MAPREDUCE-1854
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1854
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: test
> Environment: Herriot framework
>Reporter: Balaji Rajagopalan
>Assignee: Balaji Rajagopalan
> Attachments: health_script_5.txt, health_script_7.txt
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> There are three scenarios:
> 1. Induce an error from the health script and verify that the task tracker is 
> blacklisted. 
> 2. Make the health script time out and verify that the task tracker is 
> blacklisted. 
> 3. Introduce an error in the health script path and make sure the task tracker 
> stays healthy. 




[jira] Commented: (MAPREDUCE-1548) Hadoop archives should be able to preserve times and other properties from original files

2010-06-29 Thread Rodrigo Schmidt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883761#action_12883761
 ] 

Rodrigo Schmidt commented on MAPREDUCE-1548:


Thanks Scott!

Indeed I've been doing too much PHP and javascript lately. I'll change that.

The space-separated metadata is fine because I'm URL-encoding all the string 
properties. I kept " " as the standard separator because it is already used in 
other parts of Hadoop Archives.

Using URL-encoding of strings makes it safe.
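
The reasoning above can be sketched as follows. This is an illustration only, not the code from the patch: MetadataLine and its methods are hypothetical names, and the fields are made-up sample values. It relies on java.net.URLEncoder, which never emits a literal space (a space becomes "+"), so a space can safely delimit the encoded fields.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper; not part of Hadoop Archives.
class MetadataLine {
    // Joins URL-encoded fields with a single space.
    static String encode(String... fields) {
        StringBuilder sb = new StringBuilder();
        try {
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    sb.append(' ');
                }
                sb.append(URLEncoder.encode(fields[i], "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
        return sb.toString();
    }

    // Splits on the separator, then decodes each field back to its original form.
    static String[] decode(String line) {
        String[] parts = line.split(" ", -1);
        try {
            for (int i = 0; i < parts.length; i++) {
                parts[i] = URLDecoder.decode(parts[i], "UTF-8");
            }
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
        return parts;
    }

    public static void main(String[] args) {
        String line = encode("alice", "my group", "rw-r--r--");
        System.out.println(line); // prints: alice my+group rw-r--r--
        System.out.println(decode(line)[1]); // prints: my group
    }
}
```

A field that itself contains a space, such as the group name here, round-trips intact because the space is encoded before the fields are joined.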


> Hadoop archives should be able to preserve times and other properties from 
> original files
> -
>
> Key: MAPREDUCE-1548
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1548
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: harchive
>Affects Versions: 0.22.0
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1548.0.patch
>
>
> Files inside hadoop archives don't keep their original:
> - modification time
> - access time
> - permission
> - owner
> - group
> All such properties are currently taken from the file storing the archive 
> index, not from the stored files. This does not look correct.
> It should be possible to preserve the original properties of the stored 
> files.




[jira] Commented: (MAPREDUCE-1548) Hadoop archives should be able to preserve times and other properties from original files

2010-06-29 Thread Scott Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883759#action_12883759
 ] 

Scott Chen commented on MAPREDUCE-1548:
---

@Rodrigo: Can you change this to modificationTime? Doing too much PHP?
{code}
+long modification_time = 0;
{code}

The other thing is that metadata are separated by " " (it was like that).
Is this safe enough?

Other than these the patch looks good. Thanks.


> Hadoop archives should be able to preserve times and other properties from 
> original files
> -
>
> Key: MAPREDUCE-1548
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1548
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: harchive
>Affects Versions: 0.22.0
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1548.0.patch
>
>
> Files inside hadoop archives don't keep their original:
> - modification time
> - access time
> - permission
> - owner
> - group
> All such properties are currently taken from the file storing the archive 
> index, not from the stored files. This does not look correct.
> It should be possible to preserve the original properties of the stored 
> files.




[jira] Commented: (MAPREDUCE-1897) trunk build broken on compile-mapred-test

2010-06-29 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883758#action_12883758
 ] 

Konstantin Boudnik commented on MAPREDUCE-1897:
---

Ran the same on RHEL4.5 with success. Could you please contact me offline and 
provide information about your build environment so that I have a chance to 
use it to reproduce the problem?

> trunk build broken on compile-mapred-test
> -
>
> Key: MAPREDUCE-1897
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1897
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
> Environment: RHEL4 Linux, Java 1.6.0_15-b03
>Reporter: Greg Roelofs
>Assignee: Konstantin Boudnik
>
> ...apparently.  Fresh checkout of trunk (all three hadoop-*), 
> build.properties project.version fix, ant veryclean mvn-install of common, 
> hdfs, and then mapreduce:
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:52:
>  cannot access org.apache.hadoop.test.system.DaemonProtocol
> [javac] class file for org.apache.hadoop.test.system.DaemonProtocol not 
> found
> [javac]   static class FakeJobTracker extends JobTracker {
> [javac]  ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:60:
>  non-static variable this cannot be referenced from a static context
> [javac]   this.trackers = tts;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:60:
>  cannot find symbol
> [javac] symbol  : variable trackers
> [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities
> [javac]   this.trackers = tts;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:67:
>  cannot find symbol
> [javac] symbol  : method taskTrackers()
> [javac] location: class 
> org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
> [javac]   taskTrackers().size() - getBlacklistedTrackerCount(),
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:67:
>  cannot find symbol
> [javac] symbol  : method getBlacklistedTrackerCount()
> [javac] location: class 
> org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
> [javac]   taskTrackers().size() - getBlacklistedTrackerCount(),
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:68:
>  cannot find symbol
> [javac] symbol  : method getBlacklistedTrackerCount()
> [javac] location: class 
> org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
> [javac]   getBlacklistedTrackerCount(), 0, 0, 0, totalSlots/2, 
> totalSlots/2, 
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:64:
>  method does not override or implement a method from a supertype
> [javac] @Override
> [javac] ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:73:
>  non-static variable this cannot be referenced from a static context
> [javac]   this.totalSlots = totalSlots;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:73:
>  cannot find symbol
> [javac] symbol  : variable totalSlots
> [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities
> [javac]   this.totalSlots = totalSlots;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestJobInProgress.java:91:
>  establishFirstContact(org.apache.hadoop.mapred.JobTracker,java.lang.String) 
> in org.apache.hadoop.mapred.FakeObjectUtilities cannot be applied to 
> (org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker,java.lang.String)
> [javac]   FakeObjectUtilities.establishFirstContact(jobTracker, 
> s);
> [javac]  ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestJobInProgress.java:170:
>  cannot find symbol
> [javac] symbol  : constructor 
> MyFakeJobInProgress(org.apache.hadoop.mapred.JobConf,org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker)
> [javac] 

[jira] Commented: (MAPREDUCE-1897) trunk build broken on compile-mapred-test

2010-06-29 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883754#action_12883754
 ] 

Konstantin Boudnik commented on MAPREDUCE-1897:
---

I ran {{ant veryclean mvn-install}} after removing {{~/.ivy2}} on a RHEL5 
system and saw a successful build, with all artifacts installed into my 
local maven repository. I repeated the same on BSD and Ubuntu 10.04 with 
similar results. I am going to get access to RHEL4 and will try to reproduce 
your issue again.


> trunk build broken on compile-mapred-test
> -
>
> Key: MAPREDUCE-1897
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1897
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
> Environment: RHEL4 Linux, Java 1.6.0_15-b03
>Reporter: Greg Roelofs
>Assignee: Konstantin Boudnik
>
> ...apparently.  Fresh checkout of trunk (all three hadoop-*), 
> build.properties project.version fix, ant veryclean mvn-install of common, 
> hdfs, and then mapreduce:
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:52:
>  cannot access org.apache.hadoop.test.system.DaemonProtocol
> [javac] class file for org.apache.hadoop.test.system.DaemonProtocol not 
> found
> [javac]   static class FakeJobTracker extends JobTracker {
> [javac]  ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:60:
>  non-static variable this cannot be referenced from a static context
> [javac]   this.trackers = tts;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:60:
>  cannot find symbol
> [javac] symbol  : variable trackers
> [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities
> [javac]   this.trackers = tts;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:67:
>  cannot find symbol
> [javac] symbol  : method taskTrackers()
> [javac] location: class 
> org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
> [javac]   taskTrackers().size() - getBlacklistedTrackerCount(),
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:67:
>  cannot find symbol
> [javac] symbol  : method getBlacklistedTrackerCount()
> [javac] location: class 
> org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
> [javac]   taskTrackers().size() - getBlacklistedTrackerCount(),
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:68:
>  cannot find symbol
> [javac] symbol  : method getBlacklistedTrackerCount()
> [javac] location: class 
> org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker
> [javac]   getBlacklistedTrackerCount(), 0, 0, 0, totalSlots/2, 
> totalSlots/2, 
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:64:
>  method does not override or implement a method from a supertype
> [javac] @Override
> [javac] ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:73:
>  non-static variable this cannot be referenced from a static context
> [javac]   this.totalSlots = totalSlots;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/FakeObjectUtilities.java:73:
>  cannot find symbol
> [javac] symbol  : variable totalSlots
> [javac] location: class org.apache.hadoop.mapred.FakeObjectUtilities
> [javac]   this.totalSlots = totalSlots;
> [javac]   ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestJobInProgress.java:91:
>  establishFirstContact(org.apache.hadoop.mapred.JobTracker,java.lang.String) 
> in org.apache.hadoop.mapred.FakeObjectUtilities cannot be applied to 
> (org.apache.hadoop.mapred.FakeObjectUtilities.FakeJobTracker,java.lang.String)
> [javac]   FakeObjectUtilities.establishFirstContact(jobTracker, 
> s);
> [javac]  ^
> [javac] 
> /home/roelofs/grid/trunk2/hadoop-mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestJobInProgress.java:170:
>  cannot find symbol
> [javac] symbol  : construct

[jira] Commented: (MAPREDUCE-1893) Multiple reducers for Slive

2010-06-29 Thread Ravi Phulari (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883749#action_12883749
 ] 

Ravi Phulari commented on MAPREDUCE-1893:
-

Sorry, I missed one thing earlier. 
One correction is needed in MapredTestDriver.java. 
We need to make SliveTest visible through TestDriver, which can be done by 
adding "SliveTest" as the first parameter to pgd.addClass().

{noformat}
pgd.addClass(SliveTest.class.getSimpleName(), SliveTest.class, 
  "HDFS Stress Test and Live Data Verification.");
{noformat}

It should look like this:
{noformat}
pgd.addClass("SliveTest",SliveTest.class.getSimpleName(), SliveTest.class, 
  "HDFS Stress Test and Live Data Verification.");
{noformat}

> Multiple reducers for Slive
> ---
>
> Key: MAPREDUCE-1893
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1893
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: benchmarks, test
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Fix For: 0.22.0
>
> Attachments: SliveMultiR.patch, SliveMultiR.patch, SliveMultiR.patch
>
>
> Slive currently uses single reducer. It could use multiple ones.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1888) Streaming overrides user given output key and value types.

2010-06-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883737#action_12883737
 ] 

Hadoop QA commented on MAPREDUCE-1888:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12448337/1888.v3.patch
  against trunk revision 958902.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 35 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/274/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/274/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/274/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/274/console

This message is automatically generated.

> Streaming overrides user given output key and value types.
> --
>
> Key: MAPREDUCE-1888
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1888
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Affects Versions: 0.21.0
>Reporter: Amareshwari Sriramadasu
>Assignee: Ravi Gummadi
> Fix For: 0.22.0
>
> Attachments: 1888.patch, 1888.v1.patch, 1888.v2.patch, 1888.v3.patch
>
>
> The following code in StreamJob.java overrides user given output key and 
> value types.
> {code}
> idResolver.resolve(conf.get(StreamJobConfig.MAP_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.MAP_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setMapOutputKeyClass(idResolver.getOutputKeyClass());
> job.setMapOutputValueClass(idResolver.getOutputValueClass());
> 
> idResolver.resolve(conf.get(StreamJobConfig.REDUCE_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.REDUCE_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setOutputKeyClass(idResolver.getOutputKeyClass());
> job.setOutputValueClass(idResolver.getOutputValueClass());
> {code}
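
One way to avoid this kind of override is to apply the resolver's types only as defaults. The following is a minimal sketch of that idea, not the attached patch: it uses a plain Map in place of the job configuration, setIfUnset is a hypothetical helper, and the property strings are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

class OutputTypeDefaults {
  // Stores defaultValue under key only when the user has not configured key.
  static void setIfUnset(Map<String, String> conf, String key, String defaultValue) {
    conf.putIfAbsent(key, defaultValue);
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    // Value supplied by the user before the streaming defaults are applied.
    conf.put("mapreduce.job.output.key.class", "org.apache.hadoop.io.LongWritable");

    // Defaults that would otherwise overwrite the user's choice.
    setIfUnset(conf, "mapreduce.job.output.key.class", "org.apache.hadoop.io.Text");
    setIfUnset(conf, "mapreduce.job.output.value.class", "org.apache.hadoop.io.Text");

    System.out.println(conf.get("mapreduce.job.output.key.class"));
    System.out.println(conf.get("mapreduce.job.output.value.class"));
  }
}
```

Here the user-given key class survives, while the value class, which the user did not set, falls back to the default.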




[jira] Commented: (MAPREDUCE-1893) Multiple reducers for Slive

2010-06-29 Thread Ravi Phulari (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883724#action_12883724
 ] 

Ravi Phulari commented on MAPREDUCE-1893:
-

Thanks for adding Slive to Driver Konstantin. 
Patch looks good to me.


> Multiple reducers for Slive
> ---
>
> Key: MAPREDUCE-1893
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1893
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: benchmarks, test
>Affects Versions: 0.22.0
>Reporter: Konstantin Shvachko
>Assignee: Konstantin Shvachko
> Fix For: 0.22.0
>
> Attachments: SliveMultiR.patch, SliveMultiR.patch, SliveMultiR.patch
>
>
> Slive currently uses single reducer. It could use multiple ones.




[jira] Commented: (MAPREDUCE-1672) Create test scenario for "distributed cache file behaviour, when dfs file is not modified"

2010-06-29 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883711#action_12883711
 ] 

Konstantin Boudnik commented on MAPREDUCE-1672:
---

Is there really no such thing as creating a DFS file yet?
Also, your latest patch for y20 security seems to be too short.


> Create test scenario for "distributed cache file behaviour, when dfs file is 
> not modified"
> --
>
> Key: MAPREDUCE-1672
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1672
> Project: Hadoop Map/Reduce
>  Issue Type: Test
>  Components: test
>Reporter: Iyappan Srinivasan
>Assignee: Iyappan Srinivasan
> Attachments: 
> TEST-org.apache.hadoop.mapred.TestDistributedCacheUnModifiedFile.txt, 
> TEST-org.apache.hadoop.mapred.TestDistributedCacheUnModifiedFile.txt, 
> TestDistributedCacheUnModifiedFile-ydist-security-patch.txt, 
> TestDistributedCacheUnModifiedFile.patch, 
> TestDistributedCacheUnModifiedFile.patch, 
> TestDistributedCacheUnModifiedFile.patch, 
> TestDistributedCacheUnModifiedFile.patch, 
> TestDistributedCacheUnModifiedFile.patch, 
> TestDistributedCacheUnModifiedFile.patch, 
> TestDistributedCacheUnModifiedFile.patch
>
>
> This test scenario covers the behaviour of a distributed cache file
> when it is not modified before or after being
> accessed by at most two jobs. Once a job uses a distributed cache file,
> that file is stored in the mapred.local.dir. If the next job
> uses the same file, then it is not stored again.
> So, if two jobs choose the same tasktracker for their job execution,
> then the distributed cache file should not be found twice.
> This testcase should run a job with a distributed cache file. The
> handle of the tasktracker for each task is obtained and checked for
> the presence of the distributed cache file with proper permissions in the
> proper directory. Next, when the job
> runs again and any of its tasks hits the same tasktracker that
> ran one of the tasks of the previous job, then that
> file should not be uploaded again and the task should use the old file.




[jira] Commented: (MAPREDUCE-1871) Create automated test scenario for "Collect information about number of tasks succeeded / total per time unit for a tasktracker"

2010-06-29 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883708#action_12883708
 ] 

Konstantin Boudnik commented on MAPREDUCE-1871:
---

*  Consider lowering log levels in JobTrackerAspect.aj. I think there's no 
need to print clearly debug messages with INFO level.
* + * @return int isn't very informative.
* this is very C-like programming style:

  +  /**
  +   * This gets the value of all task trackers windows in the tasktracker 
page.
  +   *
  +   * @param none,
  +   * @return int[] of  all the tasks that ran, in the sequence given 
below
  +   * "since_start", "total_tasks"
  +   * "since_start","succeeded_tasks"
  +   * "last_hour", "total_tasks"
  +   * "last_hour", "succeeded_tasks"
  +   * "last_day", "total_tasks"
  +   * "last_day", "succeeded_tasks"
  +   */

  why don't you use an object container instead? It will make code like this

  +int totalTasksSinceStartBeforeJob = ttAllInfo[0];
  +int succeededTasksSinceStartBeforeJob = ttAllInfo[1];
  +int totalTasksLastHourBeforeJob = ttAllInfo[2];
  +int succeededTasksLastHourBeforeJob = ttAllInfo[3];

  much clearer and more readable.
* bad choice of class names: TestTaskTrackerInfoFirst as well as 
TestTaskTrackerInfoSecond
* some of the tests are commented out with //@Test. Please consider removing 
them altogether if they aren't going to be used. You'd better add them later 
in a separate JIRA.
* in the second test class, the waiting and log levels in waitForTTStop() and 
waitForTTStart() seem to be inconsistent.
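
To illustrate the object container suggestion, here is a minimal sketch. TaskTrackerStats and its fromArray adapter are hypothetical names that do not come from the patch; the field order follows the Javadoc sequence quoted above.

```java
// Hypothetical value object replacing the positional int[] result.
final class TaskTrackerStats {
  final int totalTasksSinceStart;
  final int succeededTasksSinceStart;
  final int totalTasksLastHour;
  final int succeededTasksLastHour;
  final int totalTasksLastDay;
  final int succeededTasksLastDay;

  TaskTrackerStats(int totalTasksSinceStart, int succeededTasksSinceStart,
      int totalTasksLastHour, int succeededTasksLastHour,
      int totalTasksLastDay, int succeededTasksLastDay) {
    this.totalTasksSinceStart = totalTasksSinceStart;
    this.succeededTasksSinceStart = succeededTasksSinceStart;
    this.totalTasksLastHour = totalTasksLastHour;
    this.succeededTasksLastHour = succeededTasksLastHour;
    this.totalTasksLastDay = totalTasksLastDay;
    this.succeededTasksLastDay = succeededTasksLastDay;
  }

  // Adapts the existing six-element array layout to named fields.
  static TaskTrackerStats fromArray(int[] ttAllInfo) {
    if (ttAllInfo == null || ttAllInfo.length != 6) {
      throw new IllegalArgumentException("Expected exactly 6 values");
    }
    return new TaskTrackerStats(ttAllInfo[0], ttAllInfo[1], ttAllInfo[2],
        ttAllInfo[3], ttAllInfo[4], ttAllInfo[5]);
  }
}

class TaskTrackerStatsDemo {
  public static void main(String[] args) {
    // Sample numbers are made up for the demonstration.
    TaskTrackerStats beforeJob =
        TaskTrackerStats.fromArray(new int[] {10, 9, 4, 4, 10, 9});
    System.out.println(beforeJob.succeededTasksLastHour);
  }
}
```

With a container like this, a test reads beforeJob.totalTasksLastHour instead of ttAllInfo[2], so a wrong index can no longer silently compare the wrong counters.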




> Create automated test scenario for "Collect information about number of tasks 
> succeeded / total per time unit for a tasktracker"
> 
>
> Key: MAPREDUCE-1871
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1871
> Project: Hadoop Map/Reduce
>  Issue Type: Test
>  Components: test
>Reporter: Iyappan Srinivasan
>Assignee: Iyappan Srinivasan
> Attachments: 1871-ydist-security-patch.txt, 
> 1871-ydist-security-patch.txt
>
>
> Create automated test scenario for "Collect information about number of tasks 
> succeeded / total per time unit for a tasktracker"
> 1) Verification of all the above mentioned fields with the specified TTs. 
> Total no. of tasks and successful tasks should be equal to the corresponding 
> no. of tasks specified in TTs logs
> 2)  Fail a task on tasktracker.  Node UI should update the status of tasks on 
> that TT accordingly. 
> 3)  Kill a task on tasktracker.  Node UI should update the status of tasks on 
> that TT accordingly
> 4) Positive Run simultaneous jobs and check if all the fields are populated 
> with proper values of tasks.  Node UI should have correct values for all the 
> fields mentioned above. 
> 5)  Check the fields across one hour window  Fields related to hour should be 
> updated after every hour
> 6) Check the fields across one day window  fields related to day should be 
> updated after every day
> 7) Restart a TT and bring it back.  UI should retain the fields values.  
> 8) Positive Run a bunch of jobs with 0 maps and 0 redu

[jira] Commented: (MAPREDUCE-1854) [herriot] Automate health script system test

2010-06-29 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883703#action_12883703
 ] 

Konstantin Boudnik commented on MAPREDUCE-1854:
---

- Consider lowering log levels in {{JobTrackerAspect.aj}}. I think there's no 
need to print clearly debug messages with *INFO* level.
- {{+   * @return int}} isn't very informative.
- this is very C-like programming style:
{noformat}
+  /**
+   * This gets the value of all task trackers windows in the tasktracker page.
+   *
+   * @param none,
+   * @return int[] of  all the tasks that ran, in the sequence given below
+   * "since_start", "total_tasks"
+   * "since_start","succeeded_tasks"
+   * "last_hour", "total_tasks"
+   * "last_hour", "succeeded_tasks"
+   * "last_day", "total_tasks"
+   * "last_day", "succeeded_tasks"
+   */
{noformat}
why don't you use an object container instead? It will make code like this 
{noformat}
+int totalTasksSinceStartBeforeJob = ttAllInfo[0];
+int succeededTasksSinceStartBeforeJob = ttAllInfo[1];
+int totalTasksLastHourBeforeJob = ttAllInfo[2];
+int succeededTasksLastHourBeforeJob = ttAllInfo[3];
{noformat}
much clearer and more readable.
- bad choice of class names: {{TestTaskTrackerInfoFirst}} as well as 
{{TestTaskTrackerInfoSecond}}
- some of the tests are commented out with {{//@Test}}. Please consider removing 
them altogether if they aren't going to be used. You'd better add them later 
in a separate JIRA.
- in the second test class, the waiting and log levels in {{waitForTTStop()}} and 
{{waitForTTStart()}} seem to be inconsistent.


> [herriot] Automate health script system test
> 
>
> Key: MAPREDUCE-1854
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1854
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: test
> Environment: Herriot framework
>Reporter: Balaji Rajagopalan
>Assignee: Balaji Rajagopalan
> Attachments: health_script_5.txt, health_script_7.txt
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> 1. There are three scenarios. The first is to induce an error from the health 
> script and verify that the task tracker is blacklisted. 
> 2. Make the health script timeout and verify the task tracker is blacklisted. 
> 3. Make an error in the health script path and make sure the task tracker 
> stays healthy. 




[jira] Commented: (MAPREDUCE-1854) [herriot] Automate health script system test

2010-06-29 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883707#action_12883707
 ] 

Konstantin Boudnik commented on MAPREDUCE-1854:
---

Please disregard my last comment - this was intended for MAPREDUCE-1871

> [herriot] Automate health script system test
> 
>
> Key: MAPREDUCE-1854
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1854
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: test
> Environment: Herriot framework
>Reporter: Balaji Rajagopalan
>Assignee: Balaji Rajagopalan
> Attachments: health_script_5.txt, health_script_7.txt
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> 1. There are three scenarios. The first is to induce an error from the health 
> script and verify that the task tracker is blacklisted. 
> 2. Make the health script timeout and verify the task tracker is blacklisted. 
> 3. Make an error in the health script path and make sure the task tracker 
> stays healthy. 




[jira] Commented: (MAPREDUCE-1827) [Herriot] Task Killing/Failing tests for a streaming job.

2010-06-29 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883697#action_12883697
 ] 

Konstantin Boudnik commented on MAPREDUCE-1827:
---

Please make a patch for trunk as well.

> [Herriot] Task Killing/Failing tests for a streaming job.
> -
>
> Key: MAPREDUCE-1827
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1827
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1827-ydist-security.patch, 1827-ydist-security.patch
>
>
> 1. Set the sleep time for the tasks to 3 seconds and kill the task of the 
> streaming job using SIGKILL. After that, verify whether the task is killed 
> after 3 seconds and whether the job succeeds.
> 2. Set the maximum attempts for the maps and reducers to one. Make the task 
> fail and verify whether the task has failed. Also verify whether the job 
> has failed.




[jira] Commented: (MAPREDUCE-1794) Test the job status of lost task trackers before and after the timeout.

2010-06-29 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883695#action_12883695
 ] 

Konstantin Boudnik commented on MAPREDUCE-1794:
---

Please provide the patch for trunk.

> Test the job status of lost task trackers before and after the timeout.
> ---
>
> Key: MAPREDUCE-1794
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1794
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1794-ydist-security.patch, 1794_lost_tasktracker.patch
>
>
> This test covers the following scenarios.
> 1. Verify whether the job succeeds when the task tracker is lost and comes 
> back alive before the timeout.
> 2. Verify whether the job succeeds and whether the killed attempts of a task 
> match when the task trackers are lost and time out for all four attempts 
> of a task. 




[jira] Commented: (MAPREDUCE-1882) Use Jsch instead of Shell.java

2010-06-29 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883691#action_12883691
 ] 

Konstantin Boudnik commented on MAPREDUCE-1882:
---

- The functionality seems to be common for MR and HDFS. Shall it be moved to 
Common instead?
- This 
{noformat}
+jsch.setKnownHosts("/homes/" + user + "/.ssh/known_hosts");
{noformat}
is questionable. 
- This one won't work if RSA identities are in use:
{noformat}
+jsch.addIdentity("/homes/" + user + "/.ssh/id_dsa");
{noformat}
- Where will Jsch be coming from? Ivy dependency resolution needs to be added 
as well. 
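
As a rough illustration of the two path concerns above, the sketch below derives the .ssh locations from a home directory instead of a hard-coded /homes/ prefix and picks whichever private key is present. SshPaths, its methods and the key preference order are assumptions for the example; the resulting paths would then be passed to the setKnownHosts and addIdentity calls quoted from the patch.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper; uses only the JDK, no Jsch classes.
class SshPaths {
  // Private key file names to try, in order of preference (an assumption).
  static final List<String> KEY_NAMES = Arrays.asList("id_rsa", "id_dsa");

  static Path sshDir(String userHome) {
    return Paths.get(userHome, ".ssh");
  }

  static Path knownHosts(String userHome) {
    return sshDir(userHome).resolve("known_hosts");
  }

  // Returns the first readable key file, or empty if none is found.
  static Optional<Path> firstIdentity(String userHome) {
    for (String name : KEY_NAMES) {
      Path candidate = sshDir(userHome).resolve(name);
      if (Files.isReadable(candidate)) {
        return Optional.of(candidate);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // user.home avoids assuming where home directories live on the host.
    String home = System.getProperty("user.home");
    System.out.println("known_hosts: " + knownHosts(home));
    System.out.println("identity:    "
        + firstIdentity(home).map(Path::toString).orElse("<none found>"));
  }
}
```

Making both locations configurable, with values like these as defaults, would be another option.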

> Use Jsch instead of Shell.java 
> ---
>
> Key: MAPREDUCE-1882
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1882
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: test
> Environment: herriot framework 
>Reporter: Balaji Rajagopalan
>Assignee: Iyappan Srinivasan
> Attachments: RemoteExecution.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> In herriot (hadoop system test case dev) we often resort to the habit of 
> using ssh to log in to a remote node, execute a shell command, and come out. 
> It is wiser to use Jsch instead of doing this through Shell.java (hadoop 
> code), since Jsch provides a nice Java abstraction. This JIRA will only be 
> closed after we import Jsch into the hadoop build system and also fix all 
> the existing test cases. 




[jira] Commented: (MAPREDUCE-1898) [Herriot] Implement a functionality for getting the job summary information of a job.

2010-06-29 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883688#action_12883688
 ] 

Konstantin Boudnik commented on MAPREDUCE-1898:
---

Actually, I don't like the fact that we use aspects to modify framework 
classes. {{JobInfoImpl}} is a Herriot class and can be modified at the source 
code level if needed. Using aspects for this is overkill.

> [Herriot] Implement a functionality for getting the job summary information 
> of a job.
> -
>
> Key: MAPREDUCE-1898
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1898
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1898-ydist-security.patch, 1898-ydist-security.patch
>
>
> Implement a method for getting the job summary details of a job. The job 
> summary should include:
> jobId, startTime, launchTime, finishTime, numMaps, numSlotsPerMap, 
> numReduces, numSlotsPerReduce, user, queue, status, mapSlotSeconds, 
> reduceSlotSeconds, clusterMapCapacity,clusterReduceCapacity.




[jira] Commented: (MAPREDUCE-1896) [Herriot] New property for multi user list.

2010-06-29 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883682#action_12883682
 ] 

Konstantin Boudnik commented on MAPREDUCE-1896:
---

Am I missing something? Why is this needed?
Besides, please don't use {{.txt}} extensions for the config files. Better to 
drop the extension altogether.


> [Herriot] New property for multi user list.
> ---
>
> Key: MAPREDUCE-1896
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1896
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: MAPREDUCE-1896.patch
>
>
> Adding new property for multi user list.




[jira] Updated: (MAPREDUCE-1888) Streaming overrides user given output key and value types.

2010-06-29 Thread Ravi Gummadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated MAPREDUCE-1888:


Attachment: 1888.v3.patch

Attaching new patch fixing TrApp.

> Streaming overrides user given output key and value types.
> --
>
> Key: MAPREDUCE-1888
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1888
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Affects Versions: 0.21.0
>Reporter: Amareshwari Sriramadasu
>Assignee: Ravi Gummadi
> Fix For: 0.22.0
>
> Attachments: 1888.patch, 1888.v1.patch, 1888.v2.patch, 1888.v3.patch
>
>
> The following code in StreamJob.java overrides user given output key and 
> value types.
> {code}
> idResolver.resolve(conf.get(StreamJobConfig.MAP_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.MAP_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setMapOutputKeyClass(idResolver.getOutputKeyClass());
> job.setMapOutputValueClass(idResolver.getOutputValueClass());
> 
> idResolver.resolve(conf.get(StreamJobConfig.REDUCE_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.REDUCE_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setOutputKeyClass(idResolver.getOutputKeyClass());
> job.setOutputValueClass(idResolver.getOutputValueClass());
> {code}




[jira] Updated: (MAPREDUCE-1888) Streaming overrides user given output key and value types.

2010-06-29 Thread Ravi Gummadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated MAPREDUCE-1888:


Status: Patch Available  (was: Open)

> Streaming overrides user given output key and value types.
> --
>
> Key: MAPREDUCE-1888
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1888
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Affects Versions: 0.21.0
>Reporter: Amareshwari Sriramadasu
>Assignee: Ravi Gummadi
> Fix For: 0.22.0
>
> Attachments: 1888.patch, 1888.v1.patch, 1888.v2.patch, 1888.v3.patch
>
>
> The following code in StreamJob.java overrides user given output key and 
> value types.
> {code}
> idResolver.resolve(conf.get(StreamJobConfig.MAP_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.MAP_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setMapOutputKeyClass(idResolver.getOutputKeyClass());
> job.setMapOutputValueClass(idResolver.getOutputValueClass());
> 
> idResolver.resolve(conf.get(StreamJobConfig.REDUCE_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.REDUCE_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setOutputKeyClass(idResolver.getOutputKeyClass());
> job.setOutputValueClass(idResolver.getOutputValueClass());
> {code}




[jira] Commented: (MAPREDUCE-1864) PipeMapRed.java has uninitialized members log_ and LOGNAME

2010-06-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883592#action_12883592
 ] 

Hadoop QA commented on MAPREDUCE-1864:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12447931/patch-1864.txt
  against trunk revision 958902.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/273/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/273/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/273/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/273/console

This message is automatically generated.

> PipeMapRed.java has uninitialized members log_ and LOGNAME 
> ---
>
> Key: MAPREDUCE-1864
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1864
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.22.0
>
> Attachments: patch-1864.txt
>
>
> PipeMapRed.java has members log_ and LOGNAME, which are never initialized and 
> they are used in code for logging in several places. 
> They should be removed and PipeMapRed should use commons LogFactory and Log 
> for logging. This would improve code maintainability.
> Also, as per [comment | 
> https://issues.apache.org/jira/browse/MAPREDUCE-1851?focusedCommentId=12878530&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12878530],
>  stream.joblog_ configuration property can be removed.




[jira] Commented: (MAPREDUCE-577) Duplicate Mapper input when using StreamXmlRecordReader

2010-06-29 Thread Ravi Gummadi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883507#action_12883507
 ] 

Ravi Gummadi commented on MAPREDUCE-577:


After merging, the file system block size is not updated properly. So I am 
adding a FileSystem.closeAll() call at the beginning of each test case. Will 
upload a patch soon.
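
The effect can be pictured with a small stand-in. FakeFs below is hypothetical and is not Hadoop's FileSystem; it only mimics the relevant behaviour, assuming that handles are cached per URI so a later request with a different block size gets the old instance back until the cache is cleared.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a cached file system handle.
final class FakeFs {
  private static final Map<String, FakeFs> CACHE = new HashMap<>();

  final long blockSize;

  private FakeFs(long blockSize) {
    this.blockSize = blockSize;
  }

  // Returns the cached instance for uri if one exists; requestedBlockSize
  // is honoured only when a new instance has to be created.
  static synchronized FakeFs get(String uri, long requestedBlockSize) {
    return CACHE.computeIfAbsent(uri, u -> new FakeFs(requestedBlockSize));
  }

  // Drops every cached instance so the next get() builds a fresh one.
  static synchronized void closeAll() {
    CACHE.clear();
  }
}

class FakeFsDemo {
  public static void main(String[] args) {
    FakeFs.get("file:///", 60);
    // Second test case asks for 80 but receives the cached handle.
    System.out.println(FakeFs.get("file:///", 80).blockSize); // prints 60
    FakeFs.closeAll();
    System.out.println(FakeFs.get("file:///", 80).blockSize); // prints 80
  }
}
```

Clearing the cache at the start of each test case means every case gets a handle built from its own settings.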

> Duplicate Mapper input when using StreamXmlRecordReader
> ---
>
> Key: MAPREDUCE-577
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-577
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
> Environment: HADOOP 0.17.0, Java 6.0
>Reporter: David Campbell
>Assignee: Ravi Gummadi
> Attachments: 0001-test-to-demonstrate-HADOOP-3484.patch, 
> 0002-patch-for-HADOOP-3484.patch, 577.20S.patch, 577.patch, 577.v1.patch, 
> 577.v2.patch, HADOOP-3484.combined.patch, HADOOP-3484.try3.patch
>
>
> I have an XML file with 93626 rows.  A row is marked by 
> I've confirmed this with grep and the Grep example program included with 
> HADOOP.
> Here is the grep example output.  93626   
> I've setup my job configuration as follows:   
> conf.set("stream.recordreader.class", 
> "org.apache.hadoop.streaming.StreamXmlRecordReader");
> conf.set("stream.recordreader.begin", "");
> conf.set("stream.recordreader.end", "");
> conf.setInputFormat(StreamInputFormat.class);
> I have a fairly simple test Mapper.
> Here's the map method.
>   public void map(Text key, Text value, OutputCollector 
> output, Reporter reporter) throws IOException {
> try {
> output.collect(totalWord, one);
> if (key != null && key.toString().indexOf("01852") != -1) {
> output.collect(new Text("01852"), one);
> }
> } catch (Exception ex) {
> Logger.getLogger(TestMapper.class.getName()).log(Level.SEVERE, 
> null, ex);
> System.out.println(value);
> }
> }
> For totalWord ("TOTAL"), I get:
> TOTAL 140850
> and for 01852 I get.
> 01852 86
> There are 43 instances of 01852 in the file.
> I have the following setting in my config.  
>conf.setNumMapTasks(1);
> I have a total of six machines in my cluster.
> If I run without this, the result is 12x the actual value, not 2x.
> Here's some info from the cluster web page.
> Maps  Reduces Total Submissions   Nodes   Map Task Capacity   Reduce 
> Task CapacityAvg. Tasks/Node
> 0 0   1   6   12  12  4.00
> I've also noticed something really strange in the job's output.  It looks 
> like it's starting over or redoing things.
> This was run using all six nodes and no limitations on map or reduce tasks.  
> I haven't seen this behavior in any other case.
> 08/06/03 10:50:35 INFO mapred.FileInputFormat: Total input paths to process : 
> 1
> 08/06/03 10:50:36 INFO mapred.JobClient: Running job: job_200806030916_0018
> 08/06/03 10:50:37 INFO mapred.JobClient:  map 0% reduce 0%
> 08/06/03 10:50:42 INFO mapred.JobClient:  map 2% reduce 0%
> 08/06/03 10:50:45 INFO mapred.JobClient:  map 12% reduce 0%
> 08/06/03 10:50:47 INFO mapred.JobClient:  map 31% reduce 0%
> 08/06/03 10:50:48 INFO mapred.JobClient:  map 49% reduce 0%
> 08/06/03 10:50:49 INFO mapred.JobClient:  map 68% reduce 0%
> 08/06/03 10:50:50 INFO mapred.JobClient:  map 100% reduce 0%
> 08/06/03 10:50:54 INFO mapred.JobClient:  map 87% reduce 0%
> 08/06/03 10:50:55 INFO mapred.JobClient:  map 100% reduce 0%
> 08/06/03 10:50:56 INFO mapred.JobClient:  map 0% reduce 0%
> 08/06/03 10:51:00 INFO mapred.JobClient:  map 0% reduce 1%
> 08/06/03 10:51:05 INFO mapred.JobClient:  map 28% reduce 2%
> 08/06/03 10:51:07 INFO mapred.JobClient:  map 80% reduce 4%
> 08/06/03 10:51:08 INFO mapred.JobClient:  map 100% reduce 4%
> 08/06/03 10:51:09 INFO mapred.JobClient:  map 100% reduce 7%
> 08/06/03 10:51:10 INFO mapred.JobClient:  map 90% reduce 9%
> 08/06/03 10:51:11 INFO mapred.JobClient:  map 100% reduce 9%
> 08/06/03 10:51:12 INFO mapred.JobClient:  map 100% reduce 11%
> 08/06/03 10:51:13 INFO mapred.JobClient:  map 90% reduce 11%
> 08/06/03 10:51:14 INFO mapred.JobClient:  map 97% reduce 11%
> 08/06/03 10:51:15 INFO mapred.JobClient:  map 63% reduce 11%
> 08/06/03 10:51:16 INFO mapred.JobClient:  map 48% reduce 11%
> 08/06/03 10:51:17 INFO mapred.JobClient:  map 21% reduce 11%
> 08/06/03 10:51:19 INFO mapred.JobClient:  map 0% reduce 11%
> 08/06/03 10:51:20 INFO mapred.JobClient:  map 15% reduce 12%
> 08/06/03 10:51:21 INFO mapred.JobClient:  map 27% reduce 13%
> 08/06/03 10:51:22 INFO mapred.JobClient:  map 67% reduce 13%
> 08/06/03 10:51:24 INFO mapred.JobClient:  map 22% reduce 16%
> 08/06/03 10:51:25 INFO mapred.JobClient:  map 46% reduce 16%
> 08/06/03 10:51:26 INFO mapred.JobClient:  map 70% reduce 16%
> 08/06/03 10:51:27

[jira] Commented: (MAPREDUCE-1888) Streaming overrides user given output key and value types.

2010-06-29 Thread Ravi Gummadi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883506#action_12883506
 ] 

Ravi Gummadi commented on MAPREDUCE-1888:
-

When used as a mapper, TrApp.java should not expect mapreduce_job_output_key_class 
and mapreduce_job_output_value_class to be Text.
Fixing TrApp.java so that it expects mapreduce_map_output_key_class and 
mapreduce_map_output_value_class instead.

> Streaming overrides user given output key and value types.
> --
>
> Key: MAPREDUCE-1888
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1888
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Affects Versions: 0.21.0
>Reporter: Amareshwari Sriramadasu
>Assignee: Ravi Gummadi
> Fix For: 0.22.0
>
> Attachments: 1888.patch, 1888.v1.patch, 1888.v2.patch
>
>
> The following code in StreamJob.java overrides user given output key and 
> value types.
> {code}
> idResolver.resolve(conf.get(StreamJobConfig.MAP_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.MAP_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setMapOutputKeyClass(idResolver.getOutputKeyClass());
> job.setMapOutputValueClass(idResolver.getOutputValueClass());
> 
> idResolver.resolve(conf.get(StreamJobConfig.REDUCE_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.REDUCE_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setOutputKeyClass(idResolver.getOutputKeyClass());
> job.setOutputValueClass(idResolver.getOutputValueClass());
> {code}
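
One way to avoid this kind of override, sketched under stated assumptions: apply the resolver's class only when the user has not already set one. The sketch uses a plain Map as a stand-in for the job configuration. The class name, the property name, and the setIfUnset helper are all hypothetical; this is not the Hadoop JobConf API and not the code from the attached patches.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a job configuration; NOT the Hadoop JobConf API.
final class OutputTypeDefaults {
    // Hypothetical property name, used only for illustration.
    static final String OUTPUT_KEY_CLASS = "example.output.key.class";

    /** Sets the property to resolverDefault only when the user left it unset. */
    static String setIfUnset(Map<String, String> conf, String name, String resolverDefault) {
        conf.putIfAbsent(name, resolverDefault);
        return conf.get(name);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // The user's explicit choice, which should survive.
        conf.put(OUTPUT_KEY_CLASS, "org.apache.hadoop.io.LongWritable");
        System.out.println(setIfUnset(conf, OUTPUT_KEY_CLASS, "org.apache.hadoop.io.Text"));
    }
}
```

With this shape, a value the user supplied is left alone and the resolver's default applies only to unset properties.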

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1888) Streaming overrides user given output key and value types.

2010-06-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883501#action_12883501
 ] 

Hadoop QA commented on MAPREDUCE-1888:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12448202/1888.v2.patch
  against trunk revision 958279.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 29 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/272/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/272/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/272/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/272/console

This message is automatically generated.

> Streaming overrides user given output key and value types.
> --
>
> Key: MAPREDUCE-1888
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1888
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Affects Versions: 0.21.0
>Reporter: Amareshwari Sriramadasu
>Assignee: Ravi Gummadi
> Fix For: 0.22.0
>
> Attachments: 1888.patch, 1888.v1.patch, 1888.v2.patch
>
>
> The following code in StreamJob.java overrides user given output key and 
> value types.
> {code}
> idResolver.resolve(conf.get(StreamJobConfig.MAP_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.MAP_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setMapOutputKeyClass(idResolver.getOutputKeyClass());
> job.setMapOutputValueClass(idResolver.getOutputValueClass());
> 
> idResolver.resolve(conf.get(StreamJobConfig.REDUCE_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.REDUCE_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setOutputKeyClass(idResolver.getOutputKeyClass());
> job.setOutputValueClass(idResolver.getOutputValueClass());
> {code}




[jira] Updated: (MAPREDUCE-1888) Streaming overrides user given output key and value types.

2010-06-29 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1888:
---

Status: Open  (was: Patch Available)

Canceling the patch to fix test failures.

> Streaming overrides user given output key and value types.
> --
>
> Key: MAPREDUCE-1888
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1888
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Affects Versions: 0.21.0
>Reporter: Amareshwari Sriramadasu
>Assignee: Ravi Gummadi
> Fix For: 0.22.0
>
> Attachments: 1888.patch, 1888.v1.patch, 1888.v2.patch
>
>
> The following code in StreamJob.java overrides user given output key and 
> value types.
> {code}
> idResolver.resolve(conf.get(StreamJobConfig.MAP_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.MAP_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setMapOutputKeyClass(idResolver.getOutputKeyClass());
> job.setMapOutputValueClass(idResolver.getOutputValueClass());
> 
> idResolver.resolve(conf.get(StreamJobConfig.REDUCE_OUTPUT,
> IdentifierResolver.TEXT_ID));
> conf.setClass(StreamJobConfig.REDUCE_OUTPUT_READER_CLASS,
>   idResolver.getOutputReaderClass(), OutputReader.class);
> job.setOutputKeyClass(idResolver.getOutputKeyClass());
> job.setOutputValueClass(idResolver.getOutputValueClass());
> {code}




[jira] Updated: (MAPREDUCE-1864) PipeMapRed.java has uninitialized members log_ and LOGNAME

2010-06-29 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1864:
---

Status: Patch Available  (was: Open)

> PipeMapRed.java has uninitialized members log_ and LOGNAME 
> ---
>
> Key: MAPREDUCE-1864
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1864
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.22.0
>
> Attachments: patch-1864.txt
>
>
> PipeMapRed.java has members log_ and LOGNAME, which are never initialized and 
> they are used in code for logging in several places. 
> They should be removed and PipeMapRed should use commons LogFactory and Log 
> for logging. This would improve code maintainability.
> Also, as per [comment | 
> https://issues.apache.org/jira/browse/MAPREDUCE-1851?focusedCommentId=12878530&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12878530],
>  stream.joblog_ configuration property can be removed.




[jira] Updated: (MAPREDUCE-1864) PipeMapRed.java has uninitialized members log_ and LOGNAME

2010-06-29 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1864:
---

Status: Open  (was: Patch Available)

> PipeMapRed.java has uninitialized members log_ and LOGNAME 
> ---
>
> Key: MAPREDUCE-1864
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1864
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: contrib/streaming
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.22.0
>
> Attachments: patch-1864.txt
>
>
> PipeMapRed.java has members log_ and LOGNAME, which are never initialized and 
> they are used in code for logging in several places. 
> They should be removed and PipeMapRed should use commons LogFactory and Log 
> for logging. This would improve code maintainability.
> Also, as per [comment | 
> https://issues.apache.org/jira/browse/MAPREDUCE-1851?focusedCommentId=12878530&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12878530],
>  stream.joblog_ configuration property can be removed.




[jira] Commented: (MAPREDUCE-1896) [Herriot] New property for multi user list.

2010-06-29 Thread Balaji Rajagopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883498#action_12883498
 ] 

Balaji Rajagopalan commented on MAPREDUCE-1896:
---

Link this to HADOOP-6839. +1 

> [Herriot] New property for multi user list.
> ---
>
> Key: MAPREDUCE-1896
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1896
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: MAPREDUCE-1896.patch
>
>
> Adding new property for multi user list.




[jira] Updated: (MAPREDUCE-1809) Ant build changes for Streaming system tests in contrib projects.

2010-06-29 Thread Vinay Kumar Thota (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Kumar Thota updated MAPREDUCE-1809:
-

Attachment: MAPREDUCE-1809.patch

Re-created the patch from the source location.

> Ant build changes for Streaming system tests in contrib projects.
> -
>
> Key: MAPREDUCE-1809
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1809
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: build
>Affects Versions: 0.21.0
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1809-ydist-security.patch, MAPREDUCE-1809.patch, 
> MAPREDUCE-1809.patch, MAPREDUCE-1809.patch, MAPREDUCE-1809.patch
>
>
> Implementing new target( test-system) in build-contrib.xml file for executing 
> the system test that are in contrib projects. Also adding 'subant'  target in 
> aop.xml that calls the build-contrib.xml file for system tests.




[jira] Commented: (MAPREDUCE-1898) [Herriot] Implement a functionality for getting the job summary information of a job.

2010-06-29 Thread Balaji Rajagopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883489#action_12883489
 ] 

Balaji Rajagopalan commented on MAPREDUCE-1898:
---

Serializing a comma-separated string does not look good; it is very error prone. 
Please check whether you can serialize a HashMap. The last time I asked someone 
about this, I was told to write my own Writable object. 
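
A minimal sketch of the alternative raised above: write the summary as length-prefixed key/value pairs instead of a delimiter-joined string, so values that contain commas survive. The class below only mirrors the write/readFields shape of a Hadoop Writable using java.io; JobSummaryFields and roundTrip are hypothetical names, and the class does not implement the Hadoop interface or reflect the attached patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder with a Writable-like shape; no Hadoop dependency.
final class JobSummaryFields {
    final Map<String, String> fields = new LinkedHashMap<>();

    /** Writes the entry count, then each key and value as UTF strings. */
    void write(DataOutput out) throws IOException {
        out.writeInt(fields.size());
        for (Map.Entry<String, String> e : fields.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeUTF(e.getValue());
        }
    }

    /** Replaces the current contents with the entries read from the stream. */
    void readFields(DataInput in) throws IOException {
        fields.clear();
        int n = in.readInt();
        for (int i = 0; i < n; i++) {
            String key = in.readUTF();
            fields.put(key, in.readUTF());
        }
    }

    /** Serializes src to bytes and reads it back into a fresh instance. */
    static JobSummaryFields roundTrip(JobSummaryFields src) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            src.write(new DataOutputStream(buf));
            JobSummaryFields copy = new JobSummaryFields();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because every string carries its own length, no escaping of separator characters is needed.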


> [Herriot] Implement a functionality for getting the job summary information 
> of a job.
> -
>
> Key: MAPREDUCE-1898
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1898
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1898-ydist-security.patch, 1898-ydist-security.patch
>
>
> Implement a method for getting the job summary details of a job. The job 
> summary should be.
> jobId, startTime, launchTime, finishTime, numMaps, numSlotsPerMap, 
> numReduces, numSlotsPerReduce, user, queue, status, mapSlotSeconds, 
> reduceSlotSeconds, clusterMapCapacity,clusterReduceCapacity.




[jira] Updated: (MAPREDUCE-1850) Include job submit host information (name and ip) in jobconf and jobdetails display

2010-06-29 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1850:
---

Fix Version/s: 0.22.0

> Include job submit host information (name and ip) in jobconf and jobdetails 
> display
> ---
>
> Key: MAPREDUCE-1850
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1850
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.22.0
>Reporter: Krishna Ramachandran
>Assignee: Krishna Ramachandran
> Fix For: 0.22.0
>
> Attachments: mapred-1850-1.patch, mapred-1850-2.patch, 
> mapred-1850-3.patch, mapred-1850-4.patch, mapred-1850-5.patch, 
> mapred-1850.patch, mapred-1850.patch
>
>
> Enhancement to identify the source (submit host and ip) of a job request. 




[jira] Updated: (MAPREDUCE-1850) Include job submit host information (name and ip) in jobconf and jobdetails display

2010-06-29 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated MAPREDUCE-1850:
---

  Status: Resolved  (was: Patch Available)
Hadoop Flags: [Reviewed]
  Resolution: Fixed

Test failure is because of MAPREDUCE-1834.

I just committed this. Thanks, Krishna!

> Include job submit host information (name and ip) in jobconf and jobdetails 
> display
> ---
>
> Key: MAPREDUCE-1850
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1850
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.22.0
>Reporter: Krishna Ramachandran
>Assignee: Krishna Ramachandran
> Attachments: mapred-1850-1.patch, mapred-1850-2.patch, 
> mapred-1850-3.patch, mapred-1850-4.patch, mapred-1850-5.patch, 
> mapred-1850.patch, mapred-1850.patch
>
>
> Enhancement to identify the source (submit host and ip) of a job request. 




[jira] Updated: (MAPREDUCE-1898) [Herriot] Implement a functionality for getting the job summary information of a job.

2010-06-29 Thread Vinay Kumar Thota (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Kumar Thota updated MAPREDUCE-1898:
-

Attachment: 1898-ydist-security.patch

Addressed Iyappan's comments.

> [Herriot] Implement a functionality for getting the job summary information 
> of a job.
> -
>
> Key: MAPREDUCE-1898
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1898
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1898-ydist-security.patch, 1898-ydist-security.patch
>
>
> Implement a method for getting the job summary details of a job. The job 
> summary should be.
> jobId, startTime, launchTime, finishTime, numMaps, numSlotsPerMap, 
> numReduces, numSlotsPerReduce, user, queue, status, mapSlotSeconds, 
> reduceSlotSeconds, clusterMapCapacity,clusterReduceCapacity.




[jira] Commented: (MAPREDUCE-1548) Hadoop archives should be able to preserve times and other properties from original files

2010-06-29 Thread Rodrigo Schmidt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883482#action_12883482
 ] 

Rodrigo Schmidt commented on MAPREDUCE-1548:


Yes, if this proposal sounds good to you, I think it's ready for review ;)

> Hadoop archives should be able to preserve times and other properties from 
> original files
> -
>
> Key: MAPREDUCE-1548
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1548
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: harchive
>Affects Versions: 0.22.0
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1548.0.patch
>
>
> Files inside hadoop archives don't keep their original:
> - modification time
> - access time
> - permission
> - owner
> - group
> all such properties are currently taken from the file storing the archive 
> index, and not the stored files. This doesn't look very correct.
> There should be possible to preserve the original properties of the stored 
> files.




[jira] Updated: (MAPREDUCE-1899) [Herriot] Test jobsummary information for different jobs.

2010-06-29 Thread Vinay Kumar Thota (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Kumar Thota updated MAPREDUCE-1899:
-

Attachment: 1899-ydist-security.patch

Addressed Iyappan's comments.

> [Herriot] Test jobsummary information for different jobs.
> -
>
> Key: MAPREDUCE-1899
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1899
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1899-ydist-security.patch, 1899-ydist-security.patch
>
>
> Test the following scenarios.
> 1. Verify the job summary information for killed job.
> 2. Verify the job summary information for failed job.
> 3. Verify the job queue information in job summary after job has successfully 
> completed.
> 4. Verify the job summary information for high ram jobs.




[jira] Commented: (MAPREDUCE-1899) [Herriot] Test jobsummary information for different jobs.

2010-06-29 Thread Vinay Kumar Thota (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883473#action_12883473
 ] 

Vinay Kumar Thota commented on MAPREDUCE-1899:
--

bq. 2) FailMapper is not needed to be defined separately as there is a 
FailedMapper class already present in GenerateTaskChildProcess.java

FailedMapper is different from FailMapper. FailedMapper creates a child process 
for a mapper and then performs the failing actions, whereas FailMapper is a 
simple mapper that throws an exception and does nothing else.

bq. + while (!jInfo.getStatus().isJobComplete()) { + 
UtilsForTests.waitFor(100); + jInfo = remoteJTClient.getJobInfo(jobId); + } 
I will use the building block.




> [Herriot] Test jobsummary information for different jobs.
> -
>
> Key: MAPREDUCE-1899
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1899
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1899-ydist-security.patch
>
>
> Test the following scenarios.
> 1. Verify the job summary information for killed job.
> 2. Verify the job summary information for failed job.
> 3. Verify the job queue information in job summary after job has successfully 
> completed.
> 4. Verify the job summary information for high ram jobs.




[jira] Updated: (MAPREDUCE-1865) [Rumen] Rumen should also support jobhistory files generated using trunk

2010-06-29 Thread Amar Kamat (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-1865:
--

Attachment: mapreduce-1865-v1.6.2.patch

Attaching a patch incorporating Dick's comments. Changes are as follows:
- Testcase changes to run a job using MiniMRCluster and then test Rumen parsing 
against that, rather than relying on a hard-coded jobhistory file.
- Used JobID.compare instead of stringified JobIDs. 

Test-patch and ant-test passed on my box.
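
On the job ID comparison change mentioned above, a small sketch of why comparing stringified IDs can misorder jobs: once the sequence number outgrows its zero padding, lexicographic order and numeric order disagree. JobIdOrder is a hypothetical helper written for this illustration; it is not Hadoop's JobID class, and it assumes IDs of the form job_<trackerId>_<sequence>.

```java
// Hypothetical helper; NOT Hadoop's JobID class.
final class JobIdOrder {
    /** Compares IDs of the form job_<trackerId>_<sequence>, ordering the sequence numerically. */
    static int compare(String a, String b) {
        String[] pa = a.split("_");
        String[] pb = b.split("_");
        // Tracker identifiers are compared as strings first.
        int byTracker = pa[1].compareTo(pb[1]);
        if (byTracker != 0) {
            return byTracker;
        }
        // Sequence numbers are compared as integers, not as text.
        return Integer.compare(Integer.parseInt(pa[2]), Integer.parseInt(pb[2]));
    }
}
```

For example, job_200806030916_9999 sorts after job_200806030916_10000 as a plain string, while the numeric comparison places it before.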

> [Rumen] Rumen should also support jobhistory files generated using trunk
> 
>
> Key: MAPREDUCE-1865
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1865
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Amar Kamat
>Assignee: Amar Kamat
> Fix For: 0.22.0
>
> Attachments: mapreduce-1865-v1.2.patch, mapreduce-1865-v1.6.2.patch
>
>
> Rumen code in trunk parses and process only jobhistory files from pre-21 
> hadoop mapreduce clusters. It should also support jobhistory files generated 
> using trunk.




[jira] Commented: (MAPREDUCE-1898) [Herriot] Implement a functionality for getting the job summary information of a job.

2010-06-29 Thread Iyappan Srinivasan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883469#action_12883469
 ] 

Iyappan Srinivasan commented on MAPREDUCE-1898:
---

1) numSoltsPerMap - typo. Correct it to numSlotsPerMap

2) numSoltsPerReduce - typo. Correct it to numSlotsPerReduce

3) JobTracker.getJobSummaryFromLogs - In @param section, add the second 
parameter.


> [Herriot] Implement a functionality for getting the job summary information 
> of a job.
> -
>
> Key: MAPREDUCE-1898
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1898
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1898-ydist-security.patch
>
>
> Implement a method for getting the job summary details of a job. The job 
> summary should be.
> jobId, startTime, launchTime, finishTime, numMaps, numSlotsPerMap, 
> numReduces, numSlotsPerReduce, user, queue, status, mapSlotSeconds, 
> reduceSlotSeconds, clusterMapCapacity,clusterReduceCapacity.




[jira] Commented: (MAPREDUCE-1899) [Herriot] Test jobsummary information for different jobs.

2010-06-29 Thread Balaji Rajagopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883460#action_12883460
 ] 

Balaji Rajagopalan commented on MAPREDUCE-1899:
---

+while (!jInfo.getStatus().isJobComplete()) {
+  UtilsForTests.waitFor(100);
+  jInfo = remoteJTClient.getJobInfo(jobId);
+}

Please use the building block that I will check in shortly for the above code. 

I agree with Iyappan: testjar already has FailedMapper classes that can be 
reused instead of writing your own. 
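
The polling loop quoted above could be factored into a reusable helper along these lines. This is only a sketch of the idea; the building block referred to in this comment is not shown in the thread, so WaitUtil, waitUntil, and the added timeout parameter are assumptions made for illustration.

```java
import java.util.function.BooleanSupplier;

// Hypothetical polling helper; not the building block from the Herriot patches.
final class WaitUtil {
    /**
     * Polls the condition every intervalMs until it holds or timeoutMs elapses.
     * Returns true if the condition became true, false on timeout or interrupt.
     */
    static boolean waitUntil(BooleanSupplier condition, long intervalMs, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

A test would pass a lambda that refreshes the job info and checks completion, and would fail the test if the helper returns false. Unlike the quoted loop, the timeout keeps a stuck job from hanging the test run indefinitely.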



> [Herriot] Test jobsummary information for different jobs.
> -
>
> Key: MAPREDUCE-1899
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1899
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1899-ydist-security.patch
>
>
> Test the following scenarios.
> 1. Verify the job summary information for killed job.
> 2. Verify the job summary information for failed job.
> 3. Verify the job queue information in job summary after job has successfully 
> completed.
> 4. Verify the job summary information for high ram jobs.




[jira] Commented: (MAPREDUCE-1899) [Herriot] Test jobsummary information for different jobs.

2010-06-29 Thread Iyappan Srinivasan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883454#action_12883454
 ] 

Iyappan Srinivasan commented on MAPREDUCE-1899:
---


1) verfiyJobSummaryInfo method - typo. Change "verfiy" to "verify". Correct it 
at the call sites too.

2) FailMapper is not needed to be defined separately as there is a 
FailedMapper class already present in GenerateTaskChildProcess.java

3) testJobQueueInfoInJobSummary - Make sure that the comment above the 
testcase states that the job is submitted from a different queue.

4) testJobSummaryInfoOfHighMemoryJob - Change the comment to say "high RAM" 
instead of "high ram", just to be clear.

5) testJobSummaryInfoOfFailedJob - 
+Assert.assertEquals("Job has not been succeeded", 
- This comment should be "Job has not failed as expected. but has succeeded"

> [Herriot] Test jobsummary information for different jobs.
> -
>
> Key: MAPREDUCE-1899
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1899
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1899-ydist-security.patch
>
>
> Test the following scenarios.
> 1. Verify the job summary information for killed job.
> 2. Verify the job summary information for failed job.
> 3. Verify the job queue information in job summary after job has successfully 
> completed.
> 4. Verify the job summary information for high ram jobs.




[jira] Commented: (MAPREDUCE-1794) Test the job status of lost task trackers before and after the timeout.

2010-06-29 Thread Iyappan Srinivasan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883446#action_12883446
 ] 

Iyappan Srinivasan commented on MAPREDUCE-1794:
---

+1 from me.

> Test the job status of lost task trackers before and after the timeout.
> ---
>
> Key: MAPREDUCE-1794
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1794
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1794-ydist-security.patch, 1794_lost_tasktracker.patch
>
>
> This test covers the following scenarios.
> 1. Verify the job status whether it is succeeded or not when  the task 
> tracker is lost and alive before the timeout.
> 2. Verify the job status and killed attempts of a task whether it is 
> succeeded or not and killed attempts are matched or not  when the task 
> trackers are lost and it timeout for all the four attempts of a task. 




[jira] Updated: (MAPREDUCE-1794) Test the job status of lost task trackers before and after the timeout.

2010-06-29 Thread Vinay Kumar Thota (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Kumar Thota updated MAPREDUCE-1794:
-

Attachment: 1794-ydist-security.patch

Addressed Iyappan's comment.

> Test the job status of lost task trackers before and after the timeout.
> ---
>
> Key: MAPREDUCE-1794
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1794
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1794-ydist-security.patch, 1794_lost_tasktracker.patch
>
>
> This test covers the following scenarios.
> 1. Verify the job status whether it is succeeded or not when  the task 
> tracker is lost and alive before the timeout.
> 2. Verify the job status and killed attempts of a task whether it is 
> succeeded or not and killed attempts are matched or not  when the task 
> trackers are lost and it timeout for all the four attempts of a task. 




[jira] Updated: (MAPREDUCE-1809) Ant build changes for Streaming system tests in contrib projects.

2010-06-29 Thread Vinay Kumar Thota (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Kumar Thota updated MAPREDUCE-1809:
-

Attachment: 1809-ydist-security.patch

> Ant build changes for Streaming system tests in contrib projects.
> -
>
> Key: MAPREDUCE-1809
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1809
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: build
>Affects Versions: 0.21.0
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1809-ydist-security.patch, MAPREDUCE-1809.patch, 
> MAPREDUCE-1809.patch, MAPREDUCE-1809.patch
>
>
> Implementing new target( test-system) in build-contrib.xml file for executing 
> the system test that are in contrib projects. Also adding 'subant'  target in 
> aop.xml that calls the build-contrib.xml file for system tests.




[jira] Updated: (MAPREDUCE-1809) Ant build changes for Streaming system tests in contrib projects.

2010-06-29 Thread Vinay Kumar Thota (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Kumar Thota updated MAPREDUCE-1809:
-

Attachment: (was: 1809-ydist-security.patch)

> Ant build changes for Streaming system tests in contrib projects.
> -
>
> Key: MAPREDUCE-1809
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1809
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: build
>Affects Versions: 0.21.0
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1809-ydist-security.patch, MAPREDUCE-1809.patch, 
> MAPREDUCE-1809.patch, MAPREDUCE-1809.patch
>
>
> Implementing new target( test-system) in build-contrib.xml file for executing 
> the system test that are in contrib projects. Also adding 'subant'  target in 
> aop.xml that calls the build-contrib.xml file for system tests.




[jira] Commented: (MAPREDUCE-1794) Test the job status of lost task trackers before and after the timeout.

2010-06-29 Thread Vinay Kumar Thota (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883442#action_12883442
 ] 

Vinay Kumar Thota commented on MAPREDUCE-1794:
--

{quote}
1) Since there is a change in restartClusterWithNewConfig now,
Hashtable prop should be used.
{quote}
It is now implemented in a generic fashion, so no change is required in the 
test; it accepts any type of data.

bq. 2) import java.util.Collection; - is not used anywhere.
Removed the statement.

bq. 3) cleanup should be present in AfterClass also, just in case testcase 
fails in the middle.

Cleanup is already there in AfterClass.




> Test the job status of lost task trackers before and after the timeout.
> ---
>
> Key: MAPREDUCE-1794
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1794
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: 1794_lost_tasktracker.patch
>
>
> This test covers the following scenarios.
> 1. Verify the job status whether it is succeeded or not when  the task 
> tracker is lost and alive before the timeout.
> 2. Verify the job status and killed attempts of a task whether it is 
> succeeded or not and killed attempts are matched or not  when the task 
> trackers are lost and it timeout for all the four attempts of a task. 
