[jira] Commented: (MAPREDUCE-1674) Some new features for CapacityTaskScheduler

2010-04-05 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853762#action_12853762
 ] 

Hemanth Yamijala commented on MAPREDUCE-1674:
-

Since Hadoop 0.21, the capacity scheduler has changed a good deal, primarily 
to support hierarchical queues. Because 'Affects version' is set to an 
earlier version, I wanted to call this out so you are aware.

> Some new features for CapacityTaskScheduler
> ---
>
> Key: MAPREDUCE-1674
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1674
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/capacity-sched
>Affects Versions: 0.20.2
>Reporter: Xiao Kang
>
> Some new features for CapacityTaskScheduler developed at Baidu.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1594) Support for Sleep Jobs in gridmix

2010-04-05 Thread rahul k singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853760#action_12853760
 ] 

rahul k singh commented on MAPREDUCE-1594:
--

- SleepJob hacks GridmixKey to pass along the sleep duration from map tasks to 
reduce tasks. We should document that in the code and file a jira to fix it.

Documented in the code and opened the jira 
https://issues.apache.org/jira/browse/MAPREDUCE-1675

> Support for Sleep Jobs in gridmix
> -
>
> Key: MAPREDUCE-1594
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1594
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/gridmix
>Reporter: rahul k singh
> Attachments: 1376-5-yhadoop20-100-3.patch, 
> 1594-yhadoop-20-1xx-1-2.patch, 1594-yhadoop-20-1xx-1-3.patch, 
> 1594-yhadoop-20-1xx-1-4.patch, 1594-yhadoop-20-1xx-1.patch, 
> 1594-yhadoop-20-1xx.patch
>
>
> Support for Sleep jobs in gridmix




[jira] Created: (MAPREDUCE-1675) SleepJob hacks GridmixKey to pass along the sleep duration from map tasks to reduce tasks.

2010-04-05 Thread rahul k singh (JIRA)
SleepJob hacks GridmixKey to pass along the sleep duration from map tasks to 
reduce tasks.
--

 Key: MAPREDUCE-1675
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1675
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/gridmix
Reporter: rahul k singh


SleepJob hacks GridmixKey to pass along the sleep duration from map tasks to 
reduce tasks. We need to come up with a cleaner solution.




[jira] Commented: (MAPREDUCE-1594) Support for Sleep Jobs in gridmix

2010-04-05 Thread rahul k singh (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853759#action_12853759
 ] 

rahul k singh commented on MAPREDUCE-1594:
--

I have addressed all the comments except a few:

- GenerateData should extend from GridmixJob instead of LoadJob. I think we can 
have a default implementation of buildSplits (as an empty function) in 
GridmixJob and remove the "abstract" keyword.

GenerateData now extends GridmixJob. But GridmixJob is still abstract because 
the call() method is abstract; it is implemented by all the derived classes.

- Avoiding directly setting "gridmix.job.seq" in both LoadJob and SleepJob. 
Instead, refactor the statement to a common method in GridmixJob called 
setSeqId(Job job). Similarly, adding a method getSeqId(Job job) in GridmixJob 
and avoid directly calling conf.get("gridmix.job.seq", -1) in 
{GridmixInputFormat, SleepInputFormat}.getSplits(...).

getSeqId has not been added, because {GridmixInputFormat, 
SleepInputFormat}.getSplits(...) belongs to static inner classes, which can 
only access static methods.

- I cannot find anywhere outdir is used by SleepJob. Did you encounter an error 
if FOF.setOutputPath is commented out in SleepJob.call()?
Removed this code and tested; things work fine.

- Both SleepJob and GridmixJob calls FileInputFormat.addInputPath(job, new 
path("ignored")), but one is surrounded with a try-catch block and the other is 
not. Not sure why. I am also curious to know what would be the error if 
FIF.addInputPath is not called in both classes 

I have removed FIF.addInputPath; things are working fine on the cluster. I have 
also removed the try-catch block and added the exception to the signature of 
call().
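
For readers following the getSeqId discussion, here is a minimal sketch of how static helpers could be reached from a static nested class. It is not the code from the patch: `GridmixJobSketch` and `InputFormatSketch` are hypothetical names, and `java.util.Properties` stands in for Hadoop's `Configuration` so the example compiles without Hadoop on the classpath.

```java
import java.util.Properties;

// Hypothetical stand-in for GridmixJob; Properties replaces Hadoop's
// Configuration purely so this sketch is self-contained.
abstract class GridmixJobSketch {
    static final String SEQ_KEY = "gridmix.job.seq";

    // Static, so no enclosing instance is needed to call it.
    static void setSeqId(Properties conf, int seq) {
        conf.setProperty(SEQ_KEY, Integer.toString(seq));
    }

    // Returns -1 when the sequence id has not been set.
    static int getSeqId(Properties conf) {
        return Integer.parseInt(conf.getProperty(SEQ_KEY, "-1"));
    }

    // Stand-in for an input format declared as a static nested class.
    static class InputFormatSketch {
        int seqForSplits(Properties conf) {
            // A static nested class can call the outer class's static methods.
            return getSeqId(conf);
        }
    }
}
```

Under that assumption, the key string lives in one place and neither input format needs to read it directly.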

> Support for Sleep Jobs in gridmix
> -
>
> Key: MAPREDUCE-1594
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1594
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/gridmix
>Reporter: rahul k singh
> Attachments: 1376-5-yhadoop20-100-3.patch, 
> 1594-yhadoop-20-1xx-1-2.patch, 1594-yhadoop-20-1xx-1-3.patch, 
> 1594-yhadoop-20-1xx-1-4.patch, 1594-yhadoop-20-1xx-1.patch, 
> 1594-yhadoop-20-1xx.patch
>
>
> Support for Sleep jobs in gridmix




[jira] Updated: (MAPREDUCE-1594) Support for Sleep Jobs in gridmix

2010-04-05 Thread rahul k singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

rahul k singh updated MAPREDUCE-1594:
-

Attachment: 1594-yhadoop-20-1xx-1-4.patch

> Support for Sleep Jobs in gridmix
> -
>
> Key: MAPREDUCE-1594
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1594
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/gridmix
>Reporter: rahul k singh
> Attachments: 1376-5-yhadoop20-100-3.patch, 
> 1594-yhadoop-20-1xx-1-2.patch, 1594-yhadoop-20-1xx-1-3.patch, 
> 1594-yhadoop-20-1xx-1-4.patch, 1594-yhadoop-20-1xx-1.patch, 
> 1594-yhadoop-20-1xx.patch
>
>
> Support for Sleep jobs in gridmix




[jira] Commented: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-04-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853727#action_12853727
 ] 

Hadoop QA commented on MAPREDUCE-1585:
--

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12440747/MAPREDUCE-1585.2.patch
  against trunk revision 930423.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/93/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/93/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/93/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/93/console

This message is automatically generated.

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.2.patch, 
> MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URLEncode filenames inside the index file (version 2, 
> refers to HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591
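
To illustrate the proposal in the description above, here is a minimal sketch of URL-encoding a file name so that no literal space remains, and decoding it back. `HarNameCodecSketch` is a hypothetical helper, not the code in the attached patches, and the sketch makes no assumption about the archive index layout. It uses only `java.net.URLEncoder` and `java.net.URLDecoder` from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper for illustration only.
final class HarNameCodecSketch {
    private HarNameCodecSketch() {}

    // Encodes a single file name; URLEncoder turns a space into '+'.
    static String encode(String name) {
        try {
            return URLEncoder.encode(name, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    // Reverses encode(), restoring the original file name.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For example, `encode("my file.txt")` yields `my+file.txt`, and `decode` restores the original name. Note that `URLEncoder` also escapes `/`, so a sketch like this would be applied to individual name components rather than whole paths.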




[jira] Commented: (MAPREDUCE-1674) Some new features for CapacityTaskScheduler

2010-04-05 Thread Xiao Kang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853723#action_12853723
 ] 

Xiao Kang commented on MAPREDUCE-1674:
--

Creating a sub-task for each feature may be more convenient for discussion.

> Some new features for CapacityTaskScheduler
> ---
>
> Key: MAPREDUCE-1674
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1674
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/capacity-sched
>Affects Versions: 0.20.2
>Reporter: Xiao Kang
>
> Some new features for CapacityTaskScheduler developed at Baidu.




[jira] Commented: (MAPREDUCE-1674) Some new features for CapacityTaskScheduler

2010-04-05 Thread Xiao Kang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853721#action_12853721
 ] 

Xiao Kang commented on MAPREDUCE-1674:
--

Sorry, wrong wiki syntax used; corrected here:

# *support queue priority* : higher-priority queues get capacity first
# *support queue refresh* : reload queue configuration at run time, including 
changing queue properties and adding/removing queues
# *queue over capacity controlled* : add a queue property to indicate whether 
this queue can use more slots than its configured capacity
# *separate web page for capacity-scheduler* : just as for fair-scheduler, a 
new web page dedicated to capacity-scheduler is added
# *job capacity* : allow the user to specify per-job map/reduce capacity at 
both the cluster and tasktracker level
# *job initialization* : do not initialize a job if it will not get a chance to 
run once initialized
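
As a rough illustration of the first item (higher-priority queues get capacity first), the sketch below hands free slots to queues in descending priority order. Everything in it is an assumption made for illustration: the names `QueuePrioritySketch` and `QueueInfo`, the convention that a larger number means higher priority, and the simple greedy hand-out. It is not the CapacityTaskScheduler code developed at Baidu.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model; not CapacityTaskScheduler code.
final class QueuePrioritySketch {
    static final class QueueInfo {
        final String name;
        final int priority; // assumption: larger value means higher priority
        final int demand;   // slots the queue could use right now

        QueueInfo(String name, int priority, int demand) {
            this.name = name;
            this.priority = priority;
            this.demand = demand;
        }
    }

    // Offers free slots to queues in descending priority order.
    static Map<String, Integer> assign(List<QueueInfo> queues, int freeSlots) {
        List<QueueInfo> ordered = new ArrayList<QueueInfo>(queues);
        ordered.sort(Comparator.comparingInt((QueueInfo q) -> q.priority).reversed());
        Map<String, Integer> granted = new LinkedHashMap<String, Integer>();
        int remaining = freeSlots;
        for (QueueInfo q : ordered) {
            int give = Math.min(q.demand, remaining);
            granted.put(q.name, give);
            remaining -= give;
        }
        return granted;
    }
}
```

With 10 free slots, a queue of priority 10 wanting 8 slots and a queue of priority 1 wanting 5 slots, the first is granted 8 and the second the remaining 2.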


> Some new features for CapacityTaskScheduler
> ---
>
> Key: MAPREDUCE-1674
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1674
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/capacity-sched
>Affects Versions: 0.20.2
>Reporter: Xiao Kang
>
> Some new features for CapacityTaskScheduler developed at Baidu.




[jira] Commented: (MAPREDUCE-1674) Some new features for CapacityTaskScheduler

2010-04-05 Thread Xiao Kang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853720#action_12853720
 ] 

Xiao Kang commented on MAPREDUCE-1674:
--

Details as follows:

# support queue priority : higher-priority queues get capacity first

# support queue refresh : reload queue configuration at run time, including 
changing queue properties and adding/removing queues

# queue over capacity controlled : add a queue property to indicate whether 
this queue can use more slots than its configured capacity

# separate web page for capacity-scheduler : just as for fair-scheduler, a new 
web page dedicated to capacity-scheduler is added

# job capacity : allow the user to specify per-job map/reduce capacity at both 
the cluster and tasktracker level

# job initialization : do not initialize a job if it will not get a chance to 
run once initialized


> Some new features for CapacityTaskScheduler
> ---
>
> Key: MAPREDUCE-1674
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1674
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/capacity-sched
>Affects Versions: 0.20.2
>Reporter: Xiao Kang
>
> Some new features for CapacityTaskScheduler developed at Baidu.




[jira] Commented: (MAPREDUCE-1674) Some new features for CapacityTaskScheduler

2010-04-05 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853719#action_12853719
 ] 

Arun C Murthy commented on MAPREDUCE-1674:
--

More details please? *smile*

> Some new features for CapacityTaskScheduler
> ---
>
> Key: MAPREDUCE-1674
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1674
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/capacity-sched
>Affects Versions: 0.20.2
>Reporter: Xiao Kang
>
> Some new features for CapacityTaskScheduler developed at Baidu.




[jira] Commented: (MAPREDUCE-1671) Test scenario for "Killing Task Attempt id till job fails"

2010-04-05 Thread Ravi Gummadi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853718#action_12853718
 ] 

Ravi Gummadi commented on MAPREDUCE-1671:
-

mapred.max.tracker.failures is not the config that makes the job fail. 
Shouldn't you use mapreduce.map.maxattempts and mapreduce.reduce.maxattempts 
instead?

> Test scenario for "Killing Task Attempt id till job fails"
> --
>
> Key: MAPREDUCE-1671
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1671
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Iyappan Srinivasan
>Assignee: Iyappan Srinivasan
> Attachments: TestTaskKilling.patch
>
>
> 1) In a job, kill the task attempt id of one task. Whenever that task tries 
> to run again with another task attempt id, up to mapred.max.tracker.failures 
> times, kill that task attempt id. After mapred.max.tracker.failures 
> attempts, the whole job should get killed.




[jira] Updated: (MAPREDUCE-1673) Start and Stop scripts for the RaidNode

2010-04-05 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1673:
---

Status: Patch Available  (was: Open)

> Start and Stop scripts for the RaidNode
> ---
>
> Key: MAPREDUCE-1673
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1673
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1673.1.patch, MAPREDUCE-1673.patch
>
>
> We should have scripts that start and stop the RaidNode automatically. 
> Something like start-raidnode.sh and stop-raidnode.sh




[jira] Updated: (MAPREDUCE-1673) Start and Stop scripts for the RaidNode

2010-04-05 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1673:
---

Attachment: MAPREDUCE-1673.1.patch

Comments were not correct

> Start and Stop scripts for the RaidNode
> ---
>
> Key: MAPREDUCE-1673
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1673
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1673.1.patch, MAPREDUCE-1673.patch
>
>
> We should have scripts that start and stop the RaidNode automatically. 
> Something like start-raidnode.sh and stop-raidnode.sh




[jira] Updated: (MAPREDUCE-1673) Start and Stop scripts for the RaidNode

2010-04-05 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1673:
---

Status: Open  (was: Patch Available)

> Start and Stop scripts for the RaidNode
> ---
>
> Key: MAPREDUCE-1673
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1673
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1673.patch
>
>
> We should have scripts that start and stop the RaidNode automatically. 
> Something like start-raidnode.sh and stop-raidnode.sh




[jira] Created: (MAPREDUCE-1674) Some new features for CapacityTaskScheduler

2010-04-05 Thread Xiao Kang (JIRA)
Some new features for CapacityTaskScheduler
---

 Key: MAPREDUCE-1674
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1674
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: contrib/capacity-sched
Affects Versions: 0.20.2
Reporter: Xiao Kang


Some new features for CapacityTaskScheduler developed at Baidu.




[jira] Commented: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-04-05 Thread Rodrigo Schmidt (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853617#action_12853617
 ] 

Rodrigo Schmidt commented on MAPREDUCE-1585:


I looked at the test output and it seems to be unrelated to this patch 
(java.lang.NoClassDefFoundError: org/apache/hadoop/metrics/jvm/JvmMetrics).

Mahadev, what do you think? Can we commit it?

Thanks,
Rodrigo

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.2.patch, 
> MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URLEncode filenames inside the index file (version 2, 
> refers to HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591




[jira] Commented: (MAPREDUCE-1673) Start and Stop scripts for the RaidNode

2010-04-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853613#action_12853613
 ] 

Hadoop QA commented on MAPREDUCE-1673:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12440746/MAPREDUCE-1673.patch
  against trunk revision 930423.

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/354/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/354/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/354/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h3.grid.sp2.yahoo.net/354/console

This message is automatically generated.

> Start and Stop scripts for the RaidNode
> ---
>
> Key: MAPREDUCE-1673
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1673
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1673.patch
>
>
> We should have scripts that start and stop the RaidNode automatically. 
> Something like start-raidnode.sh and stop-raidnode.sh




[jira] Commented: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-04-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853609#action_12853609
 ] 

Hadoop QA commented on MAPREDUCE-1585:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12440747/MAPREDUCE-1585.2.patch
  against trunk revision 930423.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/92/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/92/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/92/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/92/console

This message is automatically generated.

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.2.patch, 
> MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URLEncode filenames inside the index file (version 2, 
> refers to HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591




[jira] Updated: (MAPREDUCE-1523) Sometimes rumen trace generator fails to extract the job finish time.

2010-04-05 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-1523:
-

Status: Open  (was: Patch Available)

> Sometimes rumen trace generator fails to extract the job finish time.
> -
>
> Key: MAPREDUCE-1523
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1523
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Hong Tang
>Assignee: Dick King
> Attachments: mapreduce-1523--2010-03-31a-1612PDT.patch
>
>
> We saw sometimes (not very often) that rumen may fail to extract the job 
> finish time from Hadoop 0.20 history log.




[jira] Updated: (MAPREDUCE-1523) Sometimes rumen trace generator fails to extract the job finish time.

2010-04-05 Thread Hong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Tang updated MAPREDUCE-1523:
-

Status: Patch Available  (was: Open)

Retry hudson.

> Sometimes rumen trace generator fails to extract the job finish time.
> -
>
> Key: MAPREDUCE-1523
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1523
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Hong Tang
>Assignee: Dick King
> Attachments: mapreduce-1523--2010-03-31a-1612PDT.patch
>
>
> We saw sometimes (not very often) that rumen may fail to extract the job 
> finish time from Hadoop 0.20 history log.




[jira] Commented: (MAPREDUCE-1376) Support for varied user submission in Gridmix

2010-04-05 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853572#action_12853572
 ] 

Chris Douglas commented on MAPREDUCE-1376:
--

bq. I feel it is clean. I have added more code for creation of staging 
directories in it, so I thought TestGridmixSubmission was getting really big 
with some reusable components. I think we might use those components again in 
the future.

Code written with {{\*Util}} classes tends to be difficult to read and the 
tenuous dependencies between tests created through these classes are avoidable. 
Concerning its content, other than boilerplate {{MiniMRCluster}} code (which 
must be called in the same, boilerplate way from the test) and the one-line 
{{changePermission}}, there's only {{createHomeAndStagingDirectory}}, which 
calls back to the {{TestGridmixSubmission}} {{LOG}} field, anyway. But OK.

bq. parseUserList is only being used by RoundRobinUserResolver, so I am leaving 
the interface as it is.

Abstract classes are generally easier to evolve than interfaces- and creating 
new {{UserResolvers}} without rewriting the parser or depending on 
{{RoundRobinUserResolver}} is a pleasant property- but OK. Making 
{{parseUserList}} protected would at least allow subclasses of RRUR to share 
the parsing code.

* There is some commented-out code in {{TestGridmixSubmission}}:
{noformat}+// GridmixTestUtils.createHomeAndStagingDirectory((JobConf)conf);{noformat}
* There is a javadoc error since {{parseUserList}} is moved:
{noformat}+   * listing target users. The format of this file is defined by {@link
+   * #parseUserList}.{noformat}
* Many javadoc comments seem to include {{}} at random
* Many of the exception and log messages start with spaces, e.g. {{" Could not 
run job submitted "}}, which is not conventional. Please don't do this. There 
are also many spelling errors in the messages, e.g. {{LOG.error(" EXCEPTOIN in 
availableSlots ", e);}}
* Should jobs be submitted with ACLs so that the statistics can be viewed by 
the submitting user? This would allow statistics, etc. to be collected without 
requiring a {{doAs}} block, right?
* {{GridmixJob::call}} can declare the exceptions it will throw instead of 
catching everything and returning null. This is true of all the 
{{PrivilegedExceptionAction}} uses, including the {{doAs}} calling 
{{Gridmix::runJob}}, which can have a signature narrower than {{Exception}}
* {{TestGridmixSubmission::doSubmission}} should take a 
{{GridmixJobSubmissionPolicy}} as an argument, rather than setting a static and 
calling the method from each submission type.

+1 Overall. None of these need to block commit, but please attend to them as 
part of either the Apache patch or concurrently with related work.
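
The point about {{GridmixJob::call}} can be illustrated with a minimal sketch. Java allows an implementation of `Callable.call()` to declare a narrower `throws` clause than the interface's `throws Exception`, so a caller holding the concrete type sees the real failure instead of a null return. `NarrowCallSketch`, the returned job id string, and the `submit` helper are hypothetical and are not taken from the Gridmix source.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical stand-in for a job class; not the Gridmix source.
final class NarrowCallSketch implements Callable<String> {
    private final boolean fail;

    NarrowCallSketch(boolean fail) {
        this.fail = fail;
    }

    // Narrower than Callable.call()'s "throws Exception": the failure
    // propagates with its type instead of being swallowed into null.
    @Override
    public String call() throws IOException {
        if (fail) {
            throw new IOException("submission failed");
        }
        return "job_0001";
    }

    // The caller handles the one declared exception type; no null check.
    static String submit(NarrowCallSketch job) {
        try {
            return "submitted " + job.call();
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        }
    }
}
```

The same narrowing is legal for an implementation of `PrivilegedExceptionAction.run()`, although code that invokes it through the interface type still sees `throws Exception`.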

> Support for varied user submission in Gridmix
> -
>
> Key: MAPREDUCE-1376
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1376
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/gridmix
>Reporter: Chris Douglas
>Assignee: Chris Douglas
> Attachments: 1376-2-yhadoop-security.patch, 
> 1376-3-yhadoop20.100.patch, 1376-4-yhadoop20.100.patch, 
> 1376-5-yhadoop20-100.patch, 1376-yhadoop-security.patch, M1376-0.patch, 
> M1376-1.patch, M1376-2.patch, M1376-3.patch, M1376-4.patch
>
>
> Gridmix currently submits all synthetic jobs as the client user. It should be 
> possible to map users in the trace to a set of users appropriate for the 
> target cluster.




[jira] Commented: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-04-05 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853566#action_12853566
 ] 

Mahadev konar commented on MAPREDUCE-1585:
--

+1 the patch looks good... 

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.2.patch, 
> MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URLEncode filenames inside the index file (version 2, 
> refers to HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591




[jira] Updated: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-04-05 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1585:
---

Status: Patch Available  (was: Open)

Trying Hudson again!

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.2.patch, 
> MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URL-encode filenames inside the index file (version 2; 
> see HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591.




[jira] Updated: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-04-05 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1585:
---

Status: Open  (was: Patch Available)

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.2.patch, 
> MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URL-encode filenames inside the index file (version 2; 
> see HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591.




[jira] Commented: (MAPREDUCE-1646) Task Killing tests

2010-04-05 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853536#action_12853536
 ] 

Konstantin Boudnik commented on MAPREDUCE-1646:
---

bq. I have checked your log file and the first test (testFailedTaskJobStatus) 
failed because the task had not started for almost 1 min. The 
other two tests are failing due to an error in the configuration object. I 
suspect the failures occurred due to an environmental issue.

- Should the limit be raised slightly, say to 2 minutes?
- A 'good citizen' test should depend on the environment as little as possible. 
If you can provide special settings in the test config, you should do so. 
If a test requires a particular environment at the time it runs and that 
environment setting doesn't exist, then the test in question should 
fail with a proper and meaningful error message. In other words, a person who 
runs the tests shouldn't have to guess the required environment.
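
To illustrate the second point, here is a minimal sketch in Python (not the 
Herriot/JUnit code under review). The names MissingEnvironmentError and 
require_setting are hypothetical, and HADOOP_CONF_DIR is used only as an 
example of a setting a test might need.

```python
class MissingEnvironmentError(RuntimeError):
    """Raised when a setting that a test depends on is absent."""


def require_setting(env, name, hint):
    """Return env[name], or fail with a message that says what to configure.

    `env` is any mapping (for example os.environ); `hint` tells the person
    running the test how to provide the missing setting.
    """
    value = env.get(name)
    if not value:
        raise MissingEnvironmentError(
            f"Required setting '{name}' is not set. {hint}"
        )
    return value


# Typical use at the start of a test (hypothetical):
#   conf_dir = require_setting(os.environ, "HADOOP_CONF_DIR",
#                              "Point it at the cluster configuration directory.")
```

Failing early like this means the test log names the missing setting 
instead of showing a timeout or a configuration error later on.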

> Task Killing tests
> --
>
> Key: MAPREDUCE-1646
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1646
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: TaskKilling_1646.patch, TaskKilling_1646.patch, 
> TaskKilling_1646.patch, TEST-org.apache.hadoop.mapred.TestTaskKilling.txt, 
> TEST-org.apache.hadoop.mapred.TestTaskKilling.txt
>
>
> The following scenarios are covered in the test.
> 1. In a running job, kill a task and verify the job succeeds.
> 2. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Kill the task. After 
> the task is killed, make sure that the
> output/_temporary/_attempt-id directory is cleaned up.
> 3. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Fail the task by 
> simulating the map. After the task has failed,
> make sure that the output/_temporary/_attempt-id directory is cleaned up. The 
> important thing we are trying to
> check is the subtle difference between kill and fail.




[jira] Commented: (MAPREDUCE-1656) JobStory should provide queue info.

2010-04-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853513#action_12853513
 ] 

Hudson commented on MAPREDUCE-1656:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #300 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/300/])
. JobStory should provide queue info. (hong via mahadev)


> JobStory should provide queue info.
> ---
>
> Key: MAPREDUCE-1656
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1656
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Hong Tang
>Assignee: Hong Tang
>Priority: Minor
> Fix For: 0.22.0
>
> Attachments: mr-1656-2.patch, mr-1656.patch
>
>
> Add a method in JobStory to get the queue to which a job is submitted.




[jira] Commented: (MAPREDUCE-1428) Make block size and the size of archive created files configurable.

2010-04-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853512#action_12853512
 ] 

Hudson commented on MAPREDUCE-1428:
---

Integrated in Hadoop-Mapreduce-trunk-Commit #300 (See 
[http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/300/])


> Make block size and the size of archive created files configurable.
> ---
>
> Key: MAPREDUCE-1428
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1428
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: harchive
>Reporter: Mahadev konar
>Assignee: Mahadev konar
> Fix For: 0.22.0
>
> Attachments: BinaryFileGenerator.java, BinaryFileGenerator.java, 
> BinaryFileGenerator.java, MAPREDUCE-1428.patch, MAPREDUCE-1428.patch
>
>
> Currently the block size used by archives is the default block size of the 
> HDFS filesystem. We need to make it configurable so that the block size can 
> be larger for the part files that archives create.
> Also, we need to make the size of part files in archives configurable, again 
> so that they can be made bigger and fewer such files are created.




[jira] Updated: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-04-05 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1585:
---

Status: Open  (was: Patch Available)

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.2.patch, 
> MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URL-encode filenames inside the index file (version 2; 
> see HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591.




[jira] Updated: (MAPREDUCE-1673) Start and Stop scripts for the RaidNode

2010-04-05 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1673:
---

Status: Patch Available  (was: Open)

> Start and Stop scripts for the RaidNode
> ---
>
> Key: MAPREDUCE-1673
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1673
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1673.patch
>
>
> We should have scripts that start and stop the RaidNode automatically. 
> Something like start-raidnode.sh and stop-raidnode.sh




[jira] Updated: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-04-05 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1585:
---

Status: Patch Available  (was: Open)

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.2.patch, 
> MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URL-encode filenames inside the index file (version 2; 
> see HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591.




[jira] Updated: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-04-05 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1585:
---

Attachment: (was: MAPREDUCE-1585.2.patch)

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.2.patch, 
> MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URL-encode filenames inside the index file (version 2; 
> see HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591.




[jira] Updated: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-04-05 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1585:
---

Attachment: MAPREDUCE-1585.2.patch

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.2.patch, 
> MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URL-encode filenames inside the index file (version 2; 
> see HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591.




[jira] Updated: (MAPREDUCE-1673) Start and Stop scripts for the RaidNode

2010-04-05 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1673:
---

Attachment: MAPREDUCE-1673.patch

> Start and Stop scripts for the RaidNode
> ---
>
> Key: MAPREDUCE-1673
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1673
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1673.patch
>
>
> We should have scripts that start and stop the RaidNode automatically. 
> Something like start-raidnode.sh and stop-raidnode.sh




[jira] Created: (MAPREDUCE-1673) Start and Stop scripts for the RaidNode

2010-04-05 Thread Rodrigo Schmidt (JIRA)
Start and Stop scripts for the RaidNode
---

 Key: MAPREDUCE-1673
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1673
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: contrib/raid
Reporter: Rodrigo Schmidt
Assignee: Rodrigo Schmidt


We should have scripts that start and stop the RaidNode automatically. 
Something like start-raidnode.sh and stop-raidnode.sh




[jira] Updated: (MAPREDUCE-1673) Start and Stop scripts for the RaidNode

2010-04-05 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1673:
---

Affects Version/s: 0.22.0
Fix Version/s: 0.22.0

> Start and Stop scripts for the RaidNode
> ---
>
> Key: MAPREDUCE-1673
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1673
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/raid
>Affects Versions: 0.22.0
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
>
> We should have scripts that start and stop the RaidNode automatically. 
> Something like start-raidnode.sh and stop-raidnode.sh




[jira] Updated: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-04-05 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1585:
---

Status: Open  (was: Patch Available)

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.2.patch, 
> MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URL-encode filenames inside the index file (version 2; 
> see HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591.




[jira] Updated: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-04-05 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1585:
---

Attachment: MAPREDUCE-1585.2.patch

This new patch adds new test cases to cover the problems found in 
HADOOP-6645 and HADOOP-6591.

It also keeps TestHadoopArchives.java, and only changes it so that there are no 
more tests with space replacement for invalid characters, since space 
replacement is removed by this patch.
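
For readers unfamiliar with the idea, here is a minimal sketch in Python of 
why URL-encoding names avoids the space problem. It is not the HAR code from 
the patch: the function names encode_index_name and decode_index_name are 
hypothetical, and Python's percent-encoding rules differ in detail from 
Java's URLEncoder (which, for example, writes a space as '+').

```python
from urllib.parse import quote, unquote


def encode_index_name(path):
    """Percent-encode a path so it contains no spaces or other unsafe bytes.

    '/' is kept as-is so the directory structure stays readable.
    """
    return quote(path, safe="/")


def decode_index_name(encoded):
    """Reverse encode_index_name and recover the original path."""
    return unquote(encoded)


original = "/user/data/my file.txt"
encoded = encode_index_name(original)
# The encoded form has no spaces, so a space-delimited index line can hold it.
assert " " not in encoded
assert decode_index_name(encoded) == original
```

Because the encoding is reversible, no character needs to be replaced or 
dropped, which is the property the space-replacement approach lacked.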

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.2.patch, 
> MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URL-encode filenames inside the index file (version 2; 
> see HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591.




[jira] Updated: (MAPREDUCE-1585) Create Hadoop Archives version 2 with filenames URL-encoded

2010-04-05 Thread Rodrigo Schmidt (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rodrigo Schmidt updated MAPREDUCE-1585:
---

Status: Patch Available  (was: Open)

> Create Hadoop Archives version 2 with filenames URL-encoded
> ---
>
> Key: MAPREDUCE-1585
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1585
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: harchive
>Reporter: Rodrigo Schmidt
>Assignee: Rodrigo Schmidt
> Fix For: 0.22.0
>
> Attachments: MAPREDUCE-1585.1.patch, MAPREDUCE-1585.2.patch, 
> MAPREDUCE-1585.patch
>
>
> Hadoop Archives version 1 doesn't cope with files that have spaces in their 
> names.
> One proposal is to URL-encode filenames inside the index file (version 2; 
> see HADOOP-6591).
> This task is to allow the creation of version 2 files that have file names 
> encoded appropriately. It currently depends on HADOOP-6591.




[jira] Updated: (MAPREDUCE-1672) Create test scenario for "distributed cache file behaviour, when dfs file is not modified"

2010-04-05 Thread Iyappan Srinivasan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Iyappan Srinivasan updated MAPREDUCE-1672:
--

Attachment: TestDistributedCacheUnModifiedFile.patch

> Create test scenario for "distributed cache file behaviour, when dfs file is 
> not modified"
> --
>
> Key: MAPREDUCE-1672
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1672
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Iyappan Srinivasan
> Fix For: 0.22.0
>
> Attachments: TestDistributedCacheUnModifiedFile.patch
>
>
> This test scenario covers the behaviour of a distributed cache file
> when it is not modified before or after being
> accessed by at most two jobs. Once a job uses a distributed cache file,
> that file is stored in mapred.local.dir. If the next job
> uses the same file, it is not stored again.
> So, if two jobs choose the same tasktracker for their execution,
> the distributed cache file should not be found twice.
> This testcase should run a job with a distributed cache file. A handle to
> each task's tasktracker is obtained and checked for
> the presence of the distributed cache file with proper permissions in the
> proper directory. Next, when the job
> runs again and any of its tasks hits a tasktracker that
> ran one of the tasks of the previous job, the
> file should not be uploaded again and the task should use the old file.




[jira] Assigned: (MAPREDUCE-1672) Create test scenario for "distributed cache file behaviour, when dfs file is not modified"

2010-04-05 Thread Iyappan Srinivasan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Iyappan Srinivasan reassigned MAPREDUCE-1672:
-

Assignee: Iyappan Srinivasan

> Create test scenario for "distributed cache file behaviour, when dfs file is 
> not modified"
> --
>
> Key: MAPREDUCE-1672
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1672
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Iyappan Srinivasan
>Assignee: Iyappan Srinivasan
> Fix For: 0.22.0
>
> Attachments: TestDistributedCacheUnModifiedFile.patch
>
>
> This test scenario covers the behaviour of a distributed cache file
> when it is not modified before or after being
> accessed by at most two jobs. Once a job uses a distributed cache file,
> that file is stored in mapred.local.dir. If the next job
> uses the same file, it is not stored again.
> So, if two jobs choose the same tasktracker for their execution,
> the distributed cache file should not be found twice.
> This testcase should run a job with a distributed cache file. A handle to
> each task's tasktracker is obtained and checked for
> the presence of the distributed cache file with proper permissions in the
> proper directory. Next, when the job
> runs again and any of its tasks hits a tasktracker that
> ran one of the tasks of the previous job, the
> file should not be uploaded again and the task should use the old file.




[jira] Created: (MAPREDUCE-1672) Create test scenario for "distributed cache file behaviour, when dfs file is not modified"

2010-04-05 Thread Iyappan Srinivasan (JIRA)
Create test scenario for "distributed cache file behaviour, when dfs file is 
not modified"
--

 Key: MAPREDUCE-1672
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1672
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: test
Affects Versions: 0.22.0
Reporter: Iyappan Srinivasan
 Fix For: 0.22.0


This test scenario covers the behaviour of a distributed cache file
when it is not modified before or after being
accessed by at most two jobs. Once a job uses a distributed cache file,
that file is stored in mapred.local.dir. If the next job
uses the same file, it is not stored again.
So, if two jobs choose the same tasktracker for their execution,
the distributed cache file should not be found twice.

This testcase should run a job with a distributed cache file. A handle to
each task's tasktracker is obtained and checked for
the presence of the distributed cache file with proper permissions in the
proper directory. Next, when the job
runs again and any of its tasks hits a tasktracker that
ran one of the tasks of the previous job, the
file should not be uploaded again and the task should use the old file.
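
The expected behaviour can be modelled with a small sketch in Python. This is 
a toy model only, not the TaskTracker's implementation: the class 
TrackerLocalCache, the local path layout, and the use of the modification 
time as part of the cache key are all assumptions made for illustration.

```python
class TrackerLocalCache:
    """Toy model of one tasktracker's local copy of distributed cache files."""

    def __init__(self):
        self._local = {}      # (uri, mtime) -> hypothetical local path
        self.downloads = 0    # how many times a file was actually fetched

    def localize(self, uri, mtime):
        """Return the local path for `uri`, fetching it only if not cached."""
        key = (uri, mtime)
        if key not in self._local:
            self.downloads += 1
            self._local[key] = f"/mapred/local/cache/{len(self._local)}"
        return self._local[key]


tracker = TrackerLocalCache()
# Job 1 and job 2 use the same, unmodified file on the same tasktracker.
path_job1 = tracker.localize("hdfs:///cache/lookup.txt", mtime=1000)
path_job2 = tracker.localize("hdfs:///cache/lookup.txt", mtime=1000)
assert path_job1 == path_job2   # the second job reuses the old file
assert tracker.downloads == 1   # and nothing is stored a second time
```

The two assertions at the end correspond to the checks the testcase is 
meant to make against a real tasktracker.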





[jira] Commented: (MAPREDUCE-1640) Node health feature fails to blacklist a node if the health check script times out in some cases

2010-04-05 Thread Sreekanth Ramakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853346#action_12853346
 ] 

Sreekanth Ramakrishnan commented on MAPREDUCE-1640:
---

The problem in gist: 

* In bash scripts all the commands are spawned using fork + exec, similar to 
the system() library call. The process hierarchy is as follows:
{noformat}
15772 pts/0    S      0:00  \_ /bin/bash ./myscript.sh
15773 pts/0    S      0:00  |   \_ sleep 10
{noformat}
* So when kill 15772 is sent, the signal is not delivered to the child.
* The parent exits, but sleep does not check whether its parent is still alive 
and keeps running, since it is a long-running process.

The problem is similar to the one described and addressed in HADOOP-2721.

To fix this, the spawned node health script should be run as 
{{setsid; exec(node_health_path)}}, and instead of Process.destroy() we do 
{{kill -pid}}. The catch is that the pid of the process is not passed on to 
Java, so the command will change to {{setsid; echo $$ ;exec(node_health_path)}} 
and we should read the input stream to get the process id.

An alternative solution is for the configured node health script to manage 
its own children  :-)
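
The process-group idea can be sketched in Python as follows. This is an 
illustration only, not the TaskTracker's Java code: run_with_timeout is a 
hypothetical helper, Python's start_new_session option stands in for the 
setsid call, and SIGKILL is chosen here just to keep the example short.

```python
import os
import signal
import subprocess


def run_with_timeout(script, timeout_s):
    """Run `script` under bash in its own session; on timeout, kill the group.

    Returns True if the script finished in time, False if it was killed.
    """
    # start_new_session=True makes the child call setsid(), so it leads a
    # new process group whose id equals its pid.
    proc = subprocess.Popen(
        ["/bin/bash", "-c", script],
        stdout=subprocess.DEVNULL,
        start_new_session=True,
    )
    try:
        proc.wait(timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        # Signal the whole group (the equivalent of `kill -<pid>`), so that
        # children such as `sleep` receive the signal too.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()
        return False
```

With the script from the issue description, 
run_with_timeout('echo "start"; sleep 10; echo "end"', 1) would kill both 
bash and the sleep it spawned, instead of leaving sleep running. In Python 
the parent already knows the child's pid, so the {{echo $$}} step discussed 
above is not needed in this sketch.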

> Node health feature fails to blacklist a node if the health check script 
> times out in some cases
> 
>
> Key: MAPREDUCE-1640
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1640
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Reporter: Vinod K V
> Fix For: 0.22.0
>
>
> Node health check feature fails to blacklist a TT if health check script 
> times out. Below are the values that were set:
>  - mapred.healthChecker.interval=6
>  - mapred.healthChecker.script.timeout=6000
> And the script was:
> {code}
> #!/bin/bash
> echo "start"
> sleep 10
> echo "end"
> {code}




[jira] Updated: (MAPREDUCE-1646) Task Killing tests

2010-04-05 Thread Vinay Kumar Thota (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Kumar Thota updated MAPREDUCE-1646:
-

Attachment: TEST-org.apache.hadoop.mapred.TestTaskKilling.txt

> Task Killing tests
> --
>
> Key: MAPREDUCE-1646
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1646
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: TaskKilling_1646.patch, TaskKilling_1646.patch, 
> TaskKilling_1646.patch, TEST-org.apache.hadoop.mapred.TestTaskKilling.txt, 
> TEST-org.apache.hadoop.mapred.TestTaskKilling.txt
>
>
> The following scenarios are covered in the test.
> 1. In a running job, kill a task and verify the job succeeds.
> 2. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Kill the task. After 
> the task is killed, make sure that the
> output/_temporary/_attempt-id directory is cleaned up.
> 3. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Fail the task by 
> simulating the map. After the task has failed,
> make sure that the output/_temporary/_attempt-id directory is cleaned up. The 
> important thing we are trying to
> check is the subtle difference between kill and fail.




[jira] Updated: (MAPREDUCE-1646) Task Killing tests

2010-04-05 Thread Vinay Kumar Thota (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Kumar Thota updated MAPREDUCE-1646:
-

Attachment: TaskKilling_1646.patch

> Task Killing tests
> --
>
> Key: MAPREDUCE-1646
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1646
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: TaskKilling_1646.patch, TaskKilling_1646.patch, 
> TaskKilling_1646.patch, TEST-org.apache.hadoop.mapred.TestTaskKilling.txt
>
>
> The following scenarios are covered in the test.
> 1. In a running job, kill a task and verify the job succeeds.
> 2. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Kill the task. After 
> the task is killed, make sure that the
> output/_temporary/_attempt-id directory is cleaned up.
> 3. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Fail the task by 
> simulating the map. After the task has failed,
> make sure that the output/_temporary/_attempt-id directory is cleaned up. The 
> important thing we are trying to
> check is the subtle difference between kill and fail.




[jira] Commented: (MAPREDUCE-1646) Task Killing tests

2010-04-05 Thread Vinay Kumar Thota (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853343#action_12853343
 ] 

Vinay Kumar Thota commented on MAPREDUCE-1646:
--

You are absolutely right, I have an issue with my IDE configuration. Even though 
I set the tab size to 2 and align the code accordingly, the IDE reformats the 
code while generating the patch file, which is why you are seeing the 
indentation issues. Now I have used spaces instead of tabs so that it behaves 
correctly across IDEs. I hope this time you won't find any indentation issues 
in the latest patch.


also, I've tried to run it and it failed (see attached log)

Vinay: I have run all 3 tests on multi cluster a couple of times and I didn't 
see any issues; all the tests are passing consistently. Please check the 
attached log file. 

I have checked your log file and the first test (testFailedTaskJobStatus) 
failed because the task had not started for almost 1 min. The 
other two tests are failing due to an error in the configuration object. I 
suspect the failures occurred due to an environmental issue.

One more thing: the class has about 10 unused import statements. Please optimize 
the import list. 

Vinay: I have optimized the import list.




> Task Killing tests
> --
>
> Key: MAPREDUCE-1646
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1646
> Project: Hadoop Map/Reduce
>  Issue Type: Task
>  Components: test
>Reporter: Vinay Kumar Thota
>Assignee: Vinay Kumar Thota
> Attachments: TaskKilling_1646.patch, TaskKilling_1646.patch, 
> TEST-org.apache.hadoop.mapred.TestTaskKilling.txt
>
>
> The following scenarios are covered in the test.
> 1. In a running job, kill a task and verify the job succeeds.
> 2. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Kill the task. After 
> the task is killed, make sure that the
> output/_temporary/_attempt-id directory is cleaned up.
> 3. Set up a job with long-running tasks that write some output to HDFS. When 
> one of the tasks is running, ensure that
> the output/_temporary/_attempt-id directory is created. Fail the task by 
> simulating the map. After the task has failed,
> make sure that the output/_temporary/_attempt-id directory is cleaned up. The 
> important thing we are trying to
> check is the subtle difference between kill and fail.




[jira] Updated: (MAPREDUCE-1671) Test scenario for "Killing Task Attempt id till job fails"

2010-04-05 Thread Iyappan Srinivasan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Iyappan Srinivasan updated MAPREDUCE-1671:
--

Attachment: TestTaskKilling.patch

Test scenario using Herriot for the TestTaskKilling scenario.

> Test scenario for "Killing Task Attempt id till job fails"
> --
>
> Key: MAPREDUCE-1671
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1671
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Iyappan Srinivasan
>Assignee: Iyappan Srinivasan
> Attachments: TestTaskKilling.patch
>
>
> 1) In a job, kill the task attempt id of one task. Whenever that task tries 
> to run again with another task attempt id, kill that task attempt id, up to 
> mapred.max.tracker.failures times. After mapred.max.tracker.failures 
> times, the whole job should get killed.




[jira] Created: (MAPREDUCE-1671) Test scenario for "Killing Task Attempt id till job fails"

2010-04-05 Thread Iyappan Srinivasan (JIRA)
Test scenario for "Killing Task Attempt id till job fails"
--

 Key: MAPREDUCE-1671
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1671
 Project: Hadoop Map/Reduce
  Issue Type: New Feature
  Components: test
Affects Versions: 0.22.0
Reporter: Iyappan Srinivasan
Assignee: Iyappan Srinivasan


1) In a job, kill the task attempt id of one task. Whenever that task tries 
to run again with another task attempt id, kill that task attempt id, up to 
mapred.max.tracker.failures times. After mapred.max.tracker.failures times, 
the whole job should get killed.
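
The loop described above can be sketched as a toy model. This is only an
illustration of the scenario's control flow, not Hadoop's scheduling logic and
not the code in TestTaskKilling.patch: ToyJob, killCurrentAttempt and
killUntilJobEnds are hypothetical names, and the limit of 4 is just an example
value standing in for mapred.max.tracker.failures.

```java
// Toy model of "kill each new attempt until the job is killed".
final class KillUntilJobFails {

    enum JobState { RUNNING, KILLED }

    // Hypothetical stand-in for a job with a single task under test.
    static final class ToyJob {
        private final int maxFailures;   // plays the role of the configured limit
        private int killedAttempts = 0;
        private JobState state = JobState.RUNNING;

        ToyJob(int maxFailures) {
            if (maxFailures < 1) {
                throw new IllegalArgumentException("maxFailures must be >= 1");
            }
            this.maxFailures = maxFailures;
        }

        JobState state() {
            return state;
        }

        // Kills the current attempt; once the limit is reached the job is killed.
        void killCurrentAttempt() {
            if (state != JobState.RUNNING) {
                throw new IllegalStateException("job is no longer running");
            }
            killedAttempts++;
            if (killedAttempts >= maxFailures) {
                state = JobState.KILLED;
            }
        }
    }

    // Keeps killing attempts while the job is running; returns the kill count.
    static int killUntilJobEnds(ToyJob job) {
        int kills = 0;
        while (job.state() == JobState.RUNNING) {
            job.killCurrentAttempt();
            kills++;
        }
        return kills;
    }

    public static void main(String[] args) {
        ToyJob job = new ToyJob(4);      // 4 is an example limit only
        int kills = killUntilJobEnds(job);
        System.out.println("kills=" + kills + ", state=" + job.state());
    }
}
```

In a real test the assertion would be made on the job status reported by the
cluster after the last kill, rather than on an in-memory counter.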

