[jira] Commented: (MAPREDUCE-398) Task process exit with nonzero status of 134.

2009-07-06 Thread dillip pattnaik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727409#action_12727409
 ] 

dillip pattnaik commented on MAPREDUCE-398:
---

Hi,
I have also faced this issue.
To identify the problem, we checked the userlogs under
hadoop-0.18.3/logs/userlogs/{$task_id}/stdout or stderr,
 where {$task_id} can be found from the JobTracker's web UI (default URL is 
jobtrackermachine:50030). For example, {$task_id} will look like 
attempt_200907061121_0002_m_10_0.

 For me it was an out-of-memory error.


> Task process exit with nonzero status of 134.
> -
>
> Key: MAPREDUCE-398
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-398
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
> Environment: Linux OS, 1 Namenode, 5 Datanode
>Reporter: sha feng
>Priority: Minor
>
> During the fetcher2 stage, I got these errors on all datanodes.
> java.io.IOException: Task process exit with nonzero status of 134.  
> at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424)
> When fetching more than 100 URLs, these errors occur.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-532) Allow admins of the Capacity Scheduler to set a hard-limit on the capacity of a queue

2009-07-06 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-532:
---

Attachment: MAPREDUCE-532-6.patch

The attached patch (MAPREDUCE-532-6.patch) addresses a missed review comment about 
not displaying the slot limit if configured. It also slightly changes the display 
string describing capacities.

> Allow admins of the Capacity Scheduler to set a hard-limit on the capacity of 
> a queue
> -
>
> Key: MAPREDUCE-532
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-532
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/capacity-sched
>Reporter: Rajiv Chittajallu
> Attachments: MAPREDUCE-532-1.patch, MAPREDUCE-532-2.patch, 
> MAPREDUCE-532-3.patch, MAPREDUCE-532-4.patch, MAPREDUCE-532-5.patch, 
> MAPREDUCE-532-6.patch
>
>
> For jobs which call external services, (eg: distcp, crawlers) user/admin 
> should be able to control max parallel tasks spawned. There should be a 
> mechanism to cap the capacity available for a queue/job. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-532) Allow admins of the Capacity Scheduler to set a hard-limit on the capacity of a queue

2009-07-06 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727413#action_12727413
 ] 

Hemanth Yamijala commented on MAPREDUCE-532:


Results of test-patch:

{noformat}
 [exec] -1 overall.
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec]
 [exec] -1 Eclipse classpath. The patch causes the Eclipse classpath to 
differ from the contents of the lib directories.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
{noformat}

The -1 related to the Eclipse classpath is because of a mismatch in the ivy jars. I am 
told this is not a concern.

The patch changes only the capacity scheduler code base, and all capacity scheduler 
unit tests pass.

> Allow admins of the Capacity Scheduler to set a hard-limit on the capacity of 
> a queue
> -
>
> Key: MAPREDUCE-532
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-532
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/capacity-sched
>Reporter: Rajiv Chittajallu
> Attachments: MAPREDUCE-532-1.patch, MAPREDUCE-532-2.patch, 
> MAPREDUCE-532-3.patch, MAPREDUCE-532-4.patch, MAPREDUCE-532-5.patch, 
> MAPREDUCE-532-6.patch
>
>
> For jobs which call external services, (eg: distcp, crawlers) user/admin 
> should be able to control max parallel tasks spawned. There should be a 
> mechanism to cap the capacity available for a queue/job. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (MAPREDUCE-631) TestJobInProgressListener brings up MinMR/DFS clusters for every test

2009-07-06 Thread Jothi Padmanabhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan resolved MAPREDUCE-631.
-

Resolution: Duplicate

This will be fixed as a part of 
https://issues.apache.org/jira/browse/MAPREDUCE-153. Resolving.

> TestJobInProgressListener brings up MinMR/DFS clusters for every test
> -
>
> Key: MAPREDUCE-631
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-631
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jothi Padmanabhan
>Assignee: Jothi Padmanabhan
>Priority: Minor
>
> TestJobInProgressListener brings up/down the cluster several times. Instead, 
> the cluster should just be brought up once, all tests run and then brought 
> down

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-532) Allow admins of the Capacity Scheduler to set a hard-limit on the capacity of a queue

2009-07-06 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-532:
---

Attachment: MAPREDUCE-532-7.patch

The attached patch (MAPREDUCE-532-7.patch) corrects a missed Forrest 
documentation tag that was causing the docs build to fail. Verified that the docs are 
generated properly with this patch.

> Allow admins of the Capacity Scheduler to set a hard-limit on the capacity of 
> a queue
> -
>
> Key: MAPREDUCE-532
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-532
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/capacity-sched
>Reporter: Rajiv Chittajallu
> Attachments: MAPREDUCE-532-1.patch, MAPREDUCE-532-2.patch, 
> MAPREDUCE-532-3.patch, MAPREDUCE-532-4.patch, MAPREDUCE-532-5.patch, 
> MAPREDUCE-532-6.patch, MAPREDUCE-532-7.patch
>
>
> For jobs which call external services, (eg: distcp, crawlers) user/admin 
> should be able to control max parallel tasks spawned. There should be a 
> mechanism to cap the capacity available for a queue/job. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-630) TestKillCompletedJob can be modified to improve execution times

2009-07-06 Thread Jothi Padmanabhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan updated MAPREDUCE-630:


Attachment: mapred-630.patch

Updating the patch to trunk.

> TestKillCompletedJob can be modified to improve execution times
> ---
>
> Key: MAPREDUCE-630
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-630
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jothi Padmanabhan
>Assignee: Jothi Padmanabhan
>Priority: Minor
> Attachments: hadoop-6068.patch, mapred-630.patch
>
>
> This test can be easily made into a unit test

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-630) TestKillCompletedJob can be modified to improve execution times

2009-07-06 Thread Jothi Padmanabhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan updated MAPREDUCE-630:


Status: Patch Available  (was: Open)

> TestKillCompletedJob can be modified to improve execution times
> ---
>
> Key: MAPREDUCE-630
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-630
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jothi Padmanabhan
>Assignee: Jothi Padmanabhan
>Priority: Minor
> Attachments: hadoop-6068.patch, mapred-630.patch
>
>
> This test can be easily made into a unit test

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-632) Merge TestCustomOutputCommitter with TestCommandLineJobSubmission

2009-07-06 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727420#action_12727420
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-632:
---

+1 for the patch

> Merge TestCustomOutputCommitter with TestCommandLineJobSubmission
> -
>
> Key: MAPREDUCE-632
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-632
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jothi Padmanabhan
>Assignee: Jothi Padmanabhan
> Attachments: hadoop-5978.java
>
>
> TestCommandLineJobSubmission tests job submission with different command-line 
> options. This can easily be enhanced to also test the custom output committer, and 
> we can do away with TestCustomOutputCommitter.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (MAPREDUCE-532) Allow admins of the Capacity Scheduler to set a hard-limit on the capacity of a queue

2009-07-06 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala resolved MAPREDUCE-532.


   Resolution: Fixed
Fix Version/s: 0.21.0
 Assignee: rahul k singh
 Release Note: Provided ability in the capacity scheduler to limit the 
number of slots that can be concurrently used per queue at any given time.
 Hadoop Flags: [Reviewed]

I just committed this to trunk. Thanks, Rahul!

> Allow admins of the Capacity Scheduler to set a hard-limit on the capacity of 
> a queue
> -
>
> Key: MAPREDUCE-532
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-532
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/capacity-sched
>Reporter: Rajiv Chittajallu
>Assignee: rahul k singh
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-532-1.patch, MAPREDUCE-532-2.patch, 
> MAPREDUCE-532-3.patch, MAPREDUCE-532-4.patch, MAPREDUCE-532-5.patch, 
> MAPREDUCE-532-6.patch, MAPREDUCE-532-7.patch
>
>
> For jobs which call external services, (eg: distcp, crawlers) user/admin 
> should be able to control max parallel tasks spawned. There should be a 
> mechanism to cap the capacity available for a queue/job. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-626) Modify TestLostTracker to improve execution time

2009-07-06 Thread Jothi Padmanabhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan updated MAPREDUCE-626:


Attachment: mapred-626.patch

Simple patch that converts this test into a unit test.
The code to start and stop tracker expiry has been refactored into separate 
methods in the JobTracker.

> Modify TestLostTracker to improve execution time
> 
>
> Key: MAPREDUCE-626
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-626
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jothi Padmanabhan
>Assignee: Jothi Padmanabhan
>Priority: Minor
> Attachments: mapred-626.patch
>
>
> This test can be made faster with a few modifications

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-626) Modify TestLostTracker to improve execution time

2009-07-06 Thread Jothi Padmanabhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan updated MAPREDUCE-626:


Status: Patch Available  (was: Open)

> Modify TestLostTracker to improve execution time
> 
>
> Key: MAPREDUCE-626
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-626
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jothi Padmanabhan
>Assignee: Jothi Padmanabhan
>Priority: Minor
> Attachments: mapred-626.patch
>
>
> This test can be made faster with a few modifications

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-677) TestNodeRefresh timesout

2009-07-06 Thread Amar Kamat (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727433#action_12727433
 ] 

Amar Kamat commented on MAPREDUCE-677:
--

I am trying to reproduce this failure but am not able to. Can someone 
please attach the failure logs (nohup logs etc.) or comment on how to reproduce it?

> TestNodeRefresh timesout
> 
>
> Key: MAPREDUCE-677
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-677
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: test
>Reporter: Amar Kamat
>Assignee: Amar Kamat
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-708) node health check script does not refresh the "reason for blacklisting"

2009-07-06 Thread Ramya R (JIRA)
node health check script does not refresh the "reason for blacklisting"
---

 Key: MAPREDUCE-708
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-708
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
Reporter: Ramya R
Priority: Minor
 Fix For: 0.20.1


After MAPREDUCE-211, the node health check script does not refresh the "reason 
for blacklisting".
The steps to reproduce the issue are:
* Blacklist a TT with an error message 'x'
* Change the health check script to return an error message 'y'
* The "reason for blacklisting" still shows 'x'
 
The impact of this issue is that the feature fails to trap transient errors.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-708) node health check script does not refresh the "reason for blacklisting"

2009-07-06 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-708:
---

  Component/s: tasktracker
Affects Version/s: (was: 0.20.1)
   0.21.0
Fix Version/s: (was: 0.20.1)
   0.21.0
 Assignee: Sreekanth Ramakrishnan

> node health check script does not refresh the "reason for blacklisting"
> ---
>
> Key: MAPREDUCE-708
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-708
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.21.0
>Reporter: Ramya R
>Assignee: Sreekanth Ramakrishnan
>Priority: Minor
> Fix For: 0.21.0
>
>
> After MAPREDUCE-211, the node health check script does not refresh the 
> "reason for blacklisting".
> The steps to reproduce the issue are:
> * Blacklist a TT with an error message 'x'
> * Change the health check script to return an error message 'y'
> * The "reason for blacklisting" still shows 'x'
>  
> The impact of this issue is that the feature fails to trap transient errors.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-532) Allow admins of the Capacity Scheduler to set a hard-limit on the capacity of a queue

2009-07-06 Thread Hemanth Yamijala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Yamijala updated MAPREDUCE-532:
---

Attachment: MAPREDUCE-532-20.patch

The attached patch (MAPREDUCE-532-20.patch) is meant to apply against the Yahoo! 
Hadoop distribution at version 20. It is NOT to be committed externally.

> Allow admins of the Capacity Scheduler to set a hard-limit on the capacity of 
> a queue
> -
>
> Key: MAPREDUCE-532
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-532
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: contrib/capacity-sched
>Reporter: Rajiv Chittajallu
>Assignee: rahul k singh
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-532-1.patch, MAPREDUCE-532-2.patch, 
> MAPREDUCE-532-20.patch, MAPREDUCE-532-3.patch, MAPREDUCE-532-4.patch, 
> MAPREDUCE-532-5.patch, MAPREDUCE-532-6.patch, MAPREDUCE-532-7.patch
>
>
> For jobs which call external services, (eg: distcp, crawlers) user/admin 
> should be able to control max parallel tasks spawned. There should be a 
> mechanism to cap the capacity available for a queue/job. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-701) Make TestRackAwareTaskPlacement a unit test

2009-07-06 Thread Jothi Padmanabhan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727444#action_12727444
 ] 

Jothi Padmanabhan commented on MAPREDUCE-701:
-

Test Patch results:

 [exec] +1 overall.  
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 8 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] 

> Make TestRackAwareTaskPlacement a unit test
> ---
>
> Key: MAPREDUCE-701
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-701
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: test
>Reporter: Jothi Padmanabhan
>Assignee: Jothi Padmanabhan
>Priority: Minor
> Attachments: mapred-701.patch, ratp.patch
>
>
> This test can be made into a unit test and the functionality verified without 
> needing to start MiniMR/DFS and launching jobs

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-626) Modify TestLostTracker to improve execution time

2009-07-06 Thread Jothi Padmanabhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jothi Padmanabhan updated MAPREDUCE-626:


Attachment: mapred-626.patch

Simple patch that tests this functionality as a unit test

> Modify TestLostTracker to improve execution time
> 
>
> Key: MAPREDUCE-626
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-626
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jothi Padmanabhan
>Assignee: Jothi Padmanabhan
>Priority: Minor
> Attachments: mapred-626.patch, mapred-626.patch
>
>
> This test can be made faster with a few modifications

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-701) Make TestRackAwareTaskPlacement a unit test

2009-07-06 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated MAPREDUCE-701:
--

   Resolution: Fixed
Fix Version/s: 0.21.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Jothi!

> Make TestRackAwareTaskPlacement a unit test
> ---
>
> Key: MAPREDUCE-701
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-701
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: test
>Reporter: Jothi Padmanabhan
>Assignee: Jothi Padmanabhan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: mapred-701.patch, ratp.patch
>
>
> This test can be made into a unit test and the functionality verified without 
> needing to start MiniMR/DFS and launching jobs

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-708) node health check script does not refresh the "reason for blacklisting"

2009-07-06 Thread Sreekanth Ramakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreekanth Ramakrishnan updated MAPREDUCE-708:
-

Attachment: MAPREDUCE-708-1.patch

Attaching a patch fixing the issue pointed out by Ramya.

The patch removes the condition that checked whether the TT was already blacklisted 
for the same reason and skipped further processing.
Added a test case to verify that the error message is updated even when the reason 
is the same.
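
To make the change concrete, a minimal sketch of the new behavior follows; the class 
and method names below are hypothetical and are not the actual TaskTracker 
health-status code:

{noformat}
// Illustrative sketch only; names are hypothetical, not the real TaskTracker code.
class TrackerHealthStatus {
  private boolean blacklisted = false;
  private String reason = "";

  // Called with the result of each health-script run.
  void onHealthScriptResult(boolean healthy, String newReason) {
    if (healthy) {
      blacklisted = false;
      reason = "";
      return;
    }
    // Previously a guard skipped this update when the TT was already
    // blacklisted, so a changed script message never showed up. The patch
    // drops that guard and always records the latest reason.
    blacklisted = true;
    reason = newReason;
  }

  boolean isBlacklisted() { return blacklisted; }
  String getReason() { return reason; }
}
{noformat}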

> node health check script does not refresh the "reason for blacklisting"
> ---
>
> Key: MAPREDUCE-708
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-708
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.21.0
>Reporter: Ramya R
>Assignee: Sreekanth Ramakrishnan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-708-1.patch
>
>
> After MAPREDUCE-211, the node health check script does not refresh the 
> "reason for blacklisting".
> The steps to reproduce the issue are:
> * Blacklist a TT with an error message 'x'
> * Change the health check script to return an error message 'y'
> * The "reason for blacklisting" still shows 'x'
>  
> The impact of this issue is that the feature fails to trap transient errors.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-709) node health check script does not display the correct message on timeout

2009-07-06 Thread Ramya R (JIRA)
node health check script does not display the correct message on timeout


 Key: MAPREDUCE-709
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-709
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Ramya R
Priority: Minor
 Fix For: 0.21.0


When the node health check script takes more than 
"mapred.healthChecker.script.timeout" to return, it should display a timeout 
message. Instead it displays the full stacktrace as below:

{noformat}
java.io.IOException: Stream closed at 
java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:145) 
at java.io.BufferedInputStream.read(BufferedInputStream.java:308) 
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264) 
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306) 
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158) 
at java.io.InputStreamReader.read(InputStreamReader.java:167) 
at java.io.BufferedReader.fill(BufferedReader.java:136) 
at java.io.BufferedReader.readLine(BufferedReader.java:299) 
at java.io.BufferedReader.readLine(BufferedReader.java:362) 
at org.apache.hadoop.util.Shell.runCommand(Shell.java:202) 
at org.apache.hadoop.util.Shell.run(Shell.java:145) 
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:338) 
at 
org.apache.hadoop.mapred.NodeHealthCheckerService$NodeHealthMonitorExecutor.run(NodeHealthCheckerService.java:119)
 
at java.util.TimerThread.mainLoop(Timer.java:512) 
at java.util.TimerThread.run(Timer.java:462) 
{noformat}

Also the "mapred.healthChecker.script.timeout" is not being reflected in the 
job.xml. It always picks up the default value. It is just an UI issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (MAPREDUCE-709) node health check script does not display the correct message on timeout

2009-07-06 Thread Sreekanth Ramakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreekanth Ramakrishnan reassigned MAPREDUCE-709:


Assignee: Sreekanth Ramakrishnan

> node health check script does not display the correct message on timeout
> 
>
> Key: MAPREDUCE-709
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-709
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.21.0
>Reporter: Ramya R
>Assignee: Sreekanth Ramakrishnan
>Priority: Minor
> Fix For: 0.21.0
>
>
> When the node health check script takes more than 
> "mapred.healthChecker.script.timeout" to return, it should display a timeout 
> message. Instead it displays the full stacktrace as below:
> {noformat}
> java.io.IOException: Stream closed at 
> java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:145) 
> at java.io.BufferedInputStream.read(BufferedInputStream.java:308) 
> at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264) 
> at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306) 
> at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158) 
> at java.io.InputStreamReader.read(InputStreamReader.java:167) 
> at java.io.BufferedReader.fill(BufferedReader.java:136) 
> at java.io.BufferedReader.readLine(BufferedReader.java:299) 
> at java.io.BufferedReader.readLine(BufferedReader.java:362) 
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:202) 
> at org.apache.hadoop.util.Shell.run(Shell.java:145) 
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:338) 
> at 
> org.apache.hadoop.mapred.NodeHealthCheckerService$NodeHealthMonitorExecutor.run(NodeHealthCheckerService.java:119)
>  
> at java.util.TimerThread.mainLoop(Timer.java:512) 
> at java.util.TimerThread.run(Timer.java:462) 
> {noformat}
> Also the "mapred.healthChecker.script.timeout" is not being reflected in the 
> job.xml. It always picks up the default value. It is just an UI issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-709) node health check script does not display the correct message on timeout

2009-07-06 Thread Sreekanth Ramakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreekanth Ramakrishnan updated MAPREDUCE-709:
-

Attachment: mapred-709-1.patch

Attaching a patch fixing this issue:

* Sets the proper exit status; previously we were not using the TIMEOUT enum.
* Changes the test case to check for the proper timeout message.
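
As a rough illustration of the intended behavior (the enum and class names below are 
assumptions for the sketch, not the actual NodeHealthCheckerService code), a timed-out 
script run should map to a dedicated status with a fixed message rather than surfacing 
the IOException stack trace:

{noformat}
// Illustrative sketch only; names are assumed, not the real Hadoop code.
enum HealthCheckerExitStatus { SUCCESS, FAILED, TIMED_OUT }

class HealthCheckResult {
  final HealthCheckerExitStatus status;
  final String report;

  HealthCheckResult(HealthCheckerExitStatus status, String report) {
    this.status = status;
    this.report = report;
  }
}

class HealthScriptInterpreter {
  static final String TIMEOUT_MSG = "Node health script timed out";

  HealthCheckResult interpret(boolean timedOut, int exitCode, String output) {
    if (timedOut) {
      // Report a clean timeout message instead of the stream-closed stack trace.
      return new HealthCheckResult(HealthCheckerExitStatus.TIMED_OUT, TIMEOUT_MSG);
    }
    if (exitCode != 0) {
      return new HealthCheckResult(HealthCheckerExitStatus.FAILED, output);
    }
    return new HealthCheckResult(HealthCheckerExitStatus.SUCCESS, output);
  }
}
{noformat}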

> node health check script does not display the correct message on timeout
> 
>
> Key: MAPREDUCE-709
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-709
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.21.0
>Reporter: Ramya R
>Assignee: Sreekanth Ramakrishnan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: mapred-709-1.patch
>
>
> When the node health check script takes more than 
> "mapred.healthChecker.script.timeout" to return, it should display a timeout 
> message. Instead it displays the full stacktrace as below:
> {noformat}
> java.io.IOException: Stream closed at 
> java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:145) 
> at java.io.BufferedInputStream.read(BufferedInputStream.java:308) 
> at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264) 
> at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306) 
> at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158) 
> at java.io.InputStreamReader.read(InputStreamReader.java:167) 
> at java.io.BufferedReader.fill(BufferedReader.java:136) 
> at java.io.BufferedReader.readLine(BufferedReader.java:299) 
> at java.io.BufferedReader.readLine(BufferedReader.java:362) 
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:202) 
> at org.apache.hadoop.util.Shell.run(Shell.java:145) 
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:338) 
> at 
> org.apache.hadoop.mapred.NodeHealthCheckerService$NodeHealthMonitorExecutor.run(NodeHealthCheckerService.java:119)
>  
> at java.util.TimerThread.mainLoop(Timer.java:512) 
> at java.util.TimerThread.run(Timer.java:462) 
> {noformat}
> Also the "mapred.healthChecker.script.timeout" is not being reflected in the 
> job.xml. It always picks up the default value. It is just an UI issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-709) node health check script does not display the correct message on timeout

2009-07-06 Thread Ramya R (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727457#action_12727457
 ] 

Ramya R commented on MAPREDUCE-709:
---

bq. Also the "mapred.healthChecker.script.timeout" is not being reflected in the 
job.xml. It always picks up the default value. It is just an UI issue.

This was due to a wrong conf dir; it is no longer observed.

> node health check script does not display the correct message on timeout
> 
>
> Key: MAPREDUCE-709
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-709
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.21.0
>Reporter: Ramya R
>Assignee: Sreekanth Ramakrishnan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: mapred-709-1.patch
>
>
> When the node health check script takes more than 
> "mapred.healthChecker.script.timeout" to return, it should display a timeout 
> message. Instead it displays the full stacktrace as below:
> {noformat}
> java.io.IOException: Stream closed at 
> java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:145) 
> at java.io.BufferedInputStream.read(BufferedInputStream.java:308) 
> at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264) 
> at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306) 
> at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158) 
> at java.io.InputStreamReader.read(InputStreamReader.java:167) 
> at java.io.BufferedReader.fill(BufferedReader.java:136) 
> at java.io.BufferedReader.readLine(BufferedReader.java:299) 
> at java.io.BufferedReader.readLine(BufferedReader.java:362) 
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:202) 
> at org.apache.hadoop.util.Shell.run(Shell.java:145) 
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:338) 
> at 
> org.apache.hadoop.mapred.NodeHealthCheckerService$NodeHealthMonitorExecutor.run(NodeHealthCheckerService.java:119)
>  
> at java.util.TimerThread.mainLoop(Timer.java:512) 
> at java.util.TimerThread.run(Timer.java:462) 
> {noformat}
> Also the "mapred.healthChecker.script.timeout" is not being reflected in the 
> job.xml. It always picks up the default value. It is just an UI issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-657) CompletedJobStatusStore hardcodes filesystem to hdfs

2009-07-06 Thread Iyappan Srinivasan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727459#action_12727459
 ] 

Iyappan Srinivasan commented on MAPREDUCE-657:
--

The job info files are also not being removed from HDFS after 
"mapred.job.tracker.persist.jobstatus.hours".
I set it to 1 hour and they were deleted only after 1 hour 57 minutes.

> CompletedJobStatusStore hardcodes filesystem to hdfs
> 
>
> Key: MAPREDUCE-657
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-657
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Reporter: Amar Kamat
>Assignee: Amar Kamat
> Fix For: 0.20.1
>
> Attachments: MAPREDUCE-657-v1.0.patch, 
> MAPREDUCE-657-v1.2-branch-0.20.patch, MAPREDUCE-657-v1.2.patch
>
>
> Today, completedjobstatusstore stores only to hdfs. It should be configurable 
> to write to local-fs too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-371) Change org.apache.hadoop.mapred.lib.KeyFieldBasedComparator and org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner to use new api

2009-07-06 Thread Jothi Padmanabhan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727463#action_12727463
 ] 

Jothi Padmanabhan commented on MAPREDUCE-371:
-

+1. Changes look good. 

> Change org.apache.hadoop.mapred.lib.KeyFieldBasedComparator and 
> org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner to use new api
> 
>
> Key: MAPREDUCE-371
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-371
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Attachments: patch-371.txt
>
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-630) TestKillCompletedJob can be modified to improve execution times

2009-07-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727472#action_12727472
 ] 

Hadoop QA commented on MAPREDUCE-630:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12412585/mapred-630.patch
  against trunk revision 791401.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/355/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/355/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/355/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/355/console

This message is automatically generated.

> TestKillCompletedJob can be modified to improve execution times
> ---
>
> Key: MAPREDUCE-630
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-630
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jothi Padmanabhan
>Assignee: Jothi Padmanabhan
>Priority: Minor
> Attachments: hadoop-6068.patch, mapred-630.patch
>
>
> This test can be easily made into a unit test

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-626) Modify TestLostTracker to improve execution time

2009-07-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727483#action_12727483
 ] 

Hadoop QA commented on MAPREDUCE-626:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12412593/mapred-626.patch
  against trunk revision 791418.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/356/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/356/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/356/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/356/console

This message is automatically generated.

> Modify TestLostTracker to improve execution time
> 
>
> Key: MAPREDUCE-626
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-626
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Jothi Padmanabhan
>Assignee: Jothi Padmanabhan
>Priority: Minor
> Attachments: mapred-626.patch, mapred-626.patch
>
>
> This test can be made faster with a few modifications

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-708) node health check script does not refresh the "reason for blacklisting"

2009-07-06 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727493#action_12727493
 ] 

Hemanth Yamijala commented on MAPREDUCE-708:


Changes look good. +1. Can you please run ant test-patch and the relevant tests?

> node health check script does not refresh the "reason for blacklisting"
> ---
>
> Key: MAPREDUCE-708
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-708
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.21.0
>Reporter: Ramya R
>Assignee: Sreekanth Ramakrishnan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-708-1.patch
>
>
> After MAPREDUCE-211, the node health check script does not refresh the 
> "reason for blacklisting".
> The steps to reproduce the issue are:
> * Blacklist a TT with an error message 'x'
> * Change the health check script to return an error message 'y'
> * The "reason for blacklisting" still shows 'x'
>  
> The impact of this issue is that the feature fails to trap transient errors.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-709) node health check script does not display the correct message on timeout

2009-07-06 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727496#action_12727496
 ] 

Hemanth Yamijala commented on MAPREDUCE-709:


Changes look good to me. +1. Please upload results of test-patch and relevant 
tests.

> node health check script does not display the correct message on timeout
> 
>
> Key: MAPREDUCE-709
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-709
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.21.0
>Reporter: Ramya R
>Assignee: Sreekanth Ramakrishnan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: mapred-709-1.patch
>
>
> When the node health check script takes more than 
> "mapred.healthChecker.script.timeout" to return, it should display a timeout 
> message. Instead it displays the full stacktrace as below:
> {noformat}
> java.io.IOException: Stream closed at 
> java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:145) 
> at java.io.BufferedInputStream.read(BufferedInputStream.java:308) 
> at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264) 
> at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306) 
> at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158) 
> at java.io.InputStreamReader.read(InputStreamReader.java:167) 
> at java.io.BufferedReader.fill(BufferedReader.java:136) 
> at java.io.BufferedReader.readLine(BufferedReader.java:299) 
> at java.io.BufferedReader.readLine(BufferedReader.java:362) 
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:202) 
> at org.apache.hadoop.util.Shell.run(Shell.java:145) 
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:338) 
> at 
> org.apache.hadoop.mapred.NodeHealthCheckerService$NodeHealthMonitorExecutor.run(NodeHealthCheckerService.java:119)
>  
> at java.util.TimerThread.mainLoop(Timer.java:512) 
> at java.util.TimerThread.run(Timer.java:462) 
> {noformat}
> Also the "mapred.healthChecker.script.timeout" is not being reflected in the 
> job.xml. It always picks up the default value. It is just an UI issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-548) Global scheduling in the Fair Scheduler

2009-07-06 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727580#action_12727580
 ] 

Matei Zaharia commented on MAPREDUCE-548:
-

Test failures still appear unrelated.

> Global scheduling in the Fair Scheduler
> ---
>
> Key: MAPREDUCE-548
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-548
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Matei Zaharia
>Assignee: Matei Zaharia
> Fix For: 0.21.0
>
> Attachments: fs-global-v0.patch, hadoop-4667-v1.patch, 
> hadoop-4667-v1b.patch, hadoop-4667-v2.patch, HADOOP-4667_api.patch, 
> mapreduce-548-v1.patch, mapreduce-548.patch
>
>
> The current schedulers in Hadoop all examine a single job on every heartbeat 
> when choosing which tasks to assign, choosing the job based on FIFO or fair 
> sharing. There are inherent limitations to this approach. For example, if the 
> job at the front of the queue is small (e.g. 10 maps, in a cluster of 100 
> nodes), then on average it will launch only one local map on the first 10 
> heartbeats while it is at the head of the queue. This leads to very poor 
> locality for small jobs. Instead, we need a more "global" view of scheduling 
> that can look at multiple jobs. To resolve the locality problem, we will use 
> the following algorithm:
> - If the job at the head of the queue has no node-local task to launch, skip 
> it and look through other jobs.
> - If a job has waited at least T1 seconds while being skipped, also allow it 
> to launch rack-local tasks.
> - If a job has waited at least T2 > T1 seconds, also allow it to launch 
> off-rack tasks.
> This algorithm improves locality while bounding the delay that any job 
> experiences in launching a task.
> It turns out that whether waiting is useful depends on how many tasks are 
> left in the job - the probability of getting a heartbeat from a node with a 
> local task - and on whether the job is CPU or IO bound. Thus there may be 
> logic for removing the wait on the last few tasks in the job.
> As a related issue, once we allow global scheduling, we can launch multiple 
> tasks per heartbeat, as in HADOOP-3136. The initial implementation of 
> HADOOP-3136 adversely affected performance because it only launched multiple 
> tasks from the same job, but with the wait rule above, we will only do this 
> for jobs that are allowed to launch non-local tasks.
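
A minimal sketch of the skip/wait rule above (a simplified illustration with 
hypothetical names; this is not the actual Fair Scheduler code):

{noformat}
// Simplified illustration of the delay-scheduling wait rule; names are hypothetical.
enum LocalityLevel { NODE_LOCAL, RACK_LOCAL, OFF_RACK }

class JobLocalityWait {
  private long firstSkippedAt = -1;   // when the job was first skipped; -1 if not waiting

  // Which locality level the job may launch at, given how long it has been skipped.
  LocalityLevel allowedLevel(long now, long t1Millis, long t2Millis) {
    if (firstSkippedAt < 0) return LocalityLevel.NODE_LOCAL;
    long waited = now - firstSkippedAt;
    if (waited >= t2Millis) return LocalityLevel.OFF_RACK;   // waited at least T2
    if (waited >= t1Millis) return LocalityLevel.RACK_LOCAL; // waited at least T1
    return LocalityLevel.NODE_LOCAL;
  }

  // Called when the job at the head of the queue had no allowed task to launch.
  void markSkipped(long now) {
    if (firstSkippedAt < 0) firstSkippedAt = now;
  }

  // Called once the job launches a task at its preferred locality.
  void reset() {
    firstSkippedAt = -1;
  }
}
{noformat}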

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-710) Sqoop should read and transmit passwords in a more secure manner

2009-07-06 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-710:


Attachment: MAPREDUCE-710.patch

> Sqoop should read and transmit passwords in a more secure manner
> 
>
> Key: MAPREDUCE-710
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-710
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-710.patch
>
>
> Sqoop's current support for passwords involves reading passwords from the 
> command line "--password foo", which makes the password visible to other 
> users via 'ps'. An invisible-console approach should be taken.
> Related, Sqoop transmits passwords to mysqldump in the same fashion, which is 
> also insecure.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-710) Sqoop should read and transmit passwords in a more secure manner

2009-07-06 Thread Aaron Kimball (JIRA)
Sqoop should read and transmit passwords in a more secure manner


 Key: MAPREDUCE-710
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-710
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/sqoop
Reporter: Aaron Kimball
Assignee: Aaron Kimball
 Attachments: MAPREDUCE-710.patch

Sqoop's current support for passwords involves reading passwords from the 
command line "--password foo", which makes the password visible to other users 
via 'ps'. An invisible-console approach should be taken.

Relatedly, Sqoop transmits passwords to mysqldump in the same fashion, which is 
also insecure.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-710) Sqoop should read and transmit passwords in a more secure manner

2009-07-06 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-710:


Status: Patch Available  (was: Open)

> Sqoop should read and transmit passwords in a more secure manner
> 
>
> Key: MAPREDUCE-710
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-710
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-710.patch
>
>
> Sqoop's current support for passwords involves reading passwords from the 
> command line "--password foo", which makes the password visible to other 
> users via 'ps'. An invisible-console approach should be taken.
> Related, Sqoop transmits passwords to mysqldump in the same fashion, which is 
> also insecure.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-710) Sqoop should read and transmit passwords in a more secure manner

2009-07-06 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727630#action_12727630
 ] 

Aaron Kimball commented on MAPREDUCE-710:
-

This patch adds a {{\-P}} flag which prompts for the password on the console using 
java.io.Console.readPassword().

It also changes the mysqldump logic to write a user-readable-only file 
containing the password and use that instead of {{\-\-password}} on the 
command line, which is insecure. Since mysqldump reads its password directly 
from the console, not from stdin, it is impossible to "directly" feed the 
password to mysqldump. Thus a user-only file is the means I've chosen to 
transmit the password.
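
As a rough sketch of the two pieces (illustrative only; the class, method, and file 
names are assumptions, not the actual Sqoop code, and the options-file format handed 
to mysqldump is likewise assumed):

{noformat}
// Illustrative sketch; not the actual Sqoop implementation.
import java.io.Console;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

public class PasswordHelper {

  /** Prompt on the console without echoing the characters. */
  public static String promptForPassword() {
    Console console = System.console();
    if (console == null) {
      throw new IllegalStateException("No console available to prompt for a password");
    }
    char[] chars = console.readPassword("Enter password: ");
    return new String(chars);
  }

  /**
   * Write the password to a temporary file readable only by the current user,
   * so it can be handed to mysqldump instead of a --password argument that
   * would be visible to other users via 'ps'.
   */
  public static File writeUserOnlyPasswordFile(String password) throws IOException {
    File f = File.createTempFile("mysql-cnf", ".cnf");
    // Drop permissions for everyone, then re-grant them to the owner only,
    // before the secret is written.
    f.setReadable(false, false);
    f.setWritable(false, false);
    f.setReadable(true, true);
    f.setWritable(true, true);
    FileWriter writer = new FileWriter(f);
    try {
      writer.write("[client]\npassword=" + password + "\n");
    } finally {
      writer.close();
    }
    f.deleteOnExit();
    return f;
  }
}
{noformat}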

To test this behavior, I have added a new test case which Hudson won't run by 
default. Users with mysql who wish to run it should run {{ant jar 
\-Dtestcase=MySQLAuthTest}} in the {{src/contrib/sqoop}} directory.


> Sqoop should read and transmit passwords in a more secure manner
> 
>
> Key: MAPREDUCE-710
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-710
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-710.patch
>
>
> Sqoop's current support for passwords involves reading passwords from the 
> command line "--password foo", which makes the password visible to other 
> users via 'ps'. An invisible-console approach should be taken.
> Related, Sqoop transmits passwords to mysqldump in the same fashion, which is 
> also insecure.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-710) Sqoop should read and transmit passwords in a more secure manner

2009-07-06 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727632#action_12727632
 ] 

Aaron Kimball commented on MAPREDUCE-710:
-

Oops - that should be {{ant test \-Dtestcase=MySQLAuthTest}}, not {{ant jar}}. 

> Sqoop should read and transmit passwords in a more secure manner
> 
>
> Key: MAPREDUCE-710
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-710
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-710.patch
>
>
> Sqoop's current support for passwords involves reading passwords from the 
> command line "--password foo", which makes the password visible to other 
> users via 'ps'. An invisible-console approach should be taken.
> Related, Sqoop transmits passwords to mysqldump in the same fashion, which is 
> also insecure.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-711) Move Distributed Cache from Common to Map/Reduce

2009-07-06 Thread Owen O'Malley (JIRA)
Move Distributed Cache from Common to Map/Reduce


 Key: MAPREDUCE-711
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-711
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Reporter: Owen O'Malley


Distributed Cache logically belongs as part of map/reduce and not Common.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-711) Move Distributed Cache from Common to Map/Reduce

2009-07-06 Thread Philip Zeyliger (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727711#action_12727711
 ] 

Philip Zeyliger commented on MAPREDUCE-711:
---

+1!

> Move Distributed Cache from Common to Map/Reduce
> 
>
> Key: MAPREDUCE-711
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-711
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>
> Distributed Cache logically belongs as part of map/reduce and not Common.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-710) Sqoop should read and transmit passwords in a more secure manner

2009-07-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727764#action_12727764
 ] 

Hadoop QA commented on MAPREDUCE-710:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12412634/MAPREDUCE-710.patch
  against trunk revision 791418.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 8 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/357/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/357/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/357/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/357/console

This message is automatically generated.

> Sqoop should read and transmit passwords in a more secure manner
> 
>
> Key: MAPREDUCE-710
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-710
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-710.patch
>
>
> Sqoop's current support for passwords involves reading passwords from the 
> command line "--password foo", which makes the password visible to other 
> users via 'ps'. An invisible-console approach should be taken.
> Related, Sqoop transmits passwords to mysqldump in the same fashion, which is 
> also insecure.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-712) TextWritter example is CPU bound!!

2009-07-06 Thread Khaled Elmeleegy (JIRA)
TextWritter example is CPU bound!!
--

 Key: MAPREDUCE-712
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-712
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: task
Affects Versions: 0.20.1, 0.21.0
 Environment: ~200 nodes cluster
Each node has the following configuration:
Processors: 2 x Xeon L5420 2.50GHz (8 cores) - Harpertown C0, 64-bit, 
quad-core (8 CPUs)
4 Disks
16 GB RAM
Linux 2.6
Hadoop version: trunk
Reporter: Khaled Elmeleegy


Running the RandomTextWriter example job (from the examples jar) pegs the 
machines' CPUs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-712) TextWritter example is CPU bound!!

2009-07-06 Thread Khaled Elmeleegy (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khaled Elmeleegy updated MAPREDUCE-712:
---


I used the following command line for the RandomTextWriter job:

 ./hadoop jar ../hadoop-0.21.0-dev-*examples.jar randomtextwriter \
   -D test.randomtextwrite.total_bytes=5368709120 \
   -D test.randomtextwrite.bytes_per_map=536870912 \
   -D test.randomtextwrite.min_words_key=5 \
   -D test.randomtextwrite.max_words_key=10 \
   -D test.randomtextwrite.min_words_value=100 \
   -D test.randomtextwrite.max_words_value=1 \
   -D mapred.output.compress=false \
   -D mapred.map.output.compression.type=BLOCK \
   -outFormat org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat \
   /gridmix/data/WebSimulationBlock


The job has 100,000 maps and no reduces. I configured HDFS with a replication 
factor of 1 to eliminate network traffic. Nodes were configured to have 16 map 
slots and 2 reduce slots. Each task was configured to have at most 512 MB of 
Java heap space. The job's output is ~50 TB >> overall cluster memory, 
forcing disk I/O.

Since the job does no computation other than writing to disk, one would expect 
it to be totally I/O (disk) bound. Surprisingly, it turned out to be CPU bound.

Measurements (using Chukwa):

Across the cluster, worker CPUs were <5% idle on average. Disk bandwidth used was 
~40 MB/s across all disks on all the nodes in the cluster, which is close to the 
practical disk bandwidth limit. The network was virtually 100% idle, as one would 
expect.

~70% of the CPU time was in user space, suggesting the overhead is mainly in the 
map tasks.

This suggests that there is a lot of CPU fat in the map tasks.


> TextWritter example is CPU bound!!
> --
>
> Key: MAPREDUCE-712
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-712
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task
>Affects Versions: 0.20.1, 0.21.0
> Environment: ~200 nodes cluster
> Each node has the following configuration:
> Processors: 2 x Xeon L5420 2.50GHz (8 cores) - Harpertown C0, 64-bit, 
> quad-core (8 CPUs)
> 4 Disks
> 16 GB RAM
> Linux 2.6
> Hadoop version: trunk
>Reporter: Khaled Elmeleegy
>
> Running the RandomTextWritter example job ( from the examples jar) pegs the 
> machiens' CPUs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-712) TextWritter example is CPU bound!!

2009-07-06 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727780#action_12727780
 ] 

Arun C Murthy commented on MAPREDUCE-712:
-

Khaled, do you have details on where the CPU is being consumed? Is it the map 
task? Did you profile the task to see where the CPU is being consumed?

What about the datanode and the tasktracker? 
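
For what it's worth, one low-effort way to get those profiles is the built-in 
task profiling support; the property names below are the stock 0.20-era knobs 
as I remember them, so treat this as a hedged sketch rather than the exact 
switches used on this cluster:

{code}
import org.apache.hadoop.mapred.JobConf;

public class EnableTaskProfiling {
  public static void main(String[] args) {
    // Sketch only: turn on HPROF profiling for a handful of map tasks so the
    // CPU hotspots can be inspected. Property names are assumptions, not
    // something quoted from this thread.
    JobConf conf = new JobConf();
    conf.setBoolean("mapred.task.profile", true);   // enable per-task profiling
    conf.set("mapred.task.profile.maps", "0-2");    // profile only the first few maps
    conf.set("mapred.task.profile.reduces", "");    // this job has no reduces
    System.out.println("profiling enabled: " + conf.get("mapred.task.profile"));
  }
}
{code}

The profile output ends up alongside the task logs, so it can be pulled from 
the web UI or the userlogs directory on the tasktrackers.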

> TextWritter example is CPU bound!!
> --
>
> Key: MAPREDUCE-712
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-712
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task
>Affects Versions: 0.20.1, 0.21.0
> Environment: ~200 nodes cluster
> Each node has the following configuration:
> Processors: 2 x Xeon L5420 2.50GHz (8 cores) - Harpertown C0, 64-bit, 
> quad-core (8 CPUs)
> 4 Disks
> 16 GB RAM
> Linux 2.6
> Hadoop version: trunk
>Reporter: Khaled Elmeleegy
>
> Running the RandomTextWritter example job ( from the examples jar) pegs the 
> machiens' CPUs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-710) Sqoop should read and transmit passwords in a more secure manner

2009-07-06 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727784#action_12727784
 ] 

Aaron Kimball commented on MAPREDUCE-710:
-

These test failures are unrelated to the patch.

> Sqoop should read and transmit passwords in a more secure manner
> 
>
> Key: MAPREDUCE-710
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-710
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
> Attachments: MAPREDUCE-710.patch
>
>
> Sqoop's current support for passwords involves reading passwords from the 
> command line "--password foo", which makes the password visible to other 
> users via 'ps'. An invisible-console approach should be taken.
> Related, Sqoop transmits passwords to mysqldump in the same fashion, which is 
> also insecure.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-712) TextWritter example is CPU bound!!

2009-07-06 Thread Khaled Elmeleegy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727797#action_12727797
 ] 

Khaled Elmeleegy commented on MAPREDUCE-712:


Well, top reports that ~170% (~1.7 CPUs) is spent in the datanode, which makes 
sense, as it is receiving all these writes. The rest of the time is distributed 
evenly among the map tasks; this part doesn't seem right, too much fat.

One thing to add: with a replication factor of 3, the bottleneck shifts to the 
network, no surprise there.






> TextWritter example is CPU bound!!
> --
>
> Key: MAPREDUCE-712
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-712
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task
>Affects Versions: 0.20.1, 0.21.0
> Environment: ~200 nodes cluster
> Each node has the following configuration:
> Processors: 2 x Xeon L5420 2.50GHz (8 cores) - Harpertown C0, 64-bit, 
> quad-core (8 CPUs)
> 4 Disks
> 16 GB RAM
> Linux 2.6
> Hadoop version: trunk
>Reporter: Khaled Elmeleegy
>
> Running the RandomTextWritter example job ( from the examples jar) pegs the 
> machiens' CPUs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (MAPREDUCE-712) TextWritter example is CPU bound!!

2009-07-06 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved MAPREDUCE-712.
-

Resolution: Invalid

16 maps on 8 CPUs running gzip is expected to completely saturate the CPUs. 
This is not a bug!!!

Also check whether you were using the native codec. If you are using the Java 
codec, it will be very slow and CPU bound.
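
If it helps, whether the native codec is actually in use can be checked 
programmatically; this is a small standalone check using the stock 
NativeCodeLoader/ZlibFactory helpers (a sketch, not a claim about what this 
cluster was running):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.zlib.ZlibFactory;
import org.apache.hadoop.util.NativeCodeLoader;

public class NativeCodecCheck {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // True only when libhadoop was found on java.library.path.
    boolean nativeLoaded = NativeCodeLoader.isNativeCodeLoaded();
    // True when the native zlib/gzip implementation would actually be used.
    boolean nativeZlib = ZlibFactory.isNativeZlibLoaded(conf);
    System.out.println("native hadoop library loaded: " + nativeLoaded);
    System.out.println("native zlib in use:           " + nativeZlib);
  }
}
{code}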

> TextWritter example is CPU bound!!
> --
>
> Key: MAPREDUCE-712
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-712
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task
>Affects Versions: 0.20.1, 0.21.0
> Environment: ~200 nodes cluster
> Each node has the following configuration:
> Processors: 2 x Xeon L5420 2.50GHz (8 cores) - Harpertown C0, 64-bit, 
> quad-core (8 CPUs)
> 4 Disks
> 16 GB RAM
> Linux 2.6
> Hadoop version: trunk
>Reporter: Khaled Elmeleegy
>
> Running the RandomTextWritter example job ( from the examples jar) pegs the 
> machiens' CPUs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Reopened: (MAPREDUCE-712) TextWritter example is CPU bound!!

2009-07-06 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley reopened MAPREDUCE-712:
-


I notice now that you didn't have compression. I wonder how much time you were 
spending in GC with such small heaps. That might explain the CPU load.

> TextWritter example is CPU bound!!
> --
>
> Key: MAPREDUCE-712
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-712
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task
>Affects Versions: 0.20.1, 0.21.0
> Environment: ~200 nodes cluster
> Each node has the following configuration:
> Processors: 2 x Xeon L5420 2.50GHz (8 cores) - Harpertown C0, 64-bit, 
> quad-core (8 CPUs)
> 4 Disks
> 16 GB RAM
> Linux 2.6
> Hadoop version: trunk
>Reporter: Khaled Elmeleegy
>
> Running the RandomTextWritter example job ( from the examples jar) pegs the 
> machiens' CPUs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-712) TextWritter example is CPU bound!!

2009-07-06 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727808#action_12727808
 ] 

Arun C Murthy commented on MAPREDUCE-712:
-

bq.  I wonder how much time you were spending in gc with such small heaps. That 
might explain the cpu load.

Agreed. You have ~500MB of data per map (50TB / 100k maps), which results in a 
significant number of output Text objects being created in RandomTextWriter (a 
potential bug).
Thus, we'd get a lot of data out of the profiles of the tasks... 
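
To make the object-creation point concrete, the allocation-friendly pattern 
would look roughly like the following; this is a hypothetical sketch (the class 
and the RecordSink interface are made up for illustration), not the actual 
RandomTextWriter code:

{code}
import org.apache.hadoop.io.Text;

public class ReusedTextWriterSketch {
  // Keep one Text per field and overwrite its contents on every record,
  // instead of allocating fresh Text objects each time.
  private final Text key = new Text();
  private final Text value = new Text();

  public void writeRecord(String randomKey, String randomValue,
                          RecordSink sink) throws Exception {
    key.set(randomKey);     // Text.set() reuses the backing buffer when it can
    value.set(randomValue);
    sink.write(key, value);
  }

  /** Stand-in for OutputCollector/Context; named here only for the sketch. */
  public interface RecordSink {
    void write(Text key, Text value) throws Exception;
  }
}
{code}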

> TextWritter example is CPU bound!!
> --
>
> Key: MAPREDUCE-712
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-712
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task
>Affects Versions: 0.20.1, 0.21.0
> Environment: ~200 nodes cluster
> Each node has the following configuration:
> Processors: 2 x Xeon L5420 2.50GHz (8 cores) - Harpertown C0, 64-bit, 
> quad-core (8 CPUs)
> 4 Disks
> 16 GB RAM
> Linux 2.6
> Hadoop version: trunk
>Reporter: Khaled Elmeleegy
>
> Running the RandomTextWritter example job ( from the examples jar) pegs the 
> machiens' CPUs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-712) TextWritter example is CPU bound!!

2009-07-06 Thread Khaled Elmeleegy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727811#action_12727811
 ] 

Khaled Elmeleegy commented on MAPREDUCE-712:


Ah, I forgot to mention I turned off compression to make sure it's not a
bottleneck. You can see that in the command line I used.






> TextWritter example is CPU bound!!
> --
>
> Key: MAPREDUCE-712
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-712
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task
>Affects Versions: 0.20.1, 0.21.0
> Environment: ~200 nodes cluster
> Each node has the following configuration:
> Processors: 2 x Xeon L5420 2.50GHz (8 cores) - Harpertown C0, 64-bit, 
> quad-core (8 CPUs)
> 4 Disks
> 16 GB RAM
> Linux 2.6
> Hadoop version: trunk
>Reporter: Khaled Elmeleegy
>
> Running the RandomTextWritter example job ( from the examples jar) pegs the 
> machiens' CPUs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-675) Sqoop should allow user-defined class and package names

2009-07-06 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-675:


Status: Open  (was: Patch Available)

> Sqoop should allow user-defined class and package names
> ---
>
> Key: MAPREDUCE-675
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-675
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Minor
> Attachments: classnames.patch, MAPREDUCE-675.2.patch
>
>
> Currently Sqoop generates a class for each table to be imported; the class 
> names are equal to the table names and they are not part of any package.
> This adds --class-name and --package-name parameters to Sqoop, allowing these 
> aspects of code generation to be controlled.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-675) Sqoop should allow user-defined class and package names

2009-07-06 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-675:


Attachment: MAPREDUCE-675.2.patch

> Sqoop should allow user-defined class and package names
> ---
>
> Key: MAPREDUCE-675
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-675
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Minor
> Attachments: classnames.patch, MAPREDUCE-675.2.patch
>
>
> Currently Sqoop generates a class for each table to be imported; the class 
> names are equal to the table names and they are not part of any package.
> This adds --class-name and --package-name parameters to Sqoop, allowing these 
> aspects of code generation to be controlled.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-675) Sqoop should allow user-defined class and package names

2009-07-06 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727816#action_12727816
 ] 

Aaron Kimball commented on MAPREDUCE-675:
-

New patch which passes unit tests.

> Sqoop should allow user-defined class and package names
> ---
>
> Key: MAPREDUCE-675
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-675
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Minor
> Attachments: classnames.patch, MAPREDUCE-675.2.patch
>
>
> Currently Sqoop generates a class for each table to be imported; the class 
> names are equal to the table names and they are not part of any package.
> This adds --class-name and --package-name parameters to Sqoop, allowing these 
> aspects of code generation to be controlled.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-675) Sqoop should allow user-defined class and package names

2009-07-06 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-675:


Status: Patch Available  (was: Open)

> Sqoop should allow user-defined class and package names
> ---
>
> Key: MAPREDUCE-675
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-675
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Minor
> Attachments: classnames.patch, MAPREDUCE-675.2.patch
>
>
> Currently Sqoop generates a class for each table to be imported; the class 
> names are equal to the table names and they are not part of any package.
> This adds --class-name and --package-name parameters to Sqoop, allowing these 
> aspects of code generation to be controlled.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-712) TextWritter example is CPU bound!!

2009-07-06 Thread Hong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727818#action_12727818
 ] 

Hong Tang commented on MAPREDUCE-712:
-

Besides profiling, I guess two other things you might want to try, to ensure the 
problem is not in either DFS or the specific output format you are using:
- Run the TestDFSIO benchmark to make sure the HDFS layer does not consume all 
the CPUs.
- Use a dumber writer that simply writes zeros to the files instead of wasting 
time generating random text keys and values (which uses StringBuffer and does 
UTF-8 conversions).
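
Here is a rough sketch of such a zero writer, using the old mapred API; the 
class, the record count and the buffer sizes are all made up for illustration:

{code}
import java.io.IOException;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

/** Hypothetical "zero writer": no random text, no StringBuffer, no UTF-8 work. */
public class ZeroWriterMapper extends MapReduceBase
    implements Mapper<Object, Object, BytesWritable, BytesWritable> {

  private final BytesWritable key = new BytesWritable(new byte[16]);
  private final BytesWritable value = new BytesWritable(new byte[10 * 1024]);
  private static final long RECORDS_PER_CALL = 50000L;  // made-up number

  public void map(Object ignoredKey, Object ignoredValue,
                  OutputCollector<BytesWritable, BytesWritable> out,
                  Reporter reporter) throws IOException {
    // Emit the same zero-filled buffers over and over; the only remaining
    // cost should be serialization and the write path itself.
    for (long i = 0; i < RECORDS_PER_CALL; i++) {
      out.collect(key, value);
      reporter.progress();
    }
  }
}
{code}

If this writer still pegs the CPUs, the fat is in the framework/write path; if 
not, it is in the random text generation.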

> TextWritter example is CPU bound!!
> --
>
> Key: MAPREDUCE-712
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-712
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: task
>Affects Versions: 0.20.1, 0.21.0
> Environment: ~200 nodes cluster
> Each node has the following configuration:
> Processors: 2 x Xeon L5420 2.50GHz (8 cores) - Harpertown C0, 64-bit, 
> quad-core (8 CPUs)
> 4 Disks
> 16 GB RAM
> Linux 2.6
> Hadoop version: trunk
>Reporter: Khaled Elmeleegy
>
> Running the RandomTextWritter example job ( from the examples jar) pegs the 
> machiens' CPUs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-713) Sqoop has some superfluous imports

2009-07-06 Thread Aaron Kimball (JIRA)
Sqoop has some superfluous imports
--

 Key: MAPREDUCE-713
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-713
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/sqoop
Reporter: Aaron Kimball
Assignee: Aaron Kimball
Priority: Trivial
 Attachments: MAPREDUCE-713.patch

Some classes have vestigial imports that should be removed

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-713) Sqoop has some superfluous imports

2009-07-06 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727822#action_12727822
 ] 

Aaron Kimball commented on MAPREDUCE-713:
-

Cleaning a few of these up. There aren't any new tests because this is a simple 
refactoring.

> Sqoop has some superfluous imports
> --
>
> Key: MAPREDUCE-713
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-713
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Trivial
> Attachments: MAPREDUCE-713.patch
>
>
> Some classes have vestigial imports that should be removed

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-713) Sqoop has some superfluous imports

2009-07-06 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-713:


Attachment: MAPREDUCE-713.patch

> Sqoop has some superfluous imports
> --
>
> Key: MAPREDUCE-713
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-713
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Trivial
> Attachments: MAPREDUCE-713.patch
>
>
> Some classes have vestigial imports that should be removed

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-713) Sqoop has some superfluous imports

2009-07-06 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-713:


Status: Patch Available  (was: Open)

> Sqoop has some superfluous imports
> --
>
> Key: MAPREDUCE-713
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-713
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Trivial
> Attachments: MAPREDUCE-713.patch
>
>
> Some classes have vestigial imports that should be removed

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-546) Provide sample fair scheduler config file in conf/ and set config file property to point to this by default

2009-07-06 Thread Matei Zaharia (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matei Zaharia updated MAPREDUCE-546:


Attachment: mapreduce-546.patch

Here's a patch for this issue that makes the scheduler use fair-scheduler.xml 
off the classpath if no other allocation file is specified through 
mapred.fairscheduler.allocation.file. (We keep this parameter for backwards 
compatibility).

The only tricky part was using the URL from the ClassLoader's getResource 
method instead of a String for the path to the file. I followed the example set 
in Configuration of having an Object as the allocation file name that may be 
either a URL or a String, because I didn't want to force users of the existing 
jobconf parameter to supply a file:// URL, and I also didn't want to prepend 
file:// (this doesn't work with relative paths, and although the docs have said 
to use an absolute path since the scheduler was released, it could be 
confusing).

I haven't included a unit test because the code changes are minor and I can't 
think of an easy way to unit test this. However, I did manually test that 
configurations are found whether the default config file fair-scheduler.xml is 
used or another file is specified through the config parameter, and also that 
the allocation file is reloaded at runtime in both situations.

I've also updated the docs to reflect the new functionality and emptied-out the 
fair-scheduler.xml template so that it creates no pools by default. The docs 
cover all the features that were described in the old 
fair-scheduler.xml.template.
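
Roughly, the lookup described above boils down to something like this (a 
sketch of the idea, not the patch itself):

{code}
import java.net.URL;
import org.apache.hadoop.conf.Configuration;

public class AllocationFileLookup {
  /**
   * Returns either a String path (the old mapred.fairscheduler.allocation.file
   * parameter) or a URL (fair-scheduler.xml found on the classpath), or null
   * if neither exists. Sketch only; the real change lives in the scheduler.
   */
  public static Object findAllocationFile(Configuration conf) {
    String configured = conf.get("mapred.fairscheduler.allocation.file");
    if (configured != null) {
      return configured;            // backwards-compatible String path
    }
    URL fromClasspath = AllocationFileLookup.class.getClassLoader()
        .getResource("fair-scheduler.xml");
    return fromClasspath;           // may be null if the file is not packaged
  }
}
{code}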

> Provide sample fair scheduler config file in conf/ and set config file 
> property to point to this by default
> ---
>
> Key: MAPREDUCE-546
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-546
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Matei Zaharia
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: mapreduce-546.patch
>
>
> The capacity scheduler includes a config file template in hadoop/conf, so it 
> would make sense to create a similar one for the fair scheduler and mention 
> it in the README.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-546) Provide sample fair scheduler config file in conf/ and set config file property to point to this by default

2009-07-06 Thread Matei Zaharia (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matei Zaharia updated MAPREDUCE-546:


Fix Version/s: 0.21.0
   Status: Patch Available  (was: Open)

> Provide sample fair scheduler config file in conf/ and set config file 
> property to point to this by default
> ---
>
> Key: MAPREDUCE-546
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-546
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>Reporter: Matei Zaharia
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: mapreduce-546.patch
>
>
> The capacity scheduler includes a config file template in hadoop/conf, so it 
> would make sense to create a similar one for the fair scheduler and mention 
> it in the README.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-714) JobConf.findContainingJar unescapes unnecessarily on Linux

2009-07-06 Thread Todd Lipcon (JIRA)
JobConf.findContainingJar unescapes unnecessarily on Linux
--

 Key: MAPREDUCE-714
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-714
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Reporter: Todd Lipcon


In JobConf.findContainingJar, the path name is decoded using 
URLDecoder.decode(...). This was done by Doug in r381794 (commit msg "Un-escape 
containing jar's path, which is URL-encoded.  This fixes things primarily on 
Windows, where paths are likely to contain spaces.") Unfortunately, jar paths 
do not appear to be URL encoded on Linux. If you try to use "hadoop jar" on a 
jar with a "+" in it, this function decodes it to a space and then the job 
cannot be submitted.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-715) Allow pool and user default settings to be set through poolDefaults and userDefaults elements

2009-07-06 Thread Matei Zaharia (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727858#action_12727858
 ] 

Matei Zaharia commented on MAPREDUCE-715:
-

Here's an example expanding on the description. Today, a config file might look 
as follows:

{code}
<?xml version="1.0"?>
<allocations>
  <pool name="pool_a">
    <minMaps>100</minMaps>
    <minReduces>300</minReduces>
  </pool>
  <user name="user_a">
    <maxRunningJobs>5</maxRunningJobs>
  </user>
  <poolMaxJobsDefault>20</poolMaxJobsDefault>
  <userMaxJobsDefault>10</userMaxJobsDefault>
  <defaultMinSharePreemptionTimeout>600</defaultMinSharePreemptionTimeout>
</allocations>
{code}

It would be more usable to set them in "poolDefaults" and "userDefaults" 
elements as follows:

{code}
<?xml version="1.0"?>
<allocations>
  <pool name="pool_a">
    <minMaps>100</minMaps>
    <minReduces>300</minReduces>
  </pool>
  <user name="user_a">
    <maxRunningJobs>5</maxRunningJobs>
  </user>
  <poolDefaults>
    <maxRunningJobs>20</maxRunningJobs>
    <minSharePreemptionTimeout>600</minSharePreemptionTimeout>
  </poolDefaults>
  <userDefaults>
    <maxRunningJobs>10</maxRunningJobs>
  </userDefaults>
</allocations>
{code}

> Allow pool and user default settings to be set through poolDefaults and 
> userDefaults elements
> -
>
> Key: MAPREDUCE-715
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-715
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/fair-share
>Reporter: Matei Zaharia
>
> Over time a number of elements have been added to the fair scheduler config 
> file for setting default running job limits, preemption timeouts, etc for 
> pools and for users. Right now these are all set in top-level elements in the 
> allocations file, such as <userMaxJobsDefault>. It would be easier to 
> understand if there was a <userDefaults> element that contained defaults for 
> all users (using the same element names as <user> elements, e.g. 
> <maxRunningJobs> in this case), and similarly, a <poolDefaults> element for 
> pool defaults.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-715) Allow pool and user default settings to be set through poolDefaults and userDefaults elements

2009-07-06 Thread Matei Zaharia (JIRA)
Allow pool and user default settings to be set through poolDefaults and 
userDefaults elements
-

 Key: MAPREDUCE-715
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-715
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: contrib/fair-share
Reporter: Matei Zaharia


Over time a number of elements have been added to the fair scheduler config 
file for setting default running job limits, preemption timeouts, etc for pools 
and for users. Right now these are all set in top-level elements in the 
allocations file, such as <userMaxJobsDefault>. It would be easier to 
understand if there was a <userDefaults> element that contained defaults for 
all users (using the same element names as <user> elements, e.g. 
<maxRunningJobs> in this case), and similarly, a <poolDefaults> element for 
pool defaults.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-715) Allow pool and user default settings to be set through poolDefaults and userDefaults elements

2009-07-06 Thread Matei Zaharia (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matei Zaharia updated MAPREDUCE-715:


Priority: Minor  (was: Major)

> Allow pool and user default settings to be set through poolDefaults and 
> userDefaults elements
> -
>
> Key: MAPREDUCE-715
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-715
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/fair-share
>Reporter: Matei Zaharia
>Priority: Minor
>
> Over time a number of elements have been added to the fair scheduler config 
> file for setting default running job limits, preemption timeouts, etc for 
> pools and for users. Right now these are all set in top-level elements in the 
> allocations file, such as <userMaxJobsDefault>. It would be easier to 
> understand if there was a <userDefaults> element that contained defaults for 
> all users (using the same element names as <user> elements, e.g. 
> <maxRunningJobs> in this case), and similarly, a <poolDefaults> element for 
> pool defaults.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-714) JobConf.findContainingJar unescapes unnecessarily on Linux

2009-07-06 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727862#action_12727862
 ] 

Todd Lipcon commented on MAPREDUCE-714:
---

I just confirmed this on Windows vs Linux. On Windows, the URL that you get 
back from ClassLoader.getResource has spaces encoded as "%20". On Linux it 
doesn't.

Anyone have a creative solution to deal with this? We'd like to have +s in our 
version numbers due to standards in RPM and Debian land, but this is blocking 
that.
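
The behaviour is easy to reproduce in isolation: java.net.URLDecoder treats '+' 
as an encoded space, which is exactly what bites jar paths on Linux. A small 
standalone demo (the jar path is hypothetical), including the pre-escaping 
workaround that comes up later in this thread:

{code}
import java.net.URLDecoder;

public class PlusInJarPathDemo {
  public static void main(String[] args) throws Exception {
    String jarPath = "/usr/lib/hadoop/hadoop-0.20.1+133-examples.jar";

    // What findContainingJar effectively does today: '+' becomes ' '.
    String naive = URLDecoder.decode(jarPath, "UTF-8");

    // Workaround: protect '+' before decoding so only real escapes change.
    String protectedPath = jarPath.replaceAll("\\+", "%2B");
    String fixed = URLDecoder.decode(protectedPath, "UTF-8");

    System.out.println(naive);  // .../hadoop-0.20.1 133-examples.jar  (broken)
    System.out.println(fixed);  // .../hadoop-0.20.1+133-examples.jar  (intact)
  }
}
{code}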

> JobConf.findContainingJar unescapes unnecessarily on Linux
> --
>
> Key: MAPREDUCE-714
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-714
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Todd Lipcon
>
> In JobConf.findContainingJar, the path name is decoded using 
> URLDecoder.decode(...). This was done by Doug in r381794 (commit msg 
> "Un-escape containing jar's path, which is URL-encoded.  This fixes things 
> primarily on Windows, where paths are likely to contain spaces.") 
> Unfortunately, jar paths do not appear to be URL encoded on Linux. If you try 
> to use "hadoop jar" on a jar with a "+" in it, this function decodes it to a 
> space and then the job cannot be submitted.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-675) Sqoop should allow user-defined class and package names

2009-07-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727864#action_12727864
 ] 

Hadoop QA commented on MAPREDUCE-675:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12412661/MAPREDUCE-675.2.patch
  against trunk revision 791418.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 7 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/358/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/358/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/358/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/358/console

This message is automatically generated.

> Sqoop should allow user-defined class and package names
> ---
>
> Key: MAPREDUCE-675
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-675
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Minor
> Attachments: classnames.patch, MAPREDUCE-675.2.patch
>
>
> Currently Sqoop generates a class for each table to be imported; the class 
> names are equal to the table names and they are not part of any package.
> This adds --class-name and --package-name parameters to Sqoop, allowing these 
> aspects of code generation to be controlled.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-487) DBInputFormat support for Oracle

2009-07-06 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727872#action_12727872
 ] 

Aaron Kimball commented on MAPREDUCE-487:
-

Another patch for this same issue (Oracle/DBIF integration) was available at 
HADOOP-5482. I have cleaned up that patch and attached the newest version to 
that issue. It's considerably less code than this one.

> DBInputFormat support for Oracle
> 
>
> Key: MAPREDUCE-487
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-487
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.20.1, 0.21.0
>Reporter: Amandeep Khurana
>Priority: Trivial
> Fix For: 0.20.1, 0.21.0
>
> Attachments: HADOOP-5616.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> DBInputFormat doesnt support interfacing with Oracle.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-716) org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle

2009-07-06 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-716:


Status: Open  (was: Patch Available)

> org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle
> -
>
> Key: MAPREDUCE-716
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-716
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
> Environment: Java 1.6, HAdoop0.19.0, Linux..Oracle, 
>Reporter: evanand
> Attachments: HADOOP-5482.20-branch.patch, HADOOP-5482.patch, 
> HADOOP-5482.trunk.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle.
> The out of the box implementation of the Hadoop is working properly with 
> mysql/hsqldb, but NOT with oracle.
> Reason is DBInputformat is implemented with mysql/hsqldb specific query 
> constructs like "LIMIT", "OFFSET".
> FIX:
> building a database provider specific logic based on the database 
> providername (which we can get using connection).
> I HAVE ALREADY IMPLEMENTED IT FOR ORACLE...READY TO CHECK_IN CODE

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Moved: (MAPREDUCE-716) org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle

2009-07-06 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball moved HADOOP-5482 to MAPREDUCE-716:
-

  Component/s: (was: mapred)
Affects Version/s: (was: 0.18.3)
   (was: 0.19.1)
   (was: 0.18.2)
   (was: 0.19.0)
  Key: MAPREDUCE-716  (was: HADOOP-5482)
  Project: Hadoop Map/Reduce  (was: Hadoop Common)

> org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle
> -
>
> Key: MAPREDUCE-716
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-716
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
> Environment: Java 1.6, HAdoop0.19.0, Linux..Oracle, 
>Reporter: evanand
> Attachments: HADOOP-5482.20-branch.patch, HADOOP-5482.patch, 
> HADOOP-5482.trunk.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle.
> The out of the box implementation of the Hadoop is working properly with 
> mysql/hsqldb, but NOT with oracle.
> Reason is DBInputformat is implemented with mysql/hsqldb specific query 
> constructs like "LIMIT", "OFFSET".
> FIX:
> building a database provider specific logic based on the database 
> providername (which we can get using connection).
> I HAVE ALREADY IMPLEMENTED IT FOR ORACLE...READY TO CHECK_IN CODE

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-716) org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle

2009-07-06 Thread Aaron Kimball (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Kimball updated MAPREDUCE-716:


Status: Patch Available  (was: Open)

Moved issue from HADOOP into MAPREDUCE and cycled patch status for Hudson

> org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle
> -
>
> Key: MAPREDUCE-716
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-716
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
> Environment: Java 1.6, HAdoop0.19.0, Linux..Oracle, 
>Reporter: evanand
> Attachments: HADOOP-5482.20-branch.patch, HADOOP-5482.patch, 
> HADOOP-5482.trunk.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> org.apache.hadoop.mapred.lib.db.DBInputformat not working with oracle.
> The out of the box implementation of the Hadoop is working properly with 
> mysql/hsqldb, but NOT with oracle.
> Reason is DBInputformat is implemented with mysql/hsqldb specific query 
> constructs like "LIMIT", "OFFSET".
> FIX:
> building a database provider specific logic based on the database 
> providername (which we can get using connection).
> I HAVE ALREADY IMPLEMENTED IT FOR ORACLE...READY TO CHECK_IN CODE

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-487) DBInputFormat support for Oracle

2009-07-06 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727874#action_12727874
 ] 

Aaron Kimball commented on MAPREDUCE-487:
-

Related to the above comment, I moved HADOOP-5482 to MAPREDUCE-716.

> DBInputFormat support for Oracle
> 
>
> Key: MAPREDUCE-487
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-487
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.20.1, 0.21.0
>Reporter: Amandeep Khurana
>Priority: Trivial
> Fix For: 0.20.1, 0.21.0
>
> Attachments: HADOOP-5616.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> DBInputFormat doesnt support interfacing with Oracle.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-714) JobConf.findContainingJar unescapes unnecessarily on Linux

2009-07-06 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727883#action_12727883
 ] 

Todd Lipcon commented on MAPREDUCE-714:
---

Here's a potential fix, which is really klugey. If no one objects to this I'll 
upload it as a real patch in the next couple of days:

{code}
diff --git a/src/mapred/org/apache/hadoop/mapred/JobConf.java 
b/src/mapred/org/apache/hadoop/mapred/JobConf.java
index 11be95a..4794eab 100644
--- a/src/mapred/org/apache/hadoop/mapred/JobConf.java
+++ b/src/mapred/org/apache/hadoop/mapred/JobConf.java
@@ -1452,6 +1452,13 @@ public class JobConf extends Configuration {
   if (toReturn.startsWith("file:")) {
 toReturn = toReturn.substring("file:".length());
   }
+  // URLDecoder is a misnamed class, since it actually decodes
+  // x-www-form-urlencoded MIME type rather than actual
+  // URL encoding (which the file path has). Therefore it would
+  // decode +s to ' 's which is incorrect (spaces are actually
+  // either unencoded or encoded as "%20"). Replace +s first, so
+  // that they are kept sacred during the decoding process.
+  toReturn = toReturn.replaceAll("\\+", "%2B");
   toReturn = URLDecoder.decode(toReturn, "UTF-8");
   return toReturn.replaceAll("!.*$", "");
 }
{code}

> JobConf.findContainingJar unescapes unnecessarily on Linux
> --
>
> Key: MAPREDUCE-714
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-714
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Todd Lipcon
>
> In JobConf.findContainingJar, the path name is decoded using 
> URLDecoder.decode(...). This was done by Doug in r381794 (commit msg 
> "Un-escape containing jar's path, which is URL-encoded.  This fixes things 
> primarily on Windows, where paths are likely to contain spaces.") 
> Unfortunately, jar paths do not appear to be URL encoded on Linux. If you try 
> to use "hadoop jar" on a jar with a "+" in it, this function decodes it to a 
> space and then the job cannot be submitted.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-708) node health check script does not refresh the "reason for blacklisting"

2009-07-06 Thread Sreekanth Ramakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727884#action_12727884
 ] 

Sreekanth Ramakrishnan commented on MAPREDUCE-708:
--

output from ant test-patch:

{noformat}
 [exec]
 [exec] +1 overall.
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec]
 [exec]
 [exec]
{noformat}

Running tests.

> node health check script does not refresh the "reason for blacklisting"
> ---
>
> Key: MAPREDUCE-708
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-708
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.21.0
>Reporter: Ramya R
>Assignee: Sreekanth Ramakrishnan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-708-1.patch
>
>
> After MAPREDUCE-211, the node health check script does not refresh the 
> "reason for blacklisting".
> The steps to reproduce the issue are:
> * Blacklist a TT with an error message 'x'
> * Change the health check script to return an error message 'y'
> * The "reason for blacklisting" still shows 'x'
>  
> The impact of this issue is that the feature fails to trap transient errors.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-675) Sqoop should allow user-defined class and package names

2009-07-06 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727885#action_12727885
 ] 

Aaron Kimball commented on MAPREDUCE-675:
-

Current set of failures are unrelated.

> Sqoop should allow user-defined class and package names
> ---
>
> Key: MAPREDUCE-675
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-675
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Minor
> Attachments: classnames.patch, MAPREDUCE-675.2.patch
>
>
> Currently Sqoop generates a class for each table to be imported; the class 
> names are equal to the table names and they are not part of any package.
> This adds --class-name and --package-name parameters to Sqoop, allowing these 
> aspects of code generation to be controlled.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-709) node health check script does not display the correct message on timeout

2009-07-06 Thread Sreekanth Ramakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727889#action_12727889
 ] 

Sreekanth Ramakrishnan commented on MAPREDUCE-709:
--

output from ant test-patch

{noformat}
 [exec] +1 overall.
 [exec]
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec]
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec]
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec]
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec]
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
warnings.
 [exec]
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec]
{noformat}

> node health check script does not display the correct message on timeout
> 
>
> Key: MAPREDUCE-709
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-709
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Affects Versions: 0.21.0
>Reporter: Ramya R
>Assignee: Sreekanth Ramakrishnan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: mapred-709-1.patch
>
>
> When the node health check script takes more than 
> "mapred.healthChecker.script.timeout" to return, it should display a timeout 
> message. Instead it displays the full stacktrace as below:
> {noformat}
> java.io.IOException: Stream closed at 
> java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:145) 
> at java.io.BufferedInputStream.read(BufferedInputStream.java:308) 
> at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:264) 
> at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306) 
> at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158) 
> at java.io.InputStreamReader.read(InputStreamReader.java:167) 
> at java.io.BufferedReader.fill(BufferedReader.java:136) 
> at java.io.BufferedReader.readLine(BufferedReader.java:299) 
> at java.io.BufferedReader.readLine(BufferedReader.java:362) 
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:202) 
> at org.apache.hadoop.util.Shell.run(Shell.java:145) 
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:338) 
> at 
> org.apache.hadoop.mapred.NodeHealthCheckerService$NodeHealthMonitorExecutor.run(NodeHealthCheckerService.java:119)
>  
> at java.util.TimerThread.mainLoop(Timer.java:512) 
> at java.util.TimerThread.run(Timer.java:462) 
> {noformat}
> Also the "mapred.healthChecker.script.timeout" is not being reflected in the 
> job.xml. It always picks up the default value. It is just an UI issue.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-711) Move Distributed Cache from Common to Map/Reduce

2009-07-06 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727901#action_12727901
 ] 

Hemanth Yamijala commented on MAPREDUCE-711:


+1

> Move Distributed Cache from Common to Map/Reduce
> 
>
> Key: MAPREDUCE-711
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-711
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Reporter: Owen O'Malley
>
> Distributed Cache logically belongs as part of map/reduce and not Common.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-717) Fix some corner case issues in speculative execution (post hadoop-2141)

2009-07-06 Thread Devaraj Das (JIRA)
Fix some corner case issues in speculative execution (post hadoop-2141)
---

 Key: MAPREDUCE-717
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-717
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.21.0
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.21.0


Some corner case issues can be fixed:
1) The setup task should not add anything to the job statistics (since setup 
tasks are really fast and might skew the statistics of a job with few tasks)
2) The statistics computations should be guarded against cases where things 
like sumOfSquares could become less than zero (mostly due to rounding errors); 
see the sketch below
3) The method TaskInProgress.getCurrentProgressRate() should take into account 
the COMMIT_PENDING state
4) The testcase TestSpeculativeExecution.testTaskLATEScheduling could be made 
more robust
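
For item 2, the guard amounts to clamping the variance term at zero before 
taking a square root; a minimal sketch of the arithmetic (field and method 
names are illustrative, not the JobTracker's actual ones):

{code}
public class ProgressRateStats {
  private double sum;
  private double sumOfSquares;
  private long count;

  public void add(double progressRate) {
    sum += progressRate;
    sumOfSquares += progressRate * progressRate;
    count++;
  }

  /** Standard deviation, guarded so rounding error can never produce NaN. */
  public double std() {
    if (count == 0) {
      return 0.0;
    }
    double mean = sum / count;
    double variance = sumOfSquares / count - mean * mean;
    // Floating-point rounding can push this slightly below zero; clamp it.
    return Math.sqrt(Math.max(0.0, variance));
  }
}
{code}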

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-718) Support for per-phase speculative execution

2009-07-06 Thread Devaraj Das (JIRA)
Support for per-phase speculative execution
---

 Key: MAPREDUCE-718
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-718
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: jobtracker
Affects Versions: 0.21.0
Reporter: Devaraj Das
 Fix For: 0.21.0


It would be good to have support for per-phase speculative execution, where the 
algorithm looks at the current phase of a task and compares it with the other 
tasks in the same phase before deciding to launch a speculative task. That 
would have the following benefits:
1) Support for jobs where a map task's progress jumps from 0% to 100%. This is 
true for some jobs like randomwriter. Today, we would launch speculative tasks 
for such jobs (assuming that the tasks are not making progress), but most of 
them would be unnecessary.
2) In reality, for reduces, the three phases are quite different from each 
other, and they take different times too. We should see better results when we 
look at per-phase speculation.
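
A rough sketch of the per-phase comparison being proposed; the phase 
representation and the slowness threshold are invented for illustration:

{code}
import java.util.List;

public class PerPhaseSpeculationSketch {
  /** How far behind its phase-mates a task must lag before we speculate. */
  private static final double SLOWNESS_THRESHOLD = 0.5;  // made-up value

  /** Minimal stand-in for the attempt info the JobTracker would really use. */
  public static class AttemptInfo {
    final String phase;          // e.g. "MAP", "SHUFFLE", "SORT", "REDUCE"
    final double progressRate;   // progress within the current phase per second
    public AttemptInfo(String phase, double progressRate) {
      this.phase = phase;
      this.progressRate = progressRate;
    }
  }

  /** Speculate only if the task lags the mean rate of tasks in the SAME phase. */
  public static boolean shouldSpeculate(AttemptInfo candidate, List<AttemptInfo> all) {
    double sum = 0;
    int n = 0;
    for (AttemptInfo a : all) {
      if (a != candidate && a.phase.equals(candidate.phase)) {
        sum += a.progressRate;
        n++;
      }
    }
    if (n == 0) {
      return false;               // nothing in the same phase to compare against
    }
    double phaseMean = sum / n;
    return candidate.progressRate < SLOWNESS_THRESHOLD * phaseMean;
  }
}
{code}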

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-383) pipes combiner does not reset properly after a spill

2009-07-06 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727908#action_12727908
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-383:
---

changes look fine to me

> pipes combiner does not reset properly after a spill
> 
>
> Key: MAPREDUCE-383
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-383
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Christian Kunz
>Assignee: Christian Kunz
> Attachments: patch.HADOOP-6070
>
>
> When using a pipes combiner, the variable numBytes is not reset to 0 in 
> spillAll, effectively reducing the effect of running a combiner to the first 
> spill.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-719) Not able to run Mapred Reliability test as "other" user.

2009-07-06 Thread Suman Sehgal (JIRA)
Not able to run Mapred Reliability test as "other" user.


 Key: MAPREDUCE-719
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-719
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 0.20.1
Reporter: Suman Sehgal


While executing ReliabilityTest as the "other" user, the following issues were 
observed:

--> Tasktrackers are not being killed, as the "other" user is not the owner of 
the tasktracker processes.
--> The test program just prints the kill usage message if it is not able to 
kill the TTs:
  "09/07/06 09:37:00 INFO mapred.ReliabilityTest: : usage: kill [ 
-s signal | -p ] [ -a ] pid ..."
  Instead it should report an error and stop further execution with a non-zero 
exit code.
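
A minimal sketch of that check with plain ProcessBuilder (the signal, messages 
and method names are illustrative only):

{code}
import java.io.IOException;

public class KillOrAbort {
  /** Kill a tasktracker pid and abort the test run if the kill fails. */
  public static void killOrAbort(String pid) throws IOException, InterruptedException {
    Process p = new ProcessBuilder("kill", "-9", pid).start();
    int exitCode = p.waitFor();
    if (exitCode != 0) {
      System.err.println("ERROR: could not kill tasktracker pid " + pid
          + " (is the current user the owner of the process?)");
      // Propagate a non-zero exit code instead of just printing the usage text.
      System.exit(exitCode);
    }
  }
}
{code}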



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-40) Memory management variables need a backwards compatibility option after HADOOP-5881

2009-07-06 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727915#action_12727915
 ] 

Hemanth Yamijala commented on MAPREDUCE-40:
---

In an offline discussion, the following two comments came up:
- The test cases seem to test every high-memory case in two modes - with the 
old variables and with the new variables. I think that's overkill and will make 
the test cases hard to maintain over the long run.
- Some memory-related variables (related to physical memory configuration) are 
removed in a backwards-incompatible manner. If they are configured in the conf 
files, we should print a warning message to the users (see the sketch below). I 
don't know if this is already done. If yes, please ignore this.
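
On the second point, the warning could be as simple as checking for the old 
keys at startup; the key names below are the pre-HADOOP-5881 ones as I remember 
them, so they are an assumption to be verified:

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.conf.Configuration;

public class DeprecatedMemoryKeysCheck {
  private static final Log LOG = LogFactory.getLog(DeprecatedMemoryKeysCheck.class);

  // Old (pre-HADOOP-5881) keys; listed from memory, verify before relying on them.
  private static final String[] OLD_KEYS = {
    "mapred.task.maxvmem",
    "mapred.task.default.maxvmem",
    "mapred.task.limit.maxvmem",
    "mapred.task.maxpmem"
  };

  public static void warnOnOldKeys(Configuration conf) {
    for (String key : OLD_KEYS) {
      if (conf.get(key) != null) {
        LOG.warn("Configuration key " + key + " is deprecated and ignored; "
            + "use the new per-job memory settings instead.");
      }
    }
  }
}
{code}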

> Memory management variables need a backwards compatibility option after 
> HADOOP-5881
> ---
>
> Key: MAPREDUCE-40
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-40
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Hemanth Yamijala
>Assignee: rahul k singh
>Priority: Blocker
> Attachments: hadoop-5919-1.patch, hadoop-5919-2.patch, 
> hadoop-5919-3.patch, hadoop-5919-4.patch, hadoop-5919-5.patch, 
> hadoop-5919-6.patch
>
>
> HADOOP-5881 modified variables related to memory management without looking 
> at the backwards compatibility angle. This JIRA is to adress the gap. Marking 
> it a blocker for 0.20.1

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (MAPREDUCE-720) Task logs should have right access-control

2009-07-06 Thread Vinod K V (JIRA)
Task logs should have right access-control
--

 Key: MAPREDUCE-720
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-720
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tasktracker
Reporter: Vinod K V




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-713) Sqoop has some superfluous imports

2009-07-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727922#action_12727922
 ] 

Hadoop QA commented on MAPREDUCE-713:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12412663/MAPREDUCE-713.patch
  against trunk revision 791418.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

-1 contrib tests.  The patch failed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/359/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/359/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/359/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/359/console

This message is automatically generated.

> Sqoop has some superfluous imports
> --
>
> Key: MAPREDUCE-713
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-713
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Trivial
> Attachments: MAPREDUCE-713.patch
>
>
> Some classes have vestigial imports that should be removed

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-720) Task logs should have right access-control

2009-07-06 Thread Vinod K V (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727924#action_12727924
 ] 

Vinod K V commented on MAPREDUCE-720:
-

Setting proper access control was originally planned as part of HADOOP-4491, 
but was moved out because HADOOP-4491 already involved a huge patch. Task logs 
will still have 777 (world read-write-execute) permissions after HADOOP-4491. 
This has to be fixed, at a minimum the world-writable part of it.
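
At a minimum, the world-writable bit can be dropped when the attempt log 
directory is created; a sketch with plain java.io.File setters (real code would 
probably go through the task controller, so this is illustrative only):

{code}
import java.io.File;

public class TaskLogPermissions {
  /**
   * Restrict a task attempt's log directory to its owner: clear each
   * permission bit for everybody, then re-grant it to the owner only.
   * Uses the Java 6 java.io.File setters; returns false if any step failed.
   */
  public static boolean restrictToOwner(File attemptLogDir) {
    boolean ok = true;
    ok &= attemptLogDir.setWritable(false, false);
    ok &= attemptLogDir.setWritable(true, true);
    ok &= attemptLogDir.setReadable(false, false);
    ok &= attemptLogDir.setReadable(true, true);
    ok &= attemptLogDir.setExecutable(false, false);
    ok &= attemptLogDir.setExecutable(true, true);
    return ok;
  }
}
{code}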

> Task logs should have right access-control
> --
>
> Key: MAPREDUCE-720
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-720
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Reporter: Vinod K V
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (MAPREDUCE-720) Task logs should have right access-control

2009-07-06 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V reassigned MAPREDUCE-720:
---

Assignee: Sreekanth Ramakrishnan

> Task logs should have right access-control
> --
>
> Key: MAPREDUCE-720
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-720
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Reporter: Vinod K V
>Assignee: Sreekanth Ramakrishnan
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-708) node health check script does not refresh the "reason for blacklisting"

2009-07-06 Thread Sreekanth Ramakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727926#action_12727926
 ] 

Sreekanth Ramakrishnan commented on MAPREDUCE-708:
--

All tests passed locally, except the restart test cases, which are being worked 
on.

> node health check script does not refresh the "reason for blacklisting"
> ---
>
> Key: MAPREDUCE-708
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-708
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.21.0
>Reporter: Ramya R
>Assignee: Sreekanth Ramakrishnan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-708-1.patch
>
>
> After MAPREDUCE-211, the node health check script does not refresh the 
> "reason for blacklisting".
> The steps to reproduce the issue are:
> * Blacklist a TT with an error message 'x'
> * Change the health check script to return an error message 'y'
> * The "reason for blacklisting" still shows 'x'
>  
> The impact of this issue is that the feature fails to trap transient errors.
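
As a purely hypothetical sketch of the refresh behaviour being asked for (the class 
and method names below are invented for illustration, not the actual JobTracker API): 
on every heartbeat the stored reason should simply be overwritten with the latest 
health-check report, so a stale 'x' gets replaced by 'y'.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch, not JobTracker code: refresh the "reason for
 * blacklisting" on every heartbeat instead of keeping the first message.
 */
public class BlacklistReasonTracker {
  private final Map<String, String> reasonByTracker =
      new ConcurrentHashMap<String, String>();

  /** Called whenever a heartbeat carries a health-check report. */
  public void updateHealthReport(String trackerName, String healthReport) {
    if (healthReport == null || healthReport.length() == 0) {
      reasonByTracker.remove(trackerName);            // node is healthy again
    } else {
      reasonByTracker.put(trackerName, healthReport); // overwrite any stale reason
    }
  }

  public String getReason(String trackerName) {
    return reasonByTracker.get(trackerName);
  }

  public static void main(String[] args) {
    BlacklistReasonTracker t = new BlacklistReasonTracker();
    t.updateHealthReport("tracker_host1:50060", "ERROR disk full");          // 'x'
    t.updateHealthReport("tracker_host1:50060", "ERROR daemon not running"); // 'y'
    System.out.println(t.getReason("tracker_host1:50060")); // prints the refreshed 'y'
  }
}
{code}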

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-708) node health check script does not refresh the "reason for blacklisting"

2009-07-06 Thread Sreekanth Ramakrishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreekanth Ramakrishnan updated MAPREDUCE-708:
-

Attachment: MAPREDUCE-708-ydist.patch

Y! distribution patch.

> node health check script does not refresh the "reason for blacklisting"
> ---
>
> Key: MAPREDUCE-708
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-708
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.21.0
>Reporter: Ramya R
>Assignee: Sreekanth Ramakrishnan
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-708-1.patch, MAPREDUCE-708-ydist.patch
>
>
> After MAPREDUCE-211, the node health check script does not refresh the 
> "reason for blacklisting".
> The steps to reproduce the issue are:
> * Blacklist a TT with an error message 'x'
> * Change the health check script to return an error message 'y'
> * The "reason for blacklisting" still shows 'x'
>  
> The impact of this issue is that the feature fails to trap transient errors.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-713) Sqoop has some superfluous imports

2009-07-06 Thread Aaron Kimball (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727934#action_12727934
 ] 

Aaron Kimball commented on MAPREDUCE-713:
-

Once again, the test failures have nothing to do with the patch.


> Sqoop has some superfluous imports
> --
>
> Key: MAPREDUCE-713
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-713
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: contrib/sqoop
>Reporter: Aaron Kimball
>Assignee: Aaron Kimball
>Priority: Trivial
> Attachments: MAPREDUCE-713.patch
>
>
> Some classes have vestigial imports that should be removed

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-697) Jobwise list of blacklisted tasktrackers doesn't get refreshed even after restarting blacklisted tasktrackers.

2009-07-06 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727938#action_12727938
 ] 

Owen O'Malley commented on MAPREDUCE-697:
-

I'm not sure this is a good idea. TTs are restarted for not reporting status for 
10 minutes. That should *not* reset their blacklist status. It should probably 
count as a strike against that TT instead.
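
To make that concrete, a tiny hypothetical sketch (all names and the threshold are 
invented, not the JobTracker's code): a restart adds another strike instead of 
wiping the tracker's record, and the tracker stays blacklisted once it crosses the 
threshold.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch: count a tasktracker restart as a strike rather than
 * resetting its blacklist status. Threshold and names are assumptions.
 */
public class TrackerStrikeBook {
  private static final int MAX_STRIKES = 4;   // assumed per-tracker threshold
  private final Map<String, Integer> strikes =
      new ConcurrentHashMap<String, Integer>();

  /** Record one strike (e.g. a task failure) against the tracker. */
  public void addStrike(String trackerName) {
    Integer current = strikes.get(trackerName);
    strikes.put(trackerName, current == null ? 1 : current + 1);
  }

  /** A restart does NOT reset the count; it just adds another strike. */
  public void onTrackerRestart(String trackerName) {
    addStrike(trackerName);
  }

  public boolean isBlacklisted(String trackerName) {
    Integer current = strikes.get(trackerName);
    return current != null && current >= MAX_STRIKES;
  }

  public static void main(String[] args) {
    TrackerStrikeBook book = new TrackerStrikeBook();
    for (int i = 0; i < 3; i++) {
      book.addStrike("tracker_host1:50060");      // three task failures
    }
    book.onTrackerRestart("tracker_host1:50060"); // restart is the fourth strike
    System.out.println(book.isBlacklisted("tracker_host1:50060")); // true
  }
}
{code}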

> Jobwise list of blacklisted tasktrackers doesn't get refreshed even after 
> restarting blacklisted tasktrackers.
> --
>
> Key: MAPREDUCE-697
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-697
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: tasktracker
>Affects Versions: 0.20.1
>Reporter: Suman Sehgal
>Priority: Critical
>
> Jobwise list of blacklisted tasktrackers doesn't get refreshed even after 
> restarting the blacklisted tasktrackers. The "jobdetails.jsp" page keeps showing 
> the same number of blacklisted tasktrackers (it never goes back to zero).
> One associated issue:
> =
> --> More than 25% of TTs are blacklisted in a job.
> --> Restart the blacklisted TTs. All the tasktrackers are healthy now.
> --> Try to blacklist another TT for the same job.
> The "other" tasktracker cannot be blacklisted even after its failure count 
> exceeds the "mapred.max.tracker.failures" limit.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-683) TestJobTrackerRestart fails with Map task completion events ordering mismatch

2009-07-06 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated MAPREDUCE-683:
--

Affects Version/s: 0.21.0
Fix Version/s: 0.21.0

> TestJobTrackerRestart fails with Map task completion events ordering mismatch
> -
>
> Key: MAPREDUCE-683
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-683
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 0.21.0
>Reporter: Sreekanth Ramakrishnan
>Assignee: Amar Kamat
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-683-v1.0.patch, MAPREDUCE-683-v1.2.patch, 
> TEST-org.apache.hadoop.mapred.TestJobTrackerRestart.txt
>
>
> {{TestJobTrackerRestart}} fails consistently with a "Map task completion events 
> ordering mismatch" error.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (MAPREDUCE-683) TestJobTrackerRestart fails with Map task completion events ordering mismatch

2009-07-06 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das resolved MAPREDUCE-683.
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]

I just committed this. Thanks, Amar!

> TestJobTrackerRestart fails with Map task completion events ordering mismatch
> -
>
> Key: MAPREDUCE-683
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-683
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>  Components: jobtracker
>Affects Versions: 0.21.0
>Reporter: Sreekanth Ramakrishnan
>Assignee: Amar Kamat
> Attachments: MAPREDUCE-683-v1.0.patch, MAPREDUCE-683-v1.2.patch, 
> TEST-org.apache.hadoop.mapred.TestJobTrackerRestart.txt
>
>
> {{TestJobTrackerRestart}} fails consistently with a "Map task completion events 
> ordering mismatch" error.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.