[jira] Updated: (MAPREDUCE-1197) username:jobid in job history search should be separated into two.

2010-04-18 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-1197:
-

Status: Open  (was: Patch Available)

Unfortunately, the patch has become stale.

 username:jobid  in job history search should be separated into two.
 ---

 Key: MAPREDUCE-1197
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1197
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: jobtracker
Affects Versions: 0.21.0
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Fix For: 0.21.0

 Attachments: patch-1197-1.txt, patch-1197.txt


 The Job History web UI takes username:jobid for search. To search by jobid 
 alone, the user has to enter :job_id. The field should be separated into two 
 so that the user can give either a username or a jobid.
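As a rough illustration of the parsing problem described above, a minimal sketch of splitting the combined search key (class and method names here are hypothetical, not from the actual job history code):

```java
// Hypothetical sketch of parsing the combined "username:jobid" search key
// that this issue proposes splitting into two separate input fields.
public class HistorySearchKey {

  /** Splits "user:jobid" into {user, jobid}; either part may be empty. */
  public static String[] split(String searchKey) {
    int idx = searchKey.indexOf(':');
    if (idx < 0) {
      return new String[] { searchKey, "" };
    }
    return new String[] { searchKey.substring(0, idx),
                          searchKey.substring(idx + 1) };
  }

  public static void main(String[] args) {
    // Searching by jobid alone currently forces a leading colon:
    String[] parts = split(":job_201003241521_0001");
    System.out.println("user='" + parts[0] + "' jobid='" + parts[1] + "'");
  }
}
```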

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1062) MRReliability test does not work with retired jobs

2010-04-18 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-1062:
-

   Status: Resolved  (was: Patch Available)
 Hadoop Flags: [Reviewed]
Fix Version/s: 0.22.0
   Resolution: Fixed

I committed this. Thanks, Sreekanth!

 MRReliability test does not work with retired jobs
 --

 Key: MAPREDUCE-1062
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1062
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 0.21.0
Reporter: Sreekanth Ramakrishnan
Assignee: Sreekanth Ramakrishnan
 Fix For: 0.22.0

 Attachments: mapreduce-1062-1.patch, mapreduce-1062-2.patch, 
 mapreduce-1062-3-ydist.patch, mapreduce-1062-3.patch, mapreduce-1062-4.patch, 
 mapreduce-ydist-20-1.patch


 Currently MRReliability uses JobClient's get-all-jobs API, which also 
 includes retired jobs.
 If there are retired jobs in the cluster, they are appended at the end of 
 the job list. This results in the test always getting a completed job and 
 never spawning off the KillTask and KillTracker threads.
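A toy sketch of the fix idea, filtering retired jobs out of the list before the reliability test picks a job to attack (JobInfo here is a stand-in type for illustration, not the real Hadoop class):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: filter retired jobs out of the list returned by
// the "get all jobs" call so the test never picks an already-completed job.
public class RetiredJobFilter {

  static class JobInfo {
    final String id;
    final boolean retired;
    JobInfo(String id, boolean retired) {
      this.id = id;
      this.retired = retired;
    }
  }

  /** Keeps only jobs that have not been retired. */
  public static List<JobInfo> activeJobs(List<JobInfo> all) {
    List<JobInfo> active = new ArrayList<JobInfo>();
    for (JobInfo j : all) {
      if (!j.retired) {
        active.add(j);
      }
    }
    return active;
  }

  public static void main(String[] args) {
    List<JobInfo> all = new ArrayList<JobInfo>();
    all.add(new JobInfo("job_1", false));
    all.add(new JobInfo("job_2", true));  // retired job appended at the end
    System.out.println(activeJobs(all).size());  // only job_1 survives
  }
}
```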

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1641) Job submission should fail if same uri is added for mapred.cache.files and mapred.cache.archives

2010-04-18 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858374#action_12858374
 ] 

Amareshwari Sriramadasu commented on MAPREDUCE-1641:


bq. Perhaps we should allow this, and both localize the file and unarchive it? 
What do you think?

We should not make the files option unarchive the file. We have seen many use 
cases where users do not want their jars to be unjarred, for example HADOOP-5175.

bq. We perform the check for conflicts between mapred.cache.files and 
mapred.cache.archives when the user finally submits the offending JobConf.
+1

bq. In particular, I plan to make a new class DistributedCache.DuplicatedURI 
extends InvalidJobConfException and throw that.
+1
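A minimal sketch of the conflict check agreed on above; the class and method names are hypothetical, and the actual patch would throw the proposed InvalidJobConfException subclass rather than just report the duplicate:

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the duplicate-URI check discussed above; names do
// not come from the actual MAPREDUCE-1641 patch.
public class DuplicateUriCheck {

  /** Returns the first URI that appears in both lists, or null if none. */
  public static URI findDuplicate(URI[] cacheFiles, URI[] cacheArchives) {
    if (cacheFiles == null || cacheArchives == null) {
      return null;
    }
    Set<URI> files = new HashSet<URI>(Arrays.asList(cacheFiles));
    for (URI archive : cacheArchives) {
      if (files.contains(archive)) {
        return archive;
      }
    }
    return null;
  }

  public static void main(String[] args) throws Exception {
    URI jar = new URI("hdfs://nn:8020/user/foo/lib.jar");
    URI txt = new URI("hdfs://nn:8020/user/foo/data.txt");
    // Same jar listed as both a cache file and a cache archive: conflict.
    URI dup = findDuplicate(new URI[] { txt, jar }, new URI[] { jar });
    if (dup != null) {
      // A real patch would throw the InvalidJobConfException subclass here.
      System.out.println("duplicate: " + dup);
    }
  }
}
```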

 Job submission should fail if same uri is added for mapred.cache.files and 
 mapred.cache.archives
 

 Key: MAPREDUCE-1641
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1641
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: distributed-cache
Reporter: Amareshwari Sriramadasu
Assignee: Dick King
 Fix For: 0.22.0


 The behavior of mapred.cache.files and mapred.cache.archives is different 
 during localization in the following way:
 If a jar file is added to mapred.cache.files,  it will be localized under 
 TaskTracker under a unique path. 
 If a jar file is added to mapred.cache.archives, it will be localized under a 
 unique path in a directory named the jar file name, and will be unarchived 
 under the same directory.
 If same jar file is passed for both the configurations, the behavior 
 undefined. Thus the job submission should fail.
 Currently, since distributed cache processes files before archives, the jar 
 file will be just localized and not unarchived.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1657) After task logs directory is deleted, tasklog servlet displays wrong error message about job ACLs

2010-04-18 Thread Ravi Gummadi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858380#action_12858380
 ] 

Ravi Gummadi commented on MAPREDUCE-1657:
-

In the tasklog servlet, the problem is that even when the task log files don't 
exist, the authorization check is the first place an error is raised.
Skipping the authorization check for viewing task logs when the job-acls.xml 
file doesn't exist would solve the problem.
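The fix idea can be sketched as a single guard; class and method names here are illustrative, not from the actual servlet code:

```java
import java.io.File;

// Hypothetical sketch of the fix idea above: only run the ACL check when the
// job's ACL file actually exists. If it is absent, the servlet falls through
// to the "log directory not found" error instead of a misleading 401.
public class TaskLogAccessCheck {

  /** Authorization is required only if job-acls.xml was written for the job. */
  public static boolean needsAuthorization(File jobAclsFile) {
    return jobAclsFile != null && jobAclsFile.exists();
  }

  public static void main(String[] args) {
    File missing = new File("/tmp/nonexistent-job-dir/job-acls.xml");
    // No ACL file on disk, so the VIEW_JOB check would be skipped entirely.
    System.out.println("needs auth: " + needsAuthorization(missing));
  }
}
```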

 After task logs directory is deleted, tasklog servlet displays wrong error 
 message about job ACLs
 -

 Key: MAPREDUCE-1657
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1657
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
 Fix For: 0.22.0


 When the task log has been deleted and we click view task log in the Web UI, 
 the page displays the wrong error message:
 [
 HTTP ERROR: 401
 User user1 failed to view tasklogs of job job_201003241521_0001!
 user1 is not authorized for performing the operation VIEW_JOB on 
 job_201003241521_0001. VIEW_JOB Access control list
 configured for this job : 
 RequestURI=/tasklog
 ]
 This happens even if the user has view-job ACLs set or is the owner of the job.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1692) Remove TestStreamedMerge from the streaming tests

2010-04-18 Thread Vinod K V (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod K V updated MAPREDUCE-1692:
-

Attachment: patch-1692-ydist.txt

Attaching patch for yahoo! dist on behalf of Amareshwari/Sreekanth. Not for 
commit here.

 Remove TestStreamedMerge from the streaming tests
 -

 Key: MAPREDUCE-1692
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1692
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: contrib/streaming
Reporter: Sreekanth Ramakrishnan
Assignee: Amareshwari Sriramadasu
Priority: Minor
 Fix For: 0.22.0

 Attachments: MAPREDUCE-1692-1.patch, MAPREDUCE-1692-1.patch, 
 patch-1692-ydist.txt, patch-1692.txt


 Currently {{TestStreamedMerge}} is never run as part of the streaming test 
 suite. The code paths exercised by the test were removed in HADOOP-1315, so 
 it is better to remove the testcase from the code base.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (MAPREDUCE-1657) After task logs directory is deleted, tasklog servlet displays wrong error message about job ACLs

2010-04-18 Thread Ravi Gummadi (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Gummadi updated MAPREDUCE-1657:


Attachment: MR1657.20S.1.patch

Attaching patch for an earlier version of Hadoop. Not for commit here.

The patch makes the tasklog servlet do the authorization check only if the 
job-acls.xml file exists. It also adds a proper error message for the case of 
accessing task logs from the tasklog servlet when the whole task log directory 
does not exist for a task. Added two testcases to TestWebUIAuthorization for 
(a) allowing all users to view task logs when the job-acls.xml file does not 
exist and (b) returning a proper error code when the whole task log directory 
does not exist.

 After task logs directory is deleted, tasklog servlet displays wrong error 
 message about job ACLs
 -

 Key: MAPREDUCE-1657
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1657
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Ravi Gummadi
Assignee: Ravi Gummadi
 Fix For: 0.22.0

 Attachments: MR1657.20S.1.patch


 When the task log has been deleted and we click view task log in the Web UI, 
 the page displays the wrong error message:
 [
 HTTP ERROR: 401
 User user1 failed to view tasklogs of job job_201003241521_0001!
 user1 is not authorized for performing the operation VIEW_JOB on 
 job_201003241521_0001. VIEW_JOB Access control list
 configured for this job : 
 RequestURI=/tasklog
 ]
 This happens even if the user has view-job ACLs set or is the owner of the job.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1672) Create test scenario for distributed cache file behaviour, when dfs file is not modified

2010-04-18 Thread Vinay Kumar Thota (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858385#action_12858385
 ] 

Vinay Kumar Thota commented on MAPREDUCE-1672:
--

+  Assert.assertNotNull("jobInfo is null " + jInfo,
+  jInfo.getStatus().getRunState());

The above statement is used for checking the jInfo instance, right? In that 
case there is no point in invoking the getStatus() method in the statement, 
because it calls getStatus() even on a null instance and would throw an NPE. 
So change the statement as below.

Assert.assertNotNull("jobInfo is null", jInfo);



   if (count > 10) {
+  Assert.fail("Since the sleep count has reached beyond a point " +
+"failing at this point");
+}


I would suggest here that the error message should be more concrete instead of 
saying 'count has reached beyond a point'. I mean, the message should be "Job 
has not been started for 10 mins so the test fails". While debugging, the user 
will then have clear information about why the test failed and how long it 
waited for the job to start.



for (String taskTracker : taskTrackers) {
+  // Formatting tasktracker to get just its FQDN
+  taskTracker = UtilsForTests.getFQDNofTT(taskTracker);
+  LOG.info("taskTracker is : " + taskTracker);
+
+  // This will be entered from the second job onwards
+  if (countLoop > 1) {
+    if (taskTracker != null) {
+      continueLoop = taskTrackerCollection.contains(taskTracker);
+    }
+    if (!continueLoop) {
+      break;
+    }
+  }
+
+  // Collecting the tasktrackers
+  if (taskTracker != null)
+    taskTrackerCollection.add(taskTracker);

In the above instructions I could see many if statements and it is pretty hard 
to untangle them. So you could optimize the code in the below manner. It's my 
opinion.

for (String taskTracker : taskTrackers) {
+  // Formatting tasktracker to get just its FQDN
+  taskTracker = UtilsForTests.getFQDNofTT(taskTracker);
+  LOG.info("taskTracker is : " + taskTracker);
   if (taskTrackerCollection.size() == 0) {
     taskTrackerCollection.add(taskTracker);
     break;
   } else {
     if (!taskTrackerCollection.contains(taskTracker)) {
       taskTrackerCollection.add(taskTracker);
       break;
     }
   }
}

 Create test scenario for distributed cache file behaviour, when dfs file is 
 not modified
 --

 Key: MAPREDUCE-1672
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1672
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Reporter: Iyappan Srinivasan
Assignee: Iyappan Srinivasan
 Attachments: 
 TEST-org.apache.hadoop.mapred.TestDistributedCacheUnModifiedFile.txt, 
 TEST-org.apache.hadoop.mapred.TestDistributedCacheUnModifiedFile.txt, 
 TestDistributedCacheUnModifiedFile.patch, 
 TestDistributedCacheUnModifiedFile.patch, 
 TestDistributedCacheUnModifiedFile.patch, 
 TestDistributedCacheUnModifiedFile.patch, 
 TestDistributedCacheUnModifiedFile.patch, 
 TestDistributedCacheUnModifiedFile.patch


 This test scenario is for distributed cache file behaviour when the file is 
 not modified before and after being accessed by at most two jobs. Once a job 
 uses a distributed cache file, that file is stored in mapred.local.dir. If 
 the next job uses the same file, it is not stored again. So, if two jobs 
 choose the same tasktracker for their execution, the distributed cache file 
 should not be found twice.
 This testcase should run a job with a distributed cache file. Each task's 
 corresponding tasktracker handle is obtained and checked for the presence of 
 the distributed cache file with proper permissions in the proper directory. 
 When the job runs again and any of its tasks hits the same tasktracker that 
 ran a task of the previous job, the file should not be uploaded again and 
 the task should use the old file.
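The localize-once behaviour the test verifies can be modeled with a toy sketch; this is purely illustrative and not the actual DistributedCache implementation:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the behaviour described above: a tasktracker localizes a
// distributed cache file once and reuses the local copy for later jobs.
public class LocalizeOnce {

  // Maps the cache file URI to its localized path on this tracker.
  private final Map<String, String> localized = new HashMap<String, String>();
  private int downloads = 0;

  /** Returns the local path, downloading only on the first request. */
  public String localize(String cacheUri) {
    String path = localized.get(cacheUri);
    if (path == null) {
      downloads++;  // simulate fetching the file from DFS
      path = "/mapred/local/archive/" + downloads + "/" + cacheUri.hashCode();
      localized.put(cacheUri, path);
    }
    return path;
  }

  public int getDownloadCount() {
    return downloads;
  }

  public static void main(String[] args) {
    LocalizeOnce tracker = new LocalizeOnce();
    String first = tracker.localize("hdfs://nn/user/foo/cache.txt");   // job 1
    String second = tracker.localize("hdfs://nn/user/foo/cache.txt");  // job 2
    // Same path, single download: the second job reuses the cached copy.
    System.out.println(first.equals(second) && tracker.getDownloadCount() == 1);
  }
}
```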

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.