[jira] Commented: (MAPREDUCE-1672) Create test scenario for "distributed cache file behaviour, when dfs file is not modified"

2010-07-05 Thread Iyappan Srinivasan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885420#action_12885420
 ] 

Iyappan Srinivasan commented on MAPREDUCE-1672:
---

There was no such functionality for creating a DFS file when I created this 
testcase.
This code was already checked in to 20.1.xxx some time back, after review 
comments were addressed (after 20th April). The latest fix 
(TestDistributedCacheUnModifiedFile-ydist-security-patch.txt) makes the 
testcase compatible with the security release. The issue is that JobClient is 
not supposed to submit jobs multiple times using the same job token. Once this 
variable is set to false, JobClient can submit multiple jobs. This was given 
+1 by Balaji on 17 June and I checked it into the security_20.1.xxx branch. I 
could not create the trunk patch, as this testcase has not been forward-ported yet.

> Create test scenario for "distributed cache file behaviour, when dfs file is 
> not modified"
> --
>
> Key: MAPREDUCE-1672
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1672
> Project: Hadoop Map/Reduce
>  Issue Type: Test
>  Components: test
>Reporter: Iyappan Srinivasan
>Assignee: Iyappan Srinivasan
> Attachments: 
> TEST-org.apache.hadoop.mapred.TestDistributedCacheUnModifiedFile.txt, 
> TEST-org.apache.hadoop.mapred.TestDistributedCacheUnModifiedFile.txt, 
> TestDistributedCacheUnModifiedFile-ydist-security-patch.txt, 
> TestDistributedCacheUnModifiedFile.patch, 
> TestDistributedCacheUnModifiedFile.patch, 
> TestDistributedCacheUnModifiedFile.patch, 
> TestDistributedCacheUnModifiedFile.patch, 
> TestDistributedCacheUnModifiedFile.patch, 
> TestDistributedCacheUnModifiedFile.patch, 
> TestDistributedCacheUnModifiedFile.patch
>
>
> This test scenario covers distributed cache file behaviour when the
> file is not modified before or after being accessed by at most two
> jobs. Once a job uses a distributed cache file, that file is stored
> under mapred.local.dir. If the next job uses the same file, it is not
> stored again. So, if two jobs choose the same tasktracker for their
> execution, the distributed cache file should not be found twice.
> This testcase should run a job with a distributed cache file. The
> tasktracker handle for each task is obtained and checked for the
> presence of the distributed cache file, with proper permissions, in
> the proper directory. Then, when the job runs again, if any of its
> tasks hits a tasktracker that ran a task of the previous job, the
> file should not be uploaded again and the task should use the old file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1672) Create test scenario for "distributed cache file behaviour, when dfs file is not modified"

2010-06-29 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883711#action_12883711
 ] 

Konstantin Boudnik commented on MAPREDUCE-1672:
---

Is there really no such thing as creating a DFS file yet?
Also, your latest patch for y20 security seems to be too short.






[jira] Commented: (MAPREDUCE-1672) Create test scenario for "distributed cache file behaviour, when dfs file is not modified"

2010-06-17 Thread Balaji Rajagopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879757#action_12879757
 ] 

Balaji Rajagopalan commented on MAPREDUCE-1672:
---

+1 





[jira] Commented: (MAPREDUCE-1672) Create test scenario for "distributed cache file behaviour, when dfs file is not modified"

2010-04-18 Thread Vinay Kumar Thota (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12858385#action_12858385
 ] 

Vinay Kumar Thota commented on MAPREDUCE-1672:
--

+  Assert.assertNotNull("jobInfo is null" + jInfo,
+  jInfo.getStatus().getRunState());

The above statement is meant to check the jInfo instance, right? In that case 
there is no point in invoking getStatus() in the statement: if the instance 
is null, the getStatus() call itself will throw an NPE before the assertion 
can run. So change the statement as below.

Assert.assertNotNull("jobInfo is null", jInfo);



   if (count > 10) {
+    Assert.fail("Since the sleep count has reached beyond a point " +
+        "failing at this point");
+  }


I would suggest that the error message here be more concrete, instead of 
saying 'count has reached beyond a point'. I mean, the message should be 
something like "Job has not started for 10 mins", so the test fails with 
that. While debugging, the user will then have clear information about why 
the test failed and how long it waited for the job to start.
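The wait-and-fail pattern under discussion can be sketched in a self-contained form (names and timings are hypothetical, not taken from the actual patch); the point is that the timeout message states exactly how long was waited:

```java
public class WaitForJob {
  /** Minimal stand-in for the job-state probe; hypothetical. */
  interface Condition {
    boolean holds();
  }

  /**
   * Poll until the condition holds or maxAttempts polls elapse.
   * On timeout, fail with a message that says how long we waited,
   * rather than a vague "count has reached beyond a point".
   */
  static void waitFor(Condition c, int maxAttempts, long sleepMs)
      throws InterruptedException {
    for (int count = 0; count < maxAttempts; count++) {
      if (c.holds()) {
        return;
      }
      Thread.sleep(sleepMs);
    }
    throw new AssertionError("Job has not started after "
        + maxAttempts + " polls of " + sleepMs + " ms");
  }

  public static void main(String[] args) throws InterruptedException {
    // A condition that is immediately true returns without failing.
    waitFor(() -> true, 10, 100L);
    System.out.println("started");
  }
}
```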



for (String taskTracker : taskTrackers) {
+  //Formatting tasktracker to get just its FQDN 
+  taskTracker = UtilsForTests.getFQDNofTT(taskTracker);
+  LOG.info("taskTracker is :" + taskTracker);
+
+  //This will be entered from the second job onwards
+  if (countLoop > 1) {
+if (taskTracker != null) {
+  continueLoop = taskTrackerCollection.contains(taskTracker);
+}
+if (!continueLoop) {
+  break;
+}
+  }
+
+  //Collecting the tasktrackers
+  if (taskTracker != null)  
+taskTrackerCollection.add(taskTracker);

The above code contains many if statements and is pretty hard to untangle, 
so you could optimize it as below. It's just my opinion.

for (String taskTracker : taskTrackers) {
+  // Formatting tasktracker to get just its FQDN
+  taskTracker = UtilsForTests.getFQDNofTT(taskTracker);
+  LOG.info("taskTracker is :" + taskTracker);
   if (taskTrackerCollection.size() == 0) {
     taskTrackerCollection.add(taskTracker);
     break;
   } else if (!taskTrackerCollection.contains(taskTracker)) {
     taskTrackerCollection.add(taskTracker);
     break;
   }
}
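Both loop variants above deduplicate tasktrackers by a membership check. As a further sketch (hypothetical, not from the patch), a java.util.Set collapses the contains-then-add pair into one call, since Set.add is a no-op for elements already present; getFQDNofTT is re-imagined here as a simple string-stripping stand-in for UtilsForTests.getFQDNofTT:

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class TrackerDedup {
  /**
   * Hypothetical stand-in for UtilsForTests.getFQDNofTT: reduce a
   * tasktracker id such as
   * "tracker_host1.example.com:localhost/127.0.0.1:5050"
   * to just the FQDN "host1.example.com".
   */
  static String getFQDNofTT(String taskTrackerLong) {
    String s = taskTrackerLong.startsWith("tracker_")
        ? taskTrackerLong.substring("tracker_".length())
        : taskTrackerLong;
    int colon = s.indexOf(':');
    return colon >= 0 ? s.substring(0, colon) : s;
  }

  /** Collect each tasktracker FQDN once; Set.add ignores duplicates. */
  static Set<String> collectTrackers(String[] taskTrackers) {
    Set<String> seen = new LinkedHashSet<>();
    for (String tt : taskTrackers) {
      String fqdn = getFQDNofTT(tt);
      if (fqdn != null) {
        seen.add(fqdn); // returns false (no-op) if fqdn was already seen
      }
    }
    return seen;
  }

  public static void main(String[] args) {
    String[] tts = {
        "tracker_host1.example.com:localhost/127.0.0.1:5050",
        "tracker_host2.example.com:localhost/127.0.0.1:5050",
        "tracker_host1.example.com:localhost/127.0.0.1:5050"
    };
    System.out.println(collectTrackers(tts)); // [host1.example.com, host2.example.com]
  }
}
```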





[jira] Commented: (MAPREDUCE-1672) Create test scenario for "distributed cache file behaviour, when dfs file is not modified"

2010-04-13 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856495#action_12856495
 ] 

Konstantin Boudnik commented on MAPREDUCE-1672:
---

- In UtilsForTests I'd suggest changing the name of the new method
{noformat}
+  public static String format (String taskTrackerLong) throws Exception {
{noformat}
to something like {{getFQDNofTT}}, or something along those lines. {{format}} is 
too vague, and the intention of a public utility method should be clear from its 
name; otherwise it forces people to look into the JavaDocs just to understand 
what it is intended to do.






[jira] Commented: (MAPREDUCE-1672) Create test scenario for "distributed cache file behaviour, when dfs file is not modified"

2010-04-12 Thread Balaji Rajagopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12855928#action_12855928
 ] 

Balaji Rajagopalan commented on MAPREDUCE-1672:
---

Overall the code looks good. I have one comment: instead of using a String[] for 
the previous tasktrackers, a vector could be used and the third inner for loop 
avoided, since we are trying to see whether the given collection contains the 
tasktracker, which is a String object.

String taskTrackerCollection[] = new String[30];

If this comment is addressed, I think the code is ready for check-in.
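The suggestion can be sketched as follows (a hypothetical illustration, not the actual test code): a dynamic collection answers the membership question with contains(), so no inner scan over a fixed-size array is needed:

```java
import java.util.ArrayList;
import java.util.List;

public class TrackerList {
  /**
   * Instead of a fixed-size array (String[30]) plus a manual inner loop,
   * a dynamic list does the membership lookup directly.
   * Returns true if the tracker was newly added.
   */
  static boolean addIfAbsent(List<String> collection, String taskTracker) {
    if (collection.contains(taskTracker)) {
      return false; // already seen this tracker
    }
    collection.add(taskTracker);
    return true;
  }

  public static void main(String[] args) {
    List<String> taskTrackerCollection = new ArrayList<>();
    addIfAbsent(taskTrackerCollection, "host1.example.com");
    addIfAbsent(taskTrackerCollection, "host1.example.com"); // duplicate, ignored
    addIfAbsent(taskTrackerCollection, "host2.example.com");
    System.out.println(taskTrackerCollection.size()); // 2
  }
}
```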






[jira] Commented: (MAPREDUCE-1672) Create test scenario for "distributed cache file behaviour, when dfs file is not modified"

2010-04-06 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854222#action_12854222
 ] 

Konstantin Boudnik commented on MAPREDUCE-1672:
---

Looks good to me, except that there's already a file 
{{src/test/org/apache/hadoop/mapred/UtilsForTests.java}}, so please consider 
adding your utils there rather than creating a brand-new one.

Also, please ask some MR expert to look at it from an MR standpoint - I don't 
have enough knowledge in the field. It would also help if you could run the 
test and post the results to the JIRA.

