[jira] Commented: (MAPREDUCE-896) Users can set non-writable permissions on temporary files for TT and can abuse disk usage.

2009-12-05 Thread Hemanth Yamijala (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786575#action_12786575
 ] 

Hemanth Yamijala commented on MAPREDUCE-896:


I looked at the Y! 20 patch. Some comments:

- TaskTracker.buildPathForDeletion need not be public.
- Was there a need to change CleanupQueue.addToQueue to take a FileSystem as an 
argument instead of a Configuration? It has caused more changes than this patch 
requires - for instance in JobTracker and JobInProgress. Can we retain the 
original API and pass in a Configuration as before?
- When adding a task directory for deletion, we are adding paths from all the 
local directories instead of just the one where the task's files were actually 
created. At a minimum, this is more work than necessary. More importantly, I 
don't know what side effects this may cause. We can check which of the local 
directories the path belongs to (by doing a startsWith on the path) and add 
only that one, I think.
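The startsWith check suggested above could look something like the following. This is a hypothetical sketch - the class and method names are illustrative, not the real TaskTracker API - and a production version would also want to guard against a prefix match that crosses a path-component boundary:

```java
import java.util.List;
import java.util.Optional;

public class LocalDirMatcher {
    // Return the single configured local dir that owns taskPath, if any,
    // so only that one directory is queued for deletion.
    // Note: a plain startsWith would also match "/disk1/mapred/local2" for
    // the dir "/disk1/mapred/local"; real code should check the separator.
    static Optional<String> owningLocalDir(List<String> localDirs, String taskPath) {
        return localDirs.stream()
                .filter(dir -> taskPath.startsWith(dir + "/"))
                .findFirst();
    }
}
```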
- Shouldn't getLocalDirs always take the tasktracker's configuration? In that 
case, it doesn't need to take the JobConf as a parameter, but can use fConf.
- The log statements in CleanupQueue.PathCleanupThread.run print 
context.path, which will only be the mapred local dir. We actually need the 
full path; otherwise the log statements could be confusing. Indeed, I would 
suggest a slight refactoring of the PathDeletionContext class, because as it 
stands only one mode or the other works - either a fullPath is provided, or 
the path is built from other bits of data like the jobId, taskId etc. 
So I would suggest something along the following lines:
{code}
class PathDeletionContext {
  String fullPath;
  FileSystem fs;

  protected String getPathForDeletion() {
    return fullPath;
  }

  protected void enablePathForCleanup() {
    // do nothing
  }
}

class TaskControllerPathDeletionContext extends PathDeletionContext {
  String user;
  String jobId;
  String taskId;
  boolean isCleanupTask;
  boolean isWorkDir;
  TaskController taskController;
  Path p;

  @Override
  protected String getPathForDeletion() {
    return TaskTracker.buildPathForDeletion(this);
  }

  @Override
  protected void enablePathForCleanup() {
    taskController.enablePathForCleanup(this);
  }
}
{code}

Then we can use PathDeletionContext wherever we don't need the taskController, 
and the sub-class in the other cases. CleanupQueue will naturally take and 
store PathDeletionContext objects, and getPathForDeletion can be called to 
get the final path for deletion. I feel this design is cleaner. Thoughts?
- DefaultTaskController.enableTaskForCleanup should be package private.
- In other APIs of LinuxTaskController - like buildLaunchTaskArgs - we find out 
if the task is a cleanup task and adjust the paths directly. I think we can do 
the same for the new command as well. This is no less secure, because we are 
still constructing the full path from the command args, but it abstracts the 
task-controller from details like cleanup tasks. It is less clear whether the 
same should be done for workDir (i.e. whether we should append it to the taskid 
in LinuxTaskController itself); for that we may still need a flag, but I am OK 
if that too is resolved in LinuxTaskController and we completely eliminate the 
flags passed to task-controller.
- The List of args in buildChangePathPermissionsArgs should be of the right 
size (it's not 5). Also, I think it is useful to keep the order of arguments 
the same, i.e. let the mapred local dir be the first argument, then the 
job-id, then the task-id.
- I think we must allocate exactly the amount of memory required in 
build_dir_path. This can be done by defining a format string like 
TT_LOCAL_TASK_SCRIPT_PATTERN and then summing the lengths of this string and 
of the arguments - the jobid, taskid, mapred local dir etc. Then we can use 
snprintf to build the path instead of multiple (unsafe) strcat and strcpy 
calls. Again, please look at get_task_file_path for an example.
- The return values of calls like malloc should all be checked. Once this is 
done, calls to build_dir_path can fail, which must also be checked.
- In TaskRunner.deleteDirContents, I think that if we get an 
InterruptedException we should return immediately. Otherwise the operation is 
not really interrupted, and the thread can get stuck permanently.
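A minimal sketch of the interrupt-handling pattern being suggested; the real TaskRunner.deleteDirContents is different code (names here are illustrative), but the point carries over - on InterruptedException, restore the interrupt flag and return at once rather than continuing to retry, which is what can leave the thread stuck:

```java
public class CleanupHelper {
    // Runs deleteOnce up to 'attempts' times, backing off between tries.
    // Returns true if all attempts ran, false if the thread was interrupted.
    public static boolean deleteWithRetry(Runnable deleteOnce, int attempts) {
        for (int i = 0; i < attempts; i++) {
            deleteOnce.run();
            try {
                Thread.sleep(10); // back off before the next attempt
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false; // bail out now; do not keep looping
            }
        }
        return true;
    }
}
```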
- The intent of the test case in TestChildTaskDirs is nice. But since 
directory cleanup happens asynchronously, I am worried this might fail due to 
timing issues (as the TODO in the comment says). One option could be to use 
an inline directory cleaner. Can we try that?
- Should we also verify that the task attempt dir is cleaned up?
- There are some TODOs in the tests; can you please remove them after 
addressing the concerns?
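The inline directory cleaner mentioned above could be as simple as a CleanupQueue variant whose addToQueue deletes synchronously, so the test can assert on directory state right after the call with no async timing window. This is a hypothetical sketch, not the actual test hook:

```java
import java.io.File;

public class InlineCleanupQueue {
    // Delete each path immediately instead of handing it to a background thread.
    public void addToQueue(File... paths) {
        for (File p : paths) {
            deleteRecursively(p);
        }
    }

    private static void deleteRecursively(File f) {
        File[] children = f.listFiles();
        if (children != null) {
            for (File c : children) {
                deleteRecursively(c);
            }
        }
        f.delete();
    }
}
```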



[jira] Commented: (MAPREDUCE-1250) Refactor job token to use a common token interface

2009-12-05 Thread Kan Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786553#action_12786553
 ] 

Kan Zhang commented on MAPREDUCE-1250:
--

Attached a patch that refactors the existing job token code to use the new 
interface being added in HADOOP-6415. This patch is derived from earlier 
discussions in HADOOP-6373.

> Refactor job token to use a common token interface
> --
>
> Key: MAPREDUCE-1250
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1250
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: security
>Reporter: Kan Zhang
>Assignee: Kan Zhang
> Attachments: m1250-09.patch
>
>
> The idea is to use a common token interface for both the job token and the 
> delegation token (HADOOP-6373), so that the RPC layer that uses them doesn't 
> have to differentiate between them.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-1266) Allow heartbeat interval smaller than 3 seconds for tiny clusters

2009-12-05 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786552#action_12786552
 ] 

Arun C Murthy commented on MAPREDUCE-1266:
--

It takes ~1s for the map/reduce task JVM to come up... 

I'm failing to see how sub-second heartbeat intervals will help.

> Allow heartbeat interval smaller than 3 seconds for tiny clusters
> -
>
> Key: MAPREDUCE-1266
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1266
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: jobtracker, task, tasktracker
>Affects Versions: 0.22.0
>Reporter: Todd Lipcon
>Priority: Minor
>
> For small clusters, the heartbeat interval has a large effect on job latency. 
> This is especially true on pseudo-distributed or other "tiny" (<5 nodes) 
> clusters. It's not a big deal for production, but new users would have a 
> happier first experience if Hadoop seemed snappier.
> I'd like to change the minimum heartbeat interval from 3.0 seconds to perhaps 
> 0.5 seconds (but have it governed by an undocumented config parameter in case 
> people don't like this change). The cluster size-based ramp up of interval 
> will maintain the current scalable behavior for large clusters with no 
> negative effect.




[jira] Updated: (MAPREDUCE-1250) Refactor job token to use a common token interface

2009-12-05 Thread Kan Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kan Zhang updated MAPREDUCE-1250:
-

Attachment: m1250-09.patch





[jira] Commented: (MAPREDUCE-1084) Implementing aspects development and fault injection framework for MapReduce

2009-12-05 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786545#action_12786545
 ] 

Konstantin Boudnik commented on MAPREDUCE-1084:
---

The svn:externals patch is better abandoned, because vanilla git doesn't 
understand svn:externals. So let's consider the patch without externals as the 
only alternative at the moment.

A few comments:
- the patch seems to be missing the {{src/test/aop/build/aop.xml}} file
- the target name {{jar-mapred-test-fault-inject}} needs to be changed to 
{{jar-mapred-fault-inject}} to be consistent with the Common and HDFS names


> Implementing aspects development and fault injection framework for MapReduce
> 
>
> Key: MAPREDUCE-1084
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1084
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>  Components: build, test
>Reporter: Konstantin Boudnik
>Assignee: Sreekanth Ramakrishnan
> Attachments: mapreduce-1084-1-withoutsvnexternals.patch, 
> mapreduce-1084-1.patch
>
>
> Similar to HDFS-435 and HADOOP-6204, this JIRA will track the introduction of 
> an injection framework for MapReduce.
> After HADOOP-6204 is in place, this particular modification should be 
> trivial: it would take importing src/test/build (via svn:externals) and 
> some tweaking of the build.xml file.




[jira] Commented: (MAPREDUCE-1266) Allow heartbeat interval smaller than 3 seconds for tiny clusters

2009-12-05 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786474#action_12786474
 ] 

Todd Lipcon commented on MAPREDUCE-1266:


It basically does that already - the clusterSize variable above is the number 
of task trackers. Reducing the minimum while leaving the other argument to 
Math.max unchanged should maintain the current behavior for large clusters, 
and automatically reduce the interval on small ones.
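The computation under discussion can be sketched as follows. The constant names, values, and scaling factor here are assumptions for illustration, not the exact JobTracker code; the point is that lowering only the minimum leaves the cluster-size term of Math.max in charge for large clusters, while tiny clusters hit the new, lower floor:

```java
public class HeartbeatInterval {
    static final int MIN_INTERVAL_MS = 500; // proposed floor (was 3000)

    // Scale the interval so the JobTracker sees roughly a fixed number of
    // heartbeats per second across the whole cluster, but never go below
    // the configured minimum.
    static int nextHeartbeatInterval(int clusterSize) {
        int scaled = (int) (1000.0 * clusterSize / 100);
        return Math.max(scaled, MIN_INTERVAL_MS);
    }
}
```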





[jira] Commented: (MAPREDUCE-1266) Allow heartbeat interval smaller than 3 seconds for tiny clusters

2009-12-05 Thread Konstantin Boudnik (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786470#action_12786470
 ] 

Konstantin Boudnik commented on MAPREDUCE-1266:
---

Here's an interesting thought: why don't we throw some ergonomics into Hadoop's 
intellect? If heartbeats seem to increase latencies on small clusters, then 
perhaps Hadoop can lower the interval dynamically when a small cluster is 
'detected'?





[jira] Commented: (MAPREDUCE-1097) Changes/fixes to support Vertica 3.5

2009-12-05 Thread Omer Trajman (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786413#action_12786413
 ] 

Omer Trajman commented on MAPREDUCE-1097:
-

Test failure was in gridmix. Any suggestions? Should I just resubmit?

> Changes/fixes to support Vertica 3.5
> 
>
> Key: MAPREDUCE-1097
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1097
> Project: Hadoop Map/Reduce
>  Issue Type: Improvement
>Affects Versions: 0.21.0
> Environment: Hadoop 0.21.0 pre-release and Vertica 3.5
>Reporter: Omer Trajman
>Assignee: Omer Trajman
>Priority: Minor
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-1097.patch
>
>
> Vertica 3.5 includes three changes that the formatters should handle:
> 1) a deploy_design function that takes over much of the logic in the optimize 
> method. This improvement uses deploy_design, if the server version supports 
> it, instead of orchestrating in the formatter function.
> 2) truncating the table instead of recreating it
> 3) numeric, decimal, money and number types (all the same code path)




[jira] Created: (MAPREDUCE-1269) Failed on write sequence files in mapper.

2009-12-05 Thread YangLai (JIRA)
Failed on write sequence files in mapper.
-

 Key: MAPREDUCE-1269
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1269
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 0.20.1
 Environment: Hadoop 0.20.1
Compiled by oom on Tue Sep  1 20:55:56 UTC 2009

Linux version 2.6.18-128.el5 (mockbu...@builder10.centos.org) (gcc version 
4.1.2 20080704 (Red Hat 4.1.2-44)) #1 SMP Wed Jan 21 10:41:14 EST 2009

Reporter: YangLai
Priority: Critical


Because the sort phase is not necessary for my job, I want to write only 
values into sequence files, selected by key. So I keep a HashMap in the 
mapper: 

private HashMap hm;

and look up a suitable org.apache.hadoop.io.SequenceFile.Writer in the 
HashMap: 

Writer seqWriter = hm.get(skey);
if (seqWriter == null) {
    try {
        seqWriter = new SequenceFile.Writer(new JobClient(job).getFs(), job,
                new Path(pPathOut, skey),
                VLongWritable.class, ByteWritable.class);
    } catch (IOException e) {
        e.printStackTrace();
    }
    if (seqWriter != null) {
        hm.put(skey, seqWriter);
    } else {
        return;
    }
}

The file names are obtained from job.get("mapred.task.id"), which ensures no 
duplicates exist.
The system always outputs: 

java.io.IOException: Could not obtain block: blk_-5398274085876111743_1021 file=/YangLai/ranNum1GB/part-00015
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1787)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1615)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1742)
        at java.io.DataInputStream.readFully(Unknown Source)
        at java.io.DataInputStream.readFully(Unknown Source)
        at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1450)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
        at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:43)
        at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:63)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:338)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)

In fact, each mapper only writes 16 sequence files, which should not overload 
the Hadoop system. 





[jira] Commented: (MAPREDUCE-1209) Move common specific part of the test TestReflectionUtils out of mapred into common

2009-12-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786360#action_12786360
 ] 

Hadoop QA commented on MAPREDUCE-1209:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12427005/mapreduce-1209.txt
  against trunk revision 887135.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/294/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/294/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/294/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/294/console

This message is automatically generated.

> Move common specific part of the test TestReflectionUtils out of mapred into 
> common
> ---
>
> Key: MAPREDUCE-1209
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1209
> Project: Hadoop Map/Reduce
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 0.21.0, 0.22.0
>Reporter: Vinod K V
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.21.0, 0.22.0
>
> Attachments: mapreduce-1209.txt
>
>
> As commented by Tom here 
> (https://issues.apache.org/jira/browse/HADOOP-6230?focusedCommentId=12751058&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12751058),
 TestReflectionUtils has a single test, testSetConf(), to test backward 
> compatibility of ReflectionUtils for JobConfigurable objects. 
> TestReflectionUtils can be split into two tests - one in common and one in 
> mapred - and this single test may reside in mapred till the mapred package is 
> removed.
