[jira] Commented: (MAPREDUCE-181) mapred.system.dir should be accessible only to hadoop daemons

2009-08-19 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744944#action_12744944
 ] 

Devaraj Das commented on MAPREDUCE-181:
---

Some more details on the split file handling:
1) The FileSystem used for writing the split bytes would be the same filesystem 
where mapred.system.dir is located.
2) The split info (actual split bytes) would get written to the user's home 
directory on that filesystem (e.g., /user//.mapreduce/jobid)
3) The split info can be cleaned up by the cleanup task of the job.
For now, let's postpone the special handling for the JobConf, and instead put a 
cap on the max size (like 1 MB).
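
A cap like this could be checked on the client side before the job is submitted. What follows is only a minimal sketch, assuming the JobConf has already been serialized to bytes. The names MAX_CONF_BYTES, JobConfTooLargeError and check_conf_size are hypothetical (they are not Hadoop APIs), and the 1 MB value is simply the figure floated above.

```python
# Hypothetical client-side guard; none of these names exist in Hadoop.
MAX_CONF_BYTES = 1 * 1024 * 1024  # 1 MB, the example cap mentioned above


class JobConfTooLargeError(ValueError):
    """Raised when a serialized JobConf exceeds the configured cap."""


def check_conf_size(serialized_conf: bytes, cap: int = MAX_CONF_BYTES) -> int:
    """Return the size of the serialized conf, or raise if it is over the cap."""
    size = len(serialized_conf)
    if size > cap:
        raise JobConfTooLargeError(
            f"serialized JobConf is {size} bytes; the cap is {cap} bytes"
        )
    return size
```

Rejecting an oversized conf before the RPC is made would keep the limit visible to the user at submission time, though whether the check belongs on the client, the JobTracker, or both is left open here.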

> mapred.system.dir should be accessible only to hadoop daemons 
> --
>
> Key: MAPREDUCE-181
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-181
> Project: Hadoop Map/Reduce
>  Issue Type: Bug
>Reporter: Amar Kamat
>Assignee: Amar Kamat
> Attachments: hadoop-3578-branch-20-example-2.patch, 
> hadoop-3578-branch-20-example.patch, HADOOP-3578-v2.6.patch, 
> HADOOP-3578-v2.7.patch
>
>
> Currently the jobclient accesses the {{mapred.system.dir}} to add job 
> details. Hence the {{mapred.system.dir}} has the permissions of 
> {{rwx-wx-wx}}. This could be a security loophole where the job files might 
> get overwritten/tampered after the job submission. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (MAPREDUCE-181) mapred.system.dir should be accessible only to hadoop daemons

2009-08-18 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744467#action_12744467
 ] 

Devaraj Das commented on MAPREDUCE-181:
---

I wonder whether it makes sense to have the jobclient write two files per 
split file:

1) the splits info (the actual bytes) written to a secure location on HDFS 
(with permissions 700)
2) the split metadata, which is a set of entries like 
{:.., 
} for each map-id. This is serialized over 
RPC, and the JobTracker writes it to the well known mapred-system-directory 
(which the JobTracker owns with perms 700).

The JobTracker just reads/loads the metadata, and creates the TIP cache.

The TaskTracker is handed off a split object that looks something like 
{}. As part of task localization, the TT 
copies the specific bytes from the split file (securely) and launches the task, 
which then reads the split; alternatively, the TT could simply stream the bytes 
over RPC to the child. The replication factor could be set to a high number for 
the splits info file.

Doing it in this way should reduce the size of the split file information 
considerably (and we can have a cap on the metadata size as well), and also 
provide security for the user generated split files' content.
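
The two-file idea can be illustrated with a small local-filesystem sketch. It is not Hadoop code, and it makes assumptions that go beyond the comment above: the metadata entry is modelled as a hypothetical (map_id, offset, length) record, the file name job.split is invented, and ordinary POSIX permissions stand in for HDFS permissions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitMeta:
    """Hypothetical metadata entry: where one map's split bytes live."""
    map_id: int
    offset: int
    length: int


def write_splits(directory, splits):
    """Write all split payloads into one owner-only file.

    Returns the file path and one SplitMeta per payload. Only the small
    metadata list would need to travel over RPC in this sketch.
    """
    os.makedirs(directory, exist_ok=True)
    os.chmod(directory, 0o700)  # owner-only directory, mirroring "perms 700"
    path = os.path.join(directory, "job.split")  # assumed file name
    metadata = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as out:
        for map_id, payload in enumerate(splits):
            # Record the position before writing so a reader can seek to it.
            metadata.append(SplitMeta(map_id, out.tell(), len(payload)))
            out.write(payload)
    return path, metadata


def read_split(path, meta):
    """Read only the bytes belonging to one map, as a TT might when localizing."""
    with open(path, "rb") as f:
        f.seek(meta.offset)
        data = f.read(meta.length)
    if len(data) != meta.length:
        raise EOFError(f"split for map {meta.map_id} is truncated")
    return data
```

In this sketch the metadata stays small no matter how large the individual splits are, which is the property that would make a cap on its size practical.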

For the JobConf, passing only the basic, minimum info to the JobTracker, as 
Hong suggested on MAPREDUCE-841, seems to make sense. For all other conf 
properties, the Task can load them directly from HDFS. The max size (in 
bytes) of the basic information could easily be derived, and we could 
have a cap on that for the RPC communication.

Thoughts?




[jira] Commented: (MAPREDUCE-181) mapred.system.dir should be accessible only to hadoop daemons

2009-08-14 Thread Amar Kamat (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743145#action_12743145
 ] 

Amar Kamat commented on MAPREDUCE-181:
--

MAPREDUCE-807 is one more reason why we should close mapred.system.dir.




[jira] Commented: (MAPREDUCE-181) mapred.system.dir should be accessible only to hadoop daemons

2009-07-27 Thread Amar Kamat (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12735547#action_12735547
 ] 

Amar Kamat commented on MAPREDUCE-181:
--

[patch2|https://issues.apache.org/jira/secure/attachment/12414412/hadoop-3578-branch-20-example-2.patch]
 assumes that 
[patch1|https://issues.apache.org/jira/secure/attachment/12410472/hadoop-3578-branch-20-example.patch]
 is applied.




[jira] Commented: (MAPREDUCE-181) mapred.system.dir should be accessible only to hadoop daemons

2009-07-24 Thread Amar Kamat (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734944#action_12734944
 ] 

Amar Kamat commented on MAPREDUCE-181:
--

bq. Attaching a patch for branch-0.20 with some bug fixes.
This is an example patch, not to be committed.


