Streaming should execute Unix commands and scripts in well known languages 
without user specifying the path
-----------------------------------------------------------------------------------------------------------

                 Key: HADOOP-477
                 URL: http://issues.apache.org/jira/browse/HADOOP-477
             Project: Hadoop
          Issue Type: Bug
          Components: contrib/streaming
            Reporter: arkady borkovsky



If the executable given for -mapper or -reducer is a well-known command (grep, cat, awk), 
Streaming should make sure that the executable is found.
If a script given for -mapper or -reducer is in a well-known language (.pl, .py), 
Streaming should execute it with the correct language processor.
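
As an illustration of the second point, here is a minimal sketch of how a 
command could be given its language processor based on the script extension. 
It is not Streaming's actual code: the function name tailor_command, the 
INTERPRETERS table, and the interpreter names "perl" and "python" are all 
assumptions made for the example.

```python
import os
import shlex

# Hypothetical mapping from script extension to language processor.
# The interpreter names are assumptions; a cluster may use other names.
INTERPRETERS = {".pl": "perl", ".py": "python"}


def tailor_command(command):
    """Split a -mapper/-reducer command and prepend an interpreter if the
    first word looks like a script in a well-known language."""
    argv = shlex.split(command)
    if not argv:
        return argv
    ext = os.path.splitext(argv[0])[1].lower()
    interpreter = INTERPRETERS.get(ext)
    if interpreter:
        return [interpreter] + argv
    # Not a recognized script: leave the command unchanged.
    return argv
```

With this sketch, "count.pl -v" would become perl count.pl -v, while 
"grep foo" would be passed through as is.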

Reason:
Many jobs are started from machines whose environment differs from that of 
the cluster.
Moreover, different clusters may have different environments.
Also, a user may have no access to the cluster machines.
Because of this, a user may be unable to specify correct paths for standard 
commands, or the correct language processors for scripts.

Implementation:
Streaming may tailor the commands by prepending the path or the name of the 
language processor.
Another solution is to make sure that the commands are executed in a 
"meaningful" environment (with a good $PATH and the other variables Unix users 
are accustomed to counting on).
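
A rough sketch of the "meaningful environment" idea follows. It only 
illustrates the approach and is not taken from Streaming: the names 
DEFAULT_DIRS, meaningful_env and resolve_executable are hypothetical, and the 
list of default directories is an assumption that a real cluster would likely 
need to adjust.

```python
import os
import shutil

# Assumed default search directories; a real cluster may need a different list.
DEFAULT_DIRS = ["/usr/local/bin", "/usr/bin", "/bin"]


def meaningful_env(base_env):
    """Return a copy of base_env whose PATH also covers DEFAULT_DIRS."""
    env = dict(base_env)
    dirs = [d for d in env.get("PATH", "").split(os.pathsep) if d]
    for d in DEFAULT_DIRS:
        if d not in dirs:
            dirs.append(d)
    env["PATH"] = os.pathsep.join(dirs)
    return env


def resolve_executable(name, env):
    """Return the full path of name found on env's PATH, or None."""
    return shutil.which(name, path=env["PATH"])
```

The task could then either launch the command with this environment, or use 
resolve_executable to prepend the path to a well-known command such as grep 
before launching it.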

Once again, Streaming is a user-facing tool -- it is not a library or a hackable 
example that users are expected to modify for their needs.  So it should work 
out of the box.


        
