Hello all, I'm trying to run a little Python script with Hadoop streaming on my small Hadoop cluster, and it doesn't work like I thought it would.
The script looks something like

    #!/usr/bin/env python
    import mylib
    dostuff

where mylib is a small Python library that I want included, and I launch the whole thing with something like

    bin/hadoop jar contrib/streaming/hadoop-0.16.4-streaming.jar \
        -cacheFile "hdfs://master:54310/user/hadoop/mylib.py#mylib.py" \
        -file script.py \
        -mapper "script.py" \
        -input input \
        -output output

so it seems to me like the library should be available to the script. When I run the script locally on my machine, everything works perfectly fine. However, when I run it on the cluster, the script can't find the library.

Does Hadoop do anything strange to default paths? Am I missing something obvious? Any pointers or ideas on how to fix this would be great.

Martin Blom