Hi Keith,

I tried the exact use case you described and it works fine for me.
Here are the commands I ran:

[ramya]$ jar vxf samplelib.jar
 created: META-INF/
 inflated: META-INF/MANIFEST.MF
 inflated: libhdfs.so

[ramya]$ hadoop dfs -put samplelib.jar samplelib.jar

[ramya]$ hadoop jar hadoop-streaming.jar -input InputDir \
    -mapper "ls testlink/libhdfs.so" -reducer NONE -output out \
    -cacheArchive hdfs://<namenode>:<port>/user/ramya/samplelib.jar#testlink

[ramya]$ hadoop dfs -cat out/*
testlink/libhdfs.so
testlink/libhdfs.so
testlink/libhdfs.so
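
Note that in the listing above the .so shows up under testlink/, the
directory named after the '#', and not directly in the task's working
directory. That is likely why your executable cannot find it. In case it is
useful, here is a minimal sketch of a wrapper script that adds that directory
to the linker search path before starting the real mapper. The link name
"testlink" comes from my command above; the helper name prepend_lib_path and
the idea of passing the mapper as arguments are only assumptions for
illustration, not something I have run on your job.

```shell
#!/bin/sh
# Hypothetical wrapper for a streaming mapper. Assumes the job was submitted
# with -cacheArchive ...samplelib.jar#testlink, so the unpacked archive
# appears as the directory ./testlink in the task's working directory.

# Name given after '#' in -cacheArchive (assumption: matches the job command).
LINK_DIR="testlink"

# Prepend a directory to LD_LIBRARY_PATH, keeping any existing entries.
prepend_lib_path() {
    dir="$1"
    if [ -n "${LD_LIBRARY_PATH:-}" ]; then
        LD_LIBRARY_PATH="$dir:$LD_LIBRARY_PATH"
    else
        LD_LIBRARY_PATH="$dir"
    fi
    export LD_LIBRARY_PATH
}

prepend_lib_path "$PWD/$LINK_DIR"

# If a command was passed as arguments, replace this shell with it so that
# it inherits the updated LD_LIBRARY_PATH.
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```

With hypothetical file names, the mapper option would then look something
like -mapper "libwrap.sh ./my_mapper", with both files shipped to the tasks
using -file.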


Hope it helps.

Thanks
Ramya

On 8/5/11 10:10 AM, "Keith Wiley" <kwi...@keithwiley.com> wrote:


I can use cacheFile to load .so files into the distributed cache and it
works fine (the streaming executable links against the .so and runs), but I
can't get it to work with -cacheArchive.  It always says it can't find the
.so file.  I realize that if you jar a directory, the directory will be
recreated when you unjar, but I've tried jarring a file directly.  It is
easily verified that unjarring such a file reproduces the original file as a
sibling of the jar file itself.  So it seems to me that cacheArchive should
have transferred the jar file to the cwd of my task, unjarred it, and
produced a .so file right there, but it doesn't link up with the executable.
Like I said, I know this basic approach works just fine with cacheFile.

What could be the problem here?  I can't easily see the files on the cluster
since it is a remote cluster with limited access.  I don't believe I can ssh
to any individual machine to investigate the files that are created for a
task...but I think I have worked through the process logically and I'm not
sure what I'm doing wrong.

Thoughts?

________________________________________________________________________________
Keith Wiley     *kwi...@keithwiley.com*     keithwiley.com
music.keithwiley.com

"Luminous beings are we, not this crude matter."
                                           --  Yoda
________________________________________________________________________________
