Hi all, I have a problem that sounds ridiculous, but I've been struggling with it for a while now. I hope you can help.
I have a small Java program that performs some very basic operations within HDFS: it creates a text file, creates a new blank file, or creates a directory. I also have the latest stable version of Hadoop installed in a single-machine configuration on Linux. As far as I can tell, Hadoop works fine; I can run the sample M/R jobs, and my fs.default.name is hdfs://localhost:9000/.

The problem is that sometimes my small Java program works perfectly -- it does what's asked of it on the HDFS -- but sometimes it instead creates the requested file or directory on my local, native filesystem, without ever touching the HDFS. For example, here's the relevant part of my test program:

    h_conf = new Configuration();
    h_fs = FileSystem.get(h_conf);
    FileSystem.mkdirs(h_fs, new Path("/tmp/junit-test"),
            new FsPermission(FsAction.ALL, FsAction.ALL, FsAction.NONE));

Sometimes after I run it, I can do 'hadoop fs -ls /tmp' and see the junit-test directory. At other times, that command doesn't show the directory, but 'ls /tmp' on the local filesystem does!

The worst part is that I haven't been able to establish what circumstances trigger which behavior; so far it appears truly random. It's not random in the sense that every run gives a different result, but rather that during one programming session the program works as intended no matter how many times I rerun the code or restart Eclipse, and during another session it doesn't work no matter what I do. As one example, yesterday it was working fine until I replaced hadoop-0.19.0-core.jar in my IDE with a hadoop-0.19.1-dev-core.jar that I had built myself from the 0.19.0 code. The program then switched to the native-filesystem behavior. I reverted to hadoop-0.19.0-core.jar, but the native-filesystem behavior persisted, and I cannot get the program to write to the HDFS anymore.

There is also a seemingly unrelated problem that I suspect is actually related. I'm developing a Hadoop backend for Jena, so I often run Hadoop-enabled Jena on the same computer and from the same IDE. Occasionally (there's that word again), an exception is raised claiming that the JVM is out of heap space, so Hadoop cannot execute. Tweaking the JVM's command-line arguments to change the amount of memory allocated has no effect. On other days everything works fine without any errors. When it fails, I can see that Java is not /truly/ out of memory -- there's as much available memory (if not more) as when Java runs fine. In the past I was able to fix the out-of-memory issue by recompiling everything inside Eclipse; today that didn't help.

Today I experienced both problems simultaneously (Hadoop writing to the native filesystem in my small Java program, and out-of-memory exceptions in Hadoop-enabled Jena). On previous occasions I'm not sure whether these issues occurred at the same time or not.

What could be going on? Any ideas would be very much appreciated. Thank you.

-- Philip
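P.S. For reference, here is the whole test boiled down to a self-contained class (Hadoop 0.19 API). The class name is made up for this sketch, and the commented-out FileSystem.get(URI, ...) variant and the getUri() printout are not in my real program; they're just the kind of diagnostic I've been considering to see which filesystem the Configuration actually resolves to:

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.permission.FsAction;
    import org.apache.hadoop.fs.permission.FsPermission;

    public class HdfsMkdirTest {
        public static void main(String[] args) throws Exception {
            Configuration h_conf = new Configuration();

            // What my program actually does: let the Configuration decide
            // which filesystem fs.default.name points at.
            FileSystem h_fs = FileSystem.get(h_conf);

            // Alternative I've considered, to rule out classpath/config issues:
            // pin the URI explicitly instead of relying on the config files
            // being visible to the IDE.
            // FileSystem h_fs = FileSystem.get(
            //         URI.create("hdfs://localhost:9000/"), h_conf);

            // Diagnostic (not in my real program): shows which filesystem
            // was actually selected, e.g. hdfs://localhost:9000 vs file:///.
            System.out.println("Using filesystem: " + h_fs.getUri());

            FileSystem.mkdirs(h_fs, new Path("/tmp/junit-test"),
                    new FsPermission(FsAction.ALL, FsAction.ALL, FsAction.NONE));
        }
    }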