Setting HDFS directory time programmatically

2012-01-04 Thread Frank Astier
Hi -

Is it possible to set the access time of an HDFS directory programmatically?

I’m using 0.20.204.0.

I need to do this in unit tests, where my cleanup program removes files/dirs 
whose access time is too far in the past. I can call setTimes on the test 
files without any problem, but not on the directories... The directories 
created automatically when I create the test files have a date (per 
getAccessTime) of 1969/12/31 16:00, and I can’t control that date, which makes 
my unit testing impossible.
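
For reference, the file-level call that does work looks like this (a sketch; 
the path and the 48-hour window are made up, and fs is an already-initialized 
FileSystem):

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // setTimes(path, mtime, atime); pass -1 to leave a field unchanged.
    long past = System.currentTimeMillis() - 48L * 3600 * 1000;  // 48 hours ago
    fs.setTimes(new Path("/test/file-1"), -1, past);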

By the way, setTimes doesn’t allow setting the date on dirs, but getAccessTime 
is happy to return a date for them, which is inconsistent, IMHO.

Finally, on our production systems, I’m seeing appropriate dates for both files 
and directories.

Any insight appreciated,

Thanks!

Frank


Question about accessing another HDFS

2011-12-08 Thread Frank Astier
Hi -

We have two namenodes set up at our company, say:

hdfs://A.mycompany.com
hdfs://B.mycompany.com

From the command line, I can do:

hadoop fs -ls hdfs://A.mycompany.com//some-dir

And

hadoop fs -ls hdfs://B.mycompany.com//some-other-dir

I’m now trying to do the same from a Java program that uses the HDFS API. No 
luck there. I get an exception: “Wrong FS”.

Any idea what I’m missing in my Java program??

Thanks,

Frank


Re: Question about accessing another HDFS

2011-12-08 Thread Frank Astier
> Can you show your code here? What URL protocol are you using?

I guess I’m being very naïve (and relatively new to HDFS). I can’t show too 
much code, but basically, I’d like to do:

Path myPath = new Path("hdfs://A.mycompany.com//some-dir");

Where Path is a hadoop fs path. I think I can take it from there, if that 
worked... Did you mean that I need to address the namenode with an http:// 
address?
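
For what it’s worth, the usual fix for "Wrong FS" is to get the FileSystem 
from the path’s own URI instead of from fs.default.name (a sketch, not tested 
against this particular setup):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    Configuration conf = new Configuration();
    Path myPath = new Path("hdfs://A.mycompany.com/some-dir");
    // getFileSystem(conf) binds to the scheme and authority embedded in the
    // path itself, so the same code can reach A and B without reconfiguring
    // fs.default.name. Equivalently: FileSystem.get(myPath.toUri(), conf).
    FileSystem fsA = myPath.getFileSystem(conf);
    System.out.println(fsA.exists(myPath));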

Thanks!

Frank

On Thu, Dec 8, 2011 at 5:47 PM, Tom Melendez t...@supertom.com wrote:

 I'm hoping there is a better answer, but I'm thinking you could load
 another configuration file (with B.company in it) using Configuration,
 grab a FileSystem obj with that and then go forward.  Seems like some
 unnecessary overhead though.

 Thanks,

 Tom

 On Thu, Dec 8, 2011 at 2:42 PM, Frank Astier fast...@yahoo-inc.com
 wrote:
  Hi -
 
  We have two namenodes set up at our company, say:
 
  hdfs://A.mycompany.com
  hdfs://B.mycompany.com
 
  From the command line, I can do:
 
  hadoop fs -ls hdfs://A.mycompany.com//some-dir
 
  And
 
  hadoop fs -ls hdfs://B.mycompany.com//some-other-dir
 
  I’m now trying to do the same from a Java program that uses the HDFS
 API. No luck there. I get an exception: “Wrong FS”.
 
  Any idea what I’m missing in my Java program??
 
  Thanks,
 
  Frank






Hadoop start up error

2011-11-17 Thread Frank Astier
Hi -

I’m seeing the following exception while trying to start MiniDFSCluster in one 
particular environment. The stack trace is:

   Starting DataNode 0 with dfs.data.dir: build/test/data/dfs/data/data1,build/test/data/dfs/data/data2
   java.lang.NullPointerException
     at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:413)
     at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:274)
     at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:122)
     at com.yahoo.ads.ngdstone.tpbdm.BDMTestCase.oneTimeSetUp(BDMTestCase.java:69)

My code is:

System.setProperty("hadoop.log.dir", "/tmp");
System.setProperty("dfs.permissions.supergroup", "su");

conf = new Configuration();
dfsCluster = new MiniDFSCluster(conf, 1, true, null);  // <-- line 69 in BDMTestCase.java
fs = dfsCluster.getFileSystem();

...

Line 413 in MiniDFSCluster is:

   String  ipAddr = dn.getSelfAddr().getAddress().getHostAddress();

So I’m guessing it has something to do with the network setup? Just above that 
line in MiniDFSCluster, I see:

conf.set("dfs.datanode.address", "127.0.0.1:0");
conf.set("dfs.datanode.http.address", "127.0.0.1:0");
conf.set("dfs.datanode.ipc.address", "127.0.0.1:0");

Do I need to have those ports available/enabled in my environment? Is that the 
problem here?
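
One quick thing to check (just a guess, not a confirmed diagnosis): whether 
loopback resolves at all in that environment, since an unresolved address 
would leave getSelfAddr().getAddress() null:

    import java.net.InetAddress;

    // If either lookup fails, the datanode's self address stays unresolved
    // and MiniDFSCluster.java:413 would NPE exactly as above.
    System.out.println(InetAddress.getByName("localhost"));
    System.out.println(InetAddress.getByName("127.0.0.1"));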

Thanks!

Frank


Question about superuser and permissions

2011-11-03 Thread Frank Astier
Hi -

I’m writing unit tests that programmatically start a name node and populate HDFS 
directories, but I want to simulate the situation where I don’t have read 
access to some HDFS directory (which happens on the real grid I eventually 
deploy to).

I’ve tried to chown and chmod, but it seems to have no effect, and my unit 
tests happily read the directory I don’t want them to be able to read. Looking 
at the permissions documentation, it seems that because my unit test program 
started the name node, it is automatically superuser. I tried setting 
dfs.permissions.supergroup to some other user, but that didn’t work either.

Is there any way I could have that unit test think it’s not the user who 
started the name node?
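
One approach that might work (a sketch, assuming the UserGroupInformation 
test helper in the 0.20.20x security code; the user name, group, and path 
below are made up):

    import java.security.PrivilegedExceptionAction;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.security.UserGroupInformation;

    // Run the filesystem calls as a fake, non-super user so HDFS
    // permission checks actually apply to them.
    UserGroupInformation ugi =
        UserGroupInformation.createUserForTesting("alice", new String[] { "users" });
    ugi.doAs(new PrivilegedExceptionAction<Void>() {
      public Void run() throws Exception {
        FileSystem fs = FileSystem.get(conf);  // conf from the mini cluster; must be final
        fs.open(new Path("/restricted/file"));  // expected to throw AccessControlException
        return null;
      }
    });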

Thanks,

Frank


Debugging mapper

2011-09-15 Thread Frank Astier
Hi -

I’m using IntelliJ and the WordCount example in Hadoop (which uses 
MiniMRCluster). Is it possible to set an IntelliJ debugger breakpoint straight 
into the map function of the mapper? - I’ve tried, but so far, the debugger 
does not stop at the breakpoint.
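
One workaround that is sometimes suggested (a sketch, not verified in this 
thread): force the job onto the in-process LocalJobRunner, so that map() 
executes in the IDE's own JVM where breakpoints can bind:

    import org.apache.hadoop.conf.Configuration;

    Configuration conf = new Configuration();
    // MiniMRCluster forks child JVMs that the debugger is not attached to;
    // the local runner keeps the whole job in the current JVM instead.
    conf.set("mapred.job.tracker", "local");
    conf.set("fs.default.name", "file:///");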

Thanks!

Frank


Error with logging in (my) unit tests

2011-08-29 Thread Frank Astier
Hi -

I’m working with Maven inside IntelliJ, using Hadoop 0.20.203.0, and I get the 
following error message when trying to run my own unit tests, which use 
MiniDFSCluster. I copied the MiniDFSCluster usage from the Hadoop unit tests.

The message:

log4j:ERROR Could not instantiate class [org.apache.hadoop.log.metrics.EventCounter].
java.lang.ClassNotFoundException: org.apache.hadoop.log.metrics.EventCounter
  at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:169)
  at org.apache.log4j.helpers.Loader.loadClass(Loader.java:198)
  at org.apache.log4j.helpers.OptionConverter.instantiateByClassName(OptionConverter.java:326)
  at org.apache.log4j.helpers.OptionConverter.instantiateByKey(OptionConverter.java:123)
  at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:752)
  at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:735)
  at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:615)
  at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:502)
  at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:547)
  at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:483)
  at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
  at org.apache.log4j.Logger.getLogger(Logger.java:104)
  at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:289)
  at org.apache.commons.logging.impl.Log4JLogger.<init>(Log4JLogger.java:109)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
  at org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1116)
  at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:914)
  at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:604)
  at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:336)
  at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:310)
  at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685)
  at org.apache.hadoop.conf.Configuration.<clinit>(Configuration.java:139)

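The error itself is just log4j failing to instantiate an appender class named 
in a bundled log4j.properties. One workaround (a sketch; the file location is 
the usual Maven test-resources convention, not something from this thread) is 
to shadow it with a minimal configuration that never mentions EventCounter:

    # src/test/resources/log4j.properties
    log4j.rootLogger=INFO, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{ISO8601} %-5p %c{1} - %m%n
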
Thanks!

Frank


Turn off all Hadoop logs?

2011-08-29 Thread Frank Astier
Is it possible to turn off all the Hadoop logs simultaneously? In my unit 
tests, I don’t want to see the myriad “INFO” logs spewed out by various Hadoop 
components. I’m using:

((Log4JLogger) DataNode.LOG).getLogger().setLevel(Level.OFF);
((Log4JLogger) LeaseManager.LOG).getLogger().setLevel(Level.OFF);
((Log4JLogger) FSNamesystem.LOG).getLogger().setLevel(Level.OFF);
((Log4JLogger) DFSClient.LOG).getLogger().setLevel(Level.OFF);
((Log4JLogger) Storage.LOG).getLogger().setLevel(Level.OFF);

But I’m still missing some loggers...
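
Rather than hunting down every logger, silencing the log4j root logger should 
catch them all at once (a sketch):

    import org.apache.log4j.Level;
    import org.apache.log4j.LogManager;

    // Turns off every log4j logger in one call, Hadoop's included.
    LogManager.getRootLogger().setLevel(Level.OFF);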

Frank


Hadoop in process?

2011-08-26 Thread Frank Astier
Hi -

Is there a way I can start HDFS (the namenode) from a Java main and run unit 
tests against that? I need to integrate my Java/HDFS program into unit tests, 
and the unit test machine might not have Hadoop installed. I’m currently 
running the unit tests by hand with hadoop jar ... My unit tests create a bunch 
of (small) files in HDFS and manipulate them. I use the fs API for that. I 
don’t have map/reduce jobs (yet!).
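
MiniDFSCluster (used elsewhere in these threads) does exactly this; a minimal 
sketch, assuming the Hadoop test jar is on the classpath:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.MiniDFSCluster;

    Configuration conf = new Configuration();
    // One in-JVM namenode plus one datanode; no installed Hadoop needed.
    MiniDFSCluster cluster = new MiniDFSCluster(conf, 1, true, null);
    FileSystem fs = cluster.getFileSystem();
    fs.create(new Path("/tmp/hello")).close();
    cluster.shutdown();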

Thanks!

Frank