RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS access?

2014-04-14 Thread Roger Whitcomb
Thank you Dave, I got it. I needed a few other .jars as well (commons-cli and
protobuf-java). But most importantly, the port was wrong: 50070 is for HTTP access
to the NameNode, while 8020 is the correct port for direct HDFS access.
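
For the archives, the resolve call that ended up working looks roughly like this
("hadoop-master" and hdfsPath are placeholders for our actual host and path, and
manager is the DefaultFileSystemManager shown further down the thread):

    // 8020 is the NameNode IPC port; 50070 is only its HTTP/web UI port.
    String url = String.format("hdfs://%1$s:%2$d/%3$s", "hadoop-master", 8020, hdfsPath);
    FileObject dir = manager.resolveFile(url);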


Thanks again,

~Roger


From: dlmarion dlmar...@hotmail.com
Sent: Friday, April 11, 2014 6:02 PM
To: user@hadoop.apache.org
Subject: RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS 
access?

If memory serves me, it's in the hadoop-hdfs.jar file.


Sent via the Samsung GALAXY S®4, an AT&T 4G LTE smartphone


 Original message 
From: Roger Whitcomb
Date:04/11/2014 8:37 PM (GMT-05:00)
To: user@hadoop.apache.org
Subject: RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS 
access?


Hi Dave,

Thanks for the responses. I guess I have a small question then: what
exact class(es) would it be looking for that it can't find? I have all the
.jar files I mentioned below on the classpath, and it is loading and executing
code in the org.apache.hadoop.fs.FileSystem class (according to the stack
trace below), so there must be implementing classes somewhere; which .jar
file would they be in?


Thanks,

~Roger



From: david marion dlmar...@hotmail.com
Sent: Friday, April 11, 2014 4:55 PM
To: user@hadoop.apache.org
Subject: RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS 
access?

Also, make sure that the jars on the classpath actually contain the HDFS file 
system. I'm looking at:

No FileSystem for scheme: hdfs

which is an indicator for this condition.
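
A quick way to confirm that from the client side is something like the untested
sketch below (the class name is mine; it only needs hadoop-common plus whichever
jar actually provides the hdfs scheme on the classpath):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class HdfsSchemeCheck {
        public static void main(String[] args) throws IOException {
            // Throws "No FileSystem for scheme: hdfs" if nothing on the
            // classpath provides an implementation for the hdfs scheme
            // (in Hadoop 2.x that implementation is DistributedFileSystem,
            // which ships in hadoop-hdfs.jar).
            Class<? extends FileSystem> impl =
                FileSystem.getFileSystemClass("hdfs", new Configuration());
            System.out.println("hdfs is provided by " + impl.getName());
        }
    }

If it prints org.apache.hadoop.hdfs.DistributedFileSystem, the classpath side should be fine.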

Dave


From: dlmar...@hotmail.com
To: user@hadoop.apache.org
Subject: RE: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS 
access?
Date: Fri, 11 Apr 2014 23:48:48 +

Hi Roger,

  I wrote the HDFS provider for Commons VFS. I went back and looked at the 
source and tests, and I don't see anything wrong with what you are doing. I did 
develop it against Hadoop 1.1.2 at the time, so there might be an issue that is 
not accounted for with Hadoop 2. It was also not tested with security turned 
on. Are you using security?

Dave

 From: roger.whitc...@actian.com
 To: user@hadoop.apache.org
 Subject: Which Hadoop 2.x .jars are necessary for Apache Commons VFS HDFS 
 access?
 Date: Fri, 11 Apr 2014 20:20:06 +

 Hi,
 I'm fairly new to Hadoop, but not to Apache, and I'm having a newbie kind of
 issue browsing HDFS files. I have written an Apache Commons VFS (Virtual File
 System) browser for the Apache Pivot GUI framework (full disclosure: I'm the
 PMC Chair for Pivot), and now I'm trying to get this browser to do HDFS
 browsing from our application. I'm running into a problem that seems pretty
 basic, so I thought I'd ask here...

 So, I downloaded Hadoop 2.3.0 from one of the mirrors, and was able to track 
 down sort of the minimum set of .jars necessary to at least (try to) connect 
 using Commons VFS 2.1:
 commons-collections-3.2.1.jar
 commons-configuration-1.6.jar
 commons-lang-2.6.jar
 commons-vfs2-2.1-SNAPSHOT.jar
 guava-11.0.2.jar
 hadoop-auth-2.3.0.jar
 hadoop-common-2.3.0.jar
 log4j-1.2.17.jar
 slf4j-api-1.7.5.jar
 slf4j-log4j12-1.7.5.jar

 What's happening now is that I instantiated the HdfsProvider this way:
 import org.apache.commons.vfs2.CacheStrategy;
 import org.apache.commons.vfs2.FileSystemException;
 import org.apache.commons.vfs2.cache.DefaultFilesCache;
 import org.apache.commons.vfs2.cache.SoftRefFilesCache;
 import org.apache.commons.vfs2.impl.DefaultFileReplicator;
 import org.apache.commons.vfs2.impl.DefaultFileSystemManager;
 import org.apache.commons.vfs2.impl.FileContentInfoFilenameFactory;
 import org.apache.commons.vfs2.provider.hdfs.HdfsFileProvider;

 private static DefaultFileSystemManager manager = null;

 static
 {
     manager = new DefaultFileSystemManager();
     try {
         manager.setFilesCache(new DefaultFilesCache());
         // Register the HDFS provider for the "hdfs" URI scheme.
         manager.addProvider("hdfs", new HdfsFileProvider());
         manager.setFileContentInfoFactory(new FileContentInfoFilenameFactory());
         manager.setFilesCache(new SoftRefFilesCache());
         manager.setReplicator(new DefaultFileReplicator());
         manager.setCacheStrategy(CacheStrategy.ON_RESOLVE);
         manager.init();
     }
     catch (final FileSystemException e) {
         // Intl.getString(...) is our own i18n helper.
         throw new RuntimeException(Intl.getString("object#manager.setupError"), e);
     }
 }

 Then, I try to browse into an HDFS system this way:
 String url = String.format("hdfs://%1$s:%2$d/%3$s", "hadoop-master", 50070, hdfsPath);
 return manager.resolveFile(url);

 Note: the client is running on Windows 7 (but could be any system that runs 
 Java), and the target has been one of several Hadoop clusters on Ubuntu VMs 
 (basically the same thing happens no matter which Hadoop installation I try 
 to hit). So I'm guessing the problem is in my client configuration.

 This attempt to simply connect to HDFS results in a bunch of error messages
 in the log file, which look like it is trying to do user validation on the
 local machine instead of against the remote Hadoop cluster.
 Apr 11,2014 18:27:38.640 GMT T[AWT-EventQueue-0](26) DEBUG FileObjectManager: 
 Trying to resolve file reference 'hdfs://hadoop-master:50070/'
 Apr 11,2014 18:27:38.953 GMT T[AWT-EventQueue-0](26) INFO 
 org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is 
 deprecated. Instead, use fs.defaultFS
 Apr 11,2014 18:27:39.078 GMT T[AWT

 at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:90)
 at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2350)
 at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2332)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:369)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:168)
 at 
 org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem.resolveFile(HdfsFileSystem.java:115)
 at 
 org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:84)
 at 
 org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:64)
 at 
 org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:700)
 at 
 org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:656)
 at 
 org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:609)

 So, my guess is that I don't have enough configuration set up on my client
 machine to tell Hadoop that the authentication is to be done at the remote
 end?? So, I'm trying to track down what that configuration might be.
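
 (For reference, I believe a plain Hadoop 2.x client, outside of VFS, can be
 pointed at a remote cluster programmatically rather than through a hard-coded
 core-site.xml, roughly as in the untested sketch below; the class name, host
 name, and port are just placeholders.)

 import java.io.IOException;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FileSystem;

 public class RemoteHdfsSketch {
     public static void main(String[] args) throws IOException {
         // Untested sketch: name the remote NameNode in code instead of via
         // a core-site.xml on the client; host and port are placeholders.
         Configuration conf = new Configuration();
         conf.set("fs.defaultFS", "hdfs://hadoop-master:8020");
         FileSystem fs = FileSystem.get(conf);
         System.out.println("Connected to " + fs.getUri());
     }
 }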

 I'm hoping someone here can see past the Commons VFS stuff (which you probably
 aren't familiar with) and tell me what other Hadoop/HDFS files / configuration
 I need to get this working.

 Note: I want to build a GUI component that can browse to arbitrary HDFS 
 installations, so I can't really be setting up a hard-coded XML file for each 
 potential Hadoop cluster I might connect to 

 Thanks,
 ~Roger Whitcomb