Re: hadoop file system browser

2008-01-24 Thread Enis Soztutar
Yes, you could avoid the bottleneck by starting a WebDAV server on each 
client, but that would add the burden of managing all those servers, and it 
may not be the intended use case for WebDAV. We can discuss the architecture 
further in the relevant issue.


Alban Chevignard wrote:

Thanks for the clarification. I agree that running a single WebDAV
server for all clients would make it a bottleneck. But I can't see
anything in the current WebDAV server implementation that precludes
running an instance of it on each client. It seems to me that would
solve any bottleneck issue.

-Alban

On Jan 23, 2008 2:53 AM, Enis Soztutar [EMAIL PROTECTED] wrote:
  

As you know, the DFS client connects to the individual datanodes to
read/write data and has only minimal interaction with the Namenode, which
lets the I/O rate scale linearly (theoretically 1:1). The current
implementation of the WebDAV interface, however, is just a server running on
a single machine that translates WebDAV requests into Namenode calls. All of
the traffic therefore passes through this one WebDAV server, which makes it
a bottleneck. I was planning to integrate the WebDAV server with the
namenode/datanode and forward requests to the other datanodes, so that we
can do I/O in parallel, but my focus on WebDAV has faded for now.
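
For comparison, a direct read through the regular DFS FileSystem API looks
roughly like the sketch below: only the block-location lookup touches the
Namenode, and the data bytes stream from the datanodes. The namenode address
and path are placeholders, not taken from this thread.

    // Minimal sketch of a direct DFS read; address and path are placeholders.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class DfsReadSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Normally picked up from hadoop-site.xml; hard-coded here only to
        // keep the sketch self-contained.
        conf.set("fs.default.name", "hdfs://namenode:9000");

        // The client asks the Namenode only for metadata (block locations)...
        FileSystem fs = FileSystem.get(conf);
        FSDataInputStream in = fs.open(new Path("/user/example/data.txt"));
        try {
          // ...and streams the actual bytes directly from the datanodes.
          IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
          in.close();
          fs.close();
        }
      }
    }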




Alban Chevignard wrote:


What are the scalability issues associated with the current WebDAV interface?

Thanks,
-Alban

On Jan 22, 2008 7:27 AM, Enis Soztutar [EMAIL PROTECTED] wrote:

  

The WebDAV interface for Hadoop works as it is, but it needs a major
redesign to be scalable; it is still useful, though. It has even been used
from Windows Explorer by defining the WebDAV server as a remote service.


Ted Dunning wrote:



There has been significant work on building a WebDAV interface for HDFS. I
haven't heard any news for some time, however.


On 1/21/08 11:32 AM, Dawid Weiss [EMAIL PROTECTED] wrote:



  

The Eclipse plug-in also features a DFS browser.


  

Yep. That's all true; I don't mean to self-promote, because there really
isn't that much to advertise ;) I was just quite attached to a file
manager-like user interface; the mucommander clone I posted served me as a
browser, but also for rudimentary file operations (copying to/from, deleting
folders, etc.). In my experience it has been quite handy.

It would probably be a good idea to implement a commons-vfs plugin for
Hadoop, so that the HDFS filesystem is transparent to use for other apps.
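
As a rough sketch of what that could look like from an application's point
of view, assuming a hypothetical Hadoop provider were written and registered
with commons-vfs under an "hdfs" scheme (no such plugin exists yet; the
host, port and path below are made up):

    import org.apache.commons.vfs.FileObject;
    import org.apache.commons.vfs.FileSystemManager;
    import org.apache.commons.vfs.VFS;

    public class VfsHdfsSketch {
      public static void main(String[] args) throws Exception {
        // Assumes a (hypothetical) Hadoop provider is registered for "hdfs".
        FileSystemManager manager = VFS.getManager();
        FileObject dir = manager.resolveFile("hdfs://namenode:9000/user/example");

        // The same calls work unchanged for local files, FTP, SFTP, zip, etc.
        for (FileObject child : dir.getChildren()) {
          System.out.println(child.getName().getBaseName());
        }
      }
    }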

Dawid



Re: hadoop file system browser

2008-01-24 Thread Vetle Roeim

Great! Where can I get it? :)

On Thu, 24 Jan 2008 19:48:57 +0100, Pete Wyckoff [EMAIL PROTECTED]  
wrote:




Right now it's tested with 0.14.4. It also includes rmdir, rm, mkdir, mv.
I've implemented write, but it has to wait for appends to work in Hadoop
because of the FUSE protocol.

Our strategy thus far has been to use FUSE on a single box and then
NFS-export it to other machines. We don't do heavy, heavy operations on it,
so it isn't a performance problem. The things I think are most useful anyway
are ls, find, du, mkdir, rmdir, rm and mv, none of which tax FUSE much.

-- pete


On 1/24/08 10:39 AM, Vetle Roeim [EMAIL PROTECTED] wrote:


On Tue, 22 Jan 2008 22:03:03 +0100, Jeff Hammerbacher
[EMAIL PROTECTED] wrote:


 we use FUSE: who wants a GUI when you could have a shell?
 http://issues.apache.org/jira/browse/HADOOP-4


Does this work with newer versions of Hadoop?


[...]


Re: hadoop file system browser

2008-01-24 Thread Vetle Roeim
Yes, please post it again. :) Lack of trash and directory protection  
shouldn't be an issue for my needs.


On Thu, 24 Jan 2008 20:11:26 +0100, Pete Wyckoff [EMAIL PROTECTED]  
wrote:




I can post it again, but it doesn't include ioctl commands, so the trash
feature cannot be configured. I can still create a flag and default it to
false. The directory protection isn't configurable either, so I can set a
flag to false for that as well. The main directory we protect here is
/user/facebook, for data (and job :) ) protection purposes.


On 1/24/08 10:55 AM, Vetle Roeim [EMAIL PROTECTED] wrote:



Re: hadoop file system browser

2008-01-24 Thread Vetle Roeim

Thanks!

On Thu, 24 Jan 2008 20:29:20 +0100, Pete Wyckoff [EMAIL PROTECTED]  
wrote:




I attached the newest version to:
https://issues.apache.org/jira/browse/HADOOP-4

Still a work in progress and any help appreciated. Not much by way of
instructions but here are some:

1. download and install FUSE and do a modprobe fuse
2. modify fuse_dfs.c's Makefile to have the right paths for FUSE, hdfs.h and JNI
3. ensure you have Hadoop in your class path and the JNI stuff in your
library path
4. mkdir /tmp/hdfs
5. ./fuse_dfs dfs://hadoop_namenode:9000 /tmp/hdfs -d

You will probably be missing things in your class path and LD_LIBRARY_PATH
when you do step 5, so just add them and iterate. To run this as production
quality, you basically need fuse_dfs in root's path and a line added to
/etc/fstab. For people interested, I can give you my config line.
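
Once the mount from step 5 is up, ordinary programs can treat /tmp/hdfs like
a local directory. A minimal sketch in plain Java, with no Hadoop classes on
the class path (it only assumes the /tmp/hdfs mount point used above):

    // Lists a directory on the FUSE mount using plain java.io.
    import java.io.File;

    public class ListFuseMount {
      public static void main(String[] args) {
        File dir = new File("/tmp/hdfs");
        File[] entries = dir.listFiles();
        if (entries == null) {
          System.err.println("Mount point not readable: " + dir);
          return;
        }
        for (File f : entries) {
          // Each metadata call goes through FUSE to HDFS underneath.
          System.out.println((f.isDirectory() ? "d " : "- ") + f.getName());
        }
      }
    }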

-- pete


On 1/24/08 10:55 AM, Vetle Roeim [EMAIL PROTECTED] wrote:


