Re: [Samba] Custom VFS

Andrew Scherpbier Wed, 25 Jul 2012 09:44:34 -0700


On 07/24/2012 04:22 PM, Jeremy Allison wrote:

On Tue, Jul 24, 2012 at 02:35:28PM -0700, Andrew Scherpbier wrote:

Hi Daniel,


Just a note of encouragement...
I have so far written 2 filesystems in Java that use Samba for 2
different companies, so you're not alone!  :-)

The strategy I've used is to write a simple TCP protocol client (the
VFS module) and server (a straightforward threaded Java server).
Works like a charm.  As long as the client side is abstracted enough
so that its Samba connection state is independent from the server
connection state, there are no issues with restarting either.  (I
started out using a stateful protocol, but ended up changing to a
completely stateless one, where the individual messages contain
enough information to establish context.  This way, if either end of
the system goes down, recovery is the simple act of building a new
TCP connection.)
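
A minimal sketch of what "messages contain enough information to
establish context" could look like on the wire.  The message layout,
the opcode, and every name here (struct request, request_encode,
request_decode) are assumptions made up for illustration and are not
taken from the module described above.  The point is only that each
request names its share and path, so the server needs no
per-connection session state.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htons, ntohs */

enum { MAX_NAME = 255 };

/* Hypothetical self-describing request: nothing depends on earlier messages. */
struct request {
    uint16_t op;                 /* hypothetical opcode, e.g. 1 = "file closed" */
    char share[MAX_NAME + 1];    /* NUL-terminated share name */
    char path[MAX_NAME + 1];     /* NUL-terminated path inside the share */
};

/* Append a 16-bit length plus the string bytes; returns the new offset,
 * or 0 on overflow.  An incoming offset of 0 propagates an earlier failure. */
static size_t put_str(uint8_t *buf, size_t cap, size_t off, const char *s)
{
    size_t len = strlen(s);
    uint16_t n;

    if (off == 0 || len > MAX_NAME || off + 2 + len > cap)
        return 0;
    n = htons((uint16_t)len);
    memcpy(buf + off, &n, 2);
    memcpy(buf + off + 2, s, len);
    return off + 2 + len;
}

/* Read a length-prefixed string; returns the new offset, or 0 if malformed. */
static size_t get_str(const uint8_t *buf, size_t len, size_t off, char *out)
{
    uint16_t n;

    if (off == 0 || off + 2 > len)
        return 0;
    memcpy(&n, buf + off, 2);
    n = ntohs(n);
    if (n > MAX_NAME || off + 2 + n > len)
        return 0;
    memcpy(out, buf + off + 2, n);
    out[n] = '\0';
    return off + 2 + n;
}

/* Serialize req into buf; returns bytes written, or 0 if it does not fit. */
size_t request_encode(const struct request *req, uint8_t *buf, size_t cap)
{
    uint16_t op;
    size_t off;

    assert(req != NULL && buf != NULL);
    if (cap < 2)
        return 0;
    op = htons(req->op);
    memcpy(buf, &op, 2);
    off = put_str(buf, cap, 2, req->share);
    return put_str(buf, cap, off, req->path);
}

/* Parse exactly len bytes into req; returns 1 on success, 0 if malformed. */
int request_decode(const uint8_t *buf, size_t len, struct request *req)
{
    uint16_t op;
    size_t off;

    assert(req != NULL && buf != NULL);
    if (len < 2)
        return 0;
    memcpy(&op, buf, 2);
    req->op = ntohs(op);
    off = get_str(buf, len, 2, req->share);
    off = get_str(buf, len, off, req->path);
    return off == len;
}
```

Because every message can be decoded on its own, a client that loses
its connection can reconnect and resend the same bytes without any
handshake to rebuild state.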

I also attempted to use the Apache ActiveMQ C++ library for
communication, but found it buggy and leaky.

I originally looked into hosting the JVM in the VFS module, but that
was going to be a problem because each smbd process would have to
start its own JVM.  The JVM startup time (especially the server JVM)
is very high and the memory overhead would not make it scalable.

TCP through the loopback interface is very fast (at least on the
Linux systems I've developed for), so there was no need to
implement some sort of shared memory interface.

The system I'm working on now manages PB-class storage (currently up
to 10PB) with hundreds of concurrent clients and the VFS module does
this without issues or much overhead.  We're regularly seeing write
speeds in the 400-500MB/s range using 10GbE and multiple Windows
clients.

Good luck!

P.S.:  Blatant plug for my current project:
http://www.cuttedge.com/psca/index.html

Wow - that's really cool stuff !

I'm glad the VFS works so well for you. I wanted to give you
a heads-up on the changes we're making to the VFS moving
forward with 4.0.x and above - take a look at the changes
Volker made for the pread() -> pread_send_fn()/pread_recv_fn()
and pwrite() -> pwrite_send_fn()/pwrite_recv_fn() in order to
make the VFS async (and allow pthreaded implementations to
be hidden under the covers).

Sample implementations are in source3/modules/vfs_default.c
in:

vfswrap_pread_send()/vfswrap_asys_ssize_t_recv()
vfswrap_pwrite_send()/vfswrap_asys_ssize_t_recv()

It makes the VFS a little more complicated, but should
enable you to get more performance out of it.
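
To illustrate the general shape of a send/recv split with a pthread
hidden behind it, here is a rough standalone sketch.  It is not
Samba's API: the real hooks are built around tevent requests and have
different signatures, and the names sketch_pread_send,
sketch_pread_recv and struct pread_req are invented for this example.
To stay short, the recv side simply joins the worker thread; an
event-driven caller would instead be notified of completion before
calling recv.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-request state, owned by the send/recv pair. */
struct pread_req {
    pthread_t worker;
    int fd;
    void *buf;
    size_t count;
    off_t offset;
    ssize_t result;      /* return value of pread() */
    int saved_errno;     /* errno captured inside the worker thread */
};

/* Runs in the worker thread: do the blocking call and record the outcome. */
static void *pread_worker(void *arg)
{
    struct pread_req *req = arg;

    req->result = pread(req->fd, req->buf, req->count, req->offset);
    req->saved_errno = (req->result < 0) ? errno : 0;
    return NULL;
}

/* Start the read and return immediately; NULL if it could not be started. */
struct pread_req *sketch_pread_send(int fd, void *buf, size_t count,
                                    off_t offset)
{
    struct pread_req *req = calloc(1, sizeof *req);

    if (req == NULL)
        return NULL;
    req->fd = fd;
    req->buf = buf;
    req->count = count;
    req->offset = offset;
    if (pthread_create(&req->worker, NULL, pread_worker, req) != 0) {
        free(req);
        return NULL;
    }
    return req;
}

/* Wait for the worker, hand back result and errno, and free the request. */
ssize_t sketch_pread_recv(struct pread_req *req, int *perrno)
{
    ssize_t result;

    assert(req != NULL && perrno != NULL);
    pthread_join(req->worker, NULL);
    result = req->result;
    *perrno = req->saved_errno;
    free(req);
    return result;
}
```

Note that the error code travels in the request state rather than in
errno, because the failure happened on a different thread from the
one that calls recv.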

Interesting stuff.  Right now I'm letting default_vfs do all the
low-level I/O, so any improvements in speed you guys make should
immediately be useful!

So does this mean that the VFS module will need to be changed to be
thread-safe?  That actually will be a significant issue.  I'm not too
familiar with pthreads and don't know too much about the low level
implications WRT errno, etc.  (I'm mostly a Java weenie nowadays,
sorry!  Last time I used threads in C++ was a couple years ago using
Boost under Windows)
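
On the errno part of that question, one thing POSIX does settle is
that errno is per-thread: a failing call in one thread does not
overwrite the value another thread sees.  The small sketch below
demonstrates that with a deliberately failing close(-1) in a worker
thread.  The function names are invented for the example, and it says
nothing about which Samba internals are or are not thread-safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

/* Worker: make a call that fails, then report the errno this thread sees. */
static void *failing_worker(void *arg)
{
    int *seen = arg;

    close(-1);          /* fails with EBADF in this thread only */
    *seen = errno;
    return NULL;
}

/* Returns the errno observed by the worker (or -1 if no thread could be
 * started) and stores the calling thread's errno after the join in
 * *caller_errno. */
int errno_is_per_thread(int *caller_errno)
{
    pthread_t t;
    int worker_errno = 0;

    assert(caller_errno != NULL);
    errno = 0;
    if (pthread_create(&t, NULL, failing_worker, &worker_errno) != 0)
        return -1;
    pthread_join(t, NULL);
    *caller_errno = errno;   /* the worker's EBADF does not show up here */
    return worker_errno;
}
```

The practical consequence is that a worker thread has to copy errno
into its own result structure right after the failing call if another
thread is going to report the error.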


We're also thinking longer term about changing the
model of keeping the current working directory as
the root of the exported service and changing the
internals of Samba to chdir() to the parent directory
of any path currently being processed - this allows
easier security checks inside smbd and reduces the
opportunity for pathname check race conditions.
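
A rough sketch of the "chdir() to the parent, then operate on the
last component" idea, using plain POSIX calls.  This is only an
illustration of the model described above, not Samba code;
split_parent and lstat_in_parent are names made up for the example,
and the buffer sizes are arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Split "a/b/c" into parent "a/b" and name "c".  A bare "c" gets parent
 * ".", and "/c" gets parent "/".  Returns 0 on success, -1 if the path is
 * empty, ends in '/', or does not fit the supplied buffers. */
int split_parent(const char *path, char *parent, size_t pcap,
                 char *name, size_t ncap)
{
    const char *slash = strrchr(path, '/');
    const char *base = (slash != NULL) ? slash + 1 : path;
    const char *dir = ".";
    size_t dlen = 1;

    if (slash == path) {            /* "/name": the parent is the root */
        dir = "/";
    } else if (slash != NULL) {     /* "dir/name" */
        dir = path;
        dlen = (size_t)(slash - path);
    }
    if (*base == '\0' || strlen(base) >= ncap || dlen >= pcap)
        return -1;
    memcpy(parent, dir, dlen);
    parent[dlen] = '\0';
    strcpy(name, base);
    return 0;
}

/* Change into the parent directory and lstat() only the final component.
 * Side effect: the process working directory is left at the parent. */
int lstat_in_parent(const char *path, struct stat *st)
{
    char parent[4096];
    char name[256];

    assert(path != NULL && st != NULL);
    if (split_parent(path, parent, sizeof parent, name, sizeof name) != 0) {
        errno = EINVAL;
        return -1;
    }
    if (chdir(parent) != 0)
        return -1;
    return lstat(name, st);
}
```

A module that resolves paths relative to the share root would have to
account for the working directory moving like this, which is where
the realpath handling comes in.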

For what I'm doing now, I don't think that matters much, other than
the realpath calls, I believe.  Since I'm only dealing with files
*after* they have been closed, the only thing I'm worried about is
getting the right path to the files.

Feedback very welcome - especially from someone
who has implemented a couple of production Samba
VFS modules already :-).

My main gripe with the VFS stuff is the lack of documentation.  What
I'd like to see is at least a call flow to make it easier for module
writers to figure out what calls to hook.  For example, does
create_file call open or do both need to be implemented/hooked?  I
unfortunately happen to have lots of experience with Windows kernel
calls because I also wrote a filter-driver based FS for Windows in a
previous life, so I know how complicated the create_file call is
(Thanks, Microsoft!).  The fact that you don't need to hook it is
awesome, but that's not explained anywhere I could find.

Or at least detailed docs on the individual hooks, what they are
supposed to do, why they are called, what their side effects are
supposed to be, etc.  (Doxygen docs in the code would be awesome!)

I spend way too much time running "grep -rn something" on the Samba
source and following ctags right now :-(

Don't get me wrong!  I love working on this stuff, but the VFS module
is a small (but important) part of the bigger system and I end up
spending a disproportionate amount of time on the module because of
the lack of documentation.

Thanks !

Jeremy.


--
Andrew Scherpbier
and...@scherpbier.org

--
To unsubscribe from this list go to the following URL and read the
instructions:  https://lists.samba.org/mailman/options/samba
