[Gluster-users] encfs over glusterfs

2009-01-30 Thread Andrew McGill
I am considering running encfs over glusterfs so that I can include insecure 
boxes in the storage backend for our backups running rdiff-backup.  Actually, 
I would prefer the crypto to be done in glusterfs, but I understand that's a 
little way off.

Does anyone use this arrangement -- and are there any gotchas?  
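For the curious, the layering I have in mind is just two stacked FUSE
mounts: glusterfs provides the (untrusted) backing store, and encfs does
the crypto on top of it.  A rough sketch, with made-up paths and volfile
names:

~$ glusterfs -f /etc/glusterfs/client.vol /mnt/gluster   # ciphertext lives here
~$ encfs /mnt/gluster/.crypt ~/crypt                     # plaintext view, local only
~$ rdiff-backup /home/andrewm ~/crypt/backups            # backups land encrypted on the bricks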


In case you don't know, here's how the encrypted part of an encfs filesystem looks:

~/.crypt$ ls -la

total 216
drwx------+  8 andrewm andrewm  4096 2009-01-13 20:47 .
drwxr-x---+ 84 andrewm andrewm 12288 2009-01-30 16:08 ..
drwx------+  7 andrewm andrewm  4096 2009-01-30 16:22 0IBZ4H3k84Id,b5SLxcUth7P
drwxr-x---+ 43 andrewm andrewm  4096 2009-01-22 13:14 8K8DI0H50LY8sA-k0M4m8au1
-rw-r-----+  1 andrewm andrewm 71923 2009-01-30 13:00 CuTVvivzRtVXW--9aIZNTUGP
drwxr-x---+ 28 andrewm andrewm  4096 2009-01-30 16:08 EgCyclxjGnsa0EYf,rpxKpAg
-rw-r-----+  1 andrewm andrewm   239 2008-06-24 22:34 .encfs5
drwx------+  4 andrewm andrewm  4096 2008-12-14 17:49 ldubuvg2gMnrI8hDUiL1QC,m
drwxr-x---+  9 andrewm andrewm 53248 2009-01-08 11:26 lM,Bh2CjGF1qp-ubU98Boqyr
drwx------+  9 andrewm andrewm  4096 2009-01-01 12:23 nuUVB4uIqfyNF8BQwEOmB4gt
lrwxrwxrwx+  1 andrewm andrewm    49 2008-09-08 08:57 Zclx5CVF6z20FqkURS7zMhyT -> EgCyclxjGnsa0EYf,rpxKpAg/QkAtJiPaFEsGPxqwo9WAFfji


And the plaintext part of the same encfs filesystem:

~/crypt$ ls -la

total 208
drwx------+  8 andrewm andrewm  4096 2009-01-13 20:47 .
drwxr-x---+ 84 andrewm andrewm 12288 2009-01-30 16:08 ..
drwxr-x---+ 43 andrewm andrewm  4096 2009-01-22 13:14 work
drwx------+  7 andrewm andrewm  4096 2009-01-30 16:22 stuff
drwx------+  9 andrewm andrewm  4096 2009-01-01 12:23 friends
lrwxrwxrwx+  1 andrewm andrewm    49 2008-09-08 08:57 me -> foo/baz
drwxr-x---+  9 andrewm andrewm 53248 2009-01-08 11:26 junk
drwx------+  4 andrewm andrewm  4096 2008-12-14 17:49 .Trash-1000
drwxr-x---+ 28 andrewm andrewm  4096 2009-01-30 16:08 morejunk
-rw-r-----+  1 andrewm andrewm 70787 2009-01-30 13:00 whatever



Re: [Gluster-users] Quota translator troubles

2009-01-30 Thread Anand Avati
Patrick,
 Can you try the latest codebase? Some fixes to quota have gone in
since your first mail. Also, quota is best used on the server side for
now; we are still working on making it work well on the client side.

Avati

On Fri, Jan 30, 2009 at 9:57 PM, Patrick Ruckstuhl patr...@tario.org wrote:
 Hi Ananth,


 here's the Config with the Quota on top:

 Server (two servers with different ip addresses have this config)

 ### Log brick
 volume log-posix
  type storage/posix   # POSIX FS translator
  option directory /data/glusterfs/log    # Export this directory
 end-volume

 ### Add lock support
 volume log-locks
  type features/locks
  subvolumes log-posix
 end-volume

 ### Add performance translator
 volume log-brick
  type performance/io-threads
  option thread-count 8
  subvolumes log-locks
 end-volume

 ### Add network serving capability to above brick.
 volume server
  type protocol/server
  option transport-type tcp
  option transport.socket.bind-address 192.168.0.4
  subvolumes log-brick
  option auth.addr.log-brick.allow 192.168.0.2,192.168.0.3,192.168.0.4   # Allow access to brick volume
 end-volume


 Client

 ### Add client feature and attach to remote subvolume
 volume log-remote-hip
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.0.3 # IP address of the remote brick
  option remote-subvolume log-brick    # name of the remote volume
 end-volume

 ### Add client feature and attach to remote subvolume
 volume log-remote-hop
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.0.4 # IP address of the remote brick
  option remote-subvolume log-brick    # name of the remote volume
 end-volume

 ### This is a distributed volume
 volume log-distribute
  type cluster/distribute
  subvolumes log-remote-hip log-remote-hop
 end-volume

 ### Add writeback feature
 volume log-writeback
  type performance/write-behind
  option block-size 512KB
  option cache-size 100MB
  option flush-behind off
  subvolumes log-distribute
 end-volume

 ### Add quota
 volume log
  type features/quota
  option disk-usage-limit 100GB
  subvolumes log-writeback
 end-volume


 This config results in the crash. (If you can't reproduce the crash with
 this config, I can see whether I'm able to run it under gdb.)
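 If it comes to that, my rough plan for the backtrace -- assuming the
 glusterfs client still accepts -N/--no-daemon and that my client volfile
 lives at /etc/glusterfs/client.vol -- would be:

  gdb --args glusterfs -N -f /etc/glusterfs/client.vol /mnt/log
  (gdb) run
  ... trigger the crash from another shell ...
  (gdb) bt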


 The config that works is


 Server (two servers with different ip addresses have this config)

 ### Log brick
 volume log-posix
  type storage/posix   # POSIX FS translator
  option directory /data/glusterfs/log    # Export this directory
 end-volume

 ### Add quota support
 volume log-quota
  type features/quota
  option disk-usage-limit 100GB
  subvolumes log-posix
 end-volume

 ### Add lock support
 volume log-locks
  type features/locks
  subvolumes log-quota
 end-volume

 ### Add performance translator
 volume log-brick
  type performance/io-threads
  option thread-count 8
  subvolumes log-locks
 end-volume

 ### Add network serving capability to above brick.
 volume server
  type protocol/server
  option transport-type tcp
  option transport.socket.bind-address 192.168.0.4
  subvolumes log-brick
  option auth.addr.log-brick.allow 192.168.0.2,192.168.0.3,192.168.0.4   # Allow access to brick volume
 end-volume


 Client

 ### Add client feature and attach to remote subvolume
 volume log-remote-hip
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.0.3 # IP address of the remote brick
  option remote-subvolume log-brick    # name of the remote volume
 end-volume

 ### Add client feature and attach to remote subvolume
 volume log-remote-hop
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.0.4 # IP address of the remote brick
  option remote-subvolume log-brick    # name of the remote volume
 end-volume

 ### This is a distributed volume
 volume log-distribute
  type cluster/distribute
  subvolumes log-remote-hip log-remote-hop
 end-volume

 ### Add writeback feature
 volume log-writeback
  type performance/write-behind
  option block-size 512KB
  option cache-size 100MB
  option flush-behind off
  subvolumes log-distribute
 end-volume


 So the only difference is that the quota volume moved from the top of the
 client config to the bottom of each server config.  This works, but df
 returns the wrong amount (it reports the total available space, not the
 space available under the given quota).


 Regards,
 Patrick

 Hi Patrick,
 It would be great if you could provide us with a bit more information.
 Could you mail us the specfiles used (both with and without quota) and also
 the gdb backtrace? Also, if there are any special steps we need to follow to
 reproduce the issue, please do let us know.
 Regards,
 Ananth

 -----Original Message-----
 *From*: Patrick Ruckstuhl patr...@tario.org
 *To*: gluster-users@gluster.org
 *Subject*: [Gluster-users] Quota 

Re: [Gluster-users] AFR/replicate questions

2009-01-30 Thread Keith Freedman
At 04:20 AM 1/30/2009, Barnaby Gray wrote:
I'm in the process of setting up server-side AFR with 2 servers in
separate data centres, separated by a WAN. Writes will be relatively
few, so we can live with the performance limitations of the WAN.

I noticed unexpectedly poor performance, though, when listing directories of
around 1k files with ls -al.  It looks like for this operation server1 is
sending traffic to server2 in the other data centre, which I wasn't
expecting for a read-only operation.

Any time a directory is accessed, gluster/replicate checks with the 
other server to see if the information it has is current.
It does this because something may have changed on the other machine 
without it knowing.  If something has changed, it auto-heals.

Since gluster doesn't cache information about the other machines in a 
replicate group, it has to do this every time.
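You can actually watch the state it uses for this decision: replicate
keeps its bookkeeping in extended attributes on the backend files.
Something like this, run against a server's export directory rather than
the mount (attribute names may vary a little between releases), will dump
them:

 getfattr -d -m trusted.glusterfs.afr -e hex /data/export/somefile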

tshark shows a reasonable amount of traffic that looks related to xattrs:
lots of mentions of filenames and 'trusted.glusterfs.afr.metadata-pending'.

I'm using the option read-subvolume local to point read operations to
the volume local to either server.

This option means: once replicate has determined that the local copy of 
the file is the most up to date, it serves the file from the local disk 
(or from your preferred server, in a client-server model), which is 
faster than streaming it over the network.
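For anyone following along, that option sits on the replicate volume
itself.  A minimal sketch -- the volume names here are invented, and
'local' must match one of the listed subvolumes:

volume home-replicate
 type cluster/replicate
 option read-subvolume local   # prefer this subvolume for reads
 subvolumes local remote
end-volume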

Have tried both with and without the performance translators client-side
to no avail. We're using 2.0.0rc1.

I don't expect any performance translator can help with this 
particular situation.  Gluster HAS to ensure that it's delivering the 
most up-to-date version of a file; in order to do that, upon any file 
request, it has to check with the other replicate servers to find out.

Apologies if this is an obvious question - can someone spot what I'm
doing wrong?

One might think: well, both servers haven't lost their connection to 
each other, so they should be able to assume they're in sync.  But 
this isn't necessarily the case, because you can't know the 
configuration on the other end.

There may be a situation where Server A decided Server B was down 
because of network latency, so it wrote and updated a file but 
didn't replicate it to Server B.  Server B then goes to read that file; 
if it assumes that all has been well with Server A and doesn't bother 
checking, it will serve the wrong (stale) version of the file.

The only way to resolve this would be to make Server A responsible 
for notifying Server B when it re-establishes a connection to 
it.  While this seems logical and would improve performance for your 
case, it would require some sort of journaling on Server A.  That 
would be terribly inefficient and would require an additional journal 
filesystem, or modifying the underlying filesystem in some 
way.  Then there's the case of changing architecture: if you have 
10 servers in your replicate group, you have to run a journal on all 
10.  Let's say you just shut 5 of them off forever; you'd then need a 
way to clear out the journals for those so that space isn't wasted.

So given that gluster wants to be non-intrusive

cheers,

Barnaby



Re: [Gluster-users] gluster ha/replication/disaster recovery (dr translator) wish list

2009-01-30 Thread Prabhu Ramachandran
On 01/28/09 01:36, Keith Freedman wrote:
 At 09:32 AM 1/27/2009, Prabhu Ramachandran wrote:
 On 01/27/09 02:21, Keith Freedman wrote:
 At 10:36 AM 1/26/2009, Prabhu Ramachandran wrote:
 I don't see any problems with your config,
 other than: if your network connection is very sporadic, you'll 
 often be caught waiting for timeouts, which will make things seem 
 slower.

 Network connection isn't usually a problem, but when the network does 
 go down it could be gone for a while.  In the worst case, could I 
 temporarily disable the replicate/AFR feature?
 
 You won't need to.  It'll know that the other replica is down and will 
 continue working without it, then will auto-heal when it comes back up.

OK, I am now all set and things work really nicely!  Thanks for the answers!

 so there should be no need to change your configuration, unless you hate 
 seeing all the connection-down messages in your logfile.

Well, I can see a couple of cases where it would be expedient to change 
the configuration.  Let's say one of the machines loses a disk and the 
machine won't be fixed for, say, a week.  Then, once the machine comes 
back up, I could use a different configuration and sync the files 
locally on the machine that is available rather than doing it over the 
slow network.  I am guessing that this should work.
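(If I've understood the docs correctly, the resync after reconnecting is
triggered by a recursive stat of the mount point, something like

 ls -lR /mnt/gluster > /dev/null

-- the part I'm unsure about is whether I can point both replicate
subvolumes at local paths for that initial sync.)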

Anyway, I'd like to thank the developers once again for the excellent 
software and the support.  Thank you for your patient answers!

cheers,
prabhu
