Re: [9fans] s3venti

2008-02-12 Thread Wilhelm B. Kloke
Richard Bilson [EMAIL PROTECTED] wrote:
 and an issue related to the
 fact that we need to encrypt users' data.

 For the record, s3venti does encrypt blocks that it writes to S3. It
 uses a single key, making it rather vulnerable to dictionary attacks,
 but I haven't come up with a way to do better without changing the
 venti protocol. Suggestions are welcome.

Any sort of encryption that does not change the key from time to time
is not very secure. If the attacker has enough time, security is hard
to achieve.

I propose splitting the files to be stored, e.g. into upper and lower 4-bit
nibbles, and putting the halves in different places. Each half is then
likely to be worth less by itself, and much more difficult to decipher,
too.
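
A minimal sketch of the nibble split described above, in Python. The function
names and the choice to pack two nibbles per output byte are assumptions made
for illustration; nothing here comes from s3venti itself.

```python
def pack(nibbles):
    # Pack 4-bit values two per byte, padding with a zero nibble if the count is odd.
    if len(nibbles) % 2:
        nibbles = nibbles + [0]
    return bytes((nibbles[i] << 4) | nibbles[i + 1]
                 for i in range(0, len(nibbles), 2))


def unpack(packed, count):
    # Reverse of pack(): recover the first `count` 4-bit values.
    out = []
    for b in packed:
        out.append(b >> 4)
        out.append(b & 0x0F)
    return out[:count]


def split_nibbles(data):
    # One stream holds every upper nibble, the other every lower nibble.
    upper = pack([b >> 4 for b in data])
    lower = pack([b & 0x0F for b in data])
    return upper, lower


def join_nibbles(upper, lower, length):
    # Recombine the two streams; `length` is the original byte count.
    hi = unpack(upper, length)
    lo = unpack(lower, length)
    return bytes((h << 4) | l for h, l in zip(hi, lo))


upper, lower = split_nibbles(b"venti")
assert join_nibbles(upper, lower, 5) == b"venti"
```

Each stream is about half the size of the input. The split by itself is a
reversible rearrangement, not encryption, so it would sit alongside a cipher
rather than replace one.
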
-- 
Dipl.-Math. Wilhelm Bernhard Kloke
Institut fuer Arbeitsphysiologie an der Universitaet Dortmund
Ardeystrasse 67, D-44139 Dortmund, Tel. 0231-1084-257
PGP: http://vestein.arb-phys.uni-dortmund.de/~wb/mypublic.key


Re: [9fans] s3venti

2008-02-12 Thread Steve Simon
 For the record, s3venti does encrypt blocks that it writes to S3. It
 uses a single key, making it rather vulnerable to dictionary attacks,
 but I haven't come up with a way to do better without changing the
 venti protocol. Suggestions are welcome.

Beware: I am no security expert, I know just enough to be dangerous.

Ensure you have plenty of entropy - insist on long pass phrases.
sha1 the pass phrase together with the block number to give you the key for
a particular block. This at least permutes the venti tree info blocks - its
real purpose is to ensure that duplicate blocks look different when encrypted,
but venti doesn't have duplicate blocks as such.

you could repeat the sha1, as it may be possible to infer some
info given that all the sha1 inputs start with the same (or known) prefix -
the pass phrase (or block number).

If you are likely to have multiple ventis with the same password on the server
(one for work stuff and one for home) then stir a random string into the sha1,
and keep this in factotum; generate this string when the venti is initialised.
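
A rough sketch of the key derivation suggested so far: pass phrase, block
identifier and a per-venti random string hashed together, with the sha1
repeated once. The names block_key, new_salt and block_id are hypothetical,
and the length-prefixed field layout is an assumption added so that fields
cannot run into each other. How the result feeds a cipher is left open.

```python
import hashlib
import os


def new_salt(nbytes=16):
    # Random string generated once when the venti is initialised and kept
    # somewhere private (the message suggests factotum).
    return os.urandom(nbytes)


def _field(value):
    # Length-prefix each input so that (b"ab", b"c") and (b"a", b"bc")
    # do not hash to the same thing.
    return len(value).to_bytes(4, "big") + value


def block_key(passphrase, salt, block_id):
    # First sha1 over salt, pass phrase and block identifier...
    inner = hashlib.sha1(_field(salt) + _field(passphrase) + _field(block_id)).digest()
    # ...then sha1 again, as suggested, to hide the common prefix.
    return hashlib.sha1(inner).digest()


salt = new_salt()
key = block_key(b"a suitably long pass phrase", salt, b"block-0001")
print(len(key))  # a sha1 digest is 20 bytes; a cipher takes as many as it needs
```
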

your venti blocks are compressed, which gives you some obscurity: guessing
plaintext is not so easy, but huffman tables and the like still stand out.

If you want to be obsessive you could generate a block of random data, say 64k,
which you hold locally, and xor this with your venti blocks before encryption.
offset your start position in the random data by a value generated from
sha1(sha1(blocknumber, passphrase)) (e.g. the checksum); this would make cracking
your data much harder.

Note this block of random data needs to be really random, not a PRBS like rand(),
which is predictable. you could slowly suck bytes from /dev/random on a busy
machine.
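
The xor step might look something like this. It is only a sketch: the names
are made up, os.urandom stands in for reading /dev/random, and wrapping around
the end of the 64k pad is an assumption, since the message does not say what
happens when a block runs past it.

```python
import hashlib
import os

PAD_SIZE = 64 * 1024  # "say 64k", held locally and never uploaded


def make_pad():
    # Draw the pad from the operating system's random source, not rand().
    return os.urandom(PAD_SIZE)


def pad_offset(passphrase, block_id):
    # sha1(sha1(blocknumber, passphrase)) reduced to a start position in the pad.
    inner = hashlib.sha1(block_id + passphrase).digest()
    outer = hashlib.sha1(inner).digest()
    return int.from_bytes(outer, "big") % PAD_SIZE


def xor_with_pad(block, pad, offset):
    # xor each byte of the block with the pad, starting at offset and wrapping.
    n = len(pad)
    return bytes(b ^ pad[(offset + i) % n] for i, b in enumerate(block))


pad = make_pad()
off = pad_offset(b"a suitably long pass phrase", b"block-0001")
mixed = xor_with_pad(b"venti block contents", pad, off)
# xor is its own inverse, so the same call undoes it.
assert xor_with_pad(mixed, pad, off) == b"venti block contents"
```

The mixed block would then go through the normal encryption before being
written out.
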

as ever it's a case of:

how valuable is it?
how long do you want to keep it secret?
who are you trying to keep it secret from?

caveat emptor

-Steve


Re: [9fans] s3venti

2008-02-12 Thread erik quanstrom
 You could reduce your storage bill by using file names to store the data 
 through information hiding rather than the content ;)
 
 http://www.geocities.com/patchnpuki/other/compression.htm
 
 One of these days ..

my reading of the sla seemed to indicate they count bucket names
against you.

- erik



Re: [9fans] s3venti

2008-02-12 Thread Alf
You could reduce your storage bill by using file names to store the data 
through information hiding rather than the content ;)


http://www.geocities.com/patchnpuki/other/compression.htm

One of these days ..



Re: [9fans] s3venti

2008-02-11 Thread Richard Bilson
 and an issue related to the
 fact that we need to encrypt users' data.

For the record, s3venti does encrypt blocks that it writes to S3. It
uses a single key, making it rather vulnerable to dictionary attacks,
but I haven't come up with a way to do better without changing the
venti protocol. Suggestions are welcome.


Re: [9fans] s3venti

2008-02-11 Thread Skip Tavakkolian
 skip: what are the principles of operation of s3fs?  what's the advantage
 over venti?

it is easier to do a mirror.  there is a limitation on the number
of buckets, etc that also played into it, and an issue related to the
fact that we need to encrypt users' data. unfortunately the
thread that had brucee's (and rsc's i believe) comments on it is on a sick
kenfs that's being worked on.



Re: [9fans] s3venti

2008-02-11 Thread Bakul Shah
On Mon, 11 Feb 2008 11:39:23 EST Richard Bilson [EMAIL PROTECTED]  wrote:
  what usage scenario do you have in mind for venti/s3?
 
 I wanted set-it-and-forget-it off-site backups, at a reasonable cost
 and without significant capital outlays or maintenance. I.e., mirror
 an existing venti with a cron job, or use it as a target for vbackup.
 As you point out, whether the cost of S3 is reasonable depends on how
 much you have to store, and how much it's worth to you to store it. I
 don't intend to use it for my mp3s, for instance.

In using S3 for off-site backups I would worry about the time
to restore a failed disk (apart from the privacy issues).  As
an example restoring a 100GB disk over the 'net at a constant
300KB/s of download speed can take close to 4 days.  Of
course, these days many people have much more data than that.
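
The restore-time estimate is easy to reproduce. A small sketch, assuming
decimal units (1 GB = 10^9 bytes, 1 KB = 10^3 bytes) and a perfectly constant
rate; the function name is just for illustration.

```python
def restore_days(size_bytes, rate_bytes_per_s):
    # Transfer time in days at a constant download rate.
    seconds = size_bytes / rate_bytes_per_s
    return seconds / 86400


print(round(restore_days(100e9, 300e3), 2))  # 3.86
```

With binary units (GiB and KiB) the same calculation comes to just over 4 days.
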

Maybe there are other remote backup companies that provide a
"copy your data to disk and deliver it overnight" service for
an extra charge.


Re: [9fans] s3venti

2008-02-11 Thread erik quanstrom
 I mentioned in passing some time ago that I was working on a venti
 server that uses Amazon S3 as a storage backend. There is now code in
 /n/sources/contrib/rcbilson/s3venti . Beware sharp edges. I have
 pumped a fair amount of test data through it successfully, but I
 wouldn't recommend trusting anything important to it yet. There is a
 man page.
 
 I started writing it under plan9, but for irrelevant reasons later
 switched to plan9port, so that's where it's known to work (on Linux,
 at least). I would hope and expect that moving it back to native plan9
 would be a small job.
 
 Questions and comments are welcome.

neat stuff.

i took a quick look at pricing -- $0.15/gb/month plus $0.10/gb to transfer
data in.  assuming it's the data motel and it never checks out, 
500GB would cost $1500 to store for a year.  but 1GB would cost
just $3.  this seems nice -- my fs has only 2.5GB of stuff.  and even
at my cost of $100 for the recycled machine, that's $1.60/gb/month.
but i would need to cache all that locally and have a duplicate copy.
so what usage scenario do you have in mind for venti/s3?
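
The pricing arithmetic can be checked with a few lines. This sketch uses only
the two rates quoted in the message (storage per GB per month, transfer in per
GB) and assumes data is uploaded once and never downloaded or deleted; the
names are made up for illustration.

```python
STORAGE_PER_GB_MONTH = 0.15  # dollars, as quoted in the message
TRANSFER_IN_PER_GB = 0.10    # dollars, as quoted in the message


def first_year_cost(gb):
    # Twelve months of storage plus a one-time upload of the same data.
    return gb * (STORAGE_PER_GB_MONTH * 12 + TRANSFER_IN_PER_GB)


for gb in (1, 2.5, 500):
    print(f"{gb} GB: ${first_year_cost(gb):.2f}")
```

With just these two rates the totals come out lower than the $1500 and $3
given in the message, so those figures may rest on assumptions not stated in
the thread.
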

skip: what are the principles of operation of s3fs?  what's the advantage
over venti?

- erik



Re: [9fans] s3venti

2008-02-10 Thread Skip Tavakkolian
that's interesting. we initially considered that, but decided on
S3fs. brucee has been working on it. we will use it to provide
archiving for rangboom users.

 I mentioned in passing some time ago that I was working on a venti
 server that uses Amazon S3 as a storage backend. There is now code in
 /n/sources/contrib/rcbilson/s3venti . Beware sharp edges. I have
 pumped a fair amount of test data through it successfully, but I
 wouldn't recommend trusting anything important to it yet. There is a
 man page.
 
 I started writing it under plan9, but for irrelevant reasons later
 switched to plan9port, so that's where it's known to work (on Linux,
 at least). I would hope and expect that moving it back to native plan9
 would be a small job.
 
 Questions and comments are welcome.