On Fri, Oct 12, 2012 at 03:32:14PM -0600, Zooko Wilcox-O'Hearn wrote:
> Folks:
> 
> I've never been satisfied with our garbage collection scheme. It
> requires that the client repeatedly refresh leases on its data, and if
> it fails to do so then the server will eventually delete that data. It
> makes me feel unsafe about the longevity of my data. What if I get
> sick? What if the renewer script breaks and I don't notice that it
> broke?

Exactly my worries too.


> Why would a customer accept an increasingly costly service?

No customer has an infinite amount of money, so they won't!



Zooko, you've given the answers below in your post, but let me
rephrase them from an economic perspective, leaving out all the
technical details.

I believe these are the expectations from the customers' perspective:

1. I want this data to be kept around and I don't want to be bothered
   with the details;
2. I *pay* you for as long as *I* think my data is worth *your* price;
3. I don't want to pay for stuff that I don't think is worth your price
   to keep; I tell you what I consider valuable.

Tahoe's design violates some of these expectations.

> Secondly, a problem is that you might decide you no longer need
> some data, and throw it out of your local worldview (i.e. delete all
> links you have that could lead you back to that data), but fail to
> inform the storage server that you are done with it (for example, your
> network connection or the storage server itself might be down right at
> that moment when your client was about to tell the storage server that
> you, the user, have permanently lost all interest in that data). If
> that happened, the storage server would be stuck holding onto it
> forever.

This violates the expectations: Don't make me pay for your technical
incompetence (3), or be bothered by the hassles it takes to make my
wish clear to you (1). It lowers the price I'm willing to pay at (2).

There is also a fourth requirement: 4. Protect me from my own
mistakes. If I accidentally delete all my data, I want it back. Make it
clear to me what accidents are covered in the price I pay at (2) so I
can decide if my data is worth it.


> (In fact, LeastAuthority.com actually won't delete your data even if
> you *do* stop paying. But we'll cease allowing you to upload or
> download until you bring your account into good standing. And we do
> reserve the right to change our minds and delete your data
> eventually.)

There is a hidden fifth expectation: 
5. A failure in payment does not equate to loss of value of my data.
   There are many reasons why the payments can fail. Check with me
   before holding my data hostage.


> There's one consequence of this use case request which affects the
> leasedb design.
> 
> That is: what if there are two different users, Amber and Bryce, and
> Amber has said "Okay server, I've marked everything I care about, and
> I hereby cease paying for anything that I haven't marked, so if you
> want you can sweep it all out.", but Bryce hasn't (yet) said that.

> Bryce has said "Keep everything I've touched until I tell you
> otherwise, and I'll pay you to do.".

Here we are back in Tahoe territory (or general capability theory):

Bryce has a link (from his own root-alias within the same grid) to the
data, hence he thinks it's still valuable, hence the server must keep
the data. Just like a hard link on a POSIX file system.

However, it *must* transfer accounting to Bryce, as Amber cannot know
who keeps links to the data when she deletes hers. POSIX fails here.

Now Bryce will be charged for it and he *has to* decide whether the
data is worth the storage cost *to him*. If not, he can decide to
delete it, or move it to some other storage that better fits his
requirements, i.e., let it sit on a local hard disk until it rots away,
or store it in an entirely different grid that matches his pricing.
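
The accounting rule described above can be sketched in a few lines of
Python. This is only an illustration: the names Share, Ledger, add_link,
drop_link and bill are hypothetical and do not exist in Tahoe-LAFS, and
splitting the cost evenly while several accounts hold a link is an
assumption, not something the design specifies.

```python
# Hypothetical sketch; none of these names are part of Tahoe-LAFS.
class Share:
    """One stored object plus the accounts that still link to it."""

    def __init__(self, size_bytes):
        self.size_bytes = size_bytes
        self.holders = set()  # accounts that still want this data kept


class Ledger:
    """Tracks who holds a link to what, and who is charged for it."""

    def __init__(self):
        self.shares = {}

    def add_link(self, share_id, account, size_bytes):
        # The first link creates the record; later links just add a holder.
        share = self.shares.setdefault(share_id, Share(size_bytes))
        share.holders.add(account)

    def drop_link(self, share_id, account):
        """Remove one account's link; return True if the data may be deleted."""
        share = self.shares[share_id]
        share.holders.discard(account)
        if not share.holders:
            del self.shares[share_id]
            return True
        # Someone else still holds a link, so the cost moves to them.
        return False

    def bill(self, account):
        """Bytes charged to an account (assumption: even split among holders)."""
        return sum(
            share.size_bytes / len(share.holders)
            for share in self.shares.values()
            if account in share.holders
        )


ledger = Ledger()
ledger.add_link("statement-2012", "amber", 1000)
ledger.add_link("statement-2012", "bryce", 1000)
shared_cost = ledger.bill("bryce")                     # both hold a link
deletable = ledger.drop_link("statement-2012", "amber")
bryce_cost = ledger.bill("bryce")                      # Bryce now pays in full
```

When Amber drops her link the server keeps the data, because Bryce still
holds one, and his bill rises accordingly. Only when the last holder
drops the link does drop_link report that the data may be swept.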


I think that the unhappiness with the current garbage collection comes
from the design decisions that violate the customers' (economic)
expectations. 

The price can be either a monetary price if you subscribe to a
commercial grid or a price in time when you run/join a free
grid. (Time == Money)

Side note: I'd love to see my bank send me transaction statements
into a Tahoe grid. I get the read-capability to each file and store it
in my own directory in the grid. The bank pays the storage fees for
the retention period required by tax law (out of the service
fee I pay the bank). After that I get to decide what to do with the
statements.


Cheers, Guido Witmond.


PS. I miss Alice. Is she on holiday? :-)
_______________________________________________
tahoe-dev mailing list
tahoe-dev@tahoe-lafs.org
https://tahoe-lafs.org/cgi-bin/mailman/listinfo/tahoe-dev
