At the last few devchats, we've discussed what would be the next most-useful step in the Accounting space, and we're kind of settling on the idea of a "Grid Manager" agent. It would work something like this:
* Current grids are "unmanaged": clients use whatever storage servers the introducer tells them about, and storage servers accept shares from any client that can connect to them. The only access control is to keep the introducer FURL secret (or clients can override the introducer's announcements, to use additional servers).
* "Managed grids" are controlled by a Grid Manager, which is a special kind of node that decides which servers and which clients can be part of the grid. Whoever runs the Grid Manager is the Grid Manager Admin, and they make the decisions for everyone else.
* Both clients and servers are added to a managed grid by using a magic-wormhole-style "invitation code". This would be based on the work that Chris Wood and the LAE team are doing with GridSync. The CLI commands might be something like "tahoe-grid-manager invite client" and "tahoe grid accept".
* The invitation process exchanges public keys (Ed25519 verifying keys): the client/server learns the manager's pubkey, and the manager learns the client/server's node-identity pubkey. In addition, the client/server learns the introducer.furl (or maybe multiple ones, for redundancy).
* Clients only pay attention to servers that are authorized by the manager. We have some options for this:
  1: The manager publishes a signed list of all blessed storage-server IDs in a new kind of introducer announcement. Clients subscribe to this, and use it as a filter on the normal server announcements that they hear.
  2: The manager gives a signed certificate to each server, saying "to whom it may concern: server X is cool. love, The Manager". Servers include this certificate in their announcements, and clients only pay attention to the ones with a valid certificate.
* Servers only accept requests from authorized clients. We've got the same two options.
* The grid manager might also publish recommended values for k/H/N, since it knows how big the grid is supposed to be.
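Option 2 above (per-server certificates) can be sketched roughly as follows. This is a hypothetical illustration, not Tahoe code: HMAC-SHA256 stands in for the Ed25519 signatures the real design would use (note HMAC is symmetric, so in this toy version anyone holding the key could forge certificates; Ed25519 would let clients verify with only the manager's public verifying key). The certificate layout and all names are invented.

```python
import hashlib
import hmac
import json

# Stand-in for the Grid Manager's Ed25519 signing key (illustrative only).
MANAGER_KEY = b"grid-manager-signing-key"


def issue_certificate(server_id: str) -> dict:
    """Manager side: bless one storage server."""
    body = json.dumps({"server": server_id}, sort_keys=True).encode()
    sig = hmac.new(MANAGER_KEY, body, hashlib.sha256).hexdigest()
    return {"body": body.decode(), "signature": sig}


def certificate_is_valid(cert: dict) -> bool:
    """Client side: check the certificate attached to an announcement."""
    expected = hmac.new(MANAGER_KEY, cert["body"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, cert["signature"])


def filter_announcements(announcements: list) -> list:
    """Keep only announcements carrying a valid manager certificate."""
    return [a for a in announcements
            if "certificate" in a and certificate_is_valid(a["certificate"])]


good = {"server_id": "srv1", "certificate": issue_certificate("srv1")}
bad = {"server_id": "srv2"}  # no certificate: an unblessed server
forged = {"server_id": "srv3",
          "certificate": {"body": '{"server": "srv3"}',
                          "signature": "00" * 32}}

accepted = filter_announcements([good, bad, forged])
print([a["server_id"] for a in accepted])  # -> ['srv1']
```

The nice property of this shape is that clients never need the full membership list: each announcement carries its own proof, which matters for the privacy concerns discussed below.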
* Payment: nothing in this deals with payment. The human who runs the Grid Manager may choose servers on the basis of cost or reliability, and they may charge clients for service, but nothing in the Grid Manager code or protocol knows about that. Servers might bill the grid admin at the end of the month, or ahead of time, but that will happen out-of-band for now.
* Usage/Quotas: to support servers deciding *how much* to bill the grid manager, and to let the grid manager know how much to bill each user, servers in this scheme *will* keep track of per-client usage, and will deliver a machine-readable report to the grid manager. The manager will aggregate these reports and display a per-client usage table to the admin (via some grid-manager UI, probably web-based).
* Price Lists: we might include something like a price list here, so the server->manager report could say "Alice is using 2GB, and I charge $0.03/GB*month", and then the grid manager can add all that up and say "hey admin, based on the price you told me, you should charge Alice $0.06". The actual billing would be out-of-band. The idea is to provide enough information to correlate the server's out-of-band bill to the manager with the manager's out-of-band bill to the client. But maybe we just show bytes everywhere and let the humans decide how to translate that into money.

Ideally, we'd provide a couple of pre-built Grid Manager modes for common use cases:

* friendnet: somebody volunteers to run the Grid Manager, and they're responsible for inviting the right people. They get reports of usage, but no money ever trades hands. If they go away, somebody else starts a new manager and everyone switches over, hopefully keeping the same set of storage nodes.
* cloud-storage backend: for individuals who like Tahoe's encryption but want to use e.g. S3 for the backend. They can run a Grid Manager and give it their AWS credentials, and the manager will configure and launch an S3-backed storage server.
The user has to pay their AWS bill, and they can get a report about how much space each client is using. (It's not clear whether the server would run in EC2, run in the same process as the grid manager, or maybe even both.)
* commercial grid provider (S4): someone like LAE or Matador could run the grid manager and add servers of their choice to it. Signing up with them would get you an invitation code for your clients. The reporting would give them enough information to know how much to bill you each month. This looks just like the previous case, except that you'd write more code around it to automate the billing.
* other needs could be handled by extending or wrapping the Grid Manager code.

This "Grid Manager" approach is an alternative to some of the other ideas we've discussed, and hopefully easier to implement:

* clients read from some large "yellow pages" of servers, automatically choose servers on the basis of price and crowd-sourced reputation data, then make direct BTC/ETH/ZEC/etc payments to each
* clients (i.e. their human admins) exchange invitations directly with (the human admins of) servers, maybe with payment involved

The general idea is that reliable long-term storage wants to use mostly the same set of servers for long periods of time (the set must be *able* to change, as servers are retired, but we don't want it to change with every upload). And scaling requires hierarchy: we want to *not* have some kind of relationship between every client and every server in the universe. So clients mostly only know about their grid manager, and the grid manager chooses a mostly-stable set of servers. All your uploads go to that set. When the human that runs a client first sets it up, they may look at a yellow pages of grid managers and choose one based upon price and reputation, but after that the (machine) client is explicitly told which one to use.
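To make the usage/price-list reporting described earlier concrete, here's a toy version of the aggregation the grid manager might do over per-server reports. The report format and field names are invented for illustration; the real wire format is undecided.

```python
# Each server sends the manager a machine-readable report of per-client
# usage plus its own price; the manager sums each client's cost across
# all reports to suggest a monthly charge. Layout is hypothetical.

def suggested_charges(reports: list) -> dict:
    """Sum each client's cost across all server reports.

    Each report looks like:
      {"price_per_gb_month": 0.03, "usage_gb": {"alice": 2.0, ...}}
    Returns {client: dollars_per_month}.
    """
    totals: dict = {}
    for report in reports:
        price = report["price_per_gb_month"]
        for client, gigabytes in report["usage_gb"].items():
            totals[client] = totals.get(client, 0.0) + gigabytes * price
    return totals


reports = [
    # the email's example: Alice uses 2GB on a $0.03/GB*month server -> $0.06
    {"price_per_gb_month": 0.03, "usage_gb": {"alice": 2.0}},
    {"price_per_gb_month": 0.03, "usage_gb": {"alice": 2.0, "bob": 10.0}},
]
print(suggested_charges(reports))  # alice ~$0.12, bob ~$0.30 (float noise aside)
```

The "just show bytes everywhere" variant is the same loop with the price multiplication dropped, which is one argument for keeping price out of the report format entirely.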
There's still a bunch of design space to figure out:

* Publishing a signed list of authorized clients/servers via the introducer would reveal this data to more people than really need it: client 1 doesn't need to know that client 2 is on the list. And we generally want the Introducer to be less powerful than it is now.
* We could gather an *encryption* key from each client/server, and then the grid manager could encrypt the list to just the folks who need it. Or the manager could publish a FURL for a subscription port: these nodes could connect to it, prove their identity by signing a challenge, then be allowed to fetch the current list.
* Servers need a way to deliver the "usage report" to the grid manager. This could go to a FURL published by the manager, and servers could sign each report. Or servers could advertise a FURL from which the report could be fetched, but then they'd have to do something to keep it private.
* Should clients be able to fetch their own usage report from each server? Probably, but that needs some more API.
* Revocation: when the grid manager decides to remove a server or client, how does everyone else find out? How quickly does it propagate, and how do connectivity failures (accidental or deliberate) affect it?
  1: If the introducer merely publishes a list (encrypted or not) with a sequence number (highest seqnum wins), then a failure of the introducer or the grid manager just freezes the membership until both are running again and a new list can go out (just like the current storage-server announcements work). If the manager drops offline but the introducer still retains the latest announcement, then clients and servers can be bounced and still get the authorization list.
  2: If servers connect to a manager FURL and subscribe to the client list, then an offline manager prevents bounced servers from learning the current list (unless we cache the list on the server, in which case we need to think about expiration).
  3: If the manager delivers certificates to clients/servers, we need to think about expiration and renewal, and we need a channel for those deliveries. The manager could publish a FURL, nodes could prove their identity with signed challenges, and then they could subscribe to get fresh certs. This would add another persistent TCP connection. The expiration time would dictate how long good usage continues once the manager goes down, how long bad usage could continue if the manager were DoSsed offline, and how much network traffic is added by all the renewals.

This will also tie in to the "federated inter-grid access" scheme we're working on. I'll write more in a later email, but the basic idea is that all your uploads go to your local grid, while you can download files from other grids. This might involve a "clearinghouse", where your grid manager signs up for inter-grid access. Requests for "foreign" filecaps would go through the manager, which would look up the remote gateway for that filecap and fetch the erasure-decoded ciphertext for you (maybe charging you an extra fee). Filecaps might be augmented with a grid-id (like the area code on a phone number), or might be looked up in a big table managed by the clearinghouse. More details later.

thoughts?
 -Brian

_______________________________________________
tahoe-dev mailing list
tahoe-dev@tahoe-lafs.org
https://tahoe-lafs.org/cgi-bin/mailman/listinfo/tahoe-dev