On Sun, Jun 16, 2013 at 6:17 AM, Greg Troxel <g...@ir.bbn.com> wrote:
>
> "erp...@gmail.com" <erp...@gmail.com> writes:
>
> > If it helps, I've noticed that Tahoe seems to be designed for use in a
> > business environment
>
> I don't think that's true, although I agree with some of your points.
> My comments are from my own usage that is heading towards a friendnet.

I was looking at it from the same point of view.

> > where one entity controls all of the nodes
>
> I think this is mostly not really true and not really important.  There
> are some issues which sort of relate, but they all feel independent.

There were a few bumps in the road related to node ownership. One of
them was the variable number of nodes on the network at any given time.
When participants and their nodes join and leave the network freely,
that creates churn, but it also changes the number of available nodes
over time. If the number of nodes in the network drops below N, space is
wasted. If the number drops below H, uploads fail. So it seems
beneficial to set N and H to lower numbers.

On the other hand, for a given ratio of N to K, larger values overall
increase the resilience of the uploads, so it would seem beneficial to
set N (and H, and probably K) to higher numbers. Finding the right
balance requires knowing in advance how many nodes are going to be
available in the long term, and that's hard to do when the nodes are run
by people with their own needs, motivations, etc. I would forecast the
reliability of a friendnet somewhere in between an unstable company
likely to fold and shut down all its servers at any time and a stable
company likely (we hope) to keep operating all of those servers forever.
So, I think friendnet configurations can do better in this area.

> > each node has a static IP
>
> My experience has been that the introducer needs to have static address,
> but that storage nodes and clients do not.  Storage nodes do need to
> have a globally-routable address, but that's different.

I think even the introducer may not technically need a static IP to keep
the network going if it has dynamic DNS. However, all nodes need either
a static IP or dynamic DNS to find each other (or did at the time I was
participating in VG2). That's something else for each node operator to
buy or maintain, respectively.
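
To put rough numbers on the N/K/H trade-off described above, here is a
minimal Python sketch. The function name and the status strings are
hypothetical and not part of Tahoe; the "space is wasted" and "uploads
fail" outcomes simply follow the description in this message, and the
3/7/10 example values are what I understand to be Tahoe's default
needed/happy/total settings.

```python
def assess_encoding(k, happy, n, servers_available):
    """Classify an upload attempt for a K-of-N encoding on a grid.

    Hypothetical helper for illustration only. It encodes the rules of
    thumb from the discussion: fewer than H servers means the upload
    fails, fewer than N means some servers hold more than one share.
    """
    if not (1 <= k <= happy <= n):
        raise ValueError("expected 1 <= k <= happy <= n")

    # Every upload stores N shares, any K of which rebuild the file,
    # so the stored (and uploaded) data is N/K times the file size.
    expansion = n / k

    if servers_available < happy:
        status = "upload fails"
    elif servers_available < n:
        status = "upload succeeds, shares doubled up"
    else:
        status = "upload succeeds, one share per server"
    return expansion, status


if __name__ == "__main__":
    # Watch the same 3-of-10 encoding as a friendnet shrinks.
    for servers in (12, 8, 5):
        print(servers, assess_encoding(3, 7, 10, servers))
```

Lowering N and H moves the failure threshold down, while raising all
three at a fixed N/K keeps the same expansion but spreads the file over
more servers, which is the tension described above.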

> > very little node turnover,
>
> I'm not sure how much this matters, once the repair-churn issues are
> fixed.  But it's an interesting question.  (One would need to define a
> metric that relates filesystem behavior to node turnover.)

Agreed. Until the issues related to repair-churn are fixed, however, ...

> > very high internode bandwidth compared to gateway-user bandwidth,
>
> I don't really follow this.  In my view, the WUI/WAPI should be run on
> computers needing access, and not accessed beyond a thought-secure LAN.
> So I see user-gateway bandwidth as approaching memory speeds.
>
> As for internode bandwidth, client nodes interact with storage nodes
> (yes, I know some can be both).  I don't see that tahoe makes any big
> assumptions here.  I also think that tahoe speed is typically not really
> limited by bandwidth here as much as serializing round trips.

I phrased that badly. I was trying to talk about the amount of data sent
from an uploading client node being N/K times the size of the file being
uploaded, because upstream bandwidth *should be* (and often is) the
limiting factor in tahoe's performance in a friendnet environment. On
the other hand, if I'm uploading to a grid of storage nodes operated by
a business that are all interconnected at 100 Mbit/s, and that business
provides an upload helper, the data crossing the connection between my
node and the helper (the slowest link) isn't multiplied by N/K yet,
which speeds up the entire process.

There are also subtler issues. While I haven't dug very deeply into the
code, it was my understanding that at the time of VG2, a Tahoe node
processing an upload would divide an upload into chunks and upload the
chunks serially. That is, it would only begin the upload of chunk 2 to
host 2 after the upload of chunk 1 to host 1 was complete.
Uploading chunks serially makes sense when all of the storage nodes and
upload helpers are connected together with a fast ethernet switch: an
uploading node would saturate its own interface to the switch while
sending a single chunk to a single host, requiring no optimization. On
the Internet, if the connection between my node and a friend's node is
poor, my node is going to leave most of my most precious resource
(upstream bandwidth) unused while taking a long time to finish uploading
that chunk.

> > There are some challenges in the "friends want to pool their extra storage"
> > use case.
>
> True.  The biggest challenges I see are
>
>   accounting, so you can have some measure of fairness (even among
>   friends who are trying to be reasonable, you need a way to know if
>   you've accidentally consumed 10x what you thought you had)
>
>   expiration/garbage-collection.  There needs to be a way for old shares
>   to go away, but it needs to be safe against normal activities, and
>   safe against vanishing for a few months.

I may be naive here, but I believe both of these problems can be solved
by looking to traditional filesystems. Each filesystem object has an
owner--that makes it possible and easy to do accounting. Right now
objects are somewhat anonymous, which I don't see as an advantage in any
of Tahoe's use cases. If you need to distribute data to people
anonymously I think a model like Freenet's would provide better
protection.

The necessity for garbage collection IMHO comes from the fact that it's
possible to lose or forget the root of a directory structure. Why not
use the Dropbox model, where it's just like another drive with a single
root per user?

> But I also think these challenges are not particularly about
> allmydata.com vs friendnet - they apply to both.  I believe both of
> these are being worked on.  With improvements for those two points and
> fixes for ticket:1209 and ticket:1210, I think tahoe will be much more
> usable for the friendnet use case.

Agreed.
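
Going back to the upload-bandwidth point, the following is a simplified
Python model of the effects described above, not a description of
Tahoe's actual uploader. All three function names are hypothetical. The
model assumes equally sized chunks, a fixed upstream rate for the
uploading node, a fixed achievable rate to each storage host, and no
round-trip or protocol overhead. The serial case reflects my
recollection of the VG2-era behavior; the parallel case is an idealized
lower bound.

```python
def bytes_sent(file_size, k, n, use_helper=False):
    """Data leaving the uploading node.

    Without a helper the node sends all N shares itself (N/K expansion).
    With a helper it sends roughly the file once and the helper expands.
    """
    return file_size if use_helper else file_size * n / k


def serial_upload_seconds(chunk_size, upstream, link_rates):
    """One chunk at a time: each transfer runs at the slower of the
    node's upstream rate and the rate achievable to that host."""
    return sum(chunk_size / min(upstream, rate) for rate in link_rates)


def parallel_upload_seconds(chunk_size, upstream, link_rates):
    """All chunks at once (idealized): bounded below by total upstream
    capacity and by the slowest single link."""
    total = chunk_size * len(link_rates)
    slowest_link = max(chunk_size / rate for rate in link_rates)
    return max(total / upstream, slowest_link)
```

For example, take a 300 MB file with K=3 and N=10 (ten 100 MB chunks,
1000 MB sent in total), a 1 MB/s upstream, and one friend whose link
only sustains 0.25 MB/s while the other nine sustain 1 MB/s. Under this
model the serial upload takes 1300 seconds, because the upstream sits
mostly idle during the slow transfer, while the idealized parallel
upload takes 1000 seconds. With a helper, only about 300 MB would leave
my node at all.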

Thanks,
Eric