> So, if the originating host happens to be the closest for the given key in
> the entire network (which, in a network of a few hundred or at most a few
> thousand nodes and small file chunks is going to happen quite often), it
> won't be stored at all, despite the insert seemingly completing
> successfully?
>
> And of course, since nodes sometimes leave the network or change their
> position, even popular content's availability will deteriorate steadily as
> it's permanently stored on fewer and fewer datastores.
Not really, thanks to healing, which toad has implemented: every successful download helps make the file more accessible to others by re-inserting the blocks that failed to fetch. If some of the nodes that received the original data are now offline, the new nodes that are online and closest will receive it instead, and the data ends up duplicated if the original node returns. Combined with ECC, this should work fine even in a network with lots of churn and lost data.

This, I think, has a slight bias towards storing popular data on nodes with high availability, even more so than today. So popular data should "always" be around, not dependent on node X, which holds the crucial "last block", being online. Less popular data, though, might require waiting around for nodes holding it to pop up.

Healing really feels like the missing piece of the puzzle. I always thought straight LRU would work pretty badly, but my gut tells me that's not the case with healing in the mix. The problem, I felt, was that blocks accessed during successful downloads counted exactly the same as accesses to an incomplete file that is no longer fully retrievable from the network, and this could help keep old, unusable data in the stores, modulo people getting tired of requesting unsuccessful data.

---
John Bäckstrand