Using a commercial Cauchy Reed-Solomon implementation I get upwards of
600 MB/s on a high-end processor and 60 MB/s on a really low-end, 4+
year old Intel. This is with a 26-piece file, but testing hasn't shown
significant degradation even with a lot of pieces, though I admittedly
haven't tested up to 100. I would say that you should expect close to
30-60 MB/s locally with a decent implementation, but of course you'll
need to account for more latency with remote nodes (unless you
specifically mean just the decode once you have the parts).

-- Justin
Typos by iPhone
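For a sense of how one might reproduce numbers like these locally, here
is a minimal benchmark sketch using zfec, the erasure-coding library
Tahoe-LAFS itself ships. It assumes zfec's Encoder/Decoder API (encode
accepting an optional sequence of desired block numbers, decode
returning the k primary blocks in order); the (k, m) and block-size
choices are illustrative, not the parameters from the benchmark above.

    # Rough local encode/decode throughput for a (k, m) Reed-Solomon code.
    import os
    import time
    import zfec

    k, m = 26, 39                    # any 26 of 39 shares recover the file
    blocksize = 1 << 20              # 1 MiB per primary block
    primary = [os.urandom(blocksize) for _ in range(k)]

    enc = zfec.Encoder(k, m)
    t0 = time.perf_counter()
    # Ask only for the m-k secondary (check) blocks; the code is
    # systematic, so the k primary blocks are the input itself.
    secondary = enc.encode(primary, list(range(k, m)))
    elapsed = time.perf_counter() - t0
    print("encode: %.0f MB/s" % (k * blocksize / 2**20 / elapsed))

    # Worst case for the decoder: lose as many primary blocks as possible.
    nums = list(range(m - k, m))     # only the last k shares survive
    shares = primary + secondary
    have = [shares[i] for i in nums]
    t0 = time.perf_counter()
    recovered = zfec.Decoder(k, m).decode(have, nums)
    elapsed = time.perf_counter() - t0
    print("decode: %.0f MB/s" % (k * blocksize / 2**20 / elapsed))

    # Assumes decode returns the k primaries in order (systematic code).
    assert recovered == primary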
On Jan 4, 2014, at 2:22 PM, David Vorick <david.vor...@gmail.com> wrote:

> I've been looking at different options for erasure coding. Spinal codes
> seem too slow, and LT codes don't seem to be effective against an
> intelligent attacker (someone who gets to choose which nodes go
> offline).
>
> Which essentially leaves us with Reed-Solomon codes.
>
> If I have a file coded (using Reed-Solomon) into ~100 pieces, what is a
> reasonable decoding speed? Could I expect to get over 10 Mbps on a
> standard consumer processor?
>
> On Sun, Dec 1, 2013 at 4:37 PM, David Vorick <david.vor...@gmail.com> wrote:
>> Thanks Dirk, I'll be sure to check all those out as well. Haven't yet
>> heard of spinal codes.
>>
>> Natanael, all of the mining is based on the amount of storage that you
>> are contributing. If you are hosting 100 nodes each with 10GB, you will
>> mine the same amount as if you had just one node with 1TB. The only way
>> you could mine extra credits is if you could convince the system that
>> you are hosting more storage than you are actually hosting.
>>
>> On Sun, Dec 1, 2013 at 2:40 PM, <jason.john...@p7n.net> wrote:
>>> What if you gave them the node to use? Like they had to register for a
>>> node. I started something like this but sort of stopped because I'm
>>> lazy.
>>>
>>> From: tahoe-dev-boun...@tahoe-lafs.org
>>> [mailto:tahoe-dev-boun...@tahoe-lafs.org] On Behalf Of Natanael
>>> Sent: Sunday, December 1, 2013 1:37 PM
>>> To: David Vorick
>>> Cc: tahoe-dev@tahoe-lafs.org
>>> Subject: Re: Fwd: Erasure Coding
>>>
>>> Can't you pretend to run more nodes than you actually are running in
>>> order to "mine" more credits? What could prevent that?
>>>
>>> - Sent from my phone
>>>
>>> On 1 Dec 2013 17:25, "David Vorick" <david.vor...@gmail.com> wrote:
>>>
>>> ---------- Forwarded message ----------
>>> From: David Vorick <david.vor...@gmail.com>
>>> Date: Sun, Dec 1, 2013 at 11:25 AM
>>> Subject: Re: Erasure Coding
>>> To: Alex Elsayed <eternal...@gmail.com>
>>>
>>> Alex, thanks for those resources. I will check them out later this
>>> week.
>>>
>>> I'm trying to create something that will function as a market for
>>> cloud storage. People can rent out storage on the network for credit
>>> (a cryptocurrency - not bitcoin, but something heavily inspired by
>>> bitcoin and the other altcoins), and then people who have credit
>>> (which can be obtained by trading on an exchange, or by renting
>>> storage to the network) can rent storage from the network.
>>>
>>> So the clusters will be spread out over large distances. With RAID5
>>> and 5 disks, the network needs to communicate 4 bits to recover each
>>> lost bit. That's really expensive. The computational cost is not the
>>> concern; the bandwidth cost is the concern (though there are
>>> computational limits as well).
>>>
>>> When you buy storage, all of the redundancy and erasure coding happens
>>> behind the scenes. So a network that needs 3x redundancy will be 3x as
>>> expensive to rent storage from. To be competitive, this number should
>>> be as low as possible. If we had Reed-Solomon and infinite bandwidth,
>>> I think we could safely get the redundancy below 1.2. But with all the
>>> other requirements, I'm not sure what a reasonable minimum is.
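To put numbers on the repair-bandwidth point above: with a systematic
(k, m) Reed-Solomon code, any k of the m shares reconstruct the data,
but naively rebuilding one lost share still means fetching k surviving
shares, so RAID5 (k=4, m=5) moves four bytes over the network for every
byte restored, while the storage expansion is m/k. A small sketch with
illustrative parameters:

    # Storage expansion vs. naive repair traffic for a (k, m) RS code.
    def rs_costs(k, m, share_mib=1.0):
        expansion = m / k             # storage "redundancy" factor
        repair_mib = k * share_mib    # data fetched to rebuild one share
        return expansion, repair_mib

    for k, m in [(4, 5), (26, 39), (100, 120)]:
        exp, rep = rs_costs(k, m)
        print("k=%3d m=%3d  expansion %.2fx  repair %3.0f MiB per lost"
              " 1 MiB share" % (k, m, exp, rep))

The last row is the sort of configuration that would hit the sub-1.2
redundancy figure mentioned above, at the price of 100 MiB of transfer
to rebuild each lost 1 MiB share.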
>>>
>>> Since many people can be renting many different clusters, each machine
>>> on the network may (will) be participating in many clusters at once
>>> (probably in the hundreds to thousands). So the cost of handling a
>>> failure should be fairly cheap. I don't think this requirement is as
>>> extreme as it may sound, because if you are participating in 100
>>> clusters each renting an average of 50GB of storage, your overall
>>> expenses should be similar to participating in a few clusters each
>>> renting an average of 1TB. The important part is that you can keep up
>>> with multiple simultaneous network failures, and that a single node is
>>> never a bottleneck in the repair process.
>>>
>>> We need 100s-1000s of machines in a single cluster for multiple
>>> reasons. The first is that it makes the cluster roughly as stable as
>>> the network as a whole. If you have 100 machines randomly selected
>>> from the network, and on average 1% of the machines on the network
>>> fail per day, your cluster shouldn't stray too far from 1% failures
>>> per day. Even more so if you have 300 or 1000 machines. But another
>>> reason is that the network is used to mine currency based on how much
>>> storage you are contributing to the network. If there is some way you
>>> can trick the network into thinking you are storing data when you
>>> aren't (or you can somehow lie about the volume), then you've broken
>>> the network. Having many nodes in every cluster is one of the ways
>>> cheating is prevented. (There are a few others too, but they're
>>> off-topic.)
>>>
>>> Cluster size should be dynamic (fountain codes?) to support a cluster
>>> that grows and shrinks with demand. Imagine if some of the files
>>> become public (for example, YouTube starts hosting videos over this
>>> network). If one video goes viral, the bandwidth demands are going to
>>> spike and overwhelm the network. But if the network can automatically
>>> expand and shrink as demand changes, you may be able to solve the
>>> 'Reddit hug' problem.
>>>
>>> And finally, machines that only need to be online some of the time
>>> give the network a tolerance for things like power failures, without
>>> needing to immediately assume that the lost node is gone for good.
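One way to see the "cluster roughly as stable as the network" claim
above: if each node fails independently with probability 1% per day,
the number of daily failures in an n-node cluster is Binomial(n, 0.01),
and the odds of a day far above that average shrink as n grows. A quick
sketch, with the 3x-the-mean threshold chosen purely for illustration:

    # P(at most f of the n nodes fail in a day), failures ~ Binomial(n, p).
    from math import comb

    def p_at_most(n, p, f):
        return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(f + 1))

    p = 0.01                          # 1% of nodes fail per day
    for n in (100, 300, 1000):
        f = 3 * int(n * p)            # within 3x the expected daily failures
        print("n=%4d  P(failures <= %2d) = %.4f" % (n, f, p_at_most(n, p, f)))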
_______________________________________________
tahoe-dev mailing list
tahoe-dev@tahoe-lafs.org
https://tahoe-lafs.org/cgi-bin/mailman/listinfo/tahoe-dev