On 10/08/15 20:56, Omer Zak wrote: > All those discussions about inverting matrices over Z2 make me curious > to know what kind of problems can be solved by inverting such matrices. > > I suppose that the actual problem, with which Shachar is struggling, is > proprietary information. However, is it possible to indicate the kind of > problems which can be attacked by inverting a matrix over Z2? I think I'm not letting on too much by giving the broad area.
When calculating RAID-6 parity, one of the common methods is called "erasure coding". You have n data blocks in a stripe which you need to write to n+p actual disks, such that for any number of failures up to p, you can recover the original n data blocks. One of the ways to do that is by employing a family of algorithms called "Erasure codes", the most famous of which is Reed Solomon. The basic idea is that you write a matrix which is n over n+p, and you multiply the n data blocks by that matrix, resulting in n+p data blocks, which you then write to your disks. When p disks failed, you erase from your matrix the lines that correspond to the disks that failed, and invert the resulting n*n matrix. Multiplying the n disks you do have by that matrix result in the origin n data blocks. The obvious first requirement from your erasure code matrix, therefor, is that it be inversible after deleting any combination of p lines from it. This is the property I'm trying to verify in my project at hand. The usual erasure code matrix is calculated over "ze ozer finite field", i.e. Galois field. Frankly, I don't feel I understand Galois field, and in particular, the way it is used in this particular case, enough to explain why. One of the properties that make it attractive, however, is that addition over GF is XOR. As a result, almost all RAID erasure code matrices have their first n+1 lines look like this: 1 0 0 0 .... 0 1 0 0 .... 0 0 1 0 .... 0 0 0 1 .... ........1 0 0 ........0 1 0 ........0 0 1 1 1 1 ... 1 1 The practical upshot of the above is that the first n disks receive the same data they would have received without applying erasure coding (i.e. - if they are available, they can be read directly without going through the matrix, and more importantly, without reading the rest of the disks in the stripe). Furthermore, the first protection disk contains the XOR of all data disks in the stripe. In other words, if we lost only one disk, we can employ RAID-5 as usual. A side effect of the above is that the matrix is extremely sparse. Almost all lines (all but the protection disks) contain just one non-zero number, already on the diagonal. Using Z2 instead of GF gives certain performance benefits. It does have the cost of making the matrix much much much larger (hence the somewhat insane numbers quoted in my original mail. I am not really building a stripe of 272+12). Since the matrix is still extremely sparse, however, this does not cost the full n^3 quoted by Matan. Look up "evenodd" for an example of an algorithm that is based on XOR rather than GF. It is not phrased in matrix form, but that is just a convenience. Hope this gives enough of an insight to quelch your curiosity :-) Shachar
_______________________________________________ Linux-il mailing list Linux-il@cs.huji.ac.il http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il