Re: [viff-devel] FW: Bug in ViFF
Hi all

Today, Sebastiaan and I have been doing some serious thinking and digging into the VIFF code, and we are convinced that we have found the bug. The problem lies entirely in the multiplication protocol.

In Runtime.mul, products of shares are computed and re-shared. Secure Lagrange interpolation is then run to reconstruct a sharing of the product. With an odd number of players $n$ and threshold $t = (n-1)/2$, a contribution is needed from all parties. But if the threshold is lower, or there are more parties, then reconstruction does not require all shares. VIFF uses this to optimize the protocol, using only the first $2t+1$ shares received.

This does not work for the multiplication protocol: unless shares from the /same/ parties are used, there will be a mismatch. Say we have players 1, 2, 3, 4 and a threshold of 1. Each player $i$ re-shares its product with a polynomial $P_i(x)$. If one party uses the shares from players 1, 2, 3, then the interpolated polynomial is (leaving out the Lagrange recombination weights for readability)

  P_{1,2,3}(x) = P_1(x) + P_2(x) + P_3(x)

Another party could use players 2, 3, 4, thus interpolating

  P_{2,3,4}(x) = P_2(x) + P_3(x) + P_4(x)

Clearly this is not the same polynomial! Thus, the parties end up with inconsistent shares -- they hold points on different polynomials.

To solve the problem, the players must agree on which shares are used. A simple solution is to require all shares to be present. Alternatively (for the passive case) we could let only the first $2t+1$ parties re-share the products of their shares; this basically eliminates the need for the other parties altogether.

Regards, Tomas

___ viff-devel mailing list (http://viff.dk/) viff-devel@viff.dk http://lists.viff.dk/listinfo.cgi/viff-devel-viff.dk
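To see the mismatch concretely, here is a small self-contained Python sketch (toy prime field and helper names of my own invention; this is not the actual VIFF code): four players with threshold 1 re-share the products of their shares, and either subset of size 2t+1 alone gives a correct sharing of a*b -- the inconsistency only appears when different parties pick different subsets.

```python
import random

random.seed(1)
P = 1031  # small prime; toy field for illustration only

def share(secret, t, n):
    """Shamir-share `secret` with a random degree-t polynomial;
    player i (for i = 1..n) gets the value at x = i."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t)]
    return {x: sum(c * pow(x, k, P) for k, c in enumerate(coeffs)) % P
            for x in range(1, n + 1)}

def lagrange_weights(xs):
    """Recombination coefficients: f(0) == sum(w[x] * f(x) for x in xs)
    for any polynomial f of degree < len(xs)."""
    w = {}
    for xi in xs:
        num = den = 1
        for xj in xs:
            if xj != xi:
                num = num * -xj % P
                den = den * (xi - xj) % P
        w[xi] = num * pow(den, P - 2, P) % P
    return w

n, t = 4, 1
a_sh = share(5, t, n)   # sharing of a = 5
b_sh = share(7, t, n)   # sharing of b = 7
# Every player i re-shares the product of its shares, a_i * b_i.
resh = {i: share(a_sh[i] * b_sh[i] % P, t, n) for i in range(1, n + 1)}

def product_share(j, subset):
    """Player j's new share of a*b, built from re-shares of `subset` only."""
    w = lagrange_weights(subset)
    return sum(w[i] * resh[i][j] for i in subset) % P

# Any single subset of size 2t+1 gives a *correct* sharing of a*b = 35 ...
for subset in [(1, 2, 3), (2, 3, 4)]:
    new_shares = {j: product_share(j, subset) for j in range(1, n + 1)}
    w = lagrange_weights((1, 2))
    assert sum(w[j] * new_shares[j] for j in (1, 2)) % P == 35

# ... but the two subsets define *different* degree-t polynomials, so players
# that pick different subsets hold mutually inconsistent shares.
mixed = {1: product_share(1, (1, 2, 3)), 2: product_share(2, (2, 3, 4))}
w = lagrange_weights((1, 2))
print(sum(w[j] * mixed[j] for j in (1, 2)) % P)  # generally not 35
```

The final reconstruction from the mixed pair is from points on two different polynomials, which is exactly the inconsistency described above.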
Re: [viff-devel] Small VIFF language parser
Hi all

We have talked on and off about making a front-end compiler for VIFF, and today I figured that I would try making such a guy... So far it can only parse a simple language and spit out something which looks like equivalent Python code in which both branches of if-statements are run. So a program like this:

  if a:
    x = y
    if b:
      x = z
    else:
      x = w
    fi
  fi

is transformed into this:

  x = (a * y + (1 - a) * x)
  x = (a * b * z + (1 - a * b) * x)
  x = ((1 - a * b) * w + (1 - (1 - a * b)) * x)

which one could plug into a VIFF skeleton and run (not yet done).

Cool, but two serious comments:

1) Rather than

     x = (a * y + (1 - a) * x)

   you want

     x = (a * (y - x) + x)

   so you shave off a superfluous multiplication for each assignment.

2) As far as I can see, it doesn't work. :-( There is a small bug in the condition of the third expression: (1 - a * b) means !(a && b), but you want a && !b, which translates to a * (1 - b) = a - a*b.

The idea is that the conditions of if-statements are pushed down to all assignments done in the then- and else-branches, nothing more. Very basic, but the idea works (modulo the above problems). This is just a quick test to make people start thinking about what we want in such a language, and about which transformations we can and cannot do.

I think that we can translate anything to a list of working expressions (at least anything that doesn't involve outputting values). But I'm worried that the naive approach won't give us any reasonable efficiency except for trivial examples. I have a few suggestions for improvements, though I'm not sure how difficult they are to implement. Take the following as input for discussion...

1) When

     if b:
       x = z
     else:
       x = w
     fi

   is translated into two distinct statements:

     x = (a * b * (z - x) + x)
     x = ((a * (1 - b)) * (w - x) + x)

   you increase the depth of the arithmetic circuit, as the second expression must wait for the new x-value.
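For what it's worth, the two fixes above can be sanity-checked in plain Python by running the corrected straight-line program on 0/1 conditions (ordinary integers here, not VIFF shares; `mux` is a name I made up for the one-multiplication select):

```python
def mux(c, t, f):
    # oblivious select: t if c == 1, f if c == 0, using one multiplication
    return c * (t - f) + f

def transformed(a, b, x, y, z, w):
    # corrected straight-line version of the nested if-program above
    x = mux(a, y, x)             # if a: x = y
    x = mux(a * b, z, x)         # if a and b: x = z
    x = mux(a * (1 - b), w, x)   # if a and not b: x = w  (the fixed condition)
    return x

def reference(a, b, x, y, z, w):
    # the original branching program, run in the clear
    if a:
        x = y
        if b:
            x = z
        else:
            x = w
    return x

for a in (0, 1):
    for b in (0, 1):
        assert transformed(a, b, 10, 20, 30, 40) == reference(a, b, 10, 20, 30, 40)
```

The exhaustive loop over both conditions confirms that the fixed third condition a*(1 - b) makes the straight-line program agree with the branching one.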
However, as only one of b and !b will be true, the depth does not need to increase:

     a * ((b * (z - w) + w) - x) + x

   (or simply: a*b*(z - x) + (a*(1 - b))*(w - x) + x)

   I'm not sure how easy this is to handle in general, though; think of examples such as:

     if a:
       x = c*y
       y = d*x
     else:
       y = e*x
       x = f*y
     fi

   where we really want something like this:

     (x, y) = (a*(c*y - f*e*x) + f*e*x, a*(d*c*y - e*x) + e*x)

2) In the above translations a*b is computed multiple times. It would be a lot more efficient if it were moved into a temporary variable so it could be reused. The same is true in the hand-translation immediately above, where e*x is computed 4 times (2 times by itself and 2 times as part of f*e*x).

Hmmm, maybe using temporary variables partly solves the round-problem above as well... Just execute both branches and conditionally select the right results... What do you think?

/Tomas

The program is attached below -- you will need to grab simplegeneric and PLY (Python Lex-Yacc) to use it, but both modules are easy to install:

  http://pypi.python.org/pypi/simplegeneric
  http://www.dabeaz.com/ply/

You run the program by giving it the name of a file to parse, and it will print the results on stdout.

-- Martin Geisler
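A quick plain-Python check (0/1 conditions and ordinary integers, not shares) that both depth-1 rewrites agree with the sequential two-statement translation:

```python
def fused(a, b, x, z, w):
    # single-level version: select between z and w first, then apply a
    return a * ((b * (z - w) + w) - x) + x

def flat(a, b, x, z, w):
    # the flat three-term variant
    return a * b * (z - x) + (a * (1 - b)) * (w - x) + x

def sequential(a, b, x, z, w):
    # the original two-statement translation, where the second statement
    # must wait for the new x-value (extra circuit depth)
    x = a * b * (z - x) + x
    x = (a * (1 - b)) * (w - x) + x
    return x

x0, z0, w0 = 10, 30, 40
for a in (0, 1):
    for b in (0, 1):
        expected = (z0 if b else w0) if a else x0
        assert fused(a, b, x0, z0, w0) == expected
        assert flat(a, b, x0, z0, w0) == expected
        assert sequential(a, b, x0, z0, w0) == expected
```

All three agree on every condition combination, so the rewrite saves a round of depth without changing the result.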
Re: [viff-devel] Async Batching with Twisted: A Walkthrough
Hi all

I just saw this blog post on Planet Python:

  http://oubiwann.blogspot.com/2008/06/async-batching-with-twisted-walkthrough.html

Examples 7 and 8 seem interesting for us: they deal with how one can limit the number of outstanding requests. I hope we might be able to use this to fight the memory usage problems discussed here:

  http://article.gmane.org/gmane.comp.cryptography.viff.devel/256

I've just gone over it lightly, but it looks like exactly the right tool for the job. I'll let you know how my progress goes.

/Tomas
Re: [viff-devel] Fixed memory leak
Hi

Okay, thanks for testing! But maybe something else changed since viff-0.4, so could you try revisions 563bcc37fe47 and f993269c2660?

True. Will do so later this week.

It is quite easy in a Mercurial clone: if you have all local modifications committed, then just do 'hg update 563bcc37fe47' to change the working copy to that revision, and 'hg update tip' to get back to the tip.

[snip: memory profiling suggestion]

There they trace the boot sequence, and each process is a separate bar. It would be very cool if we could produce something similar!

I'm sure you can do that :-) I'm still working on getting to grips with Python and getting the immediate things done. I'm still learning, so pulling in more tools would just confuse me further :-)

[snip: gigantic computation]

Say 4 bytes per field element (e.g. using 32-bit integers, as could easily be done in C) and 400,000 field elements (100,000 shares from each of the three parties plus 100,000 reconstructed secrets); this amounts to around an astonishing 400 kB (per party).

Right, 100,000 32-bit values would be 400 KB. If you allocated them as integers using the array module, then they would take up close to 400 KB in Python too. But each Share holds a FieldElement, which holds two integers. So some of the massive overhead is simply due to more objects being tossed around. Then there is the per-object malloc overhead, reference counters, list overallocation, etc... :-/

But from 400 kB to more than 400 MB? That's a factor of a thousand!

By the way, there is now an open ticket at Twisted about the memory consumption of Deferred:

  http://twistedmatrix.com/trac/ticket/3245

I made a patch and I hope they will eventually include it. Another guy reported that using __slots__ cut the memory usage in half.

Cool. Do you see a benefit... and how big?

One positive thing here is that I've gotten an idea of how to potentially remove /some/ of that overhead (honest-but-curious only). Have you tried the code I sent you?
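As a rough illustration of where part of the factor comes from, here is a toy CPython measurement. Exact byte counts vary by Python version and platform, and the two classes below are simplified stand-ins of my own, not VIFF's actual Share/FieldElement:

```python
import sys
from array import array

class FatFieldElement:
    """A value/modulus pair with a normal per-instance __dict__."""
    def __init__(self, value, modulus):
        self.value = value
        self.modulus = modulus

class SlimFieldElement:
    """Same data, but __slots__ drops the per-instance __dict__."""
    __slots__ = ("value", "modulus")
    def __init__(self, value, modulus):
        self.value = value
        self.modulus = modulus

n = 100000
packed = array("I", range(n))       # machine integers, packed contiguously
print(packed.itemsize * n)          # 400000 bytes where itemsize == 4

fat = FatFieldElement(1, 1031)
slim = SlimFieldElement(1, 1031)
# The slotted object has no __dict__, so it is strictly smaller than
# the plain object plus its attribute dictionary.
print(sys.getsizeof(fat) + sys.getsizeof(fat.__dict__))
print(sys.getsizeof(slim))
```

This only accounts for the object/dict overhead of the field elements themselves; the Share and Deferred machinery wrapped around each element adds more on top.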
I posted it here as well:

  http://article.gmane.org/gmane.comp.python.twisted/16519

and got two comments saying that similar functionality already exists in Twisted :-) There is a DeferredQueue and a DeferredSemaphore in twisted.internet.defer which could be used to limit concurrency.

Not yet, will do though. I'll also look into the Twisted solutions.

The observation is that we don't need to store all shares and reconstruct at the end. We can just keep track of the sum of the shares (times their lambdas) we've received; whenever a new one arrives, we add it to that accumulator. This might not make a big difference in a 3-party setting, but if many parties (e.g. 10) are involved, then this seems like a good way to reduce memory requirements.

Do you mean that we can sort of recombine as we go? I can see how that makes sense if we know which recombination vector to use -- and if people are honest-but-curious (as you write), then that is no problem.

Exactly. And of course it doesn't work with active cheating, because the parties must check stuff later on.

In the honest-but-curious setting we could even make it so that only t+1 players ever send their shares. If there are S such subsets, then we could choose subset number hash(program_counter) % S for each operation -- that would distribute the load evenly over all parties.

How about just sending to the next t players, i.e. player P_i sends to P_{i+1}, ..., P_{i+t}, where the indices are implicitly mod the number of players? As everyone always sends the same amount, this also distributes the load evenly :-) Alternatively, if certain players are more powerful or have better connections, they can do more work, thereby taking the load off weak players. The exact details for a protocol run could be set up through config files.

Just brainstorming a bit.

Regards, Tomas
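The recombine-as-we-go accumulator could look something like this sketch (toy prime field and class names of my own; honest-but-curious only, with the recombination subset fixed in advance so the lambdas are known up front):

```python
P = 1031  # toy prime field

def lagrange_weights(xs, p=P):
    """Coefficients w with f(0) == sum(w[x] * f(x)) for deg(f) < len(xs)."""
    w = {}
    for xi in xs:
        num = den = 1
        for xj in xs:
            if xj != xi:
                num = num * -xj % p
                den = den * (xi - xj) % p
        w[xi] = num * pow(den, p - 2, p) % p
    return w

class IncrementalRecombiner:
    """Folds w[x] * share into a running sum as each share arrives,
    so no per-share storage is needed: O(1) memory per secret."""
    def __init__(self, xs):
        self.weights = lagrange_weights(xs)
        self.pending = set(xs)
        self.acc = 0

    def add_share(self, x, value):
        self.pending.discard(x)
        self.acc = (self.acc + self.weights[x] * value) % P

    def result(self):
        assert not self.pending, "still waiting for shares"
        return self.acc

# degree-1 sharing of the secret 42: f(x) = 42 + 17x mod P
shares = {x: (42 + 17 * x) % P for x in (1, 2)}
r = IncrementalRecombiner((1, 2))
for x, v in shares.items():
    r.add_share(x, v)   # e.g. called from a Deferred callback as data arrives
print(r.result())       # -> 42
```

With many parties this replaces a buffer of n shares per secret by a single accumulator, which is exactly the memory saving suggested above; the fixed subset is also what makes it compatible with the t+1-senders idea.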