Re: [viff-devel] FW: Bug in ViFF

2008-10-06 Thread T. Toft


Hi all


Today, Sebastiaan and I have been doing some serious thinking and
looking into the VIFF code, and we feel convinced that we've found the
bug.



The problem lies entirely in the multiplication protocol. In
Runtime.mul, each party computes the product of its shares and
reshares it. Lagrange interpolation over the received sub-shares then
yields each party's new share of the product.

With an odd number of players $n$ and threshold $t = (n-1)/2$, a
contribution is needed from all parties, since $2t+1 = n$. But if the
threshold is lower, or there are more parties, then not all shares are
required. VIFF uses this to optimize the protocol, recombining only
the first $2t+1$ shares received.

This doesn't work for the multiplication protocol: unless shares from
the /same/ parties are used, there is a mismatch. Say we have players
1, 2, 3, 4 and threshold $t = 1$. Each party $i$ reshares its product
with a degree-$t$ polynomial $P_i(x)$. A party recombining the
sub-shares of players 1, 2, 3 (with the Lagrange coefficients
$\lambda_i$ for that subset) ends up with a point on

P_{1,2,3}(x) = \lambda_1 P_1(x) + \lambda_2 P_2(x) + \lambda_3 P_3(x)

Another party could use players 2, 3, 4 (with coefficients
$\lambda'_i$ for that subset), thus interpolating

P_{2,3,4}(x) = \lambda'_2 P_2(x) + \lambda'_3 P_3(x) + \lambda'_4 P_4(x)

Clearly these are not the same polynomial! They agree at $x = 0$ --
both evaluate to the product there -- but nowhere else in general.
Thus, the parties end up with inconsistent shares: they hold points on
different polynomials.
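
To make this concrete, here is a minimal, self-contained Python sketch
(not VIFF code; the field size, secrets, and helper names are made up
for illustration). Parties 1 and 2 recombine their product sub-shares
over different subsets, and a later reconstruction from their two
shares comes out wrong:

  import random

  F = 10007  # a small prime; the field size is arbitrary for this sketch

  def share(secret, t, n):
      # Shamir-share `secret` with a random degree-t polynomial; shares at x = 1..n.
      coeffs = [secret] + [random.randrange(F) for _ in range(t)]
      return [sum(c * pow(x, k, F) for k, c in enumerate(coeffs)) % F
              for x in range(1, n + 1)]

  def recombine(points):
      # Lagrange-interpolate the (x, y) points at x = 0.
      result = 0
      for xi, yi in points:
          num, den = 1, 1
          for xj, _ in points:
              if xj != xi:
                  num = num * -xj % F
                  den = den * (xi - xj) % F
          result = (result + yi * num * pow(den, F - 2, F)) % F
      return result

  n, t = 4, 1
  a, b = share(3, t, n), share(5, t, n)

  # Each party i reshares its local product a_i * b_i with a degree-t polynomial.
  sub = [share(a[i] * b[i] % F, t, n) for i in range(n)]

  def product_share(j, subset):
      # Party j's new share: recombine the sub-shares received from `subset`.
      return recombine([(i, sub[i - 1][j - 1]) for i in subset])

  # Consistent: parties 1 and 2 both use the sub-shares of players 1, 2, 3.
  s1 = product_share(1, [1, 2, 3])
  s2 = product_share(2, [1, 2, 3])
  print(recombine([(1, s1), (2, s2)]))      # 15 = 3 * 5, as expected

  # Inconsistent: party 2 recombines over players 2, 3, 4 instead.
  s2_bad = product_share(2, [2, 3, 4])
  print(recombine([(1, s1), (2, s2_bad)]))  # almost certainly not 15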

To solve the problem, the players must agree on which shares are used. A
simple solution is to wait for all shares to be present. Alternatively
(in the passive case) we could say that only the first $2t+1$ parties
reshare the products of their shares. This basically eliminates the need
for any of the other parties to send anything during multiplications.

Regards,
  Tomas




Re: [viff-devel] Small VIFF language parser

2008-07-08 Thread T. Toft
Hi all

 We have talked on and off about making a front-end compiler for VIFF and
 today I figured that I would try making such a guy...

 So far it can only parse a simple language and spit out something which
 looks like equivalent Python code in which both branches of
 if-statements are run. So a program like this:

   if a:
     x = y
     if b:
       x = z
     else:
       x = w
     fi
   fi

 is transformed into this

   x = (a * y + (1 - a) * x)
   x = (a * b * z + (1 - a * b) * x)
   x = ((1 - a * b) * w + (1 - (1 - a * b)) * x)

 which one could plug into a VIFF skeleton and run (not yet done).

Cool, but two serious comments:

1) Rather than

   x = (a * y + (1 - a) * x)

you want

   x = (a * (y - x) + x)

so you shave off a superfluous multiplication for each assignment: the
two are equal, but the first form costs two secure multiplications
while the second costs only one.

2) As far as I can see, it doesn't work. :-( There is a small bug in the
calculation of the third expression:

(1 - a * b) means !(a && b), but you want (a && !b), which translates to:

   a * (1 - b) = a - a * b
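
A quick throwaway check (0/1-valued conditions) makes the difference
obvious:

  # The buggy guard (1 - a*b) is 1 whenever a*b = 0, so it wrongly selects
  # w in both a = 0 cases; a*(1 - b) fires only for a = 1, b = 0.
  for a in (0, 1):
      for b in (0, 1):
          assert a * (1 - b) == (1 if (a and not b) else 0)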


 The idea is that the conditions in if-statements are pushed down to all
 assignments done in the then- and else-branches, nothing more.

Very basic, but the idea works (modulo the above problems).

 This is just a quick test to make people start thinking about what we
 want in such a language and to make people think about what kind of
 transformation we can do and which we cannot do.

I think that we can translate anything to a list of working expressions
(at least anything that doesn't involve outputting values). But I'm
worried that the naive approach won't give us any reasonable efficiency
except for trivial examples.

I have a few suggestions for improvements, but I'm not sure how difficult
they are to implement. Take the following as input for discussion...

1) When
if b:
  x = z
else:
  x = w
fi

(nested inside the outer "if a:" from the example above) is translated
into two distinct statements:

  x = (a * b * (z - x) + x)
  x = ((a * (1 - b)) * (w - x) + x)

you increase the depth of the arithmetic circuit, as the second expression
must wait for the new x-value. However, since exactly one of b and !b is
true, the depth does not need to increase:

  a * ((b * (z - w) + w) - x) + x

(or simply: a*b*(z-x) + (a*(1-b))*(w-x) + x)
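
As a quick sanity check (throwaway code; x0, z, w are arbitrary test
values), both one-round forms agree with the two-statement translation
for all 0/1 conditions:

  x0, z, w = 7, 11, 13   # value of x before the if, and the two branch values
  for a in (0, 1):
      for b in (0, 1):
          # two-statement translation (depth 2)
          x = (a * b * (z - x0) + x0)
          x = ((a * (1 - b)) * (w - x) + x)
          # one-round forms (depth 1)
          nested = a * ((b * (z - w) + w) - x0) + x0
          flat = a * b * (z - x0) + (a * (1 - b)) * (w - x0) + x0
          assert x == nested == flat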


I'm not sure how easy this is to handle in general, though; think of
examples such as:

  if a:
x = c*y
y = d*x
  else:
y = e*x
x = f*y
  fi

where we really want something like this:

(x, y) = (a*(c*y - f*e*x) + f*e*x,   a*(d*c*y - e*x) + e*x)


2) In the above translations, a*b is computed multiple times. It would be a
lot more efficient if it were moved into a temporary variable so it could
be reused. The same is true in the hand-translation immediately above,
where e*x is computed 4 times (2 times by itself and 2 times as part of
f*e*x).

Hmmm, maybe using temporary variables partly solves the round problem
above as well... Just execute both branches and conditionally select the
right results... What do you think?
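
For concreteness, a sketch of what the transformed swap example could
look like with temporaries (t1..t4 stand for compiler-generated names;
the toy values just make the fragment runnable):

  a, x, y, c, d, e, f = 1, 2, 3, 5, 7, 11, 13

  t1 = e * x                  # shared by both else-branch results
  t2 = f * t1                 # f*e*x: the else-branch value of x
  t3 = c * y                  # the then-branch value of x
  t4 = d * t3                 # d*c*y: the then-branch value of y
  x, y = a * (t3 - t2) + t2, a * (t4 - t1) + t1
  assert (x, y) == (15, 105)  # a = 1, so the then-branch results are selected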

/Tomas

 The program is attached below -- you will need to grab simplegeneric and
 PLY (Python Lex-Yacc) to use it, but both modules are easy to install:

   http://pypi.python.org/pypi/simplegeneric
   http://www.dabeaz.com/ply/

 You run the program by giving it the name of a file to parse and it will
 tell you the results on stdout.


 --
 Martin Geisler




Re: [viff-devel] Async Batching with Twisted: A Walkthrough

2008-06-23 Thread T. Toft

Hi all

 I just saw this blog post on Planet Python:

   
 http://oubiwann.blogspot.com/2008/06/async-batching-with-twisted-walkthrough.html

 Examples 7 and 8 seem interesting for us: they deal with how one can
 limit the number of outstanding requests. I hope we might be able to
 use this to fight the memory usage problems discussed here:

   http://article.gmane.org/gmane.comp.cryptography.viff.devel/256

I've just gone over it lightly, and it seems like exactly the right
tool for the job. I'll let you know how it goes.


/Tomas


Re: [viff-devel] Fixed memory leak

2008-05-26 Thread T. Toft
Hi

 Okay, thanks for testing! But maybe something else changed since
 viff-0.4, so could you try revisions 563bcc37fe47 and f993269c2660?

True. Will do so later this week.

 It is quite easy in a Mercurial clone: if you have all local
 modifications committed, then just do a 'hg update 563bcc37fe47' to
 change the working copy to that revision. Do a 'hg update tip' to get
 back to the tip.



[snip: memory profiling suggestion]

 There they trace the boot sequence and each process is a separate bar.
 It would be very cool if we could produce something similar!

I'm sure you can do that :-) I'm still getting to grips with Python and
getting the immediate things done; trying to bring in more tools now
would just confuse me further :-)

[snip: gigantic computation]

 Say 4 bytes per field element (using 32-bit integers, as could easily
 be done in C) and 400,000 field elements (100,000 shares from each of
 the three parties and 100,000 reconstructed secrets); this amounts to
 around an astonishing 400kB (per party).

 Right, 100,000 32-bit values would be 400 KB. If you allocated them as
 integers using the array module, then they would take up close to 400 KB
 in Python. But each Share holds a FieldElement which holds two integers.
 So some of the massive overhead is simply due to more objects being
 tossed around. Then there is per-object malloc overhead, reference
 counters, list overallocation, etc... :-/

But from 400kB to more than 400MB? It's a factor of a thousand!

 By the way, there is now an open ticket at Twisted about the memory
 consumption of Deferred:

   http://twistedmatrix.com/trac/ticket/3245

 I made a patch and I hope they will eventually include it. Another guy
 reported that using __slots__ cut the memory usage in half.

Cool. Do you see a benefit... and how big?

 One positive thing here is that I've gotten an idea of how to
 potentially remove /some/ of that overhead (honest-but-curious only).

 Have you tried the code I sent you? I posted it here as well:

   http://article.gmane.org/gmane.comp.python.twisted/16519

 and got two comments that said that similar functionality already
 existed in Twisted :-) There is a DeferredQueue and DeferredSemaphore in
 twisted.internet.defer which could be used to limit concurrency.

Not yet, will do though. I'll also look into the Twisted solutions.
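
For reference, a minimal sketch of the DeferredSemaphore approach (the
work function and the numbers are made up; the API calls are standard
Twisted):

  from twisted.internet import defer, reactor, task

  sem = defer.DeferredSemaphore(tokens=10)  # at most 10 requests in flight

  def work(i):
      # Stand-in for one asynchronous operation returning a Deferred.
      return task.deferLater(reactor, 0.01, lambda: i * i)

  def main():
      # sem.run() queues the calls; only 10 are ever outstanding at once.
      ds = [sem.run(work, i) for i in range(1000)]
      d = defer.gatherResults(ds)
      d.addCallback(lambda rs: print("done,", len(rs), "results"))
      d.addBoth(lambda _: reactor.stop())

  reactor.callWhenRunning(main)
  reactor.run()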

 The observation is that we don't need to store all shares and
 reconstruct at the end. We can just keep track of the sum of the
 shares (times their lambdas) we've received. Whenever a new one
 arrives we add it to that accumulator. This might not make a big
 difference in a 3-party setting, but if many (e.g. 10) parties are
 involved, then this seems like a good way to reduce memory
 requirements.

 Do you mean that we can sort of recombine as we go? I can see how that
 makes sense if we know which recombination vector to use -- and if
 people are honest-but-curious (as you write) then that is no problem.

Exactly. And of course it doesn't work with active cheating, because the
parties must check stuff later on.
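
A minimal sketch of the accumulator idea (passive security only; class
and method names are made up, and the recombination vector must be
agreed on in advance):

  class RecombineAccumulator:
      # Fold incoming shares into a running sum of lambda_i * share_i
      # instead of buffering them all until the end.
      def __init__(self, modulus, lambdas, expected):
          self.modulus = modulus
          self.lambdas = lambdas    # fixed recombination vector, keyed by party id
          self.expected = expected  # how many shares we are waiting for
          self.total = 0
          self.seen = 0

      def add_share(self, party_id, share):
          # The share is folded in and can be garbage-collected right away.
          self.total = (self.total + self.lambdas[party_id] * share) % self.modulus
          self.seen += 1
          return self.seen == self.expected  # True once the secret is complete

      def secret(self):
          assert self.seen == self.expected, "still waiting for shares"
          return self.total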

 In the honest-but-curious setting we could even make it so that only t+1
 players ever send their shares. If there are S such subsets, then we
 could choose subset number hash(program_counter) % S for each
 operation -- that would distribute the load evenly over all parties.

How about just sending to the next t players, i.e. player P_i sends to
P_{i+1},...,P_{i+t}, where the indices are implicitly mod the number of
players? As everyone always sends the same amount, this also distributes
the load evenly :-)
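
In code, the recipient set is just (a throwaway illustration, 1-based
party ids):

  def recipients(i, t, n):
      # Player P_i sends to P_{i+1}, ..., P_{i+t}, wrapping around mod n.
      return [(i + k - 1) % n + 1 for k in range(1, t + 1)]

  assert recipients(4, 2, 4) == [1, 2]  # n = 4, t = 2: P_4 sends to P_1, P_2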

Alternatively, if certain players are more powerful or have better
connections, they can do more work, taking the load off weaker players.
The exact details for a protocol run could be set up through config
files.
Just brainstorming a bit.


Regards,
  Tomas




