Dear Prof. Danilo Gligoroski:

Thank you very much for the offer to help evaluate the performance of SHA-3 candidates in the context of Tahoe-LAFS. In this letter I sketch out how we evaluate Tahoe-LAFS performance and what we care about for the future.
There are two implementations of Tahoe-LAFS to consider: Tahoe-LAFS v2 as it will be implemented in 2011 and Tahoe-LAFS v2 as it will be implemented in 2021.

The first is Tahoe-LAFS v2 as it will be implemented in the next year or so. This implementation will very likely continue to be written primarily in Python, except for the crypto and erasure-coding libraries, which will probably continue to be written in C or C++. The deployment platforms will probably be desktops, laptops, servers, netbooks, smartbooks, handhelds, and embedded devices like the SheevaPlug [1] or Beagleboard [2]. The CPU architectures will probably be mostly amd64, ARM, and x86, with perhaps a few deployments on other architectures such as UltraSparc. Deployments might sometimes be able to take advantage of hardware acceleration, such as that available in the UltraSparc T2 [3], the next generation of Intel chips [4], the Via Padlock [5], or add-in cards (usually acceleration of AES and SHA-256).

Based on a few ad hoc measurements we've made so far, I expect that the performance burden of interpreting Python, waiting for network round trips, and waiting for disk seeks will usually be larger than the cost of hash function computation. There may currently be some operations which are significantly impacted by the efficiency of hash function computation: in particular, an immutable file upload makes a pass over the whole file to compute its secure hash before it begins encrypting and uploading it. That part may be somewhat limited by the efficiency of the hash function; I'm not sure. Other parts of Tahoe-LAFS behavior may also be affected by the efficiency of the hash function; I'm not sure. Measuring and documenting the concrete performance of the current version of Tahoe-LAFS in various deployment scenarios, with various hash function implementations, would help guide our design process.
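To make the upload pre-pass concrete, here is an illustrative sketch of streaming a file through a hash before any encryption or upload begins. This is not Tahoe-LAFS's actual upload code; the chunk size and choice of SHA-256 are just assumptions for illustration:

```python
import hashlib

def prehash_file(path, chunk_size=2**16, hash_name="sha256"):
    """Illustrative pre-pass: stream the whole file through a hash
    before any encryption or upload begins.  One full read of the
    file is spent on this pass, which is why raw hash throughput
    can matter for immutable file upload."""
    h = hashlib.new(hash_name)
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            h.update(chunk)
    return h.hexdigest()
```

Timing this pre-pass alone, on large files, with each candidate hash function plugged in, would show how much of the upload cost it accounts for.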
We will be improving on these known limitations over time. The Python interpreter will become more efficient thanks to projects like PyPy [6], new download code (in Tahoe-LAFS v1.7 in May of this year) will reduce the time spent waiting for network traffic, and so on. Also, different deployment scenarios may become common. The typical deployment today is a powerful amd64 laptop, desktop, or server serving just one user and relying on a high-latency spinning disk, but people could start using Tahoe-LAFS in a situation where a small ARM chip (such as in the PogoPlug) serves a large number of users simultaneously and has very fast, low-latency network and SSD storage. That would make the overall performance of the system less dependent on network and disk latency and more dependent on the crypto and Python-interpretation work that the CPU is doing. François Deppierraz recently ran a simple measurement on his embedded ARM device and reported that hashing and encryption are potentially significant performance issues on that system [7].

Now, what about Tahoe-LAFS v2 as it will be implemented in 2021? Well, Tahoe-LAFS is useful for long-term storage and has a strong commitment to backwards compatibility. People who store data (potentially petabytes or even more) using the version of Tahoe-LAFS v2 available in 2011 will have to use the same crypto algorithms and data structures when they test the integrity of their files and read or update their contents in 2021. The implementation of Tahoe-LAFS v2 that they use in 2021 may be substantially different: it might be written entirely in a compiled language like C or C++, or entirely in a high-level language such as Python or Javascript; it might rely on new just-in-time compilation techniques that reduce the cost of interpreting Python or Javascript; the network, storage, memory, and CPU architectures may be different.
The deployment scenarios may be different too: perhaps people will want to rely on tiny computers embedded into their eyeglasses to do the crypto computations that assure them of the integrity of the "augmented reality" images they are looking at. Therefore, in designing Tahoe-LAFS v2, I don't just say "It doesn't matter how fast the crypto algorithms are because the Python interpreter is slow." I pay attention to factors which could affect future implementations, such as how compactly and memory-efficiently an algorithm can be implemented in an ASIC (for those eyeglasses that I want), and how parallelizable our crypto algorithms and data structures are (for those thousand-core devices that people will be carrying around in 2021), and so on. On the other hand, we don't focus exclusively on such far-future performance issues; we also measure and improve current performance, picking the low-hanging fruit and then iterating. (At the moment, I think the low-hanging fruit is: 1. optimize server selection, pipelining, and fail-over in download, and 2. replace RSA with ECDSA.)

I would love your help! I think the most important thing we need next is a reproducible, automated benchmark which shows how the performance of the operations we care about is affected by the performance of the hash function. We could then run this automated benchmark on our buildbot farm [8] and get an idea of how the performance of the current implementation varies across platforms. Also note that we are about to initiate the "100 Year Cryptography" project, which will include combining two secure hashes using the Comb4P combiner from [9]. We will want to measure the resulting combined hash function the same way we measure individual hash functions.

I have more to say about these topics, including how to analyze François's measurements on his ARM box, and how we have to define a tree-hashing mode of operation and would need help measuring it. But now I must go to sleep.
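As a starting point, a minimal version of such a benchmark could look like the sketch below. It times only raw hash throughput using Python's hashlib and timeit; the hash names and message size are assumptions, and a real harness would instead drive actual Tahoe-LAFS upload and download operations on each buildbot platform:

```python
import hashlib
import timeit

def hash_throughput(hash_name, size=2**20, repeats=5):
    """Measure throughput in MB/s for one pass of `hash_name` over
    `size` bytes.  A stand-in for the per-hash-function numbers a
    full Tahoe-LAFS benchmark would collect on each platform."""
    data = b"\x00" * size
    timer = timeit.Timer(lambda: hashlib.new(hash_name, data).digest())
    best = min(timer.repeat(repeat=repeats, number=1))
    return (size / 1e6) / best

if __name__ == "__main__":
    for name in ("md5", "sha1", "sha256", "sha512"):
        print("%-8s %8.1f MB/s" % (name, hash_throughput(name)))
```

Swapping in SHA-3 candidate implementations (or a Comb4P combination of two hashes) for the hashlib names would let the same harness compare them directly on each machine in the farm.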
:-)

Regards,

Zooko

[1] http://en.wikipedia.org/wiki/SheevaPlug
[2] http://beagleboard.org/
[3] http://en.wikipedia.org/wiki/UltraSPARC_T2
[4] http://en.wikipedia.org/wiki/AES_instruction_set
[5] http://www.via.com.tw/en/initiatives/padlock/hardware.jsp
[6] http://speed.pypy.org/
[7] http://allmydata.org/pipermail/tahoe-dev/2010-March/004150.html
[8] http://allmydata.org/buildbot
[9] http://tuprints.ulb.tu-darmstadt.de/2094/1/thesis.lehmann.pdf

_______________________________________________
tahoe-dev mailing list
tahoe-dev@allmydata.org
http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev