Hi Bryan,

First, thanks a lot for your comments.
I am answering them inline.

Bryan Turner wrote:
Arnaud,

Thanks for the new paper, there is a lot of nice data here. The presented graphs and analysis indeed suggest that the BitTorrent protocol's choices achieve near-optimal data distribution in its paradigm (single file, block-based content distribution). I agree with your suggestions for fairness criteria, and my own suggestions are similar (my suggestion is for a reputation-based system to model leechers and free-riders).
The first point I want to make clear is that we place ourselves in the context of P2P file delivery in the current Internet. This context has one important implication: network connectivity is good and everybody can talk to everybody (two NATed peers cannot communicate with each other directly, but this is not a network issue, it is an end-user issue).
You will see in the following why this is important.

That should be clear in the paper; if it is not, tell me so that I can improve the argumentation.


However, I disagree with several conclusions you make along the way. A brief overview:
- Byte-for-byte fairness is not appropriate
I still claim this holds in the context of P2P file transfer (see below).
- Network Coding is not justified
In our context, it is not. Network coding can be a good solution in the context of ad hoc networks, or in any other network where connectivity is a problem, i.e., when local rarest first does not work optimally. Note, however, that network coding requires significant computing capabilities that a sensor network, or an ad hoc network based on basic terminals (e.g., mobile phones), cannot afford for now.

There are other scenarios where network coding can help.
But I still claim that, in the current Internet and on the representative set of real torrents we monitored, network coding would not improve performance enough to be justified.


Fairness:

I am unconvinced by the arguments against byte-for-byte fairness. Your evidence to support this is that tit-for-tat does not take into account the excess capacity of some torrents (ie: more seeders than leechers). Yet this situation is equivalent under byte-for-byte and choking rules. Leechers are upholding the byte-for-byte policy (seeders cannot, as you mention) and thus only leechers are restricting their upload capacities to free-riders, and rightly so. Seeds (who hold the excess capacity) are distributing this capacity fairly to all peers. Thus, choking and byte-for-byte are equivalent in this regard - free riders receive a share of the bandwidth equal to the fair distribution given by seeders in both cases.
I am not sure I understand your point.
The byte-for-byte (BFB) algorithm and the choke algorithm (CA) are far from equivalent. None of the studies on BFB that I am aware of mention the case of seeds. They simply say that peers must not receive more than they give; this is the definition of BFB. You can introduce a threshold, but that does not change the main idea, and no solution has been proposed to define a dynamic threshold.

It is not correct to say that peers must not receive more than they send. As I wrote in the report, leechers have asymmetric capacity most of the time, and seeds offer capacity for free. With BFB you cannot benefit from this excess capacity. The only solution is to define a dynamic threshold, but this is an optimization problem that has not been solved, and that I believe is very hard to solve in practice, for no benefit compared to CA.
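
To make the excess-capacity point concrete, here is a toy back-of-the-envelope sketch in Python (the numbers and the snippet are mine, purely for illustration, not from the paper):

    # Toy illustration (made-up numbers): with asymmetric leecher capacities
    # and seed capacity offered for free, a strict byte-for-byte rule caps a
    # leecher's download at its own upload rate, leaving the seed's excess
    # capacity unused; the choke algorithm imposes no such cap.

    leecher_upload_kBps   = 50    # typical asymmetric (ADSL-like) leecher
    leecher_download_kBps = 500
    seed_capacity_kBps    = 400   # capacity the seed offers for free

    # Strict BFB: a leecher may not receive more than it sends.
    bfb_download = min(leecher_download_kBps, leecher_upload_kBps)

    # CA: nothing prevents the leecher from also using the seed capacity it
    # is granted, up to its physical download capacity.
    ca_download = min(leecher_download_kBps,
                      leecher_upload_kBps + seed_capacity_kBps)

    print(f"BFB: {bfb_download} kB/s, CA: {ca_download} kB/s")
    # BFB: 50 kB/s, CA: 450 kB/s -- the seed's excess capacity is wasted
    # under BFB.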

Remember that the main (and single) argument in all the studies that discuss BFB solutions is that free riders can get a high level of service from a torrent. In all the studies I am aware of, this problem comes from the old choke algorithm in seed state, which favors fast downloaders. With the new choke algorithm in seed state, all the observed unfairness disappears (in the sense of the fairness I define in the paper).
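
Roughly, the idea of the new algorithm in seed state is to rotate the seed's upload slots over the interested leechers instead of rewarding the fastest downloaders, so every leecher gets about the same service time from the seed. Here is a simplified Python sketch of that idea only; the slot count, the random slot, and the Peer fields are illustrative, and the exact algorithm (unchoke periods, etc.) is in the paper:

    # Simplified sketch: the seed rotates its upload slots (round-robin-like
    # service based on the time a peer was last unchoked), plus one random
    # slot to let newcomers in, instead of unchoking the fastest downloaders.

    import random
    import time
    from dataclasses import dataclass
    from typing import Optional

    UPLOAD_SLOTS = 4  # illustrative default

    @dataclass
    class Peer:
        name: str
        last_unchoked: Optional[float] = None  # timestamp, None if never

    def seed_unchoke(interested_peers):
        # Most recently unchoked first; never-unchoked peers sort last.
        ranked = sorted(interested_peers,
                        key=lambda p: p.last_unchoked or 0.0,
                        reverse=True)
        keep = ranked[:UPLOAD_SLOTS - 1]
        rest = ranked[UPLOAD_SLOTS - 1:]
        if rest:
            keep.append(random.choice(rest))  # rotate newcomers in
        for p in keep:
            p.last_unchoked = time.time()
        return keep

Note that a free rider gets no more service from such a seed than any other leecher, which is exactly why the unfairness observed with the old algorithm disappears.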



The more important case, when excess capacity is not available, free riders should be eliminated or diminished. Because leecher participation cannot be proven, the only other method of measurement is via equal exchange of services. Byte-for-byte is one such measurement (reputation systems are another). In this scenario, free riders will only receive a small credit from the leechers in the swarm before having to resort to hand-outs from the seeds. With the seeds giving fair capacity to the swarm, the free riders will receive very little of the swarm capacity, while the participating leechers will receive fair resources.
In our measurements, CA gives the same result. It guarantees that a peer cannot receive more from a torrent than a peer who contributes more than it does. Here again, with BFB we do not use all the capacity of the torrent, because BFB is based on a fairness criterion that is not correct for a P2P architecture.
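
To be concrete about that guarantee, here is a toy check in Python (my own illustration; the tolerance parameter is only there to absorb measurement noise and is not part of any definition in the paper):

    # The property stated above: a peer should never receive more than a
    # peer that contributes more than it does.

    def is_fair(peers, tolerance=0.0):
        """peers: list of (uploaded_bytes, downloaded_bytes) tuples."""
        for up_i, down_i in peers:
            for up_j, down_j in peers:
                # If j contributes less than i, j must not receive more
                # than i (up to the tolerance).
                if up_i > up_j and down_j > down_i + tolerance:
                    return False
        return True

    # Example: a free rider (0 bytes uploaded) that downloads less than the
    # two contributors is fair in this sense.
    print(is_fair([(10_000, 9_000), (5_000, 6_000), (0, 1_000)]))  # True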

Free riders always receive more resources under choking than under exchange measurement models.
And that is fine, as long as they do not receive more than someone who contributes. I understand that you argue in favor of global efficiency. It is true that if you give nothing to free riders, you have more capacity for the peers that contribute. But it is not clear that they can make use of it, and it is not clear that giving a small amount of service to free riders impacts the overall torrent.

I showed a long time ago, in the context of multicast, that it is possible to define a policy that gives more to multicast users without significantly decreasing the performance of unicast ones.
Even though the context is different, I believe the issue is equivalent.

Network Coding:

On the topic of Network Coding [1], I did not see any evidence in your paper refuting the network coding literature. Please correct me if I missed it, but it appears your only rebuttal is in the poor modeling of BitTorrent, not in the application of network coding techniques. From this I don't see how you conclude that network coding is not justified? I agree that most of the simulation literature does not model BitTorrent correctly, but this doesn't negate the positive results of the network coding trials, only their comparisons to BitTorrent.
That is not my point. We claim that network coding cannot bring a significant performance improvement in the context we are studying. As I said at the beginning of this email, network coding is undoubtedly useful in a number of situations, just not in our context. However, the context we are studying is very important, and many people believe they can improve a P2P client for file sharing over the Internet using network coding techniques. This is not true, and the belief comes from a misunderstanding of the dynamics of BitTorrent that I hope the paper helps to clarify.

In section 4-A, pp 6-7, you suggest that source or network coding may not propose interesting pieces to peers during the startup phase. I counter that this is exactly what network coding solves - every packet has an innovation rate based on the network flows from the local node to the source(s). Assuming a densely connected graph with fair sharing from the seeder, it is unlikely that any packet is not innovative during the startup phase. Unless the local node's bandwidth is many multiples of the seeder, a problem which is not specific to network coding.
You did not get the point. Consider a seed with a finite upload capacity of C bytes/s and a file to send of size S bytes. Then the seed needs at best S/C seconds to send the entire content. Whatever coding technique you use, you cannot do it faster.

We identified that when the performance of a torrent is not optimal, it is in a transient phase. That means the seed has not yet sent a copy of each piece. Network coding, or whatever coding you can imagine, will not change the fact that information is missing from the torrent, and it is this missing information that causes the decrease in performance.
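
Here is the same argument with toy numbers (the numbers are made up, purely for illustration):

    # The seed bottleneck: whatever piece-selection or coding scheme is
    # used, no peer can complete before the seed has pushed at least one
    # copy of the whole content, i.e. before S / C seconds at best.

    file_size_bytes = 700 * 1024 * 1024   # S: a 700 MB file
    seed_upload_Bps = 100 * 1024          # C: a 100 kB/s seed

    min_seconds = file_size_bytes / seed_upload_Bps
    print(f"lower bound on completion time: {min_seconds / 3600:.1f} hours")
    # ~2.0 hours.  Before that point the torrent is in its transient phase:
    # some of the data exists only on the seed, and no coding scheme can
    # create that missing information on the leecher side.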

Network Coding is superior to Rarest First and other BitTorrent-like protocols for several reasons: 1) The innovation rate is the min-cut of all flows from the local node to the source(s), which is likely to be much larger than BitTorrent's default of 4 active connections. 2) BitTorrent transfers pieces in blocks-at-a-time, but only advertises in piece-at-a-time. This injects a significant delay between downloading of an innovative byte and sharing of that byte (on average, one half the download time of an entire piece). Network coding imposes no such delay. 3) There is no first-blocks or last-blocks problem with network coding, they are entirely non-issues. Large portions of your paper are devoted to these issues which simply don't occur under network coding. 4) The meta-network utilization is higher for network coding. That is, since BitTorrent cannot align to the underlying physical connections of the network elements, it is impossible to maximally utilize them. Network coding (using an innovation rate tracking algorithm) can rapidly align to the structure of the physical network, eliminating cross-core retransmissions. This saves a significant amount of ISP bandwidth and reduces the overall burden of supporting filesharing services on the internet.
All you say is true in theory, but the real difference between network coding and rarest first depends on the context. All I can tell you is that measurements show that, in the context I am discussing, there are no significant differences between network coding and rarest first.
Is the full implementation of Network Coding superior enough to BitTorrent that it offers compelling reasons for migration to it as a solution? Your paper argues that BitTorrent is "good enough", and I tend to agree.
I do not believe it makes sense to say that one solution is superior to another in all cases. In our specific context, the rarest first and choke algorithms are enough; they are not superior with respect to performance, but they are much simpler.
In other contexts, network coding will be superior.

The point of the paper is to show what kind of performance we can achieve using the rarest first and choke algorithms, and to put an end to the legends that say there are always last-pieces problems or fairness issues. These legends, as we show with the notion of transient phase, are due to a misunderstanding of the very complex dynamics of P2P protocols.

Regards,
Arnaud.

