"Deepankar Datta"; 2005 February 14 Monday 17:32 GMT-5

Hi

Sorry for the (very) long delay in getting back to you. Other work has
been overwhelming me, so it's taking me this long to get round to
emails.

No problem.

The problem is with autogenerating the metadata - the P2P sub-projects
are not directly integrated at the moment with the main mirrors. Files
are distributed via the main mirrors, and .torrent files are generated
from this manually. Ideally this would be done automatically at the
source and sent to all mirrors, but it would take considerable
persuasion to get the extra software installed, as well as to set up
the main tracker.

I'm not familiar with all of the protocols, but I have a basic understanding of the hashing technique used by one particular client. It involves generating MD4 hashes for smaller pieces and for larger pieces; the hashes of the larger pieces are hashed again to generate a file hash, and then the information is simply collected in a URL, with the rest of the work done by the network. For this client, or the class of clients using a similar technique, support shouldn't require any software beyond a short program, achievable in many common languages such as C, Perl, PHP, and so on. If this program could be integrated with the initial mirror copy stage, it could avoid reading the file twice. Otherwise it could be standalone and kicked off only after a transfer has completed.
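To illustrate, here is a minimal sketch of that hash-of-hashes technique in Python. The piece size is an arbitrary assumption, and SHA-1 stands in for MD4 (which isn't reliably available in standard libraries); the real client's parameters and algorithm would differ:

```python
import hashlib

PIECE_SIZE = 9 * 1024 * 1024  # assumed piece size; the actual client's value differs


def piece_hashes(path, piece_size=PIECE_SIZE):
    """Hash the file one piece at a time, in a single read pass."""
    hashes = []
    with open(path, "rb") as f:
        while True:
            piece = f.read(piece_size)
            if not piece:
                break
            # SHA-1 stands in here for the client's real hash (MD4)
            hashes.append(hashlib.sha1(piece).digest())
    return hashes


def file_hash(hashes):
    """Concatenate the piece hashes and hash them again to identify the file."""
    if len(hashes) == 1:
        return hashes[0]
    return hashlib.sha1(b"".join(hashes)).digest()
```

A program like this could run as a post-transfer hook on each mirror, or be folded into the copy stage itself so each file is read only once.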


If you mean that the mirror servers should also provide access to the content via the protocol, well, yes, that would be one ideal. But I was merely asking for the metadata to be supported, not the entire protocol; that would take longer to achieve.

While this is all possible, efforts are being spent elsewhere in the
project, and this will have to wait.

Understandable. I just put the idea out there because, as a user, I thought this would make my experience better, and offering ideas for discussion is one way to start the process.


I'm also wondering about the benefits of multiple P2P protocols.
BitTorrent was chosen as we could distribute it via official trackers,

This is one issue, of course: you want to make sure the files aren't tampered with. As I mentioned above, the one protocol I am familiar with does have some protection against this. The larger pieces are "secured" by their hashes being hashed again and used as one of the key identifiers (file size being the other). There is a possibility that unscrupulous people will tamper with a file and spread invalid smaller hashes, since the smaller hashes are not secured. However, once enough smaller pieces make up a large piece, that large piece is hashed, so the tampered piece will be detected and thrown out. The client could still get stuck if it only had bad hashes for the smaller pieces. Thus an "official" file containing this information could also be released in a format suitable for the particular client. This would allow people to recover the valid data in the worst-case scenario, and there are currently tools to do this. Even if the data were supplied only in a plain text format, those tools could be modified to import it.
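The detect-and-discard behaviour described above can be sketched in a few lines. This is a hypothetical verifier, again with SHA-1 standing in for the client's real hash algorithm:

```python
import hashlib


def verify_piece(piece_data: bytes, expected_hash: bytes) -> bool:
    """Accept a large piece only if it matches the published piece hash.

    Smaller sub-pieces may arrive with forged hashes, but once they are
    assembled into a large piece, any tampering is caught here and the
    piece is thrown out rather than written to the file.
    """
    # SHA-1 stands in for the client's real hash algorithm (MD4)
    return hashlib.sha1(piece_data).digest() == expected_hash
```

An "official" release of the per-piece hashes, even in plain text, gives clients a trusted `expected_hash` to check against, which is what makes worst-case recovery possible.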


without being associated with P2P that has been used for piracy. The

I have never distinguished between protocols for one method of user-driven distribution or another. The very first protocol I ever heard of was Napster, and that was judged to be illegal; if you follow that logic, then every protocol is illegal, and even web sites, mail servers, IM, heck, even SMS on a mobile phone could receive base64. Some, or maybe all, of the initial popular clients were marketed for those purposes. It is all P2P, whether some people associate it that way or not. I see no problem using any and all P2P software for a legitimate reason such as distributing free (GNU) software or open source. In my mind, it is especially important to counter the dubious past of the technology by finding good uses for it, to prove that it can be used for something good, before ignorant politicians across the globe vote to outlaw all of it in its entirety.


Open Source nature of BT was also a big plus point for us, as well as
its potential in distributing load. While it may be possible to

I'm a tremendous fan of Free Software (as defined by the FSF's GPL), somewhat less a fan of open source, and much less of closed source. However, the stuff is out there. People may want to use one client over another, and thus require the client-specific metadata. And as I mentioned in a previous email, some clients allow for plugin protocols that tie in to other networks, but in order to treat a file as one download and not two copies of the same thing, the native client's metadata is required.


I don't know if software can be written to generate metadata without violating the copyrights of some of these protocols' clients. I know Gnucleus was also free software, but it has not had much popularity. The makers of the client with which I am familiar don't seem to be bothered by the numerous clones and work-alikes, all released under the GPL, and commercial clients don't seem to be bothered by plugins for other clients. Code could possibly even be gleaned directly from some of these projects to generate the metadata.

integrate other P2P networks, the major foreseeable future load will
come through BT as it is proven in 'just released' situations.

Something is "proven", but were the other methods even tried? Usually I test many alternatives, generate numbers, hypothesize, and test again before I consider something "proven". On the other hand, if it's "common knowledge" about what seems to work best in release situations, that is another thing: proven by others. But consider the "just released" period. How long is the "just released" window? 24-48 hours? Maybe a week? How often does a release happen? Every 4-12 weeks? So the files spend far more time simply "out there" than "just released". Yet they don't trickle out to other clients, because there is no authoritative metadata available. No plugins can be utilized. I'm stuck using one protocol by itself, so I can neither retrieve from another network nor help the content spread there for maximum distribution effect.


I'm talking about availability to applications, not necessarily "just released" speed. I'd rather avoid using half a dozen separate P2P clients, and would prefer a meta-P2P (MP2P) client. This was the trend with instant messaging: first there was IRC and text-based games, then GUI clients like ICQ, then AIM, Yahoo and MSN. Now there are at least a dozen common protocols, and many meta-IM clients that can talk to them all. All P2P has its roots in IRC and utilized the FTP and HTTP protocols, neither of which has ever been deemed illegal because of its potential for misuse. Some friends may use an IM protocol that I personally don't like. I refuse to install a client that can only talk on that one protocol, but it doesn't mean I refuse to talk to them; I just use a meta-IM client. It's reasonable to foresee the same thing happening to P2P within 1-3 years, with meta clients becoming the norm, that is, if the technology isn't outlawed.

I hope this answers your questions.  If there is any confusion please
feel free to post further to the list.

I don't think I am too confused about anything in particular. One general area where I lack overall experience is a comparative study of the short-term and long-term efficiency (both absolute and relative to a file's age before obsolescence) of the various P2P and server protocols, and hybrid mixtures thereof. All I have done is present an idea for getting valuable content such as OOo to every corner of the net, without discriminating against a protocol or against the software license used by the makers of the clients that use it. I recognize that it implies a non-trivial amount of work and diplomacy to fully realize the idea's potential. It would also require some time for users to adopt the new technologies available to them, so results probably couldn't be measured in a short time frame.


Ideally, when I offer such an idea, I would also couple it with the ability to implement it, access to the systems or personnel involved, and time to see it through, but that isn't the case right now. I've offered enough ideas (here and elsewhere combined), and am trying to focus on doing something about them. :-) But if someone else sees the idea and likes it, or even part of it, they can tweak it if they want and run with it.

Apologies again for the delay in replying.

Don't worry about it.


Leif



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


