Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-05-14 Thread Vedant Lath
OOPS. I am awfully sorry for such a gross mistake.

I am very very sorry.

I especially apologise for saying the email was of little use; I said
it jokingly. It was of much use to me, as I got parts of the other
side of the story which one doesn't get from the usual news sites.

I sincerely apologise for making such statements and hope that they will
be taken in the way they were intended -- a very, very casual way.

On 2009-May-15, at 04:58, Vedant Lath wrote:

> such a detailed reply which must have taken a lot of time for a
> MediaWiki developer, with little use.
>
> BTW, this does show that net neutrality is not at all clear cut.
> BitTorrent gives a lot of options for abuse of the network -- many
> leechers don't seed, too many connections max out the hardware, etc.
>
> On 2009-Apr-19, at 01:22, Gregory Maxwell wrote:
>
>> On Fri, Apr 17, 2009 at 9:42 PM, Gregory Maxwell
>>  wrote:
>> [snip]
>>> But if you are running parallel connections to avoid slowdowns  
>>> you're
>>> just attempting to cheat TCP congestion control and get an unfair
>>> share of the available bandwidth.  That kind of selfish behaviour
>>> fuels non-neutral behaviour and ought not be encouraged.
>> [snip]
>> On Sat, Apr 18, 2009 at 3:06 AM, Brian 
>> wrote:
>>> I have no problem helping someone get a faster download speech and
>>> I'm also
>>> not willing to fling around fallacies about how selfish behavior is
>>> bad for
>>> society. Here is wget vs. aget for the full history dump of the
>>> simple
>> [snip]
>>
>> And? I did point out this is possible, and that no torrent was
>> required to achieve this end. Thank you for validating my point.
>>
>> Since you've called my position fallacious I figure I ought to give  
>> it
>> a reasonable defence, although we've gone off-topic.
>>
>> The use of parallel TCP has allowed you an inequitable share of the
>> available network capacity[1]. The parallel transport is  
>> fundamentally
>> less efficient as it increases the total number of congestion
>> drops[2]. The categorical imperative would have us not perform
>> activities that would be harmful if everyone undertook them. At the
>> limit: If everyone attempted to achieve an unequal share of capacity
>> by running parallel connections the internet would suffer congestion
>> collapse[3].
>>
>> Less philosophically and more practically: the unfair usage of
>> capacity by parallel fetching P2P tools is a primary reason for
>> internet providers to engage in 'non-neutral' activities such as
>> blocking or throttling this P2P traffic[4][5][6].  Ironically, a
>> provider which treats parallel transport technologies unfairly will  
>> be
>> providing a more fair network service and non-neutral handling of
>> traffic is the only way to prevent an (arguably unfair)  
>> redistribution
>> of transport towards end user heavy service providers.
>>
>> (I highly recommend reading the material in [5] for a simple overview
>> of P2P fairness and network efficiency; as well as the Briscoe IETF  
>> draft in [4] for a detailed operational perspective)
>>
>> Much of the public discussion on neutrality has focused on portraying
>> service providers considering or engaging in non-neutral activities  
>> as
>> greedy and evil. The real story is far more complicated and far less
>> clear cut.
>>
>> Where this is on-topic is that non-neutral behaviour by service
>> providers may well make the Wikimedia Foundation's mission more  
>> costly
>> to practice in the future.  In my professional opinion I believe the
>> best defence against this sort of outcome available to organizations
>> like Wikimedia (and other large content houses) is the promotion of
>> equitable transfer mechanisms which avoid unduly burdening end user
>> providers and therefore providing an objective justification for
>> non-neutral behaviour.  To this end Wikimedia should not promote or
>> utilize cost shifting technology (such as P2P distribution) or
>> inherently unfair inefficient transmission (parallel TCP; or fudged
>> server-side initial window) gratuitously.
>>
>> I spent a fair amount of time producing what I believe to be a well
>> cited reply which I believe stands well enough on its own that I
>> should not need to post any more in support of it. I hope that you
>> will at least put some thought into the issues I've raised here  
>> before
>> dismissing this position.  If my position is fallacious then numerous
>> academics and professionals in the industry are guilty of falling for
>> the same fallacies.
>>
>>
>> [1] Cho, S., 2006. Congestion Control Schemes for Single and Parallel
>> TCP Flows in High Bandwidth-Delay Product Networks. Doctoral thesis,
>> UMI Order Number AAI3219144, Texas A&M University.
>> [2] Padhye, J., Firoiu, V., Towsley, D., and Kurose, J., Modeling TCP
>> throughput: a simple model and its empirical validation. ACM SIGCOMM,
>> Sept. 1998.
>> [3] Floyd, S., and Fall, K., Promoting the Use of End-to-End
>> Congestion Control in the Internet, IEEE/ACM Transactions on
>> Networking, Aug. 1999.

Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-05-14 Thread Vedant Lath
such a detailed reply which must have taken a lot of time for a  
MediaWiki developer, with little use.

BTW, this does show that net neutrality is not at all clear cut.  
BitTorrent gives a lot of options for abuse of the network -- many  
leechers don't seed, too many connections max out the hardware, etc.

On 2009-Apr-19, at 01:22, Gregory Maxwell wrote:

> On Fri, Apr 17, 2009 at 9:42 PM, Gregory Maxwell  
>  wrote:
> [snip]
>> But if you are running parallel connections to avoid slowdowns you're
>> just attempting to cheat TCP congestion control and get an unfair
>> share of the available bandwidth.  That kind of selfish behaviour
>> fuels non-neutral behaviour and ought not be encouraged.
> [snip]
> On Sat, Apr 18, 2009 at 3:06 AM, Brian   
> wrote:
>> I have no problem helping someone get a faster download speech and  
>> I'm also
>> not willing to fling around fallacies about how selfish behavior is  
>> bad for
>> society. Here is wget vs. aget for the full history dump of the  
>> simple
> [snip]
>
> And? I did point out this is possible, and that no torrent was
> required to achieve this end. Thank you for validating my point.
>
> Since you've called my position fallacious I figure I ought to give it
> a reasonable defence, although we've gone off-topic.
>
> The use of parallel TCP has allowed you an inequitable share of the
> available network capacity[1]. The parallel transport is fundamentally
> less efficient as it increases the total number of congestion
> drops[2]. The categorical imperative would have us not perform
> activities that would be harmful if everyone undertook them. At the
> limit: If everyone attempted to achieve an unequal share of capacity
> by running parallel connections the internet would suffer congestion
> collapse[3].
>
> Less philosophically and more practically: the unfair usage of
> capacity by parallel fetching P2P tools is a primary reason for
> internet providers to engage in 'non-neutral' activities such as
> blocking or throttling this P2P traffic[4][5][6].  Ironically, a
> provider which treats parallel transport technologies unfairly will be
> providing a more fair network service and non-neutral handling of
> traffic is the only way to prevent an (arguably unfair) redistribution
> of transport towards end user heavy service providers.
>
> (I highly recommend reading the material in [5] for a simple overview
> of P2P fairness and network efficiency; as well as the Briscoe IETF
> draft in [4] for a detailed operational perspective)
>
> Much of the public discussion on neutrality has focused on portraying
> service providers considering or engaging in non-neutral activities as
> greedy and evil. The real story is far more complicated and far less
> clear cut.
>
> Where this is on-topic is that non-neutral behaviour by service
> providers may well make the Wikimedia Foundation's mission more costly
> to practice in the future.  In my professional opinion I believe the
> best defence against this sort of outcome available to organizations
> like Wikimedia (and other large content houses) is the promotion of
> equitable transfer mechanisms which avoid unduly burdening end user
> providers and therefore providing an objective justification for
> non-neutral behaviour.  To this end Wikimedia should not promote or
> utilize cost shifting technology (such as P2P distribution) or
> inherently unfair inefficient transmission (parallel TCP; or fudged
> server-side initial window) gratuitously.
>
> I spent a fair amount of time producing what I believe to be a well
> cited reply which I believe stands well enough on its own that I
> should not need to post any more in support of it. I hope that you
> will at least put some thought into the issues I've raised here before
> dismissing this position.  If my position is fallacious then numerous
> academics and professionals in the industry are guilty of falling for
> the same fallacies.
>
>
> [1] Cho, S., 2006. Congestion Control Schemes for Single and Parallel
> TCP Flows in High Bandwidth-Delay Product Networks. Doctoral thesis,
> UMI Order Number AAI3219144, Texas A&M University.
> [2] Padhye, J., Firoiu, V., Towsley, D., and Kurose, J., Modeling TCP
> throughput: a simple model and its empirical validation. ACM SIGCOMM,
> Sept. 1998.
> [3] Floyd, S., and Fall, K., Promoting the Use of End-to-End
> Congestion Control in the Internet, IEEE/ACM Transactions on
> Networking, Aug. 1999.
> [4] B. Briscoe, T. Moncaster, L. Burness (BT),
> http://tools.ietf.org/html/draft-briscoe-tsvwg-relax-fairness-01
> [5] Nicholas Weaver presentation  "Bulk Data P2P:
> Cost Shifting, not Cost Savings"
> (http://www.icsi.berkeley.edu/~nweaver/p2pi_shifting.ppt); Nicholas
> Weaver Position Paper P2PI Workshop http://www.funchords.com/p2pi/1
> p2pi-weaver.txt
> [6] Bruno Tuffin, Patrick Maillé: How Many Parallel TCP Sessions to
> Open: A Pricing Perspective. ICQT 2006: 2-12
>
> ___
> Wikitech-l mailing list

Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-05-14 Thread Vedant Lath
Interesting case; it happens here too. The best way to determine the max  
speed of a connection is to run BitTorrent on one of the most popular  
torrents for a while and look at the speed.

On 2009-Apr-18, at 06:51, Stig Meireles Johansen wrote:

> On Fri, Apr 17, 2009 at 7:39 PM, Gregory Maxwell  
>  wrote:
>
>> Torrent isn't a very good transfer method for things which are not
>> fairly popular as it has a fair amount of overhead.
>>
>> The wikimedia download site should be able to saturate your internet
>> connection in any case…
>
>
> But some ISPs throttle TCP connections (either by design or by simple
> oversubscription and random packet drops), so many small connections  
> *can*
> yield a better result for the end user. And if you are so unlucky as  
> to
> have a crappy connection from your country to the download-site,  
> maybe,
> just maybe someone in your own country already has downloaded it and  
> is
> willing to share the torrent... :)
>
> I can saturate my little 1M ADSL-link with torrent-downloads, but  
> forget
> about getting throughput when it comes to HTTP-requests... if it's  
> in the
> country, in close proximity and the server is willing, then  
> *maybe*.. but
> else.. no way.
>
> Not everyone is very well connected, unfortunately...
>
> /Stigmj
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l


___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-04-20 Thread Jameson Scanlon
Many thanks to everyone who has replied to my initial request for
Wikipedia database dumps.  I am glad that I kicked up such a fuss.
Anyhow, I should also state that I have access to various LAN
connections (each with a speed of 3.477 Mbps, methinks).

There was some remark about using wiki servers as torrent seeds.  If
someone has access to several such 3.477 Mbps connections, would it
be possible to use them in parallel to download different parts of the
same database, keeping track of the relevant byte offsets, etc.?
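
(As a rough illustration of how such a split download could work -- purely a
sketch, assuming the server honours HTTP Range requests; the byte boundaries
below are based on the 125,918,415-byte simplewiki dump quoted elsewhere in
this thread and would need adjusting for any other file:

  # Hypothetical sketch: fetch four byte ranges in parallel, then stitch them together.
  URL=http://download.wikimedia.org/simplewiki/20090330/simplewiki-20090330-pages-meta-history.xml.7z
  curl -s -r 0-31479603         -o part0 "$URL" &
  curl -s -r 31479604-62959207  -o part1 "$URL" &
  curl -s -r 62959208-94438811  -o part2 "$URL" &
  curl -s -r 94438812-          -o part3 "$URL" &
  wait    # wait for all four background fetches to finish
  cat part0 part1 part2 part3 > simplewiki-20090330-pages-meta-history.xml.7z

With separate uplinks, each fetch could in principle be pinned to its own
interface with curl's --interface option, though whether that actually helps
depends on how the LAN connections are routed.)
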
Many thanks.


On 4/19/09, Brian  wrote:
> I hope you wrote it for your own benefit and not mine! Traffic congestion
> issues being obvious enough, your reductio is irrelevant to the case of a
> single user who has issues saturating their relatively slow DSL link.
> Torrent is not an option, aget is, end of story.
>
> On Sat, Apr 18, 2009 at 1:52 PM, Gregory Maxwell  wrote:
>
>> On Fri, Apr 17, 2009 at 9:42 PM, Gregory Maxwell 
>> wrote:
>> [snip]
>> > But if you are running parallel connections to avoid slowdowns you're
>> > just attempting to cheat TCP congestion control and get an unfair
>> > share of the available bandwidth.  That kind of selfish behaviour
>> > fuels non-neutral behaviour and ought not be encouraged.
>> [snip]
>> On Sat, Apr 18, 2009 at 3:06 AM, Brian  wrote:
>> > I have no problem helping someone get a faster download speech and I'm
>> also
>> > not willing to fling around fallacies about how selfish behavior is bad
>> for
>> > society. Here is wget vs. aget for the full history dump of the simple
>> [snip]
>>
>> And? I did point out this is possible, and that no torrent was
>> required to achieve this end. Thank you for validating my point.
>>
>> Since you've called my position fallacious I figure I ought to give it
>> a reasonable defence, although we've gone off-topic.
>>
>> The use of parallel TCP has allowed you an inequitable share of the
>> available network capacity[1]. The parallel transport is fundamentally
>> less efficient as it increases the total number of congestion
>> drops[2]. The categorical imperative would have us not perform
>> activities that would be harmful if everyone undertook them. At the
>> limit: If everyone attempted to achieve an unequal share of capacity
>> by running parallel connections the internet would suffer congestion
>> collapse[3].
>>
>> Less philosophically and more practically: the unfair usage of
>> capacity by parallel fetching P2P tools is a primary reason for
>> internet providers to engage in 'non-neutral' activities such as
>> blocking or throttling this P2P traffic[4][5][6].  Ironically, a
>> provider which treats parallel transport technologies unfairly will be
>> providing a more fair network service and non-neutral handling of
>> traffic is the only way to prevent an (arguably unfair) redistribution
>> of transport towards end user heavy service providers.
>>
>> (I highly recommend reading the material in [5] for a simple overview
>> of P2P fairness and network efficiency; as well as the Briscoe IETF
>> draft in [4] for a detailed operational perspective)
>>
>> Much of the public discussion on neutrality has focused on portraying
>> service providers considering or engaging in non-neutral activities as
>> greedy and evil. The real story is far more complicated and far less
>> clear cut.
>>
>> Where this is on-topic is that non-neutral behaviour by service
>> providers may well make the Wikimedia Foundation's mission more costly
>> to practice in the future.  In my professional opinion I believe the
>> best defence against this sort of outcome available to organizations
>> like Wikimedia (and other large content houses) is the promotion of
>> equitable transfer mechanisms which avoid unduly burdening end user
>> providers and therefore providing an objective justification for
>> non-neutral behaviour.  To this end Wikimedia should not promote or
>> utilize cost shifting technology (such as P2P distribution) or
>> inherently unfair inefficient transmission (parallel TCP; or fudged
>> server-side initial window) gratuitously.
>>
>> I spent a fair amount of time producing what I believe to be a well
>> cited reply which I believe stands well enough on its own that I
>> should not need to post any more in support of it. I hope that you
>> will at least put some thought into the issues I've raised here before
>> dismissing this position.  If my position is fallacious then numerous
>> academics and professionals in the industry are guilty of falling for
>> the same fallacies.
>>
>>
>> [1] Cho, S., 2006. Congestion Control Schemes for Single and Parallel
>> TCP Flows in High Bandwidth-Delay Product Networks. Doctoral thesis,
>> UMI Order Number AAI3219144, Texas A&M University.
>> [2] Padhye, J., Firoiu, V., Towsley, D., and Kurose, J., Modeling TCP
>> throughput: a simple model and its empirical validation. ACM SIGCOMM,
>> Sept. 1998.
>> [3] Floyd, S., and Fall, K., Promoting the Use of End-to-End
>> Congestion Control in the Internet, IEEE/ACM Transactions on
>> Networking, Aug. 1999.

Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-04-19 Thread Brian
I hope you wrote it for your own benefit and not mine! Traffic congestion
issues being obvious enough, your reductio is irrelevant to the case of a
single user who has issues saturating their relatively slow DSL link.
Torrent is not an option, aget is, end of story.

On Sat, Apr 18, 2009 at 1:52 PM, Gregory Maxwell  wrote:

> On Fri, Apr 17, 2009 at 9:42 PM, Gregory Maxwell 
> wrote:
> [snip]
> > But if you are running parallel connections to avoid slowdowns you're
> > just attempting to cheat TCP congestion control and get an unfair
> > share of the available bandwidth.  That kind of selfish behaviour
> > fuels non-neutral behaviour and ought not be encouraged.
> [snip]
> On Sat, Apr 18, 2009 at 3:06 AM, Brian  wrote:
> > I have no problem helping someone get a faster download speech and I'm
> also
> > not willing to fling around fallacies about how selfish behavior is bad
> for
> > society. Here is wget vs. aget for the full history dump of the simple
> [snip]
>
> And? I did point out this is possible, and that no torrent was
> required to achieve this end. Thank you for validating my point.
>
> Since you've called my position fallacious I figure I ought to give it
> a reasonable defence, although we've gone off-topic.
>
> The use of parallel TCP has allowed you an inequitable share of the
> available network capacity[1]. The parallel transport is fundamentally
> less efficient as it increases the total number of congestion
> drops[2]. The categorical imperative would have us not perform
> activities that would be harmful if everyone undertook them. At the
> limit: If everyone attempted to achieve an unequal share of capacity
> by running parallel connections the internet would suffer congestion
> collapse[3].
>
> Less philosophically and more practically: the unfair usage of
> capacity by parallel fetching P2P tools is a primary reason for
> internet providers to engage in 'non-neutral' activities such as
> blocking or throttling this P2P traffic[4][5][6].  Ironically, a
> provider which treats parallel transport technologies unfairly will be
> providing a more fair network service and non-neutral handling of
> traffic is the only way to prevent an (arguably unfair) redistribution
> of transport towards end user heavy service providers.
>
> (I highly recommend reading the material in [5] for a simple overview
> of P2P fairness and network efficiency; as well as the Briscoe IETF
> draft in [4] for a detailed operational perspective)
>
> Much of the public discussion on neutrality has focused on portraying
> service providers considering or engaging in non-neutral activities as
> greedy and evil. The real story is far more complicated and far less
> clear cut.
>
> Where this is on-topic is that non-neutral behaviour by service
> providers may well make the Wikimedia Foundation's mission more costly
> to practice in the future.  In my professional opinion I believe the
> best defence against this sort of outcome available to organizations
> like Wikimedia (and other large content houses) is the promotion of
> equitable transfer mechanisms which avoid unduly burdening end user
> providers and therefore providing an objective justification for
> non-neutral behaviour.  To this end Wikimedia should not promote or
> utilize cost shifting technology (such as P2P distribution) or
> inherently unfair inefficient transmission (parallel TCP; or fudged
> server-side initial window) gratuitously.
>
> I spent a fair amount of time producing what I believe to be a well
> cited reply which I believe stands well enough on its own that I
> should not need to post any more in support of it. I hope that you
> will at least put some thought into the issues I've raised here before
> dismissing this position.  If my position is fallacious then numerous
> academics and professionals in the industry are guilty of falling for
> the same fallacies.
>
>
> [1] Cho, S., 2006. Congestion Control Schemes for Single and Parallel
> TCP Flows in High Bandwidth-Delay Product Networks. Doctoral thesis,
> UMI Order Number AAI3219144, Texas A&M University.
> [2] Padhye, J., Firoiu, V., Towsley, D., and Kurose, J., Modeling TCP
> throughput: a simple model and its empirical validation. ACM SIGCOMM,
> Sept. 1998.
> [3] Floyd, S., and Fall, K., Promoting the Use of End-to-End
> Congestion Control in the Internet, IEEE/ACM Transactions on
> Networking, Aug. 1999.
> [4] B. Briscoe, T. Moncaster, L. Burness (BT),
> http://tools.ietf.org/html/draft-briscoe-tsvwg-relax-fairness-01
> [5] Nicholas Weaver presentation  "Bulk Data P2P:
> Cost Shifting, not Cost Savings"
> (http://www.icsi.berkeley.edu/~nweaver/p2pi_shifting.ppt);
> Nicholas
> Weaver Position Paper P2PI Workshop http://www.funchords.com/p2pi/1
> p2pi-weaver.txt 
> [6] Bruno Tuffin, Patrick Maillé: How Many Parallel TCP Sessions to
> Open: A Pricing Perspective. ICQT 2006: 2-12
>
>

Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-04-18 Thread Tei
Small comment:

If someone finally decides to mirror the files via torrent, I suggest
using one of the "channels" (RSS feeds?) that Azureus supports. That client
even supports the "autodownload" of these channels.



-- 
--
ℱin del ℳensaje.
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-04-18 Thread Gregory Maxwell
On Fri, Apr 17, 2009 at 9:42 PM, Gregory Maxwell  wrote:
[snip]
> But if you are running parallel connections to avoid slowdowns you're
> just attempting to cheat TCP congestion control and get an unfair
> share of the available bandwidth.  That kind of selfish behaviour
> fuels non-neutral behaviour and ought not be encouraged.
[snip]
On Sat, Apr 18, 2009 at 3:06 AM, Brian  wrote:
> I have no problem helping someone get a faster download speech and I'm also
> not willing to fling around fallacies about how selfish behavior is bad for
> society. Here is wget vs. aget for the full history dump of the simple
[snip]

And? I did point out this is possible, and that no torrent was
required to achieve this end. Thank you for validating my point.

Since you've called my position fallacious I figure I ought to give it
a reasonable defence, although we've gone off-topic.

The use of parallel TCP has allowed you an inequitable share of the
available network capacity[1]. The parallel transport is fundamentally
less efficient as it increases the total number of congestion
drops[2]. The categorical imperative would have us not perform
activities that would be harmful if everyone undertook them. At the
limit: If everyone attempted to achieve an unequal share of capacity
by running parallel connections the internet would suffer congestion
collapse[3].
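
(To put rough numbers on that: the standard steady-state TCP throughput model
[2] gives a long-lived flow roughly MSS/(RTT*sqrt(2p/3)), so flows sharing a
bottleneck with similar round-trip times and loss rate p converge to roughly
equal rates per *flow*, not per user. As a back-of-envelope illustration of my
own, not taken from the cited papers: if ten users each run one flow and an
eleventh opens twenty, the link is shared among thirty flows, so the
twenty-flow user gets about 20/30 = two thirds of the capacity while each of
the other ten is squeezed down to roughly 1/30.)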

Less philosophically and more practically: the unfair usage of
capacity by parallel fetching P2P tools is a primary reason for
internet providers to engage in 'non-neutral' activities such as
blocking or throttling this P2P traffic[4][5][6].  Ironically, a
provider which treats parallel transport technologies unfairly will be
providing a more fair network service and non-neutral handling of
traffic is the only way to prevent an (arguably unfair) redistribution
of transport towards end user heavy service providers.

(I highly recommend reading the material in [5] for a simple overview
of P2P fairness and network efficiency; as well as the Briscoe IETF
draft in [4] for a detailed operational perspective)

Much of the public discussion on neutrality has focused on portraying
service providers considering or engaging in non-neutral activities as
greedy and evil. The real story is far more complicated and far less
clear cut.

Where this is on-topic is that non-neutral behaviour by service
providers may well make the Wikimedia Foundation's mission more costly
to practice in the future.  In my professional opinion I believe the
best defence against this sort of outcome available to organizations
like Wikimedia (and other large content houses) is the promotion of
equitable transfer mechanisms which avoid unduly burdening end user
providers and therefore providing an objective justification for
non-neutral behaviour.  To this end Wikimedia should not promote or
utilize cost shifting technology (such as P2P distribution) or
inherently unfair inefficient transmission (parallel TCP; or fudged
server-side initial window) gratuitously.

I spent a fair amount of time producing what I believe to be a well
cited reply which I believe stands well enough on its own that I
should not need to post any more in support of it. I hope that you
will at least put some thought into the issues I've raised here before
dismissing this position.  If my position is fallacious then numerous
academics and professionals in the industry are guilty of falling for
the same fallacies.


[1] Cho, S., 2006. Congestion Control Schemes for Single and Parallel
TCP Flows in High Bandwidth-Delay Product Networks. Doctoral thesis,
UMI Order Number AAI3219144, Texas A&M University.
[2] Padhye, J., Firoiu, V., Towsley, D., and Kurose, J., Modeling TCP
throughput: a simple model and its empirical validation. ACM SIGCOMM,
Sept. 1998.
[3] Floyd, S., and Fall, K., Promoting the Use of End-to-End
Congestion Control in the Internet, IEEE/ACM Transactions on
Networking, Aug. 1999.
[4] B. Briscoe, T. Moncaster, L. Burness (BT),
http://tools.ietf.org/html/draft-briscoe-tsvwg-relax-fairness-01
[5] Nicholas Weaver presentation  "Bulk Data P2P:
Cost Shifting, not Cost Savings"
(http://www.icsi.berkeley.edu/~nweaver/p2pi_shifting.ppt); Nicholas
Weaver Position Paper P2PI Workshop http://www.funchords.com/p2pi/1
p2pi-weaver.txt
[6] Bruno Tuffin, Patrick Maillé: How Many Parallel TCP Sessions to
Open: A Pricing Perspective. ICQT 2006: 2-12

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-04-18 Thread Marco Schuster
On Fri, Apr 17, 2009 at 11:55 PM, Jameson Scanlon <
jameson.scan...@googlemail.com> wrote:

> Is it possible for anyone to indicate more comprehensive lists of
> torrents/trackers than these?  Are there any plans for all the
> database download files to be available in this way (I imagine that
> there would also be some PDF manual which would go along with these to
> indicate offline viewing, and potentially more info than this).
>
In theory, one can easily create a torrent with the Wikipedia servers as
webseeds. The question is, how many torrent clients other than Azureus support these?
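
(Sketch only -- the tracker URL below is just a placeholder and option names
may differ between versions -- but with mktorrent it would look roughly like:

  mktorrent \
    -a http://tracker.example.org/announce \
    -w http://download.wikimedia.org/simplewiki/20090330/simplewiki-20090330-pages-meta-history.xml.7z \
    -o simplewiki-20090330-pages-meta-history.xml.7z.torrent \
    simplewiki-20090330-pages-meta-history.xml.7z

where -w embeds the download.wikimedia.org URL as a web seed that capable
clients can fall back to when no peers are available.)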

Marco


-- 
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-04-18 Thread Brian
s/speech/speed/  :-0

On Sat, Apr 18, 2009 at 1:06 AM, Brian  wrote:

> I have no problem helping someone get a faster download speech and I'm also
> not willing to fling around fallacies about how selfish behavior is bad for
> society. Here is wget vs. aget for the full history dump of the simple
> english wikipedia - a substantial 3.5x improvement  that someone who is
> having slow connection issues definitely should consider trying.
>
> time wget -O/dev/null '
> http://download.wikimedia.org/simplewiki/20090330/simplewiki-20090330-pages-meta-history.xml.7z
> '
> --2009-04-18 00:59:48--  http://download.wikimedia.org/simplewiki/20090330/simplewiki-20090330-pages-meta-history.xml.7z
> Resolving download.wikimedia.org... 208.80.152.183
> Connecting to download.wikimedia.org|208.80.152.183|:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 125918415 (120M) [application/x-7z-compressed]
> Saving to: `/dev/null'
>
> 100%[>] 125,918,415  1.41M/s   in 73s
>
> 2009-04-18 01:01:01 (1.66 MB/s) - `/dev/null' saved [125918415/125918415]
>
> real    1m13.156s
> user    0m0.216s
> sys     0m0.964s
>
> min...@dream:/home/mingus/ccnlab_bib -> time aget -n20 -f '
> http://download.wikimedia.org/simplewiki/20090330/simplewiki-20090330-pages-meta-history.xml.7z
> '
>  Attempting to read log file aget-simplewiki-20090330-pages-meta-history.xml.7z.log for resuming download job...
>  Couldn't find log file for this download, starting a clean job...
>  Head-Request Connection established
>  Downloading /simplewiki/20090330/simplewiki-20090330-pages-meta-history.xml.7z (125918415 bytes) from site download.wikimedia.org(208.80.152.183:80). Number of Threads: 20
>  [progress output trimmed: rows of dots as the 20 threads advance from 4% to 100% completed]
>  Download completed, job completed in 21 seconds. (5855 Kb/sec)
>  Shutting down...
>
> real    0m20.985s
> user    0m0.116s
> sys     0m1.412s
>
>
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-04-18 Thread Brian
I have no problem helping someone get a faster download speech and I'm also
not willing to fling around fallacies about how selfish behavior is bad for
society. Here is wget vs. aget for the full history dump of the simple
english wikipedia - a substantial 3.5x improvement  that someone who is
having slow connection issues definitely should consider trying.

time wget -O/dev/null '
http://download.wikimedia.org/simplewiki/20090330/simplewiki-20090330-pages-meta-history.xml.7z
'
--2009-04-18 00:59:48--  http://download.wikimedia.org/simplewiki/20090330/simplewiki-20090330-pages-meta-history.xml.7z
Resolving download.wikimedia.org... 208.80.152.183
Connecting to download.wikimedia.org|208.80.152.183|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 125918415 (120M) [application/x-7z-compressed]
Saving to: `/dev/null'

100%[>] 125,918,415  1.41M/s   in 73s

2009-04-18 01:01:01 (1.66 MB/s) - `/dev/null' saved [125918415/125918415]

real    1m13.156s
user    0m0.216s
sys     0m0.964s

min...@dream:/home/mingus/ccnlab_bib -> time aget -n20 -f '
http://download.wikimedia.org/simplewiki/20090330/simplewiki-20090330-pages-meta-history.xml.7z
'
 Attempting to read log file aget-simplewiki-20090330-pages-meta-history.xml.7z.log for resuming download job...
 Couldn't find log file for this download, starting a clean job...
 Head-Request Connection established
 Downloading /simplewiki/20090330/simplewiki-20090330-pages-meta-history.xml.7z (125918415 bytes) from site download.wikimedia.org(208.80.152.183:80). Number of Threads: 20
 [progress output trimmed: rows of dots as the 20 threads advance from 4% to 100% completed]
 Download completed, job completed in 21 seconds. (5855 Kb/sec)
 Shutting down...

real    0m20.985s
user    0m0.116s
sys     0m1.412s
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-04-17 Thread Gregory Maxwell
On Fri, Apr 17, 2009 at 9:21 PM, Stig Meireles Johansen
 wrote:
> But some ISPs throttle TCP connections (either by design or by simple
> oversubscription and random packet drops), so many small connections *can*
> yield a better result for the end user. And if you are so unlucky as to
> have a crappy connection from your country to the download-site, maybe,
> just maybe someone in your own country already has downloaded it and is
> willing to share the torrent... :)
> I can saturate my little 1M ADSL-link with torrent-downloads, but forget
> about getting throughput when it comes to HTTP-requests... if it's in the
> country, in close proximity and the server is willing, then *maybe*.. but
> else.. no way.

There are plenty of downloading tools that will use range requests to
download a single file with parallel connections…
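
(Concretely -- a hypothetical illustration of the mechanism, not a
recommendation -- such tools just issue ordinary HTTP requests carrying a
Range header, e.g.

  GET /simplewiki/20090330/simplewiki-20090330-pages-meta-history.xml.7z HTTP/1.1
  Host: download.wikimedia.org
  Range: bytes=0-1048575

and the server answers 206 Partial Content with only that slice; an
aget-style downloader simply runs many such requests at once and writes each
slice to the right offset in the output file.)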

But if you are running parallel connections to avoid slowdowns you're
just attempting to cheat TCP congestion control and get an unfair
share of the available bandwidth.  That kind of selfish behaviour
fuels non-neutral behaviour and ought not be encouraged.

We offered torrents in the past for the Commons Picture of the Year
results— a more popular thing to download, a much smaller file (~500 MB
vs. many GB), and not something which should become outdated every
month… and pretty much no one stayed connected long enough for anyone
else to manage to pull anything from them. It was an interesting
experiment, but it indicated that further use for these sorts of files
would be a waste of time.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-04-17 Thread Stig Meireles Johansen
On Fri, Apr 17, 2009 at 7:39 PM, Gregory Maxwell  wrote:

> Torrent isn't a very good transfer method for things which are not
> fairly popular as it has a fair amount of overhead.
>
> The wikimedia download site should be able to saturate your internet
> connection in any case…


But some ISPs throttle TCP connections (either by design or by simple
oversubscription and random packet drops), so many small connections *can*
yield a better result for the end user. And if you are so unlucky as to
have a crappy connection from your country to the download-site, maybe,
just maybe someone in your own country already has downloaded it and is
willing to share the torrent... :)

I can saturate my little 1M ADSL-link with torrent-downloads, but forget
about getting throughput when it comes to HTTP-requests... if it's in the
country, in close proximity and the server is willing, then *maybe*.. but
else.. no way.

Not everyone is very well connected, unfortunately...

/Stigmj
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l


Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-04-17 Thread David Gerard
2009/4/17 Gregory Maxwell :

> Torrent isn't a very good transfer method for things which are not
> fairly popular as it has a fair amount of overhead.
> The wikimedia download site should be able to saturate your internet
> connection in any case…


Indeed :-) The problem with the dumps as I understand it is not
serving them - if that was a problem, you can be sure the Internet
Archive would be happy to store Wikimedia dumps forever - but
generating them in the first place.


- d.

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-04-17 Thread Gregory Maxwell
On Fri, Apr 17, 2009 at 6:10 PM, Chad  wrote:
> I seem to remember there being a discussion about the
> torrenting issue before. In short: there's never been any
> official torrents, and the unofficial ones never got really
> popular.

Torrent isn't a very good transfer method for things which are not
fairly popular as it has a fair amount of overhead.

The wikimedia download site should be able to saturate your internet
connection in any case…

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-04-17 Thread Chad
On Fri, Apr 17, 2009 at 5:55 PM, Jameson Scanlon
 wrote:
> Two separate sites indicate potential sources of torrents for *.tar.gz
> downloads of the en wikipedia database material :
>
> http://en.wikipedia.org/wiki/Wikipedia_database and
> http://meta.wikimedia.org/wiki/Data_dumps#What_about_bittorrent.3F
> (so far).
>
> Is it possible for anyone to indicate more comprehensive lists of
> torrents/trackers than these?  Are there any plans for all the
> database download files to be available in this way (I imagine that
> there would also be some PDF manual which would go along with these to
> indicate offline viewing, and potentially more info than this).
> J
>
>
> On 4/15/09, Petr Kadlec  wrote:
>> 2009/4/14 Platonides :
>>> IMHO the benefits of separated files are similar to the disadvantages. A
>>> side benefit would be that the hashes would be split, too. If
>>> you were unlucky, knowing that 'something' (perhaps just a bit) on the
>>> 150GB you downloaded is wrong, is not that helpful.
>>> So having hashes for file sections on the big ones, even if not
>>> 'standard' would be an improvement.
>>
>> For that, something like Parchive would probably be better…
>>
>> -- [[cs:User:Mormegil | Petr Kadlec]]
>>
>> ___
>> Wikitech-l mailing list
>> Wikitech-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>

I seem to remember there being a discussion about the
torrenting issue before. In short: there's never been any
official torrents, and the unofficial ones never got really
popular.

-Chad

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: [Wikitech-l] Dealing with Large Files when attempting a wikipedia database download - Focus upon Bittorrent List

2009-04-17 Thread Jameson Scanlon
Two separate sites indicate potential sources of torrents for *.tar.gz
downloads of the en wikipedia database material :

http://en.wikipedia.org/wiki/Wikipedia_database and
http://meta.wikimedia.org/wiki/Data_dumps#What_about_bittorrent.3F
(so far).

Is it possible for anyone to indicate more comprehensive lists of
torrents/trackers than these?  Are there any plans for all the
database download files to be available in this way (I imagine that
there would also be some PDF manual which would go along with these to
indicate offline viewing, and potentially more info than this).
J


On 4/15/09, Petr Kadlec  wrote:
> 2009/4/14 Platonides :
>> IMHO the benefits of separated files are similar to the disadvantages. A
>> side benefit would be that the hashes would be split, too. If
>> you were unlucky, knowing that 'something' (perhaps just a bit) on the
>> 150GB you downloaded is wrong, is not that helpful.
>> So having hashes for file sections on the big ones, even if not
>> 'standard' would be an improvement.
>
> For that, something like Parchive would probably be better…
>
> -- [[cs:User:Mormegil | Petr Kadlec]]
>
> ___
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
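
(For anyone who has not met Parchive: the par2 command-line tool generates
redundancy blocks that can both pinpoint and repair corruption in a large
download. A hedged sketch -- the 10% figure and filenames are purely
illustrative:

  # create ~10% recovery data alongside the dump
  par2 create -r10 simplewiki-20090330-pages-meta-history.xml.7z
  # after downloading both, check integrity and repair damaged blocks if needed
  par2 verify simplewiki-20090330-pages-meta-history.xml.7z.par2
  par2 repair simplewiki-20090330-pages-meta-history.xml.7z.par2

This localises errors to individual blocks instead of one hash over 150 GB of
data.)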

___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l