Re: [Wikitech-l] Wiki Home Extension

2009-08-03 Thread Michael Dale
Yeah, it would have to be opt-in, and would have to have controls over how much 
bandwidth gets sent out... We could encourage people to enable it by sending 
out the higher bit-rate / quality version ~by default~ for those that 
opt in.

--michael


Ryan Lane wrote:
> On Mon, Aug 3, 2009 at 1:57 PM, Michael Dale wrote:
>   
>> Look back two years and you can see the Xiph community's blog posts and
>> conversations with Mozilla. It was not a given that Firefox would ship
>> with Ogg Theora baseline video support (they took some convincing and had
>> to do some thinking about it; a big site like Wikipedia exclusively
>> using free-format technology probably helped their decision).
>> Originally the Xiph/Annodex community built the liboggplay library as an
>> extension. This later became the basis for the library that powers
>> Firefox's Ogg Theora video today. Likewise, we are putting features into
>> Firefogg that we eventually hope will be supported by browsers natively.
>> Also, in theory we could put a thin BitTorrent client into the Java-based
>> Cortado player to support IE users as well.
>>
>> 
>
> If watching video on Wikipedia requires BitTorrent, most corporate
> environments are going to be locked out. If a BitTorrent client is
> loaded by default for the videos, most corporate environments are
> going to blacklist Wikipedia's Java apps.
>
> I'm not saying p2p-distributed video is a bad idea, and the Wikimedia
> Foundation may not care about how corporate environments react;
> however, I think it is a bad idea to either force users to use a p2p
> client or make it opt-out.
>
> Ignoring corporate clients... firing up a p2p client on end-users'
> systems could cause serious issues for some. What if I'm browsing on a
> 3G network, or a satellite connection where my bandwidth is metered?
>
> Maybe this is something that could be delivered via a gadget and
> enabled in user preferences?
>
> V/r,
>
> Ryan Lane
>




Re: [Wikitech-l] Wiki Home Extension

2009-08-03 Thread Ryan Lane
On Mon, Aug 3, 2009 at 1:57 PM, Michael Dale wrote:
> Look back two years and you can see the Xiph community's blog posts and
> conversations with Mozilla. It was not a given that Firefox would ship
> with Ogg Theora baseline video support (they took some convincing and had
> to do some thinking about it; a big site like Wikipedia exclusively
> using free-format technology probably helped their decision).
> Originally the Xiph/Annodex community built the liboggplay library as an
> extension. This later became the basis for the library that powers
> Firefox's Ogg Theora video today. Likewise, we are putting features into
> Firefogg that we eventually hope will be supported by browsers natively.
> Also, in theory we could put a thin BitTorrent client into the Java-based
> Cortado player to support IE users as well.
>

If watching video on Wikipedia requires BitTorrent, most corporate
environments are going to be locked out. If a BitTorrent client is
loaded by default for the videos, most corporate environments are
going to blacklist Wikipedia's Java apps.

I'm not saying p2p-distributed video is a bad idea, and the Wikimedia
Foundation may not care about how corporate environments react;
however, I think it is a bad idea to either force users to use a p2p
client or make it opt-out.

Ignoring corporate clients... firing up a p2p client on end-users'
systems could cause serious issues for some. What if I'm browsing on a
3G network, or a satellite connection where my bandwidth is metered?

Maybe this is something that could be delivered via a gadget and
enabled in user preferences?

V/r,

Ryan Lane



Re: [Wikitech-l] Wiki Home Extension

2009-08-03 Thread Michael Dale
Google's cost is probably more on the distribution side of things... 
but I only found a top-level number, not a breakdown of component costs. 
At any rate, the point is to start exploring how to distribute the costs 
associated with large-scale video collaboration. To that end I am targeting 
a framework where individual pieces can be done on the server 
or on clients, depending on what is optimal. It's not that much extra 
effort to design things this way.

Look back two years and you can see the Xiph community's blog posts and 
conversations with Mozilla. It was not a given that Firefox would ship 
with Ogg Theora baseline video support (they took some convincing and had 
to do some thinking about it; a big site like Wikipedia exclusively 
using free-format technology probably helped their decision). 
Originally the Xiph/Annodex community built the liboggplay library as an 
extension. This later became the basis for the library that powers 
Firefox's Ogg Theora video today. Likewise, we are putting features into 
Firefogg that we eventually hope will be supported by browsers natively. 
Also, in theory we could put a thin BitTorrent client into the Java-based 
Cortado player to support IE users as well.

peace,
--michael

Tisza Gergő wrote:
> Michael Dale  wikimedia.org> writes:
>
>   
>> * We are not Google. Google lost what like ~470 million~ last year on 
>> youtube ...(and that's with $240 million in advertising) so total cost 
>> of $711 million [1]
>> 
>
> How much of that is related to transcoding, and how much to delivery? You seem
> to be conflating the two issues. We cannot do much to cut delivery costs, save
> for serving fewer movies to readers - distributed transcoding would actually
> raise them. (Peer-to-peer video distribution sounds like a cool feature, but it
> needs to be implemented by browser vendors, not Wikimedia.)
>
>



Re: [Wikitech-l] Wiki Home Extension

2009-08-03 Thread Tisza Gergő
Michael Dale  wikimedia.org> writes:

> * We are not Google. Google lost what like ~470 million~ last year on 
> youtube ...(and that's with $240 million in advertising) so total cost 
> of $711 million [1]

How much of that is related to transcoding, and how much to delivery? You seem
to be conflating the two issues. We cannot do much to cut delivery costs, save
for serving fewer movies to readers - distributed transcoding would actually
raise them. (Peer-to-peer video distribution sounds like a cool feature, but it
needs to be implemented by browser vendors, not Wikimedia.)




Re: [Wikitech-l] Wiki Home Extension

2009-08-02 Thread Nikola Smolenski
>> Gregory Maxwell  gmail.com> writes:
>>> I don't know how to figure out how much it would 'cost' to have human
>>> contributors spot embedded penises snuck into transcodes and then
>>> figure out which of several contributing transcoders are doing it and
>>> blocking them, only to have the bad user switch IPs and begin again.
>>> ... but it seems impossibly expensive even though it's not an actual
>>> dollar cost.

I recall reading about software that can recognize images containing a 
penis; I tried googling for it but couldn't find it... something like 
that could be used too.



Re: [Wikitech-l] Wiki Home Extension

2009-08-02 Thread Michael Dale
Let's see...

* All these tools will be needed for flattening sequences anyway. In 
that case CPU costs are really, really high - something like 1/5 real-time 
or slower - and the amount of computation needed explodes much faster, as 
every "stable" edit necessitates a new flattening of some portion of the 
sequence.

* I don't think it's possible to scale the Foundation's current donation 
model to traditional free net video distribution.

* We are not Google. Google lost something like ~$470 million~ last year on 
YouTube... (and that's with $240 million in advertising), for a total cost 
of $711 million [1]. Say we manage to do 1/100th of YouTube (not 
unreasonable considering we are a top-4 site; just imagine a world where 
you watch one Wikipedia video for every 100 you watch on YouTube)... 
then we would be at something like 7x the total budget? (And they are not 
supporting video editing with flattening of sequences.) The Pirate 
Bay, on the other hand, operates at a technology cost comparable to 
Wikimedia's (~$3K~ a month in bandwidth) and is distributing something like 
half of the net's torrents [2]. (Obviously these numbers are a bit of 
tea-leaf reading, but give or take an order of magnitude it should still be 
clear which model we should be moving towards; the arithmetic is spelled 
out below.)
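
To spell out that back-of-the-envelope arithmetic (a rough sketch only; the 
inputs are just the approximate press numbers from [1] and [2], so treat the 
output as an order-of-magnitude guess, not as real Wikimedia data):

    # Rough restatement of the figures above; every input is an
    # approximate 2009 press number, not Wikimedia data.
    youtube_total_cost = 711e6         # USD/year, alleged total YouTube cost [1]
    wikipedia_share = 1.0 / 100        # one Wikipedia video per 100 YouTube views
    centralized_estimate = youtube_total_cost * wikipedia_share
    print(f"Central hosting at 1/100th of YouTube: ~${centralized_estimate / 1e6:.1f}M/year")

    pirate_bay_bandwidth = 3000 * 12   # USD/year, "~$3K~ a month in bandwidth" [2]
    print(f"Pirate-Bay-style p2p distribution: ~${pirate_bay_bandwidth / 1e3:.0f}K/year")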

... I think it's good to start thinking about p2p distribution and 
computation ... even if we are not using it today ...
 
* I must say I don't quite agree with your proposed tactic of preserving 
network neutrality by avoiding bandwidth distribution via peer-to-peer 
technology. I am aware the net "is not built" for p2p, nor is it very 
efficient vs CDNs... but the whole micropayment system never panned out... 
Perhaps you're right that p2p will just give companies an excuse to 
restructure the net in a non-network-neutral way... but I think they 
already have plenty of excuse with the existing popular BitTorrent systems, 
and I don't see any other way for not-for-profit net communities 
to distribute massive amounts of video to each other.

* I think you may be blowing this ~a bit~ out of proportion by calling 
Foundation priorities into question in the context of this hack. If this 
were a big initiative over the course of a year, or an initiative involving 
more than part-time work over a week, then it would make more 
sense to worry about this. But in its present state it's just a quick 
hack and the starting point of a conversation, not Foundation policy or 
an initiative.


peace,
michael


[1] 
http://www.ibtimes.com/articles/20090413/alleged-470-million-youtube-loss-will-be-cleared-week.htm
[2] 
http://newteevee.com/2009/07/19/the-pirate-bay-distributing-the-worlds-entertainment-for-3000-a-month/


Gregory Maxwell wrote:
> On Sun, Aug 2, 2009 at 6:29 PM, Michael Dale wrote:
> [snip]
>   
>> two quick points.
>> 1) you don't have to re-upload the whole video just the sha1 or some
>> sort of hash of the assigned chunk.
>> 
>
> But each re-encoder must download the source material.
>
> I agree that uploads aren't much of an issue.
>
> [snip]
>   
>> other random clients that are encoding other pieces would make abuse
>> very difficult... at the cost of a few small http requests after the
>> encode is done, and at a cost of slightly more CPU cycles of the
>> computing pool.
>> 
>
> Is >2x slightly?  (Greater because some clients will abort/fail.)
>
> Even that leaves open the risk that a single troublemaker will
> register a few accounts and confirm their own blocks.  You can fight
> that too— but it's an arms race with no end.  I have no doubt that the
> problem can be made tolerably rare— but at what cost?
>
> I don't think it's all that acceptable to significantly increase the
> resources used for the operation of the site just for the sake of
> pushing the capital and energy costs onto third parties, especially
> when it appears that the cost to Wikimedia will not decrease (but
> instead be shifted from equipment cost to bandwidth and developer
> time).
>
> [snip]
>   
>> We need to start exploring the bittorrent integration anyway to
>> distribute the bandwidth cost on the distribution side. So this work
>> would lead us in a good direction as well.
>> 
>
> http://lists.wikimedia.org/pipermail/wikitech-l/2009-April/042656.html
>
>
> I'm troubled that Wikimedia is suddenly so interested in all these
> cost externalizations which will dramatically increase the total cost
> but push those costs off onto (sometimes unwilling) third parties.
>
> Tech spending by the Wikimedia Foundation is a fairly small portion of
> the budget, enough that it has drawn some criticism.  Behaving in the
> most efficient manner is laudable and the WMF has done excellently on
> this front in the past.  Behaving in an inefficient manner in order to
> externalize costs is, in my view, deplorable and something which
> should be avoided.
>
> Has some organizational problem arisen within Wikimedia which has made
> it unreasonably difficult to obtain computing resources, but easy to
> burn bandwidth and development time? I'm struggling to understand why
> development-intensive externalization measures are being regarded as
> first choice solutions, and invented ahead of the production
> deployment of basic functionality.

Re: [Wikitech-l] Wiki Home Extension

2009-08-02 Thread Gregory Maxwell
On Sun, Aug 2, 2009 at 6:29 PM, Michael Dale wrote:
[snip]
> two quick points.
> 1) you don't have to re-upload the whole video just the sha1 or some
> sort of hash of the assigned chunk.

But each re-encoder must download the source material.

I agree that uploads aren't much of an issue.

[snip]
> other random clients that are encoding other pieces would make abuse
> very difficult... at the cost of a few small http requests after the
> encode is done, and at a cost of slightly more CPU cycles of the
> computing pool.

Is >2x slightly?  (Greater because some clients will abort/fail.)

Even that leaves open the risk that a single troublemaker will
register a few accounts and confirm their own blocks.  You can fight
that too— but it's an arms race with no end.  I have no doubt that the
problem can be made tolerably rare— but at what cost?

I don't think it's all that acceptable to significantly increase the
resources used for the operation of the site just for the sake of
pushing the capital and energy costs onto third parties, especially
when it appears that the cost to Wikimedia will not decrease (but
instead be shifted from equipment cost to bandwidth and developer
time).

[snip]
> We need to start exploring the bittorrent integration anyway to
> distribute the bandwidth cost on the distribution side. So this work
> would lead us in a good direction as well.

http://lists.wikimedia.org/pipermail/wikitech-l/2009-April/042656.html


I'm troubled that Wikimedia is suddenly so interested in all these
cost externalizations which will dramatically increase the total cost
but push those costs off onto (sometimes unwilling) third parties.

Tech spending by the Wikimedia Foundation is a fairly small portion of
the budget, enough that it has drawn some criticism.  Behaving in the
most efficient manner is laudable and the WMF has done excellently on
this front in the past.  Behaving in an inefficient manner in order to
externalize costs is, in my view, deplorable and something which
should be avoided.

Has some organizational problem arisen within Wikimedia which has made
it unreasonably difficult to obtain computing resources, but easy to
burn bandwidth and development time? I'm struggling to understand why
development-intensive externalization measures are being regarded as
first choice solutions, and invented ahead of the production
deployment of basic functionality.


Re: [Wikitech-l] Wiki Home Extension

2009-08-02 Thread Michael Dale
Two quick points.
1) You don't have to re-upload the whole video, just the sha1 or some 
sort of hash of the assigned chunk.
2) It should be relatively straightforward to catch abuse via the user 
ids assigned to each uploaded chunk. But checking the sha1 a few times from 
other random clients that are encoding other pieces would make abuse 
very difficult... at the cost of a few small HTTP requests after the 
encode is done, and at the cost of slightly more CPU cycles from the 
computing pool (a rough sketch of this check is below). But as this thread 
has pointed out, CPU cycles are much cheaper than bandwidth bits or 
humans' time spent patrolling derivatives.
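
To make point 2 concrete, here is a rough sketch of the server-side check 
(plain Python, not actual Firefogg or MediaWiki code; every name below is 
made up for illustration): a chunk is accepted only when a couple of 
randomly chosen other clients re-encode it and report the same hash.

    import hashlib
    import random
    from collections import Counter

    CHECKS_PER_CHUNK = 2  # extra clients asked to re-encode and re-hash each chunk

    def chunk_sha1(encoded_bytes: bytes) -> str:
        """Hash of an encoded chunk. With the encoder version pinned via
        auto-update, the encode should be deterministic, so honest clients
        produce identical hashes."""
        return hashlib.sha1(encoded_bytes).hexdigest()

    def pick_verifiers(active_clients, uploader, n=CHECKS_PER_CHUNK):
        """Pick random other clients (already busy encoding other pieces)
        to redo this chunk and report the hash they get."""
        pool = [c for c in active_clients if c != uploader]
        return random.sample(pool, min(n, len(pool)))

    def accept_chunk(uploader_hash, verifier_hashes):
        """Accept only if every verifier agrees with the uploader; any
        mismatch flags the chunk (and the uploading user id) for review."""
        votes = Counter([uploader_hash] + verifier_hashes)
        top_hash, count = votes.most_common(1)[0]
        return top_hash == uploader_hash and count == len(verifier_hashes) + 1

The cost is exactly what is described above: a few extra HTTP round trips 
plus the redundant CPU cycles spent on the verifiers' re-encodes.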

We have the advantage with a system like Firefogg that we control the 
version of the encoder pushed out to clients via auto-update, and we can 
check that version before accepting their participation (so sha1s should 
match if the client is not doing anything fishy).

But these are version-2-type features, conditioned on 1) bandwidth being 
cheap while internal computer system maintenance and acquisition is 
slightly more costly, and/or 2) us probably wanting to integrate a thin 
BitTorrent client into Firefogg so that we hit the "send out the source 
footage only once" upstream cost ratio.

We need to start exploring BitTorrent integration anyway to 
distribute the bandwidth cost on the distribution side, so this work 
would lead us in a good direction as well.

peace,
--michael

Tisza Gergő wrote:
> Steve Bennett  gmail.com> writes:
>
>   
>> Why are we suddenly concerned about someone sneaking obscenity onto a
>> wiki? As if no one has ever snuck a rude picture onto a main page...
>> 
>
> There is a slight difference between vandalism that shows up in recent changes
> and one that leaves no trail at all except maybe in log files only accessible
> to sysadmins.
>
>



Re: [Wikitech-l] Wiki Home Extension

2009-08-02 Thread Tisza Gergő
Steve Bennett  gmail.com> writes:

> Why are we suddenly concerned about someone sneaking obscenity onto a
> wiki? As if no one has ever snuck a rude picture onto a main page...

There is a slight difference between vandalism that shows up in recent changes
and one that leaves no trail at all except maybe in log files only accessible
to sysadmins.




Re: [Wikitech-l] Wiki Home Extension

2009-08-02 Thread Gerard Meijssen
Hoi,
Because it is no longer obvious that vandalism has taken place. You have to
keep looking at the changes, the whole time, to find what might be an issue.
Thanks,
 GerardM

2009/8/2 Steve Bennett 

> On Sun, Aug 2, 2009 at 12:16 AM, Tisza Gergő wrote:
> > Gregory Maxwell  gmail.com> writes:
> >
> >> I don't know how to figure out how much it would 'cost' to have human
> >> contributors spot embedded penises snuck into transcodes and then
> >> figure out which of several contributing transcoders are doing it and
> >> blocking them, only to have the bad user switch IPs and begin again.
> >> ... but it seems impossibly expensive even though it's not an actual
> >> dollar cost.
> >
> > The standard solution to that is to perform each operation multiple times on
> > different machines and then compare results. Of course, that raises bandwidth
> > costs even further.
>
> Why are we suddenly concerned about someone sneaking obscenity onto a
> wiki? As if no one has ever snuck a rude picture onto a main page...
>
> Steve
>

Re: [Wikitech-l] Wiki Home Extension

2009-08-02 Thread Steve Bennett
On Sun, Aug 2, 2009 at 12:16 AM, Tisza Gergő wrote:
> Gregory Maxwell  gmail.com> writes:
>
>> I don't know how to figure out how much it would 'cost' to have human
>> contributors spot embedded penises snuck into transcodes and then
>> figure out which of several contributing transcoders are doing it and
>> blocking them, only to have the bad user switch IPs and begin again.
>> ... but it seems impossibly expensive even though it's not an actual
>> dollar cost.
>
> The standard solution to that is to perform each operation multiple times on
> different machines and then compare results. Of course, that raises bandwidth
> costs even further.

Why are we suddenly concerned about someone sneaking obscenity onto a
wiki? As if no one has ever snuck a rude picture onto a main page...

Steve


Re: [Wikitech-l] Wiki Home Extension

2009-08-01 Thread Tisza Gergő
Gregory Maxwell  gmail.com> writes:

> I don't know how to figure out how much it would 'cost' to have human
> contributors spot embedded penises snuck into transcodes and then
> figure out which of several contributing transcoders are doing it and
> blocking them, only to have the bad user switch IPs and begin again.
> ... but it seems impossibly expensive even though it's not an actual
> dollar cost.

The standard solution to that is to perform each operation multiple times on
different machines and then compare results. Of course, that raises bandwidth
costs even further.

