Re: [Wikitech-l] w...@home Extension
On Sat, Aug 1, 2009 at 12:13 AM, Michael Dale <md...@wikimedia.org> wrote:
> true... people will never upload to site without instant gratification
> ( cough youtube cough ) ...

Hm? I just tried uploading to youtube and there was a video up right away. Other sizes followed within a minute or two.

> At any rate it's not replacing the firefogg that has instant
> gratification at point of upload, it's ~just another option~...

As another option— Okay. But video support on the site stinks because of the lack of server-side 'thumbnailing' for video. People upload multi-megabit videos, which is a good thing for editing, but then they don't play well for most users. Just doing it locally is hard— we've had failed SOC projects for this— doing it distributed has all the local complexity and then some.

> Also I should add that this w...@home system just gives us distributed
> transcoding as a bonus side effect ... its real purpose will be to
> distribute the flattening of edited sequences. So that 1) IE users can
> view them 2) We can use effects that for the time being are too
> computationally expensive to render out in real-time in javascript
> 3) you can download and play the sequences with normal video players
> and 4) we can transclude sequences and use templates with changes
> propagating to flattened versions rendered on the w...@home
> distributed computer

I'm confused as to why this isn't being done locally at Wikimedia. Creating some whole distributed thing seems to be trading off something inexpensive (machine cycles) for something there is less supply of— skilled developer time.

Processing power is really inexpensive. Some old copy of ffmpeg2theora on a single core of my core2 desktop processes a 352x288 input video at around 100mbit/sec (input video consumption rate). Surely the time and cost required to send a bunch of source material to remote hosts is going to offset whatever benefit this offers.

We're also creating a whole additional layer of cost in that someone has to police the results. Perhaps my Tyler Durden reference was too indirect:

* Create a new account
* splice some penises 30 minutes into some talking head video
* extreme lulz.

Tracking down these instances and blocking these users seems like it would be a full-time job for a couple of people, and it would only be made worse if the naughtiness could be targeted at particular resolutions or fallbacks. (Making it less likely that clueful people will see the vandalism.)

> While presently many machines in the wikimedia internal server cluster
> grind away at parsing and rendering html from wiki-text, the situation
> is many orders of magnitude more costly with using transclusion and
> templates with video ... so its good to get this type of extension out
> in the wild and warmed up for the near future ;)

In terms of work per byte of input the wikitext parser is thousands of times slower than the theora encoder. Go go inefficient software. As a result the difference may be less than many would assume. Once you factor in the ratio of video to non-video content for the foreseeable future this comes off looking like a time-wasting boondoggle.

Unless the basic functionality— like downsampled videos that people can actually play— is created, I can't see there ever being a time where some great distributed thing will do any good at all.

The segmenting is going to significantly harm compression efficiency for any inter-frame coded output format unless you perform a two-pass encode with the first pass on the server to do keyframe location detection.
Because the stream will restart at cut points.

> also true. Good thing theora-svn now supports two pass encoding :) ...

Yea, great, except doing the first pass for segmentation is pretty similar in computational cost to simply doing a one-pass encode of the video.

> but an extra key frame every 30 seconds probably wont hurt your
> compression efficiency too much..

It's not just about keyframe locations— if you encode separately and then merge you lose the ability to provide continuous rate control. So there would be large bitrate spikes at the splice intervals which will stall streaming for anyone without significantly more bandwidth than the clip.

> vs the gain of having your hour long interview trans-code a hundred
> times faster than non-distributed conversion. (almost instant
> gratification)

Well tuned, you can expect a distributed system to improve throughput at the expense of latency. Sending out source material to a bunch of places, having them crunch on it on whatever slow hardware they have, then sending it back may win on the dollars-per-throughput front, but I can't see that having good latency.

> true... You also have to log in to upload to commons. It will make life
> easier and make abuse of the system more difficult.. plus it can

Having to create an account does pretty much nothing to discourage malicious activity.

> act as a motivation factor with distribu...@home teams, personal stats
> and
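To make the scheme being debated concrete, here is a minimal Python sketch of segment-encode-concatenate — not the actual WikiAtHome code. The source filename, duration, chunk length, and worker count are made-up stand-ins, and it assumes ffmpeg2theora and oggCat (from oggvideotools) are on PATH:

```python
# Minimal sketch of the segment-encode-concatenate scheme under debate;
# NOT the actual WikiAtHome code. Assumes ffmpeg2theora and oggCat
# (oggvideotools) on PATH; source name, duration, chunk length, and
# worker count are made-up stand-ins.
import subprocess
from concurrent.futures import ThreadPoolExecutor

SOURCE = "interview.dv"     # hypothetical source upload
DURATION = 3600             # would normally be probed from the file
CHUNK = 30                  # seconds; every boundary forces a keyframe

def encode_chunk(args):
    """Encode one slice; in w...@home each slice goes to a client."""
    index, start = args
    out = f"chunk_{index:04d}.ogg"
    subprocess.run(
        ["ffmpeg2theora", SOURCE,
         "-s", str(start), "-e", str(min(start + CHUNK, DURATION)),
         "-o", out],
        check=True)
    return out

with ThreadPoolExecutor(max_workers=4) as pool:
    chunks = list(pool.map(encode_chunk,
                           enumerate(range(0, DURATION, CHUNK))))

# Each chunk was rate-controlled in isolation, so the joined file can
# spike in bitrate at every splice point -- the objection raised above.
subprocess.run(["oggCat", "flattened.ogg", *chunks], check=True)
```

Each chunk getting independent rate control and a fresh keyframe is exactly where the splice-point bitrate spikes come from.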
Re: [Wikitech-l] w...@home Extension
On Sat, Aug 1, 2009 at 12:47 AM, Gregory Maxwell <gmaxw...@gmail.com> wrote:
> Once you factor in the ratio of video to non-video content for the
> foreseeable future this comes off looking like a time-wasting
> boondoggle.

I think you vastly underestimate the amount of video that will be uploaded. Michael is right in thinking big and thinking distributed. CPU cycles are not *that* cheap. There is a lot of free video out there and as soon as we have a stable system in place wikimedians are going to have a heyday uploading it to Commons.
Re: [Wikitech-l] w...@home Extension
On Sat, Aug 1, 2009 at 2:54 AM, Brian <brian.min...@colorado.edu> wrote:
> I think you vastly underestimate the amount of video that will be
> uploaded. Michael is right in thinking big and thinking distributed.
> CPU cycles are not *that* cheap.

Really rough back-of-the-napkin numbers:

My desktop has a X3360 CPU. You can build systems all day using this processor for $600 (I think I spent $500 on it 6 months ago). There are processors with better price/performance available now, but I can benchmark on this.

Commons is getting roughly 172076 uploads per month now across all media types: scans of single pages, photographs copied from flickr, audio pronunciations, videos, etc.

If everyone switched to uploading 15 minute long SD videos instead of other things, there would be 154,868,400 seconds of video uploaded to commons per month. Truly a staggering amount. Assuming a 40 hour work week it would take over 250 people working full time just to *view* all of it. That number is an average rate of 58.9 seconds of video uploaded per second, every second of the month.

Using all four cores my desktop video encodes at 16x real-time (for moderate motion standard def input using the latest theora 1.1 svn). So you'd need less than four of those systems to keep up with the entire commons upload rate switched to 15 minute videos. Okay, it would be slow at peak hours and you might wish to produce a couple of versions at different resolutions, so multiply that by a couple. This is what I meant by processing being cheap.

If the uploads were all compressed at a bitrate of 4mbit/sec, and users were kind enough to spread their uploads out through the day, and the distributed system were perfectly efficient (only need to send one copy of the upload out), and if Wikimedia were only paying $10/mbit/sec/month for transit out of their primary datacenter... we'd find that the bandwidth costs of sending that source material out again would be $2356/month. (58.9 seconds per second * 4mbit/sec * $10/mbit/sec/month)

(Since transit billing is on the 95th percentile 5-minute average of the greater of inbound or outbound, uploads are basically free, but sending out data to the 'cloud' costs like anything else.)

So under these assumptions sending out compressed video for re-encoding is likely to cost roughly as much *each month* as the hardware for local transcoding... and the pace of processing speed-up seems to be significantly better than the declining prices for bandwidth. This is also what I meant by processing being cheap.

Because uploads won't be uniformly spaced you'll need some extra resources to keep things from getting bogged down at peak hours. But the poor peak-to-average ratio also works against the bandwidth costs. You can't win: unless you assume that uploads are going to be very low bitrates, local transcoding will always be cheaper, with very short payoff times.

I don't know how to figure out how much it would 'cost' to have human contributors spot embedded penises snuck into transcodes and then figure out which of several contributing transcoders is doing it and blocking them, only to have the bad user switch IPs and begin again... but it seems impossibly expensive even though it's not an actual dollar cost.
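The arithmetic above, redone in a few lines of Python as a sanity check — every input is one of the assumptions stated in the message, not a measurement:

```python
# Reproducing the back-of-the-napkin numbers above; all inputs are
# the stated assumptions, not measurements.
uploads_per_month = 172076                 # commons uploads/month
video_seconds = uploads_per_month * 15 * 60
print(video_seconds)                       # 154,868,400 s of video/month

month_seconds = 30.4 * 86400               # ~average month length
rate = video_seconds / month_seconds
print(round(rate, 1))                      # ~58.9 s of video per second

# Encoding: one quad-core X3360 does ~16x realtime.
print(rate / 16)                           # ~3.7 boxes to keep up

# Shipping sources out at 4 mbit/sec, $10/mbit/sec/month transit:
print(rate * 4 * 10)                       # ~$2356/month, recurring
```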
> There is a lot of free video out there and as soon as we have a stable
> system in place wikimedians are going to have a heyday uploading it to
> Commons.

I'm not saying that there won't be video; I'm saying there won't be video if development time is spent on fanciful features rather than desperately needed short-term functionality. We have tens of thousands of videos, many of which don't stream well for most people because they need thumbnailing. Firefogg was useful upload lubrication. But user-powered cloud transcoding? I believe the analysis I provided above demonstrates that resources would be better applied elsewhere.
Re: [Wikitech-l] w...@home Extension
2009/8/1 Brian <brian.min...@colorado.edu>:
> I think you vastly underestimate the amount of video that will be
> uploaded. Michael is right in thinking big and thinking distributed.
> CPU cycles are not *that* cheap. There is a lot of free video out
> there and as soon as we have a stable system in place wikimedians are
> going to have a heyday uploading it to Commons.

Oh hell yes. If I could just upload any AVI or MPEG4 straight off a camera, you bet I would. Just imagine what people who've never heard the word Theora will do.

- d.
Re: [Wikitech-l] w...@home Extension
On Sat, Aug 1, 2009 at 9:57 AM, David Gerard <dger...@gmail.com> wrote:
> Oh hell yes. If I could just upload any AVI or MPEG4 straight off a
> camera, you bet I would. Just imagine what people who've never heard
> the word Theora will do.

Even if so, I don't think assuming that every single commons upload at the current rate will instead be a 15-minute video is much of an underestimate...

-Kat
Re: [Wikitech-l] w...@home Extension
On Sat, Aug 1, 2009 at 10:12 AM, Kat Walsh <k...@mindspillage.org> wrote:
> Even if so, I don't think assuming that every single commons upload at
> the current rate will instead be a 15-minute video is much of an
> underestimate...

A reasonable estimate would require knowledge of how much free video can be automatically acquired, its metadata automatically parsed, and then automatically uploaded to commons. I am aware of some massive archives of free content video. Current estimates based on images do not necessarily apply to video, especially as we are just entering a video-aware era of the internet. At any rate, while Gerard's estimate is a bit optimistic in my view, it seems realistic for the near term.
Re: [Wikitech-l] w...@home Extension
On Sat, Aug 1, 2009 at 10:17 AM, Brian <brian.min...@colorado.edu> wrote:
> At any rate, while Gerard's estimate is a bit optimistic in my view,
> it seems realistic for the near term.

Sorry, looked at the wrong message - Gregory's estimate.
Re: [Wikitech-l] w...@home Extension
On Sat, Aug 1, 2009 at 12:17 PM, Brian <brian.min...@colorado.edu> wrote:
> A reasonable estimate would require knowledge of how much free video
> can be automatically acquired, its metadata automatically parsed, and
> then automatically uploaded to commons. I am aware of some massive
> archives of free content video. Current estimates based on images do
> not necessarily apply to video, especially as we are just entering a
> video-aware era of the internet. At any rate, while Gerard's estimate
> is a bit optimistic in my view, it seems realistic for the near term.

So— the plan is that we'll lose money on every transaction but we'll make it up in volume?

(Again, this time without math: the rate of increase, as a function of video-minutes, of the amortized hardware costs for local transcoding is lower than the rate of increase in bandwidth costs needed to send off the source material to users to transcode in a distributed manner. This holds for pretty much any reasonable source bitrate, though I used 4mbit/sec in my calculation. So regardless of the amount of video being uploaded, using users is simply more expensive than doing it locally.)

Existing distributed computing projects work because the ratio of CPU-crunching to communicating is enormously high. This isn't (and shouldn't be) true for video transcoding. They also work because there is little reward for tampering with the system. I don't think this is true for our transcoding: there are many who would be greatly gratified by splicing penises into streams, far more so than by anonymously and undetectably making a protein fold wrong.

... and it's only reasonable to expect the cost gap to widen.

On Sat, Aug 1, 2009 at 9:57 AM, David Gerard <dger...@gmail.com> wrote:
> Oh hell yes. If I could just upload any AVI or MPEG4 straight off a
> camera, you bet I would. Just imagine what people who've never heard
> the word Theora will do.

Sweet! Except, *instead* of developing the ability to upload straight off a camera, what is being developed is user-distributed video transcoding— which won't do anything itself to make it easier to upload.

What it will do is waste precious development cycles maintaining an overly complicated software infrastructure, waste precious commons administration cycles hunting subtle and confusing sources of vandalism, and waste income from donors by spending more on additional outbound bandwidth than would be spent on computing resources to transcode locally.
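A toy Python model of the "without math" paragraph above, using the illustrative unit costs from the earlier napkin message (assumptions, not real quotes): both curves are linear in volume, so scale never flips the comparison.

```python
# Toy model of the scaling argument. Unit costs are the illustrative
# figures from the napkin post: a $600 quad-core encodes 16x realtime;
# transit is $10/mbit/sec/month; sources are 4 mbit/sec.
def local_hardware(rate):
    """One-time hardware dollars to encode `rate` sec of video/sec."""
    return rate / 16.0 * 600.0

def remote_bandwidth(rate):
    """Recurring monthly dollars to ship each source out once."""
    return rate * 4.0 * 10.0

for rate in (1.0, 58.9, 500.0):
    print(f"{rate:6.1f} s/s: hw ${local_hardware(rate):9.2f} once, "
          f"bw ${remote_bandwidth(rate):9.2f} per month")
# Both columns are linear in `rate`: every month of bandwidth spend
# rivals the entire one-time hardware cost, at any volume.
```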
Re: [Wikitech-l] w...@home Extension
On Sat, Aug 1, 2009 at 11:04 AM, Gregory Maxwell <gmaxw...@gmail.com> wrote:
> So— the plan is that we'll lose money on every transaction but we'll
> make it up in volume?

There are always tradeoffs. If I understand w...@home correctly it is also intended to be run @foundation. It works just as well for distributing transcoding over the foundation cluster as it does for distributing it to disparate clients. Thus, if the foundation encounters a CPU backlog and wishes to distribute some long-running jobs to @home clients in order to maintain realtime operation of the site in exchange for bandwidth, it could. Through this method the foundation could handle transcoding spikes of arbitrary size. In the case of spikes, @foundation can do first-pass get-something-back-to-the-user-now encoding and pass the rest of the tasks to @home.
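One hypothetical way such a spill-over dispatcher could look, sketched in Python — nothing like this exists in the extension, and the threshold and job format are invented for illustration:

```python
# Hypothetical reading of the spill-over idea: encode @foundation by
# default, hand jobs to @home clients only past a backlog threshold.
# No such dispatcher exists in the current extension; the threshold
# and job format are invented.
from collections import deque

BACKLOG_LIMIT = 100     # assumed tuning knob

local_queue = deque()   # consumed by in-house encoder boxes
remote_queue = deque()  # polled by volunteer @home clients over the API

def enqueue_transcode(job):
    if len(local_queue) < BACKLOG_LIMIT:
        local_queue.append(job)    # normal case: stay in-house
    else:
        remote_queue.append(job)   # spike: trade bandwidth for CPU

for n in range(250):
    enqueue_transcode({"id": n, "src": f"upload_{n}.ogg"})
print(len(local_queue), len(remote_queue))  # -> 100 150
```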
Re: [Wikitech-l] w...@home Extension
On Sat, Aug 1, 2009 at 1:13 PM, Brian <brian.min...@colorado.edu> wrote:
> There are always tradeoffs. If I understand w...@home correctly it is
> also intended to be run @foundation. It works just as well for
> distributing transcoding over the foundation cluster as it does for
> distributing it to disparate clients.

There is nothing in the source code that suggests that. It currently requires the compute nodes to be running the Firefogg browser extension, so this would require loading an X server and Firefox onto the servers in order to have them participate as it is now. The video data has to take a round trip through PHP and the upload interface, which doesn't really make any sense; that alone could well take as much time as the actual transcode. As a server distribution infrastructure it would be an inefficient one. Much of the code in the extension appears to be there to handle issues that simply wouldn't exist in the local transcoding case.

I would have no objection to a transcoding system designed for local operation with some consideration made for adding externally distributed operation in the future if it ever made sense.

Incidentally— the slice-and-recombine approach using oggCat in WikiAtHome produces files with gaps in the granpos numbering and audio desync for me.
Re: [Wikitech-l] w...@home Extension
Some notes:

* ~its mostly an api~. We can run it internally if that is more cost-efficient. ( will work on a command line client shortly ) ... (as mentioned earlier the present code was hacked together quickly; it's just a prototype. I will generalize things to work better as internal jobs, and I think I will not create File:Myvideo.mp4 wiki pages but rather create a placeholder File:Myvideo.ogg page and only store the derivatives outside of the wiki page node system. I also notice some sync issues with oggCat which are under investigation )

* Clearly CPUs are cheap; so is power for the computers, human resources for system maintenance, rack space and internal network management, and we of course will want to run the numbers on any solution we go with. I think your source bitrate assumption was a little high; I would think more like 1-2Mbit/s (with cell-phone cameras targeting low bitrates for transport and desktops re-encoding before upload). But I think this whole conversation is missing the larger issue, which is: if it's cost-prohibitive to distribute a few copies for transcode, how are we going to distribute the derivatives thousands of times for viewing? Perhaps future work in this area should focus more on the distribution-bandwidth cost issue.

* Furthermore I think I might have mis-represented w...@home. I should have more clearly focused on the sequence flattening and only mentioned transcoding as an option. With sequence flattening we have a more standard viewing bitrate of source material and cpu costs for rendering are much higher. At present there is no fast way to overlay html/svg on video with filters and effects that are only presently predictably defined in javascript. For this reason we use the browser to wysiwyg render out the content. Eventually we may want to write an optimized stand-alone flattener, but for now the w...@home solution is worlds less costly in terms of developer resources since we can use the editor to output the flat file.

* And finally yes ... you can already insert a penis into video uploads today. With something like:

  ffmpeg2theora someVideo.ogg -s 0 -e 42.2 -o head.ogg
  ffmpeg2theora someVideo.ogg -s 42.2 -o tail.ogg
  oggCat out.ogg head.ogg myOneFramePenis.ogg tail.ogg

But yea, it's one more level to worry about, and if it's cheaper to do it internally (the transcodes, not the penis insertion) we should do it internally. :P (I hope others appreciate the multiple levels of humor here)

peace,
michael
Re: [Wikitech-l] w...@home Extension
2009/8/1 Brian <brian.min...@colorado.edu>:
> And of course, you can just ship them the binaries!

Trusted clients are impossible. Particularly for protecting against lulz-seekers.

- d.
Re: [Wikitech-l] w...@home Extension
On Sat, Aug 1, 2009 at 1:07 PM, David Gerard <dger...@gmail.com> wrote:
> Trusted clients are impossible. Particularly for protecting against
> lulz-seekers.

Impossible? That's hyperbole.
Re: [Wikitech-l] w...@home Extension
2009/8/1 Brian <brian.min...@colorado.edu>:
> Impossible? That's hyperbole.

No, it's mathematically accurate. There is NO SUCH THING as a trusted client. It's the same problem as DRM and security by obscurity.

http://en.wikipedia.org/wiki/Trusted_client
http://en.wikipedia.org/wiki/Security_by_obscurity

Never trust the client. Ever, ever, ever. If you have a working model that relies on a trusted client you're fucked already. Basically, if you want to distribute binaries to reduce hackability ... it won't work and you might as well be distributing source. Security by obscurity just isn't.

- d.
Re: [Wikitech-l] w...@home Extension
On Sat, Aug 1, 2009 at 1:32 PM, David Gerard <dger...@gmail.com> wrote:
> Never trust the client. Ever, ever, ever. If you have a working model
> that relies on a trusted client you're fucked already. Basically, if
> you want to distribute binaries to reduce hackability ... it won't
> work and you might as well be distributing source. Security by
> obscurity just isn't.

Ok, nice rant. But nobody cares if you scramble their scientific data before sending it back to the server. They will notice the statistical blip and ban you. I don't think in terms of impossible. It impedes progress.
Re: [Wikitech-l] w...@home Extension
On Sat, Aug 1, 2009 at 9:35 PM, Brian <brian.min...@colorado.edu> wrote:
> Ok, nice rant. But nobody cares if you scramble their scientific data
> before sending it back to the server. They will notice the statistical
> blip and ban you.

What about video files exploiting some new 0day exploit in a video input format? The Wikimedia transcoding servers *must* be totally separated from the other WM servers to prevent 0wnage or a site-wide hack.

About users who run encoding chunks - they have to get a full installation of decoders and stuff, which also has to be kept up to date (and if the clients run in different countries - there are patents and other legal stuff to take care of!); also, the clients must be protected from getting infected chunks so they do not get 0wned by content wikimedia gave to them (imagine the press headlines)...

I'd actually be interested how YouTube and the other video hosters protect themselves against hacker threats - did they code totally new de/en-coders?

Marco
Re: [Wikitech-l] Statistics now on MediaWiki page
This is something of a hyperbole, it's true; my apologies. Parser.php itself has ~5,200 lines of code (total, including comments); combined with the preprocessors (Preprocessor_DOM.php and Preprocessor_Hash.php, ~1,500 and ~1,600 lines), CoreParserFunctions.php (~650 lines), and the rest of the 'parser'-related files in the /parser directory (~300 lines each), you get around 11,000. This is total lines, including comments. ~3,000 executable lines in Parser.php sounds plausible.

--HM

dan nessett <dness...@yahoo.com> wrote in message news:459215.97119...@web32507.mail.mud.yahoo.com...

> I am not finished with the analysis (MacGyver) tool, but I thought I
> would put up what I have so far on the MediaWiki site. I have created
> a web page in my user space for the Parser Test code coverage
> analysis:
>
> http://www.mediawiki.org/wiki/User:Dnessett/Parser_Tests/Code_Coverage
>
> I would appreciate it if someone familiar with the parser would at
> least glance at the per-file statistics for a sanity check. Some
> things that worry me are:
>
> * parserTests seems to visit Special:Nuke. Does this make sense?
> * Only about 72% of Parser.php is exercised. Is this reasonable?
> * Xdebug is reporting that the Parser only has 2975 lines of
> executable code. This contrasts with the report by Happy-Melon that
> there is 11,000 lines of code in Parser.php. Are there really that
> many non-executable lines of code in the parser or is Xdebug missing
> a whole bunch?
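If one wanted to re-derive those numbers mechanically: assuming the Xdebug coverage array (file => line => status, where 1 = executed, -1 = executable but never run, -2 = dead code) were dumped as JSON, a short Python script could recount both figures. The dump filename is hypothetical:

```python
# Hypothetical recount of the figures above from an Xdebug coverage
# dump (file -> line -> status) exported as JSON. Xdebug's convention:
# 1 = executed, -1 = executable but not run, -2 = dead code.
import json

with open("coverage.json") as f:        # made-up dump filename
    coverage = json.load(f)

lines = coverage.get("Parser.php", {})
executable = [s for s in lines.values() if s in (1, -1)]
covered = [s for s in executable if s == 1]
print(len(executable))                              # ~2975 per Xdebug
print(100.0 * len(covered) / len(executable))       # ~72% exercised
```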
Re: [Wikitech-l] w...@home Extension
BTW, whose idea was this extension? I know Michael Dale is writing it, but was this something assigned to him by someone else? Was it discussed beforehand? Or is this just Michael's project through and through?

Thanks,
-Mike
Re: [Wikitech-l] w...@home Extension
Marco Schuster wrote:
> What about video files exploiting some new 0day exploit in a video
> input format? The Wikimedia transcoding servers *must* be totally
> separated from the other WM servers to prevent 0wnage or a site-wide
> hack.

That's no different than a 0day on tiff or djvu. You can do privilege separation, communicate using only pipes...

> About users who run encoding chunks - they have to get a full
> installation of decoders and stuff, which also has to be kept up to
> date (and if the clients run in different countries - there are
> patents and other legal stuff to take care of!); also, the clients
> must be protected from getting infected chunks so they do not get
> 0wned by content wikimedia gave to them (imagine the press
> headlines)...

The exploit affecting third parties seems imho a bigger concern than one affecting wmf servers. The servers can be protected better than the users' systems, and infecting your users is a Really Bad Thing (tm).

Regarding an up-to-date install, the task can include the minimum version needed to run it, to avoid running tasks on outdated systems. Although you can only do that if you provide the whole framework, whereas for patent and license issues it would be preferable to let the users get the codecs themselves.

> I'd actually be interested how YouTube and the other video hosters
> protect themselves against hacker threats - did they code totally new
> de/en-coders?

That would be even more risky than using existing, tested (de|en)coders.
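A minimal sketch of that privilege-separation idea in Python — the decoder runs in a disposable child, data moves only over pipes, and hard resource limits apply. Illustrative assumptions throughout: a POSIX host, ffmpeg2theora on PATH, made-up filenames; this is not Wikimedia's actual configuration.

```python
# Sketch of the privilege separation mentioned above: run the decoder
# in a throwaway child talking only over pipes, with hard resource
# limits. Illustrative only; assumes POSIX and ffmpeg2theora on PATH.
import resource
import subprocess

def limit_child():
    # runs in the child just before exec: cap CPU time and memory so a
    # decoder exploited by a malicious file can't take the box with it
    resource.setrlimit(resource.RLIMIT_CPU, (300, 300))         # seconds
    resource.setrlimit(resource.RLIMIT_AS, (1 << 30, 1 << 30))  # 1 GiB

with open("upload.bin", "rb") as src, open("transcoded.ogv", "wb") as dst:
    proc = subprocess.Popen(
        ["ffmpeg2theora", "-", "-o", "/dev/stdout"],  # pipes in and out
        stdin=src, stdout=dst,
        preexec_fn=limit_child)   # POSIX-only hook
    if proc.wait() != 0:
        raise RuntimeError("transcode failed; treat the input as hostile")
```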
Re: [Wikitech-l] w...@home Extension
I had to program it anyway to support distributing the flattening of sequences, which has been the planned approach for quite some time. I thought of the name and adding one-off support for transcoding recently, and hacked it up over the past few days.

This code will eventually support flattening of sequences. But adding code to do transcoding was a low-hanging-fruit feature and an easy first step. We can now consider whether it's efficient to use the transcoding feature in the wikimedia setup or not, but I will use the code either way to support sequence flattening (which has to take place in the browser since there is no other easy way to guarantee a wysiwyg flat representation of browser-edited sequences).

peace,
--michael

Mike.lifeguard wrote:
> BTW, whose idea was this extension? I know Michael Dale is writing it,
> but was this something assigned to him by someone else? Was it
> discussed beforehand? Or is this just Michael's project through and
> through?