Thanks, Ward -- very useful! It would be interesting to run it again on a
recent dump and find out whether certain categories are getting better video
treatment, though the set

Fascinating that in about 2.5 years, the number of videos in that category
has not changed much.

By coincidence, I was looking at a 2009 blog post I wrote about Encarta and
Wikipedia's lack of video/multimedia.

"There is a loss to the world with the absence of Encarta’s historic images
[and video]. Because Wikipedia has a strict “free” edict on content,
especially images and multimedia, it will always be at a disadvantage in
having visuals that are unique and under copyright protection. For that,
the community will have to wait until copyright runs out on those
materials. Technology may be fast, but that’s one area that will be slow."

-Andrew



On Mon, Jan 21, 2013 at 6:04 PM, Ward Cunningham <w...@c2.com> wrote:

> Andrew -- Good question. I have an answer. It's a few years old, but if
> you like my method, I can bring the data up to date.
>
> I used my exploratory parsing mechanism to look for [[File: ... ]] links
> to media files. I first ignored files with familiar suffixes like jpg, png,
> gif and pdf. This left lots of ogg and ogv files which I separated out as
> videos. This left a couple of oga files and some strange suffixes I didn't
> recognize like djvu, shivg and ext. I ignored them.
>
> In total, I found 878 video files on 707 pages, 227 of which were flagged
> as "Articles containing video clips".
>
> I also looked for {{cite video ... }} templates and found 9,716 of them.
>
> I'm scraping this information from an enwiki.xml dump file downloaded Sep
> 22, 2010. It was 12,162,183,168 bytes uncompressed and contained 2,598,517
> pages.
>
> I'm attaching a text file with one line for each page on which I found (at
> least) one video. The tab-separated columns are: page-title, media-file,
> clips-flag.
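
A minimal sketch of the method Ward describes above, not his actual code: it
scans a page's wikitext for [[File: ... ]] links, keeps the ones with an ogg
or ogv suffix, counts {{cite video ... }} templates, and emits tab-separated
lines of page-title, media-file, clips-flag. The function names, the regular
expressions, the literal flag value "clips", and the assumption that the flag
shows up as the category name in the wikitext are all hypothetical choices
made for illustration.

```python
import re

# Suffixes treated as video, following the description above.
# (ogg can also hold audio; this sketch does not try to tell them apart.)
VIDEO_SUFFIXES = {"ogg", "ogv"}

# Captures the file name in [[File:Name.ext|...]] up to the first "|" or "]".
FILE_LINK = re.compile(r"\[\[\s*File\s*:\s*([^|\]]+)", re.IGNORECASE)

# Matches the opening of a {{cite video ...}} template.
CITE_VIDEO = re.compile(r"\{\{\s*cite video\b", re.IGNORECASE)

# Assumption: a flagged page carries this category name in its wikitext.
CLIPS_CATEGORY = "Articles containing video clips"


def suffix_of(name):
    """Return the lowercased text after the last dot, or "" if there is none."""
    name = name.strip()
    return name.rsplit(".", 1)[1].lower() if "." in name else ""


def video_files(wikitext):
    """List file names from [[File: ...]] links that have a video suffix."""
    names = (m.group(1).strip() for m in FILE_LINK.finditer(wikitext))
    return [n for n in names if suffix_of(n) in VIDEO_SUFFIXES]


def count_cite_video(wikitext):
    """Count {{cite video ...}} templates in the wikitext."""
    return len(CITE_VIDEO.findall(wikitext))


def report_lines(title, wikitext):
    """Build one tab-separated line per video: title, media file, clips flag."""
    flag = "clips" if CLIPS_CATEGORY in wikitext else ""
    return ["\t".join((title, name, flag)) for name in video_files(wikitext)]
```

Calling report_lines(title, text) for each page in a dump and writing the
results to a file would give a listing in the same three-column shape as the
attachment; reading the dump itself (for example with a streaming XML parser)
is left out of the sketch.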
>
>
>
> I'd be happy to adjust my methods if there are other ways to mark up a
> video. I hope this is useful.
>
> Best regards. -- Ward
>
>
> On Jan 21, 2013, at 3:17 PM, Andrew Lih wrote:
>
> Hi all,
>
> I'm wondering if anyone has done any research into identifying which
> articles in Wikipedia have associated video?
>
> There is this category, which only has 280 or so articles:
> http://en.wikipedia.org/wiki/Category:Articles_containing_video_clips
>
> It seems far from complete. Appreciate any advice or previous work in this
> area.
>
> The background: I'm working with some grad students on staging a Wiki
> Makes Video contest in April, and we'd like to do some measurement of the
> current state of video in Wikipedia.
>
> Thanks, and email me if you'd like to know more about the video project
> for April.
>
> -Andrew
>
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>
