Hi;

@Derrick: I don't trust Amazon. Really, I don't trust the Wikimedia Foundation either. They can't and/or they don't want to provide image dumps (which is worse?). The community donates images to Commons, the community donates money every year, and now the community needs to develop software to extract all the images and pack them, and of course, host them permanently. Crazy, right?
@Milos: Instead of splitting the image dump by the first letter of the filenames, I thought about splitting it by upload date (YYYY-MM-DD). The first chunks (2005-01-01) would be tiny, and recent ones would be several GB each (for a single day).

Regards,
emijrp

2011/6/28 Derrick Coetzee <dcoet...@eecs.berkeley.edu>

> As a Commons admin I've thought a lot about the problem of
> distributing Commons dumps. As for distribution, I believe BitTorrent
> is absolutely the way to go, but the Torrent will require a small
> network of dedicated permaseeds (servers that seed indefinitely).
> These can easily be set up at low cost on Amazon EC2 "small" instances
> - the disk storage for the archives is free, since small instances
> include a large (~120 GB) ephemeral storage volume at no additional
> cost, and the cost of bandwidth can be controlled by configuring the
> BitTorrent client with either a bandwidth throttle or a transfer cap
> (or both). In fact, I think all Wikimedia dumps should be available
> through such a distribution solution, just as all Linux installation
> media are today.
>
> Additionally, it will be necessary to construct (and maintain) useful
> subsets of Commons media, such as "all media used on the English
> Wikipedia", or "thumbnails of all images on Wikimedia Commons", of
> particular interest to certain content reusers, since the full set is
> far too large to be of interest to most reusers. It's on this latter
> point that I want your feedback: what useful subsets of Wikimedia
> Commons does the research community want? Thanks for your feedback.
>
> --
> Derrick Coetzee
> User:Dcoetzee, English Wikipedia and Wikimedia Commons administrator
> http://www.eecs.berkeley.edu/~dcoetzee/
>

_______________________________________________
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
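
P.S. A rough sketch of the date-based splitting suggested to Milos above, in Python. The input format (pairs of filename and ISO upload timestamp), the sample filenames, and the function name `group_by_upload_date` are assumptions for illustration only, not an existing tool:

```python
from collections import defaultdict
from datetime import datetime


def group_by_upload_date(uploads):
    """Group (filename, timestamp) pairs into chunks keyed by YYYY-MM-DD."""
    chunks = defaultdict(list)
    for filename, timestamp in uploads:
        # Assumed timestamp format: 2005-01-01T12:34:56Z
        parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
        chunks[parsed.strftime("%Y-%m-%d")].append(filename)
    return dict(chunks)


# Hypothetical sample data: two uploads on one day, one on another.
uploads = [
    ("Example_a.jpg", "2005-01-01T10:00:00Z"),
    ("Example_b.png", "2005-01-01T18:30:00Z"),
    ("Example_c.jpg", "2011-06-27T09:15:00Z"),
]

for day, files in sorted(group_by_upload_date(uploads).items()):
    print(day, files)
```

Each key would then become one archive (one chunk per upload day), so old chunks never change once their day has passed and only the newest ones need to be regenerated.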