I've just made a mirror of the Wikimedia pagecounts in a requester pays bucket in the AWS cloud.
http://basekb.com/subjectiveEye/wikipedia_traffic_page_counts.php

The data in S3 can be used efficiently from a Hadoop cluster running in AWS, and there is an open source package for doing so that needs nothing more than your AWS credentials to get started:

https://github.com/paulhoule/telepath

With hourly hit statistics for all URIs in all Wikimedia projects, this data set contains a wealth of information.

--
Paul Houle
Expert on Freebase, DBpedia, Hadoop and RDF
(607) 539 6254    paul.houle on Skype    ontolo...@gmail.com

_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
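For anyone who wants a feel for the hourly files before starting a Hadoop cluster, here is a minimal sketch in plain Python. It is not taken from telepath. It assumes the usual pagecounts layout of four space-separated fields per line (project code, page title, hit count, bytes transferred). The names PageCount, parse_pagecounts, hits_per_project and read_gzipped are made up for illustration, and the sample lines are invented, not real traffic figures. Fetching the files from S3 is left out because the bucket name is not given in this post.

```python
import gzip
from collections import Counter
from typing import Iterable, Iterator, NamedTuple


class PageCount(NamedTuple):
    """One line of an hourly file (assumed four-field layout)."""
    project: str  # project code, e.g. "en" for English Wikipedia
    page: str     # page title as it appeared in the request URI
    hits: int     # number of requests during the hour
    size: int     # total bytes transferred


def parse_pagecounts(lines: Iterable[str]) -> Iterator[PageCount]:
    """Yield a record for each well-formed line and skip the rest."""
    for line in lines:
        fields = line.rstrip("\n").split(" ")
        if len(fields) != 4:
            continue  # wrong number of fields
        project, page, hits, size = fields
        try:
            yield PageCount(project, page, int(hits), int(size))
        except ValueError:
            continue  # non-numeric counters


def hits_per_project(records: Iterable[PageCount]) -> Counter:
    """Sum the hourly hits for each project code."""
    totals: Counter = Counter()
    for rec in records:
        totals[rec.project] += rec.hits
    return totals


def read_gzipped(path: str) -> Iterator[PageCount]:
    """Stream records from a locally downloaded .gz file."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        yield from parse_pagecounts(fh)


# Invented sample lines; the last one is malformed on purpose.
sample = [
    "en Main_Page 242332 4737756101",
    "en Hadoop 120 2400000",
    "de Hauptseite 5000 90000000",
    "malformed line",
]
totals = hits_per_project(parse_pagecounts(sample))
print(totals.most_common())
```

The same per-line parsing is what a Hadoop mapper would do, with the per-project sum as the reducer. For a local file, hits_per_project(read_gzipped(path)) gives the totals for that hour.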