Here is an interesting article about how to process Common Crawl data using
Amazon EMR:
http://www.commoncrawl.org/mapreduce-for-the-masses/

I think we should be able to do something similar with Whirr quite easily.

I will give it a try soon.

-- Andrei Savu
