We are somewhat new to Hadoop and are looking to run some experiments with
HDFS, Pig, and HBase.
With that in mind, we have a few questions:
What is the easiest (preferably free) Hadoop distro to get started with?
Cloudera?
What host OS distro/release is recommended?
What is the easiest environment to get started with? Amazon EC2? Is there
anyone offering virtual/hosted prebuilt Hadoop instances?
Where would we find some "big data" files that people have used for testing
purposes?
Feel free to RTFM me and point me to the right place ;-)
Thanks, John