Setting up vendor distros is a great first step.

1) Running TeraSort and benchmarking is a good next step. You can also run larger, full-stack Hadoop applications like BigPetStore, which we curate here: https://github.com/apache/bigtop/tree/master/bigtop-bigpetstore/

2) Write some MapReduce or Spark jobs that write data to a persistent transactional store, such as Solr or HBase. This is a hugely important part of real-world Hadoop administration, where you will encounter problems like running out of memory, possibly CPU overclocking on some nodes, and so on.

3) Now, did you want to go deeper into the build/setup/deployment of Hadoop? It's worth trying to build, deploy, and debug Hadoop ecosystem components from scratch by setting up Apache Bigtop, which packages RPM/DEB artifacts and provides Puppet recipes for distributions. It is the original root of both the Cloudera and Hortonworks distributions, so you will learn something about both by playing with it. We have some exercises you can use to guide you and get started: https://cwiki.apache.org/confluence/display/BIGTOP/BigTop+U%3A+Exersizes

Feel free to join the mailing list for questions.

On Sat, Mar 7, 2015 at 9:32 AM, max scalf <oracle.bl...@gmail.com> wrote:

> Krish,
>
> I dont mean to hijack your mail here but i wanted to find out how/what you
> did for the below portion, as i am trying to go down your path as well, i
> was able to get 4-5 node cluster using ambari and cdh and now wanted to
> take it to next level. What have you done for below?
>
> "I have done a web log integration using flume and twitter sentiment
> analysis."
>
> On Sat, Mar 7, 2015 at 12:11 AM, Krish Donald <gotomyp...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I would like to enter into Big Data world as Hadoop Admin and I have
>> setup 7 nodes cluster using Ambari, Cloudera Manager and Apache Hadoop.
>> I have installed the services like hive, oozie, zookeeper etc.
>>
>> I have done a web log integration using flume and twitter sentiment
>> analysis.
>>
>> I wanted to understand what are the other skills I should learn ?
>>
>> Thanks
>> Krish
>>
>
>

--
jay vyas