Greetings all,

I am using Cloudera CDH3 for our Hadoop deployment. We have 7 nodes: 5 are
used for a fully distributed cluster, 1 for a pseudo-distributed setup, and
1 as a management node.

Fully distributed cluster: HDFS, MapReduce & HBase
Pseudo-distributed mode: all

I have read that Pig, Hive & Sqoop can be installed on a client node, with
no need to install them on the cluster. What exactly is a client node? Can
I use my management node as a client?

What is the best practice for installing Pig, Hive & Sqoop?
For the fully distributed cluster, do we need to install Pig, Hive & Sqoop
on each node?

MySQL is needed for Hive as a metastore, and Sqoop can import a MySQL
database into HDFS, Hive, or Pig. Can we use MySQL databases residing on
another node for both purposes?

-- 
Thanks & Regards
----
Manu S
SI Engineer - OpenSource & HPC
Wipro Infotech
Mob: +91 8861302855                Skype: manuspkd
www.opensourcetalk.co.in
