Install CDH4 using tar ball with MRv1, Not YARN version

2013-06-12 Thread selva
using these two tarballs, since Cloudera tailored the steps to package installation. I am confused about whether to start DFS from the hadoop-2.0.0+n version and MapReduce from mr1-0.20.2+n, or to do something else. Kindly help me with the setup. Thanks, Selva

Parallel Load Data into Two partitions of a Hive Table

2013-05-03 Thread selva
processedlogs PARTITION(logdate='2013-04-30'); Thanks Selva

Re: Parallel Load Data into Two partitions of a Hive Table

2013-05-03 Thread selva
Thanks Yanbo. My doubt is clarified now. On Fri, May 3, 2013 at 2:38 PM, Yanbo Liang yanboha...@gmail.com wrote: Loading data into different partitions in parallel is OK, because it is equivalent to writing to different files on HDFS. 2013/5/3 selva selvai...@gmail.com Hi All, I need to load
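
A minimal sketch of what the reply describes: two Hive clients, each loading a different partition of processedlogs at the same time. The table name and the logdate partition column come from the thread; the staging paths, the second date, and the function names are hypothetical, and the HIVE_CMD override exists only so the script can be dry-run without a cluster.

```shell
#!/usr/bin/env bash
# Assumption: HIVE_CMD defaults to the "hive" CLI; set HIVE_CMD=echo for a dry run.
HIVE_CMD="${HIVE_CMD:-hive}"

# Print a LOAD DATA statement for one partition of processedlogs.
# Both arguments (source path, partition date) are illustrative.
load_stmt() {
  local src_path="$1" logdate="$2"
  printf "LOAD DATA INPATH '%s' INTO TABLE processedlogs PARTITION(logdate='%s');" \
    "$src_path" "$logdate"
}

# Start one client per partition in the background, then wait for both.
# Returns non-zero if either load fails.
load_two_partitions() {
  local rc=0
  "$HIVE_CMD" -e "$(load_stmt /staging/logs/2013-04-29 2013-04-29)" &
  local pid_a=$!
  "$HIVE_CMD" -e "$(load_stmt /staging/logs/2013-04-30 2013-04-30)" &
  local pid_b=$!
  wait "$pid_a" || rc=1
  wait "$pid_b" || rc=1
  return "$rc"
}
```

Calling load_two_partitions runs both loads concurrently; each statement moves files into its own partition directory, which is why the two loads do not interfere with each other.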

Re: High IO Usage in Datanodes due to Replication

2013-05-01 Thread selva
: 447943 Verified since restart : 318433 Scans since restart : 318058 Scan errors since restart: 0 Transient scan errors: 0 Current scan rate limit KBps : 3205 Progress this period :101% Time left in cur period : 86.02% Thanks Selva

Re: High IO Usage in Datanodes due to Replication

2013-05-01 Thread selva
Hi Harsh, You are right. Our Hadoop version is 0.20.2-cdh3u1, which lacks HDFS-2379. As you suggested, I have doubled the DN heap size and will now monitor the block scanning speed. The second idea is good, but I cannot merge the small files (~1 MB), since they are all in Hive table partitions. -Selva

High IO Usage in Datanodes due to Replication

2013-04-27 Thread selva
org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_2442656050740605335_10906493 src: /10.171.11.11:60744 dest: /10.157.10.242:10013 of size 25431 Thanks, Selva