Multiple directories for hadoop

2011-01-17 Thread rajgopalv
I have a question about configuring dfs.data.dir. One of my slaves has 4 x 500GB hard disks, mounted on different mount points: /data1 /data2 /data3 /data4. How can I make use of all 4 hard disks for HDFS data and the local job cache? If I give comma-separated values for dfs.data.dir, will the tot
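A sketch of the usual answer, using the 0.20-era Hadoop property names and hypothetical subdirectory paths under each mount point: list every disk as a comma-separated value in both `dfs.data.dir` (DataNode block storage) and `mapred.local.dir` (intermediate map output and job cache), and the daemons round-robin across them.

```xml
<!-- hdfs-site.xml: the DataNode stripes blocks across every listed directory -->
<property>
  <name>dfs.data.dir</name>
  <value>/data1/hdfs/data,/data2/hdfs/data,/data3/hdfs/data,/data4/hdfs/data</value>
</property>

<!-- mapred-site.xml: intermediate map output is also spread over all disks -->
<property>
  <name>mapred.local.dir</name>
  <value>/data1/mapred/local,/data2/mapred/local,/data3/mapred/local,/data4/mapred/local</value>
</property>
```

Total capacity is then the sum of the free space across the listed directories; if one disk fills or fails, the DataNode skips it (subject to `dfs.datanode.failed.volumes.tolerated` in later releases).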

Re: Non DFS space usage blows up.

2010-12-23 Thread rajgopalv
Dear Stack, The reduce attempt, attempt_201012211759_0003_r_02_0, blows up to 42GB on each of the slave machines [this is in the middle of the job; the map is 92% complete and the reduce is 30% complete]. Regards, Rajgopal V rajgopalv wrote: > > Dear Stack, > > I browsed through t

Re: Non DFS space usage blows up.

2010-12-22 Thread rajgopalv
`file.out` is the HFileOutputFormat of the CSV data. Stack writes: > > Have you taken a look at the content of the 'mapred' directory? > St.Ack > > On Wed, Dec 22, 2010 at 10:34 AM, rajgopalv wrote: > > Jean-Daniel Cryans writes: > >> > >> Look

Re: Non DFS space usage blows up.

2010-12-22 Thread rajgopalv
Jean-Daniel Cryans writes: > > Look on your disks, using du -hs, to see what eats all that space. > > J-D > > On Tue, Dec 21, 2010 at 11:12 PM, rajgopalv wrote: > > > > I'm doing a map reduce job to create the HFileOutputFormat out of CSVs. > > >
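The `du -hs` suggestion from the thread can be scripted; here is a minimal self-contained demo (the directory and file names are invented for illustration, not taken from the cluster in the thread):

```shell
# Create a throwaway directory with a known-size file, then measure it
# the way J-D suggests for hunting down non-DFS space usage.
demo=/tmp/nondfs_demo
mkdir -p "$demo"
dd if=/dev/zero of="$demo/spill.out" bs=1M count=8 status=none

# -h human-readable, -s summarize per argument; sort -h orders by size
du -hs "$demo"/* | sort -h
```

On a real slave you would point this at the `mapred.local.dir` and `dfs.data.dir` mount points to see which subtree is eating the disk.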

Non DFS space usage blows up.

2010-12-21 Thread rajgopalv
I'm doing a map reduce job to create the HFileOutputFormat out of CSVs. * The mapreduce job operates on 75 files, each containing 1 million rows; the total comes to 16GB [with a replication factor of 2, the total DFS used is 32GB]. * There are 300 map jobs. * The map jobs end perfectly. * There are

Re: Zoo keeper exception in the middle of MR

2010-12-15 Thread rajgopalv
Thanks Don, thanks all... The problem was an overloaded master; I had really small machines. One machine (2GB RAM) was both the Hadoop master, a Hadoop slave, and a region server; another machine (2GB RAM again) was the HBase master, a Hadoop slave, and a region server; and another 3 machin

Re: Zoo keeper exception in the middle of MR

2010-12-09 Thread rajgopalv
OOPS! On some forum pages the XML tags created a problem, so here's my previous reply [http://pastebin.com/2wGdswft]. Sorry for the trouble. :( rajgopalv wrote: > > Suraj, > > Hbase works when i work with smaller clusters, so i do

Re: Zoo keeper exception in the middle of MR

2010-12-09 Thread rajgopalv
t is okay.!? rajgopalv wrote: > >>From the logs, it looks like you don't have hbase conf directory in the > classpath. Can you recheck? Also - in what mode are you running hbase? > Fully > distributed? If so, is zookeeper running locally (localhost:2181). > > My guess is that yo

Re: Zoo keeper exception in the middle of MR

2010-12-09 Thread rajgopalv
gs. > > Can you run small test jobs correctly or does everything mess up? > > On Wed, Dec 8, 2010 at 8:26 PM, rajgopalv wrote: > >> >> Ted, >> >> I've tried incrementing my own counter in every map job, but this keep >> happening. >>

Re: Zoo keeper exception in the middle of MR

2010-12-08 Thread rajgopalv
o > update a Hadoop counter as they go along? That might be all that is > needed. > > On Tue, Dec 7, 2010 at 5:37 AM, rajgopalv wrote: > >> Task attempt_201012071646_0001_m_25_0 failed to report status for 600 >> seconds. Killing! >> > > -- View th

Zoo keeper exception in the middle of MR

2010-12-07 Thread rajgopalv
Hi all, I wrote an MR job for inserting rows into HBase. I open a CSV file present in HDFS and use the put() function in map() to insert into HBase. This worked for only about 50% of the data before the job got killed. The log file is here: http://pastebin.com/12gmF3Z6 Why is this happening

Re: Maps sharing a common table.

2010-12-07 Thread rajgopalv
@all: Thanks. I created an instance variable in the setup() function as suggested. rajgopalv wrote: > > Hi, > I'm writing a MR job to read values from a CSV file and insert it into > hbase using Htable.put() > So each map function will insert one row. > There is no re
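The setup() pattern mentioned in the reply can be sketched as follows. This is a non-runnable sketch against the 0.90-era HBase client API that matches the date of this thread; the table name, column family, and CSV layout are invented for illustration.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Sketch: share one HTable per map task by creating it in setup()
// instead of inside every map() call.
public class CsvToHBaseMapper
    extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

  private HTable table;  // one connection per task, reused across map() calls

  @Override
  protected void setup(Context context) throws IOException {
    Configuration conf = HBaseConfiguration.create(context.getConfiguration());
    table = new HTable(conf, "mytable");          // hypothetical table name
    table.setAutoFlush(false);                    // buffer puts client-side
    table.setWriteBufferSize(2 * 1024 * 1024);    // flush roughly every 2 MB
  }

  @Override
  protected void map(LongWritable key, Text line, Context context)
      throws IOException {
    String[] fields = line.toString().split(",");
    Put put = new Put(Bytes.toBytes(fields[0]));  // first CSV field as row key
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("col1"), Bytes.toBytes(fields[1]));
    table.put(put);  // buffered until the write buffer fills
  }

  @Override
  protected void cleanup(Context context) throws IOException {
    table.close();   // flushes any remaining buffered puts
  }
}
```

Disabling auto-flush here also addresses the client-side caching discussed in the "Inserting Random Data into HBASE" thread below: puts are shipped to the region servers in batches rather than one RPC per row.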

Maps sharing a common table.

2010-12-06 Thread rajgopalv
Hi, I'm writing an MR job to read values from a CSV file and insert them into HBase using HTable.put(), so each map call inserts one row. There is no reduce function. But right now I open an HTable instance inside every map call. This is bad, I know... but how can I share a common HTable i

Re: Inserting Random Data into HBASE

2010-12-03 Thread rajgopalv
350544 > > -regards > Amit > > > - Original Message > From: rajgopalv > To: hbase-u...@hadoop.apache.org > Sent: Thu, 2 December, 2010 5:59:06 PM > Subject: Re: Inserting Random Data into HBASE > > > @Mike : > I am using the client side cache.

Re: Inserting Random Data into HBASE

2010-12-02 Thread rajgopalv
ems to be a better option right? rajgopalv wrote: > > Hi, > I have to test HBase to see how long it takes to store 100 million records. > > So I wrote a simple Java program which > > 1: generates a random key, 10 columns per key, and random values for the > 10 columns. >

Sequential Inserts In HBASE.

2010-11-29 Thread rajgopalv
Hi All, I'm new to HBase. I understand that HBase keeps its data sorted in the filesystem, so when we insert randomly it takes time to sort, whereas when we insert sequentially there is no need for HBase to sort. But I keep hearing from some users that sequential inserts to HBase is
