Re: Multicluster Communication

2009-06-19 Thread Rakhi Khatwani
for analysis? Regards Raakhi On Fri, Jun 19, 2009 at 4:19 PM, Harish Mallipeddi harish.mallipe...@gmail.com wrote: On Fri, Jun 19, 2009 at 4:06 PM, Rakhi Khatwani rakhi.khatw...@gmail.com wrote: we want Hadoop cluster 1 for collecting data and storing it in HDFS, and we want Hadoop cluster 2
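
A minimal sketch of moving data between the two clusters from Java, assuming hypothetical namenode addresses (hdfs://namenode1:9000 and hdfs://namenode2:9000) and a made-up directory layout; at scale the distcp tool mentioned elsewhere in this archive does the same job as a map-reduce copy.

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FileUtil;
    import org.apache.hadoop.fs.Path;

    public class CrossClusterCopy {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Cluster 1 holds the collected data; cluster 2 runs the analysis jobs.
            FileSystem src = FileSystem.get(new URI("hdfs://namenode1:9000"), conf);
            FileSystem dst = FileSystem.get(new URI("hdfs://namenode2:9000"), conf);
            // Copy one directory across; 'false' means do not delete the source.
            FileUtil.copy(src, new Path("/collected/day1"),
                          dst, new Path("/analysis/input/day1"),
                          false, conf);
        }
    }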

Re: Debugging Map-Reduce programs

2009-06-17 Thread Rakhi Khatwani
Hi, You could also use Apache Commons Logging to write logs in your map/reduce functions, which will be seen in the jobtracker UI. That's how we did debugging :) Hope it helps. Regards, Raakhi On Tue, Jun 16, 2009 at 7:29 PM, jason hadoop jason.had...@gmail.com wrote: When you are running
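
A rough sketch of the logging approach described above, using the old mapred API of that era; the class and field names are illustrative. Anything written through Commons Logging inside map() ends up in the per-task logs reachable from the jobtracker web UI.

    import java.io.IOException;
    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class LoggingMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, LongWritable> {

        private static final Log LOG = LogFactory.getLog(LoggingMapper.class);

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, LongWritable> output,
                        Reporter reporter) throws IOException {
            // Shows up in the task logs linked from the jobtracker web UI.
            LOG.info("processing record at offset " + key.get());
            output.collect(value, key);
        }
    }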

Customizing machines to use for different jobs

2009-06-04 Thread Rakhi Khatwani
Hi, Can we specify which subset of machines to use for different jobs? E.g. we set machine A as the namenode, and B, C, D as datanodes. Then for job 1, we have a map-reduce that runs on B and C, and for job 2, the map-reduce runs on C and D. Regards, Raakhi

Re: Setting up another machine as secondary node

2009-05-27 Thread Rakhi Khatwani
-node's directories are damaged. In the regular case you start the name-node with ./hadoop-daemon.sh start namenode Thanks, --Konstantin Rakhi Khatwani wrote: Hi, I followed the instructions suggested by you all, but I still come across this exception when I use the following command

Re: Setting up another machine as secondary node

2009-05-26 Thread Rakhi Khatwani
On Thu, May 14, 2009 at 9:03 AM, Rakhi Khatwani rakhi.khatw...@gmail.com wrote: Hi, I want to set up a cluster of 5 nodes in such a way that node1 - master, node2 - secondary namenode, node3 - slave, node4 - slave, node5 - slave. How do we go about that? There is no property

JobInProgress and TaskInProgress

2009-05-18 Thread Rakhi Khatwani
Hi, how do I get the job progress and task progress information programmatically at any point in time using the APIs? There are JobInProgress and TaskInProgress classes... but both of them are private. Any suggestions? Thanks, Raakhi

Re: JobInProgress and TaskInProgress

2009-05-18 Thread Rakhi Khatwani
. Thanks, Raakhi On Mon, May 18, 2009 at 7:46 PM, Jothi Padmanabhan joth...@yahoo-inc.com wrote: Could you let us know what information you are looking to extract from these classes? You could possibly get it from other classes. Jothi On 5/18/09 6:23 PM, Rakhi Khatwani rakhi.khatw
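
For what it's worth, the public way to poll progress in that API generation is RunningJob obtained through JobClient, rather than the private JobInProgress/TaskInProgress classes. A rough sketch, with the job id made up for illustration (normally you would keep the RunningJob returned by submitJob()):

    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RunningJob;

    public class ProgressPoller {
        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf();
            JobClient client = new JobClient(conf);
            // Hypothetical job id; getJob() returns null if the jobtracker no longer knows it.
            RunningJob job = client.getJob("job_200905181234_0001");
            while (!job.isComplete()) {
                System.out.printf("map %.0f%%, reduce %.0f%%%n",
                        job.mapProgress() * 100, job.reduceProgress() * 100);
                Thread.sleep(5000);
            }
            System.out.println("succeeded: " + job.isSuccessful());
        }
    }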

Setting up another machine as secondary node

2009-05-14 Thread Rakhi Khatwani
Hi, I want to set up a cluster of 5 nodes in such a way that node1 - master, node2 - secondary namenode, node3 - slave, node4 - slave, node5 - slave. How do we go about that? There is no property in hadoop-env where I can set the IP address for the secondary name node. If I set node-1 and node-2 in
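
For reference, in the 0.19-era layout the secondary namenode host is not set in hadoop-env.sh at all: start-dfs.sh starts secondary namenodes on the hosts listed in conf/masters, and the secondary needs to know where to reach the primary namenode's HTTP interface. Roughly, with the hostnames above taken as illustrative:

    # conf/masters on node1 (hosts that run the secondary namenode)
    node2

    # conf/slaves on node1 (hosts that run datanodes/tasktrackers)
    node3
    node4
    node5

    # hadoop-site.xml on node2, so the secondary can fetch the image from the primary
    <property>
      <name>dfs.http.address</name>
      <value>node1:50070</value>
    </property>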

Small issues regarding hadoop/hbase

2009-05-06 Thread Rakhi Khatwani
Hi, I have a couple of small issues regarding Hadoop/HBase: 1. I want to scan a table, but the table is really huge, so I want to write the result of the scan to some file so that I can analyze it. How do we go about it? 2. How do you dynamically add and remove nodes in the cluster without disturbing the
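
For issue 1, a rough sketch of dumping a scan to a local file, written against the newer (0.20-style) HBase client API rather than the 0.19 client being run here; the table name "mytable" and family "data" are made up.

    import java.io.PrintWriter;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class DumpTable {
        public static void main(String[] args) throws Exception {
            HTable table = new HTable(new HBaseConfiguration(), "mytable");
            Scan scan = new Scan();
            scan.addFamily(Bytes.toBytes("data"));
            scan.setCaching(1000);               // fetch rows in batches to cut RPC round-trips
            ResultScanner scanner = table.getScanner(scan);
            PrintWriter out = new PrintWriter("scan-dump.txt");
            try {
                for (Result r : scanner) {
                    // Write just the row key; pull individual cells out of r as needed.
                    out.println(Bytes.toString(r.getRow()));
                }
            } finally {
                out.close();
                scanner.close();
            }
        }
    }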

Re: Load .so library error when Hadoop calls JNI interfaces

2009-04-30 Thread Rakhi Khatwani
Hi Jason, when will the full version of your book be available? On Thu, Apr 30, 2009 at 8:51 AM, jason hadoop jason.had...@gmail.com wrote: You need to make sure that the shared library is available on the tasktracker nodes, either by installing it, or by pushing it around via the
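
A hedged sketch of the "push it around" option, presumably the DistributedCache, with a made-up library path; the .so must already sit in HDFS, and the symlink plus the java.library.path setting let System.loadLibrary() find it on each tasktracker.

    import java.net.URI;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.mapred.JobConf;

    public class NativeLibJobSetup {
        public static void configure(JobConf conf) throws Exception {
            // Hypothetical path; the fragment after '#' is the symlink name in the task's cwd.
            DistributedCache.addCacheFile(new URI("/libs/libmycode.so#libmycode.so"), conf);
            DistributedCache.createSymlink(conf);
            // Make the task JVM look in its working directory for native libraries.
            conf.set("mapred.child.java.opts", "-Xmx200m -Djava.library.path=.");
        }
    }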

IO Exception in Map Tasks

2009-04-27 Thread Rakhi Khatwani
Hi, In one of the map tasks, I get the following exception: java.io.IOException: Task process exit with nonzero status of 255. at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:424) java.io.IOException: Task process exit with nonzero status of 255. at

Re: IO Exception in Map Tasks

2009-04-27 Thread Rakhi Khatwani
Thanks Jason, is there any way we can avoid this exception? Thanks, Raakhi On Mon, Apr 27, 2009 at 1:20 PM, jason hadoop jason.had...@gmail.com wrote: The JVM had a hard failure and crashed. On Sun, Apr 26, 2009 at 11:34 PM, Rakhi Khatwani rakhi.khatw...@gmail.com wrote: Hi

Re: Advice on restarting HDFS in a cron

2009-04-25 Thread Rakhi Khatwani
Hi, I have faced a somewhat similar issue... I have a couple of map-reduce jobs running on EC2... after a week or so, I get a no space left on device exception while performing any Linux command... so I end up shutting down Hadoop and HBase, clearing the logs and then restarting them. Is there a cleaner

Re: Advice on restarting HDFS in a cron

2009-04-25 Thread Rakhi Khatwani
redirect your logs to some place under /mnt (/dev/sdb1); that's 160 GB. - Aaron On Sun, Apr 26, 2009 at 3:21 AM, Rakhi Khatwani rakhi.khatw...@gmail.com wrote: Hi, I have faced a somewhat similar issue... I have a couple of map-reduce jobs running on EC2... after a week or so, I get

Custom Input Split

2009-04-22 Thread Rakhi Khatwani
Hi, I have a table with N records, and now I want to run a map-reduce job with 4 maps and 0 reduces. Is there a way I can create my own custom input split so that I can send 'n' records to each map? If there is a way, can I have a sample code snippet to gain a better understanding?
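
A bare-bones sketch of the idea in the old mapred API, with a hypothetical RangeSplit that just carries a record range: getSplits() divides the N records into numSplits ranges, and the RecordReader (omitted here) would fetch rows start..end from the table for its split. The property name my.total.records is made up.

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.InputFormat;
    import org.apache.hadoop.mapred.InputSplit;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.RecordReader;
    import org.apache.hadoop.mapred.Reporter;

    public class RangeInputFormat implements InputFormat<LongWritable, Text> {

        // One split = a contiguous range of record indexes [start, end).
        public static class RangeSplit implements InputSplit {
            long start, end;
            public RangeSplit() {}                     // needed for deserialization
            RangeSplit(long start, long end) { this.start = start; this.end = end; }
            public long getLength() { return end - start; }
            public String[] getLocations() { return new String[0]; }  // no locality hints
            public void write(DataOutput out) throws IOException {
                out.writeLong(start); out.writeLong(end);
            }
            public void readFields(DataInput in) throws IOException {
                start = in.readLong(); end = in.readLong();
            }
        }

        public InputSplit[] getSplits(JobConf job, int numSplits) throws IOException {
            long total = job.getLong("my.total.records", 0);   // hypothetical property holding N
            long per = (total + numSplits - 1) / numSplits;
            InputSplit[] splits = new InputSplit[numSplits];
            for (int i = 0; i < numSplits; i++) {
                splits[i] = new RangeSplit(i * per, Math.min((i + 1) * per, total));
            }
            return splits;
        }

        public RecordReader<LongWritable, Text> getRecordReader(
                InputSplit split, JobConf job, Reporter reporter) throws IOException {
            // A real implementation returns a reader that iterates the rows in this split's range.
            throw new UnsupportedOperationException("RecordReader omitted from this sketch");
        }
    }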

Re: Ec2 instability

2009-04-18 Thread Rakhi Khatwani
node problem with 10. - Andy From: Rakhi Khatwani Subject: Re: Ec2 instability To: hbase-u...@hadoop.apache.org, core-user@hadoop.apache.org Date: Friday, April 17, 2009, 9:44 AM Hi, this is the exception I have been getting at the map-reduce: java.io.IOException: Cannot run

Ec2 instability

2009-04-17 Thread Rakhi Khatwani
Hi, It's been several days that we have been trying to stabilize Hadoop/HBase on an EC2 cluster, but we have failed to do so. We still come across frequent region server failures, scanner timeout exceptions, OS-level deadlocks etc... and today, while doing a list of tables on HBase, I get the following

Re: Ec2 instability

2009-04-17 Thread Rakhi Khatwani
) at java.lang.ProcessImpl.start(ProcessImpl.java:65) at java.lang.ProcessBuilder.start(ProcessBuilder.java:452) ... 10 more On Fri, Apr 17, 2009 at 10:09 PM, Rakhi Khatwani rakhi.khatw...@gmail.com wrote: Hi, It's been several days that we have been trying to stabilize Hadoop

No space left on device Exception

2009-04-16 Thread Rakhi Khatwani
Hi, I am running a map-reduce program on a 6-node EC2 cluster, and after a couple of hours all my tasks get hung. So I started digging into the logs: there were no logs for the regionserver and no logs for the tasktracker. However, for the jobtracker I get the following: 2009-04-16 03:00:29,691 INFO

Re: No space left on device Exception

2009-04-16 Thread Rakhi Khatwani
of memory available, but I still get the exception :( Thanks, Raakhi On Thu, Apr 16, 2009 at 1:18 PM, Desai, Milind B milind.de...@hp.com wrote: From the exception it appears that there is no space left on the machine. You can check using 'df'. Thanks, Milind -Original Message- From: Rakhi

Re: No space left on device Exception

2009-04-16 Thread Rakhi Khatwani
you don't. I would check on the file system as your jobs run and see if indeed they are filling up. Miles 2009/4/16 Rakhi Khatwani rakhi.khatw...@gmail.com: Hi, following is the output of the df command [r...@domu-12-31-39-00-e5-d2 conf]# df -h Filesystem Size Used Avail

Migration

2009-04-16 Thread Rakhi Khatwani
Hi, In case we migrate from Hadoop 0.19.0 and HBase 0.19.0 to Hadoop 0.20.0 and HBase 0.20.0 respectively, how would it affect the existing data on the Hadoop DFS and the HBase tables? Can we migrate the data using distcp only? Regards, Raakhi

Changing block size of hadoop

2009-04-12 Thread Rakhi Khatwani
Hi, I would like to know if it is feasible to change the block size of Hadoop while map-reduce jobs are executing? And if not, would the following work? 1. stop map-reduce 2. stop HBase 3. stop Hadoop 4. change hadoop-site.xml to reduce the block size 5. restart all. Whether the data in the
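
On the block size question: a changed dfs.block.size only affects files written after the change; blocks already in HDFS keep their original size unless the files are rewritten (for example, copied to a new path). A small sketch of setting the size per client or per file, with the sizes and the path chosen purely as examples:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockSizeExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Applies to every file this client writes from now on; existing blocks are untouched.
            conf.setLong("dfs.block.size", 64 * 1024 * 1024);
            FileSystem fs = FileSystem.get(conf);

            // Or pick the block size explicitly for a single file.
            FSDataOutputStream out = fs.create(new Path("/tmp/small-block-file"),
                    true,                      // overwrite
                    4096,                      // io buffer size
                    (short) 3,                 // replication
                    32 * 1024 * 1024L);        // block size in bytes
            out.writeBytes("hello\n");
            out.close();
        }
    }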

My Map Tasks are getting killed

2009-04-05 Thread Rakhi Khatwani
Hi, I am executing a job on EC2 (set up on a cluster with 18 nodes... my job has 7 map tasks). However, my tasks get killed without reporting an error. I even tried going through the logs, which happen to be fine. On the UI the tasks fail and the status shows as KILLED (error column being

Re: My Map Tasks are getting killed

2009-04-05 Thread Rakhi Khatwani
this is happening? Is it a problem because I am performing a split on the table inside my map? Thanks, Raakhi. On Sun, Apr 5, 2009 at 12:18 PM, Rakhi Khatwani rakhi.khatw...@gmail.com wrote: Hi, I am executing a job on EC2 (set up on a cluster with 18 nodes... my job has 7 map tasks). However