Rescheduling of already completed map/reduce task

2009-04-27 Thread Sagar Naik
Hi, The job froze after the filesystem hung on a machine which had successfully completed a map task. Is there a flag to enable the rescheduling of such a task? Jstack of job tracker: SocketListener0-2 prio=10 tid=0x08916000 nid=0x4a4f runnable [0x4d05c000..0x4d05ce30]

Multithreaded Reducer

2009-04-10 Thread Sagar Naik
Hi, I would like to implement a multi-threaded reducer. As per my understanding, the framework does not have one because we expect the output to be sorted. However, in my case I don't need the output sorted. Can you please point me to any other issues, or would it be safe to do so? -Sagar
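One way to picture the idea is a plain-Java sketch with no Hadoop dependency: the values arriving at one reduce call are fanned out to a thread pool, which is safe precisely when output ordering does not matter. The thread count and the per-value work (squaring) are illustrative assumptions, not from the original thread:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of a "multi-threaded reducer": process the values of one
// reduce call on a thread pool. Safe when output need not be sorted.
public class ParallelReduce {
    static final int THREADS = 4; // illustrative pool size

    static List<Integer> reduce(List<Integer> values) {
        ExecutorService pool = Executors.newFixedThreadPool(THREADS);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (final int v : values) {
                futures.add(pool.submit(() -> v * v)); // stand-in for real per-value work
            }
            List<Integer> out = new ArrayList<>();
            for (Future<Integer> f : futures) {
                out.add(f.get()); // block until each task finishes
            }
            return out;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(reduce(Arrays.asList(1, 2, 3))); // [1, 4, 9]
    }
}
```

Collecting the futures in submission order keeps the result deterministic even though the work runs concurrently.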

Re: Multithreaded Reducer

2009-04-10 Thread Sagar Naik
, and increase the total number of reduce tasks per job via mapred.reduce.tasks to ensure that they're all filled. This will effectively utilize a higher number of cores. - Aaron On Fri, Apr 10, 2009 at 11:12 AM, Sagar Naik sn...@attributor.com wrote: Hi, I would like to implement a Multi-threaded

Re: connecting two clusters

2009-04-07 Thread Sagar Naik
Hi, I'm not sure if you have looked at this option, but instead of having two HDFS instances, you can have one HDFS and two map-red clusters (pointing to the same HDFS) and then do the sync mechanisms. -Sagar Mithila Nagendra wrote: Hello Aaron Yes it makes a lot of sense! Thank you! :) The incremental

Re: safemode forever

2009-04-07 Thread Sagar Naik
It means that not all blocks have been reported. Can you check how many datanodes have reported, either in the UI or via bin/hadoop dfsadmin -report. In case you have to disable safemode, check the bin/hadoop dfsadmin -safemode command; it has options to enter/leave/get. -Sagar javateck javateck wrote: Hi, I'm

Re: hadoop-a small doubt

2009-03-29 Thread Sagar Naik
Yes, you can. Java client: copy the conf dir (same as the one on the namenode/datanode), and the hadoop jars should be in the classpath of the client. Non-Java client: http://wiki.apache.org/hadoop/MountableHDFS -Sagar deepya wrote: Hi, I am SreeDeepya doing MTech in IIIT. I am working on a project

Re: Design issue for a problem using Map Reduce

2009-02-14 Thread Sagar Naik
Here is one thought: N maps and 1 reduce. Input to map: t, w(t); output of map: t, w(t)*w(t). I assume t is an integer. So in the case of 1 reducer, you will receive t0, square(w(0)); t1, square(w(1)); t2, square(w(2)); t3, square(w(3)). Note this will be a sorted series on t. In reduce: static prevF = 0;
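The message is truncated, but the shape of the scheme can be sketched in plain Java: the "map" step turns (t, w(t)) into (t, w(t)^2), and a single "reduce" walks the t-sorted series carrying running state (the prevF of the pseudo-code). The combining rule used here, a running sum, is an illustrative assumption:

```java
import java.util.Arrays;

// Sketch of the map/reduce scheme above, collapsed into one process.
public class SquareSeries {
    static double[] run(double[] w) {
        // "map": emit (t, w(t)*w(t)) for each t
        double[] squared = new double[w.length];
        for (int t = 0; t < w.length; t++) {
            squared[t] = w[t] * w[t];
        }
        // "reduce": single reducer sees t in sorted order and keeps
        // running state across calls (the prevF of the pseudo-code)
        double prevF = 0;
        double[] f = new double[w.length];
        for (int t = 0; t < w.length; t++) {
            prevF += squared[t]; // assumed combining rule: running sum
            f[t] = prevF;
        }
        return f;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run(new double[]{1, 2, 3})));
        // [1.0, 5.0, 14.0]
    }
}
```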

Re: Not able to copy a file to HDFS after installing

2009-02-04 Thread Sagar Naik
Where is the namenode running? localhost or some other host? -Sagar Rajshekar wrote: Hello, I am new to Hadoop and I just installed it on Ubuntu 8.0.4 LTS as per the guidance of a web site. I tested it and found it working fine. I tried to copy a file but it is giving some error, please help me out

Re: My tasktrackers keep getting lost...

2009-02-02 Thread Sagar Naik
Can you post the output from hadoop-argus-hostname-jobtracker.out? -Sagar jason hadoop wrote: When I was at Attributor we experienced periodic odd XFS hangs that would freeze up the Hadoop server processes resulting in them going away. Sometimes XFS would deadlock all writes to the log file and

Re: Question about HDFS capacity and remaining

2009-02-01 Thread Sagar Naik
Hi Brian, Is it possible to publish these test results along with configuration options ? -Sagar Brian Bockelman wrote: For what it's worth, our organization did extensive tests on many filesystems benchmarking their performance when they are 90 - 95% full. Only XFS retained most of its

Re: sudden instability in 0.18.2

2009-01-28 Thread Sagar Naik
Please check which nodes have these failures. I guess the new tasktrackers/machines are not configured correctly. As a result, the map tasks will die and the remaining map tasks will be sucked onto these machines. -Sagar David J. O'Dell wrote: We've been running 0.18.2 for over a month on an 8

Re: tools for scrubbing HDFS data nodes?

2009-01-28 Thread Sagar Naik
Check out fsck: bin/hadoop fsck path -files -blocks -locations Sriram Rao wrote: By scrub I mean, have a tool that reads every block on a given data node. That way, I'd be able to find corrupted blocks proactively rather than having an app read the file and find it. Sriram On Wed, Jan 28,

Re: tools for scrubbing HDFS data nodes?

2009-01-28 Thread Sagar Naik
are good? Sriram On Wed, Jan 28, 2009 at 6:20 PM, Sagar Naik sn...@attributor.com wrote: Check out fsck: bin/hadoop fsck path -files -blocks -locations Sriram Rao wrote: By scrub I mean, have a tool that reads every block on a given data node. That way, I'd be able to find corrupted blocks

Re: HDFS - millions of files in one directory?

2009-01-27 Thread Sagar Naik
Consider a system with 1 billion small files. The namenode will need to maintain the data structures for all those files. The system will have at least 1 block per file, and if you have the replication factor set to 3, the system will have 3 billion blocks. Now, if you try to read all these files in a job, you will be
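To put rough numbers on this, a back-of-the-envelope sketch; the ~150 bytes of namenode heap per file and per block object is a commonly cited rule of thumb, not a figure from the original message:

```java
// Rough scale estimate for 1 billion small files on HDFS.
public class NamenodeScale {
    public static void main(String[] args) {
        long files = 1_000_000_000L;  // one small file each
        long blocksPerFile = 1;       // at least one block per file
        int replication = 3;

        long uniqueBlocks = files * blocksPerFile;
        long blockReplicas = uniqueBlocks * replication;

        // Rule of thumb (assumption): ~150 bytes of namenode heap
        // per file object and per block object.
        long heapBytes = (files + uniqueBlocks) * 150L;

        System.out.println(blockReplicas);           // 3000000000
        System.out.println(heapBytes / (1L << 30));  // ~279 GiB of heap
    }
}
```

Even before any job runs, the metadata alone would dwarf a typical namenode heap, which is why many small files are a problem.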

Mapred job parallelism

2009-01-26 Thread Sagar Naik
Hi Guys, I was trying to set up a cluster so that two jobs can run simultaneously. The conf: number of nodes: 4 (say), mapred.tasktracker.map.tasks.maximum=2, and in the JobClient mapred.map.tasks=4 (# of nodes). I also have a condition that each job should have only one map task per node

Re: Calling a mapreduce job from inside another

2009-01-19 Thread Sagar Naik
You can also play with the priority of the jobs to have the innermost job finish first -Sagar Devaraj Das wrote: You can chain job submissions at the client. Also, you can run more than one job in parallel (if you have enough task slots). An example of chaining jobs is there in

Locks in hadoop

2009-01-15 Thread Sagar Naik
I would like to implement a locking mechanism across the HDFS cluster. I assume there is no inherent support for it, so I was going to do it with files. According to my knowledge, file creation is an atomic operation, so the file-based lock should work. I need to think through all the conditions
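A minimal sketch of the idea, using the local filesystem's atomic create as a stand-in for the HDFS equivalent (FileSystem.createNewFile on a lock path): whichever process creates the file first holds the lock. The path and naming are illustrative, and stale-lock handling, one of the conditions to think through, is deliberately left out:

```java
import java.io.File;
import java.io.IOException;

// File-based lock sketch: createNewFile is atomic, so whichever
// process creates the lock file first owns the lock.
public class FileLock {
    private final File lockFile;

    public FileLock(String path) {
        this.lockFile = new File(path);
    }

    // One non-blocking attempt: true iff we created the lock file.
    public boolean tryAcquire() {
        try {
            return lockFile.createNewFile();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Release the lock by deleting the file.
    public void release() {
        lockFile.delete();
    }

    public static void main(String[] args) {
        FileLock lock = new FileLock(System.getProperty("java.io.tmpdir") + "/demo.lock");
        lock.release();                        // clear any stale lock from a prior run
        System.out.println(lock.tryAcquire()); // true: we own the lock
        System.out.println(lock.tryAcquire()); // false: already held
        lock.release();
    }
}
```

On HDFS the same pattern would hold as long as the create call keeps its atomic semantics; recovering a lock left behind by a crashed holder is the hard part the message alludes to.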

Namenode freeze

2009-01-14 Thread Sagar Naik
Hi, A datanode goes down, and then it looks like the ReplicationMonitor tries to even out the replication. However, while doing so, it holds the lock on FSNamesystem. With this lock held, other threads wait on this lock to respond. As a result, the namenode does not list the dirs / the Web-UI does not respond. I

Re: 0.18.1 datanode psuedo deadlock problem

2009-01-10 Thread Sagar Naik
Hi Raghu, The periodic du and block report threads thrash the disk (a block report takes about 21 minutes on average), and I think all the datanode threads are not able to do much and freeze. org.apache.hadoop.dfs.datanode$dataxcei...@f2127a daemon prio=10 tid=0x41f06000 nid=0x7c7c waiting for

Re: cannot allocate memory error

2008-12-31 Thread Sagar Naik
In {HADOOP_HOME}/conf/hadoop-env.sh, export HADOOP_HEAPSIZE. The default is 1000 MB, so I think that there could be another issue. -Sagar sagar arlekar wrote: Hello, I am new to hadoop. I am running hadoop 0.17 in a Eucalyptus cloud instance (it's a centos image on xen). bin/hadoop dfs -ls / gives the

Re: Threads per mapreduce job

2008-12-27 Thread Sagar Naik
mapred.map.multithreadedrunner.threads is the property you are looking for. Michael wrote: Hi everyone: How do I control the number of threads per mapreduce job? I am using bin/hadoop jar wordcount to run jobs, and even though I have found these settings in hadoop-default.xml and changed the values

Re: Failed to start TaskTracker server

2008-12-19 Thread Sagar Naik
Well, you have some process which grabs this port, so Hadoop is not able to bind to it. By the time you check, there is a chance that the socket connection has died but the port was occupied when the hadoop process was attempting to bind. Check all the processes running on the system. Do any of the processes acquire
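A quick plain-Java probe helps with this kind of diagnosis: try to bind the port yourself and see whether the bind succeeds. The port number is just the tasktracker's HTTP port mentioned in this thread:

```java
import java.io.IOException;
import java.net.ServerSocket;

// Try to bind a TCP port to see whether another process holds it.
public class PortProbe {
    public static boolean isFree(int port) {
        try (ServerSocket s = new ServerSocket(port)) {
            return true;   // bind succeeded, the port was free
        } catch (IOException e) {
            return false;  // bind failed, something holds the port
        }
    }

    public static void main(String[] args) {
        System.out.println(isFree(50060)); // tasktracker's default HTTP port
    }
}
```

Running this right before starting the tasktracker shows whether the port is genuinely free at that moment, which is the race the message describes.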

Re: Failed to start TaskTracker server

2008-12-19 Thread Sagar Naik
have no permission to change or modify other users' programs or settings. Is there any way to change 50060 to another port? Sagar Naik wrote: Well, you have some process which grabs this port, so Hadoop is not able to bind to it. By the time you check, there is a chance that the socket connection has

.18.1 jobtracker deadlock

2008-12-17 Thread Sagar Naik
Hi, Found one Java-level deadlock: SocketListener0-7: waiting to lock monitor 0x0845e1fc (object 0x54f95838, a org.apache.hadoop.mapred.JobTracker), which is held by IPC Server handler 0 on 54311. IPC Server handler 0 on 54311: waiting to lock monitor 0x4d671064

DiskUsage ('du -sk') probably hangs Datanode

2008-12-17 Thread Sagar Naik
I see createBlockException and Abandoning block quite often. When I check the datanodes, they are running, and I can browse the file system from that datanode:50075. However, I also notice that a du is forked off from the DN. This 'du' runs anywhere from 6 minutes to 30 minutes. During this time no logs are

Re: DiskUsage ('du -sk') probably hangs Datanode

2008-12-17 Thread Sagar Naik
if there's an issue. I can't say for sure, but the 'du' is probably not hanging the Datanode; it's probably a symptom of larger problems. Thanks Brian, I will start SMART tests. Please tell me what direction I should look in, in case of larger problems. Brian On Dec 17, 2008, at 8:29 PM, Sagar Naik

Re: occasional createBlockException in Hadoop .18.1

2008-12-15 Thread Sagar Naik
to datanode:50010. I think the disk is bad or something. Please suggest some pointers to analyze this problem. -Sagar Sagar Naik wrote: CLIENT EXCEPTION: 2008-12-14 08:41:46,919 [Thread-90] INFO org.apache.hadoop.dfs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException

Re: Q about storage architecture

2008-12-06 Thread Sagar Naik
http://hadoop.apache.org/core/docs/r0.18.2/hdfs_design.html Sirisha Akkala wrote: Hi I would like to know if Hadoop architecture more resembles SAN or NAS? -I'm guessing it is NAS. Or does it fall under a totally different category? If so, can you please email brief information?

Re: getting Configuration object in mapper

2008-12-05 Thread Sagar Naik
check : mapred.task.is.map Craig Macdonald wrote: I have a related question - I have a class which is both mapper and reducer. How can I tell in configure() if the current task is map or a reduce task? Parse the taskid? C Owen O'Malley wrote: On Dec 4, 2008, at 9:19 PM, abhinit wrote: I

Re: Bad connection to FS. command aborted.

2008-12-04 Thread Sagar Naik
Check your conf in the classpath. Check if the Namenode is running. You are not able to connect to the intended Namenode. -Sagar elangovan anbalahan wrote: im getting this error message when i am doing *bash-3.2$ bin/hadoop dfs -put urls urls* please let me know the resolution, i have a project

Re: Bad connection to FS. command aborted.

2008-12-04 Thread Sagar Naik
Which hadoop version? Command: bin/hadoop version -Sagar elangovan anbalahan wrote: i tried that but nothing happened bash-3.2$ bin/hadoop dfs -put urll urll put: java.io.IOException: failed to create file /user/nutch/urll/.urls.crc on client 192.168.1.6 because target-length is 0, below

Re: Hadoop datanode crashed - SIGBUS

2008-12-01 Thread Sagar Naik
*top's top* Cpu(s): 0.1% us, 1.1% sy, 0.0% ni, 98.0% id, 0.8% wa, 0.0% hi, 0.0% si Mem: 8288280k total, 1575680k used, 6712600k free, 5392k buffers Swap: 16386292k total, 68k used, 16386224k free, 522408k cached 8 core , xeon 2GHz Brian On Dec 1, 2008, at 3:00 PM, Sagar

Re: Hadoop datanode crashed - SIGBUS

2008-12-01 Thread Sagar Naik
crashed or were set up wrong, and died fatally enough to take out the JVM. Are you using any compression? Does your job complete successfully in local mode, if the crash correlates well with a job running? Brian On Dec 1, 2008, at 3:32 PM, Sagar Naik wrote: Brian Bockelman wrote

Re: Hadoop datanode crashed - SIGBUS

2008-12-01 Thread Sagar Naik
PM, Sagar Naik wrote: Brian Bockelman wrote: Hardware/memory problems? I'm not sure. SIGBUS is relatively rare; it sometimes indicates a hardware error in the memory system, depending on your arch. *uname -a : * Linux hdimg53 2.6.15-1.2054_FC5smp #1 SMP Tue Mar 14 16:05:46 EST 2006 i686

Re: Namenode BlocksMap on Disk

2008-11-26 Thread Sagar Naik
We can also try to mount the particular dir on ramfs and reduce the performance degradation. -Sagar Billy Pearson wrote: I would like to see something like this also. I run 32-bit servers, so I am limited on how much memory I can use for heap. Besides just storing to disk, I would like to see some

64 bit namenode and secondary namenode 32 bit datanode

2008-11-25 Thread Sagar Naik
I am trying to migrate from a 32-bit JVM to a 64-bit JVM for the namenode only. *setup* NN - 64 bit; Secondary namenode (instance 1) - 64 bit; Secondary namenode (instance 2) - 32 bit; datanode - 32 bit. From the mailing list I deduced that NN 64-bit and datanode 32-bit

Re: 64 bit namenode and secondary namenode 32 bit datanode

2008-11-25 Thread Sagar Naik
file. I am not aware of image corruption (did not take a look into it). I did it for SNN redundancy. Please correct me if I am wrong. Thanks Sagar Wondering if there are chances of image corruption. Thanks, lohit - Original Message From: Sagar Naik [EMAIL PROTECTED] To: core-user@hadoop.apache.org

Re: Exception in thread main org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:

2008-11-24 Thread Sagar Naik
Include the ${HADOOP}/conf/ dir in the classpath of the java program. Alternatively, you can also try: bin/hadoop jar your_jar main_class args -Sagar Saju K K wrote: This is in reference to the sample application in the JAVAWord

Re: Hadoop 18.1 ls stalls

2008-11-20 Thread Sagar Naik
Angadi wrote: Sagar Naik wrote: Thanks Raghu, *datapoints:* - So when I use FSShell client, it gets into retry mode for getFilesInfo() call and takes a long time. What does retry mode mean? - Also, when do a ls operation, it takes secs(4/5) . - 1.6 million files and namenode is mostly full

Re: Hadoop Installation

2008-11-19 Thread Sagar Naik
Mithila Nagendra wrote: Hello I'm currently a student at Arizona State University, Tempe, Arizona, pursuing my masters in Computer Science. I'm currently involved in a research project that makes use of Hadoop to run various map reduce functions. Hence I searched the web on what's the best way to

Re: Recovering NN failure when the SNN data is on another server

2008-11-16 Thread Sagar Naik
Take a backup of your dfs.name.dir (on both the namenode and secondary namenode). If the secondary namenode is not running on the same machine as the namenode, copy over the fs.checkpoint.dir from the secondary onto the namenode. Start only the namenode. The importCheckpoint fails for a valid NN image. If you want to

Re: Recovering NN failure when the SNN data is on another server

2008-11-16 Thread Sagar Naik
(dfs.name.dir has been backed up) - start only the namenode with -importCheckpoint - For additional info: https://issues.apache.org/jira/browse/HADOOP-2585?focusedCommentId=12558173#action_12558173 -Sagar Sagar Naik wrote: Take a backup of your dfs.name.dir (on both the namenode and secondary namenode

Re: Recovery of files in hadoop 18

2008-11-14 Thread Sagar Naik
since the last checkpoint. Hope that helps, Lohit - Original Message From: Sagar Naik [EMAIL PROTECTED] To: core-user@hadoop.apache.org Sent: Friday, November 14, 2008 10:38:45 AM Subject: Recovery of files in hadoop 18 Hi, I accidentally deleted the root folder in our hdfs. I have

Re: Recovery of files in hadoop 18

2008-11-14 Thread Sagar Naik
, bring namenode out of safemode. I hope you had started this namenode with old image and empty edits. You do not want your latest edits to be replayed, which has your delete transactions. Thanks, Lohit - Original Message From: Sagar Naik [EMAIL PROTECTED] To: core-user@hadoop.apache.org

Re: HDFS from non-hadoop Program

2008-11-07 Thread Sagar Naik
Can you make sure the files in the hadoop conf dir are in the classpath of the java program? -Sagar Wasim Bari wrote: Hello, I am trying to access HDFS from a non-hadoop program using java. When I try to get the Configuration file, it shows an exception both in DEBUG mode and normal one:

Missing blocks from bin/hadoop text but fsck is all right

2008-11-04 Thread Sagar Naik
Hi, We have a strange problem getting out some of our files. bin/hadoop dfs -text dir/* gives me missing block exceptions. 08/11/04 10:45:09 [main] INFO dfs.DFSClient: Could not obtain block blk_6488385702283300787_1247408 from any node: java.io.IOException: No live nodes contain current

Re: Missing blocks from bin/hadoop text but fsck is all right

2008-11-04 Thread Sagar Naik
Hi, We were hitting file descriptor limits :). Increased the limit and the problem got solved. Thanks Jason -Sagar Sagar Naik wrote: Hi, We have a strange problem getting out some of our files. bin/hadoop dfs -text dir/* gives me missing block exceptions. 08/11/04 10:45:09 [main] INFO dfs.DFSClient

Re: namenode failure

2008-10-30 Thread Sagar Naik
Please check your classpath entries. It looks like the hadoop-core jars before you shut down the cluster and after you changed hadoop-env.sh are different. -Sagar Songting Chen wrote: Hi, I modified the classpath in hadoop-env.sh on the namenode and datanodes before shutting down the cluster. Then the problem

Hadoop .16 : Task failures

2008-10-10 Thread Sagar Naik
Hi, We are using Hadoop 0.16 and on our heavy-IO job we are seeing a lot of these exceptions. We are seeing a lot of task failures, more than 50% :(. There are two reasons from the log: a) Task task_200810092310_0003_m_20_0 failed to report status for 600 seconds. Killing! - b)

Re: Getting started questions

2008-09-08 Thread Sagar Naik
Dennis Kubes wrote: John Howland wrote: I've been reading up on Hadoop for a while now and I'm excited that I'm finally getting my feet wet with the examples + my own variations. If anyone could answer any of the following questions, I'd greatly appreciate it. 1. I'm processing document

Re: Aborting Map Function

2008-04-16 Thread Sagar Naik
Chaman Singh Verma wrote: Hello, I am developing one application with MapReduce, and in it, whenever some MapTask condition is met, I would like to broadcast to all other MapTasks to abort their work. I am not quite sure whether such broadcasting functionality currently exists in Hadoop