hadoop 0.19.0 and data node failure

2009-01-15 Thread Kumar Pandey
To test Hadoop's fault tolerance I tried the following setup: node A - name node and secondary name node; node B - datanode; node C - datanode; replication set to 2. When A, B and C are running I'm able to make a round trip for a wav file. Now, to test fault tolerance, I brought node B down and tried to write a f

Re: hadoop job -history

2009-01-15 Thread Amareshwari Sriramadasu
<jobOutputDir> is the location specified by the configuration property "hadoop.job.history.user.location". If you don't specify anything for the property, the job history logs will be created in the job's output directory. So, to view your history, give your jobOutputDir if you haven't specified any location. Hop

RE: Locks in hadoop

2009-01-15 Thread Koji Noguchi
> I was going to do it with files. According to my knowledge, > file creation is an atomic operation. You might want to look at the HDFS balancer. It uses an HDFS file to make sure only one instance of the balancer is up. Koji -Original Message- From: Sagar Naik [mailto:sn...@attributor.com] Se

Re: Locks in hadoop

2009-01-15 Thread Konstantin Shvachko
Did you look at ZooKeeper? Thanks, --Konstantin Sagar Naik wrote: I would like to implement a locking mechanism across the HDFS cluster. I assume there is no inherent support for it. I was going to do it with files. According to my knowledge, file creation is an atomic operation. So the file-ba

Locks in hadoop

2009-01-15 Thread Sagar Naik
I would like to implement a locking mechanism across the HDFS cluster. I assume there is no inherent support for it. I was going to do it with files. According to my knowledge, file creation is an atomic operation. So the file-based lock should work. I need to think through with all conditions bu
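
The file-based lock idea in this thread can be sketched roughly as follows. This is only an illustration on a local filesystem using Java's java.nio.file API, not HDFS itself; the class name FileLockSketch and the helpers tryLock and unlock are hypothetical. It assumes that "create the file only if it does not already exist" is a single atomic step, which is what the thread relies on. It does not deal with stale lock files left behind by a crashed holder.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: a lock is "held" while the lock file exists.
final class FileLockSketch {

    // Try to acquire the lock by creating the file.
    // Files.createFile checks for existence and creates in one atomic step,
    // so at most one caller can succeed for a given path.
    static boolean tryLock(Path lockFile) throws IOException {
        try {
            Files.createFile(lockFile);
            return true;               // we created it, so we hold the lock
        } catch (FileAlreadyExistsException e) {
            return false;              // someone else holds the lock
        }
    }

    // Release the lock by deleting the file (no-op if it is already gone).
    static void unlock(Path lockFile) throws IOException {
        Files.deleteIfExists(lockFile);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lock-sketch");
        Path lock = dir.resolve("balancer.lock"); // illustrative name only

        System.out.println(tryLock(lock)); // true: first caller acquires
        System.out.println(tryLock(lock)); // false: lock already held
        unlock(lock);
        System.out.println(tryLock(lock)); // true: free again after unlock

        // Clean up the temporary files.
        unlock(lock);
        Files.deleteIfExists(dir);
    }
}
```

On HDFS the same pattern would presumably depend on a create call that refuses to overwrite an existing file; that mapping is an assumption here, and a coordination service such as ZooKeeper, suggested in the replies, avoids the stale-lock problem.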

Re: Hadoop 0.17.1 => EOFException reading FSEdits file, what causes this? how to prevent?

2009-01-15 Thread Konstantin Shvachko
Joe, It looks like your edits file is corrupted or truncated. Most probably the last modification was not written to it when the name-node was turned off. This may happen if the node crashes, depending on the underlying local file system, I guess. Here are some options for you to consider: - try a

Re: Hadoop 0.17.1 => EOFException reading FSEdits file, what causes this? how to prevent?

2009-01-15 Thread Aaron Kimball
Does the file exist or maybe was it deleted? Also, are the permissions on that directory set correctly, or could they have been changed out from under you by accident? - Aaron On Tue, Jan 13, 2009 at 9:53 AM, Joe Montanez wrote: > Hi: > > > > I'm using Hadoop 0.17.1 and I'm encountering EOFExce

Cascading 1.0.0 Released

2009-01-15 Thread Chris K Wensel
Hi all Just a quick note to let everyone know that Cascading 1.0.0 is out. http://www.cascading.org/ Cascading is an API for defining and executing data processing flows without needing to think in MapReduce. This release supports only Hadoop 0.19.x. Minor releases will be available to tra

Re: Indexed Hashtables

2009-01-15 Thread Jim Twensky
Delip, Why do you think HBase would be overkill? I do something similar to what you're trying to do with HBase and I haven't encountered any significant problems so far. Can you give some more info on the size of the data you have? Jim On Wed, Jan 14, 2009 at 8:47 PM, Delip Rao wrote: > Hi,

Re: Indexed Hashtables

2009-01-15 Thread pr-hadoop
Delip, what about Hadoop MapFile? http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/io/MapFile.html Regards, Peter

hadoop job -history

2009-01-15 Thread Bill Au
I am having trouble getting the hadoop command "job -history" to work. What am I supposed to use for <jobOutputDir>? I can see the job history from the JobTracker web UI. I tried specifying the history directory on the JobTracker but it didn't work: $ hadoop job -history logs/history/ Exception in thread "main

Re: RAID vs. JBOD

2009-01-15 Thread Runping Qi
Yes, all the machines in the tests are new, with the same spec. The 30% to 50% throughput variations of the disks were observed on the disks of the same machines. Runping On 1/15/09 2:41 AM, "Steve Loughran" wrote: > Runping Qi wrote: >> Hi, >> >> We at Yahoo did some Hadoop benchmarking ex

Re: Is it possible to submit job to JobClient and exit immediately?

2009-01-15 Thread Amar Kamat
Andrew wrote: For now, I use such code blocks in all my MR jobs: try { JobClient.runJob(job); JobClient jc = new JobClient(job); jc.submitJob(job); /* submits a job and comes out */ } catch (IOException exc) { LOG.info("Job failed", exc); } System.exit(0); But this cod

Is it possible to submit job to JobClient and exit immediately?

2009-01-15 Thread Andrew
For now, I use such code blocks in all my MR jobs: try { JobClient.runJob(job); } catch (IOException exc) { LOG.info("Job failed", exc); } System.exit(0); But this code waits for the MR job to complete. Thus, I have to run it on a machine that is always connected to the JobTracker.

Re: Indexed Hashtables

2009-01-15 Thread Steve Loughran
Sean Shanny wrote: Delip, So far we have had pretty good luck with memcached. We are building a Hadoop-based solution for data warehouse ETL on XML-based log files that represent click stream data on steroids. We process about 34 million records or about 70 GB of data a day. We have to proce

Re: RAID vs. JBOD

2009-01-15 Thread Steve Loughran
Runping Qi wrote: Hi, We at Yahoo did some Hadoop benchmarking experiments on clusters with JBOD and RAID0. We found that under heavy loads (such as gridmix), the JBOD cluster performed better. Gridmix tests: Load: gridmix2; Cluster size: 190 nodes; Test results: RAID0: 75 minutes, JBOD: 67 minutes