namenode replication

2008-06-30 Thread Vibhooti Verma
I have set my property as follows: <property> <name>dfs.name.dir</name> <value>/apollo/env/TVHadoopCluster/var/tmp/hadoop/dfs/name,/local/namenode</value> <description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list

Data-local tasks

2008-06-30 Thread Saptarshi Guha
Hello, I recall asking this question, but this is in addition to what I've asked. Firstly, to recap my question and Arun's specific response: -- On May 20, 2008, at 9:03 AM, Saptarshi Guha wrote: Hello, -- Does the Data-local map tasks counter mean the number of tasks that

reducers hanging problem

2008-06-30 Thread Andreas Kostyrka
Hi! I'm running streaming tasks on hadoop 0.17.0, and wondered if anyone has an approach to debugging the following situation: -) maps have all finished (100% in http display), -) some reducers are hanging, with the messages below. Notice that the task had 100 map tasks at all, so 58 seems

Re: RecordReader Functionality

2008-06-30 Thread Jorgen Johnson
Hi Sean, Perhaps I'm missing something, but it doesn't appear to me that you're actually seeking to the filesplit start position in your constructor... This would explain why all the mappers are getting the same records. -jorgenj On Mon, Jun 30, 2008 at 9:22 AM, Sean Arietta [EMAIL PROTECTED]

Re: reducers hanging problem

2008-06-30 Thread Andreas Kostyrka
Another observation, the TaskTracker$Child was alive, and the reduce script has hung on read(0, ) :( Andreas

Re: joins in map reduce

2008-06-30 Thread Jason Venner
I have just started to try using the Join operators. The join I am trying is this; join is outer(tbl(org.apache.hadoop.mapred.SequenceFileInputFormat,Input1),tbl(org.apache.hadoop.mapred.SequenceFileInputFormat,IndexedTry1)) but I get an error 08/06/30 08:55:13 INFO mapred.FileInputFormat:

RE: RecordReader Functionality

2008-06-30 Thread Runping Qi
Your record reader must be able to find the beginning of the next record beyond the start position of a given split. Your file format must enable your record reader to detect the beginning of the next record beyond the start pos of a split. It seems to me that is not possible based on the info I
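The requirement described in this reply can be illustrated with a minimal sketch. It assumes newline-delimited records held in an in-memory byte array; `SplitRecordScanner` and its methods are hypothetical names used only for illustration and are not part of the Hadoop API or of the poster's code. A reader for a split that does not start at offset 0 backs up one byte and skips forward to the next record boundary, and a record that begins before the split end is read in full even if it runs past that end.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Hadoop class): shows the split-boundary rule
// on an in-memory byte array of newline-delimited records.
class SplitRecordScanner {

    // Offset of the first record that belongs to a split starting at splitStart.
    // A split at offset 0 starts on a record. Any other split backs up one byte
    // and scans to the next newline, so a partial record is left to the reader
    // of the previous split.
    static int firstRecordStart(byte[] data, int splitStart) {
        if (splitStart == 0) {
            return 0;
        }
        int pos = splitStart - 1;
        while (pos < data.length && data[pos] != '\n') {
            pos++;
        }
        return Math.min(pos + 1, data.length);
    }

    // All records that begin inside [splitStart, splitEnd). The last record
    // may extend past splitEnd; it is still read completely.
    static List<String> recordsInSplit(byte[] data, int splitStart, int splitEnd) {
        List<String> records = new ArrayList<>();
        int pos = firstRecordStart(data, splitStart);
        while (pos < splitEnd && pos < data.length) {
            int end = pos;
            while (end < data.length && data[end] != '\n') {
                end++;
            }
            records.add(new String(data, pos, end - pos, StandardCharsets.US_ASCII));
            pos = end + 1; // step over the newline
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbravo\ncharlie\n".getBytes(StandardCharsets.US_ASCII);
        // Two splits that cut the second record in the middle.
        System.out.println(recordsInSplit(data, 0, 8));
        System.out.println(recordsInSplit(data, 8, data.length));
    }
}
```

With this rule every record is returned by exactly one split, which is what goes wrong when every reader starts from the beginning of the file instead of from its own split start. A format without a detectable record delimiter cannot use this approach.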

Parameterized InputFormats

2008-06-30 Thread Nathan Marz
Hello, Are there any plans to change the JobConf API so that it takes an instance of an InputFormat rather than the InputFormat class? I am finding the inability to properly parameterize my InputFormats to be very restricting. What's the reasoning behind having the class as a parameter

problem when many map tasks are used (since 0.17.1 was installed)

2008-06-30 Thread Ashish Venugopal
The crash below occurs when I run many (-jobconf mapred.map.tasks=200) mappers. It does not occur if I set mapred.map.tasks=1 even when I allocated many machines (causing there to be many mappers). But when I set number of map.tasks to 200 the error below happens. This just started happening

RE: Hadoop - is it good for me and performance question

2008-06-30 Thread Haijun Cao
http://www.mail-archive.com/core-user@hadoop.apache.org/msg02906.html -Original Message- From: yair gotdanker [mailto:[EMAIL PROTECTED] Sent: Sunday, June 29, 2008 4:46 AM To: core-user@hadoop.apache.org Subject: Hadoop - is it good for me and performance question Hello all, I am

Re: Too many fetch failures AND Shuffle error

2008-06-30 Thread Tarandeep Singh
I am getting this error as well. As Sayali mentioned in his mail, I updated the /etc/hosts file with the slave machines IP addresses, but I am still getting this error. Amar, which is the url that you were talking about in your mail - There will be a URL associated with a map that the reducer try

Summit / Camp Hadoop at ApacheCon

2008-06-30 Thread Ajay Anand
We are planning to host a mini-summit (aka Camp Hadoop) in conjunction with ApacheCon this year - Nov 6th and 7th - in New Orleans. We are working on putting together the agenda for this now, and would love to hear from you if you have suggestions for talks or panel discussions that we could

Re: Summit / Camp Hadoop at ApacheCon

2008-06-30 Thread Ted Dunning
I would love to help, especially on the Mahout side of things. What would you like to have? On Mon, Jun 30, 2008 at 2:53 PM, Ajay Anand [EMAIL PROTECTED] wrote: We are planning to host a mini-summit (aka Camp Hadoop) in conjunction with ApacheCon this year - Nov 6th and 7th - in New Orleans.

RE: Summit / Camp Hadoop at ApacheCon

2008-06-30 Thread Ajay Anand
At this point I am looking for proposals for talks or topics for panel discussions - similar to the Summit we did a few months ago. The idea would be to share with the community progress that's being made with Hadoop related projects or discuss interesting applications / deployments using Hadoop.

Re: reducers hanging problem

2008-06-30 Thread Andreas Kostyrka
On Monday 30 June 2008 18:38:28 Runping Qi wrote: Looks like the reducer stuck at shuffling phase. What is the progression percentage do you see for the reducer from web GUI? It is known that 0.17 does not handle shuffling well. I think it has been 87% (meaning that 19 of 22 reducer tasks

Using S3 Block FileSystem as HDFS replacement

2008-06-30 Thread slitz
Hello, I've been trying to set up Hadoop to use S3 as its filesystem. I read in the wiki that it's possible to choose either the S3 native FileSystem or the S3 Block FileSystem. I would like to use the S3 Block FileSystem to avoid the task of manually transferring data from S3 to HDFS every time I want to run a

Re: Test Hadoop performance on EC2

2008-06-30 Thread 王志祥
Sorry for the previous post. I haven't finished. Please skip it. Hi all, I've made some experiments on Hadoop on Amazon EC2. I would like to share the result and any feedback would be appreciated. Environment: -Xen VM (Amazon EC2 instance ami-ee53b687) -1.7Ghz Xeon CPU, 1.75GB of RAM, 160GB of

Re: Too many fetch failures AND Shuffle error

2008-06-30 Thread Amar Kamat
Tarandeep Singh wrote: I am getting this error as well. As Sayali mentioned in his mail, I updated the /etc/hosts file with the slave machines IP addresses, but I am still getting this error. Amar, which is the url that you were talking about in your mail - There will be a URL associated with a

Should there be a way not maintaining the whole namespace structure in memory?

2008-06-30 Thread heyongqiang
In the current HDFS implementation, all INodeFile and INodeDirectory objects are loaded into memory; this is done when the FSNameSpacs structure is set up at namenode startup. The namenode will analyze the fsimage file and edit log file. And if there are millions of files or directories how

Re: How to configure RandomWriter to generate less amount of data

2008-06-30 Thread Amar Kamat
Heshan Lin wrote: Hi, I'm trying to configure RandomWriter to generate less data than does the default configuration. bin/hadoop jar hadoop-*-examples.jar randomwriter -Dtest.randomwrite.bytes_per_map=value -Dtest.randomwrite.total_bytes=value -Dtest.randomwriter.maps_per_host=value