Re: adding more datanode

2008-10-21 Thread David Wei
On my cluster the behavior is quite different from what you describe:
performing the format command does not destroy the file system, and if I do
not perform the format command on a datanode, the datanode is not able to
find the master.
Hopefully somebody can tell me the reason.


Konstantin Shvachko wrote:
> You just start the new data-node as the cluster is running using
> bin/hadoop datanode
> The configuration on the new data-node should be the same as on other nodes.
> The data-node should join the cluster automatically.
> Formatting will destroy your file system.
>
> --Konstantin
>
> David Wei wrote:
>   
>> Well, in my cluster, I do this:
>> 1. Add the new machines to conf/slaves on the master machine
>> 2. On the new nodes, run the format command
>> 3. Back on the master, run start-all.sh
>> 4. Run start-balancer.sh, still on the master
>>
>> Then I got the new nodes inside my cluster, with no need to reboot the
>> whole system.
>>
>> Hopefully this will help. ;=)
>>
>>
>> Ski Gh3 wrote:
>> 
>>> I'm not sure I get this.
>>> 1. If you format the filesystem (which I thought is usually executed
>>> on the master node, but anyway)
>>> don't you erase all your data?
>>> 2. I guess I need to add the new machine to the conf/slaves file,
>>> but then do I run start-all.sh again from the master node while my
>>> cluster is already running?
>>>
>>> Thanks!
>>> On Mon, Oct 20, 2008 at 5:59 PM, David Wei <[EMAIL PROTECTED]> wrote:
>>>
>>> this is quite easy. You can just configure your new datanodes like the
>>> others and format the filesystem before you start them.
>>> Remember to make them ssh-able from your master and run
>>> ./bin/start-all.sh on the master machine if you want to start all
>>> the daemons. This will start and add the new datanodes to the
>>> up-and-running cluster.
>>>
>>> hopefully my info will help.
>>>
>>>
>>> Ski Gh3 wrote:
>>>
>>> hi,
>>>
>>> I am wondering how to add more datanodes to an up-and-running
>>> hadoop
>>> instance?
>>> Couldn't find instructions on this from the wiki page.
>>>
>>> Thanks!
>>>
>>>
>>>
>>>
>>>
>>>   
>>
>> 
>
>   




Re: A Scale-Out RDF Store for Distributed Processing on Map/Reduce

2008-10-21 Thread Edward J. Yoon
Thanks for all the feedback and stories.
I also got a lot of insightful feedback via private mail. WOW!!

OK. I hope to hear from you again next time!

/Edward

On Tue, Oct 21, 2008 at 10:36 PM, Edward J. Yoon <[EMAIL PROTECTED]> wrote:
> Oh, sorry for our mistake: "It will be one of the Apache Incubator
> Projects" should be "It will be proposed to the Apache Incubator
> Project".
>
> Thanks. :)
>
> On Tue, Oct 21, 2008 at 10:02 AM, Edward J. Yoon <[EMAIL PROTECTED]> wrote:
>> Hi all,
>>
>> This RDF proposal is from a good while ago. Now we'd like to settle
>> down to research it again. I have attached our proposal; we'd love to hear
>> your feedback & stories!!
>>
>> Thanks.
>> --
>> Best regards, Edward J. Yoon
>> [EMAIL PROTECTED]
>> http://blog.udanax.org
>>
>
>
>
> --
> Best regards, Edward J. Yoon
> [EMAIL PROTECTED]
> http://blog.udanax.org
>



-- 
Best regards, Edward J. Yoon
[EMAIL PROTECTED]
http://blog.udanax.org


Re: adding more datanode

2008-10-21 Thread Konstantin Shvachko
You just start the new data-node as the cluster is running using
bin/hadoop datanode
The configuration on the new data-node should be the same as on other nodes.
The data-node should join the cluster automatically.
Formatting will destroy your file system.
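
For example, a minimal sketch of that sequence (assuming the stock scripts under the Hadoop install directory; adjust paths to your layout):

# on the new node, with conf/ copied from an existing node (do NOT format anything)
bin/hadoop datanode
# or, to run it in the background as a daemon:
bin/hadoop-daemon.sh start datanode
# if the node should also run map/reduce tasks:
bin/hadoop-daemon.sh start tasktracker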

--Konstantin

David Wei wrote:
> Well, in my cluster, I do this:
> 1. Add the new machines to conf/slaves on the master machine
> 2. On the new nodes, run the format command
> 3. Back on the master, run start-all.sh
> 4. Run start-balancer.sh, still on the master
> 
> Then I got the new nodes inside my cluster, with no need to reboot the
> whole system.
> 
> Hopefully this will help. ;=)
> 
> 
> Ski Gh3 wrote:
>> I'm not sure I get this.
>> 1. If you format the filesystem (which I thought is usually executed
>> on the master node, but anyway)
>> don't you erase all your data?
>> 2. I guess I need to add the new machine to the conf/slaves file,
>> but then do I run start-all.sh again from the master node while my
>> cluster is already running?
>>
>> Thanks!
>> On Mon, Oct 20, 2008 at 5:59 PM, David Wei <[EMAIL PROTECTED]> wrote:
>>
>> this is quite easy. You can just configure your new datanodes like the
>> others and format the filesystem before you start them.
>> Remember to make them ssh-able from your master and run
>> ./bin/start-all.sh on the master machine if you want to start all
>> the daemons. This will start and add the new datanodes to the
>> up-and-running cluster.
>>
>> hopefully my info will help.
>>
>>
>> Ski Gh3 wrote:
>>
>> hi,
>>
>> I am wondering how to add more datanodes to an up-and-running
>> hadoop
>> instance?
>> Couldn't find instructions on this from the wiki page.
>>
>> Thanks!
>>
>>
>>
>>
>>
> 
> 
> 


Re: Too many open files IOException while running hadoop mapper phase

2008-10-21 Thread 晋光峰
Can you give me a detailed explanation of how to deal with this issue?
I haven't found the related messages in the archives of this list.

Regards
Guangfeng

On Tue, Oct 21, 2008 at 6:19 PM, Karl Anderson <[EMAIL PROTECTED]> wrote:

>
> On 20-Oct-08, at 11:59 PM, 晋光峰 wrote:
>
>  Dear all,
>>
>> I use Hadoop 0.18.0 [...]
>>
>
> 0.18.0 is not a stable release.  Try 0.17.x or 0.18.1.  I had a similar
> problem which was solved by switching.  There are suggestions on the web and
> archives of this list for dealing with the issue, but try switching versions
> first.
>
>
> Karl Anderson
> [EMAIL PROTECTED]
> http://monkey.org/~kra 
>
>
>
>


RE: Hadoop Camp next month

2008-10-21 Thread Jim Kellerman (POWERSET)


---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)


> -Original Message-
> From: Milind Bhandarkar [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, October 21, 2008 3:37 PM
> To: core-user@hadoop.apache.org; [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; pig-
> [EMAIL PROTECTED]
> Subject: Re: Hadoop Camp next month
>
> Hi,
>
> Just received an email from the ApacheCon US organizers that they are giving
> a 50% discount for Hadoop Training
> (http://us.apachecon.com/c/acus2008/sessions/93).
>
> 
> >We just created a discount code for you to give people 50% off of the
> >cost of the training. The code is <  Hadoop50  >  The discount would
> >apply only to the Training.
> 
>
> Hope to see you there!
>
> - Milind
>
>
> On 10/2/08 9:03 AM, "Owen O'Malley" <[EMAIL PROTECTED]> wrote:
>
> > Hi all,
> >I'd like to remind everyone that the Hadoop Camp & ApacheCon US is
> > coming up in New Orleans next month. http://tinyurl.com/hadoop-camp
> >
> > It will be the largest gathering of Hadoop developers outside of
> > California. We'll have:
> >
> > Core: Doug Cutting, Dhruba Borthakur, Arun Murthy, Owen O'Malley,
> > Sameer Paranjpye,
> > Sanjay Radia, Tom White
> > Zookeeper: Ben Reed
> >
> > There will also be a training session on Practical Problem Solving
> > with Hadoop by Milind Bhandarkar on Monday.
> >
> > So if you'd like to meet the developers or find out more about Hadoop,
> > come join us!
> >
> > -- Owen
>
>
> --
> Milind Bhandarkar
> Y!IM: GridSolutions
> 408-349-2136
> ([EMAIL PROTECTED])



Re: adding more datanode

2008-10-21 Thread David Wei
Well, in my cluster, I do this:
1. Add the new machines to conf/slaves on the master machine
2. On the new nodes, run the format command
3. Back on the master, run start-all.sh
4. Run start-balancer.sh, still on the master

Then I got the new nodes inside my cluster, with no need to reboot the
whole system.
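
Concretely, the commands I run look roughly like this (hostnames and paths are placeholders; note that step 2 is my own recipe, and other posts in this thread warn that formatting can destroy an existing file system):

# step 1, on the master: add the new machine to the slaves file
echo "new-node-01" >> conf/slaves
# step 2, on each new node (my recipe only)
bin/hadoop namenode -format
# steps 3 and 4, back on the master
bin/start-all.sh
bin/start-balancer.sh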

Hopefully this will help. ;=)


Ski Gh3 wrote:
> I'm not sure I get this.
> 1. If you format the filesystem (which I thought is usually executed
> on the master node, but anyway)
> don't you erase all your data?
> 2. I guess I need to add the new machine to the conf/slaves file,
> but then do I run start-all.sh again from the master node while my
> cluster is already running?
>
> Thanks!
> On Mon, Oct 20, 2008 at 5:59 PM, David Wei <[EMAIL PROTECTED]> wrote:
>
> this is quite easy. You can just configure your new datanodes like the
> others and format the filesystem before you start them.
> Remember to make them ssh-able from your master and run
> ./bin/start-all.sh on the master machine if you want to start all
> the daemons. This will start and add the new datanodes to the
> up-and-running cluster.
>
> hopefully my info will help.
>
>
> Ski Gh3 wrote:
>
> hi,
>
> I am wondering how to add more datanodes to an up-and-running
> hadoop
> instance?
> Couldn't find instructions on this from the wiki page.
>
> Thanks!
>
>
>
>
>




Re: replication issue and safe mode pending

2008-10-21 Thread Konstantin Shvachko

hadoop fsck /user/brian/path/to/file -files -blocks -locations

will also list data-nodes containing replicas.

Brian Bockelman wrote:

Hey Zheng,

You can explicitly exit safemode by

hadoop dfsadmin -safemode leave

Then, you can delete the file with the corrupt blocks by:

hadoop fsck / -delete

Joey: You may list the block locations for a specific file by:

hadoop fsck /user/brian/path/to/file -files -blocks

However, if they are missing, then there obviously won't be any listed 
replicas for some block.


Hope it helps,

Brian

On Oct 21, 2008, at 4:33 PM, Zheng Shao wrote:

http://markmail.org/message/2xtywnnppacywsya shows we can exit safe 
mode explicitly and just delete these corrupted files.


But I don't know how to exit safe mode explicitly.

Zheng
-Original Message-
From: Joey Pan [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 21, 2008 2:01 PM
To: core-user@hadoop.apache.org
Subject: replication issue and safe mode pending

When starting hadoop, it stays in safe mode forever. Looking into the
issue, it seems there is a problem with block replication.



Command   hadoop fsck / shows error msg:

/X/part-7.deflate: CORRUPT block blk_1402039344260425079



/X/part-7.deflate: MISSING 1 blocks of total size 2294
B..



Tried to restart the fs, but it doesn't solve the problem.



Question: how to find the data node that contains corrupted/missing 
blocks?




Thanks,

Joey











Re: Too many open files IOException while running hadoop mapper phase

2008-10-21 Thread Karl Anderson


On 20-Oct-08, at 11:59 PM, 晋光峰 wrote:


Dear all,

I use Hadoop 0.18.0 [...]


0.18.0 is not a stable release.  Try 0.17.x or 0.18.1.  I had a  
similar problem which was solved by switching.  There are suggestions  
on the web and archives of this list for dealing with the issue, but  
try switching versions first.
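
If switching alone doesn't help, one workaround that people commonly suggest for this class of error (an assumption on my part that it applies to your setup) is raising the per-process open-file limit for the user that runs the daemons, e.g.:

# check the current limit in the shell that launches the task tracker
ulimit -n
# raise it for that shell (the value is only illustrative)
ulimit -n 16384
# or make it permanent in /etc/security/limits.conf, for example:
#   hadoop  soft  nofile  16384
#   hadoop  hard  nofile  16384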



Karl Anderson
[EMAIL PROTECTED]
http://monkey.org/~kra





Re: Hadoop Camp next month

2008-10-21 Thread Milind Bhandarkar
Hi,

Just received an email from the ApacheCon US organizers that they are giving a 50%
discount for Hadoop Training
(http://us.apachecon.com/c/acus2008/sessions/93).

 
>We just created a discount code for you to give people 50% off of the
>cost of the training. The code is <  Hadoop50  >  The discount would
>apply only to the Training.


Hope to see you there!

- Milind


On 10/2/08 9:03 AM, "Owen O'Malley" <[EMAIL PROTECTED]> wrote:

> Hi all,
>I'd like to remind everyone that the Hadoop Camp & ApacheCon US is
> coming up in New Orleans next month. http://tinyurl.com/hadoop-camp
> 
> It will be the largest gathering of Hadoop developers outside of
> California. We'll have:
> 
> Core: Doug Cutting, Dhruba Borthakur, Arun Murthy, Owen O'Malley,
> Sameer Paranjpye,
> Sanjay Radia, Tom White
> Zookeeper: Ben Reed
> 
> There will also be a training session on Practical Problem Solving
> with Hadoop by Milind Bhandarkar on Monday.
> 
> So if you'd like to meet the developers or find out more about Hadoop,
> come join us!
> 
> -- Owen


-- 
Milind Bhandarkar
Y!IM: GridSolutions
408-349-2136 
([EMAIL PROTECTED])



Re: replication issue and safe mode pending

2008-10-21 Thread Brian Bockelman

Hey Zheng,

You can explicitly exit safemode by

hadoop dfsadmin -safemode leave

Then, you can delete the file with the corrupt blocks by:

hadoop fsck / -delete

Joey: You may list the block locations for a specific file by:

hadoop fsck /user/brian/path/to/file -files -blocks

However, if they are missing, then there obviously won't be any listed  
replicas for some block.
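
Putting those together, roughly (the last command is destructive, so run fsck without -delete first to see what it would remove):

# leave safe mode manually
hadoop dfsadmin -safemode leave
# inspect a file's blocks and which data-nodes (if any) hold replicas
hadoop fsck /user/brian/path/to/file -files -blocks -locations
# finally, remove files whose blocks are missing or corrupt
hadoop fsck / -delete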


Hope it helps,

Brian

On Oct 21, 2008, at 4:33 PM, Zheng Shao wrote:

http://markmail.org/message/2xtywnnppacywsya shows we can exit safe  
mode explicitly and just delete these corrupted files.


But I don't know how to exit safe mode explicitly.

Zheng
-Original Message-
From: Joey Pan [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 21, 2008 2:01 PM
To: core-user@hadoop.apache.org
Subject: replication issue and safe mode pending

When starting hadoop, it stays in safe mode forever. Looking into
the issue, it seems there is a problem with block replication.



Command   hadoop fsck / shows error msg:

/X/part-7.deflate: CORRUPT block blk_1402039344260425079



/X/part-7.deflate: MISSING 1 blocks of total size 2294
B..



Tried to restart the fs, but it doesn't solve the problem.



Question: how to find the data node that contains corrupted/missing  
blocks?




Thanks,

Joey










RE: replication issue and safe mode pending

2008-10-21 Thread Zheng Shao
http://markmail.org/message/2xtywnnppacywsya shows we can exit safe mode 
explicitly and just delete these corrupted files.

But I don't know how to exit safe mode explicitly.

Zheng
-Original Message-
From: Joey Pan [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 21, 2008 2:01 PM
To: core-user@hadoop.apache.org
Subject: replication issue and safe mode pending

When starting hadoop, it stays in safe mode forever. Looking into the issue,
it seems there is a problem with block replication.



Command   hadoop fsck / shows error msg:

/X/part-7.deflate: CORRUPT block blk_1402039344260425079



/X/part-7.deflate: MISSING 1 blocks of total size 2294
B..



Tried to restart the fs, but it doesn't solve the problem.



Question: how to find the data node that contains corrupted/missing blocks?



Thanks,

Joey









replication issue and safe mode pending

2008-10-21 Thread Joey Pan
When starting hadoop, it stays in safe mode forever. Looking into the issue,
it seems there is a problem with block replication.

 

Command   hadoop fsck / shows error msg: 

/X/part-7.deflate: CORRUPT block blk_1402039344260425079

 

/X/part-7.deflate: MISSING 1 blocks of total size 2294
B..

 

Tried to restart the fs, but it doesn't solve the problem.

 

Question: how to find the data node that contains corrupted/missing blocks? 

 

Thanks, 

Joey 

 

 

 



understanding dfshealth.jsp

2008-10-21 Thread Michael Bieniosek
Hi,

I have a dfs cluster with replication set to 3.  In dfshealth.jsp, I see a node 
with:

Size(GB) = 930.25
Used(%) = 9.83
Remaining(GB) = 631.08

How is this possible?  Why doesn't (Size - Remaining) equal (Used% * Size)?  Are Size
and Remaining measured in different units (replicated vs. not)?

At the top of the page, I also see this:
Capacity : 80.85 TB
DFS Remaining:29.94 TB
DFS Used:39.01 TB
DFS Used%:48.25 %

Does Capacity mean the same thing as Size?  Here, it looks like DFS Used% * 
Capacity = DFS Used.  But how does DFS Remaining fit in?
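
For concreteness, here is the arithmetic that confuses me (my own back-of-the-envelope numbers, so the rounding may be slightly off):

Size - Remaining    = 930.25 - 631.08 = 299.17 GB
Used% * Size        = 9.83% * 930.25  = ~91.44 GB   (the reported Used)
Capacity - DFS Used = 80.85 - 39.01   = 41.84 TB, yet DFS Remaining is 29.94 TB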

Thanks,
Michael


Re: adding more datanode

2008-10-21 Thread Ski Gh3
I'm not sure I get this.
1. If you format the filesystem (which I thought is usually executed on the
master node, but anyway)
don't you erase all your data?
2. I guess I need to add the new machine to the conf/slaves file,
but then do I run start-all.sh again from the master node while my cluster
is already running?

Thanks!
On Mon, Oct 20, 2008 at 5:59 PM, David Wei <[EMAIL PROTECTED]> wrote:

> this is quite easy. You can just configure your new datanodes like the others and
> format the filesystem before you start them.
> Remember to make them ssh-able from your master and run ./bin/start-all.sh on
> the master machine if you want to start all the daemons. This will start and
> add the new datanodes to the up-and-running cluster.
>
> hopefully my info will help.
>
>
> Ski Gh3 wrote:
>
>  hi,
>>
>> I am wondering how to add more datanodes to an up-and-running hadoop
>> instance?
>> Couldn't find instructions on this from the wiki page.
>>
>> Thanks!
>>
>>
>>
>
>
>


Re: mysql in hadoop

2008-10-21 Thread Amit k. Saha
Hi Deepak,

On Mon, Oct 20, 2008 at 10:13 PM, Deepak Diwakar <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I am sure someone must have tried a mysql connection using hadoop, but I am
> running into a problem.
> Basically I do not see how to include the classpath of the jdbc connector jar
> in the hadoop run command, or whether there is another way to
> incorporate the jdbc connector jar into the main jar which we run using
> $hadoop-home/bin/hadoop?
>
> Please help me.
>
> Thanks in advance,

Just inquisitive: What application on Hadoop are you working on which
uses MySQL?

Thanks,
Amit
>



-- 
Amit Kumar Saha
http://blogs.sun.com/amitsaha/
http://amitsaha.in.googlepages.com/
Skype: amitkumarsaha


Re: A Scale-Out RDF Store for Distributed Processing on Map/Reduce

2008-10-21 Thread Stuart Sierra
On Mon, Oct 20, 2008 at 9:02 PM, Edward J. Yoon <[EMAIL PROTECTED]> wrote:
> This RDF proposal is from a good while ago. Now we'd like to settle
> down to research it again. I have attached our proposal; we'd love to hear
> your feedback & stories!!

Hello, Edward,
I'm very glad to see this idea moving forward.  Two comments:

An essential feature, for me, would be the ability to write custom
MapReduce jobs to process RDF, independent of the RDF query processor.
 That way I could plug in my own inference engine, rules engine, or
graph transformer.

I'd also like to see re-use of existing APIs wherever possible, like
JRDF or RDF2Go.  It may be worth examining other large-scale RDF
databases like Mulgara to see if any code can be reused.
-Stuart


Re: Keep free space with du.reserved and du.pct

2008-10-21 Thread Allen Wittenauer



On 10/21/08 3:33 AM, "Jean-Adrien" <[EMAIL PROTECTED]> wrote:
> I expected to keep 3.75 Gb free.
> But free space goes under 1 Gb, as if I kept the default settings

I noticed that you're running on /.  In general, this is a bad idea, as
space can disappear in various ways and you'll never know.  For example,
/var/log can grow tremendously without warning or there might be a
deleted-but-still-open file handle on /tmp.

What does a du on the dfs directories tell you?  How much space is *actually*
being used by Hadoop?

You might also look around for dead task leftovers.
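
For example, something along these lines (the paths are placeholders; substitute whatever dfs.data.dir, hadoop.tmp.dir and mapred.local.dir point to on that node):

# how much the data-node's block storage actually occupies
du -sh /path/to/dfs/data
# how full the underlying partition really is
df -h /
# leftover task working directories can also eat space
du -sh /path/to/hadoop-tmp/mapred/local
# and what the name-node believes about the node's capacity
hadoop dfsadmin -report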

> I read a bit in jira (HADOOP-2991) and I saw that the implementation of
> these directives was subject to discussion. But it is not marked as
> affecting 0.18.1. What is the situation now?

I'm fairly certain it is unchanged.  None of the developers seem
particularly interested in a static allocation method, deeming it too hard
to maintain when you have large or heterogeneous clusters.

HADOOP-2816 going into 0.19 is somewhat relevant, though, because the
name node UI is completely wrong when it comes to the actual capacity.



Re: A Scale-Out RDF Store for Distributed Processing on Map/Reduce

2008-10-21 Thread Edward J. Yoon
Oh, sorry for our mistake: "It will be one of the Apache Incubator
Projects" should be "It will be proposed to the Apache Incubator
Project".

Thanks. :)

On Tue, Oct 21, 2008 at 10:02 AM, Edward J. Yoon <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> This RDF proposal is from a good while ago. Now we'd like to settle
> down to research it again. I have attached our proposal; we'd love to hear
> your feedback & stories!!
>
> Thanks.
> --
> Best regards, Edward J. Yoon
> [EMAIL PROTECTED]
> http://blog.udanax.org
>



-- 
Best regards, Edward J. Yoon
[EMAIL PROTECTED]
http://blog.udanax.org


Re: Does anybody have tried to setup a cluster with multiple namenodes?

2008-10-21 Thread Steve Loughran

Chris Douglas wrote:
The secondary namenode is neither a backup service for the HDFS 
namespace nor a failover for requests:


http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html#Secondary+NameNode 



The secondary namenode periodically merges an image (FSImage) of the 
namesystem with recent changes (FSEdits), almost always on another 
machine. The memory requirements for the NameNode are formidable, and 
merging these on the same machine that's servicing requests would make 
them infeasible for large clusters. -C


Currently that namenode is a SPOF; it puts a limit on both the scale and 
availability of a large cluster. The ultimate way to fix both problems 
would be to (somehow) share the work out among multiple namenodes, 
though I hesitate to come up with an architecture for doing that 
reliably, let alone a commitment to do any of the work. Some possible 
ways to do this (none of which are in the codebase today):


-give different namenodes ownership of bits of the filesystem; they'd 
need to coordinate allocation of blocks across machines though.


-have peer namenodes sharing state using some kind of tuple-space or 
similar distributed datastructure. That's easier said than done though; 
a busy cluster will have a high change-rate on the t-space facts and you 
need to choreograph the directory operations quite carefully.


-Have the namenode storing state into a local database which is 
clustered in some failover design (shared disk array, or synchronization 
of changes over ethernet)


This would be a very interesting project for someone out there to take 
up. One of the fun problems is testing that failover works in all situations, 
especially handling not the outage of one of the machines, but the 
partitioning of the network so the two namenodes can't see each other.


-steve


Re: mysql in hadoop

2008-10-21 Thread Deepak Diwakar
Thanks for the solutions you guys provided.

Actually I found one solution:
if we add the path of any external jar to the hadoop classpath, which is set in
$hadoop_home/hadoop-$version/conf/hadoop-env.sh, it will work.

Thanks for the solution.
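
For reference, the line I mean looks something like this (the jar path is only an example; point it at wherever your JDBC driver actually lives):

# in $HADOOP_HOME/conf/hadoop-env.sh
export HADOOP_CLASSPATH=/path/to/mysql-connector-java.jar
# alternatively, as suggested below, copy the jar into $HADOOP_HOME/lib/ on every
# node (or bundle it inside your job jar) and restart the daemons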

2008/10/20 Gerardo Velez <[EMAIL PROTECTED]>

> Hi!
>
> Actually I got the same problem and temporarily I solved it by including the
> jdbc dependencies inside the main jar.
>
> Another solution I've found is that you can place all the jar dependencies
> inside the hadoop/lib directory.
>
>
> Hope it helps.
>
>
> -- Gerardo
>
>
> On Mon, Oct 20, 2008 at 9:43 AM, Deepak Diwakar <[EMAIL PROTECTED]>
> wrote:
>
> > Hi all,
> >
> > I am sure someone must have tried a mysql connection using hadoop, but I am
> > running into a problem.
> > Basically I do not see how to include the classpath of the jdbc
> > connector jar
> > in the hadoop run command, or whether there is another way to
> > incorporate the jdbc connector jar into the main jar which we run using
> > $hadoop-home/bin/hadoop?
> >
> > Please help me.
> >
> > Thanks in advance,
> >
>



-- 
- Deepak Diwakar,
Associate Software Eng.,
Pubmatic, pune
Contact: +919960930405


Re: A Scale-Out RDF Store for Distributed Processing on Map/Reduce

2008-10-21 Thread Edward J. Yoon
Any feedback?

We also need feedback from the core committers.

/Edward

On Tue, Oct 21, 2008 at 3:13 PM, Hyunsik Choi <[EMAIL PROTECTED]> wrote:
> Although we proposed the system for RDF data, we are actually
> considering a more general system for a graph data model. Many kinds of
> data in the real world can be represented with a graph data model. In particular,
> besides web data, some data domains (e.g., biological data, chemical
> data, social networks, and so on) are naturally represented as graph data.
>
> What do you think about that?
>
> --
> Hyunsik Choi
> Database & Information Systems Lab, Korea University
>
>
> Edward J. Yoon wrote:
>> Hi all,
>>
>> This RDF proposal is from a good while ago. Now we'd like to settle
>> down to research it again. I have attached our proposal; we'd love to hear
>> your feedback & stories!!
>>
>> Thanks.
>>
>
>



-- 
Best regards, Edward J. Yoon
[EMAIL PROTECTED]
http://blog.udanax.org


Too many open files IOException while running hadoop mapper phase

2008-10-21 Thread 晋光峰
Dear all,

I use Hadoop 0.18.0 to execute a job which outputs a huge number of key-value pairs
in the Mapper phase. While running the job in the mapper phase, the hadoop
framework throws the exception below:

java.io.FileNotFoundException: /home/guangfeng/bin/hadoop-0.18.0/tmp/hadoop-guangfeng/mapred/local/taskTracker/jobcache/job_200810211354_0001/attempt_200810211354_0001_m_05_0/output/spill491.out.index (Too many open files)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:106)
at org.apache.hadoop.fs.RawLocalFileSystem$TrackingFileInputStream.<init>(RawLocalFileSystem.java:70)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.<init>(RawLocalFileSystem.java:106)
at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:176)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:117)
at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:274)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:375)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1025)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:702)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:228)
at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)


It seems that the framework opens too many files at the same time, which
exceeds the linux file system limits. Has anybody faced such problems
before? By the way, can anybody give a detailed description of the process
while running mappers?

Thanks
-- 
Guangfeng Jin

Software Engineer