Scaling inference on Hadoop DFS

2009-11-30 Thread Kunsheng Chen
Hi everyone, I currently have a MapReduce program that sorts input records and map-reduces them into output records, each carrying priority information. So far the program is running on 1 main node and 3 datanodes, and I got data something like the following: -
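The post above does not show its record format, so the following is only a hypothetical sketch (the `URGENT` prefix rule, record layout, and function names are all assumptions): a Hadoop Streaming-style mapper/reducer pair where the mapper tags each record with a priority key, so the framework's shuffle sort delivers higher-priority records first.

```python
# Hypothetical sketch, not the poster's actual code: tag records with a
# priority in the map phase so the shuffle/sort orders them by priority.

def map_record(line):
    """Map phase: emit (priority, record) pairs."""
    record = line.strip()
    # Assumed rule: records starting with "URGENT" get priority 0 (sorts first).
    priority = 0 if record.startswith("URGENT") else 1
    return (priority, record)

def reduce_records(mapped):
    """Simulate the shuffle/sort phase, then emit records in priority order."""
    return [rec for _, rec in sorted(mapped)]

lines = ["normal job A", "URGENT restart node", "normal job B"]
output = reduce_records(map_record(l) for l in lines)
```

In a real Streaming job the `sorted()` call would be done by the framework between the map and reduce stages; the mapper only has to put the priority at the front of the key.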

Re: Hadoop dfs can't allocate memory with enough hard disk space when data gets huge

2009-10-19 Thread Kunsheng Chen
ussed multiple times on the list. Grepping through > archives will help. > Also http://www.cloudera.com/blog/2009/02/02/the-small-files-problem/ > > Ashutosh > > On Sun, Oct 18, 2009 at 22:57, Kunsheng Chen > wrote: > > > I am running a hadoop program to perform MapRe
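The reply points at the small-files problem: HDFS keeps per-file metadata in NameNode memory, so millions of tiny files exhaust it long before disk space runs out. A common workaround is to pack the small inputs into a few large files before loading them into HDFS. A minimal local sketch (the 64 MB bundle size and file naming are assumptions; real deployments often use SequenceFile or HAR archives instead):

```python
# Sketch of the usual small-files workaround: concatenate many small files
# into a few large "bundle" files so the NameNode tracks far fewer objects.
import os

def pack_small_files(paths, out_dir, max_bundle_bytes=64 * 1024 * 1024):
    """Pack the given files into bundles of at most max_bundle_bytes each."""
    os.makedirs(out_dir, exist_ok=True)
    bundle_idx, written, out = 0, 0, None
    for path in paths:
        with open(path, "rb") as f:
            data = f.read()
        # Start a new bundle when the current one would overflow.
        if out is None or written + len(data) > max_bundle_bytes:
            if out:
                out.close()
            out = open(os.path.join(out_dir, f"bundle-{bundle_idx:05d}"), "wb")
            bundle_idx, written = bundle_idx + 1, 0
        out.write(data)
        written += len(data)
    if out:
        out.close()
```

Sizing bundles near the HDFS block size keeps one bundle per block, which is why 64 MB (the old default block size) is used here.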

Hadoop dfs can't allocate memory with enough hard disk space when data gets huge

2009-10-18 Thread Kunsheng Chen
I am running a Hadoop program to perform MapReduce work on files inside a folder. My program basically does Map and Reduce work: each line of any file is a pair of strings, and the result is each string associated with its occurrence count across all files. The program works fine until the number of file
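The job described is essentially word count over string pairs. A minimal local simulation of that logic (the "key value" per-line layout is an assumption, since the post does not show its input format):

```python
# Minimal sketch of the counting logic described in the post: each input
# line holds a pair of strings; the job counts how often each pair occurs
# across all files, word-count style.
from collections import Counter

def map_lines(lines):
    """Map phase: emit (pair, 1) for every input line."""
    for line in lines:
        key = tuple(line.split())   # assumed "key value" layout per line
        yield key, 1

def reduce_counts(pairs):
    """Reduce phase: sum the 1s per key."""
    counts = Counter()
    for key, n in pairs:
        counts[key] += n
    return dict(counts)

files = [["a b", "c d"], ["a b", "a b"]]
result = reduce_counts(kv for f in files for kv in map_lines(f))
```

In the real job each mapper would process one input split and the framework would group the `(pair, 1)` records by key before the reducer sums them.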

Program crashed when volume of data getting large

2009-09-23 Thread Kunsheng Chen
Hi everyone, I am running two map-reduce programs; they were working well, but when the data grows to around 900MB (5+ files), weird things happen to remind me as below: 'Communication problem with server: java.net.SocketTimeoutException: timed out waiting for rpc response' Also there i

"Timed out waiting for rpc response" after running a large number of jobs

2009-09-19 Thread Kunsheng Chen
Hi everyone, I am running two map-reduce programs; they were working well, but when the data grows to around 900MB (5+ files), weird things happen to remind me as below: 'Communication problem with server: java.net.SocketTimeoutException: timed out waiting for rpc response' Also there
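For the "timed out waiting for rpc response" error above, the usual first step in Hadoop of this era was to give the IPC client a longer timeout. A hedged configuration sketch (verify the property name and semantics against your Hadoop version's `core-default.xml`; in old releases the client's RPC socket timeout tracked the ping interval):

```xml
<!-- core-site.xml sketch: raise the IPC ping interval so the client waits
     longer for an rpc response before throwing SocketTimeoutException.
     Default in 0.20-era Hadoop was 60000 ms. -->
<property>
  <name>ipc.ping.interval</name>
  <value>300000</value>
</property>
```

A slow or overloaded NameNode/JobTracker is often the root cause once data volume grows, so raising the timeout treats the symptom; checking daemon memory and GC pauses treats the cause.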