Re: java.io.FileNotFoundException

2010-01-13 Thread ruslan usifov
No, the tmp folder settings are all OK. The problem is the following code in the MapReduce framework, in TaskRunner.java: public static void setupWorkDir(JobConf conf) throws IOException { File workDir = new File(".").getAbsoluteFile(); FileUtil.fullyDelete(workDir); On Cygwin, if we delete cwd(curr

Re: Should mapreduce.ReduceContext reuse same object in nextKeyValue?

2010-01-13 Thread Eric Sammer
On 1/13/10 12:29 PM, Ed Mazur wrote: > What is the preferred method of avoiding value buffering? For example, > if you're building a basic inverted index, you have one key (term) > associated with many values (doc ids) in your reducer. If you want an > output pair of something like , is there a way

RE: how to load big files into Hbase without crashing?

2010-01-13 Thread Clements, Michael
After some investigation I found that this feature, a max cap on the number of tasks in a job, is upcoming. Looks like Kevin and Matei wrote it: http://issues.apache.org/jira/browse/MAPREDUCE-698 We'll create a Fair Scheduler pool used exclusively for uploading data to HBase. In this pool we'll cap th

Re: Should mapreduce.ReduceContext reuse same object in nextKeyValue?

2010-01-13 Thread Ed Mazur
On Tue, Jan 12, 2010 at 7:14 PM, Eric Sammer wrote: > On 1/12/10 6:53 PM, Wilkes, Chris wrote: >> I created my own Writable class to store 3 pieces of information.  In my >> mapreducer.Reducer class I collect all of them and then process as a >> group, ie: >> >> reduce(key, values, context) { >>
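The thread title asks whether the framework reuses one value object across iterations, and the quoted message describes collecting custom Writable values in a reducer before processing them as a group. The sketch below illustrates why that combination can go wrong. It is plain Java with no Hadoop dependency: `Triple` is a hypothetical stand-in for the poster's three-field Writable, and `reusingValues` is an assumed imitation of an iterator that refills and returns the same instance on every call. It is not the actual ReduceContext implementation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReuseDemo {
    /** Hypothetical mutable value type standing in for a custom Writable. */
    static final class Triple {
        int a, b, c;

        void set(int a, int b, int c) {
            this.a = a;
            this.b = b;
            this.c = c;
        }

        /** Returns an independent copy of the current field values. */
        Triple copy() {
            Triple t = new Triple();
            t.set(a, b, c);
            return t;
        }
    }

    /** Assumed imitation of object reuse: every next() refills one shared instance. */
    static Iterable<Triple> reusingValues(final int[][] rows) {
        final Triple shared = new Triple();
        return () -> new Iterator<Triple>() {
            private int i = 0;

            @Override
            public boolean hasNext() {
                return i < rows.length;
            }

            @Override
            public Triple next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                int[] r = rows[i++];
                shared.set(r[0], r[1], r[2]);
                return shared;
            }
        };
    }

    /** Buffers references only: every list slot points at the same shared object. */
    static List<Triple> collectByReference(Iterable<Triple> values) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : values) {
            out.add(t);
        }
        return out;
    }

    /** Buffers a copy of each value, so earlier entries keep their own contents. */
    static List<Triple> collectByCopy(Iterable<Triple> values) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : values) {
            out.add(t.copy());
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] rows = {{1, 2, 3}, {4, 5, 6}};
        List<Triple> refs = collectByReference(reusingValues(rows));
        List<Triple> copies = collectByCopy(reusingValues(rows));
        System.out.println("by reference, first.a = " + refs.get(0).a);
        System.out.println("by copy, first.a = " + copies.get(0).a);
    }
}
```

With reference collection, every list entry aliases the last value the iterator produced. Copying costs one allocation per value, which is the buffering overhead the follow-up question in this thread asks how to avoid.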