Re: Working with MapFiles

2012-04-02 Thread Ioan Eugen Stan
Hi Ondrej, On 02.04.2012 13:00, Ondřej Klimpera wrote: Ok, thanks. I missed the setup() method because I'm using an older version of Hadoop, so I suppose the configure() method does the same in Hadoop 0.20.203. Aha, if it's possible, try upgrading. I don't know how support is for versions older t

Re: Working with MapFiles

2012-04-02 Thread Ondřej Klimpera
Ok, thanks. I missed the setup() method because I'm using an older version of Hadoop, so I suppose the configure() method does the same in Hadoop 0.20.203. Now I'm able to load a MapFile inside the configure() method into a MapFile.Reader instance kept as a private class variable. All works fine, just wonderin

Re: Working with MapFiles

2012-04-02 Thread Ioan Eugen Stan
Hi Ondrej, On 30.03.2012 14:30, Ondřej Klimpera wrote: And one more question: is it even possible to add a MapFile (as it consists of an index file and a data file) to the distributed cache? Thanks. Should be no problem, they are just two files. On 03/30/2012 01:15 PM, Ondřej Klimpera wrote: Hello, I'm

Re: Working with MapFiles

2012-03-30 Thread Ondřej Klimpera
And one more question: is it even possible to add a MapFile (as it consists of an index file and a data file) to the distributed cache? Thanks. On 03/30/2012 01:15 PM, Ondřej Klimpera wrote: Hello, I'm not sure what you mean by using map reduce setup()? "If the file is that small you could load it all in mem

Re: Working with MapFiles

2012-03-30 Thread Ondřej Klimpera
Hello, I'm not sure what you mean by using map reduce setup()? "If the file is that small you could load it all in memory to avoid network IO. Do that in the setup() method of the map reduce job." Can you please explain a little bit more? Thanks. On 03/30/2012 12:49 PM, Ioan Eugen Stan wrote:

Re: Working with MapFiles

2012-03-30 Thread Ioan Eugen Stan
Hello Ondrej, On 29.03.2012 18:05, Ondřej Klimpera wrote: Hello, I have a MapFile as the product of a MapReduce job, and what I need to do is: 1. If MapReduce produced more than one split as output, merge them into a single file. 2. Copy this merged MapFile to another HDFS location and use it as a Distrib

Re: Working with MapFiles

2012-03-30 Thread Ondřej Klimpera
Hello, I've got one more question: how is the seek() (or get()) method implemented in MapFile.Reader? Does it use hashCode(), compareTo(), or another mechanism to find a match in the MapFile's index? Thanks for your reply. Ondrej Klimpera On 03/29/2012 08:26 PM, Ondřej Klimpera wrote: Thanks for your f
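
For reference, a minimal sketch of the lookup strategy asked about above. As far as I know, MapFile.Reader does not use hashCode(): it keeps a sparse, sorted index in memory (by default every 128th key, governed by io.map.index.interval), binary-searches that index using the key comparator, and then scans the data file forward from the indexed position. The plain-Java class below only imitates that idea with in-memory lists. SparseIndexLookup and all of its members are hypothetical names for illustration, not Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a sparse-index lookup; not Hadoop code.
class SparseIndexLookup {
    private final List<String> keys;    // stands in for the sorted "data" file keys
    private final List<String> values;  // values stored alongside the keys
    private final List<String> indexKeys = new ArrayList<>();      // every Nth key
    private final List<Integer> indexPositions = new ArrayList<>(); // its position in the data

    SparseIndexLookup(List<String> sortedKeys, List<String> values, int interval) {
        this.keys = sortedKeys;
        this.values = values;
        // Build the sparse index: remember only every interval-th key.
        for (int i = 0; i < sortedKeys.size(); i += interval) {
            indexKeys.add(sortedKeys.get(i));
            indexPositions.add(i);
        }
    }

    String get(String key) {
        // Step 1: binary search the index for the last indexed key <= key,
        // using compareTo() ordering only (no hashing involved).
        int lo = 0, hi = indexKeys.size() - 1, slot = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (indexKeys.get(mid).compareTo(key) <= 0) {
                slot = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (slot < 0) {
            return null; // key sorts before the first entry
        }
        // Step 2: scan forward from the indexed position until we hit or pass the key.
        for (int i = indexPositions.get(slot); i < keys.size(); i++) {
            int c = keys.get(i).compareTo(key);
            if (c == 0) return values.get(i);
            if (c > 0) return null; // passed the place where the key would be
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> k = new ArrayList<>();
        List<String> v = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            k.add(String.format("k%03d", i)); // zero-padded so string order matches numeric order
            v.add("v" + i);
        }
        SparseIndexLookup lookup = new SparseIndexLookup(k, v, 128);
        System.out.println(lookup.get("k500"));
        System.out.println(lookup.get("missing"));
    }
}
```

The practical consequence is that keys have to be written in sorted order and compared consistently, since the search relies entirely on ordering.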

Re: Working with MapFiles

2012-03-29 Thread Ondřej Klimpera
Thanks for your fast reply, I'll try this approach :) On 03/29/2012 05:43 PM, Deniz Demir wrote: Not sure if this helps in your use case, but you can put all output files into the distributed cache and then access them in the subsequent map-reduce job (in driver code): // previous mr-job's o

Re: Working with MapFiles

2012-03-29 Thread Deniz Demir
Not sure if this helps in your use case, but you can put all output files into the distributed cache and then access them in the subsequent map-reduce job (in driver code): // previous mr-job's output String pstr = "hdfs:// Hello, > > I have a MapFile as a product of MapReduce job, an

Working with MapFiles

2012-03-29 Thread Ondřej Klimpera
Hello, I have a MapFile as the product of a MapReduce job, and what I need to do is: 1. If MapReduce produced more than one split as output, merge them into a single file. 2. Copy this merged MapFile to another HDFS location and use it as a distributed cache file for another MapReduce job. I'm wondering if