Re: Sending data to all reducers

2012-08-23 Thread Sonal Goyal
Hamid, I would recommend taking another look at your current algorithm and making sure you are using the MR framework to its strengths. You can evaluate having multiple passes for your MapReduce program, or doing a map-side join. You mention runtime is important for your system, so make sure you

Reg: when failures on writing to DB from map\reduce

2012-08-23 Thread Manoj Babu
Hi all, in Sqoop, when exporting from HDFS to a DB: if an export map task fails due to these or other reasons, it will cause the export job to fail. The results of a failed export are undefined. Each export map task operates in a separate transaction. Furthermore, individual map tasks commit their c
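Because each map task works in its own transaction, a failed job can still leave some tasks' rows in the target table. The sketch below simulates only that per-task behaviour with Python's `sqlite3`, not Sqoop itself; the table name `export_target`, the task splits, and the simulated failure are all assumptions made for illustration.

```python
import sqlite3

# In-memory database standing in for the export target (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE export_target (id INTEGER PRIMARY KEY, val TEXT)")
conn.commit()


def run_export_task(conn, rows, fail=False):
    """Insert one task's rows in a single transaction; roll back on failure."""
    try:
        conn.executemany("INSERT INTO export_target VALUES (?, ?)", rows)
        if fail:
            raise RuntimeError("simulated task failure")
        conn.commit()
        return True
    except Exception:
        conn.rollback()
        return False


# Each entry is (rows for one task, whether that task is made to fail).
splits = [
    ([(1, "a"), (2, "b")], False),  # task 0 succeeds and commits
    ([(3, "c"), (4, "d")], True),   # task 1 fails and rolls back
    ([(5, "e")], False),            # task 2 succeeds and commits
]
results = [run_export_task(conn, rows, fail) for rows, fail in splits]
job_succeeded = all(results)
committed_ids = [r[0] for r in conn.execute("SELECT id FROM export_target ORDER BY id")]
```

In this simulation the job as a whole fails, yet the rows from the two successful tasks stay in the table, so a retry has to tolerate or clean up partial results.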

Re: Sending data to all reducers

2012-08-23 Thread Hamid Oliaei
Hi, I will take a look at that; I hope it will be useful for my purpose. Thank you so much. Hamid

Re: Sending data to all reducers

2012-08-23 Thread Tim Robertson
Then I think you might be best exploring running a getmerge on each client. How you trigger that is up to you, but something like Fabric [1] might help. Others might propose different solutions, but it doesn't sound like MR is a natural choice to me. I would expect this is the very fastest way o
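A minimal sketch of the getmerge approach, assuming the `hadoop` CLI is on each client's PATH. `hadoop fs -getmerge <src> <localdst>` concatenates the files under an HDFS directory into one local file. The helper names and paths below are hypothetical, and how the command gets triggered on each machine is left open, as in the message above.

```python
import subprocess


def getmerge_command(hdfs_dir, local_file):
    """Build the argv for merging an HDFS directory into one local file."""
    return ["hadoop", "fs", "-getmerge", hdfs_dir, local_file]


def pull_to_local(hdfs_dir, local_file):
    """Run getmerge on this machine; raises CalledProcessError if it fails."""
    subprocess.check_call(getmerge_command(hdfs_dir, local_file))


# Hypothetical paths, for illustration only.
cmd = getmerge_command("/path/to/job-output", "/tmp/job-output.txt")
```

Running `pull_to_local` on every client gives each machine its own full copy of the data without a further MR job.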

Re: Sending data to all reducers

2012-08-23 Thread Hamid Oliaei
Hi, First of all, thank you Tim for giving your time. The answer to the first question is yes. My inputs are triples (sub, pre, obj) and they are stored on HDFS. The problem is: after running some MR jobs, some data is generated on all machines, and I want each machine to send part of that

Re: Sending data to all reducers

2012-08-23 Thread Tim Robertson
Sorry to ask so many questions, but it will help the user list offer you the best advice, as this is not a typical MR use case. - Do you foresee the reducer storing the data on the machine's local file system? - Do you need to use specific input formats for the job, or is it really just text files?

Re: Sending data to all reducers

2012-08-23 Thread Hamid Oliaei
exactly!!

Re: Sending data to all reducers

2012-08-23 Thread Tim Robertson
So you are trying to run a single reducer on each machine, and all input data regardless of its location gets streamed to each reducer? On Thu, Aug 23, 2012 at 10:41 AM, Hamid Oliaei wrote: > Hi, > > I want to broadcast some data to all nodes under Hadoop 0.20.2. I tested > DistributedCache modu

Sending data to all reducers

2012-08-23 Thread Hamid Oliaei
Hi, I want to broadcast some data to all nodes under Hadoop 0.20.2. I tested the DistributedCache module. Unfortunately, it was time-consuming, and runtime is important for my work. I want to write an MR job so that a copy of the input data is generated in the output of every reducer. Is that possible? How? I m
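One way to get a copy of every input record into each reducer's output is to have the mapper emit each record once per reducer, keyed by the target reducer's index, and to use a partitioner that routes key i to reducer i. The sketch below simulates that idea in plain Python rather than against the Hadoop API; `NUM_REDUCERS`, `broadcast_map`, `partition`, and `run_job` are hypothetical names for illustration only.

```python
from collections import defaultdict

NUM_REDUCERS = 3  # hypothetical; a real job would use its configured reducer count


def broadcast_map(record, num_reducers):
    """Emit one (target_reducer, record) pair per reducer."""
    for target in range(num_reducers):
        yield target, record


def partition(key, num_reducers):
    """Route key i to reducer i (stands in for a custom partitioner)."""
    return key % num_reducers


def run_job(records, num_reducers=NUM_REDUCERS):
    """Simulate map, shuffle, and an identity reduce; return each reducer's output."""
    shuffled = defaultdict(list)
    for record in records:
        for key, value in broadcast_map(record, num_reducers):
            shuffled[partition(key, num_reducers)].append(value)
    return {r: shuffled[r] for r in range(num_reducers)}


triples = [("s1", "p1", "o1"), ("s2", "p2", "o2")]
outputs = run_job(triples)
```

Note that this multiplies the shuffled data volume by the number of reducers, which matters when runtime is the main concern.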