Re: Working of combiner in hadoop

2014-07-04 Thread Chris Mawata
The key/value pairs are processed by the mapper independently of each other. The combiner logic deals with all the outputs from multiple key/value pairs, so that logic cannot be in the map method. On Jul 4, 2014 1:29 AM, Chhaya Vishwakarma chhaya.vishwaka...@lntinfotech.com wrote: Hi, If
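
The distinction drawn in the reply above can be illustrated with a small sketch. This is a plain-Python simulation of a word count, not the Hadoop API: the function names (`map_record`, `combine`, `run_map_task`, `reduce_all`) and the sample splits are hypothetical and exist only for illustration. It assumes the combiner runs exactly once per map task; in real Hadoop the combiner is an optimization that the framework may run zero, one, or several times on a map task's output, but it never merges the outputs of different map tasks.

```python
from collections import defaultdict


def map_record(line):
    """Mapper: sees one record at a time and emits (word, 1) pairs."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Combiner: sums counts per key over the output of ONE map task."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


def run_map_task(split):
    """Run the mapper on every record of a split, then combine that task's output."""
    emitted = []
    for line in split:
        emitted.extend(map_record(line))
    return combine(emitted)


def reduce_all(task_outputs):
    """Reducer: merges the already-combined outputs of all map tasks."""
    totals = defaultdict(int)
    for output in task_outputs:
        for key, value in output:
            totals[key] += value
    return dict(totals)


# Two hypothetical input splits, one per map task.
split_a = ["a b a", "b a"]
split_b = ["a c"]

# Each map task is combined separately; only the reducer sees both.
outputs = [run_map_task(split_a), run_map_task(split_b)]
result = reduce_all(outputs)
```

Because `map_record` only ever sees a single record, the per-key summing has to live in `combine`, which works on everything one map task emitted. The key "a" still reaches the reducer twice, once from each task, which is why a combiner cannot replace the reducer.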

Re: Need to evaluate the price of a Hadoop cluster

2014-07-04 Thread Chris Mawata
Some comments: Three 1 TB drives will be better than one 3 TB drive. On a small cluster you cannot afford to reserve a whole machine for each master daemon. The NameNode and JobTracker will have to cohabit with DataNodes and TaskTrackers. As for pricing, if it is for an institution

Re: Working of combiner in hadoop

2014-07-04 Thread JAGANADH G
On Fri, Jul 4, 2014 at 10:59 AM, Chhaya Vishwakarma chhaya.vishwaka...@lntinfotech.com wrote: Hi, If I have two map tasks running on one node, and I have also written a combiner class, will the combiner be called once for each map task or just once for both map tasks? Can I write a logic

In progress edit log from last run not being played in case of a cluster (HA) restart

2014-07-04 Thread Nitin Goyal
Hi All, I am running Hadoop 2.4.0. I am trying to restart my HA cluster, but since there isn't a way to gracefully shut down the NN (AFAIK), I am running into a (sort of) race condition. A client has issued a delete command and NN successfully deletes the requested file (in-progress edit logs

Re: Multi-Cluster Setup

2014-07-04 Thread fab wol
Hey Rahul, thanks for pointing me to that page. It's definitely worth a read. Do both clusters need to be at least v2.3 for that? I was also digging a little bit further. There is the property setting fs.defaultFS, which might be the exact setting I was actually looking for. Unfortunately MapR

Streaming data - Available tools

2014-07-04 Thread santosh.viswanathan
Hello Experts, Wanted to explore the available tools in the market on streaming data. I know Apache Spark exists. Are there any other tools available? Regards, Santosh Karthikeyan

Thank you And What advice would you give me on running my first Hadoop cluster based Job

2014-07-04 Thread Chris MacKenzie
Hi, Over the past two weeks, from a standing start, I've worked on a Hadoop based parallel genetic sequence alignment algorithm as part of my university masters project. Thankfully that's now up and running, along the way I got some great help from members of this group and I deeply appreciate

Re: Streaming data - Available tools

2014-07-04 Thread Adaryl Bob Wakefield, MBA
Storm. It’s not a part of the Apache project but it seems to be what people are using to process event data. B. From: santosh.viswanat...@accenture.com Sent: Friday, July 04, 2014 11:25 AM To: user@hadoop.apache.org Subject: Streaming data - Available tools Hello Experts, Wanted to

Re: Streaming data - Available tools

2014-07-04 Thread Marcos Ortiz
Storm is another project sponsored by ASF. Look here: http://storm.apache.org On 04/07/14 12:28, Adaryl Bob Wakefield, MBA wrote: Storm. It’s not a part of the Apache project but it seems to be what people are using to process event data. B. *From:* santosh.viswanat...@accenture.com

Re: Streaming data - Available tools

2014-07-04 Thread Adaryl Bob Wakefield, MBA
My information is out of date. It looks like it’s a full-on incubator project now. Here is a working link: https://storm.incubator.apache.org/ B. From: Marcos Ortiz Sent: Friday, July 04, 2014 11:31 AM To: user@hadoop.apache.org Subject: Re: Streaming data - Available tools Storm is another

Re: Streaming data - Available tools

2014-07-04 Thread Cristóbal Giadach
Try Storm + Esper http://tomdzk.wordpress.com/2011/09/28/storm-esper/ On Fri, Jul 4, 2014 at 12:38 PM, Adaryl Bob Wakefield, MBA adaryl.wakefi...@hotmail.com wrote: My information is out of date. It looks like it's a full-on incubator project now. Here is a working link:

Pagerank In Hadoop

2014-07-04 Thread Deep Pradhan
I want to run a PageRank job in Hadoop. I know that there is a Pegasus implementation of PageRank. How do I submit the job to Hadoop to run the PageRank algorithm? I also want to know if I have to supply the code. Thank You -- *Whether you think you can or you cannot.either way you are