Saumitra,
Two questions come to mind that could help you narrow down a solution:
1) How quickly do the downstream processes need the transformed data?
Reason: If you can delay processing long enough to batch the data into a
blob that is a multiple of your block size, you avoid writing many small
files to HDFS.
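To make the batching idea concrete, here is a minimal Java sketch, assuming
records are buffered in memory and flushed as one HDFS file per block's worth
of data; the class name, output path, and flush policy are illustrative, not
something from this thread:

import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockSizedWriter {
    private final FileSystem fs;
    private final long blockSize;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private int fileCounter = 0;

    public BlockSizedWriter(Configuration conf) throws IOException {
        fs = FileSystem.get(conf);
        // Use the filesystem's default block size as the flush threshold.
        blockSize = fs.getDefaultBlockSize(new Path("/data"));
    }

    /** Buffer a record; flush a full-block file once the threshold is hit. */
    public synchronized void append(byte[] record) throws IOException {
        buffer.write(record);
        if (buffer.size() >= blockSize) {
            flush();
        }
    }

    private void flush() throws IOException {
        // Each flushed file holds roughly one block of data.
        Path out = new Path("/data/batch-" + (fileCounter++));
        try (FSDataOutputStream os = fs.create(out)) {
            os.write(buffer.toByteArray());
        }
        buffer.reset();
    }
}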
saumitra.offic...@gmail.com
To: common-user@hadoop.apache.org
Sent: Saturday, June 25, 2011 1:05 PM
Subject: Re: Queue support from HDFS
Thanks for the reply, Jakob.
As far as I understand, Kafka's Hadoop consumer is an MR job in which
mappers read from a shared Kafka queue and dump the data to HDFS, but those
mappers are not created dynamically as queue elements start bursting.
Is there a way to have new mappers created when the input queue starts to
fill up?
Not directly, but you may wish to take a look at the Kafka project
(http://sna-projects.com/kafka/), which we use as a queue and then
bring the data periodically into HDFS via an MR job. See this
presentation: http://www.slideshare.net/ydn/hug-january-2011-kafka-presentation
-Jakob
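For reference, a minimal sketch of the periodic pull Jakob describes, written
against the modern Kafka consumer API rather than the 2011-era SimpleConsumer;
the broker address, topic, group id, batch interval, and output paths are all
assumptions for illustration:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class KafkaToHdfs {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "hdfs-loader");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        FileSystem fs = FileSystem.get(new Configuration());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("events"));
            long batch = 0;
            while (true) {
                // Pull whatever accumulated in the queue since the last poll.
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMinutes(1));
                if (records.isEmpty()) continue;
                // Write the batch as one HDFS file; an MR job can process it later.
                Path out = new Path("/incoming/events-" + (batch++));
                try (FSDataOutputStream os = fs.create(out)) {
                    for (ConsumerRecord<String, String> r : records) {
                        os.writeBytes(r.value() + "\n");
                    }
                }
                consumer.commitSync();
            }
        }
    }
}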