Hello,

I have a 10-node Hadoop cluster and need to implement queues in it to receive a high volume of data. Please suggest which of the following would be more efficient for receiving 24 million JSON files, approximately 5 KB each, every 24 hours:

1. Using the Capacity Scheduler
2. Implementing RabbitMQ and receiving the data from it using Spring Integration data pipelines
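For scale, here is a rough back-of-the-envelope estimate of the load. It assumes the files arrive evenly over the day (they probably will not, so peaks would be higher) and that 5 KB means 5,000 bytes. The class and method names are only for illustration and are not part of any existing setup.

```java
// Rough load estimate for the ingest described above.
// Assumptions: uniform arrival over 24 hours, 5 KB = 5,000 bytes.
class IngestLoad {
    static final long SECONDS_PER_DAY = 24L * 60 * 60;

    // Average number of files arriving per second.
    static double filesPerSecond(long filesPerDay) {
        return (double) filesPerDay / SECONDS_PER_DAY;
    }

    // Total payload volume per day, in bytes.
    static long bytesPerDay(long filesPerDay, long bytesPerFile) {
        return filesPerDay * bytesPerFile;
    }

    // Average payload throughput, in bytes per second.
    static double bytesPerSecond(long filesPerDay, long bytesPerFile) {
        return (double) bytesPerDay(filesPerDay, bytesPerFile) / SECONDS_PER_DAY;
    }

    public static void main(String[] args) {
        long filesPerDay = 24_000_000L;
        long bytesPerFile = 5_000L;
        System.out.printf("files per second: %.1f%n", filesPerSecond(filesPerDay));
        System.out.printf("GB per day: %.1f%n", bytesPerDay(filesPerDay, bytesPerFile) / 1e9);
        System.out.printf("MB per second: %.2f%n", bytesPerSecond(filesPerDay, bytesPerFile) / 1e6);
    }
}
```

So on average this is roughly 278 files per second and about 120 GB per day, before any replication or message overhead.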
I cannot afford to lose any of the JSON files received.

Thank you,

--
Regards,
Ouch Whisper
010101010101