Take a look at Camus <https://github.com/linkedin/camus/>



François Langelier
Software Engineering Student - École de Technologie Supérieure <http://www.etsmtl.ca/>
Captain, Club Capra <http://capra.etsmtl.ca/>
VP Communications - CS Games <http://csgames.org> 2014
Jeux de Génie <http://www.jdgets.com/> 2011 to 2014
Treasurer, Fraternité du Piranha <http://fraternitedupiranha.com/> 2012-2014
Organizing Committee, Olympiades ÉTS 2012
Compétition Québécoise d'Ingénierie 2012 - Senior Competition


On 19 May 2014 05:28, Hangjun Ye <yehang...@gmail.com> wrote:

> Hi there,
>
> I recently started to use Kafka for our data analysis pipeline and it works
> very well.
>
> One problem for us so far is expanding our cluster when we need more storage
> space.
> Kafka provides some scripts to help with this, but the process wasn't
> smooth.
>
> To make it work perfectly, it seems Kafka needs to do some jobs that a
> distributed file system already does.
> So I am wondering whether there are any thoughts on making Kafka work on top
> of HDFS? Maybe make the Kafka storage engine pluggable, with HDFS as one option?
>
> The pro would be that HDFS already handles storage management
> (replication, corrupted disks/machines, migration, load balancing, etc.) very
> well, which frees Kafka and its users from that burden. The con might be
> performance degradation.
> As Kafka performs very well, it would probably remain competitive in most
> situations even with some degree of degradation.
>
> Best,
> --
> Hangjun Ye
>
