Re: Brisk vs Cloudera Distribution

Edward Capriolo Wed, 08 Feb 2012 20:59:03 -0800

Hadoop can work on a number of filesystems: HDFS, S3, and local files.
Brisk's file system is known as CFS. CFS stores all block data and metadata
in Cassandra, so it does not use a NameNode. Brisk also fires up a
JobTracker automatically. Brisk has a Hive metastore backed by Cassandra as
well, which takes away that SPOF too.
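
To illustrate the first point: Hadoop chooses the filesystem implementation
from the scheme of the path URI. Below is a rough Python sketch of that idea
only. It is not Hadoop or Brisk code; the registry, the function name, the
labels, and the example paths are all made up for illustration.

```python
from urllib.parse import urlparse

# Hypothetical registry mapping a URI scheme to a storage backend label.
# Real Hadoop resolves the scheme through its own configuration; these
# entries are illustrative only.
BACKENDS = {
    "hdfs": "HDFS (NameNode holds the metadata)",
    "s3": "Amazon S3",
    "file": "local filesystem",
    "cfs": "CFS (blocks and metadata stored in Cassandra)",
}


def backend_for(uri: str) -> str:
    """Return the backend label for a path, chosen by its URI scheme."""
    # A bare path such as /tmp/x has no scheme; treat it as a local file.
    scheme = urlparse(uri).scheme or "file"
    if scheme not in BACKENDS:
        raise ValueError(f"no filesystem registered for scheme {scheme!r}")
    return BACKENDS[scheme]


if __name__ == "__main__":
    for path in ("hdfs://namenode/data", "cfs:///user/rk/input", "/tmp/x"):
        print(path, "->", backend_for(path))
```

The point of the sketch is that the job itself only sees a path; which
storage layer sits behind it is decided by the scheme.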

Brisk Snappy-compresses all data, so you may not need to use compression or
sequence files. Performance-wise, I have gotten comparable numbers with
TeraSort and TeraGen. But the systems work vastly differently, and they
likely scale differently.

The Hive integration is solid. I am not sure what the biggest cluster is,
and I would rather not make other vague performance claims. Brisk is not
active anymore; the commercial product is DSE. There is a GitHub fork of
Brisk, however.

On Wednesday, February 8, 2012, rk vishu <talk2had...@gmail.com> wrote:
> Hello All,
>
> Could anyone help me understand the pros and cons of Brisk vs Cloudera
> Hadoop (HDFS + HBase) in terms of functionality and performance?
> I want to set aside the single point of failure (NN) issue while
> comparing.
> Are there any big clusters in petabytes using Brisk in production? How is
> the performance comparison of CFS vs HDFS? How is the Hive integration?
>
> Thanks and Regards
> RK
>
