On Fri, Aug 26, 2011 at 12:18 AM, Eric Evans <eev...@acunu.com> wrote:

> On Thu, Aug 25, 2011 at 6:31 AM, Ruby Stevenson <ruby...@gmail.com> wrote:
> > - Although Cassandra (and other decentralized NoSQL data store) has
> > been reported to handle very large data in total, my preliminary
> > understanding is the individual "column value" is quite limited. I
> > have read some posts saying you shouldn't store file this big in
> > Cassandra for example, use a path instead and let file system handle
> > it. Is this true?
>
> http://wiki.apache.org/cassandra/FAQ#large_file_and_blob_storage
>
> --
> Eric Evans
> Acunu | http://www.acunu.com | @acunu
>

It is important to note that even distributed and network storage solutions
like GlusterFS, NFS, and iSCSI are not as good as local disk either. The reason
is simple: in the best case on a local file system, the file lives in the VFS
cache and you may be talking microseconds or even nanoseconds to read a block.
Even if it is not in the VFS cache, you are bounded by bus and disk speeds.
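
To put a rough number on the cached local read case, here is a minimal Python
sketch. It is only an illustration under stated assumptions: the 4096-byte
block size and the helper names (make_sample_file, time_cached_block_read) are
hypothetical, and the result will vary with hardware, OS, and interpreter
overhead.

```python
import os
import tempfile
import time

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def make_sample_file(size=BLOCK_SIZE):
    """Create a temporary file of `size` random bytes and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size))
    return path


def time_cached_block_read(path, block_size=BLOCK_SIZE, repeats=1000):
    """Return the average seconds taken to re-read the first block of `path`.

    The first read warms the OS cache, so the timed reads should be
    served from memory rather than from the disk.
    """
    with open(path, "rb", buffering=0) as f:
        f.read(block_size)  # warm-up read, not timed
        start = time.perf_counter()
        for _ in range(repeats):
            f.seek(0)
            f.read(block_size)
        elapsed = time.perf_counter() - start
    return elapsed / repeats


if __name__ == "__main__":
    sample = make_sample_file()
    try:
        avg = time_cached_block_read(sample)
        print(f"average cached read: {avg * 1e6:.2f} microseconds")
    finally:
        os.remove(sample)
```

Comparing that figure with the round-trip time to another host (for example
from ping) gives a feel for the gap that any network-backed store has to pay.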

Network disk solutions like iSCSI require dedicated, expensive InfiniBand or
Ethernet networks to work well. Also, when using something like iSCSI, your
system still gets to leverage its local VFS cache, so not all the read traffic
has to cross the network.

The best case scenario for Cassandra would be a block of data living in the
row cache on a node. This data still has to traverse the network. That is
going to be slower than reading a local file.

However, depending on what you are doing, storing file data in Cassandra could
be a big win.
