On 26 April 2010 00:57, Shuge Lee <shuge....@gmail.com> wrote:

> In Python:
>
> keyspace.columnfamily[key][column] = value
>
> files.video[uuid.uuid4()]['name'] = 'foo.flv'
> files.video[uuid.uuid4()]['path'] = '/var/files/foo.flv'
>

Hi.

Storing only the filename in the database will not solve the file storage
problem. Cassandra is a distributed database, but a file stored on one
machine's local disk will not be available to the other client nodes.

If you're using Cassandra at all, that probably implies that you have lots
of client nodes. A non-redundant NFS server (for example) would not offer
high availability, so it would be inadequate for the OP's situation.

Storing files *IN* Cassandra is very useful because you can then retrieve
them from anywhere with high availability.

However, as others have discussed, they should be split across multiple
columns, or if very big, multiple rows.

I prefer to split by row because this scales better to very large files.
During compaction, as is well noted, Cassandra needs the entire row in
memory, so compaction will fail once a file kept in a single row grows
beyond a few gigabytes.
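
To make the row-per-chunk idea concrete, here is a rough sketch. It is not
real client code: `store` is a plain Python dict standing in for a column
family (store[row_key][column] = value), and the names store_file, load_file,
CHUNK_SIZE and the "<file id>:<chunk index>" row-key scheme are all made up
for illustration. The 1 MiB chunk size is an assumption, not a recommendation.

```python
import uuid

CHUNK_SIZE = 1024 * 1024  # assumed chunk size (1 MiB); tune for your setup


def chunk_key(file_id, index):
    """Row key for one chunk (hypothetical scheme: '<file id>:<index>')."""
    return "%s:%08d" % (file_id, index)


def store_file(store, data, chunk_size=CHUNK_SIZE):
    """Split `data` (bytes) into chunks and write each chunk to its own row.

    `store` is a dict standing in for a column family.
    Returns the generated file id.
    """
    file_id = str(uuid.uuid4())  # generate the id once and reuse it
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    for index, chunk in enumerate(chunks):
        store[chunk_key(file_id, index)] = {"data": chunk}
    # Write the metadata row last, so a reader that finds it can expect
    # every chunk row to have been written already.
    store[file_id] = {"chunk_count": len(chunks), "size": len(data)}
    return file_id


def load_file(store, file_id):
    """Reassemble a file by reading its chunk rows in order."""
    meta = store[file_id]
    return b"".join(
        store[chunk_key(file_id, i)]["data"]
        for i in range(meta["chunk_count"])
    )
```

Because no single row ever holds more than one chunk, the row size stays
bounded no matter how large the file is.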

Mark
