On 26 April 2010 00:57, Shuge Lee <shuge....@gmail.com> wrote:
> In Python:
>
> keyspace.columnfamily[key][column] = value
>
> files.video[uuid.uuid4()]['name'] = 'foo.flv'
> files.video[uuid.uuid4()]['path'] = '/var/files/foo.flv'
>

Hi.

Storing the filename in the database will not solve the file storage problem. Cassandra is a distributed database, and a file stored on one machine's local disk will not be available to the other client nodes. If you're using Cassandra at all, that probably implies you have lots of client nodes. A non-redundant NFS server (for example) would not offer high availability, so it would be inadequate for the OP's situation.

Storing files *IN* Cassandra is very useful because you can then retrieve them from anywhere, with high availability. However, as others have discussed, they should be split across multiple columns or, if very big, multiple rows. I prefer to split by row because this scales better to very large files. As is well noted, Cassandra needs the entire row in memory during compaction, which will fail once a file stored in a single row grows past a few gigabytes.

Mark
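
P.S. To make the split-by-row idea concrete, here is a rough sketch in Python. It is only an illustration: a plain dict stands in for a column family (row key -> columns), and the names store_file, load_file, CHUNK_SIZE and the "<file id>:<chunk number>" row key scheme are all made up for this example. None of them are part of any Cassandra client API.

```python
import uuid

# Hypothetical chunk size: 1 MiB per row. Tune to taste.
CHUNK_SIZE = 1024 * 1024


def store_file(cf, data, chunk_size=CHUNK_SIZE):
    """Store `data` (bytes) as one row per chunk, plus one metadata row.

    `cf` is a plain dict standing in for a column family:
    row key -> {column name: value}.
    """
    # Generate the id once and reuse it for every row of this file.
    file_id = str(uuid.uuid4())
    count = 0
    for offset in range(0, len(data), chunk_size):
        # Each chunk lives in its own row, so no row exceeds chunk_size.
        cf["%s:%d" % (file_id, count)] = {"data": data[offset:offset + chunk_size]}
        count += 1
    # Write the metadata row last, after all chunk rows exist.
    cf[file_id] = {"size": str(len(data)), "chunks": str(count)}
    return file_id


def load_file(cf, file_id):
    """Reassemble a file by reading its chunk rows in order."""
    count = int(cf[file_id]["chunks"])
    return b"".join(cf["%s:%d" % (file_id, i)]["data"] for i in range(count))
```

With a real client you would swap the dict assignments for inserts and the lookups for gets. The point is just that no single row ever holds more than one chunk, so row size stays bounded however large the file gets.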