> So if one is forced to use a SAN, how one should set up Cassandra is
> the interesting question, at least to me! Here are some thoughts:
> 1. Ensure that each node gets dedicated - not shared - LUNs 
> 2. Ensure that these LUNs do not share spindles, or nodes will cease
> to be isolatable (this will be tough to get, given how SAN
> administrators think about this)
> 3. Most SANs deliver performance by striping (RAID 0) - sacrifice
> striping for isolation if push comes to shove 
> 4. Do not share data directories from multiple nodes onto a single
> location via NFS or CFS for example. They are cool in shared resource
> environments, but they break the premise behind Cassandra. All data
> storage should be private to the Cassandra node, even when on shared
> storage
> 5. Do not change any assumption around Replication Factor (RF) or
> Consistency Level (CL) due to the shared storage - in fact if
> anything, increase your replication factor because you now have
> potential SPOF storage.  

That was gold, and led to a direct conversation between provider and
developer. Various tests showed that IOPS will often reach 5k per node,
so the iSCSI solution would need to be tailored to handle that.
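
A back-of-the-envelope sketch of that sizing, in Python. Only the 5k
IOPS per node comes from our tests; the node count and the headroom
factor are made-up numbers for illustration, so plug in your own:

```python
def required_san_iops(nodes, iops_per_node=5000, headroom=1.5):
    """Aggregate IOPS the shared storage would need to sustain.

    iops_per_node: the ~5k per node seen in testing.
    headroom: assumed safety margin for compaction/repair bursts
              (a hypothetical figure, not a measured one).
    """
    return round(nodes * iops_per_node * headroom)


# Hypothetical 6-node cluster
print(required_san_iops(6))
```

The point is only that the storage has to be sized for the whole
cluster at once, not for a single node.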

As mentioned above, our provider simply couldn't give us that much
disk per server. But after a good discussion it became obvious (doh!)
that the application can actually save a lot of disk by using different
keyspaces with different RF. We have raw data that needs to be
collected, but can be temporarily unavailable for reading, hence RF=1
makes sense. This raw data is the vast bulk of the data so this saves
lots of disk space. The aggregated data, which is relatively small in
comparison, is critical for the application to read, so we can keep it
in a separate keyspace with a higher RF...
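
To make the saving concrete, here is a small Python sketch. The data
sizes and the keyspace names (raw_events, aggregates) are hypothetical,
as is RF=3 for the aggregated data; the generated statements assume
current CQL syntax with SimpleStrategy, so adapt them to whatever your
Cassandra version expects:

```python
def disk_footprint_gb(raw_gb, agg_gb, raw_rf, agg_rf):
    """Total cluster disk needed once every replica is counted."""
    return raw_gb * raw_rf + agg_gb * agg_rf


def keyspace_cql(name, rf):
    """Build a CREATE KEYSPACE statement with the given replication factor."""
    return (
        f"CREATE KEYSPACE {name} WITH replication = "
        f"{{'class': 'SimpleStrategy', 'replication_factor': {rf}}};"
    )


# Hypothetical sizes: 1000 GB of raw data, 50 GB of aggregated data
everything_rf3 = disk_footprint_gb(1000, 50, raw_rf=3, agg_rf=3)
split_rf = disk_footprint_gb(1000, 50, raw_rf=1, agg_rf=3)
print(everything_rf3, split_rf)

print(keyspace_cql("raw_events", 1))   # bulk data, may be briefly unreadable
print(keyspace_cql("aggregates", 3))   # small, but must stay readable
```

With numbers like these, nearly all of the saving comes from dropping
the RF on the raw data, since that is where the bulk sits.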

~mck

-- 
“Anyone who lives within their means suffers from a lack of
imagination.” - Oscar Wilde 
| http://semb.wever.org | http://sesat.no
| http://finn.no       | Java XSS Filter
