Please refer to the code in org.apache.cassandra.db.ColumnFamilyStore:

    public String getFlushPath()
    {
        long guessedSize = 2 * DatabaseDescriptor.getMemtableThroughput() * 1024*1024; // 2* adds room for keys, column indexes
        String location = DatabaseDescriptor.getDataFileLocationForTable(table_, guessedSize);
        if (location == null)
            throw new RuntimeException("Insufficient disk space to flush");
        return new File(location, getTempSSTableFileName()).getAbsolutePath();
    }

and then at org.apache.cassandra.config.DatabaseDescriptor:

    public static String getDataFileLocationForTable(String table, long expectedCompactedFileSize)
    {
        long maxFreeDisk = 0;
        int maxDiskIndex = 0;
        String dataFileDirectory = null;
        String[] dataDirectoryForTable = getAllDataFileLocationsForTable(table);

        for ( int i = 0 ; i < dataDirectoryForTable.length ; i++ )
        {
            File f = new File(dataDirectoryForTable[i]);
            if( maxFreeDisk < f.getUsableSpace())
            {
                maxFreeDisk = f.getUsableSpace();
                maxDiskIndex = i;
            }
        }
        // Load factor of 0.9 we do not want to use the entire disk that is too risky.
        maxFreeDisk = (long)(0.9 * maxFreeDisk);
        if( expectedCompactedFileSize < maxFreeDisk )
        {
            dataFileDirectory = dataDirectoryForTable[maxDiskIndex];
            currentIndex = (maxDiskIndex + 1 )%dataDirectoryForTable.length ;
        }
        else
        {
            currentIndex = maxDiskIndex;
        }
        return dataFileDirectory;
    }

So each flush goes to the directory whose file system reports the most
usable space (File.getUsableSpace()), and a flush fails if even that
directory has too little room. In other words, DataFileDirectories is meant
for multiple disks or disk partitions. I think your storage01, storage02 and
storage03 are on the same disk or disk partition.

2010/4/26 Roland Hänel <rol...@haenel.me>

> I have a configuration like this:
>
>   <DataFileDirectories>
>       <DataFileDirectory>/storage01/cassandra/data</DataFileDirectory>
>       <DataFileDirectory>/storage02/cassandra/data</DataFileDirectory>
>       <DataFileDirectory>/storage03/cassandra/data</DataFileDirectory>
>   </DataFileDirectories>
>
> After loading a big chunk of data into cassandra, I end up with some 70GB
> in the first directory, and only about 10GB in the second and third one.
> All rows are quite small, so it's not just some big rows that contain the
> majority of data.
>
> Does Cassandra have the ability to 'see' the maximum available space in
> these directories? I'm asking myself this question since my limit is 100GB,
> and the first directory is approaching this limit...
>
> And, wouldn't it be better if Cassandra tried to 'load-balance' the files
> inside the directories because this will result in better (read) performance
> if the directories are on different disks (which is the case for me)?
>
> Any help is appreciated.
>
> Roland
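
P.S. To make the selection rule in getDataFileLocationForTable easier to reason about, here is a minimal stand-alone sketch of it. This is not Cassandra code: the class DirectoryPicker and the method pick are hypothetical names, and the free-space figures are passed in as plain numbers instead of being read with File.getUsableSpace(). It only illustrates what the strict '<' comparison and the 0.9 load factor in the quoted method imply.

```java
final class DirectoryPicker {
    // Same 0.9 load factor as in the quoted getDataFileLocationForTable.
    static final double LOAD_FACTOR = 0.9;

    /**
     * Hypothetical helper: returns the index of the directory with the most
     * usable space, or -1 if even that directory cannot hold expectedSize bytes.
     */
    static int pick(long[] usableSpace, long expectedSize) {
        long maxFree = 0;
        int maxIndex = 0;
        for (int i = 0; i < usableSpace.length; i++) {
            // Strict '<' means a tie never replaces an earlier index.
            if (maxFree < usableSpace[i]) {
                maxFree = usableSpace[i];
                maxIndex = i;
            }
        }
        long budget = (long) (LOAD_FACTOR * maxFree);
        return expectedSize < budget ? maxIndex : -1;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        // Directories on one partition report identical free space: index 0 wins.
        System.out.println(pick(new long[] {30 * gb, 30 * gb, 30 * gb}, gb)); // prints 0
        // Separate disks with different free space: the emptiest one wins.
        System.out.println(pick(new long[] {30 * gb, 90 * gb, 60 * gb}, gb)); // prints 1
    }
}
```

If the three directories report the same usable space, as they would on a single partition, this rule returns the first directory every time, so it would not spread flushes across them.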