Hi all,

I have a performance question about batch insert and bulk load.

According to the documentation, batch insert and bulk load are both options for
importing a large volume of data into Cassandra. Using batch insert is pretty
straightforward, but there is no ‘official’ way to use bulk load to import
the data (in this case, I mean data that is generated online).

So I am thinking that clients would first use CQLSSTableWriter to create the
SSTable files, and then use “org.apache.cassandra.tools.BulkLoader” to import
these SSTables into Cassandra directly.
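
To make the second step concrete, here is a rough sketch of how a client might
assemble the loader invocation once the SSTable files exist. It assumes the
stock sstableloader script (the command-line wrapper around
org.apache.cassandra.tools.BulkLoader) is on the PATH, and that the files were
written into a directory ending in <keyspace>/<table>, since the loader takes
the keyspace and table names from the last two path components. The class
name, host list, base directory, keyspace and table below are all made up for
illustration, and the CQLSSTableWriter step itself is not shown.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Cassandra: builds the sstableloader command line.
class LoaderCommand {

    // The loader reads keyspace and table from the last two path components,
    // so the SSTables are expected under <baseDir>/<keyspace>/<table>.
    static Path sstableDir(Path baseDir, String keyspace, String table) {
        return baseDir.resolve(keyspace).resolve(table);
    }

    // "-d" passes the comma-separated list of initial hosts to contact.
    static List<String> build(String initialHosts, Path baseDir,
                              String keyspace, String table) {
        return Arrays.asList(
                "sstableloader",
                "-d", initialHosts,
                sstableDir(baseDir, keyspace, table).toString());
    }

    public static void main(String[] args) {
        // Example values only; replace with the real cluster and directory.
        List<String> cmd = build("127.0.0.1", Paths.get("/tmp/sstables"),
                                 "demo_ks", "events");
        System.out.println(String.join(" ", cmd));
        // To actually run it (needs sstableloader on the PATH and a reachable cluster):
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

The sketch only prints the command; the commented-out ProcessBuilder line is
one way the client could launch it after each batch of SSTables is written.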

The question is: can I expect better performance from using BulkLoader this
way, compared with using batch insert?

I am not very familiar with the implementation of bulk load, but I do see a
huge performance improvement using batch insert. I would really like to know
the upper limit of write performance. Any comments will be helpful. Thanks!

- Dong
