So I've read: http://www.datastax.com/dev/blog/bulk-loading
Are there any tips for using sstableloader / SSTableSimpleUnsortedWriter to migrate time series data from our old datastore (PostgreSQL) to Cassandra?

After thinking about how sstables are laid out on disk, it seems best (required??) to write out each row at once. I.e., if each row == one year's worth of data and you have, say, 30,000 rows, write one full row at a time (a full year's worth of data points for a given metric) rather than one data point for each of 30,000 rows.

Any other tips to improve load time or reduce the load on the cluster and subsequent compaction activity? All the CFs I'll be writing to use compression and leveled compaction. Right now my Cassandra data store has about 4 months of data and we have 5 years of historical data (not sure how much we'll actually load, but minimally one year's worth).

Thanks!

--
Aaron Turner
http://synfin.net/  Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary Safety,
deserve neither Liberty nor Safety.  -- Benjamin Franklin
"carpe diem quam minimum credula postero"
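The row-at-a-time idea above boils down to grouping the flat (metric, timestamp, value) points coming out of PostgreSQL so that each row key is emitted exactly once with all of its columns together, mirroring the Java writer's pattern of one newRow() call followed by an addColumn() per data point. A minimal Python sketch of that grouping step; the function name, the sample metric names, and the tuple layout are hypothetical, and it assumes the extraction query already sorts by metric (e.g. ORDER BY metric, ts):

```python
from itertools import groupby
from operator import itemgetter

def rows_for_sstable_writer(points):
    """Group flat (metric, ts, value) points by metric (the row key).

    `points` must already be sorted by metric -- groupby only merges
    adjacent items -- which is what ORDER BY metric, ts in the
    PostgreSQL extraction query would guarantee.  Each yielded item
    is one complete row: the row key plus every (ts, value) column.
    """
    for metric, group in groupby(points, key=itemgetter(0)):
        yield metric, [(ts, value) for _, ts, value in group]

# Hypothetical sample: two metrics, already sorted by metric.
points = [
    ("cpu.host1", 1000, 0.5),
    ("cpu.host1", 1060, 0.7),
    ("mem.host1", 1000, 2048),
]

for row_key, columns in rows_for_sstable_writer(points):
    # In the Java bulk loader this corresponds roughly to:
    #   writer.newRow(bytes(row_key));
    #   for (ts, value) in columns: writer.addColumn(...);
    print(row_key, columns)
```

Streaming one full row at a time this way keeps SSTableSimpleUnsortedWriter from having to buffer and re-merge fragments of the same row, at the cost of ordering the extraction query on the PostgreSQL side.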