RE: [EXTERNAL] Re: ETL options from Hive/Presto/s3 to cassandra

2018-08-09 Thread Durity, Sean R
DataStax Enterprise 6.0 has a new bulk loader tool. DSE is a commercial product, but maybe your needs are worth the investigation. Sean Durity From: Rahul Singh Sent: Tuesday, August 07, 2018 9:37 AM To: user@cassandra.apache.org Subject: [EXTERNAL] Re: ETL options from Hive/Presto/s3 to

Re: ETL options from Hive/Presto/s3 to cassandra

2018-08-07 Thread Rahul Singh
Spark is scalable to as many nodes as you want and could be colocated with the data nodes. sstableloader won’t be as performant for larger datasets. Although it can be run in parallel on different nodes, I don’t believe it to be as fault tolerant. If you have to do it continuously I would even

ETL options from Hive/Presto/s3 to cassandra

2018-08-06 Thread srimugunthan dhandapani
Hi all, We have data that gets loaded into Hive/Presto every few hours. We want that data to be transferred to Cassandra tables. What are some of the high-performance ETL options for transferring data from Hive or Presto into Cassandra? Also does anybody have any performance numbers comparin