Hi All,

I am looking at loading a data stream from AWS Kinesis into AWS Redshift.
Since I am already ingesting the data through Spark Streaming, it seems
convenient to also push that data to AWS Redshift at the same time.

I have taken a look at the AWS Kinesis connectors library, although I am
not sure it was designed to integrate with Apache Spark. It seems more like
a standalone approach:

   - https://github.com/awslabs/amazon-kinesis-connectors

There is also a Spark-Redshift integration library; however, it looks like
it was intended for pulling data from AWS Redshift rather than pushing data
to it:

   - https://github.com/databricks/spark-redshift

Finally, I took a look at a general-purpose Scala JDBC library that could
be used to write to AWS Redshift:

   - http://scalikejdbc.org/
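
To make the JDBC-style option concrete, here is a rough sketch of the kind
of per-partition batched write I have in mind. It is only an illustration
under several assumptions: the `events` table, its `id` and `payload`
columns, the `write_partition` helper, and the batch size are all
hypothetical, and an in-memory sqlite3 database stands in for a Redshift
connection so that the sketch runs on its own. It is written in Python
rather than Scala purely for brevity.

```python
import sqlite3
from itertools import islice


def write_partition(conn, rows, table="events", batch_size=500):
    """Insert an iterator of (id, payload) rows in batches.

    `conn` is any DB-API connection. Returns the number of rows written.
    Note: sqlite3 uses `?` placeholders; a PostgreSQL-style driver for
    Redshift would typically use `%s` instead (assumption, check the driver).
    """
    sql = f"INSERT INTO {table} (id, payload) VALUES (?, ?)"
    rows = iter(rows)
    written = 0
    cur = conn.cursor()
    while True:
        # Pull at most `batch_size` rows so memory use stays bounded.
        batch = list(islice(rows, batch_size))
        if not batch:
            break
        cur.executemany(sql, batch)
        conn.commit()  # one commit per batch, not per row
        written += len(batch)
    return written


# Stand-in database so the sketch runs without a cluster.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, payload TEXT)")

records = ((i, f"record-{i}") for i in range(1200))
count = write_partition(conn, records, batch_size=500)
```

In a Spark Streaming job, something like this would presumably be called
from `foreachRDD` and then `foreachPartition`, opening one connection per
partition rather than one per record. Whether row-level inserts like this
perform acceptably against Redshift is part of what I am unsure about.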

Does anyone have any experience pushing data from Spark Streaming to AWS
Redshift? Does it make sense conceptually, or does it make more sense to
push data from AWS Kinesis to AWS Redshift via a separate standalone
approach such as the AWS Kinesis connectors?

Thanks, Mike.
