Hi Spark devs,

I recently attended a tech session about data processing with Spark vs
Redshift. It concluded, with metrics and data points, that for a dataset
of 2 billion records, SELECT queries that filter on attributes were faster
and cheaper on AWS Redshift than on an AWS Spark cluster.

I have researched this a bit, and both Redshift and Spark seem to be
processing tools for running OLAP queries on a large dataset. I was
wondering in which use cases Spark has an edge over Redshift. Are there
certain kinds of complex queries where Spark can outperform Redshift?
Or does Redshift only work well with schema-defined data?

Please share your experience with either of the technologies. Thanks.

Cheers,
Eris.
