RE: Spark vs Redshift

2016-04-02 Thread rajesh.prabhu
Hi Eris,

I also found this rather old discussion about Spark vs Redshift:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-v-Redshift-td18112.html

Regards,
Rajesh

Basel, Switzerland
Ph: +41 77 941 0562
rajesh.pra...@wipro.com

From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: Saturday, April 02, 2016 11:16 PM
To: Eris Lawrence
Cc: user @spark
Subject: Re: Spark vs Redshift


Hi,

Like anything else, your mileage will vary with any tool.

To start, what is your use case here (what fits your needs)? You stated that you
want to perform OLAP on large datasets. OLAP is normally performed on large
datasets anyway, so I assume you already have some form of data warehouse,
commercial or otherwise. Do you also need to do Big Data analytics on a variety
of data formats, including unstructured data?

HTH


Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com



On 2 April 2016 at 21:34, Eris Lawrence <vortexi...@gmail.com> wrote:
Hi Spark devs,

I recently attended a tech session about data processing with Spark vs Redshift.
It concluded, with metrics and data points, that on 2 billion records, SELECT
queries filtering on attributes were faster and cheaper on AWS Redshift than on
an AWS Spark cluster.

I have researched a bit, and both Redshift and Spark seem to be processing
software for running OLAP queries on a large dataset. I was wondering in which
use cases Spark has an edge over Redshift. Are there certain kinds of complex
queries where Spark can outperform Redshift? Or does Redshift only work well
with schema-defined data?

Please share your experience with either of the technologies. Thanks.

Cheers,
Eris.



Re: Spark vs Redshift

2016-04-02 Thread Mich Talebzadeh
Hi,

Like anything else, your mileage will vary with any tool.

To start, what is your use case here (what fits your needs)? You stated that
you want to perform OLAP on large datasets. OLAP is normally performed on
large datasets anyway, so I assume you already have some form of data
warehouse, commercial or otherwise. Do you also need to do Big Data
analytics on a variety of data formats, including unstructured data?

HTH

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com



On 2 April 2016 at 21:34, Eris Lawrence wrote:

> Hi Spark devs,
>
> I recently attended a tech session about data processing with Spark vs
> Redshift. It concluded, with metrics and data points, that on 2 billion
> records, SELECT queries filtering on attributes were faster and cheaper on
> AWS Redshift than on an AWS Spark cluster.
>
> I have researched a bit, and both Redshift and Spark seem to be processing
> software for running OLAP queries on a large dataset. I was wondering in
> which use cases Spark has an edge over Redshift. Are there certain kinds of
> complex queries where Spark can outperform Redshift? Or does Redshift only
> work well with schema-defined data?
>
> Please share your experience with either of the technologies. Thanks.
>
> Cheers,
> Eris.
>