RE: Regarding tooling/performance vs RedShift

Daniel, Ronald (ELS-SDG) Wed, 06 Aug 2014 13:31:41 -0700

Well yes, MLlib-like routines or pretty much anything else could be run on the 
derived results, but you have to unload the results from Redshift and then load 
them into some other tool. So it's nicer to leave them in memory and operate on 
them there. Major architectural advantage to Spark.

Ron

From: Gary Malouf [mailto:[email protected]]
Sent: Wednesday, August 06, 2014 1:17 PM
To: Nicholas Chammas
Cc: Daniel, Ronald (ELS-SDG); [email protected]
Subject: Re: Regarding tooling/performance vs RedShift

Also, regarding something like Redshift not having MLlib built in, much of that 
could be done on the derived results.
On Aug 6, 2014 4:07 PM, "Nicholas Chammas" <[email protected]> wrote:
On Wed, Aug 6, 2014 at 3:41 PM, Daniel, Ronald (ELS-SDG) <[email protected]> wrote:
Mostly I was just objecting to "Redshift does very well, but Shark is on par 
or better than it in most of the tests" when that was not how I read the 
results, and Redshift was on HDDs.

My bad. You are correct; the only test Shark (mem) does better on is test #1 
"Scan Query".

And indeed, it would be good to see an updated benchmark with Redshift running 
on SSDs.

Nick
