1) We get tooling out of the box from Redshift (specifically, stable JDBC
access). With Spark, we are often waiting for devops to get the right combo of
tools working or for libraries to support sequence files.
The arguments about JDBC access and simpler setup definitely make sense. My
first
[mailto:nicholas.cham...@gmail.com]
Sent: Wednesday, August 06, 2014 9:30 AM
To: Gary Malouf
Cc: user
Subject: Re: Regarding tooling/performance vs RedShift
regards,
Ron Daniel, Jr.
Director, Elsevier Labs
r.dan...@elsevier.com
mobile: +1 619 208 3064
*From:* Gary Malouf [mailto:malouf.g...@gmail.com]
*Sent:* Wednesday, August 06, 2014 12:35 PM
*To:* Daniel, Ronald (ELS-SDG)
*Subject:* Re: Regarding tooling/performance vs RedShift
On Wed, Aug 6, 2014 at 3:41 PM, Daniel, Ronald (ELS-SDG)
r.dan...@elsevier.com wrote:
Mostly I was just objecting to "Redshift does very well, but Shark is on
par or better than it in most of the tests" when that was not how I read
the results, and Redshift was on HDDs.
My bad. You are
Also, regarding something like Redshift not having MLlib built in, much of
that could be done on the derived results.
On Aug 6, 2014 4:07 PM, Nicholas Chammas nicholas.cham...@gmail.com
wrote:
From: Gary Malouf [mailto:malouf.g...@gmail.com]
Sent: Wednesday, August 06, 2014 1:17 PM
To: Nicholas Chammas
Cc: Daniel, Ronald (ELS-SDG); user@spark.apache.org
Subject: Re: Regarding tooling/performance vs RedShift
On Wed, Aug 6, 2014 at 4:30 PM, Daniel, Ronald (ELS-SDG)
r.dan...@elsevier.com wrote:
Major architectural advantage to Spark.
Amen to that. For a really cool and succinct demonstration of this, check
out Aaron's demo http://youtu.be/sPhyePwo7FA?t=10m16s at the Hadoop
Summit earlier this year