Re: Regarding tooling/performance vs RedShift

2014-08-06 Thread Nicholas Chammas
1) We get tooling out of the box from RedShift (specifically, stable JDBC access) - with Spark we are often waiting for devops to get the right combo of tools working or for libraries to support sequence files. The arguments about JDBC access and simpler setup definitely make sense. My first

RE: Regarding tooling/performance vs RedShift

2014-08-06 Thread Daniel, Ronald (ELS-SDG)
[mailto:nicholas.cham...@gmail.com] Sent: Wednesday, August 06, 2014 9:30 AM To: Gary Malouf Cc: user Subject: Re: Regarding tooling/performance vs RedShift 1) We get tooling out of the box from RedShift (specifically, stable JDBC access) - with Spark we are often waiting for devops to get the right combo of tools

Re: Regarding tooling/performance vs RedShift

2014-08-06 Thread Gary Malouf
regards, Ron Daniel, Jr. Director, Elsevier Labs r.dan...@elsevier.com mobile: +1 619 208 3064 *From:* Gary Malouf [mailto:malouf.g...@gmail.com] *Sent:* Wednesday, August 06, 2014 12:35 PM *To:* Daniel, Ronald (ELS-SDG) *Subject:* Re: Regarding tooling/performance vs RedShift

Re: Regarding tooling/performance vs RedShift

2014-08-06 Thread Nicholas Chammas
On Wed, Aug 6, 2014 at 3:41 PM, Daniel, Ronald (ELS-SDG) r.dan...@elsevier.com wrote: Mostly I was just objecting to "Redshift does very well, but Shark is on par or better than it in most of the tests" when that was not how I read the results, and Redshift was on HDDs. My bad. You are

Re: Regarding tooling/performance vs RedShift

2014-08-06 Thread Gary Malouf
Also, regarding something like Redshift not having MLlib built in, much of that could be done on the derived results. On Aug 6, 2014 4:07 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: On Wed, Aug 6, 2014 at 3:41 PM, Daniel, Ronald (ELS-SDG) r.dan...@elsevier.com wrote: Mostly I was

RE: Regarding tooling/performance vs RedShift

2014-08-06 Thread Daniel, Ronald (ELS-SDG)
From: Gary Malouf [mailto:malouf.g...@gmail.com] Sent: Wednesday, August 06, 2014 1:17 PM To: Nicholas Chammas Cc: Daniel, Ronald (ELS-SDG); user@spark.apache.org Subject: Re: Regarding tooling/performance vs RedShift Also, regarding something like Redshift not having MLlib built in, much

Re: Regarding tooling/performance vs RedShift

2014-08-06 Thread Nicholas Chammas
On Wed, Aug 6, 2014 at 4:30 PM, Daniel, Ronald (ELS-SDG) r.dan...@elsevier.com wrote: Major architectural advantage to Spark. Amen to that. For a really cool and succinct demonstration of this, check out Aaron's demo http://youtu.be/sPhyePwo7FA?t=10m16s at the Hadoop Summit earlier this year