Hey Rohit,

I think external tables based on Cassandra or other datastores will work
out of the box if you build Catalyst with Hive support.
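
For the curious, here's a rough sketch of what that path could look like
through HiveContext. The storage handler class and SERDEPROPERTIES below
are placeholders; the exact names depend on the cassandra-hive handler
you use:

  import org.apache.spark.SparkContext
  import org.apache.spark.sql.hive.HiveContext

  object HiveExternalTableSketch {
    def main(args: Array[String]) {
      val sc = new SparkContext("local", "hive-external-table-sketch")
      val hiveContext = new HiveContext(sc)

      // Register a Cassandra-backed external table in the Hive metastore.
      // The handler class and properties are illustrative, not exact.
      hiveContext.hql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS users (id STRING, name STRING)
        STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
        WITH SERDEPROPERTIES ('cassandra.host' = 'localhost')
      """)

      // Once registered, the table is queryable like any other Hive table.
      hiveContext.hql("SELECT name FROM users").collect().foreach(println)
    }
  }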

Michael may have feelings about this, but I'd guess the longer-term design
for schema support for Cassandra/HBase etc. wouldn't rely on Hive external
tables, because they add an unnecessary layer of indirection.

Spark should be able to load a SchemaRDD directly from Cassandra by
letting the user supply the relevant information about the Cassandra
schema. It should also let you write back to Cassandra given a mapping of
fields to the respective Cassandra columns. I think all of this would be
fairly easy to implement on SchemaRDD, and it will likely make it into
Spark 1.1.
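
As a rough sketch of the user-facing flow I have in mind (the row
representation, column names, and field mapping below are all hypothetical
stand-ins; no such connector API exists yet):

  import org.apache.spark.SparkContext
  import org.apache.spark.sql.SQLContext

  // Hypothetical target schema; the user picks the field names.
  case class User(id: String, name: String, age: Int)

  object CassandraSchemaRddSketch {
    def main(args: Array[String]) {
      val sc = new SparkContext("local", "cassandra-schemardd-sketch")
      val sqlContext = new SQLContext(sc)
      import sqlContext.createSchemaRDD // implicit RDD[Product] => SchemaRDD

      // Stand-in for rows a Cassandra connector might hand back,
      // keyed by column name.
      val rawRows = sc.parallelize(Seq(
        Map("user_id" -> "1", "name" -> "ada", "age" -> "36"),
        Map("user_id" -> "2", "name" -> "alan", "age" -> "41")))

      // The user supplies the mapping from Cassandra columns to fields;
      // the reverse mapping would drive the write-back path.
      val users = rawRows.map(r => User(r("user_id"), r("name"), r("age").toInt))

      users.registerAsTable("users")
      sqlContext.sql("SELECT name FROM users WHERE age > 40")
        .collect().foreach(println)
    }
  }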

- Patrick


On Wed, Mar 26, 2014 at 10:59 PM, Rohit Rai <ro...@tuplejump.com> wrote:

> Great work, guys! I have been looking forward to this...
>
> The blog mentions support for reading from HBase/Avro... What will be
> the recommended approach for this? Will it be writing custom wrappers
> for SQLContext, as in HiveContext, or using Hive's "EXTERNAL TABLE" support?
>
> I ask this because a few days back (based on your pull request on
> GitHub) I started analyzing what it would take to support Spark SQL on
> Cassandra. One obvious approach would be to use Hive's external table
> support with our cassandra-hive handler, but the second approach sounds
> tempting, as it would give more fidelity.
>
> Regards,
> Rohit
>
> *Founder & CEO, **Tuplejump, Inc.*
> ____________________________
> www.tuplejump.com
> *The Data Engineering Platform*
>
>
> On Thu, Mar 27, 2014 at 9:12 AM, Michael Armbrust
> <mich...@databricks.com> wrote:
>
>>> Any plans to make the SQL type-safe using something like Slick
>>> (http://slick.typesafe.com/)?
>>>
>>
>> I would really like to do something like that, and maybe we will in a
>> couple of months. However, in the near term, I think the top priorities are
>> going to be performance and stability.
>>
>> Michael
>>
>
>
