Re: Spark on Kudu

Benjamin Kim Wed, 24 Feb 2016 15:42:17 -0800

J-D,

It looks like it fulfills most of the basic requirements (Kudu RDD, Kudu 
DStream) in KUDU-1214. Am I right? Besides shoring up more Spark SQL 
functionality (DataFrames) and doing the documentation, what more needs to be 
done? Optimizations?


I believe that it’s a good place to start using Spark with Kudu and to compare 
it with HBase on Spark (which is not clean).

Thanks,
Ben


> On Feb 24, 2016, at 3:10 PM, Jean-Daniel Cryans <jdcry...@apache.org> wrote:
> 
> AFAIK no one is working on it, but we did manage to get this in for 0.7.0: 
> https://issues.cloudera.org/browse/KUDU-1321
> 
> It's a really simple wrapper, and yes you can use SparkSQL on Kudu, but it 
> will require a lot more work to make it fast/useful.
> 
> Hope this helps,
> 
> J-D
> 
> On Wed, Feb 24, 2016 at 3:08 PM, Benjamin Kim <bbuil...@gmail.com> wrote:
> I see KUDU-1214 <https://issues.cloudera.org/browse/KUDU-1214> is targeted 
> for 0.8.0, but I see no progress on it. When this is complete, will this mean 
> that Spark will be able to work with Kudu both programmatically and as a 
> client via Spark SQL? Or is there more work that needs to be done on the 
> Spark side for it to work?
> 
> Just curious.
> 
> Cheers,
> Ben
> 
>
