Re: new Catalyst/SQL component merged into master

Evan Chan Sun, 23 Mar 2014 23:51:07 -0700

Hi Michael,

Congrats, this is really neat!


What are your thoughts on adding indexing support and predicate
pushdown to this SQL framework? Right now we use custom bitmap
indexing to speed up queries, so we're really curious about the
architectural direction.

-Evan


On Fri, Mar 21, 2014 at 11:09 AM, Michael Armbrust
<[email protected]> wrote:
>>
>> It would be great if there were any examples or use cases to look at.
>>
> There are examples in the Spark documentation.  Patrick posted an updated
> copy here so people can see them before 1.0 is released:
> http://people.apache.org/~pwendell/catalyst-docs/sql-programming-guide.html
>
>> Does this feature have different use cases than Shark, or is it just
>> cleaner now that the Hive dependency is gone?
>>
> Depending on how you use this, there is still a dependency on Hive (By
> default this is not the case.  See the above documentation for more
> details).  However, the dependency is on a stock version of Hive instead of
> one modified by the AMPLab.  Furthermore, Spark SQL has its own optimizer,
> instead of relying on the Hive optimizer.  Long term, this is going to give
> us a lot more flexibility to optimize queries specifically for the Spark
> execution engine.  We are actively porting over the best parts of Shark
> (specifically the in-memory columnar representation).
>
> Shark still has some features that are missing in Spark SQL, including
> SharkServer (and years of testing).  Once Spark SQL graduates from alpha
> status, it'll likely become the new backend for Shark.



-- 
Evan Chan
Staff Engineer
[email protected]
