Hi Cheolsoo,
Thanks for the correction. I took that for granted and didn't actually
check the code to verify. Yes, from the Spark version (1.2), I did see
their parser etc. Below is a portion of the README from Spark's sql package
for reference.
Thanks,
Xuefu
Spark SQL is broken up into four sub
Hi Xuefu,
Thanks for the good comparison. I agree with most points, but #1 isn't true.
SparkSQL has its own parser (implemented with Scala parser combinator
library), analyzer, and optimizer, although they're not as mature as Hive's.
What it depends on Hive for is Metastore, CliDriver, DDL parser, e
Thank you Xuefu!
Excellent explanation and comparison!
We should put it on the Hive on Spark wiki.
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark
On Wed, May 20, 2015 at 10:45 AM, Xuefu Zhang wrote:
> I have been working on Hive on Spark, and know a little about SparkSQL.
> Here a
I have been working on Hive on Spark, and know a little about SparkSQL.
Here are a few factors to be considered:
1. SparkSQL is similar to Shark (discontinued) in that it clones Hive's
front end (parser and semantic analyzer) and metastore, and injects in
between a layer where Hive's operator tre
ed
>>> the available memory; Hive handles this sort of situation much more
>>> gracefully. If you have a smallish cluster and large data, this could pose
>>> a problem. Still, it’s worth looking into SparkSQL to see if this is still
>>> an issue.
ose a problem. Still,
it's worth looking into SparkSQL to see if this is still an issue.
-Chris Dragga
From: Uli Bethke [mailto:uli.bet...@sonra.io]
Sent: Wednesday, May 20, 2015 7:04 AM
To: user@hive.apache.org
Subject: Re: Hive on Spark VS Spark SQL
Interesting question and one that I
Interesting question and one that I have asked myself. If you are
already heavily invested in the Hive ecosystem in terms of code and
skills, I would look at Hive on Spark as my engine. In theory, swapping
out engines (MR, Tez, Spark) should be easy, though the devil is in
the detail.
SparkS