Re: How spark and hive integrate in long term?

Ted Yu Fri, 21 Nov 2014 16:03:36 -0800

bq. spark-0.12 also has some nice feature added

Minor correction: you meant Spark 1.2.0, I guess.


Cheers

On Fri, Nov 21, 2014 at 3:45 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:

> Thanks Dean, for the information.
>
> Hive-on-Spark is nice. Spark SQL, for its part, can take full advantage
> of Spark and lets users manipulate a table as an RDD through native Spark
> support.
>
> When I tried to upgrade the current hive-0.13.1 support to hive-0.14.0, I
> found that the Hive parser is no longer compatible. Meanwhile, the new
> features introduced in hive-0.14.1, e.g., ACID, are not there yet. At the
> same time, spark-0.12 also has some nice feature added which is supported
> by the thrift-server too, e.g., hive-0.13, table cache, etc.
>
> Given that both have more and more features added, it would be great if
> users could take advantage of both. Currently, Spark SQL gives us such
> benefits partially, but I am wondering how to keep such integration in the
> long term.
>
> Thanks.
>
> Zhan Zhang
>
> On Nov 21, 2014, at 3:12 PM, Dean Wampler <deanwamp...@gmail.com> wrote:
>
> > I can't comment on plans for Spark SQL's support for Hive, but several
> > companies are porting Hive itself onto Spark:
> >
> >
> > http://blog.cloudera.com/blog/2014/11/apache-hive-on-apache-spark-the-first-demo/
> >
> > I'm not sure if they are leveraging the old Shark code base or not, but
> > it appears to be a fresh effort.
> >
> > dean
> >
> > Dean Wampler, Ph.D.
> > Author: Programming Scala, 2nd Edition
> > <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
> > Typesafe <http://typesafe.com>
> > @deanwampler <http://twitter.com/deanwampler>
> > http://polyglotprogramming.com
> >
> > On Fri, Nov 21, 2014 at 2:51 PM, Zhan Zhang <zhaz...@gmail.com> wrote:
> >
> >> The current Spark and Hive integration is a very nice feature, but I am
> >> wondering what the long-term roadmap is for Spark's integration with
> >> Hive. Both projects are undergoing rapid improvement and change.
> >> Currently, my understanding is that the Hive part of Spark SQL relies on
> >> the Hive metastore and basic parser to operate, and the thrift-server
> >> intercepts Hive queries and runs them on its own engine.
> >>
> >> With every release of Hive, significant effort is needed on the Spark
> >> side to support it.
> >>
> >> For the metastore part, we could possibly replace it with HCatalog. But
> >> given the dependency of other parts on Hive, e.g., metastore,
> >> thriftserver, HCatalog may not be able to help much.
> >>
> >> Does anyone have any insight or idea in mind?
> >>
> >> Thanks.
> >>
> >> Zhan Zhang
> >>
> >>
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> >> For additional commands, e-mail: dev-h...@spark.apache.org
> >>
> >>
>
>
