RE: Working Formula for Hive 0.13?

2014-08-28 Thread Zhan Zhang
I have a preliminary patch against Spark 1.0.2, which is attached to SPARK-2706.
I am now working on supporting both hive-0.12 and hive-0.13.1 in a
non-intrusive way (that is, without breaking existing hive-0.12 support when
introducing the new version). I will attach a proposal for solving the
multi-version support issue to SPARK-2706 soon.

Thanks.

Zhan Zhang
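
One possible shape for this kind of non-intrusive multi-version support is a
small shim layer: version-specific differences live behind one interface, and
an implementation is picked from the Hive version string. The sketch below is
only an illustration under stated assumptions, not the proposal attached to
SPARK-2706. All names in it (HiveShims, HiveShim, Hive12Shim, Hive13Shim,
forVersion, commandArgument) are hypothetical, and the single difference it
models (a String versus a String[] command argument) is taken from the compile
error reported elsewhere in this thread.

```java
import java.util.Arrays;

// Illustrative sketch only: every name here is hypothetical, not Spark or Hive API.
class HiveShims {
    /** Operations whose shape differs between Hive versions. */
    interface HiveShim {
        String version();

        /** Adapts one command token to the argument shape this Hive version expects. */
        Object commandArgument(String token);
    }

    /** Hive 0.12 style (assumed): the command is passed as a single String. */
    static final class Hive12Shim implements HiveShim {
        public String version() { return "0.12"; }
        public Object commandArgument(String token) { return token; }
    }

    /** Hive 0.13 style: the command is passed as a String array. */
    static final class Hive13Shim implements HiveShim {
        public String version() { return "0.13"; }
        public Object commandArgument(String token) { return new String[] { token }; }
    }

    /** Picks a shim from a version string such as "0.12.0" or "0.13.1". */
    static HiveShim forVersion(String hiveVersion) {
        if (hiveVersion.startsWith("0.12")) return new Hive12Shim();
        if (hiveVersion.startsWith("0.13")) return new Hive13Shim();
        throw new IllegalArgumentException("Unsupported Hive version: " + hiveVersion);
    }

    public static void main(String[] args) {
        HiveShim shim = forVersion("0.13.1");
        Object arg = shim.commandArgument("set");
        System.out.println(shim.version() + " -> " + Arrays.toString((String[]) arg));
    }
}
```

With this layout the calling code depends only on the interface, so adding a
new Hive version means adding one implementation rather than touching existing
hive-0.12 code paths.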



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Working-Formula-for-Hive-0-13-tp7551p8118.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



RE: Working Formula for Hive 0.13?

2014-08-25 Thread Andrew Lee
From my perspective, there are a few benefits to Hive 0.13.1+. The following
are the 4 major reasons I can see why people have recently been asking to
upgrade to Hive 0.13.1.
1. Performance improvements, bug fixes, and patches. (Usual case)
2. Native support for the Parquet format, with no need to provide custom JARs
and a SerDe as with Hive 0.12. (Depends; driven by data format and queries)
3. Support for the Tez engine, which gives a performance improvement in several
use cases. (Performance improvement)
4. Security in Hive 0.13.1 has improved a lot. (Security concerns, ACLs, etc.)
These are the major benefits I see in upgrading from Hive 0.12.0 to Hive 0.13.1+.
There may be others out there that I'm not aware of, but I do see it coming.
My 2 cents.

Re: Working Formula for Hive 0.13?

2014-08-25 Thread Michael Armbrust
Thanks for working on this!  It's unclear at the moment exactly how we are
going to handle this, since the end goal is to be compatible with as many
versions of Hive as possible.  That said, I think it would be great to open
a PR in this case.  Even if we don't merge it, that's a good way to get it
on people's radar and have a discussion about the changes that are required.


Re: Working Formula for Hive 0.13?

2014-08-24 Thread scwf

I have worked on a branch that updates the Hive version to hive-0.13 (using
org.apache.hive): https://github.com/scwf/spark/tree/hive-0.13
I am wondering whether it's OK to make a PR now, because hive-0.13 is
not compatible with hive-0.12 and here I used org.apache.hive.


On 2014/7/29 8:22, Michael Armbrust wrote:

A few things:
  - When we upgrade to Hive 0.13.0, Patrick will likely republish the
hive-exec jar just as we did for 0.12.0
  - Since we have to tie into some pretty low-level APIs it is unsurprising
that the code doesn't just compile out of the box against 0.13.0
  - ScalaReflection is for determining Schema from Scala classes, not
reflection-based bridge code.  Either way, it's unclear to me if there is any
reason to use reflection to support multiple versions, instead of just
upgrading to Hive 0.13.0

One question I have is: what is the goal of upgrading to Hive 0.13.0?  Is
it purely because you are having problems connecting to newer metastores?
Are there some features you are hoping for?  This will help me prioritize
this effort.

Michael


On Mon, Jul 28, 2014 at 4:05 PM, Ted Yu  wrote:


I was looking for a class where reflection-related code should reside.

I found this but don't think it is the proper class for bridging
differences between hive 0.12 and 0.13.1:

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala

Cheers


On Mon, Jul 28, 2014 at 3:41 PM, Ted Yu  wrote:


After manually copying hive 0.13.1 jars to local maven repo, I got the
following errors when building spark-hive_2.10 module :

[ERROR] /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala:182:
type mismatch;
  found   : String
  required: Array[String]
[ERROR]   val proc: CommandProcessor = CommandProcessorFactory.get(tokens(0), hiveconf)
[ERROR] ^
[ERROR] /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala:60:
value getAllPartitionsForPruner is not a member of org.apache.hadoop.hive.ql.metadata.Hive
[ERROR] client.getAllPartitionsForPruner(table).toSeq
[ERROR] ^
[ERROR] /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala:267:
overloaded method constructor TableDesc with alternatives:
  (x$1: Class[_ <: org.apache.hadoop.mapred.InputFormat[_, _]],x$2: Class[_],x$3: java.util.Properties)org.apache.hadoop.hive.ql.plan.TableDesc <and>
  ()org.apache.hadoop.hive.ql.plan.TableDesc
 cannot be applied to (Class[org.apache.hadoop.hive.serde2.Deserializer], Class[(some other)?0(in value tableDesc)(in value tableDesc)], Class[?0(in value tableDesc)(in value tableDesc)], java.util.Properties)
[ERROR]   val tableDesc = new TableDesc(
[ERROR]   ^
[WARNING] Class org.antlr.runtime.tree.CommonTree not found - continuing with a stub.
[WARNING] Class org.antlr.runtime.Token not found - continuing with a stub.
[WARNING] Class org.antlr.runtime.tree.Tree not found - continuing with a stub.
[ERROR]
     while compiling: /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala
        during phase: typer
     library version: version 2.10.4
    compiler version: version 2.10.4

The above shows incompatible changes between 0.12 and 0.13.1
e.g. the first error corresponds to the following method
in CommandProcessorFactory :
   public static CommandProcessor get(String[] cmd, HiveConf conf)

Cheers
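
The first error above (a String passed where String[] is required) is the kind
of difference a reflection-based bridge would have to absorb. The sketch below
shows the idea in a self-contained form and is not code from Spark or Hive:
FactoryV12 and FactoryV13 are hypothetical stand-ins for the two
CommandProcessorFactory variants, a plain String stands in for HiveConf and for
the returned CommandProcessor, and the single-String form is assumed from the
existing call get(tokens(0), hiveconf). Whether reflection is preferable to
simply upgrading is questioned elsewhere in this thread.

```java
import java.lang.reflect.Method;

// Illustrative sketch only: the factory classes below are hypothetical stand-ins.
class CommandFactoryBridge {
    /** Stand-in for the 0.12-style factory: takes a single command token. */
    public static class FactoryV12 {
        public static String get(String cmd, String conf) { return "v12:" + cmd; }
    }

    /** Stand-in for the 0.13-style factory: takes an array of command tokens. */
    public static class FactoryV13 {
        public static String get(String[] cmd, String conf) {
            return "v13:" + String.join(" ", cmd);
        }
    }

    /**
     * Calls the static get(...) of the given factory, preferring the
     * String[] signature and falling back to the single-String one.
     */
    static Object getProcessor(Class<?> factory, String token, String conf) {
        try {
            Method m;
            Object firstArg;
            try {
                m = factory.getMethod("get", String[].class, String.class);
                firstArg = new String[] { token };
            } catch (NoSuchMethodException e) {
                m = factory.getMethod("get", String.class, String.class);
                firstArg = token;
            }
            // The receiver is null because get(...) is static.
            return m.invoke(null, firstArg, conf);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No usable get(...) on " + factory.getName(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(getProcessor(FactoryV12.class, "set", "conf"));
        System.out.println(getProcessor(FactoryV13.class, "set", "conf"));
    }
}
```

The cost of this approach is that signature mismatches surface at run time
instead of at compile time, which is one reason to prefer a plain dependency
upgrade when only one Hive version has to be supported.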


On Mon, Jul 28, 2014 at 1:32 PM, Steve Nunez wrote:

So, do we have a short-term fix until Hive 0.14 comes out? Perhaps adding
the hive-exec jar to the spark-project repo? It doesn't look like there's
a release date scheduled for 0.14.


On 7/28/14, 10:50, "Cheng Lian" wrote:

Exactly, I forgot to mention that the Hulu team also made changes to cope with
those incompatibility issues, but they said that's relatively easy once the
re-packaging work is done.


On Tue, Jul 29, 2014 at 1:20 AM, Patrick Wendell wrote:

I've heard from Cloudera that there were hive internal changes between
0.12 and 0.13 that required code re-writing. Over time it might be
possible for us to integrate with hive using APIs that are more
stable (this is the domain of Michael/Cheng/Yin more than me!). It
would be interesting to see what the Hulu folks did.

- Patrick

On Mon, Jul 28, 2014 at 10:16 AM, Cheng Lian wrote:

AFAIK, according to a recent talk, the Hulu team in China has built Spark SQL
against Hive 0.13 (or 0.13.1?) successfully. Basically they also
re-packaged Hive 0.13, as the Spark team did. The slides of the talk
haven't been released yet though.


On Tue, Jul 29, 2014 at 1:01 AM, Ted Yu wrote:

Owen helped me find this:
https://issues.apache.org/jira/browse/HIVE-7423

I guess this means that for Hive 0.14, Spark should be able to directly
pull in hive-exec-core.jar

Cheers


On Mon, Jul 28, 2014 at 9:55 AM, Patrick Wendell <pwend...@gmail.com> wrote:

It would be great if the hive team can fix that issue. I

Re: Working Formula for Hive 0.13?

2014-08-08 Thread Zhan Zhang
I attached the diff to SPARK-2706. I am currently working on this problem.
If somebody else is also working on this, we can share the load.



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Working-Formula-for-Hive-0-13-tp7551p7782.html



Re: Working Formula for Hive 0.13?

2014-08-08 Thread Michael Armbrust
Could you make a PR as described here:
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark


On Fri, Aug 8, 2014 at 1:57 PM, Zhan Zhang  wrote:

> Sorry, forget to upload files. I have never posted before :) hive.diff
> <
> http://apache-spark-developers-list.1001551.n3.nabble.com/file/n/hive.diff
> >
>
>
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Working-Formula-for-Hive-0-13-tp7551p.html
> Sent from the Apache Spark Developers List mailing list archive at
> Nabble.com.
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org
>
>


Re: Working Formula for Hive 0.13?

2014-08-08 Thread Zhan Zhang
Sorry, I forgot to upload the files. I have never posted before :) hive.diff

  



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Working-Formula-for-Hive-0-13-tp7551p.html



Re: Working Formula for Hive 0.13?

2014-08-08 Thread Zhan Zhang
Here is the patch. Please ignore the pom.xml-related changes, which are just
for compilation purposes. I need to work further on this one, based on Wandou's
previous work.



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Working-Formula-for-Hive-0-13-tp7551p7776.html



Re: Working Formula for Hive 0.13?

2014-08-08 Thread Zhan Zhang
I can compile with no error, but my patch also includes other stuff. 



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Working-Formula-for-Hive-0-13-tp7551p7775.html



Re: Working Formula for Hive 0.13?

2014-08-08 Thread Zhan Zhang
The API changes do not seem major. I have changed the code locally and compiled
it, but have not tested it yet. The major problem is still how to solve the
hive-exec jar dependency. I am willing to help with this issue. Is it better to
stick with the same approach as for hive-0.12 until hive-exec is cleaned up
enough to switch back?



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Working-Formula-for-Hive-0-13-tp7551p7774.html



Re: Working Formula for Hive 0.13?

2014-07-30 Thread Ted Yu
I found SPARK-2706.

Let me attach a tentative patch there; I still face compilation errors.

Cheers



Re: Working Formula for Hive 0.13?

2014-07-28 Thread Ted Yu
bq. Either way, it's unclear to me if there is any reason to use reflection to
support multiple versions, instead of just upgrading to Hive 0.13.0

In which Spark release would this Hive upgrade take place?
I agree it is cleaner to upgrade the Hive dependency than to introduce reflection.

Cheers



Re: Working Formula for Hive 0.13?

2014-07-28 Thread Steve Nunez
The larger goal is to get a clean compile & test in the environment I have
to use. As near as I can tell, tests fail in parquet because parquet was
only added in Hive 0.13. There could well be issues in later meta-stores,
but one thing at a time...

- SteveN



On 7/28/14, 17:22, "Michael Armbrust"  wrote:

>A few things:
> - When we upgrade to Hive 0.13.0, Patrick will likely republish the
>hive-exec jar just as we did for 0.12.0
> - Since we have to tie into some pretty low level APIs it is unsurprising
>that the code doesn't just compile out of the box against 0.13.0
> - ScalaReflection is for determining Schema from Scala classes, not
>reflection based bridge code.  Either way its unclear to if there is any
>reason to use reflection to support multiple versions, instead of just
>upgrading to Hive 0.13.0
>
>One question I have is, What is the goal of upgrading to hive 0.13.0?  Is
>it purely because you are having problems connecting to newer metastores?
> Are there some features you are hoping for?  This will help me prioritize
>this effort.
>
>Michael
>
>
>On Mon, Jul 28, 2014 at 4:05 PM, Ted Yu  wrote:
>
>> I was looking for a class where reflection-related code should reside.
>>
>> I found this but don't think it is the proper class for bridging
>> differences between hive 0.12 and 0.13.1:
>>
>> 
>>sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection
>>.scala
>>
>> Cheers
>>
>>
>> On Mon, Jul 28, 2014 at 3:41 PM, Ted Yu  wrote:
>>
>> > After manually copying hive 0.13.1 jars to local maven repo, I got the
>> > following errors when building spark-hive_2.10 module :
>> >
>> > [ERROR]
>> >
>> 
>>/homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveCon
>>text.scala:182:
>> > type mismatch;
>> >  found   : String
>> >  required: Array[String]
>> > [ERROR]   val proc: CommandProcessor =
>> > CommandProcessorFactory.get(tokens(0), hiveconf)
>> > [ERROR]
>> >^
>> > [ERROR]
>> >
>> 
>>/homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMet
>>astoreCatalog.scala:60:
>> > value getAllPartitionsForPruner is not a member of org.apache.
>> >  hadoop.hive.ql.metadata.Hive
>> > [ERROR] client.getAllPartitionsForPruner(table).toSeq
>> > [ERROR]^
>> > [ERROR]
>> >
>> 
>>/homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMet
>>astoreCatalog.scala:267:
>> > overloaded method constructor TableDesc with alternatives:
>> >   (x$1: Class[_ <: org.apache.hadoop.mapred.InputFormat[_, _]],x$2:
>> > Class[_],x$3:
>> java.util.Properties)org.apache.hadoop.hive.ql.plan.TableDesc
>> > 
>> >   ()org.apache.hadoop.hive.ql.plan.TableDesc
>> >  cannot be applied to
>>(Class[org.apache.hadoop.hive.serde2.Deserializer],
>> > Class[(some other)?0(in value tableDesc)(in value tableDesc)],
>> Class[?0(in
>> > value tableDesc)(in   value tableDesc)], java.util.Properties)
>> > [ERROR]   val tableDesc = new TableDesc(
>> > [ERROR]   ^
>> > [WARNING] Class org.antlr.runtime.tree.CommonTree not found -
>>continuing
>> > with a stub.
>> > [WARNING] Class org.antlr.runtime.Token not found - continuing with a
>> stub.
>> > [WARNING] Class org.antlr.runtime.tree.Tree not found - continuing
>>with a
>> > stub.
>> > [ERROR]
>> >  while compiling:
>> >
>> 
>>/homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.
>>scala
>> > during phase: typer
>> >  library version: version 2.10.4
>> > compiler version: version 2.10.4
>> >
>> > The above shows incompatible changes between 0.12 and 0.13.1
>> > e.g. the first error corresponds to the following method
>> > in CommandProcessorFactory :
>> >   public static CommandProcessor get(String[] cmd, HiveConf conf)
>> >
>> > Cheers
>> >
>> >
>> > On Mon, Jul 28, 2014 at 1:32 PM, Steve Nunez 
>> > wrote:
>> >
>> >> So, do we have a short-term fix until Hive 0.14 comes out? Perhaps
>> adding
>> >> the hive-exec jar to the spark-project repo? It doesn't look like
>> there's
>> >> a release date schedule for 0.14.
>> >>
>> >>
>> >>
>> >> On 7/28/14, 10:50, "Cheng Lian"  wrote:
>> >>
>> >> >Exactly, forgot to mention Hulu team also made changes to cope with
>> those
>> >> >incompatibility issues, but they said that's relatively easy once
>>the
>> >> >re-packaging work is done.
>> >> >
>> >> >
>> >> >On Tue, Jul 29, 2014 at 1:20 AM, Patrick Wendell
>>
>> >>
>> >> >wrote:
>> >> >
>> >> >> I've heard from Cloudera that there were hive internal changes
>> between
>> >> >> 0.12 and 0.13 that required code re-writing. Over time it might be
>> >> >> possible for us to integrate with hive using API's that are more
>> >> >> stable (this is the domain of Michael/Cheng/Yin more than me!). It
>> >> >> would be interesting to see what the Hulu folks did.
>> >> >>
>> >> >> - Patrick
>> >> >>
>> >> >> On Mon, Jul 28, 2014 at 10:16 AM, Cheng Lian
>>
>> >> >> wrote:
>> >> >> > AFAIK, according a recent talk, Hulu team in China has built
>>Spark
>> >> SQL
>> >> >> > a

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Michael Armbrust
A few things:
 - When we upgrade to Hive 0.13.0, Patrick will likely republish the
hive-exec jar just as we did for 0.12.0
 - Since we have to tie into some pretty low level APIs it is unsurprising
that the code doesn't just compile out of the box against 0.13.0
 - ScalaReflection is for determining Schema from Scala classes, not
reflection-based bridge code.  Either way, it's unclear to me whether there is any
reason to use reflection to support multiple versions, instead of just
upgrading to Hive 0.13.0

One question I have is: what is the goal of upgrading to Hive 0.13.0?  Is
it purely because you are having problems connecting to newer metastores?
Are there some features you are hoping for?  This will help me prioritize
this effort.

Michael


On Mon, Jul 28, 2014 at 4:05 PM, Ted Yu  wrote:

> I was looking for a class where reflection-related code should reside.
>
> I found this but don't think it is the proper class for bridging
> differences between hive 0.12 and 0.13.1:
>
> sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala
>
> Cheers
>
>
> On Mon, Jul 28, 2014 at 3:41 PM, Ted Yu  wrote:
>
> > After manually copying hive 0.13.1 jars to local maven repo, I got the
> > following errors when building spark-hive_2.10 module :
> >
> > [ERROR]
> >
> /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala:182:
> > type mismatch;
> >  found   : String
> >  required: Array[String]
> > [ERROR]   val proc: CommandProcessor =
> > CommandProcessorFactory.get(tokens(0), hiveconf)
> > [ERROR]
> >^
> > [ERROR]
> >
> /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala:60:
> > value getAllPartitionsForPruner is not a member of org.apache.
> >  hadoop.hive.ql.metadata.Hive
> > [ERROR] client.getAllPartitionsForPruner(table).toSeq
> > [ERROR]^
> > [ERROR]
> >
> /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala:267:
> > overloaded method constructor TableDesc with alternatives:
> >   (x$1: Class[_ <: org.apache.hadoop.mapred.InputFormat[_, _]],x$2:
> > Class[_],x$3:
> java.util.Properties)org.apache.hadoop.hive.ql.plan.TableDesc
> > 
> >   ()org.apache.hadoop.hive.ql.plan.TableDesc
> >  cannot be applied to (Class[org.apache.hadoop.hive.serde2.Deserializer],
> > Class[(some other)?0(in value tableDesc)(in value tableDesc)],
> Class[?0(in
> > value tableDesc)(in   value tableDesc)], java.util.Properties)
> > [ERROR]   val tableDesc = new TableDesc(
> > [ERROR]   ^
> > [WARNING] Class org.antlr.runtime.tree.CommonTree not found - continuing
> > with a stub.
> > [WARNING] Class org.antlr.runtime.Token not found - continuing with a
> stub.
> > [WARNING] Class org.antlr.runtime.tree.Tree not found - continuing with a
> > stub.
> > [ERROR]
> >  while compiling:
> >
> /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala
> > during phase: typer
> >  library version: version 2.10.4
> > compiler version: version 2.10.4
> >
> > The above shows incompatible changes between 0.12 and 0.13.1
> > e.g. the first error corresponds to the following method
> > in CommandProcessorFactory :
> >   public static CommandProcessor get(String[] cmd, HiveConf conf)
> >
> > Cheers
> >
> >
> > On Mon, Jul 28, 2014 at 1:32 PM, Steve Nunez 
> > wrote:
> >
> >> So, do we have a short-term fix until Hive 0.14 comes out? Perhaps
> adding
> >> the hive-exec jar to the spark-project repo? It doesn't look like
> there's
> >> a release date schedule for 0.14.
> >>
> >>
> >>
> >> On 7/28/14, 10:50, "Cheng Lian"  wrote:
> >>
> >> >Exactly, forgot to mention Hulu team also made changes to cope with
> those
> >> >incompatibility issues, but they said that's relatively easy once the
> >> >re-packaging work is done.
> >> >
> >> >
> >> >On Tue, Jul 29, 2014 at 1:20 AM, Patrick Wendell 
> >>
> >> >wrote:
> >> >
> >> >> I've heard from Cloudera that there were hive internal changes
> between
> >> >> 0.12 and 0.13 that required code re-writing. Over time it might be
> >> >> possible for us to integrate with hive using API's that are more
> >> >> stable (this is the domain of Michael/Cheng/Yin more than me!). It
> >> >> would be interesting to see what the Hulu folks did.
> >> >>
> >> >> - Patrick
> >> >>
> >> >> On Mon, Jul 28, 2014 at 10:16 AM, Cheng Lian 
> >> >> wrote:
> >> >> > AFAIK, according a recent talk, Hulu team in China has built Spark
> >> SQL
> >> >> > against Hive 0.13 (or 0.13.1?) successfully. Basically they also
> >> >> > re-packaged Hive 0.13 as what the Spark team did. The slides of the
> >> >>talk
> >> >> > hasn't been released yet though.
> >> >> >
> >> >> >
> >> >> > On Tue, Jul 29, 2014 at 1:01 AM, Ted Yu 
> wrote:
> >> >> >
> >> >> >> Owen helped me find this:
> >> >> >> https://issues.apache.org/jira/browse/HIVE-7423
> >> >> >>
> >> >> >> I guess this means that for Hive 0.14, Spark should be able to
> >> >>directly
> >

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Ted Yu
I was looking for a class where reflection-related code should reside.

I found this but don't think it is the proper class for bridging
differences between hive 0.12 and 0.13.1:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala
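
For reference, a minimal sketch of what reflection-based bridge code could
look like. OldFactory and NewFactory are hypothetical stand-ins for the two
API shapes, not the real Hive classes, and VersionBridge is an invented name;
this only illustrates the technique under those assumptions and is not a
proposal for where the code should live in Spark.

```java
import java.lang.reflect.Method;

public class VersionBridge {
    // Hypothetical stand-in for an older API that takes a single String.
    public static class OldFactory {
        public static String get(String cmd) { return "old:" + cmd; }
    }

    // Hypothetical stand-in for a newer API that takes a String[].
    public static class NewFactory {
        public static String get(String[] cmd) { return "new:" + cmd[0]; }
    }

    // Look up whichever static get() overload the given class provides:
    // try the String[] form first, then fall back to the String form.
    static String call(Class<?> factory, String cmd) {
        try {
            Method m;
            Object arg;
            try {
                m = factory.getMethod("get", String[].class);
                arg = new String[] { cmd };
            } catch (NoSuchMethodException e) {
                m = factory.getMethod("get", String.class);
                arg = cmd;
            }
            // arg has static type Object, so it is passed as one argument.
            return (String) m.invoke(null, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no compatible get() overload", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(call(OldFactory.class, "set"));
        System.out.println(call(NewFactory.class, "set"));
    }
}
```

The cost of this approach is that signature mismatches surface at run time
rather than at compile time, which is part of why a plain upgrade may be
preferable.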

Cheers


On Mon, Jul 28, 2014 at 3:41 PM, Ted Yu  wrote:

> After manually copying hive 0.13.1 jars to local maven repo, I got the
> following errors when building spark-hive_2.10 module :
>
> [ERROR]
> /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala:182:
> type mismatch;
>  found   : String
>  required: Array[String]
> [ERROR]   val proc: CommandProcessor =
> CommandProcessorFactory.get(tokens(0), hiveconf)
> [ERROR]
>^
> [ERROR]
> /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala:60:
> value getAllPartitionsForPruner is not a member of org.apache.
>  hadoop.hive.ql.metadata.Hive
> [ERROR] client.getAllPartitionsForPruner(table).toSeq
> [ERROR]^
> [ERROR]
> /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala:267:
> overloaded method constructor TableDesc with alternatives:
>   (x$1: Class[_ <: org.apache.hadoop.mapred.InputFormat[_, _]],x$2:
> Class[_],x$3: java.util.Properties)org.apache.hadoop.hive.ql.plan.TableDesc
> 
>   ()org.apache.hadoop.hive.ql.plan.TableDesc
>  cannot be applied to (Class[org.apache.hadoop.hive.serde2.Deserializer],
> Class[(some other)?0(in value tableDesc)(in value tableDesc)], Class[?0(in
> value tableDesc)(in   value tableDesc)], java.util.Properties)
> [ERROR]   val tableDesc = new TableDesc(
> [ERROR]   ^
> [WARNING] Class org.antlr.runtime.tree.CommonTree not found - continuing
> with a stub.
> [WARNING] Class org.antlr.runtime.Token not found - continuing with a stub.
> [WARNING] Class org.antlr.runtime.tree.Tree not found - continuing with a
> stub.
> [ERROR]
>  while compiling:
> /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala
> during phase: typer
>  library version: version 2.10.4
> compiler version: version 2.10.4
>
> The above shows incompatible changes between 0.12 and 0.13.1
> e.g. the first error corresponds to the following method
> in CommandProcessorFactory :
>   public static CommandProcessor get(String[] cmd, HiveConf conf)
>
> Cheers
>
>
> On Mon, Jul 28, 2014 at 1:32 PM, Steve Nunez 
> wrote:
>
>> So, do we have a short-term fix until Hive 0.14 comes out? Perhaps adding
>> the hive-exec jar to the spark-project repo? It doesn't look like there's
>> a release date schedule for 0.14.
>>
>>
>>
>> On 7/28/14, 10:50, "Cheng Lian"  wrote:
>>
>> >Exactly, forgot to mention Hulu team also made changes to cope with those
>> >incompatibility issues, but they said that's relatively easy once the
>> >re-packaging work is done.
>> >
>> >
>> >On Tue, Jul 29, 2014 at 1:20 AM, Patrick Wendell 
>>
>> >wrote:
>> >
>> >> I've heard from Cloudera that there were hive internal changes between
>> >> 0.12 and 0.13 that required code re-writing. Over time it might be
>> >> possible for us to integrate with hive using API's that are more
>> >> stable (this is the domain of Michael/Cheng/Yin more than me!). It
>> >> would be interesting to see what the Hulu folks did.
>> >>
>> >> - Patrick
>> >>
>> >> On Mon, Jul 28, 2014 at 10:16 AM, Cheng Lian 
>> >> wrote:
>> >> > AFAIK, according a recent talk, Hulu team in China has built Spark
>> SQL
>> >> > against Hive 0.13 (or 0.13.1?) successfully. Basically they also
>> >> > re-packaged Hive 0.13 as what the Spark team did. The slides of the
>> >>talk
>> >> > hasn't been released yet though.
>> >> >
>> >> >
>> >> > On Tue, Jul 29, 2014 at 1:01 AM, Ted Yu  wrote:
>> >> >
>> >> >> Owen helped me find this:
>> >> >> https://issues.apache.org/jira/browse/HIVE-7423
>> >> >>
>> >> >> I guess this means that for Hive 0.14, Spark should be able to
>> >>directly
>> >> >> pull in hive-exec-core.jar
>> >> >>
>> >> >> Cheers
>> >> >>
>> >> >>
>> >> >> On Mon, Jul 28, 2014 at 9:55 AM, Patrick Wendell <
>> pwend...@gmail.com>
>> >> >> wrote:
>> >> >>
>> >> >> > It would be great if the hive team can fix that issue. If not,
>> >>we'll
>> >> >> > have to continue forking our own version of Hive to change the way
>> >>it
>> >> >> > publishes artifacts.
>> >> >> >
>> >> >> > - Patrick
>> >> >> >
>> >> >> > On Mon, Jul 28, 2014 at 9:34 AM, Ted Yu 
>> >>wrote:
>> >> >> > > Talked with Owen offline. He confirmed that as of 0.13,
>> >>hive-exec is
>> >> >> > still
>> >> >> > > uber jar.
>> >> >> > >
>> >> >> > > Right now I am facing the following error building against Hive
>> >> 0.13.1
>> >> >> :
>> >> >> > >
>> >> >> > > [ERROR] Failed to execute goal on project spark-hive_2.10: Could
>> >>not
>> >> >> > > resolve dependencies for project
>> >> >> > > org.apache.spark:spark-hive_2.10:jar:1.1.0-SNAPSHOT: The
>> >>following
>> >> >> > > artifacts could not be resolved:
>> >> >> > > org.sp

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Ted Yu
After manually copying the Hive 0.13.1 jars to my local Maven repo, I got the
following errors when building the spark-hive_2.10 module:

[ERROR]
/homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala:182:
type mismatch;
 found   : String
 required: Array[String]
[ERROR]   val proc: CommandProcessor =
CommandProcessorFactory.get(tokens(0), hiveconf)
[ERROR]
 ^
[ERROR]
/homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala:60:
value getAllPartitionsForPruner is not a member of org.apache.hadoop.hive.ql.metadata.Hive
[ERROR] client.getAllPartitionsForPruner(table).toSeq
[ERROR]^
[ERROR]
/homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala:267:
overloaded method constructor TableDesc with alternatives:
  (x$1: Class[_ <: org.apache.hadoop.mapred.InputFormat[_, _]],x$2:
Class[_],x$3: java.util.Properties)org.apache.hadoop.hive.ql.plan.TableDesc

  ()org.apache.hadoop.hive.ql.plan.TableDesc
 cannot be applied to (Class[org.apache.hadoop.hive.serde2.Deserializer],
Class[(some other)?0(in value tableDesc)(in value tableDesc)], Class[?0(in
value tableDesc)(in   value tableDesc)], java.util.Properties)
[ERROR]   val tableDesc = new TableDesc(
[ERROR]   ^
[WARNING] Class org.antlr.runtime.tree.CommonTree not found - continuing with a stub.
[WARNING] Class org.antlr.runtime.Token not found - continuing with a stub.
[WARNING] Class org.antlr.runtime.tree.Tree not found - continuing with a stub.
[ERROR]
 while compiling:
/homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala
during phase: typer
 library version: version 2.10.4
compiler version: version 2.10.4

The above shows incompatible changes between 0.12 and 0.13.1.
For example, the first error corresponds to the following method
in CommandProcessorFactory:
  public static CommandProcessor get(String[] cmd, HiveConf conf)
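
To illustrate the first error, here is a minimal sketch of one way a call
site might adapt: wrap the first token in a one-element array so it fits the
String[] parameter. The Conf class and the get method below are hypothetical
stand-ins that only mirror the shape of the signature above; they are not the
real HiveConf or CommandProcessorFactory, and the class name
CommandProcessorCall is invented for the example.

```java
import java.util.Arrays;

public class CommandProcessorCall {
    // Hypothetical stand-in for HiveConf; the real class lives in Hive.
    static class Conf {}

    // Hypothetical stand-in mirroring the shape get(String[] cmd, HiveConf conf).
    static String get(String[] cmd, Conf conf) {
        return "processor for " + Arrays.toString(cmd);
    }

    // Split the statement on whitespace and pass the first token as an array.
    static String lookup(String statement, Conf conf) {
        String[] tokens = statement.trim().split("\\s+");
        return get(new String[] { tokens[0] }, conf);
    }

    public static void main(String[] args) {
        // prints "processor for [set]"
        System.out.println(lookup("set hive.exec.parallel=true", new Conf()));
    }
}
```

Whether a one-element array is what the real 0.13.1 method expects is an
assumption here; the quoted signature alone does not say how the array is
interpreted.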

Cheers


On Mon, Jul 28, 2014 at 1:32 PM, Steve Nunez  wrote:

> So, do we have a short-term fix until Hive 0.14 comes out? Perhaps adding
> the hive-exec jar to the spark-project repo? It doesn't look like there's
> a release date schedule for 0.14.
>
>
>
> On 7/28/14, 10:50, "Cheng Lian"  wrote:
>
> >Exactly, forgot to mention Hulu team also made changes to cope with those
> >incompatibility issues, but they said that's relatively easy once the
> >re-packaging work is done.
> >
> >
> >On Tue, Jul 29, 2014 at 1:20 AM, Patrick Wendell 
> >wrote:
> >
> >> I've heard from Cloudera that there were hive internal changes between
> >> 0.12 and 0.13 that required code re-writing. Over time it might be
> >> possible for us to integrate with hive using API's that are more
> >> stable (this is the domain of Michael/Cheng/Yin more than me!). It
> >> would be interesting to see what the Hulu folks did.
> >>
> >> - Patrick
> >>
> >> On Mon, Jul 28, 2014 at 10:16 AM, Cheng Lian 
> >> wrote:
> >> > AFAIK, according a recent talk, Hulu team in China has built Spark SQL
> >> > against Hive 0.13 (or 0.13.1?) successfully. Basically they also
> >> > re-packaged Hive 0.13 as what the Spark team did. The slides of the
> >>talk
> >> > hasn't been released yet though.
> >> >
> >> >
> >> > On Tue, Jul 29, 2014 at 1:01 AM, Ted Yu  wrote:
> >> >
> >> >> Owen helped me find this:
> >> >> https://issues.apache.org/jira/browse/HIVE-7423
> >> >>
> >> >> I guess this means that for Hive 0.14, Spark should be able to
> >>directly
> >> >> pull in hive-exec-core.jar
> >> >>
> >> >> Cheers
> >> >>
> >> >>
> >> >> On Mon, Jul 28, 2014 at 9:55 AM, Patrick Wendell  >
> >> >> wrote:
> >> >>
> >> >> > It would be great if the hive team can fix that issue. If not,
> >>we'll
> >> >> > have to continue forking our own version of Hive to change the way
> >>it
> >> >> > publishes artifacts.
> >> >> >
> >> >> > - Patrick
> >> >> >
> >> >> > On Mon, Jul 28, 2014 at 9:34 AM, Ted Yu 
> >>wrote:
> >> >> > > Talked with Owen offline. He confirmed that as of 0.13,
> >>hive-exec is
> >> >> > still
> >> >> > > uber jar.
> >> >> > >
> >> >> > > Right now I am facing the following error building against Hive
> >> 0.13.1
> >> >> :
> >> >> > >
> >> >> > > [ERROR] Failed to execute goal on project spark-hive_2.10: Could
> >>not
> >> >> > > resolve dependencies for project
> >> >> > > org.apache.spark:spark-hive_2.10:jar:1.1.0-SNAPSHOT: The
> >>following
> >> >> > > artifacts could not be resolved:
> >> >> > > org.spark-project.hive:hive-metastore:jar:0.13.1,
> >> >> > > org.spark-project.hive:hive-exec:jar:0.13.1,
> >> >> > > org.spark-project.hive:hive-serde:jar:0.13.1: Failure to find
> >> >> > > org.spark-project.hive:hive-metastore:jar:0.13.1 in
> >> >> > > http://repo.maven.apache.org/maven2 was cached in the local
> >> >> repository,
> >> >> > > resolution will not be reattempted until the update interval of
> >> >> > maven-repo
> >> >> > > has elapsed or updates are forced -> [Help 1]
> >> >> > >
> >> >> > > Some hint wo

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Steve Nunez
So, do we have a short-term fix until Hive 0.14 comes out? Perhaps adding
the hive-exec jar to the spark-project repo? It doesn't look like there's
a release date scheduled for 0.14.



On 7/28/14, 10:50, "Cheng Lian"  wrote:

>Exactly, forgot to mention Hulu team also made changes to cope with those
>incompatibility issues, but they said that's relatively easy once the
>re-packaging work is done.
>
>
>On Tue, Jul 29, 2014 at 1:20 AM, Patrick Wendell 
>wrote:
>
>> I've heard from Cloudera that there were hive internal changes between
>> 0.12 and 0.13 that required code re-writing. Over time it might be
>> possible for us to integrate with hive using API's that are more
>> stable (this is the domain of Michael/Cheng/Yin more than me!). It
>> would be interesting to see what the Hulu folks did.
>>
>> - Patrick
>>
>> On Mon, Jul 28, 2014 at 10:16 AM, Cheng Lian 
>> wrote:
>> > AFAIK, according a recent talk, Hulu team in China has built Spark SQL
>> > against Hive 0.13 (or 0.13.1?) successfully. Basically they also
>> > re-packaged Hive 0.13 as what the Spark team did. The slides of the
>>talk
>> > hasn't been released yet though.
>> >
>> >
>> > On Tue, Jul 29, 2014 at 1:01 AM, Ted Yu  wrote:
>> >
>> >> Owen helped me find this:
>> >> https://issues.apache.org/jira/browse/HIVE-7423
>> >>
>> >> I guess this means that for Hive 0.14, Spark should be able to
>>directly
>> >> pull in hive-exec-core.jar
>> >>
>> >> Cheers
>> >>
>> >>
>> >> On Mon, Jul 28, 2014 at 9:55 AM, Patrick Wendell 
>> >> wrote:
>> >>
>> >> > It would be great if the hive team can fix that issue. If not,
>>we'll
>> >> > have to continue forking our own version of Hive to change the way
>>it
>> >> > publishes artifacts.
>> >> >
>> >> > - Patrick
>> >> >
>> >> > On Mon, Jul 28, 2014 at 9:34 AM, Ted Yu 
>>wrote:
>> >> > > Talked with Owen offline. He confirmed that as of 0.13,
>>hive-exec is
>> >> > still
>> >> > > uber jar.
>> >> > >
>> >> > > Right now I am facing the following error building against Hive
>> 0.13.1
>> >> :
>> >> > >
>> >> > > [ERROR] Failed to execute goal on project spark-hive_2.10: Could
>>not
>> >> > > resolve dependencies for project
>> >> > > org.apache.spark:spark-hive_2.10:jar:1.1.0-SNAPSHOT: The
>>following
>> >> > > artifacts could not be resolved:
>> >> > > org.spark-project.hive:hive-metastore:jar:0.13.1,
>> >> > > org.spark-project.hive:hive-exec:jar:0.13.1,
>> >> > > org.spark-project.hive:hive-serde:jar:0.13.1: Failure to find
>> >> > > org.spark-project.hive:hive-metastore:jar:0.13.1 in
>> >> > > http://repo.maven.apache.org/maven2 was cached in the local
>> >> repository,
>> >> > > resolution will not be reattempted until the update interval of
>> >> > maven-repo
>> >> > > has elapsed or updates are forced -> [Help 1]
>> >> > >
>> >> > > Some hint would be appreciated.
>> >> > >
>> >> > > Cheers
>> >> > >
>> >> > >
>> >> > > On Mon, Jul 28, 2014 at 9:15 AM, Sean Owen 
>> wrote:
>> >> > >
>> >> > >> Yes, it is published. As of previous versions, at least,
>>hive-exec
>> >> > >> included all of its dependencies *in its artifact*, making it
>> unusable
>> >> > >> as-is because it contained copies of dependencies that clash
>>with
>> >> > >> versions present in other artifacts, and can't be managed with
>> Maven
>> >> > >> mechanisms.
>> >> > >>
>> >> > >> I am not sure why hive-exec was not published normally, with
>>just
>> its
>> >> > >> own classes. That's why it was copied, into an artifact with
>>just
>> >> > >> hive-exec code.
>> >> > >>
>> >> > >> You could do the same thing for hive-exec 0.13.1.
>> >> > >> Or maybe someone knows that it's published more 'normally' now.
>> >> > >> I don't think hive-metastore is related to this question?
>> >> > >>
>> >> > >> I am no expert on the Hive artifacts, just remembering what the
>> issue
>> >> > >> was initially in case it helps you get to a similar solution.
>> >> > >>
>> >> > >> On Mon, Jul 28, 2014 at 4:47 PM, Ted Yu 
>> wrote:
>> >> > >> > hive-exec (as of 0.13.1) is published here:
>> >> > >> >
>> >> > >>
>> >> >
>> >>
>> 
>>http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-exec%7C
>>0.13.1%7Cjar
>> >> > >> >
>> >> > >> > Should a JIRA be opened so that dependency on hive-metastore
>>can
>> be
>> >> > >> > replaced by dependency on hive-exec ?
>> >> > >> >
>> >> > >> > Cheers
>> >> > >> >
>> >> > >> >
>> >> > >> > On Mon, Jul 28, 2014 at 8:26 AM, Sean Owen
>>
>> >> > wrote:
>> >> > >> >
>> >> > >> >> The reason for org.spark-project.hive is that Spark relies on
>> >> > >> >> hive-exec, but the Hive project does not publish this
>>artifact
>> by
>> >> > >> >> itself, only with all its dependencies as an uber jar. Maybe
>> that's
>> >> > >> >> been improved. If so, you need to point at the new hive-exec
>>and
>> >> > >> >> perhaps sort out its dependencies manually in your build.
>> >> > >> >>
>> >> > >> >> On Mon, Jul 28, 2014 at 4:01 PM, Ted Yu 
>> >> wrote:
>> >> > >> >> > I found 0.13.1 artifacts in maven:
>> >> > >> >> >
>> >> > >> >>
>> >>

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Cheng Lian
Exactly. I forgot to mention that the Hulu team also made changes to cope with
those incompatibility issues, but they said that’s relatively easy once the
re-packaging work is done.


On Tue, Jul 29, 2014 at 1:20 AM, Patrick Wendell  wrote:

> I've heard from Cloudera that there were hive internal changes between
> 0.12 and 0.13 that required code re-writing. Over time it might be
> possible for us to integrate with hive using API's that are more
> stable (this is the domain of Michael/Cheng/Yin more than me!). It
> would be interesting to see what the Hulu folks did.
>
> - Patrick
>
> On Mon, Jul 28, 2014 at 10:16 AM, Cheng Lian 
> wrote:
> > AFAIK, according a recent talk, Hulu team in China has built Spark SQL
> > against Hive 0.13 (or 0.13.1?) successfully. Basically they also
> > re-packaged Hive 0.13 as what the Spark team did. The slides of the talk
> > hasn't been released yet though.
> >
> >
> > On Tue, Jul 29, 2014 at 1:01 AM, Ted Yu  wrote:
> >
> >> Owen helped me find this:
> >> https://issues.apache.org/jira/browse/HIVE-7423
> >>
> >> I guess this means that for Hive 0.14, Spark should be able to directly
> >> pull in hive-exec-core.jar
> >>
> >> Cheers
> >>
> >>
> >> On Mon, Jul 28, 2014 at 9:55 AM, Patrick Wendell 
> >> wrote:
> >>
> >> > It would be great if the hive team can fix that issue. If not, we'll
> >> > have to continue forking our own version of Hive to change the way it
> >> > publishes artifacts.
> >> >
> >> > - Patrick
> >> >
> >> > On Mon, Jul 28, 2014 at 9:34 AM, Ted Yu  wrote:
> >> > > Talked with Owen offline. He confirmed that as of 0.13, hive-exec is
> >> > still
> >> > > uber jar.
> >> > >
> >> > > Right now I am facing the following error building against Hive
> 0.13.1
> >> :
> >> > >
> >> > > [ERROR] Failed to execute goal on project spark-hive_2.10: Could not
> >> > > resolve dependencies for project
> >> > > org.apache.spark:spark-hive_2.10:jar:1.1.0-SNAPSHOT: The following
> >> > > artifacts could not be resolved:
> >> > > org.spark-project.hive:hive-metastore:jar:0.13.1,
> >> > > org.spark-project.hive:hive-exec:jar:0.13.1,
> >> > > org.spark-project.hive:hive-serde:jar:0.13.1: Failure to find
> >> > > org.spark-project.hive:hive-metastore:jar:0.13.1 in
> >> > > http://repo.maven.apache.org/maven2 was cached in the local
> >> repository,
> >> > > resolution will not be reattempted until the update interval of
> >> > maven-repo
> >> > > has elapsed or updates are forced -> [Help 1]
> >> > >
> >> > > Some hint would be appreciated.
> >> > >
> >> > > Cheers
> >> > >
> >> > >
> >> > > On Mon, Jul 28, 2014 at 9:15 AM, Sean Owen 
> wrote:
> >> > >
> >> > >> Yes, it is published. As of previous versions, at least, hive-exec
> >> > >> included all of its dependencies *in its artifact*, making it
> unusable
> >> > >> as-is because it contained copies of dependencies that clash with
> >> > >> versions present in other artifacts, and can't be managed with
> Maven
> >> > >> mechanisms.
> >> > >>
> >> > >> I am not sure why hive-exec was not published normally, with just
> its
> >> > >> own classes. That's why it was copied, into an artifact with just
> >> > >> hive-exec code.
> >> > >>
> >> > >> You could do the same thing for hive-exec 0.13.1.
> >> > >> Or maybe someone knows that it's published more 'normally' now.
> >> > >> I don't think hive-metastore is related to this question?
> >> > >>
> >> > >> I am no expert on the Hive artifacts, just remembering what the
> issue
> >> > >> was initially in case it helps you get to a similar solution.
> >> > >>
> >> > >> On Mon, Jul 28, 2014 at 4:47 PM, Ted Yu 
> wrote:
> >> > >> > hive-exec (as of 0.13.1) is published here:
> >> > >> >
> >> > >>
> >> >
> >>
> http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-exec%7C0.13.1%7Cjar
> >> > >> >
> >> > >> > Should a JIRA be opened so that dependency on hive-metastore can
> be
> >> > >> > replaced by dependency on hive-exec ?
> >> > >> >
> >> > >> > Cheers
> >> > >> >
> >> > >> >
> >> > >> > On Mon, Jul 28, 2014 at 8:26 AM, Sean Owen 
> >> > wrote:
> >> > >> >
> >> > >> >> The reason for org.spark-project.hive is that Spark relies on
> >> > >> >> hive-exec, but the Hive project does not publish this artifact
> by
> >> > >> >> itself, only with all its dependencies as an uber jar. Maybe
> that's
> >> > >> >> been improved. If so, you need to point at the new hive-exec and
> >> > >> >> perhaps sort out its dependencies manually in your build.
> >> > >> >>
> >> > >> >> On Mon, Jul 28, 2014 at 4:01 PM, Ted Yu 
> >> wrote:
> >> > >> >> > I found 0.13.1 artifacts in maven:
> >> > >> >> >
> >> > >> >>
> >> > >>
> >> >
> >>
> http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-metastore%7C0.13.1%7Cjar
> >> > >> >> >
> >> > >> >> > However, Spark uses groupId of org.spark-project.hive, not
> >> > >> >> org.apache.hive
> >> > >> >> >
> >> > >> >> > Can someone tell me how it is supposed to work ?
> >> > >> >> >
> >> > >> >> > Cheers
> >> > >> >> >
> >> > >> >> >
> >> > >> >> > On 

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Mark Hamstra
Is getting and maintaining our own branch in the main ASF Hive repo a
non-starter, or otherwise unworkable?


On Mon, Jul 28, 2014 at 10:17 AM, Patrick Wendell 
wrote:

> Yeah so we need a model for this (Mark - do you have any ideas?). I
> did this in a personal github repo. I just did it quickly because
> dependency issues were blocking the 1.0 release:
>
> https://github.com/pwendell/hive/tree/branch-0.12-shaded-protobuf
>
> I think what we want is to have a semi official github repo with an
> index to each of the shaded dependencies and what version is included
> in which branch.
>
> - Patrick
>
> On Mon, Jul 28, 2014 at 10:02 AM, Mark Hamstra 
> wrote:
> > Where and how is that fork being maintained?  I'm not seeing an obviously
> > correct branch or tag in the main asf hive repo & github mirror.
> >
> >
> > On Mon, Jul 28, 2014 at 9:55 AM, Patrick Wendell 
> wrote:
> >
> >> It would be great if the hive team can fix that issue. If not, we'll
> >> have to continue forking our own version of Hive to change the way it
> >> publishes artifacts.
> >>
> >> - Patrick
> >>
> >> On Mon, Jul 28, 2014 at 9:34 AM, Ted Yu  wrote:
> >> > Talked with Owen offline. He confirmed that as of 0.13, hive-exec is
> >> still
> >> > uber jar.
> >> >
> >> > Right now I am facing the following error building against Hive
> 0.13.1 :
> >> >
> >> > [ERROR] Failed to execute goal on project spark-hive_2.10: Could not
> >> > resolve dependencies for project
> >> > org.apache.spark:spark-hive_2.10:jar:1.1.0-SNAPSHOT: The following
> >> > artifacts could not be resolved:
> >> > org.spark-project.hive:hive-metastore:jar:0.13.1,
> >> > org.spark-project.hive:hive-exec:jar:0.13.1,
> >> > org.spark-project.hive:hive-serde:jar:0.13.1: Failure to find
> >> > org.spark-project.hive:hive-metastore:jar:0.13.1 in
> >> > http://repo.maven.apache.org/maven2 was cached in the local
> repository,
> >> > resolution will not be reattempted until the update interval of
> >> maven-repo
> >> > has elapsed or updates are forced -> [Help 1]
> >> >
> >> > Some hint would be appreciated.
> >> >
> >> > Cheers
> >> >
> >> >
> >> > On Mon, Jul 28, 2014 at 9:15 AM, Sean Owen 
> wrote:
> >> >
> >> >> Yes, it is published. As of previous versions, at least, hive-exec
> >> >> included all of its dependencies *in its artifact*, making it
> unusable
> >> >> as-is because it contained copies of dependencies that clash with
> >> >> versions present in other artifacts, and can't be managed with Maven
> >> >> mechanisms.
> >> >>
> >> >> I am not sure why hive-exec was not published normally, with just its
> >> >> own classes. That's why it was copied, into an artifact with just
> >> >> hive-exec code.
> >> >>
> >> >> You could do the same thing for hive-exec 0.13.1.
> >> >> Or maybe someone knows that it's published more 'normally' now.
> >> >> I don't think hive-metastore is related to this question?
> >> >>
> >> >> I am no expert on the Hive artifacts, just remembering what the issue
> >> >> was initially in case it helps you get to a similar solution.
> >> >>
> >> >> On Mon, Jul 28, 2014 at 4:47 PM, Ted Yu  wrote:
> >> >> > hive-exec (as of 0.13.1) is published here:
> >> >> >
> >> >>
> >>
> http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-exec%7C0.13.1%7Cjar
> >> >> >
> >> >> > Should a JIRA be opened so that dependency on hive-metastore can be
> >> >> > replaced by dependency on hive-exec ?
> >> >> >
> >> >> > Cheers
> >> >> >
> >> >> >
> >> >> > On Mon, Jul 28, 2014 at 8:26 AM, Sean Owen 
> >> wrote:
> >> >> >
> >> >> >> The reason for org.spark-project.hive is that Spark relies on
> >> >> >> hive-exec, but the Hive project does not publish this artifact by
> >> >> >> itself, only with all its dependencies as an uber jar. Maybe
> that's
> >> >> >> been improved. If so, you need to point at the new hive-exec and
> >> >> >> perhaps sort out its dependencies manually in your build.
> >> >> >>
> >> >> >> On Mon, Jul 28, 2014 at 4:01 PM, Ted Yu 
> wrote:
> >> >> >> > I found 0.13.1 artifacts in maven:
> >> >> >> >
> >> >> >>
> >> >>
> >>
> http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-metastore%7C0.13.1%7Cjar
> >> >> >> >
> >> >> >> > However, Spark uses groupId of org.spark-project.hive, not
> >> >> >> org.apache.hive
> >> >> >> >
> >> >> >> > Can someone tell me how it is supposed to work ?
> >> >> >> >
> >> >> >> > Cheers
> >> >> >> >
> >> >> >> >
> >> >> >> > On Mon, Jul 28, 2014 at 7:44 AM, Steve Nunez <
> >> snu...@hortonworks.com>
> >> >> >> wrote:
> >> >> >> >
> >> >> >> >> I saw a note earlier, perhaps on the user list, that at least
> one
> >> >> >> person is
> >> >> >> >> using Hive 0.13. Anyone got a working build configuration for
> this
> >> >> >> version
> >> >> >> >> of Hive?
> >> >> >> >>
> >> >> >> >> Regards,
> >> >> >> >> - Steve
> >> >> >> >>
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> --
> >> >> >> >> CONFIDENTIALITY NOTICE
> >> >> >> >> NOTICE: This message is intended for the use of the individual
> or
> >> >> >> en

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Patrick Wendell
I've heard from Cloudera that there were Hive internal changes between
0.12 and 0.13 that required code rewriting. Over time it might be
possible for us to integrate with Hive using APIs that are more
stable (this is the domain of Michael/Cheng/Yin more than me!). It
would be interesting to see what the Hulu folks did.

- Patrick

On Mon, Jul 28, 2014 at 10:16 AM, Cheng Lian  wrote:
> AFAIK, according a recent talk, Hulu team in China has built Spark SQL
> against Hive 0.13 (or 0.13.1?) successfully. Basically they also
> re-packaged Hive 0.13 as what the Spark team did. The slides of the talk
> hasn't been released yet though.

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Patrick Wendell
Yeah, so we need a model for this (Mark - do you have any ideas?). I
did this in a personal GitHub repo. I just did it quickly because
dependency issues were blocking the 1.0 release:

https://github.com/pwendell/hive/tree/branch-0.12-shaded-protobuf

I think what we want is a semi-official GitHub repo with an
index of the shaded dependencies and which version is included
in which branch.

- Patrick

On Mon, Jul 28, 2014 at 10:02 AM, Mark Hamstra  wrote:
> Where and how is that fork being maintained?  I'm not seeing an obviously
> correct branch or tag in the main asf hive repo & github mirror.

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Cheng Lian
AFAIK, according to a recent talk, the Hulu team in China has built Spark SQL
against Hive 0.13 (or 0.13.1?) successfully. Basically they also
re-packaged Hive 0.13, as the Spark team did. The slides of the talk
haven't been released yet, though.


On Tue, Jul 29, 2014 at 1:01 AM, Ted Yu  wrote:

> Owen helped me find this:
> https://issues.apache.org/jira/browse/HIVE-7423
>
> I guess this means that for Hive 0.14, Spark should be able to directly
> pull in hive-exec-core.jar
>
> Cheers

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Mark Hamstra
Where and how is that fork being maintained?  I'm not seeing an obviously
correct branch or tag in the main ASF Hive repo or its GitHub mirror.


On Mon, Jul 28, 2014 at 9:55 AM, Patrick Wendell  wrote:

> It would be great if the hive team can fix that issue. If not, we'll
> have to continue forking our own version of Hive to change the way it
> publishes artifacts.
>
> - Patrick


Re: Working Formula for Hive 0.13?

2014-07-28 Thread Ted Yu
Owen helped me find this:
https://issues.apache.org/jira/browse/HIVE-7423

I guess this means that for Hive 0.14, Spark should be able to pull in
hive-exec-core.jar directly.

Cheers


On Mon, Jul 28, 2014 at 9:55 AM, Patrick Wendell  wrote:

> It would be great if the hive team can fix that issue. If not, we'll
> have to continue forking our own version of Hive to change the way it
> publishes artifacts.
>
> - Patrick


Re: Working Formula for Hive 0.13?

2014-07-28 Thread Patrick Wendell
It would be great if the Hive team could fix that issue. If not, we'll
have to continue forking our own version of Hive to change the way it
publishes artifacts.

- Patrick

On Mon, Jul 28, 2014 at 9:34 AM, Ted Yu  wrote:
> Talked with Owen offline. He confirmed that as of 0.13, hive-exec is still
> an uber jar.
>
> Right now I am facing the following error building against Hive 0.13.1:
>
> [ERROR] Failed to execute goal on project spark-hive_2.10: Could not
> resolve dependencies for project
> org.apache.spark:spark-hive_2.10:jar:1.1.0-SNAPSHOT: The following
> artifacts could not be resolved:
> org.spark-project.hive:hive-metastore:jar:0.13.1,
> org.spark-project.hive:hive-exec:jar:0.13.1,
> org.spark-project.hive:hive-serde:jar:0.13.1: Failure to find
> org.spark-project.hive:hive-metastore:jar:0.13.1 in
> http://repo.maven.apache.org/maven2 was cached in the local repository,
> resolution will not be reattempted until the update interval of maven-repo
> has elapsed or updates are forced -> [Help 1]
>
> Some hint would be appreciated.
>
> Cheers


Re: Working Formula for Hive 0.13?

2014-07-28 Thread Ted Yu
Talked with Owen offline. He confirmed that as of 0.13, hive-exec is still
an uber jar.

Right now I am facing the following error building against Hive 0.13.1:

[ERROR] Failed to execute goal on project spark-hive_2.10: Could not
resolve dependencies for project
org.apache.spark:spark-hive_2.10:jar:1.1.0-SNAPSHOT: The following
artifacts could not be resolved:
org.spark-project.hive:hive-metastore:jar:0.13.1,
org.spark-project.hive:hive-exec:jar:0.13.1,
org.spark-project.hive:hive-serde:jar:0.13.1: Failure to find
org.spark-project.hive:hive-metastore:jar:0.13.1 in
http://repo.maven.apache.org/maven2 was cached in the local repository,
resolution will not be reattempted until the update interval of maven-repo
has elapsed or updates are forced -> [Help 1]

Any hints would be appreciated.

Cheers
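
For anyone reading a similar failure, the unresolved coordinates can be
pulled out of the Maven message with a few lines of Python. This is only a
sketch: it assumes the groupId:artifactId:jar:version layout seen in the
error above, and missing_artifacts is a hypothetical helper name, not part
of Maven or Spark.

```python
import re

# Matches Maven coordinates of the form groupId:artifactId:jar:version.
COORD = re.compile(r"([\w.\-]+):([\w.\-]+):jar:([\w.\-]+)")

# The error text from this thread, used here as sample input.
SAMPLE = """[ERROR] Failed to execute goal on project spark-hive_2.10: Could not
resolve dependencies for project
org.apache.spark:spark-hive_2.10:jar:1.1.0-SNAPSHOT: The following
artifacts could not be resolved:
org.spark-project.hive:hive-metastore:jar:0.13.1,
org.spark-project.hive:hive-exec:jar:0.13.1,
org.spark-project.hive:hive-serde:jar:0.13.1: Failure to find
org.spark-project.hive:hive-metastore:jar:0.13.1 in
http://repo.maven.apache.org/maven2 was cached in the local repository"""


def missing_artifacts(message):
    """Return unique (groupId, artifactId, version) triples listed after
    'could not be resolved:' in a dependency-resolution error."""
    text = " ".join(message.split())  # undo mail/terminal line wrapping
    marker = "could not be resolved:"
    start = text.find(marker)
    if start < 0:
        return []
    tail = text[start + len(marker):]
    # The artifact list ends where the "Failure to find" explanation begins.
    end = tail.find("Failure to find")
    if end >= 0:
        tail = tail[:end]
    found = []
    for match in COORD.finditer(tail):
        if match.groups() not in found:
            found.append(match.groups())
    return found


if __name__ == "__main__":
    for group_id, artifact_id, version in missing_artifacts(SAMPLE):
        print(group_id, artifact_id, version)
```

All three coordinates carry the org.spark-project.hive groupId, which matches
the point made elsewhere in this thread: those are the re-packaged artifacts,
so a 0.13.1 version of them has to be published before the build can
resolve them.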


On Mon, Jul 28, 2014 at 9:15 AM, Sean Owen  wrote:

> Yes, it is published. As of previous versions, at least, hive-exec
> included all of its dependencies *in its artifact*, making it unusable
> as-is because it contained copies of dependencies that clash with
> versions present in other artifacts, and can't be managed with Maven
> mechanisms.
>
> I am not sure why hive-exec was not published normally, with just its
> own classes. That's why it was copied into an artifact containing just
> the hive-exec code.
>
> You could do the same thing for hive-exec 0.13.1.
> Or maybe someone knows that it's published more 'normally' now.
> I don't think hive-metastore is related to this question?
>
> I am no expert on the Hive artifacts, just remembering what the issue
> was initially in case it helps you get to a similar solution.


Re: Working Formula for Hive 0.13?

2014-07-28 Thread Sean Owen
Yes, it is published. At least in previous versions, hive-exec
included all of its dependencies *in its artifact*. That made it
unusable as-is: it contained copies of dependencies that clash with
versions present in other artifacts, and those copies can't be managed
with Maven mechanisms.

I am not sure why hive-exec was not published normally, with just its
own classes. That's why it was copied into a separate artifact
containing only the hive-exec code.

You could do the same thing for hive-exec 0.13.1.
Or maybe someone knows that it's published more 'normally' now.
I don't think hive-metastore is related to this question?

I am no expert on the Hive artifacts, just remembering what the issue
was initially in case it helps you get to a similar solution.
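
One rough way to check whether a given hive-exec jar still bundles its
dependencies is to count the classes inside it that live outside Hive's own
package. The sketch below is only an illustration: the function name
foreign_packages, the depth parameter, and the jar path in the usage comment
are assumptions made for this example, not part of Hive, Spark, or Maven.

```python
import zipfile
from collections import Counter


def foreign_packages(jar_path, own_package="org.apache.hadoop.hive", depth=2):
    """Count .class entries in a jar that fall outside own_package.

    Results are grouped by the first `depth` components of the package name,
    so a non-empty result suggests the jar bundles third-party classes.
    """
    counts = Counter()
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if not name.endswith(".class"):
                continue  # skip manifests, resources, directory entries
            # Turn "a/b/C.class" into the package name "a.b".
            package = ".".join(name.split("/")[:-1])
            if package == own_package or package.startswith(own_package + "."):
                continue  # the artifact's own classes
            key = ".".join(package.split(".")[:depth]) or "(default package)"
            counts[key] += 1
    return counts


# Hypothetical usage; the path is a placeholder for a locally downloaded jar:
#   for prefix, n in foreign_packages("hive-exec-0.13.1.jar").most_common(10):
#       print(prefix, n)
```

A jar published with just its own classes should give an empty or nearly
empty result; a long list of unrelated prefixes points to an uber jar.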

On Mon, Jul 28, 2014 at 4:47 PM, Ted Yu  wrote:
> hive-exec (as of 0.13.1) is published here:
> http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-exec%7C0.13.1%7Cjar
>
> Should a JIRA be opened so that the dependency on hive-metastore can be
> replaced by a dependency on hive-exec?
>
> Cheers


Re: Working Formula for Hive 0.13?

2014-07-28 Thread Ted Yu
hive-exec (as of 0.13.1) is published here:
http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-exec%7C0.13.1%7Cjar

Should a JIRA be opened so that the dependency on hive-metastore can be
replaced by a dependency on hive-exec?

Cheers


On Mon, Jul 28, 2014 at 8:26 AM, Sean Owen  wrote:

> The reason for org.spark-project.hive is that Spark relies on
> hive-exec, but the Hive project does not publish this artifact by
> itself, only with all its dependencies as an uber jar. Maybe that's
> been improved. If so, you need to point at the new hive-exec and
> perhaps sort out its dependencies manually in your build.


Re: Working Formula for Hive 0.13?

2014-07-28 Thread Sean Owen
The reason for org.spark-project.hive is that Spark relies on
hive-exec, but the Hive project does not publish this artifact by
itself, only with all its dependencies as an uber jar. Maybe that's
been improved. If so, you need to point at the new hive-exec and
perhaps sort out its dependencies manually in your build.

On Mon, Jul 28, 2014 at 4:01 PM, Ted Yu  wrote:
> I found 0.13.1 artifacts in Maven:
> http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-metastore%7C0.13.1%7Cjar
>
> However, Spark uses a groupId of org.spark-project.hive, not org.apache.hive.
>
> Can someone tell me how it is supposed to work?
>
> Cheers


Re: Working Formula for Hive 0.13?

2014-07-28 Thread Ted Yu
I found 0.13.1 artifacts in Maven:
http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-metastore%7C0.13.1%7Cjar

However, Spark uses a groupId of org.spark-project.hive, not org.apache.hive.

Can someone tell me how it is supposed to work?

Cheers


On Mon, Jul 28, 2014 at 7:44 AM, Steve Nunez  wrote:

> I saw a note earlier, perhaps on the user list, that at least one person is
> using Hive 0.13. Does anyone have a working build configuration for this
> version of Hive?
>
> Regards,
> - Steve


Working Formula for Hive 0.13?

2014-07-28 Thread Steve Nunez
I saw a note earlier, perhaps on the user list, that at least one person is
using Hive 0.13. Does anyone have a working build configuration for this
version of Hive?

Regards,
- Steve



-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.