Re: [VOTE] Release Apache Spark 1.4.0 (RC3)

Yin Huai Mon, 01 Jun 2015 11:11:19 -0700

Hi Peter,

Based on your error message, it seems you were not using RC3. For the
error thrown at HiveContext's line 206, we changed the message to this one
<https://github.com/apache/spark/blob/v1.4.0-rc3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala#L205-207>
just before RC3. Basically, we no longer print the class loader name. Can you
check whether an older version of the 1.4 branch got used? Have you published
an RC3 build to your local Maven repo? Can you clean your local repo cache and
try again?
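
One way to check which jar was actually picked up is to ask the JVM where a
class was loaded from. The snippet below is only a sketch that can be pasted
into the sbt console; `loadedFrom` is a made-up helper name, not a Spark API,
and it relies only on the standard `Class.getProtectionDomain` call.

```scala
// Hypothetical helper (not part of Spark): report the jar or directory
// a class was loaded from, using only standard JVM reflection.
def loadedFrom(cls: Class[_]): String =
  Option(cls.getProtectionDomain.getCodeSource) // null for bootstrap classes
    .flatMap(cs => Option(cs.getLocation))      // location may also be null
    .map(_.toString)
    .getOrElse("<bootstrap or unknown>")
```

With Spark on the classpath, calling
`loadedFrom(classOf[org.apache.spark.sql.hive.HiveContext])` should print the
path of the spark-hive jar in use. If that path points into a local Maven or
Ivy cache rather than a freshly resolved RC3 artifact, a stale build is the
likely cause.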


Thanks,

Yin

On Mon, Jun 1, 2015 at 10:45 AM, Peter Rudenko <[email protected]>
wrote:

>  I still have a problem using HiveContext from sbt. Here’s an example of
> the dependencies:
>
>  val sparkVersion = "1.4.0-rc3"
>
>     lazy val root = Project(id = "spark-hive", base = file("."),
>        settings = Project.defaultSettings ++ Seq(
>        name := "spark-1.4-hive",
>        scalaVersion := "2.10.5",
>        scalaBinaryVersion := "2.10",
>        resolvers += "Spark RC" at "https://repository.apache.org/content/repositories/orgapachespark-1110/",
>        libraryDependencies ++= Seq(
>          "org.apache.spark" %% "spark-core" % sparkVersion,
>          "org.apache.spark" %% "spark-mllib" % sparkVersion,
>          "org.apache.spark" %% "spark-hive" % sparkVersion,
>          "org.apache.spark" %% "spark-sql" % sparkVersion
>         )
>
>   ))
>
> Launching sbt console with it and running:
>
> val conf = new SparkConf().setMaster("local[4]").setAppName("test")
> val sc = new SparkContext(conf)
> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
> val data = sc.parallelize(1 to 10000)
> import sqlContext.implicits._
> scala> data.toDF
> java.lang.IllegalArgumentException: Unable to locate hive jars to connect to metastore using classloader scala.tools.nsc.interpreter.IMain$TranslatingClassLoader. Please set spark.sql.hive.metastore.jars
>     at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:206)
>     at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:175)
>     at org.apache.spark.sql.hive.HiveContext$anon$2.<init>(HiveContext.scala:367)
>     at org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:367)
>     at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:366)
>     at org.apache.spark.sql.hive.HiveContext$anon$1.<init>(HiveContext.scala:379)
>     at org.apache.spark.sql.hive.HiveContext.analyzer$lzycompute(HiveContext.scala:379)
>     at org.apache.spark.sql.hive.HiveContext.analyzer(HiveContext.scala:378)
>     at org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:901)
>     at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:134)
>     at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51)
>     at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:474)
>     at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:456)
>     at org.apache.spark.sql.SQLContext$implicits$.intRddToDataFrameHolder(SQLContext.scala:345)
>
> Thanks,
> Peter Rudenko
>
> On 2015-06-01 05:04, Guoqiang Li wrote:
>
>   +1 (non-binding)
>
>
>  ------------------ Original ------------------
>  *From:* "Sandy Ryza" <[email protected]>
> *Date:* Mon, Jun 1, 2015 07:34 AM
> *To:* "Krishna Sankar" <[email protected]>
> *Cc:* "Patrick Wendell" <[email protected]>; <[email protected]>
> *Subject: * Re: [VOTE] Release Apache Spark 1.4.0 (RC3)
>
>  +1 (non-binding)
>
>  Launched against a pseudo-distributed YARN cluster running Hadoop 2.6.0
> and ran some jobs.
>
>  -Sandy
>
> On Sat, May 30, 2015 at 3:44 PM, Krishna Sankar <[email protected]> wrote:
>
>>  +1 (non-binding, of course)
>>
>>  1. Compiled OSX 10.10 (Yosemite) OK Total time: 17:07 min
>>      mvn clean package -Pyarn -Dyarn.version=2.6.0 -Phadoop-2.4
>> -Dhadoop.version=2.6.0 -DskipTests
>> 2. Tested pyspark, MLlib - running as well as comparing results with 1.3.1
>> 2.1. statistics (min,max,mean,Pearson,Spearman) OK
>> 2.2. Linear/Ridge/Lasso Regression OK
>> 2.3. Decision Tree, Naive Bayes OK
>> 2.4. KMeans OK
>>        Center And Scale OK
>> 2.5. RDD operations OK
>>       State of the Union Texts - MapReduce, Filter,sortByKey (word count)
>> 2.6. Recommendation (Movielens medium dataset ~1 M ratings) OK
>>        Model evaluation/optimization (rank, numIter, lambda) with
>> itertools OK
>> 3. Scala - MLlib
>> 3.1. statistics (min,max,mean,Pearson,Spearman) OK
>> 3.2. LinearRegressionWithSGD OK
>> 3.3. Decision Tree OK
>> 3.4. KMeans OK
>> 3.5. Recommendation (Movielens medium dataset ~1 M ratings) OK
>> 3.6. saveAsParquetFile OK
>> 3.7. Read and verify the 3.6 save (above) - sqlContext.parquetFile,
>> registerTempTable, sql OK
>> 3.8. result = sqlContext.sql("SELECT
>> OrderDetails.OrderID,ShipCountry,UnitPrice,Qty,Discount FROM Orders INNER
>> JOIN OrderDetails ON Orders.OrderID = OrderDetails.OrderID") OK
>> 4.0. Spark SQL from Python OK
>> 4.1. result = sqlContext.sql("SELECT * from people WHERE State = 'WA'") OK
>>
>>  Cheers
>>  <k/>
>>
>> On Fri, May 29, 2015 at 4:40 PM, Patrick Wendell <[email protected]> wrote:
>>
>>> Please vote on releasing the following candidate as Apache Spark version
>>> 1.4.0!
>>>
>>> The tag to be voted on is v1.4.0-rc3 (commit dd109a8):
>>>
>>> https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=dd109a8746ec07c7c83995890fc2c0cd7a693730
>>>
>>> The release files, including signatures, digests, etc. can be found at:
>>> http://people.apache.org/~pwendell/spark-releases/spark-1.4.0-rc3-bin/
>>>
>>> Release artifacts are signed with the following key:
>>> https://people.apache.org/keys/committer/pwendell.asc
>>>
>>> The staging repository for this release can be found at:
>>> [published as version: 1.4.0]
>>> https://repository.apache.org/content/repositories/orgapachespark-1109/
>>> [published as version: 1.4.0-rc3]
>>> https://repository.apache.org/content/repositories/orgapachespark-1110/
>>>
>>> The documentation corresponding to this release can be found at:
>>> http://people.apache.org/~pwendell/spark-releases/spark-1.4.0-rc3-docs/
>>>
>>> Please vote on releasing this package as Apache Spark 1.4.0!
>>>
>>> The vote is open until Tuesday, June 02, at 00:32 UTC and passes
>>> if a majority of at least 3 +1 PMC votes are cast.
>>>
>>> [ ] +1 Release this package as Apache Spark 1.4.0
>>> [ ] -1 Do not release this package because ...
>>>
>>> To learn more about Apache Spark, please see
>>> http://spark.apache.org/
>>>
>>> == What has changed since RC1 ==
>>> Below is a list of bug fixes that went into this RC:
>>> http://s.apache.org/vN
>>>
>>> == How can I help test this release? ==
>>> If you are a Spark user, you can help us test this release by
>>> taking a Spark 1.3 workload and running on this release candidate,
>>> then reporting any regressions.
>>>
>>> == What justifies a -1 vote for this release? ==
>>> This vote is happening towards the end of the 1.4 QA period,
>>> so -1 votes should only occur for significant regressions from 1.3.1.
>>> Bugs already present in 1.3.X, minor regressions, or bugs related
>>> to new features will not block this release.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
>>>
>>>
>>
>    
>
