Re: Spark SQL 1.3.0 - spark-shell error : HiveMetastoreCatalog.class refers to term cache in package com.google.common which is not available

2015-04-02 Thread Todd Nist
Hi Young,

Sorry for the duplicate post; I wanted to reply to all.

I just downloaded the prebuilt bits from the Apache Spark download site,
started the spark-shell, and got the same error.

I then started the shell as follows:

./bin/spark-shell --master spark://radtech.io:7077 --total-executor-cores 2 \
  --driver-class-path /usr/local/spark/lib/mysql-connector-java-5.1.34-bin.jar \
  --jars $(echo ~/Downloads/apache-hive-0.13.1-bin/lib/*.jar | tr ' ' ',')

This worked, or at least got rid of the following error:

scala> case class MetricTable(path: String, pathElements: String, name: String, value: String)
scala.reflect.internal.Types$TypeError: bad symbolic reference. A signature in
HiveMetastoreCatalog.class refers to term cache in package com.google.common
which is not available. It may be completely missing from the current
classpath, or the version on the classpath might be incompatible with the
version used when compiling HiveMetastoreCatalog.class. That entry seems to
have slain the compiler. Shall I replay your session? I can re-run each line
except the last one. [y/n]

I am still getting the ClassNotFoundException for json_tuple from this
statement, the same as in 1.2.1:

sql("""
SELECT path, name, value, v1.peValue, v1.peName
  FROM metric_table
  lateral view json_tuple(pathElements, 'name', 'value') v1
  as peName, peValue
""").collect.foreach(println(_))


15/04/02 20:50:14 INFO ParseDriver: Parsing command: SELECT path,
name, value, v1.peValue, v1.peName
  FROM metric_table
  lateral view json_tuple(pathElements, 'name', 'value') v1
  as peName, peValue
15/04/02 20:50:14 INFO ParseDriver: Parse Completed
java.lang.ClassNotFoundException: json_tuple
	at scala.tools.nsc.interpreter.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:83)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)

Any ideas on the json_tuple exception?
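
One thing I plan to rule out (just an assumption on my part, not confirmed):
json_tuple is a Hive UDTF, so it only resolves when the query runs through a
HiveContext; a plain SQLContext ends up looking the name up as a class, which
could produce exactly this kind of ClassNotFoundException. A minimal sketch
that is explicit about the context (assuming metric_table was registered on
that same context):

import org.apache.spark.sql.hive.HiveContext

// json_tuple is a Hive UDTF, so run the query through a HiveContext rather
// than whatever context the shell happened to bind `sql` to.
val hc = new HiveContext(sc)
hc.sql("""
  SELECT path, name, value, v1.peValue, v1.peName
    FROM metric_table
    LATERAL VIEW json_tuple(pathElements, 'name', 'value') v1 AS peName, peValue
""").collect.foreach(println)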


I modified the syntax to account for some minor changes in 1.3; the version
posted this morning was from my 1.2.1 test.

import sqlContext.implicits._

case class MetricTable(path: String, pathElements: String, name: String, value: String)

val mt = new MetricTable(
  path = "/DC1/HOST1/",
  pathElements = """[{"node": "DataCenter","value": "DC1"},{"node": "host","value": "HOST1"}]""",
  name = "Memory Usage (%)",
  value = "29.590943279257175")

val rdd1 = sc.makeRDD(List(mt))
val df = rdd1.toDF
df.printSchema
df.show
df.registerTempTable("metric_table")

sql("""
SELECT path, name, value, v1.peValue, v1.peName
  FROM metric_table
  lateral view json_tuple(pathElements, 'name', 'value') v1
  as peName, peValue
""").collect.foreach(println(_))


On Thu, Apr 2, 2015 at 8:21 PM, java8964 <java8...@hotmail.com> wrote:

 Hmm, I just tested my own Spark 1.3.0 build. I have the same problem, but
 I cannot reproduce it on Spark 1.2.1.

 If we check the code change below:

 Spark 1.3 branch

 https://github.com/apache/spark/blob/branch-1.3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala

 vs

 Spark 1.2 branch

 https://github.com/apache/spark/blob/branch-1.2/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala

 You can see that on line 24:

 import com.google.common.cache.{CacheBuilder, CacheLoader, LoadingCache}

 is introduced on 1.3 branch.

 The error basically means that the com.google.common.cache package cannot be
 found on the runtime classpath.

 Either you and I made the same mistake when we built Spark 1.3.0, or there
 is something wrong with the Spark 1.3 pom.xml file.

 Here is how I built the 1.3.0:

 1) Download the spark 1.3.0 source
 2) ./make-distribution.sh --tgz -Dhadoop.version=1.1.1 -Phive -Phive-0.12.0
 -Phive-thriftserver -DskipTests

 Is this only because I built against Hadoop 1.x?

 Yong


 --
 Date: Thu, 2 Apr 2015 13:56:33 -0400
 Subject: Spark SQL 1.3.0 - spark-shell error : HiveMetastoreCatalog.class
 refers to term cache in package com.google.common which is not available
 From: tsind...@gmail.com
 To: user@spark.apache.org


 I was trying a simple test from the spark-shell to see if 1.3.0 would
 address a problem I was having with locating the json_tuple class and got
 the following error:

 scala> import org.apache.spark.sql.hive._
 import org.apache.spark.sql.hive._

 scala> val sqlContext = new HiveContext(sc)
 sqlContext: org.apache.spark.sql.hive.HiveContext =
 org.apache.spark.sql.hive.HiveContext@79c849c7

 scala> import sqlContext._
 import sqlContext._

 scala> case class MetricTable(path: String, pathElements: String, name: String, value: String)
 scala.reflect.internal.Types$TypeError: bad symbolic reference. A signature
 in HiveMetastoreCatalog.class refers to term cache in package
 com.google.common which is not available.
 It may be completely missing from the current classpath, or the version on
 the classpath might be incompatible with the version used when compiling
 HiveMetastoreCatalog.class.
 That entry seems to have slain the compiler.

RE: Spark SQL 1.3.0 - spark-shell error : HiveMetastoreCatalog.class refers to term cache in package com.google.common which is not available

2015-04-02 Thread java8964
Hmm, I just tested my own Spark 1.3.0 build. I have the same problem, but I 
cannot reproduce it on Spark 1.2.1.
If we check the code change below:
Spark 1.3 branch:
https://github.com/apache/spark/blob/branch-1.3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala

vs

Spark 1.2 branch:
https://github.com/apache/spark/blob/branch-1.2/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala
You can see that on line 24:
import com.google.common.cache.{CacheBuilder, CacheLoader, LoadingCache}
is introduced on 1.3 branch.
The error basically means that the com.google.common.cache package cannot be
found on the runtime classpath.
Either you and I made the same mistake when we built Spark 1.3.0, or there is
something wrong with the Spark 1.3 pom.xml file.
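
One quick way to confirm that from the shell (just a probe, not a fix):

// If this throws ClassNotFoundException, the Guava cache classes really are
// missing from the runtime classpath; if it succeeds, it is only the REPL
// compiler's classpath that cannot see them.
Class.forName("com.google.common.cache.CacheBuilder")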
Here is how I built the 1.3.0:
1) Download the Spark 1.3.0 source
2) ./make-distribution.sh --tgz -Dhadoop.version=1.1.1 -Phive -Phive-0.12.0 -Phive-thriftserver -DskipTests
Is this only because I built against Hadoop 1.x?
Yong

Date: Thu, 2 Apr 2015 13:56:33 -0400
Subject: Spark SQL 1.3.0 - spark-shell error : HiveMetastoreCatalog.class 
refers to term cache in package com.google.common which is not available
From: tsind...@gmail.com
To: user@spark.apache.org

I was trying a simple test from the spark-shell to see if 1.3.0 would address a 
problem I was having with locating the json_tuple class and got the following 
error:
scala> import org.apache.spark.sql.hive._
import org.apache.spark.sql.hive._

scala> val sqlContext = new HiveContext(sc)
sqlContext: org.apache.spark.sql.hive.HiveContext = 
org.apache.spark.sql.hive.HiveContext@79c849c7

scala> import sqlContext._
import sqlContext._

scala> case class MetricTable(path: String, pathElements: String, name: String, value: String)
scala.reflect.internal.Types$TypeError: bad symbolic reference. A signature in 
HiveMetastoreCatalog.class refers to term cache
in package com.google.common which is not available.
It may be completely missing from the current classpath, or the version on
the classpath might be incompatible with the version used when compiling 
HiveMetastoreCatalog.class.
That entry seems to have slain the compiler.  Shall I replay
your session? I can re-run each line except the last one.
[y/n]
Abandoning crashed session.

I entered the shell as follows:

./bin/spark-shell --master spark://radtech.io:7077 --total-executor-cores 2 \
  --driver-class-path /usr/local/spark/lib/mysql-connector-java-5.1.34-bin.jar

hive-site.xml looks like this:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>hive.semantic.analyzer.factory.impl</name>
    <value>org.apache.hcatalog.cli.HCatSemanticAnalyzerFactory</value>
  </property>

  <property>
    <name>hive.metastore.sasl.enabled</name>
    <value>false</value>
  </property>

  <property>
    <name>hive.server2.authentication</name>
    <value>NONE</value>
  </property>

  <property>
    <name>hive.server2.enable.doAs</name>
    <value>true</value>
  </property>

  <property>
    <name>hive.warehouse.subdir.inherit.perms</name>
    <value>true</value>
  </property>

  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore_db?createDatabaseIfNotExist=true</value>
    <description>metadata is stored in a MySQL server</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>MySQL JDBC driver class</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>***</value>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value></value>
  </property>
</configuration>

I have downloaded a clean version of 1.3.0 and tried it again, but got the
same error. Is this a known issue, or a configuration issue on my part?

TIA for the assistance.

-Todd
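
PS: to help isolate a config problem from a classpath problem (this is only a
guess at a diagnostic, not a fix), a metastore round-trip that succeeds would
rule out hive-site.xml as the culprit:

import org.apache.spark.sql.hive.HiveContext

val hc = new HiveContext(sc)
// A bad metastore configuration would surface here as a connection or
// authentication error, which is different from the Guava compiler crash.
hc.sql("SHOW TABLES").collect().foreach(println)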