security testing on spark ?
Hi all, Does anyone know of any effort from the community on security testing spark clusters. I.e. Static source code analysis to find security flaws Penetration testing to identify ways to compromise spark cluster Fuzzing to crash spark Thanks, Judy
RE: Error building Spark on Windows with sbt
I have not had any success building using sbt/sbt on windows. However, I have been able to binary by using maven command directly. From: Richard Eggert [mailto:richard.egg...@gmail.com] Sent: Sunday, October 25, 2015 12:51 PM To: Ted YuCc: User Subject: Re: Error building Spark on Windows with sbt Yes, I know, but it would be nice to be able to test things myself before I push commits. On Sun, Oct 25, 2015 at 3:50 PM, Ted Yu > wrote: If you have a pull request, Jenkins can test your change for you. FYI On Oct 25, 2015, at 12:43 PM, Richard Eggert > wrote: Also, if I run the Maven build on Windows or Linux without setting -DskipTests=true, it hangs indefinitely when it gets to org.apache.spark.JavaAPISuite. It's hard to test patches when the build doesn't work. :-/ On Sun, Oct 25, 2015 at 3:41 PM, Richard Eggert > wrote: By "it works", I mean, "It gets past that particular error". It still fails several minutes later with a different error: java.lang.IllegalStateException: impossible to get artifacts when data has not been loaded. IvyNode = org.scala-lang#scala-library;2.10.3 On Sun, Oct 25, 2015 at 3:38 PM, Richard Eggert > wrote: When I try to start up sbt for the Spark build, or if I try to import it in IntelliJ IDEA as an sbt project, it fails with a "No such file or directory" error when it attempts to "git clone" sbt-pom-reader into .sbt/0.13/staging/some-sha1-hash. If I manually create the expected directory before running sbt or importing into IntelliJ, then it works. Why is it necessary to do this, and what can be done to make it not necessary? Rich -- Rich -- Rich -- Rich
spark thrift server supports timeout?
Hello everyone, Does spark thrift server support timeout? Is there a documentation I can reference for questions like these? I know it support cancels, but not sure about timeout. Thanks, Judy
Get a list of temporary RDD tables via Thrift
Hi, How can I get a list of temporary tables via Thrift? Have used thrift's startWithContext and registered a temp table, but not seeing the temp table/rdd when running show tables. Thanks, Judy
saveAsTable fails on Python with Unresolved plan found
Hello, I am following the tutorial code on sql programming guidehttps://spark.apache.org/docs/1.2.1/sql-programming-guide.html#inferring-the-schema-using-reflection to try out Python on spark 1.2.1. SaveAsTable function works on Scala bur fails on python with Unresolved plan found. Broken Python code: from pyspark.sql import SQLContext, Row sqlContext = SQLContext(sc) lines = sc.textFile(data.txt) parts = lines.map(lambda l: l.split(,)) people = parts.map(lambda p: Row(id=p[0], name=p[1])) schemaPeople = sqlContext.inferSchema(people) schemaPeople.saveAsTable(peopletable) saveAsTable fails with Unresolved plan found. org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved plan found, tree: 'CreateTableAsSelect None, pytable, false, None This scala code works fine: from pyspark.sql import SQLContext, Row sqlContext = SQLContext(sc) lines = sc.textFile(data.txt) parts = lines.map(lambda l: l.split(,)) people = parts.map(lambda p: Row(id=p[0], name=p[1])) schemaPeople = sqlContext.inferSchema(people) schemaPeople.saveAsTable(peopletable) Is this a known issue? Or am I not using Python correctly? Thanks, Judy
RE: saveAsTable fails on Python with Unresolved plan found
SPARK-4825https://issues.apache.org/jira/browse/SPARK-4825 looks like the right bug, but it should've been fixed on 1.2.1. Is a similar fix needed in Python? From: Judy Nash Sent: Thursday, May 7, 2015 7:26 AM To: user@spark.apache.org Subject: saveAsTable fails on Python with Unresolved plan found Hello, I am following the tutorial code on sql programming guidehttps://spark.apache.org/docs/1.2.1/sql-programming-guide.html#inferring-the-schema-using-reflection to try out Python on spark 1.2.1. SaveAsTable function works on Scala bur fails on python with Unresolved plan found. Broken Python code: from pyspark.sql import SQLContext, Row sqlContext = SQLContext(sc) lines = sc.textFile(data.txt) parts = lines.map(lambda l: l.split(,)) people = parts.map(lambda p: Row(id=p[0], name=p[1])) schemaPeople = sqlContext.inferSchema(people) schemaPeople.saveAsTable(peopletable) saveAsTable fails with Unresolved plan found. org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved plan found, tree: 'CreateTableAsSelect None, pytable, false, None This scala code works fine: from pyspark.sql import SQLContext, Row sqlContext = SQLContext(sc) lines = sc.textFile(data.txt) parts = lines.map(lambda l: l.split(,)) people = parts.map(lambda p: Row(id=p[0], name=p[1])) schemaPeople = sqlContext.inferSchema(people) schemaPeople.saveAsTable(peopletable) Is this a known issue? Or am I not using Python correctly? Thanks, Judy
RE: saveAsTable fails on Python with Unresolved plan found
Figured it out. It was because I was using HiveContext instead of SQLContext. FYI in case others saw the same issue. From: Judy Nash Sent: Thursday, May 7, 2015 7:38 AM To: 'user@spark.apache.org' Subject: RE: saveAsTable fails on Python with Unresolved plan found SPARK-4825https://issues.apache.org/jira/browse/SPARK-4825 looks like the right bug, but it should've been fixed on 1.2.1. Is a similar fix needed in Python? From: Judy Nash Sent: Thursday, May 7, 2015 7:26 AM To: user@spark.apache.orgmailto:user@spark.apache.org Subject: saveAsTable fails on Python with Unresolved plan found Hello, I am following the tutorial code on sql programming guidehttps://spark.apache.org/docs/1.2.1/sql-programming-guide.html#inferring-the-schema-using-reflection to try out Python on spark 1.2.1. SaveAsTable function works on Scala bur fails on python with Unresolved plan found. Broken Python code: from pyspark.sql import SQLContext, Row sqlContext = SQLContext(sc) lines = sc.textFile(data.txt) parts = lines.map(lambda l: l.split(,)) people = parts.map(lambda p: Row(id=p[0], name=p[1])) schemaPeople = sqlContext.inferSchema(people) schemaPeople.saveAsTable(peopletable) saveAsTable fails with Unresolved plan found. org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved plan found, tree: 'CreateTableAsSelect None, pytable, false, None This scala code works fine: from pyspark.sql import SQLContext, Row sqlContext = SQLContext(sc) lines = sc.textFile(data.txt) parts = lines.map(lambda l: l.split(,)) people = parts.map(lambda p: Row(id=p[0], name=p[1])) schemaPeople = sqlContext.inferSchema(people) schemaPeople.saveAsTable(peopletable) Is this a known issue? Or am I not using Python correctly? Thanks, Judy
RE: Using 'fair' scheduler mode with thrift server
The expensive query can take all executor slots, but no task occupy the executor permanently. i.e. The second job can possibly to take some resources to execute in-between tasks of the expensive queries. Can the fair scheduler mode help in this case? Or is it possible to setup thrift such that no query is taking all resources. From: Sean Owen [mailto:so...@cloudera.com] Sent: Wednesday, April 1, 2015 12:28 AM To: Asad Khan Cc: user@spark.apache.org Subject: Re: Using 'fair' scheduler mode Does the expensive query take all executor slots? Then there is nothing for any other job to use regardless of scheduling policy. On Mar 31, 2015 9:20 PM, asadrao as...@microsoft.commailto:as...@microsoft.com wrote: Hi, I am using the Spark ‘fair’ scheduler mode. I have noticed that if the first query is a very expensive query (ex: ‘select *’ on a really big data set) than any subsequent query seem to get blocked. I would have expected the second query to run in parallel since I am using the ‘fair’ scheduler mode not the ‘fifo’. I am submitting the query through thrift server. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Using-fair-scheduler-mode-tp22328.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.orgmailto:user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.orgmailto:user-h...@spark.apache.org
Spark SQL does not read from cached table if table is renamed
Hi all, Noticed a bug in my current version of Spark 1.2.1. After a table is cached with cache table table command, query will not read from memory if SQL query renames the table. This query reads from in memory table i.e. select hivesampletable.country from default.hivesampletable group by hivesampletable.country This query with renamed table reads from hive i.e. select table.country from default.hivesampletable table group by table.country Is this a known bug? Most BI tools rename tables to avoid table name collision. Thanks, Judy
Matching Spark application metrics data to App Id
Hi, I want to get telemetry metrics on spark apps activities, such as run time and jvm activities. Using Spark Metrics I am able to get the following sample data point on the an app: type=GAUGE, name=application.SparkSQL::headnode0.1426626495312.runtime_ms, value=414873 How can I match this datapoint to the AppId? (i.e. app-20150317210815-0001) Spark App name is not an unique identifier. 1426626495312 appear to be unique, but I am unable to see how this is related to the AppId. Thanks, Judy
RE: configure number of cached partition in memory on SparkSQL
Thanks Cheng for replying. Meant to say to change number of partitions of a cached table. It doesn’t need to be re-adjusted after caching. To provide more context: What I am seeing on my dataset is that we have a large number of tasks. Since it appears each task is mapped to a partition, I want to see if matching partitions to available core count will make it faster. I’ll give your suggestion a try to see if it will help. Experiment is a great way to learn more about spark internals. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Monday, March 16, 2015 5:41 AM To: Judy Nash; user@spark.apache.org Subject: Re: configure number of cached partition in memory on SparkSQL Hi Judy, In the case of HadoopRDD and NewHadoopRDD, partition number is actually decided by the InputFormat used. And spark.sql.inMemoryColumnarStorage.batchSize is not related to partition number, it controls the in-memory columnar batch size within a single partition. Also, what do you mean by “change the number of partitions after caching the table”? Are you trying to re-cache an already cached table with a different partition number? Currently, I don’t see a super intuitive pure SQL way to set the partition number in this case. Maybe you can try this (assuming table t has a column s which is expected to be sorted): SET spark.sql.shuffle.partitions = 10; CACHE TABLE cached_t AS SELECT * FROM t ORDER BY s; In this way, we introduce a shuffle by sorting a column, and zoom in/out the partition number at the same time. This might not be the best way out there, but it’s the first one that jumped into my head. Cheng On 3/5/15 3:51 AM, Judy Nash wrote: Hi, I am tuning a hive dataset on Spark SQL deployed via thrift server. How can I change the number of partitions created by caching the table on thrift server? I have tried the following but still getting the same number of partitions after caching: Spark.default.parallelism spark.sql.inMemoryColumnarStorage.batchSize Thanks, Judy
RE: spark standalone with multiple executors in one work node
I meant from one app, yes. Was asking this because our previous tuning experiment shows spark-on-yarn runs faster when overloading workers with executors (i.e. if a worker has 4 cores, creating 2 executors each use 4 cores will see a speed boost from 1 executor with 4 cores). I have found an equivalent solution for standalone that have given me a speed boost. Instead of adding more executors, I overloaded SPARK_WORKER_CORES to 2x of CPU cores on the worker. We are seeing better performance due to CPU now has consistent 100% utilization. -Original Message- From: Sean Owen [mailto:so...@cloudera.com] Sent: Thursday, February 26, 2015 2:11 AM To: Judy Nash Cc: user@spark.apache.org Subject: Re: spark standalone with multiple executors in one work node --num-executors is the total number of executors. In YARN there is not quite the same notion of a Spark worker. Of course, one worker has an executor for each running app, so yes, but you mean for one app? it's possible, though not usual, to run multiple executors for one app on one worker. This may be useful if your executor heap size is otherwise getting huge. On Thu, Feb 26, 2015 at 1:58 AM, Judy Nash judyn...@exchange.microsoft.com wrote: Hello, Does spark standalone support running multiple executors in one worker node? It seems yarn has the parameter --num-executors to set number of executors to deploy, but I do not find the equivalent parameter in spark standalone. Thanks, Judy
configure number of cached partition in memory on SparkSQL
Hi, I am tuning a hive dataset on Spark SQL deployed via thrift server. How can I change the number of partitions after caching the table on thrift server? I have tried the following but still getting the same number of partitions after caching: Spark.default.parallelism spark.sql.inMemoryColumnarStorage.batchSize Thanks, Judy
spark standalone with multiple executors in one work node
Hello, Does spark standalone support running multiple executors in one worker node? It seems yarn has the parameter --num-executors to set number of executors to deploy, but I do not find the equivalent parameter in spark standalone. Thanks, Judy
spark slave cannot execute without admin permission on windows
Hi, Is it possible to configure spark to run without admin permission on windows? My current setup run master slave successfully with admin permission. However, if I downgrade permission level from admin to user, SparkPi fails with the following exception on the slave node: Exception in thread main org.apache.spark.SparkException: Job aborted due to s tage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 9, workernode0.jnashsparkcurr2.d10.internal.cloudapp.net) : java.lang.ClassNotFoundException: org.apache.spark.examples.SparkPi$$anonfun$1 at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:270) Upon investigation, it appears that sparkPi jar under spark_home\worker\appname\*.jar does not have execute permission set, causing spark not able to find class. Advice would be very much appreciated. Thanks, Judy
RE: Is the Thrift server right for me?
It should relay the queries to spark (i.e. you shouldn't see any MR job on Hadoop you should see activities on the spark app on headnode UI). Check your hive-site.xml. Are you directing to the hive server 2 port instead of spark thrift port? Their default ports are both 1. From: Andrew Lee [mailto:alee...@hotmail.com] Sent: Wednesday, February 11, 2015 12:00 PM To: sjbrunst; user@spark.apache.org Subject: RE: Is the Thrift server right for me? I have ThriftServer2 up and running, however, I notice that it relays the query to HiveServer2 when I pass the hive-site.xml to it. I'm not sure if this is the expected behavior, but based on what I have up and running, the ThriftServer2 invokes HiveServer2 that results in MapReduce or Tez query. In this case, I could just connect directly to HiveServer2 if Hive is all you need. If you are programmer and want to mash up data from Hive with other tables and data in Spark, then Spark ThriftServer2 seems to be a good integration point at some use case. Please correct me if I misunderstood the purpose of Spark ThriftServer2. Date: Thu, 8 Jan 2015 14:49:00 -0700 From: sjbru...@uwaterloo.camailto:sjbru...@uwaterloo.ca To: user@spark.apache.orgmailto:user@spark.apache.org Subject: Is the Thrift server right for me? I'm building a system that collects data using Spark Streaming, does some processing with it, then saves the data. I want the data to be queried by multiple applications, and it sounds like the Thrift JDBC/ODBC server might be the right tool to handle the queries. However, the documentation for the Thrift server http://spark.apache.org/docs/latest/sql-programming-guide.html#running-the-thrift-jdbcodbc-server seems to be written for Hive users who are moving to Spark. I never used Hive before I started using Spark, so it is not clear to me how best to use this. I've tried putting data into Hive, then serving it with the Thrift server. But I have not been able to update the data in Hive without first shutting down the server. This is a problem because new data is always being streamed in, and so the data must continuously be updated. The system I'm building is supposed to replace a system that stores the data in MongoDB. The dataset has now grown so large that the database index does not fit in memory, which causes major performance problems in MongoDB. If the Thrift server is the right tool for me, how can I set it up for my application? If it is not the right tool, what else can I use? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Is-the-Thrift-server-right-for-me-tp21044.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.orgmailto:user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.orgmailto:user-h...@spark.apache.org
Spark Metrics Servlet for driver and executor
Hi all, Looking at spark metricsServlet. What is the url exposing driver executor json response? Found master and worker successfully, but can't find url that return json for the other 2 sources. Thanks! Judy
RE: spark 1.2 compatibility
Yes. It's compatible with HDP 2.1 -Original Message- From: bhavyateja [mailto:bhavyateja.potin...@gmail.com] Sent: Friday, January 16, 2015 3:17 PM To: user@spark.apache.org Subject: spark 1.2 compatibility Is spark 1.2 is compatibly with HDP 2.1 -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-2-compatibility-tp21197.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
RE: spark 1.2 compatibility
Should clarify on this. I personally have used HDP 2.1 + Spark 1.2 and have not seen a problem. However officially HDP 2.1 + Spark 1.2 is not a supported scenario. -Original Message- From: Judy Nash Sent: Friday, January 16, 2015 5:35 PM To: 'bhavyateja'; user@spark.apache.org Subject: RE: spark 1.2 compatibility Yes. It's compatible with HDP 2.1 -Original Message- From: bhavyateja [mailto:bhavyateja.potin...@gmail.com] Sent: Friday, January 16, 2015 3:17 PM To: user@spark.apache.org Subject: spark 1.2 compatibility Is spark 1.2 is compatibly with HDP 2.1 -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-2-compatibility-tp21197.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
RE: Spark SQL API Doc IsCached as SQL command
Thanks Cheng. Tried it out and saw the InMemoryColumnarTableScan word in the physical plan. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Friday, December 12, 2014 11:37 PM To: Judy Nash; user@spark.apache.org Subject: Re: Spark SQL API Doc IsCached as SQL command There isn’t a SQL statement that directly maps SQLContext.isCached, but you can use EXPLAIN EXTENDED to check whether the underlying physical plan is a InMemoryColumnarTableScan. On 12/13/14 7:14 AM, Judy Nash wrote: Hello, Few questions on Spark SQL: 1) Does Spark SQL support equivalent SQL Query for Scala command: IsCached(table name) ? 2) Is there a documentation spec I can reference for question like this? Closest doc I can find is this one: https://spark.apache.org/docs/1.1.0/sql-programming-guide.html#deploying-in-existing-hive-warehouses Thanks, Judy
Spark SQL API Doc IsCached as SQL command
Hello, Few questions on Spark SQL: 1) Does Spark SQL support equivalent SQL Query for Scala command: IsCached(table name) ? 2) Is there a documentation spec I can reference for question like this? Closest doc I can find is this one: https://spark.apache.org/docs/1.1.0/sql-programming-guide.html#deploying-in-existing-hive-warehouses Thanks, Judy
RE: Spark-SQL JDBC driver
Looks like you are wondering why you cannot see the RDD table you have created via thrift? Based on my own experience with spark 1.1, RDD created directly via Spark SQL (i.e. Spark Shell or Spark-SQL.sh) is not visible on thrift, since thrift has its own session containing its own RDD. Spark SQL experts on the forum can confirm on this though. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Tuesday, December 9, 2014 6:42 AM To: Anas Mosaad Cc: Judy Nash; user@spark.apache.org Subject: Re: Spark-SQL JDBC driver According to the stacktrace, you were still using SQLContext rather than HiveContext. To interact with Hive, HiveContext *must* be used. Please refer to this page http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables On 12/9/14 6:26 PM, Anas Mosaad wrote: Back to the first question, this will mandate that hive is up and running? When I try it, I get the following exception. The documentation says that this method works only on SchemaRDD. I though that countries.saveAsTable did not work for that a reason so I created a tmp that contains the results from the registered temp table. Which I could validate that it's a SchemaRDD as shown below. @Judy, I do really appreciate your kind support and I want to understand and off course don't want to wast your time. If you can direct me the documentation describing this details, this will be great. scala val tmp = sqlContext.sql(select * from countries) tmp: org.apache.spark.sql.SchemaRDD = SchemaRDD[12] at RDD at SchemaRDD.scala:108 == Query Plan == == Physical Plan == PhysicalRDD [COUNTRY_ID#20,COUNTRY_ISO_CODE#21,COUNTRY_NAME#22,COUNTRY_SUBREGION#23,COUNTRY_SUBREGION_ID#24,COUNTRY_REGION#25,COUNTRY_REGION_ID#26,COUNTRY_TOTAL#27,COUNTRY_TOTAL_ID#28,COUNTRY_NAME_HIST#29], MapPartitionsRDD[9] at mapPartitions at ExistingRDD.scala:36 scala tmp.saveAsTable(Countries) org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved plan found, tree: 'CreateTableAsSelect None, Countries, false, None Project [COUNTRY_ID#20,COUNTRY_ISO_CODE#21,COUNTRY_NAME#22,COUNTRY_SUBREGION#23,COUNTRY_SUBREGION_ID#24,COUNTRY_REGION#25,COUNTRY_REGION_ID#26,COUNTRY_TOTAL#27,COUNTRY_TOTAL_ID#28,COUNTRY_NAME_HIST#29] Subquery countries LogicalRDD [COUNTRY_ID#20,COUNTRY_ISO_CODE#21,COUNTRY_NAME#22,COUNTRY_SUBREGION#23,COUNTRY_SUBREGION_ID#24,COUNTRY_REGION#25,COUNTRY_REGION_ID#26,COUNTRY_TOTAL#27,COUNTRY_TOTAL_ID#28,COUNTRY_NAME_HIST#29], MapPartitionsRDD[9] at mapPartitions at ExistingRDD.scala:36 at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:83) at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:78) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144) at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135) at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:78) at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:76) at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61) at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59) at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51) at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:60) at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:34) at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59) at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51) at scala.collection.immutable.List.foreach(List.scala:318) at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51) at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:411) at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:411) at org.apache.spark.sql.SQLContext$QueryExecution.withCachedData$lzycompute(SQLContext.scala:412) at org.apache.spark.sql.SQLContext$QueryExecution.withCachedData(SQLContext.scala:412) at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan$lzycompute(SQLContext.scala:413) at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan(SQLContext.scala:413) at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:418) at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:416) at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:422) at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:422) at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425) at org.apache.spark.sql.SQLContext
RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava
To report back how I ultimately solved this issue and someone else can do: 1) Check each jar class path and make sure the jars are listed in the order of Guava class version (i.e. spark-assembly needs to list before Hadoop 2.4 because spark-assembly has guava 14 and Hadoop 2.4 has guava 11). May require update compute-classpath.sh to get the ordering right. 2) If the other jars uses a higher version, bump spark guava library to higher version. Guava supposedly to be very backward compatible. Hope this helps. -Original Message- From: Marcelo Vanzin [mailto:van...@cloudera.com] Sent: Tuesday, December 2, 2014 11:35 AM To: Judy Nash Cc: Patrick Wendell; Denny Lee; Cheng Lian; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava On Tue, Dec 2, 2014 at 11:22 AM, Judy Nash judyn...@exchange.microsoft.com wrote: Any suggestion on how can user with custom Hadoop jar solve this issue? You'll need to include all the dependencies for that custom Hadoop jar to the classpath. Those will include Guava (which is not included in its original form as part of the Spark dependencies). -Original Message- From: Patrick Wendell [mailto:pwend...@gmail.com] Sent: Sunday, November 30, 2014 11:06 PM To: Judy Nash Cc: Denny Lee; Cheng Lian; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Thanks Judy. While this is not directly caused by a Spark issue, it is likely other users will run into this. This is an unfortunate consequence of the way that we've shaded Guava in this release, we rely on byte code shading of Hadoop itself as well. And if the user has their own Hadoop classes present it can cause issues. On Sun, Nov 30, 2014 at 10:53 PM, Judy Nash judyn...@exchange.microsoft.com wrote: Thanks Patrick and Cheng for the suggestions. The issue was Hadoop common jar was added to a classpath. After I removed Hadoop common jar from both master and slave, I was able to bypass the error. This was caused by a local change, so no impact on the 1.2 release. -Original Message- From: Patrick Wendell [mailto:pwend...@gmail.com] Sent: Wednesday, November 26, 2014 8:17 AM To: Judy Nash Cc: Denny Lee; Cheng Lian; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Just to double check - I looked at our own assembly jar and I confirmed that our Hadoop configuration class does use the correctly shaded version of Guava. My best guess here is that somehow a separate Hadoop library is ending up on the classpath, possible because Spark put it there somehow. tar xvzf spark-assembly-1.3.0-SNAPSHOT-hadoop2.4.0.jar cd org/apache/hadoop/ javap -v Configuration | grep Precond Warning: Binary file Configuration contains org.apache.hadoop.conf.Configuration #497 = Utf8 org/spark-project/guava/common/base/Preconditions #498 = Class #497 // org/spark-project/guava/common/base/Preconditions #502 = Methodref #498.#501// org/spark-project/guava/common/base/Preconditions.checkArgument:(ZL j ava/lang/Object;)V 12: invokestatic #502// Method org/spark-project/guava/common/base/Preconitions.checkArgument:(ZLj a va/lang/Object;)V 50: invokestatic #502// Method org/spark-project/guava/common/base/Preconitions.checkArgument:(ZLj a va/lang/Object;)V On Wed, Nov 26, 2014 at 11:08 AM, Patrick Wendell pwend...@gmail.com wrote: Hi Judy, Are you somehow modifying Spark's classpath to include jars from Hadoop and Hive that you have running on the machine? 
The issue seems to be that you are somehow including a version of Hadoop that references the original guava package. The Hadoop that is bundled in the Spark jars should not do this. - Patrick On Wed, Nov 26, 2014 at 1:45 AM, Judy Nash judyn...@exchange.microsoft.com wrote: Looks like a config issue. I ran spark-pi job and still failing with the same guava error Command ran: .\bin\spark-class.cmd org.apache.spark.deploy.SparkSubmit --class org.apache.spark.examples.SparkPi --master spark://headnodehost:7077 --executor-memory 1G --num-executors 1 .\lib\spark-examples-1.2.1-SNAPSHOT-hadoop2.4.0.jar 100 Had used the same build steps on spark 1.1 and had no issue. From: Denny Lee [mailto:denny.g@gmail.com] Sent: Tuesday, November 25, 2014 5:47 PM To: Judy Nash; Cheng Lian; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava To determine if this is a Windows vs. other configuration, can you just try to call the Spark-class.cmd SparkSubmit without actually referencing the Hadoop or Thrift server classes? On Tue Nov 25 2014 at 5:42:09 PM Judy Nash judyn...@exchange.microsoft.com wrote: I
RE: Spark-SQL JDBC driver
You can use thrift server for this purpose then test it with beeline. See doc: https://spark.apache.org/docs/latest/sql-programming-guide.html#running-the-thrift-jdbc-server From: Anas Mosaad [mailto:anas.mos...@incorta.com] Sent: Monday, December 8, 2014 11:01 AM To: user@spark.apache.org Subject: Spark-SQL JDBC driver Hello Everyone, I'm brand new to spark and was wondering if there's a JDBC driver to access spark-SQL directly. I'm running spark in standalone mode and don't have hadoop in this environment. -- Best Regards/أطيب المنى, Anas Mosaad
monitoring for spark standalone
Hello, Are there ways we can programmatically get health status of master slave nodes, similar to Hadoop Ambari? Wiki seems to suggest there are only web UI or instrumentations (http://spark.apache.org/docs/latest/monitoring.html). Thanks, Judy
RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava
Any suggestion on how can user with custom Hadoop jar solve this issue? -Original Message- From: Patrick Wendell [mailto:pwend...@gmail.com] Sent: Sunday, November 30, 2014 11:06 PM To: Judy Nash Cc: Denny Lee; Cheng Lian; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Thanks Judy. While this is not directly caused by a Spark issue, it is likely other users will run into this. This is an unfortunate consequence of the way that we've shaded Guava in this release, we rely on byte code shading of Hadoop itself as well. And if the user has their own Hadoop classes present it can cause issues. On Sun, Nov 30, 2014 at 10:53 PM, Judy Nash judyn...@exchange.microsoft.com wrote: Thanks Patrick and Cheng for the suggestions. The issue was Hadoop common jar was added to a classpath. After I removed Hadoop common jar from both master and slave, I was able to bypass the error. This was caused by a local change, so no impact on the 1.2 release. -Original Message- From: Patrick Wendell [mailto:pwend...@gmail.com] Sent: Wednesday, November 26, 2014 8:17 AM To: Judy Nash Cc: Denny Lee; Cheng Lian; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Just to double check - I looked at our own assembly jar and I confirmed that our Hadoop configuration class does use the correctly shaded version of Guava. My best guess here is that somehow a separate Hadoop library is ending up on the classpath, possible because Spark put it there somehow. tar xvzf spark-assembly-1.3.0-SNAPSHOT-hadoop2.4.0.jar cd org/apache/hadoop/ javap -v Configuration | grep Precond Warning: Binary file Configuration contains org.apache.hadoop.conf.Configuration #497 = Utf8 org/spark-project/guava/common/base/Preconditions #498 = Class #497 // org/spark-project/guava/common/base/Preconditions #502 = Methodref #498.#501// org/spark-project/guava/common/base/Preconditions.checkArgument:(ZLj ava/lang/Object;)V 12: invokestatic #502// Method org/spark-project/guava/common/base/Preconitions.checkArgument:(ZLja va/lang/Object;)V 50: invokestatic #502// Method org/spark-project/guava/common/base/Preconitions.checkArgument:(ZLja va/lang/Object;)V On Wed, Nov 26, 2014 at 11:08 AM, Patrick Wendell pwend...@gmail.com wrote: Hi Judy, Are you somehow modifying Spark's classpath to include jars from Hadoop and Hive that you have running on the machine? The issue seems to be that you are somehow including a version of Hadoop that references the original guava package. The Hadoop that is bundled in the Spark jars should not do this. - Patrick On Wed, Nov 26, 2014 at 1:45 AM, Judy Nash judyn...@exchange.microsoft.com wrote: Looks like a config issue. I ran spark-pi job and still failing with the same guava error Command ran: .\bin\spark-class.cmd org.apache.spark.deploy.SparkSubmit --class org.apache.spark.examples.SparkPi --master spark://headnodehost:7077 --executor-memory 1G --num-executors 1 .\lib\spark-examples-1.2.1-SNAPSHOT-hadoop2.4.0.jar 100 Had used the same build steps on spark 1.1 and had no issue. From: Denny Lee [mailto:denny.g@gmail.com] Sent: Tuesday, November 25, 2014 5:47 PM To: Judy Nash; Cheng Lian; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava To determine if this is a Windows vs. other configuration, can you just try to call the Spark-class.cmd SparkSubmit without actually referencing the Hadoop or Thrift server classes? 
On Tue Nov 25 2014 at 5:42:09 PM Judy Nash judyn...@exchange.microsoft.com wrote: I traced the code and used the following to call: Spark-class.cmd org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal --hiveconf hive.server2.thrift.port=1 The issue ended up to be much more fundamental however. Spark doesn't work at all in configuration below. When open spark-shell, it fails with the same ClassNotFound error. Now I wonder if this is a windows-only issue or the hive/Hadoop configuration that is having this problem. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Tuesday, November 25, 2014 1:50 AM To: Judy Nash; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Oh so you're using Windows. What command are you using to start the Thrift server then? On 11/25/14 4:25 PM, Judy Nash wrote: Made progress but still blocked. After recompiling the code on cmd instead of PowerShell, now I can see all 5 classes as you mentioned. However I am still seeing the same error as before. Anything else I can check for? From
RE: Unable to compile spark 1.1.0 on windows 8.1
Have you checked out the wiki here? http://spark.apache.org/docs/latest/building-with-maven.html A couple things I did differently from you: 1) I got the bits directly from github (https://github.com/apache/spark/). Use branch 1.1 for spark 1.1 2) execute maven command on cmd (powershell misses libraries sometimes) 3) Increase maven memory per suggested by building with maven wiki Hope this helps. -Original Message- From: Ishwardeep Singh [mailto:ishwardeep.si...@impetus.co.in] Sent: Monday, December 1, 2014 1:50 AM To: u...@spark.incubator.apache.org Subject: RE: Unable to compile spark 1.1.0 on windows 8.1 Hi Judy, Thank you for your response. When I try to compile using maven mvn -Dhadoop.version=1.2.1 -DskipTests clean package I get an error Error: Could not find or load main class . I have maven 3.0.4. And when I run command sbt package I get the same exception as earlier. I have done the following steps: 1. Download spark-1.1.0.tgz from the spark site and unzip the compressed zip to a folder d:\myworkplace\software\spark-1.1.0 2. Then I downloaded sbt-0.13.7.zip and extract it to folder d:\myworkplace\software\sbt 3. Update the PATH environment variable to include d:\myworkplace\software\sbt\bin in the PATH. 4. Navigate to spark folder d:\myworkplace\software\spark-1.1.0 5. Run the command sbt assembly 6. As a side effect of this command a number of libraries are downloaded and I get an initial error that path C:\Users\ishwardeep.singh\.sbt\0.13\staging\ec3aa8f39111944cc5f2\sbt-pom-reader does not exist. 7. I manually create this subfolder ec3aa8f39111944cc5f2\sbt-pom-reader and retry to get the next error as described in my initial error. Is this the correct procedure to compile spark 1.1.0? Please let me know. Hoping to hear from you soon. Regards, ishwardeep -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-compile-spark-1-1-0-on-windows-8-1-tp19996p20075.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
RE: Unable to compile spark 1.1.0 on windows 8.1
I have found the following to work for me on win 8.1: 1) run sbt assembly 2) Use Maven. You can find the maven commands for your build at : docs\building-spark.md -Original Message- From: Ishwardeep Singh [mailto:ishwardeep.si...@impetus.co.in] Sent: Thursday, November 27, 2014 11:31 PM To: u...@spark.incubator.apache.org Subject: Unable to compile spark 1.1.0 on windows 8.1 Hi, I am trying to compile spark 1.1.0 on windows 8.1 but I get the following exception. [info] Compiling 3 Scala sources to D:\myworkplace\software\spark-1.1.0\project\target\scala-2.10\sbt0.13\classes... [error] D:\myworkplace\software\spark-1.1.0\project\SparkBuild.scala:26: object sbt is not a member of package com.typesafe [error] import com.typesafe.sbt.pom.{PomBuild, SbtPomKeys} [error] ^ [error] D:\myworkplace\software\spark-1.1.0\project\SparkBuild.scala:53: not found: type PomBuild [error] object SparkBuild extends PomBuild { [error] ^ [error] D:\myworkplace\software\spark-1.1.0\project\SparkBuild.scala:121: not found: value SbtPomKeys [error] otherResolvers = SbtPomKeys.mvnLocalRepository(dotM2 = Seq(Resolver.file(dotM2, dotM2))), [error]^ [error] D:\myworkplace\software\spark-1.1.0\project\SparkBuild.scala:165: value projectDefinitions is not a member of AnyRef [error] super.projectDefinitions(baseDirectory).map { x = [error] ^ [error] four errors found [error] (plugins/compile:compile) Compilation failed I have also setup scala 2.10. Need help to resolve this issue. Regards, Ishwardeep -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-compile-spark-1-1-0-on-windows-8-1-tp19996.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava
Thanks Patrick and Cheng for the suggestions. The issue was Hadoop common jar was added to a classpath. After I removed Hadoop common jar from both master and slave, I was able to bypass the error. This was caused by a local change, so no impact on the 1.2 release. -Original Message- From: Patrick Wendell [mailto:pwend...@gmail.com] Sent: Wednesday, November 26, 2014 8:17 AM To: Judy Nash Cc: Denny Lee; Cheng Lian; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Just to double check - I looked at our own assembly jar and I confirmed that our Hadoop configuration class does use the correctly shaded version of Guava. My best guess here is that somehow a separate Hadoop library is ending up on the classpath, possible because Spark put it there somehow. tar xvzf spark-assembly-1.3.0-SNAPSHOT-hadoop2.4.0.jar cd org/apache/hadoop/ javap -v Configuration | grep Precond Warning: Binary file Configuration contains org.apache.hadoop.conf.Configuration #497 = Utf8 org/spark-project/guava/common/base/Preconditions #498 = Class #497 // org/spark-project/guava/common/base/Preconditions #502 = Methodref #498.#501// org/spark-project/guava/common/base/Preconditions.checkArgument:(ZLjava/lang/Object;)V 12: invokestatic #502// Method org/spark-project/guava/common/base/Preconitions.checkArgument:(ZLjava/lang/Object;)V 50: invokestatic #502// Method org/spark-project/guava/common/base/Preconitions.checkArgument:(ZLjava/lang/Object;)V On Wed, Nov 26, 2014 at 11:08 AM, Patrick Wendell pwend...@gmail.com wrote: Hi Judy, Are you somehow modifying Spark's classpath to include jars from Hadoop and Hive that you have running on the machine? The issue seems to be that you are somehow including a version of Hadoop that references the original guava package. The Hadoop that is bundled in the Spark jars should not do this. - Patrick On Wed, Nov 26, 2014 at 1:45 AM, Judy Nash judyn...@exchange.microsoft.com wrote: Looks like a config issue. I ran spark-pi job and still failing with the same guava error Command ran: .\bin\spark-class.cmd org.apache.spark.deploy.SparkSubmit --class org.apache.spark.examples.SparkPi --master spark://headnodehost:7077 --executor-memory 1G --num-executors 1 .\lib\spark-examples-1.2.1-SNAPSHOT-hadoop2.4.0.jar 100 Had used the same build steps on spark 1.1 and had no issue. From: Denny Lee [mailto:denny.g@gmail.com] Sent: Tuesday, November 25, 2014 5:47 PM To: Judy Nash; Cheng Lian; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava To determine if this is a Windows vs. other configuration, can you just try to call the Spark-class.cmd SparkSubmit without actually referencing the Hadoop or Thrift server classes? On Tue Nov 25 2014 at 5:42:09 PM Judy Nash judyn...@exchange.microsoft.com wrote: I traced the code and used the following to call: Spark-class.cmd org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal --hiveconf hive.server2.thrift.port=1 The issue ended up to be much more fundamental however. Spark doesn't work at all in configuration below. When open spark-shell, it fails with the same ClassNotFound error. Now I wonder if this is a windows-only issue or the hive/Hadoop configuration that is having this problem. 
From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Tuesday, November 25, 2014 1:50 AM To: Judy Nash; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Oh so you're using Windows. What command are you using to start the Thrift server then? On 11/25/14 4:25 PM, Judy Nash wrote: Made progress but still blocked. After recompiling the code on cmd instead of PowerShell, now I can see all 5 classes as you mentioned. However I am still seeing the same error as before. Anything else I can check for? From: Judy Nash [mailto:judyn...@exchange.microsoft.com] Sent: Monday, November 24, 2014 11:50 PM To: Cheng Lian; u...@spark.incubator.apache.org Subject: RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava This is what I got from jar tf: org/spark-project/guava/common/base/Preconditions.class org/spark-project/guava/common/math/MathPreconditions.class com/clearspring/analytics/util/Preconditions.class parquet/Preconditions.class I seem to have the line that reported missing, but I am missing this file: com/google/inject/internal/util/$Preconditions.class Any suggestion on how to fix this? Very much appreciate the help as I am very new to Spark and open source technologies. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Monday
RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava
Made progress but still blocked. After recompiling the code on cmd instead of PowerShell, now I can see all 5 classes as you mentioned. However I am still seeing the same error as before. Anything else I can check for? From: Judy Nash [mailto:judyn...@exchange.microsoft.com] Sent: Monday, November 24, 2014 11:50 PM To: Cheng Lian; u...@spark.incubator.apache.org Subject: RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava This is what I got from jar tf: org/spark-project/guava/common/base/Preconditions.class org/spark-project/guava/common/math/MathPreconditions.class com/clearspring/analytics/util/Preconditions.class parquet/Preconditions.class I seem to have the line that reported missing, but I am missing this file: com/google/inject/internal/util/$Preconditions.class Any suggestion on how to fix this? Very much appreciate the help as I am very new to Spark and open source technologies. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Monday, November 24, 2014 8:24 PM To: Judy Nash; u...@spark.incubator.apache.orgmailto:u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Hm, I tried exactly the same commit and the build command locally, but couldn’t reproduce this. Usually this kind of errors are caused by classpath misconfiguration. Could you please try this to ensure corresponding Guava classes are included in the assembly jar you built? jar tf assembly/target/scala-2.10/spark-assembly-1.2.1-SNAPSHOT-hadoop2.4.0.jar | grep Preconditions On my machine I got these lines (the first line is the one reported as missing in your case): org/spark-project/guava/common/base/Preconditions.class org/spark-project/guava/common/math/MathPreconditions.class com/clearspring/analytics/util/Preconditions.class parquet/Preconditions.class com/google/inject/internal/util/$Preconditions.class On 11/25/14 6:25 AM, Judy Nash wrote: Thank you Cheng for responding. Here is the commit SHA1 on the 1.2 branch I saw this failure in: commit 6f70e0295572e3037660004797040e026e440dbd Author: zsxwing zsxw...@gmail.commailto:zsxw...@gmail.com Date: Fri Nov 21 00:42:43 2014 -0800 [SPARK-4472][Shell] Print Spark context available as sc. only when SparkContext is created... ... successfully It's weird that printing Spark context available as sc when creating SparkContext unsuccessfully. Let me know if you need anything else. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Friday, November 21, 2014 8:02 PM To: Judy Nash; u...@spark.incubator.apache.orgmailto:u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Hi Judy, could you please provide the commit SHA1 of the version you're using? Thanks! On 11/22/14 11:05 AM, Judy Nash wrote: Hi, Thrift server is failing to start for me on latest spark 1.2 branch. I got the error below when I start thrift server. Exception in thread main java.lang.NoClassDefFoundError: com/google/common/bas e/Preconditions at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur ation.java:314)…. Here is my setup: 1) Latest spark 1.2 branch build 2) Used build command: mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver -DskipTests clean package 3) Added hive-site.xml to \conf 4) Version on the box: Hive 0.13, Hadoop 2.4 Is this a real bug or am I doing something wrong? 
--- Full Stacktrace: Exception in thread main java.lang.NoClassDefFoundError: com/google/common/bas e/Preconditions at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur ation.java:314) at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur ation.java:327) at org.apache.hadoop.conf.Configuration.clinit(Configuration.java:409) at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopU til.scala:82) at org.apache.spark.deploy.SparkHadoopUtil.init(SparkHadoopUtil.scala: 42) at org.apache.spark.deploy.SparkHadoopUtil$.init(SparkHadoopUtil.scala :202) at org.apache.spark.deploy.SparkHadoopUtil$.clinit(SparkHadoopUtil.sca la) at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:1784) at org.apache.spark.storage.BlockManager.init(BlockManager.scala:105) at org.apache.spark.storage.BlockManager.init(BlockManager.scala:180) at org.apache.spark.SparkEnv$.create(SparkEnv.scala:292) at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:159) at org.apache.spark.SparkContext.init(SparkContext.scala:230) at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv. scala:38) at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveTh riftServer2.scala:56) at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThr
RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava
I traced the code and used the following to call: Spark-class.cmd org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal --hiveconf hive.server2.thrift.port=1 The issue ended up to be much more fundamental however. Spark doesn’t work at all in configuration below. When open spark-shell, it fails with the same ClassNotFound error. Now I wonder if this is a windows-only issue or the hive/Hadoop configuration that is having this problem. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Tuesday, November 25, 2014 1:50 AM To: Judy Nash; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Oh so you're using Windows. What command are you using to start the Thrift server then? On 11/25/14 4:25 PM, Judy Nash wrote: Made progress but still blocked. After recompiling the code on cmd instead of PowerShell, now I can see all 5 classes as you mentioned. However I am still seeing the same error as before. Anything else I can check for? From: Judy Nash [mailto:judyn...@exchange.microsoft.com] Sent: Monday, November 24, 2014 11:50 PM To: Cheng Lian; u...@spark.incubator.apache.orgmailto:u...@spark.incubator.apache.org Subject: RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava This is what I got from jar tf: org/spark-project/guava/common/base/Preconditions.class org/spark-project/guava/common/math/MathPreconditions.class com/clearspring/analytics/util/Preconditions.class parquet/Preconditions.class I seem to have the line that reported missing, but I am missing this file: com/google/inject/internal/util/$Preconditions.class Any suggestion on how to fix this? Very much appreciate the help as I am very new to Spark and open source technologies. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Monday, November 24, 2014 8:24 PM To: Judy Nash; u...@spark.incubator.apache.orgmailto:u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Hm, I tried exactly the same commit and the build command locally, but couldn’t reproduce this. Usually this kind of errors are caused by classpath misconfiguration. Could you please try this to ensure corresponding Guava classes are included in the assembly jar you built? jar tf assembly/target/scala-2.10/spark-assembly-1.2.1-SNAPSHOT-hadoop2.4.0.jar | grep Preconditions On my machine I got these lines (the first line is the one reported as missing in your case): org/spark-project/guava/common/base/Preconditions.class org/spark-project/guava/common/math/MathPreconditions.class com/clearspring/analytics/util/Preconditions.class parquet/Preconditions.class com/google/inject/internal/util/$Preconditions.class On 11/25/14 6:25 AM, Judy Nash wrote: Thank you Cheng for responding. Here is the commit SHA1 on the 1.2 branch I saw this failure in: commit 6f70e0295572e3037660004797040e026e440dbd Author: zsxwing zsxw...@gmail.commailto:zsxw...@gmail.com Date: Fri Nov 21 00:42:43 2014 -0800 [SPARK-4472][Shell] Print Spark context available as sc. only when SparkContext is created... ... successfully It's weird that printing Spark context available as sc when creating SparkContext unsuccessfully. Let me know if you need anything else. 
From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Friday, November 21, 2014 8:02 PM To: Judy Nash; u...@spark.incubator.apache.orgmailto:u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Hi Judy, could you please provide the commit SHA1 of the version you're using? Thanks! On 11/22/14 11:05 AM, Judy Nash wrote: Hi, Thrift server is failing to start for me on latest spark 1.2 branch. I got the error below when I start thrift server. Exception in thread main java.lang.NoClassDefFoundError: com/google/common/bas e/Preconditions at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur ation.java:314)…. Here is my setup: 1) Latest spark 1.2 branch build 2) Used build command: mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver -DskipTests clean package 3) Added hive-site.xml to \conf 4) Version on the box: Hive 0.13, Hadoop 2.4 Is this a real bug or am I doing something wrong? --- Full Stacktrace: Exception in thread main java.lang.NoClassDefFoundError: com/google/common/bas e/Preconditions at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur ation.java:314) at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur ation.java:327) at org.apache.hadoop.conf.Configuration.clinit(Configuration.java:409) at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopU til.scala:82) at org.apache.spark.deploy.SparkHadoopUtil.init
RE: beeline via spark thrift doesn't retain cache
Thanks Yanbo. My issue was 1) . I had spark thrift server setup, but it was running against hive instead of Spark SQL due a local change. After I fix this, beeline automatically caches rerun queries + accepts cache table. From: Yanbo Liang [mailto:yanboha...@gmail.com] Sent: Friday, November 21, 2014 12:42 AM To: Judy Nash Cc: u...@spark.incubator.apache.org Subject: Re: beeline via spark thrift doesn't retain cache 1) make sure your beeline client connected to Hiveserver2 of Spark SQL. You can found execution logs of Hiveserver2 in the environment of start-thriftserver.sh. 2) what about your scale of data. If cache with small data, it will take more time to schedule workload between different executors. Look the configuration of spark execution environment. Whether there are enough memory for RDD storage, if not, it will take some time to serialize/deserialize data between memory and disk. 2014-11-21 11:06 GMT+08:00 Judy Nash judyn...@exchange.microsoft.commailto:judyn...@exchange.microsoft.com: Hi friends, I have successfully setup thrift server and execute beeline on top. Beeline can handle select queries just fine, but it cannot seem to do any kind of caching/RDD operations. i.e. 1) Command “cache table” doesn’t work. See error: Error: Error while processing statement: FAILED: ParseException line 1:0 cannot recognize input near 'cache' 'table' 'hivesampletable' (state=42000,code=4) 2) Re-run SQL commands do not have any performance improvements. By comparison, Spark-SQL shell can execute “cache table” command and rerunning SQL command has a huge performance boost. Am I missing something or this is expected when execute through Spark thrift server? Thanks! Judy
RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava
Looks like a config issue. I ran the SparkPi job and it still fails with the same Guava error. Command run: .\bin\spark-class.cmd org.apache.spark.deploy.SparkSubmit --class org.apache.spark.examples.SparkPi --master spark://headnodehost:7077 --executor-memory 1G --num-executors 1 .\lib\spark-examples-1.2.1-SNAPSHOT-hadoop2.4.0.jar 100 I had used the same build steps on spark 1.1 and had no issue. From: Denny Lee [mailto:denny.g@gmail.com] Sent: Tuesday, November 25, 2014 5:47 PM To: Judy Nash; Cheng Lian; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava To determine whether this is a Windows issue or a configuration issue, can you try calling Spark-class.cmd SparkSubmit without actually referencing the Hadoop or Thrift server classes? On Tue Nov 25 2014 at 5:42:09 PM Judy Nash judyn...@exchange.microsoft.com wrote: I traced the code and used the following to call: Spark-class.cmd org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal --hiveconf hive.server2.thrift.port=1 The issue ended up being much more fundamental, however. Spark doesn't work at all in the configuration below. When I open spark-shell, it fails with the same ClassNotFound error. Now I wonder whether this is a Windows-only issue or a problem with the Hive/Hadoop configuration. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Tuesday, November 25, 2014 1:50 AM To: Judy Nash; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Oh, so you're using Windows. What command are you using to start the Thrift server then? On 11/25/14 4:25 PM, Judy Nash wrote: Made progress but still blocked. After recompiling the code on cmd instead of PowerShell, I can now see all 5 classes you mentioned. However, I am still seeing the same error as before. Anything else I can check for? From: Judy Nash [mailto:judyn...@exchange.microsoft.com] Sent: Monday, November 24, 2014 11:50 PM To: Cheng Lian; u...@spark.incubator.apache.org Subject: RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava This is what I got from jar tf: org/spark-project/guava/common/base/Preconditions.class org/spark-project/guava/common/math/MathPreconditions.class com/clearspring/analytics/util/Preconditions.class parquet/Preconditions.class I seem to have the class that was reported missing, but I am missing this file: com/google/inject/internal/util/$Preconditions.class Any suggestion on how to fix this? I very much appreciate the help, as I am very new to Spark and open source technologies. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Monday, November 24, 2014 8:24 PM To: Judy Nash; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Hm, I tried exactly the same commit and build command locally but couldn't reproduce this. This kind of error is usually caused by classpath misconfiguration. Could you please try this to ensure the corresponding Guava classes are included in the assembly jar you built?
jar tf assembly/target/scala-2.10/spark-assembly-1.2.1-SNAPSHOT-hadoop2.4.0.jar | grep Preconditions On my machine I got these lines (the first line is the one reported as missing in your case): org/spark-project/guava/common/base/Preconditions.class org/spark-project/guava/common/math/MathPreconditions.class com/clearspring/analytics/util/Preconditions.class parquet/Preconditions.class com/google/inject/internal/util/$Preconditions.class On 11/25/14 6:25 AM, Judy Nash wrote: Thank you Cheng for responding. Here is the commit SHA1 on the 1.2 branch where I saw this failure: commit 6f70e0295572e3037660004797040e026e440dbd Author: zsxwing zsxw...@gmail.com Date: Fri Nov 21 00:42:43 2014 -0800 [SPARK-4472][Shell] Print "Spark context available as sc." only when SparkContext is created successfully. It's weird that printing "Spark context available as sc" when creating SparkContext unsuccessfully. Let me know if you need anything else. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Friday, November 21, 2014 8:02 PM To: Judy Nash; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Hi Judy, could you please provide the commit SHA1 of the version you're using? Thanks! On 11/22/14 11:05 AM, Judy Nash wrote: Hi, the Thrift server is failing to start for me on the latest spark 1.2 branch. I get the error below when I start the Thrift server: Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/base/Preconditions
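Denny's isolation test amounts to something like the following (a sketch; --version exercises SparkSubmit without constructing a SparkContext, so no Hadoop or Thrift server classes should be touched):

.\bin\spark-class.cmd org.apache.spark.deploy.SparkSubmit --version

If even this fails with the Guava NoClassDefFoundError, the problem lies in how the assembly jar reaches the classpath rather than in the Hive/Thrift configuration.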
RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava
This is what I got from jar tf: org/spark-project/guava/common/base/Preconditions.class org/spark-project/guava/common/math/MathPreconditions.class com/clearspring/analytics/util/Preconditions.class parquet/Preconditions.class I seem to have the class that was reported missing, but I am missing this file: com/google/inject/internal/util/$Preconditions.class Any suggestion on how to fix this? I very much appreciate the help, as I am very new to Spark and open source technologies. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Monday, November 24, 2014 8:24 PM To: Judy Nash; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Hm, I tried exactly the same commit and build command locally but couldn't reproduce this. This kind of error is usually caused by classpath misconfiguration. Could you please try this to ensure the corresponding Guava classes are included in the assembly jar you built? jar tf assembly/target/scala-2.10/spark-assembly-1.2.1-SNAPSHOT-hadoop2.4.0.jar | grep Preconditions On my machine I got these lines (the first line is the one reported as missing in your case): org/spark-project/guava/common/base/Preconditions.class org/spark-project/guava/common/math/MathPreconditions.class com/clearspring/analytics/util/Preconditions.class parquet/Preconditions.class com/google/inject/internal/util/$Preconditions.class On 11/25/14 6:25 AM, Judy Nash wrote: Thank you Cheng for responding. Here is the commit SHA1 on the 1.2 branch where I saw this failure: commit 6f70e0295572e3037660004797040e026e440dbd Author: zsxwing zsxw...@gmail.com Date: Fri Nov 21 00:42:43 2014 -0800 [SPARK-4472][Shell] Print "Spark context available as sc." only when SparkContext is created successfully. It's weird that printing "Spark context available as sc" when creating SparkContext unsuccessfully. Let me know if you need anything else. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Friday, November 21, 2014 8:02 PM To: Judy Nash; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Hi Judy, could you please provide the commit SHA1 of the version you're using? Thanks! On 11/22/14 11:05 AM, Judy Nash wrote: Hi, the Thrift server is failing to start for me on the latest spark 1.2 branch. I get the error below when I start the Thrift server: Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/base/Preconditions at org.apache.hadoop.conf.Configuration$DeprecationDelta.<init>(Configuration.java:314)…. Here is my setup: 1) Latest spark 1.2 branch build 2) Build command used: mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver -DskipTests clean package 3) Added hive-site.xml to \conf 4) Versions on the box: Hive 0.13, Hadoop 2.4 Is this a real bug or am I doing something wrong?
--- Full Stacktrace:
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/base/Preconditions
at org.apache.hadoop.conf.Configuration$DeprecationDelta.<init>(Configuration.java:314)
at org.apache.hadoop.conf.Configuration$DeprecationDelta.<init>(Configuration.java:327)
at org.apache.hadoop.conf.Configuration.<clinit>(Configuration.java:409)
at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:82)
at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:42)
at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:202)
at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:1784)
at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:105)
at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:180)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:292)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:159)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:230)
at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:38)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:56)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:353)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
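As an aside, the commit SHA1 Cheng asks for can be read directly from the checkout that produced the build (standard git commands, assuming the build ran from a git clone):

git rev-parse HEAD
git log -1 --oneline

The second command also shows the one-line commit summary quoted earlier in the thread.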
latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava
Hi, the Thrift server is failing to start for me on the latest spark 1.2 branch. I get the error below when I start the Thrift server: Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/base/Preconditions at org.apache.hadoop.conf.Configuration$DeprecationDelta.<init>(Configuration.java:314) Here is my setup: 1) Latest spark 1.2 branch build 2) Build command used: mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver -DskipTests clean package 3) Added hive-site.xml to \conf 4) Versions on the box: Hive 0.13, Hadoop 2.4 Is this a real bug or am I doing something wrong? --- Full Stacktrace:
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/base/Preconditions
at org.apache.hadoop.conf.Configuration$DeprecationDelta.<init>(Configuration.java:314)
at org.apache.hadoop.conf.Configuration$DeprecationDelta.<init>(Configuration.java:327)
at org.apache.hadoop.conf.Configuration.<clinit>(Configuration.java:409)
at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:82)
at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:42)
at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:202)
at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:1784)
at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:105)
at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:180)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:292)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:159)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:230)
at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:38)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:56)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:353)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.google.common.base.Preconditions
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
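One more way to see where Guava enters the build is Maven's dependency report (a sketch using the same profiles as the build command above; -Dincludes filters by groupId:artifactId):

mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver dependency:tree -Dincludes=com.google.guava:guava

This lists which modules pull Guava in, which helps distinguish a genuinely missing dependency from a shading or classpath problem.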
beeline via spark thrift doesn't retain cache
Hi friends, I have successfully set up the Thrift server and run beeline on top of it. Beeline handles select queries just fine, but it cannot seem to do any kind of caching/RDD operations, i.e.: 1) The "cache table" command doesn't work. See error: Error: Error while processing statement: FAILED: ParseException line 1:0 cannot recognize input near 'cache' 'table' 'hivesampletable' (state=42000,code=4) 2) Re-running SQL commands shows no performance improvement. By comparison, the Spark-SQL shell can execute the "cache table" command, and rerunning a SQL command gives a huge performance boost. Am I missing something, or is this expected when executing through the Spark Thrift server? Thanks! Judy
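For comparison, the working path through the Spark SQL shell that this message describes would look roughly like this (a sketch assuming the same hivesampletable):

./bin/spark-sql
spark-sql> CACHE TABLE hivesampletable;
spark-sql> SELECT count(*) FROM hivesampletable;

The same two statements failing through beeline is what points at the Thrift server endpoint rather than at Spark SQL's caching itself.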