security testing on spark ?

2015-12-15 Thread Judy Nash
Hi all, Does anyone know of any effort from the community on security testing Spark clusters? For example: 1) static source code analysis to find security flaws, 2) penetration testing to identify ways to compromise a Spark cluster, 3) fuzzing to crash Spark. Thanks, Judy

RE: Error building Spark on Windows with sbt

2015-10-30 Thread Judy Nash
I have not had any success building using sbt/sbt on Windows. However, I have been able to build the binary by using the Maven command directly. From: Richard Eggert [mailto:richard.egg...@gmail.com] Sent: Sunday, October 25, 2015 12:51 PM To: Ted Yu Cc: User

spark thrift server supports timeout?

2015-07-21 Thread Judy Nash
Hello everyone, Does the Spark Thrift server support timeouts? Is there documentation I can reference for questions like these? I know it supports cancel, but I am not sure about timeout. Thanks, Judy

Get a list of temporary RDD tables via Thrift

2015-05-11 Thread Judy Nash
Hi, How can I get a list of temporary tables via Thrift? I have used the Thrift server's startWithContext and registered a temp table, but I am not seeing the temp table/RDD when running show tables. Thanks, Judy

saveAsTable fails on Python with Unresolved plan found

2015-05-07 Thread Judy Nash
Hello, I am following the tutorial code in the SQL programming guide (https://spark.apache.org/docs/1.2.1/sql-programming-guide.html#inferring-the-schema-using-reflection) to try out Python on Spark 1.2.1. The saveAsTable function works on Scala but fails on Python with Unresolved plan found. Broken

RE: saveAsTable fails on Python with Unresolved plan found

2015-05-07 Thread Judy Nash
SPARK-4825 (https://issues.apache.org/jira/browse/SPARK-4825) looks like the right bug, but it should've been fixed on 1.2.1. Is a similar fix needed in Python? From: Judy Nash Sent: Thursday, May 7, 2015 7:26 AM To: user@spark.apache.org Subject: saveAsTable fails on Python with Unresolved plan

RE: saveAsTable fails on Python with Unresolved plan found

2015-05-07 Thread Judy Nash
Figured it out. It was because I was using HiveContext instead of SQLContext. FYI in case others saw the same issue. From: Judy Nash Sent: Thursday, May 7, 2015 7:38 AM To: 'user@spark.apache.org' Subject: RE: saveAsTable fails on Python with Unresolved plan found SPARK-4825https

RE: Using 'fair' scheduler mode with thrift server

2015-04-01 Thread Judy Nash
The expensive query can take all executor slots, but no task occupies an executor permanently, i.e. the second job can possibly take some resources to execute in between tasks of the expensive query. Can the fair scheduler mode help in this case? Or is it possible to set up Thrift such that

Spark SQL does not read from cached table if table is renamed

2015-04-01 Thread Judy Nash
Hi all, I noticed a bug in my current version of Spark 1.2.1. After a table is cached with the cache table command, a query will not read from memory if the SQL query renames the table. This query reads from the in-memory table: select hivesampletable.country from default.hivesampletable group by

Matching Spark application metrics data to App Id

2015-03-20 Thread Judy Nash
Hi, I want to get telemetry metrics on Spark app activities, such as run time and JVM activities. Using Spark Metrics I am able to get the following sample data point on an app: type=GAUGE, name=application.SparkSQL::headnode0.1426626495312.runtime_ms, value=414873 How can I match this
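A minimal sketch of how the sample metric name above could be split into its parts, in Python. It assumes the layout is application.<app name>.<number>.<metric>, inferred only from the single data point quoted in the message, and it assumes the number is an epoch timestamp in milliseconds. The function and field names are hypothetical, and the sketch does not by itself resolve the name to a Spark application ID.

```python
from typing import NamedTuple


class AppMetric(NamedTuple):
    app_name: str    # e.g. "SparkSQL::headnode0"
    created_ms: int  # assumed: epoch milliseconds embedded in the name
    metric: str      # e.g. "runtime_ms"


def parse_app_metric(name: str) -> AppMetric:
    """Split 'application.<app name>.<number>.<metric>' into its parts.

    The layout is an assumption based on one sample data point.
    """
    prefix = "application."
    if not name.startswith(prefix):
        raise ValueError("not an application metric: %r" % name)
    # Split from the right so that dots inside the app name are preserved.
    parts = name[len(prefix):].rsplit(".", 2)
    if len(parts) != 3:
        raise ValueError("unexpected metric name layout: %r" % name)
    app_name, created, metric = parts
    return AppMetric(app_name, int(created), metric)


sample = "application.SparkSQL::headnode0.1426626495312.runtime_ms"
print(parse_app_metric(sample))
```

The parsed app name and number can then be compared by hand against the application list on the master UI, which is one possible way to narrow down which app a data point belongs to.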

RE: configure number of cached partition in memory on SparkSQL

2015-03-19 Thread Judy Nash
want to see if matching partitions to available core count will make it faster. I’ll give your suggestion a try to see if it will help. Experimenting is a great way to learn more about Spark internals. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Monday, March 16, 2015 5:41 AM To: Judy Nash

RE: spark standalone with multiple executors in one work node

2015-03-05 Thread Judy Nash
[mailto:so...@cloudera.com] Sent: Thursday, February 26, 2015 2:11 AM To: Judy Nash Cc: user@spark.apache.org Subject: Re: spark standalone with multiple executors in one work node --num-executors is the total number of executors. In YARN there is not quite the same notion of a Spark worker

configure number of cached partition in memory on SparkSQL

2015-03-04 Thread Judy Nash
Hi, I am tuning a Hive dataset on Spark SQL deployed via the Thrift server. How can I change the number of partitions after caching the table on the Thrift server? I have tried the following but am still getting the same number of partitions after caching: spark.default.parallelism

spark standalone with multiple executors in one work node

2015-02-25 Thread Judy Nash
Hello, Does spark standalone support running multiple executors in one worker node? It seems yarn has the parameter --num-executors to set number of executors to deploy, but I do not find the equivalent parameter in spark standalone. Thanks, Judy

spark slave cannot execute without admin permission on windows

2015-02-18 Thread Judy Nash
Hi, Is it possible to configure Spark to run without admin permission on Windows? My current setup runs master and slave successfully with admin permission. However, if I downgrade the permission level from admin to user, SparkPi fails with the following exception on the slave node: Exception in thread

RE: Is the Thrift server right for me?

2015-02-11 Thread Judy Nash
It should relay the queries to Spark (i.e. you shouldn't see any MR job on Hadoop; you should see activity on the Spark app on the headnode UI). Check your hive-site.xml. Are you directing to the Hive server 2 port instead of the Spark Thrift port? Their default ports are both 1. From: Andrew

Spark Metrics Servlet for driver and executor

2015-02-05 Thread Judy Nash
Hi all, Looking at the Spark MetricsServlet. What is the URL exposing the driver and executor JSON response? I found master and worker successfully, but can't find a URL that returns JSON for the other 2 sources. Thanks! Judy

RE: spark 1.2 compatibility

2015-01-16 Thread Judy Nash
Yes. It's compatible with HDP 2.1 -Original Message- From: bhavyateja [mailto:bhavyateja.potin...@gmail.com] Sent: Friday, January 16, 2015 3:17 PM To: user@spark.apache.org Subject: spark 1.2 compatibility Is spark 1.2 compatible with HDP 2.1 -- View this message in context:

RE: spark 1.2 compatibility

2015-01-16 Thread Judy Nash
Should clarify on this. I personally have used HDP 2.1 + Spark 1.2 and have not seen a problem. However officially HDP 2.1 + Spark 1.2 is not a supported scenario. -Original Message- From: Judy Nash Sent: Friday, January 16, 2015 5:35 PM To: 'bhavyateja'; user@spark.apache.org

RE: Spark SQL API Doc IsCached as SQL command

2014-12-16 Thread Judy Nash
Thanks Cheng. Tried it out and saw the InMemoryColumnarTableScan word in the physical plan. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Friday, December 12, 2014 11:37 PM To: Judy Nash; user@spark.apache.org Subject: Re: Spark SQL API Doc IsCached as SQL command There isn’t a SQL

Spark SQL API Doc IsCached as SQL command

2014-12-12 Thread Judy Nash
Hello, A few questions on Spark SQL: 1) Does Spark SQL support an equivalent SQL query for the Scala command IsCached(table name)? 2) Is there a documentation spec I can reference for questions like this? Closest doc I can find is this one:

RE: Spark-SQL JDBC driver

2014-12-10 Thread Judy Nash
SQL experts on the forum can confirm on this though. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Tuesday, December 9, 2014 6:42 AM To: Anas Mosaad Cc: Judy Nash; user@spark.apache.org Subject: Re: Spark-SQL JDBC driver According to the stacktrace, you were still using SQLContext rather

RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-12-09 Thread Judy Nash
...@cloudera.com] Sent: Tuesday, December 2, 2014 11:35 AM To: Judy Nash Cc: Patrick Wendell; Denny Lee; Cheng Lian; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava On Tue, Dec 2, 2014 at 11:22 AM, Judy Nash judyn...@exchange.microsoft.com

RE: Spark-SQL JDBC driver

2014-12-08 Thread Judy Nash
You can use thrift server for this purpose then test it with beeline. See doc: https://spark.apache.org/docs/latest/sql-programming-guide.html#running-the-thrift-jdbc-server From: Anas Mosaad [mailto:anas.mos...@incorta.com] Sent: Monday, December 8, 2014 11:01 AM To: user@spark.apache.org

monitoring for spark standalone

2014-12-07 Thread Judy Nash
Hello, Are there ways we can programmatically get the health status of master and slave nodes, similar to Hadoop Ambari? The wiki seems to suggest there is only the web UI or instrumentation (http://spark.apache.org/docs/latest/monitoring.html). Thanks, Judy

RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-12-02 Thread Judy Nash
Any suggestions on how a user with a custom Hadoop jar can solve this issue? -Original Message- From: Patrick Wendell [mailto:pwend...@gmail.com] Sent: Sunday, November 30, 2014 11:06 PM To: Judy Nash Cc: Denny Lee; Cheng Lian; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2

RE: Unable to compile spark 1.1.0 on windows 8.1

2014-12-01 Thread Judy Nash
Have you checked out the wiki here? http://spark.apache.org/docs/latest/building-with-maven.html A couple things I did differently from you: 1) I got the bits directly from github (https://github.com/apache/spark/). Use branch 1.1 for spark 1.1 2) execute maven command on cmd (powershell misses

RE: Unable to compile spark 1.1.0 on windows 8.1

2014-11-30 Thread Judy Nash
I have found the following to work for me on win 8.1: 1) run sbt assembly 2) Use Maven. You can find the maven commands for your build at : docs\building-spark.md -Original Message- From: Ishwardeep Singh [mailto:ishwardeep.si...@impetus.co.in] Sent: Thursday, November 27, 2014 11:31

RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-11-30 Thread Judy Nash
- From: Patrick Wendell [mailto:pwend...@gmail.com] Sent: Wednesday, November 26, 2014 8:17 AM To: Judy Nash Cc: Denny Lee; Cheng Lian; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Just to double check - I looked at our own

RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-11-25 Thread Judy Nash
Made progress but still blocked. After recompiling the code on cmd instead of PowerShell, now I can see all 5 classes as you mentioned. However I am still seeing the same error as before. Anything else I can check for? From: Judy Nash [mailto:judyn...@exchange.microsoft.com] Sent: Monday

RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-11-25 Thread Judy Nash
AM To: Judy Nash; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava Oh so you're using Windows. What command are you using to start the Thrift server then? On 11/25/14 4:25 PM, Judy Nash wrote: Made progress but still blocked. After

RE: beeline via spark thrift doesn't retain cache

2014-11-25 Thread Judy Nash
21, 2014 12:42 AM To: Judy Nash Cc: u...@spark.incubator.apache.org Subject: Re: beeline via spark thrift doesn't retain cache 1) make sure your beeline client connected to Hiveserver2 of Spark SQL. You can find execution logs of Hiveserver2 in the environment of start-thriftserver.sh. 2) what

RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-11-25 Thread Judy Nash
-examples-1.2.1-SNAPSHOT-hadoop2.4.0.jar 100 Had used the same build steps on spark 1.1 and had no issue. From: Denny Lee [mailto:denny.g@gmail.com] Sent: Tuesday, November 25, 2014 5:47 PM To: Judy Nash; Cheng Lian; u...@spark.incubator.apache.org Subject: Re: latest Spark 1.2 thrift server fail

RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-11-24 Thread Judy Nash
this file: com/google/inject/internal/util/$Preconditions.class Any suggestion on how to fix this? Very much appreciate the help as I am very new to Spark and open source technologies. From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Monday, November 24, 2014 8:24 PM To: Judy Nash; u

latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-11-21 Thread Judy Nash
Hi, Thrift server is failing to start for me on latest spark 1.2 branch. I got the error below when I start thrift server. Exception in thread main java.lang.NoClassDefFoundError: com/google/common/base/Preconditions at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur
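One way to investigate a NoClassDefFoundError like the one above is to check which Preconditions classes the assembly jar actually contains. The following is a minimal sketch in Python using only the standard zipfile module; the function names are hypothetical and the jar path is left to the reader, since the thread does not give the full path.

```python
import zipfile


def find_entries(jar, suffix):
    """Return the sorted jar entries whose names end with `suffix`.

    `jar` may be a path to a .jar file or an open binary file object.
    """
    with zipfile.ZipFile(jar) as archive:
        return sorted(n for n in archive.namelist() if n.endswith(suffix))


def has_class(jar, class_name):
    """Check whether a fully qualified class is present in the jar."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar) as archive:
        return entry in archive.namelist()
```

For example, has_class(path_to_assembly_jar, "com.google.common.base.Preconditions") reports whether the class named in the exception is packaged, and find_entries(path_to_assembly_jar, "Preconditions.class") also lists relocated copies such as com/google/inject/internal/util/$Preconditions.class.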

beeline via spark thrift doesn't retain cache

2014-11-20 Thread Judy Nash
Hi friends, I have successfully set up the Thrift server and executed beeline on top of it. Beeline can handle select queries just fine, but it cannot seem to do any kind of caching/RDD operations, i.e. 1) Command cache table doesn't work. See error: Error: Error while processing statement: FAILED: