Hi all,
Does anyone know of any effort from the community on security testing Spark
clusters? For example:
Static source code analysis to find security flaws
Penetration testing to identify ways to compromise a Spark cluster
Fuzzing to crash Spark
Thanks,
Judy
I have not had any success building with sbt/sbt on Windows.
However, I have been able to build the binary by using the Maven command directly.
From: Richard Eggert [mailto:richard.egg...@gmail.com]
Sent: Sunday, October 25, 2015 12:51 PM
To: Ted Yu
Cc: User
Hello everyone,
Does the Spark thrift server support timeouts?
Is there documentation I can reference for questions like these?
I know it supports cancel, but I am not sure about timeout.
Thanks,
Judy
Hi,
How can I get a list of temporary tables via Thrift?
I have used Thrift's startWithContext and registered a temp table, but I am not
seeing the temp table/RDD when running show tables.
Thanks,
Judy
Hello,
I am following the tutorial code in the SQL programming guide
(https://spark.apache.org/docs/1.2.1/sql-programming-guide.html#inferring-the-schema-using-reflection)
to try out Python on Spark 1.2.1.
The saveAsTable function works in Scala but fails in Python with "Unresolved
plan found".
SPARK-4825 (https://issues.apache.org/jira/browse/SPARK-4825) looks like the
right bug, but it should have been fixed in 1.2.1.
Is a similar fix needed in Python?
From: Judy Nash
Sent: Thursday, May 7, 2015 7:26 AM
To: user@spark.apache.org
Subject: saveAsTable fails on Python with Unresolved plan
Figured it out. It was because I was using HiveContext instead of SQLContext.
FYI in case others saw the same issue.
From: Judy Nash
Sent: Thursday, May 7, 2015 7:38 AM
To: 'user@spark.apache.org'
Subject: RE: saveAsTable fails on Python with Unresolved plan found
The expensive query can take all executor slots, but no task occupies an
executor permanently.
I.e., the second job may be able to take some resources to execute in between
tasks of the expensive query.
Can the fair scheduler mode help in this case? Or is it possible to setup
thrift such that
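If contention between one long query and shorter ones is the concern, the fair
scheduler can be enabled for the thrift server. A minimal sketch, assuming
Spark 1.x property names from the job scheduling docs and an illustrative pool
file path:

```
# spark-defaults.conf
spark.scheduler.mode             FAIR
spark.scheduler.allocation.file  /path/to/fairscheduler.xml
```

```xml
<!-- fairscheduler.xml: the pool name, weight, and minShare here are illustrative -->
<allocations>
  <pool name="adhoc">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>2</minShare>
  </pool>
</allocations>
```

With FAIR mode, tasks from the short job can be scheduled into slots as they
free up between tasks of the expensive query, rather than waiting for the
whole query to finish.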
Hi all,
Noticed a bug in my current version of Spark 1.2.1.
After a table is cached with the cache table command, queries will not read
from memory if the SQL query renames the table.
This query reads from in memory table
i.e. select hivesampletable.country from default.hivesampletable group by
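The behavior can be illustrated with a pair of queries (table and column names
follow the example above; whether the second, aliased form hits the cache is
exactly the bug being reported):

```sql
CACHE TABLE hivesampletable;

-- Reads from the in-memory columnar cache:
SELECT hivesampletable.country
FROM default.hivesampletable
GROUP BY hivesampletable.country;

-- Renaming (aliasing) the table; in Spark 1.2.1 this appears to fall
-- back to scanning the underlying Hive table instead of the cache:
SELECT t.country
FROM default.hivesampletable t
GROUP BY t.country;
```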
Hi,
I want to get telemetry metrics on Spark app activities, such as run time and
JVM activities.
Using Spark Metrics I am able to get the following sample data point on an
app:
type=GAUGE, name=application.SparkSQL::headnode0.1426626495312.runtime_ms,
value=414873
How can I match this
I want to see if matching partitions to the available core count will make it
faster.
I'll give your suggestion a try to see if it helps. Experimenting is a great
way to learn more about Spark internals.
From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Monday, March 16, 2015 5:41 AM
To: Judy Nash
[mailto:so...@cloudera.com]
Sent: Thursday, February 26, 2015 2:11 AM
To: Judy Nash
Cc: user@spark.apache.org
Subject: Re: spark standalone with multiple executors in one work node
--num-executors is the total number of executors. In YARN there is not quite
the same notion of a Spark worker
Hi,
I am tuning a Hive dataset on Spark SQL deployed via the thrift server.
How can I change the number of partitions after caching the table on the
thrift server?
I have tried the following but still get the same number of partitions after
caching:
spark.default.parallelism
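One thing that may help: for Spark SQL queries the post-shuffle partition
count is governed by spark.sql.shuffle.partitions rather than
spark.default.parallelism, and a cached table keeps whatever partitioning it
had at cache time. A sketch of a beeline session, assuming Spark 1.x property
names and an illustrative table name and partition count:

```sql
-- Set before (re)caching; 64 is an arbitrary example value
SET spark.sql.shuffle.partitions=64;

-- Re-cache so the new partitioning takes effect
UNCACHE TABLE my_table;
CACHE TABLE my_table;
```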
Hello,
Does Spark standalone support running multiple executors in one worker node?
YARN has the parameter --num-executors to set the number of executors to
deploy, but I do not find an equivalent parameter in Spark standalone.
Thanks,
Judy
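For what it's worth, in Spark 1.x standalone a worker launches at most one
executor per application, so the usual workaround is to run several worker
instances per machine. A sketch, assuming the standard spark-env.sh variables
from the standalone docs and illustrative sizes:

```
# spark-env.sh on each machine
SPARK_WORKER_INSTANCES=2   # two worker daemons => up to two executors per app per machine
SPARK_WORKER_CORES=4       # cores offered by each worker instance
SPARK_WORKER_MEMORY=8g     # memory offered by each worker instance
```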
Hi,
Is it possible to configure Spark to run without admin permissions on Windows?
My current setup runs master and slave successfully with admin permissions.
However, if I downgrade the permission level from admin to user, SparkPi fails
with the following exception on the slave node:
Exception in thread
It should relay the queries to Spark (i.e. you shouldn't see any MR jobs on
Hadoop; you should see activity on the Spark app in the headnode UI).
Check your hive-site.xml. Are you pointing to the HiveServer2 port instead of
the Spark thrift port?
Their default ports are both 10000.
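Since both servers read the same property, one way to avoid the collision is
to give the Spark thrift server its own port. A sketch of the hive-site.xml
seen by the Spark thrift server; 10001 is an arbitrary choice to avoid
HiveServer2's default of 10000:

```xml
<property>
  <name>hive.server2.thrift.port</name>
  <value>10001</value>
</property>
```

The client's connection string then needs to point at this port rather than
HiveServer2's.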
From: Andrew
Hi all,
Looking at Spark's MetricsServlet.
What is the URL exposing the driver/executor JSON response?
I found master and worker successfully, but can't find the URL that returns
JSON for the other two sources.
Thanks!
Judy
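For reference, these are the MetricsServlet endpoints I am aware of (paths per
Spark's metrics.properties template; host names below are placeholders and the
ports are the defaults, which may differ in your deployment). Driver metrics
are served from the running application's own UI, which may be why they don't
show up under the master or worker URLs:

```
http://<master>:8080/metrics/master/json        # master source
http://<master>:8080/metrics/applications/json  # applications source
http://<worker>:8081/metrics/json               # worker source
http://<driver>:4040/metrics/json               # driver source, via the running app's UI
```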
Yes. It's compatible with HDP 2.1
-Original Message-
From: bhavyateja [mailto:bhavyateja.potin...@gmail.com]
Sent: Friday, January 16, 2015 3:17 PM
To: user@spark.apache.org
Subject: spark 1.2 compatibility
Is Spark 1.2 compatible with HDP 2.1?
I should clarify on this. I personally have used HDP 2.1 + Spark 1.2 and have
not seen a problem.
However, officially HDP 2.1 + Spark 1.2 is not a supported scenario.
-Original Message-
From: Judy Nash
Sent: Friday, January 16, 2015 5:35 PM
To: 'bhavyateja'; user@spark.apache.org
Thanks Cheng. Tried it out and saw the InMemoryColumnarTableScan node in the
physical plan.
From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Friday, December 12, 2014 11:37 PM
To: Judy Nash; user@spark.apache.org
Subject: Re: Spark SQL API Doc IsCached as SQL command
There isn’t a SQL
Hello,
A few questions on Spark SQL:
1) Does Spark SQL support an equivalent SQL query for the Scala command
isCached(tableName)?
2) Is there a documentation spec I can reference for questions like this?
The closest doc I can find is this one:
SQL experts on the forum can confirm this, though.
From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Tuesday, December 9, 2014 6:42 AM
To: Anas Mosaad
Cc: Judy Nash; user@spark.apache.org
Subject: Re: Spark-SQL JDBC driver
According to the stacktrace, you were still using SQLContext rather
...@cloudera.com]
Sent: Tuesday, December 2, 2014 11:35 AM
To: Judy Nash
Cc: Patrick Wendell; Denny Lee; Cheng Lian; u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on
Guava
On Tue, Dec 2, 2014 at 11:22 AM, Judy Nash judyn...@exchange.microsoft.com
You can use the thrift server for this purpose, then test it with beeline.
See the doc:
https://spark.apache.org/docs/latest/sql-programming-guide.html#running-the-thrift-jdbc-server
From: Anas Mosaad [mailto:anas.mos...@incorta.com]
Sent: Monday, December 8, 2014 11:01 AM
To: user@spark.apache.org
Hello,
Are there ways we can programmatically get the health status of master and
slave nodes, similar to Hadoop Ambari?
The wiki seems to suggest there are only the web UI or instrumentations
(http://spark.apache.org/docs/latest/monitoring.html).
Thanks,
Judy
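One low-tech option is to poll the standalone master's JSON endpoint
(http://<master>:8080/json) and inspect worker states. A minimal sketch in
Python; the payload below is a hand-written sample whose field names are an
assumption based on the master web UI, so verify the shape against your
cluster before relying on it:

```python
import json

# In a real check you would fetch the payload, e.g.:
#   from urllib.request import urlopen
#   payload = urlopen("http://<master>:8080/json").read()

# Hand-written sample shaped like the master's /json response (assumed fields)
sample = """
{
  "url": "spark://headnode0:7077",
  "status": "ALIVE",
  "workers": [
    {"id": "worker-1", "host": "10.0.0.4", "state": "ALIVE", "cores": 4},
    {"id": "worker-2", "host": "10.0.0.5", "state": "DEAD",  "cores": 4}
  ]
}
"""

def unhealthy_workers(payload):
    """Return the ids of workers whose state is not ALIVE."""
    info = json.loads(payload)
    return [w["id"] for w in info.get("workers", []) if w.get("state") != "ALIVE"]

print(unhealthy_workers(sample))  # ['worker-2']
```

A monitoring script could run this on a schedule and alert when the list is
non-empty.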
Any suggestions on how a user with a custom Hadoop jar can solve this issue?
-Original Message-
From: Patrick Wendell [mailto:pwend...@gmail.com]
Sent: Sunday, November 30, 2014 11:06 PM
To: Judy Nash
Cc: Denny Lee; Cheng Lian; u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2
Have you checked out the wiki here?
http://spark.apache.org/docs/latest/building-with-maven.html
A couple of things I did differently from you:
1) I got the bits directly from GitHub (https://github.com/apache/spark/). Use
branch 1.1 for Spark 1.1.
2) Execute the Maven command in cmd (PowerShell misses
I have found the following to work for me on Win 8.1:
1) Run sbt assembly.
2) Use Maven. You can find the Maven commands for your build at:
docs\building-spark.md
-Original Message-
From: Ishwardeep Singh [mailto:ishwardeep.si...@impetus.co.in]
Sent: Thursday, November 27, 2014 11:31
-Original Message-
From: Patrick Wendell [mailto:pwend...@gmail.com]
Sent: Wednesday, November 26, 2014 8:17 AM
To: Judy Nash
Cc: Denny Lee; Cheng Lian; u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on
Guava
Just to double check - I looked at our own
Made progress but still blocked.
After recompiling the code in cmd instead of PowerShell, I can now see all 5
classes as you mentioned.
However I am still seeing the same error as before. Anything else I can check
for?
From: Judy Nash [mailto:judyn...@exchange.microsoft.com]
Sent: Monday
AM
To: Judy Nash; u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on
Guava
Oh so you're using Windows. What command are you using to start the Thrift
server then?
On 11/25/14 4:25 PM, Judy Nash wrote:
Made progress but still blocked.
After
21, 2014 12:42 AM
To: Judy Nash
Cc: u...@spark.incubator.apache.org
Subject: Re: beeline via spark thrift doesn't retain cache
1) Make sure your beeline client is connected to the HiveServer2 of Spark SQL.
You can find the execution logs of HiveServer2 in the environment of
start-thriftserver.sh.
2) what
-examples-1.2.1-SNAPSHOT-hadoop2.4.0.jar 100
I had used the same build steps on Spark 1.1 and had no issues.
From: Denny Lee [mailto:denny.g@gmail.com]
Sent: Tuesday, November 25, 2014 5:47 PM
To: Judy Nash; Cheng Lian; u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail
this file:
com/google/inject/internal/util/$Preconditions.class
Any suggestion on how to fix this?
I would very much appreciate the help, as I am very new to Spark and
open-source technologies.
From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Monday, November 24, 2014 8:24 PM
To: Judy Nash; u
Hi,
The thrift server is failing to start for me on the latest Spark 1.2 branch.
I get the error below when I start the thrift server.
Exception in thread main java.lang.NoClassDefFoundError: com/google/common/base/Preconditions
at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur
Hi friends,
I have successfully set up the thrift server and run beeline on top.
Beeline can handle select queries just fine, but it cannot seem to do any kind
of caching/RDD operations.
i.e.
1) The cache table command doesn't work. See error:
Error: Error while processing statement: FAILED: