I created a simple Spark Streaming program: it received numbers,
computed averages, and sent the results to Kafka.
It worked perfectly in local mode as well as standalone master/slave mode
across a two-node cluster.
It did not work, however, in yarn-client or yarn-cluster mode.
The job was
When you are starting the thrift server service - are you connecting to it
locally or is this on a remote server when you use beeline and/or Tableau?
On Thu, Oct 30, 2014 at 8:00 AM, Bojan Kostic blood9ra...@gmail.com wrote:
I use beta driver SQL ODBC from Databricks.
QQ - did you download the Spark 1.1 binaries that include the Hadoop jars?
Does this happen if you're using the Spark 1.1 binaries that do not include
the Hadoop jars?
On Wed, Oct 29, 2014 at 11:31 AM, Ron Ayoub ronalday...@live.com wrote:
Apparently Spark does require Hadoop even if you do not
--jars (ADD_JARS) is a special class-loading mechanism for Spark, while
--driver-class-path (SPARK_CLASSPATH) is captured by the startup scripts and
appended to the classpath settings used to start the JVM running the
driver
You can reference
https://www.concur.com/blog/en-us/connect-tableau-to-sparksql
I created https://issues.apache.org/jira/browse/SPARK-3947
On Tue, Oct 14, 2014 at 3:54 AM, Michael Armbrust mich...@databricks.com
wrote:
It's not on the roadmap for 1.2. I'd suggest opening a JIRA.
On Mon, Oct 13, 2014 at 4:28 AM, Pierre B
pierre.borckm...@realimpactanalytics.com wrote:
Hi,
You can merge them into one table by:
sqlContext.table("table_1")
.unionAll(sqlContext.table("table_2"))
.unionAll(sqlContext.table("table_3"))
.registerTempTable("table_all")
Or load them in one call by:
Hi,
Please check Zeppelin, too.
http://zeppelin-project.org
https://github.com/nflabs/zeppelin
It is similar to scala-notebook.
Best,
moon
On Thursday, October 9, 2014, andy petrella andy.petre...@gmail.com wrote:
Sure! I'll post updates as well in the ML :-)
I'm doing it on twitter for now
Hi,
There is a project called Zeppelin.
You can check it out here:
https://github.com/NFLabs/zeppelin
Homepage is here.
http://zeppelin-project.org/
It's a notebook-style tool (like the databricks demo, scala-notebook) with a nice
UI and built-in Spark integration.
It's in active development, so don't
at 10:48 AM, moon soo Lee leemoon...@gmail.com
wrote:
Hi,
There is a project called Zeppelin.
You can check it out here:
https://github.com/NFLabs/zeppelin
Homepage is here.
http://zeppelin-project.org/
It's a notebook-style tool (like the databricks demo, scala-notebook) with a nice
UI, with built
by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException:
Specified key was too long; max key length is 767 bytes
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
Should I use HIVE 0.12.0 instead of HIVE 0.13.1?
Regards
Arthur
On 31 Aug, 2014, at 6:01 am, Denny Lee denny.g
This seems similar to a related Windows issue concerning Python, where
pyspark couldn't find the Python executable because the PYTHONSTARTUP environment
variable wasn't set - by any chance could this be related?
On Wed, Sep 24, 2014 at 7:51 PM, christy 760948...@qq.com wrote:
Hi I have installed standalone on
The registered table is stored within the Spark context itself. To make the
table available for the Thrift server to access, you can save the sc
table into the Hive context so that the Thrift server process can see the
table. If you are using Derby as your metastore, then the
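The flow described above can be sketched in Scala (a sketch only, assuming Spark 1.1 with a shared Hive metastore; the input path and table names are hypothetical):

```scala
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)

// A temp table lives only inside this SparkContext -- the Thrift
// server process cannot see it.
val events = hiveContext.jsonFile("hdfs:///data/events.json")
events.registerTempTable("events_tmp")

// saveAsTable persists the table into the Hive metastore, so a
// Thrift server pointed at the same metastore can query it.
events.saveAsTable("events")
```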
Hi,
I'm trying to make Spark work in a multithreaded Java application.
What I'm trying to do is:
- Create a Single SparkContext
- Create Multiple SparkILoop and SparkIMain
- Inject created SparkContext into SparkIMain interpreter.
A thread is created for every user request and takes a SparkILoop and
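The shared-context part of the setup above can be sketched as follows (a minimal sketch assuming Spark 1.x; the SparkILoop/SparkIMain injection is interpreter-specific and omitted, and the app name, master, and pool size are made up):

```scala
import java.util.concurrent.Executors
import org.apache.spark.{SparkConf, SparkContext}

// A single SparkContext shared by all request threads; submitting
// jobs from multiple threads is supported.
val sc = new SparkContext(
  new SparkConf().setAppName("multi-user-app").setMaster("local[4]"))
val pool = Executors.newFixedThreadPool(4)

for (requestId <- 1 to 4) {
  pool.submit(new Runnable {
    override def run(): Unit = {
      // Each request runs its own job on the shared context.
      val sum = sc.parallelize(1 to 100).reduce(_ + _)
      println(s"request $requestId computed $sum")
    }
  })
}
pool.shutdown()
```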
I’m not sure if I’m completely answering your question here but I’m currently
working (on OSX) with Hadoop 2.5 and I used the Spark 1.1 with Hadoop 2.4
without any issues.
On September 11, 2014 at 18:11:46, Haopu Wang (hw...@qilinsoft.com) wrote:
I see the binary packages include hadoop 1,
registerTempTable you mentioned works on SqlContext instead of HiveContext.
Thanks,
Du
On 9/10/14, 1:21 PM, Denny Lee denny.g@gmail.com wrote:
Actually, when registering the table, it is only available within the sc
context you are running it in. For Spark 1.1, the method name is changed
, but in Spark 1.1.0, there are separate packages for hadoop 2.3 and 2.4.
That implies some difference in Spark depending on the Hadoop version.
From:Denny Lee [mailto:denny.g@gmail.com]
Sent: Friday, September 12, 2014 9:35 AM
To: user@spark.apache.org; Haopu Wang; d...@spark.apache.org
to read from HDFS, you’ll need to build Spark against the
specific HDFS version in your environment.”
Did you try to read a hadoop 2.5.0 file using Spark 1.1 with hadoop 2.4?
Thanks!
From:Denny Lee [mailto:denny.g@gmail.com]
Sent: Friday, September 12, 2014 10:00 AM
To: Patrick
When you re-ran sbt did you clear out the packages first and ensure that
the datanucleus jars were generated within lib_managed? I remembered
having to do that when I was working testing out different configs.
On Thu, Sep 11, 2014 at 10:50 AM, alexandria1101
alexandria.shea...@gmail.com wrote:
Could you provide some context about running this in yarn-cluster mode?
The Thrift server that's included within Spark 1.1 is based on Hive 0.12.
Hive has been able to work against YARN since Hive 0.10. So when you start
the thrift server, provided you copied the hive-site.xml over to the Spark
Actually, when registering the table, it is only available within the sc
context you are running it in. For Spark 1.1, the method name is changed to
registerTempTable to better reflect that.
The Thrift server runs as a separate process, meaning that it cannot
see any of the
your-port
This behavior is inherited from Hive since Spark SQL Thrift server is a variant
of HiveServer2.
On Wed, Sep 3, 2014 at 10:47 PM, Denny Lee denny.g@gmail.com wrote:
When I start the thrift server (on Spark 1.1 RC4) via:
./sbin/start-thriftserver.sh --master spark://hostname:7077
When I start the thrift server (on Spark 1.1 RC4) via:
./sbin/start-thriftserver.sh --master spark://hostname:7077 --driver-class-path
$CLASSPATH
It appears that the thrift server is starting off of localhost as opposed to
hostname. I have set the spark-env.sh to use the hostname, modified the
Hi, I'm developing an application with Spark.
My Java application tries to create the Spark context like this:
// Creating the Spark context
public SparkContext createSparkContext() {
String execUri = System.getenv("SPARK_EXECUTOR_URI");
String[] jars = SparkILoop.getAddedJars();
Oh, I forgot to add the managed libraries and the Hive libraries to the
CLASSPATH. As soon as I did that, we were good to go.
On August 29, 2014 at 22:55:47, Denny Lee (denny.g@gmail.com) wrote:
My issue is similar to the issue as noted
http://mail-archives.apache.org/mod_mbox
Oh, you may be running into an issue with your MySQL setup actually, try running
alter database metastore_db character set latin1
so that way Hive (and the Spark HiveContext) can execute properly against the
metastore.
On August 29, 2014 at 04:39:01, arthur.hk.c...@gmail.com
I’m currently using the Spark 1.1 branch and have been able to get the Thrift
service up and running. The quick questions were whether I should be able to use
the Thrift service to connect to SparkSQL-generated tables and/or Hive tables?
As well, by any chance do we have any documents that
Lee alee...@hotmail.com wrote:
Hopefully there could be some progress on SPARK-2420. It looks like
shading
may be the preferred solution over downgrading.
Any idea when this will happen? Could it happen in Spark 1.1.1 or Spark
1.1.2?
By the way, regarding bin/spark-sql? Is this more
Quick question - is there a handy sample / example of how to use the LDA
algorithm within Spark MLLib?
Thanks!
Denny
Hi,
I've used HDFS 2.3.0-cdh5.0.1, Mesos 0.19.1, and a recompiled Spark 1.0.2.
For security reasons, we run HDFS and Mesos as 'hdfs', which is an account
name and not in the root group, and a non-root user submits Spark jobs on
Mesos. With no-switch_user, a simple job, which only reads data from
We use MapR Hadoop and I have configured mesos-0.18.1 and spark-1.0.1 to work
together on top of the nodes running mapr hadoop. I would like to configure
spark to access files from the mapr filesystem (maprfs://) and I'm starting
with configuring the SPARK_EXECUTOR_URI environment variable in
Apologies but we had placed the settings for downloading the slides to Seattle
Spark Meetup members only - but actually meant to share with everyone. We have
since fixed this and now you can download it. HTH!
On August 14, 2014 at 18:14:35, Denny Lee (denny.g@gmail.com) wrote
For those who were not able to attend the Seattle Spark Meetup - Spark at eBay
- Troubleshooting the Everyday Issues, the slides have been now posted at:
http://files.meetup.com/12063092/SparkMeetupAugust2014Public.pdf.
Enjoy!
Denny
/user/hive/warehouse)
On Thu, Jul 31, 2014 at 8:05 AM, Andrew Lee alee526@ wrote:
Hi All,
It has been awhile, but what I did to make it work was to make sure of the
following:
1. Hive is working when you run Hive CLI and JDBC via Hiveserver2
2. Make sure you have
Hi All,
Not sure if anyone has run into this problem, but it exists in Spark 1.0.0
when you specify the location in conf/spark-defaults.conf for
spark.eventLog.dir hdfs:///user/$USER/spark/logs
to use the $USER env variable.
For example, I'm running the command with user 'test'.
In
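Since spark-defaults.conf is not run through a shell, $USER is taken literally there; one workaround is to resolve the user in application code and set the property on the SparkConf directly (a hedged sketch, assuming Spark 1.0; the app name is hypothetical):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Resolve the user at runtime instead of relying on $USER expansion,
// which spark-defaults.conf does not perform.
val user = sys.env.getOrElse("USER", "unknown")
val conf = new SparkConf()
  .setAppName("event-log-per-user")
  .set("spark.eventLog.enabled", "true")
  .set("spark.eventLog.dir", s"hdfs:///user/$user/spark/logs")
val sc = new SparkContext(conf)
```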
2014-07-28 12:40 GMT-07:00 Andrew Lee alee...@hotmail.com:
Hi All,
Not sure if anyone has run into this problem, but it exists in Spark 1.0.0
when you specify the location in conf/spark-defaults.conf for
spark.eventLog.dir hdfs:///user/$USER/spark/logs
to use the $USER env variable
files explicitly to --jars option and it worked fine.
The "Caused by..." messages were found in the YARN logs actually. I think it might
be useful if I could see them from the console which runs spark-submit. Would
that be possible?
Jianshi
On Sat, Jul 26, 2014 at 7:08 AM, Andrew Lee alee
Hi Jianshi,
Could you provide which HBase version you're using?
By the way, a quick sanity check on whether the Workers can access HBase?
Were you able to manually write one record to HBase with the serialize
function? Hardcode and test it ?
From: jianshi.hu...@gmail.com
Date: Fri, 25 Jul 2014
-cassandra-connector rather than the hadoop back end?
Cheers,
Lee
for Hive-on-Spark now.
On Mon, Jul 21, 2014 at 6:27 PM, Andrew Lee alee...@hotmail.com wrote:
Hive and Hadoop use an older version of the Guava libraries (11.0.1), whereas
Spark's Hive support uses Guava 14.0.1+.
The community isn't willing to downgrade to 11.0.1, which is the current
version
:
Unfortunately, this is a query where we just don't have an efficient
implementation yet. You might try switching the table order.
Here is the JIRA for doing something more efficient:
https://issues.apache.org/jira/browse/SPARK-2212
On Fri, Jul 18, 2014 at 7:05 AM, Pei-Lun Lee pl
Hi All,
Currently, if you are running the Spark HiveContext API with Hive 0.12, it won't
work due to the following 2 libraries, which are not consistent with Hive 0.12
and Hadoop as well. (Hive libs align with Hadoop libs, and as a common
practice, they should be kept consistent to be interoperable.)
:
com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 0,
required: 1
Looks like Spark SQL tried to do a broadcast join, collecting one of the
tables to the master, but it is too large.
How do we explicitly control join behavior like this?
--
Pei-Lun Lee
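For reference, later Spark versions expose a knob for this; a hedged sketch (assuming Spark 1.1+, where the broadcast decision is driven by the spark.sql.autoBroadcastJoinThreshold setting, and -1 disables broadcast joins entirely):

```scala
// Disable automatic broadcast joins so large tables are never
// collected to the driver for broadcasting.
sqlContext.sql("SET spark.sql.autoBroadcastJoinThreshold=-1")
```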
We're coming off a great Seattle Spark Meetup session with Evan Chan
(@evanfchan) Interactive OLAP Queries with @ApacheSpark and #Cassandra
(http://www.slideshare.net/EvanChan2/2014-07olapcassspark) at Whitepages.
Now, we're proud to announce that our next session is Spark at eBay -
, but there is a PR open to fix it:
https://issues.apache.org/jira/browse/SPARK-2446
On Mon, Jul 14, 2014 at 4:17 AM, Pei-Lun Lee pl...@appier.com wrote:
Hi,
I am using spark-sql 1.0.1 to load parquet files generated from method
described in:
https://gist.github.com/massie/7224868
When I
Filed SPARK-2446
2014-07-15 16:17 GMT+08:00 Michael Armbrust mich...@databricks.com:
Oh, maybe not. Please file another JIRA.
On Tue, Jul 15, 2014 at 12:34 AM, Pei-Lun Lee pl...@appier.com wrote:
Hi Michael,
Good to know it is being handled. I tried master branch (9fe693b5) and
got
Sorry, should be SPARK-2489
2014-07-15 19:22 GMT+08:00 Pei-Lun Lee pl...@appier.com:
Filed SPARK-2446
2014-07-15 16:17 GMT+08:00 Michael Armbrust mich...@databricks.com:
Oh, maybe not. Please file another JIRA.
On Tue, Jul 15, 2014 at 12:34 AM, Pei-Lun Lee pl...@appier.com wrote
Hi,
I am using spark-sql 1.0.1 to load parquet files generated from method
described in:
https://gist.github.com/massie/7224868
When I try to submit a select query with columns of type fixed length byte
array, the following error pops up:
14/07/14 11:09:14 INFO scheduler.DAGScheduler: Failed
As mentioned, deprecated in Spark 1.0+.
Try using --driver-class-path:
./bin/spark-shell --driver-class-path yourlib.jar:abc.jar:xyz.jar
Don't use a glob *; specify the JARs one by one, separated by colons.
Date: Wed, 9 Jul 2014 13:45:07 -0700
From: kat...@cs.pitt.edu
Subject: SPARK_CLASSPATH Warning
Ok, I found it on JIRA SPARK-2390:
https://issues.apache.org/jira/browse/SPARK-2390
So it looks like this is a known issue.
From: alee...@hotmail.com
To: user@spark.apache.org
Subject: spark-1.0.0-rc11 2f1dc868 spark-shell not honoring --properties-file
option?
Date: Tue, 8 Jul 2014 15:17:00
Build: Spark 1.0.0 rc11 (git commit tag:
2f1dc868e5714882cf40d2633fb66772baf34789)
Hi All,
When I enabled the spark-defaults.conf for the eventLog, spark-shell broke
while spark-submit works.
I'm trying to create a separate directory per user to keep track with their own
Spark job event
Hi Kudryavtsev,
Here's what I am doing as a common practice and reference. I don't want to call
it best practice, since that requires a lot of customer experience and
feedback, but from a development and operations standpoint, it is helpful to
separate the YARN container logs from the Spark
=hdinsight
2) put this file into d:\winutil\bin
3) add in my test: System.setProperty("hadoop.home.dir", "d:\\winutil\\")
after that test runs
Thank you,
Konstantin Kudryavtsev
On Wed, Jul 2, 2014 at 10:24 PM, Denny Lee denny.g@gmail.com wrote:
You don't actually need it per se - it's just that some
Thanks! will take a look at this later today. HTH!
On Jul 3, 2014, at 11:09 AM, Kostiantyn Kudriavtsev
kudryavtsev.konstan...@gmail.com wrote:
Hi Denny,
just created https://issues.apache.org/jira/browse/SPARK-2356
On Jul 3, 2014, at 7:06 PM, Denny Lee denny.g@gmail.com wrote
By any chance do you have HDP 2.1 installed? you may need to install the utils
and update the env variables per
http://stackoverflow.com/questions/18630019/running-apache-hadoop-2-1-0-on-windows
On Jul 2, 2014, at 10:20 AM, Konstantin Kudryavtsev
kudryavtsev.konstan...@gmail.com wrote:
issue.
On Wed, Jul 2, 2014 at 12:04 PM, Kostiantyn Kudriavtsev
kudryavtsev.konstan...@gmail.com wrote:
No, I don’t
why do I need to have HDP installed? I don’t use Hadoop at all and I’d
like to read data from local filesystem
On Jul 2, 2014, at 9:10 PM, Denny Lee denny.g@gmail.com
Hi Christophe,
Make sure you have 3 slashes in the hdfs scheme.
e.g.
hdfs:///server_name:9000/user/user_name/spark-events
and in the spark-defaults.conf as
well.spark.eventLog.dir=hdfs:///server_name:9000/user/user_name/spark-events
Date: Thu, 19 Jun 2014 11:18:51 +0200
From:
submitted.
Don’t know if that can help.
On Jun 26, 2014, at 6:41 AM, Pei-Lun Lee pl...@appier.com wrote:
Hi,
We have a long running spark application runs on spark 1.0 standalone
server and after it runs several hours the following exception shows up:
14/06/25 23:13:08 ERROR
I checked the source code, and it looks like it was re-added based on JIRA
SPARK-1588, but I don't know if there's any test case associated with this?
SPARK-1588. Restore SPARK_YARN_USER_ENV and SPARK_JAVA_OPTS for YARN.
Sandy Ryza sa...@cloudera.com
2014-04-29 12:54:02 -0700
, 2014 at 9:29 PM, Jeremy Lee
unorthodox.engine...@gmail.com wrote:
I am about to spin up some new clusters, so I may give that a go... any
special instructions for making them work? I assume I use the
--spark-git-repo= option on the spark-ec2 command. Is it as easy as
concatenating your
Hi All,
Has anyone run into the same problem? By looking at the source code in the
official release (rc11), this property setting is false by default;
however, I'm seeing the .sparkStaging folder remain on HDFS, causing it
to fill up the disk pretty fast since SparkContext deploys
Forgot to mention that I am using spark-submit to submit jobs, and a verbose-mode
printout looks like this with the SparkPi example. The .sparkStaging folder
won't be deleted. My thinking is that this should be part of the staging and
should be cleaned up as well when sc gets terminated.
on that issue. Let me know if I can help with testing and whatnot.
--
Jeremy Lee BCompSci(Hons)
The Unorthodox Engineers
a 1.0.1 release soon (this patch being one of the main reasons),
but if you are itching for this sooner, you can just checkout the head
of branch-1.0 and you will be able to use r3.XXX instances.
- Patrick
On Tue, Jun 17, 2014 at 4:17 PM, Jeremy Lee
unorthodox.engine...@gmail.com wrote
-Lun Lee pl...@appier.com wrote:
Hi,
I am using spark 1.0.0 and found in spark sql some queries use GROUP BY
give weird results.
To reproduce, type the following commands in spark-shell connecting to a
standalone server:
case class Foo(k: String, v: Int)
val sqlContext = new
],
[c,270], [4,56], [1,1])
and if I run the same query again, the new result will be correct:
sql("select k, count(*) from foo group by k").collect
res2: Array[org.apache.spark.sql.Row] = Array([b,200], [a,100], [c,300])
Should I file a bug?
--
Pei-Lun Lee
I read it more carefully, and window() might actually work for some other
stuff like logs. (assuming I can have multiple windows with entirely
different attributes on a single stream..)
Thanks for that!
On Sun, Jun 8, 2014 at 11:11 PM, Jeremy Lee unorthodox.engine...@gmail.com
wrote:
Yes
I shut down my first (working) cluster and brought up a fresh one... and
It's been a bit of a horror and I need to sleep now. Should I be worried
about these errors? Or did I just have the old log4j.config tuned so I
didn't see them?
I
14/06/08 16:32:52 ERROR scheduler.JobScheduler: Error
of learning maven, if it means I never have to use sbt
again. Does it mean that?
--
Jeremy Lee BCompSci(Hons)
The Unorthodox Engineers
and the StreamingContext uses the network
to read words, but as I said, nothing comes out.
I tried changing the .print() to .saveAsTextFiles(), and I AM getting a
file, but nothing is in it other than a _temporary subdir.
I'm sure I'm confused here, but not sure where. Help?
--
Jeremy Lee
persistent data for a streaming app?
(Across restarts) And to clean up on termination?
--
Jeremy Lee BCompSci(Hons)
The Unorthodox Engineers
, 2014 at 5:46 PM, Nick Pentreath nick.pentre...@gmail.com
wrote:
Great - well we do hope we hear from you, since the user list is for
interesting success stories and anecdotes, as well as blog posts etc too :)
On Thu, Jun 5, 2014 at 9:40 AM, Jeremy Lee unorthodox.engine...@gmail.com
wrote
!
--
Jeremy Lee BCompSci(Hons)
The Unorthodox Engineers
Nope, sorry, nevermind!
I looked at the source, and it was pretty obvious that it didn't implement
that yet, so I've ripped the classes out and am mutating them into new
receivers right now...
... starting to get the hang of this.
On Fri, Jun 6, 2014 at 1:07 PM, Jeremy Lee unorthodox.engine
, I'm sure I'll get there. But I do understand the
implications of a mixed functional-imperative language with closures and
lambdas. That is serious voodoo.
--
Jeremy Lee BCompSci(Hons)
The Unorthodox Engineers
://search.maven.org/#search%7Cga%7C1%7Ca%3A%22spark-streaming-twitter_2.10%22
The name is spark-streaming-twitter_2.10
On Wed, Jun 4, 2014 at 1:49 PM, Jeremy Lee
unorthodox.engine...@gmail.com wrote:
Man, this has been hard going. Six days, and I finally got a Hello
World
App
--
Jeremy Lee BCompSci(Hons)
The Unorthodox Engineers
if creating Uberjars takes this
long every... single... time...
On Thu, Jun 5, 2014 at 8:52 AM, Jeremy Lee unorthodox.engine...@gmail.com
wrote:
Thanks Patrick!
Uberjars. Cool. I'd actually heard of them. And thanks for the link to the
example! I shall work through that today.
I'm still learning sbt
/SPARK-1990 to track
this.
Matei
On Jun 1, 2014, at 6:14 PM, Jeremy Lee unorthodox.engine...@gmail.com
wrote:
Sort of.. there were two separate issues, but both related to AWS..
I've sorted the confusion about the Master/Worker AMI ... use the version
chosen by the scripts. (and use
.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
--
Jeremy Lee BCompSci(Hons)
The Unorthodox Engineers
Lee BCompSci(Hons)
The Unorthodox Engineers
/10.100.75.70:38485
--
Jeremy Lee BCompSci(Hons)
The Unorthodox Engineers
-a that allows you to give a specific
AMI. This flag is just an internal tool that we use for testing when
we spin new AMI's. Users can't set that to an arbitrary AMI because we
tightly control things like the Java and OS versions, libraries, etc.
On Sun, Jun 1, 2014 at 12:51 AM, Jeremy Lee
12.04 AMI... that
might be a good place to start. But if there is a straightforward way to
make them compatible with 2.6 we should do that.
For r3.large, we can add that to the script. It's a newer type. Any
interest in contributing this?
- Patrick
On May 30, 2014 5:08 AM, Jeremy Lee
to bite
the bullet and start building my own AMI's from scratch... if anyone can
save me from that, I'd be most grateful.
--
Jeremy Lee BCompSci(Hons)
The Unorthodox Engineers
For those who were not able to attend the last Seattle Spark Meetup, we had a
great session by Claudiu Barbura on xPatterns on Spark, Shark, Tachyon, and
Mesos - you can find the slides at:
http://www.slideshare.net/ClaudiuBarbura/seattle-spark-meetup-may-2014.
As well, check out the next
Does anyone know if:
./bin/spark-shell --master yarn
is running yarn-cluster or yarn-client by default?
Based on the source code:
./core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
if (args.deployMode == "cluster" && args.master.startsWith("yarn")) {
args.master = "yarn-cluster"
}
:
if (args.deployMode != "cluster" && args.master.startsWith("yarn")) {
args.master = "yarn-client"
}
2014-05-21 10:57 GMT-07:00 Andrew Lee alee...@hotmail.com:
Does anyone know if:
./bin/spark-shell --master yarn
is running yarn-cluster or yarn-client by default?
Based on the source code:
./core/src
- (512) 286-6075
Andrew Lee ---05/04/2014 09:57:08 PM---Hi Jacob, Taking both concerns into
account, I'm actually thinking about using a separate subnet to
From: Andrew Lee alee...@hotmail.com
To: user@spark.apache.org user@spark.apache.org
Date: 05/04/2014 09:57 PM
Subject
Please check JAVA_HOME. Usually it should point to /usr/java/default on
CentOS/Linux.
or FYI: http://stackoverflow.com/questions/1117398/java-home-directory
Date: Tue, 6 May 2014 00:23:02 -0700
From: sln-1...@163.com
To: u...@spark.incubator.apache.org
Subject: run spark0.9.1 on yarn with
pairs
//set parallelism to 1 to keep the file from being partitioned
sc.makeRDD(kv,1)
.saveAsSequenceFile(path)
Does anyone have any pointers on how to get past this?
Thanks,
--
*Allen Lee*
Software Engineer
MediaCrossing Inc.
.nabble.com/Securing-Spark-s-Network-tp4832p4984.html
[2] http://en.wikipedia.org/wiki/Ephemeral_port
[3]
http://www.cyberciti.biz/tips/linux-increase-outgoing-network-sockets-range.html
Jacob D. Eisinger
IBM Emerging Technologies
jeis...@us.ibm.com - (512) 286-6075
Andrew Lee ---05/02/2014
Hi All,
I encountered this problem when the firewall is enabled between the spark-shell
and the Workers.
When I launch spark-shell in yarn-client mode, I notice that Workers on the
YARN containers are trying to talk to the driver (spark-shell), however, the
firewall is not opened and caused
-0400
Subject: Re: spark-shell driver interacting with Workers in YARN mode -
firewall blocking communication
From: yana.kadiy...@gmail.com
To: user@spark.apache.org
I think what you want to do is set spark.driver.port to a fixed port.
On Fri, May 2, 2014 at 1:52 PM, Andrew Lee alee...@hotmail.com
We’ve had some pretty awesome presentations at the Seattle Spark Meetup - here
are the links to the various slides:
Seattle Spark Meetup KickOff with DataBricks | Introduction to Spark with Matei
Zaharia and Pat McDonough
Learnings from Running Spark at Twitter sessions
Ben Hindman’s Mesos
You may also want to check out Paco Nathan's Introduction to Spark courses:
http://liber118.com/pxn/
On May 1, 2014, at 8:20 AM, Mayur Rustagi mayur.rust...@gmail.com wrote:
Hi Nicholas,
We provide training on spark, hands-on also associated ecosystem.
We gave it recently at a
I’ve been able to get CDH5 up and running on EC2 and according to Cloudera
Manager, Spark is running healthy.
But when I try to run spark-shell, I eventually get the error:
14/04/02 07:18:18 INFO client.AppClient$ClientActor: Connecting to master
spark://ip-172-xxx-xxx-xxx:7077...
14/04/02
If you have any questions on helping to get a Spark Meetup off the ground,
please do not hesitate to ping me (denny.g@gmail.com). I helped jump start
the one here in Seattle (and tangentially have been helping the Vancouver and
Denver ones as well). HTH!
On March 31, 2014 at 12:35:38
Hi All,
I'm getting the following error when I execute start-master.sh which also
invokes spark-class at the end.
Failed to find Spark assembly in /root/spark/assembly/target/scala-2.10/
You need to build Spark with 'sbt/sbt assembly' before running this program.
After digging into the
to the jar itself, so no need for random classpaths.
On Tue, Mar 25, 2014 at 1:47 PM, Andrew Lee alee...@hotmail.com wrote:
Hi All,
I'm getting the following error when I execute start-master.sh which also
invokes spark-class at the end.
Failed to find Spark assembly in /root/spark/assembly