Hi there,
In our experiments with Spark, we found that the same Spark application has a
large variance in execution time and sometimes even fails completely. In the
logs, we found this is usually due to tasks being resubmitted after fetch
failures, with log output like the following:
14/03/16 16:40:38 WARN TaskSetManager: Lost TID
Hi, the page https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark
says I need to write here if I want my project to be added there.
At Yandex (www.yandex.com) we are now using Spark for the project Yandex Islands (
Hi,
I gave my Spark job 16 GB of memory and it is running on 8 executors.
The job needs more memory due to ALS requirements (a 20M x 1M matrix).
On each node I have 96 GB of memory and I am using 16 GB of it. I want
to increase the memory, but I am not sure what the right way to do it is.
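For what it's worth, one way to raise executor memory programmatically is via SparkConf; a minimal sketch, assuming `spark.executor.memory` is the relevant property for your deployment (the 64g value is purely illustrative, not a recommendation):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical sketch: bump executor memory via configuration.
// 64g is only an example; on a 96 GB node, leave headroom for the
// OS and any other daemons running alongside Spark.
val conf = new SparkConf()
  .setAppName("als-job")
  .set("spark.executor.memory", "64g")
val sc = new SparkContext(conf)
```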
Are these the right options:
1. If there is a Spark script, just do a Ctrl-C from spark-shell and the
job will be killed properly.
2. For a Spark application, Ctrl-C will also kill the job properly on the
cluster:
Somehow the Ctrl-C option did not work for us...
A similar option works fine for
You should simply use a snapshot built from HEAD of
github.com/apache/spark if you can. The key change is in MLlib, and with
any luck you can just replace that bit. See the PR I referenced.
Sure, with enough memory you can get it to run even with the memory issue,
but it could be hundreds of GB at
There is no good way to kill jobs in Spark yet. The closest are
cancelAllJobs and cancelJobGroup on the SparkContext. I have had bugs using
both. I am still testing them out; typically you would start a different
thread and call these functions on it when you wish to cancel a job.
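A rough sketch of that pattern, assuming the `setJobGroup`/`cancelJobGroup` pair on SparkContext (the group name "my-group" is arbitrary):

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("cancellable"))

// Run the job on its own thread, tagged with a job group.
val worker = new Thread(new Runnable {
  def run(): Unit = {
    sc.setJobGroup("my-group", "long-running job")
    sc.parallelize(1 to 1000000).map(_ * 2).count()
  }
})
worker.start()

// Later, from another thread, cancel everything in that group.
sc.cancelJobGroup("my-group")
```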
Regards
Mayur
On Mar 14, 2014, at 5:52 PM, Michael Allman m...@allman.ms wrote:
I also found that the product and user RDDs were being rebuilt many times
over in my tests, even for tiny data sets. By persisting the RDD returned
from updateFeatures() I was able to avoid a raft of duplicate computations.
Is
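For readers following along, the fix amounts to something like this sketch; the update step is a placeholder, and this is not the real signature of the ALS-internal `updateFeatures`:

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

// Sketch: persist a feature RDD that later iterations reuse, so its
// lineage is not recomputed from scratch on every reference.
def updateAndCache(features: RDD[(Int, Array[Double])]): RDD[(Int, Array[Double])] = {
  val updated = features.mapValues(_.map(_ * 0.9)) // placeholder for the real update
  updated.persist(StorageLevel.MEMORY_AND_DISK)
}
```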
From
http://spark.incubator.apache.org/docs/latest/spark-standalone.html#launching-applications-inside-the-cluster
./bin/spark-class org.apache.spark.deploy.Client kill driverId
does not work / has bugs?
On Sun, Mar 16, 2014 at 1:17 PM, Mayur Rustagi mayur.rust...@gmail.com wrote:
There is a
This is meant to kill the whole driver hosted inside the Master (a new
feature as of 0.9.0).
I assume you are trying to kill a job/task/stage inside Spark rather
than the whole application.
Regards
Mayur
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi
Hi,
I know it is probably not the purpose of Spark, but the syntax is easy and
cool...
I need to run some Spark-like code in memory on a single machine. Any
pointers on how to optimize it to run only on one machine?
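Running Spark on one machine is a supported case: use a `local[N]` master, which needs no cluster at all. A minimal sketch:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// "local[4]" runs Spark in-process with 4 worker threads; no cluster,
// no daemons, everything in a single JVM.
val conf = new SparkConf().setAppName("single-machine").setMaster("local[4]")
val sc = new SparkContext(conf)

val counts = sc.parallelize(Seq("a", "b", "a"))
  .map(w => (w, 1))
  .reduceByKey(_ + _)
  .collect()
```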
--
Eran | CTO
Hi, I am on a project in which I have to take streaming URLs, filter them,
and classify them as benign or suspicious. Now, machine learning and streaming
are two separate things in Apache Spark (AFAIK). My question is: can we apply
online machine learning algorithms on streams?
I am a beginner.
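One pattern that comes up for this is to drive your own incremental update from each micro-batch via `foreachRDD`. A sketch, not an official MLlib API: `updateModel` below is a hypothetical user-supplied online-learning step.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext(new SparkConf().setAppName("online-ml"), Seconds(10))
val urls = ssc.socketTextStream("localhost", 9999)

// Each micro-batch arrives as an RDD; feed it to an online learner.
urls.foreachRDD { rdd =>
  val batch = rdd.collect() // fine for small micro-batches
  // updateModel(batch)     // hypothetical incremental training / scoring step
}

ssc.start()
ssc.awaitTermination()
```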
This is not released yet, but we're planning to cut a 0.9.1 release
very soon (most likely this week). In the meantime, you'll have to
check out branch-0.9 of Spark and publish it locally, then depend on the
snapshot version. Or just wait it out...
On Fri, Mar 14, 2014 at 2:01 PM, Adrian Mocanu
Please follow the instructions at
http://spark.apache.org/docs/latest/index.html and
http://spark.apache.org/docs/latest/quick-start.html to get started on a local
machine.
—
Sent from Mailbox for iPhone
On Sun, Mar 16, 2014 at 11:39 PM, goi cto goi@gmail.com wrote:
Hi,
I know it is
If it’s a driver on the cluster, please open a JIRA issue about this; the
kill command is indeed intended to work.
Matei
On Mar 16, 2014, at 2:35 PM, Mayur Rustagi mayur.rust...@gmail.com wrote:
Are you embedding your driver inside the cluster?
If not, then that command will not kill the
Thanks, I’ve added you:
https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark. Let me know
if you want to change any wording.
Matei
On Mar 16, 2014, at 6:48 AM, Egor Pahomov pahomov.e...@gmail.com wrote:
Hi, page https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark