Please ignore this whole thread. It started working out of nowhere, and I'm
not sure what the root cause was. After I restarted the VM, the previous SIFT
code also started working.
On Fri, Jun 5, 2015 at 10:40 PM, Sam Stoelinga sammiest...@gmail.com
wrote:
Thanks Davies. I will file a bug later with the code.
If the bytes that came from sequenceFile() are broken, it's easy to crash a
C library in Python (OpenCV).
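For anyone hitting this later, a minimal defensive sketch, assuming the bytes
arrive as raw encoded image data (the function name is hypothetical):
cv2.imdecode returns None on corrupt input rather than raising, so checking
the result keeps bad records from crashing the native library.

import numpy as np
import cv2

def safe_decode(raw_bytes):
    # Turn the raw bytes into a 1-D uint8 array for OpenCV.
    arr = np.frombuffer(raw_bytes, dtype=np.uint8)
    # imdecode returns None instead of raising on corrupt input.
    img = cv2.imdecode(arr, cv2.IMREAD_COLOR)
    if img is None:
        raise ValueError("could not decode image bytes")
    return img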
On Thu, May 28, 2015 at 8:33 AM, Sam Stoelinga sammiest...@gmail.com
wrote:
Hi sparkers,
I am working on a PySpark application which uses the OpenCV library. It runs
fine when run locally. The core of the function looks like this:
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
sift = cv2.xfeatures2d.SIFT_create()
kp, descriptors = sift.detectAndCompute(gray, None)
return (imgfilename, descriptors)
And corresponding tests.py:
https://gist.github.com/samos123/d383c26f6d47d34d32d6
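For context, a rough sketch of how such a job is typically wired up. This is a
sketch only: the HDFS path and the use of binaryFiles() are assumptions (the
thread also mentions sequenceFile()), and it assumes OpenCV with the
xfeatures2d contrib module installed on every worker.

import numpy as np
import cv2
from pyspark import SparkContext

def extract_sift(pair):
    imgfilename, raw_bytes = pair
    # Decode the raw bytes into an image, then compute SIFT descriptors.
    arr = np.frombuffer(raw_bytes, dtype=np.uint8)
    img = cv2.imdecode(arr, cv2.IMREAD_COLOR)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    sift = cv2.xfeatures2d.SIFT_create()
    kp, descriptors = sift.detectAndCompute(gray, None)
    return (imgfilename, descriptors)

sc = SparkContext(appName="sift-features")
# binaryFiles yields (filename, bytes) pairs, one per image file.
features = sc.binaryFiles("hdfs:///images").map(extract_sift)
print(features.first())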
On Sat, May 30, 2015 at 8:04 PM, Sam Stoelinga sammiest...@gmail.com
wrote:
This is the error message taken from STDERR of the worker log:
https://gist.github.com/samos123/3300191684aee7fc8013
I would like pointers or tips on how to debug this further. It would be nice
to know the reason why the worker crashed.
Thanks,
Sam Stoelinga
org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
Looking forward to hearing you point out my stupidity or provide work-arounds
that could make Spark KMeans work well on large datasets.
Regards,
Sam Stoelinga
PM, Jeetendra Gangele gangele...@gmail.com
wrote:
How are you passing the feature vectors to K-means?
Are they in a 2-D space or a 1-D array?
Did you try using Streaming KMeans?
Will you be able to paste code here?
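For illustration, a minimal sketch of the shape MLlib's KMeans expects,
assuming an existing SparkContext sc and a hypothetical features.txt of
comma-separated rows: the RDD as a whole is the 2-D dataset, and each element
of it is one 1-D feature vector.

from numpy import array
from pyspark.mllib.clustering import KMeans

# One line per sample; each element of the RDD is a 1-D vector.
data = sc.textFile("features.txt")
parsed = data.map(lambda line: array([float(x) for x in line.split(",")]))
model = KMeans.train(parsed, 10, maxIterations=20)
print(model.clusterCenters)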
On 29 April 2015 at 17:23, Sam Stoelinga sammiest...@gmail.com wrote:
Hi Sparkers,
I
Guys, great feedback pointing out my stupidity :D
Rows and columns got intermixed, hence the weird results I was seeing.
Ignore my previous issues; I will reformat my data first.
On Wed, Apr 29, 2015 at 8:47 PM, Sam Stoelinga sammiest...@gmail.com
wrote:
I'm mostly using example code, see here
We want to monitor the Spark master and Spark slaves using monit, but we want
to use the sbin scripts to do so. The scripts start the Spark master and
slave processes detached from themselves, so monit would not know the PID of
the started process to watch. Is this correct? Should we watch the ports
instead?
We are planning to use servers of varying specs (32 GB, 64 GB, 244 GB RAM or
even higher, and varying core counts) for a standalone deployment of Spark,
but we do not know the spec of a server ahead of time, so we need to script
up some logic that will run on the server at boot and automatically set the
worker configuration.
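A minimal boot-time sketch of that idea, assuming Linux and a standalone
deployment; the Spark path, the 90% memory fraction, and appending to
spark-env.sh are all assumptions, not a recommended layout.

import os
import multiprocessing

# Detect the machine's actual resources at boot.
total_mem_gb = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") // (1024 ** 3)
cores = multiprocessing.cpu_count()

# Leave headroom for the OS and the Spark daemons themselves.
worker_mem_gb = max(1, int(total_mem_gb * 0.9))

with open("/opt/spark/conf/spark-env.sh", "a") as f:
    f.write("export SPARK_WORKER_MEMORY=%dg\n" % worker_mem_gb)
    f.write("export SPARK_WORKER_CORES=%d\n" % cores)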
Hi Gerard,
isn't this the same issue as this one?
https://issues.apache.org/jira/browse/MESOS-1688
On Mon, Jan 26, 2015 at 9:17 PM, Gerard Maas gerard.m...@gmail.com wrote:
Hi,
We are observing with certain regularity that our Spark jobs, as a Mesos
framework, are hoarding resources and not releasing them.
if there is a configuration that needs to be tweaked or if
this is expected response time.
Machines have 30 GB of RAM and 4 cores. It seems the CPUs are just getting
pegged, and that is what is taking so long.
Any help on this would be amazing.
Thanks,
--
MAGNE+IC
Sam Flint | Lead Developer, Data Analytics
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:745)
--
MAGNE+IC
Sam Flint | Lead Developer, Data Analytics
Why?
Thanks!
Sam Liu
Hi all,
Having a strange issue that I can't find any previous issues for on the
mailing list or stack overflow.
Frequently we are getting "ACTOR SYSTEM CORRUPTED!! A Dispatcher can't have
less than 0 inhabitants!" with a stack trace, from akka, in the executor
logs, and the executor is marked as
that contains all the data.
On Wed, Nov 19, 2014 at 2:46 PM, Sam Flint sam.fl...@magnetic.com wrote:
Michael,
Thanks for your help. I found wholeTextFiles(), which I can use to
import all files in a directory. I believe this would work if all
the files existed in the same directory.
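A minimal sketch of that approach, assuming an existing SparkContext sc (the
path is a placeholder): wholeTextFiles() returns an RDD of (filename, content)
pairs, one per file in the directory.

pairs = sc.wholeTextFiles("hdfs:///data/input-dir")
# Each element is (fully-qualified path, entire file contents as a string).
for path, content in pairs.take(2):
    print(path, len(content))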
Hi There,
I am new to Spark, and I was wondering: when you have a lot of memory on each
machine of the cluster, is it better to run multiple workers with limited
memory on each machine, or is it better to run a single worker with access
to the majority of the machine's memory? If the answer is it
Sam Liu
Thanks Xiangrui, your suggestion fixed the problem. I will see if I can upgrade
the numpy/Python versions for a permanent fix. My current versions of Python
and numpy are 2.6 and 4.1.9, respectively.
Thanks,
Sam
-Original Message-
From: Xiangrui Meng [mailto:men...@gmail.com]
Sent: Tuesday
Hi,
I modified the example code for logistic regression to compute the
classification error; please see below. However, the code fails when it makes
a call to:
labelsAndPreds.filter(lambda (v, p): v != p).count()
with an error message (something related to numpy or a dot product):
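For reference, a minimal sketch of the surrounding example, following the
classic MLlib logistic regression walkthrough (the data path is the standard
sample file; note that the tuple-unpacking lambda quoted above is Python
2-only syntax, rewritten below). It assumes an existing SparkContext sc.

from pyspark.mllib.classification import LogisticRegressionWithSGD
from pyspark.mllib.regression import LabeledPoint

def parse_point(line):
    values = [float(x) for x in line.split()]
    return LabeledPoint(values[0], values[1:])

data = sc.textFile("data/mllib/sample_svm_data.txt")
parsed = data.map(parse_point).cache()
model = LogisticRegressionWithSGD.train(parsed)

# Compare predicted vs. actual labels and compute the error rate.
labelsAndPreds = parsed.map(lambda p: (p.label, model.predict(p.features)))
trainErr = labelsAndPreds.filter(lambda vp: vp[0] != vp[1]).count() / float(parsed.count())
print("Training error = " + str(trainErr))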
Any idea when they will release it? Also, I'm uncertain what we will need to
do to fix the shell. Will we have to reinstall Spark, or reinstall Hadoop?
(I'm not a devops person, so maybe this question sounds silly.)
I get a very similar stack trace and have no idea what could be causing it
(see below). I've created an SO question:
http://stackoverflow.com/questions/24038908/spark-fails-on-big-jobs-with-java-io-ioexception-filesystem-closed
14/06/02 20:44:04 INFO client.AppClient$ClientActor: Executor updated:
be much appreciated!
Sam
- Original Message -
From: Krishna Sankar ksanka...@gmail.com
To: user@spark.apache.org
Sent: Wednesday, June 4, 2014 8:52:59 AM
Subject: Re: Trouble launching EC2 Cluster with Spark
One reason could be that the keys are in a different region. You need to
create the keys in the same region as the cluster.
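A minimal sketch of creating the key in the right region with boto, which
spark-ec2 itself uses; the region here is an assumption, and the key name is
taken from later in this thread.

import boto.ec2

# EC2 key pairs are region-scoped: a key created in us-east-1 is not
# visible in us-west-2, so create it in the region you launch into.
conn = boto.ec2.connect_to_region("us-west-2")
key = conn.create_key_pair("FinalKey")
key.save(".")  # writes FinalKey.pem into the current directory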
PM, Sam Taylor Steyer sste...@stanford.edu
wrote:
Also, once my friend logged in to his cluster, he received the error
"Permissions 0644 for 'FinalKey.pem' are too open." This sounds like the
other problem described. How do we make the permissions more private?
Thanks very much,
Sam
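On the permissions question above: SSH refuses to use a private key that is
readable by anyone other than its owner, and the usual fix is chmod 400 on
the .pem file. A one-line Python equivalent, using the key file named earlier
in this thread:

import os

# 0o400 = owner read-only, which satisfies SSH's permissions check.
os.chmod("FinalKey.pem", 0o400)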
What we are doing is:
1. Installing Spark 0.9.1 according to the documentation on the website,
along with CDH4 (and another cluster with CDH5) distros of hadoop/hdfs.
2. Building a fat jar of the Spark app with sbt, then trying to run it on the
cluster.
I've also included code snippets, and sbt
Why don't you start by explaining what kind of operation you're running on
Spark that's faster than Hadoop MapReduce? Maybe we could start there. And
yes, this mailing list is very busy; since many people are getting into
Spark, it's hard to answer everyone.
On 21 Apr 2014 20:23, Joe L selme...@yahoo.com
Sounds great François.
On 21 Apr 2014 22:31, François Le Lay f...@spotify.com wrote:
Hi everyone,
This is a quick email to announce the creation of a Spark-NYC Meetup.
We have 2 upcoming events, one at PlaceIQ, another at Spotify where
Reynold Xin (Databricks) and Christopher Johnson
I have this problem too. Eventually the job fails (on the UI) and hangs the
terminal until I press Ctrl+C (logs below).
Now, the Spark docs explain that the heartbeat configuration can be tweaked
to handle GC hangs. I'm wondering if this is symptomatic of pushing the
cluster a little too hard (we
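A minimal sketch of the heartbeat/timeout knobs being referred to, set from a
PySpark driver; the values are assumptions for illustration, not
recommendations.

from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .set("spark.executor.heartbeatInterval", "30s")  # default is 10s
        .set("spark.network.timeout", "300s"))           # default is 120s
sc = SparkContext(conf=conf)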