Hi Lucas,
That did the trick. I just had to change
JavaPairRDD<ByteBuffer, SortedMap<ByteBuffer, IColumn>> to
JavaPairRDD<ByteBuffer, ? extends SortedMap<ByteBuffer, IColumn>>.
Thanks for the help.
Regards,
Pulasthi
On Thu, Dec 5, 2013 at 10:40 AM, Lucas Fernandes Brunialti
lbrunia...@igcorp.com.br
Thanks a lot Evan...
On Wed, Dec 4, 2013 at 8:31 PM, Evan R. Sparks evan.spa...@gmail.com wrote:
Ah, actually - I just remembered that the user and product features of the
model are RDDs, so you might be better off saving those components to HDFS
and then reading them back in at load time.
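A rough sketch of that approach (untested; assumes an MLlib
MatrixFactorizationModel named model, a SparkContext named sc, and
placeholder HDFS paths; the model constructor's accessibility may vary by
Spark version):

    // Persist the two feature RDDs to HDFS
    model.userFeatures.saveAsObjectFile("hdfs://namenode/models/userFeatures")
    model.productFeatures.saveAsObjectFile("hdfs://namenode/models/productFeatures")

    // Later, rebuild the model from the saved RDDs
    val userFeatures = sc.objectFile[(Int, Array[Double])]("hdfs://namenode/models/userFeatures")
    val productFeatures = sc.objectFile[(Int, Array[Double])]("hdfs://namenode/models/productFeatures")
    val model = new MatrixFactorizationModel(rank, userFeatures, productFeatures)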
Hi Hao,
Where tasks go is influenced by where the data they operate on resides. If
the data is on one executor, it may make more sense to do all the
computation on that node rather than ship data across the network. How was
the data distributed across your cluster?
Andrew
On Mon, Dec 2, 2013
Hi Andrew,
My data was loaded in HDFS. Actually, I got the answer from the spark-user
google group.
Patrick said:
All cores in the cluster are considered fungible since the tasks are
completely parallel. So until you run out of cores on any given node, it
might get all the tasks.
In some cases
Hi,
Maybe you need to check those nodes; these tasks look very slow:
3487  SUCCESS  PROCESS_LOCAL  ip-10-60-150-111.ec2.internal  2013/12/01 02:11:38  17.7 m  16.3 m  23.3 MB
3447  SUCCESS  PROCESS_LOCAL  ip-10-12-54-63.ec2.internal    2013/12/01 02:11:26  20.1 m  13.9 m  50.9 MB
On
Excellent! Thank you, Matei.
From: Matei Zaharia [mailto:matei.zaha...@gmail.com]
Sent: Wednesday, December 4, 2013 4:26 PM
To: user@spark.incubator.apache.org
Subject: Re: Pre-build Spark for Windows 8.1
Hey Adrian,
Ideally you shouldn't use Cygwin to run on Windows - use the .cmd scripts we
The master starts up now as expected but the workers are unable to connect to
the master. It looks like the master is refusing the connection messages but
I'm not sure why. The first two error lines below are from trying to connect a
worker from a separate machine and the last two error lines
Does anyone have an example or some sort of starting point code when writing
from Spark Streaming into HBase?
We currently stream ad server event log data using Flume-NG to tail log
entries, collect them, and put them directly into an HBase table. We would like
to do the same with Spark
Here's a good place to start:
http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3ccacyzca3askwd-tujhqi1805bn7sctguaoruhd5xtxcsul1a...@mail.gmail.com%3E
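A minimal, untested sketch of the write path in Spark Streaming (assumes
the HBase client HTable/Put API; the stream, table, and column names are
placeholders; on older Spark releases foreachRDD was called foreach):

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.{HTable, Put}
    import org.apache.hadoop.hbase.util.Bytes

    // For each batch, open the table once per partition and write the records
    eventStream.foreachRDD { rdd =>
      rdd.foreachPartition { events =>
        val table = new HTable(HBaseConfiguration.create(), "ad_events")
        events.foreach { case (rowKey, value) =>
          val put = new Put(Bytes.toBytes(rowKey))
          put.add(Bytes.toBytes("cf"), Bytes.toBytes("event"), Bytes.toBytes(value))
          table.put(put)
        }
        table.close()
      }
    }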
On 12/5/2013 10:18 AM, Benjamin Kim wrote:
Does anyone have an example or some sort of starting point code when
The variability in task completion times could be caused by variability in
the amount of work that those tasks perform rather than slow or faulty
nodes.
For PageRank, consider a link graph that contains a few disproportionately
popular webpages that have many inlinks (such as Yahoo.com). These
Hi Matt,
Try using take() instead, which will only begin computing from the start of the
RDD (first partition) if the number of elements you ask for is small.
Note that if you’re doing any shuffle operations, like groupBy or sort, then
the stages before that do have to be computed fully.
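A concrete example of the take() behavior (rdd is a placeholder name):

    // Scans partitions one at a time, starting from the first,
    // until it has collected 10 elements
    val firstTen = rdd.take(10)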
Actually, we want the opposite – we want as much data to be computed as
possible.
It's only for benchmarking purposes, of course.
-Matt Cheah
From: Matei Zaharia matei.zaha...@gmail.com
Reply-To:
Hi,
When you launch the worker, try using spark://ADRIBONA-DEV-1:7077 as the URL
(uppercase instead of lowercase). Unfortunately Akka is very specific about
seeing hostnames written in the same way on each node, or else it thinks the
message is for another machine!
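For example, with the 0.8 standalone scripts the worker would be started as
follows (use the .cmd variant on Windows; hostname written exactly as above):

    ./spark-class org.apache.spark.deploy.worker.Worker spark://ADRIBONA-DEV-1:7077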
Matei
On Dec 5, 2013, at
Speaking of Akka and host sensitivity... How much have you hacked on Akka
to get it to support all of: myhost.mydomain.int, myhost, and 10.1.1.1?
It's kind of a pain to get the Spark URL to exactly match. I'm wondering
if there are usability gains that could be made here or if we're pretty
Ah, got it. Then takeSample is going to do what you want, because it needs a
uniform sample. If you don’t want any result at all, you can also use
RDD.foreach() with an empty function.
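A quick sketch of both options (rdd is a placeholder; in some Spark
versions takeSample also takes a required seed argument):

    rdd.takeSample(false, 1000, 42) // uniform sample; computes the whole RDD
    rdd.foreach(x => ())            // computes every partition, keeps nothing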
Matei
On Dec 5, 2013, at 12:54 PM, Matt Cheah mch...@palantir.com wrote:
Actually, we want the opposite –
Strange, but that definitely did the trick. Thanks again!
From: Matei Zaharia [mailto:matei.zaha...@gmail.com]
Sent: Thursday, December 5, 2013 2:44 PM
To: user@spark.incubator.apache.org
Subject: Re: Pre-build Spark for Windows 8.1
Hi,
When you launch the worker, try using
Try allocating some more resources to your application.
You seem to be using 512 MB per worker node (you can verify that from the
master UI).
Try putting the following setting into your code and see if it helps:
System.setProperty("spark.executor.memory", "15g") // will allocate more memory
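For context, a minimal sketch (the property must be set before the
SparkContext is created; the master URL and app name are placeholders):

    System.setProperty("spark.executor.memory", "15g") // request 15g per executor
    val sc = new SparkContext("spark://master:7077", "MyApp")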