Hi all,
Can you point me to a Maven repo I can use to compile Spark with Maven?
I'm currently using http://repo1.maven.org/maven2/,
but it complains that it cannot find akka-actor-2.0.1. I searched
repo1.maven.org and I also cannot find akka-actor-2.0.1 there; it is a rather old version.
Another strange output I ca
Kyle, the fundamental contract of a Spark RDD is that it is immutable. This
follows the functional paradigm, in which data is transformed into new
data rather than mutated in place. This allows these systems to make
assumptions and offer guarantees that they otherwise could not.
Now we've be
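To make that concrete, here is a minimal sketch (the data is made up):
transformations like map return a new RDD and leave the original untouched.

    val nums = sc.parallelize(Seq(1, 2, 3))  // the original RDD is never mutated
    val doubled = nums.map(_ * 2)            // a new RDD derived from nums
    // nums still yields 1, 2, 3; doubled yields 2, 4, 6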
Hi Phillip/Hao,
I was wondering if there is a simple working example out there that I can just
run and see it work. Then, I can customize it for our needs. Unfortunately,
this explanation still confuses me a little.
Here is a little about the environment we are working with. We have Cloudera's
C
I'm trying to figure out whether I can use an RDD to back an interactive
server. One of the requirements would be to have incremental updates to
elements in the RDD, i.e., transformations that change/add/delete a single
element in the RDD.
It seems pretty drastic to do a full RDD filter to remove a single el
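For reference, the filter-based removal in question would look something
like this (a sketch; records and keyToRemove are made-up names). It scans
every partition and builds a whole new RDD just to drop one element:

    // removing a single record still requires a full pass over the dataset
    val withoutOne = records.filter { case (k, _) => k != keyToRemove }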
Hi Prashant,
Thank you! The reason I would like to do it is that my program currently
writes its output to stdout, where it gets mixed with Spark's log.
That's not a big issue anyway, since I can either disable the logging or add
a prefix to my output :)
Best,
Wenlei
On Sun, Dec 1, 2013 at 2:49 AM
Hao,
Thank you for the detailed response! (even if delayed!)
I'm curious to know what version of hbase you added to your pom file.
Thanks,
Philip
On 11/14/2013 10:38 AM, Hao REN wrote:
Hi, Philip.
Basically, we need PairRDDFunctions.saveAsHadoopDataset to do the
job, as HBase is not a fs
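For context, a minimal sketch of that saveAsHadoopDataset pattern (the
table name "mytable", column family "cf", and String keys/values are
assumptions, not details from this thread; rdd is assumed to be an RDD of
(String, String) pairs):

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Put
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapred.TableOutputFormat
    import org.apache.hadoop.hbase.util.Bytes
    import org.apache.hadoop.mapred.JobConf

    val jobConf = new JobConf(HBaseConfiguration.create())
    jobConf.setOutputFormat(classOf[TableOutputFormat])
    jobConf.set(TableOutputFormat.OUTPUT_TABLE, "mytable")

    // turn each (key, value) pair into an HBase Put and write it out
    rdd.map { case (key, value) =>
      val put = new Put(Bytes.toBytes(key))
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes(value))
      (new ImmutableBytesWritable(Bytes.toBytes(key)), put)
    }.saveAsHadoopDataset(jobConf)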
I have a simple scenario that I'm struggling to implement. I would like
to take a fairly simple RDD generated from a large log file, perform
some transformations on it, and write the results out such that I can
perform a Hive query either from Hive (via Hue) or Shark. I'm having
trouble with
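One common approach, sketched below (the HDFS path, tab delimiter, and
two-column schema are all assumptions): write the transformed RDD as
delimited text, then point an external Hive table at that location so both
Hive and Shark can query it.

    // Scala: write tab-delimited rows to an HDFS directory
    transformed.map { case (k, v) => k + "\t" + v }
               .saveAsTextFile("hdfs:///user/hive/warehouse/mylogs")

    -- HiveQL: expose that directory as an external table
    CREATE EXTERNAL TABLE mylogs (k STRING, v STRING)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
      LOCATION '/user/hive/warehouse/mylogs';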
Thanks! I will do that.
Cheers
Gustavo
On Fri, Dec 6, 2013 at 5:53 PM, Matei Zaharia wrote:
> Yeah, unfortunately the reason it pops up more in 0.8.0 is because our
> package names got longer! But if you just do the build in /tmp it will work.
>
> On Dec 6, 2013, at 11:35 AM, Josh Rosen wrote:
Hi everyone,
I used to launch EC2 clusters with the spark scripts running Hadoop 1. I
recently changed it and launched a new cluster with the hadoop major version
set to 2.
spark-ec2 --hadoop-major-version=2
In the old cluster, I would start persistent-hdfs and migrate data from S3 with
dis
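For reference, the full launch invocation might look like this (a sketch;
the key pair, identity file, and cluster name are placeholders):

    ./spark-ec2 -k my-keypair -i my-keypair.pem \
        --hadoop-major-version=2 launch my-cluster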
Hey Andrew, unfortunately I don’t know how easy this is. Maybe future versions
of Akka will have it. We can certainly ask them to add it in general, but I
imagine there are some use cases where they wanted this behavior.
Matei
On Dec 5, 2013, at 2:49 PM, Andrew Ash wrote:
> Speaking of akka and host
Yeah, unfortunately the reason it pops up more in 0.8.0 is because our package
names got longer! But if you just do the build in /tmp it will work.
On Dec 6, 2013, at 11:35 AM, Josh Rosen wrote:
> This isn't a Spark 0.8.0-specific problem. I googled for "sbt error
> filename too long" and fo
This isn't a Spark 0.8.0-specific problem. I googled for "sbt error
filename too long" and found a couple of links that suggest that this error
may crop up for Linux users with encrypted filesystems or home directories:
http://stackoverflow.com/questions/8404815/how-do-i-build-a-project-that-use
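A sketch of the /tmp workaround mentioned above (paths are placeholders):

    # copy the source tree to a short, unencrypted path and build there
    cp -r ~/spark /tmp/spark
    cd /tmp/spark && sbt/sbt assembly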
Hi there:
I've been trying to compile using sbt/sbt assembly and mvn clean package (with
memory adjustments as suggested here:
http://spark.incubator.apache.org/docs/latest/building-with-maven.html).
Unfortunately, compilation fails for both of them with the following error
(here it is with Maven,
but with SB
Yeah, in general, make sure you use exactly the same “cluster URL” string shown
on the master’s web UI. There’s currently a limitation in Akka where different
ways of specifying the hostname won’t work.
Matei
On Dec 6, 2013, at 10:54 AM, Nathan Kronenfeld
wrote:
> Never mind, I figured it ou
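Concretely, a sketch (host and port are placeholders): pass the string from
the master's web UI verbatim as the master URL.

    // must match the URL shown on the master's web UI exactly,
    // e.g. spark://host:7077; an IP vs. DNS-name mismatch will fail to connect
    val sc = new SparkContext("spark://host:7077", "MyApp")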
Never mind, I figured it out: apparently the hostname resolved differently
locally and within the cluster; when I use the IP address instead of the
machine name in MASTER, it all seems to work.
On Fri, Dec 6, 2013 at 1:38 PM, Nathan Kronenfeld <
nkronenf...@oculusinfo.com> wrote:
> Hi, all.
>
>
Hi, all.
I'm trying to connect to a remote cluster from my machine, using spark
0.7.3. In conf/spark-env.sh, I've set MASTER, SCALA_HOME, SPARK_MASTER_IP,
and SPARK_MASTER_PORT.
When I try to run a job, it starts, but never gets anywhere, and I keep
getting the following error message:
13/12/06
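For comparison, a sketch of the relevant conf/spark-env.sh entries
(hostnames and paths are placeholders):

    export SCALA_HOME=/usr/local/scala
    export SPARK_MASTER_IP=master.example.com
    export SPARK_MASTER_PORT=7077
    export MASTER=spark://master.example.com:7077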
Btw, the node only has 4 GB of memory, so does that spark.executor.memory
setting make sense?
Should I instead make it around 2-3 GB? Also, how does this parameter differ
from SPARK_MEM?
Thanks,
Saurabh
On Fri, Dec 6, 2013 at 8:26 AM, learner1014 all wrote:
> Still see a whole lot of the following errors:
> java.la
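For what it's worth, in 0.8 spark.executor.memory sets the per-executor JVM
heap and supersedes the older SPARK_MEM. A sketch of setting it
programmatically (the 2g value is just an assumption that leaves headroom
on a 4 GB node):

    // must be set before the SparkContext is created
    System.setProperty("spark.executor.memory", "2g")
    val sc = new SparkContext("spark://master:7077", "MyApp")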
Hello,
I am new to Spark, and I am trying to write a simple program to
get myself familiar with how Spark works. I am currently having a problem
importing the Spark package. I am getting the following compiler
error: package org.apache.spark.api.java does not exist.
I have spark-0.8
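In case it helps, the Maven coordinates for the 0.8.0-incubating release
look like this (a sketch; match the version string to the spark-0.8 build
you actually have):

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.9.3</artifactId>
      <version>0.8.0-incubating</version>
    </dependency>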
Still see a whole lot of the following errors:
java.lang.OutOfMemoryError: Java heap space
13/12/05 16:04:13 INFO executor.StandaloneExecutorBackend: Got assigned
task 553
13/12/05 16:04:13 INFO executor.Executor: Running task ID 553
The issue seems to be that the process hangs, as we are probably performing