Hi,
I've tried to enable debug logging, but can't figure out what might be
going wrong. Can anyone assist in deciphering the log?
The log of the startup and run attempts is at http://pastebin.com/XyeY92VF
This uses SparkILoop, DEBUG-level logging, and the settings.debug.value = true
option.
Line 323:
We are trying to use Kryo serialization, but with Kryo serialization turned on,
the memory consumption does not change. We have tried this on multiple sets of
data.
We have also checked the logs of Kryo serialization and have confirmed that
Kryo is being used.
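For reference, this is a sketch of how we are enabling it (the app name and
path are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    val conf = new SparkConf()
      .setAppName("kryo-test") // illustrative
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    val sc = new SparkContext(conf)

    // Note: Kryo changes the cached size only for serialized storage levels
    // such as MEMORY_ONLY_SER; the default MEMORY_ONLY cache stores
    // deserialized objects, so its size would not change.
    val rdd = sc.textFile("hdfs:///data/sample") // illustrative path
    rdd.persist(StorageLevel.MEMORY_ONLY_SER)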
Can somebody please help us with this?
So this happened again today. As I noted before, the Spark shell starts up
fine after I reconnect to the cluster, but this time around I tried opening
a file and doing some processing. I get this message over and over (and
can't do anything):
14/03/06 15:43:09 WARN scheduler.TaskSchedulerImpl:
Hi,
I've successfully built 0.9.0-incubating on Solaris using sbt, following
the instructions at http://spark.incubator.apache.org/docs/latest/ and
it seems to work OK. However, when I start it up I get an error about
missing Hadoop native libraries. I can't find any mention of how to
build
Thanks Mayur. I don't have a clear idea of how pipe works and wanted to
understand more about it. When do we use pipe(), and how does it work? Can
you please share some sample code if you have any (even pseudo-code is fine)?
It would really help.
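From the docs, my rough understanding is that pipe() streams each partition
through an external command; something like this sketch (the command is
arbitrary, just for illustration):

    // pipe() sends each partition's elements, one per line, to the command's
    // stdin and returns the command's stdout lines as a new RDD[String].
    val nums = sc.parallelize(1 to 100, 4)
    val piped = nums.pipe("grep 7")    // any executable on the workers' PATH
    piped.collect().foreach(println)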
Regards,
Suman Bharadwaj S
On Thu, Mar 6, 2014 at 3:46 AM,
Is it an error, or just a warning? In any case, you need to get those libraries
from a build of Hadoop for your platform. Then add them to the
SPARK_LIBRARY_PATH environment variable in conf/spark-env.sh, or to your
-Djava.library.path if launching an application separately.
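For example, a minimal conf/spark-env.sh entry might look like this (the
directory is an assumption; point it at wherever your Hadoop build placed
libhadoop.so):

    # conf/spark-env.sh -- directory below is illustrative
    export SPARK_LIBRARY_PATH=$SPARK_LIBRARY_PATH:/opt/hadoop/lib/native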
These libraries
Hi,
I am trying to setup Spark in windows for development environment. I get
following error when I run sbt. Pl help me to resolve this issue. I am working
for Verizon and am in my company network and can't access internet without
proxy.
C:\Users\...>sbt
Getting org.fusesource.jansi jansi 1.11 ...
export JAVA_OPTS="$JAVA_OPTS -Dhttp.proxyHost=yourserver \
  -Dhttp.proxyPort=8080 -Dhttp.proxyUser=username \
  -Dhttp.proxyPassword=password"
Also, please use a separate thread for different questions.
Regards
Mayur
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi
Thanks Alan.
I am very new to Spark. I am trying to set up a Spark development environment
on Windows. I added the export mentioned below as a set command in the sbt.bat
file and tried it, but it was not working. Where will I find .gitconfig?
set JAVA_OPTS=%JAVA_OPTS% -Dhttp.proxyHost=myservername -Dhttp.proxyPort=8080
Dana,
When you run multiple applications under Spark, if each application
takes up the entire cluster's resources, it is expected that one will block
the other completely; that is why you are seeing the wall times add together
sequentially. In addition, there is some overhead associated with
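If this is a standalone cluster, one common way to let two applications run
side by side is to cap the cores each one takes; a sketch (the master URL and
values are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    // Cap this app's share so another app can be scheduled concurrently.
    val conf = new SparkConf()
      .setMaster("spark://master:7077") // illustrative
      .setAppName("app-a")
      .set("spark.cores.max", "4")      // leaves cores free for other apps
    val sc = new SparkContext(conf)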
Hi everyone,
We are using Pig to build our data pipeline. I came across Spork (Pig on
Spark) at https://github.com/dvryaboy/pig and am not sure if it is still active.
Can someone please let me know the status of Spork, or of any other effort that
will let us run Pig on Spark? We can
I had asked a similar question on the dev mailing list a while back (Jan 22nd).
See the archives:
http://mail-archives.apache.org/mod_mbox/spark-dev/201401.mbox/browser - look
for spork.
Basically Matei said:
Yup, that was it, though I believe people at Twitter picked it up again
recently.
There is some work to make this work on yarn at
https://github.com/aniket486/pig. (So, compile pig with ant
-Dhadoopversion=23)
You can look at https://github.com/aniket486/pig/blob/spork/pig-spark to
find out what sort of env variables you need (sorry, I haven't been able to
clean this up-
On 06/03/2014 18:55, Matei Zaharia wrote:
For the native libraries, you can use an existing Hadoop build and
just put them on the path. For linking to Hadoop, Spark grabs it
through Maven, but you can do mvn install locally on your version
of Hadoop to install it to your local Maven cache, and
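For example (the path is illustrative):

    # Install your Hadoop build into the local Maven cache so Spark's
    # build can resolve it.
    cd /path/to/your/hadoop-source   # illustrative
    mvn install -DskipTests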
Hi Aniket, many thanks! I will check this out.
Date: Thu, 6 Mar 2014 13:46:50 -0800
Subject: Re: Pig on Spark
From: aniket...@gmail.com
To: user@spark.apache.org; tgraves...@yahoo.com
There is some work to make this work on yarn at
https://github.com/aniket486/pig. (So, compile pig with ant
Can you see the Spark web UI? Is it running? (It would run on
masterurl:8080.)
If so, what is the master URL shown there?
MASTER=spark://URL:PORT ./bin/spark-shell
Should work.
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi https://twitter.com/mayur_rustagi
On
I see the same error. I am trying a standalone example integrated into a Play
Framework v2.2.2 application. The error occurs when I try to create a Spark
Streaming Context. Compilation succeeds, so I am guessing it has to do with
the version of Akka getting picked up at runtime.
--
View this
Are you launching your application using the scala or the java command? The
scala command brings in a version of Akka that we have found to cause conflicts
with Spark's version of Akka, so it is best to launch using java.
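A minimal sketch of such a launch (jar names and the main class are
illustrative):

    # Plain java, so the scala runner's bundled Akka doesn't shadow Spark's.
    java -cp myapp.jar:spark-assembly-0.9.0-incubating.jar com.example.MyApp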
TD
On Thu, Mar 6, 2014 at 3:45 PM, Deepak Nulu deepakn...@gmail.com wrote:
I see the
I was just able to fix this in my environment.
By looking at the repository/cache in my Play Framework installation, I was
able to determine that spark-0.9.0-incubating uses Akka version 2.2.3.
Similarly, looking at repository/local revealed that Play Framework 2.2.2
ships with Akka version
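In case it helps others hitting the same clash, a sketch of pinning Akka in
the sbt build (versions follow this thread; force() tells Ivy to keep exactly
that version):

    // build.sbt sketch -- pin Akka to the version Spark 0.9.0-incubating uses.
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-streaming" % "0.9.0-incubating",
      "com.typesafe.akka" %% "akka-actor" % "2.2.3" force()
    )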
The difference between your two jobs is that take() is optimized and
only runs on the machine where you are using the shell, whereas
sortByKey requires using many machines. It seems like maybe Python
didn't get upgraded correctly on one of the slaves. I would look in
the /root/spark/work/ folder
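To illustrate the difference (the path is illustrative):

    // take(n) streams a few elements back to the driver, often reading only
    // one partition; sortByKey() forces a full shuffle across every worker.
    val pairs = sc.textFile("hdfs:///data/input").map(line => (line, 1))
    pairs.take(5)              // cheap, mostly local
    pairs.sortByKey().take(5)  // expensive, cluster-wide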
I don't have an Eclipse setup, so I am not sure what is going on here. I would
try to use Maven on the command line with a POM to see if this compiles.
Also, try to clean up your system Maven cache; who knows if it pulled in
a wrong version of Kafka 0.8 and has been using it all along. Blowing away the
Many thanks for the guidance.
2014-03-06 23:39 GMT+08:00 Yana Kadiyska yana.kadiy...@gmail.com:
Hi qingyang,
1. You do not need to install shark on every node.
2. Not really sure... it's just a warning, so I'd see if it works despite it.
3. You need to provide the actual HDFS path, e.g.
We're not using Ooyala's job server. We are holding the Spark context for
reuse within our own REST server (with a service to run each job).
Our low-latency job now reads all its data from a memory-cached RDD instead
of from an HDFS sequence file (upstream jobs cache resultant RDDs for downstream
jobs
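Roughly this shape (a sketch; names and the path are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    // One long-lived context per server process; the hot RDD is cached once
    // and reused by every request, so nothing re-reads HDFS.
    object SparkHolder {
      lazy val sc = new SparkContext(new SparkConf().setAppName("rest-jobs"))
      lazy val events = sc.textFile("hdfs:///data/events").cache() // illustrative
    }

    def countFor(key: String): Long =
      SparkHolder.events.filter(_.startsWith(key)).count()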
Hello,
What is the general approach people take when trying to do analysis
across multiple large files where the data to be extracted from a
successive file depends on the data extracted from a previous file or
set of files?
For example:
I have the following: a group of HDFS files each
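To make it concrete, the kind of dependency I mean (a sketch; paths and the
key format are made up):

    // Extract keys from the first file, then use them to filter the next one.
    val paths = Seq("hdfs:///data/file1", "hdfs:///data/file2") // illustrative
    val keys = sc.textFile(paths(0)).map(_.split(",")(0)).collect().toSet
    val dependent = sc.textFile(paths(1))
      .filter(line => keys.contains(line.split(",")(0)))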
Would you be the best person in the world to share some code? It's a pretty
common problem.
On Mar 6, 2014 6:36 PM, polkosity polkos...@gmail.com wrote:
We're not using Ooyala's job server. We are holding the Spark context for
reuse within our own REST server (with a service to run each job).
Hi Yana, do you know if there is a mailing list for Shark, like Spark's?
2014-03-06 23:39 GMT+08:00 Yana Kadiyska yana.kadiy...@gmail.com:
Hi qingyang,
1. You do not need to install shark on every node.
2. Not really sure... it's just a warning, so I'd see if it works despite it.
3. You need to
Will give it a shot later. BTW, this forced me to move to Scala! I decided to
design our aggregation framework in Scala for now.
On 07-Mar-2014, at 6:02 AM, Tathagata Das tathagata.das1...@gmail.com wrote:
I don't have an Eclipse setup, so I am not sure what is going on here. I would
try to use
It looks like the problem is in the filter task. Is there anything
special about filter()?
I removed the filter line from the loops just to see if things would
work, and they do.
Does anyone have any ideas?
Thanks!
Ognen
On 3/6/14, 9:39 PM, Ognen Duzlevski wrote:
Hello,
What is the general
Please remove me from the mailing list.
-----Original Message-----
From: Deepak Nulu [mailto:deepakn...@gmail.com]
Sent: March 7, 2014 7:45
To: u...@spark.incubator.apache.org
Subject: Re: NoSuchMethodError - Akka - Props
I see the same error. I am trying a standalone example integrated into a Play
Framework v2.2.2