Hi Sean,
Yeah.. I am seeing errors across all repos, and yep, this error is mainly
because of a connectivity issue...
How do I set up the proxy? I did set up the proxy as suggested by Mayur:
export JAVA_OPTS="$JAVA_OPTS -Dhttp.proxyHost=yourserver
-Dhttp.proxyPort=8080 -Dhttp.proxyUser=username"
Yeah.. the http_proxy is set up, and so is https_proxy.
Basically, my Maven projects, git pulls, etc. are all working fine,
except this.
Here is another question, which might help me bypass this issue:
if I create a jar using Eclipse, how do I run that jar in code? Like in
Hadoop, I
You need to assemble the code to get Spark working (unless you are using
Hadoop 1.0.4).
To run the code you can follow any of the standalone guides here:
https://spark.apache.org/docs/0.9.0/quick-start.html#a-standalone-app-in-scala
You would still need sbt, though.
Mayur Rustagi
Ph: +1 (760)
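For reference, a minimal standalone app along the lines of that quick-start page might look like the sketch below. The class name, master URL, and file path are placeholders of mine, not taken from the guide:

// SimpleApp.scala -- a minimal standalone job (0.9-era API assumed).
import org.apache.spark.SparkContext

object SimpleApp {
  def main(args: Array[String]) {
    // "local" and "README.md" are stand-ins; adjust for your cluster.
    val sc = new SparkContext("local", "Simple App")
    val logData = sc.textFile("README.md").cache()
    val numSparks = logData.filter(line => line.contains("Spark")).count()
    println("Lines with Spark: " + numSparks)
    sc.stop()
  }
}

Roughly per the guide, you would package it with sbt and then run it (e.g. sbt package, then sbt run).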
Dear All,
I'm trying to cluster data from native library code with Spark KMeans||. In
my native library the data are represented as a matrix (row = number of
data points, col = dimension). For efficiency reasons, they are copied into a
one-dimensional Scala Array, row-major wise, so after the
On Sunday, 2 March 2014 19:19:49 UTC+2, Aureliano Buendia wrote:
Is there a reason for spark using the older akka?
On Sun, Mar 2, 2014 at 1:53 PM, 1esha alexey.r...@gmail.com wrote:
The problem is in akka remote. It contains files compiled with 2.4.*. When
you run it with 2.5.* in
Hi all!
In the interactive Spark shell I get the following error.
I just followed the steps of the video "First Steps with Spark - Spark
Screencast #1" by Andy Konwinski...
Any thoughts?
scala> val textfile = sc.textFile("README.md")
textfile: org.apache.spark.rdd.RDD[String] = MappedRDD[1] at
I tried that command on Fedora and I got a lot of random downloads (around
250) and it appeared that something was trying to get BitTorrent
started. That command, ./sbt/sbt assembly, doesn't work on Windows.
I installed sbt separately. Is there a way to determine if I'm using the
sbt that's
On 3/18/14, 4:49 AM, dmpou...@gmail.com wrote:
On Sunday, 2 March 2014 19:19:49 UTC+2, Aureliano Buendia wrote:
Is there a reason for spark using the older akka?
On Sun, Mar 2, 2014 at 1:53 PM, 1esha alexey.r...@gmail.com wrote:
The problem is in akka remote. It contains files compiled
Hi all, I changed spark.closure.serializer to kryo. When I try a count action in
the spark shell, the Task object deserialized in the Executor comes back null. The source line is:
override def run() {
  ...
  task = ser.deserialize[Task[Any]](...)
  ...
}
where task is null.
Can anyone help me? Thank you!
Well, if anyone is still following this, I've gotten the following code
working, which in theory should allow me to parse whole XML files. (The
problem was that I can't return the tree iterator directly; I have to call
iter(). Why?)
import xml.etree.ElementTree as ET
# two source files, format
Hi, if you run that under Windows, you should use \ in place of /.
sbt/sbt means the sbt file under the sbt folder.
On Mar 18, 2014 8:42 PM, wapisani wapis...@mtu.edu wrote:
I tried that command on Fedora and I got a lot of random downloads (around
250 downloads) and it appeared that something
Hi Chen,
I tried sbt\sbt assembly and I got the error 'sbt\sbt' is not
recognized as an internal or external command, operable program or batch
file.
On Tue, Mar 18, 2014 at 11:18 AM, Chen Jingci [via Apache Spark User List]
ml-node+s1001560n2811...@n3.nabble.com wrote:
hi, if you run
Hi,
I wrote this new article after studying more deeply how to adapt scalaz-stream
to Spark DStreams.
I re-explain a few Spark (and scalaz-stream) concepts in my own words in
it, and I went further using the new scalaz-stream NIO API, which is quite
interesting IMHO.
The result is a long blog triptych
Hi Andrew,
Thanks for your interest. This is a standalone job.
On Mon, Mar 17, 2014 at 4:30 PM, Andrew Ash and...@andrewash.com wrote:
Are you running from the spark shell or from a standalone job?
On Mon, Mar 17, 2014 at 4:17 PM, Walrus theCat walrusthe...@gmail.com wrote:
Hi,
I'm
Sorry, the link was wrong. Should be
https://github.com/apache/spark/pull/131 -Xiangrui
On Tue, Mar 18, 2014 at 10:20 AM, Michael Allman m...@allman.ms wrote:
Hi Xiangrui,
I don't see how https://github.com/apache/spark/pull/161 relates to ALS. Can
you explain?
Also, thanks for addressing
Hi Jaonary,
With the current implementation, you need to call Array.slice to make
each row an Array[Double] and cache the result RDD. There is a plan to
support block-wise input data and I will keep you informed.
Best,
Xiangrui
On Tue, Mar 18, 2014 at 2:46 AM, Jaonary Rabarisoa
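To illustrate Xiangrui's suggestion, here is a minimal sketch. The names (flat, numRows, dim) are hypothetical, and the 0.9-era MLlib API, where KMeans.train takes an RDD[Array[Double]], is assumed:

import org.apache.spark.SparkContext
import org.apache.spark.mllib.clustering.KMeans

// Hypothetical inputs: `flat` is the row-major copy of the native matrix.
def clusterRows(sc: SparkContext, flat: Array[Double],
                numRows: Int, dim: Int, k: Int) {
  // Array.slice turns each logical row into its own Array[Double].
  val rows = (0 until numRows).map(i => flat.slice(i * dim, (i + 1) * dim))
  val data = sc.parallelize(rows).cache() // cache the result RDD, as advised
  val model = KMeans.train(data, k, 20)   // 20 iterations, chosen arbitrarily
  model.clusterCenters.foreach(c => println(c.mkString(" ")))
}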
Although sbt assembly reports success, I re-ran that step, and see errors
like:
Error extracting zip entry
'scala/tools/nsc/transformUnCurry$UnCurryTransformer$$anonfun$14$$anonfun$apply
(omitting rest of super-long path) (File name too long)
Is this a problem with the 'zip' tool on my
OK, the problem was that the directory where I had installed Spark is
encrypted. The particular encryption system appears to limit the length of
file names.
I re-installed on a vanilla partition, and spark-shell runs fine.
Hi all,
The Maven central repo contains an artifact for spark 0.9.0 built with
unmodified Hadoop, and the Cloudera repo contains an artifact for spark
0.9.0 built with CDH 5 beta. Is there a repo that contains spark-core built
against a non-beta version of CDH (such as 4.4.0)?
Punya
I just ran a runtime performance comparison between 0.9.0-incubating and your
als branch. I saw a 1.5x improvement in performance.
Glad to hear about the speed-up. I hope we can improve the implementation
further in the future. -Xiangrui
On Tue, Mar 18, 2014 at 1:55 PM, Michael Allman m...@allman.ms wrote:
I just ran a runtime performance comparison between 0.9.0-incubating and your
als branch. I saw a 1.5x improvement in
Hi,
I am new to the Spark Scala environment. Currently I am working on discrete
wavelet transformation algorithms on time series data.
I have to perform recursive additions on successive elements in RDDs,
for example:
List of elements (RDDs): a1 a2 a3 a4 ...
Level-1 transformation: a1+a2 a3+a4 a1-a2
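One way to express a single level of that transformation, as a sketch: assume each value carries a position index (how you attach it depends on your input; an (index, value) RDD is assumed here), group successive pairs, and emit the sum and difference. All names below are made up, and an even element count is assumed:

import org.apache.spark.SparkContext._ // pair-RDD operations
import org.apache.spark.rdd.RDD

// One wavelet-style level over (position, value) pairs:
// the pair at positions (2k, 2k+1) becomes (sum, difference) at key k.
def level(data: RDD[(Long, Double)]): RDD[(Long, (Double, Double))] = {
  data.map { case (i, v) => (i / 2, (i % 2, v)) }
      .groupByKey()
      .mapValues { vs =>
        val byPos = vs.toMap          // 0L -> first of pair, 1L -> second
        val (a, b) = (byPos(0L), byPos(1L))
        (a + b, a - b)
      }
}

Applying level() repeatedly to its own output would give the recursive structure described above.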
I just meant that you call union() on the RDDs before creating the Graph you
build with new Graph(). If you call it after, it will produce other RDDs.
The Graph() constructor actually shuffles and “indexes” the data to make graph
operations efficient, so it’s not too easy to add elements after. You could
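In code, that ordering might look like the following sketch (the vertex and edge attribute types and all the RDD names are placeholders of mine):

import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD

// Combine all edge RDDs *before* construction, so the Graph()
// constructor shuffles and indexes the data only once.
def buildGraph(vertices: RDD[(Long, String)],
               edges1: RDD[Edge[Int]],
               edges2: RDD[Edge[Int]]): Graph[String, Int] = {
  val allEdges = edges1.union(edges2) // union first...
  Graph(vertices, allEdges)           // ...then build the indexed graph
}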
Hi Spark folk,
I have a directory full of files that I want to process using PySpark.
There is some necessary metadata in the filename that I would love to
attach to each record in that file. Using Java MapReduce, I would access
((FileSplit) context.getInputSplit()).getPath().getName()
in the
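One workaround, sketched here in Scala (the same textFile-per-file-plus-union shape is available from PySpark), is to build one RDD per file, tag its records with the filename, and union them. This assumes the driver can list the directory (e.g. a local or NFS path; for HDFS you would list files via the FileSystem API), and the names are made up:

import java.io.File
import org.apache.spark.SparkContext

// Tag every line with the name of the file it came from.
def taggedRecords(sc: SparkContext, dir: String) = {
  val files = new File(dir).listFiles().filter(_.isFile)
  val perFile = files.map { f =>
    val name = f.getName // captured once per file, attached to each record
    sc.textFile(f.getPath).map(line => (name, line))
  }
  sc.union(perFile)
}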
This problem occurs because graph.triplets generates an iterator that reuses
the same EdgeTriplet object for every triplet in the partition. The
workaround is to force a copy using graph.triplets.map(_.copy()).
The solution in the AMPCamp tutorial is mistaken -- I'm not sure if that
ever worked.
The workaround is to force a copy using graph.triplets.map(_.copy()).
Sorry, this actually won't copy the entire triplet, only the attributes
defined in Edge. The right workaround is to copy the EdgeTriplet explicitly:
graph.triplets.map { et =>
  val et2 = new EdgeTriplet[VD, ED] // Replace
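To spell out where that truncated snippet is presumably going (a sketch of mine, not the original author's full code): copy each field of the reused triplet into the fresh object and return it. VD and ED stand for your vertex and edge attribute types:

graph.triplets.map { et =>
  val et2 = new EdgeTriplet[VD, ED]
  et2.srcId = et.srcId       // fields inherited from Edge
  et2.dstId = et.dstId
  et2.attr = et.attr
  et2.srcAttr = et.srcAttr   // fields specific to EdgeTriplet
  et2.dstAttr = et.dstAttr
  et2
}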
The examples in graphx/data are meant to show the input data format, but if
you want to play around with larger and more interesting datasets, we've
been using the following ones, among others:
- SNAP's web-Google dataset (5M edges):
https://snap.stanford.edu/data/web-Google.html
- SNAP's