Sorry, I did not explain myself correctly.
I know how to run Spark; the question is how to instruct Spark to do all of
the computation on a single machine.
I was trying to convert the code to Scala, but I am missing some of the Spark
methods, like reduceByKey.
Eran
On Mon, Mar 17, 2014 at 7:25 AM,
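Both parts of Eran's question have short answers; here is a sketch, assuming a
0.8/0.9-era API: running all of the computation on a single machine just means
passing a local master URL to the SparkContext, and reduceByKey is defined on
pair RDDs, so it only appears after importing SparkContext._ for the implicit
conversion.

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._ // adds reduceByKey and friends to (K, V) RDDs

// "local[4]" runs everything in one JVM on this machine, using 4 threads
val sc = new SparkContext("local[4]", "SingleMachineExample")

val counts = sc.parallelize(Seq("a", "b", "a"))
  .map(word => (word, 1))
  .reduceByKey(_ + _)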
Patrick, Koert,
I'm also very interested in these examples, could you please post them if
you find them?
thanks in advance,
Richard
On Thu, Mar 13, 2014 at 9:39 PM, Koert Kuipers ko...@tresata.com wrote:
not that long ago there was a nice example on here about how to combine
multiple
Hi everyone!
I installed Scala 2.9.3, Spark 0.8.1, and Oracle Java 7.
I launched the master and logged on to the interactive Spark shell:
MASTER=spark://localhost:7077 ./spark-shell
But after one minute, it automatically exits from the interactive shell.
Is there something I am missing? Do I
Solved, but I don't know what the difference is:
just running ./spark-shell fixes it all, though I don't know why!
On Mon, Mar 17, 2014 at 1:32 PM, Sai Prasanna ansaiprasa...@gmail.com wrote:
Hi everyone!
I installed Scala 2.9.3, Spark 0.8.1, and Oracle Java 7.
I launched the master and logged on to
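A likely explanation, though the thread never confirms it: plain ./spark-shell
runs in local mode, so no master is needed at all, while the MASTER variable
makes the shell depend on a standalone master being reachable at exactly that
URL. Masters usually register under the machine's hostname rather than
localhost, so a mismatched URL leaves the shell without executors until it
gives up. Compare (hostname below is a placeholder):

$ ./spark-shell                                  # local mode, no master required
$ MASTER=spark://yourhostname:7077 ./spark-shell # URL must match the master's web UI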
The factor matrix Y is used twice in the implicit ALS computation: once to
compute the global Y^T Y, and again to compute the local Y_i^T C_i Y_i.
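For context, in the standard implicit-feedback ALS formulation (Hu, Koren &
Volinsky 2008), which I believe is what this refers to, each user factor x_u
solves

x_u = (Y^T Y + Y^T (C^u - I) Y + \lambda I)^{-1} Y^T C^u p(u)

The Y^T Y term is identical for every user, so it is computed once globally;
the Y^T (C^u - I) Y term involves only the items user u has interacted with,
so it is computed locally per user.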
-Xiangrui
On Sun, Mar 16, 2014 at 1:18 PM, Matei Zaharia matei.zaha...@gmail.com wrote:
On Mar 14, 2014, at 5:52 PM, Michael Allman m...@allman.ms wrote:
I
Hi,
Thanks for the quick response. Is there a simple way to write and
deploy apps on Spark?
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

object HelloWorld {
  def main(args: Array[String]) {
    println("Hello, world!")
    val sc = new SparkContext("local", "HelloWorld") // constructor arguments are an assumption; the digest cuts the message off here
  }
}
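Since the question was about a simple way to write and deploy apps: a minimal
sbt build for the snippet above might look like the following. The artifact
coordinates are my assumption for the Spark 0.8.1 / Scala 2.9.3 setup
mentioned elsewhere in this digest.

// build.sbt
name := "HelloWorld"

version := "0.1"

scalaVersion := "2.9.3"

libraryDependencies += "org.apache.spark" % "spark-core_2.9.3" % "0.8.1-incubating"

With that in place, sbt package builds a jar and sbt run launches the main class.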
Hi Diana,
Non-text input formats are only supported in Java and Scala right now, where
you can use sparkContext.hadoopFile or .hadoopDataset to load data with any
InputFormat that Hadoop MapReduce supports. In Python, you unfortunately only
have textFile, which gives you one record per line.
Here’s an example of getting together all lines in a file as one string:
$ cat dir/a.txt
Hello
world!
$ cat dir/b.txt
What's
up??
$ bin/pyspark
files = sc.textFile("dir")
files.collect()
[u'Hello', u'world!', u"What's", u'up??']  # one element per line, not what we want
There's also mapPartitions, which gives you an iterator for each partition
instead of an array. You can then return an iterator or list of objects to
produce from that.
I confess, I was hoping for an example of just that, because I've not yet
been able to figure out how to use mapPartitions. No
Hi
Quick question here,
I know that .foreach is not idempotent. I am wondering whether collect() is
idempotent, meaning that once I've collect()-ed, if a Spark node crashes, I
can't get the same values from the stream ever again.
Thanks
-Adrian
I have set it up, but it still fails. Question:
https://oss.sonatype.org/content/repositories/snapshots/io/netty/netty-all/4.0.13.Final/netty-all-4.0.13.Final.pom
4.0.13 is not there; instead 4.0.18 is there. Is this a bug?
Yup, it only returns each value once.
Matei
On Mar 17, 2014, at 1:14 PM, Adrian Mocanu amoc...@verticalscope.com wrote:
Hi
Quick question here,
I know that .foreach is not idempotent. I am wondering whether collect() is
idempotent, meaning that once I've collect()-ed, if a Spark node crashes, I
Oh, I see, the problem is that the function you pass to mapPartitions must
itself return an iterator or a collection. This is used so that you can return
multiple output records for each input record. You can implement most of the
existing map-like operations in Spark, such as map, filter,
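To make that concrete, here is a small sketch of my own (not from the thread),
assuming a spark-shell session where sc is predefined:

// map expressed via mapPartitions: the passed function receives an iterator
// over one partition and must itself return an iterator (or collection)
val doubled = sc.parallelize(1 to 4).mapPartitions(iter => iter.map(_ * 2))

// combining all lines of each partition into a single string; this yields
// one string per file only when each small file lands in its own partition
val files = sc.textFile("dir")
val whole = files.mapPartitions(lines => Iterator(lines.mkString("\n")))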
Hi,
I'm getting this stack trace using Spark 0.7.3. There are no references to
anything in my code, and I've never experienced anything like this before.
Any ideas what is going on?
java.lang.ClassCastException: spark.SparkContext$$anonfun$9 cannot be cast
to scala.Function2
at
I'm guessing the other result was wrong, or just never evaluated here. Because
RDD transforms are lazy, the nested expression may have been expressible, but
it wouldn't work: nested RDDs are not supported.
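For readers who have not hit this before, a sketch of the pattern being
described (my illustration, assuming a spark-shell session):

val a = sc.parallelize(1 to 10)
val b = sc.parallelize(1 to 10)

// not supported: b is referenced inside a transformation of a ("nested" RDDs);
// laziness lets this be written, but it fails at runtime, because RDD
// operations can only be invoked on the driver
// a.map(x => b.filter(_ > x).count()).collect()

// a supported alternative is to combine the two RDDs explicitly first
a.cartesian(b).filter { case (x, y) => y > x }.collect()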
On Mon, Mar 17, 2014 at 4:01 PM, anny9699 anny9...@gmail.com wrote:
Hi Andrew,
Thanks for the reply.
Thanks for reporting this, looking into it.
On Mar 17, 2014, at 2:44 PM, Walrus theCat walrusthe...@gmail.com wrote:
ping
On Thu, Mar 13, 2014 at 11:05 AM, Aaron Davidson ilike...@gmail.com wrote:
Looks like everything from 0.8.0 and before errors similarly (though Spark
0.3 for Scala
From what I understand, getting Spark to run alongside a Hadoop cluster
requires the following:
a) a working Hadoop installation
b) a compiled Spark
c) configuration parameters that point Spark to the right Hadoop conf files
i) Can you let me know the specific steps to take after Spark was compiled
(via sbt
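Not an authoritative answer, but for step (c) on 0.8/0.9-era Spark the usual
move, as I understand it, is to point Spark at the Hadoop configuration
directory so that hdfs:// paths resolve; the path below is an assumption:

# in conf/spark-env.sh on each Spark node
export HADOOP_CONF_DIR=/etc/hadoop/conf  # directory holding core-site.xml and hdfs-site.xml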
Thanks all!
I figured it out...
I thought sbt package would be enough...
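For anyone else hitting this, the distinction, as I understand it for
sbt-built Spark of this era:

$ sbt/sbt package    # compiles each module into its own jar only
$ sbt/sbt assembly   # also builds the fat assembly jar that spark-shell looks for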
2014-03-17 21:46 GMT-04:00 Debasish Das debasish.da...@gmail.com:
You need the Spark assembly jar to run spark-shell. Please do sbt
assembly to generate the jar.
On Mar 17, 2014 2:11 PM, Yexi Jiang
Good morning! I'm attempting to build Apache Spark 0.9.0 on Windows 8. I've
installed all prerequisites (except Hadoop) and run sbt/sbt assembly while
in the root directory. I'm getting an error after the line Set current
project to root in build file:C:/.../spark-0.9.0-incubating/. The error
is:
Hi Michael,
I made a couple of changes to implicit ALS. One gives faster construction
of YtY (https://github.com/apache/spark/pull/161), which was merged
into master. The other caches intermediate matrix factors properly
(https://github.com/apache/spark/pull/165). They should give you the
same result