from:"ll"

broadcast: OutOfMemoryError

2014-12-11 Thread ll

hi. i'm running into this OutOfMemory issue when i'm broadcasting a large array. what is the best way to handle this? should i split the array into smaller arrays before broadcasting, and then combining them locally at each node? thanks! -- View this message in context:

Re: RDD.aggregate?

2014-12-11 Thread ll

any explanation on how aggregate works would be much appreciated. i already looked at the spark example and still am confused about the seqop and combop... thanks. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/RDD-aggregate-tp20434p20634.html Sent from
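A minimal sketch of the seqOp/combOp split asked about above, written in plain Scala with no Spark dependency. The helper `aggregateLike` and the object name `AggregateSketch` are hypothetical names made up for this illustration; the sketch only models the shape of RDD.aggregate(zeroValue)(seqOp, combOp) by treating each inner Seq as one partition. seqOp folds the elements of a single partition into an accumulator, and combOp merges the per-partition accumulators.

```scala
object AggregateSketch {
  // Hypothetical helper (not a Spark API): models aggregate over data
  // that has already been split into partitions.
  def aggregateLike[T, U](partitions: Seq[Seq[T]])(zero: U)(
      seqOp: (U, T) => U,
      combOp: (U, U) => U): U = {
    // Step 1: inside each partition, fold the elements into an accumulator with seqOp.
    val perPartition = partitions.map(p => p.foldLeft(zero)(seqOp))
    // Step 2: merge the per-partition accumulators with combOp.
    perPartition.foldLeft(zero)(combOp)
  }

  def main(args: Array[String]): Unit = {
    // Two "partitions" of doubles; the goal is the average of all values.
    val partitions = Seq(Seq(1.0, 2.0), Seq(3.0, 4.0, 5.0))
    val (sum, count) = aggregateLike(partitions)((0.0, 0))(
      (acc, x) => (acc._1 + x, acc._2 + 1),   // seqOp: add one element to (sum, count)
      (a, b) => (a._1 + b._1, a._2 + b._2)    // combOp: merge two (sum, count) pairs
    )
    println(sum / count)
  }
}
```

The accumulator type (sum, count) differs from the element type Double, which is why two functions are needed: one that consumes elements and one that merges accumulators.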

Re: what is the best way to implement mini batches?

2014-12-11 Thread ll

any advice/comment on this would be much appreciated. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/what-is-the-best-way-to-implement-mini-batches-tp20264p20635.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

why is spark + scala code so slow, compared to python?

2014-12-11 Thread ll

hi.. i'm converting some of my machine learning python code into scala + spark. i haven't been able to run it on large dataset yet, but on small datasets (like http://yann.lecun.com/exdb/mnist/), my spark + scala code is much slower than my python code (5 to 10 times slower than python) i

RDD.aggregate?

2014-12-04 Thread ll

can someone please explain how RDD.aggregate works? i looked at the average example done with aggregate() but i'm still confused about this function... much appreciated. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/RDD-aggregate-tp20434.html Sent from

where is the org.apache.spark.util package?

2014-11-07 Thread ll

i'm trying to compile some of the spark code directly from the source (https://github.com/apache/spark). it complains about the missing package org.apache.spark.util. it doesn't look like this package is part of the source code on github. where can i find this package? -- View this message

Re: where is the org.apache.spark.util package?

2014-11-07 Thread ll

i found the util package under the spark core package, but i now get this error: Symbol Utils is inaccessible from this place. what does this error mean? the org.apache.spark.util and org.apache.spark.spark.Utils are there now. thanks. -- View this message in context:

Re: Fwd: Why is Spark not using all cores on a single machine?

2014-11-07 Thread ll

hi. i did use local[8] as below, but it still ran on only 1 core. val sc = new SparkContext(new SparkConf().setMaster("local[8]").setAppName("abc")) any advice is much appreciated. -- View this message in context:

word2vec: how to save an mllib model and reload it?

2014-11-06 Thread ll

what is the best way to save an mllib model that you just trained and reload it in the future? specifically, i'm using the mllib word2vec model... thanks. -- View this message in context:

Re: Matrix multiplication in spark

2014-11-05 Thread ll

@sowen.. i am looking for distributed operations, especially very large sparse matrix x sparse matrix multiplication. what is the best way to implement this in spark? -- View this message in context:

sparse x sparse matrix multiplication

2014-11-04 Thread ll

what is the best way to implement a sparse x sparse matrix multiplication with spark? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/sparse-x-sparse-matrix-multiplication-tp18163.html

is spark a good fit for sequential machine learning algorithms?

2014-11-03 Thread ll

i'm struggling with implementing a few algorithms with spark. hope to get help from the community. most of the machine learning algorithms today are sequential, while spark is all about parallelism. it seems to me that using spark doesn't actually help much, because in most cases you can't

SparkContext.stop() ?

2014-10-31 Thread ll

what is it for? when do we call it? thanks! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkContext-stop-tp17826.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

real-time streaming

2014-10-28 Thread ll

the spark tutorial shows that we can create a stream that reads new files from a directory. that seems to have some lag time, as we have to write the data to file first and then wait until spark stream picks it up. what is the best way to implement REAL 'REAL-TIME' streaming for analysis in

Re: real-time streaming

2014-10-28 Thread ll

thanks jay. do you think spark is a good fit for analyzing streaming video in real time? in this case, we're streaming 30 frames per second, and each frame is an image (size: roughly 500K - 1MB). we need to analyze every frame and return the analysis result back instantly in real

complexity of each action / transformation

2014-10-17 Thread ll

hello... is there a list that shows the complexity of each action/transformation? for example, what is the complexity of RDD.map()/filter() or RowMatrix.multiply() etc? that would be really helpful. thanks! -- View this message in context:

mllib.linalg.Vectors vs Breeze?

2014-10-17 Thread ll

hello... i'm looking at the source code for mllib.linalg.Vectors and it looks like it's a wrapper around Breeze with very small changes (mostly changing the names). i don't have any problem with using spark wrapper around Breeze or Breeze directly. i'm just curious to understand why this wrapper

reverse an rdd

2014-10-16 Thread ll

hello... what is the best way to iterate through an rdd backward (last element first, first element last)? thanks! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/reverse-an-rdd-tp16602.html Sent from the Apache Spark User List mailing list archive at

scala: java.net.BindException?

2014-10-16 Thread ll

hello... does anyone know how to resolve this issue? i'm running this locally on my computer. keep getting this BindException. much appreciated. 14/10/16 17:48:13 WARN component.AbstractLifeCycle: FAILED SelectChannelConnector@0.0.0.0:4040: java.net.BindException: Address already in use

object in an rdd: serializable?

2014-10-16 Thread ll

i got an exception complaining about serializable. the sample code is below... class HelloWorld(val count: Int) { ... ... } object Test extends App { ... val data = sc.parallelize(List(new HelloWorld(1), new HelloWorld(2))) ... } what is the best way to serialize HelloWorld so that
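A minimal sketch related to the exception described above, assuming the default Java serialization is in use and that the fix wanted is simply to mark the class as serializable. The snippet is cut off, so this is one plausible reading and not a confirmed answer from the thread. The object `SerializableSketch` and the helper `roundTrip` are hypothetical names for this illustration; the code runs without Spark and only checks that an instance survives a Java serialization round trip.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Same class as in the question, with Serializable mixed in (assumed fix).
class HelloWorld(val count: Int) extends Serializable

object SerializableSketch {
  // Hypothetical helper: writes a value with Java serialization and reads it back.
  def roundTrip[A <: AnyRef](value: A): A = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value) // throws NotSerializableException if the class is not serializable
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[A] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val copy = roundTrip(new HelloWorld(1))
    println(copy.count)
  }
}
```

Removing `extends Serializable` from HelloWorld makes `roundTrip` fail with java.io.NotSerializableException, which is the kind of error the question describes.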

matrix operations?

2014-10-15 Thread ll

hi there... are there any other matrix operations in addition to multiply()? like addition or dot product? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/matrix-operations-tp16508.html Sent from the Apache Spark User List mailing list archive at

RowMatrix.multiply() ?

2014-10-15 Thread ll

hi.. it looks like RowMatrix.multiply() takes a local Matrix as a parameter and returns the result as a distributed RowMatrix. how do you perform this series of multiplications if A, B, C, and D are all RowMatrix? ((A x B) x C) x D thanks! -- View this message in context:

Re: graphx - mutable?

2014-10-14 Thread ll

hi again. just want to check in again to see if anyone could advise on how to implement a mutable, growing graph with graphx? we're building a graph that is growing over time. it adds more vertices and edges every iteration of our algorithm. it doesn't look like there is an obvious way to add a

mllib CoordinateMatrix

2014-10-14 Thread ll

after creating a coordinate matrix from my rdd[matrixentry]... 1. how can i get/query the value at coordinate (i, j)? 2. how can i set/update the value at coordinate (i, j)? 3. how can i get all the values on a specific row i, ideally as a vector? thanks! -- View this message in context:

graphx - mutable?

2014-10-05 Thread ll

i understand that graphx is an immutable rdd. i'm working on an algorithm that requires a mutable graph. initially, the graph starts with just a few nodes and edges. then over time, it adds more and more nodes and edges. what would be the best way to implement this growing graph with

Re: android + spark streaming?

2014-10-04 Thread ll

any comment/feedback/advice on this is much appreciated! thanks. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/android-spark-streaming-tp15661p15735.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

scala Vector vs mllib Vector

2014-10-04 Thread ll

what are the pros/cons of each? when should we use mllib Vector, and when to use standard scala Vector? thanks. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/scala-Vector-vs-mllib-Vector-tp15736.html Sent from the Apache Spark User List mailing list

Re: scala Vector vs mllib Vector

2014-10-04 Thread ll

thanks dean. thanks for the answer with great clarity! i'm working on an algorithm that has a weight vector W(w0, w1, .., wN). the elements of this weight vector are adjusted/updated frequently - every iteration of the algorithm. how would you recommend to implement this vector? what is the

28 matches
