Using CUDA within Spark / boosting linear algebra

2015-02-05 Thread Ulanov, Alexander
Dear Spark developers, I am exploring how to make linear algebra operations faster within Spark. One way of doing this is to use the Scala Breeze library that is bundled with Spark. For matrix operations, it employs netlib-java, which provides a Java wrapper for BLAS (Basic Linear Algebra Subprograms)

Maximum size of vector that reduce can handle

2015-01-23 Thread Ulanov, Alexander
Dear Spark developers, I am trying to measure the Spark reduce performance for big vectors. My motivation comes from gradient computation in machine learning. The gradient is a vector that is computed on each worker, and then all results need to be summed up and broadcast back to the workers. For example,

RE: Maximum size of vector that reduce can handle

2015-01-23 Thread Ulanov, Alexander
aggregation? If that is the problem, then how to force it to do aggregation after receiving each portion of data from Workers? Best regards, Alexander -Original Message- From: DB Tsai [mailto:dbt...@dbtsai.com] Sent: Friday, January 23, 2015 11:53 AM To: Ulanov, Alexander Cc: dev

RE: Is breeze thread safe in Spark?

2014-09-04 Thread Ulanov, Alexander
, changes to memory between different threads is bad. There's actually a potential bug in the KMeans code where it uses += instead of +. On Wed, Sep 3, 2014 at 1:26 PM, Ulanov, Alexander alexander.ula...@hp.com wrote: Hi, Is the Breeze library thread-safe when called from Spark MLlib code

Is breeze thread safe in Spark?

2014-09-03 Thread Ulanov, Alexander
Hi, Is the Breeze library thread-safe when called from Spark MLlib code in the case where native libs for BLAS and LAPACK are used? Might it be an issue when running Spark locally? Best regards, Alexander

Re: Is breeze thread safe in Spark?

2014-09-03 Thread Ulanov, Alexander
, 2014 at 1:26 PM, Ulanov, Alexander alexander.ula...@hp.com wrote: Hi, Is the Breeze library thread-safe when called from Spark MLlib code in the case where native libs for BLAS and LAPACK are used? Might it be an issue when running Spark locally? Best regards, Alexander

Re: Gradient descent and runMiniBatchSGD

2014-08-26 Thread Ulanov, Alexander
at 8:15 AM, Ulanov, Alexander alexander.ula...@hp.com wrote: Hi, RJ https://github.com/avulanov/spark/blob/neuralnetwork/mllib/src/main/scala/org/apache/spark/mllib/classification/NeuralNetwork.scala Unit tests are in the same branch. Alexander From: RJ Nowling

Spark maven project with the latest Spark jars

2014-08-05 Thread Ulanov, Alexander
Hi, I'm trying to create a Maven project that references the latest build of Spark. 1) downloaded sources and compiled the latest version of Spark. 2) added the new spark-core jar to a new local Maven repo 3) created a Scala Maven project with net.alchim31.maven (scala-archetype-simple v 1.5) 4) added

RE: Feature selection interface

2014-07-18 Thread Ulanov, Alexander
FYI This is my first take on feature selection, filtering and chi-squared: https://github.com/apache/spark/pull/1484 -Original Message- From: Ulanov, Alexander Sent: Thursday, July 10, 2014 9:39 PM To: dev@spark.apache.org Subject: Feature selection interface Hi, I've implemented

Feature selection interface

2014-07-10 Thread Ulanov, Alexander
Hi, I've implemented a class that does Chi-squared feature selection for RDD[LabeledPoint]. It also computes basic class/feature occurrence statistics, and other methods like mutual information or information gain can be easily implemented. I would like to make a pull request. However, MLlib
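For readers unfamiliar with the statistic mentioned in this message, here is a minimal sketch of Pearson's chi-squared statistic for one feature against the class labels. It is not the author's implementation and does not use Spark or MLlib; ChiSquaredSketch and chiSquared are hypothetical names, and the input is assumed to be an already-counted feature/class occurrence table.

```scala
object ChiSquaredSketch {
  // observed(i)(j): number of examples with feature value i and class label j.
  // Returns the sum over all cells of (observed - expected)^2 / expected,
  // where expected = rowSum * colSum / total (independence assumption).
  def chiSquared(observed: Array[Array[Double]]): Double = {
    val rowSums = observed.map(_.sum)
    val colSums = observed.transpose.map(_.sum)
    val total = rowSums.sum
    var stat = 0.0
    for (i <- observed.indices; j <- observed(i).indices) {
      val expected = rowSums(i) * colSums(j) / total
      if (expected > 0) { // skip empty rows/columns to avoid division by zero
        val diff = observed(i)(j) - expected
        stat += diff * diff / expected
      }
    }
    stat
  }
}
```

A feature selector along these lines would rank features by this value and keep the highest-scoring ones; how the thread's class does the ranking is not stated in the snippet.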

Pass parameters to RDD functions

2014-07-03 Thread Ulanov, Alexander
Hi, I wonder how I can pass parameters to RDD functions with closures. If I do it in the following way, Spark crashes with NotSerializableException: class TextToWordVector(csvData:RDD[Array[String]]) { val n = 1 lazy val x = csvData.map{ stringArr => stringArr(n)}.collect() } Exception: Job
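A minimal sketch of the likely cause, assuming the closure fails because it refers to the field n and therefore captures the enclosing instance, which is not Serializable. The sketch does not use Spark: TextToWordVectorSketch, byField, byLocal and ClosureCheck are hypothetical names, and plain Java serialization stands in for the way Spark serializes closures. Copying the field into a local val is one common workaround; it is not taken from the replies in this thread.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Stand-in for the class in the question: it does not extend Serializable.
class TextToWordVectorSketch {
  val n = 1

  // Refers to the field `n`, i.e. `this.n`, so the closure captures `this`.
  def byField: Array[String] => String = stringArr => stringArr(n)

  // Copies the field into a local val first; the closure captures only an Int.
  def byLocal: Array[String] => String = {
    val column = n
    stringArr => stringArr(column)
  }
}

object ClosureCheck {
  // Returns true if Java serialization accepts `f`, false if it hits
  // a non-serializable object while writing the closure.
  def serializes(f: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(f)
      true
    } catch {
      case _: NotSerializableException => false
    }
}
```

In the class from the question, the same idea would mean assigning n to a local val inside the lazy val body and using that local inside map. Making the class Serializable is another option, at the cost of shipping the whole instance with each task.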

RE: Pass parameters to RDD functions

2014-07-03 Thread Ulanov, Alexander
, Ulanov, Alexander alexander.ula...@hp.com wrote: Hi, I wonder how I can pass parameters to RDD functions with closures. If I do it in the following way, Spark crashes with NotSerializableException: class TextToWordVector(csvData:RDD[Array[String]]) { val n = 1 lazy val x = csvData.map

RE: Artificial Neural Network in Spark?

2014-07-01 Thread Ulanov, Alexander
a project there which used autoencoder functions... It hasn't been updated for a long time now! On Thu, Jun 26, 2014 at 10:51 PM, Ulanov, Alexander alexander.ula...@hp.com wrote: Hi Bert, It would be extremely interesting. Do you plan to implement autoencoder as well? It would be great

Re: IntelliJ IDEA cannot compile TreeNode.scala

2014-06-26 Thread Ulanov, Alexander
Hi Ron Hu, The IDEA project generated with update gen-idea didn't work properly for me either. My workaround is to open the corresponding Maven project in IDEA (File -> Open, look for the .pom file). To compile the opened project I use the Maven window in IDEA (View -> show Maven). However, tests fail to

Re: Artificial Neural Network in Spark?

2014-06-26 Thread Ulanov, Alexander
Hi Bert, It would be extremely interesting. Do you plan to implement autoencoder as well? It would be great to have deep learning in Spark. Best regards, Alexander On 27.06.2014, at 4:47, Bert Greevenbosch bert.greevenbo...@huawei.com wrote: Hello all, I was wondering whether
