Re: Using CUDA within Spark / boosting linear algebra

2015-03-25 Thread Reza Zadeh
These are awesome (and surprising) results, Alex. I've been following this
thread and I'm really surprised by the improvement over BIDMat-cuda: almost
20x faster.

Any chance you could send the scripts or a GitHub gist for reproduction?

Thanks,
Reza

On Wed, Mar 25, 2015 at 2:31 PM, Ulanov, Alexander wrote:

> Hi again,
>
> I finally managed to use nvblas within Spark+netlib-java. It has
> exceptional performance for big matrices with Double, faster than
> BIDMat-cuda with Float. But for smaller matrices, if you have to copy them
> to/from the GPU, OpenBLAS or MKL might be a better choice. This correlates
> with the original nvblas presentation at GPU conf 2013 (slide 21):
> http://on-demand.gputechconf.com/supercomputing/2013/presentation/SC3108-New-Features-CUDA%206%20-GPU-Acceleration.pdf
>
> My results:
>
> https://docs.google.com/spreadsheets/d/1lWdVSuSragOobb0A_oeouQgHUMx378T9J5r7kwKSPkY/edit?usp=sharing
>
> To be clear, these tests are not meant to generalize the performance of
> different libraries. I just want to pick the library that performs dense
> matrix multiplication best for my task.
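>
> For reproduction, the core of the benchmark looks roughly like the sketch
> below (assumptions: netlib-java's BLAS front-end, which is the same entry
> point MLlib uses, and a square Double matrix; my actual scripts may differ):
>
>     import com.github.fommil.netlib.BLAS
>
>     val n = 4096
>     val rnd = new java.util.Random(0)
>     val a = Array.fill(n * n)(rnd.nextDouble()) // column-major n x n
>     val b = Array.fill(n * n)(rnd.nextDouble())
>     val c = new Array[Double](n * n)
>
>     val t0 = System.nanoTime()
>     // C := 1.0 * A * B + 0.0 * C; with LD_PRELOAD=libnvblas.so this dgemm
>     // should be intercepted by nvblas and run on the GPU
>     BLAS.getInstance().dgemm("N", "N", n, n, n, 1.0, a, n, b, n, 0.0, c, n)
>     println(s"dgemm took ${(System.nanoTime() - t0) / 1e9} s")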
>
> P.S. My previous issue with nvblas was the following: it exposes Fortran
> BLAS functions, while netlib-java calls the C CBLAS interface. So one
> needs a CBLAS shared library to use nvblas through netlib-java. Fedora does
> not ship CBLAS (Debian and Ubuntu do), so I had to compile it myself. I
> could not use the CBLAS from ATLAS or OpenBLAS because those link to their
> own implementations rather than to the Fortran BLAS.
>
> Best regards, Alexander
>
> -----Original Message-----
> From: Ulanov, Alexander
> Sent: Tuesday, March 24, 2015 6:57 PM
> To: Sam Halliday
> Cc: dev@spark.apache.org; Xiangrui Meng; Joseph Bradley; Evan R. Sparks
> Subject: RE: Using CUDA within Spark / boosting linear algebra
>
> Hi,
>
> I am trying to use nvblas with netlib-java from Spark. nvblas functions
> should replace the current BLAS function calls after setting LD_PRELOAD as
> suggested in http://docs.nvidia.com/cuda/nvblas/#Usage, without any
> changes to netlib-java. It seems to work for a simple Java example, but I
> cannot make it work with Spark. I run the following:
>
> export LD_LIBRARY_PATH=/usr/local/cuda-6.5/lib64
> env LD_PRELOAD=/usr/local/cuda-6.5/lib64/libnvblas.so ./spark-shell --driver-memory 4G
>
> In nvidia-smi I observe that Java is set to use the GPU:
>
> +-----------------------------------------------------------------------------+
> | Processes:                                                       GPU Memory |
> |  GPU       PID  Type  Process name                                    Usage |
> |=============================================================================|
> |    0      8873     C  bash                                            39MiB |
> |    0      8910     C  /usr/lib/jvm/java-1.7.0/bin/java                39MiB |
> +-----------------------------------------------------------------------------+
>
> In the Spark shell I do matrix multiplication and see the following:
> 15/03/25 06:48:01 INFO JniLoader: successfully loaded
> /tmp/jniloader8192964377009965483netlib-native_system-linux-x86_64.so
> So I am sure that netlib-native is loaded and cblas is supposedly used.
> However, the matrix multiplication still executes on the CPU, since I see
> 16% CPU usage and 0% GPU usage. I also checked different matrix sizes, from
> 100x100 to 12000x12000.
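>
> For reference, the multiplication itself was along these lines (a sketch,
> assuming Spark 1.3's mllib.linalg DenseMatrix API; the sizes varied as
> described above):
>
>     import org.apache.spark.mllib.linalg.DenseMatrix
>
>     val n = 4096
>     val rng = new java.util.Random(0)
>     val a = DenseMatrix.rand(n, n, rng)
>     val b = DenseMatrix.rand(n, n, rng)
>     val c = a.multiply(b) // local gemm, dispatched through netlib-java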
>
> Could you suggest why LD_PRELOAD might not affect the Spark shell?
>
> Best regards, Alexander
>
>
>
> From: Sam Halliday [mailto:sam.halli...@gmail.com]
> Sent: Monday, March 09, 2015 6:01 PM
> To: Ulanov, Alexander
> Cc: dev@spark.apache.org; Xiangrui Meng; Joseph Bradley; Evan R. Sparks
> Subject: RE: Using CUDA within Spark / boosting linear algebra
>
>
> Thanks so much for following up on this!
>
> Hmm, I wonder if we should have a concerted effort to chart performance on
> various pieces of hardware...
> On 9 Mar 2015 21:08, "Ulanov, Alexander" <alexander.ula...@hp.com> wrote:
> Hi Everyone, I've updated the benchmark as Xiangrui suggested. I added a
> comment that BIDMat 0.9.7 uses Float matrices on the GPU (although I can
> see support for Double in the current source code), and ran the test with
> BIDMat and CPU Double matrices. BIDMat MKL is indeed on par with netlib MKL.
>
>
> https://docs.google.com/spreadsheets/d/1lWdVSuSragOobb0A_oeouQgHUMx378T9J5r7kwKSPkY/edit?usp=sharing
>
> Best regards, Alexander
>
> -----Original Message-----
> From: Sam Halliday [mailto:sam.halli...@gmail.com]
> Sent: Tuesday, March 03, 2015 1:54 PM
> To: Xiangrui Meng; Joseph Bradley
> Cc: Evan R. Sparks; Ulanov, Alexander; dev@spark.apache.org
> Subject: Re: Using CUDA within Spark / boosting linear algebra
>
> BTW, is anybody on this list going to the London Meetup in a few weeks?
>
>
> https://skillsmatter.com/meetups/6987-apache-spark-living-the-post-mapreduce-world#community
>
> Would be nice to meet other people working on the guts of Spark! :-)
>
>
> Xiangrui Meng <men...@gmail.com> writes:
>
> > Hey Alexander,
> >
>

Re: Semantics of LGTM

2015-01-17 Thread Reza Zadeh
LGTM

On Sat, Jan 17, 2015 at 5:40 PM, Patrick Wendell  wrote:

> Hey All,
>
> Just wanted to ping about a minor issue - but one that ends up having
> consequences given Spark's volume of reviews and commits. As much as
> possible, I think that we should try to gear towards "Google Style"
> LGTM on reviews. What I mean by this is that LGTM has the following
> semantics:
>
> "I know this code well, or I've looked at it close enough to feel
> confident it should be merged. If there are issues/bugs with this code
> later on, I feel confident I can help with them."
>
> Here is an alternative semantic:
>
> "Based on what I know about this part of the code, I don't see any
> show-stopper problems with this patch".
>
> The issue with the latter is that it ultimately erodes the
> significance of LGTM, since subsequent reviewers need to reason about
> what the person meant by saying LGTM. In contrast, having strong
> semantics around LGTM can help streamline reviews a lot, especially as
> reviewers get more experienced and gain trust from the committership.
>
> There are several easy ways to give a more limited endorsement of a patch:
> - "I'm not familiar with this code, but style, etc look good" (general
> endorsement)
> - "The build changes in this code LGTM, but I haven't reviewed the
> rest" (limited LGTM)
>
> If people are okay with this, I might add a short note on the wiki.
> I'm sending this e-mail first, though, to see whether anyone wants to
> express agreement or disagreement with this approach.
>
> - Patrick
>


Re: Row Similarity

2014-12-10 Thread Reza Zadeh
Here we go: https://issues.apache.org/jira/browse/SPARK-4823

On Wed, Dec 10, 2014 at 9:01 PM, Debasish Das wrote:

> I added code to compute the topK products for each user and the topK users
> for each product in SPARK-3066..
>
> That is different from the row-similarity calculation, as we need both user
> and product factors to calculate the topK recommendations..
>
> For (1) and (2) we are trying to answer similarUsers for a given user and
> similarProducts for a given product.
>
> similarProducts for a given product is straightforward to compute through
> columnSimilarities/dimsum when the product factor matrix is skinny...
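>
> For concreteness, that path is roughly the sketch below (hypothetical
> variable names, assuming MLlib 1.2's DIMSUM API rather than the SPARK-3066
> code): lay the factor matrix out with products as columns, then threshold
> the pairwise cosine similarities:
>
>     import org.apache.spark.mllib.linalg.distributed.RowMatrix
>
>     // factorRows: RDD[Vector], one row per latent dimension and one column
>     // per product; feasible when the product count is modest (skinny matrix)
>     val mat = new RowMatrix(factorRows)
>     val sims = mat.columnSimilarities(0.1) // DIMSUM, sampling threshold 0.1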
>
> similarUsers for a given user will need a map-reduce implementation of row
> similarity, since the matrix is tall...
>
> I don't see a JIRA for that yet... Are there any good references for a
> map-reduce implementation of row similarity?
>
> On Wed, Dec 10, 2014 at 2:30 PM, Reza Zadeh  wrote:
>
>> It's not so cheap to compute row similarities when there are many rows:
>> it amounts to computing AA^T for a matrix A with many rows, which is
>> expensive.
>>
>> There is a JIRA to track handling (1) and (2) more efficiently than
>> computing all pairs: https://issues.apache.org/jira/browse/SPARK-3066
>>
>>
>>
>> On Wed, Dec 10, 2014 at 2:44 PM, Debasish Das wrote:
>>
>>> Hi,
>>>
>>> It seems there are multiple places where we would like to compute row
>>> similarity (accurate or approximate similarities).
>>>
>>> Basically, through RowMatrix columnSimilarities we can compute the column
>>> similarities of a tall skinny matrix.
>>>
>>> Similarly, we should have an API in RowMatrix called rowSimilarities where
>>> we can compute similar rows in a map-reduce fashion. It would be useful for
>>> the following use-cases:
>>>
>>> 1. Generate topK users for each user from matrix factorization model
>>> 2. Generate topK products for each product from matrix factorization
>>> model
>>> 3. Generate kernel matrix for use in spectral clustering
>>> 4. Generate kernel matrix for use in kernel regression/classification
>>>
>>> I am not sure if there are already good implementations of map-reduce row
>>> similarity that we can use (ideas like Fastfood and kitchen sinks felt more
>>> suited to the classification use-case, but user similarities also show up
>>> in recommendation, which is unsupervised)...
>>>
>>> Is there a JIRA tracking it? If not, I can open one and we can discuss
>>> further on it.
>>>
>>> Thanks.
>>> Deb
>>>
>>
>>
>


Re: Row Similarity

2014-12-10 Thread Reza Zadeh
It's not so cheap to compute row similarities when there are many rows: it
amounts to computing AA^T for a matrix A with many rows, which is expensive.

There is a JIRA to track handling (1) and (2) more efficiently than
computing all pairs: https://issues.apache.org/jira/browse/SPARK-3066
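
To make the cost concrete, here is a naive all-pairs sketch (a hypothetical
helper, not an MLlib API): it materializes every row pair via cartesian,
which is exactly the quadratic work that makes many-row inputs expensive:

    import org.apache.spark.SparkContext._
    import org.apache.spark.rdd.RDD

    // cosine similarity for every distinct pair of rows; O(n^2) pairs for n rows
    def rowSimilarities(rows: RDD[(Long, Array[Double])]): RDD[((Long, Long), Double)] = {
      def dot(x: Array[Double], y: Array[Double]): Double = {
        var s = 0.0; var i = 0
        while (i < x.length) { s += x(i) * y(i); i += 1 }
        s
      }
      val withNorms = rows.mapValues(v => (v, math.sqrt(dot(v, v))))
      withNorms.cartesian(withNorms)
        .filter { case ((i, _), (j, _)) => i < j }
        .map { case ((i, (vi, ni)), (j, (vj, nj))) => ((i, j), dot(vi, vj) / (ni * nj)) }
    }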



On Wed, Dec 10, 2014 at 2:44 PM, Debasish Das wrote:

> Hi,
>
> It seems there are multiple places where we would like to compute row
> similarity (accurate or approximate similarities).
>
> Basically, through RowMatrix columnSimilarities we can compute the column
> similarities of a tall skinny matrix.
>
> Similarly, we should have an API in RowMatrix called rowSimilarities where
> we can compute similar rows in a map-reduce fashion. It would be useful for
> the following use-cases:
>
> 1. Generate topK users for each user from matrix factorization model
> 2. Generate topK products for each product from matrix factorization model
> 3. Generate kernel matrix for use in spectral clustering
> 4. Generate kernel matrix for use in kernel regression/classification
>
> I am not sure if there are already good implementations of map-reduce row
> similarity that we can use (ideas like Fastfood and kitchen sinks felt more
> suited to the classification use-case, but user similarities also show up
> in recommendation, which is unsupervised)...
>
> Is there a JIRA tracking it? If not, I can open one and we can discuss
> further on it.
>
> Thanks.
> Deb
>


Re: matrix computation in spark

2014-11-17 Thread Reza Zadeh
Hi Yuxi,

We are integrating ml-matrix from the AMPLab repo into MLlib, tracked
by this JIRA: https://issues.apache.org/jira/browse/SPARK-3434

We already have matrix multiply, but we are missing LU decomposition. Could
you please watch that JIRA? Once the initial design is in, we can sync on
how to contribute LU decomposition.

Let's move the discussion to the JIRA.

Thanks!

On Mon, Nov 17, 2014 at 9:49 PM, 顾荣  wrote:

> Hey Yuxi,
>
> We also have implemented a distributed matrix multiplication library in
> PasaLab. The repo is hosted here: https://github.com/PasaLab/marlin . We
> implemented three distributed matrix multiplication algorithms on Spark.
> As we see it, communication-optimal does not always mean optimal overall.
> Thus, besides the CARMA matrix multiplication you mentioned, we also
> implemented Block-splitting matrix multiplication and Broadcast matrix
> multiplication. They are more efficient than CARMA matrix multiplication
> in some situations, for example when a large matrix multiplies a small
> matrix.
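>
> To illustrate the broadcast variant, a minimal sketch (my assumed block
> layout, not the actual marlin code): when B is small, ship it to every
> task and multiply locally, so A is never shuffled:
>
>     import breeze.linalg.DenseMatrix
>     import org.apache.spark.SparkContext
>     import org.apache.spark.SparkContext._
>     import org.apache.spark.rdd.RDD
>
>     // A is n x k, stored as horizontal row blocks; B is a small local k x m
>     def broadcastMultiply(sc: SparkContext,
>                           aBlocks: RDD[(Int, DenseMatrix[Double])],
>                           b: DenseMatrix[Double]): RDD[(Int, DenseMatrix[Double])] = {
>       val bBr = sc.broadcast(b)
>       aBlocks.mapValues(block => block * bBr.value) // local gemm per block
>     }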
>
> Actually, we shared this work at the Spark Meetup@Beijing on October 26th (
> http://www.meetup.com/spark-user-beijing-Meetup/events/210422112/ ). The
> slides can be downloaded from the archive here:
> http://pan.baidu.com/s/1dDoyHX3#path=%252Fmeetup-3rd
>
> Best,
> Rong
>
> 2014-11-18 13:11 GMT+08:00 顾荣 :
>
> > Hey Yuxi,
> >
> > We also have implemented a distributed matrix multiplication library in
> > PasaLab. The repo is hosted here: https://github.com/PasaLab/marlin . We
> > implemented three distributed matrix multiplication algorithms on Spark.
> > As we see it, communication-optimal does not always mean optimal overall.
> > Thus, besides the CARMA matrix multiplication you mentioned, we also
> > implemented Block-splitting matrix multiplication and Broadcast matrix
> > multiplication. They are more efficient than CARMA matrix multiplication
> > in some situations, for example when a large matrix multiplies a small
> > matrix.
> >
> > Actually, we shared this work at the Spark Meetup@Beijing on October 26th (
> > http://www.meetup.com/spark-user-beijing-Meetup/events/210422112/ ).
> > The slides are also attached to this mail.
> >
> > Best,
> > Rong
> >
> > 2014-11-18 11:36 GMT+08:00 Zongheng Yang :
> >
> >> There's been some work at the AMPLab on a distributed matrix library on
> >> top of Spark; see here [1]. In particular, the repo contains a couple of
> >> factorization algorithms.
> >>
> >> [1] https://github.com/amplab/ml-matrix
> >>
> >> Zongheng
> >>
> >> On Mon Nov 17 2014 at 7:34:17 PM liaoyuxi  wrote:
> >>
> >> > Hi,
> >> > Matrix computation is critical for the efficiency of algorithms like
> >> > least squares, Kalman filters, and so on.
> >> > For now, the mllib module offers limited linear algebra on matrices,
> >> > especially distributed matrices.
> >> >
> >> > We have been working on establishing distributed matrix computation
> >> > APIs based on the data structures in MLlib.
> >> > The main idea is to partition the matrix into sub-blocks, based on the
> >> > strategy in the following paper:
> >> > http://www.cs.berkeley.edu/~odedsc/papers/bfsdfs-mm-ipdps13.pdf
> >> > In our experiments, it is communication-optimal.
> >> > But operations like factorization may not be appropriate to carry out
> >> > in blocks.
> >> >
> >> > Any suggestions and guidance are welcome.
> >> >
> >> > Thanks,
> >> > Yuxi
> >> >
> >> >
> >>
> >
> >
> >
> > --
> > --
> > Rong Gu
> > Department of Computer Science and Technology
> > State Key Laboratory for Novel Software Technology
> > Nanjing University
> > Phone: +86 15850682791
> > Email: gurongwal...@gmail.com
> > Homepage: http://pasa-bigdata.nju.edu.cn/people/ronggu/
> >
>
>
>
> --
> --
> Rong Gu
> Department of Computer Science and Technology
> State Key Laboratory for Novel Software Technology
> Nanjing University
> Phone: +86 15850682791
> Email: gurongwal...@gmail.com
> Homepage: http://pasa-bigdata.nju.edu.cn/people/ronggu/
>


Re: TimSort in 1.2

2014-11-13 Thread Reza Zadeh
See https://issues.apache.org/jira/browse/SPARK-2045
and https://issues.apache.org/jira/browse/SPARK-3280

On Thu, Nov 13, 2014 at 4:19 PM, Debasish Das wrote:

> Hi,
>
> I am noticing that the first step of Spark jobs does a TimSort in the 1.2
> branch, and there is some time spent doing the TimSort... Is this assigning
> the RDD blocks to different nodes based on a sort order?
>
> Could someone please point me to a JIRA about this change so that I can
> read more about it?
>
> Thanks.
> Deb
>


Re: [VOTE] Designating maintainers for some Spark components

2014-11-05 Thread Reza Zadeh
+1, sounds good.

On Wed, Nov 5, 2014 at 9:19 PM, Kousuke Saruta wrote:

> +1, It makes sense!
>
> - Kousuke
>
>
> (2014/11/05 17:31), Matei Zaharia wrote:
>
>> Hi all,
>>
>> I wanted to share a discussion we've been having on the PMC list, as well
>> as call for an official vote on it on a public list. Basically, as the
>> Spark project scales up, we need to define a model to make sure there is
>> still great oversight of key components (in particular internal
>> architecture and public APIs), and to this end I've proposed implementing a
>> maintainer model for some of these components, similar to other large
>> projects.
>>
>> As background on this, Spark has grown a lot since joining Apache. We've
>> had over 80 contributors/month for the past 3 months, which I believe makes
>> us the most active project in contributors/month at Apache, as well as over
>> 500 patches/month. The codebase has also grown significantly, with new
>> libraries for SQL, ML, graphs and more.
>>
>> In this kind of large project, one common way to scale development is to
>> assign "maintainers" to oversee key components, where each patch to that
>> component needs to get sign-off from at least one of its maintainers. Most
>> existing large projects do this -- at Apache, some large ones with this
>> model are CloudStack (the second-most active project overall), Subversion,
>> and Kafka, and other examples include Linux and Python. This is also
>> by-and-large how Spark operates today -- most components have a de-facto
>> maintainer.
>>
>> IMO, adopting this model would have two benefits:
>>
>> 1) Consistent oversight of design for that component, especially
>> regarding architecture and API. This process would ensure that the
> >> component's maintainers see all proposed changes and consider whether
> >> they fit together in a good way.
>>
>> 2) More structure for new contributors and committers -- in particular,
>> it would be easy to look up who’s responsible for each module and ask them
> >> for reviews, etc, rather than having patches slip through the cracks.
>>
> >> We'd like to start in a light-weight manner, where the model only
>> applies to certain key components (e.g. scheduler, shuffle) and user-facing
>> APIs (MLlib, GraphX, etc). Over time, as the project grows, we can expand
>> it if we deem it useful. The specific mechanics would be as follows:
>>
>> - Some components in Spark will have maintainers assigned to them, where
>> one of the maintainers needs to sign off on each patch to the component.
>> - Each component with maintainers will have at least 2 maintainers.
>> - Maintainers will be assigned from the most active and knowledgeable
>> committers on that component by the PMC. The PMC can vote to add / remove
>> maintainers, and maintained components, through consensus.
>> - Maintainers are expected to be active in responding to patches for
>> their components, though they do not need to be the main reviewers for them
>> (e.g. they might just sign off on architecture / API). To prevent inactive
>> maintainers from blocking the project, if a maintainer isn't responding in
>> a reasonable time period (say 2 weeks), other committers can merge the
>> patch, and the PMC will want to discuss adding another maintainer.
>>
>> If you'd like to see examples for this model, check out the following
>> projects:
> >> - CloudStack: https://cwiki.apache.org/confluence/display/CLOUDSTACK/CloudStack+Maintainers+Guide
> >> - Subversion: https://subversion.apache.org/docs/community-guide/roles.html
>>
> >> Finally, I wanted to list our current proposal for initial components and
> >> maintainers. It would be good to get feedback on other components we might
> >> add, but please note that personnel discussions (e.g. "I don't think Matei
> >> should maintain *that* component") should only happen on the private list.
> >> The initial components were chosen to include all public APIs and the main
> >> core components, and the maintainers were chosen from the most active
> >> contributors to those modules.
>>
>> - Spark core public API: Matei, Patrick, Reynold
>> - Job scheduler: Matei, Kay, Patrick
>> - Shuffle and network: Reynold, Aaron, Matei
>> - Block manager: Reynold, Aaron
>> - YARN: Tom, Andrew Or
>> - Python: Josh, Matei
>> - MLlib: Xiangrui, Matei
>> - SQL: Michael, Reynold
>> - Streaming: TD, Matei
>> - GraphX: Ankur, Joey, Reynold
>>
>> I'd like to formally call a [VOTE] on this model, to last 72 hours. The
>> [VOTE] will end on Nov 8, 2014 at 6 PM PST.
>>
>> Matei
>>
>
>


Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-06 Thread Reza Zadeh
+1
Tested the recently merged MLlib matrix multiplication bugfix.



On Sat, Sep 6, 2014 at 2:35 PM, Tathagata Das wrote:

> +1
>
> Tested streaming integration with flume on a local test bed.
>
>
> On Thu, Sep 4, 2014 at 6:08 PM, Kan Zhang  wrote:
>
> > +1
> >
> > Compiled, ran the newly-introduced PySpark Hadoop input/output examples.
> >
> >
> > > On Thu, Sep 4, 2014 at 1:10 PM, Egor Pahomov wrote:
> >
> > > +1
> > >
> > > Compiled, ran a simple job on yarn-hadoop-2.3.
> > >
> > >
> > > 2014-09-04 22:22 GMT+04:00 Henry Saputra :
> > >
> > > > LICENSE and NOTICE files are good
> > > > Hash files are good
> > > > Signature files are good
> > > > No 3rd parties executables
> > > > Source compiled
> > > > Run local and standalone tests
> > > > Test persist off heap with Tachyon looks good
> > > >
> > > > +1
> > > >
> > > > - Henry
> > > >
> > > > On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell wrote:
> > > > > Please vote on releasing the following candidate as Apache Spark
> > > > > version 1.1.0!
> > > > >
> > > > > The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd):
> > > > >
> > > > > https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=2f9b2bd7844ee8393dc9c319f4fefedf95f5e460
> > > > >
> > > > > The release files, including signatures, digests, etc. can be found at:
> > > > > http://people.apache.org/~pwendell/spark-1.1.0-rc4/
> > > > >
> > > > > Release artifacts are signed with the following key:
> > > > > https://people.apache.org/keys/committer/pwendell.asc
> > > > >
> > > > > The staging repository for this release can be found at:
> > > > >
> > > > > https://repository.apache.org/content/repositories/orgapachespark-1031/
> > > > >
> > > > > The documentation corresponding to this release can be found at:
> > > > > http://people.apache.org/~pwendell/spark-1.1.0-rc4-docs/
> > > > >
> > > > > Please vote on releasing this package as Apache Spark 1.1.0!
> > > > >
> > > > > The vote is open until Saturday, September 06, at 08:30 UTC and passes
> > > > > if a majority of at least 3 +1 PMC votes are cast.
> > > > >
> > > > > [ ] +1 Release this package as Apache Spark 1.1.0
> > > > > [ ] -1 Do not release this package because ...
> > > > >
> > > > > To learn more about Apache Spark, please see
> > > > > http://spark.apache.org/
> > > > >
> > > > > == Regressions fixed since RC3 ==
> > > > > SPARK-3332 - Issue with tagging in EC2 scripts
> > > > > SPARK-3358 - Issue with regression for m3.XX instances
> > > > >
> > > > > == What justifies a -1 vote for this release? ==
> > > > > This vote is happening very late into the QA period compared with
> > > > > previous votes, so -1 votes should only occur for significant
> > > > > regressions from 1.0.2. Bugs already present in 1.0.X will not block
> > > > > this release.
> > > > >
> > > > > == What default changes should I be aware of? ==
> > > > > 1. The default value of "spark.io.compression.codec" is now "snappy"
> > > > > --> Old behavior can be restored by switching to "lzf"
> > > > >
> > > > > 2. PySpark now performs external spilling during aggregations.
> > > > > --> Old behavior can be restored by setting "spark.shuffle.spill" to
> > > > > "false".
> > > > >
> > > > > 3. PySpark uses a new heuristic for determining the parallelism of
> > > > > shuffle operations.
> > > > > --> Old behavior can be restored by setting
> > > > > "spark.default.parallelism" to the number of cores in the cluster.
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > > --
> > >
> > >
> > >
> > > Sincerely yours,
> > > Egor Pakhomov
> > > Scala Developer, Yandex
> > >
> >
>