[ 
https://issues.apache.org/jira/browse/MAHOUT-1885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15560908#comment-15560908
 ] 

ASF GitHub Bot commented on MAHOUT-1885:
----------------------------------------

GitHub user andrewpalumbo opened a pull request:

    https://github.com/apache/mahout/pull/261

    MAHOUT-1885 [WIP][FOR COMMENT]: Inital implementation of VCL bindings for 
mahout math.

    This is an initial set of bindings for VCL to mahout matrices.  Currently 
supports `DenseRowMatrix` and `SparseRowMatrix` for both GPU and multi-threaded 
CPU.
    
    Note that there are two new modules:  `vienniacl` and `viennacl-omp`  there 
are new profiles to activate these: `-Pvienniacl` activates both and  
`-Pviennacl-omp` activates only the OpenMP version.
    
    The default build should not require any new installs.
    
    If activating either version you';; need to install vienniacl 1.7.2+ 
    
    On ubuntu 16.04 this is simply:
    ```
    sudo apt-get install libviennacl-dev
    ```
    if running on GPU you'll also need, you'll need to have OpenCL 1.2+, which 
is likely installed with your video driver.
    
    Hadoop2 is still in a profile which needs to be removed, so the command to 
install for GPU and CPU is:
    ```
    mvn clean install -Pviennacl -Phadoop2 -DskipTests && mvn test
    ```
    for OpenMP only  (mac users) :
    ```
    mvn clean install -Pviennacl -Phadoop2 -DskipTests && mvn test
    ```
    
    The output is still very verbose and will let you know each time a `%*%` 
operatoion is made which device it is made on.
    
    A few todos:
    
    1.  Not sure that we need two separate modules for omp and gpu.  There may 
be a way to handle this with environment variables. 
    2. Currently a new object is being created each timw `%*%` is called.  it 
would be best to create these once and then cache them for subsequent calls.
    3.  Currently using try/catch for `XXXMMul` creation, really bad style.
    4.  GPU Vector are not working and throw exceptions when trying to read 
data out of them
       - Vector this we could implement at least one native linear solver. 
(most viennacl linear solvers require boost however this is slated for removal 
on the next release).
    5.  Have been noticing intermittent crashes on GPU.
    6.  Tests with CLI drivers are currently commented out.  Was getting odd 
non-numeric failures with them.  must be a config issue.
    7.  Currently each operation can only be run on one GPU /node.
    
    I'd like to get this committed relatively quickly and then go back and fine 
tune.
    
    Any feedback is much appreciated. 


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andrewpalumbo/mahout viennacl-opmmul-a

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/mahout/pull/261.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #261
    
----
commit dfc93684a3d17a0331504a45b68ad2dfb9520f5c
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-08-10T17:48:51Z

    vienniacl back POC for mahout dense and sparse matrix multiplication.  
Compiling and testing as is with a couple of tests commented out.  Having 
issues currently with Dense %*% Sparse, and reading data from GPU after Matrix 
%*% Vector multiplication.

commit 5d96a17bb828b2c2a887ddd48f9eede4e13eef92
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-08-16T20:32:47Z

    fix imports for Sparse/Dense MMul

commit c2a548e239d31fee56428e7a02c8a0f76de01915
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-08-16T20:51:24Z

    Add functions to LinalgFunctions.java for direct matrix prod expressions, 
remove import.  Uncomment missed OpenCL test

commit 8de0fd03821a2bd08bd469177214567c6ae60b66
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-08-31T16:46:19Z

    allow spark build in module.  Add a linux-haswell properties file- this may 
need a property in viennacl/pom.xml to be used. As well will need one for mac.  
This is for the distribution artifact which may not be built by a 'native' 
machine.  Currently leaving linux-x86_64 to use -march=native but as this will 
likely need to be changed as suneel does releases from a mac and I (AP) use an 
amdfam10 so we should have a general linux-x86_64 properties file.

commit 1525dbe389aebd3ea1244ec4468cff08fb683cda
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-08-31T20:19:43Z

    [EXPIRMENTAL BRANCH] Split OpenCL and OpenMP modules up.  Next steps: 
implement MMulGPU an MMulOMP and SolverFactory per doc, test on AWS GPU Cluster.

commit a6a81f8ced8f24dfbafc655e46822a9ddfd2c90a
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-07T17:21:49Z

    [EXPIRMENTAL BRANCH] Hacked everything together in hopes of having a 
running branch to test on AWS byt Wed 9/7 Hangout... Compiling and auto 
detecting mahout-native-viennacl module.  Getting a runtime error when trying 
to create the o.a.m.vcl.GPUMMul Object via reflection from the SolverFactory:
    
    scala.reflect.runtime.ReflectError: value viennacl is not a package
    at 
scala.reflect.runtime.JavaMirrors.scala14884makeScalaPackage(JavaMirrors.scala:912)

commit aac6adc2f08712fdcb5a0ce91aa2e1bccf9bdac4
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-08T18:27:38Z

    Hardcode path to vcl jar for classloader.  compiles.. getting null pointer.

commit 5024ad27fa296505a486ba32ca6645f7aa91e029
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-09T01:48:45Z

    [EXPIRMENTAL BRANCH] everything is hacked together but tests r passing

commit 0a6d1855783dbcbf2a9809243759d9c2e0ba4a5e
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-09T02:06:42Z

    [EXPIRMENTAL BRANCH] removed large core file.  testing out some 
sparse/dense cases in viennacl/GPUMMul

commit c1b3f05b59cab146871b33526d1b0aa2a2e61a76
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-09T02:53:21Z

    [EXPIRMENTAL BRANCH] gpuRWRW was failing with a missing allocator when 
directed from:
    
    java.lang.UnsatisfiedLinkError: 
org.apache.mahout.viennacl.vcl.javacpp.Context.allocate(I)V
    at org.apache.mahout.viennacl.vcl.javacpp.Context.allocate(Native Method)
    at org.apache.mahout.viennacl.vcl.javacpp.Context.<init>(Context.scala:33)
    at org.apache.mahout.viennacl.vcl.GPUMMul$.org4600gpuRWRW(GPUMMul.scala:146)
    at org.apache.mahout.viennacl.vcl.GPUMMul$.org4600jvmCWRW(GPUMMul.scala:174)
    at 
org.apache.mahout.viennacl.vcl.GPUMMul4600anonfun6.apply(GPUMMul.scala:79)
    
    revert to jvm for now just to get some passing tests

commit d8abf28932a9c2216a41feff2e6cead4bd7a99d3
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-09T03:07:35Z

    [EXPIRMENTAL BRANCH] GPUMMul using only Sparse %*% Sparse on GPU -> all 
spark tests pass

commit 8f1f716a7766e47ccda22dae2aeb16d4e54cff90
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-10T19:10:44Z

    [EXPRIMENTAL BRANCH] wip: some VCL classes are not being picked up by spark.
    
     Lost task 2.0 in stage 44.0 (TID 283, localhost): 
java.lang.ExceptionInInitializerError
    at 
org.apache.mahout.viennacl.vcl.GPUMMul$.org5465gpuSparseRWRW(GPUMMul.scala:199)
    at 
org.apache.mahout.viennacl.vcl.GPUMMul$.org5465jvmSparseCWRW(GPUMMul.scala:252)
    at 
org.apache.mahout.viennacl.vcl.GPUMMul5465anonfun5.apply(GPUMMul.scala:111)
    at 
org.apache.mahout.viennacl.vcl.GPUMMul5465anonfun5.apply(GPUMMul.scala:111)
    at org.apache.mahout.viennacl.vcl.GPUMMul$.apply(GPUMMul.scala:122)
    at org.apache.mahout.viennacl.vcl.GPUMMul$.apply(GPUMMul.scala:34)
    at 
org.apache.mahout.math.scalabindings.RLikeMatrixOps.(RLikeMatrixOps.scala:36)
    at 
org.apache.mahout.sparkbindings.blas.AtB5465anonfun15465anonfun.apply(AtB.scala:241)
    at 
org.apache.mahout.sparkbindings.blas.AtB5465anonfun15465anonfun.apply(AtB.scala:239)
    at scala.collection.Iterator5465anon.next(Iterator.scala:89)
    at scala.collection.Iterator5465anon3.next(Iterator.scala:372)
    at 
org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:211)
    at 
org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
    at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at 
org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:88)
    at org.apache.spark.executor.Executor.run(Executor.scala:214)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
    Caused by: java.lang.ClassNotFoundException: 
org.apache.mahout.viennacl.vcl.javacpp.MemHandle
    
    org.apache.mahout.viennacl.vcl.javacpp.MemHandle is clearly added to jar.  
jar must not be reacking back end properly other GPUMMul gpu algorithms are  
passing without issue

commit 07d0087dc49c0e14c128e45181622c31e8b60322
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-10T23:04:29Z

    [EXPIREMENTAL BRANCH] in local mode, only 1 thread may be used per GPU so 
for tests to pass, change the Spark URL to local[1] in DistributedSparkSuite. 
Add  mahout-native-niennack_2.10 module to Spark and remove hard coded path to 
jars, Tests are passing now except for those with 0 non-zero elements, an 
assertion made by vienniacl::compressed_matrix:
    
    Solution will be to eith add a check for this wit a fallback to jvm MMul or 
implement:
    
    try (GPUMMul) catch ex { try (OMPMMul) catch ex { jvmMMul}}

commit 1a37c5a3ee1dd8a483175fbcdb151699b3775601
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-11T00:28:42Z

    [EXPIREMENTAL BRANCH] implement AAt in GPU MMul

commit 201bbdb2098cefdfffc1b006dd0d679d2ec5da88
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-11T01:28:31Z

    [EXPIREMENTAL BRANCH] cleanup some fix an incorrect check for empty matrix 
in GPUMMul, still somehow getting error:

commit f55389cfe31fb8555ed1932024ae4a11562219eb
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-12T19:08:44Z

    [EXPIREMENTAL BRANCH] All Spark teses pass with the exception of 
similiarity analysis and cf Drivers.  All flink tests pass.
    
    Can not get the backend to recogniz mahout-native-javacpp even when packed 
into the assembly jar.  fails in Spark-shell in pseudo cluster mode (1) local 
worker

commit 971f6577c51b749cc520d3475c0f9656ec00def0
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-12T19:11:20Z

    reomve flink from build.  add test script

commit 5f6484846e40abe8ab8031c8a94853a32cb7cfc2
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-13T19:34:37Z

    [EXPIREMENTAL BRANCH] fix check for empty SparseMatrices (possibly empty 
partitions) in GPUMMul and pass those cases off to jvm MMUl

commit 37ee67d40337c84650d6ef9bfabfd9e86ad9eba4
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-13T19:39:59Z

    [EXPIREMENTAL BRANCH] set threads in DistributedSparkSuite back to 3

commit 8a7b747233756e57ce417d7f09da98b110ab99d3
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-13T20:49:39Z

    Fixed typo in Spark-assembly that was keeping shell from working in 
standalone cluster mode with 1 worker.  Tested drmA %*% drmB with different 
partition sizes:
    
    timeSparseDRMMMul: (m: Int, n: Int, s: Int, para: Int, pctDense: Double)Long
    mahout>
    mahout>
    mahout>
    mahout> timeSparseDRMMMul(100,100,100,1,.1)
    res0: Long = 2954
    mahout> timeSparseDRMMMul(1000,1000,100,1,.1)
    res1: Long = 3593
    mahout> timeSparseDRMMMul(3000,3000,100,1,.1)
    res2: Long = 27919
    mahout> timeSparseDRMMMul(3000,3000,100,10,.1)
    res3: Long = 11012
    mahout> timeSparseDRMMMul(3000,3000,100,100,.1)
    res4: Long = 24706
    mahout> timeSparseDRMMMul(3000,3000,100,50,.1)
    res5: Long = 10378
    mahout> timeSparseDRMMMul(3000,3000,3000,50,.1)
    res6: Long = 139161
    mahout> timeSparseDRMMMul(3000,3000,3000,10,.1)
    res7: Long = 122031

commit 5d6ff3c930d05acf5d4df2ea0af0bea0ec96615c
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-13T21:20:29Z

    Supress output in SolverFactory, All tests passing except for viennacl

commit d855460ad3abe75cf6c0f6ccc5a8cc8c6466d6fe
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-14T16:31:16Z

    clean up some of the comments From SolverFactory.  Issue:  Tests in VCL 
fail with java.lang.NullPointerException if there is only one card on the 
machine:
    
    val oclCtx = new Context(Context.OPENCL_MEMORY) // creates context
    ...
    val mDvecC = mxA %*% dvecB // creates context

commit 4ec1c651731b62c32ca55bf3d7907dc21272b1bc
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-14T16:39:44Z

    debuging SolverFactory with println, add SparseSparseDrmTimer.mscala

commit 4441550557fa69d01c8db312d0dd648208b2d5cc
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-14T18:38:10Z

    added MMul as default for classes n the case that more than one ocl context 
is set.

commit a843e3a459a19d2c9a850e07888300d1836a77c5
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-14T19:05:49Z

    all tests passing (verbosely)

commit 09cc2d01f6eb113e7baa5078248e26f16dfc273a
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-15T00:38:44Z

    [EXPIREMENTAL BRANCH(STILL)] Debugging GPUMMul.. SparseRowMatrix %*% 
SparseRowMatrix in DRM was geing passed off to default JVM method jvmRWCW.

commit 8fba0c888d8de2e8c41845227db026d772813d57
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-15T01:15:54Z

    [EXPIREMENTAL BRANCH] changed spark ABt to use sparse marices.  there 
should not be any Dense Matrices in the test calculation 
/home/andrew/sandbox/mahout/examples/bin/SparseSparseDrmTimer.mscala
    
    we are getting dense matrices though it seems that it may be coming from 
the Kryo(de)Serializer:
    
    169.1.102): java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.DenseMatrix.<init>(DenseMatrix.java:66)
    at 
org.apache.mahout.math.scalabindings.package5989anonfun.apply(package.scala:163)
    at 
org.apache.mahout.math.scalabindings.package5989anonfun.apply(package.scala:141)
    at 
scala.collection.TraversableLike5989anonfun.apply(TraversableLike.scala:244)
    at 
scala.collection.TraversableLike5989anonfun.apply(TraversableLike.scala:244)
    at 
scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
    at scala.collection.TraversableLike.map(TraversableLike.scala:244)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at org.apache.mahout.math.scalabindings.package$.dense(package.scala:141)
    at 
org.apache.mahout.common.io.GenericMatrixKryoSerializer.read(GenericMatrixKryoSerializer.scala:179)
    at 
org.apache.mahout.common.io.GenericMatrixKryoSerializer.read(GenericMatrixKryoSerializer.scala:38)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
    at com.twitter.chill.Tuple2Serializer.read(TupleSerializers.scala:42)
    at com.twitter.chill.Tuple2Serializer.read(TupleSerializers.scala:33)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
    at 
org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:192)
    at 
org.apache.spark.serializer.DeserializationStream.readValue(Serializer.scala:171)
    at 
org.apache.spark.serializer.DeserializationStream5989anon.getNext(Serializer.scala:201)
    at 
org.apache.spark.serializer.DeserializationStream5989anon.getNext(Serializer.scala:198)
    at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
    at scala.collection.Iterator5989anon3.hasNext(Iterator.scala:371)
    at scala.collection.Iterator5989anon1.hasNext(Iterator.scala:327)
    at 
org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
    at 
org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
    at scala.collection.Iterator5989anon1.hasNext(Iterator.scala:327)
    at 
org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:132)
    at 
org.apache.spark.rdd.CoGroupedRDD5989anonfun.apply(CoGroupedRDD.scala:169)
    at 
org.apache.spark.rdd.CoGroupedRDD5989anonfun.apply(CoGroupedRDD.scala:168)
    at 
scala.collection.TraversableLike5989anonfun.apply(TraversableLike.scala:772)
    at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)

commit 3d759ba5f4513b3a4303176e8d3ddf382f026caf
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-15T22:31:20Z

    [Expiremental Branch] Changed to use Matrix.zSum > 0 to identify Matrices 
that would trip up CSR in GPUMmul.  Getting errors in flink and spark tests 
mostly from GPUMMul.gpuRWCW

commit 75cf32f75711df0939f3884056328fcd12f49469
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-09-15T22:52:05Z

    removed debugging printout that was causing failures

commit 6bacd20e1a56d0c1a8c94f2f0ccf4e75444cf34a
Author: Andrew Palumbo <apalu...@apache.org>
Date:   2016-10-07T00:49:06Z

    commented out failing RowSimilarityDriver test temporarily

----


> Inital Implementation of VCL Bindings
> -------------------------------------
>
>                 Key: MAHOUT-1885
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1885
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Math
>    Affects Versions: 0.12.2
>            Reporter: Andrew Palumbo
>            Assignee: Andrew Palumbo
>             Fix For: 0.13.0
>
>
> Push a working experimental branch of VCL bindings into master.  There is 
> still a lot of work to be done.  All tests are passing, At the moment there 
> am opening this JIRA mostly to get a number for PR and to test profiles 
> against on travis. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to