[ 
https://issues.apache.org/jira/browse/SYSTEMML-843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405096#comment-15405096
 ] 

Imran Younus commented on SYSTEMML-843:
---------------------------------------

[~nakul02] and I investigated the tSNE algorithm further today. After applying the 
changes suggested by [~mboehm7], there was a huge improvement in the function 
{{x2p}}. Today we looked at the performance of {{tsne}}, which implements the 
gradient descent. It takes more than twice as long as the Python 
implementation (820 sec vs. 390 sec). I've updated the tSNE.dml script on GitHub 
(https://github.com/apache/incubator-systemml/pull/200). Here are the stats:
{code}
 16/08/02 17:17:16 INFO api.DMLScript: SystemML Statistics:
Total elapsed time:             823.878 sec.
Total compilation time:         0.458 sec.
Total execution time:           823.420 sec.
Number of compiled MR Jobs:     1.
Number of executed MR Jobs:     0.
Cache hits (Mem, WB, FS, HDFS): 1299576/0/0/1.
Cache writes (WB, FS, HDFS):    379333/0/2.
Cache times (ACQr/m, RLS, EXP): 0.400/0.079/0.996/0.036 sec.
HOP DAGs recompiled (PRED, SB): 0/2001.
HOP DAGs recompile time:        1.035 sec.
Functions recompiled:           2.
Functions recompile time:       0.035 sec.
Total JIT compile time:         19.167 sec.
Total JVM GC count:             1951.
Total JVM GC time:              13.888 sec.
Heavy hitter instructions (name, time, count):
-- 1)   tsne    823.061 sec     1
-- 2)   *       244.509 sec     120782
-- 3)   /       173.698 sec     173172
-- 4)   +       168.255 sec     164652
-- 5)   -       91.713 sec      118780
-- 6)   tak+*   53.855 sec      58390
-- 7)   tsmm    48.542 sec      2001
-- 8)   uark+   27.817 sec      2000
-- 9)   x2p     10.541 sec      1
-- 10)  ba+*    6.135 sec       2000

16/08/02 17:17:16 INFO api.DMLScript: END DML run 08/02/2016 17:17:16
{code}
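
As a rough check on where the time goes, the heavy hitter table above can be summed for the elementwise operators. The sketch below is only an illustration: the regular expression and the helper name {{parse_heavy_hitters}} are assumptions made for this example, not part of SystemML. Note that the per-operator times are totals for the whole run, so they may include calls made inside {{x2p}} as well.

```python
import re

# Heavy hitter lines copied from the -stats output above.
STATS = """\
-- 1)   tsne    823.061 sec     1
-- 2)   *       244.509 sec     120782
-- 3)   /       173.698 sec     173172
-- 4)   +       168.255 sec     164652
-- 5)   -       91.713 sec      118780
-- 6)   tak+*   53.855 sec      58390
-- 7)   tsmm    48.542 sec      2001
-- 8)   uark+   27.817 sec      2000
-- 9)   x2p     10.541 sec      1
-- 10)  ba+*    6.135 sec       2000
"""

# Assumed line layout: "-- <rank>)  <name>  <seconds> sec  <count>"
LINE = re.compile(r"--\s*\d+\)\s+(\S+)\s+([\d.]+)\s+sec\s+(\d+)")


def parse_heavy_hitters(text):
    """Return {instruction: (seconds, count)} for each heavy hitter line."""
    rows = {}
    for match in LINE.finditer(text):
        rows[match.group(1)] = (float(match.group(2)), int(match.group(3)))
    return rows


rows = parse_heavy_hitters(STATS)
total = rows["tsne"][0]

# Time spent in the four elementwise binary operators.
elementwise = sum(rows[op][0] for op in ("*", "/", "+", "-"))
share = elementwise / total

print(f"elementwise: {elementwise:.3f} sec ({share:.1%} of tsne)")
```

By this count the four elementwise operators account for roughly 82% of the time reported for {{tsne}}, while the matrix multiplications ({{tsmm}}, {{ba+*}}) are comparatively cheap.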

Here is the {{tsne}} function:

{code}
tsne = function(matrix[double] X, int reduced_dims, int initial_dims, int perplexity)
  return(matrix[double] Y, matrix[double] C) {
    d = reduced_dims
    n = nrow(X)

    max_iter = 2000
    eta = 500
    P = x2p(X, 1.0e-5, 20.0)
    P = P*4
    Y = rand(rows=n, cols=d, pdf="normal")
    C = matrix(0, rows=max_iter, cols=1)
    ZERODIAG = (diag(matrix(-1, rows=n, cols=1)) + 1)

    for (itr in 1:max_iter) {
      D = distance_matrix(Y)
      Z = 1/(D + 1)
      Z = Z * ZERODIAG
      Q = Z/sum(Z)
      W = (P - Q)*Z
      sumW = rowSums(W)
      grad_C = Y * sumW - W %*% Y
      Y = Y - eta*grad_C
      Y = Y - colMeans(Y)

      if (itr%%50 == 0) {
        #C[itr,] = sum(P * log(pmax(P, 1e-12) / pmax(Q, 1e-12)))
        #print(as.scalar(C[itr,1]))
        print(itr)
      }
      if (itr == 100) {
        P = P/4
      }
    }
  }
{code}
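
For comparison, one iteration of the loop body above can be sketched in NumPy. This is not the Python implementation that was benchmarked, only a line-by-line translation under stated assumptions: {{distance_matrix}} is not shown in this comment, so it is assumed here to return pairwise squared Euclidean distances, and the function name {{tsne_step}} is made up for the example.

```python
import numpy as np


def tsne_step(Y, P, eta=500.0):
    """One gradient-descent update mirroring the body of the DML loop."""
    # Assumption: distance_matrix(Y) is the pairwise squared Euclidean distance.
    sq = np.sum(Y * Y, axis=1, keepdims=True)
    D = sq + sq.T - 2.0 * (Y @ Y.T)

    Z = 1.0 / (D + 1.0)
    np.fill_diagonal(Z, 0.0)            # same effect as Z * ZERODIAG
    Q = Z / Z.sum()
    W = (P - Q) * Z
    sumW = W.sum(axis=1, keepdims=True)  # rowSums(W), kept as an n x 1 column
    grad_C = Y * sumW - W @ Y
    Y = Y - eta * grad_C
    return Y - Y.mean(axis=0, keepdims=True)  # Y - colMeans(Y)


# Small synthetic example; P is a random symmetric matrix that sums to 1.
rng = np.random.default_rng(0)
n, d = 50, 2
P = rng.random((n, n))
P = P + P.T
np.fill_diagonal(P, 0.0)
P /= P.sum()

Y = rng.standard_normal((n, d))
Y_new = tsne_step(Y, P)
```

Each iteration performs about a dozen elementwise passes over n x n matrices and only two matrix multiplications, which is consistent with the elementwise operators dominating the heavy hitter list.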


> leftIndex and cache release extremely slow
> ------------------------------------------
>
>                 Key: SYSTEMML-843
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-843
>             Project: SystemML
>          Issue Type: Bug
>            Reporter: Imran Younus
>         Attachments: tSNT.tar.gz
>
>
> I'm running the tSNE script in standalone mode with a subset of MNIST data 
> (2500 points). I ran this with and without  `-exec singlenode`. Here are the 
> stats:
> (BTW, the same function implemented in python takes less than 10 sec!)
> -> with singlenode flag
> {code}
> ./bin/systemml scripts/staging/tSNE.dml -stats -nvargs 
> X=/home/iyounus/workspace/tsne_python/mnist2500_X.txt Y=Y_out.txt C=C_out.txt
> 16/08/01 16:46:54 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time:           109.667 sec.
> Total compilation time:               0.407 sec.
> Total execution time:         109.260 sec.
> Number of compiled MR Jobs:   0.
> Number of executed MR Jobs:   0.
> Cache hits (Mem, WB, FS, HDFS):       223692/0/0/1.
> Cache writes (WB, FS, HDFS):  80351/0/2.
> Cache times (ACQr/m, RLS, EXP):       0.289/0.015/85.192/0.043 sec.
> HOP DAGs recompiled (PRED, SB):       0/0.
> HOP DAGs recompile time:      0.007 sec.
> Functions recompiled:         1.
> Functions recompile time:     0.039 sec.
> Total JIT compile time:               4.924 sec.
> Total JVM GC count:           312.
> Total JVM GC time:            1.12 sec.
> Heavy hitter instructions (name, time, count):
> -- 1)         tsne    109.202 sec     1
> -- 2)         x2p     109.189 sec     1
> -- 3)         leftIndex       106.728 sec     32136
> -- 4)         tsmm    0.564 sec       1
> -- 5)         exp     0.376 sec       8034
> -- 6)         rangeReIndex    0.201 sec       40170
> -- 7)         /       0.183 sec       24103
> -- 8)         *       0.161 sec       16069
> -- 9)         +       0.144 sec       22840
> -- 10)        uak+    0.106 sec       8036
> 16/08/01 16:46:54 INFO api.DMLScript: END DML run 08/01/2016 16:46:54
> {code}
> -> without singlenode flag
> {code}
> > ./bin/systemml scripts/staging/tSNE.dml -stats -nvargs 
> > X=/home/iyounus/workspace/tsne_python/mnist2500_X.txt Y=Y_out.txt 
> > C=C_out.txt
> 16/08/01 16:52:59 INFO api.DMLScript: SystemML Statistics:
> Total elapsed time:           127.290 sec.
> Total compilation time:               0.396 sec.
> Total execution time:         126.894 sec.
> Number of compiled MR Jobs:   1.
> Number of executed MR Jobs:   0.
> Cache hits (Mem, WB, FS, HDFS):       223693/0/0/1.
> Cache writes (WB, FS, HDFS):  80352/0/2.
> Cache times (ACQr/m, RLS, EXP):       0.421/0.016/100.974/0.041 sec.
> HOP DAGs recompiled (PRED, SB):       0/0.
> HOP DAGs recompile time:      0.009 sec.
> Functions recompiled:         1.
> Functions recompile time:     0.038 sec.
> Total JIT compile time:               4.835 sec.
> Total JVM GC count:           312.
> Total JVM GC time:            1.226 sec.
> Heavy hitter instructions (name, time, count):
> -- 1)         tsne    126.426 sec     1
> -- 2)         x2p     126.412 sec     1
> -- 3)         leftIndex       123.982 sec     32136
> -- 4)         exp     0.427 sec       8034
> -- 5)         MR-Job_CSV_REBLOCK      0.412 sec       1
> -- 6)         tsmm    0.308 sec       1
> -- 7)         rangeReIndex    0.242 sec       40170
> -- 8)         /       0.208 sec       24103
> -- 9)         +       0.172 sec       22840
> -- 10)        *       0.151 sec       16069
> 16/08/01 16:52:59 INFO api.DMLScript: END DML run 08/01/2016 16:52:59
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
