Roger Menezes created SPARK-8314:
------------------------------------

             Summary: improvement in performance of MLUtils.appendBias
                 Key: SPARK-8314
                 URL: https://issues.apache.org/jira/browse/SPARK-8314
             Project: Spark
          Issue Type: Improvement
          Components: MLlib
    Affects Versions: 1.4.0
            Reporter: Roger Menezes
             Fix For: 1.5.0


The MLUtils.appendBias method is heavily used to add intercept terms to 
linear models. It currently uses Breeze's vector concatenation, which is very 
slow compared to a plain System.arraycopy. This improvement changes the 
implementation to use System.arraycopy. 
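
As a rough illustration of the idea (not the actual patch), the sketch below appends a bias term of 1.0 with System.arraycopy. Plain arrays stand in for the MLlib vector types, and the names AppendBiasSketch, Sparse, appendBiasDense and appendBiasSparse are hypothetical, chosen only for this example. For a sparse vector only the stored index and value arrays are copied, so the cost depends on the number of non-zeros rather than on the full vector size.

```java
import java.util.Arrays;

// Hypothetical stand-in for illustration only; not the MLlib implementation.
final class AppendBiasSketch {

    // Minimal sparse-vector stand-in: logical size plus parallel index/value arrays.
    static final class Sparse {
        final int size;
        final int[] indices;
        final double[] values;

        Sparse(int size, int[] indices, double[] values) {
            this.size = size;
            this.indices = indices;
            this.values = values;
        }
    }

    // Dense case: copy all values into an array one slot longer, then set the bias.
    static double[] appendBiasDense(double[] values) {
        double[] out = new double[values.length + 1];
        System.arraycopy(values, 0, out, 0, values.length);
        out[values.length] = 1.0;
        return out;
    }

    // Sparse case: copy only the stored entries, then add one entry at the old size.
    static Sparse appendBiasSparse(Sparse v) {
        int nnz = v.values.length;
        int[] indices = new int[nnz + 1];
        double[] values = new double[nnz + 1];
        System.arraycopy(v.indices, 0, indices, 0, nnz);
        System.arraycopy(v.values, 0, values, 0, nnz);
        indices[nnz] = v.size; // the bias occupies the new last position
        values[nnz] = 1.0;
        return new Sparse(v.size + 1, indices, values);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(appendBiasDense(new double[] {0.5, 2.0})));
        Sparse s = appendBiasSparse(new Sparse(4, new int[] {1, 3}, new double[] {7.0, 9.0}));
        System.out.println(s.size + " " + Arrays.toString(s.indices)
                + " " + Arrays.toString(s.values));
    }
}
```

Appending the bias as the last stored entry keeps the sparse indices in increasing order, since the new index equals the old vector size.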

We saw the following performance improvements after the change:
Benchmark on the MNIST dataset, run 50 times:
MLUtils.appendBias (SparseVector Before): 47320 ms
MLUtils.appendBias (SparseVector After): 1935 ms

MLUtils.appendBias (DenseVector Before): 5340 ms
MLUtils.appendBias (DenseVector After): 4080 ms

This is a roughly 24x performance boost for SparseVectors (47320 ms / 1935 ms 
is about 24.5) and about 1.3x for DenseVectors (5340 ms / 4080 ms).




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
