[ https://issues.apache.org/jira/browse/SPARK-8314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14582629#comment-14582629 ]
Apache Spark commented on SPARK-8314:
-------------------------------------

User 'rogermenezes' has created a pull request for this issue:
https://github.com/apache/spark/pull/6768

> improvement in performance of MLUtils.appendBias
> ------------------------------------------------
>
>                 Key: SPARK-8314
>                 URL: https://issues.apache.org/jira/browse/SPARK-8314
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.4.0
>            Reporter: Roger Menezes
>             Fix For: 1.5.0
>
>
> The MLUtils.appendBias method is heavily used to add the intercept term for
> linear models. It uses Breeze's vector concatenation, which is much slower
> than a plain System.arraycopy. This improvement changes the implementation
> to use System.arraycopy.
> We saw the following performance improvements after the change.
> Benchmark on the MNIST dataset, 50 runs:
> MLUtils.appendBias (SparseVector Before): 47320 ms
> MLUtils.appendBias (SparseVector After): 1935 ms
> MLUtils.appendBias (DenseVector Before): 5340 ms
> MLUtils.appendBias (DenseVector After): 4080 ms
> This is roughly a 24x speedup for SparseVectors.
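
For illustration only, here is a minimal sketch of the array-copy approach the issue describes. It is not the code from the pull request: the class AppendBiasSketch, the nested Sparse type, and both method names are hypothetical stand-ins for MLlib's vector classes. It assumes a sparse vector is stored as a logical size plus parallel index and value arrays, and that the bias term is a trailing 1.0.

```java
import java.util.Arrays;

// Hypothetical stand-in for the MLlib code path; names are illustrative only.
final class AppendBiasSketch {

    // Assumed sparse layout: logical size plus parallel arrays of
    // strictly increasing indices and their non-zero values.
    static final class Sparse {
        final int size;
        final int[] indices;
        final double[] values;

        Sparse(int size, int[] indices, double[] values) {
            this.size = size;
            this.indices = indices;
            this.values = values;
        }
    }

    // Dense case: allocate one extra slot, bulk-copy, then set the bias to 1.0.
    static double[] appendBiasDense(double[] values) {
        double[] out = new double[values.length + 1];
        System.arraycopy(values, 0, out, 0, values.length);
        out[values.length] = 1.0;
        return out;
    }

    // Sparse case: copy only the stored entries, then add one entry whose
    // index is the old size (the new last position) and whose value is 1.0.
    static Sparse appendBiasSparse(Sparse v) {
        int nnz = v.values.length;
        int[] indices = new int[nnz + 1];
        double[] values = new double[nnz + 1];
        System.arraycopy(v.indices, 0, indices, 0, nnz);
        System.arraycopy(v.values, 0, values, 0, nnz);
        indices[nnz] = v.size;
        values[nnz] = 1.0;
        return new Sparse(v.size + 1, indices, values);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(appendBiasDense(new double[] {0.5, 2.0})));
        Sparse s = appendBiasSparse(new Sparse(5, new int[] {1, 3}, new double[] {4.0, 7.0}));
        System.out.println(s.size + " " + Arrays.toString(s.indices)
                + " " + Arrays.toString(s.values));
    }
}
```

In this sketch the sparse path touches only the stored entries, so its cost grows with the number of non-zeros rather than the full dimension. That is one plausible reason the SparseVector numbers above improve far more than the DenseVector ones.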