[ https://issues.apache.org/jira/browse/MADLIB-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16407138#comment-16407138 ]
Jingyi Mei commented on MADLIB-1216:
------------------------------------
pca|pca.sql_in seems to be the slowest test regardless of platform or database
version. This install check test calls pca_train and pca_sparse_train multiple
times, so it is not trivial to remove these tests and move them to tinc without
losing code coverage.
We refactored pca_project today; after refactoring, the runtime went from
~37s to ~23s.
Modified decision tree to use a smaller array dataset, which reduced the runtime
from ~30s to ~9s.
Modified random forest to use fewer trees, which reduced the runtime from ~14s to
~9s.
Modified elastic net to not test cross validation, which reduced the runtime
by ~20s.
Spiked on svm as well; its runtime is also hard to cut down without losing code
coverage.
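For a rough sense of the combined effect, the approximate numbers above can be added up. This is only a sketch: the test labels are shorthand chosen here, the values are the rounded figures from this comment rather than fresh measurements, and elastic net is entered as a saving only because no before/after values were given for it.

```python
# Approximate runtimes in seconds quoted in this comment: (before, after).
# The keys are informal labels, not the actual install check file names.
reported = {
    "pca": (37, 23),
    "decision tree": (30, 9),
    "random forest": (14, 9),
}

# Elastic net is reported only as a saving, without before/after values.
savings_only = {"elastic net": 20}


def summarize_savings(reported, savings_only):
    """Return per-test savings and their total, in seconds."""
    per_test = {name: before - after for name, (before, after) in reported.items()}
    per_test.update(savings_only)
    return per_test, sum(per_test.values())


per_test, total = summarize_savings(reported, savings_only)

# List the largest savings first.
for name, saved in sorted(per_test.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: ~{saved}s saved")
print(f"total: ~{total}s saved")
```

With these figures the changes add up to roughly a minute of install check time, with decision tree and elastic net contributing the most.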
> Fix slowest 3 Install Check on Greenplum
> ----------------------------------------
>
> Key: MADLIB-1216
> URL: https://issues.apache.org/jira/browse/MADLIB-1216
> Project: Apache MADlib
> Issue Type: Improvement
> Components: Infrastructure: Automated Tests
> Reporter: Jingyi Mei
> Assignee: Jingyi Mei
> Priority: Major
> Fix For: v1.14
>
>
> We want to find out which are the slowest n install check tests (say n=3) on
> Greenplum, so that we can reduce the total install check runtime.
> Acceptance:
> 1) Install check on Greenplum runs faster than before.