[ https://issues.apache.org/jira/browse/SPARK-3530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139600#comment-14139600 ]
Xiangrui Meng commented on SPARK-3530:
--------------------------------------

[~eustache] The default implementation of multi-model training will be a for loop. But the API leaves room for future optimizations, such as grouping weight vectors and using level-3 BLAS for better performance. It shouldn't be a meta class, because many optimizations are specific to particular algorithms. For example, LASSO can be solved via LARS, which computes a full solution path for all regularization parameters. The level-3 BLAS optimization is another example, which can give an 8x speedup (SPARK-1486).

[~vrilleup] We can have a set of built-in preconditions, like positivity. Or we could accept a lambda function for assertions, (T) => Unit, which may be hard for Java users, but they should be familiar with creating those in Spark.

> Pipeline and Parameters
> -----------------------
>
>                 Key: SPARK-3530
>                 URL: https://issues.apache.org/jira/browse/SPARK-3530
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML, MLlib
>            Reporter: Xiangrui Meng
>            Assignee: Xiangrui Meng
>            Priority: Critical
>
> This part of the design doc is for pipelines and parameters. I put the design
> doc at
> https://docs.google.com/document/d/1rVwXRjWKfIb-7PI6b86ipytwbUH7irSNLF1_6dLmh8o/edit?usp=sharing
> I will copy the proposed interfaces to this JIRA later. Some sample code can
> be viewed at: https://github.com/mengxr/spark-ml/
> Please help review the design and post your comments here. Thanks!
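
To make the precondition idea in the comment concrete, here is a minimal Java sketch of what "built-in preconditions" plus a user-supplied assertion lambda might look like from the Java side. It is not the API proposed in the design doc: the class `Param`, the helper `positive`, and the parameter names `regParam` and `maxIter` are hypothetical and exist only for illustration. The Scala type `(T) => Unit` is approximated here with `java.util.function.Consumer<T>`.

```java
import java.util.function.Consumer;

class ParamPreconditions {

    /** Hypothetical parameter holder: the assertion runs on every set(). */
    static final class Param<T> {
        private final String name;
        private final Consumer<T> check; // plays the role of (T) => Unit
        private T value;

        Param(String name, Consumer<T> check) {
            this.name = name;
            this.check = check;
        }

        void set(T v) {
            check.accept(v); // throws if the precondition is violated
            this.value = v;  // only reached when the check passes
        }

        T get() {
            return value;
        }

        String name() {
            return name;
        }
    }

    /** Hypothetical built-in precondition: the value must be strictly positive. */
    static Consumer<Double> positive(String name) {
        return v -> {
            if (!(v > 0.0)) {
                throw new IllegalArgumentException(name + " must be positive, got " + v);
            }
        };
    }

    public static void main(String[] args) {
        // Built-in precondition.
        Param<Double> regParam = new Param<>("regParam", positive("regParam"));
        regParam.set(0.1);

        // Custom assertion written inline as a lambda.
        Param<Integer> maxIter = new Param<>("maxIter", v -> {
            if (v < 1) {
                throw new IllegalArgumentException("maxIter must be >= 1, got " + v);
            }
        });
        maxIter.set(10);

        try {
            regParam.set(-1.0); // violates positivity, so the old value is kept
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this sketch a failed check leaves the previous value in place, because the assertion runs before the assignment. Whether validation should happen at set time or at fit time is a design choice the comment does not settle.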