[ 
https://issues.apache.org/jira/browse/SPARK-3530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139600#comment-14139600
 ] 

Xiangrui Meng commented on SPARK-3530:
--------------------------------------

[~eustache] The default implementation of multi-model training will be a for 
loop. But the API leaves room for future optimizations, such as grouping 
weight vectors and using level-3 BLAS for better performance. It shouldn't be 
a meta class, because many optimizations are algorithm-specific. For example, 
LASSO can be solved via LARS, which computes the full solution path over all 
regularization parameters. The level-3 BLAS optimization is another example, 
which can give an 8x speedup (SPARK-1486).
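
A minimal sketch of that idea, not the proposed API: every name below 
(Estimator, fitAll, Ridge1D, SharedStatsRidge1D) is hypothetical. The trait 
supplies the for-loop default, and one estimator overrides it with an 
algorithm-specific shortcut. A toy 1-D ridge regression that reuses its 
sufficient statistics stands in here for the LARS and level-3 BLAS cases.

```scala
object MultiModelSketch {
  // (feature, label) pairs; a stand-in for a real distributed dataset.
  type Data = Seq[(Double, Double)]

  // Hypothetical estimator interface, for illustration only.
  trait Estimator[M] {
    /** Fit one model for a single regularization parameter. */
    def fit(data: Data, regParam: Double): M

    /** Default multi-model training: a plain loop over the parameters.
      * Estimators may override this with a specialized strategy. */
    def fitAll(data: Data, regParams: Seq[Double]): Seq[M] =
      regParams.map(p => fit(data, p))
  }

  // 1-D ridge regression without intercept:
  //   w = sum(x * y) / (sum(x * x) + regParam)
  class Ridge1D extends Estimator[Double] {
    def fit(data: Data, regParam: Double): Double = {
      val sxy = data.map { case (x, y) => x * y }.sum
      val sxx = data.map { case (x, _) => x * x }.sum
      sxy / (sxx + regParam)
    }
  }

  // Algorithm-specific override: scan the data once, then derive every
  // model from the shared statistics instead of refitting per parameter.
  class SharedStatsRidge1D extends Ridge1D {
    override def fitAll(data: Data, regParams: Seq[Double]): Seq[Double] = {
      val sxy = data.map { case (x, y) => x * y }.sum
      val sxx = data.map { case (x, _) => x * x }.sum
      regParams.map(p => sxy / (sxx + p))
    }
  }
}
```

Callers only see fitAll, so an estimator can switch from the loop to a 
specialized routine without any change on the caller's side.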

[~vrilleup] We can have a set of built-in preconditions, like positivity. Or 
we could accept a lambda function for assertions, (T) => Unit, which may be 
hard for Java users, but they should be familiar with creating those in Spark.
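
Both options could look roughly like the sketch below. Param, validate, and 
Preconditions.positive are hypothetical names chosen for illustration, not 
part of the design doc. A check is just a function T => Unit that throws on 
an invalid value.

```scala
object ParamValidationSketch {
  // Hypothetical parameter descriptor carrying an optional assertion.
  // The default check accepts every value.
  final class Param[T](val name: String, check: T => Unit = (_: T) => ()) {
    /** Runs the check, which throws on failure, then returns the value. */
    def validate(value: T): T = { check(value); value }
  }

  // Built-in preconditions, packaged as reusable check functions.
  object Preconditions {
    def positive(name: String): Double => Unit =
      v => require(v > 0.0, s"$name must be positive, got $v")
  }

  // Option 1: a built-in precondition.
  val regParam = new Param[Double]("regParam", Preconditions.positive("regParam"))

  // Option 2: a user-supplied lambda of type Int => Unit.
  val maxIter = new Param[Int]("maxIter",
    (n: Int) => require(n >= 1, s"maxIter must be at least 1, got $n"))
}
```

Here require throws IllegalArgumentException, so validate(-1.0) on regParam 
fails with a message naming the parameter.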

> Pipeline and Parameters
> -----------------------
>
>                 Key: SPARK-3530
>                 URL: https://issues.apache.org/jira/browse/SPARK-3530
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML, MLlib
>            Reporter: Xiangrui Meng
>            Assignee: Xiangrui Meng
>            Priority: Critical
>
> This part of the design doc is for pipelines and parameters. I put the design 
> doc at
> https://docs.google.com/document/d/1rVwXRjWKfIb-7PI6b86ipytwbUH7irSNLF1_6dLmh8o/edit?usp=sharing
> I will copy the proposed interfaces to this JIRA later. Some sample code can 
> be viewed at: https://github.com/mengxr/spark-ml/
> Please help review the design and post your comments here. Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
