[ https://issues.apache.org/jira/browse/SPARK-29380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhengruifeng reassigned SPARK-29380:
------------------------------------

    Assignee: zhengruifeng

> RFormula avoid repeated 'first' jobs to get vector size
> -------------------------------------------------------
>
>                 Key: SPARK-29380
>                 URL: https://issues.apache.org/jira/browse/SPARK-29380
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 3.0.0
>            Reporter: zhengruifeng
>            Assignee: zhengruifeng
>            Priority: Minor
>
> In the current implementation, {{RFormula}} triggers one {{first}} job per
> vector column to get the vector size whenever the size cannot be obtained
> from {{AttributeGroup}}.
> This can be optimized by getting the first row lazily and reusing it for each
> vector column.
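
The optimization described above can be illustrated with a minimal sketch. It is written in plain Python rather than Spark's Scala, and it is not the actual patch: `FakeDataset` and `VectorSizeResolver` are hypothetical names invented for illustration, `first()` here only counts calls instead of launching a job, and the `known_sizes` dict stands in for sizes that would come from `AttributeGroup` metadata.

```python
from functools import cached_property


class FakeDataset:
    """Hypothetical stand-in for a DataFrame; counts how often first() runs."""

    def __init__(self, rows):
        self._rows = rows
        self.first_calls = 0

    def first(self):
        # In Spark this would launch a job; here we only count the calls.
        self.first_calls += 1
        return self._rows[0]


class VectorSizeResolver:
    """Resolves vector sizes, fetching the first row at most once."""

    def __init__(self, dataset, known_sizes):
        self._dataset = dataset
        # Assumption: sizes already known from metadata (the AttributeGroup case).
        self._known_sizes = known_sizes

    @cached_property
    def _first_row(self):
        # Evaluated lazily on first access, then cached for later columns.
        return self._dataset.first()

    def size_of(self, column):
        if column in self._known_sizes:
            return self._known_sizes[column]
        return len(self._first_row[column])


dataset = FakeDataset([{"a": [1.0, 2.0], "b": [0.0, 1.0, 2.0], "c": [5.0]}])
resolver = VectorSizeResolver(dataset, known_sizes={"c": 1})
sizes = {col: resolver.size_of(col) for col in ("a", "b", "c")}
```

Columns `a` and `b` both lack metadata, yet `first()` runs only once because the row is cached after the first lookup. If every size were available from metadata, `first()` would not run at all.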