[jira] [Commented] (SPARK-10413) Model should support prediction on single instance
[ https://issues.apache.org/jira/browse/SPARK-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15741169#comment-15741169 ]

Yanbo Liang commented on SPARK-10413:
-------------------------------------

[~anshbansal] Yeah, we will make this feature a high priority in the Spark 2.2 release cycle. I don't think there is a JIRA ticket yet for a predict method on the whole pipeline model; it depends on this feature. Thanks.

> Model should support prediction on single instance
> --------------------------------------------------
>
>                 Key: SPARK-10413
>                 URL: https://issues.apache.org/jira/browse/SPARK-10413
>             Project: Spark
>          Issue Type: Umbrella
>          Components: ML
>            Reporter: Xiangrui Meng
>            Priority: Critical
>
> Currently models in the pipeline API only implement transform(DataFrame). It
> would be quite useful to support prediction on single instance.
> UPDATE: This issue is for making predictions with single models. We can make
> methods like {{def predict(features: Vector): Double}} public.
> * This issue is *not* for single-instance prediction for full Pipelines,
> which would require making predictions on {{Row}}s.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-10413) Model should support prediction on single instance
[ https://issues.apache.org/jira/browse/SPARK-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15734495#comment-15734495 ]

Aseem Bansal commented on SPARK-10413:
--------------------------------------

Hi,

Is anyone working on this?
[jira] [Commented] (SPARK-10413) Model should support prediction on single instance
[ https://issues.apache.org/jira/browse/SPARK-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15547855#comment-15547855 ]

Hussein Hazimeh commented on SPARK-10413:
-----------------------------------------

As discussed in [SPARK-16431|https://issues.apache.org/jira/browse/SPARK-16431], I think it would be useful to extend this further to support feature transformations on single instances. This would enable low-latency feature transformations and could improve code readability and testing.

As a first step, the body of each feature transformer's UDF can be refactored into a new low-level method (e.g. called "transformInstance") that accepts single instances with raw data types (double, vector, etc.). The new method can later be used to add full pipeline support for transforming and predicting single instances.

[~josephkb] let me know your thoughts on this.
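The refactoring proposed in the comment above (moving a transformer's per-value logic out of the UDF body into a single-instance method) might look roughly like the sketch below. It is only an illustration under stated assumptions: `BinarizerSketch` is a hypothetical stand-in class, not Spark's Binarizer, a plain `Seq[Double]` stands in for a DataFrame column, and `transformInstance` is the method name suggested in the comment rather than an existing API.

```scala
// Hypothetical binarizer-like transformer; not a Spark class.
final class BinarizerSketch(threshold: Double) {

  // Low-level method on a raw value: the logic that would otherwise
  // live inside the UDF body.
  def transformInstance(value: Double): Double =
    if (value > threshold) 1.0 else 0.0

  // Column-level path (a Seq stands in for a DataFrame column here):
  // it reuses the single-instance method for every value.
  def transform(column: Seq[Double]): Seq[Double] =
    column.map(transformInstance)
}

object BinarizerSketchDemo {
  def main(args: Array[String]): Unit = {
    val binarizer = new BinarizerSketch(threshold = 0.5)
    println(binarizer.transformInstance(0.7))
    println(binarizer.transform(Seq(0.1, 0.5, 0.9)))
  }
}
```

Keeping the column path as a thin wrapper means the single-instance method can be unit tested without building a DataFrame, which is the readability and testing benefit the comment mentions.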
[jira] [Commented] (SPARK-10413) Model should support prediction on single instance
[ https://issues.apache.org/jira/browse/SPARK-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15138756#comment-15138756 ]

Rares Mirica commented on SPARK-10413:
--------------------------------------

I don't know if I am reading this right, but processing pipelines often contain a relatively large number of stages. Supporting single instances only on strong types means that the pipeline will need to be split into the column-manipulation stages (run over a DataFrame, e.g. PolynomialExpansion creates a column that is then used as the features for prediction) and the single-instance run (in this case, the prediction on a model).

Supporting a single Row instance would open the way for local execution of an entire pipeline (presumably loaded from storage), which opens up applications in the low-latency space (online prediction with a REST front end, for example).
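To make the local-execution idea in the comment above concrete, here is a minimal sketch of running every stage of a pipeline over one row in-process. Everything in it is an assumption for illustration: `Row` is modelled as a plain `Map[String, Any]`, `Stage` is just a function from row to row, and the two stages (`expand`, `predictStage`) only loosely mimic a polynomial-expansion step followed by a model; none of these names are Spark APIs.

```scala
object LocalPipelineSketch {
  // Hypothetical row and stage types; a Map stands in for a Spark Row.
  type Row = Map[String, Any]
  type Stage = Row => Row

  // Feature stage: derives a "features" column from the raw "x" column,
  // loosely in the spirit of a polynomial expansion.
  val expand: Stage = row => {
    val x = row("x").asInstanceOf[Double]
    row.updated("features", Array(x, x * x))
  }

  // Model stage: reads "features" and appends a "prediction" column.
  // The coefficients are made up for the example.
  val predictStage: Stage = row => {
    val f = row("features").asInstanceOf[Array[Double]]
    row.updated("prediction", 1.0 + 2.0 * f(0) + 0.5 * f(1))
  }

  // Runs the stages in order on a single row, with no cluster involved.
  def run(stages: Seq[Stage], row: Row): Row =
    stages.foldLeft(row)((current, stage) => stage(current))

  def main(args: Array[String]): Unit = {
    val result = run(Seq(expand, predictStage), Map("x" -> 2.0))
    println(result("prediction"))
  }
}
```

Because each stage consumes and produces a whole row, the pipeline does not need to be split into a DataFrame part and a single-instance part, which is the point the comment raises.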
[jira] [Commented] (SPARK-10413) Model should support prediction on single instance
[ https://issues.apache.org/jira/browse/SPARK-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940197#comment-14940197 ]

Joseph K. Bradley commented on SPARK-10413:
-------------------------------------------

SGTM
[jira] [Commented] (SPARK-10413) Model should support prediction on single instance
[ https://issues.apache.org/jira/browse/SPARK-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936947#comment-14936947 ]

Yanbo Liang commented on SPARK-10413:
-------------------------------------

[~mengxr] [~josephkb] I found that this issue involves too many models and files, so let's split this JIRA into sub-tasks. I have opened SPARK-10884 to make all classification and regression models support prediction on a single instance. Other community members who are interested in this issue can open other sub-tasks and work on them.
[jira] [Commented] (SPARK-10413) Model should support prediction on single instance
[ https://issues.apache.org/jira/browse/SPARK-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905230#comment-14905230 ]

Joseph K. Bradley commented on SPARK-10413:
-------------------------------------------

For API, I think my main question is whether predict() should take strong types (Vector, etc.) and/or Rows. I prefer supporting strong types first (as you are doing) since we could add support for Rows later on (although there could be difficult questions about missing schema for Scala/Java).

For raw & probability, I would again vote for just making those public. But that could be done at a later time.
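The trade-off raised in the comment above, strongly typed input versus Row-like input, can be illustrated with a small sketch. The names are hypothetical and nothing here is a Spark API: `ThresholdModelSketch` is a toy model, `Array[Double]` stands in for a Vector, and a `Map[String, Any]` stands in for a Row that carries no schema.

```scala
// Hypothetical toy model; not a Spark class.
final class ThresholdModelSketch(threshold: Double,
                                 featuresCol: String = "features") {

  // Strongly typed input: the compiler guarantees the shape of the argument.
  def predict(features: Array[Double]): Double =
    if (features.sum > threshold) 1.0 else 0.0

  // Row-like input: with no schema attached, the column has to be looked up
  // and type-checked at runtime, so the call can fail in ways predict cannot.
  def predictRow(row: Map[String, Any]): Either[String, Double] =
    row.get(featuresCol) match {
      case Some(values: Array[Double]) => Right(predict(values))
      case Some(_) => Left(s"column '$featuresCol' is not an Array[Double]")
      case None    => Left(s"column '$featuresCol' is missing")
    }
}

object ThresholdModelSketchDemo {
  def main(args: Array[String]): Unit = {
    val model = new ThresholdModelSketch(threshold = 1.0)
    println(model.predict(Array(0.5, 1.0)))
    println(model.predictRow(Map("features" -> Array(0.2, 0.3))))
    println(model.predictRow(Map("other" -> 1)))
  }
}
```

The typed method needs no error channel at all, while the row-based variant has to report missing or mistyped columns, which hints at why starting with strong types is the simpler first step.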
[jira] [Commented] (SPARK-10413) Model should support prediction on single instance
[ https://issues.apache.org/jira/browse/SPARK-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904291#comment-14904291 ]

Apache Spark commented on SPARK-10413:
--------------------------------------

User 'yanboliang' has created a pull request for this issue:
https://github.com/apache/spark/pull/8883
[jira] [Commented] (SPARK-10413) Model should support prediction on single instance
[ https://issues.apache.org/jira/browse/SPARK-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904276#comment-14904276 ]

Yanbo Liang commented on SPARK-10413:
-------------------------------------

[~mengxr] I think supporting prediction on a single instance for PredictionModel and its subclasses is not complex: we just make predict() public and add test cases. For other Model subclasses, we should add a predict function for single-instance prediction, and the transform function will then use that predict function as a UDF.

One issue we should discuss: after we make predict public, should we also make other functions such as predictRaw, predictProbability, etc. public for single instances?
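The arrangement described in the comment above, a public single-instance predict that the batch transform reuses per row, can be sketched without any Spark dependency. This is a minimal illustration under assumptions: `LinearModelSketch` is a hypothetical class, `Array[Double]` stands in for a Vector, and a `Seq` of feature arrays stands in for a DataFrame column that a UDF would be applied to.

```scala
// Hypothetical stand-in for an ML model; not one of Spark's classes.
final class LinearModelSketch(weights: Array[Double], intercept: Double) {

  // Single-instance prediction, exposed publicly as the comment proposes.
  def predict(features: Array[Double]): Double = {
    require(features.length == weights.length, "feature size mismatch")
    var sum = intercept
    var i = 0
    while (i < weights.length) {
      sum += weights(i) * features(i)
      i += 1
    }
    sum
  }

  // Batch path: applies predict to each row, the way transform would wrap
  // predict in a UDF over the features column.
  def transform(rows: Seq[Array[Double]]): Seq[Double] =
    rows.map(predict)
}

object LinearModelSketchDemo {
  def main(args: Array[String]): Unit = {
    val model = new LinearModelSketch(Array(2.0, -1.0), intercept = 0.5)
    println(model.predict(Array(1.0, 3.0)))
    println(model.transform(Seq(Array(1.0, 3.0), Array(0.0, 0.0))))
  }
}
```

With this shape, the batch and single-instance paths cannot drift apart, because the batch path contains no prediction logic of its own.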
[jira] [Commented] (SPARK-10413) Model should support prediction on single instance
[ https://issues.apache.org/jira/browse/SPARK-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14728072#comment-14728072 ]

Xiangrui Meng commented on SPARK-10413:
---------------------------------------

[~yanboliang] Let's finish your pending PRs and the log-linear model first :)
[jira] [Commented] (SPARK-10413) Model should support prediction on single instance
[ https://issues.apache.org/jira/browse/SPARK-10413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726909#comment-14726909 ]

Yanbo Liang commented on SPARK-10413:
-------------------------------------

I will try to work on it.