[ https://issues.apache.org/jira/browse/SPARK-9084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joseph K. Bradley closed SPARK-9084.
------------------------------------
    Resolution: Later

> Add in support for realtime data predictions using ML PipelineModel
> -------------------------------------------------------------------
>
>                 Key: SPARK-9084
>                 URL: https://issues.apache.org/jira/browse/SPARK-9084
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: Hollin Wilkins
>            Priority: Minor
>
> Currently ML provides excellent support for feature manipulation, model
> selection, and prediction for large datasets. The models can all be easily
> serialized but currently it is not possible to use the fitted models without
> a DataFrame. This means that these models are only good for batch processing.
>
> In order to support realtime ML pipelines, I propose adding in three new
> methods to the Transformer class:
>
> def transform(row: StructuredRow): StructuredRow
> def transform(row: StructuredRow, paramMap: ParamMap): StructuredRow
> def transform(row: StructuredRow, firstParamPair: ParamPair[_],
> otherParamPairs: ParamPair[_]*): StructuredRow
>
> Where a StructuredRow is a case class that is the combination of an
> org.apache.spark.sql.Row and an org.apache.spark.sql.types.StructType. An
> alternative would be to modify the transform method signature to take in two
> objects, a StructType and a Row.
>
> This change necessitates the addition of the new transform method to each
> implementor of the Transformer class.
>
> Following this change, it would be trivial to include the spark jars in an
> API server, deserialize an ML PipelineModel object, take incoming data from
> users, convert it into a StructuredRow and feed it into the PipelineModel to
> get a realtime result.
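
For readers who want to see the shape of the proposal, here is a minimal sketch of a single-row transform path. It is not Spark code and not the reporter's implementation: Field and Schema are hypothetical, Spark-free stand-ins for StructType, the values vector stands in for Row, and RowTransformer, ScaleColumn, ThresholdModel, RowPipelineModel and RealtimeDemo are names invented here for illustration. Only the first proposed overload (without ParamMap) is modeled.

```scala
// Hypothetical, Spark-free stand-ins for org.apache.spark.sql.types.StructType
// (Schema/Field) so the sketch runs without Spark on the classpath.
final case class Field(name: String, dataType: String)

final case class Schema(fields: Vector[Field]) {
  def indexOf(name: String): Int = {
    val i = fields.indexWhere(_.name == name)
    require(i >= 0, s"no such column: $name")
    i
  }
}

// The proposal's StructuredRow: one row of values paired with its schema.
final case class StructuredRow(schema: Schema, values: Vector[Any]) {
  require(schema.fields.length == values.length, "schema/value length mismatch")

  def get[T](name: String): T = values(schema.indexOf(name)).asInstanceOf[T]

  // Returns a new row with one extra column appended; the input is not mutated.
  def withColumn(field: Field, value: Any): StructuredRow =
    StructuredRow(Schema(schema.fields :+ field), values :+ value)
}

// Assumed single-row counterpart of a DataFrame-based transform.
trait RowTransformer {
  def transform(row: StructuredRow): StructuredRow
}

// Illustrative feature stage: multiplies a numeric column by a constant.
final class ScaleColumn(in: String, out: String, factor: Double) extends RowTransformer {
  def transform(row: StructuredRow): StructuredRow =
    row.withColumn(Field(out, "double"), row.get[Double](in) * factor)
}

// Illustrative "model" stage: thresholds a column into a 0.0/1.0 prediction.
final class ThresholdModel(in: String, threshold: Double) extends RowTransformer {
  def transform(row: StructuredRow): StructuredRow =
    row.withColumn(
      Field("prediction", "double"),
      if (row.get[Double](in) > threshold) 1.0 else 0.0)
}

// A fitted pipeline applies its stages, in order, to a single row.
final class RowPipelineModel(stages: Seq[RowTransformer]) extends RowTransformer {
  def transform(row: StructuredRow): StructuredRow =
    stages.foldLeft(row)((r, stage) => stage.transform(r))
}

object RealtimeDemo {
  private val inputSchema = Schema(Vector(Field("x", "double")))

  private val model = new RowPipelineModel(Seq(
    new ScaleColumn("x", "scaled", factor = 2.0),
    new ThresholdModel("scaled", threshold = 5.0)))

  // What an API server handler might do per request: wrap the incoming
  // value in a StructuredRow and push it through the pipeline.
  def score(x: Double): StructuredRow =
    model.transform(StructuredRow(inputSchema, Vector(x)))

  def main(args: Array[String]): Unit =
    println(score(3.0).get[Double]("prediction"))
}
```

Pairing the schema with the values lets each stage look columns up by name, which is what the DataFrame-based transform relies on; the alternative mentioned in the issue (passing a StructType and a Row as two arguments) would carry the same information without the wrapper class.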