[ 
https://issues.apache.org/jira/browse/SPARK-21483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094607#comment-16094607
 ] 

Aseem Bansal edited comment on SPARK-21483 at 7/20/17 12:29 PM:
----------------------------------------------------------------

Here is some pseudocode to show what I am trying to achieve:

{code:java}
class MyTransformer implements Serializable {

    public FeaturesAndLabel transform(RawData rawData) {
        // Some logic which creates features and labels from raw data.
        // RawData is just a Java bean.
        // FeaturesAndLabel is a bean which contains a SparseVector as
        // features and a double as label.
    }
}
{code}

{code:java}

Dataset<RawData> dataset = // read from somewhere and create a Dataset of RawData beans
Dataset<FeaturesAndLabel> featuresAndLabels = dataset.transform(new MyTransformer()::transform);

//use features and labels for machine learning
{code}
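
For readers who want something they can run, here is a minimal plain-Java sketch of the per-record pattern in the pseudocode above. It uses `java.util.stream` instead of Spark so that it has no dependencies. The bean fields, the `transformAll` helper, and the feature logic are all made up for illustration; only the names `MyTransformer`, `RawData`, and `FeaturesAndLabel` come from the comment. In Spark itself, a per-record function like this would most likely go through `Dataset.map` together with an `Encoder` for the result type, since `Dataset.transform` expects a function from one Dataset to another; treat that as a reading of the API, not as what the reporter ran.

```java
import java.io.Serializable;
import java.util.List;
import java.util.stream.Collectors;

public class TransformSketch {

    // Stand-in for the RawData bean; the single "value" field is hypothetical.
    public static class RawData implements Serializable {
        private double value;
        public RawData() {}
        public double getValue() { return value; }
        public void setValue(double value) { this.value = value; }
    }

    // Stand-in for FeaturesAndLabel; a double[] replaces SparseVector here.
    public static class FeaturesAndLabel implements Serializable {
        private double[] features;
        private double label;
        public FeaturesAndLabel() {}
        public double[] getFeatures() { return features; }
        public void setFeatures(double[] features) { this.features = features; }
        public double getLabel() { return label; }
        public void setLabel(double label) { this.label = label; }
    }

    public static class MyTransformer implements Serializable {
        // Placeholder logic: features are (value, value squared),
        // label is 1.0 for positive values and 0.0 otherwise.
        public FeaturesAndLabel transform(RawData rawData) {
            double v = rawData.getValue();
            FeaturesAndLabel out = new FeaturesAndLabel();
            out.setFeatures(new double[] { v, v * v });
            out.setLabel(v > 0 ? 1.0 : 0.0);
            return out;
        }
    }

    // Applies the transformer to every record via a method reference.
    public static List<FeaturesAndLabel> transformAll(List<RawData> rows) {
        MyTransformer transformer = new MyTransformer();
        return rows.stream().map(transformer::transform).collect(Collectors.toList());
    }
}
```

The transformer stays a plain `Serializable` class with one method, so the same method reference could be handed to whichever per-record API is in use.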



> Make org.apache.spark.ml.linalg.Vector bean-compliant so it can be used in 
> Encoders.bean(Vector.class)
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-21483
>                 URL: https://issues.apache.org/jira/browse/SPARK-21483
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.1.0
>            Reporter: Aseem Bansal
>            Priority: Minor
>
> The class org.apache.spark.ml.linalg.Vector is currently not bean-compliant 
> by Spark's rules.
> This makes it impossible to create a Vector via a dataset.transform. It should 
> be made bean-compliant so it can be used.
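
To make "bean-compliant" concrete, the sketch below uses the JDK's `java.beans.Introspector` to list the properties of a class that have both a getter and a setter. It assumes that a bean encoder looks for such conventional pairs; the issue does not spell out Spark's exact rules. `LabeledPoint`, `VectorLike`, and `readWriteProperties` are hypothetical names for illustration; `VectorLike` only imitates an immutable vector type and is not the real `org.apache.spark.ml.linalg.Vector`.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

public class BeanCheck {

    // Hypothetical bean: public no-arg constructor plus a
    // getter/setter pair for every property.
    public static class LabeledPoint {
        private double label;
        private double[] features;
        public LabeledPoint() {}
        public double getLabel() { return label; }
        public void setLabel(double label) { this.label = label; }
        public double[] getFeatures() { return features; }
        public void setFeatures(double[] features) { this.features = features; }
    }

    // Hypothetical immutable vector-like class: its accessors do not
    // follow the getX/setX naming pattern and it has no setters.
    public static class VectorLike {
        private final double[] values;
        public VectorLike(double[] values) { this.values = values.clone(); }
        public int size() { return values.length; }
        public double[] toArray() { return values.clone(); }
    }

    // Returns the names of properties that have both a getter and a setter.
    public static List<String> readWriteProperties(Class<?> type) {
        List<String> names = new ArrayList<>();
        try {
            for (PropertyDescriptor pd : Introspector.getBeanInfo(type).getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    names.add(pd.getName());
                }
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Could not introspect " + type, e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(readWriteProperties(LabeledPoint.class));
        System.out.println(readWriteProperties(VectorLike.class));
    }
}
```

Under these assumptions, `LabeledPoint` exposes `features` and `label` as read/write properties, while `VectorLike` exposes none. That is the kind of gap this issue describes for Vector.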


