[jira] [Commented] (SPARK-21483) Make org.apache.spark.ml.linalg.Vector bean-compliant so it can be used in Encoders.bean(Vector.class)
[ https://issues.apache.org/jira/browse/SPARK-21483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094607#comment-16094607 ]

Aseem Bansal commented on SPARK-21483:

Some pseudocode to show what I am trying to achieve:

{code:java}
class MyTransformer implements Serializable {
    public FeaturesAndLabel transform(RawData rawData) {
        // Some logic which creates features and labels from raw data.
        // FeaturesAndLabel is a bean which contains a SparseVector as features and a double as label.
    }
}
{code}

{code:java}
Dataset dataset = // read from somewhere and create a Dataset of RawData beans
Dataset featuresAndLabels = dataset.transform(new MyTransformer()::transform);
// use features and labels for machine learning
{code}

> Make org.apache.spark.ml.linalg.Vector bean-compliant so it can be used in Encoders.bean(Vector.class)
>
>                 Key: SPARK-21483
>                 URL: https://issues.apache.org/jira/browse/SPARK-21483
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.1.0
>            Reporter: Aseem Bansal
>            Priority: Minor
>
> The class org.apache.spark.ml.linalg.Vector is currently not bean-compliant as per Spark.
> This makes it impossible to create a Vector via dataset.transform. It should be made bean-compliant so it can be used.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-21483) Make org.apache.spark.ml.linalg.Vector bean-compliant so it can be used in Encoders.bean(Vector.class)
[ https://issues.apache.org/jira/browse/SPARK-21483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094525#comment-16094525 ]

Nick Pentreath commented on SPARK-21483:

Perhaps you can supply some example code for what you're trying to do?
[jira] [Commented] (SPARK-21483) Make org.apache.spark.ml.linalg.Vector bean-compliant so it can be used in Encoders.bean(Vector.class)
[ https://issues.apache.org/jira/browse/SPARK-21483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094409#comment-16094409 ]

Sean Owen commented on SPARK-21483:

VectorUDT?
https://spark.apache.org/docs/2.1.1/api/java/org/apache/spark/mllib/linalg/VectorUDT.html

Not sure then. I doubt the answer is to make it a bean, but maybe offering a built in encoder? Maybe I'm missing something.
[jira] [Commented] (SPARK-21483) Make org.apache.spark.ml.linalg.Vector bean-compliant so it can be used in Encoders.bean(Vector.class)
[ https://issues.apache.org/jira/browse/SPARK-21483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094314#comment-16094314 ]

Aseem Bansal commented on SPARK-21483:

No, it does not. Can you give a link to what you are referring to? And I am not using Spark SQL. I am using Dataset's transformations only.
[jira] [Commented] (SPARK-21483) Make org.apache.spark.ml.linalg.Vector bean-compliant so it can be used in Encoders.bean(Vector.class)
[ https://issues.apache.org/jira/browse/SPARK-21483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094302#comment-16094302 ]

Sean Owen commented on SPARK-21483:

Hm, I thought there was an Encoder for Vector already. But there's a UDT for Vector of course. I've used that to interact with Vectors in Spark SQL. Does that answer your use case?
[jira] [Commented] (SPARK-21483) Make org.apache.spark.ml.linalg.Vector bean-compliant so it can be used in Encoders.bean(Vector.class)
[ https://issues.apache.org/jira/browse/SPARK-21483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094297#comment-16094297 ]

Aseem Bansal commented on SPARK-21483:

How would you encode it otherwise?
[jira] [Commented] (SPARK-21483) Make org.apache.spark.ml.linalg.Vector bean-compliant so it can be used in Encoders.bean(Vector.class)
[ https://issues.apache.org/jira/browse/SPARK-21483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16094283#comment-16094283 ]

Sean Owen commented on SPARK-21483:

(Not major, and I don't think I'd open multiple JIRAs)

A Vector is not conceptually a bean. Again, why must it be a bean? It can be encoded otherwise.
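
For context on the "is it a bean?" question raised in this thread, here is a minimal sketch using only the JDK's java.beans.Introspector. It shows what JavaBean-style introspection reports for a getter/setter class versus an immutable, vector-like class. LabelBean and ImmutableVec are hypothetical stand-ins invented for illustration (ImmutableVec is not Spark's Vector). That Encoders.bean depends on this kind of getter/setter discovery is an assumption here, not something the thread states.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

public class BeanCheck {

    // Hypothetical bean-style class: no-arg constructor plus a getter/setter pair.
    public static class LabelBean {
        private double label;
        public LabelBean() {}
        public double getLabel() { return label; }
        public void setLabel(double label) { this.label = label; }
    }

    // Hypothetical immutable, vector-like class (a stand-in, NOT Spark's Vector):
    // no no-arg constructor and no getX/setX pairs.
    public static final class ImmutableVec {
        private final double[] values;
        public ImmutableVec(double[] values) { this.values = values.clone(); }
        public int size() { return values.length; }
        public double apply(int i) { return values[i]; }
    }

    // Returns the names of properties that have both a getter and a setter,
    // as discovered by standard JavaBean introspection.
    public static List<String> readWriteProperties(Class<?> cls) {
        try {
            // Stop at Object so the inherited "class" property is excluded.
            BeanInfo info = Introspector.getBeanInfo(cls, Object.class);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    names.add(pd.getName());
                }
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readWriteProperties(LabelBean.class));    // prints [label]
        System.out.println(readWriteProperties(ImmutableVec.class)); // prints []
    }
}
```

The second call finds no read/write properties: a class designed as an immutable value exposes nothing for bean introspection to work with. That fits the suggestion above that such a type may be better served by a dedicated encoder or UDT than by being reshaped into a bean.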