[jira] (SPARK-18131) Support returning Vector/Dense Vector from backend

2017-01-31 Thread Shivaram Venkataraman (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847473#comment-15847473
 ] 

Shivaram Venkataraman commented on SPARK-18131:
---

Hmm, this is tricky. We ran into a similar issue in SQL, and we added reader and 
writer objects in SQL that are registered with the methods in core. See 
https://github.com/apache/spark/blob/ce112cec4f9bff222aa256893f94c316662a2a7e/sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala#L39
for how we did that. We could do a similar thing in MLlib as well?
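
A minimal sketch of that registration pattern, not the actual Spark code: the low-level serializer exposes a hook, and a higher-level module registers a writer for its own types at startup. All names here (`CoreSerDe`, `MLSerDe`, `DenseVec`, `registerWriter`) are hypothetical stand-ins, and the wire layout (length followed by doubles) is an assumption made for illustration.

```scala
import java.io.{ByteArrayOutputStream, DataOutputStream}

// Stand-in for the serializer in the low-level module. It has no
// compile-time knowledge of types defined in higher-level modules.
object CoreSerDe {
  // A hook returns true if it recognized and wrote the object.
  type Writer = (DataOutputStream, Object) => Boolean

  // Default hook handles nothing.
  @volatile private var extraWriter: Writer = (_, _) => false

  // Called once by a higher-level module during its initialization.
  def registerWriter(w: Writer): Unit = { extraWriter = w }

  def writeObject(dos: DataOutputStream, obj: Object): Unit = obj match {
    case arr: Array[Double] =>
      // Assumed layout: element count, then the raw doubles.
      dos.writeInt(arr.length)
      arr.foreach(x => dos.writeDouble(x))
    case other =>
      // Unknown to core: give the registered hook a chance.
      if (!extraWriter(dos, other)) {
        throw new IllegalArgumentException(
          s"Unsupported type: ${other.getClass.getName}")
      }
  }
}

// Hypothetical vector type living in a module that core cannot import.
final case class DenseVec(values: Array[Double])

// Stand-in for the higher-level module, which can see both sides.
object MLSerDe {
  def writeMLObject(dos: DataOutputStream, obj: Object): Boolean = obj match {
    case v: DenseVec =>
      // Convert to a type core already understands and re-dispatch.
      CoreSerDe.writeObject(dos, v.values)
      true
    case _ => false
  }

  def register(): Unit = CoreSerDe.registerWriter(writeMLObject)
}
```

With this shape the dependency still points from the higher-level module to core: core only sees a function value, never the vector type itself.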

cc [~mengxr]

> Support returning Vector/Dense Vector from backend
> --
>
> Key: SPARK-18131
> URL: https://issues.apache.org/jira/browse/SPARK-18131
> Project: Spark
> Issue Type: New Feature
> Components: SparkR
> Reporter: Miao Wang
>
> For `spark.logit`, there is a `probabilityCol`, which is a vector in the 
> backend (Scala side). When we do collect(select(df, "probabilityCol")), the 
> backend returns the Java object handle (memory address). We need to implement 
> a method to convert a Vector/Dense Vector column into an R vector that can be 
> read in SparkR. This is a follow-up JIRA to adding `spark.logit`.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] (SPARK-18131) Support returning Vector/Dense Vector from backend

2017-01-31 Thread Miao Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847390#comment-15847390
 ] 

Miao Wang commented on SPARK-18131:
---

[~felixcheung] [~yanboliang] [~shivaram] I am trying to add the serialization 
utility to SerDe.scala. 

Inside `def writeObject(dos: DataOutputStream, obj: Object, jvmObjectTracker: 
JVMObjectTracker): Unit`, one case should be added:

case v: org.apache.spark.ml.linalg.DenseVector => 

This file is in spark-core, so I can't import `org.apache.spark.ml.linalg._` in 
it because of a dependency issue. Do you have any suggestions?

One possibility is to move `Vectors` from mllib-local to the core folder. I am 
not sure whether there are other options.
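
For illustration only, the body of such a case might flatten the vector to its doubles so that the receiving side can rebuild a plain numeric vector. This is a sketch under assumptions: `DenseVectorStub` is a hypothetical stand-in for `org.apache.spark.ml.linalg.DenseVector` so the snippet compiles without Spark, `VectorWire` is an invented helper, and the wire layout (an Int length followed by the doubles) is assumed, not taken from SerDe.scala.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

// Hypothetical stand-in for the real DenseVector class.
final class DenseVectorStub(val values: Array[Double]) {
  def toArray: Array[Double] = values
}

object VectorWire {
  // What the new `case v: ... =>` branch could do: write the element
  // count, then each element as an 8-byte double.
  def writeDenseVector(dos: DataOutputStream, v: DenseVectorStub): Unit = {
    val arr = v.toArray
    dos.writeInt(arr.length)
    arr.foreach(x => dos.writeDouble(x))
  }

  // Mirror of the writer: read the count, then that many doubles, in order.
  def readDoubles(dis: DataInputStream): Array[Double] = {
    val n = dis.readInt()
    Array.fill(n)(dis.readDouble())
  }
}
```

A round trip through `writeDenseVector` and `readDoubles` returns the original values, which is the property the reader on the R side would rely on.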

Thanks! 
