[jira] [Updated] (SPARK-23085) API parity for mllib.linalg.Vectors.sparse

2018-01-15 Thread zhengruifeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengruifeng updated SPARK-23085:
-
Description: 
Both {{ML.Vectors#sparse(size: Int, indices: Array[Int], values: Array[Double])}} 
and {{ML.Vectors#sparse(size: Int, elements: Seq[(Int, Double)])}} support 
zero-length vectors.

In old MLLib,

{{MLLib.Vectors.sparse(size: Int, indices: Array[Int], values: Array[Double])}} 
also supports it.

However,

{{MLLib.Vectors.sparse(size: Int, elements: Seq[(Int, Double)])}} requires a 
positive length.

 
{code:java}
scala> org.apache.spark.ml.linalg.Vectors.sparse(0, Array.empty[Int], Array.empty[Double])
res15: org.apache.spark.ml.linalg.Vector = (0,[],[])

scala> org.apache.spark.ml.linalg.Vectors.sparse(0, Array.empty[(Int, Double)])
res16: org.apache.spark.ml.linalg.Vector = (0,[],[])

scala> org.apache.spark.mllib.linalg.Vectors.sparse(0, Array.empty[Int], Array.empty[Double])
res17: org.apache.spark.mllib.linalg.Vector = (0,[],[])

scala> org.apache.spark.mllib.linalg.Vectors.sparse(0, Array.empty[(Int, Double)])
java.lang.IllegalArgumentException: requirement failed: The size of the requested sparse vector must be greater than 0.
  at scala.Predef$.require(Predef.scala:224)
  at org.apache.spark.mllib.linalg.Vectors$.sparse(Vectors.scala:315)
  ... 50 elided
{code}
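
Until the two APIs agree, one possible workaround is to route (index, value) pairs through the array-based overload, which the session above shows accepting size 0. The following is only a sketch: {{SparseCompat}} and {{sparseFromPairs}} are hypothetical names, not part of Spark, and the helper does not reproduce whatever extra validation the Seq-based overload performs.

{code:scala}
import org.apache.spark.mllib.linalg.{Vector, Vectors}

// Hypothetical helper, not part of Spark.
object SparseCompat {
  // Builds an mllib sparse vector from (index, value) pairs by calling the
  // array-based overload, which accepts size == 0 in the session above.
  def sparseFromPairs(size: Int, elements: Seq[(Int, Double)]): Vector = {
    // Sort by index so the indices array is in ascending order.
    val (indices, values) = elements.sortBy(_._1).unzip
    Vectors.sparse(size, indices.toArray, values.toArray)
  }
}
{code}

With this helper, {{SparseCompat.sparseFromPairs(0, Seq.empty)}} goes through the same call as {{res17}} above instead of the overload that throws.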


> API parity for mllib.linalg.Vectors.sparse 
> ---
>
> Key: SPARK-23085
> URL: https://issues.apache.org/jira/browse/SPARK-23085
> Project: Spark
> Issue Type: Improvement
> Components: ML
> Affects Versions: 2.4.0
> Reporter: zhengruifeng
> Priority: Minor
>
