[jira] [Updated] (SPARK-23085) API parity for mllib.linalg.Vectors.sparse
[ https://issues.apache.org/jira/browse/SPARK-23085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhengruifeng updated SPARK-23085:
---
    Description:
Both {{ML.Vectors#sparse(size: Int, indices: Array[Int], values: Array[Double])}} and {{ML.Vectors#sparse(size: Int, elements: Seq[(Int, Double)])}} support zero-length vectors.

In old MLLib, {{MLLib.Vectors.sparse(size: Int, indices: Array[Int], values: Array[Double])}} also supports it.

However, {{MLLib.Vectors.sparse(size: Int, elements: Seq[(Int, Double)])}} requires a positive length.

{code:java}
scala> org.apache.spark.ml.linalg.Vectors.sparse(0, Array.empty[Int], Array.empty[Double])
res15: org.apache.spark.ml.linalg.Vector = (0,[],[])

scala> org.apache.spark.ml.linalg.Vectors.sparse(0, Array.empty[(Int, Double)])
res16: org.apache.spark.ml.linalg.Vector = (0,[],[])

scala> org.apache.spark.mllib.linalg.Vectors.sparse(0, Array.empty[Int], Array.empty[Double])
res17: org.apache.spark.mllib.linalg.Vector = (0,[],[])

scala> org.apache.spark.mllib.linalg.Vectors.sparse(0, Array.empty[(Int, Double)])
java.lang.IllegalArgumentException: requirement failed: The size of the requested sparse vector must be greater than 0.
  at scala.Predef$.require(Predef.scala:224)
  at org.apache.spark.mllib.linalg.Vectors$.sparse(Vectors.scala:315)
  ... 50 elided
{code}

  was:
Both {ML.Vectors#sparse size: Int, indices: Array[Int], values: Array[Double] } and {{ ML.Vectors#sparse(size: Int, elements: Seq[(Int, Double)]) }} support zero-length vectors.

In old MLLib, {{MLLib.Vectors.sparse( size: Int, indices: Array[Int], values: Array[Double] )}} also supports it.

However, {{ MLLib.Vectors.sparse(size: Int, elements: Seq[(Int, Double)]) }} requires a positive length.

{code:java}
scala> org.apache.spark.ml.linalg.Vectors.sparse(0, Array.empty[Int], Array.empty[Double])
res15: org.apache.spark.ml.linalg.Vector = (0,[],[])

scala> org.apache.spark.ml.linalg.Vectors.sparse(0, Array.empty[(Int, Double)])
res16: org.apache.spark.ml.linalg.Vector = (0,[],[])

scala> org.apache.spark.mllib.linalg.Vectors.sparse(0, Array.empty[Int], Array.empty[Double])
res17: org.apache.spark.mllib.linalg.Vector = (0,[],[])

scala> org.apache.spark.mllib.linalg.Vectors.sparse(0, Array.empty[(Int, Double)])
java.lang.IllegalArgumentException: requirement failed: The size of the requested sparse vector must be greater than 0.
  at scala.Predef$.require(Predef.scala:224)
  at org.apache.spark.mllib.linalg.Vectors$.sparse(Vectors.scala:315)
  ... 50 elided
{code}

> API parity for mllib.linalg.Vectors.sparse
> ---
>
>                 Key: SPARK-23085
>                 URL: https://issues.apache.org/jira/browse/SPARK-23085
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.4.0
>            Reporter: zhengruifeng
>            Priority: Minor
>
> Both {{ML.Vectors#sparse(size: Int, indices: Array[Int], values: Array[Double])}} and {{ML.Vectors#sparse(size: Int, elements: Seq[(Int, Double)])}} support zero-length vectors.
>
> In old MLLib, {{MLLib.Vectors.sparse(size: Int, indices: Array[Int], values: Array[Double])}} also supports it.
>
> However, {{MLLib.Vectors.sparse(size: Int, elements: Seq[(Int, Double)])}} requires a positive length.
>
> {code:java}
> scala> org.apache.spark.ml.linalg.Vectors.sparse(0, Array.empty[Int], Array.empty[Double])
> res15: org.apache.spark.ml.linalg.Vector = (0,[],[])
>
> scala> org.apache.spark.ml.linalg.Vectors.sparse(0, Array.empty[(Int, Double)])
> res16: org.apache.spark.ml.linalg.Vector = (0,[],[])
>
> scala> org.apache.spark.mllib.linalg.Vectors.sparse(0, Array.empty[Int], Array.empty[Double])
> res17: org.apache.spark.mllib.linalg.Vector = (0,[],[])
>
> scala> org.apache.spark.mllib.linalg.Vectors.sparse(0, Array.empty[(Int, Double)])
> java.lang.IllegalArgumentException: requirement failed: The size of the requested
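For illustration only, the mismatch reported here can be modelled without Spark. The sketch below is not Spark's code and not the fix for this issue: {{SparseVec}} and {{SparseVecs}} are hypothetical stand-ins, and the exact checks and messages are assumptions. It shows one way the two overloads could share a single size rule, by having the pair-based factory delegate to the array-based one so that a size of 0 is accepted by both.

```scala
// Hypothetical stand-in for a sparse vector; this is NOT Spark's SparseVector.
final case class SparseVec(size: Int, indices: Array[Int], values: Array[Double])

object SparseVecs {
  // Array-based factory: accepts size == 0 (assumed rule: size must be non-negative).
  def sparse(size: Int, indices: Array[Int], values: Array[Double]): SparseVec = {
    require(size >= 0, s"The size of the requested sparse vector must be no less than 0, got $size.")
    require(indices.length == values.length, "indices and values must have the same length.")
    SparseVec(size, indices, values)
  }

  // Pair-based factory: sorts the pairs by index, validates them, then delegates,
  // so the size check lives in exactly one place.
  def sparse(size: Int, elements: Seq[(Int, Double)]): SparseVec = {
    val (indices, values) = elements.sortBy(_._1).unzip
    var prev = -1
    indices.foreach { i =>
      require(prev < i, s"Found duplicate or negative index: $i.")
      prev = i
    }
    require(prev < size, s"Index $prev is out of range for a vector of size $size.")
    sparse(size, indices.toArray, values.toArray)
  }
}

object Demo extends App {
  // Both overloads accept a zero-length vector in this sketch.
  val fromArrays = SparseVecs.sparse(0, Array.empty[Int], Array.empty[Double])
  val fromPairs  = SparseVecs.sparse(0, Seq.empty[(Int, Double)])
  println(s"${fromArrays.size} ${fromPairs.size}")
}
```

Delegating from one overload to the other is a design choice made for this sketch: with a single {{require}} on the size, the two entry points cannot drift apart the way the REPL session above shows.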