[ 
https://issues.apache.org/jira/browse/SPARK-12372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph K. Bradley updated SPARK-12372:
--------------------------------------
    Description: 
This JIRA is now for documenting the limitations of MLlib's local linear algebra 
types.  Basically, we should make it clear in the user guide that they provide 
simple functionality but are not a full-fledged local linear algebra library.  We 
should also recommend libraries for users to use in the meantime: probably 
Breeze for Scala (and Java?) and NumPy/SciPy for Python.
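
As a rough illustration of the Python recommendation, the sketch below does the arithmetic in NumPy instead of on MLlib vectors. It assumes the values were obtained from the MLlib vectors beforehand (for example via {{DenseVector.toArray()}}, which returns a NumPy array); the plain arrays here are stand-ins so the snippet runs without Spark.

```python
import numpy as np

# Stand-ins for x.toArray() and y.toArray() on MLlib DenseVectors
# (assumption: the data has already been pulled out as NumPy arrays).
x = np.array([0.0, 1.0, 0.0, 7.0, 0.0])
y = np.array([2.0, 0.0, 3.0, 4.0, 5.0])

neg_x = -x       # unary minus is defined for NumPy arrays
diff = -y + x    # so expressions that start with a unary minus also work

print(neg_x)
print(diff)
```

If an MLlib vector is needed again afterwards, passing the resulting array to {{Vectors.dense(...)}} should work, since that factory accepts NumPy arrays.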

*Original JIRA title*: Unary operator "-" fails for MLlib vectors

*Original JIRA text, as an example of the need for better docs*:
Consider the following snippet in pyspark 1.5.2:

{code:none}
>>> from pyspark.mllib.linalg import Vectors
>>> x = Vectors.dense([0.0, 1.0, 0.0, 7.0, 0.0])
>>> x
DenseVector([0.0, 1.0, 0.0, 7.0, 0.0])
>>> -x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: func() takes exactly 2 arguments (1 given)
>>> y = Vectors.dense([2.0, 0.0, 3.0, 4.0, 5.0])
>>> y
DenseVector([2.0, 0.0, 3.0, 4.0, 5.0])
>>> x-y
DenseVector([-2.0, 1.0, -3.0, 3.0, -5.0])
>>> -y+x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: func() takes exactly 2 arguments (1 given)
>>> -1*x
DenseVector([-0.0, -1.0, -0.0, -7.0, -0.0])
{code}

Clearly, the unary operator {{-}} (minus) for vectors fails, giving errors for 
expressions like {{-x}} and {{-y+x}}, despite the fact that {{x-y}} behaves as 
expected.
The last operation, {{-1*x}}, although mathematically "correct", includes minus 
signs for the zero entries, which again is normally not expected.
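
The minus signs on the zero entries are standard IEEE 754 floating-point behavior (negative zero) rather than something specific to MLlib. The following is a minimal sketch in plain Python, using a list of floats as a stand-in for the vector; the helper name {{is_negative_zero}} is made up for illustration.

```python
import math

def is_negative_zero(v):
    """Return True if v is the IEEE 754 negative zero."""
    return v == 0.0 and math.copysign(1.0, v) < 0

# Plain floats standing in for the entries of x.
values = [0.0, 1.0, 0.0, 7.0, 0.0]

# Multiplying a positive zero by -1 yields negative zero.
scaled = [-1 * v for v in values]

# Adding positive zero turns -0.0 into 0.0 and leaves other values unchanged.
normalized = [v + 0.0 for v in scaled]

print(scaled)
print(normalized)
```

Negative and positive zero compare equal, so the difference only shows up when the values are printed or when the sign bit is inspected explicitly.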

  was:
Consider the following snippet in pyspark 1.5.2:

{code:none}
>>> from pyspark.mllib.linalg import Vectors
>>> x = Vectors.dense([0.0, 1.0, 0.0, 7.0, 0.0])
>>> x
DenseVector([0.0, 1.0, 0.0, 7.0, 0.0])
>>> -x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: func() takes exactly 2 arguments (1 given)
>>> y = Vectors.dense([2.0, 0.0, 3.0, 4.0, 5.0])
>>> y
DenseVector([2.0, 0.0, 3.0, 4.0, 5.0])
>>> x-y
DenseVector([-2.0, 1.0, -3.0, 3.0, -5.0])
>>> -y+x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: func() takes exactly 2 arguments (1 given)
>>> -1*x
DenseVector([-0.0, -1.0, -0.0, -7.0, -0.0])
{code}

Clearly, the unary operator {{-}} (minus) for vectors fails, giving errors for 
expressions like {{-x}} and {{-y+x}}, despite the fact that {{x-y}} behaves as 
expected.
The last operation, {{-1*x}}, although mathematically "correct", includes minus 
signs for the zero entries, which again is normally not expected.


> Document limitations of MLlib linear algebra
> --------------------------------------------
>
>                 Key: SPARK-12372
>                 URL: https://issues.apache.org/jira/browse/SPARK-12372
>             Project: Spark
>          Issue Type: Documentation
>          Components: Documentation, MLlib
>    Affects Versions: 1.5.2
>            Reporter: Christos Iraklis Tsatsoulis
>



