[jira] [Commented] (SPARK-11476) Incorrect function referred to in MLib Random data generation documentation

Sean Owen (JIRA) Tue, 03 Nov 2015 07:58:45 -0800

    [ https://issues.apache.org/jira/browse/SPARK-11476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14987508#comment-14987508 ]


Sean Owen commented on SPARK-11476:
-----------------------------------

The other way around, right? normalRDD generates N(0,1) samples. That is not 
the same thing as the uniform distribution on [0,1]. The Python example is the 
one that needs the fix. Feel free to open a PR.
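
To make the distinction concrete, here is a minimal sketch in plain Python (standard library only, not Spark) that compares draws from N(0, 1) with draws from the uniform distribution on [0, 1). The helper name `summarize`, the sample size, and the seed are illustrative assumptions; in MLlib the corresponding generators would be `RandomRDDs.normalRDD` and `RandomRDDs.uniformRDD`.

```python
import random
import statistics


def summarize(samples):
    """Return basic summary statistics for a list of floats."""
    return {
        "mean": statistics.fmean(samples),
        "stdev": statistics.pstdev(samples),
        "min": min(samples),
        "max": max(samples),
    }


# Fixed seed and sample size are arbitrary choices for this illustration.
rng = random.Random(42)
n = 100_000

# Standard normal N(0, 1): mean near 0, standard deviation near 1,
# values unbounded and frequently negative.
normal_samples = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Uniform on [0, 1): mean near 0.5, standard deviation near
# 1/sqrt(12) (about 0.289), every value inside the unit interval.
uniform_samples = [rng.random() for _ in range(n)]

normal_stats = summarize(normal_samples)
uniform_stats = summarize(uniform_samples)

print("normal :", normal_stats)
print("uniform:", uniform_stats)
```

The normal samples include negative values and have spread close to 1, while the uniform samples never leave [0, 1), so a comment describing one distribution cannot be paired with a call to the other generator.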

> Incorrect function referred to in MLib Random data generation documentation
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-11476
>                 URL: https://issues.apache.org/jira/browse/SPARK-11476
>             Project: Spark
>          Issue Type: Documentation
>          Components: Documentation
>    Affects Versions: 1.5.1
>            Reporter: Jason Blochowiak
>            Priority: Minor
>              Labels: documentation, easyfix
>   Original Estimate: 10m
>  Remaining Estimate: 10m
>
> At http://spark.apache.org/docs/latest/mllib-statistics.html, in the "Random data 
> generation" section, a comment in the example code says:
> Generate a random double RDD that contains 1 million i.i.d. values drawn from 
> the standard normal distribution `N(0, 1)`, evenly distributed in 10 
> partitions.
> But it then calls normalRDD(), which does not do that - a call to 
> uniformRDD() with the same parameters would do what the comment claims.


