[
https://issues.apache.org/jira/browse/SPARK-11476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14987303#comment-14987303
]
Jason Blochowiak commented on SPARK-11476:
------------------------------------------
The Scala and Java example code incorrectly use normalRDD(). The Python example
code uses uniformRDD().
> Incorrect function referred to in MLib Random data generation documentation
> ---------------------------------------------------------------------------
>
> Key: SPARK-11476
> URL: https://issues.apache.org/jira/browse/SPARK-11476
> Project: Spark
> Issue Type: Documentation
> Components: Documentation
> Affects Versions: 1.5.1
> Reporter: Jason Blochowiak
> Priority: Minor
> Labels: documentation, easyfix
> Original Estimate: 10m
> Remaining Estimate: 10m
>
> http://spark.apache.org/docs/latest/mllib-statistics.html in the "Random data
> generation", a comment in the example code says:
> Generate a random double RDD that contains 1 million i.i.d. values drawn from
> the standard normal distribution `N(0, 1)`, evenly distributed in 10
> partitions.
> But it then calls normalRDD(), which does not do that - a call to
> uniformRDD() with the same parameters would do what the comment claims.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]