[ https://issues.apache.org/jira/browse/SPARK-3509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14131554#comment-14131554 ]
Egor Pakhomov commented on SPARK-3509:
--------------------------------------

So far I only have a crude workaround for my own use; a proper implementation is still needed. The workaround:

    import scala.util.Random
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint

    // Crude workaround; a better implementation is the subject of SPARK-3509.
    // Assumes a SparkContext named `sc` is in scope.
    def randomRegressionLabeledFeatureSet(size: Int, featureNumber: Int) = {
      // One seed shared by all points, so every label is built with the same sequence of operations.
      val seed = Random.nextLong()
      sc.parallelize(1 to size, 10).map { _ =>
        val features = (1 to featureNumber).map(_ => Random.nextDouble()).toArray
        var seedCopy = seed
        // Fold the features into a label, multiplying or adding at each step depending on the seed.
        val result = features.reduceLeft { (a, b) =>
          if (seedCopy % 3 == 0) {
            seedCopy = seedCopy / 3
            a * b
          } else {
            seedCopy = seedCopy / 2
            a + b
          }
        }
        new LabeledPoint(result, Vectors.dense(features))
      }
    }

> Method for generating random LabeledPoints for testing
> ------------------------------------------------------
>
>                 Key: SPARK-3509
>                 URL: https://issues.apache.org/jira/browse/SPARK-3509
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>    Affects Versions: 1.2.0
>            Reporter: Egor Pakhomov
>            Priority: Minor
>             Fix For: 1.2.0
>
>
> During testing I need random LabeledPoints with some correlation behind them.
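
One possible shape for a cleaner generator is sketched here in plain Scala, so it runs without a Spark cluster. This is only an illustration, not the implementation proposed for SPARK-3509: it swaps the seed-driven mix of additions and multiplications for a simple linear model with Gaussian noise, and the names `Sample`, `RandomRegressionData`, `generate`, and `noise` are hypothetical. `Sample` is a stand-in for MLlib's `LabeledPoint`.

```scala
import scala.util.Random

// Hypothetical stand-in for MLlib's LabeledPoint so the sketch runs without Spark.
final case class Sample(label: Double, features: Array[Double])

object RandomRegressionData {
  // Generates `size` samples whose label is a linear function of the
  // features plus Gaussian noise scaled by `noise`. A fixed `seed`
  // makes the output reproducible across runs.
  def generate(size: Int, featureNumber: Int, noise: Double, seed: Long): Seq[Sample] = {
    val rnd = new Random(seed)
    // One weight per feature in [-1, 1), drawn once, so every sample
    // shares the same underlying relationship.
    val weights = Array.fill(featureNumber)(rnd.nextDouble() * 2 - 1)
    Seq.fill(size) {
      // Features are uniform in [0, 1), as in the workaround above.
      val features = Array.fill(featureNumber)(rnd.nextDouble())
      val signal = features.zip(weights).map { case (x, w) => x * w }.sum
      Sample(signal + noise * rnd.nextGaussian(), features)
    }
  }
}
```

With a SparkContext `sc` in scope, the result could presumably be turned into an RDD by mapping each `Sample` to `LabeledPoint(label, Vectors.dense(features))` and passing the sequence to `sc.parallelize`. Passing an explicit seed, rather than calling the global `Random` inside the closure, keeps test data reproducible.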