[ https://issues.apache.org/jira/browse/SPARK-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963173#comment-13963173 ]
Aaron Davidson commented on SPARK-1438:
---------------------------------------

PartitionwiseSampledRDD already takes the seed as an optional argument, using System.nanoTime as the default value. This seems reasonable, as Math.random() does essentially the same thing (the first time it is called). System.nanoTime also usually has high enough resolution that collisions are unlikely.

Scala and probably Python can use default arguments; Java will need an overloaded method.

> Update RDD.sample() API to make seed parameter optional
> -------------------------------------------------------
>
>                 Key: SPARK-1438
>                 URL: https://issues.apache.org/jira/browse/SPARK-1438
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>            Reporter: Matei Zaharia
>            Priority: Blocker
>              Labels: Starter
>             Fix For: 1.0.0
>
>
> When a seed is not given, it should pick one based on Math.random().
> This needs to be done in Java and Python as well.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
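
For illustration, the overloaded-method approach suggested for Java in the comment might look roughly like the sketch below. This is not Spark's actual code: the class SeedDefaultSketch and its list-based sample() methods are hypothetical stand-ins for an RDD-style sampler, and the only behavior taken from the comment is the fallback to System.nanoTime() when no seed is given.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for an RDD-like sampler; not Spark's actual API.
final class SeedDefaultSketch {

    // Full version: the caller supplies the seed, so results are reproducible.
    static List<Integer> sample(List<Integer> data, double fraction, long seed) {
        Random rng = new Random(seed);
        List<Integer> out = new ArrayList<>();
        for (Integer x : data) {
            // Keep each element independently with probability `fraction`.
            if (rng.nextDouble() < fraction) {
                out.add(x);
            }
        }
        return out;
    }

    // Overload without a seed: Java has no default arguments, so this variant
    // delegates to the full version with System.nanoTime() as the seed.
    static List<Integer> sample(List<Integer> data, double fraction) {
        return sample(data, fraction, System.nanoTime());
    }
}
```

Callers that need reproducible samples would pass an explicit seed, while everyone else can use the two-argument form. In Scala or Python the same effect would presumably come from a single method with a default argument.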